If you are frustrated by the complexity of packaging Python-based Lambda applications, I highly recommend checking out Chalice: https://github.com/aws/chalice. It's very well engineered and drastically simplifies the process (for example, it handles pre-built wheels properly, so you can cross-compile your Lambda, i.e. build and deploy it from a Mac).

Actually, we just open-sourced a template for using Lambda with Chalice and Terraform that automates this and many other relevant steps: https://github.com/chanzuckerberg/chalice-app-template. It's not 100% directly applicable to this use case yet, because SAM/CloudFormation templates don't have a good way to manage bucket event subscriptions. But domovoi (https://github.com/kislyuk/domovoi) can manage S3 event subscriptions (direct, or bridged through SNS, SQS, or SNS-SQS) in an idempotent, declarative (non-IaC) process.
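For anyone who hasn't used it, a minimal Chalice app looks something like this sketch (the bucket name and handler logic are placeholders, not part of either template above); `chalice deploy` then builds the dependency package, resolving pre-built manylinux wheels, and creates the Lambda and its event wiring for you:

    # app.py -- minimal Chalice sketch; bucket name is a placeholder
    from chalice import Chalice

    app = Chalice(app_name='s3-event-demo')

    # Chalice sets up the S3 bucket notification at deploy time.
    @app.on_s3_event(bucket='my-example-bucket',
                     events=['s3:ObjectCreated:*'])
    def handle_object(event):
        app.log.info("Object created: s3://%s/%s" % (event.bucket, event.key))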
Lambda does have a learning curve, and so does deployment safety. Many Python deployment strategies sidestep the problem by simply re-downloading the packages on every deploy; Lambda doesn't allow this.

And they were close to a solution. They have a CI pipeline which can and should be doing the packaging for them. The Linux image only has to be close enough to Amazon Linux, not exactly Amazon Linux. Heck, even CodeBuild uses Ubuntu [0].

It also doesn't help that there's a lack of information, or simply misinformation, out there. Sometimes I think frameworks with a very specific use case like Zappa do more harm than good. Yes, it's easier to get running, but it doesn't give you a general-purpose solution, and it makes you think everyone is just hacking around the mess.

Serverless's serverless-python-requirements plugin is a good solution if you can't be bothered having the CI do the packaging/artifact creation (see the sketch below).

[0] https://docs.aws.amazon.com/codebuild/latest/userguide/build-env-ref-available.html
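Wiring up the plugin is roughly this (a sketch; the service and function names are made up). It reads requirements.txt, and with dockerizePip it builds native wheels inside a Docker image compatible with the Lambda runtime, so you can package from a Mac or Windows box:

    # serverless.yml -- sketch; service/function names are placeholders
    service: my-python-service

    provider:
      name: aws
      runtime: python3.9

    plugins:
      - serverless-python-requirements

    custom:
      pythonRequirements:
        # Build requirements in Docker so native wheels
        # match the Lambda runtime, not your laptop.
        dockerizePip: true

    functions:
      handler:
        handler: handler.main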
It boggles the mind that JSON-to-Avro support for Athena isn't a native part of Firehose. Thousands of devs have had to fight, and in my case lose, this same battle. I've had decent luck with the Serverless framework's python-requirements plugin for other C libs and for large modules. Outside of that, it's a hellscape of logs with import errors and a lot of waiting for feedback.
The easy way to get around this is to use AWS CodeBuild with the built-in Python Docker image and have it trigger when you push. You can trigger a build from either AWS CodeCommit or GitHub.

As part of the buildspec.yml, just install all of your dependencies using

    pip install {package} -t .

(The period at the end forces pip to install into the local directory.)

In your artifacts section, make sure you include everything. CodeBuild will then create a zip file that you can load into the console.

For bonus points, once you create the Lambda manually for the first time using the AWS console, you can export the CloudFormation yml file and use that as part of your automation strategy, where you have a CF Parameter that specifies the name of the zip file that was uploaded to S3 by CodeBuild.

I use this strategy all of the time to develop on Windows and deploy to Lambda.
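A buildspec along those lines would look something like this (a sketch; the requirements file, Python version, and artifact layout are assumptions):

    # buildspec.yml -- sketch; file names are assumptions
    version: 0.2

    phases:
      install:
        runtime-versions:
          python: 3.9
      build:
        commands:
          # Install dependencies alongside the handler code so the
          # zip contains everything Lambda needs at import time.
          - pip install -r requirements.txt -t .

    artifacts:
      files:
        - '**/*'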
We've been developing some products for Lambda but have been building them completely in Go. We can compile them into a single fat binary with the text files embedded inside, so our deploys become a simple zip file with the binary in it. Another advantage is that warmup time for Go is quite fast.
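A minimal version of that setup might look like the sketch below (the embedded file name and handler logic are placeholders). Building `GOOS=linux GOARCH=amd64 go build -o main` and zipping the result gives you the whole deploy artifact:

    // main.go -- minimal sketch of a Go Lambda with an embedded text file;
    // greeting.txt and the handler body are placeholders.
    package main

    import (
        "context"
        _ "embed"

        "github.com/aws/aws-lambda-go/lambda"
    )

    //go:embed greeting.txt
    var greeting string // baked into the binary at compile time

    func handler(ctx context.Context) (string, error) {
        return greeting, nil
    }

    func main() {
        // Package for deploy with: zip function.zip main
        lambda.Start(handler)
    }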
That's a rude awakening, but an expected one. It's very easy to write an elegant piece of code in a scripting language, only to find that some of the dependencies that work the magic are pretty messy to deploy.

You can usually count on having native libraries for a given activity in Java, so you can just use a JVM-based language (does Lambda support that? I bet it does).