AWS Lambda Concurrency: A Perspective
In this article, we are going to wade into some of the weeds of concurrency. Treat this article as a high-level overview of how AWS Lambda concurrency works, not as a low-level end-all, be-all. With that context, let's jump in!
So what is concurrency?
"The property or an instance of being concurrent; something that happens at the same time as something else." 
I think this is a fine definition to start from, but let's expand a bit further. The last part of the definition was "happens at the same time". With AWS Lambda, you have small, isolated compute instances that scale horizontally, so you can have thousands of individual isolated units running "at the same time".
To dive a bit deeper into AWS Lambda, let's talk about the spectrum of compute:
- On Premises (On-prem)
- Virtual Machines (think AWS EC2)
- Containers (think Docker and AWS ECS)
- Cloud Functions (AWS Lambda)
When we look at that list above of the spectrum of compute, we start with the most overhead (On-prem) and move down towards the least overhead with Cloud Functions (AWS Lambda).
Underneath it all, AWS Lambda still runs on physical, on-premises hardware; it's just AWS's premises, not ours, and we don't manage any of it. AWS has secure data centers around the world which have servers on racks from floor to ceiling. On those servers they have virtual machines running. And inside those virtual machines they have containers running. And I'm sure you guessed it, inside those containers they have cloud functions running.
Now isn't that an interesting way to think about it!
Let's keep going. Because of the AWSome work of AWS, we get to build entire applications which are powered by cloud functions without needing to think about the data center, the virtual machine, or the container.
All we need to worry about is the code that we write and neatly packaging that code up to be uploaded to the AWS cloud function offering called AWS Lambda as a zip file.
This zip file contains everything our backend code needs to run including application code, dependencies, and so on.
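To make the packaging step concrete, here's a minimal sketch of bundling a function's directory into a zip file using Python's standard library. The directory layout and `package` function name are assumptions for illustration; in practice you'd more likely use a build tool or framework to do this.

```python
import pathlib
import zipfile

def package(source_dir, out_path="function.zip"):
    """Bundle everything in source_dir (code + vendored deps) into a zip.

    Lambda expects the archive to contain everything the function needs
    to run, laid out exactly as it should appear at runtime.
    """
    src = pathlib.Path(source_dir)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in src.rglob("*"):
            if path.is_file():
                # Store paths relative to the source dir, so handler.py
                # sits at the root of the archive.
                zf.write(path, path.relative_to(src))
    return out_path
```

The resulting `function.zip` is what gets uploaded to AWS Lambda, either directly or via an S3 bucket.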
Because of this simplified format with cloud functions, and all the underlying work that AWS has put in under the hood, AWS Lambda can scale horizontally and achieve crazy amounts of concurrency.
What does this look like in practice?
Imagine that your application records clicks of a button:
- Button gets clicked and fires off an API request to your backend
- Your backend then processes "what button" was clicked and "who clicked" and saves that change in your database
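The two steps above might look something like this as a Lambda handler. This is a minimal sketch: the event field names (`buttonId`, `userId`) are assumptions, and a plain in-memory dict stands in for the database so the example is self-contained (a real handler would write to something like DynamoDB via boto3).

```python
import json

# Stand-in for the database; survives only as long as this instance does.
CLICK_STORE = {}

def handler(event, context):
    """Hypothetical handler for the button-click API described above."""
    body = json.loads(event.get("body", "{}"))
    button_id = body["buttonId"]   # "what button" was clicked
    user_id = body["userId"]       # "who clicked" it

    # Save the change (a real function would persist this to a database).
    CLICK_STORE.setdefault(button_id, []).append(user_id)

    return {"statusCode": 200, "body": json.dumps({"recorded": True})}
```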
Simple enough. Now let's imagine that we got 1,000 clicks happening at the same time and our backend code was running on AWS Lambda.
What would take place is this: AWS Lambda would spin up 1,000 instances of your backend code to handle those requests at the same time. Then, once those requests finish, the AWS Lambda instances would wind down.
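As a loose local analogy for that fan-out, here's a thread-pool sketch. To be clear, this is not how Lambda works internally; it just illustrates the mental model of 1,000 in-flight requests each being handled by its own isolated unit, with `invoke` standing in for one Lambda instance handling one request.

```python
from concurrent.futures import ThreadPoolExecutor

def invoke(request_id):
    """Toy stand-in for a single isolated Lambda instance handling one request."""
    return f"handled-{request_id}"

# Fire 1,000 "simultaneous" requests; Lambda would fan these out across
# up to 1,000 separate instances, one per in-flight request.
with ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(invoke, range(1000)))
```

Every request gets handled independently, and no single instance is a bottleneck for the rest.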
There are some nuances with AWS Lambda. For example, what if instead of 1,000 requests at the same time, it was 1,000 requests over a 10-minute period? In that case, we may see what's called container re-use.
Container re-use is basically AWS saying, "Hey, I saw you got one request to your AWS Lambda function, so I'm going to leave the AWS Lambda instance up for a few minutes in case you get another request. That makes it easier for both of us, and the 2nd through nth requests will be faster than the first because we don't have to re-download your code."
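You can see container re-use in your own code: anything at module level (outside the handler) runs once per container, not once per request. Here's a minimal sketch; the `coldStart` flag and counter are illustrative, not an official Lambda API.

```python
import time

# Module-level code runs once, during the "cold start" of this container.
BOOT_TIME = time.time()
INVOCATION_COUNT = 0

def handler(event, context):
    global INVOCATION_COUNT
    INVOCATION_COUNT += 1
    # On a re-used (warm) container, INVOCATION_COUNT climbs past 1 and
    # BOOT_TIME stays fixed -- evidence the same instance served the request.
    return {"invocation": INVOCATION_COUNT,
            "coldStart": INVOCATION_COUNT == 1}
```

This is also why expensive setup (database connections, SDK clients) is conventionally done at module level: warm invocations get it for free.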
That's pretty slick right? Now, let's talk about the other side of that coin, something called cold-starts.
Cold starts are, in essence, that first request, where AWS has to download your code and dependencies and spin up a fresh instance before it can handle anything. As you can imagine, the larger the deployment package, the slower the cold start.
Cold starts are less of an issue than they're made out to be, but generally the best practice we recommend to clients at serverlessguru.com is to keep your 3rd-party dependency usage to a minimum and to isolate functionality into many smaller AWS Lambda functions rather than building one monolithic AWS Lambda function.
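To illustrate the monolith-vs-small-functions contrast, here's a hypothetical sketch. The routes and handler names are made up; the point is the shape of each approach.

```python
# Monolithic style: one function routes every endpoint, so every request
# ships (and cold-starts with) the full dependency set for all routes.
def monolith_handler(event, context):
    route = event.get("path")
    if route == "/clicks":
        return {"statusCode": 200, "body": "click recorded"}
    elif route == "/users":
        return {"statusCode": 200, "body": "user fetched"}
    return {"statusCode": 404, "body": "not found"}

# Single-purpose style: each function does one thing, keeps its package
# small, and can be tuned, deployed, and scaled independently.
def record_click_handler(event, context):
    return {"statusCode": 200, "body": "click recorded"}
```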
It sounds easy, but in reality the traditional route of software development is to throw the kitchen sink at even the simplest problems and then just crank that virtual machine dial to 100 to compensate.
AWS Lambda acts as a forcing function for best practices: it pushes developers not to throw the kitchen sink at problems, and to architect things in an arguably better way than they otherwise would.
Alright, I know this article was about concurrency. As I mentioned at the beginning, it's not a low-level guide, but more of a high-level opinion piece on how I think about AWS Lambda concurrency.
Thanks for reading and see you next time!