AWS Lambda Managed Instances with Java 25 and AWS SAM – Part 6 Lambda function performance improvement approaches
Introduction
In part 5 of this series, we measured the initial Lambda function performance (the Java code was intentionally left unoptimized). We saw a quite noticeable cold start when a request to the Lambda function reached the underlying EC2 instance for the first time, and we found that a Lambda function with the Lambda Managed Instances compute type can’t eliminate the behaviour of the programming language itself. One example of such behaviour is the JVM warm-up period.
In this article, we’ll optimize our Lambda function to improve the cold start time significantly.
Lambda function performance improvement approaches
Move the Object Mapper initialization to the static initializer block
The first thing we notice during the deployment phase (when we initially deploy or re-deploy the Lambda function) is that the handler class (for example, GetProductByIdHandler) is initialized. We can prove this by adding log statements to the static initializer block of the Java class. This tells us that the static code is executed before we even invoke our Lambda function, so our goal is to preload as much as possible in the static initializer block of the Lambda handler class.
Looking at the handler code, we see that we instantiate the ObjectMapper (used to convert the JSON representation of the Product into the Product class) on each invocation of the Lambda function, within the handleRequest method:
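The handler code itself isn’t reproduced here, but a minimal sketch of this per-invocation pattern might look like the following (GetProductByIdHandler, handleRequest, DynamoProductDao and Product come from the article; the handler shape and the DAO’s API are my assumptions):

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.fasterxml.jackson.databind.ObjectMapper;

public class GetProductByIdHandler
        implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request,
                                                      Context context) {
        try {
            // Suboptimal: both objects are recreated on every single invocation
            ObjectMapper objectMapper = new ObjectMapper();
            DynamoProductDao productDao = new DynamoProductDao();

            Product product = productDao.getProductById(request.getPathParameters().get("id"));
            return new APIGatewayProxyResponseEvent()
                    .withStatusCode(200)
                    .withBody(objectMapper.writeValueAsString(product));
        } catch (Exception e) {
            return new APIGatewayProxyResponseEvent().withStatusCode(500);
        }
    }
}
```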
This is suboptimal because we can reuse this object, so let’s move it to the static initializer block of the GetProductByIdHandler:
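A minimal sketch of what this change might look like (the handler shape and the DAO’s API are assumptions on my side):

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.fasterxml.jackson.databind.ObjectMapper;

public class GetProductByIdHandler
        implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    // Created once, when the class is initialized (i.e. during the deployment
    // phase), and then reused by every invocation
    private static final ObjectMapper objectMapper;

    static {
        objectMapper = new ObjectMapper();
    }

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request,
                                                      Context context) {
        try {
            // Still created per invocation for now; addressed in the next step
            DynamoProductDao productDao = new DynamoProductDao();
            Product product = productDao.getProductById(request.getPathParameters().get("id"));
            return new APIGatewayProxyResponseEvent()
                    .withStatusCode(200)
                    .withBody(objectMapper.writeValueAsString(product));
        } catch (Exception e) {
            return new APIGatewayProxyResponseEvent().withStatusCode(500);
        }
    }
}
```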
Now let’s redeploy our application and send the first request to this Lambda function, for example via an API Gateway GET HTTP request. Let’s then look at the CloudWatch metrics of this Lambda function:
We see that the response latency of this Lambda function is 1579 milliseconds. You might observe a different latency, as it also depends on the type of the EC2 instance itself, but it decreased by roughly 165 milliseconds compared to the initial version. Why? When you create an ObjectMapper for the first time, many classes are loaded for the first time, and some of them are singletons that need to be instantiated only once. Depending on the hardware, this takes more than a hundred milliseconds. Creating a second ObjectMapper in the same Java process takes about 1 millisecond. By moving the ObjectMapper instantiation to the static initializer block of the Lambda function, which is executed during the deployment phase, we saved that time, and the first invocation of the Lambda function on the EC2 instance became quicker.
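You can observe this one-time cost yourself by timing two consecutive ObjectMapper creations in a plain Java program. A standalone sketch (needs only the Jackson dependency; the exact numbers depend on your hardware):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;

public class ObjectMapperInitCost {
    public static void main(String[] args) throws Exception {
        long t0 = System.nanoTime();
        // The first mapper pays the one-time class-loading/singleton cost
        new ObjectMapper().writeValueAsString(Map.of("id", 1));
        long firstMs = (System.nanoTime() - t0) / 1_000_000;

        long t1 = System.nanoTime();
        // A second mapper in the same JVM is much cheaper to create and use
        new ObjectMapper().writeValueAsString(Map.of("id", 1));
        long secondMs = (System.nanoTime() - t1) / 1_000_000;

        System.out.println("first: " + firstMs + " ms, second: " + secondMs + " ms");
    }
}
```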
Move the DynamoDB Product DAO initialization to the static initializer block
Let’s move on. We also saw that we instantiate DynamoProductDao on each invocation of the Lambda function, within the handleRequest method (see the code above). With that, we also instantiate the DynamoDB client on each Lambda request. This is suboptimal as well, because we can reuse this object, so let’s move it to the static initializer block of the GetProductByIdHandler:
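A sketch of the change, assuming the AWS SDK for Java v2 (DynamoDbClient) and a DAO constructor that accepts the client (both assumptions on my side):

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.fasterxml.jackson.databind.ObjectMapper;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;

public class GetProductByIdHandler
        implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    private static final ObjectMapper objectMapper;
    private static final DynamoProductDao productDao;

    static {
        objectMapper = new ObjectMapper();
        // Building the DynamoDB client resolves credentials/region and sets up
        // the HTTP client — expensive one-time work, now done at deployment time
        productDao = new DynamoProductDao(DynamoDbClient.create());
    }

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request,
                                                      Context context) {
        try {
            Product product = productDao.getProductById(request.getPathParameters().get("id"));
            return new APIGatewayProxyResponseEvent()
                    .withStatusCode(200)
                    .withBody(objectMapper.writeValueAsString(product));
        } catch (Exception e) {
            return new APIGatewayProxyResponseEvent().withStatusCode(500);
        }
    }
}
```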
Now let’s redeploy our application and send the first request to this Lambda function, for example via an API Gateway GET HTTP request. Let’s then look at the CloudWatch metrics of this Lambda function:
This optimization is huge: the response latency went down to 691 milliseconds. The key is again preloading a bunch of classes and instantiating the DynamoDB client during the deployment phase. Ok, this was the easy part. We’re not done yet.
Prime the DynamoDB request
I have written a lot of articles about Lambda SnapStart and Lambda cold start optimization strategies using priming techniques. The truth is that we don’t need to implement the CRaC API for such priming. The first priming approach that we’ll apply is DynamoDB (request) priming. The idea is to make a call to the DynamoDB table in order to do the following:
- Preload all the classes required for making such a request and handling the response.
- Instantiate the HTTP client (the default one is Apache 4.5, but you can set others). An HTTP client is required to talk to DynamoDB, and its initialization is an expensive one-time operation.
- Use the ObjectMapper to convert the JSON result into a Product, which also involves some expensive one-time operations.
How can we do this? Let’s look at an example:
We simply use the static initializer block to get the product with id 0. We are not even interested in the result (a product with id 0 might not even exist in the database), only in the three effects mentioned above.
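A sketch of such a priming step, as it might appear in the static initializer block of the handler (the DAO API and the use of the AWS SDK v2 DynamoDbClient are assumptions):

```java
static {
    objectMapper = new ObjectMapper();
    productDao = new DynamoProductDao(DynamoDbClient.create());
    try {
        // Priming: fire one real DynamoDB request at class-initialization time.
        // We ignore the result — a product with id 0 may not exist — we only
        // want class loading, HTTP client setup and JSON conversion to happen now.
        productDao.getProductById("0");
    } catch (Exception e) {
        // Priming must never break class initialization
    }
}
```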
Now let’s redeploy our application again and send the first request to this Lambda function, for example via an API Gateway GET HTTP request. Let’s then look at the CloudWatch metrics of this Lambda function:
This is another huge win: the response latency is now down to 75 milliseconds, and we had to write only a few lines of code to achieve it. You can find the full code in GetProductByIdWithDynamoDBPrimingHandler; I provided an extra /productsWithDynamoDBPriming/{id} endpoint in the API Gateway. We could basically stop here, because this latency is acceptable for most use cases.
Prime the full API Gateway request event
I gave the explanation of this technique in my article mentioned above. Now let’s redeploy our application with this change and send the first request to this Lambda function, for example via an API Gateway GET HTTP request. Let’s then look at the CloudWatch metrics of this Lambda function:
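The idea of full request-event priming is to construct a synthetic API Gateway event and run it through handleRequest once from the static initializer block. A sketch (the class name, event shape and id value are all illustrative assumptions, not the article’s actual code):

```java
static {
    objectMapper = new ObjectMapper();
    productDao = new DynamoProductDao(DynamoDbClient.create());
    try {
        // Prime the whole handler path, including API Gateway event handling,
        // with a synthetic GET event; the response is discarded
        APIGatewayProxyRequestEvent primingEvent = new APIGatewayProxyRequestEvent()
                .withHttpMethod("GET")
                .withPathParameters(Map.of("id", "0"));
        new GetProductByIdHandler().handleRequest(primingEvent, null);
    } catch (Exception e) {
        // Priming must never break class initialization
    }
}
```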
We saved another 20 milliseconds of response latency, which is now only 60 milliseconds. There is still some internal work done by the Lambda client deployed on the EC2 instance before the handleRequest method is invoked, and we can’t optimize that part. I’d stop at DynamoDB request priming, as I’m pretty happy to have the response latency below 100 milliseconds.
Conclusion
In this article, we optimized our Lambda function in multiple steps to decrease the cold start time (response latency of the first request) significantly.