Additional Lambda performance optimization approaches

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL – Part 1 Sample application

Serverless applications on AWS using Lambda with Java 25, API Gateway and Aurora DSQL – Part 2 Initial performance measurements

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL – Part 3 Introducing Lambda SnapStart

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL – Part 4 Using SnapStart with DSQL request priming

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL – Part 5 Using SnapStart with full priming

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL – Part 6 Using GraalVM Native Image

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL – Lambda performance optimization approaches

Introduction

In the previous articles of the series about how to develop, run, and optimize Serverless applications on AWS with Lambda using Java 25, API Gateway, and Aurora DSQL database, we used:

Managed Java 25 runtime
GraalVM Native Image deployed as Lambda Custom Runtime

We also measured Lambda performance (cold and warm starts) with the following settings:

1024 MB of memory for the Lambda functions
Java compilation option “-XX:+TieredCompilation -XX:TieredStopAtLevel=1”
x86_64 Lambda architecture

In this article, we’ll introduce some additional Lambda performance (cold and warm start) optimization approaches that you can apply to our sample application. You’ll need to measure the performance yourself to find out whether they provide the desired improvements.

Please keep in mind that you can also deploy our sample application on AWS Lambda as a (Docker) Container Image. I didn’t cover this approach in this series, but my article series Lambda function using Docker Container Image gives a step-by-step introduction to it, using DynamoDB as the database. Expect a fairly long cold start with this deployment type, because Lambda SnapStart isn’t available for functions deployed as a Container Image. Instead, you can use Ahead-of-Time (AOT) and Class Data Sharing (CDS) caches in the Container Image and then measure the Lambda performance.

Lambda performance optimization approaches

To find a good balance between the cold and warm start times of the Lambda function, you can try out the optimization techniques introduced below. I haven’t measured them with our sample application on Java 25 and GraalVM 25, but I have done so with older Java, GraalVM, and dependency versions.

We can apply the following approaches to both the managed Java runtime and GraalVM Native Image. For the managed Java runtime, this also covers the case where SnapStart is enabled and the priming techniques are applied on top:

Try out different Lambda memory settings. We performed all measurements with 1024 MB of memory for the Lambda function. With a different memory setting, you might get a better price-performance trade-off.
Try out the Lambda arm64 architecture, which runs on AWS Graviton2 processors and has supported SnapStart since July 2024. This can provide a better price-performance trade-off compared to the x86_64 architecture.
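To compare memory and architecture settings, it helps to turn your own measurements into a cost per invocation. The following is only a minimal sketch: the class name `LambdaCostSketch` is hypothetical, the prices per GB-second are assumptions (on-demand values I recall for us-east-1, please verify them against the current AWS Lambda pricing page), and the default duration is a placeholder, not a measurement. The per-request charge is ignored because it doesn't depend on the memory setting.

```java
import java.util.Locale;

public class LambdaCostSketch {

    // Assumed on-demand prices per GB-second; verify against the AWS Lambda pricing page.
    public static final double X86_PRICE_PER_GB_SECOND = 0.0000166667;
    public static final double ARM64_PRICE_PER_GB_SECOND = 0.0000133334;

    /** Compute cost of one invocation: memory in GB x billed seconds x price per GB-second. */
    public static double costPerInvocation(int memoryMb, double billedDurationMs, double pricePerGbSecond) {
        double memoryGb = memoryMb / 1024.0;
        double billedSeconds = billedDurationMs / 1000.0;
        return memoryGb * billedSeconds * pricePerGbSecond;
    }

    public static void main(String[] args) {
        // Placeholder inputs, NOT measurements: pass your own memory size and billed duration.
        int memoryMb = args.length > 0 ? Integer.parseInt(args[0]) : 1024;
        double billedMs = args.length > 1 ? Double.parseDouble(args[1]) : 100.0;

        System.out.printf(Locale.ROOT, "%d MB, %.1f ms billed%n", memoryMb, billedMs);
        System.out.printf(Locale.ROOT, "x86_64: %.10f USD per invocation%n",
                costPerInvocation(memoryMb, billedMs, X86_PRICE_PER_GB_SECOND));
        System.out.printf(Locale.ROOT, "arm64:  %.10f USD per invocation%n",
                costPerInvocation(memoryMb, billedMs, ARM64_PRICE_PER_GB_SECOND));
    }
}
```

Run it once per configuration you measured, for example `java LambdaCostSketch.java 2048 60`, and compare the results with the latency you observed for the same configuration.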

We can apply the following approaches primarily to the managed Java runtime on Lambda. This also covers the case where SnapStart is enabled and the priming techniques are applied on top:

Try out different Java compilation options for the Lambda function. Until now, we performed all measurements with the compilation option “-XX:+TieredCompilation -XX:TieredStopAtLevel=1”. We can pass other compilation options to the Lambda function using the environment variable JAVA_TOOL_OPTIONS. Different options result in different cold and warm start trade-offs. For GraalVM Native Image, the choice of Java compilation option doesn’t have much impact on Lambda performance, because our application is already compiled natively.
Exclude further unused dependencies. This especially reduces the cold start times (also with SnapStart enabled). In the case of GraalVM Native Image, only reachable Java classes and methods become part of the Native Image, so excluding unused dependencies may not help that much.
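When you experiment with JAVA_TOOL_OPTIONS, it's worth confirming that the options you set actually reached the JVM before you compare measurements. The following is a minimal sketch of such a check; the class `JvmOptionsCheck` and its method `describeCompilation` are hypothetical names for illustration, and it only distinguishes the few cases it names. You could call the same logic once from your handler's static initializer and look for the output in CloudWatch Logs.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmOptionsCheck {

    /** Returns a short description of the JIT setup implied by the given option string. */
    public static String describeCompilation(String options) {
        if (options == null || options.isBlank()) {
            return "JVM defaults";
        }
        if (options.contains("-XX:-TieredCompilation")) {
            return "tiered compilation disabled (C2 only)";
        }
        if (options.contains("-XX:TieredStopAtLevel=1")) {
            return "tiered compilation stopping at level 1 (C1 only)";
        }
        return "other options";
    }

    public static void main(String[] args) {
        // The environment variable as configured on the Lambda function (null if unset).
        String toolOptions = System.getenv("JAVA_TOOL_OPTIONS");
        // The arguments the running JVM reports for itself.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("JAVA_TOOL_OPTIONS = " + toolOptions);
        System.out.println("JVM input arguments = " + jvmArgs);
        System.out.println("Compilation setup: " + describeCompilation(toolOptions));
    }
}
```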

We can apply the following approach primarily to the managed Java runtime on Lambda with SnapStart enabled:

Search for further Lambda SnapStart priming potential beyond the techniques we introduced in this series. For this, you can use the AWS Lambda Profiler Extension for Java, which I described in my article Improving Lambda performance with Lambda SnapStart and priming.

We can apply the following approach primarily to the GraalVM Native Image:

Try out Profile-Guided Optimizations (PGO) to see whether you can further improve Lambda performance. The difficulty with this technique is that it requires some additional semi-automated steps: you need to run your application, either locally with the Lambda emulator or in a separate environment, to collect its profile, and then use that profile to build the optimized Native Image. You can use a Lambda extension for this, but it still requires a lot of additional work. With Lambda SnapStart enabled, AWS does this kind of work for us. I really appreciate not having to care about generating, encrypting, storing, and restoring the snapshots/profiles.

Conclusion

In this article, we introduced additional Lambda performance optimization approaches that we can apply to our sample application. Try them out yourself to find out whether they provide the desired Lambda performance improvements.

Please also watch for another series in which I use Amazon DynamoDB, a serverless NoSQL database, instead of Aurora DSQL for the same Lambda performance measurements.