Using AWS Lambda Extensions to Accelerate AWS Secrets Manager Access

Making serverless faster using AWS Lambda Extensions


This was a collaboration between Michael Weissbacher from Security Infrastructure and Michele Titolo from Cloud Foundations.

Overview

AWS Lambda Extensions are a new way for tools to integrate deeply into the Lambda environment, and they can run before the start of a Lambda function. This allows us, for example, to perform preparatory tasks that are only necessary on a cold start. We developed and open sourced an extension that prefetches secrets from AWS Secrets Manager. We did this because we noticed that calls to Secrets Manager could take significant time, sometimes up to 300ms. By prefetching, we can eliminate this overhead so that secrets are available immediately for Lambda function invocations. In our experiments we compared a baseline Lambda function that fetches three secrets against the same function using an extension to prefetch them. On average, we measured a speedup of roughly 420ms in cold start duration and 580ms in initialization time. In this article we will give an overview of how extensions work and how we built one to speed up our use of Secrets Manager.

Introduction

We are excited to share what we've built with the new AWS Lambda Extensions feature, now in Preview. Square has been building tooling for serverless applications for almost a year, and with the first set of Lambda functions running in production, teams are starting to need more than an MVP. For the past few months one focus area has been secrets management, as teams using Lambda functions increasingly need access to sensitive information. Before learning about extensions, our plan was to pull the secrets at runtime initialization, similar to how Lambda functions pull certificates for mTLS today. Most Lambda functions at Square use 4 or 5 secrets for mTLS and other integrations, such as third party vendors. We have seen the performance cost of doing multiple calls to Secrets Manager, and we knew the experience could be better, so we were very excited to learn about Lambda Extensions as an avenue to improve developer experience.

With extensions we are able to run a background process inside the Lambda function, and AWS manages the lifecycle of that process. Additionally, the extension starts up before the runtime and function, which means it has less impact on performance. Our Lambda layer for mTLS already runs as a background process, but it's been tricky to set up inside the Lambda function. For our first foray into extensions, we chose to investigate what the most time-consuming part of that layer, pulling secrets from Secrets Manager and making them available to the function, would look like as an extension.

Architecture

Square manages its secrets with Keywhiz, an open source secrets distribution tool. In our data centers, teams use Keywhiz to manage secrets, and those secrets magically appear on hosts. We wanted to deliver a similar experience for Lambda functions, especially since teams will still use Keywhiz to manage secrets.

We updated Keywhiz to synchronize secrets to Secrets Manager in AWS so that applications running in AWS can retrieve secrets without calling into the data center. However, a call to Secrets Manager can take up to 300ms. This is less of an issue for long-running applications, but for Lambda functions, which are intended to be spun up and down as fast as possible, it can be a big hindrance.

The extension we created is conceptually simple -- it reads secret ARNs from a configuration file, and downloads those secrets in parallel to /tmp which is shared with the Lambda function execution environment. The Lambda function can then read the secret whenever it's needed. We realize that making a single call to Secrets Manager itself is not a burden for other teams, but we do know that performance is important for teams using Lambda functions.
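The prefetch step described above can be sketched as follows. This is a minimal illustration in Python, not the Go code of the actual extension. The JSON config layout, the file naming scheme, and the function names `load_secret_arns` and `prefetch_secrets` are all assumptions made for the example, and the call to Secrets Manager is passed in as a plain function so the sketch stays self-contained.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def load_secret_arns(config_path: str) -> List[str]:
    """Read secret ARNs from a config file.

    Assumed (hypothetical) format: {"secrets": ["arn:aws:secretsmanager:...", ...]}
    """
    with open(config_path, encoding="utf-8") as f:
        return json.load(f)["secrets"]


def prefetch_secrets(
    arns: List[str],
    fetch: Callable[[str], str],
    cache_dir: str,
    max_workers: int = 8,
) -> Dict[str, str]:
    """Fetch all secrets in parallel and write each one to its own file.

    Returns a mapping of ARN -> path of the cached file.
    """
    os.makedirs(cache_dir, mode=0o700, exist_ok=True)

    def fetch_and_store(arn: str) -> str:
        value = fetch(arn)
        # Hypothetical naming scheme: last ARN segment, with "/" made path-safe.
        file_name = arn.rsplit(":", 1)[-1].replace("/", "_")
        path = os.path.join(cache_dir, file_name)
        # Create the file readable and writable by the owner only.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w", encoding="utf-8") as out:
            out.write(value)
        return path

    # The fetches are network bound, so threads let them overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        paths = list(pool.map(fetch_and_store, arns))
    return dict(zip(arns, paths))
```

In a real deployment, `fetch` might wrap the AWS SDK, for example boto3's `get_secret_value(SecretId=arn)["SecretString"]`, and `cache_dir` would be a directory under /tmp. With parallel fetching, the total wait is closer to the slowest single call than to the sum of all calls.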

Figure: End-to-end system architecture (Cloud Portal and Lambda POC). We use Keywhiz to synchronize secrets into AWS Secrets Manager, and use the new extension feature to prefetch secrets before Lambda functions execute.

Since secrets are, well, secret, we investigated how the shared file system works within Lambda functions. Thankfully, isolation of the execution environment and /tmp is guaranteed by the Lambda function security architecture (Security Overview of AWS Lambda § Storage and State). We don’t need to worry about other Lambda functions, even other instances of the same function, reading the storage for a single Lambda function. Between that and the short life cycle, we were comfortable caching the secrets to disk.
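On the function side, reading a cached secret is then an ordinary file read. The sketch below is an illustration only: the `/tmp/secrets` directory, the one-file-per-secret layout, and the optional fallback to a direct fetch are assumptions for the example and are not described in this article.

```python
import os
from typing import Callable, Optional


def read_secret(
    name: str,
    cache_dir: str = "/tmp/secrets",  # hypothetical cache location
    fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Return a secret that an extension cached on disk.

    If the file is missing and a fallback is given (for example a direct
    Secrets Manager call), use it; otherwise let the error propagate.
    """
    path = os.path.join(cache_dir, name)
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        if fallback is None:
            raise
        return fallback(name)
```

Because the read is local, it avoids a network round trip on the invocation path.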

We wrote the extension in Go and distributed it as a binary. Shipping a binary has significantly reduced the overhead of supporting additional Lambda function runtimes. Teams at Square have been building Lambda functions with runtimes for languages that are not fully supported internally for server applications. As new runtimes are adopted, a binary makes them easy to support.

Our extension registers with the Lambda Runtime Extension API, only subscribing to the SHUTDOWN event. We did this to simplify the extension, as it doesn't do anything on an invoke. This means the extension will currently cache the secrets for the duration of the Lambda function’s container. While AWS doesn’t provide guidance on how long that is, we see those stick around for 2-4 hours. They can be forced to recreate with an update to the layer or function, but we do not want teams to manually have to do that. This is an area we plan on re-evaluating in the future, as waiting a few hours for secrets to propagate is not ideal.
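The registration flow can be sketched roughly as below. This is a Python illustration of the Lambda Extensions API calls as we understand them (the `2020-01-01` endpoints, the `Lambda-Extension-Name` and `Lambda-Extension-Identifier` headers, and the `AWS_LAMBDA_RUNTIME_API` environment variable), not the Go code of the actual extension. The function names and the `prefetch` callback are hypothetical.

```python
import json
import os
import urllib.request
from typing import Callable

API_VERSION = "2020-01-01"


def build_register_request(runtime_api: str, extension_name: str) -> urllib.request.Request:
    """Build the registration request, subscribing only to SHUTDOWN."""
    return urllib.request.Request(
        url=f"http://{runtime_api}/{API_VERSION}/extension/register",
        data=json.dumps({"events": ["SHUTDOWN"]}).encode("utf-8"),
        headers={
            # Expected to match the file name of the extension executable.
            "Lambda-Extension-Name": extension_name,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_next_event_request(runtime_api: str, extension_id: str) -> urllib.request.Request:
    """Build the request that blocks until the next subscribed event."""
    return urllib.request.Request(
        url=f"http://{runtime_api}/{API_VERSION}/extension/event/next",
        headers={"Lambda-Extension-Identifier": extension_id},
        method="GET",
    )


def run_extension(prefetch: Callable[[], None], extension_name: str) -> None:
    """Register, do the one-time prefetch, then wait for SHUTDOWN."""
    runtime_api = os.environ["AWS_LAMBDA_RUNTIME_API"]
    with urllib.request.urlopen(build_register_request(runtime_api, extension_name)) as resp:
        extension_id = resp.headers["Lambda-Extension-Identifier"]

    # Do the work before the first "next" call, which signals that
    # the extension has finished initializing.
    prefetch()

    while True:
        with urllib.request.urlopen(build_next_event_request(runtime_api, extension_id)) as resp:
            event = json.load(resp)
        if event.get("eventType") == "SHUTDOWN":
            break
```

Since only SHUTDOWN is subscribed, the loop is not woken on each invoke, which is why secrets fetched once stay cached for the life of the execution environment.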

Our Lambda extension is available on GitHub.

Learnings from using Lambda extensions

Using this technology early was exciting and we noticed several peculiarities that we wanted to share. The border between an extension and a function is complicated and while some things are shared, others are not.

The extension does not have access to all the same environment variables as the runtime. In our initial experiments we tried using the LAMBDA_TASK_ROOT environment variable to read the config file, but it’s not there. Some environment variables are, but not all, so double check the official documentation if you plan on relying on one of them. The extension can authenticate to other AWS services without additional configuration.

Logs from the extension, other layers, and function are sent to the same CloudWatch Log Stream. We appended [extension] to all of the logs for our extension so we could easily tell where the log line came from.
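One simple way to tag log lines is to put the marker into the log format itself. The sketch below is a Python illustration that assumes the tag goes at the front of each line; the logger name and the helper `make_extension_logger` are hypothetical.

```python
import logging
from typing import Optional, TextIO


def make_extension_logger(stream: Optional[TextIO] = None, tag: str = "[extension]") -> logging.Logger:
    """Return a logger whose every line starts with the given tag."""
    logger = logging.getLogger("secrets-extension")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate lines if called more than once
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(logging.Formatter(f"{tag} %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

A call such as `make_extension_logger().info("fetched 3 secrets")` then produces a line that is easy to filter for in the shared log stream.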

The extension should not try to use anything from the runtime, and needs to be self-contained. In one experiment, we tried writing our extension in Ruby, for a Ruby Lambda function. This required digging around the container for the Ruby executable, and using that to run a script. There were a bunch of issues with this setup, and we eventually learned it’s not the correct way to create an extension. What goes in the zip file needs to be executable in isolation, which is why we again ended up writing in Go.

Results

Once we got the extension working, we wanted to see if it really provided any performance benefits to our internal customers.

First, we compared the function duration when prefetching a secret with an extension against not prefetching, while reusing the AWS client. We also ran the extension with a larger Lambda function memory size for comparison, because more processes lead to more memory and CPU usage, and increased memory does improve performance. This is something we already knew from our previous work with mTLS.

Comparison of Function Duration (ms)

Invocation   128MB w/ Extension   192MB w/ Extension   128MB No Extension
1            12.52                3.65                 328.53
2            11.89                16.72                41.08
3            6.12                 2.94                 39.15
4            37.23                3.73                 50.24
5            3.58                 9.10                 36.61
6            41.06                23.18                49.83
7            38.08                2.66                 32.45
8            21.34                3.00                 45.69
9            26.45                9.81                 32.23
10           37.36                22.67                46.23

While there is some volatility in the function duration, overall we were able to confirm that additional memory improves performance with the extension, and that the extension itself does not significantly impact performance of the function.

To measure performance changes from deploying the extension, we created a Lambda function that fetches three secrets in the runtime init. We use this as the baseline of what we consider expected developer behavior. In this test, only the Initialization was significantly different.

Runtime init

Duration (ms)   Initialization (ms)
2400            1900
2000            1600
1900            1500
1900            1500
1900            1500
1800            1500
1800            1500
1700            1400
1700            1300
1600            1300

We see that the Duration is on average 1870ms, with a median value of 1850ms. Initialization has an average of 1500ms and a median of 1500ms.

Next, we measured the performance of the same function with the extension that we built to prefetch secrets. The results were a Duration averaging 1450ms with a median of 1400ms. Initialization averaged 918.1ms with a median of 891.5ms.

Extension

Duration (ms)   Initialization (ms)
1600            1300
1600            898
1500            970
1400            885
1400            766
1400            856
1400            838
1400            807
1400            898
1400            963

Comparing average and median of the measurements we can see significant performance wins for using extensions to prefetch secrets.

          Duration (ms)                           Initialization (ms)
          Baseline   Extension   Change           Baseline   Extension   Change
Average   1870       1450        420 (22.46%)     1500       918.1       581.9 (38.79%)
Median    1850       1400        450 (24.32%)     1500       891.5       608.5 (40.57%)
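The averages, medians, and percentage changes above can be reproduced from the raw measurements. The short Python sketch below uses only the numbers from the two tables; the helper name `summarize` is our own.

```python
from statistics import mean, median

# Raw measurements in milliseconds, copied from the two tables above.
baseline = {
    "duration": [2400, 2000, 1900, 1900, 1900, 1800, 1800, 1700, 1700, 1600],
    "initialization": [1900, 1600, 1500, 1500, 1500, 1500, 1500, 1400, 1300, 1300],
}
extension = {
    "duration": [1600, 1600, 1500, 1400, 1400, 1400, 1400, 1400, 1400, 1400],
    "initialization": [1300, 898, 970, 885, 766, 856, 838, 807, 898, 963],
}


def summarize(baseline_ms, extension_ms):
    """Return {statistic: (baseline, extension, ms saved, percent saved)}."""
    rows = {}
    for label, stat in (("average", mean), ("median", median)):
        base, ext = stat(baseline_ms), stat(extension_ms)
        saved = base - ext
        rows[label] = (base, ext, round(saved, 1), round(100 * saved / base, 2))
    return rows


for metric in ("duration", "initialization"):
    print(metric, summarize(baseline[metric], extension[metric]))
```

The percentage is the time saved relative to the baseline value, so 420ms saved on an 1870ms average duration is about 22.46%.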

The extension is significantly more performant than using the SDK inside the Lambda function when it comes to startup time. Some of this is likely due to differences between the languages involved: the extension is a Go binary, while all tests were performed with the same set of Ruby Lambda functions.

Conclusion

As Square moves from the data center to the cloud, we're enabling developers to use AWS Lambda functions more. Lambda Extensions are a new technology available for Lambda functions that AWS shared early with us. In this writeup we tested using Lambda Extensions to prefetch secrets so they are available to applications before they execute. Overall we're really happy with extensions so far. They encapsulate lifecycle management of background processes, and provide solid performance benefits: in our tests they reduced cold start duration by more than 20% and initialization time by close to 40%. We believe extensions are a great vehicle to encapsulate common startup logic and deliver performance improvements. The extension we developed is available as open source.