27 October 2022

A Hidden Gem: Two Ways to Improve AWS Fargate Container Launch Times

TL;DR

There’s a little gem hidden away on GitHub about improving Fargate launch times. Kudos to Massimo Re Ferrè, the author of the GitHub comment in question. With this post, I want to bring his comment to your attention. Feel free to read the original posting instead; this blog post is a shorter version that zooms in on the core problem and the solutions.

The problem

On AWS Fargate, each containerized workload, Amazon ECS Task or Kubernetes Pod, runs on its own single-use single-tenant instance that’s not reused after the workload finishes. The container images required to run a workload on AWS Fargate are downloaded for every Amazon ECS Task or Kubernetes Pod. This process is in contrast to multi-tenant instances like Amazon ECS Container Instances or Kubernetes Nodes, where a container image may already exist on a host from a replica of the same workload.

Because every workload gets its own single-use, single-tenant instance, there is no host-level image cache to reuse, which makes image caching hard to implement on AWS Fargate. However, AWS is working on two alternative approaches to reduce container launch times.

Approach 1: Reducing AWS Fargate Startup Times with zstd Compressed Container Images

By default, container image builders compress each container image layer using the gzip compression algorithm. When a container runtime downloads an image layer, the runtime decompresses the contents into the container runtime storage system.

Container image builders and container runtimes also support an alternative compression algorithm for image layers: zstd. Benchmarks show that zstd can achieve higher compression ratios and higher decompression speeds than the gzip compression algorithm. AWS internal testing of zstd compressed container images on AWS Fargate has shown up to a 27% reduction in Amazon ECS Task or Kubernetes Pod startup times.

The reduction in startup time varies by container image, with the larger container images seeing the most significant improvement. Use cases like machine learning, artificial intelligence, and data analytics traditionally have large container images. Consequently, these workloads could see the most benefit from adopting zstd compression.

From: Reducing AWS Fargate Startup Times with zstd Compressed Container Images
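
To make this a bit more concrete, here is a minimal sketch of how you might assemble a docker buildx invocation that pushes a zstd compressed image. The output options used (oci-mediatypes, compression, compression-level, force-compression) are BuildKit image exporter options; the image name, the helper function, and the compression level of 3 are my own placeholders, so check them against your Buildx version and registry before relying on this.

```python
import shlex


def zstd_build_command(image_ref, context=".", level=3):
    """Return a `docker buildx build` argument list that pushes a zstd image.

    `image_ref` and `level` are illustrative placeholders, not values taken
    from the AWS post.
    """
    output = ",".join([
        "type=image",
        f"name={image_ref}",
        "oci-mediatypes=true",       # zstd layers need OCI media types
        "compression=zstd",          # use zstd instead of the gzip default
        f"compression-level={level}",
        "force-compression=true",    # recompress layers inherited from the base image
        "push=true",
    ])
    return ["docker", "buildx", "build", "--output", output, context]


if __name__ == "__main__":
    # Hypothetical registry and tag; print the command instead of running it.
    print(shlex.join(zstd_build_command("registry.example.com/my-app:zstd")))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; pass the list to subprocess.run once you are happy with it.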

Approach 2: Seekable OCI for lazy loading container images

Prior research has shown that container image downloads account for 76% of container startup time, but on average only 6.4% of the data is needed for the container to start doing valuable work.

Lazy loading is an approach where data is downloaded from the registry in parallel with the application startup.
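
A quick back-of-the-envelope calculation shows why those two numbers matter. The sketch below is my own simplification, not something from the AWS post: it assumes download time scales linearly with the number of bytes fetched, that everything else in the startup path stays the same, and it uses a made-up 60 second baseline. Treat the result as a best-case bound, not a measurement.

```python
def estimate_lazy_startup(total_startup_s, download_share=0.76, needed_share=0.064):
    """Idealized startup time if only the needed bytes block the start.

    download_share: fraction of startup spent downloading the image (76%).
    needed_share:   fraction of image data needed to start working (6.4%).
    Both defaults come from the research quoted above; the linear model is
    an assumption.
    """
    download_s = total_startup_s * download_share
    other_s = total_startup_s - download_s
    return other_s + download_s * needed_share


if __name__ == "__main__":
    baseline = 60.0  # hypothetical startup time in seconds
    print(f"baseline: {baseline:.1f}s, idealized lazy start: "
          f"{estimate_lazy_startup(baseline):.1f}s")
```

Real workloads will not hit this bound, since files fetched on demand add latency later, but it illustrates how much of the download is not on the critical path.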

Lazy Loading

Seekable OCI (SOCI) is a technology open-sourced by AWS that enables containers to launch faster by lazily loading the container image. It’s usually not possible to fetch individual files from gzipped tar files. With SOCI, AWS borrowed some of the design principles from stargz-snapshotter, but took a different approach. A SOCI index is generated separately from the container image and is stored in the registry as an OCI Artifact and linked back to the container image by OCI Reference Types. This means that the container images do not need to be converted, image digests do not change, and image signatures remain valid.

The soci-snapshotter project provides tooling to create SOCI indices for existing OCI container images, as well as a remote snapshotter that gives containerd the ability to lazy load images that have been indexed by SOCI.

From: Introducing Seekable OCI for lazy loading container images
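
To get a feeling for why a separate index makes a gzip stream "seekable", here is a toy sketch using only Python's standard library. It is not SOCI's actual index format or code: the function names, the 4096 byte span, and the in-memory checkpoints are all assumptions for illustration. The idea is to decompress the stream once, remember the decompressor state at regular intervals, and later resume from the nearest checkpoint instead of from the beginning, all without modifying the compressed bytes.

```python
import bisect
import gzip
import zlib


def build_index(compressed, span=4096):
    """Decompress once, saving a checkpoint after every `span` compressed bytes."""
    d = zlib.decompressobj(wbits=31)  # wbits=31 selects the gzip container
    # Each checkpoint: (uncompressed offset, compressed offset, decompressor state)
    checkpoints = [(0, 0, d.copy())]
    produced = 0
    for pos in range(0, len(compressed), span):
        produced += len(d.decompress(compressed[pos:pos + span]))
        if d.eof:
            break
        checkpoints.append((produced, pos + span, d.copy()))
    return checkpoints


def read_range(compressed, checkpoints, start, length, span=4096):
    """Return (data, compressed_bytes_read) for an uncompressed byte range."""
    offsets = [c[0] for c in checkpoints]
    # Last checkpoint at or before the requested uncompressed offset.
    uoff, coff, state = checkpoints[bisect.bisect_right(offsets, start) - 1]
    d = state.copy()  # keep the stored checkpoint reusable
    skip = start - uoff
    out = bytearray()
    pos = coff
    while len(out) < skip + length and pos < len(compressed):
        out += d.decompress(compressed[pos:pos + span])
        pos += span
    return bytes(out[skip:skip + length]), pos - coff


if __name__ == "__main__":
    data = b"".join(f"line {i}\n".encode() for i in range(200000))
    blob = gzip.compress(data)
    index = build_index(blob)
    chunk, touched = read_range(blob, index, 2_000_000, 50)
    print(chunk == data[2_000_000:2_000_050], touched, len(blob))
```

Note that the compressed blob itself is never rewritten, which mirrors the point made above: the index lives next to the image, so digests and signatures stay intact.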

Conclusion

While image caching for AWS Fargate is not solved, you now have these two techniques at your disposal to reduce Fargate launch times!

Enjoy and until next time!
