Half Baked

Half Baked AMI

With Thanksgiving and the holiday season fast approaching, things are heating up in the kitchen. The best way to cook a turkey is to soak it in brine for a day before stuffing and putting it in the oven. No no, the best way to cook a turkey is deep frying it.

The fact is, there is no "best" way. It is a debate with no right answer.

The same goes for baking AMIs:

  • Do you bake the software, configuration and your code into the AMI (à la Netflix)?
  • Do you bake only the software and configuration, and download the code on boot?
  • Do you use a clean OS AMI, and do everything on boot (à la Chef Server/Puppet)?

It's a scale - from fully baked, through half baked, to unbaked/raw.

AMI baking scale
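To make the scale concrete, here is a minimal sketch in Python of which steps are left for boot time at each point on it. The names (BakeLevel, BAKED_STEPS, boot_steps) are made up for illustration and do not come from any tool; the three steps are simply the ones named in the list above.

```python
from enum import Enum

# The three provisioning steps named in the list above.
STEPS = ("software", "configuration", "code")


class BakeLevel(Enum):
    FULLY_BAKED = "fully baked"
    HALF_BAKED = "half baked"
    UNBAKED = "unbaked"


# Steps performed when the AMI is baked; everything else happens on boot.
BAKED_STEPS = {
    BakeLevel.FULLY_BAKED: ("software", "configuration", "code"),
    BakeLevel.HALF_BAKED: ("software", "configuration"),
    BakeLevel.UNBAKED: (),
}


def boot_steps(level):
    """Return the steps an instance still has to perform when it boots."""
    baked = BAKED_STEPS[level]
    return tuple(step for step in STEPS if step not in baked)
```

The fewer steps boot_steps returns, the faster and more predictable the boot, and the more AMIs you end up building.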

I have watched engineers debate the pros and cons of how much to bake into an AMI and it ends up being a religious (and sometimes heated) discussion. It is certainly a discussion that needs to be had, and a call needs to be made, but there is no one right answer.

To aid in the debate, here are some talking points:

Fully-baked Pros:

  • Instance boot time is as short as it can be - there is no further work to do during the boot sequence
  • All instances using the same AMI are exactly the same
  • The AMI built for staging can be reused in production - no need to rebuild
  • Nothing new can go wrong during boot - any failure would already have shown up when the AMI was baked and tested

Fully-baked Cons:

  • When baking AMIs as part of the build process (CI), a lot of AMIs get created. You will need to clean them up (perhaps using Janitor Monkey)
  • No further customizations are done during boot - you are stuck with the version that was baked in, and any change to code or configuration means baking a new AMI
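As a rough illustration of the cleanup chore mentioned above, here is a minimal sketch of a retention rule: always keep the newest few AMIs, and only consider older ones once they pass a minimum age. This is not how Janitor Monkey works; the function name, the defaults and the policy itself are assumptions. The input dicts use the "ImageId" and "CreationDate" fields that EC2 image descriptions carry.

```python
from datetime import datetime, timedelta, timezone


def amis_to_delete(images, keep_latest=5, min_age_days=14, now=None):
    """Pick AMI IDs that are candidates for cleanup (hypothetical policy).

    images: list of dicts with "ImageId" and "CreationDate" (ISO 8601 string).
    The newest `keep_latest` images are always kept; the rest are selected
    only if they are older than `min_age_days`.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)

    def created(image):
        # EC2 reports timestamps such as "2013-11-27T09:10:00.000Z".
        return datetime.strptime(
            image["CreationDate"], "%Y-%m-%dT%H:%M:%S.%fZ"
        ).replace(tzinfo=timezone.utc)

    newest_first = sorted(images, key=created, reverse=True)
    return [
        image["ImageId"]
        for image in newest_first[keep_latest:]
        if created(image) < cutoff
    ]
```

With boto3, the list could come from describe_images(Owners=["self"]) and each selected ID could be passed to deregister_image. Before deleting anything, you would also want to check that no launch configuration or running instance still depends on the AMI.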

Unbaked Pros:

  • Can reuse existing Chef, Puppet, etc. code (particularly good when migrating to AWS)
  • No need to manage the lifecycle of AMIs

Unbaked Cons:

  • Recipes may need to be downloaded from some central and highly available configuration master
  • The recipes/playbooks can fail during execution, causing some/all instances to be unusable
  • Boot up time can be significant. If your application sees large and sudden spikes of traffic, your service will be degraded or unresponsive until the new instances have been configured and can handle traffic
  • If many instances boot at the same time, all asking for the same information, the resulting thundering herd can bring down something like a Puppet Master
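One common way to soften the thundering herd described above is to have each instance wait a random amount of time before it contacts the configuration master. This is a general technique, not something specific to Puppet or Chef, and the sketch below is only an illustration: the function names and the 60 second window are assumptions.

```python
import random


def startup_delay(max_delay_seconds=60.0, rng=random):
    """Return a random wait in [0, max_delay_seconds) before contacting the master."""
    return rng.random() * max_delay_seconds


def peak_requests_per_second(delays):
    """Count the busiest one-second bucket for a set of start delays."""
    buckets = {}
    for delay in delays:
        second = int(delay)
        buckets[second] = buckets.get(second, 0) + 1
    return max(buckets.values()) if buckets else 0
```

If 100 instances all start with a delay of zero, the master sees all 100 requests in the same second. Spreading them over a window lowers that peak, at the cost of making the already slow unbaked boot even slower.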

The majority of people using AWS seem to go for some kind of half-baked situation, picking and choosing which tradeoffs make the most sense for how they like to work. Episode 5 shows you how to use Aminator with Ansible to bake AMIs, but makes no assumption as to how much you want to bake in. Episode 4 shows you how to perform customizations on boot using user-data and cloud-init. Both are perfectly valid, and can even be used together.
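For a feel of what the half-baked middle ground can look like, here is a minimal sketch that builds cloud-init user-data to download the application code on boot, on top of an AMI that already has the software and configuration baked in. The artifact URL, install directory and function name are hypothetical; the only cloud-init features relied on are the #cloud-config header and the runcmd list.

```python
def build_user_data(artifact_url, install_dir="/opt/app"):
    """Build a cloud-config document that fetches application code on boot.

    artifact_url and install_dir are placeholders for wherever your build
    pipeline publishes its output and where the app should live.
    """
    lines = [
        "#cloud-config",
        "runcmd:",
        f"  - mkdir -p {install_dir}",
        f"  - curl -fsSL -o {install_dir}/app.tar.gz {artifact_url}",
        f"  - tar -xzf {install_dir}/app.tar.gz -C {install_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical artifact location, for illustration only.
    print(build_user_data("https://example.com/builds/app.tar.gz"))
```

The resulting string is what you would pass as user-data when launching the instance, for example through the UserData parameter of boto3's run_instances. Moving the download step into the AMI bake instead shifts you toward the fully-baked end of the scale.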

If I have missed any points, please let me know in the comments, and I'll be happy to edit this post.

Have a Happy Thanksgiving, and enjoy your turkey, no matter how it is prepared.

Wednesday 11/27/2013 at 09:10am | Peter Sankauskas