Pay AWS Less for your Dev and Test Workloads


24×7 environments are handy, but are they required for Dev and Test?

I’m going to assume your development team is not using its development environments 24 hours a day, 7 days a week. That is to say, I’m assuming you don’t have 3+ teams on shift throughout the world. I’m also going to assume you aren’t building and destroying development environments as part of your continuous deployment pipeline (more on that later). Lastly, I’m going to assume that development does not need to mirror production (that’s what Test is for). AWS provides numerous ways to tune your setup with cost savings in mind, most of them focused on dev and test behavior that may have gone overlooked. So with all that in mind, let’s cut costs.

Scheduled development servers

First off, let’s simply make our development server layer mirror our actual development schedule(s). I am talking about the stateless tier, not the database. We can create time-based scaling triggers (natively supported in Auto Scaling Groups, Elastic Beanstalk, and OpsWorks, or just use Lambda!). Scale in to 0 instances in service 1 hour after development teams stop working, and scale out to 1 instance in service 1 hour before development teams start working. Even a hardcore development team working from 6am to midnight, six days a week, still leaves money on the table. Let’s see how much exactly, assuming 5 development teams, each using a single c4.large development environment.
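As a rough sketch of what that time-based scaling could look like with boto3 (the Auto Scaling group name, hours, and days below are made-up examples; scheduled-action recurrences are cron expressions evaluated in UTC, so adjust for your time zone):

```python
import boto3

autoscaling = boto3.client("autoscaling")

DEV_ASG = "dev-web-asg"  # hypothetical development Auto Scaling group

# Scale out to 1 instance an hour before the team starts (example: 05:00 UTC, Mon-Sat).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=DEV_ASG,
    ScheduledActionName="dev-morning-scale-out",
    Recurrence="0 5 * * 1-6",
    MinSize=0,
    MaxSize=1,
    DesiredCapacity=1,
)

# Scale in to 0 instances an hour after the team stops (example: 01:00 UTC, daily).
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=DEV_ASG,
    ScheduledActionName="dev-evening-scale-in",
    Recurrence="0 1 * * *",
    MinSize=0,
    MaxSize=1,
    DesiredCapacity=0,
)
```

The same idea applies to Elastic Beanstalk and OpsWorks time-based scaling, or to a pair of scheduled Lambda functions making the equivalent calls.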

Parking development servers

Let’s say we have already implemented scheduled development servers as outlined above, or maybe we haven’t yet but we are using Auto Scaling Groups. In either case, we can simply park an unused development environment if we know no one is actively contributing code changes. Take the following scenario: we have a major push happening on 2 of our products, and development on the other 3 has been halted so resources can be borrowed. We will be suspending that work for two consecutive sprints (let’s say we use two-week sprints). That’s 3 development environments with no changes for a month. We can simply set the desired capacity on those auto scaling groups to 0, or if we set up time-based scheduling, simply remove the scale-out rule.
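Parking can be as small as the following boto3 sketch; the group and scheduled-action names are hypothetical and mirror the scheduling example above:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Hypothetical Auto Scaling groups for the three halted products.
PARKED_ASGS = ["dev-product-c-asg", "dev-product-d-asg", "dev-product-e-asg"]

for asg_name in PARKED_ASGS:
    # Park the environment: no instances in service until further notice.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        MinSize=0,
        DesiredCapacity=0,
    )

    # If time-based scheduling is in place, also remove the morning scale-out
    # so the environment doesn't quietly come back tomorrow.
    autoscaling.delete_scheduled_action(
        AutoScalingGroupName=asg_name,
        ScheduledActionName="dev-morning-scale-out",
    )
```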


Resource utilization, or lack thereof

It’s easy to get complacent with an instance type. The development team has grown to love the fast provisioning and rock-solid compute and network performance of the c4. Furthermore, since you use c4.2xlarge instances in production, the c4.large feels like the logical downgrade for development. But what if your development environment doesn’t really require that processor performance and network consistency? What if you simply need any 2-core, ~4 GB memory server? A t2.medium might well do the job. It lacks sustained compute capability, but that is rarely needed in dev. Assuming 5 development environments, you can save even more.
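If the development servers are standalone EBS-backed instances, one way to make that swap is to stop each one, change its type, and start it again. The instance IDs below are placeholders; if the servers live in an Auto Scaling group, you would change the launch configuration or launch template instead:

```python
import boto3

ec2 = boto3.client("ec2")

# Placeholder IDs for the dev instances currently running as c4.large.
DEV_INSTANCE_IDS = ["i-0123456789abcdef0"]

for instance_id in DEV_INSTANCE_IDS:
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": "t2.medium"},
    )

    ec2.start_instances(InstanceIds=[instance_id])
```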


Resource configuration, being thoughtful with your decision

Let’s say your production database has strict IOPS requirements. This database was created before AWS released General Purpose SSD (gp2) EBS storage, so you configured Provisioned IOPS to meet your requirements. Since test needed to match production, the same Provisioned IOPS were brought over to the test database. Furthermore, in an effort to mirror production, test was also made a Multi-AZ RDS deployment. These decisions were innocent at the time, but they have a definite impact on hourly costs. Since our databases run 24×7, there is an immediate opportunity to provide the same performance at reduced availability and cost. Here is how that would play out, assuming we reduce our test DB to a single availability zone with gp2-based storage.
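That change is a single RDS modification. A minimal boto3 sketch, assuming a hypothetical test instance identifier:

```python
import boto3

rds = boto3.client("rds")

rds.modify_db_instance(
    DBInstanceIdentifier="test-mysql-db",  # hypothetical test DB identifier
    MultiAZ=False,          # a single availability zone is enough for test
    StorageType="gp2",      # general purpose SSD instead of provisioned IOPS
    ApplyImmediately=True,  # otherwise the change waits for the maintenance window
)
```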

Test should mirror production, sometimes

It is critical to test against a duplicate of production before deploying to production, but let’s first assume you aren’t doing blue/green deployments (more on that later). Test only needs to mirror production during testing. And just like scheduled development servers, test environments should only run during automated and QA test schedules. Going one step further, the mirroring of production need only happen during performance tests (or similar tests where resources affect the meaningfulness of results). Non-performance testing can happen on a production-like environment, similar in layers but resourced like development and run on a schedule matching QA and automation testing. Then, when true performance testing occurs, the test environment can be resized to production-like resources.

So you have a test environment in one of two states. State 1 is the QA and automation validation and regression testing setup. State 2 is the production-mirror performance testing setup. Let’s compare 24×7 vs. scheduled, and non-performance testing vs. performance testing. Production uses Elastic Load Balancing, 15 c4.2xlarge application servers, a 3-node m4.xlarge Redis cluster, and an r3.xlarge MySQL database. For ease of scheduling, the database will stay at production specification running 24×7, but we will not use the RDS Multi-AZ feature like we do in production. This environment runs about $6,000 / month as a true clone of production. But we don’t need that level of scale unless we are performance testing. By scaling in and applying our 8am-6pm scheduling, we can reduce the cost dramatically. When it’s time to do performance testing, we scale up only for the tests, let’s say twice a week for 4 hours. Sparing you the math, we end up at around $1,040 / month. As you can see, combining all of these techniques can save a fortune.
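Flipping between the two states can reuse the same scheduled-action mechanism. A sketch, assuming a hypothetical test application tier Auto Scaling group, Tuesday/Thursday performance windows, and a 2-instance baseline for non-performance testing:

```python
import boto3

autoscaling = boto3.client("autoscaling")

TEST_ASG = "test-app-asg"  # hypothetical test application tier ASG

# Grow the test tier to production scale (15 instances) for the twice-weekly
# performance runs: Tuesday and Thursday, 13:00 UTC in this example...
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=TEST_ASG,
    ScheduledActionName="perf-test-scale-up",
    Recurrence="0 13 * * 2,4",
    MinSize=15,
    MaxSize=15,
    DesiredCapacity=15,
)

# ...and shrink it back to the small QA/automation baseline four hours later.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName=TEST_ASG,
    ScheduledActionName="perf-test-scale-down",
    Recurrence="0 17 * * 2,4",
    MinSize=0,
    MaxSize=15,
    DesiredCapacity=2,
)
```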

Do I even need a dedicated test environment?

Back to blue/green deployments. This is not a blog post about the what, why, or how of blue/green deployments, but let’s say your environment supports that kind of deployment. Why run a test environment that mirrors production 24 hours a day (or even on a schedule) when you can simply build the environment, run through the testing, perform the auto scaling group swap (for example), wait a reasonable amount of time to support rollback, and finally terminate the previous production environment? In this case, a test environment that was running 24×7, or even 96 hours a week, can be reduced to the time it takes to build, test, and support rollback. If this is automated (more on that later), a scheduled test environment running 60 hours a week could potentially be reduced to a production clone running for more like 8 hours a week. Even in a very simple workflow measured in days, the savings hold up: Day 1, we build and test. Day 2, we cut over blue/green and leave both environments running. Day 3, we leave both running for one more day of rollback. Our 60-hour scheduled test environment is now 24 hours per week, with the added benefit that we are testing against production specifications while saving money.
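The swap itself can be a pair of Auto Scaling group attach/detach calls against the production target group. The group names and target group ARN below are placeholders, and a real cutover would also verify the green instances are healthy before detaching blue:

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Hypothetical names: the current (blue) and newly built (green) Auto Scaling
# groups, plus the ALB target group that production traffic flows through.
BLUE_ASG = "app-blue-asg"
GREEN_ASG = "app-green-asg"
PROD_TARGET_GROUP_ARN = "arn:aws:elasticloadbalancing:region:account:targetgroup/app/abc123"

def cutover():
    """Swing production traffic from the blue ASG to the tested green ASG."""
    autoscaling.attach_load_balancer_target_groups(
        AutoScalingGroupName=GREEN_ASG,
        TargetGroupARNs=[PROD_TARGET_GROUP_ARN],
    )
    autoscaling.detach_load_balancer_target_groups(
        AutoScalingGroupName=BLUE_ASG,
        TargetGroupARNs=[PROD_TARGET_GROUP_ARN],
    )

def retire_blue():
    """After the rollback window has passed, scale the old environment to zero."""
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=BLUE_ASG, MinSize=0, MaxSize=0, DesiredCapacity=0
    )
```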

Why automate the build as part of testing?

Whether you are doing blue/green deployments or not, there is justification for building the test environment from nothing each time you need it. While the only direct cost saving is reduced run time (you build, test, and destroy, or build, test, and cut over), the benefits go well beyond that. This workflow validates far more than an updated application deployment: you are also testing your configuration management, your infrastructure code, and potentially the same workflow you would use for disaster recovery. Building a server with configuration management to create a test environment and then never doing so again stops short of the real power of configuration management. That “permanent” environment introduces false assurance that you can recreate the setup at any time. When application code is pushed to servers that already have application code on them, unknown dependencies creep into the application. The dependency problem doesn’t stop at the application, either: by not building your servers from nothing, even the configuration management code may be presuming a given state. Automating test environments from scratch leads to automating build/deploy, which leads to well-exercised infrastructure code.
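One way to exercise that workflow is to drive the whole test environment from an infrastructure template. A sketch using CloudFormation via boto3, where the stack name, template URL, and AppVersion parameter are all hypothetical:

```python
import boto3

cloudformation = boto3.client("cloudformation")

STACK_NAME = "test-env"  # hypothetical stack name
TEMPLATE_URL = "https://s3.amazonaws.com/my-bucket/test-env.template"  # placeholder

def build_test_environment(app_version: str) -> None:
    """Stand up a throwaway test environment from nothing."""
    cloudformation.create_stack(
        StackName=STACK_NAME,
        TemplateURL=TEMPLATE_URL,
        Parameters=[{"ParameterKey": "AppVersion", "ParameterValue": app_version}],
        Capabilities=["CAPABILITY_IAM"],
    )
    cloudformation.get_waiter("stack_create_complete").wait(StackName=STACK_NAME)

def destroy_test_environment() -> None:
    """Tear it down once testing (or the blue/green cutover) is complete."""
    cloudformation.delete_stack(StackName=STACK_NAME)
    cloudformation.get_waiter("stack_delete_complete").wait(StackName=STACK_NAME)
```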
