Amazon Web Services (AWS) publishes its view of best-practice architectures, and has historically performed Well Architected Reviews for key customers. Recently, Amazon has enlisted a small group of partners to join them in performing these reviews. AWS invited Foghorn, we happily accepted, and we have recently been approved. We are excited to offer this service to new and existing customers because of the positive impact it can have on the value our customers get from public cloud. You might ask, ‘What benefits?’
Well Architected Benefits
The benefits of being well architected are very well documented. In general, they follow the 5 pillars of the Well Architected Framework:
- Operational Excellence
- Security
- Reliability
- Performance Efficiency
- Cost Optimization
You can imagine how each of these areas can be positively impacted by architecting your workload in a way that takes full advantage of a cloud infrastructure provider like AWS. If you can’t, check this out.
Why Foghorn? Why Well Architected?
Well, I guess you’d have to ask AWS that. I’d like to think it has something to do with the fact that we’ve been successfully doing similar work for the past 6 years or so. The more important question is: why is Foghorn participating? First of all, I love the idea of following a standard framework. It allows our customers to understand what we are going to do, and what they are going to get, even if they are new to AWS, new to Foghorn, or new to the public cloud entirely. Second, it allows us to finely hone our delivery process, templates, deliverables, and associated tooling. Although most consulting projects are bespoke, there are great benefits to be had when services can be standardized and productized. These always include two big ones:
- Higher Quality of Work
- Faster Delivery Time
How does a Well Architected Review Work?
The process is pretty painless, but requires involvement from both business and technical stakeholders. It is performed on a workload-by-workload basis. We usually start with a 2-4 hour workshop where we run through a provocative list of questions. Bring your architecture diagrams; they will be pinned up on the wall and relied on heavily! If you don’t have them, no worries, we’ll work our way through it anyway.
We then take away a bunch of information, and sometimes some read-only credentials if you’d like us to take a look in the account and make sure the info we gather is in line with reality. We compile the information into 2 main deliverables:
We deliver a detailed scorecard on several areas in each of the pillars of the Well Architected Framework. We rank these on severity, and make them easy to digest.
We won’t just tell you what’s broken, we’ll also tell you how to fix it! Oh, and by the way, we are not the kind of consulting company that makes recommendations and then walks away. The engineering staff at Foghorn is second to none. If you’ve enjoyed working with us during the Well Architected Review, we hope you choose us to help you optimize in the areas that are most important to your business. We take on all challenges.
What Might we Find?
Everyone likes to be in line with best practices in the industry, but will getting there really be that valuable to your business? It obviously depends on what we find, but I thought I’d give a few examples of what we’ve done in the past to give you an idea of the potential:
One of the biggest things we do when it comes to operations is help identify the areas that are blocking companies from treating their infrastructure like ‘cattle’ instead of ‘pets’. Often, small transformations can get a company over the hump. What can this mean? Well, if I’m managing 100 servers as pets, that means monitoring 100 servers, backing up 100 local volumes, etc. If I can instead monitor one endpoint, one autoscaling group, and one set of configuration management code that creates my custom AMI, well, that’s a LOT easier, and much less expensive.
The main reason we are asked to do a well architected review is to ensure that the security posture of a workload is strong. There are lots of horror stories out there about a misconfigured S3 bucket causing a huge data leak. Unlike the other pillars, having a poorly architected security pillar is not simply inconvenient or expensive. It’s a silent killer. You may have no symptoms until the issue is exposed, and a single issue can literally kill a company.
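To make the misconfigured-bucket risk concrete, here is a minimal sketch of one check a reviewer might run against a bucket policy document: flag any Allow statement that grants access to every principal with no Condition attached. The function names, the example bucket, and the example role are hypothetical, and a real review would also look at ACLs and Block Public Access settings, which this sketch ignores.

```python
import json


def _contains_star(value) -> bool:
    """True if a Principal value is, or includes, the wildcard '*'."""
    if isinstance(value, str):
        return value == "*"
    if isinstance(value, list):
        return "*" in value
    return False


def wildcard_allow_statements(policy_json: str) -> list[dict]:
    """Return Allow statements open to any principal with no Condition."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # A policy may hold a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]

    risky = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        is_wildcard = principal == "*" or (
            isinstance(principal, dict) and _contains_star(principal.get("AWS"))
        )
        if is_wildcard and "Condition" not in stmt:
            risky.append(stmt)
    return risky


# Hypothetical policy used only for illustration.
EXAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamRead", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:role/example-role"},
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

for stmt in wildcard_allow_statements(EXAMPLE_POLICY):
    print("Open to everyone:", stmt["Sid"])
```

Only the first statement is reported, because the second one names a specific role as its principal.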
AWS gives great options to build highly reliable workloads, but it designs its services on the assumption that your workload is well architected. So although EC2 can be made highly reliable, AWS expects you to architect in the ability to lose an instance (i.e., a single instance is not reliable), or an entire AZ. We often run across companies who have a critical function running in a single availability zone. This may be a simple micro-service that has been very reliable over the years, but is at risk of an outage at any time. A well architected review is a great opportunity to have a second set of eyes looking for these small oversights which have the potential to cause site-wide outages.
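The single-AZ check is simple enough to sketch. Assuming you already have an inventory of running instances as (service, availability zone) pairs, the function below lists every service whose instances all sit in one zone. The inventory format, service names, and function name are hypothetical; in practice the pairs would come from your own tagging and an account export.

```python
from collections import defaultdict


def single_az_services(instances):
    """Return services whose instances all run in a single availability zone.

    instances: iterable of (service_name, availability_zone) pairs.
    """
    zones = defaultdict(set)
    for service, az in instances:
        zones[service].add(az)
    return sorted(name for name, azs in zones.items() if len(azs) < 2)


# Hypothetical inventory used only for illustration.
INVENTORY = [
    ("web", "us-east-1a"),
    ("web", "us-east-1b"),
    ("thumbnailer", "us-east-1a"),
    ("thumbnailer", "us-east-1a"),
]

print(single_az_services(INVENTORY))
```

Here `thumbnailer` is flagged: it has two instances, but losing one zone would take out both.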
The best cloud example of performance efficiency is effectively scaling resources to match demand. Although AWS gives us lots of tools to do this, namely autoscaling, there are often application constraints which make it difficult or impossible to implement autoscaling and reliably achieve the required performance. The result is often over-provisioning. Although this type of recommendation usually includes some application transformation, sometimes a little transformation can go a long, long way. For compute-intensive workloads, moving to a queue-based, fan-out architecture can fundamentally change your business.
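As a rough illustration of the queue-based, fan-out idea, the sketch below pushes jobs onto a shared queue and lets a pool of workers pull from it. It uses Python’s in-process `queue.Queue` and threads purely as a stand-in; in a real workload the queue would more likely be a managed service and the workers an autoscaled fleet. The `fan_out` function and its parameters are hypothetical names for this example.

```python
import queue
import threading


def fan_out(jobs, handler, worker_count=4):
    """Run handler(job) for every job using workers that share one queue.

    Jobs must not be None, because None is used as the stop signal.
    Returns a list of (job, outcome) pairs in completion order.
    """
    work = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = work.get()
            if job is None:  # stop signal: no more work for this worker
                work.task_done()
                return
            try:
                outcome = handler(job)
            except Exception as exc:  # keep the worker alive, report failure
                outcome = exc
            with lock:
                results.append((job, outcome))
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:
        work.put(job)
    for _ in threads:
        work.put(None)  # one stop signal per worker
    work.join()
    for thread in threads:
        thread.join()
    return results


squares = fan_out(range(5), lambda n: n * n)
print(sorted(squares))
```

The point of the design is that producers never talk to a specific worker. Adding capacity means adding workers, and the queue depth becomes a natural signal for when to scale.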
There’s lots of low-hanging fruit when it comes to cost optimization. Often great cost savings can be found by simply selecting a different purchase option, or swapping some resources for less expensive ones that can do the job equally well. With a little transformation, storage and compute costs can be drastically cut, often by up to 90%.
Ironically, when we ask customers about the value of a well architected review, cost optimization is rarely at the top of the priority list, but a high percentage of the recommendations result in cost optimization. Why? Other than the low-hanging fruit mentioned above, we often see that high cloud costs are a result of architectural mistakes in the other pillars. For example, customers who frequently suffer from load-based cascading failures of their front end will naturally defend against these outages by over-provisioning. Customers who are running fleets of stateful EC2 instances see rapidly rising EBS snapshot costs, along with complicated and brittle scripts for taking, copying, and pruning snapshots to meet their operations SLAs. These costs and this complexity disappear after transforming and migrating that persistence to a managed storage service like S3.
So often, money is a band-aid for sites that otherwise would be unreliable, non-performant, and operationally inefficient. Fix the problem, and costs also come down!
How do I Sign Up?
Interested? Give us a shout. We’d be happy to get you more information on how we can help.