At re:Invent 2013, Amazon Web Services launched CloudTrail, a service that exposes the log files of all API actions that occur within the AWS accounts of those who enable the service. Some might wonder why they would need such a service, as they’ve gotten along just fine up to now without it. Let me assure you that this service was created because of specific customer demand. We have customers who wouldn’t go live with production workloads without it, which is how we ended up working with AWS to integrate Foghorn Web Services with CloudTrail prior to its official launch.
Simply put – without CloudTrail enabled, there is no way to monitor or audit who is doing what in your AWS account. Sure, you can set up your user rights to ensure that people are only allowed to do what they need to do. But in the event that something is changed, without CloudTrail there is no way to know who changed it, when it changed, or if it had always been that way (i.e. it hadn’t changed).
Let’s take a hypothetical situation. An online retailer has their site hacked and customer credit card information stolen. Upon inspection of system log files, it comes to light that a web server was running FTP and was exploited by an IP address located in China. But the web server security group does not allow FTP traffic from any public IP addresses. Furthermore, the ACL associated with the VPC subnet that houses the web servers also blocks all FTP traffic. How can this be? With CloudTrail, the retailer is able to see that the security group and ACL were both modified prior to the incident by an IAM user account that is protected by multi-factor authentication, and then returned to their original state after the incident. Upon investigation, it comes to light that the employee associated with the IAM user account is still in possession of his MFA device. Case closed.
Is it just for Security?
Audit trails and security compliance are the obvious driving factors for a service like CloudTrail. But innovative companies will use CloudTrail for much more. It can be used to help diagnose performance or stability events, and can help AWS customers optimize their operations to increase stability and performance, reduce cost, and comply with governance principles.
Great, I’ve enabled CloudTrail, now what?
CloudTrail is available for all customers; the only costs are for the resources required to run it (SQS, SNS, S3 storage, etc.). The setup is fairly simple: just follow the wizard in the CloudTrail portal. Shortly after that, your designated S3 bucket will begin receiving log files every few minutes or so. These log files are in JSON format, and they are not aggregated into a single file, so to make analysis possible, you’ll need to process the raw data and store it in a queryable system. One of the key setup options is whether you would like AWS to notify you when a new CloudTrail file has been delivered. AWS will deliver these notifications to an SNS topic of your choosing. This makes it possible for you to easily automate the download and ingestion of the raw JSON files into the data analytics platform of your choosing.
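To give a feel for the processing step, here is a minimal Python sketch that flattens one delivered log file into rows ready for loading into a queryable store. It assumes the file arrives gzip-compressed with a top-level "Records" array, and that each record carries fields such as eventTime, eventName and userIdentity. The function name and the output column names are our own, made up for illustration, and fetching the object from S3 is left out.

```python
import gzip
import json


def parse_cloudtrail_file(raw_bytes):
    """Flatten one gzipped CloudTrail log file into a list of row dicts.

    Assumption: the file is gzip-compressed JSON with a top-level
    "Records" array. Output keys are illustrative, not a CloudTrail schema.
    """
    document = json.loads(gzip.decompress(raw_bytes).decode("utf-8"))
    rows = []
    for record in document.get("Records", []):
        identity = record.get("userIdentity", {})
        rows.append({
            "time": record.get("eventTime"),
            "service": record.get("eventSource"),
            "action": record.get("eventName"),
            # Fall back to the ARN when no IAM user name is present.
            "user": identity.get("userName") or identity.get("arn"),
            "source_ip": record.get("sourceIPAddress"),
            "region": record.get("awsRegion"),
        })
    return rows
```

Each row can then be inserted into whatever database or search index you prefer. Missing fields simply come through as None, so a record with an unexpected shape does not stop the ingest.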
At Foghorn, we’ve decided to add one additional component – SQS. Instead of notifying our application directly, we have configured our SNS topic to post to an SQS queue, which gives us some increased flexibility around building a stateless, horizontally scalable method of managing our CloudTrail logs. I’ve included a screenshot of our portal interface to give you an idea of what can be done with the raw JSON log files to make it a bit easier to extract meaningful information.
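For readers who want to try the SNS-to-SQS arrangement themselves, the sketch below shows how a worker might work out which log files a queue message refers to. It is not our production code. It assumes the SQS message body is the standard SNS JSON envelope (raw message delivery turned off), and that the CloudTrail notification inside its "Message" field carries an "s3Bucket" name and an "s3ObjectKey" list. The function name is hypothetical, and polling the queue and deleting handled messages are left out.

```python
import json


def log_locations_from_sqs_body(body):
    """Return (bucket, key) pairs named in one SNS-wrapped CloudTrail notification.

    Assumptions: `body` is the SNS JSON envelope as delivered to SQS, and
    the inner notification uses the "s3Bucket" and "s3ObjectKey" fields.
    """
    envelope = json.loads(body)
    # SNS delivers the original notification as a JSON string in "Message".
    notification = json.loads(envelope["Message"])
    bucket = notification["s3Bucket"]
    return [(bucket, key) for key in notification.get("s3ObjectKey", [])]
```

Because the function keeps no state between calls, any number of workers can read from the same queue, and a message that fails to process simply becomes visible again for another worker to pick up.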