Security can be painful sometimes, and it certainly can get in the way of moving fast and breaking things. But a loose approach to development environments does not have to result in a poorly secured production environment. A digital healthcare company needed an efficient way to maintain AWS least-privilege access for its microservices and reached out to Foghorn for a solution. This case study walks through how we achieved that.
AWS has kindly documented best practices for using EKS, and we will focus on one section in particular: Security, and specifically Identity and Access Management. Its Recommendations section raises a few concepts worth noting before we describe the framework we implemented to address some of them. We focus specifically on Kubernetes Service Accounts and IAM Roles for Service Accounts (IRSA) as applied to least-privilege AWS access for pods.
For some context, let’s consider the following use case. You are a digital healthcare company with a mobile app that uses backend Java microservices that will be running in EKS. Those microservices will leverage the AWS Java SDK to access DynamoDB and S3 resources. Security mandates that each use case (in our scenario each microservice accessing AWS resources) use its own KMS key. Your infrastructure is currently in Terraform.
To make it simple for a Terraform author to create the resources a microservice needs, along with the IAM Role and Policies that grant access to them, we created a module that takes the outputs of those resources (ARNs, etc.) and dynamically populates the IAM Policy attached to the Role. The Role itself is already set up to be assumed through the EKS OIDC Provider.
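To make "set up to be assumed through the EKS OIDC Provider" concrete, here is a minimal Java sketch of the trust policy an IRSA role typically carries. The source does not show the module's internals, so this is only an illustration: the class name `IrsaTrustPolicy`, the namespace `backend`, the service account `patient-records`, and the example ARN are all hypothetical. The sketch does no JSON escaping and assumes its inputs are plain ARNs and Kubernetes names.

```java
// Hypothetical helper: renders the trust policy JSON for an IRSA role.
final class IrsaTrustPolicy {

    /** Extracts the issuer (host and path) from an IAM OIDC provider ARN. */
    static String issuerFromProviderArn(String oidcProviderArn) {
        String marker = ":oidc-provider/";
        int idx = oidcProviderArn.indexOf(marker);
        if (idx < 0) {
            throw new IllegalArgumentException("Not an OIDC provider ARN: " + oidcProviderArn);
        }
        return oidcProviderArn.substring(idx + marker.length());
    }

    /** Builds a trust policy that only the given service account can use. */
    static String build(String oidcProviderArn, String namespace, String serviceAccount) {
        String issuer = issuerFromProviderArn(oidcProviderArn);
        // The "sub" claim of the projected token identifies the service account.
        String subject = "system:serviceaccount:" + namespace + ":" + serviceAccount;
        return """
            {
              "Version": "2012-10-17",
              "Statement": [{
                "Effect": "Allow",
                "Principal": {"Federated": "%s"},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {"StringEquals": {
                  "%s:sub": "%s",
                  "%s:aud": "sts.amazonaws.com"
                }}
              }]
            }
            """.formatted(oidcProviderArn, issuer, subject, issuer);
    }

    public static void main(String[] args) {
        // Example values only; substitute your own provider ARN and names.
        String arn = "arn:aws:iam::111122223333:oidc-provider/"
                + "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE";
        System.out.print(build(arn, "backend", "patient-records"));
    }
}
```

The `sub` condition is what ties the role to a single service account, so a pod running under any other service account cannot assume it.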
At a high level, we created the following (this is a per-microservice subset to illustrate the example; the actual solution has many microservices, each with its own resources and policy):
Let’s take a closer look at each piece. We started by creating, in Terraform, the resources the microservice will interact with (using whatever module or pattern is currently in use for a per-environment deployment): a new DynamoDB Table, a new S3 Bucket, and finally a KMS key for the microservice to use for both. We feed the outputs of those resources into our IRSA module, which creates the microservice role and a policy that allows only the specified actions against the microservice’s DynamoDB Table and S3 Bucket, and only allows KMS actions on the KMS key we provisioned for that microservice. The module does this by keeping a set of common usage actions for each resource type and building the least-privilege policy dynamically from the resources passed in. Finally, the module also handles the IRSA component: it takes in the service account(s) and the OIDC provider ARN and completes the steps needed for the IRSA setup to work. We now have least privilege not just from an AWS actions standpoint, but also from an AWS resources standpoint, including encryption via KMS.
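The "common actions per resource type, applied to the resources passed in" pattern can be sketched in a few lines. The real module is written in Terraform and is not shown in the source, so this Java version is only an illustration of the logic. The class name `LeastPrivilegePolicy`, the action lists chosen for each resource type, and the example ARNs are assumptions, not the module's actual contents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of building a least-privilege policy from resource ARNs.
final class LeastPrivilegePolicy {

    /** Assumed "common usage" actions per resource type; adjust to your needs. */
    enum ResourceType {
        DYNAMODB_TABLE("", "dynamodb:GetItem", "dynamodb:PutItem",
                "dynamodb:UpdateItem", "dynamodb:DeleteItem", "dynamodb:Query"),
        // Object-level S3 actions apply to "<bucket-arn>/*", not the bucket ARN.
        S3_BUCKET_OBJECTS("/*", "s3:GetObject", "s3:PutObject", "s3:DeleteObject"),
        KMS_KEY("", "kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey");

        final String arnSuffix;
        final List<String> actions;

        ResourceType(String arnSuffix, String... actions) {
            this.arnSuffix = arnSuffix;
            this.actions = List.of(actions);
        }
    }

    record Resource(ResourceType type, String arn) {}

    /** Renders one Allow statement scoped to a single resource. */
    private static String statement(Resource r) {
        String actions = r.type().actions.stream()
                .map(a -> "\"" + a + "\"")
                .collect(Collectors.joining(", "));
        return """
                {
                  "Effect": "Allow",
                  "Action": [%s],
                  "Resource": "%s"
                }""".formatted(actions, r.arn() + r.type().arnSuffix);
    }

    /** Builds the policy document: one statement per resource, no wildcards. */
    static String build(List<Resource> resources) {
        List<String> statements = new ArrayList<>();
        for (Resource r : resources) {
            statements.add(statement(r));
        }
        return "{\n\"Version\": \"2012-10-17\",\n\"Statement\": [\n"
                + String.join(",\n", statements) + "\n]\n}";
    }

    public static void main(String[] args) {
        // Example ARNs only; in the Terraform module these come from resource outputs.
        System.out.println(build(List.of(
                new Resource(ResourceType.DYNAMODB_TABLE,
                        "arn:aws:dynamodb:us-east-1:111122223333:table/patient-records"),
                new Resource(ResourceType.S3_BUCKET_OBJECTS,
                        "arn:aws:s3:::example-patient-docs"),
                new Resource(ResourceType.KMS_KEY,
                        "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"))));
    }
}
```

Because every statement names a specific ARN, adding a microservice means passing in its own resources rather than widening an existing policy.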
At the pod level, we simply add the `serviceAccountName` to our deployment. The magic of Service Account Token Volume Projection then enables our pod to assume our IAM role through `sts:AssumeRoleWithWebIdentity`, simply by leveraging the default credentials provider in the Java SDK. The token is rotated automatically, and the Java SDK reloads it after rotation.
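When debugging this setup, it can help to confirm from inside the pod that the IRSA wiring is actually present. With IRSA, EKS injects the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables, which the SDK's default credentials provider picks up. The sketch below is a hypothetical startup check using only the Java standard library. The class name `IrsaEnvironmentCheck` is an assumption, and the code does not call AWS or the SDK.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Optional;

// Hypothetical diagnostic: verifies the IRSA environment inside a pod.
final class IrsaEnvironmentCheck {
    static final String ROLE_ARN_VAR = "AWS_ROLE_ARN";
    static final String TOKEN_FILE_VAR = "AWS_WEB_IDENTITY_TOKEN_FILE";

    /**
     * Returns the role ARN when both variables are set and the projected
     * token file is readable; otherwise returns an empty Optional.
     */
    static Optional<String> configuredRole(Map<String, String> env) {
        String roleArn = env.get(ROLE_ARN_VAR);
        String tokenFile = env.get(TOKEN_FILE_VAR);
        if (roleArn == null || roleArn.isBlank()
                || tokenFile == null || tokenFile.isBlank()) {
            return Optional.empty();
        }
        return Files.isReadable(Path.of(tokenFile))
                ? Optional.of(roleArn)
                : Optional.empty();
    }

    public static void main(String[] args) {
        // Taking the environment as a parameter keeps the check easy to test.
        System.out.println(configuredRole(System.getenv())
                .map(arn -> "IRSA is configured for role " + arn)
                .orElse("IRSA is not configured for this pod"));
    }
}
```

If the check reports that IRSA is not configured, the usual suspects are a missing `serviceAccountName` on the deployment or a service account that is not associated with the role.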
While it may be easier to simply aggregate all of the AWS actions needed into the EKS worker IAM Instance Profile, this is far from ideal in terms of least privilege, even if you also scoped it to all the in-scope resources. And since pods (if configured correctly) do not use the IAM Instance Profile of the worker nodes, the Instance Profile itself can be restricted, which also gives you a way to audit for pods that still try to use it (and to configure the proper security for them once found). This leaves your EKS worker nodes with only the IAM policies necessary to manage the worker instances and pull ECR images, and no permissions to the resources provisioned for the various microservices running inside the EKS cluster.
If you are looking to enhance your AWS EKS security profile, please reach out to the Foghorn team and we would love to learn more about your workloads and areas where we can improve your security posture.