🔥Let’s Do DevOps: Terraform S3 Policies Construction with Home Folders
This blog series focuses on presenting complex DevOps projects as simple and approachable via plain language and lots of pictures. You can do it!
All code for you to build this new pattern is provided as a GitHub link at the bottom of this article ;)
Hey all!
Building S3 policies is hard. There are lots of stanzas needed for even simple use cases, and the potential for doing something wrong and exposing data to the wrong partner, or even to the internet, is ever-present.
That’s pretty scary for folks in any industry, but for regulated industries like finance and healthcare, it can be a Lawsuit Generating Event! 😬
Which means most companies have two choices: require that only senior engineers touch these policies due to risk, or accept the risk that a junior engineer will accidentally assign the wrong policy.
However, I have a new way to share — using Terraform in an iterative constructor pattern to build the policies for you, by passing Terraform only the tiniest amount of information.
Here’s what you’d pass your iterator to build out per-partner access on an S3 bucket with a home folder and automatic data and versioning clean-up. That’s pretty cool and easy, isn’t it? And the names and ARNs for the partner user could even be strings to make life even simpler!
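As a rough sketch of that input, the map might look something like this. The variable name partners, the account IDs, and the user names are placeholders of mine, not values from the repo, and the clean-up settings are left out:

```hcl
# Hypothetical input: one entry per partner.
# The map key doubles as the partner's home folder name.
partners = {
  Partner1 = {
    principals = ["arn:aws:iam::111111111111:user/partner1-user"]
  }
  Partner2 = {
    principals = ["arn:aws:iam::222222222222:user/partner2-user"]
  }
}
```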
Let’s first talk about how you’d do this by hand, writing all the IAM policies yourself, before we go over how to automate it.
Doing it By Hand
S3 is an AWS service that’s used for storing and sharing files. Access to both the data plane (files) and the control plane (to make changes) is controlled via IAM policies, which are long JSON documents.
Our goal with this policy is to:
Permit a specific AWS IAM User ARN to access a bucket
Permit the user to access only the target folder, no other users’ folders
To do that, we have to grant a few permissions:
s3:ListBucket on the root directory of the bucket — This permits the partner to connect to the S3 bucket directly and view the folder we’ll permit them to access, rather than remembering their folder name to connect directly to it
s3:GetObject and s3:DeleteObject recursively on the partner’s folder path, usually the partner’s name (but not always) — This permits the partner to access and delete objects in their bucket folder path recursively
s3:PutObject on the partner’s folder path with a conditional to assign bucket owner full control — This permits the partner to put files into the bucket, and the conditional tells the S3 service to change ownership to the bucket owner (us!)
Deny s3:ListBucket access to the bucket and folder paths — This blocks seeing any other folder paths in the bucket, so our partners aren’t even aware there are other clients (or what their names are!)
Let’s look at what each policy looks like. As we iterate through these, keep in mind what level of engineer you’d be comfortable with adding an entirely new partner — entry level, medium level, senior.
s3:ListBucket on the root directory of the bucket
First, let’s let our partners access the bucket and list their folders only. If we are copying and pasting a working stanza from another client, we have to make sure to update:
The AWS principal, which is the partner’s IAM User ARN
The folder path the client is permitted to access
If you miss either of these, you risk a partner being able to read the file names of a different partner, which might leak data.
Here’s what the stanza looks like:
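A rough sketch of that stanza, written as a Terraform object that mirrors the JSON statement key for key (passing it through jsonencode() yields the JSON). The bucket name, account ID, and user name are placeholders, and the exact prefix condition is my assumption:

```hcl
# Hand-written for ONE partner; every literal below is a placeholder.
locals {
  partner1_allow_list = {
    Sid       = "Partner1AllowList"
    Effect    = "Allow"
    Principal = { AWS = "arn:aws:iam::111111111111:user/partner1-user" } # update per partner
    Action    = "s3:ListBucket"
    Resource  = "arn:aws:s3:::example-bucket" # ListBucket is granted on the bucket itself
    Condition = {
      StringLike = {
        "s3:prefix" = ["Partner1/*"] # update per partner
      }
    }
  }
}
```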
s3:GetObject and s3:DeleteObject recursively on the partner’s folder path
Next, we need to permit the partner to GetObject (download) and DeleteObject (delete) files and folders recursively in their folder path only.
Just like the above access, if we are copying and pasting a working stanza from another client, we have to make sure to update:
The AWS principal, which is the partner’s IAM User ARN
The folder path the client is permitted to access
If you miss either of these, you risk a partner being able to read and delete someone else’s data, which is a data leak. Scary!
Here’s what that stanza looks like:
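Sketched the same way, as a Terraform object shaped like the JSON statement. All names and IDs are placeholders; the trailing /* on the resource is what makes the grant recursive under the folder path:

```hcl
# Hand-written for ONE partner; every literal below is a placeholder.
locals {
  partner1_get_delete = {
    Sid       = "Partner1GetDelete"
    Effect    = "Allow"
    Principal = { AWS = "arn:aws:iam::111111111111:user/partner1-user" } # update per partner
    Action    = ["s3:GetObject", "s3:DeleteObject"]
    Resource  = "arn:aws:s3:::example-bucket/Partner1/*" # update per partner
  }
}
```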
s3:PutObject on the partner’s folder path with a conditional
Next, we want to permit our partner to upload files to only the folder path assigned to them, recursively. Note the conditional here. Conditionals control access, and this one only allows an upload when the request carries the bucket-owner-full-control ACL. Any upload without it is refused, so every file that lands in the bucket gives the bucket owner (us!) full control over it.
How interesting is it that a plain access condition doubles as an ownership safeguard? Super cool
Just like the above access, if we are copying and pasting a working stanza from another client, we have to make sure to update:
The AWS principal, which is the partner’s IAM User ARN
The folder path the client is permitted to access
If you miss either of these, you risk a partner being able to upload data to someone else’s folder path, which could be confusing or lead to a data leak incident where partners share data with one another by accident.
Here’s what that stanza looks like:
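A sketch of the upload stanza in the same object form. The names are placeholders; s3:x-amz-acl is the S3 condition key for the canned ACL sent with the upload, and I'm assuming that is the condition used here:

```hcl
# Hand-written for ONE partner; every literal below is a placeholder.
locals {
  partner1_put = {
    Sid       = "Partner1Put"
    Effect    = "Allow"
    Principal = { AWS = "arn:aws:iam::111111111111:user/partner1-user" } # update per partner
    Action    = "s3:PutObject"
    Resource  = "arn:aws:s3:::example-bucket/Partner1/*" # update per partner
    Condition = {
      StringEquals = {
        # Uploads must include this canned ACL or they are not allowed.
        "s3:x-amz-acl" = "bucket-owner-full-control"
      }
    }
  }
}
```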
Deny s3:ListBucket access to the bucket and folder paths
Next, we want to block any partner from being able to read the names of other partners’ folders. This likely isn’t needed — we’re only granting access above to their own folder, but this is a nice security back-stop in case access is accidentally granted.
Explicit IAM denies take precedence over allows, so if a deny matches, the access will be blocked.
Here’s what that stanza looks like. Note the 2 conditionals, and the folder and ARN names that need to be updated.
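One way the deny could be put together, again as an object shaped like the JSON statement. The two conditions below are my guess at the pairing (who is asking, and which prefix they asked for), and every name is a placeholder. Multiple condition operators in one statement must all match for the deny to apply:

```hcl
# Hand-written for ONE partner; every literal below is a placeholder.
locals {
  partner1_deny_other_lists = {
    Sid       = "Partner1DenyListOtherFolders"
    Effect    = "Deny"
    Principal = "*"
    Action    = "s3:ListBucket"
    Resource  = "arn:aws:s3:::example-bucket"
    Condition = {
      # Condition 1: the request comes from this partner...
      StringEquals = {
        "aws:PrincipalArn" = "arn:aws:iam::111111111111:user/partner1-user" # update per partner
      }
      # Condition 2: ...and the listing is NOT scoped to their own folder.
      StringNotLike = {
        "s3:prefix" = ["Partner1/*"] # update per partner
      }
    }
  }
}
```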
WOW
So. That’s a lot of stanzas that aren’t terribly obvious, right? Especially the conditions, and there’s even one condition that exists to force an upload setting rather than to decide who gets in.
Unless you have experience with S3, you won’t be able to write this from scratch, and unless you’re very careful you’ll miss a partner ARN and folder path pairing when you’re copying and pasting for a new client.
Clearly, this is a risk for your business. It would be so very much better if we could construct these policies, rather than requiring our engineers to write them by hand.
Enter: Terraform!
Terraform is an Infra as Code tool that lets us use declarative configurations to build resources, including IAM resource policies. This is amazing since no one writes JSON on purpose.
Terraform’s AWS provider has an aws_iam_policy_document data source that permits us to build a stanza of IAM policy. We can specify the exact permissions, and ingest the user ARN as a variable.
We can pair that with the for_each argument to build out a few stanzas, say, one for each partner.
The following terraform looks scary at first compared to the IAM, but keep in mind that you’d write this exactly one time. Every future update to this configuration would be to send another variable pair (Partner User ARN + home directory name) only. There’s no more writing IAM once this is implemented.
Each of the blocks looks very similar to the above, so let’s just look at one. Again, all the code is linked at the bottom of the article, if you’d like to copy and paste this module in its entirety — heck yeah.
Note the try() for the s3:prefix condition. That’s a lookup in case the partner’s folder path doesn’t match the partner’s name. This permits us to write tweaked policies if someone needs to access a folder path whose name differs slightly from their own. Think PartnerA needs to access folder path partner_a or similar. We can do that easily, woot.
Note that this is a “data” source, which means it won’t do anything in your environment. It’s only a method of organizing data within Terraform itself. We’ll need to attach the resulting policy to the bucket somehow in order for it to take effect.
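Here's a minimal sketch of what one of these iterated blocks could look like. The s3_allow_list name comes from this article; var.partners, aws_s3_bucket.partner_bucket, and the sid format are names I made up for illustration, and the module in the repo may differ:

```hcl
# Sketch only: var.partners and aws_s3_bucket.partner_bucket are assumed names.
data "aws_iam_policy_document" "s3_allow_list" {
  # One policy document per partner in the map.
  for_each = var.partners

  statement {
    # Sids must be alphanumeric, so this assumes partner names are too.
    sid       = "${each.key}AllowList"
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.partner_bucket.arn]

    principals {
      type        = "AWS"
      identifiers = each.value.principals
    }

    condition {
      test     = "StringLike"
      variable = "s3:prefix"
      # Use folder_name when the partner entry sets it, else the map key.
      values = ["${try(each.value.folder_name, each.key)}/*"]
    }
  }
}
```

Adding a partner then means adding one map entry; the statement body never changes.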
Super-Policy!
So we’ve built out n stanzas above for each client. It could be A LOT. We now have a few dozen stanzas we need to combine. Thankfully, Terraform and the AWS provider have some very cool tricks to help us.
First, we use the same aws_iam_policy_document data source and provide it a source_policy_documents argument. That is built to take lots of JSON IAM policies and combine them into a single valid policy.
However, each of our policies is built a bunch of times. We have lots of clients, so we build out the s3_allow_list policy a lot of times. So we use the for loop construction in Terraform. What that means is we wrap our for loop in square brackets, to structure it as a list (or array), and we tell it to iterate over each instance of the data block and print out the .json attribute of each. We run that for every single IAM stanza data meta-resource.
We also have one other trick that might make this module more resilient, try(). Try says to attempt to do what we ask, and if it fails, pass a null instead. So if we decide to turn off any of these access permissions, we wouldn’t need to update the combined meta-resource. It would try to do that, fail, and instead pass a null, which the flatten() would entirely get rid of.
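A sketch of how the combining step might be wired up. The combined and s3_allow_list names come from this article; the other data block names and the bucket resource are hypothetical, and those blocks are assumed to be defined elsewhere in the module. This sketch uses an empty list as the try() fallback, which is my choice, since flatten() collapses empty lists away:

```hcl
# Sketch only: s3_get_delete, s3_put, s3_deny_list and
# aws_s3_bucket.partner_bucket are assumed names.
data "aws_iam_policy_document" "combined" {
  source_policy_documents = flatten([
    # Each for expression yields a list of JSON strings, one per partner.
    try([for doc in data.aws_iam_policy_document.s3_allow_list : doc.json], []),
    try([for doc in data.aws_iam_policy_document.s3_get_delete : doc.json], []),
    try([for doc in data.aws_iam_policy_document.s3_put : doc.json], []),
    try([for doc in data.aws_iam_policy_document.s3_deny_list : doc.json], []),
  ])
}

# The data source alone changes nothing; this attaches the result to the bucket.
resource "aws_s3_bucket_policy" "partner_bucket" {
  bucket = aws_s3_bucket.partner_bucket.id
  policy = data.aws_iam_policy_document.combined.json
}
```

Merging only works if every statement Sid is unique across the source documents, which is one reason to build the partner name into each Sid.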
Writing resilient terraform modules is a great business best practice, and also guarantees you can post it and then never look at it again. Look at you, you superstar.
Use the Module!
So now we have an amazing module that’s prepared to iterate over all our partners, each partner’s IAM User ARN, and potentially the updated folder path.
I chose a map of maps as a data storage method: we have each partner’s name as the map key, Partner1. We assume this is the folder name also, but permit over-writing it with the folder_name argument (not shown in this snapshot).
Then we pass in the principals list argument. This is a list of ARNs in case a partner has multiple IAM roles that need to access the folder. After all, some partners want to have different roles do different things. The module fully supports a list of principals and will expand our IAM policy accordingly.
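To make that concrete, here's a hedged sketch of a module call with a folder_name override and a partner with two roles. The module path, the partners variable name, and every ARN are placeholders I invented:

```hcl
# Inside the module (hypothetical variables.tf).
# type = any keeps each partner's exact attributes, so a missing
# folder_name is an error that try() can catch and fall back from.
variable "partners" {
  description = "Map of partner name to settings: principals, optional folder_name."
  type        = any
}

# In the calling configuration.
module "partner_bucket" {
  source = "./modules/s3_partner_bucket" # hypothetical path

  partners = {
    # Folder name defaults to the map key: Partner1/
    Partner1 = {
      principals = [
        "arn:aws:iam::111111111111:role/partner1-upload",
        "arn:aws:iam::111111111111:role/partner1-reporting",
      ]
    }
    # Folder name overridden: PartnerA gets partner_a/
    PartnerA = {
      folder_name = "partner_a"
      principals  = ["arn:aws:iam::222222222222:user/partner-a-user"]
    }
  }
}
```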
Summary
And that’s it! We only send the map of maps to the module, and we don’t have to write ANY IAM POLICY at all, which is incredible. You as an engineer or PR reviewer have way more time, get better sleep, and your business is protected from leaking data with a mis-type on these long, confusing policies.
And that’s pretty amazing.
All Terraform code used to construct these IAM bucket policies and bucket lifecycle configurations can be found in GitHub here:
GitHub - KyMidd/CloudSecNext2022-TfIamConstruction: Don't build IAM by hand, use TF for that…
This repo is created to pair with CloudSecNext 2022 presentation titled "Zero Trust: Building IAM with Terraform" AWS…github.com
And keep an eye out for this talk at CloudSecNext2022, where I’ll be presenting to the SANS audience for CPE credits :)
Good luck out there!
kyler