Software Engineer Intern

When: May 2019 - Aug 2019
Where: Sunnyvale

During my internship at Proofpoint, I worked with Proofpoint's Infrastructure Engineering team to optimize their AWS costs.

Some highlights:

1. Participated in the company-wide hackathon within my first week

2. Presented my projects to the CFO of Proofpoint, Paul Auvil

3. Won Best Table Topics award at my first meeting of Proofpoint's Toastmasters club

My Proofpoint Internship Experience: Working on AWS Cost Optimization

During my internship at Proofpoint, I worked with Proofpoint's Infrastructure Engineering team to optimize their AWS costs.

Hold on...

What is AWS?
AWS stands for Amazon Web Services, which provides a variety of cloud computing services to help with things like computation, storage, databases, networking, security, machine learning, AI, and much, much more. 
How does Proofpoint use AWS?
Proofpoint uses a variety of AWS's products, the most popular being EC2 (Elastic Compute Cloud), which provides virtual computing environments (known as instances) for customers to run their applications on. Proofpoint runs a large portion of its services on these EC2 instances and uses other services, like S3 and RDS, to store information and data. 
Why does Proofpoint need to optimize their AWS costs?
AWS may seem cheap at a glance, but according to a report published by Flexera, based on data surveyed from their customers, 26% of companies spend between $1.2 million and $6 million a year on public cloud services, and another 26% spend over $6 million. Not every company spends millions on cloud, but the report also states that, on average, approximately 35% of a company's cloud bill is wasted spend. That is a significant amount of unnecessary spending regardless of a business's actual cloud costs, which is why it's important for any company to make sure its cloud usage is optimized and its infrastructure is stable and scalable. 
How can Proofpoint optimize their AWS costs?
Well, that's exactly the question my team and I set out to answer! Automating certain processes, enforcing proper tagging of assets, and managing asset lifecycles can all help identify and reduce costs and optimize AWS usage. Helping developers and PMs see their service costs (cost attribution) can also help them understand how to optimize their usage. 
If you're new to AWS and still a little confused, don't worry; so am I! When I first started my internship, I knew very little about AWS. I had a baseline understanding of what AWS did and some of the services it offered, like EC2 and S3, and somewhat of an idea of what Proofpoint used AWS for, but that was about it.

However, the past few months have taught me more than I imagined possible: not just lessons about AWS, but also important aspects of working in a company, project design, teamwork, and communication. One thing I've learned is that if you want to learn something new, one of the best ways is to just throw yourself into it: get familiar with the new material by playing around with it and trying some hands-on application. That's what I did for AWS!

In the first week of my internship, Proofpoint hosted its annual Hackday competition. I was eager to participate and wanted to do a project related to my internship focus, so I decided to try to calculate the company's usage of S3 buckets (and the corresponding cost of that usage). Within the first couple of days, I was learning to use the AWS CLI and Python SDK, S3 access logs, lambda functions, Athena, QuickSight, and more. By the end of the week, I felt comfortable working with AWS and was confident that I could complete the other projects I had planned. 
Speaking of projects, here is a look at some of the things I've done and what they taught me, in both technical and soft skills. 

1. S3 Usage Calculation

Objective: Create a visualization to represent Proofpoint's usage of S3 buckets and the associated cost so users can understand which buckets can be optimized.
Impact: Being able to visualize our S3 usage and cost helps us identify and eliminate waste, as well as understand S3 access patterns for various teams. Understanding access patterns allows us to configure our storage appropriately, which involves determining what data can be moved to S3 Infrequent Access or S3 Glacier, both of which cost less than standard storage and are better suited to data that doesn't need to be accessed often. 
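Such a storage-class transition is typically expressed as an S3 lifecycle rule. Here's a minimal sketch using boto3; the day thresholds and rule ID are illustrative assumptions, not any real configuration:

```python
# Hypothetical sketch: a lifecycle rule that tiers cold objects down to
# cheaper storage classes as they age. The thresholds below are
# illustrative, not Proofpoint's actual settings.

LIFECYCLE_RULES = {
    "Rules": [
        {
            "ID": "tier-down-cold-data",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to every object in the bucket
            "Transitions": [
                # Rarely-read data -> S3 Infrequent Access after 30 days
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # Archival data -> Glacier after 90 days
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

def apply_lifecycle(bucket_name):
    """Apply the rule above to one bucket."""
    import boto3  # imported here so the sketch loads without boto3 installed
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=LIFECYCLE_RULES
    )
```

Once access patterns from the logs show which buckets are cold, a rule like this turns that knowledge directly into savings.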

What I did

  1. Wrote a lambda function in Python to automatically enable S3 access logging
  2. Wrote a lambda function to gather bucket metadata such as business unit, creation date, and bucket size
  3. Queried access logs using Athena to create a table with bucket name, last access date, and bucket size
  4. Visualized the results using the data-visualization tool Infogram
Later, I extended this project by scheduling the Athena queries and deleting buckets older than 90 days using a lambda function. 
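The first step above (a lambda that enables access logging) might look roughly like this; the central log-bucket name and per-bucket prefix scheme are my own illustrative choices, not the actual setup:

```python
# Hypothetical sketch of the access-logging Lambda: for each bucket that
# isn't already logging, enable server access logging into one central
# log bucket. LOG_BUCKET and the prefix scheme are illustrative.

LOG_BUCKET = "example-access-logs"  # hypothetical central log bucket

def logging_status(bucket_name):
    """Build the BucketLoggingStatus payload for put_bucket_logging."""
    return {
        "LoggingEnabled": {
            "TargetBucket": LOG_BUCKET,
            "TargetPrefix": f"{bucket_name}/",  # one prefix per source bucket
        }
    }

def lambda_handler(event, context):
    import boto3  # imported here so the sketch loads without boto3 installed
    s3 = boto3.client("s3")
    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]
        # Skip buckets that already log, and never log the log bucket itself
        if name == LOG_BUCKET or s3.get_bucket_logging(Bucket=name).get("LoggingEnabled"):
            continue
        s3.put_bucket_logging(Bucket=name, BucketLoggingStatus=logging_status(name))
```

With logging in place, Athena can then query the access logs to find each bucket's last access date, as in step 3.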

What I learned

  • Working with S3 using AWS Python SDK (boto3) 
  • Athena
  • AWS Lambda
  • How to make and give an engaging presentation
  • Visualizing data
  • How to calculate the costs of implementing my project and justify the costs

2. Metric Generation

Objective: Automate gathering of specific AWS metrics, such as tagging coverage and EBS cost per GB, to help teams understand their AWS usage performance in relation to other teams. 
Impact: Automating the gathering of metrics reduced the manual work done by Craig Feeck and the AWS Cost Management team, giving them more time to talk to teams about the metrics themselves. Publishing the metrics to an S3 bucket lets each team see how it is doing on AWS optimization compared to other teams, and helps start conversations about AWS cost optimization within the workplace. 

What I did

Wrote a lambda function in Python that runs every month and does the following:
  • Retrieves a spreadsheet from an S3 bucket
  • Gathers metrics from Cloudhealth
  • Performs necessary calculations
  • Uploads the data into the spreadsheet and color-codes the rows
  • Puts the updated spreadsheet back into the S3 bucket
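A rough sketch of what such a monthly Lambda might look like, using boto3 and openpyxl. The bucket/key names, color thresholds, and spreadsheet layout are all assumptions for illustration, and the CloudHealth API calls are elided:

```python
# Hypothetical sketch of the monthly metrics Lambda. Bucket, key, column
# layout, and color thresholds are placeholders; the real metrics came
# from the CloudHealth APIs, which are omitted here.
import io

GREEN, YELLOW, RED = "FF92D050", "FFFFFF00", "FFFF0000"  # ARGB fill colors

def coverage_color(tag_coverage_pct):
    """Pick a row color from a team's tagging-coverage percentage."""
    if tag_coverage_pct >= 90:
        return GREEN
    if tag_coverage_pct >= 70:
        return YELLOW
    return RED

def lambda_handler(event, context):
    import boto3
    from openpyxl import load_workbook
    from openpyxl.styles import PatternFill

    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket="example-metrics", Key="aws-metrics.xlsx")
    wb = load_workbook(io.BytesIO(obj["Body"].read()))
    ws = wb.active

    # Assume column B already holds each team's tagging coverage;
    # color the whole row so laggards stand out at a glance.
    for row in ws.iter_rows(min_row=2):
        fill = PatternFill("solid", fgColor=coverage_color(row[1].value or 0))
        for cell in row:
            cell.fill = fill

    out = io.BytesIO()
    wb.save(out)
    s3.put_object(Bucket="example-metrics", Key="aws-metrics.xlsx", Body=out.getvalue())
```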

What I learned

  • Cloudhealth Reporting API
  • Cloudhealth Metrics API
  • Tablib library
  • Openpyxl library
  • Using CloudWatch to schedule a lambda
  • Communicating with customers and identifying their needs
  • How to request and incorporate feedback into my projects

3. Autotagging assets

Objective: Find and implement a method to automatically tag AWS assets when they are created. 
Impact: Autotagging assets will help to reduce significant pain points for multiple stakeholders; it will help our finance team properly attribute costs to the appropriate teams and reduce the need for retroactive tagging, which places a large burden on engineering teams who have to then spend the time going back and inputting values for their resources. Autotagging will also help make our tagging standards more uniform. Overall, autotagging will help us to more effectively attribute and allocate costs, identify and eliminate waste, and implement a culture of tagging resources and cost optimization awareness. 

What I did:

  • Wrote a lambda function, triggered by API calls, that finds the creator of an asset using CloudTrail and attempts to tag the asset
  • Obtained tag values using either LDAP integration or IAM tags; if no value could be found, emailed the asset's creator
  • Wrote an IAM policy to prevent new users from being created without tags
  • Wrote an IAM policy to prevent creation of EC2 instances and EBS volumes without tags
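The tagging flow above might be sketched like this, assuming a CloudWatch Events rule that forwards CloudTrail RunInstances events to the Lambda. The field paths follow the CloudTrail event schema, but the `Creator` tag key and helper names are illustrative, and the LDAP lookup and email fallback are elided:

```python
# Hypothetical sketch of the auto-tagging Lambda. It assumes CloudTrail
# RunInstances events delivered via CloudWatch Events; the tag key is
# illustrative, and the LDAP/email fallback from the real project is elided.

def creator_from_event(event):
    """Pull the creating principal out of a CloudTrail event payload."""
    identity = event["detail"]["userIdentity"]
    # IAM users carry userName; assumed roles fall back to the ARN
    return identity.get("userName") or identity["arn"]

def instance_ids_from_event(event):
    """Collect the IDs of the instances created by this API call."""
    items = event["detail"]["responseElements"]["instancesSet"]["items"]
    return [item["instanceId"] for item in items]

def lambda_handler(event, context):
    import boto3  # imported here so the sketch loads without boto3 installed
    ec2 = boto3.client("ec2")
    # Real project: resolve tag values via LDAP integration or IAM tags,
    # and email the creator when nothing is found; here we just tag
    # the new instances with whoever made the API call.
    ec2.create_tags(
        Resources=instance_ids_from_event(event),
        Tags=[{"Key": "Creator", "Value": creator_from_event(event)}],
    )
```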

What I learned:

  • Identifying API calls using CloudWatch
  • IAM Policies
  • LDAP integration/OneLogin API
  • Communicating across numerous teams 
  • Identifying different approaches to a problem and evaluating them based on different values and priorities
  • Identifying and communicating the value of my project
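The deny-without-tags IAM policies mentioned under "What I did" follow a standard pattern: deny the create action whenever a required tag is absent from the request. A sketch, expressed as a Python dict, with a hypothetical `Owner` tag key standing in for whatever keys the real policy required:

```python
# Hypothetical sketch of a deny-on-missing-tags policy for EC2/EBS
# creation. The "Null" condition makes the Deny fire when the Owner tag
# is absent from the RunInstances request; the tag key is illustrative.
import json

DENY_UNTAGGED_EC2 = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUntaggedInstancesAndVolumes",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": [
                "arn:aws:ec2:*:*:instance/*",  # new instances
                "arn:aws:ec2:*:*:volume/*",    # their EBS volumes
            ],
            # True when the request carries no Owner tag -> deny creation
            "Condition": {"Null": {"aws:RequestTag/Owner": "true"}},
        }
    ],
}

policy_json = json.dumps(DENY_UNTAGGED_EC2, indent=2)
```

A guardrail like this is what makes auto-tagging stick: resources either arrive tagged or don't get created at all.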
These projects were awesome challenges for me. Although they were daunting at first because they required so much new knowledge and so many new skills, the results were incredibly rewarding, and I definitely learned a lot, technically and otherwise.

What I love most about this work is how it opened the conversation around cost optimization and brought engineers into the discussion. In conversations with engineers, managers, the finance team, and yes, even the CFO of Proofpoint, I realized how disconnected each of these groups was from the others. I knew that the best solution would require harmonious communication between all of them, so I sought not only to accomplish the projects I was given but also to spark more cross-team discussion about cost optimization and the need for promoting it.

The projects I completed only scratch the surface of what's possible for cost optimization in AWS, but I'm glad they opened the conversation. I was ecstatic to hear, a few months after my internship ended, that Proofpoint had built on my work and implemented a modified version of my auto-tagging project in their development and production environments.