Journal

What Really is DevOps?

October 17, 2019 By Jason Silva

When I started my career as a Systems Administrator, I thought that I would be doing that job for my whole career. A few years later, I asked myself, “If I were to progress, what would be my next step?” After a little bit of research, I learned of a position called DevOps Engineer. Since I had been learning how to code in my off time, I thought this would be the perfect next step in my career. After doing a lot more research into DevOps, I came to realize that there was a lot more to the position than I had previously understood.

So, what exactly is DevOps?

  • Is it a new and improved Systems Administrator position for the Cloud era?
  • Is it a culture in which Developers and Operation Engineers work in unison with shared responsibilities?
  • Is it a combination of the two?

The answer is not as concrete as one may think, and it differs from person to person and from company to company. Some companies will say that DevOps engineers are just Operations engineers. If you Google “What is DevOps?”, plenty of companies will give you their definition, and for the most part they all sound similar. For example, here is the definition of DevOps from AWS:

“DevOps is the combination of cultural philosophies, practices, and tools that increases an organization’s ability to deliver applications and services at high velocity: evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes. This speed enables organizations to better serve their customers and compete more effectively in the market.”

The term DevOps doesn’t pertain only to Operations Engineers, and it doesn’t mean “strictly infrastructure automation” either, although automation is a big part of it. I believe DevOps is a culture and a position that bridges the gap between Developers and Operations Engineers. Much like test-driven development, operations metrics and tasks should be shifted to the left of the development cycle.

A byproduct of that is the application shipping faster and everyone taking responsibility instead of trying to pass it elsewhere. Another byproduct is better auditability of the infrastructure. A few examples of this would be: pull requests on Terraform / CloudFormation / ARM templates, developers taking part in the operations on-call rotation, and developers being embedded on an operations team or vice versa. Spotify is a well-known case of Operations Engineers being embedded into feature teams.

Ultimately, DevOps is not just one thing. It brings a whole “hodgepodge” of processes and tooling together to make a smoother and more enjoyable Software Development Lifecycle.

Capital One and EC2 – part 3

August 9, 2019 By Nate Aiman-Smith

In two previous articles,

I described how the Capital One breach took advantage of an EC2-specific function to obtain AWS credentials, which were then used to download multiple files containing sensitive information.  If you haven’t already done so, I’d encourage you to read parts one and two before continuing. You might also want to pull up the complaint for reference; the juicy bits describing the attack are on pages 6-8.

In this final installment of the article, I’ll describe some measures that Capital One could have taken to prevent this kind of attack.  However, before I do that, I do want to point out something in defense of Capital One; on the surface of it, this application probably looked secure.  I don’t have any way to test most of these, but I’m going to guess they did the following:

  • Only allowed required ports for their application, both internally and externally
  • Enforced HTTPS on connections from the Internet
  • Enabled automatic encryption of objects in the S3 buckets and EBS Volumes
  • Used associated IAM roles rather than static credentials (*)
  • Enabled CloudTrail (*)
  • Implemented a Web Application Firewall (WAF) (*)

(*) – The complaint states or strongly implies that this was implemented

Honestly, this puts Capital One ahead of many other implementations I’ve seen.  If Capital One followed a security review checklist (and I’m guessing they did), this application ticked all the boxes.

on the surface of it, this application probably looked secure

With that qualifier out of the way, here are some relatively easy additional steps Capital One could have taken to avoid this issue:

Easy step 1: practice least privilege in IAM Roles:

Simply put, don’t give an application any more permissions than it needs. If this server was only functioning as a WAF (and not, for example, also as an application server), then it probably didn’t need any S3 access except perhaps to back up and restore its configuration. It definitely didn’t need the ability to list the S3 buckets owned by the account, and it probably didn’t need the ability to list anything at all. Had Capital One simply denied all “s3:List*” API access in the policy, the attacker could have had full read and write privileges but still have been effectively blind. A still better approach would be to allow only the S3 API calls required, scoped to the resources the application explicitly needed. As it is, the level of access implies that the Role simply had list and read privileges for all S3 buckets and objects.
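
To make this concrete, here is a minimal sketch of that kind of scoped-down policy as a boto3 call; the role name, bucket name, and policy name are hypothetical placeholders, not Capital One’s actual configuration.

```python
import json
import boto3

iam = boto3.client("iam")

# Hypothetical example: allow only the specific S3 calls the WAF needs
# (backing up and restoring its own config) and nothing else.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowConfigBackupOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::example-waf-config-bucket/*",
        },
        {
            # Belt and suspenders: explicitly deny all List* calls so a
            # stolen credential cannot enumerate buckets or objects.
            "Sid": "DenyListing",
            "Effect": "Deny",
            "Action": "s3:List*",
            "Resource": "*",
        },
    ],
}

iam.put_role_policy(
    RoleName="example-WAF-Role",      # hypothetical role name
    PolicyName="least-privilege-s3",
    PolicyDocument=json.dumps(policy),
)
```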

Easy step 2: limit S3 access to sensitive data to the local VPC:

S3 bucket policies provide the ability to restrict access to just the local network in AWS – this means that requests from the Internet will be denied, so even if the attacker had the credentials she wouldn’t be able to do anything with them.

I’m a little hesitant to put this here; if the attacker was already able to get the IAM credentials, then theoretically she should have been able to craft HTTP requests to do her misdeeds through the EC2 instance, so adding this step would have slowed her down but might not have stopped her.  In general, though, it’s another form of least privilege that absolutely should be exercised.
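
For illustration, here is a rough sketch of such a bucket policy; the bucket name and VPC ID are placeholders, and this assumes the application reaches S3 through a VPC endpoint so the aws:SourceVpc condition applies.

```python
import json
import boto3

s3 = boto3.client("s3")

# Hypothetical example: deny any request for this bucket that does not
# arrive through the application's own VPC (via an S3 VPC endpoint).
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyAccessFromOutsideVpc",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::example-sensitive-bucket",
                "arn:aws:s3:::example-sensitive-bucket/*",
            ],
            "Condition": {
                "StringNotEquals": {"aws:SourceVpc": "vpc-0123456789abcdef0"}
            },
        }
    ],
}

s3.put_bucket_policy(
    Bucket="example-sensitive-bucket",
    Policy=json.dumps(bucket_policy),
)
```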

Easy step 3: use separate KMS keys in S3 for different projects:

AWS generally offers two choices for encryption of S3 objects: AES-256 or KMS.  These names are a bit of a misnomer – it’s really a choice between using a master key managed by S3 itself, or a master key managed in KMS by the individual AWS account.  The S3-managed key effectively adds no access control of its own, so even though the data is encrypted at rest (thereby checking the relevant compliance boxes), anyone who can already read the object can read the data.  A customer-managed KMS key, on the other hand, has a default “deny” policy and, much like S3 itself, requires both the key policy and the requester’s IAM policy to allow access. The result of using KMS encryption in S3 is that even if credentials with an overly generous S3 policy are breached, any data encrypted with the KMS key is still safe unless that policy also allows decrypting with that key.
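
As a quick sketch of what this looks like in practice, the snippet below sets a customer-managed KMS key as a bucket’s default encryption; the bucket name and key ARN are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical example: default all new objects in this bucket to SSE-KMS
# with a customer-managed key, so reading the data also requires
# kms:Decrypt on that specific key.
s3.put_bucket_encryption(
    Bucket="example-sensitive-bucket",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": "arn:aws:kms:us-east-1:111122223333:key/example-key-id",
                }
            }
        ]
    },
)
```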

The three suggestions above are relatively easy to implement and can easily be added to security checklists for projects with sensitive data.  Although none of them would have stopped the attack, they would have greatly reduced the impact (referred to as the “blast radius” in security parlance).

There are also some general steps that can be taken to cover multiple projects, which should have been standard practice for a bank the size of Capital One:

Shared step 1: monitor API requests

This really should have already been implemented: any AWS API access from a known anonymizer VPN or TOR exit IP should raise some alarms.  CloudTrail provides full logging of pretty much all S3 API calls (which is how Capital One was able to give the FBI such detailed forensic data later on), and there are plenty of tools that can scour through the logs and search for any successful API requests from any suspicious IP.  Honestly, it’s a little disconcerting that Capital One didn’t catch this attack from CloudTrail logs.

Checking for suspicious IPs in CloudTrail is the tip of the iceberg and pretty easy to implement – an advanced DevSecOps team should also be looking for irregularities: why is this IAM role that typically hits the API once every few days suddenly mass downloading?  Why are we seeing tons of new requests from this new IP that doesn’t belong to us or to AWS? These take time, money, and engineering brainpower, but Capital One should have plenty of all three.
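
As a rough illustration of the “tip of the iceberg” check, here is a minimal sketch that scans CloudTrail log files (already downloaded from the logging bucket) for successful calls from suspicious IPs. The log directory and IP list are placeholders; a real implementation would more likely query the logs with Athena or feed them into a SIEM.

```python
import gzip
import json
from pathlib import Path

# Hypothetical list of known TOR exit / anonymizer VPN addresses,
# e.g. pulled from a threat-intel feed.
SUSPICIOUS_IPS = {"203.0.113.10", "198.51.100.7"}

def scan_cloudtrail_logs(log_dir: str) -> None:
    """Flag successful API calls made from suspicious source IPs."""
    for path in Path(log_dir).glob("*.json.gz"):
        with gzip.open(path, "rt") as fh:
            records = json.load(fh).get("Records", [])
        for record in records:
            source_ip = record.get("sourceIPAddress", "")
            # errorCode is absent when the call succeeded.
            if source_ip in SUSPICIOUS_IPS and "errorCode" not in record:
                print(
                    f"ALERT: {record.get('eventName')} from {source_ip} "
                    f"as {record.get('userIdentity', {}).get('arn')}"
                )

if __name__ == "__main__":
    scan_cloudtrail_logs("./cloudtrail-logs")  # placeholder path
```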

Shared step 2: filter out the metadata IP address with a WAF

All EC2 metadata (including IAM role credentials) is accessed by an HTTP call to the IP address “169.254.169.254”.  I can’t think of any conceivable reason to have this IP address as part of your request URI or POST payload; therefore, any request that includes it should probably get dropped.  You can use AWS WAF to create a rule like this or add it to your own WAF (although if the WAF itself was the attack vector, that might not have saved it).
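
AWS WAF can express this as a string/byte-match rule; as a generic illustration of the idea (not the AWS WAF API itself), here is a tiny sketch of the check:

```python
# Generic illustration of the filtering rule, not the AWS WAF API:
# reject any request that mentions the EC2 metadata address anywhere
# in its query string or body.
METADATA_IP = "169.254.169.254"

def should_block(query_string: str, body: str) -> bool:
    """Return True if the request looks like a metadata SSRF attempt."""
    return METADATA_IP in query_string or METADATA_IP in body

# Example usage inside a hypothetical request handler:
if should_block("external-site=http://169.254.169.254/latest/meta-data/", ""):
    print("403 Forbidden - request dropped")
```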

All of the above suggestions are far from the full list of precautions that Capital One could have (and should have) taken to avoid this, but implementing any one of them would have either prevented the attack or at least alerted Capital One at the time of the attack.

Besides taking the above recommended steps as general practice, a project that’s going to collect sensitive data and be exposed to the Internet should have a detailed security review as a best practice.  A security architect would have asked questions like “How are we implementing least privilege?” and “For each of these components, how are we limiting the fallout if it’s compromised?” Security architects aren’t inexpensive, but they’re cheaper than a lawsuit.

For the reader: if you got some value out of this, please let me know about it in the comments.  If you’d like to further discuss your security posture in AWS, feel free to reach out to me or contact us through our website.  For those in South Florida, I’m going to be presenting a talk based on this article on August 22, 2019 at Venture Cafe Miami – hope to see you there!

Capital One and EC2 – part 2

August 6, 2019 By Nate Aiman-Smith

In a previous post, I mentioned that the attack vector for the Capital One breach specifically targeted an EC2 feature.

In this post, I’ll give my educated guesses about how the attack actually worked.

[Note 1: if anyone happens to have any of the contents of the original gist then I’d love to get a look at it to confirm these guesses – until then I’m going to draw my conclusions from the text of the complaint]

[Note 2: after writing this I saw a Krebs post in which the author claims to have some insider information that backs up my guesses and confirms that the WAF was itself the attack vector – always nice when an educated guess turns out to be correct]

A bit of background: as I stated before, the first step in the attack was to obtain the credentials for an IAM role. I won’t go into a deep dive into IAM roles – the short version is that EC2 has the ability to associate a server (“instance” in AWS parlance) with a set of permissions.  In order to actually use those permissions, a user or application needs to request a special URL via HTTP that can only be accessed from the instance itself – the HTTP response will include temporary AWS credentials that grant those permissions (see here for AWS’s documentation on the process).
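
For illustration, this is roughly what that request sequence looks like from the instance itself, using the unauthenticated metadata endpoint as it worked at the time:

```python
import json
import urllib.request

# Only reachable from the EC2 instance itself.
BASE = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

# The first request returns the name of the associated role...
role_name = urllib.request.urlopen(BASE, timeout=2).read().decode().strip()

# ...and the second returns temporary credentials for that role.
creds = json.loads(urllib.request.urlopen(BASE + role_name, timeout=2).read())
print(creds["AccessKeyId"], creds["Expiration"])
# The response also includes SecretAccessKey and Token (the session token).
```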

Overall, this is a pretty good setup and far superior to embedding static credentials directly in your code base – I have personally seen two very high-impact incidents in which someone created an IAM user and accidentally committed the credentials to a public GH repo, which is exactly what this is meant to protect against.  However, it does create an interesting threat vector that otherwise wouldn’t exist, which is an EC2-specific variant of a Server-Side Request Forgery (SSRF) attack.

Let me give a simplified example of an SSRF attack; let’s imagine you cobbled together a widget for your site that’s meant to slurp in external content and make it fit in with the theme of your site.  An attacker, inspecting your site, spots a call to “/widgets/format-external-content.php?external-site=[some_https_string]”. Just for grins, the attacker tries manually pasting that URL into a browser, but replaces [some_https_string] with the IAM metadata URL from the AWS documentation.  Your widget, not knowing what to do with the JSON response, just spits it out more or less unchanged, and just like that, your credentials have been captured.
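
To make that concrete, here is a deliberately vulnerable sketch of such a widget (using Flask as a stand-in for the hypothetical PHP plugin); the point is simply that it fetches whatever URL the caller supplies and echoes the response back.

```python
# Deliberately vulnerable sketch of the hypothetical widget described
# above -- do NOT use this pattern in real code.
from flask import Flask, request
import urllib.request

app = Flask(__name__)

@app.route("/widgets/format-external-content")
def format_external_content():
    external_site = request.args.get("external-site", "")
    # No validation: an attacker can point this at
    # http://169.254.169.254/... and receive the IAM role credentials.
    body = urllib.request.urlopen(external_site).read().decode()
    return f"<div class='external'>{body}</div>"
```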

On page 6 of the complaint, the agent writes that the gist contained the IP address of a server and three commands, the first of which returned the credentials for “an account known as *****-WAF-Role” – a pretty strong indicator that the attack retrieved EC2 IAM credentials.  Since we don’t have the original exploit, we don’t know exactly *how* it worked, but my money’s on some variation of the SSRF method described above.

Once the attacker had the credentials, it was pretty much game over; they could be used to list buckets and objects, and the attacker could then pick and choose which of those she wanted.
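
A sketch of what that looks like with boto3 once the three temporary values are in hand (the values shown are placeholders):

```python
import boto3

# Hypothetical: the three temporary values harvested from the metadata
# response (AccessKeyId, SecretAccessKey, Token).
s3 = boto3.client(
    "s3",
    aws_access_key_id="ASIA...EXAMPLE",
    aws_secret_access_key="example-secret",
    aws_session_token="example-session-token",
)

# With an over-broad role, this enumerates every bucket in the account...
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

# ...and each bucket's contents can then be listed and downloaded at will.
```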

This particular flavor of attack has been on my mind for years now and I’m surprised that it’s taken so long to surface.  I’d also like to point out the irony here:

  1. This attack was made possible by engineers following an established best practice (use an associated IAM role for AWS credentials).  Had they been using a non-recommended method like a config file, the application could not have been compromised this way.
  2. The name of the role ends in “WAF-Role”; combined with the fact that the attacker referenced an IP address, this indicates that the compromised server was a Web Application Firewall (WAF), or possibly that the creators of the application just made a single role called “WAF-Role” and applied it to everything.  Given that a WAF’s job is to filter out malicious requests, it’s especially ironic that it was the attack vector.

In part 3 of this article, I’ll describe what Capital One could have done to prevent this, and how to protect against it happening to you.  Stay tuned.

Capital One and EC2 Hack – an Overview

August 5, 2019 By Nate Aiman-Smith

There’s been a ton of coverage of the recently discovered Capital One breach.

I’m generally very skeptical when AWS security makes the news; so far, most “breaches” have been a result of the customer implementing AWS services in an insecure manner, usually by allowing unrestricted internet access and often overriding defaults to remove safeguards (I’m looking at you, NICE and Accenture and Dow Jones!).  Occasionally, a discovered “AWS vulnerability” impacts a large number of applications in AWS – and it also impacts any similarly-configured applications that are *not* in AWS (see, for example, this PR piece…um, I mean “article” from SiliconAngle).  Again, this is a lack of basic security hygiene – anyone who’s worked in IT in the last 20 years knows that you need to patch any internet-facing software before an attacker finds it (and, incidentally, the time you have until a vulnerability gets found and exploited is continuously getting smaller, so you better find a way to automate that – but that’s another discussion for another post).

When I looked at the Capital One breach, I immediately assumed it would fit into one of those categories, but instead it looks like we finally have an honest-to-goodness AWS-specific hack.  Furthermore, from what I can tell, it was the result of a customer trying to follow best practices.

Although I didn’t have a chance to look at the exploit before it was taken down, we can get some idea of how it worked from the text of the complaint (primarily by reading between the lines of the agent’s description of the attacker’s deployment).  I’ll go into tech detail in another post, but the short version is that the attacker found a way – almost certainly through some misconfigured 3rd-party software – to get temporary AWS credentials from an EC2 instance’s metadata.  The temporary credentials gave the attacker access to an S3 bucket that contained sensitive data, which she then posted online.

Notice that I didn’t write “a misconfigured EC2 instance” above; the EC2 configuration (called an “associated EC2 IAM Role”) is a recommended practice when developing applications for AWS.  This is, unfortunately, an increasingly common issue with security-oriented tools and best practices; having them in place but not using them correctly (or, in the case of this attack, using them *almost* correctly) can sometimes be even worse than not using them at all. This is particularly heartbreaking to see as a security professional – they tried to do this correctly, but it completely backfired and opened a backdoor.

I will leave it to the reader to decide if Capital One should be forgiven for this; my personal opinion is that they have enough money and resources for a detailed security review, particularly for applications that will be collecting sensitive information from people.  A cursory security review would probably have passed, but a deep dive probably would have revealed the underlying vulnerabilities (or at least reduced or eliminated the impact).

I’ll get into some of the tech details in part 2 of this article, because they really are very interesting, and in part 3 I’ll dive into what organizations can do to protect themselves from this type of attack.

Welcome aboard Bill Lumbot

July 30, 2019 By Jake Berkowsky

Every morning I try to follow a checklist that I wrote.

I read over resumes, check out PRs, check my email accounts, etc. One critical thing I do (or did) was check to see who forgot to log their hours from the day before (or who left the timer running). Since we are a consultancy, it’s important not only that we log our hours but that we log them correctly, and if we don’t catch a missed entry until the end of the month, there’s no way we’re going to remember what we worked on that day. So it’s an important task. Unfortunately, it’s a thankless one: I don’t enjoy doing it, and it makes me feel like Bill Lumbergh from Office Space, asking people to fill out their TPS reports.

One of our core values at RunAsCloud is automation. When building out cloud infrastructure, automation ensures that we can push something out quickly and consistently. When helping clients push code, we build an automated pipeline to ensure that deployments happen easily and without the worry of someone accidentally fat-fingering something.

As such, I’d like to introduce our newest addition to RunAsCloud: Lumbot. Lumbot is a Slack bot that runs in AWS Lambda toward the end of every workday. He is triggered by a CloudWatch scheduled event, checks our engineers’ timesheets, looks up anyone who may have forgotten to update theirs, and sends them a polite reminder over Slack so that I no longer have to.

Unlike his namesake, however, Lumbot won’t be asking you to come in on Saturday (his cron is only scheduled to run on weekdays).
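
For the curious, here is a rough sketch of the shape of Lumbot’s handler; the timesheet lookup and the Slack webhook are hypothetical stand-ins rather than our actual integration.

```python
import json
import os
import urllib.request

# Hypothetical stand-ins: a real implementation would call the timesheet
# provider's API and a real Slack incoming-webhook URL.
SLACK_WEBHOOK_URL = os.environ["SLACK_WEBHOOK_URL"]

def engineers_missing_hours():
    """Return Slack handles of engineers with no hours logged yesterday."""
    # Placeholder for the timesheet API call.
    return ["@example-engineer"]

def lambda_handler(event, context):
    """Triggered by a CloudWatch scheduled event at the end of each workday."""
    for handle in engineers_missing_hours():
        message = {
            "text": f"Hi {handle}, it looks like you haven't logged your "
                    "hours for today. Could you go ahead and do that? Thanks!"
        }
        req = urllib.request.Request(
            SLACK_WEBHOOK_URL,
            data=json.dumps(message).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
    return {"reminded": True}
```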

Taking on AWS re:Inforce by Force

July 30, 2019 By Cai Walkowiak

AWS re:Inforce 2019,

the first security-focused AWS event, was held at the Boston Convention and Exhibition Center—an incredible 516,000 sq ft, well-architected modern venue of steel and glass. The event occupied four floors, with the ground floor holding the main expo of vendor booths plus the buffet breakfast and lunch. It kept the same feel as other AWS events: a clean and clear registration process with early badging and onsite registration (I think they’re starting to learn their lesson), in addition to a great variety of topics and presenters.

One item they did not carry over from the Summits (which I am grateful for) was the “certified engineer lounge,” which had offered coffee, chairs, and charging stations to those who held an official cert. It felt odd, at previous events, sitting with my phone charging and watching the plebs wander by. Instead, re:Inforce had lots of areas where work and charging could be accomplished without the “Elite” status of being certified, which dispelled any pretentious air.

There was an arcade center and coffee bar at the “Well-Architected Lounge” in which anyone could participate. The staff from PAX East (a video game nerd event) were very enthusiastic about the arcade games. Not to mention that the coffee station had a “printer” which could take your photograph and then print it onto the foam of your coffee or cappuccino.

There were also some challenges AWS events have yet to overcome. Most sessions were ‘sold out’ or ‘walk-up only’ a short time after seat reservations opened, a month before the event. There were a lot of LONG walk-in lines for the most popular topics, many of them hands-on labs held in a single venue room with 8-10 tables and only 6-8 seats per table. At one point a walk-in line was over 70 people long, and people at the back stuck it out for half an hour hoping to grab one of the last remaining seats for their chosen topic.

In the past there have been whitepapers, presentations, and numerous discussions regarding the security technologies present, leveraged, and integrated in AWS Cloud Services. The shared responsibility model was hammered on repeatedly—informing the unaware that AWS takes care of the hardest piece, the one a lot of smaller and start-up companies fall down on: providing proper security for the underlying infrastructure. This was the buzz phrase of several topics, presentations, and side-bar conversations. AWS also provides heavy infrastructure at a low introductory cost—proper redundancy, high availability, and durability while maintaining a security-first focus—which is truly unmatched (as one of the opening keynote presentations spoke to, with some small digs at competitors).

AWS offered a security specialty exam in the past, which returned a couple of years ago with a beta round and then the current full specialty certification. re:Inforce offered onsite testing, bootcamps, and smaller “exam preparedness” courses. Having our security pod onsite with two certs already in hand made the event a reassuring confirmation of our knowledge and our forward-leaning stance in the AWS security space.

AWS and those in the cloud space often describe security as ground zero—the first layer of any project in the public cloud. Professionals in the info-sec space know the potential dangers of putting private data anywhere accessible outside the private network, and the difficulty of maintaining the security of that data. Every week, month, and year it seems like another team, large or small, has been compromised due to a misunderstanding of the configuration requirements for securing their architecture. As the largest cloud services provider, AWS is challenged with a large attack surface across many different offerings. Publicly exposed S3 buckets, open databases and security groups, and overly permissive policies are all too common and were spoken to repeatedly. AWS continues to move towards a strong security posture on all fronts.

Every Summit, re:Invent, and sometimes even AWS pop-up has a number of “surprise” releases of new services or enhancements. By this point everyone was trying to guess what security-related offerings this event held in store, and it is doubtful anyone was disappointed:

  • Opt-in for default EBS encryption – Enables encryption on all new volumes, making it easier for people to do the right thing (encryption at rest); see the quick sketch after this list.
  • VPC Traffic Mirroring – IDS, DLP, and forensics companies will all have major stakes in this development. Several key vendors were privy to early integration and had products drop that same day.
  • Security Hub and Control Tower are now GA – Both offer a central location for a variety of security-related offerings, and most of us see them as boilerplate for other services—lowering the bar of entry for proper security.
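
For example, the default EBS encryption opt-in mentioned above is a single per-region API call; a quick boto3 sketch:

```python
import boto3

# Opt this account into default EBS encryption. The setting is
# per-region, so repeat for each region you use.
ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.enable_ebs_encryption_by_default()
print(ec2.get_ebs_encryption_by_default())  # {'EbsEncryptionByDefault': True, ...}
```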

During the event there was a “Security Jam,” which we participated in. The Jam was held in “Capture the Flag” style, giving competitors a series of different challenges to solve in order to uncover a special string (the flag) that verifies they solved the puzzle. We came in strong: even though we started late, we held 4th place for a long time and came two questions away from tying for first. The topics ranged from secure architecture and IAM permissions to some more bizarre forensics and IoT security scenarios. We’re definitely planning a full assault at re:Invent this winter.

They announced right in the kick-off keynote that next year’s re:Inforce will be held in Houston, TX on June 16th and 17th. It is clear they are continuing to make security a priority in their documentation, training, and offerings.

Check out the re:Inforce recordings here: https://www.youtube.com/playlist?list=PLhr1KZpdzuke2ncPH0DVp9PswBFY5dIl6

Hope to see you at my first re:Invent in early December (2nd – 6th) https://reinvent.awsevents.com/ 

Say “Hi” – you can’t miss us in our RunAsCloud blue attire.

Why Ops?

July 30, 2019 By Jake Berkowsky

It seems the whole tech world is full of developers.

Everyone and their mother (even my mother) has learned to code, and many are trying to start or change their careers to match. There are code bootcamps in every major city, and it’s easier than ever to get started. Operations, however, remains a dirty word, invoking images of pale nerds replacing wires in datacenters or manually parsing through log files. Companies make millions with products that let developers “not worry” about anything outside of their code. In reality, ops is a challenging, interesting, and rewarding field.

Ops has its own set of unique challenges. Figuring out how to fit everything together in a scalable, available, and secure way requires not only a deep pool of knowledge but also the creativity to come up with novel solutions and the ability to work with people across disciplines. A good ops person has to keep up to date with all sorts of new techniques and technologies, and is typically the first person consulted when a company is considering new technology. Ops is also rewarding, both in terms of personal fulfillment and financially, as cloud architects are in greater demand every year.

If you’re interested in learning more about ops (especially cloud ops) RunAsCloud is always looking for interns. Email internship@runascloud.com to learn more!
