Automated Security in the CI/CD

TL;DR
  1. Security automation embedded within the CI/CD process is the ONLY way to properly secure applications that are built and deployed quickly. 
  2. We need to secure artifacts as they are built/deployed in the CI/CD pipeline and we need to secure the CI/CD pipeline itself (many forget to do this!).
  3. Stop using “shift left” as a buzzword (</endrant>)
  4. Start small. Detection only. Focus on high risk areas first. Follow Principles.

If you want to view the final setup… click here

The Groundwork

Let’s define some terminology and break down “CI/CD” before we dive into securing it. 

  • CI – Continuous Integration. Basically it’s a practice where developers frequently share code in a repository and that code is automatically tested and “built” by “something” (we will look at that later). 
  • CD – Continuous Delivery. This is an extension of “CI”: it picks up the built code and deploys it to dev, staging or prod environments. You also might hear Continuous Deployment… but we’ll leave this out for now.
  • Build System – When we defined “CI” above, we mentioned that “something” automatically builds the code and runs tests. This is typically what the build system does.
  • Artifact – An artifact can be basically anything that is an output of the build system. The most common example is a docker image.
  • Registry – A registry is just a place to store artifacts. For example, your docker images would be placed here. 
  • Source Control Management (SCM) – This is Git! You have met it if you have ever used GitHub/GitLab. It allows you to version and manage your code.
  • Orchestrator – Not every environment uses an Orchestrator. But if you have heard of Kubernetes… that’s an Orchestrator. It just allows you to run and manage all of your docker containers. 

Now that we have some basic terminology… a CI/CD pipeline allows your team to develop, test and deploy applications consistently and (hopefully) quickly. Developers spend more time writing code that benefits your business, you make lots of money, yay. The diagram below is a rough example of what it might look like:

[Figure: Simplistic View of CI/CD]

And Now the Security Part

The importance of identifying and fixing violations early in the CI/CD process is well documented. Basically it saves everyone time and money. If you want proof, just sign up for a free trial of Qualys or Nessus and read the 20 emails per day informing you of the benefits. 

For the rest of the post… we are going to say that our application is a web app. It allows people to view public blog posts. We package the application as a docker image. The CI/CD pipeline is 90% automated and it takes 10 minutes to build & deploy a new version of our application. That’s awesome. 

But now we have the problem of securing this application. Manual testing takes too long. If you make developers and infrastructure wait 48 hours while you run a vuln scan with a 50% false positive ratio… have fun. Our security “checks” need to be quick, accurate and have a high return, otherwise the business will prioritize new features over your security “blockers”. If you think, “Well, I’ll just catch this stuff after it’s deployed”: no you won’t, and it will be expensive to fix. 

Please remember that our goal is trying to secure artifacts AND the pipeline itself. I can’t express how common it is to have stale service and user accounts throughout this whole process. Anyway, we want to focus our security controls in various stages of the CI/CD pipeline… so let’s break this down.

  1. Develop
  2. Build
  3. Store
  4. Run

Stage 1: Develop

Typically in this stage, developers are writing code in their IDE and interfacing with SCM (e.g. GitHub). They are pushing & pulling code, doing some local tests, integrating new libraries, etc.

Security Goals:

  1. RBAC on which repos the developers can push/pull from. 
  2. Authentication to SCM (SSH keys, Personal Access Tokens, MFA, wire it up with your IdP)
  3. Repository controls such as using private repos and limited permissions for service accounts & users. 
  4. Prevent secrets from being committed to SCM
    • e.g. Pre-commit hooks (https://github.com/deshpandetanmay/git-secrets)
  5. (optional) Flag old or vulnerable dependencies in the IDE
    • Many SAST scanners have plugins for the IDE that give feedback to the developer as they code.
  6. Flag old or vulnerable dependencies in SCM.
    • In GitHub, you can use Dependabot for dependencies & docker images, and GitHub has built-in secret scanning. Secret scanning is just regex matching… so it won’t catch everything. But basically Dependabot just opens a PR when an old/vulnerable version is detected.
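
To make the "just regex matching" point concrete, here is a minimal sketch of a secret check that a pre-commit hook could run. The three patterns and the `find_secrets` helper are my own illustrative assumptions; they are not the rule set used by git-secrets or by GitHub, which are much larger.

```python
import re

# Illustrative patterns only (assumptions for this sketch); real scanners
# ship far larger and more carefully tuned rule sets.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Private key header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "Hardcoded password": re.compile(
        r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def find_secrets(text):
    """Return (line_number, rule_name) pairs for every line matching a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings
```

A hook would feed the staged changes into `find_secrets` and refuse the commit when the list is not empty. Anything that doesn't look like one of the patterns (a token in an unusual format, a secret split across lines) slips through, which is exactly why this won't catch everything.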

Stage 2: Build

In this stage, your build system (Jenkins) is consuming a webhook or polling SCM, then it checks out your code, runs any tests and packages your code as a docker image. After that, it saves the docker image (artifact) to your Registry.

Security Goals:

  1. Limited permissions on any service accounts used between SCM, Build system and the Registry. 
  2. Harden and patch the Build system itself. Depending on your Build system (looking at you, Jenkins), there could be many security vulnerabilities in it and its plugins.
  3. Static (SAST) scanning. SAST doesn’t run the code so it can produce lots of false positives.
    • SonarQube is a good solution. Tools in this space usually report code coverage, coupling of the code, how complicated it is, vulnerabilities in libraries/dependencies, secrets in the code and security violations (such as SQL injection).
  4. Docker Image security
    • Identifying vulnerable application dependencies is good but we also need to ensure our docker image doesn’t have nasty vulnerabilities or embedded secrets. These are referred to as “OS level” violations. 
    • This topic deserves a large post itself (https://snyk.io/blog/10-docker-image-security-best-practices/) but usually “fat” images contain large amounts of binaries that at any time could have a vulnerability. Building smaller, purpose built images is ideal.
    • Twistlock, Anchore and Trivy can do this (they can scan both App and OS dependencies)
  5. Code coverage (SAST can help here)
    • There are many different “tests” that developers should write to ensure their application operates properly but we will just talk about code coverage. This is just the percentage of lines that are executed by test cases. 
    • Code coverage is critical for Security for two reasons:
      1. Vulnerability remediation – High code coverage means that when you update a vulnerable library, your test suite will usually scream if the update broke anything (this is huge for vuln remediation)
      2. Predictable – Software that behaves how you expected is usually more secure than untested software. Vulns are born from bugs and bugs are born from bad coding.
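
The scanners in this stage all end up producing a list of findings, and the pipeline has to decide what to do with them. Below is a rough sketch of a "detection only" gate: it flags findings at or above a severity threshold but only fails the build once you flip `enforce` on. The `Finding` class, the severity labels and `evaluate_build` are hypothetical names for this post, not the output format of any specific scanner.

```python
from dataclasses import dataclass

# Hypothetical severity scale; real scanners use their own labels or scores.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


@dataclass
class Finding:
    package: str
    severity: str  # must be one of the keys in SEVERITY_RANK


def evaluate_build(findings, threshold="high", enforce=False):
    """Return (exit_code, flagged) for a list of scanner findings.

    With enforce=False (detection only) the exit code is always 0 and the
    flagged findings are merely reported. With enforce=True any flagged
    finding produces exit code 1, which would fail the build.
    """
    limit = SEVERITY_RANK[threshold]
    flagged = [f for f in findings if SEVERITY_RANK[f.severity] >= limit]
    exit_code = 1 if (enforce and flagged) else 0
    return exit_code, flagged
```

Running in detection-only mode first lets you see how noisy a scanner is before it ever blocks a developer. Once the false positives are under control, turn on `enforce` for the highest severities only.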

Stage 3: Store

This stage is where the build system uploads the docker image to the Registry. Usually you would have different repositories for specific environments (dev, staging, prod). Where the image is uploaded determines where it is deployed. 

Security Goals:

  1. Limited permissions for the users and service accounts that can push to the dev, staging and prod repositories
  2. Registry ideally has a way to block vulnerable versions of software from being downloaded. It can also scan packages and docker images
    • JFrog Xray is a good example of this
    • Scanning packages/images can be super helpful for third party software or anything that doesn’t originate from your build system.
  3. We should be able to look into a specific repository (e.g. Prod) and see what artifacts are deployed and any new vulnerabilities
  4. (Optional) Remember when we used SAST to scan the code? Well we may want to use DAST (OWASP ZAP, Burp Suite) to dynamically scan the running web application. This test is usually pretty accurate because it scans the fully built application and it can run actual exploits to identify OWASP Top 10 vulnerabilities, for example.
    • DAST can take minutes or hours to run so it’s actually difficult to include in the CI pipeline… here are some tips for using it in the CI/CD pipeline:
      • Use Swagger/OpenAPI docs and feed them to the DAST scanner so it doesn’t have to spend time discovering the application.
      • When a new artifact is uploaded to your Registry, the Registry will send a webhook to your scanner with the “test” URL. The scanner will then kick off a scan. This way you don’t block the build waiting for a long scan.
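
The webhook tip above boils down to "accept the event, queue the scan, return right away". Here is a minimal sketch of that hand-off using only the standard library. The payload field `test_url`, the function names and the `run_dast_scan` placeholder are all assumptions for illustration; a real setup would call your actual scanner there and sit behind a real HTTP endpoint.

```python
import queue
import threading

scan_queue = queue.Queue()
scan_results = []


def handle_registry_webhook(payload):
    """Called when the Registry reports a new artifact; returns immediately."""
    scan_queue.put(payload["test_url"])  # "test_url" is a hypothetical field
    return {"status": "queued"}


def run_dast_scan(target_url):
    """Placeholder: a real implementation would invoke a DAST scanner here."""
    return {"target": target_url, "alerts": []}


def scan_worker():
    """Drain the queue in the background so the build never waits on a scan."""
    while True:
        target_url = scan_queue.get()
        if target_url is None:  # sentinel value used to stop the worker
            scan_queue.task_done()
            break
        scan_results.append(run_dast_scan(target_url))
        scan_queue.task_done()
```

Start the worker once with `threading.Thread(target=scan_worker, daemon=True).start()` and every webhook call just drops a URL on the queue. The build finishes in its usual 10 minutes and the scan results show up whenever the scanner is done.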

Stage 4: Run

In this stage, “something” (in our example it’s Kubernetes) takes our new image sitting in our Registry and deploys it to the environment. This is commonly called “Runtime”. 

Security Goals:

  1. Of course use limited permissions on any service accounts used here.
  2. Any secrets that the application uses must be loaded as environment variables (not hardcoded!). And any outside identities that the application might use should follow least privilege as well.
  3. Ensure that images deployed to our environments only come from our trusted Registry
    • In K8s, you can use OPA policies to block any images that are NOT being pulled from your trusted registry domains
  4. Runtime compliance!
    • When using docker images, there is too much to talk about -> https://www.stackrox.com/post/2020/05/kubernetes-security-101/
    • If your App uses other services (e.g. databases, APIs), these are in scope.
  5. Be able to monitor and identify new vulnerabilities or threats
    • Somebody tried to exec into a container?
      • Twistlock and Falco are great here
    • New vulnerabilities? Push a patch through your automated CI/CD pipeline and deploy a new version of the image.
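
For the trusted Registry goal, the actual decision is simple: work out which registry an image reference points at and compare it to an allowlist. Real OPA policies are written in Rego, so treat the Python below purely as an illustration of the logic. The registry name `registry.example.com` and both function names are made up for this sketch.

```python
# Hypothetical allowlist; replace with your own registry domain(s).
TRUSTED_REGISTRIES = {"registry.example.com"}


def registry_of(image):
    """Return the registry host of an image reference.

    Follows the usual Docker convention: the first path component is a
    registry only if it contains "." or ":" or is "localhost"; otherwise
    the image is assumed to come from docker.io.
    """
    first, slash, _rest = image.partition("/")
    if slash and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def untrusted_images(images, trusted=TRUSTED_REGISTRIES):
    """Return the images whose registry is not in the trusted set."""
    return [image for image in images if registry_of(image) not in trusted]
```

An admission check would reject the deployment whenever `untrusted_images` returns anything. Note that a bare `nginx:latest` counts as untrusted here, because it silently resolves to the public Docker Hub.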

Let’s see the automated CI/CD pipeline:

[Figure: Security in CI/CD]

Closing Thoughts

As you can see, there are a ton of potential ways to become proactive in the CI/CD process. Start small, in detection-only mode, then adapt and see what works for your organization. Remember that the goal is to highlight and fix high risk vulnerabilities in the application and the process. 

 
