DevSecOps at Scale Using Amazon Elastic Kubernetes Services (EKS) and the JFrog Platform [swampUP 2021]

Anuj Sharma, Solutions Architect (AWS)

June 27, 2021

2 min read

In this talk, we will simulate a real world scenario by building a shared services platform using the JFrog platform on Amazon Elastic Kubernetes Service (EKS) using AWS Cloud Development Kit (CDK) to deploy a containerized workload into test and production AWS accounts, using deployment pipelines. Learn More: https://jfrog.co/3zTBhDK

Anuj is a container specialist solutions architect at Amazon Web Services and has been building on AWS for over 7 years. He has over 15 years of hands-on industry experience in application and infrastructure development. When not developing, he likes to hike around Mt. Rainier.

Along the way, we will cover topics such as static code analysis, image scanning and deploying a Helm chart from a management EKS cluster to clusters in Test and Production AWS accounts, and implement DevSecOps best practices using the JFrog Platform.

Video Transcript

Hello and welcome to the session to learn about implementing DevSecOps practices
at scale using Amazon EKS and the JFrog Platform.
This session is pre-recorded.
However, when it is streamed live, I’m available to chat with you,
so feel free to ping me at any time during the session when it is getting played.
A quick introduction about me,
I’m a specialist Solution Architect at Amazon Web Services
and have over 15 years of hands-on experience in
application and infrastructure development.
And I’m really passionate about building large, scalable solutions for developers and enterprise customers.
I have worn multiple hats, ranging from software developer and architect
to engineering manager and product manager, in a variety of industries
ranging from financial services to retail.
When not building solutions for my customers and partners,
I love spending time running behind my three-year-old daughter,
indoors and outdoors, whenever Seattle weather permits.
Let’s have a quick look at the topics that we will cover today.
We will first talk about what a shared services platform is and why you should build one within the organization.
We will then go through in detail and understand the architectural topology of the management shared services platform
on AWS, and then see how to create a shared services platform on Amazon EKS using
the AWS Cloud Development Kit, commonly referred to as CDK.
We will then shift gears on understanding in depth how to implement DevSecOps practices
for the JFrog platform on AWS
and EKS and then go through how to use the application platform that we created
to deliver the code to an application EKS cluster
running in a separate account, and implement DevSecOps practices all the way using Xray.
We will also go through the different components of implementing DevSecOps practices
on JFrog platform and [inaudible].
Towards the end, we will wrap up with a quick demo,
next steps and then open up for Q&A.
We have a lot to cover in the next 25 minutes or so.
So let’s get started.
To level set what we will talk about in the session today
and make good use of it,
let’s first understand what a shared services platform,
commonly referred to as SSP, is and why an organization should invest in building one.
A shared services platform is an organization’s internal platform
which is used by multiple teams internally.
As the platform is leveraged by multiple teams, it should be secure by default.
A shared services platform allows the separation of concerns in software delivery,
very similar to the AWS shared responsibility model,
which relieves the customers from the operational burden
as AWS operates, manages and controls the components
from the host operating system and virtualization layer down,
and then the customer assumes the responsibility and management of the guest operating system.
The shared services model allows a team within the organization to manage a portion of shared services,
be it the tooling for continuous integration and deployment
or identity and access management.
This allows application teams to build software
that delivers business value to its customers,
leveraging the services from shared services.
A typical shared services architecture is provisioned using AWS Control Tower
and AWS Organizations, allowing different AWS accounts to be provisioned at scale for each purpose.
In an organization there are typically at least four separate AWS accounts;
there could be more depending on the size, the nature and the use cases of the organization.
The first category of AWS account is used, for example, for hosting a CI/CD platform
or tooling leveraged by multiple application teams.
In this talk, we will see later how
to provision and use the JFrog platform
in a totally separate AWS account.
The second category is generally used for identity and access management
or for network management.
The third one is a master payer account, which is used for billing purposes.
And the fourth one is typically used for ensuring the desired state of your security.
This is an account where all the security logs
from multiple sources and multiple accounts, like CloudTrail, CloudWatch,
and GuardDuty, are aggregated to perform analysis, forensics and remediation.
Managing and optimizing the software lifecycle is often a disjointed effort
and disjointed process, with developers and IT operations teams working in silos.
This lack of coordination can introduce inconsistencies,
errors and vulnerabilities.
Continuous integration and continuous deployment help to avoid these challenges.
According to a recent CNCF survey report, there has been an increase in enterprises using containers
for CI/CD as companies realized the need to shift to a cloud-like development and deployment model.
A shared services platform on containers helps companies achieve the benefits of
agility, portability, security and speed
with the practicality of CI/CD methodologies.
A well-architected platform provides automated integration of workflows and solutions
as well as the functionality for developers to quickly scale on demand.
In addition, from day one,
it should be architecturally designed for immutability
to limit the potential for cyberattacks.
Shifting gears a little bit, let’s now dive deep on building a shared services CI/CD platform
using Artifactory and Xray.
But let’s first understand the various components and
the ecosystem, as there are different moving pieces all the way around.
I promise to make it as easy as possible to understand.
The shared services platform is typically provisioned in a separate AWS account,
because of the reasons that we mentioned before.
In order to be highly available, the VPC spans at least two different private subnets.
There are no public-facing subnets involved, as the load balancer through which the platform is accessed
does not need to be accessed publicly.
So there is no point in creating
a public subnet, and everything is accessed within the organization itself.
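The subnet layout just described can be written down as a small check. This is only a sketch in plain TypeScript, not CDK code: the `SubnetPlan` type, the subnet names and the availability zone values are assumptions made up for illustration.

```typescript
// Hypothetical description of one subnet in the platform VPC.
interface SubnetPlan {
  subnetName: string;
  availabilityZone: string;
  isPublic: boolean;
}

// Returns a list of problems. An empty list means the plan matches the
// topology described in the talk: private subnets only, in at least two zones.
function checkSubnetPlan(subnets: SubnetPlan[]): string[] {
  const problems: string[] = [];
  const zones: string[] = [];

  for (const subnet of subnets) {
    if (subnet.isPublic) {
      problems.push("subnet " + subnet.subnetName + " is public");
    }
    if (zones.indexOf(subnet.availabilityZone) === -1) {
      zones.push(subnet.availabilityZone);
    }
  }
  if (zones.length < 2) {
    problems.push("subnets must span at least two availability zones");
  }
  return problems;
}

// Example plan: two private subnets in two different zones (assumed values).
const subnetPlan: SubnetPlan[] = [
  { subnetName: "private-a", availabilityZone: "us-west-2a", isPublic: false },
  { subnetName: "private-b", availabilityZone: "us-west-2b", isPublic: false },
];
const planProblems = checkSubnetPlan(subnetPlan);
```

A check like this could run in a unit test next to the infrastructure code, so that a public subnet cannot slip into the platform account unnoticed.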
We then create the Amazon EKS cluster,
spanning these subnets,
and it is probably best to use managed nodes to join the EKS control plane.
A quick introduction to managed nodes.
With EKS managed node groups,
you don’t need to separately provision or register the EC2 instances
that provide compute capacity to run your Kubernetes application,
which in our case is the entire JFrog platform.
You can create, update or terminate nodes for your cluster with a single operation.
Nodes run using the latest EKS-optimized AMIs in your AWS account, while node updates and terminations
gracefully drain nodes to ensure that your application stays up and available.
All managed nodes are provisioned as part of an Auto Scaling group
that is managed for you by EKS.
All resources, including the instances and Auto Scaling groups,
run within your AWS account.
Each node group uses the EKS-optimized AMI and can run across
multiple availability zones that you define.
You can also provision the platform on Bottlerocket.
Now a quick segue, as we are touching on the topic of the operating system.
Let’s talk more about what the Bottlerocket operating system is.
Bottlerocket is a Linux-based operating system
that is purpose-built by AWS for running containers on virtual machines or bare metal hosts.
Most customers today run containerized applications on general-purpose operating systems
that are updated package by package, which makes operating system updates difficult to automate.
Updates to Bottlerocket, however, are applied in a single step rather than on a package-by-package basis.
This single-step update process
helps reduce management overhead by making OS updates
easy to automate using a container orchestration service such as EKS itself.
The single-step updates also improve uptime for container applications
by minimizing update failures and enabling easy update rollbacks.
Additionally, and most importantly, in my opinion,
Bottlerocket includes only the essential software to run containers,
which further improves resource utilization and greatly reduces the attack surface.
We also use an Amazon RDS MySQL database as the database tier to which the JFrog platform connects.
When you provision a Multi-AZ database instance,
Amazon RDS automatically creates a primary database instance
and synchronously replicates the data to a standby instance in a different availability zone.
In case of an infrastructure failure,
Amazon RDS performs an automatic failover to the standby instance.
Since the endpoint for your database remains the same after the failover,
your application can resume database operations
without the need for manual administrative intervention.
We also configure RDS to store the admin credentials in AWS Secrets Manager
so that users and applications retrieve the secrets with a call to the Secrets Manager APIs,
essentially eliminating the need to hard-code sensitive information in plain text.
Secrets Manager offers secret rotation
with built-in integration for Amazon RDS.
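To make the idea of fetching credentials at run time concrete, here is a minimal sketch in plain TypeScript. It does not call the real Secrets Manager API: the `SecretFetcher` function is a stand-in for such a client, and the secret name, host and credential values are made-up placeholders. The JSON field names follow the layout that RDS-managed secrets commonly use, which is an assumption about the deployment.

```typescript
// Assumed shape of the JSON document stored in the database secret.
interface DbSecret {
  engine: string;
  username: string;
  password: string;
  host: string;
  port: number;
  dbname: string;
}

// Stand-in for a secrets client: maps a secret name to its JSON string.
// A real deployment would call the Secrets Manager API here instead.
type SecretFetcher = (secretName: string) => string;

// Fetches and validates the secret so that no credential is hard-coded.
function loadDbSecret(fetchSecret: SecretFetcher, secretName: string): DbSecret {
  const parsed = JSON.parse(fetchSecret(secretName));
  const required = ["engine", "username", "password", "host", "port", "dbname"];
  for (const field of required) {
    if (parsed[field] === undefined) {
      throw new Error("secret " + secretName + " is missing field " + field);
    }
  }
  return {
    engine: String(parsed.engine),
    username: String(parsed.username),
    password: String(parsed.password),
    host: String(parsed.host),
    port: Number(parsed.port),
    dbname: String(parsed.dbname),
  };
}

// Builds a connection URL without the password, which is passed separately.
function connectionUrl(secret: DbSecret): string {
  return secret.engine + "://" + secret.host + ":" + secret.port + "/" + secret.dbname;
}

// In-memory fake used only for this example; all values are placeholders.
const fakeSecretStore: { [secretName: string]: string } = {
  "platform/db-admin": JSON.stringify({
    engine: "postgres",
    username: "admin_user",
    password: "example-only",
    host: "db.internal.example",
    port: 5432,
    dbname: "artifactory",
  }),
};
const fakeFetcher: SecretFetcher = (secretName) => {
  const value = fakeSecretStore[secretName];
  if (value === undefined) {
    throw new Error("unknown secret " + secretName);
  }
  return value;
};

const dbSecret = loadDbSecret(fakeFetcher, "platform/db-admin");
const dbUrl = connectionUrl(dbSecret);
```

Because the application only knows the secret's name, rotating the password does not require a code change or a redeploy.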
We use Amazon S3 for the persistence layer,
and Artifactory uses a cache file system as a read buffer to cache frequent requests,
thereby significantly reducing the number of hits going to S3
and drastically improving the performance.
Artifactory also uses eventual cache as a write buffer,
which means uploads are asynchronous and early
and the cache is local on the disk.
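The effect of a read buffer in front of a slower object store can be illustrated with a small read-through cache. This is a generic sketch, not Artifactory's implementation: the `ReadThroughCache` class, its least-recently-used eviction and the artifact paths are assumptions chosen for illustration.

```typescript
// A tiny least-recently-used read cache in front of a slow backing store.
// Lookups scan an array, which is fine for a sketch but not for large caches.
class ReadThroughCache {
  private entries: { key: string; value: string }[] = [];
  public backendReads = 0;

  constructor(
    private capacity: number,
    private readFromBackend: (key: string) => string
  ) {}

  get(key: string): string {
    for (let i = 0; i < this.entries.length; i++) {
      if (this.entries[i].key === key) {
        // Cache hit: move the entry to the end (most recently used).
        const hit = this.entries.splice(i, 1)[0];
        this.entries.push(hit);
        return hit.value;
      }
    }
    // Cache miss: read from the backend and remember the result.
    this.backendReads++;
    const value = this.readFromBackend(key);
    this.entries.push({ key: key, value: value });
    if (this.entries.length > this.capacity) {
      this.entries.shift(); // evict the least recently used entry
    }
    return value;
  }
}

// The backend function stands in for a read from the object store.
const artifactCache = new ReadThroughCache(2, (key) => "contents of " + key);
artifactCache.get("libs/app-1.0.jar");
artifactCache.get("libs/app-1.0.jar");
artifactCache.get("libs/app-1.0.jar");
const readsAfterRepeats = artifactCache.backendReads;
```

Three requests for the same artifact reach the backend only once, which is the reduction in S3 hits that the read buffer is meant to deliver.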
Finally, we front the pods
that power the Artifactory HA and Xray applications
with an Elastic Load Balancer.
As security is of prime importance and
a day-zero responsibility,
we also use AWS Certificate Manager
to add certificates and link them with Elastic Load Balancing
to seamlessly manage and rotate the certificates.
Now, as we just saw, there are multiple moving pieces in this setup,
and provisioning this entire end-to-end setup can be challenging.
As we need to install the JFrog platform in a Kubernetes cluster,
traditional approaches require context switching between at least two different APIs:
one for provisioning and managing AWS resources,
and then the Kubernetes API to deploy the workloads in the cluster.
The workloads in this case are the different moving pieces of the JFrog platform,
including Artifactory pods and Xray pods.
Developers and administrators need to be aware of at least two different CLIs,
syntaxes and technologies in order to build the platform in its entirety.
Not to mention, AWS’ and JFrog’s efforts are always to make the installation of the platform as seamless as possible,
as this is an end-to-end infrastructure.
This specific challenge is largely solved by using the AWS Cloud Development Kit,
as I said, commonly referred to as AWS CDK.
Before we go further, a quick introduction to AWS CDK.
CDK is an open source software development framework
to define your cloud application resources using familiar programming languages
such as Go, Java, TypeScript, and Python.
Provisioning cloud applications can be a challenging process, as we saw earlier,
as it requires you to perform manual actions,
write custom scripts, maintain templates or even learn domain-specific languages.
AWS CDK uses the familiarity and expressive power of programming languages
for modeling your applications.
It also provides you with high-level components called constructs
that preconfigure cloud resources with proven defaults,
so that you can focus on building cloud applications without the need to be an expert in it.
AWS CDK provisions your resources in a safe, repeatable manner
through AWS CloudFormation
and uses the same CloudFormation engine to provision the resources.
It also enables you to compose and share your own custom constructs
that incorporate your organization’s requirements,
helping you to start new projects faster.
The JFrog CDK construct, behind the scenes, also uses CDK
in its library to install the platform using Helm
as a package manager for Kubernetes,
with smart defaults and optional overrides.
With that, I’m happy to share the developer preview of the new JFrog CDK L3 construct,
which is an end-to-end abstraction of the entire setup that we just saw.
It is something that we developed in partnership with JFrog.
This CDK construct will empower you to quickly install the entire JFrog platform
on Amazon EKS with the recommended architecture that we just saw.
We strongly encourage you to try out this L3 construct
and let us know any feedback
so that we can further iterate on it for improvement.
You can download it, effective immediately, from the npm and PyPI repositories.
The constructs are available under the package name jfrog-cdk-constructs.
Now let’s talk about DevSecOps.
But before we go further, let’s first define what exactly is DevSecOps.
DevSecOps is the combination of cultural philosophies,
practices and tools that exploits the advances made in IT automation
to achieve a state of production immutability,
frequent delivery of business value, and automated enforcement of security policy.
DevSecOps is achieved by integrating and automating the enforcement of preventive,
detective, and responsive security controls into a pipeline.
Working backwards from achieving automation,
there are three different components of DevSecOps that we target,
which is security of the pipeline, security in the pipeline,
and finally, enforcement of the pipeline.
Let’s dive deep on each of these.
First up, security of the pipeline.
The first step in achieving security of the pipeline is
essentially treating the pipeline and the related infrastructure as a workload
and treating it with the same regard as any other piece of application code.
We provision the entire JFrog platform with AWS CDK,
and any deviation to it must be code reviewed as well.
Moreover, JFrog pipelines, or even for that matter AWS pipelines,
can be saved as YAML DSLs,
and it is strongly recommended that each team manages the pipeline as code.
Next up, have a monthly review of the permissions attached to the JFrog pipelines
and rotate the keys on a regular basis.
Grant access to AWS resources based on the
principle of least privilege.
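One way to make the periodic permission review repeatable is to lint policy documents for wildcards. The sketch below models an IAM-style JSON policy as a plain TypeScript object and flags overly broad statements. The `findOverlyBroadStatements` helper, the account ID and the cluster name are illustrative assumptions, not part of any AWS or JFrog tooling.

```typescript
// Minimal model of an IAM-style policy document.
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string[];
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

// Flags Allow statements whose action is "*" or "service:*",
// or whose resource is "*".
function findOverlyBroadStatements(policy: PolicyDocument): PolicyStatement[] {
  return policy.Statement.filter((statement) => {
    if (statement.Effect !== "Allow") {
      return false;
    }
    const wildcardAction = statement.Action.some(
      (action) => action === "*" || action.slice(-2) === ":*"
    );
    const wildcardResource = statement.Resource.some((resource) => resource === "*");
    return wildcardAction || wildcardResource;
  });
}

// Example policy: the first statement is scoped, the second is too broad.
// The account ID and cluster name are placeholders.
const deployPolicy: PolicyDocument = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Action: ["eks:DescribeCluster"],
      Resource: ["arn:aws:eks:us-west-2:111111111111:cluster/app-test"],
    },
    { Effect: "Allow", Action: ["s3:*"], Resource: ["*"] },
  ],
};
const broadStatements = findOverlyBroadStatements(deployPolicy);
```

Running a check like this as part of the code review for pipeline permissions turns the least-privilege rule into something a build can enforce.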
Use role-based access control to identify who can access the pipelines,
because that becomes a key element in order to deliver the software products.
Configure access and system logging and metrics in analytics services like
Amazon Elasticsearch Service, CloudWatch,
Amazon Managed Grafana, or Prometheus using the Fluentd plugin.
And lastly, practice the operations required to patch and update
not only the pipelines, but also the entire platform.
Next up, let’s talk about security in the pipeline and how to achieve it.
Security in the pipeline needs to be considered throughout the lifecycle of the application,
be it in the form of containers or other binaries.
It starts with static code analysis at rest,
in version control software like AWS CodeCommit.
Now how would you actually go about
doing the code analysis when your code is stored at rest
in services like AWS CodeCommit?
The answer to that is to use a service like Amazon CodeGuru,
which is a developer tool that not only provides intelligent recommendations
to improve code quality and identify an application’s most expensive lines of code,
but also uses machine learning algorithms and automated reasoning
to identify critical issues, security vulnerabilities and
hard-to-find bugs during application development,
and provides recommendations to improve the code quality.
Next up is the build process,
and let’s consider a container application build.
A typical life cycle of a containerized application
involves doing an image build.
When the image is pushed, trigger image vulnerability scanning using Xray
and fail the build if a vulnerability is found,
so that the pipelines will not be able to deploy further to a target environment.
If there is a vulnerability found,
the build is marked as failed, and the pipeline execution stops.
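The scan gate described above boils down to a small decision function. The sketch below is a generic illustration rather than Xray's policy engine: the `ScanFinding` type, the severity levels, the configurable threshold and the finding IDs are all assumptions.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Hypothetical shape of one finding reported by an image scan.
interface ScanFinding {
  id: string;
  severity: Severity;
}

const severityRank: Record<Severity, number> = {
  low: 0,
  medium: 1,
  high: 2,
  critical: 3,
};

// Returns the findings that should fail the build: those at or above the threshold.
function blockingFindings(findings: ScanFinding[], threshold: Severity): ScanFinding[] {
  return findings.filter(
    (finding) => severityRank[finding.severity] >= severityRank[threshold]
  );
}

// The build passes only when no finding reaches the threshold.
function gateBuild(findings: ScanFinding[], threshold: Severity): "passed" | "failed" {
  return blockingFindings(findings, threshold).length === 0 ? "passed" : "failed";
}

// Made-up findings for illustration only.
const exampleFindings: ScanFinding[] = [
  { id: "EXAMPLE-1", severity: "low" },
  { id: "EXAMPLE-2", severity: "high" },
];
const strictResult = gateBuild(exampleFindings, "low");
const relaxedResult = gateBuild(exampleFindings, "critical");
```

Using a threshold of "low" reproduces the strict rule from the talk, where any vulnerability fails the build; a higher threshold is a policy choice each team would have to justify.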
Lastly, let’s talk about enforcement of the pipeline.
First and foremost, have a multi-account strategy to separate out the test and prod workloads
from the shared services platform.
The JFrog platform can deploy to these test and prod workloads
by assuming the necessary roles from the respective accounts.
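As a small illustration of the multi-account setup, the sketch below derives the role ARN that a pipeline would assume in each target account. The account IDs and the role name are placeholders, and the `TargetAccount` type is an assumption. The actual role assumption would go through AWS STS, which is not shown here.

```typescript
// Hypothetical description of a deployment target account.
interface TargetAccount {
  environment: "test" | "prod";
  accountId: string;
  deployRoleName: string;
}

// Builds the IAM role ARN for the deployment role in the target account.
function deployRoleArn(target: TargetAccount): string {
  if (!/^\d{12}$/.test(target.accountId)) {
    throw new Error("AWS account IDs are 12 digits: " + target.accountId);
  }
  return "arn:aws:iam::" + target.accountId + ":role/" + target.deployRoleName;
}

// Placeholder accounts for the test and prod workloads.
const targetAccounts: TargetAccount[] = [
  { environment: "test", accountId: "111111111111", deployRoleName: "platform-deployer" },
  { environment: "prod", accountId: "222222222222", deployRoleName: "platform-deployer" },
];
const roleArns = targetAccounts.map(deployRoleArn);
```

Keeping one narrowly scoped role per account means the shared services platform never needs long-lived credentials for the workload accounts.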
Go back.
Secondly, as a principle, enforce pipelines to do the deployments
and ensure there are no humans involved in the deployment process,
other than clicking a button in your pipeline, and only if required.
It will be hard to do this enforcement to begin with,
but start by identifying why a human would need access to the production account
or inside the workload,
and then use tooling to address that specific requirement.
A common requirement is: I need to view the production logs.
You can eliminate it by identifying what logs are required and
where they are sourced from,
and using tools like Fluentd, CloudWatch agents,
or AWS Distro for OpenTelemetry to output the telemetry data to an
external aggregator like CloudWatch.
Alright, let’s do a quick demo.
So we start the sample application by creating a CDK application
and declare the L3 construct that we created, which is jfrog-cdk-constructs,
as a dependency.
Once we declare the dependency,
you just need to create a new instance of the class
and fill in the properties for the cluster, the nodes and RDS,
as we’ll see in this case, specifying properties such as the Kubernetes version to be used,
the endpoints to be accessible, that is, the Kubernetes cluster-specific endpoints,
the kind of nodes, which in this case are Bottlerocket OS nodes,
and the specifics of the RDS database.
I want PostgreSQL version 12.5,
and the database name and the username that I should use.
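The properties listed in the demo can be pictured as a typed settings object. The sketch below is not the API of the JFrog CDK construct: the `PlatformSettings` interface, its field names and the validation rules are hypothetical, and the Kubernetes version, endpoint access mode, database name and username are assumed values. Only the Bottlerocket nodes and PostgreSQL 12.5 come from the demo.

```typescript
// Hypothetical settings object mirroring the properties named in the demo.
interface PlatformSettings {
  kubernetesVersion: string;
  endpointAccess: "private" | "public" | "public-and-private";
  nodeOperatingSystem: "bottlerocket" | "amazon-linux";
  database: {
    engine: "postgres";
    engineVersion: string;
    databaseName: string;
    username: string;
  };
}

// Returns a list of problems; an empty list means the settings look sane.
function validateSettings(settings: PlatformSettings): string[] {
  const problems: string[] = [];
  if (!/^\d+\.\d+$/.test(settings.kubernetesVersion)) {
    problems.push("kubernetesVersion should look like <major>.<minor>");
  }
  if (!/^\d+(\.\d+)*$/.test(settings.database.engineVersion)) {
    problems.push("engineVersion should be a dotted number");
  }
  if (settings.database.databaseName.length === 0) {
    problems.push("databaseName must not be empty");
  }
  if (settings.database.username.length === 0) {
    problems.push("username must not be empty");
  }
  return problems;
}

const demoSettings: PlatformSettings = {
  kubernetesVersion: "1.20", // assumed; the talk does not state the version
  endpointAccess: "private", // assumed; the talk only mentions endpoint access
  nodeOperatingSystem: "bottlerocket",
  database: {
    engine: "postgres",
    engineVersion: "12.5",
    databaseName: "artifactory", // placeholder name
    username: "artifactory_admin", // placeholder name
  },
};
const settingsProblems = validateSettings(demoSettings);
```

Validating settings before synthesis catches typos locally, instead of after a CloudFormation deployment has already started.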
Once I do that, the next step is to simply compile the TypeScript to JavaScript.
And once that is done, use the cdk deploy command
to deploy to the specific AWS account.
And you’ll notice that CDK will tell you
the list of changes that it is trying to deploy
and it will start creating a new stack in the AWS account that you have specified.
If you want to see how it looks on the AWS console,
simply log into the console, and you can see the CloudFormation stack.
And you’ll see in a few minutes,
a new stack will come up,
which will have the entire stack deployed end to end,
which includes the EKS cluster, the RDS database, secrets, load balancers, and pretty much everything end to end.
Alright, so you’ll see that it’s getting deployed.
I’ll take a quick pause and resume once the deployment is complete.
All right, so once the stack deployment is over,
you can see the entire output in the console itself, something like this.
And if you want to log into the Kubernetes cluster,
you can run the update-kubeconfig command in order
to connect with the cluster that is created.
And at the same time, if you want to see the output in the CloudFormation console,
it will look something like this: the stack is in CREATE_COMPLETE status.
So once the stack gets created and the platform gets deployed onto the EKS cluster,
you can open the URL that you have created,
and then create the pipelines.
Something like this: the entire pipeline DSL can be saved as code in your version control
so that you can collaboratively work on it, and also, as a good
DevSecOps practice, it’s probably best to save the pipelines as code as well.
The steps in the pipeline code look something like this. For a containerized application, as an example,
you do the Docker build,
you do the Docker push to the repository,
which is Artifactory itself, and this triggers the Xray scan,
and once the Xray scan is complete,
it is promoted to the repository,
which is then used to deploy to the EKS cluster in test and then in prod.
So, in a nutshell, this is what the entire application lifecycle using DevSecOps looks like.
And as you see, the Docker promotion step here will not proceed further
unless and until the Xray scan, which does the vulnerability scanning,
is complete, because the policies for Xray are configured like that.
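The ordering of the pipeline steps, and the way a failed scan blocks promotion, can be sketched as a simple sequential runner. This is plain TypeScript for illustration, not the JFrog Pipelines YAML DSL: the `PipelineStep` type and the step labels are assumptions, and every step is simulated with a function that returns success or failure.

```typescript
// One pipeline step: a label and a function that reports success or failure.
interface PipelineStep {
  label: string;
  run: () => boolean;
}

interface PipelineResult {
  completed: string[];
  failedAt: string | null;
}

// Runs the steps in order and stops at the first failure.
function runPipeline(steps: PipelineStep[]): PipelineResult {
  const completed: string[] = [];
  for (const step of steps) {
    if (!step.run()) {
      return { completed: completed, failedAt: step.label };
    }
    completed.push(step.label);
  }
  return { completed: completed, failedAt: null };
}

// Builds the step list from the demo; only the scan outcome varies here.
function buildSteps(scanIsClean: boolean): PipelineStep[] {
  return [
    { label: "docker build", run: () => true },
    { label: "docker push", run: () => true },
    { label: "xray scan", run: () => scanIsClean },
    { label: "promote", run: () => true },
    { label: "deploy to test", run: () => true },
    { label: "deploy to prod", run: () => true },
  ];
}

const blockedRun = runPipeline(buildSteps(false));
const cleanRun = runPipeline(buildSteps(true));
```

With a failing scan the run stops before promotion, so nothing reaches the test or prod clusters; with a clean scan all six steps complete in order.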
All right, so switching back,
that was all I had for the demo.
A quick next step is to give the AWS CDK construct for JFrog a try.
It is available as the jfrog-cdk-constructs package
in the npm and PyPI repositories.
And also, we have created a hands-on lab with JFrog at jfrog.awsworkshop.io.
Give the workshop a try so that you can try out the end-to-end experience on your own,
and reach out to us if you have any questions.
We’re happy to hear the feedback at any point of time.
With that, thank you so much for listening to this session.
I hope you find it useful. And again,
I’m available here to answer any questions that you may have
around the sessions or anything in general.
Happy swampUP, and looking forward to your feedback.
Thank you.