When you’re new to an industry, you encounter a lot of new concepts. We tend to use a lot of jargon, the documentation may be written with someone more experienced in mind or rely on contextual knowledge of the rest of the space, and it often doesn’t explain the “why” for the tool. This can make it really difficult to get your feet underneath you on an unfamiliar landscape, especially for junior engineers.
In this webinar, we’ll break down what CI/CD is — how it solves problems for you as a developer, your organization, and your users. We’ll also go over some of the most popular CI tools out there, and give you a practical example of how to get started yourself.
Hi everybody, thank you for joining me on this JFrog webinar, and welcome to DevOps 101: CI/CD.
Let’s get started.
First, a little bit about myself: my name is Kat Cosgrove and I’m a developer advocate at JFrog. Before that, I was an engineer on the IoT team here.
And I didn’t really get where I am through strictly traditional means. I did freelance as a web developer for a while, but I have also been a bartender, a waiter, a teacher, and the resident horror expert at an independent video store, so if you have questions about horror movies, I can answer those too. I credit a lot of my success as an engineer and as a developer advocate to the broad experience I have and to my kind of non-traditional, circuitous route into tech. It’s allowed me to see the tech industry from a lot of different angles and have a lot of different struggles. But some of these struggles are actually fairly universal in tech; they have nothing to do with the fact that I don’t have a computer science degree. Computer science grads experience these problems too. And somehow, they are still completely unaddressed, and that is not great. I would like to put a stop to that, so that the next person who comes after me doesn’t have to spend as much time figuring these things out as I have. So by the end of this webinar, you should understand what CI/CD is, its history, and why it’s important.
If you would like to get a hold of me later, you can find me on Twitter at @Dixie3Flatline. You can also email me at katc@jfrog.com, though, just a heads up, I’m kind of slow to respond to emails sometimes. It is a character flaw that I am aware of and working on, but I’m always happy to help if it is within my power. So if you have a question, please don’t hesitate to reach out, whether it’s privately or in the Q&A box here.
Here we go.
First of all, it just isn’t simple. You see these things called simple a lot, but they aren’t. When you are new to an industry, you are going to encounter a lot of new concepts, and it can make it really, really difficult to get your feet underneath you on a completely unfamiliar landscape. This is especially true for junior engineers, for actual newbies. So, what does DevOps actually mean? What is all of this software? What’s all the jargon? Is DevOps a methodology, or is it a toolset? Is any of this actually going to make my life easier, or is this just a bunch of industry buzzwords that upper management and leadership like to throw around to look fancy?
A lot of the documentation we have also kind of assumes that you already have some additional context or that you’re proficient in some related tooling, and that doesn’t exactly make it easy to learn. So, in this webinar, I’m going to break down CI/CD.
Let’s start with what that actually stands for. The CI stands for continuous integration, and the CD can stand for either continuous delivery or continuous deployment.
Yes, they do mean different things, and no, I don’t like that, but it is what it is.
I promise I will explain that though.
We’ll start with the CI.
Practicing continuous integration means merging every developer’s working copy of the code with the source multiple times a day.
Doing this requires a series of automated builds and unit tests to ensure that none of the proposed changes cause problems, but the result is that bugs and integration issues are discovered way earlier in the development process.
Ideally, a build is triggered with every single commit. I know that sounds really difficult to manage, but it’s not that bad. Ideally, failures are caught by the developer immediately and corrected immediately, because there is a new build and new tests running every time something is committed. This also forces engineers to write code that is more modular, which makes it easier to support later on, because part of the problem here is people: other people have to maintain your code, and other people have to be able to read and understand your code. So prioritizing readability and maintainability is an important piece here.
Continuous integration has actually been a thing for a pretty long time. It’s been around since the early 90s, though it hasn’t always been called that, and some of the implementation has changed a little. The spirit is largely unchanged: merge changes into source in smaller but more frequent increments, test that the project still builds and runs with those changes, and make sure your engineers are all working on the most recent version of the source. Do this and you won’t end up with a bunch of merge conflicts or surprise problems when it comes time to build.
This was first proposed by a man named Grady Booch in 1991, in his book Object-Oriented Analysis and Design with Applications.
The Booch method advocated for more frequent use of classes and objects in programming in order to simplify design. His version of continuous integration didn’t suggest releasing multiple times a day, though. That happens much later. In 1997, Extreme Programming became a thing, and it built on the Booch method by advocating for releasing multiple times a day. It kind of changed the game, honestly, as ridiculous as the name sounds in retrospect. It didn’t mean extreme in a 90s, X Games, extreme sports kind of way. It meant extreme as in taking things that were already kind of accepted parts of programming and taking them really, really far, taking them to the extreme. So, thanks to Extreme Programming, we now have shorter release cycles. Pair programming, which is really just an extreme extension of code review, is something Extreme Programming gave us. We also have unit testing and acceptance testing as standard aspects of writing software, and these are things that we all do now because of a thing from the 90s.
And these things all make our lives easier. They radically improved quality. More and more methodologies then built on this, all with one goal in mind: make it easier to write clearer, higher quality code and get it out to the users faster.
In the early days, while we recognized that we needed to be releasing more frequently, we didn’t really have the tools to make that easier. We didn’t get the first open source tool to make continuous integration easier to achieve until 2001, with the release of CruiseControl. If you go do a Google image search for CruiseControl, it looks pretty primitive now, but at the time, it was revolutionary.
So the concept has been around for quite a while like pushing 30 years, but it’s been changing the whole time. And we still haven’t really gotten good at explaining this.
Let’s talk about the CDs. Continuous delivery means what it says on the box: your software updates are continuously delivered. In concert with continuous integration, this means that you should have the ability to deploy a new build very rapidly, because you’ve already automated some quality gates that would otherwise need to be performed manually, like the process of building and testing. That reduction in manual labor means you get to release a bunch of small changes, rather than one huge update every couple of months. I don’t know what the age and experience demographic of people listening right now is, but there was a time not too long ago when it wasn’t super unusual to see major applications go a year or more without receiving a software update. I will talk a little bit about that later on.
And since you’re now making smaller, more incremental changes, you can also be more confident that your release isn’t going to break when you deploy to your users, or that if it does, you’re going to be able to track down what caused the problem more quickly, because you are going to push a bad update eventually. Continuous deployment is similar, but it goes just one step further: the deployment step is automated too. In continuous delivery, there is still a manual quality gate involved before an update is out in the wild. This is kind of a big step for some people, and it does require a lot of trust in the system you’ve built, but personally I am a huge fan of automating the deployment as well, because for a modern DevOps pipeline, which means you, the engineer, as well, to be as efficient as it possibly can be, human involvement has to be removed wherever we can.
I say this in maybe 40% of the talks I give, but humans are really, really bad at repetitive tasks. There’s a hard ceiling to how good we can get at a menial task by doing it over and over again. We get bored really easily, we get distracted, maybe somebody is just having a bad day, or they skipped breakfast or whatever and they’re not working at 100%, and we’re also really, really, really slow. And that is all normal; humans are gonna make mistakes. But the computer doesn’t get bored or distracted, and it’s not slow. If you write good, comprehensive tests and automate everything you can, and then accept that you absolutely are going to deploy a bad update eventually, whether a human is involved in pushing the big green button or not, you are going to have a better day. What matters here is how quickly you can respond to and correct a bad update. CI/CD helps you get there.
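To make the difference between the two CDs concrete, here is a minimal Python sketch. It is a toy model, not any real tool’s API: the names `Mode` and `should_deploy` are invented for illustration. The only difference between the two modes is whether a human has to approve the release once the automated checks have passed.

```python
from enum import Enum


class Mode(Enum):
    DELIVERY = "continuous delivery"      # a human still approves each release
    DEPLOYMENT = "continuous deployment"  # the release step is automated too


def should_deploy(checks_passed: bool, mode: Mode, human_approved: bool = False) -> bool:
    """Decide whether a build goes out, given the automated checks and the mode."""
    if not checks_passed:
        # A failed build or test run blocks the release in either mode.
        return False
    if mode is Mode.DELIVERY:
        # Manual quality gate: someone has to push the big green button.
        return human_approved
    # Continuous deployment: passing the automated checks is enough.
    return True
```

In both modes the automated checks come first; continuous deployment only removes the last manual gate.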
Let’s get a little bit more high level on what CI/CD is. What does it get you?
This is literally just a term for the marriage of the concepts of continuous integration and either continuous delivery or continuous deployment. We bundle them together now, but they did used to be distinctly different things. Now you don’t really see much of one without the other. And it’s a really important part of DevOps. Since automation and efficiency are the whole point of DevOps, one does not really work in the context of DevOps without the other. You do need both.
Implementing CI/CD practices gets you a faster, more reliable release cycle. You can add new features or bug fixes much faster, since you know your engineers are all working from the most recent source, you know there are unit and integration tests, and you know it builds. There aren’t a bunch of manual steps involved for the engineers or QA or whoever else you might have managing quality gates. Instead, somebody pushes code or opens a pull request, and those steps are taken care of by your CI/CD tooling. It can also automate things like promoting builds from dev to staging to production. You really can automate every part of your lifecycle.
Let’s talk about how it actually works though. To start, you need a CI CD tool that is what’s going to automate a bunch of manual processes for you.
It does take some time to set one of these up at the start of a new project, but personally, I am the particular flavor of engineering lazy where I’m willing to spend a lot of extra time at the beginning to make sure that I don’t have to do a bunch of repetitive manual tasks every single time I push code later on. The specifics of configuring any of these tools do kind of vary, so check the documentation for your chosen tool. I will go over a couple in brief later on, but broadly, they all work the same way.
You set something as a trigger, like telling it to watch your source repository for a commit or a merge. You then configure a series of steps, each with pass/fail conditions, like telling it how to run your unit tests, how to build, how to scan for vulnerabilities, or how to deploy your application. With a sufficiently detailed pipeline, you really don’t have to do anything but write code and push it; everything else is going to be handled for you. You can make these things do just about anything.
The specifics of how the steps are defined do kind of vary from tool to tool, but usually there is some kind of configuration file that defines the steps for you. That might live in the repo with your code or in a separate repo that’s just for CI, and it’s usually in a format called YAML.
There are a lot of ongoing jokes about how people who work in DevOps are just YAML engineers, or that DevOps is 100% YAML, and the jokes are not entirely untrue. If you do a lot of work with CI/CD or with infrastructure, you are probably going to be writing a lot of YAML.
Most of these tools do also have a web interface type of thing that gives you a graphical overview of what your steps look like as well, frequently with logging output, so you know exactly what is going on and when. It is pretty rad.
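The trigger-plus-steps model described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular CI/CD tool is implemented: `Step`, `should_trigger`, and `run_pipeline` are hypothetical names, and the branch name `master` is an assumption borrowed from the demo later in this webinar.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    name: str
    run: Callable[[], bool]  # returns True on pass, False on fail


def should_trigger(event: str, branch: str) -> bool:
    """Trigger on a push to, or a pull request against, the master branch."""
    return event in {"push", "pull_request"} and branch == "master"


def run_pipeline(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run the steps in order and stop at the first one that fails."""
    log: List[str] = []
    for step in steps:
        passed = step.run()
        log.append(f"{step.name}: {'pass' if passed else 'fail'}")
        if not passed:
            # Later steps (such as deploy) never run after a failure.
            return False, log
    return True, log
```

For example, a pipeline whose build step fails never reaches its deploy step, and the log shows exactly where it stopped, which is roughly what the graphical overview in a real tool gives you.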
Now we know what CI/CD is. So let’s talk about what goes into a good update and why it’s important. If you’re around my age or older, you probably remember how much drama used to be involved in software updates. They were large, they were infrequent, they took a lot of time to apply, and the changelogs were huge. There was a fairly significant chance that the new version would be buggy in some way, and it was just generally a super inconvenient experience. In a lot of situations, it wasn’t even possible to just download an update; the manufacturer would have to provide you with the update via physical media, whether that meant floppy disks or CD-ROMs or a USB drive.
The last time I had to do this was actually in 2009, which doesn’t feel that long ago, but I guess it is. I was updating the software for the video store I worked at. We relied on this application to operate, and it was old and it needed an update, and literally in 2009, the vendor had to snail mail me a thumb drive.
No, they would not just email me the executable. Yeah I did ask, they really would not do it.
And in their defense, I guess, this particular application still required a machine with a serial port, which by that time was increasingly uncommon on the kind of machine you could build from parts at Best Buy, because the license authentication was performed by verifying the presence of a physical dongle that only connected via serial. Some of you who are younger than me may not even know what a serial port looks like, which is just wild. Some things do still work this way, to be clear, but it is exceedingly rare outside of a handful of industries, and in those industries it is still, regrettably, the best way.
Phones have suffered similar problems. Way back in the 90s, there wasn’t really a way to update your phone’s software to get Snake or whatever; you just had to buy a new phone.
Eventually, phones did get smarter, but updates still required plugging the phone in, and cloud storage wasn’t really ubiquitous yet, so data transfers when you upgraded to a new phone were done by a physical cable. This is also not too far in the past. The last time I had to do this was when I was switching from a Samsung Galaxy to a Google Pixel a few years ago.
Now it’s something that we don’t even think about, though. When you switch phones, all of your contacts and photos and apps and whatever else are just there on the new phone. There’s some setup time, sure, but it happens pretty seamlessly, and I don’t notice it when an application on my phone updates anymore, because it just does it automatically while I’m sleeping, at like three in the morning, as long as my phone’s on a charger.
Let’s look at things from a consumer angle, though. Most modern cars are full of computers. Those computers need updates.
Most of them actually can’t be updated over the air and need to be taken into a service center or a dealership or something if there is a problem with the software.
I know you’re thinking of Tesla, and I am going to talk about Tesla, but Teslas are the exception and not the rule. This is actually a huge problem in the automotive industry, and it’s the topic of a totally different talk, but suffice it to say that over-the-air updates cause a lot of expensive problems for car manufacturers. I am going to use Tesla as an example because they can do it, but they aren’t perfect.
You might have heard of phantom braking. If not, this is a problem with Tesla Autopilot where conditions were good, the road was clear, there were no visible obstructions, but for some reason, the Autopilot slammed on the brakes. Kind of annoying. It’s probably a little bit scary. And it took a while to fix, and not really because it was difficult to track down the cause. So why? Because it seems important, right? Like something that should be addressed pretty quickly.
Because of a chess game, actually. While Tesla certainly has a CI/CD system in place, they weren’t using a design pattern we call continuous updates.
Instead, they release updates in batches, where multiple features and fixes are included in the same update. This is actually usually fine. It’s pretty common, but in this case, it was not fine.
The fix for the phantom braking issue was rolled in with a larger update that added a chess game to the car’s infotainment system. Is a chess game more important than fixing an issue with the car’s brakes? Your knee-jerk reaction might be to say, no, of course not, the brakes are more important, but it actually kind of depends. If you don’t have Autopilot for your Tesla, and at the time it was not included for everybody, why should you care?
You just want to play chess while your car charges, or whatever. If you do have the Autopilot package, though, you probably care a little bit less about the chess game and a little bit more about your brakes. Continuous updates could have prevented this problem, because instead of releasing fixes and features as they’re ready, here you have features that are important to one set of users waiting on features that are irrelevant to them but important to an entirely separate set of users. This is a lose-lose situation, and a sufficiently advanced CI/CD pipeline could have prevented it.
For leadership, there is the money angle. Not updating frequently and automatically can actually be really, really expensive, it turns out, and sometimes fatally so.
In 2012, a bad software update caused a major stock market disruption and tanked the value of a company called Knight Capital.
They were a trading firm that specialized in automated transactions on the New York Stock Exchange, and one day, it all came crashing down.
Why? Easy. They weren’t updating anything automatically, they weren’t automating their deployments, and they weren’t updating frequently.
And like I said earlier, humans are really bad at repetitive tasks. We are especially bad at them when we don’t get any practice at those tasks.
Two things contributed to the disaster.
When the engineer was updating the eight servers responsible for handling these automated trades, he forgot one. Only seven of the eight had the new software.
They also reused an API endpoint, but changed the behavior in the new version.
You can probably see where this is headed.
When it went into production, it was chaos. Requests to the old server running the old API endpoint wreaked havoc on the share prices of 148 companies. Engineers responded by taking the seven updated machines offline to roll back the software, increasing the load on the one machine that was never updated to begin with.
There was no logging, no monitoring, and there were no alerts.
They had to sit there, desperately trying to figure out what had happened, losing money minute over minute. In the 45 minutes that this bad update was live, Knight Capital lost $440 million, and ultimately went out of business as a result. This was tragic and it was preventable. Those servers were all being updated manually, but humans, again, get tired, we get bored, we get distracted, or we have a bad day. And again, it’s normal; you can’t control for that. If the process had been automated, this might not have happened.
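As a rough illustration of the kind of check an automated rollout can perform before going live, here is a small Python sketch. It does not describe Knight Capital’s actual systems: the host names, version strings, and function names are all made up. The point is only that a script comparing every server against the expected version will not forget the eighth machine.

```python
from typing import Dict, List


def hosts_not_on(expected_version: str, reported: Dict[str, str]) -> List[str]:
    """Return the hosts whose reported version differs from the expected one."""
    return sorted(
        host for host, version in reported.items() if version != expected_version
    )


def safe_to_go_live(expected_version: str, reported: Dict[str, str]) -> bool:
    """Only go live when every host reports the expected version."""
    return not hosts_not_on(expected_version, reported)


# Hypothetical fleet of eight servers where one was missed during the update.
fleet = {f"trade-{n}": "2.0" for n in range(1, 9)}
fleet["trade-8"] = "1.0"
```

With this fleet, `hosts_not_on("2.0", fleet)` names the server that was missed, and `safe_to_go_live` stays false until it is fixed.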
For me though, the most compelling argument in favor of spending the time and effort on CI CD is security.
CI/CD enables you to update your software much more quickly, which translates to being less vulnerable to known issues and being able to react more quickly if a breach does occur.
This is something you are probably familiar with. It’s fairly recent though you might not be aware of the cause.
This is the story of the Equifax disaster. In September of 2017, Equifax announced that a serious breach had occurred between May and July of that year. Remember the first month there: May. The names, addresses, birthdates, driver’s license numbers, and social security numbers of just shy of 149 million Americans had been stolen. The hackers had exploited CVE-2017-5638, a vulnerability in a very commonly used web framework called Apache Struts. Talking about how the vuln actually works is kind of out of scope for this webinar.
But the gist is that it would allow an attacker to remotely execute commands with whatever authority the web server had.
That’s pretty, pretty serious. And to be clear, this was not a vulnerability that flew under the radar. Even before the breach, this CVE was big news in the security world; everybody panicked. It was classed as a critical vulnerability and given a CVSS score of 10, which is the maximum, as something that absolutely had to be taken seriously and corrected immediately. The vulnerability was disclosed and published in March, two months before Equifax would actually be breached.
Once they discovered the breach, it took them two further full months to find and upgrade all usages of the vulnerable version of Struts. Lawsuits were filed, resulting in a $575 million settlement with the FTC, as well as payments and credit monitoring for those affected.
Unfortunately for us, the number of people affected by the breach was so large that after attorneys fees, many people didn’t get anything other than a lifetime of being paranoid about their credit.
This was caused initially by a lack of vulnerability detection, they just didn’t know.
But it was exacerbated by their inability to update rapidly it should not have taken them two months to resolve this issue.
So hopefully by now I’ve explained what CI/CD is, and sufficiently convinced you that it is worth the effort.
Now I would like to take you to look at a little bit of tooling.
Let’s go have a look at a CI/CD system in action, as well as the config file that defines the steps, for a freemium solution, where free usage is limited but it’s generally fine for personal projects or small stuff. I use GitHub Actions a lot, and it’s what I will use for an example here, since it’s pretty simple to set up and available to any project hosted on GitHub. I will also be using the JFrog CLI, so that the place where my binaries are actually stored can talk to my CI system, just like in real life.
I can also show you how you can get vulnerability detection involved as well. Both Artifactory and Xray are included in the JFrog free tier, so you can try this too.
If you’re in a more enterprise-y environment, JFrog Pipelines is an option for you. It is not currently included in the free tier, but, you know, if you’re an enterprise customer, you can probably talk to your rep and see about getting a quick trial. So let me share my screen for you.
Here we go.
So, I will zoom in on this so it’s easier for y’all to read, but it will make things look a little bit squished.
This is my Artifactory. This is my GitHub repo that I use for a workshop. So we have this pull request hanging out here.
And we see that it’s got some checks. This is our GitHub Action. Let’s take a look at what is actually going on in there.
It’s just called Sample workflow. It’s real little. It’s just there for purposes of demos.
But it’s handling a lot of stuff that we would ordinarily have to do manually and the syntax for these is pretty simple.
It starts by setting up the JFrog CLI for me. That way I can talk to Artifactory, which is where all of my packages are currently stored. It configures it to talk to my Artifactory instance.
And then it’s going to do a bunch of stuff that would otherwise be a big manual process. First, it is logging in to Docker so that it can build my Docker container. It’s just a little one-line Dockerfile that is just pulling Ubuntu.
Then it’s going to push the tagged build back to Artifactory. I will show you it all living in there shortly. It’s also going to collect a bunch of environment variables for me and package them up into something we call build info. It gives you a bunch of additional metadata and context around what’s going on in your build, so it’s easier to keep track of what’s going on, what’s being promoted, what is failing and where, that kind of thing. Personally, I’m kind of over the top about needing to know everything that could possibly be going on around my builds, so this is something I really, really like to have as part of my CI.
And then it publishes the build info back to Artifactory. Let’s look at the actual workflow so you can see what one of these looks like. The way GitHub Actions works is that you’ll have a directory in your repo, for a simple one like this, a hidden folder called .github, and within that, a folder called workflows.
This is just a sample workflow, but if you’ve never seen YAML before, this is what it looks like. It is very human readable; it’s super easy to read at a glance what’s going on. And generally, it is easy to write. You can run into some problems with tabs versus spaces, but you just use a YAML linter in your editor and it’s fine.
We’ve given it a name, Sample workflow, which is what you saw under Actions earlier when I clicked on it. We first define when the action will run; this is the trigger I talked about earlier. In this case, it’s going to trigger whenever something is pushed to my master branch, or whenever a pull request is opened against the master branch. We saw that earlier, because I had a pull request open from my dev branch to master. Then we define our jobs and some information about our builds, like the type of runner it goes on. I always default to Ubuntu, but there are other options.
From there, we start defining our steps. Within these, it is just bash scripting. So if you are familiar with bash, you can do damn near anything that you can do with bash in a normal Ubuntu environment. You can also install Python and pip if you need to, or whatever other software you need; it’s just a stripped-down Ubuntu container. So it is really, really flexible. This is true for almost every CI/CD tool out there: if you want to automate something that you were doing manually, you probably can with one of these tools. Up here, it literally just looks like normal terminal commands.
Back over in Artifactory, what that gets us is these builds. This is the Docker build that you saw it collecting build info for, building, tagging, and throwing back into Artifactory. You also see something here called Xray status. This is being scanned for vulnerabilities, and it has found a problem.
So let’s take a look and see what’s going on in here.
Xray data. Yeah, we see some violations here. I’ve set up some rules in my Artifactory and Xray to make sure that everything is actually looked at at a certain point, and to alert me if something exceeds my comfort level. In this case, I wanted it to let me know if anything with a medium severity or higher is found, and this is one of those conditions that you could bounce a build for in your CI system.
So, if you want to, you can edit this one. As part of your CI system, if something triggers a policy you’ve set up, like if something with a severity of high is found, the build would fail; it wouldn’t get built. You can also block downloads and block release bundle distribution. So this is the kind of thing that, as part of a CI system, would possibly have prevented the Equifax disaster, which is kind of yikes to think about. But it’s really, really important to try to automate as many of these things as you can, because you and your engineers don’t have the time, the energy, or the resources to do all of this stuff manually as frequently as you should be doing it.
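The severity rule described here boils down to a threshold comparison. The sketch below shows that logic in Python under loud assumptions: it is not the Xray API or its data format, and the finding IDs, the `severity` field, and the function names are hypothetical. It alerts at one severity level and fails the build at a higher one.

```python
from typing import Dict, List

# Severity levels ordered from least to most serious (an assumed scale).
SEVERITIES = ["low", "medium", "high", "critical"]


def violations(findings: List[Dict[str, str]], minimum: str) -> List[Dict[str, str]]:
    """Return the findings at or above the given minimum severity."""
    floor = SEVERITIES.index(minimum)
    return [f for f in findings if SEVERITIES.index(f["severity"]) >= floor]


def build_should_fail(findings: List[Dict[str, str]], fail_at: str = "high") -> bool:
    """Fail the build if any finding meets or exceeds the fail_at severity."""
    return bool(violations(findings, fail_at))


# Made-up scan results for illustration only.
findings = [
    {"id": "EXAMPLE-1", "severity": "low"},
    {"id": "EXAMPLE-2", "severity": "medium"},
]
```

With these findings, a medium-or-higher alert rule flags `EXAMPLE-2`, while a fail-at-high rule still lets the build through. A critical finding, like a vulnerability scored at the maximum, would stop it.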
It is just far and away better to spend the time learning how to use one of these tools, learning how to declare a YAML pipeline and have it handle your testing, your build, your vulnerability detection, your deployment, all of it for you, instead of making somebody do this repeatedly and manually. It does take work to learn how to do this at first, but really, it saves you so much time in the long term. I hope you all will try it. If you would like to try this yourself, you can sign up for the free tier version of the JFrog platform, which gets you Artifactory and Xray. So you have a place to store all of your packages and all of your builds for virtually anything you could be using, because there is also a generic repository type in case you are using something we don’t explicitly support.
Then set up GitHub Actions. If you don’t have an enterprise platform account, if you’re using the free tier, just try it with GitHub Actions and see if you like it. If you want to go check out this repository, katcosgrove/devops-101-workshop, you can even copy my sample workflow and try that yourself and see how it goes. You will have to fill in some blanks, but there are instructions in there.
There is a bunch of documentation in here that will help walk you through it, if this is something that you want to try hands-on yourself. If you go to the CI/CD module, there are all kinds of instructions for you that’ll hopefully make it a little bit easier than it otherwise might be, Googling around blindly on the internet.
So, in conclusion, like I’m writing a research paper in high school: CI/CD is a combination of methodology and tooling, with the ultimate goal of increasing your speed and efficiency as a developer by automating things like building and testing and deploying, so that you can do all of those things more often, instead of spending a ton of time doing them manually.
The benefit to you is more frequent software releases earlier detection of bugs and vulnerability issues, and bad releases making it to production, less often.
There are several tools out there that can help you accomplish this goal, from freemium things like GitHub Actions or Travis or CircleCI to enterprise-scale tools with a bunch of additional features, like JFrog Pipelines.
I hope I’ve helped you understand what CI/CD is and what it does for you.
If you are still confused, that is also okay. This is kind of a big problem, and some of it is still in flux. It has been changing a lot over the last three years, and it will probably continue to change for a little bit longer. Again, if you want to get a hold of me, you can do that on Twitter at @Dixie3Flatline or by email at katc@jfrog.com. If LinkedIn is your thing, my LinkedIn is on this slide as well, though I don’t actually use it all that much. It’s kind of just there.
Let’s go look at some questions.