DevSecOps – The Broken or Blurred Lines of Defense

John Willis
Senior Director Global Transformation Office

A classic model for risk management and control is something called “The Three Lines of Defense (3LOD).”

The three lines are as follows:

Line 1: Risk Owners – Front-line staff and operational management
Line 2: Risk Oversight – Risk management and compliance functions
Line 3: Risk Assurance – Internal audit

However, with the advent of modern sociotechnical systems like Agile, Cloud Native, and Event-Driven architectures, these legacy lines (3LOD) are at best blurred and at worst completely broken.

With the modern patterns and practices of DevOps and DevSecOps it’s not clear who the front-line owners are anymore. Risk management and organizational compliance teams struggle to adapt to new cloud-native models such as ephemeral containers, microservices, and event-driven architecture like serverless.

Most organizations’ internal audit processes today are highly toil-based and have low efficacy. This is something I have called in previous presentations “Security and Compliance Theater.” In this presentation, we are going to look at a couple of case studies that include the good, the bad, and the ugly when it comes to 3LOD.

Primary topics covered are organizational design, DevSecOps, and Automated Governance.

Video Transcript

Hello, I’m John Willis. I work at Red Hat, and I’m also known as @botchagalupe on Twitter. Today we’re doing a presentation called Security Differently, or what I sometimes call the blurred and broken lines of defense. Most of you probably know me; I’ve been around for quite a while and I’m pretty easy to find. In short, I’ve written about 12 books over the last 40 years and been in about 11 startups. Probably my most notable book is The DevOps Handbook with Gene Kim, Patrick Debois, and Jez Humble. I also did Beyond the Phoenix Project with Gene Kim, and worked on some interesting working papers, which became reference guides: DevOps Automated Governance and [inaudible]. I’ll be covering those two a little bit in this presentation. I also helped in an advisory role on some of the work for Gene’s book The Unicorn Project, a wonderful book. Then a lot of startups: sold a company to Dell, sold a company to Docker, was early on cloud at Canonical, pre-OpenStack. I was involved in the DevOps movement as one of the early advisors and core organizers of the original DevOpsDays and then also DevOps Enterprise Summit, and today I work at Red Hat.

One last thing: I’ve always been pretty much a geek on W. Edwards Deming, so I recently started a podcast on that passion, called Profound. If you’re interested, or you want to talk about Deming, or there’s something about Deming you’d like to discuss, just reach out to me; I’d love to hear from you. Or maybe you can be on the podcast.

 So one of the things I did when I started thinking about this presentation for this year, is ask this question.

I do this a lot when I’m starting out at the beginning of a year, writing a theme presentation for that year. And this one is Security Differently in 2021.

I was asking myself this question: what would DevSecOps look like if DevOps never existed? If we think about DevSecOps, the glass is definitely half full, right? In other words, we’ve added a lot of security onto our pipelines, we’ve built some really awesome DevSecOps reference architectures, and a lot of automation that didn’t exist has been bolted onto our DevOps pipelines and whatnot. But in some ways, like the picture to the left there, have we tried to force a square peg into a round hole? A lot of good things are making it through. But the question I guess I’m asking is, what if DevOps never existed?

What would DevSecOps look like? In other words, were we forced into a model or some bias around DevOps? Again, I’m not criticizing the good work we’ve done, just questioning how we would have done things a little differently had we started with the notion of DevSecOps and not tried to bolt security onto the DevOps model. There are a couple of examples, and I just wanted to pick one, which is this notion of the three lines of defense. A lot of times when I go into a large shop and talk to people, they show me their DevSecOps reference architecture. And I’m like, yeah, that’s really cool.

 That’s awesome. And they’ve done all the sort of good things that you need to do.

They basically have good hygiene in terms of delivery: small batches, all those good DevOps principles.

But then they’re also doing their SAST or DAST, and all that, in the pipeline. Again, all the things that you’d find in a good-hygiene DevSecOps reference architecture. But then in the same breath, they’ll talk about how they follow the three lines of defense model. There are different versions of this, but one of them is by the Institute of Internal Auditors. And that’s where, by design, the organization is encouraged to actually create these walls, right?

You have a third line, which is internal audit, then a second line, which would be sort of the risk owners, and then a first line, the actual control people who implement the controls. This design is predicated on silos, firewalls if you will, walling off the different groups on purpose. In other words, the second group will catch what the first group misses, and the third group will catch what the second group misses. So this is an anti-pattern; that’s the square peg in the round hole, in my opinion. And think about the original intent of DevOps. In fact, the original caricature is from Andrew Clay Shafer, who I worked for over at Red Hat. He’s been a dear friend of mine for many years, and he’s one of the founders of the DevOps movement.

Early on, basically in 2009, at the original Velocity, O’Reilly’s Velocity conference, where we started talking about some of these ideas, John Allspaw gave his famous talk on 10 deploys a day at Flickr.

On the same day, Andrew was giving a presentation called Agile Infrastructure. In it he showed this now-famous caricature of the wall of confusion, where Dev and Ops had different goals, and there was always this wall and no collaboration. Then there’s also what we find if we go back to Conway’s Law, the adage that organizations design systems that mirror their communication structure. We successfully smashed down that wall between Dev and Ops in the DevOps conversation over the years. But if you don’t, your organizational design, how you deliver your software, how you communicate, all those things become a mirror of that wall. So what I would argue is that the three lines of defense is a great example of an anti-pattern to what we possibly should have had with DevSecOps, yet we feel like it’s a normal practice.

But it really is these two walls of confusion, where we are, by design, not enabling collaboration and, in a sense, creating some of that square peg that’s not getting through. Possibly, if you didn’t have this particular mandate or organizational structure in place, you’d have better collaboration with audit.

I mean, that’s one of the big problems we find today: how do you get audit and the service delivery teams to talk? How do you get risk control to have a conversation about regulatory controls? We’re not seeing a lot of positive activity on that in large organizations. And then there’s also Larman’s Laws of Organizational Behavior. We won’t go through all of these, but I sort of…

I love number two, the second law, if you will, which is that any change initiative will be reduced to redefining or overloading the new terminology to mean basically the same as the status quo. In other words, that is part of the problem, right?

Even when you start to try to collaborate, certainly if you have the walls between internal audit or risk control owners and the service delivery teams, even if you push them into the same room to have a conversation, if they’re not freely communicating, they’re basically going to say, oh, yeah, we do that, okay, good. They’re going to agree on overloaded terminology and walk out of the room not really communicating, but sort of agreeing. So where this puts us is that all these issues are true here, but also, when we start thinking about the complexity of a post-cloud-native world, what we find is that these opportunities for higher risk become more accentuated, with bigger gaps. There are these three principles that are universally true, whether or not you’re in a post-cloud-native world. I use this phrase, it’s probably a terrible phrase, post-cloud-native modernization, to mean the things that are happening: your microservices, your cloud-native development, your containerization, your clustering, even your functions as well.

Regardless of that, you should be thinking about: how do we prove that we’re safe? How do we demonstrate that we’re secure? And how do we do both? But in the post-cloud-native world, it gets even worse, right?

Because of the complexity of things like service mesh, and we could just go on and on. And the question is, if we’re asking how do we prove we’re safe, how do we do that today in most organizations?

Pre- or post-cloud, we typically use ServiceNow or CI records. We document a sort of human conversation: yes, I’m going to do this, and I’ve prepared for that. Then somebody asks a question: well, what happens if this? And, before you go to production, I’d like to see this. It’s all this subjective conversation. And then how do you demonstrate it? The truth is, you really don’t, right?

Except in an audit. So maybe once a year you have an audit: 30 or 40 days, high toil, low efficacy. But the question I’m trying to raise here is, if we’re doing security differently, maybe proving that we’re safe should be more objective, or explicit, and demonstrating it should be immutable and instantaneous. So in order to be able to do both, what needs to happen in Security Differently 2021 is a move from implicit security to explicit, proof-based security, right? In this case, not implied in a ServiceNow record, but explicit through automation. Or, more specifically, let’s change the subjective nature of how we do things to objective and verifiable. Instead of relying on a human, let’s make it electronic, maybe with digital signatures, so the subjective things become things we can show without any human intervention, and verifiable; we’ll show some examples of that. So what I’ve done is I said, okay, there’s this notion that I guess I would call modern governance, and what is the transition? And if we look at the post-cloud-native world, right?

If we look at security, we talked about DevSecOps. One of the things that always bothered me about the DevSecOps conversation is this: you start by describing the culture and behavior patterns that we inherit from DevOps, right?

Very good stuff. But then all of a sudden you look at all the security things, and it’s a pretty wide horizontal spread. I would always talk about DevSecOps, and somebody would come up to me and say, John, what about this, or what about that? I felt it was hard to use that one word for the large breadth that we have in security. So I started thinking more about what is modern, and really important, in this post-cloud-native world. To me, it boils down to three things that I can look at and say, this is modern governance, and this is how we need to be thinking, moving from subjective to objective and verifiable. Okay, and there’s a lot of stuff. So risk is one of them, then defense, and trust. If we look at risk, we move from subjective change management to electronic, digitally signed attestations and controls in the pipeline, automated, with no human intervention. Then we also move into verifiable, which could be something like continuous verification, chaos engineering, or security chaos engineering. So we’re constantly verifying the things that we said during the delivery and the attestation, right? Maybe this port should never be open as part of this configuration definition. But let’s just test that. Or maybe this vulnerability should never be in a container image in production. Well, let’s start a container image and see what happens with that vulnerability.
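As a rough illustration of the two ideas above (a signed attestation made in the pipeline, then a later check that reality still matches it), here is a minimal Python sketch. It is not taken from the talk or from any reference architecture: the key, the claim fields, and the function names are all hypothetical, and HMAC from the standard library stands in for the asymmetric digital signatures a real pipeline would more likely use.

```python
import hashlib
import hmac
import json

# Hypothetical shared key, for illustration only. A real pipeline would more
# likely sign with asymmetric keys held in a key management system.
SIGNING_KEY = b"demo-key-not-for-production"


def sign_attestation(claims: dict) -> dict:
    """Return the claims plus an HMAC-SHA256 signature over their canonical JSON."""
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "signature": signature}


def verify_attestation(attestation: dict) -> bool:
    """Recompute the signature and compare it in constant time."""
    payload = json.dumps(attestation["claims"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, attestation["signature"])


def forbidden_ports_open(attestation: dict, observed_open_ports: set) -> set:
    """Continuous verification: ports attested as closed that are observed open."""
    declared_closed = set(attestation["claims"]["ports_must_be_closed"])
    return declared_closed & observed_open_ports


# Hypothetical attestation recorded at deploy time.
attestation = sign_attestation(
    {"service": "payments", "stage": "deploy", "ports_must_be_closed": [23, 3389]}
)
```

In this sketch the set of observed ports is simply passed in; in practice it would come from whatever scanner or chaos experiment is probing the running environment.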

On the defense side, we look at moving from subjective detect and respond to intelligence and data lakes, where people are constantly taking all the information and creating the metadata to normalize it, getting really intense about decorating information from multiple clouds, multiple PaaSes, just anywhere. And we move into something from Shannon Lietz, who has been a great mentor of mine; she was already onto it.

She calls it adversary analysis. This is your verifiable part, where you’re constantly using automation to attack the environment from the outside in. And then finally, on trust, in this modern governance we move from perimeter-based to zero trust architectures, right?

Everybody’s kind of gotten that memo, but even zero trust architectures are not good enough. If you look at a lot of the major breaches we’ve had, they’re usually an account takeover or a server-side request forgery. So even in a zero trust environment, you can impersonate or forge an identity, especially in cloud environments, where there are a lot of shared-environment authorizations. Take, for example, the metadata server, once you compromise that. So we move from verifiable into more of a distributed trust model, where we have these trust circles and clusters. I’ll talk about that in a little bit. So if we look at risk differently, what we’re trying to do is reduce audit toil and increase audit efficacy. We’re using automated, objective, immutable evidence, right, digitally signed evidence, and we’re using continuous audit and verification, security chaos engineering.

Back in 2019, I gave presentations on this, and with a bunch of industry leaders we created this DevOps Automated Governance Reference Architecture, where we showed how you could create these different stages and create gating and attestations.

There’s evidence, with digital signatures. In the original guide we had like 75 attestations; these would be your evidence points during the pipeline. But as you can see, over the years we’ve expanded this. Just for example, you can look at things like unit test coverage, maybe as a percentage, change size, cyclomatic complexity, pull requests, branching strategy, clean dependencies. Then, moving on to the build stage, all the things that would be in the build stage, like a SAST scan, or something like JFrog Xray. Then the package stage, where of course Artifactory is a great example; here we could do artifact version, package metadata and, again, container scanning with something like JFrog Xray. And then on to pre-prod and, you know, on and on.

Also, one of the things we found, after we built that reference architecture with its staging, attestations, and the ability to enumerate the different control points and evidence, was that it made sense to start thinking about whether it could be driven by some type of DSL, a risk-as-code or policy DSL. The intent could be captured in collaboration between something like the third line of defense (again, breaking down those walls), the second line, and the first line, all collaborating to create a YAML file. You might laugh, but there are a couple of banks right now that are actually doing this, where their collaboration is this artifact. And going back to Larman’s law, you’re less likely to create overloaded terminology if the agreement is that you have to create a DSL artifact, as opposed to agreeing on some verbiage that goes into a change record.
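To make the idea of a shared policy artifact a little more concrete, here is a minimal Python sketch of a gate that evaluates pipeline evidence against a declarative policy. Everything in it is an assumption made for illustration: the control names, the thresholds, the `min`/`max` operators, and the `evaluate` function are hypothetical and are not the DSL any bank or reference architecture actually uses. The policy is written as a Python dict so the sketch needs no third-party YAML parser.

```python
# Hypothetical policy. In a YAML file the same content might look like:
#   stage: build
#   controls:
#     - id: unit_test_coverage
#       op: min
#       value: 80
POLICY = {
    "stage": "build",
    "controls": [
        {"id": "unit_test_coverage", "op": "min", "value": 80},
        {"id": "cyclomatic_complexity", "op": "max", "value": 15},
        {"id": "critical_vulnerabilities", "op": "max", "value": 0},
    ],
}


def evaluate(policy: dict, evidence: dict) -> list:
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for control in policy["controls"]:
        name, op, limit = control["id"], control["op"], control["value"]
        if name not in evidence:
            # Missing evidence is treated as a failure, not silently skipped.
            failures.append(f"{name}: no evidence recorded")
            continue
        actual = evidence[name]
        if op == "min" and actual < limit:
            failures.append(f"{name}: {actual} is below minimum {limit}")
        elif op == "max" and actual > limit:
            failures.append(f"{name}: {actual} exceeds maximum {limit}")
    return failures
```

The design point is that audit, risk, and delivery teams would all review the same small file, so a disagreement shows up as a concrete change to a threshold rather than as differing interpretations of a change record.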

Like I said, we started down this path in 2019 when we wrote this reference architecture, and we’re actually writing a second version now, which should be out later this summer.

It’s a lot of fun. The first version was just a rote, dry, kind of boring reference architecture. But this time we’re stealing a page from The Phoenix Project: we’re creating a company called Investments Unlimited, and we’re talking about how they failed an audit. It’s a lot of fun.

The other thing I wanted to mention is that I had the opportunity earlier this year to be invited in to give a presentation to some of the SolarWinds executives.

We’ve all heard about the SolarWinds breach. I was asked by a friend who knew my passion for automated governance, the paper that we’re working on, and the follow-on work. So I gave a presentation, and what I did was go out and look at some of the really good explanations of the actual kill chain that infiltrated SolarWinds itself, not what happened to everybody else with the SolarWinds software. CrowdStrike and MITRE had probably the two best write-ups. One of the things I wanted to do was show, not that this was the perfect example, but how a lot of the things that were going on there for a couple of years could have been highlighted through this approach of security differently, specifically automated governance. And here’s the MITRE paper. The CrowdStrike blog article is very specific.

This is an overall great read about what MITRE is doing and the whole process. So both are a good combination. What I did is I took the things that were described in the MITRE ATT&CK tactics and just played what-if with an automated governance structure. We won’t go through all of these; you can read them, and the slides will be available. But, for example, one that really stood out to me was code signing.

One of the things in the CrowdStrike write-up was that there were constantly hash mismatches.

If you know how the breach occurred, basically, they were able to sit in there, wait, and compromise MSBuild so that they could add their code into the delivery. And there was all this evidence of code mismatches, signing mismatches, or hash mismatches, which could have been caught with a cryptographic-hash or code-signing attestation or gate.
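A hash gate of the kind described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample byte strings are hypothetical, and it says nothing about how the actual SolarWinds build system worked. The idea is simply that a digest attested at one stage is compared against the artifact at a later stage, and any difference blocks the release.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def hash_gate(artifact: bytes, attested_digest: str) -> bool:
    """Pass only if the artifact about to ship matches the digest attested earlier."""
    return hmac.compare_digest(sha256_hex(artifact), attested_digest)


# Hypothetical data: what the reviewed source should have produced...
source_build = b"compiled output from reviewed source"
attested = sha256_hex(source_build)  # digest recorded as an attestation

# ...versus what a compromised build step actually emits.
shipped = source_build + b" + injected code"
```

Calling `hash_gate(shipped, attested)` returns `False` here, which is the red flag the gate exists to raise.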

There were also examples of log masquerading, so an immutable log structure could have been used, and image scanning. There were all sorts of things that could have been highly raised red flags, or gates in the development, or at least instantaneous verification from a continuous compliance or automated governance structure. So when we talk about modern risk solutions, it’s definitely automated governance. At Red Hat, I’ve been working on some tools called Ploigos, a sort of software factory, and certainly there’s the JFrog suite of products. Also, in the original paper, one of the things we did was focus in on Grafeas as an attestation repository store, but there have been some really good developments in something called Sigstore, which is based on a Merkle tree.
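For readers unfamiliar with the Merkle tree mentioned here, the following Python sketch computes a Merkle root over a list of log entries using the hashing scheme from RFC 6962 (Certificate Transparency). The talk does not describe the exact tree layout the tool uses, so treat this as a generic illustration of why such a log is tamper-evident rather than as that tool's implementation; the entry contents are made up.

```python
import hashlib


def merkle_root(leaves: list) -> bytes:
    """Merkle tree hash over a list of byte strings, following RFC 6962.

    Leaves are hashed with a 0x00 prefix and interior nodes with a 0x01
    prefix, so a leaf can never be confused with an interior node.
    """
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return hashlib.sha256(b"\x00" + leaves[0]).digest()
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    left = merkle_root(leaves[:k])
    right = merkle_root(leaves[k:])
    return hashlib.sha256(b"\x01" + left + right).digest()


# Hypothetical attestation entries appended to a log.
entries = [b"attestation-1", b"attestation-2", b"attestation-3"]
root = merkle_root(entries)
```

Because the root depends on every entry, changing or reordering any single entry produces a different root, which is what makes the log auditable.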

Certainly Red Hat has some interesting tools in terms of the Compliance Operator and Advanced Cluster Management. Here, again, I’m a big fan. That’s why I’ve spoken at swampUP almost every year for the last four or five years.

There’s JFrog Artifactory, Xray, all their tools, and then we talk about continuous verification. Also, if you haven’t had a look into the software bill of materials, there are some interesting industry things going on there.

This is a fascinating subject. I’m actually going to write a whole blog article about what’s happening, and the transition here, I think, is a very important piece. Now let’s talk about defense, the second of the three points I mentioned earlier: risk, defense, and trust. In defense, what we want to do is reduce defense toil. It’s the same idea: reduce the toil, increase the efficacy.

Start looking at a more unified approach to SIEM and SOAR, with all this information coming in from everywhere, but certainly multi-cloud PaaSes. Then start looking at creating common metadata to build intelligent cyber data lakes, and getting more advanced in deception technology. And I’ve talked a lot about what Shannon Lietz has been working on, adversary analysis, a very interesting topic.

This was another working paper that I worked on with a group of people, where we focused on automated cloud governance. Like the earlier paper, the result is Creative Commons licensed and available. Some of the organizations that worked on the original (we’re halfway through the second paper) were FedEx, Goldman Sachs, Cigna, just on and on. So a lot of large organizations. Basically, again, we were trying to create commonality across all the data that comes in from these different cloud providers, to aggregate it for our GRC purposes. We actually wound up calling this the Cloud Security Notification Framework.
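The kind of commonality described here can be sketched as a small normalization layer. The Python below is purely illustrative: the provider names, the input field names, and the output schema are invented for this example and are not real cloud provider event formats or the framework's actual specification.

```python
# All field names below are invented for illustration; they are not real
# provider event schemas or the framework's actual format.

def normalize_provider_a(event: dict) -> dict:
    """Map a hypothetical provider A event onto the common shape."""
    return {
        "provider": "provider_a",
        "account": event["accountId"],
        "resource": event["resourcePath"].split("/")[-1],
        "severity": event["severity"].lower(),
        "summary": event["title"],
    }


def normalize_provider_b(event: dict) -> dict:
    """Map a hypothetical provider B event onto the common shape."""
    levels = {1: "low", 2: "medium", 3: "high"}
    return {
        "provider": "provider_b",
        "account": event["project"],
        "resource": event["resource_name"],
        "severity": levels[event["level"]],
        "summary": event["description"],
    }


NORMALIZERS = {
    "provider_a": normalize_provider_a,
    "provider_b": normalize_provider_b,
}


def normalize(provider: str, event: dict) -> dict:
    """Dispatch to the provider-specific mapper; unknown providers raise KeyError."""
    return NORMALIZERS[provider](event)
```

Two events that mean the same thing but arrive in different structures come out with identical keys, which is what makes aggregation for GRC reporting tractable.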

ONUG is the sponsoring organization. They just held a large conference here in May, and you can find a lot of information there about what we talked about. Basically, the idea is creating this decorator for all this SIEM and SOAR data, where events can come in from different cloud providers but, really importantly, we can aggregate the common metadata. The problem is that one event from, say, AWS and another event from Google or Azure might have the same meaning, but the context and the structure are going to be different, so how do you normalize that? Even harder is how you get the common metadata, like the department, the account, the resource name; these are all different. We’ve done some tremendously good work there. To wind that down, the solutions are these intelligent cyber data lakes. Again, if this is of interest, I would definitely advise you to look at ONUG.

We just had our spring conference and we talked about the Cloud Security Notification Framework. There’s the MITRE ATT&CK framework, brilliant stuff. Look into SCAP and OpenSCAP; cyber ranges are another interesting topic. For adversary analysis, there really are no products, but follow Shannon Lietz’s work. She’s spoken about this at numerous conferences. And then finally, trust differently, the third leg. We’ll go quickly through this because there’s a lot here; any of these should be its own presentation. I’m just giving you a survey of security differently.

We talked about zero trust architectures and automated control-based assessments, some of the stuff that is used for FedRAMP documentation. There are some great tools out there; I’ll point those out.

Then distributed secrets management. Of course, Vault is a good tool, but I think the world is going to get even more complicated than Vault, right?

We’ll show you some of the things. In distributed trust, one of the main points I’ve tried to make is that we need to move away from a North-South model. If you think about authentication and how we still very much do it, even in worlds where we have complex container clusters and service mesh infrastructures, we’re still mentally using North-South. We need to get to East-West, like transactions for security, or trust and security pods.

Very interesting stuff.

I’ll show you some of the technologies. Certainly for zero trust, there’s NIST SP 800-207. SPIFFE is a really interesting tool for this, for what I would call East-West trust. And then there’s Sigstore, a tool developed out of Red Hat that was originally based on Google’s Trillian, which was originally for Certificate Transparency but actually has audit capability. What Sigstore is, really, is more of a simplified tool to abstract the Trillian API. Sigstore is actually a collaboration effort between Red Hat and Google, and it has these possibilities to expand the whole notion of trust electronically, and distributed, in what I call East-West transactional trust. Anyway, I know I went through some of this pretty quickly, but the intent was to be more of a survey view of this whole thing I call security differently.

Thank you, I’m John Willis.

I can be reached at Jwillis@redhat.com or @botchagalupe on Twitter.

 
