Watch author and evangelist John Willis describe the “Seven Deadly Diseases of DevOps” with a focus on the most costly of them all – Security and Compliance Theater. This presentation drills in on the practices needed to create long-term systemic “safe” improvement. Understanding these key patterns enables an organization to focus mainly on the intersection of human capital and technology. Although prescriptive practices like Lean, Agile, SAFe and even DevOps may be necessary for IT acceleration, they are in most cases not sufficient for long-term systemic improvement. In other words, you can’t Lean, Agile, SAFe or DevOps your way around institutionalized organizational habits. Willis describes the “Seven Deadly Diseases” of organizational behavior: Invisible Work – Management System Toil – Tribal Knowledge – Misalignment of Incentives – Incongruent Organizational Design – Misunderstanding Complexity – Security and Compliance Theater. This presentation examines how all seven are inextricably related to cybersecurity, risk and compliance in IT organizations.
Hey, there we go. All right, just before we get started, I do do parties. I’m terrible. Absolutely terrible. If you want me to practice, it costs way more money. That’s an old Marx Brothers joke. Anyway … But I’ll go back to that.
That was actually DevOpsDays Austin. We got a whole karaoke band and people came up and jammed. It was totally awesome. People just …
Is that a prerequisite of any Austin event? You have to have music somehow woven-
That’s what I suggest when I’m coming. Yeah, we had people … You know how many musicians are in our industry. People just came out, “Hey, can I play?” “Yeah, come on out, man.” It was awesome.
All right. We’ll go ahead and start. Security and Compliance Theater, “The Seventh Deadly Disease.” Anyway, you saw this. I’m Botchagalupe. That’s where I live. Twitter, GitHub. It’s a terrible name, but that’s how you find me. So if you hate this presentation you can tweet and say, “I hate this presentation.” But you’ll probably be gone before I finish.
I’m going to go real quick on my … I got a lot to cover and I really want to cover some really meaty stuff, and hopefully I don’t run out of time, but I’ve done a lot of stuff. Probably more interesting to this … I’ve done 11 startups in 40 years. I’ve authored 12 books. Most of them, nobody’s ever read. And probably the interesting thing, I started my career at Exxon. I worked on one of the first private cloud infrastructures for Canonical. It was called Ubuntu Enterprise Cloud. It was terrible. I was really early at Chef … Ninth person. Helped build the whole … Everything that was in pure development. Had a company I sold to Dell. That was kind of cool. And then I sold a company to Docker. I spent two years in the hurricane tornado of Docker. But we won’t talk about that here.
Just quickly, some of the books. The DevOps Handbook, a co-author of that book. So that was good. This one, this is my only … Except for my last slide, is my only gratuitous … One Audible credit. One. All right? So if you got eight Audible credits … In truth, it’s an audio only. Me and Gene Kim did it. If you like lean learning organizations, and resilience safety and that kind of stuff … It’s awesome. If you don’t like that stuff, don’t waste the credit.
I’ve got two books coming out this year. The DevSecOps Handbook … And then I’m going to talk about this. This is a whitepaper I did about two months ago with an incredible team called DevOps Automated Governance. So I’ll end up with that discussion.
Anyway, level setting. We have DevOps, right? And then there’s all these, “What is DevOps?” You ask 10 people, you get 10 answers. Here’s the thing … They’re all correct. There is no wrong answer. There’s ones I may disagree with. But if you’re selling a CI/CD pipeline and you say DevOps is [inaudible 00:03:12] … God bless you. But a few years ago, I started getting called into high level CIOs and some CEOs, where I had five minutes to explain DevOps. And I struggled with this. For myself, I came up with this. “How would I do this?” I’m going to stick with this. I’m not trying to convince you that this is the canonical definition, but … DevOps is a set of practices and patterns that turn human capital into high performance organizational capital.
That’s it. Sorry folks, it’s not Kubernetes. I mean, Kubernetes is awesome. Xray and Artifactory are awesome. But this is the hard, hard, hard stuff. That’s my son. I don’t know if you’ve ever seen the Grand Canyon thing. The horseshoe. It’s pretty cool. Here’s the other thing too. I left Docker about two years ago, and I was convinced I’m going to transforma- I spent 10 years with vendors. Prior to that I did consulting. Back when it was incredibly hard to do transformation with the technology there was 15, 20 years ago. It just didn’t work.
Spent 10 years at places like Chef, and then … Opscode, of course, Chef. And then Docker. And then I left. I said, “I’m going to go out now … Not be anchored by a vendor.” Even though the vendors let me do what I do. But I still had that vendor anchor with me. “Oh, you’re the Docker guy.” Or, “You’re the Chef guy.” And I was going to go out and use all these tools, all these things I know. I’m not the smartest person in DevOps, but I would challenge anybody to say they’ve studied this space stronger than me. I am a student of this thing we call DevOps. I was the only American at the first DevOpsDay.
What I found out is all these tools like lean value stream mapping, and I’m not going to go through this stuff right now. It’s in the handbook. That was too late. There was a whole set of questions, like a level zero discussion you had to have, how people thought. Because, you go in and people are like, “John, we don’t need you. We got SAFe.” I’m like, “Okay.” You know? Or, “We got Agile.” Or, “We got DevOps.” These are abstractions that actually do you harm. Again, I’m not being a hypocrite. I love DevOps. But the bottom line is, read that. “You can’t Lean, Agile, SAFe or DevOps your way out of a bad organizational culture.” You just can’t.
I kind of stole this from Christina Maslach. You don’t know her. She’s the leading authority on organizational burnout. I’ve gotten to know her, interview her. And in a conversation we had, she made this thing about burnout. And I realized, “Oh, it just so fits our world.” Which is, whenever you’re talking about any kind of change or improvement, you are counting on a bunch of human beings to change and make it happen. If they haven’t been part of figuring out how to do it, the change efforts will be dead on arrival.
You can’t go into an organization of 35,000 people … And because 200 of them did this amazing pipeline and all this stuff … And then tell the rest of the organization, “You have to do this,” without talking to them. So what I did … I thought, “Okay, I’m going to these big companies.” … Large banks. Top 10 banks. “And I’m going to do all these tools,” and I realized I have to do something different. And I don’t even have a name for it, but I call it organizational forensics. And for those of you who are like, “Oh my God, this is not a security presentation.” Yes it is. You’ll see. This all funnels down into the seventh deadly disease.
So I spend this time with people, and I basically interview hundreds and hundreds of people. I talked to a development team all day on Monday, another development team all day on Tuesday. And I just ask … I really just try to create a conversation to find out how they think things work. And there’s a point at which you start getting them to start telling you this incredible amount of truth about the organization. The other thing I do is, before I go, I actually get on a call with a bunch of leaders. And I learned this from Kevin Behr, he’s one of the co-authors of The Phoenix Project. It’s just an incredible tool. It’s better when you do it in person. It’s sort of like that Rorschach Test. You hit somebody with, before you even interview or ask what you do, “What are the five things that your organization is not doing that you should be doing?”
And what happens is, before they get to the third answer they’re like, “Holy shit. I can’t believe I told you that.” And you interview 20, 30 of the leaders and you get this … And now you can aggregate up. So I did this for a pretty large company. And here’s the kicker. I won’t say, “No disrespect” to the big five, because I am disrespectful to the big five. So I go to a CIO and I show her this chart. And she knew she had a capacity problem. But this was with 30 or 40 of her leadership. When I told her that her leadership basically said the number one problem in our organization was communication, she literally pounded her fists and said, “You know, I spent millions of dollars with McKinsey for over a year, and they didn’t tell me that.” Now here’s the joke. I charged 10K, McKinsey probably charged five million. So who’s the winner?
But here’s the seven deadly diseases. I have other presentations that focus on all of this. I’m going to go through the first six reasonably quick to satisfy … [inaudible 00:08:52] was like, “You have to do security. You can’t do,-” I’m like, “All right.” So I want to quickly drive what the other ones do to get me to the final, which is … And it sort of magically works too. I’d like to say I was some kind of genius and thought this up front. But what I found was, if you went through this work thing, and I’ll talk a little bit about it … By the time you drilled down the funnel, you literally in most cases have proved that the organization, for what they think they’re doing from attesting to auditors, is all theater. And I’ll show you some examples. And then the seventh is the security compliance theater. I just put some of the things you expand, like vulnerability theater. “How much do you pay for your vulnerability scanner? Ten million? Oh, mine’s 12 million. We’re awesome.” It’s not that easy.
Topo Pal at Capital One … He was the first fellow at Capital One. He’s a good friend of mine. I do a lot of work with him. He was asked by his CIO … Capital One has 15,000 Java developers … 70 percent Java … He was asked, “How much open-source does Capital One use?” So Topo was actually one of the guys who worked on the original JAR project. Right? He knows Java. Anybody want to guess what the number was? And if you’ve heard my presentation before, you’re not allowed to answer. Anybody want to take a wild guess? No? Come on. Be bold. What?
No, how much … Open source? That’s the worst answer I’ve gotten. Sorry. I owe you a beer bud, that’s good. I just made fun of you and now I’m going to make up for it. It was 99 percent. Imagine, 15,000 developers, 70 percent of all their code is Java, 99 percent of it was open source. So Topo tried to answer one more question, which he wasn’t asked. How much of it were they actually using? I won’t make you guess. It was only 10 percent. So the dependency map of our … So I won’t, I’d spend the whole presentation on how complicated scanning and dependencies are. But that should just scare the shit out of you, in terms of how complex this problem is. And I don’t care what AI you put on it. If you start following [inaudible 00:11:08], and some of the embedded methods in there. And Bob’s from ThoughtWorks. It’s insane.
One of the things I do is, when I get people in a room, I do this … They’re like, “What is this guy doing? Another person that’s going to make my life better.” They’re really ready to eat me alive, right? And I walk up to the chart and I just draw a box, and I say, “Where does work start.” They’re like, “What?” “Well yeah, where does work start?” And they’re like, “That’s not a good question.” I’m like, “Okay. Where does work start?” “Oh, John. You don’t understand capital markets.” I’m like, “Yep, you’re right. But where does work start?” On average, it takes me a freaking hour to get people to answer that question. They want to argue for an hour why I, you know, “You really can’t.” And then finally, “Okay, if you must know.” I’m like, “Great.”
Now I can ask you what percentage, what gets documented. And so the first deadly disease is invisible work. I find, on average, a mediocre company captures about 50 percent of all the work that they do. When I do the readout with the CIO at the end, and they’re like, “How did we do, John?” And I’m like, “Really bad.” I kind of want to smoke a cigar and put my sneakers up on their wooden desk. And they’re like, “Well, what do you mean by that, John?” And I’m not making fun of them, but, “Let me put it this way. If you built airplanes, I would not fly on them.” And they’re like, “Okay, John. Give me specifics.” I’m like, “Okay. I estimate that you have well over a billion dollar IT budget. And you only capture 30 percent of all the work that goes on. What other part of your business … What if the finance guys said, ‘Every other Wednesday we’re going to roll things up?’”
In IT, we’re sort of a joke. We’re a bunch of cowboys. If 70 percent of what’s going on- All right, so the point is, and I’m going to go a little faster now. This is interesting, how many people read The Phoenix Project? Yeah, everybody. So there’s the whole [inaudible 00:13:15]. Gene did this incredibly interesting thing, it was actually Kevin Behr and Gene, where he literally simplified Little’s Law and queue theory so that my mom could understand it. It’s brilliant, right? If you’re a queue theory or a Little’s Law person, don’t yell at me. The deal was, there’s a point at which the question is asked, “How come it takes you 63 hours to do a 15 minute task?” And somebody says, “Let me explain. Everybody is about 90 percent busy, so in general, there’s a nine hour backlog for every hour of work. You got seven downstream dependencies. So what you thought was that one little piece of work actually equated to …” And I find this when I go to people, I’m like, “Why didn’t you record the work from that team?” “Oh, that’s [inaudible 00:14:05] team. They’re always good.”
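The 63-hour figure in that Phoenix Project exchange is simple arithmetic, sketched here in Python. The sketch assumes the book's rule of thumb that wait time is percent busy divided by percent idle, and treats each of the seven downstream dependencies as one handoff. The function names are hypothetical, and this is a deliberate simplification rather than real queueing theory.

```python
def wait_hours_per_handoff(percent_busy: float) -> float:
    """Rule of thumb: wait time = percent busy / percent idle."""
    if not 0 <= percent_busy < 100:
        raise ValueError("percent_busy must be in [0, 100); at 100 nothing new ever starts")
    percent_idle = 100.0 - percent_busy
    return percent_busy / percent_idle


def total_queue_hours(percent_busy: float, handoffs: int) -> float:
    """Total time a task waits in queues across all downstream handoffs."""
    return handoffs * wait_hours_per_handoff(percent_busy)


# Everybody is about 90 percent busy, with seven downstream dependencies.
print(wait_hours_per_handoff(90))  # 9.0 hours of backlog per handoff
print(total_queue_hours(90, 7))    # 63.0 hours of waiting around a 15 minute task
```

The same function shows why utilization matters so much: at 50 percent busy the wait per handoff drops to one hour, so the seven handoffs cost seven hours instead of 63.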
And later I find out there’s all these dependencies and they’re like, “Well that’s not my problem.” “Yeah but you didn’t record it so we don’t even know that it wasn’t 15 minutes.” And so John Allspaw has this thing, again, I’ll have this presentation up. I won’t spend a whole lot of time. He created this idea called dark debt, based on dark matter. So imagine a billion dollar IT budget. You only capture 30 percent of everything that’s going on. And you actually heard of the thing called the Butterfly Effect. What do you think is actually going on in your infrastructure? Unresolved chaos under the thing that you don’t even see because it looks like it’s dark matter.
Again, I’m trying to go through the … There’s a longer version, a misalignment. You find that this becomes incentives. How people … In fact, I just did a company … It is hard. I got to go back to CIOs and tell them terrible stuff. Now the good news is I only do one time gigs. That’s my advantage over IBM, Deloitte and all those guys, because they won’t tell the CIOs the things that I will tell them because they get really freaking mad. So this woman in this one company literally has these incredible MCGs, OKRs, all bullshit. You want to have a conversation about, they’re all just bullshit. Non-determinism, complex adaptive systems. Another discussion.
But she had these MCGs. Mission critical goals. I don’t know what book she read. Tableau-like screens. So I interview 200 of her people. And I get everybody on this final day where I’m like, “This is what I’m going to tell your CIO. Tell me if I’m wrong. This is your chance.” And I said, “Basically I heard that you’re pretty good at” … They’re doing sprints. “That you’re pretty good at like, 80 percent and 20 percent capacity and buffering. But what you told me was all the goals, or none of the 80 percent work have anything to do with the MCGs. And in fact, you have to steal the 20 percent time to update your records to get those things so they show up on the Tableau.” I got to write this report to this woman.
So the thing is, you can’t do a DevOps presentation without mentioning Conway’s Law. So, checkbox done there. So here’s the thing, right? Most of you probably are development focused, I’m guessing. Your context of … If you don’t know what Conway’s Law is, read that real quick. You probably think of Conway’s Law in terms of microservices, as anti-Conway maneuvers, or different names for these things like decoupling. So your typical architectures are built on your organizational structure and design, and you want to decouple. But it’s not just for code. It’s the way your organizational structure looks too.
So I look for this. I hope everybody has heard about the Equifax breach in 2017? Still, today, the largest financial breach. There’s been bigger ones, but financially it’s … Five billion market cap loss, still to this day. So Conway’s Law … Question. I won’t say what’s wrong. What’s odd about this slide? Second beer. If you don’t drink beer, a really fancy Coke float. Come on. Be bold. Mighty forces come to your aid.
IT has no relation to security.
Yeah. Susan Mauldin, the CSO, reports to the Chief Legal Officer. Guess what? You’re not an IT company. So it gets worse. If you want to read a train wreck, if you like train wrecks and outages, the Congress did an incredible job documenting the Equifax breach … 2018, it’s public. One of the questions they asked was … When she came on board, she was the CSO. She had a good background. She was a smart woman. They tried to blame everything on her. No, that wasn’t the problem. They asked, “Didn’t you think it was odd that you were reporting to the Chief Legal Officer?” And she basically said, “Yes, but … I figured they knew what they were doing.” They call that, actually, pluralistic ignorance.
But here’s the kicker. This is the Conway’s Law, bleeding, oozing. I won’t make you read the whole thing but they asked, “When you found out the PII, Personally Identifiable Information, was compromised, why didn’t you go to the CIO?” And she basically said, “I don’t remember. I don’t remember thinking about that.” Of course, she didn’t. She reported to Chief Legal Officer. That’s it, right? How many people have heard of this concept of I-, T- or E-shaped? Again, on this slide … What we’re trying to do is get out of the, “I am the Oracle DBA. That’s all I do.” To maybe, “I do Oracle, and Mongo, and I know how to do some JS, or Go.” The high performers get to this E-shape, or comb. Where people have lots of skills, sort of SRE-ish in a way. And one of the things that … You’ve probably heard the concept of two pizza team.
I try to give people … I listen and I say, “How should you start migrating to how you’re thinking?” And you want to get to … There’s no good argument for being all I-shaped. There really isn’t … T, or E-Shaped. This concept of a build/run, where you have the autonomous teams, and you start breaking out … That by general nature starts creating E-shaped individuals. Because everybody becomes collaborative and knowledge based. And it’s really paying off a level of technical debt, in terms of your skills of your people.
And then complexity, again. I’ll go through this really quick, because I do want to get to the security stuff. Here’s the point, all of this stuff if you remember that funnel, drills into my final argument is … And by the way, there’s probably some out there, but it’s never been that this isn’t true where I can literally tell the CIO, “Everything that you do on your audit is just bullshit.” If you want to read Sidney Dekker, or watch what John Allspaw is doing from resilience complex systems, psychological safety. I mentioned this, normalization of deviance is interesting. This comes from The Challenger disaster. A woman called, I know I’m going really fast, but I can … Ping me. I’ll give you whatever links you want. But Diane Vaughan wrote a whole book about The Challenger, and she called it … NASA had this normalization of deviance. Where you just see things that are bad, and you get used to them because it doesn’t blow up. So you start accepting that as a normal thing until the day people die.
And this one. I don’t know why I put these things in here, because it always makes me run out of time. But I’ve got to tell. How many people heard the Abraham Wald story? You probably have but you haven’t heard it that way. Abraham Wald was a statistician during World War II that was brought in, they had a whole team of these brilliant statisticians to try to figure out how to repair planes when they came back with bullet holes. So they were doing all this thinking about the weight of the metal, where to place it, where to cover the holes. And somewhere along the way he had this epiphany. He’s like, “You know what? We’re total idiots. We’re thinking about the place where the bullet holes are. Those are the planes that are coming back. We should be thinking about the places where the bullet holes aren’t, because they’re the ones that aren’t coming back.”
But here’s the thing, when I do these interviews, I look for these people. I’ll interview five build/run teams, and every one of them will tell me, “Yeah the business can never get it right.” They give us the, “We want the red button.” We give them the red button, and they say, “Oh, it should’ve had white linings.” So then we give them the white linings, and then, “It should have purple lettering.” And I find a team where … It works magically. And then I say, “Let’s figure out what the product owner is.” And I talk to the product owner, and I find that that product owner has these characteristics that somehow get things done in a place where nobody else can get things done.
You all know there’s somebody in a company, like, “Give it to Jane.” Use that as the Abraham Wald. It’s a complex emergent idea. Instead of trying to figure out all the ways to do things, find the things that are working really well and follow those. I don’t know what the enterprise versus startup versus, but, there’s a race in large enterprises. And the gap between the people on this corner of the building doing Kubernetes and Docker and all that stuff, and the people over here still running Remedy Atrium and CMDBs, they are so far apart that I don’t know we can ever get them back together.
But, and again, I’ll go through this quick because I have a longer version. So what is a CMDB? What is the relevance of a CMDB today? Topo Pal at Capital One says, “You know, John, four years ago when I started DevOps in earnest at Capital One, I had basically 400 services. Today I have 50,000 services.” Right? Microservices. By the way, some of them are [inaudible 00:23:55] we don’t know who they- … I think we have to as an industry figure this out. We need a service catalog. But nobody’s actually trying to bridge these worlds in an honest way. Microservices, a type of delivery mesh. If you’ve got a group that’s still doing ITIL and [inaudible 00:24:13] and you’re trying to do SRE over here with SLOs … Where’s that conversation happening?
Are you service mesh aware? Again, very quickly … I go into companies and five percent of the organization has ever even heard of Istio and Envoy. And that’s table stakes. That’s basically table stakes. Sorry, if you don’t know what it is, you’re not playing the game. And that’s even before I talk about multiple meshes. Because there are other meshes out there. Things like framework identity for nodes, or a mesh architecture. And then let’s get into API extensibility in Kubernetes and CRDs. I don’t know the answer, I’m just trying to identify in an organization that you’ve got to figure out- You can’t have both. You can’t have people with a CMDB that’s about 30 percent accurate in any given day, and then expect to send whatever your favorite Honeycomb or [Single FX 00:25:14] events to people, because you don’t have anywhere to send them, because you don’t have an [inaudible 00:25:19].
All right, moving faster. This is the delivery mesh thing, I think is really interesting. I think it’s worth a whole long conversation. We talked about service mesh and how things are working. We don’t have a whole lot of conversation about what happens before things get into production. So it’s like, and I go on for a long time, and I don’t have that much time to go on. But if you want to geek about it, I don’t know the answer. The one thing I do know is, why isn’t Swagger a first class primitive in all our deliveries? Why? Because, here’s the question. How many people pen test their API tree structure? Be honest now. Yeah … Rooms of 1,000 … Three hands go up. I’m just like, let’s just make Swagger a first class citizen. We can basically, in the build … You know what I mean? But we’re not even having a conversation about that.
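One way to picture the API description as a first class citizen in the build is the Python sketch below. It is only an illustration under stated assumptions: the spec, its paths, and the set of already-tested operations are all hypothetical, and the only thing taken from the OpenAPI (Swagger) format is that operations live under `paths`, keyed by HTTP method. The build step walks the whole API tree and reports every operation that no security test covers.

```python
import json

# HTTP method keys that can appear in an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return every (METHOD, path) pair declared in the spec, sorted."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


def untested_operations(spec: dict, tested: set) -> list[tuple[str, str]]:
    """Operations in the spec that no security test claims to cover."""
    return [op for op in list_operations(spec) if op not in tested]


# Hypothetical spec, inlined so the sketch is self-contained.
SPEC_JSON = """
{
  "openapi": "3.0.0",
  "info": {"title": "Example accounts API", "version": "1.0.0"},
  "paths": {
    "/accounts": {"get": {}, "post": {}},
    "/accounts/{id}": {"get": {}, "delete": {}, "parameters": []}
  }
}
"""

spec = json.loads(SPEC_JSON)
tested = {("GET", "/accounts")}  # hypothetical: what the pen test suite covers today
missing = untested_operations(spec, tested)
for method, path in missing:
    print(f"no security test covers {method} {path}")
```

A pipeline could fail the build whenever `missing` is non-empty, which turns the question of who pen tests the API tree into something the build answers on every change.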
All right. DevSecOps, we’re finally here. Drilling down. This guy’s getting mad. He’s like, “I thought this was a security presentation.” I’m just kidding. Now I owe you a beer too. You all stopped laughing. That sucks. So this is actually how an insurance company that I work with, where we did the full deal of, “How do you do DevSecOps in a pipe-” and really what it means is you have your pipeline, and you’re just going to do the abstraction overlay of all things security in there. Again, I have a whole presentation. If you were here last year, you heard a good part of that presentation. And then you saw me on stage with the mentalist make a fool of myself, so … How many people saw that? Yeah. Thank you.
So, the deadliest disease of them all. The seventh, right? Why seven? Seven’s cool. Why do I call it deadly disease? Cause it gets me on stage. In general it’s just patterns. When I go client side, I’m just using these patterns to put things in buckets. Literally, so again, the longer version of this is if you’re only capturing 30 percent of your work, if you’ve got misalignment of incentives, if you’ve got management system toil. There are companies like, there are 10 different ways to create tickets for the 30 percent that they do create. “We use SharePoint. No, these guys use JIRA. These people use,” all with different contexts. Even if you tried to aggregate, you can’t.
In the Phoenix story, this is tribal knowledge. I look for the, in fact, the first company I did this with is this guy named Lou. So now I call it the Lou Circle. As I’m mapping this stuff out I drew a circle. It’s like, “I found another Lou.” And they’re the people that seem to be the people that never say no, they’re as friendly, as nice as can be. And they know how to fix everything, but they have no time to explain how to fix it. So the junior people that come on board quit, because every time they ask Lou how something works he’s like, “I’d love to tell you but I have no time.”
So you have to horizontalize that. Organizational design, we talked about. Not just microservice, but in general. You don’t want that Equifax scenario. And then complexity. Getting people to realize complex adaptive systems are a real thing. You have to think differently when you have this dark matter idea. Things are very complex. You have to think more … I’m going to get really scary here, but, you have to be more quantum about it. Some of the infrastructures now that are running … Like, top five, top six banks, right? These are incredibly complex systems. I mean, they may still have mainframes and COBOL, but the aggregate of them are beyond human comprehension complexity.
Just like when physics went from Newtonian to quantum, they had to switch to be more probability based. There’s a point at which our industry has to get better at being deterministic about … I’m going nuts here, but deterministic about, “This shall always look like this. And if it doesn’t, we’ll hammer it in.” To, “No, these are adaptive systems and you have to think differently about emergence. And patterns of emergence.” It’s way out there stuff, but we’re in this realm.
So this all bubbles down to … You have workarounds. People are not documenting work. You send tickets to a change advisory board every Wednesday. I talk about this shift left auditors. So what we really have to do, like most large corporations have two levels … Depending on who they are, three or four levels of people they have to attest to. A bank, it’s usually … It’s internal [ROM 00:30:13] and then regulators. My banking friends say, “John, here’s the deal. If we lose our license we’re not a bank. This is very simple math.” I see places that have all these different review boards. The architecture review, the project [inaudible 00:30:35] the CAB. I had one company that was so embedded in their job scheduling XML into their application … In fact, I had to walk away from this company … That they actually had an XML review board. Their XML job scheduling was so tightly coupled with their code that it literally had to have its own standalone review board.
Yeah. Checkbox compliant. And then you find how many people, when you find out about the woman who runs payroll for 40,000 people says, “You know what, I’ll put in the change rec into CAB on Wednesday. I’ll play that game. By the way my backup documentation is the same documentation I’ve been using for two years. Don’t even change the dates. Even put a little message in there, ‘If you see this, call me.’” In other words, people don’t read it. I had one company, 51 reviewers. When they went back to try to figure out how to do automated governance, there was only three people that ever reviewed. And none of them actually read. Right?
So this woman who did the payroll says, “You know what? Here’s the deal. I know how to put things in production if I need to. It’s my code. I own it.” This is a big capital market company. She says, “If the Wednesday afternoon CAB is going to stop people from getting paid, I’m making the change.” And you find pockets of these things all over the place. You get people to admit, they’re using ITIL … How many people, when they know it should be a major impact change, put minor, no impact because they don’t want to go through the CAB? All hands go up.
Then I ask the head of security, don’t you have a bar chart or something that shows you that there are a gazillion of these and two of these? “Well, if they get caught by the auditor,” and they go back to people like, “You get a slap on the wrist.” This is the most exciting thing I’ve probably worked on in my whole career. DevOps experts, here’s the DevOps Handbook. Tell a CIO, “You’ve got to get rid of the CAB, buddy.” And they’re like, “You know, here’s the thing, John. All you DevOps tell me how the CAB is evil, how I’ve got to get rid of it, but you never tell me how to do it. Because I’ve got regulators, I’ve got ORMs, I’ve got internal auditors, and external auditors that need proof. So I’m sticking with subjective attestation.”
What’s attestation? The proof that you did the things, or that you’re doing the things that some form of policymakers are telling you you should be doing. So my framing over the last year has been, we need as an industry to change subjective attestation, which is me sending you a change record, you looking at it and saying, “Yeah, okay.” Him looking at it and saying, “I think I’ll let this go in my production system.” And then six months later some auditor says, “Oh, what’s the change record for that?” “Oh, it’s some Remedy ticket. And these human, to human, to humans all said it was good.” What’s the efficacy on that? Thirty, 20 percent? As opposed to an objective model, where it’s built in and automated and there’s no humans.
This idea of DevOps automated governance … It was actually Kit Merker, I don’t know, blame him or whatever. He got me started on this idea about a year and a half ago. And there were some other companies who were doing some of this. This idea that we could build attestations into the pipeline. So I think of an attestation as sort of a … I dare to say, you know, cough cough blockchain, but … It’s not really blockchain. It’s just a linked list of crypto’d events that have a list that has a [SHA 00:34:34] that says that you did all the things and it’s immutable. That’s it. There’s a trusted source.
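That linked list of hashed events can be sketched in a few lines of Python. This is a minimal illustration, not how any particular company or product implements it: the event fields, the commit value, and the function names are hypothetical, and it only shows the linking. A real trusted source would also need each entry signed by the tool that produced it, which is left out here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(event: dict, prev_hash: str) -> str:
    """SHA-256 over the event plus the previous entry's hash."""
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_attestation(chain: list, event: dict) -> list:
    """Return a new chain with the event linked to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)}
    return chain + [entry]


def verify(chain: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


chain = []
for step in ("unit-tests-passed", "dependency-scan-clean", "peer-review-recorded"):
    chain = append_attestation(chain, {"control": step, "commit": "abc123"})

print(verify(chain))  # True: every entry matches its recorded hash

# Rewriting history after the fact is detectable.
chain[1]["event"]["control"] = "dependency-scan-skipped"
print(verify(chain))  # False: the tampered entry no longer matches its hash
```

Six months later an auditor does not have to trust what a human remembers about a ticket; they can recompute the hashes and see whether the recorded steps are the ones that actually ran.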
How many people read the Liquid Software book? You should read it. Honestly, I love JFrog, they’re nice to me, they do great things for me. I finally read the book. It’s a really good book. The metaphor of liquid, they did a really good job. I actually stole the paragraph from the end, but … The idea is, you build these attestations in the form of hash crypto events that are immutable. Now your efficacy is really high. It’s not human to human. And so, one of the ways that I started thinking about this is Topo Pal in 2018 wrote this article about how Capital One does DevOps in the pipeline. It’s out there. I’ll get you the link.
If you google this, they’re focusing on … But he said that we have, and originally he called them gates. Now he calls them control points. That in order to get a system managed at Capital One, in other words, if you wanted to have an auto approve. So this is like their gamification of auto approve. If you want your service to go through auto approve, and not have to go to the CAB, you have to evidence that you do those 16. Now it’s 29 today. But you had to evidence these things.
So somebody had to basically look to say, “Yep, yep, yep, yep, you’re doing all these things great. You can be auto approved.” I mean, depending on the change. There were some changes obviously that wouldn’t. He’s migrated that now into this terminology of control points. These are control points that are, think of them as immutable crypto hashes with a trusted source that you did this. And some list that you did all of them.
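The auto-approve idea, where a change skips the CAB only if every required control point has evidence, can be sketched as a simple set comparison. The control point names, the `Evidence` type, and the `"pipeline"` source label below are hypothetical placeholders, not Capital One's actual list or tooling.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical control points; a real list would be much longer.
REQUIRED_CONTROL_POINTS = frozenset({"peer_review", "unit_tests", "static_analysis"})

# Hypothetical: only evidence recorded by these sources counts.
TRUSTED_SOURCES = frozenset({"pipeline"})


@dataclass(frozen=True)
class Evidence:
    control_point: str
    passed: bool
    source: str  # who or what recorded the evidence


def route_change(evidence: Iterable[Evidence]) -> Tuple[str, List[str]]:
    """Return ("auto_approve", []) when every required control point has
    passing evidence from a trusted source, else ("cab_review", missing)."""
    satisfied = {
        e.control_point
        for e in evidence
        if e.passed and e.source in TRUSTED_SOURCES
    }
    missing = sorted(REQUIRED_CONTROL_POINTS - satisfied)
    return ("auto_approve" if not missing else "cab_review", missing)
```

Returning the missing control points along with the decision gives a team a concrete list of what to fix before its service qualifies for auto approve.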
So what I did is, Gene Kim once a year invites about 40 people down to Portland. And we work on these case study or research projects. And I’ve been going for six years. I’m fortunate to be part of that cabal. And some of the biggest of the biggest companies show up. Invitation only. I’m bragging, but anyway. I’ve been going for six years. Most of the time I just float, because I have more fun just floating around. Early on I wrote the Myth Busters one. But this year I decided, “You know what? I’m going to basically prefab my project.” So I called somebody I know who runs all retail at Nike. The guy who runs basically one of the largest commercial Kubernetes, most revenue … Revenue for a company generating through, I mean, we’re talking 30, 40 billion in production Kubernetes of Marriott’s revenue. The guy who runs Azure. A guy who runs all the infrastructure for PNC Bank. Topo Pal from Capital One. And we threw in Mike Nygard from Release It!, just for fun.
And we basically built a reference architecture of what it could look like to do automated governance. The first thing we did, so, you’ve got to imagine there’s a white board. You’ve got the guy from Marriott, PNC, Sam Guckenheimer, if you don’t know him, he’s like, amazing. All standing up there. I got goosebumps. And we’re trying to figure out, how do we explain to the world … And there’s a million ways to define the pipelines. And we’re not saying this is the de facto way you should all describe pipelines. But for what we were doing, this is what we came up with. There were seven stages. Two of them ran on their own cycle. Again, I don’t have real time, but the Dependency Manager, Artifact, had their own stream. But they became inputs to build or package. Great, thank you.
And then we said, “Okay, so that’s the template of the stages of how you do things.” Again, hundreds of different people have different ways to do this. But for our purposes … And this book will be out later this year. It’ll be creative commons. Again, just the people who were in that room. I can’t tell you. When I watched Marriott, and PNC, and Microsoft Azure argue about what’s the best control point, or we do this here … It was incredible. And it’s all documented. We basically put together a reference architecture of 65 control points based on all those companies’ best of what they do. Think of it, things like, “Does it start in source control?” Even human based things. Not even automated. Like, “Did you do a pairing? A peer review?” All these things as attestations. We’re still embryonic. I don’t know exactly how all those attestations will line up in an attestation database.
And we did three sample reference implementations. The best one we got the furthest with was with Google’s Grafeas. So if you know what Grafeas is, it’s an interesting … It was actually designed for this problem. It’s been open sourced. It doesn’t get talked about. We did simple Java microservices, to Kubernetes. And my buddy John Rzeszotarski at PNC, I always say, I suspect when he takes a shower he uses Kafka to run the water. I call him John Kafka Rzeszotarski. So he, of course, built the whole guaranteed delivery on Kafka. So we got a nice working prototype with Grafeas. We have another one that’s based on Hygieia from Capital One. And just to show you there’s … I’m not going to show you all the stages, but here’s the SourceCode one. The risks, and then these were what we defined as … And again, this is just a starting point.
We’re going to open this up and ask people, “What are your control points? Does this work?” I want to turn this into a very open and collaborative thing where we can start having an honest conversation about, “How do you actually move from subjective attestation to objective attestation?” So these are the five that we came up with. The actors, the actions. We always define input/output for each stage. And the risks. Again, I’m not going through all the ones, but I figured since we’re at JFrog let’s talk about … I’ll cover the Artifact one.
So the controls there are: only allow updates from a trusted package source, it has to be an immutable artifact, a retention policy. Again, we’re not saying this is what you have to be. This was those five or six companies’ really smart people agreeing that this [inaudible 00:41:02] best. And there were 65 in total, out of these seven stages that we came up with that are documented, with descriptions, inputs and outputs. Artifact is interesting because we didn’t know … Do you call the output an artifact? Yes. We try to be really tight on terminology too. It becomes now more of an immutable artifact at that point. In fact, artifacts mutate, right?
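The three Artifact-stage controls named here (trusted package source, immutable artifact, retention policy) could be enforced along these lines. This is a sketch under stated assumptions: the `ArtifactStore` class, the example source URL, and the 365-day retention period are invented for illustration and do not come from the reference architecture or from any real repository manager's API.

```python
import hashlib
from datetime import date, timedelta

# Hypothetical values, chosen only for the example.
TRUSTED_SOURCES = frozenset({"https://packages.example.internal"})
RETENTION = timedelta(days=365)


class ArtifactStore:
    """In-memory stand-in for an artifact repository."""

    def __init__(self) -> None:
        self._digests = {}  # (name, version) -> SHA-256 hex digest

    def publish(self, name: str, version: str, content: bytes, source: str) -> str:
        # Control 1: only accept artifacts from a trusted package source.
        if source not in TRUSTED_SOURCES:
            raise PermissionError(f"untrusted package source: {source}")
        digest = hashlib.sha256(content).hexdigest()
        existing = self._digests.get((name, version))
        # Control 2: a published name/version can never change its content.
        if existing is not None and existing != digest:
            raise ValueError(f"{name} {version} is immutable and already published")
        self._digests[(name, version)] = digest
        return digest


def may_delete(published_on: date, today: date) -> bool:
    """Control 3: an artifact may be removed only after the retention period."""
    return today - published_on >= RETENTION
```

Re-publishing identical bytes under the same version is allowed here because the digest matches. Only a different payload under an existing version is rejected.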
So in summary, these are the seven deadly diseases. You try to figure out how to get a higher retention rate on capturing invisible work. You really need to capture 100 percent of your work. You don’t have to be a time clock, but I don’t know how you can actually attack the dark debt, the complex systems, without actually knowing what you’re doing. Again, no disrespect with like, Atlassian. But people tell me, “Oh, we’re going to put an Atlassian in. It’s going to solve everything.” I’m like, “Yeah, but you’re only capturing 30 percent of what you’re doing. What part are you improving?”
Consolidate work systems, collaboration, bottlenecks, institutionalized knowledge, inverse Conway maneuver. Understand that there are parts of your organization, not just your code bases, but the way you, again, that Congress report of Susan Mauldin and reporting … those things just kind of eat you alive. Again, I can give you tons of good references about resilience and blameless and human factors. And I think automating governance. And I took this … I put this in our paper, and I needed to re-word. There’s a last paragraph in Liquid Software that they did an incredible job explaining. But it was sort of focused. It wasn’t as general focused as I needed, so I literally wrote my own version of it. But I literally say … In other words, I could not have written this paragraph had I not read Liquid Software. But I did rewrite this myself.
You can read it yourself really quick, but basically that’s … And they do an incredible job in Liquid Software sort of ending up with this … Read Liquid Software. It’s really … In other words, it’s about trust. Great, perfect. It’s about trust, right? Trusted source. You know, one company, one of the references [inaudible 00:43:31] they’re using Vault. They created all their signatures from Vault. That’s their authority. There’s a lot of ways to skin this. And one other thing, too. If this is like, “What is he talking about?” So imagine, not only are you sending up these immutable crypto events in some list. But imagine that when you do the build, you tar up all the logs from the build. You hash that, and that goes up in the list. So imagine when the auditor walks in and says, “This can’t be broke.” Like, maybe when quantum breaks through. But for now, nobody could’ve tampered with the build logs, because again, I don’t want to call it blockchain. It’s just a linked list of crypto events that are immutable.
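The "tar up all the logs from the build, hash that" step might look like the following in Python. It is a rough sketch with assumed names (`tar_logs`, `digest`) and in-memory log contents standing in for real build output. Signing the digest with a trusted authority, which the talk attributes to Vault at one company, is deliberately left out.

```python
import hashlib
import io
import tarfile
from typing import Dict


def tar_logs(logs: Dict[str, bytes]) -> bytes:
    """Pack the build logs into one uncompressed tar archive, in sorted name order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(logs):
            info = tarfile.TarInfo(name=name)  # metadata left at defaults
            info.size = len(logs[name])
            tar.addfile(info, io.BytesIO(logs[name]))
    return buf.getvalue()


def digest(data: bytes) -> str:
    """SHA-256 hex digest to record as the attestation for this build."""
    return hashlib.sha256(data).hexdigest()
```

An auditor who is handed the stored archive can recompute the digest and compare it with the recorded one. A single changed byte in any log produces a different value.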
The auditors are, and will catch on that, “Yep. We’re done.” Thirty-five day audits turn into four hour audits. “What’s the provenance of this change? Oh, here’s the hash. You get it. It’s like everything else we use in the bank. If this doesn’t work, then nothing else in our bank works either. We’re done with that one. What’s the next one? Oh, next one’s the same thing.”
All that said is, I do have a booth. The one on the left is my son. I’m teaching how to, like, he’s a third year electrical engineering student. Georgia Tech, got to brag. I feel so inclined. I’m trying to teach him what sales is like. I’m giving him a taste of our industry. And sales sucks. It’s hard. So you have my permission to go over there and give him a hard time. ’Cause he’s like going gung ho about, “Dad, what should I say if they do this?” So, we could have a little bit of fun if you want. We might try to sell you something, but we won’t if you don’t want us to. And we do have a couple of books, so … We only brought like, 20 books but if you create a really good conversation with those guys, they’ll give you a book. I’m done. I hope you enjoyed my presentation.