Software delivery pipelines are getting more and more complex. New deployment formats and systems arise, while the legacy ones are still here, for an indefinite period of time. Edge computing, blockchains, machine learning: all introduce new challenges and opportunities. Let's look at how to accommodate these changes and provide true transparency and measurability all across our delivery chain.
Hello folks and welcome to swampUP! First up, let's get the administrative stuff off our table. This is a recorded talk, which means I'll be able to answer your questions in the session chat while I'm talking, so feel free to send me your questions; I'm really looking forward to them. And now we can start talking about software delivery in a decentralized world. But first, let me ask you a simple question: is there a DevOps in the house? Is there a DevOps in the house? No, seriously: are there any of you out there with DevOps in your job title? If you're a DevOps engineer, DevOps manager, or DevOps architect, send a "+" to the session chat. Now, if you think that DevOps as a profession in 2020 is still an anti-pattern, send a "-" to the session chat.
It will be interesting to see what happens, right? Now, some of us, maybe many of us, still think it's an anti-pattern, but it doesn't matter what we think, because while we think, the industry happily marches ahead to the sounds of the DevOps trumpet. Actually, there's a new movement going on. Who here has heard about DevDevOps? It's a model for shared responsibility between developers and DevOps: right now we have a new silo, we have DevOps, and we need to be able to collaborate with them. According to Twitter, this is a trend that has been gaining popularity in the last 4 or 5 years. That's all fun and laughs, but the skeptics are never satisfied. They're saying that this binary model of Devs and Ops is not inclusive enough, it doesn't paint the whole picture, so they start breeding all kinds of monsters, mutations in their secret labs, bringing in all kinds of things: DevTestOps, and GitOps (collaboration between Git and Ops, anyone?), and DevSecOps, and BizDevOps, and ProdOps. And there's another one I heard only recently, which is BizDevDesSecOps. This is how we solve problems, right? Our software delivery is dysfunctional, we can't deliver on time and on budget, our end users are unhappy. How do we solve all of this? We invent a new word which ends with Ops, and all our problems are solved.
But all this work actually goes away when we remind ourselves what it is we're trying to do here. All we're trying to do is make IT delivery as painless, as stressless and as effective as possible, isn't it? Right? And that takes us back to the evergreen principles of flow, collaboration and, yes, systems thinking. Now, if we said systems thinking, we'll have to remind ourselves what a system is. According to the late Dr. Russell Ackoff, a system is just a whole (with a w) that is made up of parts. That gives a system an interesting property: a system's behavior is dependent on the behavior of its parts, but moreover, the behavior of each part of the system is also dependent on the behavior of the other parts of that system, which means that the parts of the system are interdependent. And this interdependence, in the end, defines the behavior of the system as a whole. Which means that the more components, the more parts, a system has, the harder, or even impossible, it becomes to understand, let alone predict, the system's behavior.
Now, there's another interesting feature. You see, no part of a system can do what the system as a whole can do, and that's very easy to prove, as Dr. Russell Ackoff used to say. Some of you out there can code, right? Send a "C" to the session chat if you're one of those who still write code. Now, you as a system, as a human being (you're a system), you can code. But can your hands, taken separately from that system, code? Some of you will say, "it's not my hands, it's my brain that does the coding." OK, the brain. The brain is a magnificent machine; we hardly understand even 5% of how it works. Can it write code separately from the system that's your body? No, it cannot.
That's easy to prove: take your brain out of your skull, put it in front of a laptop, ask it to write something simple, "HELLO WORLD" in Python. It won't be able to. But there's a paradox there: some folks evidently succeed in writing code without ever using the brain, just by muscle memory. Go to Stack Overflow, copy, paste, and it works. I myself have written code like this hundreds of times in my career. In quite the same way, I copied this example from, again, Dr. Russell Ackoff. I heartily recommend you go check him out, explore his books and his lectures on YouTube; he explains these things about systems thinking much better than I do. Now, the more components a system has, the more complicated it becomes to predict its behavior, because even the slightest change, an unseen change in the initial conditions of the system, can lead to wildly different outcomes in its final states.
This dependence on the initial conditions of the system was first discovered and described by this man right here, Edward Lorenz, when he was trying to analyze and predict weather evolution patterns. He found this, he described it, and it later became known as the Butterfly Effect. And, you know, as the saying goes, "a flap of butterfly wings in Brazil can cause a tornado in Shanghai." Or, much more relevant for today: one unhealthy bat in an animal market somewhere in Asia can cause a worldwide financial crisis that we don't know how to get out of. With time, the butterfly effect has developed into a whole new direction in the scientific thinking of the world, which is now known as chaos theory and is considered one of the three main scientific paradigm shifts of the previous century, the two others being relativity and quantum physics.
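You can see this sensitivity to initial conditions for yourself in a few lines of code. Here is a minimal sketch that integrates Lorenz's own weather model (the classic equations with the textbook parameters sigma=10, rho=28, beta=8/3) from two starting points that differ by one part in a million; the simple forward-Euler stepping and the step size are my choices for illustration, not anything from the talk.

```python
# Sensitivity to initial conditions in the Lorenz system: two trajectories
# that start almost identically end up far apart.
# Forward-Euler integration is crude but good enough for the demonstration.

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dx * dt, y + dy * dt, z + dz * dt)

def simulate(state, steps):
    for _ in range(steps):
        state = lorenz_step(state)
    return state

a = simulate((1.0, 1.0, 1.0), 3000)
b = simulate((1.0, 1.0, 1.000001), 3000)  # perturbed by one millionth

# Euclidean distance between the two final states: the tiny perturbation
# has been amplified by many orders of magnitude.
distance = sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
print(distance)
```

The exact final distance depends on the step size and step count, but the point survives any reasonable choice: the perturbation grows exponentially until it saturates at the size of the attractor itself.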
Now, folks who deal with chaos theory sometimes refer to themselves as chaosologists, and the reason for this talk is actually that while working with large information systems and large international IT organizations, at a certain point I realized that that's what I am. A chaosologist, which is a fancy way of saying "I don't know what the hell I'm doing here." Right? So, the systems that we chaosologists deal with are known as non-linear dynamic systems, or there's another name that I like even more, which is complex adaptive systems. Now, the adaptivity here means that the system as a whole and its components adapt: they change in response to events or series of events happening inside or outside of the system, by means of self-organizing collaboration.
Alright? So self-organization is another very important property of those systems. Now, this self-organized collaboration is achieved through very strong decentralization of those systems, because in centralized systems self-organized collaboration cannot really occur; it can't be as adaptive as needed. And finally, the adaptivity, the self-organization, the decentralization, all these give the systems one property that should be really interesting to us as the builders of large-scale information systems, and that's that those systems are scale-free, which means that their level of order and variability (of order and entropy) is the same no matter what scale we look at them at. They can scale down and up, in and out, and stay at the same level of chaos. Now, why am I talking about all of this? Why am I talking about systems, chaos, decentralization? Well, it's because those patterns, those properties, are becoming omnipresent in the systems that we're building and in the systems we're part of, in the social systems we're part of. Actually, all of this can be seen as one large socio-technical system, right? The systems that companies and societies are a part of, and the technological systems that we're building.
So let's see those patterns, and we'll start with microservices, of course. By the way, if you know what kind of bees those are, send your answer to the chat and I'll tell you if those bees are the type that you think. But let's talk about microservices. In order to make our systems more agile, easier to deploy, easier to scale, we start reducing, right? It's a reductionist approach: we reduce the footprint, we reduce the amount of code that goes into each service, we reduce the amount of resources that each service consumes, we reduce the amount of state that each service holds. What invariably goes up? The complexity of interactions, of course. So now we have a whole new plethora, a whole new ecosystem of tools and processes that we have to employ in order to make these interactions simpler, and we'll talk about this later.
But microservices are so 2010s, right? We now have a new pattern, the promised panacea for the operational complexity that microservices have created: we have serverless. We don't need servers anymore, we don't need operating systems anymore, just pristine, clean, executable code, and everything is event-driven. No events, nothing gets triggered. Must be sad to be that function that nobody triggers; but it can't even be sad, it doesn't exist. What becomes complex? The interactions with the functions. How do we debug them? We need to be able to replay all the events in order to understand why a function behaves in this or that way. How do different functions interact with one another? Who has access where? How do we trace the transactions that go through all the different functions? All of this is still a lot of unanswered questions. But both microservices and serverless are a part of what we tech hipsters call, what? Right, cloud native. Now, the cloud, with all of its grandiosity, has one inherent weakness.
You see, we're accustomed to thinking of the cloud as distributed, but the fact is that, contrary to popular belief, the servers of the cloud providers don't reside in the cloud. They sit where? In data centers, right? So cloud computing is essentially data-centralized. In order to use cloud computing, we need to be able to stream a lot of data into the cloud and back, and that puts limits on its scalability. And that gives rise to what we now call edge computing, or, there's another name for this, fog computing. You see, if clouds are those big chunks of water, steam, ice, sand, whatever the clouds are made of, fog is those little droplets of water dispersed around us in the air. And these are the devices that will be running the internet: according to even the most modest predictions, in the next 5-10 years we'll have billions of devices connected to the internet. Each foglet, each droplet of water, will be able to process data on its own or, which is much more interesting, in self-organizing collaboration with other devices, with other droplets of water. That's a new world, a new ecosystem of information systems that we need to be thinking of; we need to be thinking of how to deliver, how to build those systems. And finally, one cannot talk about distribution and decentralization without mentioning blockchain; I'm sure many of you were waiting for this. So, no matter what the skeptics would say, you may say that blockchain is a badly implemented distributed database, you may say that there are no real-life use cases for blockchain today, but even with the gold rush put aside, blockchain provides a promise.
A promise of worldwide, fully distributed trust, where we trust not a central entity that can become corrupted, but a smart technology. That's why more and more folks all over the world develop those distributed applications, and that's why big players like IBM, whose Hyperledger project's logo you can see on the screen, develop those systems and build them for future uses that we may not yet see. Right? The skeptics may say, "there are no real-life use cases for blockchain," but in the same way they said this about electricity about 150 years ago. No real-life use case for electricity; and in a way, they were right: the real life of the 19th century was so different from the real life of today. Now, the systems that we're building and those we're a part of are increasingly distributed, decentralized, chaotic, so the question arises: how does one manage chaos? As we said previously, the behavior of complex adaptive systems is dependent on the interaction between their components. That brings us to two conclusions. First of all, we cannot simplify and manage such a system by simplifying its parts; we tried it with microservices, and it didn't really work. The only way to simplify a system is by simplifying the interactions of its parts, and that is exactly what gives rise to the family of protocols that I'll be talking about in a minute. You see, when we interact, our interactions are sometimes very rich, and that's a great thing; our language is a complex, adaptive, emergent system. But in order to manage those interactions, we need to build protocols: subsets of symbols and processes that allow the interactions to be manageable, understandable and simpler.
That's why, in order to build those chaotic, complex systems, we need a new family of protocols; I collectively call them Distributed System Integration Protocols, and here is what I'm talking about. First up, OpenMetrics. We increasingly build more and more software components, and we integrate with more and more third-party components, sometimes caring about the performance of the third-party components even more than of those we build ourselves. That calls for a single, understandable, common protocol for talking about system performance, and that's what the OpenMetrics project is concerned with. Backed by folks from Prometheus, InfluxDB, Google and SolarWinds, it's been out there for about 2 years already, and they have a GitHub repository. If system performance is anywhere in the field of your interest, go check this out; it's a very important initiative.
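To make this concrete, here is a minimal sketch of what the OpenMetrics text exposition looks like, rendering one counter by hand. In practice you would use a client library such as prometheus_client rather than formatting strings yourself, and the metric and label names below are made up for illustration.

```python
# Rendering one counter in the OpenMetrics text exposition format.
# Counters expose samples with a "_total" suffix, and the exposition
# must end with a "# EOF" marker.

def render_counter(name, help_text, samples):
    """samples: list of (labels_dict, value) pairs."""
    lines = [f"# TYPE {name} counter", f"# HELP {name} {help_text}"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}_total{{{label_str}}} {value}")
    lines.append("# EOF")  # terminating marker required by OpenMetrics
    return "\n".join(lines)

exposition = render_counter(
    "http_requests",
    "Total HTTP requests served.",
    [({"method": "get", "code": "200"}, 1027),
     ({"method": "post", "code": "500"}, 3)],
)
print(exposition)
```

The value of the protocol is exactly that this output is the same whether it comes from your own service or from a third-party component: any scraper that speaks OpenMetrics can consume it.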
Up next: when we have a lot of components talking to each other and one transaction can span a lot of components, we need to be able to trace the transaction, to understand if anything is getting stuck or if anything is getting omitted from that transaction. Exactly for that, we need to be able to do distributed tracing, and OpenTracing is about creating a protocol for distributed tracing across all kinds of software components.
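The core idea behind distributed tracing can be sketched in a few lines: every span in a transaction shares one trace ID, and every child span records its parent's span ID, so a collector can reassemble the call tree afterwards. This is a toy illustration of the concept, not the OpenTracing API itself; the operation names and the in-memory "collector" are made up.

```python
# Minimal trace-context propagation: spans share a trace_id, and each
# child records its parent's span_id, which is enough for a backend
# (e.g. Jaeger or Zipkin) to reconstruct the transaction as a tree.

import uuid

collected = []  # stands in for the tracing backend

def start_span(operation, parent=None):
    span = {
        "operation": operation,
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex,
        "parent_id": parent["span_id"] if parent else None,
    }
    collected.append(span)
    return span

# One logical transaction crossing two "services": the downstream call
# inherits the trace_id, so both spans belong to the same trace.
root = start_span("checkout")
child = start_span("charge-card", parent=root)
```

In a real system the parent context travels between services inside request headers, which is precisely what the tracing protocols standardize.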
Also very important, in use by Jaeger, Zipkin and various other tracing and observability tools. Then, Open Policy Agent: once we do allow our systems to interact, we need to be able to define policies regarding those interactions. Which interactions are OK? Which are less OK? Who's allowed to talk to whom? Who's allowed to say what to whom? That's what Open Policy Agent allows us to do, developing a single language to talk about system policies. Then, we were talking about event-driven previously, in the context of serverless, but it's not only serverless: in any distributed, decentralized system, if we want to allow large-scale integration, large-scale collaboration, we need to allow some of the communication to be asynchronous. That calls for event-driven collaboration patterns, this calls for common event sending and receiving infrastructure, and this calls for a common language for events, so we know which events we can expect and what to do about them. That's what the CloudEvents project is dedicated to.
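Here is what that common language looks like in practice: a sketch of a CloudEvents v1.0 envelope in its JSON format, with the four required context attributes (specversion, id, source, type) plus a payload. The event type and source below are hypothetical examples, not anything defined by the spec.

```python
# Building a CloudEvents v1.0 envelope by hand. Any consumer that speaks
# CloudEvents can inspect "type" and "source" without knowing anything
# about the producer.

import json
import uuid

def make_event(event_type, source, data):
    return {
        "specversion": "1.0",        # required: CloudEvents spec version
        "id": uuid.uuid4().hex,      # required: unique per source
        "source": source,            # required: URI-reference of the producer
        "type": event_type,          # required: reverse-DNS style event type
        "datacontenttype": "application/json",
        "data": data,
    }

event = make_event(
    "com.example.build.finished",   # hypothetical event type
    "/ci/pipeline/42",              # hypothetical source
    {"status": "success"},
)
wire = json.dumps(event)  # what would actually travel over the event bus
```

Because the envelope is standardized, the same event can flow through any compliant broker or serverless platform without translation.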
Then, our software delivery pipelines are becoming ever more complex, because the world that we're delivering to is chaotic and complex. The project initiated by JFrog and Google about 2 years ago, called Grafeas, deals exactly with that, providing a unified metadata API and protocol for talking about software artifacts and the changes that we apply to them. Go check that out and integrate it with your systems. And finally, when each of our components is all the time talking to other components, and they all need to interact and to self-organize and collaborate in an autonomous manner, we need to provide each of our components with an identity, a software identity. It's not the identity of a person, you and me; it's the identity of a software component. The project called SPIFFE is concerned exactly with that, already used by such systems as Istio, HashiCorp Consul, and Envoy, the network proxy behind both of those. So, if you need to provide identity to services, go and look at SPIFFE. All of these are open-source projects, so it's our time to participate in them, to develop those protocols for building the systems of tomorrow. Now, one cannot really talk about chaos and not mention chaos engineering, of course. The concept first originated at Netflix, right?
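The identities SPIFFE hands out have a simple, well-defined shape: a URI of the form spiffe://<trust-domain>/<workload-path>. Here is a minimal sketch of parsing and validating one with the standard library; the trust domain and workload path are made-up examples, and real SPIFFE implementations of course do much more (issuing and rotating the cryptographic documents behind these IDs).

```python
# Parsing a SPIFFE ID, the workload identity format used by SPIFFE/SPIRE.
# A valid ID is a URI with the "spiffe" scheme, a trust domain, and an
# optional path identifying the workload within that domain.

from urllib.parse import urlparse

def parse_spiffe_id(uri):
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe":
        raise ValueError("not a SPIFFE ID: scheme must be 'spiffe'")
    if not parsed.netloc:
        raise ValueError("not a SPIFFE ID: missing trust domain")
    return {"trust_domain": parsed.netloc, "path": parsed.path}

identity = parse_spiffe_id("spiffe://example.org/billing/payments")
```

The point is that two services that have never met can authenticate each other purely from identities like this, with no human user involved, which is exactly what autonomous, self-organizing components need.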
They're known for building the Simian Army, the family of tools that were built for testing and verifying the resilience, the reliability of their own system, before they released it to the community, with Chaos Monkey becoming the most famous of this family. But the tools are even less important than the principles that they define. Those principles can be found in the manifesto called Principles of Chaos Engineering, at principlesofchaos.org. If you're building information systems today, you need to be aware of this; go read it, go understand the principles that the manifesto is talking about. Now, finally, I was talking quite a bit about decentralization here. Why is decentralization important? Because, as we said, adaptivity and self-organization in centralized systems is very limited: it's very dependent on the central entity, and if that central entity tries to take the control back, then the self-organization disappears.
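The Chaos Monkey idea itself fits in a few lines. This is a toy simulation of the principle (inject a failure, then verify the system still serves traffic), not Netflix's tool or any real orchestration API; the replica names and the seeded random choice are mine.

```python
# A toy chaos experiment in the spirit of Chaos Monkey: randomly kill one
# instance of a replicated "service" and check that requests still succeed.

import random

random.seed(7)  # deterministic run for the example

replicas = {"replica-1": True, "replica-2": True, "replica-3": True}

def chaos_monkey(replicas):
    """Simulate killing one randomly chosen instance."""
    victim = random.choice(sorted(replicas))
    replicas[victim] = False
    return victim

def handle_request(replicas):
    """A load balancer that routes to any healthy replica."""
    healthy = [name for name, up in replicas.items() if up]
    if not healthy:
        raise RuntimeError("total outage")
    return healthy[0]

killed = chaos_monkey(replicas)
served_by = handle_request(replicas)  # the system survives losing one replica
```

The real discipline, as the manifesto spells out, is doing this continuously in production, with a hypothesis about steady-state behavior stated before each experiment.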
In quite the same way, we see that the internet, which was originally designed to be decentralized, is getting more and more centralized today, as commercial entities and governments are pulling in the data, trying to manage the data, trying to use our data in their interest. We've seen a lot of scandals around Facebook, for example, in the last year, because of them misusing our data. So now there's a whole movement, called Redecentralize, that calls for the re-decentralization of the internet, because the centralized model doesn't scale, as we said. We should look into that, because this really points at the direction that our systems and our societies are going in. And of course, the decentralization, complexity and chaos of the systems that we're a part of call for a new understanding of such issues as consensus and trust. There's this great book by a British author called Rachel Botsman, which is called Who Can You Trust? It talks about the evolution of trust in human society. Basically, what Rachel says is that once, long, long ago, our trust model was local: we trusted our tribe members or our village neighbors. Then, for a very long period in history, we trusted other people based on some centralized entity, right?
So we trusted other people because our government said it was OK to trust them, we trusted other people because the bank said it was OK to trust them, or because Facebook, or any commercial structure, said it was OK to trust them. But now, thanks to technology, our trust model becomes more and more distributed: we start trusting some person, some business on the other side of the world, because some folks we've never seen provided good reviews for that business or person. That's a much more distributed model, without any central entity (maybe with some central infrastructure, but without putting our trust in the central entity itself). And that is totally aligned with how the systems that we're building operate; in fact, that's Conway's law for you, right? The systems we're building reflect the changes in the systems that we're a part of, and then they allow the changes in those systems that we're a part of to occur. Rachel Botsman defines trust as a confident relationship with the unknown. A real-life definition, because if you think about it, that's what chaosology is about: a confident relationship with the unknown.
Now, all this again calls us to revisit the way that we collaborate. As we said, the systems that we're building will reflect the patterns of collaboration in the systems that we're a part of, and that's why a true modern DevOps organization, one that can build those complex, distributed, decentralized systems, also has to look the same. It has to be decentralized, built of multiple lightweight, semi-autonomous units, each taking their service from the cradle to the grave, from inception to the user, collaborating in a leaderless manner, with their collaboration based on async communication patterns, especially today when, quite unexpectedly, many of us suddenly found ourselves working from home, each one from their own bedroom. That's why we desperately need full transparency of information, full sharing of knowledge. And in quite the same way, the software delivery patterns into this chaotic, distributed, decentralized world have to be decentralized. They need to be pull-based, because we already found out that pull-based flow, pull-based collaboration, is the only way to establish flow in complex systems. They need to be event-driven, we talked about that, and they need to be based on shared, emergent protocols. And there's a word of warning here.
You see, when we start dealing with, understanding and approaching those chaotic, complex systems, our knee-jerk reaction is to try to control them. Not to manage them smartly, you know, simplifying what we can simplify and managing that, but really to control their behavior. And we see this more and more now that we suddenly had to switch to a new, more distributed work model: what a toll it takes on our mental health, all those hours spent in front of Zoom screens, when we spend 5, 8, 10 hours in front of our laptop screens. All this leads to burnout, because burnout occurs when we try to control things that we cannot really control, things that cannot be controlled. And that's my message to you: let's not try to control things we cannot control, let's not burn out. It's terrible when professionals burn out; when whole organizations burn out, that's a catastrophe. We don't want to get there; there's really no use in that. Most of us here are not saving the world, let's be truthful, but what we can do is build our systems resilient. We can build them in a smart, manageable way, embracing the fact that they're chaotic, embracing the fact that they're decentralized, and providing them the properties that they need in order to be resilient in this self-organized, decentralized way that we've been discussing today. Life, our life, is chaotic; our world is chaotic; that's what makes it beautiful. As Albert Einstein once said, you can live your life thinking that this world has no wonders, that there's nothing miraculous in this world, or you can spend your life thinking that this whole life is one big miracle. I certainly believe in the second way, and I think everything we talked about today shows us that it's the only way to treat our systems and our societies. And we're the ones building those systems that allow this world to become ever more distributed. So let's keep ourselves healthy, let's keep ourselves from burnout, and continue building those wonderful, chaotic, complex and
beautiful systems. Thank you.