An Accelerated World Requires Accelerated Delivery
As the world continues to demand more software, the next generation of applications requires speed, security and efficient software distribution at scale to meet the promises of DevOps.
Join JFrog’s CTO Yoav Landman and CPO Dror Bereznitsky as they unveil JFrog’s 2021 roadmap, with a few DevOps industry-changing surprises.
Don’t miss this annual landmark session, which includes details on the release of the world’s first Federated Repositories, Signed Pipelines, Cold Artifact Storage, Scoped Tokens and third-party dependency scanning.
Video Transcript
Hi everyone, let’s talk about accelerating the world of DevOps. As developers, we always like to think in terms of code. But while the journey to the runtime starts with code, it immediately transitions to binaries. Our customers swear by their binaries. And there’s a reason for that. This is the single source of truth for the software releases. It starts in the build phase, where you need binaries in order to compile and you’re, of course, creating new binaries. And it continues all the way to production to the runtime.
In fact, most of the DevOps infinity loop applies to binaries, and is about managing these binaries.
But there’s a bigger story around managing binaries, a story of lineage and a story of resume, of a binary resume. Every software release is composed of many, many smaller components just like a car, and binaries are assembled to create bigger or larger binaries.
For instance, your Helm chart refers to container images, which refer to OS-level packages and application binaries like npm modules, Rust or Go modules. So in order to trust a software release, we need to have trust in every single component in this chain, whether it's yours or a third-party component. Finally, the binaries you're going to run and monitor are the same binaries that you start with. Your CI binaries, your release binaries, your distribution binaries, and the binaries that you're going to end up with in your runtime are the same binaries.
This is why everyone today is building their DevOps automation and digital transformation flows around binaries. Rebuilding these binaries from source doesn't only mean a loss of time, it also means a loss of quality, because you're losing a strong reference to an immutable release that is going to end up in your runtime, and guaranteeing the reproducibility of a build is a very, very difficult thing to achieve.
Our platform, the JFrog Platform and products, is all about managing this flow of binaries and the metadata relationships between binaries. We are managing binary operations, or BinOps, and BinOps is the most fundamental thing in achieving fast and reliable software releases.
Since faster releases are the goal of everyone doing DevOps, we need to focus on accelerating the BinOps workflows to allow your organization to move faster and to create the next generation of software releases.
The exciting announcements that we're going to make here today are exactly about this type of acceleration. And to make this more concrete, these are the acceleration challenges we have been focusing on, based on a continuous feedback loop with thousands of development teams.
First, scaling and managing growth. Second, enabling end-to-end security and trust in the BinOps pipeline. And finally, allowing private distribution to the runtime, to break beyond the boundaries of scale and usability. Let's go into the details and start with accelerating scale, or accelerating growth. Many organizations we see are challenged by three types of growth. The first one is the growth of teams: you have many more teams, many more projects, more business lines, and we are actually seeing you building self-service portals in order to manage the allocation of resources for these teams.
But we need to help you lower this cost. The second challenge is the growth of geographies. Many of you use replication to exchange artifacts across teams in remote data centers. What we want to do is make data center transparency a reality for you with near-zero setup, and also give you DR (disaster recovery) for your artifacts. And finally, growth of binaries.
While you need to keep all the binaries around for a long period of time, possibly for legal or other reasons, those binaries impact the scale of your operation because they bloat the indexes, and they also create user-experience noise. We want to allow you to cold-store such releases while giving your users the option to self-restore them when they need them. So that was the acceleration of growth and scale. And the next acceleration is about security.
We believe that security has changed, and will change, the way that we do DevOps. Today we have a name for it, DevSecOps. But the world is changing toward having security as an integrated layer of every product, not as a separate value-add layer on top of products that are otherwise insecure. So infrastructure as code, cloud config, binary management and CI all need to have security baked in, the same way that we think about every other quality aspect of a software release. And today, we will show you how we support this notion in two ways.
The first one is by adding cryptographic trust to the BinOps pipeline itself. And the second one is by moving all our products and services to a zero trust model. So more on that later. Finally, let’s talk about software distribution.
The market is moving towards a reality where all enterprises have to deal with distribution and become distribution experts to some degree. There are many reasons for that, primarily the need to run software closer to customers, in remote data centers, in multi-cloud, and at the edge. This is backed up by analysts' predictions, but for us it's already the reality that we're seeing: you are already running software at the edge, in remote data centers, and on anything black-box, if you wish. The need is here and now.
Because of that, we see a lot of investment going into solving the problem of private distribution. And we see four main challenges that you’re trying to solve.
First one is managing and setting up a private distribution network. The second is lowering the cost of ownership and maintenance on an edge node. The third one is reducing the load and the dependency on a central Artifactory server.
And finally, the last challenge is overcoming the growing sizes of software releases and applications. And so we want to take away this pain from you, and take away this complexity in accelerating distribution by providing you with an out of the box solution.
So let’s go right into it and start making some exciting announcements. And we have a lot of them.
Ready? So I want to start at the end, I want to start with software distribution to the runtime. We already have Artifactory Edge, which connects to a central Artifactory server and is a local, read-only flavor of Artifactory.
So it has long term storage, it has a UI, it can be scaled for high availability, it can do security, so authorization and authentication and it can overcome slow and unreliable networks.
And also it can push to destinations that sit beyond an inbound only firewall. And the way to push content to an edge is through a release bundle.
A release bundle is a JFrog technology. Basically, it's a collection of queries that you run against the source Artifactory to gather the artifacts that you want to collect to form a release. And you generate the signed bundle JSON with all the artifacts and the properties and optional additional metadata.
This JSON is signed centrally and it's validated on the Artifactory Edge, so it gives you atomicity and immutability. So we wanted to take the Artifactory Edge and allow you to extend it with a super light distribution setup. And here is what you told us you need from such a solution.
First, it's got to be hybrid, that's very important. Then it needs to have a low storage cache, it needs to be ultra scalable, work over the internet, it can be warmed up if you want to, it needs to be highly available, it has to be permission aware and respect security, and it needs to be container and package aware, so that you can run your existing clients against it, such as docker pull and so on.
It has to have activity auditing, and finally it needs to have a very, very low total cost of ownership. So that's a lot of requirements. And at this point, I'm super excited to introduce to you the world's first hybrid private distribution network, or PDN, as we call it.
So what is PDN? What is a private distribution network? It's a setup of cascading groups of boxes that are connected to an edge. So you can build your own topology using groups. Here we can see an example where we have a central Artifactory, then groups in Europe and groups in the US, and below them another level of groups with cities, and clients are connected to a local distribution group. And groups can point up to each other, all the way to an Artifactory or to an edge, where the topology is continuously reported. And adding a new distribution node to a group is a super easy thing to achieve. You just give it the parent group name, its own group name, and some credentials. And that's it, that's how you build it.
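As a purely illustrative sketch of that idea, joining a new node could look something like the following; the binary name, flag names and credential variable here are assumptions made for illustration, not the shipped interface.

# Illustrative only: binary name and flags are assumptions, not the real interface.
# The point is that a new node only needs its own group, its parent group and credentials.
export PDN_JOIN_KEY="<credentials issued by the central platform>"
./pdn-node --group "london" --parent-group "europe" --join-key "$PDN_JOIN_KEY"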
And it's all based on an internal technology we developed at JFrog called watu. So this is how you build this super flexible and easy topology. To give you some more details about how all this works: every PDN node is like a lightweight remote cache, and we arrange nodes in groups in order to break beyond the single-machine network capacity.
We maintain an LRU cache in every node or at the group level. And the way to populate it is either on demand, as a proxy, like you're used to with remote repositories and such, or you can trigger a warm-up on demand. More on that in a second.
The cache itself is highly available, and we achieve that through P2P. It's a JFrog-specific P2P protocol.
It's not the BitTorrent protocol, it's a gRPC-based, firewall-friendly protocol. And it's permission aware, so content is distributed based on security permissions, and we allow you to revoke content, to do content revocation. And all this is just a very, very lightweight Golang process, meaning it can run on very low-spec hardware as well.
Finally, the reason I spoke earlier about release bundles is that this is the mechanism that provides atomicity and trust to releases being distributed to the PDN. So when you want to push a secure distribution to PDN nodes, you first create a release bundle. This release bundle can be persistent, or it can be dynamic, so you can achieve the distribution and the bundle creation in one operation. The content of the bundle, of course, will be propagated to all selected groups, and then it's going to be validated on the groups. As an additional feature, we also allow you to close the bundles for downloads until all the content is there, which gives you another layer of atomicity.
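As a rough sketch of what creating, signing and distributing such a bundle can look like with the JFrog CLI; the bundle name, repository path and file spec below are placeholders, and flags may differ slightly between CLI versions.

# Hedged sketch: placeholder names and paths; verify flags against your JFrog CLI version.
cat > bundle-spec.json <<'EOF'
{
  "files": [
    { "pattern": "docker-local/my-app/1.0/*" }
  ]
}
EOF
# Create and sign the release bundle from the file spec, then distribute it to all sites.
jf ds rbc --spec=bundle-spec.json --sign my-app-bundle 1.0
jf ds rbd --site "*" my-app-bundle 1.0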
Okay, so it’s a great point to stop and see PDN in action, let’s see a demo of private distribution network. But before we dive into the demo, let me give you some context about what we’re about to see. So there’s a topology set up here with the central Artifactory at the top, the one that’s marked home, and below the three edge nodes in three different continents.
So one in the US, one in Europe and one in Asia. And below them are distribution groups. These are the PDN groups. And the numbers on the green rectangles represent the number of nodes in each distribution group. We're going to create a release bundle, and basically distribute it and populate the selected group nodes with it.
So let's switch over to the demo and see it in action. Let's start by creating a release bundle. In this case, we're using a JSON file as the input. We'll give this bundle the version 303 and the name SwampUP 2021, and in this case we're just selecting a Docker container with all its layers. Let's create the bundle. And we can see the output: we can see the different files, or in this case the different layers of the container, and we can see the container manifest with its checksum. And next we're going to go to the JFrog platform UI.
We can see the release bundle we just created, and we can see that it hasn't been distributed yet. And just before we go ahead and distribute it to the distribution nodes, we're going to go to a client of one of these nodes, a client that sits below the London distribution group in this case, and we're going to attempt to pull from the group the Docker container that we're about to distribute. And of course, it's going to fail, because the container is not in the group yet.
So we're getting a 404. Now it's time for us to go back to the platform and run the distribution. Of course, you can do it through REST, but here we're going to do it from the UI. Here are all the edge nodes, the three edge nodes: the US, Europe and Asia. And we're going to select some nodes for distribution. You can see the thousands of distribution nodes here accumulated across all the groups; we pick California, just for the example. And the next thing for us is to go ahead and hit distribute, and distribution starts. And we can track the process; we actually start to distribute to child groups even before the parent group has received all the content in full.
We do it for optimization. And we can see the process as it goes; we let it finish. And that's it, all the nodes in all the groups received the container. And finally, the last thing for us to do is go back to this client in London and attempt the pull again. And voila, now it succeeds, because the container was distributed to the group.
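On the client side, the pull in the demo is a plain Docker pull against the local distribution group; a minimal sketch, with a hypothetical registry hostname standing in for your own PDN endpoint.

# Hypothetical hostname for the London distribution group; use your own PDN endpoint.
docker pull pdn-london.example.io/swampup-app:303
# Before distribution this fails with a 404 / manifest unknown error;
# after the release bundle reaches the group, the same command succeeds.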
That’s it.
Okay, so that was PDN in action, the private distribution network. And now it's time to reveal even more news. So let's switch over to Dror Bereznitsky, our Chief Product Officer, to announce the next product news.
Thank you, Yoav.
In this part of the session, I will speak about security, and how to create more trust in the software release lifecycle. And I would like to start by asking you a question. Do you trust the software that you’re deploying to production?
How do you know that the software that you intend to deploy is the one that you are actually deploying to production? We believe that in order to have trust in the software that you're deploying to production, you need to have trust in everything which is used to build it, from the moment the developer checks the code into version control until the production-ready binaries are out of the CI pipeline and ready for deployment. First, you need to have trust in the pipeline which builds your software, making sure that you track and audit each and every step that is part of the pipeline. Then you want to make sure that all the third-party software packages which are used for building the software are safe to use and do not contain any known vulnerabilities. And finally, you need to trust all the different services that are integrated as part of the pipeline, and make sure they are doing only what they are intended to do.
So let's start with trusting the third-party software packages that you're using to build your software. For the past couple of years, we have been helping developers to build secure software by protecting software packages that are stored in Artifactory using JFrog Xray. And today, I'm very happy to share that we are taking this support and protection provided by Xray a step further to the left by introducing a new capability for identifying vulnerabilities in third-party dependencies directly from the source code.
Using a simple JFrog CLI command, developers will be able to scan their source code for the usage of vulnerable third-party packages. Behind the scenes, the JFrog CLI will analyze the source code, looking at files such as the pom.xml or the package.json, and will build a list of the third-party dependencies that are being used. Then, by integrating with Xray, it will identify if any of them has any known security vulnerabilities. Not only that, but you will be able to leverage the policies that you define in Xray in order to decide what is the right way to approach any vulnerability, the same way that you do with any other Xray scan. And of course, this new capability can also be used as part of automation, as part of your CI pipeline or any other automated process.
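As a sketch of how this can look from a developer's terminal; the exact command names and flags depend on your JFrog CLI version, so treat this as an approximation rather than the definitive interface.

# Run from the project root containing pom.xml, package.json, go.mod, etc.
cd my-project/
# Resolve the dependency tree locally and check it against Xray;
# command name and flags are approximate and vary by CLI version.
jf audit
# Optionally evaluate the results against the policies attached to a named Xray watch.
jf audit --watches "security-watch"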
This new capability will be released in Q2 and will allow finding and mitigating vulnerabilities early in the process, when it's also easiest to fix them.
The second thing you want to secure while building your software is all the different services and tools that are being integrated as part of the pipeline.
For each one of those tools that take part in your build process, first you want to know exactly who the service is, and second, you want to provide just enough permissions for it to do what it is supposed to do, and not more than that. And to provide this level of protection, we have decided to follow the concept of zero trust security.
Zero trust security is a security concept centered around the belief that organizations should not automatically trust anything inside or outside the perimeter, and instead must verify anything and everything trying to connect to their systems before granting any access.
To do so, I'm happy to share that we are extending the security capabilities of the JFrog platform with a new feature called scoped tokens.
Scoped tokens are tokens used for granting permissions to a user or a group of users to perform given actions on JFrog platform resources.
Scoped tokens cover all the different types of resources managed in the JFrog platform, including repositories, builds, release bundles, reports, projects and much, much more. The list of actions is also uniform across the platform and includes actions such as read, write, delete, manage and execute. And finally, include and exclude patterns can optionally be used in order to provide more fine-grained control over the resources.
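As a hedged sketch, here is roughly how such a token can be created through the Access REST API; the endpoint and the applied-permissions scope shown here exist today, while the finer-grained resource and action scopes described above are the new capability, so their exact syntax may differ, and field names should be verified against your platform version.

# Sketch only: hostname is a placeholder; verify field names against your platform version.
curl -s -X POST "https://myjfrog.example.io/access/api/v1/tokens" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "subject": "ci-build-tool",
        "scope": "applied-permissions/groups:builders",
        "expires_in": 3600,
        "description": "limited token for the CI build tool"
      }'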
So for example, I can create a scoped token for a build tool that has read and write permissions for a given path inside a repository. Another option is to create a scoped token for a deployment tool that only contains read permissions for something like release bundles. And finally, after securing the different ingredients used as part of the build, you also want to make sure that you can trust the final result.
When deploying your binaries to the production environment, there is a question of whether this is actually the binary that you intend to deploy. There is always a chance that somebody replaced the version at the last moment, or that another pipeline completely overwrote the version that you're trying to deploy.
To help with that, I'm very happy to announce a new, unique capability of JFrog Pipelines called signed pipelines.
The idea behind signed pipelines is collecting a rich set of metadata, called pipe info, during the build process, which captures everything you need to know about how an artifact got built. For example, all the different pipelines that were used in order to generate the artifact, all the different steps that were executed as part of each pipeline, including their inputs, outputs and configuration, and also the Git commit that triggered the pipeline, and much, much more.
The collected metadata is then signed as the pipeline runs, and it's associated with the artifacts that are generated as a result of the pipeline.
By having the signed metadata, which describes exactly why, when and how the artifact was created, you will be able to prove the authenticity of the artifact and by that ensure its immutability.
The combination of authenticity and immutability should provide you with the highest level of trust in the artifacts that you're deploying to production. And I'm happy to share that this new signed pipelines capability will be available towards the end of Q2. And now I would like to show a short demo of signed pipelines.
For the purpose of this demo, we have two pipelines.
The first one is the development pipeline that builds the Go application and then publishes it to Artifactory together with the build info.
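In the demo this publishing is done by a pipeline step, but roughly the same flow from the JFrog CLI looks like the sketch below; the repository path is a placeholder, while the build name and number match the demo.

# Upload the built binary, associate it with a named build, then publish the build info.
jf rt upload --build-name=go-app --build-number=27 "dist/go-app" go-local/go-app/27/
jf rt build-publish go-app 27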
You can see that the latest run produced the build name go-app with the build number 27. Now let's see if it's actually deployed to Artifactory; we can see that we have the build number 27 in Artifactory, and we have the go-app. Now, let's go to the other pipeline.
The other pipeline is a production pipeline, which promotes the build artifacts to a production repository, and then deploys the application to production.
But before running it, let's emulate a hacker that replaces the build results with a hacked version of the software and also replaces the build info. Now, when I run the production pipeline, it will take the hacked version of the application and promote it, because it has no way of identifying that somebody replaced the build. And eventually it will deploy the hacked version to production.
With signed pipelines, we can add a one-line configuration that tells our pipeline to fail the build in case it doesn't manage to validate the build artifacts. So in this case, when I run the pipeline, it will fail: it compares the build that it tries to promote to production with the signed metadata of the previous run, identifies that somebody changed the build and the artifacts, and therefore the build fails and the deployment phase is skipped. And by that we prevent the hacker from deploying a hacked version to production. And again, signed pipelines will be available at the end of Q2, and as I mentioned, it's a unique capability of the JFrog platform.
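A hypothetical sketch of what that one-line configuration could look like in a pipelines.yml; PromoteBuild is a real native step type, but the validation option name used here is an assumption made for illustration, not the documented key.

# Hypothetical fragment; "validateSignatures" is an assumed key name for illustration.
cat > promote-step.yml <<'EOF'
steps:
  - name: promote_to_prod
    type: PromoteBuild
    configuration:
      validateSignatures: true   # fail the step if the signed pipe info does not match the build
EOF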
The next item I'm going to discuss is scaling and managing growth across different sites and geographies. And there are several reasons why you need to manage binaries in multi-site environments.
One common use case is having distributed teams which span across multiple geographies, or having globally distributed CI/CD processes.
Synchronizing your artifacts between those environments helps teams collaborate and also improves your CI/CD processes, and by that saves you time.
Another use case is having fallback sites, which are used to keep safe copies of your artifacts.
When thinking about multi-site environments, you need to take into consideration a couple of requirements, for example synchronizing between on-prem and the cloud, or synchronizing between multiple clouds, and in general supporting different types of topologies.
There is also a question of scale, where you need to manage a growing number of global sites, repositories and artifacts. For many years, we've been supporting the management of multi-site topologies and environments using the replication capability. But over the recent year, we saw that there is a growing need to provide a next-generation solution to help you with the growing complexity and scale of multi-site environments. And so I'm thrilled to share that we have released a new capability called federated repositories.
Federated repositories allow you in a very simple way to connect a set of globally distributed repositories into federations and by that achieve what we call data center transparency. So, let’s see how it works.
In general, you have local repositories in different JFrog platform deployments that are encapsulated under what we call a federation. Synchronization between the different members is automatically configured, and basically you don't need to deal with the underlying replication, or to do any configuration for that matter.
Any changes, for example when artifacts are being deployed or deleted, are distributed rapidly, as soon as possible, between the different federation members.
Under the hood, we're using a new replication framework that was designed from the ground up to mirror artifacts at the required scale. And finally, you can easily migrate your existing repositories to the new capability without the need to recreate them or do anything special. So let's see how it all works together.
In this diagram, you can see a federation which contains three sites, San Francisco, London, and Tokyo.
In each one of these sites, we have a federated repository, which is considered a member of the federation.
A federated repository behaves and is very similar in nature to a local repository. And all of these federated repositories are connected in what we call bi-directional mirroring, which basically means that any change performed on any member of the federation will be automatically mirrored to all of the other members.
So for example, if I'm deploying an artifact to the London repository, it will be automatically replicated to Tokyo and San Francisco. And the same goes for deleting artifacts or changing the metadata. And now, after hearing all the details, I think it's time to see federated repositories in action.
We have a JFrog platform environment with four sites: one in the US, one in Europe, one in India, and one in Australia. And I will start by creating a new federated repository, and I will do it in the US site. As you can see, we now have a new menu for federated repositories, and I will create a new one. First of all, I need to choose the type, and I will make it a Docker repository. Then I need to give this repository a key, so let's call it acme-docker-prod. And most important, I need to join federation members, so this will be federated. So I'll add an existing repository from the India site, and in the Madrid site I will create a new repository that doesn't exist yet; I can do it directly from here. And I will also choose an existing one in Sydney. And now we can see that I have four different members in my federation. And that's it, basically, I created my federation.
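The same federated repository can also be created through the repositories REST API instead of the UI; a sketch under the assumption of placeholder hostnames for the four sites, to be verified against your own deployment.

# Sketch: hostnames are placeholders; run this against the site that should own the new repository.
curl -s -X PUT "https://us.example.io/artifactory/api/repositories/acme-docker-prod" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "key": "acme-docker-prod",
        "rclass": "federated",
        "packageType": "docker",
        "members": [
          { "url": "https://india.example.io/artifactory/acme-docker-prod", "enabled": true },
          { "url": "https://madrid.example.io/artifactory/acme-docker-prod", "enabled": true },
          { "url": "https://sydney.example.io/artifactory/acme-docker-prod", "enabled": true }
        ]
      }'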
We can see that I have the federated repository as part of my repositories list; you can see that it's here. And as the next step, I will deploy a new Docker image to the newly created repository. So I'm deploying an Alpine Linux image, and it got pushed into the repository. So let's see that it actually exists in the repository I deployed to. And here you can see the newly deployed Alpine image.
Now let's move to the other members of the federation, and see that the newly deployed image also got deployed there. So I can see that, indeed, the mirroring worked and I have the new image in the other federation members as well, and also in this one. So basically, it was federated across my federation.
Okay, so moving on. Let's see what happens if I delete this version from here. So I deleted it from the India member of the federation. And if I go to the Madrid one and refresh the view, you can see that the deletion was also mirrored, and now I don't have this image anymore.
We can also see it on the US site. So the deletion action was also federated. So that was the demo of federated repositories. And I would like to mention, once again, that this feature is already available, and you're welcome to give it a try. And in the last part of the presentation, I will discuss administration at scale.
Scaling is a great sign that you're succeeding: when you're growing, you're hiring more people, and then you're forming new teams, which in turn are working on new projects or new products. And this results in building more and more software and generating more artifacts.
However, this growth also generates administration work, which puts a load on the teams that are responsible for the DevOps processes.
Those teams are usually lean and they have to deal with many stakeholders, many customers and many tasks. And so naturally, they would like to optimize things as much as possible.
And there are three ways you can optimize and reduce the load on those teams.
First, you can always automate, basically automating manual processes and eliminating any manual action performed along the way. Second, you can delegate work to the project teams or to the project leads, and by that spread the load from your platform administrators as much as possible. And finally, you can provide self-service to allow teams to work in their own space. And one area which requires attention in administration work is the exponential growth in the number of artifacts, which happens as you build more and more software and deploy it to Artifactory. Those artifacts naturally consume space, and eventually somebody has to pay for it.
I mean, of course, you can tackle this challenge by cleaning up unused artifacts to free up space, but you're not always able to simply clean up artifacts.
In many cases, due to organizational policies or even regulations, you're required to store those artifacts, basically to archive them, for a longer period of time that sometimes goes even up to 20 years in certain industries. And in order to tackle this challenge and allow you to automate the process of archiving artifacts, I'm very happy to share that we are introducing a new capability called cold artifact storage.
This new capability will allow you to save costs by archiving unused artifacts for the longer term and storing them in cheaper storage solutions, such as AWS Glacier or similar.
You will be able to define archiving policies leveraging the artifact metadata or any other rules that you want to use in order to identify which artifacts should be archived and when. Archived artifacts will be stored together with their metadata in a different storage than your regular artifacts.
As I said, you can use a cheaper type of storage, or what is called cold storage. And in case you want to restore any of the archived artifacts, you will be able to search for them and recover them back to the repository they belong to.
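Since cold artifact storage is only heading into beta, there is no public format to show yet; the following is a purely hypothetical sketch of what a metadata-driven archive policy might express, with every field name invented for illustration.

# Hypothetical only: the policy format and all field names are assumptions, not a shipped API.
cat > archive-policy.json <<'EOF'
{
  "name": "archive-superseded-releases",
  "criteria": {
    "repositories": ["libs-release-local"],
    "lastDownloadedBefore": "2y",
    "properties": { "release.status": "superseded" }
  },
  "target": "cold-storage"
}
EOF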
The retrieval action can be done by any user that has the right permissions, and by that we are basically enabling self-service artifact retrieval, which again reduces the load on the administration team. And cold artifact storage will be available for beta during this quarter.
If you want to take part in the beta, please contact us. And finally, I would like to speak about delegation.
For those of you that have been with us at the previous SwampUP, you heard us speaking about a new upcoming capability called projects. And I'm super happy to share that over the past year we've been working on developing and implementing this feature, and it is now released and available for you to use.
So what are projects and why do you need them?
As I mentioned before, there are those super DevOps teams that are managing all of your CI/CD tools, whether it's the version control, the issue tracker, the CI/CD servers or the artifact management. And those teams are lean, they usually serve a very large number of teams and projects, and they have a lot of tasks, so often they become a bottleneck. And as I mentioned before, there are ways to help those teams, such as delegating the work or providing self-service capabilities. And this is exactly what we want to achieve with projects. Projects basically provide a single scope for managing resources and permissions across the JFrog platform, allowing platform administrators to delegate the management to the project teams.
So let’s see what we have in projects. And as you can see here, every project has a unique name, and also a unique key that identifies it. And within every project, there are resources that are managed in the scope of this project.
This includes repositories, release bundles, reports, pipelines, and basically everything you need for your day-to-day work. And the names of those resources are also prefixed with the identifier of the project in order to uniquely identify them and distinguish them from other resources in other projects.
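As a sketch, creating such a project through the Access REST API looks roughly like this; the endpoint exists, but the hostname is a placeholder and optional fields beyond the project key and display name may differ by version.

# Sketch: hostname is a placeholder; verify optional fields against your platform version.
curl -s -X POST "https://myjfrog.example.io/access/api/v1/projects" \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "project_key": "acme",
        "display_name": "Acme Payments",
        "description": "resources and permissions for the payments team"
      }'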
Resources can be shared between projects. So if, for example, team A has a resource that it needs to share with team B, they can make it a shared resource and share it with team B.
There are also global resources and those resources are ones that need to be used by all teams.
So for example, if I have a remote repository that’s looking at the Docker Hub, I can create it as a global repository and share it between the different teams. And finally, let’s speak about project roles because this is what actually allows you to have the delegation. So we have three types of roles for projects.
There’s the platform admin, and platform admins are responsible for the overall administration of the platform. And as such, they also have the permissions to create projects.
Then there are the project admins, and project admins are responsible for one or more projects. Within this responsibility, they are capable of adding members to the project, creating resources, or changing the permissions on existing resources that are in the project. And finally, we have project members, and project members are capable of seeing and viewing the resources that are managed within the scope of the project. They gain permissions depending on the role and the permissions that were assigned to them by the project admin. And by having those different types of roles, platform admins can delegate project-related work to the project admins, and by that reduce the load that they have on their teams. And as mentioned before, projects is already out there in GA and ready for you to try and use.
So, to summarize, we've seen a lot of new capabilities and features discussed in this session, and you probably ask yourself, when can I get all of it? I'll start with the private distribution network that Yoav mentioned; this is in beta, so if you wish to join the beta, please contact us and get a chance to try it for yourself.
The software composition analysis for sources is something that's coming later in Q2, as well as the signed pipelines and scoped tokens; all of these will be available by the end of Q2.
Federated repositories is already out there and I encourage you to try it out for yourselves.
Cold artifacts storage is also in beta so again, if you want to join the beta, please contact us and give it a try. And finally, projects are already out there and you can try them for yourself.
So thank you very much, I hope you enjoyed this session, and enjoy the rest of the conference.
Thank you.