Deploying to Google Serverless with JFrog DevOps

Preston Holmes
Outbound Product Manager, Google Cloud
Google Cloud

Cloud Run is Google’s fully managed serverless container platform.

This session explores the several ways to use APIs, events, and webhooks to integrate with the toolset on jfrog.io.


Video Transcript

Hi, my name is Preston Holmes. I’m an Outbound Product Manager in Google Cloud, focused on the serverless landscape. Today, I’m going to talk to you a little bit about the integration of JFrog and our serverless container platform called Cloud Run.

Now, before I get into how these two technologies integrate, I'd note that a lot of what we're seeing in serverless adoption relates to the idea of building more modern applications. I won't go into all of this, but it's important to recognize that the end goal is to get our customers' applications to a state where they can be more dynamic, scalable, and intelligent, and that means adopting a whole set of more modern development practices across the overall software life cycle.

Google has been on its own journey with this for quite some time. You can imagine, in the very early days of Google, how a data center was constantly being reconfigured. This is a view of a relatively modern layout of a Google data center, but for early Google developers who were constantly building and deploying to underlying hardware that was constantly being reconfigured, it was quite disruptive. We really needed to come up with something that would treat the data center as a computer to be targeted for deployment, rather than the individual servers that compose it.

We've actually been using serverless paradigms at Google for quite some time, and we've recognized that serverless infrastructure has the following benefits. One is rapid autoscaling: everything is pre-provisioned and multi-tenant, so there's no need to acquire or secure compute nodes. The overall system is very fault tolerant, with multi-zone deployments of the control plane. Things are self-healing: as individual machines or container instances become unhealthy, they are automatically replaced. Then there's a whole set of supporting services, which I'll talk about a bit more later.

To this core compute infrastructure, we bring a highly capable connectivity layer. This includes Google's global front end system, as well as specific support for advanced protocols such as WebSockets and gRPC.

Now, we've taken this core infrastructure capability and made a couple of stabs at productizing it for the public. This began all the way back in 2008, predating Google Cloud itself, with App Engine. App Engine was born in the era of the highly opinionated PaaS: a highly curated, batteries-included, fully integrated application development environment. But it did suffer from that opinionation, which imposed some heavily constrained runtime parameters: you could only run certain software and certain software stacks in it.

In 2017, Google Cloud came up with Cloud Functions, our response to the idea of functions as a service. This was after AWS released Lambda, and it's really an attempt to reduce the developer contract to its most minimal and focused form, where a developer only needs to provide the actual body of a function rather than all of the packaging and materials around it.

Then in 2019, we released what we feel is really the sweet spot in the serverless space: serverless containers, in the form of Cloud Run, and that's what I'll spend most of my time talking about today. If I had to sum up Cloud Run in just one slide, it would be this: you take a container, meaning any image in the standardized format that Docker made wildly popular and that is now governed by the Open Container Initiative (OCI), and run it directly on Google's infrastructure with as little intervening management as possible. You take a container, hand it to Google, and Google will run it at scale for you.
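As a sketch of that minimal contract, deploying any OCI image is essentially a single command. The service name, image path, and region below are illustrative placeholders, not values from the talk:

```shell
# Deploy a prebuilt OCI container image to Cloud Run.
# "my-service", the image path, and the region are hypothetical values.
gcloud run deploy my-service \
  --image=gcr.io/my-project/my-app:v1 \
  --region=us-central1 \
  --allow-unauthenticated   # expose a public HTTPS endpoint with no auth
```

Cloud Run responds with a fully managed HTTPS URL for the new service; no cluster or node provisioning is involved.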

Now, when we talk about the container, we're often talking about two meanings of the word. The first is the container artifact: the thing that gets built. If you're using JFrog Pipelines and JFrog Artifactory, this is the management of the artifact, both its building and its storage. These are bytes at rest in some sort of storage system, and that's really the container image.

There's also the container runtime environment, and this is what Cloud Run is essentially focused on. I think of Cloud Run almost as a super-duper docking station that you can plug your container artifact into at runtime, and it brings all of these extra features of the Google Cloud platform. I won't cover every one of these today, but just to give you a sense of some of them, I'll go around clockwise here.

The first is that every service you run in Cloud Run is assigned a discrete workload identity. This is an identity that your code runs as, which can be specifically permissioned to access other Cloud APIs. Cloud Run also includes a built-in front end, so it has built-in load balancing and traffic management. I'll show you some traffic management in the demo in a bit, but you can also optionally bring in all the capabilities of Google's broader load balancing product set with the Google Cloud Load Balancer.

It has built-in logging and monitoring, so observability is something you get without any particular configuration or setup. Then you can bring in connections to other things, like our Secret Manager, or Cloud SQL through the SQL proxy, which exposes a SQL database as a local socket. You can manage environment variables outside of the artifact as part of the deployment runtime configuration. All of these things can get plugged into your container while it's running.

Now, we see Cloud Run being used for a wide range of workloads. REST APIs are a very good fit for Cloud Run, since this is predominantly a request-response-based runtime service, but we're seeing increasing usage of Cloud Run in data automation and data engineering: being able to orchestrate based on webhooks, or on other steps that are happening, such as events generated when new files are copied into a storage bucket. At the very end of the talk I'll say a little about some of these integrations and automations with the JFrog platform.

We have Cloud Run deployed in pretty much every Google Cloud region, so it's a globally available service. Each service you deploy in Cloud Run is constrained to a specific region at the time you deploy it, but you can deploy essentially clones of the service in each of our regions and then unite them behind a global load balancer.
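A minimal sketch of that multi-region pattern, with hypothetical service and image names; each regional clone is then attached to the global load balancer through a serverless network endpoint group:

```shell
# Deploy identical clones of one service into several regions.
for region in us-central1 europe-west1 asia-northeast1; do
  gcloud run deploy my-service \
    --image=gcr.io/my-project/my-app:v1 \
    --region="$region"
done

# Each regional service can then back a global load balancer via a
# serverless network endpoint group (NEG), one per region.
gcloud compute network-endpoint-groups create my-service-neg \
  --region=us-central1 \
  --network-endpoint-type=serverless \
  --cloud-run-service=my-service
```

The NEGs become backends of a single global external load balancer, which routes each request to the nearest healthy region.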

I want to talk a little about the built-in traffic management ahead of the demo, because I'll demo it. The idea with traffic management is that as you're building new versions of your application, you want to do some sort of blue/green deployment, or maybe a canary release, and the idea is that you have both versions of your app deployed in a way that gives you control over shifting just some of the request traffic.

We have a scenario where we have our current stable running version; we'll call this the blue revision, A. Then we deploy a new revision, B, and we want to start managing some of the traffic, switching just, say, one out of ten requests over to that new revision. We can do that, then monitor the new revision and observe it for any aberrant behaviors, any regressions that have gotten past our QA process, et cetera. Then we can continue to shift more and more traffic to that version and, over time, fully cut over. With the dynamic scaling in Cloud Run, each of these revisions is going to scale elastically to the proportion of traffic it's receiving. So let's take a look at what this looks like as a demo.
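The gcloud equivalent of that canary flow could look like the following sketch; the service name, image tag, and revision name are assumptions:

```shell
# Deploy revision B alongside the running revision A, without sending it
# any traffic yet.
gcloud run deploy my-service \
  --image=gcr.io/my-project/my-app:v2 \
  --region=us-central1 \
  --no-traffic

# Shift one out of ten requests to the new revision and observe it.
gcloud run services update-traffic my-service \
  --region=us-central1 \
  --to-revisions=my-service-00002-abc=10

# Once it looks healthy, cut over completely.
gcloud run services update-traffic my-service \
  --region=us-central1 \
  --to-latest
```

Each `update-traffic` call changes only the routing configuration; no new revisions are created by these commands.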

The first thing I'm going to do in the demo is show you how quickly and easily services are deployed into Cloud Run. Again, there's no pre-provisioning; this is just an enabled API in a project. What I'm going to do is create a new service, and I'm going to use a container I have pre-built that has some custom instrumentation in it, to share a little bit more about how this scales. I'm not going to use any automatic deployment from source; that would be set up through JFrog Pipelines. I'll deploy this to Iowa. I won't go through all of these settings; we have quite a few. I will allow unauthenticated access, because I'm not trying to put this service behind any custom authorization, and I'm going to change one of these settings down here, which is going to force Cloud Run to scale out the number of containers. In other words, I'm going to force Cloud Run to be less efficient in the way it handles traffic, just so you can see how our system will scale out. And we'll go ahead and start this up.

We see that a service is a long-running thing that consists of a number of revisions. This is going to be the first revision, and we already have a fully functioning HTTPS URL. We can go to it and see that we get a basic JSON response. Now, I have instrumentation for this service that will show you how many requests are being served. I'll pull this into view, and you see that right now this service is scaled down to zero: no traffic is being served, and no containers are running. Let's go ahead and change that by running a little load command against it. We'll send it some traffic for just 10 seconds or so, and you'll see what happens here: the number of containers comes up to handle this traffic, and when the load test completes, it all immediately scales back down to zero. You see that all the requests were handled, and traffic is managed very elastically.

Now, I wanted to show you the traffic splitting. As I mentioned, we have this idea of a blue and a green deployment. Right now, we just have our blue deployment. Let's assume that I've run my pipeline on JFrog, I've built a new artifact, and I'm getting ready to deploy it; I have it staged on the Google side and it's ready to go. I'm going to simulate that by just changing an environment variable. So here I can edit these: I'm going to go into variables, change a label to green, and uncheck the box that says, "serve this revision immediately." What this will do is deploy this changed configuration (in the real world, it would likely use a different container image), and it's running, but not receiving any traffic. If I go back to my main URL here, it still says blue.

I can do something here, which is give this a revision URL. This is a revision-specific URL that is not subject to any of the traffic management. So, if I save this and give it just a moment (the front end programming here means the front end networking, not the console front end), the way in which traffic is routed is very quick to update. Here, I can go to the specific URL for the green revision and see that it is reachable, but not through my main URL.
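In gcloud terms, that traffic-independent revision URL is created by tagging the revision; a sketch with assumed names:

```shell
# Tag the candidate revision "green". Tagged revisions receive a
# dedicated URL of roughly the form
#   https://green---my-service-<hash>.a.run.app
# that bypasses the traffic split entirely, so the new revision can be
# exercised before it receives any production traffic.
gcloud run services update-traffic my-service \
  --region=us-central1 \
  --set-tags=green=my-service-00002-abc
```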

So, what I'm going to do is bring up the visualization again, and I'm going to bring a bit of sustained load. Instead of just 10 seconds, I'll do five minutes; that should get us through this. You'll see the number of blue containers come up, running and serving. Once that's stabilized a little, I'm going to go back and start to do a bit of traffic management. Let's start off with 10%. Again, as this programming change goes into the network, once it's done you'll see that some of our traffic starts to shift to this green revision, and we bring up a container to handle just that 10%.

Each of these revisions is scaling dynamically to the proportion of the traffic it's receiving. You can see here that we're getting pretty close to that 90/10 split. If I manage traffic again to 50% and make another change (again, these are not changing or adding revisions; this is just changing the traffic configuration), we'll see that we're now back close to a 50/50 split. At this point, normally you'd run this for some period of time and make sure you were seeing nothing concerning in your logs, user error reports, or anything else, and make sure that this service is behaving as you would expect. At this point, I think that we're good, and I'll go ahead and switch the rest of the traffic over. Here you'll see that once this traffic switch is completed, the blue revision will scale back down to zero. Again, it's all very dynamic auto-scaling in the overall system. This graph is a custom visualization; that's why I'm using my prebuilt image.

The Cloud Monitoring graph will show you the number of container instances, as well as memory and CPU usage. However, that visualization built into our monitoring tools is much more optimized for real-world operations, which really only care about things at a one-minute resolution, at which this kind of rapid scaling would not be quite as visible. Hopefully that gives you a quick taste of what Cloud Run is and what it can do.

Now that you've got a sense of what Cloud Run is and what it can do, let's talk about how to integrate JFrog with Cloud Run. Both of these are essentially platforms with many capabilities, and some of these capabilities do overlap. When you're looking at two different platforms, you're often presented with a vendor's pre-built assumption about an integration. Sometimes this is very handy; in fact, JFrog has quite a few built-in integrations in Pipelines.

When it comes to more custom integration work, I think it's really nice that both of these platforms have rich webhook support and event generation, as well as really rich APIs. This allows you not to pick a one-size-fits-all way to integrate the two, but to choose a way to integrate the different sub-components that best suits your own workflows.

When it comes to integrating JFrog with Cloud Run, JFrog obviously has its own ecosystem and universe of components that fully manage software building and the overall software life cycle, and generally there's a terminal deployment step that gets the final build artifact out to some sort of runtime environment. In this case, that would be Cloud Run.

Now, with JFrog Artifactory and its support for containers, the natural assumption would be that a deployment should point Cloud Run to pull a container image from Artifactory. However, this isn't as supported as one would think, or maybe as one would like, and I need to go into a little bit about the way Cloud Run actually handles that rapid auto-scaling with containers to explain why this is the case currently.

Cloud Run supports what we call container streaming, and let me get into the details of what that means. This is a quick, high-level architecture view of the internal system of Cloud Run. We've got the API and control plane primarily on the left, the traffic management and front end system at the top, and the core pool of compute, which consists of a number of application server nodes. We've also got the scheduler and a storage system, and, as an example, Google's container registry on the bottom right.

When you deploy a new service, the first thing that happens is that a set of Pub/Sub messages is sent out to each of these components to prepare your Cloud Run service in each of them. One of those steps is to make a copy from our own container artifact management into a custom storage solution. This reconfigures the bytes of that container image into a form that is readable as a block storage device, as opposed to a set of layers.

Then, as a container is started on one of the compute nodes, we use something called gVisor for isolation. gVisor allows us to run this multi-tenant environment without relying solely on the OCI runtime as the way of isolating one customer from another; Docker alone, for example, is not known to be completely sufficient for isolating running containers from each other. Part of what gVisor provides is a set of what are called gofers, which are a way of exposing network and disk I/O to an application at the kernel interface level, letting us supply our own implementations.

What this means is that we're able to take any read or write file system calls by the application and turn them into streaming network I/O requests to this custom storage system. This is generally similar to the way things like persistent disks work, in that you've got a network-mounted block storage system. Because of the way that works, a new container can be scaled out across our fleet very quickly: associating a container image with a particular instance has no more latency than essentially the mount directive. You can simply mount that network storage system into the container very quickly, and we don't have any delays associated with pulling down a container image from anywhere else. Then, finally, once that's running, we can continue with the rest of the process of routing traffic and putting that URL into the front end.

So, what does that mean for our integration landscape here? It means that as part of the steps in your pipeline, you'll want a step that makes what I call an operational cache copy of the container artifact from Artifactory over to our Artifact Registry analog. This isn't using Google's Artifact Registry as a primary artifact registry; the source of truth would always be Artifactory, where you would take advantage of any scanning, QA steps, or auditing you might have. This is essentially about bringing a caching copy of that artifact into the same infrastructure environment as Cloud Run.
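A pipeline step for that cache copy can be as simple as a pull, retag, and push. The registry hostnames and repository paths below are placeholders; Artifactory remains the source of truth:

```shell
# Pull the vetted artifact from Artifactory (the source of truth)...
docker pull mycompany.jfrog.io/docker-local/my-app:v2

# ...retag it for Google Artifact Registry...
docker tag mycompany.jfrog.io/docker-local/my-app:v2 \
  us-central1-docker.pkg.dev/my-project/my-repo/my-app:v2

# ...and push the operational cache copy that Cloud Run will deploy from.
docker push us-central1-docker.pkg.dev/my-project/my-repo/my-app:v2
```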

Now, this is the way to use Cloud Run for deploying artifacts built in the JFrog ecosystem. However, there's a secondary use for Cloud Run that I think is worth pointing out when it comes to JFrog Pipelines, and that is to use Cloud Run as a very simple way to host webhooks, which can be used to build all sorts of custom integration into a pipeline.

Pipelines do support a bash step, and you can get quite a lot done in a very flexible way in that step. However, if you want to write some more elaborate business logic in Python or Go, and you want it hosted in a way that scales to zero and is simply called upon by the pipeline when needed, JFrog Pipelines has an outgoing webhook integration, and Cloud Run makes a very easy and straightforward place to host custom code that can be invoked by those webhooks.
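As a minimal sketch, here is what such a webhook receiver could look like in Python using only the standard library. The endpoint behavior and the `step_name` payload field are illustrative assumptions, not a documented JFrog Pipelines schema; Cloud Run does, however, really pass the listen port in the `PORT` environment variable:

```python
# Minimal webhook receiver suitable for hosting on Cloud Run, assuming a
# JFrog Pipelines outgoing webhook POSTs JSON to it. The payload fields
# used here ("step_name") are hypothetical.
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class PipelineHookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body sent by the pipeline webhook.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")

        # Custom business logic would go here; we just echo a field back.
        result = {"received": True, "step": payload.get("step_name")}
        body = json.dumps(result).encode()

        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__" and "PORT" in os.environ:
    # Cloud Run injects the port to listen on via the PORT env var.
    HTTPServer(("", int(os.environ["PORT"])), PipelineHookHandler).serve_forever()
```

Packaged into a container and deployed to Cloud Run, this scales to zero between pipeline runs and spins up only when the webhook fires.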

So, those are two different ways to integrate Cloud Run and the JFrog ecosystem, though not the only two. As I mentioned earlier, the rich set of events, webhooks, and strong API surfaces across both tools gives you lots of flexibility to make your own integration serve your needs best. And with that, thank you very much.


Release Fast Or Die