Going Serverless, with Artifactory and Containers on Cloud Run
Guillaume Laforge | Developer Advocate for Google Cloud Platform
Ayrat Khayretdinov | CNCF ambassador
On Google Cloud, modern workloads run in containers, in a serverless fashion.
With Cloud Run, you can easily deploy your workloads and have them scale up and down transparently to accommodate traffic spikes as well as low demand. In this session, we'll introduce you to the serverless container world of Cloud Run, show how to build containers with Cloud Build, and see how we can take advantage of JFrog Artifactory for hosting our container artifacts.
In a series of live demos, we will demonstrate different Cloud Run use cases. Legacy monolith, machine learning API, Vault, microservices, or event-based architecture: Google Cloud Run and JFrog have you covered!
Video transcript
Hello everyone, and welcome to this session on the theme of serverless containers and their integration with the JFrog family of products. I'm Guillaume Laforge, a developer advocate for Google Cloud focusing on serverless technologies. And with me today is my colleague Ayrat, a hybrid cloud specialist at Google, and also a CNCF ambassador and a Google Developer Expert. You might have heard that JFrog recently expanded its cloud DevOps offering to the Google Cloud Marketplace, where users can easily deploy the end-to-end JFrog DevOps Platform from and onto Google Cloud, to accelerate and secure their application delivery. That's one of the reasons we wanted to tell you about some of the great combinations of Google Cloud technologies, in particular serverless technologies like Cloud Run, which is the product we're going to tell you about, with JFrog's cool products.
Before going further, let's define what serverless is about. There are two ways to think about the concept of serverless: an operational model and a programming model. From an operational perspective, as a user, you're using a fully managed environment: you don't have to deal with hardware, networking, operating systems, and so on. The security aspects are taken care of for you as well.
For example, applying patches for known vulnerabilities: that's something the cloud provider, Google Cloud, handles under the hood for its serverless products. And a key characteristic of serverless is really the pay-only-for-usage model, where cost is proportional to usage.
When there's no traffic, you pay, well, zero. And the infrastructure scales your services up and down as needed, transparently. From a programming perspective, you tend to write smaller-grained services, no more big monoliths.
You often program in an event-driven approach, reacting to various events from your system or from the cloud. And since your services can scale down to zero, you should also program somewhat defensively, in a sense, using a stateless approach, because you never know what happens to your resources once the instance running your workload is shut down. If you look at the spectrum of products offered by Google Cloud, you have things like Compute Engine for VMs and Kubernetes Engine for orchestrating containers, and in the serverless zone you'll find products like Cloud Run for containers, App Engine for web apps and mobile backends, and Cloud Functions for small-grained functions of business logic.
Today, we're going to focus on Cloud Run, which allows you to deploy containers in a serverless fashion. So what is Cloud Run? Cloud Run is a serverless platform that lets you deploy containerized workloads in a fully managed environment. You can use your preferred programming language; it's not a limited choice.
You use whichever language you want, with whatever libraries, binaries, and so on that you want. All the infrastructure and scaling aspects at that level are handled for you by the platform. So you get a great developer experience, because you focus on your application, its business logic, and its code, rather than on setting up all the things that make up a project. Container technology is great for portability.
But if you want the scaling and managed aspects to be portable as well, you'd like the whole platform to be portable. That's why Cloud Run is built upon the open source Knative standard.
Thanks to the Knative standard, you can run containerized applications in the fully managed Cloud Run service, on the hybrid cloud Anthos platform, potentially on other cloud providers, or on any Kubernetes cluster where you've installed the Knative primitives, so you're not locked into the Google Cloud Platform at all.
You've got the choice of where to run and scale your containerized workloads, thanks to Knative. With Cloud Run, as I said, you're not limited to particular languages, stacks, or frameworks. With containers you can use any language, any library or dependency, any binary, and take advantage of the rich ecosystem of base images. So on Cloud Run you get the flexibility of containers and, at the same time, all the benefits and velocity of serverless compute platforms.
Your containerized workload should follow a certain contract, like listening on the port given by the PORT environment variable. It shouldn't take too much time to be ready to serve the first request, and serving requests shouldn't take too long either, right? Also, program in a stateless manner.
For example, the file system is in-memory only, so you're never sure that what's on the file system will still be there for the next incoming request.
Also avoid background activity, as the instance running your workload may potentially be shut down before another one takes over. An instance that no longer receives requests is throttled at the CPU level, or even killed after a little while once its requests complete, so background activity would be killed as well and wouldn't necessarily finish in time.
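To make that contract concrete, here is a minimal sketch in Go (the language the demo's front end uses) of a service that honors it: listen on the port Cloud Run provides via the PORT environment variable, and keep each request stateless. The handler and its message are illustrative, not taken from the session's demo code.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	// Cloud Run tells the container which port to listen on
	// through the PORT environment variable.
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080" // handy default for local testing
	}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Stay stateless: the file system is in-memory scratch space,
		// so nothing written there is guaranteed to survive between requests.
		fmt.Fprintln(w, "Hello from Cloud Run!")
	})

	log.Printf("listening on port %s", port)
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
```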
In terms of resources, you get one to four vCPUs and from 256 megabytes up to 8 gigabytes of RAM. By default, a container instance can serve 80 concurrent requests, and up to 100 instances can be spun up. But those are defaults; the figures are configurable and can be increased.
Speaking of the operational model, we said that Cloud Run is pay-per-use. So the more CPU you use, the more memory you use, the more requests you receive, the more you pay: cost is proportional to usage. And it's billed in increments of 100 milliseconds, so it's quite granular. You're not going to pay for one full second if a request takes only 100 milliseconds to finish, right?
Additionally, billable time is not simply summed per request. Say you have two concurrent requests: would you pay for the total execution time of both? No, it doesn't work like that. You're actually billed from the beginning of the first concurrent request to the end of the last one. For example, if one request runs from 0 to 400 ms and an overlapping one from 200 to 600 ms, you're billed for 600 ms, not 1,000 ms. So it's pretty interesting in terms of pricing, and be sure to take advantage of concurrency, of course.
Speaking of concurrency, as you saw in the previous graphic, unlike some other serverless platforms from our competitors, or even Cloud Functions, where the level of concurrency is one, meaning one instance serves just one request at a time, a Cloud Run instance can receive 80 concurrent requests by default, and up to 250 if you configure it that way. This also helps cope with incoming traffic peaks and scale gracefully without too many cold starts.
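One practical implication of concurrency, as a hedged sketch rather than anything shown in the talk: since a single instance serves many requests at the same time, any shared in-process state must be safe for concurrent access. The request counter below is hypothetical, just to show the pattern in Go.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"sync"
)

// With concurrency above 1, one instance serves many requests at once,
// so shared in-process state needs synchronization.
var (
	mu    sync.Mutex
	count int
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		count++
		n := count
		mu.Unlock()
		fmt.Fprintf(w, "request %d handled by this instance\n", n)
	})
	log.Fatal(http.ListenAndServe(":"+os.Getenv("PORT"), nil))
}
```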
That's pretty important to think about for serverless platforms. Alright, what else? That's already a lot, but there's more. Cloud Run is available in tons of regions around the world, and if you pair Cloud Run with the Google Cloud load balancer, you can expose a global endpoint that routes user requests to the closest region, which helps you stay resilient to regional outages.
It also helps serve user requests with the minimum latency possible, as users are routed to the closest instance of your service in the closest region.
You can take advantage of things like Cloud CDN for caching assets, to reduce the load on your service and improve performance. You also have integration with Cloud Armor for DDoS protection, for filtering traffic by IP address or geography, and for defining other firewall rules.
In terms of deploying your Cloud Run services, you can create new revisions of a service, and you can use that mechanism for various deployment scenarios.
For example, blue/green deployments, canary releases, or A/B testing: you can define the percentage of traffic split between several revisions, and even use tags to easily identify revisions and route traffic to them, so it's pretty effective. You also have access to the Google Cloud VPC and, through it, other Google Cloud services, for example Memorystore (Redis or Memcached). And if you want to access virtual machines on Google Compute Engine, you can take advantage of VPC access.
There's also shared VPC across projects. What else? In addition to Cloud Run, or complementary to it, if you need to orchestrate various Cloud Run services, or even third-party APIs, you can use Cloud Workflows. It's a fully managed service that is serverless too, and it lets you easily chain service calls and handle retry policies; it can really sort out the mess of microservices spaghetti and coordinate services through event-driven approaches. It's a great fit, especially if you're able to define your business processes as a flowchart, let's say: you can implement that flowchart as a workflow in order to coordinate a fleet of services that should do something together. I mentioned cold starts earlier.
There's another thing that's useful for avoiding cold starts: defining a minimum number of instances that are warm and ready to serve new incoming requests. If you look at the traffic in blue, for the parts below the dotted green line, you would usually have had a cold start for requests coming in during those periods.
But with a minimum number of instances, you're sure there's always an instance ready to serve new incoming requests, even when there isn't enough traffic to keep instances running. So it's really useful for avoiding most, if not all, cold starts for your applications.
What else? Graceful instance termination. When your Cloud Run services scale down after a traffic peak, your container instances are shut down. But this can happen gracefully: you're notified of instance termination via a SIGTERM signal, which gives you a few more seconds to clean up resources that might still be open, like database connections, temporary files, or cached data.
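As a sketch of what that can look like in Go: trap SIGTERM, drain in-flight requests, then release resources before exiting. The 10-second timeout and the cleanup comments are illustrative assumptions, not details from the talk.

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":" + os.Getenv("PORT")}

	go func() {
		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
			log.Fatal(err)
		}
	}()

	// Cloud Run sends SIGTERM before shutting the instance down.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, syscall.SIGTERM)
	<-stop

	// Drain in-flight requests, then clean up: close database
	// connections, flush caches, remove temporary files, etc.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("shutdown: %v", err)
	}
}
```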
Beyond handling the usual HTTP requests, which is what Cloud Run is about, you can also work with gRPC APIs: calling them, or implementing them, in the sense of being a gRPC server. And beyond that, there's server-side streaming: you can do gRPC streaming, use server-sent events, HTTP/2 streaming, and WebSockets, for really real-time, interactive kinds of applications.
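For instance, here is a minimal server-sent events handler in Go, one way to exercise that streaming support. The endpoint path and the "tick" payloads are made up for illustration.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
	"time"
)

func main() {
	http.HandleFunc("/events", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")

		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}

		// Emit one event per second; Flush pushes each event to the
		// client right away instead of buffering the whole response.
		for i := 1; i <= 5; i++ {
			fmt.Fprintf(w, "data: tick %d\n\n", i)
			flusher.Flush()
			time.Sleep(time.Second)
		}
	})
	log.Fatal(http.ListenAndServe(":"+os.Getenv("PORT"), nil))
}
```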
Wow, that's a lot of things, right? So that was really a bird's-eye view, or a frog's-eye view, of what Cloud Run offers developers and customers. Now, enough talking; it's time for real, concrete, hands-on demos, where my colleague Ayrat is going to show you the integration with Artifactory and other cool JFrog integrations. Thanks a lot for your attention, and over to you, Ayrat. Thank you.
Thanks Guillaume.
Hi, my name is Ayrat. I'm a CNCF ambassador from Canada working at Google Cloud. Happy to be here at swampUP 2021 and to share some exciting technology with this amazing community. Guillaume covered well what Cloud Run is and its features; now let me show you a few demos of how you can easily deploy your applications on serverless Cloud Run, and how JFrog technologies can help us here. First, I'm going to deploy a simple application on Cloud Run and show how Cloud Run handles traffic and takes advantage of concurrency. So I'm here in the Cloud Run menu of the Google Cloud Console, and I'm going to create a new service and give it a nice name. Now I have the option to either provide a Docker image or point to my GitHub repository. If you have a repository with your code, even without a Dockerfile, Cloud Run can use Cloud Native Buildpacks to automatically build your images and deploy them to Cloud Run. In my case, though, I'm going to use my own image that I already prepared for this demo. And here there are some advanced settings if you want to play around: you can set up memory, CPU, and whatnot. In my case, I'm just going to set the concurrency to one.
You can configure variables, secrets, connections to databases and whatnot, and some security features, but I'm just going to go ahead and deploy this application. You can see the application getting up and running.
We provide some views of your metrics and logs. Alright, the application has been deployed, and we can see its HTTPS endpoint here: a TLS certificate is provisioned automatically, and the application is up and running. That's great. But this was really meant to show you how the traffic works, so I have here a visualization of what's happening behind the scenes with this service. As you can see, right now it serves zero traffic, which means you're not paying for anything right now.
Now I'm going to use a Linux load-testing utility to send a burst of load: let's say 10 seconds at 20 requests per second. Let's see how our Cloud Run service behaves. You can see that our service immediately scales up to 30 containers, then right away scales back down to zero. And that's essentially what you're paying for, which I think is awesome.
Think of how many applications you have today, maybe running on Kubernetes or on-prem, that sit there doing nothing while you pay for them, right?
Cloud Run lets you scale to zero when there's no traffic, and when you get a burst of traffic, it quickly scales the system up without any, you know, packet loss. So that's scaling to zero and the traffic behavior. Now I want to show you another nice feature, called concurrency. Concurrency is a pretty unique feature of Cloud Run; it doesn't exist on Kubernetes. It allows you to serve multiple concurrent requests from the same container, and hence save some costs. So now I'll generate some extra load, this time for 3 minutes, and I'm going to create a new revision of my application. Here I'm going to change the concurrency setting, and I'm going to add a new environment variable just to change the color of my application, so it's going to be green, like JFrog. And I'm going to uncheck the option to serve traffic to this revision right away. So what happens now: the new revision of my application gets deployed, and on the visualization we should see a new container that is green, a new application that is green. Let me shift some traffic; in this case, I'm going to split it 50/50. And let's check our visualization.
Alright. As you can observe in the graph, the higher the concurrency, the fewer containers we need: for our green application we're only using five containers, and hence we can save on cost.
The blue application, which is not using concurrency, currently has to use 25 containers, whereas the concurrent one only uses five, so that's a pretty amazing cost saving right there, right? Okay, I hope you now have some idea of how to use Cloud Run; maybe you want to try it yourself right away. Whether you're building an HTTP web application, a multiplayer game using WebSockets, serving APIs like machine learning models, or doing data processing, Cloud Run has you covered.
There are some areas where we're still evolving, so Cloud Run is still not great for running stateful workloads. However, we do provide the capability to connect to different cloud services and save your state there. Alright. While Google Cloud provides many solutions for building a software supply chain with Cloud Run on GCP, such as Google Container Registry and Cloud Build, we're living in a world that is embracing hybrid and multi-cloud approaches. Many companies still run on-prem, or on different cloud providers, or on multiple clouds at the same time, and they're solving the challenge of deploying applications, ideally in the same way across the board. That's why we partnered with JFrog on a hybrid, multi-cloud CI/CD solution, where Anthos deploys GKE on-prem, on Azure, on Amazon, or in your own environment, and connects the clusters to the Google Cloud Console, and then JFrog takes care of the CI/CD portion. So in our next demo, we're going to walk through a real-life example and show you some workflows that implement that use case.
As for the tools: we're going to use the Anthos platform, which will deploy Cloud Run for Anthos on GCP.
For automated software delivery, we're going to use the JFrog Platform running on GCP.
We're going to leverage JFrog Pipelines for the CI and CD portions, JFrog Artifactory to store our artifacts and Docker images, and finally JFrog Xray to make sure our code is safe and intact.
The basic use case: a developer makes a pull request to the GitHub repository, which triggers the pipeline. In this case we're using JFrog Pipelines; it runs the code quality checks, builds our image, scans it, then pushes it to JFrog Artifactory. In the end, the deployment step of JFrog Pipelines kicks in, and we roll out our image to Cloud Run for Anthos. As for our demo application, we're using microservices this time: an online store application.
We have different components here: some cache, a front end, and so on. So let's go over to our Google Cloud Console, and we can see that the services have already been deployed. The difference from before, where we ran on the fully managed Cloud Run platform, is that this microservices application is deployed on Cloud Run for Anthos.
As we see here, there are some clusters involved. So let's open the front end and see what it looks like.
The application is up and running. But since we're at a different kind of conference, we're getting huge demand for frog bikes. So let's try to update our microservices application and get some JFrog bikes going here. To change this, we actually need to look into our source code.
We can see here all the sources for our microservices. Going into the front end, we can see that it's written in Go, and our images are referenced somewhere here. So we need to take this city-bike image and replace it with the frog bike.
Let’s go for it.
Copy… And we're going to need some commands here. We add our changes, make sure we have a commit message so we know what we're committing, and push it to our GitHub repo.
I'm going with the basic workflow here, without a pull request, because I'm pretty much working alone here. So what we should see now is our pipeline kicking in.
Let's check JFrog… So this is the JFrog Platform running on GCP, and we have Artifactory here, and Pipelines. Let's look into Pipelines. What we have here are some integrations: we're using the Artifactory integration to make sure we can push our code and images to JFrog Artifactory.
We're using the Google Cloud integration in order to create our service account and connect so we can deploy the code. We also have the GitHub integration going, because we want to make sure our code gets deployed on a pull request. Now let's check the pipeline itself.
We can see that our pipeline kicked in and is processing. There's a very nice UI here that shows what our pipeline looks like. We have multiple steps: the first builds our Docker image, then pushes it to our Artifactory; the image is also scanned, and if the scan is successful, it gets deployed.
Usually you would have these as sequential steps: scan first, then deploy. In our case, for demo purposes, we've made those two steps run in parallel to speed things up.
So let's look further inside the pipeline and see how things are going. Right now the Docker build has started. Looking at the steps here, it's actually building our image right now, and the next step will be the push. So we'll wait a little bit and see how things go.
In the meantime, I can hop into the source code of the pipeline definition itself. Here we define some resources: the GitHub repository integration and the build info.
Then we have the steps of the pipeline we just looked at. We build our front-end Docker image, push it to docker-local, which is an Artifactory repository, and then execute an Xray scan. And the last part, which was actually interesting for me to configure: right now there's no direct integration with Google Cloud here, so what we're doing is pulling the Cloud SDK with the gcloud command line, logging into Google Cloud, and then executing the deployment steps.
What else do we have? There's also a manifest here. To actually deploy this application, we're using this manifest, which is not a Kubernetes manifest; it's a Knative service manifest, much smaller than the Kubernetes one.
This allows you to declaratively deploy your application. And it's pretty great, because with Cloud Run you now have manifests that you can deploy on fully managed Cloud Run or on Cloud Run for Anthos. So it provides code portability and zero lock-in, as Knative can run on Cloud Run or on any Kubernetes cluster that has Knative installed. Let's go back to our pipeline to see how it's going. The build has finished, and the push part is also finished, so let's check the push.
Yes, here we're pushing our image; then there's the Xray scanning part, and the final part is where we pull the Google Cloud SDK so we can execute the gcloud deployment command. It's running right now, and hopefully it will complete in a few moments.
Alright. Here… let's wait a few more seconds, and we'll see our new revision kick off automatically. And yes, it's just been deployed. The pipeline has run fully, so let's go and see the new release of our application. I'm going to open an incognito window and see how it looks.
Yes, and we have our frog bike up and running. Looks great to me. Now let's go back to our slides and wrap up our presentation.
Thanks a lot. I hope you liked our demos.
Hope you're enjoying swampUP 2021. Here you can find some useful links for Cloud Run that can help you get started.
Please follow Guillaume and me on Twitter, and let us know if you have any questions. Thank you.
See you soon.