Managing Artifacts in a Distributed Landscape [swampUP 2020]

Roman Roberman, DevOps Engineer, AppsFlyer

July 7, 2020

2 min read

AppsFlyer Transforms Its Artifact Management with Artifactory’s Single Source of Truth: https://jfrog.com/blog/appsflyer-tran…

AppsFlyer’s mobile attribution and analytics platform is used by the biggest and most popular applications on Earth, generating a constant “storm” of 100B+ events (HTTP requests) daily on our microservices, cloud-based platform. One of the technology backbones enabling this scale of operations is JFrog Artifactory. With a diversity of artifacts maintained in a Wild West of repositories, from Docker registries and npm to S3 buckets of third-party dependencies, we realized we needed a better way to globally manage our many different platform engineering artifacts and pipelines. Enter Artifactory. AppsFlyer selected Artifactory to help overcome the challenges of managing these many dependencies in a large engineering organization. This talk from swampUP 2020 is the story of how AppsFlyer migrated a live production system, serving tens of thousands of production clients under iron-clad 24/7 uptime SLAs, to Artifactory in order to improve deployment reliability, create a single source of truth for all of our credentials, and return order to the Wild West of our artifact management at scale.

Video Transcript

My name is Roman and I’m going to tell you about our transition from a mess of artifact management systems to Artifactory.
Who am I? I am a platform engineer who has worked at AppsFlyer for almost two years now.
I am part of the Infotools team, which is a part of our DevOps group at AppsFlyer.
A few words about AppsFlyer –
AppsFlyer is a mobile app attribution and analytics platform
that helps marketers measure and optimize their user acquisition funnel.
What this really means is that whenever you click on an ad on your mobile phone or perform an event
inside an application installed on your mobile phone, we receive an event for that.
From that we can derive how ad campaigns for your application are performing,
and whether they are even worthwhile.
There are a lot of features on top of that, but enough about marketing materials for now.
Questions and answers:
you are welcome to send your questions during this talk, and I’ll try to answer them
afterwards in the chat.
Let’s talk a little bit about how artifact management was
an obstacle before Artifactory.
As you can see, we had internal repositories, external repositories,
and we even had S3 buckets that we used as repositories.
We didn’t have documentation, and we also had
a lack of repositories; some package types had no repository at all.
In one sentence:
a real mess.
We had to configure
different repositories for each project,
and some of them were even public repositories.
It looked like this,
a little bit like spaghetti.
What problems does this bring?
Some of the external sources can be unavailable,
and some of the dependencies can be deleted over time if you are using external repositories to pull your dependencies.
We used lots of different credentials for all of the sources,
and it’s hard to keep track of where everything is located and pulled from.
As we were growing, we needed a more scalable solution
for artifact management at AppsFlyer.
So basically, what were our goals?
We wanted to improve the reliability and the speed of our deployments. That is one.
Second, we wanted one place to go, rather than different places to get different artifacts.
And we wanted one set of credentials
for our artifact management system.
Basically, keep everything organized.
And we decided that we would use Artifactory to reach those goals.
Let’s talk a little bit about terminology.
In AppsFlyer’s Artifactory, we use three different
repository types: virtual repositories, local repositories and remote repositories.
Local repositories are physical, locally managed repositories that you can deploy artifacts into.
Remote repositories are just a proxy to a repository located on a different server,
and a virtual repository aggregates other repositories under a common URL.
Basically, most of the time we use the
virtual repositories as our entry point,
since they aggregate all of the local and remote repositories that we have.
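To make the three types concrete, here is a minimal sketch using the generic repository resources of the Terraform Artifactory provider that comes up later in this talk. The keys, URLs, and the npm package type are illustrative assumptions, not AppsFlyer’s actual configuration.

```hcl
terraform {
  required_providers {
    artifactory = {
      source = "jfrog/artifactory" # originally published by Atlassian
    }
  }
}

provider "artifactory" {
  url = "https://artifactory.example.com/artifactory" # illustrative URL
  # credentials would come from the environment in a real setup
}

# Local: hosted inside Artifactory; this is what you deploy into.
resource "artifactory_local_repository" "npm_local" {
  key          = "npm-local"
  package_type = "npm"
}

# Remote: a proxy for an external registry; pulled artifacts are cached.
resource "artifactory_remote_repository" "npm_remote" {
  key          = "npm-remote"
  package_type = "npm"
  url          = "https://registry.npmjs.org"
}

# Virtual: one URL that aggregates the local and remote repositories,
# so clients only ever configure a single endpoint.
resource "artifactory_virtual_repository" "npm" {
  key          = "npm"
  package_type = "npm"
  repositories = [
    artifactory_local_repository.npm_local.key,
    artifactory_remote_repository.npm_remote.key,
  ]
}
```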
So, a little bit about our infra.
Basically, we have two clusters and one Mission Control instance.
We have one cluster in Europe;
this is our main production cluster,
which serves everything, and we have a DR cluster in the US.
Let’s deep dive a little bit.
The machines are provisioned with Terraform;
Artifactory is installed with Chef, with a custom cookbook that we wrote.
As I mentioned, we have two clusters with replication between them,
and one instance of Mission Control. The data is stored in S3,
and we have a cache inside each node for speed.
Mission Control is in charge of the replication that is created as soon as a new repository is created,
and each repository gets its own replication as well.
Mission Control gives us a better look at both clusters, and creates all the repositories
automatically in the US cluster when they are created in the EU cluster.
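AppsFlyer lets Mission Control create these replications, but for a concrete picture, the same Terraform provider also exposes push replication, roughly as below. Treat the resource shape and every value here as an assumption for illustration, continuing the npm example from the earlier sketch.

```hcl
# Hypothetical push replication from the EU cluster to the US cluster.
# AppsFlyer actually relies on Mission Control to create these when a
# new repository appears; this sketch only shows the equivalent idea.
resource "artifactory_push_replication" "npm_local_to_us" {
  repo_key                 = artifactory_local_repository.npm_local.key
  cron_exp                 = "0 0 * * * ?" # hourly sync as a safety net
  enable_event_replication = true          # also replicate on each deploy

  replications {
    url      = "https://artifactory-us.example.com/artifactory" # illustrative
    username = "replicator"  # placeholder credentials
    password = "change-me"
  }
}
```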
And after Artifactory, it looks better;
everything resolves through Artifactory, and we control everything.
Remember how it looked before; now it looks like this. Way better.
Let’s discuss a little bit the POC we conducted before we decided to move to Artifactory
as our production solution for artifact management.
It consisted of several rounds of pulling from Artifactory.
First, it was more of a benchmark. We used 50 nodes, as you can see in this slide,
to pull simultaneously from Artifactory and from our existing registry.
And after that we increased
the scale, to 300 nodes.
Basically, we wanted to… the purpose was to check
that Artifactory wouldn’t have lower performance than our current registries.
We also tested pushing images to one node
in the cluster and pulling them from another node,
and we tested pushing them to one node inside the EU cluster and pulling them from the US cluster.
And we also tried killing and reviving one node, checking that the node revives within a certain time.
Everything was good in those tests.
And what are the added values that Artifactory brings us? Embedded documentation:
no more do we need to ask how to deploy or what to deploy.
Role-based access control:
this is something that we needed. We just connected our LDAP
to Artifactory and all the users were there, with everything set up;
everything was perfect. Retention…
before, we didn’t have retention, and now we can clean up old artifacts
and save money on disk.
And the most important: one interface, one URL,
one place where everything is organized.
I want to take several seconds to speak about Set Me Up,
which serves us as self-service.
Before, we wasted a lot of time on questions like “how do I deploy a certain artifact to a certain repo?”
Now we can send the people who ask this question to the Set Me Up menu,
and it’s like self-service. All the data is there.
And of course, we built operational dashboards with all the metrics we could gather from Artifactory,
and we use them to check on all the clusters.
In case of a production issue, this is the first place we go to check.
Here are more statistics, on Docker pushes…
we have everything. Everything goes to [INAUDIBLE] and is presented there.
Of course, we also parse and ship
all the logs from Artifactory to our central Kibana.
You can see an excerpt in this slide.
Let’s speak a little bit about our migration process.
So the migration was
that we created local, virtual and remote repositories in Artifactory.
The remote repositories we created proxied our old repositories,
so people no longer had to go to all the other repositories;
they could just go to Artifactory and get all the artifacts from the other repositories.
Plus, when you pull from a remote repository,
it caches the artifact in Artifactory.
It means no more going out to external, public repositories:
you pull just once from a public repository, and after that everything is inside our VPC.
It took some time to migrate all the projects, since we had a lot of services
and they were growing like crazy.
But the migration process itself is very easy: it’s just
changing one URL inside the config file.
There is an example in the slide, in the build config. This is it.
And now I want to show a small self-service that we have for repository creation in Artifactory.
Currently, only the DevOps team has permission to create repositories in Artifactory,
and we were getting lots of requests for repo creation, so we wanted to automate and
self-serve the process.
Looking around, we found a nice Terraform Artifactory provider that
Atlassian created, which solved our issue.
We created a repository module
on top of this provider, which uses the resources of the provider.
And basically, we use it as a self-service
for the R&D members to create their own repositories.
Basically, we just need to approve a merge request,
and everything is there: the naming convention, all the logic, the configuration. It’s very simple.
And the testing, which comes from the Terraform plan.
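The talk doesn’t show the module’s source, so here is a rough sketch of what such a repository module might look like. The variable names, the naming convention, and the restriction to local repositories are all assumptions for illustration.

```hcl
# Hypothetical modules/artifactory-repository/main.tf
# (assumes the provider configuration shown earlier).

variable "name" {
  description = "Base repository name, e.g. 'presentation'"
  type        = string
}

variable "package_type" {
  description = "Package type: generic, docker, npm, maven, ..."
  type        = string
  default     = "generic"
}

variable "repo_type" {
  description = "Repository type; only 'local' is sketched here"
  type        = string
  default     = "local"
}

# The key encodes a naming convention, so every repository created
# through the module looks the same, e.g. "local-generic-presentation".
# A real module would handle remote and virtual types the same way.
resource "artifactory_local_repository" "this" {
  count        = var.repo_type == "local" ? 1 : 0
  key          = "${var.repo_type}-${var.package_type}-${var.name}"
  package_type = var.package_type
}
```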
Let me show you how it works. First, demo time.
So basically, we have a repository and we have the module;
you basically need to fill in a repository name, package type and repo type, and create a merge request, as in the sketch below.
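A merge request then boils down to a short module call like this hypothetical one; the module path and values are illustrative, with “presentation” matching the demo that follows.

```hcl
# Hypothetical contents of the merge request: one module call per repo.
# CI runs `terraform plan` on the branch and `terraform apply` after merge.
module "presentation_repo" {
  source       = "./modules/artifactory-repository" # illustrative path
  name         = "presentation"
  package_type = "generic"
  repo_type    = "local"
}
```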
An example of a merge request… you create a merge request and the CI pipeline starts,
does everything it needs to, and in the end
it creates a plan.
For example, our plan here is that we want to add a generic
local repository named “presentation”;
you can see “local generic presentation”.
After the plan is done
and the merge request is [INAUDIBLE], we have another pipeline,
same as before (validate, plan), but now we also have apply.
After the apply stage is done,
the repository is created in Artifactory. And as you can see, the repository is created.
Very good.
We have self-service, and we don’t need to create repositories manually.
Everything is backed up in Terraform,
so even in case Artifactory gets deleted,
we can restore all the repositories with
two commands: plan, and then just apply.
And that’s it. This was very easy.
It is very easy for us to maintain, and
everyone is happy with the solution.
I wanted to talk a little bit about our future plans.
Currently our cluster installations don’t use Docker Compose.
We want to use Docker Compose because, with the way the cookbook was written,
upgrades are giving us problems; so we are currently moving
our cookbooks to Docker Compose, which will make upgrades easier.
And we want to add permissions
to our self-service: permission setup in Artifactory.
That’s it. Thank you. You can ask your questions and I’ll try to answer them.
Thanks
