C and C++ DevOps – It Keeps Getting Better

July 7, 2020

< 1 min read


The basic tools and a complete DevOps workflow for C and C++ languages: https://jfrog.com/webinar/modern-c-c-…

Right now is an exciting time for DevOps engineers who automate builds for C and C++. A wave of new features has hit the Conan package manager over the past year, dramatically improving the capabilities and reducing common pain points. In this session, you’ll find out the benefits of managing your C and C++ builds with Conan. You’ll get a crash course on the new major features and the problems they solve.

Speakers

Jerry Wiltse

Senior Software Engineer - Conan Team

Jerry Wiltse is a Senior Software Engineer of the Conan development team, and has been dedicated to build engineering for C and C++ since 2016. He is an avid open-source enthusiast but also deeply focused on enterprise development environments. He is also the narrator and creator of the Conan learning track on the JFrog Academy.

Video Transcript

Hello, and welcome to DevOps for C and C++
it keeps getting better.
My name is Jerry Wiltse, and I'm a Conan developer at JFrog.
Please feel free to ask questions in the chat throughout this presentation,
because I’ll actually be available in the chat
to interact with you while this recording plays.
I have to say that I’m very excited to give this talk today
because there are some amazing new and innovative
features that can dramatically improve
the use of Conan in CI pipelines.
So let’s get started.
We’ll introduce these new features in the context of the
problems they solve.
We’ll first discuss the problems,
then talk a little bit about the new features that
aim to make things better
and finally, explain how those features actually work.
So let’s start with a question.
What’s the biggest challenge in C and C++?
I’ll narrow the scope of this question a bit by adding
for professional development teams at large scales
with multiplatform applications.
Based on the feedback we see at Conan,
one nomination has to be automated build and test.
That’s going to be the focus of our talk today
but we’re not alone in this perspective.
The keynote at CppCon 2017
addressed this topic as well.
It was titled, “Live at Head”
by Titus Winters from Google.
That presentation was so relevant to our talk today, we’re going to summarize some of the key points made there.
First,
changes to libraries which affect the code or the build of code
can break consumers.
The question, is this small change going to break some consumers downstream,
is effectively unknowable in many situations.
So in general,
C and C++ developers have
dealt with this by broadly recompiling and relinking after most changes.
If it sounds like an old story, that’s because it is.
People have struggled to
and talked about this for decades.
I’ve given related talks in years past and
so have countless others
and there are a lot of reasons for why this
problem has been so difficult to solve,
and I've listed a few of them here.
But unfortunately,
the best advice from 25 years ago
stands today.
To be safe
rebuild your consumers after every change to a library.
Other language ecosystems have a de facto standard
strategy for managing breaking changes;
it's called semantic versioning, or SemVer.
So why hasn’t this solved our problems for C and C++?
Well, actually, it has solved some of our cases fairly successfully.
SemVer has found purchase in a few aspects of C and C++
for example, virtually all major open-source libraries are released using SemVer,
and many private enterprise code bases
have also used it for internal components as well,
so it does need to be considered
and supported by tools such as Conan.
But what’s the essence of SemVer and why
does it fall short in some cases for C and C++?
The theory behind SemVer is simple.
Assuming you can organize your source code
into discrete, testable units,
then you can build and test that code in separate CI jobs
and then your consumers
can be updated to use newer versions of their dependencies
when they’re ready.
That is, when someone takes the time to go and update
and bump their versions as well.
So this unlocks libraries to evolve at their own pace,
and avoid having to wait for all consumers to be updated.
So it’s good in many cases.
If you think about it, that's really the only way an open-source ecosystem can work.
It works here because most open source projects have distinct teams
that maintain them.
and it’s pretty difficult to imagine
all the inter-connected open source libraries
in the world, evolving together.
So, while it does work
fairly well for this case,
there are plenty of trade-offs;
for example, some open source libraries are
poorly maintained, and some are abandoned.
So maintainers may not update their projects to use the latest versions of
all their dependencies in a timely fashion,
or in some cases not at all.
When a library is stuck on a very
old version of another library,
this can cause major problems for consumers.
Also, some truly problematic releases of software,
for example, ones with security flaws in encryption libraries,
might stay in the wild forever.
And so, there are plenty of other trade-offs as well.
At the same time,
we can say that SemVer does work, and that it does not work
in practice;
both statements are true.
It all depends on context.
So, in the context of some video games,
monolithic applications,
and large code bases like the one at Google,
SemVer is regarded as completely inadequate and useless.
Why?
What’s so different about this context?
Well, in these contexts, evolving everything together
is the goal
letting consuming code continue to use an old
API or function
is completely unacceptable.
So developing in this context has trade-offs as well.
For example, if a particular piece of
code is delicate or extremely complex
and calls into a library you’re trying to evolve
it can be very difficult or dangerous
to make any kinds of changes.
It can also be difficult to test or prove something
isn’t broken somehow after a change you make.
So, this kind of problem can actually slow down
the development pace of the entire project.
Once again, there are many trade-offs
made in these kinds of development contexts; these are just a few.
So, neither strategy is perfect.
We just wanted to shed more light on why development teams
really do have a legitimate need
for non-SemVer development scenarios.
Earlier, we said
that the end of the CI story for these cases was pretty sad.
Generally, CI strategies resort to building their applications
and their dependencies from source
liberally, to ensure correctness and compatibility.
However, this is so painfully time consuming
and inefficient that development teams have continued to look for more efficient solutions.
In fact, this has been one of the things
which has brought many developers to Conan in the first place.
So, can Conan offer a CI strategy which provides
equivalent compatibility guarantees to building from source,
but without the extreme inefficiency and overhead?
The answer is yes.
In the past year, Conan and JFrog
have invested significant research and development effort
to address this use case.
The features are largely
based on discussions with customer development teams
trying to implement Conan into their CI.
So this is a good opportunity to mention just how valuable customer feedback is
and just how dedicated the Conan team is
to addressing that feedback.
So, exactly how are things getting better?
One major improvement in recent times
is that the Conan team
has a much better understanding of a few things than it did in the past.
One very clear realization
was what we talked about earlier.
SemVer is not a solution for all cases.
This was a challenge for Conan
because it was definitely developed with SemVer in mind.
Another realization was that
these teams still want the other benefits of using Conan
even without SemVer.
So, during that same time, another realization occurred:
multiple problems and feature requests
that different Conan users were reporting
were all found to be facets of the same problem.
First, Conan Center users were struggling with
binaries being broken unexpectedly after re-builds.
Also, enterprise users were asking about
how to evolve open-source packages they brought in house
and they didn’t want to change the version number and they
were looking for alternatives
but this made it clear that revisions
are even relevant in conjugation with SemVer
So, they’re not mutually exclusive, but in fact they’re complementary at times.
So, finally, rather than continuing to consider this collection of
problems out of scope,
the Conan team decided to tackle it
and give users a first-class experience around it.
So from these advancements and understandings
of CI use cases and challenges
have come new features.
The first feature is actually a new, fundamental building block
of the package model
It's an opt-in feature, so all the related behaviors are off by default,
and it's called revisions.
Revisions ensure that every time a recipe or binary is exported or built with a change,
the new binary or recipe
doesn't overwrite the old one,
even if it has the same package ID.
Instead, a new hash is produced
and the new binary or recipe
is stored in a new folder.
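To make that concrete, here is a minimal sketch of how revisions might look from the command line, assuming a Conan 1.x client; the package name and version are illustrative, and exact flags can vary by Conan release:

    # Opt in to the revisions feature (off by default in Conan 1.x)
    conan config set general.revisions_enabled=1

    # Export a recipe, change something, then export again; each export
    # produces a new recipe revision instead of overwriting the old one
    conan export . zlib/1.2.11@
    # ...edit the recipe or sources, then:
    conan export . zlib/1.2.11@

    # List the revisions that now coexist for the same version
    conan search zlib/1.2.11@ --revisions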
So, when I first learned about this feature
my immediate reaction was concern.
Conan already creates so many binaries,
and if it starts storing and creating revisions of each binary,
I didn’t understand how one would ever
configure a CI job
to specify the correct
versions and the correct revisions of each dependency and each binary
it seemed unapproachable to me.
So it turns out that a related feature called lockfiles
is the answer to these questions.
The lockfiles feature works together with revisions to enable a
complete CI strategy which is free from SemVer.
So, although other languages have lockfiles,
implementing them for C and C++
in Conan was very challenging.
If you're not familiar with lockfiles,
the general principle is simple to explain:
at the start of the build, you can lock the entire dependency graph.
Then you
provide a Conan reference,
either a path to a conanfile or just the full reference,
along with a profile, and you get back a lockfile named conan.lock.
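For illustration, a minimal sketch of that step with the Conan 1.x CLI; the recipe path and profile name are assumptions, and flags may differ between Conan releases:

    # Lock the full dependency graph of a consumer recipe for one profile;
    # the result is written to conan.lock by default
    conan graph lock path/to/conanfile.txt --profile=linux-gcc9

    # Later commands can consume that lockfile so the graph cannot drift
    conan install path/to/conanfile.txt --lockfile=conan.lock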
So, features around these lockfiles are what make revisions usable in CI workflows.
For example, at the end of the build,
we can use the lockfile to generate a build-order file,
and from that build-order file, we can see what packages need to be built next
and trigger the relevant CI jobs for those packages.
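A hedged sketch of that step, assuming the conan.lock produced above; the output file name is illustrative:

    # Compute which packages must be rebuilt, and in what order, from the
    # locked graph; emit the result as JSON for the CI system to consume
    conan graph build-order conan.lock --build=missing --json=build_order.json

A CI script can then parse build_order.json and trigger one downstream job per package reference in each level of the build order.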
So here's a sample lockfile;
notice the full package reference with revisions at the bottom,
labeled _.
It shows the recipe version,
recipe revision,
package ID, and package revision,
all in a single string;
this is considered a full package reference.
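For reference, the shape of such a full package reference in Conan 1.x looks like the following; the placeholders stand in for real hash values:

    # name/version@user/channel#<recipe_revision>:<package_id>#<package_revision>
    # for example:
    # zlib/1.2.11@#<recipe_revision>:<package_id>#<package_revision>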
So now that we’ve explained what revisions and lockfiles are
we can explain the fun part,
how they can actually be used in CI workflows.
So, lockfiles are a bit challenging to think about at first
so let’s start by clearly stating some
requirements and constraints on our CI jobs.
The common requirement across most teams working with us on these features,
was something like the following.
For every non-versioned library change,
rebuild the library,
then rebuild all the consumers of that library
on all the supported platforms,
even when there are multiple PRs coming in
or multiple commits coming in at the same time.
So to make that even more clear,
the entire pipeline should run against each
specific individual change
in isolation separately.
All, once again, without SemVer
for controlling the version.
So the control flow of this process doesn't actually seem that complicated at first.
Every time there's a change,
lock the graph,
then rebuild the changed component,
generate a build-order file from the lock,
and then use the build-order file to trigger downstream jobs.
You just have to be sure to pass the original lockfile
to each job.
When Conan gets the lockfile, it updates it
according to which package was just rebuilt,
so this is how the build order determines what needs to be built next.
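Putting those steps together, a sketch of what one iteration of such a pipeline might run, assuming Conan 1.x and illustrative package, path, and profile names:

    # 1. Lock the graph from the end-node application that consumes the changed library
    conan graph lock app/conanfile.txt --profile=linux-gcc9

    # 2. Rebuild only the changed component against that lock
    conan create libfoo/ libfoo/1.0@ --lockfile=conan.lock

    # 3. Compute what has to be rebuilt next, as input for downstream CI jobs
    conan graph build-order conan.lock --build=missing --json=build_order.json

    # 4. Each downstream job receives the updated conan.lock and repeats
    #    steps 2 and 3 for its own package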
With this approach, we actually meet
all the requirements and satisfy all the constraints we mentioned earlier.
We're not relying on SemVer to address the new binaries,
any concurrent changes are isolated
and built and tested properly
by virtue of the lockfiles containing the full package references,
and everything is reproducible
given the original lockfile.
So this sounds great, and it really is great.
However, it’s important to pause and give consideration to the implications
of using lockfiles as the basis for your CI strategy.
If you’ve implemented CI with Conan before
these lockfiles will certainly require you to change and rethink some of your assumptions.
Also, you generally just can't add lockfiles to your existing build
jobs and build strategies;
you'll have to re-tool.
To really understand how the lockfiles are designed to work
we have to talk about the difference
between two different perspectives
on our CI jobs.
That is, starting nodes first
versus ending nodes first.
And by first, I don't mean first to build.
So, let’s define those terms a little more clearly.
These are not official mathematical terms
However, we did derive them based on some other terms which are official.
You see, lockfiles are related to the field of mathematics known as graph theory.
So let’s quickly define some terms specific to that.
The acronym DAG, pronounced "dag,"
stands for Directed, as in one-directional,
Acyclic, which means it doesn't have any loops,
and Graph, which is a set of relationships.
So, the image on the right is a visual representation of a DAG of Conan packages;
this can be generated for any recipe with the conan info command.
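For example, a small sketch of generating that visualization with the Conan 1.x CLI; the recipe path and output file name are assumptions:

    # Render the dependency DAG of a recipe as an HTML page (a .dot file also works)
    conan info path/to/conanfile.py --graph=deps.html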
So the word node in graph theory
would generally refer to the Conan packages in this DAG.
The lines with the arrows in this
image,
indicate which packages depend on which other packages
So, the arrow represents “depends on”
So hopefully now we can more easily define the terms we used earlier.
Our focus today is the case of a library change
where we want to rebuild and test all our downstream consumers.
So here we see that the starting node
is the library that changed.
In CI, that will kick off a job, which will lead to a chain of jobs
which eventually re-build and re-test all the consumers.
We call the final consumers the ending nodes.
So specifically, we're talking about those consumers
which have no consumers of their own.
That's what we call an ending node.
So, here's a sample of a build-order file
that was generated after a change to Zlib.
If you read the file from the top down, it makes the most sense.
You would next rebuild OpenSSL
and Boost.
Once Boost and OpenSSL are done,
then Poco could be rebuilt.
Once Boost and Poco are both done,
then we could rebuild our app,
which is shown in the image as conanfile.txt.
So in this graph,
Zlib is the starting node
and conanfile.txt would be the ending node.
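Conceptually, the levels in that build-order file express something like the following; this is a purely illustrative summary, not the actual JSON layout of the file:

    # level 1: openssl, boost            -> direct consumers of zlib, can build in parallel
    # level 2: poco                      -> built once openssl and boost are done
    # level 3: the app (conanfile.txt)   -> built once boost and poco are done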
When we say, “starting-nodes-first thinking”
what we’re describing is a habit of
planning everything in your CI around the starting nodes.
In our example case, that would be Zlib.
So, if you know one of your projects, or
multiple products, depend on Zlib,
it's pretty intuitive to think this way.
You think,
“I need to start by creating a CI job for Zlib
to build all my binaries for all my supported platforms
and then once that’s done, I can set up a job to build
all the packages which depend directly on Zlib”
and then work your way through creating CI for each of the dependencies in order, that way.
It makes sense for a lot of reasons.
If you use Jenkins, you might create a Jenkinsfile that looks like this.
You can add a stage for each supported platform,
assuming you know what all of those are at the start.
This is how we did it in the open-source packaging world,
this is how the Conan team did it for Conan Center,
so it's very intuitive and conventional; it's how most other language ecosystems set up their CI jobs.
Unfortunately,
this approach has largely unacceptable consequences
again, at scale.
If you want all the jobs
to use the right package IDs and the right binaries,
you have to get everything right.
Every profile name, every Conan option
has to be correct in every Jenkinsfile.
By correct, we mean consistent
with every downstream job, including the end nodes.
It really and truly does not scale past a few hundred jobs,
and even at that scale, it's already unmanageable.
Furthermore, this approach doesn't give us
any of the benefits that we want
from revisions and lockfiles that we were talking about.
So, it’s unacceptable on many fronts
and we don’t recommend this strategy moving forward.
Many teams implementing CI with Conan started with this strategy
and have eventually come to us with the feedback that they've really struggled.
So, fortunately, there is now a new approach
that we can recommend
to solve this problem.
So, now, we’re going to try to plan
a similar CI setup for the same application
but we’re going to think about it in terms of the end nodes first.
Our application.
It might seem less intuitive, but let’s try it out.
So, the following story is actually easy to relate to.
We’re developing an application locally, we know how to build it, including the options we want.
We have a profile defined and we want to use it, so,
in theory,
we just want our CI to do what
conan create
with --build (build everything) does,
but in a more multi-job, multi-platform, parallelized way.
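In other words, the CI pipeline needs to reproduce, in a distributed way, something like this local command; the package reference and profile name here are assumptions:

    # Build the app and all of its dependencies from source, locally, in one shot
    conan create . app/1.0@ --profile=linux-gcc9 --build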
Most developers are familiar with this situation
you get something to run locally,
now you want to put it on CI to run.
So fortunately, that familiar experience
is very compatible with how lockfiles work.
It turns out that lockfiles require us to approach the problem in this end-nodes-first way.
Most people immediately get the general reasoning for lockfiles,
they want to lock a dependency graph
so that other changes don’t affect
the build that you want to run.
But most people don’t realize that this implies a very specific perspective
for it to be useful.
For example, if we do a graph lock from Zlib's perspective,
we actually accomplish nothing.
It has no dependencies so nothing will be locked.
What we really mean to do is to lock all the dependencies
from the perspective of an end-node in the DAG.
Another thing that's important to realize is that lockfiles
must contain
all options for all of the packages in the DAG.
So, remember that options can affect
what packages are included or excluded from the dependency graph
so they’re actually part of the dependency graph calculation.
So, it’s really important
that we have all the settings and options relevant to
all the packages when we do the lock.
So, interestingly, the profiles get embedded in the lockfiles
which also implies that we’ll need to be able to
define the list of builds to run
dynamically in our CI files.
This might be, for example, dynamic stages if you’re using Jenkins
or other CI systems have similar constructs.
We’ll talk about why this is in a moment.
So, our earlier explanation
about how lockfiles work in CI was a bit oversimplified.
Let’s add a little more detail
regarding starting nodes and ending nodes now.
At the start of any job which builds a node in the graph,
whether it's Zlib, OpenSSL, or Boost,
we need the lockfiles from all consuming end nodes to do the build properly.
So we need to get all the end-node lockfiles
in order to know the sum total of all the profiles we need to build in each node.
So, since we get that list during the job,
and that list could change over time,
this is where we need the dynamic builds.
This idea of dynamic builds in a CI job can be tricky,
but the good news is that it actually solves some of the other problems we mentioned earlier,
namely, consistency across many Jenkinsfiles, and scalability.
The bad news for people with existing deployments of Conan in their CI
is that they'll probably be required
to do a significant refactor of the CI builds
across their environment to implement lockfiles.
As we said earlier,
you can’t just add lockfiles to existing jobs
without re-tooling them.
So, after we have all the lockfiles for all consumers,
we can rebuild our starting node
and then begin triggering downstream node after downstream node
until we reach the ending node.
And remember that we pass the lockfile to each job
so that it can be updated properly.
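A hedged sketch of what one of those downstream jobs might do, reusing the lockfile handed down from the upstream job; the package reference is illustrative:

    # Rebuild this job's own package strictly against the lockfile it received;
    # Conan records the newly built revision back into conan.lock
    conan create . poco/1.9.4@ --lockfile=conan.lock

    # Re-derive what is still left to build, and hand the updated lockfile
    # and build order on to the next set of jobs
    conan graph build-order conan.lock --build=missing --json=build_order.json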
So, in summary,
this talk was a firehose of information.
I don’t expect anyone here to fully understand how to implement lockfiles in their CI, today.
I spent weeks working through the concept with real
environment scenarios and it’s very difficult, it’s a complicated topic.
However, my hope is that you now understand the scale of these new features,
and how they may or may not be relevant to you in your own deployment.
For organizations which need to be able to test changes
without bumping versions,
finally, we have a supporting mechanism.
If they also need to be able to build multiple PRs at the same time,
finally, we have a supporting mechanism for that, too.
So, I'm out of time, but it's important to point out
that the Conan team is continuing to work on
and improve these features with every release.
So stay tuned in the coming weeks and months
for additional features
relevant to Conan
and DevOps for C and C++.
So, if you're exploring Conan and CI in a professional setting,
we also want to suggest that you attend one of our CI/CD training sessions.
We go through extensive exercises using revisions and lockfiles
in realistic CI environments,
and if you want information about this,
please send an email
to Conandays@jfrog.com.
In either case, we also encourage you to contact us
very early on in your journey, whether you’re doing a POC or an implementation
you can contact us via GitHub issues
or the Conan channel in the CppLang Slack.
Thank you very much for your time,
have a great day, and enjoy the rest of SwampUP 2020.
