CodeCraft – BMW’s Driver of “Digital Car” Software Development

Alexander Denk
DevOps Engineer and JFrog Platform Service Owner

BMW has a long history in developing software: from corporate to off-board as well as on-board software, nearly every area is covered.

For more than 20 years BMW has collected experience in bringing software into cars and in using massive backend infrastructures to connect our fleet. Complexity and the number of components are growing exponentially in this field, which a few years ago spawned the need for CodeCraft, a fresh DevOps platform to drive BMW’s development of Electronic Control Units (ECUs), impacting every component from engine to windows, steering and brakes.

The first-ever BMW iX Sports Activity Vehicle®, a digital native, will be our first vehicle that delivers these new software experiences. Beginning with state-of-the-art technologies, we moved to cutting-edge tech stacks to ensure fast and robust development processes under strict safety regulations and to enable our teams to deal with all related challenges.

JFrog products such as Artifactory are an essential and mission-critical part of this service mesh.

Video Transcript

Hello and a warm welcome to the JFrog SwampUP 2021. My name is Alexander Denk and I’m a DevOps engineer at BMW working on our project called Codecraft. I’m the service owner for our JFrog products as well as the deputy service owner for GitHub. My first coding experience was in 2002, when I started out coding in Visual Basic. Since then, I’ve completed my Master’s in computer science, then joined a research company as well as a food startup, and later on in 2018, I finally ended up at BMW in my current position.

This was not by accident, as you can see in the picture above. This is me nearly 30 years ago, sitting in my first BMW car, and this started off my passion for cars. Today, I want to talk about our project Codecraft, which is the driver of digital car software development in taking innovation to the next level. In our past, when you thought about building cars, it was mainly about bending steel, but today nearly everything is digital.

As you can see if you take a closer look at the picture, we have robotics, we have monitoring and logging, we have a completely digital supply chain. But this is not my main focus today, because today I want to talk about the car itself. And for us in the research and development department, which takes care of the digital car, this is mainly about hardware and software.

Today, we have a lot of both in our cars. So we have for example the BMW Operating System 8, which supports remote software updates, is connected with a massive back end, and offers a lot of driving assistance features with which we are already heading towards highly automated driving today. We have a digital key, we have a personal assistant, which supports natural language processing, as well as an HMI which supports, for example, gesture control.

And on top of all that we have a lot of apps. In this example, you can see the BMW head unit, which is an example of massive growth in every area. When we started in the early 90s with our first navigation and entertainment purposes, we had 16 megabytes of RAM in our head unit. Today, it’s nearly 1000 times that. And we see the growth in every area.

We see it in the hardware, we see it in the software, and we now have services and back ends which didn’t exist 15 years ago, and all the services are expanding in every direction. And if you see this slide, which I took from our BMW Group investor presentation: in early May 2021, we announced the Neue Klasse 2025, which is a reference to our own past, to be uncompromisingly electric, digital and circular. And digital for us is one of the goals of the company, not only of our department, it’s one of the goals of the whole company. And to enable that, driving innovation for customers is a crucial part of our mission. However, what about developers? At first, we need to sort a few things out.

Why is the automotive industry different compared to other tech companies? Well, it’s actually pretty easy. As you can see here, we have a history going back a long time. This is not the case for most of the tech companies, which started from scratch, greenfield. But for us, there is already a lot of past which we are carrying around. And we also have one major expression: no experiments. This is also for good reason, because safety is a crucial part. Building cars is mainly about integration and about working with suppliers.

We have long development cycles, far away from CI and CD needs. Basically, we had three to seven years for developing a car in the past, and strong regulations also enable us to have a highly qualified function. So basically, you could say SVN is good enough. However, as we saw in the slides before, we have a digital-first approach, and to achieve this approach, being best in class is a hard requirement. And on top of all that we have new tech competitors like Tesla or NIO. We have more of everything.

As we also saw before, we have a lot of microservices in the background. We have a lot of ECUs in the car handling the different stuff, and on top of that we have a lot of software in and behind the car. We want to avoid bad press: not getting a car into the field because of software failures cannot happen to us, or must not happen to us, to be honest. And it’s also about security, because safety is the one part which prevents people from being killed, but security, on the other hand, is also becoming more relevant; for example, a digital key must not be usable to steal your car. So is SVN, for example, really good enough?

The answer is pretty obvious. No, it isn’t. So, with all that out of the way, let’s come to the future, and this is for us Codecraft. Codecraft is our HI software production plant. It’s basically a collection and integration of several commercial as well as open source services. It’s housing multiple ECU projects, these are from the current generation, which is currently pushed into the market, and also will house the next upcoming generation, the Neue Klasse as you have seen before, which is announced for 2025 and following.

We try to leave the projects as much power of choice as possible, however, we want to offer a unified experience, a unified process. What does this mean in detail? We offer a highly qualified tool chain, we offer best practices, and we want to offer the projects as much guidance as possible. But nevertheless, we do not want them to have to break out of their natural ecosystem. For example, an Android developer may be used to using Gerrit or Workitem and this is fine for us, because if you are forced to use GitHub, it won’t work out as expected.

Nevertheless, we do not want each project to have a completely out of the wild tool chain. So this is pretty important for us to keep everything straight and aligned. And what is the quote for that? We are utilizing 10,000 CPUs for thousands of CI jobs that are running in parallel and this results in multiple tens of thousands of CI jobs per day. And all this enables us to deal with the growing complexity and the growing numbers in every direction. So what makes Codecraft different?

Well, we try to be the best in class. So we need a modern, fast, safe and secure development environment. It must be attractive for developers and must support good collaboration. So how we developed software in the past is no option for us today. As you can see in the pictures above, it can be compared to a plant. We have the parallelism in the plant, we have the collaboration, we have the fun.

About the architecture of Codecraft, I want to give you a short overview of the tech stack and our core portfolio first. So there are no big surprises: we are running Codecraft on top of multiple Kubernetes clusters, we are utilizing VMs, and all that is operated on on-premise as well as off-premise private clouds. We are mostly utilizing standard, state-of-the-art open source software.

So for example, ELK for monitoring and logging, and we also have Prometheus and Grafana for monitoring. We have a lot of cross-cutting services, for example, covering SSL quality, status and support topics. We have, of course, the JFrog products for our binary management and more, and we have GitHub as our major driver of the whole platform. And, not that common, we have Zuul as our CI and CD system and our own CI library on top of that. Also, we are utilizing a pretty standard Git workflow.

So as you can see, it’s just committing the code, issuing a pull request, then running the CI jobs and passing in the results. So this is the pretty normal Git workflow we are utilizing and this is pretty good, because every developer is familiar with that. One of our main challenges: in the last few years, we saw massive growth in all areas. For example, it was completely expected to have the growth in the first years because we were starting with onboarding the new projects on the new platform, but then in the year 2020 and following we are also seeing a massive growth in the number of CI jobs and in the number of users, and this is mainly about heading towards the start of production.

Nevertheless, growth is not always the best thing so we try to limit it as far as possible, because scaling is not always the best option you have, but nevertheless scaling is needed. Codecraft behind the scenes: we try to eat our own dog food, so we try to develop the platform completely on the platform itself and everything is developed with state-of-the-art techniques. So for example, we have a pretty massive usage of everything as code. On the left, you can see an example of a simple repository definition for Artifactory.

This is pretty familiar for everybody who is developing neater code. So we clearly prefer this over fancy UI stuff, because this can be checked automatically. And this can be verified automatically, this can be deployed automatically, and moreover, it’s familiar for every developer, and developers are our main customers on the platform. So everything is centered around the developer.
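To illustrate what "checked automatically" could mean for a repository definition kept as code, here is a minimal sketch in Python. It is not BMW's actual tooling: the function name `validate_repo` and the example repository `ecu-generic-local` are hypothetical, and the field names (`key`, `rclass`, `packageType`, `url`) are assumed to follow the layout of Artifactory's repository configuration JSON.

```python
from typing import Dict, List

# Repository classes assumed for this sketch.
ALLOWED_RCLASS = {"local", "remote", "virtual", "federated"}


def validate_repo(definition: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one repository definition."""
    errors: List[str] = []
    # Every definition needs a name, a class and a package type.
    for field in ("key", "rclass", "packageType"):
        if not definition.get(field):
            errors.append(f"missing field: {field}")
    rclass = definition.get("rclass")
    if rclass and rclass not in ALLOWED_RCLASS:
        errors.append(f"unknown rclass: {rclass}")
    # A remote repository proxies an upstream, so it needs a URL.
    if rclass == "remote" and not definition.get("url"):
        errors.append("remote repository needs a url")
    return errors


# Hypothetical definition, as it might be stored in a Git repository.
example_repo = {
    "key": "ecu-generic-local",
    "rclass": "local",
    "packageType": "generic",
}

if __name__ == "__main__":
    print(validate_repo(example_repo) or "definition looks fine")
```

A check like this could run as a CI job on every pull request that touches a definition, so a broken definition is caught before anything is deployed.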

To get the best experience out of it, it’s pretty important for us to also use the same stuff our customers are using. Today, I want to talk mainly about three different or unusual major building blocks in Codecraft, which you probably didn’t expect in this way. So first, we have our OpenStack environment. And this is, I guess, not what you expected me to talk about because today, it’s more in to talk about AWS, Azure, Alibaba, or any other public cloud. However, we are still utilizing off-premise private clouds, for various good reasons.

For example, we have fixed costs, we have fixed resources and we are able to optimize this for the use case, and this leads to lower costs and lower resources than would be required for a public cloud environment. And if you go to the big scale, and we are definitely in the big scale, it’s not that easy to spawn, for example, 1000 VMs in the public cloud, and make guarantees that they respond in a certain timeframe. And one major point is also best friends. So it’s a pretty big advantage to go to the table of your neighbor and then tell him, well, we need this or we have a problem there, and you will get direct help for this. This is not always the case in the public cloud environment.

However, you must expect to have a big learning curve, and we had this; this can also be looked up in the source I linked here. And you should never forget: always use what suits your use case best. So we are also using AWS, for example. So OpenStack is not our only solution, but for a big part of our base load, it’s the best solution. The next part I want to talk about is Bazel. This is a build system, which probably was also not expected, because developers are mainly using, for example, Yocto or Makefiles.

Nevertheless, we want to present Bazel because we have had pretty amazing experiences with it. For us, Bazel allows a faster incremental build chain, and this is mainly because it allows distributed caching and only rebuilds the parts that really need to be rebuilt. We have dependency management already integrated in Bazel, and tools and dependencies are also frozen directly in the repository, which allows us to have reproducible builds, and this is pretty important for developing safety-critical software. We have the support of multi-repository projects, which is really important for us because we have a lot of suppliers working on the same components or similar components and they’re not allowed to see each other’s code. And the first Bazel car out on the road is the new BMW iX, which was launched a few months ago.
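The idea of rebuilding only what has changed can be sketched as a content-addressed cache: the key is a hash over a target's name and the contents of its inputs, and a build step runs only when that key has not been seen before. The Python below is a simplified illustration of this general technique, not Bazel's implementation. `BuildCache`, `fake_compile` and the file names are hypothetical, and the in-memory dictionary stands in for what would be a shared, distributed cache.

```python
import hashlib
from typing import Callable, Dict, List


class BuildCache:
    """Maps a hash of (target name, input contents) to a build output."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}  # stand-in for a shared remote cache
        self.rebuilt: List[str] = []      # targets that actually had to be built

    @staticmethod
    def action_key(target: str, inputs: Dict[str, str]) -> str:
        # Hash the target name and every input in a stable order, so the
        # same inputs always give the same key.
        digest = hashlib.sha256(target.encode())
        for name in sorted(inputs):
            digest.update(b"\0" + name.encode() + b"\0" + inputs[name].encode())
        return digest.hexdigest()

    def build(self, target: str, inputs: Dict[str, str],
              compile_fn: Callable[[Dict[str, str]], str]) -> str:
        key = self.action_key(target, inputs)
        if key not in self._store:        # cache miss: do the real work
            self._store[key] = compile_fn(inputs)
            self.rebuilt.append(target)
        return self._store[key]           # cache hit: reuse the old output


def fake_compile(inputs: Dict[str, str]) -> str:
    """Placeholder for an expensive compiler invocation."""
    return "|".join(inputs[name] for name in sorted(inputs))


if __name__ == "__main__":
    cache = BuildCache()
    sources = {"brake.c": "v1", "brake.h": "v1"}
    cache.build("brake", sources, fake_compile)  # built
    cache.build("brake", sources, fake_compile)  # served from the cache
    print(cache.rebuilt)
```

Because the key depends only on the inputs, pinning tools and dependencies in the repository is what makes a cached result safe to reuse on another machine.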

So, in case you can’t see this on the slide, Bazel’s motto is: fast, correct – choose two. The last component I want to talk about is Zuul. Zuul shouldn’t be mixed up with Netflix Zuul, which takes care of microservices; our Zuul is for CI/CD, is provided by the OpenStack community and was designed to be an alternative to Jenkins. It ensures that the master is always green because it prevents broken code from being merged into the mainline. It also supports cross-project dependencies and offers a simple and reusable style of playbooks based on Ansible to define jobs. In our case, we also have a CI library on top, which has Ansible playbooks and Ansible roles covering a pretty big base set of all CI purposes you can think of.
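The "always green" gating idea can be sketched roughly as follows: each queued change is tested on top of the mainline plus every change accepted ahead of it, and only passing changes are merged. This Python sketch is a sequential simplification for illustration only. Zuul itself tests queued changes speculatively in parallel, and the names `gate` and `passes` are hypothetical.

```python
from typing import Callable, List, Tuple


def gate(mainline: List[str], queue: List[str],
         passes: Callable[[List[str]], bool]) -> Tuple[List[str], List[str]]:
    """Merge queued changes in order, keeping the mainline green.

    Each change is tested against the state it would actually be merged
    into: the mainline plus all changes accepted before it. A failing
    change is rejected, and later changes are tested without it.
    """
    merged = list(mainline)
    rejected: List[str] = []
    for change in queue:
        candidate = merged + [change]
        if passes(candidate):      # the CI jobs ran on the future state
            merged = candidate
        else:
            rejected.append(change)
    return merged, rejected


if __name__ == "__main__":
    # Hypothetical test: a state is green unless it contains "bad".
    def is_green(state: List[str]) -> bool:
        return "bad" not in state

    print(gate(["base"], ["a", "bad", "b"], is_green))
```

Testing against the future state rather than the current mainline is what catches two changes that each pass on their own but break when combined.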

The CI library covers, for example, uploading or downloading artifacts from a distributed Artifactory setup. Also, BMW is taking an active part in the development of Zuul. One of the core developers is a member of our team. And as always in the open source community, contributions and integrations are welcome. So let’s come to the conclusion.

As we could see in our earlier slide, which showed the BMW Group Investor Relations presentation, the Neue Klasse 2025 should be uncompromisingly electric, digital and circular. And to achieve all this, only the best is good enough. We cannot develop software the way we used to develop software 10 years ago in the automotive industry. So we needed a tool chain which is modern, fast, safe and secure for development, is attractive for developers, because we need to hire a lot of new developers in the future, and it must be collaborative, as you can see in the plant above.

Thank you very much. Now there’s room for questions and answers.
