Software Supply Chain Security for Open Source Projects – it’s time to prepare!

Attacks on the open-source value chain (OS supply chain) are becoming more sophisticated, and we, as software developers, are becoming the focus of these attacks. So what are the essential first steps, and what should you focus on? This raises the question of suitable methods and tools. At the same time, the company’s strategic orientation must be considered in this security strategy.

In the recent past, we have also learned that attacks are increasingly targeting individual infrastructure elements of software development, such as the classic CI/CD pipeline.

In this webinar, we address the following questions:

  • What potential threats are there in general?
  • What are the classic attack points in software development, from source code to binary?
  • What tools are there, and where should they be used?
  • How can I arm myself against the cyber attacks of tomorrow?

Resources:

JFrog Platform on AWS to Manage Your Software Bill of Materials Solution Sheet

JFrog Xray Solution Sheet

JFrog Artifactory Solution Sheet  

Video Transcript

Hello and welcome to this new video. It’s a pleasure for me to see you here. So, what are we talking about today?

So, today we want to talk about supply chain security, and about software supply chain security in particular. What are the key points? What are the different mechanics? You can see here some open source projects from the Linux Foundation I want to highlight. Then we want to have a look at vulnerabilities and malicious code packages, and in the end at what you can do against all these attacks.

If you’re interested in this, stay here. By the way, my name is Sven Ruppert, I’m a developer advocate for JFrog, and as you can see, I’m mostly out in the woods. If you are watching one of my videos for the first time, it’s a big pleasure for me and a big welcome from my side. If you want to have more videos like this about Java or DevSecOps topics, then have a look at my YouTube channel; you will see a bunch of them there. If you like the video, give me a thumbs up, and it would be a pleasure for me to see you as a new subscriber. If you subscribe to my channel, you won’t miss any video anymore.

So, now it’s time to start. Let’s talk a little bit about supply chain security. Supply chain security is a very broad topic, and software supply chain security is just one part of it. What does it mean? A supply chain is everything that is used to produce something: humans, machines, material, third-party components, processes. Everything that is disturbing or compromising it works against the supply chain, and supply chain security focuses on exactly this topic: how to keep everything running smoothly, so that the process runs without any interruptions or disruptions. Software supply chain security is just the part that focuses on how to create software.

What changed over the years with supply chain attacks? A long time ago, it was more or less an individual or a group of hackers that tried to break into the supply chain, and the motivation was more or less financial. They wanted to get some money somehow.

But over the last years, and especially right now, this has changed. We’re at the beginning of 2022, and we have a not so nice global political situation right now in the east. If you’re working directly or indirectly for companies that work for a government, or you’re working inside the supply chain of a company that is a target of a different government, let’s say it like this, then it could be that you’re not attacked by individuals or a hacker group. You’re attacked by a government, and this is a completely different thing, because they have completely different resources and different possibilities. Even if you’re a small or medium-sized company, if you’re part of the supply chain, it could be that you are now attacked by governments instead of individual hackers, and this is a completely different beast.

What we can also see is that the big companies are improving their security day by day. They have a huge amount of resources in terms of manpower, money, infrastructure and so on.

This means that the attacks are more and more directed not against the big companies, but against the small companies around these big companies. Over time this pressure will increase. Even if you have a small or medium-sized company with, let’s say, 10 or 15 employees, you will now get the full amount of attacks on your part of the supply chain, because it is way cheaper to attack a few small and medium-sized companies than to attack the big company. So the pressure is increasing step by step: the more the big companies increase their protection, the more the pressure on the small and medium-sized companies will increase as well.

One of the biggest questions is: what is a key factor against all these attacks, or what is a fundamental thing you should have in mind? Whatever supply chain you have, traceability is one of the key points against all these different attacks; it’s fundamental for protecting against compromised elements inside the chain. Traceability means that you know, for all parts, what you’re doing at what time, who is involved, what material is used, what the output is, where the output is going and so on. If you are able to have this traceability across the whole supply chain, this is one of the key factors.

But now we want to limit the scope from general supply chain security to software supply chain security, and here we are focusing on the way from source code to binary. Software supply chain security is a subset of supply chain security; it’s focusing just on software. I have two open source projects from the Linux Foundation I would really like to highlight here: one is project SLSA (pronounced “salsa”) and the other one is project Pyrsia.

Let’s start with SLSA. SLSA is a documentation project. It means that a bunch of different individuals and cyber security experts try to create documentation, first of all to give advice to you, so that you know at what level you are with your security and what next steps you can take to increase it, and second to describe all the different common attacks against a software supply chain: what are the attacks from source code up to the binaries in production?

Part of project SLSA are these levels. The levels are there so that you know where you are and what the next steps are to increase your security. Level zero is just that you have to document everything that is used inside your software development process. It’s a full documentation of everything, so that you know what is going on, where something is involved, what components you are using and so on. Level one then describes that you have to create an SBOM, a software bill of materials, so that you know that this binary depends on all these other components and dependencies; you have the full dependency list of your created binaries. Don’t worry, we will focus on this SBOM, what it is, how to create it and where you can get it, later on in this video. Level two is that you start using a version control server for the source code, a CI/CD environment, and a repository for the binaries, making sure that everything is automated as much as possible.

Level three introduces security audits. It means that external parties check what security level you have, what you have done right and wrong, and what you should do better. If you have done this, level four describes hermetic, immutable and reproducible builds. It means that you know what is part of the build, that you can reproduce it, and that you create binaries just once and then use them; you never recreate a binary.

All this is a very, very short overview of this project, but I have a video focusing just on project SLSA on my YouTube channel. There I explain all the mechanics, the details and all the different flavors. Have a look at my YouTube channel and search for the video about project SLSA.

The next part of project SLSA is the documentation about the most common attacks against the software supply chain. We are focusing now on the way from source code to binary: what could happen? We have a few things. The first thing is, for example, that no source code modification should be done without a review. If you have this in place, what could be the next attack? The next attack could be compromising the source code repository itself, which means sneaking in bad commits or changing source code at this point. So you have to harden this source code repository. From the source code repository, the code goes to the build, the CI environment. Here again the build could be changed, or it could be compromised in a way that it fetches the original sources but overlays them with other or additional sources, or the build itself is compromised. A very prominent example of this is the SolarWinds hack, where with every build the binary was corrupted inside the CI environment. So hardening the CI environment is one thing.

From the CI environment, the binary is pushed to the repository. Here an attacker can bypass the CI environment, pretend to be the CI environment and push a compromised binary into the repository, or attack the repository itself. And there are a few more things: we are talking about bad dependencies. During the build, an attacker can try to provide bad dependencies so that they are used during the build, or promote binaries from outside that will be grabbed by the repository.

Long story short, we have several hotspots here. The main hotspots are first of all the three components: the source code repository, the CI/CD environment, and the binary repository. You have to harden these environments; this is an operational part. But for the software developers there are two things left: their own source code and what’s going on with it, and then all this stuff with the binaries, so all the dependencies. I think it’s worth having a look at exactly this part: source code and binaries.
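
Both the reproducible builds of level four and the compromised CI build described above come down to one check: independent builds of the same commit should produce byte-identical binaries. The following is only a minimal sketch of that idea, not something taken from SLSA itself; the function names and the sample byte strings are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def builds_agree(artifacts: list[bytes]) -> bool:
    """True only if at least one artifact was given and all digests match.

    Each element is assumed to be the binary produced by one independent
    build of the same source commit (hypothetical input for this sketch).
    """
    digests = {sha256_hex(artifact) for artifact in artifacts}
    return len(digests) == 1


if __name__ == "__main__":
    clean = b"compiled-bytes-of-commit-abc"
    tampered = clean + b"-plus-something-injected-in-ci"
    print(builds_agree([clean, clean, clean]))     # all builders agree
    print(builds_agree([clean, tampered, clean]))  # one builder deviates
```

A deviating digest does not say which builder is compromised, only that the build is not reproducible or that something was changed along the way.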

The next project is Pyrsia, an open source project from the Linux Foundation that was initially created by the company JFrog.

What do we want to do in this project? We want to focus on the part from the moment the binary is built until the binary is delivered, whether it is used as a dependency or in production. This is the part Pyrsia focuses on; all the other parts are not included in this project.

How is Pyrsia securing this build process? Let’s take this build infrastructure once again, with all these build threats. What Pyrsia provides is a decentralized peer-to-peer network, a peer-to-peer package manager. You send the URL of the source code at a given commit; then different nodes grab the source code, build it locally and share the information about the resulting binary. If all binaries are the same, then the build infrastructure is not compromised. You can create your own Pyrsia nodes, but you have no control over whether your nodes are selected for building something. The selection is random, in a way that makes it very hard to place compromised nodes so that you can bring compromised binaries into this platform. So now we have this binary inside the p2p network, and then it is delivered to the distribution layer of Pyrsia.

By the way, this is a very short description of Pyrsia. If you want a more detailed view, check out my YouTube channel. I have a video just about project Pyrsia, where I go through every step in detail and provide all the information about the internals. Here it’s just a very short overview.

How is Pyrsia delivering this binary? It’s a peer-to-peer network. If you have, for example, a Maven dependency in your project, you ask a Pyrsia node with the Maven coordinate. The network then finds out where this binary is, and you can fetch it from several points. If you have bigger binaries, you have all the advantages of a peer-to-peer network: the binary can be delivered in parts from different nodes to use the bandwidth as much as possible. On the other side, we have gateways to Docker Hub and Maven Central, for example. These are authorized nodes. If something is not inside the Pyrsia network, it is fetched from these authorized nodes and then stored inside the peer-to-peer network as well. So if these sources go down for a few minutes or whatever, you can always ask the Pyrsia network, and the binary will still be there. Just have a look at the project website. It’s a young project, and I can only say: try it out.

We saw that we have different parts: one is the organizational or operational part, and the other is source code and binaries. I want to highlight four main areas of cyber defense, cyber security or DevSecOps, whatever you want to call it or wherever you want to place it.

First of all there is SAST, static application security testing. It is a testing mechanism where you test each component while it is not running, so from the first line of code you can start scanning. If something is running already, you have DAST, dynamic application security testing. This means the application is running and you’re looking at it from outside, so you have more of a hacker approach. The combination of both is IAST, interactive application security testing. This means you’re ramping up the environment, you attack from outside, and you’re looking inside and modifying the attack vector and so on. The last part is RASP, runtime application self-protection, and as the name says, it is just for production. It means that inside your production environment you’re analyzing what’s going on and trying to identify whether there is an ongoing attack right now. I leave the last one out here, because it’s just for production. If you are focusing on IAST, you need someone who is highly skilled, so this is mostly something that you do later, when you already have experience with the security topics.

The DAST part comes quite late in the production line, because you need something that’s running already, and you can’t really scan 100% of the components because you’re just looking from outside. So you should definitely focus on static application security testing, because with the first line of code you can start scanning all the included components and identifying whether they contain malicious packages or vulnerabilities.

We saw that what is left for us is source code and binaries. If you want to start with SAST, I highly recommend focusing first only on the binaries. If you compare how much code you write with how many dependencies you add and how many lines of code they contain, then for most projects the dependencies are by far the biggest part. So focus there and scan there for vulnerabilities and malicious code packages. It’s the low-hanging fruit if you start with cyber defense, cyber security or DevSecOps topics.
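
Scanning dependencies for known vulnerabilities boils down to matching each dependency’s coordinate and version against advisory data. The following is a minimal sketch of that matching step only. The coordinate, the versions and the advisory identifier are made up, the `Advisory` type is hypothetical, and the version parsing assumes purely numeric dotted versions, which real ecosystems such as Maven do not guarantee. It is not a description of how any particular scanner works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """Hypothetical advisory: versions below `fixed_in` are affected."""
    package: str
    fixed_in: tuple[int, ...]
    identifier: str


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a purely numeric dotted version such as '1.4.2'."""
    return tuple(int(part) for part in text.split("."))


def find_vulnerable(
    dependencies: dict[str, str], advisories: list[Advisory]
) -> list[tuple[str, str, str]]:
    """Return (package, used version, advisory id) for every match."""
    findings = []
    for advisory in advisories:
        version = dependencies.get(advisory.package)
        if version is not None and parse_version(version) < advisory.fixed_in:
            findings.append((advisory.package, version, advisory.identifier))
    return findings


if __name__ == "__main__":
    deps = {"com.example:pdf-lib": "1.4.2", "com.example:json-lib": "2.0.0"}
    advisories = [Advisory("com.example:pdf-lib", (1, 5, 0), "EXAMPLE-2022-0001")]
    print(find_vulnerable(deps, advisories))
```

The dependency list would normally include the transitive dependencies as well, since that is where most of the code comes from.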

So the next question is: what’s the best source for vulnerability information? Here I can say, whatever single database you’re using, make sure that you are building a superset out of different databases. Single databases mostly have gaps, because the market is so huge and you never know to which provider a piece of information is sold. This is exactly what we at JFrog are doing already. We aggregate different vulnerability databases, including commercial ones, and we have a dedicated research team on top that enriches this data with mitigation and remediation information. We’re adding the knowledge about our own zero-days as well. So whatever you’re choosing, make sure that you have a superset, as we have done at JFrog.

Let’s talk about malicious packages. There are different aspects I want to highlight here, and the first one is the infection method. What is an infection method? It is the way the malicious code is provided so that you end up consuming it.

I will start with typosquatting. What does typosquatting mean? You have common, very well-known packages, and they have a name. The attacker grabs this code, changes the name of the dependency based on common typos, and provides this package in a regular official repository. Someone who makes this typo is then referencing the corrupted package, and inside this package the attacker can do whatever they want.
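
Because typosquatting relies on names that are only a keystroke away from a well-known package, one simple heuristic is to compare every requested name against a list of known names by edit distance. The sketch below uses a plain Levenshtein distance. The list of well-known names and the threshold of 1 are assumptions for illustration, and a check like this will produce false positives for legitimately similar names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def possible_typosquats(name: str, known: list[str], max_distance: int = 1) -> list[str]:
    """Known names that `name` is suspiciously close to, but not equal to."""
    return [k for k in known if k != name and edit_distance(name, k) <= max_distance]


if __name__ == "__main__":
    well_known = ["requests", "numpy", "urllib3"]  # illustrative list only
    print(possible_typosquats("requets", well_known))
    print(possible_typosquats("requests", well_known))
```

Note that swapping two neighboring letters counts as distance 2 here, so catching that kind of typo needs either a larger threshold or a distance that treats transpositions as one edit.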

The next thing is masquerading. Masquerading focuses on imitating the whole environment around a dependency. The attacker duplicates the code and the metadata, adds small pieces of malicious code, and builds a clone of the package. So they build something that looks exactly the same and maybe has the same name, but is provided from a different place, and something is inside so that you’re infected. That’s it. The only difference from the original package is maybe one line of obfuscated code that is doing something, calling something, sending something and so on.

The next thing is a Trojan package. A Trojan package is more or less like the historical Trojan horse. You have a package that’s doing something useful, say a PDF library: it can print and all this stuff, so everything you need. But inside it has additional functionality that’s obfuscated and hidden somewhere, and that is just activated while you’re using this really well-working library. So sometimes these Trojan packages are a good library: they give you good value, but an additional “value” as well.

Another way is dependency confusion.

Dependency confusion works if you have internal packages and external packages. Say that inside your company you use an internal dependency, and I know its name, for example because I know someone who is working there or because this information is leaking out. Then what I can do is create a dependency with exactly the same name and a higher, or very high, version number, and put it in an official repository. I’m really using exactly the same definition, but your CI environment is maybe looking first at Maven Central and grabbing this wrong dependency from there: same name, maybe same functionality, slightly different version number, so that automatic version increases will grab this one from outside. And then you have this dependency confusion: you think it’s your dependency, but it’s grabbed from outside.
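
The dependency confusion scenario can be reproduced with a toy resolver. This is only a sketch under stated assumptions: repositories are plain dictionaries, the coordinate `com.example.internal:billing-core` and all versions are made up, and real build tools resolve dependencies in more complex ways. It shows why "take the highest version from anywhere" is dangerous and how restricting internal name prefixes to the internal repository avoids the problem.

```python
from typing import Optional

Repo = dict[str, list[str]]  # coordinate -> available versions


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a purely numeric dotted version such as '1.2.0'."""
    return tuple(int(part) for part in text.split("."))


def naive_resolve(name: str, internal: Repo, public: Repo) -> Optional[tuple[str, str]]:
    """Pick the highest version from any repository (the risky behavior)."""
    candidates = [(parse_version(v), v, "internal") for v in internal.get(name, [])]
    candidates += [(parse_version(v), v, "public") for v in public.get(name, [])]
    if not candidates:
        return None
    _, version, source = max(candidates)
    return version, source


def guarded_resolve(
    name: str, internal: Repo, public: Repo, internal_prefixes: tuple[str, ...]
) -> Optional[tuple[str, str]]:
    """Names under an internal prefix are resolved from the internal repo only."""
    if name.startswith(internal_prefixes):
        return naive_resolve(name, internal, {})
    return naive_resolve(name, internal, public)


if __name__ == "__main__":
    internal = {"com.example.internal:billing-core": ["1.2.0"]}
    public = {"com.example.internal:billing-core": ["99.0.0"]}  # attacker upload
    name = "com.example.internal:billing-core"
    print(naive_resolve(name, internal, public))
    print(guarded_resolve(name, internal, public, ("com.example.internal:",)))
```

The guard only helps if the internal prefix is something an outsider cannot legitimately publish under, which is an organizational decision rather than a code one.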

Hijacking, another infection method, means that you get access to the infrastructure of a project and take over the ownership in some way. It could be an aggressive way, where you’re really hacking your way in. Or, if there’s an unmaintained project and you see it’s still used, you just take its free domain and build something around it again, or you take over the open source project that nobody is maintaining anymore. So with hijacking you are more or less the maintainer of the project, but with a different intention.

These are the common infection methods. Now the question is: what are common payloads? The payload is more or less what the code in these malicious packages is doing.

One of the bigger things is the sensitive data stealer. They want to have credit card numbers, user tokens, environment variables, passwords, usernames, whatever. They want to steal this data and send it somewhere. So the code looks, for example, for an environment variable with a certain name and, with the next request, sends an additional request to the attacker’s server.

The other thing is something like a connect-back shell. It’s like a remote shell. The malicious code is waiting and connecting back to the attacker’s server, so that the attacker knows: okay, this malicious code is there, I can connect. Then the attacker sends commands, they are executed on the other side, and the result is sent back. So the attacker can do whatever is possible on this system.

Another thing that is very popular these days is download and execute. The malicious code connects to a remote server, downloads a binary and starts executing it. This is quite often used, for example, for crypto mining: mining cryptocurrency with other people’s energy and money and sending the result back.

These are the most common payloads that you have in malicious code. But how do you hide this malicious code? This is where we are talking about obfuscation techniques.

Obfuscation can be done with a publicly available obfuscator or a custom-made one, but mostly attackers can simply search for an obfuscator and use it. What these tools are doing is more or less renaming variables, encoding commands in different encodings and all this stuff, so that you are not able to read the code immediately. You have to decode it to see what’s going on.

That is one thing, but a little bit more interesting is control flow flattening. You have the normal control flow: it’s running a, b, c, d, e, f, g. Then, in the middle, the attacker breaks in some if-then-else logic, and depending on the number of calls or a variable or whatever, different code is executed. This is implemented in the code in a way that’s not obvious, so you don’t see it immediately. But if you’re analyzing the whole control flow, you will see that there is something wrapped around the main logic.

The next technique is homoglyph characters. These are Unicode characters that look like plain ASCII Latin characters but are different when you compare strings. With this you can make sure that some comparisons always succeed or always fail. For you it looks like a regular ASCII character, but it is a different Unicode character.

You can take this a little bit further with bidirectional control characters. We humans read left to right or right to left, and we can tell the machine the direction as well by using these control characters inside the source code. A human reviewer is reading, maybe, from left to right and sees: okay, this is the source code, here is some command and so on. The compiler sees the characters in their stored order, which is different from the order the editor shows to the human, so it interprets the code completely differently. This is quite cool in terms of understanding how it works, but it’s not so easy to detect.
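
Homoglyphs and bidirectional control characters are hard to spot by eye, but they are easy to find mechanically because they are specific Unicode code points. The following minimal sketch flags both in a piece of source text. The function name and the result format are made up for illustration, and flagging every non-ASCII letter is a deliberately blunt rule that will also hit legitimate non-English text in strings and comments.

```python
import unicodedata

# Unicode bidirectional formatting characters (embeddings, overrides, isolates).
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def suspicious_characters(source: str) -> list[tuple[int, int, str, str]]:
    """Return (line, column, kind, unicode name) for every suspicious character."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if char in BIDI_CONTROLS:
                kind = "bidi-control"
            elif ord(char) > 127 and char.isalpha():
                kind = "non-ascii-letter"  # possible homoglyph
            else:
                continue
            findings.append((line_no, column, kind, unicodedata.name(char, "UNKNOWN")))
    return findings


if __name__ == "__main__":
    # "\u0430" is CYRILLIC SMALL LETTER A, which looks like the Latin "a".
    sample = 'if role == "\u0430dmin":\n    grant()  # \u202e hidden\n'
    for finding in suspicious_characters(sample):
        print(finding)
```

A check like this fits well into a pre-commit hook or a CI step, since it needs nothing but the source text.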

We have now covered vulnerabilities, malicious packages, masquerading techniques and all this stuff. But the first question is: inside the software development chain, where is the right place to put security in? The answer is quite easy: everywhere. Every single step should be involved, or should be hardened with a security approach. Security is like quality; it is really part of every dedicated step.

On the other side, what makes this combination of Artifactory and Xray, so dependency management and vulnerability scanning, so unique? Keep in mind that all dependencies of all tech layers have some metadata around them: is it a compile-scope or a test-scope dependency, is it a version range or a dedicated version, is it statically or dynamically linked, and so on. All this information is available inside the different dependency managers. If you are grabbing all dependencies of an artifact and you have the whole metadata and the knowledge about it, then you can use this metadata to analyze, for example, remediation and mitigation information. You can use it for defining the whole attack vector across different technology borders, or inside the whole tech stack. So having this information from all dependency managers of all tech layers, together with the possibility to scan these binaries, is a huge plus compared to just looking at one technology or one binary, because mostly this information is not part of the binary, and then you can’t use it.

So dependency management and scanning for vulnerabilities is a very good combination.

The next question is: is shifting left to the CI environment enough?

Well, shifting left to the CI is good, because this is a fully automated gate that everything is passing through. It’s the place where you can implement a security border that a build must pass before it goes to the next steps. But shifting left to the CI is not enough, because if something has reached the CI, then you have already spent so much time, and maybe you can do the check a little bit earlier. The only things a little bit earlier than the CI environment are the IDE and the command line interface, and we have both of them. We have the command line interface, where you can work straight on the command line, script it and see what vulnerabilities are there. And we have the IDE plugin: if you’re typing the first line of a dependency, you see immediately whether there are vulnerabilities in the dependency tree or compliance issues.

Why does this make sense? First of all, if you’re spending time on creating something, pushing it to the CI and getting back that it’s not possible, and you have to rewrite it, then you’re getting bored. The quality of the second solution is maybe not so good, because you are under time pressure now, you’ve wasted a lot of time already, and you are bored because you’re doing things twice. So it makes sense to have this information available as early as possible, so that you focus on the core things once and do them right without being bored or wasting time.

Why should you use the CLI, and how do you use it? First of all, with the CLI you can work without any other tool. If you’re cloning a repository, like a Maven project, then you can just go into this project on the command line and call the audit command: jfrog audit-mvn. With audit-mvn you tell the command line interface that this is a Maven project. It will extract the whole dependency tree from the pom.xml and give you all the vulnerabilities and compliance issues that are defined. You can configure it with watches and all that stuff, but in the plain form you just audit the project and get all the information that is available. With this you are not wasting time: you can check a project without starting any workflows or opening the IDE. You can also script it, so that all the other tools and the existing infrastructure on your side are able to use these capabilities as well.

You can do a little bit more, for example analyzing Docker images. This is called on-demand scanning. On-demand scanning means that you are creating a Docker image on your machine; you add stuff, you grab different things, and you want to know whether this Docker image is good enough in terms of compliance issues, or whether it has any vulnerabilities you should get rid of. You can export this Docker image so that you have the entire image on your disk and then use the JFrog CLI to analyze it, or, if you have Docker Desktop, you can use the Docker Desktop plug-in to analyze it. You immediately have the information about what’s inside in terms of vulnerabilities and compliance issues. You can send this information to Artifactory, and then it’s available under on-demand scanning. So it’s documented, and if you’re changing the Docker image, you see the difference between different scans. You can work together with your colleagues and you have it documented, without pushing anything of this Docker image to Artifactory. So nothing is leaking in while you’re composing stuff on your side, and you are not using CI resources and waiting there.

So using the CLI makes sense for integration, it makes sense for being fast, and it makes sense for analyzing what’s going on. It gives you the flexibility to work straight on the challenge you have, in the environment where you are.
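
Scripting a command line scanner, as mentioned above, usually means wrapping it in a small gate that other tools can call. The sketch below is tool-agnostic: `run_gate` is a hypothetical helper that runs whatever command it is given and treats a non-zero exit code as "do not continue". Whether a particular scanner, including the JFrog CLI, signals findings through its exit code depends on its version and configuration, so that part is an assumption you would need to verify.

```python
import subprocess
import sys


def run_gate(command: list[str], project_dir: str = ".") -> bool:
    """Run a scanner command inside project_dir; True means the gate passed.

    Assumes the scanner reports problems via a non-zero exit code.
    """
    if not command:
        raise ValueError("command must not be empty")
    try:
        result = subprocess.run(
            command, cwd=project_dir, capture_output=True, text=True
        )
    except FileNotFoundError:
        # The scanner is not installed or not on the PATH.
        print(f"command not found: {command[0]}", file=sys.stderr)
        return False
    if result.stdout:
        print(result.stdout, end="")
    if result.returncode != 0 and result.stderr:
        print(result.stderr, end="", file=sys.stderr)
    return result.returncode == 0
```

With the command named in the talk, a call could look like `run_gate(["jfrog", "audit-mvn"], project_dir="path/to/maven-project")`, where the path is a placeholder. A pre-push hook or another pipeline step can then decide what to do with the boolean result.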

A little bit earlier I talked about the SBOM. So what is an SBOM? SBOM stands for software bill of materials, and it means a full list of all dependencies that are used to create a binary. This is quite popular these days, because the executive order on cyber security from the US president, Mr. Biden, explicitly says that everything that is used, owned or run by the US government must come with this SBOM, the software bill of materials. So you have to provide the full list of all dependencies of all tech layers.

We have known this for a long time; we called it build info. Build info is a superset of the SBOM: at the time you’re creating a binary, you can add not only the whole dependency list but also all the meta information you want, like environment variables, date, time, machine, agent name, whatever. Everything that is necessary and that you want to push there can be stored, and it’s immutable.

On the other side, the build info has a tab for Xray, and inside Xray you have the current knowledge about vulnerabilities. This means: if a build passes today and it’s green, because today we don’t have knowledge about the vulnerabilities inside, and we pass it to production, then maybe tomorrow, with an update of the vulnerability database, we find a new vulnerability. Then we know that this binary now has a vulnerability in it, and we have this information without scanning production. It’s also a good thing to check daily, in the morning, the important binaries that I created earlier, to see if there is a new vulnerability entry for them. With this you can maintain production without scanning it.

On this Xray tab you can also go to the actions, click there, and create an SBOM and export it in different variations, depending on what standards you have to fulfill. Then you have everything to be compliant with this executive order on cyber security. So we now have the possibility to create this SBOM, we have the possibility to see vulnerabilities in binaries created in the past without scanning everything again, and we can use this to actively maintain what is running in production.

By the way, you will find a lot of CVSS values in several places, and if you’re clicking on a CVSS value you see all the base metrics.
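
The idea of checking yesterday’s binaries against today’s vulnerability knowledge, without touching production, works because the dependency list of every build was stored when it was created. The following is a minimal sketch of that lookup. The record layout, the build identifiers and the package name are made up for illustration and do not reflect the actual build info or SBOM formats.

```python
def affected_builds(
    build_records: dict[str, dict[str, str]],
    package: str,
    vulnerable_versions: set[str],
) -> list[str]:
    """Return the ids of stored builds that contain a vulnerable version.

    build_records maps a build id to the dependency list recorded at
    build time (package name -> version); nothing is rescanned.
    """
    return sorted(
        build_id
        for build_id, dependencies in build_records.items()
        if dependencies.get(package) in vulnerable_versions
    )


if __name__ == "__main__":
    records = {
        "shop-backend#41": {"com.example:log-lib": "2.14.0"},
        "shop-backend#42": {"com.example:log-lib": "2.17.1"},
        "report-service#7": {"com.example:pdf-lib": "1.4.2"},
    }
    # A new advisory arrives the day after these builds went green.
    print(affected_builds(records, "com.example:log-lib", {"2.14.0", "2.14.1"}))
```

Running such a query whenever the vulnerability data is updated gives the list of deployed builds that need attention.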

I don’t have enough time now to explain all these different metric values, but I have a dedicated YouTube talk that goes through all the CVSS metrics. You will see that you have the possibility to scale the CVSS score to your environment, and for this you need the environmental metrics. How to use this and how to deal with the CVSS calculator is shown in one of my videos. It’s good advice to have a look and play around with CVSS, because it will tell you which vulnerability is more important for your environment, and you can adjust the order in which you want to fight the vulnerabilities.

Okay, we had a lot of stuff now. We talked about supply chain security, about software supply chain security and about these two Linux Foundation projects. We saw the difference between vulnerabilities and malicious packages, different obfuscation techniques and so on. I showed you how to export the SBOM, what you can do on the command line, and why shifting left to the CI is not enough: you should shift left to the IDE or the command line interface. All together it’s a round package, but there are a lot of more detailed videos on the single topics on my YouTube channel, so just check them out. If you are interested, try it: we have a free tier, so you can register and try all this stuff by yourself, or attend one of my webinars or workshops, where we have hands-on practice on exactly these topics.

On the other side, my day is more or less done. I found this lake, and I will make my camp here for tonight and enjoy it. Whatever time you are seeing this, I hope you will have a good rest of your day. Stay safe and see you!
