Malicious Packages: What They Are, and How to Protect Against Them

If you develop software, you probably work with software packages all day long. In addition to creating packages for your own software, you rely on packages from other sources – such as third-party repositories – to satisfy your application’s dependencies, set up development environments and so on. Downloading packages is convenient, because packages and package managers provide an easy means of finding and installing software.

However, software packages also come with a risk: If you download and install malicious packages, they could introduce malware into your application environment. And you may not even know, since malicious packages are designed to masquerade as legitimate software.

What are malicious software packages?

A malicious software package is any type of software package that contains malicious code. It may also contain legitimate code, but if a package is malicious, it has malware hiding somewhere inside it.

Packages come in many forms, and malicious code can exist inside any type of package. For example, common types of malicious packages include:

  • A Windows .exe application installation file that installs malware onto a system in addition to the application users intend to install.
  • A .deb or .rpm file that installs a compromised version of a server application onto a Linux system.
  • A Docker container image that includes malicious dependencies.
  • A Python package that installs a deliberately vulnerable or backdoored version of a Python framework into a development environment.

Why malicious packages are hard to detect

If there were an easy way to determine whether a package is malicious before downloading and installing it, malicious packages wouldn’t pose much of a threat. Unfortunately, there isn’t. Malicious packages can be hard to detect for several reasons.

Public package repositories

Typically, packages are hosted in some type of repository, where developers and users can search for and download them. For example, many public Docker container images are hosted on Docker Hub. Similarly, Python packages are stored in public indexes like PyPI.

Public package repositories don’t usually require security checks or validations before accepting packages. Instead, they allow anyone to create repositories and upload packages to them. In turn, unsuspecting users may download those packages without realizing they contain malicious code.

Installation tools don’t detect malicious packages

Just as package repositories don’t automatically block malicious packages, most package installers don’t try to detect malware before installing a package. They just install the package. That means that you can’t assume a package is safe just because it installs without issue or your package manager doesn’t give you any warnings about it.

Lack of package content visibility

Determining what’s inside a package before installing it is not always easy. You may be able to extract a package to view the individual files and directories inside, but even then, it can be hard to know whether a given file might contain malicious code, especially if it’s a binary file.
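As a rough illustration of inspecting a package before installing it, the Python sketch below lists the files inside a zip-based archive (such as a Python wheel) or a tar-based archive (such as a source tarball) and flags the ones that look like binaries. The function names and the list of file suffixes are assumptions made for this example, and a file that passes this check is not thereby shown to be safe.

```python
import tarfile
import zipfile

# Suffixes that usually indicate compiled or otherwise opaque content.
# This list is an illustrative assumption, not an exhaustive rule.
OPAQUE_SUFFIXES = (".so", ".dll", ".dylib", ".exe", ".pyd", ".bin")


def list_package_files(path):
    """Return the names of the files inside a zip- or tar-based package archive."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return [info.filename for info in archive.infolist() if not info.is_dir()]
    if tarfile.is_tarfile(path):
        with tarfile.open(path) as archive:
            return [member.name for member in archive.getmembers() if member.isfile()]
    raise ValueError(f"unsupported archive format: {path}")


def opaque_files(names):
    """Return the names that look like binaries and cannot be reviewed as plain text."""
    return [name for name in names if name.lower().endswith(OPAQUE_SUFFIXES)]
```

Listing the contents only tells you what is there. Anything returned by opaque_files still has to be examined some other way, such as with a scanner.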

Packages may depend on other packages

Sometimes, packages trigger the automatic installation of other packages in order to satisfy dependencies. That means that even if there is no malware in one package, the package could install malware through other packages – so scanning just one package for malware isn’t enough to ensure it won’t introduce malicious code onto your system.
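To see why scanning a single package is not enough, it helps to enumerate everything that installing it would pull in. The Python sketch below does this for a hypothetical dependency map; the package names and the transitive_dependencies function are invented for illustration, and a real package manager would supply this information from its own metadata.

```python
from collections import deque


def transitive_dependencies(package, direct_deps):
    """Return every package that installing `package` would pull in.

    `direct_deps` maps a package name to the names it depends on directly.
    """
    seen = set()
    queue = deque(direct_deps.get(package, ()))
    while queue:
        name = queue.popleft()
        if name in seen or name == package:
            continue  # already visited, or a dependency cycle back to the root
        seen.add(name)
        queue.extend(direct_deps.get(name, ()))
    return sorted(seen)


# Hypothetical dependency map; every name here is made up.
example = {
    "web-app": ["http-client", "templating"],
    "http-client": ["tls-helper"],
    "templating": ["markup-escape"],
    "tls-helper": [],
}

print(transitive_dependencies("web-app", example))
# ['http-client', 'markup-escape', 'templating', 'tls-helper']
```

In this example, installing one package brings in four others, and each of them would need to be checked as well.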

DNS spoofing

Sophisticated attackers who want to trick victims into installing malicious packages can use attack techniques like DNS spoofing, which allows them to redirect network traffic to servers of their choosing, even though the servers’ URLs appear to be legitimate.

For example, attackers who manage to manipulate DNS records could redirect traffic intended for a business’s internal Docker container registry to their own server, which hosts malicious container images. Users may assume that the containers they install are safe because they are pulling them from what appears to be a trusted, internal registry. But in fact, the images are coming from a malicious source.
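One simple spot check against this kind of redirection is to compare the addresses a registry hostname resolves to with the addresses you expect. The Python sketch below assumes you already know the correct addresses for your registry; the hostname and IP addresses in the usage note are placeholders, and the function names are invented for this example. It is a sanity check, not a guarantee.

```python
import socket


def resolve_addresses(hostname, port=443):
    """Return the set of IP addresses the system resolver gives for `hostname`."""
    results = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return {sockaddr[0] for _family, _type, _proto, _canon, sockaddr in results}


def unexpected_addresses(hostname, expected, resolver=resolve_addresses):
    """Return the resolved addresses that are not in the `expected` set."""
    return resolver(hostname) - set(expected)
```

For example, unexpected_addresses("registry.internal.example", {"192.0.2.10"}) returns an empty set when the hostname resolves only to the expected address, and returns the surprising addresses otherwise.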

Best practices for avoiding malicious packages

There are a number of steps that developers and users can take to minimize the risk of installing malicious packages into the environments or systems they use:

  • Scan packages: The most important practice for protecting against malicious packages is to scan packages of all types – application installers, container images and so on – before installing them. Scanning tools can’t detect all types of malware and vulnerabilities, but they can alert you to many common risks.
  • Only download from trusted sources: Never install a package from a repository if you’re unsure who maintains the repository or whether the maintainers can be trusted. And don’t assume that because a repository is on a mainstream site, like GitHub or Docker Hub, it can be trusted. Anyone can set up repositories on these services, so you need to make sure you trust the specific repository maintainers.
  • Verify package names: It may be tempting to download packages by guessing the name of the package you want. But this is a mistake because malicious packages are often given names that are similar to those of legitimate packages, a technique known as typosquatting. That’s why it’s important to identify the source of each package you use, instead of assuming that a package with a legitimate-sounding name is actually legitimate.
  • Check DNS settings: Before installing packages, check which DNS servers the system running your package manager is configured to use, since package managers typically rely on the system’s resolver. You can also consider setting up your own DNS servers, so that you have complete control over name resolution.
  • Verify package checksums: A checksum is a short string of characters computed from a file’s contents, so virtually any change to the file produces a different checksum. If maintainers of a repository that you trust publish checksums for their packages, calculate the checksums for packages after you download them, then make sure they match the published checksums. If they don’t, it means either that your package file was corrupted (which is rare, but possible), or that you did not download the exact copy of the package that is hosted in the repository you trust – possibly because attackers inserted malicious packages in place of the legitimate ones.
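The checksum comparison described above can be sketched in a few lines of Python. This example assumes the maintainers publish SHA-256 checksums as hexadecimal strings; some repositories use other algorithms, in which case the hash function would need to change. The function names are invented for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Return True if the file's SHA-256 checksum equals the published value."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

If matches_published_checksum returns False for a package you downloaded, don’t install it. Note that the check is only as trustworthy as the place the published checksum came from.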