Sponsored

The PyPI package ‘lightning’ — better known as PyTorch Lightning, one of the most widely used training frameworks in the machine learning ecosystem — was compromised in a supply chain attack disclosed on April 30 by security firm Semgrep. Versions 2.6.2 and 2.6.3, published to the Python Package Index that day, contained a hidden _runtime directory with a 14.8-megabyte obfuscated JavaScript payload that executes automatically upon module import. Any team that ran pip install lightning during the affected window should treat its environment as fully compromised.

The attack is the work of a threat actor Semgrep has linked to the ‘Mini Shai-Hulud’ campaign, a Dune-themed operation that previously targeted npm packages earlier this year. This time the entry point shifted from npm to PyPI, but the payload and exfiltration architecture remain consistent, including the creation of public GitHub repositories with the description ‘A Mini Shai-Hulud has Appeared’ to serve as dead-drop exfiltration channels.

What the malware steals — and how

The credential harvesting is comprehensive. The payload scans more than 80 local file paths for GitHub personal access tokens, npm tokens, and other secrets. It dumps all environment variables from process.env, runs gh auth token to extract GitHub CLI credentials, and — on Linux CI runners — dumps the Runner.Worker process memory to extract every secret marked isSecret:true in GitHub Actions workflows.

Cloud providers are targeted systematically. On AWS, the malware tries environment variables, ~/.aws/credentials profiles, IMDSv2, and ECS task metadata to call sts:GetCallerIdentity, then enumerates and fetches all Secrets Manager values and SSM parameters. On Azure, it authenticates via DefaultAzureCredential to enumerate subscriptions and access Key Vault secrets. On GCP, it uses GoogleAuth to enumerate and retrieve all Secret Manager entries. The targeting covers local development machines, CI runners, and all three major cloud platforms.

Stolen data exits through four parallel channels: direct HTTPS POST to a command-and-control server, a GitHub commit-search dead-drop that encodes tokens in commit messages prefixed with ‘EveryBoiWeBuildIsAWormyBoi’, attacker-controlled public repositories where credentials are committed as base64-encoded JSON files, and — most aggressively — direct pushes to the victim’s own GitHub repository using compromised ghs_ server tokens.

A worm that jumps from Python to JavaScript

The cross-ecosystem propagation mechanism is the most technically novel feature. If the malware finds npm publish credentials on an infected machine, it injects a setup.mjs dropper and the full router_runtime.js payload into every npm package that token can publish to, sets scripts.preinstall to execute the dropper, bumps the patch version, and republishes. Any downstream developer who subsequently installs one of those packages runs the full malware chain, creating a worm-like spread from a single PyPI compromise into the npm ecosystem.

Semgrep has not disclosed how many npm packages were republished with the injected payload, but the blast radius is potentially significant given that PyTorch Lightning’s PyPI page shows more than 32 million total downloads and the package sits in the dependency trees of thousands of ML training pipelines across industry and academia.

First documented abuse of Claude Code hooks in a live attack

Perhaps the most striking element is the persistence mechanism. Once inside a repository, the malware writes hooks targeting two of the most common AI-assisted developer tools.

For Claude Code, Anthropic’s AI coding agent, it creates a .claude/settings.json file containing a SessionStart hook with matcher: "*" that points to node .vscode/setup.mjs. The hook fires automatically every time a developer opens Claude Code in the infected repository — no user action required beyond launching the session. For VS Code, it plants a .vscode/tasks.json file with a runOn: folderOpen task that triggers the same dropper. Semgrep described this as ‘among the first documented instances of malware abusing Claude Code’s hook system in a real-world attack.’

As a bonus payload, if the malware holds a GitHub token with write access, it pushes a GitHub Actions workflow named ‘Formatter’ that dumps all repository secrets via $ and uploads them as a downloadable artifact on every push.

A pattern that keeps accelerating

The PyTorch Lightning compromise arrives barely three weeks after a separate supply chain incident hit OpenAI’s own infrastructure. On April 10, a compromised version of the Axios npm package affected the GitHub Actions workflow OpenAI uses for macOS app signing, potentially exposing certificate and notarization material for ChatGPT Desktop, Codex, and Atlas. OpenAI rotated certificates and scheduled revocation for May 8.

Taken together, the incidents underscore a structural vulnerability in the AI development ecosystem. The machine learning supply chain runs on a thin layer of open-source packages — PyTorch, Lightning, Hugging Face Transformers, and their transitive dependencies — maintained by small teams and published through package registries with limited provenance controls. As AI workloads increasingly run in cloud CI/CD pipelines with broad credential access, those packages become high-value targets that offer attackers a path from a single compromised maintainer account to thousands of production environments.

The immediate remediation advice from Semgrep: pin Lightning to a known-safe version (2.6.1 or earlier), audit repositories for unexpected .claude/ and .vscode/ directories, and rotate any GitHub tokens, cloud credentials, or API keys that were present in affected environments during the exposure window.

L
Lois Vance

Contributing writer at Clarqo, covering technology, AI, and the digital economy.