AI coding risk is not only vulnerable code.
It is also fake dependencies.
That sounds minor until a generated import line turns into an install command. A model invents a plausible package name. A developer trusts the suggestion. An agentic coding workflow runs the install. If an attacker has registered the hallucinated name first, the bug is no longer a hallucination. It is a supply-chain event.
This is slopsquatting.
The newer data is uncomfortable because it is both better and still exploitable. A May 2026 frontier-model cohort paper reports package-name hallucination rates between 4.62% and 6.10% across 199,845 paired Python and JavaScript prompts. That is a tighter range than older studies. But the attack class survives because the error does not need to be common. It only needs to be repeatable and registrable (arXiv).
The conclusion is not “models are bad at packages.”
The conclusion is sharper: dependency names have become an AI-generated attack surface.
The Problem
Software supply-chain security already knows how package-name abuse works.
Typosquatting catches mistyped names. Dependency confusion abuses internal package names that resolve publicly. Maintainer compromise turns a trusted package hostile. Slopsquatting adds a different upstream source: the model.
The attacker does not need the developer to mistype. The attacker waits for the coding assistant to invent.
The original comprehensive package-hallucination study generated 576,000 Python and JavaScript code samples across 16 code-generating models. It found average hallucinated-package rates of at least 5.2% for commercial models and 21.7% for open-source models, plus 205,474 unique hallucinated package names (arXiv).
That earlier paper made the attack surface obvious. A hallucinated name is not just an error. It is an unclaimed namespace with semantic credibility. The name often looks like a real package because models learn naming patterns: framework prefixes, utility suffixes, wrapper conventions, hyphenation, abbreviations and ecosystem-specific vocabulary.
Attackers can mine those names the same way defenders can. Run prompts. Collect phantom dependencies. Register the most plausible ones. Wait.
That turns “AI made something up” into “AI generated the attacker’s target list.”
The Analysis
The 2026 frontier-model result should not be misread as a solved problem.
Yes, the spread compressed. The newer cohort’s reported range is far lower than the older open-source average and closer to the older commercial-model rate. That is progress. It suggests frontier coding models are less chaotic about package names than the systems measured in the earlier broad study.
But software risk is not linear with hallucination rate.
A small package-hallucination rate can still be economically useful if the names recur, look credible, and sit in high-volume ecosystems such as npm and PyPI. Attackers do not need every prompt to hallucinate. They need enough prompts to reveal names that developers or agents will later install.
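A rough back-of-the-envelope calculation shows why a low rate still matters at volume. The sketch below assumes, purely for illustration, that the reported rate can be treated as an independent per-prompt probability; the cited studies do not claim that, and the prompt counts are hypothetical.

```python
def expected_hallucinations(rate, prompts):
    """Expected number of prompts that produce a hallucinated package."""
    return rate * prompts


def prob_at_least_one(rate, prompts):
    """Chance of at least one hallucination, assuming independent prompts."""
    return 1.0 - (1.0 - rate) ** prompts


# Lower bound of the reported 2026 range, with hypothetical prompt volumes.
low_rate = 0.0462
print(expected_hallucinations(low_rate, 1000))  # roughly 46 prompts
print(prob_at_least_one(low_rate, 50))          # roughly 0.9
```

Even at the bottom of the reported range, a team issuing a few dozen dependency-bearing prompts should expect to see a phantom name.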
The important change is automation. In a copy-paste workflow, a human may notice a strange package name. In an agentic workflow, the assistant may add the dependency, update the manifest, run the package manager, execute tests, and proceed. The human review may happen after the package has already run install scripts.
That is why slopsquatting belongs next to package-manager policy, not only prompt hygiene.
Prompting the model to “use real packages only” is weak, because models can be confidently wrong. Registry validation is stronger. Before installation, a package name suggested by a model should be looked up in the official registry and compared against internal allowlists and lockfile policy, and the package itself should be assessed by age, maintainer history, download pattern and known advisories.
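A registry check of this kind might look like the following sketch for PyPI. It uses the public PyPI JSON endpoint (`https://pypi.org/pypi/<name>/json`), which returns 404 for unregistered names. The function names, the status labels and the allowlist shape are assumptions made for illustration, and maintainer history, download patterns and advisories are left out.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from datetime import datetime, timezone

PYPI_URL = "https://pypi.org/pypi/{name}/json"


def fetch_pypi_metadata(name):
    """Return parsed PyPI JSON metadata, or None if the name is unregistered."""
    url = PYPI_URL.format(name=urllib.parse.quote(name, safe=""))
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def first_upload_time(metadata):
    """Earliest file upload time across all releases, or None if no files."""
    times = []
    for files in metadata.get("releases", {}).values():
        for entry in files:
            stamp = entry.get("upload_time_iso_8601")
            if stamp:
                # Normalize a trailing "Z" for older Python versions.
                times.append(datetime.fromisoformat(stamp.replace("Z", "+00:00")))
    return min(times) if times else None


def validate_suggestion(name, allowlist, fetch=fetch_pypi_metadata, now=None):
    """Classify a model-suggested name before anything is installed.

    Status labels are hypothetical: "approved", "unregistered", "needs_review".
    """
    now = now or datetime.now(timezone.utc)
    if name.lower() in allowlist:
        return {"name": name, "status": "approved"}
    metadata = fetch(name)
    if metadata is None:
        # The name does not exist: a likely hallucination and a squat target.
        return {"name": name, "status": "unregistered"}
    first = first_upload_time(metadata)
    age_days = (now - first).days if first else None
    return {"name": name, "status": "needs_review", "age_days": age_days}
```

The `fetch` parameter is injectable so the check can run against an internal mirror or a test double instead of the public registry.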
The defense also has to treat newly created packages differently. A hallucinated name that appears in a model suggestion and was registered yesterday is not normal dependency churn. It is a risk signal. Package age is not proof of safety, but freshness plus AI suggestion plus low reputation is the kind of boring correlation that catches real attacks before a postmortem needs diagrams.
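That correlation can be expressed as a simple rule. The sketch below is illustrative only: the `DependencySignal` fields, the 30-day window and the download threshold are hypothetical values, not recommendations drawn from the studies.

```python
from dataclasses import dataclass


@dataclass
class DependencySignal:
    name: str
    age_days: int          # days since first publish
    ai_suggested: bool     # the name originated in an AI session
    weekly_downloads: int  # stand-in for reputation


def risk_reasons(sig, max_age_days=30, min_weekly_downloads=1000):
    """List which of the three signals fire for this dependency."""
    reasons = []
    if sig.ai_suggested:
        reasons.append("ai_suggested")
    if sig.age_days < max_age_days:
        reasons.append("recently_published")
    if sig.weekly_downloads < min_weekly_downloads:
        reasons.append("low_downloads")
    return reasons


def should_quarantine(sig, **thresholds):
    """Quarantine only when all three signals coincide."""
    return len(risk_reasons(sig, **thresholds)) == 3
```

No single signal is decisive on its own; the rule fires only on the combination.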
The second control point is the agent runtime.
If a coding agent can run npm install or pip install without a policy gate, slopsquatting becomes an execution path. The package manager is not just downloading code. It resolves transitive dependencies and may run lifecycle scripts, native builds and post-install hooks. That is a large privilege handoff for a name the model just invented.
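A policy gate in the agent runtime could be sketched as follows. This is a simplified illustration, not a complete parser: it assumes the runtime can intercept the command string before execution, recognizes only direct `pip`/`pip3`/`npm` invocations, and treats every non-flag argument as a package spec. That last choice fails closed: a file path passed to `-r` is not on the allowlist, so it is denied and goes to a human. Installs from an existing manifest (a bare `npm install`) are out of scope here.

```python
import re
import shlex

# Install verbs per tool; "i" and "add" are npm aliases for "install".
INSTALL_VERBS = {
    "pip": {"install"},
    "pip3": {"install"},
    "npm": {"install", "i", "add"},
}


def base_name(spec):
    """Strip version constraints from a pip or npm package spec."""
    if spec.startswith("@"):
        # Scoped npm package such as @scope/pkg@1.0
        head, _, _ = spec[1:].partition("@")
        return "@" + head
    return re.split(r"[=<>!~@\[;]", spec, maxsplit=1)[0]


def requested_packages(command):
    """Return package specs named in an install command, else an empty list."""
    tokens = shlex.split(command)
    if len(tokens) < 2:
        return []
    tool, verb = tokens[0], tokens[1]
    if verb not in INSTALL_VERBS.get(tool, set()):
        return []
    return [tok for tok in tokens[2:] if not tok.startswith("-")]


def gate(command, approved):
    """Return ("allow", []) or ("deny", [unapproved specs]) for a command."""
    blocked = [
        spec for spec in requested_packages(command)
        if base_name(spec).lower() not in approved
    ]
    return ("deny", blocked) if blocked else ("allow", [])
```

A runtime would call `gate` before handing the command to a shell and route any `"deny"` result to human confirmation.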
The third control point is memory: what the repository remembers after the model session ends.
AI coding tools often operate inside repositories with existing manifests, lockfiles, internal package names and MCP or plugin configuration. A poisoned dependency suggestion can become persistent project state. Once committed, it may be installed by CI, by another developer, or by a deployment pipeline that never saw the original model interaction.
That is the part teams underestimate. The hallucination is temporary. The manifest change is durable.
The Implications
The practical fix is not banning AI coding assistants.
It is refusing to let them create dependency trust by assertion.
Any AI-generated package addition should pass through deterministic checks. Does the package exist in the expected registry? Is the name already approved internally? How old is it? Who maintains it? Did it appear in the lockfile before the model suggested it? Does it run install scripts? Does it request suspicious permissions or ship obfuscated code? Is there a safer standard-library or already-approved dependency path?
For enterprises, this belongs in developer-platform policy. CI should reject unapproved new dependencies. Agent runtimes should require confirmation before package installation. Internal mirrors should quarantine first-time packages. Security tools should flag dependency additions that originated in AI sessions.
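As one illustration of the CI rule, the sketch below compares a `package.json` from the base branch with the one in the proposed change and fails when a new dependency is not on an approved list. How the two manifest texts and the `approved` set are obtained is left as an assumption; the function names are hypothetical.

```python
import json

# Standard package.json sections that declare dependencies.
DEP_SECTIONS = (
    "dependencies",
    "devDependencies",
    "optionalDependencies",
    "peerDependencies",
)


def declared_dependencies(package_json_text):
    """Return the set of dependency names declared in a package.json text."""
    manifest = json.loads(package_json_text)
    names = set()
    for section in DEP_SECTIONS:
        names.update(manifest.get(section, {}))
    return names


def added_dependencies(before_text, after_text):
    """Names present in the new manifest but absent from the old one."""
    before = declared_dependencies(before_text)
    after = declared_dependencies(after_text)
    return sorted(after - before)


def ci_check(before_text, after_text, approved):
    """Return a process exit code: 0 if all additions are approved, else 1."""
    unapproved = [
        name for name in added_dependencies(before_text, after_text)
        if name not in approved
    ]
    for name in unapproved:
        print(f"unapproved new dependency: {name}")
    return 1 if unapproved else 0
```

Because the check runs on the manifest rather than the chat transcript, it catches the dependency no matter who or what added it.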
For open-source maintainers and registry operators, slopsquatting creates an abuse-class problem. Defensive registration can help in narrow cases, but it does not scale to hundreds of thousands of plausible hallucinations. Better signals are needed: namespace reputation, rapid takedown paths, suspicious first-publish detection, and telemetry that distinguishes organic adoption from AI-driven install bursts.
For model vendors, the benchmark should not stop at “does the code compile.” Dependency realism is now part of secure code generation. A model that writes correct business logic but invents a package name has not solved the task. It has moved the risk into the package manager.
The useful mental model is simple.
Generated code is not complete until its dependencies are real, approved and safe to install.
Slopsquatting is the reason that sentence needs to be policy, not advice.