Fake AI Agent Skill Passed Security Scans and Reportedly Reached 26,000 Agents

Severity
Medium

Detail

Security researchers at AIR demonstrated a supply chain risk in AI agent ecosystems by creating a fake AI agent skill named brand-landingpage that passed multiple security scanning tools and was distributed through a popular skill marketplace. According to AIR, the skill reached approximately 26,000 AI agents, including some associated with corporate environments.

The purpose of the experiment was to evaluate the effectiveness of current AI skill vetting processes and trust mechanisms. The skill appeared legitimate, claiming to help users create landing pages using Google’s Stitch design platform. To increase credibility, AIR leveraged trusted ecosystem signals, including publishing the skill through a popular GitHub repository with thousands of stars and obtaining clean results from multiple security scanners.

Although the proof-of-concept payload was intentionally harmless and only collected user email addresses, researchers demonstrated that a malicious actor could potentially use the same technique to execute arbitrary code, steal sensitive information, access internal resources, or compromise enterprise AI agent environments.

How?

The attack exploited a fundamental weakness in how AI agent skill scanners validate packages. The submitted skill package contained no malicious code and therefore passed checks by skill-scanning tools from Cisco, NVIDIA, and other vendors. Rather than carrying a payload, the skill instructed AI agents to obtain installation instructions from an external website controlled by the researchers.

Initially, the external website redirected users to legitimate documentation, allowing the skill to pass all security reviews and gain adoption. After the skill was installed by a significant number of users, the researchers modified the content hosted on the external website without changing the skill package itself. The updated instructions directed agents to download and execute additional code. In the demonstration, the downloaded code only transmitted user email addresses back to the researchers; however, the same technique could be abused to execute malicious commands, access sensitive files, exfiltrate data, or interact with internal systems accessible to the AI agent.

The attack highlights a key limitation of current AI skill scanning technologies, which typically analyze only the submitted package and not external resources referenced by the skill. As a result, malicious content can be introduced after approval by modifying externally hosted resources.
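As an illustration of the gap described above, the following minimal sketch lists every external URL referenced inside a skill package so that a reviewer can see what a package-only scan leaves unexamined. It is not the method used by AIR or by any named scanner. The function names and the URL pattern are assumptions made for this example, and the pattern is deliberately simple, so it may miss obfuscated or dynamically constructed links.

```python
import re
from pathlib import Path

# Simple http(s) URL pattern (an assumption for this sketch); it can over- or under-match.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)\]]+")


def find_external_urls(text: str) -> list[str]:
    """Return the unique URLs found in text, in first-seen order."""
    seen: dict[str, None] = {}
    for match in URL_PATTERN.findall(text):
        # Drop trailing sentence punctuation that the pattern may have captured.
        seen.setdefault(match.rstrip(".,;:"), None)
    return list(seen)


def scan_skill_directory(skill_dir: str) -> dict[str, list[str]]:
    """Map each readable text file under skill_dir to the URLs it references."""
    findings: dict[str, list[str]] = {}
    for path in sorted(Path(skill_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        urls = find_external_urls(text)
        if urls:
            findings[str(path)] = urls
    return findings
```

Each URL reported this way is content that can change after approval and would therefore need its own review.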

Recommendation

Organizations should treat AI agent skills and extensions with the same level of scrutiny applied to traditional software packages. Security reviews should not be limited to the skill package itself but should also include all external URLs, installation instructions, scripts, and dependencies referenced by the skill. Any externally hosted content should be continuously monitored for changes, as security validation performed during initial approval may become ineffective if linked resources are modified afterward.
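One possible way to monitor externally hosted content for changes is to record a SHA-256 digest of each referenced URL at approval time and compare it on a schedule. The sketch below assumes a baseline dictionary mapping URLs to approved digests; the function names are hypothetical. It treats a failed fetch as a change so that the URL is reviewed. A server that serves different content to the monitor than to agents would not be caught by this check alone.

```python
import hashlib
import urllib.request
from typing import Callable


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download the current content at url (redirects are followed)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def content_digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of content."""
    return hashlib.sha256(content).hexdigest()


def detect_changes(
    baseline: dict[str, str],
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> list[str]:
    """Return the URLs whose current digest differs from the approved one."""
    changed = []
    for url, approved_digest in baseline.items():
        try:
            current = content_digest(fetch(url))
        except OSError:
            # Network and HTTP errors are flagged so the URL gets re-reviewed.
            changed.append(url)
            continue
        if current != approved_digest:
            changed.append(url)
    return changed
```

Any URL returned by detect_changes would indicate that the approval decision no longer covers what agents are actually being served.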

Organizations should establish centralized approval processes for AI agent skills, allowing only vetted and trusted skills to be deployed within enterprise environments. Version pinning should be enforced, and it should cover externally referenced content as well as the skill package, since in this case the package itself never changed. AI agents should operate under the principle of least privilege, limiting their access to sensitive data, internal systems, and business applications. Additionally, organizations should regularly inventory installed skills, monitor agent activity for unusual behavior, and continuously reassess approved skills to detect post-approval modifications that may introduce malicious functionality.
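A centralized approval process with version pinning could be sketched as an allowlist that maps each approved skill name to the digests of the exact package contents that were reviewed. The allowlist structure and the function names below are assumptions for illustration, not a feature of any particular marketplace or agent platform.

```python
import hashlib
from pathlib import Path


def package_digest(skill_dir: str) -> str:
    """Return a SHA-256 digest over every file path and file body in skill_dir."""
    root = Path(skill_dir)
    hasher = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so that file boundaries are unambiguous.
        hasher.update(len(rel).to_bytes(8, "big"))
        hasher.update(rel)
        hasher.update(len(data).to_bytes(8, "big"))
        hasher.update(data)
    return hasher.hexdigest()


def is_approved(name: str, digest: str, allowlist: dict[str, set[str]]) -> bool:
    """Return True only if this exact package digest was approved for the skill."""
    return digest in allowlist.get(name, set())
```

Pinning the package in this way would not have stopped the technique described above on its own, because the skill package was never modified; it is only effective when combined with review and monitoring of the external content the skill references.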

Source

https://thehackernews.com/2026/06/fake-ai-agent-skill-passed-security.html