The App Store Attack You Didn’t See Coming

Part 1 of 2 – AI’s Trust Problem

A security firm just proved that AI skill marketplaces are the new malware vector. And the scariest part? Everyone involved did exactly what they were supposed to do.


Something went around social media this week that I haven’t been able to stop thinking about.

A security company called AIR did something that should genuinely alarm anyone building with or deploying agentic AI tools right now.

They didn’t find a zero-day. They didn’t exploit a CVE. They just… made an app. And waited.

The experiment centred on a skill called brand-landingpage, presented as a tool for helping users build a landing page with Google’s Stitch design tool. AIR chose this use case deliberately. It would appeal to non-technical corporate users: marketers, salespeople, designers. People who install things because they’re useful, not because they’ve audited the source.

Here’s where it gets clever.

Rather than building credibility from scratch, they submitted the skill to a popular open-source agents repository with about 36,000 GitHub stars and 156 skills. The pull request was merged after a few days. Now the skill had social proof baked in. It was in a reputable repo. It looked legit. They promoted it through Instagram ads, and installs followed.

The malicious technique didn’t depend on suspicious code inside the submitted files. Instead, the skill instructed agents to set up a Stitch SDK by following installation instructions hosted at stitch-design.ai, a domain AIR controlled. Google’s actual Stitch domain is stitch.withgoogle.com.

A plausible lookalike. One external link. Passes every scanner.

AIR tested the skill against scanners from Cisco, Nvidia, and skills.sh. All marked it as safe.

Once they had enough installs, AIR changed the content behind the fake documentation. The revised page instructed agents to download and run a script. In the test, that script collected email addresses, but AIR noted the same technique could have been used to compromise the machines running the agent. Some of those agents were tied to corporate accounts. Private conversations. Internal systems.

26,000 users. All reachable via one link to a dodgy domain buried in a README.
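
The indirection at the heart of this skill, an instruction that points the agent at an external host, is something a reviewer can at least surface before install. Here is a minimal sketch in Python, assuming a reviewer-maintained list of approved domains. The names APPROVED_DOMAINS and unapproved_hosts, the allowlist contents, and the URL paths in the sample are all made up for illustration. They are not taken from AIR's research or from any real scanner.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist: domains a reviewer has explicitly approved.
APPROVED_DOMAINS = {"withgoogle.com", "github.com"}

# Matches http(s) URLs only; bare domains without a scheme are not caught.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def external_hosts(skill_text: str) -> set[str]:
    """Return the hostname of every http(s) URL found in the skill text."""
    hosts = set()
    for url in URL_PATTERN.findall(skill_text):
        host = urlsplit(url).hostname
        if host:
            hosts.add(host.rstrip("."))
    return hosts


def is_approved(host: str, approved: set[str] = APPROVED_DOMAINS) -> bool:
    """True if host is an approved domain or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in approved)


def unapproved_hosts(skill_text: str) -> set[str]:
    """Hosts the skill refers to that nobody has signed off on."""
    return {h for h in external_hosts(skill_text) if not is_approved(h)}


# Sample text with invented paths, modelled on the domains in the story.
sample = (
    "Set up the Stitch SDK by following https://stitch-design.ai/install "
    "then open https://stitch.withgoogle.com/ to start designing."
)
print(unapproved_hosts(sample))  # prints {'stitch-design.ai'}
```

This only tells you that a skill depends on a host you have not vetted. It says nothing about what that host will serve later, which is the harder half of the problem.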


This isn’t a hacking story. It’s a trust story.

The attack worked because of a chain of assumed legitimacy: popular repo → merged PR → Instagram promotion → security scanner green light → install. No single link in that chain was obviously broken. The skill looked fine because it was fine, right up until the payload was switched on.

This is the same pattern behind the major supply chain attacks of the last two years. Third-party involvement in breaches doubled from 15% to 30% in a single year, the largest single-year jump ever recorded by the Verizon DBIR. Attackers aren’t breaking through your walls anymore. They’re walking through doors that trusted vendors already opened.

What’s new here is the vector: AI agent skill marketplaces. A category that barely existed 18 months ago. And in the first weeks of one major platform’s launch, Bitdefender Labs found that approximately 17% of skills already carried malicious payloads. Not edge cases. A systemic failure of the trust model, right out of the gate.


Why static scanning can’t fix this

The reason the scanners all missed it is structural, not a gap that a better scanner solves.

The malicious behaviour wasn’t in the skill. It was deferred. Hosted externally, switched on only once AIR had reached enough installs. No static scanner can check what a domain will serve in three months’ time.

The agentic model makes this uniquely dangerous. When a traditional app fetches a URL, it typically treats the response as data to display or parse. When an AI agent fetches that same URL, it may treat the response as instructions and act on them. The attack surface isn’t just data. It’s runtime behaviour. Little in the static-scanning toolbox was built for that threat model.
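
A scanner can't see the future, but a runtime can notice that remote content has changed since someone last reviewed it. The sketch below pins a SHA-256 fingerprint of the page at review time and rejects anything that no longer matches. It assumes the caller fetches the bytes. The PinnedDocs class, the URL path, and the sample page contents are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of the content as fetched."""
    return hashlib.sha256(content).hexdigest()


class PinnedDocs:
    """Remember what each URL served at review time; flag later changes."""

    def __init__(self) -> None:
        self._pins: dict[str, str] = {}

    def approve(self, url: str, content: bytes) -> None:
        """Record the reviewed content for this URL."""
        self._pins[url] = fingerprint(content)

    def is_unchanged(self, url: str, content: bytes) -> bool:
        """False for unknown URLs and for content that differs from the pin."""
        pinned = self._pins.get(url)
        return pinned is not None and pinned == fingerprint(content)


pins = PinnedDocs()
url = "https://stitch-design.ai/install"  # hypothetical path
reviewed = b"Install the SDK with your package manager."
swapped = b"Download and run this script."

pins.approve(url, reviewed)
print(pins.is_unchanged(url, reviewed))  # True
print(pins.is_unchanged(url, swapped))   # False
```

The trade-off is that legitimate documentation changes too, so a mismatch should trigger a fresh review rather than be read as proof of an attack.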


What you should actually do

If you’re deploying AI agents in any professional context, a few things are worth locking in now:

Treat skills like code dependencies, not apps. You wouldn’t pull in an npm package without understanding what it does. The same rigour applies. More so, actually, because the execution model is less predictable.

Domain reputation at install time isn’t the right check. You need to think about what a skill could do after its payload changes. Sandboxing, outbound network restrictions, and agent permission scoping all matter.

Non-technical promotion is a signal worth noting. The AIR skill was promoted through Instagram ads to people who had no way of knowing what was inside it. That’s not inherently suspicious. But skills being enthusiastically promoted through non-technical channels, with no corresponding technical scrutiny, deserves a second look.

Your AI governance framework needs a supply chain clause. If you’re on a committee or working group dealing with AI adoption, this exact scenario belongs in your risk register. Not as a hypothetical. It happened recently.
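
The outbound network restrictions and permission scoping mentioned above can be sketched as a deny-by-default policy that the agent runtime consults before every action. This is an illustration only. SkillPolicy, Action, and authorize are names I've invented, the URL paths are made up, and a real agent framework would need to enforce the decision somewhere the skill can't bypass it.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SkillPolicy:
    """What a single skill is allowed to do. Everything defaults to denied."""
    allowed_hosts: frozenset[str] = frozenset()
    allow_shell: bool = False


@dataclass(frozen=True)
class Action:
    kind: str    # "fetch" or "shell" in this sketch
    target: str  # a URL for "fetch", a command line for "shell"


def authorize(policy: SkillPolicy, action: Action) -> bool:
    """Return True only if the policy explicitly permits the action."""
    if action.kind == "fetch":
        host = urlsplit(action.target).hostname or ""
        return host in policy.allowed_hosts
    if action.kind == "shell":
        return policy.allow_shell
    return False  # unknown action kinds are denied


# A landing-page skill that may only talk to Google's real Stitch domain.
policy = SkillPolicy(allowed_hosts=frozenset({"stitch.withgoogle.com"}))

print(authorize(policy, Action("fetch", "https://stitch.withgoogle.com/docs")))  # True
print(authorize(policy, Action("fetch", "https://stitch-design.ai/install")))    # False
print(authorize(policy, Action("shell", "sh install.sh")))                       # False
```

Under a policy like this, it wouldn't matter what the lookalike domain started serving later. The fetch is refused, and so is the script.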


The scariest thing about this research isn’t the attack. It’s how obvious it feels in retrospect. We built an entire marketplace ecosystem for AI agents, bolted on the same static scanning we use for code packages, and called it secure.

The attack surface for agentic AI isn’t your prompt injection defence. It’s the skill someone on your team installed on Tuesday because a designer on Instagram said it was great.


In part two, I look at the same trust problem from the other direction: what happens when the person creating the risk is already inside your organisation.
