
The AI Moat Illusion: Why Your Competitive Advantage Isn't the Model

Frontier model capability is commoditizing faster than anyone expected. The teams still building their strategy around "we use the best model" are building on sand. Here's what the real moats look like.

In 2021, having API access to GPT-3 was a genuine competitive advantage. You were one of a few thousand companies with access to a capability that others didn't have. By 2023, GPT-4 access was available to anyone with a credit card. By 2024, open-source models were delivering GPT-3.5-class performance freely, and the gap between open-source and frontier closed from enormous to significant to — in many task categories — negligible.

We're in 2025. Llama 3.1 70B scores within 5% of GPT-4 on the MMLU benchmark. The price per token for GPT-3.5-class capability has dropped roughly 97% since 2020, from $0.02 per thousand tokens to under $0.001. Mistral, Qwen, Gemma, and a dozen other model families are competitive in specific domains. Every cloud provider has a model API. The defining capability of 2021 has become infrastructure, priced accordingly, and a differentiator for no one.

If your AI strategy is built around "we use the best model," you're building a strategy that depreciates every six months. Here's what actually creates durable competitive advantage in an AI-commoditized world, and what technical leaders should be building toward right now.

01 — Historical Parallel

The Commoditization Curve

What's happening in AI is not unprecedented. It's the same commoditization curve that has played out across every major technology platform transition. Understanding the historical pattern gives you a map for what comes next.

Mainframe computers, 1960s-70s: owning a mainframe was a genuine competitive advantage. IBM charged accordingly — a single System/360 mainframe could cost $2.5 million in 1960s dollars. Companies with mainframes could do things competitors could not. The moat was hardware access.

Minicomputers, 70s-80s: compute became cheaper and more accessible. The moat shifted from hardware ownership to software — who had the best applications running on the newly accessible hardware. Hardware became a commodity.

Personal computers, 80s-90s: compute became nearly universal. The moat shifted again, from software to distribution — who owned the desktop, the retail channel, the enterprise relationship. Microsoft won by controlling the OS layer that sat between commoditized hardware and the applications running on it.

Cloud computing, 2010s: on-demand infrastructure access commoditized servers, storage, and networking. The moat shifted once more — from infrastructure to the applications and data built on top of it. AWS, Azure, and GCP became commodity infrastructure. The value is in what runs on them, not the infrastructure itself.

AI is following the same curve, compressed by roughly 5x due to the pace of open-source development and the economics of transformer scaling. The model capability layer is commoditizing. The value will concentrate in the application layer — in the workflow integrations, the proprietary data, the trust relationships, and the network effects that sit above the model. This is where technical leaders should be building. Not at the model layer.

97% price drop per token, GPT-3.5 class, 2020–2024
5% benchmark gap, Llama 3.1 70B vs GPT-4 on MMLU
6mo typical cycle for model advantage to become table stakes

02 — Where Value Lives

What the Application Layer Is Worth

Andreessen Horowitz wrote in 2023 that "AI applications capture most of the value in the stack, not infrastructure." The evidence supports this when you look at which companies are building durable businesses versus which are locked in a continuous capability arms race.

Cursor built a code editor that deeply integrates AI into the development workflow. Its competitive advantage isn't that it uses a better model than GitHub Copilot — both use frontier models. The advantage is workflow ownership: Cursor is where the code lives, where the context accumulates, where the developer's attention already is. Switching from Cursor to another tool has behavioral costs that go far beyond switching between two API providers.

GitHub Copilot has an even more durable position. It's embedded in VS Code, which is used by the majority of professional developers. Every developer who uses Copilot does so inside the tool they already use for everything else. The distribution moat — owning where developers work — is hard to dislodge. A competitor with a better model would need to also win the developer environment battle. That's a much higher bar than "better model."

Harvey, the legal AI company, has built around domain trust. Law firms that use Harvey see AI recommendations through the lens of Harvey's domain expertise and track record in legal work. The accumulation of trust in a specific domain — and the demonstrated accuracy that builds that trust over time — creates switching costs that are about human confidence, not just contractual lock-in. A generic LLM with better benchmark scores wouldn't automatically displace Harvey, because "better on benchmarks" isn't the same as "trusted in this specific legal domain where we've been using Harvey for two years." Those are fundamentally different things.

"The model is a commodity input. The application layer — the workflow, the data, the trust — is where value is actually captured. The companies that know this are building; the ones that don't are renting capability that will be commoditized under them."

03 — The Four Real Moats

The Four Real Moats in AI

Moat 1: The Data Flywheel

The most durable competitive advantage in AI is proprietary data — specifically, data that improves with each user interaction and that competitors cannot access or replicate.

Waymo is the canonical example. By early 2024, Waymo's autonomous vehicles had logged over 20 million miles of real-world driving data. That data — the edge cases, the near-misses, the unusual scenarios, the geographic diversity — is the training substrate for their models. A new entrant to autonomous driving doesn't just need to build the technology stack. They need to generate millions of miles of real-world data to train competitive models. That takes years that cannot be bought, regardless of funding.

For enterprise AI, the data flywheel plays out at a smaller but still significant scale. A company that has been processing customer loyalty transactions for ten years has behavioral data about customer purchasing patterns that competitors launching today cannot access. That data, used to train or fine-tune AI models for personalization, creates outputs that are more accurate than anything a generic model can produce — because the generic model doesn't know your customers. It can't.

The key pattern: the flywheel requires that each user interaction generates data that flows back into improving the system. This doesn't happen automatically. It requires deliberate architecture from the start. Every AI product team should ask: "What data does this interaction generate, and how does it flow back to improve future outputs?" If the answer is "it doesn't," you're not building a flywheel. You're running a service that competitors can replicate on a weekend.

Moat 2: Workflow Ownership

If your AI is embedded in the workflow that users already live in, the switching cost is behavioral rather than contractual. Behavioral switching costs are harder to overcome than contractual ones, because they require changing habits and relearning patterns that users have developed over months of daily use. People don't abandon tools they've built mental models around, even when a better-on-benchmarks alternative exists.

The depth of workflow integration matters a lot. An AI feature that sits adjacent to a workflow — a chatbot you can ask questions to — is more replaceable than an AI feature that runs inside the workflow (code suggestions appearing inline in your editor as you type). The difference is how completely the AI's removal would disrupt the user's daily work.

When designing AI features, ask where in the workflow the AI lives. The closer to the user's core work, the higher the integration depth, the harder the switching costs. An AI that reads your email and surfaces relevant context as you're composing a reply is more deeply embedded than an AI you open in a separate tab. Both use similar model capabilities. The integration depth is what creates the moat, not the model quality.

Moat 3: Domain Trust

Trust is earned slowly and lost quickly, and in AI it has a specific accumulation mechanism: accurate domain-specific performance over time. A model that is 95% accurate in a domain is not just 5% better than one that is 90% accurate — it's trusted in ways the 90% model isn't. That gap is qualitative, not just quantitative.

Research on human trust in AI systems (Dietvorst et al., "Algorithm Aversion," Journal of Experimental Psychology: General, 2015) found that humans will switch to a human advisor over an algorithm when the algorithm makes a visible error, even if the algorithm's overall error rate is lower. People forgive human mistakes in a way they don't forgive algorithmic mistakes. Domain trust has to be built carefully — not by claiming accuracy, but by demonstrating it repeatedly in contexts where users can verify it themselves.

Epic, the dominant electronic health records system, benefits from this trust mechanism in healthcare AI. When Epic's AI surfaces a clinical recommendation, physicians engage with it more seriously than they would with a recommendation from a generic AI assistant — because Epic's recommendations come with the implicit endorsement of the system that manages their patient data, built over years of domain-specific use. That trust took years to build. A competitor with a technically superior model cannot purchase it or skip the line.

Moat 4: Integration Depth

The fourth moat is the cumulative effect of deep integration across multiple systems. An AI that can access your email, calendar, project management tool, CRM, and code repository simultaneously is exponentially more useful than one that can access any single one of those — not just because it knows more, but because the combination reveals relationships and patterns that no individual system exposes. That's where the really useful inferences live.

The classic example is how Notion AI compares to a generic AI chatbot. Notion AI knows what you wrote, when you wrote it, how you organize your thinking, what projects you're working on, what you've been researching. A generic AI knows only what you tell it in the current session. The information advantage compounds with every document you write in Notion — which makes it harder to replace, not easier.

The replacement cost of a deeply integrated AI isn't the migration of the AI itself. It's the migration of everything the AI is connected to — all the workflows, data sources, and habits that have formed around it. This is why integration depth creates a moat: it converts model capability (replaceable) into workflow dependency (not easily replaced). Very different things.

Evaluating Your Own Moat

Ask these four questions about your AI product: (1) Does every user interaction generate data that makes future outputs better? If yes, you have a data flywheel. (2) Is the AI embedded inside the workflow, or adjacent to it? The more embedded, the harder it is to replace. (3) Have you built a track record of accuracy in a specific domain that users have verified and come to trust? (4) How many other systems does your AI read from and write to? More integrations mean higher switching costs. Score yourself honestly.

04 — What to Build Now

What Technical Leaders Should Do Right Now

Strategic insight is only useful if it translates to action. Here are the concrete moves I'd make if I were leading product or engineering for an AI-enabled company right now.

Audit Your Value Proposition

What is your AI product actually providing? List the specific capabilities users care about. For each capability, ask: does this require a specific model, or would any competent model deliver the same experience? If the answer is "any competent model," you don't have a model moat — you have an application. That's fine, but know it clearly, because it tells you where to invest and where not to.

Build Your Data Flywheel Deliberately

Every AI product generates data. Most products don't use it to improve. The deliberate move is to instrument your AI features to capture signal from user interactions: what did users accept, what did they edit, what did they reject, what did they ask as follow-up questions? This signal is training data for future model improvement. The companies building this infrastructure now will have meaningfully better models in two years — not because they invented better training techniques, but because they had better proprietary data to train on. That advantage compounds quietly.
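To make that concrete, here is a minimal sketch of what interaction-level instrumentation could look like. The schema and names below (InteractionEvent, log_interaction, the user_action values) are illustrative assumptions rather than any standard; the point is that acceptances, edits, and rejections get captured as structured signal instead of evaporating.

from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

# Illustrative schema for capturing the signal each AI interaction produces.
# Field names and the user_action values are assumptions, not a standard;
# adapt them to whatever your product actually surfaces.
@dataclass
class InteractionEvent:
    user_id: str
    feature: str           # e.g. "draft_reply" or "code_suggestion"
    prompt: str            # what was sent to the model (redact PII as needed)
    model_output: str      # what the model returned
    user_action: str       # "accepted", "edited", "rejected", or "ignored"
    final_text: str        # what the user actually kept, empty if nothing
    model_version: str
    timestamp: str

def log_interaction(event: InteractionEvent, path: str = "interactions.jsonl") -> None:
    # Append one interaction to a JSONL log that later feeds evals and fine-tuning.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(event)) + "\n")

# Usage: record that the user edited a suggested reply before sending it.
log_interaction(InteractionEvent(
    user_id="u_123",
    feature="draft_reply",
    prompt="Summarize this thread and draft a reply",
    model_output="Hi Sam, thanks for the update. Attached is the revised plan.",
    user_action="edited",
    final_text="Hi Sam, thanks. Let's move the review to Thursday instead.",
    model_version="provider-x-2025-05",
    timestamp=datetime.now(timezone.utc).isoformat(),
))

Even a log this simple is enough to answer the flywheel question honestly: if nothing reads from it, nothing is compounding.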

Don't Build Model Infrastructure Unless It Is Your Core Product

This bears repeating because the temptation is constant and I understand why. Building and maintaining model infrastructure — training pipelines, evaluation harnesses, deployment infrastructure, safety systems — is a full-time job for a specialized team. Unless your business value comes directly from model differentiation (Midjourney, ElevenLabs, Stability AI), this investment competes with the application-layer work that actually creates your competitive advantage. Use the APIs. Invest the savings in your data flywheel, your workflow integration, and your evaluation infrastructure.

Invest in Evaluation Infrastructure

The ability to rapidly evaluate AI output quality is a durable competitive advantage. Teams with strong eval infrastructure can safely ship model updates, experiment with new models, and detect quality regressions before users do. Teams without eval infrastructure move slowly and get surprised by regressions in the worst possible moment. As models continue to improve and change, the ability to quickly evaluate "is this new model version better for our specific use case?" is worth more than any specific current model capability.
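Here is a sketch of the minimum viable version, assuming you maintain a small golden set of representative inputs and a task-specific scoring function (exact match, rubric grading, whatever fits your product). Everything in it, including the example data and the call_model parameter, is a placeholder for your own client code rather than a real evaluation suite.

from statistics import mean
from typing import Callable

# A small golden set of representative inputs with reference outputs.
# In practice this is curated from real traffic, not invented.
GOLDEN_SET = [
    {"input": "Summarize the refund policy for a frustrated customer.",
     "reference": "Refunds are available within 30 days with proof of purchase."},
    {"input": "Classify this ticket: 'My invoice total is wrong.'",
     "reference": "billing"},
]

def evaluate(call_model: Callable[[str], str],
             score: Callable[[str, str], float]) -> float:
    # Mean score over the golden set; higher is better for your chosen metric.
    return mean(score(call_model(ex["input"]), ex["reference"]) for ex in GOLDEN_SET)

# Before switching models, compare the incumbent and a candidate on your own data:
# baseline = evaluate(current_model, my_score_fn)
# candidate = evaluate(new_model, my_score_fn)
# Ship the candidate only if it is at least as good on the cases you care about.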

Think Carefully About Trust Architecture

How do you make users trust your AI? This is a design problem, not just a technical one. Trust comes from transparency (showing reasoning, not just conclusions), consistency (being reliably accurate over time), graceful failure handling (acknowledging uncertainty rather than generating confident nonsense), and domain specificity (being demonstrably better in your domain than a general-purpose tool). Design for trust explicitly, because it's the precondition for adoption at depth — and adoption at depth is what creates workflow integration moats.

05 — Predictions

The Winners and Losers

I'll be direct about where I think this is heading, because strategic honesty requires making predictions that can be proven wrong.

Winners

Companies with proprietary data and active flywheels — healthcare systems with patient outcome data, financial institutions with transaction history, retailers with purchase behavioral data, any company that has been systematically collecting domain-specific data and has the infrastructure to actually use it for AI improvement.

Companies with deep workflow integration — any AI product embedded in the daily workflow of its users. Cursor, GitHub Copilot, Notion AI, Salesforce Einstein are all positioned well here. The category of "AI embedded in the software developers and knowledge workers already use every day" is the highest-value position in the application layer. Hard to dislodge.

Companies with domain trust built over time — healthcare, legal, and financial AI products that have demonstrated accurate performance in regulated, high-stakes domains. These markets move slowly, but the trust moat in regulated domains is deep once established. The slow movement is the feature, not the bug.

Losers

Generic AI wrappers with no data advantage — products that are essentially "a nice UI wrapped around GPT-4" with no proprietary data, no workflow integration, and no trust accumulation. Every six months, a new model makes their wrapper less distinctive, and open-source catches up further. The unit economics of reselling commodity model capability without adding proprietary value are unsustainable. I've watched several of these run out of runway.

Companies betting on a single model provider — tight coupling to one model provider creates existential vendor risk. Model providers change pricing, deprecate models, and change terms of service. The architecture should abstract the model layer so that swapping providers is a configuration change, not a rebuild. If you can't swap your model in a day, you're too coupled.
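As a sketch of what "configuration change, not a rebuild" can mean in practice: route every model call through one thin interface and keep the provider choice in config. The class and method names below (ModelClient, complete, the provider registry) are illustrative assumptions, not any vendor's actual SDK.

from abc import ABC, abstractmethod

class ModelClient(ABC):
    # The only interface application code is allowed to depend on.
    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...

class HostedAPIClient(ModelClient):
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call the vendor SDK here")

class SelfHostedClient(ModelClient):
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call your own inference endpoint here")

PROVIDERS = {"hosted": HostedAPIClient, "self_hosted": SelfHostedClient}

def get_client(provider_name: str) -> ModelClient:
    # The provider name comes from config, so swapping is a one-line change.
    return PROVIDERS[provider_name]()

client = get_client("self_hosted")

The abstraction should stay deliberately thin; the goal isn't a universal SDK, just making sure no product code imports a vendor library directly.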

Teams without evaluation infrastructure — you cannot improve what you cannot measure. Teams shipping AI features without systematic evaluation are accumulating invisible quality debt. When the reckoning comes — in the form of user complaints, trust erosion, or competitive displacement — they won't have the infrastructure to understand what's wrong, let alone fix it quickly.

"In five years, the question won't be 'are you using AI?' — everyone will be. The question will be 'what do you have that compounds?' Data flywheels compound. Workflow trust compounds. Integration depth compounds. Model access doesn't."

Conclusion

The Model Is Not the Moat

The AI stack will continue to evolve. Models will get better, cheaper, and more accessible. The commoditization curve that dropped token prices 97% in four years isn't going to reverse. Open-source development will continue closing the gap with frontier models in most task categories. This is good for users and good for the industry — it means AI capabilities become broadly accessible, not hoarded by well-capitalized incumbents.

What it means for strategy is that model access is not a durable competitive advantage. It never really was, and it's less so with each model generation. The organizations building on model access as their primary competitive differentiation are running on a treadmill that gets faster every six months. They're working harder to stay in the same place.

The durable advantages are the ones that require time and user interaction to build: data flywheels filled with proprietary behavioral data, workflow ownership that creates behavioral switching costs, domain trust earned through demonstrated accuracy over time, and integration depth that makes replacement prohibitively disruptive. These take longer to build than hooking up an API. They're harder to describe in a press release. They're nearly impossible to replicate quickly by a competitor who decides they want to catch up.

The model is a commodity. What you build on top of it, what data flows through it, and how deeply it's embedded in how people work — that's the moat. Build accordingly.

RS Arun
Technical Leader · Built AI products at startup and enterprise scale · Co-founder & CTO at Rareblue Technologies (backed by 100xVC)