AI Runs on Trusted Data — And Publishers Hold the Keys to Its Future
Artificial intelligence is reshaping how people seek and consume information. But AI runs on trusted data, and trusted data originates with publishers.
Preventing the Content War Before It Really Starts
Gary Newcomb
CTO & Co-Founder, FetchRight
Cloudflare's new Pay-Per-Crawl model marks a critical turning point in how content creators and AI companies interact. It blocks unauthorized crawlers by default and allows monetized, permission-based access. It's a powerful assertion of control—and a long-overdue response to unlicensed data scraping.
But here's the deeper question:
Are we setting the stage for a functional ecosystem, or firing the first shots in a content war?
Imagine asking someone to buy a car hidden behind curtain #1. No spec sheet, no test drive. That's effectively what we're asking AI systems to do now: pay before they can even assess the relevance or value of the content.
That dynamic risks creating a high-friction, closed ecosystem—one where discovery is throttled, access is opaque, and smaller publishers lose visibility because AI systems can't afford to gamble on unknown content.
If we're not careful, we'll replace the open web with a fragmented patchwork of paywalls, scraping battles, and blacklists. And the ones who lose won't be the platforms—it'll be the users.
This is our opportunity to prevent a war before it really starts. Not by picking sides—but by designing smarter systems.
Let's imagine a peek-then-pay framework that lets AI systems make informed decisions and content creators retain control:
AI crawlers can see titles, tags, authors, product names, and brief abstracts: enough to assess relevance without giving away full content.
A short snippet (e.g., 100–200 tokens) or structured headers let crawlers evaluate tone, quality, and intent—without undermining monetization.
Publishers define what uses are allowed—indexing, summarization, training, or real-time responses—and at what price.
Trustworthy agents (from OpenAI, Perplexity, etc.) get scaled access with authentication, billing, and transparent logging. Bad actors get blocked.
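The components above can be sketched in code. This is a minimal illustration, not any real Cloudflare or FetchRight interface: the field names, snippet limit, and pricing structure are all assumptions chosen to show how a free "peek" view could expose metadata and a token-limited snippet while keeping the full body behind the paywall.

```python
from dataclasses import dataclass

# Hypothetical sketch of a peek-then-pay response. None of these field
# names come from Cloudflare or FetchRight; they illustrate exposing just
# enough metadata for a crawler to assess relevance before paying.

SNIPPET_WORD_LIMIT = 150  # within the 100-200 token range suggested above

@dataclass
class PeekResponse:
    title: str
    tags: list
    author: str
    snippet: str        # truncated preview, never the full body
    allowed_uses: dict  # use -> price per request (USD), publisher-defined

def build_peek(article: dict, pricing: dict) -> PeekResponse:
    """Return the free 'peek' view of an article: metadata plus a
    word-limited snippet. Full content stays behind the paywall."""
    words = article["body"].split()
    snippet = " ".join(words[:SNIPPET_WORD_LIMIT])
    return PeekResponse(
        title=article["title"],
        tags=article["tags"],
        author=article["author"],
        snippet=snippet,
        allowed_uses=pricing,
    )

article = {
    "title": "Best Budget Laptops of 2025",
    "tags": ["laptops", "reviews"],
    "author": "Jane Doe",
    "body": "Full review text " * 500,  # stand-in for the real article
}
pricing = {"summarization": 0.002, "training": 0.05, "realtime": 0.01}

peek = build_peek(article, pricing)
print(len(peek.snippet.split()))  # never exceeds the word limit
```

A real deployment would serve this over authenticated HTTP with billing attached; the point here is only that the peek is strictly smaller than the product being sold.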
Take a site like PCMag—rich editorial content, deep archives, trusted recommendations. Under a "peek-then-pay" model, AI systems could intelligently select what content to pay for, enabling discovery and driving compensation back to the publisher.
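The crawler side of that decision can be sketched too. The scoring rule, budget threshold, and parameter names below are illustrative assumptions, not part of any published API; they show how an AI system could use the free peek to decide whether paying for the full article is worthwhile.

```python
# Hypothetical crawler-side logic: given a peek (title, tags, snippet)
# and a per-use price, decide whether to pay for the full article.
# The scoring method and budget numbers are illustrative assumptions.

def relevance_score(query_terms: set, peek: dict) -> float:
    """Fraction of query terms appearing in the peek's title, tags,
    or snippet -- a deliberately simple relevance proxy."""
    haystack = " ".join(
        [peek["title"]] + peek["tags"] + [peek["snippet"]]
    ).lower()
    hits = sum(1 for term in query_terms if term.lower() in haystack)
    return hits / len(query_terms)

def should_pay(query_terms: set, peek: dict, price: float,
               max_price: float = 0.01, min_score: float = 0.5) -> bool:
    """Pay only when the content looks relevant enough and the
    publisher's price fits the crawler's per-document budget."""
    return price <= max_price and relevance_score(query_terms, peek) >= min_score

peek = {
    "title": "Best Budget Laptops of 2025",
    "tags": ["laptops", "reviews"],
    "snippet": "We tested twenty budget laptops under $700...",
}
print(should_pay({"budget", "laptops"}, peek, price=0.002))   # True
print(should_pay({"graphics", "cards"}, peek, price=0.002))   # False
```

Even this toy version captures the economics: the crawler spends only on content the peek suggests is relevant, and the publisher is compensated for exactly those accesses.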
Everybody wins: the publisher is compensated, the AI system pays only for content it actually needs, and users get answers grounded in trusted sources.
Cloudflare's move is an important foundation—but it's just the start. If we stop here, we'll build walls. If we keep going, we can build markets—dynamic, permissioned, and fair.
We're at a crossroads: one path leads to litigation, arms races, and broken discovery. The other leads to sustainable value exchange.
The future of content and AI doesn't have to be adversarial. With the right protocols and infrastructure, we can create an ecosystem where publishers are paid for the value they create, AI systems gain reliable access to trusted data, and users keep the benefits of open discovery.
Let's choose wisely—and prevent the war before it really starts.
The peek-then-pay approach isn't just about licensing; it's about building the foundation for a sustainable, collaborative future between human creativity and artificial intelligence.
FetchRight and Peek-Then-Pay let LLMs ask publishers directly for pre-filtered, attribution-ready context — cutting duplicate embedding work and lowering costs for both sides.
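That exchange might look something like the sketch below. To be clear, this is purely illustrative: the function name, index shape, and matching logic are assumptions, not FetchRight's actual API. It shows the core idea of the publisher returning pre-filtered passages bundled with attribution, so the consuming LLM doesn't have to crawl, deduplicate, and re-embed whole pages.

```python
# Illustrative only: mimics requesting pre-filtered, attribution-ready
# context from a publisher-side index. All names are assumptions and do
# not reflect FetchRight's real interface.

def request_context(publisher_index: dict, query: str, limit: int = 2) -> list:
    """Return passages matching the query, each bundled with its
    source URL and author so the LLM can attribute directly."""
    query_words = set(query.lower().split())
    scored = []
    for doc in publisher_index["documents"]:
        overlap = len(query_words & set(doc["text"].lower().split()))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: -pair[0])  # best word overlap first
    return [
        {"text": d["text"], "source": d["url"], "author": d["author"]}
        for _, d in scored[:limit]
    ]

index = {
    "documents": [
        {"text": "Budget laptops under 700 dollars reviewed",
         "url": "https://example.com/laptops", "author": "Jane Doe"},
        {"text": "Smartphone camera comparison for 2025",
         "url": "https://example.com/phones", "author": "A. Smith"},
    ]
}
results = request_context(index, "budget laptops")
print(results[0]["source"])  # attribution travels with the passage
```

Because filtering happens on the publisher's side, both parties avoid redundant work: the publisher serves only what matches, and the AI system embeds only what it receives.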
Audiences are migrating to AI-driven discovery channels. Publishers who structure and govern what they contribute can become essential partners in the AI economy.