A new framework is emerging to define how AI and human creativity coexist online.
The internet's next protocol war has already begun — over who controls how AI learns.
As generative AI accelerates, a quiet imbalance has taken hold of the web. LLMs depend on high-quality, human-authored content, yet publishers, who invest to create that content, often have no say in how it's used, monetized, or represented.
The result? A web where creators lose agency, and AI developers bear enormous costs reprocessing the very data that publishers already understand best.
That's the gap the open-source Peek-Then-Pay standard aims to fill — with commercial implementations like FetchRight emerging to operationalize it.
What Is Peek-Then-Pay?
Think of it as robots.txt for AI — but enforceable and auditable.
Peek-Then-Pay (PTP) is an open standard that defines how AI crawlers can discover, understand, and license publisher content in a transparent, auditable way.
Each participating site hosts a lightweight machine-readable manifest (peek.json) that describes:
- Which kinds of AI usage are allowed (e.g., summarization, embedding, training)
- Where to obtain a license
- What enforcement rules apply (rate limits, token ceilings, data formats)
The peek.json references the publisher's licensing API (often hosted at a service like api.fetchright.ai) where dynamic pricing schemes are defined. That API can return custom terms for each registered AI agent or operator, reflecting factors like usage intent, historical relationship, or publisher preferences.
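To make the manifest idea concrete, here is a minimal sketch of what a crawler might do after fetching a site's peek.json. The article does not reproduce the actual schema, so every field name below (allowed_usage, license_endpoint, enforcement, and so on) and the licensing URL are assumptions chosen for illustration. Consult the spec at peekthenpay.org for the real format.

```python
import json
from dataclasses import dataclass

# Hypothetical manifest. Field names and the endpoint URL are illustrative
# assumptions, not the published Peek-Then-Pay schema.
EXAMPLE_PEEK_JSON = """
{
  "allowed_usage": ["summarization", "embedding"],
  "license_endpoint": "https://licensing.example.com/v1/licenses",
  "enforcement": {
    "rate_limit_per_minute": 60,
    "max_tokens_per_request": 2000,
    "formats": ["text", "embeddings"]
  }
}
"""


@dataclass(frozen=True)
class PeekManifest:
    allowed_usage: frozenset
    license_endpoint: str
    rate_limit_per_minute: int
    max_tokens_per_request: int
    formats: tuple


def parse_manifest(raw: str) -> PeekManifest:
    """Parse a manifest document into a typed, immutable structure."""
    data = json.loads(raw)
    enforcement = data.get("enforcement", {})
    return PeekManifest(
        allowed_usage=frozenset(data.get("allowed_usage", [])),
        license_endpoint=data["license_endpoint"],  # required in this sketch
        rate_limit_per_minute=int(enforcement.get("rate_limit_per_minute", 0)),
        max_tokens_per_request=int(enforcement.get("max_tokens_per_request", 0)),
        formats=tuple(enforcement.get("formats", [])),
    )


def is_usage_allowed(manifest: PeekManifest, usage: str) -> bool:
    """Return True if the publisher lists this kind of AI usage as permitted."""
    return usage in manifest.allowed_usage


manifest = parse_manifest(EXAMPLE_PEEK_JSON)
print(is_usage_allowed(manifest, "summarization"))
print(is_usage_allowed(manifest, "training"))
```

In this sketch the manifest only advertises what is possible. Per-agent pricing and terms would still come from the licensing API the manifest points to.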
Controlled Visibility Without a License
Peek-Then-Pay isn't just a paywall; it's a protocol for discovery.
Where traditional "tollbooth" systems respond with opaque 402 (Payment Required) errors, Peek-Then-Pay encourages the use of 203 (Non-Authoritative Information) responses when no valid license is present.
These 203 responses can include:
- A representative sample of the content (a brief excerpt, title, or structured metadata)
- The peek.json link and pointers to the licensing API
- Optional AI-readable cues that help a crawler understand the content's type and relevance
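The publisher-side decision described above can be sketched in a few lines. This is a simplified illustration, not the reference implementation: the Response type, the build_response function, the body field names, and the in-memory set of valid tokens are all hypothetical stand-ins. A real deployment would verify licenses against the licensing API rather than a local set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    body: dict


def build_response(
    article: dict,
    license_token: Optional[str],
    valid_tokens: set,
    manifest_url: str,
    license_api_url: str,
    excerpt_chars: int = 200,
) -> Response:
    """Return full content for licensed requests, a 203 preview otherwise."""
    if license_token is not None and license_token in valid_tokens:
        # Licensed request: serve the complete article.
        return Response(
            status=200,
            body={"title": article["title"], "content": article["content"]},
        )

    # No valid license: serve a representative sample plus pointers that
    # tell the crawler where to learn the terms and obtain a license.
    preview = {
        "title": article["title"],
        "excerpt": article["content"][:excerpt_chars],
        "content_type": article.get("content_type", "article"),  # AI-readable cue
        "peek_manifest": manifest_url,
        "license_api": license_api_url,
    }
    return Response(status=203, body=preview)


article = {"title": "Field Notes", "content": "Lorem ipsum " * 100}
unlicensed = build_response(
    article,
    license_token=None,
    valid_tokens={"demo-token"},
    manifest_url="https://publisher.example.com/peek.json",
    license_api_url="https://licensing.example.com/v1/licenses",
)
print(unlicensed.status, len(unlicensed.body["excerpt"]))
```

The key design point is that the unlicensed branch never includes the full content field, so the preview stays a preview no matter how the crawler handles the response.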
This lets LLMs and agents preview what a publisher offers without overstepping, which improves discoverability and model training context. Publishers, in turn, gain visibility and attribution within the AI ecosystem without giving away full content.
It's a balance of reach and control: publishers remain discoverable to AI, but only as far as they choose.
Publishers Regain Agency
At its core, Peek-Then-Pay is about restoring publisher voice and ownership in an AI-driven ecosystem.
Publishers shouldn't have to choose between blocking crawlers entirely or giving away their intellectual property. With Peek-Then-Pay, they can:
- Retain agency over how their content is accessed, transformed, and monetized
- Control intent: permit session-level summarization or search embeddings, while restricting training or full-text reproduction
- Provide value without exposure: by delivering pre-transformed embeddings or structured metadata, the content itself remains private — only its AI-usable representation is shared
- Extend their voice: publishers already generate embeddings and AI summaries internally to power search and recommendations on their own sites. Those same domain-tuned representations can now flow outward to the broader AI ecosystem, carrying the publisher's expertise, nuance, and editorial integrity
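One way to picture intent-level control is as a small policy table that maps each declared intent to the representation a publisher is willing to release. The sketch below is an assumption-laden illustration: the Intent names, the POLICY table, and select_representation are hypothetical and not taken from the Peek-Then-Pay spec.

```python
from enum import Enum


class Intent(Enum):
    SUMMARIZATION = "summarization"
    EMBEDDING = "embedding"
    TRAINING = "training"
    FULL_TEXT = "full_text"


# Hypothetical publisher policy: which representation each intent receives.
# None means the intent is not permitted at all.
POLICY = {
    Intent.SUMMARIZATION: "summary",
    Intent.EMBEDDING: "embedding",
    Intent.TRAINING: None,
    Intent.FULL_TEXT: None,
}

# Hypothetical pre-transformed assets the publisher already produces
# for its own search and recommendation features.
EXAMPLE_ASSETS = {
    "summary": "A short editorial summary of the article.",
    "embedding": [0.1, 0.2],
    "full_text": "The complete article text, never released in this policy.",
}


def select_representation(intent: Intent, assets: dict) -> dict:
    """Return the AI-usable representation for an intent, or refuse."""
    kind = POLICY.get(intent)
    if kind is None:
        raise PermissionError(f"intent '{intent.value}' is not permitted")
    # Only the chosen representation leaves the publisher. The raw text
    # is never part of the returned payload.
    return {"kind": kind, "payload": assets[kind]}


print(select_representation(Intent.EMBEDDING, EXAMPLE_ASSETS)["kind"])
```

Under this policy a crawler asking to embed content receives the publisher's own vector, while a request to train on the text or reproduce it in full is refused outright.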
This isn't just licensing. It's representation in the AI era.
LLMs Gain Efficiency, Legitimacy, and Clarity
On the other side of the handshake, LLM companies benefit from a cleaner, faster, and cheaper data supply chain.
Today's model developers spend massive compute resources re-crawling, cleaning, embedding, and summarizing redundant web data. It's an enormous waste of GPU cycles and human effort.
With Peek-Then-Pay and FetchRight:
- LLMs can license pre-transformed data directly (embeddings, summaries, or RAG-ready segments), reducing ingestion costs by an estimated 60–90%
- Traceable licenses provide clear provenance for every dataset, helping models meet the emerging compliance and copyright standards of the EU AI Act and beyond
- Publishers' own embeddings bring domain-specific precision that improves search, recommendation, and retrieval-augmented generation performance
And importantly: this approach helps rebalance the compute load.
While model developers prepare for massive hardware investments, cutting multi-billion-dollar deals with chip manufacturers to scale their datacenters, Peek-Then-Pay shifts some of that processing closer to where it belongs: the publisher, who owns the content and context.
That distributed approach reduces redundancy and aligns incentives: the publisher provides structured value; the LLM pays for verified, efficient access.
Efficiency Meets Integrity
Peek-Then-Pay transforms content access into a cooperative ecosystem:
Publishers → Control, revenue, and voice
LLMs → Efficiency, legality, and quality
The Web → Transparency, attribution, and balance
The Path Forward
The future of AI content isn't about locking down knowledge. It's about creating a transparent value exchange between creators and machines.
Peek-Then-Pay gives publishers their agency, gives LLMs efficiency, and gives the web a sustainable framework for cooperation.
You can explore the open-source spec and ongoing discussion at peekthenpay.org — where publishers, developers, and AI researchers are shaping the next era of fair content access.