The Future of AI Licensing: Peek-Then-Pay Protocol
How the Peek-Then-Pay standard is reshaping the relationship between AI systems and content creators, giving publishers control while enabling efficient AI development.
Part 2 of 2 -- How publishers and LLMs share structured context to save compute and preserve authority
Gary Newcomb
CTO & Co-Founder, FetchRight
This article builds on Part 1: The Cost of Context, where I explained why modern LLMs burn enormous compute reconstructing context they've already seen.
If you haven't read that first piece, it provides the economic foundation for what follows.
Large language models don't browse the web the way humans do.
When you ask an AI assistant a question, it doesn't "go online". It interprets your prompt, retrieves text from cached indexes or APIs, and burns compute tokens to rebuild context each time. Every token it reprocesses costs money and time.
Meanwhile, publishers (the specialists of the web) already have structured, vetted content. But the AI systems that depend on it rarely contact them directly. Instead, they route through general-purpose search engines, re-embedding or scraping pages, losing brand attribution and adding massive redundant compute.
That's the disconnect FetchRight and the open Peek-Then-Pay standard are designed to fix.
Here's the mechanical chain inside an LLM "web search":
1. Interpret the user's prompt and formulate a search query.
2. Call a general-purpose search engine for candidate URLs.
3. Fetch and scrape the full HTML of each candidate page.
4. Parse, clean, chunk, and re-embed the extracted text.
5. Rebuild the context window from whatever survives.
Every one of those steps costs CPU/GPU time and discards most of the data fetched. And every LLM provider on Earth repeats this work independently.
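To make the waste concrete, here's a minimal Python sketch of that status-quo chain. Everything in it is illustrative: `embed()` stands in for a real embedding model call, and the candidate URLs would come from a search engine. The point is that every step runs on the AI provider's hardware, for every query, at every provider.

```python
import re
import requests

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model call; in production this is
    # the GPU-bound step that every provider repeats independently.
    return [float(sum(map(ord, text[i::8]))) for i in range(8)]

def rebuild_context(query: str, candidate_urls: list[str]) -> list[str]:
    """The redundant chain: fetch, strip, chunk, re-embed, rank -- per query."""
    query_vec = embed(query)
    scored: list[tuple[float, str]] = []
    for url in candidate_urls:                      # step 2: search gave us URLs
        html = requests.get(url, timeout=10).text   # step 3: fetch full HTML
        text = re.sub(r"<[^>]+>", " ", html)        # step 4: crude scrape/clean
        for i in range(0, len(text), 2000):         # step 4: chunk the page
            chunk = text[i:i + 2000]
            vec = embed(chunk)                      # step 4: re-embed content
            score = sum(a * b for a, b in zip(query_vec, vec))
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:5]]       # step 5: keep 5, discard rest
```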
Search engines are still the best tool for one crucial job: identifying who the experts are.
LLMs should continue using Google or Bing for discovery, answering the question: "Who are the authoritative sources for this topic?"
But once those experts are known, agents shouldn't need to scrape them, re-embed them, or repeatedly process full HTML pages on every query.
A better pattern emerges:
Search engines identify the specialists.
Publishers answer agentic questions directly.
Under Peek-Then-Pay, participating sites expose two lightweight capabilities:
1. Publisher Search (Cross-Resource Discovery)
GET /.well-known/peek/search?q=best+4K+monitor+2025
Returns a ranked list of canonical URLs, along with content/media types and scoring metadata (keyword, vector, or hybrid).
This helps the agent understand which specific pages are authoritative for the query — without relying solely on third-party search snippets.
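In practice, an agent's call to that endpoint could look like the following sketch. The `/.well-known/peek/search` path is the discovery endpoint described above; the publisher domain and the response fields (`results`, `url`, `score`, `media_type`) are assumptions for illustration, not a published schema.

```python
import requests

# Hypothetical publisher domain; the well-known path is the Peek-Then-Pay
# discovery endpoint described above.
resp = requests.get(
    "https://reviews.example.com/.well-known/peek/search",
    params={"q": "best 4K monitor 2025"},
    timeout=10,
)
resp.raise_for_status()

# Assumed response shape: a ranked list of canonical URLs with scoring metadata.
for result in resp.json()["results"]:
    print(result["score"], result["media_type"], result["url"])
```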
2. Chunk Retrieval (Per-Resource Evidence Extraction)
After search identifies relevant URLs, the agent selects one and requests semantically relevant evidence from that specific page:
GET /products/best-4k-monitors?intent=chunk
&embedding=[...]
&top_k=5
&license=...
The enforcer then validates the license, scores the page's publisher-maintained chunks against the supplied embedding, and returns the top_k matching passages with canonical attribution.
The result:
LLMs receive only the passages that matter, without scraping, without full-page re-embedding, and without losing the publisher's attribution or voice.
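Sketched in the same vein, the chunk-retrieval call mirrors the GET request above. The embedding vector, license token, and response fields are placeholders: in a real agent the embedding comes from your own model and the token from the licensing API.

```python
import requests

query_embedding = [0.12, -0.03, 0.88]     # stand-in for a real query vector

resp = requests.get(
    "https://reviews.example.com/products/best-4k-monitors",
    params={
        "intent": "chunk",
        "embedding": ",".join(str(x) for x in query_embedding),
        "top_k": 5,
        "license": "LICENSE_TOKEN",       # issued by the licensing API
    },
    timeout=10,
)
resp.raise_for_status()

# Assumed response shape: scored passages carrying the publisher's attribution.
for passage in resp.json()["chunks"]:
    print(passage["score"], passage["url"], passage["text"][:80])
```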
Once this pattern is in place, an AI agent can discover the specialists through search, then pull licensed, ranked evidence directly from each publisher.
No scraping.
No redundant re-embedding.
No lost attribution.
Peek-Then-Pay defines how those endpoints behave.
FetchRight operationalizes them.
FetchRight sits between AI crawlers and publishers, providing license negotiation, access enforcement, and the Publisher Search and Chunk Retrieval endpoints described above.
To the AI agent, it looks like a single, clean API for structured context. To the publisher, it's a protective gateway that maintains brand authority and monetizes access.
Developer Note: Full MCP Support
Both the FetchRight Licensing API (api.fetchright.ai) and the Cloudflare enforcer support the Model Context Protocol (MCP).
This allows agents to discover publishers, request licenses, and invoke search or chunk retrieval directly as MCP tool calls, with no custom client code. It makes publishers first-class participants in agentic ecosystems.
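As a rough sketch of what that looks like with the official Python MCP SDK: the connection plumbing below is standard MCP client code, but the server URL path and the tool name `publisher_search` are assumptions for illustration.

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    # Hypothetical MCP endpoint on the licensing API host.
    async with streamablehttp_client("https://api.fetchright.ai/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()      # discover exposed tools
            print([tool.name for tool in tools.tools])

            # Assumed tool name; invoke publisher search as an MCP tool call.
            result = await session.call_tool(
                "publisher_search",
                {"q": "best 4K monitor 2025"},
            )
            print(result.content)

asyncio.run(main())
```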
For Publishers: control over who accesses their content, preserved attribution and voice, and a direct way to monetize agentic traffic.
For LLM Operators: cheaper retrieval with no scraping or redundant re-embedding, and structured, licensed evidence with clear provenance.
Both sides save money. Both sides gain clarity and provenance.
And the web itself becomes semantically structured instead of being endlessly re-scraped.
The future web isn't about who owns data; it's about who provides the best structured access to it. FetchRight turns publishers into first-class participants in the AI economy, and gives LLMs a cheaper, faster, auditable way to think.
It starts with better context.
And that context already exists - on the publisher's side of the glass.
---
Missed Part 1? Start here: The Cost of Context
It explains why context reconstruction is the real cost center of modern AI.