Publisher Licensing: How Content Deals Shape AI Answers in 2026

About Author

Thibault Besson-Magdelain

Founder of Sorank, 5+ years of experience in SEO, GEO enthusiast.

Read other articles

Summarize with

ChatGPT Perplexity

Share on

Summary: Publisher licensing is a formal agreement that lets an AI company use a publisher's content to train models and power answers, in exchange for payment, attribution, and links back to the source.

Publisher licensing is the practice of news organizations and content owners signing formal deals that permit AI companies to use their journalism, both to train models and to display in AI products. In return, publishers typically receive compensation, attribution inside AI answers, links back to their sites, and sometimes access to the AI company's technology to build their own tools.

This has become one of the defining business questions of the AI era. As assistants answer more queries directly, the terms on which they may use published content determine who gets paid, who gets cited, and ultimately who stays visible when an AI mediates discovery.

What is publisher licensing?

At its core, publisher licensing converts an informal, contested use of content into a contract. Instead of an AI company scraping and summarizing articles with no agreement, the two sides negotiate rights: which archives can be used for training, what can be shown in a chatbot, how attribution and links appear, and what the publisher is paid. One publisher framed it as a fair return for creators, noting that without quality journalism, AI models quickly lose their value.

These deals usually bundle several rights together. They can grant model training rights over a publisher's archive, real-time display of summaries and quotes with links, revenue sharing tied to usage, and API access so the publisher can build features on the same technology.

How licensing differs from training data scraping

The alternative to licensing is unlicensed ingestion, where a model learns from content pulled off the open web without permission. Much of a model's AI training data has historically come from broad web crawls, which is exactly what publishers contest. Licensing replaces that gray area with explicit, paid permission.

This is also why control over crawling matters. Publishers increasingly gate access for AI crawlers and assert text and data mining rights reservation, using technical and legal signals to withhold content until a deal is in place. A license is the commercial resolution of that standoff.

The 2025 wave of licensing deals

2025 saw a rush of agreements. OpenAI signed deals with Axios, The Guardian, Schibsted, and The Washington Post, and as part of the Axios deal it funded four new local newsrooms. Google signed its first AI content licensing deal with the Associated Press, feeding real-time news into Gemini, and later ran a pilot with publishers including Der Spiegel and El Pais.

Amazon licensed The New York Times, plus Conde Nast and Hearst for its Rufus shopping assistant, and Meta entered in December with seven multi-year deals covering publishers like CNN, Fox News, and USA Today for its Llama models. Microsoft, Mistral, Perplexity, and others struck their own agreements, turning licensing from a novelty into standard industry practice.

How publishers get paid

Compensation models vary widely. Some deals are flat annual fees, such as Meta's reported arrangement with News Corp worth up to 50 million dollars a year, while News Corp's OpenAI deal was reported at more than 250 million dollars over five years. Others are usage-based, paying per deployment of content, and some use revenue sharing, with models like Perplexity and Prorata reportedly allocating around 50 percent of revenue.

The amounts are meaningful but rarely transformative. Amazon's New York Times deal, reported at 20 to 25 million dollars a year, was characterized as close to 1 percent of the Times' total revenue. Publishers describe the choice as a la carte pay-per-use, like Microsoft's marketplace, versus an all-you-can-eat lump sum, like some OpenAI deals.

Why publisher licensing matters for GEO

Licensing increasingly decides who gets cited in AI answers, which is the heart of generative engine optimization. Analyses suggest licensed sources receive far more visibility: Reddit, with reported deals worth 60 to 70 million dollars a year, appears in a large share of Perplexity citations, while freely licensed Wikipedia dominates ChatGPT citations. The presence of a deal can directly lift LLM citations.

The flip side is stark for everyone else. Unlicensed mid-tier publishers risk becoming nearly invisible in AI-mediated discovery regardless of content quality, creating a winner-take-all dynamic. For brands that cannot sign a deal, the practical response is to earn source citation the organic way, through clear, authoritative, well-structured content paired with sound keyword research and content planning.

Licensing versus litigation

The industry is split. Many publishers sign deals while others sue, and some do both. News Corp signed with OpenAI yet sued Perplexity for similar practices, and outlets including The New York Times, CNN, Encyclopedia Britannica, and Merriam-Webster have pursued litigation. Roughly two dozen major publishers have signed deals while a similar number actively sue.

This reflects a deeper unresolved question about fair use in copyright law. Publishers pursue parallel strategies, negotiating favorable terms with cooperating platforms while litigating against those that refuse, preserving legal leverage while securing revenue now. The eventual legal outcome may matter more than any single deal.

The traffic trade-off

Even a good deal does not restore old traffic patterns. Reporting suggests a large majority of AI answers end without a click to the source, so licensed publishers increasingly gain attribution and brand awareness rather than referral visits. One publisher reported that around 20 percent of its Google results featuring its links included AI summaries that discouraged clickthroughs.

This reframes the value of licensing. The payment and the citation are the return, not a flood of clicks, which is why publishers weigh licensing against the slow erosion of zero-click attribution. Visibility inside the answer becomes the asset, even when the click does not follow.

Conclusion

Publisher licensing turns the contested use of content into paid, attributed permission, and in 2025 it became standard practice across OpenAI, Google, Amazon, Meta, and others. The deals vary from flat fees to revenue shares, and they increasingly determine who gets cited when an AI answers, making licensing a core lever of AI-search visibility.

For brands without a seat at that table, earning citations organically through authoritative, well-structured content is the path forward, supported by Sorank's research and content planning tools. Reference sources: Digiday, Press Gazette, and Will Scott.

Frequently questions asked

What do publishers get from an AI licensing deal?

Publishers typically receive payment, attribution inside AI answers, and links back to their sites, and sometimes access to the AI company's technology to build their own tools. Compensation ranges from flat annual fees to usage-based payments to revenue sharing. Increasingly, the most valuable return is being cited and surfaced inside AI answers rather than referral traffic.

Does having a licensing deal make a publisher more visible in AI answers?

Evidence suggests yes. Analyses show licensed and freely licensed sources, such as Reddit and Wikipedia, appear in a large share of citations from tools like Perplexity and ChatGPT. Unlicensed mid-tier publishers risk becoming nearly invisible in AI-mediated discovery regardless of quality, which is why licensing has become a significant lever for AI-search visibility.

If I cannot sign a licensing deal, can my content still appear in AI answers?

Yes. Licensing helps, but it is not the only path to citation. Clear, authoritative, well-structured content that AI crawlers can access and parse can still be surfaced and cited organically. Focus on direct answers, strong topical depth, clean structure, and crawlability, which is the core of generative engine optimization for sites without deals.