TDM rights reservation lets copyright holders opt out of text and data mining for AI training using machine-readable signals. Learn how it works.

TDM rights reservation is the formal way a copyright holder opts out of text and data mining, the automated analysis that underpins much of modern AI training. Text and data mining, often shortened to TDM, is the computational analysis of large volumes of text and data to extract patterns, trends, and correlations. In the European Union, this mining is permitted by default for lawfully accessible content, but rights holders can reserve their rights to prevent it.
This matters because the assistants and models that increasingly decide online visibility, such as ChatGPT, Perplexity, and Gemini, are built on vast amounts of web content gathered by crawlers. TDM rights reservation is the legal lever that determines whether your content can be used to train those systems, which makes it a core concept for any publisher thinking about AI search, licensing, and control over their work.
TDM rights reservation is a right granted to copyright holders under the EU Digital Single Market Copyright Directive, formally Directive (EU) 2019/790. Article 4 of the directive creates a broad exception that allows reproduction and extraction of lawfully accessible works for text and data mining without asking permission first. Article 4(3) then lets rights holders expressly reserve their rights and opt out of this exception. The separate Article 3 exception, which covers scientific research by research organizations and cultural heritage institutions, is not subject to this opt-out.
The logic is opt-out rather than opt-in. Mining is allowed unless you say otherwise, but if you do reserve your rights properly, the exception no longer applies to your content. At that point an AI developer must obtain authorization from you before mining the work, which can mean negotiating a license. This places the reservation at the heart of how creators control machine access to their content, alongside tools like publisher licensing.
For content made publicly available online, the directive requires that the reservation be expressed in an appropriate manner, such as machine-readable means. A statement buried in terms of service that only a human can interpret is risky to rely on; the signal should be something a crawler can parse automatically. This is what connects a legal right to a practical, technical action.
In effect, the reservation is a message to the bots that gather training data. When a compliant crawler visits a page, it should read the reservation and refrain from mining the content for AI training. The reservation does not block access in the way a password would; it communicates the rights holder's refusal in a format that crawlers are expected to recognize and respect.
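As a rough illustration of how a compliant crawler might read such a signal, the sketch below uses Python's standard urllib.robotparser to check a robots.txt file before fetching a page. The file contents and the user-agent token ExampleAIBot are made up for the example. Note also that a robots.txt Disallow rule governs crawling in general, so it is a blunter instrument than a reservation aimed only at mining.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site refuses one AI crawler
# (the token "ExampleAIBot" is invented for this sketch)
# while staying open to every other user agent.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(may_fetch(ROBOTS_TXT, "ExampleAIBot", "https://example.com/article"))   # False
print(may_fetch(ROBOTS_TXT, "SomeSearchBot", "https://example.com/article"))  # True
```

A well-behaved crawler would run a check like this before every fetch and skip the page when the answer is False. Nothing in the file forces a crawler to do so.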
Several technical protocols can carry a TDM reservation, and the European Commission has run a stakeholder consultation to identify which ones count as state of the art. The familiar robots.txt file is the baseline, but more specialized options exist, including a dedicated ai.txt file, the TDM Reservation Protocol known as TDMRep, C2PA content authenticity assertions, and centralized registries such as the Do Not Train registry.
These split into two families. Location-based protocols, like robots.txt or HTTP headers, are applied by the domain owner and cover all content on a site. Unit-based protocols use metadata tags to mark a specific work, telling a crawler the creator's wishes for that single file. Many of these signals are designed to be read by the same crawlers that gather AI training data, and they sit alongside emerging conventions like llms-full.txt files.
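To make the two families concrete, here is a minimal sketch of how a crawler could read a TDMRep-style signal from an HTTP response header and from a meta tag embedded in a single page. It assumes the field name tdm-reservation with the value 1 meaning "rights reserved", which is how the TDMRep report is generally described. The function names are hypothetical, and the exact field names should be verified against the specification before relying on them.

```python
from html.parser import HTMLParser
from typing import Mapping, Optional


class _TdmMetaParser(HTMLParser):
    """Records the content of the first <meta name="tdm-reservation"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.value is not None:
            return
        attr = {key: (val or "") for key, val in attrs}
        if attr.get("name", "").lower() == "tdm-reservation":
            self.value = attr.get("content", "").strip()


def reserved_by_header(headers: Mapping[str, str]) -> Optional[bool]:
    """Header check: True/False if the header is present, None if absent."""
    for name, value in headers.items():
        if name.lower() == "tdm-reservation":
            return value.strip() == "1"
    return None


def reserved_by_meta(html: str) -> Optional[bool]:
    """Page-level check: True/False if the meta tag is present, None if absent."""
    parser = _TdmMetaParser()
    parser.feed(html)
    if parser.value is None:
        return None
    return parser.value == "1"


page = '<html><head><meta name="tdm-reservation" content="1"></head></html>'
print(reserved_by_header({"TDM-Reservation": "1"}))  # True
print(reserved_by_meta(page))                        # True
```

Returning None when no signal is present keeps "no reservation found" distinct from "explicitly not reserved", which matters when several signals are checked together.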
The EU AI Act reinforces the directive for the largest models. Recital 105 acknowledges that training general-purpose AI requires extensive text, images, and video, and confirms that using copyright-protected content needs authorization unless an exception applies. The Act then turns the reservation into a compliance duty rather than a mere request.
Article 53 requires providers of general-purpose AI models to put in place a policy to comply with EU copyright law, including identifying and respecting rights reservations using state-of-the-art technologies. Providers must also publish a sufficiently detailed summary of the content used for training, so creators can check whether their work was used. The GPAI Code of Practice fleshes this out, asking signatories to honor machine-readable protocols that are technically implementable and widely adopted. This is part of the broader landscape of AI regulation.
For publishers, this is a strategic choice rather than a purely legal one. Reserving rights can protect premium content and create leverage to negotiate paid licensing with AI companies. Leaving content open can increase the chance of being included in training data and, potentially, of being represented in AI answers. Each path has trade-offs for visibility and revenue.
The decision interacts directly with generative engine optimization. If your goal is to appear in AI answers and earn citations, blocking all mining may work against you. If your goal is to protect and monetize a valuable archive, reservation is a tool to assert control. Many publishers will segment their content, opening some and reserving the rest, which they can plan alongside their AI content strategy and disciplined keyword research and content planning.
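A segmentation plan like this can be written down as data before it is published. The sketch below builds rules in the style of a TDMRep tdmrep.json file, reserving some sections and opening others. The paths, the policy URL, and the helper name build_tdmrep_rules are all hypothetical, and the rule keys (location, tdm-reservation, tdm-policy) and path pattern syntax reflect an assumption about the TDMRep format that should be checked against the specification.

```python
import json
from typing import Dict, List, Optional, Union

Rule = Dict[str, Union[str, int]]


def build_tdmrep_rules(
    reserved_paths: List[str],
    open_paths: List[str],
    policy_url: Optional[str] = None,
) -> List[Rule]:
    """Turn a reserve/open content plan into a list of TDMRep-style rules."""
    rules: List[Rule] = []
    for path in reserved_paths:
        rule: Rule = {"location": path, "tdm-reservation": 1}
        if policy_url:
            # Optional pointer to licensing terms for reserved content.
            rule["tdm-policy"] = policy_url
        rules.append(rule)
    for path in open_paths:
        rules.append({"location": path, "tdm-reservation": 0})
    return rules


# Hypothetical plan: reserve the archive and premium sections, open the blog.
plan = build_tdmrep_rules(
    reserved_paths=["/archive/*", "/premium/*"],
    open_paths=["/blog/*"],
    policy_url="https://example.com/tdm-policy.json",
)
print(json.dumps(plan, indent=2))
```

Keeping the plan in one place makes it easier to generate matching signals for other protocols from the same source, so that the site does not send conflicting messages.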
The biggest weakness is that a reservation is a signal, not a wall. A non-compliant bot can simply ignore robots.txt or a TDM tag and mine the content anyway, since the file does not technically prevent access. The legal right exists, but enforcement depends on detection and, ultimately, litigation.
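Detection usually starts with server logs. The following sketch scans access log lines for requests to reserved sections made by known AI crawler user agents. It assumes the common "combined" log format, and the path prefixes, the bot token ExampleAIBot, and the function name find_reserved_hits are invented for illustration. User-agent strings can be spoofed, so a check like this catches only bots that identify themselves.

```python
import re
from typing import Iterable, List, Sequence, Tuple

# Matches the request, status, size, referrer and user-agent fields
# of a combined-format access log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def find_reserved_hits(
    log_lines: Iterable[str],
    reserved_prefixes: Sequence[str],
    bot_tokens: Sequence[str],
) -> List[Tuple[str, str]]:
    """Return (path, user_agent) pairs where a listed bot fetched a reserved path."""
    hits = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        path, agent = match.group("path"), match.group("agent")
        is_reserved = path.startswith(tuple(reserved_prefixes))
        is_listed_bot = any(token.lower() in agent.lower() for token in bot_tokens)
        if is_reserved and is_listed_bot:
            hits.append((path, agent))
    return hits


sample_log = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /archive/story.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; ExampleAIBot/1.0)"',
    '198.51.100.7 - - [10/Oct/2024:13:56:01 +0000] "GET /blog/post.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0 (compatible; ExampleAIBot/1.0)"',
]
print(find_reserved_hits(sample_log, ["/archive/", "/premium/"], ["ExampleAIBot"]))
```

The output of such a scan is evidence to investigate, not proof of infringement. It shows that a fetch occurred, not what the content was later used for.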
There is also fragmentation. With many competing protocols and no single universally honored standard yet, a rights holder may need to use several signals at once to be safe, and there is uncertainty about which a given crawler will read. The Commission's effort to publish an agreed list of opt-out solutions aims to reduce this confusion, but until adoption is uniform, the practical strength of a reservation varies by who is crawling.
TDM rights reservation is primarily an EU concept, rooted in the Digital Single Market directive, and other regions take different approaches. The United Kingdom has consulted on its own model, and the United States relies more on fair use doctrine and case law than on a formal opt-out, which means the same content can be treated differently depending on where mining occurs.
For global publishers, this patchwork is a planning problem. A reservation that carries legal weight in the EU may have a weaker footing elsewhere, so a complete strategy considers each major jurisdiction. The direction of travel, though, is toward more explicit creator control, and machine-readable reservation is the mechanism most likely to underpin it.
TDM rights reservation is the machine-readable opt-out that lets copyright holders refuse text and data mining of their work, including AI training, under the EU Copyright Directive and the AI Act. It converts a legal right into a technical signal that crawlers are meant to read, giving creators a way to protect and potentially monetize their content. Its main limit is enforcement, since a reservation signals refusal but cannot physically block a non-compliant bot.
For publishers weighing visibility against control, the choice to reserve or open content is now part of strategy. Connect it with publisher licensing and your broader AI content strategy, and use Sorank's research and content planning tools to decide where openness earns the most. Reference sources: European Commission and IAPP.
TDM rights reservation is the legal mechanism that lets a copyright holder say no to having their content used for text and data mining, which includes AI training. Under the EU Digital Single Market Copyright Directive, mining is allowed by default unless the rights holder reserves their rights using a machine-readable signal. If you opt out correctly, an AI developer must get your permission before mining your work.
You express the reservation in a machine-readable form that crawlers can read. The most common method is robots.txt, but more specific options exist, including the TDM Reservation Protocol, an ai.txt file, HTTP headers, and metadata tags on individual works. Location-based protocols apply to a whole site, while unit-based protocols tag a single file. Using clear, widely recognized signals gives your reservation the best chance of being respected.
Legally, within the EU, a valid machine-readable reservation means that anyone relying on the Article 4 exception, including a general-purpose AI provider, must obtain authorization before mining your content, and the AI Act requires such providers to put copyright compliance policies in place. Technically, a reservation is only a signal, so a non-compliant bot can still ignore it. The reservation strengthens your legal position but does not physically block access on its own.