Speakable is schema.org markup that flags content for text-to-speech playback. Learn how speakable schema works, its limits, and its role in voice and AI search.

Speakable is a schema.org structured data property that identifies the sections within an article or webpage best suited for audio playback using text-to-speech. By marking up a headline and a short summary, you tell search engines and voice platforms exactly which words to read aloud when answering a spoken question.
It exists because voice and audio answers work differently from a screen. A device cannot read a whole article aloud, so speakable points it to the concise, self-contained passages that make sense without any visual context. The property is still in beta and narrow in scope, but it captures an idea that matters more every year: making content consumable by ear, not just by eye.
Speakable is a property in the schema.org vocabulary that flags page sections for text-to-speech. Implemented as part of structured data, usually within a WebPage or Article type, it specifies which elements a voice assistant should prioritize when reading content aloud. The property can be repeated to mark several sections on a page.
It is best understood as a hint, not a command. The markup tells the platform which passages are voice-friendly, but the system still decides algorithmically whether and what to read. As a form of structured content, speakable is part of the broader practice of labeling your page so machines can use it precisely.
Google Assistant uses speakable structured data to answer topical news queries on smart speakers and voice-enabled devices. When a user asks for news on a subject, the Assistant returns up to three articles from around the web and uses text-to-speech to read the sections marked as speakable, along with attribution to the source.
This makes speakable a route to appearing in spoken answers, not just visual results. Being the source a device reads aloud is a distinct form of visibility, closely tied to voice search, where the single spoken answer carries far more weight than a list of links a user can scan.
You add speakable as a SpeakableSpecification inside your structured data, then point it at the content to read using one of two content-locator types: CSS selectors or XPath expressions. A CSS selector targets elements by class or tag, for example a headline class and a summary class, while an XPath expression navigates the document structure directly. Use only one locator type per specification object.
A typical JSON-LD snippet sets the page type, then a speakable object whose cssSelector lists the headline and summary elements. Keep the markup pointed at passages that already read well aloud. Implemented cleanly, this is a small addition that turns existing answer-ready content into something a voice platform can speak directly.
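The snippet described above can be sketched by generating the JSON-LD from a small Python helper. This is a minimal illustration rather than an official template: the function name, the class names `.headline` and `.summary`, and the example URL are all hypothetical. The XPath key is written `xpath` here, following the schema.org property name; check the exact casing against the documentation of the platform you target.

```python
import json


def build_speakable_jsonld(name, url, css_selectors=None, xpaths=None):
    """Return a JSON-LD string for a WebPage with a speakable specification.

    Hypothetical helper for illustration; not part of any library.
    """
    # Enforce the one-locator-type rule: exactly one of the two must be given.
    if bool(css_selectors) == bool(xpaths):
        raise ValueError("Provide either css_selectors or xpaths, not both or neither")

    speakable = {"@type": "SpeakableSpecification"}
    if css_selectors:
        speakable["cssSelector"] = list(css_selectors)
    else:
        # Key casing is an assumption based on the schema.org property name.
        speakable["xpath"] = list(xpaths)

    data = {
        "@context": "https://schema.org/",
        "@type": "WebPage",
        "name": name,
        "speakable": speakable,
        "url": url,
    }
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Hypothetical class names and URL, for illustration only.
    print(build_speakable_jsonld(
        name="Example headline",
        url="https://www.example.com/news/example-story",
        css_selectors=[".headline", ".summary"],
    ))
```

The printed output would go inside a `<script type="application/ld+json">` element on the page. Generating it from code rather than by hand makes it harder to mix both locator types in one specification by accident.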
Google recommends around 20 to 30 seconds of content per speakable section, roughly two to three sentences. The goal is concise headlines and summaries that deliver comprehensible, useful information on their own, not entire articles read end to end. Mark only the one or two passages that directly answer the likely question.
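To check a draft passage against that guideline before marking it up, a rough estimate is enough. The sketch below assumes a speaking rate of 150 words per minute, which is an illustrative figure and not a number from Google; real text-to-speech voices vary. The function names and the naive sentence splitter are likewise hypothetical.

```python
import re

# Assumed speaking rate; an illustrative figure, not taken from any guideline.
ASSUMED_WORDS_PER_MINUTE = 150


def estimate_seconds(text, wpm=ASSUMED_WORDS_PER_MINUTE):
    """Rough spoken duration of the text in seconds at the assumed rate."""
    words = len(text.split())
    return words * 60.0 / wpm


def count_sentences(text):
    """Naive sentence count: splits on ., ! or ? followed by whitespace or end.

    Abbreviations such as "U.S. " will be miscounted; good enough for a draft check.
    """
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p])


def check_passage(text, max_seconds=30.0, max_sentences=3):
    """Return a list of warnings; an empty list means the passage fits."""
    warnings = []
    seconds = estimate_seconds(text)
    if seconds > max_seconds:
        warnings.append(f"about {seconds:.0f}s of audio, over {max_seconds:.0f}s")
    if count_sentences(text) > max_sentences:
        warnings.append(f"more than {max_sentences} sentences")
    return warnings


if __name__ == "__main__":
    draft = "Speakable marks page sections for text-to-speech. Keep each one short."
    print(check_passage(draft))
```

At the assumed rate, 30 seconds corresponds to about 75 words, so a passage much longer than that is a candidate for trimming.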
Crucially, the content must make sense in a voice-only setting. Avoid marking up photo captions, datelines, or attributions that confuse a listener who cannot see the page. Write short, clear sentences in a conversational register, since anything ambiguous on screen becomes worse when heard without visual cues.
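One way to test this is to read exactly what your selectors capture, with no page around it. The sketch below uses Python's standard `html.parser` to pull the text out of elements with a given class, so you can review or listen to it in isolation. It is a simplification under stated assumptions: it matches plain class names only (not full CSS selectors), expects well-formed HTML with closed tags, and the class names in the example are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every element that carries a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0       # nesting depth inside a matching element
        self.current = []    # text pieces of the element being read
        self.passages = []   # finished passages, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if self.depth > 0:
                self.current.append(" ")  # treat e.g. <br> as a word break
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1
        elif self.class_name in classes:
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            # Collapse runs of whitespace so the passage reads as one line.
            self.passages.append(" ".join("".join(self.current).split()))

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


def preview_spoken_text(html, class_names):
    """Map each class name to the list of text passages it would select."""
    result = {}
    for name in class_names:
        parser = ClassTextExtractor(name)
        parser.feed(html)
        parser.close()
        result[name] = parser.passages
    return result


if __name__ == "__main__":
    page = ('<article><h1 class="headline">Big <em>news</em> today</h1>'
            '<p class="summary">First line.<br>Second line.</p></article>')
    print(preview_spoken_text(page, ["headline", "summary"]))
```

If a previewed passage contains a caption, a dateline, or a phrase like "pictured above", the selector is too broad or the passage needs rewriting for a listener.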
For SEO, speakable is a niche but real opportunity in voice. As audio consumption grows, with a large majority of people listening to online audio each month, being the passage a device reads aloud is valuable hands-free visibility. It also improves accessibility for users with visual impairments and supports multitasking listeners.
For generative engine optimization, the deeper lesson is structural. The same discipline that makes content speakable (concise, self-contained passages that answer a question without surrounding context) is exactly what helps AI assistants extract and cite your content. Optimizing for the ear reinforces good voice search optimization and the passage clarity that AI search rewards.
Speakable is narrow today. It is in beta, limited to news content, and available to users in the US with English-language devices. It works only with Google Assistant, so it does not reach other voice platforms, and Google may choose to read a different section than the one you marked if its algorithms judge it more relevant.
Because of these constraints, speakable should not be the centerpiece of a strategy. Treat it as a low-cost enhancement for eligible content rather than a guaranteed channel. The principles behind it (clarity and self-contained answers), however, apply far beyond the property itself and pay off across voice and AI search.
The property points toward a broader shift: content increasingly consumed without a screen, through smart speakers, in-car assistants, and AI voices. As more answers are spoken rather than shown, the ability to deliver a clean, standalone passage becomes a core skill, whether or not a specific markup is involved.
Even if speakable stays limited, designing content to be read aloud (short sentences, direct answers, no reliance on visuals) future-proofs it for voice-first and AI-driven discovery. The format teaches a habit worth keeping: write so a machine can speak your best sentences and have them stand on their own.
Speakable is a schema.org property that marks the sections of a page best suited for text-to-speech, used by Google Assistant to read news answers aloud. It is implemented with CSS selectors or XPath expressions, works best on concise passages of two to three sentences, and remains narrow in scope and in beta. Its real value is the discipline it encourages.
Pair it with broader voice search optimization and clean structured content, and use Sorank's research and content planning tools to build answers that work by ear and on screen. Reference sources: Google Search Central and Productive Shop.
Speakable is a schema.org property that marks the sections of a page best suited for text-to-speech playback. Google Assistant uses it to answer topical news queries on smart speakers, reading the marked headline and summary aloud and crediting the source. It lets publishers signal which concise, self-contained passages a voice assistant should speak.
Google recommends around 20 to 30 seconds of audio per speakable section, which is roughly two to three sentences. The aim is a concise headline or summary that delivers useful information on its own. Mark only the one or two passages that directly answer the likely question, and make sure they make sense without any visual context.
Speakable is still in beta and narrow in scope. It is limited to news content, available to US users with English-language devices, and works only with Google Assistant rather than all voice platforms. Google may also read a different section than the one you marked if it judges that section more relevant, so inclusion is not guaranteed.