Screen Reader Comparison

A side-by-side comparison of the four major screen readers used today: NVDA, JAWS, VoiceOver, and Narrator. Each has its own announcement patterns, terminology preferences, and interaction quirks. This guide documents those differences so you can understand what users actually hear when they navigate your interfaces. Keep in mind that this is approximate: actual behavior varies by screen reader version, browser pairing, operating system build, and individual user settings. No two setups produce identical output, but the patterns described here reflect default configurations as of mid-2024.

Overview Table

The screen reader landscape is dominated by four products, each tied to specific platforms and browsers. Understanding their market positions, cost models, and default verbosity helps you prioritize testing and set expectations for what users will experience. The table below summarizes the key differences at a glance. Market share numbers come from the WebAIM Screen Reader User Survey and other accessibility community surveys. They shift year over year and vary significantly depending on whether you measure desktop-only or include mobile users.

| Screen Reader | Platform | Browser | Market Share | Cost | Verbosity |
|---|---|---|---|---|---|
| NVDA | Windows | Firefox, Chrome | ~40% (desktop) | Free | Configurable |
| JAWS | Windows | Chrome, Edge | ~30% (desktop) | $1000+ license | Verbose by default |
| VoiceOver | macOS, iOS | Safari | ~25% (desktop + mobile) | Built-in | Moderate |
| Narrator | Windows | Edge | ~5% | Built-in | Concise |

Note: Market share estimates vary significantly by survey methodology and user population. The WebAIM survey skews toward power users who actively respond to surveys. Mobile usage (where VoiceOver dominates on iOS) is often underrepresented in desktop-focused studies. These numbers are directional, not definitive.

NVDA (NonVisual Desktop Access) is the most popular free screen reader on Windows. It's open source, actively developed by NV Access, and pairs best with Firefox, though Chrome support has improved significantly in recent years. Its verbosity is highly configurable, which means experienced users often reduce announcement detail to speed up navigation. For testing purposes, use NVDA's default settings to establish a baseline.

JAWS (Job Access With Speech) by Freedom Scientific is the oldest commercial screen reader still in active use. It dominated the market for decades before NVDA gained traction as a free alternative. JAWS is verbose by default: it announces more context about elements, uses slightly different terminology, and has deep customization options through its scripting language. Many enterprise accessibility teams still test primarily with JAWS because their user base includes long-time JAWS customers in corporate environments.

VoiceOver is Apple's built-in screen reader, available on macOS and iOS. On desktop, it pairs best with Safari for web content; using VoiceOver with Chrome on macOS produces inconsistent results. On iOS, VoiceOver is the dominant screen reader by a wide margin because there are no third-party alternatives. VoiceOver tends to announce the name before the role, separated by a comma that speech synthesis renders as a short pause.

Narrator is Microsoft's built-in screen reader for Windows. It's seen major improvements in recent years and works best with Microsoft Edge. While its market share is small among experienced screen reader users, it's often the first screen reader that casual users encounter because it comes pre-installed. Narrator tends toward concise announcements, using fewer words than JAWS or VoiceOver for the same elements.

How Each Reader Announces Common Patterns

The most practical way to understand screen reader differences is to see exactly what each one says for the same HTML element. The table below shows the default announcement for common patterns. These are based on default verbosity settings with no user customizations applied. In practice, experienced users often reduce verbosity, change punctuation settings, or use custom dictionaries, but these defaults represent what a typical user hears out of the box.

Pay attention to the ordering patterns. NVDA and JAWS typically announce the accessible name followed by the role as a continuous phrase. VoiceOver and Narrator separate them with commas, creating a more punctuated rhythm. Neither approach is better; they're just different conventions that users adapt to over time. The important thing is that the semantic information is complete and correct, regardless of presentation order.

| Element | NVDA | JAWS | VoiceOver | Narrator |
|---|---|---|---|---|
| Link | "Home link" | "Home link" | "Home, link" | "Home, link" |
| Button | "Submit button" | "Submit button" | "Submit, button" | "Submit, button" |
| Heading h2 | "heading level 2 About" | "About heading level 2" | "About, heading level 2" | "About, heading, level 2" |
| Checkbox (checked) | "Accept checkbox checked" | "Accept check box checked" | "Accept, ticked, checkbox" | "Accept, checkbox, checked" |
| Navigation landmark | "Main navigation landmark" | "Main navigation region" | "navigation, Main" | "navigation, Main" |
| Required input | "Email edit required" | "Email edit type in text" | "Email, required, text field" | "Email, edit, required" |
| Expanded button | "Menu button expanded" | "Menu button expanded" | "Menu, expanded, pop-up button" | "Menu, button, expanded" |
| Image | "graphic Company logo" | "Company logo graphic" | "Company logo, image" | "Company logo, image" |
| List | "list with 5 items" | "list of 5 items" | "list, 5 items" | "list, 5 items" |
| Alert | "alert Error message" | "Alert Error message" | "Error message" | "alert, Error message" |

Notice that VoiceOver doesn't announce "alert" as a separate word for the alert role. It relies on a distinct audio cue (a sound effect) to signal the alert, then reads only the message content. This is a common pattern with VoiceOver: it uses audio cues more aggressively than other readers, reducing spoken verbosity while still conveying role information through non-speech sounds. NVDA, JAWS, and Narrator speak the role explicitly, which makes them slightly more verbose but also more predictable for users who aren't familiar with all the audio cues.

The heading announcements show another interesting divergence. NVDA announces the role and level first ("heading level 2"), then the content. JAWS puts the content first, then the role metadata. VoiceOver and Narrator both lead with content but differ in how they break up the level information. These ordering differences are purely presentational. All four readers convey the same semantic information (name, role, level), just in different sequences.
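
The four heading orderings can be written down as a small function. This is a minimal sketch that only reproduces the "Heading h2" row of the table above; the `Reader` type and `announceHeading` name are hypothetical and are not part of Speakable's API.

```typescript
// Hypothetical model of the heading row in the announcement table.
// Not Speakable's renderer; names here are illustrative assumptions.
type Reader = "nvda" | "jaws" | "voiceover" | "narrator";

function announceHeading(reader: Reader, text: string, level: number): string {
  switch (reader) {
    case "nvda":
      // Role and level first, then the content.
      return `heading level ${level} ${text}`;
    case "jaws":
      // Content first, then the role metadata, with no punctuation.
      return `${text} heading level ${level}`;
    case "voiceover":
      // Content, comma, then role and level as one phrase.
      return `${text}, heading level ${level}`;
    case "narrator":
      // Content, then role and level as separate comma-delimited parts.
      return `${text}, heading, level ${level}`;
  }
}

const headingReaders: Reader[] = ["nvda", "jaws", "voiceover", "narrator"];
for (const reader of headingReaders) {
  console.log(`${reader}: ${announceHeading(reader, "About", 2)}`);
}
```

Every branch receives the same three inputs (reader, text, level), which is the point of the paragraph above: only the sequence changes.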

Key Behavioral Differences

Beyond the element-level announcement patterns shown above, screen readers differ in several systematic ways. These behavioral differences affect how users perceive your interface and can influence design decisions around labeling, state communication, and landmark structure. Understanding these patterns helps you write markup that works well everywhere rather than optimizing for a single reader.

Announcement Order

NVDA and JAWS concatenate the accessible name and role into a single phrase without punctuation: "Submit button", "Home link". VoiceOver and Narrator insert a comma between name and role: "Submit, button", "Home, link". This creates a slight pause in speech synthesis that separates the label from the type. Both approaches are valid: the comma-separated style can be clearer for complex names, while the concatenated style feels faster for simple elements. Neither approach requires any developer action to work correctly; it's purely a reader-side presentation choice.
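
As a rough illustration, the name-role convention reduces to a per-reader separator. The sketch below assumes simple elements such as links and buttons; `NAME_ROLE_SEPARATOR` and `announceNameRole` are made-up names for this example, not Speakable functions.

```typescript
// Hypothetical sketch: the only per-reader difference for simple
// name + role announcements is the separator between the two parts.
type Reader = "nvda" | "jaws" | "voiceover" | "narrator";

const NAME_ROLE_SEPARATOR: Record<Reader, string> = {
  nvda: " ",       // concatenated: "Submit button"
  jaws: " ",       // concatenated: "Submit button"
  voiceover: ", ", // comma-separated: "Submit, button"
  narrator: ", ",  // comma-separated: "Submit, button"
};

function announceNameRole(reader: Reader, name: string, role: string): string {
  return `${name}${NAME_ROLE_SEPARATOR[reader]}${role}`;
}

console.log(announceNameRole("nvda", "Submit", "button"));
console.log(announceNameRole("voiceover", "Home", "link"));
```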

Landmark Terminology

When announcing landmark regions (navigation, main, complementary, etc.), NVDA uses the word "landmark" explicitly: "Main navigation landmark". JAWS uses the word "region" instead: "Main navigation region". VoiceOver and Narrator announce the landmark role as a prefix followed by the label: "navigation, Main". This terminology difference is worth knowing because if a user reports hearing "region" vs "landmark", it tells you which screen reader they're using. From a development perspective, all four readers correctly identify the same landmark. The terminology is cosmetic.
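
The diagnostic use of this terminology can be sketched as a small triage helper for bug reports. This is an illustrative heuristic only: the function name, the `ReaderGuess` type, and the short list of landmark roles are assumptions, and the patterns cover just the default phrasings described above.

```typescript
// Hypothetical triage helper: guess the screen reader from the way a
// user quotes a landmark announcement. Default phrasings only.
type ReaderGuess = "nvda" | "jaws" | "voiceover-or-narrator" | "unknown";

function guessReaderFromLandmark(heard: string): ReaderGuess {
  const text = heard.trim().toLowerCase();
  // NVDA appends the word "landmark".
  if (/\blandmark$/.test(text)) return "nvda";
  // JAWS appends the word "region".
  if (/\bregion$/.test(text)) return "jaws";
  // VoiceOver and Narrator lead with the role, then a comma, then the label.
  // The role list here is partial and only for illustration.
  if (/^(navigation|main|complementary), /.test(text)) return "voiceover-or-narrator";
  return "unknown";
}

console.log(guessReaderFromLandmark("Main navigation region"));
console.log(guessReaderFromLandmark("navigation, Main"));
```

VoiceOver and Narrator cannot be told apart from the landmark phrasing alone, so the helper reports them as one group.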

Disabled State

The way screen readers communicate that an element is disabled (via the disabled attribute or aria-disabled="true") varies significantly in word choice. NVDA says "unavailable", a term that clearly communicates the element exists but cannot be interacted with. VoiceOver says "dimmed", a visual metaphor that originated from macOS native UI conventions where disabled controls appear grayed out. Narrator says "disabled", the most literal and technical term. JAWS varies between "unavailable" and "grayed" depending on context and version. Despite different words, users of each reader understand the meaning. You don't need to add extra ARIA to compensate for these differences. Just use the standard disabled pattern.
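
If you assert on disabled state in tests, it helps to accept each reader's own vocabulary rather than one fixed word. The following is a minimal sketch under that assumption; `DISABLED_TERMS` and `conveysDisabled` are hypothetical names, and the sample announcement strings are illustrative rather than captured from real readers.

```typescript
// Hypothetical vocabulary table taken from the paragraph above.
// JAWS gets two entries because its wording varies by context and version.
type Reader = "nvda" | "jaws" | "voiceover" | "narrator";

const DISABLED_TERMS: Record<Reader, string[]> = {
  nvda: ["unavailable"],
  jaws: ["unavailable", "grayed"],
  voiceover: ["dimmed"],
  narrator: ["disabled"],
};

// Returns true when the announcement contains any term this reader
// uses for the disabled state. Matching is on whole words.
function conveysDisabled(reader: Reader, announcement: string): boolean {
  const words = announcement.toLowerCase().split(/[^a-z]+/);
  return DISABLED_TERMS[reader].some((term) => words.indexOf(term) !== -1);
}

console.log(conveysDisabled("voiceover", "Save, dimmed, button")); // true
console.log(conveysDisabled("nvda", "Save button"));               // false
```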

Images

When encountering an image with alt text, NVDA announces it as "graphic" followed by the alt text: "graphic Company logo". VoiceOver and Narrator both use "image" as the role word: "Company logo, image". JAWS varies: in some versions it says "graphic" and in others "image", and the position relative to the alt text can change with settings. The key takeaway: always provide meaningful alt text, and don't worry about whether users hear "graphic" or "image". Both words are universally understood by screen reader users to mean "this is a non-text visual element."

Checkboxes

Unchecked checkboxes demonstrate vocabulary differences clearly. NVDA says "not checked", direct and unambiguous. VoiceOver says "unticked", using the British English metaphor of a tick mark. Narrator says "unchecked", a simple prefix negation. For checked state, VoiceOver says "ticked" while others say "checked". These are cosmetic vocabulary choices that don't affect usability. Users learn their reader's vocabulary quickly and never confuse the meaning.

Form Fields

Text inputs reveal some of the biggest divergences in announcement style. JAWS says "type in text" as a prompt, actively instructing the user what to do. NVDA says "edit blank" for an empty field or "edit" followed by the current value if populated, a more descriptive approach. VoiceOver says "text field", clean and minimal. Narrator says "edit", as in "Email, edit, required". These differences mean that automated testing output will look different for each reader, but the semantic content (this is a text input, here is its label, here is its current value) remains consistent across all four readers.

How Speakable Models These Differences

Speakable includes dedicated rendering engines for each of the four major screen readers. Rather than producing a single generic output, it applies the specific announcement patterns, vocabulary choices, and ordering conventions documented above. When you run Speakable against a piece of HTML, you get four distinct outputs that approximate what each reader would say.

Each renderer is built around the default verbosity settings for its target screen reader. NVDA's renderer uses concatenated name-role patterns and the "graphic"/"unavailable"/"not checked" vocabulary. JAWS's renderer applies "region" for landmarks and "type in text" for form fields. VoiceOver's renderer uses comma separation, "dimmed" for disabled state, and "ticked"/"unticked" for checkboxes. Narrator's renderer uses comma separation with the "disabled" vocabulary and concise phrasing.

It's important to note that each real screen reader has extensive verbosity configuration. Users can increase or decrease how much information gets spoken, change punctuation behavior, and even create custom pronunciation dictionaries. Speakable models the default experience: what a user hears on a fresh installation with no customization. This represents the most common baseline and is the appropriate target for development-time testing.

The renderers are deterministic. The same HTML always produces the same output for a given screen reader target. This makes them suitable for snapshot testing, CI/CD assertions, and regression detection. If your HTML changes in a way that alters the predicted screen reader output, Speakable will flag it.
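
Because the output is deterministic, a regression check can be a plain comparison against a stored baseline. The sketch below assumes you have already captured each reader's output as a list of announcement strings; the `Snapshot` shape and `changedReaders` function are hypothetical and do not describe Speakable's own snapshot format.

```typescript
// Hypothetical snapshot shape: one list of announcements per reader.
type Reader = "nvda" | "jaws" | "voiceover" | "narrator";
type Snapshot = Record<Reader, string[]>;

// Returns the readers whose predicted output differs from the baseline.
function changedReaders(baseline: Snapshot, current: Snapshot): Reader[] {
  const readers: Reader[] = ["nvda", "jaws", "voiceover", "narrator"];
  return readers.filter(
    (reader) => baseline[reader].join("\n") !== current[reader].join("\n"),
  );
}

const baseline: Snapshot = {
  nvda: ["Submit button"],
  jaws: ["Submit button"],
  voiceover: ["Submit, button"],
  narrator: ["Submit, button"],
};
// Simulate a change that drops the label for one reader only.
const current: Snapshot = { ...baseline, voiceover: ["button"] };

console.log(changedReaders(baseline, current)); // [ 'voiceover' ]
```

An exact-match comparison is appropriate here because each snapshot is compared against the same reader's earlier output, not against another reader's wording.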

Practical Implications for Developers

Now that you understand how screen readers differ, the natural question is: what should you actually do about it? The answer is simpler than you might expect. The differences documented above are presentation-layer concerns. They affect how information is spoken, not what information is available. Your job as a developer is to provide correct semantic structure; each screen reader handles the presentation according to its own conventions.

Don't try to make all readers say the same thing

Each reader has its own vocabulary and ordering conventions developed over decades. Users are fluent in their reader's language. Attempting to force uniformity (e.g., adding extra ARIA to make VoiceOver announce things more like NVDA) creates confusion rather than clarity. Embrace the differences. They exist for good reasons and users expect them.

Focus on correct semantic structure

Use the right HTML elements. Add labels to form fields. Provide alt text for images. Set ARIA states correctly. If your semantics are right, every screen reader will convey the correct meaning, just with different words and ordering. The accessibility tree is your contract; how readers present it is their responsibility.

Test that meaning is preserved, not exact wording

When reviewing Speakable output across readers, ask: "Does each reader convey the same functional information?" Not: "Do they all say the exact same words?" A button labeled "Delete" might be announced as "Delete button" (NVDA) or "Delete, button" (VoiceOver), and both are correct. A missing label is a problem regardless of reader; different wording for the same semantic content is not.
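
One way to automate that question is to reduce each announcement to a set of canonical tokens before comparing. This is a rough sketch, not a Speakable feature: the synonym table and the `semanticTokens` and `sameMeaning` names are assumptions, and the table covers only the vocabulary differences listed in this guide.

```typescript
// Hypothetical synonym table mapping reader-specific words to one
// canonical token. Deliberately small; extend it for your own tests.
const CANONICAL = new Map<string, string>([
  ["ticked", "checked"],
  ["unticked", "unchecked"],
  ["graphic", "image"],
  ["dimmed", "disabled"],
  ["unavailable", "disabled"],
  ["grayed", "disabled"],
]);

// Lowercases, merges multi-word variants, strips punctuation, maps
// synonyms, and sorts so that ordering differences are ignored.
function semanticTokens(announcement: string): string[] {
  return announcement
    .toLowerCase()
    .replace(/not checked/g, "unchecked")
    .replace(/check box/g, "checkbox")
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0)
    .map((word) => CANONICAL.get(word) ?? word)
    .sort();
}

function sameMeaning(a: string, b: string): boolean {
  return semanticTokens(a).join(" ") === semanticTokens(b).join(" ");
}

console.log(sameMeaning("Delete button", "Delete, button"));                      // true
console.log(sameMeaning("Accept checkbox checked", "Accept, ticked, checkbox")); // true
console.log(sameMeaning("Delete button", "button"));                             // false
```

The last call shows the case that matters: a missing label changes the token set, so it fails the comparison even though wording differences alone do not.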

View all outputs simultaneously

Speakable shows all four screen reader outputs side by side using the -s all flag. This is the fastest way to spot issues that affect one reader but not others; for example, a landmark that NVDA announces correctly but VoiceOver skips due to a missing label.

CLI Commands

Use the -s flag to target specific screen readers or view all outputs at once:

# Show all four screen reader outputs
speakable analyze index.html -s all

# Check individual readers
speakable analyze index.html -s nvda
speakable analyze index.html -s jaws
speakable analyze index.html -s voiceover
speakable analyze index.html -s narrator

The -s all output groups results by screen reader with clear headers, making it easy to scan for inconsistencies. When running in CI, you can assert against individual reader outputs to catch regressions specific to one reader's rendering path.

For programmatic use, the renderer functions are available individually through the JavaScript API. Each renderer takes the same parsed accessibility tree and applies its own formatting rules, so you can integrate cross-reader testing directly into your test suite without spawning CLI processes.

Important caveat

Speakable approximates screen reader behavior. Actual output varies by version, settings, and browser. Always validate complex interactions with real assistive technology. Speakable is a development-time linter, not a replacement for manual testing with actual screen readers. Use it to catch obvious issues early and to maintain consistency across code changes. Then confirm critical flows with real AT before shipping.
