API Reference

Speakable processes HTML through a modular pipeline: parse → extract → model → render. Each stage is available as a standalone programmatic API via @reticular/speakable, or through the speakable CLI.

Parser

The Parser is the entry point of the pipeline. It takes a raw HTML string and returns a parsed DOM document via jsdom with lenient error recovery. Malformed HTML is handled gracefully — the parser emits warnings but continues processing.

CLI
# Analyze an HTML file (default JSON output)
speakable page.html

Programmatic API
import { readFileSync } from 'fs';
import { parseHTML } from '@reticular/speakable';

// Parse any HTML string: raw snippets, full pages, or file contents
const { document, warnings } = parseHTML('<button>Submit</button>');

// Or read the HTML from a file first
const html = readFileSync('page.html', 'utf-8');
const result = parseHTML(html);

Returns { document, warnings } — a standard DOM Document and an array of parsing warnings. Lenient mode means even severely malformed HTML produces a usable document rather than throwing.

Extractor

The Extractor walks the parsed DOM and builds a canonical accessibility tree. It computes accessible names, maps roles, extracts states, and determines focusability for every element — following the W3C ARIA specification.

Programmatic API
import {
  buildAccessibilityTree,
  buildAccessibilityTreeWithSelector,
  computeAccessibleName,
  computeRole,
} from '@reticular/speakable';

// Build the full accessibility tree from a DOM element
const { model, warnings } = buildAccessibilityTree(document.body);

// Or filter to specific elements with a CSS selector
const results = buildAccessibilityTreeWithSelector(document.body, 'button');

Name Computation

Follows the ARIA name computation algorithm: aria-labelledby → aria-label → native label → alt → text content → title.
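
The precedence above can be read as a first-non-empty lookup. The sketch below is only an illustration, not Speakable's implementation: `NameSources` and `pickAccessibleName` are hypothetical names, and each field is assumed to hold text that has already been resolved from the DOM.

```typescript
// Hypothetical input: text already resolved for each candidate name source.
interface NameSources {
  labelledBy?: string;  // text of the elements referenced by aria-labelledby
  ariaLabel?: string;   // aria-label attribute
  nativeLabel?: string; // text of an associated <label>
  alt?: string;         // alt attribute
  textContent?: string; // text content of the element
  title?: string;       // title attribute
}

// Return the first non-empty source, in the documented precedence order.
function pickAccessibleName(sources: NameSources): string {
  const ordered = [
    sources.labelledBy,
    sources.ariaLabel,
    sources.nativeLabel,
    sources.alt,
    sources.textContent,
    sources.title,
  ];
  for (const candidate of ordered) {
    const trimmed = candidate?.trim();
    if (trimmed) return trimmed;
  }
  return ''; // no source produced a name
}

console.log(pickAccessibleName({ ariaLabel: 'Close dialog', textContent: 'X' })); // prints: Close dialog
console.log(pickAccessibleName({ textContent: '  Submit  ', title: 'Send form' })); // prints: Submit
```

For real elements, computeAccessibleName from the import list above is the function to call; this sketch only shows why an aria-label overrides visible text.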

Role Mapping

Explicit role attribute takes priority, then implicit role from the HTML element (e.g. <nav> → navigation, <a href> → link).
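
As a rough sketch of that priority, assuming a tiny hand-picked subset of implicit roles: `IMPLICIT_ROLES` and `resolveRole` are hypothetical names for illustration and are not part of the package.

```typescript
// Illustrative subset of implicit roles; a real mapping covers many more elements.
const IMPLICIT_ROLES: Record<string, string> = {
  nav: 'navigation',
  main: 'main',
  button: 'button',
  h1: 'heading',
};

// Hypothetical helper: an explicit role attribute wins, then the implicit role.
function resolveRole(tag: string, attrs: Record<string, string>): string | undefined {
  const explicit = attrs['role']?.trim();
  if (explicit) return explicit;

  const name = tag.toLowerCase();
  // <a> only maps to "link" when it carries an href (simplified here).
  if (name === 'a') return 'href' in attrs ? 'link' : undefined;

  return IMPLICIT_ROLES[name];
}

console.log(resolveRole('nav', {}));                // prints: navigation
console.log(resolveRole('div', { role: 'button' })); // prints: button
console.log(resolveRole('a', { href: '/docs' }));    // prints: link
```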

State Extraction

Extracts ARIA states and properties: expanded, checked (including mixed), pressed, selected, disabled, invalid, required, readonly, busy, current, grabbed, hidden, level, posinset, setsize.

Focus Detection

Determines focusability from native element type and explicit tabindex. Reports both focusable status and tabindex value.
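
A minimal sketch of that decision, under simplifying assumptions: only a few native controls are listed, and input types such as hidden are ignored. `detectFocus` and `NATIVELY_FOCUSABLE` are hypothetical names; the result shape mirrors the `{ focusable, tabindex? }` pair described above.

```typescript
interface FocusInfo {
  focusable: boolean;
  tabindex?: number;
}

// Simplified list of natively focusable elements (assumption for this sketch).
const NATIVELY_FOCUSABLE = ['button', 'input', 'select', 'textarea'];

function detectFocus(tag: string, attrs: Record<string, string>): FocusInfo {
  const name = tag.toLowerCase();

  // Parse an explicit tabindex, ignoring values that are not integers.
  const raw = attrs['tabindex'];
  const parsed = raw === undefined ? NaN : parseInt(raw, 10);
  const tabindex = isNaN(parsed) ? undefined : parsed;

  const isControl = NATIVELY_FOCUSABLE.indexOf(name) !== -1;
  const native = isControl || (name === 'a' && 'href' in attrs);
  // A disabled form control is not focusable, even with a tabindex.
  const disabled = isControl && 'disabled' in attrs;

  const focusable = !disabled && (native || tabindex !== undefined);
  return tabindex === undefined ? { focusable } : { focusable, tabindex };
}

console.log(detectFocus('button', {}));             // { focusable: true }
console.log(detectFocus('div', { tabindex: '0' })); // { focusable: true, tabindex: 0 }
console.log(detectFocus('div', {}));                // { focusable: false }
```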

Renderers

Renderers transform the canonical model into screen reader-specific announcement text. Each renderer applies the unique patterns of its target screen reader. Pass an optional colorize boolean for ANSI-colored terminal output.

Programmatic API
import { renderNVDA, renderJAWS, renderVoiceOver, renderAuditReport }
  from '@reticular/speakable';

renderNVDA(model);          // plain text
renderNVDA(model, true);    // ANSI-colored output
renderJAWS(model);          // JAWS-style output
renderVoiceOver(model);     // VoiceOver-style output
renderAuditReport(model);   // structured audit report

NVDA

Simulates NVDA speech output. Uses "navigation landmark", "edit" for textboxes, "graphic" for images. States: "not checked", "half checked" (mixed), "unavailable" (disabled).

JAWS

Approximates JAWS speech patterns. Uses "navigation region" (vs NVDA's "landmark"), "clickable" for links, "check box" (two words). Mixed state: "partially checked".

VoiceOver

Tailored for macOS VoiceOver. Announces role before name for headings and landmarks (e.g. "navigation, Main"). Uses "dimmed" for disabled, "edit text" for textboxes.
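
The vocabulary differences described for the three renderers can be summarised as a lookup table. The sketch below lists only phrases quoted in this reference; the key format (`role:…`, `state:…`) and the `phraseFor` helper are hypothetical and do not reflect the renderers' internals.

```typescript
type Reader = 'nvda' | 'jaws' | 'voiceover';

// Only phrases quoted in this reference are listed; everything else is omitted.
const PHRASES: Record<Reader, Record<string, string>> = {
  nvda: {
    'role:navigation': 'navigation landmark',
    'role:textbox': 'edit',
    'role:img': 'graphic',
    'state:unchecked': 'not checked',
    'state:mixed': 'half checked',
    'state:disabled': 'unavailable',
  },
  jaws: {
    'role:navigation': 'navigation region',
    'role:checkbox': 'check box',
    'state:mixed': 'partially checked',
  },
  voiceover: {
    'role:textbox': 'edit text',
    'state:disabled': 'dimmed',
  },
};

// Look up a phrase; undefined means this reference does not state one.
function phraseFor(reader: Reader, key: string): string | undefined {
  const table = PHRASES[reader];
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;
}

console.log(phraseFor('nvda', 'state:mixed')); // prints: half checked
console.log(phraseFor('jaws', 'state:mixed')); // prints: partially checked
```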

Audit Report

Generates a structured accessibility report with landmark structure, heading hierarchy validation, interactive element inventory, severity-coded issues (error/warning/info), and summary statistics.
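
To make "heading hierarchy validation" concrete, here is one plausible check, sketched under assumptions: it flags heading levels that skip downward by more than one step. The `Issue` shape, the severities chosen, and `checkHeadingOrder` are hypothetical and may not match what renderAuditReport actually reports.

```typescript
type Severity = 'error' | 'warning' | 'info';

interface Issue {
  severity: Severity;
  message: string;
}

// Flag heading sequences that skip levels (for example, h2 followed by h4).
function checkHeadingOrder(levels: number[]): Issue[] {
  const issues: Issue[] = [];
  let prev: number | undefined;

  for (const level of levels) {
    if (prev === undefined) {
      // Assumed severity: a page that does not start at level 1 is only a note.
      if (level !== 1) {
        issues.push({ severity: 'info', message: `First heading is level ${level}, not 1` });
      }
    } else if (level > prev + 1) {
      issues.push({ severity: 'warning', message: `Heading level skips from ${prev} to ${level}` });
    }
    prev = level;
  }
  return issues;
}

console.log(checkHeadingOrder([1, 2, 4]));
// [ { severity: 'warning', message: 'Heading level skips from 2 to 4' } ]
```

Moving back up the hierarchy (for example from level 3 to level 2) is not flagged, since only skipped levels on the way down break the outline.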

Model

The canonical AnnouncementModel is a deterministic, serializable representation of the accessibility tree. It's designed for snapshot testing, diffing, and CI/CD pipelines.

Programmatic API
import { serializeModel, deserializeModel, validateModel }
  from '@reticular/speakable';

// Serialize to deterministic JSON (sorted keys)
const json = serializeModel(model);

// Deserialize back with validation
const restored = deserializeModel(json);

// Validate model structure (throws ValidationError if the model is invalid)
validateModel(model);

AnnouncementModel Structure
interface AnnouncementModel {
  version: { major: number; minor: number };
  root: AccessibleNode;
  metadata: {
    extractedAt: string;   // ISO 8601 timestamp
    sourceHash?: string;   // hash of source HTML
  };
}

interface AccessibleNode {
  role: AccessibleRole;       // "button", "link", "heading", etc.
  name: string;               // computed accessible name
  description?: string;       // aria-describedby / title
  value?: AccessibleValue;    // form control values
  state: AccessibleState;     // expanded, checked, pressed, etc.
  focus: FocusInfo;           // { focusable, tabindex? }
  children: AccessibleNode[]; // child nodes
}
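
Sorted keys are what make serialized output stable enough to diff and snapshot. The sketch below shows one common way to get that property. It is not the package's serializeModel; `sortKeys` and `stableStringify` are hypothetical helpers, and the sample objects only loosely follow the AccessibleNode shape.

```typescript
// Recursively copy a JSON-like value with object keys in sorted order.
function sortKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map((item) => sortKeys(item));
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = sortKeys(source[key]);
    }
    return sorted;
  }
  return value; // primitives are returned unchanged
}

// Serialize so that key insertion order no longer affects the output.
function stableStringify(value: unknown): string {
  return JSON.stringify(sortKeys(value));
}

const a = { role: 'button', name: 'Save', focus: { focusable: true, tabindex: 0 } };
const b = { focus: { tabindex: 0, focusable: true }, name: 'Save', role: 'button' };

console.log(stableStringify(a) === stableStringify(b)); // prints: true
console.log(stableStringify(a));
// prints: {"focus":{"focusable":true,"tabindex":0},"name":"Save","role":"button"}
```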

Diff

The diff module compares two accessibility trees and returns a structured list of added, removed, and changed nodes. Each change includes the specific properties that differ (name, role, state, focus). Ideal for regression detection in CI.

CLI
# Compare two HTML files
speakable new.html --diff old.html

# Diff with text output
speakable new.html --diff old.html -f text

Programmatic API
import { diffAccessibilityTrees } from '@reticular/speakable';

const diff = diffAccessibilityTrees(oldTree, newTree);
// diff.changes → [{ type, path, node?, changes? }]
// diff.summary → { added, removed, changed, total }
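
To illustrate the result shape shown in the comments above, here is a simplified tree diff. It is a sketch under assumptions, not the library's algorithm: children are matched by position, only role and name are compared, and `SketchNode`, `diffNodes`, and `summarize` are hypothetical names.

```typescript
interface SketchNode {
  role: string;
  name: string;
  children: SketchNode[];
}

type Change =
  | { type: 'added' | 'removed'; path: string; node: SketchNode }
  | { type: 'changed'; path: string; changes: string[] };

// Walk both trees in parallel, matching children by index (a simplification).
function diffNodes(
  oldNode: SketchNode | undefined,
  newNode: SketchNode | undefined,
  path: string,
  out: Change[] = [],
): Change[] {
  if (oldNode === undefined) {
    if (newNode !== undefined) out.push({ type: 'added', path, node: newNode });
    return out;
  }
  if (newNode === undefined) {
    out.push({ type: 'removed', path, node: oldNode });
    return out;
  }

  const changes: string[] = [];
  if (oldNode.role !== newNode.role) changes.push('role');
  if (oldNode.name !== newNode.name) changes.push('name');
  if (changes.length > 0) out.push({ type: 'changed', path, changes });

  const count = Math.max(oldNode.children.length, newNode.children.length);
  for (let i = 0; i < count; i++) {
    diffNodes(oldNode.children[i], newNode.children[i], `${path}/${i}`, out);
  }
  return out;
}

// Count changes by type, mirroring the { added, removed, changed, total } summary.
function summarize(changes: Change[]) {
  const summary = { added: 0, removed: 0, changed: 0, total: changes.length };
  for (const change of changes) summary[change.type] += 1;
  return summary;
}

const before: SketchNode = {
  role: 'main', name: '', children: [{ role: 'button', name: 'Save', children: [] }],
};
const after: SketchNode = {
  role: 'main', name: '', children: [
    { role: 'button', name: 'Save changes', children: [] },
    { role: 'link', name: 'Help', children: [] },
  ],
};

console.log(summarize(diffNodes(before, after, 'root')));
// { added: 1, removed: 0, changed: 1, total: 2 }
```

Positional matching reports a reordered child as a pair of changes; a production diff would typically match nodes by a more stable identity.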

Voice Announcer

The web analyzer and browser extension include built-in speech playback powered by the browser's native SpeechSynthesis API. Hear what screen readers would say — no assistive technology installation required.

Play All

Reads the full output sequentially with pauses between lines and longer pauses between screen reader sections. Supports pause, resume, and stop.

Line-by-Line

Navigate output one line at a time with ↑/↓ arrow keys. Each line is spoken aloud and highlighted — mimicking how screen reader users actually browse.

Voice & Speed

Choose from available browser voices and adjust playback speed from 0.5x to 2.0x. Zero dependencies — uses the Web Speech API built into all modern browsers.
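
For readers who want to reproduce the playback in their own page, here is a rough sketch using the Web Speech API. `clampRate` and `speakLine` are hypothetical helpers, not part of Speakable; the 0.5 to 2.0 range comes from the description above, and the browser globals are reached through `globalThis` so the snippet does nothing when run outside a browser.

```typescript
// Keep the requested speed inside the documented 0.5x to 2.0x range.
function clampRate(rate: number): number {
  return Math.min(2.0, Math.max(0.5, rate));
}

// Speak one line of output; returns false when speech synthesis is unavailable.
function speakLine(text: string, rate: number): boolean {
  const g = globalThis as any; // browser globals may be missing (e.g. in Node)
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;

  const utterance = new g.SpeechSynthesisUtterance(text);
  utterance.rate = clampRate(rate);
  g.speechSynthesis.speak(utterance);
  return true;
}

console.log(clampRate(3));    // prints: 2
console.log(clampRate(0.25)); // prints: 0.5
speakLine('button, Submit', 1.25);
```

To choose a voice, `speechSynthesis.getVoices()` returns the voices the browser offers, and one of them can be assigned to the utterance's `voice` property before speaking.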

Extension Support

The Chrome extension includes the same voice controls — play, pause, stop, voice selection, and speed adjustment — directly in the extension popup.

See the Usage Guide for keyboard shortcuts and detailed usage instructions.