Research Portfolio · AI for Accessibility

Beyond Visual Defaults.

Creative software assumes sight. I build systems through which blind and low-vision people author slides, charts, and 3-D models, and direct the AI that increasingly helps make them: perceiving layout through touch and audio, verifying output they cannot see, and automating routine work on their own terms.

Designing for creators who cannot inspect the surface is a preview of what every AI user will soon face.

The Bigger Picture

Why Accessibility Is Where Trustworthy AI Gets Solved

A blind user working with AI depends on output they cannot inspect at a glance. That has always been the reality of assistive technology, and it is rapidly becoming the ordinary condition of using AI at all. Three reasons this population is the right place to solve the problem:

1 The Highest Bar for Trust

No glance to fall back on

Sighted users catch many AI errors incidentally, in a glance; blind users catch none that the interface does not surface. A system that earns their trust cannot assume the user will notice; verification has to be designed in.

The payoff: Oversight mechanisms proven with blind users hold up for any user, in any high-stakes setting.

2 Everyone's Problem, Arriving at Scale

Review is not keeping up

When an agent rewrites thousands of lines of code or an entire document at once, no one reviews all of it. The condition blind users have always managed, acting on output they cannot fully perceive, is becoming the default for everyone.

The payoff: The structured, non-visual interaction techniques built for accessibility are a ready answer to this coming overload.

3 Agents Perceive Like Screen Readers

Structure, not pixels

AI agents read interfaces through accessibility trees, APIs, and structured representations: the same channels screen readers use. Making these representations expressive for blind users and making them expressive for agents are largely the same research problem.

The payoff: Accessibility research doubles as the design of AI-native interfaces.

Below, I present three case studies that put this vision into practice, each tackling a distinct facet of making visual creativity accessible through AI: restoring spatial perception, enabling output verification, and giving users authorship over automation.

01
Spatial Understanding · Touch & Audio · CHI 2023 · ASSETS 2024

Breaking the 1D Barrier

The Problem

Screen readers flatten the rich, 2D world of artboards and charts into a linear, 1D stream of text. Blind users lose spatial context, making layout and data analysis cognitively exhausting.

Spatial perception is often treated as purely visual, but it is fundamentally geometric and relational. My work decouples spatial reasoning from sight by reintroducing dimensionality through multimodal interaction. By combining touch, haptics, and spatial audio, we allow users to "physically" explore digital artifacts, restoring the agency to perceive layout and density that screen readers strip away.

A11yBoard: Decoupling Command and Perception

A11yBoard is a system that makes digital artboards, such as presentation slides, accessible to blind users. Blind creators are often power users of keyboards, but keyboards lack spatial feedback. A11yBoard introduces a "split-sensory" architecture: users keep the precision of the keyboard for command input (on a laptop) while gaining a new "perceptual window" via a paired touch device, enabling risk-free spatial exploration of slide layouts.

A blind user utilizing A11yBoard with a split setup: a laptop for keyboard commands and a smartphone for tactile spatial exploration.
The A11yBoard Split Setup: Laptop for editing commands, mobile device for spatial perception.

To support this, we developed a gesture vocabulary that translates visual scanning into tactile queries. The "Split-Tap" allows a user to keep one finger on an object (maintaining spatial reference) while tapping with another to query its properties, separating navigation from interrogation.

Diagram of A11yBoard gestures including Single-finger Exploration, Split-tap Selection, Two-finger Dwell, Two-finger Flick, Double Tap, Three-finger Swipe, Quadruple Tap, and Triple Tap.
Gesture vocabulary for non-visual spatial scanning.
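One way to picture the Split-Tap is as a small classifier over touch events: a quick tap counts as a query only if another finger was already resting on the screen. The sketch below is illustrative and is not A11yBoard's implementation; the `Touch` record, the `find_split_taps` function, and the 0.3-second tap threshold are all assumptions, and finger movement is ignored for brevity.

```python
from dataclasses import dataclass

TAP_MAX_SECONDS = 0.3  # assumed threshold separating a tap from a hold


@dataclass
class Touch:
    finger_id: int
    kind: str    # "down" or "up"
    time: float  # seconds since the gesture stream started
    x: float
    y: float


def find_split_taps(events):
    """Return the (x, y) of the resting finger for each tap made while it was held."""
    held = {}      # finger_id -> the "down" event of a finger still on the screen
    results = []
    for ev in events:
        if ev.kind == "down":
            held[ev.finger_id] = ev
        elif ev.kind == "up":
            down = held.pop(ev.finger_id, None)
            if down is None:
                continue  # "up" without a matching "down"; ignore it
            is_tap = ev.time - down.time <= TAP_MAX_SECONDS
            # Candidate anchors: fingers still down that landed before the tapping finger.
            anchors = [d for d in held.values() if d.time < down.time]
            if is_tap and anchors:
                anchor = min(anchors, key=lambda d: d.time)
                results.append((anchor.x, anchor.y))
    return results


events = [
    Touch(1, "down", 0.0, 100.0, 200.0),  # first finger rests on an object
    Touch(2, "down", 1.0, 300.0, 400.0),  # second finger taps elsewhere
    Touch(2, "up", 1.1, 300.0, 400.0),
    Touch(1, "up", 2.0, 100.0, 200.0),
]
print(find_split_taps(events))
```

The query is reported at the resting finger's position, not the tapping finger's, which reflects the idea of keeping a spatial reference while interrogating the object under it.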

ChartA11y: Feeling the Shape of Data

While A11yBoard handles discrete objects, data visualizations present a different challenge: density. A screen reader can read a data point, but it cannot convey a "trend." ChartA11y is a smartphone app that makes charts and graphs accessible through touch, haptics, and sonification. It provides complementary modes of interaction.

Semantic Navigation provides structured access to chart components. Users traverse the chart's hierarchy (axes, legends, series) through a gesture set designed for building a mental model before diving into details.

ChartA11y gesture vocabulary including panning, double tap, swipe, and rotor interactions.
Semantic Navigation Gestures: Structured access to chart hierarchy.

Sonification turns data analysis into a multisensory experience. We map pitch to value and timbre to density, enabling users to perceive trends through audio alone.

Visualization of auditory feedback mapping pitch to Y-values in line charts and pitch/duration to density in scatter plots.
Sonification Design: Mapping data density to audio timbre and duration.
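The pitch-to-value half of this mapping can be sketched in a few lines. This is a minimal illustration, not ChartA11y's code: the function names and the 220 to 880 Hz range are assumptions, and interpolating on a logarithmic frequency scale is one common design choice so that equal steps in the data sound like equal musical steps.

```python
def value_to_frequency(value, vmin, vmax, low_hz=220.0, high_hz=880.0):
    """Map a data value to a frequency in Hz, interpolating on a log scale."""
    if vmax == vmin:
        return low_hz  # flat series: every point gets the same pitch
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))  # clamp values outside the stated range
    return low_hz * (high_hz / low_hz) ** t


def sonify(series, low_hz=220.0, high_hz=880.0):
    """Return one frequency per data point, scaled to the series' own range."""
    vmin, vmax = min(series), max(series)
    return [round(value_to_frequency(v, vmin, vmax, low_hz, high_hz), 1) for v in series]


print(sonify([0, 5, 10]))  # rising values produce rising pitches
```

With the assumed range, the midpoint of the data lands one octave above the lowest value and one octave below the highest.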

Direct Touch Mapping turns the screen into a tactile canvas. As users drag their fingers across a scatter plot, they receive continuous haptic and auditory feedback based on data density, which lets them locate clusters, outliers, and gaps directly under their fingertips.
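Density-driven feedback of this kind can be approximated by counting the data points near the finger and scaling that count to a feedback intensity. The following is a simplified sketch under stated assumptions, not the app's actual algorithm: `local_density`, `feedback_level`, and the fixed search radius are hypothetical.

```python
import math


def local_density(points, touch_x, touch_y, radius):
    """Count the data points within `radius` of the touch location."""
    return sum(
        1 for (x, y) in points
        if math.hypot(x - touch_x, y - touch_y) <= radius
    )


def feedback_level(count, max_count):
    """Scale a point count to a 0.0-1.0 intensity for haptic or audio output."""
    if max_count <= 0:
        return 0.0
    return min(1.0, count / max_count)


points = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.0), (5.0, 5.0)]
for touch in [(1.0, 1.0), (3.0, 3.0), (5.0, 5.0)]:
    count = local_density(points, touch[0], touch[1], radius=0.5)
    print(touch, count, feedback_level(count, max_count=3))
```

In this toy data set the first touch lands on a cluster, the second on a gap, and the third on an isolated outlier, so the three intensities differ accordingly.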

ChartA11y Direct Touch Mapping features showing sonification via touch, pinch to zoom, split-tap for info, and swiping to switch series.
Direct Touch Mapping: Users "scan" density with their fingers, using pinch-to-zoom to manage information.

Key Contribution: Multimodal systems that restore spatial reasoning to assistive technology, enabling blind users to perceive 2D layouts and dense data trends that screen readers fundamentally cannot convey.

02
3D Modeling · Generative AI · ASSETS 2025

Trust in the Black Box

The Problem

Generative AI can create complex 3D models instantly, but for blind creators, the output is a black box. How can they trust that the AI respected their intent if they can't see the mesh?

If a blind user prompts an AI for a "helicopter," they might get a blob or a masterpiece. Without sight, they cannot verify the result. A11yShape is a system that enables blind users to create and verify 3D parametric models with AI assistance. It solves the verification problem through Cross-Representation Interaction: instead of showing only the visual output, we synchronize the Code (the source of truth), the Semantic Hierarchy (the structure), and the AI Description (the explanation).

A11yShape UI showing the synchronization between the Code Editor, Semantic Tree, and AI Assistant panel.
Cross-Representation Highlighting: Selecting a component in the semantic tree (1) highlights its code (4) and generates a focused AI description (3).

This triangulation allows for verification without vision. Users inspect the underlying logic rather than visual output. If the AI says "added a propeller," the user can verify that the code block exists, is connected to the right parent node, and has parameters that make sense, transforming a "slot machine" interaction into a rigorous, iterative engineering process.
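The three checks in the propeller example (the component exists, it hangs off the right parent, and its parameters are plausible) can be expressed as a small routine over a semantic tree. This is a hypothetical sketch, not A11yShape's implementation: the `Component` structure, the `verify_claim` function, and the parameter limits are all assumed for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    code_lines: tuple                     # (first, last) source lines this node came from
    params: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def find(node, name, parent=None):
    """Depth-first search; returns (component, parent) or (None, None)."""
    if node.name == name:
        return node, parent
    for child in node.children:
        found, found_parent = find(child, name, node)
        if found is not None:
            return found, found_parent
    return None, None


def verify_claim(root, name, expected_parent, limits):
    """Return a list of problems; an empty list means the claim checks out."""
    comp, parent = find(root, name)
    if comp is None:
        return [f"no component named '{name}' in the model"]
    problems = []
    if parent is None or parent.name != expected_parent:
        actual = parent.name if parent else "nothing"
        problems.append(f"'{name}' is attached to {actual}, expected {expected_parent}")
    for key, (lo, hi) in limits.items():
        value = comp.params.get(key)
        if value is None or not (lo <= value <= hi):
            problems.append(f"parameter '{key}' is {value}, expected {lo} to {hi}")
    return problems


propeller = Component("propeller", (12, 18), {"blade_length": 40})
body = Component("body", (3, 11), {"length": 100}, [propeller])
model = Component("helicopter", (1, 20), {}, [body])
print(verify_claim(model, "propeller", "body", {"blade_length": (20, 60)}))
```

Each problem is returned as a plain sentence, so it could be read aloud by a screen reader instead of being shown visually.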

User journey comparison showing failure of all-at-once generation vs success of iterative, component-based construction.
From Hallucination to Engineering: While "all-at-once" generation fails (1), iterative verification allows blind users to construct complex parametric models (2-10).

Key Contribution: A verification paradigm for AI-generated artifacts where users inspect synchronized semantic representations rather than visual output, making trust possible without sight.

03
User-Defined Routines · AI Agents · Preprint 2026

Authoring the Automation

The Problem

Navigating information-dense interfaces with a screen reader is repetitive and exhausting. AI agents promise automation but often act as "black boxes," taking control away from the user and creating safety risks if they hallucinate.

ScreenRoutine is a system that lets blind users define their own automation routines in natural language, then translates those routines into structured, verifiable programs. Instead of asking a black-box agent to "do it for me," users author Routines, e.g., "Find the cheapest cable," which the system compiles into semantic blocks: Triggers, Filters, and Actions. This restores agency by putting the user in control of the automation logic.

ScreenRoutine Workflow: (A) User describes intent in natural language. (B) System translates this into a structured, verifiable routine. (C) The routine executes via standard screen reader navigation. (D) User can refine logic via natural language.
From Intent to Execution: Users speak a goal (A), which transforms into a verifiable routine (B) that drives the screen reader (C).

The Intermediate Representation is the key to trust. Before execution, the user can audit the logic: "Did it interpret 'cheapest' as sorting by price?" If the logic is flawed, they can refine it in natural language (e.g., "Actually, sort by length"). By sitting between the user and the application, ScreenRoutine empowers blind users to be the architects of their own automation, combining the flexibility of LLMs with the reliability of deterministic execution.
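To make the idea of an auditable intermediate representation concrete, here is a minimal sketch of a routine with one Trigger, one Filter, and one Action. It is not ScreenRoutine's actual format: the `Routine` class, its field names, and the sample items are assumptions, and the natural-language translation step (which the system performs with an LLM) is replaced by a routine written by hand.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Routine:
    trigger: str      # when the routine runs
    filter_word: str  # keep items whose title contains this word
    sort_field: str   # the action focuses the item with the lowest value here

    def describe(self):
        """Plain-language summary the user can audit before anything executes."""
        return (f"When {self.trigger}, keep items containing '{self.filter_word}', "
                f"then focus the one with the lowest {self.sort_field}.")

    def run(self, items):
        """Deterministic execution: same items and routine always give the same result."""
        matches = [it for it in items if self.filter_word in it["title"].lower()]
        return min(matches, key=lambda it: it[self.sort_field]) if matches else None


items = [
    {"title": "USB-C Cable 2 m", "price": 9.99, "length_m": 2.0},
    {"title": "USB-C Cable 1 m", "price": 12.49, "length_m": 1.0},
    {"title": "Wall Charger", "price": 5.00, "length_m": 0.0},
]

routine = Routine("results load", "cable", "price")  # "Find the cheapest cable"
print(routine.describe())
print(routine.run(items)["title"])

refined = replace(routine, sort_field="length_m")    # "Actually, sort by length"
print(refined.run(items)["title"])
```

Because the refinement changes a single named field, the user can hear exactly what changed between the two versions before running either one.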

Key Contribution: A system that shifts the accessibility paradigm from rigid tool use to user-authored AI routines, restoring personal agency in automated workflows.