
Side by Side Diff: docs/accessibility.md

Issue 2478083002: Add more topics to accessiiblity documentation. (Closed)
Patch Set: Created 4 years, 1 month ago
1 # Accessibility Overview 1 # Accessibility Overview
2 2
3 This document describes how accessibility is implemented throughout Chromium at 3 Accessibility means ensuring that all users, including users with disabilities,
4 a high level. 4 have equal access to software. One piece of this involves basic design
5 principles such as using appropriate font sizes and color contrast,
6 avoiding using color to convey important information, and providing keyboard
7 alternatives for anything that is normally accomplished with a pointing device.
8 However, the majority of accessibility code in Chromium is concerned with
aboxhall 2016/11/04 17:02:07 Rather than "the majority of accessibility code in
dmazzoni 2016/11/04 22:09:05 Done.
9 providing full access to Chromium's UI via external accessibility APIs that
10 are utilized by assistive technology.
11
12 Assistive technology includes:
aboxhall 2016/11/04 17:02:07 How about "'Assistive technology' here refers to s
dmazzoni 2016/11/04 22:09:05 Done.
13
14 * Screen readers for blind users that describe the screen using
15 synthesized speech or braille,
16 * Voice control applications that let you speak to the computer,
17 * Switch access that lets you control the computer with a small number
18 of physical switches,
19 * Magnifiers that magnify a portion of the screen, and often highlight the
20 cursor and caret for easier viewing, and
21 * Assistive learning and literacy software that helps users who have a hard
22 time reading print, by highlighting and/or speaking selected text
23
24 In addition, because accessibility APIs provide a convenient and universal
25 way to explore and control applications, they're often used for automated
26 testing scripts, and UI automation software like password managers.
27
28 Web browsers play an important role in this ecosystem because they need
29 to not only provide access to their own UI, but also provide access to
30 all of the content of the web.
31
32 Each operating system has its own native accessibility API. While the
33 core APIs tend to be well-documented, it's unfortunately common for
34 screen readers in particular to depend on additional undocumented or
35 vendor-specific APIs in order to fully function, especially with web
36 browsers, because the standard APIs are insufficient to handle the
37 complexity of the web.
38
39 Chromium needs to support all of these operating system and
40 vendor-specific accessibility APIs in order to be usable with the full
41 ecosystem of assistive technology on all platforms. Just like Chromium
42 sometimes mimics the quirks and bugs of older browsers, Chromium often
43 needs to mimic the quirks and bugs of other browsers' implementation
44 of accessibility APIs, too.
5 45
6 ## Concepts 46 ## Concepts
7 47
8 The three central concepts of accessibility are: 48 While each operating system and vendor accessibility API is different,
49 there are some concepts all of them share.
9 50
10 1. The *tree*, which models the entire interface as a tree of objects, exposed 51 1. The *tree*, which models the entire interface as a tree of objects, exposed
11 to screenreaders or other accessibility software; 52 to assistive technology via accessibility APIs;
12 2. *Events*, which let accessibility software know that a part of the tree has 53 2. *Events*, which let assistive technology know that a part of the tree has
13 changed somehow; 54 changed somehow;
14 3. *Actions*, which come from accessibility software and ask the interface to 55 3. *Actions*, which come from assistive technology and ask the interface to
15 change. 56 change.
16 57
17 Here's an example of an accessibility tree looks like. The following HTML: 58 Consider the following small HTML file:
18 59
19 ``` 60 ```
20 <select title="Select A"> 61 <html>
21 <option value="1">Option 1</option> 62 <head>
22 <option value="2" selected>Option 2</option> 63 <title>How old are you?</title>
23 <option value="3">Option 3</option> 64 </head>
24 </select> 65 <body>
25 ``` 66 <label for="age">Age</label>
26 67 <input id="age" type="number" name="age" value="42">
27 has a generated accessibility tree like this: 68 <div>
28 69 <button>Back</button>
29 ``` 70 <button>Next</button>
30 0: AXMenuList title="Select A" 71 </div>
31 1: AXMenuListOption title="Option 1" 72 </body>
32 2: AXMenuListOption title="Option 2" selected 73 </html>
33 3: AXMenuListOption title="Option 3" 74 ```
34 ``` 75
35 76 ### The Accessibility Tree and Accessibility Attributes
36 Given that accessibility tree, an example of the events generated when selecting 77
37 "Option 1" might be: 78 Internally, Chromium represents the accessibility tree for that web page
38 79 using a data structure something like this:
39 ``` 80
40 AXMenuListItemUnselected 2 81 ```
41 AXMenuListItemSelected 1 82 id=1 role=WebArea name="How old are you?"
42 AXMenuListValueChanged 0 83 id=2 role=Label name="Age"
43 ``` 84 id=3 role=TextField labelledByIds=[2] value="42"
44 85 id=4 role=Group
45 An example of a command used to change the selection from "Option 1" to "Option 86 id=5 role=Button name="Back"
46 3" might be: 87 id=6 role=Button name="Next"
47 88 ```
48 ``` 89
49 AccessibilityMsg_DoDefaultAction 3 90 Note that the tree structure closely resembles the structure of the
50 ``` 91 HTML elements, but slightly simplified. Each node in the accessibility
51 92 tree has an ID and a role. Many have a name. The text field has a value,
52 All three concepts are handled at several layers in Chromium. 93 and instead of a name it has labelledByIds, which indicates that its
94 accessible name comes from another node in the tree, the label node
95 with id=2.
96
97 On a particular platform, each node in the accessibility tree is implemented
98 by an object that conforms to a particular protocol.
99
100 On Windows, the root element implements the IAccessible protocol and
101 if you call IAccessible::get_accRole, it returns ROLE_SYSTEM_DOCUMENT,
102 and if you call IAccessible::get_accName, it returns "How old are you?".
103 Other methods let you walk the tree.
104
105 On macOS, the root element implements the NSAccessibility protocol and
106 if you call accessibilityRole(), it returns @"AXWebArea", and if you
Elly Fong-Jones 2016/11/04 15:59:06 objc methods are generally referenced like -[NSAcc
dmazzoni 2016/11/04 22:09:05 Good idea
107 call accessibilityLabel(), it returns "How old are you?".
108
Elly Fong-Jones 2016/11/04 15:59:06 what (if anything) happens on Linux?
dmazzoni 2016/11/04 22:09:05 Done.
109 So while the details of the interface vary, the underlying concepts are
110 similar. Both IAccessible and NSAccessibility have a concept of a role,
111 but IAccessible uses a role of "document" for a web page, while NSAccessibility
112 uses a role of "web area". Both IAccessible and NSAccessibility have a
113 concept of the primary accessible text for a node, but IAccessible calls
114 it the "name" while NSAccessibility calls it the "label".
115
116 **Historical note:** The internal names of roles and attributes in
117 Chrome often tend to most closely match the macOS accessibility API
118 because Chromium was originally based on WebKit, where most of the
119 accessibility code was written by Apple. Over time we're slowly
120 migrating internal names to match what those roles and attributes are
121 called in web accessibility standards, like ARIA.
122
123 ### Accessibility Events
124
125 In Chromium's internal terminology, an Accessibility Event always represents
126 communication from the app to the assistive technology, indicating that the
127 accessibility tree changed in some way.
128
129 As an example, if the user were to press the Tab key and the text
130 field from the example above became focused, Chromium would fire a
131 "focus" accessibility event that assistive technology could listen
132 to. A screen reader might then announce the name and current value of
133 the text field. A magnifier might zoom the screen to its bounding
134 box. If the user types some text into the text field, Chromium would
135 fire a "value changed" accessibility event.
136
137 As with nodes in the accessibility tree, each platform has a slightly different
138 API for accessibility events. On Windows we'd fire EVENT_OBJECT_FOCUS for
139 a focus change, and on Mac we'd fire @"AXFocusedUIElementChanged".
140 Those are pretty similar. Sometimes they're quite different - to support
141 live regions (notifications that certain key parts of a web page have changed),
142 on Mac we simply fire @"AXLiveRegionChanged", but on Windows we need to
143 fire IA2_EVENT_TEXT_INSERTED and IA2_EVENT_TEXT_REMOVED events individually
144 on each affected node within the changed region, with additional attributes
145 like "container-live:polite" to indicate that the affected node was part of
146 a live region. The point is just to illustrate that the concepts are similar,
Elly Fong-Jones 2016/11/04 15:59:06 is this last sentence a meta-comment on the rest o
dmazzoni 2016/11/04 22:09:05 Yes, I tried to clarify it a bit. I just didn't wa
147 but the details of notifying software on each platform about changes can
148 vary quite a bit.
149
150 ### Accessibility Actions
151
152 Each native object that implements a platform's native accessibility API
153 supports a number of actions, which are requests from the assistive
154 technology to control or change the UI. This is the opposite of events,
155 which are messages from Chromium to the assistive technology.
156
157 For example, if the user had a voice control application running, such as
158 Voice Access on Android, the user could just speak the name of one of the
159 buttons on the page, like "Next". Upon recognizing that text and finding
160 that it matches one of the UI elements on the page, the voice control
161 app executes the action to click the button id=6 in Chromium's accessibility
162 tree. Internally we call that action "do default" rather than click, since
163 it represents the default action for any type of control.
164
165 Other examples of actions include setting focus, changing the value of
166 a control, and scrolling the page.
167
168 ### Parameterized attributes
169
170 In addition to accessibility attributes, events, and actions, native
171 accessibility APIs often have so-called "parameterized attributes".
172 The most common example of this is for text - for example there may be
173 a function to retrieve the bounding box for a range of text, or a
174 function to retrieve the text properties (font family, font size,
175 weight, etc.) at a specific character position.
176
177 Parameterized attributes are particularly tricky to implement because
178 of Chromium's multi-process architecture. More on this in the next section.
179
180 ## Chromium's multi-process architecture
aboxhall 2016/11/04 17:02:07 A diagram somewhere in here would be very helpful,
dmazzoni 2016/11/04 22:09:05 Sounds great, but I'll save that for a future revi
181
182 Native accessibility APIs tend to have a *functional* interface, where
183 Chromium implements an interface for a canonical accessible object that
184 includes methods to return various attributes, walk the tree, or perform
185 an action like click(), focus(), or setValue(...).
186
187 In contrast, the web has a largely *declarative* interface. The shape
188 of the accessibility tree is determined by the DOM tree (occasionally
189 influenced by CSS), and the accessible semantics of a DOM element can
190 be modified by adding ARIA attributes.
191
192 One important complication is that all of these native accessibility APIs
193 are *synchronous*, while Chromium is multi-process, with the contents of
194 each web page living in a different process than the process that
195 implements Chromium's UI and the native accessibility APIs.
196
197 Chromium's multi-process architecture means that we can't implement
198 accessibility APIs the same way that a single-process browser can -
199 namely, by calling directly into the DOM to compute the result of each
200 API call. For example, on some operating systems there might be an API
201 to get the bounding box for a particular range of characters on the
202 page. In other browsers, this might be implemented by creating a DOM
203 selection object and asking for its bounding box.
204
205 That implementation would be impossible in Chromium because it'd require
206 blocking the main thread while waiting for a response from the renderer
207 process that implements that web page's DOM. (Not only is blocking the
208 main thread strictly disallowed, but the latency of doing this for every
209 API call makes it prohibitively slow anyway.) Instead, Chromium takes an
210 approach where a representation of the entire accessibility tree is
211 cached in the main process. Great care needs to be taken to ensure that
212 this representation is as concise as possible.
213
214 In Chromium, we build a data structure representing all of the
215 information for a web page's accessibility tree, send the data
216 structure from the renderer process to the main browser process, cache
217 it in the main browser process, and implement native accessibility
218 APIs using solely the information in that cache.
219
220 As the accessibility tree changes, tree updates and accessibility events
221 get sent from the renderer process to the browser process. The browser
222 cache is updated atomically in the main thread, so whenever an external
223 client (like assistive technology) calls an accessibility API function,
224 we're always returning something from a complete and consistent snapshot
225 of the accessibility tree. From time to time, the cache may lag what's
226 in the renderer process by a fraction of a second.
227
228 Here are some of the specific challenges faced by this approach and
229 how we've addressed them.
230
231 ### Sparse data
232
233 There are a *lot* of possible accessibility attributes for any given
234 node in an accessibility tree. For example, there are more than 150
235 unique accessibility API methods that Chrome implements on the Windows
236 platform alone. We need to implement all of those APIs, many of which
237 request rather rare or obscure attributes, but storing all possible
238 attribute values in a single struct would be quite wasteful.
239
240 To avoid each accessible node object containing hundreds of fields, the
241 data for each accessibility node is stored in a relatively compact
242 data structure, ui::AXNodeData. Every AXNodeData has an integer ID, a
243 role enum, and a couple of other mandatory fields, but everything else
244 is stored in attribute arrays, one for each major data type.
245
246 ```
247 struct AXNodeData {
248 int32_t id;
249 AXRole role;
250 ...
251 std::vector<std::pair<AXStringAttribute, std::string>> string_attributes;
252 std::vector<std::pair<AXIntAttribute, int32_t>> int_attributes;
253 ...
254 };
255 ```
256
257 So if a text field has a placeholder attribute, we can store
258 that by adding an entry to `string_attributes` with an attribute
259 of ui::AX_ATTR_PLACEHOLDER and the placeholder string as the value.
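As a rough illustration of the sparse storage described above, here is a minimal sketch. The names `NodeData`, `StringAttribute`, `AddStringAttribute`, and `GetStringAttribute` are hypothetical stand-ins chosen for this example; they are not taken from ui::AXNodeData and may not match its real interface.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down stand-ins for the real enum and struct.
enum class StringAttribute { kName, kValue, kPlaceholder };

struct NodeData {
  int32_t id = 0;
  std::vector<std::pair<StringAttribute, std::string>> string_attributes;

  // Appends an attribute. Only attributes that are actually set use memory.
  void AddStringAttribute(StringAttribute attr, const std::string& value) {
    string_attributes.emplace_back(attr, value);
  }

  // Linear scan, which is reasonable when each node carries only a handful
  // of entries. Returns true and fills |out| if the attribute is present.
  bool GetStringAttribute(StringAttribute attr, std::string* out) const {
    for (const auto& entry : string_attributes) {
      if (entry.first == attr) {
        *out = entry.second;
        return true;
      }
    }
    return false;
  }
};
```

A node with no placeholder simply has no entry for it, so the cost of a rarely used attribute is paid only by the nodes that set it.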
260
261 ### Incremental tree updates
262
263 Web pages change frequently. It'd be terribly inefficient to send a
264 new copy of the accessibility tree every time any part of it changes.
265 However, the accessibility tree can change shape in complicated ways -
266 for example, whole subtrees can be reparented dynamically.
267
268 Rather than writing code to deal with every possible way the
269 accessibility tree could be modified, Chromium has a general-purpose
270 tree serializer class that's designed to send small incremental
271 updates of a tree from one process to another. The tree serializer has
272 just a few requirements:
273
274 * Every node in the tree must have a unique integer ID.
275 * The tree must be acyclic.
276 * The tree serializer must be notified when a node's data changes.
277 * The tree serializer must be notified when the list of child IDs of a
278 node changes.
279
280 The tree serializer doesn't know anything about accessibility attributes.
aboxhall 2016/11/04 17:02:07 Would it be dangerous to point to the specific loc
dmazzoni 2016/11/04 22:09:05 My plan was to cover all of the concepts at the to
281 It keeps track of the previous state of the tree, and every time the tree
282 structure changes (based on notifications of a node changing or a node's
283 children changing), it walks the tree and builds up an incremental tree
284 update that serializes as few nodes as possible.
285
286 In the other process, the unserialization code applies the incremental
287 tree update atomically.
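One way to picture the receiving side is the sketch below. It is not Chromium's serializer or unserializer; `NodeRecord`, `TreeUpdate`, and `TreeCache` are hypothetical names, and copying the whole map is a simplification that a real implementation would likely avoid. The point is only that an update is applied all-or-nothing, so readers never see a half-updated tree.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified node record: an ID, an opaque payload, child IDs.
struct NodeRecord {
  int32_t id;
  std::string data;
  std::vector<int32_t> child_ids;
};

// An incremental update lists only the nodes whose data or children changed.
struct TreeUpdate {
  std::vector<NodeRecord> nodes;
};

class TreeCache {
 public:
  // Applies the update all-or-nothing: changes go into a scratch copy that
  // replaces the live tree only if every referenced child ID resolves.
  bool Apply(const TreeUpdate& update) {
    std::map<int32_t, NodeRecord> scratch = nodes_;
    for (const NodeRecord& node : update.nodes)
      scratch[node.id] = node;
    for (const auto& entry : scratch) {
      for (int32_t child_id : entry.second.child_ids) {
        if (scratch.find(child_id) == scratch.end())
          return false;  // Inconsistent update; keep the old snapshot.
      }
    }
    nodes_.swap(scratch);
    return true;
  }

  // Returns the cached record for |id|, or nullptr if it is not in the tree.
  const NodeRecord* Get(int32_t id) const {
    auto it = nodes_.find(id);
    return it == nodes_.end() ? nullptr : &it->second;
  }

 private:
  std::map<int32_t, NodeRecord> nodes_;
};
```

In this sketch, an update that mentions only node 2 leaves every other cached node untouched, and an update that refers to a missing child is rejected without modifying the cache.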
288
289 ### Text bounding boxes
290
291 One challenge faced by Chromium is that accessibility clients want to be
292 able to query the bounding box of an arbitrary range of text - not necessarily
293 just the current cursor position or selection. As discussed above, it's
294 not possible to block Chromium's main browser process while waiting for this
295 information from Blink, so instead we cache enough information to satisfy these
296 queries in the accessibility tree.
297
298 To compactly store the bounding box of every character on the page, we
299 split the text into *inline text boxes*, sometimes called *text runs*.
300 For example, in a typical paragraph, each line of text would be its own
301 inline text box. In general, an inline text box or text run contains a
302 sequence of text characters that are all oriented in the same direction,
303 in a line, with the same font, size, and style.
304
305 Each inline text box stores its own bounding box, and then the relative
306 x-coordinate of each character in its text (assuming left-to-right).
307 From that it's possible to compute the bounding box
308 of any individual character.
309
310 The inline text boxes are part of Chromium's internal accessibility tree.
311 They're used purely internally and aren't ever exposed directly via any
312 native accessibility APIs.
313
314 For example, suppose that a document contains a text field with the text
315 "Hello world", but the field is narrow, so "Hello" is on the first line and
316 "world" is on the second line. Internally Chromium's accessibility tree
317 might look like this:
318
319 ```
320 staticText location=(8, 8) size=(38, 36) name='Hello world'
321 inlineTextBox location=(0, 0) size=(36, 18) name='Hello ' characterOffsets=12,19,23,28,36
322 inlineTextBox location=(0, 18) size=(38, 18) name='world' characterOffsets=12,20,25,29,37
323 ```
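To make the arithmetic concrete, here is a minimal sketch of how the bounding box of a single character could be derived from data like the example above. It assumes that each character offset is the x-coordinate of the right edge of that character, measured from the left of its inline text box, and that text runs left to right. `Rect`, `InlineTextBox`, and `CharacterBounds` are hypothetical names for this example, not Chromium classes.

```cpp
#include <cassert>
#include <vector>

struct Rect {
  int x, y, width, height;
};

// Hypothetical simplified inline text box. |bounds| is relative to the parent
// static text node; |character_offsets| is assumed to hold the right edge of
// each character, measured from the left of the box.
struct InlineTextBox {
  Rect bounds;
  std::vector<int> character_offsets;
};

// Returns the bounding box of character |index|, given the location of the
// parent static text node. Assumes left-to-right text.
Rect CharacterBounds(const InlineTextBox& box, int parent_x, int parent_y,
                     int index) {
  assert(index >= 0 &&
         index < static_cast<int>(box.character_offsets.size()));
  // The left edge of a character is the right edge of the previous one.
  int left = index == 0 ? 0 : box.character_offsets[index - 1];
  int right = box.character_offsets[index];
  return Rect{parent_x + box.bounds.x + left, parent_y + box.bounds.y,
              right - left, box.bounds.height};
}
```

Under those assumptions, the "e" in "Hello" spans offsets 12 to 19 in the first box, so with the static text located at (8, 8) its box starts at x = 20, y = 8 and is 7 wide and 18 tall.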
324
325 ### Scrolling, transformations, and animation
326
327 Native accessibility APIs typically want the bounding box of every element in the
328 tree, either in window coordinates or global screen coordinates. If we
329 stored the global screen coordinates for every node, we'd be constantly
330 re-serializing the whole tree every time the user scrolls or drags the
331 window.
332
333 Instead, we store the bounding box of each node in the accessibility tree
334 relative to its *offset container*, which can be any ancestor. If no offset
335 container is specified, it's assumed to be the root of the tree.
336
337 In addition, any offset container can contain scroll offsets, which can be
338 used to scroll the bounding boxes of anything in that subtree.
339
340 Finally, any offset container can also include an arbitrary 4x4 transformation
341 matrix, which can be used to represent arbitrary 3-D rotations, translations, and
342 scaling, and more. The transformation matrix applies to the whole subtree.
343
344 Storing coordinates this way means that any time an object scrolls, moves, or
345 animates its position and scale, only the root of the scrolling or animation
346 needs to post updates to the accessibility tree. Everything in the subtree
347 remains valid relative to that offset container.
348
349 Computing the global screen coordinates for an object in the accessibility
350 tree just means walking up its ancestor chain and applying offsets and
351 occasionally multiplying by a 4x4 matrix.
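The walk up the ancestor chain can be sketched as follows. This is a simplified illustration rather than Chromium's code: `Node`, `Point`, and `ToRootCoordinates` are hypothetical names, the 4x4 transformation matrix is left out entirely, and a container ID of -1 is used here to mark the root.

```cpp
#include <map>

struct Point {
  int x, y;
};

// Hypothetical simplified node: a location relative to its offset container
// plus, for containers, a scroll offset applied to everything inside them.
struct Node {
  Point location;           // Relative to the offset container.
  int offset_container_id;  // -1 marks the root, which has no container.
  Point scroll_offset;      // Non-zero only for scrolled containers.
};

// Walks up the chain of offset containers, adding each container's location
// and subtracting its scroll offset. Transforms are omitted in this sketch.
Point ToRootCoordinates(const std::map<int, Node>& tree, int id) {
  const Node& node = tree.at(id);
  Point result = node.location;
  int container_id = node.offset_container_id;
  while (container_id != -1) {
    const Node& container = tree.at(container_id);
    result.x += container.location.x - container.scroll_offset.x;
    result.y += container.location.y - container.scroll_offset.y;
    container_id = container.offset_container_id;
  }
  return result;
}
```

When a container scrolls, only its own `scroll_offset` changes; the stored locations of its descendants stay the same, and their computed positions shift the next time they are queried.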
352
353 ### Site isolation / out-of-process iframes
354
355 At one point in time, all of the content of a single Tab or other web view
356 was contained in the same Blink process, and it was possible to serialize
357 the accessibility tree for a whole frame tree in a single pass.
358
359 Today the situation is a bit more complicated, as Chromium supports
360 out-of-process iframes. (It also supports "browser plugins" such as
361 the `<webview>` tag in Chrome packaged apps, which embeds a whole
362 browser inside a browser, but for the purposes of accessibility this
363 is handled the same as frames.)
364
365 Rather than a mix of in-process and out-of-process frames that are handled
366 differently, Chromium builds a separate independent accessibility tree
367 for each frame. Each frame gets its own tree ID, and it keeps track of
368 the tree ID of its parent frame (if any) and any child frames.
369
370 In Chrome's main browser process, the accessibility trees for each frame
371 are cached separately, and when an accessibility client (assistive
372 technology) walks the accessibility tree, Chromium dynamically composes
373 all of the frames into a single virtual accessibility tree on the fly,
374 using those aforementioned tree IDs.
375
376 The node IDs for accessibility trees only need to be unique within a
377 single frame. Where necessary, separate unique IDs are used within
378 Chrome's main browser process. In Chromium accessibility, a "node ID"
379 always means an ID that's only unique within a frame, and a "unique ID"
380 means an ID that's globally unique.
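The distinction between the two kinds of ID can be illustrated with a small sketch. It assumes a registry keyed by (tree ID, node ID) pairs; `UniqueIdRegistry` and `GetUniqueId` are hypothetical names for this example, and Chromium's actual bookkeeping may be organized differently.

```cpp
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical registry that hands out globally unique IDs for
// (tree ID, node ID) pairs. Node IDs may repeat across frames.
class UniqueIdRegistry {
 public:
  // Returns the existing unique ID for the pair, or allocates a new one.
  int32_t GetUniqueId(int32_t tree_id, int32_t node_id) {
    auto key = std::make_pair(tree_id, node_id);
    auto it = ids_.find(key);
    if (it != ids_.end())
      return it->second;
    int32_t unique_id = next_id_++;
    ids_[key] = unique_id;
    return unique_id;
  }

 private:
  std::map<std::pair<int32_t, int32_t>, int32_t> ids_;
  int32_t next_id_ = 1;
};
```

Two frames can each contain a node with node ID 5; in this sketch they receive different unique IDs because their tree IDs differ.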
53 381
54 ## Blink 382 ## Blink
55 383
56 Blink constructs an accessibility tree (a hierarchy of [WebAXObject]s) from the 384 Blink constructs an accessibility tree (a hierarchy of [WebAXObject]s) from the
57 page it is rendering. WebAXObject is the public API wrapper around [AXObject], 385 page it is rendering. WebAXObject is the public API wrapper around [AXObject],
58 which is the core class of Blink's accessibility tree. AXObject is an abstract 386 which is the core class of Blink's accessibility tree. AXObject is an abstract
59 class; the most commonly used concrete subclass of it is [AXNodeObject], which 387 class; the most commonly used concrete subclass of it is [AXNodeObject], which
60 wraps a [Node]. In turn, most AXNodeObjects are actually [AXLayoutObject]s, 388 wraps a [Node]. In turn, most AXNodeObjects are actually [AXLayoutObject]s,
61 which wrap both a [Node] and a [LayoutObject]. Access to the LayoutObject is 389 which wrap both a [Node] and a [LayoutObject]. Access to the LayoutObject is
62 important because some elements are only in the AXObject tree depending on their 390 important because some elements are only in the AXObject tree depending on their
(...skipping 36 matching lines...)
99 APIs. This is done in the platform-specific subclasses of 427 APIs. This is done in the platform-specific subclasses of
100 BrowserAccessibilityManager, in a method named `NotifyAccessibilityEvent`. 428 BrowserAccessibilityManager, in a method named `NotifyAccessibilityEvent`.
101 3. Dispatching incoming accessibility actions to the appropriate recipient, via 429 3. Dispatching incoming accessibility actions to the appropriate recipient, via
102 [BrowserAccessibilityDelegate]. For messages destined for a renderer, 430 [BrowserAccessibilityDelegate]. For messages destined for a renderer,
103 [RenderFrameHostImpl], which is a BrowserAccessibilityDelegate, is 431 [RenderFrameHostImpl], which is a BrowserAccessibilityDelegate, is
104 responsible for sending appropriate `AccessibilityMsg_Foo` IPCs to the 432 responsible for sending appropriate `AccessibilityMsg_Foo` IPCs to the
105 renderer, where they will be received by [RenderAccessibilityImpl]. 433 renderer, where they will be received by [RenderAccessibilityImpl].
106 434
107 On Chrome OS, RenderFrameHostImpl does not route events to 435 On Chrome OS, RenderFrameHostImpl does not route events to
108 BrowserAccessibilityManager at all, since there is no platform screenreader 436 BrowserAccessibilityManager at all, since there is no platform screenreader
109 outside Chrome to integrate with. 437 outside Chromium to integrate with.
110 438
111 ## Views 439 ## Views
112 440
113 Views generates a [NativeViewAccessibility] for each View, which is used as the 441 Views generates a [NativeViewAccessibility] for each View, which is used as the
114 delegate for an [AXPlatformNode] representing that View. This part is relatively 442 delegate for an [AXPlatformNode] representing that View. This part is relatively
115 straightforward, but then the generated tree must be combined with the web 443 straightforward, but then the generated tree must be combined with the web
116 accessibility tree, which is handled by BrowserAccessibilityManager. 444 accessibility tree, which is handled by BrowserAccessibilityManager.
117 445
118 ## WebUI 446 ## WebUI
119 447
(...skipping 29 matching lines...)
149 [Node]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/dom/Node.h 477 [Node]: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/core/dom/Node.h
150 [RenderAccessibilityImpl]: https://cs.chromium.org/chromium/src/content/renderer/accessibility/render_accessibility_impl.h 478 [RenderAccessibilityImpl]: https://cs.chromium.org/chromium/src/content/renderer/accessibility/render_accessibility_impl.h
151 [RenderFrameHostImpl]: https://cs.chromium.org/chromium/src/content/browser/frame_host/render_frame_host_impl.h 479 [RenderFrameHostImpl]: https://cs.chromium.org/chromium/src/content/browser/frame_host/render_frame_host_impl.h
152 [ui::AXNodeData]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_node_data.h 480 [ui::AXNodeData]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_node_data.h
153 [WebAXObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/public/web/WebAXObject.h 481 [WebAXObject]: https://cs.chromium.org/chromium/src/third_party/WebKit/public/web/WebAXObject.h
154 [automation API]: https://cs.chromium.org/chromium/src/chrome/renderer/resources/extensions/automation 482 [automation API]: https://cs.chromium.org/chromium/src/chrome/renderer/resources/extensions/automation
155 [automation.idl]: https://cs.chromium.org/chromium/src/chrome/common/extensions/api/automation.idl 483 [automation.idl]: https://cs.chromium.org/chromium/src/chrome/common/extensions/api/automation.idl
156 [ax_enums.idl]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_enums.idl 484 [ax_enums.idl]: https://cs.chromium.org/chromium/src/ui/accessibility/ax_enums.idl
157 [chrome.automation API]: https://developer.chrome.com/extensions/automation 485 [chrome.automation API]: https://developer.chrome.com/extensions/automation
158 [webui-js]: https://cs.chromium.org/chromium/src/ui/webui/resources/js/cr/ui/ 486 [webui-js]: https://cs.chromium.org/chromium/src/ui/webui/resources/js/cr/ui/