RE: Questions for APA based on review of the XR Device API

Thank you, Gottfried, for your well-informed and thoughtful review. I agree with all of your observations.

From: Gottfried Zimmermann (Lists) <zimmermann@accesstechnologiesgroup.com>
Sent: Thursday, July 25, 2019 2:34 PM
To: public-apa@w3.org
Cc: White, Jason J <jjwhite@ets.org>
Subject: RE: Questions for APA based on review of the XR Device API

This is quite a complex spec, and I do not claim to have understood all the details.  So, while I concur with Jason's points, I would like to add a few additional items for discussion:


  1.  Captions in XR: Should there be a pre-defined "channel" for the display of captions describing spoken words and sounds?  Might there be XRDevices that natively support the display of captions, either through specific hardware (additions) or through specific rendering techniques?  If so, the XRDevice should indicate such a feature, and there should be ways to display captions on this "channel".  And the user should probably be able to configure how captions are displayed, e.g. distance, font size, text color and background color.
  2.  Monoscopic view: Some users may want to see XR content in a monoscopic manner only (i.e. with one eye).  It seems that section 7.1 already addresses this requirement.
  3.  Alternative ray modes: Section 10.1 allows for the following ray modes: gaze, tracked-pointer and screen.  It seems possible that users with motor impairments would like to use some alternative form of pointing, e.g. by selecting pre-defined regions by numbers on a keypad.  It needs to be investigated whether the spec supports this (see the sketch following this list).
  4.  Integration of semantic information in the rendering layer vs. separate offscreen model: As Jason points out, from an accessibility perspective it would be important to be able to annotate the rendered objects in the rendering layer, that is, the XRWebGLLayer.  Can we somehow store semantic information in this layer rather than create an "offscreen model" as a separate hierarchy of semantic objects?  Having an "offscreen model" seems more complex for assistive technologies, and runs the risk of getting out of sync with the information in the rendering layer.
  5.  Dependency on Gamepad API: It needs to be checked whether the spec indeed creates a dependency on a gamepad and its API.  I agree with Jason that this should not be the case.  In any case, alternative gamepads (such as the Xbox Adaptive Controller<https://news.xbox.com/en-us/2018/05/16/xbox-adaptive-controller/>) need to be accommodated, either by fitting into the role of a gamepad or by allowing for alternative input devices.  Hopefully the developers of WebXR have this in mind.
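
Regarding item 3 above, a minimal sketch (JavaScript; the function name is illustrative) of how an application can currently branch on the ray mode of each input source.  A keypad- or switch-driven pointing method has no mode of its own and would have to be surfaced through one of the three defined values:

    // Sketch only: enumerate input sources and branch on targetRayMode.
    // "gaze", "tracked-pointer" and "screen" are the only values the
    // current draft defines; an alternative pointing method (e.g. a
    // keypad that jumps between pre-defined regions) would have to be
    // mapped onto one of these.
    function describeInputSources(session) {
      for (const source of session.inputSources) {
        switch (source.targetRayMode) {
          case "gaze":
            // Ray follows the viewer's head pose.
            break;
          case "tracked-pointer":
            // Ray originates from a tracked controller or hand.
            break;
          case "screen":
            // Ray is derived from a 2D interaction with the output canvas.
            break;
        }
      }
    }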

Best regards,
Gottfried

From: White, Jason J [mailto:jjwhite@ets.org]
Sent: Monday, 22 July 2019 22:15
To: public-apa@w3.org
Subject: Questions for APA based on review of the XR Device API

Having read substantial portions of the document, concentrating on overall capabilities rather than details of the API, I would like to raise a series of questions.

The specification - WebXR Device API: https://www.w3.org/TR/webxr/

WebXR Device API Explained: https://github.com/immersive-web/webxr/blob/master/explainer.md

Section 10.2 creates a dependency on the Gamepad API. This introduces the question of whether the alternative input devices used by people with disabilities (including those available for game applications) are adequately supported. I suspect this will depend, at least in part, on considerations drawn from APA's review of the Gamepad API itself.
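
To make the dependency concrete, here is a hedged sketch of per-frame polling of the Gamepad record that an XR input source may expose; the button and axis indices shown are assumptions, since mappings differ between devices:

    // Sketch: an XRInputSource may carry a gamepad attribute (a Gamepad
    // object as defined by the Gamepad API); it is null for sources with
    // no associated controller.  Button/axis indices are illustrative,
    // not guaranteed by the spec.
    function pollControllers(session) {
      for (const source of session.inputSources) {
        const gamepad = source.gamepad;
        if (!gamepad) continue;                    // e.g. bare gaze input
        const primaryPressed =
          gamepad.buttons.length > 0 && gamepad.buttons[0].pressed;
        const axes = gamepad.axes;                 // thumbstick/touchpad values
        // ...feed primaryPressed / axes into the application's input logic...
      }
    }

An alternative controller such as the Xbox Adaptive Controller would presumably surface through this same Gamepad interface, which is why APA's review of the Gamepad API matters here.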

Section 11 ("Layers") introduces the mechanisms used for visual rendering of the 3-dimensional XR content. Only one type of layer is presently defined, but it is made clear that additional layer types may be defined in future revisions of the specification. The presently defined layer type relies on WebGL for the rendering of the XR content, and, it appears (section 12), on an HTML canvas to host it.
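
For reference, a minimal sketch of the setup that sections 11-12 describe (the session mode and variable names are illustrative):

    // Sketch: hosting WebXR rendering in a WebGL context.  XRWebGLLayer
    // is currently the only defined layer type.
    async function startXR() {
      const canvas = document.createElement("canvas");
      const gl = canvas.getContext("webgl", { xrCompatible: true });

      const session = await navigator.xr.requestSession("immersive-vr");
      session.updateRenderState({
        baseLayer: new XRWebGLLayer(session, gl)
      });
      // From here on, scene content is drawn with raw WebGL calls into
      // the layer's framebuffer; nothing on this path carries semantics
      // that an accessibility API could pick up.
      return session;
    }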

My present understanding is that only Canvas 2D supports integration with ARIA (hit regions, etc.), and hence with assistive technologies that depend on accessibility APIs. WebGL does not offer such support. It therefore does not appear possible to associate accessibility API objects directly with components of the 3D scene as rendered within the canvas. However, there are alternative options (some mentioned at the recently held XR Access Symposium):
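
As a small illustration of what is and is not possible today (the label text is purely illustrative): ARIA can describe the hosting canvas as a whole, but not the individual objects that WebGL paints into it.

    // Sketch: whole-canvas ARIA annotation is available; per-object
    // annotation of the WebGL-rendered scene is not.
    const xrCanvas = document.querySelector("canvas");
    xrCanvas.setAttribute("role", "img");
    xrCanvas.setAttribute("aria-label", "Immersive 3D view of the scene");
    // Fallback DOM content placed inside the <canvas> element is also
    // exposed to assistive technologies, but again only describes the
    // canvas as a single object.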


  1.  The Accessibility Object Model, as I understand it, is ultimately meant to enable an arbitrarily complex accessibility subtree to be associated with any DOM node. This might enable an application to maintain a hierarchy representing the UI of the environment - at least in principle, and at least for virtual objects. Integration with the real scene in immersive augmented reality applications raises interesting challenges, of course. So this is just a potential path forward rather than a solution. There are clearly important questions regarding the capabilities this would demand of the Accessibility Object Model, and whether these can be met. Problems of object selection are a good example of the kind of challenge that occurs in the XR environment - how to create a nonvisual interface that provides salient information and interaction opportunities to the user, without being overwhelming.
  2.  The functions typically associated with assistive technologies could be implemented directly in the XR application, without relying on external assistive software or associated accessibility APIs (a rough sketch follows below). In this case, application development environments and components used in building applications would need to provide appropriate support, ensuring that each application author does not need to reimplement large parts of the accessibility-related functionality.
There may be other possibilities that I'm overlooking, including combinations of the above.
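
As a rough illustration of option 2, an application could "self-voice" with the Web Speech API whenever its own focus or selection logic lands on an object; the sceneObject structure here is a made-up placeholder for whatever semantic annotations the application maintains:

    // Sketch: announcing the currently focused scene object without an
    // external screen reader.  speechSynthesis is part of the Web
    // Speech API; sceneObject is hypothetical application state.
    function announce(sceneObject) {
      const utterance = new SpeechSynthesisUtterance(
        sceneObject.name + ". " + sceneObject.description
      );
      window.speechSynthesis.cancel();   // interrupt any stale speech
      window.speechSynthesis.speak(utterance);
    }

    // Example call from the application's focus-handling code:
    announce({ name: "Exit door", description: "Two meters ahead, on the left." });

This is exactly the kind of functionality that toolkits and frameworks, rather than each individual application author, would ideally provide.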

A further question is how best to support transformations of the visual scene that may be needed for purposes of accessibility, for example by people with low vision or those with colour blindness. For example, do enlargement or other operations make sense as accessibility options in the XR context? If so, is there a role for an assistive technology in modifying the visual scene, and perhaps also in modifying the effects of user input, for example by adjusting the coordinates sent to the XR applications to take account of changes in the size or position of objects?

The specification can at least be read as implying that the XR content should be rendered as given by the application, and that the inputs should be conveyed from the hardware to the application, without any intermediary such as an AT. On the other hand, I didn't find anything that obviously prevents such use cases from being supported. Again, it's an architectural question whether an AT is desirable here, which may of course have different answers according to circumstances.

The WebXR Device API only addresses the graphical presentation and the input mechanisms. Audio is presumably to be created using the Web Audio API. Are we confident that the integration of audio - including spatial audio - into XR applications can be well supported by the existing specs? How would haptic output be supported? These are obviously valuable tools for designers of accessible interfaces, and we need to be sure that significant gaps do not exist in the capabilities that the various available APIs provide when used in combination.
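
On the audio side, the Web Audio API does already provide spatialization primitives; a minimal sketch (the URL and coordinates are placeholders), with a note on haptics:

    // Sketch: positioning a sound in 3D with a PannerNode, which an XR
    // application would keep in sync with object and listener poses
    // obtained from WebXR each frame.
    const audioCtx = new AudioContext();
    const panner = new PannerNode(audioCtx, {
      panningModel: "HRTF",
      distanceModel: "inverse",
      positionX: 1.0, positionY: 0.0, positionZ: -2.0   // placeholder object pose
    });

    async function playSpatialCue(url) {
      const response = await fetch(url);
      const buffer = await audioCtx.decodeAudioData(await response.arrayBuffer());
      const source = new AudioBufferSourceNode(audioCtx, { buffer });
      source.connect(panner).connect(audioCtx.destination);
      source.start();
    }
    // Haptics are less settled: an input source's gamepad object may
    // expose hapticActuators (Gamepad Extensions), but support varies
    // across browsers and devices.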

It is also clear from recent conversations in which I've engaged that multi-party interactions could occur in an XR application, raising the possibility of using WebRTC to convey audio, video, or text introduced into the XR environment. Again, with all of the current and proposed specifications, do we have what we need from an accessibility standpoint? If not, what features are missing?

These questions obviously extend beyond the WebXR Device API to the total set of features available to applications, assuming that various specs are implemented.

As always, the Research Questions Task Force is available as a forum for some of the more detailed research and discussions that may be necessary as this work develops. XR accessibility is currently a topic of RQTF meetings.



Received on Thursday, 25 July 2019 22:04:23 UTC