The Voice Interaction Service API provides an abstraction over different potential voice control applications. Implementations can be developed following the guidelines described in Application Development. The content in this integration guide describes how to integrate these applications into a specific Android Automotive OS (AAOS) system image.
These terms are used throughout this guide:
- Assist data. When a voice interaction session is started,
the system is able to capture views and screenshots, and pass this information
to the session. Applications can expose additional information by implementing
- Push-to-talk (PTT). Physical voice control button, usually located on the steering wheel.
- RecognitionService (RS). Voice recognition service used by
apps through the
SpeechRecognizer API. VIAs must include both the
- Tap-to-talk (TTT). Software voice control button, usually included as part of the system UI. In Android this is also referred to as the Assist Gesture.
- VoiceInteractionService. Lightweight system service implemented by the VIA developer. The selected service is bound by a system service at boot and is always running.
- VoiceInteractionSession (VIS). This class encapsulates the user interaction business logic. It is responsible for presenting the status of the voice interaction to the user, handling VoiceInteractor requests, and receiving assist and screenshot data.
- VoiceInteractionSessionService (VSS). A service, part of a
VIA, responsible for handling a voice interaction session. This service is bound
from Android's system service during a voice interaction with a user. All
business logic of this session is implemented in the
VoiceInteractionSession class. This service is only guaranteed to stay alive during a single user voice session.
- Voice Interaction Application (VIA). Android application designed
to serve as a voice control application (referred to as an assistant). These applications can
be identified by the
VoiceInteractionService declared in their manifest. Only one of these applications can be selected as the default at a time. Only the default application is kept alive (bound from a system service) and receives Push-to-talk (PTT) and Tap-to-talk (TTT) events.
This table describes the responsibilities of each party.
| Car Manufacturers (OEMs) | AOSP | App Developers |
OEMs have the ultimate responsibility of providing a good user experience to customers. OEMs must ensure that all preinstalled voice interaction services fulfill the requirements described in Preloaded Assistants: UX Guidance.
Core Assistant Experience
An automotive Voice Interaction Application (VIA) performs the following actions:
- [MUST] Respond to system-handled voice interaction triggers (PTT, TTT).
- [MUST] Display a visual representation of its progress (for example, listening, processing, and fulfilling).
- [MUST] Use voice or sounds to indicate understanding and completion of user requests.
- [MUST] Serve as a speech recognizer for other apps (see the SpeechRecognizer API).
- [SHOULD] Respond to a hotword trigger.
- [MAY] Display a settings activity where users can configure this VIA (for example, permissions, hotword configuration, and sign-in).
- [MAY] Handle assist data (
- [MAY] Support voice interaction from Keyguard (lock screen).
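The requirement levels above can be turned into a simple pre-integration checklist. The following is a minimal sketch in plain Java, assuming a hypothetical helper class named ViaRequirementCheck; the class, the enum constants, and the method names are illustrative and are not part of any Android or AAOS API. It only encodes the list above and reports which items a candidate VIA does not cover.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical checklist helper; not part of any Android API. */
final class ViaRequirementCheck {

    /** Requirement levels used in the list above. */
    enum Level { MUST, SHOULD, MAY }

    /** Capabilities taken from the list above; the constant names are illustrative. */
    enum Capability {
        SYSTEM_TRIGGERS(Level.MUST),    // respond to PTT and TTT
        VISUAL_PROGRESS(Level.MUST),    // listening, processing, fulfilling
        AUDIO_FEEDBACK(Level.MUST),     // voice or sounds for understanding and completion
        SPEECH_RECOGNIZER(Level.MUST),  // serve as a speech recognizer for other apps
        HOTWORD(Level.SHOULD),          // respond to a hotword trigger
        SETTINGS_ACTIVITY(Level.MAY),   // settings activity for the VIA
        ASSIST_DATA(Level.MAY),         // handle assist data
        KEYGUARD(Level.MAY);            // voice interaction from the lock screen

        final Level level;

        Capability(Level level) {
            this.level = level;
        }
    }

    /** Returns the capabilities at the given level that the VIA does not support. */
    static List<Capability> missing(Set<Capability> supported, Level level) {
        List<Capability> result = new ArrayList<>();
        for (Capability c : Capability.values()) {
            if (c.level == level && !supported.contains(c)) {
                result.add(c);
            }
        }
        return result;
    }

    /** A VIA passes this checklist when no MUST capability is missing. */
    static boolean meetsMustRequirements(Set<Capability> supported) {
        return missing(supported, Level.MUST).isEmpty();
    }

    public static void main(String[] args) {
        // Example: a candidate VIA that does not yet act as a speech recognizer.
        Set<Capability> supported = EnumSet.of(
                Capability.SYSTEM_TRIGGERS,
                Capability.VISUAL_PROGRESS,
                Capability.AUDIO_FEEDBACK,
                Capability.HOTWORD);
        System.out.println("Meets MUST items: " + meetsMustRequirements(supported));
        System.out.println("Missing MUST items: " + missing(supported, Level.MUST));
    }
}
```

A check like this could run as part of an OEM's own review of preloaded assistants; how each capability is actually verified on a device is outside the scope of the sketch.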
At a high level, a voice interaction application interacts with these actors:
Figure 1. Voice interaction actors
- VoiceInteractionManagerService. This system service is responsible for managing the default VIA and exposing its functionality to the rest of the system.
- RecognitionService. This service exposes speech recognition capabilities to other applications in the system.
- SoundTrigger. Implements hotword management and is available to VIAs through the AlwaysOnHotwordDetector.
- MediaRecorder. Provides access to audio input for both hotword detection (when using CPU) and speech recognition.
- CarInputService. These services are responsible (among other things) for handling key events and routing PTT to the VIA, by means of the
- User. The user interacts with a VIA by means of triggers (PTT, TTT, hotword) or the Voice Plate UI.
- CarService, Notifications, Media, Telephony, ContactsProvider, and so on. Services and applications used by the VoiceInteractionSession to fulfill the user's commands.
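The relationship between the default VIA and system-handled triggers can be illustrated with a toy model. This is a sketch in plain Java, not Android framework code: TriggerRoutingSketch, Via, and Manager are hypothetical names, and Manager only stands in for the role this guide assigns to VoiceInteractionManagerService. It assumes nothing about how the real services are implemented beyond two points stated in this guide: at most one VIA is the default at a time, and only the default VIA receives PTT and TTT events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy model of default-VIA selection and trigger routing; not framework code. */
final class TriggerRoutingSketch {

    /** System-handled triggers named in this guide. */
    enum Trigger { PUSH_TO_TALK, TAP_TO_TALK }

    /** Hypothetical stand-in for an installed VIA. */
    static final class Via {
        final String name;
        final List<Trigger> received = new ArrayList<>();

        Via(String name) {
            this.name = name;
        }

        void onTrigger(Trigger trigger) {
            received.add(trigger);
        }
    }

    /** Stand-in for the service that manages the default VIA. */
    static final class Manager {
        private Via defaultVia; // at most one default at a time

        void setDefault(Via via) {
            defaultVia = via;
        }

        /** Routes a trigger to the default VIA, or to nobody if none is selected. */
        Optional<Via> dispatch(Trigger trigger) {
            if (defaultVia == null) {
                return Optional.empty();
            }
            defaultVia.onTrigger(trigger);
            return Optional.of(defaultVia);
        }
    }

    public static void main(String[] args) {
        Manager manager = new Manager();
        Via first = new Via("first-assistant");
        Via second = new Via("second-assistant");

        manager.setDefault(first);
        manager.dispatch(Trigger.PUSH_TO_TALK); // delivered to first

        manager.setDefault(second);             // first is no longer the default
        manager.dispatch(Trigger.TAP_TO_TALK);  // delivered to second only

        System.out.println(first.name + " received " + first.received);
        System.out.println(second.name + " received " + second.received);
    }
}
```

Hotword is deliberately left out of the Trigger enum, because the guide describes it as something the VIA detects through AlwaysOnHotwordDetector rather than an event routed to it by the system.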
AAOS diverges from Android in the following aspects:
- Besides normal Assistant functionalities, AAOS VIAs can control vehicle functions (for example, HVAC, seats, and interior lights). These functionalities can be integrated using the CarPropertyManager API (see more at Reading a Vehicle Property) provided OEMs configure access correctly as described in Privileged Permission Allowlisting.
- Customization and consistency are more relevant in Automotive than in any other form factor. See Customization to read more about implementing these guidelines.