The Gemini Live API provides a comprehensive solution for implementing conversational interfaces with your users. When building for Android XR, you can integrate with the Gemini Live API through Firebase AI Logic. Unlike separate Text to Speech (TTS) and Automatic Speech Recognition (ASR) pipelines, the Gemini Live API handles both audio input and output seamlessly. However, the Gemini Live API requires a persistent internet connection, incurs costs, and supports a limited number of concurrent connections per project, and it might not be ideal for handling error conditions or other critical user communication, especially on AI glasses with no display.
In addition to supporting audio interfaces, the Gemini Live API also lets you build agentic experiences.
To get started with the Gemini Live API, follow the steps outlined in the
Gemini Live API guide. It walks you through instantiating and configuring a
LiveGenerativeModel, establishing a LiveSession, and creating custom
FunctionDeclaration instances that allow your app to process requests from
Gemini.
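
To illustrate how those pieces fit together, the following Kotlin sketch sets up a voice session with function calling through Firebase AI Logic. The model name, the `setLampColor` function, and the `handleFunctionCall` handler are illustrative assumptions, and the Live API surface is in public preview, so exact names and signatures may differ in your SDK version; treat this as a sketch rather than a definitive implementation.

```kotlin
import com.google.firebase.Firebase
import com.google.firebase.ai.ai
import com.google.firebase.ai.type.FunctionCallPart
import com.google.firebase.ai.type.FunctionDeclaration
import com.google.firebase.ai.type.FunctionResponsePart
import com.google.firebase.ai.type.GenerativeBackend
import com.google.firebase.ai.type.PublicPreviewAPI
import com.google.firebase.ai.type.ResponseModality
import com.google.firebase.ai.type.Schema
import com.google.firebase.ai.type.Tool
import com.google.firebase.ai.type.liveGenerationConfig
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonPrimitive

// Hypothetical function the model may call; replace with your app's own actions.
val setLampColor = FunctionDeclaration(
    name = "setLampColor",
    description = "Sets the color of the user's smart lamp.",
    parameters = mapOf(
        "color" to Schema.string(description = "Color name, for example \"red\".")
    )
)

@OptIn(PublicPreviewAPI::class)
suspend fun startVoiceAgent() {
    // Instantiate and configure a LiveGenerativeModel. The model name is an
    // assumption; use a Live-capable model listed in the Firebase documentation.
    val liveModel = Firebase.ai(backend = GenerativeBackend.googleAI()).liveModel(
        modelName = "gemini-live-2.5-flash-preview",
        generationConfig = liveGenerationConfig {
            responseModality = ResponseModality.AUDIO
        },
        tools = listOf(Tool.functionDeclarations(listOf(setLampColor)))
    )

    // Establish a LiveSession over a persistent connection.
    val session = liveModel.connect()

    // Start a bidirectional audio conversation (requires the RECORD_AUDIO
    // permission). The handler runs whenever Gemini requests a declared function.
    session.startAudioConversation(::handleFunctionCall)
}

fun handleFunctionCall(call: FunctionCallPart): FunctionResponsePart =
    when (call.name) {
        "setLampColor" -> {
            // ...apply the color in your app, then report the result to Gemini.
            FunctionResponsePart(
                name = call.name,
                response = JsonObject(mapOf("success" to JsonPrimitive(true)))
            )
        }
        else -> FunctionResponsePart(
            name = call.name,
            response = JsonObject(mapOf("error" to JsonPrimitive("unknown function")))
        )
    }
```

Because startAudioConversation manages microphone capture and audio playback for you, the app doesn't need its own TTS or ASR pipeline; your code only supplies the function declarations and the handler that carries out the model's requests.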