A developer has demonstrated Gemini Live API running on Ray-Ban Meta glasses, using LiveKit's open-source framework as a secure WebRTC proxy between the wearables and Google's AI.
The architecture routes video and audio from the glasses through a phone (Android or iOS) to LiveKit Cloud, which proxies requests to Gemini's WebSocket API. The setup uses Meta's Wearables SDK for hardware access and LiveKit's agents framework for the backend. Google's gemini-2.5-flash-native-audio-preview model handles both speech recognition and synthesis natively, eliminating separate TTS/ASR components.
Notably, the implementation enables video processing at 1 FPS from the glasses' camera. In the demo, a user can ask the AI to identify plants or objects in their field of view and receive spoken responses. LiveKit's framework includes Silero VAD (voice activity detection) for turn-taking and supports proactive dialog.
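To make the 1 FPS figure concrete, here is a minimal sketch of how a phone-side relay might thin a camera feed before forwarding it. This is an illustration only, not the demo's code: the `sample_frames` helper, the millisecond timestamps, and the 30 FPS source rate are all assumptions introduced for the example.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical frame shape: (capture time in milliseconds, encoded image bytes).
Frame = Tuple[int, bytes]


def sample_frames(frames: Iterable[Frame], fps: float = 1.0) -> Iterator[Frame]:
    """Yield at most `fps` frames per second and drop everything in between."""
    min_gap_ms = int(1000 / fps)
    last_sent_ms = None
    for ts_ms, data in frames:
        # Forward the first frame, then only frames at least min_gap_ms later.
        if last_sent_ms is None or ts_ms - last_sent_ms >= min_gap_ms:
            last_sent_ms = ts_ms
            yield ts_ms, data


# Three seconds of an assumed 30 FPS camera feed (90 frames).
camera = [(i * 1000 // 30, b"jpeg-bytes") for i in range(90)]
kept = list(sample_frames(camera, fps=1.0))
print(len(kept), [ts for ts, _ in kept])  # 3 [0, 1000, 2000]
```

Dropping frames this way keeps upstream bandwidth and per-session model input low, at the cost of missing anything that happens between samples.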
The real question is whether this proves practical beyond demos. Google's documentation flags several constraints: the setup requires persistent internet connectivity, incurs API costs per session, limits concurrent connections per project, and isn't recommended for error-sensitive or mission-critical applications. Battery drain from continuous streaming and latency under varying network conditions remain untested at scale.
For enterprise leaders evaluating similar XR integrations, this illustrates the current state of wearable AI: it is technically feasible for controlled use cases, but infrastructure costs and reliability constraints mean teams should pilot carefully before committing. The combination of Meta's newly opened developer access and Google's Firebase AI Logic for Android XR suggests the vendor ecosystem is moving toward standardized tooling, which could accelerate production readiness.
Developers can install the dependencies for LiveKit's vision agent examples via pip (livekit-agents[silero,google,images]) and generate access tokens through the LiveKit Cloud CLI. The trade-offs between real-time responsiveness and operational overhead will determine whether this moves beyond proof-of-concept.