Robot Ross demonstrator · Agentegra ATF · Built with Flotilla

Voice Control

Compiled RobotRoss knowledge page generated from RobotRoss source code, architecture notes, and operational documentation.

1. Overview

Voice control is a top-level component of the ATF architecture. It gives operators spoken access to the same local evidence base used by the text query path: a prompt is transcribed to text, answered locally, and read back as speech.

2. Speech-to-Text

  • Whisper is the intended speech-to-text engine for converting operator prompts into local text queries.
  • Once transcribed, spoken prompts should be interpreted against the compiled wiki and the operational ledger, the same sources used for typed queries.
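
The speech-to-text step above can be sketched as a thin wrapper that turns a recording into a text query. This is a minimal sketch, not RobotRoss source code: `VoiceQuery`, `transcribe_prompt`, and the injected `transcriber` callable are hypothetical names chosen for illustration. The engine is passed in as a plain function so the sketch runs without any model installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class VoiceQuery:
    """A spoken prompt after transcription (hypothetical type)."""
    text: str    # normalized query text handed to the local query path
    source: str  # path of the audio recording the text came from


def transcribe_prompt(audio_path: str, transcriber: Callable[[str], str]) -> VoiceQuery:
    """Convert one recorded operator prompt into a text query.

    `transcriber` is any function that maps an audio file path to raw text.
    """
    raw = transcriber(audio_path)
    # Collapse whitespace runs so the query looks like what an operator would type.
    text = " ".join(raw.split())
    if not text:
        raise ValueError(f"no speech recognized in {audio_path}")
    return VoiceQuery(text=text, source=audio_path)


if __name__ == "__main__":
    # Stand-in transcriber for demonstration; a real one would call the STT engine.
    def fake_transcriber(path: str) -> str:
        return "  what did the arm   do last? \n"

    print(transcribe_prompt("prompt.wav", fake_transcriber))
```

With the open-source `openai-whisper` Python package, the transcriber could be a function that returns `model.transcribe(path)["text"]` for a model loaded once with `whisper.load_model(...)`. Whether RobotRoss uses that package or another Whisper runtime is not stated here, so treat that wiring as an assumption.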

3. Reasoning Path

  • The local model answers from the RobotRoss knowledge corpus and ledger evidence.
  • This query path is intended to run on the local system, not through a cloud-hosted inference layer.
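
One way to picture the evidence lookup behind the reasoning path is a local search over both stores that keeps track of where each hit came from. The sketch below is illustrative only: `Evidence`, `find_evidence`, the in-memory dictionaries, and the keyword-overlap scoring are all assumptions, and the actual retrieval method used by RobotRoss is not described on this page. It makes no network calls, in keeping with the local-only intent.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One piece of supporting text plus its provenance (hypothetical type)."""
    origin: str  # "wiki" or "ledger"
    ref: str     # page title or ledger entry id
    text: str


def _tokens(text: str) -> set[str]:
    # Lowercase alphanumeric words, so punctuation does not block a match.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_evidence(query: str, wiki: dict[str, str], ledger: dict[str, str],
                  limit: int = 3) -> list[Evidence]:
    """Rank wiki pages and ledger entries by how many query words they share."""
    terms = _tokens(query)
    scored = []
    for origin, store in (("wiki", wiki), ("ledger", ledger)):
        for ref, text in store.items():
            overlap = len(terms & _tokens(text))
            if overlap:
                scored.append((overlap, Evidence(origin, ref, text)))
    # Sort on the score only; ties keep insertion order (wiki before ledger).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [evidence for _, evidence in scored[:limit]]
```

Each returned item carries its `origin` and `ref`, so whatever the local model generates from it can still be traced back to a wiki page or ledger entry.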

4. Text-to-Speech

  • Voxtral is the intended text-to-speech engine for spoken answers.
  • Spoken output should summarize the same evidence-backed answer returned in the text channel.
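
The requirement that speech summarizes the same answer as the text channel can be sketched by deriving the spoken script directly from the text answer rather than generating it separately. This is a hedged sketch under stated assumptions: `Answer`, `spoken_summary`, and the two-sentence cutoff are hypothetical, and the call into the text-to-speech engine is deliberately left out because its interface is not documented here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    """An evidence-backed answer as shown in the text channel (hypothetical type)."""
    text: str                 # full answer text
    sources: tuple[str, ...]  # references to the evidence behind the answer


def spoken_summary(answer: Answer, max_sentences: int = 2) -> str:
    """Build the script to hand to the TTS engine from the text-channel answer."""
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer.text) if s.strip()]
    parts = []
    if sentences:
        parts.append(" ".join(sentences[:max_sentences]))
    if answer.sources:
        # Name the sources aloud so the spoken answer keeps its provenance.
        parts.append("Sources: " + ", ".join(answer.sources) + ".")
    return " ".join(parts)
```

Because the script is a truncation of the text answer plus its source list, the voice channel cannot state anything the text channel did not, which is one simple way to keep the two aligned.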

5. Notes and Open Points

  • Voice interaction should remain aligned with the same provenance expectations as the text query path.
  • The UI should present voice as a first-class control surface once the runtime hookup is in place.
