
VoiceAgentClient

The main class for interacting with the Lokutor Voice Agent.

Constructor

new VoiceAgentClient(config: LokutorConfig & { 
    prompt: string, 
    voice?: VoiceStyle, 
    language?: Language 
})
Config Options:
Option           Type        Description
apiKey           string      Required. Your Lokutor API Key.
prompt           string      Required. The system prompt defining the AI’s persona.
voice            VoiceStyle  Optional. Default is VoiceStyle.F1.
language         Language    Optional. Default is Language.ENGLISH.
serverUrl        string      Optional. Custom WebSocket URL.
onTranscription  function    Callback for user speech transcriptions.
onResponse       function    Callback for AI text responses.
onAudio          function    Callback for raw agent audio buffers.
onStatus         function    Callback for session status changes.
onError          function    Callback for error events.
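
As a rough illustration of how these options fit together, the sketch below builds a config object and fills in the documented defaults (VoiceStyle.F1 and Language.ENGLISH). The enum stand-ins, the `AgentConfigSketch` type, and the `withDefaults` helper are hypothetical and declared locally so the snippet runs on its own. The callback signatures and the string values of `VoiceStyle` are assumptions; in real code the enums and the client come from the Lokutor SDK.

```typescript
// Local stand-ins for the SDK enums (hypothetical; only a few members listed).
// The VoiceStyle string values are assumed; the Language codes are documented.
enum VoiceStyle { F1 = "F1", M1 = "M1" }
enum Language { ENGLISH = "en", SPANISH = "es" }

// Mirrors part of the options table above; callback signatures are assumptions.
interface AgentConfigSketch {
  apiKey: string;
  prompt: string;
  voice?: VoiceStyle;
  language?: Language;
  serverUrl?: string;
  onTranscription?: (text: string) => void;
  onError?: (error: unknown) => void;
}

// Hypothetical helper: checks the two required fields and applies the documented defaults.
function withDefaults(
  config: AgentConfigSketch
): AgentConfigSketch & { voice: VoiceStyle; language: Language } {
  if (!config.apiKey) throw new Error("apiKey is required");
  if (!config.prompt) throw new Error("prompt is required");
  return {
    ...config,
    voice: config.voice ?? VoiceStyle.F1,
    language: config.language ?? Language.ENGLISH,
  };
}

const agentConfig = withDefaults({
  apiKey: "YOUR_API_KEY",
  prompt: "You are a friendly support agent.",
  language: Language.SPANISH,
  onTranscription: (text) => {
    console.log("user:", text);
  },
});
// agentConfig would then be passed to `new VoiceAgentClient(agentConfig)`.
```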

Methods

connect(): Promise<boolean>

Establishes a connection to the Lokutor server.

sendAudio(audioData: Buffer | Uint8Array)

Sends raw PCM audio data (16-bit, 44.1 kHz, mono) to the server.
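
Audio captured in a browser often arrives as 32-bit floats in the range [-1, 1], so it may need converting before it matches the format above. The following is a minimal sketch of that conversion. The helper name `floatTo16BitPCM` is hypothetical, and little-endian byte order is an assumption, since the format description does not state it.

```typescript
// Convert float samples in [-1, 1] to 16-bit signed PCM bytes.
// Assumption: little-endian byte order (not specified in the reference above).
function floatTo16BitPCM(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  samples.forEach((sample, i) => {
    // Clamp out-of-range input so it cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, sample));
    // Scale asymmetrically: -1 maps to -32768 and +1 maps to 32767.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true);
  });
  return bytes;
}
```

The resulting `Uint8Array` has the shape `sendAudio` accepts (`Buffer | Uint8Array`).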

onAudio(callback: (data: Buffer) => void)

Subscribes to incoming audio buffers from the AI.

disconnect()

Closes the WebSocket connection.
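
The methods above suggest a connect, send, disconnect call order, which can be sketched as follows. `VoiceAgentLike` is a hypothetical local interface that mirrors the documented signatures so the snippet runs without the SDK, and `streamChunks` is an illustrative helper. Treating a `false` result from `connect()` as a failure is an assumption.

```typescript
// Hypothetical interface mirroring the documented method signatures.
interface VoiceAgentLike {
  connect(): Promise<boolean>;
  sendAudio(audioData: Uint8Array): void;
  disconnect(): void;
}

// Connects, sends every chunk in order, and always disconnects afterwards.
// Returns the number of chunks sent.
async function streamChunks(
  agent: VoiceAgentLike,
  chunks: Uint8Array[]
): Promise<number> {
  const connected = await agent.connect();
  // Assumption: a false result means the connection was not established.
  if (!connected) throw new Error("connect() resolved to false");
  let sent = 0;
  try {
    for (const chunk of chunks) {
      agent.sendAudio(chunk);
      sent += 1;
    }
  } finally {
    // Runs even if sendAudio throws, so the socket is not left open.
    agent.disconnect();
  }
  return sent;
}
```

A real session would likely stay open until the agent's responses have arrived, rather than disconnecting right after the last chunk; the sketch only shows the call ordering and cleanup.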

TTSClient

Dedicated client for converting text to high-quality streaming audio.

Constructor

new TTSClient(config: { apiKey: string, serverUrl?: string })

Methods

synthesize(options: SynthesizeOptions): Promise<void>

Starts synthesis and returns a promise that resolves when the stream finishes.

Options:
Option     Type        Description
text       string      Required. The text to speak.
voice      VoiceStyle  Optional.
language   Language    Optional.
speed      number      Optional. Default is 1.05.
steps      number      Optional. Synthesis quality (1-50). Default is 24.
visemes    boolean     Optional.
onAudio    function    Callback for incoming audio buffers.
onVisemes  function    Callback for animation/viseme data.
onError    function    Callback for errors.
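
Because audio arrives through `onAudio` while the returned promise only signals completion, one way to get the full result is to collect the chunks and join them once the promise resolves. The sketch below assumes `onAudio` receives a `Uint8Array`-compatible buffer. `TTSLike`, `SynthesizeOptionsSketch`, and `synthesizeToBytes` are hypothetical names declared locally so the example runs without the SDK.

```typescript
// Hypothetical subset of the options table above; callback signatures are assumed.
interface SynthesizeOptionsSketch {
  text: string;
  speed?: number;
  steps?: number;
  onAudio?: (data: Uint8Array) => void;
  onError?: (error: unknown) => void;
}

// Hypothetical interface mirroring the documented synthesize() signature.
interface TTSLike {
  synthesize(options: SynthesizeOptionsSketch): Promise<void>;
}

// Collects streamed audio chunks and returns them as one contiguous buffer.
async function synthesizeToBytes(tts: TTSLike, text: string): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  await tts.synthesize({
    text,
    onAudio: (data) => {
      chunks.push(data);
    },
  });
  // The promise has resolved, so the stream is finished; join the chunks.
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const joined = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    joined.set(chunk, offset);
    offset += chunk.length;
  }
  return joined;
}
```

Buffering the whole stream gives up the latency benefit of streaming, so this suits saving audio to a file more than live playback.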

Enums

VoiceStyle

  • F1 to F5: Female voices.
  • M1 to M5: Male voices.

Language

  • ENGLISH: "en"
  • SPANISH: "es"
  • FRENCH: "fr"
  • PORTUGUESE: "pt"
  • KOREAN: "ko"

Constants

AUDIO_CONFIG

  • SAMPLE_RATE: 44100
  • CHANNELS: 1
  • CHUNK_DURATION_MS: 20
  • CHUNK_SIZE: 882
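
These values are consistent with each other: 44100 samples per second for 20 ms gives 882 samples, which suggests CHUNK_SIZE is counted in samples rather than bytes (an inference, not something the reference states). The sketch below reproduces that arithmetic and splits a PCM byte buffer into chunk-sized pieces. The constants are redeclared locally as stand-ins, `splitIntoChunks` is a hypothetical helper, and the 2 bytes per sample figure comes from the 16-bit format described for `sendAudio`.

```typescript
// Local copies of the documented AUDIO_CONFIG values.
const SAMPLE_RATE = 44100;
const CHANNELS = 1;
const CHUNK_DURATION_MS = 20;
const BYTES_PER_SAMPLE = 2; // 16-bit PCM

// 44100 * 20 / 1000 = 882 samples per chunk.
const samplesPerChunk = (SAMPLE_RATE * CHUNK_DURATION_MS) / 1000;
// 882 * 1 channel * 2 bytes = 1764 bytes per chunk.
const bytesPerChunk = samplesPerChunk * CHANNELS * BYTES_PER_SAMPLE;

// Split a PCM byte buffer into views of at most chunkBytes each.
// subarray() creates views over the same memory, so no bytes are copied.
function splitIntoChunks(pcm: Uint8Array, chunkBytes: number): Uint8Array[] {
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += chunkBytes) {
    chunks.push(pcm.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}
```

The last chunk may be shorter than the rest; whether the server expects it to be padded is not covered by this reference.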