tts

NVIDIA Magpie TTS service backed by an AWS SageMaker endpoint.

class pipecat.services.nvidia.sagemaker.tts.NvidiaSageMakerTTSSettings(model: str | None | _NotGiven = <factory>, extra: dict[str, Any] = <factory>, voice: str | None | _NotGiven = <factory>, language: Language | str | None | _NotGiven = <factory>)

Bases: TTSSettings

Settings for NVIDIA SageMaker TTS services.

Parameters:
  • voice – NIM voice name (e.g. Magpie-Multilingual.EN-US.Aria).

  • language – BCP-47 language code passed to NIM (e.g. en-US).
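The `_NotGiven` defaults in the signature above suggest a sentinel that separates "field not supplied" from an explicit `None`. The following is a minimal sketch of that general pattern, not Pipecat's actual implementation: `_NotGivenSketch`, `SketchTTSSettings`, and `apply_update` are hypothetical names, and the `es-US` value is only an illustrative BCP-47 code.

```python
from dataclasses import dataclass, field, fields
from typing import Any


class _NotGivenSketch:
    """Sentinel type meaning 'the caller did not supply this field'."""

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = _NotGivenSketch()


@dataclass
class SketchTTSSettings:
    # Every field defaults to the sentinel, so None stays a legal explicit value.
    model: Any = field(default_factory=lambda: NOT_GIVEN)
    voice: Any = field(default_factory=lambda: NOT_GIVEN)
    language: Any = field(default_factory=lambda: NOT_GIVEN)


def apply_update(current: SketchTTSSettings, update: SketchTTSSettings) -> SketchTTSSettings:
    """Return a copy of `current` with only the fields that `update` actually sets."""
    merged = {}
    for f in fields(current):
        new_value = getattr(update, f.name)
        merged[f.name] = getattr(current, f.name) if new_value is NOT_GIVEN else new_value
    return SketchTTSSettings(**merged)


base = SketchTTSSettings(voice="Magpie-Multilingual.EN-US.Aria", language="en-US")
# Only the language changes; the voice carries over untouched.
updated = apply_update(base, SketchTTSSettings(language="es-US"))
```

With this pattern, a partial update can change one field at runtime without resetting the others.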

class pipecat.services.nvidia.sagemaker.tts.NvidiaSageMakerHTTPTTSService(*, endpoint_name: str, region: str = 'us-west-2', sample_rate: int | None = None, settings: NvidiaSageMakerTTSSettings | None = None, **kwargs)

Bases: TTSService

NVIDIA Magpie TTS service that calls a SageMaker HTTP endpoint.

Sends each text segment to the wrapper’s POST /invocations endpoint as a JSON body and streams the raw PCM audio response back to the bot as TTSAudioRawFrame frames.
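As a rough illustration of the request shape described above, the sketch below builds the arguments for a SageMaker runtime `invoke_endpoint` call. The payload keys (`text`, `voice`, `language`), the endpoint name, and the helper `build_invocation_kwargs` are assumptions for illustration; the wrapper's real JSON schema is not documented here, and the service itself may use a different client.

```python
import json


def build_invocation_kwargs(endpoint_name: str, text: str, voice: str, language: str) -> dict:
    """Build keyword arguments for a SageMaker runtime `invoke_endpoint` call."""
    # Hypothetical payload keys; the real wrapper's schema is not documented here.
    payload = {"text": text, "voice": voice, "language": language}
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps(payload).encode("utf-8"),
    }


kwargs = build_invocation_kwargs(
    "my-magpie-endpoint", "Hello there.", "Magpie-Multilingual.EN-US.Aria", "en-US"
)
# With boto3 installed and AWS credentials configured, the call would look like:
#   client = boto3.client("sagemaker-runtime", region_name="us-west-2")
#   response = client.invoke_endpoint(**kwargs)
#   pcm_bytes = response["Body"].read()
```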

Example:

import os

from pipecat.services.nvidia.sagemaker.tts import NvidiaSageMakerHTTPTTSService

tts = NvidiaSageMakerHTTPTTSService(
    endpoint_name=os.getenv("SAGEMAKER_MAGPIE_ENDPOINT_NAME"),
    region=os.getenv("AWS_REGION", "us-west-2"),
    settings=NvidiaSageMakerHTTPTTSService.Settings(
        voice="Magpie-Multilingual.EN-US.Aria",
        language="en-US",
    ),
)

Settings

alias of NvidiaSageMakerTTSSettings

__init__(*, endpoint_name: str, region: str = 'us-west-2', sample_rate: int | None = None, settings: NvidiaSageMakerTTSSettings | None = None, **kwargs)

Initialize the SageMaker HTTP TTS service.

Parameters:
  • endpoint_name – Name of the deployed SageMaker endpoint.

  • region – AWS region where the endpoint lives.

  • sample_rate – Output sample rate in Hz. Defaults to the pipeline’s sample rate.

  • settings – Runtime-updatable settings (voice, language).

  • **kwargs – Forwarded to TTSService.

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns:

True, as this service supports metrics generation.

async start(frame: StartFrame)

Start the TTS service and create the SageMaker client.

Parameters:

frame – The start frame containing initialization parameters.

async stop(frame: EndFrame)

Stop the TTS service and close the SageMaker client.

Parameters:

frame – The end frame.

async cancel(frame: CancelFrame)

Cancel the TTS service and close the SageMaker client.

Parameters:

frame – The cancel frame.

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame, None]

Synthesize text via SageMaker and yield a single PCM audio frame.

Parameters:
  • text – The text to synthesize.

  • context_id – Pipecat audio context identifier.

Yields:

TTSAudioRawFrame chunks of signed 16-bit mono PCM.
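To make the "signed 16-bit mono PCM" format concrete, here is a small sketch that splits a raw PCM buffer into sample-aligned chunks and computes its duration. The chunk size and the 22050 Hz rate are illustrative assumptions, and `chunk_pcm` and `pcm_duration_seconds` are hypothetical helpers, not part of Pipecat.

```python
def chunk_pcm(pcm: bytes, chunk_bytes: int = 4096) -> list[bytes]:
    """Split 16-bit PCM into chunks that never cut a sample in half."""
    bytes_per_sample = 2  # signed 16-bit mono
    # Round the chunk size down to a whole number of samples.
    chunk_bytes -= chunk_bytes % bytes_per_sample
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must hold at least one sample")
    return [pcm[i : i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


def pcm_duration_seconds(pcm: bytes, sample_rate: int) -> float:
    """Duration of a 16-bit mono buffer: bytes / (2 bytes per sample * rate)."""
    return len(pcm) / (2 * sample_rate)


audio = bytes(22050 * 2)  # one second of silence at an assumed 22050 Hz
chunks = chunk_pcm(audio, chunk_bytes=4097)  # odd size is rounded down to 4096
```

Keeping chunks aligned to the 2-byte sample size matters because a chunk that ends mid-sample would shift every following sample and produce noise on playback.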

async setup(setup: FrameProcessorSetup)

Set up the processor with required components.

Parameters:

setup – Configuration object containing setup parameters.

class pipecat.services.nvidia.sagemaker.tts.NvidiaSageMakerTTSService(*, endpoint_name: str, region: str = 'us-west-2', sample_rate: int | None = None, settings: NvidiaSageMakerTTSSettings | None = None, **kwargs)

Bases: InterruptibleTTSService

NVIDIA Magpie TTS service using SageMaker bidirectional streaming.

Maintains a persistent HTTP/2 bidi-stream session to the SageMaker endpoint for the lifetime of the pipeline. Each text segment is sent as NIM realtime events; audio chunks arrive asynchronously and are pushed as TTSAudioRawFrame frames.
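The decoupling of sending text from receiving audio can be sketched with plain asyncio queues. This is a generic illustration of the pattern, not the service's implementation: `fake_endpoint` stands in for the remote endpoint, and the queue-based transport and `None` shutdown signal are assumptions made to keep the example self-contained.

```python
import asyncio


async def fake_endpoint(outgoing: asyncio.Queue, incoming: asyncio.Queue) -> None:
    """Stand-in for the remote endpoint: answers each text with two audio chunks."""
    while True:
        text = await outgoing.get()
        if text is None:  # shutdown signal
            await incoming.put(None)
            return
        for part in (b"\x00\x00", b"\x01\x00"):
            await incoming.put((text, part))


async def receive_messages(incoming: asyncio.Queue, received: list) -> None:
    """Background reader: collects audio as it arrives, independent of senders."""
    while True:
        message = await incoming.get()
        if message is None:
            return
        received.append(message)


async def main() -> list:
    outgoing, incoming = asyncio.Queue(), asyncio.Queue()
    received: list = []
    endpoint = asyncio.create_task(fake_endpoint(outgoing, incoming))
    reader = asyncio.create_task(receive_messages(incoming, received))
    # Sending returns immediately; audio shows up later in the reader task.
    for segment in ("Hello.", "How are you?"):
        await outgoing.put(segment)
    await outgoing.put(None)
    await asyncio.gather(endpoint, reader)
    return received


received = asyncio.run(main())
```

Because the reader runs as its own task, the sender never blocks waiting for audio, which is the property that lets a persistent session serve many text segments in sequence.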

Example:

import os

from pipecat.services.nvidia.sagemaker.tts import NvidiaSageMakerTTSService

tts = NvidiaSageMakerTTSService(
    endpoint_name=os.getenv("SAGEMAKER_MAGPIE_ENDPOINT_NAME"),
    region=os.getenv("AWS_REGION", "us-west-2"),
    settings=NvidiaSageMakerTTSService.Settings(
        voice="Magpie-Multilingual.EN-US.Aria",
        language="en-US",
    ),
)

Settings

alias of NvidiaSageMakerTTSSettings

__init__(*, endpoint_name: str, region: str = 'us-west-2', sample_rate: int | None = None, settings: NvidiaSageMakerTTSSettings | None = None, **kwargs)

Initialize the SageMaker bidirectional-streaming TTS service.

Parameters:
  • endpoint_name – Name of the deployed SageMaker endpoint.

  • region – AWS region where the endpoint lives.

  • sample_rate – Output sample rate in Hz. Defaults to the pipeline’s sample rate.

  • settings – Runtime-updatable settings (voice, language).

  • **kwargs – Forwarded to InterruptibleTTSService.

can_generate_metrics() → bool

Check if this service can generate processing metrics.

Returns:

True, as this service supports metrics generation.

async start(frame: StartFrame)

Start the TTS service and connect to the SageMaker endpoint.

Parameters:

frame – The start frame containing initialization parameters.

async stop(frame: EndFrame)

Stop the TTS service and disconnect from the SageMaker endpoint.

Parameters:

frame – The end frame.

async cancel(frame: CancelFrame)

Cancel the TTS service and disconnect from the SageMaker endpoint.

Parameters:

frame – The cancel frame.

async run_tts(text: str, context_id: str) → AsyncGenerator[Frame | None, None]

Send text to NIM; audio arrives asynchronously via _receive_messages.

async setup(setup: FrameProcessorSetup)

Set up the processor with required components.

Parameters:

setup – Configuration object containing setup parameters.