user_stop

class pipecat.turns.user_stop.BaseUserTurnStopStrategy(*, enable_user_speaking_frames: bool = True, **kwargs)[source]

Bases: BaseObject

Base class for strategies that determine when the user stops speaking.

Subclasses should implement logic to detect when the user stops speaking. This could be based on analyzing incoming frames (such as transcriptions), conversation state, or other heuristics.

Events triggered by strategies:

on_push_frame: Indicates the strategy wants to push a frame.

on_user_turn_inference_triggered: Signals that enough evidence exists to start LLM inference for the current user turn. In most cases this fires together with on_user_turn_stopped. Strategies that gate finalization on the LLM (e.g. LLMTurnCompletionUserTurnStopStrategy) fire only this event upstream and a separate strategy fires on_user_turn_stopped once the LLM confirms the turn is complete.

on_user_turn_stopped: Signals that the user turn is semantically final. Observers, transcript appenders, and UI indicators should bind to this event.

__init__(*, enable_user_speaking_frames: bool = True, **kwargs)[source]

Initialize the base user turn stop strategy.

Parameters:

enable_user_speaking_frames – If True, the aggregator will emit frames indicating when the user stops speaking. This is enabled by default, but you may want to disable it if another component (e.g., an STT service) is already generating these frames.
**kwargs – Additional keyword arguments.

async cleanup()[source]: Clean up the strategy.

async reset()[source]

Reset the strategy to its initial state.

Deprecated since version 1.6.0: Use handle_user_turn_started() and handle_user_turn_stopped() instead. Will be removed in 2.0.0.

A stop strategy is reset at both turn boundaries (armed on start, cleaned up on stop), so this method historically ran at both. New strategies should override the boundary callbacks directly, which makes explicit when the work runs, and perform whatever reset they need inside them (for example a private helper called from both, or an extra clear() on stop).
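A minimal sketch of that migration, assuming a stand-in base class so the snippet runs without pipecat installed. StandInStopStrategy, TranscriptBufferStrategy, the _clear helper, and the text and armed attributes are all hypothetical names chosen for illustration; only the two callback names come from this reference.

```python
import asyncio


class StandInStopStrategy:
    """Stand-in for the base class; NOT pipecat's implementation."""

    async def handle_user_turn_started(self) -> None:
        pass

    async def handle_user_turn_stopped(self) -> None:
        pass


class TranscriptBufferStrategy(StandInStopStrategy):
    """Hypothetical strategy that keeps per-turn text."""

    def __init__(self) -> None:
        self.text = ""
        self.armed = False

    def _clear(self) -> None:
        # Private helper shared by both boundary callbacks.
        self.text = ""

    async def handle_user_turn_started(self) -> None:
        self._clear()
        self.armed = True  # start-only work: arm the detector

    async def handle_user_turn_stopped(self) -> None:
        self._clear()
        self.armed = False  # stop-only work: disarm the detector


async def demo():
    strategy = TranscriptBufferStrategy()
    strategy.text = "stale text from a previous turn"
    await strategy.handle_user_turn_started()
    after_start = (strategy.armed, strategy.text)
    strategy.text = "hello"
    await strategy.handle_user_turn_stopped()
    after_stop = (strategy.armed, strategy.text)
    return after_start, after_stop


after_start, after_stop = asyncio.run(demo())
```

Keeping the shared reset in one helper avoids duplicating it, while each callback still states what is specific to its boundary.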

async handle_user_turn_started()[source]

Notify the strategy that a user turn has started.

The controller calls this on every stop strategy when a turn begins. Override it to, for example, arm the strategy so it can detect the end of the turn now in progress.

async handle_user_turn_stopped()[source]

Notify the strategy that the user turn has stopped.

The controller calls this on every stop strategy when a turn ends, regardless of which strategy (or the stop watchdog timeout) ended it. Override to run stop-specific logic — e.g. dropping a turn analyzer’s buffered speech that must not survive an externally-ended turn.

async process_frame(frame: Frame) → ProcessFrameResult | None[source]

Process an incoming frame to decide whether the user stopped speaking.

Subclasses should override this to implement logic that decides whether the user has stopped speaking.

Parameters: frame – The frame to be analyzed.
Returns: A ProcessFrameResult indicating the outcome, or None (treated as CONTINUE for backward compatibility).
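The None-as-CONTINUE rule can be illustrated with a small caller loop. This is a sketch under stated assumptions, not the controller's actual code: Result is a stand-in for ProcessFrameResult, its STOP member is hypothetical (this reference names only CONTINUE), and run_chain and the three sample strategies are invented for the example.

```python
import asyncio
import enum
from typing import Awaitable, Callable, List, Optional, Tuple


class Result(enum.Enum):
    """Stand-in for ProcessFrameResult; STOP is a hypothetical member."""

    CONTINUE = "continue"
    STOP = "stop"


Processor = Callable[[object], Awaitable[Optional[Result]]]


async def legacy(frame: object) -> None:
    # Older-style strategy that returns nothing at all.
    return None


async def stopper(frame: object) -> Result:
    return Result.STOP


async def tail(frame: object) -> Result:
    return Result.CONTINUE


async def run_chain(chain: List[Tuple[str, Processor]], frame: object) -> List[str]:
    """Return the names of the strategies that saw the frame."""
    visited = []
    for name, process in chain:
        visited.append(name)
        result = await process(frame)
        if result is None:
            result = Result.CONTINUE  # backward-compatible default
        if result is Result.STOP:
            break  # assumed short-circuit; later strategies are skipped
    return visited


visited = asyncio.run(
    run_chain([("legacy", legacy), ("stopper", stopper), ("tail", tail)], object())
)
```

Here the legacy strategy's None does not interrupt evaluation, so the next strategy in the list still sees the frame.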

async push_frame(frame: Frame, direction: FrameDirection = FrameDirection.DOWNSTREAM)[source]

Emit on_push_frame to push a frame using the user aggregator.

Parameters:

frame – The frame to be pushed.
direction – The direction in which the frame should be pushed.

async broadcast_frame(frame_cls: type[Frame], **kwargs)[source]

Emit on_broadcast_frame to broadcast a frame using the user aggregator.

Parameters:

frame_cls – The class of the frame to be broadcast.
**kwargs – Keyword arguments to be passed to the frame’s constructor.

async trigger_user_turn_stopped()[source]

Fire both on_user_turn_inference_triggered and on_user_turn_stopped.

Most strategies call this when they decide a turn has ended. To defer finalization to another strategy (so this strategy fires only the inference-triggered event), wrap this strategy with deferred() instead of changing the trigger call.

async trigger_user_turn_inference_triggered()[source]: Trigger only the on_user_turn_inference_triggered event.

async trigger_user_turn_finalized()[source]: Trigger only the on_user_turn_stopped event.
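How the three trigger methods relate to the two events can be sketched with a tiny stand-in emitter. MiniStrategy is hypothetical and is not pipecat's base class; the handler signature and the order in which the combined trigger fires its two events are assumptions made for the example.

```python
import asyncio
from collections import defaultdict


class MiniStrategy:
    """Stand-in event emitter; NOT pipecat's BaseUserTurnStopStrategy."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def add_event_handler(self, event_name, handler) -> None:
        self._handlers[event_name].append(handler)

    async def _emit(self, event_name: str) -> None:
        for handler in self._handlers[event_name]:
            await handler(self)

    async def trigger_user_turn_inference_triggered(self) -> None:
        await self._emit("on_user_turn_inference_triggered")

    async def trigger_user_turn_finalized(self) -> None:
        await self._emit("on_user_turn_stopped")

    async def trigger_user_turn_stopped(self) -> None:
        # Assumed order: inference first, then finalization.
        await self.trigger_user_turn_inference_triggered()
        await self.trigger_user_turn_finalized()


async def demo():
    seen = []
    strategy = MiniStrategy()

    async def on_inference(_strategy):
        seen.append("inference")

    async def on_stopped(_strategy):
        seen.append("stopped")

    strategy.add_event_handler("on_user_turn_inference_triggered", on_inference)
    strategy.add_event_handler("on_user_turn_stopped", on_stopped)

    await strategy.trigger_user_turn_stopped()  # fires both events
    await strategy.trigger_user_turn_inference_triggered()  # fires only one
    return seen


events = asyncio.run(demo())
```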

class pipecat.turns.user_stop.DeferredUserTurnStopStrategy(inner: BaseUserTurnStopStrategy, **kwargs)[source]

Bases: BaseUserTurnStopStrategy

Wraps a stop strategy and suppresses its on_user_turn_stopped event.

Event subscriptions added to the wrapper are forwarded directly to the inner strategy, except for on_user_turn_stopped, which is dropped. The inner strategy’s frame-side and inference-triggered events therefore reach external listeners (the controller, etc.) unchanged; finalization is left to another strategy in the chain such as LLMTurnCompletionUserTurnStopStrategy.

Use the deferred() helper for ergonomic construction:

stop=[
    deferred(TurnAnalyzerUserTurnStopStrategy(turn_analyzer=...)),
    LLMTurnCompletionUserTurnStopStrategy(),
]

__init__(inner: BaseUserTurnStopStrategy, **kwargs)[source]

Initialize the deferred wrapper.

Parameters:

inner – The strategy whose finalization should be deferred.
**kwargs – Additional keyword arguments forwarded to the base class.

property inner: BaseUserTurnStopStrategy: Return the wrapped strategy.

add_event_handler(event_name: str, handler)[source]

Forward event subscriptions to the inner strategy.

on_user_turn_stopped is silently dropped — that’s the whole point of the wrapper. Every other event handler is attached to the inner strategy directly, so the inner’s events reach the listener without any per-event proxy method on the wrapper.
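The forwarding rule can be sketched as follows. InnerStandIn and DeferredStandIn are hypothetical stand-ins written for this example, not the library classes; the sketch shows only the subscription filtering described above.

```python
from collections import defaultdict


class InnerStandIn:
    """Stand-in for a wrapped stop strategy."""

    def __init__(self) -> None:
        self.handlers = defaultdict(list)

    def add_event_handler(self, event_name, handler) -> None:
        self.handlers[event_name].append(handler)


class DeferredStandIn:
    """Stand-in wrapper that refuses on_user_turn_stopped subscriptions."""

    def __init__(self, inner: InnerStandIn) -> None:
        self._inner = inner

    @property
    def inner(self) -> InnerStandIn:
        return self._inner

    def add_event_handler(self, event_name, handler) -> None:
        if event_name == "on_user_turn_stopped":
            return  # silently dropped: finalization belongs to another strategy
        self._inner.add_event_handler(event_name, handler)


def noop(*args) -> None:
    pass


inner = InnerStandIn()
wrapper = DeferredStandIn(inner)
wrapper.add_event_handler("on_user_turn_stopped", noop)
wrapper.add_event_handler("on_user_turn_inference_triggered", noop)
wrapper.add_event_handler("on_push_frame", noop)
registered = sorted(inner.handlers)
```

Because handlers are attached to the inner object itself, the wrapper needs no per-event proxy methods; anything the inner strategy emits reaches the listener directly.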

async setup(task_manager: BaseTaskManager)[source]: Set up the inner strategy.

async cleanup()[source]: Clean up the inner strategy.

async handle_user_turn_started()[source]: Forward the turn-started callback to the inner strategy.

async handle_user_turn_stopped()[source]: Forward the turn-stopped callback to the inner strategy.

async process_frame(frame: Frame) → ProcessFrameResult | None[source]: Forward frame processing to the inner strategy.

class pipecat.turns.user_stop.ExternalUserTurnCompletionStopStrategy(*, enable_user_speaking_frames: bool = True, **kwargs)[source]

Bases: BaseUserTurnStopStrategy

Finalize the user turn whenever a UserTurnInferenceCompletedFrame arrives.

Generic stop strategy for pipelines where some external component (LLM with completion markers, STT with built-in turn detection, a dedicated end-of-turn classifier, custom user code, etc.) judges when a turn is semantically complete and emits UserTurnInferenceCompletedFrame.

Pair this with one or more deferred(...)-wrapped detector strategies that drive on_user_turn_inference_triggered but leave finalization to this strategy:

stop=[
    deferred(TurnAnalyzerUserTurnStopStrategy(turn_analyzer=...)),
    ExternalUserTurnCompletionStopStrategy(),
]

For LLM-completion-marker gating specifically, use the subclass LLMTurnCompletionUserTurnStopStrategy instead, which additionally pushes the LLMUpdateSettingsFrame that enables the marker protocol on the LLM.

A completion resolves with some latency (e.g. the LLM ✓ arrives after the inference finishes), so the user may have resumed speaking in the meantime. The controller drops any finalization that arrives while the user is speaking, so a stale completion neither ends the turn nor lets the response talk over the user; the turn stays open for the next inference to re-evaluate. That check lives in the controller, which holds the authoritative user-speaking state.
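A minimal sketch of that stale-completion guard, assuming a simplified controller with just two flags. ControllerStandIn, its attributes, and on_completion are hypothetical names; the real controller tracks considerably more state.

```python
class ControllerStandIn:
    """Stand-in for the turn controller; NOT pipecat's implementation."""

    def __init__(self) -> None:
        self.user_speaking = False
        self.turn_open = True

    def on_completion(self) -> bool:
        """Handle a completion signal; return True if it finalized the turn."""
        if not self.turn_open:
            return False  # nothing left to finalize
        if self.user_speaking:
            return False  # stale completion: the user resumed, keep the turn open
        self.turn_open = False
        return True


controller = ControllerStandIn()

# The completion arrives late, after the user started talking again.
controller.user_speaking = True
stale_result = controller.on_completion()
open_after_stale = controller.turn_open

# A later completion arrives while the user is silent.
controller.user_speaking = False
fresh_result = controller.on_completion()
open_after_fresh = controller.turn_open
```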

If the producer never emits UserTurnInferenceCompletedFrame, the controller’s user_turn_stop_timeout watchdog finalizes the turn after a period of inactivity. Tune that timeout if your producer can take longer than the default to respond.

async process_frame(frame: Frame) → ProcessFrameResult[source]: Fire on_user_turn_stopped whenever UserTurnInferenceCompletedFrame is seen.

class pipecat.turns.user_stop.ExternalUserTurnStopStrategy(*, timeout: float = 0.5, wait_for_transcript: bool = True, **kwargs)[source]

Bases: BaseUserTurnStopStrategy

User turn stop strategy controlled by an external processor.

This strategy does not determine on its own when a user turn ends. Instead, it relies on a different processor in the pipeline, which is responsible for emitting UserStoppedSpeakingFrame frames.

__init__(*, timeout: float = 0.5, wait_for_transcript: bool = True, **kwargs)[source]

Initialize the external user turn stop strategy.

Parameters:

timeout – A short delay used internally to handle consecutive or slightly delayed transcriptions.
wait_for_transcript – When True (default), turn-stop signaling waits for transcript text to arrive after the external UserStoppedSpeakingFrame. When False, the strategy signals turn-stop as soon as that frame arrives — independent of transcripts. Set this to False when local turn detection is the intended driver of the conversation (e.g. with a realtime LLM service consuming audio directly), so transcripts are off the latency critical path. LLMContextAggregatorPair flips this for you when realtime_service_mode=True.
**kwargs – Additional keyword arguments.

property wait_for_transcript: bool: Whether turn-stop signaling waits for transcript text.

async handle_user_turn_started()[source]: Ready the strategy to detect the end of the turn now starting.

async handle_user_turn_stopped()[source]: Clear per-turn state once the turn has ended.

async setup(task_manager: BaseTaskManager)[source]

Initialize the strategy with the given task manager.

Parameters: task_manager – The task manager to be associated with this instance.

async cleanup()[source]: Clean up the strategy.

async process_frame(frame: Frame) → ProcessFrameResult[source]

Process an incoming frame to update strategy state.

Updates internal transcription text and VAD state. The end of the user turn is triggered once the collected frames warrant it.

Parameters: frame – The frame to be analyzed.
Returns: Always returns CONTINUE so subsequent stop strategies are evaluated.

class pipecat.turns.user_stop.LLMTurnCompletionUserTurnStopStrategy(*, config: UserTurnCompletionConfig | None = None, **kwargs)[source]

Bases: ExternalUserTurnCompletionStopStrategy

LLM-gated stop strategy.

Extends ExternalUserTurnCompletionStopStrategy with the LLM-specific setup needed for the marker-based completion protocol: on StartFrame, pushes an LLMUpdateSettingsFrame upstream that enables filter_incomplete_user_turns on the LLM and seeds the UserTurnCompletionConfig.

Finalization itself is inherited: when the LLM service’s UserTurnCompletionLLMServiceMixin detects a ✓ marker, it broadcasts a UserTurnInferenceCompletedFrame and the base class fires on_user_turn_stopped. On incomplete_short / incomplete_long markers the mixin re-prompts internally and no completion frame is emitted, so the public stop event stays deferred.

Install alongside one or more deferred(...)-wrapped detector strategies that drive on_user_turn_inference_triggered but leave finalization to this strategy. The aggregator’s deprecation path for filter_incomplete_user_turns does this rewiring automatically.

__init__(*, config: UserTurnCompletionConfig | None = None, **kwargs)[source]

Initialize the LLM turn-completion stop strategy.

Parameters:

config – Configuration applied to the LLM via the filter_incomplete_user_turns setting on StartFrame. Defaults to UserTurnCompletionConfig().
**kwargs – Additional keyword arguments forwarded to the base class.

property config: UserTurnCompletionConfig: Return the configured UserTurnCompletionConfig.

async process_frame(frame: Frame) → ProcessFrameResult[source]: Configure the LLM on start and delegate completion handling to the base.

class pipecat.turns.user_stop.SpeechTimeoutUserTurnStopStrategy(*, user_speech_timeout: float = 0.6, wait_for_transcript: bool = True, **kwargs)[source]

Bases: BaseUserTurnStopStrategy

User turn stop strategy using two independent timers after VAD stop.

After the user stops speaking (detected by VAD), this strategy runs two independent timers. The user turn stop is triggered only when both have finished and at least one transcript has been received:

user_speech_timeout: Policy floor — the window in which the user may resume speaking after a pause. Always runs to completion.
stt_timeout: Safety net for STT latency — the P99 time for the STT service to return a final transcript after VAD stop, adjusted by the VAD stop_secs. Short-circuited when the STT service emits a finalized transcript (TranscriptionFrame.finalized=True), since finalization means STT has nothing more to send.

Fallback: when a transcript arrives without a VAD stop event, the user_speech_timeout timer measures inactivity since the last transcript (rearmed on each transcript). stt_timeout has no meaning here since it is defined relative to VAD stop, and STT has already emitted a transcript — so the stt wait is marked done immediately.
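The main (VAD stop) path can be modelled as a pure function of event times. This is a simplified sketch, not the strategy's implementation: turn_stop_time is a hypothetical helper, all times are seconds measured from VAD stop, the stt_timeout value is passed in without the stop_secs adjustment, and the model ignores both the fallback path and the user resuming speech. It also assumes that a transcript arriving after both timers have finished triggers the stop on arrival.

```python
from typing import Optional, Sequence, Tuple


def turn_stop_time(
    user_speech_timeout: float,
    stt_timeout: float,
    transcripts: Sequence[Tuple[float, bool]],
) -> Optional[float]:
    """Return when the turn stop would fire, or None if it never does.

    Each transcript is (arrival_time, finalized).
    """
    if not transcripts:
        return None  # at least one transcript is required

    first_transcript = min(arrival for arrival, _ in transcripts)

    # A finalized transcript short-circuits the STT safety-net timer.
    finalized_arrivals = [arrival for arrival, finalized in transcripts if finalized]
    stt_done = min([stt_timeout] + finalized_arrivals)

    # Both timers must be done and a transcript must be in hand.
    return max(user_speech_timeout, stt_done, first_transcript)


finalized_case = turn_stop_time(0.6, 1.0, [(0.2, True)])
interim_case = turn_stop_time(0.6, 1.0, [(0.3, False)])
late_case = turn_stop_time(0.6, 1.0, [(1.4, False)])
silent_case = turn_stop_time(0.6, 1.0, [])
```

With a finalized transcript the user_speech_timeout floor decides the stop time; with only an interim transcript the STT safety net does.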

__init__(*, user_speech_timeout: float = 0.6, wait_for_transcript: bool = True, **kwargs)[source]

Initialize the speech timeout-based user turn stop strategy.

Parameters:

user_speech_timeout – Time to wait for the user to potentially say more after they pause speaking. Defaults to 0.6 seconds.
wait_for_transcript – Whether to require at least one transcript before triggering end-of-turn. When True (default), turn-end fires only after the user-speech timer expires and at least one transcript has been received. When False, the strategy signals turn-end as soon as VAD reports end of speech and the user-speech timer has elapsed — independent of transcripts. Set this to False when local turn detection is the intended driver of the conversation (e.g. with a realtime LLM service consuming audio directly), so transcripts are off the latency critical path. LLMContextAggregatorPair flips this for you when realtime_service_mode=True.
**kwargs – Additional keyword arguments.

property wait_for_transcript: bool: Whether transcripts gate end-of-turn signalling.

async reset()[source]

Reset the strategy to its initial state.

Note that _vad_user_speaking is intentionally left untouched: it reflects the live physical VAD state, not turn-scoped bookkeeping. VAD only re-emits a start after a stop, so if the user is still speaking when a turn boundary resets this strategy, clearing the flag would make the strategy believe there’s no active VAD reference and fall back to treating any transcript as a standalone utterance.

async handle_user_turn_started()[source]: Ready the strategy to detect the end of the turn now starting.

async handle_user_turn_stopped()[source]: Clear per-turn state once the turn has ended.

async setup(task_manager: BaseTaskManager)[source]

Initialize the strategy with the given task manager.

Parameters: task_manager – The task manager to be associated with this instance.

async cleanup()[source]: Clean up the strategy.

async process_frame(frame: Frame) → ProcessFrameResult[source]

Process an incoming frame to update strategy state.

Updates internal transcription text and VAD state. The end of the user turn is triggered once the collected frames warrant it.

Parameters: frame – The frame to be analyzed.
Returns: Always returns CONTINUE so subsequent stop strategies are evaluated.

class pipecat.turns.user_stop.UserTurnStoppedParams(enable_user_speaking_frames: bool)[source]

Bases: object

Parameters emitted when a user turn stops.

These parameters are passed to the on_user_turn_stopped event and tell the user aggregator how the end of the user turn should be handled.

Parameters: enable_user_speaking_frames – Whether the user aggregator should emit frames indicating user speaking state (e.g., user stopped speaking). This is typically enabled by default, but may be disabled when another component (such as an STT service) is already responsible for generating user speaking frames.

enable_user_speaking_frames: bool

class pipecat.turns.user_stop.TurnAnalyzerUserTurnStopStrategy(*, turn_analyzer: BaseTurnAnalyzer, wait_for_transcript: bool = True, **kwargs)[source]

Bases: BaseUserTurnStopStrategy

User turn stop strategy that uses a turn detection model to determine if the user is done speaking.

This strategy feeds audio, VAD, and transcription frames to a turn detection model (BaseTurnAnalyzer) that predicts when the user has finished their turn. Once the model indicates the turn is complete, the strategy waits for a final transcription before triggering the end of the user’s turn.

For services that support finalization (TranscriptionFrame.finalized=True), the turn can be triggered immediately once the finalized transcript is received. Otherwise, an STT timeout (adjusted by VAD stop_secs) is used as a fallback.

__init__(*, turn_analyzer: BaseTurnAnalyzer, wait_for_transcript: bool = True, **kwargs)[source]

Initialize the user turn stop strategy.

Parameters:

turn_analyzer – The turn analyzer instance used to detect the end of the user turn.
wait_for_transcript – Whether to require a transcript before triggering end-of-turn. When True (default), turn-end fires only after the turn analyzer reports COMPLETE and either a finalized transcript arrives or the STT safety-net timeout elapses with text in hand. When False, the strategy signals turn-end as soon as the turn analyzer reports COMPLETE — independent of transcripts. Set this to False when local turn detection is the intended driver of the conversation (e.g. with a realtime LLM service consuming audio directly), so transcripts are off the latency critical path. LLMContextAggregatorPair flips this for you when realtime_service_mode=True.
**kwargs – Additional keyword arguments.
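The wait_for_transcript gate described above reduces to a small boolean rule. The function below is an illustrative sketch only: should_stop and its argument names are hypothetical, and the real strategy derives these conditions from frames and timers rather than receiving them as flags.

```python
def should_stop(
    analyzer_complete: bool,
    wait_for_transcript: bool,
    finalized_transcript: bool,
    stt_timeout_elapsed: bool,
    has_text: bool,
) -> bool:
    """Decide whether end-of-turn may be signalled."""
    if not analyzer_complete:
        return False  # the turn analyzer has not reported COMPLETE yet
    if not wait_for_transcript:
        return True  # transcripts are off the critical path
    # Either STT finalized, or the safety-net timer expired with text in hand.
    return finalized_transcript or (stt_timeout_elapsed and has_text)


realtime_mode = should_stop(True, False, False, False, False)
waiting_for_stt = should_stop(True, True, False, False, True)
finalized = should_stop(True, True, True, False, True)
timeout_with_text = should_stop(True, True, False, True, True)
timeout_without_text = should_stop(True, True, False, True, False)
```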

property wait_for_transcript: bool: Whether transcripts gate end-of-turn signalling.

async handle_user_turn_started()[source]: Ready the strategy to detect the end of the turn now starting.

async handle_user_turn_stopped()[source]

Clear the analyzer’s buffered speech when the turn ends.

Beyond the usual per-turn reset, drop the analyzer’s buffered speech state so a stale silence timer can’t fire a phantom end-of-turn after an externally-ended turn (for example a forced stop while a mute strategy held audio back). The analyzer already clears itself on its own COMPLETE, so the clear is a no-op on the normal path; the pre-speech buffer refills continuously before the next turn starts.

async setup(task_manager: BaseTaskManager)[source]

Initialize the strategy with the given task manager.

Parameters: task_manager – The task manager to be associated with this instance.

async cleanup()[source]: Clean up the strategy.

async process_frame(frame: Frame) → ProcessFrameResult[source]

Process an incoming frame to update the turn analyzer and strategy state.

Parameters: frame – The frame to be analyzed.
Returns: Always returns CONTINUE so subsequent stop strategies are evaluated.

pipecat.turns.user_stop.deferred(strategy: BaseUserTurnStopStrategy) → DeferredUserTurnStopStrategy[source]

Defer this stop strategy’s finalization to another strategy.

Wraps strategy in a DeferredUserTurnStopStrategy: the inner strategy continues to drive inference-triggered events, but its on_user_turn_stopped event is suppressed. Use when another strategy in the chain (e.g. LLMTurnCompletionUserTurnStopStrategy) owns finalization.

Example:

stop=[
    deferred(TurnAnalyzerUserTurnStopStrategy(turn_analyzer=...)),
    LLMTurnCompletionUserTurnStopStrategy(),
]

Parameters: strategy – The stop strategy to defer.
Returns: A wrapper that exposes the inner strategy’s behavior with finalization suppressed.
