Summarizer Mode

Summarizer Mode sends your prompt to several AI models simultaneously, then routes all of their responses to a dedicated synthesizer model that merges them into a single, comprehensive answer. Each fact in the synthesis is tagged with the source model(s) that contributed it.

This differs from Compare Mode, which shows each model’s response side-by-side without merging. Summarizer Mode is best when you want a unified answer and attribution, not a raw comparison.

Prerequisites

At least two source models must be selected before you can send in Summarizer Mode.
A synthesizer model must be designated (one model is auto-assigned from the available model list on first load; it can be changed at any time).

Selecting Summarizer Mode

Open a conversation or start a new one.
In the routing mode selector, choose Summarize.
The mode is persisted to localStorage under the key arbitex-routing-mode and restored on page reload.

localStorage key: arbitex-routing-mode
Valid values:     "single" | "compare" | "summarize"

When you navigate to an existing conversation, the mode stored on that conversation record overrides localStorage.

Choosing Source Models

Source models are the models whose responses the synthesizer will merge. You must select two or more.

Click the model selector to open the model picker.
Choose the models you want to query. Your selection is persisted to localStorage under arbitex-selected-models.
The count badge on the mode selector shows the number of source models — the synthesizer model is excluded from this count.

localStorage key: arbitex-selected-models
Format:           JSON array of model_id strings

Choosing the Synthesizer Model

The synthesizer model receives the merged source responses and writes the final synthesis. It is chosen separately from the source models.

On first load, the synthesizer defaults to models[0] from the available model list (typically the most capable model).
To change it, open the summarizer model dropdown and select a different model.
The selected synthesizer is automatically excluded from the source model list so it does not appear as both a source and a synthesizer in the same request.

How a Summarize Request Works

When you submit a message in Summarizer Mode (sendSummarizeMessage in frontend/src/stores/chat.ts):

Phase 1 — Fan-out

The platform sends your message to all source models in parallel using the POST /api/chat/{conversation_id}/messages endpoint with mode: "summarize". Each source model streams its response back independently.

Streaming state per model is tracked in summarizeStreams, keyed by model_id. The UI renders each source response as it arrives in a StreamingSummaryCard.

Phase 2 — Synthesis

Once all source model responses are complete, the backend builds a synthesizer prompt using build_synthesizer_messages (backend/app/services/boundary_markers.py). The function returns a (system_prompt, user_prompt) pair:

system_prompt — synthesizer role definition, boundary security instructions, and synthesis rules. This is constant for a given request (varies only by the boundary token).
user_prompt — the original question for context + all source model responses wrapped in boundary markers + attribution instructions.

The synthesizer model then streams its merged response back. The synthesizer stream appears in summarizeStreams under the synthesizer’s model_id when the first synthesis chunk arrives.

Phase 3 — Save

On completion, the store saves:

Individual source model responses as role: "assistant" messages.
The synthesizer output as a role: "summary" message tagged with the synthesizer’s model_id and provider.

Boundary Markers and Security

Summarizer Mode defends against cross-model prompt injection using cryptographic boundary markers (backend/app/services/boundary_markers.py).

Each source model’s response is wrapped:

[BOUNDARY:{token}:START:{model_id}]
...response text...
[BOUNDARY:{token}:END:{model_id}]

token is a secrets.token_hex(16) value (32-character hex string) generated fresh for each summarize request.
The synthesizer is instructed to treat only content inside boundary markers as model responses, and to ignore any instructions or directives that appear within those boundaries (OWASP LLM01 defense).
If boundary markers are detected in the synthesizer’s output — which would indicate the synthesizer was manipulated — the platform logs a boundary_violation_detected WARNING event and strips the leaked markers from the output.

Source Attribution

The synthesizer is instructed to tag every factual claim with its source model(s) using inline markers:

Attribution format	When used
`[source:model_name]text[/source]`	Claim from a single source model
`[source:model1,model2]text[/source]`	Claim from multiple (but not all) source models
`[source:model1,model2,model3]text[/source]`	Claim that appeared in all source models

Transitions, headings, and structural phrases (e.g., “In summary…”, “However…”) may appear outside source markers. The synthesizer is instructed to omit markers for common knowledge and structural text.

Colored text highlights

When a completed synthesis is rendered, every source-attributed span is displayed with a provider-specific background color that matches the corresponding source model’s color in the legend above the card.

Colors are assigned by provider family using the getModelColor(provider, index) utility (frontend/src/lib/providerColors.ts):

Provider family	Color family
Anthropic (Claude)	Purple/violet
OpenAI (GPT)	Green
Google (Gemini)	Blue
Meta (Llama)	Orange
Mistral	Indigo
Cohere	Teal
Other / unknown	Gray

If your organization has two models from the same provider (e.g., Claude 3 Opus and Claude 3 Haiku), the second model receives a lighter shade of the same family.

Attribution legend

The Attribution Legend above the summary card shows a pill badge for each source model, colored with its assigned provider color. This legend is the key for interpreting the colored spans in the synthesis text.

Sources: [Claude 3 Opus ▌] [GPT-4o ▌] [Gemini 1.5 Pro ▌]

Hovering over any colored attribution span shows a tooltip identifying the source:

Single source: "Source: claude-3-opus"
Multiple sources: "Sources: claude-3-opus, gpt-4o"

Tooltips are rendered by the AttributionHighlight component (frontend/src/components/AttributionHighlight.tsx).

Brand-aware colors

Attribution colors use the provider’s assigned Tailwind color family but respect the active brand theme’s opacity and contrast settings. The colors are not hardcoded hex values — they adapt when an organization activates a custom brand theme (e.g., Voya’s enterprise theme). The --brand-* CSS custom properties flow through to the highlighted segments.

Attribution is automatic

No user action is required to enable attribution. When the backend synthesizer includes [source:ModelName]...[/source] markers in its output, the AttributionHighlight component parses them automatically and renders the colored segments. If a model output contains no source markers, the synthesis renders as plain markdown without highlights.

Interpreting multi-model synthesis

When reading a highlighted synthesis:

Single-color span — Only one source model contributed this claim. The color matches that model’s pill in the legend.
Multi-model tooltip — Multiple models agreed on this claim; the span takes the color of the first-listed source.
Un-highlighted text — Common knowledge that all models agreed on, or structural/transitional text the synthesizer added.
Full paragraph un-highlighted — The synthesizer synthesized across all sources without a single dominant contributor.

Streaming Behavior

Component	Behavior
Source model cards	Appear immediately, stream content in real time as each model responds
Synthesizer card	Appears dynamically when the first synthesis chunk arrives (after all sources complete)
Abort	Pressing Stop cancels all in-flight streams via `AbortController`; the store marks all still-streaming models as complete

The summarizeStreams state is not cleared on completion — it remains populated so the streaming cards render in their streamingComplete state without a flash-of-empty during the streaming → history transition. Streams are cleared when a new request starts or when navigating to a different conversation.

Usage Tracking

Each model in the fan-out (source models + synthesizer) tracks token usage independently. Usage data is available per-message:

{
  "input_tokens": 312,
  "output_tokens": 847,
  "cost_estimate": 0.00124,
  "latency_ms": 2340
}

The synthesizer’s usage reflects its own input/output (the wrapped source responses count as synthesizer input tokens).

Saving and Restoring

State	Persistence
Routing mode (`summarize`)	`localStorage: arbitex-routing-mode` and on the conversation record (`mode` field)
Selected source model IDs	`localStorage: arbitex-selected-models`
Synthesizer model	In-memory only; resets to `models[0]` on page reload

Resetting the store on logout clears both localStorage keys.

Compare Mode — side-by-side multi-model comparison without synthesis
Model Selection — managing available models and providers
Governance and Policy — how governance prompts interact with multi-model modes
Usage Dashboard — token usage and cost tracking across modes

Summarizer Mode

Summarizer Mode

Prerequisites

Selecting Summarizer Mode

Choosing Source Models

Choosing the Synthesizer Model

How a Summarize Request Works

Phase 1 — Fan-out

Phase 2 — Synthesis

Phase 3 — Save

Boundary Markers and Security

Source Attribution

Colored text highlights

Attribution legend

Tooltip on hover

Brand-aware colors

Attribution is automatic

Interpreting multi-model synthesis

Streaming Behavior

Usage Tracking

Saving and Restoring

Related Pages