Summarizer Mode
Summarizer Mode
Section titled “Summarizer Mode”Summarizer Mode sends your prompt to several AI models simultaneously, then routes all of their responses to a dedicated synthesizer model that merges them into a single, comprehensive answer. Each fact in the synthesis is tagged with the source model(s) that contributed it.
This differs from Compare Mode, which shows each model’s response side-by-side without merging. Summarizer Mode is best when you want a unified answer and attribution, not a raw comparison.
Prerequisites
Section titled “Prerequisites”- At least two source models must be selected before you can send in Summarizer Mode.
- A synthesizer model must be designated (one model is auto-assigned from the available model list on first load; it can be changed at any time).
Selecting Summarizer Mode
Section titled “Selecting Summarizer Mode”- Open a conversation or start a new one.
- In the routing mode selector, choose Summarize.
- The mode is persisted to
localStorageunder the keyarbitex-routing-modeand restored on page reload.
localStorage key: arbitex-routing-modeValid values: "single" | "compare" | "summarize"When you navigate to an existing conversation, the mode stored on that conversation record overrides localStorage.
Choosing Source Models
Section titled “Choosing Source Models”Source models are the models whose responses the synthesizer will merge. You must select two or more.
- Click the model selector to open the model picker.
- Choose the models you want to query. Your selection is persisted to
localStorageunderarbitex-selected-models. - The count badge on the mode selector shows the number of source models — the synthesizer model is excluded from this count.
localStorage key: arbitex-selected-modelsFormat: JSON array of model_id stringsChoosing the Synthesizer Model
Section titled “Choosing the Synthesizer Model”The synthesizer model receives the merged source responses and writes the final synthesis. It is chosen separately from the source models.
- On first load, the synthesizer defaults to
models[0]from the available model list (typically the most capable model). - To change it, open the summarizer model dropdown and select a different model.
- The selected synthesizer is automatically excluded from the source model list so it does not appear as both a source and a synthesizer in the same request.
How a Summarize Request Works
Section titled “How a Summarize Request Works”When you submit a message in Summarizer Mode (sendSummarizeMessage in frontend/src/stores/chat.ts):
Phase 1 — Fan-out
Section titled “Phase 1 — Fan-out”The platform sends your message to all source models in parallel using the POST /api/chat/{conversation_id}/messages endpoint with mode: "summarize". Each source model streams its response back independently.
Streaming state per model is tracked in summarizeStreams, keyed by model_id. The UI renders each source response as it arrives in a StreamingSummaryCard.
Phase 2 — Synthesis
Section titled “Phase 2 — Synthesis”Once all source model responses are complete, the backend builds a synthesizer prompt using build_synthesizer_messages (backend/app/services/boundary_markers.py). The function returns a (system_prompt, user_prompt) pair:
- system_prompt — synthesizer role definition, boundary security instructions, and synthesis rules. This is constant for a given request (varies only by the boundary token).
- user_prompt — the original question for context + all source model responses wrapped in boundary markers + attribution instructions.
The synthesizer model then streams its merged response back. The synthesizer stream appears in summarizeStreams under the synthesizer’s model_id when the first synthesis chunk arrives.
Phase 3 — Save
Section titled “Phase 3 — Save”On completion, the store saves:
- Individual source model responses as
role: "assistant"messages. - The synthesizer output as a
role: "summary"message tagged with the synthesizer’smodel_idandprovider.
Boundary Markers and Security
Section titled “Boundary Markers and Security”Summarizer Mode defends against cross-model prompt injection using cryptographic boundary markers (backend/app/services/boundary_markers.py).
Each source model’s response is wrapped:
[BOUNDARY:{token}:START:{model_id}]...response text...[BOUNDARY:{token}:END:{model_id}]tokenis asecrets.token_hex(16)value (32-character hex string) generated fresh for each summarize request.- The synthesizer is instructed to treat only content inside boundary markers as model responses, and to ignore any instructions or directives that appear within those boundaries (OWASP LLM01 defense).
- If boundary markers are detected in the synthesizer’s output — which would indicate the synthesizer was manipulated — the platform logs a
boundary_violation_detectedWARNING event and strips the leaked markers from the output.
Source Attribution
Section titled “Source Attribution”The synthesizer is instructed to tag every factual claim with its source model(s) using inline markers:
| Attribution format | When used |
|---|---|
[source:model_name]text[/source] | Claim from a single source model |
[source:model1,model2]text[/source] | Claim from multiple (but not all) source models |
[source:model1,model2,model3]text[/source] | Claim that appeared in all source models |
Transitions, headings, and structural phrases (e.g., “In summary…”, “However…”) may appear outside source markers. The synthesizer is instructed to omit markers for common knowledge and structural text.
Colored text highlights
Section titled “Colored text highlights”When a completed synthesis is rendered, every source-attributed span is displayed with a provider-specific background color that matches the corresponding source model’s color in the legend above the card.
Colors are assigned by provider family using the getModelColor(provider, index) utility (frontend/src/lib/providerColors.ts):
| Provider family | Color family |
|---|---|
| Anthropic (Claude) | Purple/violet |
| OpenAI (GPT) | Green |
| Google (Gemini) | Blue |
| Meta (Llama) | Orange |
| Mistral | Indigo |
| Cohere | Teal |
| Other / unknown | Gray |
If your organization has two models from the same provider (e.g., Claude 3 Opus and Claude 3 Haiku), the second model receives a lighter shade of the same family.
Attribution legend
Section titled “Attribution legend”The Attribution Legend above the summary card shows a pill badge for each source model, colored with its assigned provider color. This legend is the key for interpreting the colored spans in the synthesis text.
Sources: [Claude 3 Opus ▌] [GPT-4o ▌] [Gemini 1.5 Pro ▌]Tooltip on hover
Section titled “Tooltip on hover”Hovering over any colored attribution span shows a tooltip identifying the source:
- Single source:
"Source: claude-3-opus" - Multiple sources:
"Sources: claude-3-opus, gpt-4o"
Tooltips are rendered by the AttributionHighlight component (frontend/src/components/AttributionHighlight.tsx).
Brand-aware colors
Section titled “Brand-aware colors”Attribution colors use the provider’s assigned Tailwind color family but respect the active brand theme’s opacity and contrast settings. The colors are not hardcoded hex values — they adapt when an organization activates a custom brand theme (e.g., Voya’s enterprise theme). The --brand-* CSS custom properties flow through to the highlighted segments.
Attribution is automatic
Section titled “Attribution is automatic”No user action is required to enable attribution. When the backend synthesizer includes [source:ModelName]...[/source] markers in its output, the AttributionHighlight component parses them automatically and renders the colored segments. If a model output contains no source markers, the synthesis renders as plain markdown without highlights.
Interpreting multi-model synthesis
Section titled “Interpreting multi-model synthesis”When reading a highlighted synthesis:
- Single-color span — Only one source model contributed this claim. The color matches that model’s pill in the legend.
- Multi-model tooltip — Multiple models agreed on this claim; the span takes the color of the first-listed source.
- Un-highlighted text — Common knowledge that all models agreed on, or structural/transitional text the synthesizer added.
- Full paragraph un-highlighted — The synthesizer synthesized across all sources without a single dominant contributor.
Streaming Behavior
Section titled “Streaming Behavior”| Component | Behavior |
|---|---|
| Source model cards | Appear immediately, stream content in real time as each model responds |
| Synthesizer card | Appears dynamically when the first synthesis chunk arrives (after all sources complete) |
| Abort | Pressing Stop cancels all in-flight streams via AbortController; the store marks all still-streaming models as complete |
The summarizeStreams state is not cleared on completion — it remains populated so the streaming cards render in their streamingComplete state without a flash-of-empty during the streaming → history transition. Streams are cleared when a new request starts or when navigating to a different conversation.
Usage Tracking
Section titled “Usage Tracking”Each model in the fan-out (source models + synthesizer) tracks token usage independently. Usage data is available per-message:
{ "input_tokens": 312, "output_tokens": 847, "cost_estimate": 0.00124, "latency_ms": 2340}The synthesizer’s usage reflects its own input/output (the wrapped source responses count as synthesizer input tokens).
Saving and Restoring
Section titled “Saving and Restoring”| State | Persistence |
|---|---|
Routing mode (summarize) | localStorage: arbitex-routing-mode and on the conversation record (mode field) |
| Selected source model IDs | localStorage: arbitex-selected-models |
| Synthesizer model | In-memory only; resets to models[0] on page reload |
Resetting the store on logout clears both localStorage keys.
Related Pages
Section titled “Related Pages”- Compare Mode — side-by-side multi-model comparison without synthesis
- Model Selection — managing available models and providers
- Governance and Policy — how governance prompts interact with multi-model modes
- Usage Dashboard — token usage and cost tracking across modes