
Speculative Execution

Multi-strategy parallel ranking

Generate N candidate answers in parallel using different strategies (temperature, framing, style), then have a ranker LLM select the best one. The pattern trades token cost for answer quality.

Speculative Execution applies the "generate multiple candidates and pick the best" principle to LLM responses. Four candidate answers are generated simultaneously, each using a different strategy: conservative (temp 0.0), creative (temp 0.8), analytical (temp 0.0 with structured framing), and concise (temp 0.2). The four generation tasks are dispatched with `Send`, so they run in parallel rather than one after another.
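The fan-out over four strategies can be sketched with the standard library alone. This is a minimal illustration, not the pattern's actual implementation: `ThreadPoolExecutor` stands in for the `Send`-based dispatch, `generate` is a hypothetical stub in place of a real LLM call, and the framing strings are assumptions. Only the strategy names and temperatures come from the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    temperature: float
    framing: str  # hypothetical prompt prefix, not taken from the source


# Names and temperatures follow the text; framings are illustrative assumptions.
STRATEGIES = [
    Strategy("conservative", 0.0, "Answer carefully and factually."),
    Strategy("creative", 0.8, "Answer with an original angle."),
    Strategy("analytical", 0.0, "Answer step by step with explicit structure."),
    Strategy("concise", 0.2, "Answer in as few words as possible."),
]


def generate(question: str, strategy: Strategy) -> dict:
    """Hypothetical stub: a real version would call an LLM with
    strategy.temperature and strategy.framing."""
    text = f"[{strategy.name} @ temp={strategy.temperature}] {question}"
    return {"strategy": strategy.name, "answer": text}


def generate_candidates(question: str) -> list[dict]:
    # One worker per strategy, so all four candidates are produced concurrently.
    with ThreadPoolExecutor(max_workers=len(STRATEGIES)) as pool:
        # pool.map returns results in the same order as STRATEGIES.
        return list(pool.map(lambda s: generate(question, s), STRATEGIES))


candidates = generate_candidates("Why is the sky blue?")
```

Each candidate carries the name of the strategy that produced it, which lets a later ranking step attribute scores to a specific strategy.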

A ranker LLM receives all four candidates and scores each on accuracy, helpfulness, and clarity (1–10 per criterion). The candidate with the highest combined score becomes the final answer. The ranker's scores and reasoning are preserved in state for inspection.
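The selection step reduces to summing three scores per candidate and taking the maximum. The sketch below assumes the ranker's output has already been parsed into dictionaries. The field names (`scores`, `reasoning`), the example numbers, and the tie-breaking rule are all illustrative assumptions, not part of the original pattern.

```python
CRITERIA = ("accuracy", "helpfulness", "clarity")


def pick_best(rankings: list[dict]) -> dict:
    """Pick the candidate with the highest combined score.

    Each entry is assumed to look like:
    {"strategy": str, "scores": {criterion: int in 1..10}, "reasoning": str}
    """
    for entry in rankings:
        for criterion in CRITERIA:
            score = entry["scores"][criterion]
            if not 1 <= score <= 10:
                raise ValueError(f"{criterion} score out of range: {score}")

    totals = {
        entry["strategy"]: sum(entry["scores"][c] for c in CRITERIA)
        for entry in rankings
    }
    # On a tie, max() keeps the first candidate in input order (an assumption).
    winner = max(totals, key=totals.get)
    # Keep the full rankings alongside the winner so they can be inspected later.
    return {"winner": winner, "totals": totals, "rankings": rankings}


# Hypothetical ranker output, for illustration only.
example_rankings = [
    {"strategy": "conservative", "scores": {"accuracy": 8, "helpfulness": 7, "clarity": 8}, "reasoning": "Safe but plain."},
    {"strategy": "creative", "scores": {"accuracy": 6, "helpfulness": 8, "clarity": 7}, "reasoning": "Engaging, less precise."},
    {"strategy": "analytical", "scores": {"accuracy": 9, "helpfulness": 8, "clarity": 8}, "reasoning": "Thorough and structured."},
    {"strategy": "concise", "scores": {"accuracy": 7, "helpfulness": 7, "clarity": 9}, "reasoning": "Clear but thin."},
]

result = pick_best(example_rankings)
```

Returning the per-candidate totals and the ranker's reasoning together with the winner mirrors the idea of preserving the ranking in state for inspection.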

Speculative Execution costs approximately 4× the tokens of a single-shot response, because four candidates are generated instead of one, and the ranker call adds its own tokens on top of that. In return, it produces better answers on tasks where any single strategy reliably underperforms: creative writing, nuanced analysis, or questions that benefit from both depth and concision. It is the highest-cost, highest-quality pattern in this collection.
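A rough token estimate makes the trade-off concrete. The sketch below is a back-of-the-envelope model under stated assumptions: every candidate uses the same prompt and answer length, the ranker reads the question plus all candidates, and the token counts in the example are made up for illustration.

```python
def speculative_cost(
    prompt_tokens: int,
    answer_tokens: int,
    n_candidates: int = 4,
    ranker_output_tokens: int = 200,  # assumed size of scores plus reasoning
) -> dict:
    single_shot = prompt_tokens + answer_tokens
    generation = n_candidates * single_shot
    # Assumption: the ranker prompt contains the question and every candidate answer.
    ranker_tokens = prompt_tokens + n_candidates * answer_tokens + ranker_output_tokens
    return {
        "single_shot": single_shot,
        "generation": generation,
        "generation_ratio": generation / single_shot,
        "ranker_tokens": ranker_tokens,
        "total": generation + ranker_tokens,
    }


# Illustrative numbers only.
cost = speculative_cost(prompt_tokens=200, answer_tokens=300)
```

Under these assumptions the generation step alone accounts for the 4× factor, and the ranker overhead grows with the length of the candidates it has to read.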

