An incident always crosses sources. The first alert lands in the SIEM, the next clue is a span ID in APM, the smoking gun is a process tree in the EDR, and the attribution lives in a threat-intel platform. The analyst is the integration layer, and that's where most incidents lose minutes.
The pattern we wanted to break
Open four tabs. Search the same hostname four different ways. Reconcile by eyeball. Lose context every time you switch.
What we built
- A federated retrieval layer that fans a single text query out to all four telemetry indices in parallel.
- A cross-encoder re-ranker that merges the pooled results into a single ranked list, taking each result's provenance and recency into account.
- A per-source fallback path so a slow index degrades to recency-only instead of stalling the response.
Query understanding
Queries get rewritten into structured sub-intents: host, actor, time window, and indicator. Each sub-intent biases retrieval toward the index most likely to hold the answer.
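As a rough illustration of the rewrite step, the sketch below pulls the four sub-intents out of a raw query with regular expressions and turns them into per-index weights. The `SubIntents` class, the `host:` / `actor:` prefix syntax, the patterns, and the weight values are all assumptions made for the example; the actual rewriter and biasing scheme may work quite differently.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical patterns; a real rewriter would likely use a model or a richer grammar.
HOST_RE = re.compile(r"\bhost:(\S+)")
ACTOR_RE = re.compile(r"\bactor:(\S+)")
WINDOW_RE = re.compile(r"\blast\s+(\d+)\s*(m|h|d)\b")
HASH_RE = re.compile(r"\b[a-fA-F0-9]{64}\b")  # SHA-256 as one example of an indicator


@dataclass
class SubIntents:
    host: Optional[str] = None
    actor: Optional[str] = None
    window: Optional[str] = None      # e.g. "24h"
    indicator: Optional[str] = None


def parse_sub_intents(text: str) -> SubIntents:
    """Extract whichever sub-intents are present in the raw query."""
    host = HOST_RE.search(text)
    actor = ACTOR_RE.search(text)
    window = WINDOW_RE.search(text)
    indicator = HASH_RE.search(text)
    return SubIntents(
        host=host.group(1) if host else None,
        actor=actor.group(1) if actor else None,
        window=f"{window.group(1)}{window.group(2)}" if window else None,
        indicator=indicator.group(0) if indicator else None,
    )


def index_weights(intents: SubIntents) -> Dict[str, float]:
    """Bias retrieval toward the index assumed most likely to hold each sub-intent."""
    weights = {"log": 1.0, "trace": 1.0, "endpoint": 1.0, "intel": 1.0}
    if intents.host:
        weights["endpoint"] += 0.5   # assumption: host questions lean on endpoint data
    if intents.actor or intents.indicator:
        weights["intel"] += 0.5      # assumption: attribution and indicators lean on threat intel
    return weights
```

For example, `parse_sub_intents("host:web-01 last 24 h")` fills in `host` and `window` and leaves the other two fields as `None`, so only the endpoint index gets a boost.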
The retrieval loop is short:
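# Pseudocode: query_encoder, the four indices, union, and cross_encoder are the components described above.
def search(text, k=200, top_n=20):
    # Encode the analyst's query once and reuse the vector for every index.
    q = query_encoder(text)
    # Pool the nearest neighbours from all four indices into one candidate set.
    cands = union(
        log_index.knn(q, k=k),
        trace_index.knn(q, k=k),
        endpoint_index.knn(q, k=k),
        intel_index.knn(q, k=k),
    )
    # Re-rank the pooled candidates against the raw text and keep the top results.
    return cross_encoder.rank(text, cands)[:top_n]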
Latency budget
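Analysts will tolerate roughly 800ms for first results. We split the budget: 200ms for fan-out, 400ms for retrieval inside each index, 200ms for re-rank. The indices are queried in parallel, so the 400ms retrieval window is shared rather than summed. An index that can't hit its window is skipped cheaply: it falls back to recency-only results instead of stalling the response.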
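One way to enforce the per-index window is to wrap each index call in a timeout and fall back to a recency-only listing when it expires. The sketch below does this with `asyncio`. The `Index` protocol with its async `knn` and `recent` methods, and the `query_one` and `fan_out` helpers, are hypothetical stand-ins rather than the production interfaces.

```python
import asyncio
from typing import Dict, List, Protocol, Sequence, Tuple

RETRIEVAL_BUDGET_S = 0.4  # the 400ms per-index retrieval window


class Index(Protocol):
    # Hypothetical async interface; the real index clients may look different.
    async def knn(self, q: Sequence[float], k: int) -> List[str]: ...
    async def recent(self, k: int) -> List[str]: ...


async def query_one(name: str, index: Index, q: Sequence[float], k: int,
                    budget_s: float) -> Tuple[str, List[str], bool]:
    """Return (source, results, degraded) for a single index."""
    try:
        hits = await asyncio.wait_for(index.knn(q, k), timeout=budget_s)
        return name, hits, False
    except asyncio.TimeoutError:
        # Slow index: degrade to recency-only instead of stalling the response.
        return name, await index.recent(k), True


async def fan_out(indices: Dict[str, Index], q: Sequence[float], k: int = 200,
                  budget_s: float = RETRIEVAL_BUDGET_S
                  ) -> List[Tuple[str, List[str], bool]]:
    """Query every index concurrently, so the window is shared, not summed."""
    tasks = [query_one(n, ix, q, k, budget_s) for n, ix in indices.items()]
    return await asyncio.gather(*tasks)
```

The `degraded` flag is returned alongside the results so that downstream code can tell a recency-only answer from a true nearest-neighbour answer. Note that this sketch does not bound the fallback call itself; it assumes `recent` is cheap enough not to need its own timeout.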
Results with provenance
Every result row renders a small badge showing its source, the recency of that source's index, and the rank score. Without that, analysts mistake "high-rank" for "trusted" and skip evidence from the lower-recency sources.
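A result row that carries its own provenance could look like the sketch below. The `ResultRow` fields, the `format_age` helper, and the badge text format are illustrative assumptions, not the actual schema or UI.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ResultRow:
    # Hypothetical schema: one retrieved item plus the provenance shown to the analyst.
    title: str
    source: str                 # e.g. "SIEM", "APM", "EDR", "TI"
    index_updated_at: datetime  # when the source index was last refreshed
    rank_score: float           # re-ranker score: a measure of relevance, not of trust


def format_age(age: timedelta) -> str:
    """Render an index age as a compact label such as '45s', '12m' or '3h'."""
    seconds = int(age.total_seconds())
    if seconds < 60:
        return f"{seconds}s"
    if seconds < 3600:
        return f"{seconds // 60}m"
    return f"{seconds // 3600}h"


def badge(row: ResultRow, now: datetime) -> str:
    """Build the badge text: source, index recency, and rank score side by side."""
    age = format_age(now - row.index_updated_at)
    return f"[{row.source} | index {age} old | score {row.rank_score:.2f}]"
```

Keeping the score and the index age next to each other in one badge is the point of the design: a high score on a stale index reads differently from the same score on a fresh one.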