Aggregate vector

Per-result classification tells the agent about each source. The aggregate vector tells the agent about the SERP as a whole — whether the landscape is diverse, vendor-dominated, authoritative, or structurally commercial.

interface AggregateVector {
  tier_distribution: Record<Tier, number>; // count per tier, full SERP (not the trimmed max_results)
  mean_editorial_standards: "high" | "medium" | "low" | "unknown";
  mean_authoritative_weight: number; // 0..1, arithmetic mean over results
  diversity_entropy: number; // Shannon entropy of tier_distribution
  vendor_dominance_ratio: number; // (vendor_primary + vendor_content_marketing) / total
}

tier_distribution holds the raw counts per tier. The agent can inspect it directly: “3 peer_reviewed, 7 vendor_content_marketing” communicates more than a summary score can.
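A minimal sketch of how per-tier counts could be tallied. The `Tier` union below is a hypothetical subset built from the tier names that appear on this page, and `ClassifiedResult` and `tierDistribution` are illustrative names, not sift's actual API.

```typescript
// Hypothetical subset of tiers, taken from names mentioned on this page.
// The real Tier union may contain more members.
type Tier =
  | "peer_reviewed"
  | "vendor_primary"
  | "vendor_content_marketing"
  | "affiliate";

// Hypothetical minimal result shape: only the classified tier matters here.
interface ClassifiedResult {
  tier: Tier;
}

function tierDistribution(results: ClassifiedResult[]): Record<Tier, number> {
  // Start every tier at zero so absent tiers are reported as 0, not missing.
  const counts: Record<Tier, number> = {
    peer_reviewed: 0,
    vendor_primary: 0,
    vendor_content_marketing: 0,
    affiliate: 0,
  };
  for (const r of results) {
    counts[r.tier] += 1;
  }
  return counts;
}
```

Initialising every tier to zero keeps the record total, so a consumer can read any tier without checking for a missing key.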

mean_authoritative_weight expresses the SERP’s overall trust level as a single scalar between 0 and 1. Empirical observations across the 5-layer probe suite:

| Layer | Query style | mean_auth |
| --- | --- | --- |
| A (regulated) | GDPR Article 17 right to erasure scope | ~0.70 |
| B (academic phrasing) | transformational leadership meta-analysis effect size | ~0.87 |
| B (general phrasing) | what makes a good leader | ~0.38 |
| C (SaaS operational) | series B saas magic number benchmark | ~0.23 |

Below ~0.3, an agent should treat aggregated claims as commercial positioning, not research.
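As a rough illustration, the arithmetic mean and the ~0.3 rule of thumb could be applied on the agent side like this. The function names are hypothetical, the empty-SERP fallback of 0 is an assumption, and the cutoff is approximate rather than a hard boundary.

```typescript
// Arithmetic mean of per-result authoritative weights (each in 0..1).
function meanAuthoritativeWeight(weights: number[]): number {
  if (weights.length === 0) return 0; // assumption: an empty SERP yields 0
  let sum = 0;
  for (const w of weights) sum += w;
  return sum / weights.length;
}

// The "~0.3" rule of thumb from the text, treated here as a plain cutoff.
const COMMERCIAL_POSITIONING_CUTOFF = 0.3;

// True when aggregated claims should be read as commercial positioning.
function looksLikeCommercialPositioning(meanAuth: number): boolean {
  return meanAuth < COMMERCIAL_POSITIONING_CUTOFF;
}
```

With the observations above, the SaaS operational query (~0.23) would fall below the cutoff, while the general leadership query (~0.38) would not.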

diversity_entropy is the Shannon entropy of the tier distribution. Low entropy means the SERP is dominated by a single tier. High entropy means the SERP spans many tiers.

Important: low entropy is not automatically bad. A SERP of 10/10 peer-reviewed papers has entropy 0.0 and that’s a feature. sift’s summary_hints suppress the “low diversity” warning when mean_authoritative_weight >= 0.7 precisely for this reason.
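A sketch of the entropy calculation and the suppression rule described above. Two things here are assumptions: the logarithm base (base 2, giving bits; the page does not state a base) and the `lowEntropyCutoff` parameter, which is hypothetical because the threshold that counts as "low" is not given. Only the 0.7 figure comes from the text.

```typescript
// Shannon entropy of a count distribution, in bits.
// Assumption: log base 2. The base is not stated in the source.
function shannonEntropy(counts: Record<string, number>): number {
  const keys = Object.keys(counts);
  let total = 0;
  for (const k of keys) total += counts[k];
  if (total === 0) return 0;

  let entropy = 0;
  for (const k of keys) {
    const c = counts[k];
    if (c <= 0) continue; // empty tiers contribute nothing
    const p = c / total;
    entropy -= p * (Math.log(p) / Math.LN2);
  }
  return entropy;
}

// Warn about low diversity only when the SERP is not already authoritative.
// lowEntropyCutoff is a hypothetical parameter; 0.7 is the figure from the text.
function shouldWarnLowDiversity(
  entropy: number,
  meanAuthoritativeWeight: number,
  lowEntropyCutoff: number,
): boolean {
  if (meanAuthoritativeWeight >= 0.7) return false; // suppressed
  return entropy < lowEntropyCutoff;
}
```

A SERP of 10 peer-reviewed results gives `shannonEntropy({ peer_reviewed: 10 })` equal to 0, and the warning stays off as long as the mean authoritative weight is at least 0.7.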

vendor_dominance_ratio is the fraction of results classified as vendor_primary or vendor_content_marketing. It is not the same as a “commercial ratio”: affiliate results aren’t counted here. See Summary hints for the hints that fire at 50%+ and at 90%+ without non-commercial alternatives.
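The formula from the interface comment, (vendor_primary + vendor_content_marketing) / total, can be sketched as below. The `Tier` union is again a hypothetical subset based on names mentioned on this page, and returning 0 for an empty SERP is an assumption.

```typescript
// Hypothetical subset of tiers; the real union may contain more members.
type Tier =
  | "peer_reviewed"
  | "vendor_primary"
  | "vendor_content_marketing"
  | "affiliate";

function vendorDominanceRatio(dist: Record<Tier, number>): number {
  // Total over every tier, including those that are not vendor tiers.
  let total = 0;
  for (const tier of Object.keys(dist) as Tier[]) {
    total += dist[tier];
  }
  if (total === 0) return 0; // assumption: an empty SERP yields 0

  // Only the two vendor tiers count toward the numerator;
  // affiliate results are deliberately excluded.
  return (dist.vendor_primary + dist.vendor_content_marketing) / total;
}
```

For a SERP with 2 vendor_primary, 3 vendor_content_marketing, and 5 affiliate results, the ratio is 0.5 even though every result is commercial in some sense.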

Aggregate is over the full SERP, not the trimmed output

When max_results=5 is passed but sift fetched 10 results, the aggregate is computed over all 10. The agent should see the real landscape even when it asked for a smaller trimmed view. This is deliberate: aggregate metrics computed over too few results are noisy and misleading.
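The ordering of the two steps is the point: aggregate first, trim second. A minimal sketch, assuming a hypothetical `ScoredResult` shape and a hypothetical `buildResponse` helper, and assuming that trimming keeps the first max_results entries:

```typescript
// Hypothetical result shape, for illustration only.
interface ScoredResult {
  url: string;
  authoritativeWeight: number; // 0..1
}

interface TrimmedResponse {
  results: ScoredResult[]; // at most maxResults entries
  mean_authoritative_weight: number; // computed over everything fetched
}

function buildResponse(fetched: ScoredResult[], maxResults: number): TrimmedResponse {
  // Step 1: aggregate over the full SERP.
  let sum = 0;
  for (const r of fetched) sum += r.authoritativeWeight;
  const mean = fetched.length === 0 ? 0 : sum / fetched.length;

  // Step 2: trim only the per-result list (assumption: keep the first N).
  return {
    results: fetched.slice(0, maxResults),
    mean_authoritative_weight: mean,
  };
}
```

If the five highest-ranked results all had weight 1.0 and the next five had weight 0.0, the trimmed view would show five strong sources while the aggregate would still report 0.5.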