search_vectorized schema

sift exposes one tool: `search_vectorized`. It wraps Brave Search, classifies each result, and returns the augmented SERP.

Input

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `query` | `string` | required | Search query. |
| `profile` | `"default" \| "academic" \| "dev" \| "no_affiliate"` | `"default"` | Reserved. Currently forwarded to Brave for future Goggle wiring. |
| `safety` | `"lenient" \| "standard" \| "strict"` | `"standard"` | Reserved for future per-call recommend-policy tuning. |
| `max_results` | `number` (1–20) | `10` | Maximum results to return. |
| `country` | `string` (ISO-2 uppercase) | env / backend default | SERP localization (e.g., `"GB"`, `"JP"`). |
| `search_lang` | `string` (lowercase) | env / backend default | Language preference (e.g., `"en"`, `"ja"`). |
| `verbosity` | `"full" \| "concise" \| "summary"` | `"concise"` | Output shape. See Verbosity modes. |
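
The defaults and ranges in the table can be sketched as a small client-side normalizer. This is an illustration only: `SearchVectorizedInput` and `normalizeInput` are hypothetical names rather than sift exports, and treating `max_results` as an integer is an assumption (the table only says `number`).

```typescript
// Hypothetical types mirroring the input table; not taken from sift's source.
type Profile = "default" | "academic" | "dev" | "no_affiliate";
type Safety = "lenient" | "standard" | "strict";
type Verbosity = "full" | "concise" | "summary";

interface SearchVectorizedInput {
  query: string;
  profile?: Profile;
  safety?: Safety;
  max_results?: number;
  country?: string;
  search_lang?: string;
  verbosity?: Verbosity;
}

// Apply the documented defaults and reject values outside the documented ranges.
// country and search_lang are left undefined when omitted, so the
// env / backend default applies.
function normalizeInput(input: SearchVectorizedInput) {
  const max_results = input.max_results ?? 10;
  // Assumption: max_results is an integer in 1..20.
  if (!Number.isInteger(max_results) || max_results < 1 || max_results > 20) {
    throw new RangeError("max_results must be an integer from 1 to 20");
  }
  if (input.country !== undefined && !/^[A-Z]{2}$/.test(input.country)) {
    throw new RangeError('country must be an uppercase ISO-2 code, e.g. "GB"');
  }
  if (
    input.search_lang !== undefined &&
    input.search_lang !== input.search_lang.toLowerCase()
  ) {
    throw new RangeError('search_lang must be lowercase, e.g. "en"');
  }
  return {
    ...input,
    profile: input.profile ?? "default",
    safety: input.safety ?? "standard",
    max_results,
    verbosity: input.verbosity ?? "concise",
  };
}
```

For example, `normalizeInput({ query: "rust async runtimes", country: "GB" })` fills in `profile`, `safety`, `max_results`, and `verbosity` while leaving `search_lang` unset.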

Output (`full` mode)

{
  results: Array<{
    url: string,
    title: string,
    description: string,
    quality_vector: QualityVector,      // see /concepts/quality-vector/
    safety_flag: {
      threat: "MALWARE" | "SOCIAL_ENGINEERING" | "UNWANTED_SOFTWARE",
      source: "gsb"
    } | null,
    recommended_action: "keep" | "tag" | "block"
  }>,
  aggregate_vector: AggregateVector,    // see /concepts/aggregate-vector/
  summary_hints: string[],              // see /concepts/summary-hints/
  stats: {
    fetched: number,
    returned: number,
    vectorized: number,
    cache_hits: number,
    llm_calls: number,
    llm_ok: number,
    llm_timeouts: number,
    llm_other_errors: number
  }
}

`concise` and `summary` modes trim per-result fields and some aggregate fields. See Verbosity modes for exact shapes.
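
As a rough sketch of consuming `full`-mode output, a caller might group results by `recommended_action` before showing them to a model or a user. `ResultLike` and `partitionByAction` are hypothetical helper names, and the type lists only the fields this example reads; trimmed verbosity modes may not carry all of them.

```typescript
type RecommendedAction = "keep" | "tag" | "block";

// Minimal subset of a full-mode result; the real shape has more fields.
interface ResultLike {
  url: string;
  title: string;
  recommended_action: RecommendedAction;
}

// Group results by recommended action, preserving SERP order within each group.
function partitionByAction<T extends ResultLike>(
  results: T[],
): Record<RecommendedAction, T[]> {
  const groups: Record<RecommendedAction, T[]> = { keep: [], tag: [], block: [] };
  for (const r of results) {
    groups[r.recommended_action].push(r);
  }
  return groups;
}
```

A caller could then pass `groups.keep` through unchanged, annotate `groups.tag`, and drop `groups.block`. That handling is a choice made by the caller, not something the schema enforces.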

`recommended_action` policy (default)

| Condition | Action |
| --- | --- |
| `safety_flag` set | `block` |
| `tier ∈ {affiliate, content_farm}` | `block` |
| `tier ∈ {vendor_content_marketing, unknown}` | `tag` |
| `domain_content_mismatch == true` | `tag` |
| otherwise | `keep` |

The policy lives in `src/vectorize.ts#DEFAULT_RECOMMEND_POLICY`.
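
The table can be read as a first-match-wins rule list. The sketch below encodes that reading. It is not the contents of `DEFAULT_RECOMMEND_POLICY`, whose actual shape is not shown here. `PolicyInput` and `recommendAction` are hypothetical names, and evaluating the rows top to bottom is an assumption.

```typescript
type Action = "keep" | "tag" | "block";

interface SafetyFlag {
  threat: "MALWARE" | "SOCIAL_ENGINEERING" | "UNWANTED_SOFTWARE";
  source: "gsb";
}

// Hypothetical flattened view of the fields the policy table refers to.
interface PolicyInput {
  tier: string;
  domain_content_mismatch: boolean;
  safety_flag: SafetyFlag | null;
}

const BLOCK_TIERS = new Set(["affiliate", "content_farm"]);
const TAG_TIERS = new Set(["vendor_content_marketing", "unknown"]);

// Evaluate the policy rows top to bottom; the first matching row decides.
function recommendAction(r: PolicyInput): Action {
  if (r.safety_flag !== null) return "block";
  if (BLOCK_TIERS.has(r.tier)) return "block";
  if (TAG_TIERS.has(r.tier)) return "tag";
  if (r.domain_content_mismatch) return "tag";
  return "keep";
}
```

Under this reading, a `peer_reviewed` result with a safety flag is blocked by the first row, regardless of its tier.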

Safety vs tier — orthogonal

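`safety_flag` is populated by Google Safe Browsing and is orthogonal to `tier`. A `peer_reviewed` paper on a compromised host still gets `recommended_action=block`. A `content_farm` that isn't actively malicious still gets tier-classified normally: its `safety_flag` stays `null`, and under the default policy it is blocked by its tier, not by a safety signal.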

Stats

`stats` is diagnostic output for you to inspect. The cache and LLM counters (`cache_hits`, `llm_calls`, `llm_ok`, `llm_timeouts`, `llm_other_errors`) give per-request visibility into whether the LLM judge is healthy.
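
One way to act on these counters is to derive success and timeout rates for each request. The helper below is a minimal sketch: `judgeHealth` and the `minOkRate` threshold of 0.9 are hypothetical and not part of sift. It makes no assumption about how the error counters sum to `llm_calls`.

```typescript
interface Stats {
  fetched: number;
  returned: number;
  vectorized: number;
  cache_hits: number;
  llm_calls: number;
  llm_ok: number;
  llm_timeouts: number;
  llm_other_errors: number;
}

interface JudgeHealth {
  okRate: number | null;      // null when no LLM calls were made
  timeoutRate: number | null; // null when no LLM calls were made
  healthy: boolean;
}

// Summarize judge health for one request from its stats block.
function judgeHealth(stats: Stats, minOkRate = 0.9): JudgeHealth {
  if (stats.llm_calls === 0) {
    // Nothing was sent to the LLM (e.g. everything came from cache),
    // so there is no failure signal for this request.
    return { okRate: null, timeoutRate: null, healthy: true };
  }
  const okRate = stats.llm_ok / stats.llm_calls;
  const timeoutRate = stats.llm_timeouts / stats.llm_calls;
  return { okRate, timeoutRate, healthy: okRate >= minOkRate };
}
```

A request with 8 LLM calls, of which 6 succeeded and 2 timed out, yields an `okRate` of 0.75 and is reported as unhealthy at the default threshold.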