search_vectorized schema
sift exposes one tool: search_vectorized. It wraps Brave Search, classifies each result, and returns the augmented SERP.
| Field | Type | Default | Description |
|---|---|---|---|
| `query` | string | required | Search query. |
| `profile` | `"default" \| "academic" \| "dev" \| "no_affiliate"` | `"default"` | Reserved. Currently forwarded to Brave for future Goggle wiring. |
| `safety` | `"lenient" \| "standard" \| "strict"` | `"standard"` | Reserved for future per-call recommend-policy tuning. |
| `max_results` | number (1–20) | 10 | Maximum results to return. |
| `country` | string (ISO-2 uppercase) | env / backend default | SERP localization (e.g., "GB", "JP"). |
| `search_lang` | string (lowercase) | env / backend default | Language preference (e.g., "en", "ja"). |
| `verbosity` | `"full" \| "concise" \| "summary"` | `"concise"` | Output shape. See Verbosity modes. |
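As a rough illustration of the parameter table, the sketch below types the arguments and fills in the documented defaults before a call. The names `SearchVectorizedArgs` and `normalizeArgs` are hypothetical and are not part of sift; the validation rules are only a reading of the table above, and how the tool is actually invoked depends on your client.

```typescript
// Hypothetical types mirroring the parameter table; sift does not export these.
type Profile = "default" | "academic" | "dev" | "no_affiliate";
type Safety = "lenient" | "standard" | "strict";
type Verbosity = "full" | "concise" | "summary";

interface SearchVectorizedArgs {
  query: string;
  profile?: Profile;
  safety?: Safety;
  max_results?: number;
  country?: string;
  search_lang?: string;
  verbosity?: Verbosity;
}

// Fill in the documented defaults and reject values outside the documented ranges.
// country and search_lang stay undefined when omitted, so the env / backend default applies.
function normalizeArgs(args: SearchVectorizedArgs): SearchVectorizedArgs {
  const max = args.max_results ?? 10;
  if (!(max >= 1 && max <= 20)) {
    throw new RangeError("max_results must be between 1 and 20");
  }
  if (args.country !== undefined && !/^[A-Z]{2}$/.test(args.country)) {
    throw new RangeError("country must be a two-letter uppercase code, e.g. GB");
  }
  if (args.search_lang !== undefined && args.search_lang !== args.search_lang.toLowerCase()) {
    throw new RangeError("search_lang must be lowercase, e.g. en");
  }
  return {
    ...args,
    profile: args.profile ?? "default",
    safety: args.safety ?? "standard",
    max_results: max,
    verbosity: args.verbosity ?? "concise",
  };
}

// Only query is required; everything else falls back to the table defaults.
const example = normalizeArgs({ query: "rust async runtime comparison", country: "GB" });
```

Validating on the client side is optional; it mainly turns a malformed call into an early, readable error.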
Output (full mode)
```ts
{
  results: Array<{
    url: string,
    title: string,
    description: string,
    quality_vector: QualityVector,   // see /concepts/quality-vector/
    safety_flag: { threat: "MALWARE" | "SOCIAL_ENGINEERING" | "UNWANTED_SOFTWARE", source: "gsb" } | null,
    recommended_action: "keep" | "tag" | "block"
  }>,
  aggregate_vector: AggregateVector, // see /concepts/aggregate-vector/
  summary_hints: string[],           // see /concepts/summary-hints/
  stats: {
    fetched: number,
    returned: number,
    vectorized: number,
    cache_hits: number,
    llm_calls: number,
    llm_ok: number,
    llm_timeouts: number,
    llm_other_errors: number
  }
}
```

`concise` and `summary` modes trim per-result fields and some aggregate fields. See Verbosity modes for exact shapes.
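One way a caller might consume the full-mode output is to bucket results by `recommended_action`. This is a minimal sketch: `ResultLike` and `partitionByAction` are hypothetical names, only the fields needed here are typed, and the sample data is invented for illustration.

```typescript
type Action = "keep" | "tag" | "block";

// Hypothetical minimal view of one entry in results; other fields are omitted.
interface ResultLike {
  url: string;
  title: string;
  recommended_action: Action;
}

// Group results by recommended_action, preserving SERP order within each bucket.
function partitionByAction<T extends ResultLike>(results: T[]): Record<Action, T[]> {
  const buckets: Record<Action, T[]> = { keep: [], tag: [], block: [] };
  for (const r of results) {
    buckets[r.recommended_action].push(r);
  }
  return buckets;
}

// Invented sample data, not real tool output.
const sample: ResultLike[] = [
  { url: "https://example.org/paper", title: "A paper", recommended_action: "keep" },
  { url: "https://example.com/top-10", title: "Top 10 picks", recommended_action: "block" },
  { url: "https://example.net/blog", title: "Vendor blog", recommended_action: "tag" },
  { url: "https://example.com/deals", title: "Best deals", recommended_action: "block" },
];

const buckets = partitionByAction(sample);
```

A caller could then pass `buckets.keep` downstream as-is, annotate `buckets.tag`, and drop `buckets.block`.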
recommended_action policy (default)
| Condition | Action |
|---|---|
| `safety_flag` set | `block` |
| `tier` ∈ {`affiliate`, `content_farm`} | `block` |
| `tier` ∈ {`vendor_content_marketing`, `unknown`} | `tag` |
| `domain_content_mismatch == true` | `tag` |
| otherwise | `keep` |
The policy lives in src/vectorize.ts#DEFAULT_RECOMMEND_POLICY.
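The default policy table can be read as a first-match-wins rule list. The sketch below is an illustration of that reading, not the contents of `DEFAULT_RECOMMEND_POLICY`: the `PolicyInput` shape and the `recommend` function are hypothetical, and top-to-bottom precedence is an assumption based on the order of the table rows.

```typescript
type Action = "keep" | "tag" | "block";

// Hypothetical flat input; where tier and domain_content_mismatch live in the
// real result object is not assumed here.
interface PolicyInput {
  safety_flag: object | null;
  tier: string;
  domain_content_mismatch: boolean;
}

// Evaluate the table rows top to bottom; the first matching row decides (assumption).
function recommend(input: PolicyInput): Action {
  if (input.safety_flag !== null) return "block";
  switch (input.tier) {
    case "affiliate":
    case "content_farm":
      return "block";
    case "vendor_content_marketing":
    case "unknown":
      return "tag";
  }
  if (input.domain_content_mismatch) return "tag";
  return "keep";
}

// A flagged host is blocked even when the tier alone would be kept.
const flagged = recommend({
  safety_flag: { threat: "MALWARE", source: "gsb" },
  tier: "peer_reviewed",
  domain_content_mismatch: false,
});
```

Under this reading, the safety check runs before any tier check, so no tier value can override a safety flag.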
Safety vs tier — orthogonal
safety_flag is populated by Google Safe Browsing and is orthogonal to tier. A peer_reviewed paper on a compromised host still gets recommended_action=block. A content_farm that isn’t actively malicious still gets tier-classified normally.
stats is diagnostic output for you to inspect. The cache and LLM counters (cache_hits, llm_calls, llm_ok, llm_timeouts, llm_other_errors) show, per request, whether the judge is healthy.
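A simple way to read these counters is to turn them into a success rate for the judge. The sketch below is one possible check, not something sift provides: `llmSuccessRate` is a hypothetical helper, the sample numbers are invented, and the only relationship it relies on is dividing `llm_ok` by `llm_calls`.

```typescript
// Field names follow the stats object in the full-mode output.
interface Stats {
  fetched: number;
  returned: number;
  vectorized: number;
  cache_hits: number;
  llm_calls: number;
  llm_ok: number;
  llm_timeouts: number;
  llm_other_errors: number;
}

// Fraction of LLM calls that succeeded, or null when no LLM call was made
// (for example, when every result was served from cache).
function llmSuccessRate(stats: Stats): number | null {
  return stats.llm_calls === 0 ? null : stats.llm_ok / stats.llm_calls;
}

// Invented sample values for illustration only.
const sampleStats: Stats = {
  fetched: 10,
  returned: 10,
  vectorized: 10,
  cache_hits: 6,
  llm_calls: 4,
  llm_ok: 3,
  llm_timeouts: 1,
  llm_other_errors: 0,
};

const rate = llmSuccessRate(sampleStats);
```

What counts as an acceptable rate is a judgment call for your deployment; a sustained drop, or a rise in `llm_timeouts`, is the kind of signal these counters are meant to surface.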