Tier definitions
Tier definitions are in src/llm-judge.ts#VECTOR_SYSTEM_PROMPT. If you edit them, the whole classifier changes.
The 9 tiers
regulated_primary
Official regulatory / government publications, court records, central bank releases, standards-body documents.
Examples: SEC filings on sec.gov/cgi-bin/browse-edgar, FDA guidance at fda.gov/regulatory-information/, court records, Eurostat / BLS / e-Stat Japan statistics, ISO / NIST / W3C specs, ico.org.uk GDPR guidance.
peer_reviewed
Journal articles and working papers with editorial review.
Examples: arXiv, PubMed / PMC, JSTOR, ScienceDirect, SAGE, Taylor & Francis, Frontiers, Nature, Springer, Emerald, INFORMS journal articles. Working papers / preprints hosted on university repositories when the URL path indicates a research paper (/publications/, /papers/, /article/, /doi/, paper-identifier PDFs).
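The URL path hints listed above can be sketched as a simple pre-check. This is a minimal illustration only: the tier decision itself is made by the prompt, and the names `RESEARCH_PATH_HINTS` and `hasResearchPathHint` are hypothetical, not taken from src/llm-judge.ts. The "paper-identifier PDFs" signal is left out because the text does not define its pattern.

```typescript
// Hypothetical pre-check; not part of src/llm-judge.ts.
// Path fragments the text lists as indicating a research paper.
const RESEARCH_PATH_HINTS: string[] = [
  "/publications/",
  "/papers/",
  "/article/",
  "/doi/",
];

// Returns true when the URL path contains one of the listed fragments.
// Unparseable input is treated as "no hint" rather than throwing.
function hasResearchPathHint(rawUrl: string): boolean {
  try {
    const pathname = new URL(rawUrl).pathname.toLowerCase();
    return RESEARCH_PATH_HINTS.some((hint) => pathname.includes(hint));
  } catch {
    return false;
  }
}
```

A hint like this only suggests peer_reviewed; a university news page under a similar path would still need to be classified by its content.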
independent_editorial
Established publications with named editors, a corrections policy, and fact-checking. Applies to their main editorial output, not to sponsored / contributor / council paths.
Examples: BBC, Reuters, The Guardian, Ars Technica, The Atlantic, NYT news, Nikkei news, Wikipedia, university news pages (eller.arizona.edu/news), popular explainers on .edu domains that aren’t peer-reviewed papers.
vendor_primary
A vendor’s own product homepage, product page, documentation, API reference, or official changelog. The content is about the vendor’s own product on the vendor’s own canonical surface.
Examples: MDN (Mozilla docs), rfc-editor.org, a SaaS company’s pricing page, a product’s API reference, davidjteece.com (author’s own site for his work).
vendor_content_marketing
A vendor’s blog or strategic-advice content that teaches concepts adjacent to the vendor’s product. The article is NOT on a dedicated product page; it educates toward the vendor’s domain as a lead-generation mechanism.
Examples: HubSpot blog on marketing, Stripe blog on payments concepts, VC firm thought leadership (a16z, SaaStr, First Round Review), consulting firm “insights” (Bain, Deloitte, KPMG, McKinsey), for-profit university program pages (waldenu.edu/programs/business/resource/...), trade associations publishing “research” about their own industry (see boundary rule 4).
affiliate
“Best X”, “Top N”, “Best X for YEAR”, head-to-head comparisons, roundups, and buying guides, whenever the content pattern is a commercial listicle. Apply regardless of publisher reputation. PCMag / CNET / TechRadar / Wired “Best X 2026” articles ARE affiliate, even though those publishers ALSO publish independent_editorial content elsewhere. Classify by the specific article, not the brand.
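The title patterns above can be approximated with a few regular expressions. The following is a rough sketch under stated assumptions: the classifier applies this rule through the prompt, not through code like this, and `AFFILIATE_TITLE_PATTERNS` and `looksLikeAffiliateTitle` are hypothetical names. The patterns are deliberately coarse and will produce false positives (for example, "the best way to learn X").

```typescript
// Hypothetical heuristic; not part of src/llm-judge.ts.
// Coarse title patterns for commercial listicles.
const AFFILIATE_TITLE_PATTERNS: RegExp[] = [
  /\bbest\s+\w/i,            // "Best X", "Best X for YEAR"
  /\btop\s+\d+\b/i,          // "Top N"
  /\b(vs\.?|versus)\s/i,     // head-to-head comparisons
  /\bbuying guide\b/i,       // buying guides
  /\broundup\b/i,            // roundups
];

// Returns true when the title matches any listicle pattern.
// Publisher reputation is intentionally not an input.
function looksLikeAffiliateTitle(title: string): boolean {
  return AFFILIATE_TITLE_PATTERNS.some((pattern) => pattern.test(title));
}
```

The function takes only the title, which mirrors the rule that classification follows the specific article and not the brand.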
content_farm
Mass-produced, AI-generated, low-effort templated content. Thin wrappers around other sources, auto-translated content, SEO-churn sites.
User-generated content that has low moderation or is community-edited.
Examples: Reddit posts, forum threads, Stack Overflow, Quora answers, Hacker News, individual Medium posts by named authors with no vendor affiliation, community-run wikis.
unknown
Signals don’t allow confident classification. authoritative_weight is low; recommended_action is tag, so the agent knows not to rely on the source.
Boundary rules (six rules enforced by the prompt)
1. Affiliate trumps reputation. “Best X”, “Top N”, comparison listicles → affiliate, even on famous publishers.
2. Vendor blog ≠ vendor primary. Blogs at /blog/ or /resources/ teaching adjacent concepts → vendor_content_marketing, not vendor_primary.
3. URL path override. Paths containing /sponsored/, /partner/, /advertorial/, /branded/, /promoted/, /contributor/, /councils/, /community/, /guest-post/, /opinion/guest/ on otherwise-reputable domains: classify by the article alone. Parent reputation does NOT transfer.
4. Trade associations / lobbies. Groups publishing “research” about their own industry (Corn Refiners on corn syrup, Heartland on climate) → vendor_content_marketing + domain_content_mismatch=true. They are vendors of a policy position.
5. Domain / content mismatch. True when the domain’s implied business strongly differs from the content topic (hospital-linen supplier publishing weight-loss reviews). False for plausibly related or generic domains.
6. Academic / government TLD discipline. .edu, .ac.xx, .gov, .europa.eu, .who.int are NOT automatic tier indicators. Classify by the specific page content:
   - Published paper → peer_reviewed
   - University news / press release / popular explainer → independent_editorial
   - Degree program marketing → vendor_content_marketing
   - Official regulator guidance → regulated_primary
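As an illustration of the URL path override rule, the check below flags URLs whose path contains one of the listed segments. It is a sketch, not the project's implementation: the rule is enforced by the prompt, and `OVERRIDE_PATH_SEGMENTS` and `hasOverridePath` are hypothetical names introduced here. The segment list is copied from the rule text.

```typescript
// Hypothetical pre-check; not part of src/llm-judge.ts.
// Path segments for which parent-domain reputation does not transfer.
const OVERRIDE_PATH_SEGMENTS: string[] = [
  "/sponsored/",
  "/partner/",
  "/advertorial/",
  "/branded/",
  "/promoted/",
  "/contributor/",
  "/councils/",
  "/community/",
  "/guest-post/",
  "/opinion/guest/",
];

// Returns true when the article should be classified on its own merits,
// ignoring the reputation of the host domain.
function hasOverridePath(rawUrl: string): boolean {
  let pathname: string;
  try {
    pathname = new URL(rawUrl).pathname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  // Append a trailing slash so a path ending in ".../sponsored" still matches.
  const normalized = pathname.endsWith("/") ? pathname : pathname + "/";
  return OVERRIDE_PATH_SEGMENTS.some((segment) => normalized.includes(segment));
}
```

A positive result does not pick a tier. It only signals that the page must be judged by the article alone.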
Taxonomy gaps (known)
- Gray-market vendors. Gaming boost providers, tool crack sites, and similar legal-gray commercial actors currently fall through as vendor_primary. A dedicated tier or orthogonal legitimacy axis is on the roadmap.
- Aggregator journalism (e.g., content aggregators with light editorial overlay) sometimes classifies as content_farm when the overlay is light; rules 2 and 3 help but edge cases remain.
If you find a consistent misclassification, it belongs in the observation log — the learning loop will eventually surface it.