Concepts
What we label
How a raw news article becomes a labeled, sentiment-scored row.
The source is GDELT, which catalogs worldwide news in near-real-time. Raw GDELT is a noisy, unlabeled firehose. We turn each article into a structured, market-relevant row in four steps:
- Relevance gate: a distilled student model scores whether the article is market-relevant. Only articles that pass the gate are kept; the score is returned as relevance.
- Faceting: the same model assigns a domain (what it's about), an impact level, and a region.
- Sentiment: FinBERT scores the headline, producing a continuous sentiment value and a sentimentLabel (positive/negative/neutral).
- Store: the article (URL, headline, source, date) plus all labels is written per-article and served from /api/v1/news (see Get Started).
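The four steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the service's implementation: the scorer functions are stand-in stubs rather than the distilled model or FinBERT, and the 0.5 gate threshold, the field names domain, impact, and region, and every function name here are hypothetical. Only relevance, sentiment, and sentimentLabel are names taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class LabeledRow:
    # Article metadata (step 4 stores these alongside the labels).
    url: str
    headline: str
    source: str
    date: str
    # Labels from steps 1-3. domain/impact/region are assumed field names.
    relevance: float
    domain: str
    impact: str
    region: str
    sentiment: float
    sentimentLabel: str


def label_article(
    article: dict,
    score_relevance: Callable[[dict], float],
    assign_facets: Callable[[dict], Tuple[str, str, str]],
    score_sentiment: Callable[[str], Tuple[float, str]],
    threshold: float = 0.5,  # hypothetical cutoff; the real gate is not documented here
) -> Optional[LabeledRow]:
    # Step 1: relevance gate. Articles below the cutoff are dropped.
    relevance = score_relevance(article)
    if relevance < threshold:
        return None
    # Step 2: faceting (domain, impact level, region).
    domain, impact, region = assign_facets(article)
    # Step 3: sentiment is scored on the headline only.
    sentiment, sentiment_label = score_sentiment(article["headline"])
    # Step 4: assemble the per-article row that would be stored and served.
    return LabeledRow(
        url=article["url"],
        headline=article["headline"],
        source=article["source"],
        date=article["date"],
        relevance=relevance,
        domain=domain,
        impact=impact,
        region=region,
        sentiment=sentiment,
        sentimentLabel=sentiment_label,
    )


# Stand-in scorers with made-up outputs, for demonstration only.
def demo_relevance(article: dict) -> float:
    return 0.9 if "rate" in article["headline"].lower() else 0.1


def demo_facets(article: dict) -> Tuple[str, str, str]:
    return ("monetary-policy", "high", "US")


def demo_sentiment(headline: str) -> Tuple[float, str]:
    return (-0.4, "negative")


kept = label_article(
    {"url": "https://example.com/a", "headline": "Central bank raises rate",
     "source": "example.com", "date": "2024-01-01"},
    demo_relevance, demo_facets, demo_sentiment,
)
dropped = label_article(
    {"url": "https://example.com/b", "headline": "Local bake sale",
     "source": "example.com", "date": "2024-01-01"},
    demo_relevance, demo_facets, demo_sentiment,
)
```

Passing the scorers in as functions keeps the gate-then-label ordering visible: an article that fails the gate never reaches the faceting or sentiment steps, which matches the "only articles that pass the gate are kept" rule.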
So instead of scraping and labeling GDELT yourself, you query a clean feed where every row is already relevant, faceted, and sentiment-scored. The labeling model is distilled from frontier-model teacher labels, so it delivers frontier-quality labels at API latency and cost. See Methodology.
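Because every row already carries its labels, client code can filter on them directly. The sketch below works on a made-up sample payload rather than a live call: the response shape (a JSON array of objects), the domain, impact, and region field names, and the select_rows helper are all assumptions for illustration; see Get Started for the actual request and response format.

```python
import json

# Made-up sample payload; values and the array-of-objects shape are assumptions.
SAMPLE_RESPONSE = """
[
  {"url": "https://example.com/a", "headline": "Central bank raises rate",
   "source": "example.com", "date": "2024-01-01",
   "relevance": 0.92, "domain": "monetary-policy", "impact": "high",
   "region": "US", "sentiment": -0.41, "sentimentLabel": "negative"},
  {"url": "https://example.com/b", "headline": "Chipmaker beats forecasts",
   "source": "example.com", "date": "2024-01-01",
   "relevance": 0.85, "domain": "earnings", "impact": "medium",
   "region": "EU", "sentiment": 0.63, "sentimentLabel": "positive"}
]
"""


def select_rows(rows, label, min_relevance=0.0):
    """Keep rows with the given sentimentLabel and at least min_relevance."""
    return [
        row for row in rows
        if row["sentimentLabel"] == label and row["relevance"] >= min_relevance
    ]


rows = json.loads(SAMPLE_RESPONSE)
negative = select_rows(rows, "negative", min_relevance=0.8)
```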