
What we label

How a raw news article becomes a labeled, sentiment-scored row.

The source is GDELT, which catalogs worldwide news in near real time. Raw GDELT is a noisy, unlabeled firehose. We turn each article that passes a relevance gate into a structured, labeled row in four steps:

  1. Relevance gate: a distilled student model scores how market-relevant the article is. Only articles that pass the gate are kept; the score is returned as relevance.
  2. Faceting: the same model assigns a domain (what it's about), an impact level, and a region.
  3. Sentiment: FinBERT scores the headline, producing a continuous sentiment value and a sentimentLabel (positive / negative / neutral).
  4. Store: the article (URL, headline, source, date) plus all labels is written per-article and served from /api/v1/news (see Get Started).
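
The four steps above can be sketched as a single function. This is a minimal illustration, not the production pipeline: the LabeledRow class, the label_article function, the threshold of 0.5, the field names impact and region, and the three stand-in scorers are all hypothetical. The real gate and faceting run on the distilled student model and the real sentiment step runs on FinBERT; here they are passed in as plain callables so the flow is easy to follow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class LabeledRow:
    # Article metadata written in step 4.
    url: str
    headline: str
    source: str
    date: str
    # Labels produced by steps 1 to 3.
    relevance: float
    domain: str
    impact: str          # field name is an assumption
    region: str          # field name is an assumption
    sentiment: float
    sentimentLabel: str


def label_article(
    article: Dict[str, str],
    score_relevance: Callable[[Dict[str, str]], float],
    assign_facets: Callable[[Dict[str, str]], Dict[str, str]],
    score_sentiment: Callable[[str], Tuple[float, str]],
    threshold: float = 0.5,  # assumed cutoff; the real gate value is not documented here
) -> Optional[LabeledRow]:
    # Step 1: relevance gate. Articles scoring below the cutoff are dropped.
    relevance = score_relevance(article)
    if relevance < threshold:
        return None
    # Step 2: faceting (domain, impact level, region).
    facets = assign_facets(article)
    # Step 3: sentiment is scored on the headline only.
    sentiment, sentiment_label = score_sentiment(article["headline"])
    # Step 4: assemble the row that would be stored and served.
    return LabeledRow(
        url=article["url"],
        headline=article["headline"],
        source=article["source"],
        date=article["date"],
        relevance=relevance,
        domain=facets["domain"],
        impact=facets["impact"],
        region=facets["region"],
        sentiment=sentiment,
        sentimentLabel=sentiment_label,
    )


# Stand-in scorers for illustration only; they return fixed, made-up values.
def fake_relevance(article: Dict[str, str]) -> float:
    return 0.9 if "rate" in article["headline"].lower() else 0.1


def fake_facets(article: Dict[str, str]) -> Dict[str, str]:
    return {"domain": "monetary-policy", "impact": "high", "region": "EU"}


def fake_sentiment(headline: str) -> Tuple[float, str]:
    return (-0.4, "negative")
```

Passing the scorers in as arguments keeps the gate-then-label order explicit: an article that fails step 1 returns None and never reaches faceting or sentiment scoring.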

So instead of scraping and labeling GDELT yourself, you query a clean feed where every row is already relevant, faceted, and sentiment-scored. The labeling model is distilled from labels produced by a frontier teacher model, with the aim of frontier-quality labels at the latency and cost of an ordinary API call. See Methodology.
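
Because every row already carries its labels, downstream filtering can be a few lines. The sketch below assumes the rows from /api/v1/news have already been decoded into a list of dictionaries; the exact response shape is not described on this page, so treat the keys beyond relevance, sentiment, and sentimentLabel as assumptions. The select_rows helper and the sample rows are hypothetical, made-up data for illustration.

```python
from typing import Any, Dict, List


def select_rows(
    rows: List[Dict[str, Any]],
    sentiment_label: str,
    min_relevance: float = 0.0,
) -> List[Dict[str, Any]]:
    # Keep rows with the requested sentimentLabel and at least min_relevance.
    return [
        row
        for row in rows
        if row.get("sentimentLabel") == sentiment_label
        and row.get("relevance", 0.0) >= min_relevance
    ]


# Made-up rows in the assumed shape; not real feed output.
sample_rows = [
    {"headline": "Example headline A", "relevance": 0.92,
     "sentiment": -0.61, "sentimentLabel": "negative"},
    {"headline": "Example headline B", "relevance": 0.55,
     "sentiment": -0.20, "sentimentLabel": "negative"},
    {"headline": "Example headline C", "relevance": 0.88,
     "sentiment": 0.47, "sentimentLabel": "positive"},
]

# Negative rows with a relevance score of at least 0.8.
negative_rows = select_rows(sample_rows, "negative", min_relevance=0.8)
```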