Methodology

The student model

Labeling a streaming news feed directly with a frontier model is too slow and too costly. Instead we distill: a frontier model labels a teaching set, and a smaller multi-task "student" model learns to reproduce those labels, predicting the relevance gate, domain, impact, and region in a single pass. The result is labeling that tracks the frontier model's judgments at API speed.
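
To make "single pass" concrete, here is a minimal sketch of multi-head labeling in plain Python. It is not the production model. The class names, the two-dimensional feature vector, and the weights are all hypothetical placeholders; the only point is that one shared representation of the article feeds four small heads (relevance gate, domain, impact, region), so the expensive encoding step runs once per article.

```python
# Hypothetical label sets, for illustration only; the real taxonomies are not listed here.
HEADS = {
    "relevant": ["no", "yes"],
    "domain": ["domain_a", "domain_b"],
    "impact": ["low", "high"],
    "region": ["region_x", "region_y"],
}


def linear(weights, features):
    """Return one score per class: the dot product of each weight row with the features."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def argmax(scores):
    """Index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def label_article(features, head_weights):
    """Apply every head to the same shared features and pick the top class per head."""
    labels = {}
    for head, classes in HEADS.items():
        scores = linear(head_weights[head], features)
        labels[head] = classes[argmax(scores)]
    return labels


# Stand-in for the shared encoder output of one article (assumed, not real).
features = [1.0, -0.5]

# Made-up head weights: one row per class, one column per feature.
head_weights = {
    "relevant": [[-1.0, 0.0], [1.0, 0.0]],
    "domain": [[0.0, 1.0], [0.0, -1.0]],
    "impact": [[0.5, 0.5], [0.2, 0.0]],
    "region": [[0.0, 0.0], [1.0, 1.0]],
}

labels = label_article(features, head_weights)
```

In a distillation setup along these lines, the head weights (and the shared encoder) would be trained so that each head's output matches the frontier model's label on the teaching set.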

Sentiment

Sentiment is scored separately by FinBERT (ProsusAI/finbert), a finance-tuned language model, on the article headline. We return both the continuous score (sentiment) and the class (sentimentLabel).
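
As an illustration, the sketch below scores a headline with ProsusAI/finbert through the Hugging Face transformers pipeline and returns the two fields named above. How the continuous score is derived from the class probabilities is not specified here; the sketch assumes P(positive) minus P(negative), which is one common convention and may differ from what the feed actually does. The function names are hypothetical.

```python
def sentiment_from_probs(probs):
    """Collapse class probabilities into the two returned fields.

    Assumption: the continuous score is P(positive) - P(negative), in [-1, 1].
    The class is simply the most probable label.
    """
    score = probs.get("positive", 0.0) - probs.get("negative", 0.0)
    label = max(probs, key=probs.get)
    return {"sentiment": score, "sentimentLabel": label}


def score_headline(headline):
    """Run FinBERT on one headline (downloads the model on first use)."""
    # Imported lazily so the pure scoring logic above has no heavy dependency.
    from transformers import pipeline

    clf = pipeline("text-classification", model="ProsusAI/finbert")
    # top_k=None asks for the score of every class, not just the top one.
    rows = clf([headline], top_k=None)[0]
    probs = {row["label"].lower(): row["score"] for row in rows}
    return sentiment_from_probs(probs)


# Pure-logic example with made-up probabilities (no model download needed).
example = sentiment_from_probs({"positive": 0.7, "negative": 0.1, "neutral": 0.2})
```

Calling `score_headline("...")` on a real headline would return a dict of the same shape, with values that depend on the model.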

Point-in-time

Each article carries its publish day (date). The feed is append-only and labels are fixed at enrichment time, so a query for a past date returns exactly what was known then, with no look-ahead.
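
The point-in-time property can be sketched with an append-only list of immutable records. This is an illustrative model, not the feed's actual storage or API: the `Article` fields other than `date`, the `Feed` class, and the `as_of` method are hypothetical names, and the sketch assumes later appends carry later publish days.

```python
import datetime
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Article:
    """An enriched article; frozen so labels cannot change after enrichment."""
    date: datetime.date
    headline: str
    domain: str  # hypothetical label field


class Feed:
    """Append-only store: rows are added, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def append(self, article):
        self._rows.append(article)

    def as_of(self, day):
        """Everything published on or before `day`."""
        return [a for a in self._rows if a.date <= day]


feed = Feed()
feed.append(Article(datetime.date(2024, 1, 2), "headline one", "domain_a"))
feed.append(Article(datetime.date(2024, 1, 3), "headline two", "domain_b"))

before = feed.as_of(datetime.date(2024, 1, 2))

# A later article arrives; the view of the past does not change.
feed.append(Article(datetime.date(2024, 1, 5), "headline three", "domain_a"))
after = feed.as_of(datetime.date(2024, 1, 2))

# Attempting to relabel an existing record fails.
try:
    before[0].domain = "domain_b"
    relabel_blocked = False
except FrozenInstanceError:
    relabel_blocked = True
```

Because records are never rewritten, a backtest that queries a past day sees the same rows and labels no matter when the query runs.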

What you get vs. raw GDELT

Raw GDELT is an unlabeled firehose. We add the relevance gate (signal vs. noise), the faceting (so you can slice by what matters), and finance-tuned sentiment: the parts that are expensive to build and maintain yourself.
