Benchmark Surface

Model Evaluation

Compare supported sentiment models across available corpora, inspect confusion structure, and review class-level behavior before using them in the public dashboard.

Dataset

Corpus

Switch between evaluation corpora without leaving the page or changing chart semantics.

Selection

Model

Choose a single model to drive the confusion matrix and per-class visual diagnostics.
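
As a rough illustration of what the confusion matrix for the selected model contains, the sketch below counts (true label, predicted label) pairs into a square grid. The label names, the `confusion_matrix` helper, and the sample data are hypothetical; the page itself only states that the matrix is 3 x 3.

```python
# Hypothetical label set; the page only reports a 3 x 3 matrix shape.
LABELS = ["negative", "neutral", "positive"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Count predictions into a grid: rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


# Illustrative data only, not drawn from any corpus listed on the page.
y_true = ["negative", "neutral", "positive", "positive", "neutral"]
y_pred = ["negative", "positive", "positive", "positive", "neutral"]
cm = confusion_matrix(y_true, y_pred)

for label, row in zip(LABELS, cm):
    print(f"{label:>8}: {row}")
```

Under this row/column convention, diagonal cells are correct predictions, and off-diagonal cells show which classes the model confuses with each other.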

Snapshot

Summary

Key model quality metrics for the currently selected corpus and evaluation target.

Dataset: News Corpus

Metric     Value   Note
Model      OpenAI  Selected evaluation target
Accuracy   100.0%  Overall classification accuracy
Precision  100.0%  Macro average
Recall     100.0%  Macro average
F1         100.0%  Macro average

Visuals

Evaluation Visuals

Read prediction concentration and per-class tradeoffs side by side for the active model.

Reference Table

All Model Details

A full comparison table for the loaded corpus, including confusion matrix shape.

Model        Accuracy  Precision (macro)  Recall (macro)  F1 (macro)  Confusion Matrix Shape
Naive Bayes  46.7%     27.8%              46.7%           33.3%       3 x 3
SVM          46.7%     28.9%              46.7%           34.4%       3 x 3
VADER        80.0%     70.0%              80.0%           73.3%       3 x 3
OpenAI       100.0%    100.0%             100.0%          100.0%      3 x 3
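
The macro-averaged columns above can be derived from a confusion matrix. The following is a minimal sketch, not the dashboard's actual implementation. It assumes rows are true classes and columns are predicted classes, takes macro F1 as the unweighted mean of per-class F1 scores, and scores a class as 0.0 when its denominator is zero. The `macro_metrics` helper and both example matrices are hypothetical.

```python
def macro_metrics(cm):
    """Compute accuracy and macro precision/recall/F1 from a square confusion matrix.

    Assumes cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    precisions, recalls, f1s = [], [], []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum: predicted as class i
        actual = sum(cm[i])                          # row sum: truly class i
        p = tp / predicted if predicted else 0.0     # assumed zero-division convention
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical matrices for illustration; the page does not publish raw counts.
perfect = macro_metrics([[5, 0, 0], [0, 5, 0], [0, 0, 5]])
one_class = macro_metrics([[5, 0, 0], [5, 0, 0], [5, 0, 0]])

for name, m in [("perfect", perfect), ("one_class", one_class)]:
    print(name, {k: f"{v:.1%}" for k, v in m.items()})
```

A perfectly diagonal matrix yields 100.0% on every metric. A model that collapses onto one class can keep macro recall well above macro precision, because the classes it never predicts contribute zero precision.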