Benchmark Surface

Model Evaluation

Compare the supported sentiment models across the available corpora, inspect their confusion matrices, and review class-level behavior before using them in the public dashboard.

Dataset

Corpus

Switch between evaluation corpora without leaving the page or changing chart semantics.

Selection

Model

Choose a single model to drive the confusion matrix and per-class visual diagnostics.

Snapshot

Summary

Key model quality metrics for the currently selected corpus and evaluation target.

Dataset: News Corpus

Model: Naive Bayes (selected evaluation target)
Accuracy: 46.7% (overall classification accuracy)
Precision: 27.8% (macro average)
Recall: 46.7% (macro average)
F1: 33.3% (macro average)
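
For readers who want to check how summary figures of this kind are derived, here is a minimal sketch that computes accuracy and macro-averaged precision, recall, and F1 from a confusion matrix. The function name `macro_metrics` and the example matrix are hypothetical and are not taken from this page. The sketch assumes rows are true classes and columns are predicted classes, that macro F1 is the mean of the per-class F1 scores, and that an undefined ratio counts as 0. The page does not state which conventions it uses.

```python
from typing import Dict, List


def macro_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Compute accuracy and macro-averaged precision, recall, and F1.

    cm[i][j] counts examples whose true class is i and predicted class is j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))

    precisions, recalls, f1s = [], [], []
    for i in range(k):
        tp = cm[i][i]
        predicted_i = sum(cm[r][i] for r in range(k))  # column sum
        actual_i = sum(cm[i])                          # row sum
        # Assumption: an undefined ratio (zero denominator) is scored as 0.
        p = tp / predicted_i if predicted_i else 0.0
        r = tp / actual_i if actual_i else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total if total else 0.0,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }


# Hypothetical 3 x 3 matrix with 5 examples per class (not data from this page).
example_cm = [
    [4, 1, 0],
    [2, 2, 1],
    [1, 1, 3],
]
metrics = macro_metrics(example_cm)
```

Because every class in the example has the same number of examples, `metrics["accuracy"]` and `metrics["recall"]` come out equal.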

Visuals

Evaluation Visuals

Read prediction concentration and per-class tradeoffs side-by-side for the active model.
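
As a rough illustration of what these visuals could be built from, the sketch below assembles a confusion matrix from label lists and then computes the share of predictions that falls on each class. Reading "prediction concentration" as that share is an assumption, as are the class names, the helper names `confusion_matrix` and `prediction_share`, and the sample labels. None of these come from the page.

```python
from typing import List, Sequence

# Assumed class names; the page only reports a 3 x 3 matrix shape.
LABELS = ["negative", "neutral", "positive"]


def confusion_matrix(
    y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str]
) -> List[List[int]]:
    """Return cm where cm[i][j] counts true class i predicted as class j."""
    index = {label: i for i, label in enumerate(labels)}
    cm = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        cm[index[truth]][index[pred]] += 1
    return cm


def prediction_share(cm: List[List[int]]) -> List[float]:
    """Fraction of all predictions assigned to each class (column sums / total)."""
    total = sum(sum(row) for row in cm)
    return [sum(row[j] for row in cm) / total for j in range(len(cm))]


# Hypothetical labels, for illustration only.
y_true = ["negative", "negative", "neutral", "neutral", "positive", "positive"]
y_pred = ["negative", "neutral", "neutral", "neutral", "neutral", "positive"]

cm = confusion_matrix(y_true, y_pred, LABELS)
share = prediction_share(cm)
```

In this made-up sample, four of the six predictions land on "neutral", so that class would dominate a concentration chart even though only two examples are truly neutral.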

Reference Table

All Model Details

A full comparison table for the loaded corpus, including confusion matrix shape.

Model        Accuracy  Precision (macro)  Recall (macro)  F1 (macro)  Confusion Matrix Shape
Naive Bayes  46.7%     27.8%              46.7%           33.3%       3 x 3
SVM          46.7%     28.9%              46.7%           34.4%       3 x 3
VADER        80.0%     70.0%              80.0%           73.3%       3 x 3
OpenAI       100.0%    100.0%             100.0%          100.0%      3 x 3
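
The comparison can also be worked with programmatically. The sketch below hard-codes the values from the table above, ranks the models by macro F1, and checks whether accuracy matches macro recall in every row. The tuple layout and the variable names are illustrative choices, not part of the dashboard. Matching accuracy and macro recall is what you get when every class has the same number of examples, but the page does not state the class balance of the corpus.

```python
# (model, accuracy, precision_macro, recall_macro, f1_macro), in percent,
# copied from the table above.
rows = [
    ("Naive Bayes", 46.7, 27.8, 46.7, 33.3),
    ("SVM", 46.7, 28.9, 46.7, 34.4),
    ("VADER", 80.0, 70.0, 80.0, 73.3),
    ("OpenAI", 100.0, 100.0, 100.0, 100.0),
]

# Rank models from best to worst macro F1.
ranked = sorted(rows, key=lambda row: row[4], reverse=True)
ranked_names = [row[0] for row in ranked]

# True if accuracy and macro recall agree in every row (to rounding precision).
acc_equals_recall = all(abs(row[1] - row[3]) < 0.05 for row in rows)

for name, acc, prec, rec, f1 in ranked:
    print(f"{name:<12} acc={acc:5.1f}%  P={prec:5.1f}%  R={rec:5.1f}%  F1={f1:5.1f}%")
```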