MentorAI Evaluation Dashboard
Run: original-aggregated (Complete)
Updated: 2026-01-30T14:28:45.462987
Per-Criteria Results
Human Spot-Check Agreement
Overall Agreement: - -
Conversations Rated: - -
Conversation | Persona | Rater | Agreement | Disagreements
Criterion Deep-Dive
Show failures for:
B-03: 0
B-04: 0
E-02: 0
F-01: 0
ID | Persona | Criterion | Evidence | Transcript
Per-Conversation Results
ID | Persona | Critical | Quality Score | Status