An evaluator produces two pieces of information:
- A numeric score representing the quality of the evaluated output. The scale and meaning of the score depend on the specific evaluator used.
- A textual explanation or feedback about the evaluation, providing context for the score. This may include reasons for low scores or suggestions for improvement.
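As a rough mental model, such a result can be represented as a small record pairing the score with its feedback. The `EvaluationResult` name and field names below are illustrative, not a documented schema.

```python
# Illustrative only: EvaluationResult is a hypothetical shape for an
# evaluator's output, not a documented schema.
from dataclasses import dataclass

@dataclass
class EvaluationResult:
    score: float        # numeric quality score; scale depends on the evaluator
    justification: str  # textual feedback explaining the score

result = EvaluationResult(
    score=0.8,
    justification="Relevant answer, but it omits the requested date format.",
)
print(result.score, "-", result.justification)
```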
What Evaluators Do
Evaluators help you:
- Score outputs (e.g., correctness, relevance, tone).
- Generate structured feedback (rubrics, classifications, labels).
- Compare versions of a flow or model configuration (see the sketch after this list).
- Track quality over time as agents evolve.
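For example, comparing two versions of a flow can be as simple as running the same test inputs through each and averaging the scores. The sketch below uses a stand-in `score_output` function in place of a real evaluator.

```python
# Illustrative only: compare two flow versions by mean score over a shared
# test set. score_output stands in for whichever evaluator you actually run.
from statistics import mean

def score_output(output: str) -> float:
    # Placeholder scoring logic: reward longer, more complete answers,
    # capped at 1.0. A real evaluator would apply your actual criteria.
    return min(len(output) / 100, 1.0)

v1_outputs = ["Short answer.", "A longer, more detailed answer with context."]
v2_outputs = ["A thorough response covering every requested point in detail.",
              "Another complete, well-structured answer."]

print("v1 mean score:", mean(score_output(o) for o in v1_outputs))
print("v2 mean score:", mean(score_output(o) for o in v2_outputs))
```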
When to Use an Evaluator
Add an evaluator when you want:
- Automated quality checks
- Regression detection (see the example after this list)
- A quantitative score to rank outputs
- A structured audit trail for later review
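A regression check, for instance, can compare the current run's score against a stored baseline and fail when quality drops beyond a tolerance. The names and thresholds below are hypothetical.

```python
# Hypothetical regression gate: fail when the current score falls more than
# a tolerance below a previously approved baseline.
BASELINE_SCORE = 0.85  # score from a previous, approved run
TOLERANCE = 0.05       # allowed drop before the check fails

def check_regression(current_score: float) -> None:
    floor = BASELINE_SCORE - TOLERANCE
    if current_score < floor:
        raise AssertionError(
            f"Quality regression: {current_score:.2f} is below {floor:.2f}"
        )

check_regression(0.84)  # passes: within tolerance of the baseline
```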
Supported Evaluators
- Annotation Evaluator: Use human-provided annotations on previous runs to evaluate current agent behavior.
- Code Evaluator: Execute custom code to evaluate outputs based on your own logic (a sketch follows this list).
- LLM Evaluator: Use a language model to assess output quality based on criteria you define.
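As an illustration of the Code Evaluator idea, the sketch below scores an output by keyword overlap with an expected answer and returns a score with a justification. The `evaluate` signature and the returned dictionary keys are assumptions for this example, not a documented contract.

```python
# A minimal custom scoring function: keyword overlap between the output and
# the expected answer. Replace the logic with your own criteria.
def evaluate(output: str, expected: str) -> dict:
    expected_terms = set(expected.lower().split())
    found = expected_terms & set(output.lower().split())
    score = len(found) / len(expected_terms) if expected_terms else 0.0
    return {
        "score": score,
        "justification": f"Matched {len(found)} of {len(expected_terms)} expected terms.",
    }

print(evaluate("Paris is the capital of France", "capital France Paris"))
```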

