Evaluators are specialized nodes that score, grade, or analyze outputs produced by your graph. They run after upstream nodes complete, so you can measure quality, detect failures, or generate structured feedback. Every evaluator outputs the same two fields:
  • Score (float): A numeric score representing the quality of the evaluated output. The scale and meaning of the score depend on the specific evaluator.
  • Feedback (string): A textual explanation of the evaluation that gives context for the score, such as reasons for a low score or suggestions for improvement.
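
To make this shape concrete, here is a minimal Python sketch of an evaluator result, assuming a plain dataclass representation. The `EvaluationResult` name is hypothetical; only the two fields mirror the documentation above.

```python
from dataclasses import dataclass

@dataclass
class EvaluationResult:
    """Hypothetical container for the two fields every evaluator emits."""
    score: float   # Numeric quality score; the scale depends on the evaluator.
    feedback: str  # Textual explanation that gives context for the score.

# Example: an evaluator flagging an incomplete answer.
result = EvaluationResult(
    score=0.4,
    feedback="Answer is on-topic but omits the requested citation.",
)
print(result.score, result.feedback)
```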

What Evaluators Do

Evaluators help you:
  • Score outputs (e.g., correctness, relevance, tone).
  • Generate structured feedback (rubrics, classifications, labels).
  • Compare versions of a flow or model configuration (see the sketch after this list).
  • Track quality over time as agents evolve.
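
For example, comparing two versions of a flow reduces to scoring the same batch of outputs from each version and aggregating. The sketch below assumes a generic `evaluate(output)` callable that returns an `EvaluationResult` like the one sketched above; both names are illustrative, not a documented API.

```python
from statistics import mean

def compare_versions(evaluate, outputs_a: list[str], outputs_b: list[str]) -> str:
    """Score outputs from two flow versions and report which scores higher on average."""
    mean_a = mean(evaluate(o).score for o in outputs_a)
    mean_b = mean(evaluate(o).score for o in outputs_b)
    print(f"Version A: {mean_a:.2f} | Version B: {mean_b:.2f}")
    return "A" if mean_a >= mean_b else "B"
```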

When to Use an Evaluator

Add an evaluator when you want:
  • Automated quality checks
  • Regression detection (see the sketch after this list)
  • A quantitative score to rank outputs
  • A structured audit trail for later review
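
Regression detection, for instance, can be a simple threshold check against a stored baseline score. As above, the `evaluate` callable and the tolerance value are illustrative assumptions, not part of the product API.

```python
from statistics import mean

def has_regressed(evaluate, outputs: list[str], baseline: float, tolerance: float = 0.05) -> bool:
    """Flag a regression when the mean score drops more than `tolerance` below the baseline."""
    current = mean(evaluate(o).score for o in outputs)
    return current < baseline - tolerance
```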

Supported Evaluators

  • Annotation Evaluator: Use human-provided annotations on previous runs to evaluate current agent behavior.
  • Code Evaluator: Execute custom code to evaluate outputs based on your own logic (a sketch follows this list).
  • LLM Evaluator: Use a language model to assess output quality based on criteria you define.
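
To illustrate the Code Evaluator pattern, the toy function below scores an output by keyword coverage and returns the two standard fields, reusing the hypothetical `EvaluationResult` dataclass sketched earlier. The keyword check stands in for whatever custom logic you would write.

```python
def keyword_coverage_evaluator(output: str, expected_keywords: list[str]) -> EvaluationResult:
    """Toy custom-code evaluator: score is the fraction of expected keywords found."""
    hits = {kw for kw in expected_keywords if kw.lower() in output.lower()}
    score = len(hits) / len(expected_keywords) if expected_keywords else 0.0
    missing = sorted(set(expected_keywords) - hits)
    feedback = "All expected keywords present." if not missing else f"Missing keywords: {missing}"
    return EvaluationResult(score=score, feedback=feedback)
```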