Custom Evals

In BotDojo, an Evaluation is a Flow with a special End Node that returns the evaluation result in a standard format.

Setting Up a Flow as an Evaluation

To set up a flow as an evaluation in BotDojo, follow these steps:

  1. Create a flow as you normally would, making sure the flow's inputs are clearly defined in the start node. These inputs will need to be mapped later.

  2. Click on the gear icon in the designer to open the flow settings.

  3. In the flow settings, navigate to the "Advanced" tab and select "Eval Config".

  4. Click the "Add Evaluation Result End Node" button. This will add a special end node to your flow that you need to connect to the rest of your flow.

Then toggle the "Use as Eval" switch.

Evaluation Config

  • Name: Name your Evaluation.
  • Variable Name: Should be recognizable but can't contain special characters or spaces.
    Warning: Currently, the Variable Name of an eval must be unique. You can't hook up two evaluations with the same variable name.

  • Description: Describe what the evaluation checks.
  • Category: If you plan to share your evaluation on the BotDojo Hub, this sets the categories it will appear under.
  • Evaluation Output Type:
    • metric: The Evaluation will return a numerical metric, such as a score from 1 to 10.
    • pass/fail: The Evaluation will only return a pass or fail.
    • category: The Evaluation will return a category, such as "Needs More Information" or "Missing Links".
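
The constraints above can be sketched as a small validation helper. This is a minimal illustration, not BotDojo's API: the `EvalConfig` class, its field names, and the `validate_config` function are hypothetical. The rule that a Variable Name may contain only letters, digits, and underscores is also an assumption, since the text above says only that special characters and spaces are not allowed.

```python
import re
from dataclasses import dataclass

# Output types listed under "Evaluation Output Type".
OUTPUT_TYPES = {"metric", "pass/fail", "category"}

# Assumption: "no special characters or spaces" is read here as
# letters, digits, and underscores only.
VARIABLE_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


@dataclass
class EvalConfig:
    """Hypothetical container mirroring the Eval Config fields."""
    name: str
    variable_name: str
    description: str
    category: str
    output_type: str


def validate_config(config, existing_variable_names):
    """Return a list of problems; an empty list means the config looks valid."""
    problems = []
    if not VARIABLE_NAME_RE.fullmatch(config.variable_name):
        problems.append("variable name contains spaces or special characters")
    if config.variable_name in existing_variable_names:
        problems.append("variable name is already used by another evaluation")
    if config.output_type not in OUTPUT_TYPES:
        problems.append("unknown output type: " + config.output_type)
    return problems
```

Passing the set of Variable Names already in use makes the uniqueness rule from the warning explicit: a config that is otherwise valid is still rejected if its Variable Name collides with another evaluation's.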

Evaluation End Node

  • pass: Set this to true or false, or to null if an error occurred while processing the evaluation.
  • reason (optional): Provide a reason for why the evaluation passed or failed.
  • type: Specify the type of evaluation. It can be one of the following:
    • "metric": Returns a numeric score.
    • "pass/fail": Returns a boolean value indicating whether the evaluation passed or failed.
    • "category": Returns a set of values or categories.
  • error: If an error occurred during the evaluation, provide an error message here.
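
As an illustration of the fields above, the following sketch builds a result dictionary with the same shape. The helper `make_eval_result` and its parameter names are hypothetical; only the keys `pass`, `reason`, `type`, and `error` come from the list above, and the way BotDojo actually receives these values in the designer may differ.

```python
VALID_TYPES = {"metric", "pass/fail", "category"}


def make_eval_result(passed, eval_type, reason=None, error=None):
    """Build a dict with the End Node fields (hypothetical helper).

    passed: True, False, or None (None when the evaluation itself errored).
    """
    if eval_type not in VALID_TYPES:
        raise ValueError("unknown evaluation type: " + repr(eval_type))
    if passed is not None and not isinstance(passed, bool):
        raise TypeError("passed must be True, False, or None")
    # "pass" is a Python keyword, so the result is a plain dict
    # rather than a class with a `pass` attribute.
    result = {"pass": passed, "type": eval_type}
    if reason is not None:
        result["reason"] = reason
    if error is not None:
        result["error"] = error
    return result


# A passing result with an explanation.
ok = make_eval_result(True, "pass/fail", reason="Answer cites a source.")

# The evaluation could not run, so pass is None and error is set.
failed_run = make_eval_result(None, "pass/fail", error="Model call timed out.")
```

Keeping `pass` as None when `error` is set separates "the evaluation failed" from "the evaluation could not be computed", which is the distinction the `pass` and `error` fields draw.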