Support templates in our string evaluators #60

maraisr · 2025-06-07T06:32:02Z

Much like our web prompt tooling, the models CLI should support templated variables in string evaluators.

Before

Running test case 1/1...
  ✗ FAILED
    Model Response: Goodbye! Take care and see you next time! 🌍👋
    ✗ string evaluator (score: 0.00)
      Expected to contain: '{{expected}}' ⬅️⬅️⬅️⬅️
    ✓ similarity check (score: 0.25)
      LLM evaluation matched choice: '2'

After

Running test case 1/1...
  ✗ FAILED
    Model Response: Hello there! How can I assist you today? 😊
    ✓ string evaluator (score: 1.00)
      Expected to contain: 'hello' ⬅️⬅️⬅️⬅️
    ✗ similarity check (score: 0.00)
      LLM evaluation matched choice: '1'
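
The difference between the two runs comes down to resolving `{{expected}}` against the test case before the comparison happens. A minimal sketch of that substitution step follows. The helper name `substitutePlaceholders`, its signature, and the placeholder grammar are all assumptions made for illustration; this is not the `templateString` implementation from the PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches {{name}}, allowing whitespace inside the braces.
// The accepted grammar is an assumption for this sketch.
var placeholder = regexp.MustCompile(`\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}`)

// substitutePlaceholders is a hypothetical helper: it replaces every
// {{name}} in s with the matching value from vars and returns an error
// naming the first placeholder that has no value.
func substitutePlaceholders(s string, vars map[string]interface{}) (string, error) {
	missing := ""
	out := placeholder.ReplaceAllStringFunc(s, func(m string) string {
		name := placeholder.FindStringSubmatch(m)[1]
		v, ok := vars[name]
		if !ok {
			if missing == "" {
				missing = name
			}
			return m // leave the placeholder untouched
		}
		return fmt.Sprint(v)
	})
	if missing != "" {
		return "", fmt.Errorf("no value for placeholder %q", missing)
	}
	return out, nil
}

func main() {
	testCase := map[string]interface{}{"expected": "hello"}
	got, err := substitutePlaceholders("{{expected}}", testCase)
	fmt.Println(got, err)
}
```

Returning an error for an unknown placeholder is one possible choice; silently leaving it in place would reproduce the "Before" output above.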

Copilot

Pull Request Overview

Support templated variables in string evaluators for the CLI, allowing test cases to inject dynamic values into string checks.

- Introduces `string` fields in example prompts and switches the evaluator's `contains` to use `{{string}}`.
- Updates the `runStringEvaluator` signature to accept a `testCase` map and applies `templateString` in each comparison.
- Adjusts tests to pass an empty `testCase` into `runStringEvaluator`.
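
To make the shape of that change concrete, here is a rough sketch of a string check that receives the test case and fills in its criteria before comparing. The types `stringCriteria` and `checkResult`, the functions `fill` and `checkString`, and the case-insensitive comparison (inferred from the sample output, where `'hello'` matches "Hello there!") are assumptions for illustration and do not mirror the actual code in `cmd/eval/eval.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// stringCriteria and checkResult are simplified, hypothetical stand-ins
// for the evaluator config and result types.
type stringCriteria struct {
	Contains   string
	StartsWith string
}

type checkResult struct {
	Passed  bool
	Score   float64
	Details string
}

// fill replaces each {{key}} with the test case value for that key.
// Placeholders without a matching key are left as-is in this sketch.
func fill(s string, testCase map[string]interface{}) string {
	for k, v := range testCase {
		s = strings.ReplaceAll(s, "{{"+k+"}}", fmt.Sprint(v))
	}
	return s
}

// checkString templates the first non-empty criterion, then compares it
// against the response, ignoring case (an assumption).
func checkString(c stringCriteria, testCase map[string]interface{}, response string) checkResult {
	resp := strings.ToLower(response)
	passed, details := false, "no criteria specified"
	switch {
	case c.Contains != "":
		want := fill(c.Contains, testCase)
		passed = strings.Contains(resp, strings.ToLower(want))
		details = fmt.Sprintf("Expected to contain: '%s'", want)
	case c.StartsWith != "":
		want := fill(c.StartsWith, testCase)
		passed = strings.HasPrefix(resp, strings.ToLower(want))
		details = fmt.Sprintf("Expected to start with: '%s'", want)
	}
	score := 0.0
	if passed {
		score = 1.0
	}
	return checkResult{Passed: passed, Score: score, Details: details}
}

func main() {
	criteria := stringCriteria{Contains: "{{string}}"}
	testCase := map[string]interface{}{"string": "hello"}
	r := checkString(criteria, testCase, "Hello there! How can I assist you today?")
	fmt.Printf("%+v\n", r)
}
```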

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
examples/sample_prompt.yml	Added `string` keys under `testData` and updated evaluator `contains`
cmd/eval/eval_test.go	Changed `runStringEvaluator` calls to include the new `testCase` param
cmd/eval/eval.go	Refactored `runStringEvaluator` to template all string criteria

Comments suppressed due to low confidence (2)

examples/sample_prompt.yml:9

[nitpick] The key `string` in `testData` is very generic and may be confused with the evaluator's `string` block. Consider renaming it to something more descriptive, like `templateVar` or `placeholder`.

    string: hello

cmd/eval/eval_test.go:132

There are no tests covering the new templating functionality or error paths when a placeholder is missing. Consider adding tests that include {{string}} substitutions and missing-key scenarios to validate templateString behavior.

result, err := handler.runStringEvaluator("test", tt.evaluator, map[string]interface{}{}, tt.response)

cmd/eval/eval.go

maraisr · 2025-06-07T07:27:21Z

+		if err != nil {
+			return EvaluationResult{}, fmt.Errorf("failed to template message content: %w", err)
+		}


Yeah, I agree with Copilot that this looks a little gnarly, but with the way Go wants to handle errors we kinda need this everywhere.

Unless we create a helper method here so that, should there be an error, we just default back to the provided string. Open to suggestions/opinions.
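
For discussion, the fallback helper floated here might look roughly like the sketch below. Both `render` (a stand-in for whatever templating call the evaluator uses) and `renderOrOriginal` are hypothetical names, and the rule that any leftover `{{` counts as an error is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// render is a hypothetical stand-in for the real templating call. It
// fails when any placeholder is left unresolved after substitution.
func render(s string, vars map[string]string) (string, error) {
	out := s
	for name, value := range vars {
		out = strings.ReplaceAll(out, "{{"+name+"}}", value)
	}
	if strings.Contains(out, "{{") {
		return "", errors.New("unresolved placeholder in " + s)
	}
	return out, nil
}

// renderOrOriginal wraps render so call sites need no error branch:
// on failure it returns the provided string unchanged.
func renderOrOriginal(s string, vars map[string]string) string {
	out, err := render(s, vars)
	if err != nil {
		return s
	}
	return out
}

func main() {
	vars := map[string]string{"expected": "hello"}
	fmt.Println(renderOrOriginal("{{expected}}", vars))
	fmt.Println(renderOrOriginal("{{missing}}", vars))
}
```

The trade-off is that a swallowed error leaves the raw placeholder in the comparison, so a typo in a variable name would show up as a failed check (as in the "Before" output) rather than as an explicit templating error.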

feat: support templates in our string evaluators

8fc32d7

Copilot AI review requested due to automatic review settings June 7, 2025 06:32

maraisr requested a review from a team as a code owner June 7, 2025 06:32

Copilot AI reviewed Jun 7, 2025

test: ensure we have tests for template evaluators

77cdbf5

maraisr requested a review from sgoedecke June 7, 2025 06:37
