npm - promptfoo - Versions diffs - 0.2.2 → 0.3.0 - Mend

promptfoo 0.2.2 → 0.3.0

Files changed (30)

package/README.md CHANGED

@@ -13,6 +13,15 @@ With promptfoo, you can:
 - Use as a command line tool, or integrate into your workflow as a library
 - Use OpenAI API models (built-in support), or integrate custom API providers for any LLM API
+**» [View docs on website](https://promptfoo.dev/docs/intro) «**
+promptfoo works by producing matrix views that allow you to quickly review prompt outputs across many inputs.  The goal: tune prompts systematically across all relevant test cases, instead of testing prompts one-off.
+Here's an example of a side-by-side comparison of multiple prompts and inputs.  You can manually review outputs, or set up "expectations" that automatically flag bad outputs.
+![Prompt evaluation matrix](https://user-images.githubusercontent.com/310310/236690475-b05205e8-483e-4a6d-bb84-41c2b06a1247.png)
 ## Usage (command line)
 To get started, run the following command:
@@ -32,15 +41,16 @@ npx promptfoo eval
 If you're looking to customize your usage, you have the full set of parameters at your disposal:
 ```bash
-npx promptfoo eval -p <prompt_paths...> -o <output_path> -r <providers> [-v <vars_path>] [-j <max_concurrency] [-c <config_path>]
+npx promptfoo eval -p <prompt_paths...> -o <output_path> -r <providers> [-v <vars_path>] [-j <max_concurrency] [-c <config_path>] [--grader <grading_provider>]
 ```
 - `<prompt_paths...>`: Paths to prompt file(s)
 - `<output_path>`: Path to output CSV, JSON, YAML, or HTML file. Defaults to terminal output
 - `<providers>`: One or more of: `openai:<model_name>`, or filesystem path to custom API caller module
 - `<vars_path>` (optional): Path to CSV, JSON, or YAML file with prompt variables
-- `<max_concurrency>` (optional): Number of simultaneous API requests. Defaults to 3
+- `<max_concurrency>` (optional): Number of simultaneous API requests. Defaults to 4
 - `<config_path>` (optional): Path to configuration file
+- `<grading_provider>`: A provider that handles the grading process, if you are using [LLM grading](#expected-outputs)
 ### Examples
@@ -64,7 +74,9 @@ This command will evaluate the prompts in `prompts.txt`, substituing the variabl
 Have a look at the setup and full output [here](https://github.com/typpo/promptfoo/tree/main/examples/assistant-cli).
-You can run the command without an `-o` option to output in your terminal ([example](https://user-images.githubusercontent.com/310310/235329207-e8c22459-5f51-4fee-9714-1b602ac3d7ca.png)), or use `-o` to specify an HTML ([example](https://user-images.githubusercontent.com/310310/235483444-4ddb832d-e103-4b9c-a862-b0d6cc11cdc0.png)), CSV ([example](https://docs.google.com/spreadsheets/d/1nanoj3_TniWrDl1Sj-qYqIMD6jwm5FBy15xPFdUTsmI/edit?usp=sharing)), JSON ([example](https://github.com/typpo/promptfoo/blob/main/examples/simple-cli/output.json)), or YAML output.
+You can also output a nice [spreadsheet](https://docs.google.com/spreadsheets/d/1nanoj3_TniWrDl1Sj-qYqIMD6jwm5FBy15xPFdUTsmI/edit?usp=sharing), [JSON](https://github.com/typpo/promptfoo/blob/main/examples/simple-cli/output.json), YAML, or an HTML file:
+![Table output](https://user-images.githubusercontent.com/310310/235483444-4ddb832d-e103-4b9c-a862-b0d6cc11cdc0.png)
 #### Model quality
@@ -164,9 +176,9 @@ You can use [Nunjucks](https://mozilla.github.io/nunjucks/) templating syntax to
 Example of a single prompt file with multiple prompts (`prompts.txt`):
 ```
-Translate the following text to French: "{{text}}"
+Translate the following text to French: "{{name}}: {{text}}"
 ---
-Translate the following text to German: "{{text}}"
+Translate the following text to German: "{{name}}: {{text}}"
 ```
 Example of multiple prompt files:
@@ -174,13 +186,13 @@ Example of multiple prompt files:
 - `prompt1.txt`:
   ```
-  Translate the following text to French: "{{text}}"
+  Translate the following text to French: "{{name}}: {{text}}"
   ```
 - `prompt2.txt`:
   ```
-  Translate the following text to German: "{{text}}"
+  Translate the following text to German: "{{name}}: {{text}}"
   ```
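To make the prompt-file convention concrete, here is a minimal sketch of how a single file holding several prompts could be split and filled in with variables. It assumes the separator is a line containing only `---`, as the README example suggests. `splitPrompts` and `renderSimple` are hypothetical helpers, not promptfoo functions; `renderSimple` handles only plain `{{name}}` placeholders and is a simplified stand-in for Nunjucks, which the package itself calls through `nunjucks.renderString`.

```javascript
// Hypothetical helper: split a prompts file on lines that contain only '---'.
function splitPrompts(fileContents) {
  return fileContents
    .split(/^---$/m)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
}

// Hypothetical stand-in for nunjucks.renderString: replaces {{ key }} with vars[key].
// Unknown keys render as an empty string.
function renderSimple(template, vars) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    key in vars ? String(vars[key]) : ''
  );
}

const promptsTxt = [
  'Translate the following text to French: "{{name}}: {{text}}"',
  '---',
  'Translate the following text to German: "{{name}}: {{text}}"',
].join('\n');

const prompts = splitPrompts(promptsTxt);
const rendered = prompts.map((p) =>
  renderSimple(p, { name: 'Bob', text: 'Hello, world!' })
);
console.log(rendered);
```

Each row of the vars file would be rendered against every prompt in the same way, which is what produces the matrix view described at the top of the README.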
 ### Vars File
@@ -192,24 +204,27 @@ Vars are substituted by [Nunjucks](https://mozilla.github.io/nunjucks/) templati
 Example of a vars file (`vars.csv`):
 ```
-text
-"Hello, world!"
-"Goodbye, everyone!"
+"name","text"
+"Bob","Hello, world!"
+"Joe","Goodbye, everyone!"
 ```
 Example of a vars file (`vars.json`):
 ```json
-[{ "text": "Hello, world!" }, { "text": "Goodbye, everyone!" }]
+[
+  { "name": "Bob", "text": "Hello, world!" },
+  { "name": "Joe", "text": "Goodbye, everyone!" }
+]
 ```
-### Expected Value
+### Expected Outputs
-You can specify an expected value for each test case to evaluate the success or failure of the model's output. To do this, add a special field called `__expected` in the `vars` file. The `__expected` field supports three types of value comparisons:
+You can specify an expected value for each test case to evaluate the success or failure of the model's output. To do this, add a special field called `__expected` in the `vars` file. The `__expected` field supports these types of value comparisons:
 1. If the expected value starts with `eval:`, it will evaluate the contents as the body of a JavaScript function defined like: `function(output) { <eval> }`. The function should return a boolean value, where `true` indicates success and `false` indicates failure.
-2. If the expected value starts with `grade:`, it will call the `gradeOutput(prompt, output)` function. You should assume this function exists and returns a boolean value, where `true` indicates success and `false` indicates failure.
+2. If the expected value starts with `grade:`, it will ask an LLM to evaluate whether the output meets the condition. For example: `grade: don't mention being an AI`. This option requires a provider name to be supplied to promptfoo via the `--grader` argument: `promptfoo --grader openai:gpt-4 ...`.
 3. Otherwise, it attempts an exact string match comparison between the expected value and the model's output.
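The three comparison modes above can be sketched as a single dispatch function. This is a simplified, synchronous illustration modeled on the `checkExpectedValue` method in the `evaluator.js` diff, not the package's code: the real method is `async` and sends `grade:` rubrics to an LLM provider, whereas here the grader is a hypothetical callback (`stubGrade`) and the `Boolean(...)` coercion is an addition of this sketch.

```javascript
// Sketch of the __expected dispatch. `grade` is a hypothetical callback
// standing in for the LLM grading call.
function checkExpected(expected, output, grade) {
  if (expected.startsWith('eval:')) {
    // As in the evaluator diff, the text after 'eval:' is prefixed with `return`.
    const body = expected.slice('eval:'.length);
    const fn = new Function('output', `return ${body}`);
    return { pass: Boolean(fn(output)) };
  }
  if (expected.startsWith('grade:')) {
    return grade(expected.slice('grade:'.length), output);
  }
  const pass = expected === output;
  return {
    pass,
    reason: pass ? undefined : `Expected: ${expected}, Output: ${output}`,
  };
}

// Hypothetical grader used only for this illustration.
const stubGrade = (rubric, output) => ({
  pass: !/banana|apple(?!s?\b.*pine)/i.test(output.replace(/pineapple/gi, '')),
  reason: `Checked against rubric: ${rubric}`,
});

console.log(checkExpected("eval:output.includes('Au revoir');", 'Au revoir, tout le monde', stubGrade));
console.log(checkExpected('Bonjour le monde', 'Bonjour', stubGrade));
console.log(checkExpected("grade:doesn't reference any fruits besides pineapple", 'I am a pineapple', stubGrade));
```

One consequence of the `return` prefix shown in the evaluator code: an `eval:` value that already begins with `return` expands to `return return ...`, which `new Function` rejects with a `SyntaxError`. Under that wrapping, writing the value as a bare expression (`eval:output.includes('Au revoir');`) is the form that evaluates.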
@@ -219,6 +234,7 @@ Example of a vars file with the `__expected` field (`vars.csv`):
 text,__expected
 "Hello, world!","Bonjour le monde"
 "Goodbye, everyone!","eval:return output.includes('Au revoir');"
+"I am a pineapple","grade:doesn't reference any fruits besides pineapple"
 ```
 Example of a vars file with the `__expected` field (`vars.json`):
@@ -227,6 +243,7 @@ Example of a vars file with the `__expected` field (`vars.json`):
 [
   { "text": "Hello, world!", "__expected": "Bonjour le monde" },
   { "text": "Goodbye, everyone!", "__expected": "eval:output.includes('Au revoir');" }
+  { "text": "I am a pineapple", "__expected": "grade:doesn't reference any fruits besides pineapple" }
 ]
 ```
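As a rough illustration of how such rows might be consumed, the sketch below parses a vars array, strips `__expected` from the copy used for display (mirroring `varsWithExpectedKeyRemoved` in the `evaluator.js` diff), and labels each row by comparison type. `expectationKind` is a hypothetical helper introduced here. The sample adds the comma that JSON requires between the second and third objects, which the README snippet as shown omits.

```javascript
const varsJson = `[
  { "text": "Hello, world!", "__expected": "Bonjour le monde" },
  { "text": "Goodbye, everyone!", "__expected": "eval:output.includes('Au revoir');" },
  { "text": "I am a pineapple", "__expected": "grade:doesn't reference any fruits besides pineapple" }
]`;
const rows = JSON.parse(varsJson);

// Copy each row without the special field, as the evaluator does before
// building the output table.
const displayVars = rows.map((row) => {
  const copy = { ...row };
  delete copy.__expected;
  return copy;
});

// Hypothetical helper: name the comparison a given __expected value selects.
function expectationKind(expected) {
  if (expected.startsWith('eval:')) return 'eval';
  if (expected.startsWith('grade:')) return 'grade';
  return 'exact';
}

const kinds = rows.map((row) => expectationKind(row.__expected));
console.log(displayVars);
console.log(kinds);
```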
@@ -297,6 +314,8 @@ Other OpenAI-related environment variables are supported:
 - `OPENAI_TEMPERATURE` - temperature model parameter, defaults to 0
 - `OPENAI_MAX_TOKENS` - max_tokens model parameter, defaults to 1024
+- `OPENAI_API_HOST` - override the hostname for the API request. Useful for proxies like Helicone.
+- `REQUEST_TIMEOUT_MS` - maximum request time, in milliseconds (defaults to 10000)
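A minimal sketch of how these environment variables could be read with their documented defaults follows. `readOpenAiSettings` is a hypothetical helper and this diff does not show how promptfoo itself parses the values. The defaults for temperature, max tokens, and timeout come from the README lines above; the fallback host `api.openai.com` and the proxy hostname in the usage line are assumptions of this example.

```javascript
// Hypothetical helper: collect OpenAI-related settings from an env-like object.
function readOpenAiSettings(env) {
  // Use the fallback when the variable is unset, empty, or not a finite number.
  const numberOr = (value, fallback) => {
    const n = Number(value);
    return value !== undefined && value !== '' && Number.isFinite(n) ? n : fallback;
  };
  return {
    temperature: numberOr(env.OPENAI_TEMPERATURE, 0),
    maxTokens: numberOr(env.OPENAI_MAX_TOKENS, 1024),
    apiHost: env.OPENAI_API_HOST || 'api.openai.com', // assumed default host
    requestTimeoutMs: numberOr(env.REQUEST_TIMEOUT_MS, 10000),
  };
}

// In a real program this would be called with process.env.
const settings = readOpenAiSettings({
  OPENAI_API_HOST: 'proxy.example.com',
  REQUEST_TIMEOUT_MS: '30000',
});
console.log(settings);
```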
 The OpenAI provider supports the following model formats:

package/dist/evaluator.d.ts CHANGED

@@ -1,3 +1,3 @@
-import { EvaluateOptions, EvaluateSummary } from './types.js';
+import type { EvaluateOptions, EvaluateSummary } from './types.js';
 export declare function evaluate(options: EvaluateOptions): Promise<EvaluateSummary>;
 //# sourceMappingURL=evaluator.d.ts.map

package/dist/evaluator.d.ts.map CHANGED

@@ -1 +1 @@
-{"version":3,"file":"evaluator.d.ts","sourceRoot":"","sources":["../src/evaluator.ts"],"names":[],"mappings":"AAKA,OAAO,EAAE,eAAe,EAAE,eAAe,EAAuC,MAAM,YAAY,CAAC;AA2EnG,wBAAsB,QAAQ,CAAC,OAAO,EAAE,eAAe,GAAG,OAAO,CAAC,eAAe,CAAC,CAqIjF"}
+{"version":3,"file":"evaluator.d.ts","sourceRoot":"","sources":["../src/evaluator.ts"],"names":[],"mappings":"AAMA,OAAO,KAAK,EAEV,eAAe,EAGf,eAAe,EAGhB,MAAM,YAAY,CAAC;AA2SpB,wBAAgB,QAAQ,CAAC,OAAO,EAAE,eAAe,4BAGhD"}

package/dist/evaluator.js CHANGED

@@ -1,175 +1,246 @@
 import async from 'async';
 import nunjucks from 'nunjucks';
-const DEFAULT_MAX_CONCURRENCY = 3;
-function checkExpectedValue(expected, output) {
-    if (expected.startsWith('eval:')) {
-        const evalBody = expected.slice(5);
-        const evalFunction = new Function('output', `return ${evalBody}`);
-        return evalFunction(output);
-    }
-    else if (expected.startsWith('grade:')) {
-        // NYI
-        return false;
+import { DEFAULT_GRADING_PROMPT } from './prompts.js';
+const DEFAULT_MAX_CONCURRENCY = 4;
+class Evaluator {
+    constructor(options) {
+        this.options = options;
+        this.stats = {
+            successes: 0,
+            failures: 0,
+            tokenUsage: {
+                total: 0,
+                prompt: 0,
+                completion: 0,
+            },
+        };
     }
-    else {
-        return expected === output;
+    async gradeOutput(expected, output) {
+        const { grading } = this.options;
+        if (!grading) {
+            throw new Error('Cannot grade output without grading config. Specify --grader option or grading config.');
+        }
+        const prompt = nunjucks.renderString(grading.prompt || DEFAULT_GRADING_PROMPT, {
+            content: output,
+            rubric: expected,
+        });
+        const resp = await grading.provider.callApi(prompt);
+        if (resp.error || !resp.output) {
+            return {
+                pass: false,
+                reason: resp.error || 'No output',
+                tokensUsed: {
+                    total: resp.tokenUsage?.total || 0,
+                    prompt: resp.tokenUsage?.prompt || 0,
+                    completion: resp.tokenUsage?.completion || 0,
+                },
+            };
+        }
+        try {
+            const parsed = JSON.parse(resp.output);
+            parsed.tokensUsed = {
+                total: resp.tokenUsage?.total || 0,
+                prompt: resp.tokenUsage?.prompt || 0,
+                completion: resp.tokenUsage?.completion || 0,
+            };
+            return parsed;
+        }
+        catch (err) {
+            return {
+                pass: false,
+                reason: `Output is not valid JSON: ${resp.output}`,
+                tokensUsed: {
+                    total: resp.tokenUsage?.total || 0,
+                    prompt: resp.tokenUsage?.prompt || 0,
+                    completion: resp.tokenUsage?.completion || 0,
+                },
+            };
+        }
     }
-}
-async function runEval({ provider, prompt, vars, includeProviderId, }) {
-    vars = vars || {};
-    const renderedPrompt = nunjucks.renderString(prompt, vars);
-    // Note that we're using original prompt, not renderedPrompt
-    const promptDisplay = includeProviderId ? `[${provider.id()}] ${prompt}` : prompt;
-    const setup = {
-        prompt: {
-            raw: renderedPrompt,
-            display: promptDisplay,
-        },
-        vars,
-    };
-    try {
-        const response = await provider.callApi(renderedPrompt);
-        const ret = {
-            ...setup,
-            response,
-            success: false,
-        };
-        if (response.error) {
-            ret.error = response.error;
+    async checkExpectedValue(expected, output) {
+        if (expected.startsWith('eval:')) {
+            const evalBody = expected.slice(5);
+            const evalFunction = new Function('output', `return ${evalBody}`);
+            return { pass: evalFunction(output) };
         }
-        else if (response.output) {
-            const matchesExpected = vars.__expected
-                ? checkExpectedValue(vars.__expected, response.output)
-                : true;
-            if (!matchesExpected) {
-                ret.error = `Expected ${vars.__expected}, got "${response.output}"`;
-            }
-            ret.success = matchesExpected;
+        else if (expected.startsWith('grade:')) {
+            const gradingResult = await this.gradeOutput(expected.slice(6), output);
+            return {
+                pass: gradingResult.pass,
+                reason: gradingResult.reason,
+            };
         }
         else {
-            ret.success = false;
-            ret.error = 'No output';
+            const pass = expected === output;
+            return {
+                pass,
+                reason: pass ? undefined : `Expected: ${expected}, Output: ${output}`,
+            };
         }
-        return ret;
     }
-    catch (err) {
-        return {
-            ...setup,
-            error: String(err),
-            success: false,
+    async runEval({ provider, prompt, vars, includeProviderId, }) {
+        vars = vars || {};
+        const renderedPrompt = nunjucks.renderString(prompt, vars);
+        // Note that we're using original prompt, not renderedPrompt
+        const promptDisplay = includeProviderId ? `[${provider.id()}] ${prompt}` : prompt;
+        const setup = {
+            prompt: {
+                raw: renderedPrompt,
+                display: promptDisplay,
+            },
+            vars,
         };
-    }
-}
-export async function evaluate(options) {
-    const prompts = [];
-    const results = [];
-    for (const promptContent of options.prompts) {
-        for (const provider of options.providers) {
-            prompts.push({
-                raw: promptContent,
-                display: options.providers.length > 1 ? `[${provider.id()}] ${promptContent}` : promptContent,
-            });
+        try {
+            const response = await provider.callApi(renderedPrompt);
+            const ret = {
+                ...setup,
+                response,
+                success: false,
+            };
+            if (response.error) {
+                ret.error = response.error;
+            }
+            else if (response.output) {
+                const checkResult = vars.__expected
+                    ? await this.checkExpectedValue(vars.__expected, response.output)
+                    : { pass: true };
+                if (!checkResult.pass) {
+                    ret.error = checkResult.reason || `Expected: ${vars.__expected}`;
+                }
+                ret.success = checkResult.pass;
+            }
+            else {
+                ret.success = false;
+                ret.error = 'No output';
+            }
+            // Update token usage stats
+            this.stats.tokenUsage.total += response.tokenUsage?.total || 0;
+            this.stats.tokenUsage.prompt += response.tokenUsage?.prompt || 0;
+            this.stats.tokenUsage.completion += response.tokenUsage?.completion || 0;
+            if (ret.success) {
+                this.stats.successes++;
+            }
+            else {
+                this.stats.failures++;
+            }
+            return ret;
+        }
+        catch (err) {
+            return {
+                ...setup,
+                error: String(err),
+                success: false,
+            };
         }
     }
-    const vars = options.vars && options.vars.length > 0 ? options.vars : [{}];
-    const varsWithExpectedKeyRemoved = vars.map((v) => {
-        const ret = { ...v };
-        delete ret.__expected;
-        return ret;
-    });
-    const isTest = vars[0].__expected;
-    const table = [
-        isTest
-            ? [
-                'RESULT',
-                [...prompts.map((p) => p.display), ...Object.keys(varsWithExpectedKeyRemoved[0])],
-            ].flat()
-            : [...prompts.map((p) => p.display), ...Object.keys(varsWithExpectedKeyRemoved[0])],
-    ];
-    const stats = {
-        successes: 0,
-        failures: 0,
-        tokenUsage: {
-            total: 0,
-            prompt: 0,
-            completion: 0,
-        },
-    };
-    let progressbar;
-    if (options.showProgressBar) {
-        const totalNumRuns = options.prompts.length * options.providers.length * (options.vars?.length || 1);
-        const cliProgress = await import('cli-progress');
-        progressbar = new cliProgress.SingleBar({
-            format: 'Eval: [{bar}] {percentage}% | ETA: {eta}s | {value}/{total} | {provider} "{prompt}" {vars}',
-        }, cliProgress.Presets.shades_classic);
-        progressbar.start(totalNumRuns, 0, {
-            provider: '',
-            prompt: '',
-            vars: '',
-        });
-    }
-    const runEvalOptions = [];
-    for (const row of vars) {
+    async evaluate() {
+        const options = this.options;
+        const prompts = [];
         for (const promptContent of options.prompts) {
             for (const provider of options.providers) {
-                runEvalOptions.push({
-                    provider,
-                    prompt: promptContent,
-                    vars: row,
-                    includeProviderId: options.providers.length > 1,
+                const display = options.providers.length > 1 ? `[${provider.id()}] ${promptContent}` : promptContent;
+                prompts.push({
+                    raw: promptContent,
+                    display,
                 });
             }
         }
-    }
-    const combinedOutputs = new Array(vars.length).fill(null).map(() => []);
-    await async.forEachOfLimit(runEvalOptions, options.maxConcurrency || DEFAULT_MAX_CONCURRENCY, async (options, index) => {
-        const row = await runEval(options);
-        results[index] = row;
-        if (row.error) {
-            stats.failures++;
+        const vars = options.vars && options.vars.length > 0 ? options.vars : [{}];
+        const varsWithExpectedKeyRemoved = vars.map((v) => {
+            const ret = { ...v };
+            delete ret.__expected;
+            return ret;
+        });
+        const isTest = vars[0].__expected;
+        const table = [
+            [...prompts.map((p) => p.display), ...Object.keys(varsWithExpectedKeyRemoved[0])],
+        ];
+        let progressbar;
+        if (options.showProgressBar) {
+            const totalNumRuns = options.prompts.length * options.providers.length * (options.vars?.length || 1);
+            const cliProgress = await import('cli-progress');
+            progressbar = new cliProgress.SingleBar({
+                format: 'Eval: [{bar}] {percentage}% | ETA: {eta}s | {value}/{total} | {provider} "{prompt}" {vars}',
+            }, cliProgress.Presets.shades_classic);
+            progressbar.start(totalNumRuns, 0, {
+                provider: '',
+                prompt: '',
+                vars: '',
+            });
         }
-        else {
-            if (row.success) {
-                stats.successes++;
+        const runEvalOptions = [];
+        for (const row of vars) {
+            for (const promptContent of options.prompts) {
+                for (const provider of options.providers) {
+                    runEvalOptions.push({
+                        provider,
+                        prompt: promptContent,
+                        vars: row,
+                        includeProviderId: options.providers.length > 1,
+                    });
+                }
             }
-            else {
-                stats.failures++;
-            }
-            stats.tokenUsage.total += row.response?.tokenUsage?.total || 0;
-            stats.tokenUsage.prompt += row.response?.tokenUsage?.prompt || 0;
-            stats.tokenUsage.completion += row.response?.tokenUsage?.completion || 0;
         }
+        const tempResults = [];
+        const combinedOutputs = new Array(vars.length).fill(null).map(() => []);
+        await async.forEachOfLimit(runEvalOptions, options.maxConcurrency || DEFAULT_MAX_CONCURRENCY, async (options, index) => {
+            const row = await this.runEval(options);
+            //results[index as number] = row;
+            tempResults.push({ index: index, row });
+            if (progressbar) {
+                progressbar.increment({
+                    provider: options.provider.id(),
+                    prompt: options.prompt.slice(0, 10),
+                    vars: Object.entries(options.vars || {})
+                        .map(([k, v]) => `${k}=${v}`)
+                        .join(' ')
+                        .slice(0, 10),
+                });
+            }
+            // Bookkeeping for table
+            if (typeof index !== 'number') {
+                throw new Error('Expected index to be a number');
+            }
+            const combinedOutputIndex = Math.floor(index / prompts.length);
+            combinedOutputs[combinedOutputIndex].push(row.response?.output || row.error || '');
+        });
         if (progressbar) {
-            progressbar.increment({
-                provider: options.provider.id(),
-                prompt: options.prompt.slice(0, 10),
-                vars: Object.entries(options.vars || {})
-                    .map(([k, v]) => `${k}=${v}`)
-                    .join(' ')
-                    .slice(0, 10),
+            progressbar.stop();
+        }
+        const results = [];
+        tempResults
+            .sort((a, b) => a.index - b.index)
+            .forEach(({ index, row }) => {
+            results[index] = row;
+        });
+        // TODO(ian): Provide full context in table cells, and have the caller
+        // construct the table contents itself.
+        if (isTest) {
+            // Iterate through each combined output
+            combinedOutputs.forEach((output, index) => {
+                // Create a new array to store the modified output with [PASS] or [FAIL] prepended
+                const modifiedOutput = [];
+                // Iterate through each output value and prepend [PASS] or [FAIL] based on the success status
+                output.forEach((o, outputIndex) => {
+                    const resultIndex = index * prompts.length + outputIndex;
+                    const result = results[resultIndex];
+                    // TODO(ian): sometimes output and result.error can be identical (in the case of exception)
+                    const resultStatus = result.success ? `[PASS] ${o}` : `[FAIL] ${result.error}\n---\n${o}`;
+                    modifiedOutput.push(resultStatus);
+                });
+                // Add the modified output and the corresponding values from varsWithExpectedKeyRemoved to the table
+                const tableRow = [...modifiedOutput, ...Object.values(varsWithExpectedKeyRemoved[index])];
+                table.push(tableRow);
             });
         }
-        // Bookkeeping for table
-        if (typeof index !== 'number') {
-            throw new Error('Expected index to be a number');
+        else {
+            table.push(...combinedOutputs.map((output, index) => [...output, ...Object.values(vars[index])]));
         }
-        const combinedOutputIndex = Math.floor(index / prompts.length);
-        combinedOutputs[combinedOutputIndex].push(row.response?.output || row.error || '');
-    });
-    if (progressbar) {
-        progressbar.stop();
-    }
-    // TODO(ian): Display errors in table UI.
-    if (isTest) {
-        table.push(...combinedOutputs.map((output, index) => [
-            results[index].success ? 'PASS' : `FAIL: ${results[index].error}`,
-            ...output,
-            ...Object.values(varsWithExpectedKeyRemoved[index]),
-        ]));
+        return { results, stats: this.stats, table };
     }
-    else {
-        table.push(...combinedOutputs.map((output, index) => [...output, ...Object.values(vars[index])]));
-    }
-    return { results, stats, table };
+}
+export function evaluate(options) {
+    const ev = new Evaluator(options);
+    return ev.evaluate();
 }
 //# sourceMappingURL=evaluator.js.map
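
The grading path added in this version can be condensed into a few small functions. The sketch below is an illustration modeled on `gradeOutput` and the table bookkeeping in the diff above, not the package's code: `tokensUsedFrom`, `parseGradingResponse`, `rowIndexFor`, and `formatCell` are hypothetical names, and the sample responses are made up. It shows the three outcomes of a grading call (provider error or empty output, valid JSON, invalid JSON), how a flat run index maps to a table row, and how a cell is labeled when the vars file carries `__expected`.

```javascript
// Normalize token usage, defaulting missing counts to 0.
function tokensUsedFrom(resp) {
  return {
    total: resp.tokenUsage?.total || 0,
    prompt: resp.tokenUsage?.prompt || 0,
    completion: resp.tokenUsage?.completion || 0,
  };
}

// Turn a grading provider response into { pass, reason, tokensUsed }.
function parseGradingResponse(resp) {
  const tokensUsed = tokensUsedFrom(resp);
  if (resp.error || !resp.output) {
    return { pass: false, reason: resp.error || 'No output', tokensUsed };
  }
  try {
    // The grader is expected to answer with JSON carrying `pass` and `reason`.
    const parsed = JSON.parse(resp.output);
    return { ...parsed, tokensUsed };
  } catch (err) {
    return { pass: false, reason: `Output is not valid JSON: ${resp.output}`, tokensUsed };
  }
}

// Each table row holds one cell per prompt/provider combination, so the
// row for a flat run index is the integer quotient.
function rowIndexFor(index, columnsPerRow) {
  return Math.floor(index / columnsPerRow);
}

// Label a cell the way the test-mode table does.
function formatCell(result, output) {
  return result.success ? `[PASS] ${output}` : `[FAIL] ${result.error}\n---\n${output}`;
}

const ok = parseGradingResponse({
  output: '{"pass": true, "reason": "No other fruit mentioned"}',
  tokenUsage: { total: 12, prompt: 8, completion: 4 },
});
const notJson = parseGradingResponse({ output: 'Looks fine to me' });
const failed = parseGradingResponse({ error: 'Request timed out' });

console.log(ok, notJson, failed);
console.log(rowIndexFor(5, 2));
console.log(formatCell({ success: false, error: notJson.reason }, 'Looks fine to me'));
```

Treating an unparseable grader reply as a failure, rather than throwing, keeps one malformed response from aborting the whole evaluation run; the failure reason then surfaces in the `[FAIL]` cell.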

package/dist/evaluator.js.map CHANGED

	@@ -1 +1 @@
1	- {"version":3,"file":"evaluator.js","sourceRoot":"","sources":["../src/evaluator.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,MAAM,OAAO,CAAC;AAC1B,OAAO,QAAQ,MAAM,UAAU,CAAC;~~AAahC~~,MAAM,uBAAuB,GAAG,CAAC,CAAC;AAElC,SAAS,~~kBAAkB~~,CAAC,QAAgB,EAAE,MAAc;~~IAC1D~~,IAAI,QAAQ,CAAC,UAAU,CAAC,OAAO,CAAC,EAAE;~~QAChC~~,MAAM,QAAQ,GAAG,QAAQ,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;~~QACnC~~,MAAM,YAAY,GAAG,IAAI,QAAQ,CAAC,QAAQ,EAAE,UAAU,QAAQ,EAAE,CAAC,CAAC;~~QAClE~~,OAAO,YAAY,CAAC,MAAM,CAAC,CAAC;~~KAC7B~~;~~SAAM~~,IAAI,QAAQ,CAAC,UAAU,CAAC,QAAQ,CAAC,EAAE;~~QACxC~~,MAAM;~~QACN~~,OAAO,~~KAAK~~,CAAC;~~KACd~~;~~SAAM~~;~~QACL~~,~~OAAO~~,QAAQ,KAAK,MAAM,CAAC;~~KAC5B~~;~~AACH~~,CAAC;~~AAED~~,KAAK,~~UAAU~~,OAAO,CAAC,~~EACrB~~,QAAQ,EACR,MAAM,EACN,IAAI,EACJ,iBAAiB,GACF;~~IACf~~,IAAI,GAAG,IAAI,IAAI,EAAE,CAAC;~~IAClB~~,MAAM,cAAc,GAAG,QAAQ,CAAC,YAAY,CAAC,MAAM,EAAE,IAAI,CAAC,CAAC;~~IAE3D~~,4DAA4D;~~IAC5D~~,MAAM,aAAa,GAAG,iBAAiB,CAAC,CAAC,CAAC,IAAI,QAAQ,CAAC,EAAE,EAAE,KAAK,MAAM,EAAE,CAAC,CAAC,CAAC,MAAM,CAAC;~~IAElF~~,MAAM,KAAK,GAAG;~~QACZ~~,MAAM,EAAE;~~YACN~~,GAAG,EAAE,cAAc;~~YACnB~~,OAAO,EAAE,aAAa;~~SACvB~~;~~QACD~~,IAAI;~~KACL~~,CAAC;~~IAEF~~,IAAI;~~QACF~~,MAAM,QAAQ,GAAG,MAAM,QAAQ,CAAC,OAAO,CAAC,cAAc,CAAC,CAAC;~~QACxD~~,MAAM,GAAG,GAAmB;~~YAC1B~~,GAAG,KAAK;~~YACR~~,QAAQ;~~YACR~~,OAAO,EAAE,KAAK;~~SACf~~,CAAC;~~QACF~~,IAAI,QAAQ,CAAC,KAAK,EAAE;~~YAClB~~,GAAG,CAAC,KAAK,GAAG,QAAQ,CAAC,KAAK,CAAC;~~SAC5B~~;~~aAAM~~,IAAI,QAAQ,CAAC,MAAM,EAAE;~~YAC1B~~,MAAM,~~eAAe~~,GAAG,IAAI,CAAC,UAAU;~~gBACrC~~,CAAC,CAAC,kBAAkB,CAAC,IAAI,CAAC,UAAU,EAAE,QAAQ,CAAC,MAAM,CAAC;~~gBACtD~~,CAAC,CAAC,IAAI,CAAC;~~YACT~~,IAAI,CAAC,~~eAAe~~,EAAE;~~gBACpB~~,GAAG,CAAC,KAAK,GAAG,~~YAAY~~,IAAI,CAAC,UAAU,~~UAAU~~,~~QAAQ~~,CAAC,~~MAAM~~,GAAG,CAAC;~~aACrE~~;~~YACD~~,GAAG,CAAC,OAAO,GAAG,~~eAAe~~,CAAC;~~SAC~~/B;~~aAAM~~;~~YACL~~,GAAG,CAAC,OAAO,~~GAAG~~,KAAK,CAAC;~~YACpB~~,~~GAAG~~,CAAC,KAAK,~~GAAG~~,~~WAAW~~,CAAC;~~SACzB~~;~~QACD~~,OAAO,GAAG,CAAC;~~KACZ~~;~~IAAC~~,OAAO,GAAG,EAAE;~~QACZ~~,OAAO;~~YACL~~,GAAG,KAAK;~~YACR~~,KAAK,EAAE,MAAM,CAAC,GAAG,CAA
C;~~YAClB~~,OAAO,EAAE,KAAK;~~SACf~~,CAAC;~~KACH~~;~~AACH~~,CAAC;~~AAED~~,~~MAAM~~,CAAC,~~KAAK,UAAU,~~QAAQ~~,CAAC,OAAwB~~;~~IACrD~~,MAAM,OAAO,~~GAAa~~,~~EAAE~~,CAAC;~~IAC7B~~,MAAM,OAAO,~~GAAqB~~,EAAE,CAAC;~~IAErC~~,KAAK,MAAM,aAAa,IAAI,OAAO,CAAC,OAAO,EAAE;~~QAC3C~~,KAAK,MAAM,QAAQ,IAAI,OAAO,CAAC,SAAS,EAAE;~~YACxC~~,~~OAAO~~,~~CAAC,IAAI,CAAC;gBACX,GAAG,EAAE,aAAa;gBAClB,~~OAAO,~~EACL~~,OAAO,CAAC,SAAS,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,IAAI,QAAQ,CAAC,EAAE,EAAE,KAAK,aAAa,EAAE,CAAC,CAAC,CAAC,aAAa;~~aACvF~~,CAAC,CAAC;~~SACJ~~;~~KACF~~;~~IAED~~,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC;~~IAC3E~~,MAAM,0BAA0B,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE;~~QAChD~~,MAAM,GAAG,GAAG,EAAE,GAAG,CAAC,EAAE,CAAC;~~QACrB~~,OAAO,GAAG,CAAC,UAAU,CAAC;~~QACtB~~,OAAO,GAAG,CAAC;~~IACb~~,CAAC,CAAC,CAAC;~~IACH~~,MAAM,MAAM,GAAG,IAAI,CAAC,CAAC,CAAC,CAAC,UAAU,CAAC;~~IAClC~~,MAAM,KAAK,GAAe;~~QACxB~~,~~MAAM;YACJ,~~CAAC,~~CAAC;gBACE,QAAQ;gBACR,CAAC,~~GAAG,OAAO,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,OAAO,CAAC,EAAE,GAAG,MAAM,CAAC,IAAI,CAAC,0BAA0B,CAAC,CAAC,CAAC,CAAC,CAAC;~~aAClF~~,CAAC~~,IAAI,EAAE~~;~~YACV~~,~~CAAC,CAAC,CAAC,GAAG,OAAO,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,OAAO,CAAC,EAAE,GAAG,MAAM,CAAC,~~IAAI,CAAC,0BAA0B,CAAC,CAAC,CAAC,CAAC,CAAC;KACtF,CAAC;IAEF,MAAM,KAAK,GAAG;QACZ,SAAS,EAAE,CAAC;QACZ,QAAQ,EAAE,CAAC;QACX,UAAU,EAAE;YACV,KAAK,EAAE,CAAC;YACR,MAAM,EAAE,CAAC;YACT,UAAU,EAAE,CAAC;SACd;KACF,CAAC;IAEF,IAAI,WAAkC,CAAC;~~IACvC~~,IAAI,OAAO,CAAC,eAAe,EAAE;~~QAC3B~~,MAAM,YAAY,GAChB,OAAO,CAAC,OAAO,CAAC,MAAM,GAAG,OAAO,CAAC,SAAS,CAAC,MAAM,GAAG,CAAC,OAAO,CAAC,IAAI,EAAE,MAAM,IAAI,CAAC,CAAC,CAAC;~~QAClF~~,MAAM,WAAW,GAAG,MAAM,MAAM,CAAC,cAAc,CAAC,CAAC;~~QACjD~~,WAAW,GAAG,IAAI,WAAW,CAAC,SAAS,CACrC;~~YACE~~,MAAM,EACJ,4FAA4F;~~SAC~~/F,EACD,WAAW,CAAC,OAAO,CAAC,cAAc,CACnC,CAAC;~~QACF~~,WAAW,CAAC,KAAK,CAAC,YAAY,EAAE,CAAC,EAAE;~~YACjC~~,QAAQ,EAAE,EAAE;~~YACZ~~,MAAM,EAAE,EAAE;~~YACV~~,IAAI,EAAE,EAAE;~~SAC
T~~,CAAC,CAAC;~~KACJ~~;~~IAED~~,MAAM,cAAc,GAAqB,EAAE,CAAC;~~IAC5C~~,KAAK,MAAM,GAAG,IAAI,IAAI,EAAE;~~QACtB~~,KAAK,MAAM,aAAa,IAAI,OAAO,CAAC,OAAO,EAAE;~~YAC3C~~,KAAK,MAAM,QAAQ,IAAI,OAAO,CAAC,SAAS,EAAE;~~gBACxC~~,cAAc,CAAC,IAAI,CAAC;~~oBAClB~~,QAAQ;~~oBACR~~,MAAM,EAAE,aAAa;~~oBACrB~~,IAAI,EAAE,GAAG;~~oBACT~~,iBAAiB,EAAE,OAAO,CAAC,SAAS,CAAC,MAAM,GAAG,CAAC;~~iBAChD~~,CAAC,CAAC;~~aACJ~~;SACF;~~KACF~~;~~IAED~~,MAAM,eAAe,GAAe,IAAI,KAAK,CAAC,IAAI,CAAC,MAAM,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,GAAG,EAAE,CAAC,EAAE,CAAC,CAAC;~~IACpF~~,MAAM,KAAK,CAAC,cAAc,CACxB,cAAc,EACd,OAAO,CAAC,cAAc,IAAI,uBAAuB,EACjD,KAAK,EAAE,OAAuB,EAAE,KAAsB,EAAE,EAAE;~~QACxD~~,MAAM,GAAG,GAAG,MAAM,~~OAAO~~,CAAC,OAAO,CAAC,~~CAAC;QACnC,~~OAAO,CAAC,~~KAAe,~~CAAC~~,GAAG,GAAG,CAAC~~;~~QAC/B~~,~~IAAI,GAAG,CAAC,KAAK,EAAE~~;~~YACb~~,~~KAAK~~,CAAC,~~QAAQ,EAAE,CAAC;SAClB;aAAM;YACL,~~IAAI,~~GAAG,~~CAAC,~~OAAO,~~EAAE~~;gBACf~~,KAAK,~~CAAC,SAAS,~~EAAE,~~CAAC;aACnB;iBAAM;gBACL~~,~~KAAK,CAAC,QAAQ,~~EAAE,~~CAAC;aAClB;YACD,KAAK,CAAC,UAAU,CAAC,KAAK,IAAI,~~GAAG,~~CAAC,QAAQ,~~EAAE,~~UAAU,EAAE,KAAK,IAAI,~~CAAC,CAAC;~~YAC/D~~,~~KAAK,CAAC,UAAU,CAAC,MAAM,~~IAAI,~~GAAG,CAAC,QAAQ,EAAE,UAAU,EAAE,MAAM,IAAI,CAAC,CAAC;YACjE,KAAK,CAAC,UAAU,CAAC,UAAU,IAAI,GAAG,CAAC,QAAQ,EAAE,UAAU,EAAE,UAAU,IAAI,CAAC,CAAC;SAC1E;QAED,IAAI,~~WAAW,EAAE;~~YACf~~,WAAW,CAAC,SAAS,CAAC;~~gBACpB~~,QAAQ,EAAE,OAAO,CAAC,QAAQ,CAAC,EAAE,EAAE;~~gBAC~~/B,MAAM,EAAE,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC,EAAE,EAAE,CAAC;~~gBACnC~~,IAAI,EAAE,MAAM,CAAC,OAAO,CAAC,OAAO,CAAC,IAAI,IAAI,EAAE,CAAC;~~qBACrC~~,GAAG,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,EAAE,EAAE,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE,CAAC;~~qBAC5B~~,IAAI,CAAC,GAAG,CAAC;~~qBACT~~,KAAK,CAAC,CAAC,EAAE,EAAE,CAAC;~~aAChB~~,CAAC,CAAC;~~SACJ~~;~~QAED~~,wBAAwB;~~QACxB~~,IAAI,OAAO,KAAK,KAAK,QAAQ,EAAE;~~YAC7B~~,MAAM,IAAI,KAAK,CAAC,+BAA+B,CAAC,CAAC;~~SAClD~~;~~QACD~~,MAAM,mBAAmB,GAAG,IAAI,CAAC,KAAK,CAAC,KAAK,GAAG,OAAO,CAAC,MAAM,CAAC,CAAC;~~QAC~~/D,eAAe,CAAC,mBAAmB,CAAC,CAAC,IAAI,CAAC,GAAG,CAAC,QAAQ,EAAE,MAAM,IAAI,GAAG,CAAC,KAAK,IAAI,EAAE
,CAAC,CAAC;~~IACrF~~,CAAC,CACF,CAAC;~~IAEF~~,IAAI,WAAW,EAAE;~~QACf~~,WAAW,CAAC,IAAI,EAAE,CAAC;~~KACpB~~;~~IAED~~,~~yCAAyC~~;~~IACzC~~,IAAI,~~MAAM~~,EAAE;~~QACV~~,KAAK,CAAC,~~IAAI~~,~~CACR~~,GAAG,eAAe,CAAC,~~GAAG~~,CAAC,CAAC,MAAM,EAAE,KAAK,EAAE,EAAE,CAAC;~~YACxC~~,OAAO,CAAC,KAAK,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC,CAAC,~~SAAS~~,~~OAAO~~,CAAC,~~KAAK~~,CAAC,CAAC,KAAK,EAAE;~~YACjE~~,~~GAAG~~,~~MAAM~~;~~YACT~~,GAAG,MAAM,CAAC,MAAM,CAAC,0BAA0B,CAAC,KAAK,CAAC,CAAC;~~SACpD~~,CAAC,~~CACH~~,CAAC;~~KACH~~;~~SAAM~~;~~QACL~~,KAAK,CAAC,IAAI,CACR,GAAG,eAAe,CAAC,GAAG,CAAC,CAAC,MAAM,EAAE,KAAK,EAAE,EAAE,CAAC,CAAC,GAAG,MAAM,EAAE,GAAG,MAAM,CAAC,MAAM,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CACtF,CAAC;~~KACH~~;~~IAED~~,OAAO,EAAE,OAAO,EAAE,KAAK,EAAE,KAAK,EAAE,CAAC;~~AACnC~~,CAAC"}
1	+ {"version":3,"file":"evaluator.js","sourceRoot":"","sources":["../src/evaluator.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,MAAM,OAAO,CAAC;AAC1B,OAAO,QAAQ,MAAM,UAAU,CAAC;AAEhC,OAAO,EAAE,sBAAsB,EAAE,MAAM,cAAc,CAAC;AA0BtD,MAAM,uBAAuB,GAAG,CAAC,CAAC;AAElC,MAAM,SAAS;IAIb,YAAY,OAAwB;QAClC,IAAI,CAAC,OAAO,GAAG,OAAO,CAAC;QACvB,IAAI,CAAC,KAAK,GAAG;YACX,SAAS,EAAE,CAAC;YACZ,QAAQ,EAAE,CAAC;YACX,UAAU,EAAE;gBACV,KAAK,EAAE,CAAC;gBACR,MAAM,EAAE,CAAC;gBACT,UAAU,EAAE,CAAC;aACd;SACF,CAAC;IACJ,CAAC;IAED,KAAK,CAAC,WAAW,CAAC,QAAgB,EAAE,MAAc;QAChD,MAAM,EAAE,OAAO,EAAE,GAAG,IAAI,CAAC,OAAO,CAAC;QAEjC,IAAI,CAAC,OAAO,EAAE;YACZ,MAAM,IAAI,KAAK,CACb,wFAAwF,CACzF,CAAC;SACH;QAED,MAAM,MAAM,GAAG,QAAQ,CAAC,YAAY,CAAC,OAAO,CAAC,MAAM,IAAI,sBAAsB,EAAE;YAC7E,OAAO,EAAE,MAAM;YACf,MAAM,EAAE,QAAQ;SACjB,CAAC,CAAC;QAEH,MAAM,IAAI,GAAG,MAAM,OAAO,CAAC,QAAQ,CAAC,OAAO,CAAC,MAAM,CAAC,CAAC;QACpD,IAAI,IAAI,CAAC,KAAK,IAAI,CAAC,IAAI,CAAC,MAAM,EAAE;YAC9B,OAAO;gBACL,IAAI,EAAE,KAAK;gBACX,MAAM,EAAE,IAAI,CAAC,KAAK,IAAI,WAAW;gBACjC,UAAU,EAAE;oBACV,KAAK,EAAE,IAAI,CAAC,UAAU,EAAE,KAAK,IAAI,CAAC;oBAClC,MAAM,EAAE,IAAI,CAAC,UAAU,EAAE,MAAM,IAAI,CAAC;oBACpC,UAAU,EAAE,IAAI,CAAC,UAAU,EAAE,UAAU,IAAI,CAAC;iBAC7C;aACF,CAAC;SACH;QAED,IAAI;YACF,MAAM,MAAM,GAAG,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,MAAM,CAAkB,CAAC;YACxD,MAAM,CAAC,UAAU,GAAG;gBAClB,KAAK,EAAE,IAAI,CAAC,UAAU,EAAE,KAAK,IAAI,CAAC;gBAClC,MAAM,EAAE,IAAI,CAAC,UAAU,EAAE,MAAM,IAAI,CAAC;gBACpC,UAAU,EAAE,IAAI,CAAC,UAAU,EAAE,UAAU,IAAI,CAAC;aAC7C,CAAC;YACF,OAAO,MAAM,CAAC;SACf;QAAC,OAAO,GAAG,EAAE;YACZ,OAAO;gBACL,IAAI,EAAE,KAAK;gBACX,MAAM,EAAE,6BAA6B,IAAI,CAAC,MAAM,EAAE;gBAClD,UAAU,EAAE;oBACV,KAAK,EAAE,IAAI,CAAC,UAAU,EAAE,KAAK,IAAI,CAAC;oBAClC,MAAM,EAAE,IAAI,CAAC,UAAU,EAAE,MAAM,IAAI,CAAC;oBACpC,UAAU,EAAE,IAAI,CAAC,UAAU,EAAE,UAAU,IAAI,CAAC;iBAC7C;aACF,CAAC;SACH;IACH,CAAC;IAED,KAAK,CAAC,kBAAkB,CACtB,QAAgB,EAChB,MAAc;QAEd,IAAI,QAAQ,CAAC,UAAU,CAAC,OAAO,CAAC,EAAE;YAChC,MAAM,QAAQ,GAAG,QAAQ,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;YACnC,MAAM,YAAY,GAAG,IAAI,QAAQ,CAAC,QAAQ,EAAE,UAAU,QAAQ,EAAE,CAAC,CAAC;YAClE,OAAO,EAAE,
IAAI,EAAE,YAAY,CAAC,MAAM,CAAC,EAAE,CAAC;SACvC;aAAM,IAAI,QAAQ,CAAC,UAAU,CAAC,QAAQ,CAAC,EAAE;YACxC,MAAM,aAAa,GAAG,MAAM,IAAI,CAAC,WAAW,CAAC,QAAQ,CAAC,KAAK,CAAC,CAAC,CAAC,EAAE,MAAM,CAAC,CAAC;YACxE,OAAO;gBACL,IAAI,EAAE,aAAa,CAAC,IAAI;gBACxB,MAAM,EAAE,aAAa,CAAC,MAAM;aAC7B,CAAC;SACH;aAAM;YACL,MAAM,IAAI,GAAG,QAAQ,KAAK,MAAM,CAAC;YACjC,OAAO;gBACL,IAAI;gBACJ,MAAM,EAAE,IAAI,CAAC,CAAC,CAAC,SAAS,CAAC,CAAC,CAAC,aAAa,QAAQ,aAAa,MAAM,EAAE;aACtE,CAAC;SACH;IACH,CAAC;IAED,KAAK,CAAC,OAAO,CAAC,EACZ,QAAQ,EACR,MAAM,EACN,IAAI,EACJ,iBAAiB,GACF;QACf,IAAI,GAAG,IAAI,IAAI,EAAE,CAAC;QAClB,MAAM,cAAc,GAAG,QAAQ,CAAC,YAAY,CAAC,MAAM,EAAE,IAAI,CAAC,CAAC;QAE3D,4DAA4D;QAC5D,MAAM,aAAa,GAAG,iBAAiB,CAAC,CAAC,CAAC,IAAI,QAAQ,CAAC,EAAE,EAAE,KAAK,MAAM,EAAE,CAAC,CAAC,CAAC,MAAM,CAAC;QAElF,MAAM,KAAK,GAAG;YACZ,MAAM,EAAE;gBACN,GAAG,EAAE,cAAc;gBACnB,OAAO,EAAE,aAAa;aACvB;YACD,IAAI;SACL,CAAC;QAEF,IAAI;YACF,MAAM,QAAQ,GAAG,MAAM,QAAQ,CAAC,OAAO,CAAC,cAAc,CAAC,CAAC;YACxD,MAAM,GAAG,GAAmB;gBAC1B,GAAG,KAAK;gBACR,QAAQ;gBACR,OAAO,EAAE,KAAK;aACf,CAAC;YACF,IAAI,QAAQ,CAAC,KAAK,EAAE;gBAClB,GAAG,CAAC,KAAK,GAAG,QAAQ,CAAC,KAAK,CAAC;aAC5B;iBAAM,IAAI,QAAQ,CAAC,MAAM,EAAE;gBAC1B,MAAM,WAAW,GAAG,IAAI,CAAC,UAAU;oBACjC,CAAC,CAAC,MAAM,IAAI,CAAC,kBAAkB,CAAC,IAAI,CAAC,UAAU,EAAE,QAAQ,CAAC,MAAM,CAAC;oBACjE,CAAC,CAAC,EAAE,IAAI,EAAE,IAAI,EAAE,CAAC;gBACnB,IAAI,CAAC,WAAW,CAAC,IAAI,EAAE;oBACrB,GAAG,CAAC,KAAK,GAAG,WAAW,CAAC,MAAM,IAAI,aAAa,IAAI,CAAC,UAAU,EAAE,CAAC;iBAClE;gBACD,GAAG,CAAC,OAAO,GAAG,WAAW,CAAC,IAAI,CAAC;aAChC;iBAAM;gBACL,GAAG,CAAC,OAAO,GAAG,KAAK,CAAC;gBACpB,GAAG,CAAC,KAAK,GAAG,WAAW,CAAC;aACzB;YAED,2BAA2B;YAC3B,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,KAAK,IAAI,QAAQ,CAAC,UAAU,EAAE,KAAK,IAAI,CAAC,CAAC;YAC/D,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,MAAM,IAAI,QAAQ,CAAC,UAAU,EAAE,MAAM,IAAI,CAAC,CAAC;YACjE,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,UAAU,IAAI,QAAQ,CAAC,UAAU,EAAE,UAAU,IAAI,CAAC,CAAC;YAEzE,IAAI,GAAG,CAAC,OAAO,EAAE;gBACf,IAAI,CAAC,KAAK,CAAC,SAAS,EAAE,CAAC;aACxB;iBAAM;gBACL,IAAI,CAAC,KAAK,CAAC,QAAQ,EAAE,CAAC;aACvB;YAED,OAAO,GAAG,CAAC;SACZ;QAAC,OAAO,GAAG,EAAE;YACZ,OAAO;gBA
CL,GAAG,KAAK;gBACR,KAAK,EAAE,MAAM,CAAC,GAAG,CAAC;gBAClB,OAAO,EAAE,KAAK;aACf,CAAC;SACH;IACH,CAAC;IAED,KAAK,CAAC,QAAQ;QACZ,MAAM,OAAO,GAAG,IAAI,CAAC,OAAO,CAAC;QAC7B,MAAM,OAAO,GAAa,EAAE,CAAC;QAE7B,KAAK,MAAM,aAAa,IAAI,OAAO,CAAC,OAAO,EAAE;YAC3C,KAAK,MAAM,QAAQ,IAAI,OAAO,CAAC,SAAS,EAAE;gBACxC,MAAM,OAAO,GACX,OAAO,CAAC,SAAS,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,IAAI,QAAQ,CAAC,EAAE,EAAE,KAAK,aAAa,EAAE,CAAC,CAAC,CAAC,aAAa,CAAC;gBACvF,OAAO,CAAC,IAAI,CAAC;oBACX,GAAG,EAAE,aAAa;oBAClB,OAAO;iBACR,CAAC,CAAC;aACJ;SACF;QAED,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,IAAI,OAAO,CAAC,IAAI,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC;QAC3E,MAAM,0BAA0B,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE;YAChD,MAAM,GAAG,GAAG,EAAE,GAAG,CAAC,EAAE,CAAC;YACrB,OAAO,GAAG,CAAC,UAAU,CAAC;YACtB,OAAO,GAAG,CAAC;QACb,CAAC,CAAC,CAAC;QACH,MAAM,MAAM,GAAG,IAAI,CAAC,CAAC,CAAC,CAAC,UAAU,CAAC;QAClC,MAAM,KAAK,GAAe;YACxB,CAAC,GAAG,OAAO,CAAC,GAAG,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,OAAO,CAAC,EAAE,GAAG,MAAM,CAAC,IAAI,CAAC,0BAA0B,CAAC,CAAC,CAAC,CAAC,CAAC;SAClF,CAAC;QAEF,IAAI,WAAkC,CAAC;QACvC,IAAI,OAAO,CAAC,eAAe,EAAE;YAC3B,MAAM,YAAY,GAChB,OAAO,CAAC,OAAO,CAAC,MAAM,GAAG,OAAO,CAAC,SAAS,CAAC,MAAM,GAAG,CAAC,OAAO,CAAC,IAAI,EAAE,MAAM,IAAI,CAAC,CAAC,CAAC;YAClF,MAAM,WAAW,GAAG,MAAM,MAAM,CAAC,cAAc,CAAC,CAAC;YACjD,WAAW,GAAG,IAAI,WAAW,CAAC,SAAS,CACrC;gBACE,MAAM,EACJ,4FAA4F;aAC/F,EACD,WAAW,CAAC,OAAO,CAAC,cAAc,CACnC,CAAC;YACF,WAAW,CAAC,KAAK,CAAC,YAAY,EAAE,CAAC,EAAE;gBACjC,QAAQ,EAAE,EAAE;gBACZ,MAAM,EAAE,EAAE;gBACV,IAAI,EAAE,EAAE;aACT,CAAC,CAAC;SACJ;QAED,MAAM,cAAc,GAAqB,EAAE,CAAC;QAC5C,KAAK,MAAM,GAAG,IAAI,IAAI,EAAE;YACtB,KAAK,MAAM,aAAa,IAAI,OAAO,CAAC,OAAO,EAAE;gBAC3C,KAAK,MAAM,QAAQ,IAAI,OAAO,CAAC,SAAS,EAAE;oBACxC,cAAc,CAAC,IAAI,CAAC;wBAClB,QAAQ;wBACR,MAAM,EAAE,aAAa;wBACrB,IAAI,EAAE,GAAG;wBACT,iBAAiB,EAAE,OAAO,CAAC,SAAS,CAAC,MAAM,GAAG,CAAC;qBAChD,CAAC,CAAC;iBACJ;aACF;SACF;QAED,MAAM,WAAW,GAA6C,EAAE,CAAC;QACjE,MAAM,eAAe,GAAe,IAAI,KAAK,CAAC,IAAI,CAAC,MAAM,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,GAAG,CAAC,GAAG,EAAE,CAAC,E
AAE,CAAC,CAAC;QACpF,MAAM,KAAK,CAAC,cAAc,CACxB,cAAc,EACd,OAAO,CAAC,cAAc,IAAI,uBAAuB,EACjD,KAAK,EAAE,OAAuB,EAAE,KAAsB,EAAE,EAAE;YACxD,MAAM,GAAG,GAAG,MAAM,IAAI,CAAC,OAAO,CAAC,OAAO,CAAC,CAAC;YACxC,iCAAiC;YACjC,WAAW,CAAC,IAAI,CAAC,EAAE,KAAK,EAAE,KAAe,EAAE,GAAG,EAAE,CAAC,CAAC;YAElD,IAAI,WAAW,EAAE;gBACf,WAAW,CAAC,SAAS,CAAC;oBACpB,QAAQ,EAAE,OAAO,CAAC,QAAQ,CAAC,EAAE,EAAE;oBAC/B,MAAM,EAAE,OAAO,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC,EAAE,EAAE,CAAC;oBACnC,IAAI,EAAE,MAAM,CAAC,OAAO,CAAC,OAAO,CAAC,IAAI,IAAI,EAAE,CAAC;yBACrC,GAAG,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,EAAE,EAAE,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE,CAAC;yBAC5B,IAAI,CAAC,GAAG,CAAC;yBACT,KAAK,CAAC,CAAC,EAAE,EAAE,CAAC;iBAChB,CAAC,CAAC;aACJ;YAED,wBAAwB;YACxB,IAAI,OAAO,KAAK,KAAK,QAAQ,EAAE;gBAC7B,MAAM,IAAI,KAAK,CAAC,+BAA+B,CAAC,CAAC;aAClD;YACD,MAAM,mBAAmB,GAAG,IAAI,CAAC,KAAK,CAAC,KAAK,GAAG,OAAO,CAAC,MAAM,CAAC,CAAC;YAC/D,eAAe,CAAC,mBAAmB,CAAC,CAAC,IAAI,CAAC,GAAG,CAAC,QAAQ,EAAE,MAAM,IAAI,GAAG,CAAC,KAAK,IAAI,EAAE,CAAC,CAAC;QACrF,CAAC,CACF,CAAC;QAEF,IAAI,WAAW,EAAE;YACf,WAAW,CAAC,IAAI,EAAE,CAAC;SACpB;QAED,MAAM,OAAO,GAAqB,EAAE,CAAC;QACrC,WAAW;aACR,IAAI,CAAC,CAAC,CAAC,EAAE,CAAC,EAAE,EAAE,CAAC,CAAC,CAAC,KAAK,GAAG,CAAC,CAAC,KAAK,CAAC;aACjC,OAAO,CAAC,CAAC,EAAE,KAAK,EAAE,GAAG,EAAE,EAAE,EAAE;YAC1B,OAAO,CAAC,KAAK,CAAC,GAAG,GAAG,CAAC;QACvB,CAAC,CAAC,CAAC;QAEL,sEAAsE;QACtE,uCAAuC;QACvC,IAAI,MAAM,EAAE;YACV,uCAAuC;YACvC,eAAe,CAAC,OAAO,CAAC,CAAC,MAAM,EAAE,KAAK,EAAE,EAAE;gBACxC,kFAAkF;gBAClF,MAAM,cAAc,GAAa,EAAE,CAAC;gBAEpC,6FAA6F;gBAC7F,MAAM,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,WAAW,EAAE,EAAE;oBAChC,MAAM,WAAW,GAAG,KAAK,GAAG,OAAO,CAAC,MAAM,GAAG,WAAW,CAAC;oBACzD,MAAM,MAAM,GAAG,OAAO,CAAC,WAAW,CAAC,CAAC;oBACpC,2FAA2F;oBAC3F,MAAM,YAAY,GAAG,MAAM,CAAC,OAAO,CAAC,CAAC,CAAC,UAAU,CAAC,EAAE,CAAC,CAAC,CAAC,UAAU,MAAM,CAAC,KAAK,UAAU,CAAC,EAAE,CAAC;oBAC1F,cAAc,CAAC,IAAI,CAAC,YAAY,CAAC,CAAC;gBACpC,CAAC,CAAC,CAAC;gBAEH,oGAAoG;gBACpG,MAAM,QAAQ,GAAG,CAAC,GAAG,cAAc,EAAE,GAAG,MAAM,CAAC,MAAM,CAAC,0BAA0B,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;gBAC1F,KAAK,CAAC,IAAI,CAAC,QAAQ,CAAC,CAAC;YACvB,CAAC,CAAC
,CAAC;SACJ;aAAM;YACL,KAAK,CAAC,IAAI,CACR,GAAG,eAAe,CAAC,GAAG,CAAC,CAAC,MAAM,EAAE,KAAK,EAAE,EAAE,CAAC,CAAC,GAAG,MAAM,EAAE,GAAG,MAAM,CAAC,MAAM,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CACtF,CAAC;SACH;QAED,OAAO,EAAE,OAAO,EAAE,KAAK,EAAE,IAAI,CAAC,KAAK,EAAE,KAAK,EAAE,CAAC;IAC/C,CAAC;CACF;AAED,MAAM,UAAU,QAAQ,CAAC,OAAwB;IAC/C,MAAM,EAAE,GAAG,IAAI,SAAS,CAAC,OAAO,CAAC,CAAC;IAClC,OAAO,EAAE,CAAC,QAAQ,EAAE,CAAC;AACvB,CAAC"}

package/dist/main.js CHANGED

@@ -31,7 +31,7 @@ These prompts are nunjucks templates, so you can use logic like this:
   prompts: ['prompts.txt'],
   providers: ['openai:gpt-3.5-turbo'],
   vars: 'vars.csv',
-  maxConcurrency: 3,
+  maxConcurrency: 4,
 };`;
     const readme = `To get started, set your OPENAI_API_KEY environment variable. Then run:
 \`\`\`
@@ -89,6 +89,7 @@ async function main() {
         .option('-v, --vars <path>', 'Path to file with prompt variables (csv, json, yaml)', defaultConfig.vars)
         .option('-c, --config <path>', 'Path to configuration file. Automatically loads promptfooconfig.js', defaultConfig.config)
         .option('-j, --max-concurrency <number>', 'Maximum number of concurrent API calls', String(defaultConfig.maxConcurrency))
+        .option('--grader', 'Model that will grade outputs', defaultConfig.grader)
         .option('--verbose', 'Show debug logs', defaultConfig.verbose)
         .action(async (cmdObj) => {
         if (cmdObj.verbose) {
@@ -123,6 +124,11 @@ async function main() {
             maxConcurrency: cmdObj.maxConcurrency && cmdObj.maxConcurrency > 0 ? cmdObj.maxConcurrency : undefined,
             ...config,
         };
+        if (cmdObj.grader) {
+            options.grading = {
+                provider: await loadApiProvider(cmdObj.grader),
+            };
+        }
         const summary = await evaluate(options);
         if (cmdObj.output) {
             logger.info(chalk.yellow(`Writing output to ${cmdObj.output}`));
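
The hunk above wires the new `--grader` flag into the options passed to `evaluate`. Below is a minimal sketch of that flow, assuming `cmdObj.grader` arrives as a provider string such as `openai:gpt-3.5-turbo`. `loadApiProvider` here is a hypothetical stub (its real body is not part of this diff), and `buildOptions` is an illustrative name, not a function in the package. Note that the option is declared as `'--grader'` with no `<value>` placeholder; under commander's usual conventions such an option parses as a boolean, so whether a string actually reaches `loadApiProvider` is worth checking against the README's `[--grader <grading_provider>]` form.

```javascript
// Hypothetical stand-in for promptfoo's loadApiProvider; the real one is not shown in this diff.
async function loadApiProvider(providerPath) {
  return { id: () => providerPath };
}

// Illustrative helper mirroring the added block:
// a grading provider is attached only when --grader was supplied.
async function buildOptions(cmdObj, config) {
  const options = {
    maxConcurrency:
      cmdObj.maxConcurrency && cmdObj.maxConcurrency > 0 ? cmdObj.maxConcurrency : undefined,
    ...config,
  };
  if (cmdObj.grader) {
    options.grading = {
      provider: await loadApiProvider(cmdObj.grader),
    };
  }
  return options;
}

buildOptions({ grader: 'openai:gpt-3.5-turbo' }, {}).then((options) => {
  console.log(options.grading.provider.id());
});
```

When `cmdObj.grader` is absent, `options.grading` is simply never set, so runs without the flag keep the 0.2.2 shape of the options object.
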
@@ -141,10 +147,22 @@ async function main() {
                     head: ['blue', 'bold'],
                 },
             });
-            // Skip first row (header) and add the rest. Color the first column green if it's a success, red if it's a failure.
+            // Skip first row (header) and add the rest. Color PASS/FAIL
             for (const row of summary.table.slice(1)) {
-                const color = row[0] === 'PASS' ? 'green' : row[0].startsWith('FAIL') ? 'red' : undefined;
-                table.push(row.map((col, i) => (i === 0 && color ? chalk[color](col) : col)));
+                table.push(row.map((col) => {
+                    if (col.startsWith('[PASS]')) {
+                        // color '[PASS]' green
+                        return chalk.green.bold(col.slice(0, 6)) + col.slice(6);
+                    }
+                    else if (col.startsWith('[FAIL]')) {
+                        // color everything red up until '---'
+                        return col
+                            .split('---')
+                            .map((c, idx) => (idx === 0 ? chalk.red.bold(c) : c))
+                            .join('---');
+                    }
+                    return col;
+                }));
             }
             logger.info('\n' + table.toString());
         }
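
The per-cell coloring introduced in the hunk above can be tried in isolation. This is a sketch only: `greenBold` and `redBold` are hypothetical stand-ins for `chalk.green.bold` and `chalk.red.bold` built from raw ANSI escape codes, and `colorizeCell` is an illustrative name for the callback passed to `row.map`.

```javascript
// Hypothetical stand-ins for chalk.green.bold / chalk.red.bold using raw ANSI codes.
const greenBold = (s) => `\x1b[1;32m${s}\x1b[0m`;
const redBold = (s) => `\x1b[1;31m${s}\x1b[0m`;

// Illustrative name; mirrors the callback applied to every cell in the new code.
function colorizeCell(col) {
  if (col.startsWith('[PASS]')) {
    // '[PASS]' is 6 characters long, so only that prefix is colored.
    return greenBold(col.slice(0, 6)) + col.slice(6);
  }
  if (col.startsWith('[FAIL]')) {
    // Color the segment before the first '---'; later segments stay plain.
    return col
      .split('---')
      .map((c, idx) => (idx === 0 ? redBold(c) : c))
      .join('---');
  }
  return col;
}

console.log(colorizeCell('[PASS] Hello there'));
console.log(colorizeCell('[FAIL] Expected output to match\n---\nactual output'));
console.log(colorizeCell('plain cell'));
```

Unlike the 0.2.2 code, which colored only column 0, this logic runs on every cell, so each cell is expected to be a string carrying its own `[PASS]` or `[FAIL]` prefix.
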