npm - katt - Versions diffs - 0.0.7 → 0.0.9 - Mend

katt 0.0.7 → 0.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/.nvmrc +1 -0
package/CODE_OF_CONDUCT.MD +92 -0
package/README.md +31 -13
package/SECURITY.md +10 -0
package/build-tests/__snapshots__/check1.snap.md +1 -0
package/build-tests/__snapshots__/check1__Hello_World__should_return_the_date_in_a_json_format.snap.md +1 -0
package/build-tests/__snapshots__/check1__root.snap.md +1 -0
package/build-tests/check1.eval.js +19 -0
package/build-tests/check2.eval.js +15 -0
package/build-tests/customPrompt.md +1 -0
package/dist/index.js +225 -167
package/dist/katt.js +1 -1
package/dist/runCli-j5xhVCdB.js +424 -0
package/katt-codex.json +4 -0
package/package.json +8 -7
package/renovate.json +6 -0
package/dist/runCli-C7uxWavX.js +0 -312

package/.nvmrc ADDED

@@ -0,0 +1 @@
+24

package/CODE_OF_CONDUCT.MD ADDED

@@ -0,0 +1,92 @@
+# Contributor Covenant 3.0 Code of Conduct
+## Our Pledge
+We pledge to make our community welcoming, safe, and equitable for all.
+We are committed to fostering an environment that respects and promotes the dignity, rights, and contributions of all individuals, regardless of characteristics including race, ethnicity, caste, color, age, physical characteristics, neurodiversity, disability, sex or gender, gender identity or expression, sexual orientation, language, philosophy or religion, national or social origin, socio-economic position, level of education, or other status. The same privileges of participation are extended to everyone who participates in good faith and in accordance with this Covenant.
+## Encouraged Behaviors
+While acknowledging differences in social norms, we all strive to meet our community's expectations for positive behavior. We also understand that our words and actions may be interpreted differently than we intend based on culture, background, or native language.
+With these considerations in mind, we agree to behave mindfully toward each other and act in ways that center our shared values, including:
+1. Respecting the **purpose of our community**, our activities, and our ways of gathering.
+2. Engaging **kindly and honestly** with others.
+3. Respecting **different viewpoints** and experiences.
+4. **Taking responsibility** for our actions and contributions.
+5. Gracefully giving and accepting **constructive feedback**.
+6. Committing to **repairing harm** when it occurs.
+7. Behaving in other ways that promote and sustain the **well-being of our community**.
+## Restricted Behaviors
+We agree to restrict the following behaviors in our community. Instances, threats, and promotion of these behaviors are violations of this Code of Conduct.
+1. **Harassment.** Violating explicitly expressed boundaries or engaging in unnecessary personal attention after any clear request to stop.
+2. **Character attacks.** Making insulting, demeaning, or pejorative comments directed at a community member or group of people.
+3. **Stereotyping or discrimination.** Characterizing anyone’s personality or behavior on the basis of immutable identities or traits.
+4. **Sexualization.** Behaving in a way that would generally be considered inappropriately intimate in the context or purpose of the community.
+5. **Violating confidentiality**. Sharing or acting on someone's personal or private information without their permission.
+6. **Endangerment.** Causing, encouraging, or threatening violence or other harm toward any person or group.
+7. Behaving in other ways that **threaten the well-being** of our community.
+### Other Restrictions
+1. **Misleading identity.** Impersonating someone else for any reason, or pretending to be someone else to evade enforcement actions.
+2. **Failing to credit sources.** Not properly crediting the sources of content you contribute.
+3. **Promotional materials**. Sharing marketing or other commercial content in a way that is outside the norms of the community.
+4. **Irresponsible communication.** Failing to responsibly present content which includes, links or describes any other restricted behaviors.
+## Reporting an Issue
+Tensions can occur between community members even when they are trying their best to collaborate. Not every conflict represents a code of conduct violation, and this Code of Conduct reinforces encouraged behaviors and norms that can help avoid conflicts and minimize harm.
+When an incident does occur, it is important to report it promptly. To report a possible violation, **Open a discussion in this repository.**
+Community Moderators take reports of violations seriously and will make every effort to respond in a timely manner. They will investigate all reports of code of conduct violations, reviewing messages, logs, and recordings, or interviewing witnesses and other participants. Community Moderators will keep investigation and enforcement actions as transparent as possible while prioritizing safety and confidentiality. In order to honor these values, enforcement actions are carried out in private with the involved parties, but communicating to the whole community may be part of a mutually agreed upon resolution.
+## Addressing and Repairing Harm
+****
+If an investigation by the Community Moderators finds that this Code of Conduct has been violated, the following enforcement ladder may be used to determine how best to repair harm, based on the incident's impact on the individuals involved and the community as a whole. Depending on the severity of a violation, lower rungs on the ladder may be skipped.
+1) Warning
+   1) Event: A violation involving a single incident or series of incidents.
+   2) Consequence: A private, written warning from the Community Moderators.
+   3) Repair: Examples of repair include a private written apology, acknowledgement of responsibility, and seeking clarification on expectations.
+2) Temporarily Limited Activities
+   1) Event: A repeated incidence of a violation that previously resulted in a warning, or the first incidence of a more serious violation.
+   2) Consequence: A private, written warning with a time-limited cooldown period designed to underscore the seriousness of the situation and give the community members involved time to process the incident. The cooldown period may be limited to particular communication channels or interactions with particular community members.
+   3) Repair: Examples of repair may include making an apology, using the cooldown period to reflect on actions and impact, and being thoughtful about re-entering community spaces after the period is over.
+3) Temporary Suspension
+   1) Event: A pattern of repeated violation which the Community Moderators have tried to address with warnings, or a single serious violation.
+   2) Consequence: A private written warning with conditions for return from suspension. In general, temporary suspensions give the person being suspended time to reflect upon their behavior and possible corrective actions.
+   3) Repair: Examples of repair include respecting the spirit of the suspension, meeting the specified conditions for return, and being thoughtful about how to reintegrate with the community when the suspension is lifted.
+4) Permanent Ban
+   1) Event: A pattern of repeated code of conduct violations that other steps on the ladder have failed to resolve, or a violation so serious that the Community Moderators determine there is no way to keep the community safe with this person as a member.
+   2) Consequence: Access to all community spaces, tools, and communication channels is removed. In general, permanent bans should be rarely used, should have strong reasoning behind them, and should only be resorted to if working through other remedies has failed to change the behavior.
+   3) Repair: There is no possible repair in cases of this severity.
+This enforcement ladder is intended as a guideline. It does not limit the ability of Community Managers to use their discretion and judgment, in keeping with the best interests of our community.
+## Scope
+This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public or other spaces. Examples of representing our community include using an official email address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
+## Attribution
+This Code of Conduct is adapted from the Contributor Covenant, version 3.0, permanently available at [https://www.contributor-covenant.org/version/3/0/](https://www.contributor-covenant.org/version/3/0/).
+Contributor Covenant is stewarded by the Organization for Ethical Source and licensed under CC BY-SA 4.0. To view a copy of this license, visit [https://creativecommons.org/licenses/by-sa/4.0/](https://creativecommons.org/licenses/by-sa/4.0/)
+For answers to common questions about Contributor Covenant, see the FAQ at [https://www.contributor-covenant.org/faq](https://www.contributor-covenant.org/faq). Translations are provided at [https://www.contributor-covenant.org/translations](https://www.contributor-covenant.org/translations). Additional enforcement and community guideline resources can be found at [https://www.contributor-covenant.org/resources](https://www.contributor-covenant.org/resources). The enforcement ladder was inspired by the work of [Mozilla’s code of conduct team](https://github.com/mozilla/inclusion).

package/README.md CHANGED

@@ -12,16 +12,10 @@ Katt is a lightweight testing framework for running AI Evals, inspired by [Jest]
 - [Articles](#articles)
 - [Hello World - Example](#hello-world---example)
 - [Main Features](#main-features)
-- [Usage](#usage)
 - [Installation](#installation)
 - [Basic Usage](#basic-usage)
-- [Using promptFile](#using-promptfile)
 - [Specifying AI Models](#specifying-ai-models)
 - [Development](#development)
-- [Setup](#setup)
-- [Available Scripts](#available-scripts)
-- [Verification Process](#verification-process)
-- [Project Structure](#project-structure)
 - [How It Works](#how-it-works)
 - [Requirements](#requirements)
 - [License](#license)
@@ -29,7 +23,8 @@ Katt is a lightweight testing framework for running AI Evals, inspired by [Jest]
 ## Overview
-Katt is designed to evaluate and validate the behavior of AI agents like **Claude Code**, **GitHub Copilot**, **OpenAI Codex** and more. It provides a simple, intuitive API for writing tests that interact with AI models and assert their responses.
+#### ✨ Run your own benchmarks and evaluations ✨
+**Katt** is designed to evaluate and validate the behavior of AI agents like **Claude Code**, **GitHub Copilot**, **OpenAI Codex** and more. It provides a simple, intuitive API for writing tests that interact with AI models and assert their responses.
 ## API Documentation
@@ -68,6 +63,7 @@ describe("Greeting agent", () => {
 - **Classification Matcher**: Built-in `toBeClassifiedAs()` matcher to grade a response against a target label on a 1-5 scale
 - **Concurrent Execution**: Runs eval files concurrently for faster test execution
 - **Model Selection**: Support for specifying custom AI models
+- **Runtime Selection**: Run prompts through GitHub Copilot (default) or Codex
 - **Configurable Timeouts**: Override prompt wait time per test or via `katt.json`
 ## Usage
@@ -127,11 +123,14 @@ describe("Model selection", () => {
 });
 ```
-You can also set a default model for the project by adding a `katt.json` file in the project root:
+You can also set runtime defaults in `katt.json`.
+Copilot (default runtime):
 ```json
 {
-  "copilot": {
+  "agent": "gh-copilot",
+  "agentOptions": {
     "model": "gpt-5-mini"
   },
   "prompt": {
@@ -140,10 +139,29 @@ You can also set a default model for the project by adding a `katt.json` file in
 }
 ```
+Codex:
+```json
+{
+  "agent": "codex",
+  "agentOptions": {
+    "model": "gpt-5-codex",
+    "profile": "default",
+    "sandbox": "workspace-write"
+  },
+  "prompt": {
+    "timeoutMs": 240000
+  }
+}
+```
 When this file exists:
-- `prompt("...")` and `promptFile("...")` use `copilot.model` by default
-- `prompt("...", { model: "..." })` still overrides the config value
+- Supported agents are:
+  - `gh-copilot` (default when `agent` is missing or unsupported)
+  - `codex`
+- `prompt("...")` and `promptFile("...")` merge `agentOptions` with call-time options
+- `prompt("...", { model: "..." })` overrides the model from config
 - `prompt.timeoutMs` sets the default wait timeout for long-running prompts
 ## Development
@@ -201,8 +219,8 @@ katt/
 ## Requirements
 - Node.js
-- GitHub Copilot CLI installed (see [GitHub Copilot CLI installation docs](https://docs.github.com/en/copilot/how-tos/copilot-cli/install-copilot-cli))
-- Access to AI models (e.g., OpenAI API key for Codex)
+- For `gh-copilot` runtime: access to GitHub Copilot with a logged-in user
+- For `codex` runtime: Codex CLI installed and authenticated (`codex login`)
 ## License
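
The README hunks above state that `prompt()` and `promptFile()` merge `agentOptions` from `katt.json` with call-time options, and that `gh-copilot` is used when `agent` is missing or unsupported. The following is a minimal sketch of that resolution, assuming a shallow merge in which call-time keys win. `resolvePromptOptions` and `SUPPORTED_AGENTS` are hypothetical names used for illustration; they are not taken from katt's source.

```javascript
// Hypothetical helper: illustrates the behavior the README describes,
// not katt's actual implementation.
const SUPPORTED_AGENTS = ["gh-copilot", "codex"];

function resolvePromptOptions(config = {}, callOptions = {}) {
  // Fall back to gh-copilot when `agent` is missing or unsupported.
  const agent = SUPPORTED_AGENTS.includes(config.agent)
    ? config.agent
    : "gh-copilot";
  // Assumption: shallow merge, call-time options override config values.
  const options = { ...(config.agentOptions ?? {}), ...callOptions };
  // `prompt.timeoutMs` is the default wait timeout; undefined when absent.
  const timeoutMs = config.prompt?.timeoutMs;
  return { agent, options, timeoutMs };
}

// Example: the Codex config from the README with a per-call model override.
const codexConfig = {
  agent: "codex",
  agentOptions: {
    model: "gpt-5-codex",
    profile: "default",
    sandbox: "workspace-write",
  },
  prompt: { timeoutMs: 240000 },
};
console.log(resolvePromptOptions(codexConfig, { model: "gpt-5.2" }));
```

In this sketch only `model` is replaced by the call-time value, while `profile` and `sandbox` carry over from the config file. Whether katt merges shallowly or deeply is not visible in this diff.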

package/SECURITY.md ADDED

@@ -0,0 +1,10 @@
+# Security Policy
+## Supported Versions
+Since Katt is under development, only the latest version will be supported.
+## Reporting a Vulnerability
+- Create an issue on this repository.
+- Describe the vulnerability and the level of it.

package/build-tests/__snapshots__/check1.snap.md ADDED

@@ -0,0 +1 @@
+heeey

package/build-tests/__snapshots__/check1__Hello_World__should_return_the_date_in_a_json_format.snap.md ADDED

@@ -0,0 +1 @@
+{ year: 2026 }

package/build-tests/__snapshots__/check1__root.snap.md ADDED

@@ -0,0 +1 @@
+heeey
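
The three snapshot file names above appear to combine the eval file's base name with the `describe`/`it` titles, joined by `__` and with spaces replaced by `_`. The sketch below reproduces that pattern. It is inferred only from the file names in this diff; `snapshotFileName` is a hypothetical helper, and katt's real naming logic may differ.

```javascript
// Hypothetical reconstruction of the naming pattern seen in the snapshot
// file names; not taken from katt's source.
function snapshotFileName(evalFile, titles = []) {
  // "build-tests/check1.eval.js" -> "check1"
  const base = evalFile
    .replace(/^.*[\\/]/, "")
    .replace(/\.eval\.[cm]?[jt]s$/, "");
  // Each title becomes one segment, with whitespace runs collapsed to "_".
  const segments = titles.map((title) => title.trim().replace(/\s+/g, "_"));
  return `${[base, ...segments].join("__")}.snap.md`;
}

console.log(
  snapshotFileName("build-tests/check1.eval.js", [
    "Hello World",
    "should return the date in a json format",
  ])
);
// prints "check1__Hello_World__should_return_the_date_in_a_json_format.snap.md"
```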

package/build-tests/check1.eval.js ADDED

@@ -0,0 +1,19 @@
+import { describe, expect, it, prompt } from "katt";
+describe('Hello World', () => {
+    it('should return the date in a json format', async () => {
+        const currentData = new Date(Date.now());
+        const result = await prompt('Return the current year in the format "{ year: YYYY }"');
+        expect(result).toContain(`{ year: ${currentData.getFullYear()} }`);
+    });
+    it('should classify a response as helpful', async () => {
+        const response = await prompt('You are a helpful assistant. Give one short tip for learning JavaScript.');
+        await expect(response).toBeClassifiedAs('helpful', { threshold: 3 });
+    });
+});
+const result2 = await prompt('If you read this just say heeey');
+expect(result2.toLowerCase()).toMatchSnapshot();
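
The second test in `check1.eval.js` calls `toBeClassifiedAs('helpful', { threshold: 3 })`, which the README describes as grading a response against a target label on a 1-5 scale. Below is a rough sketch of the pass/fail decision, assuming the matcher passes when the grade is at least the threshold. `passesClassification` is a hypothetical helper; this diff does not show how katt obtains the grade from the model.

```javascript
// Hypothetical pass/fail rule for a 1-5 classification grade.
// Assumption: a grade greater than or equal to the threshold passes.
function passesClassification(score, threshold) {
  if (!Number.isInteger(score) || score < 1 || score > 5) {
    throw new RangeError(`score must be an integer from 1 to 5, got ${score}`);
  }
  return score >= threshold;
}

console.log(passesClassification(4, 3)); // true
console.log(passesClassification(2, 3)); // false
```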

package/build-tests/check2.eval.js ADDED

@@ -0,0 +1,15 @@
+import { describe, expect, it, prompt, promptFile } from "katt";
+describe('Working with files', () => {
+    it('It should load the file and compare', async () => {
+        const result = await promptFile('./customPrompt.md');
+        expect(result.toLowerCase()).toContain('hola');
+    });
+});
+describe('Working with prompt as expectation', () => {
+    it('It should be friendly', async () => {
+        const result = await prompt('You are a friendly assistant. If you read this, say "Hola"!', { model: 'gpt-5.2' });
+        expect(result).promptCheck('To be friendly, the response should contain a greeting.');
+    });
+});

package/build-tests/customPrompt.md ADDED

@@ -0,0 +1 @@
+If you read this, say "Hola"!