tryassay 0.21.1 → 0.22.0
- package/README.md +4 -4
- package/demo/.claude/.truth_last_prompt +1 -0
- package/demo/.claude/truth_status +1 -0
- package/demo/css/style.css +1181 -0
- package/demo/data/demo-events.json +103 -0
- package/demo/index.html +222 -0
- package/demo/js/chat.js +292 -0
- package/demo/js/code-panel.js +206 -0
- package/demo/js/demo-mode.js +107 -0
- package/demo/js/orb.js +634 -0
- package/demo/js/question-cards.js +207 -0
- package/demo/js/sse-client.js +473 -0
- package/demo/js/state.js +162 -0
- package/demo/js/timeline.js +394 -0
- package/demo/js/voice.js +154 -0
- package/dist/api/server.d.ts +1 -0
- package/dist/api/server.js +65 -2
- package/dist/api/server.js.map +1 -1
- package/dist/cli.js +13 -0
- package/dist/cli.js.map +1 -1
- package/dist/commands/demo.d.ts +5 -0
- package/dist/commands/demo.js +107 -0
- package/dist/commands/demo.js.map +1 -0
- package/dist/commands/runtime.d.ts +4 -0
- package/dist/commands/runtime.js +50 -3
- package/dist/commands/runtime.js.map +1 -1
- package/dist/runtime/agents/planner-agent.d.ts +5 -2
- package/dist/runtime/agents/planner-agent.js +232 -1
- package/dist/runtime/agents/planner-agent.js.map +1 -1
- package/dist/runtime/app-create-orchestrator.d.ts +4 -0
- package/dist/runtime/app-create-orchestrator.js +151 -48
- package/dist/runtime/app-create-orchestrator.js.map +1 -1
- package/dist/runtime/dashboard-sync.d.ts +25 -0
- package/dist/runtime/dashboard-sync.js +169 -0
- package/dist/runtime/dashboard-sync.js.map +1 -0
- package/dist/runtime/types.d.ts +28 -0
- package/package.json +3 -2
package/README.md
CHANGED
@@ -211,7 +211,7 @@ export ANTHROPIC_API_KEY="sk-ant-..."
 npm install -g tryassay
 export ANTHROPIC_API_KEY="sk-ant-..."
 
-# Generate code with
+# Generate code with built-in claim verification
 tryassay generate --task "Write a function that validates email addresses" --lang typescript --verbose
 ```
 
@@ -301,7 +301,7 @@ const check = await sdk.verify({
 // check.verifications: each claim's verdict (PASS/PARTIAL/FAIL)
 ```
 
-The SDK enforces
+The SDK enforces verification — `generate()` always returns a verification proof alongside the code. The formal verifier runs deterministic pattern checks (regex, not LLM) and can override LLM verdicts. On its first production call, it caught the LLM hallucinating PASS on code with SQL injection.
 
 ---
 
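Note on the hunk above: the added README sentence describes deterministic regex checks that can override an LLM verdict. The sketch below illustrates that idea under stated assumptions. It is not taken from the package: the `Verdict` values mirror the PASS/PARTIAL/FAIL labels in the README, but `PatternRule`, `RULES`, `applyDeterministicChecks`, and the two regexes are hypothetical names and patterns chosen for illustration.

```typescript
// Hypothetical sketch: none of these names come from the tryassay SDK.
type Verdict = "PASS" | "PARTIAL" | "FAIL";

interface PatternRule {
  id: string;
  pattern: RegExp; // no "g" flag, so RegExp.test stays stateless
  description: string;
}

interface FinalVerdict {
  verdict: Verdict;
  overridden: boolean; // true when a deterministic finding replaced the LLM verdict
  findings: string[];
}

// Assumed rule set: flags SQL statements assembled from untrusted pieces.
const RULES: PatternRule[] = [
  {
    id: "sql-concat",
    pattern: /["'`]\s*(?:SELECT|INSERT|UPDATE|DELETE)\b[^"'`]*["'`]\s*\+/i,
    description: "SQL statement built by string concatenation",
  },
  {
    id: "sql-template",
    pattern: /`\s*(?:SELECT|INSERT|UPDATE|DELETE)\b[^`]*\$\{/i,
    description: "SQL statement built by template interpolation",
  },
];

// Run every rule against the code; any finding downgrades a non-FAIL verdict to FAIL.
function applyDeterministicChecks(code: string, llmVerdict: Verdict): FinalVerdict {
  const findings = RULES
    .filter((rule) => rule.pattern.test(code))
    .map((rule) => `${rule.id}: ${rule.description}`);

  if (findings.length > 0 && llmVerdict !== "FAIL") {
    return { verdict: "FAIL", overridden: true, findings };
  }
  return { verdict: llmVerdict, overridden: false, findings };
}

// Usage: the LLM said PASS, but the concatenated query trips the first rule.
const example = applyDeterministicChecks(
  'const q = "SELECT * FROM users WHERE id = " + userId;',
  "PASS",
);
console.log(example.verdict, example.overridden); // FAIL true
```

The design choice sketched here is that the deterministic pass only ever lowers a verdict: a regex hit can turn PASS into FAIL, but the absence of a hit never upgrades what the LLM reported.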
@@ -352,7 +352,7 @@ Two modes: **Assay API** (recommended, uses your Assay key) or **BYOK** (bring y
 
 | Command | Description |
 |---------|-------------|
-| `tryassay generate` | Generate verified code —
+| `tryassay generate` | Generate verified code — claim verification below the model call |
 | `tryassay assess <target>` | Run autonomous LVR assessment against a codebase or GitHub URL |
 | `tryassay init` | Initialize project configuration |
 | `tryassay hallucinate` | Generate a hallucinated ToS/API docs/user manual |
@@ -557,7 +557,7 @@ A complete 6-iteration cycle that achieves 90%+ compliance costs approximately *
 Contributions are welcome. Areas where help is particularly valuable:
 
 - **Multi-document hallucination** -- Extending beyond ToS to API docs, user manuals, privacy policies, and compliance certifications simultaneously
-- **
+- **Deterministic verification integration** -- Replacing LLM-based verification with property-based testing, model checking, or static analysis for specific claim categories
 - **CI/CD integration** -- Running Assay in continuous integration pipelines for specification-drift detection
 - **Language support** -- The CLI currently targets TypeScript/JavaScript codebases; other languages need codebase indexing adapters
 - **Benchmarking** -- Comparing initial hallucination quality across different LLMs (Claude, GPT-4, Gemini, Llama)
package/demo/.claude/.truth_last_prompt
ADDED
@@ -0,0 +1 @@
+1771956016.3584828
package/demo/.claude/truth_status
ADDED
@@ -0,0 +1 @@
+green