@datafog/fogclaw 0.1.6 → 0.3.0
- package/CHANGELOG.md +42 -0
- package/README.md +39 -0
- package/dist/backlog-tools.d.ts +57 -0
- package/dist/backlog-tools.d.ts.map +1 -0
- package/dist/backlog-tools.js +173 -0
- package/dist/backlog-tools.js.map +1 -0
- package/dist/backlog.d.ts +82 -0
- package/dist/backlog.d.ts.map +1 -0
- package/dist/backlog.js +169 -0
- package/dist/backlog.js.map +1 -0
- package/dist/config.d.ts.map +1 -1
- package/dist/config.js +6 -0
- package/dist/config.js.map +1 -1
- package/dist/extract.d.ts +28 -0
- package/dist/extract.d.ts.map +1 -0
- package/dist/extract.js +91 -0
- package/dist/extract.js.map +1 -0
- package/dist/index.d.ts +2 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +96 -3
- package/dist/index.js.map +1 -1
- package/dist/message-sending-handler.d.ts +41 -0
- package/dist/message-sending-handler.d.ts.map +1 -0
- package/dist/message-sending-handler.js +54 -0
- package/dist/message-sending-handler.js.map +1 -0
- package/dist/tool-result-handler.d.ts +37 -0
- package/dist/tool-result-handler.d.ts.map +1 -0
- package/dist/tool-result-handler.js +95 -0
- package/dist/tool-result-handler.js.map +1 -0
- package/dist/types.d.ts +16 -0
- package/dist/types.d.ts.map +1 -1
- package/dist/types.js +3 -0
- package/dist/types.js.map +1 -1
- package/openclaw.plugin.json +11 -1
- package/package.json +7 -1
- package/.github/workflows/harness-docs.yml +0 -30
- package/AGENTS.md +0 -28
- package/docs/DATA.md +0 -28
- package/docs/DESIGN.md +0 -17
- package/docs/DOMAIN_DOCS.md +0 -30
- package/docs/FRONTEND.md +0 -24
- package/docs/OBSERVABILITY.md +0 -25
- package/docs/PLANS.md +0 -171
- package/docs/PRODUCT_SENSE.md +0 -20
- package/docs/RELIABILITY.md +0 -60
- package/docs/SECURITY.md +0 -50
- package/docs/design-docs/core-beliefs.md +0 -17
- package/docs/design-docs/index.md +0 -8
- package/docs/generated/README.md +0 -36
- package/docs/generated/memory.md +0 -1
- package/docs/plans/2026-02-16-fogclaw-design.md +0 -172
- package/docs/plans/2026-02-16-fogclaw-implementation.md +0 -1606
- package/docs/plans/README.md +0 -15
- package/docs/plans/active/2026-02-16-feat-openclaw-official-submission-plan.md +0 -386
- package/docs/plans/active/2026-02-17-feat-release-fogclaw-via-datafog-package-plan.md +0 -328
- package/docs/plans/active/2026-02-17-feat-submit-fogclaw-to-openclaw-plan.md +0 -244
- package/docs/plans/tech-debt-tracker.md +0 -42
- package/docs/plugins/fogclaw.md +0 -101
- package/docs/runbooks/address-review-findings.md +0 -30
- package/docs/runbooks/ci-failures.md +0 -46
- package/docs/runbooks/code-review.md +0 -34
- package/docs/runbooks/merge-change.md +0 -28
- package/docs/runbooks/pull-request.md +0 -45
- package/docs/runbooks/record-evidence.md +0 -43
- package/docs/runbooks/reproduce-bug.md +0 -42
- package/docs/runbooks/respond-to-feedback.md +0 -42
- package/docs/runbooks/review-findings.md +0 -31
- package/docs/runbooks/submit-openclaw-plugin.md +0 -68
- package/docs/runbooks/update-agents-md.md +0 -59
- package/docs/runbooks/update-domain-docs.md +0 -42
- package/docs/runbooks/validate-current-state.md +0 -41
- package/docs/runbooks/verify-release.md +0 -69
- package/docs/specs/2026-02-16-feat-openclaw-official-submission-spec.md +0 -115
- package/docs/specs/2026-02-17-feat-submit-fogclaw-to-openclaw.md +0 -125
- package/docs/specs/README.md +0 -5
- package/docs/specs/index.md +0 -8
- package/docs/spikes/README.md +0 -8
- package/fogclaw.config.example.json +0 -33
- package/scripts/ci/he-docs-config.json +0 -123
- package/scripts/ci/he-docs-drift.sh +0 -112
- package/scripts/ci/he-docs-lint.sh +0 -234
- package/scripts/ci/he-plans-lint.sh +0 -354
- package/scripts/ci/he-runbooks-lint.sh +0 -445
- package/scripts/ci/he-specs-lint.sh +0 -258
- package/scripts/ci/he-spikes-lint.sh +0 -249
- package/scripts/runbooks/select-runbooks.sh +0 -154
- package/src/config.ts +0 -183
- package/src/engines/gliner.ts +0 -240
- package/src/engines/regex.ts +0 -71
- package/src/index.ts +0 -372
- package/src/redactor.ts +0 -51
- package/src/scanner.ts +0 -196
- package/src/types.ts +0 -71
- package/tests/config.test.ts +0 -78
- package/tests/gliner.test.ts +0 -289
- package/tests/plugin-smoke.test.ts +0 -143
- package/tests/redactor.test.ts +0 -320
- package/tests/regex.test.ts +0 -345
- package/tests/scanner.test.ts +0 -348
- package/tsconfig.json +0 -20
@@ -1,172 +0,0 @@
# FogClaw Design Document

**Date:** 2026-02-16
**Repo:** `datafog/fogclaw` (public, MIT license)
**Status:** Approved

## Overview

FogClaw is an OpenClaw plugin that brings DataFog's PII detection and redaction capabilities into the OpenClaw AI agent ecosystem. It acts as both a passive guardrail on message flow and an on-demand tool the agent can invoke explicitly. It uses a dual-engine approach: ported DataFog regex patterns for structured PII and GLiNER (via ONNX) for zero-shot NER on custom entities.

## Decisions

| Decision | Choice |
|---|---|
| Use case | Both guardrail + on-demand tool |
| Language | Pure TypeScript (ONNX for GLiNER) |
| Regex layer | Port DataFog regex patterns |
| PII action | Configurable per-entity-type (default: redact) |
| Custom terms | Config file (`fogclaw.config.json`) |
| Default model | `onnx-community/gliner_large-v2.1` |
| Architecture | Dual-layer (regex + GLiNER) |

## Project Structure

```
fogclaw/
├── openclaw.plugin.json         # OpenClaw plugin manifest
├── package.json
├── tsconfig.json
├── fogclaw.config.example.json  # Example user config
├── src/
│   ├── index.ts                 # Plugin entry: register hook + tool
│   ├── engines/
│   │   ├── regex.ts             # Ported DataFog regex patterns
│   │   └── gliner.ts            # GLiNER ONNX inference wrapper
│   ├── scanner.ts               # Orchestrator: regex → GLiNER pipeline
│   ├── redactor.ts              # Redaction strategies (token, mask, hash)
│   ├── config.ts                # Config loading & validation
│   └── types.ts                 # Shared TypeScript types
├── models/                      # Auto-downloaded ONNX model cache
├── tests/
│   ├── regex.test.ts
│   ├── gliner.test.ts
│   ├── scanner.test.ts
│   └── redactor.test.ts
└── README.md
```

## Detection Pipeline

```
Input text
      │
      ▼
┌─────────────┐
│ Regex Pass  │ ← emails, SSNs, phones, credit cards, IPs, dates, zips
│ (~20µs/kB)  │   confidence: 1.0
└─────┬───────┘
      │
      ▼
┌─────────────┐
│ GLiNER Pass │ ← persons, orgs, locations + custom entities from config
│ (ONNX)      │   confidence: 0.0-1.0
└─────┬───────┘
      │
      ▼
┌─────────────┐
│ Merge &     │ ← Deduplicate overlapping spans, prefer higher confidence
│ Normalize   │   Canonical type mapping (same as DataFog)
└─────┬───────┘
      │
      ▼
Entity[]: unified results
```

### Entity Type

```typescript
interface Entity {
  text: string;        // "john@example.com"
  label: string;       // "EMAIL"
  start: number;       // character offset
  end: number;
  confidence: number;  // 1.0 for regex, 0.0-1.0 for GLiNER
  source: "regex" | "gliner";
}
```
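As an illustration of how the regex pass could populate this type, here is a minimal sketch. The two patterns are simplified stand-ins, not the ported DataFog patterns, and the names `PATTERNS` and `regexPass` are hypothetical. It also assumes `end` is an exclusive offset, which the design does not state.

```typescript
// Same shape as the Entity interface in the design document.
interface Entity {
  text: string;
  label: string;
  start: number;
  end: number; // assumed exclusive
  confidence: number;
  source: "regex" | "gliner";
}

// Illustrative patterns only; the real DataFog patterns are not reproduced here.
const PATTERNS: Record<string, RegExp> = {
  EMAIL: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  SSN: /\b\d{3}-\d{2}-\d{4}\b/g,
};

// Runs every pattern over the input and reports each match with confidence 1.0.
function regexPass(input: string): Entity[] {
  const entities: Entity[] = [];
  for (const [label, pattern] of Object.entries(PATTERNS)) {
    for (const match of input.matchAll(pattern)) {
      const start = match.index ?? 0;
      entities.push({
        text: match[0],
        label,
        start,
        end: start + match[0].length,
        confidence: 1.0,
        source: "regex",
      });
    }
  }
  // Return matches in document order.
  return entities.sort((a, b) => a.start - b.start);
}
```

Keeping the patterns in a label-keyed table means a new structured type only needs a new entry, not a new code path.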

### Span Conflict Resolution

When regex and GLiNER detect overlapping spans, prefer regex (confidence 1.0) for structured types and GLiNER for semantic types. Partially overlapping spans are resolved in favor of the higher-confidence span.
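One way to read this rule is a greedy merge: rank all spans by confidence and keep each span only if it does not overlap one already kept. The sketch below is an interpretation under that assumption, not the actual `scanner.ts` logic. `overlaps` and `mergeEntities` are hypothetical names, and the tie-breakers (longer span, then earlier start) are choices made for the example.

```typescript
interface Entity {
  text: string;
  label: string;
  start: number;
  end: number; // assumed exclusive
  confidence: number;
  source: "regex" | "gliner";
}

// Two half-open spans overlap when each starts before the other ends.
function overlaps(a: Entity, b: Entity): boolean {
  return a.start < b.end && b.start < a.end;
}

// Greedy merge: highest confidence wins; regex spans carry 1.0 and therefore
// win any overlap with a GLiNER span.
function mergeEntities(regex: Entity[], gliner: Entity[]): Entity[] {
  const ranked = [...regex, ...gliner].sort(
    (a, b) =>
      b.confidence - a.confidence ||
      (b.end - b.start) - (a.end - a.start) ||
      a.start - b.start,
  );
  const kept: Entity[] = [];
  for (const candidate of ranked) {
    if (!kept.some((k) => overlaps(k, candidate))) {
      kept.push(candidate);
    }
  }
  // Return the survivors in document order.
  return kept.sort((a, b) => a.start - b.start);
}
```

Because regex only emits structured types, a semantic GLiNER span is displaced only where it collides with a structured match, which matches the preference described above.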

### GLiNER Labels

Built-in: `["person", "organization", "location", "address", "date of birth", "medical record number", "account number", "passport number"]`

Plus `custom_entities` from user config.
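A minimal sketch of how the label list passed to GLiNER might be assembled from these two sources. The helper name `buildLabels` is hypothetical, and the trimming, lowercasing, and de-duplication are assumptions; the design only says the custom entities are added to the built-in set.

```typescript
// Built-in labels as listed in the design document.
const BUILT_IN_LABELS: readonly string[] = [
  "person",
  "organization",
  "location",
  "address",
  "date of birth",
  "medical record number",
  "account number",
  "passport number",
];

// Appends custom labels after the built-ins, skipping blanks and duplicates.
function buildLabels(customEntities: string[] = []): string[] {
  const seen = new Set<string>();
  const labels: string[] = [];
  for (const raw of [...BUILT_IN_LABELS, ...customEntities]) {
    const label = raw.trim().toLowerCase(); // normalization is an assumption
    if (label.length > 0 && !seen.has(label)) {
      seen.add(label);
      labels.push(label);
    }
  }
  return labels;
}
```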

## OpenClaw Integration

### Hook (Guardrail)

Registers a `before_agent_start` hook to intercept incoming messages. Per-entity-type actions:
- **redact**: Replace with tokens like `[EMAIL]` (default)
- **block**: Stop message, notify user
- **warn**: Notify but allow message through
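The per-entity-type actions above can be sketched as a pure decision function, independent of the OpenClaw hook API (which is not shown here). The names `Detection`, `GuardrailDecision`, and `decide` are hypothetical, and the sketch assumes detections already carry canonical labels such as `EMAIL`.

```typescript
type GuardrailAction = "redact" | "block" | "warn";

interface Detection {
  label: string; // canonical entity type, e.g. "EMAIL"
  text: string;
}

interface GuardrailDecision {
  blocked: boolean;      // true if any detection maps to "block"
  toRedact: Detection[]; // detections to replace before the message continues
  warnings: Detection[]; // detections to report while allowing the message
}

// Looks up the action for each detection, falling back to the default (redact).
function decide(
  detections: Detection[],
  entityActions: Record<string, GuardrailAction>,
  defaultAction: GuardrailAction = "redact",
): GuardrailDecision {
  const decision: GuardrailDecision = { blocked: false, toRedact: [], warnings: [] };
  for (const detection of detections) {
    const action = entityActions[detection.label] ?? defaultAction;
    if (action === "block") {
      decision.blocked = true;
    } else if (action === "warn") {
      decision.warnings.push(detection);
    } else {
      decision.toRedact.push(detection);
    }
  }
  return decision;
}
```

Returning a decision object rather than acting directly keeps the policy testable without a running agent; the hook would then stop the message, redact, or notify based on the result.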

### Tools

Two tools registered for on-demand use by the agent:

1. **fogclaw_scan**: Detect entities in text, return structured results
2. **fogclaw_redact**: Detect and redact entities, return sanitized text

Both accept an optional `custom_labels` parameter for ad-hoc zero-shot entity detection.

### Redaction Strategies

- **token**: `"Contact john@example.com"` → `"Contact [EMAIL]"`
- **mask**: `"Contact john@example.com"` → `"Contact ****************"`
- **hash**: `"Contact john@example.com"` → `"Contact [EMAIL_a1b2c3d4e5f6]"`
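The three strategies can be sketched as follows. This is not the `redactor.ts` implementation: the names `Span`, `replacementFor`, and `redact` are hypothetical, and the hash variant assumes a SHA-256 digest truncated to 12 hex characters, chosen only to match the length of the example above.

```typescript
import { createHash } from "node:crypto";

type RedactStrategy = "token" | "mask" | "hash";

interface Span {
  text: string;
  label: string;
  start: number;
  end: number; // assumed exclusive
}

// Builds the replacement string for one detected span.
function replacementFor(span: Span, strategy: RedactStrategy): string {
  switch (strategy) {
    case "token":
      return `[${span.label}]`;
    case "mask":
      return "*".repeat(span.text.length);
    case "hash": {
      // Assumption: SHA-256, first 12 hex characters.
      const digest = createHash("sha256").update(span.text).digest("hex");
      return `[${span.label}_${digest.slice(0, 12)}]`;
    }
  }
}

// Applies replacements right to left so earlier offsets remain valid.
// Assumes the spans do not overlap (i.e. they have already been merged).
function redact(input: string, spans: Span[], strategy: RedactStrategy = "token"): string {
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let output = input;
  for (const span of ordered) {
    output = output.slice(0, span.start) + replacementFor(span, strategy) + output.slice(span.end);
  }
  return output;
}
```

With the hash strategy, the same input value always yields the same token, so repeated mentions of one value stay correlated in the redacted text without exposing it.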

## Configuration

```json
{
  "enabled": true,
  "guardrail_mode": "redact",
  "redactStrategy": "token",
  "model": "onnx-community/gliner_large-v2.1",
  "confidence_threshold": 0.5,
  "custom_entities": ["project codename", "internal tool name", "competitor name"],
  "entityActions": {
    "SSN": "block",
    "CREDIT_CARD": "block",
    "EMAIL": "redact",
    "PHONE": "redact",
    "PERSON": "warn"
  }
}
```
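A minimal sketch of how a config of this shape might be typed, defaulted, and validated. The key names are taken from the example above. Treating the example's values as defaults, typing `guardrail_mode` as an action, and the names `FogClawConfig`, `DEFAULTS`, and `resolveConfig` are all assumptions; this is not the actual `config.ts`.

```typescript
type GuardrailAction = "redact" | "block" | "warn";
type RedactStrategy = "token" | "mask" | "hash";

interface FogClawConfig {
  enabled: boolean;
  guardrail_mode: GuardrailAction;
  redactStrategy: RedactStrategy;
  model: string;
  confidence_threshold: number;
  custom_entities: string[];
  entityActions: Record<string, GuardrailAction>;
}

const ACTIONS: readonly string[] = ["redact", "block", "warn"];

// Assumed defaults, mirroring the example config and the Decisions table.
const DEFAULTS: FogClawConfig = {
  enabled: true,
  guardrail_mode: "redact",
  redactStrategy: "token",
  model: "onnx-community/gliner_large-v2.1",
  confidence_threshold: 0.5,
  custom_entities: [],
  entityActions: {},
};

// Overlays user-supplied values on the defaults and rejects invalid ones.
function resolveConfig(raw: Partial<FogClawConfig>): FogClawConfig {
  const config: FogClawConfig = { ...DEFAULTS, ...raw };
  if (!(config.confidence_threshold >= 0 && config.confidence_threshold <= 1)) {
    throw new RangeError("confidence_threshold must be between 0 and 1");
  }
  for (const [label, action] of Object.entries(config.entityActions)) {
    if (!ACTIONS.includes(action)) {
      throw new TypeError(`unknown action for ${label}: ${action}`);
    }
  }
  return config;
}
```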

## Dependencies

```json
{
  "dependencies": {
    "gliner": "^0.x.x",
    "onnxruntime-node": "^1.x"
  },
  "devDependencies": {
    "vitest": "^2.x",
    "typescript": "^5.x"
  }
}
```

## Technical Considerations

**Model Loading:** Downloaded once from HuggingFace, cached in `~/.openclaw/extensions/fogclaw/models/`. Singleton pattern: the model stays loaded after first inference.

**Error Handling:** GLiNER failure → fall back to regex-only with warning. Network failure during download → clear error with manual download instructions.

**Performance:** Regex <1ms, GLiNER ~50-200ms per message. Well under 1s total, which is acceptable for messaging bots.
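The singleton loading and regex-only fallback described above could look roughly like the sketch below. It does not call the real `gliner` package: the `Detector` interface, the `ModelLoader` type, and `createLazyModel` are hypothetical, and the loader is injected so that the caching and fallback logic can be shown on its own.

```typescript
// Hypothetical minimal surface of a loaded model.
interface Detector {
  detect(text: string, labels: string[]): Promise<unknown[]>;
}

// Hypothetical async loader; in practice it would download or open the ONNX model.
type ModelLoader = () => Promise<Detector>;

// Returns a getter that loads the model at most once. On failure it warns and
// resolves to null, which callers treat as "run regex-only".
function createLazyModel(
  load: ModelLoader,
  warn: (message: string) => void = (message) => console.warn(message),
): () => Promise<Detector | null> {
  let pending: Promise<Detector | null> | null = null;
  return () => {
    if (pending === null) {
      pending = load().catch((error: unknown) => {
        warn(`GLiNER unavailable, falling back to regex-only: ${String(error)}`);
        return null;
      });
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved model means concurrent first calls share one load. A caller would await the getter and skip the GLiNER pass when it resolves to `null`.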

## Not In v1 (YAGNI)

- No outbound message scanning
- No persistent audit log
- No web UI for config
- No GLiNER2 support (add later when npm ecosystem catches up)
- No runtime entity label management (config file only)