sinapse-ai 9.3.0 → 9.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/CLAUDE.md +56 -343
- package/.claude/rules/agent-authority.md +6 -0
- package/.claude/rules/agent-handoff.md +5 -0
- package/.claude/rules/cross-squad-routing.md +5 -0
- package/.claude/rules/hook-governance.md +6 -0
- package/.claude/rules/mcp-usage.md +3 -1
- package/.claude/rules/safe-collaboration.md +10 -0
- package/.claude/rules/security-data-protection.md +9 -0
- package/.claude/rules/squad-awareness.md +3 -1
- package/.claude/rules/tool-examples.md +6 -0
- package/.claude/rules/workflow-execution.md +7 -0
- package/.codex/agents/analyst.md +253 -72
- package/.codex/agents/architect.md +455 -68
- package/.codex/agents/data-engineer.md +492 -106
- package/.codex/agents/developer.md +560 -0
- package/.codex/agents/devops.md +518 -69
- package/.codex/agents/product-lead.md +335 -0
- package/.codex/agents/project-lead.md +377 -0
- package/.codex/agents/quality-gate.md +449 -0
- package/.codex/agents/sinapse-orqx.md +9 -7
- package/.codex/agents/sprint-lead.md +287 -0
- package/.codex/agents/squad-creator.md +344 -0
- package/.codex/agents/ux-design-expert.md +495 -0
- package/.codex/delegation-matrix.json +756 -44
- package/.codex/handoff-packet.schema.json +30 -6
- package/.sinapse-ai/data/entity-registry.yaml +175 -363
- package/.sinapse-ai/data/registry-update-log.jsonl +16 -0
- package/.sinapse-ai/development/agents/analyst.md +90 -0
- package/.sinapse-ai/development/agents/architect.md +73 -0
- package/.sinapse-ai/development/agents/developer.md +69 -0
- package/.sinapse-ai/development/agents/devops.md +117 -0
- package/.sinapse-ai/development/agents/quality-gate.md +85 -0
- package/.sinapse-ai/development/checklists/agent-quality-gate.md +27 -0
- package/.sinapse-ai/development/checklists/brownfield-compatibility-checklist.md +20 -0
- package/.sinapse-ai/development/checklists/code-review-checklist.md +106 -0
- package/.sinapse-ai/development/checklists/issue-triage-checklist.md +9 -0
- package/.sinapse-ai/development/checklists/memory-audit-checklist.md +16 -0
- package/.sinapse-ai/development/checklists/pr-quality-checklist.md +72 -0
- package/.sinapse-ai/development/checklists/security-deployment-checklist.md +54 -0
- package/.sinapse-ai/development/checklists/self-critique-checklist.md +19 -1
- package/.sinapse-ai/development/skills/debug.md +57 -0
- package/.sinapse-ai/development/skills/fast-review.md +69 -0
- package/.sinapse-ai/development/skills/research-synthesis.md +77 -0
- package/.sinapse-ai/development/skills/security-scan.md +73 -0
- package/.sinapse-ai/development/skills/verify.md +53 -0
- package/.sinapse-ai/development/templates/squad/agent-template.md +17 -4
- package/.sinapse-ai/development/templates/squad/checklist-template.md +13 -5
- package/.sinapse-ai/development/templates/squad/task-template.md +7 -0
- package/.sinapse-ai/development/templates/squad/workflow-template.yaml +7 -0
- package/.sinapse-ai/development/workflows/fast-track.yaml +87 -0
- package/.sinapse-ai/infrastructure/scripts/validate-codex-delegation.js +3 -1
- package/.sinapse-ai/install-manifest.yaml +71 -35
- package/docs/codex-integration-process.md +22 -0
- package/docs/codex-parity-program.md +27 -0
- package/docs/ide-integration.md +36 -0
- package/package.json +1 -1
- package/squads/claude-code-mastery/knowledge-base/claude-code-internals-reference.md +927 -0
- package/squads/squad-brand/knowledge-base/archetype-brand-mapping.md +12 -1
- package/squads/squad-brand/knowledge-base/brand-activism-cultural-branding.md +216 -0
- package/squads/squad-brand/knowledge-base/brand-audit-criteria.md +58 -0
- package/squads/squad-brand/knowledge-base/brand-digital-strategy.md +188 -0
- package/squads/squad-brand/knowledge-base/brand-legal-ip.md +222 -0
- package/squads/squad-brand/knowledge-base/brand-naming-framework.md +163 -0
- package/squads/squad-brand/knowledge-base/branding-master-reference.md +1001 -0
- package/squads/squad-brand/knowledge-base/color-psychology.md +25 -12
- package/squads/squad-brand/knowledge-base/employer-personal-branding.md +206 -0
- package/squads/squad-brand/knowledge-base/routing-catalog.md +34 -0
- package/squads/squad-brand/knowledge-base/sonic-branding-principles.md +6 -1
- package/squads/squad-brand/knowledge-base/typography-personality.md +34 -0
- package/squads/squad-claude/knowledge-base/context-window-optimization.md +334 -0
- package/squads/squad-claude/knowledge-base/knowledge-architecture-reference.md +403 -0
- package/squads/squad-claude/knowledge-base/memory-systems-reference.md +412 -0
- package/squads/squad-claude/knowledge-base/obsidian-claude-integration.md +423 -0
- package/squads/squad-claude/knowledge-base/retrieval-augmented-generation.md +320 -0
- package/squads/squad-claude/knowledge-base/skill-creation-patterns.md +380 -0
- package/squads/squad-claude/knowledge-base/swarm-orchestration-patterns.md +411 -0
- package/squads/squad-cloning/knowledge-base/clone-quality-assurance.md +211 -0
- package/squads/squad-cloning/knowledge-base/confidence-scoring.md +51 -0
- package/squads/squad-cloning/knowledge-base/cross-squad-deployment.md +47 -0
- package/squads/squad-cloning/knowledge-base/ethical-guidelines.md +237 -0
- package/squads/squad-cloning/knowledge-base/knowledge-graph-for-clones.md +295 -0
- package/squads/squad-cloning/knowledge-base/memory-architecture-for-clones.md +229 -0
- package/squads/squad-cloning/knowledge-base/multi-agent-deployment-patterns.md +320 -0
- package/squads/squad-cloning/knowledge-base/skill-standard-for-clones.md +262 -0
- package/squads/squad-cloning/knowledge-base/sop-extraction-guide.md +243 -0
- package/squads/squad-commercial/knowledge-base/account-based-selling.md +206 -0
- package/squads/squad-commercial/knowledge-base/ai-as-competitive-infrastructure.md +14 -0
- package/squads/squad-commercial/knowledge-base/ai-in-sales.md +199 -0
- package/squads/squad-commercial/knowledge-base/brazilian-sales-context.md +195 -0
- package/squads/squad-commercial/knowledge-base/customer-success-operations.md +83 -2
- package/squads/squad-commercial/knowledge-base/prospecting-pipeline-generation.md +69 -0
- package/squads/squad-commercial/knowledge-base/sales-enablement-playbook.md +260 -0
- package/squads/squad-commercial/knowledge-base/sales-methodology-comparison.md +185 -0
- package/squads/squad-commercial/knowledge-base/sales-revenue-master-reference.md +1123 -0
- package/squads/squad-content/knowledge-base/brazilian-content-context.md +176 -0
- package/squads/squad-content/knowledge-base/competitor-analysis-methods.md +40 -1
- package/squads/squad-content/knowledge-base/content-architecture-taxonomy.md +206 -0
- package/squads/squad-content/knowledge-base/content-formats-encyclopedia.md +58 -1
- package/squads/squad-content/knowledge-base/content-references-bibliography.md +130 -0
- package/squads/squad-content/knowledge-base/content-strategy-master-reference.md +1097 -0
- package/squads/squad-content/knowledge-base/content-tech-stack.md +150 -0
- package/squads/squad-content/knowledge-base/copywriting-formulas-library.md +188 -0
- package/squads/squad-content/knowledge-base/email-newsletter-strategy.md +161 -0
- package/squads/squad-content/knowledge-base/platform-algorithm-intelligence.md +86 -1
- package/squads/squad-content/knowledge-base/social-algorithms-master-reference.md +1007 -0
- package/squads/squad-content/knowledge-base/video-audio-content-playbook.md +218 -0
- package/squads/squad-copy/knowledge-base/ai-copy-production.md +254 -0
- package/squads/squad-copy/knowledge-base/brazilian-copywriting-context.md +242 -0
- package/squads/squad-copy/knowledge-base/email-copywriting-system.md +299 -0
- package/squads/squad-copy/knowledge-base/landing-page-copy-architecture.md +267 -0
- package/squads/squad-copy/knowledge-base/power-words-catalog.md +205 -0
- package/squads/squad-copy/knowledge-base/seo-copywriting.md +255 -0
- package/squads/squad-copy/knowledge-base/video-script-copywriting.md +239 -0
- package/squads/squad-council/knowledge-base/brand-strategy-models.md +193 -0
- package/squads/squad-council/knowledge-base/growth-strategy-models.md +267 -0
- package/squads/squad-council/knowledge-base/innovation-disruption-frameworks.md +193 -0
- package/squads/squad-council/knowledge-base/market-analysis-frameworks.md +240 -0
- package/squads/squad-council/knowledge-base/organizational-leadership-models.md +212 -0
- package/squads/squad-council/knowledge-base/sales-strategy-models.md +215 -0
- package/squads/squad-courses/knowledge-base/course-launch-strategy.md +251 -0
- package/squads/squad-courses/knowledge-base/domain-advocacia-curriculum.md +385 -0
- package/squads/squad-courses/knowledge-base/domain-contabilidade-curriculum.md +266 -0
- package/squads/squad-courses/knowledge-base/platform-comparison.md +68 -0
- package/squads/squad-courses/knowledge-base/video-production-guide.md +70 -0
- package/squads/squad-cybersecurity/knowledge-base/cloud-security-reference.md +363 -0
- package/squads/squad-cybersecurity/knowledge-base/compliance-frameworks.md +273 -0
- package/squads/squad-cybersecurity/knowledge-base/database-security.md +438 -0
- package/squads/squad-cybersecurity/knowledge-base/incident-response-playbook.md +420 -0
- package/squads/squad-cybersecurity/knowledge-base/network-security-reference.md +477 -0
- package/squads/squad-cybersecurity/knowledge-base/penetration-testing-methodology.md +350 -0
- package/squads/squad-cybersecurity/knowledge-base/vulnerability-management.md +349 -0
- package/squads/squad-design/knowledge-base/brazilian-design-context.md +223 -0
- package/squads/squad-design/knowledge-base/component-api-patterns.md +208 -4
- package/squads/squad-design/knowledge-base/design-system-master-reference.md +1302 -0
- package/squads/squad-design/knowledge-base/design-systems-frameworks.md +91 -1
- package/squads/squad-design/knowledge-base/responsive-modern-css.md +96 -4
- package/squads/squad-design/knowledge-base/wcag-aria-reference.md +117 -5
- package/squads/squad-design/knowledge-base/web-performance-reference.md +127 -4
- package/squads/squad-finance/knowledge-base/brazilian-taxation.md +263 -0
- package/squads/squad-finance/knowledge-base/contabilidade-master-reference.md +998 -0
- package/squads/squad-finance/knowledge-base/finance-master-reference.md +946 -0
- package/squads/squad-finance/knowledge-base/financial-reporting-analysis.md +316 -0
- package/squads/squad-finance/knowledge-base/fintech-brazilian-context.md +242 -0
- package/squads/squad-finance/knowledge-base/fpa-planning-frameworks.md +286 -0
- package/squads/squad-finance/knowledge-base/ma-and-transactions.md +285 -0
- package/squads/squad-finance/knowledge-base/risk-management.md +233 -0
- package/squads/squad-finance/knowledge-base/startups-venture-capital.md +337 -0
- package/squads/squad-growth/knowledge-base/ai-growth-playbook.md +216 -0
- package/squads/squad-growth/knowledge-base/attribution-models.md +78 -0
- package/squads/squad-growth/knowledge-base/brazilian-growth-context.md +208 -0
- package/squads/squad-growth/knowledge-base/community-led-growth.md +175 -0
- package/squads/squad-growth/knowledge-base/content-marketing-flywheel.md +190 -0
- package/squads/squad-growth/knowledge-base/email-lifecycle-framework.md +192 -0
- package/squads/squad-growth/knowledge-base/growth-frameworks-catalog.md +82 -0
- package/squads/squad-growth/knowledge-base/growth-master-reference.md +1168 -0
- package/squads/squad-growth/knowledge-base/routing-catalog.md +53 -11
- package/squads/squad-paidmedia/knowledge-base/audiences-segmentation-deep.md +285 -0
- package/squads/squad-paidmedia/knowledge-base/creative-strategy-deep.md +294 -0
- package/squads/squad-paidmedia/knowledge-base/google-ads-account-architecture.md +87 -0
- package/squads/squad-paidmedia/knowledge-base/meta-ads-campaign-architecture.md +76 -0
- package/squads/squad-paidmedia/knowledge-base/paid-media-metrics-reference.md +117 -0
- package/squads/squad-paidmedia/knowledge-base/paid-traffic-master-reference.md +1308 -0
- package/squads/squad-paidmedia/knowledge-base/routing-catalog.md +95 -18
- package/squads/squad-paidmedia/knowledge-base/traffic-masters-frameworks.md +71 -0
- package/squads/squad-product/knowledge-base/brazilian-product-context.md +284 -0
- package/squads/squad-product/knowledge-base/discovery-methodology-playbook.md +141 -0
- package/squads/squad-product/knowledge-base/pm-frameworks-reference.md +125 -9
- package/squads/squad-product/knowledge-base/product-analytics-formulas.md +72 -0
- package/squads/squad-product/knowledge-base/product-led-growth-reference.md +155 -13
- package/squads/squad-product/knowledge-base/product-market-fit-framework.md +222 -0
- package/squads/squad-product/knowledge-base/routing-catalog.md +32 -0
- package/squads/squad-research/knowledge-base/agentic-second-brain-reference.md +591 -0
- package/squads/squad-research/knowledge-base/ai-augmented-research.md +212 -0
- package/squads/squad-research/knowledge-base/brazilian-market-research-sources.md +197 -0
- package/squads/squad-research/knowledge-base/community-platforms-reference.md +786 -0
- package/squads/squad-research/knowledge-base/community-research-methods.md +194 -0
- package/squads/squad-research/knowledge-base/mixed-methods-research-design.md +168 -0
- package/squads/squad-research/knowledge-base/network-effects-analysis.md +192 -0
- package/squads/squad-research/knowledge-base/qualitative-research-deep-methods.md +202 -0
- package/squads/squad-research/knowledge-base/quantitative-research-methods.md +208 -0
- package/squads/squad-research/knowledge-base/research-frameworks-encyclopedia.md +40 -0
- package/squads/squad-research/knowledge-base/research-synthesis-frameworks.md +223 -0
- package/squads/squad-storytelling/knowledge-base/brand-mythology-framework.md +236 -0
- package/squads/squad-storytelling/knowledge-base/brazilian-storytelling-context.md +237 -0
- package/squads/squad-storytelling/knowledge-base/data-storytelling.md +232 -0
- package/squads/squad-storytelling/knowledge-base/improv-storytelling.md +226 -0
- package/squads/squad-storytelling/knowledge-base/persuasion-narrative-techniques.md +269 -0
- package/squads/squad-storytelling/knowledge-base/social-movement-narratives.md +191 -0
- package/squads/squad-storytelling/knowledge-base/video-storytelling.md +252 -0
- package/squads/claude-code-mastery/data/swarm-orchestration-patterns.yaml +0 -378
- package/squads/squad-animations/knowledge-base/framer-motion-complete-reference.md +0 -710
- package/squads/squad-animations/knowledge-base/web-animations-api-view-transitions.md +0 -478
@@ -0,0 +1,411 @@
# Swarm Orchestration Patterns

> Multi-agent frameworks comparison, orchestration patterns, and implementation guidance. Based on MS-009 research + Claude Code internals analysis (April 2026).

---

## Framework Landscape (2026)

### 9 Frameworks Compared

| Framework | Architecture | Control Level | Ideal For | Maturity |
|-----------|-------------|---------------|-----------|----------|
| **LangGraph** | State machine with directed graph | Maximum (nodes, edges, conditional routing) | Enterprise production, complex flows | High |
| **CrewAI** | Role-playing + task delegation | Medium (roles, tasks, SOPs) | Rapid prototyping, conceptual teams | High |
| **Claude Agent SDK** | Claude-native with tool use | Medium | Claude ecosystem, Sonnet 4.5/4.6 | High |
| **OpenAI Agents SDK** | Agents with handoffs + guardrails | Medium | OpenAI ecosystem | High |
| **Google ADK** | Agent Development Kit | Medium | Google/Gemini ecosystem | High |
| **Microsoft Agent Framework** | AutoGen + Semantic Kernel unified | Medium-High | Enterprise Microsoft stack | New (2026) |
| **AG2 (AutoGen fork)** | Multi-agent conversation | Low-Medium | Open-source community | Active |
| **deepagents (LangChain)** | Batteries-included harness | Medium | Long-horizon tasks, LangChain users | Growing |
| **Ruflo** | 60+ agent swarm with MCP | N/A | Claude Code native | New |

**Note (April 2026):** Microsoft retired AutoGen in favor of the new Microsoft Agent Framework, which unifies AutoGen and Semantic Kernel. AutoGen is maintenance-only. Community fork: AG2 (ag2ai/ag2).

---

## Claude Code Multi-Agent Architecture

### 3 Execution Models

#### 1. Fork Model (Experimental)

Subagents are created with **byte-identical context copies** for cache sharing.

**Key innovation:** Fork children use identical placeholder text for each `tool_result` block, guaranteeing prefix-identical prompts across all workers. "Spawning five forked agents costs barely more than 1."

```
Parent Agent (with full context)
  → forkSubagent() × N
  → Children: identical system prompt prefix + unique task
  → KV cache shared across all children
  → Cost: ~1.0x base (not N×)
```
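
The prefix-sharing idea can be illustrated with a small sketch. Everything here is hypothetical: `build_fork_prompts`, the `PLACEHOLDER` text, and the block layout are invented for illustration and are not taken from Claude Code. The sketch only shows why replacing each `tool_result` body with the same placeholder makes every child prompt share one identical prefix.

```python
import os

PLACEHOLDER = "[tool_result omitted]"  # hypothetical placeholder text


def build_fork_prompts(system_prompt: str, history: list, tasks: list) -> list:
    """Build one prompt per child: shared prefix plus a unique task suffix."""
    # Replace every tool_result body with the same placeholder so the
    # prefix is byte-identical for all children.
    shared = [
        PLACEHOLDER if block["type"] == "tool_result" else block["text"]
        for block in history
    ]
    prefix = "\n".join([system_prompt, *shared])
    return [f"{prefix}\nTASK: {task}" for task in tasks]


history = [
    {"type": "text", "text": "User asked for an auth refactor."},
    {"type": "tool_result", "text": "<4,000 lines of grep output>"},
]
prompts = build_fork_prompts("You are a worker.", history, ["auth", "ui", "tests"])

# Length of the part shared by all children, then each child's unique tail
shared_len = len(os.path.commonprefix(prompts))
print(shared_len, [len(p) - shared_len for p in prompts])
```

Only the short task suffix differs between children, which is the property a prefix cache needs in order to be reused.
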

#### 2. Teammate Model

Independent agents with separate context windows.

**Communication:** Task files on disk + `SendMessageTool`, with **no shared memory**.

```jsonc
// SendMessageTool targets
{ "target": "teammate-name" }   // written to mailbox
{ "target": "*" }               // broadcast to all
{ "target": "team-lead" }       // shutdown approve/reject
{ "target": "uds:/path" }       // Unix domain socket
```
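
The four target forms listed above could be told apart with a small dispatcher. This is a hypothetical sketch: `classify_target` and its return labels are names made up here, not Claude Code identifiers, and the real tool may resolve targets differently.

```python
def classify_target(target: str) -> str:
    """Map a SendMessageTool-style target string to a delivery kind.

    The labels returned here are illustrative assumptions.
    """
    if target == "*":
        return "broadcast"
    if target.startswith("uds:"):
        return "unix_socket"
    if target == "team-lead":
        return "team_lead"
    return "mailbox"  # any other name is treated as a teammate mailbox


for t in ["teammate-name", "*", "team-lead", "uds:/path"]:
    print(t, "->", classify_target(t))
```
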

#### 3. Worktree Model

Git worktree isolation for parallel implementation.

```bash
# Each agent gets isolated git worktree
git worktree add .worktrees/agent-auth -b agent/auth
git worktree add .worktrees/agent-ui -b agent/ui

# Agents work independently
# No merge conflicts during parallel work
# Human reviews and merges when both complete
```

---

## Claude Code Coordinator Mode

### 4-Phase Pattern

```
Phase 1: Research (parallel)
  Workers investigate different parts of codebase simultaneously
  Each worker returns compact findings report

Phase 2: Synthesis
  Coordinator reads all findings
  Crafts detailed implementation specs

Phase 3: Implementation (parallel)
  Workers execute per-spec changes in isolated worktrees
  Specs ensure no file conflicts

Phase 4: Verification
  Separate testing workers validate changes
  Report back with PASS/FAIL verdicts
```

**Coordinator instructions (from source):**
- "Parallelism is your superpower"
- "Do not rubber-stamp weak work"

### Coordinator Mode Implementation

Implemented via system prompt, not code:

```markdown
## Coordinator Instructions

You are orchestrating a team of worker agents.

Your responsibilities:
1. Decompose the task into independent units
2. Assign each unit to a worker with clear specs
3. Wait for all workers to complete
4. Synthesize results; do not rubber-stamp weak work
5. If a worker fails, diagnose and re-assign

Rules:
- Workers work in isolated git worktrees
- Workers communicate via SendMessageTool only
- You manage the overall task state
- Parallelism is your superpower
```
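
Although coordinator mode is described as prompt-driven, the shape of the four phases can be sketched in code to show where the parallelism sits. This is a minimal illustration using `asyncio`; the `research`, `implement`, `verify`, and `coordinate` functions are hypothetical stand-ins for agent calls, not anything from Claude Code.

```python
import asyncio


async def research(area: str) -> str:
    # Stand-in for a worker investigating one part of the codebase
    await asyncio.sleep(0)
    return f"findings:{area}"


async def implement(spec: str) -> str:
    # Stand-in for a worker applying one spec in its own worktree
    await asyncio.sleep(0)
    return f"done:{spec}"


async def verify(change: str) -> str:
    # Stand-in for a separate testing worker
    await asyncio.sleep(0)
    return "PASS" if change.startswith("done:") else "FAIL"


async def coordinate(areas: list) -> list:
    # Phase 1: research runs in parallel
    findings = await asyncio.gather(*(research(a) for a in areas))
    # Phase 2: synthesis is done by the coordinator itself, one spec per finding
    specs = [f"spec({f})" for f in findings]
    # Phase 3: implementation runs in parallel
    changes = await asyncio.gather(*(implement(s) for s in specs))
    # Phase 4: verification by separate workers
    return list(await asyncio.gather(*(verify(c) for c in changes)))


verdicts = asyncio.run(coordinate(["auth", "ui"]))
print(verdicts)
```

Phases 1, 3, and 4 fan out with `asyncio.gather`, while phase 2 stays sequential because it depends on every finding being available.
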

---

## Reasoning Patterns for Agents

### 5 Core Patterns

| Pattern | Description | Best For |
|---------|-------------|---------|
| **ReAct** | Reason + Act in a loop: think → execute → observe | Tasks with tools (search, edit) |
| **Chain of Thought (CoT)** | Linear step-by-step reasoning | Sequential problems, math |
| **Tree of Thought (ToT)** | Explore multiple reasoning paths as a tree | Problems with multiple solutions |
| **Graph of Thought (GoT)** | Reasoning as a graph; thoughts can be merged and refined | Complex synthesis from multiple sources |
| **Reflection** | Agent evaluates and critiques its own output | Quality improvement, self-correction |

### ReAct Pattern (Production Standard)

```
[Thought]: I need to find all TypeScript files that import from the auth module
[Action]: Grep(pattern="from.*auth", glob="**/*.ts")
[Observation]: Found 23 files
[Thought]: Now I need to check whether any use the deprecated method
[Action]: Grep(pattern="useDeprecatedAuth", files=<previous results>)
[Observation]: 5 files use the deprecated method
[Thought]: I have a complete picture. I should create a migration plan.
[Response]: Here is the migration plan for the 5 affected files...
```
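
The trace above follows a think, act, observe loop. A minimal sketch of that loop is shown below, assuming a scripted `policy` function in place of a model and an in-memory `FILES` dictionary in place of a real codebase; all names here are hypothetical.

```python
import re
from typing import Callable

# Hypothetical in-memory "codebase" standing in for real files
FILES = {
    "a.ts": "import { login } from './auth'\nuseDeprecatedAuth()",
    "b.ts": "import { login } from './auth'",
    "c.ts": "export const x = 1",
}


def grep(pattern: str, names: list) -> list:
    """Return the names of files whose content matches the regex."""
    return [n for n in names if re.search(pattern, FILES[n])]


def react(policy: Callable, max_steps: int = 5) -> str:
    """Run think -> act -> observe until the policy emits a response."""
    history: list = []
    for _ in range(max_steps):
        kind, payload = policy(history)   # Thought: decide the next step
        if kind == "respond":
            return payload
        observation = payload()           # Action: run the chosen tool
        history.append(observation)       # Observation: feed the result back
    return "stopped: step budget exhausted"


def policy(history: list) -> tuple:
    # Scripted stand-in for the model's reasoning
    if len(history) == 0:
        return "act", lambda: grep(r"from.*auth", list(FILES))
    if len(history) == 1:
        return "act", lambda: grep(r"useDeprecatedAuth", history[0])
    return "respond", f"Migrate {len(history[1])} file(s): {history[1]}"


print(react(policy))
```

The `max_steps` cap is an addition for safety in this sketch, so that a policy that never responds cannot loop forever.
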

---

## Agentic Architecture for Second Brain

### 6-Agent Specialization Model

```
[Main Orchestrator]
    |
    |-- [Capture Agent]     → Monitors sources, ingests content
    |-- [Curation Agent]    → Connects, tags, classifies notes
    |-- [Research Agent]    → Searches, navigates, discovers
    |-- [Synthesis Agent]   → Summarizes, combines, produces
    |-- [Quality Agent]     → Validates, scores, suggests improvements
    |-- [Maintenance Agent] → Detects decay, archives, cleans up
```

**For SINAPSE squads, this maps to:**
```
[sinapse-orqx / squad *-orqx]
    |-- @developer     → implementation tasks
    |-- @quality-gate  → validation tasks
    |-- @architect     → design tasks
    |-- @analyst       → research tasks
```

---

## Task Decomposition Patterns

### Pattern 1: Independence Decomposition

**Goal:** Minimize coordination overhead by creating truly independent subtasks.

```python
# Pseudocode: overlaps_with_others, ParallelTask, and SequentialTask
# are placeholders that are not defined in this document.
def decompose_for_parallelism(epic):
    subtasks = []
    for story in epic.stories:
        # Check: does this story touch files other stories touch?
        if not overlaps_with_others(story, epic.stories):
            subtasks.append(ParallelTask(story))
        else:
            subtasks.append(SequentialTask(story))
    return subtasks
```

**Claude Code rule:** "If two agents need to communicate more than 3 times during a task, they should probably be one agent."
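
A runnable variant of the independence check might look like the following. It assumes each story declares the files it expects to touch; the `Story` dataclass and the dictionary returned by `decompose_for_parallelism` are illustrative choices, not SINAPSE structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    name: str
    files: frozenset  # paths this story is expected to touch


def overlaps_with_others(story: Story, stories: list) -> bool:
    """True when any other story touches at least one of the same files."""
    return any(s is not story and s.files & story.files for s in stories)


def decompose_for_parallelism(stories: list) -> dict:
    """Split stories into parallel-safe and sequential groups."""
    plan = {"parallel": [], "sequential": []}
    for story in stories:
        key = "sequential" if overlaps_with_others(story, stories) else "parallel"
        plan[key].append(story.name)
    return plan


stories = [
    Story("auth", frozenset({"src/auth.ts"})),
    Story("ui", frozenset({"src/ui.tsx", "src/theme.ts"})),
    Story("theme", frozenset({"src/theme.ts"})),
]
print(decompose_for_parallelism(stories))
```

Here `ui` and `theme` both touch `src/theme.ts`, so they are queued sequentially while `auth` can run in parallel.
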

### Pattern 2: Pipeline Chain

Sequential processing where each stage transforms the output of the previous one:

```
Stage 1: Analyze (subagent)
  → produces: analysis.json (compact artifact)
  ↓
Stage 2: Design (subagent)
  → receives: analysis.json
  → produces: design.md (compact artifact)
  ↓
Stage 3: Implement (subagent)
  → receives: design.md
  → produces: code changes
  ↓
Stage 4: Review (subagent)
  → receives: code changes
  → produces: review verdict
```

**Key principle:** Each stage produces a **compact artifact** as input to the next. No raw context is carried forward.
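
The compact-artifact principle can be sketched as a chain of functions in which each stage receives only the previous stage's output. The stage bodies below are toy placeholders and the artifact shapes are assumptions made for illustration.

```python
import json


def analyze(request: str) -> str:
    # Stage 1: emit a compact JSON artifact, not the full working context
    return json.dumps({"goal": request, "modules": ["auth"]})


def design(analysis_json: str) -> str:
    # Stage 2: reads only the analysis artifact
    analysis = json.loads(analysis_json)
    return f"# Design\nChange modules: {', '.join(analysis['modules'])}"


def implement(design_md: str) -> list:
    # Stage 3: reads only the design artifact
    modules = design_md.split(": ", 1)[1].split(", ")
    return [f"src/{m}.ts" for m in modules]


def review(changed_files: list) -> str:
    # Stage 4: reads only the list of changes
    return "PASS" if changed_files else "FAIL"


artifact = "migrate login flow"
for stage in (analyze, design, implement, review):
    artifact = stage(artifact)  # each stage sees only the previous artifact
print(artifact)
```

Because no stage has access to anything but its input artifact, the amount of context carried between agents stays bounded by the artifact size.
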

### Pattern 3: Specialist Routing

```
Orchestrator classifies task
  → Database work → @data-engineer agent
  → UI work → @ux-design-expert agent
  → API work → @developer agent
  → Testing → @quality-gate agent
```

Each specialist has domain-specific instructions and tool permissions focused on its area.
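
One simple way to approximate the classification step is keyword matching. The sketch below is an assumption-heavy illustration: the agent names come from the routing diagram above, but the keyword lists, the `route` function, and the fallback to `@sinapse-orqx` are invented here, and a real orchestrator would more likely classify with a model.

```python
# Keyword lists are illustrative assumptions; agent names come from the diagram
ROUTES = [
    ("@data-engineer", ("database", "migration", "sql", "schema")),
    ("@ux-design-expert", ("ui", "layout", "component", "css")),
    ("@quality-gate", ("test", "coverage", "regression")),
    ("@developer", ("api", "endpoint", "handler")),
]


def route(task: str, default: str = "@sinapse-orqx") -> str:
    """Return the first specialist whose keywords appear in the task text."""
    words = set(task.lower().replace("-", " ").split())
    for agent, keywords in ROUTES:
        if words & set(keywords):
            return agent
    return default  # unclassified work stays with the orchestrator


print(route("Add SQL migration for users table"))
print(route("Fix CSS layout on login page"))
```
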

### Pattern 4: Critic-Generator Loop

```
Generator Agent → produces initial output
  → Critic Agent → evaluates output
    → If APPROVED: done
    → If NEEDS_REVISION: Generator revises (max N iterations)
    → If BLOCKED: Escalate to human
```

Used in SINAPSE for story validation (sprint-lead creates → product-lead validates → developer implements → quality-gate reviews).
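
The loop can be sketched as follows. The verdict strings come from the diagram above; the function names, the toy generator and critic, and the choice to escalate when the iteration cap is reached are assumptions made for this example.

```python
from typing import Callable


def critic_generator_loop(
    generate: Callable,
    critique: Callable,
    max_iterations: int = 3,
) -> tuple:
    """Alternate generator and critic until a terminal verdict or the cap."""
    feedback = ""
    draft = ""
    for attempt in range(1, max_iterations + 1):
        draft = generate(feedback)
        verdict, feedback = critique(draft)
        if verdict == "APPROVED":
            return "APPROVED", draft, attempt
        if verdict == "BLOCKED":
            return "ESCALATE_TO_HUMAN", draft, attempt
        # NEEDS_REVISION: loop again with the critic's feedback
    # Assumption: running out of iterations also escalates
    return "ESCALATE_TO_HUMAN", draft, max_iterations


# Toy generator and critic: the critic wants acceptance criteria in the story
def generate(feedback: str) -> str:
    return "story + acceptance criteria" if feedback else "story"


def critique(draft: str) -> tuple:
    if "acceptance criteria" in draft:
        return "APPROVED", ""
    return "NEEDS_REVISION", "add acceptance criteria"


print(critic_generator_loop(generate, critique))
```
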

---

## Parallel Execution Guidelines

### Right-Sizing Agent Teams

| Task Complexity | Team Size | Pattern |
|----------------|-----------|---------|
| Simple (< 5 files) | 1 agent | No team needed |
| Medium (5-15 files) | 2-3 agents | Implement + review |
| Large (15-50 files) | 3-5 agents | Parallel workers + orchestrator |
| Epic (50+ files) | 5-8 agents | Full team topology |

**Over-decomposition creates more coordination overhead than parallelism saves.**
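
The sizing table can be turned into a lookup helper. This is a sketch only: `team_size` is a hypothetical name, and because the table's ranges overlap at 15 and 50 files, the boundary handling below (15 counts as Medium, 50 as Epic) is an assumption.

```python
def team_size(file_count: int) -> tuple:
    """Return (tier, min_agents, max_agents) for an estimated file count.

    Boundary handling is an assumption: 15 files is treated as Medium
    and 50 files as Epic.
    """
    if file_count < 5:
        return ("Simple", 1, 1)
    if file_count <= 15:
        return ("Medium", 2, 3)
    if file_count < 50:
        return ("Large", 3, 5)
    return ("Epic", 5, 8)


for n in (3, 12, 30, 80):
    print(n, team_size(n))
```
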

### Concurrency Safety Rules

```python
# From Claude Code source (shown here as Python-style pseudocode)
class ConcurrencyModel:
    # Tools default to non-concurrent (assume state mutation)
    isConcurrencySafe = False

    # Only mark concurrent-safe when PROVEN independent
    # Examples of safe: Read(different files), Grep, Glob
    # Examples of unsafe: Edit(same file), sequential git operations
```

**CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY** = 10 (default cap).
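
To show how a default-unsafe flag and a concurrency cap interact, here is a hypothetical scheduler sketch. `ToolCall` and `plan_batches` are invented for this example; the actual scheduling logic in Claude Code is not shown in this document and may differ.

```python
from dataclasses import dataclass

MAX_CONCURRENCY = 10  # mirrors the default cap quoted above


@dataclass
class ToolCall:
    name: str
    concurrency_safe: bool = False  # default: assume the call mutates state


def plan_batches(calls: list, cap: int = MAX_CONCURRENCY) -> list:
    """Group consecutive safe calls into batches of at most `cap`.

    Every unsafe call runs alone, in its original position.
    """
    batches, current = [], []
    for call in calls:
        if not call.concurrency_safe:
            if current:
                batches.append(current)
                current = []
            batches.append([call.name])
        else:
            if len(current) == cap:
                batches.append(current)
                current = []
            current.append(call.name)
    if current:
        batches.append(current)
    return batches


calls = [
    ToolCall("Read a.ts", True),
    ToolCall("Grep", True),
    ToolCall("Edit a.ts"),
    ToolCall("Read b.ts", True),
]
print(plan_batches(calls))
```

Ordering is preserved, so a read that follows an edit never runs before that edit completes.
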

---

## Swarm Communication Protocols

### Event-Driven Communication

```
Agent A completes task
  → Writes result to shared task file
  → Broadcasts via SendMessageTool("*", "task_complete")
  → Orchestrator receives notification
  → Orchestrator routes to next stage
```

### Mailbox Pattern

```python
def SendMessageTool(target, message):
    """Stand-in so the calls below run; the real tool is invoked by the agent runtime."""
    print(f"-> {target}: {message['type']}")


# Worker sends to orchestrator
SendMessageTool(target="orchestrator", message={
    "type": "result",
    "task_id": "auth-module",
    "status": "complete",
    "artifacts": ["src/auth/", "tests/auth/"],
    "verdict": "PASS",
    "notes": "Found 2 minor issues, both fixed"
})

# Worker requests permission for dangerous op
SendMessageTool(target="orchestrator", message={
    "type": "permission_request",
    "operation": "delete_legacy_files",
    "files": ["src/old-auth.ts", "src/auth-v1.ts"],
    "justification": "Replaced by new implementation"
})
```

### Permission Queue (Coordinator Mode)

Workers request authorization for dangerous operations via a queue:
1. Worker sends permission request to coordinator
2. Coordinator evaluates (or escalates to human)
3. Coordinator responds: APPROVE / DENY / MODIFY
4. Worker proceeds or adjusts plan

**Atomic Claim Mechanism:** `createResolveOnce` prevents duplicate handling of the same request.
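
The idea behind an atomic claim can be illustrated with a Python analogue. Only the name `createResolveOnce` comes from the text above; the `create_resolve_once` function below and its lock-based behavior are assumptions about what such a helper does, not a port of the original.

```python
import threading


def create_resolve_once():
    """Return a claim() function that succeeds for exactly one caller."""
    lock = threading.Lock()
    claimed = False

    def claim() -> bool:
        nonlocal claimed
        with lock:
            if claimed:
                return False  # someone else already handles this request
            claimed = True
            return True

    return claim


claim = create_resolve_once()
results = []
threads = [threading.Thread(target=lambda: results.append(claim())) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Eight handlers raced for the same request; count how many won the claim
print(results.count(True))
```

Each permission request would get its own `claim` function, so that a coordinator and a human approver cannot both resolve the same request.
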

---

## BMAD Method Patterns

BMAD (Breakthrough Method for Agile AI-Driven Development) v6 provides patterns SINAPSE can adopt.

### Document Sharding

Large documents are split into focused pieces:
- Standard: ~5,000 tokens per document
- Sharded: ~300 tokens per shard
- **74-90% token consumption reduction**

**Applied to SINAPSE PRDs:** Instead of one large PRD.md, shard into:
- `prd-overview.md` (~300 tokens)
- `prd-requirements-fr.md` (~300 tokens)
- `prd-requirements-nfr.md` (~300 tokens)
- `prd-constraints.md` (~300 tokens)
- `prd-acceptance-criteria.md` (~300 tokens)

Agents load only the shards they need.
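
The savings depend on how many shards an agent loads. The arithmetic below uses the two round figures from the list above (5,000 tokens per document, 300 per shard); `reduction` is a helper defined only for this example.

```python
DOCUMENT_TOKENS = 5_000  # "Standard" figure from the list above
SHARD_TOKENS = 300       # "Sharded" figure from the list above


def reduction(shards_loaded: int) -> float:
    """Fraction of tokens saved versus loading the whole document."""
    return 1 - (shards_loaded * SHARD_TOKENS) / DOCUMENT_TOKENS


for n in range(1, 6):
    print(f"{n} shard(s): {reduction(n):.0%} fewer tokens")
```

Under these numbers, loading two to four shards saves between 88% and 76%, roughly in line with the quoted 74-90% band. A single shard saves 94%, and loading all five saves 70%.
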

### Party Mode

Multiple agent personas collaborate within a single session: relevant context is shared, and irrelevant context is excluded per persona.

**SINAPSE equivalent:** Each `@agent` activation follows the handoff protocol, compacting the previous agent's context to ~379 tokens.

---

## Guardrails and Governance

### Authority Matrix (SINAPSE)

| Agent | Can Write | Read-Only | Blocked |
|-------|-----------|-----------|---------|
| @developer | `packages/`, `src/`, stories (checkboxes) | Everything | `git push`, `gh pr` |
| @quality-gate | `tests/`, review files | Source code | Writes to `src/` |
| @architect | `docs/architecture/` | System-wide | Application code |
| @devops | Remote operations | Everything | None |
| @sinapse-orqx | Everything | Everything | None |
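
A write check against this matrix could be sketched with path prefixes. The mapping below transcribes only the path-like entries from the "Can Write" column; `CAN_WRITE` and `can_write` are hypothetical names, and how SINAPSE actually enforces the matrix is not described here.

```python
# Write scopes transcribed from the table; "*" means everything.
# Non-path entries (stories, review files, remote operations) are omitted.
CAN_WRITE = {
    "@developer": ("packages/", "src/"),
    "@quality-gate": ("tests/",),
    "@architect": ("docs/architecture/",),
    "@sinapse-orqx": ("*",),
}


def can_write(agent: str, path: str) -> bool:
    """True when the path falls under one of the agent's write prefixes."""
    scopes = CAN_WRITE.get(agent, ())
    return "*" in scopes or any(path.startswith(prefix) for prefix in scopes)


print(can_write("@developer", "src/auth/login.ts"))
print(can_write("@quality-gate", "src/auth/login.ts"))
```

Unknown agents get an empty scope, so the check denies by default.
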

### Escalation Protocol

```
Agent cannot complete task
  → Escalate to @sinapse-orqx

Quality gate fails
  → Return to @developer with specific feedback

Constitutional violation detected
  → BLOCK, require fix before proceeding

Agent boundary conflict
  → @sinapse-orqx mediates
```

### Infinite Loop Prevention

```python
# Maximum iterations per loop type
MAX_ITERATIONS = {
    "qa_loop": 5,
    "reflection_loop": 3,
    "research_loop": 10,
    "synthesis_loop": 3,
}


class EscalateToHuman(Exception):
    """Raised when a loop exhausts its iteration budget."""


def should_stop(loop: str, iteration: int, delta_tokens: int) -> bool:
    """Return True to stop early; raise once the budget is exceeded."""
    # Break condition 1: hard cap per loop type
    if iteration > MAX_ITERATIONS[loop]:
        raise EscalateToHuman("Max iterations reached")
    # Break condition 2: diminishing returns (little new output after 3+ passes)
    return delta_tokens < 500 and iteration > 3
```

---

## Anti-Patterns

| Anti-pattern | Problem | Fix |
|-------------|---------|-----|
| Agent proliferation | Too many agents without clear coordination | One agent per concern, max 8 per epic |
| Infinite loops | Agents calling each other without stop condition | Max iterations + break conditions |
| Authority confusion | Multiple agents with authority over same resource | Clear ownership matrix |
| Skill bloat | Too many overlapping skills | Audit for duplicates quarterly |
| Over-parallelization | Coordination overhead > parallelism savings | Right-size teams (see table above) |
| Tight coupling | Agents sharing mutable state | Communication via immutable artifacts only |
| Bypassing orchestrator | User → specialist directly (multi-squad work) | Always route through orchestrator |
@@ -0,0 +1,211 @@
|
|
|
1
|
+
# Clone Quality Assurance: Cognitive Fidelity Validation
|
|
2
|
+
|
|
3
|
+
> Methodology for validating that a generated clone truly represents
|
|
4
|
+
> the original's thinking, not a polished generic version.
|
|
5
|
+
> Includes: fidelity scoring, Turing-like validation, and failure detection.
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Why Clones Need Their Own QA
|
|
10
|
+
|
|
11
|
+
Software QA checks whether the code does what was specified.
|
|
12
|
+
Clone QA checks something subtler: whether the agent **thinks** like the original person,
|
|
13
|
+
not just whether it **sounds** like them.
|
|
14
|
+
|
|
15
|
+
The critical difference:
|
|
16
|
+
- **Sounds like:** Right vocabulary, right tone, familiar phrases
|
|
17
|
+
- **Thinks like:** Same decisions, same heuristics, same values in conflict
|
|
18
|
+
|
|
19
|
+
A clone can score 100% on "sounds like" and 40% on "thinks like";
|
|
20
|
+
that is a surface clone, not a cognitive clone.
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## The 4 Fidelity Dimensions
|
|
25
|
+
|
|
26
|
+
| Dimension | What it evaluates | Weight in overall score |
|
|
27
|
+
|----------|-------------|-------------------|
|
|
28
|
+
| **Cognitive** | Correct heuristics and decisions | 35% |
|
|
29
|
+
| **Communicative** | Tone, vocabulary, style | 25% |
|
|
30
|
+
| **Procedural** | Workflows and methodologies | 25% |
|
|
31
|
+
| **Boundary** | Boundaries and failure modes | 15% |
|
|
32
|
+
|
|
33
|
+
**Formula:**
|
|
34
|
+
```
|
|
35
|
+
Fidelity Score = (cognitive × 0.35) + (communicative × 0.25) + (procedural × 0.25) + (boundary × 0.15)
|
|
36
|
+
```
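
As a worked instance of the formula, here is a minimal sketch assuming each dimension is scored on a 0-100 scale. The function name and the English dictionary keys are illustrative labels for the four dimensions, not identifiers from the package.

```python
WEIGHTS = {
    "cognitive": 0.35,
    "communicative": 0.25,
    "procedural": 0.25,
    "boundary": 0.15,
}


def fidelity_score(scores: dict[str, float]) -> float:
    """Weighted sum of the four dimension scores (each assumed 0-100)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())


# A clone that "sounds" perfect but "thinks" poorly (made-up numbers).
surface_clone = {"cognitive": 40, "communicative": 100,
                 "procedural": 70, "boundary": 70}
print(round(fidelity_score(surface_clone), 2))  # 67.0
```

Because the cognitive dimension carries the largest weight, a perfect communicative score cannot compensate for weak decisions in this example.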
|
|
37
|
+
|
|
38
|
+
---
|
|
39
|
+
|
|
40
|
+
## Cognitive Validation: Decision Test
|
|
41
|
+
|
|
42
|
+
### Benchmark Decision Protocol
|
|
43
|
+
|
|
44
|
+
1. **Select 5-10 decision scenarios** that the original person actually faced (documented in sources)
|
|
45
|
+
2. **Record the original decision** (what the person actually did or said)
|
|
46
|
+
3. **Present the scenario to the clone** without revealing what the decision was
|
|
47
|
+
4. **Compare the clone's response** with the original's documented decision
|
|
48
|
+
5. **Compute the alignment rate**
|
|
49
|
+
|
|
50
|
+
```
|
|
51
|
+
Cognitive Score = (aligned decisions / total scenarios) × 100
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
### Alignment Criteria
|
|
55
|
+
|
|
56
|
+
| Alignment | Criterion | Score |
|
|
57
|
+
|------------|---------|-------|
|
|
58
|
+
| **Full** | Clone reaches the same decision using similar reasoning | 1.0 |
|
|
59
|
+
| **Partial** | Clone reaches a similar decision but with different reasoning | 0.6 |
|
|
60
|
+
| **Superficial** | Same conclusion, wrong reasoning | 0.3 |
|
|
61
|
+
| **Divergent** | Conclusion differs from the documented original | 0.0 |
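
The table assigns graded values, while the formula counts "aligned decisions". One plausible reading, sketched below as an assumption rather than the documented method, is to average the per-scenario values. The rating labels and function name are illustrative.

```python
ALIGNMENT_VALUES = {
    "full": 1.0,
    "partial": 0.6,
    "superficial": 0.3,
    "divergent": 0.0,
}


def cognitive_score(ratings: list[str]) -> float:
    """Average alignment value across benchmark scenarios, as a percentage.

    Assumption: partial credit is averaged; the source formula does not
    say how graded alignment values are combined.
    """
    if not ratings:
        raise ValueError("at least one benchmark scenario is required")
    total = sum(ALIGNMENT_VALUES[rating] for rating in ratings)
    return total / len(ratings) * 100


# Five scenarios rated by a reviewer (made-up ratings).
ratings = ["full", "partial", "superficial", "divergent", "full"]
print(round(cognitive_score(ratings), 1))  # 58.0
```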
|
|
62
|
+
|
|
63
|
+
### Example Benchmark Scenarios
|
|
64
|
+
|
|
65
|
+
For a clone of a marketing specialist:
|
|
66
|
+
- "You have $50K to launch a product. How do you allocate it?"
|
|
67
|
+
- "A client wants to pause ads for 30 days. Do you recommend it?"
|
|
68
|
+
- "Which channel would you prioritize for a B2B launch vs B2C?"
|
|
69
|
+
|
|
70
|
+
Compare each response with what the original said in documented sources.
|
|
71
|
+
|
|
72
|
+
---
|
|
73
|
+
|
|
74
|
+
## Communicative Validation: Turing-Like Test
|
|
75
|
+
|
|
76
|
+
### Blind Comparison Protocol
|
|
77
|
+
|
|
78
|
+
1. Generate 10 clone responses to open-ended questions in the domain
|
|
79
|
+
2. Collect 10 real responses from the person (from documented sources, on roughly the same questions)
|
|
80
|
+
3. Shuffle the 20 responses
|
|
81
|
+
4. Ask 3 evaluators to identify which are from the original vs the clone
|
|
82
|
+
5. **Target confusion rate:** >= 40% (evaluators should frequently confuse the two)
|
|
83
|
+
|
|
84
|
+
**Interpretation:**
|
|
85
|
+
- Confusion >= 60%: Clone with high communicative fidelity
|
|
86
|
+
- Confusion 40-59%: Acceptable clone
|
|
87
|
+
- Confusion 20-39%: Surface clone; review L4
|
|
88
|
+
- Confusion < 20%: Clone failed; reprocess L4
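
The protocol does not define how the confusion rate is computed. A minimal sketch follows, assuming it is the share of evaluator judgments that name the wrong source; the function names and the `(truth, guess)` pair format are illustrative.

```python
def confusion_rate(judgments: list[tuple[str, str]]) -> float:
    """Percentage of judgments where the evaluator named the wrong source.

    Each judgment is a (true_source, guessed_source) pair whose values are
    "original" or "clone". This definition is an assumption.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    wrong = sum(1 for truth, guess in judgments if truth != guess)
    return 100 * wrong / len(judgments)


def interpret_confusion(rate: float) -> str:
    """Map a confusion rate onto the bands listed above."""
    if rate >= 60:
        return "high communicative fidelity"
    if rate >= 40:
        return "acceptable"
    if rate >= 20:
        return "surface clone: review L4"
    return "clone failed: reprocess L4"


# 3 evaluators x 20 responses = 60 judgments, 27 of them wrong (made up).
judgments = [("clone", "original")] * 27 + [("clone", "clone")] * 33
rate = confusion_rate(judgments)
print(rate, interpret_confusion(rate))  # 45.0 acceptable
```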
|
|
89
|
+
|
|
90
|
+
### Tone Validation Checklist
|
|
91
|
+
|
|
92
|
+
- [ ] Matching formality level (1-5)
|
|
93
|
+
- [ ] Analogies used in the same style
|
|
94
|
+
- [ ] Similar average response length
|
|
95
|
+
- [ ] Characteristic vocabulary present (>= 70% of keywords)
|
|
96
|
+
- [ ] Response openings follow the same pattern
|
|
97
|
+
- [ ] Closings/CTAs follow the same pattern
|
|
98
|
+
- [ ] Data vs stories used in the correct proportion
|
|
99
|
+
|
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## Procedural Validation: Workflow Compliance
|
|
103
|
+
|
|
104
|
+
### Workflow Trace Protocol
|
|
105
|
+
|
|
106
|
+
For each documented workflow (L3):
|
|
107
|
+
|
|
108
|
+
1. **Give a task** that should trigger the workflow
|
|
109
|
+
2. **Observe whether the clone follows the steps** in the correct order
|
|
110
|
+
3. **Check whether the intermediate outputs** match what is documented
|
|
111
|
+
4. **Check whether the final output** matches what is expected
|
|
112
|
+
|
|
113
|
+
```
|
|
114
|
+
Procedural Score = (workflows followed correctly / total tested) × 100
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
**Red flag:** Clone skips steps. This may indicate that the workflow was extracted incompletely
|
|
118
|
+
or that the heuristic for when to use the workflow was not mapped correctly.
|
|
119
|
+
|
|
120
|
+
---
|
|
121
|
+
|
|
122
|
+
## Boundary Validation: Boundary Testing
|
|
123
|
+
|
|
124
|
+
### Boundary Protocol
|
|
125
|
+
|
|
126
|
+
1. **Test requests outside the original's domain** (should refuse or redirect)
|
|
127
|
+
2. **Test topics the original avoids** (should replicate the avoidance behavior)
|
|
128
|
+
3. **Test documented areas of uncertainty** (should admit it, not make things up)
|
|
129
|
+
|
|
130
|
+
```
|
|
131
|
+
Boundary Score = (responses within boundaries / total tests) × 100
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
### Boundary Test Cases
|
|
135
|
+
|
|
136
|
+
| Scenario | Expected Behavior | Pass/Fail |
|
|
137
|
+
|---------|----------------------|-----------|
|
|
138
|
+
| Out-of-domain question | Redirect or explicit refusal | |
|
|
139
|
+
| Controversial topic the original avoids | Neutrality or "not my area" | |
|
|
140
|
+
| Request for advice in an area where the original admitted ignorance | Admits the limitation | |
|
|
141
|
+
| Question that contradicts documented values | Coherent refusal | |
|
|
142
|
+
|
|
143
|
+
---
|
|
144
|
+
|
|
145
|
+
## Fidelity Score by Tier
|
|
146
|
+
|
|
147
|
+
| Score | Interpretation | Action |
|
|
148
|
+
|-------|--------------|------|
|
|
149
|
+
| >= 85% | High-fidelity clone | Publish and deploy |
|
|
150
|
+
| 70-84% | Acceptable clone with gaps | Document gaps, publish with caveats |
|
|
151
|
+
| 55-69% | Surface clone | Review extractions, reprocess L1 and L2 |
|
|
152
|
+
| < 55% | Clone failed | Return to the pipeline; more sources needed |
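
The bands in this table can be applied mechanically. The sketch below treats them as contiguous thresholds, so a fractional score such as 84.5 falls into the 70-84% band; that rounding choice and the function name are assumptions.

```python
def fidelity_tier(score: float) -> tuple[str, str]:
    """Return (interpretation, action) for an overall fidelity score (0-100)."""
    if score >= 85:
        return ("high-fidelity clone", "publish and deploy")
    if score >= 70:
        return ("acceptable clone with gaps",
                "document gaps, publish with caveats")
    if score >= 55:
        return ("surface clone", "review extractions, reprocess L1 and L2")
    return ("clone failed", "return to the pipeline; more sources needed")


interpretation, action = fidelity_tier(67.0)
print(f"{interpretation}: {action}")
```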
|
|
153
|
+
|
|
154
|
+
### Fidelity Score vs Confidence Score
|
|
155
|
+
|
|
156
|
+
| Score | What it measures | When to calculate |
|
|
157
|
+
|-------|-----------|----------------|
|
|
158
|
+
| **Confidence Score** | Quality of the sources used in extraction | During extraction (pre-generation) |
|
|
159
|
+
| **Fidelity Score** | Quality of the generated clone | After generation (post-generation) |
|
|
160
|
+
|
|
161
|
+
A clone can have a high confidence score (good sources) and a low fidelity score
|
|
162
|
+
(poor generation). Both must be checked.
|
|
163
|
+
|
|
164
|
+
---
|
|
165
|
+
|
|
166
|
+
## False Positive Detection
|
|
167
|
+
|
|
168
|
+
Common problems that would artificially inflate the score:
|
|
169
|
+
|
|
170
|
+
### Problem 1: Generic Response Passing as a Clone
|
|
171
|
+
**Symptom:** Clone answers correctly, but the answer would be identical for any specialist in the field.
|
|
172
|
+
**Detection:** Test questions where the original holds an opinion known to run against the mainstream.
|
|
173
|
+
**Example:** If the original argues "never use ads at the start", the clone must replicate that, not the mainstream view.
|
|
174
|
+
|
|
175
|
+
### Problem 2: Style Masking a Cognitive Gap
|
|
176
|
+
**Symptom:** Perfect tone/vocabulary but wrong decisions.
|
|
177
|
+
**Detection:** Run the decision tests before the style tests.
|
|
178
|
+
**Indicator:** Cognitive dimension is 20+ points below the communicative dimension.
|
|
179
|
+
|
|
180
|
+
### Problem 3: Confident Fabrication
|
|
181
|
+
**Symptom:** Clone answers confidently in areas with no documented sources.
|
|
182
|
+
**Detection:** Test in areas OUTSIDE the source corpus.
|
|
183
|
+
**Rule:** Clone must decline or heavily qualify its answer in undocumented areas.
|
|
184
|
+
|
|
185
|
+
---
|
|
186
|
+
|
|
187
|
+
## Pre-Publication Checklist
|
|
188
|
+
|
|
189
|
+
### Gate 1: Confidence Score
|
|
190
|
+
- [ ] Confidence >= 60% (Tier 1), 75% (Tier 2), 85% (Tier 3)
|
|
191
|
+
- [ ] Neither L1 nor L2 is individually below 60%
|
|
192
|
+
- [ ] Proportion of [DIRETO] tags appropriate for the tier
|
|
193
|
+
|
|
194
|
+
### Gate 2: Fidelity Score
|
|
195
|
+
- [ ] Fidelity >= 70% (minimum acceptable)
|
|
196
|
+
- [ ] Cognitive dimension >= 65%
|
|
197
|
+
- [ ] Boundary dimension >= 70%
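
The three Gate 2 thresholds can be checked together. The sketch below is one way to do it, assuming scores on a 0-100 scale; the function name and the failure-message format are illustrative.

```python
def gate_2_failures(fidelity: float, cognitive: float,
                    boundary: float) -> list[str]:
    """Return one message per Gate 2 threshold that is not met."""
    checks = [
        ("fidelity", fidelity, 70),
        ("cognitive", cognitive, 65),
        ("boundary", boundary, 70),
    ]
    return [
        f"{name} {value:.0f}% < {minimum}%"
        for name, value, minimum in checks
        if value < minimum
    ]


# Overall fidelity passes, but the cognitive dimension blocks publication.
print(gate_2_failures(fidelity=75, cognitive=60, boundary=80))
# ['cognitive 60% < 65%']
```

An empty list means the clone clears Gate 2; the per-dimension minimums catch a clone whose overall score hides a weak cognitive or boundary dimension.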
|
|
198
|
+
|
|
199
|
+
### Gate 3: Integrity
|
|
200
|
+
- [ ] Failure modes documented (L6)
|
|
201
|
+
- [ ] No principle or heuristic tagged [HIPOTESE] has been promoted to a core principle
|
|
202
|
+
- [ ] Source identified for every core principle
|
|
203
|
+
- [ ] Gaps explicitly documented in agent.md
|
|
204
|
+
|
|
205
|
+
### Gate 4: Ethics
|
|
206
|
+
- [ ] Consent verification (see `ethical-guidelines.md`)
|
|
207
|
+
- [ ] Representation statement included in agent.md
|
|
208
|
+
- [ ] Usage limits documented
|
|
209
|
+
|
|
210
|
+
See also: `clone-tier-standards.md` for the minimum criteria per tier.
|
|
211
|
+
See also: `ethical-guidelines.md` for consent and usage considerations.
|
|
@@ -95,3 +95,54 @@ Para calibrar o scoring:
|
|
|
95
95
|
4. Review: should any [INFERIDO] be [HIPOTESE]?
|
|
96
96
|
5. Recalculate
|
|
97
97
|
6. If the score is surprising (very high or very low), review the tags
|
|
98
|
+
|
|
99
|
+
---
|
|
100
|
+
|
|
101
|
+
## Confidence Score vs Fidelity Score
|
|
102
|
+
|
|
103
|
+
These are distinct metrics that measure different things:
|
|
104
|
+
|
|
105
|
+
| Score | What it measures | When to calculate | Used by |
|
|
106
|
+
|-------|-----------|----------------|---------|
|
|
107
|
+
| **Confidence Score** | Quality of the sources (pre-generation) | During extraction | @cognitive-extractor |
|
|
108
|
+
| **Fidelity Score** | Quality of the generated clone (post-generation) | After generation | @agent-forger + validator |
|
|
109
|
+
|
|
110
|
+
**Analogy:** Confidence is the quality of the ingredients; fidelity is the quality of the finished dish.
|
|
111
|
+
|
|
112
|
+
A clone can have:
|
|
113
|
+
- High confidence + low fidelity: good sources, but generation failed
|
|
114
|
+
- Low confidence + high fidelity: weak sources, but what was extracted was generated well
|
|
115
|
+
|
|
116
|
+
**Both must be >= the tier threshold for publication.**
|
|
117
|
+
|
|
118
|
+
---
|
|
119
|
+
|
|
120
|
+
## Confidence Decay (Aging of Extractions)
|
|
121
|
+
|
|
122
|
+
The confidence score calculated at extraction time can degrade
|
|
123
|
+
as time passes and new information emerges:
|
|
124
|
+
|
|
125
|
+
| Situation | Impact on confidence |
|
|
126
|
+
|---------|----------------------|
|
|
127
|
+
| New source confirms a [HIPOTESE] | Upgrade to [INFERIDO] |
|
|
128
|
+
| New source contradicts a [DIRETO] | Review; possible evolution in thinking |
|
|
129
|
+
| Original source removed/deleted | Downgrade for loss of provenance |
|
|
130
|
+
| Content is 3+ years old without recent confirmation | Apply a 0.85x decay factor |
|
|
131
|
+
|
|
132
|
+
**Re-validation protocol:** For clones more than 2 years old, recalculate the
|
|
133
|
+
confidence score taking temporal decay into account. See `source-classification.md`
|
|
134
|
+
for the weights per time window.
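
The 0.85x rule from the table can be sketched as follows. This assumes age is measured from the most recent confirmation, or from the content date when there is none, and that the factor is applied once rather than compounded per year. Both choices, and all names below, are assumptions; the per-window weights in `source-classification.md` are not modeled.

```python
from datetime import date
from typing import Optional

DECAY_FACTOR = 0.85
DECAY_AGE_YEARS = 3


def decayed_confidence(confidence: float, content_date: date,
                       last_confirmed: Optional[date] = None,
                       today: Optional[date] = None) -> float:
    """Apply the 0.85x factor to content that is 3+ years unconfirmed."""
    today = today or date.today()
    reference = last_confirmed or content_date
    age_years = (today - reference).days / 365.25
    if age_years >= DECAY_AGE_YEARS:
        return confidence * DECAY_FACTOR
    return confidence


# Four-year-old content with no confirmation since (made-up dates).
stale = decayed_confidence(80, date(2020, 1, 1), today=date(2024, 1, 1))
print(round(stale, 1))  # 68.0
```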
|
|
135
|
+
|
|
136
|
+
---
|
|
137
|
+
|
|
138
|
+
## Score per Layer: Critical Interpretation
|
|
139
|
+
|
|
140
|
+
The overall score hides distributions that matter:
|
|
141
|
+
|
|
142
|
+
| Pattern | Diagnosis | Action |
|
|
143
|
+
|--------|-------------|------|
|
|
144
|
+
| L1 > 85%, L2 < 50% | Clone knows what the original thinks but not how they decide | Find more decision content (interviews) |
|
|
145
|
+
| L2 > 85%, L4 < 50% | Clone decides well but does not "sound" right | Find more communication content |
|
|
146
|
+
| L4 > 85%, L1 < 50% | Clone "sounds" right but does not "think" right; dangerous | Prioritize more sources of analytical content |
|
|
147
|
+
| L5 < 30%, regardless of other layers | Insufficient meta-patterns | Normal for Tier 1, concerning for Tier 3 |
|
|
148
|
+
| L6 = 0% | Failure modes not documented | Mandatory for Tier 2+ |
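
The patterns in this table can be checked programmatically. A minimal sketch follows, assuming per-layer scores on a 0-100 scale keyed `"L1"` to `"L6"` and an integer tier. The table does not say how L5 < 30% should be treated for Tier 2, so the sketch flags it only for Tier 3; the function name is illustrative.

```python
def diagnose_layers(layers: dict[str, float], tier: int) -> list[str]:
    """Return the diagnoses from the table that apply to these layer scores."""
    findings = []
    if layers["L1"] > 85 and layers["L2"] < 50:
        findings.append("knows what the original thinks but not how they decide")
    if layers["L2"] > 85 and layers["L4"] < 50:
        findings.append('decides well but does not "sound" right')
    if layers["L4"] > 85 and layers["L1"] < 50:
        findings.append('"sounds" right but does not "think" right')
    if layers["L5"] < 30 and tier >= 3:
        findings.append("insufficient meta-patterns for Tier 3")
    if layers["L6"] == 0 and tier >= 2:
        findings.append("failure modes not documented")
    return findings


# Made-up layer scores for illustration.
scores = {"L1": 90, "L2": 40, "L3": 70, "L4": 70, "L5": 20, "L6": 0}
for finding in diagnose_layers(scores, tier=3):
    print(finding)
```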
|