sagaz-ai 0.1.0 → 0.1.1

package/README.md CHANGED
@@ -36,6 +36,11 @@ Sagaz also guides the user through the process. At the end of each phase, it exp
36
36
  - **GitHub without guesswork:** Sagaz recommends commits, pushes, pull requests, issues, and releases at the right time.
37
37
  - **Web and mobile:** workflows for browser apps, websites, dashboards, Android, and iOS.
38
38
  - **Persistent state:** Markdown run state records decisions, approvals, handoffs, risks, and test evidence.
39
+ - **Agent observability:** compact traces record decisions, tools, evidence, failures, and recoveries.
40
+ - **Durable checkpoints:** long projects can resume across threads and refactors without losing context.
41
+ - **Tool registry:** Sagaz verifies and recommends tools such as GitHub CLI, Playwright, Vercel, Expo/EAS, Supabase, Firebase, Stripe, CI/CD, and observability services.
42
+ - **Stack presets:** common web, mobile, backend, database, and dashboard stacks are documented as starting points.
43
+ - **Sagaz evaluations:** scenario-based checks help prevent regressions in the orchestration system itself.
39
44
 
40
45
  ## How It Works
41
46
 
@@ -52,6 +57,10 @@ Key areas:
52
57
  - `agents/`: individual roles.
53
58
  - `tasks/`: formal tasks with inputs, outputs, acceptance criteria, verification, and stop conditions.
54
59
  - `protocols/`: rules for quality, handoffs, GitHub, production, design, CI/CD, and monitoring.
60
+ - `tools/`: tool and connector selection rules.
61
+ - `stack-presets/`: default stack recommendations and tradeoffs.
62
+ - `evals/`: checks for Sagaz's own reliability.
63
+ - `examples/`: reusable examples of common project flows.
55
64
  - `templates/`: reusable artifacts for specs, QA, handoffs, run state, and releases.
56
65
  - `engineering/`: deep software engineering standards.
57
66
  - `governance/`: security, quality, versioning, and ecosystem maintenance.
@@ -147,6 +156,10 @@ Sagaz should choose the appropriate workflow, create or update persistent run st
147
156
 
148
157
  For production-grade work, Sagaz can also apply SRE readiness, DORA metrics, secure SDLC, dependency governance, data privacy lifecycle, architecture fitness functions, API contracts, performance budgets, accessibility compliance, database migrations, release strategy, and AI application quality protocols.
149
158
 
159
+ For tool-heavy work, Sagaz uses a tool registry to verify local availability and recommend the right connector or platform before asking permission to install, authenticate, deploy, publish, or modify external resources.
160
+
161
+ For common project types, Sagaz can start from documented stack presets such as Next.js on Vercel, React with Vite, Expo mobile, React Native, Supabase, Firebase, Node APIs, static sites, and admin dashboards.
162
+
150
163
  ## Web Example
151
164
 
152
165
  ```text
@@ -175,6 +188,8 @@ Sagaz must ask permission before moving between major teams/phases, installing d
175
188
 
176
189
  Sagaz may directly run low-risk diagnostics such as reading files, inspecting structure, searching text, checking status, and proposing plans.
177
190
 
191
+ Sagaz should proactively suggest useful actions across the whole ecosystem, including tests, visual QA, accessibility checks, security checks, commits, releases, deployment previews, monitoring, and documentation updates.
192
+
178
193
  ## Who It Is For
179
194
 
180
195
  Sagaz is for users who want to build serious software with Codex without needing to personally manage every detail of engineering, design, GitHub, deployment, and production operations.
@@ -42,7 +42,7 @@ See `agents/` for Orchestrator, Product Strategist, Technology Strategist, Softw
42
42
 
43
43
  ## Protocols
44
44
 
45
- See `protocols/` for quality gates, testing matrix, stack selection, design quality, guided proactivity, squad handoffs, production readiness, GitHub operations, CI/CD readiness, post-delivery monitoring, delegation, communication, memory, and model routing.
45
+ See `protocols/` for quality gates, testing matrix, stack selection, design quality, guided proactivity, squad handoffs, production readiness, GitHub operations, CI/CD readiness, post-delivery monitoring, durable run state, agent observability, delegation, communication, memory, and model routing.
46
46
 
47
47
  ## Advanced Engineering Protocols
48
48
 
@@ -58,10 +58,36 @@ See `protocols/` for quality gates, testing matrix, stack selection, design qual
58
58
  - `protocols/database-migrations.md`
59
59
  - `protocols/release-strategy.md`
60
60
  - `protocols/ai-application-quality.md`
61
+ - `protocols/agent-observability.md`
62
+ - `protocols/durable-run-state.md`
63
+
64
+ ## Tools
65
+
66
+ - `tools/tool-registry.md`
67
+
68
+ ## Stack Presets
69
+
70
+ - `stack-presets/nextjs-vercel.md`
71
+ - `stack-presets/react-vite.md`
72
+ - `stack-presets/expo-mobile.md`
73
+ - `stack-presets/react-native.md`
74
+ - `stack-presets/supabase.md`
75
+ - `stack-presets/firebase.md`
76
+ - `stack-presets/node-api.md`
77
+ - `stack-presets/static-site.md`
78
+ - `stack-presets/admin-dashboard.md`
79
+
80
+ ## Evaluations
81
+
82
+ - `evals/sagaz-evaluation-suite.md`
83
+
84
+ ## Examples
85
+
86
+ - `examples/README.md`
61
87
 
62
88
  ## Templates
63
89
 
64
- See `templates/` for task briefs, product specs, technical specs, design systems, stack recommendations, run state, squad handoffs, QA reports, release checklists, and final handoffs.
90
+ See `templates/` for task briefs, product specs, technical specs, design systems, stack recommendations, run state, squad handoffs, QA reports, release checklists, changelogs, release notes, and final handoffs.
65
91
 
66
92
  ## Governance
67
93
 
@@ -26,4 +26,8 @@ Create autonomous, auditable, outcome-oriented AI teams inside Codex with low to
26
26
  18. Repeated delivery cycles should track DORA metrics without turning them into vanity metrics.
27
27
  19. Security must be integrated through the full SDLC, including threat modeling and verification.
28
28
  20. Dependencies, data lifecycle, API contracts, migrations, releases, accessibility, performance, and AI quality must have explicit protocols when relevant.
29
+ 21. Durable run state and compact observability must be used for multi-phase or production work.
30
+ 22. External tools must be selected through the tool registry: verify availability, explain value, and ask permission before setup or state-changing use.
31
+ 23. Stack presets should be used as starting points, then adapted to project constraints.
32
+ 24. Sagaz itself must be evaluated with scenario-based checks before major workflow or package releases.
29
33
 
@@ -0,0 +1,71 @@
1
+ # Sagaz Evaluation Suite
2
+
3
+ ## Purpose
4
+
5
+ Evaluate whether Sagaz itself produces consistent, reliable, low-token, production-oriented results.
6
+
7
+ ## Evaluation Cadence
8
+
9
+ Run these evaluations before major Sagaz releases and after changing core workflows, squads, protocols, or installation behavior.
10
+
11
+ ## Core Evaluations
12
+
13
+ | Evaluation | Goal | Pass Criteria |
14
+ | --- | --- | --- |
15
+ | Invocation | Sagaz is easy to start by name | User can invoke Sagaz with one prompt |
16
+ | Language intake | User can write in any language | Sagaz understands the request and answers in American English |
17
+ | Workflow selection | Correct workflow is selected | Selected workflow matches project type and risk |
18
+ | Token discipline | Only needed files are loaded | No broad file loading without reason |
19
+ | Handoff quality | Teams transition clearly | Current work, evidence, next work, and permission are stated |
20
+ | Stack advisory | Stack is justified | Cost, speed, scale, maintainability, deployment, and future changes are covered |
21
+ | Design quality | UI work reaches high standards | Design system, responsiveness, accessibility, and visual QA are included |
22
+ | Verification depth | Tests match risk | Build, lint, unit, integration, e2e, accessibility, and manual checks are considered |
23
+ | GitHub guidance | User is guided proactively | Commits, pushes, PRs, releases, and issues are suggested at the right time |
24
+ | Production readiness | Launch risk is explicit | Security, env vars, rollback, monitoring, and residual risks are documented |
25
+
26
+ ## Scenario Tests
27
+
28
+ Use these prompts as smoke tests:
29
+
30
+ ```text
31
+ Sagaz: create a complete appointment scheduling SaaS with premium design and Vercel deployment.
32
+ ```
33
+
34
+ ```text
35
+ Sagaz: create an Android/iOS habit tracker and recommend the best stack.
36
+ ```
37
+
38
+ ```text
39
+ Sagaz: refactor this existing project safely without changing behavior.
40
+ ```
41
+
42
+ ```text
43
+ Sagaz: fix this production bug, test it, and prepare a GitHub release.
44
+ ```
45
+
46
+ ```text
47
+ Sagaz: I am a beginner. Guide me through everything and ask permission before major actions.
48
+ ```
49
+
50
+ ## Scoring
51
+
52
+ Score each scenario from 0 to 3:
53
+
54
+ - 0: failed or unsafe
55
+ - 1: partially usable
56
+ - 2: usable with gaps
57
+ - 3: production-grade for the scenario
58
+
59
+ Sagaz should not release a major workflow change with any core evaluation below 2.
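
The release rule above can be expressed as a small gate. This is a minimal sketch, not part of the package: the names `MIN_RELEASE_SCORE`, `blocking_evaluations`, and `can_release` are hypothetical, and it assumes scores are collected as a mapping from evaluation name to an integer on the 0 to 3 scale.

```python
from typing import Dict, List

# Assumption: threshold taken from the scoring rule (nothing below 2 may ship).
MIN_RELEASE_SCORE = 2


def blocking_evaluations(scores: Dict[str, int]) -> List[str]:
    """Return the names of evaluations whose score blocks a major release."""
    for name, score in scores.items():
        if not 0 <= score <= 3:
            raise ValueError(f"{name}: score {score} is outside the 0-3 scale")
    # Sorted so the report is stable between runs.
    return sorted(name for name, score in scores.items() if score < MIN_RELEASE_SCORE)


def can_release(scores: Dict[str, int]) -> bool:
    """True when no evaluation scored below the release threshold."""
    return not blocking_evaluations(scores)
```

An empty mapping passes this gate, so a caller would probably also want to confirm that every core evaluation was actually scored before relying on the result.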
60
+
61
+ ## Regression Log
62
+
63
+ ```md
64
+ Date:
65
+ Version:
66
+ Scenario:
67
+ Score:
68
+ Failure:
69
+ Fix:
70
+ Retest evidence:
71
+ ```
@@ -0,0 +1,70 @@
1
+ # Examples
2
+
3
+ ## Purpose
4
+
5
+ Provide reusable, low-token examples for common Sagaz projects. Examples are not templates to copy blindly. They show the expected flow, artifacts, handoffs, and verification depth.
6
+
7
+ ## Example Categories
8
+
9
+ - complete marketing website
10
+ - SaaS web app
11
+ - admin dashboard
12
+ - mobile app
13
+ - bugfix to release
14
+ - safe refactor
15
+ - production deploy
16
+ - GitHub release
17
+
18
+ ## Example Structure
19
+
20
+ Each example should include:
21
+
22
+ - user prompt
23
+ - selected workflow
24
+ - required squads
25
+ - first questions, if any
26
+ - stack recommendation summary
27
+ - handoff sequence
28
+ - expected artifacts
29
+ - verification plan
30
+ - GitHub actions to suggest
31
+ - deployment path
32
+ - final handoff shape
33
+
34
+ ## Minimal Web App Example
35
+
36
+ ```md
37
+ User prompt:
38
+ Sagaz: create a premium appointment scheduling SaaS for small clinics.
39
+
40
+ Workflow:
41
+ workflows/greenfield-web-app.md
42
+
43
+ Squads:
44
+ Product Factory -> Design Studio -> Production Critical -> GitHub Ops
45
+
46
+ Stack recommendation:
47
+ Next.js on Vercel, TypeScript, Supabase, Playwright, GitHub Actions.
48
+
49
+ Reason:
50
+ Fast delivery, clear deployment path, managed auth/database, strong web ecosystem, good future maintainability.
51
+
52
+ Required handoffs:
53
+ Intake -> stack -> spec -> design -> architecture -> implementation -> QA -> production readiness -> GitHub/deploy.
54
+ ```
55
+
56
+ ## Mobile App Example
57
+
58
+ ```md
59
+ User prompt:
60
+ Sagaz: create an Android/iOS habit tracker with premium UX and store-ready release planning.
61
+
62
+ Workflow:
63
+ workflows/mobile-app-production.md
64
+
65
+ Likely stack:
66
+ Expo, React Native, TypeScript, SQLite or Supabase depending on sync needs, EAS for builds.
67
+
68
+ Required evidence:
69
+ Device-size review, offline behavior decision, accessibility checks, app icon/splash plan, release checklist.
70
+ ```
@@ -16,6 +16,7 @@ When Sagaz changes, update every relevant location:
16
16
  - `package.json` version when publishing a package update
17
17
  - GitHub repository
18
18
  - npm package when the installer or distributed files change
19
+ - GitHub Actions workflows when package checks or publishing rules change
19
20
 
20
21
  ## Release Checklist
21
22
 
@@ -33,6 +34,15 @@ npm publish:
33
34
  Post-publish install test:
34
35
  ```
35
36
 
37
+ ## GitHub Actions
38
+
39
+ The repository should include:
40
+
41
+ - package checks on push and pull request
42
+ - manual npm publishing workflow
43
+
44
+ The npm publishing workflow requires an `NPM_TOKEN` repository secret. Do not assume it exists. If it is missing, guide the user through creating it and explain why it is needed.
45
+
36
46
  ## npm Publishing
37
47
 
38
48
  Use npm only for installation packaging. Sagaz itself is intended for Codex Desktop, not as a standalone CLI agent runtime.
@@ -0,0 +1,75 @@
1
+ # Agent Observability
2
+
3
+ ## Purpose
4
+
5
+ Make Sagaz work auditable without turning every interaction into a long transcript. Observability must show what was decided, which team acted, which tools were used, what evidence exists, and what risks remain.
6
+
7
+ ## When To Use
8
+
9
+ Use this protocol for medium or large projects, production work, multi-team handoffs, debugging the Sagaz process, or any task where repeatability matters.
10
+
11
+ ## Minimum Trace
12
+
13
+ Record only high-signal events:
14
+
15
+ - user goal
16
+ - active workflow
17
+ - current squad and agent role
18
+ - decision points
19
+ - handoffs
20
+ - approvals or denied approvals
21
+ - tool categories used
22
+ - files changed
23
+ - tests and checks run
24
+ - failures and recoveries
25
+ - final evidence
26
+ - residual risks
27
+
28
+ ## Token-Efficient Trace Format
29
+
30
+ ```md
31
+ ## Trace
32
+
33
+ | Time | Team | Event | Evidence | Next |
34
+ | --- | --- | --- | --- | --- |
35
+ | YYYY-MM-DD HH:MM | Product Factory | Requirements clarified | product-spec.md | Stack selection |
36
+ ```
37
+
38
+ ## Tool Event Format
39
+
40
+ ```md
41
+ Tool:
42
+ Purpose:
43
+ Inputs:
44
+ Output summary:
45
+ Cost/risk:
46
+ Follow-up:
47
+ ```
48
+
49
+ Do not paste long logs unless the log itself is the artifact being reviewed. Summarize and keep the exact command available.
50
+
51
+ ## Metrics
52
+
53
+ Track these only when useful:
54
+
55
+ - estimated token load by phase: low, medium, high
56
+ - elapsed phase time
57
+ - number of verification passes
58
+ - reopened defects
59
+ - blocked handoffs
60
+ - tests run versus tests planned
61
+ - deployment result
62
+
63
+ ## Failure Handling
64
+
65
+ When a failure occurs:
66
+
67
+ 1. Record the failed action.
68
+ 2. Record the observed error.
69
+ 3. State the likely cause as an assumption when not proven.
70
+ 4. Define the smallest recovery step.
71
+ 5. Add or update a prevention rule if the failure can recur.
72
+
73
+ ## Completion Rule
74
+
75
+ Sagaz must not claim reliable completion unless the trace includes enough evidence for another Codex thread to understand what happened and continue from the same state.
@@ -0,0 +1,70 @@
1
+ # Durable Run State
2
+
3
+ ## Purpose
4
+
5
+ Keep Sagaz coherent across long projects, new Codex threads, context compaction, refactors, and releases.
6
+
7
+ ## Rule
8
+
9
+ For medium or large work, create or update a run state file based on `templates/run-state.md`. The run state is the compact memory of the project. It must be short enough to reload cheaply and specific enough to prevent rework.
10
+
11
+ ## Required Sections
12
+
13
+ - project goal
14
+ - user constraints
15
+ - active workflow
16
+ - current phase
17
+ - completed handoffs
18
+ - approved decisions
19
+ - denied or deferred suggestions
20
+ - stack decision
21
+ - design system status
22
+ - implementation status
23
+ - verification evidence
24
+ - GitHub status
25
+ - deployment status
26
+ - risks and open questions
27
+ - next recommended action
28
+
29
+ ## Checkpoint Rules
30
+
31
+ Create a checkpoint after:
32
+
33
+ - intake completion
34
+ - stack selection
35
+ - product specification approval
36
+ - design system approval
37
+ - architecture approval
38
+ - implementation milestones
39
+ - failed verification
40
+ - successful verification
41
+ - commit or release
42
+ - deployment
43
+
44
+ ## Checkpoint Format
45
+
46
+ ```md
47
+ ## Checkpoint: YYYY-MM-DD HH:MM
48
+
49
+ Phase:
50
+ Completed:
51
+ Evidence:
52
+ Decision:
53
+ Risks:
54
+ Next recommended action:
55
+ Needs user permission:
56
+ ```
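
A checkpoint in this format could be generated rather than typed by hand. The sketch below is only an illustration: `CHECKPOINT_FIELDS` and `render_checkpoint` are hypothetical names, and rejecting unknown field names is an assumption made here to catch typos, not a rule stated in the protocol.

```python
from datetime import datetime
from typing import Dict

# Field names copied from the checkpoint format, in the same order.
CHECKPOINT_FIELDS = [
    "Phase",
    "Completed",
    "Evidence",
    "Decision",
    "Risks",
    "Next recommended action",
    "Needs user permission",
]


def render_checkpoint(when: datetime, values: Dict[str, str]) -> str:
    """Render one checkpoint block; fields without a value are left blank."""
    unknown = set(values) - set(CHECKPOINT_FIELDS)
    if unknown:
        raise KeyError(f"unknown checkpoint fields: {sorted(unknown)}")
    lines = ["## Checkpoint: " + when.strftime("%Y-%m-%d %H:%M"), ""]
    for name in CHECKPOINT_FIELDS:
        value = values.get(name, "").strip()
        # rstrip avoids a trailing space after the colon on blank fields.
        lines.append(f"{name}: {value}".rstrip())
    return "\n".join(lines)
```

Emitting every field, even when blank, keeps checkpoints uniform so a later thread can scan them quickly.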
57
+
58
+ ## Recovery Flow
59
+
60
+ When resuming:
61
+
62
+ 1. Read the run state first.
63
+ 2. Read only the files referenced by the current phase.
64
+ 3. Verify the repository state before changing files.
65
+ 4. State the current phase and next proposed action.
66
+ 5. Ask permission only when the action is meaningful or state-changing.
67
+
68
+ ## Token Discipline
69
+
70
+ Prefer links and short summaries over pasted content. Move large specs to templates or project files and keep the run state as an index plus decision ledger.
@@ -0,0 +1,31 @@
1
+ # Stack Preset: Admin Dashboard
2
+
3
+ ## Best For
4
+
5
+ Operational dashboards, internal tools, CRM-like interfaces, analytics panels, and back-office applications.
6
+
7
+ ## Default Stack
8
+
9
+ - React or Next.js
10
+ - TypeScript
11
+ - Component library or a strict internal design system
12
+ - Tables, filters, forms, role permissions, and audit logs
13
+ - Playwright for critical workflows
14
+
15
+ ## Design Rules
16
+
17
+ - Prioritize dense, scannable layouts.
18
+ - Avoid marketing-style hero sections.
19
+ - Make navigation predictable.
20
+ - Use tables, filters, segmented controls, drawers, and modals intentionally.
21
+ - Preserve keyboard and accessibility quality.
22
+
23
+ ## Required Sagaz Checks
24
+
25
+ - role-based access
26
+ - empty states
27
+ - loading states
28
+ - error states
29
+ - destructive-action confirmations
30
+ - audit trail needs
31
+ - responsive behavior
@@ -0,0 +1,33 @@
1
+ # Stack Preset: Expo Mobile
2
+
3
+ ## Best For
4
+
5
+ Cross-platform Android/iOS apps where one codebase, fast iteration, and managed builds matter.
6
+
7
+ ## Default Stack
8
+
9
+ - Expo
10
+ - React Native
11
+ - TypeScript
12
+ - Expo Router when suitable
13
+ - EAS Build
14
+ - EAS Update only when rollback strategy is clear
15
+
16
+ ## Strengths
17
+
18
+ - Fastest practical path to Android and iOS for many teams.
19
+ - Managed build and update tooling.
20
+ - Large ecosystem.
21
+ - Good fit for Codex-assisted development.
22
+
23
+ ## Tradeoffs
24
+
25
+ - Some native integrations may require config plugins or prebuild.
26
+ - App store release still needs careful manual review.
27
+ - Over-the-air updates require governance.
28
+
29
+ ## Use When
30
+
31
+ - The user wants Android and iOS.
32
+ - Native custom code is limited or manageable.
33
+ - Delivery speed and maintainability matter.
@@ -0,0 +1,27 @@
1
+ # Stack Preset: Firebase
2
+
3
+ ## Best For
4
+
5
+ Realtime apps, mobile-first products, quick authentication, push notifications, and apps that benefit from managed NoSQL infrastructure.
6
+
7
+ ## Strengths
8
+
9
+ - Strong mobile ecosystem.
10
+ - Realtime features.
11
+ - Managed auth and hosting.
12
+ - Good for rapid iteration.
13
+
14
+ ## Tradeoffs
15
+
16
+ - NoSQL data modeling requires care.
17
+ - Complex querying can become expensive or awkward.
18
+ - Security rules must be tested.
19
+
20
+ ## Required Sagaz Checks
21
+
22
+ - security rules
23
+ - cost model
24
+ - data access patterns
25
+ - emulator usage
26
+ - backup/export path
27
+ - auth and permission flows
@@ -0,0 +1,39 @@
1
+ # Stack Preset: Next.js On Vercel
2
+
3
+ ## Best For
4
+
5
+ Production web apps, SaaS products, dashboards, marketing sites with dynamic features, and apps that benefit from fast deployment.
6
+
7
+ ## Default Stack
8
+
9
+ - Next.js
10
+ - TypeScript
11
+ - Tailwind CSS or an existing design system
12
+ - Vercel
13
+ - Playwright
14
+ - GitHub Actions
15
+
16
+ ## Strengths
17
+
18
+ - Strong ecosystem and hiring market.
19
+ - Excellent deployment path on Vercel.
20
+ - Good performance defaults.
21
+ - Works well for full-stack web applications.
22
+ - Easy to evolve from prototype to production.
23
+
24
+ ## Tradeoffs
25
+
26
+ - Platform features can create Vercel coupling.
27
+ - Server/client boundaries require discipline.
28
+ - Complex apps still need architecture, testing, and observability.
29
+
30
+ ## Use When
31
+
32
+ - The user wants a serious browser-based app.
33
+ - SEO or performance matters.
34
+ - Deployment simplicity matters.
35
+
36
+ ## Avoid When
37
+
38
+ - The app is mostly native mobile.
39
+ - The user needs a static-only site with no dynamic behavior.
@@ -0,0 +1,34 @@
1
+ # Stack Preset: Node API
2
+
3
+ ## Best For
4
+
5
+ Backend APIs, integrations, webhooks, automation services, and full-stack apps needing a dedicated server.
6
+
7
+ ## Default Stack
8
+
9
+ - Node.js
10
+ - TypeScript
11
+ - Fastify or Express based on project needs
12
+ - Zod or similar runtime validation
13
+ - Postgres or managed database
14
+ - Jest/Vitest plus integration tests
15
+
16
+ ## Strengths
17
+
18
+ - Mature ecosystem.
19
+ - Good fit for integrations and APIs.
20
+ - Easy to deploy widely.
21
+
22
+ ## Tradeoffs
23
+
24
+ - Requires explicit architecture for auth, validation, errors, logging, rate limits, and deployment.
25
+
26
+ ## Required Sagaz Checks
27
+
28
+ - API contract
29
+ - validation
30
+ - authentication and authorization
31
+ - logging and observability
32
+ - rate limiting
33
+ - integration tests
34
+ - deployment and rollback
@@ -0,0 +1,25 @@
1
+ # Stack Preset: React Native
2
+
3
+ ## Best For
4
+
5
+ Mobile apps that need deeper native control than a fully managed Expo project.
6
+
7
+ ## Default Stack
8
+
9
+ - React Native
10
+ - TypeScript
11
+ - React Navigation or Expo Router depending on project structure
12
+ - Native build tooling
13
+ - Detox or Maestro for mobile end-to-end tests when feasible
14
+
15
+ ## Strengths
16
+
17
+ - Shared codebase across Android and iOS.
18
+ - More native control than managed-only workflows.
19
+ - Large ecosystem.
20
+
21
+ ## Tradeoffs
22
+
23
+ - Native build complexity is higher.
24
+ - CI setup is more involved.
25
+ - Requires stronger platform-specific verification.
@@ -0,0 +1,32 @@
1
+ # Stack Preset: React With Vite
2
+
3
+ ## Best For
4
+
5
+ Client-heavy apps, dashboards, internal tools, prototypes, and static sites that do not need a full-stack framework.
6
+
7
+ ## Default Stack
8
+
9
+ - React
10
+ - TypeScript
11
+ - Vite
12
+ - Tailwind CSS or existing CSS architecture
13
+ - Vitest
14
+ - Playwright when user workflows matter
15
+
16
+ ## Strengths
17
+
18
+ - Simple mental model.
19
+ - Fast local development.
20
+ - Low framework overhead.
21
+ - Easy static deployment.
22
+
23
+ ## Tradeoffs
24
+
25
+ - Backend, auth, routing, and data strategy must be chosen separately.
26
+ - SEO and server rendering are not the default.
27
+
28
+ ## Use When
29
+
30
+ - The app is mostly client-side.
31
+ - The project needs simplicity and speed.
32
+ - The backend already exists.
@@ -0,0 +1,23 @@
1
+ # Stack Preset: Static Site
2
+
3
+ ## Best For
4
+
5
+ Marketing pages, documentation, portfolios, landing pages, and simple content sites.
6
+
7
+ ## Default Stack
8
+
9
+ - Astro, Eleventy, or Vite depending on content needs
10
+ - TypeScript when the project has interactive code
11
+ - Netlify, Vercel, Cloudflare Pages, or GitHub Pages
12
+
13
+ ## Strengths
14
+
15
+ - Low cost.
16
+ - Simple hosting.
17
+ - Strong performance and security baseline.
18
+ - Easy maintenance.
19
+
20
+ ## Tradeoffs
21
+
22
+ - Dynamic features require external services or backend functions.
23
+ - Content workflows must be planned if nontechnical users edit content.
@@ -0,0 +1,28 @@
1
+ # Stack Preset: Supabase
2
+
3
+ ## Best For
4
+
5
+ Apps needing managed Postgres, authentication, storage, edge functions, and fast backend setup.
6
+
7
+ ## Strengths
8
+
9
+ - Relational data model.
10
+ - Managed auth and storage.
11
+ - Good developer experience.
12
+ - Works well with web and mobile apps.
13
+
14
+ ## Tradeoffs
15
+
16
+ - Row-level security must be designed and tested carefully.
17
+ - Vendor-specific features create some coupling.
18
+ - Production backup and migration strategy must be explicit.
19
+
20
+ ## Required Sagaz Checks
21
+
22
+ - data model
23
+ - row-level security
24
+ - migration plan
25
+ - backup and restore plan
26
+ - auth flows
27
+ - environment variables
28
+ - local or preview environment strategy
@@ -0,0 +1,26 @@
1
+ # Changelog
2
+
3
+ ## Format
4
+
5
+ Use reverse chronological order.
6
+
7
+ ```md
8
+ ## [Version] - YYYY-MM-DD
9
+
10
+ ### Added
11
+
12
+ ### Changed
13
+
14
+ ### Fixed
15
+
16
+ ### Security
17
+
18
+ ### Migration Notes
19
+ ```
20
+
21
+ ## Rules
22
+
23
+ - Mention user-visible changes.
24
+ - Mention installation or package changes.
25
+ - Mention breaking changes clearly.
26
+ - Keep implementation details concise.
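
To illustrate the reverse chronological rule and the heading layout, here is a minimal sketch. The `Release` class and `render_changelog` function are hypothetical names; skipping empty sections is an assumption made for brevity, since the template itself lists all five headings.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Iterable, List

# Section names copied from the changelog format, in the same order.
SECTIONS = ["Added", "Changed", "Fixed", "Security", "Migration Notes"]


@dataclass
class Release:
    version: str
    released: date
    # Maps a section name to its bullet items.
    notes: Dict[str, List[str]] = field(default_factory=dict)


def render_changelog(releases: Iterable[Release]) -> str:
    """Render releases newest first, using the heading layout from the template."""
    blocks = ["# Changelog"]
    for rel in sorted(releases, key=lambda r: r.released, reverse=True):
        lines = [f"## [{rel.version}] - {rel.released.isoformat()}"]
        for section in SECTIONS:
            items = rel.notes.get(section, [])
            if items:  # assumption: empty sections are omitted
                lines += ["", f"### {section}", ""]
                lines += [f"- {item}" for item in items]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)
```

Sorting by release date rather than by version string avoids ordering mistakes such as placing `0.10.0` before `0.9.0`.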
@@ -0,0 +1,21 @@
1
+ # Release Notes
2
+
3
+ ## Release
4
+
5
+ Version:
6
+ Date:
7
+ GitHub commit:
8
+ npm package:
9
+
10
+ ## Summary
11
+
12
+ ## What Changed
13
+
14
+ ## Why It Matters
15
+
16
+ ## Verification
17
+
18
+ ## Upgrade Notes
19
+
20
+ ## Known Limitations
21
+
@@ -0,0 +1,63 @@
1
+ # Tool Registry
2
+
3
+ ## Purpose
4
+
5
+ Give Sagaz a consistent way to decide which external or local tools should be recommended, verified, or used during a project.
6
+
7
+ ## Operating Rule
8
+
9
+ Sagaz must not assume a tool is available. It should inspect the local project, verify installation when relevant, explain why the tool is useful, and ask permission before installation, authentication, publishing, deploying, or irreversible changes.
10
+
11
+ ## Core Tool Categories
12
+
13
+ | Category | Examples | Sagaz Use |
14
+ | --- | --- | --- |
15
+ | Version control | Git, GitHub CLI | status, commits, branches, pull requests, releases |
16
+ | Package managers | npm, pnpm, yarn, bun | install, scripts, audits, builds |
17
+ | Web verification | Playwright, browser tools, Lighthouse | screenshots, interaction tests, accessibility checks |
18
+ | Mobile delivery | Expo, EAS, Xcode, Android Studio, Gradle | Android/iOS builds and store readiness |
19
+ | Deployment | Vercel, Netlify, Cloudflare, Render, Fly.io | preview, production deploy, rollback guidance |
20
+ | Databases | Supabase, Firebase, Postgres, Prisma | schema, migrations, backup, policies |
21
+ | Payments | Stripe | checkout, webhooks, test mode, compliance boundaries |
22
+ | AI providers | OpenAI, OpenRouter, TogetherAI, Groq | model selection, routing, budget, quality checks |
23
+ | Observability | Sentry, Logtail, Grafana, OpenTelemetry | errors, traces, logs, service health |
24
+ | Security | npm audit, CodeQL, Dependabot, secret scanning | dependency and code risk detection |
25
+ | CI/CD | GitHub Actions | repeatable tests, builds, release checks |
26
+
27
+ ## Recommendation Format
28
+
29
+ ```md
30
+ Tool:
31
+ Why it is recommended:
32
+ Cost:
33
+ Setup effort:
34
+ Risk:
35
+ Alternative:
36
+ Permission needed:
37
+ ```
38
+
39
+ ## Connector Readiness
40
+
41
+ Before using a tool, verify:
42
+
43
+ - installed or available command
44
+ - authentication status, if needed
45
+ - project compatibility
46
+ - required secrets or environment variables
47
+ - cost or quota impact
48
+ - rollback path
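
Two of the readiness checks above (available command, required environment variables) can be automated with the Python standard library. This is a minimal sketch under stated assumptions: `Readiness` and `check_tool` are hypothetical names, and authentication status, project compatibility, cost, and rollback path are not covered because they depend on the specific tool.

```python
import os
import shutil
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Readiness:
    command: str
    installed: bool
    missing_env: List[str] = field(default_factory=list)

    @property
    def ready(self) -> bool:
        # Ready only when the command resolves and no required variable is missing.
        return self.installed and not self.missing_env


def check_tool(command: str, required_env: Iterable[str] = ()) -> Readiness:
    """Check that a command is on PATH and that required variables are set.

    Only names are inspected; secret values are never read out or logged.
    """
    installed = shutil.which(command) is not None
    missing = [name for name in required_env if not os.environ.get(name)]
    return Readiness(command, installed, missing)
```

For example, `check_tool("gh")` reports whether a `gh` executable resolves on `PATH`. A negative result is a cue to explain the tool and ask permission before installing it, not a reason to install it silently.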
49
+
50
+ ## Default Choices
51
+
52
+ For typical Codex projects:
53
+
54
+ - GitHub CLI for GitHub operations.
55
+ - Playwright for browser end-to-end tests.
56
+ - GitHub Actions for CI.
57
+ - Vercel for Next.js web deployment when the user wants simple production hosting.
58
+ - Expo/EAS for cross-platform mobile delivery when native platform constraints allow it.
59
+ - Supabase for fast full-stack apps needing relational data and authentication.
60
+
61
+ ## Safety Rule
62
+
63
+ Sagaz must ask permission before installing tools, creating cloud resources, linking accounts, deploying, publishing packages, or modifying production data.
@@ -29,7 +29,10 @@ If navigation is needed, read:
29
29
  8. For production, apply production-critical: tests, build, security, env vars, deployment, rollback, and risks.
30
30
  9. For GitHub, apply proactive github-ops-guided: suggest commits, pushes, PRs, issues, releases, and checks at the right time.
31
31
  10. For production-grade engineering, apply relevant advanced protocols: SRE readiness, DORA metrics, secure SDLC, dependency governance, data privacy lifecycle, architecture fitness functions, API contracts, performance budgets, accessibility compliance, database migrations, release strategy, and AI application quality.
32
- 11. Do not declare done without verification evidence proportional to risk.
32
+ 11. For multi-phase or production work, apply durable run state and compact agent observability.
33
+ 12. Use the tool registry before recommending or using external tools, connectors, deployments, publishing, or account-linked actions.
34
+ 13. Use stack presets as starting points when recommending technologies, then adapt to user constraints.
35
+ 14. Do not declare done without verification evidence proportional to risk.
33
36
 
34
37
  ## Quick Selection
35
38
 
@@ -40,6 +43,9 @@ If navigation is needed, read:
40
43
  - Bugfix to release: `workflows/bugfix-to-release.md`
41
44
  - Design/UI: `squads/design-studio.md` and `protocols/design-quality.md`
42
45
  - GitHub: `squads/github-ops.md` and `protocols/github-operations.md`
46
+ - Tools/connectors: `tools/tool-registry.md`
47
+ - Stack presets: `stack-presets/`
48
+ - Sagaz quality checks: `evals/sagaz-evaluation-suite.md`
43
49
 
44
50
  ## Required Handoff
45
51
 
@@ -62,6 +68,14 @@ For long-running work, create or update a file based on:
62
68
 
63
69
  `ai-orchestration-ecosystem/templates/run-state.md`
64
70
 
71
+ Then apply:
72
+
73
+ `ai-orchestration-ecosystem/protocols/durable-run-state.md`
74
+
75
+ For auditability, use:
76
+
77
+ `ai-orchestration-ecosystem/protocols/agent-observability.md`
78
+
65
79
  ## Source Of Truth
66
80
 
67
81
  The complete details are in:
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "sagaz-ai",
3
- "version": "0.1.0",
3
+ "version": "0.1.1",
4
4
  "description": "Sagaz AI orchestration ecosystem installer for Codex Desktop.",
5
5
  "type": "module",
6
6
  "bin": {