shipwright-cli 1.10.0 → 2.0.0

Files changed (108)
  1. package/README.md +114 -36
  2. package/completions/_shipwright +212 -32
  3. package/completions/shipwright.bash +97 -25
  4. package/docs/strategy/01-market-research.md +619 -0
  5. package/docs/strategy/02-mission-and-brand.md +587 -0
  6. package/docs/strategy/03-gtm-and-roadmap.md +759 -0
  7. package/docs/strategy/QUICK-START.txt +289 -0
  8. package/docs/strategy/README.md +172 -0
  9. package/package.json +4 -2
  10. package/scripts/sw +208 -1
  11. package/scripts/sw-activity.sh +500 -0
  12. package/scripts/sw-adaptive.sh +925 -0
  13. package/scripts/sw-adversarial.sh +1 -1
  14. package/scripts/sw-architecture-enforcer.sh +1 -1
  15. package/scripts/sw-auth.sh +613 -0
  16. package/scripts/sw-autonomous.sh +664 -0
  17. package/scripts/sw-changelog.sh +704 -0
  18. package/scripts/sw-checkpoint.sh +1 -1
  19. package/scripts/sw-ci.sh +602 -0
  20. package/scripts/sw-cleanup.sh +1 -1
  21. package/scripts/sw-code-review.sh +637 -0
  22. package/scripts/sw-connect.sh +1 -1
  23. package/scripts/sw-context.sh +605 -0
  24. package/scripts/sw-cost.sh +1 -1
  25. package/scripts/sw-daemon.sh +432 -130
  26. package/scripts/sw-dashboard.sh +1 -1
  27. package/scripts/sw-db.sh +540 -0
  28. package/scripts/sw-decompose.sh +539 -0
  29. package/scripts/sw-deps.sh +551 -0
  30. package/scripts/sw-developer-simulation.sh +1 -1
  31. package/scripts/sw-discovery.sh +412 -0
  32. package/scripts/sw-docs-agent.sh +539 -0
  33. package/scripts/sw-docs.sh +1 -1
  34. package/scripts/sw-doctor.sh +59 -1
  35. package/scripts/sw-dora.sh +615 -0
  36. package/scripts/sw-durable.sh +710 -0
  37. package/scripts/sw-e2e-orchestrator.sh +535 -0
  38. package/scripts/sw-eventbus.sh +393 -0
  39. package/scripts/sw-feedback.sh +471 -0
  40. package/scripts/sw-fix.sh +1 -1
  41. package/scripts/sw-fleet-discover.sh +567 -0
  42. package/scripts/sw-fleet-viz.sh +404 -0
  43. package/scripts/sw-fleet.sh +8 -1
  44. package/scripts/sw-github-app.sh +596 -0
  45. package/scripts/sw-github-checks.sh +1 -1
  46. package/scripts/sw-github-deploy.sh +1 -1
  47. package/scripts/sw-github-graphql.sh +1 -1
  48. package/scripts/sw-guild.sh +569 -0
  49. package/scripts/sw-heartbeat.sh +1 -1
  50. package/scripts/sw-hygiene.sh +559 -0
  51. package/scripts/sw-incident.sh +617 -0
  52. package/scripts/sw-init.sh +88 -1
  53. package/scripts/sw-instrument.sh +699 -0
  54. package/scripts/sw-intelligence.sh +1 -1
  55. package/scripts/sw-jira.sh +1 -1
  56. package/scripts/sw-launchd.sh +363 -28
  57. package/scripts/sw-linear.sh +1 -1
  58. package/scripts/sw-logs.sh +1 -1
  59. package/scripts/sw-loop.sh +64 -3
  60. package/scripts/sw-memory.sh +1 -1
  61. package/scripts/sw-mission-control.sh +487 -0
  62. package/scripts/sw-model-router.sh +545 -0
  63. package/scripts/sw-otel.sh +596 -0
  64. package/scripts/sw-oversight.sh +689 -0
  65. package/scripts/sw-pipeline-composer.sh +1 -1
  66. package/scripts/sw-pipeline-vitals.sh +1 -1
  67. package/scripts/sw-pipeline.sh +687 -24
  68. package/scripts/sw-pm.sh +693 -0
  69. package/scripts/sw-pr-lifecycle.sh +522 -0
  70. package/scripts/sw-predictive.sh +1 -1
  71. package/scripts/sw-prep.sh +1 -1
  72. package/scripts/sw-ps.sh +1 -1
  73. package/scripts/sw-public-dashboard.sh +798 -0
  74. package/scripts/sw-quality.sh +595 -0
  75. package/scripts/sw-reaper.sh +1 -1
  76. package/scripts/sw-recruit.sh +573 -0
  77. package/scripts/sw-regression.sh +642 -0
  78. package/scripts/sw-release-manager.sh +736 -0
  79. package/scripts/sw-release.sh +706 -0
  80. package/scripts/sw-remote.sh +1 -1
  81. package/scripts/sw-replay.sh +520 -0
  82. package/scripts/sw-retro.sh +691 -0
  83. package/scripts/sw-scale.sh +444 -0
  84. package/scripts/sw-security-audit.sh +505 -0
  85. package/scripts/sw-self-optimize.sh +1 -1
  86. package/scripts/sw-session.sh +1 -1
  87. package/scripts/sw-setup.sh +1 -1
  88. package/scripts/sw-standup.sh +712 -0
  89. package/scripts/sw-status.sh +1 -1
  90. package/scripts/sw-strategic.sh +658 -0
  91. package/scripts/sw-stream.sh +450 -0
  92. package/scripts/sw-swarm.sh +583 -0
  93. package/scripts/sw-team-stages.sh +511 -0
  94. package/scripts/sw-templates.sh +1 -1
  95. package/scripts/sw-testgen.sh +515 -0
  96. package/scripts/sw-tmux-pipeline.sh +554 -0
  97. package/scripts/sw-tmux.sh +1 -1
  98. package/scripts/sw-trace.sh +485 -0
  99. package/scripts/sw-tracker-github.sh +188 -0
  100. package/scripts/sw-tracker-jira.sh +172 -0
  101. package/scripts/sw-tracker-linear.sh +251 -0
  102. package/scripts/sw-tracker.sh +117 -2
  103. package/scripts/sw-triage.sh +603 -0
  104. package/scripts/sw-upgrade.sh +1 -1
  105. package/scripts/sw-ux.sh +677 -0
  106. package/scripts/sw-webhook.sh +627 -0
  107. package/scripts/sw-widgets.sh +530 -0
  108. package/scripts/sw-worktree.sh +1 -1
@@ -0,0 +1,619 @@
# Market Research: AI Coding Agent Landscape

**Date:** February 2026
**Author:** Market Research Analysis
**Status:** Competitive Landscape & Strategic Positioning

---

## Executive Summary

The AI coding agent market has experienced explosive growth in 2025-2026, with autonomous software engineering capabilities maturing rapidly. Leading models now exceed 80% success rates on standardized benchmarks, developer adoption has reached 85% by end of 2025, and market valuations project $8.5 billion by 2026 (reaching $35 billion by 2030).

Shipwright occupies a unique position in this landscape: it is the only open-source, team-oriented orchestration platform designed specifically for multi-agent delivery pipelines with daemon-driven autonomous processing, persistent learning systems, and fleet operations across multiple repositories. This analysis identifies key competitors, market trends, and strategic differentiation opportunities.

---

## Competitive Landscape

### Direct Competitors

#### 1. **Devin (Cognition Labs)**

**Status:** Commercial, $$$
**Positioning:** Autonomous AI software engineer

**Key Achievements:**

- Production-deployed at thousands of companies (Goldman Sachs, Santander, Nubank)
- 67% PR merge rate (up from 34% in 2024)
- Produces 25% of Cognition's internal pull requests
- 4x faster problem-solving, 2x more resource efficient than prior year

**Strengths:**

- Best-in-class single-agent capability for complex refactoring and feature work
- High PR merge quality indicating strong contextual understanding
- Proven production deployment track record
- Clear roadmap for 50% internal code production by end of 2026

**Weaknesses:**

- Only 15% success rate on complex end-to-end tasks requiring senior judgment
- No multi-agent orchestration (single Devin instance per task)
- Proprietary, closed-source, vendor-locked
- Expensive (no public pricing, but enterprise-tier cost)
- No team collaboration features

**Target Audience:** Enterprise teams wanting to offload junior engineer tasks; not suitable for cross-repo fleet operations

---
51
+
52
+ #### 2. **OpenHands (formerly OpenDevin)**
53
+
54
+ **Status:** Open source, MIT license
55
+ **Positioning:** Open platform for AI coding agents
56
+
57
+ **Key Features:**
58
+
59
+ - Model-agnostic (works with Claude, GPT, local LLMs)
60
+ - Python SDK for composable agent definitions
61
+ - CLI and cloud scalability (supports 1000s of agents)
62
+ - 188+ contributors, 2100+ contributions
63
+ - Evaluation harness with 15+ benchmarks
64
+
65
+ **Strengths:**
66
+
67
+ - Fully open source with permissive MIT license
68
+ - Model flexibility (not locked to specific vendor)
69
+ - Academic credibility (university partnerships)
70
+ - Rich evaluation infrastructure
71
+ - Community-driven development
72
+
73
+ **Weaknesses:**
74
+
75
+ - Lacks orchestration for multi-agent teams (designed for single agents at scale)
76
+ - No daemon-driven autonomous issue processing
77
+ - No persistent memory system
78
+ - No DORA metrics or cost intelligence
79
+ - No pipeline composition (no workflow/delivery pipeline)
80
+ - Limited production deployment evidence
81
+
82
+ **Target Audience:** Teams wanting to self-host agents; researchers; teams favoring flexibility over workflow automation
83
+
84
+ ---
85
+
86
+ #### 3. **SWE-agent (Princeton/Stanford)**
87
+
88
+ **Status:** Open source, academic research
89
+ **Positioning:** Agent-computer interface for repository fixing
90
+
91
+ **Key Technology:**
92
+
93
+ - Custom agent-computer interface (ACI) optimized for file editing and repo navigation
94
+ - Mini-SWE-agent: achieves >74% on SWE-bench verified in just 100 lines of Python
95
+ - Presented at NeurIPS 2024
96
+
97
+ **Strengths:**
98
+
99
+ - Exceptional performance on SWE-bench (gold standard for code agents)
100
+ - Minimal, elegant approach (100-line reference implementation)
101
+ - Strong academic credentials
102
+ - Highly efficient agent-environment interaction
103
+
104
+ **Weaknesses:**
105
+
106
+ - Single-agent only, no multi-agent orchestration
107
+ - Research-focused, not production-optimized
108
+ - No delivery pipeline or deployment automation
109
+ - No daemon or autonomous processing
110
+ - Limited persistence or learning systems
111
+ - Essentially a benchmark-specific tool, not a production delivery platform
112
+
113
+ **Target Audience:** Researchers, academics, teams benchmarking agent capability; not a production delivery solution
114
+
115
+ ---

#### 4. **GitHub Copilot Workspace + Agent Mode**

**Status:** Commercial (Microsoft/GitHub)
**Positioning:** AI agents integrated into GitHub

**Recent Developments (2025):**

- Agent Mode: iterative self-correction, error recognition, auto-fixing
- Coding Agent (GA in May 2025): asynchronous autonomous developer agent
- Model Context Protocol (MCP) support for custom tools
- "Project Padawan": future autonomous task completion from issue to PR

**Strengths:**

- Native GitHub integration (issues → agent → PR → merge workflow)
- Multi-model choice (Claude, GPT via MCP)
- Asynchronous execution (can work in background)
- Mission Control: parallel task orchestration for large refactors
- Built-in at $20/month for Copilot Pro users

**Weaknesses:**

- Proprietary, closed-source
- No cross-repo fleet operations
- No daemon-driven issue watching (GitHub-native only)
- Limited multi-agent team coordination
- No persistent learning or memory system
- Pricing per-user, not per-task

**Target Audience:** Teams already on GitHub; enterprises with Copilot Pro; teams wanting low-friction integration

---

#### 5. **Cursor IDE, Windsurf, Cline**

**Status:** Commercial/Open source hybrid
**Positioning:** AI-powered development environments

**Market Context:**

- 85% of developers use some form of AI coding tool by end of 2025
- Cursor: $20/month, strong on IDE polish
- Windsurf: acquired by Cognition (Devin's parent), $15/month, deep agentic planning
- Cline: open-source, runs in VS Code or terminal, local-first control

**Strengths:**

- Seamless IDE integration (chat, autocomplete, refactor in one environment)
- Cursor/Windsurf are feature-rich and mature
- Cline offers transparency and local control
- Multi-file editing with diff visualization

**Weaknesses:**

- Interactive-only (no daemon for autonomous background processing)
- No pipeline automation or deployment
- No cross-repo orchestration
- No memory or learning systems
- Designed for individual developer workflows, not team delivery

**Target Audience:** Individual developers; teams using VS Code; teams wanting polished IDE experience

---

#### 6. **Amazon Q Developer Agent**

**Status:** Commercial, AWS service
**Positioning:** Enterprise AI coding assistant for AWS

**2025 Performance:**

- 51% SWE-bench verified (state-of-the-art in April 2025)
- 66% on full SWE-bench dataset
- Pricing: $19/user/month

**Strengths:**

- Strong SWE-bench performance
- Deep AWS service knowledge
- Expanded language support (Dart, Go, Kotlin, Rust, Bash, Terraform, etc.)
- Enterprise support and compliance
- Generous capacity (1000 agentic interactions/month, 4000 LOC/month transformations)

**Weaknesses:**

- AWS-centric (optimization bias toward AWS patterns)
- Proprietary, vendor-locked
- No multi-agent orchestration
- No daemon or autonomous processing
- No cross-repo fleet operations
- Deployment limited to AWS

**Target Audience:** AWS-native enterprises; teams using AWS infrastructure; large enterprises wanting compliance

---

#### 7. **v0 by Vercel**

**Status:** Commercial SaaS
**Positioning:** AI UI/full-stack code generation for Next.js

**2026 Roadmap:**

- Full-stack app generation (not just UI)
- End-to-end agentic workflows
- Self-driving deployment infrastructure
- 6M+ developers, 80K+ active teams

**Strengths:**

- Specialized for React/Next.js (deep optimization)
- Vercel infrastructure integration
- UI-first feedback loop (see code working immediately)
- Full-stack ambitions for 2026

**Weaknesses:**

- Narrow focus (Next.js/React only)
- Not suitable for non-web or monolith codebases
- No multi-agent orchestration
- No cross-repo or fleet operations
- Interactive-only

**Target Audience:** Frontend-heavy teams; Next.js/React shops; startups building web apps

---
243
+
244
+ #### 8. **Aider**
245
+
246
+ **Status:** Open source, GPLv3
247
+ **Positioning:** Terminal-based AI pair programming with Git integration
248
+
249
+ **Key Features:**
250
+
251
+ - Multi-file editing with Git commit tracking
252
+ - Works with any LLM (Claude, GPT-4, local models)
253
+ - Codebase-aware (builds internal maps)
254
+ - CLI-native workflow
255
+
256
+ **Strengths:**
257
+
258
+ - Highly trusted in terminal/CLI environments
259
+ - Strong git integration (every edit is a commit)
260
+ - Model-agnostic
261
+ - Proven for refactors and multi-file changes
262
+ - Small, focused scope
263
+
264
+ **Weaknesses:**
265
+
266
+ - CLI-only (not web/IDE-based)
267
+ - No autonomous processing (interactive only)
268
+ - No pipeline or deployment automation
269
+ - No multi-agent or team features
270
+ - No memory or learning systems
271
+
272
+ **Target Audience:** Terminal-loving developers; teams wanting git-native workflows; DevOps engineers
273
+
274
+ ---
275
+
276
+ ### Adjacent Competitors: General Agent Orchestration Frameworks
277
+
278
+ These are not code-specific but compete for the multi-agent orchestration and workflow automation layers:
279
+
280
+ **CrewAI:** Role-playing agent framework, good for multi-agent workflows but not code-optimized
281
+ **AutoGen (Microsoft):** Open-source multi-agent orchestration, general-purpose
282
+ **LangGraph:** Graph-based task orchestration (DAG model), general-purpose
283
+
284
+ **Gap:** None of these are specialized for software delivery pipelines, team coordination in version control workflows, or autonomous issue processing.
285
+
286
+ ---
287
+
288
+ ## Competitive Matrix
289
+
290
+ | Feature | Devin | OpenHands | SWE-agent | Copilot | Cursor | Amazon Q | v0 | Aider | Shipwright |
291
+ | ------------------------- | ------------- | --------- | --------- | -------------- | ----------- | ----------- | ----------- | -------- | ---------------------- |
292
+ | **Model** | Proprietary | Agnostic | Agnostic | Agnostic (MCP) | Proprietary | Proprietary | Proprietary | Agnostic | Agnostic (uses Claude) |
293
+ | **Open Source** | ✗ | ✓ | ✓ | ✗ | Limited | ✗ | ✗ | ✓ | ✓ |
294
+ | **Single Agent** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
295
+ | **Multi-Agent Teams** | ✗ | ✗ | ✗ | Limited | ✗ | ✗ | ✗ | ✗ | **✓** |
296
+ | **Autonomous Processing** | ✓ | ✗ | ✗ | Async | ✗ | ✗ | ✗ | ✗ | **✓** |
297
+ | **Daemon-Driven** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
298
+ | **Issue Watching** | ✗ | ✗ | ✗ | GitHub-native | ✗ | ✗ | ✗ | ✗ | **✓ (multi-tracker)** |
299
+ | **Fleet Operations** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
300
+ | **Delivery Pipeline** | ✗ | ✗ | ✗ | Basic | ✗ | ✗ | Basic | ✗ | **✓ (12 stages)** |
301
+ | **Persistent Memory** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
302
+ | **DORA Metrics** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
303
+ | **Cost Intelligence** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
304
+ | **Git Worktrees** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
305
+ | **Interactive Only** | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
306
+ | **IDE Integration** | ✗ | ✗ | ✗ | Native | Native | ✗ | Web | ✗ | tmux |
307
+ | **Pricing** | Enterprise $$ | Free/OSS | Free/OSS | $20/mo | $20/mo | $19/user/mo | Freemium | Free/OSS | **Free/OSS** |
308
+
309
+ ---

## Market Trends & Insights

### 1. **Benchmark Performance Explosion**

- **SWE-bench Verified:** Top models now exceed 80% (Claude Opus 4.6 leads at 80.8%)
- **Reality Gap:** Despite high benchmark scores, real-world production success remains 23-25% on SWE-bench Pro (stricter evaluation)
- **Lesson:** The gap between lab performance and production suggests that agent orchestration, workflow automation, and learning systems will be competitive differentiators

**Implication for Shipwright:** Persistent memory, DORA metrics, and pipeline composition address this gap by learning from failures and improving over time.

---

### 2. **AI Agent Market Growth**

- **2025 Developer Adoption:** 85% of developers use some form of AI coding tool
- **Market Projection:** $8.5B by 2026, $35B by 2030
- **Risk:** 40% of agentic AI projects could be cancelled by 2027 due to cost, scaling complexity, or risk

**Implication for Shipwright:** Open-source, self-hosted positioning addresses cost and compliance concerns. The autonomous daemon addresses complexity concerns by reducing the manual orchestration burden.

---

### 3. **Shift from Code Generation to Workflow Automation**

- **2024-2025:** Focus was on single-agent capability (Devin, SWE-agent, Copilot)
- **2026 Trend:** Market moving toward multi-agent teams, orchestration, and delivery pipelines
- **Evidence:** Devin's shift to "fleet management," Copilot's Mission Control, Vercel's agentic workflows, GitHub's Project Padawan

**Implication for Shipwright:** This is exactly Shipwright's core value: multi-agent orchestration, delivery pipelines, fleet operations. The market is moving toward this positioning.

---

### 4. **Model Vendor Lock-in vs. Flexibility**

- **Proprietary Tools:** Devin, Cursor, Windsurf, Amazon Q are locked to specific models
- **Flexible Tools:** OpenHands, SWE-agent, Claude Code, Aider support model switching
- **2026 Trend:** MCP (Model Context Protocol) and A2A (Agent-to-Agent) protocols emerging as interoperability standards

**Implication for Shipwright:** Shipwright currently drives Claude Code but is architecturally agent-agnostic (it could theoretically drive other agents). This positions Shipwright as a platform layer, not a vendor lock-in tool.

---

### 5. **Autonomous vs. Interactive**

- **Interactive Tools Dominate:** Most tools (Cursor, Windsurf, Copilot, Aider) require human-in-the-loop
- **Autonomous Leaders:** Devin and Copilot Agent Mode pioneer background processing
- **2026 Direction:** Enterprises increasingly demand autonomous, asynchronous processing (teams don't want to babysit agents)

**Implication for Shipwright:** Daemon-driven autonomous processing is a strong differentiator. Competitors are just starting to offer this; Shipwright has it built-in.

---

### 6. **Team Collaboration & Multi-Agent Trends**

- **Current State:** Most tools are single-agent or single-user focused
- **Emerging:** GitHub Copilot Workspace, Devin teams, Claude Code agent teams
- **Challenge:** Coordinating multiple agents without conflicts or duplicate work
- **Gap:** Enterprise frameworks (CrewAI, AutoGen, LangGraph) remain generic; none is purpose-built for software delivery

**Implication for Shipwright:** Multi-agent team orchestration with git-based coordination (worktrees, branch isolation) is a rare capability. This is a strong market differentiator.

---
373
+
374
+ ### 7. **Enterprise Adoption Requirements**
375
+
376
+ Enterprises moving beyond proof-of-concept demand:
377
+
378
+ - **Cost Visibility:** Token usage, budget controls, ROI tracking
379
+ - **Audit Trail:** What changed, why, approval workflows
380
+ - **Integration:** GitHub, Linear, Jira, Slack, CI/CD pipelines
381
+ - **Compliance:** Self-hosted, data privacy, role-based access
382
+
383
+ **Implication for Shipwright:** Cost intelligence, memory system, GitHub/Linear/Jira integration, tmux-native workflow, open-source self-hosting all address enterprise adoption barriers.
384
+
385
+ ---
386
+
387
+ ### 8. **Specialization vs. Generalization**
388
+
389
+ - **Specialized Wins:** v0 (React/Next.js), SWE-agent (benchmark optimization), Devin (end-to-end tasks)
390
+ - **Generalist Tools:** OpenHands, Claude Code, Aider (any language, any task)
391
+ - **Market Lesson:** Specialization wins on depth; generalization wins on breadth
392
+
393
+ **Implication for Shipwright:** Positioned as a delivery platform (generalist), but can specialize via templates, team configurations, and domain-specific agent definitions.
394
+
395
+ ---
396
+
397
+ ## Shipwright's Unique Market Position
398
+
399
+ ### What Shipwright Does That Competitors Don't
400
+
| Capability | Unique to Shipwright? | Value Proposition |
| ----------------------------------- | --------------------- | --------------------------------------------------------------------------------------------------- |
| Multi-agent team orchestration | Nearly unique | Parallel feature work, cross-layer coordination (frontend+backend+tests) |
| Daemon-driven autonomous processing | Nearly unique | Background issue watching → full pipeline without manual intervention |
| Fleet operations (multi-repo) | Unique | Scale orchestration across 10+ repos with single daemon |
| Persistent memory system | Unique | Agents learn from failures, improve over time, capture institutional knowledge |
| DORA metrics integration | Unique | Measure delivery performance (lead time, deployment frequency, CFR, MTTR) |
| Cost intelligence | Unique | Token budgeting, cost per issue, ROI tracking |
| Git worktree isolation | Unique | True parallel pipelines without branch conflicts |
| 12-stage delivery pipeline | Unique | intake → plan → design → build → test → review → quality → PR → merge → deploy → validate → monitor |
| Issue tracker integration | Unique | GitHub, Linear, Jira bidirectional sync with daemon auto-processing |
| tmux-native workflow | Unique | Professional TUI, team panes, session persistence, Claude Code optimized |
| Open source + self-hosted | Rare | All features available without vendor lock-in |

---

## Market Gaps Shipwright Fills

### Gap 1: No One Orchestrates Multi-Agent Teams for Delivery

**Problem:** Devin, Copilot, OpenHands are single-agent. GitHub/Copilot have limited multi-agent support.
**Shipwright Solution:** Full multi-agent team orchestration with role-based coordination (builder, reviewer, tester, optimizer, docs, security).

### Gap 2: No One Processes Issues Autonomously at Scale

**Problem:** All tools require human interaction. No daemon watching GitHub for labeled issues.
**Shipwright Solution:** Daemon watches GitHub, Linear, Jira → spawns teams → full pipeline → auto-merge/deploy.
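
The watch/dispatch loop behind Gap 2 can be sketched in a few lines of shell. This is a hand-rolled illustration, not Shipwright's daemon: the `sw-ready` label and the `dispatch`/`process_once` split are assumptions, though `gh issue list --label --state --json` is the real GitHub CLI interface:

```shell
#!/usr/bin/env sh
# Sketch of a label-driven issue watcher. Functions only; a real daemon
# would wrap process_once in a supervised loop (see comment at bottom).

# Fetch open issue numbers carrying the trigger label, one per line.
list_ready_issues() {
    gh issue list --label "sw-ready" --state open \
        --json number --jq '.[].number'
}

# Placeholder for: spawn team -> run pipeline -> open PR.
dispatch() {
    echo "dispatching issue #$1"
}

# One poll cycle: dispatch every currently-ready issue.
process_once() {
    list_ready_issues | while read -r n; do
        dispatch "$n"
    done
}

# A real daemon would run something like:
#   while true; do process_once; sleep 60; done
```

The same shape generalizes to Linear and Jira by swapping `list_ready_issues` for the corresponding tracker query.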

### Gap 3: No One Operates Across Multiple Repos

**Problem:** All tools optimize for single-repo or single-codebase.
**Shipwright Solution:** Fleet operations with shared worker pool, rebalancing, fleet metrics.

### Gap 4: No One Learns from Failures

**Problem:** Each agent run is isolated. No institutional knowledge transfer.
**Shipwright Solution:** Memory system captures failure patterns, injects context into future runs, agents improve over time.
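
A failure-pattern memory of the kind Gap 4 describes can be as simple as an append-only log that gets queried back into context. A deliberately naive sketch (the file layout, `remember`/`recall` names, and the lesson strings are invented for illustration):

```shell
#!/usr/bin/env sh
# Naive failure-pattern memory: record failures as tab-separated lines,
# then pull matching past failures back out when a similar task starts.
set -eu

MEMORY=$(mktemp)

# remember <component> <lesson>
remember() {
    printf '%s\t%s\n' "$1" "$2" >> "$MEMORY"
}

# recall <component> -> lessons previously recorded for that component
recall() {
    awk -F '\t' -v c="$1" '$1 == c { print $2 }' "$MEMORY"
}

remember auth "flaky test: retry token refresh before asserting"
remember billing "migrations must run before seeding fixtures"
remember auth "mock the clock: token expiry is time-dependent"

# Before an agent starts an auth task, inject prior lessons as context:
recall auth
```

A production system would add relevance ranking and decay, but the loop is the same: every failure writes, every new run reads.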

### Gap 5: No One Measures Delivery Performance

**Problem:** No visibility into DORA metrics (lead time, deployment frequency, CFR, MTTR).
**Shipwright Solution:** Native DORA metrics, self-optimization based on metrics.
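
Two of the four DORA metrics reduce to simple arithmetic over event timestamps. A sketch using epoch seconds and a tiny inline dataset (the sample numbers are hypothetical; real inputs would come from git history and deploy logs):

```shell
#!/usr/bin/env sh
# Lead time for changes = merge time minus first-commit time, averaged.
# Deployment frequency = deploys per observed day.
set -eu

REPORT=$(awk 'BEGIN {
    # (first_commit_epoch, merge_epoch) for three merged changes:
    # (1000, 8200) (2000, 16400) (3000, 24600) -> 2h, 4h, 6h lead times
    lead = (8200 - 1000) + (16400 - 2000) + (24600 - 3000)
    printf "mean lead time: %.1f hours\n", lead / 3 / 3600
    # 12 deployments observed over a 7-day window
    printf "deployment frequency: %.1f deploys/day\n", 12 / 7
}')
echo "$REPORT"
# -> mean lead time: 4.0 hours
# -> deployment frequency: 1.7 deploys/day
```

Change failure rate and MTTR follow the same pattern: counts and timestamp deltas over incident events.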

### Gap 6: No One Provides True Cost Visibility

**Problem:** Token usage hidden, budgeting impossible, ROI unclear.
**Shipwright Solution:** Token tracking, daily budgets, cost per issue, cost forecasting.
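
The cost arithmetic behind Gap 6 is equally mechanical: tokens times per-token price, rolled up per issue and checked against a daily budget. A sketch with made-up per-million-token prices and usage counts (none of these rates are real):

```shell
#!/usr/bin/env sh
# Toy cost roll-up: per-issue token spend -> dollars -> budget line.
# Prices ($3/M input, $15/M output) and token counts are illustrative.
set -eu

# issue_id  input_tokens  output_tokens
USAGE="101 800000 120000
102 450000 60000"

REPORT=$(echo "$USAGE" | awk '
    { cost = $2 / 1e6 * 3.00 + $3 / 1e6 * 15.00
      total += cost
      printf "issue #%s: $%.2f\n", $1, cost }
    END { printf "day total: $%.2f (budget $10.00)\n", total }
')
echo "$REPORT"
# -> issue #101: $4.20
# -> issue #102: $2.25
# -> day total: $6.45 (budget $10.00)
```

With the per-issue figure in hand, forecasting is a moving average, and budget enforcement is a comparison before dispatching the next issue.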

---

## Competitive Threats & Responses

### Threat 1: Devin Continues to Improve

**Status:** Devin produces 25% of Cognition's code, targeting 50% by end of 2026
**Risk Level:** High for single-agent use cases
**Shipwright Response:**

- Devin excels at single complex tasks, but can't coordinate multi-agent teams
- Shipwright positions as the "orchestration layer" — you could theoretically use Devin agents within Shipwright pipelines (via API)
- Focus messaging on team coordination, multi-repo ops, autonomous processing

### Threat 2: GitHub Copilot Workspace Becomes Default

**Status:** Copilot is $20/month for 300M+ GitHub users
**Risk Level:** High for GitHub-native teams
**Shipwright Response:**

- Copilot is interactive-only and GitHub-only; Shipwright is autonomous, multi-tracker (GitHub + Linear + Jira)
- Copilot integrations are MCP-friendly; Shipwright can coexist (use Copilot agents within Shipwright)
- Open source, self-hosted model addresses enterprise compliance concerns

### Threat 3: OpenHands Gains Production Traction

**Status:** 188+ contributors, MIT license, fast-growing
**Risk Level:** Medium (positioned differently but overlapping audience)
**Shipwright Response:**

- OpenHands is a single-agent framework at scale; Shipwright is multi-agent orchestration
- OpenHands lacks daemon, pipeline, fleet, memory, DORA metrics
- Could integrate: Shipwright could spawn OpenHands agents in pipelines

### Threat 4: Proprietary Tools Commoditize Open Source

**Status:** Cursor, Windsurf, Devin all moving downmarket
**Risk Level:** Medium for developer mindshare
**Shipwright Response:**

- Shipwright appeals to enterprises and teams wanting self-hosted, cost-predictable solutions
- Developer adoption isn't the goal; team delivery efficiency is
- Focus on DevOps/platform engineers and CTOs, not individual developers

---

## Strategic Recommendations

### 1. **Own the "Orchestration" Market**

Position Shipwright as the orchestration platform for AI agents. Even if Devin or Copilot Agent Mode become the best single agents, Shipwright is the "controller" for teams of agents.

**Messaging:** "Devin is a junior engineer. Shipwright is the engineering manager."

### 2. **Focus on Enterprise Adoption Drivers**

Enterprises care about:

- **Cost control:** Implement token budgeting, per-issue cost tracking, ROI dashboards
- **Compliance:** Self-hosted, audit trails, role-based access
- **Integration:** Deep GitHub/Linear/Jira support, CI/CD webhooks, Slack notifications
- **Metrics:** DORA metrics, burn charts, velocity tracking

### 3. **Build a Market for Autonomous Delivery**

Most tools are interactive. Position Shipwright as "the autonomous delivery platform" — teams configure it once, the daemon runs in the background, and PRs arrive pre-reviewed.

**Differentiator:** "Write once, ship continuously."

### 4. **Develop an Agent Marketplace**

Create a marketplace for pre-built agents, team templates, and pipeline configurations. This creates network effects and switching costs.

**Examples:**

- Agent: "Security Specialist" (scans for OWASP Top 10)
- Agent: "Performance Reviewer" (benchmarks before/after)
- Template: "Monolith to Microservices" (multi-agent refactor)
- Template: "Legacy Framework Upgrade" (coordinated dependency updates)

### 5. **Memory as a Competitive Moat**

Invest heavily in the memory system. Agents that learn from failures are 2-3x more effective than those that don't.

**Market Positioning:** "Agents that get smarter with every issue."

### 6. **Target the "DevOps/Platform Team" Buyer**

These teams:

- Want to scale developer productivity without hiring
- Care about metrics and ROI
- Manage multiple repos/teams
- Run in-house infrastructure

**Shipwright fits perfectly:** "A platform engineering tool for AI."

### 7. **Prepare for LLM Model Commoditization**

As Claude, GPT, and others converge on capability, orchestration and workflow will be the differentiator.

**Strategy:** Make Shipwright model-agnostic (can swap Claude for any LLM via API). This future-proofs against model commoditization.

---

## Market Sizing Estimates

### Total Addressable Market (TAM)

- **Enterprise teams** managing 10+ repos: ~100K globally
- **Cloud-native orgs** with CI/CD: ~500K globally
- At $50-200/month per team, the enterprise segment alone is roughly $60M-240M/year

### Serviceable Addressable Market (SAM)

- **Tier 1:** Tech companies, fast-growth startups: ~50K teams
- At $100-300/month = $60M-180M/year

### Shipwright Target (SOM)

- **Year 1:** 100 teams (free/open source, early adopters)
- **Year 2:** 500 teams (commercial + open source mix)
- **Year 3:** 2000 teams
- At an average $50/month (blended): $1.2M/year by year 3

---

## Conclusion

Shipwright operates in a market with clear tailwinds:

- Multi-agent orchestration is emerging as critical
- Autonomous, daemon-driven processing is becoming table stakes
- Enterprise adoption is increasing, driving demand for self-hosted, auditable solutions
- Market is moving from code generation (where proprietary tools lead) to delivery pipeline automation (where open, flexible platforms win)

**Unique value proposition:** The only open-source platform for autonomous, multi-agent, multi-repo software delivery with persistent learning and complete observability.

**Key success factors:**

1. Aggressive investment in memory system and agent learning
2. Deep enterprise integrations (GitHub, Linear, Jira, Slack, CI/CD)
3. Cost intelligence as a first-class feature
4. Agent and template marketplace to build network effects
5. Positioning as "orchestration layer," not single-agent competitor

**6-month priorities:**

- Ship memory system v2 (failure pattern injection)
- Launch cost intelligence dashboard
- Add Linear/Jira parity with GitHub
- Develop 3-5 production agent templates
- Secure 10 enterprise pilot customers

---

## Sources

- [Cognition | Devin's 2025 Performance Review](https://cognition.ai/blog/devin-annual-performance-review-2025)
- [OpenHands | The Open Platform for Cloud Coding Agents](https://openhands.dev/)
- [SWE-agent | GitHub Repository](https://github.com/SWE-agent/SWE-agent)
- [GitHub Copilot | Agent Mode and Features](https://github.com/newsroom/press-releases/agent-mode)
- [Cursor IDE vs Windsurf Comparison](https://research.aimultiple.com/ai-code-editor/)
- [Amazon Q Developer | 2025 Updates](https://aws.amazon.com/blogs/devops/april-2025-amazon-q-developer/)
- [v0 by Vercel | Building Agents and Apps](https://v0.app/)
- [Aider | AI Pair Programming in Terminal](https://aider.chat/)
- [Claude Code | Agent Teams Orchestration](https://code.claude.com/docs/en/agent-teams)
- [AI Coding Agent Market Trends 2026](https://blog.logrocket.com/ai-dev-tool-power-rankings)
- [SWE-Bench Performance Leaderboard](https://scale.com/leaderboard/swe_bench_pro_public)
- [AI Agent Orchestration Frameworks 2026](https://aimultiple.com/agentic-orchestration)
- [Deloitte | AI Agent Orchestration 2026](https://www.deloitte.com/us/en/insights/industry/technology/technology-media-and-telecom-predictions/2026/ai-agent-orchestration.html)