shipwright-cli 1.10.0 → 2.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +114 -36
- package/completions/_shipwright +212 -32
- package/completions/shipwright.bash +97 -25
- package/docs/strategy/01-market-research.md +619 -0
- package/docs/strategy/02-mission-and-brand.md +587 -0
- package/docs/strategy/03-gtm-and-roadmap.md +759 -0
- package/docs/strategy/QUICK-START.txt +289 -0
- package/docs/strategy/README.md +172 -0
- package/package.json +4 -2
- package/scripts/sw +208 -1
- package/scripts/sw-activity.sh +500 -0
- package/scripts/sw-adaptive.sh +925 -0
- package/scripts/sw-adversarial.sh +1 -1
- package/scripts/sw-architecture-enforcer.sh +1 -1
- package/scripts/sw-auth.sh +613 -0
- package/scripts/sw-autonomous.sh +664 -0
- package/scripts/sw-changelog.sh +704 -0
- package/scripts/sw-checkpoint.sh +1 -1
- package/scripts/sw-ci.sh +602 -0
- package/scripts/sw-cleanup.sh +1 -1
- package/scripts/sw-code-review.sh +637 -0
- package/scripts/sw-connect.sh +1 -1
- package/scripts/sw-context.sh +605 -0
- package/scripts/sw-cost.sh +1 -1
- package/scripts/sw-daemon.sh +432 -130
- package/scripts/sw-dashboard.sh +1 -1
- package/scripts/sw-db.sh +540 -0
- package/scripts/sw-decompose.sh +539 -0
- package/scripts/sw-deps.sh +551 -0
- package/scripts/sw-developer-simulation.sh +1 -1
- package/scripts/sw-discovery.sh +412 -0
- package/scripts/sw-docs-agent.sh +539 -0
- package/scripts/sw-docs.sh +1 -1
- package/scripts/sw-doctor.sh +59 -1
- package/scripts/sw-dora.sh +615 -0
- package/scripts/sw-durable.sh +710 -0
- package/scripts/sw-e2e-orchestrator.sh +535 -0
- package/scripts/sw-eventbus.sh +393 -0
- package/scripts/sw-feedback.sh +471 -0
- package/scripts/sw-fix.sh +1 -1
- package/scripts/sw-fleet-discover.sh +567 -0
- package/scripts/sw-fleet-viz.sh +404 -0
- package/scripts/sw-fleet.sh +8 -1
- package/scripts/sw-github-app.sh +596 -0
- package/scripts/sw-github-checks.sh +1 -1
- package/scripts/sw-github-deploy.sh +1 -1
- package/scripts/sw-github-graphql.sh +1 -1
- package/scripts/sw-guild.sh +569 -0
- package/scripts/sw-heartbeat.sh +1 -1
- package/scripts/sw-hygiene.sh +559 -0
- package/scripts/sw-incident.sh +617 -0
- package/scripts/sw-init.sh +88 -1
- package/scripts/sw-instrument.sh +699 -0
- package/scripts/sw-intelligence.sh +1 -1
- package/scripts/sw-jira.sh +1 -1
- package/scripts/sw-launchd.sh +363 -28
- package/scripts/sw-linear.sh +1 -1
- package/scripts/sw-logs.sh +1 -1
- package/scripts/sw-loop.sh +64 -3
- package/scripts/sw-memory.sh +1 -1
- package/scripts/sw-mission-control.sh +487 -0
- package/scripts/sw-model-router.sh +545 -0
- package/scripts/sw-otel.sh +596 -0
- package/scripts/sw-oversight.sh +689 -0
- package/scripts/sw-pipeline-composer.sh +1 -1
- package/scripts/sw-pipeline-vitals.sh +1 -1
- package/scripts/sw-pipeline.sh +687 -24
- package/scripts/sw-pm.sh +693 -0
- package/scripts/sw-pr-lifecycle.sh +522 -0
- package/scripts/sw-predictive.sh +1 -1
- package/scripts/sw-prep.sh +1 -1
- package/scripts/sw-ps.sh +1 -1
- package/scripts/sw-public-dashboard.sh +798 -0
- package/scripts/sw-quality.sh +595 -0
- package/scripts/sw-reaper.sh +1 -1
- package/scripts/sw-recruit.sh +573 -0
- package/scripts/sw-regression.sh +642 -0
- package/scripts/sw-release-manager.sh +736 -0
- package/scripts/sw-release.sh +706 -0
- package/scripts/sw-remote.sh +1 -1
- package/scripts/sw-replay.sh +520 -0
- package/scripts/sw-retro.sh +691 -0
- package/scripts/sw-scale.sh +444 -0
- package/scripts/sw-security-audit.sh +505 -0
- package/scripts/sw-self-optimize.sh +1 -1
- package/scripts/sw-session.sh +1 -1
- package/scripts/sw-setup.sh +1 -1
- package/scripts/sw-standup.sh +712 -0
- package/scripts/sw-status.sh +1 -1
- package/scripts/sw-strategic.sh +658 -0
- package/scripts/sw-stream.sh +450 -0
- package/scripts/sw-swarm.sh +583 -0
- package/scripts/sw-team-stages.sh +511 -0
- package/scripts/sw-templates.sh +1 -1
- package/scripts/sw-testgen.sh +515 -0
- package/scripts/sw-tmux-pipeline.sh +554 -0
- package/scripts/sw-tmux.sh +1 -1
- package/scripts/sw-trace.sh +485 -0
- package/scripts/sw-tracker-github.sh +188 -0
- package/scripts/sw-tracker-jira.sh +172 -0
- package/scripts/sw-tracker-linear.sh +251 -0
- package/scripts/sw-tracker.sh +117 -2
- package/scripts/sw-triage.sh +603 -0
- package/scripts/sw-upgrade.sh +1 -1
- package/scripts/sw-ux.sh +677 -0
- package/scripts/sw-webhook.sh +627 -0
- package/scripts/sw-widgets.sh +530 -0
- package/scripts/sw-worktree.sh +1 -1
@@ -0,0 +1,619 @@
# Market Research: AI Coding Agent Landscape

**Date:** February 2026
**Author:** Market Research Analysis
**Status:** Competitive Landscape & Strategic Positioning

---

## Executive Summary

The AI coding agent market has grown explosively in 2025-2026, with autonomous software engineering capabilities maturing rapidly. Leading models now exceed 80% success rates on standardized benchmarks, developer adoption reached 85% by the end of 2025, and market projections put the segment at $8.5 billion by 2026 and $35 billion by 2030.

Shipwright occupies a unique position in this landscape: it is the only open-source, team-oriented orchestration platform designed specifically for multi-agent delivery pipelines with daemon-driven autonomous processing, persistent learning systems, and fleet operations across multiple repositories. This analysis identifies key competitors, market trends, and strategic differentiation opportunities.

---

## Competitive Landscape

### Direct Competitors
#### 1. **Devin (Cognition Labs)**

**Status:** Commercial, $$$
**Positioning:** Autonomous AI software engineer

**Key Achievements:**

- Production-deployed at thousands of companies (Goldman Sachs, Santander, Nubank)
- 67% PR merge rate (up from 34% in 2024)
- Produces 25% of Cognition's internal pull requests
- 4x faster problem-solving and 2x more resource-efficient than the prior year

**Strengths:**

- Best-in-class single-agent capability for complex refactoring and feature work
- High PR merge quality indicating strong contextual understanding
- Proven production deployment track record
- Clear roadmap for 50% internal code production by end of 2026

**Weaknesses:**

- Only a 15% success rate on complex end-to-end tasks requiring senior judgment
- No multi-agent orchestration (single Devin instance per task)
- Proprietary, closed-source, vendor-locked
- Expensive (no public pricing, but enterprise-tier cost)
- No team collaboration features

**Target Audience:** Enterprise teams wanting to offload junior engineer tasks; not suitable for cross-repo fleet operations

---
#### 2. **OpenHands (formerly OpenDevin)**

**Status:** Open source, MIT license
**Positioning:** Open platform for AI coding agents

**Key Features:**

- Model-agnostic (works with Claude, GPT, local LLMs)
- Python SDK for composable agent definitions
- CLI and cloud scalability (supports 1000s of agents)
- 188+ contributors, 2100+ contributions
- Evaluation harness with 15+ benchmarks

**Strengths:**

- Fully open source with permissive MIT license
- Model flexibility (not locked to specific vendor)
- Academic credibility (university partnerships)
- Rich evaluation infrastructure
- Community-driven development

**Weaknesses:**

- Lacks orchestration for multi-agent teams (designed for single agents at scale)
- No daemon-driven autonomous issue processing
- No persistent memory system
- No DORA metrics or cost intelligence
- No pipeline composition (no workflow/delivery pipeline)
- Limited production deployment evidence

**Target Audience:** Teams wanting to self-host agents; researchers; teams favoring flexibility over workflow automation

---
#### 3. **SWE-agent (Princeton/Stanford)**

**Status:** Open source, academic research
**Positioning:** Agent-computer interface for repository fixing

**Key Technology:**

- Custom agent-computer interface (ACI) optimized for file editing and repo navigation
- Mini-SWE-agent: achieves >74% on SWE-bench Verified in just 100 lines of Python
- Presented at NeurIPS 2024

**Strengths:**

- Exceptional performance on SWE-bench (gold standard for code agents)
- Minimal, elegant approach (100-line reference implementation)
- Strong academic credentials
- Highly efficient agent-environment interaction

**Weaknesses:**

- Single-agent only, no multi-agent orchestration
- Research-focused, not production-optimized
- No delivery pipeline or deployment automation
- No daemon or autonomous processing
- Limited persistence or learning systems
- Essentially a benchmark-specific tool, not a production delivery platform

**Target Audience:** Researchers, academics, teams benchmarking agent capability; not a production delivery solution

---
#### 4. **GitHub Copilot Workspace + Agent Mode**

**Status:** Commercial (Microsoft/GitHub)
**Positioning:** AI agents integrated into GitHub

**Recent Developments (2025):**

- Agent Mode: iterative self-correction, error recognition, auto-fixing
- Coding Agent (GA in May 2025): asynchronous autonomous developer agent
- Model Context Protocol (MCP) support for custom tools
- "Project Padawan": future autonomous task completion from issue to PR

**Strengths:**

- Native GitHub integration (issues → agent → PR → merge workflow)
- Multi-model choice (Claude, GPT via MCP)
- Asynchronous execution (can work in background)
- Mission Control: parallel task orchestration for large refactors
- Built-in at $20/month for Copilot Pro users

**Weaknesses:**

- Proprietary, closed-source
- No cross-repo fleet operations
- No daemon-driven issue watching (GitHub-native only)
- Limited multi-agent team coordination
- No persistent learning or memory system
- Pricing per-user, not per-task

**Target Audience:** Teams already on GitHub; enterprises with Copilot Pro; teams wanting low-friction integration

---
#### 5. **Cursor IDE, Windsurf, Cline**

**Status:** Commercial/open-source hybrid
**Positioning:** AI-powered development environments

**Market Context:**

- 85% of developers use some form of AI coding tool by end of 2025
- Cursor: $20/month, strong on IDE polish
- Windsurf: acquired by Cognition (Devin's parent), $15/month, deep agentic planning
- Cline: open-source, runs in VS Code or terminal, local-first control

**Strengths:**

- Seamless IDE integration (chat, autocomplete, refactor in one environment)
- Cursor/Windsurf are feature-rich and mature
- Cline offers transparency and local control
- Multi-file editing with diff visualization

**Weaknesses:**

- Interactive-only (no daemon for autonomous background processing)
- No pipeline automation or deployment
- No cross-repo orchestration
- No memory or learning systems
- Designed for individual developer workflows, not team delivery

**Target Audience:** Individual developers; teams using VS Code; teams wanting a polished IDE experience

---
#### 6. **Amazon Q Developer Agent**

**Status:** Commercial, AWS service
**Positioning:** Enterprise AI coding assistant for AWS

**2025 Performance:**

- 51% SWE-bench Verified (state-of-the-art in April 2025)
- 66% on the full SWE-bench dataset
- Pricing: $19/user/month

**Strengths:**

- Strong SWE-bench performance
- Deep AWS service knowledge
- Expanded language support (Dart, Go, Kotlin, Rust, Bash, Terraform, etc.)
- Enterprise support and compliance
- Generous capacity (1000 agentic interactions/month, 4000 LOC/month transformations)

**Weaknesses:**

- AWS-centric (optimization bias toward AWS patterns)
- Proprietary, vendor-locked
- No multi-agent orchestration
- No daemon or autonomous processing
- No cross-repo fleet operations
- Deployment limited to AWS

**Target Audience:** AWS-native enterprises; teams using AWS infrastructure; large enterprises wanting compliance

---
#### 7. **v0 by Vercel**

**Status:** Commercial SaaS
**Positioning:** AI UI/full-stack code generation for Next.js

**2026 Roadmap:**

- Full-stack app generation (not just UI)
- End-to-end agentic workflows
- Self-driving deployment infrastructure
- 6M+ developers, 80K+ active teams

**Strengths:**

- Specialized for React/Next.js (deep optimization)
- Vercel infrastructure integration
- UI-first feedback loop (see code working immediately)
- Full-stack ambitions for 2026

**Weaknesses:**

- Narrow focus (Next.js/React only)
- Not suitable for non-web or monolith codebases
- No multi-agent orchestration
- No cross-repo or fleet operations
- Interactive-only

**Target Audience:** Frontend-heavy teams; Next.js/React shops; startups building web apps

---
#### 8. **Aider**

**Status:** Open source, GPLv3
**Positioning:** Terminal-based AI pair programming with Git integration

**Key Features:**

- Multi-file editing with Git commit tracking
- Works with any LLM (Claude, GPT-4, local models)
- Codebase-aware (builds internal maps)
- CLI-native workflow

**Strengths:**

- Highly trusted in terminal/CLI environments
- Strong git integration (every edit is a commit)
- Model-agnostic
- Proven for refactors and multi-file changes
- Small, focused scope

**Weaknesses:**

- CLI-only (not web/IDE-based)
- No autonomous processing (interactive only)
- No pipeline or deployment automation
- No multi-agent or team features
- No memory or learning systems

**Target Audience:** Terminal-loving developers; teams wanting git-native workflows; DevOps engineers

---
### Adjacent Competitors: General Agent Orchestration Frameworks

These are not code-specific but compete for the multi-agent orchestration and workflow automation layers:

- **CrewAI:** Role-playing agent framework, good for multi-agent workflows but not code-optimized
- **AutoGen (Microsoft):** Open-source multi-agent orchestration, general-purpose
- **LangGraph:** Graph-based task orchestration (DAG model), general-purpose

**Gap:** None of these are specialized for software delivery pipelines, team coordination in version control workflows, or autonomous issue processing.

---
## Competitive Matrix

| Feature | Devin | OpenHands | SWE-agent | Copilot | Cursor | Amazon Q | v0 | Aider | Shipwright |
| ------------------------- | ------------- | --------- | --------- | -------------- | ----------- | ----------- | ----------- | -------- | ---------------------- |
| **Model** | Proprietary | Agnostic | Agnostic | Agnostic (MCP) | Proprietary | Proprietary | Proprietary | Agnostic | Agnostic (uses Claude) |
| **Open Source** | ✗ | ✓ | ✓ | ✗ | Limited | ✗ | ✗ | ✓ | ✓ |
| **Single Agent** | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| **Multi-Agent Teams** | ✗ | ✗ | ✗ | Limited | ✗ | ✗ | ✗ | ✗ | **✓** |
| **Autonomous Processing** | ✓ | ✗ | ✗ | Async | ✗ | ✗ | ✗ | ✗ | **✓** |
| **Daemon-Driven** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
| **Issue Watching** | ✗ | ✗ | ✗ | GitHub-native | ✗ | ✗ | ✗ | ✗ | **✓ (multi-tracker)** |
| **Fleet Operations** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
| **Delivery Pipeline** | ✗ | ✗ | ✗ | Basic | ✗ | ✗ | Basic | ✗ | **✓ (12 stages)** |
| **Persistent Memory** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
| **DORA Metrics** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
| **Cost Intelligence** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
| **Git Worktrees** | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | **✓** |
| **Interactive Only** | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ |
| **IDE Integration** | ✗ | ✗ | ✗ | Native | Native | ✗ | Web | ✗ | tmux |
| **Pricing** | Enterprise $$ | Free/OSS | Free/OSS | $20/mo | $20/mo | $19/user/mo | Freemium | Free/OSS | **Free/OSS** |

---
## Market Trends & Insights

### 1. **Benchmark Performance Explosion**

- **SWE-bench Verified:** Top models now exceed 80% (Claude Opus 4.6 leads at 80.8%)
- **Reality Gap:** Despite high benchmark scores, real-world production success remains 23-25% on SWE-bench Pro (stricter evaluation)
- **Lesson:** The gap between lab performance and production suggests that agent orchestration, workflow automation, and learning systems will be competitive differentiators

**Implication for Shipwright:** Persistent memory, DORA metrics, and pipeline composition address this gap by learning from failures and improving over time.

---
### 2. **AI Agent Market Growth**

- **2025 Developer Adoption:** 85% of developers use some form of AI coding tool
- **Market Projection:** $8.5B by 2026, $35B by 2030
- **Risk:** 40% of agentic AI projects could be cancelled by 2027 due to cost, scaling complexity, or risk

**Implication for Shipwright:** Open-source, self-hosted positioning addresses cost and compliance concerns. The autonomous daemon addresses complexity concerns by reducing the manual orchestration burden.

---
### 3. **Shift from Code Generation to Workflow Automation**

- **2024-2025:** Focus was on single-agent capability (Devin, SWE-agent, Copilot)
- **2026 Trend:** Market moving toward multi-agent teams, orchestration, and delivery pipelines
- **Evidence:** Devin's shift to "fleet management," Copilot's Mission Control, Vercel's agentic workflows, GitHub's Project Padawan

**Implication for Shipwright:** This is exactly Shipwright's core value: multi-agent orchestration, delivery pipelines, fleet operations. The market is moving toward this positioning.

---
### 4. **Model Vendor Lock-in vs. Flexibility**

- **Proprietary Tools:** Devin, Cursor, Windsurf, Amazon Q are locked to specific models
- **Flexible Tools:** OpenHands, SWE-agent, Claude Code, Aider support model switching
- **2026 Trend:** MCP (Model Context Protocol) and A2A (Agent-to-Agent) protocols emerging as interoperability standards

**Implication for Shipwright:** Shipwright drives Claude Code today but is not hard-wired to it (it could theoretically support other agents). This positions Shipwright as a platform layer, not a vendor lock-in tool.

---
### 5. **Autonomous vs. Interactive**

- **Interactive Tools Dominate:** Most tools (Cursor, Windsurf, Copilot, Aider) require human-in-the-loop
- **Autonomous Leaders:** Devin and Copilot Agent Mode pioneer background processing
- **2026 Direction:** Enterprises increasingly demand autonomous, asynchronous processing (teams don't want to babysit agents)

**Implication for Shipwright:** Daemon-driven autonomous processing is a strong differentiator. Competitors are just starting to offer this; Shipwright has it built-in.

---
### 6. **Team Collaboration & Multi-Agent Trends**

- **Current State:** Most tools are single-agent or single-user focused
- **Emerging:** GitHub Copilot Workspace, Devin teams, Claude Code agent teams
- **Challenge:** Coordinating multiple agents without conflicts or duplicate work
- **Gap:** Enterprise frameworks (CrewAI, AutoGen, LangGraph) remain generic; none are purpose-built for software delivery

**Implication for Shipwright:** Multi-agent team orchestration with git-based coordination (worktrees, branch isolation) is a rare capability. This is a strong market differentiator.

---
### 7. **Enterprise Adoption Requirements**

Enterprises moving beyond proof-of-concept demand:

- **Cost Visibility:** Token usage, budget controls, ROI tracking
- **Audit Trail:** What changed, why, approval workflows
- **Integration:** GitHub, Linear, Jira, Slack, CI/CD pipelines
- **Compliance:** Self-hosted, data privacy, role-based access

**Implication for Shipwright:** Cost intelligence, the memory system, GitHub/Linear/Jira integration, the tmux-native workflow, and open-source self-hosting all address enterprise adoption barriers.

---
### 8. **Specialization vs. Generalization**

- **Specialized Wins:** v0 (React/Next.js), SWE-agent (benchmark optimization), Devin (end-to-end tasks)
- **Generalist Tools:** OpenHands, Claude Code, Aider (any language, any task)
- **Market Lesson:** Specialization wins on depth; generalization wins on breadth

**Implication for Shipwright:** Positioned as a delivery platform (generalist), but can specialize via templates, team configurations, and domain-specific agent definitions.

---
## Shipwright's Unique Market Position

### What Shipwright Does That Competitors Don't

| Capability | Unique to Shipwright? | Value Proposition |
| ----------------------------------- | --------------------- | --------------------------------------------------------------------------------------------------- |
| Multi-agent team orchestration | Nearly unique | Parallel feature work, cross-layer coordination (frontend+backend+tests) |
| Daemon-driven autonomous processing | Nearly unique | Background issue watching → full pipeline without manual intervention |
| Fleet operations (multi-repo) | Unique | Scale orchestration across 10+ repos with a single daemon |
| Persistent memory system | Unique | Agents learn from failures, improve over time, capture institutional knowledge |
| DORA metrics integration | Unique | Measure delivery performance (lead time, deployment frequency, CFR, MTTR) |
| Cost intelligence | Unique | Token budgeting, cost per issue, ROI tracking |
| Git worktree isolation | Unique | True parallel pipelines without branch conflicts |
| 12-stage delivery pipeline | Unique | intake → plan → design → build → test → review → quality → PR → merge → deploy → validate → monitor |
| Issue tracker integration | Unique | GitHub, Linear, Jira bidirectional sync with daemon auto-processing |
| tmux-native workflow | Unique | Professional TUI, team panes, session persistence, Claude Code optimized |
| Open source + self-hosted | Rare | All features available without vendor lock-in |

---
## Market Gaps Shipwright Fills

### Gap 1: No One Orchestrates Multi-Agent Teams for Delivery

**Problem:** Devin, Copilot, and OpenHands are single-agent. GitHub/Copilot have only limited multi-agent support.
**Shipwright Solution:** Full multi-agent team orchestration with role-based coordination (builder, reviewer, tester, optimizer, docs, security).

### Gap 2: No One Processes Issues Autonomously at Scale

**Problem:** All tools require human interaction. No daemon watching GitHub for labeled issues.
**Shipwright Solution:** Daemon watches GitHub, Linear, Jira → spawns teams → full pipeline → auto-merge/deploy.

### Gap 3: No One Operates Across Multiple Repos

**Problem:** All tools optimize for a single repo or a single codebase.
**Shipwright Solution:** Fleet operations with shared worker pool, rebalancing, fleet metrics.
### Gap 4: No One Learns from Failures

**Problem:** Each agent run is isolated. No institutional knowledge transfer.
**Shipwright Solution:** The memory system captures failure patterns and injects context into future runs, so agents improve over time.
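The capture-and-inject loop described here can be made concrete with a small sketch. This is not Shipwright's actual implementation; `FailureMemory`, the signature strings, and the notes are hypothetical, but they illustrate the pattern: key each failure by an error signature, then replay the matching notes as context for the next run.

```python
from collections import defaultdict

class FailureMemory:
    """Toy failure-pattern store: remember what went wrong, keyed by an
    error signature, and surface matching notes before the next run."""

    def __init__(self) -> None:
        self._patterns: dict[str, list[str]] = defaultdict(list)

    def record(self, signature: str, note: str) -> None:
        # Called after a failed pipeline run, e.g. from the test stage.
        self._patterns[signature].append(note)

    def context_for(self, signatures: list[str]) -> list[str]:
        # Everything previously learned about these signatures, ready to
        # inject into the next agent prompt as prior context.
        return [n for sig in signatures for n in self._patterns[sig]]

memory = FailureMemory()
memory.record("pytest:ImportError", "install the package editable before running tests")
memory.record("eslint:no-unused-vars", "strip scaffolded imports before review")
print(memory.context_for(["pytest:ImportError"]))
```

The key design choice is the signature key: coarse signatures ("tests failed") match too much, while overly specific ones never match again, so a tool/error-class pair is a reasonable middle ground.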
### Gap 5: No One Measures Delivery Performance

**Problem:** No visibility into DORA metrics (lead time, deployment frequency, CFR, MTTR).
**Shipwright Solution:** Native DORA metrics, self-optimization based on metrics.
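The four DORA metrics named here have standard definitions that reduce to simple arithmetic over a deployment log. The sketch below uses invented records and is not Shipwright's code, only an illustration of what "native DORA metrics" would compute:

```python
from datetime import datetime
from statistics import mean

# Hypothetical deployment log: (commit_time, deploy_time, failed, restored_at)
deploys = [
    (datetime(2026, 2, 1, 9), datetime(2026, 2, 1, 15), False, None),
    (datetime(2026, 2, 2, 10), datetime(2026, 2, 3, 10), True, datetime(2026, 2, 3, 12)),
    (datetime(2026, 2, 4, 8), datetime(2026, 2, 4, 20), False, None),
]
window_days = 7

# Deployment frequency: deploys per day over the observation window
deploy_freq = len(deploys) / window_days

# Lead time for changes: hours from commit to running in production
lead_time_h = mean((dep - com).total_seconds() / 3600 for com, dep, _, _ in deploys)

# Change failure rate: share of deploys that needed remediation
cfr = sum(failed for _, _, failed, _ in deploys) / len(deploys)

# MTTR: hours from a failed deploy to service restoration
fixes = [(rst - dep).total_seconds() / 3600 for _, dep, failed, rst in deploys if failed]
mttr_h = mean(fixes) if fixes else 0.0

print(f"freq={deploy_freq:.2f}/day lead={lead_time_h:.1f}h cfr={cfr:.0%} mttr={mttr_h:.1f}h")
```

With a daemon already mediating issue → PR → deploy, these timestamps fall out of the pipeline for free, which is why the metric integration is cheap to offer.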
### Gap 6: No One Provides True Cost Visibility

**Problem:** Token usage hidden, budgeting impossible, ROI unclear.
**Shipwright Solution:** Token tracking, daily budgets, cost per issue, cost forecasting.
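Cost-per-issue tracking of this kind is, at its core, a price table applied to token counts and aggregated by issue. The prices, budget, and issue IDs below are invented for illustration and are not real vendor pricing or Shipwright's actual accounting:

```python
# Illustrative price table (USD per million tokens) -- not real vendor pricing
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}
DAILY_BUDGET_USD = 50.00

def run_cost(tokens_in: int, tokens_out: int) -> float:
    # Convert raw token counts from one agent run into dollars
    return (tokens_in * PRICE_PER_MTOK["input"]
            + tokens_out * PRICE_PER_MTOK["output"]) / 1_000_000

# Hypothetical runs: (issue id, input tokens, output tokens)
runs = [("ISSUE-101", 220_000, 18_000),
        ("ISSUE-101", 90_000, 6_000),
        ("ISSUE-102", 400_000, 35_000)]

# Aggregate per issue, then compare total spend against the daily budget
per_issue: dict[str, float] = {}
for issue, tok_in, tok_out in runs:
    per_issue[issue] = per_issue.get(issue, 0.0) + run_cost(tok_in, tok_out)

spend = sum(per_issue.values())
for issue, cost in sorted(per_issue.items()):
    print(f"{issue}: ${cost:.2f}")
print(f"spent ${spend:.2f} of ${DAILY_BUDGET_USD:.2f} daily budget")
```

Once costs are keyed by issue rather than by user, "cost per shipped issue" becomes a direct ROI number, which is the visibility this gap is about.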
---
## Competitive Threats & Responses

### Threat 1: Devin Continues to Improve

**Status:** Devin produces 25% of Cognition's code, targeting 50% by end of 2026
**Risk Level:** High for single-agent use cases
**Shipwright Response:**

- Devin excels at single complex tasks, but can't coordinate multi-agent teams
- Shipwright positions as the "orchestration layer": you could theoretically use Devin agents within Shipwright pipelines (via API)
- Focus messaging on team coordination, multi-repo ops, autonomous processing

### Threat 2: GitHub Copilot Workspace Becomes Default

**Status:** Copilot is $20/month for 300M+ GitHub users
**Risk Level:** High for GitHub-native teams
**Shipwright Response:**

- Copilot is interactive-only and GitHub-only; Shipwright is autonomous, multi-tracker (GitHub + Linear + Jira)
- Copilot integrations are MCP-friendly; Shipwright can coexist (use Copilot agents within Shipwright)
- Open source, self-hosted model addresses enterprise compliance concerns

### Threat 3: OpenHands Gains Production Traction

**Status:** 188+ contributors, MIT license, fast-growing
**Risk Level:** Medium (positioned differently but overlapping audience)
**Shipwright Response:**

- OpenHands is a single-agent framework at scale; Shipwright is multi-agent orchestration
- OpenHands lacks daemon, pipeline, fleet, memory, DORA metrics
- Could integrate: Shipwright could spawn OpenHands agents in pipelines

### Threat 4: Proprietary Tools Commoditize Open Source

**Status:** Cursor, Windsurf, Devin all moving downmarket
**Risk Level:** Medium for developer mindshare
**Shipwright Response:**

- Shipwright appeals to enterprises and teams wanting self-hosted, cost-predictable solutions
- Developer adoption isn't the goal; team delivery efficiency is
- Focus on DevOps/platform engineers and CTOs, not individual developers

---
|
|
495
|
+
## Strategic Recommendations
|
|
496
|
+
|
|
497
|
+
### 1. **Own the "Orchestration" Market**
|
|
498
|
+
|
|
499
|
+
Position Shipwright as the orchestration platform for AI agents. Even if Devin or Copilot Agent Mode become the best single agents, Shipwright is the "controller" for teams of agents.
|
|
500
|
+
|
|
501
|
+
**Messaging:** "Devin is a junior engineer. Shipwright is the engineering manager."
|
|
502
|
+
|
|
503
|
+
### 2. **Focus on Enterprise Adoption Drivers**
|
|
504
|
+
|
|
505
|
+
Enterprises care about:
|
|
506
|
+
|
|
507
|
+
- **Cost control:** Implement token budgeting, per-issue cost tracking, ROI dashboards
|
|
508
|
+
- **Compliance:** Self-hosted, audit trails, role-based access
|
|
509
|
+
- **Integration:** Deep GitHub/Linear/Jira support, CI/CD webhooks, Slack notifications
|
|
510
|
+
- **Metrics:** DORA metrics, burn charts, velocity tracking

### 3. **Build a Market for Autonomous Delivery**

Most tools are interactive. Position Shipwright as "the autonomous delivery platform": teams configure it once, the daemon runs in the background, and PRs arrive pre-reviewed.

**Differentiator:** "Write once, ship continuously."

### 4. **Develop an Agent Marketplace**

Create a marketplace for pre-built agents, team templates, and pipeline configurations. This creates network effects and switching costs.

**Examples:**

- Agent: "Security Specialist" (scans for OWASP Top 10)
- Agent: "Performance Reviewer" (benchmarks before/after)
- Template: "Monolith to Microservices" (multi-agent refactor)
- Template: "Legacy Framework Upgrade" (coordinated dependency updates)

### 5. **Memory as a Competitive Moat**

Invest heavily in the memory system. Agents that learn from failures are 2-3x more effective than those that don't.

**Market Positioning:** "Agents that get smarter with every issue."
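
One minimal way such a memory could work, sketched with illustrative names (the tagging scheme and matching logic are assumptions, not Shipwright's real memory system): store a lesson from each past failure under keyword tags, and prepend matching lessons to the prompt for the next similar task.

```python
class FailureMemory:
    """Hypothetical sketch: lessons from past failures, injected into new prompts."""

    def __init__(self):
        self.patterns = []  # list of (tags, lesson) pairs

    def record_failure(self, tags: set, lesson: str) -> None:
        self.patterns.append((tags, lesson))

    def lessons_for(self, task_tags: set) -> list:
        # A lesson applies when its tags overlap the new task's tags.
        return [lesson for tags, lesson in self.patterns if tags & task_tags]

    def inject(self, prompt: str, task_tags: set) -> str:
        lessons = self.lessons_for(task_tags)
        if not lessons:
            return prompt
        notes = "\n".join(f"- {lesson}" for lesson in lessons)
        return f"{prompt}\n\nKnown failure patterns to avoid:\n{notes}"

memory = FailureMemory()
memory.record_failure({"migrations", "postgres"},
                      "Schema migration failed: wrap DDL in a transaction")
print(memory.inject("Add a users table", {"postgres", "schema"}))
```

A production system would use embeddings rather than keyword overlap, but the feedback loop is the same: every failure makes the next attempt cheaper.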

### 6. **Target the "DevOps/Platform Team" Buyer**

These teams:

- Want to scale developer productivity without hiring
- Care about metrics and ROI
- Manage multiple repos/teams
- Run in-house infrastructure

**Shipwright fits perfectly:** "A platform engineering tool for AI."

### 7. **Prepare for LLM Model Commoditization**

As Claude, GPT, and others converge on capability, orchestration and workflow will be the differentiator.

**Strategy:** Make Shipwright model-agnostic (able to swap Claude for any LLM via API). This future-proofs against model commoditization.
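
The model-agnostic strategy boils down to a narrow provider interface that orchestration code depends on, so vendors can be swapped without touching the pipeline. The provider classes and return values below are hypothetical placeholders, not real API clients.

```python
from typing import Protocol

class LLMProvider(Protocol):
    """The only surface orchestration code is allowed to see."""
    def complete(self, prompt: str) -> str: ...

class ClaudeProvider:
    def complete(self, prompt: str) -> str:
        # A real implementation would call the Anthropic API here.
        return f"[claude] {prompt}"

class LocalProvider:
    def complete(self, prompt: str) -> str:
        # A real implementation would call a self-hosted model endpoint here.
        return f"[local] {prompt}"

def run_agent(provider: LLMProvider, task: str) -> str:
    """Orchestration logic never names a concrete vendor."""
    return provider.complete(f"Resolve issue: {task}")

print(run_agent(ClaudeProvider(), "flaky test in CI"))
print(run_agent(LocalProvider(), "flaky test in CI"))
```

Because the daemon, memory, and cost tracking all sit above this seam, model commoditization improves Shipwright's economics instead of threatening them.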

---

## Market Sizing Estimates

### Total Addressable Market (TAM)

- **Enterprise teams** managing 10+ repos: ~100K globally
- **Cloud-native orgs** with CI/CD: ~500K globally
- At $50-200/month per team = $50M-100M/year market

### Serviceable Addressable Market (SAM)

- **Tier 1:** Tech companies, fast-growth startups: ~50K teams
- At $100-300/month = $50M-150M/year

### Shipwright Target (SOM)

- **Year 1:** 100 teams (free/open source, early adopters)
- **Year 2:** 500 teams (commercial + open source mix)
- **Year 3:** 2000 teams
- At an average of $50/month (blended): $1.2M/year by year 3
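
The year-3 figure follows from straightforward arithmetic; a few lines make the assumptions explicit.

```python
# SOM arithmetic from the estimates above: 2000 teams at a blended $50/month.
teams_year3 = 2000
blended_price_per_month = 50  # USD

annual_revenue = teams_year3 * blended_price_per_month * 12
print(f"Year-3 ARR: ${annual_revenue / 1e6:.1f}M")  # → Year-3 ARR: $1.2M
```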

---

## Conclusion

Shipwright operates in a market with clear tailwinds:

- Multi-agent orchestration is emerging as critical
- Autonomous, daemon-driven processing is becoming table stakes
- Enterprise adoption is increasing, driving demand for self-hosted, auditable solutions
- The market is moving from code generation (where proprietary tools lead) to delivery pipeline automation (where open, flexible platforms win)

**Unique value proposition:** The only open-source platform for autonomous, multi-agent, multi-repo software delivery with persistent learning and complete observability.

**Key success factors:**

1. Aggressive investment in the memory system and agent learning
2. Deep enterprise integrations (GitHub, Linear, Jira, Slack, CI/CD)
3. Cost intelligence as a first-class feature
4. An agent and template marketplace to build network effects
5. Positioning as the "orchestration layer," not a single-agent competitor

**6-month priorities:**

- Ship memory system v2 (failure pattern injection)
- Launch the cost intelligence dashboard
- Add Linear/Jira parity with GitHub
- Develop 3-5 production agent templates
- Secure 10 enterprise pilot customers

---

## Sources

- [Cognition | Devin's 2025 Performance Review](https://cognition.ai/blog/devin-annual-performance-review-2025)
- [OpenHands | The Open Platform for Cloud Coding Agents](https://openhands.dev/)
- [SWE-agent | GitHub Repository](https://github.com/SWE-agent/SWE-agent)
- [GitHub Copilot | Agent Mode and Features](https://github.com/newsroom/press-releases/agent-mode)
- [Cursor IDE vs Windsurf Comparison](https://research.aimultiple.com/ai-code-editor/)
- [Amazon Q Developer | 2025 Updates](https://aws.amazon.com/blogs/devops/april-2025-amazon-q-developer/)
- [v0 by Vercel | Building Agents and Apps](https://v0.app/)
- [Aider | AI Pair Programming in Terminal](https://aider.chat/)
- [Claude Code | Agent Teams Orchestration](https://code.claude.com/docs/en/agent-teams)
- [AI Coding Agent Market Trends 2026](https://blog.logrocket.com/ai-dev-tool-power-rankings)
- [SWE-Bench Performance Leaderboard](https://scale.com/leaderboard/swe_bench_pro_public)
- [AI Agent Orchestration Frameworks 2026](https://aimultiple.com/agentic-orchestration)
- [Deloitte | AI Agent Orchestration 2026](https://www.deloitte.com/us/en/insights/industry/technology/technology-media-and-telecom-predictions/2026/ai-agent-orchestration.html)