@alexandrealvaro/agentic 0.9.0-beta.1 → 0.9.2-beta.1
- package/README.md +2 -0
- package/WORKFLOW.md +9 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -2,6 +2,8 @@

 A starter kit for engineering production code with LLMs. Lean templates and init prompts grounded in established standards: [Anthropic Skills](https://code.claude.com/docs/en/skills), [Claude Code subagents](https://code.claude.com/docs/en/sub-agents), [agents.md](https://agents.md), Nygard ADRs, [GitHub Spec Kit](https://github.com/github/spec-kit), and [Google Labs DESIGN.md](https://github.com/google-labs-code/design.md).

+**The framing.** An LLM is the super-soldier serum; the engineer is Steve Rogers. The serum amplifies what the engineer already brings — solid bases, investigation, care for quality, architecture, clean code, observability, maintainability. The kit encodes those bases as skills, ADRs, and gates so the amplification compounds in the right direction. See [WORKFLOW.md](WORKFLOW.md) for the principles.
+
 The CLI installs nine universal skills (`agentic-bootstrap`, `agentic-philosophy`, `agentic-architecture`, `agentic-adr`, `agentic-spec`, `agentic-task`, `agentic-audit`, `agentic-review`, `agentic-ground`) plus three conditional ones (`agentic-design` for frontend, `agentic-subagent` for Claude Code, `agentic-skill` opt-in) into the agent's native location. Each skill produces its artifact or runs its operation via the agent's native conversational UI; `agentic update` keeps installed skills in sync with upstream kit changes via a state-aware three-way diff. Report rough edges via [GitHub Issues](https://github.com/alexandremendoncaalvaro/agentic-development/issues); current releases live under [GitHub Releases](https://github.com/alexandremendoncaalvaro/agentic-development/releases).

 ## Prerequisites
package/WORKFLOW.md
CHANGED
@@ -2,6 +2,8 @@

 Engineering production code with LLMs. Agentic, not vibe coding.

+**The Steve Rogers framing.** The LLM is the super-soldier serum. The engineer is Steve Rogers. The serum amplifies what the engineer already brings — solid bases, organization, investigation, care for quality, architecture, clean code, documentation, observability, maintainability. Add the serum to a disciplined engineer and you get Captain America. Add it to an undisciplined one and you get faster sloppy at scale. This document is those bases written down as principles. The discipline is the input; the LLM is the amplifier; the kit (skills, ADRs, audits, gates) is the scaffolding that keeps the discipline intact across sessions, agents, and projects.
+
 **The principle behind the rest:** context engineering beats prompt engineering. Context is finite and decays as it fills — aim for the smallest set of high-signal tokens that gets the outcome.

 ## TL;DR
@@ -25,6 +27,7 @@ What to keep in mind:
 13. **Automation needs rails.** Hooks, tests, lint, CI, sandboxing, and permissions matter more than advisory text the agent can forget.
 14. **Autonomy requires observability.** If the agent makes decisions, log the trajectory: tool calls, intermediate outputs, failures.
 15. **Staged spikes when the technique is uncertain.** When the *how* is unknown — a library choice, a CV technique, a multi-stage transformation — break the problem into staged spikes against golden fixtures with per-stage debug artifacts.
+16. **Discipline scales with project maturity.** Same principles bind every project; the artifact set scales. A spike runs posture + research + audit; a regulated product adds spec / ADR / hooks / evals. Add ceremony only where it changes agent behavior; configure at init and reconfigure as the project matures.

 > Working with agents means trading typing for technical direction. The value is in giving the right context, setting boundaries, validating the result, and keeping "almost right" out of production.

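Reviewer note on item 14 in the hunk above: the trajectory it asks for (tool calls, intermediate outputs, failures) can be captured with a small append-only log. The following is a minimal sketch and not part of the package; the names `TrajectoryEvent`, `TrajectoryLog`, and the `run_tests` tool are hypothetical, and a real agent would likely write to a file or tracing backend rather than to memory.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class TrajectoryEvent:
    """One step in an agent run (hypothetical schema, for illustration only)."""
    kind: str      # "tool_call", "output", or "failure"
    name: str      # which tool or step produced the event
    payload: Any   # arguments, intermediate output, or error text
    timestamp: float = field(default_factory=time.time)


class TrajectoryLog:
    """Append-only, in-memory record of what the agent did and in what order."""

    def __init__(self) -> None:
        self.events: list[TrajectoryEvent] = []

    def record(self, kind: str, name: str, payload: Any) -> None:
        self.events.append(TrajectoryEvent(kind, name, payload))

    def failures(self) -> list[TrajectoryEvent]:
        # Failures are the first thing a reviewer wants to see after a run.
        return [e for e in self.events if e.kind == "failure"]

    def to_jsonl(self) -> str:
        # One JSON object per line, so the log can be grepped or diffed.
        return "\n".join(json.dumps(asdict(e), default=str) for e in self.events)


log = TrajectoryLog()
log.record("tool_call", "run_tests", {"args": ["--quick"]})
log.record("output", "run_tests", "3 passed, 1 failed")
log.record("failure", "run_tests", "test_parse_date: assertion error")
```

Keeping each event as one JSON line is a design choice made here for easy inspection after an unattended run; the source does not prescribe a log format.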
@@ -41,6 +44,8 @@ There are two complementary frames for the artifacts the kit produces. The first
 3. **Plan / Decisions** — `ARCHITECTURE.md` (system patterns and boundaries), `doc/adr/NNNN-*.md` (binding architectural decisions in Michael Nygard's pattern), `doc/tasks/NNNN-*.md` (per-work-unit plan with checkbox acceptance criteria). The *how* of building what the spec asked for.
 4. **Code** — the implementation. Code is the primary documentation of behavior; comments justify non-obvious choices.

+The four layers scale with project maturity (TL;DR #16). A spike or PoC profile may legitimately ship only Layers 1 and 4 — adding Layers 2 and 3 to a 200-line experiment is ceremony that does not change agent behavior. A team or regulated product runs all four. The kit's profiles (`poc`, `solo`, `team`, `mature`) configure which layers auto-install per project and are changeable as the project matures; the principles in this document bind every profile, only the artifact set differs.
+
 ### Three context types (loading mechanism)

 - **Operational context is advisory.** `AGENTS.md` (or `CLAUDE.md` for Claude Code, which can mirror or import the same content via `@AGENTS.md`) tells the agent how to build, test, follow conventions, and where the security boundaries are. The agent reads it as a guide, not a contract. Open standard `AGENTS.md` is native in most agentic IDEs.
@@ -88,6 +93,8 @@ Use XML when the prompt mixes instructions, retrieved context, examples, user in

 No format is universally best. **An observation from my practice, not benchmarked:** I've seen consistent gains when shifting prompts to XML — most noticeably with autonomous agents, where the prompt has to land alone without conversational refinement. Direct interactive use (Claude Code, Codex) tolerates loose Markdown; unattended agents don't. Claude in particular seems to respond well to XML, which I attribute to its training, but I haven't benchmarked it. Treat this as a starting hypothesis worth testing on your own target model and task before standardizing.

+**Host-aware structured prompts.** Hosts that expose structured-prompt primitives — Claude Code's `AskUserQuestion` (multi-choice cards) and Plan Mode (plan-approval cards) — reduce ambiguity at confirmation gates more reliably than inline text. Prefer the structured primitive when the host supports it; fall back to numbered text otherwise. Codex has no equivalent today; its skills stay on numbered text. Skills carrying confirmation gates or multi-choice interview steps prescribe this preference (ADR-0014).
+
 ## 4–5. Research Before Implementation

 Combines Find the Happy Path (canonical / idiomatic baseline) and Ground in Real Patterns (anchoring in project-specific examples). The kit treats both as one indivisible flow via `agentic-ground`; two prose sections would frame one operation as two separate practices.
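Reviewer note on the "Host-aware structured prompts" paragraph added in the hunk above: the prefer-structured, fall-back-to-numbered-text rule can be sketched as a single rendering function. This is an illustration under stated assumptions and not the kit's implementation. The dictionary shape is hypothetical and is not the real `AskUserQuestion` schema, and `render_choice_gate` and `parse_numbered_reply` are names invented for this sketch.

```python
from typing import Union


def render_choice_gate(
    question: str, options: list[str], structured: bool
) -> Union[dict, str]:
    """Render a confirmation gate for a host.

    `structured` stands in for "the host exposes a multi-choice primitive".
    The returned dict is a placeholder shape (an assumption), not any host's API.
    """
    if not options:
        raise ValueError("a confirmation gate needs at least one option")
    if structured:
        return {"question": question, "options": list(options)}
    # Fallback: numbered text, so a free-text reply maps to exactly one option.
    lines = [question]
    lines += [f"{i}. {opt}" for i, opt in enumerate(options, start=1)]
    lines.append(f"Reply with a number (1-{len(options)}).")
    return "\n".join(lines)


def parse_numbered_reply(reply: str, options: list[str]) -> str:
    """Map a numbered reply back to its option; raise ValueError on anything else."""
    index = int(reply.strip())  # int() raises ValueError for non-numeric replies
    if not 1 <= index <= len(options):
        raise ValueError(f"expected a number between 1 and {len(options)}")
    return options[index - 1]


choices = ["Approve", "Revise", "Cancel"]
fallback_text = render_choice_gate("Apply the plan?", choices, structured=False)
card = render_choice_gate("Apply the plan?", choices, structured=True)
```

Rejecting out-of-range or non-numeric replies, instead of guessing, keeps the text fallback closer to the unambiguous behavior the paragraph attributes to structured cards.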
@@ -163,6 +170,8 @@ Two 2025 industry surveys point at the same wall. JetBrains' DevEcosystem 2025 r

 The takeaway: §10 (Reviewer) and §11 (Quality Gates) are not optional. Skipping them is where bug density grows.

+Per the Steve Rogers framing in the preamble: the serum cannot manufacture discrimination — it amplifies whatever discrimination the engineer already brings. The kit's job is to encode discrimination into the agent's context (specs, ADRs, fresh-context reviews, deterministic gates) so the amplification compounds in the disciplined direction even when the engineer is sleepy, rushed, or handing off to another collaborator.
+
 ## 13. Evals for Anything Autonomous

 If your agent is making decisions on its own, you need evals. A few principles:
package/package.json
CHANGED