npm - @cubis/foundry - Versions diffs - 0.3.76 → 0.3.77 - Mend

@cubis/foundry 0.3.76 → 0.3.77

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (165) hide show

package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/deep-research/references/source-ladder.md ADDED Viewed

@@ -0,0 +1,81 @@
+# Source Ladder
+## Goal
+Use the smallest amount of external research that still produces a decision-ready answer. Keep the evidence traceable and ordered by trust.
+## 1. Repo / Local Evidence First
+Start by inspecting:
+- application code and tests
+- README files and internal docs
+- generated workflow or skill assets
+- lockfiles, config files, and package manifests
+- existing integration code and migration history
+If the repo already answers the question, stop there. Do not browse externally just because web research feels safer.
+## 2. Primary External Sources
+Use these next:
+- official vendor docs
+- upstream repositories and release notes
+- standards bodies and reference specs
+- maintainer-authored examples
+Prefer sources that expose:
+- exact feature names
+- current version constraints
+- config formats
+- dates or changelog context
+When the topic is time-sensitive, capture the date you verified the source and the version or doc page involved.
+## 3. Secondary / Community Sources
+Use these only after primary evidence:
+- Reddit threads
+- issue comments
+- independent blog posts
+- forum discussions
+- third-party comparison articles
+Community evidence is useful for:
+- practical gotchas
+- migration pain points
+- missing-doc workarounds
+- real-world adoption patterns
+Community evidence is not enough on its own for authoritative claims about product behavior, supported configuration, or security guarantees.
+## 4. Conflict Handling
+When sources disagree:
+1. Prefer repo evidence for the current codebase state.
+2. Prefer official docs over community claims for product behavior.
+3. Prefer newer dated material when the sources cover the same feature.
+4. If the conflict remains unresolved, report it as a gap instead of guessing.
+## 5. Evidence Labels
+Use these labels in research output:
+- **Verified fact** — backed by repo evidence or a primary source
+- **Secondary evidence** — backed only by community or indirect sources
+- **Inference** — reasoned conclusion not directly stated by a source
+- **Gap** — could not be verified confidently
+## 6. Stop Conditions
+Stop researching when:
+- the decision is already clear
+- new sources only repeat the same point
+- the remaining uncertainty is small and clearly documented
+- the task should move into implementation or planning

package/workflows/workflows/agent-environment-setup/platforms/copilot/skills/skills_index.json CHANGED Viewed

@@ -193,6 +193,38 @@
       "databases"
     ]
   },
+  {
+    "id": "deep-research",
+    "package_id": "deep-research",
+    "catalog_id": "deep-research",
+    "kind": "skill",
+    "name": "deep-research",
+    "canonical": true,
+    "canonical_id": "deep-research",
+    "deprecated": false,
+    "replaced_by": null,
+    "aliases": [],
+    "category": "core-operating",
+    "layer": "core-operating",
+    "maturity": "incubating",
+    "tier": "experimental",
+    "tags": [
+      "core-operating",
+      "deep",
+      "deep research",
+      "deep-research",
+      "research"
+    ],
+    "path": ".github/skills/deep-research/SKILL.md",
+    "description": "Use when investigating latest vendor behavior, comparing tools or platforms, verifying claims beyond the repo, or gathering external evidence before implementation.",
+    "triggers": [
+      "deep-research",
+      "deep research",
+      "deep",
+      "research",
+      "core-operating"
+    ]
+  },
   {
     "id": "django-drf",
     "package_id": "django-drf",

package/workflows/workflows/agent-environment-setup/platforms/copilot/workflows/onboard.md CHANGED Viewed

@@ -30,9 +30,9 @@ Use this when joining a new project, exploring an unfamiliar codebase, or prepar
 ## Skill Routing
-- Primary skills: `architecture-doc`, `system-design`
+- Primary skills: `deep-research`, `system-design`
 - Supporting skills (optional): `system-design`, `database-design`, `typescript-best-practices`, `javascript-best-practices`, `python-best-practices`
-- Start with `architecture-doc` for systematic exploration and `system-design` for architecture mapping. Add `system-design` for undocumented systems.
+- Start with `deep-research` for systematic exploration and `system-design` for architecture mapping. Add `system-design` for undocumented systems. Prefer repo evidence first; use external sources only when setup or dependency behavior cannot be confirmed locally.
 ## Workflow steps
@@ -55,7 +55,7 @@ Use this when joining a new project, exploring an unfamiliar codebase, or prepar
 ONBOARD_WORKFLOW_RESULT:
   primary_agent: researcher
   supporting_agents: [code-archaeologist?, backend-specialist?, frontend-specialist?]
-  primary_skills: [architecture-doc, system-design]
+  primary_skills: [deep-research, system-design]
   supporting_skills: [system-design?, database-design?]
   project_overview:
     purpose: <string>

package/workflows/workflows/agent-environment-setup/platforms/copilot/workflows/orchestrate.md CHANGED Viewed

@@ -24,8 +24,8 @@ Use this when a task spans multiple domains (backend + frontend, security + infr
 ## Skill Routing
 - Primary skills: `system-design`, `api-design`
-- Supporting skills (optional): `database-design`, `architecture-doc`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
-- Start with `system-design` for system design coordination and `api-design` for integration contracts. Add supporting skills based on the coordination challenge.
+- Supporting skills (optional): `database-design`, `deep-research`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
+- Start with `system-design` for system design coordination and `api-design` for integration contracts. Add `deep-research` before implementation when the coordination challenge depends on fresh external facts or public comparison.
 ## Workflow steps

package/workflows/workflows/agent-environment-setup/platforms/copilot/workflows/plan.md CHANGED Viewed

@@ -36,13 +36,13 @@ Use this when starting a new feature, project, or significant change that needs
 ## Skill Routing
 - Primary skills: `system-design`, `api-design`
-- Supporting skills (optional): `database-design`, `architecture-doc`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
-- Start with `system-design` for system design and `api-design` for API contracts. Add `database-design` when data modeling is central, `architecture-doc` when external knowledge is needed.
+- Supporting skills (optional): `database-design`, `deep-research`, `mcp-server-builder`, `tech-doc`, `prompt-engineering`, `skill-creator`
+- Start with `system-design` for system design and `api-design` for API contracts. Add `database-design` when data modeling is central, `deep-research` when fresh external knowledge or public comparison is needed.
 ## Workflow steps
 1. Clarify scope, success criteria, and constraints.
-2. Research existing patterns and dependencies.
+2. Research existing patterns and dependencies, starting in-repo and escalating to `deep-research` only when outside evidence is required.
 3. Decompose into tasks with ownership and dependencies.
 4. Define interfaces, contracts, and failure modes.
 5. Produce acceptance criteria for each milestone.
@@ -62,7 +62,7 @@ PLAN_WORKFLOW_RESULT:
   primary_agent: project-planner
   supporting_agents: [orchestrator?, backend-specialist?, frontend-specialist?, database-architect?]
   primary_skills: [system-design, api-design]
-  supporting_skills: [database-design?, architecture-doc?, mcp-server-builder?]
+  supporting_skills: [database-design?, deep-research?, mcp-server-builder?]
   plan:
     scope_summary: <string>
     tasks:

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/accessibility.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/backend.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/create.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/database.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/debug.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/devops.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/implement-track.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/migrate.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/mobile.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/onboard.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/orchestrate.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/plan.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/refactor.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/release.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/review.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/security.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/test.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/commands/vercel.toml CHANGED Viewed

@@ -7,6 +7,8 @@ Execution contract:
 2. Confirm the request fits the workflow's "When to use" section.
 3. Execute according to "Workflow steps" and apply "Context notes".
 4. Complete "Verification" checks and report concrete evidence.
+5. If freshness, public comparison, or explicit research needs appear, pause implementation and load `deep-research` or route to the researcher posture first.
+6. For outside evidence: repo first, official docs next, Reddit/community only as labeled secondary evidence.
 If command arguments are provided, treat them as additional user context.
 '''

package/workflows/workflows/agent-environment-setup/platforms/gemini/rules/GEMINI.md CHANGED Viewed

@@ -44,7 +44,7 @@ Execute this tree top-to-bottom. Stop at the **first match**. Never skip levels.
 ├─ [TRIVIAL] Single-step, obvious, reversible?
 │   → Execute directly. No routing. Stop.
 │
-├─ [EXPLICIT] User named a command or specialist?
+├─ [EXPLICIT] User named a command, specialist, or exact skill?
 │   → Honor that route exactly. Stop.
 │
 ├─ [SINGLE-DOMAIN] Multi-step but contained in one specialty?
@@ -64,6 +64,7 @@ Execute this tree top-to-bottom. Stop at the **first match**. Never skip levels.
 **Hard rules:**
 - Never pre-load skills before route resolution.
+- If the user names an exact skill ID, run `skill_validate` on that ID before `route_resolve`.
 - Never chain multiple skill loads per request.
 - Treat this file as **durable project memory** — not a per-task playbook.
@@ -219,6 +220,7 @@ Use this matrix to match incoming tasks to the correct skill and primary posture
 | docker-compose-dev | DevOps | Docker Compose local dev environments | DevOps Engineer |
 | kubernetes-deploy | DevOps | K8s manifests, Helm charts, deployment | DevOps Engineer |
 | observability | DevOps | Logging, metrics, tracing, alerting | DevOps Engineer |
+| deep-research | Research | Latest docs, public comparisons, external verification | Researcher |
 | llm-eval | AI/ML | LLM evaluation, benchmarking, evals | Researcher |
 | rag-patterns | AI/ML | RAG architecture, embeddings, retrieval | Researcher |
 | prompt-engineering | AI/ML | Prompt design, few-shot, chain-of-thought | Researcher |
@@ -254,12 +256,15 @@ cbx workflows doctor gemini --scope project
 Keep MCP context lazy and exact. Skills are supporting context, not the route layer.
 1. Never begin with `skill_search`. Inspect the repo/task locally first.
-2. Resolve workflows, agents, or free-text route intent with `route_resolve` before loading any skills.
-3. If the route is still unresolved and local grounding leaves the domain unclear, use one narrow `skill_search`.
-4. Always run `skill_validate` on the exact selected ID before `skill_get`.
-5. Call `skill_get` with `includeReferences:false` by default.
-6. Load at most one sidecar markdown file at a time with `skill_get_reference`.
-7. Do not auto-prime every specialist with a skill. Load only what the task clearly needs.
-8. Use upstream MCP servers such as `postman`, `stitch`, or `playwright` for real cloud/browser actions when available.
+2. If the user already named `/workflow`, `@agent`, or an exact skill ID, honor it directly. For exact skills, run `skill_validate` first and skip `route_resolve` when valid.
+3. Resolve only free-text workflow/agent intent with `route_resolve` before loading non-explicit skills.
+4. If the route is still unresolved and local grounding leaves the domain unclear, use one narrow `skill_search`.
+5. Always run `skill_validate` on the exact selected ID before `skill_get`.
+6. Call `skill_get` with `includeReferences:false` by default.
+7. Load at most one sidecar markdown file at a time with `skill_get_reference`.
+8. Do not auto-prime every specialist with a skill. Load only what the task clearly needs.
+9. For research: repo/local evidence first, official docs next, Reddit/community only as labeled secondary evidence.
+10. Escalate to research only when freshness matters, public comparison matters, or the user explicitly asks to research/verify.
+11. Use upstream MCP servers such as `postman`, `stitch`, or `playwright` for real cloud/browser actions when available.
 <!-- cbx:mcp:auto:end -->

package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/SKILL.md ADDED Viewed

@@ -0,0 +1,89 @@
+---
+name: deep-research
+description: Use when investigating latest vendor behavior, comparing tools or platforms, verifying claims beyond the repo, or gathering external evidence before implementation.
+---
+# Deep Research
+## Purpose
+Run a disciplined research pass before implementation when the repo alone is not enough. This skill keeps research evidence-driven: inspect the local codebase first, escalate to official docs when freshness or public comparison matters, then use labeled community evidence only when it adds practical context.
+## When to Use
+- Verifying latest SDK, CLI, API, or platform behavior
+- Comparing tools, frameworks, hosted services, or implementation approaches
+- Checking whether public docs and the local repo disagree
+- Gathering external evidence before planning a migration or new capability
+- Producing a structured research brief that hands off cleanly into implementation
+## Instructions
+1. **Define the research question before collecting sources** because vague research sprawls quickly. Restate the target topic, freshness requirement, comparison axis, and what decision the findings need to support.
+2. **Inspect the repo first** because many questions are already answerable from local code, configs, tests, docs, or generated assets. Do not browse externally until the local evidence is exhausted or clearly insufficient.
+3. **Decide whether external research is actually required** because not every task needs web evidence. Escalate only when freshness matters, public comparison matters, or the user explicitly asks to research or verify.
+4. **Follow the source ladder strictly** because evidence quality matters. Use official docs, upstream repositories, standards, and maintainer material as primary sources before looking at blogs, issue threads, or Reddit.
+5. **Capture concrete source details** because research without provenance is hard to trust. Record exact links, relevant dates, versions, and any repo files that support or contradict the external evidence.
+6. **Cross-check important claims across more than one source when possible** because public docs, repos, and community advice can drift. If sources disagree, say so explicitly instead of smoothing over the conflict.
+7. **Use Reddit and other community sources only as labeled secondary evidence** because they can surface practical gotchas but are not authoritative. Treat them as implementation color, not final truth.
+8. **Separate verified facts from inference** because downstream planning depends on confidence. Mark what is directly supported by repo evidence or official sources versus what you infer from patterns or secondary signals.
+9. **Keep the output decision-oriented** because the goal is not to dump links. Tie each finding back to the implementation, workflow, agent, or skill decision it affects.
+10. **Recommend the next route explicitly** because research is usually a handoff, not the end of the task. Name the next workflow, agent, or exact skill that should continue the work.
+11. **State the remaining gaps and risks** because incomplete research is still useful when the uncertainty is visible. Call out what you could not verify, what may have changed recently, and what assumptions remain.
+12. **Avoid over-quoting and over-collecting** because research quality comes from synthesis, not volume. Prefer concise summaries with high-signal citations over long pasted excerpts.
+13. **When the task turns into implementation, stop researching and hand off** because mixing discovery and execution usually creates drift. Deliver the research brief first, then route into the correct workflow or specialist.
+## Output Format
+Deliver:
+1. **Research question** — topic, freshness requirement, and decision to support
+2. **Verified facts** — repo evidence and primary-source findings
+3. **Secondary/community evidence** — labeled lower-trust supporting signals
+4. **Gaps / unknowns** — unresolved questions or contradictory evidence
+5. **Recommended next route** — direct execution, workflow, agent, or exact skill to use next
+## References
+Load only the file needed for the current question.
+| File | Load when |
+| --- | --- |
+| `references/source-ladder.md` | Need the repo-first and source-priority policy for official docs versus community evidence. |
+| `references/research-output.md` | Need the structured output format, evidence labeling rules, or handoff pattern. |
+| `references/comparison-checklist.md` | Comparing vendors, frameworks, or tools and need a concrete evaluation frame. |
+## Examples
+Use these when the task shape already matches.
+| File | Use when |
+| --- | --- |
+| `examples/01-latest-docs-check.md` | Verifying a latest capability or doc claim before implementation. |
+| `examples/02-ecosystem-comparison.md` | Comparing multiple tools or platforms with official-first sourcing. |
+| `examples/03-research-to-implementation-handoff.md` | Turning research findings into a concrete next workflow or specialist handoff. |
+## Gemini Research Flow
+- Treat Gemini commands and GEMINI.md as routing aids, not primary evidence. The evidence still comes from the repo first, then official docs, then labeled community sources.
+- Use the mirrored references to keep the research output structured: verified facts, secondary evidence, gaps, and recommended next route.
+- When a request is implementation-heavy but freshness matters, complete the research pass first, then move into the chosen workflow.
+## Gemini Platform Notes
+- Use `activate_skill` to invoke skills by name from Gemini CLI or Gemini Code Assist.
+- Skill files are stored under `.gemini/skills/` in the project root.
+- Gemini does not support `context: fork` — all skill execution is inline.
+- User arguments are passed as natural language in the activation prompt.
+- Reference files are loaded relative to the skill directory under `.gemini/skills/<skill-id>/`.

package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/evals/assertions.md ADDED Viewed

@@ -0,0 +1,17 @@
+# Deep Research Eval Assertions
+## Eval 1: Latest Capability Verification
+1. **repo-first** — Starts by checking repo or local evidence before jumping to web claims.
+2. **official-first** — Uses official docs or upstream sources as the primary evidence for the capability.
+3. **secondary-labeled** — If community sources are mentioned, labels them as secondary evidence instead of presenting them as authoritative.
+4. **gaps-called-out** — Identifies unresolved uncertainty or missing confirmation.
+5. **next-route** — Ends with a concrete recommended workflow, agent, or skill to use next.
+## Eval 2: Tool Comparison
+1. **comparison-frame** — Defines the comparison axes instead of producing vague preferences.
+2. **repo-impact** — Connects the comparison back to the current repo or implementation constraints.
+3. **fact-vs-inference** — Separates verified facts from inference or interpretation.
+4. **decision-oriented** — Produces a recommendation or explicit defer condition.
+5. **no-research-sprawl** — Keeps the output concise and structured rather than dumping raw links.

package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/evals/evals.json ADDED Viewed

@@ -0,0 +1,56 @@
+[
+  {
+    "name": "latest-capability-verification",
+    "description": "Validate that the skill performs repo-first research, prioritizes official documentation, labels secondary evidence, and recommends a next route.",
+    "prompt": "Research whether the latest official Claude Code hook surface supports reinforcing route honoring before implementation. Start with the current repo state, then use official docs if needed. If community sources add useful practical context, include them but label them appropriately. End with the next workflow, agent, or skill we should use.",
+    "assertions": [
+      {
+        "id": "repo-first",
+        "description": "Starts with repo or local evidence before using external sources."
+      },
+      {
+        "id": "official-first",
+        "description": "Treats official docs or upstream sources as the primary evidence."
+      },
+      {
+        "id": "secondary-labeled",
+        "description": "Labels any Reddit or community evidence as secondary rather than authoritative."
+      },
+      {
+        "id": "gaps-called-out",
+        "description": "States any unresolved gaps, conflicts, or unknowns."
+      },
+      {
+        "id": "next-route",
+        "description": "Ends with a concrete recommended next route."
+      }
+    ]
+  },
+  {
+    "name": "tool-comparison",
+    "description": "Validate that the skill compares options with a clear frame, ties findings back to repo impact, and produces a decision-ready output.",
+    "prompt": "Compare whether our CLI should keep enforcement in Gemini command wrappers only or also add Claude hook templates. Use the repo state first, then official docs for current platform capabilities, and include community evidence only if it adds implementation nuance. Finish with a recommendation and the next route to take.",
+    "assertions": [
+      {
+        "id": "comparison-frame",
+        "description": "Defines concrete comparison axes such as repo impact, enforcement surface, and maintenance cost."
+      },
+      {
+        "id": "repo-impact",
+        "description": "Connects each option back to the current repo or bundle behavior."
+      },
+      {
+        "id": "fact-vs-inference",
+        "description": "Separates verified facts from inference or interpretation."
+      },
+      {
+        "id": "decision-oriented",
+        "description": "Produces a recommendation or explicit defer condition."
+      },
+      {
+        "id": "no-research-sprawl",
+        "description": "Keeps the answer structured instead of turning into an unfiltered dump of sources."
+      }
+    ]
+  }
+]

package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/examples/01-latest-docs-check.md ADDED Viewed

@@ -0,0 +1,12 @@
+# Example: Latest Docs Check
+## User Request
+> Research whether Claude Code hooks can reinforce route honoring before we add new workflow rules.
+## Expected Shape
+1. Inspect the repo's current Claude rule and hook support first.
+2. Verify the current official Claude docs for hooks, event names, and config format.
+3. Separate those verified facts from any community commentary about hook effectiveness.
+4. End with the recommended next route, for example `@researcher` continuing the research or `/create` applying the validated hook template changes.

package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/examples/02-ecosystem-comparison.md ADDED Viewed

@@ -0,0 +1,12 @@
+# Example: Ecosystem Comparison
+## User Request
+> Compare whether our CLI should keep Gemini command enforcement only, or add another platform-native hook layer for Claude as well.
+## Expected Shape
+1. Start with repo constraints and current platform bundle behavior.
+2. Compare the official platform capabilities using primary docs.
+3. Add any useful community evidence as clearly labeled secondary input.
+4. Produce a recommendation tied to the repo: which platform gets which enforcement surface, and why.

package/workflows/workflows/agent-environment-setup/platforms/gemini/skills/deep-research/examples/03-research-to-implementation-handoff.md ADDED Viewed

@@ -0,0 +1,12 @@
+# Example: Research To Implementation Handoff
+## User Request
+> Research the latest Codex, Claude, and Gemini MCP behavior, then tell me the next route to update our workflow rules safely.
+## Expected Shape
+1. Gather repo evidence first.
+2. Verify current official docs for each platform.
+3. Summarize verified facts, secondary evidence, and gaps.
+4. End with a precise next route such as `/plan` for a policy change or `skill-creator` for skill/rule packaging work.