multi-forge 0.2.0__py3-none-any.whl
- forge/__init__.py +3 -0
- forge/_extensions/agents/.gitkeep +0 -0
- forge/_extensions/commands/.gitkeep +0 -0
- forge/_extensions/skills/analyze/SKILL.md +87 -0
- forge/_extensions/skills/challenge/SKILL.md +91 -0
- forge/_extensions/skills/consensus/SKILL.md +120 -0
- forge/_extensions/skills/consensus/resources/code_consensus_evaluation.md +94 -0
- forge/_extensions/skills/consensus/resources/consensus_evaluation.md +70 -0
- forge/_extensions/skills/consensus/resources/synthesis.md +101 -0
- forge/_extensions/skills/debate/SKILL.md +116 -0
- forge/_extensions/skills/debate/resources/code_debate_evaluation.md +101 -0
- forge/_extensions/skills/debate/resources/debate_evaluation.md +90 -0
- forge/_extensions/skills/panel/SKILL.md +141 -0
- forge/_extensions/skills/panel/resources/synthesis.md +103 -0
- forge/_extensions/skills/qa/SKILL.md +704 -0
- forge/_extensions/skills/qa/resources/checklist/0-enable.md +78 -0
- forge/_extensions/skills/qa/resources/checklist/1-preflight.md +24 -0
- forge/_extensions/skills/qa/resources/checklist/10-resume.md +143 -0
- forge/_extensions/skills/qa/resources/checklist/11-config.md +150 -0
- forge/_extensions/skills/qa/resources/checklist/12-search.md +58 -0
- forge/_extensions/skills/qa/resources/checklist/13-guard.md +237 -0
- forge/_extensions/skills/qa/resources/checklist/14-workflow.md +305 -0
- forge/_extensions/skills/qa/resources/checklist/15-skills.md +155 -0
- forge/_extensions/skills/qa/resources/checklist/16-handoff.md +224 -0
- forge/_extensions/skills/qa/resources/checklist/17-info.md +50 -0
- forge/_extensions/skills/qa/resources/checklist/18-disable.md +84 -0
- forge/_extensions/skills/qa/resources/checklist/19-uninstall.md +146 -0
- forge/_extensions/skills/qa/resources/checklist/2-extensions.md +188 -0
- forge/_extensions/skills/qa/resources/checklist/20-cleanup.md +36 -0
- forge/_extensions/skills/qa/resources/checklist/3-auth.md +234 -0
- forge/_extensions/skills/qa/resources/checklist/4-proxy.md +481 -0
- forge/_extensions/skills/qa/resources/checklist/5-session.md +541 -0
- forge/_extensions/skills/qa/resources/checklist/6-hooks.md +275 -0
- forge/_extensions/skills/qa/resources/checklist/7-costs.md +309 -0
- forge/_extensions/skills/qa/resources/checklist/8-status-line.md +174 -0
- forge/_extensions/skills/qa/resources/checklist/9-direct-commands.md +146 -0
- forge/_extensions/skills/qa/resources/checklist.md +103 -0
- forge/_extensions/skills/qa/resources/report-template.md +62 -0
- forge/_extensions/skills/qa/scripts/start-container.sh +529 -0
- forge/_extensions/skills/qa/scripts/walkthrough-state.py +1137 -0
- forge/_extensions/skills/review/SKILL.md +125 -0
- forge/_extensions/skills/review/references/claude-4.6.md +474 -0
- forge/_extensions/skills/review/references/claude-4.7.md +710 -0
- forge/_extensions/skills/review/references/gemini-3.1.md +546 -0
- forge/_extensions/skills/review/references/gpt-5.5.md +490 -0
- forge/_extensions/skills/review/references/skills-writing-guide.md +1588 -0
- forge/_extensions/skills/review/resources/code-anthropic.md +160 -0
- forge/_extensions/skills/review/resources/code-gemini.md +184 -0
- forge/_extensions/skills/review/resources/code-openai.md +203 -0
- forge/_extensions/skills/review/resources/code.md +160 -0
- forge/_extensions/skills/review-docs/SKILL.md +121 -0
- forge/_extensions/skills/review-docs/resources/docs-anthropic.md +170 -0
- forge/_extensions/skills/review-docs/resources/docs-gemini.md +204 -0
- forge/_extensions/skills/review-docs/resources/docs-openai.md +231 -0
- forge/_extensions/skills/review-docs/resources/docs.md +170 -0
- forge/_extensions/skills/smoke-test/SKILL.md +27 -0
- forge/_extensions/skills/smoke-test/scripts/smoke-test.sh +118 -0
- forge/_extensions/skills/understand/SKILL.md +148 -0
- forge/_extensions/skills/understand/resources/code-anthropic.md +163 -0
- forge/_extensions/skills/understand/resources/code-gemini.md +194 -0
- forge/_extensions/skills/understand/resources/code-openai.md +181 -0
- forge/_extensions/skills/understand/resources/code.md +163 -0
- forge/_extensions/skills/understand/resources/docs-anthropic.md +177 -0
- forge/_extensions/skills/understand/resources/docs-gemini.md +202 -0
- forge/_extensions/skills/understand/resources/docs-openai.md +191 -0
- forge/_extensions/skills/understand/resources/docs.md +177 -0
- forge/_extensions/skills/walkthrough/SKILL.md +599 -0
- forge/_extensions/skills/walkthrough/resources/checklist.md +765 -0
- forge/_extensions/skills/walkthrough/scripts/run-in-repo.sh +118 -0
- forge/_extensions/skills/walkthrough/scripts/setup-test-repo.sh +198 -0
- forge/_extensions/skills/walkthrough/scripts/walkthrough-state.py +1137 -0
- forge/backend/__init__.py +174 -0
- forge/backend/adapters/__init__.py +38 -0
- forge/backend/adapters/litellm.py +158 -0
- forge/backend/creation.py +89 -0
- forge/backend/registry.py +178 -0
- forge/cli/__init__.py +16 -0
- forge/cli/auth.py +483 -0
- forge/cli/backend.py +298 -0
- forge/cli/claude.py +411 -0
- forge/cli/config_cmd.py +303 -0
- forge/cli/extensions.py +1001 -0
- forge/cli/gc.py +165 -0
- forge/cli/guard.py +1018 -0
- forge/cli/guards.py +106 -0
- forge/cli/handoff.py +110 -0
- forge/cli/hooks/__init__.py +36 -0
- forge/cli/hooks/_group.py +20 -0
- forge/cli/hooks/_helpers.py +149 -0
- forge/cli/hooks/commands.py +1677 -0
- forge/cli/hooks/direct_commands.py +1304 -0
- forge/cli/hooks/install.py +232 -0
- forge/cli/hooks/policy.py +151 -0
- forge/cli/hooks/read_hygiene.py +74 -0
- forge/cli/hooks/verification.py +370 -0
- forge/cli/logs.py +406 -0
- forge/cli/main.py +292 -0
- forge/cli/proxy.py +1821 -0
- forge/cli/proxy_costs.py +313 -0
- forge/cli/search.py +416 -0
- forge/cli/session.py +892 -0
- forge/cli/session_addendum.py +81 -0
- forge/cli/session_fork.py +750 -0
- forge/cli/session_handoff.py +141 -0
- forge/cli/session_lifecycle.py +2053 -0
- forge/cli/session_manage.py +1336 -0
- forge/cli/session_memory.py +201 -0
- forge/cli/status_line.py +1398 -0
- forge/cli/workflow.py +1964 -0
- forge/config/__init__.py +110 -0
- forge/config/dataclass_utils.py +88 -0
- forge/config/defaults/__init__.py +0 -0
- forge/config/defaults/backends/__init__.py +0 -0
- forge/config/defaults/backends/litellm.yaml +196 -0
- forge/config/defaults/templates/__init__.py +0 -0
- forge/config/defaults/templates/litellm-anthropic-local.yaml +33 -0
- forge/config/defaults/templates/litellm-anthropic.yaml +24 -0
- forge/config/defaults/templates/litellm-gemini-flash-local.yaml +37 -0
- forge/config/defaults/templates/litellm-gemini-local.yaml +32 -0
- forge/config/defaults/templates/litellm-gemini-test.yaml +34 -0
- forge/config/defaults/templates/litellm-gemini.yaml +21 -0
- forge/config/defaults/templates/litellm-openai-codex-local.yaml +36 -0
- forge/config/defaults/templates/litellm-openai-local.yaml +38 -0
- forge/config/defaults/templates/litellm-openai.yaml +28 -0
- forge/config/defaults/templates/openrouter-anthropic.yaml +23 -0
- forge/config/defaults/templates/openrouter-deepseek.yaml +26 -0
- forge/config/defaults/templates/openrouter-gemini-flash.yaml +26 -0
- forge/config/defaults/templates/openrouter-gemini.yaml +23 -0
- forge/config/defaults/templates/openrouter-glm.yaml +23 -0
- forge/config/defaults/templates/openrouter-kimi.yaml +30 -0
- forge/config/defaults/templates/openrouter-minimax.yaml +26 -0
- forge/config/defaults/templates/openrouter-openai-codex.yaml +23 -0
- forge/config/defaults/templates/openrouter-openai.yaml +28 -0
- forge/config/defaults/templates/openrouter-qwen.yaml +25 -0
- forge/config/loader.py +675 -0
- forge/config/schema.py +448 -0
- forge/core/__init__.py +5 -0
- forge/core/auth/__init__.py +67 -0
- forge/core/auth/capabilities.py +219 -0
- forge/core/auth/credentials_file.py +244 -0
- forge/core/auth/protocols.py +18 -0
- forge/core/auth/secrets.py +243 -0
- forge/core/auth/template_secrets.py +112 -0
- forge/core/data/__init__.py +5 -0
- forge/core/data/model_catalog.yaml +1522 -0
- forge/core/data/pricing.yaml +140 -0
- forge/core/data/system_prompt_addendums/__init__.py +0 -0
- forge/core/data/system_prompt_addendums/gemini.md +330 -0
- forge/core/data/system_prompt_addendums/openai.md +328 -0
- forge/core/llm/__init__.py +231 -0
- forge/core/llm/clients/__init__.py +14 -0
- forge/core/llm/clients/base.py +115 -0
- forge/core/llm/clients/litellm.py +619 -0
- forge/core/llm/clients/openai_compat.py +244 -0
- forge/core/llm/clients/openrouter.py +234 -0
- forge/core/llm/credentials.py +439 -0
- forge/core/llm/detection.py +86 -0
- forge/core/llm/errors.py +44 -0
- forge/core/llm/protocols.py +80 -0
- forge/core/llm/types.py +176 -0
- forge/core/logging.py +146 -0
- forge/core/models/__init__.py +91 -0
- forge/core/models/catalog.py +467 -0
- forge/core/models/pricing.py +165 -0
- forge/core/models/types.py +167 -0
- forge/core/naming.py +212 -0
- forge/core/ops/__init__.py +73 -0
- forge/core/ops/context.py +141 -0
- forge/core/ops/gc.py +802 -0
- forge/core/ops/proxy.py +146 -0
- forge/core/ops/resolution.py +135 -0
- forge/core/ops/session.py +344 -0
- forge/core/ops/session_context.py +548 -0
- forge/core/paths.py +38 -0
- forge/core/process.py +54 -0
- forge/core/reactive/__init__.py +38 -0
- forge/core/reactive/cost_tracking.py +300 -0
- forge/core/reactive/env.py +180 -0
- forge/core/reactive/proxy.py +78 -0
- forge/core/reactive/routing.py +622 -0
- forge/core/reactive/session_runner.py +185 -0
- forge/core/reactive/structured_output.py +62 -0
- forge/core/reactive/tagger.py +94 -0
- forge/core/reactive/throttle.py +132 -0
- forge/core/state/__init__.py +59 -0
- forge/core/state/exceptions.py +59 -0
- forge/core/state/io.py +140 -0
- forge/core/state/lock.py +99 -0
- forge/core/state/timestamps.py +60 -0
- forge/core/transcript.py +78 -0
- forge/core/typing_helpers.py +24 -0
- forge/core/workqueue/__init__.py +67 -0
- forge/core/workqueue/queue.py +552 -0
- forge/core/workqueue/types.py +63 -0
- forge/guard/__init__.py +26 -0
- forge/guard/deterministic/__init__.py +26 -0
- forge/guard/deterministic/base.py +158 -0
- forge/guard/deterministic/coding_standards.py +256 -0
- forge/guard/deterministic/registry.py +148 -0
- forge/guard/deterministic/tdd.py +171 -0
- forge/guard/engine.py +216 -0
- forge/guard/protocols.py +91 -0
- forge/guard/queries.py +96 -0
- forge/guard/semantic/__init__.py +34 -0
- forge/guard/semantic/promotion.py +18 -0
- forge/guard/semantic/supervisor.py +813 -0
- forge/guard/semantic/verdict.py +183 -0
- forge/guard/store.py +124 -0
- forge/guard/team/__init__.py +6 -0
- forge/guard/team/config.py +24 -0
- forge/guard/team/handlers.py +209 -0
- forge/guard/team/prompts.py +41 -0
- forge/guard/types.py +125 -0
- forge/guard/workflow/__init__.py +17 -0
- forge/guard/workflow/branches.py +67 -0
- forge/guard/workflow/config.py +63 -0
- forge/guard/workflow/divergence.py +113 -0
- forge/guard/workflow/policy.py +87 -0
- forge/guard/workflow/stages.py +205 -0
- forge/install/__init__.py +55 -0
- forge/install/cli.py +281 -0
- forge/install/exceptions.py +163 -0
- forge/install/hooks.py +109 -0
- forge/install/installer.py +1037 -0
- forge/install/models.py +321 -0
- forge/install/preset.py +272 -0
- forge/install/settings_merge.py +831 -0
- forge/install/tracking.py +238 -0
- forge/install/version.py +141 -0
- forge/proxy/__init__.py +0 -0
- forge/proxy/base_client.py +181 -0
- forge/proxy/client_adapter.py +476 -0
- forge/proxy/client_factory.py +531 -0
- forge/proxy/converters.py +1206 -0
- forge/proxy/cost_logger.py +132 -0
- forge/proxy/cost_tracker.py +242 -0
- forge/proxy/data_models.py +338 -0
- forge/proxy/error_hints.py +92 -0
- forge/proxy/metrics.py +222 -0
- forge/proxy/model_spec.py +158 -0
- forge/proxy/proxies.py +333 -0
- forge/proxy/proxy_identity.py +134 -0
- forge/proxy/proxy_orchestrator.py +1018 -0
- forge/proxy/proxy_startup.py +54 -0
- forge/proxy/server.py +1561 -0
- forge/proxy/utils.py +537 -0
- forge/review/__init__.py +6 -0
- forge/review/adversarial.py +111 -0
- forge/review/consensus.py +236 -0
- forge/review/engine.py +356 -0
- forge/review/models.py +437 -0
- forge/review/resources/__init__.py +5 -0
- forge/review/resources/codereview-performance.md +85 -0
- forge/review/resources/codereview-quick.md +75 -0
- forge/review/resources/codereview-security.md +92 -0
- forge/review/resources/codereview.md +85 -0
- forge/review/resources/docreview-quick.md +75 -0
- forge/review/resources/docreview.md +86 -0
- forge/review/resources/thinkdeep.md +89 -0
- forge/review/routing.py +368 -0
- forge/review/synthesis.py +73 -0
- forge/runtime_config.py +438 -0
- forge/search/__init__.py +55 -0
- forge/search/bm25_store.py +264 -0
- forge/search/content_store.py +197 -0
- forge/search/engine.py +352 -0
- forge/search/exceptions.py +51 -0
- forge/search/extractor.py +234 -0
- forge/search/index_state.py +295 -0
- forge/search/store.py +215 -0
- forge/search/tokenizer.py +24 -0
- forge/session/__init__.py +130 -0
- forge/session/active.py +339 -0
- forge/session/artifacts.py +202 -0
- forge/session/claude/__init__.py +50 -0
- forge/session/claude/cleanup.py +105 -0
- forge/session/claude/invoke.py +236 -0
- forge/session/claude/paths.py +200 -0
- forge/session/cleanup.py +216 -0
- forge/session/config.py +34 -0
- forge/session/direct_model.py +107 -0
- forge/session/effective.py +169 -0
- forge/session/exceptions.py +255 -0
- forge/session/handoff.py +881 -0
- forge/session/handoff_agent.py +544 -0
- forge/session/hooks/__init__.py +35 -0
- forge/session/hooks/models.py +73 -0
- forge/session/hooks/session_start.py +507 -0
- forge/session/identity.py +84 -0
- forge/session/index.py +553 -0
- forge/session/manager.py +1506 -0
- forge/session/models.py +572 -0
- forge/session/overrides.py +344 -0
- forge/session/plan_resolution.py +286 -0
- forge/session/prev_sessions.py +128 -0
- forge/session/store.py +431 -0
- forge/session/validation.py +47 -0
- forge/session/worktree/__init__.py +65 -0
- forge/session/worktree/cleanup.py +262 -0
- forge/session/worktree/config_copy.py +203 -0
- forge/session/worktree/create.py +332 -0
- forge/sidecar/__init__.py +29 -0
- forge/sidecar/container.py +161 -0
- forge/sidecar/docker.py +86 -0
- forge/sidecar/secrets.py +19 -0
- multi_forge-0.2.0.dist-info/METADATA +242 -0
- multi_forge-0.2.0.dist-info/RECORD +311 -0
- multi_forge-0.2.0.dist-info/WHEEL +4 -0
- multi_forge-0.2.0.dist-info/entry_points.txt +2 -0
- multi_forge-0.2.0.dist-info/licenses/LICENSE +203 -0
- multi_forge-0.2.0.dist-info/licenses/NOTICE +14 -0
forge/review/models.py
ADDED
@@ -0,0 +1,437 @@
+"""Data models for multi-model review.
+
+Defines model specifications, review results, and the default
+model catalog. Models declare identity and provider refs; concrete
+routing is derived at runtime by ``forge.review.routing``.
+"""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field, replace
+from typing import Literal
+
+PromptMode = Literal["override", "prefix"]
+
+
+@dataclass(frozen=True)
+class ModelSpec:
+    """A model backend for multi-model review.
+
+    Attributes:
+        name: Human-readable identifier (e.g., "gpt-5.5").
+        model_id: Forge-canonical model ID (e.g., "gpt-5.5").
+        family: Model family (e.g., "openai", "anthropic", "gemini").
+        provider_refs: Ordered provider preference as (namespace, model_ref)
+            pairs. ``("direct", "claude-opus-4-6")`` means direct Anthropic;
+            ``("openrouter", "openai/gpt-5.5")`` means OpenRouter routing.
+        description: What this model is good at.
+        preferred_proxy: Catalog recommendation for proxy routing (soft,
+            overridable). None for direct-only models.
+        prompt: Per-worker prompt override. When set, this worker receives
+            this prompt according to prompt_mode.
+        prompt_mode: "override" means prompt replaces the global prompt.
+            "prefix" means prompt is prepended to the global prompt as a hint.
+        worker_id: Stable key for JSON output. Defaults to ``name`` when None.
+    """
+
+    name: str
+    model_id: str
+    family: str
+    provider_refs: tuple[tuple[str, str], ...]
+    description: str
+    preferred_proxy: str | None = None
+    prompt: str | None = None
+    prompt_mode: PromptMode = "override"
+    worker_id: str | None = None
+
+    @property
+    def effective_worker_id(self) -> str:
+        """Stable key for result maps and JSON output."""
+        return self.worker_id if self.worker_id is not None else self.name
+
+
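The `worker_id` fallback in `ModelSpec` can be tried in isolation. The sketch below uses `MiniSpec`, a hypothetical two-field reduction that is not part of the package, to show how `effective_worker_id` behaves and why a frozen dataclass is copied with `dataclasses.replace` rather than mutated.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class MiniSpec:
    """Hypothetical reduction of ModelSpec, for illustration only."""

    name: str
    worker_id: str | None = None

    @property
    def effective_worker_id(self) -> str:
        # Same rule as in the diff: an explicit worker_id wins, else fall back to name.
        return self.worker_id if self.worker_id is not None else self.name


base = MiniSpec(name="gpt-5.5")
# replace() builds a new instance; the frozen original is left untouched.
renamed = replace(base, worker_id="reviewer-a")

try:
    base.worker_id = "x"  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the spec is frozen, two workers that share a model can carry different `worker_id` values without affecting the shared catalog entry.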
+@dataclass
+class ReviewResult:
+    """Result from one model's review."""
+
+    model_name: str
+    stdout: str
+    stderr: str
+    success: bool
+    duration_seconds: float
+    error: str | None = None
+
+
+@dataclass
+class MultiReviewOutput:
+    """Aggregate output from a multi-model review run."""
+
+    prompt: str
+    results: list[ReviewResult] = field(default_factory=list)
+
+    @property
+    def successful(self) -> int:
+        return sum(1 for r in self.results if r.success)
+
+    @property
+    def failed(self) -> int:
+        return sum(1 for r in self.results if not r.success)
+
+
+_CLAUDE_47_BOUNDED_REVIEW_PROMPT = """\
+You are the Claude Opus 4.7 bounded-review worker in a Forge quorum.
+
+Use the provided target and prompt as the complete task scope. Prefer concrete
+evidence over broad narrative: cite file:line locations for every substantive
+finding, quote only the minimum necessary code, and separate confirmed issues
+from hypotheses. Do not rely on vague prior referents or unstated conversation
+history. If the prompt lacks a needed target, say exactly what is missing.
+"""
+
+
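The `prompt_mode` semantics described in the `ModelSpec` docstring ("override" replaces the global prompt, "prefix" prepends a hint) can be sketched as a small function. `compose_prompt` is a hypothetical helper; the code that actually combines prompts is not shown in this diff, and the blank-line separator used here is an assumption.

```python
from __future__ import annotations

from typing import Literal

PromptMode = Literal["override", "prefix"]


def compose_prompt(global_prompt: str, worker_prompt: str | None, mode: PromptMode) -> str:
    """Hypothetical helper: combine a per-worker prompt with the global prompt."""
    if worker_prompt is None:
        # No per-worker prompt set: the worker sees the global prompt unchanged.
        return global_prompt
    if mode == "override":
        # The per-worker prompt replaces the global prompt entirely.
        return worker_prompt
    # "prefix": the per-worker prompt goes first as a hint.
    # Joining with a blank line is an assumption, not taken from the package.
    return f"{worker_prompt}\n\n{global_prompt}"


prefixed = compose_prompt("Review src/app.py", "Cite file:line for every finding.", "prefix")
overridden = compose_prompt("Review src/app.py", "Only check imports.", "override")
```

Under this reading, the `claude-opus-4.7` entry (which sets `prompt_mode="prefix"`) would still receive the user's global prompt, with the bounded-review text placed ahead of it.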
+def _build_available_models() -> dict[str, ModelSpec]:
+    """Build available review models from the model catalog.
+
+    Model names derive from model_catalog.yaml so updating defaults is
+    a single YAML change. Provider-specific model refs come from the
+    corresponding proxy template tier configs.
+    """
+    from forge.core.models.catalog import get_default_model
+
+    openai_opus = get_default_model("openai", "opus")
+    gemini_opus = get_default_model("gemini", "opus")
+    anthropic_opus = get_default_model("anthropic", "opus")
+    deepseek_opus = get_default_model("deepseek", "opus")
+    minimax_opus = get_default_model("minimax", "opus")
+    qwen_opus = get_default_model("qwen", "opus")
+    glm_opus = get_default_model("glm", "opus")
+    kimi_opus = get_default_model("kimi", "opus")
+
+    return {
+        openai_opus: ModelSpec(
+            name=openai_opus,
+            model_id=openai_opus,
+            family="openai",
+            provider_refs=(("openrouter", "openai/gpt-5.5"), ("litellm", "openai/gpt-5.5")),
+            preferred_proxy="openrouter-openai",
+            description="Logical problems, systematic code review",
+        ),
+        gemini_opus: ModelSpec(
+            name=gemini_opus,
+            model_id=gemini_opus,
+            family="gemini",
+            provider_refs=(
+                ("openrouter", "google/gemini-3.1-pro-preview"),
+                ("litellm", "google/gemini-3.1-pro-preview"),
+            ),
+            preferred_proxy="openrouter-gemini",
+            description="Balanced analysis, pragmatic suggestions, large context",
+        ),
+        deepseek_opus: ModelSpec(
+            name=deepseek_opus,
+            model_id=deepseek_opus,
+            family="deepseek",
+            provider_refs=(("openrouter", "deepseek/deepseek-v4-pro"),),
+            preferred_proxy="openrouter-deepseek",
+            description="Cost-efficient reasoning, strong code analysis",
+        ),
+        minimax_opus: ModelSpec(
+            name=minimax_opus,
+            model_id=minimax_opus,
+            family="minimax",
+            provider_refs=(("openrouter", "minimax/minimax-m2.7"),),
+            preferred_proxy="openrouter-minimax",
+            description="Cost-efficient agentic analysis, broad coverage",
+        ),
+        qwen_opus: ModelSpec(
+            name=qwen_opus,
+            model_id=qwen_opus,
+            family="qwen",
+            provider_refs=(("openrouter", "qwen/qwen3.6-max-preview"),),
+            preferred_proxy="openrouter-qwen",
+            description="Large context multilingual analysis",
+        ),
+        glm_opus: ModelSpec(
+            name=glm_opus,
+            model_id=glm_opus,
+            family="glm",
+            provider_refs=(("openrouter", "z-ai/glm-5.1"),),
+            preferred_proxy="openrouter-glm",
+            description="Cost-efficient general analysis",
+        ),
+        kimi_opus: ModelSpec(
+            name=kimi_opus,
+            model_id=kimi_opus,
+            family="kimi",
+            provider_refs=(("openrouter", "moonshotai/kimi-k2.6"),),
+            preferred_proxy="openrouter-kimi",
+            description="Agentic code generation and analysis",
+        ),
+        "claude-opus": ModelSpec(
+            name="claude-opus",
+            model_id="claude-opus",
+            family="anthropic",
+            provider_refs=(("direct", anthropic_opus),),
+            description="Deep architectural analysis, complex reasoning",
+        ),
+        "claude-opus-4.6": ModelSpec(
+            name="claude-opus-4.6",
+            model_id="claude-opus-4.6",
+            family="anthropic",
+            provider_refs=(("direct", "claude-opus-4-6"),),
+            description="Stable Claude Opus 4.6 direct worker",
+        ),
+        "claude-opus-4.6-1m": ModelSpec(
+            name="claude-opus-4.6-1m",
+            model_id="claude-opus-4.6-1m",
+            family="anthropic",
+            provider_refs=(("direct", "claude-opus-4-6[1m]"),),
+            description="Stable Claude Opus 4.6 direct worker with 1M context pin",
+        ),
+        "claude-opus-4.7": ModelSpec(
+            name="claude-opus-4.7",
+            model_id="claude-opus-4.7",
+            family="anthropic",
+            provider_refs=(("direct", "claude-opus-4-7"),),
+            description="Bounded single-shot review and quorum dissent",
+            prompt=_CLAUDE_47_BOUNDED_REVIEW_PROMPT,
+            prompt_mode="prefix",
+        ),
+    }
+
+
+def _build_default_model_names() -> tuple[str, ...]:
+    """Return the semantically chosen default quorum model names."""
+    from forge.core.models.catalog import get_default_model
+
+    return (
+        get_default_model("openai", "opus"),
+        get_default_model("gemini", "opus"),
+        "claude-opus",
+    )
+
+
+def _build_model_aliases(available: dict[str, ModelSpec]) -> dict[str, str]:
+    """Return convenience aliases mapped to canonical workflow model names."""
+    from forge.core.models.catalog import get_compact_name, get_default_model
+
+    aliases: dict[str, str] = {}
+    for family in ("openai", "gemini", "deepseek", "minimax", "qwen", "glm", "kimi"):
+        canonical = get_default_model(family, "opus")
+        compact = get_compact_name(canonical)
+        if compact != canonical and compact not in available:
+            aliases[compact] = canonical
+    return aliases
+
+
+# Proxy_ids match Forge template names (forge proxy create <template>).
+AVAILABLE_MODELS: dict[str, ModelSpec] = _build_available_models()
+MODEL_ALIASES: dict[str, str] = _build_model_aliases(AVAILABLE_MODELS)
+_DEFAULT_MODEL_NAMES: tuple[str, ...] = _build_default_model_names()
+DEFAULT_MODELS: dict[str, ModelSpec] = {name: AVAILABLE_MODELS[name] for name in _DEFAULT_MODEL_NAMES}
+
+
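The alias rule in `_build_model_aliases` registers a compact name only when it differs from the canonical name and does not collide with an existing model key. A minimal sketch of that rule follows, with made-up family and model names; `DEFAULTS` and `COMPACT` are hypothetical stand-ins for `get_default_model` and `get_compact_name`, whose real outputs come from the catalog and are not shown in this diff.

```python
from __future__ import annotations

# Hypothetical stand-ins for the catalog lookups (invented names, not real models).
DEFAULTS = {"alpha": "alpha-2-pro-preview", "beta": "beta-1", "gamma": "gamma-9-max"}
COMPACT = {"alpha-2-pro-preview": "alpha-2", "beta-1": "beta-1", "gamma-9-max": "beta-1"}


def build_aliases(available: set[str]) -> dict[str, str]:
    """Map compact names to canonical names, skipping no-ops and collisions."""
    aliases: dict[str, str] = {}
    for family in DEFAULTS:
        canonical = DEFAULTS[family]
        compact = COMPACT[canonical]
        # Skip when the compact form is identical (nothing to alias) or when it
        # would shadow a canonical key that already exists.
        if compact != canonical and compact not in available:
            aliases[compact] = canonical
    return aliases


aliases = build_aliases(set(DEFAULTS.values()))
```

Here `alpha-2` becomes an alias, `beta-1` is skipped because it is already canonical, and the compact form of `gamma-9-max` is skipped because it collides with the existing key `beta-1`.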
+def resolve_model_specs(names_str: str | None) -> list[ModelSpec]:
+    """Parse comma-separated model names into ModelSpec list.
+
+    Returns all DEFAULT_MODELS when names_str is None.
+    Raises ValueError for unknown model names.
+    """
+    if not names_str:
+        return list(DEFAULT_MODELS.values())
+
+    names = [m.strip() for m in names_str.split(",")]
+    invalid = [m for m in names if m not in AVAILABLE_MODELS and m not in MODEL_ALIASES]
+    if invalid:
+        available = list(AVAILABLE_MODELS.keys()) + sorted(MODEL_ALIASES.keys())
+        raise ValueError(f"Unknown models: {invalid}. Available: {available}")
+
+    specs: list[ModelSpec] = []
+    for name in names:
+        if name in AVAILABLE_MODELS:
+            specs.append(AVAILABLE_MODELS[name])
+            continue
+        canonical = MODEL_ALIASES[name]
+        specs.append(replace(AVAILABLE_MODELS[canonical], worker_id=name))
+    return specs
+
+
+def available_model_specs() -> list[ModelSpec]:
+    """Return every selectable workflow model spec."""
+    return list(AVAILABLE_MODELS.values())
+
+
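The resolution flow in `resolve_model_specs` (defaults on empty input, validate every name first, then copy alias hits with `worker_id` set to the alias) can be exercised with stand-in data. Everything named below (`Spec`, `AVAILABLE`, `ALIASES`, `DEFAULT_NAMES`, `resolve`) is a hypothetical reduction with invented model names, not the package's own objects.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Spec:
    """Hypothetical reduction of ModelSpec."""

    name: str
    worker_id: str | None = None


AVAILABLE = {"alpha-2-pro-preview": Spec("alpha-2-pro-preview"), "beta-1": Spec("beta-1")}
ALIASES = {"alpha-2": "alpha-2-pro-preview"}
DEFAULT_NAMES = ("beta-1",)


def resolve(names_str: str | None) -> list[Spec]:
    # None and "" both fall through to the defaults, as `if not names_str` does in the diff.
    if not names_str:
        return [AVAILABLE[n] for n in DEFAULT_NAMES]
    names = [n.strip() for n in names_str.split(",")]
    # Validate everything up front so the error lists every bad name at once.
    invalid = [n for n in names if n not in AVAILABLE and n not in ALIASES]
    if invalid:
        raise ValueError(f"Unknown models: {invalid}")
    specs: list[Spec] = []
    for n in names:
        if n in AVAILABLE:
            specs.append(AVAILABLE[n])
        else:
            # Alias hit: copy the canonical spec, keeping the alias as the output key.
            specs.append(replace(AVAILABLE[ALIASES[n]], worker_id=n))
    return specs


picked = resolve("alpha-2, beta-1")
```

One consequence of this shape is that a trailing comma such as `"beta-1,"` produces an empty name after splitting, which is reported as unknown rather than ignored.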
+@dataclass(frozen=True)
+class ModelAvailability:
+    """Availability status for a model backend."""
+
+    spec: ModelSpec
+    status: str  # "ready" | "unavailable" | "error"
+    reason: str  # empty when ready
+
+
+def check_model_availability(
+    specs: list[ModelSpec] | None = None,
+    timeout_s: float = 1.0,
+) -> list[ModelAvailability]:
+    """Check route availability for each model via the routing chain.
+
+    Delegates to ``resolve_subprocess_routing()`` per spec. Does not
+    fail on unavailable models -- returns status for each.
+    """
+    from forge.core.reactive.routing import resolve_subprocess_routing
+    from forge.review.routing import derive_model_routes
+
+    if specs is None:
+        specs = list(DEFAULT_MODELS.values())
+
+    results: list[ModelAvailability] = []
+
+    for spec in specs:
+        try:
+            routes = derive_model_routes(spec)
+
+            direct_only = bool(routes) and all(r.provider == "direct" for r in routes)
+            if direct_only:
+                from forge.core.auth.template_secrets import resolve_env_or_credential
+
+                if resolve_env_or_credential("ANTHROPIC_API_KEY"):
+                    results.append(ModelAvailability(spec=spec, status="ready", reason=""))
+                else:
+                    results.append(
+                        ModelAvailability(
+                            spec=spec,
+                            status="unavailable",
+                            reason="ANTHROPIC_API_KEY not configured",
+                        )
+                    )
+                continue
+
+            result = resolve_subprocess_routing(
+                preferred_proxy=spec.preferred_proxy,
+                routes=routes,
+                require_route=True,
+            )
+
+            if result.route is not None:
+                results.append(ModelAvailability(spec=spec, status="ready", reason=""))
+            else:
+                results.append(
+                    ModelAvailability(
+                        spec=spec,
+                        status="unavailable",
+                        reason=result.warning or "No compatible proxy found",
+                    )
+                )
+        except Exception as e:
+            results.append(ModelAvailability(spec=spec, status="error", reason=str(e)))
+
+    return results
+
+
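`check_model_availability` never raises for a single bad model; it records one status per spec and keeps going. The sketch below isolates that collect-don't-fail pattern. `Availability`, `check_all`, and `fake_probe` are hypothetical names, and the probe stands in for the real routing calls (`derive_model_routes` and `resolve_subprocess_routing`), which are not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Availability:
    """Hypothetical reduction of ModelAvailability."""

    name: str
    status: str  # "ready" | "unavailable" | "error"
    reason: str  # empty when ready


def check_all(names: list[str], probe: Callable[[str], str | None]) -> list[Availability]:
    """Run probe per name; probe returns None when routable, else a reason, and may raise."""
    results: list[Availability] = []
    for name in names:
        try:
            reason = probe(name)
            if reason is None:
                results.append(Availability(name, "ready", ""))
            else:
                results.append(Availability(name, "unavailable", reason))
        except Exception as exc:
            # An unexpected failure is recorded for this model only; the loop continues.
            results.append(Availability(name, "error", str(exc)))
    return results


def fake_probe(name: str) -> str | None:
    if name == "broken":
        raise RuntimeError("routing table unreadable")
    return None if name == "ok" else "No compatible proxy found"


report = check_all(["ok", "missing", "broken"], fake_probe)
```

Separating "unavailable" (a known, explainable gap such as a missing key or proxy) from "error" (an unexpected exception) lets a caller print actionable hints for the former without hiding the latter.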
+NAMED_ROLES: dict[str, str] = {
+    "security": ("Focus on security vulnerabilities, injection risks, " "auth bypasses, and data exposure."),
+    "performance": ("Focus on performance bottlenecks, memory usage, " "algorithmic complexity, and I/O patterns."),
+    "architecture": ("Focus on architectural alignment, coupling, " "abstraction quality, and design patterns."),
+    "maintainability": ("Focus on readability, complexity, test coverage, " "naming, and change isolation."),
+    "correctness": ("Focus on logical errors, edge cases, " "off-by-one errors, and invariant violations."),
+}
+
+
+_VALID_STANCES = frozenset({"for", "against", "neutral", "custom"})
+
+
+@dataclass
+class StanceSpec:
+    """A stance-injected worker for adversarial evaluation.
+
+    Attributes:
+        stance: One of "for", "against", "neutral", "custom".
+        stance_prompt: Text injected via ``{stance_prompt}`` replacement.
+        model: Which model runs this stance.
+        display_label: User-facing label for output. Falls back to stance when None.
+            Use for custom stances where the raw stance ("custom") is not informative.
+    """
+
+    stance: str
+    stance_prompt: str
+    model: ModelSpec
+    display_label: str | None = None
+
+    def __post_init__(self) -> None:
+        if self.stance not in _VALID_STANCES:
+            raise ValueError(f"Invalid stance '{self.stance}'. Must be one of: {sorted(_VALID_STANCES)}")
+
+    @property
+    def effective_label(self) -> str:
+        """Label for output display and worker naming."""
+        return self.display_label if self.display_label is not None else self.stance
+
+
+@dataclass
+class RoleSpec:
+    """A role-assigned worker for consensus building.
+
+    Unlike StanceSpec, role is not validated against a fixed set because
+    custom role prompts are first-class.
+
+    Attributes:
+        role: Role name (key from NAMED_ROLES) or "custom".
+        role_prompt: Text injected via ``{role_prompt}`` replacement.
+        model: Which model runs this role.
+        display_label: User-facing label for output. Falls back to role when None.
+    """
+
+    role: str
+    role_prompt: str
+    model: ModelSpec
+    display_label: str | None = None
+
+    @property
+    def effective_label(self) -> str:
+        """Label for output display and worker naming."""
+        return self.display_label if self.display_label is not None else self.role
+
+
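`StanceSpec` validates its `stance` in `__post_init__`, while `RoleSpec` deliberately skips validation. The sketch below reproduces only the stance check and label fallback on a hypothetical reduced class, `Stance`, which drops the `stance_prompt` and `model` fields so the example stays self-contained.

```python
from __future__ import annotations

from dataclasses import dataclass

VALID_STANCES = frozenset({"for", "against", "neutral", "custom"})


@dataclass
class Stance:
    """Hypothetical reduction of StanceSpec (no stance_prompt or model fields)."""

    stance: str
    display_label: str | None = None

    def __post_init__(self) -> None:
        # Runs once, right after the generated __init__ has assigned the fields.
        if self.stance not in VALID_STANCES:
            raise ValueError(f"Invalid stance '{self.stance}'. Must be one of: {sorted(VALID_STANCES)}")

    @property
    def effective_label(self) -> str:
        return self.display_label if self.display_label is not None else self.stance


plain = Stance("against")
labelled = Stance("custom", display_label="devil's advocate")

try:
    Stance("maybe")
    rejected = False
except ValueError:
    rejected = True

# The dataclass is not frozen, so a later assignment bypasses __post_init__.
plain.stance = "maybe"
```

The last line is worth noting: `__post_init__` guards construction only, so code that reassigns `stance` afterwards is not re-checked unless the class is frozen or validation is repeated by the caller.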
|
|
396
|
+
@dataclass
|
|
397
|
+
class ConsensusOutput:
|
|
398
|
+
"""Aggregate output from a two-round consensus workflow.
|
|
399
|
+
|
|
400
|
+
``role_map`` keyed by worker_id is the authoritative role mapping
|
|
401
|
+
for disambiguation when duplicate models exist.
|
|
402
|
+
"""
|
|
403
|
+
|
|
404
|
+
subject: str
|
|
405
|
+
roles: list[str] = field(default_factory=list)
|
|
406
|
+
round1_results: list[ReviewResult] = field(default_factory=list)
|
|
407
|
+
round2_results: list[ReviewResult] = field(default_factory=list)
|
|
408
|
+
role_map: dict[str, str] = field(default_factory=dict)
|
|
409
|
+
reconciliation_brief: str = ""
|
|
410
|
+
|
|
411
|
+
@property
|
|
412
|
+
def successful(self) -> int:
|
|
413
|
+
"""Count successful workers in Round 2 (final output)."""
|
|
414
|
+
return sum(1 for r in self.round2_results if r.success)
|
|
415
|
+
|
|
416
|
+
@property
|
|
417
|
+
def failed(self) -> int:
|
|
418
|
+
"""Count failed workers in Round 2 (final output)."""
|
|
419
|
+
return sum(1 for r in self.round2_results if not r.success)
|
|
420
|
+
|
|
421
|
+
|
|
422
|
+
@dataclass
|
|
423
|
+
class AdversarialOutput:
|
|
424
|
+
"""Aggregate output from an adversarial evaluation run."""
|
|
425
|
+
|
|
426
|
+
resource_path: str
|
|
427
|
+
stances: list[str] = field(default_factory=list)
|
|
428
|
+
results: list[ReviewResult] = field(default_factory=list)
|
|
429
|
+
stance_map: dict[str, str] = field(default_factory=dict)
|
|
430
|
+
|
|
431
|
+
@property
|
|
432
|
+
def successful(self) -> int:
|
|
433
|
+
return sum(1 for r in self.results if r.success)
|
|
434
|
+
|
|
435
|
+
@property
|
|
436
|
+
def failed(self) -> int:
|
|
437
|
+
return sum(1 for r in self.results if not r.success)
|
|
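A minimal sketch of how the label fallback and the Round 2 counters in the dataclasses above behave. `ModelSpec` and `ReviewResult` are defined elsewhere in the package and are not visible in this hunk, so the one-field stand-ins below are assumptions for illustration only, and `RoleSpec` and `ConsensusOutput` are trimmed copies rather than the package's full definitions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelSpec:
    """Hypothetical stand-in; the real ModelSpec is not shown in this hunk."""

    name: str


@dataclass
class ReviewResult:
    """Hypothetical stand-in exposing only the ``success`` flag used below."""

    success: bool


@dataclass
class RoleSpec:
    role: str
    role_prompt: str
    model: ModelSpec
    display_label: str | None = None

    @property
    def effective_label(self) -> str:
        # An explicit label wins; only None falls back to the role name.
        return self.display_label if self.display_label is not None else self.role


@dataclass
class ConsensusOutput:
    subject: str
    round2_results: list[ReviewResult] = field(default_factory=list)

    @property
    def successful(self) -> int:
        return sum(1 for r in self.round2_results if r.success)

    @property
    def failed(self) -> int:
        return sum(1 for r in self.round2_results if not r.success)


model = ModelSpec("model-a")
plain = RoleSpec(role="custom", role_prompt="Argue for simplicity.", model=model)
labeled = RoleSpec(role="custom", role_prompt="Argue for safety.", model=model, display_label="safety")
empty_label = RoleSpec(role="custom", role_prompt="", model=model, display_label="")

output = ConsensusOutput(
    subject="example",
    round2_results=[ReviewResult(True), ReviewResult(False), ReviewResult(True)],
)

print(plain.effective_label, labeled.effective_label, repr(empty_label.effective_label))
print(output.successful, output.failed)
```

Because the property tests `is not None` rather than truthiness, an empty-string `display_label` is kept as the label instead of falling back to the role.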
@@ -0,0 +1,85 @@
+# Performance-Focused Code Review
+
+```xml
+<role>
+You are a performance engineer performing a targeted performance audit.
+You identify bottlenecks, unnecessary allocations, blocking operations, and scalability issues.
+You provide actionable feedback with specific code references.
+</role>
+
+<behavior>
+- Read all code in scope before forming opinions
+- Cite specific file:line references for every finding
+- Focus exclusively on performance -- skip security, style, and architecture
+- Cover ALL files in ONE pass -- do not present partial results
+- Be specific: "O(n^2) loop at query.py:87 with unbounded input" not "might be slow"
+</behavior>
+
+<scope_constraints>
+- Review only what's in scope
+- Do not expand to adjacent code unless it affects hot paths
+- Distinguish hot paths from cold paths -- focus on what runs frequently
+</scope_constraints>
+```
+
+---
+
+## Review Framework
+
+### Algorithmic Complexity
+
+- Are there O(n^2) or worse loops on unbounded input?
+- Are there redundant iterations that could be combined?
+- Are data structures appropriate (list vs set vs dict for lookups)?
+- Are there opportunities for early termination?
+
+### Memory and Allocation
+
+- Unnecessary copies of large objects (deep copy, list slicing)?
+- Unbounded accumulation (lists/dicts that grow without limit)?
+- Large temporary objects that could be streamed or processed incrementally?
+- Missing cleanup of resources (file handles, connections, buffers)?
+
+### I/O and Concurrency
+
+- Blocking I/O in async contexts?
+- Sequential operations that could be parallelized?
+- N+1 query patterns (loop of individual queries vs batch)?
+- Missing connection pooling or excessive connection creation?
+
+### Caching and Reuse
+
+- Repeated computation of the same result?
+- Missing caching where data is read-heavy and write-rare?
+- Cache invalidation correctness (stale data risks)?
+- Unnecessary cache (data used once, cache adds overhead)?
+
+---
+
+## Output Format
+
+```xml
+<output_format>
+## Summary
+1-2 sentence assessment of overall performance characteristics
+
+## Findings
+| Severity | Category | Issue | Location |
+|----------|----------|-------|----------|
+
+Severities: CRITICAL > HIGH > MEDIUM > LOW
+
+## Recommendations
+Top 3-5 fixes, prioritized by severity and effort
+
+## Strengths
+Correct implementations worth preserving
+</output_format>
+
+<output_constraints>
+- Each finding: 1-2 sentences with file:line reference
+- Use tables for structured data
+- No verbose narratives or filler
+- Do not restate the review request
+</output_constraints>
+```
|
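To make the "list vs set vs dict for lookups" question in the performance template concrete, here is a small sketch of the kind of pattern such a review might flag. The function names and sample data are hypothetical and do not come from the package.

```python
def shared_ids_quadratic(left: list[int], right: list[int]) -> list[int]:
    # `x in right` scans the whole list each time,
    # so the total cost is O(len(left) * len(right)).
    return [x for x in left if x in right]


def shared_ids_linear(left: list[int], right: list[int]) -> list[int]:
    # Building the set once makes each membership test O(1) on average,
    # for a total cost of O(len(left) + len(right)).
    right_set = set(right)
    return [x for x in left if x in right_set]


left = [5, 3, 9, 3, 1]
right = [3, 1, 4]

# Both versions return the same elements in the same order.
assert shared_ids_quadratic(left, right) == shared_ids_linear(left, right)
print(shared_ids_linear(left, right))  # prints [3, 3, 1]
```

Whether the rewrite matters depends on the hot-path distinction the template asks for: on small, bounded inputs the list scan may be perfectly adequate.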
@@ -0,0 +1,75 @@
+# Quick Code Review
+
+```xml
+<role>
+You are a senior code reviewer performing a rapid assessment.
+You identify only the most important issues -- skip minor concerns.
+You provide actionable feedback with specific code references.
+</role>
+
+<behavior>
+- Scan all code in scope quickly
+- Cite specific file:line references for every finding
+- Report only CRITICAL and HIGH severity findings
+- Cover ALL files in ONE pass -- do not present partial results
+- Be specific and concise: one sentence per finding
+</behavior>
+
+<scope_constraints>
+- Review only what's in scope
+- Do not expand to adjacent code
+- Skip style, naming, documentation, and minor improvements
+</scope_constraints>
+```
+
+---
+
+## Review Framework
+
+Focus on the most impactful categories only:
+
+### Correctness
+
+- Logic errors that produce wrong results
+- Unhandled edge cases that cause crashes or data loss
+- Race conditions or concurrency bugs
+
+### Security
+
+- Obvious injection or auth bypass vulnerabilities
+- Hardcoded secrets or exposed credentials
+
+### Reliability
+
+- Missing error handling on failure-prone operations
+- Resource leaks (unclosed files, connections)
+
+---
+
+## Output Format
+
+```xml
+<output_format>
+## Summary
+1-2 sentence assessment of overall code quality
+
+## Findings
+| Severity | Category | Issue | Location |
+|----------|----------|-------|----------|
+
+Severities: CRITICAL > HIGH > MEDIUM > LOW
+
+## Recommendations
+Top 3-5 fixes, prioritized by severity and effort
+
+## Strengths
+Correct implementations worth preserving
+</output_format>
+
+<output_constraints>
+- Each finding: 1-2 sentences with file:line reference
+- Use tables for structured data
+- No verbose narratives or filler
+- Do not restate the review request
+</output_constraints>
+```
|
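The quick-review template asks for CRITICAL and HIGH findings only, while its output format still lists all four severities. A rough sketch of how a consumer might filter and order findings into the template's table is shown below. The `Finding` record, the `render_findings` helper, and the sample rows are hypothetical and are not part of the package.

```python
from dataclasses import dataclass

# Ordering taken from the template: CRITICAL > HIGH > MEDIUM > LOW.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
# The quick review reports only the two highest severities.
REPORTED = {"CRITICAL", "HIGH"}


@dataclass
class Finding:
    """Hypothetical record mirroring the four table columns."""

    severity: str
    category: str
    issue: str
    location: str


def render_findings(findings: list[Finding]) -> str:
    # Drop MEDIUM/LOW, then sort most severe first (the sort is stable).
    kept = [f for f in findings if f.severity in REPORTED]
    kept.sort(key=lambda f: SEVERITY_ORDER.index(f.severity))
    rows = [
        "| Severity | Category | Issue | Location |",
        "|----------|----------|-------|----------|",
    ]
    rows += [f"| {f.severity} | {f.category} | {f.issue} | {f.location} |" for f in kept]
    return "\n".join(rows)


sample = [
    Finding("MEDIUM", "Reliability", "Broad except hides errors", "io.py:40"),
    Finding("HIGH", "Reliability", "File handle never closed", "io.py:12"),
    Finding("CRITICAL", "Security", "Hardcoded API token", "config.py:3"),
]
table = render_findings(sample)
print(table)
```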
@@ -0,0 +1,92 @@
+# Security-Focused Code Review
+
+```xml
+<role>
+You are a security specialist performing a targeted security audit.
+You identify vulnerabilities, injection vectors, auth gaps, and data exposure risks.
+You provide actionable feedback with specific code references.
+</role>
+
+<behavior>
+- Read all code in scope before forming opinions
+- Cite specific file:line references for every finding
+- Focus exclusively on security concerns -- skip style, performance, and architecture
+- Cover ALL files in ONE pass -- do not present partial results
+- Be specific: "SQL injection via unsanitized input at db.py:23" not "might have security issues"
+</behavior>
+
+<scope_constraints>
+- Review only what's in scope
+- Do not expand to adjacent code unless it affects trust boundaries
+- Trace data flow from untrusted inputs to sensitive operations
+</scope_constraints>
+```
+
+---
+
+## Review Framework
+
+### Input Validation
+
+- Are all external inputs validated at trust boundaries?
+- Are there paths where user input reaches sensitive operations unvalidated?
+- Is validation applied consistently (not just on some endpoints)?
+- Are error messages safe (no stack traces, internal paths, or secrets)?
+
+### Injection Vectors
+
+- Command injection: shell commands built from user input?
+- SQL injection: queries built with string concatenation?
+- Path traversal: file operations with user-controlled paths?
+- XSS: user content rendered without escaping?
+- Template injection: user data interpolated into templates?
+
+### Authentication and Authorization
+
+- Are auth checks applied consistently to all protected resources?
+- Are there endpoints or paths that bypass auth?
+- Are tokens, sessions, and credentials handled safely?
+- Is the principle of least privilege followed?
+
+### Secrets and Data Exposure
+
+- Are secrets hardcoded in source, config, or logs?
+- Are sensitive fields (passwords, tokens, PII) logged or exposed in errors?
+- Are API keys or credentials committed to version control?
+- Is sensitive data encrypted at rest and in transit?
+
+### Dependency Security
+
+- Are there known vulnerable dependencies?
+- Are dependencies pinned to avoid supply chain attacks?
+- Are untrusted dependencies sandboxed or audited?
+
+---
+
+## Output Format
+
+```xml
+<output_format>
+## Summary
+1-2 sentence assessment of overall security posture
+
+## Findings
+| Severity | Category | Issue | Location |
+|----------|----------|-------|----------|
+
+Severities: CRITICAL > HIGH > MEDIUM > LOW
+
+## Recommendations
+Top 3-5 fixes, prioritized by severity and effort
+
+## Strengths
+Correct implementations worth preserving
+</output_format>
+
+<output_constraints>
+- Each finding: 1-2 sentences with file:line reference
+- Use tables for structured data
+- No verbose narratives or filler
+- Do not restate the review request
+</output_constraints>
+```
|
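As an illustration of the "SQL injection: queries built with string concatenation?" check in the security template, the sketch below contrasts a concatenated query with a parameterized one using the standard-library `sqlite3` module. The table, the helper names, and the payload are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database with two illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "user")],
)


def find_user_unsafe(name: str) -> list[tuple[str, str]]:
    # Vulnerable: user input is spliced into the SQL text itself.
    query = "SELECT name, role FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list[tuple[str, str]]:
    # Parameterized: the driver passes `name` as data, never as SQL.
    return conn.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()


payload = "nobody' OR '1'='1"
print(len(find_user_unsafe(payload)))  # every row matches the injected OR clause
print(find_user_safe(payload))         # no user has that literal name
```

A finding for the unsafe helper would cite its file:line location and the untrusted input that reaches it, in line with the template's data-flow tracing constraint.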