shotgun-sh 0.1.0.dev12__py3-none-any.whl → 0.1.0.dev13__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

This version of shotgun-sh might be problematic.

Files changed (49)
  1. shotgun/agents/common.py +94 -79
  2. shotgun/agents/config/constants.py +18 -0
  3. shotgun/agents/config/manager.py +68 -16
  4. shotgun/agents/config/provider.py +11 -6
  5. shotgun/agents/models.py +6 -0
  6. shotgun/agents/plan.py +15 -37
  7. shotgun/agents/research.py +10 -45
  8. shotgun/agents/specify.py +97 -0
  9. shotgun/agents/tasks.py +7 -36
  10. shotgun/agents/tools/artifact_management.py +450 -0
  11. shotgun/agents/tools/file_management.py +2 -2
  12. shotgun/artifacts/__init__.py +17 -0
  13. shotgun/artifacts/exceptions.py +89 -0
  14. shotgun/artifacts/manager.py +529 -0
  15. shotgun/artifacts/models.py +332 -0
  16. shotgun/artifacts/service.py +463 -0
  17. shotgun/artifacts/templates/__init__.py +10 -0
  18. shotgun/artifacts/templates/loader.py +252 -0
  19. shotgun/artifacts/templates/models.py +136 -0
  20. shotgun/artifacts/templates/plan/delivery_and_release_plan.yaml +66 -0
  21. shotgun/artifacts/templates/research/market_research.yaml +585 -0
  22. shotgun/artifacts/templates/research/sdk_comparison.yaml +257 -0
  23. shotgun/artifacts/templates/specify/prd.yaml +331 -0
  24. shotgun/artifacts/templates/specify/product_spec.yaml +301 -0
  25. shotgun/artifacts/utils.py +76 -0
  26. shotgun/cli/plan.py +1 -4
  27. shotgun/cli/specify.py +69 -0
  28. shotgun/cli/tasks.py +0 -4
  29. shotgun/logging_config.py +23 -7
  30. shotgun/main.py +7 -6
  31. shotgun/prompts/agents/partials/artifact_system.j2 +32 -0
  32. shotgun/prompts/agents/partials/common_agent_system_prompt.j2 +28 -2
  33. shotgun/prompts/agents/partials/content_formatting.j2 +65 -0
  34. shotgun/prompts/agents/partials/interactive_mode.j2 +10 -2
  35. shotgun/prompts/agents/plan.j2 +31 -32
  36. shotgun/prompts/agents/research.j2 +37 -29
  37. shotgun/prompts/agents/specify.j2 +31 -0
  38. shotgun/prompts/agents/tasks.j2 +27 -12
  39. shotgun/sdk/artifact_models.py +186 -0
  40. shotgun/sdk/artifacts.py +448 -0
  41. shotgun/tui/app.py +26 -7
  42. shotgun/tui/screens/chat.py +28 -3
  43. shotgun/tui/screens/directory_setup.py +113 -0
  44. {shotgun_sh-0.1.0.dev12.dist-info → shotgun_sh-0.1.0.dev13.dist-info}/METADATA +2 -2
  45. {shotgun_sh-0.1.0.dev12.dist-info → shotgun_sh-0.1.0.dev13.dist-info}/RECORD +48 -25
  46. shotgun/prompts/user/research.j2 +0 -5
  47. {shotgun_sh-0.1.0.dev12.dist-info → shotgun_sh-0.1.0.dev13.dist-info}/WHEEL +0 -0
  48. {shotgun_sh-0.1.0.dev12.dist-info → shotgun_sh-0.1.0.dev13.dist-info}/entry_points.txt +0 -0
  49. {shotgun_sh-0.1.0.dev12.dist-info → shotgun_sh-0.1.0.dev13.dist-info}/licenses/LICENSE +0 -0
shotgun/artifacts/templates/research/sdk_comparison.yaml
@@ -0,0 +1,257 @@
+ name: SDK & Library Research Template
+ purpose: Use this template to build a comprehensive research and comparison document for SDKs and libraries considered in a project.
+ prompt: |
+ First, suggest creating a research.overview section to establish the problem and requirements.
+ Then, once the user reviews, move on to the discovery and evaluation sections.
+ Include a systematic evaluation framework that defines technical requirements, constraints, and quality criteria for the SDK/library selection.
+ The research should enable the team to make an informed decision based on objective comparisons and practical testing.
+ Prioritize evaluation criteria by technical criticality × business impact to maximize long-term project success.
+ Facilitate thorough investigation: Breadth of options before depth of analysis
+ Suggest alternatives if the user's initial requirements seem too restrictive: Flexibility over perfect fit
+ Your focus needs to be on creating clear, actionable research artifacts that development teams can use to make and justify technical decisions.
+ If user provides detailed requirements with use case, technical constraints, and project context, acknowledge and get to work OR ask for more details if critical context is truly missing.
+ sections:
+ research.overview:
+ instructions: |
+ # What is a good research.overview section?
+
+ • Clarifies the technical need
+ • Establishes scope and constraints
+ • Aligns the team on evaluation criteria
+ • Sets realistic timeline for decision
+
+ # Questions to ask yourself:
+ • What problem are we trying to solve with this SDK/library?
+ • What happens if we make the wrong choice?
+ • What are our non-negotiable requirements?
+ • Is there anything specific to our tech stack that constrains choices?
+
+ # Includes:
+ Research Objective: [One clear sentence stating what SDK/library type we need and why]
+ Problem Context:
+ Current limitation: [What can't we do now?]
+ Business impact: [Cost of not solving this]
+ Technical debt risk: [What happens if we choose poorly?]
+ Timeline: [When do we need to decide?]
+ [additional relevant context]
+
+ # Success Criteria:
+
+ | Criterion | Description | Priority |
+ |-----------|-------------|----------|
+ | Performance requirements | Must handle X operations/sec | Critical |
+ | Developer experience | Clear documentation, active community | High |
+ | Maintenance burden | Regular updates, security patches | Medium |
+ [more criteria relevant to the specific research]
+
+ research.requirements:
+ instructions: |
+ # How to write a good research.requirements section
+
+ Define technical and business requirements that will guide the evaluation.
+
+ • Separates must-haves from nice-to-haves
+ • Includes both functional and non-functional requirements
+ • Considers team capabilities and constraints
+
+ # Questions to ask yourself:
+ • What features are absolutely essential vs. "would be nice"?
+ • What are our technical constraints (language, platform, licenses)?
+ • What are our resource constraints (budget, team expertise)?
+
+ # Includes:
+ ## Functional Requirements
+ Essential Features:
+ - [Core functionality needed]
+ - [Integration requirements]
+ - [Performance benchmarks]
+
+ Nice-to-Have Features:
+ - [Additional capabilities]
+ - [Future-proofing considerations]
+
+ ## Non-Functional Requirements
+ Technical Constraints:
+ - Language/Platform: [e.g., Must support TypeScript]
+ - Architecture: [e.g., Must work in serverless environment]
+ - Performance: [e.g., < 50ms latency for operations]
+
+ Business Constraints:
+ - License: [e.g., MIT or Apache 2.0 preferred]
+ - Cost: [e.g., Free for commercial use or < $X/month]
+ - Support: [e.g., Active community or paid support available]
+ depends_on:
+ - "research.overview"
+
+ research.discovery:
+ instructions: |
+ # How to write a good research.discovery section
+
+ Document the process and results of finding available options.
+
+ • Shows comprehensive search methodology
+ • Lists all viable candidates with brief descriptions
+ • Explains why certain options were excluded early
+
+ # Questions to ask yourself:
+ • Where did we search? (GitHub, npm, package managers, forums)
+ • What keywords and criteria did we use?
+ • Are there any industry-standard or popular choices we're missing?
+
+ # Includes:
+ ## Search Methodology
+ Sources Consulted:
+ - [Package registries searched]
+ - [Community recommendations]
+ - [Industry reports/comparisons referenced]
+
+ ## Initial Candidates
+
+ | Library/SDK | Description | Stars/Downloads | Initial Assessment |
+ |-------------|-------------|-----------------|-------------------|
+ | Option A | Brief description | GitHub stars or weekly downloads | Meets X requirements, concerns about Y |
+ | Option B | Brief description | Metrics | Strong community, lacks feature Z |
+ [comprehensive list]
+
+ ## Early Eliminations
+ Excluded Options:
+ - **[Library Name]**: [Reason for exclusion]
+ - **[Library Name]**: [Critical missing feature or constraint violation]
+
+ depends_on:
+ - "research.requirements"
+
+ research.technical_evaluation:
+ instructions: |
+ # How to write a good research.technical_evaluation section
+
+ Deep technical analysis of the top candidates.
+
+ • Provides objective, measurable comparisons
+ • Includes code examples and API comparisons
+ • Evaluates developer experience and learning curve
+
+ # Questions to ask yourself:
+ • How easy is it to implement our use case in each option?
+ • What are the performance characteristics under our expected load?
+ • How well does each option integrate with our existing stack?
+
+ # Includes:
+ ## Evaluated Options (Top 3-5 candidates)
+
+ ### **Option A: [Name]**
+ Technical Architecture:
+ - Core design philosophy
+ - Key abstractions and patterns
+ - Dependencies and size
+
+ Code Example:
+ ```[language]
+ // Example implementing our primary use case
+ ```
+
+ Performance Metrics:
+ - Benchmark results
+ - Memory footprint
+ - Bundle size impact
+
+ Developer Experience:
+ - Setup complexity: [Rating]
+ - Documentation quality: [Rating]
+ - API design: [Rating]
+ - Debugging tools: [Rating]
+
+ [Repeat for each candidate]
+
+ depends_on:
+ - "research.discovery"
+
+ research.comparison_matrix:
+ instructions: |
+ # How to write a good research.comparison_matrix section
+
+ Side-by-side comparison of all evaluation criteria.
+
+ • Enables quick visual comparison
+ • Uses consistent scoring methodology
+ • Weights criteria by importance
+
+ # Questions to ask yourself:
+ • Are we comparing apples to apples?
+ • Is our scoring methodology clear and reproducible?
+ • Have we weighted criteria appropriately for our use case?
+
+ # Includes:
+ ## Scoring Methodology
+ - 🟢 Excellent (3 points): Exceeds requirements
+ - 🟡 Good (2 points): Meets requirements
+ - 🔴 Poor (1 point): Below requirements
+ - ❌ Missing (0 points): Doesn't support
+
+ ## Feature Comparison
+
+ | Criterion | Weight | Option A | Option B | Option C |
+ |-----------|--------|----------|----------|----------|
+ | Core Functionality | 3x | 🟢 (3) | 🟡 (2) | 🟢 (3) |
+ | Performance | 2x | 🟡 (2) | 🟢 (3) | 🟡 (2) |
+ | Documentation | 1x | 🟢 (3) | 🟡 (2) | 🔴 (1) |
+ | Community Support | 1x | 🟡 (2) | 🟢 (3) | 🟡 (2) |
+ | **Weighted Total** | | **17** | **18** | **15** |
+
+ ## Risk Assessment
+
+ | Risk Factor | Option A | Option B | Option C |
+ |-------------|----------|----------|----------|
+ | Vendor Lock-in | Low | Medium | Low |
+ | Technical Debt | Low | Low | High |
+ | Maintenance Risk | Medium | Low | High |
+
+ depends_on:
+ - "research.technical_evaluation"
+
+ research.decision:
+ instructions: |
+ # How to write a good research.decision section
+
+ Clear recommendation with thorough justification.
+
+ • States the recommendation unambiguously
+ • Provides clear rationale tied to requirements
+ • Addresses concerns and mitigation strategies
+
+ # Questions to ask yourself:
+ • Will stakeholders understand why we chose this option?
+ • Have we addressed all major concerns about our choice?
+ • Is our decision reversible if needed?
+
+ # Includes:
+ ## Recommendation
+ **Selected Solution:** [Name and version]
+
+ **Primary Reasons:**
+ 1. [Most compelling reason tied to critical requirement]
+ 2. [Second key differentiator]
+ 3. [Long-term benefit]
+
+ ## Detailed Justification
+
+ ### Why [Selected] Over [Alternative A]
+ - [Specific technical advantage]
+ - [Business consideration]
+ - [Risk mitigation]
+
+ ### Why [Selected] Over [Alternative B]
+ - [Comparison points]
+
+ ## Risk Mitigation Plan
+
+ | Identified Risk | Mitigation Strategy | Owner |
+ |----------------|---------------------|-------|
+ | [Specific risk] | [How we'll address it] | [Team/Person] |
+
+ ## Dissenting Opinions
+ - [Any team concerns and how they were addressed]
+
+ depends_on:
+ - "research.comparison_matrix"
+ - "research.technical_evaluation"
shotgun/artifacts/templates/specify/prd.yaml
@@ -0,0 +1,331 @@
+ name: Product Requirements Document (PRD) Template
+ purpose: Use this template to produce a crisp, decision-oriented PRD that enables engineers to build and executives to understand the bet. Optimize for specificity, thresholds, and rollout decisions; cut anything that doesn’t change how someone builds, measures, or kills the feature.
+ prompt: |
+ First, create prd.overview to capture the two-sentence problem statement and core inputs.
+ Then proceed through the sections in order.
+ Keep it ≤5 formatted pages. Favor real data, prototype learnings, concrete thresholds, and adult rollout plans.
+ Delete sections that don’t change decisions.
+
+ Critical principles:
+ - No fluff; defend decisions like a Staff PM.
+ - Decisions > documentation; thresholds over adjectives.
+ - Use real examples and prototype learnings.
+ - Set explicit rollout gates, owners, and kill criteria.
+
+ sections:
+ prd.overview:
+ instructions: |
+ # What to capture
+ - **Feature Name** and a **2-sentence problem statement**:
+ 1) One-sentence problem with data.
+ 2) One sentence on why solving now unlocks value.
+ - **Inputs** (fill what you have; leave gaps as Open Questions):
+ COMPANY, PRODUCT/AREA, TARGET USERS, STRATEGIC BET/OKR,
+ CORE PROBLEM (with data), CURRENT BASELINE METRICS,
+ PROTOTYPE LEARNINGS, AI FEATURE? (model/latency/cost),
+ HARD DEADLINE, KEY STAKEHOLDERS (names).
+
+ # Output should make an engineer say “I can build this” and an exec say “I get the bet.”
+
+ # Includes
+ Feature Name: [ ]
+ Problem Statement: "[Data-backed sentence]. [Why now unlocks value.]"
+ Inputs:
+ company: [ ]
+ product_area: [ ]
+ target_users: [ ]
+ strategic_bet_or_okr: [ ]
+ core_problem: "[e.g., 73% of X experience Y causing $Z loss]"
+ current_baseline_metrics: [ ]
+ prototype_learnings: [ ]
+ ai_feature:
+ enabled: [true|false]
+ model: [ ]
+ latency_requirements_ms:
+ p50: [ ]
+ p99: [ ]
+ cost_per_1k_requests_usd_max: [ ]
+ hard_deadline: [YYYY-MM-DD]
+ key_stakeholders:
+ - name: [ ]
+ role: [ ]
+ # first section, no deps
+
+ opportunity.framing:
+ instructions: |
+ # Frame the bet crisply
+ Core Problem: "[X% of users do Y causing $Z impact]"
+ Hypothesis: "If we do X, then Y will improve by Z%."
+ Strategy Fit: "[Which OKR this unlocks AND what we are explicitly NOT doing]"
+
+ Financial Model (monthly impact):
+ - Pessimistic: assumptions=[ ], impact=$X, confidence=60%
+ - Most Likely: assumptions=[ ], impact=$Y, confidence=30%
+ - Optimistic: assumptions=[ ], impact=$Z, confidence=10%
+ depends_on:
+ - "prd.overview"
+
+ scope.boundaries:
+ instructions: |
+ # Draw hard lines to prevent scope creep
+ In Scope (checked user stories/surfaces/endpoints):
+ - [ ]
+ - [ ]
+ - [ ]
+
+ NON-GOALS:
+ 1) **[Not building X]** because [reason]
+ 2) **[Deferring Z to Qn]** because [dependency]
+ 3) **[Not touching system V]** due to [migration/constraints]
+ depends_on:
+ - "opportunity.framing"
+
+ success.measurement:
+ instructions: |
+ # Define success with numbers and decision thresholds
+ Primary Success Metrics:
+ - name: [ ]
+ baseline: "[value ± variance]"
+ target: "[value]"
+ mde: "[%]"
+ decision_threshold: "Ship if >A, kill if <B"
+ - name: [ ]
+ baseline: "[ ]"
+ target: "[ ]"
+ mde: "[%]"
+ decision_threshold: "[e.g., Must exceed by week 2]"
+
+ Guardrails (must not regress):
+ - metric: [ ]
+ threshold: [e.g., P99 latency < Y ms]
+ - metric: [ ]
+ threshold: [e.g., FP rate < Z%]
+
+ Graduation Criteria:
+ scale_to_100_if: [clear conditions]
+ kill_and_revert_if: [explicit failure conditions]
+ iterate_if: [middle-ground conditions]
+ depends_on:
+ - "scope.boundaries"
+
+ rollout.plan:
+ instructions: |
+ # Adult rollout plan with real gates, names, and dates
+ Phase 1: Validation (Days 1–7)
+ exposure: "[X% of specific segment]"
+ sampling: "[user/session/request]"
+ gate_1:
+ date: [YYYY-MM-DD]
+ decider: [Name]
+ criteria: "[metrics and thresholds]"
+
+ Phase 2: Controlled Expansion (Days 8–21)
+ exposure_ramp: ["X%" , "Y%" , "Z%"]
+ gate_2:
+ date: [YYYY-MM-DD]
+ stakeholder_check: "[specific metric] with [stakeholder]"
+ finance_approval_if: "[condition]"
+
+ Phase 3: Full Launch (Day 22+)
+ contingent_on: "[specific criteria]"
+ regional_or_segment_order: "[sequence + why]"
+ depends_on:
+ - "success.measurement"
+
+ prototype.loop:
+ instructions: |
+ # Close the loop between prototypes and PRD
+ What prototypes taught us:
+ - [Learning 1 → changed approach from X to Y]
+ - [Learning 2 → discovered edge case Z]
+ - [Learning 3 → users want A not B]
+
+ Next prototype must answer:
+ - [ ] Can we handle [edge case] at scale?
+ - [ ] Does [approach] work for [segment]?
+ - [ ] Is [latency] acceptable for [use case]?
+
+ PRD revision triggers:
+ - If prototype shows [X], revisit [section Y].
+ - If [metric] differs by >20% from estimate, reopen financial model.
+ depends_on:
+ - "rollout.plan"
+
+ ai.behavior_contract:
+ instructions: |
+ # Include only if AI feature; otherwise skip this section
+ Behavior Contract:
+ must:
+ - [Specific behavior 1]
+ - [Specific behavior 2]
+ must_not:
+ - [Prohibited behavior 1]
+ - [Prohibited behavior 2]
+
+ Reference Examples (use real data)
+ good_examples: 7-10 with Input / Output / Why Good
+ bad_examples: 7-10 with Input / Current Output / Correct Handling / Why Bad
+ reject_cases: 5 edge cases with Scenario / Detection / Response / Why Reject
+
+ AI Guardrails:
+ pii_echo: "[detection + prevention]"
+ hallucination_handling: "[threshold + fallback]"
+ latency_ms:
+ p50: [ ]
+ p99: [ ]
+ cost_per_1k_requests_usd_max: [ ]
+ token_limits:
+ input_tokens_max: [ ]
+ output_tokens_max: [ ]
+ depends_on:
+ - "prototype.loop"
+
+ risk.management:
+ instructions: |
+ # Name risks, detection, mitigation, and owners
+ risks:
+ - risk: [ ]
+ likelihood: [Low|Med|High]
+ impact: [Low|Med|High]
+ detection: "[How detected]"
+ mitigation: "[Response]"
+ owner: "[Name]"
+
+ kill_switch:
+ location: "[config/flag path]"
+ who_can_pull: "[team on-call + escalation]"
+ ttr_minutes: [ ]
+ fallback_state: "[system reverts to X]"
+ depends_on:
+ - "prototype.loop"
+
+ launch.readiness:
+ instructions: |
+ # Prove we’re ready to ship Day 1
+ pre_launch_checklist:
+ - golden_eval_set: "[X examples, Y% coverage]"
+ - load_test: "Confirmed [X] QPS with [Y] ms P99"
+ - dashboards: "[link]"
+ - runbook: "[link]"
+ - legal_review: "[Name approved Date]"
+ - security_review: "[Name approved Date]"
+ - finance_model: "Approval if > $[X] impact"
+
+ day_1_success:
+ - "[Metric 1] within expected range"
+ - "No P0 incidents"
+ - "[X]% of traffic successfully served"
+ depends_on:
+ - "risk.management"
+
+ ownership.decisions:
+ instructions: |
+ # Who decides what, and when
+ core_team:
+ - role: DRI
+ name: [ ]
+ accountable_for: "Overall success, kill decision"
+ - role: Eng Lead
+ name: [ ]
+ accountable_for: "Technical delivery, latency"
+ - role: Data
+ name: [ ]
+ accountable_for: "Experiment design, analysis"
+ - role: Design
+ name: [ ]
+ accountable_for: "User experience bar"
+ - role: Legal/Security
+ name: [ ]
+ accountable_for: "Compliance, data handling"
+
+ decision_calendar:
+ - date: [YYYY-MM-DD]
+ decision: "Gate 1: Expand?"
+ criteria: "[Metrics]"
+ decider: "[Name]"
+ - date: [YYYY-MM-DD]
+ decision: "Gate 2: Graduate?"
+ criteria: "[Metrics]"
+ decider: "[Name]"
+ - date: [YYYY-MM-DD]
+ decision: "Post-mortem"
+ criteria: "Learnings"
+ decider: "All"
+ depends_on:
+ - "launch.readiness"
+
+ open.questions:
+ instructions: |
+ # Track gaps with owners and due dates
+ questions:
+ - question: "[Missing data X?]"
+ owner: "[Name]"
+ due_date: [YYYY-MM-DD]
+ blocks: "[Launch|Phase 2|—]"
+ - question: "[Technical unknown Y?]"
+ owner: "[Name]"
+ due_date: [YYYY-MM-DD]
+ blocks: "[ ]"
+ depends_on:
+ - "ownership.decisions"
+
+ json.summary:
+ instructions: |
+ # Brief machine-readable summary for dashboards/automation
+ schema:
+ title: "[Feature Name]"
+ hypothesis: "[One sentence]"
+ problem_magnitude: "$[X]M annual impact"
+ confidence_level: "[X]%"
+ primary_metrics:
+ - name: ""
+ baseline: ""
+ target: ""
+ mde: ""
+ threshold: ""
+ guardrails:
+ - metric: ""
+ threshold: ""
+ consequence: "auto-kill if breached"
+ rollout:
+ start_date: "YYYY-MM-DD"
+ initial_exposure: "X%"
+ duration_days: X
+ gates:
+ - date: ""
+ decision: ""
+ criteria: ""
+ kill_switch:
+ location: ""
+ owners: [""]
+ ttr_minutes: X
+ ai_config:
+ model: ""
+ behavior_examples:
+ good: X
+ bad: Y
+ reject: Z
+ max_latency_p99_ms: X
+ max_cost_per_1k: X
+ owners:
+ dri: ""
+ eng_lead: ""
+ data_lead: ""
+ prototype_learnings_incorporated: true
+ next_revision_date: "YYYY-MM-DD"
+ depends_on:
+ - "open.questions"
+
+ final.quality_checks:
+ instructions: |
+ # Run this quick list before sharing
+ checklist:
+ - "Would an engineer know exactly what to build?"
+ - "Can an exec understand the bet in 30 seconds?"
+ - "All metrics have baselines and targets?"
+ - "Prototype learnings reflected (not hypotheticals)?"
+ - "Examples come from real data?"
+ - "Every section helps ship, measure, or kill?"
+ - "Total length ≤ 5 pages when formatted?"
+ depends_on:
+ - "json.summary"