theslopmachine 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (31)
  1. package/MANUAL.md +63 -0
  2. package/README.md +23 -0
  3. package/RELEASE.md +81 -0
  4. package/assets/agents/developer.md +294 -0
  5. package/assets/agents/slopmachine.md +510 -0
  6. package/assets/skills/beads-operations/SKILL.md +75 -0
  7. package/assets/skills/clarification-gate/SKILL.md +51 -0
  8. package/assets/skills/developer-session-lifecycle/SKILL.md +75 -0
  9. package/assets/skills/final-evaluation-orchestration/SKILL.md +75 -0
  10. package/assets/skills/frontend-design/SKILL.md +41 -0
  11. package/assets/skills/get-overlays/SKILL.md +157 -0
  12. package/assets/skills/planning-gate/SKILL.md +68 -0
  13. package/assets/skills/submission-packaging/SKILL.md +268 -0
  14. package/assets/skills/verification-gates/SKILL.md +106 -0
  15. package/assets/slopmachine/backend-evaluation-prompt.md +275 -0
  16. package/assets/slopmachine/beads-init.js +428 -0
  17. package/assets/slopmachine/document-completeness.md +45 -0
  18. package/assets/slopmachine/engineering-results.md +59 -0
  19. package/assets/slopmachine/frontend-evaluation-prompt.md +304 -0
  20. package/assets/slopmachine/implementation-comparison.md +36 -0
  21. package/assets/slopmachine/quality-document.md +108 -0
  22. package/assets/slopmachine/templates/AGENTS.md +114 -0
  23. package/assets/slopmachine/utils/convert_ai_session.py +1837 -0
  24. package/assets/slopmachine/utils/strip_session_parent.py +66 -0
  25. package/bin/slopmachine.js +9 -0
  26. package/package.json +25 -0
  27. package/src/cli.js +32 -0
  28. package/src/constants.js +77 -0
  29. package/src/init.js +179 -0
  30. package/src/install.js +330 -0
  31. package/src/utils.js +162 -0
@@ -0,0 +1,304 @@
+ You are the reviewer responsible for “Delivery Acceptance / Project Architecture Inspection.”
+
+ In the current working directory, review the frontend project point by point and make determinations based on the [Business / Task Prompt] and the [Acceptance / Scoring Criteria]. The acceptance criteria are the sole standard of judgment.
+
+ [Business / Task Prompt]
+ {prompt}
+
+ [Acceptance / Scoring Criteria (single source of truth)]
+ {
+
+ 1. Mandatory Gate Checks
+
+ 1.1 Can the delivered project actually be run and verified?
+
+ - Is there a clear explanation of how to start, run, build, or preview the project?
+ - Can it be started, built, or verified locally without modifying core code?
+ - Do the actual results generally match the delivery documentation?
+
+ 1.2 Does the deliverable materially deviate from the Prompt?
+
+ - Does the implementation stay aligned with the business goal, page scenarios, and user flows described in the Prompt?
+ - Is there functionality that is only weakly related or unrelated to the Prompt?
+ - Has the implementation replaced, weakened, or ignored the core problem definition in the Prompt without explanation?
+
+ 2. Completeness of Delivery
+
+ 2.1 Does the deliverable fully cover the core requirements explicitly stated in the Prompt?
+
+ - Are the required pages, core features, core interactions, and key UI states implemented?
+ - Are the main user flows covered, rather than only static UI or isolated fragments?
+
+ 2.2 Does the deliverable have the shape of a real end-to-end project rather than a partial sample, demo fragment, or illustrative code snippet?
+
+ - Is mock / hardcoded behavior used in place of real logic without being disclosed?
+ - Is there a complete project structure rather than scattered code or a single-file example?
+ - Is there basic project documentation such as a README or equivalent?
+ - Does the project have a basic organization for pages, routing, state, or data flow, rather than just stitched-together display code?
+
+ 3. Engineering and Architecture Quality
+
+ 3.1 Does the deliverable use a reasonable structure and module split for the scope of the problem?
+
+ - Is the project structure clear, with reasonably separated responsibilities?
+ - Is there basic separation across pages, components, state, service calls, and utility functions?
+ - Are there unnecessary or redundant files?
+ - Is too much logic stacked into a single file?
+
+ 3.2 Does the deliverable show basic maintainability and extensibility rather than being a temporary or piled-up implementation?
+
+ - Is there obvious confusion or tight coupling?
+ - Does the core logic leave room for extension, or is everything hardcoded?
+ - Are component reuse, state management, API abstraction, and constant/config organization handled in a maintainable way?
+
+ 4. Engineering Detail and Professionalism
+
+ 4.1 Does the deliverable reflect sound frontend engineering practice in terms of error handling, logging, validation, state feedback, and interaction design?
+
+ - Is error handling basically reliable and user-friendly?
+ - Is necessary validation present for important inputs, key interactions, and boundary cases?
+ - Are essential UI states handled, such as loading, empty, error, submitting, and success / failure feedback?
+ - Is logging used to support troubleshooting rather than being random, excessive, or entirely absent?
+ - Is there any risk of sensitive data being exposed through console output, analytics, visible UI content, or similar surfaces?
+
+ 4.2 Does the deliverable resemble a real product rather than a demo or tutorial artifact?
+
+ - Does the project look like a real application rather than a teaching sample or showcase demo?
+ - Are the pages meaningfully connected to each other?
+ - Are the interaction flows complete, rather than only displaying static outcomes?
+
+ 5. Prompt Understanding and Fit
+
+ 5.1 Does the deliverable correctly understand and respond to the business goal, usage scenario, and implied constraints in the Prompt, rather than merely implementing surface-level UI?
+
+ - Does it correctly fulfill the Prompt’s core business objective?
+ - Is there any clear misunderstanding of the requirement or deviation from the real problem being solved?
+ - Have key constraints in the Prompt been changed or ignored without explanation?
+ - Does the project only “look right” visually while failing to complete the actual interaction flow, state transitions, or user task closure?
+
+ 6. Visual and Interaction Quality (frontend projects only)
+
+ 6.1 Are the visuals and interactions appropriate to the scenario, and is the design reasonably polished?
+
+ - Are different functional areas visually distinguishable through background, separation, spacing, hierarchy, or similar means?
+ - Is the overall layout coherent, with consistent alignment, spacing, and proportions?
+ - Do UI elements such as text, images, and icons render correctly?
+ - Do the visual elements match the theme and content, or are there images / illustrations / decorative assets that clearly do not fit?
+ - Is there basic interaction feedback such as hover states, click states, disabled states, transitions, or current-state indications to help users understand what is happening?
+ - Are fonts, font sizes, colors, and icon styles basically consistent, or is the visual language mixed and inconsistent?
+ }
+
+ Review Objective
+
+ Determine whether the delivered project is a credible, runnable, prompt-aligned, and minimally professional frontend deliverable.
+
+ Priority Order
+
+ 1. Runnability boundary
+ 2. Prompt requirement fit
+ 3. Security-critical flaws
+ 4. Test sufficiency
+ 5. Major engineering quality issues
+ 6. Visual and interaction quality, only when clearly applicable
+
+ Execution Rules
+
+ 1. Review only the highest-impact findings that can change the final verdict. Do not exhaustively enumerate every secondary or tertiary checklist item.
+
+ 2. Do not relax standards for:
+
+ - security
+ - prompt fit
+ - completeness of delivery
+ - test sufficiency
+ - evidence for material conclusions
+
+ 3. Do not skip any issue that could independently lead to a Fail or Partial Pass verdict.
+
+ 4. If a security, prompt-fit, runnability, or core test-sufficiency issue is suspected, continue investigating until it is either evidenced or explicitly marked Cannot Confirm.
+
+ 5. Stop expanding once either of the following conditions is met, whichever comes first:
+
+ - up to 10 findings have been identified in total
+ - up to 5 High / Blocker findings have been identified
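The rule-5 stopping condition is simple counting logic; it can be sketched as a small helper (hypothetical function name — the prompt itself prescribes behavior, not code):

```python
# Hypothetical sketch of the rule-5 stopping condition:
# stop once 10 total findings exist, or once 5 High/Blocker findings exist.

def should_stop_expanding(findings):
    """findings: list of severity strings such as 'Blocker', 'High', 'Medium', 'Low'."""
    high_or_blocker = sum(1 for s in findings if s in ("High", "Blocker"))
    return len(findings) >= 10 or high_or_blocker >= 5

# Five High findings trigger the stop even with fewer than 10 total.
print(should_stop_expanding(["High"] * 5))   # True
print(should_stop_expanding(["Low"] * 9))    # False
print(should_stop_expanding(["Low"] * 10))   # True
```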
+
+ 6. Do not modify project code.
+
+ 7. Require evidence only for material conclusions. For any conclusion that changes the final verdict, provide concrete, traceable evidence. Evidence may take the form of a file path plus line number, tool output, or an explicit runtime result.
+
+ 8. If evidence is insufficient, do not guess. Use “Cannot Confirm,” or explicitly label the judgment as an assumption together with its applicable boundary.
+
+ 9. Perform runtime verification only when all of the following are true:
+
+ - the command is explicitly documented
+ - no Docker is required
+ - no Docker-related command is required
+ - no container orchestration is required
+ - no privileged system access is required
+ - no external network / third-party dependency is required
+ - expected execution time is short
+
+ 10. Never run any Docker-related command. This includes, but is not limited to:
+
+ - docker
+ - docker compose
+ - docker-compose
+ - podman
+ - container runtime / orchestration commands with equivalent effect
+
+ 11. If verification would require Docker or any container-related command, do not execute it. Instead:
+
+ - clearly state that Docker-based runtime verification was not performed
+ - treat it as a verification boundary, not automatically as a project defect
+ - provide local reproduction commands the user can run
+ - state what was confirmed through static review
+ - state what remains unconfirmed
+
+ 12. Docker non-execution is a verification constraint, not a project defect by itself. Only report a defect if the project itself lacks runnable documentation, has broken startup logic, or shows static evidence of delivery failure.
+
+ 13. Security review has priority over style issues. Always assess:
+
+ - authentication entry points and login-state handling
+ - frontend route protection / route guards
+ - page-level / feature-level access control
+ - whether admin pages, debug pages, config pages, or hidden menus can be accessed directly
+ - whether tokens, user information, secrets, environment variables, or debug data are exposed in frontend code, logs, analytics, localStorage, sessionStorage, visible responses, or console output
+ - whether switching between users leaves behind cached data, stale state, or leaked page content
+
+ 14. Tests and logging are part of the acceptance scope, but do not build a full requirement-to-test traceability matrix. Only assess whether the following are sufficiently covered:
+
+ - the core business happy path
+ - major failure paths, such as validation failure, unauthenticated interception, insufficient-permission feedback, missing-resource empty / error states, request failure handling, duplicate-submission protection, and similar relevant cases
+ - frontend security-critical areas
+ - important boundaries directly tied to the business flow, such as pagination, sorting, filtering, search, loading / empty / error states, repeat clicks / repeat requests, async race conditions, and state recovery
+ - whether unit tests, component tests, page / route integration tests, and E2E tests exist and appear basically runnable
+ - whether log categorization is clear and whether there is any risk of sensitive-data leakage
+
+ 15. For test coverage, state only:
+
+ - covered / partially covered / missing / cannot confirm
+ - one or two supporting evidence points
+ - the minimum necessary additional test suggestion when coverage is weak
+
+ 16. Keep logging review concise. Only assess:
+
+ - whether logging exists and meaningfully supports troubleshooting
+ - whether logging categories are basically clear, if such categorization exists
+ - whether there is any obvious risk of sensitive data leakage through logs, UI, analytics, or other frontend-visible surfaces
+
+ 17. Mock / stub / fake behavior is not a defect by itself unless the Prompt or documentation explicitly requires real backend integration. If such behavior exists, explain only:
+
+ - the scope of the mock
+ - how it is enabled
+ - whether there is any obvious risk of shipping mock behavior to production, such as default mock mode, silent interception of real requests, or bypassing real error handling
+
+ 18. Once the final verdict is sufficiently supported, do not continue searching for additional low-severity issues.
+
+ 19. Once enough evidence has been collected to support the final verdict and the main findings, do not continue reading unrelated files.
+
+ 20. Never read, search, open, quote, summarize, or rely on any file under ./.tmp/ or any of its subdirectories. Treat ./.tmp/ as an excluded directory, scratch directory, or output directory, not as a source of project truth. Even if ./.tmp/ appears to contain relevant content, it must not be used as evidence.
+
+ 21. Ignore any existing reports, summaries, logs, scan outputs, or markdown files under ./.tmp/. Do not treat them as authoritative input, prior evidence, or prior conclusions.
+
+ 22. If the same information exists both inside and outside ./.tmp/, only the non-.tmp source may be used. If a piece of information exists only under ./.tmp/, it must not be treated as authoritative evidence.
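The ./.tmp/ exclusion in rules 20–22 amounts to a root-level path filter; a minimal stdlib sketch (the helper name is hypothetical, not part of the prompt):

```python
from pathlib import Path

# Hypothetical path filter for the ./.tmp/ exclusion: any file at or below
# the project's top-level .tmp directory is never usable as evidence.

def is_allowed_evidence(path, project_root="."):
    root = Path(project_root).resolve()
    resolved = (root / path).resolve()
    try:
        relative = resolved.relative_to(root)
    except ValueError:
        return False  # outside the project directory is not evidence either
    # Excluded when the first component relative to the root is .tmp
    # (this covers ./.tmp/ itself and all of its subdirectories).
    return not (relative.parts and relative.parts[0] == ".tmp")

print(is_allowed_evidence("src/cli.js"))          # True
print(is_allowed_evidence(".tmp/report.md"))      # False
print(is_allowed_evidence(".tmp/sub/notes.txt"))  # False
```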
+
+ Output Requirements
+
+ The output must strictly follow these sections:
+
+ 1. Verdict
+
+ - Pass / Partial Pass / Fail / Cannot Confirm
+
+ 2. Scope and Verification Boundary
+
+ - what was reviewed
+ - what input sources were excluded, including ./.tmp/
+ - what was not executed
+ - whether Docker-based verification was required but not executed
+ - what remains unconfirmed
+
+ 3. Top Findings
+
+ - list up to 10 findings only
+ - each finding must include:
+   - Severity: Blocker / High / Medium / Low
+   - Conclusion
+   - Brief rationale
+   - Evidence
+   - Impact
+   - Minimum actionable fix
+
+ 4. Security Summary
+
+ At minimum, cover the following dimensions:
+
+ - authentication / login-state handling
+ - frontend route protection / route guards
+ - page-level / feature-level access control
+ - sensitive information exposure
+ - cache / state isolation after switching users
+
+ For each item above, output:
+
+ - Pass / Partial Pass / Fail / Cannot Confirm
+ - brief evidence or verification-boundary explanation
+
+ 5. Test Sufficiency Summary
+
+ Must include:
+
+ - Test Overview
+   - whether unit tests exist
+   - whether component tests exist
+   - whether page / route integration tests exist
+   - whether E2E tests exist
+   - if they exist, what the obvious test entry points are
+ - Core Coverage
+   - happy path: covered / partial / missing / cannot confirm
+   - key failure paths: covered / partial / missing / cannot confirm
+   - security-critical coverage: covered / partial / missing / cannot confirm
+ - Major Gaps
+   - list up to 3 highest-risk testing gaps
+ - Final Test Verdict
+   - Pass / Partial Pass / Fail / Cannot Confirm
+
+ 6. Engineering Quality Summary
+
+ Assess only the major maintainability / architecture issues that materially affect delivery credibility.
+
+ 7. Visual and Interaction Summary
+
+ Output this section only when clearly applicable. Assess only the visual and interaction issues that materially affect delivery quality.
+
+ 8. Next Actions
+
+ - list up to 5 actions
+ - sort them by severity and unblock value
+
+ Final Verification Before Output
+
+ Before finalizing, check all of the following:
+
+ 1. Does each material conclusion have supporting evidence?
+ 2. Are any claims stronger than the evidence actually supports?
+ 3. If all unsupported observations are removed, does the final verdict still hold?
+ 4. Has any uncertain point been incorrectly presented as a confirmed fact?
+ 5. Has security or test sufficiency been judged too loosely without evidence?
+ 6. Has a Docker non-execution boundary been incorrectly described as a confirmed runtime failure?
+ 7. Has any material conclusion directly or indirectly relied on files under ./.tmp/?
+
+ If file writing is supported, save the final report as a markdown file. Otherwise, return the report directly in the conversation.
@@ -0,0 +1,36 @@
+ **8.2 Actual Implementation vs. Requirements Comparison**
+
+ | Requirement Item | Original Requirement | Actual Implementation | Additions Beyond Requirements |
+ | :---- | :---- | :---- | :---- |
+ | **Complaint & Suggestion Function** | Basic requirement | ✅ Fully implemented | Added categories, status tracking, and administrator replies |
+ | **Data Management** | Basic requirement | ✅ Fully implemented | Added Django Admin backend management |
+ | **User Interaction** | Not specified | ✅ Fully implemented | Added responsive design and an aesthetic interface |
+ | **Innovative Features** | Not required | ✅ Implemented beyond scope | "What I Want to Eat Most Tomorrow" leaderboard, voting function, and word-cloud display |
+ | **Image Upload** | Not required | ✅ Implemented beyond scope | Supports uploading images as evidence |
+
+ **8.3 Depth of Requirement Understanding**
+
+ The project not only met the original requirements but also reflected a deep understanding of the underlying business scenarios:
+
+ * **Understood the "Canteen" scenario**: Beyond complaints and suggestions, a "What I Want to Eat Most Tomorrow" voting function was added to increase user engagement.
+ * **Understood "Management" needs**: Provided a complete backend management system supporting categorization, status tracking, and administrator replies.
+ * **Understood the value of "Visualization"**: Used word clouds to display trending issues, intuitively showing users' points of concern.
+ * **Understood the importance of "User Experience"**: Responsive design, an aesthetic interface, and smooth animations.
+ * **Understood the key to "Runnability"**: Docker one-click deployment, data persistence, and comprehensive testing.
+ * **Homepage Display**
+ * **Suggestion Word Cloud**
+ * **Submit Suggestion**
+ * **View Suggestions**
+ * **Status Filtering**
+
+ **Admin Management Default Password**: admin / admin123
+
+ * **Admin Management**
+ * **Background Color Toggle**
+ * **Add User**
+ * **Modify User Information**
+ * **User Permission Control**
+ * **Modify Suggestion**
+ * **Batch Execution**
+ * **Search**
+ * **Add Suggestion**
@@ -0,0 +1,108 @@
+ **Self-Test Results - Engineering and Architecture Quality**
+
+ **Project Positioning**
+
+ This is a full-stack canteen management system, including:
+
+ * **Frontend Display Page**: User-facing; displays "Tomorrow's Most Wanted Food List," allows submission of opinions and suggestions, and displays word clouds.
+ * **Administrator Backend**: Admin-facing; manages food voting and opinions/suggestions.
+
+ **Technology Stack Selection**
+
+ * **Backend**: Django 6.0.2 + Python 3.8+
+ * **Database**: SQLite3
+ * **Image Processing**: Pillow 12.1.0
+ * **Chinese Segmentation**: jieba 0.42.1
+ * **Word Cloud Generation**: wordcloud 1.9.6 + matplotlib 3.10.8
+ * **Testing Framework**: pytest 8.3.4 + pytest-django 4.9.0
+ * **Deployment**: Docker Compose one-click startup
+
+ **Overall Architecture Diagram**
+
+ *(Architecture diagram placeholder)*
+
+ **Architecture Description:**
+
+ * **Client Layer**: Browser access to frontend pages and backend management.
+ * **URL Routing Layer**: Receives HTTP requests and routes them to the corresponding view functions.
+ * **View Layer**: Processes business logic and calls the model layer for data operations.
+ * **Model Layer**: Defines data models and interacts with the database via the Django ORM.
+ * **Template Layer**: Renders HTML pages returned to the client.
+ * **Backend Management**: Management interface provided by Django Admin.
+ * **Data Layer**: SQLite3 database for data storage.
+
+ **Module Division (Clear Responsibilities)**
+
+ | Module | Responsibility | File |
+ | :---- | :---- | :---- |
+ | **Configuration Management** | Django settings, middleware, database configuration | complaint_system/settings.py |
+ | **URL Routing** | Request routing and dispatch | complaint_system/urls.py, main/urls.py |
+ | **Data Models** | ORM model definitions, database table structures | main/models.py |
+ | **View Functions** | Business logic processing, request/response handling | main/views.py |
+ | **Backend Management** | Django Admin configuration | main/admin.py |
+ | **Templates** | HTML page rendering | main/templates/ |
+ | **Testing** | Unit testing, integration testing | tests/ |
+
+ **Request Processing Flow**
+
+ *(Data flow diagram placeholder)*
+
+ **3.2 Architecture Quality Rating**
+
+ **Score: 9.0/10**
+
+ **Pros:**
+
+ * **Modular Design**: Clear responsibilities, low coupling, and easy maintenance.
+ * **Layered Architecture**: Clean layering from the view layer through the model layer to the data layer.
+ * **Django Best Practices**: Uses built-in features such as the Django ORM, Admin, and the messages framework.
+ * **Centralized Configuration**: All configuration is managed centrally in settings.py.
+ * **Scalability**: The database can be easily replaced with PostgreSQL or MySQL.
+
+ **Areas for Improvement:**
+
+ * Consider introducing a service layer to decouple complex business logic from views.
+ * Consider using Django REST Framework to provide API endpoints.
+
+ **3.3 Database Design**
+
+ **Table Structure: FoodVote (Food Voting)**
+
+ | Field | Type | Description |
+ | :---- | :---- | :---- |
+ | id | INTEGER PK | Primary Key |
+ | name | VARCHAR(100) | Food Name |
+ | meal_type | VARCHAR(20) | Meal Type (breakfast/lunch/dinner) |
+ | vote_count | INTEGER | Vote Count |
+ | created_at | DATETIME | Creation Time |
+ | updated_at | DATETIME | Update Time |
+
+ **Table Structure: Suggestion (Opinions and Suggestions)**
+
+ | Field | Type | Description |
+ | :---- | :---- | :---- |
+ | id | INTEGER PK | Primary Key |
+ | title | VARCHAR(200) | Title |
+ | content | TEXT | Content |
+ | category | VARCHAR(20) | Category (food/service/environment/price/other) |
+ | status | VARCHAR(20) | Status (pending/processing/resolved/closed) |
+ | submitter_name | VARCHAR(50) | Submitter Name (Optional) |
+ | submitter_contact | VARCHAR(100) | Contact Info (Optional) |
+ | image | VARCHAR(100) | Image Path (Optional) |
+ | admin_reply | TEXT | Admin Reply (Optional) |
+ | created_at | DATETIME | Creation Time |
+ | updated_at | DATETIME | Update Time |
+
+ **Index Design**
+
+ * **FoodVote Table Index**: Composite index to support queries by meal type and vote count: models.Index(fields=['meal_type', '-vote_count']).
+ * **Suggestion Table Indexes**:
+   * Composite index on status and creation time: models.Index(fields=['status', '-created_at']).
+   * Composite index on category and creation time: models.Index(fields=['category', '-created_at']).
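The Suggestion schema and its enumerated fields can be mirrored in a plain-Python sketch (stdlib only, runnable without Django; the real project defines these as Django models with TextChoices, so the class shapes below are illustrative, though the field names follow the tables above):

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional

# Stdlib mirror of the documented Suggestion schema; illustrative only —
# the real project uses Django model fields and TextChoices enumerations.

class Category(str, Enum):
    FOOD = "food"
    SERVICE = "service"
    ENVIRONMENT = "environment"
    PRICE = "price"
    OTHER = "other"

class Status(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    RESOLVED = "resolved"
    CLOSED = "closed"

@dataclass
class Suggestion:
    title: str                                  # VARCHAR(200)
    content: str                                # TEXT
    category: Category = Category.OTHER         # VARCHAR(20) choices
    status: Status = Status.PENDING             # VARCHAR(20) choices
    submitter_name: Optional[str] = None        # VARCHAR(50), optional
    admin_reply: Optional[str] = None           # TEXT, optional
    created_at: datetime = field(default_factory=datetime.now)

s = Suggestion(title="Longer breakfast hours", content="Please open earlier.")
print(s.status.value)    # pending
print(s.category.value)  # other
```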
+
+ **Design Evaluation:**
+
+ * ✅ Index design is reasonable and covers the main query scenarios.
+ * ✅ Field types are chosen appropriately, using TextChoices enumeration types.
+ * ✅ Uses the Django ORM, allowing easy migration to other databases.
+ * ✅ Timestamp fields are managed automatically via auto_now_add and auto_now.
+
@@ -0,0 +1,114 @@
+ # Developer Rulebook
+
+ This file is the developer-facing operating rulebook for project execution.
+
+ ## Scope
+
+ - Treat the current working directory as the project.
+ - Ignore files outside the current working directory unless the user explicitly asks you to use them.
+ - Do not use parent-directory files as hidden requirements.
+
+ ## Working Style
+
+ - Operate like a senior software engineer with strong judgment and attention to detail.
+ - Plan before coding when the work is non-trivial.
+ - Build in meaningful vertical slices instead of scattering half-finished work across the codebase.
+ - Prefer reading the actual code and project state over inventing assumptions.
+ - Surface weak spots, risks, and missing information honestly.
+ - Do not call work complete when it is still shaky.
+ - Reuse and extend established cross-cutting patterns for errors, audit/logging, permissions, auth/session behavior, and state transitions where relevant instead of reinventing them per module.
+ - For complex security, offline, authorization, storage, or data-governance features, define what "done" means across all promised dimensions before implementing.
+ - When a requirement implies enforcement, persistence, statefulness, or rejection behavior, assume that behavior needs to be real unless it is explicitly scoped down.
+ - Before reporting foundational work complete, challenge whether the behavior is real at runtime or only present in visible shape through constants, headers, helper wiring, or partial middleware.
+ - Treat module completion as system-compatible completion, not isolated happy-path completion.
+ - If you discover a meaningful failing user-facing, release-facing, production-path, or build check, do not treat the slice as complete unless that check was explicitly scoped out.
+ - If a required user-facing or admin-facing flow cannot be exercised through its real surface, treat that as missing implementation rather than something to bypass with API shortcuts or test-only workarounds.
+
+ ## Runtime And Verification Rules
+
+ - A heavy gate is an owner-run integrated verification boundary, not every ordinary phase change.
+ - Heavy gates normally include full clean runtime proof, full `run_tests.sh`, and Playwright plus screenshot evidence when UI or fullstack flows exist.
+ - Heavy gates are expected at scaffold acceptance, integrated/full verification, and post-evaluation remediation re-acceptance.
+ - Ordinary phase progression and module completion do not automatically mean rerunning every heavy-gate command.
+ - Treat Docker as the main runtime contract.
+ - `docker compose up --build` is the canonical startup path and must work when the project expects Dockerized execution.
+ - `run_tests.sh` is a required project test entrypoint and must exist and work.
+ - After the scaffold is established, do not rerun full `docker compose up --build` and `run_tests.sh` on every small implementation step.
+ - During normal iteration, prefer the fastest meaningful local verification inside the current working directory using the project-appropriate test environment and tooling.
+ - If the local test toolchain is missing, try to install or enable it before falling back to `run_tests.sh`.
+ - Treat `docker compose up --build` and `run_tests.sh` as critical-gate verification commands, not normal per-turn iteration commands.
+ - The workflow owner handles those expensive critical-gate runs; focus on strong local verification during normal work so the gate passes succeed cleanly.
+ - After post-evaluation remediation, strengthen local verification and the affected Playwright checks rather than rerunning every full gate command yourself unless explicitly required.
+ - Do not let unverified work accumulate.
+
+ ## Testing Rules
+
+ - Tests must be real, meaningful, and tied to actual behavior.
+ - Cover happy paths, failure paths, and realistic edge cases.
+ - For API-bearing projects, prefer real endpoint invocation where practical.
+ - For backend integration tests, prefer production-equivalent infrastructure when practical instead of a weaker substitute that can hide real defects.
+ - For applicable frontend or fullstack work, run local Playwright against the affected end-to-end flows during implementation and inspect screenshots to verify that the UI actually matches.
+ - Do not pad the test suite with superficial or fake tests.
+ - If verification is weak, say so plainly and fix it.
+
+ ## Frontend Product Integrity
+
+ - Do not place development, setup, scaffold, seed, or debug information in the product UI.
+ - Do not add demo banners, `database is working` messages, scaffold-password hints, setup reminders, or similar developer-facing content to frontend screens.
+ - If a screen exists, it should serve the real user or operator purpose it was created for.
+ - Keep setup and debug instructions in docs or operator tooling, not in the frontend interface.
+
+ ## Documentation Rules
+
+ - Keep docs aligned with the current implementation.
+ - During development, keep working technical docs under `docs/`.
+ - Maintain a test-coverage document under `docs/` that explains the major-flow coverage, the relevant test entry points, and any important coverage boundaries.
+ - Do not add or keep tests that only assert that docs directories or docs files exist.
+ - Delivery packaging may relocate docs, but that is not product behavior and should not be tested as application logic.
+ - Update technical docs when behavior, architecture, interfaces, runtime steps, or verification expectations change.
+ - The README must explain what the project is, how to run it, how to test it, and how to verify it.
+ - Do not leave misleading docs in place after changing behavior.
+
+ ## Engineering Quality Rules
+
+ - Keep architecture intentional and boundaries clean.
+ - Avoid giant mixed-responsibility files and tangled logic.
+ - Treat validation, security boundaries, secret handling, and logging hygiene as baseline concerns.
+ - Fix obvious quality problems early instead of stacking them up for later.
+
+ ## Secret Handling Rules
+
+ - Do not persist local secrets anywhere in the repository.
+ - Do not hardcode credentials, API keys, tokens, signing material, database passwords, certificate private keys, or similar sensitive values in code.
+ - Keep committed env/config examples limited to placeholders or clearly non-production defaults.
+ - If a real secret is needed, inject it through Docker-managed runtime configuration and keep it out of committed source files.
+ - Do not leak raw secrets into logs, docs, screenshots, telemetry, or operator-facing UI.
+ - Treat frontend and backend observability paths as secret-sensitive by default and redact accordingly.
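Redaction on observability paths can be as simple as a scrubbing log filter; a minimal stdlib sketch (the regex and key names are illustrative and would need tuning to a real project's secret shapes):

```python
import logging
import re

# Illustrative pattern: catches "password=...", "token: ...", "api_key=...".
SECRET_PATTERN = re.compile(
    r"(?i)\b(password|token|api[_-]?key|secret)\b\s*[=:]\s*\S+"
)

class RedactSecrets(logging.Filter):
    """Scrub secret-looking key/value pairs before a record is emitted."""
    def filter(self, record):
        record.msg = SECRET_PATTERN.sub(r"\1=[REDACTED]", str(record.msg))
        return True  # keep the record, just scrubbed

logger = logging.getLogger("app")
handler = logging.StreamHandler()
handler.addFilter(RedactSecrets())
logger.addHandler(handler)

# stderr: "login failed, password=[REDACTED]"
logger.warning("login failed, password=hunter2")
```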
87
+
88
+ ## Prototype Cleanup Rules
89
+
90
+ - Remove seeded credentials, weak demo defaults, login hints, test-account residue, and other prototype-only artifacts before reporting work complete.
91
+ - Do not leave login forms prefilled with credentials or keep obvious demo usernames/passwords in UI, config, or docs.
92
+ - Keep error surfaces sanitized for users and operators; do not leak internal paths, stack traces, database details, or hidden account-state details unless explicitly required.
93
+
94
+ ## Communication Rules
95
+
96
+ - Be direct, honest, and technically clear.
97
+ - When reporting progress, explain what changed, what you verified, and what still looks weak or unfinished.
98
+
99
+ ## Skills
100
+
101
+ - Before implementing against a library, framework, API, or tool, lean toward checking Context7 documentation first.
102
+ - If you need targeted outside research on a specific issue, behavior, example, or current fact, use Exa web search next.
103
+ - Then use the most relevant skill for the matter you are actively working on, or `find-skills` if the right skill is unclear.
104
+ - Use Context7, Exa, and skills to improve implementation quality, not as a substitute for engineering judgment.
105
+
106
+ ## Avoid
107
+
108
+ - coding before thinking
109
+ - fake confidence
110
+ - fake tests
111
+ - shallow verification
112
+ - hidden setup
113
+ - documentation drift
114
+ - using files outside the current working directory as hidden requirements