@kevinrabun/judges 3.123.4 → 3.124.0
- package/agents/api-design.judge.md +9 -0
- package/agents/cloud-readiness.judge.md +7 -0
- package/agents/compliance.judge.md +7 -0
- package/agents/data-security.judge.md +7 -0
- package/agents/database.judge.md +8 -0
- package/agents/documentation.judge.md +7 -0
- package/agents/observability.judge.md +8 -0
- package/agents/reliability.judge.md +9 -0
- package/agents/scalability.judge.md +9 -0
- package/agents/software-practices.judge.md +10 -0
- package/agents/testing.judge.md +6 -0
- package/dist/judges/api-design.js +9 -0
- package/dist/judges/cloud-readiness.js +7 -0
- package/dist/judges/compliance.js +7 -0
- package/dist/judges/data-security.js +7 -0
- package/dist/judges/database.js +8 -0
- package/dist/judges/documentation.js +7 -0
- package/dist/judges/observability.js +8 -0
- package/dist/judges/reliability.js +9 -0
- package/dist/judges/scalability.js +9 -0
- package/dist/judges/software-practices.js +10 -0
- package/dist/judges/testing.js +6 -0
- package/package.json +1 -1
- package/server.json +2 -2
package/agents/api-design.judge.md
CHANGED
@@ -32,6 +32,15 @@ RULES FOR YOUR EVALUATION:
 - Consider both API producer and consumer perspectives.
 - Score from 0-100 where 100 means exemplary API design.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- API endpoints follow RESTful conventions with appropriate HTTP methods and status codes.
+- Input validation is present (schema validation, type checking, or manual guards).
+- Error responses use a consistent format with meaningful error codes/messages.
+- Pagination or result limits are applied to collection endpoints.
+- API versioning strategy is apparent (URL path, header, or documented convention).
+- Response shapes are consistent across similar endpoints.
+If the code meets these criteria, the API design is sound. Do NOT flag stylistic preferences or theoretical improvements.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag API design issues in code that defines or implements HTTP/REST/GraphQL API endpoints.
 - Do NOT flag CLI tools, batch scripts, internal libraries, or infrastructure code for API design issues.
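The recognition criteria above (input validation, a consistent error envelope, result limits) can be sketched minimally as follows; all names and the response shape are illustrative, not from the package:

```python
MAX_PAGE_SIZE = 100  # pagination cap applied to collection endpoints

def error_response(code, message, status=400):
    """Consistent error shape reused across all endpoints."""
    return {"status": status, "body": {"error": {"code": code, "message": message}}}

def list_items(items, limit=None):
    """Collection endpoint: validates input and enforces a result limit."""
    if limit is not None and (not isinstance(limit, int) or limit < 1):
        return error_response("INVALID_LIMIT", "limit must be a positive integer")
    limit = min(limit or MAX_PAGE_SIZE, MAX_PAGE_SIZE)
    page = items[:limit]
    return {"status": 200, "body": {"items": page, "count": len(page)}}
```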
package/agents/cloud-readiness.judge.md
CHANGED
@@ -30,6 +30,13 @@ RULES FOR YOUR EVALUATION:
 - Recommend specific services or patterns (e.g., "Use Azure Key Vault instead of .env files in production").
 - Score from 0-100 where 100 means fully cloud-native.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Configuration is loaded from environment variables or external config stores.
+- The application binds to configurable ports, not hardcoded values.
+- File system usage is for temporary/cache purposes, not persistent state.
+- External dependencies (DB, cache, queues) use client libraries with connection pooling.
+If the code follows 12-factor app principles, it is cloud-ready. Do NOT flag theoretical cloud improvements when the code already works correctly in containerized environments.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag cloud-readiness issues in code that involves cloud deployment, containerization, or distributed systems.
 - Do NOT flag local development utilities, CLI tools, or scripts for cloud-readiness issues.
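A minimal sketch of the env-driven configuration pattern the cloud-readiness criteria describe; the variable names and fallback values are hypothetical development defaults:

```python
import os

def load_config(env=os.environ):
    """12-factor-style config: every value comes from the environment,
    with development-only fallbacks."""
    return {
        "port": int(env.get("PORT", "8080")),             # configurable bind port
        "db_url": env.get("DB_URL", "sqlite:///dev.db"),  # dev fallback, not a prod credential
        "log_level": env.get("LOG_LEVEL", "info"),
    }
```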
package/agents/compliance.judge.md
CHANGED
@@ -30,6 +30,13 @@ RULES FOR YOUR EVALUATION:
 - Recommend both code changes and process changes where applicable.
 - Score from 0-100 where 100 means fully compliant.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Audit logging is present for administrative and data-modifying operations.
+- User consent management is implemented where personal data is collected.
+- Data retention policies are documented or implemented in code.
+- Access controls are present for sensitive data and operations.
+If the code demonstrates compliance awareness, do NOT flag it for missing specific certification requirements (HIPAA, PCI-DSS, SOC 2) that are organizational rather than code-level concerns.
+
 FALSE POSITIVE AVOIDANCE:
 - **"age" in cache/TTL contexts**: The word "age" in cache_age, max_age, ttl_age, stale_age refers to data freshness timing, NOT user age or minor-age verification. Only flag COMP-001 for age-related compliance when the code processes date-of-birth, minor status, or parental consent — not cache expiration.
 
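The audit-logging criterion above can be sketched as a structured entry recorded for each data-modifying operation; the field names and example values are illustrative:

```python
import datetime

def audit_entry(actor, action, resource):
    """One audit-trail record for an administrative or data-modifying operation."""
    return {
        "actor": actor,
        "action": action,
        "resource": resource,
        # timezone-aware UTC timestamp so entries are comparable across hosts
        "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

entry = audit_entry("admin@example.com", "user.delete", "user:42")
```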
package/agents/data-security.judge.md
CHANGED
@@ -27,6 +27,13 @@ RULES FOR YOUR EVALUATION:
 - Reference standards where applicable (OWASP, NIST 800-53, GDPR Article numbers).
 - Score from 0-100 where 100 means fully compliant with no findings.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Sensitive data (PII, credentials, tokens) is not logged, returned in error responses, or stored in plaintext.
+- Cryptographic operations use standard libraries with recommended algorithms (AES-256-GCM, SHA-256+, bcrypt/scrypt).
+- Database credentials and API keys are loaded from environment variables or secrets managers.
+- Data at rest and in transit uses encryption (TLS, encrypted storage).
+If the code meets these criteria, data security is properly implemented. Do NOT flag theoretical data handling improvements.
+
 FALSE POSITIVE AVOIDANCE:
 - Do NOT flag code that uses established encryption libraries (crypto, sodium, bouncy castle) with standard configurations.
 - Data flowing through authenticated APIs with proper access controls is not a data security issue.
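Two of the criteria above (secrets from the environment, no raw tokens in logs) can be sketched like this; `SERVICE_API_KEY` is a hypothetical variable name:

```python
import hashlib
import os

def get_api_key():
    """Secrets come from the environment, never from source code."""
    key = os.environ.get("SERVICE_API_KEY")
    if key is None:
        raise RuntimeError("SERVICE_API_KEY is not set")
    return key

def loggable_token(token):
    """Log a short SHA-256 fingerprint instead of the raw token."""
    return hashlib.sha256(token.encode()).hexdigest()[:12]
```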
package/agents/database.judge.md
CHANGED
@@ -30,6 +30,14 @@ RULES FOR YOUR EVALUATION:
 - Flag patterns that will degrade as data volume grows.
 - Score from 0-100 where 100 means excellent database practices.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Queries use parameterized statements or an ORM with proper escaping.
+- Connection pooling is configured (not per-request connections).
+- Transactions are used for multi-step data modifications.
+- No SELECT * in production queries — columns are explicitly listed.
+- Indexes are referenced in query design or migration files.
+If the code uses an ORM with standard patterns, database practices are adequate. Do NOT flag ORM-generated queries or standard CRUD operations.
+
 FALSE POSITIVE AVOIDANCE:
 - **Environment variable fallback defaults**: Connection strings in os.environ.get('DB_URL', 'sqlite:///default.db') or process.env.DB_URL || 'localhost' are standard development defaults, NOT hardcoded production credentials. Only flag DB-001 when a connection string with real credentials appears outside an env-var fallback pattern.
 - **In-memory/embedded databases as defaults**: SQLite, DuckDB, or H2 defaults are normal for local development and testing. Flag only when production deployment docs are missing, not the default value itself.
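A minimal sketch of the database criteria above — parameterized statements, explicit column lists, and a transaction around a multi-step write. The schema is invented and sqlite3 is used only for illustration:

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Multi-step modification wrapped in a transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 0)])
transfer(conn, 1, 2, 40)
# explicit column list rather than SELECT *
rows = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
```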
package/agents/documentation.judge.md
CHANGED
@@ -32,6 +32,13 @@ RULES FOR YOUR EVALUATION:
 - Evaluate from the perspective of a new developer encountering the code for the first time.
 - Score from 0-100 where 100 means exemplary documentation.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Public/exported functions have JSDoc, docstrings, or clear descriptive names that convey purpose.
+- Complex algorithms or non-obvious logic have inline comments explaining the "why."
+- Function/method names and parameter names are self-documenting.
+- Module or file has a top-level comment or README describing its purpose (if it's a standalone module).
+If the code is reasonably self-documenting, do NOT demand exhaustive JSDoc on every function. Well-named code does not need redundant documentation.
+
 FALSE POSITIVE AVOIDANCE:
 - Do NOT flag missing documentation for self-documenting code (clear function names, obvious parameters, standard patterns).
 - Configuration files, data files, and infrastructure code have different documentation standards than application code.
package/agents/observability.judge.md
CHANGED
@@ -30,6 +30,14 @@ RULES FOR YOUR EVALUATION:
 - Evaluate whether the observability data would be useful during a production incident.
 - Score from 0-100 where 100 means fully observable and debuggable in production.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Logging is present at key decision points (request handling, error paths, important state changes).
+- Log statements include contextual data (request IDs, user identifiers, operation names).
+- Errors are logged with sufficient context for debugging (error message, stack trace, or relevant state).
+- Structured logging format is used (JSON, key-value pairs) rather than bare string concatenation.
+- Health check or readiness endpoints exist for services.
+If the code meets these criteria, observability is adequate. Do NOT flag missing OpenTelemetry, Prometheus metrics, or distributed tracing when basic logging is already present — those are operational enhancements, not code defects.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag observability issues in application code that handles requests, processes events, or performs business operations.
 - Do NOT flag utility functions, type definitions, or configuration files for missing observability.
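The structured-logging criterion above can be sketched as one JSON log line carrying contextual fields; the event and field names are illustrative:

```python
import json

def format_log(event, **context):
    """Render one structured log line (JSON) with contextual fields
    such as request IDs and user identifiers."""
    return json.dumps({"event": event, **context}, sort_keys=True)

line = format_log("order.created", request_id="req-123", user_id=42)
```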
package/agents/reliability.judge.md
CHANGED
@@ -32,6 +32,15 @@ RULES FOR YOUR EVALUATION:
 - Recommend specific resilience libraries or patterns with configuration examples.
 - Score from 0-100 where 100 means highly resilient and fault-tolerant.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- External I/O operations (network, database, file) have error handling (try/catch, error callbacks, or Result types).
+- HTTP responses check status codes before processing data.
+- Promises/async operations have rejection handling.
+- Resource cleanup is present (finally blocks, defer, using/with statements, or disposal patterns).
+- No fire-and-forget async operations on critical paths.
+- Timeouts are configured for external calls.
+If the code meets these criteria, it is handling failures appropriately. Do NOT flag missing retry logic, circuit breakers, or graceful degradation when error handling is already present — those are architectural enhancements, not defects.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag reliability issues in code that handles production workloads, external dependencies, or user-facing operations.
 - Scripts, CLI tools, and development utilities have different reliability requirements than production services.
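The reliability criteria above (timeout, error handling, cleanup on every path) combine in a sketch like this; the host and port are illustrative, and returning None on failure stands in for whatever failure signal a real service would use:

```python
import socket

def fetch_banner(host, port, timeout=2.0):
    """Read up to 64 bytes from a remote service, with a timeout,
    explicit error handling, and guaranteed cleanup."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return None  # explicit failure path instead of an unhandled crash
    try:
        return conn.recv(64)
    except OSError:
        return None
    finally:
        conn.close()  # resource cleanup on every path
```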
package/agents/scalability.judge.md
CHANGED
@@ -30,6 +30,15 @@ RULES FOR YOUR EVALUATION:
 - Recommend specific architectural patterns (CQRS, event sourcing, circuit breakers, etc.).
 - Score from 0-100 where 100 means fully scalable with no bottlenecks.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Database queries use parameterized statements and avoid N+1 patterns.
+- No unbounded in-memory collections (results are paginated or streamed).
+- Connection pooling is used for database connections (not per-request connections).
+- Background/async processing for long-running operations.
+- No global mutable state that would prevent horizontal scaling.
+- Caching strategy is present for frequently-accessed data.
+If the code meets these criteria, it is reasonably scalable. Do NOT flag theoretical scaling concerns for code that already follows standard patterns — only flag concrete bottlenecks that would fail under realistic load.
+
 FALSE POSITIVE AVOIDANCE:
 - **Distributed lock with local fallback**: When code implements a distributed lock (Redlock, Redis lock, etcd, Consul) as the primary mechanism AND uses a local lock (asyncio.Lock, threading.Lock) as a documented single-instance fallback, do NOT flag the local lock as a scaling issue. This is a correct graceful-degradation pattern.
 - **Two-tier locking**: If comments document a two-tier design (distributed for multi-instance, local for single-instance), accept the design. A compliance/dev tool should still function without external infrastructure.
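The "no unbounded in-memory collections" criterion above can be sketched as keyset-style pagination: each call returns a bounded page rather than materializing the whole result set. The data and page size are illustrative, and rows are assumed to be sorted by id:

```python
def page_after(rows, last_id, page_size=50):
    """Return up to page_size rows with id greater than last_id."""
    out = []
    for row in rows:
        if row["id"] > last_id:
            out.append(row)
            if len(out) == page_size:
                break
    return out

rows = [{"id": i} for i in range(1, 201)]
first = page_after(rows, 0, page_size=50)                # ids 1..50
second = page_after(rows, first[-1]["id"], page_size=50)  # ids 51..100
```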
package/agents/software-practices.judge.md
CHANGED
@@ -32,6 +32,16 @@ RULES FOR YOUR EVALUATION:
 - Reference Clean Code (Robert Martin), SOLID, DRY, KISS, YAGNI where applicable.
 - Score from 0-100 where 100 means exemplary software engineering.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Code follows consistent naming conventions and formatting throughout.
+- Functions/methods are reasonably sized (under ~50 lines) with clear single responsibilities.
+- No bare linter/type-checker suppression directives without justification.
+- Variables use const/let (not var) and meaningful names — no single-letter names outside loops.
+- Error handling is present for I/O operations and external calls.
+- No obvious code duplication (copy-paste blocks of 10+ lines).
+- Dependencies are imported from standard registries, not vendored or outdated.
+If the code meets these criteria, it follows good software practices. Do NOT manufacture findings about theoretical improvements.
+
 FALSE POSITIVE AVOIDANCE:
 - **Justified suppression comments**: type: ignore, noqa, eslint-disable, and similar comments that include a rationale (e.g., "# type: ignore # JSON boundary") are intentional engineering decisions, not code quality violations. Only flag SWDEV-001 for bare suppressions without justification.
 - **Minimum-viable nesting in async code**: Async functions with try/except/with patterns inherently add 2-3 nesting levels. Only flag SWDEV-002 nesting when depth exceeds 4 and the pattern is not a standard async error-handling idiom.
package/agents/testing.judge.md
CHANGED
@@ -32,6 +32,12 @@ RULES FOR YOUR EVALUATION:
 - Evaluate both the tests AND the testability of the code under test.
 - Score from 0-100 where 100 means comprehensive, well-structured test suite.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- The code being evaluated IS a test file, OR the code is a small utility/helper that would be tested at a higher level.
+- Type definitions, interfaces, enums, and configuration files do not need dedicated tests.
+- Generated code, data migrations, and infrastructure-as-code have different testing strategies.
+Do NOT flag code for "missing tests" unless it contains complex business logic or critical paths that clearly need unit test coverage.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag testing issues when evaluating test files or when application code lacks testability.
 - Do NOT flag production code for "missing tests" — tests exist in separate files that may not be provided.
package/dist/judges/api-design.js
CHANGED
@@ -31,6 +31,15 @@ RULES FOR YOUR EVALUATION:
 - Consider both API producer and consumer perspectives.
 - Score from 0-100 where 100 means exemplary API design.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- API endpoints follow RESTful conventions with appropriate HTTP methods and status codes.
+- Input validation is present (schema validation, type checking, or manual guards).
+- Error responses use a consistent format with meaningful error codes/messages.
+- Pagination or result limits are applied to collection endpoints.
+- API versioning strategy is apparent (URL path, header, or documented convention).
+- Response shapes are consistent across similar endpoints.
+If the code meets these criteria, the API design is sound. Do NOT flag stylistic preferences or theoretical improvements.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag API design issues in code that defines or implements HTTP/REST/GraphQL API endpoints.
 - Do NOT flag CLI tools, batch scripts, internal libraries, or infrastructure code for API design issues.
package/dist/judges/cloud-readiness.js
CHANGED
@@ -29,6 +29,13 @@ RULES FOR YOUR EVALUATION:
 - Recommend specific services or patterns (e.g., "Use Azure Key Vault instead of .env files in production").
 - Score from 0-100 where 100 means fully cloud-native.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Configuration is loaded from environment variables or external config stores.
+- The application binds to configurable ports, not hardcoded values.
+- File system usage is for temporary/cache purposes, not persistent state.
+- External dependencies (DB, cache, queues) use client libraries with connection pooling.
+If the code follows 12-factor app principles, it is cloud-ready. Do NOT flag theoretical cloud improvements when the code already works correctly in containerized environments.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag cloud-readiness issues in code that involves cloud deployment, containerization, or distributed systems.
 - Do NOT flag local development utilities, CLI tools, or scripts for cloud-readiness issues.
package/dist/judges/compliance.js
CHANGED
@@ -29,6 +29,13 @@ RULES FOR YOUR EVALUATION:
 - Recommend both code changes and process changes where applicable.
 - Score from 0-100 where 100 means fully compliant.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Audit logging is present for administrative and data-modifying operations.
+- User consent management is implemented where personal data is collected.
+- Data retention policies are documented or implemented in code.
+- Access controls are present for sensitive data and operations.
+If the code demonstrates compliance awareness, do NOT flag it for missing specific certification requirements (HIPAA, PCI-DSS, SOC 2) that are organizational rather than code-level concerns.
+
 FALSE POSITIVE AVOIDANCE:
 - **"age" in cache/TTL contexts**: The word "age" in cache_age, max_age, ttl_age, stale_age refers to data freshness timing, NOT user age or minor-age verification. Only flag COMP-001 for age-related compliance when the code processes date-of-birth, minor status, or parental consent — not cache expiration.
 
package/dist/judges/data-security.js
CHANGED
@@ -26,6 +26,13 @@ RULES FOR YOUR EVALUATION:
 - Reference standards where applicable (OWASP, NIST 800-53, GDPR Article numbers).
 - Score from 0-100 where 100 means fully compliant with no findings.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Sensitive data (PII, credentials, tokens) is not logged, returned in error responses, or stored in plaintext.
+- Cryptographic operations use standard libraries with recommended algorithms (AES-256-GCM, SHA-256+, bcrypt/scrypt).
+- Database credentials and API keys are loaded from environment variables or secrets managers.
+- Data at rest and in transit uses encryption (TLS, encrypted storage).
+If the code meets these criteria, data security is properly implemented. Do NOT flag theoretical data handling improvements.
+
 FALSE POSITIVE AVOIDANCE:
 - Do NOT flag code that uses established encryption libraries (crypto, sodium, bouncy castle) with standard configurations.
 - Data flowing through authenticated APIs with proper access controls is not a data security issue.
package/dist/judges/database.js
CHANGED
@@ -29,6 +29,14 @@ RULES FOR YOUR EVALUATION:
 - Flag patterns that will degrade as data volume grows.
 - Score from 0-100 where 100 means excellent database practices.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Queries use parameterized statements or an ORM with proper escaping.
+- Connection pooling is configured (not per-request connections).
+- Transactions are used for multi-step data modifications.
+- No SELECT * in production queries — columns are explicitly listed.
+- Indexes are referenced in query design or migration files.
+If the code uses an ORM with standard patterns, database practices are adequate. Do NOT flag ORM-generated queries or standard CRUD operations.
+
 FALSE POSITIVE AVOIDANCE:
 - **Environment variable fallback defaults**: Connection strings in os.environ.get('DB_URL', 'sqlite:///default.db') or process.env.DB_URL || 'localhost' are standard development defaults, NOT hardcoded production credentials. Only flag DB-001 when a connection string with real credentials appears outside an env-var fallback pattern.
 - **In-memory/embedded databases as defaults**: SQLite, DuckDB, or H2 defaults are normal for local development and testing. Flag only when production deployment docs are missing, not the default value itself.
package/dist/judges/documentation.js
CHANGED
@@ -31,6 +31,13 @@ RULES FOR YOUR EVALUATION:
 - Evaluate from the perspective of a new developer encountering the code for the first time.
 - Score from 0-100 where 100 means exemplary documentation.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Public/exported functions have JSDoc, docstrings, or clear descriptive names that convey purpose.
+- Complex algorithms or non-obvious logic have inline comments explaining the "why."
+- Function/method names and parameter names are self-documenting.
+- Module or file has a top-level comment or README describing its purpose (if it's a standalone module).
+If the code is reasonably self-documenting, do NOT demand exhaustive JSDoc on every function. Well-named code does not need redundant documentation.
+
 FALSE POSITIVE AVOIDANCE:
 - Do NOT flag missing documentation for self-documenting code (clear function names, obvious parameters, standard patterns).
 - Configuration files, data files, and infrastructure code have different documentation standards than application code.
package/dist/judges/observability.js
CHANGED
@@ -29,6 +29,14 @@ RULES FOR YOUR EVALUATION:
 - Evaluate whether the observability data would be useful during a production incident.
 - Score from 0-100 where 100 means fully observable and debuggable in production.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Logging is present at key decision points (request handling, error paths, important state changes).
+- Log statements include contextual data (request IDs, user identifiers, operation names).
+- Errors are logged with sufficient context for debugging (error message, stack trace, or relevant state).
+- Structured logging format is used (JSON, key-value pairs) rather than bare string concatenation.
+- Health check or readiness endpoints exist for services.
+If the code meets these criteria, observability is adequate. Do NOT flag missing OpenTelemetry, Prometheus metrics, or distributed tracing when basic logging is already present — those are operational enhancements, not code defects.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag observability issues in application code that handles requests, processes events, or performs business operations.
 - Do NOT flag utility functions, type definitions, or configuration files for missing observability.
package/dist/judges/reliability.js
CHANGED
@@ -31,6 +31,15 @@ RULES FOR YOUR EVALUATION:
 - Recommend specific resilience libraries or patterns with configuration examples.
 - Score from 0-100 where 100 means highly resilient and fault-tolerant.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- External I/O operations (network, database, file) have error handling (try/catch, error callbacks, or Result types).
+- HTTP responses check status codes before processing data.
+- Promises/async operations have rejection handling.
+- Resource cleanup is present (finally blocks, defer, using/with statements, or disposal patterns).
+- No fire-and-forget async operations on critical paths.
+- Timeouts are configured for external calls.
+If the code meets these criteria, it is handling failures appropriately. Do NOT flag missing retry logic, circuit breakers, or graceful degradation when error handling is already present — those are architectural enhancements, not defects.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag reliability issues in code that handles production workloads, external dependencies, or user-facing operations.
 - Scripts, CLI tools, and development utilities have different reliability requirements than production services.
package/dist/judges/scalability.js
CHANGED
@@ -29,6 +29,15 @@ RULES FOR YOUR EVALUATION:
 - Recommend specific architectural patterns (CQRS, event sourcing, circuit breakers, etc.).
 - Score from 0-100 where 100 means fully scalable with no bottlenecks.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Database queries use parameterized statements and avoid N+1 patterns.
+- No unbounded in-memory collections (results are paginated or streamed).
+- Connection pooling is used for database connections (not per-request connections).
+- Background/async processing for long-running operations.
+- No global mutable state that would prevent horizontal scaling.
+- Caching strategy is present for frequently-accessed data.
+If the code meets these criteria, it is reasonably scalable. Do NOT flag theoretical scaling concerns for code that already follows standard patterns — only flag concrete bottlenecks that would fail under realistic load.
+
 FALSE POSITIVE AVOIDANCE:
 - **Distributed lock with local fallback**: When code implements a distributed lock (Redlock, Redis lock, etcd, Consul) as the primary mechanism AND uses a local lock (asyncio.Lock, threading.Lock) as a documented single-instance fallback, do NOT flag the local lock as a scaling issue. This is a correct graceful-degradation pattern.
 - **Two-tier locking**: If comments document a two-tier design (distributed for multi-instance, local for single-instance), accept the design. A compliance/dev tool should still function without external infrastructure.
package/dist/judges/software-practices.js
CHANGED
@@ -31,6 +31,16 @@ RULES FOR YOUR EVALUATION:
 - Reference Clean Code (Robert Martin), SOLID, DRY, KISS, YAGNI where applicable.
 - Score from 0-100 where 100 means exemplary software engineering.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- Code follows consistent naming conventions and formatting throughout.
+- Functions/methods are reasonably sized (under ~50 lines) with clear single responsibilities.
+- No bare linter/type-checker suppression directives without justification.
+- Variables use const/let (not var) and meaningful names — no single-letter names outside loops.
+- Error handling is present for I/O operations and external calls.
+- No obvious code duplication (copy-paste blocks of 10+ lines).
+- Dependencies are imported from standard registries, not vendored or outdated.
+If the code meets these criteria, it follows good software practices. Do NOT manufacture findings about theoretical improvements.
+
 FALSE POSITIVE AVOIDANCE:
 - **Justified suppression comments**: type: ignore, noqa, eslint-disable, and similar comments that include a rationale (e.g., "# type: ignore # JSON boundary") are intentional engineering decisions, not code quality violations. Only flag SWDEV-001 for bare suppressions without justification.
 - **Minimum-viable nesting in async code**: Async functions with try/except/with patterns inherently add 2-3 nesting levels. Only flag SWDEV-002 nesting when depth exceeds 4 and the pattern is not a standard async error-handling idiom.
package/dist/judges/testing.js
CHANGED
@@ -31,6 +31,12 @@ RULES FOR YOUR EVALUATION:
 - Evaluate both the tests AND the testability of the code under test.
 - Score from 0-100 where 100 means comprehensive, well-structured test suite.
 
+CLEAN CODE RECOGNITION (if ALL of the following are true, report ZERO findings):
+- The code being evaluated IS a test file, OR the code is a small utility/helper that would be tested at a higher level.
+- Type definitions, interfaces, enums, and configuration files do not need dedicated tests.
+- Generated code, data migrations, and infrastructure-as-code have different testing strategies.
+Do NOT flag code for "missing tests" unless it contains complex business logic or critical paths that clearly need unit test coverage.
+
 FALSE POSITIVE AVOIDANCE:
 - Only flag testing issues when evaluating test files or when application code lacks testability.
 - Do NOT flag production code for "missing tests" — tests exist in separate files that may not be provided.
package/package.json
CHANGED
package/server.json
CHANGED
@@ -16,12 +16,12 @@
       "mimeType": "image/png"
     }
   ],
-  "version": "3.123.4",
+  "version": "3.124.0",
   "packages": [
     {
       "registryType": "npm",
       "identifier": "@kevinrabun/judges",
-      "version": "3.123.4",
+      "version": "3.124.0",
       "transport": {
         "type": "stdio"
       }