@wazir-dev/cli 1.0.0

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
Files changed (629)
  1. package/AGENTS.md +111 -0
  2. package/CHANGELOG.md +14 -0
  3. package/CONTRIBUTING.md +101 -0
  4. package/LICENSE +21 -0
  5. package/README.md +314 -0
  6. package/assets/composition-engine.mmd +34 -0
  7. package/assets/demo-script.sh +17 -0
  8. package/assets/logo-dark.svg +14 -0
  9. package/assets/logo.svg +14 -0
  10. package/assets/pipeline.mmd +39 -0
  11. package/assets/record-demo.sh +51 -0
  12. package/docs/README.md +51 -0
  13. package/docs/adapters/context-mode.md +60 -0
  14. package/docs/concepts/architecture.md +87 -0
  15. package/docs/concepts/artifact-model.md +60 -0
  16. package/docs/concepts/composition-engine.md +36 -0
  17. package/docs/concepts/indexing-and-recall.md +160 -0
  18. package/docs/concepts/observability.md +41 -0
  19. package/docs/concepts/roles-and-workflows.md +59 -0
  20. package/docs/concepts/terminology-policy.md +27 -0
  21. package/docs/getting-started/01-installation.md +78 -0
  22. package/docs/getting-started/02-first-run.md +102 -0
  23. package/docs/getting-started/03-adding-to-project.md +15 -0
  24. package/docs/getting-started/04-host-setup.md +15 -0
  25. package/docs/guides/ci-integration.md +15 -0
  26. package/docs/guides/creating-skills.md +15 -0
  27. package/docs/guides/expertise-module-authoring.md +15 -0
  28. package/docs/guides/hook-development.md +15 -0
  29. package/docs/guides/memory-and-learnings.md +34 -0
  30. package/docs/guides/multi-host-export.md +15 -0
  31. package/docs/guides/troubleshooting.md +101 -0
  32. package/docs/guides/writing-custom-roles.md +15 -0
  33. package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
  34. package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
  35. package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
  36. package/docs/readmes/INDEX.md +99 -0
  37. package/docs/readmes/features/expertise/README.md +171 -0
  38. package/docs/readmes/features/exports/README.md +222 -0
  39. package/docs/readmes/features/hooks/README.md +103 -0
  40. package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
  41. package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
  42. package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
  43. package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
  44. package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
  45. package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
  46. package/docs/readmes/features/hooks/session-start.md +119 -0
  47. package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
  48. package/docs/readmes/features/roles/README.md +157 -0
  49. package/docs/readmes/features/roles/clarifier.md +152 -0
  50. package/docs/readmes/features/roles/content-author.md +190 -0
  51. package/docs/readmes/features/roles/designer.md +193 -0
  52. package/docs/readmes/features/roles/executor.md +184 -0
  53. package/docs/readmes/features/roles/learner.md +210 -0
  54. package/docs/readmes/features/roles/planner.md +182 -0
  55. package/docs/readmes/features/roles/researcher.md +164 -0
  56. package/docs/readmes/features/roles/reviewer.md +184 -0
  57. package/docs/readmes/features/roles/specifier.md +162 -0
  58. package/docs/readmes/features/roles/verifier.md +215 -0
  59. package/docs/readmes/features/schemas/README.md +178 -0
  60. package/docs/readmes/features/skills/README.md +63 -0
  61. package/docs/readmes/features/skills/brainstorming.md +96 -0
  62. package/docs/readmes/features/skills/debugging.md +148 -0
  63. package/docs/readmes/features/skills/design.md +120 -0
  64. package/docs/readmes/features/skills/prepare-next.md +109 -0
  65. package/docs/readmes/features/skills/run-audit.md +159 -0
  66. package/docs/readmes/features/skills/scan-project.md +109 -0
  67. package/docs/readmes/features/skills/self-audit.md +176 -0
  68. package/docs/readmes/features/skills/tdd.md +137 -0
  69. package/docs/readmes/features/skills/using-skills.md +92 -0
  70. package/docs/readmes/features/skills/verification.md +120 -0
  71. package/docs/readmes/features/skills/writing-plans.md +104 -0
  72. package/docs/readmes/features/tooling/README.md +320 -0
  73. package/docs/readmes/features/workflows/README.md +186 -0
  74. package/docs/readmes/features/workflows/author.md +181 -0
  75. package/docs/readmes/features/workflows/clarify.md +154 -0
  76. package/docs/readmes/features/workflows/design-review.md +171 -0
  77. package/docs/readmes/features/workflows/design.md +169 -0
  78. package/docs/readmes/features/workflows/discover.md +162 -0
  79. package/docs/readmes/features/workflows/execute.md +173 -0
  80. package/docs/readmes/features/workflows/learn.md +167 -0
  81. package/docs/readmes/features/workflows/plan-review.md +165 -0
  82. package/docs/readmes/features/workflows/plan.md +170 -0
  83. package/docs/readmes/features/workflows/prepare-next.md +167 -0
  84. package/docs/readmes/features/workflows/review.md +169 -0
  85. package/docs/readmes/features/workflows/run-audit.md +191 -0
  86. package/docs/readmes/features/workflows/spec-challenge.md +159 -0
  87. package/docs/readmes/features/workflows/specify.md +160 -0
  88. package/docs/readmes/features/workflows/verify.md +177 -0
  89. package/docs/readmes/packages/README.md +50 -0
  90. package/docs/readmes/packages/ajv.md +117 -0
  91. package/docs/readmes/packages/context-mode.md +118 -0
  92. package/docs/readmes/packages/gray-matter.md +116 -0
  93. package/docs/readmes/packages/node-test.md +137 -0
  94. package/docs/readmes/packages/yaml.md +112 -0
  95. package/docs/reference/configuration-reference.md +159 -0
  96. package/docs/reference/expertise-index.md +52 -0
  97. package/docs/reference/git-flow.md +43 -0
  98. package/docs/reference/hooks.md +87 -0
  99. package/docs/reference/host-exports.md +50 -0
  100. package/docs/reference/launch-checklist.md +172 -0
  101. package/docs/reference/marketplace-listings.md +76 -0
  102. package/docs/reference/release-process.md +34 -0
  103. package/docs/reference/roles-reference.md +77 -0
  104. package/docs/reference/skills.md +33 -0
  105. package/docs/reference/templates.md +29 -0
  106. package/docs/reference/tooling-cli.md +94 -0
  107. package/docs/truth-claims.yaml +222 -0
  108. package/expertise/PROGRESS.md +63 -0
  109. package/expertise/README.md +18 -0
  110. package/expertise/antipatterns/PROGRESS.md +56 -0
  111. package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
  112. package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
  113. package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
  114. package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
  115. package/expertise/antipatterns/backend/index.md +24 -0
  116. package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
  117. package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
  118. package/expertise/antipatterns/code/async-antipatterns.md +622 -0
  119. package/expertise/antipatterns/code/code-smells.md +1186 -0
  120. package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
  121. package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
  122. package/expertise/antipatterns/code/index.md +27 -0
  123. package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
  124. package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
  125. package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
  126. package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
  127. package/expertise/antipatterns/design/dark-patterns.md +1121 -0
  128. package/expertise/antipatterns/design/index.md +22 -0
  129. package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
  130. package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
  131. package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
  132. package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
  133. package/expertise/antipatterns/frontend/index.md +23 -0
  134. package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
  135. package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
  136. package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
  137. package/expertise/antipatterns/index.md +31 -0
  138. package/expertise/antipatterns/performance/index.md +20 -0
  139. package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
  140. package/expertise/antipatterns/performance/premature-optimization.md +623 -0
  141. package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
  142. package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
  143. package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
  144. package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
  145. package/expertise/antipatterns/process/index.md +23 -0
  146. package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
  147. package/expertise/antipatterns/security/index.md +20 -0
  148. package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
  149. package/expertise/antipatterns/security/security-theater.md +843 -0
  150. package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
  151. package/expertise/architecture/PROGRESS.md +70 -0
  152. package/expertise/architecture/data/caching-architecture.md +671 -0
  153. package/expertise/architecture/data/data-consistency.md +574 -0
  154. package/expertise/architecture/data/data-modeling.md +536 -0
  155. package/expertise/architecture/data/event-streams-and-queues.md +634 -0
  156. package/expertise/architecture/data/index.md +25 -0
  157. package/expertise/architecture/data/search-architecture.md +663 -0
  158. package/expertise/architecture/data/sql-vs-nosql.md +708 -0
  159. package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
  160. package/expertise/architecture/decisions/build-vs-buy.md +616 -0
  161. package/expertise/architecture/decisions/index.md +23 -0
  162. package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
  163. package/expertise/architecture/decisions/technology-selection.md +616 -0
  164. package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
  165. package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
  166. package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
  167. package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
  168. package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
  169. package/expertise/architecture/distributed/index.md +25 -0
  170. package/expertise/architecture/distributed/saga-pattern.md +797 -0
  171. package/expertise/architecture/foundations/architectural-thinking.md +460 -0
  172. package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
  173. package/expertise/architecture/foundations/design-principles-solid.md +649 -0
  174. package/expertise/architecture/foundations/domain-driven-design.md +719 -0
  175. package/expertise/architecture/foundations/index.md +25 -0
  176. package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
  177. package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
  178. package/expertise/architecture/index.md +34 -0
  179. package/expertise/architecture/integration/api-design-graphql.md +638 -0
  180. package/expertise/architecture/integration/api-design-grpc.md +804 -0
  181. package/expertise/architecture/integration/api-design-rest.md +892 -0
  182. package/expertise/architecture/integration/index.md +25 -0
  183. package/expertise/architecture/integration/third-party-integration.md +795 -0
  184. package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
  185. package/expertise/architecture/integration/websockets-realtime.md +791 -0
  186. package/expertise/architecture/mobile-architecture/index.md +22 -0
  187. package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
  188. package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
  189. package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
  190. package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
  191. package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
  192. package/expertise/architecture/patterns/event-driven.md +797 -0
  193. package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
  194. package/expertise/architecture/patterns/index.md +27 -0
  195. package/expertise/architecture/patterns/layered-architecture.md +736 -0
  196. package/expertise/architecture/patterns/microservices.md +753 -0
  197. package/expertise/architecture/patterns/modular-monolith.md +692 -0
  198. package/expertise/architecture/patterns/monolith.md +626 -0
  199. package/expertise/architecture/patterns/plugin-architecture.md +735 -0
  200. package/expertise/architecture/patterns/serverless.md +780 -0
  201. package/expertise/architecture/scaling/database-scaling.md +615 -0
  202. package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
  203. package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
  204. package/expertise/architecture/scaling/index.md +24 -0
  205. package/expertise/architecture/scaling/multi-tenancy.md +800 -0
  206. package/expertise/architecture/scaling/stateless-design.md +787 -0
  207. package/expertise/backend/embedded-firmware.md +625 -0
  208. package/expertise/backend/go.md +853 -0
  209. package/expertise/backend/index.md +24 -0
  210. package/expertise/backend/java-spring.md +448 -0
  211. package/expertise/backend/node-typescript.md +625 -0
  212. package/expertise/backend/python-fastapi.md +724 -0
  213. package/expertise/backend/rust.md +458 -0
  214. package/expertise/backend/solidity.md +711 -0
  215. package/expertise/composition-map.yaml +443 -0
  216. package/expertise/content/foundations/content-modeling.md +395 -0
  217. package/expertise/content/foundations/editorial-standards.md +449 -0
  218. package/expertise/content/foundations/index.md +24 -0
  219. package/expertise/content/foundations/microcopy.md +455 -0
  220. package/expertise/content/foundations/terminology-governance.md +509 -0
  221. package/expertise/content/index.md +34 -0
  222. package/expertise/content/patterns/accessibility-copy.md +518 -0
  223. package/expertise/content/patterns/index.md +24 -0
  224. package/expertise/content/patterns/notification-content.md +433 -0
  225. package/expertise/content/patterns/sample-content.md +486 -0
  226. package/expertise/content/patterns/state-copy.md +439 -0
  227. package/expertise/design/PROGRESS.md +58 -0
  228. package/expertise/design/disciplines/dark-mode-theming.md +577 -0
  229. package/expertise/design/disciplines/design-systems.md +595 -0
  230. package/expertise/design/disciplines/index.md +25 -0
  231. package/expertise/design/disciplines/information-architecture.md +800 -0
  232. package/expertise/design/disciplines/interaction-design.md +788 -0
  233. package/expertise/design/disciplines/responsive-design.md +552 -0
  234. package/expertise/design/disciplines/usability-testing.md +516 -0
  235. package/expertise/design/disciplines/user-research.md +792 -0
  236. package/expertise/design/foundations/accessibility-design.md +796 -0
  237. package/expertise/design/foundations/color-theory.md +797 -0
  238. package/expertise/design/foundations/iconography.md +795 -0
  239. package/expertise/design/foundations/index.md +26 -0
  240. package/expertise/design/foundations/motion-and-animation.md +653 -0
  241. package/expertise/design/foundations/rtl-design.md +585 -0
  242. package/expertise/design/foundations/spacing-and-layout.md +607 -0
  243. package/expertise/design/foundations/typography.md +800 -0
  244. package/expertise/design/foundations/visual-hierarchy.md +761 -0
  245. package/expertise/design/index.md +32 -0
  246. package/expertise/design/patterns/authentication-flows.md +474 -0
  247. package/expertise/design/patterns/content-consumption.md +789 -0
  248. package/expertise/design/patterns/data-display.md +618 -0
  249. package/expertise/design/patterns/e-commerce.md +1494 -0
  250. package/expertise/design/patterns/feedback-and-states.md +642 -0
  251. package/expertise/design/patterns/forms-and-input.md +819 -0
  252. package/expertise/design/patterns/gamification.md +801 -0
  253. package/expertise/design/patterns/index.md +31 -0
  254. package/expertise/design/patterns/microinteractions.md +449 -0
  255. package/expertise/design/patterns/navigation.md +800 -0
  256. package/expertise/design/patterns/notifications.md +705 -0
  257. package/expertise/design/patterns/onboarding.md +700 -0
  258. package/expertise/design/patterns/search-and-filter.md +601 -0
  259. package/expertise/design/patterns/settings-and-preferences.md +768 -0
  260. package/expertise/design/patterns/social-and-community.md +748 -0
  261. package/expertise/design/platforms/desktop-native.md +612 -0
  262. package/expertise/design/platforms/index.md +25 -0
  263. package/expertise/design/platforms/mobile-android.md +825 -0
  264. package/expertise/design/platforms/mobile-cross-platform.md +983 -0
  265. package/expertise/design/platforms/mobile-ios.md +699 -0
  266. package/expertise/design/platforms/tablet.md +794 -0
  267. package/expertise/design/platforms/web-dashboard.md +790 -0
  268. package/expertise/design/platforms/web-responsive.md +550 -0
  269. package/expertise/design/psychology/behavioral-nudges.md +449 -0
  270. package/expertise/design/psychology/cognitive-load.md +1191 -0
  271. package/expertise/design/psychology/error-psychology.md +778 -0
  272. package/expertise/design/psychology/index.md +22 -0
  273. package/expertise/design/psychology/persuasive-design.md +736 -0
  274. package/expertise/design/psychology/user-mental-models.md +623 -0
  275. package/expertise/design/tooling/open-pencil.md +266 -0
  276. package/expertise/frontend/angular.md +1073 -0
  277. package/expertise/frontend/desktop-electron.md +546 -0
  278. package/expertise/frontend/flutter.md +782 -0
  279. package/expertise/frontend/index.md +27 -0
  280. package/expertise/frontend/native-android.md +409 -0
  281. package/expertise/frontend/native-ios.md +490 -0
  282. package/expertise/frontend/react-native.md +1160 -0
  283. package/expertise/frontend/react.md +808 -0
  284. package/expertise/frontend/vue.md +1089 -0
  285. package/expertise/humanize/domain-rules-code.md +79 -0
  286. package/expertise/humanize/domain-rules-content.md +67 -0
  287. package/expertise/humanize/domain-rules-technical-docs.md +56 -0
  288. package/expertise/humanize/index.md +35 -0
  289. package/expertise/humanize/self-audit-checklist.md +87 -0
  290. package/expertise/humanize/sentence-patterns.md +218 -0
  291. package/expertise/humanize/vocabulary-blacklist.md +105 -0
  292. package/expertise/i18n/PROGRESS.md +65 -0
  293. package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
  294. package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
  295. package/expertise/i18n/advanced/complex-scripts.md +30 -0
  296. package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
  297. package/expertise/i18n/advanced/testing-i18n.md +28 -0
  298. package/expertise/i18n/content/content-adaptation.md +23 -0
  299. package/expertise/i18n/content/locale-specific-formatting.md +23 -0
  300. package/expertise/i18n/content/machine-translation-integration.md +28 -0
  301. package/expertise/i18n/content/translation-management.md +29 -0
  302. package/expertise/i18n/foundations/date-time-calendars.md +67 -0
  303. package/expertise/i18n/foundations/i18n-architecture.md +272 -0
  304. package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
  305. package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
  306. package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
  307. package/expertise/i18n/foundations/string-externalization.md +236 -0
  308. package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
  309. package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
  310. package/expertise/i18n/index.md +38 -0
  311. package/expertise/i18n/platform/backend-i18n.md +31 -0
  312. package/expertise/i18n/platform/flutter-i18n.md +148 -0
  313. package/expertise/i18n/platform/native-android-i18n.md +36 -0
  314. package/expertise/i18n/platform/native-ios-i18n.md +36 -0
  315. package/expertise/i18n/platform/react-i18n.md +103 -0
  316. package/expertise/i18n/platform/web-css-i18n.md +81 -0
  317. package/expertise/i18n/rtl/arabic-specific.md +175 -0
  318. package/expertise/i18n/rtl/hebrew-specific.md +149 -0
  319. package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
  320. package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
  321. package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
  322. package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
  323. package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
  324. package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
  325. package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
  326. package/expertise/i18n/rtl/rtl-typography.md +160 -0
  327. package/expertise/index.md +113 -0
  328. package/expertise/index.yaml +216 -0
  329. package/expertise/infrastructure/cloud-aws.md +597 -0
  330. package/expertise/infrastructure/cloud-gcp.md +599 -0
  331. package/expertise/infrastructure/cybersecurity.md +816 -0
  332. package/expertise/infrastructure/database-mongodb.md +447 -0
  333. package/expertise/infrastructure/database-postgres.md +400 -0
  334. package/expertise/infrastructure/devops-cicd.md +787 -0
  335. package/expertise/infrastructure/index.md +27 -0
  336. package/expertise/performance/PROGRESS.md +50 -0
  337. package/expertise/performance/backend/api-latency.md +1204 -0
  338. package/expertise/performance/backend/background-jobs.md +506 -0
  339. package/expertise/performance/backend/connection-pooling.md +1209 -0
  340. package/expertise/performance/backend/database-query-optimization.md +515 -0
  341. package/expertise/performance/backend/index.md +23 -0
  342. package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
  343. package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
  344. package/expertise/performance/foundations/caching-strategies.md +489 -0
  345. package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
  346. package/expertise/performance/foundations/index.md +24 -0
  347. package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
  348. package/expertise/performance/foundations/memory-management.md +964 -0
  349. package/expertise/performance/foundations/performance-budgets.md +1314 -0
  350. package/expertise/performance/index.md +31 -0
  351. package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
  352. package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
  353. package/expertise/performance/infrastructure/index.md +22 -0
  354. package/expertise/performance/infrastructure/load-balancing.md +1081 -0
  355. package/expertise/performance/infrastructure/observability.md +1079 -0
  356. package/expertise/performance/mobile/index.md +23 -0
  357. package/expertise/performance/mobile/mobile-animations.md +544 -0
  358. package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
  359. package/expertise/performance/mobile/mobile-network.md +452 -0
  360. package/expertise/performance/mobile/mobile-rendering.md +599 -0
  361. package/expertise/performance/mobile/mobile-startup-time.md +505 -0
  362. package/expertise/performance/platform-specific/flutter-performance.md +647 -0
  363. package/expertise/performance/platform-specific/index.md +22 -0
  364. package/expertise/performance/platform-specific/node-performance.md +1307 -0
  365. package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
  366. package/expertise/performance/platform-specific/react-performance.md +1403 -0
  367. package/expertise/performance/web/bundle-optimization.md +1239 -0
  368. package/expertise/performance/web/image-and-media.md +636 -0
  369. package/expertise/performance/web/index.md +24 -0
  370. package/expertise/performance/web/network-optimization.md +1133 -0
  371. package/expertise/performance/web/rendering-performance.md +1098 -0
  372. package/expertise/performance/web/ssr-and-hydration.md +918 -0
  373. package/expertise/performance/web/web-vitals.md +1374 -0
  374. package/expertise/quality/accessibility.md +985 -0
  375. package/expertise/quality/evidence-based-verification.md +499 -0
  376. package/expertise/quality/index.md +24 -0
  377. package/expertise/quality/ml-model-audit.md +614 -0
  378. package/expertise/quality/performance.md +600 -0
  379. package/expertise/quality/testing-api.md +891 -0
  380. package/expertise/quality/testing-mobile.md +496 -0
  381. package/expertise/quality/testing-web.md +849 -0
  382. package/expertise/security/PROGRESS.md +54 -0
  383. package/expertise/security/agentic-identity.md +540 -0
  384. package/expertise/security/compliance-frameworks.md +601 -0
  385. package/expertise/security/data/data-encryption.md +364 -0
  386. package/expertise/security/data/data-privacy-gdpr.md +692 -0
  387. package/expertise/security/data/database-security.md +1171 -0
  388. package/expertise/security/data/index.md +22 -0
  389. package/expertise/security/data/pii-handling.md +531 -0
  390. package/expertise/security/foundations/authentication.md +1041 -0
  391. package/expertise/security/foundations/authorization.md +603 -0
  392. package/expertise/security/foundations/cryptography.md +1001 -0
  393. package/expertise/security/foundations/index.md +25 -0
  394. package/expertise/security/foundations/owasp-top-10.md +1354 -0
  395. package/expertise/security/foundations/secrets-management.md +1217 -0
  396. package/expertise/security/foundations/secure-sdlc.md +700 -0
  397. package/expertise/security/foundations/supply-chain-security.md +698 -0
  398. package/expertise/security/index.md +31 -0
  399. package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
  400. package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
  401. package/expertise/security/infrastructure/container-security.md +721 -0
  402. package/expertise/security/infrastructure/incident-response.md +1295 -0
  403. package/expertise/security/infrastructure/index.md +24 -0
  404. package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
  405. package/expertise/security/infrastructure/network-security.md +1337 -0
  406. package/expertise/security/mobile/index.md +23 -0
  407. package/expertise/security/mobile/mobile-android-security.md +1218 -0
  408. package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
  409. package/expertise/security/mobile/mobile-data-storage.md +1265 -0
  410. package/expertise/security/mobile/mobile-ios-security.md +1401 -0
  411. package/expertise/security/mobile/mobile-network-security.md +1520 -0
  412. package/expertise/security/smart-contract-security.md +594 -0
  413. package/expertise/security/testing/index.md +22 -0
  414. package/expertise/security/testing/penetration-testing.md +1258 -0
  415. package/expertise/security/testing/security-code-review.md +1765 -0
  416. package/expertise/security/testing/threat-modeling.md +1074 -0
  417. package/expertise/security/testing/vulnerability-scanning.md +1062 -0
  418. package/expertise/security/web/api-security.md +586 -0
  419. package/expertise/security/web/cors-and-headers.md +433 -0
  420. package/expertise/security/web/csrf.md +562 -0
  421. package/expertise/security/web/file-upload.md +1477 -0
  422. package/expertise/security/web/index.md +25 -0
  423. package/expertise/security/web/injection.md +1375 -0
  424. package/expertise/security/web/session-management.md +1101 -0
  425. package/expertise/security/web/xss.md +1158 -0
  426. package/exports/README.md +17 -0
  427. package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
  428. package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
  429. package/exports/hosts/claude/.claude/agents/designer.md +55 -0
  430. package/exports/hosts/claude/.claude/agents/executor.md +55 -0
  431. package/exports/hosts/claude/.claude/agents/learner.md +51 -0
  432. package/exports/hosts/claude/.claude/agents/planner.md +53 -0
  433. package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
  434. package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
  435. package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
  436. package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
  437. package/exports/hosts/claude/.claude/commands/author.md +42 -0
  438. package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
  439. package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
  440. package/exports/hosts/claude/.claude/commands/design.md +44 -0
  441. package/exports/hosts/claude/.claude/commands/discover.md +37 -0
  442. package/exports/hosts/claude/.claude/commands/execute.md +48 -0
  443. package/exports/hosts/claude/.claude/commands/learn.md +38 -0
  444. package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
  445. package/exports/hosts/claude/.claude/commands/plan.md +39 -0
  446. package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
  447. package/exports/hosts/claude/.claude/commands/review.md +40 -0
  448. package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
  449. package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
  450. package/exports/hosts/claude/.claude/commands/specify.md +38 -0
  451. package/exports/hosts/claude/.claude/commands/verify.md +37 -0
  452. package/exports/hosts/claude/.claude/settings.json +34 -0
  453. package/exports/hosts/claude/CLAUDE.md +19 -0
  454. package/exports/hosts/claude/export.manifest.json +38 -0
  455. package/exports/hosts/claude/host-package.json +67 -0
  456. package/exports/hosts/codex/AGENTS.md +19 -0
  457. package/exports/hosts/codex/export.manifest.json +38 -0
  458. package/exports/hosts/codex/host-package.json +41 -0
  459. package/exports/hosts/cursor/.cursor/hooks.json +16 -0
  460. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
  461. package/exports/hosts/cursor/export.manifest.json +38 -0
  462. package/exports/hosts/cursor/host-package.json +42 -0
  463. package/exports/hosts/gemini/GEMINI.md +19 -0
  464. package/exports/hosts/gemini/export.manifest.json +38 -0
  465. package/exports/hosts/gemini/host-package.json +41 -0
  466. package/hooks/README.md +18 -0
  467. package/hooks/definitions/loop_cap_guard.yaml +21 -0
  468. package/hooks/definitions/post_tool_capture.yaml +24 -0
  469. package/hooks/definitions/pre_compact_summary.yaml +19 -0
  470. package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
  471. package/hooks/definitions/protected_path_write_guard.yaml +19 -0
  472. package/hooks/definitions/session_start.yaml +19 -0
  473. package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
  474. package/hooks/loop-cap-guard +17 -0
  475. package/hooks/post-tool-lint +36 -0
  476. package/hooks/protected-path-write-guard +17 -0
  477. package/hooks/session-start +41 -0
  478. package/llms-full.txt +2355 -0
  479. package/llms.txt +43 -0
  480. package/package.json +79 -0
  481. package/roles/README.md +20 -0
  482. package/roles/clarifier.md +42 -0
  483. package/roles/content-author.md +63 -0
  484. package/roles/designer.md +55 -0
  485. package/roles/executor.md +55 -0
  486. package/roles/learner.md +51 -0
  487. package/roles/planner.md +53 -0
  488. package/roles/researcher.md +43 -0
  489. package/roles/reviewer.md +54 -0
  490. package/roles/specifier.md +47 -0
  491. package/roles/verifier.md +71 -0
  492. package/schemas/README.md +24 -0
  493. package/schemas/accepted-learning.schema.json +20 -0
  494. package/schemas/author-artifact.schema.json +156 -0
  495. package/schemas/clarification.schema.json +19 -0
  496. package/schemas/design-artifact.schema.json +80 -0
  497. package/schemas/docs-claim.schema.json +18 -0
  498. package/schemas/export-manifest.schema.json +20 -0
  499. package/schemas/hook.schema.json +67 -0
  500. package/schemas/host-export-package.schema.json +18 -0
  501. package/schemas/implementation-plan.schema.json +19 -0
  502. package/schemas/proposed-learning.schema.json +19 -0
  503. package/schemas/research.schema.json +18 -0
  504. package/schemas/review.schema.json +29 -0
  505. package/schemas/run-manifest.schema.json +18 -0
  506. package/schemas/spec-challenge.schema.json +18 -0
  507. package/schemas/spec.schema.json +20 -0
  508. package/schemas/usage.schema.json +102 -0
  509. package/schemas/verification-proof.schema.json +29 -0
  510. package/schemas/wazir-manifest.schema.json +173 -0
  511. package/skills/README.md +40 -0
  512. package/skills/brainstorming/SKILL.md +77 -0
  513. package/skills/debugging/SKILL.md +50 -0
  514. package/skills/design/SKILL.md +61 -0
  515. package/skills/dispatching-parallel-agents/SKILL.md +128 -0
  516. package/skills/executing-plans/SKILL.md +70 -0
  517. package/skills/finishing-a-development-branch/SKILL.md +169 -0
  518. package/skills/humanize/SKILL.md +123 -0
  519. package/skills/init-pipeline/SKILL.md +124 -0
  520. package/skills/prepare-next/SKILL.md +20 -0
  521. package/skills/receiving-code-review/SKILL.md +123 -0
  522. package/skills/requesting-code-review/SKILL.md +105 -0
  523. package/skills/requesting-code-review/code-reviewer.md +108 -0
  524. package/skills/run-audit/SKILL.md +197 -0
  525. package/skills/scan-project/SKILL.md +41 -0
  526. package/skills/self-audit/SKILL.md +153 -0
  527. package/skills/subagent-driven-development/SKILL.md +154 -0
  528. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
  529. package/skills/subagent-driven-development/implementer-prompt.md +102 -0
  530. package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
  531. package/skills/tdd/SKILL.md +23 -0
  532. package/skills/using-git-worktrees/SKILL.md +163 -0
  533. package/skills/using-skills/SKILL.md +95 -0
  534. package/skills/verification/SKILL.md +22 -0
  535. package/skills/wazir/SKILL.md +463 -0
  536. package/skills/writing-plans/SKILL.md +30 -0
  537. package/skills/writing-skills/SKILL.md +157 -0
  538. package/skills/writing-skills/anthropic-best-practices.md +122 -0
  539. package/skills/writing-skills/persuasion-principles.md +50 -0
  540. package/templates/README.md +20 -0
  541. package/templates/artifacts/README.md +10 -0
  542. package/templates/artifacts/accepted-learning.md +19 -0
  543. package/templates/artifacts/accepted-learning.template.json +12 -0
  544. package/templates/artifacts/author.md +74 -0
  545. package/templates/artifacts/author.template.json +19 -0
  546. package/templates/artifacts/clarification.md +21 -0
  547. package/templates/artifacts/clarification.template.json +12 -0
  548. package/templates/artifacts/execute-notes.md +19 -0
  549. package/templates/artifacts/implementation-plan.md +21 -0
  550. package/templates/artifacts/implementation-plan.template.json +11 -0
  551. package/templates/artifacts/learning-proposal.md +19 -0
  552. package/templates/artifacts/next-run-handoff.md +21 -0
  553. package/templates/artifacts/plan-review.md +19 -0
  554. package/templates/artifacts/proposed-learning.template.json +12 -0
  555. package/templates/artifacts/research.md +21 -0
  556. package/templates/artifacts/research.template.json +12 -0
  557. package/templates/artifacts/review-findings.md +19 -0
  558. package/templates/artifacts/review.template.json +11 -0
  559. package/templates/artifacts/run-manifest.template.json +8 -0
  560. package/templates/artifacts/spec-challenge.md +19 -0
  561. package/templates/artifacts/spec-challenge.template.json +11 -0
  562. package/templates/artifacts/spec.md +21 -0
  563. package/templates/artifacts/spec.template.json +12 -0
  564. package/templates/artifacts/verification-proof.md +19 -0
  565. package/templates/artifacts/verification-proof.template.json +11 -0
  566. package/templates/examples/accepted-learning.example.json +14 -0
  567. package/templates/examples/author.example.json +152 -0
  568. package/templates/examples/clarification.example.json +15 -0
  569. package/templates/examples/docs-claim.example.json +8 -0
  570. package/templates/examples/export-manifest.example.json +7 -0
  571. package/templates/examples/host-export-package.example.json +11 -0
  572. package/templates/examples/implementation-plan.example.json +17 -0
  573. package/templates/examples/proposed-learning.example.json +13 -0
  574. package/templates/examples/research.example.json +15 -0
  575. package/templates/examples/research.example.md +6 -0
  576. package/templates/examples/review.example.json +17 -0
  577. package/templates/examples/run-manifest.example.json +9 -0
  578. package/templates/examples/spec-challenge.example.json +14 -0
  579. package/templates/examples/spec.example.json +21 -0
  580. package/templates/examples/verification-proof.example.json +21 -0
  581. package/templates/examples/wazir-manifest.example.yaml +65 -0
  582. package/templates/task-definition-schema.md +99 -0
  583. package/tooling/README.md +20 -0
  584. package/tooling/src/adapters/context-mode.js +50 -0
  585. package/tooling/src/capture/command.js +376 -0
  586. package/tooling/src/capture/store.js +99 -0
  587. package/tooling/src/capture/usage.js +270 -0
  588. package/tooling/src/checks/branches.js +50 -0
  589. package/tooling/src/checks/brand-truth.js +110 -0
  590. package/tooling/src/checks/changelog.js +231 -0
  591. package/tooling/src/checks/command-registry.js +36 -0
  592. package/tooling/src/checks/commits.js +102 -0
  593. package/tooling/src/checks/docs-drift.js +103 -0
  594. package/tooling/src/checks/docs-truth.js +201 -0
  595. package/tooling/src/checks/runtime-surface.js +156 -0
  596. package/tooling/src/cli.js +116 -0
  597. package/tooling/src/command-options.js +56 -0
  598. package/tooling/src/commands/validate.js +320 -0
  599. package/tooling/src/doctor/command.js +91 -0
  600. package/tooling/src/export/command.js +77 -0
  601. package/tooling/src/export/compiler.js +498 -0
  602. package/tooling/src/guards/loop-cap-guard.js +52 -0
  603. package/tooling/src/guards/protected-path-write-guard.js +67 -0
  604. package/tooling/src/index/command.js +152 -0
  605. package/tooling/src/index/storage.js +1061 -0
  606. package/tooling/src/index/summarizers.js +261 -0
  607. package/tooling/src/loaders.js +18 -0
  608. package/tooling/src/project-root.js +22 -0
  609. package/tooling/src/recall/command.js +225 -0
  610. package/tooling/src/schema-validator.js +30 -0
  611. package/tooling/src/state-root.js +40 -0
  612. package/tooling/src/status/command.js +71 -0
  613. package/wazir.manifest.yaml +135 -0
  614. package/workflows/README.md +19 -0
  615. package/workflows/author.md +42 -0
  616. package/workflows/clarify.md +38 -0
  617. package/workflows/design-review.md +46 -0
  618. package/workflows/design.md +44 -0
  619. package/workflows/discover.md +37 -0
  620. package/workflows/execute.md +48 -0
  621. package/workflows/learn.md +38 -0
  622. package/workflows/plan-review.md +42 -0
  623. package/workflows/plan.md +39 -0
  624. package/workflows/prepare-next.md +37 -0
  625. package/workflows/review.md +40 -0
  626. package/workflows/run-audit.md +41 -0
  627. package/workflows/spec-challenge.md +41 -0
  628. package/workflows/specify.md +38 -0
  629. package/workflows/verify.md +37 -0
@@ -0,0 +1,1053 @@
1
+ # Testing Anti-Patterns
2
+
3
+ > Testing is the primary safety net for software quality, yet the tests themselves are frequently riddled with anti-patterns that create a false sense of security, slow down development, and allow real bugs to escape into production. A bad test suite is worse than no test suite: it costs time to maintain, lies about what it verifies, and erodes developer trust until the team ignores test results entirely. The anti-patterns below represent the most common, most damaging, and most subtle ways test suites fail their purpose.
4
+
5
+ > **Domain:** Code
6
+ > **Anti-patterns covered:** 22
7
+ > **Highest severity:** Critical
8
+
9
+ ---
10
+
11
+ ## Anti-Patterns
12
+
13
+ ### AP-01: The Mockery (Over-Mocking)
14
+
15
+ **Also known as:** Mock Happy, Mock Hell, Mockitis, Test Double Abuse
16
+ **Frequency:** Very Common
17
+ **Severity:** Critical
18
+ **Detection difficulty:** Moderate
19
+
20
+ **What it looks like:**
21
+ A test where the majority of the code is setting up mocks, stubs, and fakes. The system under test calls mocked dependencies that return pre-configured values, and the assertions verify that those pre-configured values flowed through correctly. The actual business logic is barely exercised because every collaborator has been replaced with a controlled stand-in.
22
+
23
+ ```python
+ # BAD: More mock setup than actual testing
+ def test_process_order(self):
+     mock_db = Mock()
+     mock_payment = Mock()
+     mock_inventory = Mock()
+     mock_email = Mock()
+     mock_logger = Mock()
+
+     mock_inventory.check_stock.return_value = True
+     mock_payment.charge.return_value = PaymentResult(success=True)
+     mock_db.save.return_value = Order(id=1)
+
+     service = OrderService(mock_db, mock_payment, mock_inventory, mock_email, mock_logger)
+     result = service.process(order_data)
+
+     # Testing that mocks returned what we told them to return
+     assert result.success == True
+ ```
42
+
43
+ **Why developers do it:**
44
+ Mocking makes tests fast and isolated. Developers learn early that "unit tests should not touch the database" and overcorrect by mocking everything. Mocking frameworks make it trivially easy to stub out any dependency. The resulting test runs in milliseconds and appears to provide coverage.
45
+
46
+ **What goes wrong:**
47
+ The test verifies the wiring between mocks, not the actual behavior. When the real database, payment gateway, or inventory service behaves differently than the mock (different error format, different null handling, different timing), the test passes but production breaks. Google's testing blog documented that over-mocked tests accounted for a significant portion of tests that passed in CI but failed to catch real integration bugs. The mocking approach also couples tests tightly to implementation details -- any refactoring of how dependencies are called breaks every test, even when the behavior is unchanged.
48
+
49
+ **The fix:**
50
+ Reserve mocks for true external boundaries (network calls, third-party APIs, clocks). Use real implementations for internal collaborators. For database interactions, use in-memory databases or test containers. Apply the "Sociable Unit Test" pattern where the unit under test uses real collaborators.
51
+
52
+ ```python
+ # BETTER: Real collaborators, mock only the external boundary
+ def test_process_order(self):
+     db = InMemoryOrderRepository()
+     payment = FakePaymentGateway(always_succeeds=True)
+     inventory = InMemoryInventory({"SKU-1": 10})
+     email = SpyEmailSender()
+
+     service = OrderService(db, payment, inventory, email)
+     result = service.process(order_data)
+
+     assert result.success == True
+     assert db.find(result.order_id) is not None
+     assert email.sent_count == 1
+ ```
67
+
68
+ **Detection rule:**
69
+ If a test method has more lines of mock setup than lines of assertions and action combined, suspect AP-01. Count the `Mock()`, `.return_value`, and `.side_effect` calls -- if they exceed 5, the test is likely over-mocked.
70
+
71
+ ---
72
+
73
+ ### AP-02: Testing Implementation, Not Behavior
74
+
75
+ **Also known as:** Structural Inspection, White-Box Obsession, Brittle Tests
76
+ **Frequency:** Very Common
77
+ **Severity:** Critical
78
+ **Detection difficulty:** Moderate
79
+
80
+ **What it looks like:**
81
+ Tests that assert on internal method calls, private state, execution order, or specific code paths rather than observable outputs. The test knows exactly how the code works and verifies that it works that way, rather than verifying what it produces.
82
+
83
+ ```javascript
+ // BAD: Testing HOW it works, not WHAT it does
+ test('calculates discount', () => {
+   const calculator = new PriceCalculator();
+   const spy = jest.spyOn(calculator, '_applyTierDiscount');
+
+   calculator.calculateTotal(items, customer);
+
+   expect(spy).toHaveBeenCalledWith(items, 'gold');
+   expect(spy).toHaveBeenCalledTimes(1);
+ });
+ ```
95
+
96
+ **Why developers do it:**
97
+ It feels thorough. Developers reason: "I know the discount logic goes through `_applyTierDiscount`, so I should verify it gets called." It is also easier to write -- asserting on calls is simpler than computing the expected output. Code coverage tools reward this approach because every branch gets "tested."
98
+
99
+ **What goes wrong:**
100
+ Every refactoring breaks the tests even when the behavior is correct. Kent Beck's principle states: "Programmer tests should be sensitive to behavior changes and insensitive to structure changes." When tests are coupled to structure, developers stop refactoring because the cost of updating tests exceeds the perceived benefit. The codebase ossifies. Meanwhile, the tests provide a false sense of security because they verify the mechanism, not the result -- a method could be called correctly but produce the wrong output, and the test would still pass.
101
+
102
+ **The fix:**
103
+ Test the public API. Assert on outputs, side effects observable from outside, and state changes visible through the public interface.
104
+
105
+ ```javascript
+ // GOOD: Testing WHAT it does
+ test('gold customers get 20% discount on orders over $100', () => {
+   const calculator = new PriceCalculator();
+   const goldCustomer = { tier: 'gold' };
+   const items = [{ price: 50 }, { price: 80 }];
+
+   const total = calculator.calculateTotal(items, goldCustomer);
+
+   expect(total).toBe(104); // (50 + 80) * 0.80
+ });
+ ```
117
+
118
+ **Detection rule:**
119
+ If a test uses `spyOn` on private/internal methods, accesses properties prefixed with `_`, uses reflection to read private fields, or asserts on the number of times an internal method was called, suspect AP-02.
120
+
121
+ ---
122
+
123
+ ### AP-03: Flaky / Non-Deterministic Tests
124
+
125
+ **Also known as:** The Blinking Test, Heisenbug Test, Random Failures
126
+ **Frequency:** Very Common
127
+ **Severity:** Critical
128
+ **Detection difficulty:** Hard
129
+
130
+ **What it looks like:**
131
+ A test that passes most of the time but occasionally fails without any code change. Re-running the test suite makes it pass again. Common causes include: reliance on system time, race conditions in async code, order-dependent tests, network calls to real services, and uncontrolled randomness.
132
+
133
+ ```java
+ // BAD: Depends on timing
+ @Test
+ void testCacheExpiry() throws InterruptedException {
+     cache.put("key", "value", Duration.ofMillis(100));
+     Thread.sleep(150); // Might not be enough on a loaded CI server
+     assertNull(cache.get("key"));
+ }
+ ```
142
+
143
+ **Why developers do it:**
144
+ The test works on the developer's machine. CI servers are faster or slower than expected, but the developer does not see the failure locally. Some flakiness is introduced unknowingly through shared state or implicit ordering. The developer ships it, and the flakiness only manifests under load or on different hardware.
145
+
146
+ **What goes wrong:**
147
+ Google reported that approximately 16% of their tests exhibit flaky behavior, and flaky tests took 1.5 times longer to fix than non-flaky ones. At scale, flaky tests cost engineering organizations over $4.3M annually in lost productivity (investigation time, re-runs, lost trust). The worst consequence is cultural: developers learn to ignore test failures ("it's just a flaky test"), and real bugs slip through because the signal is buried in noise. Spotify reported that their pre-merge suite of 48,000 tests required dedicated tooling to skip slow and flaky tests, and they invested heavily in flakiness detection infrastructure.
148
+
149
+ **The fix:**
150
+ Eliminate non-determinism at the source. Use injectable clocks instead of `Thread.sleep`. Use deterministic seeds for randomness. Isolate test state so order does not matter. Replace real network calls with controlled fakes. For async operations, use explicit synchronization (latches, futures, polling with timeout) rather than arbitrary delays.
151
+
152
+ ```java
+ // GOOD: Deterministic time control
+ @Test
+ void testCacheExpiry() {
+     FakeClock clock = new FakeClock();
+     Cache cache = new Cache(clock);
+     cache.put("key", "value", Duration.ofMillis(100));
+     clock.advance(Duration.ofMillis(150));
+     assertNull(cache.get("key"));
+ }
+ ```
163
+
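For asynchronous work where no fake clock can be injected, the fix mentions polling with a timeout instead of an arbitrary delay. The following is a minimal Python sketch of that idea. The helper name `wait_until` and its default values are assumptions made for illustration, not part of any particular test framework.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 2.0,
               interval: float = 0.01) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Short polling pause; this is not a guess about how long the work takes.
        time.sleep(interval)


# Usage: a condition that becomes true on the third check.
state = {"checks": 0}


def ready() -> bool:
    state["checks"] += 1
    return state["checks"] >= 3


print(wait_until(ready))
```

The test returns as soon as the condition is met, so a generous timeout costs nothing in the passing case and only delays the failing one.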
164
+ **Detection rule:**
165
+ If a test uses `Thread.sleep`, `time.sleep`, `setTimeout` with a magic number, `Date.now()`, `Math.random()` without a seed, or makes real HTTP calls, suspect AP-03. Also flag any test that has been re-run or marked `@Retry` in CI configuration.
166
+
167
+ ---
168
+
169
+ ### AP-04: Testing Private Methods Directly
170
+
171
+ **Also known as:** The Anal Probe, The Inspector, Encapsulation Violation
172
+ **Frequency:** Common
173
+ **Severity:** High
174
+ **Detection difficulty:** Easy
175
+
176
+ **What it looks like:**
177
+ Tests that use reflection, `@VisibleForTesting` annotations, or language-specific hacks to access and test private/internal methods directly. The test reaches inside the class to call methods that are not part of the public contract.
178
+
179
+ ```csharp
+ // BAD: Using reflection to test a private method
+ [Test]
+ public void TestParseInternalFormat()
+ {
+     var parser = new DataProcessor();
+     var method = typeof(DataProcessor).GetMethod("ParseInternalFormat",
+         BindingFlags.NonPublic | BindingFlags.Instance);
+     var result = method.Invoke(parser, new object[] { rawData });
+     Assert.AreEqual(expected, result);
+ }
+ ```
191
+
192
+ **Why developers do it:**
193
+ The private method contains complex logic that the developer wants to test in isolation. Testing it through the public API feels indirect and requires more setup. The developer reasons: "This private method is the hard part; I should test it directly." Some code coverage tools flag uncovered private methods, pressuring developers to test them.
194
+
195
+ **What goes wrong:**
196
+ The test is now coupled to the internal structure. Any refactoring -- renaming the method, changing its signature, inlining it, or splitting it -- breaks the test. Worse, it signals a design problem: if a private method is complex enough to need its own tests, it likely belongs in a separate class. Vladimir Khorikov documented that the root issue is not encapsulation violation per se, but that testing private methods masks a missing abstraction that should be extracted and tested through its own public API.
197
+
198
+ **The fix:**
199
+ Test private methods indirectly through the public API. If the private method is too complex for that, extract it into a separate class with its own public interface and test that class directly.
200
+
201
+ ```csharp
+ // GOOD: Extract the complex logic into its own testable class
+ public class InternalFormatParser
+ {
+     public ParsedData Parse(byte[] rawData) { /* ... */ }
+ }
+
+ [Test]
+ public void TestInternalFormatParsing()
+ {
+     var parser = new InternalFormatParser();
+     var result = parser.Parse(rawData);
+     Assert.AreEqual(expected, result);
+ }
+ ```
216
+
217
+ **Detection rule:**
218
+ If a test uses reflection to access non-public members, uses `@VisibleForTesting` or `internal` access modifiers added solely for testing, or imports a method that starts with `_`, suspect AP-04.
219
+
220
+ ---
221
+
222
+ ### AP-05: The Coverage Obsession
223
+
224
+ **Also known as:** 100% Coverage Cult, Goodhart's Test, Metric Gaming
225
+ **Frequency:** Common
226
+ **Severity:** High
227
+ **Detection difficulty:** Moderate
228
+
229
+ **What it looks like:**
230
+ Teams enforce a hard 100% (or near-100%) code coverage requirement. Developers write trivial tests for getters, setters, constructors, and configuration code just to hit the number. Tests verify that code runs but not that it works correctly. Coverage becomes a KPI that is gamed rather than a signal that is interpreted.
231
+
232
+ ```java
+ // BAD: Testing a getter to inflate coverage
+ @Test
+ void testGetName() {
+     User user = new User("Alice");
+     assertEquals("Alice", user.getName());
+ }
+
+ // BAD: Testing framework configuration
+ @Test
+ void testSpringContextLoads() {
+     assertNotNull(applicationContext);
+ }
+ ```
246
+
247
+ **Why developers do it:**
248
+ Management mandates coverage thresholds. CI pipelines reject PRs below the threshold. Developers fill the gap with the easiest possible tests rather than the most valuable ones. The metric is visible and gameable, while actual test quality is invisible and subjective.
249
+
250
+ **What goes wrong:**
251
+ Google's own testing guidelines recommend "60% as acceptable, 75% as commendable, and 90% as exemplary" -- not 100%. Research has shown that when coverage becomes a target, teams optimize for the metric rather than for quality (Goodhart's Law). The resulting tests are expensive to maintain, break on every refactoring, and provide no safety. Codecov's analysis documented that teams pursuing 100% coverage spent disproportionate effort on the last 10-20% of code (edge cases in generated code, third-party adapters, trivial boilerplate), producing tests that caught zero real bugs. Meanwhile, the complex business logic at 80% coverage was undertested because developers spent their time elsewhere.
252
+
253
+ **The fix:**
254
+ Set coverage floors (70-85%), not ceilings. Track coverage trends rather than absolute numbers. Measure mutation testing scores for critical modules -- mutation testing verifies that tests actually catch bugs, not just execute code. Exclude generated code, DTOs, and trivial boilerplate from coverage requirements.
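To make the mutation testing point concrete, here is a minimal hand-rolled Python sketch. Real tools (for example mutmut for Python, PIT for Java, or Stryker for JavaScript) generate and run mutants automatically. The functions `is_adult`, `weak_test`, `strong_test`, and `mutant_killed` below are hypothetical and exist only to illustrate the idea.

```python
def is_adult(age: int) -> bool:
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    # Hand-written mutant: `>=` changed to `>`.
    return age > 18


def weak_test(fn) -> bool:
    # Executes every line of the function but never probes the boundary.
    return fn(30) is True and fn(5) is False


def strong_test(fn) -> bool:
    # Checks both sides of the boundary at 18.
    return fn(18) is True and fn(17) is False


def mutant_killed(test, original, mutant) -> bool:
    """A mutant is killed when the test passes on the original and fails on the mutant."""
    return test(original) and not test(mutant)


print(mutant_killed(weak_test, is_adult, is_adult_mutant))
print(mutant_killed(strong_test, is_adult, is_adult_mutant))
```

Both tests give the same line coverage for `is_adult`, yet only the boundary test notices the changed operator. That gap is what a mutation score measures and a coverage percentage does not.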
255
+
256
+ **Detection rule:**
257
+ If a test file contains only getter/setter tests, constructor tests, or tests that assert `assertNotNull` on injected dependencies, suspect AP-05. If the team has a 100% coverage requirement and tests are being added with no assertions beyond "it runs," this is active.
258
+
259
+ ---
260
+
261
+ ### AP-06: Assertion-Free Testing
262
+
263
+ **Also known as:** The Secret Catcher, The Placebo Test, Happy Path Smoke
264
+ **Frequency:** Common
265
+ **Severity:** High
266
+ **Detection difficulty:** Easy
267
+
268
+ **What it looks like:**
269
+ A test that calls production code but contains no assertions. It relies entirely on "no exception thrown" as the success criterion. The test method invokes a function and considers the test passed if execution completes without error.
270
+
271
+ ```python
+ # BAD: No assertions at all
+ def test_process_payment():
+     processor = PaymentProcessor()
+     processor.process(valid_order)
+     # ... that's it. No assertions.
+
+ # BAD: Assert only that no exception was thrown
+ # (a unittest.TestCase method, so it takes `self` for `self.fail`)
+ def test_generate_report(self):
+     generator = ReportGenerator()
+     try:
+         generator.generate(data)
+     except Exception:
+         self.fail("Report generation raised an exception")
+     # But never checks the report content
+ ```
287
+
288
+ **Why developers do it:**
289
+ The developer wants quick coverage credit. The function is hard to observe (it writes to a file, sends an email, updates a database) and the developer does not invest in making the output inspectable. Martin Fowler documented this as "Assertion Free Testing" and noted that the most common reason is lack of observability -- the system under test does not expose its results in a way that is easy to assert on.
290
+
291
+ **What goes wrong:**
292
+ The test passes regardless of whether the function produces correct output. A payment processor that silently charges the wrong amount, a report generator that produces empty files, a data pipeline that drops records -- all pass these tests. The developer and the team believe these paths are tested. They are not. Research on JUnit test suites found that tests without assertions can achieve 100% code coverage while catching zero defects.
293
+
294
+ **The fix:**
295
+ Every test must assert on at least one observable outcome. If the function's output is hard to observe, refactor for testability: return values instead of void, use spy objects for side effects, or inject observable collaborators.
296
+
297
+ ```python
+ # GOOD: Assert on observable outcomes
+ def test_process_payment():
+     ledger = InMemoryLedger()
+     processor = PaymentProcessor(ledger)
+
+     result = processor.process(valid_order)
+
+     assert result.status == "completed"
+     assert ledger.last_entry().amount == valid_order.total
+     assert ledger.last_entry().merchant == valid_order.merchant_id
+ ```
309
+
310
+ **Detection rule:**
311
+ If a test method contains zero `assert`, `expect`, `should`, or equivalent assertion calls, it is AP-06. Static analysis tools can flag test methods with no assertion statements.
312
+
313
+ ---
314
+
315
+ ### AP-07: The Ice Cream Cone (Inverted Test Pyramid)
316
+
317
+ **Also known as:** Inverted Pyramid, E2E Heavy, Manual Testing Addiction
318
+ **Frequency:** Common
319
+ **Severity:** High
320
+ **Detection difficulty:** Easy
321
+
322
+ **What it looks like:**
323
+ The test suite has many end-to-end and UI tests, fewer integration tests, and very few (or no) unit tests. The majority of test effort goes into manual testing. The test distribution is the inverse of the recommended testing pyramid (many unit tests at the base, fewer integration tests in the middle, few E2E tests at the top).
324
+
325
+ **Why developers do it:**
326
+ E2E tests feel more "real" and trustworthy because they test the whole system. Writing unit tests requires understanding dependency injection and test doubles, which feels harder. In organizations without a strong testing culture, QA teams write E2E/manual tests because that is what they know. The first tests a team writes are often E2E because they require no code changes to support testability.
327
+
328
+ **What goes wrong:**
329
+ LayerX, a fintech company, documented this pattern: their manual E2E test suite grew to 900 items, and despite a two-day release cycle, three bugs slipped through immediately after a stable release due to human error in manual E2E testing. The problems with the inverted pyramid are compounding: E2E tests are slow (minutes to hours per run), flaky (browser timeouts, network issues, CSS selector changes), expensive to maintain (UI changes break many tests), and provide poor failure localization (a failing E2E test does not tell you which module has the bug). Teams with this pattern ship slower because the feedback loop is measured in hours, not seconds.
330
+
331
+ **The fix:**
332
+ Adopt the testing pyramid: 70% unit tests (fast, isolated, precise), 20% integration tests (verify module interactions), 10% E2E tests (critical user journeys only). When an E2E test fails, write a unit test that catches the same bug, then consider removing the E2E test.
333
+
334
+ **Detection rule:**
335
+ Count the tests at each level. If E2E/UI tests outnumber unit tests, or if the majority of testing is manual, suspect AP-07. If the full test suite takes more than 15 minutes, the pyramid is likely inverted.
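The counting step can be sketched in a few lines of Python. This assumes the per-level test counts have already been collected, for example from directory layout or test markers. The level names `unit`, `integration`, and `e2e` and both function names are hypothetical.

```python
def pyramid_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each level's share of the total number of tests."""
    total = sum(counts.values())
    if total == 0:
        return {level: 0.0 for level in counts}
    return {level: n / total for level, n in counts.items()}


def looks_inverted(counts: dict[str, int]) -> bool:
    """Rule from the text: E2E/UI tests outnumber unit tests."""
    return counts.get("e2e", 0) > counts.get("unit", 0)


suite = {"unit": 50, "integration": 100, "e2e": 400}
print(pyramid_shares(suite), looks_inverted(suite))
```

Manual test cases are not captured by such a count and would have to be tallied separately.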
336
+
337
+ ---
338
+
339
+ ### AP-08: Shared Mutable State Between Tests
340
+
341
+ **Also known as:** Generous Leftovers, Test Pollution, Order-Dependent Tests
342
+ **Frequency:** Common
343
+ **Severity:** High
344
+ **Detection difficulty:** Hard
345
+
346
+ **What it looks like:**
347
+ Tests share a database, a static/global variable, a singleton, a file on disk, or an in-memory collection that is mutated by one test and read by another. Tests pass when run in a specific order but fail when run in isolation, in parallel, or in a different order.
348
+
349
+ ```python
+ # BAD: Shared class-level state
+ class TestUserService:
+     users_db = {}  # Shared across all tests
+
+     def test_create_user(self):
+         self.users_db["alice"] = User("alice")
+         assert "alice" in self.users_db
+
+     def test_list_users(self):
+         # Depends on test_create_user having run first
+         assert len(self.users_db) == 1
+ ```
362
+
363
+ **Why developers do it:**
364
+ Setting up test data is expensive. Developers reason: "The previous test already created the user, so I'll just use it." Shared setup methods (`@BeforeAll`, `setUpClass`) are convenient but dangerous when they create mutable state. Some developers are unaware that test execution order is not guaranteed in most frameworks.
365
+
366
+ **What goes wrong:**
367
+ Research from the University of Illinois found that Test Order Dependency accounts for 12% of flaky test failures, and 74% of these issues are fixed by cleaning shared state between test runs. When tests share state, failures become non-reproducible: a test fails in CI but passes locally because the local runner uses a different order. Debugging these failures is exceptionally time-consuming because the root cause is in a different test than the one that fails. In large test suites, shared state prevents parallelization, which multiplies execution time.
368
+
369
+ **The fix:**
370
+ Each test must create its own state and clean up after itself. Use `@BeforeEach`/`setUp` (not `@BeforeAll`) for test data. Use transactions that roll back after each test for database tests. Never use static mutable fields in test classes.
371
+
372
+ ```python
+ # GOOD: Each test owns its state
+ class TestUserService:
+     def setup_method(self):
+         self.users_db = {}  # Fresh state for each test
+
+     def test_create_user(self):
+         self.users_db["alice"] = User("alice")
+         assert "alice" in self.users_db
+
+     def test_list_users_empty(self):
+         assert len(self.users_db) == 0
+ ```
385
+
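The fix also mentions transactions that roll back after each test. The following is a minimal Python sketch using the standard `sqlite3` module with an in-memory database. The `users` table and the helper names `make_db`, `isolated`, and `count_users` are hypothetical, and a real project would more likely wire the rollback into its test framework's setup and teardown hooks.

```python
import sqlite3
from contextlib import contextmanager


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with the schema committed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")
    conn.commit()
    return conn


@contextmanager
def isolated(conn: sqlite3.Connection):
    """Run a test body, then roll back whatever it wrote."""
    try:
        yield conn
    finally:
        conn.rollback()


def count_users(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


conn = make_db()

with isolated(conn) as db:
    db.execute("INSERT INTO users VALUES ('alice')")
    assert count_users(db) == 1

with isolated(conn) as db:
    # No leftovers from the previous block: its INSERT was rolled back.
    assert count_users(db) == 0
```

Because every body starts from the committed schema and nothing else, the two blocks can run in either order with the same result.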
386
+ **Detection rule:**
387
+ If tests use class-level mutable fields, static variables, singletons, or `@BeforeAll`/`setUpClass` that creates mutable data, suspect AP-08. If a test has `@Order` annotations or passes only when run in a specific sequence, this is confirmed.
388
+
389
+ ---
390
+
391
+ ### AP-09: The Slow Suite
392
+
393
+ **Also known as:** The Slow Poke, CI Bottleneck, Coffee Break Tests
394
+ **Frequency:** Common
395
+ **Severity:** High
396
+ **Detection difficulty:** Easy
397
+
398
+ **What it looks like:**
399
+ The test suite takes 10+ minutes to run locally and 30+ minutes in CI. Developers stop running tests before pushing because "it takes too long." CI feedback arrives after the developer has context-switched to another task.
400
+
401
+ **Why developers do it:**
402
+ Individual tests are added without considering cumulative impact. Each test seems reasonable in isolation (100ms here, 200ms there), but 5,000 of them add up. Tests hit real databases, make real HTTP calls, or use `Thread.sleep` for synchronization. Nobody owns the test suite performance budget.
403
+
404
+ **What goes wrong:**
405
+ Dropbox documented that their Android test pipeline averaged 25 minutes with a worst case of 3 hours before they invested in optimization. The result was poor developer experience: developers stopped waiting for CI and pushed untested code. A study cited by DevOps.com estimated that a typical developer with 5 CI runs per day at 30 minutes each loses 2.5 hours daily to waiting -- roughly 25 engineer-hours per day, or 3+ full-time engineers' time, for a 10-person team. Slow suites also limit continuous deployment: if each pipeline run takes 45 minutes and deployments are serialized, you can deploy at most ~10 times in an 8-hour working day.
406
+
407
+ **The fix:**
408
+ Set a test suite time budget (e.g., 5 minutes for unit tests). Profile the slowest tests and fix them first. Replace real I/O with in-memory alternatives. Parallelize test execution. Separate fast unit tests from slow integration tests and run them in different CI stages. Use test impact analysis to run only tests affected by the change.
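
As a rough sketch of what "set a budget and profile the slowest tests" could look like, the snippet below checks recorded durations against the 5-minute suite budget and 1-second per-test limit used in this section. The `Timing` record, the `check_budget` function, and the sample data are illustrative assumptions; in practice the durations would come from your test runner's report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    name: str
    seconds: float


def check_budget(timings, suite_budget_s=300.0, per_test_limit_s=1.0):
    """Return (total_seconds, over_budget, slow_tests), slowest test first."""
    total = sum(t.seconds for t in timings)
    slow = sorted(
        (t for t in timings if t.seconds > per_test_limit_s),
        key=lambda t: t.seconds,
        reverse=True,
    )
    return total, total > suite_budget_s, slow


# Hypothetical durations, e.g. parsed from a test report.
timings = [
    Timing("test_parse_config", 0.02),
    Timing("test_export_report", 4.5),
    Timing("test_login_flow", 1.8),
]
total, over_budget, slow = check_budget(timings)
for t in slow:
    print(f"slow test: {t.name} took {t.seconds:.2f}s")
```

Failing the build when `over_budget` is true makes the budget enforceable rather than aspirational.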
409
+
410
+ **Detection rule:**
411
+ If the full unit test suite takes more than 5 minutes, or if individual tests take more than 1 second, suspect AP-09. If developers have a habit of pushing without running tests ("CI will catch it"), this is confirmed.
412
+
413
+ ---
414
+
415
+ ### AP-10: Copy-Paste Test Code
416
+
417
+ **Also known as:** Copypasta Tests, Test Code Duplication, WET Tests
418
+ **Frequency:** Very Common
419
+ **Severity:** Medium
420
+ **Detection difficulty:** Easy
421
+
422
+ **What it looks like:**
423
+ Test methods that are near-identical copies of each other with minor variations in input data or expected values. The test setup, action, and assertion structure is duplicated across dozens of test methods. When the production API changes, every copied test must be updated individually.
424
+
425
+ ```javascript
426
+ // BAD: Copy-pasted test with tiny variations
427
+ test('validates email with no @', () => {
428
+ const validator = new EmailValidator();
429
+ const result = validator.validate('invalidemail.com');
430
+ expect(result.valid).toBe(false);
431
+ expect(result.error).toBe('Invalid email format');
432
+ });
433
+
434
+ test('validates email with no domain', () => {
435
+ const validator = new EmailValidator();
436
+ const result = validator.validate('user@');
437
+ expect(result.valid).toBe(false);
438
+ expect(result.error).toBe('Invalid email format');
439
+ });
440
+
441
+ // ... 15 more identical copies with different input strings
442
+ ```
443
+
444
+ **Why developers do it:**
445
+ Duplicating a test and changing one value is faster than designing a reusable test structure. The developer is "in the zone" and wants to add test cases quickly. Some developers have heard that "tests should be DAMP (Descriptive And Meaningful Phrases) not DRY" and interpret this as "duplication in tests is always fine."
446
+
447
+ **What goes wrong:**
448
+ When the `EmailValidator` API changes (e.g., the error message format changes), every copy must be updated. The xUnit Patterns wiki documents that Test Code Duplication causes a "very large increase in the cost to introduce new functionality because of the effort involved in updating all the tests that have copies of the affected code." In practice, developers update some copies but miss others, creating tests that fail for the wrong reason.
449
+
450
+ **The fix:**
451
+ Use parameterized tests (data-driven tests) for input variations. Extract common setup into helper methods or test fixtures. Keep each test readable but eliminate structural duplication.
452
+
453
+ ```javascript
454
+ // GOOD: Parameterized test
455
+ test.each([
456
+ ['invalidemail.com', 'no @ symbol'],
457
+ ['user@', 'no domain'],
458
+ ['@domain.com', 'no local part'],
459
+ ['user@.com', 'domain starts with dot'],
460
+ ])('validates email: %s (%s)', (email, _description) => {
461
+ const result = new EmailValidator().validate(email);
462
+ expect(result.valid).toBe(false);
463
+ });
464
+ ```
465
+
466
+ **Detection rule:**
467
+ If two or more test methods in the same file share more than 80% of their code and differ only in input values, suspect AP-10. If a test file has more than 200 lines and most tests look structurally identical, this is confirmed.
468
+
469
+ ---
470
+
471
+ ### AP-11: Testing Too Many Things at Once
472
+
473
+ **Also known as:** The Giant, The Kitchen Sink Test, Mega-Test
474
+ **Frequency:** Common
475
+ **Severity:** Medium
476
+ **Detection difficulty:** Easy
477
+
478
+ **What it looks like:**
479
+ A single test method that verifies multiple independent behaviors. It creates data, calls several functions, and asserts on many unrelated outcomes. When it fails, the failure message does not indicate which behavior is broken.
480
+
481
+ ```python
482
+ # BAD: Testing creation, validation, persistence, and notification in one test
483
+ def test_user_registration(self):
484
+     user = UserService.register("alice", "alice@example.com", "password123")
485
+
486
+     assert user.id is not None
487
+     assert user.name == "alice"
488
+     assert user.email == "alice@example.com"
489
+     assert user.password != "password123"  # hashed
490
+     assert user.created_at is not None
491
+     assert UserRepository.find(user.id) is not None
492
+     assert EmailService.last_sent_to == "alice@example.com"
493
+     assert AuditLog.last_entry().action == "user_registered"
494
+     assert RateLimiter.attempts_for("alice@example.com") == 1
495
+ ```
496
+
497
+ **Why developers do it:**
498
+ Setting up the test context is expensive, so the developer wants to maximize assertions per setup. It feels efficient: "I already have the user object, why not test everything about it?" This is especially common when the system under test has complex setup requirements.
499
+
500
+ **What goes wrong:**
501
+ When the test fails on the 4th assertion, the developer does not know whether assertions 5-9 would also fail. Fixing the 4th assertion and re-running might reveal another failure, creating a slow debugging cycle. The test name (`test_user_registration`) does not describe what specifically is being tested, making the test suite less useful as documentation. When adding a new feature, the developer cannot tell which mega-test to update.
502
+
503
+ **The fix:**
504
+ One concept per test. Group related assertions (e.g., "user is persisted with correct fields") but separate unrelated behaviors (e.g., "notification is sent" vs. "rate limiter is updated") into distinct tests. Invest in test fixture builders to make setup cheap.
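
A minimal sketch of the split, assuming a simplified registration service: `Registry`, `FakeMailer`, and `make_registry` are hypothetical stand-ins for the real collaborators, kept in memory so that setup is cheap enough to repeat in every test.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMailer:
    sent_to: list = field(default_factory=list)

    def send_welcome(self, email):
        self.sent_to.append(email)


@dataclass
class Registry:
    mailer: FakeMailer
    users: dict = field(default_factory=dict)

    def register(self, name, email):
        self.users[email] = {"name": name, "email": email}
        self.mailer.send_welcome(email)
        return self.users[email]


def make_registry():
    """Cheap fixture builder: every test gets a fresh registry and mailer."""
    return Registry(mailer=FakeMailer())


def test_registration_persists_user_fields():
    # Related assertions about one concept (persistence) stay together.
    registry = make_registry()
    user = registry.register("alice", "alice@example.com")
    assert user == {"name": "alice", "email": "alice@example.com"}
    assert "alice@example.com" in registry.users


def test_registration_sends_welcome_email():
    # The notification is a separate behavior, so it gets its own test.
    registry = make_registry()
    registry.register("alice", "alice@example.com")
    assert registry.mailer.sent_to == ["alice@example.com"]
```

If the welcome email breaks, only the second test fails, and its name says which behavior regressed.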
505
+
506
+ **Detection rule:**
507
+ If a test method has more than 5-6 assertions on different objects or properties, or if the test name is generic (e.g., `testEverything`, `testUserFlow`), suspect AP-11.
508
+
509
+ ---
510
+
511
+ ### AP-12: Not Testing Edge Cases
512
+
513
+ **Also known as:** Happy Path Only, Golden Path Trap, Boundary Blindness
514
+ **Frequency:** Very Common
515
+ **Severity:** High
516
+ **Detection difficulty:** Hard
517
+
518
+ **What it looks like:**
519
+ Tests only cover the normal successful case. There are no tests for empty inputs, null values, boundary values (0, -1, MAX_INT), Unicode characters, concurrent access, or error conditions.
520
+
521
+ ```go
522
+ // BAD: Only tests the happy path
523
+ func TestDivide(t *testing.T) {
524
+ result := Divide(10, 2)
525
+ assert.Equal(t, 5.0, result)
526
+ }
527
+ // Missing: Divide(0, 5), Divide(10, 0), Divide(-1, -1),
528
+ // Divide(math.MaxFloat64, 0.001), Divide(0, 0)
529
+ ```
530
+
531
+ **Why developers do it:**
532
+ The happy path is the obvious test case. Edge cases require more thought and often reveal uncomfortable design questions ("what should happen when the input is nil?"). Developers under deadline pressure write the test that proves the feature works and move on. Edge cases "probably won't happen in production."
533
+
534
+ **What goes wrong:**
535
+ Edge cases are where many production bugs live. The Ariane 5 rocket (Flight 501, 1996) was destroyed because a 64-bit floating-point value was converted to a 16-bit signed integer, causing an overflow -- a boundary condition that was never tested against the new rocket's flight profile. More commonly, APIs that work perfectly with typical data fail on empty strings, null fields, very long inputs, or concurrent requests. These failures manifest as 500 errors in production, data corruption, or security vulnerabilities (integer overflows, buffer overruns).
536
+
537
+ **The fix:**
538
+ For every function, explicitly test: null/nil/undefined inputs, empty collections, zero values, negative numbers, boundary values (MAX/MIN), very large inputs, special characters (Unicode, emoji, newlines, SQL metacharacters), and concurrent access where applicable. Use property-based testing (QuickCheck, Hypothesis, fast-check) to automatically generate edge cases.
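
To make the checklist concrete, here is a small sketch of edge-case tests for a division helper. The `divide` function and its choice to reject a zero divisor with `ValueError` are assumptions made for illustration; the point is that each boundary gets an explicit, named test.

```python
import math
import sys


def divide(a: float, b: float) -> float:
    """Hypothetical function under test: rejects a zero divisor explicitly."""
    if b == 0:
        raise ValueError("division by zero")
    return a / b


def test_zero_numerator():
    assert divide(0, 5) == 0


def test_negative_operands():
    assert divide(-1, -1) == 1


def test_zero_divisor_is_rejected():
    try:
        divide(10, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a zero divisor")


def test_zero_over_zero_is_rejected():
    try:
        divide(0, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for 0 / 0")


def test_overflow_yields_infinity():
    # Dividing the largest finite float by a small number overflows to inf.
    assert math.isinf(divide(sys.float_info.max, 0.001))
```

Writing these tests forces the design question the happy path hides: what should the function do at each boundary?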
539
+
540
+ **Detection rule:**
541
+ If a test file has only positive-case tests (inputs that exercise the main code path) and no tests with empty, null, zero, negative, or boundary inputs, suspect AP-12.
542
+
543
+ ---
544
+
545
+ ### AP-13: Commenting Out Failing Tests
546
+
547
+ **Also known as:** @Ignored Tests, Skipped Tests, Test Graveyard
548
+ **Frequency:** Common
549
+ **Severity:** High
550
+ **Detection difficulty:** Easy
551
+
552
+ **What it looks like:**
553
+ Tests that were commented out, annotated with `@Ignore`/`@Disabled`/`skip`, or wrapped in `if (false)` blocks. The test was failing, and instead of fixing it, someone disabled it. The comment says "TODO: fix this" or "temporarily disabled" -- from 18 months ago.
554
+
555
+ ```java
556
+ // BAD: Disabled "temporarily" in 2024
557
+ @Disabled("Flaky on CI, will fix later")
558
+ @Test
559
+ void testPaymentRefund() {
560
+ // ... test code that once worked ...
561
+ }
562
+ ```
563
+
564
+ **Why developers do it:**
565
+ The test is failing and blocking the CI pipeline. The developer does not have time to investigate. Disabling it unblocks the build immediately. The developer genuinely intends to fix it later. Later never comes. Other developers see the pattern and follow it.
566
+
567
+ **What goes wrong:**
568
+ Disabled tests are dead code that rots. The production code they tested continues to change, so the disabled test becomes increasingly invalid. The behavior it tested is now unverified -- if a bug is introduced in that code path, nothing catches it. Over time, a test suite can accumulate dozens of disabled tests, representing a growing blind spot. The practical impact is test debt: teams documented that when regression suites are not updated, "chaos forms quickly when testers attempt to execute regression tests with known defects or technical debt."
569
+
570
+ **The fix:**
571
+ Treat disabled tests as bugs. If a test fails, either fix it immediately or delete it and create a tracked ticket. Set a CI rule: no `@Disabled` without a linked issue. Run a weekly report of disabled tests. If a test has been disabled for more than 2 weeks without progress, delete it.
572
+
573
+ **Detection rule:**
574
+ Search for `@Ignore`, `@Disabled`, `skip(`, `xit(`, `xdescribe(`, `@pytest.mark.skip`, or commented-out test methods. If any exist without a linked issue tracker reference, suspect AP-13.
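
The search can be automated. The sketch below flags skip markers that carry no issue reference on the same line. The marker list comes from the rule above, while the issue-reference pattern (`PAY-123` or `#456` style) and the `find_unlinked_skips` name are assumptions to adapt to your tracker.

```python
import re

# Skip markers named in the detection rule.
SKIP_MARKER = re.compile(
    r"@Ignore|@Disabled|@pytest\.mark\.skip|\bskip\(|\bxit\(|\bxdescribe\("
)
# Assumed issue-reference convention: PROJECT-123 or #123.
ISSUE_REF = re.compile(r"[A-Z][A-Z0-9]+-\d+|#\d+")


def find_unlinked_skips(source: str):
    """Return (line_number, text) for skip markers without an issue reference."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SKIP_MARKER.search(line) and not ISSUE_REF.search(line):
            findings.append((number, line.strip()))
    return findings


sample = """\
@Disabled("Flaky on CI, will fix later")
void testPaymentRefund() {}
@Disabled("PAY-123: refund API contract changed")
void testPartialRefund() {}
"""
unlinked = find_unlinked_skips(sample)
```

Run as a CI step, a non-empty result can fail the build, which enforces the "no `@Disabled` without a linked issue" rule.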
575
+
576
+ ---
577
+
578
+ ### AP-14: Test Data Coupling
579
+
580
+ **Also known as:** Database-Dependent Tests, Seed Data Addiction, Fixture Coupling
581
+ **Frequency:** Common
582
+ **Severity:** High
583
+ **Detection difficulty:** Moderate
584
+
585
+ **What it looks like:**
586
+ Tests depend on specific data existing in a shared database, fixture file, or seed script. The test assumes that user ID 42 exists, that the "admin" role is pre-loaded, or that the test database was seeded before the suite ran.
587
+
588
+ ```ruby
589
+ # BAD: Depends on specific seed data
590
+ test "admin can delete users" do
591
+ admin = User.find(1) # Assumes seed data has admin with ID 1
592
+ target = User.find(42) # Assumes user 42 exists
593
+
594
+ delete :destroy, params: { id: target.id }, session: { user_id: admin.id }
595
+
596
+ assert_response :success
597
+ end
598
+ ```
599
+
600
+ **Why developers do it:**
601
+ Using pre-existing data is faster than creating it in each test. Seed files are "just there" in the test database. The developer tested locally where the seeds were loaded and did not realize the coupling.
602
+
603
+ **What goes wrong:**
604
+ Seed data changes break tests that depend on it. New developers run the tests without loading seeds and get mysterious failures. Tests cannot be run in parallel because they compete for the same database rows. Database migrations can invalidate fixture data, causing cascading test failures across the entire suite. In microservice architectures, shared test databases between services create implicit coupling that makes independent deployment impossible.
605
+
606
+ **The fix:**
607
+ Each test creates exactly the data it needs. Use factory patterns (Factory Bot, Fishery, test builders) to create test data declaratively. Use database transactions that roll back after each test. Never reference specific IDs or assume pre-existing data.
608
+
609
+ ```ruby
610
+ # GOOD: Test creates its own data
611
+ test "admin can delete users" do
612
+ admin = create(:user, role: :admin)
613
+ target = create(:user)
614
+
615
+ delete :destroy, params: { id: target.id }, session: { user_id: admin.id }
616
+
617
+ assert_response :success
618
+ assert_nil User.find_by(id: target.id)
619
+ end
620
+ ```
621
+
622
+ **Detection rule:**
623
+ If a test calls `find(literal_id)`, references hardcoded database IDs, or has a comment like "make sure seeds are loaded," suspect AP-14.
624
+
625
+ ---
626
+
627
+ ### AP-15: Using Sleep/Delays for Synchronization
628
+
629
+ **Also known as:** Thread.sleep Testing, Arbitrary Waits, Timing Bombs
630
+ **Frequency:** Common
631
+ **Severity:** Medium
632
+ **Detection difficulty:** Easy
633
+
634
+ **What it looks like:**
635
+ Tests use fixed-duration waits (`Thread.sleep`, `time.sleep`, `setTimeout`) to wait for asynchronous operations to complete, rather than using explicit synchronization mechanisms.
636
+
637
+ ```python
638
+ # BAD: Arbitrary sleep
639
+ def test_async_processing():
640
+     queue.submit(job)
641
+     time.sleep(2)  # Hope it's done by now
642
+     assert job.status == "completed"
643
+ ```
644
+
645
+ **Why developers do it:**
646
+ It is the simplest way to handle async behavior: "just wait long enough." The developer tested locally where 2 seconds was sufficient. The alternative (polling, callbacks, latches) requires more code and understanding of concurrency primitives.
647
+
648
+ **What goes wrong:**
649
+ The developer is stuck between two bad options: a short sleep that causes flaky failures on slow CI servers, or a long sleep that makes the test suite unnecessarily slow. A 2-second sleep across 100 async tests adds 3+ minutes to the suite. On loaded CI servers, even generous sleeps may be insufficient, causing intermittent failures. Enterprise Craftsmanship documented that "you cannot know for sure when exactly a job will complete and thus you are essentially guessing with the time interval."
650
+
651
+ **The fix:**
652
+ Use polling with a timeout: check the condition repeatedly with a short interval and a maximum wait time. Use synchronization primitives (CountDownLatch, Future, Promise). Use test-specific hooks that signal completion. For UI tests, use explicit wait conditions ("wait until element is visible") rather than `sleep`.
653
+
654
+ ```python
655
+ # GOOD: Polling with timeout
656
+ def test_async_processing():
657
+     queue.submit(job)
658
+     wait_until(lambda: job.status == "completed", timeout=5.0, interval=0.1)
659
+     assert job.status == "completed"
660
+ ```
661
+
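The `wait_until` helper used in the example is not a standard-library function. One possible implementation is sketched below, with a hypothetical `Job` class and a background thread standing in for the real queue.

```python
import threading
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll condition() until it is truthy; raise TimeoutError after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


class Job:
    """Hypothetical job that completes after a short delay."""

    def __init__(self):
        self.status = "pending"

    def run(self):
        time.sleep(0.05)
        self.status = "completed"


job = Job()
worker = threading.Thread(target=job.run)
worker.start()
# Returns as soon as the job finishes instead of always waiting the full timeout.
wait_until(lambda: job.status == "completed", timeout=2.0, interval=0.01)
worker.join()
```

The timeout is only an upper bound for the failure case, so a generous value does not slow down passing runs.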
662
+ **Detection rule:**
663
+ If a test contains `sleep`, `Thread.sleep`, `time.sleep`, `Task.Delay`, or `setTimeout` with a numeric literal, suspect AP-15. Every sleep in a test is a potential flakiness source.
664
+
665
+ ---
666
+
667
+ ### AP-16: Tautological Tests
668
+
669
+ **Also known as:** Self-Fulfilling Tests, Circular Assertions, Mirror Tests
670
+ **Frequency:** Occasional
671
+ **Severity:** High
672
+ **Detection difficulty:** Very Hard
673
+
674
+ **What it looks like:**
675
+ A test where the expected value is computed using the same logic as the production code, so the test is guaranteed to pass by construction. The test and the code use the same formula, making the test a tautology.
676
+
677
+ ```javascript
678
+ // BAD: Computing expected value with the same logic
679
+ test('calculates tax', () => {
680
+ const price = 100;
681
+ const taxRate = 0.08;
682
+ const expected = price * taxRate; // Same formula as production code
683
+
684
+ const result = calculateTax(price, taxRate);
685
+
686
+ expect(result).toBe(expected); // Will always pass, even if the formula is wrong
687
+ });
688
+
689
+ // BAD: Asserting what was just set up
690
+ test('user has correct name', () => {
691
+ const user = new User({ name: 'Alice' });
692
+ expect(user.name).toBe('Alice'); // Testing the constructor, not behavior
693
+ });
694
+ ```
695
+
696
+ **Why developers do it:**
697
+ The developer wants to avoid hardcoding expected values (which feels fragile) and instead derives them. They reason: "If the formula changes, the test will automatically update." They do not realize this defeats the purpose of the test. As Randy Coulman documented, "test code that's impossible to edit without looking at the implementation is a strong indicator that you've got a tautological test."
698
+
699
+ **What goes wrong:**
700
+ If the formula in production code has a bug (e.g., tax should be `price * taxRate / 100` but is `price * taxRate`), the test replicates the same bug and passes. The test can never fail for a logic error because it uses the same logic. This creates maximum false confidence: the code has 100% coverage and all tests pass, but the result is wrong.
701
+
702
+ **The fix:**
703
+ Always use pre-computed literal values as expected results. Work from concrete examples, not from derived computations. If you need to test `calculateTax(100, 0.08)`, the expected result is `8.00`, hardcoded.
704
+
705
+ ```javascript
706
+ // GOOD: Hardcoded expected value from requirements
707
+ test('calculates 8% tax on $100', () => {
708
+ expect(calculateTax(100, 0.08)).toBe(8.00);
709
+ });
710
+
711
+ test('calculates 8% tax on $250', () => {
712
+ expect(calculateTax(250, 0.08)).toBe(20.00);
713
+ });
714
+ ```
715
+
716
+ **Detection rule:**
717
+ If a test's expected value is computed using a function or formula (not a literal), and that computation mirrors the production code, suspect AP-16. Also flag tests that only assert on values that were directly set in the test setup.
718
+
719
+ ---
720
+
721
+ ### AP-17: Not Testing Error Paths
722
+
723
+ **Also known as:** Sunny Day Testing, Exception Blindness, Error Path Amnesia
724
+ **Frequency:** Very Common
725
+ **Severity:** High
726
+ **Detection difficulty:** Moderate
727
+
728
+ **What it looks like:**
729
+ Tests verify that the function works when given valid input but never test what happens with invalid input, network failures, timeout conditions, permission errors, or resource exhaustion.
730
+
731
+ ```python
732
+ # Tests exist for: create_user("valid_name", "valid@email.com")
733
+ # No tests for:
734
+ # create_user("", "") -- empty inputs
735
+ # create_user(None, None) -- null inputs
736
+ # create_user("a"*1000, "valid@email.com") -- oversized input
737
+ # create_user("valid", "valid@email.com") when DB is down
738
+ # create_user("valid", "valid@email.com") when duplicate exists
739
+ ```
740
+
741
+ **Why developers do it:**
742
+ Error paths are less interesting to write. The developer focuses on making the feature work and writes tests to confirm it works. Error handling is often an afterthought in both the production code and the tests. Testing error paths requires simulating failures (database down, network timeout), which is harder than testing the happy path.
743
+
744
+ **What goes wrong:**
745
+ Error paths are where production incidents happen. When the database connection pool is exhausted, when a downstream service returns an unexpected 500, when disk space runs out -- these are the scenarios that cause outages. If these paths are untested, the error handling code may be incorrect (swallowing exceptions, returning null instead of throwing, leaking resources). The code may crash with an unhandled exception, exposing stack traces to users or creating security vulnerabilities.
746
+
747
+ **The fix:**
748
+ For every function, explicitly list the error conditions and write tests for each. Test: invalid inputs, null/undefined values, resource failures (DB down, network timeout), concurrent access conflicts, permission violations, and resource limits. Use fault injection to simulate infrastructure failures.
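
A small sketch of fault injection with an in-memory fake follows. `InMemoryUserStore`, its `fail_on_insert` switch, the two exception types, and this version of `create_user` are all hypothetical; the idea is that the fake can be told to fail so that the error branches actually execute.

```python
class DatabaseUnavailable(Exception):
    pass


class DuplicateUser(Exception):
    pass


class InMemoryUserStore:
    """Fake store that can be told to fail, simulating a database outage."""

    def __init__(self, fail_on_insert=False):
        self.fail_on_insert = fail_on_insert
        self.rows = {}

    def insert(self, email, name):
        if self.fail_on_insert:
            raise DatabaseUnavailable("injected failure")
        if email in self.rows:
            raise DuplicateUser(email)
        self.rows[email] = name


def create_user(store, name, email):
    if not name or not email:
        raise ValueError("name and email are required")
    store.insert(email, name)
    return {"name": name, "email": email}


def assert_raises(expected, fn, *args):
    try:
        fn(*args)
    except expected:
        return
    raise AssertionError(f"expected {expected.__name__}")


def test_empty_inputs_are_rejected():
    assert_raises(ValueError, create_user, InMemoryUserStore(), "", "")


def test_database_outage_is_reported():
    store = InMemoryUserStore(fail_on_insert=True)
    assert_raises(DatabaseUnavailable, create_user, store, "valid", "valid@email.com")


def test_duplicate_user_is_rejected():
    store = InMemoryUserStore()
    create_user(store, "valid", "valid@email.com")
    assert_raises(DuplicateUser, create_user, store, "valid", "valid@email.com")
```

Each error condition from the list becomes one named test, so a missing error path shows up as a missing test rather than a production incident.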
749
+
750
+ **Detection rule:**
751
+ If a test file has only tests for valid inputs and no tests that expect exceptions, error codes, or error states, suspect AP-17. If the production code has `try/catch` blocks or error handling, but no tests exercise those branches, this is confirmed.
752
+
753
+ ---
754
+
755
+ ### AP-18: Testing the Framework
756
+
757
+ **Also known as:** Framework Verification, Third-Party Testing, Library QA
758
+ **Frequency:** Common
759
+ **Severity:** Medium
760
+ **Detection difficulty:** Moderate
761
+
762
+ **What it looks like:**
763
+ Tests that verify the behavior of the framework, library, or language runtime rather than the application's own logic. The test confirms that Spring injects dependencies, that React renders JSX, that the Django ORM persists objects, or that `Array.sort()` works.
764
+
765
+ ```java
766
+ // BAD: Testing that Spring DI works
767
+ @Test
768
+ void testServiceIsInjected() {
769
+ assertNotNull(userService);
770
+ assertNotNull(userRepository);
771
+ }
772
+
773
+ // BAD: Testing that JPA saves entities
774
+ @Test
775
+ void testUserIsSaved() {
776
+ User user = new User("Alice");
777
+ entityManager.persist(user);
778
+ entityManager.flush();
779
+ assertNotNull(user.getId()); // Testing that JPA generates IDs
780
+ }
781
+ ```
782
+
783
+ **Why developers do it:**
784
+ It is easy to write and provides coverage numbers. The developer is learning the framework and writes tests to confirm their understanding. Some testing tutorials use framework verification as examples, and developers carry the pattern into production code. It feels safer to "verify everything."
785
+
786
+ **What goes wrong:**
787
+ These tests provide zero value in catching application bugs. The framework is already tested by its own test suite (Spring has thousands of tests; you do not need to re-test dependency injection). They add to the maintenance burden and suite execution time. When the framework is upgraded, these tests may break due to internal changes, creating noise that obscures real failures. As the Codepipes blog documented: "Writing software tests for trivial code because this is the correct way to 'do TDD' will get you nowhere."
788
+
789
+ **The fix:**
790
+ Test your application's behavior, not the framework's. If your code configures Spring beans, test that your application behaves correctly (e.g., "when a user signs up, they receive a welcome email"), not that Spring wired the beans. Trust the framework's own tests.
791
+
792
+ **Detection rule:**
793
+ If a test's only assertions are `assertNotNull` on injected dependencies, or if the test verifies behavior that is documented in the framework's own documentation (e.g., "JPA generates IDs"), suspect AP-18.
794
+
795
+ ---
796
+
797
+ ### AP-19: Logic in Tests
798
+
799
+ **Also known as:** Conditional Tests, Test Spaghetti, Computed Assertions
800
+ **Frequency:** Occasional
801
+ **Severity:** Medium
802
+ **Detection difficulty:** Moderate
803
+
804
+ **What it looks like:**
805
+ Tests that contain conditional logic (`if/else`), loops (`for/while`), or complex computations. The test code itself is complex enough to have bugs.
806
+
807
+ ```python
808
+ # BAD: Logic in tests
809
+ def test_bulk_processing(self):
810
+     results = processor.process_batch(items)
811
+
812
+     for i, result in enumerate(results):
813
+         if items[i].type == "premium":
814
+             assert result.priority == "high"
815
+         elif items[i].type == "standard":
816
+             assert result.priority == "normal"
817
+         else:
818
+             assert result.priority == "low"
819
+ ```
820
+
821
+ **Why developers do it:**
822
+ The developer wants to test multiple scenarios efficiently. Loops feel like a DRY approach to testing. The developer does not realize that the logic in the test introduces the same risk of bugs that testing is supposed to catch.
823
+
824
+ **What goes wrong:**
825
+ As Gil Zilberfeld documented: "Logic is a petri dish for bugs. The reason we're testing in the first place is to make sure code that contains logic works. Adding logic to tests is like inviting a vampire into your home." If the conditional in the test has a bug, the test may silently skip assertions or assert the wrong thing. The test becomes harder to read and debug because the reader must trace the logic to understand what is being verified.
826
+
827
+ **The fix:**
828
+ Use parameterized tests for multiple scenarios. Each test path should be explicit and linear -- no branching. If a test needs a loop, it should be a parameterized test where each iteration is an independent test case.
829
+
830
+ ```python
831
+ # GOOD: Explicit, linear test cases
832
+ @pytest.mark.parametrize("item_type,expected_priority", [
833
+ ("premium", "high"),
834
+ ("standard", "normal"),
835
+ ("budget", "low"),
836
+ ])
837
+ def test_processing_priority(self, item_type, expected_priority):
838
+     item = create_item(type=item_type)
839
+     result = processor.process(item)
840
+     assert result.priority == expected_priority
841
+ ```
842
+
843
+ **Detection rule:**
844
+ If a test method contains `if`, `else`, `for`, `while`, or `switch`/`match` statements, suspect AP-19. Test methods should be linear: arrange, act, assert -- no branching.
845
+
846
+ ---
847
+
848
+ ### AP-20: Ignoring Test Maintenance
849
+
850
+ **Also known as:** Test Rot, Stale Tests, Abandoned Test Suite
851
+ **Frequency:** Common
852
+ **Severity:** High
853
+ **Detection difficulty:** Moderate
854
+
855
+ **What it looks like:**
856
+ The test suite has not been updated alongside the production code. Tests reference deprecated APIs, use outdated patterns, have hardcoded dates that have passed, or test features that no longer exist. Warning messages flood the test output. The suite "mostly passes" but the failures are background noise.
857
+
858
+ **Why developers do it:**
859
+ Tests are treated as second-class code. Production code has PR reviews, coding standards, and refactoring cycles; tests do not. When a feature changes, the developer updates the production code but says "I'll fix the tests later." Test maintenance is not tracked in sprint planning or velocity calculations. Nobody "owns" the test suite.
860
+
861
+ **What goes wrong:**
862
+ The test suite becomes unreliable and developers stop trusting it. Legitimate failures are ignored because "that test always fails." New developers cannot use the tests to understand the system because the tests describe old behavior. The cost of reviving an abandoned test suite grows quickly -- after 6 months of neglect, it is often cheaper to rewrite than to fix. Organizations documented that test debt, unlike code debt, affects how effectively teams validate quality, and the accumulation of gaps in test coverage reduces the team's ability to release with confidence.
863
+
864
+ **The fix:**
865
+ Treat test code with the same quality standards as production code. Include test updates in the definition of done for every feature. Assign test suite ownership. Track and trend: number of disabled tests, test suite execution time, flaky test rate. Budget 15-20% of development time for test maintenance.
866
+
867
+ **Detection rule:**
868
+ If the test suite has `@SuppressWarnings`, deprecation warnings, tests referencing removed classes, or tests that have not been modified in 12+ months while the production code has changed, suspect AP-20.
869
+
870
+ ---
871
+
872
+ ### AP-21: Excessive Test Setup
873
+
874
+ **Also known as:** The Ceremony, Test Novel, Arrangement Overload
875
+ **Frequency:** Common
876
+ **Severity:** Medium
877
+ **Detection difficulty:** Easy
878
+
879
+ **What it looks like:**
880
+ A test method where 80% of the code is setting up the context (creating objects, configuring dependencies, loading data) and only 2-3 lines perform the action and assertion. The setup is so long that the reader cannot see what is actually being tested.
881
+
882
+ ```java
883
+ // BAD: 30 lines of setup for a 2-line test
884
+ @Test
885
+ void testOrderDiscount() {
886
+ Address address = new Address("123 Main St", "City", "ST", "12345");
887
+ Customer customer = new Customer("Alice", "alice@test.com", address);
888
+ customer.setTier(CustomerTier.GOLD);
889
+ customer.setMemberSince(LocalDate.of(2020, 1, 1));
890
+ Product product1 = new Product("Widget", 29.99, Category.ELECTRONICS);
891
+ Product product2 = new Product("Gadget", 49.99, Category.ELECTRONICS);
892
+ product1.setWeight(0.5);
893
+ product2.setWeight(1.2);
894
+ Order order = new Order(customer);
895
+ order.addItem(product1, 2);
896
+ order.addItem(product2, 1);
897
+ order.setShippingMethod(ShippingMethod.STANDARD);
898
+ order.setCouponCode("SAVE10");
899
+ // ... 15 more lines of setup ...
900
+
901
+ double total = order.calculateTotal();
902
+
903
+ assertEquals(92.97, total, 0.01);
904
+ }
905
+ ```
906
+
907
+ **Why developers do it:**
908
+ The system under test has many dependencies and requires a complex object graph. The developer does not invest in builder patterns or factory methods because each test "only" needs this setup. Over time, every test copies and slightly modifies the same setup.
909
+
910
+ **What goes wrong:**
911
+ Tests become unreadable: the reader cannot determine what matters for this specific test and what is incidental setup. When the domain model changes (e.g., `Customer` requires a new field), every test with this setup must be updated. The setup noise hides the intent of the test, making it useless as documentation.
912
+
913
+ **The fix:**
914
+ Use the Builder or Object Mother pattern to create test data with sensible defaults. Only specify values that matter for the specific test. Extract common setup into well-named helper methods.
915
+
916
+ ```java
917
+ // GOOD: Builder with defaults, only specify what matters
918
+ @Test
919
+ void testGoldCustomerGets20PercentDiscount() {
920
+ Order order = anOrder()
921
+ .withGoldCustomer()
922
+ .withItems(aProduct().priced(100.00))
923
+ .build();
924
+
925
+ double total = order.calculateTotal();
926
+
927
+ assertEquals(80.00, total, 0.01);
928
+ }
929
+ ```
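
The same idea can be sketched in Python with a dataclass that supplies defaults. The `Order` fields, the 20% gold discount, and the `an_order` helper are hypothetical, chosen only to mirror the example above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    # Sensible defaults so tests only mention what they care about.
    customer_tier: str = "standard"
    item_price: float = 10.00
    quantity: int = 1

    def total(self) -> float:
        discount = 0.20 if self.customer_tier == "gold" else 0.0
        return round(self.item_price * self.quantity * (1 - discount), 2)


def an_order(**overrides) -> Order:
    """Test data builder: start from defaults, override only what matters."""
    return replace(Order(), **overrides)


def test_gold_customer_gets_20_percent_discount():
    order = an_order(customer_tier="gold", item_price=100.00)
    assert order.total() == 80.00


def test_standard_customer_pays_full_price():
    assert an_order(item_price=100.00).total() == 100.00
```

When `Order` gains a new required field, only the default in one place changes, not every test that builds an order.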
930
+
931
+ **Detection rule:**
932
+ If a test method has more than 10 lines of object construction before the action, or if the setup-to-assertion ratio exceeds 5:1, suspect AP-21.
933
+
934
+ ---
935
+
936
+ ### AP-22: Test Double Misuse
937
+
938
+ **Also known as:** Wrong Double, Stub-Mock Confusion, Fake Fragility
939
+ **Frequency:** Occasional
940
+ **Severity:** Medium
941
+ **Detection difficulty:** Hard
942
+
943
+ **What it looks like:**
944
+ Using the wrong type of test double for the situation: using a mock (which verifies interactions) when a stub (which provides canned answers) would suffice, or using a full mock framework when a simple hand-written fake would be clearer. Verifying every interaction with every mock, even when only the output matters.
945
+
946
+ ```python
947
+ # BAD: Using mock verification when only the output matters
948
+ def test_calculate_shipping(self):
949
+     mock_weight_service = Mock()
949
+     mock_weight_service.get_weight.return_value = 5.0
950
+
951
+     cost = calculator.calculate_shipping(order, mock_weight_service)
952
+
953
+     assert cost == 12.50
954
+     # Unnecessary: verifying HOW the weight was retrieved
955
+     mock_weight_service.get_weight.assert_called_once_with(order.items)
956
+     mock_weight_service.get_weight.assert_called_with(order.items)
958
+ ```
959
+
960
+ **Why developers do it:**
961
+ Mock frameworks make it trivially easy to add verification calls. The developer adds them "just to be thorough." The difference between mocks, stubs, fakes, and spies is poorly understood -- many developers use "mock" as a generic term for all test doubles.
962
+
963
+ **What goes wrong:**
964
+ Over-verified mocks couple tests to implementation details (AP-02). When the production code changes how it calls collaborators (e.g., batching calls, reordering calls, adding caching), the mock verifications fail even though the behavior is correct. This creates maintenance overhead and discourages refactoring. As the Cash App engineering blog puts it: "Mocking isn't evil, but avoid it anyway" -- not because mocking is wrong, but because it is chronically misused.
965
+
966
+ **The fix:**
967
+ Use stubs for queries (methods that return data) and mocks only for commands (methods that cause side effects you need to verify). Prefer hand-written fakes over mock frameworks for complex collaborators. Only verify interactions that are part of the test's purpose.
968
+
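A minimal sketch of the query/command split described in the fix, using Python's `unittest.mock`. The `ShippingCalculator`, `OrderProcessor`, `StubWeightService`, and `send_shipped_email` names, and the 2.50-per-unit rate, are hypothetical and exist only to make the example runnable.

```python
from unittest.mock import Mock


class ShippingCalculator:
    """Hypothetical production code: cost is weight times a flat rate."""

    RATE_PER_UNIT = 2.50

    def calculate_shipping(self, items, weight_service) -> float:
        return weight_service.get_weight(items) * self.RATE_PER_UNIT


class OrderProcessor:
    """Hypothetical production code whose whole purpose is a side effect."""

    def __init__(self, notifier) -> None:
        self._notifier = notifier

    def ship(self, order_id: str) -> None:
        self._notifier.send_shipped_email(order_id)


class StubWeightService:
    """Hand-written stub: returns a canned answer and records nothing."""

    def get_weight(self, items) -> float:
        return 5.0


def test_calculate_shipping_uses_stub_for_query() -> None:
    cost = ShippingCalculator().calculate_shipping(["book"], StubWeightService())
    # Query: only the output is asserted, not how the weight was fetched.
    assert cost == 12.50


def test_ship_sends_notification_uses_mock_for_command() -> None:
    notifier = Mock()
    OrderProcessor(notifier).ship("order-1")
    # Command: the email IS the behavior under test, so verifying it is justified.
    notifier.send_shipped_email.assert_called_once_with("order-1")
```

The first test keeps passing if the calculator later caches or batches weight lookups. The second would fail only if the notification itself stopped being sent.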
969
+ **Detection rule:**
970
+ If a test uses `.assert_called_with()` or `verify()` on a test double that is only used for its return value, suspect AP-22. If every mock in the test carries verification calls, treat the smell as confirmed.
971
+
972
+ ---
973
+
974
+ ## Root Cause Analysis
975
+
976
+ | Anti-Pattern | Root Cause | Prevention |
977
+ |---|---|---|
978
+ | AP-01: The Mockery | Cargo culting ("unit tests must be isolated") | Prefer sociable unit tests; mock only external boundaries |
979
+ | AP-02: Testing Implementation | Ignorance of behavior vs. structure distinction | Train on Kent Beck's test desiderata; review tests for structural coupling |
980
+ | AP-03: Flaky Tests | Laziness (non-deterministic shortcuts) | Inject clocks, seeds, and controlled dependencies; quarantine flaky tests |
981
+ | AP-04: Testing Private Methods | Ignorance (missing abstraction) | Extract complex private methods into separate classes |
982
+ | AP-05: Coverage Obsession | Cargo culting (metrics as goals) | Use mutation testing; set coverage floors not ceilings |
983
+ | AP-06: Assertion-Free Testing | Laziness (quick coverage credit) | Enforce assertion count > 0 in CI linter rules |
984
+ | AP-07: Ice Cream Cone | Ignorance (no unit test culture) | Adopt testing pyramid; require unit tests in PR reviews |
985
+ | AP-08: Shared Mutable State | Laziness (reusing existing data) | Fresh state per test; rollback transactions |
986
+ | AP-09: The Slow Suite | Premature integration (testing too much via I/O) | Set time budgets; profile slowest tests; parallelize |
987
+ | AP-10: Copy-Paste Tests | Laziness (faster than designing abstractions) | Use parameterized tests and test builders |
988
+ | AP-11: Testing Too Many Things | Laziness (maximize assertions per setup) | One concept per test; cheap test setup via builders |
989
+ | AP-12: Not Testing Edge Cases | Laziness (deadline pressure) | Boundary value checklists; property-based testing |
990
+ | AP-13: Commenting Out Tests | Laziness (unblock CI quickly) | CI rule: no @Disabled without issue link; weekly report of skipped tests |
991
+ | AP-14: Test Data Coupling | Ignorance (unaware of seed dependency) | Factory patterns; each test creates its own data |
992
+ | AP-15: Sleep/Delays | Ignorance (simplest async solution) | Polling with timeout; explicit synchronization |
993
+ | AP-16: Tautological Tests | Ignorance (computed expectations feel robust) | Always use hardcoded expected values |
994
+ | AP-17: Not Testing Errors | Laziness (error paths are boring) | Error condition checklist per function |
995
+ | AP-18: Testing the Framework | Cargo culting ("test everything") | Only test your application's behavior |
996
+ | AP-19: Logic in Tests | Copy-paste from AI tools or Stack Overflow (complex test patterns) | Parameterized tests; no branching in test methods |
997
+ | AP-20: Ignoring Maintenance | Laziness (tests are second-class) | Include test updates in definition of done |
998
+ | AP-21: Excessive Setup | Ignorance (no builder/factory patterns) | Object Mother / Builder pattern for test data |
999
+ | AP-22: Test Double Misuse | Ignorance (mock/stub/fake confusion) | Learn test double taxonomy; stubs for queries, mocks for commands |
1000
+
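To make one of the prevention entries concrete, here is a small sketch of "inject clocks" from the AP-03 row. The `is_expired` function and its `now` parameter are hypothetical; the point is only that the test supplies the time instead of reading the system clock.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def is_expired(
    issued_at: datetime,
    ttl: timedelta,
    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
) -> bool:
    """Hypothetical function under test; the clock is a parameter, not a global."""
    return now() - issued_at >= ttl


def test_token_expires_after_ttl() -> None:
    issued = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
    fixed_now = lambda: issued + timedelta(minutes=31)  # controlled clock
    assert is_expired(issued, timedelta(minutes=30), now=fixed_now)


def test_token_valid_before_ttl() -> None:
    issued = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
    fixed_now = lambda: issued + timedelta(minutes=29)
    assert not is_expired(issued, timedelta(minutes=30), now=fixed_now)
```

Production callers use the default clock, while tests pass a fixed one, so the result no longer depends on when or how fast the suite runs.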
1001
+ ---
1002
+
1003
+ ## Self-Check Questions
1004
+
1005
+ Ask these questions during code review or while writing tests:
1006
+
1007
+ 1. **Am I testing behavior or implementation?** If I refactored the internals without changing the output, would this test break?
1008
+ 2. **Does this test have at least one meaningful assertion?** Not just "it didn't throw" -- does it verify a specific output or state change?
1009
+ 3. **Could this test fail for the wrong reason?** Is it coupled to test execution order, system time, or specific database state?
1010
+ 4. **Am I mocking things I own?** Mocks should wrap external boundaries, not internal collaborators.
1011
+ 5. **What happens if I change the input to null, empty, zero, or MAX_INT?** Is that tested?
1012
+ 6. **Is the expected value hardcoded or computed?** If computed, am I just re-implementing the production logic in the test?
1013
+ 7. **Would a new team member understand what this test verifies from the test name alone?**
1014
+ 8. **If this test fails, will the failure message tell me exactly what broke?** Or will I need to debug?
1015
+ 9. **Am I testing my code or the framework's code?** Would this test be useful if I swapped frameworks?
1016
+ 10. **How long does this test take?** Would I notice if it were 10x slower?
1017
+ 11. **Does this test create its own data, or does it depend on data created elsewhere?**
1018
+ 12. **Am I using `sleep` or `Thread.sleep` in this test?** Is there a deterministic alternative?
1019
+ 13. **If I delete this test, what bug could slip into production undetected?** If none, the test may not be worth maintaining.
1020
+ 14. **Am I commenting out this test because it is flaky, or because it found a real bug I don't want to fix right now?**
1021
+ 15. **Does my test suite have tests for error paths, not just success paths?**
1022
+
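For question 12, one possible deterministic alternative to a fixed sleep is polling with a timeout. The `wait_until` helper below is a hypothetical sketch, not a standard library function; the `threading.Timer` merely stands in for real asynchronous work.

```python
import threading
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool], timeout: float = 2.0, interval: float = 0.01
) -> bool:
    """Poll `condition` until it is true or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)  # short pause between checks, bounded by the deadline
    return condition()  # one last check at the deadline


def test_background_job_completes() -> None:
    done = threading.Event()
    worker = threading.Timer(0.05, done.set)  # stands in for real async work
    worker.start()
    try:
        # Returns as soon as the job finishes instead of always waiting a fixed time.
        assert wait_until(done.is_set, timeout=2.0)
    finally:
        worker.cancel()
```

The test finishes as soon as the condition holds and fails only after the full timeout, so a generous timeout costs nothing on a fast machine. Where the code under test exposes an explicit synchronization primitive, waiting on that directly (here, `done.wait(timeout)`) is simpler still.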
1023
+ ---
1024
+
1025
+ ## Code Smell Quick Reference
1026
+
1027
+ | If you see... | Suspect... | Verify... |
1028
+ |---|---|---|
1029
+ | More mock setup lines than assertion lines | AP-01: The Mockery | Are internal collaborators mocked? Could real implementations be used? |
1030
+ | `spyOn(obj, '_privateMethod')` | AP-02: Testing Implementation | Does the test still pass after an internal refactoring? |
1031
+ | `Thread.sleep` or `time.sleep` in a test | AP-03/AP-15: Flaky/Sleep | Can this be replaced with polling or an injectable clock? |
1032
+ | Reflection to access private members | AP-04: Testing Private Methods | Should this logic be extracted to its own public class? |
1033
+ | Coverage report at 100% but bugs in production | AP-05: Coverage Obsession | Run mutation testing -- how many mutants survive? |
1034
+ | Test method with zero `assert`/`expect` calls | AP-06: Assertion-Free | What observable outcome should be verified? |
1035
+ | E2E test count > unit test count | AP-07: Ice Cream Cone | Can the same bug be caught by a unit test instead? |
1036
+ | `static` or class-level mutable fields in tests | AP-08: Shared State | Does each test method get a fresh instance? |
1037
+ | Full suite > 10 minutes | AP-09: Slow Suite | Which 10 tests are slowest? Do they use real I/O? |
1038
+ | 3+ test methods with >80% identical code | AP-10: Copy-Paste | Can these become a parameterized test? |
1039
+ | Test with 8+ assertions on different objects | AP-11: Too Many Things | Does the test name describe a single behavior? |
1040
+ | All test inputs are "valid" or "normal" | AP-12: No Edge Cases | Where are the null, empty, zero, and boundary tests? |
1041
+ | `@Disabled`, `@Ignore`, `skip(`, `xit(` | AP-13: Commented Out | Is there a linked issue? How old is this skip? |
1042
+ | `User.find(1)` or `Order.find(42)` in tests | AP-14: Data Coupling | Does the test create this data or depend on seeds? |
1043
+ | `expected = price * rate` in test assertion | AP-16: Tautological | Is the expected value a hardcoded literal? |
1044
+ | Test file with zero tests for exceptions/errors | AP-17: No Error Paths | What error handling code exists but is untested? |
1045
+ | `assertNotNull(injectedService)` as only assertion | AP-18: Testing Framework | Does this test verify application behavior? |
1046
+ | `if`/`else`/`for` inside a test method | AP-19: Logic in Tests | Can this be a parameterized test instead? |
1047
+ | Test warnings, deprecations in test output | AP-20: Stale Tests | When was this test last updated vs. production code? |
1048
+ | More than 10 lines of object construction before the action | AP-21: Excessive Setup | Is there a builder or factory pattern available? |
1049
+ | `mock.verify()` on every mock in the test | AP-22: Test Double Misuse | Is this mock used for its return value or its side effect? |
1050
+
1051
+ ---
1052
+
1053
+ *Researched: 2026-03-08 | Sources: [Software Testing Anti-patterns (Codepipes)](https://blog.codepipes.com/testing/software-testing-antipatterns.html), [Unit Testing Anti-Patterns Full List (Yegor256)](https://www.yegor256.com/2018/12/11/unit-testing-anti-patterns.html), [Flaky Tests at Google (Google Testing Blog)](https://testing.googleblog.com/2016/05/flaky-tests-at-google-and-how-we.html), [Unit Testing Principles, Practices, and Patterns (Manning)](https://livebook.manning.com/book/unit-testing/chapter-11), [The Case Against 100% Code Coverage (Codecov)](https://about.codecov.io/blog/the-case-against-100-code-coverage/), [Assertion Free Testing (Martin Fowler)](https://martinfowler.com/bliki/AssertionFreeTesting.html), [Tautological Tests (Randy Coulman)](https://randycoulman.com/blog/2016/12/20/tautological-tests/), [Test Desiderata (Kent Beck)](https://medium.com/@kentbeck_7670/test-desiderata-94150638a4b3), [Mocking is an Anti-Pattern (AmazingCTO)](https://www.amazingcto.com/mocking-is-an-antipattern-how-to-test-without-mocking/), [Mocking isn't evil (Cash App)](https://code.cash.app/mocking), [Test Code Duplication (xUnit Patterns)](http://xunitpatterns.com/Test%20Code%20Duplication.html), [Ice Cream Cone Anti-Pattern (BugBug)](https://bugbug.io/blog/software-testing/ice-cream-cone-anti-pattern/), [Test Flakiness at Spotify (Spotify Engineering)](https://engineering.atspotify.com/2019/11/test-flakiness-methods-for-identifying-and-dealing-with-flaky-tests/), [Revamping Android Testing at Dropbox (Dropbox Tech)](https://dropbox.tech/mobile/revamping-the-android-testing-pipeline-at-dropbox), [Logic in Tests (TestinGil)](https://www.everydayunittesting.com/2016/08/unit-test-anti-patterns-logic-in-tests.html), [Private Methods and Encapsulation (Vladimir Khorikov)](https://khorikov.org/posts/2020-03-26-private-methods-encapsulation/), [Non-determinism in Tests (Enterprise Craftsmanship)](https://enterprisecraftsmanship.com/posts/non-determinism-tests/), 
[LayerX QA Initiative (Autify)](https://nocode.autify.com/blog/layerxs-qa-initiative-dont-be-tempted-by-the-ice-cream-cone), [Flaky Tests Cost $4.3M (Medium)](https://medium.com/@ran.algawi/its-just-a-flaky-test-the-most-expensive-lie-in-engineering-4b18b0207d96), [An Empirical Analysis of Flaky Tests (U of Illinois)](https://mir.cs.illinois.edu/lamyaa/publications/fse14.pdf)*