@zimezone/z-command 1.1.1 → 1.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (288)
  1. package/README.md +13 -1
  2. package/dist/cli.js +1 -1
  3. package/dist/commands/init.d.ts.map +1 -1
  4. package/dist/commands/init.js +42 -10
  5. package/dist/commands/init.js.map +1 -1
  6. package/dist/platforms.d.ts.map +1 -1
  7. package/dist/platforms.js +11 -1
  8. package/dist/platforms.js.map +1 -1
  9. package/dist/types.d.ts +2 -0
  10. package/dist/types.d.ts.map +1 -1
  11. package/package.json +8 -3
  12. package/templates.zip +0 -0
  13. package/templates/agents/accessibility-expert.agent.md +0 -56
  14. package/templates/agents/ai-engineer.agent.md +0 -61
  15. package/templates/agents/angular-architect.agent.md +0 -49
  16. package/templates/agents/api-designer.agent.md +0 -40
  17. package/templates/agents/api-documenter.agent.md +0 -161
  18. package/templates/agents/architect-review.agent.md +0 -146
  19. package/templates/agents/arm-cortex-expert.agent.md +0 -288
  20. package/templates/agents/azure-infra-engineer.agent.md +0 -57
  21. package/templates/agents/backend-architect.agent.md +0 -309
  22. package/templates/agents/backend-developer.agent.md +0 -61
  23. package/templates/agents/backend-security-coder.agent.md +0 -152
  24. package/templates/agents/bash-pro.agent.md +0 -285
  25. package/templates/agents/blockchain-developer.agent.md +0 -57
  26. package/templates/agents/build-engineer.agent.md +0 -56
  27. package/templates/agents/business-analyst.agent.md +0 -47
  28. package/templates/agents/c-pro.agent.md +0 -35
  29. package/templates/agents/c4-code.agent.md +0 -320
  30. package/templates/agents/c4-component.agent.md +0 -227
  31. package/templates/agents/c4-container.agent.md +0 -248
  32. package/templates/agents/c4-context.agent.md +0 -235
  33. package/templates/agents/cli-developer.agent.md +0 -57
  34. package/templates/agents/cloud-architect.agent.md +0 -56
  35. package/templates/agents/code-architect.agent.md +0 -63
  36. package/templates/agents/code-reviewer.agent.md +0 -49
  37. package/templates/agents/competitive-analyst.agent.md +0 -48
  38. package/templates/agents/conductor-validator.agent.md +0 -245
  39. package/templates/agents/context-manager.agent.md +0 -55
  40. package/templates/agents/cpp-pro.agent.md +0 -59
  41. package/templates/agents/csharp-developer.agent.md +0 -57
  42. package/templates/agents/csharp-pro.agent.md +0 -38
  43. package/templates/agents/customer-support.agent.md +0 -148
  44. package/templates/agents/data-engineer.agent.md +0 -55
  45. package/templates/agents/data-researcher.agent.md +0 -55
  46. package/templates/agents/data-scientist.agent.md +0 -56
  47. package/templates/agents/database-admin.agent.md +0 -142
  48. package/templates/agents/database-administrator.agent.md +0 -50
  49. package/templates/agents/database-architect.agent.md +0 -238
  50. package/templates/agents/database-optimizer.agent.md +0 -144
  51. package/templates/agents/debugger.agent.md +0 -30
  52. package/templates/agents/deployment-engineer.agent.md +0 -0
  53. package/templates/agents/devops-engineer.agent.md +0 -59
  54. package/templates/agents/devops-troubleshooter.agent.md +0 -138
  55. package/templates/agents/django-developer.agent.md +0 -50
  56. package/templates/agents/django-pro.agent.md +0 -159
  57. package/templates/agents/docs-architect.agent.md +0 -77
  58. package/templates/agents/documentation-engineer.agent.md +0 -57
  59. package/templates/agents/dotnet-architect.agent.md +0 -175
  60. package/templates/agents/dx-optimizer.agent.md +0 -63
  61. package/templates/agents/electron-pro.agent.md +0 -56
  62. package/templates/agents/elixir-pro.agent.md +0 -38
  63. package/templates/agents/embedded-systems.agent.md +0 -55
  64. package/templates/agents/error-detective.agent.md +0 -32
  65. package/templates/agents/event-sourcing-architect.agent.md +0 -42
  66. package/templates/agents/fastapi-pro.agent.md +0 -171
  67. package/templates/agents/fintech-engineer.agent.md +0 -57
  68. package/templates/agents/firmware-analyst.agent.md +0 -330
  69. package/templates/agents/flutter-expert.agent.md +0 -50
  70. package/templates/agents/frontend-developer.agent.md +0 -59
  71. package/templates/agents/frontend-security-coder.agent.md +0 -149
  72. package/templates/agents/fullstack-developer.agent.md +0 -46
  73. package/templates/agents/git-workflow-manager.agent.md +0 -57
  74. package/templates/agents/golang-pro.agent.md +0 -50
  75. package/templates/agents/graphql-architect.agent.md +0 -48
  76. package/templates/agents/haskell-pro.agent.md +0 -37
  77. package/templates/agents/hr-pro.agent.md +0 -105
  78. package/templates/agents/incident-responder.agent.md +0 -190
  79. package/templates/agents/ios-developer.agent.md +0 -198
  80. package/templates/agents/iot-engineer.agent.md +0 -56
  81. package/templates/agents/java-architect.agent.md +0 -48
  82. package/templates/agents/java-pro.agent.md +0 -156
  83. package/templates/agents/javascript-pro.agent.md +0 -35
  84. package/templates/agents/julia-pro.agent.md +0 -187
  85. package/templates/agents/kotlin-specialist.agent.md +0 -50
  86. package/templates/agents/laravel-specialist.agent.md +0 -50
  87. package/templates/agents/legacy-modernizer.agent.md +0 -56
  88. package/templates/agents/legal-advisor.agent.md +0 -49
  89. package/templates/agents/llm-architect.agent.md +0 -58
  90. package/templates/agents/malware-analyst.agent.md +0 -272
  91. package/templates/agents/mcp-developer.agent.md +0 -54
  92. package/templates/agents/mermaid-expert.agent.md +0 -39
  93. package/templates/agents/microservices-architect.agent.md +0 -47
  94. package/templates/agents/minecraft-bukkit-pro.agent.md +0 -104
  95. package/templates/agents/ml-engineer.agent.md +0 -56
  96. package/templates/agents/mlops-engineer.agent.md +0 -56
  97. package/templates/agents/mobile-developer.agent.md +0 -45
  98. package/templates/agents/mobile-security-coder.agent.md +0 -163
  99. package/templates/agents/monorepo-architect.agent.md +0 -44
  100. package/templates/agents/multi-agent-coordinator.agent.md +0 -55
  101. package/templates/agents/network-engineer.agent.md +0 -57
  102. package/templates/agents/nextjs-developer.agent.md +0 -48
  103. package/templates/agents/nlp-engineer.agent.md +0 -58
  104. package/templates/agents/observability-engineer.agent.md +0 -228
  105. package/templates/agents/payment-integration.agent.md +0 -56
  106. package/templates/agents/performance-engineer.agent.md +0 -167
  107. package/templates/agents/performance-optimizer.agent.md +0 -57
  108. package/templates/agents/php-pro.agent.md +0 -43
  109. package/templates/agents/platform-engineer.agent.md +0 -57
  110. package/templates/agents/posix-shell-pro.agent.md +0 -284
  111. package/templates/agents/postgres-pro.agent.md +0 -58
  112. package/templates/agents/product-manager.agent.md +0 -55
  113. package/templates/agents/project-manager.agent.md +0 -57
  114. package/templates/agents/prompt-engineer.agent.md +0 -58
  115. package/templates/agents/python-pro.agent.md +0 -48
  116. package/templates/agents/quant-analyst.agent.md +0 -32
  117. package/templates/agents/rails-expert.agent.md +0 -50
  118. package/templates/agents/react-specialist.agent.md +0 -49
  119. package/templates/agents/refactoring-specialist.agent.md +0 -56
  120. package/templates/agents/reference-builder.agent.md +0 -167
  121. package/templates/agents/research-analyst.agent.md +0 -63
  122. package/templates/agents/reverse-engineer.agent.md +0 -202
  123. package/templates/agents/risk-manager.agent.md +0 -41
  124. package/templates/agents/ruby-pro.agent.md +0 -35
  125. package/templates/agents/rust-pro.agent.md +0 -156
  126. package/templates/agents/sales-automator.agent.md +0 -35
  127. package/templates/agents/scala-pro.agent.md +0 -60
  128. package/templates/agents/scrum-master.agent.md +0 -54
  129. package/templates/agents/search-specialist.agent.md +0 -59
  130. package/templates/agents/security-analyst.agent.md +0 -57
  131. package/templates/agents/security-auditor.agent.md +0 -138
  132. package/templates/agents/security-engineer.agent.md +0 -57
  133. package/templates/agents/seo-authority-builder.agent.md +0 -116
  134. package/templates/agents/seo-cannibalization-detector.agent.md +0 -103
  135. package/templates/agents/seo-content-auditor.agent.md +0 -63
  136. package/templates/agents/seo-content-planner.agent.md +0 -88
  137. package/templates/agents/seo-content-refresher.agent.md +0 -98
  138. package/templates/agents/seo-content-writer.agent.md +0 -76
  139. package/templates/agents/seo-keyword-strategist.agent.md +0 -75
  140. package/templates/agents/seo-meta-optimizer.agent.md +0 -72
  141. package/templates/agents/seo-snippet-hunter.agent.md +0 -94
  142. package/templates/agents/seo-specialist.agent.md +0 -57
  143. package/templates/agents/seo-structure-architect.agent.md +0 -88
  144. package/templates/agents/service-mesh-expert.agent.md +0 -41
  145. package/templates/agents/sql-pro.agent.md +0 -146
  146. package/templates/agents/sre-engineer.agent.md +0 -58
  147. package/templates/agents/swift-expert.agent.md +0 -49
  148. package/templates/agents/task-distributor.agent.md +0 -47
  149. package/templates/agents/tdd-orchestrator.agent.md +0 -183
  150. package/templates/agents/technical-writer.agent.md +0 -48
  151. package/templates/agents/temporal-python-pro.agent.md +0 -349
  152. package/templates/agents/terraform-engineer.agent.md +0 -57
  153. package/templates/agents/terraform-specialist.agent.md +0 -137
  154. package/templates/agents/test-automator.agent.md +0 -203
  155. package/templates/agents/test-engineer.agent.md +0 -55
  156. package/templates/agents/threat-modeling-expert.agent.md +0 -44
  157. package/templates/agents/trend-analyst.agent.md +0 -47
  158. package/templates/agents/tutorial-engineer.agent.md +0 -118
  159. package/templates/agents/typescript-pro.agent.md +0 -48
  160. package/templates/agents/ui-designer.agent.md +0 -48
  161. package/templates/agents/ui-ux-designer.agent.md +0 -188
  162. package/templates/agents/ui-visual-validator.agent.md +0 -192
  163. package/templates/agents/ux-researcher.agent.md +0 -48
  164. package/templates/agents/vector-database-engineer.agent.md +0 -43
  165. package/templates/agents/vue-expert.agent.md +0 -48
  166. package/templates/agents/websocket-engineer.agent.md +0 -49
  167. package/templates/agents/workflow-orchestrator.agent.md +0 -48
  168. package/templates/skills/angular-migration/SKILL.md +0 -410
  169. package/templates/skills/api-design-principles/SKILL.md +0 -528
  170. package/templates/skills/api-design-principles/assets/api-design-checklist.md +0 -155
  171. package/templates/skills/api-design-principles/assets/rest-api-template.py +0 -182
  172. package/templates/skills/api-design-principles/references/graphql-schema-design.md +0 -583
  173. package/templates/skills/api-design-principles/references/rest-best-practices.md +0 -408
  174. package/templates/skills/architecture-decision-records/SKILL.md +0 -428
  175. package/templates/skills/architecture-patterns/SKILL.md +0 -494
  176. package/templates/skills/async-python-patterns/SKILL.md +0 -694
  177. package/templates/skills/auth-implementation-patterns/SKILL.md +0 -634
  178. package/templates/skills/changelog-automation/SKILL.md +0 -552
  179. package/templates/skills/code-review/SKILL.md +0 -62
  180. package/templates/skills/code-review-excellence/SKILL.md +0 -520
  181. package/templates/skills/competitive-landscape/SKILL.md +0 -479
  182. package/templates/skills/context-driven-development/SKILL.md +0 -385
  183. package/templates/skills/cost-optimization/SKILL.md +0 -274
  184. package/templates/skills/cqrs-implementation/SKILL.md +0 -554
  185. package/templates/skills/data-quality-frameworks/SKILL.md +0 -587
  186. package/templates/skills/data-storytelling/SKILL.md +0 -453
  187. package/templates/skills/database-migration/SKILL.md +0 -424
  188. package/templates/skills/dbt-transformation-patterns/SKILL.md +0 -561
  189. package/templates/skills/debugging-strategies/SKILL.md +0 -527
  190. package/templates/skills/defi-protocol-templates/SKILL.md +0 -454
  191. package/templates/skills/dependency-upgrade/SKILL.md +0 -409
  192. package/templates/skills/deployment-pipeline-design/SKILL.md +0 -359
  193. package/templates/skills/distributed-tracing/SKILL.md +0 -438
  194. package/templates/skills/dotnet-backend-patterns/SKILL.md +0 -815
  195. package/templates/skills/dotnet-backend-patterns/assets/repository-template.cs +0 -523
  196. package/templates/skills/dotnet-backend-patterns/assets/service-template.cs +0 -336
  197. package/templates/skills/dotnet-backend-patterns/references/dapper-patterns.md +0 -544
  198. package/templates/skills/dotnet-backend-patterns/references/ef-core-best-practices.md +0 -355
  199. package/templates/skills/e2e-testing-patterns/SKILL.md +0 -547
  200. package/templates/skills/employment-contract-templates/SKILL.md +0 -507
  201. package/templates/skills/error-handling-patterns/SKILL.md +0 -636
  202. package/templates/skills/event-store-design/SKILL.md +0 -437
  203. package/templates/skills/fastapi-templates/SKILL.md +0 -567
  204. package/templates/skills/git-advanced-workflows/SKILL.md +0 -400
  205. package/templates/skills/github-actions-templates/SKILL.md +0 -333
  206. package/templates/skills/go-concurrency-patterns/SKILL.md +0 -655
  207. package/templates/skills/grafana-dashboards/SKILL.md +0 -369
  208. package/templates/skills/helm-chart-scaffolding/SKILL.md +0 -544
  209. package/templates/skills/helm-chart-scaffolding/assets/Chart.yaml.template +0 -42
  210. package/templates/skills/helm-chart-scaffolding/assets/values.yaml.template +0 -185
  211. package/templates/skills/helm-chart-scaffolding/references/chart-structure.md +0 -500
  212. package/templates/skills/helm-chart-scaffolding/scripts/validate-chart.sh +0 -244
  213. package/templates/skills/javascript-testing-patterns/SKILL.md +0 -1025
  214. package/templates/skills/langchain-architecture/SKILL.md +0 -338
  215. package/templates/skills/llm-evaluation/SKILL.md +0 -471
  216. package/templates/skills/microservices-patterns/SKILL.md +0 -595
  217. package/templates/skills/modern-javascript-patterns/SKILL.md +0 -911
  218. package/templates/skills/monorepo-management/SKILL.md +0 -622
  219. package/templates/skills/nextjs-app-router-patterns/SKILL.md +0 -544
  220. package/templates/skills/nodejs-backend-patterns/SKILL.md +0 -1020
  221. package/templates/skills/nx-workspace-patterns/SKILL.md +0 -452
  222. package/templates/skills/openapi-spec-generation/SKILL.md +0 -1028
  223. package/templates/skills/paypal-integration/SKILL.md +0 -467
  224. package/templates/skills/pci-compliance/SKILL.md +0 -466
  225. package/templates/skills/postgresql/SKILL.md +0 -204
  226. package/templates/skills/projection-patterns/SKILL.md +0 -490
  227. package/templates/skills/prometheus-configuration/SKILL.md +0 -392
  228. package/templates/skills/prompt-engineering-patterns/SKILL.md +0 -201
  229. package/templates/skills/prompt-engineering-patterns/assets/few-shot-examples.json +0 -106
  230. package/templates/skills/prompt-engineering-patterns/assets/prompt-template-library.md +0 -246
  231. package/templates/skills/prompt-engineering-patterns/references/chain-of-thought.md +0 -399
  232. package/templates/skills/prompt-engineering-patterns/references/few-shot-learning.md +0 -369
  233. package/templates/skills/prompt-engineering-patterns/references/prompt-optimization.md +0 -414
  234. package/templates/skills/prompt-engineering-patterns/references/prompt-templates.md +0 -470
  235. package/templates/skills/prompt-engineering-patterns/references/system-prompts.md +0 -189
  236. package/templates/skills/prompt-engineering-patterns/scripts/optimize-prompt.py +0 -279
  237. package/templates/skills/python-packaging/SKILL.md +0 -870
  238. package/templates/skills/python-performance-optimization/SKILL.md +0 -869
  239. package/templates/skills/python-testing-patterns/SKILL.md +0 -907
  240. package/templates/skills/rag-implementation/SKILL.md +0 -403
  241. package/templates/skills/react-modernization/SKILL.md +0 -513
  242. package/templates/skills/react-native-architecture/SKILL.md +0 -671
  243. package/templates/skills/react-state-management/SKILL.md +0 -429
  244. package/templates/skills/risk-metrics-calculation/SKILL.md +0 -555
  245. package/templates/skills/rust-async-patterns/SKILL.md +0 -517
  246. package/templates/skills/secrets-management/SKILL.md +0 -346
  247. package/templates/skills/security-requirement-extraction/SKILL.md +0 -677
  248. package/templates/skills/security-review/SKILL.md +0 -78
  249. package/templates/skills/shellcheck-configuration/SKILL.md +0 -454
  250. package/templates/skills/similarity-search-patterns/SKILL.md +0 -558
  251. package/templates/skills/slo-implementation/SKILL.md +0 -329
  252. package/templates/skills/sql-optimization-patterns/SKILL.md +0 -493
  253. package/templates/skills/stripe-integration/SKILL.md +0 -442
  254. package/templates/skills/systematic-debugging/SKILL.md +0 -57
  255. package/templates/skills/tailwind-design-system/SKILL.md +0 -666
  256. package/templates/skills/temporal-python-testing/SKILL.md +0 -158
  257. package/templates/skills/temporal-python-testing/resources/integration-testing.md +0 -455
  258. package/templates/skills/temporal-python-testing/resources/local-setup.md +0 -553
  259. package/templates/skills/temporal-python-testing/resources/replay-testing.md +0 -462
  260. package/templates/skills/temporal-python-testing/resources/unit-testing.md +0 -328
  261. package/templates/skills/terraform-module-library/SKILL.md +0 -249
  262. package/templates/skills/terraform-module-library/references/aws-modules.md +0 -63
  263. package/templates/skills/test-driven-development/SKILL.md +0 -46
  264. package/templates/skills/threat-mitigation-mapping/SKILL.md +0 -745
  265. package/templates/skills/track-management/SKILL.md +0 -593
  266. package/templates/skills/typescript-advanced-types/SKILL.md +0 -717
  267. package/templates/skills/ui-ux-pro-max/SKILL.md +0 -352
  268. package/templates/skills/ui-ux-pro-max/data/charts.csv +0 -26
  269. package/templates/skills/ui-ux-pro-max/data/colors.csv +0 -97
  270. package/templates/skills/ui-ux-pro-max/data/icons.csv +0 -101
  271. package/templates/skills/ui-ux-pro-max/data/landing.csv +0 -31
  272. package/templates/skills/ui-ux-pro-max/data/products.csv +0 -97
  273. package/templates/skills/ui-ux-pro-max/data/prompts.csv +0 -24
  274. package/templates/skills/ui-ux-pro-max/data/react-performance.csv +0 -45
  275. package/templates/skills/ui-ux-pro-max/data/styles.csv +0 -59
  276. package/templates/skills/ui-ux-pro-max/data/typography.csv +0 -58
  277. package/templates/skills/ui-ux-pro-max/data/ui-reasoning.csv +0 -101
  278. package/templates/skills/ui-ux-pro-max/data/ux-guidelines.csv +0 -100
  279. package/templates/skills/ui-ux-pro-max/data/web-interface.csv +0 -31
  280. package/templates/skills/ui-ux-pro-max/scripts/core.py +0 -258
  281. package/templates/skills/ui-ux-pro-max/scripts/design_system.py +0 -547
  282. package/templates/skills/ui-ux-pro-max/scripts/search.py +0 -76
  283. package/templates/skills/uv-package-manager/SKILL.md +0 -831
  284. package/templates/skills/vector-index-tuning/SKILL.md +0 -521
  285. package/templates/skills/wcag-audit-patterns/SKILL.md +0 -555
  286. package/templates/skills/workflow-orchestration-patterns/SKILL.md +0 -316
  287. package/templates/skills/workflow-patterns/SKILL.md +0 -623
  288. package/templates/skills/writing-plans/SKILL.md +0 -64
@@ -1,414 +0,0 @@
- # Prompt Optimization Guide
-
- ## Systematic Refinement Process
-
- ### 1. Baseline Establishment
- ```python
- def establish_baseline(prompt, test_cases):
-     results = {
-         'accuracy': 0,
-         'avg_tokens': 0,
-         'avg_latency': 0,
-         'success_rate': 0
-     }
-
-     for test_case in test_cases:
-         response = llm.complete(prompt.format(**test_case['input']))
-
-         results['accuracy'] += evaluate_accuracy(response, test_case['expected'])
-         results['avg_tokens'] += count_tokens(response)
-         results['avg_latency'] += measure_latency(response)
-         results['success_rate'] += is_valid_response(response)
-
-     # Average across test cases
-     n = len(test_cases)
-     return {k: v/n for k, v in results.items()}
- ```
-
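The baseline routine above leaves `llm`, `evaluate_accuracy`, and `measure_latency` undefined, and latency cannot be recovered from a response after the fact. A minimal runnable sketch follows. It assumes a hypothetical `complete` callable in place of the LLM client and exact-match scoring in place of `evaluate_accuracy`, and it times each call directly:

```python
import time
from statistics import mean


def establish_baseline(complete, prompt, test_cases):
    """Run every test case once and average accuracy and latency.

    `complete` is a hypothetical callable (prompt text -> response text)
    standing in for whatever LLM client is in use.
    """
    if not test_cases:
        raise ValueError("test_cases must not be empty")

    scores, latencies = [], []
    for case in test_cases:
        text = prompt.format(**case["input"])
        start = time.perf_counter()  # time the call itself
        response = complete(text)
        latencies.append(time.perf_counter() - start)
        # Exact match is an assumption; swap in a task-specific scorer.
        scores.append(float(response.strip() == case["expected"]))

    return {"accuracy": mean(scores), "avg_latency": mean(latencies)}


def fake_llm(text):
    # Stub model used only to make the sketch runnable.
    return "4" if "2 + 2" in text else "unknown"


cases = [
    {"input": {"q": "2 + 2"}, "expected": "4"},
    {"input": {"q": "3 + 5"}, "expected": "8"},
]
baseline = establish_baseline(fake_llm, "Answer briefly: {q}", cases)
print(baseline["accuracy"])
```

Passing the client in as an argument keeps the routine testable with a stub, as shown with `fake_llm`.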
- ### 2. Iterative Refinement Workflow
- ```
- Initial Prompt → Test → Analyze Failures → Refine → Test → Repeat
- ```
-
- ```python
- class PromptOptimizer:
-     def __init__(self, initial_prompt, test_suite):
-         self.prompt = initial_prompt
-         self.test_suite = test_suite
-         self.history = []
-
-     def optimize(self, max_iterations=10):
-         for i in range(max_iterations):
-             # Test current prompt
-             results = self.evaluate_prompt(self.prompt)
-             self.history.append({
-                 'iteration': i,
-                 'prompt': self.prompt,
-                 'results': results
-             })
-
-             # Stop if good enough
-             if results['accuracy'] > 0.95:
-                 break
-
-             # Analyze failures
-             failures = self.analyze_failures(results)
-
-             # Generate refinement suggestions
-             refinements = self.generate_refinements(failures)
-
-             # Apply best refinement
-             self.prompt = self.select_best_refinement(refinements)
-
-         return self.get_best_prompt()
- ```
-
- ### 3. A/B Testing Framework
- ```python
- import random
-
- import numpy as np
- from scipy import stats
-
- class PromptABTest:
-     def __init__(self, variant_a, variant_b):
-         self.variant_a = variant_a
-         self.variant_b = variant_b
-
-     def run_test(self, test_queries, metrics=('accuracy', 'latency')):
-         results = {
-             'A': {m: [] for m in metrics},
-             'B': {m: [] for m in metrics}
-         }
-
-         for query in test_queries:
-             # Randomly assign variant (50/50 split)
-             variant = 'A' if random.random() < 0.5 else 'B'
-             prompt = self.variant_a if variant == 'A' else self.variant_b
-
-             response, metrics_data = self.execute_with_metrics(
-                 prompt.format(query=query['input'])
-             )
-
-             for metric in metrics:
-                 results[variant][metric].append(metrics_data[metric])
-
-         return self.analyze_results(results)
-
-     def analyze_results(self, results):
-         analysis = {}
-         for metric in results['A'].keys():
-             a_values = results['A'][metric]
-             b_values = results['B'][metric]
-
-             # Statistical significance test
-             t_stat, p_value = stats.ttest_ind(a_values, b_values)
-
-             # 'improvement' and 'winner' assume higher is better;
-             # flip the comparison for metrics such as latency
-             analysis[metric] = {
-                 'A_mean': np.mean(a_values),
-                 'B_mean': np.mean(b_values),
-                 'improvement': (np.mean(b_values) - np.mean(a_values)) / np.mean(a_values),
-                 'statistically_significant': p_value < 0.05,
-                 'p_value': p_value,
-                 'winner': 'B' if np.mean(b_values) > np.mean(a_values) else 'A'
-             }
-
-         return analysis
- ```
-
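The `analyze_results` method above always names the variant with the higher mean as the winner, which suits accuracy but not latency, where lower is better. The following sketch makes the direction explicit. It swaps `scipy.stats.ttest_ind` for a two-sided permutation test so that it runs on the standard library alone; `compare_variants`, `permutation_p_value`, and the `higher_is_better` parameter are hypothetical names, and the latency figures are made up for illustration:

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference of means."""
    rng = random.Random(seed)
    observed = abs(mean(b) - mean(a))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        # Re-split the shuffled pool into groups of the original sizes.
        diff = abs(mean(pooled[len(a):]) - mean(pooled[:len(a)]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate away from exactly zero.
    return (hits + 1) / (n_resamples + 1)


def compare_variants(a, b, higher_is_better=True, alpha=0.05):
    mean_a, mean_b = mean(a), mean(b)
    b_better = mean_b > mean_a if higher_is_better else mean_b < mean_a
    p_value = permutation_p_value(a, b)
    return {
        "A_mean": mean_a,
        "B_mean": mean_b,
        "p_value": p_value,
        "significant": p_value < alpha,
        "winner": "B" if b_better else "A",
    }


# Illustrative latencies in seconds (invented data).
latency_a = [1.9, 2.1, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9]
latency_b = [1.1, 1.0, 1.2, 0.9, 1.1, 1.0, 1.2, 1.0]
result = compare_variants(latency_a, latency_b, higher_is_better=False)
print(result["winner"], result["significant"])
```

A winner is only worth acting on when `significant` is also true; otherwise the difference in means may be noise.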
- ## Optimization Strategies
-
- ### Token Reduction
- ```python
- def optimize_for_tokens(prompt):
-     optimizations = [
-         # Remove redundant phrases
-         ('in order to', 'to'),
-         ('due to the fact that', 'because'),
-         ('at this point in time', 'now'),
-
-         # Consolidate instructions
-         ('First, ...\nThen, ...\nFinally, ...', 'Steps: 1) ... 2) ... 3) ...'),
-
-         # Use abbreviations (after first definition)
-         ('Natural Language Processing (NLP)', 'NLP'),
-
-         # Remove filler words
-         (' actually ', ' '),
-         (' basically ', ' '),
-         (' really ', ' ')
-     ]
-
-     optimized = prompt
-     for old, new in optimizations:
-         optimized = optimized.replace(old, new)
-
-     return optimized
- ```
-
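Plain `str.replace` as used above is case-sensitive and only removes filler words that have a space on both sides. A hedged variant using word-boundary regular expressions is sketched below, with a way to check the saving. `REWRITES` and `rough_token_count` are hypothetical names, and splitting on whitespace is only a crude stand-in for a real tokenizer:

```python
import re

# (pattern, replacement) pairs; \b keeps matches on whole words only.
REWRITES = [
    (r"\bin order to\b", "to"),
    (r"\bdue to the fact that\b", "because"),
    (r"\bat this point in time\b", "now"),
    (r"\b(?:actually|basically|really)\b\s*", ""),
]


def optimize_for_tokens(prompt):
    optimized = prompt
    for pattern, replacement in REWRITES:
        optimized = re.sub(pattern, replacement, optimized, flags=re.IGNORECASE)
    # Collapse any double spaces left behind by removed words.
    return re.sub(r"[ \t]{2,}", " ", optimized).strip()


def rough_token_count(text):
    # Whitespace split: an assumption, not a model tokenizer.
    return len(text.split())


before = (
    "You really need to cite sources in order to answer, due to the fact "
    "that accuracy actually matters at this point in time."
)
after = optimize_for_tokens(before)
print(after)
print(rough_token_count(before), rough_token_count(after))
```

Rewrites like these can change meaning or capitalization at the start of a sentence, so the shortened prompt should be re-tested against the same test cases before it replaces the original.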
- ### Latency Reduction
- ```python
- def optimize_for_latency(prompt):
-     strategies = {
-         'shorter_prompt': reduce_token_count(prompt),
-         'streaming': enable_streaming_response(prompt),
-         'caching': add_cacheable_prefix(prompt),
-         'early_stopping': add_stop_sequences(prompt)
-     }
-
-     # Test each strategy
-     best_strategy = None
-     best_latency = float('inf')
-
-     for name, modified_prompt in strategies.items():
-         latency = measure_average_latency(modified_prompt)
-         if latency < best_latency:
-             best_latency = latency
-             best_strategy = modified_prompt
-
-     return best_strategy
- ```
-
- ### Accuracy Improvement
- ```python
- def improve_accuracy(prompt, failure_cases):
-     improvements = []
-
-     # Add constraints for common failures
-     if has_format_errors(failure_cases):
-         improvements.append("Output must be valid JSON with no additional text.")
-
-     # Add examples for edge cases
-     edge_cases = identify_edge_cases(failure_cases)
-     if edge_cases:
-         improvements.append(f"Examples of edge cases:\n{format_examples(edge_cases)}")
-
-     # Add verification step
-     if has_logical_errors(failure_cases):
-         improvements.append("Before responding, verify your answer is logically consistent.")
-
-     # Strengthen instructions
-     if has_ambiguity_errors(failure_cases):
-         improvements.append(clarify_ambiguous_instructions(prompt))
-
-     return integrate_improvements(prompt, improvements)
- ```
-
- ## Performance Metrics
-
- ### Core Metrics
- ```python
- from collections import Counter, defaultdict
-
- import numpy as np
-
- class PromptMetrics:
-     @staticmethod
-     def accuracy(responses, ground_truth):
-         return sum(r == gt for r, gt in zip(responses, ground_truth)) / len(responses)
-
-     @staticmethod
-     def consistency(responses):
-         # Measure how often identical inputs produce identical outputs
-         input_responses = defaultdict(list)
-
-         for inp, resp in responses:
-             input_responses[inp].append(resp)
-
-         consistency_scores = []
-         for inp, resps in input_responses.items():
-             if len(resps) > 1:
-                 # Percentage of responses that match the most common response
-                 most_common_count = Counter(resps).most_common(1)[0][1]
-                 consistency_scores.append(most_common_count / len(resps))
-
-         return np.mean(consistency_scores) if consistency_scores else 1.0
-
-     @staticmethod
-     def token_efficiency(prompt, responses):
-         avg_prompt_tokens = np.mean([count_tokens(prompt.format(**r['input'])) for r in responses])
-         avg_response_tokens = np.mean([count_tokens(r['output']) for r in responses])
-         return avg_prompt_tokens + avg_response_tokens
-
-     @staticmethod
-     def latency_p95(latencies):
-         return np.percentile(latencies, 95)
- ```
-
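To make the `consistency` metric above concrete, here is a standard-library sketch with a small worked input. It follows the same definition (share of responses that match the most common response, averaged over inputs seen more than once); the sample pairs are invented:

```python
from collections import Counter, defaultdict
from statistics import mean


def consistency(pairs):
    """Average, over repeated inputs, of the share of responses that
    match the most common response for that input."""
    by_input = defaultdict(list)
    for prompt_input, response in pairs:
        by_input[prompt_input].append(response)

    scores = []
    for responses in by_input.values():
        if len(responses) > 1:  # a single run says nothing about consistency
            top_count = Counter(responses).most_common(1)[0][1]
            scores.append(top_count / len(responses))

    return mean(scores) if scores else 1.0


pairs = [
    ("q1", "yes"), ("q1", "yes"), ("q1", "no"),  # 2 of 3 agree
    ("q2", "4"), ("q2", "4"),                    # 2 of 2 agree
    ("q3", "x"),                                 # single run, skipped
]
score = consistency(pairs)
print(round(score, 4))
```

Here `q1` scores 2/3 and `q2` scores 1, so the result is their mean, 5/6.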
- ### Automated Evaluation
- ```python
- import time
-
- import numpy as np
-
- def evaluate_prompt_comprehensively(prompt, test_suite):
-     results = {
-         'accuracy': [],
-         'consistency': [],
-         'latency': [],
-         'tokens': [],
-         'success_rate': []
-     }
-
-     # Run each test case multiple times for consistency measurement
-     for test_case in test_suite:
-         formatted_prompt = prompt.format(**test_case['input'])
-         runs = []
-         for _ in range(3):  # 3 runs per test case
-             start = time.time()
-             response = llm.complete(formatted_prompt)
-             latency = time.time() - start
-
-             runs.append(response)
-             results['latency'].append(latency)
-             # Count tokens of the formatted prompt, not the raw template
-             results['tokens'].append(count_tokens(formatted_prompt) + count_tokens(response))
-
-         # Accuracy (best of 3 runs)
-         accuracies = [evaluate_accuracy(r, test_case['expected']) for r in runs]
-         results['accuracy'].append(max(accuracies))
-
-         # Consistency (how similar are the 3 runs?)
-         results['consistency'].append(calculate_similarity(runs))
-
-         # Success rate (all runs successful?)
-         results['success_rate'].append(all(is_valid_response(r) for r in runs))
-
-     return {
-         'avg_accuracy': np.mean(results['accuracy']),
-         'avg_consistency': np.mean(results['consistency']),
-         'p95_latency': np.percentile(results['latency'], 95),
-         'avg_tokens': np.mean(results['tokens']),
-         'success_rate': np.mean(results['success_rate'])
-     }
- ```
-
- ## Failure Analysis
-
- ### Categorizing Failures
- ```python
- class FailureAnalyzer:
-     def categorize_failures(self, test_results):
-         categories = {
-             'format_errors': [],
-             'factual_errors': [],
-             'logic_errors': [],
-             'incomplete_responses': [],
-             'hallucinations': [],
-             'off_topic': []
-         }
-
-         for result in test_results:
-             if not result['success']:
-                 category = self.determine_failure_type(
-                     result['response'],
-                     result['expected']
-                 )
-                 categories[category].append(result)
-
-         return categories
-
-     def generate_fixes(self, categorized_failures):
-         fixes = []
-
-         if categorized_failures['format_errors']:
-             fixes.append({
-                 'issue': 'Format errors',
-                 'fix': 'Add explicit format examples and constraints',
-                 'priority': 'high'
-             })
-
-         if categorized_failures['hallucinations']:
-             fixes.append({
-                 'issue': 'Hallucinations',
-                 'fix': 'Add grounding instruction: "Base your answer only on provided context"',
-                 'priority': 'critical'
-             })
-
-         if categorized_failures['incomplete_responses']:
-             fixes.append({
-                 'issue': 'Incomplete responses',
-                 'fix': 'Add: "Ensure your response fully addresses all parts of the question"',
-                 'priority': 'medium'
-             })
-
-         return fixes
- ```
-
- ## Versioning and Rollback
-
- ### Prompt Version Control
- ```python
- from datetime import datetime
-
- class PromptVersionControl:
-     def __init__(self, storage_path):
-         self.storage = storage_path
-         self.versions = []
-
-     def save_version(self, prompt, metadata):
-         version = {
-             'id': len(self.versions),
-             'prompt': prompt,
-             'timestamp': datetime.now(),
-             'metrics': metadata.get('metrics', {}),
-             'description': metadata.get('description', ''),
-             'parent_id': metadata.get('parent_id')
-         }
-         self.versions.append(version)
-         self.persist()
-         return version['id']
-
-     def rollback(self, version_id):
-         # Reject negative ids, which would otherwise index from the end
-         if 0 <= version_id < len(self.versions):
-             return self.versions[version_id]['prompt']
-         raise ValueError(f"Version {version_id} not found")
-
-     def compare_versions(self, v1_id, v2_id):
-         v1 = self.versions[v1_id]
-         v2 = self.versions[v2_id]
-
-         return {
-             'diff': generate_diff(v1['prompt'], v2['prompt']),
-             'metrics_comparison': {
-                 metric: {
-                     'v1': v1['metrics'].get(metric),
-                     'v2': v2['metrics'].get(metric),
-                     'change': v2['metrics'].get(metric, 0) - v1['metrics'].get(metric, 0)
-                 }
-                 for metric in set(v1['metrics'].keys()) | set(v2['metrics'].keys())
-             }
-         }
- ```
-
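The class above relies on an undefined `generate_diff` helper and a `persist` method. One way to fill in the diff step is the standard library's `difflib.unified_diff`, as in the in-memory sketch below. Dropping persistence, the simplified `save_version` signature, and the `metric_changes` key are assumptions made for brevity:

```python
import difflib
from datetime import datetime, timezone


class PromptVersionControl:
    """In-memory sketch; persistence is deliberately left out."""

    def __init__(self):
        self.versions = []

    def save_version(self, prompt, metrics=None, description=""):
        version = {
            "id": len(self.versions),
            "prompt": prompt,
            "timestamp": datetime.now(timezone.utc),
            "metrics": dict(metrics or {}),
            "description": description,
        }
        self.versions.append(version)
        return version["id"]

    def rollback(self, version_id):
        if not 0 <= version_id < len(self.versions):
            raise ValueError(f"Version {version_id} not found")
        return self.versions[version_id]["prompt"]

    def compare_versions(self, v1_id, v2_id):
        v1, v2 = self.versions[v1_id], self.versions[v2_id]
        diff = list(difflib.unified_diff(
            v1["prompt"].splitlines(),
            v2["prompt"].splitlines(),
            fromfile=f"v{v1_id}",
            tofile=f"v{v2_id}",
            lineterm="",
        ))
        # Only metrics recorded for both versions get a numeric change.
        shared = v1["metrics"].keys() & v2["metrics"].keys()
        changes = {m: v2["metrics"][m] - v1["metrics"][m] for m in shared}
        return {"diff": diff, "metric_changes": changes}


vc = PromptVersionControl()
first = vc.save_version("Summarize this", {"accuracy": 0.5})
second = vc.save_version("Summarize in 3 bullet points", {"accuracy": 0.75})
comparison = vc.compare_versions(first, second)
print("\n".join(comparison["diff"]))
```

Restricting `metric_changes` to metrics present in both versions avoids reporting a misleading change against an implicit zero.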
- ## Best Practices
-
- 1. **Establish Baseline**: Always measure initial performance
- 2. **Change One Thing**: Isolate variables for clear attribution
- 3. **Test Thoroughly**: Use diverse, representative test cases
- 4. **Track Metrics**: Log all experiments and results
- 5. **Validate Significance**: Use statistical tests for A/B comparisons
- 6. **Document Changes**: Keep detailed notes on what and why
- 7. **Version Everything**: Enable rollback to previous versions
- 8. **Monitor Production**: Continuously evaluate deployed prompts
-
- ## Common Optimization Patterns
-
- ### Pattern 1: Add Structure
- ```
- Before: "Analyze this text"
- After: "Analyze this text for:\n1. Main topic\n2. Key arguments\n3. Conclusion"
- ```
-
- ### Pattern 2: Add Examples
- ```
- Before: "Extract entities"
- After: "Extract entities\n\nExample:\nText: Apple released iPhone\nEntities: {company: Apple, product: iPhone}"
- ```
-
- ### Pattern 3: Add Constraints
- ```
- Before: "Summarize this"
- After: "Summarize in exactly 3 bullet points, 15 words each"
- ```
-
- ### Pattern 4: Add Verification
- ```
- Before: "Calculate..."
- After: "Calculate... Then verify your calculation is correct before responding."
- ```
-
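The four patterns above can be combined programmatically. The helper below is a hypothetical sketch, not part of the package: `build_prompt` and its parameters are invented names, and the exact wording of the constraint and verification lines is an assumption:

```python
def build_prompt(task, aspects=(), example=None, constraint=None, verify=False):
    """Compose a prompt from a bare task plus optional refinements."""
    # Pattern 1: add structure as a numbered list of aspects.
    lines = [f"{task} for:" if aspects else task]
    lines += [f"{i}. {aspect}" for i, aspect in enumerate(aspects, start=1)]
    # Pattern 2: add a worked example.
    if example:
        lines += ["", "Example:", example]
    # Pattern 3: add an explicit constraint.
    if constraint:
        lines += ["", f"Constraint: {constraint}"]
    # Pattern 4: add a verification step.
    if verify:
        lines += ["", "Verify your answer is correct before responding."]
    return "\n".join(lines)


prompt = build_prompt(
    "Analyze this text",
    aspects=["Main topic", "Key arguments", "Conclusion"],
    verify=True,
)
print(prompt)
```

Each refinement is optional, so a single change can be tested in isolation, in line with the "Change One Thing" practice.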
- ## Tools and Utilities
-
- - Prompt diff tools for version comparison
- - Automated test runners
- - Metric dashboards
- - A/B testing frameworks
- - Token counting utilities
- - Latency profilers