cnhkmcp 2.1.9__py3-none-any.whl → 2.3.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (142)
  1. cnhkmcp/__init__.py +1 -1
  2. cnhkmcp/untracked/AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221/BRAIN_AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221Mac_Linux/321/207/320/231/320/230/321/206/320/254/320/274.zip +0 -0
  3. cnhkmcp/untracked/AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221//321/205/320/237/320/234/321/205/320/227/342/225/227/321/205/320/276/320/231/321/210/320/263/320/225AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221_Windows/321/207/320/231/320/230/321/206/320/254/320/274.exe +0 -0
  4. cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/README.md +1 -1
  5. cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/config.json +2 -2
  6. cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/main.py +1 -1
  7. cnhkmcp/untracked/AI/321/206/320/261/320/234/321/211/320/255/320/262/321/206/320/237/320/242/321/204/342/225/227/342/225/242/vector_db/chroma.sqlite3 +0 -0
  8. cnhkmcp/untracked/APP/Tranformer/Transformer.py +2 -2
  9. cnhkmcp/untracked/APP/Tranformer/transformer_config.json +1 -1
  10. cnhkmcp/untracked/APP/blueprints/feature_engineering.py +2 -2
  11. cnhkmcp/untracked/APP/blueprints/inspiration_house.py +4 -4
  12. cnhkmcp/untracked/APP/blueprints/paper_analysis.py +3 -3
  13. cnhkmcp/untracked/APP/give_me_idea/BRAIN_Alpha_Template_Expert_SystemPrompt.md +34 -73
  14. cnhkmcp/untracked/APP/give_me_idea/alpha_data_specific_template_master.py +2 -2
  15. cnhkmcp/untracked/APP/give_me_idea/what_is_Alpha_template.md +366 -1
  16. cnhkmcp/untracked/APP/static/inspiration.js +345 -13
  17. cnhkmcp/untracked/APP/templates/index.html +11 -3
  18. cnhkmcp/untracked/APP/templates/transformer_web.html +1 -1
  19. cnhkmcp/untracked/APP/trailSomeAlphas/README.md +38 -0
  20. cnhkmcp/untracked/APP/trailSomeAlphas/ace.log +66 -0
  21. cnhkmcp/untracked/APP/trailSomeAlphas/enhance_template.py +588 -0
  22. cnhkmcp/untracked/APP/trailSomeAlphas/requirements.txt +3 -0
  23. cnhkmcp/untracked/APP/trailSomeAlphas/run_pipeline.py +1001 -0
  24. cnhkmcp/untracked/APP/trailSomeAlphas/run_pipeline_step_by_step.ipynb +5258 -0
  25. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/OUTPUT_TEMPLATE.md +325 -0
  26. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/SKILL.md +503 -0
  27. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/examples.md +244 -0
  28. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/output_report/ASI_delay1_analyst11_ideas.md +285 -0
  29. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/reference.md +399 -0
  30. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/SKILL.md +40 -0
  31. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/config.json +6 -0
  32. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709385783386000.json +388 -0
  33. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709386274840400.json +131 -0
  34. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709386838244700.json +1926 -0
  35. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709387369198500.json +31 -0
  36. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709387908905800.json +1926 -0
  37. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709388486243600.json +240 -0
  38. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709389024058600.json +1926 -0
  39. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709389549608700.json +41 -0
  40. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709390068714000.json +110 -0
  41. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709390591996900.json +36 -0
  42. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709391129137100.json +31 -0
  43. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709391691643500.json +41 -0
  44. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709392192099200.json +31 -0
  45. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709392703423500.json +46 -0
  46. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769709393213729400.json +246 -0
  47. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710186683932500.json +388 -0
  48. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710187165414300.json +131 -0
  49. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710187665211700.json +1926 -0
  50. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710188149193400.json +31 -0
  51. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710188667627400.json +1926 -0
  52. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710189220822000.json +240 -0
  53. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710189726189500.json +1926 -0
  54. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710190248066100.json +41 -0
  55. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710190768298700.json +110 -0
  56. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710191282588100.json +36 -0
  57. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710191838960900.json +31 -0
  58. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710192396688000.json +41 -0
  59. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710192941922400.json +31 -0
  60. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710193473524600.json +46 -0
  61. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710194001961200.json +246 -0
  62. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710420975888800.json +46 -0
  63. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710421647590100.json +196 -0
  64. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710422131378500.json +5 -0
  65. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710422644184400.json +196 -0
  66. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710423702350600.json +196 -0
  67. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_1_idea_1769710424244661800.json +5 -0
  68. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/analyst11_ASI_delay1.csv +211 -0
  69. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/analyst11_ASI_delay1/final_expressions.json +7062 -0
  70. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/ace.log +3 -0
  71. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/ace_lib.py +1514 -0
  72. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/fetch_dataset.py +113 -0
  73. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/helpful_functions.py +180 -0
  74. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/implement_idea.py +236 -0
  75. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/merge_expression_list.py +90 -0
  76. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/parsetab.py +60 -0
  77. cnhkmcp/untracked/APP/trailSomeAlphas/skills/template_final_enhance/op/321/206/320/220/342/225/227/321/207/342/225/227/320/243.md +434 -0
  78. cnhkmcp/untracked/APP/trailSomeAlphas/skills/template_final_enhance/sample_prompt.md +62 -0
  79. cnhkmcp/untracked/APP/trailSomeAlphas/skills/template_final_enhance//321/205/320/235/320/245/321/205/320/253/320/260/321/205/320/275/320/240/321/206/320/220/320/255/321/210/320/220/320/223/321/211/320/220/342/225/227/321/210/342/225/233/320/241/321/211/320/243/342/225/233.md +354 -0
  80. cnhkmcp/untracked/APP/usage.md +2 -2
  81. cnhkmcp/untracked/APP//321/210/342/224/220/320/240/321/210/320/261/320/234/321/206/320/231/320/243/321/205/342/225/235/320/220/321/206/320/230/320/241.py +388 -8
  82. cnhkmcp/untracked/skills/alpha-expression-verifier/scripts/validator.py +889 -0
  83. cnhkmcp/untracked/skills/brain-data-feature-engineering/OUTPUT_TEMPLATE.md +325 -0
  84. cnhkmcp/untracked/skills/brain-data-feature-engineering/SKILL.md +263 -0
  85. cnhkmcp/untracked/skills/brain-data-feature-engineering/examples.md +244 -0
  86. cnhkmcp/untracked/skills/brain-data-feature-engineering/reference.md +493 -0
  87. cnhkmcp/untracked/skills/brain-feature-implementation/SKILL.md +87 -0
  88. cnhkmcp/untracked/skills/brain-feature-implementation/config.json +6 -0
  89. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/analyst15_GLB_delay1.csv +289 -0
  90. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/final_expressions.json +410 -0
  91. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588244.json +4 -0
  92. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588251.json +20 -0
  93. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588273.json +23 -0
  94. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588293.json +23 -0
  95. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588319.json +23 -0
  96. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588322.json +14 -0
  97. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588325.json +20 -0
  98. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588328.json +23 -0
  99. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588354.json +23 -0
  100. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588357.json +23 -0
  101. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588361.json +23 -0
  102. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588364.json +23 -0
  103. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588368.json +23 -0
  104. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588391.json +14 -0
  105. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588394.json +23 -0
  106. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588397.json +59 -0
  107. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588400.json +35 -0
  108. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588403.json +20 -0
  109. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588428.json +23 -0
  110. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588431.json +32 -0
  111. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588434.json +20 -0
  112. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588438.json +20 -0
  113. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588441.json +14 -0
  114. cnhkmcp/untracked/skills/brain-feature-implementation/data/analyst15_GLB_delay1/idea_1768588468.json +20 -0
  115. cnhkmcp/untracked/skills/brain-feature-implementation/scripts/ace_lib.py +1514 -0
  116. cnhkmcp/untracked/skills/brain-feature-implementation/scripts/fetch_dataset.py +107 -0
  117. cnhkmcp/untracked/skills/brain-feature-implementation/scripts/helpful_functions.py +180 -0
  118. cnhkmcp/untracked/skills/brain-feature-implementation/scripts/implement_idea.py +165 -0
  119. cnhkmcp/untracked/skills/brain-feature-implementation/scripts/merge_expression_list.py +88 -0
  120. cnhkmcp/untracked/skills/brain-improve-alpha-performance/arXiv_API_Tool_Manual.md +490 -0
  121. cnhkmcp/untracked/skills/brain-improve-alpha-performance/reference.md +1 -1
  122. cnhkmcp/untracked/skills/brain-improve-alpha-performance/scripts/arxiv_api.py +229 -0
  123. cnhkmcp/untracked/skills/planning-with-files/SKILL.md +211 -0
  124. cnhkmcp/untracked/skills/planning-with-files/examples.md +202 -0
  125. cnhkmcp/untracked/skills/planning-with-files/reference.md +218 -0
  126. cnhkmcp/untracked/skills/planning-with-files/scripts/check-complete.sh +44 -0
  127. cnhkmcp/untracked/skills/planning-with-files/scripts/init-session.sh +120 -0
  128. cnhkmcp/untracked/skills/planning-with-files/templates/findings.md +95 -0
  129. cnhkmcp/untracked/skills/planning-with-files/templates/progress.md +114 -0
  130. cnhkmcp/untracked/skills/planning-with-files/templates/task_plan.md +132 -0
  131. cnhkmcp/untracked//321/211/320/225/320/235/321/207/342/225/234/320/276/321/205/320/231/320/235/321/210/342/224/220/320/240/321/210/320/261/320/234/321/206/320/230/320/241_/321/205/320/276/320/231/321/210/320/263/320/225/321/205/342/224/220/320/225/321/210/320/266/320/221/321/204/342/225/233/320/255/321/210/342/225/241/320/246/321/205/320/234/320/225.py +35 -11
  132. cnhkmcp/vector_db/_manifest.json +1 -0
  133. cnhkmcp/vector_db/_meta.json +1 -0
  134. {cnhkmcp-2.1.9.dist-info → cnhkmcp-2.3.0.dist-info}/METADATA +1 -1
  135. {cnhkmcp-2.1.9.dist-info → cnhkmcp-2.3.0.dist-info}/RECORD +142 -31
  136. /cnhkmcp/untracked/{skills/expression_verifier → APP/trailSomeAlphas/skills/brain-feature-implementation}/scripts/validator.py +0 -0
  137. /cnhkmcp/untracked/skills/{expression_verifier → alpha-expression-verifier}/SKILL.md +0 -0
  138. /cnhkmcp/untracked/skills/{expression_verifier → alpha-expression-verifier}/scripts/verify_expr.py +0 -0
  139. {cnhkmcp-2.1.9.dist-info → cnhkmcp-2.3.0.dist-info}/WHEEL +0 -0
  140. {cnhkmcp-2.1.9.dist-info → cnhkmcp-2.3.0.dist-info}/entry_points.txt +0 -0
  141. {cnhkmcp-2.1.9.dist-info → cnhkmcp-2.3.0.dist-info}/licenses/LICENSE +0 -0
  142. {cnhkmcp-2.1.9.dist-info → cnhkmcp-2.3.0.dist-info}/top_level.txt +0 -0
@@ -8,4 +8,369 @@ CAPM beta (slope) template: same regression with rettype=2; pre-clean target/mar
8
8
  CAPM generalized to any feature: data = winsorize(ts_backfill(<data>,63),std=4); data_gpm = group_mean(data, log(ts_mean(cap,21)), sector); resid = ts_regression(data, data_gpm, 252, rettype=0). Rationale: pull out the component unexplained by group average of same feature; reduces common-mode exposure.
9
9
  Actual vs estimate spread (analyst): group_zscore( group_zscore(<act>, industry) – group_zscore(<est>, industry), industry ) or the abstracted group_compare(diff(group_compare(act,...), group_compare(est,...)), ...). Rationale: surprise/beat-miss signal within industry, normalized to peers to avoid level bias.
10
10
  Analyst term-structure (fp1 vs fy1/fp2/fy2): group_zscore( group_zscore(anl14_mean_eps_<period1>, industry) – group_zscore(anl14_mean_eps_<period2>, industry), industry ) with operator/group slots. Rationale: cross-period expectation steepness; rising near-term vs long-term forecasts can flag momentum/inflection.
11
- Option Greeks net spread: group_operator(<put_greek> - <call_greek>, <grouping_data>) over industry/sector (Delta/Gamma/Vega/Theta). Rationale: options-implied sentiment/convexity skew vs peers; outlier net Greeks may precede spot moves; extend with multi-Greek composites or time-series deltas.
11
+ Option Greeks net spread: group_operator(<put_greek> - <call_greek>, <grouping_data>) over industry/sector (Delta/Gamma/Vega/Theta). Rationale: options-implied sentiment/convexity skew vs peers; outlier net Greeks may precede spot moves; extend with multi-Greek composites or time-series deltas.
12
+
13
+
14
+ # BRAIN Alpha Template Expert - System Prompt
15
+
16
+ ## Core Identity & Philosophy
17
+
18
+ You are an elite WorldQuant BRAIN Alpha Template Specialist with deep expertise in quantitative finance, signal processing, and alpha construction. Your core competencies include:
19
+
20
+ 1. **Operator Mastery**: Comprehensive understanding of 500+ BRAIN operators across preprocessing, cross-sectional ranking, time-series smoothing, conditional logic, and vector operations
21
+ 2. **Dataset Intelligence**: Deep knowledge of fundamental data (balance sheet, income statement, cash flow), analyst estimates (EPS, revenue, ratings), alternative data (sentiment, web traffic, satellite), and microstructure data (volume, bid-ask, tick data)
22
+ 3. **Economic Intuition**: Ability to translate economic hypotheses (value, momentum, quality, volatility, liquidity) into testable alpha expressions
23
+ 4. **Template Construction**: Systematic approach to building reusable alpha recipes with clear parameter slots for search optimization
24
+ 5. **Best Practices Adherence**: Following data cleaning protocols, neutralization strategies, turnover management, and correlation checks
25
+
26
+ ---
27
+
28
+ ## Operator Mastery (5 Categories)
29
+
30
+ ### 1. Preprocessing & Data Cleaning
31
+ **Purpose**: Handle outliers, missing values, and scale normalization before transformation
32
+
33
+ **Core Operators**:
34
+ - `winsorize(x, std=4)`: Clip extreme values to reduce outlier impact (e.g., `winsorize(close/open, std=3)`)
35
+ - `fillna(x, value)`: Replace NaN with constant or method (e.g., `fillna(revenue, 0)`)
36
+ - `replace(x, old, new)`: Conditional replacement (e.g., `replace(div_yield, 0, nan)` to remove zero dividends)
37
+ - `normalize(x)`: Scale to [0,1] range
38
+ - `group_zscore(x, group)`: Standardize to mean=0, std=1 within group for cross-sectional comparison
39
+
40
+ **Best Practice**: Always winsorize raw data → handle NaN → normalize/zscore before ranking
41
+
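A minimal Python sketch of the clean-then-standardize order recommended above. It assumes `winsorize(x, std=k)` clips to the mean plus or minus k standard deviations and that `group_zscore` standardizes within each group; the helper implementations and sample data are hypothetical illustrations, not BRAIN's own code, and NaN handling is left out for brevity.

```python
from collections import defaultdict
from statistics import mean, pstdev


def winsorize(values, std=4.0):
    """Clip each value to mean +/- std * standard deviation (assumed semantics)."""
    m, s = mean(values), pstdev(values)
    lo, hi = m - std * s, m + std * s
    return [min(max(v, lo), hi) for v in values]


def group_zscore(values, groups):
    """Standardize each value against the mean and std of its own group."""
    buckets = defaultdict(list)
    for v, g in zip(values, groups):
        buckets[g].append(v)
    stats = {g: (mean(vs), pstdev(vs)) for g, vs in buckets.items()}
    out = []
    for v, g in zip(values, groups):
        m, s = stats[g]
        out.append((v - m) / s if s > 0 else 0.0)
    return out


# Hypothetical cross-section: one extreme value in the "fin" group.
raw = [1.0, 2.0, 3.0, 100.0, 4.0, 5.0]
groups = ["tech", "tech", "tech", "fin", "fin", "fin"]

cleaned = winsorize(raw, std=1.0)       # the outlier is pulled toward the mean
scores = group_zscore(cleaned, groups)  # each group now averages to zero
```

Clipping before standardizing keeps a single extreme value from inflating the group's standard deviation and flattening every other score.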
42
+ ---
43
+
44
+ ### 2. Cross-Sectional Operations
45
+ **Purpose**: Rank stocks relative to peers at each timestamp
46
+
47
+ **Core Operators**:
48
+ - `rank(x)`: Percentile rank within universe (primary tool for signal construction)
49
+ - `group_rank(x, group)`: Rank within industry/sector/country (e.g., `group_rank(earnings_yield, industry)`)
50
+ - `group_neutralize(x, group)`: Remove group average (e.g., `group_neutralize(momentum, sector)` for sector-neutral momentum)
51
+ - `regression_neut(y, x)`: Remove linear exposure to factor (e.g., `regression_neut(returns, mkt_beta)` for market-neutral alpha)
52
+
53
+ **Template Pattern**:
54
+ ```
55
+ group_rank(group_neutralize(group_zscore(winsorize([DATA_FIELD], std=3), [GROUP]), [GROUP]), [GROUP])
56
+ ```
57
+
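The cross-sectional operators can be illustrated with a small Python sketch. It assumes `rank` maps values to evenly spaced ranks in [0, 1] and `group_neutralize` subtracts the group mean; both helpers and the sample data are hypothetical stand-ins, not the platform's implementation.

```python
from collections import defaultdict


def rank(values):
    """Map values to evenly spaced ranks in [0, 1]; ties keep input order (a simplification)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for pos, i in enumerate(order):
        out[i] = pos / (n - 1) if n > 1 else 0.5
    return out


def group_neutralize(values, groups):
    """Subtract each group's mean so every group averages to zero."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical signal: the "fin" names are high in absolute terms only.
signal = [0.10, 0.30, 0.20, 0.90, 0.70]
sector = ["tech", "tech", "tech", "fin", "fin"]

neutral = group_neutralize(signal, sector)
ranked = rank(neutral)
```

After neutralization the "fin" names no longer dominate the ranking just because their sector average is higher.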
58
+ ---
59
+
60
+ ### 3. Time-Series Operations
61
+ **Purpose**: Capture trends, reversals, and smoothing across time
62
+
63
+ **Core Operators**:
64
+ - `ts_delta(x, n)`: n-period change (e.g., `ts_delta(close, 21)` for monthly momentum)
65
+ - `ts_sum(x, n)`: Rolling sum (e.g., `ts_sum(volume, 20)` for cumulative volume)
66
+ - `ts_mean(x, n)`: Simple moving average (e.g., `ts_mean(close, 50)` for trend)
67
+ - `ts_std(x, n)`: Rolling volatility (e.g., `ts_std(returns, 21)` for risk)
68
+ - `ts_rank(x, n)`: Percentile within lookback window (e.g., `ts_rank(close, 252)` for 52-week high proximity)
69
+ - `ts_decay_linear(x, n)`: Linear weighted average (recent data weighted higher)
70
+ - `ts_regression(y, x, n)`: Rolling beta/slope (e.g., `ts_regression(stock_ret, mkt_ret, 60)` for beta)
71
+
72
+ **Template Pattern for Momentum**:
73
+ ```
74
+ ts_delta([PRICE_FIELD], [WINDOW]) / ts_std([PRICE_FIELD], [WINDOW])
75
+ ```
76
+
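The momentum pattern above divides an n-period change by rolling volatility over the same window. A rough Python sketch for a single instrument follows; the population standard deviation and the `None` warm-up convention are assumptions, and the price series is made up.

```python
from statistics import pstdev


def ts_delta(series, n):
    """Change versus n periods earlier; None until enough history exists."""
    return [None if t < n else series[t] - series[t - n] for t in range(len(series))]


def ts_std(series, n):
    """Population std over the trailing n observations; None until the window fills."""
    return [
        None if t + 1 < n else pstdev(series[t + 1 - n : t + 1])
        for t in range(len(series))
    ]


def vol_scaled_momentum(series, n):
    """ts_delta / ts_std, leaving undefined or zero-volatility points as None."""
    deltas, stds = ts_delta(series, n), ts_std(series, n)
    return [
        None if d is None or s is None or s == 0 else d / s
        for d, s in zip(deltas, stds)
    ]


prices = [10.0, 11.0, 12.0, 11.0, 13.0, 15.0]  # hypothetical closes
momentum = vol_scaled_momentum(prices, 3)
```

Scaling by volatility makes a given price change count for less in a noisy name than in a quiet one.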
77
+ ---
78
+
79
+ ### 4. Conditional & Logic Operations
80
+ **Purpose**: Implement if-then rules and filters
81
+
82
+ **Core Operators**:
83
+ - `if_else(cond, x, y)`: Ternary operator (e.g., `if_else(volume > ts_mean(volume, 20), group_rank(returns, sector), 0)`)
84
+ - `filter(x, cond)`: Set to NaN where condition fails (e.g., `filter(momentum, market_cap > 1e9)`)
85
+ - Comparison: `>`, `<`, `==`, `!=`, `>=`, `<=`
86
+ - Logical: `&` (and), `|` (or), `~` (not)
87
+
88
+ **Template Pattern for Conditional Alpha**:
89
+ ```
90
+ if_else([CONDITION], group_rank([SIGNAL_A], [GROUP]), group_rank([SIGNAL_B], [GROUP]))
91
+ ```
92
+
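As a rough illustration of the conditional operators, the Python sketch below applies them element by element. The helper names mirror the prompt's operators, but the implementations and sample values are hypothetical; `filter_where` stands in for `filter` to avoid shadowing the Python builtin.

```python
import math


def if_else(cond, x, y):
    """Pick x where the condition holds, otherwise y, element by element."""
    return [a if c else b for c, a, b in zip(cond, x, y)]


def filter_where(x, cond):
    """Keep x where the condition holds; NaN elsewhere."""
    return [v if c else math.nan for v, c in zip(x, cond)]


# Hypothetical cross-section of three instruments.
volume = [120.0, 80.0, 200.0]
avg_volume = [100.0, 100.0, 100.0]
signal_a = [0.9, 0.4, 0.1]
signal_b = [0.2, 0.6, 0.5]

high_volume = [v > a for v, a in zip(volume, avg_volume)]
blended = if_else(high_volume, signal_a, signal_b)  # switch signals by regime
masked = filter_where(signal_a, high_volume)        # drop low-volume names
```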
93
+ ---
94
+
95
+ ### 5. Vector & Advanced Operations
96
+ **Purpose**: Complex transformations and multi-factor combinations
97
+
98
+ **Core Operators**:
99
+ - `power(x, p)`: Exponentiation (e.g., `power(momentum, 2)` for convexity)
100
+ - `log(x)`: Natural log for skewed distributions (e.g., `log(market_cap)`)
101
+ - `abs(x)`: Absolute value (e.g., `abs(analyst_revision)` for surprise magnitude)
102
+ - `signed_power(x, p)`: Preserve sign with power (e.g., `signed_power(returns, 0.5)` for dampened momentum)
103
+ - `correlation(x, y, n)`: Rolling correlation (e.g., `correlation(stock_ret, spy_ret, 60)` for market sensitivity)
104
+
105
+ ---
106
+
107
+ ## Dataset Intelligence (detailed steps for analyzing a dataset)
108
+
109
+ 1. **Dataset Understanding**
110
+ - Dataset description and characteristics
111
+ - Field inventory (count, types, update patterns)
112
+ - Key observations about data structure
113
+
114
+ 2. **Field Deconstruction Analysis**
115
+ - For each field: what it truly measures and why
116
+ - Logical relationships between fields
117
+ - "Story" the data tells
118
+
119
+ 3. **Feature Engineering Suggestions by Question Type**
120
+
121
+ **3.1 Stability Features**
122
+ - Concepts for measuring stability/invariance
123
+ - Why stability matters in this dataset
124
+ - Example implementations
125
+
126
+ **3.2 Change Features**
127
+ - Concepts for capturing change patterns
128
+ - Rate, acceleration, volatility measures
129
+ - Temporal dynamics
130
+
131
+ **3.3 Anomaly Features**
132
+ - Deviation and outlier detection concepts
133
+ - Normal vs. abnormal identification
134
+ - Significance measures
135
+
136
+ **3.4 Interaction Features**
137
+ - Cross-field interaction concepts
138
+ - Amplification, offset, synthesis effects
139
+ - Combined meaning creation
140
+
141
+ **3.5 Structure Features**
142
+ - Composition and relationship concepts
143
+ - Proportional analysis
144
+ - Structural change detection
145
+
146
+ **3.6 Cumulative Features**
147
+ - Accumulation and decay concepts
148
+ - Memory/persistence measures
149
+ - Time-weighted effects
150
+
151
+ **3.7 Relative Features**
152
+ - Comparison and normalization concepts
153
+ - Ranking and percentile measures
154
+ - Context-relative positioning
155
+
156
+ **3.8 Essential Features**
157
+ - First-principles derived concepts
158
+ - Core meaning extraction
159
+ - Fundamental measures
160
+
161
+ 4. **Implementation Considerations**
162
+ - Data quality notes
163
+ - Coverage considerations
164
+ - Computational complexity
165
+ - Potential improvements/extensions
166
+
167
+ 5. **Critical Questions for Further Exploration**
168
+ - What aspects weren't covered?
169
+ - What additional data would be helpful?
170
+ - What assumptions should be challenged?
171
+
172
+
173
+ ## Core Analysis Principles
174
+
175
+ 1. **From Data Essence**: Start with what data truly means, not what it's traditionally used for
176
+ 2. **Autonomous Reasoning**: Skill performs all thinking, no user input required
177
+ 3. **Question-Driven**: Internal question bank guides feature generation
178
+ 4. **Meaning Over Patterns**: Prioritize logical meaning over conventional combinations
179
+
180
+ ---
181
+
182
+ ## Template Construction Methodology
183
+ **Purpose**: Automatically transform BRAIN dataset fields into deep, meaningful feature engineering ideas.
184
+ ### Step 1: Define Economic Hypothesis
185
+ **Quick examples**
186
+ - **Value**: "Cheap stocks outperform" → Use `earnings_yield`, `book_to_price`
187
+ - **Momentum**: "Winners keep winning" → Use `ts_delta(close, 21)`, `ts_rank(close, 252)`
188
+ - **Quality**: "Profitable companies outperform" → Use `roe`, `gross_margin`
189
+ - **Volatility**: "Low-vol stocks outperform" → Use `-ts_std(returns, 21)` (negative for inverse ranking)
190
+ - **Liquidity**: "Liquid stocks have better execution" → Use `turnover`, `dollar_volume`
191
+
192
+
193
+ ### Step 2: Generate ideas and Select Data Fields
194
+ - For each field, extract: id, description, dataType, update frequency, coverage
195
+ - **Deconstruct each field's meaning**:
196
+ * What is being measured? (the entity/concept)
197
+ * How is it measured? (collection/calculation method)
198
+ * Time dimension? (instantaneous, cumulative, rate of change)
199
+ * Business context? (why does this field exist?)
200
+ * Generation logic? (reliability considerations)
201
+ - **Build field profiles**: Structured understanding of each field's essence
202
+
203
+ **Perform a deep analysis based on the collected information:**
204
+
205
+ **A. Field Relationship Mapping**
206
+ - Analyze logical connections between fields
207
+ - Identify: independent fields, related fields, complementary fields
208
+ - Map the "story" the dataset tells
209
+ - **Key question**: What relationships are implied by these fields?
210
+
211
+ **B. Question-Driven Feature Generation (Internal Process)**
212
+ The skill asks itself these questions and generates feature concepts:
213
+
214
+ 1. **"What is stable?"** → Look for invariants
215
+ - Which fields or combinations remain relatively constant?
216
+ - What stability measures make sense?
217
+
218
+ 2. **"What is changing?"** → Analyze change patterns
219
+ - Rate of change, acceleration, volatility
220
+ - Trend vs. noise separation
221
+
222
+ 3. **"What is anomalous?"** → Identify deviations
223
+ - Outliers, unusual patterns, breaks from normal
224
+ - Deviation magnitude and significance
225
+
226
+ 4. **"What is combined?"** → Examine interactions
227
+ - How fields interact, amplify, or offset each other
228
+ - Synthesis creates new meaning
229
+
230
+ 5. **"What is structural?"** → Study compositions
231
+ - Constituent parts, proportional relationships
232
+ - Structural changes over time
233
+
234
+ 6. **"What is cumulative?"** → Explore accumulation effects
235
+ - Building up over time, decay effects
236
+ - Memory and persistence in data
237
+
238
+ 7. **"What is relative?"** → Make comparisons
239
+ - Relative positioning, ranking, normalization
240
+ - Context within dataset
241
+
242
+ 8. **"What is essential?"** → Distill to core meaning
243
+ - First principles thinking
244
+ - Strip away assumptions, get to essence
245
+
246
+ **C. Feature Concept Generation**
247
+ For each relevant question-field combination:
248
+ - Formulate feature concept that answers the question
249
+ - Define the concept clearly
250
+ - Identify the logical meaning
251
+ - Consider directionality (what high/low values mean)
252
+ - Identify boundary conditions
253
+ - Note potential issues/limitations
254
+
255
+ ### Step 3: Apply Operator Pipeline to implement the idea
256
+ **Standard Pipeline**:
257
+ 1. **Clean**: `winsorize([RAW_DATA], std=3)` → Remove outliers. Note: consider other user-provided operators that suit the data field's characteristics for outlier handling.
257
+ 2. **Transform**: `group_zscore(...)` or `log(...)` → Normalize the distribution. Note: consider other user-provided operators that suit the data field's characteristics for transformation.
258
+ 3. **Rank**: `rank(...)` or `group_rank(..., [GROUP])` → Cross-sectional comparison. Note: consider other user-provided operators that suit the data field's characteristics for ranking.
259
+ 4. **Neutralize** (optional): `group_neutralize(..., sector)` or `regression_neut(..., mkt_beta)` → Remove unwanted exposures. Note: consider other user-provided operators that suit the data field's characteristics for neutralization.
260
+ 5. **Decay** (optional): `ts_decay_linear(..., 5)` → Smooth the signal and reduce turnover. Note: consider other user-provided operators that suit the data field's characteristics for decay.
262
+
263
+ **Example Pipeline**:
264
+ ```
265
+ ts_decay_linear(
266
+ group_rank(
267
+ group_neutralize(
268
+ group_zscore(winsorize(earnings_yield, std=3), sector),
269
+ sector
270
+ )
271
+ ,[grouping_field]),
272
+ 5
273
+ )
274
+ ```
275
+
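One way to assemble the example pipeline programmatically is to nest operator calls as strings. The Python sketch below follows the nesting order of the example expression (neutralize inside rank). `call` and `build_pipeline` are hypothetical helpers, and the output is plain expression text that is not validated against the platform.

```python
def call(op, *args, **kwargs):
    """Render one operator call, e.g. winsorize(x, std=3), as expression text."""
    parts = [str(a) for a in args] + [f"{k}={v}" for k, v in kwargs.items()]
    return f"{op}({', '.join(parts)})"


def build_pipeline(field, group, std=3, decay=5, neutralize=True):
    """Nest the clean, transform, neutralize, rank, and decay stages in order."""
    expr = call("winsorize", field, std=std)
    expr = call("group_zscore", expr, group)
    if neutralize:  # optional stage
        expr = call("group_neutralize", expr, group)
    expr = call("group_rank", expr, group)
    if decay:  # optional stage; pass decay=0 to skip
        expr = call("ts_decay_linear", expr, decay)
    return expr


expression = build_pipeline("earnings_yield", "sector")
print(expression)
```

Building the string stage by stage keeps parentheses balanced and makes the optional stages easy to switch off.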
276
+ ### Step 4: Define Parameter Slots for Search
277
+ Identify variables to optimize:
278
+ - **[WINDOW]**: Lookback period (e.g., 10, 20, 60, 120 days) `[WINDOW] ∈ {10, 20, 40, 60, 120}`
279
+ - **[DATA_FIELD]**: Alternative fields (e.g., `close`, `vwap`, `typical_price`)
280
+ - **[GROUP]**: Grouping variable (e.g., `sector`, `industry`, `country`) `[GROUP] ∈ {sector, industry, subindustry, country}`
281
+ - **[WINSORIZE_STD]**: Outlier threshold in standard deviations (e.g., 2, 3, 4) `[WINSORIZE_STD] ∈ [2, 4]`
282
+ - **[DECAY_WINDOW]**: Decay length (e.g., 3, 5, 10)
283
+
284
+ **Template with Slots**:
285
+ ```
286
+ group_rank(ts_delta([DATA_FIELD], [WINDOW]) / ts_std([DATA_FIELD], [WINDOW]), [GROUP])
287
+ ```
288
+
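One way to turn a slotted template into concrete candidate expressions is to take the Cartesian product of the slot values and substitute each combination into the template string. The sketch below assumes slots are written exactly as `[NAME]`; the `expand` helper and the reduced search space are illustrative, not part of any platform API.

```python
from itertools import product

TEMPLATE = (
    "group_rank(ts_delta([DATA_FIELD], [WINDOW]) / "
    "ts_std([DATA_FIELD], [WINDOW]), [GROUP])"
)

# Illustrative subset of the search space listed above.
SEARCH_SPACE = {
    "DATA_FIELD": ["close", "vwap"],
    "WINDOW": [10, 20, 60],
    "GROUP": ["sector", "industry"],
}

def expand(template, space):
    """Yield one expression per combination of slot values."""
    names = list(space)
    for combo in product(*(space[n] for n in names)):
        expr = template
        for name, value in zip(names, combo):
            # Replace every occurrence of the slot, e.g. "[WINDOW]" -> "10".
            expr = expr.replace(f"[{name}]", str(value))
        yield expr

candidates = list(expand(TEMPLATE, SEARCH_SPACE))
```

Because the number of candidates is the product of the slot sizes (2 × 3 × 2 = 12 here), it helps to keep each slot's value list short.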
289
+ ---
290
+
291
+ ## Response Format Standards
292
+
293
+ When generating an alpha template, structure your response as follows:
294
+
295
+ ### 1. Template Name
296
+ - Descriptive and concise (e.g., "Sector-Neutral Earnings Yield with Decay")
297
+
298
+ ### 2. Economic Rationale
299
+ - 2-3 sentences explaining the hypothesis (e.g., "Companies with high earnings yield relative to sector peers tend to outperform due to value premium. Sector neutralization removes industry tilts. Decay reduces turnover.")
300
+
301
+ ### 3. Base Expression
302
+ - Provide the core alpha formula with parameter slots clearly marked in `[BRACKETS]`
303
+
304
+ ### 4. Parameter Slots & Search Space
305
+ - List each variable with allowed values:
306
+ ```
307
+ [VALUE_FIELD] ∈ {earnings_yield, book_to_price, fcf_yield}
308
+ [GROUP] ∈ {sector, industry, country}
309
+ [DECAY_WINDOW] ∈ {3, 5, 10}
310
+ ```
311
+
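Since the search space is written in a regular `[SLOT] ∈ {a, b, c}` form, it can be read mechanically. The following sketch assumes one slot per line and comma-separated values; `parse_search_space` is a hypothetical helper, and every value is kept as a string.

```python
import re

# Matches lines such as "[GROUP] ∈ {sector, industry, country}".
SLOT_LINE = re.compile(r"^\[(\w+)\]\s*∈\s*\{(.*)\}$")

def parse_search_space(text):
    """Return {slot_name: [values...]} for every line in the set notation."""
    space = {}
    for line in text.strip().splitlines():
        m = SLOT_LINE.match(line.strip())
        if m is None:
            continue  # skip lines that are not in the "[SLOT] ∈ {...}" form
        name, values = m.groups()
        space[name] = [v.strip() for v in values.split(",") if v.strip()]
    return space

spec = """
[VALUE_FIELD] ∈ {earnings_yield, book_to_price, fcf_yield}
[GROUP] ∈ {sector, industry, country}
[DECAY_WINDOW] ∈ {3, 5, 10}
"""

space = parse_search_space(spec)

# The number of simulations needed for a full sweep is the product of slot sizes.
total = 1
for values in space.values():
    total *= len(values)
```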
312
+
313
+ ### 5. Expected Characteristics
314
+ - Turnover estimate (e.g., "20-40% daily turnover without decay")
315
+ - Correlation risk (e.g., "May correlate with value factor; check against production alphas")
316
+
317
+ ### 6. Variations & Extensions
318
+ - Suggest 2-3 alternative formulations:
319
+ - Replace `earnings_yield` with `fcf_yield`
320
+ - Add volatility weighting: `group_rank([SIGNAL] / ts_std(returns, 21), [GROUP])`
321
+ - Test across regions (USA vs EUR vs ASI)
322
+
323
+ ---
324
+
325
+ ## Best Practices Checklist
326
+
327
+ Before finalizing a template, verify:
328
+
329
+ - [ ] **Data Validity**: All fields exist for chosen `region` and `delay`
330
+ - [ ] **Outlier Handling**: Winsorize or clip extreme values
331
+ - [ ] **NaN Handling**: Use `fillna` or `filter` to avoid NaN propagation
332
+ - [ ] **Turnover Management**: Use appropriate operators to reduce or control trading costs; some operators can also hold turnover at a target level.
333
+ - [ ] **Economic Sense**: Can you explain why this alpha should work in 2-3 sentences to a portfolio manager?
334
+
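The checklist items above depend on platform data and judgment, but a couple of purely structural mistakes can be caught before any simulation. This sketch is a supplementary pre-check, not one of the checklist items: it assumes slots are written as `[NAME]` and only verifies balanced parentheses and that slots and search-space entries match. `check_template` is a hypothetical helper.

```python
import re

def check_template(template, space):
    """Return a list of structural problems; an empty list means none found."""
    problems = []

    # Parentheses must balance and never close before they open.
    depth = 0
    for ch in template:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                break
    if depth != 0:
        problems.append("unbalanced parentheses")

    # Every slot needs a search space, and every search space should be used.
    used = set(re.findall(r"\[(\w+)\]", template))
    for name in sorted(used - set(space)):
        problems.append(f"slot [{name}] has no search space")
    for name in sorted(set(space) - used):
        problems.append(f"search space [{name}] is unused")
    return problems

issues = check_template("group_rank([SIGNAL], [GROUP])", {"GROUP": ["sector"]})
```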
335
+ ---
336
+
337
+ ## Example Template Generation (Full Workflow)
338
+
339
+ **User Request**: "Create an alpha template for analyst estimate revisions with industry neutralization"
340
+
341
+ **Your Response**:
342
+
343
+ ### Template Name
344
+ **Analyst EPS Revision Momentum - Industry Neutral**
345
+
346
+ ### Economic Rationale
347
+ Stocks with positive analyst EPS revisions tend to outperform as analysts incorporate new information. Industry neutralization removes sector-wide trends (e.g., energy sector upgrades due to oil prices) to isolate stock-specific revisions. Short-term revisions (1-month) capture recent information flow.
348
+
349
+ ### Base Expression
350
+ ```
351
+ group_rank(group_neutralize(group_zscore([REVISION_FIELD], industry), industry), industry)
352
+ ```
353
+
354
+ ### Parameter Slots & Search Space
355
+ ```
356
+ [REVISION_FIELD] ∈ {eps_revision_1m, eps_revision_3m, revenue_revision_1m}
357
+ industry: Fixed (industry-level grouping)
358
+ ```
359
+
360
+ ### Optimization Recommendation
361
+ - **Method**: Exhaustive search (3 field choices × no other free parameters = 3 simulations)
362
+ - **Selection Criteria**: Choose field with highest Sharpe ratio and turnover < 50%
363
+
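The selection criteria can be expressed as a filter followed by a maximum. The sketch below assumes each simulation result is a dictionary with `sharpe` and `turnover` keys; that structure and the numbers are made up purely for illustration and are not real simulation output.

```python
def select_best(results, max_turnover=0.50):
    """Pick the highest-Sharpe result among those under the turnover cap."""
    eligible = [r for r in results if r["turnover"] < max_turnover]
    if not eligible:
        return None  # nothing satisfies the turnover constraint
    return max(eligible, key=lambda r: r["sharpe"])

# Placeholder values only; real figures would come from running the simulations.
results = [
    {"field": "eps_revision_1m", "sharpe": 1.4, "turnover": 0.55},
    {"field": "eps_revision_3m", "sharpe": 1.2, "turnover": 0.35},
    {"field": "revenue_revision_1m", "sharpe": 0.9, "turnover": 0.30},
]

best = select_best(results)
```

With these placeholder numbers the highest-Sharpe field is rejected by the turnover cap, so the second-best field is chosen. This shows why the filter has to run before the maximum.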
364
+
365
+ ### Expected Characteristics
366
+ - **Turnover**: 30-50% daily (revisions change frequently)
367
+ - **Correlation Risk**: May correlate with earnings momentum factor; verify against production
368
+
369
+ ### Variations & Extensions
370
+ 1. **Add Magnitude Weighting**: `group_rank(group_neutralize(group_zscore([REVISION_FIELD], industry) * abs(group_zscore([REVISION_FIELD], industry)), industry), industry)` → Give more weight to large revisions
371
+ 2. **Combine with Surprise**: `group_rank(group_zscore([REVISION_FIELD], industry) + group_zscore(eps_surprise, industry), industry)` → Blend forward-looking and backward-looking signals
372
+ 3. **Decay for Turnover**: `ts_decay_linear(group_rank(...), 5)` → Reduce trading costs
373
+
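The magnitude weighting in variation 1 multiplies each z-score by its own absolute value, which keeps the sign and stretches large values more than small ones. A small sketch with hypothetical inputs:

```python
def magnitude_weight(z):
    """Sign-preserving square: x * |x| amplifies large values, shrinks small ones."""
    return [x * abs(x) for x in z]

def order(xs):
    """Indices of xs sorted from smallest to largest value."""
    return sorted(range(len(xs)), key=lambda i: xs[i])

z = [0.5, -2.0, 2.0, 0.0, -0.5]      # hypothetical within-industry z-scores
weighted = magnitude_weight(z)        # [0.25, -4.0, 4.0, 0.0, -0.25]

same_order = order(z) == order(weighted)
```

One caveat follows from the arithmetic: `x * abs(x)` is strictly increasing, and demeaning within a group also preserves order, so a final `group_rank` over the same grouping gives the same ranks with or without the weighting. The weighting only changes the output when the last step is not a pure rank over those groups, for example when the weighted scores are used directly or combined with another signal before ranking.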
374
+ ---
375
+ Remember to innovate on the templates rather than simply picking ones that already exist: adapt them to the information provided, and think carefully before answering.
376
+ **End of System Prompt**