cnhkmcp 2.3.2__py3-none-any.whl → 2.3.3__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. cnhkmcp/__init__.py +1 -1
  2. cnhkmcp/untracked/AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221/BRAIN_AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221Mac_Linux/321/207/320/231/320/230/321/206/320/254/320/274.zip +0 -0
  3. cnhkmcp/untracked/AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221//321/205/320/237/320/234/321/205/320/227/342/225/227/321/205/320/276/320/231/321/210/320/263/320/225AI/321/206/320/231/320/243/321/205/342/225/226/320/265/321/204/342/225/221/342/225/221_Windows/321/207/320/231/320/230/321/206/320/254/320/274.exe +0 -0
  4. cnhkmcp/untracked/APP/trailSomeAlphas/ace.log +1 -0
  5. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/output_report/GLB_delay1_fundamental28_ideas.md +384 -0
  6. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/final_expressions.json +41 -0
  7. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874844124598400.json +7 -0
  8. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874844589448700.json +8 -0
  9. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874845048996700.json +8 -0
  10. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874845510819100.json +12 -0
  11. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874845978315000.json +10 -0
  12. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874846459411100.json +10 -0
  13. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874846924915700.json +8 -0
  14. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874847399137200.json +8 -0
  15. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874847858960800.json +10 -0
  16. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874848327921300.json +8 -0
  17. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874848810818000.json +8 -0
  18. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874849327754300.json +7 -0
  19. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874849795807500.json +8 -0
  20. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874850272279500.json +8 -0
  21. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874850757124200.json +7 -0
  22. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_1_idea_1769874851224506800.json +8 -0
  23. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/data/fundamental28_GLB_delay1/fundamental28_GLB_delay1.csv +930 -0
  24. cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-feature-implementation/scripts/ace.log +1 -0
  25. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/.gitignore +14 -0
  26. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/SKILL.md +76 -0
  27. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/ace.log +0 -0
  28. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/ace_lib.py +1512 -0
  29. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/config.json +6 -0
  30. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/fundamental28_GLB_1_idea_1769874845978315000.json +10 -0
  31. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/helpful_functions.py +180 -0
  32. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/scripts/__init__.py +0 -0
  33. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/scripts/build_alpha_list.py +86 -0
  34. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/scripts/fetch_sim_options.py +51 -0
  35. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/scripts/load_credentials.py +93 -0
  36. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/scripts/parse_idea_file.py +85 -0
  37. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/scripts/process_template.py +80 -0
  38. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/scripts/resolve_settings.py +94 -0
  39. cnhkmcp/untracked/skills/brain-inspectTemplate-create-Setting/sim_options_snapshot.json +414 -0
  40. {cnhkmcp-2.3.2.dist-info → cnhkmcp-2.3.3.dist-info}/METADATA +1 -1
  41. {cnhkmcp-2.3.2.dist-info → cnhkmcp-2.3.3.dist-info}/RECORD +45 -11
  42. {cnhkmcp-2.3.2.dist-info → cnhkmcp-2.3.3.dist-info}/WHEEL +0 -0
  43. {cnhkmcp-2.3.2.dist-info → cnhkmcp-2.3.3.dist-info}/entry_points.txt +0 -0
  44. {cnhkmcp-2.3.2.dist-info → cnhkmcp-2.3.3.dist-info}/licenses/LICENSE +0 -0
  45. {cnhkmcp-2.3.2.dist-info → cnhkmcp-2.3.3.dist-info}/top_level.txt +0 -0
cnhkmcp/__init__.py CHANGED
@@ -50,7 +50,7 @@ from .untracked.forum_functions import (
50
50
  read_full_forum_post
51
51
  )
52
52
 
53
- __version__ = "2.3.2"
53
+ __version__ = "2.3.3"
54
54
  __author__ = "CNHK"
55
55
  __email__ = "cnhk@example.com"
56
56
 
cnhkmcp/untracked/APP/trailSomeAlphas/ace.log CHANGED
@@ -64,3 +64,4 @@ Incorrect email or password
64
64
  2026-01-30 01:23:21,421 - ace - ERROR -
65
65
  Incorrect email or password
66
66
 
67
+ 2026-01-31 23:46:18,147 - ace - WARNING - No fields found: region=GLB, delay=1, universe=TOPDIV3000, type=VECTOR, dataset.id=fundamental28
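cnhkmcp/untracked/APP/trailSomeAlphas/skills/brain-data-feature-engineering/output_report/GLB_delay1_fundamental28_ideas.md ADDED
@@ -0,0 +1,384 @@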
1
+ **Dataset**: fundamental28
2
+ **Region**: GLB
3
+ **Delay**: 1
4
+
5
+ # Global Fundamental Data Feature Engineering Analysis Report
6
+
7
+ **Dataset**: fundamental28
8
+ **Category**: Fundamental
9
+ **Region**: GLB
10
+ **Analysis Date**: 2024-01-15
11
+ **Fields Analyzed**: 929
12
+
13
+ ---
14
+
15
+ ## Executive Summary
16
+
17
+ **Primary Question Answered by Dataset**: How do fundamental financial characteristics—spanning profitability, growth, capital structure, and cash flow quality—drive relative valuation and risk assessment across global equities?
18
+
19
+ **Key Insights from Analysis**:
20
+ - The dataset provides a comprehensive view of value creation by combining quarterly operational metrics (coverage ratios, margins) with annual growth rates and long-term averages
21
+ - Significant opportunity exists in combining growth metrics (e.g., equity growth) with stability indicators (e.g., fixed charge coverage) to distinguish sustainable expansion from leveraged speculation
22
+ - Cash flow data includes non-operational noise (FX effects) that can be purified to reveal core operational performance
23
+ - The mix of quarterly (q) and annual (a) frequencies requires temporal alignment strategies to avoid look-ahead bias
24
+
25
+ **Critical Field Relationships Identified**:
26
+ - `value_02300q` (Total Assets) serves as the scaling denominator for `value_03501a` (Common Equity) and `value_04001q` (Net Income), forming the equity-ratio and ROA backbone
27
+ - `value_08251q` (Fixed Charge Coverage) mediates between earnings power (`value_18191q`) and financial risk (`value_03051q`)
28
+ - `cfsourceusea_value_04840a` (FX Effect) provides orthogonal information to operational cash flows, enabling noise reduction
29
+
30
+ **Most Promising Feature Concepts**:
31
+ 1. **Sustainable Growth Score** - because it combines growth magnitude with coverage quality, filtering out leveraged growth stories
32
+ 2. **Operating Persistence** - because autocorrelation of margins reveals competitive advantage durability beyond current profitability
33
+ 3. **FX-Purified Cash** - because removing translation effects reveals true operational cash generation capacity
34
+
35
+ ---
36
+
37
+ ## Dataset Deep Understanding
38
+
39
+ ### Dataset Description
40
+ This is a global fundamental dataset providing detailed annual and quarterly values for various items from financial statements. It has good content quality and extensive coverage, and includes more than 1,500 data fields. Apart from financial statement content, it also provides per-share data, calculated ratios, pricing, and other textual information. The dataset captures the full accounting equation (Assets = Liabilities + Equity) alongside flow measures (Income, Cash Flow) and derived growth metrics.
41
+
42
+ ### Field Inventory
43
+ | Field ID | Description | Data Type | Update Frequency | Coverage |
44
+ |----------|-------------|-----------|------------------|----------|
45
+ | `value_08579` | Market Capitalization Growth (year ago) | Numeric | Annual | 85% |
46
+ | `value_08251q` | Fixed Charge Coverage Ratio | Numeric | Quarterly | 78% |
47
+ | `value_02300q` | Total Assets - As Reported | Numeric | Quarterly | 95% |
48
+ | `growthratesa_value_08816a` | Earnings Per Share - Fiscal - 1 Yr Annual Growth | Numeric | Annual | 82% |
49
+ | `cfsourceusea_value_04840a` | Effect of Exchange Rate on Cash | Numeric | Annual | 65% |
50
+ | `value_04001q` | Net Income/Starting Line | Numeric | Quarterly | 94% |
51
+ | `value_08316q` | Operating Profit Margin | Numeric | Quarterly | 88% |
52
+ | `value_18191q` | Earnings before Interest and Taxes (EBIT) | Numeric | Quarterly | 89% |
53
+ | `statisticsa_value_05260a` | Earnings Per Share - 5 Yr Avg | Numeric | Annual | 80% |
54
+ | `growthratesa_value_08616a` | Equity Growth (year ago) | Numeric | Annual | 84% |
55
+ | `value_03501a` | Common Equity | Numeric | Annual | 96% |
56
+ | `value_03051q` | Short Term Debt & Current Portion of Long Term Debt | Numeric | Quarterly | 92% |
57
+ | `value_03999q` | Total Liabilities & Shareholders' Equity | Numeric | Quarterly | 95% |
58
+ | `value_08301q` | Return on Equity Total (%) | Numeric | Quarterly | 87% |
59
+
60
+ *(Additional fields analyzed but not listed)*
61
+
62
+ ### Field Deconstruction Analysis
63
+
64
+ #### `value_08579`: Market Capitalization Growth (year ago)
65
+ - **What is being measured?**: Year-over-year percentage change in market capitalization, capturing investor revaluation and share issuance/buyback effects
66
+ - **How is it measured?**: Calculated as (Current Market Cap / Market Cap 1 year ago) - 1, using point-in-time market data
67
+ - **Time dimension**: Annual comparison with 1-year lookback (point-in-time relative change)
68
+ - **Business context**: Reflects market sentiment shifts, growth expectations, and capital structure changes (dilution/concentration)
69
+ - **Generation logic**: Derived from market price and shares outstanding; susceptible to volatility and non-fundamental factors
70
+ - **Reliability considerations**: High values may reflect small-cap illiquidity or merger events rather than organic growth; check for outliers
71
+
72
+ #### `value_08251q`: Fixed Charge Coverage Ratio
73
+ - **What is being measured?**: Ability to cover fixed financial charges (interest, lease payments) from earnings
74
+ - **How is it measured?**: Ratio of earnings before fixed charges and taxes to fixed charges
75
+ - **Time dimension**: Quarterly snapshot based on trailing 12-month or quarter-specific earnings
76
+ - **Business context**: Critical credit risk indicator; used by lenders to assess debt servicing capacity
77
+ - **Generation logic**: Standardized calculation across companies, but definitions of "fixed charges" may vary by industry (e.g., airlines vs tech)
78
+ - **Reliability considerations**: Highly cyclical industries show volatile coverage; single-quarter spikes may not indicate sustained improvement
79
+
80
+ #### `value_02300q`: Total Assets - As Reported
81
+ - **What is being measured?**: Total economic resources controlled by the entity (balance sheet size)
82
+ - **How is it measured?**: Sum of current and non-current assets as reported in quarterly filings
83
+ - **Time dimension**: Quarterly balance sheet snapshot (point-in-time stock measure)
84
+ - **Business context**: Scale indicator; base for calculating efficiency ratios (ROA, asset turnover)
85
+ - **Generation logic**: Accounting-based; includes goodwill, intangibles, and write-downs that may not reflect economic reality
86
+ - **Reliability considerations**: Subject to accounting policy choices (depreciation methods, inventory valuation); acquisitions cause step changes
87
+
88
+ #### `growthratesa_value_08816a`: EPS Fiscal 1 Yr Annual Growth
89
+ - **What is being measured?**: Momentum in earnings per share over fiscal year periods
90
+ - **How is it measured?**: Percentage change in fully diluted EPS from fiscal year t-1 to t
91
+ - **Time dimension**: Annual growth rate (flow change measure)
92
+ - **Business context**: Key metric for growth investors; drives PEG ratios and momentum strategies
93
+ - **Generation logic**: Dependent on share count methodology (diluted vs basic) and extraordinary item treatment
94
+ - **Reliability considerations**: Extreme values when base year EPS near zero; does not distinguish quality of earnings (cash vs accrual)
95
+
96
+ #### `cfsourceusea_value_04840a`: Effect of Exchange Rate on Cash
97
+ - **What is being measured?**: Non-operational cash flow impact from currency translation on foreign operations
98
+ - **How is it measured?**: Translation adjustment captured in cash flow statement reconciliation
99
+ - **Time dimension**: Annual or cumulative period measure (depends on reporting frequency)
100
+ - **Business context**: Captures translational risk (not transactional); indicates exposure to currency volatility
101
+ - **Generation logic**: Accounting translation difference between functional and reporting currency; non-cash in nature but affects cash position
102
+ - **Reliability considerations**: Can mask true operational performance; large values indicate significant international exposure or currency volatility
103
+
104
+ ### Field Relationship Mapping
105
+
106
+ **The Story This Data Tells**:
107
+ The dataset narrates the enterprise value creation process: starting with asset bases (`value_02300q`) financed by equity (`value_03501a`) and debt (`value_03051q`), generating returns measured by earnings (`value_18191q`, `value_04001q`) and margins (`value_08316q`), growing over time (`growthratesa_value_08816a`, `growthratesa_value_08616a`), while managing financial obligations (`value_08251q`) and external shocks (`cfsourceusea_value_04840a`). The market's assessment of this story is reflected in valuation changes (`value_08579`).
108
+
109
+ **Key Relationships Identified**:
110
+ 1. **Scale vs Efficiency**: `value_02300q` (Assets) provides the denominator for `value_04001q` (Income) and `value_03501a` (Equity), creating ROA and equity-to-assets metrics that measure efficiency and leverage independent of size
111
+ 2. **Growth vs Safety**: `growthratesa_value_08616a` (Equity Growth) and `value_08251q` (Coverage) interact to determine whether growth is fueled by retained earnings (sustainable) or debt (risky)
112
+ 3. **Accounting vs Cash**: `value_04001q` (Net Income start line) and `cfsourceusea_value_04840a` (FX Effect) represent different cash flow qualities—operational vs non-operational
113
+ 4. **Short-term vs Long-term**: `value_08316q` (Quarterly Margin) vs `statisticsa_value_05260a` (5Yr EPS Avg) captures current performance against historical baseline
114
+
115
+ **Missing Pieces That Would Complete the Picture**:
116
+ - Industry classification codes to enable sector-relative comparisons (e.g., tech vs utility coverage ratios differ)
117
+ - Price data to combine fundamentals with valuation multiples (P/E, P/B)
118
+ - Insider ownership data to assess alignment between management and shareholders regarding equity growth decisions
119
+
120
+ ---
121
+
122
+ ## Feature Concepts by Question Type
123
+
124
+ ### Q1: "What is stable?" (Invariance Features)
125
+
126
+ **Concept**: Coverage Stability Score
127
+ - **Sample Fields Used**: `value_08251q`
128
+ - **Definition**: Standard deviation of fixed charge coverage ratio over 20 days to identify companies with predictable debt servicing capacity
129
+ - **Why This Feature**: Stable coverage indicates predictable cash generation and disciplined capital structure management, reducing refinancing risk
130
+ - **Logical Meaning**: Measures the volatility of the safety margin for fixed obligations; low volatility suggests business model stability
131
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. For coverage ratios, NaN often indicates missing data rather than meaningful absence, so ts_backfill may be appropriate for short gaps.
132
+ - **Directionality**: Lower values indicate more stable coverage (positive for credit quality)
133
+ - **Boundary Conditions**: Values near 0 indicate constant coverage; extremely high values indicate earnings volatility or near-zero denominators
134
+ - **Implementation Example**: `ts_std_dev({value_08251q}, 20)`
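A minimal Python sketch of what the `ts_std_dev(..., 20)` expression above measures. It assumes the operator takes a trailing-window standard deviation and that quarterly values are carried forward daily between reports; both assumptions, and the helper name `rolling_std`, are illustrative and not taken from the platform documentation.

```python
from statistics import pstdev


def rolling_std(values, window):
    """Trailing population std over `window` observations (None until enough data)."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(pstdev(values[i + 1 - window : i + 1]))
    return out


# Assumed layout: a quarterly ratio repeated daily until the next report.
coverage = [4.0] * 5 + [6.0] * 5
stability = rolling_std(coverage, 4)
```

Under these assumptions the score is zero while the window sits inside one reporting period and is non-zero only while the window straddles a report date, which is worth keeping in mind when choosing the window length for quarterly fields.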
135
+
136
+ **Concept**: Asset Growth Consistency
137
+ - **Sample Fields Used**: `value_02300q`
138
+ - **Definition**: Standard deviation of year-over-year asset changes measured over 63 days (quarterly window)
139
+ - **Why This Feature**: Distinguishes between steady organic expansion and lumpy acquisition-driven growth or asset sales
140
+ - **Logical Meaning**: Captures the volatility of the company's investment policy; consistent growth suggests predictable capital allocation
141
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Asset values are typically reported quarterly; interpolation between quarters may introduce false stability.
142
+ - **Directionality**: Lower values indicate more stable asset base evolution (typically positive for forecasting)
143
+ - **Boundary Conditions**: Zero indicates no asset changes; spikes indicate M&A activity or write-downs
144
+ - **Implementation Example**: `ts_std_dev(ts_delta({value_02300q}, 252), 63)`
145
+
146
+ ---
147
+
148
+ ### Q2: "What is changing?" (Dynamics Features)
149
+
150
+ **Concept**: Earnings Growth Acceleration
151
+ - **Sample Fields Used**: `growthratesa_value_08816a`
152
+ - **Definition**: Change in annual EPS growth rate over a 63-day window to capture inflection points in momentum
153
+ - **Why This Feature**: Markets price changes in growth rates, not just growth levels; acceleration signals improving business trends
154
+ - **Logical Meaning**: Second derivative of earnings; positive values indicate growth is speeding up (positive momentum)
155
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Annual growth rates update infrequently; filled values are stale and can make old information look current.
156
+ - **Directionality**: Positive values indicate accelerating growth (bullish); negative indicates deceleration
157
+ - **Boundary Conditions**: Extreme values occur near earnings turning points (negative to positive growth)
158
+ - **Implementation Example**: `ts_delta({growthratesa_value_08816a}, 63)`
159
+
160
+ **Concept**: Operating Margin Momentum
161
+ - **Sample Fields Used**: `value_08316q`
162
+ - **Definition**: Recent change in operating margin normalized by the 1-year average margin level
163
+ - **Why This Feature**: Identifies operational inflections (expansion/contraction) relative to the company's historical norm
164
+ - **Logical Meaning**: Normalized velocity of profitability changes; indicates pricing power or cost control shifts
165
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Quarterly reporting gaps should not be filled, to avoid assuming constant margins.
166
+ - **Directionality**: Positive values indicate margin expansion (operational improvement)
167
+ - **Boundary Conditions**: Values near zero indicate stable margins; spikes indicate one-time items or structural changes
168
+ - **Implementation Example**: `divide(ts_delta({value_08316q}, 63), ts_mean({value_08316q}, 252))`
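The normalized-change expression above can be sketched in plain Python as follows. The helper names mirror the operators but are local stand-ins; the semantics (`ts_delta` as current value minus the value `d` steps back, `ts_mean` as a trailing average) and the `eps` guard against a near-zero average are assumptions made for illustration.

```python
def ts_delta(values, d):
    """Current value minus the value d steps earlier (None until available)."""
    return [None if i < d else values[i] - values[i - d] for i in range(len(values))]


def ts_mean(values, d):
    """Trailing mean over d observations (None until the window is full)."""
    return [
        None if i + 1 < d else sum(values[i + 1 - d : i + 1]) / d
        for i in range(len(values))
    ]


def margin_momentum(margin, delta_days, mean_days, eps=1e-9):
    """Change in margin divided by its trailing average level."""
    out = []
    for change, level in zip(ts_delta(margin, delta_days), ts_mean(margin, mean_days)):
        if change is None or level is None or abs(level) < eps:
            out.append(None)  # undefined without history or with a ~zero average
        else:
            out.append(change / level)
    return out


momentum = margin_momentum([10.0, 10.0, 12.0, 12.0], delta_days=2, mean_days=4)
```

Note that a negative average margin flips the sign of the ratio, so a margin improvement at a loss-making company would read as negative momentum unless the denominator is handled separately.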
169
+
170
+ ---
171
+
172
+ ### Q3: "What is anomalous?" (Deviation Features)
173
+
174
+ **Concept**: EBIT Z-Score Deviation
175
+ - **Sample Fields Used**: `value_18191q`
176
+ - **Definition**: Standardized deviation of current EBIT from its 1-year historical mean
177
+ - **Why This Feature**: Identifies earnings surprises or shocks that deviate significantly from the company's normal operating range
178
+ - **Logical Meaning**: Statistical measure of earnings unusualness; extreme values suggest non-recurring items or inflection points
179
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. NaN handling should preserve the distinction between missing data and zero earnings.
180
+ - **Directionality**: High absolute values indicate anomalies (potential mean reversion candidates)
181
+ - **Boundary Conditions**: Values beyond 2-3 standard deviations indicate significant outliers
182
+ - **Implementation Example**: `divide(subtract({value_18191q}, ts_mean({value_18191q}, 252)), ts_std_dev({value_18191q}, 252))`
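As a rough illustration of the z-score expression above, the sketch below standardizes the latest value against a trailing window. It uses the population standard deviation and returns `None` for a flat window; both choices are assumptions, since the report does not say how the platform operators treat them.

```python
from statistics import mean, pstdev


def rolling_zscore(values, window):
    """(latest value - trailing mean) / trailing std, per observation."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
            continue
        recent = values[i + 1 - window : i + 1]
        spread = pstdev(recent)
        # A flat window has zero spread, so the z-score is undefined.
        out.append(None if spread == 0 else (values[i] - mean(recent)) / spread)
    return out


ebit = [100, 100, 100, 100, 140]
z = rolling_zscore(ebit, 5)  # last window: mean 108, std 16, so z = 2.0
```

Because the current observation is part of its own window, a single jump is dampened: here a 40% rise in EBIT yields a z-score of only 2.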
183
+
184
+ **Concept**: FX Impact Anomaly
185
+ - **Sample Fields Used**: `cfsourceusea_value_04840a`
186
+ - **Definition**: Magnitude of current FX effect relative to historical average absolute impact
187
+ - **Why This Feature**: Flags unusual currency translation effects that may distort underlying operational performance
188
+ - **Logical Meaning**: Identifies when currency headwinds/tailwinds are unusually severe compared to the company's historical FX exposure
189
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. FX effects are often zero for domestic companies; the NaN vs zero distinction matters for identifying international exposure.
190
+ - **Directionality**: High values indicate unusual FX impact (may require operational adjustment)
191
+ - **Boundary Conditions**: Values near 1 indicate normal FX impact; high values indicate currency crises or extreme rate movements
192
+ - **Implementation Example**: `divide(abs({cfsourceusea_value_04840a}), ts_mean(abs({cfsourceusea_value_04840a}), 252))`
193
+
194
+ ---
195
+
196
+ ### Q4: "What is combined?" (Interaction Features)
197
+
198
+ **Concept**: Sustainable Growth Quality
199
+ - **Sample Fields Used**: `growthratesa_value_08616a`, `value_08251q`
200
+ - **Definition**: Product of equity growth rate and fixed charge coverage ratio
201
+ - **Why This Feature**: High growth with low coverage suggests leveraged, risky expansion; high coverage supports sustainable growth
202
+ - **Logical Meaning**: Quality-adjusted growth metric; scales growth magnitude by financial stability
203
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Different frequencies (annual growth vs quarterly coverage) require alignment; do not fill across frequency mismatches.
204
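+ - **Directionality**: Higher values indicate high growth with strong coverage (optimal); negative values arise when exactly one input is negative (shrinking equity or negative coverage), and the product turns misleadingly positive when both are negative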
205
+ - **Boundary Conditions**: Very large coverage ratios (near-zero fixed charges) combined with high growth create extreme values; winsorization recommended
206
+ - **Implementation Example**: `multiply({growthratesa_value_08616a}, {value_08251q})`
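A small sketch of how the plain product behaves across sign combinations, which matters when reading this feature's direction. The input numbers are made up for illustration, and `clip` is a simplified stand-in for winsorization (real winsorization usually derives its bounds from cross-sectional percentiles rather than fixed limits).

```python
def clip(value, lower, upper):
    """Clamp a score to fixed bounds (simplified stand-in for winsorization)."""
    return max(lower, min(upper, value))


# (equity growth, fixed charge coverage); illustrative values only.
cases = {
    "growing, well covered": (0.10, 5.0),
    "growing, negative coverage": (0.10, -2.0),
    "shrinking, well covered": (-0.10, 5.0),
    "shrinking, negative coverage": (-0.10, -2.0),
}

# Same arithmetic as multiply(growth, coverage) in the expression above.
scores = {name: growth * coverage for name, (growth, coverage) in cases.items()}

# An extreme coverage ratio dominates the product unless it is bounded.
bounded = clip(0.10 * 400.0, -5.0, 5.0)
```

The last case is the one to watch: shrinking equity combined with negative coverage produces a positive score, so the raw product cannot be read as "higher is better" without first filtering or signing the inputs.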
207
+
208
+ **Concept**: Cash-to-Assets Efficiency
209
+ - **Sample Fields Used**: `value_04001q`, `value_02300q`
210
+ - **Definition**: Ratio of net income starting line to total assets (ROA proxy using cash flow statement starting point)
211
+ - **Why This Feature**: Measures fundamental asset efficiency independent of accrual accounting adjustments
212
+ - **Logical Meaning**: Return on assets; how effectively the company converts its asset base into earnings
213
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Asset values are quarterly; income is flow-based. Ensure both are available for the same period.
214
+ - **Directionality**: Higher values indicate more efficient asset utilization (positive for returns)
215
+ - **Boundary Conditions**: Capital-intensive industries naturally have lower values; financials have different asset definitions
216
+ - **Implementation Example**: `divide({value_04001q}, {value_02300q})`
217
+
218
+ ---
219
+
220
+ ### Q5: "What is structural?" (Composition Features)
221
+
222
+ **Concept**: Equity Capital Structure Ratio
223
+ - **Sample Fields Used**: `value_03501a`, `value_02300q`
224
+ - **Definition**: Common equity as a proportion of total assets (Equity/Assets ratio)
225
+ - **Why This Feature**: Measures financial leverage and capital structure conservatism; higher equity indicates lower leverage risk
226
+ - **Logical Meaning**: Ownership cushion against asset value declines; inverse of leverage ratio
227
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Annual equity vs quarterly assets creates a frequency mismatch; do not interpolate annual data to quarterly.
228
+ - **Directionality**: Higher values indicate less leveraged, more conservative capital structure (typically lower risk)
229
+ - **Boundary Conditions**: Values near 1 indicate almost no liabilities; values near 0 indicate high leverage; negative values indicate negative equity
230
+ - **Implementation Example**: `divide({value_03501a}, {value_02300q})`
231
+
232
+ **Concept**: Short-Term Liquidity Exposure
233
+ - **Sample Fields Used**: `value_03051q`, `value_03999q`
234
+ - **Definition**: Short-term debt as a proportion of total liabilities and shareholders' equity
235
+ - **Why This Feature**: Captures refinancing risk and liquidity pressure; high values indicate near-term obligations
236
+ - **Logical Meaning**: Maturity structure of liabilities; indicates reliance on short-term funding vs long-term capital
237
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Zero short-term debt is meaningful (long-term only financing); distinguish it from missing data.
238
+ - **Directionality**: Higher values indicate greater near-term refinancing risk (negative for stability)
239
+ - **Boundary Conditions**: Values approaching 1 indicate that short-term debt funds almost the entire balance sheet; zero indicates no current maturities
240
+ - **Implementation Example**: `divide({value_03051q}, {value_03999q})`
241
+
242
+ ---
243
+
244
+ ### Q6: "What is cumulative?" (Accumulation Features)
245
+
246
+ **Concept**: Annual Earnings Accumulation
247
+ - **Sample Fields Used**: `value_04001q`
248
+ - **Definition**: Rolling 252-day (1-year) sum of net income starting line
249
+ - **Why This Feature**: Captures cumulative earnings power over a fiscal period, smoothing quarterly volatility
250
+ - **Logical Meaning**: Trailing twelve-month earnings proxy using cash flow statement starting point; measures sustained profitability
251
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. Summing over time requires handling missing quarters; gaps should not be filled, to avoid overstating cumulative earnings.
252
+ - **Directionality**: Higher values indicate stronger cumulative earnings performance (positive)
253
+ - **Boundary Conditions**: Negative values indicate cumulative losses; sharp changes indicate earnings inflections
254
+ - **Implementation Example**: `ts_sum({value_04001q}, 252)`
255
+
256
+ **Concept**: Cumulative FX Drag
257
+ - **Sample Fields Used**: `cfsourceusea_value_04840a`
258
+ - **Definition**: Rolling 63-day (quarterly) sum of FX effects on cash
259
+ - **Why This Feature**: Distinguishes persistent currency headwinds from one-time translation adjustments
260
+ - **Logical Meaning**: Sustained currency impact over a reporting period; indicates structural FX exposure
261
+ - **is filling nan necessary**: Operators such as ts_backfill() or group_mean() can fill NaN values, but a NaN that carries meaning should not be filled blindly, because doing so may introduce bias; first decide whether NaN is meaningful in this scenario. A cumulative sum near zero over time suggests natural hedging; filling NaNs with zero may obscure this.
262
+ - **Directionality**: Negative values indicate cumulative FX headwinds (reducing cash); positive indicates tailwinds
263
+ - **Boundary Conditions**: Large negative sums indicate sustained currency depreciation impact on foreign operations
264
+ - **Implementation Example**: `ts_sum({cfsourceusea_value_04840a}, 63)`
265
+
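The two cumulative features above depend on how missing observations are treated inside the rolling window. The following is a minimal pure-Python sketch, not the platform's `ts_sum` operator: the function name is reused only for readability, and the NaN behaviour (skip missing values, return NaN when the whole window is missing) is an assumption chosen to match the notes above.

```python
import math

def ts_sum(values, window):
    """Trailing sum over `window` observations.

    Assumption: NaN inputs are skipped rather than filled, and a window
    with no valid observation yields NaN instead of 0.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        valid = [v for v in values[start:i + 1] if not math.isnan(v)]
        # An all-missing window stays NaN so it is not confused with a true zero sum.
        out.append(sum(valid) if valid else math.nan)
    return out

nan = math.nan
fx_effect = [nan, nan, -2.0, -1.0, nan, 3.0]  # hypothetical FX-effect series
rolling = ts_sum(fx_effect, 3)
```

Keeping the all-missing case as NaN preserves the distinction drawn above: a cumulative sum of exactly zero (offsetting effects) remains distinguishable from "no data".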
266
+ ---
267
+
268
+ ### Q7: "What is relative?" (Comparison Features)
269
+
270
+ **Concept**: ROE Cross-Sectional Percentile
271
+ - **Sample Fields Used**: `value_08301q`
272
+ - **Definition**: Gaussian-quantile rank of Return on Equity within the cross-sectional universe
273
+ - **Why This Feature**: Relative profitability positioning independent of market-wide ROE shifts; identifies top-tier operators
274
+ - **Logical Meaning**: Standardized position within the profit distribution; robust to inflation/period effects that raise all boats
275
+ - **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Quantile calculation requires complete cross-section; NaN values should be excluded from ranking, not filled.
276
+ - **Directionality**: Higher values indicate top-quartile profitability relative to peers (positive for selection)
277
+ - **Boundary Conditions**: Gaussian transformation caps extreme tails; values beyond +/- 2 sigma are rare
278
+ - **Implementation Example**: `quantile({value_08301q}, driver="gaussian")`
279
+
280
+ **Concept**: Coverage Neutralized for Size
281
+ - **Sample Fields Used**: `value_08251q`, `value_02300q`
282
+ - **Definition**: Residual of fixed charge coverage after regressing on total assets (size)
283
+ - **Why This Feature**: Distinguishes coverage due to operational efficiency from coverage due to scale economies or diversification
284
+ - **Logical Meaning**: Coverage ratio independent of company size; identifies efficiently managed small caps vs bloated large caps
285
+ - **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Regression requires paired observations; missing either variable should result in NaN residual.
286
+ - **Directionality**: Positive residuals indicate better coverage than size predicts (operational alpha)
287
+ - **Boundary Conditions**: Extreme residuals indicate outliers in coverage-to-size relationship (niche business models)
288
+ - **Implementation Example**: `regression_neut({value_08251q}, {value_02300q})`
289
+
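The Gaussian-quantile idea in the ROE concept can be sketched as a rank transform followed by the inverse normal CDF. This is an illustration only: the exact formula behind `quantile(..., driver="gaussian")` is not given in this report, so the `rank / (n + 1)` convention and the helper name `gaussian_quantile` are assumptions.

```python
import math
from statistics import NormalDist

def gaussian_quantile(cross_section):
    """Map each valid value to a standard-normal score based on its rank.

    NaN entries are excluded from the ranking and stay NaN in the output.
    Ties are broken by position (a simplification for this sketch).
    """
    valid = sorted((v, i) for i, v in enumerate(cross_section) if not math.isnan(v))
    n = len(valid)
    out = [math.nan] * len(cross_section)
    for rank, (_, i) in enumerate(valid, start=1):
        # rank / (n + 1) keeps the probability strictly inside (0, 1).
        out[i] = NormalDist().inv_cdf(rank / (n + 1))
    return out

roe = [0.12, math.nan, 0.30, -0.05, 0.18]  # hypothetical cross-section of ROE
scores = gaussian_quantile(roe)
```

Because only ranks enter the transform, a uniform shift in ROE across the universe leaves the scores unchanged, which is the "independent of market-wide ROE shifts" property described above.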
290
+ ---
291
+
292
+ ### Q8: "What is essential?" (Essence Features)
293
+
294
+ **Concept**: Operating Margin Persistence
295
+ - **Sample Fields Used**: `value_08316q`
296
+ - **Definition**: Correlation between current operating margin and margin 252 days (1 year) prior, measured over 504 days (2 years)
297
+ - **Why This Feature**: Measures the durability of competitive advantages; persistent margins indicate moats, volatile margins indicate commodity exposure
298
+ - **Logical Meaning**: Autocorrelation of profitability; high values suggest structural industry position, low values suggest cyclical or competitive pressure
299
+ - **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Correlation requires aligned time series; filling gaps creates spurious persistence.
300
+ - **Directionality**: Higher values indicate persistent margins (quality); low values indicate unstable margins (risk)
301
+ - **Boundary Conditions**: Values near 1 indicate highly predictable margins; near 0 indicate random walk margins; negative indicate mean-reverting margins
302
+ - **Implementation Example**: `ts_corr({value_08316q}, ts_delay({value_08316q}, 252), 504)`
303
+
304
+ **Concept**: FX-Adjusted Cash Generation
305
+ - **Sample Fields Used**: `value_04001q`, `cfsourceusea_value_04840a`
306
+ - **Definition**: Net income starting line minus FX translation effects to isolate operational cash generation
307
+ - **Why This Feature**: Removes non-operational currency noise to reveal underlying business performance
308
+ - **Logical Meaning**: Core operational cash flow before translational accounting adjustments; pure operational signal
309
+ - **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. If FX effect is NaN (domestic company), the adjustment should be zero (no effect), not filled from other companies.
310
+ - **Directionality**: Higher values indicate stronger core operational generation independent of currency games
311
+ - **Boundary Conditions**: Large differences between adjusted and unadjusted indicate high FX volatility or international exposure
312
+ - **Implementation Example**: `subtract({value_04001q}, {cfsourceusea_value_04840a})`
313
+
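The persistence feature above is a lagged autocorrelation. A minimal sketch of that calculation follows, assuming a plain Pearson correlation over pairwise-complete observations; the helper names are hypothetical and the platform's `ts_corr` and `ts_delay` operators may treat NaNs and short histories differently.

```python
import math

def pearson(xs, ys):
    """Pearson correlation over pairs where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if not (math.isnan(x) or math.isnan(y))]
    if len(pairs) < 2:
        return math.nan
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return math.nan  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def lagged_autocorr(series, lag, window):
    """Correlate the last `window` values with the values `lag` steps earlier."""
    if lag <= 0 or len(series) < window + lag:
        return math.nan
    current = series[-window:]
    delayed = series[-window - lag:-lag]
    return pearson(current, delayed)

trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
flip = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
persistent = lagged_autocorr(trend, lag=2, window=4)
reverting = lagged_autocorr(flip, lag=1, window=4)
```

With the report's parameters the call would be `lagged_autocorr(margin, lag=252, window=504)`, which needs at least 756 observations before it returns a value.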
314
+ ---
315
+
316
+ ## Implementation Considerations
317
+
318
+ ### Data Quality Notes
319
+ - **Coverage**: Annual fields (suffix 'a') have lower update frequency; mixing with quarterly fields requires careful temporal alignment to avoid stale data
320
+ - **Timeliness**: Delay=1 ensures no look-ahead bias, but some annual metrics may not update for 90+ days after fiscal year end
321
+ - **Accuracy**: Growth rates (`value_08579`, `growthratesa_value_08816a`) can produce extreme outliers when base values approach zero; winsorization at 4 sigma recommended
322
+ - **Potential Biases**: Survivorship bias in 5-year averages (`statisticsa_value_05260a`); companies with volatile earnings histories may have incomplete long-term records
323
+
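The 4-sigma winsorization recommended for growth rates can be sketched as follows. This is a simple illustration using the population mean and standard deviation of the valid values; the function name and the sample data are hypothetical, and the report does not say whether clipping should be applied cross-sectionally or through time.

```python
import math
from statistics import mean, pstdev

def winsorize(values, n_sigma=4.0):
    """Clip values to mean +/- n_sigma * std; NaN entries are left untouched."""
    valid = [v for v in values if not math.isnan(v)]
    if len(valid) < 2:
        return list(values)
    mu, sigma = mean(valid), pstdev(valid)
    lo, hi = mu - n_sigma * sigma, mu + n_sigma * sigma
    return [v if math.isnan(v) else min(max(v, lo), hi) for v in values]

# Hypothetical growth rates: 30 ordinary values, one missing, one blow-up
# of the kind produced when the base value approaches zero.
growth = [0.05] * 30 + [math.nan, 50.0]
clipped = winsorize(growth)
```

Note that the outlier itself inflates the mean and standard deviation used for the bounds, so the clip is milder than it would be with a robust estimate of scale.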
324
+ ### Computational Complexity
325
+ - **Lightweight features**: `divide({value_03501a}, {value_02300q})`, `subtract({value_04001q}, {cfsourceusea_value_04840a})` - single operations
326
+ - **Medium complexity**: `ts_std_dev({value_08251q}, 20)`, `ts_sum({value_04001q}, 252)` - time series windows
327
+ - **Heavy computation**: `ts_corr({value_08316q}, ts_delay({value_08316q}, 252), 504)` - dual time series with lag and correlation; `quantile({value_08301q}, driver="gaussian")` - cross-sectional ranking
328
+
329
+ ### Recommended Prioritization
330
+
331
+ **Tier 1 (Immediate Implementation)**:
332
+ 1. **Sustainable Growth Quality** - Combines momentum with quality, directly addresses leverage risk in growth stories
333
+ 2. **EBIT Z-Score Deviation** - Captures earnings anomalies with clear mean-reversion interpretation
334
+ 3. **Cash-to-Assets Efficiency** - Fundamental efficiency metric with strong theoretical basis
335
+
336
+ **Tier 2 (Secondary Priority)**:
337
+ 1. **Operating Margin Persistence** - Quality factor with academic support for moat identification
338
+ 2. **Coverage Neutralized for Size** - Removes size bias from credit metrics for cross-cap comparisons
339
+ 3. **Equity Capital Structure Ratio** - Classic leverage measure with risk management applications
340
+
341
+ **Tier 3 (Requires Further Validation)**:
342
+ 1. **FX-Adjusted Cash Generation** - Requires validation that FX effects are indeed noise rather than signal for international companies
343
+ 2. **Cumulative FX Drag** - Sign convention must be verified (positive/negative directionality) before use in production
344
+
345
+ ---
346
+
347
+ ## Critical Questions for Further Exploration
348
+
349
+ ### Unanswered Questions:
350
+ 1. How do the quarterly vs annual frequency mismatches affect correlation structures between `growthratesa_value_08816a` (annual) and `value_08316q` (quarterly)?
351
+ 2. Does `cfsourceusea_value_04840a` capture transactional FX exposure or only translational consolidation effects?
352
+ 3. How does the dataset treat extraordinary items in `value_04001q` vs `value_18191q`?
353
+
354
+ ### Recommended Additional Data:
355
+ - Industry sector classifications to enable `group_mean` neutralizations within sectors
356
+ - Daily price data to construct valuation multiples (P/E, EV/EBIT) for convergence analysis
357
+ - Short interest data to combine with `value_08579` (Market Cap Growth) for squeeze potential identification
358
+
359
+ ### Assumptions to Challenge:
360
+ - **Stable is always better**: Is low volatility in `value_08251q` always positive, or does it indicate complacency in low-growth industries?
361
+ - **Growth is good**: Does `growthratesa_value_08616a` account for acquisition quality, or does it reward dilutive M&A?
362
+ - **FX is noise**: For pure exporters, is `cfsourceusea_value_04840a` truly non-operational, or does it reflect competitive positioning?
363
+
364
+ ---
365
+
366
+ ## Methodology Notes
367
+
368
+ **Analysis Approach**: This report was generated by:
369
+ 1. Deep field deconstruction to understand data essence (accounting relationships, frequency differences, business logic)
370
+ 2. Question-driven feature generation (8 fundamental questions applied to financial statement logic)
371
+ 3. Logical validation of each feature concept against financial theory and data constraints
372
+ 4. Transparent documentation of reasoning and implementation templates
373
+
374
+ **Design Principles**:
375
+ - Focus on logical meaning over conventional patterns (e.g., combining growth with coverage rather than just using P/E)
376
+ - Every feature must answer a specific question about the underlying economic reality
377
+ - Clear documentation of "why" for each suggestion to enable validation
378
+ - Emphasis on data understanding (quarterly vs annual, operational vs non-operational) over prediction
379
+
380
+ ---
381
+
382
+ *Report generated: 2024-01-15*
383
+ *Analysis depth: Comprehensive field deconstruction + 8-question framework*
384
+ *Next steps: Implement Tier 1 features, validate FX sign conventions, gather sector data for relative features*
@@ -0,0 +1,41 @@
1
+ [
2
+ "divide(abs(fnd28_cfsourceusea_value_04840a), ts_mean(abs(fnd28_cfsourceusea_value_04840a), 252))",
3
+ "divide(subtract(fnd28_ishtq_value_18191q, ts_mean(fnd28_ishtq_value_18191q, 252)), ts_std_dev(fnd28_ishtq_value_18191q, 252))",
4
+ "divide(subtract(fnd28_newq_value_18191q, ts_mean(fnd28_newq_value_18191q, 252)), ts_std_dev(fnd28_newq_value_18191q, 252))",
5
+ "divide(ts_delta(fnd28_newq_value_08316q, 63), ts_mean(fnd28_newq_value_08316q, 252))",
6
+ "divide(ts_delta(fnd28_ratesq_value_08316q, 63), ts_mean(fnd28_ratesq_value_08316q, 252))",
7
+ "divide(fnd28_bdeq_value_03051q, fnd28_bdeq_value_03999q)",
8
+ "divide(fnd28_bdeq_value_03051q, fnd28_newq_value_03999q)",
9
+ "divide(fnd28_fsq1_value_03051q, fnd28_bdeq_value_03999q)",
10
+ "divide(fnd28_fsq1_value_03051q, fnd28_newq_value_03999q)",
11
+ "divide(fnd28_nddq1_value_03051q, fnd28_newq_value_03999q)",
12
+ "divide(fnd28_nddq1_value_03051q, fnd28_bdeq_value_03999q)",
13
+ "divide(fnd28_bdea_value_03501a, fnd28_bsassetq_value_02300q)",
14
+ "divide(fnd28_bdea_value_03501a, fnd28_nddq1_value_02300q)",
15
+ "divide(fnd28_fsa1_value_03501a, fnd28_bsassetq_value_02300q)",
16
+ "divide(fnd28_fsa1_value_03501a, fnd28_nddq1_value_02300q)",
17
+ "divide(fnd28_cfq_value_04001q, fnd28_bsassetq_value_02300q)",
18
+ "divide(fnd28_cfq_value_04001q, fnd28_nddq1_value_02300q)",
19
+ "divide(fnd28_nddq1_value_04001q, fnd28_nddq1_value_02300q)",
20
+ "divide(fnd28_nddq1_value_04001q, fnd28_bsassetq_value_02300q)",
21
+ "multiply(fnd28_growthratesa_value_08616a, fnd28_newq_value_08251q)",
22
+ "multiply(fnd28_growthratesa_value_08616a, fnd28_ratesq_value_08251q)",
23
+ "quantile(fnd28_newq_value_08301q, driver=\"gaussian\")",
24
+ "quantile(fnd28_ratesq_value_08301q, driver=\"gaussian\")",
25
+ "regression_neut(fnd28_newq_value_08251q, fnd28_nddq1_value_02300q)",
26
+ "regression_neut(fnd28_newq_value_08251q, fnd28_bsassetq_value_02300q)",
27
+ "regression_neut(fnd28_ratesq_value_08251q, fnd28_bsassetq_value_02300q)",
28
+ "regression_neut(fnd28_ratesq_value_08251q, fnd28_nddq1_value_02300q)",
29
+ "subtract(fnd28_cfq_value_04001q, fnd28_cfsourceusea_value_04840a)",
30
+ "subtract(fnd28_nddq1_value_04001q, fnd28_cfsourceusea_value_04840a)",
31
+ "ts_corr(fnd28_newq_value_08316q, ts_delay(fnd28_newq_value_08316q, 252), 504)",
32
+ "ts_corr(fnd28_ratesq_value_08316q, ts_delay(fnd28_ratesq_value_08316q, 252), 504)",
33
+ "ts_delta(fnd28_growthratesa_value_08816a, 63)",
34
+ "ts_std_dev(ts_delta(fnd28_bsassetq_value_02300q, 252), 63)",
35
+ "ts_std_dev(ts_delta(fnd28_nddq1_value_02300q, 252), 63)",
36
+ "ts_std_dev(fnd28_newq_value_08251q, 20)",
37
+ "ts_std_dev(fnd28_ratesq_value_08251q, 20)",
38
+ "ts_sum(fnd28_cfsourceusea_value_04840a, 63)",
39
+ "ts_sum(fnd28_cfq_value_04001q, 252)",
40
+ "ts_sum(fnd28_nddq1_value_04001q, 252)"
41
+ ]
@@ -0,0 +1,7 @@
1
+ {
2
+ "template": "divide(abs({cfsourceusea_value_04840a}), ts_mean(abs({cfsourceusea_value_04840a}), 252))",
3
+ "idea": "**Concept**: FX Impact Anomaly\n- **Sample Fields Used**: `cfsourceusea_value_04840a`\n- **Definition**: Magnitude of current FX effect relative to historical average absolute impact\n- **Why This Feature**: Flags unusual currency translation effects that may distort underlying operational performance\n- **Logical Meaning**: Identifies when currency headwinds/tailwinds are unusually severe compared to the company's historical FX exposure\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. FX effects are often zero for domestic companies; NaN vs zero distinction matters for international exposure identification.\n- **Directionality**: High values indicate unusual FX impact (may require operational adjustment)\n- **Boundary Conditions**: Values near 1 indicate normal FX impact; high values indicate currency crises or extreme rate movements",
4
+ "expression_list": [
5
+ "divide(abs(fnd28_cfsourceusea_value_04840a), ts_mean(abs(fnd28_cfsourceusea_value_04840a), 252))"
6
+ ]
7
+ }
@@ -0,0 +1,8 @@
1
+ {
2
+ "template": "divide(subtract({value_18191q}, ts_mean({value_18191q}, 252)), ts_std_dev({value_18191q}, 252))",
3
+ "idea": "**Concept**: EBIT Z-Score Deviation\n- **Sample Fields Used**: `value_18191q`\n- **Definition**: Standardized deviation of current EBIT from its 1-year historical mean\n- **Why This Feature**: Identifies earnings surprises or shocks that deviate significantly from the company's normal operating range\n- **Logical Meaning**: Statistical measure of earnings unusualness; extreme values suggest non-recurring items or inflection points\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. NaN handling should preserve the distinction between missing data and zero earnings.\n- **Directionality**: High absolute values indicate anomalies (potential mean reversion candidates)\n- **Boundary Conditions**: Values beyond 2-3 standard deviations indicate significant outliers",
4
+ "expression_list": [
5
+ "divide(subtract(fnd28_ishtq_value_18191q, ts_mean(fnd28_ishtq_value_18191q, 252)), ts_std_dev(fnd28_ishtq_value_18191q, 252))",
6
+ "divide(subtract(fnd28_newq_value_18191q, ts_mean(fnd28_newq_value_18191q, 252)), ts_std_dev(fnd28_newq_value_18191q, 252))"
7
+ ]
8
+ }
@@ -0,0 +1,8 @@
1
+ {
2
+ "template": "divide(ts_delta({value_08316q}, 63), ts_mean({value_08316q}, 252))",
3
+ "idea": "**Concept**: Operating Margin Momentum\n- **Sample Fields Used**: `value_08316q`\n- **Definition**: Recent change in operating margin normalized by the 1-year average margin level\n- **Why This Feature**: Identifies operational inflections (expansion/contraction) relative to the company's historical norm\n- **Logical Meaning**: Normalized velocity of profitability changes; indicates pricing power or cost control shifts\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Quarterly reporting gaps should not be filled to avoid assuming constant margins.\n- **Directionality**: Positive values indicate margin expansion (operational improvement)\n- **Boundary Conditions**: Values near zero indicate stable margins; spikes indicate one-time items or structural changes",
4
+ "expression_list": [
5
+ "divide(ts_delta(fnd28_newq_value_08316q, 63), ts_mean(fnd28_newq_value_08316q, 252))",
6
+ "divide(ts_delta(fnd28_ratesq_value_08316q, 63), ts_mean(fnd28_ratesq_value_08316q, 252))"
7
+ ]
8
+ }
@@ -0,0 +1,12 @@
1
+ {
2
+ "template": "divide({value_03051q}, {value_03999q})",
3
+ "idea": "**Concept**: Short-Term Liquidity Exposure\n- **Sample Fields Used**: `value_03051q`, `value_03999q`\n- **Definition**: Short-term debt as a proportion of total liabilities and shareholders' equity\n- **Why This Feature**: Captures refinancing risk and liquidity pressure; high values indicate near-term obligations\n- **Logical Meaning**: Maturity structure of liabilities; indicates reliance on short-term funding vs long-term capital\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Zero short-term debt is meaningful (long-term only financing); distinguish from missing data.\n- **Directionality**: Higher values indicate greater near-term refinancing risk (negative for stability)\n- **Boundary Conditions**: Values approaching 1 indicate all debt is short-term; zero indicates no current maturities",
4
+ "expression_list": [
5
+ "divide(fnd28_bdeq_value_03051q, fnd28_bdeq_value_03999q)",
6
+ "divide(fnd28_bdeq_value_03051q, fnd28_newq_value_03999q)",
7
+ "divide(fnd28_fsq1_value_03051q, fnd28_bdeq_value_03999q)",
8
+ "divide(fnd28_fsq1_value_03051q, fnd28_newq_value_03999q)",
9
+ "divide(fnd28_nddq1_value_03051q, fnd28_newq_value_03999q)",
10
+ "divide(fnd28_nddq1_value_03051q, fnd28_bdeq_value_03999q)"
11
+ ]
12
+ }
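The relationship between a `template` and its `expression_list` in these files looks like a placeholder expansion: each `{placeholder}` is replaced by every dataset field whose name ends with that placeholder, and templates with several placeholders yield the Cartesian product. The sketch below reproduces that pattern. It is inferred from the data shown here, not taken from the package's code, and `expand_template` is a hypothetical helper.

```python
import re
from itertools import product

PLACEHOLDER = re.compile(r"\{(\w+)\}")

def expand_template(template, available_fields):
    """Expand `{suffix}` placeholders into concrete field names.

    Assumption: a field matches a placeholder when its name ends with
    "_" + placeholder. A placeholder used twice gets the same field both times.
    """
    names = list(dict.fromkeys(PLACEHOLDER.findall(template)))  # unique, in order
    candidates = [
        [f for f in available_fields if f.endswith("_" + name)] for name in names
    ]
    expressions = []
    for combo in product(*candidates):
        mapping = dict(zip(names, combo))
        expressions.append(PLACEHOLDER.sub(lambda m: mapping[m.group(1)], template))
    return expressions

fields = [
    "fnd28_bdeq_value_03051q", "fnd28_fsq1_value_03051q", "fnd28_nddq1_value_03051q",
    "fnd28_bdeq_value_03999q", "fnd28_newq_value_03999q",
]
exprs = expand_template("divide({value_03051q}, {value_03999q})", fields)
```

Three matches for the numerator and two for the denominator give the six expressions listed above, although the sketch may emit them in a different order than the file does.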
@@ -0,0 +1,10 @@
1
+ {
2
+ "template": "divide({value_03501a}, {value_02300q})",
3
+ "idea": "**Concept**: Equity Capital Structure Ratio\n- **Sample Fields Used**: `value_03501a`, `value_02300q`\n- **Definition**: Common equity as a proportion of total assets (Equity/Assets ratio)\n- **Why This Feature**: Measures financial leverage and capital structure conservatism; higher equity indicates lower leverage risk\n- **Logical Meaning**: Ownership cushion against asset value declines; inverse of leverage ratio\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Annual equity vs quarterly assets creates frequency mismatch; do not interpolate annual data to quarterly.\n- **Directionality**: Higher values indicate less leveraged, more conservative capital structure (typically lower risk)\n- **Boundary Conditions**: Values near 1 indicate no debt; near 0 indicate highly leveraged or negative equity situations",
4
+ "expression_list": [
5
+ "divide(fnd28_bdea_value_03501a, fnd28_bsassetq_value_02300q)",
6
+ "divide(fnd28_bdea_value_03501a, fnd28_nddq1_value_02300q)",
7
+ "divide(fnd28_fsa1_value_03501a, fnd28_bsassetq_value_02300q)",
8
+ "divide(fnd28_fsa1_value_03501a, fnd28_nddq1_value_02300q)"
9
+ ]
10
+ }
@@ -0,0 +1,10 @@
1
+ {
2
+ "template": "divide({value_04001q}, {value_02300q})",
3
+ "idea": "**Concept**: Cash-to-Assets Efficiency\n- **Sample Fields Used**: `value_04001q`, `value_02300q`\n- **Definition**: Ratio of net income starting line to total assets (ROA proxy using cash flow statement starting point)\n- **Why This Feature**: Measures fundamental asset efficiency independent of accrual accounting adjustments\n- **Logical Meaning**: Asset turnover intensity; how effectively the company converts its asset base into earnings\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Asset values are quarterly; income is flow-based. Ensure both are available for the same period.\n- **Directionality**: Higher values indicate more efficient asset utilization (positive for returns)\n- **Boundary Conditions**: Capital-intensive industries naturally have lower values; financials have different asset definitions",
4
+ "expression_list": [
5
+ "divide(fnd28_cfq_value_04001q, fnd28_bsassetq_value_02300q)",
6
+ "divide(fnd28_cfq_value_04001q, fnd28_nddq1_value_02300q)",
7
+ "divide(fnd28_nddq1_value_04001q, fnd28_nddq1_value_02300q)",
8
+ "divide(fnd28_nddq1_value_04001q, fnd28_bsassetq_value_02300q)"
9
+ ]
10
+ }
@@ -0,0 +1,8 @@
1
+ {
2
+ "template": "multiply({growthratesa_value_08616a}, {value_08251q})",
3
+ "idea": "**Concept**: Sustainable Growth Quality\n- **Sample Fields Used**: `growthratesa_value_08616a`, `value_08251q`\n- **Definition**: Product of equity growth rate and fixed charge coverage ratio\n- **Why This Feature**: High growth with low coverage suggests leveraged, risky expansion; high coverage supports sustainable growth\n- **Logical Meaning**: Quality-adjusted growth metric; scales growth magnitude by financial stability\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Different frequencies (annual growth vs quarterly coverage) require alignment; do not fill across frequency mismatches.\n- **Directionality**: Higher values indicate high growth with strong coverage (optimal); negative values indicate growth during coverage distress (risky)\n- **Boundary Conditions**: Near-zero coverage with high growth creates extreme values; winsorization recommended",
4
+ "expression_list": [
5
+ "multiply(fnd28_growthratesa_value_08616a, fnd28_newq_value_08251q)",
6
+ "multiply(fnd28_growthratesa_value_08616a, fnd28_ratesq_value_08251q)"
7
+ ]
8
+ }
@@ -0,0 +1,8 @@
1
+ {
2
+ "template": "quantile({value_08301q}, driver=\"gaussian\")",
3
+ "idea": "**Concept**: ROE Cross-Sectional Percentile\n- **Sample Fields Used**: `value_08301q`\n- **Definition**: Gaussian-quantile rank of Return on Equity within the cross-sectional universe\n- **Why This Feature**: Relative profitability positioning independent of market-wide ROE shifts; identifies top-tier operators\n- **Logical Meaning**: Standardized position within the profit distribution; robust to inflation/period effects that raise all boats\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Quantile calculation requires complete cross-section; NaN values should be excluded from ranking, not filled.\n- **Directionality**: Higher values indicate top-quartile profitability relative to peers (positive for selection)\n- **Boundary Conditions**: Gaussian transformation caps extreme tails; values beyond +/- 2 sigma are rare",
4
+ "expression_list": [
5
+ "quantile(fnd28_newq_value_08301q, driver=\"gaussian\")",
6
+ "quantile(fnd28_ratesq_value_08301q, driver=\"gaussian\")"
7
+ ]
8
+ }
@@ -0,0 +1,10 @@
1
+ {
2
+ "template": "regression_neut({value_08251q}, {value_02300q})",
3
+ "idea": "**Concept**: Coverage Neutralized for Size\n- **Sample Fields Used**: `value_08251q`, `value_02300q`\n- **Definition**: Residual of fixed charge coverage after regressing on total assets (size)\n- **Why This Feature**: Distinguishes coverage due to operational efficiency from coverage due to scale economies or diversification\n- **Logical Meaning**: Coverage ratio independent of company size; identifies efficiently managed small caps vs bloated large caps\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Regression requires paired observations; missing either variable should result in NaN residual.\n- **Directionality**: Positive residuals indicate better coverage than size predicts (operational alpha)\n- **Boundary Conditions**: Extreme residuals indicate outliers in coverage-to-size relationship (niche business models)",
4
+ "expression_list": [
5
+ "regression_neut(fnd28_newq_value_08251q, fnd28_nddq1_value_02300q)",
6
+ "regression_neut(fnd28_newq_value_08251q, fnd28_bsassetq_value_02300q)",
7
+ "regression_neut(fnd28_ratesq_value_08251q, fnd28_bsassetq_value_02300q)",
8
+ "regression_neut(fnd28_ratesq_value_08251q, fnd28_nddq1_value_02300q)"
9
+ ]
10
+ }
@@ -0,0 +1,8 @@
1
+ {
2
+ "template": "subtract({value_04001q}, {cfsourceusea_value_04840a})",
3
+ "idea": "**Concept**: FX-Adjusted Cash Generation\n- **Sample Fields Used**: `value_04001q`, `cfsourceusea_value_04840a`\n- **Definition**: Net income starting line minus FX translation effects to isolate operational cash generation\n- **Why This Feature**: Removes non-operational currency noise to reveal underlying business performance\n- **Logical Meaning**: Core operational cash flow before translational accounting adjustments; pure operational signal\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. If FX effect is NaN (domestic company), the adjustment should be zero (no effect), not filled from other companies.\n- **Directionality**: Higher values indicate stronger core operational generation independent of currency games\n- **Boundary Conditions**: Large differences between adjusted and unadjusted indicate high FX volatility or international exposure",
4
+ "expression_list": [
5
+ "subtract(fnd28_cfq_value_04001q, fnd28_cfsourceusea_value_04840a)",
6
+ "subtract(fnd28_nddq1_value_04001q, fnd28_cfsourceusea_value_04840a)"
7
+ ]
8
+ }
@@ -0,0 +1,8 @@
1
+ {
2
+ "template": "ts_corr({value_08316q}, ts_delay({value_08316q}, 252), 504)",
3
+ "idea": "**Concept**: Operating Margin Persistence\n- **Sample Fields Used**: `value_08316q`\n- **Definition**: Correlation between current operating margin and margin 252 days (1 year) prior, measured over 504 days (2 years)\n- **Why This Feature**: Measures the durability of competitive advantages; persistent margins indicate moats, volatile margins indicate commodity exposure\n- **Logical Meaning**: Autocorrelation of profitability; high values suggest structural industry position, low values suggest cyclical or competitive pressure\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Correlation requires aligned time series; filling gaps creates spurious persistence.\n- **Directionality**: Higher values indicate persistent margins (quality); low values indicate unstable margins (risk)\n- **Boundary Conditions**: Values near 1 indicate highly predictable margins; near 0 indicate random walk margins; negative indicate mean-reverting margins",
4
+ "expression_list": [
5
+ "ts_corr(fnd28_newq_value_08316q, ts_delay(fnd28_newq_value_08316q, 252), 504)",
6
+ "ts_corr(fnd28_ratesq_value_08316q, ts_delay(fnd28_ratesq_value_08316q, 252), 504)"
7
+ ]
8
+ }
@@ -0,0 +1,7 @@
1
+ {
2
+ "template": "ts_delta({growthratesa_value_08816a}, 63)",
3
+ "idea": "**Concept**: Earnings Growth Acceleration\n- **Sample Fields Used**: `growthratesa_value_08816a`\n- **Definition**: Change in annual EPS growth rate over a 63-day window to capture inflection points in momentum\n- **Why This Feature**: Markets price changes in growth rates, not just growth levels; acceleration signals improving business trends\n- **Logical Meaning**: Second derivative of earnings; positive values indicate growth is speeding up (positive momentum)\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Annual growth rates update infrequently; filling NaNs with stale data creates look-ahead bias.\n- **Directionality**: Positive values indicate accelerating growth (bullish); negative indicates deceleration\n- **Boundary Conditions**: Extreme values occur near earnings turning points (negative to positive growth)",
4
+ "expression_list": [
5
+ "ts_delta(fnd28_growthratesa_value_08816a, 63)"
6
+ ]
7
+ }
@@ -0,0 +1,8 @@
1
+ {
2
+ "template": "ts_std_dev(ts_delta({value_02300q}, 252), 63)",
3
+ "idea": "**Concept**: Asset Growth Consistency\n- **Sample Fields Used**: `value_02300q`\n- **Definition**: Standard deviation of year-over-year asset changes measured over 63 days (quarterly window)\n- **Why This Feature**: Distinguishes between steady organic expansion and lumpy acquisition-driven growth or asset sales\n- **Logical Meaning**: Captures the volatility of the company's investment policy; consistent growth suggests predictable capital allocation\n- **is filling nan necessary**: we have some operators to fill nan value like ts_backfill() or group_mean() etc. however, in some cases, if the nan value itself has some meaning, then we should not fill it blindly since it may introduce some bias. so before filling nan value, we should think about whether the nan value has some meaning in the specific scenario. Asset values are typically reported quarterly; interpolation between quarters may introduce false stability.\n- **Directionality**: Lower values indicate more stable asset base evolution (typically positive for forecasting)\n- **Boundary Conditions**: Zero indicates no asset changes; spikes indicate M&A activity or write-downs",
4
+ "expression_list": [
5
+ "ts_std_dev(ts_delta(fnd28_bsassetq_value_02300q, 252), 63)",
6
+ "ts_std_dev(ts_delta(fnd28_nddq1_value_02300q, 252), 63)"
7
+ ]
8
+ }