@wentorai/research-plugins 1.0.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (203)
  1. package/README.md +22 -22
  2. package/curated/analysis/README.md +71 -56
  3. package/curated/domains/README.md +176 -67
  4. package/curated/literature/README.md +71 -47
  5. package/curated/research/README.md +91 -58
  6. package/curated/tools/README.md +88 -87
  7. package/curated/writing/README.md +80 -45
  8. package/mcp-configs/cloud-docs/confluence-mcp.json +37 -0
  9. package/mcp-configs/cloud-docs/google-drive-mcp.json +35 -0
  10. package/mcp-configs/cloud-docs/notion-mcp.json +29 -0
  11. package/mcp-configs/communication/discord-mcp.json +29 -0
  12. package/mcp-configs/communication/slack-mcp.json +29 -0
  13. package/mcp-configs/communication/telegram-mcp.json +28 -0
  14. package/mcp-configs/database/neo4j-mcp.json +37 -0
  15. package/mcp-configs/database/postgres-mcp.json +28 -0
  16. package/mcp-configs/database/sqlite-mcp.json +29 -0
  17. package/mcp-configs/dev-platform/github-mcp.json +31 -0
  18. package/mcp-configs/dev-platform/gitlab-mcp.json +34 -0
  19. package/mcp-configs/email/email-mcp.json +40 -0
  20. package/mcp-configs/email/gmail-mcp.json +37 -0
  21. package/mcp-configs/registry.json +178 -149
  22. package/mcp-configs/repository/dataverse-mcp.json +33 -0
  23. package/mcp-configs/repository/huggingface-mcp.json +29 -0
  24. package/openclaw.plugin.json +2 -2
  25. package/package.json +2 -2
  26. package/skills/analysis/dataviz/algorithm-visualizer-guide/SKILL.md +259 -0
  27. package/skills/analysis/dataviz/bokeh-visualization-guide/SKILL.md +270 -0
  28. package/skills/analysis/dataviz/chart-image-generator/SKILL.md +229 -0
  29. package/skills/analysis/dataviz/d3-visualization-guide/SKILL.md +281 -0
  30. package/skills/analysis/dataviz/echarts-visualization-guide/SKILL.md +250 -0
  31. package/skills/analysis/dataviz/metabase-analytics-guide/SKILL.md +242 -0
  32. package/skills/analysis/dataviz/plotly-interactive-guide/SKILL.md +266 -0
  33. package/skills/analysis/dataviz/redash-analytics-guide/SKILL.md +284 -0
  34. package/skills/analysis/econometrics/econml-causal-guide/SKILL.md +163 -0
  35. package/skills/analysis/econometrics/mostly-harmless-guide/SKILL.md +139 -0
  36. package/skills/analysis/econometrics/panel-data-analyst/SKILL.md +259 -0
  37. package/skills/analysis/econometrics/python-causality-guide/SKILL.md +134 -0
  38. package/skills/analysis/econometrics/stata-accounting-guide/SKILL.md +269 -0
  39. package/skills/analysis/econometrics/stata-analyst-guide/SKILL.md +245 -0
  40. package/skills/analysis/statistics/data-anomaly-detection/SKILL.md +157 -0
  41. package/skills/analysis/statistics/ml-experiment-tracker/SKILL.md +212 -0
  42. package/skills/analysis/statistics/pywayne-statistics-guide/SKILL.md +192 -0
  43. package/skills/analysis/statistics/quantitative-methods-guide/SKILL.md +193 -0
  44. package/skills/analysis/statistics/senior-data-scientist-guide/SKILL.md +223 -0
  45. package/skills/analysis/wrangling/csv-data-analyzer/SKILL.md +170 -0
  46. package/skills/analysis/wrangling/data-cleaning-pipeline/SKILL.md +266 -0
  47. package/skills/analysis/wrangling/data-cog-guide/SKILL.md +178 -0
  48. package/skills/analysis/wrangling/stata-data-cleaning/SKILL.md +276 -0
  49. package/skills/analysis/wrangling/survey-data-processing/SKILL.md +298 -0
  50. package/skills/domains/ai-ml/ai-model-benchmarking/SKILL.md +209 -0
  51. package/skills/domains/ai-ml/annotated-dl-papers-guide/SKILL.md +159 -0
  52. package/skills/domains/ai-ml/dl-transformer-finetune/SKILL.md +239 -0
  53. package/skills/domains/ai-ml/generative-ai-guide/SKILL.md +146 -0
  54. package/skills/domains/ai-ml/huggingface-inference-guide/SKILL.md +196 -0
  55. package/skills/domains/ai-ml/keras-deep-learning/SKILL.md +210 -0
  56. package/skills/domains/ai-ml/llm-from-scratch-guide/SKILL.md +124 -0
  57. package/skills/domains/ai-ml/ml-pipeline-guide/SKILL.md +295 -0
  58. package/skills/domains/ai-ml/nlp-toolkit-guide/SKILL.md +247 -0
  59. package/skills/domains/ai-ml/pytorch-guide/SKILL.md +281 -0
  60. package/skills/domains/ai-ml/pytorch-lightning-guide/SKILL.md +244 -0
  61. package/skills/domains/ai-ml/tensorflow-guide/SKILL.md +241 -0
  62. package/skills/domains/biomedical/bioagents-guide/SKILL.md +308 -0
  63. package/skills/domains/biomedical/medgeclaw-guide/SKILL.md +345 -0
  64. package/skills/domains/biomedical/medical-imaging-guide/SKILL.md +305 -0
  65. package/skills/domains/business/architecture-design-guide/SKILL.md +279 -0
  66. package/skills/domains/business/innovation-management-guide/SKILL.md +257 -0
  67. package/skills/domains/business/operations-research-guide/SKILL.md +258 -0
  68. package/skills/domains/chemistry/molecular-dynamics-guide/SKILL.md +237 -0
  69. package/skills/domains/chemistry/pubchem-api-guide/SKILL.md +180 -0
  70. package/skills/domains/chemistry/spectroscopy-analysis-guide/SKILL.md +290 -0
  71. package/skills/domains/cs/distributed-systems-guide/SKILL.md +268 -0
  72. package/skills/domains/cs/formal-verification-guide/SKILL.md +298 -0
  73. package/skills/domains/ecology/species-distribution-guide/SKILL.md +343 -0
  74. package/skills/domains/economics/imf-data-api-guide/SKILL.md +174 -0
  75. package/skills/domains/economics/post-labor-economics/SKILL.md +254 -0
  76. package/skills/domains/economics/pricing-psychology-guide/SKILL.md +273 -0
  77. package/skills/domains/economics/world-bank-data-guide/SKILL.md +179 -0
  78. package/skills/domains/education/assessment-design-guide/SKILL.md +213 -0
  79. package/skills/domains/education/educational-research-methods/SKILL.md +179 -0
  80. package/skills/domains/education/mooc-analytics-guide/SKILL.md +206 -0
  81. package/skills/domains/finance/portfolio-optimization-guide/SKILL.md +279 -0
  82. package/skills/domains/finance/risk-modeling-guide/SKILL.md +260 -0
  83. package/skills/domains/finance/stata-accounting-research/SKILL.md +372 -0
  84. package/skills/domains/geoscience/climate-modeling-guide/SKILL.md +215 -0
  85. package/skills/domains/geoscience/satellite-remote-sensing/SKILL.md +193 -0
  86. package/skills/domains/geoscience/seismology-data-guide/SKILL.md +208 -0
  87. package/skills/domains/humanities/ethical-philosophy-guide/SKILL.md +244 -0
  88. package/skills/domains/humanities/history-research-guide/SKILL.md +260 -0
  89. package/skills/domains/humanities/political-history-guide/SKILL.md +241 -0
  90. package/skills/domains/law/legal-nlp-guide/SKILL.md +236 -0
  91. package/skills/domains/law/patent-analysis-guide/SKILL.md +257 -0
  92. package/skills/domains/law/regulatory-compliance-guide/SKILL.md +267 -0
  93. package/skills/domains/math/symbolic-computation-guide/SKILL.md +263 -0
  94. package/skills/domains/math/topology-data-analysis/SKILL.md +305 -0
  95. package/skills/domains/pharma/clinical-trial-design-guide/SKILL.md +271 -0
  96. package/skills/domains/pharma/drug-target-interaction/SKILL.md +242 -0
  97. package/skills/domains/pharma/pharmacovigilance-guide/SKILL.md +216 -0
  98. package/skills/domains/physics/astrophysics-data-guide/SKILL.md +305 -0
  99. package/skills/domains/physics/particle-physics-guide/SKILL.md +287 -0
  100. package/skills/domains/social-science/network-analysis-guide/SKILL.md +310 -0
  101. package/skills/domains/social-science/psychology-research-guide/SKILL.md +270 -0
  102. package/skills/domains/social-science/sociology-research-guide/SKILL.md +238 -0
  103. package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +120 -0
  104. package/skills/literature/discovery/semantic-paper-radar/SKILL.md +144 -0
  105. package/skills/literature/discovery/zotero-arxiv-daily-guide/SKILL.md +94 -0
  106. package/skills/literature/fulltext/core-api-guide/SKILL.md +144 -0
  107. package/skills/literature/fulltext/institutional-repository-guide/SKILL.md +212 -0
  108. package/skills/literature/fulltext/open-access-mining-guide/SKILL.md +341 -0
  109. package/skills/literature/metadata/academic-paper-summarizer/SKILL.md +101 -0
  110. package/skills/literature/metadata/wikidata-api-guide/SKILL.md +156 -0
  111. package/skills/literature/search/arxiv-batch-reporting/SKILL.md +133 -0
  112. package/skills/literature/search/arxiv-paper-processor/SKILL.md +141 -0
  113. package/skills/literature/search/baidu-scholar-guide/SKILL.md +110 -0
  114. package/skills/literature/search/chatpaper-guide/SKILL.md +122 -0
  115. package/skills/literature/search/deep-literature-search/SKILL.md +149 -0
  116. package/skills/literature/search/deepgit-search-guide/SKILL.md +147 -0
  117. package/skills/literature/search/pasa-paper-search-guide/SKILL.md +138 -0
  118. package/skills/research/automation/ai-scientist-v2-guide/SKILL.md +284 -0
  119. package/skills/research/automation/aim-experiment-guide/SKILL.md +234 -0
  120. package/skills/research/automation/datagen-research-guide/SKILL.md +131 -0
  121. package/skills/research/automation/kedro-pipeline-guide/SKILL.md +216 -0
  122. package/skills/research/automation/mle-agent-guide/SKILL.md +139 -0
  123. package/skills/research/automation/paper-to-agent-guide/SKILL.md +116 -0
  124. package/skills/research/automation/rd-agent-guide/SKILL.md +246 -0
  125. package/skills/research/automation/research-paper-orchestrator/SKILL.md +254 -0
  126. package/skills/research/deep-research/academic-deep-research/SKILL.md +190 -0
  127. package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +141 -0
  128. package/skills/research/deep-research/deep-research-pro/SKILL.md +213 -0
  129. package/skills/research/deep-research/deep-research-work/SKILL.md +204 -0
  130. package/skills/research/deep-research/deep-searcher-guide/SKILL.md +253 -0
  131. package/skills/research/deep-research/gpt-researcher-guide/SKILL.md +191 -0
  132. package/skills/research/deep-research/khoj-research-guide/SKILL.md +200 -0
  133. package/skills/research/deep-research/local-deep-research-guide/SKILL.md +253 -0
  134. package/skills/research/deep-research/tongyi-deep-research-guide/SKILL.md +217 -0
  135. package/skills/research/funding/eu-horizon-guide/SKILL.md +244 -0
  136. package/skills/research/funding/grant-budget-guide/SKILL.md +284 -0
  137. package/skills/research/funding/nih-reporter-api-guide/SKILL.md +166 -0
  138. package/skills/research/funding/nsf-award-api-guide/SKILL.md +133 -0
  139. package/skills/research/methodology/academic-mentor-guide/SKILL.md +169 -0
  140. package/skills/research/methodology/claude-scientific-guide/SKILL.md +122 -0
  141. package/skills/research/methodology/deep-innovator-guide/SKILL.md +242 -0
  142. package/skills/research/methodology/osf-api-guide/SKILL.md +165 -0
  143. package/skills/research/methodology/research-paper-kb/SKILL.md +263 -0
  144. package/skills/research/methodology/research-town-guide/SKILL.md +263 -0
  145. package/skills/research/paper-review/automated-review-guide/SKILL.md +281 -0
  146. package/skills/research/paper-review/paper-compare-guide/SKILL.md +238 -0
  147. package/skills/research/paper-review/paper-digest-guide/SKILL.md +240 -0
  148. package/skills/research/paper-review/paper-research-assistant/SKILL.md +231 -0
  149. package/skills/research/paper-review/research-quality-filter/SKILL.md +261 -0
  150. package/skills/research/paper-review/review-response-guide/SKILL.md +275 -0
  151. package/skills/tools/code-exec/google-colab-guide/SKILL.md +276 -0
  152. package/skills/tools/code-exec/kaggle-api-guide/SKILL.md +216 -0
  153. package/skills/tools/code-exec/overleaf-cli-guide/SKILL.md +279 -0
  154. package/skills/tools/diagram/code-flow-visualizer/SKILL.md +197 -0
  155. package/skills/tools/diagram/excalidraw-diagram-guide/SKILL.md +170 -0
  156. package/skills/tools/diagram/json-data-visualizer/SKILL.md +270 -0
  157. package/skills/tools/diagram/mermaid-architect-guide/SKILL.md +219 -0
  158. package/skills/tools/diagram/tldraw-whiteboard-guide/SKILL.md +397 -0
  159. package/skills/tools/document/docsgpt-guide/SKILL.md +130 -0
  160. package/skills/tools/document/large-document-reader/SKILL.md +202 -0
  161. package/skills/tools/document/paper-parse-guide/SKILL.md +243 -0
  162. package/skills/tools/knowledge-graph/citation-network-builder/SKILL.md +244 -0
  163. package/skills/tools/knowledge-graph/concept-map-generator/SKILL.md +284 -0
  164. package/skills/tools/knowledge-graph/graphiti-guide/SKILL.md +219 -0
  165. package/skills/tools/ocr-translate/pdf-math-translate-guide/SKILL.md +141 -0
  166. package/skills/tools/ocr-translate/zotero-pdf-translate-guide/SKILL.md +95 -0
  167. package/skills/tools/ocr-translate/zotero-pdf2zh-guide/SKILL.md +143 -0
  168. package/skills/tools/scraping/dataset-finder-guide/SKILL.md +253 -0
  169. package/skills/tools/scraping/easy-spider-guide/SKILL.md +250 -0
  170. package/skills/tools/scraping/google-scholar-scraper/SKILL.md +255 -0
  171. package/skills/tools/scraping/repository-harvesting-guide/SKILL.md +310 -0
  172. package/skills/writing/citation/academic-citation-manager/SKILL.md +314 -0
  173. package/skills/writing/citation/jabref-reference-guide/SKILL.md +127 -0
  174. package/skills/writing/citation/jasminum-zotero-guide/SKILL.md +103 -0
  175. package/skills/writing/citation/obsidian-citation-guide/SKILL.md +164 -0
  176. package/skills/writing/citation/obsidian-zotero-guide/SKILL.md +137 -0
  177. package/skills/writing/citation/papersgpt-zotero-guide/SKILL.md +132 -0
  178. package/skills/writing/citation/papis-cli-guide/SKILL.md +213 -0
  179. package/skills/writing/citation/zotero-better-bibtex-guide/SKILL.md +107 -0
  180. package/skills/writing/citation/zotero-better-notes-guide/SKILL.md +121 -0
  181. package/skills/writing/citation/zotero-gpt-guide/SKILL.md +111 -0
  182. package/skills/writing/citation/zotero-mcp-guide/SKILL.md +164 -0
  183. package/skills/writing/citation/zotero-mdnotes-guide/SKILL.md +162 -0
  184. package/skills/writing/citation/zotero-reference-guide/SKILL.md +139 -0
  185. package/skills/writing/citation/zotero-scholar-guide/SKILL.md +294 -0
  186. package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +140 -0
  187. package/skills/writing/composition/ml-paper-writing/SKILL.md +163 -0
  188. package/skills/writing/composition/paper-debugger-guide/SKILL.md +143 -0
  189. package/skills/writing/composition/scientific-writing-resources/SKILL.md +151 -0
  190. package/skills/writing/composition/scientific-writing-wrapper/SKILL.md +153 -0
  191. package/skills/writing/latex/latex-drawing-collection/SKILL.md +154 -0
  192. package/skills/writing/latex/latex-templates-collection/SKILL.md +159 -0
  193. package/skills/writing/latex/md-to-pdf-academic/SKILL.md +230 -0
  194. package/skills/writing/latex/tex-render-guide/SKILL.md +243 -0
  195. package/skills/writing/polish/academic-tone-guide/SKILL.md +209 -0
  196. package/skills/writing/polish/conciseness-editing-guide/SKILL.md +225 -0
  197. package/skills/writing/polish/paper-polish-guide/SKILL.md +160 -0
  198. package/skills/writing/templates/graphical-abstract-guide/SKILL.md +183 -0
  199. package/skills/writing/templates/novathesis-guide/SKILL.md +152 -0
  200. package/skills/writing/templates/scientific-article-pdf/SKILL.md +261 -0
  201. package/skills/writing/templates/sjtuthesis-guide/SKILL.md +197 -0
  202. package/skills/writing/templates/thuthesis-guide/SKILL.md +181 -0
  203. package/skills/literature/fulltext/repository-harvesting-guide/SKILL.md +0 -207
@@ -0,0 +1,260 @@
+ ---
+ name: risk-modeling-guide
+ description: "Financial risk modeling including VaR, stress testing, and credit risk"
+ metadata:
+   openclaw:
+     emoji: "chart-decreasing"
+     category: "domains"
+     subcategory: "finance"
+     keywords: ["risk-modeling", "var", "stress-testing", "credit-risk", "monte-carlo", "basel"]
+     source: "wentor"
+ ---
+
+ # Risk Modeling Guide
+
+ A skill for quantitative financial risk modeling, covering Value at Risk, Expected Shortfall, credit risk, stress testing, and Monte Carlo simulation methods. Essential for financial engineering research and regulatory risk analysis.
+
+ ## Market Risk: Value at Risk
+
+ ### VaR Methodologies
+
+ | Method | Description | Pros | Cons |
+ |--------|-------------|------|------|
+ | Historical simulation | Replay past returns | No distributional assumption | Assumes past repeats |
+ | Variance-covariance | Assume normal returns | Fast, analytical | Underestimates tail risk |
+ | Monte Carlo simulation | Simulate from fitted model | Flexible distributions | Computationally expensive |
+ | Filtered historical simulation | GARCH + historical innovations | Captures volatility clustering | More complex |
+
+ ### Implementation
+
+ ```python
+ import numpy as np
+ import pandas as pd
+ from scipy.stats import norm, t as t_dist
+
+ def historical_var(returns: np.ndarray, confidence: float = 0.99,
+                    horizon_days: int = 1) -> dict:
+     """
+     Compute Value at Risk using historical simulation.
+     returns: array of daily log returns
+     confidence: confidence level (e.g., 0.99 for 99% VaR)
+     horizon_days: risk horizon in days
+     """
+     # Scale returns to horizon
+     if horizon_days > 1:
+         # Rolling sum for overlapping returns
+         scaled_returns = pd.Series(returns).rolling(horizon_days).sum().dropna().values
+     else:
+         scaled_returns = returns
+
+     alpha = 1 - confidence
+     var = -np.percentile(scaled_returns, alpha * 100)
+     es = -np.mean(scaled_returns[scaled_returns <= -var])
+
+     return {
+         "VaR": round(var, 6),
+         "Expected_Shortfall": round(es, 6),
+         "confidence": confidence,
+         "horizon_days": horizon_days,
+         "n_observations": len(scaled_returns),
+     }
+
+ def parametric_var(returns: np.ndarray, confidence: float = 0.99,
+                    distribution: str = "normal") -> dict:
+     """
+     Parametric VaR assuming normal or Student-t distribution.
+     """
+     mu = np.mean(returns)
+     sigma = np.std(returns, ddof=1)
+
+     if distribution == "normal":
+         z = norm.ppf(1 - confidence)
+         var = -(mu + sigma * z)
+         # Analytical ES for normal
+         es = -mu + sigma * norm.pdf(norm.ppf(1 - confidence)) / (1 - confidence)
+     elif distribution == "student-t":
+         # Fit Student-t
+         df, loc, scale = t_dist.fit(returns)
+         z = t_dist.ppf(1 - confidence, df)
+         var = -(loc + scale * z)
+         # ES for Student-t
+         t_pdf = t_dist.pdf(t_dist.ppf(1 - confidence, df), df)
+         es = -loc + scale * (t_pdf / (1 - confidence)) * ((df + z**2) / (df - 1))
+     else:
+         raise ValueError(f"Unknown distribution: {distribution}")
+
+     return {
+         "VaR": round(var, 6),
+         "Expected_Shortfall": round(es, 6),
+         "distribution": distribution,
+         "mean": round(mu, 6),
+         "std": round(sigma, 6),
+     }
+ ```
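As a quick sanity check on the VaR and Expected Shortfall definitions, the following minimal sketch computes both on synthetic returns using only the standard library. The helper name `empirical_var_es` and the simulated sample are hypothetical, and the quantile here is a plain order statistic rather than the interpolated `np.percentile`, so values can differ slightly from `historical_var`.

```python
import math
import random

def empirical_var_es(returns, confidence=0.99):
    """Historical VaR and ES as positive loss numbers (order-statistic quantile)."""
    ordered = sorted(returns)                           # worst returns first
    alpha = 1 - confidence
    # Number of observations in the loss tail (small epsilon guards float error)
    k = max(1, math.floor(alpha * len(ordered) + 1e-9))
    tail = ordered[:k]
    var = -tail[-1]                                     # loss at the tail boundary
    es = -sum(tail) / len(tail)                         # average loss inside the tail
    return var, es

# Hypothetical daily log returns: mean 0.05%, volatility 1%
rng = random.Random(0)
sample = [rng.gauss(0.0005, 0.01) for _ in range(5000)]

var_99, es_99 = empirical_var_es(sample, 0.99)
print(f"99% VaR: {var_99:.4f}  99% ES: {es_99:.4f}")
```

ES averages the losses beyond the VaR threshold, so it should never come out smaller than VaR at the same confidence level.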
+
+ ### Monte Carlo VaR
+
+ ```python
+ def monte_carlo_var(returns: np.ndarray, n_simulations: int = 100000,
+                     confidence: float = 0.99,
+                     horizon_days: int = 10) -> dict:
+     """
+     Monte Carlo VaR using GBM (Geometric Brownian Motion).
+     """
+     mu = np.mean(returns)
+     sigma = np.std(returns, ddof=1)
+
+     # Simulate horizon log returns directly: under GBM, i.i.d. normal daily
+     # log returns sum to N(mu * h, sigma^2 * h) over h days
+     rng = np.random.default_rng(42)
+     simulated = rng.normal(
+         mu * horizon_days,
+         sigma * np.sqrt(horizon_days),
+         size=n_simulations,
+     )
+
+     alpha = 1 - confidence
+     var = -np.percentile(simulated, alpha * 100)
+     es = -np.mean(simulated[simulated <= -var])
+
+     return {
+         "VaR": round(var, 6),
+         "Expected_Shortfall": round(es, 6),
+         "n_simulations": n_simulations,
+         "confidence": confidence,
+         "horizon_days": horizon_days,
+     }
+ ```
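Because `monte_carlo_var` draws horizon returns from a normal distribution, its estimate should approach the closed-form normal VaR as the number of simulations grows. The sketch below illustrates that check with the standard library only; the function names and the parameter values (daily mean 0.04%, daily volatility 1.2%) are illustrative assumptions, not part of the skill.

```python
import random
from statistics import NormalDist

def analytic_normal_var(mu, sigma, confidence, horizon_days):
    """Closed-form VaR when the horizon return is N(mu*h, sigma^2*h)."""
    z = NormalDist().inv_cdf(1 - confidence)            # negative tail quantile
    return -(mu * horizon_days + sigma * horizon_days ** 0.5 * z)

def simulated_normal_var(mu, sigma, confidence, horizon_days,
                         n=200_000, seed=42):
    """Monte Carlo estimate of the same quantile."""
    rng = random.Random(seed)
    h_mu = mu * horizon_days
    h_sigma = sigma * horizon_days ** 0.5
    draws = sorted(rng.gauss(h_mu, h_sigma) for _ in range(n))
    k = round((1 - confidence) * n)                     # index of the tail quantile
    return -draws[k]

exact = analytic_normal_var(0.0004, 0.012, 0.99, 10)
approx = simulated_normal_var(0.0004, 0.012, 0.99, 10)
print(f"analytic: {exact:.5f}  simulated: {approx:.5f}")
```

Simulation only pays off once the return model departs from this normal case (fat tails, path dependence, nonlinear positions), where no closed form is available.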
+
+ ## Credit Risk Modeling
+
+ ### Probability of Default Estimation
+
+ ```python
+ from sklearn.linear_model import LogisticRegression
+ from sklearn.metrics import roc_auc_score
+
+ def build_pd_model(features: pd.DataFrame,
+                    default_flag: pd.Series) -> dict:
+     """
+     Build a Probability of Default (PD) model using logistic regression.
+     Common features: debt-to-income, credit utilization, payment history,
+     employment length, loan amount.
+     """
+     model = LogisticRegression(max_iter=1000, class_weight="balanced")
+     model.fit(features, default_flag)
+
+     # Coefficient interpretation
+     coef_df = pd.DataFrame({
+         "feature": features.columns,
+         "coefficient": model.coef_[0],
+         "odds_ratio": np.exp(model.coef_[0]),
+     }).sort_values("coefficient", ascending=False)
+
+     # Model discrimination (in-sample AUC; validate on held-out data in practice)
+     pred_proba = model.predict_proba(features)[:, 1]
+     auc = roc_auc_score(default_flag, pred_proba)
+
+     return {
+         "auc": round(auc, 4),
+         "coefficients": coef_df.to_dict("records"),
+         "intercept": round(model.intercept_[0], 4),
+     }
+ ```
+
+ ### Loss Given Default and EAD
+
+ ```python
+ def compute_expected_loss(pd_score: float, lgd: float,
+                           ead: float) -> dict:
+     """
+     Compute Expected Loss = PD x LGD x EAD.
+     pd_score: probability of default (0-1)
+     lgd: loss given default (0-1, fraction of exposure lost)
+     ead: exposure at default (dollar amount)
+     """
+     el = pd_score * lgd * ead
+     # Loss standard deviation for a Bernoulli default with fixed LGD and EAD
+     loss_std = lgd * ead * np.sqrt(pd_score * (1 - pd_score))
+     return {
+         "PD": pd_score,
+         "LGD": lgd,
+         "EAD": ead,
+         "Expected_Loss": round(el, 2),
+         # 2.33 standard deviations (normal approximation to the 99% quantile)
+         "Unexpected_Loss_99": round(2.33 * loss_std, 2),
+     }
+ ```
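A short worked example of the Expected Loss arithmetic, aggregated over a two-loan portfolio. The loan figures are hypothetical, and the loss standard deviation assumes a Bernoulli default event with fixed LGD and EAD, which is a simplification.

```python
import math

# Hypothetical exposures
loans = [
    {"pd_score": 0.02, "lgd": 0.45, "ead": 1_000_000},
    {"pd_score": 0.10, "lgd": 0.60, "ead": 250_000},
]

def expected_loss(pd_score, lgd, ead):
    """Expected Loss = PD x LGD x EAD."""
    return pd_score * lgd * ead

def loss_std(pd_score, lgd, ead):
    """Standard deviation of loss for a Bernoulli default with fixed LGD and EAD."""
    return ead * lgd * math.sqrt(pd_score * (1 - pd_score))

# Expected losses add across loans regardless of default correlation
portfolio_el = sum(expected_loss(**loan) for loan in loans)

for loan in loans:
    print(f"EL = {expected_loss(**loan):,.0f}  loss std = {loss_std(**loan):,.0f}")
print(f"Portfolio EL = {portfolio_el:,.0f}")
```

Expected losses are additive, but loss standard deviations are not: combining them at portfolio level requires an assumption about default correlation, which this sketch leaves out.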
+
+ ## Stress Testing
+
+ ### Scenario-Based Stress Tests
+
+ ```python
+ def run_stress_test(portfolio_returns: pd.DataFrame,
+                     scenarios: dict[str, dict]) -> pd.DataFrame:
+     """
+     Apply macroeconomic stress scenarios to a portfolio.
+     scenarios: {name: {factor: shock_value}} where factors are
+     macroeconomic variables (interest_rate, gdp_growth, unemployment, etc.)
+     and shocks are in percentage points.
+     portfolio_returns is not used in this simplified version; it is the
+     input from which the betas would be estimated.
+     """
+     # Factor sensitivities (betas from regression), in percentage points of
+     # portfolio value per one-percentage-point factor move
+     # In practice, estimated via historical regression
+     factor_betas = {
+         "interest_rate": -0.15,  # portfolio loses 15bp per 1% rate increase
+         "gdp_growth": 0.08,      # gains 8bp per 1% GDP growth
+         "unemployment": -0.12,   # loses 12bp per 1% unemployment increase
+         "equity_market": 0.45,   # 45bp per 1% equity market move
+         "credit_spread": -0.25,  # loses 25bp per 1% spread widening
+     }
+
+     results = []
+     for name, shocks in scenarios.items():
+         portfolio_impact = 0
+         for factor, shock in shocks.items():
+             beta = factor_betas.get(factor, 0)
+             portfolio_impact += beta * shock
+
+         results.append({
+             "scenario": name,
+             # beta * shock is already in percent, so no further scaling
+             "portfolio_impact_pct": round(portfolio_impact, 2),
+             "shocks": shocks,
+         })
+
+     return pd.DataFrame(results)
+
+ # Example scenarios
+ scenarios = {
+     "Mild Recession": {
+         "interest_rate": -0.5, "gdp_growth": -2.0,
+         "unemployment": 2.0, "equity_market": -15.0,
+         "credit_spread": 1.5,
+     },
+     "Severe Recession": {
+         "interest_rate": -1.0, "gdp_growth": -5.0,
+         "unemployment": 5.0, "equity_market": -40.0,
+         "credit_spread": 4.0,
+     },
+     "Rate Shock": {
+         "interest_rate": 3.0, "gdp_growth": -1.0,
+         "unemployment": 1.0, "equity_market": -10.0,
+         "credit_spread": 1.0,
+     },
+ }
+ ```
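To make the scenario arithmetic concrete, the sketch below breaks the "Mild Recession" scenario into per-factor contributions. It assumes the betas are read as percentage points of portfolio value per one-percentage-point factor move, so each product is already in percent; the helper name `scenario_impact` is hypothetical.

```python
# Sensitivities copied from the guide, read as percentage points of
# portfolio value per one-percentage-point factor move (an assumption).
factor_betas = {
    "interest_rate": -0.15,
    "gdp_growth": 0.08,
    "unemployment": -0.12,
    "equity_market": 0.45,
    "credit_spread": -0.25,
}

def scenario_impact(shocks, betas):
    """Return (total impact in percent, per-factor contributions in percent)."""
    contributions = {f: betas.get(f, 0.0) * s for f, s in shocks.items()}
    return sum(contributions.values()), contributions

mild_recession = {
    "interest_rate": -0.5, "gdp_growth": -2.0,
    "unemployment": 2.0, "equity_market": -15.0,
    "credit_spread": 1.5,
}

total, parts = scenario_impact(mild_recession, factor_betas)
# List the most damaging factors first
for factor, value in sorted(parts.items(), key=lambda kv: kv[1]):
    print(f"{factor:>14}: {value:+.3f}%")
print(f"{'total':>14}: {total:+.3f}%")
```

The breakdown shows which factor drives a scenario; here the equity shock accounts for most of the loss, while the rate cut partly offsets it.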
+
+ ## Regulatory Framework
+
+ ### Basel III Capital Requirements
+
+ | Risk Type | Measurement | Capital Charge |
+ |-----------|-------------|---------------|
+ | Market risk | FRTB (Fundamental Review of the Trading Book) | ES at 97.5%, stressed calibration |
+ | Credit risk | SA or IRB approach | PD, LGD, EAD based risk weights |
+ | Operational risk | Standardised approach (replaces Basic Indicator and AMA) | Business indicator component x ILM |
+ | Liquidity risk | LCR and NSFR ratios | High-quality liquid assets buffer |
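The move from 99% VaR to 97.5% ES under FRTB is easier to read with numbers. Under a normal loss distribution the two measures are nearly equal, which the following standard-library sketch shows; the normality assumption is for illustration only, and the function names are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()

def normal_var(confidence):
    """VaR of a standard normal loss, in units of sigma."""
    return std_normal.inv_cdf(confidence)

def normal_es(confidence):
    """Expected Shortfall of a standard normal loss, in units of sigma."""
    z = std_normal.inv_cdf(confidence)
    return std_normal.pdf(z) / (1 - confidence)

var_99 = normal_var(0.99)
es_975 = normal_es(0.975)
print(f"99% VaR:  {var_99:.3f} sigma")
print(f"97.5% ES: {es_975:.3f} sigma")
```

For fat-tailed losses the 97.5% ES rises above the 99% VaR, so the ES-based charge is more sensitive to tail risk.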
+
+ ## Tools and Libraries
+
+ - **QuantLib (Python/C++)**: Derivatives pricing and risk analytics
+ - **riskfolio-lib**: Portfolio risk and optimization in Python
+ - **arch (Python)**: GARCH models for volatility estimation
+ - **pyfolio**: Portfolio performance and risk analysis
+ - **OpenGamma Strata**: Open-source market risk analytics (Java)
+ - **Moody's Analytics / Bloomberg PORT**: Commercial risk platforms
@@ -0,0 +1,372 @@
+ ---
+ name: stata-accounting-research
+ description: "STATA code patterns for empirical accounting and finance research"
+ metadata:
+   openclaw:
+     emoji: "📒"
+     category: "domains"
+     subcategory: "finance"
+     keywords: ["STATA", "accounting", "empirical finance", "panel data", "earnings management", "audit"]
+     source: "https://github.com/stata-accounting/resources"
+ ---
+
+ # STATA Accounting Research Guide
+
+ ## Overview
+
+ Empirical accounting research relies heavily on STATA for data manipulation, statistical analysis, and robustness testing. The field has developed standardized methodological approaches -- earnings quality models, event studies, difference-in-differences for regulatory changes, and instrumental variable strategies for endogeneity -- that are implemented in a relatively stable set of STATA patterns.
+
+ This guide provides the core STATA code patterns used in top accounting journals (The Accounting Review, Journal of Accounting Research, Journal of Accounting and Economics, and Review of Accounting Studies). These patterns are drawn from commonly used research designs in financial reporting, auditing, tax, and managerial accounting research.
+
+ Whether you are estimating discretionary accruals, conducting an event study around an earnings announcement, testing the effect of auditor rotation on audit quality, or implementing a regulatory shock analysis, these patterns provide tested, reviewable STATA implementations.
+
+ ## Data Preparation
+
+ ### Loading and Cleaning COMPUSTAT Data
+
+ ```stata
28
+ * ============================================================
29
+ * COMPUSTAT Annual Data Preparation for Accounting Research
30
+ * Standard preparation used across most empirical accounting papers
31
+ * ============================================================
32
+
33
+ * Load COMPUSTAT annual data
34
+ use "compustat_annual.dta", clear
35
+
36
+ * Keep relevant variables
37
+ keep gvkey fyear datadate at sale cogs xsga dp ib oancf act lct che dlc ///
38
+ csho prcc_f ceq re dltt txp xrd ppegt ppent invt rect
39
+
40
+ * Set panel structure
41
+ destring gvkey, replace
42
+ xtset gvkey fyear
43
+
44
+ * --- Basic cleaning ---
45
+ * Drop financial firms (SIC 6000-6999) and utilities (SIC 4900-4999)
46
+ drop if inrange(sic, 6000, 6999) | inrange(sic, 4900, 4999)
47
+
48
+ * Require minimum observations
49
+ bysort gvkey: gen nobs = _N
50
+ drop if nobs < 3
51
+ drop nobs
52
+
53
+ * --- Generate common variables ---
54
+ * Total accruals (balance sheet approach)
55
+ gen total_accruals = (D.act - D.che) - (D.lct - D.dlc) - dp
56
+
57
+ * Total accruals (cash flow approach, preferred)
58
+ gen total_accruals_cf = ib - oancf
59
+
60
+ * Scale by lagged total assets
61
+ gen lag_at = L.at
62
+ gen ta_scaled = total_accruals_cf / lag_at
63
+ gen sale_scaled = sale / lag_at
64
+ gen ppe_scaled = ppent / lag_at
65
+ gen dsale = D.sale / lag_at
66
+ gen drec = D.rect / lag_at
67
+ gen roa = ib / lag_at
68
+
69
+ * Market value of equity
70
+ gen mve = csho * prcc_f
71
+
72
+ * Book-to-market ratio
73
+ gen btm = ceq / mve
74
+
75
+ * Leverage
76
+ gen leverage = (dlc + dltt) / at
77
+
78
+ * Firm size
79
+ gen size = ln(at)
80
+
81
+ * --- Winsorize at 1% and 99% ---
82
+ foreach var of varlist ta_scaled sale_scaled ppe_scaled roa btm leverage size {
83
+ winsor2 `var', replace cuts(1 99)
84
+ }
85
+
86
+ * Label variables
87
+ label var ta_scaled "Total accruals / lagged assets"
88
+ label var roa "Return on assets"
89
+ label var btm "Book-to-market ratio"
90
+ label var leverage "Total debt / total assets"
91
+ label var size "Log(total assets)"
92
+
93
+ save "compustat_clean.dta", replace
94
+ ```
+
+ ## Earnings Quality Models
+
+ ### Modified Jones Model (Dechow et al., 1995)
+
+ ```stata
+ * ============================================================
+ * Modified Jones Model: Estimate discretionary accruals
+ * Standard model for earnings management research
+ * ============================================================
+
+ use "compustat_clean.dta", clear
+
+ * --- Step 1: Estimate non-discretionary accruals by industry-year ---
+ * Jones (1991) model estimated cross-sectionally
+
+ gen inv_lag_at = 1 / lag_at
+ gen dsale_drec = dsale - drec  // Modified Jones adjustment
+
+ * Estimate by 2-digit SIC and year (require >= 15 obs per group)
+ gen sic2 = floor(sic / 100)
+
+ * Cross-sectional estimation
+ gen da_mj = .
+ gen nda_mj = .
+
+ levelsof fyear, local(years)
+ foreach y of local years {
+     levelsof sic2 if fyear == `y', local(industries)
+     foreach ind of local industries {
+         * Count observations in this industry-year
+         count if sic2 == `ind' & fyear == `y' & !missing(ta_scaled, inv_lag_at, dsale_drec, ppe_scaled)
+         if r(N) >= 15 {
+             * Estimate Jones model
+             quietly reg ta_scaled inv_lag_at dsale_drec ppe_scaled ///
+                 if sic2 == `ind' & fyear == `y', robust
+
+             * Predict non-discretionary accruals
+             quietly predict temp_nda if sic2 == `ind' & fyear == `y', xb
+             quietly replace nda_mj = temp_nda if sic2 == `ind' & fyear == `y'
+             drop temp_nda
+         }
+     }
+ }
+
+ * Discretionary accruals = Total accruals - Non-discretionary accruals
+ replace da_mj = ta_scaled - nda_mj
+
+ * Absolute discretionary accruals (common measure of earnings quality)
+ gen abs_da = abs(da_mj)
+
+ label var da_mj "Discretionary accruals (Modified Jones)"
+ label var abs_da "Absolute discretionary accruals"
+
+ save "accruals_data.dta", replace
+ ```
+
+ ### Performance-Matched Discretionary Accruals (Kothari et al., 2005)
+
+ ```stata
+ * ============================================================
+ * Kothari et al. (2005): Performance-adjusted discretionary accruals
+ * Controls for correlation between performance and accruals
+ * Continues from the Modified Jones block (uses inv_lag_at, dsale_drec, sic2)
+ * ============================================================
+
+ * Add ROA to the Jones model
+ gen da_kothari = .
+
+ levelsof fyear, local(years)
+ foreach y of local years {
+     levelsof sic2 if fyear == `y', local(industries)
+     foreach ind of local industries {
+         count if sic2 == `ind' & fyear == `y' & !missing(ta_scaled, inv_lag_at, dsale_drec, ppe_scaled, roa)
+         if r(N) >= 15 {
+             quietly reg ta_scaled inv_lag_at dsale_drec ppe_scaled roa ///
+                 if sic2 == `ind' & fyear == `y', robust
+             quietly predict temp_res if sic2 == `ind' & fyear == `y', residuals
+             quietly replace da_kothari = temp_res if sic2 == `ind' & fyear == `y'
+             drop temp_res
+         }
+     }
+ }
+
+ gen abs_da_kothari = abs(da_kothari)
+ label var da_kothari "Discretionary accruals (Kothari)"
+ label var abs_da_kothari "Absolute DA (Kothari)"
+ ```
182
+
183
+ ## Event Study
184
+
185
+ ```stata
186
+ * ============================================================
187
+ * Short-window event study around earnings announcements
188
+ * Standard methodology for capital markets research
189
+ * ============================================================
190
+
191
+ use "crsp_daily_returns.dta", clear
192
+
193
+ * Merge with event dates
194
+ merge m:1 gvkey fyear using "earnings_dates.dta", keep(match) nogen
195
+
196
+ * --- Event time in calendar days; keep [-250, +10] around the announcement ---
197
+ gen event_day = date - rdq // rdq = report date of quarterly earnings
198
+ keep if inrange(event_day, -250, 10)
199
+
200
+ * Estimate market model in estimation window
201
+ gen est_window = inrange(event_day, -250, -30)
202
+ gen event_window = inrange(event_day, -1, 1) // 3-day window [-1, +1]
203
+
204
+ * Market model: R_i = alpha + beta * R_m + epsilon
205
+ bysort permno fyear: egen has_enough = total(est_window)
206
+ keep if has_enough >= 100 // Require 100+ days in estimation window
207
+
208
+ * Estimate market model parameters (one regression per permno, pooling all of that firm's estimation-window days)
209
+ gen alpha = .
210
+ gen beta_mkt = .
211
+
212
+ levelsof permno, local(firms)
213
+ foreach p of local firms {
214
+ capture quietly reg ret mktrf if permno == `p' & est_window == 1
215
+ if _rc == 0 {
216
+ quietly replace alpha = _b[_cons] if permno == `p'
217
+ quietly replace beta_mkt = _b[mktrf] if permno == `p'
218
+ }
219
+ }
220
+
221
+ * Abnormal returns
222
+ gen ar = ret - (alpha + beta_mkt * mktrf)
223
+
224
+ * Cumulative abnormal returns [-1, +1]
225
+ bysort permno fyear (event_day): egen car_3day = total(ar) if event_window == 1
226
+
227
+ * Cross-sectional test
228
+ preserve
229
+ keep if event_day == 0
230
+ * t-test: Is average CAR different from zero?
231
+ ttest car_3day == 0
232
+ * Regression with controls
233
+ reg car_3day surprise size btm, robust
234
+ restore
235
+ ```
236
+
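The quantities computed in the block above can be summarized as follows, where $R_{mt}$ stands for whatever market return series is stored in `mktrf` and $t$ is event time relative to the announcement date `rdq`:

```latex
\begin{align*}
R_{it} &= \alpha_i + \beta_i R_{mt} + \varepsilon_{it}, & t &\in [-250, -30] \\
AR_{it} &= R_{it} - \left(\hat{\alpha}_i + \hat{\beta}_i R_{mt}\right) \\
CAR_i[-1,+1] &= \sum_{t=-1}^{+1} AR_{it}
\end{align*}
```

The estimation window ends 30 days before the announcement so that the event itself does not influence the estimates of $\hat{\alpha}_i$ and $\hat{\beta}_i$.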
237
+ ## Regression Specifications
238
+
239
+ ### Standard Panel Regression with Fixed Effects
240
+
241
+ ```stata
242
+ * ============================================================
243
+ * Standard regression specification for accounting research
244
+ * Includes firm and year fixed effects, clustered standard errors
245
+ * ============================================================
246
+
247
+ use "merged_analysis_data.dta", clear
248
+
249
+ * --- Main specification ---
250
+ * DV: Absolute discretionary accruals (earnings quality)
251
+ * Key IV: Big 4 auditor indicator
252
+
253
+ * Model 1: Pooled OLS (baseline, for comparison only)
254
+ reg abs_da big4 size leverage btm roa loss, robust
255
+ estimates store m1
256
+
257
+ * Model 2: Year fixed effects
258
+ reg abs_da big4 size leverage btm roa loss i.fyear, robust
259
+ estimates store m2
260
+
261
+ * Model 3: Industry + Year fixed effects
262
+ reg abs_da big4 size leverage btm roa loss i.sic2 i.fyear, robust
263
+ estimates store m3
264
+
265
+ * Model 4: Firm + Year fixed effects (preferred specification)
266
+ reghdfe abs_da big4 size leverage btm roa loss, absorb(gvkey fyear) ///
267
+ cluster(gvkey)
268
+ estimates store m4
269
+
270
+ * Model 5: Firm + Year FE, two-way clustering (firm and year)
271
+ reghdfe abs_da big4 size leverage btm roa loss, absorb(gvkey fyear) ///
272
+ cluster(gvkey fyear)
273
+ estimates store m5
274
+
275
+ * --- Output table ---
276
+ esttab m1 m2 m3 m4 m5 using "table_main.tex", replace ///
277
+ star(* 0.10 ** 0.05 *** 0.01) ///
278
+ b(%9.4f) se(%9.4f) ///
279
+ stats(N r2 r2_a, fmt(%9.0g %9.4f %9.4f) ///
280
+ labels("Observations" "R-squared" "Adj. R-squared")) ///
281
+ title("Effect of Auditor Type on Earnings Quality") ///
282
+ label booktabs
283
+ ```
284
+
285
+ ## Robustness Tests
286
+
287
+ ### Propensity Score Matching
288
+
289
+ ```stata
290
+ * ============================================================
291
+ * Propensity Score Matching (PSM) for endogeneity concerns
292
+ * Used when treatment assignment (e.g., Big 4 auditor) is not random
293
+ * ============================================================
294
+
295
+ * Step 1: Estimate propensity score
296
+ logit big4 size leverage btm roa loss age_firm, robust
297
+ predict pscore, pr
298
+
299
+ * Step 2: Common support check
300
+ gen cs = pscore >= 0.1 & pscore <= 0.9 // Trim extreme propensity scores
301
+
302
+ * Step 3: Nearest-neighbor matching (1:1, without replacement)
303
+ psmatch2 big4 size leverage btm roa loss if cs == 1, ///
304
+ outcome(abs_da) neighbor(1) caliper(0.01) common noreplacement
305
+
306
+ * Check covariate balance after matching
307
+ pstest size leverage btm roa loss, both
308
+
309
+ * Step 4: Re-estimate on matched sample
310
+ gen matched = _weight != .
311
+ reg abs_da big4 size leverage btm roa loss if matched == 1, robust
312
+ ```
313
+
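For reading the balance check, the standardized percentage bias that `pstest` reports for a covariate $x$ follows, to the best of our reading of its help file, the Rosenbaum and Rubin definition below. $T$ and $C$ denote the treated (Big 4) and control groups, and the statistic is computed once on the full sample and once on the matched sample:

```latex
\[
\text{Bias}(x) = 100 \times
  \frac{\bar{x}_{T} - \bar{x}_{C}}{\sqrt{\left(s_{T}^{2} + s_{C}^{2}\right)/2}}
\]
```

Matching has done its job when this value shrinks toward zero for every covariate after matching.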
314
+ ### Heckman Selection Model
315
+
316
+ ```stata
317
+ * ============================================================
318
+ * Heckman two-stage model for sample selection bias
319
+ * Example: Analyst coverage → Earnings quality
320
+ * ============================================================
321
+
322
+ * First stage: Selection equation (what determines analyst coverage?)
323
+ probit analyst_covered size btm roa institutional_ownership sp500 ///
324
+ exchange_listed, robust
325
+
326
+ * Second stage: Outcome equation with inverse Mills ratio
327
+ heckman abs_da size leverage btm roa, /// selection indicator omitted: it is constant in the selected sample
328
+ select(analyst_covered = size btm roa institutional_ownership ///
329
+ sp500 exchange_listed) ///
330
+ twostep
331
+ ```
332
+
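The two-step estimator used above can be sketched as follows. $S_i$ is the selection indicator (`analyst_covered`), $Z_i$ the selection-equation covariates, and $X_i$ the outcome-equation covariates:

```latex
\begin{align*}
\Pr(S_i = 1 \mid Z_i) &= \Phi(Z_i \gamma) \\
\lambda_i &= \frac{\phi(Z_i \hat{\gamma})}{\Phi(Z_i \hat{\gamma})} \\
E[Y_i \mid X_i, S_i = 1] &= X_i \beta + \rho \sigma_{\varepsilon} \lambda_i
\end{align*}
```

Identification is more credible when $Z_i$ contains variables excluded from $X_i$. In the specification above, `institutional_ownership`, `sp500`, and `exchange_listed` play that role.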
333
+ ## Publication-Ready Output
334
+
335
+ ```stata
336
+ * ============================================================
337
+ * Generating publication-ready tables and statistics
338
+ * ============================================================
339
+
340
+ * Summary statistics table
341
+ estpost summarize abs_da big4 size leverage btm roa loss, detail
342
+ esttab using "table_sumstats.tex", replace ///
343
+ cells("count mean sd p25 p50 p75") ///
344
+ label booktabs title("Summary Statistics")
345
+
346
+ * Correlation matrix
347
+ pwcorr abs_da big4 size leverage btm roa, star(0.05) sig
348
+ estpost correlate abs_da big4 size leverage btm roa, matrix listwise
349
+ esttab using "table_corr.tex", replace unstack not noobs ///
350
+ label booktabs title("Correlation Matrix")
351
+
352
+ * Univariate comparison (treatment vs. control)
353
+ ttest abs_da, by(big4) unequal
354
+ ranksum abs_da, by(big4)
355
+ ```
356
+
357
+ ## Best Practices
358
+
359
+ - **Always cluster standard errors** by firm (at minimum) in panel data. Two-way clustering by firm and year is increasingly required by reviewers.
360
+ - **Use `reghdfe`** for high-dimensional fixed effects. It is faster and more memory-efficient than `areg` or `xtreg, fe`.
361
+ - **Report economic magnitude.** State the effect size in interpretable units, e.g., how much the dependent variable changes (in levels or as a percentage of its mean) for a one-standard-deviation change in X.
362
+ - **Include all robustness tests** that reviewers expect: PSM, Heckman, placebo tests, entropy balancing, and alternative variable definitions.
363
+ - **Winsorize at 1% and 99%** as a default; report results at 5%/95% as a robustness check.
364
+ - **Use `eststo` and `esttab`** for consistent, automated table generation. Never hand-type regression results.
365
+
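Two of the practices above, winsorizing and reporting economic magnitude, can be illustrated with a minimal sketch. It uses only built-in commands and Stata's bundled `auto` dataset as a stand-in for firm data, so the variables (`price`, `weight`, `mpg`) and the new name `price_w` are purely illustrative and not part of the workflow in this guide.

```stata
* Sketch only: the bundled auto dataset stands in for a firm-year panel
sysuse auto, clear

* Winsorize price at the 1st and 99th percentiles into a new variable
_pctile price, percentiles(1 99)
local lo = r(r1)
local hi = r(r2)
gen price_w = min(max(price, `lo'), `hi') if !missing(price)

* Regress the winsorized outcome on two illustrative covariates
reg price_w weight mpg, robust

* Economic magnitude: effect of a one-SD change in weight
quietly summarize weight if e(sample)
local sd_x = r(sd)
quietly summarize price_w if e(sample)
local mean_y = r(mean)
display "One-SD change in weight:  " %9.2f _b[weight] * `sd_x'
display "As % of mean outcome:     " %9.2f 100 * _b[weight] * `sd_x' / `mean_y'
```

Restricting both `summarize` calls to `e(sample)` keeps the standard deviation and the mean on the same observations as the regression.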
366
+ ## References
367
+
368
+ - Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1995). Detecting Earnings Management. The Accounting Review, 70(2), 193-225.
369
+ - Kothari, S. P., Leone, A. J., & Wasley, C. E. (2005). Performance Matched Discretionary Accrual Measures. Journal of Accounting and Economics, 39(1), 163-197.
370
+ - [WRDS (Wharton Research Data Services)](https://wrds-www.wharton.upenn.edu/) -- Standard data platform for accounting/finance research
371
+ - [reghdfe documentation](https://github.com/sergiocorreia/reghdfe) -- Fast fixed-effects estimation in Stata
372
+ - Gow, I. D., Ormazabal, G., & Taylor, D. J. (2010). Correcting for Cross-Sectional and Time-Series Dependence in Accounting Research. The Accounting Review, 85(2), 483-512.