nested-learning 0.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (340)
  1. nested_learning-0.2.0/.github/ISSUE_TEMPLATE/config.yml +2 -0
  2. nested_learning-0.2.0/.github/ISSUE_TEMPLATE/eval_request.md +27 -0
  3. nested_learning-0.2.0/.github/ISSUE_TEMPLATE/faithfulness_gap.md +30 -0
  4. nested_learning-0.2.0/.github/ISSUE_TEMPLATE/perf_regression.md +32 -0
  5. nested_learning-0.2.0/.github/workflows/ci.yml +197 -0
  6. nested_learning-0.2.0/.github/workflows/release.yml +79 -0
  7. nested_learning-0.2.0/.github/workflows/security.yml +45 -0
  8. nested_learning-0.2.0/.gitignore +40 -0
  9. nested_learning-0.2.0/CHANGELOG.md +59 -0
  10. nested_learning-0.2.0/LICENSE +201 -0
  11. nested_learning-0.2.0/PKG-INFO +390 -0
  12. nested_learning-0.2.0/README.md +353 -0
  13. nested_learning-0.2.0/TODO.md +85 -0
  14. nested_learning-0.2.0/configs/ablations/cms_sparse.yaml +46 -0
  15. nested_learning-0.2.0/configs/ablations/selfmod_chunked_8_64.yaml +24 -0
  16. nested_learning-0.2.0/configs/ablations/selfmod_momentum_off.yaml +23 -0
  17. nested_learning-0.2.0/configs/ablations/selfmod_momentum_on.yaml +23 -0
  18. nested_learning-0.2.0/configs/ablations/selfmod_no_alpha.yaml +23 -0
  19. nested_learning-0.2.0/configs/ablations/selfmod_no_cms.yaml +23 -0
  20. nested_learning-0.2.0/configs/ablations/selfmod_rank1_precond_off.yaml +23 -0
  21. nested_learning-0.2.0/configs/data/continual_segments_sample.yaml +9 -0
  22. nested_learning-0.2.0/configs/data/fineweb_edu_longdoc_filtered_sample.yaml +14 -0
  23. nested_learning-0.2.0/configs/data/fineweb_edu_mixture_full.yaml +14 -0
  24. nested_learning-0.2.0/configs/data/fineweb_edu_mixture_sample.yaml +14 -0
  25. nested_learning-0.2.0/configs/data/refinedweb_mixture.yaml +48 -0
  26. nested_learning-0.2.0/configs/data/refinedweb_mixture_filtered.yaml +48 -0
  27. nested_learning-0.2.0/configs/data/refinedweb_mixture_full.yaml +48 -0
  28. nested_learning-0.2.0/configs/data/refinedweb_mixture_sample.yaml +51 -0
  29. nested_learning-0.2.0/configs/deepspeed/zero3.json +25 -0
  30. nested_learning-0.2.0/configs/hope/mid.yaml +118 -0
  31. nested_learning-0.2.0/configs/hope/mid_fsdp.yaml +47 -0
  32. nested_learning-0.2.0/configs/hope/pilot.yaml +2 -0
  33. nested_learning-0.2.0/configs/hope/pilot_attention.yaml +9 -0
  34. nested_learning-0.2.0/configs/hope/pilot_selfmod.yaml +20 -0
  35. nested_learning-0.2.0/configs/hope/pilot_transformer.yaml +9 -0
  36. nested_learning-0.2.0/configs/hope/target.yaml +145 -0
  37. nested_learning-0.2.0/configs/hope/target_fsdp.yaml +47 -0
  38. nested_learning-0.2.0/configs/mid_smoke.yaml +99 -0
  39. nested_learning-0.2.0/configs/mid_stage2.yaml +110 -0
  40. nested_learning-0.2.0/configs/mid_stage2_smoke.yaml +102 -0
  41. nested_learning-0.2.0/configs/mid_titan_baseline.yaml +92 -0
  42. nested_learning-0.2.0/configs/pilot.yaml +127 -0
  43. nested_learning-0.2.0/configs/pilot_paper_faithful.yaml +42 -0
  44. nested_learning-0.2.0/configs/pilot_selfmod_paper_faithful.yaml +18 -0
  45. nested_learning-0.2.0/configs/pilot_smoke.yaml +80 -0
  46. nested_learning-0.2.0/configs/resolved/cms_sparse_eval.yaml +105 -0
  47. nested_learning-0.2.0/configs/resolved/phase2_pilot_attention_eval.yaml +49 -0
  48. nested_learning-0.2.0/configs/resolved/phase2_pilot_transformer_eval.yaml +49 -0
  49. nested_learning-0.2.0/docs/BUG_REPORT_CHECKLIST.md +30 -0
  50. nested_learning-0.2.0/docs/COMPATIBILITY_MATRIX.md +43 -0
  51. nested_learning-0.2.0/docs/FSDP_SCALING_GUIDE.md +61 -0
  52. nested_learning-0.2.0/docs/IMPLEMENTATION_STATUS.md +34 -0
  53. nested_learning-0.2.0/docs/P4_REMEDIATION_PLAN.md +41 -0
  54. nested_learning-0.2.0/docs/PACKAGE_RELEASE_CHECKLIST.md +39 -0
  55. nested_learning-0.2.0/docs/PAPER_COMPLIANCE.md +381 -0
  56. nested_learning-0.2.0/docs/PHASE2_LONG_CONTEXT_COMPARISON.md +49 -0
  57. nested_learning-0.2.0/docs/PHASE_2_PLAN.md +119 -0
  58. nested_learning-0.2.0/docs/PYPI_TRUSTED_PUBLISHING.md +52 -0
  59. nested_learning-0.2.0/docs/STREAMING_CONTRACT.md +109 -0
  60. nested_learning-0.2.0/docs/VERSIONING_POLICY.md +35 -0
  61. nested_learning-0.2.0/docs/compute_plan.md +22 -0
  62. nested_learning-0.2.0/docs/continual_classification_eval.md +82 -0
  63. nested_learning-0.2.0/docs/continual_eval.md +40 -0
  64. nested_learning-0.2.0/docs/data_pipeline.md +219 -0
  65. nested_learning-0.2.0/docs/env_matrix.md +63 -0
  66. nested_learning-0.2.0/docs/experiments_report.md +197 -0
  67. nested_learning-0.2.0/docs/future_directions.md +34 -0
  68. nested_learning-0.2.0/docs/phase2_comparison.md +70 -0
  69. nested_learning-0.2.0/docs/release_checklist.md +43 -0
  70. nested_learning-0.2.0/docs/scaling_guidance.md +72 -0
  71. nested_learning-0.2.0/docs/spec_interfaces.md +23 -0
  72. nested_learning-0.2.0/docs/sprint_next_plan.md +95 -0
  73. nested_learning-0.2.0/docs/stage2_plan.md +158 -0
  74. nested_learning-0.2.0/docs/stage2_progress.md +39 -0
  75. nested_learning-0.2.0/docs/templates/checkpoint_report.md +50 -0
  76. nested_learning-0.2.0/docs/zeroshot_eval.md +118 -0
  77. nested_learning-0.2.0/eval/continual_dummy.json +11 -0
  78. nested_learning-0.2.0/eval/continual_mid_stage2.json +43 -0
  79. nested_learning-0.2.0/eval/continual_mid_stage2_smoke.json +43 -0
  80. nested_learning-0.2.0/eval/continual_mid_stage2_ts10.json +43 -0
  81. nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single120_clip.json +43 -0
  82. nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single140_schedC.json +43 -0
  83. nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single220_schedD.json +43 -0
  84. nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single80.json +43 -0
  85. nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single80lr2e5.json +43 -0
  86. nested_learning-0.2.0/eval/continual_mid_stage2_ts20.json +43 -0
  87. nested_learning-0.2.0/eval/continual_mid_titan_baseline.json +43 -0
  88. nested_learning-0.2.0/eval/continual_pilot.json +73 -0
  89. nested_learning-0.2.0/eval/continual_pilot_cms_nochunk_step5000.json +11 -0
  90. nested_learning-0.2.0/eval/continual_pilot_cms_sparse_step5000.json +11 -0
  91. nested_learning-0.2.0/eval/continual_pilot_multi.json +119 -0
  92. nested_learning-0.2.0/eval/continual_pilot_opt_adamw_step5000.json +11 -0
  93. nested_learning-0.2.0/eval/continual_pilot_opt_muon_step5000.json +11 -0
  94. nested_learning-0.2.0/eval/continual_pilot_selfmod_off_step5000.json +11 -0
  95. nested_learning-0.2.0/eval/continual_pilot_step22000.json +11 -0
  96. nested_learning-0.2.0/eval/continual_pilot_step230000.json +41 -0
  97. nested_learning-0.2.0/eval/continual_pilot_teach05_long_step25000.json +11 -0
  98. nested_learning-0.2.0/eval/continual_pilot_teach05_step2000.json +11 -0
  99. nested_learning-0.2.0/eval/continual_pilot_teach15_long_step25000.json +11 -0
  100. nested_learning-0.2.0/eval/continual_pilot_teach15_step2000.json +11 -0
  101. nested_learning-0.2.0/eval/continual_smoke.json +43 -0
  102. nested_learning-0.2.0/eval/continual_titan.json +73 -0
  103. nested_learning-0.2.0/eval/continual_titan_relaunch_step001000.json +41 -0
  104. nested_learning-0.2.0/eval/continual_titan_step25000.json +41 -0
  105. nested_learning-0.2.0/eval/niah_dummy.json +4 -0
  106. nested_learning-0.2.0/eval/niah_mid_stage2.json +4 -0
  107. nested_learning-0.2.0/eval/niah_mid_stage2_smoke.json +5 -0
  108. nested_learning-0.2.0/eval/niah_mid_stage2_ts10.json +3 -0
  109. nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single120_clip.json +3 -0
  110. nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single140_schedC.json +3 -0
  111. nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single220_schedD.json +4 -0
  112. nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single80.json +3 -0
  113. nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single80lr2e5.json +3 -0
  114. nested_learning-0.2.0/eval/niah_mid_stage2_ts20.json +3 -0
  115. nested_learning-0.2.0/eval/niah_mid_titan_baseline.json +4 -0
  116. nested_learning-0.2.0/eval/niah_pilot.json +38 -0
  117. nested_learning-0.2.0/eval/niah_pilot_cms_nochunk_step5000.json +8 -0
  118. nested_learning-0.2.0/eval/niah_pilot_cms_sparse_step5000.json +8 -0
  119. nested_learning-0.2.0/eval/niah_pilot_opt_adamw_step5000.json +8 -0
  120. nested_learning-0.2.0/eval/niah_pilot_opt_muon_step5000.json +8 -0
  121. nested_learning-0.2.0/eval/niah_pilot_selfmod_off_step5000.json +8 -0
  122. nested_learning-0.2.0/eval/niah_pilot_step22000.json +5 -0
  123. nested_learning-0.2.0/eval/niah_pilot_step230000.json +28 -0
  124. nested_learning-0.2.0/eval/niah_pilot_teach05_long_step25000.json +8 -0
  125. nested_learning-0.2.0/eval/niah_pilot_teach05_step2000.json +8 -0
  126. nested_learning-0.2.0/eval/niah_pilot_teach15_long_step25000.json +8 -0
  127. nested_learning-0.2.0/eval/niah_pilot_teach15_step2000.json +8 -0
  128. nested_learning-0.2.0/eval/niah_smoke.json +5 -0
  129. nested_learning-0.2.0/eval/niah_titan.json +38 -0
  130. nested_learning-0.2.0/eval/niah_titan_relaunch_step001000.json +30 -0
  131. nested_learning-0.2.0/eval/niah_titan_step25000.json +28 -0
  132. nested_learning-0.2.0/eval/passkey_pilot.json +21 -0
  133. nested_learning-0.2.0/eval/passkey_pilot_step230000.json +11 -0
  134. nested_learning-0.2.0/eval/passkey_titan.json +21 -0
  135. nested_learning-0.2.0/eval/passkey_titan_relaunch_step001000.json +13 -0
  136. nested_learning-0.2.0/eval/passkey_titan_step25000.json +11 -0
  137. nested_learning-0.2.0/eval/pg19_pilot.json +9 -0
  138. nested_learning-0.2.0/eval/pg19_pilot_step230000.json +7 -0
  139. nested_learning-0.2.0/eval/pg19_titan.json +9 -0
  140. nested_learning-0.2.0/eval/pg19_titan_relaunch_step001000.json +9 -0
  141. nested_learning-0.2.0/eval/pg19_titan_step25000.json +7 -0
  142. nested_learning-0.2.0/eval/phase2_compare_smoke_lastlayer_metrics.json +131 -0
  143. nested_learning-0.2.0/eval/zeroshot_full_smoke.json +16 -0
  144. nested_learning-0.2.0/eval/zeroshot_mid_stage2.json +8 -0
  145. nested_learning-0.2.0/eval/zeroshot_mid_stage2_smoke.json +16 -0
  146. nested_learning-0.2.0/eval/zeroshot_mid_stage2_smoke_piqa_baseline.json +4 -0
  147. nested_learning-0.2.0/eval/zeroshot_mid_stage2_smoke_piqa_mem.json +4 -0
  148. nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10.json +6 -0
  149. nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single120_clip.json +6 -0
  150. nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single140_schedC.json +6 -0
  151. nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single220_schedD.json +6 -0
  152. nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single80.json +6 -0
  153. nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single80lr2e5.json +6 -0
  154. nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts20.json +6 -0
  155. nested_learning-0.2.0/eval/zeroshot_mid_titan_baseline.json +6 -0
  156. nested_learning-0.2.0/eval/zeroshot_pilot.json +155 -0
  157. nested_learning-0.2.0/eval/zeroshot_pilot_cms_nochunk_step5000.json +20 -0
  158. nested_learning-0.2.0/eval/zeroshot_pilot_cms_sparse_step5000.json +20 -0
  159. nested_learning-0.2.0/eval/zeroshot_pilot_debug.json +10 -0
  160. nested_learning-0.2.0/eval/zeroshot_pilot_dummy_piqa.json +4 -0
  161. nested_learning-0.2.0/eval/zeroshot_pilot_opt_adamw_step5000.json +10 -0
  162. nested_learning-0.2.0/eval/zeroshot_pilot_opt_muon_step5000.json +10 -0
  163. nested_learning-0.2.0/eval/zeroshot_pilot_selfmod_off_step5000.json +20 -0
  164. nested_learning-0.2.0/eval/zeroshot_pilot_step22000.json +4 -0
  165. nested_learning-0.2.0/eval/zeroshot_pilot_step230000.json +65 -0
  166. nested_learning-0.2.0/eval/zeroshot_pilot_teach05_long_step25000.json +20 -0
  167. nested_learning-0.2.0/eval/zeroshot_pilot_teach05_step2000.json +20 -0
  168. nested_learning-0.2.0/eval/zeroshot_pilot_teach15_long_step25000.json +20 -0
  169. nested_learning-0.2.0/eval/zeroshot_pilot_teach15_step2000.json +20 -0
  170. nested_learning-0.2.0/eval/zeroshot_smoke.json +4 -0
  171. nested_learning-0.2.0/eval/zeroshot_titan.json +155 -0
  172. nested_learning-0.2.0/eval/zeroshot_titan_relaunch_step001000.json +83 -0
  173. nested_learning-0.2.0/eval/zeroshot_titan_step25000.json +65 -0
  174. nested_learning-0.2.0/google_papers/Nested_Learning/Nested_Learning.json +270 -0
  175. nested_learning-0.2.0/google_papers/Nested_Learning/Nested_Learning.md +643 -0
  176. nested_learning-0.2.0/google_papers/Nested_Learning.pdf +0 -0
  177. nested_learning-0.2.0/google_papers/TITANs/TITANs.json +378 -0
  178. nested_learning-0.2.0/google_papers/TITANs/TITANs.md +711 -0
  179. nested_learning-0.2.0/google_papers/TITANs.pdf +0 -0
  180. nested_learning-0.2.0/pyproject.toml +83 -0
  181. nested_learning-0.2.0/reports/ablations.md +151 -0
  182. nested_learning-0.2.0/reports/cadence_mechanism_audit_smoke.json +27 -0
  183. nested_learning-0.2.0/reports/compliance_mechanism_audit_smoke.json +73 -0
  184. nested_learning-0.2.0/reports/compliance_summary_pilot.json +73 -0
  185. nested_learning-0.2.0/reports/compliance_summary_pilot_paper_faithful.json +42 -0
  186. nested_learning-0.2.0/reports/next_backlog_scoped.md +8 -0
  187. nested_learning-0.2.0/reports/plots/continual_pilot_refinedweb.png +0 -0
  188. nested_learning-0.2.0/reports/security_release_gate.md +33 -0
  189. nested_learning-0.2.0/reports/sprint_completion_report.md +37 -0
  190. nested_learning-0.2.0/reports/stage2_smoke.md +94 -0
  191. nested_learning-0.2.0/scripts/__init__.py +1 -0
  192. nested_learning-0.2.0/scripts/checkpoint/verify.py +24 -0
  193. nested_learning-0.2.0/scripts/checks/check_data_script_help.sh +36 -0
  194. nested_learning-0.2.0/scripts/checks/check_git_tracked_sizes.sh +26 -0
  195. nested_learning-0.2.0/scripts/checks/check_readme_commands.sh +11 -0
  196. nested_learning-0.2.0/scripts/checks/compliance_report.py +195 -0
  197. nested_learning-0.2.0/scripts/checks/run_fidelity_ci_subset.sh +38 -0
  198. nested_learning-0.2.0/scripts/checks/tokenizer_coverage_guard.py +98 -0
  199. nested_learning-0.2.0/scripts/checks/verify_docs_refs.py +206 -0
  200. nested_learning-0.2.0/scripts/checks/verify_update_cadence.py +138 -0
  201. nested_learning-0.2.0/scripts/compute/create_reservations.sh +38 -0
  202. nested_learning-0.2.0/scripts/data/__init__.py +2 -0
  203. nested_learning-0.2.0/scripts/data/check_tokenizer.py +80 -0
  204. nested_learning-0.2.0/scripts/data/check_tokenizer_coverage.py +34 -0
  205. nested_learning-0.2.0/scripts/data/filter_corpus.py +124 -0
  206. nested_learning-0.2.0/scripts/data/process_mixture.py +51 -0
  207. nested_learning-0.2.0/scripts/data/run_full.sh +169 -0
  208. nested_learning-0.2.0/scripts/data/run_sample.sh +110 -0
  209. nested_learning-0.2.0/scripts/data/shard_corpus.py +161 -0
  210. nested_learning-0.2.0/scripts/data/train_tokenizer.py +168 -0
  211. nested_learning-0.2.0/scripts/data/validate_mixture.py +73 -0
  212. nested_learning-0.2.0/scripts/eval/__init__.py +1 -0
  213. nested_learning-0.2.0/scripts/eval/compare_variants.py +523 -0
  214. nested_learning-0.2.0/scripts/eval/continual.py +262 -0
  215. nested_learning-0.2.0/scripts/eval/continual_classification.py +177 -0
  216. nested_learning-0.2.0/scripts/eval/niah.py +210 -0
  217. nested_learning-0.2.0/scripts/eval/niah_suite.py +408 -0
  218. nested_learning-0.2.0/scripts/eval/passkey.py +178 -0
  219. nested_learning-0.2.0/scripts/eval/pg19_perplexity.py +175 -0
  220. nested_learning-0.2.0/scripts/eval/phase2_memorization_delta_smoke.py +100 -0
  221. nested_learning-0.2.0/scripts/eval/plot_continual_classification.py +66 -0
  222. nested_learning-0.2.0/scripts/eval/plot_forgetting.py +44 -0
  223. nested_learning-0.2.0/scripts/eval/plot_niah_suite.py +61 -0
  224. nested_learning-0.2.0/scripts/eval/run_pilot_suite.sh +200 -0
  225. nested_learning-0.2.0/scripts/eval/summarize_eval.py +110 -0
  226. nested_learning-0.2.0/scripts/eval/zeroshot.py +399 -0
  227. nested_learning-0.2.0/scripts/package_pilot_release.sh +166 -0
  228. nested_learning-0.2.0/scripts/run_cpu_ddp_smoke.sh +18 -0
  229. nested_learning-0.2.0/scripts/run_e2e_smoke.sh +64 -0
  230. nested_learning-0.2.0/scripts/run_mechanism_audit_smoke.sh +36 -0
  231. nested_learning-0.2.0/scripts/run_smoke.sh +20 -0
  232. nested_learning-0.2.0/scripts/tests/run_passkey_smoke.sh +35 -0
  233. nested_learning-0.2.0/src/nested_learning/__init__.py +12 -0
  234. nested_learning-0.2.0/src/nested_learning/__main__.py +12 -0
  235. nested_learning-0.2.0/src/nested_learning/assoc_memory.py +23 -0
  236. nested_learning-0.2.0/src/nested_learning/backbones.py +147 -0
  237. nested_learning-0.2.0/src/nested_learning/capabilities.py +104 -0
  238. nested_learning-0.2.0/src/nested_learning/cli.py +253 -0
  239. nested_learning-0.2.0/src/nested_learning/cms.py +92 -0
  240. nested_learning-0.2.0/src/nested_learning/config_utils.py +50 -0
  241. nested_learning-0.2.0/src/nested_learning/continual_classification.py +136 -0
  242. nested_learning-0.2.0/src/nested_learning/continual_streaming.py +283 -0
  243. nested_learning-0.2.0/src/nested_learning/data.py +153 -0
  244. nested_learning-0.2.0/src/nested_learning/device.py +21 -0
  245. nested_learning-0.2.0/src/nested_learning/eval_state.py +72 -0
  246. nested_learning-0.2.0/src/nested_learning/fast_state.py +108 -0
  247. nested_learning-0.2.0/src/nested_learning/functional.py +69 -0
  248. nested_learning-0.2.0/src/nested_learning/hope/__init__.py +0 -0
  249. nested_learning-0.2.0/src/nested_learning/hope/block.py +1973 -0
  250. nested_learning-0.2.0/src/nested_learning/hope/self_mod.py +40 -0
  251. nested_learning-0.2.0/src/nested_learning/instrumentation.py +38 -0
  252. nested_learning-0.2.0/src/nested_learning/levels.py +94 -0
  253. nested_learning-0.2.0/src/nested_learning/logging_utils.py +64 -0
  254. nested_learning-0.2.0/src/nested_learning/memorize.py +382 -0
  255. nested_learning-0.2.0/src/nested_learning/model.py +604 -0
  256. nested_learning-0.2.0/src/nested_learning/optim/__init__.py +0 -0
  257. nested_learning-0.2.0/src/nested_learning/optim/deep.py +102 -0
  258. nested_learning-0.2.0/src/nested_learning/optim/factory.py +13 -0
  259. nested_learning-0.2.0/src/nested_learning/optim/m3.py +121 -0
  260. nested_learning-0.2.0/src/nested_learning/optim/manager.py +151 -0
  261. nested_learning-0.2.0/src/nested_learning/titan/__init__.py +0 -0
  262. nested_learning-0.2.0/src/nested_learning/titan/memory.py +88 -0
  263. nested_learning-0.2.0/src/nested_learning/titan/model.py +412 -0
  264. nested_learning-0.2.0/src/nested_learning/titan/self_modifying.py +724 -0
  265. nested_learning-0.2.0/src/nested_learning/tokenizer.py +28 -0
  266. nested_learning-0.2.0/src/nested_learning/tokenizer_coverage.py +77 -0
  267. nested_learning-0.2.0/src/nested_learning/training.py +1600 -0
  268. nested_learning-0.2.0/src/nested_learning/transformer.py +104 -0
  269. nested_learning-0.2.0/tests/conftest.py +7 -0
  270. nested_learning-0.2.0/tests/data/passkey_corpus.txt +3 -0
  271. nested_learning-0.2.0/tests/data/tiny_tokenizer.model +0 -0
  272. nested_learning-0.2.0/tests/data/tiny_tokenizer.vocab +128 -0
  273. nested_learning-0.2.0/tests/test_algorithm_mode_grad.py +47 -0
  274. nested_learning-0.2.0/tests/test_attention_cache.py +74 -0
  275. nested_learning-0.2.0/tests/test_attention_features.py +45 -0
  276. nested_learning-0.2.0/tests/test_boundary_state_mode.py +71 -0
  277. nested_learning-0.2.0/tests/test_boundary_state_training_loop.py +66 -0
  278. nested_learning-0.2.0/tests/test_build_model_from_cfg_selfmod.py +44 -0
  279. nested_learning-0.2.0/tests/test_checkpoint_metadata_and_eval_loaders.py +89 -0
  280. nested_learning-0.2.0/tests/test_cli_tooling.py +135 -0
  281. nested_learning-0.2.0/tests/test_cms.py +91 -0
  282. nested_learning-0.2.0/tests/test_cms_cross_call.py +118 -0
  283. nested_learning-0.2.0/tests/test_cms_delta_rule.py +43 -0
  284. nested_learning-0.2.0/tests/test_cms_flush_partial.py +47 -0
  285. nested_learning-0.2.0/tests/test_compare_variants_cli.py +100 -0
  286. nested_learning-0.2.0/tests/test_compile_toggle.py +63 -0
  287. nested_learning-0.2.0/tests/test_compliance_report.py +60 -0
  288. nested_learning-0.2.0/tests/test_continual_classification.py +119 -0
  289. nested_learning-0.2.0/tests/test_continual_eval_state_mode.py +81 -0
  290. nested_learning-0.2.0/tests/test_data_scripts_help.py +13 -0
  291. nested_learning-0.2.0/tests/test_data_split_fallbacks.py +126 -0
  292. nested_learning-0.2.0/tests/test_determinism_seed.py +42 -0
  293. nested_learning-0.2.0/tests/test_device_resolution.py +10 -0
  294. nested_learning-0.2.0/tests/test_distributed_fail_fast.py +90 -0
  295. nested_learning-0.2.0/tests/test_eval_builders.py +36 -0
  296. nested_learning-0.2.0/tests/test_eval_state.py +58 -0
  297. nested_learning-0.2.0/tests/test_eval_state_cli.py +90 -0
  298. nested_learning-0.2.0/tests/test_faithfulness_harness.py +57 -0
  299. nested_learning-0.2.0/tests/test_fast_state_batch_semantics.py +40 -0
  300. nested_learning-0.2.0/tests/test_fast_state_forward_equivalence.py +25 -0
  301. nested_learning-0.2.0/tests/test_fast_state_meta_grads.py +37 -0
  302. nested_learning-0.2.0/tests/test_fast_state_selfmod_meta_grads.py +54 -0
  303. nested_learning-0.2.0/tests/test_git_tracked_sizes_check.py +13 -0
  304. nested_learning-0.2.0/tests/test_hope_block.py +29 -0
  305. nested_learning-0.2.0/tests/test_hope_selfmod_fast_state_meta_unchanged.py +31 -0
  306. nested_learning-0.2.0/tests/test_hope_selfmod_integration.py +30 -0
  307. nested_learning-0.2.0/tests/test_hope_selfmod_update_pass.py +34 -0
  308. nested_learning-0.2.0/tests/test_levels.py +15 -0
  309. nested_learning-0.2.0/tests/test_m3.py +145 -0
  310. nested_learning-0.2.0/tests/test_m3_slow_timing.py +27 -0
  311. nested_learning-0.2.0/tests/test_memorization.py +179 -0
  312. nested_learning-0.2.0/tests/test_model.py +19 -0
  313. nested_learning-0.2.0/tests/test_model_streaming_cadence.py +186 -0
  314. nested_learning-0.2.0/tests/test_online_chunking.py +195 -0
  315. nested_learning-0.2.0/tests/test_optim.py +62 -0
  316. nested_learning-0.2.0/tests/test_optimizer_param_policy.py +76 -0
  317. nested_learning-0.2.0/tests/test_package_release_script.py +50 -0
  318. nested_learning-0.2.0/tests/test_paper_faithful_configs.py +71 -0
  319. nested_learning-0.2.0/tests/test_phase2_memorization_delta.py +50 -0
  320. nested_learning-0.2.0/tests/test_residual_mlp_memory.py +31 -0
  321. nested_learning-0.2.0/tests/test_run_features.py +70 -0
  322. nested_learning-0.2.0/tests/test_self_modifying_titans.py +75 -0
  323. nested_learning-0.2.0/tests/test_selfmod_adaptive_q.py +24 -0
  324. nested_learning-0.2.0/tests/test_selfmod_dgd_linear.py +43 -0
  325. nested_learning-0.2.0/tests/test_selfmod_grad_flow.py +33 -0
  326. nested_learning-0.2.0/tests/test_selfmod_local_conv.py +17 -0
  327. nested_learning-0.2.0/tests/test_selfmod_online.py +30 -0
  328. nested_learning-0.2.0/tests/test_strict_streaming_contract.py +201 -0
  329. nested_learning-0.2.0/tests/test_surprise_metric.py +170 -0
  330. nested_learning-0.2.0/tests/test_surprise_override.py +50 -0
  331. nested_learning-0.2.0/tests/test_teach_signal.py +231 -0
  332. nested_learning-0.2.0/tests/test_tied_weight_guard.py +34 -0
  333. nested_learning-0.2.0/tests/test_variants.py +67 -0
  334. nested_learning-0.2.0/tests/test_verify_docs_refs.py +63 -0
  335. nested_learning-0.2.0/tests/test_verify_update_cadence.py +118 -0
  336. nested_learning-0.2.0/train.py +18 -0
  337. nested_learning-0.2.0/train_deepspeed.py +121 -0
  338. nested_learning-0.2.0/train_dist.py +36 -0
  339. nested_learning-0.2.0/train_fsdp.py +199 -0
  340. nested_learning-0.2.0/uv.lock +3339 -0
@@ -0,0 +1,2 @@
+ blank_issues_enabled: false
+ contact_links: []
@@ -0,0 +1,27 @@
+ ---
+ name: Evaluation request
+ about: Propose a new benchmark or diagnostic to add
+ title: "[Eval] "
+ labels: ["evaluation", "needs-triage"]
+ assignees: []
+ ---
+
+ ## Motivation
+ Why is this evaluation important for HOPE/TITAN reproduction?
+
+ ## Task details
+ - Dataset / benchmark:
+ - Metric(s):
+ - Expected runtime / hardware:
+
+ ## Environment target
+ - OS:
+ - Python:
+ - Torch:
+ - Preferred backend (`cpu` / `cuda` / `mps` / `rocm`):
+
+ ## Implementation sketch
+ Outline scripts/flags needed (e.g., extend `scripts/eval/zeroshot.py`).
+
+ ## Acceptance criteria
+ Describe what needs to be captured (JSON fields, plots, etc.).
@@ -0,0 +1,30 @@
+ ---
+ name: Faithfulness gap
+ about: Report deviations vs. the Nested Learning / HOPE specs
+ title: "[Faithfulness] "
+ labels: ["faithfulness", "needs-triage"]
+ assignees: []
+ ---
+
+ ## Summary
+ Describe the suspected deviation (cite paper section/equation).
+
+ ## Evidence
+ - Config(s) / checkpoints affected
+ - Logs / screenshots / metrics
+ - Steps to reproduce
+
+ ## Environment
+ - OS:
+ - Python:
+ - Torch:
+ - Backend (`cpu` / `cuda` / `mps` / `rocm`):
+ - GPU/accelerator model (if any):
+
+ If using ROCm: this project currently treats ROCm support as best-effort. Include HIP/ROCm version and exact torch build.
+
+ ## Expected behavior
+ What should happen according to the paper?
+
+ ## Additional context
+ Add any extra notes, e.g., suggested fix or related PRs.
@@ -0,0 +1,32 @@
+ ---
+ name: Performance regression
+ about: Report a training / eval performance drop vs. baseline
+ title: "[Perf] "
+ labels: ["performance", "needs-triage"]
+ assignees: []
+ ---
+
+ ## Summary
+ Describe the regression and the baseline you’re comparing against.
+
+ ## Baseline
+ - Config / checkpoint:
+ - Metrics (loss / ppl / eval scores):
+
+ ## Repro steps
+ Exact commands with overrides, plus hardware details.
+
+ ## Environment
+ - OS:
+ - Python:
+ - Torch:
+ - Backend (`cpu` / `cuda` / `mps` / `rocm`):
+ - GPU/accelerator model (if any):
+
+ If using ROCm: this project currently treats ROCm support as best-effort. Include HIP/ROCm version and exact torch build.
+
+ ## Logs / artifacts
+ Attach relevant logs, W&B links, or JSON eval files.
+
+ ## Suspected cause
+ Optional theory or related commits/PRs.
@@ -0,0 +1,197 @@
+ name: CI
+
+ on:
+   push:
+     branches: ["main"]
+   pull_request:
+     branches: ["main"]
+
+ jobs:
+   lint-and-test:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Set up uv
+         uses: astral-sh/setup-uv@v3
+         with:
+           version: "0.9.8"
+
+       - name: Sync dependencies
+         run: uv sync --all-extras --dev
+
+       - name: Ruff
+         run: uv run ruff check .
+
+       - name: Mypy
+         run: uv run mypy src
+
+       - name: Verify docs path references
+         run: uv run python scripts/checks/verify_docs_refs.py
+
+       - name: Verify README critical commands
+         run: bash scripts/checks/check_readme_commands.sh
+
+       - name: Guard tracked file sizes / artifact extensions
+         run: bash scripts/checks/check_git_tracked_sizes.sh
+
+       - name: Verify scripts/data help exits cleanly
+         run: bash scripts/checks/check_data_script_help.sh
+
+       - name: Pytest
+         run: uv run pytest
+
+   cross-platform-smoke:
+     strategy:
+       fail-fast: false
+       matrix:
+         os: [ubuntu-latest, macos-latest, windows-latest]
+         python-version: ["3.10", "3.12"]
+     runs-on: ${{ matrix.os }}
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: ${{ matrix.python-version }}
+
+       - name: Set up uv
+         uses: astral-sh/setup-uv@v3
+         with:
+           version: "0.9.8"
+
+       - name: Sync dependencies
+         run: uv sync --dev
+
+       - name: CLI help + doctor + smoke
+         run: |
+           uv run nl --help
+           uv run nl doctor --json
+           uv run nl smoke --config-name pilot_smoke --device cpu --batch-size 1 --seq-len 8
+           uv run python -m nested_learning --help
+
+   wheel-install-smoke:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Set up uv
+         uses: astral-sh/setup-uv@v3
+         with:
+           version: "0.9.8"
+
+       - name: Build wheel
+         run: uv build
+
+       - name: Install wheel in isolated venv
+         run: |
+           python -m venv /tmp/wheel-smoke
+           /tmp/wheel-smoke/bin/python -m pip install --upgrade pip
+           /tmp/wheel-smoke/bin/python -m pip install dist/*.whl
+
+       - name: Verify wheel entrypoints outside repo configs
+         run: |
+           /tmp/wheel-smoke/bin/python -m nested_learning --help
+           /tmp/wheel-smoke/bin/python -m nested_learning doctor --json
+           /tmp/wheel-smoke/bin/python - <<'PY'
+           import subprocess
+           import sys
+           import tempfile
+
+           tmp = tempfile.mkdtemp(prefix="nl-wheel-smoke-")
+           cmd = [
+               sys.executable,
+               "-m",
+               "nested_learning",
+               "smoke",
+               "--config-name",
+               "pilot_smoke",
+               "--device",
+               "cpu",
+               "--batch-size",
+               "1",
+               "--seq-len",
+               "8",
+           ]
+           subprocess.run(cmd, cwd=tmp, check=True)
+           PY
+
+   cpu-ddp-smoke:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Set up uv
+         uses: astral-sh/setup-uv@v3
+         with:
+           version: "0.9.8"
+
+       - name: Sync dependencies
+         run: uv sync --all-extras --dev
+
+       - name: CPU DDP smoke (gloo backend)
+         run: bash scripts/run_cpu_ddp_smoke.sh
+
+   passkey-smoke:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Set up uv
+         uses: astral-sh/setup-uv@v3
+         with:
+           version: "0.9.8"
+
+       - name: Sync dependencies
+         run: uv sync --all-extras --dev
+
+       - name: Run synthetic passkey memorization test
+         run: bash scripts/tests/run_passkey_smoke.sh
+
+   fidelity-subset:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Set up uv
+         uses: astral-sh/setup-uv@v3
+         with:
+           version: "0.9.8"
+
+       - name: Sync dependencies
+         run: uv sync --all-extras --dev
+
+       - name: Run fidelity subset + compliance report
+         run: bash scripts/checks/run_fidelity_ci_subset.sh
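Reviewer note on the CI workflow hunk above: the `cross-platform-smoke` job declares a matrix of three `os` values and two `python-version` values, and GitHub Actions runs one job per combination. The sketch below only enumerates those combinations to show how many smoke jobs the matrix produces. It is an illustration and not part of the package; the variable names are made up here.

```python
from itertools import product

# Values copied from the cross-platform-smoke matrix in the CI workflow.
os_values = ["ubuntu-latest", "macos-latest", "windows-latest"]
python_values = ["3.10", "3.12"]

# One job per (os, python-version) pair, mirroring matrix expansion.
matrix_jobs = [
    {"os": os_name, "python-version": py}
    for os_name, py in product(os_values, python_values)
]

for job in matrix_jobs:
    print(job)
```

Because `fail-fast: false` is set, a failure in one combination should not cancel the remaining ones, so all six results are reported.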
@@ -0,0 +1,79 @@
+ name: Release
+
+ on:
+   push:
+     tags:
+       - "v*"
+
+ permissions:
+   contents: write
+   id-token: write
+
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Set up uv
+         uses: astral-sh/setup-uv@v3
+         with:
+           version: "0.9.8"
+
+       - name: Build source and wheel distributions
+         run: uv build
+
+       - name: Twine check
+         run: uvx twine check dist/*
+
+       - name: Upload dist artifacts
+         uses: actions/upload-artifact@v4
+         with:
+           name: dist
+           path: dist/*
+
+   publish-testpypi:
+     if: contains(github.ref_name, 'rc')
+     needs: build
+     runs-on: ubuntu-latest
+     environment:
+       name: testpypi
+       url: https://test.pypi.org/p/nested-learning
+     steps:
+       - name: Download dist artifacts
+         uses: actions/download-artifact@v4
+         with:
+           name: dist
+           path: dist
+
+       - name: Publish to TestPyPI via Trusted Publishing
+         uses: pypa/gh-action-pypi-publish@release/v1
+         with:
+           repository-url: https://test.pypi.org/legacy/
+           packages-dir: dist/
+
+   publish-pypi:
+     if: ${{ !contains(github.ref_name, 'rc') }}
+     needs: build
+     runs-on: ubuntu-latest
+     environment:
+       name: pypi
+       url: https://pypi.org/p/nested-learning
+     steps:
+       - name: Download dist artifacts
+         uses: actions/download-artifact@v4
+         with:
+           name: dist
+           path: dist
+
+       - name: Publish to PyPI via Trusted Publishing
+         uses: pypa/gh-action-pypi-publish@release/v1
+         with:
+           packages-dir: dist/
+
@@ -0,0 +1,45 @@
1
+ name: Security
2
+
3
+ on:
4
+ push:
5
+ branches: ["main"]
6
+ pull_request:
7
+ branches: ["main"]
8
+ schedule:
9
+ - cron: "0 6 * * 1"
10
+
11
+ jobs:
12
+ dependency-audit:
13
+ runs-on: ubuntu-latest
14
+ steps:
15
+ - name: Checkout repository
16
+ uses: actions/checkout@v4
17
+
18
+ - name: Set up Python
19
+ uses: actions/setup-python@v5
20
+ with:
21
+ python-version: "3.12"
22
+
23
+ - name: Set up uv
24
+ uses: astral-sh/setup-uv@v3
25
+ with:
26
+ version: "0.9.8"
27
+
28
+ - name: Export requirements
29
+ run: uv export --all-extras --dev --format requirements-txt --output-file /tmp/requirements.txt
30
+
31
+ - name: pip-audit
32
+ run: uvx pip-audit -r /tmp/requirements.txt
33
+ continue-on-error: true
34
+
35
+ secret-scan:
36
+ runs-on: ubuntu-latest
37
+ steps:
38
+ - name: Checkout repository
39
+ uses: actions/checkout@v4
40
+
41
+ - name: Gitleaks scan
42
+ uses: gitleaks/gitleaks-action@v2
43
+ env:
44
+ GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
45
+
@@ -0,0 +1,40 @@
1
+ # Environment / tooling
2
+ .venv/
3
+ __pycache__/
4
+ *.pyc
5
+ .pytest_cache/
6
+ .ruff_cache/
7
+ .mypy_cache/
8
+
9
+ # Local artifacts
10
+ logs/
11
+ artifacts/
12
+ /data/
13
+ outputs/
14
+ checkpoints/
15
+ *.pt
16
+ train.log
17
+ train_dist.log
18
+ ref_repos/
19
+ configs/_tmp*
20
+ git.env
21
+ docs/POSTS.md
22
+ docs/EX_*.md
23
+ docs/CHECK_2_PLANNING_MODEL_REQUEST.md
24
+ docs/CHECK_2_PLANNING_MODEL_RESPONSE.md
25
+ docs/planner_check2_attachments.zip
26
+ docs/tmp/
27
+ docs_tmp/
28
+ wandb/
29
+ eval/*_ci.json
30
+
31
+ # Local paper scans / scratch references (keep tracked references separate)
32
+ google_papers/*_arXiv_v1.pdf
33
+ google_papers/*_arXiv_v1/
34
+ google_papers/Nested_Learning_Full_Paper.pdf
35
+ google_papers/Nested_Learning_Full_Paper/
36
+
37
+ # Editors
38
+ .DS_Store
39
+ .idea/
40
+ .vscode/
@@ -0,0 +1,59 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented here. The format loosely follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project follows semantic versioning once tagged releases begin.
4
+
5
+ ## [Unreleased]
6
+ ### Added
7
+ - Optional attention KV-cache path for continuous streaming inference (`init_attention_cache`, `attention_cache`, `return_attention_cache`) across HOPE/TITAN/Transformer blocks.
8
+ - Boundary-target online chunking mode (`train.online_boundary_targets`) and optional training-time attention-cache carry (`train.online_carry_attention_cache`) for stronger chunk-boundary semantics.
9
+ - Evaluation streaming-state utilities (`src/nested_learning/eval_state.py`) plus continual-eval controls (`--eval-state-mode`, `--eval-use-fast-state`, `--eval-use-attention-cache`).
10
+ - Compliance report automation (`scripts/checks/compliance_report.py`) with CI subset + mechanism smoke integration.
11
+ - Flash/SDPA-backed self-attention path with safe fallbacks, unlocking PyTorch 2.9 SDPA kernels by default.
12
+ - Hydra toggles for bf16 autocast (`train.mixed_precision.*`), `torch.compile` (`train.compile.*`), and fused optimizers.
13
+ - Muon + AdamW hybrid optimizer option exposed via `optim.type=muon`, routing ≥2D matrices through `torch.optim.Muon`.
14
+ - Test-time memorization flags (`--memorize*`) documented in README + `docs/guide.md`, matching TITAN eval behavior.
15
+ - Automation helpers: `scripts/run_e2e_smoke.sh` documented in Quickstart, plus new `scripts/run_cpu_ddp_smoke.sh` for CPU-only DDP/gloo smoke coverage.
16
+ - Streaming contract doc (`docs/STREAMING_CONTRACT.md`) defining sequence/segment/chunk semantics and fast-state lifecycle.
17
+ - Cadence verification utility (`scripts/checks/verify_update_cadence.py`) with synthetic tests and release-checklist integration.
18
+ - Fidelity CI subset runner (`scripts/checks/run_fidelity_ci_subset.sh`) and mechanism-auditing smoke runner (`scripts/run_mechanism_audit_smoke.sh`).
19
+ - Progress/status docs for P7 execution (`docs/PLAN_PROGRESS_P7.md`, `docs/IMPLEMENTATION_STATUS.md`).
20
+ - Bug-report reproducibility checklist (`docs/BUG_REPORT_CHECKLIST.md`).
21
+ - Boundary-state training-loop regression coverage (`tests/test_boundary_state_training_loop.py`) plus eval-loader/metadata roundtrip coverage (`tests/test_checkpoint_metadata_and_eval_loaders.py`).
22
+ - `scripts/checks/check_data_script_help.sh` to guarantee `scripts/data/* --help` exits cleanly; wired into CI.
23
+ - Markdown anchor verification in `scripts/checks/verify_docs_refs.py` with dedicated unit coverage.
24
+
25
+ ### Changed
26
+ - README / compliance / streaming docs now reflect boundary-target mode, optional KV-cache carry, and explicit scope boundaries.
27
+ - CPU DDP smoke now includes strict-mode fail-fast verification.
28
+ - Repository license metadata now matches the shipped Apache-2.0 text; badges updated accordingly.
29
+ - README and guide refreshed with performance knobs, optimizer guidance, and memorization instructions so release consumers have a single source of truth.
30
+ - Release checklist tracks the new CPU DDP smoke script to keep packaging instructions aligned with available tooling.
31
+ - Training loop strict-mode guardrails: `train.strict_streaming_contract` now fails fast on known semantics violations (DDP feature downgrades, shared-batch fast-state, non-paper-defined variants in strict mode).
32
+ - CMS telemetry now includes cadence metrics (`updates_applied`, `tokens_flushed`, `pending_tokens`, `gate_hits`) to make update-frequency behavior auditable.
33
+ - Paper-auditing preset now explicitly enables strict streaming contract checks.
34
+ - `configs/pilot_paper_faithful.yaml` now explicitly sets `train.online_updates=true` and tests verify no implicit algorithm-mode fallback.
35
+ - Boundary-state mode now emits an explicit startup warning code (`experimental_boundary_state_mode`) and validates cache/chunk constraints early.
36
+ - Checkpoint metadata now records algorithm/online flags (`algorithm_mode`, `online_updates`, `online_boundary_targets`, `online_carry_attention_cache`, `use_fast_state`), and release manifest includes those flags.
37
+ - Data split fallback policy is deterministic across data scripts (`train -> validation -> test -> first available`) with explicit available-splits logging.
38
+
39
+ ### Upcoming
40
+ - GitHub Actions workflow covering `ruff`, `mypy`, and `pytest`.
41
+ - End-to-end release dry-run ahead of the `v0.1.0` tag.
42
+
43
+ ## [0.1.0] - 2025-11-09
44
+ ### Added
45
+ - PyTorch **2.9.0** / torchvision **0.24.0** environment managed via `uv` with reproducible `pyproject.toml` + `uv.lock`.
46
+ - HOPE block implementation (attention → TITAN memory → CMS + deep optimizers) with configurable level clocks and self-modifier wiring.
47
+ - Hydrated Hydra config tree for pilot, mid, target, and CPU-only smoke runs plus DDP/FSDP/DeepSpeed entrypoints.
48
+ - Data tooling: tokenizer trainer, corpus filtering, mixture processing, and `scripts/data/run_sample.sh` shortcut emitting stats under `data/mixtures/`.
49
+ - Evaluation suite: zero-shot benchmark CLI (PIQA/HellaSwag/WinoGrande/ARC/BoolQ/SIQA), Needle-in-a-Haystack generator, continual-learning forgetting analyzer.
50
+ - Sample artifacts (`artifacts/examples/pilot_dummy.pt`, `logs/pilot_smoke.json`, `logs/mid_smoke.json`) for reproducing eval commands without lengthy training.
51
+ - Documentation set (`docs/stage1_plan.md`, `docs/stage2_plan.md`, `docs/data_pipeline.md`, `docs/guide.md`) outlining architecture, scaling strategy, and onboarding.
52
+
53
+ ### Changed
54
+ - README rewritten with badges, quickstart commands, and references to the new guide + release checklist.
55
+ - Logging defaults clarified (`logging.backend=json|wandb`), with instructions for saving structured metrics under `logs/`.
56
+
57
+ ### Known gaps
58
+ - Release automation and CI are tracked in `docs/release_plan.md`.
59
+ - Scaling guidance for corpora above 100B tokens is pending additional storage and GPU availability.