nested-learning 0.2.0__tar.gz
- nested_learning-0.2.0/.github/ISSUE_TEMPLATE/config.yml +2 -0
- nested_learning-0.2.0/.github/ISSUE_TEMPLATE/eval_request.md +27 -0
- nested_learning-0.2.0/.github/ISSUE_TEMPLATE/faithfulness_gap.md +30 -0
- nested_learning-0.2.0/.github/ISSUE_TEMPLATE/perf_regression.md +32 -0
- nested_learning-0.2.0/.github/workflows/ci.yml +197 -0
- nested_learning-0.2.0/.github/workflows/release.yml +79 -0
- nested_learning-0.2.0/.github/workflows/security.yml +45 -0
- nested_learning-0.2.0/.gitignore +40 -0
- nested_learning-0.2.0/CHANGELOG.md +59 -0
- nested_learning-0.2.0/LICENSE +201 -0
- nested_learning-0.2.0/PKG-INFO +390 -0
- nested_learning-0.2.0/README.md +353 -0
- nested_learning-0.2.0/TODO.md +85 -0
- nested_learning-0.2.0/configs/ablations/cms_sparse.yaml +46 -0
- nested_learning-0.2.0/configs/ablations/selfmod_chunked_8_64.yaml +24 -0
- nested_learning-0.2.0/configs/ablations/selfmod_momentum_off.yaml +23 -0
- nested_learning-0.2.0/configs/ablations/selfmod_momentum_on.yaml +23 -0
- nested_learning-0.2.0/configs/ablations/selfmod_no_alpha.yaml +23 -0
- nested_learning-0.2.0/configs/ablations/selfmod_no_cms.yaml +23 -0
- nested_learning-0.2.0/configs/ablations/selfmod_rank1_precond_off.yaml +23 -0
- nested_learning-0.2.0/configs/data/continual_segments_sample.yaml +9 -0
- nested_learning-0.2.0/configs/data/fineweb_edu_longdoc_filtered_sample.yaml +14 -0
- nested_learning-0.2.0/configs/data/fineweb_edu_mixture_full.yaml +14 -0
- nested_learning-0.2.0/configs/data/fineweb_edu_mixture_sample.yaml +14 -0
- nested_learning-0.2.0/configs/data/refinedweb_mixture.yaml +48 -0
- nested_learning-0.2.0/configs/data/refinedweb_mixture_filtered.yaml +48 -0
- nested_learning-0.2.0/configs/data/refinedweb_mixture_full.yaml +48 -0
- nested_learning-0.2.0/configs/data/refinedweb_mixture_sample.yaml +51 -0
- nested_learning-0.2.0/configs/deepspeed/zero3.json +25 -0
- nested_learning-0.2.0/configs/hope/mid.yaml +118 -0
- nested_learning-0.2.0/configs/hope/mid_fsdp.yaml +47 -0
- nested_learning-0.2.0/configs/hope/pilot.yaml +2 -0
- nested_learning-0.2.0/configs/hope/pilot_attention.yaml +9 -0
- nested_learning-0.2.0/configs/hope/pilot_selfmod.yaml +20 -0
- nested_learning-0.2.0/configs/hope/pilot_transformer.yaml +9 -0
- nested_learning-0.2.0/configs/hope/target.yaml +145 -0
- nested_learning-0.2.0/configs/hope/target_fsdp.yaml +47 -0
- nested_learning-0.2.0/configs/mid_smoke.yaml +99 -0
- nested_learning-0.2.0/configs/mid_stage2.yaml +110 -0
- nested_learning-0.2.0/configs/mid_stage2_smoke.yaml +102 -0
- nested_learning-0.2.0/configs/mid_titan_baseline.yaml +92 -0
- nested_learning-0.2.0/configs/pilot.yaml +127 -0
- nested_learning-0.2.0/configs/pilot_paper_faithful.yaml +42 -0
- nested_learning-0.2.0/configs/pilot_selfmod_paper_faithful.yaml +18 -0
- nested_learning-0.2.0/configs/pilot_smoke.yaml +80 -0
- nested_learning-0.2.0/configs/resolved/cms_sparse_eval.yaml +105 -0
- nested_learning-0.2.0/configs/resolved/phase2_pilot_attention_eval.yaml +49 -0
- nested_learning-0.2.0/configs/resolved/phase2_pilot_transformer_eval.yaml +49 -0
- nested_learning-0.2.0/docs/BUG_REPORT_CHECKLIST.md +30 -0
- nested_learning-0.2.0/docs/COMPATIBILITY_MATRIX.md +43 -0
- nested_learning-0.2.0/docs/FSDP_SCALING_GUIDE.md +61 -0
- nested_learning-0.2.0/docs/IMPLEMENTATION_STATUS.md +34 -0
- nested_learning-0.2.0/docs/P4_REMEDIATION_PLAN.md +41 -0
- nested_learning-0.2.0/docs/PACKAGE_RELEASE_CHECKLIST.md +39 -0
- nested_learning-0.2.0/docs/PAPER_COMPLIANCE.md +381 -0
- nested_learning-0.2.0/docs/PHASE2_LONG_CONTEXT_COMPARISON.md +49 -0
- nested_learning-0.2.0/docs/PHASE_2_PLAN.md +119 -0
- nested_learning-0.2.0/docs/PYPI_TRUSTED_PUBLISHING.md +52 -0
- nested_learning-0.2.0/docs/STREAMING_CONTRACT.md +109 -0
- nested_learning-0.2.0/docs/VERSIONING_POLICY.md +35 -0
- nested_learning-0.2.0/docs/compute_plan.md +22 -0
- nested_learning-0.2.0/docs/continual_classification_eval.md +82 -0
- nested_learning-0.2.0/docs/continual_eval.md +40 -0
- nested_learning-0.2.0/docs/data_pipeline.md +219 -0
- nested_learning-0.2.0/docs/env_matrix.md +63 -0
- nested_learning-0.2.0/docs/experiments_report.md +197 -0
- nested_learning-0.2.0/docs/future_directions.md +34 -0
- nested_learning-0.2.0/docs/phase2_comparison.md +70 -0
- nested_learning-0.2.0/docs/release_checklist.md +43 -0
- nested_learning-0.2.0/docs/scaling_guidance.md +72 -0
- nested_learning-0.2.0/docs/spec_interfaces.md +23 -0
- nested_learning-0.2.0/docs/sprint_next_plan.md +95 -0
- nested_learning-0.2.0/docs/stage2_plan.md +158 -0
- nested_learning-0.2.0/docs/stage2_progress.md +39 -0
- nested_learning-0.2.0/docs/templates/checkpoint_report.md +50 -0
- nested_learning-0.2.0/docs/zeroshot_eval.md +118 -0
- nested_learning-0.2.0/eval/continual_dummy.json +11 -0
- nested_learning-0.2.0/eval/continual_mid_stage2.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_smoke.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_ts10.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single120_clip.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single140_schedC.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single220_schedD.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single80.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_ts10_single80lr2e5.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_stage2_ts20.json +43 -0
- nested_learning-0.2.0/eval/continual_mid_titan_baseline.json +43 -0
- nested_learning-0.2.0/eval/continual_pilot.json +73 -0
- nested_learning-0.2.0/eval/continual_pilot_cms_nochunk_step5000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_cms_sparse_step5000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_multi.json +119 -0
- nested_learning-0.2.0/eval/continual_pilot_opt_adamw_step5000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_opt_muon_step5000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_selfmod_off_step5000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_step22000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_step230000.json +41 -0
- nested_learning-0.2.0/eval/continual_pilot_teach05_long_step25000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_teach05_step2000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_teach15_long_step25000.json +11 -0
- nested_learning-0.2.0/eval/continual_pilot_teach15_step2000.json +11 -0
- nested_learning-0.2.0/eval/continual_smoke.json +43 -0
- nested_learning-0.2.0/eval/continual_titan.json +73 -0
- nested_learning-0.2.0/eval/continual_titan_relaunch_step001000.json +41 -0
- nested_learning-0.2.0/eval/continual_titan_step25000.json +41 -0
- nested_learning-0.2.0/eval/niah_dummy.json +4 -0
- nested_learning-0.2.0/eval/niah_mid_stage2.json +4 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_smoke.json +5 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_ts10.json +3 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single120_clip.json +3 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single140_schedC.json +3 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single220_schedD.json +4 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single80.json +3 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_ts10_single80lr2e5.json +3 -0
- nested_learning-0.2.0/eval/niah_mid_stage2_ts20.json +3 -0
- nested_learning-0.2.0/eval/niah_mid_titan_baseline.json +4 -0
- nested_learning-0.2.0/eval/niah_pilot.json +38 -0
- nested_learning-0.2.0/eval/niah_pilot_cms_nochunk_step5000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_cms_sparse_step5000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_opt_adamw_step5000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_opt_muon_step5000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_selfmod_off_step5000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_step22000.json +5 -0
- nested_learning-0.2.0/eval/niah_pilot_step230000.json +28 -0
- nested_learning-0.2.0/eval/niah_pilot_teach05_long_step25000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_teach05_step2000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_teach15_long_step25000.json +8 -0
- nested_learning-0.2.0/eval/niah_pilot_teach15_step2000.json +8 -0
- nested_learning-0.2.0/eval/niah_smoke.json +5 -0
- nested_learning-0.2.0/eval/niah_titan.json +38 -0
- nested_learning-0.2.0/eval/niah_titan_relaunch_step001000.json +30 -0
- nested_learning-0.2.0/eval/niah_titan_step25000.json +28 -0
- nested_learning-0.2.0/eval/passkey_pilot.json +21 -0
- nested_learning-0.2.0/eval/passkey_pilot_step230000.json +11 -0
- nested_learning-0.2.0/eval/passkey_titan.json +21 -0
- nested_learning-0.2.0/eval/passkey_titan_relaunch_step001000.json +13 -0
- nested_learning-0.2.0/eval/passkey_titan_step25000.json +11 -0
- nested_learning-0.2.0/eval/pg19_pilot.json +9 -0
- nested_learning-0.2.0/eval/pg19_pilot_step230000.json +7 -0
- nested_learning-0.2.0/eval/pg19_titan.json +9 -0
- nested_learning-0.2.0/eval/pg19_titan_relaunch_step001000.json +9 -0
- nested_learning-0.2.0/eval/pg19_titan_step25000.json +7 -0
- nested_learning-0.2.0/eval/phase2_compare_smoke_lastlayer_metrics.json +131 -0
- nested_learning-0.2.0/eval/zeroshot_full_smoke.json +16 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2.json +8 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_smoke.json +16 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_smoke_piqa_baseline.json +4 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_smoke_piqa_mem.json +4 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single120_clip.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single140_schedC.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single220_schedD.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single80.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts10_single80lr2e5.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_mid_stage2_ts20.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_mid_titan_baseline.json +6 -0
- nested_learning-0.2.0/eval/zeroshot_pilot.json +155 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_cms_nochunk_step5000.json +20 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_cms_sparse_step5000.json +20 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_debug.json +10 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_dummy_piqa.json +4 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_opt_adamw_step5000.json +10 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_opt_muon_step5000.json +10 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_selfmod_off_step5000.json +20 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_step22000.json +4 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_step230000.json +65 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_teach05_long_step25000.json +20 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_teach05_step2000.json +20 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_teach15_long_step25000.json +20 -0
- nested_learning-0.2.0/eval/zeroshot_pilot_teach15_step2000.json +20 -0
- nested_learning-0.2.0/eval/zeroshot_smoke.json +4 -0
- nested_learning-0.2.0/eval/zeroshot_titan.json +155 -0
- nested_learning-0.2.0/eval/zeroshot_titan_relaunch_step001000.json +83 -0
- nested_learning-0.2.0/eval/zeroshot_titan_step25000.json +65 -0
- nested_learning-0.2.0/google_papers/Nested_Learning/Nested_Learning.json +270 -0
- nested_learning-0.2.0/google_papers/Nested_Learning/Nested_Learning.md +643 -0
- nested_learning-0.2.0/google_papers/Nested_Learning.pdf +0 -0
- nested_learning-0.2.0/google_papers/TITANs/TITANs.json +378 -0
- nested_learning-0.2.0/google_papers/TITANs/TITANs.md +711 -0
- nested_learning-0.2.0/google_papers/TITANs.pdf +0 -0
- nested_learning-0.2.0/pyproject.toml +83 -0
- nested_learning-0.2.0/reports/ablations.md +151 -0
- nested_learning-0.2.0/reports/cadence_mechanism_audit_smoke.json +27 -0
- nested_learning-0.2.0/reports/compliance_mechanism_audit_smoke.json +73 -0
- nested_learning-0.2.0/reports/compliance_summary_pilot.json +73 -0
- nested_learning-0.2.0/reports/compliance_summary_pilot_paper_faithful.json +42 -0
- nested_learning-0.2.0/reports/next_backlog_scoped.md +8 -0
- nested_learning-0.2.0/reports/plots/continual_pilot_refinedweb.png +0 -0
- nested_learning-0.2.0/reports/security_release_gate.md +33 -0
- nested_learning-0.2.0/reports/sprint_completion_report.md +37 -0
- nested_learning-0.2.0/reports/stage2_smoke.md +94 -0
- nested_learning-0.2.0/scripts/__init__.py +1 -0
- nested_learning-0.2.0/scripts/checkpoint/verify.py +24 -0
- nested_learning-0.2.0/scripts/checks/check_data_script_help.sh +36 -0
- nested_learning-0.2.0/scripts/checks/check_git_tracked_sizes.sh +26 -0
- nested_learning-0.2.0/scripts/checks/check_readme_commands.sh +11 -0
- nested_learning-0.2.0/scripts/checks/compliance_report.py +195 -0
- nested_learning-0.2.0/scripts/checks/run_fidelity_ci_subset.sh +38 -0
- nested_learning-0.2.0/scripts/checks/tokenizer_coverage_guard.py +98 -0
- nested_learning-0.2.0/scripts/checks/verify_docs_refs.py +206 -0
- nested_learning-0.2.0/scripts/checks/verify_update_cadence.py +138 -0
- nested_learning-0.2.0/scripts/compute/create_reservations.sh +38 -0
- nested_learning-0.2.0/scripts/data/__init__.py +2 -0
- nested_learning-0.2.0/scripts/data/check_tokenizer.py +80 -0
- nested_learning-0.2.0/scripts/data/check_tokenizer_coverage.py +34 -0
- nested_learning-0.2.0/scripts/data/filter_corpus.py +124 -0
- nested_learning-0.2.0/scripts/data/process_mixture.py +51 -0
- nested_learning-0.2.0/scripts/data/run_full.sh +169 -0
- nested_learning-0.2.0/scripts/data/run_sample.sh +110 -0
- nested_learning-0.2.0/scripts/data/shard_corpus.py +161 -0
- nested_learning-0.2.0/scripts/data/train_tokenizer.py +168 -0
- nested_learning-0.2.0/scripts/data/validate_mixture.py +73 -0
- nested_learning-0.2.0/scripts/eval/__init__.py +1 -0
- nested_learning-0.2.0/scripts/eval/compare_variants.py +523 -0
- nested_learning-0.2.0/scripts/eval/continual.py +262 -0
- nested_learning-0.2.0/scripts/eval/continual_classification.py +177 -0
- nested_learning-0.2.0/scripts/eval/niah.py +210 -0
- nested_learning-0.2.0/scripts/eval/niah_suite.py +408 -0
- nested_learning-0.2.0/scripts/eval/passkey.py +178 -0
- nested_learning-0.2.0/scripts/eval/pg19_perplexity.py +175 -0
- nested_learning-0.2.0/scripts/eval/phase2_memorization_delta_smoke.py +100 -0
- nested_learning-0.2.0/scripts/eval/plot_continual_classification.py +66 -0
- nested_learning-0.2.0/scripts/eval/plot_forgetting.py +44 -0
- nested_learning-0.2.0/scripts/eval/plot_niah_suite.py +61 -0
- nested_learning-0.2.0/scripts/eval/run_pilot_suite.sh +200 -0
- nested_learning-0.2.0/scripts/eval/summarize_eval.py +110 -0
- nested_learning-0.2.0/scripts/eval/zeroshot.py +399 -0
- nested_learning-0.2.0/scripts/package_pilot_release.sh +166 -0
- nested_learning-0.2.0/scripts/run_cpu_ddp_smoke.sh +18 -0
- nested_learning-0.2.0/scripts/run_e2e_smoke.sh +64 -0
- nested_learning-0.2.0/scripts/run_mechanism_audit_smoke.sh +36 -0
- nested_learning-0.2.0/scripts/run_smoke.sh +20 -0
- nested_learning-0.2.0/scripts/tests/run_passkey_smoke.sh +35 -0
- nested_learning-0.2.0/src/nested_learning/__init__.py +12 -0
- nested_learning-0.2.0/src/nested_learning/__main__.py +12 -0
- nested_learning-0.2.0/src/nested_learning/assoc_memory.py +23 -0
- nested_learning-0.2.0/src/nested_learning/backbones.py +147 -0
- nested_learning-0.2.0/src/nested_learning/capabilities.py +104 -0
- nested_learning-0.2.0/src/nested_learning/cli.py +253 -0
- nested_learning-0.2.0/src/nested_learning/cms.py +92 -0
- nested_learning-0.2.0/src/nested_learning/config_utils.py +50 -0
- nested_learning-0.2.0/src/nested_learning/continual_classification.py +136 -0
- nested_learning-0.2.0/src/nested_learning/continual_streaming.py +283 -0
- nested_learning-0.2.0/src/nested_learning/data.py +153 -0
- nested_learning-0.2.0/src/nested_learning/device.py +21 -0
- nested_learning-0.2.0/src/nested_learning/eval_state.py +72 -0
- nested_learning-0.2.0/src/nested_learning/fast_state.py +108 -0
- nested_learning-0.2.0/src/nested_learning/functional.py +69 -0
- nested_learning-0.2.0/src/nested_learning/hope/__init__.py +0 -0
- nested_learning-0.2.0/src/nested_learning/hope/block.py +1973 -0
- nested_learning-0.2.0/src/nested_learning/hope/self_mod.py +40 -0
- nested_learning-0.2.0/src/nested_learning/instrumentation.py +38 -0
- nested_learning-0.2.0/src/nested_learning/levels.py +94 -0
- nested_learning-0.2.0/src/nested_learning/logging_utils.py +64 -0
- nested_learning-0.2.0/src/nested_learning/memorize.py +382 -0
- nested_learning-0.2.0/src/nested_learning/model.py +604 -0
- nested_learning-0.2.0/src/nested_learning/optim/__init__.py +0 -0
- nested_learning-0.2.0/src/nested_learning/optim/deep.py +102 -0
- nested_learning-0.2.0/src/nested_learning/optim/factory.py +13 -0
- nested_learning-0.2.0/src/nested_learning/optim/m3.py +121 -0
- nested_learning-0.2.0/src/nested_learning/optim/manager.py +151 -0
- nested_learning-0.2.0/src/nested_learning/titan/__init__.py +0 -0
- nested_learning-0.2.0/src/nested_learning/titan/memory.py +88 -0
- nested_learning-0.2.0/src/nested_learning/titan/model.py +412 -0
- nested_learning-0.2.0/src/nested_learning/titan/self_modifying.py +724 -0
- nested_learning-0.2.0/src/nested_learning/tokenizer.py +28 -0
- nested_learning-0.2.0/src/nested_learning/tokenizer_coverage.py +77 -0
- nested_learning-0.2.0/src/nested_learning/training.py +1600 -0
- nested_learning-0.2.0/src/nested_learning/transformer.py +104 -0
- nested_learning-0.2.0/tests/conftest.py +7 -0
- nested_learning-0.2.0/tests/data/passkey_corpus.txt +3 -0
- nested_learning-0.2.0/tests/data/tiny_tokenizer.model +0 -0
- nested_learning-0.2.0/tests/data/tiny_tokenizer.vocab +128 -0
- nested_learning-0.2.0/tests/test_algorithm_mode_grad.py +47 -0
- nested_learning-0.2.0/tests/test_attention_cache.py +74 -0
- nested_learning-0.2.0/tests/test_attention_features.py +45 -0
- nested_learning-0.2.0/tests/test_boundary_state_mode.py +71 -0
- nested_learning-0.2.0/tests/test_boundary_state_training_loop.py +66 -0
- nested_learning-0.2.0/tests/test_build_model_from_cfg_selfmod.py +44 -0
- nested_learning-0.2.0/tests/test_checkpoint_metadata_and_eval_loaders.py +89 -0
- nested_learning-0.2.0/tests/test_cli_tooling.py +135 -0
- nested_learning-0.2.0/tests/test_cms.py +91 -0
- nested_learning-0.2.0/tests/test_cms_cross_call.py +118 -0
- nested_learning-0.2.0/tests/test_cms_delta_rule.py +43 -0
- nested_learning-0.2.0/tests/test_cms_flush_partial.py +47 -0
- nested_learning-0.2.0/tests/test_compare_variants_cli.py +100 -0
- nested_learning-0.2.0/tests/test_compile_toggle.py +63 -0
- nested_learning-0.2.0/tests/test_compliance_report.py +60 -0
- nested_learning-0.2.0/tests/test_continual_classification.py +119 -0
- nested_learning-0.2.0/tests/test_continual_eval_state_mode.py +81 -0
- nested_learning-0.2.0/tests/test_data_scripts_help.py +13 -0
- nested_learning-0.2.0/tests/test_data_split_fallbacks.py +126 -0
- nested_learning-0.2.0/tests/test_determinism_seed.py +42 -0
- nested_learning-0.2.0/tests/test_device_resolution.py +10 -0
- nested_learning-0.2.0/tests/test_distributed_fail_fast.py +90 -0
- nested_learning-0.2.0/tests/test_eval_builders.py +36 -0
- nested_learning-0.2.0/tests/test_eval_state.py +58 -0
- nested_learning-0.2.0/tests/test_eval_state_cli.py +90 -0
- nested_learning-0.2.0/tests/test_faithfulness_harness.py +57 -0
- nested_learning-0.2.0/tests/test_fast_state_batch_semantics.py +40 -0
- nested_learning-0.2.0/tests/test_fast_state_forward_equivalence.py +25 -0
- nested_learning-0.2.0/tests/test_fast_state_meta_grads.py +37 -0
- nested_learning-0.2.0/tests/test_fast_state_selfmod_meta_grads.py +54 -0
- nested_learning-0.2.0/tests/test_git_tracked_sizes_check.py +13 -0
- nested_learning-0.2.0/tests/test_hope_block.py +29 -0
- nested_learning-0.2.0/tests/test_hope_selfmod_fast_state_meta_unchanged.py +31 -0
- nested_learning-0.2.0/tests/test_hope_selfmod_integration.py +30 -0
- nested_learning-0.2.0/tests/test_hope_selfmod_update_pass.py +34 -0
- nested_learning-0.2.0/tests/test_levels.py +15 -0
- nested_learning-0.2.0/tests/test_m3.py +145 -0
- nested_learning-0.2.0/tests/test_m3_slow_timing.py +27 -0
- nested_learning-0.2.0/tests/test_memorization.py +179 -0
- nested_learning-0.2.0/tests/test_model.py +19 -0
- nested_learning-0.2.0/tests/test_model_streaming_cadence.py +186 -0
- nested_learning-0.2.0/tests/test_online_chunking.py +195 -0
- nested_learning-0.2.0/tests/test_optim.py +62 -0
- nested_learning-0.2.0/tests/test_optimizer_param_policy.py +76 -0
- nested_learning-0.2.0/tests/test_package_release_script.py +50 -0
- nested_learning-0.2.0/tests/test_paper_faithful_configs.py +71 -0
- nested_learning-0.2.0/tests/test_phase2_memorization_delta.py +50 -0
- nested_learning-0.2.0/tests/test_residual_mlp_memory.py +31 -0
- nested_learning-0.2.0/tests/test_run_features.py +70 -0
- nested_learning-0.2.0/tests/test_self_modifying_titans.py +75 -0
- nested_learning-0.2.0/tests/test_selfmod_adaptive_q.py +24 -0
- nested_learning-0.2.0/tests/test_selfmod_dgd_linear.py +43 -0
- nested_learning-0.2.0/tests/test_selfmod_grad_flow.py +33 -0
- nested_learning-0.2.0/tests/test_selfmod_local_conv.py +17 -0
- nested_learning-0.2.0/tests/test_selfmod_online.py +30 -0
- nested_learning-0.2.0/tests/test_strict_streaming_contract.py +201 -0
- nested_learning-0.2.0/tests/test_surprise_metric.py +170 -0
- nested_learning-0.2.0/tests/test_surprise_override.py +50 -0
- nested_learning-0.2.0/tests/test_teach_signal.py +231 -0
- nested_learning-0.2.0/tests/test_tied_weight_guard.py +34 -0
- nested_learning-0.2.0/tests/test_variants.py +67 -0
- nested_learning-0.2.0/tests/test_verify_docs_refs.py +63 -0
- nested_learning-0.2.0/tests/test_verify_update_cadence.py +118 -0
- nested_learning-0.2.0/train.py +18 -0
- nested_learning-0.2.0/train_deepspeed.py +121 -0
- nested_learning-0.2.0/train_dist.py +36 -0
- nested_learning-0.2.0/train_fsdp.py +199 -0
- nested_learning-0.2.0/uv.lock +3339 -0
@@ -0,0 +1,27 @@
+---
+name: Evaluation request
+about: Propose a new benchmark or diagnostic to add
+title: "[Eval] "
+labels: ["evaluation", "needs-triage"]
+assignees: []
+---
+
+## Motivation
+Why is this evaluation important for HOPE/TITAN reproduction?
+
+## Task details
+- Dataset / benchmark:
+- Metric(s):
+- Expected runtime / hardware:
+
+## Environment target
+- OS:
+- Python:
+- Torch:
+- Preferred backend (`cpu` / `cuda` / `mps` / `rocm`):
+
+## Implementation sketch
+Outline scripts/flags needed (e.g., extend `scripts/eval/zeroshot.py`).
+
+## Acceptance criteria
+Describe what needs to be captured (JSON fields, plots, etc.).
@@ -0,0 +1,30 @@
+---
+name: Faithfulness gap
+about: Report deviations vs. the Nested Learning / HOPE specs
+title: "[Faithfulness] "
+labels: ["faithfulness", "needs-triage"]
+assignees: []
+---
+
+## Summary
+Describe the suspected deviation (cite paper section/equation).
+
+## Evidence
+- Config(s) / checkpoints affected
+- Logs / screenshots / metrics
+- Steps to reproduce
+
+## Environment
+- OS:
+- Python:
+- Torch:
+- Backend (`cpu` / `cuda` / `mps` / `rocm`):
+- GPU/accelerator model (if any):
+
+If using ROCm: this project currently treats ROCm support as best-effort. Include HIP/ROCm version and exact torch build.
+
+## Expected behavior
+What should happen according to the paper?
+
+## Additional context
+Add any extra notes, e.g., suggested fix or related PRs.
@@ -0,0 +1,32 @@
+---
+name: Performance regression
+about: Report a training / eval performance drop vs. baseline
+title: "[Perf] "
+labels: ["performance", "needs-triage"]
+assignees: []
+---
+
+## Summary
+Describe the regression and the baseline you’re comparing against.
+
+## Baseline
+- Config / checkpoint:
+- Metrics (loss / ppl / eval scores):
+
+## Repro steps
+Exact commands with overrides, plus hardware details.
+
+## Environment
+- OS:
+- Python:
+- Torch:
+- Backend (`cpu` / `cuda` / `mps` / `rocm`):
+- GPU/accelerator model (if any):
+
+If using ROCm: this project currently treats ROCm support as best-effort. Include HIP/ROCm version and exact torch build.
+
+## Logs / artifacts
+Attach relevant logs, W&B links, or JSON eval files.
+
+## Suspected cause
+Optional theory or related commits/PRs.
@@ -0,0 +1,197 @@
+name: CI
+
+on:
+  push:
+    branches: ["main"]
+  pull_request:
+    branches: ["main"]
+
+jobs:
+  lint-and-test:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Sync dependencies
+        run: uv sync --all-extras --dev
+
+      - name: Ruff
+        run: uv run ruff check .
+
+      - name: Mypy
+        run: uv run mypy src
+
+      - name: Verify docs path references
+        run: uv run python scripts/checks/verify_docs_refs.py
+
+      - name: Verify README critical commands
+        run: bash scripts/checks/check_readme_commands.sh
+
+      - name: Guard tracked file sizes / artifact extensions
+        run: bash scripts/checks/check_git_tracked_sizes.sh
+
+      - name: Verify scripts/data help exits cleanly
+        run: bash scripts/checks/check_data_script_help.sh
+
+      - name: Pytest
+        run: uv run pytest
+
+  cross-platform-smoke:
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, macos-latest, windows-latest]
+        python-version: ["3.10", "3.12"]
+    runs-on: ${{ matrix.os }}
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Sync dependencies
+        run: uv sync --dev
+
+      - name: CLI help + doctor + smoke
+        run: |
+          uv run nl --help
+          uv run nl doctor --json
+          uv run nl smoke --config-name pilot_smoke --device cpu --batch-size 1 --seq-len 8
+          uv run python -m nested_learning --help
+
+  wheel-install-smoke:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Build wheel
+        run: uv build
+
+      - name: Install wheel in isolated venv
+        run: |
+          python -m venv /tmp/wheel-smoke
+          /tmp/wheel-smoke/bin/python -m pip install --upgrade pip
+          /tmp/wheel-smoke/bin/python -m pip install dist/*.whl
+
+      - name: Verify wheel entrypoints outside repo configs
+        run: |
+          /tmp/wheel-smoke/bin/python -m nested_learning --help
+          /tmp/wheel-smoke/bin/python -m nested_learning doctor --json
+          /tmp/wheel-smoke/bin/python - <<'PY'
+          import subprocess
+          import sys
+          import tempfile
+
+          tmp = tempfile.mkdtemp(prefix="nl-wheel-smoke-")
+          cmd = [
+              sys.executable,
+              "-m",
+              "nested_learning",
+              "smoke",
+              "--config-name",
+              "pilot_smoke",
+              "--device",
+              "cpu",
+              "--batch-size",
+              "1",
+              "--seq-len",
+              "8",
+          ]
+          subprocess.run(cmd, cwd=tmp, check=True)
+          PY
+
+  cpu-ddp-smoke:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Sync dependencies
+        run: uv sync --all-extras --dev
+
+      - name: CPU DDP smoke (gloo backend)
+        run: bash scripts/run_cpu_ddp_smoke.sh
+
+  passkey-smoke:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Sync dependencies
+        run: uv sync --all-extras --dev
+
+      - name: Run synthetic passkey memorization test
+        run: bash scripts/tests/run_passkey_smoke.sh
+
+  fidelity-subset:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Sync dependencies
+        run: uv sync --all-extras --dev
+
+      - name: Run fidelity subset + compliance report
+        run: bash scripts/checks/run_fidelity_ci_subset.sh
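The `cross-platform-smoke` job in the CI workflow above declares a two-axis matrix (`os` and `python-version`). GitHub Actions runs one job per combination of the axes, so the sketch below simply enumerates that cross product. It is an illustration only: `expand_matrix`, `OS_CHOICES`, and `PYTHON_VERSIONS` are hypothetical names made up for this example and are not part of the package or of GitHub Actions.

```python
from itertools import product

# Axis values copied from the workflow's matrix; the constant names are
# hypothetical and exist only for this sketch.
OS_CHOICES = ["ubuntu-latest", "macos-latest", "windows-latest"]
PYTHON_VERSIONS = ["3.10", "3.12"]


def expand_matrix(os_choices, python_versions):
    """Return one dict per (os, python-version) combination."""
    return [
        {"os": os_name, "python-version": version}
        for os_name, version in product(os_choices, python_versions)
    ]


jobs = expand_matrix(OS_CHOICES, PYTHON_VERSIONS)
for job in jobs:
    print(f'{job["os"]} / Python {job["python-version"]}')
```

With three operating systems and two interpreter versions this yields six smoke jobs. The versions are kept as strings here for the same reason the workflow quotes them: an unquoted `3.10` in YAML would be parsed as the number `3.1`.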
@@ -0,0 +1,79 @@
+name: Release
+
+on:
+  push:
+    tags:
+      - "v*"
+
+permissions:
+  contents: write
+  id-token: write
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Build source and wheel distributions
+        run: uv build
+
+      - name: Twine check
+        run: uvx twine check dist/*
+
+      - name: Upload dist artifacts
+        uses: actions/upload-artifact@v4
+        with:
+          name: dist
+          path: dist/*
+
+  publish-testpypi:
+    if: contains(github.ref_name, 'rc')
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: testpypi
+      url: https://test.pypi.org/p/nested-learning
+    steps:
+      - name: Download dist artifacts
+        uses: actions/download-artifact@v4
+        with:
+          name: dist
+          path: dist
+
+      - name: Publish to TestPyPI via Trusted Publishing
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          repository-url: https://test.pypi.org/legacy/
+          packages-dir: dist/
+
+  publish-pypi:
+    if: ${{ !contains(github.ref_name, 'rc') }}
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: pypi
+      url: https://pypi.org/p/nested-learning
+    steps:
+      - name: Download dist artifacts
+        uses: actions/download-artifact@v4
+        with:
+          name: dist
+          path: dist
+
+      - name: Publish to PyPI via Trusted Publishing
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: dist/
+
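Review note on the release workflow: the `publish-testpypi` and `publish-pypi` jobs are mutually exclusive, gated on whether the pushed tag name contains `rc`. The sketch below mirrors that routing in Python so the rule can be checked against candidate tag names before tagging. It assumes GitHub's `contains()` acts as a case-insensitive substring test on strings; the function name `publish_target` is hypothetical and is not part of the package or the workflow.

```python
def publish_target(ref_name: str) -> str:
    """Return the index a tag would publish to under the workflow's `if:` gates.

    Hypothetical helper for illustration only; it mirrors
    `contains(github.ref_name, 'rc')` from the release workflow.
    """
    # Assumption: contains() on a string is a case-insensitive substring test,
    # so "rc" anywhere in the tag name counts as a release candidate.
    is_release_candidate = "rc" in ref_name.lower()
    return "testpypi" if is_release_candidate else "pypi"


if __name__ == "__main__":
    for tag in ("v0.2.0", "v0.2.0rc1", "v0.2.0-RC2"):
        print(tag, "->", publish_target(tag))
```

Because the gate is a plain substring match, a tag such as `v1.0.0-src` would also be routed to TestPyPI under this reading; a stricter pattern may be worth considering if such tag names are ever used.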
@@ -0,0 +1,45 @@
+name: Security
+
+on:
+  push:
+    branches: ["main"]
+  pull_request:
+    branches: ["main"]
+  schedule:
+    - cron: "0 6 * * 1"
+
+jobs:
+  dependency-audit:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "0.9.8"
+
+      - name: Export requirements
+        run: uv export --all-extras --dev --format requirements-txt --output-file /tmp/requirements.txt
+
+      - name: pip-audit
+        run: uvx pip-audit -r /tmp/requirements.txt
+        continue-on-error: true
+
+  secret-scan:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Gitleaks scan
+        uses: gitleaks/gitleaks-action@v2
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
@@ -0,0 +1,40 @@
+# Environment / tooling
+.venv/
+__pycache__/
+*.pyc
+.pytest_cache/
+.ruff_cache/
+.mypy_cache/
+
+# Local artifacts
+logs/
+artifacts/
+/data/
+outputs/
+checkpoints/
+*.pt
+train.log
+train_dist.log
+ref_repos/
+configs/_tmp*
+git.env
+docs/POSTS.md
+docs/EX_*.md
+docs/CHECK_2_PLANNING_MODEL_REQUEST.md
+docs/CHECK_2_PLANNING_MODEL_RESPONSE.md
+docs/planner_check2_attachments.zip
+docs/tmp/
+docs_tmp/
+wandb/
+eval/*_ci.json
+
+# Local paper scans / scratch references (keep tracked references separate)
+google_papers/*_arXiv_v1.pdf
+google_papers/*_arXiv_v1/
+google_papers/Nested_Learning_Full_Paper.pdf
+google_papers/Nested_Learning_Full_Paper/
+
+# Editors
+.DS_Store
+.idea/
+.vscode/
@@ -0,0 +1,59 @@
+# Changelog
+
+All notable changes to this project will be documented here. The format loosely follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and uses semantic versioning once tagged releases begin.
+
+## [Unreleased]
+### Added
+- Optional attention KV-cache path for continuous streaming inference (`init_attention_cache`, `attention_cache`, `return_attention_cache`) across HOPE/TITAN/Transformer blocks.
+- Boundary-target online chunking mode (`train.online_boundary_targets`) and optional training-time attention-cache carry (`train.online_carry_attention_cache`) for stronger chunk-boundary semantics.
+- Evaluation streaming-state utilities (`src/nested_learning/eval_state.py`) plus continual-eval controls (`--eval-state-mode`, `--eval-use-fast-state`, `--eval-use-attention-cache`).
+- Compliance report automation (`scripts/checks/compliance_report.py`) with CI subset + mechanism smoke integration.
+- Flash/SDPA-backed self-attention path with safe fallbacks, unlocking PyTorch 2.9 SDPA kernels by default.
+- Hydra toggles for bf16 autocast (`train.mixed_precision.*`), `torch.compile` (`train.compile.*`), and fused optimizers.
+- Muon + AdamW hybrid optimizer option exposed via `optim.type=muon`, routing ≥2D matrices through `torch.optim.Muon`.
+- Test-time memorization flags (`--memorize*`) documented in README + `docs/guide.md`, matching TITAN eval behavior.
+- Automation helpers: `scripts/run_e2e_smoke.sh` documented in Quickstart, plus new `scripts/run_cpu_ddp_smoke.sh` for CPU-only DDP/gloo smoke coverage.
+- Streaming contract doc (`docs/STREAMING_CONTRACT.md`) defining sequence/segment/chunk semantics and fast-state lifecycle.
+- Cadence verification utility (`scripts/checks/verify_update_cadence.py`) with synthetic tests and release-checklist integration.
+- Fidelity CI subset runner (`scripts/checks/run_fidelity_ci_subset.sh`) and mechanism-auditing smoke runner (`scripts/run_mechanism_audit_smoke.sh`).
+- Progress/status docs for P7 execution (`docs/PLAN_PROGRESS_P7.md`, `docs/IMPLEMENTATION_STATUS.md`).
+- Bug-report reproducibility checklist (`docs/BUG_REPORT_CHECKLIST.md`).
+- Boundary-state training-loop regression coverage (`tests/test_boundary_state_training_loop.py`) plus eval-loader/metadata roundtrip coverage (`tests/test_checkpoint_metadata_and_eval_loaders.py`).
+- `scripts/checks/check_data_script_help.sh` to guarantee `scripts/data/* --help` exits cleanly; wired into CI.
+- Markdown anchor verification in `scripts/checks/verify_docs_refs.py` with dedicated unit coverage.
+
+### Changed
+- README / compliance / streaming docs now reflect boundary-target mode, optional KV-cache carry, and explicit scope boundaries.
+- CPU DDP smoke now includes strict-mode fail-fast verification.
+- Repository license metadata now matches the shipped Apache-2.0 text; badges updated accordingly.
+- README and guide refreshed with performance knobs, optimizer guidance, and memorization instructions so release consumers have a single source of truth.
+- Release checklist tracks the new CPU DDP smoke script to keep packaging instructions aligned with available tooling.
+- Training loop strict-mode guardrails: `train.strict_streaming_contract` now fail-fasts on known semantics violations (DDP feature downgrades, shared-batch fast-state, non paper-defined variant in strict mode).
+- CMS telemetry now includes cadence metrics (`updates_applied`, `tokens_flushed`, `pending_tokens`, `gate_hits`) to make update-frequency behavior auditable.
+- Paper-auditing preset now explicitly enables strict streaming contract checks.
+- `configs/pilot_paper_faithful.yaml` now explicitly sets `train.online_updates=true` and tests verify no implicit algorithm-mode fallback.
+- Boundary-state mode now emits an explicit startup warning code (`experimental_boundary_state_mode`) and validates cache/chunk constraints early.
+- Checkpoint metadata now records algorithm/online flags (`algorithm_mode`, `online_updates`, `online_boundary_targets`, `online_carry_attention_cache`, `use_fast_state`), and release manifest includes those flags.
+- Data split fallback policy is deterministic across data scripts (`train -> validation -> test -> first available`) with explicit available-splits logging.
+
+### Upcoming
+- GitHub Actions workflow covering `ruff`, `mypy`, and `pytest`.
+- End-to-end release dry-run ahead of the `v0.1.0` tag.
+
+## [0.1.0] - 2025-11-09
+### Added
+- PyTorch **2.9.0** / torchvision **0.24.0** environment managed via `uv` with reproducible `pyproject.toml` + `uv.lock`.
+- HOPE block implementation (attention → TITAN memory → CMS + deep optimizers) with configurable level clocks and self-modifier wiring.
+- Hydrated Hydra config tree for pilot, mid, target, and CPU-only smoke runs plus DDP/FSDP/DeepSpeed entrypoints.
+- Data tooling: tokenizer trainer, corpus filtering, mixture processing, and `scripts/data/run_sample.sh` shortcut emitting stats under `data/mixtures/`.
+- Evaluation suite: zero-shot benchmark CLI (PIQA/HellaSwag/WinoGrande/ARC/BoolQ/SIQA), Needle-in-a-Haystack generator, continual-learning forgetting analyzer.
+- Sample artifacts (`artifacts/examples/pilot_dummy.pt`, `logs/pilot_smoke.json`, `logs/mid_smoke.json`) for reproducing eval commands without lengthy training.
+- Documentation set (`docs/stage1_plan.md`, `docs/stage2_plan.md`, `docs/data_pipeline.md`, `docs/guide.md`) outlining architecture, scaling strategy, and onboarding.
+
+### Changed
+- README rewritten with badges, quickstart commands, and references to the new guide + release checklist.
+- Logging defaults clarified (`logging.backend=json|wandb`), with instructions for saving structured metrics under `logs/`.
+
+### Known gaps
+- Release automation and CI are tracked in `docs/release_plan.md`.
+- Scaling guidance for >100 B token corpora pending additional storage + GPU availability.
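Review note on the changelog: the "Changed" section describes a deterministic data split fallback (`train -> validation -> test -> first available`) with available-splits logging. A minimal sketch of what such a policy could look like follows. It is not the package's implementation: the names `PREFERRED_SPLITS` and `resolve_split` are hypothetical, and it assumes the caller passes the available split names in a stable order so that "first available" is itself deterministic.

```python
from typing import Sequence

# Preference order taken from the changelog entry; the constant name is hypothetical.
PREFERRED_SPLITS = ("train", "validation", "test")


def resolve_split(available: Sequence[str]) -> str:
    """Pick a split using train -> validation -> test -> first available."""
    if not available:
        raise ValueError("dataset exposes no splits")
    for name in PREFERRED_SPLITS:
        if name in available:
            chosen = name
            break
    else:
        # None of the preferred names exist: fall back to the first listed split.
        chosen = available[0]
    # Log what was available so the choice can be audited afterwards.
    print(f"available splits: {list(available)}; using {chosen!r}")
    return chosen


if __name__ == "__main__":
    resolve_split(["test", "validation"])
    resolve_split(["dev", "heldout"])
```

Keeping the preference order in one shared constant is one way to make every data script resolve splits identically, which appears to be the intent of the changelog entry.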