flowbook-python 0.0.1__tar.gz → 0.1.0__tar.gz
- flowbook_python-0.1.0/.claude/agents/reproducibility-fixer.md +196 -0
- flowbook_python-0.1.0/.claude/commands/basic-run-nbi.md +52 -0
- flowbook_python-0.1.0/.claude/commands/basic-run.md +63 -0
- flowbook_python-0.1.0/.claude/commands/categorize-repro-errors.md +341 -0
- flowbook_python-0.1.0/.claude/commands/fix-notebook.md +251 -0
- flowbook_python-0.1.0/.claude/commands/sync-spec.md +16 -0
- flowbook_python-0.1.0/.copier-answers.yml +14 -0
- {flowbook_python-0.0.1 → flowbook_python-0.1.0}/.gitignore +12 -0
- flowbook_python-0.1.0/.prettierignore +18 -0
- flowbook_python-0.1.0/.yarnrc.yml +1 -0
- flowbook_python-0.1.0/CLAUDE.md +524 -0
- flowbook_python-0.1.0/CLI.md +287 -0
- flowbook_python-0.1.0/CONFLICT_RELATION.md +288 -0
- flowbook_python-0.1.0/FORMAL_DEVELOPMENT.md +1011 -0
- flowbook_python-0.1.0/MCP_ARCHITECTURE.md +303 -0
- flowbook_python-0.1.0/PKG-INFO +286 -0
- flowbook_python-0.1.0/README.md +199 -0
- flowbook_python-0.1.0/examples/FlowBookTutorial.ipynb +837 -0
- flowbook_python-0.1.0/examples/GettingStarted.ipynb +409 -0
- flowbook_python-0.1.0/examples/cudf/CudfCheckpointBenchmark.ipynb +151 -0
- flowbook_python-0.1.0/examples/cudf/CudfGpuCheckpoint.ipynb +225 -0
- flowbook_python-0.1.0/examples/cudf/CudfGpuPipeline.ipynb +272 -0
- flowbook_python-0.1.0/examples/demos/01_Basic_Tracking.ipynb +227 -0
- flowbook_python-0.1.0/examples/demos/02_Wrong_Order.ipynb +298 -0
- flowbook_python-0.1.0/examples/demos/03_Rejected_Overwrite.ipynb +70 -0
- flowbook_python-0.1.0/examples/demos/04_Column_Independence.ipynb +358 -0
- flowbook_python-0.1.0/examples/demos/05_Column_Conflict.ipynb +337 -0
- flowbook_python-0.1.0/examples/demos/06_Edit_Detection.ipynb +215 -0
- flowbook_python-0.1.0/examples/demos/09_UnrecoverableMutation_Lists.ipynb +291 -0
- flowbook_python-0.1.0/examples/demos/10_UnrecoverableMutation_Arrays.ipynb +226 -0
- flowbook_python-0.1.0/examples/demos/11_UnrecoverableMutation_DataFrameColumns.ipynb +512 -0
- flowbook_python-0.1.0/examples/forecasting-sticker-sales-1st-place-solution-gkhoussa/forecasting-sticker-sales-1st-place-solution.ipynb +3253 -0
- flowbook_python-0.1.0/examples/litmus/00_NoReadAndWrite.ipynb +102 -0
- flowbook_python-0.1.0/examples/litmus/01_WriteBeforeRead.ipynb +91 -0
- flowbook_python-0.1.0/examples/litmus/02_NoReadBeforeWrite.ipynb +167 -0
- flowbook_python-0.1.0/examples/litmus/03_NoWriteAfterRead.ipynb +406 -0
- flowbook_python-0.1.0/examples/litmus/04_ForwardStale.ipynb +196 -0
- flowbook_python-0.1.0/examples/litmus/05_BackwardStale.ipynb +106 -0
- flowbook_python-0.1.0/flowbook/__init__.py +82 -0
- flowbook_python-0.1.0/flowbook/__main__.py +44 -0
- flowbook_python-0.1.0/flowbook/_version.py +4 -0
- flowbook_python-0.1.0/flowbook/baseline_kernel/__init__.py +37 -0
- flowbook_python-0.1.0/flowbook/baseline_kernel/__main__.py +13 -0
- flowbook_python-0.1.0/flowbook/baseline_kernel/baseline_kernel.py +238 -0
- flowbook_python-0.1.0/flowbook/baseline_kernel/kernelspec/kernel.json +14 -0
- flowbook_python-0.1.0/flowbook/checkpoint_kernel/__init__.py +37 -0
- flowbook_python-0.1.0/flowbook/checkpoint_kernel/__main__.py +11 -0
- flowbook_python-0.1.0/flowbook/checkpoint_kernel/checkpoint_client.py +19 -0
- flowbook_python-0.1.0/flowbook/checkpoint_kernel/checkpoint_kernel.py +123 -0
- flowbook_python-0.1.0/flowbook/checkpoint_kernel/kernelspec/kernel.json +14 -0
- flowbook_python-0.1.0/flowbook/cli/__init__.py +1 -0
- flowbook_python-0.1.0/flowbook/cli/cli.py +307 -0
- flowbook_python-0.1.0/flowbook/cli/compare_fixed_overhead.py +671 -0
- flowbook_python-0.1.0/flowbook/cli/compare_overhead.py +4930 -0
- flowbook_python-0.1.0/flowbook/cli/flowbook_timers.py +1660 -0
- flowbook_python-0.1.0/flowbook/cli/helpers.py +738 -0
- flowbook_python-0.1.0/flowbook/cli/metadata_cli.py +228 -0
- flowbook_python-0.1.0/flowbook/cli/models.py +746 -0
- flowbook_python-0.1.0/flowbook/cli/optimization_metadata.py +90 -0
- flowbook_python-0.1.0/flowbook/cli/plot_extraction.py +1137 -0
- flowbook_python-0.1.0/flowbook/cli/plot_rendering.py +1490 -0
- flowbook_python-0.1.0/flowbook/cli/show_errors.py +409 -0
- flowbook_python-0.1.0/flowbook/cli/slurm_logs.py +433 -0
- flowbook_python-0.1.0/flowbook/cli/stats_cli.py +93 -0
- flowbook_python-0.1.0/flowbook/cli/stats_display.py +140 -0
- flowbook_python-0.1.0/flowbook/cli/tests/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_flowbook_timers.py +444 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_metadata_format.py +127 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_models.py +359 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_models_v5.py +179 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_plot_extraction.py +434 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_plot_extraction_v5.py +291 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_plot_rendering.py +205 -0
- flowbook_python-0.1.0/flowbook/cli/tests/test_rerun_overhead_plots.py +163 -0
- flowbook_python-0.1.0/flowbook/handlers.py +24 -0
- flowbook_python-0.1.0/flowbook/kernel/__init__.py +40 -0
- flowbook_python-0.1.0/flowbook/kernel/__main__.py +11 -0
- flowbook_python-0.1.0/flowbook/kernel/access_events.py +143 -0
- flowbook_python-0.1.0/flowbook/kernel/change_detector.py +537 -0
- flowbook_python-0.1.0/flowbook/kernel/changes.py +229 -0
- flowbook_python-0.1.0/flowbook/kernel/flowbook_client.py +76 -0
- flowbook_python-0.1.0/flowbook/kernel/flowbook_kernel.py +1875 -0
- flowbook_python-0.1.0/flowbook/kernel/kernelspec/kernel.json +8 -0
- flowbook_python-0.1.0/flowbook/kernel/kernelspec/logo-32x32.png +0 -0
- flowbook_python-0.1.0/flowbook/kernel/kernelspec/logo-64x64.png +0 -0
- flowbook_python-0.1.0/flowbook/kernel/loc_ids.py +181 -0
- flowbook_python-0.1.0/flowbook/kernel/locations.py +792 -0
- flowbook_python-0.1.0/flowbook/kernel/models.py +370 -0
- flowbook_python-0.1.0/flowbook/kernel/notebook_state.py +618 -0
- flowbook_python-0.1.0/flowbook/kernel/protocol.py +201 -0
- flowbook_python-0.1.0/flowbook/kernel/reproducibility_enforcer.py +2778 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/LITMUS_TESTS.yaml +1586 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/conftest.py +170 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/generate_litmus_latex.py +131 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/litmus_helpers.py +121 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/litmus_output/litmus_tests.tex +906 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/litmus_output/litmus_tests.txt +499 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_alias_conflicts.py +509 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_change_detector_locs.py +160 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_column_staleness_bug.py +724 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_converters.py +517 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_delete_transitions.py +1435 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_flowbook_kernel.py +271 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_forward_dependency.py +1532 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_implementation_theory_diff.py +630 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_inplace_mutation_detection.py +313 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_litmus.py +773 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_loc_ids.py +474 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_locations.py +827 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_locset_integration.py +770 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_measure_rerun_overhead.py +239 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_notebook_state.py +1137 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_optimizations.py +482 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_order_changes.py +414 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_order_changes_complex.py +806 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_paper_examples.py +1807 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_protocol.py +223 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_provenance.py +164 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_removed_writes_scenario.py +212 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_reproducibility_enforcer.py +6405 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_reproducibility_structural.py +676 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_typed_predicates.py +607 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_uncopyable_as_write.py +256 -0
- flowbook_python-0.1.0/flowbook/kernel/tests/test_var_binding_semantics.py +496 -0
- flowbook_python-0.1.0/flowbook/kernel_discovery.py +134 -0
- flowbook_python-0.1.0/flowbook/kernel_support/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/kernel_support/ast_utils.py +56 -0
- flowbook_python-0.1.0/flowbook/kernel_support/base_client.py +79 -0
- flowbook_python-0.1.0/flowbook/kernel_support/base_kernel.py +284 -0
- flowbook_python-0.1.0/flowbook/kernel_support/checkpoint.py +238 -0
- flowbook_python-0.1.0/flowbook/kernel_support/checkpoint.py.backup +1749 -0
- flowbook_python-0.1.0/flowbook/kernel_support/column_provenance.py +196 -0
- flowbook_python-0.1.0/flowbook/kernel_support/column_tracking.py +1304 -0
- flowbook_python-0.1.0/flowbook/kernel_support/cudf_compat.py +1683 -0
- flowbook_python-0.1.0/flowbook/kernel_support/deepcopy.py +2934 -0
- flowbook_python-0.1.0/flowbook/kernel_support/deepcopyable.py +445 -0
- flowbook_python-0.1.0/flowbook/kernel_support/df_subset_detector.py +454 -0
- flowbook_python-0.1.0/flowbook/kernel_support/diff.py +4203 -0
- flowbook_python-0.1.0/flowbook/kernel_support/display_helpers.py +74 -0
- flowbook_python-0.1.0/flowbook/kernel_support/experimental_client.py +207 -0
- flowbook_python-0.1.0/flowbook/kernel_support/extended_types.py +354 -0
- flowbook_python-0.1.0/flowbook/kernel_support/file_checkpoint.py +296 -0
- flowbook_python-0.1.0/flowbook/kernel_support/fs_magics.py +103 -0
- flowbook_python-0.1.0/flowbook/kernel_support/heap_size.py +1096 -0
- flowbook_python-0.1.0/flowbook/kernel_support/install.py +50 -0
- flowbook_python-0.1.0/flowbook/kernel_support/json_utils.py +46 -0
- flowbook_python-0.1.0/flowbook/kernel_support/kernelspec/kernel.json +11 -0
- flowbook_python-0.1.0/flowbook/kernel_support/kernelspec/logo-64x64.png +0 -0
- flowbook_python-0.1.0/flowbook/kernel_support/locals.py +215 -0
- flowbook_python-0.1.0/flowbook/kernel_support/memory_checkpoint.py +3419 -0
- flowbook_python-0.1.0/flowbook/kernel_support/models.py +342 -0
- flowbook_python-0.1.0/flowbook/kernel_support/opaque.py +1156 -0
- flowbook_python-0.1.0/flowbook/kernel_support/process_cleanup.py +105 -0
- flowbook_python-0.1.0/flowbook/kernel_support/structural_tracking.py +1025 -0
- flowbook_python-0.1.0/flowbook/kernel_support/test_diff.py.backup +1289 -0
- flowbook_python-0.1.0/flowbook/kernel_support/test_subset_perf.py +327 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_ast_utils.py +105 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_base_kernel.py +89 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_catboost_pool.py +207 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_chained_assignment_error.py +214 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint.py +141 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_comprehensive.py +564 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_corner_cases.py +996 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_df_subsets.py +461 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_functions.py +688 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_memory_sharing.py +870 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_nested.py +436 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_object_conversion.py +287 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_checkpoint_overhead_measurement.py +319 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_column_provenance.py +482 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_column_tracking.py +670 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_column_tracking_methods.py +144 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_column_tracking_performance.py +357 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_column_tracking_setitem_reads.py +244 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_cudf_checkpoint_dedup.py +267 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_cudf_checkpoint_inflation.py +271 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_cudf_checkpoint_perf.py +353 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_cudf_compat.py +1671 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_cudf_proxy_fingerprint.py +304 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_cudf_proxy_tracking.py +286 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_deep_alias_detection.py +878 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_deepcopy_preserve_mode.py +216 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_deepcopyable.py +677 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_df_subset_detector.py +976 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_diff_byte_comparison.py +464 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_diff_deep.py +1066 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_diff_float_tolerance.py +237 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_diff_keras_catboost.py +578 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_diff_object_float.py +244 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_diff_structural.py +601 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_display_helpers.py +66 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_extended_types_coverage.py +403 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_extension_dtypes.py +292 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_file_checkpoint.py +195 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_file_checkpoint_comprehensive.py +758 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_fs_magics.py +175 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_gpu_checkpoint.py +396 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_gpu_checkpoint_perf.py +458 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_heap_size.py +1695 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_incremental_checkpoint.py +236 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_index_deepcopy.py +676 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_install.py +63 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_json_utils.py +166 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_keras_deepcopy.py +209 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_large_list_cache.py +1030 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_lightgbm_checkpoint.py +449 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_locals.py +346 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_memory_checkpoint.py +2127 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_memory_checkpoint_comprehensive.py +1222 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_memory_checkpoint_edge_cases.py +606 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_memory_snapshot.py +199 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_models.py +273 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_models_coverage.py +291 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_multiindex.py +362 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_opaque_keras.py +429 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_opaque_keras_gaps.py +780 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_opaque_ml.py +295 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_opaque_pytorch.py +405 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_opaque_pytorch_gaps.py +404 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_pytorch_diff.py +249 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_shap_checkpoint.py +364 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_stacking_checkpoint.py +562 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_structural_tracking.py +2144 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_targetencoder_checkpoint.py +413 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_timeout_handler.py +131 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_tracking.py +695 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_tracking_coverage.py +209 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_tracking_performance.py +265 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_tracking_structural.py +709 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_types_coverage.py +511 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tests/test_virtual_fs.py +2151 -0
- flowbook_python-0.1.0/flowbook/kernel_support/timeout_handler.py +94 -0
- flowbook_python-0.1.0/flowbook/kernel_support/tracking.py +448 -0
- flowbook_python-0.1.0/flowbook/kernel_support/types.py +623 -0
- flowbook_python-0.1.0/flowbook/kernel_support/virtual_fs.py +932 -0
- flowbook_python-0.1.0/flowbook/labextension/package.json +239 -0
- flowbook_python-0.1.0/flowbook/labextension/schemas/flowbook/package.json.orig +234 -0
- flowbook_python-0.1.0/flowbook/labextension/schemas/flowbook/plugin.json +18 -0
- flowbook_python-0.1.0/flowbook/labextension/static/728.b1df4bca1a3305d0d0a7.js +1 -0
- flowbook_python-0.1.0/flowbook/labextension/static/873.3ca7ae352f965bccc339.js +1 -0
- flowbook_python-0.1.0/flowbook/labextension/static/905.94c2bfb401597cc2a103.js +2 -0
- flowbook_python-0.1.0/flowbook/labextension/static/905.94c2bfb401597cc2a103.js.LICENSE.txt +29 -0
- flowbook_python-0.1.0/flowbook/labextension/static/951.ba84389925d6a0676e79.js +1 -0
- flowbook_python-0.1.0/flowbook/labextension/static/remoteEntry.97e370ce33befeb5451b.js +1 -0
- flowbook_python-0.1.0/flowbook/labextension/static/style.js +4 -0
- flowbook_python-0.1.0/flowbook/labextension/static/third-party-licenses.json +112 -0
- flowbook_python-0.1.0/flowbook/mcp/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/mcp/jupyter_config.py +102 -0
- flowbook_python-0.1.0/flowbook/mcp/server.py +906 -0
- flowbook_python-0.1.0/flowbook/mcp/session.py +1499 -0
- flowbook_python-0.1.0/flowbook/mcp/tests/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/mcp/tests/test_cell_validation.py +60 -0
- flowbook_python-0.1.0/flowbook/mcp/tests/test_contents_api.py +417 -0
- flowbook_python-0.1.0/flowbook/mcp/tests/test_jupyter_config.py +51 -0
- flowbook_python-0.1.0/flowbook/mcp/tests/test_kernel_discovery.py +153 -0
- flowbook_python-0.1.0/flowbook/mcp/tests/test_new_tools.py +556 -0
- flowbook_python-0.1.0/flowbook/mcp/tests/test_session_integration.py +204 -0
- flowbook_python-0.1.0/flowbook/nbi/MANUAL_UI_TESTS.md +255 -0
- flowbook_python-0.1.0/flowbook/nbi/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/nbi/claude_commands/flowbook-nb-fix.md +225 -0
- flowbook_python-0.1.0/flowbook/nbi/extension.py +122 -0
- flowbook_python-0.1.0/flowbook/nbi/extension_data/extension.json +1 -0
- flowbook_python-0.1.0/flowbook/nbi/session.py +140 -0
- flowbook_python-0.1.0/flowbook/nbi/tests/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/nbi/tests/test_extension.py +105 -0
- flowbook_python-0.1.0/flowbook/nbi/tests/test_session.py +194 -0
- flowbook_python-0.1.0/flowbook/nbi/tests/test_tools.py +806 -0
- flowbook_python-0.1.0/flowbook/nbi/tools.py +682 -0
- flowbook_python-0.1.0/flowbook/scripts/__init__.py +16 -0
- flowbook_python-0.1.0/flowbook/scripts/fix_repro_errors.py +963 -0
- flowbook_python-0.1.0/flowbook/scripts/parse_repro_errors.py +216 -0
- flowbook_python-0.1.0/flowbook/scripts/tests/__init__.py +1 -0
- flowbook_python-0.1.0/flowbook/scripts/tests/test_fix_repro_errors.py +546 -0
- flowbook_python-0.1.0/flowbook/server/__init__.py +53 -0
- flowbook_python-0.1.0/flowbook/server/base.py +178 -0
- flowbook_python-0.1.0/flowbook/server/comm_models.py +21 -0
- flowbook_python-0.1.0/flowbook/server/commands/__init__.py +13 -0
- flowbook_python-0.1.0/flowbook/server/commands/compare_baseline.py +3061 -0
- flowbook_python-0.1.0/flowbook/server/commands/execute.py +308 -0
- flowbook_python-0.1.0/flowbook/server/commands/execute_base.py +241 -0
- flowbook_python-0.1.0/flowbook/server/commands/tests/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/server/commands/tests/test_compare_baseline_rerun.py +166 -0
- flowbook_python-0.1.0/flowbook/server/commands/tests/test_compare_baseline_v3.py +167 -0
- flowbook_python-0.1.0/flowbook/server/commands/tests/test_downsample_csv.py +256 -0
- flowbook_python-0.1.0/flowbook/server/commands/tests/test_flowbook_update_protocol.py +684 -0
- flowbook_python-0.1.0/flowbook/server/commands/tests/test_gpu_memory_recording.py +483 -0
- flowbook_python-0.1.0/flowbook/server/commands/tests/test_path_traversal.py +87 -0
- flowbook_python-0.1.0/flowbook/server/handlers.py +296 -0
- flowbook_python-0.1.0/flowbook/server/kernel_helper.py +233 -0
- flowbook_python-0.1.0/flowbook/server/kernel_manager.py +86 -0
- flowbook_python-0.1.0/flowbook/server/registry.py +73 -0
- flowbook_python-0.1.0/flowbook/server/tests/__init__.py +0 -0
- flowbook_python-0.1.0/flowbook/server/tests/test_handlers.py +140 -0
- flowbook_python-0.1.0/flowbook/server/tests/test_kernel_manager.py +116 -0
- flowbook_python-0.1.0/flowbook/slurm/slurm_cli.py +1355 -0
- flowbook_python-0.1.0/flowbook/testing/README.md +423 -0
- flowbook_python-0.1.0/flowbook/testing/__init__.py +23 -0
- flowbook_python-0.1.0/flowbook/testing/benchmark_checkpoint.py +1240 -0
- flowbook_python-0.1.0/flowbook/testing/checkpoint_overhead_test.py +728 -0
- flowbook_python-0.1.0/flowbook/testing/correctness.py +221 -0
- flowbook_python-0.1.0/flowbook/testing/kernel_comparison.py +469 -0
- flowbook_python-0.1.0/flowbook/testing/notebook_loader.py +63 -0
- flowbook_python-0.1.0/flowbook/testing/notebooks/data_pipeline.ipynb +185 -0
- flowbook_python-0.1.0/flowbook/testing/notebooks/dependencies.ipynb +106 -0
- flowbook_python-0.1.0/flowbook/testing/notebooks/deterministic.ipynb +91 -0
- flowbook_python-0.1.0/flowbook/testing/notebooks/ml_workflow.ipynb +194 -0
- flowbook_python-0.1.0/flowbook/testing/notebooks/multi_dataframe.ipynb +167 -0
- flowbook_python-0.1.0/flowbook/testing/notebooks/nondeterministic.ipynb +75 -0
- flowbook_python-0.1.0/flowbook/testing/notebooks/pandas_heavy.ipynb +126 -0
- flowbook_python-0.1.0/flowbook/testing/performance.py +624 -0
- flowbook_python-0.1.0/flowbook/testing/plot_checkpoint_percentage.py +117 -0
- flowbook_python-0.1.0/flowbook/testing/plot_checkpoint_timings.py +125 -0
- flowbook_python-0.1.0/flowbook/testing/results.py +178 -0
- flowbook_python-0.1.0/flowbook/testing/runner.py +332 -0
- flowbook_python-0.1.0/flowbook/testing/scripts/__init__.py +1 -0
- flowbook_python-0.1.0/flowbook/testing/scripts/run_correctness.py +158 -0
- flowbook_python-0.1.0/flowbook/testing/scripts/run_kernel_comparison.py +96 -0
- flowbook_python-0.1.0/flowbook/testing/scripts/run_performance.py +171 -0
- flowbook_python-0.1.0/flowbook/util/cell_ids.py +205 -0
- flowbook_python-0.1.0/flowbook/util/cell_index.py +134 -0
- flowbook_python-0.1.0/flowbook/util/dependencies.py +1517 -0
- flowbook_python-0.1.0/flowbook/util/flowbook_metadata.py +380 -0
- flowbook_python-0.1.0/flowbook/util/gpu_memory.py +98 -0
- flowbook_python-0.1.0/flowbook/util/kernel_installer.py +165 -0
- flowbook_python-0.1.0/flowbook/util/liveness.py +590 -0
- flowbook_python-0.1.0/flowbook/util/metadata_extractor.py +50 -0
- flowbook_python-0.1.0/flowbook/util/model_copy.py +218 -0
- flowbook_python-0.1.0/flowbook/util/nb_diff.py +98 -0
- flowbook_python-0.1.0/flowbook/util/notebook_analysis.py +306 -0
- flowbook_python-0.1.0/flowbook/util/notebook_to_python.py +159 -0
- flowbook_python-0.1.0/flowbook/util/output.py +392 -0
- flowbook_python-0.1.0/flowbook/util/prompts.py +97 -0
- flowbook_python-0.1.0/flowbook/util/prompts.yaml +437 -0
- flowbook_python-0.1.0/flowbook/util/tests/__init__.py +1 -0
- flowbook_python-0.1.0/flowbook/util/tests/test_cell_ids.py +303 -0
- flowbook_python-0.1.0/flowbook/util/tests/test_cell_index.py +131 -0
- flowbook_python-0.1.0/flowbook/util/tests/test_dependencies.py +1962 -0
- flowbook_python-0.1.0/flowbook/util/tests/test_gpu_memory.py +202 -0
- flowbook_python-0.1.0/flowbook/util/tests/test_liveness.py +740 -0
- flowbook_python-0.1.0/flowbook/util/tests/test_notebook_analysis.py +440 -0
- flowbook_python-0.1.0/flowbook/util/tests/test_notebook_to_python.py +366 -0
- flowbook_python-0.1.0/flowbook/util/text.py +384 -0
- flowbook_python-0.1.0/install.json +5 -0
- flowbook_python-0.1.0/jupyter-config/server-config/flowbook.json +7 -0
- flowbook_python-0.1.0/media/flowbook-small.png +0 -0
- flowbook_python-0.1.0/media/flowbook.png +0 -0
- flowbook_python-0.1.0/package.json +234 -0
- flowbook_python-0.1.0/pyproject.toml +135 -0
- flowbook_python-0.1.0/schema/plugin.json +18 -0
- flowbook_python-0.1.0/scripts/run-base.sh +5 -0
- flowbook_python-0.1.0/scripts/run-flowbook.sh +5 -0
- flowbook_python-0.1.0/scripts/timers.sh +3 -0
- flowbook_python-0.1.0/scripts/type_coverage.py +169 -0
- flowbook_python-0.1.0/setup.py +1 -0
- flowbook_python-0.1.0/src/_archived/api.ts +47 -0
- flowbook_python-0.1.0/src/_archived/executiondialog.tsx +353 -0
- flowbook_python-0.1.0/src/_archived/experimental/cellhighlighter.ts +162 -0
- flowbook_python-0.1.0/src/_archived/experimental/celltoolbar.ts +92 -0
- flowbook_python-0.1.0/src/_archived/experimental/executionhook.ts +296 -0
- flowbook_python-0.1.0/src/_archived/experimental/history.ts +708 -0
- flowbook_python-0.1.0/src/_archived/experimental/historypanel.tsx +239 -0
- flowbook_python-0.1.0/src/_archived/experimental/manager.ts +336 -0
- flowbook_python-0.1.0/src/_archived/experimental/metadatapanel.tsx +567 -0
- flowbook_python-0.1.0/src/_archived/experimental/plugin.ts +249 -0
- flowbook_python-0.1.0/src/_archived/experimental/toolbar.ts +158 -0
- flowbook_python-0.1.0/src/_archived/experimental/types.ts +249 -0
- flowbook_python-0.1.0/src/_archived/experimental/unittestpanel.tsx +300 -0
- flowbook_python-0.1.0/src/_archived/experimental/unittesttracker.ts +67 -0
- flowbook_python-0.1.0/src/_archived/kernel.ts +69 -0
- flowbook_python-0.1.0/src/_archived/logpanel.tsx +226 -0
- flowbook_python-0.1.0/src/_archived/messagecomponents.tsx +83 -0
- flowbook_python-0.1.0/src/cellindex.ts +222 -0
- flowbook_python-0.1.0/src/cellindexutils.ts +147 -0
- flowbook_python-0.1.0/src/flowbook/cellhighlighter.ts +445 -0
- flowbook_python-0.1.0/src/flowbook/dependenciespanel.tsx +572 -0
- flowbook_python-0.1.0/src/flowbook/executionhook.ts +743 -0
- flowbook_python-0.1.0/src/flowbook/metadatapanel.tsx +581 -0
- flowbook_python-0.1.0/src/flowbook/nbibridge.ts +682 -0
- flowbook_python-0.1.0/src/flowbook/plugin.ts +384 -0
- flowbook_python-0.1.0/src/flowbook/protocol.ts +95 -0
- flowbook_python-0.1.0/src/flowbook/stalenessmanager.ts +155 -0
- flowbook_python-0.1.0/src/flowbook/stalenessnotice.ts +322 -0
- flowbook_python-0.1.0/src/flowbook/toolbar.ts +158 -0
- flowbook_python-0.1.0/src/flowbook/types.ts +341 -0
- flowbook_python-0.1.0/src/flowbook/violationnotice.ts +214 -0
- flowbook_python-0.1.0/src/handler.ts +38 -0
- flowbook_python-0.1.0/src/index.ts +14 -0
- flowbook_python-0.1.0/src/shared/kerneldetection.ts +97 -0
- flowbook_python-0.1.0/src/shared/types.ts +33 -0
- flowbook_python-0.1.0/style/_archived.css +922 -0
- flowbook_python-0.1.0/style/base.css +326 -0
- flowbook_python-0.1.0/style/dependencies.css +43 -0
- flowbook_python-0.1.0/style/index.css +2 -0
- flowbook_python-0.1.0/style/index.js +1 -0
- flowbook_python-0.1.0/tests/demo_diffresult_basemodel.py +177 -0
- flowbook_python-0.1.0/tests/demo_markdown.py +118 -0
- flowbook_python-0.1.0/tests/demo_test_comm_structure.py +84 -0
- flowbook_python-0.1.0/tests/test_comm_simple.py +64 -0
- flowbook_python-0.1.0/tests/test_diff_refactor.py +168 -0
- flowbook_python-0.1.0/tests/test_diffresult_serialization.py +215 -0
- flowbook_python-0.1.0/tests/test_dtype_compatibility.py +113 -0
- flowbook_python-0.1.0/tests/test_env_in_optimize.py +95 -0
- flowbook_python-0.1.0/tests/test_execution_error_capture.py +539 -0
- flowbook_python-0.1.0/tests/test_generate_with_env.py +133 -0
- flowbook_python-0.1.0/tests/test_markdown_format.py +214 -0
- flowbook_python-0.1.0/tests/test_notebook.ipynb +28 -0
- flowbook_python-0.1.0/tests/test_optimize_command.py +460 -0
- flowbook_python-0.1.0/tests/test_optimize_fixes.py +267 -0
- flowbook_python-0.1.0/tests/test_optimize_keys_to_include.py +214 -0
- flowbook_python-0.1.0/tests/test_optimize_validation.py +211 -0
- flowbook_python-0.1.0/tests/test_partial_execution_bug.py +431 -0
- flowbook_python-0.1.0/tests/test_progress_messages.py +113 -0
- flowbook_python-0.1.0/tests/test_prompt_integration.py +86 -0
- flowbook_python-0.1.0/tests/test_real_kernel_message_capture.py +291 -0
- flowbook_python-0.1.0/tests/test_test_comm.py +82 -0
- flowbook_python-0.1.0/tests/test_validate_change_command.py +217 -0
- flowbook_python-0.1.0/tsconfig.json +26 -0
- flowbook_python-0.1.0/yarn.lock +7747 -0
- flowbook_python-0.0.1/PKG-INFO +0 -61
- flowbook_python-0.0.1/README.md +0 -11
- flowbook_python-0.0.1/flowbook/__init__.py +0 -31
- flowbook_python-0.0.1/pyproject.toml +0 -40
- {flowbook_python-0.0.1 → flowbook_python-0.1.0}/LICENSE +0 -0
@@ -0,0 +1,196 @@
+---
+name: reproducibility-fixer
+description: "Use this agent when the user wants to fix reproducibility errors in a Jupyter notebook. This includes situations where the user mentions flowbook errors, reproducibility violations, stale cells, variable reuse issues, or wants to make a notebook reproducible. The agent runs the notebook through flowbook, analyzes the embedded reproducibility metadata, fixes violations (typically through alpha-renaming reused variables), and produces a detailed fix report.\\n\\nExamples:\\n\\n<example>\\nContext: User has a notebook with reproducibility issues they want fixed.\\nuser: \"My notebook analysis.ipynb has some flowbook errors, can you fix them?\"\\nassistant: \"I'll use the reproducibility-fixer agent to analyze and fix the flowbook reproducibility errors in your notebook.\"\\n<Task tool call to launch reproducibility-fixer agent>\\n</example>\\n\\n<example>\\nContext: User wants to make their notebook reproducible before sharing.\\nuser: \"I need to clean up experiment.ipynb - it has some variable reuse issues that cause reproducibility problems\"\\nassistant: \"Let me use the reproducibility-fixer agent to identify and fix the variable reuse issues in your notebook.\"\\n<Task tool call to launch reproducibility-fixer agent>\\n</example>\\n\\n<example>\\nContext: User mentions stale cells or backward violations.\\nuser: \"Flowbook is showing backward violations in my data_processing.ipynb notebook\"\\nassistant: \"I'll launch the reproducibility-fixer agent to analyze the backward violations and fix them, typically by renaming variables that are being reused inappropriately.\"\\n<Task tool call to launch reproducibility-fixer agent>\\n</example>"
+model: opus
+color: orange
+memory: project
+---
+
+You are an expert Jupyter notebook reproducibility engineer specializing in FlowBook's reproducibility enforcement system. Your deep understanding of data flow analysis, variable scoping, and computational reproducibility allows you to diagnose and fix notebooks that have reproducibility violations.
+
+## Your Mission
+
+Fix reproducibility errors in Jupyter notebooks by:
+
+1. Running the notebook through FlowBook to identify violations
+2. Analyzing the embedded reproducibility metadata
+3. Applying targeted fixes (primarily alpha-renaming reused variables)
+4. Producing a detailed fix report
+
+## Understanding FlowBook Reproducibility Errors
+
+FlowBook tracks variable reads and writes across cells to ensure notebooks are reproducible. Key violation types:
+
+- **Backward Conflict (BackConflict)**: A cell wrote to a variable that was read by an earlier fresh cell. This breaks reproducibility because re-running cells out of order would produce different results.
+- **Forward Contamination (FwdContaminated)**: A cell read a variable that was written by a later-executed cell. The cell is marked stale.
+- **Staleness (StaleFwd)**: A cell wrote to a variable read by a later fresh cell, making that later cell stale.
|
|
27
|
+
|
|
28
|
+
The metadata in cell outputs contains:
|
|
29
|
+
|
|
30
|
+
- `reads`: Variables read by the cell
|
|
31
|
+
- `writes`: Variables written by the cell
|
|
32
|
+
- `violation`: Details about any reproducibility violation
|
|
33
|
+
- `stale_cells`: List of cells that became stale
|
|
34
|
+
- `changed_variables`: Variables that changed value
|
|
35
|
+
|
|
36
|
+
## Workflow
|
|
37
|
+
|
|
38
|
+
### Step 1: Run the Notebook Through FlowBook
|
|
39
|
+
|
|
40
|
+
Use the FlowBook CLI to execute the notebook:
|
|
41
|
+
|
|
42
|
+
```bash
|
|
43
|
+
flowbook execute --output=input-fixed.ipynb input.ipynb
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
This executes all cells and embeds reproducibility metadata in the outputs.
|
|
47
|
+
|
|
48
|
+
### Step 2: Analyze the Fixed Notebook
|
|
49
|
+
|
|
50
|
+
Read the output notebook and examine each cell's outputs for `flowbook` metadata. Look for:
|
|
51
|
+
|
|
52
|
+
- Cells with `violation` fields (these are the errors to fix)
|
|
53
|
+
- Cells listed in `stale_cells` arrays
|
|
54
|
+
- Patterns of variable reuse across cells
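
The scan described in this step can be sketched in Python. This is a minimal illustration, not FlowBook's own tooling: it assumes the `flowbook` metadata sits under each output's `metadata` key, and the helper name `find_flowbook_issues` and the shape of the sample `violation` object are hypothetical.

```python
from typing import Any


def find_flowbook_issues(nb: dict[str, Any]) -> dict[str, list]:
    """Collect violations and stale cell IDs from a notebook dict."""
    violations = []
    stale = set()
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            # Assumption: FlowBook metadata lives at output["metadata"]["flowbook"].
            fb = output.get("metadata", {}).get("flowbook")
            if not fb:
                continue
            if fb.get("violation"):
                violations.append(
                    {"cell_id": cell.get("id", ""), "violation": fb["violation"]}
                )
            stale.update(fb.get("stale_cells", []))
    return {"violations": violations, "stale_cells": sorted(stale)}


# Hypothetical in-memory notebook; the violation payload is made up.
sample = {
    "cells": [
        {"cell_type": "markdown", "source": "# Title"},
        {
            "cell_type": "code",
            "id": "a1",
            "outputs": [
                {"output_type": "display_data",
                 "metadata": {"flowbook": {"reads": [], "writes": ["df"]}}}
            ],
        },
        {
            "cell_type": "code",
            "id": "b2",
            "outputs": [
                {"output_type": "display_data",
                 "metadata": {"flowbook": {
                     "reads": ["df"],
                     "writes": ["df"],
                     "violation": {"type": "BackConflict"},
                     "stale_cells": ["a1"],
                 }}}
            ],
        },
    ]
}
issues = find_flowbook_issues(sample)
```

For a notebook on disk, the same function can be applied to the result of `json.load` on the `.ipynb` file.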
|
|
55
|
+
|
|
56
|
+
### Step 3: Fix Each Violation
|
|
57
|
+
|
|
58
|
+
**Primary Fix Strategy - Alpha Renaming:**
|
|
59
|
+
When a variable is reused (written in multiple cells), rename subsequent uses to unique names:
|
|
60
|
+
|
|
61
|
+
- Original: `df = pd.read_csv('data.csv')` ... later ... `df = pd.read_csv('other.csv')`
|
|
62
|
+
- Fixed: `df = pd.read_csv('data.csv')` ... later ... `df2 = pd.read_csv('other.csv')`
|
|
63
|
+
|
|
64
|
+
Update ALL subsequent references to use the new name.
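
A rough sketch of the rename step, assuming each cell's source is available as a plain string. The helper `alpha_rename` is hypothetical and regex-based: it skips attribute accesses such as `obj.df`, but it does not understand string literals, comments, or keyword arguments, so the result still needs a manual review.

```python
import re


def alpha_rename(cells: list[str], start: int, old: str, new: str) -> list[str]:
    """Rename `old` to `new` in every cell from index `start` onward."""
    # Match `old` only as a standalone name: not part of a longer identifier
    # and not an attribute access such as `obj.df`.
    pattern = re.compile(rf"(?<![\w.]){re.escape(old)}(?!\w)")
    return [
        pattern.sub(new, src) if i >= start else src
        for i, src in enumerate(cells)
    ]


cells = [
    "df = pd.read_csv('data.csv')",
    "print(df.shape)",
    "df = pd.read_csv('other.csv')",
    "df.head()",
]
# The second write to `df` is in cell 2, so renaming starts there.
renamed = alpha_rename(cells, start=2, old="df", new="df2")
```

Cells before `start` are returned unchanged, which keeps the first definition and its readers intact.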
|
|
65
|
+
|
|
66
|
+
**Alternative Fixes (when alpha-renaming is insufficient):**
|
|
67
|
+
|
|
68
|
+
- Reorder cells if the logical flow allows
|
|
69
|
+
- Split cells that do too much
|
|
70
|
+
- Introduce intermediate variables to break dependency chains
|
|
71
|
+
- Use `del variable` to explicitly release a variable before reuse
|
|
72
|
+
|
|
73
|
+
### Step 4: Write the Fix Report
|
|
74
|
+
|
|
75
|
+
Create a report file (e.g., `input-fixes.txt`) documenting:
|
|
76
|
+
|
|
77
|
+
- Each violation found (cell ID, violation type, variables involved)
|
|
78
|
+
- The fix applied
|
|
79
|
+
- Any notes about why that fix was chosen
|
|
80
|
+
|
|
81
|
+
### Step 5: Run FlowBook Again
|
|
82
|
+
|
|
83
|
+
After applying fixes, run the notebook through FlowBook again to confirm that all violations are resolved and the notebook is now reproducible.
|
|
84
|
+
|
|
85
|
+
## Fix Report Format
|
|
86
|
+
|
|
87
|
+
```
|
|
88
|
+
FlowBook Reproducibility Fix Report
|
|
89
|
+
Notebook: {original_filename}
|
|
90
|
+
Generated: {timestamp}
|
|
91
|
+
|
|
92
|
+
=== Summary ===
|
|
93
|
+
Total violations found: N
|
|
94
|
+
Violations fixed: N
|
|
95
|
+
|
|
96
|
+
=== Violation 1 ===
|
|
97
|
+
Cell ID: {cell_id}
|
|
98
|
+
Violation Type: {BackConflict|FwdContaminated|etc}
|
|
99
|
+
Variable(s): {variable_names}
|
|
100
|
+
Description: {what was wrong}
|
|
101
|
+
Fix Applied: {what was changed}
|
|
102
|
+
Details: {specific code changes made}
|
|
103
|
+
|
|
104
|
+
=== Violation 2 ===
|
|
105
|
+
...
|
|
106
|
+
|
|
107
|
+
=== Notes ===
|
|
108
|
+
{any additional observations or recommendations}
|
|
109
|
+
```
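
One possible way to fill in this template from Python, shown only as a sketch. The function name `render_fix_report` and the dictionary keys (`cell_id`, `type`, `variables`, `description`, `fix`, `details`, `fixed`) are assumptions chosen for the example, not a format FlowBook defines.

```python
def render_fix_report(
    notebook: str, timestamp: str, violations: list[dict], notes: str = ""
) -> str:
    """Render the fix report template as plain text."""
    lines = [
        "FlowBook Reproducibility Fix Report",
        f"Notebook: {notebook}",
        f"Generated: {timestamp}",
        "",
        "=== Summary ===",
        f"Total violations found: {len(violations)}",
        f"Violations fixed: {sum(1 for v in violations if v.get('fixed'))}",
    ]
    for n, v in enumerate(violations, start=1):
        lines += [
            "",
            f"=== Violation {n} ===",
            f"Cell ID: {v['cell_id']}",
            f"Violation Type: {v['type']}",
            f"Variable(s): {', '.join(v['variables'])}",
            f"Description: {v['description']}",
            f"Fix Applied: {v['fix']}",
            f"Details: {v['details']}",
        ]
    if notes:
        lines += ["", "=== Notes ===", notes]
    return "\n".join(lines) + "\n"


report = render_fix_report(
    "analysis.ipynb",
    "2024-01-01T00:00:00",
    [
        {
            "cell_id": "b2",
            "type": "BackConflict",
            "variables": ["results"],
            "description": "'results' rebound after an earlier cell read it",
            "fix": "alpha-rename",
            "details": "results -> results2 in the later cell",
            "fixed": True,
        }
    ],
)
```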
|
|
110
|
+
|
|
111
|
+
## Important Guidelines
|
|
112
|
+
|
|
113
|
+
1. **Preserve Functionality**: Fixes must not change what the notebook computes, only how variables are named/scoped.
|
|
114
|
+
|
|
115
|
+
2. **Minimal Changes**: Prefer the smallest change that fixes the violation. Alpha-renaming is usually sufficient.
|
|
116
|
+
|
|
117
|
+
3. **Consistent Naming**: When renaming, use clear suffixes like `df2`, `df_processed`, `model_v2` that indicate the variable's purpose.
|
|
118
|
+
|
|
119
|
+
4. **Update All References**: After renaming a variable, find and update ALL subsequent uses of that variable in the notebook.
|
|
120
|
+
|
|
121
|
+
5. **Re-verify**: After making fixes, run FlowBook again to confirm that all violations are resolved (Step 5 of the workflow).
|
|
122
|
+
|
|
123
|
+
6. **File Naming Convention**:
   - Input: `name.ipynb`
   - Fixed notebook: `name-fixed.ipynb`
   - Fix report: `name-fixes.txt`
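
The naming convention can be derived mechanically from the input path. A small sketch using `pathlib`; the helper name `output_paths` is hypothetical.

```python
from pathlib import Path


def output_paths(notebook: str) -> tuple[Path, Path]:
    """Return (fixed notebook path, fix report path) next to the input."""
    src = Path(notebook)
    fixed = src.with_name(f"{src.stem}-fixed.ipynb")
    report = src.with_name(f"{src.stem}-fixes.txt")
    return fixed, report


fixed, report = output_paths("work/analysis.ipynb")
```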
|
|
127
|
+
|
|
128
|
+
## Example Fix
|
|
129
|
+
|
|
130
|
+
**Before (violation - 'results' reused):**
|
|
131
|
+
|
|
132
|
+
```python
|
|
133
|
+
# Cell 1
|
|
134
|
+
results = model.fit(X_train)
|
|
135
|
+
print(results.score)
|
|
136
|
+
|
|
137
|
+
# Cell 5 (later)
|
|
138
|
+
results = model2.fit(X_test)  # BackConflict: 'results' was read by cell 1
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
**After (fixed via alpha-rename):**
|
|
142
|
+
|
|
143
|
+
```python
|
|
144
|
+
# Cell 1
|
|
145
|
+
results = model.fit(X_train)
|
|
146
|
+
print(results.score)
|
|
147
|
+
|
|
148
|
+
# Cell 5
|
|
149
|
+
results2 = model2.fit(X_test) # No conflict - different variable
|
|
150
|
+
```
|
|
151
|
+
|
|
152
|
+
## Error Handling
|
|
153
|
+
|
|
154
|
+
- If flowbook CLI fails, check that the notebook is valid JSON and cells are properly formatted
|
|
155
|
+
- If a violation cannot be fixed with simple renaming, document why and suggest manual intervention
|
|
156
|
+
- If the notebook has syntax errors, fix those first before addressing reproducibility
|
|
157
|
+
|
|
158
|
+
You have the expertise to make notebooks reproducible. Approach each violation methodically, apply the minimum fix needed, and document everything clearly.
|
|
159
|
+
|
|
160
|
+
# Persistent Agent Memory
|
|
161
|
+
|
|
162
|
+
You have a Persistent Agent Memory directory at `/Users/freund/other/FlowBook/.claude/agent-memory/reproducibility-fixer/`. Its contents persist across conversations.
|
|
163
|
+
|
|
164
|
+
As you work, consult your memory files to build on previous experience. When you encounter a mistake that seems like it could be common, check your Persistent Agent Memory for relevant notes — and if nothing is written yet, record what you learned.
|
|
165
|
+
|
|
166
|
+
Guidelines:
|
|
167
|
+
|
|
168
|
+
- `MEMORY.md` is always loaded into your system prompt — lines after 200 will be truncated, so keep it concise
|
|
169
|
+
- Create separate topic files (e.g., `debugging.md`, `patterns.md`) for detailed notes and link to them from MEMORY.md
|
|
170
|
+
- Update or remove memories that turn out to be wrong or outdated
|
|
171
|
+
- Organize memory semantically by topic, not chronologically
|
|
172
|
+
- Use the Write and Edit tools to update your memory files
|
|
173
|
+
|
|
174
|
+
What to save:
|
|
175
|
+
|
|
176
|
+
- Stable patterns and conventions confirmed across multiple interactions
|
|
177
|
+
- Key architectural decisions, important file paths, and project structure
|
|
178
|
+
- User preferences for workflow, tools, and communication style
|
|
179
|
+
- Solutions to recurring problems and debugging insights
|
|
180
|
+
|
|
181
|
+
What NOT to save:
|
|
182
|
+
|
|
183
|
+
- Session-specific context (current task details, in-progress work, temporary state)
|
|
184
|
+
- Information that might be incomplete — verify against project docs before writing
|
|
185
|
+
- Anything that duplicates or contradicts existing CLAUDE.md instructions
|
|
186
|
+
- Speculative or unverified conclusions from reading a single file
|
|
187
|
+
|
|
188
|
+
Explicit user requests:
|
|
189
|
+
|
|
190
|
+
- When the user asks you to remember something across sessions (e.g., "always use bun", "never auto-commit"), save it — no need to wait for multiple interactions
|
|
191
|
+
- When the user asks to forget or stop remembering something, find and remove the relevant entries from your memory files
|
|
192
|
+
- Since this memory is project-scope and shared with your team via version control, tailor your memories to this project
|
|
193
|
+
|
|
194
|
+
## MEMORY.md
|
|
195
|
+
|
|
196
|
+
Your MEMORY.md is currently empty. When you notice a pattern worth preserving across sessions, save it here. Anything in MEMORY.md will be included in your system prompt next time.
|
|
@@ -0,0 +1,52 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: 'Run the currently open notebook through FlowBook, executing all cells. Stops on the first error and reports reproducibility status. Works on the active JupyterLab notebook.'
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
# Basic Run (NBI)
|
|
6
|
+
|
|
7
|
+
Run the currently open notebook through FlowBook using NBI tools. Execute all cells, report any errors or reproducibility violations.
|
|
8
|
+
|
|
9
|
+
**Input**: $ARGUMENTS (optional — ignored, works on active notebook)
|
|
10
|
+
|
|
11
|
+
## Steps
|
|
12
|
+
|
|
13
|
+
1. Enable continue-after-violation so all cells run even if there are violations:
|
|
14
|
+
|
|
15
|
+
```
|
|
16
|
+
continue_after_violation(true)
|
|
17
|
+
```
|
|
18
|
+
|
|
19
|
+
2. Run all actionable cells:
|
|
20
|
+
|
|
21
|
+
```
|
|
22
|
+
run_actionable_cells()
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
3. Show the reproducibility status:
|
|
26
|
+
|
|
27
|
+
```
|
|
28
|
+
get_flowbook_status()
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
4. If there were errors or violations, show the first problem:
|
|
32
|
+
|
|
33
|
+
```
|
|
34
|
+
get_next_actionable_cell()
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
5. Print the session log for review:
|
|
38
|
+
|
|
39
|
+
```
|
|
40
|
+
print_log()
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
## Output
|
|
44
|
+
|
|
45
|
+
Report to the user:
|
|
46
|
+
|
|
47
|
+
- Whether the notebook ran successfully or which cell errored (use @A notation)
|
|
48
|
+
- Number of reproducibility violations found (if any), with a brief description of each
|
|
49
|
+
- Number of stale cells
|
|
50
|
+
- Whether the notebook is reproducible
|
|
51
|
+
|
|
52
|
+
Keep the report concise. If there are violations, list them briefly (cell @-label, type, variable). Do not attempt to fix anything — just report.
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: 'Run a Jupyter notebook through FlowBook, executing all cells in order. Stops on the first error and reports it. Shows the reproducibility status at the end.'
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
# Basic Run
|
|
6
|
+
|
|
7
|
+
Run a notebook through FlowBook using the MCP server. Execute all cells in order, report any errors or reproducibility violations.
|
|
8
|
+
|
|
9
|
+
**Input**: $ARGUMENTS (path to a .ipynb file)
|
|
10
|
+
|
|
11
|
+
## Steps
|
|
12
|
+
|
|
13
|
+
1. Load the notebook:
|
|
14
|
+
|
|
15
|
+
```
|
|
16
|
+
load_notebook("$ARGUMENTS")
|
|
17
|
+
```
|
|
18
|
+
|
|
19
|
+
2. Enable continue-after-violation so all cells run even if there are violations:
|
|
20
|
+
|
|
21
|
+
```
|
|
22
|
+
continue_after_violation(true)
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
3. Run all actionable cells:
|
|
26
|
+
|
|
27
|
+
```
|
|
28
|
+
run_actionable_cells()
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
4. If `run_actionable_cells` reports an error, get the failing cell to show the user what went wrong:
|
|
32
|
+
|
|
33
|
+
```
|
|
34
|
+
get_next_actionable_cell()
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
5. Show the reproducibility status:
|
|
38
|
+
|
|
39
|
+
```
|
|
40
|
+
get_status()
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
6. Print the session log for review:
|
|
44
|
+
|
|
45
|
+
```
|
|
46
|
+
print_log()
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
7. Save the log:
|
|
50
|
+
```
|
|
51
|
+
save_log()
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
## Output
|
|
55
|
+
|
|
56
|
+
Report to the user:
|
|
57
|
+
|
|
58
|
+
- Whether the notebook ran successfully or which cell errored (use @-labels, e.g., @A, @B)
|
|
59
|
+
- Number of reproducibility violations found (if any), with a brief description of each (cell @-label, type, variable)
|
|
60
|
+
- Number of stale cells
|
|
61
|
+
- Path to the saved log file
|
|
62
|
+
|
|
63
|
+
Keep the report concise. If there are violations, list them briefly (@-label, type, variable). Do not attempt to fix anything — just report.
|
|
@@ -0,0 +1,341 @@
|
|
|
1
|
+
# Categorize and Fix Reproducibility Errors
|
|
2
|
+
|
|
3
|
+
Analyze reproducibility errors from a FlowBook error report or directly from a processed notebook, categorize each error, and optionally fix them.
|
|
4
|
+
|
|
5
|
+
## Usage
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
/categorize-repro-errors ERROR_REPORT_FILE [NOTEBOOKS_DIR] [--fix]
|
|
9
|
+
/categorize-repro-errors NOTEBOOK_PATH [--fix]
|
|
10
|
+
```
|
|
11
|
+
|
|
12
|
+
**Mode 1: Error Report Mode**
|
|
13
|
+
|
|
14
|
+
- `ERROR_REPORT_FILE`: Path to the error report file (e.g., `errors.txt`)
|
|
15
|
+
- `NOTEBOOKS_DIR`: Optional directory containing the notebook files
|
|
16
|
+
- `--fix`: Optional flag to apply fixes after categorization
|
|
17
|
+
|
|
18
|
+
**Mode 2: Single Notebook Mode** (when only a `.ipynb` path is provided)
|
|
19
|
+
|
|
20
|
+
- `NOTEBOOK_PATH`: Path to a processed notebook (must have been run through FlowBook kernel)
|
|
21
|
+
- `--fix`: Optional flag to apply fixes after categorization
|
|
22
|
+
|
|
23
|
+
## Task
|
|
24
|
+
|
|
25
|
+
### For Error Report Mode:
|
|
26
|
+
|
|
27
|
+
1. Parse the error report file using `flowbook/scripts/parse_repro_errors.py`
|
|
28
|
+
2. For each notebook with errors, launch a parallel agent to analyze and categorize errors
|
|
29
|
+
|
|
30
|
+
### For Single Notebook Mode:
|
|
31
|
+
|
|
32
|
+
1. Extract errors directly from the notebook's cell metadata (see "Extracting Errors from Notebook" below)
|
|
33
|
+
2. Analyze and categorize errors for that single notebook
|
|
34
|
+
|
|
35
|
+
3. Each error should be categorized into exactly ONE of these categories:
|
|
36
|
+
|
|
37
|
+
### Error Categories
|
|
38
|
+
|
|
39
|
+
| Category                                    | Description                                                            | Example                                 | Fix Strategy               |
| ------------------------------------------- | ---------------------------------------------------------------------- | --------------------------------------- | -------------------------- |
| **In-place variable reassignment**          | Cell reads and overwrites same variable                                | `train = pd.concat([train, extra])`     | Deep-copy + alpha-rename   |
| **Sequential transformation chain**         | Downstream depends on upstream transformation                          | Imputation then feature engineering     | Deep-copy + alpha-rename   |
| **Diagnostic inspection before mutation**   | Read-only cell captures pre-transformation state                       | `df.info()` before `df["col"] = ...`    | Cell split + `%diagnostic` |
| **Visualization before mutation**           | Plot accesses all columns before column added                          | `sns.heatmap(df.corr())` before new col | Cell split + `%diagnostic` |
| **Reusing variable for different purposes** | Variable reused for different purposes in disjoint regions of the code | `model` reused for different model      | Alpha-rename downstream    |
| **Unrecoverable in-place mutation**         | Cell mutates object without rebinding                                  | `model.fit()`, `df.drop(inplace=True)`  | See sub-types below        |
|
|
47
|
+
|
|
48
|
+
### Unrecoverable Mutation Sub-types
|
|
49
|
+
|
|
50
|
+
When the predicate is `"unrecoverable_mutation"`, identify the sub-type from the cell source:
|
|
51
|
+
|
|
52
|
+
| Sub-type                  | Detection Pattern                                          | Fix Type           | Example                      |
| ------------------------- | ---------------------------------------------------------- | ------------------ | ---------------------------- |
| **ML model mutation**     | `.fit()`, `.fit_transform()`, `.predict()` on model/scaler | `model-copy`       | `model.fit(X, y)`            |
| **DataFrame inplace**     | `inplace=True` argument                                    | `inplace-to-copy`  | `df.drop(col, inplace=True)` |
| **Structural assignment** | `.columns = ...`, `.index = ...`                           | `struct-copy`      | `df.columns = ['a', 'b']`    |
| **Container mutation**    | `.append()`, `[i] = ...` on list/dict/array                | `inplace-reassign` | `arr[5] = 99`                |
|
|
58
|
+
|
|
59
|
+
**Why these are unrecoverable:** Re-executing the cell cannot restore the full value of the variable. For example, `model.fit()` only trains the model — it cannot "un-train" changes from a deleted cell. Similarly, `arr[5] = 99` sets one element but cannot restore what a deleted cell wrote to `arr[3]`.
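
The `arr[5] = 99` case can be reproduced with a plain Python list. The snippet below is only an illustration of the reasoning above; the cell labels in the comments and the names `fresh`, `base`, and `arr_v2` are made up for the example.

```python
# Kernel state after three cells ran in order.
arr = [0] * 8      # cell @0 creates the list
arr[3] = 7         # cell @1 (later deleted from the notebook)
arr[5] = 99        # cell @2

# Re-running cell @2 alone repeats only its own write.
arr[5] = 99
assert arr[3] == 7  # the deleted cell's write is still in the live state

# A fresh top-to-bottom run of the remaining cells gives a different value.
fresh = [0] * 8
fresh[5] = 99
assert fresh != arr

# Rebinding to a copy makes the result depend only on the cell's inputs.
base = [0] * 8
arr_v2 = list(base)
arr_v2[5] = 99
assert arr_v2 == fresh
```

The last part mirrors the copy-then-rename idea behind the fix types in the table: the mutated object is a new binding, so re-executing the cell fully determines its value.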
|
|
60
|
+
|
|
61
|
+
## Important Notes
|
|
62
|
+
|
|
63
|
+
- **Always use code cell indices (`@N`)** to reference cells, NOT cell IDs. Cell IDs in processed notebooks may not match the original notebooks. The error report provides both cell IDs and code cell indices — always use the code cell index with `@` prefix (e.g., `@5`).
|
|
64
|
+
- Cell indices in error reports are **CODE cell indices** (not including markdown cells)
|
|
65
|
+
- Write results to `error_categories.tsv` as you go
|
|
66
|
+
- TSV format: `NOTEBOOK_NAME<TAB>ERROR_NUMBER<TAB>CELL_ID<TAB>CELL_CODE_INDEX<TAB>CATEGORY<TAB>VARIABLE<TAB>FIX_TYPE<TAB>EXPLANATION`
|
|
67
|
+
- The VARIABLE column should contain the primary variable involved in the error
|
|
68
|
+
- The EXPLANATION column should contain the rationale for the categorization
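
A minimal sketch of writing rows in this TSV layout with Python's `csv` module. The column names come from the format above; the helper `write_rows`, the sample row, and the choice to store the code cell index as `@5` are assumptions made for illustration.

```python
import csv
import io

COLUMNS = [
    "NOTEBOOK_NAME", "ERROR_NUMBER", "CELL_ID", "CELL_CODE_INDEX",
    "CATEGORY", "VARIABLE", "FIX_TYPE", "EXPLANATION",
]


def write_rows(stream, rows, header=True):
    """Write categorized errors as tab-separated lines."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    if header:
        writer.writerow(COLUMNS)
    for row in rows:
        writer.writerow([row[c] for c in COLUMNS])


# In-memory buffer for the example; a real run would open error_categories.tsv.
buf = io.StringIO()
write_rows(buf, [{
    "NOTEBOOK_NAME": "analysis.ipynb",
    "ERROR_NUMBER": 1,
    "CELL_ID": "abcd",
    "CELL_CODE_INDEX": "@5",
    "CATEGORY": "In-place variable reassignment",
    "VARIABLE": "train",
    "FIX_TYPE": "inplace-reassign",
    "EXPLANATION": "Cell reads and overwrites train",
}])
tsv_text = buf.getvalue()
```

When appending results "as you go", pass `header=False` for every call after the first so the header appears only once.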
|
|
69
|
+
|
|
70
|
+
## Fix Script Usage
|
|
71
|
+
|
|
72
|
+
After categorization, apply fixes using `flowbook/scripts/fix_repro_errors.py`.
|
|
73
|
+
|
|
74
|
+
**IMPORTANT:** Always use `@N` notation (code cell index) when calling the fix script. Never use cell IDs.
|
|
75
|
+
|
|
76
|
+
### High-level fix types (single command):
|
|
77
|
+
|
|
78
|
+
```bash
|
|
79
|
+
# For in-place reassignment or sequential chain:
|
|
80
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @CODE_INDEX --fix-type inplace-reassign --variable VAR
|
|
81
|
+
|
|
82
|
+
# For variable reuse:
|
|
83
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @CODE_INDEX --fix-type variable-reuse --variable VAR
|
|
84
|
+
|
|
85
|
+
# For ML model mutation (unrecoverable):
|
|
86
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @CODE_INDEX --fix-type model-copy --variable VAR
|
|
87
|
+
|
|
88
|
+
# For DataFrame inplace=True (unrecoverable):
|
|
89
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @CODE_INDEX --fix-type inplace-to-copy --variable VAR
|
|
90
|
+
|
|
91
|
+
# For structural assignment (unrecoverable):
|
|
92
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @CODE_INDEX --fix-type struct-copy --variable VAR
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
### Diagnostic/Visualization fixes (agent-driven cell splitting):
|
|
96
|
+
|
|
97
|
+
For diagnostic and visualization errors, the agent must analyze the cell and split it using primitive operations. **Do NOT blindly add `%diagnostic` to cells that contain mutations.**
|
|
98
|
+
|
|
99
|
+
The fix script provides three primitive operations:
|
|
100
|
+
|
|
101
|
+
```bash
|
|
102
|
+
# Replace a cell's source code:
|
|
103
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @N --fix-type set-source --source-file PATH
|
|
104
|
+
|
|
105
|
+
# Insert a new code cell after @N:
|
|
106
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @N --fix-type insert-cell-after --source-file PATH
|
|
107
|
+
|
|
108
|
+
# Prepend %diagnostic magic to a cell:
|
|
109
|
+
python flowbook/scripts/fix_repro_errors.py NOTEBOOK @N --fix-type add-diagnostic
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
**Index safety:** `insert-cell-after` shifts all subsequent code cell indices by 1. When applying multiple fixes to the same notebook, process diagnostic/visualization fixes from **highest code cell index to lowest** to avoid index invalidation. Other fix types (inplace-reassign, model-copy, etc.) don't insert cells and are safe in any order.
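
The ordering rule can be checked on a plain list standing in for the notebook's code cells. This sketch does not call the fix script; `apply_in_safe_order` and the cell strings are hypothetical.

```python
def apply_in_safe_order(cells: list[str], splits: dict[int, str]) -> list[str]:
    """Insert a new cell after each index in `splits`, highest index first."""
    result = list(cells)
    for index in sorted(splits, reverse=True):
        result.insert(index + 1, splits[index])
    return result


cells = ["c0", "c1", "c2", "c3"]
splits = {1: "diag-for-c1", 3: "diag-for-c3"}

fixed = apply_in_safe_order(cells, splits)

# Lowest-first order for contrast: the first insert shifts c3 to index 4,
# so the second insert lands in front of c3 instead of after it.
naive = list(cells)
for index in sorted(splits):
    naive.insert(index + 1, splits[index])
```

Processing from the highest index down means every pending index still refers to the cell it was computed for.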
|
|
113
|
+
|
|
114
|
+
#### How to fix diagnostic/visualization errors:
|
|
115
|
+
|
|
116
|
+
1. **Read the cell source** at the error's code cell index
|
|
117
|
+
2. **Classify each line** as either mutation (writes variables, assigns, calls .fit(), etc.) or diagnostic (print, display, .info(), .head(), plotting, etc.)
|
|
118
|
+
3. **Decide the fix**:
   - **If the cell is purely diagnostic** (no mutations at all): Just use `add-diagnostic`
   - **If the cell mixes mutation and diagnostic code**: Split it:
     a. Write the mutation lines to a temp file
     b. Write the diagnostic lines to a temp file
     c. Use `set-source` to replace the original cell with mutation-only code
     d. Use `insert-cell-after` to add the diagnostic code as a new cell
     e. Use `add-diagnostic` on the new cell (which is now at @N+1)
   - **If the cell is purely mutation** (no diagnostic code): This was miscategorized. Do NOT add `%diagnostic`. Instead, recategorize as `inplace-reassign` or `sequential-chain` and apply the appropriate deep-copy fix.
|
|
127
|
+
4. **Add `# [FLOWBOOK FIX]` comments** to both the mutation and diagnostic cells explaining what was done
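
The classification in step 2 can be approximated with Python's `ast` module. This is a heuristic sketch under stated assumptions, not the logic of the fix script: the names `split_cell`, `is_diagnostic`, and `MUTATING_METHODS` are hypothetical, the method list is deliberately short, and the cell must be plain Python (IPython magics would not parse).

```python
import ast

# Assumed, incomplete set of method names that mutate their receiver.
MUTATING_METHODS = {"fit", "fit_transform", "append", "extend", "update"}


def is_diagnostic(stmt: ast.stmt) -> bool:
    """Treat only bare expression statements as diagnostic candidates."""
    if not isinstance(stmt, ast.Expr):
        return False  # assignments, imports, loops, etc. count as mutation
    call = stmt.value
    if isinstance(call, ast.Call):
        if isinstance(call.func, ast.Attribute) and call.func.attr in MUTATING_METHODS:
            return False
        # Any `inplace=` keyword is treated as a mutation signal.
        if any(kw.arg == "inplace" for kw in call.keywords):
            return False
    return True


def split_cell(source: str) -> tuple[list[str], list[str]]:
    """Return (mutation statements, diagnostic statements) for a cell."""
    mutation, diagnostic = [], []
    for stmt in ast.parse(source).body:
        text = ast.get_source_segment(source, stmt)
        (diagnostic if is_diagnostic(stmt) else mutation).append(text)
    return mutation, diagnostic


source = (
    "df.info()\n"
    'df["ratio"] = df["a"] / df["b"]\n'
    "model.fit(X, y)\n"
    "print(df.shape)\n"
)
mutation, diagnostic = split_cell(source)
```

An empty `mutation` list corresponds to the purely diagnostic case, an empty `diagnostic` list to the miscategorized purely-mutation case, and two non-empty lists to a cell that needs splitting.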
|
|
128
|
+
|
|
129
|
+
The script creates `<notebook>-fixed.ipynb` with:
|
|
130
|
+
|
|
131
|
+
- Comments marked `# [FLOWBOOK FIX]` explaining the original error and fix
|
|
132
|
+
- Deep copies with `_flow_XXXX` suffix for renamed variables
|
|
133
|
+
- Split cells with `%diagnostic` magic on the read-only part
|
|
134
|
+
- For `model-copy`: Uses `safe_model_copy()` which handles sklearn, PyTorch, XGBoost, etc.
|
|
135
|
+
- For `inplace-to-copy`: Converts `df.method(inplace=True)` to `df = df.method()`
|
|
136
|
+
|
|
137
|
+
## Extracting Errors from Notebook
|
|
138
|
+
|
|
139
|
+
When using Single Notebook Mode, extract errors from the notebook's cell outputs. FlowBook stores violation information in `display_data` outputs with special metadata keys.
|
|
140
|
+
|
|
141
|
+
**Python code to extract errors from a notebook:**
|
|
142
|
+
|
|
143
|
+
```python
import json
from pathlib import Path

def extract_errors_from_notebook(notebook_path: str) -> dict:
    """Extract reproducibility errors from a processed FlowBook notebook.

    Returns dict in same format as parse_repro_errors.py output:
    {
        "notebook.ipynb": {
            "notebook_path": "/path/to/notebook.ipynb",
            "error_count": N,
            "errors": [
                {
                    "error_num": 1,
                    "cell_id": "abcd",
                    "cell_index": 5,  # CODE cell index
                    "summary": "Cell @A reads and writes the same locations: var",
                    "predicate": "no_read_and_write",
                    "locations": ["var"],
                    "accepted": True
                },
                ...
            ]
        }
    }
    """
    path = Path(notebook_path)
    with open(path) as f:
        nb = json.load(f)

    errors = []
    code_cell_index = 0
    error_num = 0

    for cell in nb.get('cells', []):
        if cell.get('cell_type') != 'code':
            continue

        cell_id = cell.get('id', '')

        # Check outputs for predicate_violation metadata
        for output in cell.get('outputs', []):
            if output.get('output_type') != 'display_data':
                continue

            metadata = output.get('metadata', {})

            # Check for predicate_violation (NoReadAndWrite, BackwardStale, etc.)
            if 'predicate_violation' in metadata:
                violation = metadata['predicate_violation']
                error_num += 1
                errors.append({
                    'error_num': error_num,
                    'cell_id': violation.get('cell_id', cell_id),
                    'cell_index': code_cell_index,
                    'summary': violation.get('message', ''),
                    'predicate': violation.get('predicate', ''),
                    'locations': violation.get('locations', []),
                    'accepted': violation.get('accepted', True)
                })

            # Also check flowbook metadata for stale cells info
            if 'flowbook' in metadata:
                fb = metadata['flowbook']
                # Stale cells indicate forward contamination (not an error per se,
                # but useful context for categorization)

        code_cell_index += 1

    notebook_name = path.name
    return {
        notebook_name: {
            'notebook_path': str(path.absolute()),
            'error_count': len(errors),
            'errors': errors
        }
    }
```
|
|
222
|
+
|
|
223
|
+
**Key metadata locations in notebook JSON:**
|
|
224
|
+
|
|
225
|
+
- `cell.outputs[].metadata.predicate_violation` - Violation details:
  - `predicate`: Type of violation (`"no_read_and_write"`, `"backward_stale"`, etc.)
  - `cell_id`: Cell that triggered the violation
  - `locations`: List of variable names involved
  - `message`: Human-readable error message
  - `accepted`: Whether execution continued despite violation
- `cell.outputs[].metadata.flowbook` - Execution metadata:
  - `reads`: Variables read by the cell
  - `writes`: Variables written by the cell
  - `stale_cells`: List of cell IDs that became stale
  - `staleness_reasons`: Dict mapping cell_id to reason objects
|
|
236
|
+
|
|
237
|
+
## Instructions
|
|
238
|
+
|
|
239
|
+
When the user invokes this command:
|
|
240
|
+
|
|
241
|
+
### Detect Mode
|
|
242
|
+
|
|
243
|
+
First, determine which mode to use:
|
|
244
|
+
|
|
245
|
+
- If the first argument ends with `.ipynb`, use **Single Notebook Mode**
|
|
246
|
+
- Otherwise, use **Error Report Mode**
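
The mode check reduces to a suffix test on the first argument. A tiny sketch; the function name `detect_mode` and the returned labels are hypothetical.

```python
def detect_mode(args: list[str]) -> str:
    """Pick the command mode from the first positional argument."""
    if not args:
        raise ValueError("expected ERROR_REPORT_FILE or NOTEBOOK_PATH")
    return "single-notebook" if args[0].endswith(".ipynb") else "error-report"


mode_a = detect_mode(["analysis.ipynb", "--fix"])
mode_b = detect_mode(["errors.txt", "notebooks/", "--fix"])
```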
|
|
247
|
+
|
|
248
|
+
### Error Report Mode
|
|
249
|
+
|
|
250
|
+
1. Run the parsing script to get structured error data:
|
|
251
|
+
|
|
252
|
+
```bash
|
|
253
|
+
python flowbook/scripts/parse_repro_errors.py $ERROR_REPORT_FILE $NOTEBOOKS_DIR --json
|
|
254
|
+
```
|
|
255
|
+
|
|
256
|
+
2. Create/initialize the output file `error_categories.tsv` with header
|
|
257
|
+
|
|
258
|
+
3. For each notebook with errors, launch a parallel **opus** agent to:
   - Read the relevant section from the error report (use line_range from parsed data)
   - Read the notebook to understand context
   - Categorize each error according to the taxonomy
   - Identify the primary variable involved
   - **Always reference cells by code cell index (`@N`), never by cell ID**
   - For diagnostic/visualization errors: read the cell source, determine which lines are mutation vs diagnostic, and note the split plan
   - Produce a short, coherent explanation for the categorization
   - Output TSV lines
|
|
267
|
+
|
|
268
|
+
### Single Notebook Mode
|
|
269
|
+
|
|
270
|
+
1. Read the notebook JSON and extract errors using the logic above (check each cell's outputs for `predicate_violation` metadata)
|
|
271
|
+
|
|
272
|
+
2. If no errors found, report that the notebook has no reproducibility violations
|
|
273
|
+
|
|
274
|
+
3. For each error found:
   - Read the cell source code to understand context
   - Look at surrounding cells for the full picture
   - Categorize the error according to the taxonomy above
   - Identify the primary variable involved (from `locations` or by analyzing the code)
   - **Always reference cells by code cell index (`@N`)**
|
|
280
|
+
|
|
281
|
+
### Final Steps (both modes)
|
|
282
|
+
|
|
283
|
+
4. Collect results and write to `error_categories.tsv` (for single notebook mode, also print to console)
|
|
284
|
+
|
|
285
|
+
5. Print a summary of categories found
|
|
286
|
+
|
|
287
|
+
6. If `--fix` flag is provided:
   - For each notebook, initialize the fixed copy with `--init --force`
   - Apply non-inserting fixes (inplace-reassign, sequential-chain, model-copy, inplace-to-copy, struct-copy, variable-reuse) in any order
   - Apply diagnostic/visualization splits from **highest code cell index to lowest** (to avoid index shifts from cell insertions):
     - Read the cell source
     - Determine mutation vs diagnostic lines
     - If purely diagnostic: use `add-diagnostic`
     - If mixed: write temp files and use `set-source` + `insert-cell-after` + `add-diagnostic`
     - If purely mutation: skip %diagnostic, apply appropriate rename fix instead
   - Report which notebooks were fixed and where the `-fixed.ipynb` files are
|
|
297
|
+
|
|
298
|
+
## Progress Reporting
|
|
299
|
+
|
|
300
|
+
**Report progress as you work.** For each notebook being processed, output status like:
|
|
301
|
+
|
|
302
|
+
```
|
|
303
|
+
[1/23] backpack-pred-baseline-ensemble-eda.ipynb (4 errors)
|
|
304
|
+
- Error 1 (@3): Diagnostic inspection before mutation → train_data [split: 4 mutation + 2 diagnostic lines]
|
|
305
|
+
- Error 2 (@5): Sequential transformation chain → test_data
|
|
306
|
+
- Error 3 (@8): Visualization before mutation → train_data [pure diagnostic, add %diagnostic]
|
|
307
|
+
- Error 4 (@12): Sequential transformation chain → train_data
|
|
308
|
+
```
|
|
309
|
+
|
|
310
|
+
When applying fixes (with `--fix`):
|
|
311
|
+
|
|
312
|
+
```
|
|
313
|
+
[1/23] backpack-pred-baseline-ensemble-eda.ipynb
|
|
314
|
+
Initializing: backpack-pred-baseline-ensemble-eda-fixed.ipynb
|
|
315
|
+
- Fixing @5: sequential-chain --variable test_data
|
|
316
|
+
- Fixing @12: sequential-chain --variable train_data
|
|
317
|
+
- Fixing @8: add-diagnostic (pure diagnostic cell)
|
|
318
|
+
- Fixing @3: set-source + insert-cell-after + add-diagnostic (split mixed cell)
|
|
319
|
+
✓ Fixed 4 errors → backpack-pred-baseline-ensemble-eda-fixed.ipynb
|
|
320
|
+
```
|
|
321
|
+
|
|
322
|
+
At the end, print a summary:
|
|
323
|
+
|
|
324
|
+
```
|
|
325
|
+
=== Summary ===
|
|
326
|
+
Notebooks processed: 23
|
|
327
|
+
Total errors categorized: 116
|
|
328
|
+
- In-place variable reassignment: 41
|
|
329
|
+
- Sequential transformation chain: 51
|
|
330
|
+
- Diagnostic inspection before mutation: 17
|
|
331
|
+
- Visualization before mutation: 2
|
|
332
|
+
- Reusing variable for different purposes: 5
|
|
333
|
+
- Unrecoverable mutation (ML model): 12
|
|
334
|
+
- Unrecoverable mutation (inplace): 8
|
|
335
|
+
- Unrecoverable mutation (structural): 3
|
|
336
|
+
|
|
337
|
+
Fixed notebooks saved to:
|
|
338
|
+
- .../backpack-pred-baseline-ensemble-eda-fixed.ipynb
|
|
339
|
+
- .../forecasting-sticker-sales-fixed.ipynb
|
|
340
|
+
- ...
|
|
341
|
+
```
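
The per-category counts in such a summary can be computed from the categorized rows. A sketch assuming each row is a dict keyed by the TSV column names; the helper `summarize` is hypothetical, and it counts only notebooks that have at least one row.

```python
from collections import Counter


def summarize(rows: list[dict]) -> str:
    """Build the summary text from categorized error rows."""
    counts = Counter(row["CATEGORY"] for row in rows)
    notebooks = {row["NOTEBOOK_NAME"] for row in rows}
    lines = [
        "=== Summary ===",
        f"Notebooks processed: {len(notebooks)}",
        f"Total errors categorized: {len(rows)}",
    ]
    lines += [f"- {category}: {n}" for category, n in counts.most_common()]
    return "\n".join(lines)


rows = [
    {"NOTEBOOK_NAME": "a.ipynb", "CATEGORY": "Sequential transformation chain"},
    {"NOTEBOOK_NAME": "a.ipynb", "CATEGORY": "Sequential transformation chain"},
    {"NOTEBOOK_NAME": "b.ipynb", "CATEGORY": "Visualization before mutation"},
]
summary = summarize(rows)
```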
|