structflo-cser 0.3.0__tar.gz → 0.4.0__tar.gz

Files changed (110)
  1. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/CLAUDE.md +12 -3
  2. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/PKG-INFO +1 -1
  3. structflo_cser-0.4.0/notebooks/01-quickstart.ipynb +1170 -0
  4. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/pyproject.toml +2 -1
  5. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/lps/diag_e2e_decompose.py +16 -1
  6. structflo_cser-0.4.0/scripts/finetune/lps/diag_label_recall.py +195 -0
  7. structflo_cser-0.4.0/scripts/finetune/lps/diag_lps_scores.py +204 -0
  8. structflo_cser-0.4.0/scripts/finetune/relmatch/eval_compare.py +191 -0
  9. structflo_cser-0.4.0/scripts/finetune/relmatch/eval_compare_all.py +204 -0
  10. structflo_cser-0.4.0/scripts/finetune/relmatch/prepare_det_data.py +156 -0
  11. structflo_cser-0.4.0/scripts/finetune/relmatch/sweep_margin.py +152 -0
  12. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/publish_weights.py +42 -12
  13. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/inference/detector.py +19 -4
  14. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/pipeline/pipeline.py +18 -5
  15. structflo_cser-0.4.0/structflo/cser/relmatch/__init__.py +22 -0
  16. structflo_cser-0.4.0/structflo/cser/relmatch/dataset.py +234 -0
  17. structflo_cser-0.4.0/structflo/cser/relmatch/features.py +62 -0
  18. structflo_cser-0.4.0/structflo/cser/relmatch/matcher.py +124 -0
  19. structflo_cser-0.4.0/structflo/cser/relmatch/model.py +179 -0
  20. structflo_cser-0.4.0/structflo/cser/relmatch/train.py +213 -0
  21. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/weights.py +11 -0
  22. structflo_cser-0.3.0/notebooks/01-quickstart.ipynb +0 -959
  23. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/.github/workflows/ci.yml +0 -0
  24. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/.github/workflows/publish.yml +0 -0
  25. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/.gitignore +0 -0
  26. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/.python-version +0 -0
  27. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/Makefile +0 -0
  28. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/README.md +0 -0
  29. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/annotate/__main__.py +0 -0
  30. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/annotate/pdf.py +0 -0
  31. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/annotate/server.py +0 -0
  32. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/annotate/storage.py +0 -0
  33. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/annotate/templates/index.html +0 -0
  34. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/asset_scripts/download_chembl.sh +0 -0
  35. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/config/data.yaml +0 -0
  36. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/config/pipeline.yaml +0 -0
  37. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/docs/fine-tune.md +0 -0
  38. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/docs/images/example-1.png +0 -0
  39. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/docs/images/example-2.png +0 -0
  40. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/docs/learned_matcher_plan.md +0 -0
  41. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/docs/lps.md +0 -0
  42. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/docs/publishing-weights.md +0 -0
  43. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/main.py +0 -0
  44. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/02-LPS.ipynb +0 -0
  45. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/03-PDF.ipynb +0 -0
  46. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/notebook-data/bio-arcgive-1.png +0 -0
  47. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/notebook-data/example-annotated.pdf +0 -0
  48. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/notebook-data/example.pdf +0 -0
  49. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/notebook-data/example.pptx +0 -0
  50. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/notebook-data/screen-1.png +0 -0
  51. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/notebooks/notebook-data/syn-1.jpg +0 -0
  52. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/quick.md +0 -0
  53. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/lps/eval_compare.py +0 -0
  54. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/lps/eval_end2end.py +0 -0
  55. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/lps/eval_rejection.py +0 -0
  56. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/lps/mine_fp_negatives.py +0 -0
  57. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/lps/prepare_data.py +0 -0
  58. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/lps/train.sh +0 -0
  59. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/yolo/eval_compare.py +0 -0
  60. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/yolo/prepare_data.py +0 -0
  61. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/scripts/finetune/yolo/train.sh +0 -0
  62. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/__init__.py +0 -0
  63. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/_geometry.py +0 -0
  64. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/config.py +0 -0
  65. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/data/__init__.py +0 -0
  66. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/data/distractor_images.py +0 -0
  67. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/data/smiles.py +0 -0
  68. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/distractors/__init__.py +0 -0
  69. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/distractors/charts.py +0 -0
  70. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/distractors/shapes.py +0 -0
  71. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/distractors/text_elements.py +0 -0
  72. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/generation/__init__.py +0 -0
  73. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/generation/dataset.py +0 -0
  74. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/generation/page.py +0 -0
  75. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/generation/specialty.py +0 -0
  76. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/generation/tabular.py +0 -0
  77. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/inference/__init__.py +0 -0
  78. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/inference/nms.py +0 -0
  79. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/inference/pairing.py +0 -0
  80. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/inference/tiling.py +0 -0
  81. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/lps/__init__.py +0 -0
  82. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/lps/dataset.py +0 -0
  83. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/lps/evaluate.py +0 -0
  84. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/lps/features.py +0 -0
  85. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/lps/matcher.py +0 -0
  86. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/lps/scorer.py +0 -0
  87. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/lps/train.py +0 -0
  88. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/pipeline/__init__.py +0 -0
  89. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/pipeline/cli.py +0 -0
  90. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/pipeline/matcher.py +0 -0
  91. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/pipeline/models.py +0 -0
  92. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/pipeline/ocr.py +0 -0
  93. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/pipeline/smiles_extractor.py +0 -0
  94. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/rendering/__init__.py +0 -0
  95. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/rendering/chemistry.py +0 -0
  96. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/rendering/text.py +0 -0
  97. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/training/__init__.py +0 -0
  98. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/training/trainer.py +0 -0
  99. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/viz/__init__.py +0 -0
  100. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/viz/detections.py +0 -0
  101. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/structflo/cser/viz/labels.py +0 -0
  102. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/__init__.py +0 -0
  103. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/test_config.py +0 -0
  104. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/test_generation.py +0 -0
  105. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/test_geometry.py +0 -0
  106. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/test_imports.py +0 -0
  107. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/test_inference.py +0 -0
  108. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/test_models.py +0 -0
  109. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/tests/test_viz.py +0 -0
  110. {structflo_cser-0.3.0 → structflo_cser-0.4.0}/uv.lock +0 -0
@@ -119,16 +119,25 @@ ChemPipeline.to_records(pairs)
  - **Architecture**: YOLO11l (ultralytics)
  - **Classes**: 2 — `chemical_structure` (0), `compound_label` (1)
  - **Training image size**: 1280px
- - **Inference**: sliding-window tiling (1536px tiles, 20% overlap) + per-class NMS
+ - **Inference**: full-image at imgsz=1280 (the training resolution) is the default and
+ strictly outperforms tiling on large landscape pages — verified on real_test: label
+ recall 53%→80%, struct 93%→99%, 5× fewer false positives, end-to-end pairing F1 0.41→0.82.
+ Sliding-window tiling (1536px tiles, 20% overlap, per-class NMS) remains available via
+ `tile=True` for very dense pages, but cuts labels at tile boundaries.
  - **Training config**: AdamW, cosine LR, grayscale images, no colour augmentation
  - **Runs directory**: `runs/labels_detect/`
  - **YOLO data config**: `config/data.yaml`
 
  ## Matching strategies
 
- 1. **HungarianMatcher** — centroid Euclidean distance + `scipy.optimize.linear_sum_assignment`
+ 1. **HungarianMatcher** — centroid Euclidean distance + `scipy.optimize.linear_sum_assignment`.
+ Parameter-free; strong baseline on clean detections.
  2. **LearnedMatcher** (LPS) — CNN scorer produces association probability per (struct, label) pair,
- then Hungarian on `1 - score`. Default in ChemPipeline.
+ then Hungarian on `1 - score`.
+ 3. **RelationalMatcher** (`structflo/cser/relmatch/`) — geometry-only transformer over all page
+ detections + Sinkhorn optimal transport with learnable dustbins (SuperGlue-style). **Default in
+ ChemPipeline.** Best learned matcher in the benchmark (matches distance on assignment, best at
+ rejecting unlabelled structures). Weights: `cser-relmatcher` (HF Hub).
 
  ## Weights system
 
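Note on the hunk above: the first two matching strategies both reduce to a linear assignment problem, differing only in how the cost matrix is built. The following is a minimal sketch of that idea, not the package's implementation. The function names `match_by_centroid` and `match_by_score`, the `(x1, y1, x2, y2)` box layout, and the optional `max_dist` cutoff are all assumptions made for illustration; only `scipy.optimize.linear_sum_assignment` and the `1 - score` cost come from the text.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def _centroids(boxes) -> np.ndarray:
    """Return (N, 2) box centres for boxes given as (x1, y1, x2, y2). Layout is assumed."""
    b = np.asarray(boxes, dtype=float).reshape(-1, 4)
    return np.stack([(b[:, 0] + b[:, 2]) / 2, (b[:, 1] + b[:, 3]) / 2], axis=1)


def match_by_centroid(struct_boxes, label_boxes, max_dist=None):
    """Pair structures with labels by minimising total centroid Euclidean distance.

    Hypothetical helper: `max_dist` is an illustrative cutoff, not a documented option.
    """
    s = _centroids(struct_boxes)
    l = _centroids(label_boxes)
    # cost[i, j] = distance between structure i and label j
    cost = np.linalg.norm(s[:, None, :] - l[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return [
        (int(r), int(c))
        for r, c in zip(rows, cols)
        if max_dist is None or cost[r, c] <= max_dist
    ]


def match_by_score(scores):
    """Pair using a (structs x labels) matrix of association probabilities.

    Minimising `1 - score` is the cost described for the learned scorer.
    """
    cost = 1.0 - np.asarray(scores, dtype=float)
    rows, cols = linear_sum_assignment(cost)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]


if __name__ == "__main__":
    structs = [[0, 0, 100, 100], [300, 0, 400, 100]]
    labels = [[310, 110, 390, 130], [10, 110, 90, 130]]
    print(match_by_centroid(structs, labels))
    print(match_by_score([[0.1, 0.9], [0.8, 0.2]]))
```

In this sketch each structure is paired with the label directly beneath it even though the labels are listed in the opposite order. Plain assignment has no built-in way to leave a structure unmatched, which is the gap the hunk says the dustbin-based RelationalMatcher addresses; the `max_dist` cutoff here is only a crude stand-in for that.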
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: structflo-cser
- Version: 0.3.0
+ Version: 0.4.0
  Summary: Chemical structure-label pair extraction from scientific documents.
  Requires-Python: >=3.11
  Requires-Dist: chembl-webresource-client>=0.10.9