samplekit-0.1.0-py3-none-any.whl

@@ -0,0 +1,422 @@
+ Metadata-Version: 2.4
+ Name: samplekit
+ Version: 0.1.0
+ Summary: Lightweight Python framework for documenting scientific samples with bidirectional Markdown I/O
+ Author-email: zelyph <github@zelyph.fr>
+ License: MIT
+ Project-URL: Homepage, https://github.com/zelyph/SampleKit
+ Project-URL: Repository, https://github.com/zelyph/SampleKit
+ Keywords: science,measurement,markdown,yaml,sample,property
+ Classifier: Development Status :: 3 - Alpha
+ Classifier: Intended Audience :: Science/Research
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Topic :: Scientific/Engineering
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: pyyaml>=5.0
+ Requires-Dist: pandas>=1.0
+ Dynamic: license-file
+
+ # SampleKit
+
+ **Lightweight Python framework for documenting scientific samples with bidirectional Markdown I/O.**
+
+ ---
+
+ ## Origin
+
+ SampleKit was born during my PhD research — I needed a structured way to document and process scientific samples without leaving Python or relying on proprietary formats. The first version was written entirely by hand. The current rewrite was **vibecoded** with [GitHub Copilot](https://github.com/features/copilot) (Claude) because maintaining a side-project framework alongside a thesis is, frankly, unsustainable otherwise.
+
+ ---
+
+ ## What it does
+
+ SampleKit sits between the **lab notebook** and the **analysis script**. It gives you a structured, human-readable record of measurements that stays fully exploitable in code — no proprietary format, no binary files.
+
+ Each sample is a plain **Markdown file** with a YAML header for the raw data and a body for formatted tables and notes. You can generate it from code, hand-edit it, and reload it — the round-trip is lossless.
+
+ ### Core concepts
+
+ | Concept | Purpose |
+ |---|---|
+ | **Property** | A scalar scientific quantity: value ± uncertainty, unit, symbols (text & LaTeX). Pass a list of measurements → auto mean ± stdev. Define a `compute` callback → lazy evaluation with dependency-based cache invalidation. |
+ | **Table** | Tabular scientific data indexed by a primary variable (temperature, time, …). Each cell is a `Property` — value, uncertainty, unit, formatting. Supports row-wise and column-wise computed columns. |
+ | **Column** | Metadata descriptor for a Table column: unit, symbol, precision. |
+ | **Sample** | A named container of Properties and Tables. Subclass it, declare data in `__init__`, optionally override `template()` for a custom Markdown layout. Reads/writes `.md` files with lossless round-trip. |
+ | **SampleList** | A collection of Samples loaded from a directory, file list, or built programmatically. Supports filtering, sorting, batch saving, and pandas export. |
+
+ ### Key features
+
+ - **Bidirectional Markdown I/O** — Write a `.md` file from code, edit it by hand, reload it. YAML frontmatter stores raw data, the body is regenerated from `template()`.
+ - **Auto statistics** — Pass a list of measurements and SampleKit computes the mean and sample standard deviation automatically.
+ - **Computed properties** — Define a `compute` callback for lazy evaluation. Declare `depends_on` and dependent caches are invalidated on change.
+ - **Tables** — Index-based tabular data with per-cell uncertainty. Row-wise and column-wise computed columns. Full YAML round-trip.
+ - **Dual symbol system** — `symbol` (text/unicode for CLI/TUI) and `symbol_math` (LaTeX for reports). Automatic fallback: `symbol_math` → `symbol` → attribute name.
+ - **Text & math rendering** — Properties and tables render in `text` mode (plain text) or `math` mode (inline LaTeX `$...$`), ready for Pandoc conversion to PDF.
+ - **Pandas integration** — Export a single sample or an entire collection to a DataFrame.
+ - **Pure Python** — Only depends on PyYAML and pandas. `pip install` and go.
+
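The dual symbol system and the two rendering modes listed above can be illustrated with a small standalone sketch. The helpers `format_quantity` and `resolve_symbol` are hypothetical names introduced here for illustration; they are not part of SampleKit, and the library's own formatting may differ in detail (for instance, this sketch leaves the unit out in math mode).

```python
def format_quantity(value, uncertainty=None, unit="", precision=".1f", style="text"):
    """Render value ± uncertainty as plain text or inline LaTeX (hypothetical helper)."""
    val = format(value, precision)
    if style == "math":
        # Math mode: wrap in $...$ and use the LaTeX \pm command.
        body = val if uncertainty is None else f"{val} \\pm {format(uncertainty, precision)}"
        return f"${body}$"
    # Text mode: unicode plus-minus sign, unit appended after a space.
    body = val if uncertainty is None else f"{val} ± {format(uncertainty, precision)}"
    return f"{body} {unit}".strip()


def resolve_symbol(attr_name, symbol=None, symbol_math=None, style="text"):
    """Fallback chain from the feature list: symbol_math -> symbol -> attribute name."""
    if style == "math" and symbol_math is not None:
        return symbol_math
    return symbol if symbol is not None else attr_name


print(format_quantity(25.0, 0.5, unit="C"))      # 25.0 ± 0.5 C
print(format_quantity(25.0, 0.5, style="math"))  # $25.0 \pm 0.5$
print(resolve_symbol("temperature", symbol="T", style="math"))  # T
```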
+ ---
+
+ ## Installation
+
+ > **Requires** Python ≥ 3.10
+
+ ```bash
+ pip install samplekit
+ ```
+
+ ---
+
+ ## Quick start
+
+ ### Property
+
+ ```python
+ from samplekit import Property
+
+ # Static value
+ length = Property(value=12.5, uncertainty=0.1, unit="mm", symbol="L")
+ print(length)  # 12.5 ± 0.1 mm
+
+ # Measured value — auto mean ± stdev
+ readings = Property(value=[101.1, 101.3, 101.5], unit="kPa", symbol="P")
+ print(readings.value)        # 101.3
+ print(readings.uncertainty)  # ~0.2
+ ```
+
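The numbers in the measured-value example can be checked with the standard library alone. This is only a verification sketch: it assumes "sample standard deviation" means the usual n - 1 estimator, and it says nothing about how SampleKit computes the statistics internally.

```python
import statistics

readings = [101.1, 101.3, 101.5]

mean = statistics.mean(readings)
# statistics.stdev uses the sample (n - 1) estimator.
stdev = statistics.stdev(readings)

print(f"{mean:.1f} ± {stdev:.1f} kPa")  # 101.3 ± 0.2 kPa
```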
+ ### Table
+
+ ```python
+ from samplekit import Table, Column
+
+ measurements = Table(
+     title="Resistance vs Temperature",
+     columns={
+         "T": Column(unit="C", symbol="T"),
+         "R": Column(unit="ohm", symbol="R", precision=".2f"),
+     },
+ )
+
+ measurements.add(T=20, R=100.5)
+ measurements.add(T=40, R=102.3)
+ measurements.add(T=60, R=(105.1, 0.3))  # value ± uncertainty
+
+ print(measurements(40).R.value)  # 102.3
+ print(measurements.R.values)     # [100.5, 102.3, 105.1]
+ ```
+
+ ### Sample
+
+ ```python
+ from samplekit import Sample, Property, Table, Column, report
+
+
+ class Experiment(Sample):
+     def __init__(self, name=None, filepath=None):
+         super().__init__(name, filepath)
+
+         self.temperature = Property(
+             value=25.0, uncertainty=0.5,
+             unit="C", symbol="T",
+         )
+         self.pressure = Property(
+             value=[101.1, 101.3, 101.5],
+             unit="kPa", symbol="P",
+         )
+         self.ratio = Property(
+             unit="kPa/C", symbol="R",
+             compute=self._calc_ratio,
+             depends_on=[self.pressure, self.temperature],
+         )
+         self.readings = Table(
+             title="Sensor readings",
+             columns={
+                 "t": Column(unit="s"),
+                 "signal": Column(unit="mV", precision=".2f"),
+             },
+         )
+
+     def _calc_ratio(self):
+         return self.pressure.value / self.temperature.value
+
+     def template(self, style="math"):
+         parts = []
+         parts.append(report.heading(f"Experiment: {self.name}"))
+         parts.append(report.properties_table(
+             self, ["temperature", "pressure", "ratio"], style=style,
+         ))
+         if self.readings:
+             parts.append(report.heading("Readings"))
+             parts.append(report.table_to_markdown(self.readings, style=style))
+         return "\n\n".join(parts)
+
+
+ exp = Experiment("EXP_001")
+ exp.readings.add(t=0, signal=12.34)
+ exp.readings.add(t=1, signal=12.51)
+
+ # Save to Markdown, then reload (round-trip is lossless)
+ exp.save("EXP_001.md")
+ loaded = Experiment.load("EXP_001.md")
+ ```
+
+ **Generated file (`EXP_001.md`)**:
+
+ ```markdown
+ ---
+ name: EXP_001
+ temperature: {value: 25.0, uncertainty: 0.5, unit: C, precision: .1f}
+ pressure: {value: 101.3, data: [101.1, 101.3, 101.5], uncertainty: 0.2, unit: kPa}
+ ratio: {value: 4.052, unit: kPa/C}
+ readings:
+   _title: Sensor readings
+   _index: t
+   _columns:
+     - {name: t, unit: s}
+     - {name: signal, unit: mV, precision: '.2f'}
+   _rows:
+     - {t: 0, signal: 12.34}
+     - {t: 1, signal: 12.51}
+ ---
+
+ ## Experiment: EXP_001
+
+ | Property | Value | Unit |
+ | :---: | :---: | :---: |
+ | $T$ | $25.0 \pm 0.5$ | $C$ |
+ | $P$ | $101.3 \pm 0.2$ | $kPa$ |
+ | $R$ | $4.052$ | $kPa/C$ |
+
+ ## Readings
+
+ | $t$ ($s$) | $signal$ ($mV$) |
+ | :---: | :---: |
+ | 0 | 12.34 |
+ | 1 | 12.51 |
+ ```
+
+ ### SampleList
+
+ ```python
+ from samplekit import SampleList
+
+ # Load all samples from a directory
+ samples = SampleList("data/", sample_class=Experiment)
+
+ # Filter and sort
+ hot = samples.filter(lambda s: s.temperature.value > 30)
+ ordered = hot.sort("pressure")
+
+ # Multi-key sort
+ by_group = samples.sort(
+     [lambda s: s.group.value, "temperature"],
+     reverse=[False, True],
+ )
+
+ # Batch save
+ samples.save_all("output/", overwrite=True)
+
+ # Export to pandas DataFrame
+ df = samples.to_dataframe()
+ ```
+
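A multi-key sort with a separate direction per key, like the `reverse=[False, True]` call above, can be built on Python's stable sort. The sketch below is a generic illustration on plain dicts; `multi_sort` is a hypothetical helper, and nothing here should be read as how `SampleList.sort` is implemented.

```python
from operator import itemgetter


def multi_sort(items, keys, reverse):
    """Sort by several keys, each with its own direction (hypothetical helper).

    list.sort is stable, so sorting by the least significant key first and
    the most significant key last yields the combined ordering.
    """
    result = list(items)
    for key, rev in reversed(list(zip(keys, reverse))):
        result.sort(key=key, reverse=rev)
    return result


rows = [
    {"group": "A", "temperature": 20},
    {"group": "B", "temperature": 35},
    {"group": "A", "temperature": 40},
]

# Group ascending, then temperature descending within each group.
ordered = multi_sort(
    rows,
    [itemgetter("group"), itemgetter("temperature")],
    [False, True],
)
print([(r["group"], r["temperature"]) for r in ordered])
# [('A', 40), ('A', 20), ('B', 35)]
```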
+ ---
+
+ ## File format
+
+ SampleKit uses Markdown files with YAML frontmatter:
+
+ ```yaml
+ ---
+ name: SAMPLE_001
+
+ # Scalar properties
+ label: Test coupon A
+ length: {value: 50.12, data: [50.12, 50.08, 50.15], uncertainty: 0.035, unit: mm}
+ width: {value: 25.0, unit: mm}
+ area: 1252.9
+
+ # Table
+ measurements:
+   _title: Resistance vs Temperature
+   _index: T
+   _columns:
+     - {name: T, unit: C}
+     - {name: R, unit: ohm, precision: '.2f'}
+   _rows:
+     - {T: 20, R: 100.5}
+     - {T: 40, R: 102.3}
+ ---
+ ```
+
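For readers who want to inspect such a file without SampleKit, the frontmatter can be separated from the body with a few lines of standard-library code. `split_frontmatter` is a hypothetical helper written for this illustration; the returned header string could then be handed to PyYAML's `yaml.safe_load` to obtain a dict.

```python
def split_frontmatter(text):
    """Return (yaml_header, body) for a Markdown document delimited by '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text  # no frontmatter at all
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            header = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return header, body
    return "", text  # no closing delimiter: treat everything as body


doc = "---\nname: SAMPLE_001\narea: 1252.9\n---\n\n## Notes\n"
header, body = split_frontmatter(doc)
print(header)
print(body)
```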
+ **Properties:**
+ - Bare number if only a value: `area: 1252.9`
+ - Bare string for text: `label: Test coupon A`
+ - Dict with metadata: `{value, data, uncertainty, unit, symbol, symbol_math, precision, ...}`
+ - Conditional storage — redundant fields are omitted (e.g. `symbol_math` if same as `symbol`)
+
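Because a property may be stored either as a bare scalar or as a dict, code that reads the frontmatter directly has to handle both shapes. A minimal sketch, assuming the YAML has already been parsed into Python objects; `normalize_property` and its default values are illustrative assumptions, not SampleKit behavior.

```python
def normalize_property(raw):
    """Expand a parsed frontmatter entry into a dict with explicit fields."""
    if isinstance(raw, dict):
        entry = dict(raw)  # copy so the caller's data is not modified
    else:
        entry = {"value": raw}  # bare number or bare string
    # Fill in fields that conditional storage may have omitted.
    entry.setdefault("uncertainty", None)
    entry.setdefault("unit", "")
    return entry


print(normalize_property(1252.9))
print(normalize_property("Test coupon A"))
print(normalize_property({"value": 25.0, "unit": "mm"}))
```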
+ **Tables:**
+ - `_title`: display title
+ - `_index`: name of the index column
+ - `_columns`: list of column descriptors `[{name, unit, symbol, precision, ...}]`
+ - `_rows`: list of row dicts `[{col: value_or_{value, uncertainty}}]`
+
+ ---
+
+ ## API reference
+
+ ### `Property`
+
+ ```python
+ Property(
+     value=None,          # float, str, list[float], or None
+     uncertainty=None,    # float — overrides auto-stdev if set
+     unit="",             # display unit (text)
+     unit_math=None,      # LaTeX unit (defaults to unit)
+     symbol=None,         # text symbol (defaults to attr name)
+     symbol_math=None,    # LaTeX symbol (defaults to symbol)
+     precision="",        # format spec, e.g. ".2f", ".1e"
+     precision_unc=None,  # format spec for uncertainty (defaults to precision)
+     compute=None,        # callable → lazy, cached value
+     compute_unc=None,    # callable → lazy uncertainty
+     depends_on=None,     # list[Property] → auto-invalidation
+ )
+ ```
+
+ | Member | Description |
+ |---|---|
+ | `.value` | Get/set the value. Lists → auto mean. Setting clears `compute`. |
+ | `.uncertainty` | Get/set uncertainty. Lists → auto stdev. Setting clears `compute_unc`. |
+ | `.data` | Raw measurement list (read-only copy), or `None`. |
+ | `.text` | Shortcut for `.format()` → `"25.0 ± 0.5 mm"`. |
+ | `.format(unit=True)` | Plain-text representation. |
+ | `.is_computed` | `True` if the value comes from a `compute` callback. |
+ | `.invalidate()` | Clear cache and propagate to dependents. |
+
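The interplay of `compute`, `depends_on`, and `.invalidate()` follows a common lazy-evaluation pattern. The toy class below sketches that pattern in isolation; `LazyValue` is a hypothetical stand-in written for this note and makes no claim about the internals of `Property`.

```python
class LazyValue:
    """Toy stand-in for a computed quantity (not SampleKit's Property)."""

    def __init__(self, value=None, compute=None, depends_on=()):
        self._value = value
        self._compute = compute
        self._cached = False
        self._dependents = []
        # Register with each dependency so it can notify us on change.
        for dep in depends_on:
            dep._dependents.append(self)

    @property
    def value(self):
        # Run the callback only when no cached result is available.
        if self._compute is not None and not self._cached:
            self._value = self._compute()
            self._cached = True
        return self._value

    @value.setter
    def value(self, new):
        self._value = new
        self._compute = None  # a manual value replaces the callback
        self.invalidate()

    def invalidate(self):
        """Drop the cached result and propagate to dependents."""
        self._cached = False
        for dep in self._dependents:
            dep.invalidate()


pressure = LazyValue(101.3)
temperature = LazyValue(25.0)
calls = []  # records how often the callback actually runs


def calc_ratio():
    calls.append(1)
    return pressure.value / temperature.value


ratio = LazyValue(compute=calc_ratio, depends_on=[pressure, temperature])
ratio.value
ratio.value                  # second read is served from the cache
print(len(calls))            # 1
temperature.value = 50.0     # invalidates ratio through the dependency link
ratio.value
print(len(calls))            # 2
```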
+ ### `Table`
+
+ ```python
+ Table(
+     columns=None,      # dict[str, Column] — first key is the index
+     index=None,        # list — pre-populate index values
+     data=None,         # dict — static data: {idx: {col: val}}
+     compute=None,      # dict — column-wise: {col: fn(index_vals) → list}
+     compute_unc=None,  # dict — column-wise uncertainty
+     compute_row=None,  # dict — row-wise: {col: (fn, [dep_cols])}
+     title=None,        # display title
+ )
+ ```
+
+ | Member | Description |
+ |---|---|
+ | `.add(**kwargs)` | Add a row. Tuples `(value, unc)` for uncertainty. |
+ | `table(idx)` | Access row by index value → `RowView`. |
+ | `table[i]` | Access row by position → `RowView`. |
+ | `table.col` | Access column → `ColumnView` (`.values`, `.uncertainties`). |
+ | `len(table)` | Number of rows. |
+ | `bool(table)` | `True` if any rows exist. |
+ | `.index_values` | Sorted list of index values. |
+ | `.data_columns` | List of non-index column names. |
+
+ ### `Column`
+
+ ```python
+ Column(
+     unit="",             # display unit
+     unit_math=None,      # LaTeX unit
+     symbol=None,         # text symbol
+     symbol_math=None,    # LaTeX symbol
+     precision="",        # format spec
+     precision_unc=None,  # format spec for uncertainty
+ )
+ ```
+
+ ### `Sample`
+
+ ```python
+ Sample(name=None, filepath=None)
+ ```
+
+ Subclass it and declare Properties and Tables in `__init__`. Assignment auto-registers them and wires names, symbols, and dependencies.
+
+ | Member | Description |
+ |---|---|
+ | `.props` | `dict[str, Property]` — all registered properties. |
+ | `.tables` | `dict[str, Table]` — all registered tables. |
+ | `.save(filepath, style="math")` | Write YAML frontmatter + template body to `.md`. |
+ | `.load(filepath)` | classmethod — load from `.md` file. |
+ | `.template(style="math")` | Override for custom Markdown body. |
+ | `.to_dict()` | Export all data as a plain dict. |
+ | `.to_dataframe()` | Export scalar properties as a single-row DataFrame. |
+
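Auto-registration on attribute assignment, as described for `Sample`, is typically done by overriding `__setattr__`. The following is a minimal sketch of that general technique with hypothetical classes (`Quantity`, `Container`, `Demo`); whether SampleKit uses exactly this mechanism is an assumption.

```python
class Quantity:
    """Toy stand-in for a Property (not SampleKit's class)."""

    def __init__(self, value, symbol=None):
        self.value = value
        self.symbol = symbol
        self.name = None


class Container:
    """Registers every Quantity assigned as an attribute."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry itself.
        object.__setattr__(self, "props", {})

    def __setattr__(self, attr, obj):
        if isinstance(obj, Quantity):
            obj.name = attr
            if obj.symbol is None:
                obj.symbol = attr  # fall back to the attribute name
            self.props[attr] = obj
        object.__setattr__(self, attr, obj)


class Demo(Container):
    def __init__(self):
        super().__init__()
        self.temperature = Quantity(25.0, symbol="T")
        self.pressure = Quantity(101.3)


demo = Demo()
print(list(demo.props))      # ['temperature', 'pressure']
print(demo.pressure.symbol)  # pressure
```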
+ ### `SampleList`
+
+ ```python
+ SampleList(source=None, sample_class=Sample, pattern="*.md")
+ ```
+
+ `source` can be a directory path, a list of file paths, or a list of Sample objects.
+
+ | Member | Description |
+ |---|---|
+ | `.filter(func)` | New SampleList with matching samples. |
+ | `.sort(key, reverse=False)` | Sort by property name, callable, or list (multi-key). |
+ | `.save_all(directory, overwrite=False)` | Save each sample as `{name}.md`. |
+ | `.append(sample)` | Add a sample. |
+ | `samples[i]`, `samples["name"]`, `samples[a:b]` | Access by index, name, or slice. |
+ | `.to_dataframe()` | Concatenated DataFrame, one row per sample. |
+ | `.stats(prop_name)` | Descriptive statistics for a property across all samples. |
+
+ ### `report`
+
+ | Function | Description |
+ |---|---|
+ | `heading(text, level=2)` | Markdown heading. |
+ | `format_property(prop, style, unit=True)` | Format a Property in `"text"` or `"math"` mode. |
+ | `properties_table(sample, names, style="math")` | Render selected properties as a markdown table. |
+ | `table_to_markdown(table, style="math")` | Render a Table as a markdown table. |
+ | `markdown_table(rows, headers, align)` | Generic markdown table builder. |
+
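To show the kind of output a generic table builder produces, here is a small standalone function that emits the same pipe-table layout as the generated `EXP_001.md` example. `build_markdown_table` and its `align` values are assumptions made for this sketch; the actual signature and behavior of `report.markdown_table` may differ.

```python
def build_markdown_table(rows, headers, align="center"):
    """Build a pipe table; `align` is 'left', 'center', or 'right'."""
    rule = {"left": ":---", "center": ":---:", "right": "---:"}[align]
    lines = [
        "| " + " | ".join(str(h) for h in headers) + " |",
        "| " + " | ".join(rule for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


print(build_markdown_table([[0, 12.34], [1, 12.51]], ["t (s)", "signal (mV)"]))
```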
+ ### `converters`
+
+ | Function | Description |
+ |---|---|
+ | `sample_to_dict(sample)` | Sample → plain dict. |
+ | `sample_to_dataframe(sample)` | Sample → single-row DataFrame. |
+ | `samplelist_to_dataframe(slist)` | SampleList → DataFrame (one row per sample). |
+ | `samplelist_stats(slist, prop)` | Descriptive statistics (via `pd.Series.describe()`). |
+
+ These are also accessible as methods: `sample.to_dict()`, `samples.to_dataframe()`, etc.
+
+ ---
+
+ ## Roadmap
+
+ - [ ] **CLI** — Terminal access to samples: read, compare, export.
+ - [ ] **TUI** — Interactive terminal explorer with filtering and keyboard navigation.
+ - [ ] **Plotting** — Matplotlib integration with automatic axis labels from metadata and error bars.
+ - [ ] **LaTeX report** — Direct PDF generation from sample data.
+ - [ ] **Additional export formats** — CSV, JSON, LaTeX tables.
+
+ ---
+
+ ## Authors
+
+ **Thomas Lavie** ([@zelyph](https://github.com/zelyph)) — design, original implementation, PhD research context
+
+ **GitHub Copilot** (Claude) — v0.1 rewrite, architecture, documentation
+
+ ---
+
+ ## License
+
+ [MIT](LICENSE)
+
+
@@ -0,0 +1,12 @@
+ samplekit/__init__.py,sha256=71nA0Dl9W-TV2GgsOGYoJL9u0Z6Lv1nr6S4pyQcS2B4,402
+ samplekit/converters.py,sha256=t4I5Gbg3XNI3vxpoNRSEA-5-a_m3TgOfNEE370Ixlxk,3344
+ samplekit/property.py,sha256=ysS27_RCwYl1sRRFX3JrkJtH_X25JETTydL5lHWoLRA,12058
+ samplekit/report.py,sha256=OafxF-xBNo7RbHV9WRnAGgr8djhbq8MeO61J5mUuCVY,7941
+ samplekit/sample.py,sha256=TE7IJ6HLE8sT2Yk0Qpr_8yU1VtROiqxommyShlUv9_8,12483
+ samplekit/sample_list.py,sha256=vefxR7vi7Re7XFQNjftBSMqUtdcYgkPCGjc-vCMugg4,6974
+ samplekit/table.py,sha256=hN_gl1jDEthh0IXhOgib8u2dXxP42mpsI-6P6apuO5k,24118
+ samplekit-0.1.0.dist-info/licenses/LICENSE,sha256=-8aZAtBhtY2_SNUZ_QPFIaeI7zWp3mmOhPxnnOjhH3c,1067
+ samplekit-0.1.0.dist-info/METADATA,sha256=Uf1aq2quHoxZwDHFEegWBzwjaQog80EZiaX-i1x0K40,14981
+ samplekit-0.1.0.dist-info/WHEEL,sha256=aeYiig01lYGDzBgS8HxWXOg3uV61G9ijOsup-k9o1sk,91
+ samplekit-0.1.0.dist-info/top_level.txt,sha256=TzRpK_mefugHna4tru0gYUUPIFFIP8fZGD2t0GSrs7E,10
+ samplekit-0.1.0.dist-info/RECORD,,
@@ -0,0 +1,5 @@
+ Wheel-Version: 1.0
+ Generator: setuptools (82.0.1)
+ Root-Is-Purelib: true
+ Tag: py3-none-any
+
@@ -0,0 +1,8 @@
+ Copyright (c) 2025 zelyph
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
@@ -0,0 +1 @@
+ samplekit