pearmut 1.0.0__py3-none-any.whl → 1.0.1__py3-none-any.whl

pearmut/static/index.html CHANGED
@@ -1 +1 @@
1
- <!doctype html><html lang="en" style="height: 100%;"><head><meta charset="UTF-8"><meta name="viewport" content="width=900px"><title>Pearmut Evaluation</title><link rel="icon" type="image/svg+xml" href="favicon.svg"><script defer="defer" src="index.bundle.js?148e44d47bac0dd405e1"></script><link href="style.css?148e44d47bac0dd405e1" rel="stylesheet"></head><body><div class="white-box" style="width: max-content; font-size: large; position: absolute; top: 50%; left: 50%; transform: translate(-50%, -50%);">You have reached the Pearmut🍐 evaluation interface.<ul><li>If you are an annotator, you should have received a specialized link that takes you to the annotations.</li><li>If you are annotation manager, then you should distribute these links.</li></ul><br><br>See the <a href="https://github.com/zouharvi/pearmut">Pearmut project on GitHub</a>. Made with 💚 by Vilém Zouhar and others in 2025-2026.</div></body></html>
1
+ <!doctype html><html lang="en" style="height: 100%;"><head><meta charset="UTF-8"><meta name="viewport" content="width=900px"><title>Pearmut Evaluation</title><link rel="icon" type="image/svg+xml" href="favicon.svg"><script defer="defer" src="index.bundle.js?0d289122fd490c931aec"></script><link href="style.css?0d289122fd490c931aec" rel="stylesheet"></head><body><div class="white-box" style="width: max-content; font-size: large; position: absolute; top: 50%; left: 50%; transform: translate(-50%, -50%);">You have reached the Pearmut🍐 evaluation interface.<ul><li>If you are an annotator, you should have received a specialized link that takes you to the annotations.</li><li>If you are annotation manager, then you should distribute these links.</li></ul><br><br>See the <a href="https://github.com/zouharvi/pearmut">Pearmut project on GitHub</a>. Made with 💚 by Vilém Zouhar and others in 2025-2026.</div></body></html>
pearmut/utils.py CHANGED
@@ -20,20 +20,8 @@ def load_progress_data(warn: str | None = None):
20
20
 
21
21
 
22
22
  def save_progress_data(data):
23
-     # Convert sets to lists for JSON serialization
24
-     def convert_sets(obj):
25
-         if isinstance(obj, dict):
26
-             return {k: convert_sets(v) for k, v in obj.items()}
27
-         elif isinstance(obj, list):
28
-             return [convert_sets(item) for item in obj]
29
-         elif isinstance(obj, set):
30
-             return list(obj)
31
-         else:
32
-             return obj
33
-
34
-     serializable_data = convert_sets(data)
35
23
      with open(f"{ROOT}/data/progress.json", "w") as f:
36
-         json.dump(serializable_data, f, indent=2)
24
+         json.dump(data, f, indent=2)
37
25
 
38
26
 
39
27
  _logs = {}
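
Note on the `save_progress_data` hunk: the removed `convert_sets` helper existed because the standard `json` module cannot serialize `set` objects. The simplified version therefore presumably relies on the progress data no longer containing sets; that is an assumption, since the callers are not part of this diff. The minimal sketch below illustrates the failure mode and one alternative. The `progress` structure and the `encode_set` hook are hypothetical and do not come from the package.

```python
import json

# Hypothetical progress structure; the real schema is not shown in this diff.
progress = {"campaign": {"user_a": {"done": {1, 2, 3}}}}

# Plain json.dumps rejects sets with a TypeError.
try:
    json.dumps(progress)
    raised = False
except TypeError:
    raised = True

def encode_set(obj):
    """Fallback hook for json.dumps: serialize sets as sorted lists."""
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# The `default` hook is called only for objects json cannot handle natively.
encoded = json.dumps(progress, default=encode_set, indent=2)
decoded = json.loads(encoded)
```

A `default=` hook avoids walking the whole structure by hand, but note that sets come back as lists after a round trip either way.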
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: pearmut
3
- Version: 1.0.0
3
+ Version: 1.0.1
4
4
  Summary: A tool for evaluation of model outputs, primarily MT.
5
5
  Author-email: Vilém Zouhar <vilem.zouhar@gmail.com>
6
6
  License: MIT
@@ -19,17 +19,10 @@ Provides-Extra: dev
19
19
  Requires-Dist: pytest; extra == "dev"
20
20
  Dynamic: license-file
21
21
 
22
- # Pearmut 🍐
22
+ # 🍐Pearmut &nbsp; &nbsp; [![PyPi version](https://badgen.net/pypi/v/pearmut/)](https://pypi.org/project/pearmut) [![PyPI download/month](https://img.shields.io/pypi/dm/pearmut.svg)](https://pypi.python.org/pypi/pearmut/) [![PyPi license](https://badgen.net/pypi/license/pearmut/)](https://pypi.org/project/pearmut/) [![build status](https://github.com/zouharvi/pearmut/actions/workflows/test.yml/badge.svg)](https://github.com/zouharvi/pearmut/actions/workflows/test.yml)
23
23
 
24
24
  **Platform for Evaluation and Reviewing of Multilingual Tasks**: Evaluate model outputs for translation and NLP tasks with support for multimodal data (text, video, audio, images) and multiple annotation protocols ([DA](https://aclanthology.org/N15-1124/), [ESA](https://aclanthology.org/2024.wmt-1.131/), [ESA<sup>AI</sup>](https://aclanthology.org/2025.naacl-long.255/), [MQM](https://doi.org/10.1162/tacl_a_00437), and more!).
25
25
 
26
- [![PyPi version](https://badgen.net/pypi/v/pearmut/)](https://pypi.org/project/pearmut)
27
- &nbsp;
28
- [![PyPI download/month](https://img.shields.io/pypi/dm/pearmut.svg)](https://pypi.python.org/pypi/pearmut/)
29
- &nbsp;
30
- [![PyPi license](https://badgen.net/pypi/license/pearmut/)](https://pypi.org/project/pearmut/)
31
- &nbsp;
32
- [![build status](https://github.com/zouharvi/pearmut/actions/workflows/test.yml/badge.svg)](https://github.com/zouharvi/pearmut/actions/workflows/test.yml)
33
26
 
34
27
  <img width="1000" alt="Screenshot of ESA/MQM interface" src="https://github.com/user-attachments/assets/71334238-300b-4ffc-b777-7f3c242b1630" />
35
28
 
@@ -52,6 +45,8 @@ Dynamic: license-file
52
45
  - [Terminology](#terminology)
53
46
  - [Development](#development)
54
47
  - [Citation](#citation)
48
+ - [Changelog](#changelog)
49
+
55
50
 
56
51
  ## Quick Start
57
52
 
@@ -111,7 +106,9 @@ Campaigns are defined in JSON files (see [examples/](examples/)). The simplest c
111
106
  }
112
107
  ```
113
108
 
114
- Each item has to have `src` (string) and `tgt` (dictionary from model names to strings, even for a single model evaluation).
109
+ Each item has to have `tgt` (dictionary from model names to strings, even for a single model evaluation).
110
+ Optionally, you can also include `src` (source string) and/or `ref` (reference string).
111
+ If neither `src` nor `ref` is provided, only the model outputs will be displayed.
115
112
  For full Pearmut functionality (e.g. automatic statistical analysis), add `item_id` as well.
116
113
  Any other keys that you add will simply be stored in the logs.
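
The item rules described above (required `tgt`, optional `src` and `ref`) could be checked before launching a campaign. The following is only a sketch of such a check; `check_item` is a hypothetical helper, not a Pearmut function, and the real validation in the package may differ.

```python
def check_item(item: dict) -> list[str]:
    """Return a list of problems found in one campaign item (empty if none)."""
    problems = []

    # `tgt` is required and maps model names to output strings.
    tgt = item.get("tgt")
    if not isinstance(tgt, dict):
        problems.append("`tgt` must be a dictionary from model names to strings")
    elif not all(isinstance(k, str) and isinstance(v, str) for k, v in tgt.items()):
        problems.append("`tgt` keys and values must all be strings")

    # `src` and `ref` are optional, but must be strings when present.
    for key in ("src", "ref"):
        if key in item and not isinstance(item[key], str):
            problems.append(f"`{key}` must be a string when provided")

    return problems


# Illustrative items; the field values are made up.
item_ok = {"tgt": {"model_a": "Hello world"}, "ref": "Hello, world", "item_id": "doc1_0"}
item_bad = {"src": "Hallo Welt", "tgt": "Hello world"}

problems_ok = check_item(item_ok)
problems_bad = check_item(item_bad)
```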
117
114
 
@@ -145,6 +142,40 @@ The `shuffle` parameter in campaign `info` controls this behavior:
145
142
  }
146
143
  ```
147
144
 
145
+ ### Custom Score Sliders
146
+
147
+ For multi-dimensional evaluation tasks (e.g., assessing fluency on a Likert scale), you can define custom sliders with specific ranges and steps:
148
+
149
+ ```python
150
+ {
151
+   "info": {
152
+     "assignment": "task-based",
153
+     "protocol": "ESA",
154
+     "sliders": [
155
+       {"name": "Fluency", "min": 0, "max": 5, "step": 1},
156
+       {"name": "Adequacy", "min": 0, "max": 100, "step": 1}
157
+     ]
158
+   },
159
+   "campaign_id": "my_campaign",
160
+   "data": [...]
161
+ }
162
+ ```
163
+
164
+ When `sliders` is specified, only the custom sliders are shown. Each slider must have `name`, `min`, `max`, and `step` properties. All sliders must be answered before proceeding.
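
As a rough illustration of the slider requirements stated above, a campaign file could be sanity-checked like this before use. `validate_sliders` is a hypothetical helper, not part of Pearmut; the range checks (`min < max`, positive `step`) are extra assumptions that the README does not state.

```python
REQUIRED_KEYS = ("name", "min", "max", "step")

def validate_sliders(sliders: list[dict]) -> None:
    """Raise ValueError if any slider definition looks malformed."""
    for i, slider in enumerate(sliders):
        # Documented requirement: every slider has name, min, max, and step.
        missing = [key for key in REQUIRED_KEYS if key not in slider]
        if missing:
            raise ValueError(f"slider {i} is missing: {', '.join(missing)}")
        # Assumption (not stated in the README): the range must be non-empty.
        if slider["min"] >= slider["max"]:
            raise ValueError(f"slider {i}: min must be smaller than max")
        # Assumption (not stated in the README): the step must be positive.
        if slider["step"] <= 0:
            raise ValueError(f"slider {i}: step must be positive")


sliders = [
    {"name": "Fluency", "min": 0, "max": 5, "step": 1},
    {"name": "Adequacy", "min": 0, "max": 100, "step": 1},
]
validate_sliders(sliders)  # passes silently for the README example
```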
165
+
166
+ ### Custom Instructions
167
+
168
+ Set campaign-level instructions using the `instructions` field in `info` (supports HTML).
169
+ Instructions default to protocol-specific ones (DA: scoring, ESA: error spans + scoring, MQM: error spans + categories + scoring).
170
+ ```python
171
+ {
172
+   "info": {
173
+     "protocol": "DA",
174
+     "instructions": "Rate translation quality on a 0-100 scale.<br>Pay special attention to document-level phenomena."
175
+   }
176
+ }
177
+ ```
178
+
148
179
  ### Pre-filled Error Spans (ESA<sup>AI</sup>)
149
180
 
150
181
  Include `error_spans` to pre-fill annotations that users can review, modify, or delete:
@@ -263,7 +294,7 @@ All items must contain outputs from all models for this assignment type to work
263
294
  **How it works:**
264
295
  1. Initial phase: Each model gets `dynamic_first` annotations with fully random contrastive evaluation
265
296
  2. Dynamic phase: After the initial phase, top `dynamic_top` models (by average score) are identified
266
- 3. Contrastive evaluatoin: From the top N models, `dynamic_contrastive_models` models are randomly selected for each item
297
+ 3. Contrastive evaluation: From the top N models, `dynamic_contrastive_models` models are randomly selected for each item
267
298
  4. Item prioritization: Items with the least annotations for the selected models are prioritized
268
299
  5. Backoff: With probability `dynamic_backoff`, uniform random selection is used instead to maintain exploration
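
The five steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in `pearmut/assignment.py`: the function `pick_next`, its data structures (`scores`, `counts`), and the tie-breaking behavior are all hypothetical. Only the parameter names `dynamic_first`, `dynamic_top`, `dynamic_contrastive_models`, and `dynamic_backoff` come from the README.

```python
import random

def average(values):
    """Mean of a list, or 0.0 when it is empty."""
    return sum(values) / len(values) if values else 0.0

def pick_next(scores, counts, items, *, dynamic_first, dynamic_top,
              dynamic_contrastive_models, dynamic_backoff, rng):
    """Pick the next item and the models to compare on it.

    scores: model name -> list of scores collected so far (hypothetical layout)
    counts: (item, model) -> number of annotations so far (hypothetical layout)
    """
    models = sorted(scores)
    in_initial_phase = any(len(scores[m]) < dynamic_first for m in models)

    # Steps 1 and 5: fully random choice during the initial phase,
    # or with probability `dynamic_backoff` afterwards.
    if in_initial_phase or rng.random() < dynamic_backoff:
        k = min(dynamic_contrastive_models, len(models))
        return rng.choice(items), sorted(rng.sample(models, k))

    # Step 2: keep the top models by average score.
    top = sorted(models, key=lambda m: average(scores[m]), reverse=True)[:dynamic_top]

    # Step 3: randomly select the contrastive models from the top ones.
    k = min(dynamic_contrastive_models, len(top))
    chosen = rng.sample(top, k)

    # Step 4: prefer the item with the fewest annotations for the chosen models.
    item = min(items, key=lambda it: sum(counts.get((it, m), 0) for m in chosen))
    return item, sorted(chosen)


# Illustrative data; model names, scores, and counts are made up.
scores = {"A": [90, 80], "B": [50, 60], "C": [10, 20]}
counts = {("i1", "A"): 2, ("i1", "B"): 1, ("i2", "A"): 0, ("i2", "B"): 1}
item, chosen = pick_next(
    scores, counts, ["i1", "i2"],
    dynamic_first=2, dynamic_top=2, dynamic_contrastive_models=2,
    dynamic_backoff=0.0, rng=random.Random(0),
)
```

With `dynamic_backoff=0.0` and `dynamic_top` equal to `dynamic_contrastive_models`, the choice is deterministic, which makes the sketch easy to check by hand; in a real campaign a non-zero backoff keeps weaker models from being starved of annotations.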
269
300
 
@@ -289,6 +320,7 @@ The `users` field accepts:
289
320
  }
290
321
  ```
291
322
 
323
+
292
324
  ### Multimodal Annotations
293
325
 
294
326
  Support for HTML-compatible elements (YouTube embeds, `<video>` tags, images). Ensure elements are pre-styled. See [examples/multimodal.json](examples/multimodal.json).
@@ -418,3 +450,61 @@ If you use this work in your paper, please cite as following.
418
450
  ```
419
451
 
420
452
  Contributions are welcome! Please reach out to [Vilém Zouhar](mailto:vilem.zouhar@gmail.com).
453
+
454
+ # Changelog
455
+
456
+ - v1.0.1
457
+ - Support RTL languages
458
+ - Add boxes for references
459
+ - Add custom score sliders for multi-dimensional evaluation
460
+ - Make instructions customizable and protocol-dependent
461
+ - Support custom sliders
462
+ - Purge/reset whole tasks from dashboard
463
+ - Fix resetting individual users in single-stream/dynamic
464
+ - Fix notification stacking
465
+ - Add campaigns from dashboard
466
+ - v0.3.3
467
+ - Rename `doc_id` to `item_id`
468
+ - Add Typst, LaTeX, and PDF export for model ranking tables. Hide them by default.
469
+ - Add dynamic assignment type with contrastive model comparison
470
+ - Add `instructions_goodbye` field with variable substitution
471
+ - Add visual anchors at 33% and 66% on sliders
472
+ - Add German→English ESA tutorial with attention checks
473
+ - Validate document model consistency before shuffle
474
+ - Fix UI block on any interaction
475
+ - v0.3.2
476
+ - Revert seeding of user IDs
477
+ - Set ESA (Error Span Annotation) as default
478
+ - Update server IP address configuration
479
+ - Show approximate alignment by default
480
+ - Unify pointwise and listwise interfaces into `basic`
481
+ - Refactor protocol configuration (breaking change)
482
+ - v0.2.11
483
+ - Add comment field in settings panel
484
+ - Add `score_gt` validation for listwise comparisons
485
+ - Add Content-Disposition headers for proper download filenames
486
+ - Add model results display to dashboard with rankings
487
+ - Add campaign file structure validation
488
+ - Purge command now unlinks assets
489
+ - v0.2.6
490
+ - Add frozen annotation links feature for view-only mode
491
+ - Add word-level annotation mode toggle for error spans
492
+ - Add `[missing]` token support
493
+ - Improve frontend speed and cleanup toolboxes on item load
494
+ - Host assets via symlinks
495
+ - Add validation threshold for success/fail tokens
496
+ - Implement reset masking for annotations
497
+ - Allow pre-defined user IDs and tokens in campaign data
498
+ - v0.1.1
499
+ - Set server defaults and add VM launch scripts
500
+ - Add warning dialog when navigating away with unsaved work
501
+ - Add tutorial validation support for pointwise and listwise
502
+ - Add ability to preview existing annotations via progress bar
503
+ - Add support for ESA<sup>AI</sup> pre-filled error_spans
504
+ - Rename pairwise to listwise and update layout
505
+ - Implement single-stream assignment type
506
+ - v0.0.3
507
+ - Support multimodal inputs and outputs
508
+ - Add dashboard
509
+ - Implement ESA (Error Span Annotation) and MQM support
510
+
@@ -0,0 +1,20 @@
1
+ pearmut/app.py,sha256=R33IAsiElYhr_17eCJ2gNSa6XGmnp_qB1-RaNDqItR4,12963
2
+ pearmut/assignment.py,sha256=wuRtP3WSsyRPk432n8EbjJI9GgK0ruJq0PLKDmtM87w,22105
3
+ pearmut/cli.py,sha256=hvKX8et4jWMs0W_O3Da12d8zTb7E2Yw4mlPd49cXYfE,26400
4
+ pearmut/constants.py,sha256=iYONCk2kyYcKy3kikhSKyXRKZ1lWVaVFdcWh6kUYTrQ,4844
5
+ pearmut/results_export.py,sha256=UxtbbqbrqVJDBSYQf-aT3M7lMgQgQiepHkHy4TSZrGg,5743
6
+ pearmut/utils.py,sha256=Rkc08bzm4Z96n_Ks8R--2c2B9TUTWB9z9QQxuyL8bAA,4368
7
+ pearmut/static/basic.bundle.js,sha256=N6ECQMPGq6hLDrwmU-ys5aegiMJu-X20Cfg73vawk7I,114962
8
+ pearmut/static/basic.html,sha256=URS37Iwum2ttnJ886LuoFHs55EjoNTHyI6YKj1TwDTc,5075
9
+ pearmut/static/dashboard.bundle.js,sha256=lNQh_pcKCGOSHLsQoBkU1pahhs6hql8GFm9YqpQRv0Y,105990
10
+ pearmut/static/dashboard.html,sha256=qBQak_12-tQW3vdAXPm2cxdeddQIIOGo_jUXtWsWW3c,3356
11
+ pearmut/static/favicon.svg,sha256=gVPxdBlyfyJVkiMfh8WLaiSyH4lpwmKZs8UiOeX8YW4,7347
12
+ pearmut/static/index.bundle.js,sha256=-koQkaoRCei-H40wozYnvf0PnrAoZbtOXHotJcTn5OM,346
13
+ pearmut/static/index.html,sha256=ImMkxquH2zfNOaZEIPkUoPAga79pTdTIuOm3yKWLMoE,930
14
+ pearmut/static/style.css,sha256=NKdwsugsS946w3pREfLab2Rf3Av9hNk0fvkTxmhyGrQ,4102
15
+ pearmut-1.0.1.dist-info/licenses/LICENSE,sha256=GtR6RcTdRn-P23h5pKFuWSLZrLPD0ytHAwSOBt7aLpI,1071
16
+ pearmut-1.0.1.dist-info/METADATA,sha256=z67XRDVps98CLH5XLmrTUiKhO2wXHbksu2Ez6OtDDJA,21663
17
+ pearmut-1.0.1.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
18
+ pearmut-1.0.1.dist-info/entry_points.txt,sha256=eEA9LVWsS3neQbMvL_nMvEw8I0oFudw8nQa1iqxOiWM,45
19
+ pearmut-1.0.1.dist-info/top_level.txt,sha256=CdgtUM-SKQDt6o5g0QreO-_7XTBP9_wnHMS1P-Rl5Go,8
20
+ pearmut-1.0.1.dist-info/RECORD,,
@@ -1,19 +0,0 @@
1
- pearmut/app.py,sha256=eZgJjQfBi5WuNYyc91JB_wo_8dEklhNbzmbDh228f_0,10635
2
- pearmut/assignment.py,sha256=wtOyiEycm-yiYPt9NfSnOLa52bz6vDj7M4_6jaHoMi4,20011
3
- pearmut/cli.py,sha256=-79930TRcNqBDOhWvxGJnhEX8mYH0OA1EgpfYz3H_uI,24711
4
- pearmut/results_export.py,sha256=UxtbbqbrqVJDBSYQf-aT3M7lMgQgQiepHkHy4TSZrGg,5743
5
- pearmut/utils.py,sha256=7CUemQHlQnOhc_a07QXVivdd8DodVykI6dek151ftTs,4798
6
- pearmut/static/basic.bundle.js,sha256=LnPSRoU-05MGLzS6tYT50eAxWpXIYj8rtjdfT4biyGc,110682
7
- pearmut/static/basic.html,sha256=s1Y9Qsn-VuQNoW4Pi9BNQ03z7W5wOTPRi7dY88qT59U,5998
8
- pearmut/static/dashboard.bundle.js,sha256=jtSI2-UehI_tMJMWJLRAopr1nlXUMQD7ohCGU6JGQEo,102109
9
- pearmut/static/dashboard.html,sha256=T-4J82egtpuEhjF3LVET0JENJ2vPMiFA351W9bnCgsg,3187
10
- pearmut/static/favicon.svg,sha256=gVPxdBlyfyJVkiMfh8WLaiSyH4lpwmKZs8UiOeX8YW4,7347
11
- pearmut/static/index.bundle.js,sha256=-koQkaoRCei-H40wozYnvf0PnrAoZbtOXHotJcTn5OM,346
12
- pearmut/static/index.html,sha256=r4PHyLh0JZ99nAZVlfcq70XIBRzoI4_C5MMXiL9kktw,930
13
- pearmut/static/style.css,sha256=NKdwsugsS946w3pREfLab2Rf3Av9hNk0fvkTxmhyGrQ,4102
14
- pearmut-1.0.0.dist-info/licenses/LICENSE,sha256=GtR6RcTdRn-P23h5pKFuWSLZrLPD0ytHAwSOBt7aLpI,1071
15
- pearmut-1.0.0.dist-info/METADATA,sha256=MaOqnfxUJgsyFXiFIFOutjNsy3TuDm1auuVSI74YiSw,18060
16
- pearmut-1.0.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
17
- pearmut-1.0.0.dist-info/entry_points.txt,sha256=eEA9LVWsS3neQbMvL_nMvEw8I0oFudw8nQa1iqxOiWM,45
18
- pearmut-1.0.0.dist-info/top_level.txt,sha256=CdgtUM-SKQDt6o5g0QreO-_7XTBP9_wnHMS1P-Rl5Go,8
19
- pearmut-1.0.0.dist-info/RECORD,,