pearmut 1.0.0.tar.gz → 1.0.2.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {pearmut-1.0.0 → pearmut-1.0.2}/PKG-INFO +87 -16
- {pearmut-1.0.0 → pearmut-1.0.2}/README.md +86 -15
- {pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/PKG-INFO +87 -16
- {pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/SOURCES.txt +3 -2
- {pearmut-1.0.0 → pearmut-1.0.2}/pyproject.toml +1 -1
- {pearmut-1.0.0 → pearmut-1.0.2}/server/app.py +103 -2
- {pearmut-1.0.0 → pearmut-1.0.2}/server/assignment.py +59 -25
- {pearmut-1.0.0 → pearmut-1.0.2}/server/cli.py +241 -150
- pearmut-1.0.2/server/constants.py +93 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/server/results_export.py +1 -1
- pearmut-1.0.2/server/static/annotate.bundle.js +1 -0
- pearmut-1.0.2/server/static/annotate.html +160 -0
- pearmut-1.0.2/server/static/dashboard.bundle.js +1 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/server/static/dashboard.html +6 -1
- {pearmut-1.0.0 → pearmut-1.0.2}/server/static/index.html +1 -1
- {pearmut-1.0.0 → pearmut-1.0.2}/server/static/style.css +8 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/server/utils.py +4 -14
- pearmut-1.0.0/server/static/basic.bundle.js +0 -1
- pearmut-1.0.0/server/static/basic.html +0 -97
- pearmut-1.0.0/server/static/dashboard.bundle.js +0 -1
- {pearmut-1.0.0 → pearmut-1.0.2}/LICENSE +0 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/dependency_links.txt +0 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/entry_points.txt +0 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/requires.txt +0 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/top_level.txt +0 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/server/static/favicon.svg +0 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/server/static/index.bundle.js +0 -0
- {pearmut-1.0.0 → pearmut-1.0.2}/setup.cfg +0 -0

{pearmut-1.0.0 → pearmut-1.0.2}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pearmut
-Version: 1.0.0
+Version: 1.0.2
 Summary: A tool for evaluation of model outputs, primarily MT.
 Author-email: Vilém Zouhar <vilem.zouhar@gmail.com>
 License: MIT
@@ -19,17 +19,10 @@ Provides-Extra: dev
 Requires-Dist: pytest; extra == "dev"
 Dynamic: license-file
 
-# Pearmut
+# 🍐Pearmut <br> [](https://pypi.org/project/pearmut) [](https://pypi.python.org/pypi/pearmut/) [](https://pypi.org/project/pearmut/) [](https://github.com/zouharvi/pearmut/actions/workflows/test.yml) [](https://arxiv.org/abs/2601.02933)
 
 **Platform for Evaluation and Reviewing of Multilingual Tasks**: Evaluate model outputs for translation and NLP tasks with support for multimodal data (text, video, audio, images) and multiple annotation protocols ([DA](https://aclanthology.org/N15-1124/), [ESA](https://aclanthology.org/2024.wmt-1.131/), [ESA<sup>AI</sup>](https://aclanthology.org/2025.naacl-long.255/), [MQM](https://doi.org/10.1162/tacl_a_00437), and more!).
 
-[](https://pypi.org/project/pearmut)
-
-[](https://pypi.python.org/pypi/pearmut/)
-
-[](https://pypi.org/project/pearmut/)
-
-[](https://github.com/zouharvi/pearmut/actions/workflows/test.yml)
 
 <img width="1000" alt="Screenshot of ESA/MQM interface" src="https://github.com/user-attachments/assets/71334238-300b-4ffc-b777-7f3c242b1630" />
 
@@ -52,6 +45,8 @@ Dynamic: license-file
 - [Terminology](#terminology)
 - [Development](#development)
 - [Citation](#citation)
+- [Changelog](#changelog)
+
 
 ## Quick Start
 
@@ -111,7 +106,9 @@ Campaigns are defined in JSON files (see [examples/](examples/)). The simplest c
 }
 ```
 
-Each item has to have `
+Each item has to have `tgt` (dictionary from model names to strings, even for a single model evaluation).
+Optionally, you can also include `src` (source string) and/or `ref` (reference string).
+If neither `src` nor `ref` is provided, only the model outputs will be displayed.
 For full Pearmut functionality (e.g. automatic statistical analysis), add `item_id` as well.
 Any other keys that you add will simply be stored in the logs.
 
@@ -145,6 +142,74 @@ The `shuffle` parameter in campaign `info` controls this behavior:
 }
 ```
 
+### Showing Model Names
+
+By default, model names are hidden to avoid biasing annotators. To display model names on top of each output block, set `show_model_names` to `true`:
+```python
+{
+    "info": {
+        "assignment": "task-based",
+        "protocol": "ESA",
+        "show_model_names": true  # Default: false.
+    },
+    "campaign_id": "my_campaign",
+    "data": [...]
+}
+```
+
+### Custom Score Sliders
+
+For multi-dimensional evaluation tasks (e.g., assessing fluency on a Likert scale), you can define custom sliders with specific ranges and steps:
+
+```python
+{
+    "info": {
+        "assignment": "task-based",
+        "protocol": "ESA",
+        "sliders": [
+            {"name": "Fluency", "min": 0, "max": 5, "step": 1},
+            {"name": "Adequacy", "min": 0, "max": 100, "step": 1}
+        ]
+    },
+    "campaign_id": "my_campaign",
+    "data": [...]
+}
+```
+
+When `sliders` is specified, only the custom sliders are shown. Each slider must have `name`, `min`, `max`, and `step` properties. All sliders must be answered before proceeding.
+
+### Textfield for Post-editing/Translation
+
+Enable a textfield for post-editing or translation tasks using the `textfield` parameter in `info`. The textfield content is stored in annotations alongside scores and error spans.
+
+```python
+{
+    "info": {
+        "protocol": "DA",
+        "textfield": "prefilled"  # Options: null, "hidden", "visible", "prefilled"
+    }
+}
+```
+
+**Textfield modes:**
+- `null` or omitted: No textfield (default)
+- `"hidden"`: Textfield hidden by default, shown by clicking a button
+- `"visible"`: Textfield always visible
+- `"prefilled"`: Textfield visible and pre-filled with model output for post-editing
+
+### Custom Instructions
+
+Set campaign-level instructions using the `instructions` field in `info` (supports HTML).
+Instructions default to protocol-specific ones (DA: scoring, ESA: error spans + scoring, MQM: error spans + categories + scoring).
+```python
+{
+    "info": {
+        "protocol": "DA",
+        "instructions": "Rate translation quality on a 0-100 scale.<br>Pay special attention to document-level phenomena."
+    }
+}
+```
+
 ### Pre-filled Error Spans (ESA<sup>AI</sup>)
 
 Include `error_spans` to pre-fill annotations that users can review, modify, or delete:
@@ -263,7 +328,7 @@ All items must contain outputs from all models for this assignment type to work
 **How it works:**
 1. Initial phase: Each model gets `dynamic_first` annotations with fully random contrastive evaluation
 2. Dynamic phase: After the initial phase, top `dynamic_top` models (by average score) are identified
-3. Contrastive
+3. Contrastive evaluation: From the top N models, `dynamic_contrastive_models` models are randomly selected for each item
 4. Item prioritization: Items with the least annotations for the selected models are prioritized
 5. Backoff: With probability `dynamic_backoff`, uniform random selection is used instead to maintain exploration
 
@@ -289,6 +354,7 @@ The `users` field accepts:
 }
 ```
 
+
 ### Multimodal Annotations
 
 Support for HTML-compatible elements (YouTube embeds, `<video>` tags, images). Ensure elements are pre-styled. See [examples/multimodal.json](examples/multimodal.json).
@@ -369,7 +435,7 @@ Customize the goodbye message shown to users when they complete all annotations
 - **Score**: Numeric quality rating (0-100)
 - **Error Spans**: Text highlights marking errors with severity (`minor`, `major`)
 - **Error Categories**: MQM taxonomy labels for errors
-- **Template**: The annotation interface type. The `
+- **Template**: The annotation interface type. The `annotate` template supports comparing multiple outputs simultaneously.
 - **Assignment**: The method for distributing items to users:
   - **Task-based**: Each user has predefined items
   - **Single-stream**: Users draw from a shared pool with random assignment
@@ -400,7 +466,7 @@ pearmut run
 2. Add build rule to `webpack.config.js`
 3. Reference as `info->template` in campaign JSON
 
-See [web/src/
+See [web/src/annotate.ts](web/src/annotate.ts) for example.
 
 ### Deployment
 
@@ -411,10 +477,15 @@ Run on public server or tunnel local port to public IP/domain and run locally.
 If you use this work in your paper, please cite as following.
 ```bibtex
 @misc{zouhar2026pearmut,
-
-
-
+  title={Pearmut: Human Evaluation of Translation Made Trivial},
+  author={Vilém Zouhar and Tom Kocmi},
+  year={2026},
+  eprint={2601.02933},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2601.02933},
 }
 ```
 
 Contributions are welcome! Please reach out to [Vilém Zouhar](mailto:vilem.zouhar@gmail.com).
+See changes in [CHANGELOG.md](CHANGELOG.md).
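
The options introduced in the README sections above (`show_model_names`, `sliders`, `textfield`, `instructions`) all live under the campaign's `info` object. The following is a minimal sketch of a campaign file that combines them, written as Python that emits the JSON; the campaign ID, model name, and example item are placeholders, and whether every combination of these options is supported is not stated in this diff.

```python
import json

# Sketch only: combines the options documented in the 1.0.2 README above.
# "my_campaign", "model-A", and the example item are placeholders.
campaign = {
    "campaign_id": "my_campaign",
    "info": {
        "assignment": "task-based",
        "protocol": "ESA",
        "show_model_names": True,   # default: False
        "textfield": "prefilled",   # None, "hidden", "visible", or "prefilled"
        "instructions": "Rate translation quality on a 0-100 scale.",
        "sliders": [
            {"name": "Fluency", "min": 0, "max": 5, "step": 1},
            {"name": "Adequacy", "min": 0, "max": 100, "step": 1},
        ],
    },
    "data": [
        {
            "item_id": "doc1#0",
            "src": "Der Apfel fällt nicht weit vom Stamm.",
            "tgt": {"model-A": "The apple does not fall far from the tree."},
        }
    ],
}

with open("my_campaign.json", "w") as f:
    json.dump(campaign, f, ensure_ascii=False, indent=2)
```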

{pearmut-1.0.0 → pearmut-1.0.2}/README.md

@@ -1,14 +1,7 @@
-# Pearmut
+# 🍐Pearmut <br> [](https://pypi.org/project/pearmut) [](https://pypi.python.org/pypi/pearmut/) [](https://pypi.org/project/pearmut/) [](https://github.com/zouharvi/pearmut/actions/workflows/test.yml) [](https://arxiv.org/abs/2601.02933)
 
 **Platform for Evaluation and Reviewing of Multilingual Tasks**: Evaluate model outputs for translation and NLP tasks with support for multimodal data (text, video, audio, images) and multiple annotation protocols ([DA](https://aclanthology.org/N15-1124/), [ESA](https://aclanthology.org/2024.wmt-1.131/), [ESA<sup>AI</sup>](https://aclanthology.org/2025.naacl-long.255/), [MQM](https://doi.org/10.1162/tacl_a_00437), and more!).
 
-[](https://pypi.org/project/pearmut)
-
-[](https://pypi.python.org/pypi/pearmut/)
-
-[](https://pypi.org/project/pearmut/)
-
-[](https://github.com/zouharvi/pearmut/actions/workflows/test.yml)
 
 <img width="1000" alt="Screenshot of ESA/MQM interface" src="https://github.com/user-attachments/assets/71334238-300b-4ffc-b777-7f3c242b1630" />
 
@@ -31,6 +24,8 @@
 - [Terminology](#terminology)
 - [Development](#development)
 - [Citation](#citation)
+- [Changelog](#changelog)
+
 
 ## Quick Start
 
@@ -90,7 +85,9 @@ Campaigns are defined in JSON files (see [examples/](examples/)). The simplest c
 }
 ```
 
-Each item has to have `
+Each item has to have `tgt` (dictionary from model names to strings, even for a single model evaluation).
+Optionally, you can also include `src` (source string) and/or `ref` (reference string).
+If neither `src` nor `ref` is provided, only the model outputs will be displayed.
 For full Pearmut functionality (e.g. automatic statistical analysis), add `item_id` as well.
 Any other keys that you add will simply be stored in the logs.
 
@@ -124,6 +121,74 @@ The `shuffle` parameter in campaign `info` controls this behavior:
 }
 ```
 
+### Showing Model Names
+
+By default, model names are hidden to avoid biasing annotators. To display model names on top of each output block, set `show_model_names` to `true`:
+```python
+{
+    "info": {
+        "assignment": "task-based",
+        "protocol": "ESA",
+        "show_model_names": true  # Default: false.
+    },
+    "campaign_id": "my_campaign",
+    "data": [...]
+}
+```
+
+### Custom Score Sliders
+
+For multi-dimensional evaluation tasks (e.g., assessing fluency on a Likert scale), you can define custom sliders with specific ranges and steps:
+
+```python
+{
+    "info": {
+        "assignment": "task-based",
+        "protocol": "ESA",
+        "sliders": [
+            {"name": "Fluency", "min": 0, "max": 5, "step": 1},
+            {"name": "Adequacy", "min": 0, "max": 100, "step": 1}
+        ]
+    },
+    "campaign_id": "my_campaign",
+    "data": [...]
+}
+```
+
+When `sliders` is specified, only the custom sliders are shown. Each slider must have `name`, `min`, `max`, and `step` properties. All sliders must be answered before proceeding.
+
+### Textfield for Post-editing/Translation
+
+Enable a textfield for post-editing or translation tasks using the `textfield` parameter in `info`. The textfield content is stored in annotations alongside scores and error spans.
+
+```python
+{
+    "info": {
+        "protocol": "DA",
+        "textfield": "prefilled"  # Options: null, "hidden", "visible", "prefilled"
+    }
+}
+```
+
+**Textfield modes:**
+- `null` or omitted: No textfield (default)
+- `"hidden"`: Textfield hidden by default, shown by clicking a button
+- `"visible"`: Textfield always visible
+- `"prefilled"`: Textfield visible and pre-filled with model output for post-editing
+
+### Custom Instructions
+
+Set campaign-level instructions using the `instructions` field in `info` (supports HTML).
+Instructions default to protocol-specific ones (DA: scoring, ESA: error spans + scoring, MQM: error spans + categories + scoring).
+```python
+{
+    "info": {
+        "protocol": "DA",
+        "instructions": "Rate translation quality on a 0-100 scale.<br>Pay special attention to document-level phenomena."
+    }
+}
+```
+
 ### Pre-filled Error Spans (ESA<sup>AI</sup>)
 
 Include `error_spans` to pre-fill annotations that users can review, modify, or delete:
@@ -242,7 +307,7 @@ All items must contain outputs from all models for this assignment type to work
 **How it works:**
 1. Initial phase: Each model gets `dynamic_first` annotations with fully random contrastive evaluation
 2. Dynamic phase: After the initial phase, top `dynamic_top` models (by average score) are identified
-3. Contrastive
+3. Contrastive evaluation: From the top N models, `dynamic_contrastive_models` models are randomly selected for each item
 4. Item prioritization: Items with the least annotations for the selected models are prioritized
 5. Backoff: With probability `dynamic_backoff`, uniform random selection is used instead to maintain exploration
 
@@ -268,6 +333,7 @@ The `users` field accepts:
 }
 ```
 
+
 ### Multimodal Annotations
 
 Support for HTML-compatible elements (YouTube embeds, `<video>` tags, images). Ensure elements are pre-styled. See [examples/multimodal.json](examples/multimodal.json).
@@ -348,7 +414,7 @@ Customize the goodbye message shown to users when they complete all annotations
 - **Score**: Numeric quality rating (0-100)
 - **Error Spans**: Text highlights marking errors with severity (`minor`, `major`)
 - **Error Categories**: MQM taxonomy labels for errors
-- **Template**: The annotation interface type. The `
+- **Template**: The annotation interface type. The `annotate` template supports comparing multiple outputs simultaneously.
 - **Assignment**: The method for distributing items to users:
   - **Task-based**: Each user has predefined items
   - **Single-stream**: Users draw from a shared pool with random assignment
@@ -379,7 +445,7 @@ pearmut run
 2. Add build rule to `webpack.config.js`
 3. Reference as `info->template` in campaign JSON
 
-See [web/src/
+See [web/src/annotate.ts](web/src/annotate.ts) for example.
 
 ### Deployment
 
@@ -390,10 +456,15 @@ Run on public server or tunnel local port to public IP/domain and run locally.
 If you use this work in your paper, please cite as following.
 ```bibtex
 @misc{zouhar2026pearmut,
-
-
-
+  title={Pearmut: Human Evaluation of Translation Made Trivial},
+  author={Vilém Zouhar and Tom Kocmi},
+  year={2026},
+  eprint={2601.02933},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2601.02933},
 }
 ```
 
 Contributions are welcome! Please reach out to [Vilém Zouhar](mailto:vilem.zouhar@gmail.com).
+See changes in [CHANGELOG.md](CHANGELOG.md).
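
The "How it works" list in the diffs above describes the `dynamic` assignment in five steps (initial phase, top-`dynamic_top` selection, contrastive sampling, item prioritization, backoff). The sketch below restates those steps as plain Python to make the control flow concrete; it is an illustration of the described procedure only, not the implementation in `server/assignment.py`, and all names in it are invented for this example.

```python
import random

def pick_next(scores, counts, items, info):
    """Illustrative sketch of the dynamic assignment steps described above.

    scores: model -> list of scores collected so far
    counts: (item_id, model) -> number of annotations so far
    items:  list of item ids
    info:   campaign info holding the dynamic_* parameters
    """
    models = list(scores)
    k = info["dynamic_contrastive_models"]

    # 1. Initial phase: until every model has `dynamic_first` annotations,
    #    pick the contrastive set fully at random.
    if any(len(s) < info["dynamic_first"] for s in scores.values()):
        chosen = random.sample(models, k)
    # 5. Backoff: with probability `dynamic_backoff`, fall back to uniform
    #    random selection to keep exploring weaker models.
    elif random.random() < info["dynamic_backoff"]:
        chosen = random.sample(models, k)
    else:
        # 2. Dynamic phase: rank models by average score and keep the top ones.
        ranked = sorted(models, key=lambda m: sum(scores[m]) / len(scores[m]), reverse=True)
        top = ranked[: info["dynamic_top"]]
        # 3. Contrastive evaluation: sample `dynamic_contrastive_models` of them.
        chosen = random.sample(top, k)

    # 4. Item prioritization: prefer the item with the fewest annotations
    #    for the chosen models.
    item = min(items, key=lambda i: sum(counts.get((i, m), 0) for m in chosen))
    return item, chosen
```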

{pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/PKG-INFO (same changes as PKG-INFO above)

{pearmut-1.0.0 → pearmut-1.0.2}/pearmut.egg-info/SOURCES.txt

@@ -10,10 +10,11 @@ pearmut.egg-info/top_level.txt
 server/app.py
 server/assignment.py
 server/cli.py
+server/constants.py
 server/results_export.py
 server/utils.py
-server/static/basic.bundle.js
-server/static/basic.html
+server/static/annotate.bundle.js
+server/static/annotate.html
 server/static/dashboard.bundle.js
 server/static/dashboard.html
 server/static/favicon.svg

{pearmut-1.0.0 → pearmut-1.0.2}/server/app.py

@@ -4,7 +4,7 @@ from typing import Any
 
 from fastapi import FastAPI, Query
 from fastapi.middleware.cors import CORSMiddleware
-from fastapi.responses import JSONResponse, Response
+from fastapi.responses import FileResponse, JSONResponse, Response
 from fastapi.staticfiles import StaticFiles
 from pydantic import BaseModel
 
@@ -17,6 +17,7 @@ from .results_export import (
 )
 from .utils import (
     ROOT,
+    TOKEN_MAIN,
     check_validation_threshold,
     load_progress_data,
     save_db_payload,
@@ -192,7 +193,11 @@ async def _dashboard_data(request: DashboardDataRequest):
         progress_new[user_id] = entry
 
     return JSONResponse(
-        content={
+        content={
+            "data": progress_new,
+            "validation_threshold": validation_threshold,
+            "assignment": assignment,
+        },
         status_code=200,
     )
 
@@ -280,6 +285,91 @@ async def _reset_task(request: ResetTaskRequest):
     return response
 
 
+class PurgeCampaignRequest(BaseModel):
+    campaign_id: str
+    token: str
+
+
+@app.post("/purge-campaign")
+async def _purge_campaign(request: PurgeCampaignRequest):
+    global progress_data, tasks_data
+
+    campaign_id = request.campaign_id
+    token = request.token
+
+    if campaign_id not in progress_data:
+        return JSONResponse(content="Unknown campaign ID", status_code=400)
+    if token != tasks_data[campaign_id]["token"]:
+        return JSONResponse(content="Invalid token", status_code=400)
+
+    # Unlink assets if they exist
+    destination = (
+        tasks_data[campaign_id].get("info", {}).get("assets", {}).get("destination")
+    )
+    if destination:
+        symlink_path = f"{ROOT}/data/{destination}".rstrip("/")
+        if os.path.islink(symlink_path):
+            os.remove(symlink_path)
+
+    # Remove task file
+    task_file = f"{ROOT}/data/tasks/{campaign_id}.json"
+    if os.path.exists(task_file):
+        os.remove(task_file)
+
+    # Remove output file
+    output_file = f"{ROOT}/data/outputs/{campaign_id}.jsonl"
+    if os.path.exists(output_file):
+        os.remove(output_file)
+
+    # Remove from in-memory data structures
+    del tasks_data[campaign_id]
+    del progress_data[campaign_id]
+
+    # Save updated progress data
+    save_progress_data(progress_data)
+
+    return JSONResponse(content="ok", status_code=200)
+
+
+class AddCampaignRequest(BaseModel):
+    campaign_data: dict[str, Any]
+    token_main: str
+
+
+@app.post("/add-campaign")
+async def _add_campaign(request: AddCampaignRequest):
+    global progress_data, tasks_data
+
+    from .cli import _add_single_campaign
+
+    if request.token_main != TOKEN_MAIN:
+        return JSONResponse(
+            content={"error": "Invalid main token. Use the latest one."},
+            status_code=400,
+        )
+
+    try:
+        server = f"{os.environ.get('PEARMUT_SERVER_URL', 'http://localhost:8001')}"
+        _add_single_campaign(request.campaign_data, overwrite=False, server=server)
+
+        campaign_id = request.campaign_data["campaign_id"]
+        with open(f"{ROOT}/data/tasks/{campaign_id}.json", "r") as f:
+            tasks_data[campaign_id] = json.load(f)
+
+        progress_data = load_progress_data(warn=None)
+
+        return JSONResponse(
+            content={
+                "status": "ok",
+                "campaign_id": campaign_id,
+                "token": tasks_data[campaign_id]["token"],
+            },
+            status_code=200,
+        )
+    except Exception as e:
+        return JSONResponse(content={"error": str(e)}, status_code=400)
+
+
 @app.get("/download-annotations")
 async def _download_annotations(
     campaign_id: list[str] = Query(),
@@ -345,6 +435,17 @@ if not os.path.exists(static_dir + "index.html"):
         "Static directory not found. Please build the frontend first."
     )
 
+# Serve HTML files directly without redirect
+@app.get("/annotate")
+async def serve_annotate():
+    return FileResponse(static_dir + "annotate.html")
+
+
+@app.get("/dashboard")
+async def serve_dashboard():
+    return FileResponse(static_dir + "dashboard.html")
+
+
 # Mount user assets from data/assets/
 assets_dir = f"{ROOT}/data/assets"
 os.makedirs(assets_dir, exist_ok=True)
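
The `server/app.py` hunks above add `/add-campaign` and `/purge-campaign` endpoints whose request bodies are the `AddCampaignRequest` (`campaign_data`, `token_main`) and `PurgeCampaignRequest` (`campaign_id`, `token`) models. A minimal client sketch under stated assumptions: the server is reachable on `http://localhost:8001` (the fallback URL used in the hunk), `requests` is installed separately, and how the main token is obtained is not shown in this diff.

```python
import json
import requests

SERVER = "http://localhost:8001"  # assumed; matches the PEARMUT_SERVER_URL fallback above

campaign = json.load(open("my_campaign.json"))  # e.g. the campaign file sketched earlier

# Register a campaign; /add-campaign validates token_main against TOKEN_MAIN.
resp = requests.post(
    f"{SERVER}/add-campaign",
    json={"campaign_data": campaign, "token_main": "<main token>"},
)
resp.raise_for_status()
created = resp.json()  # on success: {"status": "ok", "campaign_id": ..., "token": ...}

# Remove the campaign (task file, output file, in-memory state) via /purge-campaign.
requests.post(
    f"{SERVER}/purge-campaign",
    json={"campaign_id": created["campaign_id"], "token": created["token"]},
)
```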