pearmut-0.2.2-py3-none-any.whl → pearmut-0.2.4-py3-none-any.whl

pearmut/static/pointwise.html CHANGED
@@ -66,4 +66,4 @@
   direction: rtl;
   width: 16px;
   height: 200px;
-}</style><script defer="defer" src="pointwise.bundle.js"></script></head><body><div style="max-width: 1600px; min-width: 900px; margin-left: auto; margin-right: auto; margin-top: 20px; padding-left: 10px;"><div class="white-box" style="margin-right: 30px; background-color: #e7e2cf; padding: 5px 15px 5px 5px;"><span id="instructions_global" style="display: inline-block; font-size: 11pt; width: calc(100% - 170px);"><ul><li id="instructions_spans">Error spans:<ul><li><strong>Select</strong> the part of translation where you have identified a <strong>translation error</strong> (drag or click start & end).</li><li><strong>Click</strong> on the highlight to change error severity (minor/major) or remove the highlight.</li></ul>Choose error severity:<ul><li><span class="instruction_sev" id="instruction_sev_minor">Minor errors:</span> Style, grammar, word choice could be better or more natural.</li><li><span class="instruction_sev" id="instruction_sev_major">Major errors:</span>: The meaning is changed significantly and/or the part is really hard to understand.</li></ul><strong>Tip</strong>: Highlight the word or general area of the error (it doesn't need to be exact). Use separate highlights for different errors.<br></li><li id="instructions_score">Score the translation: Please use the slider and set an overall score based on meaning preservation and general quality:</li><ul><li>0: <strong>No meaning preserved</strong>: most information is lost.</li><li>33%: <strong>Some meaning preserved</strong>: major gaps and narrative issues.</li><li>66%: <strong>Most meaning preserved</strong>: minor issues with grammar or consistency.</li><li>100%: <strong>Perfect</strong>: meaning and grammar align completely with the source.</li></ul><li id="instructions_categories">Error types: After highlighting an error fragment, you will be asked to select the specific error type (main category and subcategory). If you are unsure about which errors fall under which categories, please consult the <a href="https://themqm.org/the-mqm-typology/" style="font-weight: bold; text-decoration: none; color: black;">typology definitions</a>.</li></ul></span><div style="width: 170px; display: inline-block; vertical-align: top; text-align: right; padding-top: 5px;"><span id="time" style="width: 135px; text-align: left; display: inline-block; font-size: 11pt;" title="Approximation of total annotation time.">Time: 0m</span> <input type="button" value="⚙️" id="button_settings" style="height: 1.5em; width: 30px;"><br><br><div id="progress" style="text-align: center;"></div><br><br><input type="button" value="Next 🛠️" id="button_next" disabled="disabled" style="width: 170px; height: 2.5em;" title="Finish annotating all examples first."> <input type="button" value="skip tutorial" id="button_skip_tutorial" style="width: 170px; font-size: 11pt; height: 30px; margin-top: 10px; display: none;" title="Skip tutorial only if you completed it already."></div></div><div id="settings_div" class="white-box" style="margin-right: 20px; margin-top: 10px; display: none; background-color: #e7e2cf; font-size: 11pt;"><input type="checkbox" id="settings_approximate_alignment"> <label for="settings_approximate_alignment">Show approximate alignment</label></div><div id="output_div" style="margin-top: 100px;"></div><br><br><br></div></body></html>
+}</style><script defer="defer" src="pointwise.bundle.js"></script></head><body><div style="max-width: 1600px; min-width: 900px; margin-left: auto; margin-right: auto; margin-top: 20px; padding-left: 10px;"><div class="white-box" style="margin-right: 30px; background-color: #e7e2cf; padding: 5px 15px 5px 5px;"><span id="instructions_global" style="display: inline-block; font-size: 11pt; width: calc(100% - 170px);"><ul><li id="instructions_spans">Error spans:<ul><li><strong>Click</strong> on the start of an error, then <strong>click</strong> on the end to mark an error span.</li><li><strong>Click</strong> on an existing highlight to change error severity (minor/major) or remove it.</li></ul>Error severity:<ul><li><span class="instruction_sev" id="instruction_sev_minor">Minor:</span> Style, grammar, or word choice could be better.</li><li><span class="instruction_sev" id="instruction_sev_major">Major:</span> Meaning is significantly changed or is hard to understand.</li></ul><strong>Tip</strong>: Mark the general area of the error (doesn't need to be exact). Use separate highlights for different errors.<br></li><li id="instructions_score">Score the translation: Please use the slider and set an overall score based on meaning preservation and general quality:</li><ul><li>0: <strong>No meaning preserved</strong>: most information is lost.</li><li>33%: <strong>Some meaning preserved</strong>: major gaps and narrative issues.</li><li>66%: <strong>Most meaning preserved</strong>: minor issues with grammar or consistency.</li><li>100%: <strong>Perfect</strong>: meaning and grammar align completely with the source.</li></ul><li id="instructions_categories">Error types: After highlighting an error fragment, you will be asked to select the specific error type (main category and subcategory). If you are unsure about which errors fall under which categories, please consult the <a href="https://themqm.org/the-mqm-typology/" style="font-weight: bold; text-decoration: none; color: black;">typology definitions</a>.</li></ul></span><div style="width: 170px; display: inline-block; vertical-align: top; text-align: right; padding-top: 5px;"><span id="time" style="width: 135px; text-align: left; display: inline-block; font-size: 11pt;" title="Approximation of total annotation time.">Time: 0m</span> <input type="button" value="⚙️" id="button_settings" style="height: 1.5em; width: 30px;"><br><br><div id="progress" style="text-align: center;"></div><br><br><input type="button" value="Next 🛠️" id="button_next" disabled="disabled" style="width: 170px; height: 2.5em;" title="Finish annotating all examples first."> <input type="button" value="skip tutorial" id="button_skip_tutorial" style="width: 170px; font-size: 11pt; height: 30px; margin-top: 10px; display: none;" title="Skip tutorial only if you completed it already."></div></div><div id="settings_div" class="white-box" style="margin-right: 20px; margin-top: 10px; display: none; background-color: #e7e2cf; font-size: 11pt;"><input type="checkbox" id="settings_approximate_alignment"> <label for="settings_approximate_alignment">Show approximate alignment</label></div><div id="output_div" style="margin-top: 100px;"></div><br><br><br></div></body></html>
pearmut/utils.py CHANGED
@@ -3,6 +3,9 @@ import os
 
 ROOT = "."
 
+# Sentinel value to indicate a task reset - masks all prior annotations
+RESET_MARKER = "__RESET__"
+
 
 def highlight_differences(a, b):
     """
@@ -74,16 +77,31 @@ def get_db_log(campaign_id: str) -> list[dict]:
 def get_db_log_item(campaign_id: str, user_id: str | None, item_i: int | None) -> list[dict]:
     """
     Returns the log item for the given campaign_id, user_id and item_i.
-    Can be empty.
+    Can be empty. Respects reset markers - if a reset marker is found,
+    only entries after the last reset are returned.
     """
     log = get_db_log(campaign_id)
-    return [
+
+    # Filter matching entries
+    matching = [
         entry for entry in log
         if (
             (user_id is None or entry.get("user_id") == user_id) and
             (item_i is None or entry.get("item_i") == item_i)
         )
     ]
+
+    # Find the last reset marker for this user (if any)
+    last_reset_idx = -1
+    for i, entry in enumerate(matching):
+        if entry.get("annotations") == RESET_MARKER:
+            last_reset_idx = i
+
+    # Return only entries after the last reset
+    if last_reset_idx >= 0:
+        matching = matching[last_reset_idx + 1:]
+
+    return matching
 
 
 def save_db_payload(campaign_id: str, payload: dict):
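In effect, the new code first filters the log to the requested user/item and then discards everything up to and including the last reset entry. A minimal standalone sketch of that masking step (the log entries below are hypothetical, and the filter is re-implemented here rather than imported):

```python
RESET_MARKER = "__RESET__"

# Hypothetical in-memory log: an annotation, a reset, then a re-annotation.
log = [
    {"user_id": "alice", "item_i": 0, "annotations": {"score": 70}},
    {"user_id": "alice", "item_i": 0, "annotations": RESET_MARKER},  # task reset
    {"user_id": "alice", "item_i": 0, "annotations": {"score": 85}},
]

# Filter to the requested user/item, as in get_db_log_item
matching = [e for e in log if e["user_id"] == "alice" and e["item_i"] == 0]

# Keep only entries after the last reset marker
last_reset_idx = -1
for i, entry in enumerate(matching):
    if entry.get("annotations") == RESET_MARKER:
        last_reset_idx = i
if last_reset_idx >= 0:
    matching = matching[last_reset_idx + 1:]

print(matching)  # only the post-reset entry with {"score": 85}
```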
@@ -91,11 +109,61 @@ def save_db_payload(campaign_id: str, payload: dict):
     Saves the given payload to the log for the given campaign_id, user_id and item_i.
     Saves both on disk and in-memory.
     """
+    # Ensure the in-memory cache is initialized before writing to file
+    # to avoid reading back the same entry we're about to append
+    log = get_db_log(campaign_id)
 
     log_path = f"{ROOT}/data/outputs/{campaign_id}.jsonl"
+    os.makedirs(os.path.dirname(log_path), exist_ok=True)
     with open(log_path, "a") as log_file:
         log_file.write(json.dumps(payload, ensure_ascii=False,) + "\n")
 
-    log = get_db_log(campaign_id)
-    # copy to avoid mutation issues
     log.append(payload)
+
+
+def check_validation_threshold(
+    tasks_data: dict,
+    progress_data: dict,
+    campaign_id: str,
+    user_id: str,
+) -> bool:
+    """
+    Check if user passes the validation threshold.
+
+    The threshold is defined in campaign info as 'validation_threshold':
+    - If integer: pass if number of failed checks <= threshold
+    - If float in [0, 1): pass if proportion of failed checks <= threshold
+    - If float >= 1: always fail
+    - If None/not set: defaults to 0 (fail on any failed check)
+
+    Returns True if validation passes, False otherwise.
+    """
+    threshold = tasks_data[campaign_id]["info"].get("validation_threshold", 0)
+
+    user_progress = progress_data[campaign_id][user_id]
+    validations = user_progress.get("validations", {})
+
+    # Count failed checks (validations is dict of item_i -> list of bools)
+    total_checks = 0
+    failed_checks = 0
+    for item_validations in validations.values():
+        for check_passed in item_validations:
+            total_checks += 1
+            if not check_passed:
+                failed_checks += 1
+
+    # If no validation checks exist, pass
+    if total_checks == 0:
+        return True
+
+    # Float >= 1: always fail
+    if isinstance(threshold, float) and threshold >= 1:
+        return False
+
+    # Check threshold based on type
+    if isinstance(threshold, float):
+        # Float in [0, 1): proportion-based, pass if failed proportion <= threshold
+        return failed_checks / total_checks <= threshold
+    else:
+        # Integer: count-based, pass if failed count <= threshold
+        return failed_checks <= threshold
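A small sketch of how the threshold types behave when calling the new function (the campaign and user IDs are placeholders, the dict shapes mirror the lookups in the function body, and pearmut 0.2.4 is assumed to be installed):

```python
from pearmut.utils import check_validation_threshold

# Hypothetical progress: 4 validation checks in total, 1 failed.
progress_data = {"demo": {"alice": {"validations": {
    0: [True, True],   # item 0: both checks passed
    3: [False, True],  # item 3: one check failed
}}}}

def passes(threshold):
    tasks_data = {"demo": {"info": {"validation_threshold": threshold}}}
    return check_validation_threshold(tasks_data, progress_data, "demo", "alice")

print(passes(0))     # False: integer 0 tolerates no failed checks
print(passes(1))     # True: integer 1 tolerates one failed check
print(passes(0.25))  # True: proportion 1/4 <= 0.25
print(passes(1.0))   # False: a float >= 1 always fails
```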
pearmut-0.2.2.dist-info/METADATA → pearmut-0.2.4.dist-info/METADATA CHANGED
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pearmut
-Version: 0.2.2
+Version: 0.2.4
 Summary: A tool for evaluation of model outputs, primarily MT.
 Author-email: Vilém Zouhar <vilem.zouhar@gmail.com>
 License: apache-2.0
@@ -16,7 +16,6 @@ Requires-Dist: wonderwords>=3.0.0
 Requires-Dist: psutil>=7.1.0
 Provides-Extra: dev
 Requires-Dist: pytest; extra == "dev"
-Requires-Dist: pynpm>=0.3.0; extra == "dev"
 Dynamic: license-file
 
 # Pearmut 🍐
@@ -165,8 +164,10 @@ You can add validation rules to items for tutorials or attention checks. Items w
 - Tutorial items: Include `allow_skip: true` and `warning` to let users skip after seeing the feedback
 - Loud attention checks: Include `warning` without `allow_skip` to force users to retry
 - Silent attention checks: Omit `warning` to silently log failures without user notification (useful for quality control with bad translations)
+
 For listwise template, `validation` is an array where each element corresponds to a candidate.
-The dashboard shows failed/total validation checks per user.
+The dashboard shows failed/total validation checks per user, and ✅/❌ based on whether they pass the threshold.
+Set `validation_threshold` in `info` to control pass/fail: integer for max failed count, float in [0,1) for max failed proportion.
 See [examples/tutorial_pointwise.json](examples/tutorial_pointwise.json) and [examples/tutorial_listwise.json](examples/tutorial_listwise.json) for complete examples.
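For instance, a campaign file could set the threshold like this (a minimal sketch; the other `info` fields are elided):

```python
{
    "campaign_id": "my_campaign",
    "info": {
        ...
        "validation_threshold": 0.1,  # pass while at most 10% of checks fail
    },
    "data": [...],
}
```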
 
 ## Single-stream Assignment
@@ -181,7 +182,7 @@ We also support a simple allocation where all annotators draw from the same pool
         "protocol_score": True, # collect scores
         "protocol_error_spans": True, # collect error spans
         "protocol_error_categories": False, # do not collect MQM categories, so ESA
-        "num_users": 50, # number of annotators
+        "users": 50, # number of annotators (can also be a list, see below)
     },
     "data": [...], # list of all items (shared among all annotators)
 }
196
197
  "assignment": "dynamic",
197
198
  "template": "listwise",
198
199
  "protocol_k": 5,
199
- "num_users": 50,
200
+ "users": 50,
200
201
  },
201
202
  "data": [...], # list of all items
202
203
  }
203
204
  ```
204
205
 
206
+ ## Pre-defined User IDs and Tokens
207
+
208
+ By default, user IDs and completion tokens are automatically generated. The `users` field can be:
209
+ - A number (e.g., `50`) to generate that many random user IDs
210
+ - A list of strings (e.g., `["alice", "bob"]`) to use specific user IDs
211
+ - A list of dictionaries to specify user IDs with custom tokens:
212
+ ```python
213
+ {
214
+ "info": {
215
+ ...
216
+ "users": [
217
+ {"user_id": "alice", "token_pass": "alice_done", "token_fail": "alice_fail"},
218
+ {"user_id": "bob", "token_pass": "bob_done"} # missing tokens are auto-generated
219
+ ],
220
+ },
221
+ ...
222
+ }
223
+ ```
224
+
205
225
  To load a campaign into the server, run the following.
206
226
  It will fail if an existing campaign with the same `campaign_id` already exists, unless you specify `-o/--overwrite`.
207
227
  It will also output a secret management link. Then, launch the server:
@@ -234,8 +254,7 @@
 When adding new campaigns or launching pearmut, a management link is shown that gives an overview of annotator progress as well as easy access to the annotation links and to resetting task progress (no data will be lost).
 This is also the place where you can download all progress and collected annotations (these files also exist locally, but downloading may be more convenient).
 
-<img width="800" alt="Management dashboard" src="https://github.com/user-attachments/assets/82470693-a5ec-4d0e-8989-e93d5b0bb840" />
-
+<img width="800" alt="Management dashboard" src="https://github.com/user-attachments/assets/800a1741-5f41-47ac-9d5d-5cbf6abfc0e6" />
 
 Additionally, at the end of an annotation, a token of completion is shown, which can be compared against the correct one that you can download in the metadata from the dashboard.
 An intentionally incorrect token can be shown if the annotations don't pass quality control.
@@ -252,6 +271,39 @@ Tip: make sure the elements are already appropriately styled.
 
 <img width="1000" alt="Preview of multimodal elements in Pearmut" src="https://github.com/user-attachments/assets/77c4fa96-ee62-4e46-8e78-fd16e9007956" />
 
+## CLI Commands
+
+Pearmut provides the following commands:
+
+- `pearmut add <file(s)>`: Add one or more campaign JSON files. Supports wildcards (e.g., `pearmut add examples/*.json`).
+  - `-o/--overwrite`: Overwrite existing campaigns with the same ID.
+  - `--server <url>`: Prefix server URL for protocol links (default: `http://localhost:8001`).
+- `pearmut run`: Start the Pearmut server.
+  - `--port <port>`: Port to run the server on (default: 8001).
+  - `--server <url>`: Prefix server URL for protocol links.
+- `pearmut purge [campaign]`: Remove campaign data.
+  - Without arguments: Purges all campaigns (tasks, outputs, progress).
+  - With campaign name: Purges only the specified campaign's data.
+
+
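Putting the commands together, a typical session might look like this (the campaign file is one of the bundled examples referenced earlier; the campaign name in the last line is hypothetical):

```
pearmut add examples/tutorial_pointwise.json -o   # load a campaign, overwriting if present
pearmut run --port 8001                           # serve the annotation and dashboard links
pearmut purge my_campaign                         # later: drop that campaign's data
```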
+## Hosting Assets
+
+If you need to host local assets (e.g., audio files, images, videos) via Pearmut, you can use the `assets` key in your campaign file.
+When present, this directory is symlinked to the `static/` directory so its contents become accessible from the server.
+
+```python
+{
+    "campaign_id": "my_campaign",
+    "info": {
+        "assets": "videos", # path to directory containing assets
+        ...
+    },
+    "data": [ ... ]
+}
+```
+
+For example, if `videos` contains `audio.mp3`, it will be accessible at `localhost:8001/assets/videos/audio.mp3`.
+The path can be absolute or relative to your current working directory.
 
 ## Development
 
pearmut-0.2.4.dist-info/RECORD ADDED
@@ -0,0 +1,19 @@
+pearmut/app.py,sha256=6dswjMC_YN6-3WHPSl8qhin6Qb2IsHXCveX9MKen-O0,8466
+pearmut/assignment.py,sha256=2dWuFacXCg65xjiEiqNPSXn4_4Z4fy5OgBolmCqgtUE,11181
+pearmut/cli.py,sha256=ff3UdCToXP_U1iKLHTAuHo9eDsK5G6d8ToVmSZ-6wYI,12582
+pearmut/utils.py,sha256=TWcbdTehg4CNwCpc5FuEOszpQM464LY0IQHHE_Sq1Zg,5293
+pearmut/static/dashboard.bundle.js,sha256=3i4o4VOZi2g2EsC6rzwz2pYO_YwncCIjnI0Gxz57Z44,91471
+pearmut/static/dashboard.html,sha256=aCYNhRZUHsVF_CXzKmzdBptEAnRTI3J5NKT4trxAots,1966
+pearmut/static/index.html,sha256=SC5M-NSTnJh1UNHCC5VOP0TKkmhNn6MHlY6L4GDacpA,849
+pearmut/static/listwise.bundle.js,sha256=kkXvg4F-xnNH8UzhuiAl1MqatwzAcs2h5r22jhnYvqE,105235
+pearmut/static/listwise.html,sha256=YZKQtB_TOt1gQKjJdwjEkcHAOiZoW2WlIFhpSr4kCo0,5163
+pearmut/static/pointwise.bundle.js,sha256=xVvarH95pYeZUqjfoXufyLzdISqkoJ4DcBshy-94WOw,107298
+pearmut/static/pointwise.html,sha256=7pf7HcyvM6t-Jze7tFYjfwTEu1C5Az1sg4e_SUbBFl0,4879
+pearmut/static/assets/favicon.svg,sha256=gVPxdBlyfyJVkiMfh8WLaiSyH4lpwmKZs8UiOeX8YW4,7347
+pearmut/static/assets/style.css,sha256=BrPnXTDr8hQ0M8T-EJlExddChzIFotlerBYMx2B8GDk,4136
+pearmut-0.2.4.dist-info/licenses/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
+pearmut-0.2.4.dist-info/METADATA,sha256=C8cZZDhSGEYnQOosPieoAeCoY_lb5iM8hc_7SHK4H4o,14381
+pearmut-0.2.4.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+pearmut-0.2.4.dist-info/entry_points.txt,sha256=eEA9LVWsS3neQbMvL_nMvEw8I0oFudw8nQa1iqxOiWM,45
+pearmut-0.2.4.dist-info/top_level.txt,sha256=CdgtUM-SKQDt6o5g0QreO-_7XTBP9_wnHMS1P-Rl5Go,8
+pearmut-0.2.4.dist-info/RECORD,,
pearmut-0.2.2.dist-info/RECORD REMOVED
@@ -1,19 +0,0 @@
-pearmut/app.py,sha256=kGamXakzpuKFWQQSRV_rsFNn7rpbO-yMM09r65sdK2U,7911
-pearmut/assignment.py,sha256=Sycq-_6BTjpm7KPSZ02zX9aTZxOr-zaxW5QbZpQlqV8,10415
-pearmut/cli.py,sha256=9JFv8eop4HdpgJH9RzSWLMTo38fkoUBeMEcs1xmYiGs,7689
-pearmut/utils.py,sha256=gk8b4biPc9TTvZiQMQ_8xh1_FsWuwrhtPzeK3NpzhZc,2902
-pearmut/static/dashboard.bundle.js,sha256=tYnKv1eoDX_Ydfy7pHjFXR79SLjyHC5M3DMbrtlxPEg,91574
-pearmut/static/dashboard.html,sha256=tAFNUlrtYTJ_Bnh2Rer278eRyt_tIk8mXvN0sDcyzKE,1767
-pearmut/static/index.html,sha256=SC5M-NSTnJh1UNHCC5VOP0TKkmhNn6MHlY6L4GDacpA,849
-pearmut/static/listwise.bundle.js,sha256=tJzsHDoOsvWndDaxAcFaFlgiwimSyOXSeP1i2d4Q5n4,104842
-pearmut/static/listwise.html,sha256=evNyjPUCWPVfPSnGlzSEMhNmysH-WN4X_4drU91kBWY,5189
-pearmut/static/pointwise.bundle.js,sha256=CV3V3NcLpUPsBMhr4zVFvw_x5Udpd_jbtGcTrGUxK4g,107209
-pearmut/static/pointwise.html,sha256=snbT0UDxnKS3LEV8r832eglwzwkV0bqwY0zMWFnEUp4,4986
-pearmut/static/assets/favicon.svg,sha256=gVPxdBlyfyJVkiMfh8WLaiSyH4lpwmKZs8UiOeX8YW4,7347
-pearmut/static/assets/style.css,sha256=SARZqqovP_2s9S5ENI7dxJ6Hacz-ztQ2zn2Hn7DwoJU,4089
-pearmut-0.2.2.dist-info/licenses/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
-pearmut-0.2.2.dist-info/METADATA,sha256=HIUdUB53cYuk4kBC3fywolFbG81XbVYyuqP_Jeq0KPg,12270
-pearmut-0.2.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
-pearmut-0.2.2.dist-info/entry_points.txt,sha256=eEA9LVWsS3neQbMvL_nMvEw8I0oFudw8nQa1iqxOiWM,45
-pearmut-0.2.2.dist-info/top_level.txt,sha256=CdgtUM-SKQDt6o5g0QreO-_7XTBP9_wnHMS1P-Rl5Go,8
-pearmut-0.2.2.dist-info/RECORD,,