PyPI - markback - Versions diffs - 0.1.0__tar.gz → 0.1.2__tar.gz - Mend

markback 0.1.0tar.gz → 0.1.2tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (52)

markback-0.1.2/.claude/settings.local.json ADDED

@@ -0,0 +1,13 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(python -m pytest:*)",
+      "Bash(npm test:*)",
+      "Bash(npm install)",
+      "Bash(npm run build:*)",
+      "Bash(echo:*)",
+      "Bash(python -m markback lint:*)",
+      "Bash(python:*)"
+    ]
+  }
+}

{markback-0.1.0 → markback-0.1.2}/.gitignore RENAMED

@@ -201,6 +201,11 @@ cython_debug/
 .cursorignore
 .cursorindexingignore
+# Node
+node_modules/
+packages/markback/dist/
+packages/markback/*.tsbuildinfo
 # Marimo
 marimo/_static/
 marimo/_lsp/

markback-0.1.2/.ishipped/card.md ADDED

@@ -0,0 +1,26 @@
+---
+title: "MarkBack"
+summary: "Human-writable format for pairing content with labels and feedback."
+shipped: 2026-01-04
+tags: [data-annotation, machine-learning, cli, python, typescript]
+links:
+  - label: "markback.org"
+    url: "https://markback.org"
+  - label: "GitHub"
+    url: "https://github.com/dandriscoll/markback"
+    primary: true
+  - label: "NPM"
+    url: "https://www.npmjs.com/package/markbackjs"
+---
+## What is it?
+MarkBack is a compact file format for storing content alongside feedback and labels. It's built for training data management, prompt engineering, and annotation workflows where you need human-readable files that machines can parse reliably.
+## Key Features
+- **Multiple storage modes** — Single-file, multi-record, compact one-liner, or paired files. Pick what fits your workflow.
+- **Structured feedback parsing** — Labels, key-value attributes, JSON, and freeform comments in one line.
+- **Comprehensive linting** — 18 diagnostic rules catch errors and style issues with precise line numbers.
+- **External content references** — Point to files, URIs, or embed content inline. Works with text, images, and binary files.
+- **Dual-language support** — Full implementations in Python (CLI + library) and TypeScript.

{markback-0.1.0 → markback-0.1.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: markback
-Version: 0.1.0
+Version: 0.1.2
 Summary: A compact, human-writable format for storing content paired with feedback/labels
 Project-URL: Homepage, https://github.com/dandriscoll/markback
 Project-URL: Repository, https://github.com/dandriscoll/markback
@@ -196,6 +196,16 @@ Second content.
 @source ./images/003.jpg <<< approved; scene=mountain
 ```
+### With Prior Reference
+Use `@prior` to reference an item that precedes the source (e.g., a prompt that generated an image):
+```
+@uri local:generated-001
+@prior ./prompts/sunset-prompt.txt
+@source ./images/generated-sunset.jpg <<< accurate; matches prompt well
+```
 ### Paired Files
 **content.txt:**

{markback-0.1.0 → markback-0.1.2}/README.md RENAMED

@@ -161,6 +161,16 @@ Second content.
 @source ./images/003.jpg <<< approved; scene=mountain
 ```
+### With Prior Reference
+Use `@prior` to reference an item that precedes the source (e.g., a prompt that generated an image):
+```
+@uri local:generated-001
+@prior ./prompts/sunset-prompt.txt
+@source ./images/generated-sunset.jpg <<< accurate; matches prompt well
+```
 ### Paired Files
 **content.txt:**

{markback-0.1.0 → markback-0.1.2}/SPEC.md RENAMED

@@ -28,6 +28,7 @@ A MarkBack **record** is the fundamental unit. Every record has:
 | `feedback` | Yes | Text after the `<<<` delimiter (always one line) |
 | `uri` | No | Unique identifier for the item |
 | `source` | No | Reference to external content (when content is not inline) |
+| `prior` | No | Reference to an item that precedes the source (e.g., a prompt that generated the content) |
 *Content is required but may be external (via `source` field).
@@ -66,6 +67,7 @@ Header lines appear at the start of a record and begin with `@`. They define met
 ```
 @uri <uri-value>
 @source <path-or-uri>
+@prior <path-or-uri>
 ```
 **Rules:**
@@ -109,6 +111,43 @@ References external content instead of inline content.
 - When `@source` is present, inline content MUST be empty (or contain only whitespace)
 - Parsers MUST verify referenced files exist (warning if missing)
+#### 3.1.3 `@prior` Header
+References an item that precedes the source material. For example, if the source is an image generated by an LLM, the prior could be the prompt that was used to create it.
+```
+@prior ./prompts/image-gen-prompt.txt
+@prior https://example.com/prompts/123
+@prior file:///path/to/prompt.txt
+```
+**Rules:**
+- Relative paths are resolved relative to the MarkBack file location
+- `@prior` can be used with or without `@source`
+- `@prior` does not affect content handling (inline content or `@source` rules still apply)
+- Parsers SHOULD verify referenced files exist (warning if missing)
+#### 3.1.4 Line Range Specification
+Both `@source` and `@prior` headers support optional line range specifications using colon notation. This allows referencing specific lines within a file.
+**Syntax:** `<path-or-uri>:<start>` or `<path-or-uri>:<start>-<end>`
+```
+@source ./code.py:42
+@source ./code.py:42-50
+@prior ./prompts/template.txt:1-20
+@source https://example.com/file.txt:100-150
+```
+**Rules:**
+- Line numbers are 1-indexed (first line is line 1)
+- Single line: `:N` references line N only
+- Line range: `:N-M` references lines N through M (inclusive)
+- End line must be greater than or equal to start line (E011 error otherwise)
+- Line ranges are informational metadata; parsers do not validate that referenced lines exist in the file
+- Windows drive letters (e.g., `C:\path`) are not confused with line ranges because scheme detection requires length > 1
 ### 3.2 Content Block
 Content is everything between headers and the `<<<` feedback delimiter.
@@ -442,7 +481,7 @@ Canonical form ensures consistent output for comparison and version control.
 ### 5.2 Canonicalization Rules
 1. **Line endings:** Normalize to `\n` (LF)
-2. **Header order:** `@uri` before `@source` before unknown headers (alphabetical)
+2. **Header order:** `@uri` before `@prior` before `@source` before unknown headers (alphabetical)
 3. **Header spacing:** Exactly one space after keyword
 4. **Trailing whitespace:** Remove from all lines
 5. **Content whitespace:** Preserve internal whitespace; trim leading/trailing blank lines
@@ -557,6 +596,7 @@ Each line is classified as one of:
 | E008 | Unclosed quote in structured attribute value (only in `structured` parse mode) |
 | E009 | Empty feedback (nothing after `<<< `) |
 | E010 | Missing blank line before inline content (content starts with `@`) |
+| E011 | Invalid line range (end line less than start line) |
 ### 7.2 Warnings (SHOULD fix)
@@ -570,6 +610,7 @@ Each line is classified as one of:
 | W006 | Missing `@uri` (record has no identifier) |
 | W007 | Paired feedback file not found for content file |
 | W008 | Non-canonical formatting detected |
+| W009 | `@prior` file not found |
 ### 7.3 Lint Output Format
@@ -617,7 +658,27 @@ Or in compact form:
 @source ./images/beach.jpg <<< appropriate; tags=landscape,beach,sunset; quality=high
 ```
-### 8.4 Single-File Example
+### 8.4 Record with Prior Reference (e.g., LLM-generated content)
+```
+@uri local:generated-image-001
+@prior ./prompts/beach-sunset.txt
+@source ./images/generated-beach.jpg
+<<< accurate; matches prompt well; quality=high
+```
+Or with inline content:
+```
+@uri local:generated-text-001
+@prior ./prompts/haiku-prompt.txt
+Cherry blossoms fall,
+Petals dance on gentle breeze,
+Spring whispers goodbye.
+<<< creative; follows haiku structure; quality=excellent
+```
+### 8.5 Single-File Example
 **File:** `question.mb`
 ```
@@ -627,7 +688,7 @@ Explain quantum entanglement in simple terms.
 <<< quality=excellent; accuracy=high; clarity=good
 ```
-### 8.5 Label List Example (Compact Format)
+### 8.6 Label List Example (Compact Format)
 **File:** `image-annotations.mb`
 ```
@@ -659,7 +720,7 @@ Explain quantum entanglement in simple terms.
 @source ./batch1/item3.txt <<< positive; excellent clarity
 ```
-### 8.6 Multi-Record Example (Mixed Freeform and Structured)
+### 8.7 Multi-Record Example (Mixed Freeform and Structured)
 **File:** `training-data.mb`
 ```
@@ -690,7 +751,7 @@ Please write a formal letter requesting a meeting.
 @source ./audio/sample-005.wav <<< transcription="Hello world"; quality=clear; language=en
 ```
-### 8.7 Paired-File Example
+### 8.8 Paired-File Example
 **Content file:** `essay.txt`
 ```
@@ -706,7 +767,7 @@ agriculture, manufacturing, mining, and transport.
 <<< good; grade=B+; well structured but needs more specific examples
 ```
-### 8.8 Freeform Feedback Examples
+### 8.9 Freeform Feedback Examples
 Various styles of freeform feedback:
@@ -729,7 +790,7 @@ Explain machine learning to a child.
 <<< needs work; the explanation assumes too much prior knowledge
 ```
-### 8.9 Complex Structured Feedback (JSON)
+### 8.10 Complex Structured Feedback (JSON)
 ```
 @uri local:complex-example
@@ -738,7 +799,7 @@ Multi-attribute content with special characters.
 <<< json:{"rating":4.5,"tags":["important","review"],"notes":"Contains \"quoted\" text and; semicolons","scores":{"accuracy":0.9,"relevance":0.85}}
 ```
-### 8.10 Image with MarkBack Sidecar
+### 8.11 Image with MarkBack Sidecar
 **Content file:** `diagram.png` (binary)
@@ -824,8 +885,10 @@ feedback-content = *VCHAR             ; no LF allowed
 compact-record  = [uri-line] source-feedback-line
 compact-list    = compact-record *(1*blank-line compact-record)
 uri-line        = "@uri" SP value LF
-source-feedback-line = "@source" SP path SP "<<<" SP feedback-content LF
-path            = 1*VCHAR             ; ends at space before <<<
+source-feedback-line = "@source" SP path-with-range SP "<<<" SP feedback-content LF
+path-with-range = path [line-range]   ; path with optional line range
+path            = 1*VCHAR             ; ends at space before <<< or line-range
+line-range      = ":" 1*DIGIT ["-" 1*DIGIT]
 LOWER           = %x61-7A  ; a-z
 SP              = %x20     ; space
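
Note on the SPEC.md change: the line-range rules added in section 3.1.4 can be illustrated with a short Python sketch. This is not the package's implementation. `split_line_range` and `LineRangeError` are hypothetical names, and the regular expression is an assumption modeled on the `line-range` ABNF rule in the hunk above.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: a path followed by ":<start>" or ":<start>-<end>" at the end.
_RANGE = re.compile(r"^(?P<path>.+?):(?P<start>\d+)(?:-(?P<end>\d+))?$")


class LineRangeError(ValueError):
    """Hypothetical error type standing in for the E011 diagnostic."""


def split_line_range(ref: str) -> Tuple[str, Optional[int], Optional[int]]:
    """Split a reference into (path, start, end); lines are 1-indexed, inclusive."""
    m = _RANGE.match(ref)
    if m is None:
        # No trailing ":N" or ":N-M", so the whole value is the path.
        return ref, None, None
    start = int(m["start"])
    # A single-line reference ":N" covers line N only.
    end = int(m["end"]) if m["end"] else start
    if end < start:
        raise LineRangeError(f"end line {end} is less than start line {start}")
    return m["path"], start, end


print(split_line_range("./code.py:42-50"))
```

One caveat of this sketch: any value that ends in a colon followed by digits is treated as carrying a line range, so a URI that ends in a bare port number would be split the same way.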

{markback-0.1.0 → markback-0.1.2}/markback/linter.py RENAMED

@@ -110,6 +110,69 @@ def lint_source_exists(
     return diagnostics
+def lint_prior_exists(
+    record: Record,
+    base_path: Optional[Path],
+    record_idx: int,
+) -> list[Diagnostic]:
+    """Check if @prior file exists."""
+    diagnostics: list[Diagnostic] = []
+    if record.prior and not record.prior.is_uri:
+        try:
+            resolved = record.prior.resolve(base_path)
+            if not resolved.exists():
+                diagnostics.append(Diagnostic(
+                    file=record._source_file,
+                    line=record._start_line,
+                    column=None,
+                    severity=Severity.WARNING,
+                    code=WarningCode.W009,
+                    message=f"@prior file not found: {record.prior}",
+                    record_index=record_idx,
+                ))
+        except ValueError:
+            pass  # URI that can't be resolved to path
+    return diagnostics
+def lint_line_range(
+    record: Record,
+    record_idx: int,
+) -> list[Diagnostic]:
+    """Check if line ranges are valid (end >= start)."""
+    diagnostics: list[Diagnostic] = []
+    # Check @source line range
+    if record.source and record.source.start_line is not None:
+        if record.source.end_line is not None and record.source.end_line < record.source.start_line:
+            diagnostics.append(Diagnostic(
+                file=record._source_file,
+                line=record._start_line,
+                column=None,
+                severity=Severity.ERROR,
+                code=ErrorCode.E011,
+                message=f"Invalid line range in @source: end line {record.source.end_line} is less than start line {record.source.start_line}",
+                record_index=record_idx,
+            ))
+    # Check @prior line range
+    if record.prior and record.prior.start_line is not None:
+        if record.prior.end_line is not None and record.prior.end_line < record.prior.start_line:
+            diagnostics.append(Diagnostic(
+                file=record._source_file,
+                line=record._start_line,
+                column=None,
+                severity=Severity.ERROR,
+                code=ErrorCode.E011,
+                message=f"Invalid line range in @prior: end line {record.prior.end_line} is less than start line {record.prior.start_line}",
+                record_index=record_idx,
+            ))
+    return diagnostics
 def lint_canonical_format(
     records: list[Record],
     original_text: str,
@@ -173,10 +236,14 @@ def lint_string(
                 idx,
             ))
-        # Check source file existence
+        # Check source and prior file existence
         if check_sources:
             base_path = source_file.parent if source_file else None
             result.diagnostics.extend(lint_source_exists(record, base_path, idx))
+            result.diagnostics.extend(lint_prior_exists(record, base_path, idx))
+        # Check line range validity
+        result.diagnostics.extend(lint_line_range(record, idx))
     # Check canonical format
     if check_canonical and result.records and not result.has_errors:

{markback-0.1.0 → markback-0.1.2}/markback/parser.py RENAMED

@@ -17,7 +17,7 @@ from .types import (
 # Known header keywords
-KNOWN_HEADERS = {"uri", "source"}
+KNOWN_HEADERS = {"uri", "source", "prior"}
 # Patterns
 HEADER_PATTERN = re.compile(r"^@([a-z]+)\s+(.+)$")
@@ -147,6 +147,8 @@ def parse_string(
         uri = current_headers.get("uri") or pending_uri
         source_str = current_headers.get("source")
         source = SourceRef(source_str) if source_str else None
+        prior_str = current_headers.get("prior")
+        prior = SourceRef(prior_str) if prior_str else None
         content = None
         if current_content_lines:
@@ -163,6 +165,7 @@ def parse_string(
             feedback=feedback,
             uri=uri,
             source=source,
+            prior=prior,
             content=content,
             _source_file=source_file,
             _start_line=current_start_line,
@@ -239,13 +242,16 @@ def parse_string(
                     line_num,
                 )
-            # Use any pending @uri from previous line
+            # Use any pending @uri from previous line and @prior if present
             uri = pending_uri or current_headers.get("uri")
+            prior_str = current_headers.get("prior")
+            prior = SourceRef(prior_str) if prior_str else None
             record = Record(
                 feedback=feedback or "",
                 uri=uri,
                 source=source,
+                prior=prior,
                 content=None,
                 _source_file=source_file,
                 _start_line=current_start_line,
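
Note on the parser.py change: adding `"prior"` to `KNOWN_HEADERS` means an `@prior` line is read like `@uri` and `@source`. The sketch below shows the idea in isolation, reusing the `HEADER_PATTERN` regex and the `KNOWN_HEADERS` set visible in the hunks above. `collect_headers` is a hypothetical helper, not a function from the package, and it covers only the full (non-compact) record form.

```python
import re

KNOWN_HEADERS = {"uri", "source", "prior"}
HEADER_PATTERN = re.compile(r"^@([a-z]+)\s+(.+)$")


def collect_headers(lines):
    """Return (headers, unknown) for the leading @-lines of one record."""
    headers = {}
    unknown = []
    for line in lines:
        m = HEADER_PATTERN.match(line)
        if m is None:
            break  # the first non-header line ends the header block
        keyword, value = m.group(1), m.group(2).strip()
        if keyword in KNOWN_HEADERS:
            headers[keyword] = value
        else:
            unknown.append(keyword)
    return headers, unknown


# Record taken from the SPEC.md example for a prior reference.
record = [
    "@uri local:generated-image-001",
    "@prior ./prompts/beach-sunset.txt",
    "@source ./images/generated-beach.jpg",
    "<<< accurate; matches prompt well; quality=high",
]
headers, unknown = collect_headers(record)
print(headers)
```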

{markback-0.1.0 → markback-0.1.2}/markback/types.py RENAMED

@@ -1,5 +1,6 @@
 """Core types for MarkBack format."""
+import re
 from dataclasses import dataclass, field
 from enum import Enum
 from pathlib import Path
@@ -25,6 +26,7 @@ class ErrorCode(Enum):
     E008 = "E008"  # Unclosed quote in structured attribute value
     E009 = "E009"  # Empty feedback (nothing after <<< )
     E010 = "E010"  # Missing blank line before inline content
+    E011 = "E011"  # Invalid line range (end < start)
 class WarningCode(Enum):
@@ -37,6 +39,7 @@ class WarningCode(Enum):
     W006 = "W006"  # Missing @uri (record has no identifier)
     W007 = "W007"  # Paired feedback file not found
     W008 = "W008"  # Non-canonical formatting detected
+    W009 = "W009"  # @prior file not found
 @dataclass
@@ -75,29 +78,67 @@ class Diagnostic:
         }
+# Regex to parse line range from a path: path:start or path:start-end
+_LINE_RANGE_PATTERN = re.compile(r'^(.+?):(\d+)(?:-(\d+))?$')
 @dataclass
 class SourceRef:
     """Reference to external content (file path or URI)."""
     value: str
     is_uri: bool = False
+    start_line: Optional[int] = None
+    end_line: Optional[int] = None
+    _path_only: str = ""
     def __post_init__(self):
-        # Determine if this is a URI or file path
+        # Parse line range if present
+        self._parse_line_range()
+        # Determine if this is a URI or file path (using path without line range)
         if not self.is_uri:
-            parsed = urlparse(self.value)
+            parsed = urlparse(self._path_only)
             # Consider it a URI if it has a scheme that's not a Windows drive letter
             self.is_uri = bool(parsed.scheme) and len(parsed.scheme) > 1
+    def _parse_line_range(self):
+        """Parse optional line range from value."""
+        match = _LINE_RANGE_PATTERN.match(self.value)
+        if match:
+            self._path_only = match.group(1)
+            self.start_line = int(match.group(2))
+            if match.group(3):
+                self.end_line = int(match.group(3))
+            else:
+                # Single line reference: start and end are the same
+                self.end_line = self.start_line
+        else:
+            self._path_only = self.value
+    @property
+    def path(self) -> str:
+        """Return path without line range."""
+        return self._path_only
+    @property
+    def line_range_str(self) -> Optional[str]:
+        """Return formatted line range string, or None if no range."""
+        if self.start_line is None:
+            return None
+        if self.start_line == self.end_line:
+            return f":{self.start_line}"
+        return f":{self.start_line}-{self.end_line}"
     def resolve(self, base_path: Optional[Path] = None) -> Path:
         """Resolve to a file path (relative paths resolved against base_path)."""
         if self.is_uri:
-            parsed = urlparse(self.value)
+            parsed = urlparse(self._path_only)
             if parsed.scheme == "file":
                 # file:// URI
                 return Path(parsed.path)
             raise ValueError(f"Cannot resolve non-file URI to path: {self.value}")
-        path = Path(self.value)
+        path = Path(self._path_only)
         if path.is_absolute():
             return path
         if base_path:
@@ -122,6 +163,7 @@ class Record:
     feedback: str
     uri: Optional[str] = None
     source: Optional[SourceRef] = None
+    prior: Optional[SourceRef] = None
     content: Optional[str] = None
     metadata: dict = field(default_factory=dict)
@@ -154,6 +196,7 @@ class Record:
         return {
             "uri": self.uri,
             "source": str(self.source) if self.source else None,
+            "prior": str(self.prior) if self.prior else None,
             "content": self.content,
             "feedback": self.feedback,
             "metadata": self.metadata,
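
Note on the types.py change: `SourceRef.__post_init__` treats a value as a URI only when `urlparse` reports a scheme longer than one character, so a Windows drive letter is not mistaken for a scheme. The check can be tried on its own with the standard library. `looks_like_uri` is a hypothetical name used here for illustration.

```python
from urllib.parse import urlparse


def looks_like_uri(path_only: str) -> bool:
    """Mirror the scheme-length check: a one-letter scheme is a drive letter."""
    scheme = urlparse(path_only).scheme
    return bool(scheme) and len(scheme) > 1


for value in (
    "https://example.com/prompts/123",
    "file:///path/to/prompt.txt",
    "./prompts/template.txt",
    "C:\\data\\prompt.txt",
):
    print(value, looks_like_uri(value))
```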

{markback-0.1.0 → markback-0.1.2}/markback/writer.py RENAMED

@@ -38,15 +38,19 @@ def write_record_canonical(
     )
     if use_compact:
-        # Compact format: @uri on its own line (if present), then @source ... <<<
+        # Compact format: @uri on its own line (if present), then @prior, then @source ... <<<
         if record.uri:
             lines.append(f"@uri {record.uri}")
+        if record.prior:
+            lines.append(f"@prior {record.prior}")
         lines.append(f"@source {record.source} <<< {record.feedback}")
     else:
         # Full format
-        # Headers: @uri first, then @source
+        # Headers: @uri first, then @prior, then @source
         if record.uri:
             lines.append(f"@uri {record.uri}")
+        if record.prior:
+            lines.append(f"@prior {record.prior}")
         if record.source:
             lines.append(f"@source {record.source}")
@@ -147,6 +151,9 @@ def write_label_file(record: Record) -> str:
     if record.uri:
         lines.append(f"@uri {record.uri}")
+    if record.prior:
+        lines.append(f"@prior {record.prior}")
     lines.append(f"<<< {record.feedback}")
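
Note on the writer.py change: the writer now emits `@uri`, then `@prior`, then `@source`, matching the updated canonicalization rule in SPEC.md (unknown headers follow alphabetically). A minimal sketch of that ordering follows, assuming headers are held in a plain dict. `canonical_header_lines` is a hypothetical helper, not part of the package.

```python
# Rank of the known headers; anything else sorts after them, alphabetically.
_ORDER = {"uri": 0, "prior": 1, "source": 2}


def canonical_header_lines(headers: dict) -> list:
    """Return '@keyword value' lines in canonical order."""
    def sort_key(name: str):
        return (_ORDER.get(name, len(_ORDER)), name)

    return [f"@{name} {headers[name]}" for name in sorted(headers, key=sort_key)]


lines = canonical_header_lines({
    "source": "./images/generated-sunset.jpg",
    "uri": "local:generated-001",
    "prior": "./prompts/sunset-prompt.txt",
})
print("\n".join(lines))
```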

markback-0.1.2/packages/markbackjs/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 dandriscoll
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

markback-0.1.2/packages/markbackjs/README.md ADDED

@@ -0,0 +1,47 @@
+# markbackjs
+JavaScript/TypeScript linter for the MarkBack format.
+## Install
+```bash
+npm install markbackjs
+```
+## Usage
+```js
+const { lintString, formatDiagnostics } = require("markbackjs");
+const text = "Content here.\n<<< positive\n";
+const result = lintString(text, { checkSources: false, checkCanonical: false });
+if (result.hasErrors) {
+  console.log(formatDiagnostics(result.diagnostics));
+}
+```
+### Supported Headers
+- `@uri` - Unique identifier for the record
+- `@source` - Reference to external content file
+- `@prior` - Reference to a file that precedes the source (e.g., a prompt that generated it)
+## API
+- `lintString(text, options)`
+- `lintFile(path, options)`
+- `lintFiles(paths, options)`
+- `formatDiagnostics(diagnostics, format)`
+- `summarizeResults(results)`
+Options:
+- `sourceFile`: string
+- `checkSources`: boolean (default true)
+- `checkCanonical`: boolean (default true)
+## Build
+```bash
+npm run build
+```

markback-0.1.2/packages/markbackjs/package-lock.json ADDED

@@ -0,0 +1,51 @@
+{
+  "name": "markbackjs",
+  "version": "0.1.0",
+  "lockfileVersion": 3,
+  "requires": true,
+  "packages": {
+    "": {
+      "name": "markbackjs",
+      "version": "0.1.0",
+      "license": "MIT",
+      "devDependencies": {
+        "@types/node": "^20.11.0",
+        "typescript": "^5.4.0"
+      },
+      "engines": {
+        "node": ">=18"
+      }
+    },
+    "node_modules/@types/node": {
+      "version": "20.19.27",
+      "resolved": "https://registry.npmjs.org/@types/node/-/node-20.19.27.tgz",
+      "integrity": "sha512-N2clP5pJhB2YnZJ3PIHFk5RkygRX5WO/5f0WC08tp0wd+sv0rsJk3MqWn3CbNmT2J505a5336jaQj4ph1AdMug==",
+      "dev": true,
+      "license": "MIT",
+      "dependencies": {
+        "undici-types": "~6.21.0"
+      }
+    },
+    "node_modules/typescript": {
+      "version": "5.9.3",
+      "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz",
+      "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==",
+      "dev": true,
+      "license": "Apache-2.0",
+      "bin": {
+        "tsc": "bin/tsc",
+        "tsserver": "bin/tsserver"
+      },
+      "engines": {
+        "node": ">=14.17"
+      }
+    },
+    "node_modules/undici-types": {
+      "version": "6.21.0",
+      "resolved": "https://registry.npmjs.org/undici-types/-/undici-types-6.21.0.tgz",
+      "integrity": "sha512-iwDZqg0QAGrg9Rav5H4n0M64c3mkR59cJ6wQp+7C4nI0gsmExaedaYLNO44eT4AtBBwjbTiGPMlt2Md0T9H9JQ==",
+      "dev": true,
+      "license": "MIT"
+    }
+  }
+}
