npm - mdld-parse - Versions diffs - 0.2.7 → 0.2.9 - Mend

mdld-parse 0.2.7 → 0.2.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md CHANGED

@@ -14,8 +14,11 @@ MD-LD allows you to author RDF graphs directly in Markdown using explicit `{...}
 # Apollo 11 {=ex:apollo11 .SpaceMission}
 Launch: [1969-07-16] {startDate ^^xsd:date}
-Crew: [Neil Armstrong](ex:armstrong) {?crewMember}
+Crew: [Neil Armstrong] {=?ex:armstrong ?crewMember fullName}
 Description: [First crewed Moon landing] {description}
+[Section] {=?#overview ?hasPart}
+Overview: [Mission summary] {description}
 ```
 Generates valid RDF triples:
@@ -25,18 +28,20 @@ ex:apollo11 a schema:SpaceMission ;
   schema:startDate "1969-07-16"^^xsd:date ;
   schema:crewMember ex:armstrong ;
   schema:description "First crewed Moon landing" .
-```
-## Core Guarantees
+ex:armstrong schema:fullName "Neil Armstrong" .
+```
-MD-LD v0.2 provides strict semantic guarantees:
+## Core Features
-1. **CommonMark-preserving** — Removing `{...}` yields valid Markdown
-2. **Explicit semantics** — Every quad originates from explicit `{...}`
-3. **Single-pass parsing** — Streaming-friendly, deterministic
-4. **No blank nodes** — All subjects are stable IRIs
-5. **Complete traceability** — Every quad maps to source location
-6. **Round-trip capable** — Markdown ↔ RDF ↔ Markdown preserves structure
+- **Subject declarations**: `{=IRI}` and `{=#fragment}` for context setting
+- **Object IRIs**: `{=?IRI}` and `{=?#fragment}` for temporary object declarations
+- **Four predicate forms**: `p` (S→L), `?p` (S→O), `^p` (L→S), `^?p` (O→S)
+- **Type declarations**: `.Class` for rdf:type triples
+- **Datatypes & language**: `^^xsd:date` and `@en` support
+- **Lists**: Explicit subject declarations for structured data
+- **Fragments**: Built-in document structuring with `{=#fragment}`
+- **Round-trip serialization**: Markdown ↔ RDF ↔ Markdown preserves structure
 ## Installation
@@ -246,6 +251,7 @@ ex:book schema:hasPart ex:part .
 ```markdown
 [ex] {: http://example.org/}
 [foaf] {: http://xmlns.com/foaf/0.1/}
+[@vocab] {: http://schema.org/}
 # Person {=ex:alice .foaf:Person}
 ```
@@ -385,6 +391,41 @@ MD-LD explicitly forbids to ensure deterministic parsing:
 - ❌ Predicate guessing from context
 - ❌ Multi-pass or backtracking parsers
+Below is a **tight, README-ready refinement** of the Algebra section.
+It keeps the math precise, examples exhaustive, and language compact.
+---
+## Algebra
+> Every RDF triple `(s, p, o)` can be authored **explicitly, deterministically, and locally**, with no inference, guessing, or reordering.
+MD-LD models RDF authoring as a **closed edge algebra** over a small, explicit state. To be algebraically complete for RDF triple construction, a syntax must support:
+* Binding a **subject** `S`
+* Binding an **object** `O`
+* Emitting predicates in **both directions**
+* Distinguishing **IRI nodes** from **literal nodes**
+* Operating with **no implicit state or inference**
+MD-LD satisfies these requirements with four explicit operators.
+Each predicate is partitioned by **direction** and **node kind**:
+| Predicate form | Emitted triple |
+| -------------- | -------------- |
+| `p`            | `S ─p→ L`      |
+| `?p`           | `S ─p→ O`      |
+| `^p`           | `L ─p→ S`      |
+| `^?p`          | `O ─p→ S`      |
+This spans all **2 × 2** combinations of:
+* source ∈ {subject, object/literal}
+* target ∈ {subject, object/literal}
+Therefore, the algebra is **closed**.
 ## Use Cases
 ### Personal Knowledge Management
@@ -456,8 +497,10 @@ Contributions welcome! Please:
 ## Acknowledgments
+Developed by [Denis Starov](https://github.com/davay42).
 Inspired by:
-- Thomas Francart's [Semantic Markdown](https://blog.sparna.fr/2020/02/20/semantic-markdown/)
+- Thomas Francart's [Semantic Markdown](https://blog.sparna.fr/2020/02/20/semantic-markdown/) article
 - RDFa decades of structured data experience
 - CommonMark's rigorous parsing approach
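The four predicate forms in the README's Algebra table can be read as a single dispatch on the token prefix. The following is a minimal sketch, not the package's implementation: `emitTriple` and the `{ S, O, L }` state object are hypothetical names, and nodes are modelled as plain objects rather than RDF/JS terms.

```javascript
// Hypothetical helper (not part of mdld-parse): maps one predicate token to a triple.
// S = current subject IRI, O = object IRI declared via {=?IRI}, L = carrier literal text.
function emitTriple(token, { S, O, L }) {
    const reverse = token.startsWith('^');          // ^p and ^?p point back at S
    const rest = reverse ? token.slice(1) : token;
    const iriObject = rest.startsWith('?');         // ?p and ^?p use the IRI node O
    const predicate = iriObject ? rest.slice(1) : rest;

    const subjectNode = { kind: 'iri', value: S };
    const otherNode = iriObject
        ? { kind: 'iri', value: O }
        : { kind: 'literal', value: L };

    return reverse
        ? { s: otherNode, p: predicate, o: subjectNode }
        : { s: subjectNode, p: predicate, o: otherNode };
}

// Usage with the Apollo 11 example from the README hunk above.
const state = { S: 'ex:apollo11', O: 'ex:armstrong', L: 'Neil Armstrong' };
const forward = emitTriple('?crewMember', state);   // S -crewMember-> O
const backward = emitTriple('^?crewMember', state); // O -crewMember-> S
```

Stripping the prefixes in a fixed order (`^` first, then `?`) keeps the four cases from overlapping, which is the 2 × 2 partition the table describes.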

package/index.js CHANGED

@@ -24,7 +24,6 @@ export function hash(str) {
     return Math.abs(h).toString(16).slice(0, 12);
 }
-// IRI Utilities
 export function expandIRI(term, ctx) {
     if (term == null) return null;
     const raw = typeof term === 'string' ? term : (typeof term === 'object' && typeof term.value === 'string') ? term.value : String(term);
@@ -76,6 +75,13 @@ export function parseSemanticBlock(raw) {
                 continue;
             }
+            if (token.startsWith('=?#')) {
+                const fragment = token.substring(3);
+                result.object = `#${fragment}`;
+                result.entries.push({ kind: 'softFragment', fragment, relRange: { start: relStart, end: relEnd }, raw: token });
+                continue;
+            }
             if (token.startsWith('=?')) {
                 const iri = token.substring(2);
                 result.object = iri;
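The order of the two prefix checks in this hunk matters: every `=?#fragment` token also starts with `=?`, so the longer prefix has to be tested first. A minimal sketch of that ordering follows; `classifyObjectToken` is a hypothetical name, and the `'object'` kind for the plain `=?` case is an assumption, since the diff does not show that entry.

```javascript
// Hypothetical classifier illustrating why '=?#' must be checked before '=?'.
function classifyObjectToken(token) {
    if (token.startsWith('=?#')) {
        // Soft fragment: keep only the part after '=?#'
        return { kind: 'softFragment', fragment: token.substring(3) };
    }
    if (token.startsWith('=?')) {
        // Regular soft object IRI ('object' kind name is assumed, not from the diff)
        return { kind: 'object', iri: token.substring(2) };
    }
    return null; // not an object declaration
}

const frag = classifyObjectToken('=?#overview');
const iri = classifyObjectToken('=?ex:armstrong');
```

With the checks reversed, `=?#overview` would be treated as a regular object IRI with the value `#overview`.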
@@ -274,6 +280,41 @@ function extractInlineCarriers(text, baseOffset = 0) {
     let pos = 0;
     while (pos < text.length) {
+        // Try emphasis patterns first (before brackets)
+        const emphasisMatch = text.match(/^[*__`]+(.+?)[*__`]+\s*\{([^}]+)\}/, pos);
+        if (emphasisMatch) {
+            const carrierText = emphasisMatch[1];
+            const valueRange = [baseOffset + emphasisMatch[0].length, baseOffset + emphasisMatch[0].length + emphasisMatch[1].length];
+            carriers.push({
+                type: 'emphasis',
+                text: carrierText,
+                attrs: `{${emphasisMatch[2]}}`,
+                attrsRange: [baseOffset + emphasisMatch[0].length + emphasisMatch[1].length + 2, baseOffset + emphasisMatch[0].length + emphasisMatch[1].length + emphasisMatch[2].length],
+                valueRange,
+                range: [baseOffset + emphasisMatch[0].length, baseOffset + emphasisMatch[0].length + emphasisMatch[1].length]
+            });
+            pos = baseOffset + emphasisMatch[0].length + emphasisMatch[1].length + emphasisMatch[2].length;
+            continue;
+        }
+        // Try code spans
+        const codeMatch = text.match(/^``(.+?)``\s*\{([^}]+)\}/, pos);
+        if (codeMatch) {
+            const carrierText = codeMatch[1];
+            const valueRange = [baseOffset + 2, baseOffset + 2 + codeMatch[1].length];
+            carriers.push({
+                type: 'code',
+                text: carrierText,
+                attrs: `{${codeMatch[2]}}`,
+                attrsRange: [baseOffset + 2 + codeMatch[1].length + 2, baseOffset + 2 + codeMatch[1].length + 2],
+                valueRange,
+                range: [baseOffset + 2, baseOffset + 2 + codeMatch[1].length + 2]
+            });
+            pos = baseOffset + 2 + codeMatch[1].length + 2;
+            continue;
+        }
+        // Try bracket patterns (original logic)
         const bracketStart = text.indexOf('[', pos);
         if (bracketStart === -1) break;
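One point worth noting about the added carrier matching: `String.prototype.match` accepts a single regular-expression argument, so a position passed as a second argument has no effect, and a pattern anchored with `^` only matches at the start of the string. A position-anchored match can instead be sketched with a sticky (`y`) regular expression and `lastIndex`. The helper name `matchEmphasisAt` and the returned field `next` are hypothetical, and the pattern is a simplified stand-in rather than the package's own.

```javascript
// Sticky regex: exec() only matches starting exactly at lastIndex.
// Groups: 1 = opening marker run, 2 = carrier text, 3 = attribute body.
const EMPHASIS_CARRIER = /([*_`]+)(.+?)\1\s*\{([^}]+)\}/y;

// Hypothetical helper: match an emphasis or code carrier followed by {...} at `pos`.
function matchEmphasisAt(text, pos) {
    EMPHASIS_CARRIER.lastIndex = pos;
    const m = EMPHASIS_CARRIER.exec(text);
    if (!m) return null;

    const end = pos + m[0].length;
    const valueStart = pos + m[1].length;            // skip the opening markers
    const attrsStart = pos + m[0].lastIndexOf('{');  // start of the {...} block
    return {
        type: 'emphasis',
        text: m[2],
        attrs: `{${m[3]}}`,
        valueRange: [valueStart, valueStart + m[2].length],
        attrsRange: [attrsStart, end],
        range: [pos, end],
        next: end                                    // where scanning would resume
    };
}

const sample = 'Crew: **Neil** {name}';
const carrier = matchEmphasisAt(sample, 6);
```

All ranges here are computed relative to `pos` in the scanned text; a caller that tracks a `baseOffset`, as the function in the diff does, would add it when storing the ranges.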
@@ -365,7 +406,6 @@ function createBlock(subject, types, predicates, entries, range, attrsRange, val
     };
 }
-// Quad Utilities
 function quadIndexKey(subject, predicate, object) {
     const objKey = object.termType === 'Literal'
         ? JSON.stringify({ t: 'Literal', v: object.value, lang: object.language || '', dt: object.datatype?.value || '' })
@@ -406,7 +446,6 @@ function parseQuadIndexKey(key) {
     }
 }
-// Semantic Slot Utilities
 function createSemanticSlotId(subject, predicate) {
     return hash(`${subject.value}|${predicate.value}`);
 }
@@ -500,7 +539,17 @@ function processAnnotation(carrier, sem, state) {
     if (sem.object) {
         // Handle soft IRI object declaration - local to this annotation only
-        localObject = state.df.namedNode(expandIRI(sem.object, state.ctx));
+        if (sem.object.startsWith('#')) {
+            // Soft fragment - resolve against current subject base
+            const fragment = sem.object.substring(1);
+            if (state.currentSubject) {
+                const baseIRI = state.currentSubject.value.split('#')[0];
+                localObject = state.df.namedNode(`${baseIRI}#${fragment}`);
+            }
+        } else {
+            // Regular soft IRI
+            localObject = state.df.namedNode(expandIRI(sem.object, state.ctx));
+        }
     }
     if (newSubject) state.currentSubject = newSubject;
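The soft-fragment branch in this hunk resolves `#fragment` against the current subject with any existing fragment stripped. A standalone sketch of that resolution, using plain strings instead of the data-factory terms in the diff (`resolveSoftFragment` is a hypothetical name):

```javascript
// Hypothetical standalone version of the fragment resolution shown in the hunk above.
// Returns null when there is no current subject to resolve against.
function resolveSoftFragment(currentSubjectIRI, fragment) {
    if (!currentSubjectIRI) return null;
    const baseIRI = currentSubjectIRI.split('#')[0]; // drop any existing fragment
    return `${baseIRI}#${fragment}`;
}

const a = resolveSoftFragment('http://example.org/apollo11', 'overview');
const b = resolveSoftFragment('http://example.org/doc#intro', 'overview');
```

Because the base is everything before the first `#`, a subject that is itself a fragment yields a sibling fragment of the same document rather than a nested one.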
@@ -718,8 +767,6 @@ export function parse(text, options = {}) {
     return { quads: state.quads, origin: state.origin, context: state.ctx };
 }
-// Text Processing Utilities
 function readSpan(block, text, spanType = 'attrs') {
     const range = spanType === 'attrs' ? block?.attrsRange : block?.valueRange;
     if (!range) return null;
@@ -753,6 +800,16 @@ function removeObjectToken(tokens, iri) {
     return removeOneToken(tokens, t => t === objectToken);
 }
+function addSoftFragmentToken(tokens, fragment) {
+    const fragmentToken = `=?#${fragment}`;
+    return tokens.includes(fragmentToken) ? tokens : [...tokens, fragmentToken];
+}
+function removeSoftFragmentToken(tokens, fragment) {
+    const fragmentToken = `=?#${fragment}`;
+    return removeOneToken(tokens, t => t === fragmentToken);
+}
 function sanitizeCarrierValueForBlock(block, raw) {
     const s = String(raw ?? '');
     const t = block?.carrierType;
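The two token helpers added in this hunk can be exercised in isolation as below. `removeOneToken` is not shown in the diff, so the version here is an assumption inferred from how its result is destructured elsewhere (`{ tokens, removed }`): it removes only the first matching token.

```javascript
// Assumed behaviour of removeOneToken (not shown in the diff): drop the first match only.
function removeOneToken(tokens, matchFn) {
    const i = tokens.findIndex(matchFn);
    if (i === -1) return { tokens, removed: false };
    return { tokens: [...tokens.slice(0, i), ...tokens.slice(i + 1)], removed: true };
}

// Same bodies as the helpers in the hunk above.
function addSoftFragmentToken(tokens, fragment) {
    const fragmentToken = `=?#${fragment}`;
    return tokens.includes(fragmentToken) ? tokens : [...tokens, fragmentToken];
}

function removeSoftFragmentToken(tokens, fragment) {
    const fragmentToken = `=?#${fragment}`;
    return removeOneToken(tokens, t => t === fragmentToken);
}

const before = ['?hasPart'];
const added = addSoftFragmentToken(before, 'overview');
const addedAgain = addSoftFragmentToken(added, 'overview'); // no duplicate token
const result = removeSoftFragmentToken(added, 'overview');
```

Adding is idempotent and returns the same array when the token is already present, which is why the serializer can detect a change by comparing lengths.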
@@ -1057,6 +1114,17 @@ export function serialize({ text, diff, origin, options = {} }) {
                 return;
             }
+            // Handle soft fragment token removal
+            if (entry?.kind === 'softFragment') {
+                const fragment = entry.fragment;
+                const { tokens: updated, removed } = removeSoftFragmentToken(tokens, fragment);
+                if (!removed) return;
+                const newAttrs = updated.length === 0 ? '{}' : writeAttrsTokens(updated);
+                edits.push({ start: span.start, end: span.end, text: newAttrs });
+                return;
+            }
             const tokens = normalizeAttrsTokens(span.text);
             let updated = tokens;
             let removed = false;
@@ -1151,20 +1219,31 @@ export function serialize({ text, diff, origin, options = {} }) {
                     const objectShort = shortenIRI(full, ctx);
                     const predShort = shortenIRI(quad.predicate.value, ctx);
-                    // Check if this is a ?predicate form (should use object IRI)
-                    const span = readSpan(targetBlock, text, 'attrs');
-                    const tokens = blockTokensFromEntries(targetBlock) || normalizeAttrsTokens(span.text);
-                    const hasObjectToken = tokens.some(t => t.startsWith('=?'));
+                    // Check if this is a soft fragment
+                    const isSoftFragment = full.includes('#') && anchored?.entry?.kind === 'softFragment';
-                    if (hasObjectToken || anchored?.entry?.form === '?') {
-                        // Add object token if not present
-                        const updated = addObjectToken(tokens, objectShort);
-                        if (updated.length !== tokens.length) {
-                            edits.push({ start: span.start, end: span.end, text: writeAttrsTokens(updated) });
+                    if (isSoftFragment || anchored?.entry?.form === '?') {
+                        // Add soft fragment token if not present
+                        if (isSoftFragment) {
+                            const fragment = full.split('#')[1];
+                            const updated = addSoftFragmentToken(tokens, fragment);
+                            if (updated.length !== tokens.length) {
+                                edits.push({ start: span.start, end: span.end, text: writeAttrsTokens(updated) });
+                            }
+                        } else {
+                            const updated = addObjectToken(tokens, objectShort);
+                            if (updated.length !== tokens.length) {
+                                edits.push({ start: span.start, end: span.end, text: writeAttrsTokens(updated) });
+                            }
                         }
                     } else {
                         // Create new annotation with object token
-                        edits.push({ start: result.length, end: result.length, text: `\n[${objectShort}] {=?${objectShort} ?${predShort}}` });
+                        if (isSoftFragment) {
+                            const fragment = full.split('#')[1];
+                            edits.push({ start: result.length, end: result.length, text: `\n[${objectShort}] {=?#${fragment} ?${predShort}}` });
+                        } else {
+                            edits.push({ start: result.length, end: result.length, text: `\n[${objectShort}] {=?${objectShort} ?${predShort}}` });
+                        }
                     }
                     return;
                 }
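The last branch of this hunk appends a new annotation whose object token is either `=?#fragment` or `=?IRI`. The string it builds can be sketched as a small pure function; `newObjectAnnotation` is a hypothetical name, and the arguments stand in for the `objectShort`, `predShort`, and `fragment` values computed in the diff.

```javascript
// Hypothetical helper mirroring the two template strings in the hunk above.
// Pass a fragment to emit the soft-fragment form, or omit it for a regular object IRI.
function newObjectAnnotation(objectShort, predShort, fragment = null) {
    const objectToken = fragment != null ? `=?#${fragment}` : `=?${objectShort}`;
    return `\n[${objectShort}] {${objectToken} ?${predShort}}`;
}

const regular = newObjectAnnotation('ex:armstrong', 'crewMember');
const soft = newObjectAnnotation('#overview', 'hasPart', 'overview');
```

Both forms pair the object declaration with a `?predicate` token, so the appended line emits a subject-to-object triple when parsed again.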

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "mdld-parse",
-	"version": "0.2.7",
+	"version": "0.2.9",
 	"description": "A standards-compliant parser for **MD-LD (Markdown-Linked Data)** — a human-friendly RDF authoring format that extends Markdown with semantic annotations.",
 	"type": "module",
 	"main": "index.js",