npm - mdld-parse - Versions diffs - 0.2.6 → 0.2.8 - Mend

mdld-parse 0.2.6 → 0.2.8

This diff compares publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between those versions as they appear in their public registries.

Files changed (3)

package/README.md CHANGED

@@ -14,8 +14,11 @@ MD-LD allows you to author RDF graphs directly in Markdown using explicit `{...}
 # Apollo 11 {=ex:apollo11 .SpaceMission}
 Launch: [1969-07-16] {startDate ^^xsd:date}
-Crew: [Neil Armstrong](ex:armstrong) {?crewMember}
+Crew: [Neil Armstrong] {=?ex:armstrong ?crewMember fullName}
 Description: [First crewed Moon landing] {description}
+[Section] {=?#overview ?hasPart}
+Overview: [Mission summary] {description}
 ```
 Generates valid RDF triples:
@@ -25,18 +28,20 @@ ex:apollo11 a schema:SpaceMission ;
   schema:startDate "1969-07-16"^^xsd:date ;
   schema:crewMember ex:armstrong ;
   schema:description "First crewed Moon landing" .
-```
-## Core Guarantees
+ex:armstrong schema:fullName "Neil Armstrong" .
+```
-MD-LD v0.2 provides strict semantic guarantees:
+## Core Features
-1. **CommonMark-preserving** — Removing `{...}` yields valid Markdown
-2. **Explicit semantics** — Every quad originates from explicit `{...}`
-3. **Single-pass parsing** — Streaming-friendly, deterministic
-4. **No blank nodes** — All subjects are stable IRIs
-5. **Complete traceability** — Every quad maps to source location
-6. **Round-trip capable** — Markdown ↔ RDF ↔ Markdown preserves structure
+- **Subject declarations**: `{=IRI}` and `{=#fragment}` for context setting
+- **Object IRIs**: `{=?IRI}` and `{=?#fragment}` for temporary object declarations
+- **Four predicate forms**: `p` (S→L), `?p` (S→O), `^p` (L→S), `^?p` (O→S)
+- **Type declarations**: `.Class` for rdf:type triples
+- **Datatypes & language**: `^^xsd:date` and `@en` support
+- **Lists**: Explicit subject declarations for structured data
+- **Fragments**: Built-in document structuring with `{=#fragment}`
+- **Round-trip serialization**: Markdown ↔ RDF ↔ Markdown preserves structure
 ## Installation
@@ -246,6 +251,7 @@ ex:book schema:hasPart ex:part .
 ```markdown
 [ex] {: http://example.org/}
 [foaf] {: http://xmlns.com/foaf/0.1/}
+[@vocab] {: http://schema.org/}
 # Person {=ex:alice .foaf:Person}
 ```
@@ -385,6 +391,41 @@ MD-LD explicitly forbids to ensure deterministic parsing:
 - ❌ Predicate guessing from context
 - ❌ Multi-pass or backtracking parsers
+Below is a **tight, README-ready refinement** of the Algebra section.
+It keeps the math precise, examples exhaustive, and language compact.
+---
+## Algebra
+> Every RDF triple `(s, p, o)` can be authored **explicitly, deterministically, and locally**, with no inference, guessing, or reordering.
+MD-LD models RDF authoring as a **closed edge algebra** over a small, explicit state. To be algebraically complete for RDF triple construction, a syntax must support:
+* Binding a **subject** `S`
+* Binding an **object** `O`
+* Emitting predicates in **both directions**
+* Distinguishing **IRI nodes** from **literal nodes**
+* Operating with **no implicit state or inference**
+MD-LD satisfies these requirements with four explicit operators.
+Each predicate is partitioned by **direction** and **node kind**:
+| Predicate form | Emitted triple |
+| -------------- | -------------- |
+| `p`            | `S ─p→ L`      |
+| `?p`           | `S ─p→ O`      |
+| `^p`           | `L ─p→ S`      |
+| `^?p`          | `O ─p→ S`      |
+This spans all **2 × 2** combinations of:
+* source ∈ {subject, object/literal}
+* target ∈ {subject, object/literal}
+Therefore, the algebra is **closed**.
 ## Use Cases
 ### Personal Knowledge Management
@@ -456,8 +497,10 @@ Contributions welcome! Please:
 ## Acknowledgments
+Developed by [Denis Starov](https://github.com/davay42).
 Inspired by:
-- Thomas Francart's [Semantic Markdown](https://blog.sparna.fr/2020/02/20/semantic-markdown/)
+- Thomas Francart's [Semantic Markdown](https://blog.sparna.fr/2020/02/20/semantic-markdown/) article
 - RDFa decades of structured data experience
 - CommonMark's rigorous parsing approach
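
The four predicate forms in the Algebra table added to the README can be sketched as a single dispatch function. This is a minimal illustration, not the package's code: `emitTriple` and the plain-string nodes are hypothetical, and the `^` branch places the literal in subject position only because the table (and the new `^` branch in `index.js`) does so.

```javascript
// Hypothetical helper, not part of mdld-parse: maps one predicate form to a triple.
// S = current subject IRI, O = object IRI, L = literal taken from the carrier text.
function emitTriple(form, p, { S, O, L }) {
    switch (form) {
        case '':   // p   : S -p-> L
            return { subject: S, predicate: p, object: L };
        case '?':  // ?p  : S -p-> O (skipped when no object IRI is available)
            return O ? { subject: S, predicate: p, object: O } : null;
        case '^':  // ^p  : L -p-> S
            return { subject: L, predicate: p, object: S };
        case '^?': // ^?p : O -p-> S (skipped when no object IRI is available)
            return O ? { subject: O, predicate: p, object: S } : null;
        default:
            throw new Error(`Unknown predicate form: ${form}`);
    }
}

// Example nodes borrowed from the README's Apollo 11 snippet.
const nodes = { S: 'ex:apollo11', O: 'ex:armstrong', L: '"Neil Armstrong"' };
const triples = ['', '?', '^', '^?'].map(form => emitTriple(form, 'schema:crewMember', nodes));

for (const t of triples) console.log(`${t.subject} ${t.predicate} ${t.object}`);
```

The two `?` forms return `null` when no object IRI is bound, which mirrors the `if (objectIRI && subjectIRI)` guard in the updated `processAnnotation`.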

package/index.js CHANGED

@@ -1,4 +1,4 @@
-const DEFAULT_CONTEXT = {
+export const DEFAULT_CONTEXT = {
     '@vocab': 'http://schema.org/',
     rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
     rdfs: 'http://www.w3.org/2000/01/rdf-schema#',
@@ -6,7 +6,7 @@ const DEFAULT_CONTEXT = {
     schema: 'http://schema.org/'
 };
-const DataFactory = {
+export const DataFactory = {
     namedNode: (v) => ({ termType: 'NamedNode', value: v }),
     blankNode: (v = `b${Math.random().toString(36).slice(2, 11)}`) => ({ termType: 'BlankNode', value: v }),
     literal: (v, lang) => {
@@ -18,14 +18,14 @@ const DataFactory = {
     quad: (s, p, o, g) => ({ subject: s, predicate: p, object: o, graph: g || DataFactory.namedNode('') })
 };
-function hash(str) {
+export function hash(str) {
     let h = 5381;
     for (let i = 0; i < str.length; i++) h = ((h << 5) + h) + str.charCodeAt(i);
     return Math.abs(h).toString(16).slice(0, 12);
 }
 // IRI Utilities
-function expandIRI(term, ctx) {
+export function expandIRI(term, ctx) {
     if (term == null) return null;
     const raw = typeof term === 'string' ? term : (typeof term === 'object' && typeof term.value === 'string') ? term.value : String(term);
     const t = raw.trim();
@@ -48,17 +48,13 @@ export function shortenIRI(iri, ctx) {
     return iri;
 }
-function processIRI(term, ctx, operation = 'expand') {
-    return operation === 'expand' ? expandIRI(term, ctx) : shortenIRI(term, ctx);
-}
-function parseSemanticBlock(raw) {
+export function parseSemanticBlock(raw) {
     try {
         const src = String(raw || '').trim();
         const cleaned = src.replace(/^\{|\}$/g, '').trim();
-        if (!cleaned) return { subject: null, types: [], predicates: [], datatype: null, language: null, entries: [] };
+        if (!cleaned) return { subject: null, object: null, types: [], predicates: [], datatype: null, language: null, entries: [] };
-        const result = { subject: null, types: [], predicates: [], datatype: null, language: null, entries: [] };
+        const result = { subject: null, object: null, types: [], predicates: [], datatype: null, language: null, entries: [] };
         const re = /\S+/g;
         let m;
         while ((m = re.exec(cleaned)) !== null) {
@@ -80,6 +76,20 @@ function parseSemanticBlock(raw) {
                 continue;
             }
+            if (token.startsWith('=?#')) {
+                const fragment = token.substring(3);
+                result.object = `#${fragment}`;
+                result.entries.push({ kind: 'softFragment', fragment, relRange: { start: relStart, end: relEnd }, raw: token });
+                continue;
+            }
+            if (token.startsWith('=?')) {
+                const iri = token.substring(2);
+                result.object = iri;
+                result.entries.push({ kind: 'object', iri, relRange: { start: relStart, end: relEnd }, raw: token });
+                continue;
+            }
             if (token.startsWith('=')) {
                 const iri = token.substring(1);
                 result.subject = iri;
@@ -137,7 +147,7 @@ function parseSemanticBlock(raw) {
         return result;
     } catch (error) {
         console.error(`Error parsing semantic block ${raw}:`, error);
-        return { subject: null, types: [], predicates: [], datatype: null, language: null, entries: [] };
+        return { subject: null, object: null, types: [], predicates: [], datatype: null, language: null, entries: [] };
     }
 }
@@ -472,11 +482,13 @@ function createLiteral(value, datatype, language, context, dataFactory) {
 function processAnnotation(carrier, sem, state) {
     if (sem.subject === 'RESET') {
         state.currentSubject = null;
+        state.currentObject = null;
         return;
     }
     const previousSubject = state.currentSubject;
     let newSubject = null;
+    let localObject = null;
     if (sem.subject) {
         if (sem.subject.startsWith('=#')) {
@@ -492,6 +504,22 @@ function processAnnotation(carrier, sem, state) {
             newSubject = state.df.namedNode(expandIRI(sem.subject, state.ctx));
         }
     }
+    if (sem.object) {
+        // Handle soft IRI object declaration - local to this annotation only
+        if (sem.object.startsWith('#')) {
+            // Soft fragment - resolve against current subject base
+            const fragment = sem.object.substring(1);
+            if (state.currentSubject) {
+                const baseIRI = state.currentSubject.value.split('#')[0];
+                localObject = state.df.namedNode(`${baseIRI}#${fragment}`);
+            }
+        } else {
+            // Regular soft IRI
+            localObject = state.df.namedNode(expandIRI(sem.object, state.ctx));
+        }
+    }
     if (newSubject) state.currentSubject = newSubject;
     const S = state.currentSubject;
@@ -501,12 +529,15 @@ function processAnnotation(carrier, sem, state) {
     state.origin.blocks.set(block.id, block);
     const L = createLiteral(carrier.text, sem.datatype, sem.language, state.ctx, state.df);
-    const O = carrier.url ? state.df.namedNode(expandIRI(carrier.url, state.ctx)) : null;
+    const carrierO = carrier.url ? state.df.namedNode(expandIRI(carrier.url, state.ctx)) : null;
     sem.types.forEach(t => {
         const typeIRI = typeof t === 'string' ? t : t.iri;
         const entryIndex = typeof t === 'string' ? null : t.entryIndex;
-        const typeSubject = O || S;
+        // For types with subject declarations, the type applies to the new subject
+        // For types with soft IRI declarations, the type applies to the soft IRI object
+        // Otherwise, type applies to carrier object or current subject
+        const typeSubject = newSubject ? newSubject : (localObject || carrierO || S);
         const expandedType = expandIRI(typeIRI, state.ctx);
         emitQuad(state.quads, state.origin.quadIndex, block.id, typeSubject, state.df.namedNode(expandIRI('rdf:type', state.ctx)), state.df.namedNode(expandedType), state.df, { kind: 'type', token: `.${typeIRI}`, expandedType, entryIndex });
     });
@@ -516,18 +547,26 @@ function processAnnotation(carrier, sem, state) {
         const token = `${pred.form}${pred.iri}`;
         if (pred.form === '') {
-            emitQuad(state.quads, state.origin.quadIndex, block.id, S, P, L, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
+            // S —p→ L (use soft IRI object as subject if available, otherwise current subject)
+            const subjectIRI = localObject || S;
+            emitQuad(state.quads, state.origin.quadIndex, block.id, subjectIRI, P, L, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
         } else if (pred.form === '?') {
-            if (newSubject) {
-                emitQuad(state.quads, state.origin.quadIndex, block.id, previousSubject, P, newSubject, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
-            } else if (O) {
-                emitQuad(state.quads, state.origin.quadIndex, block.id, S, P, O, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
+            // S —p→ O (use previous subject as subject, newSubject as object)
+            const subjectIRI = newSubject ? previousSubject : S;
+            const objectIRI = localObject || newSubject || carrierO;
+            if (objectIRI && subjectIRI) {
+                emitQuad(state.quads, state.origin.quadIndex, block.id, subjectIRI, P, objectIRI, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
             }
+        } else if (pred.form === '^') {
+            // L —p→ S (use soft IRI object as subject if available, otherwise current subject)
+            const subjectIRI = localObject || S;
+            emitQuad(state.quads, state.origin.quadIndex, block.id, L, P, subjectIRI, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
         } else if (pred.form === '^?') {
-            if (newSubject) {
-                emitQuad(state.quads, state.origin.quadIndex, block.id, newSubject, P, previousSubject, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
-            } else if (O) {
-                emitQuad(state.quads, state.origin.quadIndex, block.id, O, P, S, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
+            // O —p→ S (use previous subject as object, newSubject as subject)
+            const objectIRI = newSubject ? previousSubject : S;
+            const subjectIRI = localObject || newSubject || carrierO;
+            if (objectIRI && subjectIRI) {
+                emitQuad(state.quads, state.origin.quadIndex, block.id, subjectIRI, P, objectIRI, state.df, { kind: 'pred', token, form: pred.form, expandedPredicate: P.value, entryIndex: pred.entryIndex });
             }
         }
     });
@@ -622,7 +661,8 @@ export function parse(text, options = {}) {
         df: options.dataFactory || DataFactory,
         quads: [],
         origin: { blocks: new Map(), quadIndex: new Map() },
-        currentSubject: null
+        currentSubject: null,
+        currentObject: null
     };
     const tokens = scanTokens(text);
@@ -720,6 +760,26 @@ function removeOneToken(tokens, matchFn) {
     return i === -1 ? { tokens, removed: false } : { tokens: [...tokens.slice(0, i), ...tokens.slice(i + 1)], removed: true };
 }
+function addObjectToken(tokens, iri) {
+    const objectToken = `=?${iri}`;
+    return tokens.includes(objectToken) ? tokens : [...tokens, objectToken];
+}
+function removeObjectToken(tokens, iri) {
+    const objectToken = `=?${iri}`;
+    return removeOneToken(tokens, t => t === objectToken);
+}
+function addSoftFragmentToken(tokens, fragment) {
+    const fragmentToken = `=?#${fragment}`;
+    return tokens.includes(fragmentToken) ? tokens : [...tokens, fragmentToken];
+}
+function removeSoftFragmentToken(tokens, fragment) {
+    const fragmentToken = `=?#${fragment}`;
+    return removeOneToken(tokens, t => t === fragmentToken);
+}
 function sanitizeCarrierValueForBlock(block, raw) {
     const s = String(raw ?? '');
     const t = block?.carrierType;
@@ -1013,6 +1073,28 @@ export function serialize({ text, diff, origin, options = {} }) {
                 return;
             }
+            // Handle object token removal
+            if (entry?.kind === 'object') {
+                const objectIRI = shortenIRI(quad.object.value, ctx);
+                const { tokens: updated, removed } = removeObjectToken(tokens, objectIRI);
+                if (!removed) return;
+                const newAttrs = updated.length === 0 ? '{}' : writeAttrsTokens(updated);
+                edits.push({ start: span.start, end: span.end, text: newAttrs });
+                return;
+            }
+            // Handle soft fragment token removal
+            if (entry?.kind === 'softFragment') {
+                const fragment = entry.fragment;
+                const { tokens: updated, removed } = removeSoftFragmentToken(tokens, fragment);
+                if (!removed) return;
+                const newAttrs = updated.length === 0 ? '{}' : writeAttrsTokens(updated);
+                edits.push({ start: span.start, end: span.end, text: newAttrs });
+                return;
+            }
             const tokens = normalizeAttrsTokens(span.text);
             let updated = tokens;
             let removed = false;
@@ -1084,7 +1166,8 @@ export function serialize({ text, diff, origin, options = {} }) {
                     } else {
                         const full = quad.object.value;
                         const label = shortenIRI(full, ctx);
-                        edits.push({ start: result.length, end: result.length, text: `\n[${label}] {=${label}) {?${predShort}}` });
+                        const objectShort = shortenIRI(full, ctx);
+                        edits.push({ start: result.length, end: result.length, text: `\n[${label}] {=?${objectShort} ?${predShort}}` });
                     }
                     return;
                 }
@@ -1103,8 +1186,35 @@ export function serialize({ text, diff, origin, options = {} }) {
                 if (quad.object.termType === 'NamedNode') {
                     const full = quad.object.value;
-                    const label = shortenIRI(full, ctx);
-                    edits.push({ start: result.length, end: result.length, text: `\n[${label}] {=${shortenIRI(full, ctx)} ?${predShort}}` });
+                    const objectShort = shortenIRI(full, ctx);
+                    const predShort = shortenIRI(quad.predicate.value, ctx);
+                    // Check if this is a soft fragment
+                    const isSoftFragment = full.includes('#') && anchored?.entry?.kind === 'softFragment';
+                    if (isSoftFragment || anchored?.entry?.form === '?') {
+                        // Add soft fragment token if not present
+                        if (isSoftFragment) {
+                            const fragment = full.split('#')[1];
+                            const updated = addSoftFragmentToken(tokens, fragment);
+                            if (updated.length !== tokens.length) {
+                                edits.push({ start: span.start, end: span.end, text: writeAttrsTokens(updated) });
+                            }
+                        } else {
+                            const updated = addObjectToken(tokens, objectShort);
+                            if (updated.length !== tokens.length) {
+                                edits.push({ start: span.start, end: span.end, text: writeAttrsTokens(updated) });
+                            }
+                        }
+                    } else {
+                        // Create new annotation with object token
+                        if (isSoftFragment) {
+                            const fragment = full.split('#')[1];
+                            edits.push({ start: result.length, end: result.length, text: `\n[${objectShort}] {=?#${fragment} ?${predShort}}` });
+                        } else {
+                            edits.push({ start: result.length, end: result.length, text: `\n[${objectShort}] {=?${objectShort} ?${predShort}}` });
+                        }
+                    }
                     return;
                 }
             }
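
The `=?` and `=?#` handling added to `parseSemanticBlock` and `processAnnotation` relies on two small ideas: prefix checks must run longest-first, and a soft fragment is resolved against the current subject with any existing fragment stripped. The sketch below restates both in isolation. `classifyToken` and `resolveSoftFragment` are hypothetical names used for illustration, and the real parser also records source ranges and handles other token kinds that are omitted here.

```javascript
// Hypothetical, simplified restatement of the prefix checks added in the diff.
function classifyToken(token) {
    // Longest prefix first: '=?#' and '=?' both start with '=', so testing '='
    // first would misread object declarations as subject declarations.
    if (token.startsWith('=?#')) return { kind: 'softFragment', fragment: token.substring(3) };
    if (token.startsWith('=?')) return { kind: 'object', iri: token.substring(2) };
    if (token.startsWith('=')) return { kind: 'subject', iri: token.substring(1) };
    return { kind: 'other', raw: token };
}

// Resolve a soft fragment against the current subject IRI.
// Any fragment already present on the subject is dropped before appending.
function resolveSoftFragment(currentSubjectIRI, fragment) {
    if (!currentSubjectIRI) return null; // no subject in scope, nothing to resolve against
    const baseIRI = currentSubjectIRI.split('#')[0];
    return `${baseIRI}#${fragment}`;
}

console.log(classifyToken('=?#overview'));    // kind: 'softFragment'
console.log(classifyToken('=?ex:armstrong')); // kind: 'object'
console.log(classifyToken('=ex:apollo11'));   // kind: 'subject'
console.log(resolveSoftFragment('http://example.org/apollo11#crew', 'overview'));
// prints "http://example.org/apollo11#overview"
```

Returning `null` when no subject is in scope matches the diff's behavior, where `localObject` stays `null` if `state.currentSubject` is unset.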

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
 	"name": "mdld-parse",
-	"version": "0.2.6",
+	"version": "0.2.8",
 	"description": "A standards-compliant parser for **MD-LD (Markdown-Linked Data)** — a human-friendly RDF authoring format that extends Markdown with semantic annotations.",
 	"type": "module",
 	"main": "index.js",