npm - @remnic/export-weclone - Versions diffs - 1.0.1 - Mend

@remnic/export-weclone 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 Joshua Warren
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,171 @@
+# @remnic/export-weclone
+Export [Remnic](https://github.com/joshuaswarren/remnic) memories as
+[WeClone](https://github.com/xming521/weclone)-compatible fine-tuning
+datasets. Produces Alpaca-format JSON consumable by
+[LLaMA Factory](https://github.com/hiyouga/LLaMA-Factory), which WeClone
+drives under the hood.
+This package solves the noisy-chat-log problem: WeClone normally trains on
+raw Telegram / WeChat exports, which include spam, one-word replies, and
+PII. Remnic has already distilled your conversations into structured
+facts, preferences, entities, and topics — a much higher
+signal-to-noise source for a personal digital avatar.
+## Install
+```bash
+pnpm add @remnic/export-weclone
+# or: npm i @remnic/export-weclone
+```
+`@remnic/export-weclone` depends on `@remnic/core` and is intended to be
+used alongside an existing Remnic memory store.
+## Quick start
+The primary entry point is the `remnic` CLI (see
+[`@remnic/cli`](../remnic-cli)). Importing this package as a side-effect
+registers the `weclone` adapter with the core training-export registry:
+```bash
+remnic training:export --format weclone --output ./weclone-dataset.json
+```
+Common options:
+```bash
+# Restrict to high-confidence memories created in 2026:
+remnic training:export \
+  --format weclone \
+  --output ./weclone.json \
+  --since 2026-01-01 \
+  --until 2027-01-01 \
+  --min-confidence 0.7
+# Restrict to specific categories:
+remnic training:export \
+  --format weclone \
+  --output ./weclone.json \
+  --categories preference,fact,skill
+# Generate conversational Q/A pairs instead of raw facts:
+remnic training:export \
+  --format weclone \
+  --output ./weclone.json \
+  --synthesize
+# Preview only (no file written):
+remnic training:export --format weclone --output /tmp/preview.json --dry-run
+```
+## Output format
+WeClone / LLaMA Factory expect [Alpaca
+JSON](https://github.com/tatsu-lab/stanford_alpaca#data-release):
+```json
+[
+  {
+    "instruction": "What kind of coffee do you like?",
+    "input": "",
+    "output": "dark roast, ethiopian yirgacheffe. something about that fruity wine-like flavor..."
+  }
+]
+```
+The adapter emits only the three Alpaca fields. Remnic metadata
+(`category`, `confidence`, `sourceIds`) is stripped from the output file
+but is preserved on the in-memory records so callers building their own
+pipelines can inspect it before serialization.
+## Programmatic API
+```ts
+import {
+  ensureWecloneExportAdapterRegistered,
+  wecloneExportAdapter,
+  synthesizeTrainingPairs,
+  extractStyleMarkers,
+  sweepPii,
+} from "@remnic/export-weclone";
+import {
+  convertMemoriesToRecords,
+  getTrainingExportAdapter,
+} from "@remnic/core";
+// Side-effect import is usually enough, but explicit registration is safe:
+ensureWecloneExportAdapterRegistered();
+const records = await convertMemoriesToRecords({
+  memoryDir: "/path/to/memory",
+  minConfidence: 0.7,
+});
+const pairs = synthesizeTrainingPairs(records, { maxPairsPerRecord: 2 });
+const { cleanRecords, redactedCount } = sweepPii(pairs);
+const adapter = getTrainingExportAdapter("weclone");
+const json = adapter!.formatRecords(cleanRecords);
+```
+### `synthesizeTrainingPairs(records, opts)`
+Turns flat memory records into natural conversational Q/A pairs using
+category-driven templates (preferences, opinions, expertise, personal).
+Pure templates — no LLM calls. Optionally applies style markers (e.g.
+lowercase normalization) extracted from the user's own transcripts.
+### `extractStyleMarkers(samples)`
+Analyses text samples with regex-and-count heuristics and returns a
+`StyleMarkers` profile (`avgSentenceLength`, `usesEmoji`, `formality`,
+`usesLowercase`, `commonPhrases`). Used by `synthesizeTrainingPairs` to
+match the output tone to the user's own writing style.
+### `sweepPii(records)`
+Belt-and-suspenders PII redaction for email, SSN, credit-card, IP, and
+phone patterns. Runs after Remnic's own privacy controls so that even if
+something slips through the upstream filter, the final dataset cannot leak
+these patterns. Returns `{ cleanRecords, redactedCount, redactionDetails }`.
+## How synthesis works
+Remnic memories are facts, not conversations. The synthesizer maps each
+memory category to a template group and generates a corresponding
+question, using any parenthesised tags in the instruction as the topic:
+```
+Category:  preference
+Memory:    "Dark roast coffee, Ethiopian Yirgacheffe specifically"
+Tags:      food, coffee
+Generated pair:
+  instruction: "What kind of food, coffee do you like?"
+  output:      "Dark roast coffee, Ethiopian Yirgacheffe specifically"
+```
+Question templates live in `src/synthesizer.ts`. Adding a new category
+mapping is a one-line change.
+## Privacy posture
+- Output JSON contains only `instruction`, `input`, `output`.
+- Remnic metadata (`sourceIds`, etc.) is **not** written to the dataset
+  file — even the record IDs stay in the memory store.
+- `sweepPii` runs by default in the CLI. Disable only with
+  `--no-privacy-sweep` and only when you have a compensating control.
+- Symlinks and hard-linked `.md` files under `memoryDir` are refused by
+  the core converter to block data-exfiltration vectors out of the memory
+  store (see `packages/remnic-core/src/training-export/converter.ts`).
+## Related
+- Tracking issue: [remnic#459](https://github.com/joshuaswarren/remnic/issues/459)
+- Upstream: [WeClone](https://github.com/xming521/weclone)
+- Format: [Alpaca JSON via LLaMA Factory](https://github.com/hiyouga/LLaMA-Factory)
+## License
+MIT. See the root [LICENSE](../../LICENSE) file.
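Note on the "How synthesis works" section of the README above: the question-generation step can be sketched in a few lines. This is a minimal illustration, not the package's code. The helper names `topicFromInstruction` and `questionFor` are hypothetical, and the sample instruction string is an assumed input modelled on the README's tagged example.

```typescript
// Illustrative sketch only: mirrors the README's description of template-based
// question generation. These helpers are hypothetical, not package exports.
const PREFERENCE_TEMPLATES: string[] = [
  "What kind of {topic} do you like?",
  "What's your preference for {topic}?",
  "What are your favorite {topic}?",
];

// Use the first parenthesised group as the topic; fall back to "this".
function topicFromInstruction(instruction: string): string {
  const match = instruction.match(/\(([^()]+)\)/);
  return match?.[1]?.trim().toLowerCase() ?? "this";
}

// Rotate through the templates by record index so the output is deterministic.
function questionFor(instruction: string, recordIndex: number): string {
  const template =
    PREFERENCE_TEMPLATES[recordIndex % PREFERENCE_TEMPLATES.length] ??
    "Tell me about {topic}.";
  return template.replace("{topic}", topicFromInstruction(instruction));
}

// Assumed sample instructions, with and without parenthesised tags.
const withTags = questionFor("Recall a user preference (food, coffee)", 0);
const withoutTags = questionFor("Recall a user preference", 1);

console.log(withTags);    // What kind of food, coffee do you like?
console.log(withoutTags); // What's your preference for this?
```

The tags are substituted verbatim, so a multi-tag memory yields a question such as "What kind of food, coffee do you like?", matching the README's own example.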

package/dist/index.d.ts ADDED

@@ -0,0 +1,105 @@
+import { TrainingExportAdapter, TrainingExportRecord } from '@remnic/core';
+/**
+ * WeClone Alpaca-format training export adapter.
+ *
+ * Converts TrainingExportRecord[] into the JSON format that
+ * WeClone / LLaMA Factory expects for fine-tuning:
+ *
+ *   [{ "instruction": "...", "input": "", "output": "..." }, ...]
+ *
+ * Only the three Alpaca fields are emitted; Remnic-specific
+ * metadata (category, confidence, sourceIds) is stripped.
+ */
+declare const wecloneExportAdapter: TrainingExportAdapter;
+/**
+ * Communication style marker extraction.
+ *
+ * Analyzes text samples using simple heuristics to produce
+ * a StyleMarkers profile.  No LLM calls — pure regex and
+ * counting.
+ */
+interface StyleMarkers {
+    avgSentenceLength: number;
+    usesEmoji: boolean;
+    formality: "formal" | "casual" | "mixed";
+    usesLowercase: boolean;
+    commonPhrases: string[];
+}
+/**
+ * Analyse text samples and extract communication style markers.
+ */
+declare function extractStyleMarkers(samples: string[]): StyleMarkers;
+/**
+ * Training-pair synthesizer.
+ *
+ * Converts Remnic's flat TrainingExportRecord[] — where
+ * `instruction` is a natural-language description and
+ * `category` identifies the memory type — into natural
+ * conversational question-answer pairs suitable for
+ * WeClone / LLaMA Factory fine-tuning.
+ *
+ * Uses template-based question generation (no LLM calls).
+ */
+interface SynthesizerOptions {
+    styleMarkers?: StyleMarkers;
+    maxPairsPerRecord?: number;
+}
+/**
+ * Synthesize natural conversational training pairs from
+ * category-tagged memory records.
+ */
+declare function synthesizeTrainingPairs(records: TrainingExportRecord[], options?: SynthesizerOptions): TrainingExportRecord[];
+/**
+ * PII privacy sweep for training export records.
+ *
+ * Belt-and-suspenders check that runs after Remnic's own
+ * privacy controls.  Scans instruction, input, and output
+ * fields for common PII patterns and replaces matches with
+ * [REDACTED].
+ */
+interface PrivacySweepResult {
+    cleanRecords: TrainingExportRecord[];
+    redactedCount: number;
+    redactionDetails: {
+        index: number;
+        field: string;
+        pattern: string;
+    }[];
+}
+/**
+ * Scan and redact PII from training export records.
+ *
+ * Returns a new array of cleaned records, leaving the originals
+ * unmodified.  The `redactedCount` is the number of records that
+ * had at least one redaction.  `redactionDetails` lists every
+ * individual match with its record index, field, and pattern name.
+ */
+declare function sweepPii(records: TrainingExportRecord[]): PrivacySweepResult;
+/**
+ * @remnic/export-weclone
+ *
+ * WeClone-specific training-data export adapter that converts
+ * Remnic memories into Alpaca-format fine-tuning datasets
+ * compatible with WeClone / LLaMA Factory.
+ */
+/**
+ * Idempotently register the WeClone adapter with the core training-export
+ * registry. Callable multiple times without throwing (CLAUDE.md #13:
+ * secondary calls must not crash host processes that pre-register the
+ * adapter for test fixtures).
+ *
+ * Returns true when the adapter was newly registered, false when an adapter
+ * with the same name already exists.
+ */
+declare function ensureWecloneExportAdapterRegistered(): boolean;
+export { type PrivacySweepResult, type StyleMarkers, type SynthesizerOptions, ensureWecloneExportAdapterRegistered, extractStyleMarkers, sweepPii, synthesizeTrainingPairs, wecloneExportAdapter };
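Note on `PrivacySweepResult` as declared above: a caller that wants a quick audit summary could tally `redactionDetails` by pattern name. The sketch below assumes only the declared field shapes. `RedactionDetail` is a local mirror of the element type, `countByPattern` is a hypothetical helper, and the sample entries are hand-written, not real output.

```typescript
// Local mirror of the `redactionDetails` element type from the declaration above.
interface RedactionDetail {
  index: number;
  field: string;
  pattern: string;
}

// Hypothetical helper (not exported by the package): count entries per pattern name.
function countByPattern(details: RedactionDetail[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const detail of details) {
    counts.set(detail.pattern, (counts.get(detail.pattern) ?? 0) + 1);
  }
  return counts;
}

// Hand-written sample data in the declared shape; not produced by sweepPii.
const sampleDetails: RedactionDetail[] = [
  { index: 0, field: "output", pattern: "email" },
  { index: 0, field: "output", pattern: "phone" },
  { index: 2, field: "instruction", pattern: "email" },
];

const tally = countByPattern(sampleDetails);
// Distinct record indexes, analogous to what `redactedCount` is documented to report.
const affectedRecords = new Set(sampleDetails.map((d) => d.index)).size;

console.log(Array.from(tally), affectedRecords);
```

Counting distinct `index` values separately from the per-pattern tally keeps the two numbers comparable with `redactedCount`, which is documented as a per-record count.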

package/dist/index.js ADDED

@@ -0,0 +1,337 @@
+// openclaw-engram: Local-first memory plugin
+// src/index.ts
+import {
+  getTrainingExportAdapter,
+  registerTrainingExportAdapter
+} from "@remnic/core";
+// src/adapter.ts
+var wecloneExportAdapter = {
+  name: "weclone",
+  fileExtension: ".json",
+  formatRecords(records) {
+    const alpacaRecords = records.map((r) => ({
+      instruction: r.instruction,
+      input: r.input,
+      output: r.output
+    }));
+    return JSON.stringify(alpacaRecords, null, 2);
+  }
+};
+// src/synthesizer.ts
+var DEFAULT_MAX_PAIRS = 1;
+var QUESTION_TEMPLATES = {
+  preferences: [
+    "What kind of {topic} do you like?",
+    "What's your preference for {topic}?",
+    "What are your favorite {topic}?"
+  ],
+  opinions: [
+    "What do you think about {topic}?",
+    "How do you feel about {topic}?",
+    "What's your opinion on {topic}?"
+  ],
+  expertise: [
+    "Tell me about {topic}.",
+    "What do you know about {topic}?",
+    "Can you explain {topic}?"
+  ],
+  personal: [
+    "Can you tell me about your {topic}?",
+    "Tell me about your {topic}.",
+    "What can you share about your {topic}?"
+  ]
+};
+var DEFAULT_TEMPLATES = [
+  "Tell me about {topic}.",
+  "What can you share about {topic}?"
+];
+var CATEGORY_TO_TEMPLATE = {
+  preference: "preferences",
+  fact: "expertise",
+  entity: "expertise",
+  skill: "expertise",
+  correction: "opinions",
+  decision: "opinions",
+  principle: "opinions",
+  rule: "opinions",
+  personal: "personal",
+  relationship: "personal",
+  commitment: "personal",
+  moment: "personal"
+};
+function synthesizeTrainingPairs(records, options) {
+  const maxPairs = options?.maxPairsPerRecord ?? DEFAULT_MAX_PAIRS;
+  const style = options?.styleMarkers;
+  const result = [];
+  for (let i = 0; i < records.length; i++) {
+    const record = records[i];
+    const templateKey = resolveTemplateKey(record.category);
+    const topic = extractTopic(record.instruction);
+    const templates = QUESTION_TEMPLATES[templateKey] ?? DEFAULT_TEMPLATES;
+    const pairCount = Math.min(maxPairs, templates.length);
+    for (let j = 0; j < pairCount; j++) {
+      const templateIndex = (i + j) % templates.length;
+      const question = templates[templateIndex].replace("{topic}", topic);
+      let output = record.output;
+      if (style?.usesLowercase) {
+        output = output.toLowerCase();
+      }
+      result.push({
+        instruction: question,
+        input: "",
+        output,
+        category: record.category,
+        confidence: record.confidence,
+        sourceIds: record.sourceIds
+      });
+    }
+  }
+  return result;
+}
+function resolveTemplateKey(category) {
+  if (!category) return "";
+  return CATEGORY_TO_TEMPLATE[category.toLowerCase()] ?? "";
+}
+function extractTopic(instruction) {
+  const tagMatch = instruction.match(/\(([^()]+)\)/);
+  if (tagMatch) {
+    return tagMatch[1].trim().toLowerCase();
+  }
+  return "this";
+}
+// src/style-extractor.ts
+var EMOJI_RE = /[\u{1F600}-\u{1F64F}\u{1F300}-\u{1F5FF}\u{1F680}-\u{1F6FF}\u{1F1E0}-\u{1F1FF}\u{2600}-\u{27BF}\u{2702}-\u{27B0}\u{FE00}-\u{FE0F}\u{1FA00}-\u{1FA6F}\u{1FA70}-\u{1FAFF}\u{2328}\u{23CF}\u{23E9}-\u{23F3}\u{23F8}-\u{23FA}\u{200D}\u{20E3}\u{FE0F}\u{E0020}-\u{E007F}\u{2B50}\u{2B55}\u{2934}\u{2935}\u{25AA}-\u{25FE}\u{2600}-\u{26FF}\u{2700}-\u{27BF}\u{231A}\u{231B}\u{23E9}-\u{23F3}\u{23F8}-\u{23FA}\u{25FB}-\u{25FE}\u{2614}\u{2615}\u{2648}-\u{2653}\u{267F}\u{2693}\u{26A1}\u{26AA}\u{26AB}\u{26BD}\u{26BE}\u{26C4}\u{26C5}\u{26CE}\u{26D4}\u{26EA}\u{26F2}\u{26F3}\u{26F5}\u{26FA}\u{26FD}\u{2702}\u{2705}\u{2708}-\u{270D}\u{270F}\u{2712}\u{2714}\u{2716}\u{271D}\u{2721}\u{2728}\u{2733}\u{2734}\u{2744}\u{2747}\u{274C}\u{274E}\u{2753}-\u{2755}\u{2757}\u{2763}\u{2764}\u{2795}-\u{2797}\u{27A1}\u{27B0}\u{27BF}\u{2934}\u{2935}]/u;
+var FORMAL_MARKERS = [
+  "furthermore",
+  "however",
+  "therefore",
+  "moreover",
+  "consequently",
+  "nevertheless",
+  "in addition",
+  "accordingly",
+  "subsequently",
+  "regarding",
+  "pertaining",
+  "shall",
+  "hereby",
+  "whereas",
+  "notwithstanding",
+  "henceforth",
+  "aforementioned",
+  "please consider",
+  "would like to",
+  "i would",
+  "appreciation",
+  "recommendations",
+  "thoroughly",
+  "documentation"
+];
+var CASUAL_MARKERS = [
+  "gonna",
+  "wanna",
+  "kinda",
+  "sorta",
+  "gotta",
+  "dunno",
+  "lemme",
+  "yeah",
+  "yep",
+  "nah",
+  "lol",
+  "omg",
+  "tbh",
+  "imo",
+  "btw",
+  "nope",
+  "cuz",
+  "tho",
+  "ain't",
+  "y'all",
+  "awesome",
+  "cool",
+  "dude",
+  "bro",
+  "bruh"
+];
+var MIN_PHRASE_FREQUENCY = 2;
+var MAX_COMMON_PHRASES = 10;
+function extractStyleMarkers(samples) {
+  if (samples.length === 0) {
+    return {
+      avgSentenceLength: 0,
+      usesEmoji: false,
+      formality: "mixed",
+      usesLowercase: false,
+      commonPhrases: []
+    };
+  }
+  const joined = samples.join(" ");
+  return {
+    avgSentenceLength: calcAvgSentenceLength(joined),
+    usesEmoji: detectEmoji(joined),
+    formality: detectFormality(joined),
+    usesLowercase: detectLowercase(joined),
+    commonPhrases: findCommonPhrases(samples)
+  };
+}
+function calcAvgSentenceLength(text) {
+  const sentences = text.split(/[.!?]+/).map((s) => s.trim()).filter((s) => s.length > 0);
+  if (sentences.length === 0) return 0;
+  const totalWords = sentences.reduce((sum, s) => {
+    const words = s.split(/\s+/).filter((w) => w.length > 0);
+    return sum + words.length;
+  }, 0);
+  return Math.round(totalWords / sentences.length * 10) / 10;
+}
+function detectEmoji(text) {
+  return EMOJI_RE.test(text);
+}
+function detectFormality(text) {
+  const lower = text.toLowerCase();
+  let formalScore = 0;
+  for (const marker of FORMAL_MARKERS) {
+    if (new RegExp(`\\b${marker}\\b`, "i").test(lower)) formalScore++;
+  }
+  let casualScore = 0;
+  for (const marker of CASUAL_MARKERS) {
+    if (new RegExp(`\\b${marker}\\b`, "i").test(lower)) casualScore++;
+  }
+  const THRESHOLD = 2;
+  if (formalScore >= THRESHOLD && formalScore > casualScore) return "formal";
+  if (casualScore >= THRESHOLD && casualScore > formalScore) return "casual";
+  return "mixed";
+}
+function detectLowercase(text) {
+  const sentences = text.split(/[.!?]+/).map((s) => s.trim()).filter((s) => s.length > 0);
+  if (sentences.length === 0) return false;
+  const lowercaseStarts = sentences.filter((s) => {
+    const firstChar = s.charAt(0);
+    return firstChar === firstChar.toLowerCase() && firstChar !== firstChar.toUpperCase();
+  }).length;
+  return lowercaseStarts / sentences.length > 0.5;
+}
+function isAlnum(ch) {
+  const c = ch.charCodeAt(0);
+  return c >= 48 && c <= 57 || // 0-9
+  c >= 65 && c <= 90 || // A-Z
+  c >= 97 && c <= 122;
+}
+function trimNonAlnum(word) {
+  let start = 0;
+  let end = word.length;
+  while (start < end && !isAlnum(word.charAt(start))) start++;
+  while (end > start && !isAlnum(word.charAt(end - 1))) end--;
+  return start === 0 && end === word.length ? word : word.slice(start, end);
+}
+function findCommonPhrases(samples) {
+  const phraseCount = /* @__PURE__ */ new Map();
+  for (const sample of samples) {
+    const words = sample.split(/\s+/).map((w) => trimNonAlnum(w)).filter((w) => w.length > 0);
+    const seenInSample = /* @__PURE__ */ new Set();
+    for (let ngramSize = 2; ngramSize <= 3; ngramSize++) {
+      for (let i = 0; i <= words.length - ngramSize; i++) {
+        const phrase = words.slice(i, i + ngramSize).join(" ").toLowerCase();
+        if (!seenInSample.has(phrase)) {
+          seenInSample.add(phrase);
+          phraseCount.set(phrase, (phraseCount.get(phrase) ?? 0) + 1);
+        }
+      }
+    }
+  }
+  return [...phraseCount.entries()].filter(([, count]) => count >= MIN_PHRASE_FREQUENCY).sort((a, b) => {
+    if (b[1] !== a[1]) return b[1] - a[1];
+    return a[0].localeCompare(b[0]);
+  }).slice(0, MAX_COMMON_PHRASES).map(([phrase]) => phrase);
+}
+// src/privacy.ts
+var PII_PATTERNS = [
+  {
+    // Email: user@domain.tld
+    name: "email",
+    regex: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g
+  },
+  {
+    // SSN: 123-45-6789 (exactly 3-2-4 digit groups)
+    name: "ssn",
+    regex: /\b\d{3}-\d{2}-\d{4}\b/g
+  },
+  {
+    // Credit card: 4 groups of 4 digits separated by dashes or spaces
+    name: "credit_card",
+    regex: /\b\d{4}[-\s]\d{4}[-\s]\d{4}[-\s]\d{4}\b/g
+  },
+  {
+    // IP address: four octets 0-255
+    name: "ip_address",
+    regex: /\b(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\b/g
+  },
+  {
+    // Phone: optional +1- prefix, then 3-3-4 with dashes, dots, or spaces
+    // Also matches (555) 123-4567 format
+    name: "phone",
+    regex: /(?:\+\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}\b/g
+  }
+];
+var SCANNED_FIELDS = [
+  "instruction",
+  "input",
+  "output"
+];
+function sweepPii(records) {
+  const redactionDetails = [];
+  const recordHasRedaction = /* @__PURE__ */ new Set();
+  const cleanRecords = records.map((record, idx) => {
+    const cleaned = { ...record };
+    for (const field of SCANNED_FIELDS) {
+      let value = record[field];
+      if (!value) continue;
+      for (const pattern of PII_PATTERNS) {
+        pattern.regex.lastIndex = 0;
+        if (pattern.regex.test(value)) {
+          pattern.regex.lastIndex = 0;
+          value = value.replace(pattern.regex, "[REDACTED]");
+          recordHasRedaction.add(idx);
+          redactionDetails.push({
+            index: idx,
+            field,
+            pattern: pattern.name
+          });
+        }
+      }
+      cleaned[field] = value;
+    }
+    return cleaned;
+  });
+  return {
+    cleanRecords,
+    redactedCount: recordHasRedaction.size,
+    redactionDetails
+  };
+}
+// src/index.ts
+function ensureWecloneExportAdapterRegistered() {
+  if (getTrainingExportAdapter(wecloneExportAdapter.name) !== void 0) {
+    return false;
+  }
+  registerTrainingExportAdapter(wecloneExportAdapter);
+  return true;
+}
+try {
+  ensureWecloneExportAdapterRegistered();
+} catch {
+}
+export {
+  ensureWecloneExportAdapterRegistered,
+  extractStyleMarkers,
+  sweepPii,
+  synthesizeTrainingPairs,
+  wecloneExportAdapter
+};
+//# sourceMappingURL=index.js.map
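Note on `PII_PATTERNS` in the file above: the `credit_card` pattern requires a dash or whitespace between each group of four digits. The sketch below copies that one regex from the diff and applies it in isolation. `redactCards` is a hypothetical helper, and the card numbers are well-known test values rather than real data.

```typescript
// Pattern copied from the `credit_card` entry in the diff above.
const CREDIT_CARD_RE = /\b\d{4}[-\s]\d{4}[-\s]\d{4}[-\s]\d{4}\b/g;

// Hypothetical helper: apply only this one pattern, replacing matches
// with the same "[REDACTED]" marker that sweepPii uses.
function redactCards(text: string): string {
  return text.replace(CREDIT_CARD_RE, "[REDACTED]");
}

const separated = redactCards("card 4111-1111-1111-1111 on file");
const contiguous = redactCards("card 4111111111111111 on file");

console.log(separated);  // card [REDACTED] on file
console.log(contiguous); // card 4111111111111111 on file (no separators, so no match)
```

A 16-digit number written without separators is left untouched by this pattern, so reviewers relying on the sweep may want to check whether their data contains unseparated card numbers.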

package/dist/index.js.map ADDED

@@ -0,0 +1 @@

+ {"version":3,"sources":["../src/index.ts","../src/adapter.ts","../src/synthesizer.ts","../src/style-extractor.ts","../src/privacy.ts"],"sourcesContent":["/**\n * @remnic/export-weclone\n *\n * WeClone-specific training-data export adapter that converts\n * Remnic memories into Alpaca-format fine-tuning datasets\n * compatible with WeClone / LLaMA Factory.\n */\n\nimport {\n getTrainingExportAdapter,\n registerTrainingExportAdapter,\n} from \"@remnic/core\";\n\nimport { wecloneExportAdapter } from \"./adapter.js\";\n\nexport { wecloneExportAdapter } from \"./adapter.js\";\nexport { synthesizeTrainingPairs, type SynthesizerOptions } from \"./synthesizer.js\";\nexport { extractStyleMarkers, type StyleMarkers } from \"./style-extractor.js\";\nexport { sweepPii, type PrivacySweepResult } from \"./privacy.js\";\n\n/**\n * Idempotently register the WeClone adapter with the core training-export\n * registry. Callable multiple times without throwing (CLAUDE.md #13:\n * secondary calls must not crash host processes that pre-register the\n * adapter for test fixtures).\n *\n * Returns true when the adapter was newly registered, false when an adapter\n * with the same name already exists.\n */\nexport function ensureWecloneExportAdapterRegistered(): boolean {\n if (getTrainingExportAdapter(wecloneExportAdapter.name) !== undefined) {\n return false;\n }\n registerTrainingExportAdapter(wecloneExportAdapter);\n return true;\n}\n\n// Side-effect registration: importing this module registers the adapter.\n// Callers that need to manage registration manually (e.g. 
tests that call\n// `clearTrainingExportAdapters()`) can re-invoke\n// `ensureWecloneExportAdapterRegistered()` after clearing.\n//\n// The try/catch keeps import-time errors from breaking unrelated callers —\n// the adapter surfaces `formatRecords` purely, so a failure here would be\n// surprising, but defensive coding keeps CLI startup resilient.\ntry {\n ensureWecloneExportAdapterRegistered();\n} catch {\n // Swallow — explicit callers can re-invoke ensureWecloneExportAdapterRegistered().\n}\n","/**\n * WeClone Alpaca-format training export adapter.\n *\n * Converts TrainingExportRecord[] into the JSON format that\n * WeClone / LLaMA Factory expects for fine-tuning:\n *\n * [{ \"instruction\": \"...\", \"input\": \"\", \"output\": \"...\" }, ...]\n *\n * Only the three Alpaca fields are emitted; Remnic-specific\n * metadata (category, confidence, sourceIds) is stripped.\n */\n\nimport type { TrainingExportAdapter, TrainingExportRecord } from \"@remnic/core\";\n\nexport const wecloneExportAdapter: TrainingExportAdapter = {\n name: \"weclone\",\n fileExtension: \".json\",\n\n formatRecords(records: TrainingExportRecord[]): string {\n const alpacaRecords = records.map((r) => ({\n instruction: r.instruction,\n input: r.input,\n output: r.output,\n }));\n return JSON.stringify(alpacaRecords, null, 2);\n },\n};\n","/**\n * Training-pair synthesizer.\n *\n * Converts Remnic's flat TrainingExportRecord[] — where\n * `instruction` is a natural-language description and\n * `category` identifies the memory type — into natural\n * conversational question-answer pairs suitable for\n * WeClone / LLaMA Factory fine-tuning.\n *\n * Uses template-based question generation (no LLM calls).\n */\n\nimport type { TrainingExportRecord } from \"@remnic/core\";\nimport type { StyleMarkers } from \"./style-extractor.js\";\n\nexport interface SynthesizerOptions {\n styleMarkers?: StyleMarkers;\n maxPairsPerRecord?: number;\n}\n\n/** Default limit for pairs generated per input record. 
*/\nconst DEFAULT_MAX_PAIRS = 1;\n\n/**\n * Question templates keyed by template group.\n * Each array provides variety; the synthesizer picks\n * based on record index for deterministic output.\n */\nconst QUESTION_TEMPLATES: Record<string, string[]> = {\n preferences: [\n \"What kind of {topic} do you like?\",\n \"What's your preference for {topic}?\",\n \"What are your favorite {topic}?\",\n ],\n opinions: [\n \"What do you think about {topic}?\",\n \"How do you feel about {topic}?\",\n \"What's your opinion on {topic}?\",\n ],\n expertise: [\n \"Tell me about {topic}.\",\n \"What do you know about {topic}?\",\n \"Can you explain {topic}?\",\n ],\n personal: [\n \"Can you tell me about your {topic}?\",\n \"Tell me about your {topic}.\",\n \"What can you share about your {topic}?\",\n ],\n};\n\nconst DEFAULT_TEMPLATES = [\n \"Tell me about {topic}.\",\n \"What can you share about {topic}?\",\n];\n\n/**\n * Maps record.category values (from core converter) to\n * QUESTION_TEMPLATES keys. Categories not listed here\n * fall through to DEFAULT_TEMPLATES.\n */\nconst CATEGORY_TO_TEMPLATE: Record<string, string> = {\n preference: \"preferences\",\n fact: \"expertise\",\n entity: \"expertise\",\n skill: \"expertise\",\n correction: \"opinions\",\n decision: \"opinions\",\n principle: \"opinions\",\n rule: \"opinions\",\n personal: \"personal\",\n relationship: \"personal\",\n commitment: \"personal\",\n moment: \"personal\",\n};\n\n/**\n * Synthesize natural conversational training pairs from\n * category-tagged memory records.\n */\nexport function synthesizeTrainingPairs(\n records: TrainingExportRecord[],\n options?: SynthesizerOptions,\n): TrainingExportRecord[] {\n const maxPairs = options?.maxPairsPerRecord ?? 
DEFAULT_MAX_PAIRS;\n const style = options?.styleMarkers;\n const result: TrainingExportRecord[] = [];\n\n for (let i = 0; i < records.length; i++) {\n const record = records[i];\n const templateKey = resolveTemplateKey(record.category);\n const topic = extractTopic(record.instruction);\n const templates = QUESTION_TEMPLATES[templateKey] ?? DEFAULT_TEMPLATES;\n\n const pairCount = Math.min(maxPairs, templates.length);\n\n for (let j = 0; j < pairCount; j++) {\n const templateIndex = (i + j) % templates.length;\n const question = templates[templateIndex].replace(\"{topic}\", topic);\n let output = record.output;\n\n if (style?.usesLowercase) {\n output = output.toLowerCase();\n }\n\n result.push({\n instruction: question,\n input: \"\",\n output,\n category: record.category,\n confidence: record.confidence,\n sourceIds: record.sourceIds,\n });\n }\n }\n\n return result;\n}\n\n// ── Internals ────────────────────────────────────────────\n\n/**\n * Resolve a record's category field to a QUESTION_TEMPLATES key.\n * Falls back to empty string (which triggers DEFAULT_TEMPLATES).\n */\nfunction resolveTemplateKey(category: string | undefined): string {\n if (!category) return \"\";\n return CATEGORY_TO_TEMPLATE[category.toLowerCase()] ?? \"\";\n}\n\n/**\n * Extract a human-readable topic from the instruction string.\n *\n * The core converter produces instructions like:\n * \"Recall a factual memory (food, cooking)\"\n * \"Recall a user preference\"\n *\n * When parenthesized tags are present, use them as the topic.\n * Otherwise fall back to \"this\".\n */\nfunction extractTopic(instruction: string): string {\n const tagMatch = instruction.match(/\\(([^()]+)\\)/);\n if (tagMatch) {\n return tagMatch[1].trim().toLowerCase();\n }\n return \"this\";\n}\n","/**\n * Communication style marker extraction.\n *\n * Analyzes text samples using simple heuristics to produce\n * a StyleMarkers profile. 
No LLM calls — pure regex and\n * counting.\n */\n\nexport interface StyleMarkers {\n avgSentenceLength: number;\n usesEmoji: boolean;\n formality: \"formal\" | \"casual\" | \"mixed\";\n usesLowercase: boolean;\n commonPhrases: string[];\n}\n\n/**\n * Regex matching most common emoji code-point ranges.\n * Covers Emoticons, Dingbats, Transport/Map symbols,\n * Misc symbols, and supplemental blocks.\n */\nconst EMOJI_RE =\n /[\\u{1F600}-\\u{1F64F}\\u{1F300}-\\u{1F5FF}\\u{1F680}-\\u{1F6FF}\\u{1F1E0}-\\u{1F1FF}\\u{2600}-\\u{27BF}\\u{2702}-\\u{27B0}\\u{FE00}-\\u{FE0F}\\u{1FA00}-\\u{1FA6F}\\u{1FA70}-\\u{1FAFF}\\u{2328}\\u{23CF}\\u{23E9}-\\u{23F3}\\u{23F8}-\\u{23FA}\\u{200D}\\u{20E3}\\u{FE0F}\\u{E0020}-\\u{E007F}\\u{2B50}\\u{2B55}\\u{2934}\\u{2935}\\u{25AA}-\\u{25FE}\\u{2600}-\\u{26FF}\\u{2700}-\\u{27BF}\\u{231A}\\u{231B}\\u{23E9}-\\u{23F3}\\u{23F8}-\\u{23FA}\\u{25FB}-\\u{25FE}\\u{2614}\\u{2615}\\u{2648}-\\u{2653}\\u{267F}\\u{2693}\\u{26A1}\\u{26AA}\\u{26AB}\\u{26BD}\\u{26BE}\\u{26C4}\\u{26C5}\\u{26CE}\\u{26D4}\\u{26EA}\\u{26F2}\\u{26F3}\\u{26F5}\\u{26FA}\\u{26FD}\\u{2702}\\u{2705}\\u{2708}-\\u{270D}\\u{270F}\\u{2712}\\u{2714}\\u{2716}\\u{271D}\\u{2721}\\u{2728}\\u{2733}\\u{2734}\\u{2744}\\u{2747}\\u{274C}\\u{274E}\\u{2753}-\\u{2755}\\u{2757}\\u{2763}\\u{2764}\\u{2795}-\\u{2797}\\u{27A1}\\u{27B0}\\u{27BF}\\u{2934}\\u{2935}]/u;\n\n/** Words/phrases that signal formal register. */\nconst FORMAL_MARKERS = [\n \"furthermore\",\n \"however\",\n \"therefore\",\n \"moreover\",\n \"consequently\",\n \"nevertheless\",\n \"in addition\",\n \"accordingly\",\n \"subsequently\",\n \"regarding\",\n \"pertaining\",\n \"shall\",\n \"hereby\",\n \"whereas\",\n \"notwithstanding\",\n \"henceforth\",\n \"aforementioned\",\n \"please consider\",\n \"would like to\",\n \"i would\",\n \"appreciation\",\n \"recommendations\",\n \"thoroughly\",\n \"documentation\",\n];\n\n/** Words/phrases that signal casual register. 
*/\nconst CASUAL_MARKERS = [\n \"gonna\",\n \"wanna\",\n \"kinda\",\n \"sorta\",\n \"gotta\",\n \"dunno\",\n \"lemme\",\n \"yeah\",\n \"yep\",\n \"nah\",\n \"lol\",\n \"omg\",\n \"tbh\",\n \"imo\",\n \"btw\",\n \"nope\",\n \"cuz\",\n \"tho\",\n \"ain't\",\n \"y'all\",\n \"awesome\",\n \"cool\",\n \"dude\",\n \"bro\",\n \"bruh\",\n];\n\n/** Minimum occurrences for a phrase to count as \"common\". */\nconst MIN_PHRASE_FREQUENCY = 2;\n\n/** Maximum number of common phrases to return. */\nconst MAX_COMMON_PHRASES = 10;\n\n/**\n * Analyse text samples and extract communication style markers.\n */\nexport function extractStyleMarkers(samples: string[]): StyleMarkers {\n if (samples.length === 0) {\n return {\n avgSentenceLength: 0,\n usesEmoji: false,\n formality: \"mixed\",\n usesLowercase: false,\n commonPhrases: [],\n };\n }\n\n const joined = samples.join(\" \");\n\n return {\n avgSentenceLength: calcAvgSentenceLength(joined),\n usesEmoji: detectEmoji(joined),\n formality: detectFormality(joined),\n usesLowercase: detectLowercase(joined),\n commonPhrases: findCommonPhrases(samples),\n };\n}\n\n// ── Internals ────────────────────────────────────────────\n\nfunction calcAvgSentenceLength(text: string): number {\n // Split on sentence-ending punctuation, filter empties\n const sentences = text\n .split(/[.!?]+/)\n .map((s) => s.trim())\n .filter((s) => s.length > 0);\n\n if (sentences.length === 0) return 0;\n\n const totalWords = sentences.reduce((sum, s) => {\n const words = s.split(/\\s+/).filter((w) => w.length > 0);\n return sum + words.length;\n }, 0);\n\n return Math.round((totalWords / sentences.length) * 10) / 10;\n}\n\nfunction detectEmoji(text: string): boolean {\n return EMOJI_RE.test(text);\n}\n\nfunction detectFormality(text: string): \"formal\" | \"casual\" | \"mixed\" {\n const lower = text.toLowerCase();\n\n let formalScore = 0;\n for (const marker of FORMAL_MARKERS) {\n // Word-boundary matching prevents false positives\n // (e.g., \"tho\" matching 
inside \"those\" or \"method\")\n if (new RegExp(`\\\\b${marker}\\\\b`, \"i\").test(lower)) formalScore++;\n }\n\n let casualScore = 0;\n for (const marker of CASUAL_MARKERS) {\n if (new RegExp(`\\\\b${marker}\\\\b`, \"i\").test(lower)) casualScore++;\n }\n\n // Threshold: need at least 2 markers to declare a style\n const THRESHOLD = 2;\n\n if (formalScore >= THRESHOLD && formalScore > casualScore) return \"formal\";\n if (casualScore >= THRESHOLD && casualScore > formalScore) return \"casual\";\n return \"mixed\";\n}\n\nfunction detectLowercase(text: string): boolean {\n // Split into sentences and check what fraction start with lowercase\n const sentences = text\n .split(/[.!?]+/)\n .map((s) => s.trim())\n .filter((s) => s.length > 0);\n\n if (sentences.length === 0) return false;\n\n const lowercaseStarts = sentences.filter((s) => {\n const firstChar = s.charAt(0);\n return firstChar === firstChar.toLowerCase() && firstChar !== firstChar.toUpperCase();\n }).length;\n\n // Majority (>50%) of sentences start lowercase\n return lowercaseStarts / sentences.length > 0.5;\n}\n\n/**\n * Check whether a character is alphanumeric (ASCII a-z, A-Z, 0-9) using\n * code-point comparison. Pure function — no regex, no backtracking.\n */\nfunction isAlnum(ch: string): boolean {\n const c = ch.charCodeAt(0);\n return (\n (c >= 48 && c <= 57) || // 0-9\n (c >= 65 && c <= 90) || // A-Z\n (c >= 97 && c <= 122) // a-z\n );\n}\n\n/**\n * Strip leading and trailing non-alphanumeric characters from `word` using\n * a single linear scan on each side. This replaces the previous\n * `/^[^a-zA-Z0-9]+/` / `/[^a-zA-Z0-9]+$/` regexes, which CodeQL flagged as\n * polynomial ReDoS on uncontrolled input (e.g. long `///...///` runs).\n */\nfunction trimNonAlnum(word: string): string {\n let start = 0;\n let end = word.length;\n while (start < end && !isAlnum(word.charAt(start))) start++;\n while (end > start && !isAlnum(word.charAt(end - 1))) end--;\n return start === 0 && end === word.length ? 
word : word.slice(start, end);\n}\n\nfunction findCommonPhrases(samples: string[]): string[] {\n const phraseCount = new Map<string, number>();\n\n for (const sample of samples) {\n // Tokenize: split on whitespace, strip edge punctuation with a linear\n // scan (no regex) to eliminate the polynomial backtracking that the\n // previous `replace(/^[^a-zA-Z0-9]+/, \"\")` chain exposed.\n const words = sample\n .split(/\\s+/)\n .map((w) => trimNonAlnum(w))\n .filter((w) => w.length > 0);\n\n // Build 2-gram and 3-gram phrases\n const seenInSample = new Set<string>();\n for (let ngramSize = 2; ngramSize <= 3; ngramSize++) {\n for (let i = 0; i <= words.length - ngramSize; i++) {\n const phrase = words.slice(i, i + ngramSize).join(\" \").toLowerCase();\n // Only count once per sample to avoid inflating from repetition within one sample\n if (!seenInSample.has(phrase)) {\n seenInSample.add(phrase);\n phraseCount.set(phrase, (phraseCount.get(phrase) ?? 0) + 1);\n }\n }\n }\n }\n\n // Filter by minimum frequency and sort by count descending, then alphabetical for stability\n return [...phraseCount.entries()]\n .filter(([, count]) => count >= MIN_PHRASE_FREQUENCY)\n .sort((a, b) => {\n if (b[1] !== a[1]) return b[1] - a[1];\n return a[0].localeCompare(b[0]);\n })\n .slice(0, MAX_COMMON_PHRASES)\n .map(([phrase]) => phrase);\n}\n","/**\n * PII privacy sweep for training export records.\n *\n * Belt-and-suspenders check that runs after Remnic's own\n * privacy controls. 
Scans instruction, input, and output\n * fields for common PII patterns and replaces matches with\n * [REDACTED].\n */\n\nimport type { TrainingExportRecord } from \"@remnic/core\";\n\nexport interface PrivacySweepResult {\n cleanRecords: TrainingExportRecord[];\n redactedCount: number;\n redactionDetails: { index: number; field: string; pattern: string }[];\n}\n\ninterface PiiPattern {\n name: string;\n regex: RegExp;\n}\n\n/**\n * Ordered list of PII patterns.\n *\n * Order matters: more specific patterns (SSN, credit card)\n * come before broader ones (phone) to avoid partial matches.\n */\nconst PII_PATTERNS: PiiPattern[] = [\n {\n // Email: user@domain.tld\n name: \"email\",\n regex: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}/g,\n },\n {\n // SSN: 123-45-6789 (exactly 3-2-4 digit groups)\n name: \"ssn\",\n regex: /\\b\\d{3}-\\d{2}-\\d{4}\\b/g,\n },\n {\n // Credit card: 4 groups of 4 digits separated by dashes or spaces\n name: \"credit_card\",\n regex: /\\b\\d{4}[-\\s]\\d{4}[-\\s]\\d{4}[-\\s]\\d{4}\\b/g,\n },\n {\n // IP address: four octets 0-255\n name: \"ip_address\",\n regex: /\\b(?:(?:25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\.){3}(?:25[0-5]|2[0-4]\\d|[01]?\\d\\d?)\\b/g,\n },\n {\n // Phone: optional +1- prefix, then 3-3-4 with dashes, dots, or spaces\n // Also matches (555) 123-4567 format\n name: \"phone\",\n regex: /(?:\\+\\d{1,3}[-.\\s]?)?\\(?\\d{3}\\)?[-.\\s]\\d{3}[-.\\s]\\d{4}\\b/g,\n },\n];\n\nconst SCANNED_FIELDS: (keyof Pick<TrainingExportRecord, \"instruction\" | \"input\" | \"output\">)[] = [\n \"instruction\",\n \"input\",\n \"output\",\n];\n\n/**\n * Scan and redact PII from training export records.\n *\n * Returns a new array of cleaned records, leaving the originals\n * unmodified. The `redactedCount` is the number of records that\n * had at least one redaction. 
`redactionDetails` lists every\n * individual match with its record index, field, and pattern name.\n */\nexport function sweepPii(records: TrainingExportRecord[]): PrivacySweepResult {\n const redactionDetails: PrivacySweepResult[\"redactionDetails\"] = [];\n const recordHasRedaction = new Set<number>();\n\n const cleanRecords = records.map((record, idx) => {\n const cleaned: TrainingExportRecord = { ...record };\n\n for (const field of SCANNED_FIELDS) {\n let value = record[field];\n if (!value) continue;\n\n for (const pattern of PII_PATTERNS) {\n // Reset lastIndex for global regex reuse\n pattern.regex.lastIndex = 0;\n if (pattern.regex.test(value)) {\n pattern.regex.lastIndex = 0;\n value = value.replace(pattern.regex, \"[REDACTED]\");\n recordHasRedaction.add(idx);\n redactionDetails.push({\n index: idx,\n field,\n pattern: pattern.name,\n });\n }\n }\n\n cleaned[field] = value;\n }\n\n return cleaned;\n });\n\n return {\n cleanRecords,\n redactedCount: recordHasRedaction.size,\n redactionDetails,\n 
};\n}\n"],"mappings":";;;AAQA;AAAA,EACE;AAAA,EACA;AAAA,OACK;;;ACGA,IAAM,uBAA8C;AAAA,EACzD,MAAM;AAAA,EACN,eAAe;AAAA,EAEf,cAAc,SAAyC;AACrD,UAAM,gBAAgB,QAAQ,IAAI,CAAC,OAAO;AAAA,MACxC,aAAa,EAAE;AAAA,MACf,OAAO,EAAE;AAAA,MACT,QAAQ,EAAE;AAAA,IACZ,EAAE;AACF,WAAO,KAAK,UAAU,eAAe,MAAM,CAAC;AAAA,EAC9C;AACF;;;ACLA,IAAM,oBAAoB;AAO1B,IAAM,qBAA+C;AAAA,EACnD,aAAa;AAAA,IACX;AAAA,IACA;AAAA,IACA;AAAA,EACF;AAAA,EACA,UAAU;AAAA,IACR;AAAA,IACA;AAAA,IACA;AAAA,EACF;AAAA,EACA,WAAW;AAAA,IACT;AAAA,IACA;AAAA,IACA;AAAA,EACF;AAAA,EACA,UAAU;AAAA,IACR;AAAA,IACA;AAAA,IACA;AAAA,EACF;AACF;AAEA,IAAM,oBAAoB;AAAA,EACxB;AAAA,EACA;AACF;AAOA,IAAM,uBAA+C;AAAA,EACnD,YAAY;AAAA,EACZ,MAAM;AAAA,EACN,QAAQ;AAAA,EACR,OAAO;AAAA,EACP,YAAY;AAAA,EACZ,UAAU;AAAA,EACV,WAAW;AAAA,EACX,MAAM;AAAA,EACN,UAAU;AAAA,EACV,cAAc;AAAA,EACd,YAAY;AAAA,EACZ,QAAQ;AACV;AAMO,SAAS,wBACd,SACA,SACwB;AACxB,QAAM,WAAW,SAAS,qBAAqB;AAC/C,QAAM,QAAQ,SAAS;AACvB,QAAM,SAAiC,CAAC;AAExC,WAAS,IAAI,GAAG,IAAI,QAAQ,QAAQ,KAAK;AACvC,UAAM,SAAS,QAAQ,CAAC;AACxB,UAAM,cAAc,mBAAmB,OAAO,QAAQ;AACtD,UAAM,QAAQ,aAAa,OAAO,WAAW;AAC7C,UAAM,YAAY,mBAAmB,WAAW,KAAK;AAErD,UAAM,YAAY,KAAK,IAAI,UAAU,UAAU,MAAM;AAErD,aAAS,IAAI,GAAG,IAAI,WAAW,KAAK;AAClC,YAAM,iBAAiB,IAAI,KAAK,UAAU;AAC1C,YAAM,WAAW,UAAU,aAAa,EAAE,QAAQ,WAAW,KAAK;AAClE,UAAI,SAAS,OAAO;AAEpB,UAAI,OAAO,eAAe;AACxB,iBAAS,OAAO,YAAY;AAAA,MAC9B;AAEA,aAAO,KAAK;AAAA,QACV,aAAa;AAAA,QACb,OAAO;AAAA,QACP;AAAA,QACA,UAAU,OAAO;AAAA,QACjB,YAAY,OAAO;AAAA,QACnB,WAAW,OAAO;AAAA,MACpB,CAAC;AAAA,IACH;AAAA,EACF;AAEA,SAAO;AACT;AAQA,SAAS,mBAAmB,UAAsC;AAChE,MAAI,CAAC,SAAU,QAAO;AACtB,SAAO,qBAAqB,SAAS,YAAY,CAAC,KAAK;AACzD;AAYA,SAAS,aAAa,aAA6B;AACjD,QAAM,WAAW,YAAY,MAAM,cAAc;AACjD,MAAI,UAAU;AACZ,WAAO,SAAS,CAAC,EAAE,KAAK,EAAE,YAAY;AAAA,EACxC;AACA,SAAO;AACT;;;AC7HA,IAAM,WACJ;AAGF,IAAM,iBAAiB;AAAA,EACrB;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AACF;AAGA,IAAM,iBAAiB;AAAA,EACr
B;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EACA;AACF;AAGA,IAAM,uBAAuB;AAG7B,IAAM,qBAAqB;AAKpB,SAAS,oBAAoB,SAAiC;AACnE,MAAI,QAAQ,WAAW,GAAG;AACxB,WAAO;AAAA,MACL,mBAAmB;AAAA,MACnB,WAAW;AAAA,MACX,WAAW;AAAA,MACX,eAAe;AAAA,MACf,eAAe,CAAC;AAAA,IAClB;AAAA,EACF;AAEA,QAAM,SAAS,QAAQ,KAAK,GAAG;AAE/B,SAAO;AAAA,IACL,mBAAmB,sBAAsB,MAAM;AAAA,IAC/C,WAAW,YAAY,MAAM;AAAA,IAC7B,WAAW,gBAAgB,MAAM;AAAA,IACjC,eAAe,gBAAgB,MAAM;AAAA,IACrC,eAAe,kBAAkB,OAAO;AAAA,EAC1C;AACF;AAIA,SAAS,sBAAsB,MAAsB;AAEnD,QAAM,YAAY,KACf,MAAM,QAAQ,EACd,IAAI,CAAC,MAAM,EAAE,KAAK,CAAC,EACnB,OAAO,CAAC,MAAM,EAAE,SAAS,CAAC;AAE7B,MAAI,UAAU,WAAW,EAAG,QAAO;AAEnC,QAAM,aAAa,UAAU,OAAO,CAAC,KAAK,MAAM;AAC9C,UAAM,QAAQ,EAAE,MAAM,KAAK,EAAE,OAAO,CAAC,MAAM,EAAE,SAAS,CAAC;AACvD,WAAO,MAAM,MAAM;AAAA,EACrB,GAAG,CAAC;AAEJ,SAAO,KAAK,MAAO,aAAa,UAAU,SAAU,EAAE,IAAI;AAC5D;AAEA,SAAS,YAAY,MAAuB;AAC1C,SAAO,SAAS,KAAK,IAAI;AAC3B;AAEA,SAAS,gBAAgB,MAA6C;AACpE,QAAM,QAAQ,KAAK,YAAY;AAE/B,MAAI,cAAc;AAClB,aAAW,UAAU,gBAAgB;AAGnC,QAAI,IAAI,OAAO,MAAM,MAAM,OAAO,GAAG,EAAE,KAAK,KAAK,EAAG;AAAA,EACtD;AAEA,MAAI,cAAc;AAClB,aAAW,UAAU,gBAAgB;AACnC,QAAI,IAAI,OAAO,MAAM,MAAM,OAAO,GAAG,EAAE,KAAK,KAAK,EAAG;AAAA,EACtD;AAGA,QAAM,YAAY;AAElB,MAAI,eAAe,aAAa,cAAc,YAAa,QAAO;AAClE,MAAI,eAAe,aAAa,cAAc,YAAa,QAAO;AAClE,SAAO;AACT;AAEA,SAAS,gBAAgB,MAAuB;AAE9C,QAAM,YAAY,KACf,MAAM,QAAQ,EACd,IAAI,CAAC,MAAM,EAAE,KAAK,CAAC,EACnB,OAAO,CAAC,MAAM,EAAE,SAAS,CAAC;AAE7B,MAAI,UAAU,WAAW,EAAG,QAAO;AAEnC,QAAM,kBAAkB,UAAU,OAAO,CAAC,MAAM;AAC9C,UAAM,YAAY,EAAE,OAAO,CAAC;AAC5B,WAAO,cAAc,UAAU,YAAY,KAAK,cAAc,UAAU,YAAY;AAAA,EACtF,CAAC,EAAE;AAGH,SAAO,kBAAkB,UAAU,SAAS;AAC9C;AAMA,SAAS,QAAQ,IAAqB;AACpC,QAAM,IAAI,GAAG,WAAW,CAAC;AACzB,SACG,KAAK,MAAM,KAAK;AAAA,EAChB,KAAK,MAAM,KAAK;AAAA,EAChB,KAAK,MAAM,KAAK;AAErB;AAQA,SAAS,aAAa,MAAsB;AAC1C,MAAI,QAAQ;AACZ,MAAI,MAAM,KAAK;AACf,SAAO,QAAQ,OAAO,CAAC,QAAQ,KAAK,OAAO,KAAK,CAAC,EAAG;AA
CpD,SAAO,MAAM,SAAS,CAAC,QAAQ,KAAK,OAAO,MAAM,CAAC,CAAC,EAAG;AACtD,SAAO,UAAU,KAAK,QAAQ,KAAK,SAAS,OAAO,KAAK,MAAM,OAAO,GAAG;AAC1E;AAEA,SAAS,kBAAkB,SAA6B;AACtD,QAAM,cAAc,oBAAI,IAAoB;AAE5C,aAAW,UAAU,SAAS;AAI5B,UAAM,QAAQ,OACX,MAAM,KAAK,EACX,IAAI,CAAC,MAAM,aAAa,CAAC,CAAC,EAC1B,OAAO,CAAC,MAAM,EAAE,SAAS,CAAC;AAG7B,UAAM,eAAe,oBAAI,IAAY;AACrC,aAAS,YAAY,GAAG,aAAa,GAAG,aAAa;AACnD,eAAS,IAAI,GAAG,KAAK,MAAM,SAAS,WAAW,KAAK;AAClD,cAAM,SAAS,MAAM,MAAM,GAAG,IAAI,SAAS,EAAE,KAAK,GAAG,EAAE,YAAY;AAEnE,YAAI,CAAC,aAAa,IAAI,MAAM,GAAG;AAC7B,uBAAa,IAAI,MAAM;AACvB,sBAAY,IAAI,SAAS,YAAY,IAAI,MAAM,KAAK,KAAK,CAAC;AAAA,QAC5D;AAAA,MACF;AAAA,IACF;AAAA,EACF;AAGA,SAAO,CAAC,GAAG,YAAY,QAAQ,CAAC,EAC7B,OAAO,CAAC,CAAC,EAAE,KAAK,MAAM,SAAS,oBAAoB,EACnD,KAAK,CAAC,GAAG,MAAM;AACd,QAAI,EAAE,CAAC,MAAM,EAAE,CAAC,EAAG,QAAO,EAAE,CAAC,IAAI,EAAE,CAAC;AACpC,WAAO,EAAE,CAAC,EAAE,cAAc,EAAE,CAAC,CAAC;AAAA,EAChC,CAAC,EACA,MAAM,GAAG,kBAAkB,EAC3B,IAAI,CAAC,CAAC,MAAM,MAAM,MAAM;AAC7B;;;AClNA,IAAM,eAA6B;AAAA,EACjC;AAAA;AAAA,IAEE,MAAM;AAAA,IACN,OAAO;AAAA,EACT;AAAA,EACA;AAAA;AAAA,IAEE,MAAM;AAAA,IACN,OAAO;AAAA,EACT;AAAA,EACA;AAAA;AAAA,IAEE,MAAM;AAAA,IACN,OAAO;AAAA,EACT;AAAA,EACA;AAAA;AAAA,IAEE,MAAM;AAAA,IACN,OAAO;AAAA,EACT;AAAA,EACA;AAAA;AAAA;AAAA,IAGE,MAAM;AAAA,IACN,OAAO;AAAA,EACT;AACF;AAEA,IAAM,iBAA2F;AAAA,EAC/F;AAAA,EACA;AAAA,EACA;AACF;AAUO,SAAS,SAAS,SAAqD;AAC5E,QAAM,mBAA2D,CAAC;AAClE,QAAM,qBAAqB,oBAAI,IAAY;AAE3C,QAAM,eAAe,QAAQ,IAAI,CAAC,QAAQ,QAAQ;AAChD,UAAM,UAAgC,EAAE,GAAG,OAAO;AAElD,eAAW,SAAS,gBAAgB;AAClC,UAAI,QAAQ,OAAO,KAAK;AACxB,UAAI,CAAC,MAAO;AAEZ,iBAAW,WAAW,cAAc;AAElC,gBAAQ,MAAM,YAAY;AAC1B,YAAI,QAAQ,MAAM,KAAK,KAAK,GAAG;AAC7B,kBAAQ,MAAM,YAAY;AAC1B,kBAAQ,MAAM,QAAQ,QAAQ,OAAO,YAAY;AACjD,6BAAmB,IAAI,GAAG;AAC1B,2BAAiB,KAAK;AAAA,YACpB,OAAO;AAAA,YACP;AAAA,YACA,SAAS,QAAQ;AAAA,UACnB,CAAC;AAAA,QACH;AAAA,MACF;AAEA,cAAQ,KAAK,IAAI;AAAA,IACnB;AAEA,WAAO;AAAA,EACT,CAAC;AAED,SAAO;AAAA,IACL;AAAA,IACA,eAAe,mBAAmB;AAAA,IAClC;AAAA,EACF;AACF;;;AJ/EO,SAAS,uCAAgD;AAC9D,MAAI,yBAAyB,qBAAqB,IAAI,MAAM,QAAW;AACrE,WAAO;AAAA,EACT;AACA,gCAA8B,oBAAoB;AAClD,SAAO
;AACT;AAUA,IAAI;AACF,uCAAqC;AACvC,QAAQ;AAER;","names":[]}
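
Note on the privacy sweep embedded in the source map above: the following is a minimal, standalone sketch of the same scan-and-redact loop, not the package's actual code. `SketchRecord`, `SKETCH_PATTERNS`, and `sweepSketch` are hypothetical names introduced for illustration; only the email and SSN patterns are carried over, and the real `TrainingExportRecord` type from `@remnic/core` may have more fields than assumed here.

```typescript
// Hypothetical, trimmed-down record type (assumption: the real
// TrainingExportRecord in @remnic/core may carry additional fields).
interface SketchRecord {
  instruction: string;
  input?: string;
  output: string;
}

// Two of the patterns from the listing above (email and SSN).
const SKETCH_PATTERNS: { name: string; regex: RegExp }[] = [
  { name: "email", regex: /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g },
  { name: "ssn", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
];

function sweepSketch(records: SketchRecord[]) {
  const hits: { index: number; field: string; pattern: string }[] = [];

  const clean = records.map((record, index) => {
    // Shallow copy so the caller's records are left unmodified.
    const copy: SketchRecord = { ...record };

    for (const field of ["instruction", "input", "output"] as const) {
      let value = record[field];
      if (!value) continue;

      for (const { name, regex } of SKETCH_PATTERNS) {
        // test() on a /g regex advances lastIndex, so reset it
        // before each reuse of the shared pattern object.
        regex.lastIndex = 0;
        if (regex.test(value)) {
          regex.lastIndex = 0;
          value = value.replace(regex, "[REDACTED]");
          hits.push({ index, field, pattern: name });
        }
      }

      copy[field] = value;
    }

    return copy;
  });

  return { clean, hits };
}
```

In this sketch, a record whose `instruction` contains an email address and whose `output` contains an SSN produces two entries in `hits`, one per field and pattern, while the input array itself is not mutated.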

package/package.json ADDED

@@ -0,0 +1,48 @@
+{
+  "name": "@remnic/export-weclone",
+  "version": "1.0.1",
+  "description": "Export Remnic memories as WeClone-compatible Alpaca-format fine-tuning datasets",
+  "type": "module",
+  "main": "dist/index.js",
+  "types": "dist/index.d.ts",
+  "exports": {
+    ".": {
+      "types": "./dist/index.d.ts",
+      "import": "./dist/index.js"
+    }
+  },
+  "files": [
+    "dist",
+    "README.md"
+  ],
+  "publishConfig": {
+    "access": "public",
+    "provenance": true
+  },
+  "dependencies": {
+    "@remnic/core": "^1.0.3"
+  },
+  "devDependencies": {
+    "tsup": "^8.0.0",
+    "typescript": "^5.7.0",
+    "tsx": "^4.0.0"
+  },
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/joshuaswarren/remnic.git",
+    "directory": "packages/export-weclone"
+  },
+  "keywords": [
+    "remnic",
+    "memory",
+    "weclone",
+    "fine-tuning",
+    "export",
+    "alpaca"
+  ],
+  "scripts": {
+    "build": "tsup src/index.ts --format esm --dts",
+    "test": "tsx --test 'src/**/*.test.ts'"
+  }
+}
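
Note on the "Alpaca-format" wording in the `description` field above: the sketch below shows the conventional Alpaca record shape (`instruction`, `input`, `output`), which matches the three fields the privacy sweep scans. `AlpacaRecord` and `toAlpacaJson` are hypothetical names, and the two-space indentation is an assumption rather than something this diff confirms.

```typescript
// Conventional Alpaca record shape; the field names match those
// scanned by the privacy sweep in this package.
interface AlpacaRecord {
  instruction: string;
  input: string;
  output: string;
}

// Hypothetical helper: keep only the three Alpaca fields and
// serialise the array as pretty-printed JSON (indent is an assumption).
function toAlpacaJson(records: AlpacaRecord[]): string {
  const trimmed = records.map(({ instruction, input, output }) => ({
    instruction,
    input,
    output,
  }));
  return JSON.stringify(trimmed, null, 2);
}
```

Any extra properties on the incoming objects are dropped by the destructuring step, so the serialised array carries only the three Alpaca fields.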