@soulcraft/brainy 4.1.3 → 4.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (52)
  1. package/CHANGELOG.md +100 -7
  2. package/dist/brainy.d.ts +74 -16
  3. package/dist/brainy.js +74 -16
  4. package/dist/import/FormatDetector.d.ts +6 -1
  5. package/dist/import/FormatDetector.js +40 -1
  6. package/dist/import/ImportCoordinator.d.ts +155 -5
  7. package/dist/import/ImportCoordinator.js +346 -6
  8. package/dist/import/InstancePool.d.ts +136 -0
  9. package/dist/import/InstancePool.js +231 -0
  10. package/dist/importers/SmartCSVImporter.d.ts +2 -1
  11. package/dist/importers/SmartCSVImporter.js +11 -22
  12. package/dist/importers/SmartDOCXImporter.d.ts +125 -0
  13. package/dist/importers/SmartDOCXImporter.js +227 -0
  14. package/dist/importers/SmartExcelImporter.d.ts +12 -1
  15. package/dist/importers/SmartExcelImporter.js +40 -25
  16. package/dist/importers/SmartJSONImporter.d.ts +1 -0
  17. package/dist/importers/SmartJSONImporter.js +25 -6
  18. package/dist/importers/SmartMarkdownImporter.d.ts +2 -1
  19. package/dist/importers/SmartMarkdownImporter.js +11 -16
  20. package/dist/importers/SmartPDFImporter.d.ts +2 -1
  21. package/dist/importers/SmartPDFImporter.js +11 -22
  22. package/dist/importers/SmartYAMLImporter.d.ts +121 -0
  23. package/dist/importers/SmartYAMLImporter.js +275 -0
  24. package/dist/importers/VFSStructureGenerator.js +12 -0
  25. package/dist/neural/SmartExtractor.d.ts +279 -0
  26. package/dist/neural/SmartExtractor.js +592 -0
  27. package/dist/neural/SmartRelationshipExtractor.d.ts +217 -0
  28. package/dist/neural/SmartRelationshipExtractor.js +396 -0
  29. package/dist/neural/embeddedTypeEmbeddings.d.ts +1 -1
  30. package/dist/neural/embeddedTypeEmbeddings.js +2 -2
  31. package/dist/neural/entityExtractor.d.ts +3 -0
  32. package/dist/neural/entityExtractor.js +34 -36
  33. package/dist/neural/presets.d.ts +189 -0
  34. package/dist/neural/presets.js +365 -0
  35. package/dist/neural/signals/ContextSignal.d.ts +166 -0
  36. package/dist/neural/signals/ContextSignal.js +646 -0
  37. package/dist/neural/signals/EmbeddingSignal.d.ts +175 -0
  38. package/dist/neural/signals/EmbeddingSignal.js +435 -0
  39. package/dist/neural/signals/ExactMatchSignal.d.ts +220 -0
  40. package/dist/neural/signals/ExactMatchSignal.js +542 -0
  41. package/dist/neural/signals/PatternSignal.d.ts +159 -0
  42. package/dist/neural/signals/PatternSignal.js +478 -0
  43. package/dist/neural/signals/VerbContextSignal.d.ts +102 -0
  44. package/dist/neural/signals/VerbContextSignal.js +390 -0
  45. package/dist/neural/signals/VerbEmbeddingSignal.d.ts +131 -0
  46. package/dist/neural/signals/VerbEmbeddingSignal.js +304 -0
  47. package/dist/neural/signals/VerbExactMatchSignal.d.ts +115 -0
  48. package/dist/neural/signals/VerbExactMatchSignal.js +335 -0
  49. package/dist/neural/signals/VerbPatternSignal.d.ts +104 -0
  50. package/dist/neural/signals/VerbPatternSignal.js +457 -0
  51. package/dist/types/graphTypes.d.ts +2 -0
  52. package/package.json +4 -1
package/CHANGELOG.md CHANGED
@@ -2,6 +2,11 @@
 
 All notable changes to this project will be documented in this file. See [standard-version](https://github.com/conventional-changelog/standard-version) for commit guidelines.
 
+### [4.1.4](https://github.com/soulcraftlabs/brainy/compare/v4.1.3...v4.1.4) (2025-10-21)
+
+- feat: add import API validation and v4.x migration guide (a1a0576)
+
+
 ### [4.1.3](https://github.com/soulcraftlabs/brainy/compare/v4.1.2...v4.1.3) (2025-10-21)
 
 - perf: make getRelations() pagination consistent and efficient (54d819c)
@@ -223,22 +228,110 @@ $ brainy import ./research-papers --extract-concepts --progress
 
 ### ⚠️ Breaking Changes
 
-**NONE** - v4.0.0 is 100% backward compatible!
+#### 💥 Import API Redesign
+
+The import API has been redesigned for clarity and better feature control. **Old v3.x option names are no longer recognized** and will throw errors.
+
+**What Changed:**
+
+| v3.x Option | v4.x Option | Action Required |
+|-------------|-------------|-----------------|
+| `extractRelationships` | `enableRelationshipInference` | **Rename option** |
+| `autoDetect` | *(removed)* | **Delete option** (always enabled) |
+| `createFileStructure` | `vfsPath` | **Replace** with VFS path |
+| `excelSheets` | *(removed)* | **Delete option** (all sheets processed) |
+| `pdfExtractTables` | *(removed)* | **Delete option** (always enabled) |
+| - | `enableNeuralExtraction` | **Add option** (new in v4.x) |
+| - | `enableConceptExtraction` | **Add option** (new in v4.x) |
+| - | `preserveSource` | **Add option** (new in v4.x) |
+
+**Why These Changes?**
+
+1. **Clearer option names**: `enableRelationshipInference` explicitly indicates AI-powered relationship inference
+2. **Separation of concerns**: Neural extraction, relationship inference, and VFS are now separate, explicit options
+3. **Better defaults**: Auto-detection and AI features are enabled by default
+4. **Reduced confusion**: Removed redundant options like `autoDetect` and format-specific options
+
+**Migration Examples:**
+
+<details>
+<summary>Example 1: Basic Excel Import</summary>
+
+```typescript
+// v3.x (OLD - Will throw error)
+await brain.import('./glossary.xlsx', {
+  extractRelationships: true,
+  createFileStructure: true
+})
+
+// v4.x (NEW - Use this)
+await brain.import('./glossary.xlsx', {
+  enableRelationshipInference: true,
+  vfsPath: '/imports/glossary'
+})
+```
+</details>
+
+<details>
+<summary>Example 2: Full-Featured Import</summary>
 
-All v4.0.0 features are:
+```typescript
+// v3.x (OLD - Will throw error)
+await brain.import('./data.xlsx', {
+  extractRelationships: true,
+  autoDetect: true,
+  createFileStructure: true
+})
+
+// v4.x (NEW - Use this)
+await brain.import('./data.xlsx', {
+  enableNeuralExtraction: true,       // Extract entity names
+  enableRelationshipInference: true,  // Infer semantic relationships
+  enableConceptExtraction: true,      // Extract entity types
+  vfsPath: '/imports/data',           // VFS directory
+  preserveSource: true                // Save original file
+})
+```
+</details>
+
+**Error Messages:**
+
+If you use old v3.x options, you'll get a clear error message:
+
+```
+❌ Invalid import options detected (Brainy v4.x breaking changes)
+
+The following v3.x options are no longer supported:
+
+❌ extractRelationships
+   → Use: enableRelationshipInference
+   → Why: Option renamed for clarity in v4.x
+
+📖 Migration Guide: https://brainy.dev/docs/guides/migrating-to-v4
+```
+
+**Other v4.0.0 Features (Non-Breaking):**
+
+All other v4.0.0 features are:
 - ✅ Opt-in (lifecycle, compression, batch operations)
 - ✅ Additive (new CLI commands, new methods)
 - ✅ Non-breaking (existing code continues to work)
 
 ### 📝 Migration
 
-**No migration required!** All v4.0.0 features are optional enhancements.
+**Import API migration required** if you use `brain.import()` with the old v3.x option names.
 
-To use new features:
+#### Required Changes:
 1. Update to v4.0.0: `npm install @soulcraft/brainy@4.0.0`
-2. Enable lifecycle policies: `brainy storage lifecycle set`
-3. Use batch operations: `brainy storage batch-delete entities.txt`
-4. See `docs/MIGRATION-V3-TO-V4.md` for full feature documentation
+2. Update import calls to use new option names (see table above)
+3. Test your imports - you'll get clear error messages if you use old options
+
+#### Optional Enhancements:
+- Enable lifecycle policies: `brainy storage lifecycle set`
+- Use batch operations: `brainy storage batch-delete entities.txt`
+- See full migration guide: `docs/guides/migrating-to-v4.md`
+
+**Complete Migration Guide:** [docs/guides/migrating-to-v4.md](./docs/guides/migrating-to-v4.md)
 
 ### 🎓 What This Means
 
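The option renames in the changelog's migration table are mechanical enough to script across many call sites. A minimal sketch of such a codemod helper — this is a hypothetical illustration, not part of Brainy; `migrateImportOptions` and its default `vfsPath` are invented here:

```typescript
// Hypothetical helper: translates a v3.x brain.import() options object
// into its v4.x equivalent, following the migration table above.
type V3Options = {
  extractRelationships?: boolean
  autoDetect?: boolean
  createFileStructure?: boolean
  excelSheets?: string[]
  pdfExtractTables?: boolean
  [key: string]: unknown
}

function migrateImportOptions(v3: V3Options, vfsPath = '/imports'): Record<string, unknown> {
  const { extractRelationships, autoDetect, createFileStructure,
          excelSheets, pdfExtractTables, ...rest } = v3
  const v4: Record<string, unknown> = { ...rest }
  // extractRelationships -> enableRelationshipInference (renamed)
  if (extractRelationships !== undefined) v4.enableRelationshipInference = extractRelationships
  // createFileStructure -> vfsPath (replaced with an explicit VFS directory)
  if (createFileStructure) v4.vfsPath = vfsPath
  // autoDetect, excelSheets, pdfExtractTables: removed in v4.x (always-on), so dropped
  return v4
}

console.log(migrateImportOptions(
  { extractRelationships: true, createFileStructure: true },
  '/imports/glossary'
))
```

Running this over Example 1's v3.x options yields exactly the v4.x options shown in that example.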
package/dist/brainy.d.ts CHANGED
@@ -686,33 +686,91 @@ export declare class Brainy<T = any> implements BrainyInterface<T> {
     limit?: number;
   }): Promise<string[]>;
   /**
-   * Import files with auto-detection and dual storage (VFS + Knowledge Graph)
+   * Import files with intelligent extraction and dual storage (VFS + Knowledge Graph)
    *
    * Unified import system that:
    * - Auto-detects format (Excel, PDF, CSV, JSON, Markdown)
-   * - Extracts entities and relationships
+   * - Extracts entities with AI-powered name/type detection
+   * - Infers semantic relationships from context
    * - Stores in both VFS (organized files) and Knowledge Graph (connected entities)
    * - Links VFS files to graph entities
    *
-   * @example
-   * // Import from file path
-   * const result = await brain.import('/path/to/file.xlsx')
+   * @since 4.0.0
    *
-   * @example
-   * // Import from buffer
+   * @example Quick Start (All AI features enabled by default)
+   * ```typescript
+   * const result = await brain.import('./glossary.xlsx')
+   * // Auto-detects format, extracts entities, infers relationships
+   * ```
+   *
+   * @example Full-Featured Import (v4.x)
+   * ```typescript
+   * const result = await brain.import('./data.xlsx', {
+   *   // AI features
+   *   enableNeuralExtraction: true,       // Extract entity names/metadata
+   *   enableRelationshipInference: true,  // Detect semantic relationships
+   *   enableConceptExtraction: true,      // Extract types/concepts
+   *
+   *   // VFS features
+   *   vfsPath: '/imports/my-data',        // Store in VFS directory
+   *   groupBy: 'type',                    // Organize by entity type
+   *   preserveSource: true,               // Keep original file
+   *
+   *   // Progress tracking
+   *   onProgress: (p) => console.log(p.message)
+   * })
+   * ```
+   *
+   * @example Performance Tuning (Large Files)
+   * ```typescript
+   * const result = await brain.import('./huge-file.csv', {
+   *   enableDeduplication: false,  // Skip dedup for speed
+   *   confidenceThreshold: 0.8,    // Higher threshold = fewer entities
+   *   onProgress: (p) => console.log(`${p.processed}/${p.total}`)
+   * })
+   * ```
+   *
+   * @example Import from Buffer or Object
+   * ```typescript
+   * // From buffer
    * const result = await brain.import(buffer, { format: 'pdf' })
    *
-   * @example
-   * // Import JSON object
+   * // From object
    * const result = await brain.import({ entities: [...] })
+   * ```
    *
-   * @example
-   * // Custom VFS path and grouping
-   * const result = await brain.import(buffer, {
-   *   vfsPath: '/my-imports/data',
-   *   groupBy: 'type',
-   *   onProgress: (progress) => console.log(progress.message)
-   * })
+   * @throws {Error} If invalid options are provided (v4.x breaking changes)
+   *
+   * @see {@link https://brainy.dev/docs/api/import API Documentation}
+   * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+   *
+   * @remarks
+   * **⚠️ Breaking Changes from v3.x:**
+   *
+   * The import API was redesigned in v4.0.0 for clarity and better feature control.
+   * Old v3.x option names are **no longer recognized** and will throw errors.
+   *
+   * **Option Changes:**
+   * - ❌ `extractRelationships` → ✅ `enableRelationshipInference`
+   * - ❌ `createFileStructure` → ✅ `vfsPath: '/your/path'`
+   * - ❌ `autoDetect` → ✅ *(removed - always enabled)*
+   * - ❌ `excelSheets` → ✅ *(removed - all sheets processed)*
+   * - ❌ `pdfExtractTables` → ✅ *(removed - always enabled)*
+   *
+   * **New Options:**
+   * - ✅ `enableNeuralExtraction` - Extract entity names via AI
+   * - ✅ `enableConceptExtraction` - Extract entity types via AI
+   * - ✅ `preserveSource` - Save original file in VFS
+   *
+   * **If you get an error:**
+   * The error message includes migration instructions and examples.
+   * See the complete migration guide for all details.
+   *
+   * **Why these changes?**
+   * - Clearer option names (explicitly describe what they do)
+   * - Separation of concerns (neural, relationships, VFS are separate)
+   * - Better defaults (AI features enabled by default)
+   * - Reduced confusion (removed redundant options)
    */
   import(source: Buffer | string | object, options?: {
     format?: 'excel' | 'pdf' | 'csv' | 'json' | 'markdown';
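Beyond the runtime validation described in the `@remarks` above, the v4.x typings also surface the removed option names at compile time by declaring them with type `never` (see `DeprecatedImportOptions` in the ImportCoordinator typings later in this diff). A minimal standalone sketch of that pattern — the interfaces here are illustrative stand-ins, not the package's full declarations:

```typescript
// Sketch of the `never`-typed deprecation pattern used by the v4.x typings.
interface ValidOptions {
  enableRelationshipInference?: boolean
  vfsPath?: string
}

interface DeprecatedOptions {
  /** @deprecated Use `enableRelationshipInference` instead */
  extractRelationships?: never
}

// Intersecting the two keeps valid options usable while making the
// deprecated names unassignable (nothing satisfies `never`).
type Options = ValidOptions & DeprecatedOptions

const ok: Options = { enableRelationshipInference: true } // compiles

// const bad: Options = { extractRelationships: true }
// ^ error: Type 'boolean' is not assignable to type 'never'
console.log(ok.enableRelationshipInference)
```

The payoff is that TypeScript users discover the v3.x → v4.x renames at the call site, before the runtime validator ever fires.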
package/dist/brainy.js CHANGED
@@ -1593,33 +1593,91 @@ export class Brainy {
     return options?.limit ? concepts.slice(0, options.limit) : concepts;
   }
   /**
-   * Import files with auto-detection and dual storage (VFS + Knowledge Graph)
+   * Import files with intelligent extraction and dual storage (VFS + Knowledge Graph)
    *
    * Unified import system that:
    * - Auto-detects format (Excel, PDF, CSV, JSON, Markdown)
-   * - Extracts entities and relationships
+   * - Extracts entities with AI-powered name/type detection
+   * - Infers semantic relationships from context
    * - Stores in both VFS (organized files) and Knowledge Graph (connected entities)
    * - Links VFS files to graph entities
    *
-   * @example
-   * // Import from file path
-   * const result = await brain.import('/path/to/file.xlsx')
+   * @since 4.0.0
    *
-   * @example
-   * // Import from buffer
+   * @example Quick Start (All AI features enabled by default)
+   * ```typescript
+   * const result = await brain.import('./glossary.xlsx')
+   * // Auto-detects format, extracts entities, infers relationships
+   * ```
+   *
+   * @example Full-Featured Import (v4.x)
+   * ```typescript
+   * const result = await brain.import('./data.xlsx', {
+   *   // AI features
+   *   enableNeuralExtraction: true,       // Extract entity names/metadata
+   *   enableRelationshipInference: true,  // Detect semantic relationships
+   *   enableConceptExtraction: true,      // Extract types/concepts
+   *
+   *   // VFS features
+   *   vfsPath: '/imports/my-data',        // Store in VFS directory
+   *   groupBy: 'type',                    // Organize by entity type
+   *   preserveSource: true,               // Keep original file
+   *
+   *   // Progress tracking
+   *   onProgress: (p) => console.log(p.message)
+   * })
+   * ```
+   *
+   * @example Performance Tuning (Large Files)
+   * ```typescript
+   * const result = await brain.import('./huge-file.csv', {
+   *   enableDeduplication: false,  // Skip dedup for speed
+   *   confidenceThreshold: 0.8,    // Higher threshold = fewer entities
+   *   onProgress: (p) => console.log(`${p.processed}/${p.total}`)
+   * })
+   * ```
+   *
+   * @example Import from Buffer or Object
+   * ```typescript
+   * // From buffer
    * const result = await brain.import(buffer, { format: 'pdf' })
    *
-   * @example
-   * // Import JSON object
+   * // From object
    * const result = await brain.import({ entities: [...] })
+   * ```
    *
-   * @example
-   * // Custom VFS path and grouping
-   * const result = await brain.import(buffer, {
-   *   vfsPath: '/my-imports/data',
-   *   groupBy: 'type',
-   *   onProgress: (progress) => console.log(progress.message)
-   * })
+   * @throws {Error} If invalid options are provided (v4.x breaking changes)
+   *
+   * @see {@link https://brainy.dev/docs/api/import API Documentation}
+   * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+   *
+   * @remarks
+   * **⚠️ Breaking Changes from v3.x:**
+   *
+   * The import API was redesigned in v4.0.0 for clarity and better feature control.
+   * Old v3.x option names are **no longer recognized** and will throw errors.
+   *
+   * **Option Changes:**
+   * - ❌ `extractRelationships` → ✅ `enableRelationshipInference`
+   * - ❌ `createFileStructure` → ✅ `vfsPath: '/your/path'`
+   * - ❌ `autoDetect` → ✅ *(removed - always enabled)*
+   * - ❌ `excelSheets` → ✅ *(removed - all sheets processed)*
+   * - ❌ `pdfExtractTables` → ✅ *(removed - always enabled)*
+   *
+   * **New Options:**
+   * - ✅ `enableNeuralExtraction` - Extract entity names via AI
+   * - ✅ `enableConceptExtraction` - Extract entity types via AI
+   * - ✅ `preserveSource` - Save original file in VFS
+   *
+   * **If you get an error:**
+   * The error message includes migration instructions and examples.
+   * See the complete migration guide for all details.
+   *
+   * **Why these changes?**
+   * - Clearer option names (explicitly describe what they do)
+   * - Separation of concerns (neural, relationships, VFS are separate)
+   * - Better defaults (AI features enabled by default)
+   * - Reduced confusion (removed redundant options)
    */
   async import(source, options) {
     // Lazy load ImportCoordinator
package/dist/import/FormatDetector.d.ts CHANGED
@@ -8,7 +8,7 @@
  *
  * NO MOCKS - Production-ready implementation
  */
-export type SupportedFormat = 'excel' | 'pdf' | 'csv' | 'json' | 'markdown';
+export type SupportedFormat = 'excel' | 'pdf' | 'csv' | 'json' | 'markdown' | 'yaml' | 'docx';
 export interface DetectionResult {
   format: SupportedFormat;
   confidence: number;
@@ -54,6 +54,11 @@ export declare class FormatDetector {
    * Check if content looks like CSV
    */
   private looksLikeCSV;
+  /**
+   * Check if content looks like YAML
+   * v4.2.0: Added YAML detection
+   */
+  private looksLikeYAML;
   /**
    * Check if content is text-based (not binary)
    */
package/dist/import/FormatDetector.js CHANGED
@@ -38,7 +38,11 @@ export class FormatDetector {
   '.csv': 'csv',
   '.json': 'json',
   '.md': 'markdown',
-  '.markdown': 'markdown'
+  '.markdown': 'markdown',
+  '.yaml': 'yaml',
+  '.yml': 'yaml',
+  '.docx': 'docx',
+  '.doc': 'docx'
 };
 const format = extensionMap[ext];
 if (format) {
@@ -63,6 +67,14 @@ export class FormatDetector {
     evidence: ['Content starts with { or [', 'Valid JSON structure']
   };
 }
+// YAML detection (v4.2.0)
+if (this.looksLikeYAML(trimmed)) {
+  return {
+    format: 'yaml',
+    confidence: 0.90,
+    evidence: ['Contains YAML key: value patterns', 'YAML-style indentation']
+  };
+}
 // Markdown detection
 if (this.looksLikeMarkdown(trimmed)) {
   return {
@@ -233,6 +245,33 @@ export class FormatDetector {
   }
   return false;
 }
+/**
+ * Check if content looks like YAML
+ * v4.2.0: Added YAML detection
+ */
+looksLikeYAML(content) {
+  const lines = content.split('\n').filter(l => l.trim()).slice(0, 20);
+  if (lines.length < 2)
+    return false;
+  let yamlIndicators = 0;
+  for (const line of lines) {
+    const trimmed = line.trim();
+    // Check for YAML key: value pattern
+    if (/^[\w-]+:\s/.test(trimmed)) {
+      yamlIndicators++;
+    }
+    // Check for YAML list items (- item)
+    if (/^-\s+\w/.test(trimmed)) {
+      yamlIndicators++;
+    }
+    // Check for YAML document separator (---)
+    if (trimmed === '---' || trimmed === '...') {
+      yamlIndicators += 2;
+    }
+  }
+  // If >50% of lines have YAML indicators, it's likely YAML
+  return yamlIndicators / lines.length > 0.5;
+}
 /**
  * Check if content is text-based (not binary)
  */
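The YAML heuristic added in this file scores the first 20 non-blank lines and accepts the content once a majority look YAML-ish. Its scoring logic can be exercised standalone; this sketch reproduces the method from the diff as a plain function:

```typescript
// Standalone reproduction of the looksLikeYAML scoring heuristic above.
function looksLikeYAML(content: string): boolean {
  const lines = content.split('\n').filter(l => l.trim()).slice(0, 20)
  if (lines.length < 2) return false
  let yamlIndicators = 0
  for (const line of lines) {
    const trimmed = line.trim()
    if (/^[\w-]+:\s/.test(trimmed)) yamlIndicators++           // key: value pair
    if (/^-\s+\w/.test(trimmed)) yamlIndicators++              // "- item" list entry
    if (trimmed === '---' || trimmed === '...') yamlIndicators += 2 // document markers weigh double
  }
  // Majority of sampled lines must carry YAML indicators
  return yamlIndicators / lines.length > 0.5
}

console.log(looksLikeYAML('---\nname: brainy\nversion: 4.2.0\ntags:\n- graph\n- vector')) // true
console.log(looksLikeYAML('Just a plain sentence.\nAnother sentence here.'))              // false
```

Note one consequence of the `:\s` requirement: a bare key with nothing after the colon (like `tags:` above) scores zero on its own, so detection leans on the surrounding key/value and list lines.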
package/dist/import/ImportCoordinator.d.ts CHANGED
@@ -15,13 +15,23 @@ import { ImportHistory } from './ImportHistory.js';
 import { NounType, VerbType } from '../types/graphTypes.js';
 export interface ImportSource {
   /** Source type */
-  type: 'buffer' | 'path' | 'string' | 'object';
+  type: 'buffer' | 'path' | 'string' | 'object' | 'url';
   /** Source data */
   data: Buffer | string | object;
   /** Optional filename hint */
   filename?: string;
+  /** HTTP headers for URL imports (v4.2.0) */
+  headers?: Record<string, string>;
+  /** Basic authentication for URL imports (v4.2.0) */
+  auth?: {
+    username: string;
+    password: string;
+  };
 }
-export interface ImportOptions {
+/**
+ * Valid import options for v4.x
+ */
+export interface ValidImportOptions {
   /** Force specific format (skip auto-detection) */
   format?: SupportedFormat;
   /** VFS root path for imported files */
@@ -52,9 +62,81 @@ export interface ImportOptions {
   enableHistory?: boolean;
   /** Chunk size for streaming large imports (0 = no streaming) */
   chunkSize?: number;
-  /** Progress callback */
-  onProgress?: (progress: ImportProgress) => void;
+  /**
+   * Progress callback for tracking import progress (v4.2.0+)
+   *
+   * **Streaming Architecture** (always enabled):
+   * - Indexes are flushed periodically during import (adaptive intervals)
+   * - Data is queryable progressively as import proceeds
+   * - `progress.queryable` is `true` after each flush
+   * - Provides crash resilience and live monitoring
+   *
+   * **Adaptive Flush Intervals**:
+   * - <1K entities: Flush every 100 entities (max 10 flushes)
+   * - 1K-10K entities: Flush every 1000 entities (10-100 flushes)
+   * - >10K entities: Flush every 5000 entities (low overhead)
+   *
+   * **Performance**:
+   * - Flush overhead: ~5-50ms per flush (~0.3% total time)
+   * - No configuration needed - works optimally out of the box
+   *
+   * @example
+   * ```typescript
+   * // Monitor import progress with live queries
+   * await brain.import(file, {
+   *   onProgress: async (progress) => {
+   *     console.log(`${progress.processed}/${progress.total}`)
+   *
+   *     // Query data as it's imported!
+   *     if (progress.queryable) {
+   *       const count = await brain.count({ type: 'Product' })
+   *       console.log(`${count} products imported so far`)
+   *     }
+   *   }
+   * })
+   * ```
+   */
+  onProgress?: (progress: ImportProgress) => void | Promise<void>;
 }
+/**
+ * Deprecated import options from v3.x
+ * Using these will cause TypeScript compile errors
+ *
+ * @deprecated These options are no longer supported in v4.x
+ * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+ */
+export interface DeprecatedImportOptions {
+  /**
+   * @deprecated Use `enableRelationshipInference` instead
+   * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+   */
+  extractRelationships?: never;
+  /**
+   * @deprecated Removed in v4.x - auto-detection is now always enabled
+   * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+   */
+  autoDetect?: never;
+  /**
+   * @deprecated Use `vfsPath` to specify the directory path instead
+   * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+   */
+  createFileStructure?: never;
+  /**
+   * @deprecated Removed in v4.x - all sheets are now processed automatically
+   * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+   */
+  excelSheets?: never;
+  /**
+   * @deprecated Removed in v4.x - table extraction is now automatic for PDF imports
+   * @see {@link https://brainy.dev/docs/guides/migrating-to-v4 Migration Guide}
+   */
+  pdfExtractTables?: never;
+}
+/**
+ * Complete import options interface
+ * Combines valid v4.x options with deprecated v3.x options (which cause TypeScript errors)
+ */
+export type ImportOptions = ValidImportOptions & DeprecatedImportOptions;
 export interface ImportProgress {
   stage: 'detecting' | 'extracting' | 'storing-vfs' | 'storing-graph' | 'relationships' | 'complete';
   /** Phase of import - extraction or relationship building (v3.49.0) */
@@ -70,6 +152,15 @@ export interface ImportProgress {
   throughput?: number;
   /** Estimated time remaining in ms (v3.38.0) */
   eta?: number;
+  /**
+   * Whether data is queryable at this point (v4.2.0+)
+   *
+   * When true, indexes have been flushed and queries will return up-to-date results.
+   * When false, data exists in storage but indexes may not be current (queries may be slower/incomplete).
+   *
+   * Only present during streaming imports with flushInterval > 0.
+   */
+  queryable?: boolean;
 }
 export interface ImportResult {
   /** Import ID for history tracking */
@@ -127,6 +218,8 @@ export declare class ImportCoordinator {
   private csvImporter;
   private jsonImporter;
   private markdownImporter;
+  private yamlImporter;
+  private docxImporter;
   private vfsGenerator;
   constructor(brain: Brainy);
   /**
@@ -139,12 +232,27 @@ export declare class ImportCoordinator {
   getHistory(): ImportHistory;
   /**
    * Import from any source with auto-detection
+   * v4.2.0: Now supports URL imports with authentication
    */
-  import(source: Buffer | string | object, options?: ImportOptions): Promise<ImportResult>;
+  import(source: Buffer | string | object | ImportSource, options?: ImportOptions): Promise<ImportResult>;
   /**
    * Normalize source to ImportSource
+   * v4.2.0: Now async to support URL fetching
    */
   private normalizeSource;
+  /**
+   * Check if value is an ImportSource object
+   */
+  private isImportSource;
+  /**
+   * Check if string is a URL
+   */
+  private isUrl;
+  /**
+   * Fetch content from URL
+   * v4.2.0: Supports authentication and custom headers
+   */
+  private fetchUrl;
   /**
    * Check if string is a file path
    */
@@ -165,4 +273,46 @@ export declare class ImportCoordinator {
   /**
    * Normalize extraction result to unified format (Excel-like structure)
    */
   private normalizeExtractionResult;
+  /**
+   * Validate options and reject deprecated v3.x options (v4.0.0+)
+   * Throws clear errors with migration guidance
+   */
+  private validateOptions;
+  /**
+   * Build detailed error message for invalid options
+   * Respects LOG_LEVEL for verbosity (detailed in dev, concise in prod)
+   */
+  private buildValidationErrorMessage;
+  /**
+   * Get progressive flush interval based on CURRENT entity count (v4.2.0+)
+   *
+   * Unlike adaptive intervals (which require knowing total count upfront),
+   * progressive intervals adjust dynamically as import proceeds.
+   *
+   * Thresholds:
+   * - 0-999 entities: Flush every 100 (frequent updates for better UX)
+   * - 1K-9.9K entities: Flush every 1000 (balanced performance/responsiveness)
+   * - 10K+ entities: Flush every 5000 (performance focused, minimal overhead)
+   *
+   * Benefits:
+   * - Works with known totals (file imports)
+   * - Works with unknown totals (streaming APIs, database cursors)
+   * - Frequent updates early when user is watching
+   * - Efficient processing later when performance matters
+   * - Low overhead (~0.3% for large imports)
+   * - No configuration required
+   *
+   * Example:
+   * - Import with 50K entities:
+   *   - Flushes at: 100, 200, ..., 900 (9 flushes with interval=100)
+   *   - Interval increases to 1000 at entity #1000
+   *   - Flushes at: 1000, 2000, ..., 9000 (9 more flushes)
+   *   - Interval increases to 5000 at entity #10000
+   *   - Flushes at: 10000, 15000, ..., 50000 (8 more flushes)
+   *   - Total: ~26 flushes = ~1.3s overhead = 0.026% of import time
+   *
+   * @param currentEntityCount - Current number of entities imported so far
+   * @returns Current optimal flush interval
+   */
+  private getProgressiveFlushInterval;
 }
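The progressive flush schedule documented for `getProgressiveFlushInterval` (every 100 entities below 1K, every 1000 up to 10K, every 5000 beyond) is simple to sketch. The implementation below is an illustration written from the thresholds in the JSDoc, not the package's actual code; simulating a 50K-entity import with it yields 27 flushes, in line with the comment's ballpark of ~26:

```typescript
// Sketch of the progressive flush schedule per the JSDoc thresholds:
// 0-999 entities -> every 100; 1,000-9,999 -> every 1,000; 10,000+ -> every 5,000.
function getProgressiveFlushInterval(currentEntityCount: number): number {
  if (currentEntityCount < 1_000) return 100
  if (currentEntityCount < 10_000) return 1_000
  return 5_000
}

// Count the flushes a 50K-entity import would perform under this schedule.
let flushes = 0
let nextFlush = getProgressiveFlushInterval(0)
for (let n = 1; n <= 50_000; n++) {
  if (n === nextFlush) {
    flushes++
    // Interval is re-evaluated against the CURRENT count, so it grows
    // automatically as the import proceeds -- no total needed upfront.
    nextFlush = n + getProgressiveFlushInterval(n)
  }
}
console.log(flushes) // 27
```

Because the interval depends only on the running count, the same schedule works whether the total is known (file imports) or unknown (streaming sources), which is exactly the benefit the JSDoc calls out.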