@adverant/nexus-memory-skill 2.1.0 → 2.2.0

package/SKILL.md CHANGED
@@ -418,6 +418,145 @@ curl -X POST "https://api.adverant.ai/fileprocess/api/process" \
 
  ---
 
+ ## Upload Document Hook (Full Knowledge Extraction)
+
+ The `upload-document.sh` hook provides a streamlined way to upload files with **full auto-discovery and knowledge extraction** enabled by default.
+
+ ### Auto-Discovery Features (All Enabled by Default)
+
+ When you upload a document, the system automatically performs the following:
+
+ | Feature | Description | Accuracy |
+ |---------|-------------|----------|
+ | **Smart File Detection** | Magic-byte detection for accurate MIME type | 100% |
+ | **Intelligent Routing** | Auto-routes to MageAgent, VideoAgent, or CyberAgent | Automatic |
+ | **3-Tier OCR Cascade** | Tesseract → GPT-4o Vision → Claude Opus | Auto-escalates |
+ | **Layout Analysis** | Document structure preservation | 99.2% |
+ | **Table Extraction** | Tables converted to structured data | 97.9% |
+ | **Document DNA** | Triple-layer storage (semantic + structural + original) | Full fidelity |
+ | **Entity Extraction** | People, places, organizations → Neo4j | Automatic |
+ | **Vector Embeddings** | VoyageAI embeddings → Qdrant | Automatic |
+
+ ### Basic Usage
+
+ ```bash
+ # Upload a single file
+ ~/.claude/hooks/upload-document.sh ./document.pdf
+
+ # Upload and wait for processing to complete
+ ~/.claude/hooks/upload-document.sh ./book.pdf --wait
+
+ # Upload with custom tags for easier recall
+ ~/.claude/hooks/upload-document.sh ./research.pdf --wait --tags=research,ai,papers
+ ```
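+
+ Without `--wait`, the hook prints a job ID and returns immediately, so you can poll the job yourself. A minimal sketch, assuming `$JOBS_URL` is the jobs endpoint the hook prints in its own status hint and `$JOB_ID` is the ID from the upload output:
+
+ ```bash
+ # Fire-and-forget upload; note the printed "Job ID: ..." line
+ ~/.claude/hooks/upload-document.sh ./document.pdf
+
+ # Later, poll the job status (mirrors the hook's internal status check)
+ curl -s "$JOBS_URL/$JOB_ID" \
+   -H "Authorization: Bearer $NEXUS_API_KEY" | jq '.state // .status'
+ ```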
+
+ ### Batch Upload (Multiple Files)
+
+ ```bash
+ # Upload 3 books at once
+ ~/.claude/hooks/upload-document.sh book1.pdf book2.pdf book3.pdf --batch --wait
+
+ # Upload an entire directory of PDFs
+ ~/.claude/hooks/upload-document.sh ./docs/*.pdf --batch --wait --tags=documentation
+ ```
+
+ ### Processing Options
+
+ | Flag | Description |
+ |------|-------------|
+ | `--wait` | Wait for processing to complete and show results |
+ | `--batch` | Enable batch mode for multiple files |
+ | `--tags=a,b,c` | Add custom tags for easier recall |
+ | `--no-entities` | Skip entity extraction (faster, but less rich) |
+ | `--prefer-speed` | Use faster OCR (may reduce accuracy) |
+ | `--poll-interval=N` | Poll interval in seconds (default: 5) |
+
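+ These flags combine freely. For example, a speed-optimized batch run over scanned files, tagged for later recall (file names illustrative):
+
+ ```bash
+ ~/.claude/hooks/upload-document.sh scans/*.pdf --batch --wait \
+   --prefer-speed --poll-interval=15 --tags=archive,scans
+ ```
+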
+ ### Supported File Types
+
+ The upload hook supports **ALL file types** through intelligent routing:
+
+ - **Documents**: PDF, DOCX, DOC, TXT, MD, HTML, CSV, XML, JSON
+ - **Images**: JPEG, PNG, GIF, TIFF, WebP (with OCR)
+ - **Videos**: MP4, MOV, AVI, MKV, WebM, FLV
+ - **Archives**: ZIP, RAR, 7z, TAR, TAR.GZ, TAR.BZ2
+ - **Geospatial**: GeoJSON, Shapefile, GeoTIFF, KML
+ - **Point Cloud**: LAS, LAZ, PLY, PCD, E57
+ - **Code**: Any programming language
+ - **Any binary format** (routed to CyberAgent if suspicious)
+
+ ### Example Output (`--wait` mode)
+
+ ```
+ [upload-document] Uploading: research-paper.pdf (12MB)
+ Job ID: abc123-def456-ghi789
+
+ ╔══════════════════════════════════════════════════════════════╗
+ ║ PROCESSING COMPLETE: research-paper.pdf
+ ╚══════════════════════════════════════════════════════════════╝
+
+ 📄 Document Type: academic_paper
+ 📑 Pages: 42
+ 📝 Words: 15,230
+
+ 🔍 Auto-Discovery Results:
+   • OCR Tier Used: tesseract (text-based PDF)
+   • Tables Found: 8
+   • Entities: 127
+   • GraphRAG: true
+
+ 🏷️ Extracted Entities:
+   • Dr. Emily Chen (person)
+   • Stanford University (organization)
+   • NeurIPS 2024 (event)
+   • Transformer Architecture (concept)
+   ... and 123 more
+
+ 💡 To recall this content:
+   echo '{"query": "<your search>"}' | recall-memory.sh
+ ```
+
+ ### Recalling Uploaded Content
+
+ After upload, documents are immediately searchable:
+
+ ```bash
+ # Search by content
+ echo '{"query": "transformer architecture research"}' | ~/.claude/hooks/recall-memory.sh
+
+ # Search by entity
+ echo '{"query": "papers by Dr. Emily Chen"}' | ~/.claude/hooks/recall-memory.sh
+
+ # Search by tag
+ echo '{"query": "research papers tagged ai"}' | ~/.claude/hooks/recall-memory.sh
+ ```
+
+ ### Environment Variables
+
+ | Variable | Default | Description |
+ |----------|---------|-------------|
+ | `NEXUS_API_KEY` | (required) | API key for authentication |
+ | `NEXUS_API_URL` | `https://api.adverant.ai` | API endpoint |
+ | `NEXUS_COMPANY_ID` | `adverant` | Company identifier |
+ | `NEXUS_APP_ID` | `claude-code` | Application identifier |
+ | `NEXUS_VERBOSE` | `0` | Set to `1` for debug output |
+
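+ A minimal setup before invoking the hook; the key value is a placeholder, and any unset variables fall back to the defaults above:
+
+ ```bash
+ export NEXUS_API_KEY="<your-api-key>"   # required
+ export NEXUS_VERBOSE=1                  # optional: verbose debug output
+ ~/.claude/hooks/upload-document.sh ./document.pdf --wait
+ ```
+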
+ ### Troubleshooting
+
+ **File too large:**
+ The maximum file size is 5GB. For larger files, consider splitting them first (see the sketch below) or using the direct API.
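+
+ One possible way to split an oversized PDF before uploading, assuming the `qpdf` CLI is installed (chunk size illustrative):
+
+ ```bash
+ # Split big.pdf into 500-page chunks; qpdf fills %d with each page range
+ qpdf --split-pages=500 big.pdf big-part-%d.pdf
+
+ # Upload the resulting chunks as a batch
+ ~/.claude/hooks/upload-document.sh big-part-*.pdf --batch --wait
+ ```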
+
+ **Processing takes too long:**
+ - Large PDFs with many images trigger the OCR cascade
+ - Use `--prefer-speed` for faster (but less accurate) processing
+ - Increase `--poll-interval` for large batches
+
+ **Entities not extracted:**
+ - Ensure you didn't use `--no-entities`
+ - Set `NEXUS_VERBOSE=1` and check the logs for details
+ - Some file types don't support entity extraction
+
+ ---
+
  ## Store Memory
 
  ```bash
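package/hooks/upload-document.sh CHANGED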
@@ -1,7 +1,19 @@
  #!/bin/bash
  #
- # Nexus Memory - Upload Document Hook
- # Uploads documents to FileProcessAgent for intelligent processing.
+ # Nexus Memory - Upload Document Hook (v2.2.0)
+ # Uploads documents to FileProcessAgent for intelligent processing with
+ # FULL KNOWLEDGE EXTRACTION enabled by default.
+ #
+ # Auto-Discovery Features (enabled automatically):
+ #   - Smart file type detection via magic bytes
+ #   - Intelligent routing: MageAgent (docs), VideoAgent (video), CyberAgent (binaries)
+ #   - 3-tier OCR cascade: Tesseract → GPT-4o Vision → Claude Opus (auto-escalates)
+ #   - Layout analysis: 99.2% accuracy (Dockling-level)
+ #   - Table extraction: 97.9% accuracy
+ #   - Document DNA: Triple-layer storage (semantic + structural + original)
+ #   - Entity extraction → Neo4j knowledge graph
+ #   - Vector embeddings → Qdrant for semantic search
+ #   - Content findable via recall-memory.sh
  #
  # Supports ALL file types including:
  #   - Documents: PDF, DOCX, TXT, MD, HTML, etc.
@@ -10,15 +22,21 @@
  #   - Archives: ZIP, RAR, 7z, TAR, GZIP
  #   - Geospatial: GeoJSON, Shapefile, GeoTIFF, KML (via intelligent routing)
  #   - Point Cloud: LAS, LAZ, PLY, PCD, E57 (via intelligent routing)
+ #   - Code repositories: Automatically detected and processed
  #   - Any other binary format (routed to appropriate processor)
  #
  # Usage:
- #   upload-document.sh <file_path> [--wait] [--poll-interval=5]
+ #   upload-document.sh <file_path> [options]
+ #   upload-document.sh <file1> <file2> ... --batch [options]
  #
  # Arguments:
- #   file_path            Path to the file to upload (required)
- #   --wait               Wait for processing to complete and return results
- #   --poll-interval=N    Poll interval in seconds (default: 5)
+ #   file_path            Path to the file(s) to upload (required)
+ #   --wait               Wait for processing to complete and return results
+ #   --poll-interval=N    Poll interval in seconds (default: 5)
+ #   --batch              Process multiple files (list files before this flag)
+ #   --tags=a,b,c         Add custom tags for recall (comma-separated)
+ #   --no-entities        Skip entity extraction to knowledge graph
+ #   --prefer-speed       Use faster OCR (may reduce accuracy for scanned docs)
  #
  # Environment Variables:
  #   NEXUS_API_KEY - API key for authentication (REQUIRED)
@@ -29,9 +47,10 @@
  #
  # Examples:
  #   upload-document.sh ./document.pdf
- #   upload-document.sh ./large-dataset.csv --wait
+ #   upload-document.sh ./book.pdf --wait
+ #   upload-document.sh ./data.csv --wait --tags=dataset,sales
+ #   upload-document.sh book1.pdf book2.pdf book3.pdf --batch --wait
  #   upload-document.sh ./video.mp4 --wait --poll-interval=10
- #   upload-document.sh ./pointcloud.las --wait
  #
 
  set -o pipefail
@@ -63,12 +82,27 @@ log_info() {
  }
 
  print_usage() {
-   echo "Usage: upload-document.sh <file_path> [--wait] [--poll-interval=N]"
+   echo "Usage: upload-document.sh <file_path> [options]"
+   echo "       upload-document.sh <file1> <file2> ... --batch [options]"
    echo ""
    echo "Arguments:"
-   echo "  file_path            Path to the file to upload (required)"
+   echo "  file_path            Path to the file(s) to upload (required)"
    echo "  --wait               Wait for processing to complete"
    echo "  --poll-interval=N    Poll interval in seconds (default: 5)"
+   echo "  --batch              Process multiple files (list files before this flag)"
+   echo "  --tags=a,b,c         Add custom tags for recall (comma-separated)"
+   echo "  --no-entities        Skip entity extraction to knowledge graph"
+   echo "  --prefer-speed       Use faster OCR (may reduce accuracy)"
+   echo ""
+   echo "Auto-Discovery Features (enabled by default):"
+   echo "  • Smart file type detection via magic bytes"
+   echo "  • Intelligent routing: MageAgent, VideoAgent, CyberAgent"
+   echo "  • 3-tier OCR cascade (auto-escalates for quality)"
+   echo "  • Layout analysis (99.2% accuracy)"
+   echo "  • Table extraction (97.9% accuracy)"
+   echo "  • Entity extraction → Knowledge graph"
+   echo "  • Vector embeddings → Semantic search"
+   echo "  • Content findable via recall-memory.sh"
    echo ""
    echo "Supported file types:"
    echo "  Documents:   PDF, DOCX, DOC, TXT, MD, HTML, CSV, XML, JSON"
@@ -77,13 +111,16 @@ print_usage() {
    echo "  Archives:    ZIP, RAR, 7z, TAR, TAR.GZ, TAR.BZ2"
    echo "  Geospatial:  GeoJSON, Shapefile, GeoTIFF, KML"
    echo "  Point Cloud: LAS, LAZ, PLY, PCD, E57"
+   echo "  Code:        Any programming language"
    echo "  Any other binary format"
    echo ""
    echo "Maximum file size: 5GB"
    echo ""
    echo "Examples:"
    echo "  upload-document.sh ./document.pdf"
-   echo "  upload-document.sh ./data.csv --wait"
+   echo "  upload-document.sh ./book.pdf --wait"
+   echo "  upload-document.sh ./data.csv --wait --tags=dataset,sales"
+   echo "  upload-document.sh book1.pdf book2.pdf book3.pdf --batch --wait"
    echo "  upload-document.sh ./video.mp4 --wait --poll-interval=10"
  }
 
@@ -109,9 +146,13 @@ if ! command -v jq &> /dev/null; then
  fi
 
  # Parse arguments
- FILE_PATH=""
+ FILES=()
  WAIT_FOR_COMPLETION=0
  POLL_INTERVAL=5
+ BATCH_MODE=0
+ CUSTOM_TAGS=""
+ EXTRACT_ENTITIES=1
+ PREFER_SPEED=0
 
  while [[ $# -gt 0 ]]; do
    case $1 in
@@ -123,6 +164,22 @@ while [[ $# -gt 0 ]]; do
        POLL_INTERVAL="${1#*=}"
        shift
        ;;
+     --batch)
+       BATCH_MODE=1
+       shift
+       ;;
+     --tags=*)
+       CUSTOM_TAGS="${1#*=}"
+       shift
+       ;;
+     --no-entities)
+       EXTRACT_ENTITIES=0
+       shift
+       ;;
+     --prefer-speed)
+       PREFER_SPEED=1
+       shift
+       ;;
      --help|-h)
        print_usage
        exit 0
@@ -133,153 +190,348 @@ while [[ $# -gt 0 ]]; do
        exit 1
        ;;
      *)
-       if [[ -z "$FILE_PATH" ]]; then
-         FILE_PATH="$1"
-       else
-         log_error "Unexpected argument: $1"
-         print_usage
-         exit 1
-       fi
+       # Collect file paths
+       FILES+=("$1")
        shift
        ;;
    esac
  done
 
- # Validate file path
- if [[ -z "$FILE_PATH" ]]; then
-   log_error "File path is required"
+ # Validate files
+ if [[ ${#FILES[@]} -eq 0 ]]; then
+   log_error "At least one file path is required"
    print_usage
    exit 1
  fi
 
- if [[ ! -f "$FILE_PATH" ]]; then
-   log_error "File not found: $FILE_PATH"
+ # If not batch mode but multiple files provided, error
+ if [[ "$BATCH_MODE" == "0" ]] && [[ ${#FILES[@]} -gt 1 ]]; then
+   log_error "Multiple files require --batch flag"
+   log_error "Usage: upload-document.sh file1.pdf file2.pdf --batch"
    exit 1
  fi
 
- # Get file info
- FILE_NAME=$(basename "$FILE_PATH")
- FILE_SIZE=$(wc -c < "$FILE_PATH" | tr -d ' ')
- FILE_SIZE_MB=$((FILE_SIZE / 1024 / 1024))
+ # Validate all files exist
+ for file in "${FILES[@]}"; do
+   if [[ ! -f "$file" ]]; then
+     log_error "File not found: $file"
+     exit 1
+   fi
+ done
 
- # Check file size (max 5GB)
- MAX_SIZE=$((5 * 1024 * 1024 * 1024))
- if [[ "$FILE_SIZE" -gt "$MAX_SIZE" ]]; then
-   log_error "File too large. Maximum size is 5GB. Your file: ${FILE_SIZE_MB}MB"
-   exit 1
- fi
+ # Build processing metadata JSON with hints for aggressive extraction
+ build_metadata() {
+   local file_name="$1"
+   local tags_json="[]"
 
- log "File: $FILE_PATH"
- log "Size: $FILE_SIZE bytes (${FILE_SIZE_MB}MB)"
+   # Convert comma-separated tags to JSON array
+   if [[ -n "$CUSTOM_TAGS" ]]; then
+     tags_json=$(echo "$CUSTOM_TAGS" | tr ',' '\n' | jq -R . | jq -s .)
+   fi
 
- # Display upload info
- if [[ "$FILE_SIZE_MB" -gt 100 ]]; then
-   log_info "Uploading large file: $FILE_NAME (${FILE_SIZE_MB}MB) - this may take a while..."
- else
-   log_info "Uploading: $FILE_NAME (${FILE_SIZE_MB}MB)"
- fi
+   # Determine OCR preference
+   local prefer_accuracy="true"
+   if [[ "$PREFER_SPEED" == "1" ]]; then
+     prefer_accuracy="false"
+   fi
 
- # Upload file via multipart form
- log "Uploading to $FILEPROCESS_URL"
+   # Determine entity extraction
+   local extract_entities="true"
+   if [[ "$EXTRACT_ENTITIES" == "0" ]]; then
+     extract_entities="false"
+   fi
 
- RESPONSE=$(curl -s -w "\n%{http_code}" -X POST "$FILEPROCESS_URL" \
-   -H "Authorization: Bearer $NEXUS_API_KEY" \
-   -H "X-Company-ID: $COMPANY_ID" \
-   -H "X-App-ID: $APP_ID" \
-   -H "X-User-ID: ${USER:-unknown}" \
-   -F "file=@${FILE_PATH}" \
-   -F "userId=${USER:-unknown}" \
-   --max-time 600 2>&1)
+   cat <<EOF
+ {
+   "source": "nexus-memory-skill",
+   "version": "2.2.0",
+   "preferAccuracy": ${prefer_accuracy},
+   "forceEntityExtraction": ${extract_entities},
+   "storeInKnowledgeGraph": ${extract_entities},
+   "enableDocumentDNA": true,
+   "tags": ${tags_json},
+   "uploadedBy": "${USER:-unknown}",
+   "uploadedAt": "$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
+ }
+ EOF
+ }
 
- # Parse response
- HTTP_CODE=$(echo "$RESPONSE" | tail -n1)
- BODY=$(echo "$RESPONSE" | sed '$d')
+ # Function to upload a single file
+ upload_file() {
+   local FILE_PATH="$1"
+   local FILE_NAME=$(basename "$FILE_PATH")
+   local FILE_SIZE=$(wc -c < "$FILE_PATH" | tr -d ' ')
+   local FILE_SIZE_MB=$((FILE_SIZE / 1024 / 1024))
+
+   # Check file size (max 5GB)
+   local MAX_SIZE=$((5 * 1024 * 1024 * 1024))
+   if [[ "$FILE_SIZE" -gt "$MAX_SIZE" ]]; then
+     log_error "File too large: $FILE_NAME (${FILE_SIZE_MB}MB). Maximum size is 5GB."
+     return 1
+   fi
 
- log "Response code: $HTTP_CODE"
+   log "File: $FILE_PATH"
+   log "Size: $FILE_SIZE bytes (${FILE_SIZE_MB}MB)"
 
- # Check for upload errors (200, 201, 202 are all success codes)
- if [[ "$HTTP_CODE" != "200" ]] && [[ "$HTTP_CODE" != "201" ]] && [[ "$HTTP_CODE" != "202" ]]; then
-   log_error "Failed to upload document (HTTP $HTTP_CODE)"
-   if [[ -n "$BODY" ]]; then
+   # Display upload info
+   if [[ "$FILE_SIZE_MB" -gt 100 ]]; then
+     log_info "Uploading large file: $FILE_NAME (${FILE_SIZE_MB}MB) - this may take a while..."
+   else
+     log_info "Uploading: $FILE_NAME (${FILE_SIZE_MB}MB)"
+   fi
+
+   # Build metadata with processing hints
+   local METADATA=$(build_metadata "$FILE_NAME")
+   log "Metadata: $METADATA"
+
+   # Upload file via multipart form with metadata hints
+   log "Uploading to $FILEPROCESS_URL"
+
+   local RESPONSE=$(curl -s -w "\n%{http_code}" -X POST "$FILEPROCESS_URL" \
+     -H "Authorization: Bearer $NEXUS_API_KEY" \
+     -H "X-Company-ID: $COMPANY_ID" \
+     -H "X-App-ID: $APP_ID" \
+     -H "X-User-ID: ${USER:-unknown}" \
+     -F "file=@${FILE_PATH}" \
+     -F "userId=${USER:-unknown}" \
+     -F "metadata=${METADATA}" \
+     --max-time 600 2>&1)
+
+   # Parse response
+   local HTTP_CODE=$(echo "$RESPONSE" | tail -n1)
+   local BODY=$(echo "$RESPONSE" | sed '$d')
+
+   log "Response code: $HTTP_CODE"
+
+   # Check for upload errors (200, 201, 202 are all success codes)
+   if [[ "$HTTP_CODE" != "200" ]] && [[ "$HTTP_CODE" != "201" ]] && [[ "$HTTP_CODE" != "202" ]]; then
+     log_error "Failed to upload document (HTTP $HTTP_CODE)"
+     if [[ -n "$BODY" ]]; then
+       echo "$BODY" | jq . 2>/dev/null || echo "$BODY"
+     fi
+     return 1
+   fi
+
+   # Parse job ID from response
+   local JOB_ID=$(echo "$BODY" | jq -r '.jobId // empty')
+
+   if [[ -z "$JOB_ID" ]]; then
+     log_error "No job ID returned from upload"
      echo "$BODY" | jq . 2>/dev/null || echo "$BODY"
+     return 1
    fi
-   exit 1
- fi
 
- # Parse job ID from response
- JOB_ID=$(echo "$BODY" | jq -r '.jobId // empty')
+   log_info "Document queued for processing"
+   echo "Job ID: $JOB_ID"
 
- if [[ -z "$JOB_ID" ]]; then
-   log_error "No job ID returned from upload"
-   echo "$BODY" | jq . 2>/dev/null || echo "$BODY"
-   exit 1
- fi
+   # Return job ID for tracking
+   echo "$JOB_ID"
+ }
 
- log_info "Document queued for processing"
- echo "Job ID: $JOB_ID"
+ # Function to display detailed results
+ display_results() {
+   local STATUS_RESPONSE="$1"
+   local FILE_NAME="$2"
 
- # If not waiting, exit here
- if [[ "$WAIT_FOR_COMPLETION" == "0" ]]; then
    echo ""
-   echo "To check status: curl -s \"$JOBS_URL/$JOB_ID\" | jq ."
-   exit 0
- fi
-
- # Wait for processing to complete
- log_info "Waiting for processing to complete (polling every ${POLL_INTERVAL}s)..."
+   echo "╔══════════════════════════════════════════════════════════════╗"
+   echo "║ PROCESSING COMPLETE: $FILE_NAME"
+   echo "╚══════════════════════════════════════════════════════════════╝"
+   echo ""
 
- MAX_WAIT=3600  # 1 hour max wait
- WAITED=0
+   # Extract key metrics from response
+   local ENTITY_COUNT=$(echo "$STATUS_RESPONSE" | jq -r '.result.entities // .entities // [] | length' 2>/dev/null)
+   local TABLE_COUNT=$(echo "$STATUS_RESPONSE" | jq -r '.result.tables // .tables // [] | length' 2>/dev/null)
+   local OCR_TIER=$(echo "$STATUS_RESPONSE" | jq -r '.result.ocrTier // .ocrTier // "auto"' 2>/dev/null)
+   local PAGE_COUNT=$(echo "$STATUS_RESPONSE" | jq -r '.result.pageCount // .pageCount // "N/A"' 2>/dev/null)
+   local WORD_COUNT=$(echo "$STATUS_RESPONSE" | jq -r '.result.wordCount // .wordCount // "N/A"' 2>/dev/null)
+   local DOC_TYPE=$(echo "$STATUS_RESPONSE" | jq -r '.result.documentType // .documentType // "unknown"' 2>/dev/null)
+   local GRAPHRAG_STORED=$(echo "$STATUS_RESPONSE" | jq -r '.result.storedInGraphRAG // .storedInGraphRAG // false' 2>/dev/null)
+
+   echo "📄 Document Type: $DOC_TYPE"
+   echo "📑 Pages: $PAGE_COUNT"
+   echo "📝 Words: $WORD_COUNT"
+   echo ""
+   echo "🔍 Auto-Discovery Results:"
+   echo "  • OCR Tier Used: $OCR_TIER"
+   echo "  • Tables Found: $TABLE_COUNT"
+   echo "  • Entities: $ENTITY_COUNT"
+   echo "  • GraphRAG: $GRAPHRAG_STORED"
+   echo ""
 
- while [[ "$WAITED" -lt "$MAX_WAIT" ]]; do
-   sleep "$POLL_INTERVAL"
-   WAITED=$((WAITED + POLL_INTERVAL))
+   # Show extracted entities if any
+   if [[ "$ENTITY_COUNT" != "0" ]] && [[ "$ENTITY_COUNT" != "null" ]]; then
+     echo "🏷️ Extracted Entities:"
+     echo "$STATUS_RESPONSE" | jq -r '.result.entities // .entities // [] | .[:10][] | "  • \(.name // .text) (\(.type // "entity"))"' 2>/dev/null
+     if [[ "$ENTITY_COUNT" -gt 10 ]]; then
+       echo "  ... and $((ENTITY_COUNT - 10)) more"
+     fi
+     echo ""
+   fi
 
-   # Check job status
-   STATUS_RESPONSE=$(curl -s "$JOBS_URL/$JOB_ID" \
-     -H "Authorization: Bearer $NEXUS_API_KEY" \
-     -H "X-Company-ID: $COMPANY_ID" \
-     -H "X-App-ID: $APP_ID" \
-     -H "X-User-ID: ${USER:-unknown}" \
-     --max-time 30 2>/dev/null)
+   # Show recall command
+   echo "💡 To recall this content:"
+   echo "  echo '{\"query\": \"<your search>\"}' | recall-memory.sh"
+   echo ""
 
-   if [[ -z "$STATUS_RESPONSE" ]]; then
-     log "Waiting... (${WAITED}s elapsed)"
-     continue
+   # Show full JSON if verbose
+   if [[ "$VERBOSE" == "1" ]]; then
+     echo "=== FULL RESPONSE ==="
+     echo "$STATUS_RESPONSE" | jq .
    fi
+ }
 
- JOB_STATE=$(echo "$STATUS_RESPONSE" | jq -r '.state // .status // empty')
+ # Function to wait for job completion
+ wait_for_job() {
+   local JOB_ID="$1"
+   local FILE_NAME="$2"
+
+   log_info "Waiting for processing to complete (polling every ${POLL_INTERVAL}s)..."
+
+   local MAX_WAIT=3600  # 1 hour max wait
+   local WAITED=0
+
+   while [[ "$WAITED" -lt "$MAX_WAIT" ]]; do
+     sleep "$POLL_INTERVAL"
+     WAITED=$((WAITED + POLL_INTERVAL))
+
+     # Check job status
+     local STATUS_RESPONSE=$(curl -s "$JOBS_URL/$JOB_ID" \
+       -H "Authorization: Bearer $NEXUS_API_KEY" \
+       -H "X-Company-ID: $COMPANY_ID" \
+       -H "X-App-ID: $APP_ID" \
+       -H "X-User-ID: ${USER:-unknown}" \
+       --max-time 30 2>/dev/null)
+
+     if [[ -z "$STATUS_RESPONSE" ]]; then
+       log "Waiting... (${WAITED}s elapsed)"
+       continue
+     fi
+
+     local JOB_STATE=$(echo "$STATUS_RESPONSE" | jq -r '.state // .status // empty')
+
+     case "$JOB_STATE" in
+       "completed"|"finished"|"success")
+         display_results "$STATUS_RESPONSE" "$FILE_NAME"
+         return 0
+         ;;
+       "failed"|"error")
+         log_error "Processing failed for: $FILE_NAME"
+         echo ""
+         echo "=== ERROR DETAILS ==="
+         echo "$STATUS_RESPONSE" | jq .
+         return 1
+         ;;
+       "waiting"|"active"|"processing"|"pending")
+         local PROGRESS=$(echo "$STATUS_RESPONSE" | jq -r '.progress // empty')
+         local STAGE=$(echo "$STATUS_RESPONSE" | jq -r '.stage // empty')
+         if [[ -n "$PROGRESS" ]] && [[ -n "$STAGE" ]]; then
+           log "[$FILE_NAME] ${STAGE}: ${PROGRESS}% (${WAITED}s elapsed)"
+         elif [[ -n "$PROGRESS" ]]; then
+           log "[$FILE_NAME] Processing... ${PROGRESS}% (${WAITED}s elapsed)"
+         else
+           log "[$FILE_NAME] Processing... (${WAITED}s elapsed)"
+         fi
+         ;;
+       *)
+         log "[$FILE_NAME] Status: $JOB_STATE (${WAITED}s elapsed)"
+         ;;
+     esac
+   done
+
+   log_error "Timeout waiting for processing (${MAX_WAIT}s)"
+   echo "Job may still be processing. Check status manually:"
+   echo "curl -s \"$JOBS_URL/$JOB_ID\" | jq ."
+   return 1
+ }
 
- case "$JOB_STATE" in
-   "completed"|"finished"|"success")
-     log_info "Processing completed!"
-     echo ""
-     echo "=== PROCESSING RESULT ==="
-     echo "$STATUS_RESPONSE" | jq .
-     exit 0
-     ;;
-   "failed"|"error")
-     log_error "Processing failed!"
-     echo ""
-     echo "=== ERROR DETAILS ==="
-     echo "$STATUS_RESPONSE" | jq .
-     exit 1
-     ;;
-   "waiting"|"active"|"processing"|"pending")
-     PROGRESS=$(echo "$STATUS_RESPONSE" | jq -r '.progress // empty')
-     if [[ -n "$PROGRESS" ]]; then
-       log "Processing... ${PROGRESS}% (${WAITED}s elapsed)"
-     else
-       log "Processing... (${WAITED}s elapsed)"
+ # ============================================================================
+ # MAIN EXECUTION
+ # ============================================================================
+
+ # Track all job IDs for batch mode
+ JOB_IDS=()
+ FILE_NAMES=()
+ FAILED_UPLOADS=0
+
+ # Display batch info
+ if [[ "$BATCH_MODE" == "1" ]]; then
+   log_info "Batch mode: uploading ${#FILES[@]} files"
+   echo ""
+ fi
+
+ # Upload all files
+ for file in "${FILES[@]}"; do
+   FILE_NAME=$(basename "$file")
+   FILE_NAMES+=("$FILE_NAME")
+
+   # Upload file and capture job ID (last line of output)
+   UPLOAD_OUTPUT=$(upload_file "$file" 2>&1)
+   UPLOAD_EXIT_CODE=$?
+
+   if [[ $UPLOAD_EXIT_CODE -eq 0 ]]; then
+     # Extract job ID from output (last non-empty line that looks like a job ID)
+     JOB_ID=$(echo "$UPLOAD_OUTPUT" | grep -E '^[a-f0-9-]+$' | tail -1)
+     if [[ -n "$JOB_ID" ]]; then
+       JOB_IDS+=("$JOB_ID")
+     else
+       # Try to extract from "Job ID: xxx" format
+       JOB_ID=$(echo "$UPLOAD_OUTPUT" | grep "Job ID:" | sed 's/Job ID: //' | tr -d ' ')
+       if [[ -n "$JOB_ID" ]]; then
+         JOB_IDS+=("$JOB_ID")
        fi
-     ;;
-   *)
-     log "Status: $JOB_STATE (${WAITED}s elapsed)"
-     ;;
- esac
+     fi
+     echo "$UPLOAD_OUTPUT"
+   else
+     log_error "Failed to upload: $FILE_NAME"
+     echo "$UPLOAD_OUTPUT"
+     FAILED_UPLOADS=$((FAILED_UPLOADS + 1))
+   fi
+
+   # Add spacing between files in batch mode
+   if [[ "$BATCH_MODE" == "1" ]]; then
+     echo ""
+   fi
  done
 
- log_error "Timeout waiting for processing (${MAX_WAIT}s)"
- echo "Job may still be processing. Check status manually:"
- echo "curl -s \"$JOBS_URL/$JOB_ID\" | jq ."
- exit 1
+ # Summary for batch mode
+ if [[ "$BATCH_MODE" == "1" ]]; then
+   echo "╔══════════════════════════════════════════════════════════════╗"
+   echo "║ UPLOAD SUMMARY                                                ║"
+   echo "╚══════════════════════════════════════════════════════════════╝"
+   echo "Total files: ${#FILES[@]}"
+   echo "Uploaded:    ${#JOB_IDS[@]}"
+   echo "Failed:      $FAILED_UPLOADS"
+   echo ""
+ fi
+
+ # If not waiting, exit here
+ if [[ "$WAIT_FOR_COMPLETION" == "0" ]]; then
+   if [[ ${#JOB_IDS[@]} -gt 0 ]]; then
+     echo "To check status:"
+     for i in "${!JOB_IDS[@]}"; do
+       echo "  curl -s \"$JOBS_URL/${JOB_IDS[$i]}\" | jq .  # ${FILE_NAMES[$i]}"
+     done
+   fi
+   exit $FAILED_UPLOADS
+ fi
+
+ # Wait for all jobs to complete
+ FAILED_JOBS=0
+ for i in "${!JOB_IDS[@]}"; do
+   JOB_ID="${JOB_IDS[$i]}"
+   FILE_NAME="${FILE_NAMES[$i]}"
+
+   if ! wait_for_job "$JOB_ID" "$FILE_NAME"; then
+     FAILED_JOBS=$((FAILED_JOBS + 1))
+   fi
+ done
+
+ # Final exit code
+ TOTAL_FAILURES=$((FAILED_UPLOADS + FAILED_JOBS))
+ if [[ "$TOTAL_FAILURES" -gt 0 ]]; then
+   log_error "$TOTAL_FAILURES file(s) failed to process"
+   exit 1
+ fi
+
+ exit 0
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@adverant/nexus-memory-skill",
-   "version": "2.1.0",
+   "version": "2.2.0",
    "description": "Claude Code skill for persistent memory via Nexus GraphRAG - store and recall memories across all sessions and projects",
    "main": "SKILL.md",
    "type": "module",