@miller-tech/uap 1.0.0 → 1.3.0
- package/dist/benchmarks/benchmark.d.ts +8 -8
- package/dist/benchmarks/improved-benchmark.d.ts.map +1 -1
- package/dist/benchmarks/improved-benchmark.js +10 -23
- package/dist/benchmarks/improved-benchmark.js.map +1 -1
- package/dist/benchmarks/model-integration.d.ts.map +1 -1
- package/dist/benchmarks/model-integration.js +22 -23
- package/dist/benchmarks/model-integration.js.map +1 -1
- package/dist/bin/policy.js +67 -11
- package/dist/bin/policy.js.map +1 -1
- package/dist/cli/dashboard.d.ts +2 -1
- package/dist/cli/dashboard.d.ts.map +1 -1
- package/dist/cli/dashboard.js +399 -10
- package/dist/cli/dashboard.js.map +1 -1
- package/dist/cli/model.js +12 -12
- package/dist/cli/model.js.map +1 -1
- package/dist/cli/setup-wizard.d.ts.map +1 -1
- package/dist/cli/setup-wizard.js +24 -0
- package/dist/cli/setup-wizard.js.map +1 -1
- package/dist/coordination/deploy-batcher.d.ts +1 -0
- package/dist/coordination/deploy-batcher.d.ts.map +1 -1
- package/dist/coordination/deploy-batcher.js +24 -25
- package/dist/coordination/deploy-batcher.js.map +1 -1
- package/dist/dashboard/data-service.d.ts +94 -0
- package/dist/dashboard/data-service.d.ts.map +1 -0
- package/dist/dashboard/data-service.js +286 -0
- package/dist/dashboard/data-service.js.map +1 -0
- package/dist/dashboard/index.d.ts +5 -0
- package/dist/dashboard/index.d.ts.map +1 -0
- package/dist/dashboard/index.js +3 -0
- package/dist/dashboard/index.js.map +1 -0
- package/dist/dashboard/server.d.ts +15 -0
- package/dist/dashboard/server.d.ts.map +1 -0
- package/dist/dashboard/server.js +158 -0
- package/dist/dashboard/server.js.map +1 -0
- package/dist/mcp-router/session-stats.d.ts +9 -0
- package/dist/mcp-router/session-stats.d.ts.map +1 -1
- package/dist/mcp-router/session-stats.js +19 -3
- package/dist/mcp-router/session-stats.js.map +1 -1
- package/dist/memory/adaptive-context.d.ts +1 -0
- package/dist/memory/adaptive-context.d.ts.map +1 -1
- package/dist/memory/adaptive-context.js +4 -0
- package/dist/memory/adaptive-context.js.map +1 -1
- package/dist/memory/embeddings.d.ts.map +1 -1
- package/dist/memory/embeddings.js +4 -4
- package/dist/memory/embeddings.js.map +1 -1
- package/dist/memory/model-router.d.ts +1 -1
- package/dist/memory/model-router.d.ts.map +1 -1
- package/dist/memory/model-router.js +52 -1
- package/dist/memory/model-router.js.map +1 -1
- package/dist/memory/predictive-memory.d.ts.map +1 -1
- package/dist/memory/predictive-memory.js +4 -3
- package/dist/memory/predictive-memory.js.map +1 -1
- package/dist/models/analytics.d.ts +93 -0
- package/dist/models/analytics.d.ts.map +1 -0
- package/dist/models/analytics.js +205 -0
- package/dist/models/analytics.js.map +1 -0
- package/dist/models/execution-profiles.d.ts +6 -0
- package/dist/models/execution-profiles.d.ts.map +1 -1
- package/dist/models/execution-profiles.js +15 -0
- package/dist/models/execution-profiles.js.map +1 -1
- package/dist/models/executor.d.ts.map +1 -1
- package/dist/models/executor.js +51 -17
- package/dist/models/executor.js.map +1 -1
- package/dist/models/index.d.ts +2 -0
- package/dist/models/index.d.ts.map +1 -1
- package/dist/models/index.js +2 -0
- package/dist/models/index.js.map +1 -1
- package/dist/models/router.d.ts +8 -0
- package/dist/models/router.d.ts.map +1 -1
- package/dist/models/router.js +46 -21
- package/dist/models/router.js.map +1 -1
- package/dist/models/types.d.ts +26 -0
- package/dist/models/types.d.ts.map +1 -1
- package/dist/models/types.js +43 -4
- package/dist/models/types.js.map +1 -1
- package/dist/models/unified-router.d.ts.map +1 -1
- package/dist/models/unified-router.js +4 -0
- package/dist/models/unified-router.js.map +1 -1
- package/dist/policies/database-manager.d.ts +1 -0
- package/dist/policies/database-manager.d.ts.map +1 -1
- package/dist/policies/database-manager.js +14 -2
- package/dist/policies/database-manager.js.map +1 -1
- package/dist/policies/enforced-tool-router.d.ts +2 -2
- package/dist/policies/enforced-tool-router.d.ts.map +1 -1
- package/dist/policies/enforced-tool-router.js +4 -4
- package/dist/policies/enforced-tool-router.js.map +1 -1
- package/dist/policies/policy-gate.d.ts +2 -2
- package/dist/policies/policy-gate.d.ts.map +1 -1
- package/dist/policies/policy-gate.js +6 -4
- package/dist/policies/policy-gate.js.map +1 -1
- package/dist/policies/policy-memory.d.ts +3 -0
- package/dist/policies/policy-memory.d.ts.map +1 -1
- package/dist/policies/policy-memory.js +11 -0
- package/dist/policies/policy-memory.js.map +1 -1
- package/dist/policies/schemas/policy.d.ts +3 -0
- package/dist/policies/schemas/policy.d.ts.map +1 -1
- package/dist/policies/schemas/policy.js +1 -0
- package/dist/policies/schemas/policy.js.map +1 -1
- package/dist/tasks/coordination.d.ts +18 -0
- package/dist/tasks/coordination.d.ts.map +1 -1
- package/dist/tasks/coordination.js +59 -1
- package/dist/tasks/coordination.js.map +1 -1
- package/dist/tasks/event-bus.d.ts +91 -0
- package/dist/tasks/event-bus.d.ts.map +1 -0
- package/dist/tasks/event-bus.js +123 -0
- package/dist/tasks/event-bus.js.map +1 -0
- package/dist/tasks/service.d.ts +5 -0
- package/dist/tasks/service.d.ts.map +1 -1
- package/dist/tasks/service.js +59 -0
- package/dist/tasks/service.js.map +1 -1
- package/dist/telemetry/session-telemetry.d.ts.map +1 -1
- package/dist/telemetry/session-telemetry.js +3 -0
- package/dist/telemetry/session-telemetry.js.map +1 -1
- package/dist/utils/concurrency-pool.d.ts +51 -0
- package/dist/utils/concurrency-pool.d.ts.map +1 -0
- package/dist/utils/concurrency-pool.js +80 -0
- package/dist/utils/concurrency-pool.js.map +1 -0
- package/dist/utils/system-resources.d.ts +47 -0
- package/dist/utils/system-resources.d.ts.map +1 -0
- package/dist/utils/system-resources.js +92 -0
- package/dist/utils/system-resources.js.map +1 -0
- package/docs/BENCHMARK_GAPS_AND_PLAN.md +146 -0
- package/docs/PARALLELISM_GAPS_AND_OPTIONS.md +422 -0
- package/docs/UAP_OPTIMIZATION_PLAN.md +638 -0
- package/docs/getting-started/INTEGRATION.md +193 -14
- package/docs/opencode-integration-guide.md +740 -0
- package/docs/opencode-integration-quickref.md +180 -0
- package/package.json +4 -1
- package/templates/hooks/session-start.sh +8 -1
@@ -0,0 +1,51 @@
+/**
+ * Concurrency Pool
+ *
+ * Shared utility for bounded-concurrency parallel execution.
+ * Replaces duplicated Promise.all batching patterns across the codebase.
+ *
+ * Uses getMaxParallel() for auto-detection with UAP_MAX_PARALLEL env override.
+ */
+/**
+ * Map over items with bounded concurrency.
+ *
+ * Unlike Promise.all(items.map(fn)), this limits the number of
+ * in-flight promises to prevent overwhelming local inference
+ * endpoints or exhausting file descriptors.
+ *
+ * @param items - Array of items to process
+ * @param fn - Async function to apply to each item
+ * @param options - Concurrency options
+ * @returns Results in the same order as input items
+ *
+ * @example
+ * ```ts
+ * // Auto-detect concurrency from vCPUs
+ * const results = await concurrentMap(urls, url => fetch(url));
+ *
+ * // Explicit limit
+ * const results = await concurrentMap(tasks, runTask, { maxConcurrent: 4 });
+ *
+ * // CPU-bound mode (reserves cores for OS)
+ * const results = await concurrentMap(files, compress, { mode: 'cpu' });
+ * ```
+ */
+export declare function concurrentMap<T, R>(items: T[], fn: (item: T, index: number) => Promise<R>, options?: {
+    /** Maximum concurrent operations. Overrides auto-detection. */
+    maxConcurrent?: number;
+    /** 'cpu' reserves cores for OS/inference, 'io' allows higher concurrency */
+    mode?: 'cpu' | 'io';
+}): Promise<R[]>;
+/**
+ * Map over items with bounded concurrency, settling all promises.
+ *
+ * Like concurrentMap but uses Promise.allSettled semantics --
+ * failures don't abort other in-flight operations.
+ *
+ * @returns Array of PromiseSettledResult in input order
+ */
+export declare function concurrentMapSettled<T, R>(items: T[], fn: (item: T, index: number) => Promise<R>, options?: {
+    maxConcurrent?: number;
+    mode?: 'cpu' | 'io';
+}): Promise<PromiseSettledResult<R>[]>;
+//# sourceMappingURL=concurrency-pool.d.ts.map
@@ -0,0 +1 @@
{"version":3,"file":"concurrency-pool.d.ts","sourceRoot":"","sources":["../../src/utils/concurrency-pool.ts"],"names":[],"mappings":"AAAA;;;;;;;GAOG;AAIH;;;;;;;;;;;;;;;;;;;;;;;GAuBG;AACH,wBAAsB,aAAa,CAAC,CAAC,EAAE,CAAC,EACtC,KAAK,EAAE,CAAC,EAAE,EACV,EAAE,EAAE,CAAC,IAAI,EAAE,CAAC,EAAE,KAAK,EAAE,MAAM,KAAK,OAAO,CAAC,CAAC,CAAC,EAC1C,OAAO,CAAC,EAAE;IACR,+DAA+D;IAC/D,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,4EAA4E;IAC5E,IAAI,CAAC,EAAE,KAAK,GAAG,IAAI,CAAC;CACrB,GACA,OAAO,CAAC,CAAC,EAAE,CAAC,CAkBd;AAED;;;;;;;GAOG;AACH,wBAAsB,oBAAoB,CAAC,CAAC,EAAE,CAAC,EAC7C,KAAK,EAAE,CAAC,EAAE,EACV,EAAE,EAAE,CAAC,IAAI,EAAE,CAAC,EAAE,KAAK,EAAE,MAAM,KAAK,OAAO,CAAC,CAAC,CAAC,EAC1C,OAAO,CAAC,EAAE;IACR,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,IAAI,CAAC,EAAE,KAAK,GAAG,IAAI,CAAC;CACrB,GACA,OAAO,CAAC,oBAAoB,CAAC,CAAC,CAAC,EAAE,CAAC,CAuBpC"}
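The `concurrentMapSettled` declaration returns `PromiseSettledResult<R>[]` in input order, so a caller has to separate successes from failures itself. The following is a minimal sketch of one way to do that. `partitionSettled` and the local `Settled` alias are hypothetical names introduced for illustration and are not part of the package.

```typescript
// Structural stand-in for the built-in PromiseSettledResult<R>, declared
// locally so the sketch compiles regardless of the configured `lib` target.
type Settled<R> =
  | { status: 'fulfilled'; value: R }
  | { status: 'rejected'; reason: unknown };

// Hypothetical helper (not part of @miller-tech/uap): split settled results
// into values and failures, remembering the input index of each failure.
function partitionSettled<R>(results: Settled<R>[]): {
  values: R[];
  failures: { index: number; reason: unknown }[];
} {
  const values: R[] = [];
  const failures: { index: number; reason: unknown }[] = [];
  results.forEach((result, index) => {
    if (result.status === 'fulfilled') {
      values.push(result.value);
    } else {
      failures.push({ index, reason: result.reason });
    }
  });
  return { values, failures };
}

// Example input shaped like the array concurrentMapSettled resolves to.
const sample: Settled<number>[] = [
  { status: 'fulfilled', value: 1 },
  { status: 'rejected', reason: new Error('timeout') },
  { status: 'fulfilled', value: 3 },
];
const partitioned = partitionSettled(sample);
```

Keeping the failure index lets a caller map each error back to the input item that caused it, which the input-order guarantee makes possible.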
@@ -0,0 +1,80 @@
+/**
+ * Concurrency Pool
+ *
+ * Shared utility for bounded-concurrency parallel execution.
+ * Replaces duplicated Promise.all batching patterns across the codebase.
+ *
+ * Uses getMaxParallel() for auto-detection with UAP_MAX_PARALLEL env override.
+ */
+import { getMaxParallel } from './system-resources.js';
+/**
+ * Map over items with bounded concurrency.
+ *
+ * Unlike Promise.all(items.map(fn)), this limits the number of
+ * in-flight promises to prevent overwhelming local inference
+ * endpoints or exhausting file descriptors.
+ *
+ * @param items - Array of items to process
+ * @param fn - Async function to apply to each item
+ * @param options - Concurrency options
+ * @returns Results in the same order as input items
+ *
+ * @example
+ * ```ts
+ * // Auto-detect concurrency from vCPUs
+ * const results = await concurrentMap(urls, url => fetch(url));
+ *
+ * // Explicit limit
+ * const results = await concurrentMap(tasks, runTask, { maxConcurrent: 4 });
+ *
+ * // CPU-bound mode (reserves cores for OS)
+ * const results = await concurrentMap(files, compress, { mode: 'cpu' });
+ * ```
+ */
+export async function concurrentMap(items, fn, options) {
+    if (items.length === 0)
+        return [];
+    const max = options?.maxConcurrent ?? getMaxParallel(options?.mode ?? 'io');
+    const results = new Array(items.length);
+    let nextIndex = 0;
+    const worker = async () => {
+        while (nextIndex < items.length) {
+            const i = nextIndex++;
+            results[i] = await fn(items[i], i);
+        }
+    };
+    const workerCount = Math.min(max, items.length);
+    await Promise.all(Array.from({ length: workerCount }, () => worker()));
+    return results;
+}
+/**
+ * Map over items with bounded concurrency, settling all promises.
+ *
+ * Like concurrentMap but uses Promise.allSettled semantics --
+ * failures don't abort other in-flight operations.
+ *
+ * @returns Array of PromiseSettledResult in input order
+ */
+export async function concurrentMapSettled(items, fn, options) {
+    if (items.length === 0)
+        return [];
+    const max = options?.maxConcurrent ?? getMaxParallel(options?.mode ?? 'io');
+    const results = new Array(items.length);
+    let nextIndex = 0;
+    const worker = async () => {
+        while (nextIndex < items.length) {
+            const i = nextIndex++;
+            try {
+                const value = await fn(items[i], i);
+                results[i] = { status: 'fulfilled', value };
+            }
+            catch (reason) {
+                results[i] = { status: 'rejected', reason };
+            }
+        }
+    };
+    const workerCount = Math.min(max, items.length);
+    await Promise.all(Array.from({ length: workerCount }, () => worker()));
+    return results;
+}
+//# sourceMappingURL=concurrency-pool.js.map
@@ -0,0 +1 @@
{"version":3,"file":"concurrency-pool.js","sourceRoot":"","sources":["../../src/utils/concurrency-pool.ts"],"names":[],"mappings":"AAAA;;;;;;;GAOG;AAEH,OAAO,EAAE,cAAc,EAAE,MAAM,uBAAuB,CAAC;AAEvD;;;;;;;;;;;;;;;;;;;;;;;GAuBG;AACH,MAAM,CAAC,KAAK,UAAU,aAAa,CACjC,KAAU,EACV,EAA0C,EAC1C,OAKC;IAED,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC;QAAE,OAAO,EAAE,CAAC;IAElC,MAAM,GAAG,GAAG,OAAO,EAAE,aAAa,IAAI,cAAc,CAAC,OAAO,EAAE,IAAI,IAAI,IAAI,CAAC,CAAC;IAC5E,MAAM,OAAO,GAAQ,IAAI,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,CAAC;IAC7C,IAAI,SAAS,GAAG,CAAC,CAAC;IAElB,MAAM,MAAM,GAAG,KAAK,IAAmB,EAAE;QACvC,OAAO,SAAS,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC;YAChC,MAAM,CAAC,GAAG,SAAS,EAAE,CAAC;YACtB,OAAO,CAAC,CAAC,CAAC,GAAG,MAAM,EAAE,CAAC,KAAK,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;QACrC,CAAC;IACH,CAAC,CAAC;IAEF,MAAM,WAAW,GAAG,IAAI,CAAC,GAAG,CAAC,GAAG,EAAE,KAAK,CAAC,MAAM,CAAC,CAAC;IAChD,MAAM,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,IAAI,CAAC,EAAE,MAAM,EAAE,WAAW,EAAE,EAAE,GAAG,EAAE,CAAC,MAAM,EAAE,CAAC,CAAC,CAAC;IAEvE,OAAO,OAAO,CAAC;AACjB,CAAC;AAED;;;;;;;GAOG;AACH,MAAM,CAAC,KAAK,UAAU,oBAAoB,CACxC,KAAU,EACV,EAA0C,EAC1C,OAGC;IAED,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC;QAAE,OAAO,EAAE,CAAC;IAElC,MAAM,GAAG,GAAG,OAAO,EAAE,aAAa,IAAI,cAAc,CAAC,OAAO,EAAE,IAAI,IAAI,IAAI,CAAC,CAAC;IAC5E,MAAM,OAAO,GAA8B,IAAI,KAAK,CAAC,KAAK,CAAC,MAAM,CAAC,CAAC;IACnE,IAAI,SAAS,GAAG,CAAC,CAAC;IAElB,MAAM,MAAM,GAAG,KAAK,IAAmB,EAAE;QACvC,OAAO,SAAS,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC;YAChC,MAAM,CAAC,GAAG,SAAS,EAAE,CAAC;YACtB,IAAI,CAAC;gBACH,MAAM,KAAK,GAAG,MAAM,EAAE,CAAC,KAAK,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC;gBACpC,OAAO,CAAC,CAAC,CAAC,GAAG,EAAE,MAAM,EAAE,WAAW,EAAE,KAAK,EAAE,CAAC;YAC9C,CAAC;YAAC,OAAO,MAAM,EAAE,CAAC;gBAChB,OAAO,CAAC,CAAC,CAAC,GAAG,EAAE,MAAM,EAAE,UAAU,EAAE,MAAM,EAAE,CAAC;YAC9C,CAAC;QACH,CAAC;IACH,CAAC,CAAC;IAEF,MAAM,WAAW,GAAG,IAAI,CAAC,GAAG,CAAC,GAAG,EAAE,KAAK,CAAC,MAAM,CAAC,CAAC;IAChD,MAAM,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,IAAI,CAAC,EAAE,MAAM,EAAE,WAAW,EAAE,EAAE,GAAG,EAAE,CAAC,MAAM,EAAE,CAAC,CAAC,CAAC;IAEvE,OAAO,OAAO,CAAC;AACjB,CAAC"}
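The worker-pool pattern used by `concurrentMap` can be tried in isolation. The sketch below is an illustrative re-implementation under stated assumptions, not the package's code: `boundedMap` and `sleepDouble` are hypothetical names, the limit is passed explicitly rather than resolved through `getMaxParallel()`, and the limit is clamped to at least one worker, which the diffed code does not appear to do.

```typescript
// Illustrative re-implementation of the worker-pool pattern; `boundedMap`
// is a hypothetical name, not an export of @miller-tech/uap.
async function boundedMap<T, R>(
  items: T[],
  fn: (item: T, index: number) => Promise<R>,
  limit: number,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let nextIndex = 0;

  // Each worker claims the next unclaimed index until none remain. The claim
  // (`nextIndex++`) is synchronous, so two workers never take the same item.
  const worker = async (): Promise<void> => {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await fn(items[i], i);
    }
  };

  // Assumption: clamp to at least one worker so a limit of 0 still makes
  // progress. The diffed code passes the limit through unclamped.
  const workerCount = Math.min(Math.max(1, Math.floor(limit)), items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Demo: count how many calls are in flight at once.
let inFlight = 0;
let peak = 0;
const sleepDouble = async (n: number): Promise<number> => {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  inFlight--;
  return n * 2;
};

// Seven items, at most three in flight; results keep input order.
const demo: Promise<number[]> = boundedMap([1, 2, 3, 4, 5, 6, 7], sleepDouble, 3);
```

One behaviour carries over from the diffed `concurrentMap`: if `fn` rejects, the returned promise rejects, but the remaining workers keep pulling items until the list is exhausted. Callers that need every outcome would reach for the settled variant instead.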
@@ -0,0 +1,47 @@
+/**
+ * System Resource Detection
+ *
+ * Provides vCPU, VRAM, and memory detection with caching.
+ * Used to auto-tune parallelism across the UAP system.
+ *
+ * Env override: UAP_MAX_PARALLEL always takes precedence.
+ * Precedence: env var → config → auto-detect → hardcoded default
+ */
+export interface SystemResources {
+    /** Number of logical CPUs (vCPUs / hardware threads) */
+    vCPUs: number;
+    /** GPU VRAM in GB (0 if no GPU detected) */
+    vramGB: number;
+    /** System RAM in GB */
+    memoryGB: number;
+}
+/**
+ * Detect system resources (cached after first call).
+ */
+export declare function detectSystemResources(): SystemResources;
+/**
+ * Compute safe parallelism ceiling.
+ *
+ * @param mode - 'cpu' for compute-bound work (reserves cores for OS + inference),
+ *               'io' for IO-bound work like API calls (higher concurrency safe)
+ * @returns Maximum number of concurrent operations
+ *
+ * Precedence:
+ *   1. UAP_MAX_PARALLEL env var (always wins)
+ *   2. Auto-detected from os.cpus()
+ *   3. Hardcoded fallback (3)
+ */
+export declare function getMaxParallel(mode?: 'cpu' | 'io'): number;
+/**
+ * Check if parallelism is globally enabled.
+ *
+ * Precedence:
+ *   1. UAP_PARALLEL env var ('false' disables)
+ *   2. Default: true
+ */
+export declare function isParallelEnabled(): boolean;
+/**
+ * Reset cached resources (for testing).
+ */
+export declare function resetResourceCache(): void;
+//# sourceMappingURL=system-resources.d.ts.map
@@ -0,0 +1 @@
{"version":3,"file":"system-resources.d.ts","sourceRoot":"","sources":["../../src/utils/system-resources.ts"],"names":[],"mappings":"AAAA;;;;;;;;GAQG;AAKH,MAAM,WAAW,eAAe;IAC9B,wDAAwD;IACxD,KAAK,EAAE,MAAM,CAAC;IACd,4CAA4C;IAC5C,MAAM,EAAE,MAAM,CAAC;IACf,uBAAuB;IACvB,QAAQ,EAAE,MAAM,CAAC;CAClB;AAID;;GAEG;AACH,wBAAgB,qBAAqB,IAAI,eAAe,CA+BvD;AAED;;;;;;;;;;;GAWG;AACH,wBAAgB,cAAc,CAAC,IAAI,GAAE,KAAK,GAAG,IAAW,GAAG,MAAM,CAiBhE;AAED;;;;;;GAMG;AACH,wBAAgB,iBAAiB,IAAI,OAAO,CAE3C;AAED;;GAEG;AACH,wBAAgB,kBAAkB,IAAI,IAAI,CAEzC"}
@@ -0,0 +1,92 @@
+/**
+ * System Resource Detection
+ *
+ * Provides vCPU, VRAM, and memory detection with caching.
+ * Used to auto-tune parallelism across the UAP system.
+ *
+ * Env override: UAP_MAX_PARALLEL always takes precedence.
+ * Precedence: env var → config → auto-detect → hardcoded default
+ */
+import { cpus, totalmem } from 'os';
+import { execSync } from 'child_process';
+let _cached = null;
+/**
+ * Detect system resources (cached after first call).
+ */
+export function detectSystemResources() {
+    if (_cached)
+        return _cached;
+    const vCPUs = cpus().length;
+    const memoryGB = Math.round(totalmem() / 1024 ** 3);
+    let vramGB = 0;
+    try {
+        // NVIDIA GPU
+        const out = execSync('nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits', {
+            encoding: 'utf-8',
+            timeout: 3000,
+            stdio: ['pipe', 'pipe', 'pipe'],
+        });
+        vramGB = Math.round(parseInt(out.trim().split('\n')[0], 10) / 1024);
+    }
+    catch {
+        try {
+            // macOS unified memory (report as VRAM since GPU shares it)
+            const out = execSync('sysctl -n hw.memsize', {
+                encoding: 'utf-8',
+                timeout: 3000,
+                stdio: ['pipe', 'pipe', 'pipe'],
+            });
+            vramGB = Math.min(Math.round(parseInt(out.trim(), 10) / 1024 ** 3), 48);
+        }
+        catch {
+            // No GPU detected
+        }
+    }
+    _cached = { vCPUs, vramGB, memoryGB };
+    return _cached;
+}
+/**
+ * Compute safe parallelism ceiling.
+ *
+ * @param mode - 'cpu' for compute-bound work (reserves cores for OS + inference),
+ *               'io' for IO-bound work like API calls (higher concurrency safe)
+ * @returns Maximum number of concurrent operations
+ *
+ * Precedence:
+ *   1. UAP_MAX_PARALLEL env var (always wins)
+ *   2. Auto-detected from os.cpus()
+ *   3. Hardcoded fallback (3)
+ */
+export function getMaxParallel(mode = 'io') {
+    const envOverride = process.env.UAP_MAX_PARALLEL;
+    if (envOverride) {
+        const parsed = parseInt(envOverride, 10);
+        if (!isNaN(parsed) && parsed > 0)
+            return parsed;
+    }
+    const { vCPUs } = detectSystemResources();
+    if (mode === 'cpu') {
+        // Reserve 2 cores for OS + inference server
+        return Math.max(1, vCPUs - 2);
+    }
+    // IO-bound: safe to use more concurrency, cap at 8 to avoid
+    // overwhelming local inference endpoints
+    return Math.max(1, Math.min(vCPUs, 8));
+}
+/**
+ * Check if parallelism is globally enabled.
+ *
+ * Precedence:
+ *   1. UAP_PARALLEL env var ('false' disables)
+ *   2. Default: true
+ */
+export function isParallelEnabled() {
+    return process.env.UAP_PARALLEL !== 'false';
+}
+/**
+ * Reset cached resources (for testing).
+ */
+export function resetResourceCache() {
+    _cached = null;
+}
+//# sourceMappingURL=system-resources.js.map
@@ -0,0 +1 @@
{"version":3,"file":"system-resources.js","sourceRoot":"","sources":["../../src/utils/system-resources.ts"],"names":[],"mappings":"AAAA;;;;;;;;GAQG;AAEH,OAAO,EAAE,IAAI,EAAE,QAAQ,EAAE,MAAM,IAAI,CAAC;AACpC,OAAO,EAAE,QAAQ,EAAE,MAAM,eAAe,CAAC;AAWzC,IAAI,OAAO,GAA2B,IAAI,CAAC;AAE3C;;GAEG;AACH,MAAM,UAAU,qBAAqB;IACnC,IAAI,OAAO;QAAE,OAAO,OAAO,CAAC;IAE5B,MAAM,KAAK,GAAG,IAAI,EAAE,CAAC,MAAM,CAAC;IAC5B,MAAM,QAAQ,GAAG,IAAI,CAAC,KAAK,CAAC,QAAQ,EAAE,GAAG,IAAI,IAAI,CAAC,CAAC,CAAC;IAEpD,IAAI,MAAM,GAAG,CAAC,CAAC;IACf,IAAI,CAAC;QACH,aAAa;QACb,MAAM,GAAG,GAAG,QAAQ,CAAC,mEAAmE,EAAE;YACxF,QAAQ,EAAE,OAAO;YACjB,OAAO,EAAE,IAAI;YACb,KAAK,EAAE,CAAC,MAAM,EAAE,MAAM,EAAE,MAAM,CAAC;SAChC,CAAC,CAAC;QACH,MAAM,GAAG,IAAI,CAAC,KAAK,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,EAAE,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC,EAAE,EAAE,CAAC,GAAG,IAAI,CAAC,CAAC;IACtE,CAAC;IAAC,MAAM,CAAC;QACP,IAAI,CAAC;YACH,4DAA4D;YAC5D,MAAM,GAAG,GAAG,QAAQ,CAAC,sBAAsB,EAAE;gBAC3C,QAAQ,EAAE,OAAO;gBACjB,OAAO,EAAE,IAAI;gBACb,KAAK,EAAE,CAAC,MAAM,EAAE,MAAM,EAAE,MAAM,CAAC;aAChC,CAAC,CAAC;YACH,MAAM,GAAG,IAAI,CAAC,GAAG,CAAC,IAAI,CAAC,KAAK,CAAC,QAAQ,CAAC,GAAG,CAAC,IAAI,EAAE,EAAE,EAAE,CAAC,GAAG,IAAI,IAAI,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;QAC1E,CAAC;QAAC,MAAM,CAAC;YACP,kBAAkB;QACpB,CAAC;IACH,CAAC;IAED,OAAO,GAAG,EAAE,KAAK,EAAE,MAAM,EAAE,QAAQ,EAAE,CAAC;IACtC,OAAO,OAAO,CAAC;AACjB,CAAC;AAED;;;;;;;;;;;GAWG;AACH,MAAM,UAAU,cAAc,CAAC,OAAqB,IAAI;IACtD,MAAM,WAAW,GAAG,OAAO,CAAC,GAAG,CAAC,gBAAgB,CAAC;IACjD,IAAI,WAAW,EAAE,CAAC;QAChB,MAAM,MAAM,GAAG,QAAQ,CAAC,WAAW,EAAE,EAAE,CAAC,CAAC;QACzC,IAAI,CAAC,KAAK,CAAC,MAAM,CAAC,IAAI,MAAM,GAAG,CAAC;YAAE,OAAO,MAAM,CAAC;IAClD,CAAC;IAED,MAAM,EAAE,KAAK,EAAE,GAAG,qBAAqB,EAAE,CAAC;IAE1C,IAAI,IAAI,KAAK,KAAK,EAAE,CAAC;QACnB,4CAA4C;QAC5C,OAAO,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,KAAK,GAAG,CAAC,CAAC,CAAC;IAChC,CAAC;IAED,4DAA4D;IAC5D,yCAAyC;IACzC,OAAO,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,IAAI,CAAC,GAAG,CAAC,KAAK,EAAE,CAAC,CAAC,CAAC,CAAC;AACzC,CAAC;AAED;;;;;;GAMG;AACH,MAAM,UAAU,iBAAiB;IAC/B,OAAO,OAAO,CAAC,GAAG,CAAC,YAAY,KAAK,OAAO,CAAC;AAC9C,CAAC;AAED;;GAEG;
AACH,MAAM,UAAU,kBAAkB;IAChC,OAAO,GAAG,IAAI,CAAC;AACjB,CAAC"}
@@ -0,0 +1,146 @@
+# UAP Benchmark: Actual Gaps & Execution Plan
+
+**Generated:** 2026-03-17
+**Benchmark:** Harbor Terminal-Bench 2.0 (89 tasks)
+**Primary Target:** Qwen3.5 35B A3B (IQ4_XS)
+
+---
+
+## What Already Exists (DO NOT REBUILD)
+
+| Component                        | File                                                       | Status                                                                                         |
+| -------------------------------- | ---------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
+| Baseline benchmark (no UAP)      | `scripts/benchmarks/benchmark-qwen35-baseline-no-uap.tsx`  | 403 lines, 94 tasks                                                                            |
+| UAP benchmark (full integration) | `scripts/benchmarks/benchmark-qwen35-uap-3.0-opencode.tsx` | 812 lines, 89 tasks                                                                            |
+| Harbor quick runner (UAP)        | `scripts/benchmarks/run-tbench-qwen35-quick.sh`            | 459 lines, hybrid-adaptive                                                                     |
+| Harbor baseline+UAP runner       | `scripts/benchmarks/run-harbor-qwen35-benchmark.sh`        | Runs both configs sequentially                                                                 |
+| Harbor YAML configs              | `benchmarks/harbor-configs/qwen35_*.yaml`                  | Baseline + UAP pair                                                                            |
+| Comparison report generator      | `scripts/benchmarks/generate-comparison-report.ts`         | 461 lines, p-value tests                                                                       |
+| Full benchmark harness           | `scripts/benchmarks/run-full-benchmark.sh`                 | 413 lines, multi-model A/B                                                                     |
+| Multi-turn agent loop            | `src/benchmarks/multi-turn-loop.ts`                        | 213 lines, `executeWithRetry()`                                                                |
+| Multi-turn + verification        | `src/benchmarks/multi-turn-agent.ts`                       | Wired to dynamic retrieval                                                                     |
+| Improved benchmark runner        | `src/benchmarks/improved-benchmark.ts`                     | 794 lines, wires multi-turn + dynamic retrieval + task classification + hierarchical prompting |
+| Dynamic memory retrieval         | `src/memory/dynamic-retrieval.ts`                          | 1168 lines, 6 memory sources, adaptive depth                                                   |
+| Task classifier                  | `src/memory/task-classifier.ts`                            | 426 lines, 8 categories, ambiguity detection                                                   |
+| Qdrant embeddings                | `src/memory/embeddings.ts`                                 | Fixed, 5 backends with fallback                                                                |
+| Tool call retry (Qwen)           | `tools/agents/scripts/qwen_tool_call_wrapper.py`           | 686 lines, 6 retry strategies                                                                  |
+| Harbor UAP agent                 | `tools/uap_harbor/uap_agent.py`                            | 379 lines, classified preamble                                                                 |
+| Qwen3.5 model presets            | `src/models/types.ts:136-151`                              | `qwen35-a3b` and `qwen35` defined                                                              |
+| Model router                     | `src/models/router.ts`                                     | Qwen3.5 as default executor                                                                    |
+
+---
+
+## Actual Gaps (3 items)
+
+### Gap 1: `improved-benchmark.ts` MODELS array missing Qwen3.5
+
+`src/benchmarks/improved-benchmark.ts:95-99` has the fully wired runner (multi-turn + dynamic retrieval + task classification + hierarchical prompting + verification) but its MODELS array only contains:
+
+```typescript
+const MODELS: ModelConfig[] = [
+  { id: 'opus-4.5', name: 'Claude Opus 4.5', apiModel: 'claude-opus-4-5-20251101' },
+  { id: 'glm-4.7', name: 'GLM 4.7', apiModel: 'glm-4.7' },
+  { id: 'gpt-5.2-codex', name: 'GPT 5.2 Codex', apiModel: 'gpt-5.2-codex' },
+];
+// Qwen3.5 MISSING
+```
+
+**Fix:** Add Qwen3.5 to the MODELS array. The preset already exists in `src/models/types.ts:136-151`.
+
+### Gap 2: `model-integration.ts` MODELS array missing Qwen3.5 + still single-shot
+
+`src/benchmarks/model-integration.ts:336-361` is the older benchmark runner. It:
+
+- Has no Qwen3.5 in its MODELS array
+- Uses single-shot execution (no multi-turn, no dynamic retrieval)
+
+**Fix:** Add Qwen3.5 to its MODELS array. The multi-turn wiring gap is already solved by `improved-benchmark.ts` -- this file can remain as the "legacy single-shot" runner for comparison purposes.
+
+### Gap 3: No benchmark results exist
+
+`benchmark-results/` directory does not exist. None of the scripts have been executed.
+
+**Fix:** Run the existing scripts.
+
+---
+
+## Execution Plan
+
+### Step 1: Add Qwen3.5 to improved-benchmark.ts MODELS array
+
+**File:** `src/benchmarks/improved-benchmark.ts:95-99`
+
+```typescript
+const MODELS: ModelConfig[] = [
+  { id: 'opus-4.5', name: 'Claude Opus 4.5', apiModel: 'claude-opus-4-5-20251101' },
+  { id: 'glm-4.7', name: 'GLM 4.7', apiModel: 'glm-4.7' },
+  { id: 'gpt-5.2-codex', name: 'GPT 5.2 Codex', apiModel: 'gpt-5.2-codex' },
+  { id: 'qwen35-a3b', name: 'Qwen 3.5 35B A3B', apiModel: 'qwen35-a3b-iq4xs' },
+];
+```
+
+### Step 2: Add Qwen3.5 to model-integration.ts MODELS array
+
+**File:** `src/benchmarks/model-integration.ts:336-361`
+
+```typescript
+{
+  id: 'qwen35-a3b',
+  name: 'Qwen 3.5 35B A3B',
+  provider: 'local',
+  apiModel: 'qwen35-a3b-iq4xs',
+},
+```
+
+### Step 3: Run existing benchmarks
+
+```bash
+# Option A: Quick Qwen3.5 baseline + UAP via Harbor (recommended first)
+./scripts/benchmarks/run-harbor-qwen35-benchmark.sh
+
+# Option B: Direct API baseline (no Harbor containers)
+npx tsx scripts/benchmarks/benchmark-qwen35-baseline-no-uap.tsx
+
+# Option C: Direct API UAP-enhanced
+npx tsx scripts/benchmarks/benchmark-qwen35-uap-3.0-opencode.tsx
+
+# Option D: Improved benchmark with multi-turn + dynamic retrieval (all models)
+npx tsx src/benchmarks/improved-benchmark.ts
+
+# Option E: Full Harbor harness (all models, baseline vs UAP)
+./scripts/benchmarks/run-full-benchmark.sh --model qwen35-a3b-iq4xs
+```
+
+### Step 4: Generate comparison report
+
+```bash
+npx tsx scripts/benchmarks/generate-comparison-report.ts \
+  --baseline benchmark-results/qwen35_baseline_no_uap/ \
+  --uap benchmark-results/qwen35_uap_3.0_opencode/
+```
+
+---
+
+## What This Plan Does NOT Do (because it already exists)
+
+- Build a multi-turn agent loop (exists: `src/benchmarks/multi-turn-loop.ts`)
+- Build dynamic memory retrieval (exists: `src/memory/dynamic-retrieval.ts`)
+- Build task classification (exists: `src/memory/task-classifier.ts`)
+- Fix Qdrant embeddings (already fixed: `src/memory/embeddings.ts`)
+- Build Harbor configs (exist: `benchmarks/harbor-configs/qwen35_*.yaml`)
+- Build comparison report generator (exists: `scripts/benchmarks/generate-comparison-report.ts`)
+- Wire multi-turn into benchmark runner (exists: `src/benchmarks/improved-benchmark.ts`)
+- Build tool call retry for Qwen (exists: `tools/agents/scripts/qwen_tool_call_wrapper.py`)
+- Create execution scripts (exist: 6+ scripts in `scripts/benchmarks/`)
+
+---
+
+## Estimated Effort
+
+| Step                                 | Effort         | Type                                   |
+| ------------------------------------ | -------------- | -------------------------------------- |
+| Add Qwen3.5 to improved-benchmark.ts | 2 minutes      | Code change (1 line)                   |
+| Add Qwen3.5 to model-integration.ts  | 2 minutes      | Code change (5 lines)                  |
+| Run benchmarks                       | 2-8 hours      | Execution (depends on model speed)     |
+| Review results                       | 30 minutes     | Analysis                               |
+| **Total**                            | **~3-9 hours** | Mostly waiting for benchmark execution |