@thinkhive/sdk 3.1.1 → 3.3.0

package/README.md CHANGED
@@ -1,16 +1,16 @@
- # ThinkHive SDK
+ # ThinkHive SDK v3.3.0
 
  The official JavaScript/TypeScript SDK for [ThinkHive](https://thinkhive.ai) - AI Agent Observability Platform.
 
  ## Features
 
- - **25 Trace Format Support**: Automatic detection and normalization for LangSmith, Langfuse, Helicone, CrewAI, Opik, Braintrust, HoneyHive, Datadog, MLflow, AgentOps, Portkey, TruLens, Lunary, LangWatch, OpenLIT, Maxim AI, Galileo, PostHog, Keywords AI, Agenta, and more
- - **Trace Analysis**: Analyze AI agent traces with detailed explainability
- - **RAG Evaluation**: 8 quality metrics for RAG systems (groundedness, faithfulness, etc.)
- - **Hallucination Detection**: 9 types of hallucination detection
- - **Business Impact**: Industry-specific ROI calculations
+ - **OpenTelemetry-Based Tracing**: Built on OTLP for seamless integration with existing observability tools
+ - **Run-Centric Architecture**: Atomic unit of work tracking with claims, calibration, and linking
+ - **Facts vs Inferences**: Claims API for separating verified facts from inferences
+ - **Deterministic Ticket Linking**: 7 methods for linking runs to support tickets
+ - **Calibrated Predictions**: Brier scores for prediction accuracy
  - **Auto-Instrumentation**: Works with LangChain, OpenAI, Anthropic, and more
- - **OpenTelemetry**: Built on OTLP for seamless integration
+ - **Multi-Format Support**: Normalizes traces from 25+ observability platforms
 
  ## Installation
 
@@ -20,190 +20,341 @@ npm install @thinkhive/sdk
 
  ## Quick Start
 
- ### Basic Usage
+ ### Basic Initialization
 
  ```typescript
- import { ThinkHive } from '@thinkhive/sdk';
+ import { init, runs, traceLLM, shutdown } from '@thinkhive/sdk';
 
- // Initialize client
- const client = new ThinkHive({
-   apiKey: 'your_api_key',
-   baseUrl: 'https://api.thinkhive.ai'
+ // Initialize the SDK
+ init({
+   apiKey: 'th_your_api_key',
+   serviceName: 'my-ai-agent',
+   autoInstrument: true,
+   frameworks: ['langchain', 'openai'],
  });
 
- // Send a trace
- const result = await client.trace({
-   userMessage: 'What is the weather in San Francisco?',
-   agentResponse: 'The weather in San Francisco is currently 65°F and sunny.',
-   agentId: 'weather-agent'
+ // Create a run (atomic unit of work)
+ const run = await runs.create({
+   agentId: 'weather-agent',
+   conversation: [
+     { role: 'user', content: 'What is the weather in San Francisco?' },
+     { role: 'assistant', content: 'The weather in San Francisco is currently 65°F and sunny.' }
+   ],
+   outcome: 'success',
  });
 
- console.log(`Trace ID: ${result.traceId}`);
- if (result.analysis) {
-   console.log(`Outcome: ${result.analysis.outcome.verdict}`);
-   console.log(`Impact Score: ${result.analysis.businessImpact.impactScore}`);
- }
+ console.log(`Run ID: ${run.id}`);
+
+ // Shutdown when done
+ await shutdown();
  ```
 
- ### With Business Context
+ ### Manual Tracing
 
  ```typescript
- const result = await client.trace({
-   userMessage: 'I want to cancel my order #12345',
-   agentResponse: 'I understand you want to cancel order #12345...',
-   agentId: 'support-agent',
-   businessContext: {
-     customerId: 'cust_abc123',
-     transactionValue: 150.00,
-     priority: 'high',
-     industry: 'ecommerce'
-   }
+ import { init, traceLLM, traceRetrieval, traceTool, traceChain } from '@thinkhive/sdk';
+
+ init({ apiKey: 'th_your_api_key', serviceName: 'my-agent' });
+
+ // Trace an LLM call
+ const response = await traceLLM({
+   name: 'generate-response',
+   modelName: 'gpt-4',
+   provider: 'openai',
+   input: { prompt: 'Hello!' }
+ }, async () => {
+   // Your LLM call here
+   return await openai.chat.completions.create({...});
  });
 
- // Access ROI metrics
- if (result.analysis?.businessImpact?.roi) {
-   const roi = result.analysis.businessImpact.roi;
-   console.log(`Estimated Revenue Loss: $${roi.estimatedRevenueLoss}`);
-   console.log(`Churn Probability: ${roi.churnProbability}%`);
- }
+ // Trace a retrieval operation
+ const docs = await traceRetrieval({
+   name: 'search-knowledge-base',
+   query: 'refund policy',
+   topK: 5
+ }, async () => {
+   return await vectorStore.similaritySearch(query, 5);
+ });
+
+ // Trace a tool call
+ const result = await traceTool({
+   name: 'lookup-order',
+   toolName: 'order_lookup',
+   parameters: { orderId: '12345' }
+ }, async () => {
+   return await lookupOrder('12345');
+ });
  ```
 
- ### Explainer API
+ ### Analyzer API (User-Selected Analysis)
 
  ```typescript
- // Full trace analysis with RAG evaluation
- const analysis = await client.explainer.analyze({
-   userMessage: 'What is your return policy?',
-   agentResponse: 'Items can be returned within 30 days...',
-   retrievedContexts: ['Return Policy: 30 day returns...'],
-   outcome: 'success'
- }, {
-   tier: 'full_llm',
-   includeRagEvaluation: true,
-   includeHallucinationDetection: true
+ import { analyzer } from '@thinkhive/sdk';
+
+ // Estimate cost before running analysis
+ const estimate = await analyzer.estimateCost({
+   traceIds: ['trace-1', 'trace-2', 'trace-3'],
+   tier: 'standard',
+ });
+ console.log(`Estimated cost: $${estimate.estimatedCost}`);
+
+ // Analyze specific traces
+ const analysis = await analyzer.analyze({
+   traceIds: ['trace-1', 'trace-2'],
+   tier: 'standard',
+   includeRootCause: true,
+   includeLayers: true,
  });
 
- console.log(`Summary: ${analysis.summary}`);
- console.log(`Groundedness: ${analysis.ragEvaluation?.groundedness}`);
+ // Analyze traces by time window with smart sampling
+ const windowAnalysis = await analyzer.analyzeWindow({
+   agentId: 'support-agent',
+   startDate: new Date('2024-01-01'),
+   endDate: new Date('2024-01-31'),
+   filters: { outcomes: ['failure'], minSeverity: 'medium' },
+   sampling: { strategy: 'smart', samplePercent: 10 },
+ });
+
+ // Get aggregated insights
+ const summary = await analyzer.summarize({
+   agentId: 'support-agent',
+   startDate: new Date('2024-01-01'),
+   endDate: new Date('2024-01-31'),
+ });
+ ```
 
- // Batch analysis
- const batchResult = await client.explainer.analyzeBatch([
-   { userMessage: '...', agentResponse: '...' },
-   { userMessage: '...', agentResponse: '...' }
- ], { tier: 'fast_llm' });
+ ### Issues API (Clustered Failure Patterns)
 
- // Semantic search
- const searchResults = await client.explainer.search({
-   query: 'refund complaints',
-   filters: { outcome: 'failure' },
-   limit: 10
+ ```typescript
+ import { issues } from '@thinkhive/sdk';
+
+ // List issues for an agent
+ const issueList = await issues.list('support-agent', {
+   status: 'open',
+   limit: 10,
  });
+
+ // Get a specific issue
+ const issue = await issues.get('issue-123');
+
+ // Get fixes for an issue
+ const fixes = await issues.getFixes('issue-123');
  ```
 
- ### Quality Metrics
+ ### API Key Management
 
  ```typescript
- // Get RAG scores
- const ragScores = await client.quality.getRagScores('trace-123');
- console.log(`Groundedness: ${ragScores.groundedness}`);
- console.log(`Faithfulness: ${ragScores.faithfulness}`);
-
- // Get hallucination report
- const report = await client.quality.getHallucinationReport('trace-123');
- if (report.hasHallucinations) {
-   for (const detection of report.detectedTypes) {
-     console.log(`- ${detection.type}: ${detection.description}`);
-   }
+ import { apiKeys, hasPermission, canAccessAgent } from '@thinkhive/sdk';
+
+ // Create a scoped API key
+ const result = await apiKeys.create({
+   name: 'CI Pipeline Key',
+   permissions: {
+     read: true,
+     write: true,
+     delete: false
+   },
+   scopeType: 'agent', // Restrict to specific agents
+   allowedAgentIds: ['agent-prod-001'],
+   environment: 'production',
+   expiresAt: new Date(Date.now() + 90 * 24 * 60 * 60 * 1000) // 90 days
+ });
+
+ console.log(`Key created: ${result.name} (${result.keyPrefix}...)`);
+
+ // Check permissions
+ if (hasPermission(key, 'write')) {
+   // Can write data
  }
 
- // Evaluate RAG for custom input
- const evaluation = await client.quality.evaluateRag({
-   query: 'What is the return policy?',
-   response: 'Items can be returned within 30 days.',
-   contexts: [{ content: 'Return Policy: 30 day returns...' }]
- });
+ // Check agent access
+ if (canAccessAgent(key, 'agent-123')) {
+   // Can access this agent
+ }
  ```
 
- ### ROI Analytics
+ ### Claims API (Facts vs Inferences)
 
  ```typescript
- // Get ROI summary
- const summary = await client.analytics.getRoiSummary();
- console.log(`Revenue Saved: $${summary.totalRevenueSaved}`);
-
- // Get per-agent ROI
- const agentRoi = await client.analytics.getRoiByAgent('support-agent');
- console.log(`Success Rate: ${agentRoi.successRate}%`);
-
- // Get correlation analysis
- const correlations = await client.analytics.getCorrelations();
- for (const corr of correlations.correlations) {
-   console.log(`${corr.type}: ${corr.actionableInsight}`);
+ import { claims, isFact, isInference, getHighConfidenceClaims } from '@thinkhive/sdk';
+
+ // List claims for a run
+ const claimList = await claims.list(runId);
+
+ // Filter by type
+ const facts = claimList.filter(isFact);
+ const inferences = claimList.filter(isInference);
+
+ // Get high confidence claims
+ const confident = getHighConfidenceClaims(claimList, 0.9);
+ ```
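Reviewer note: the diff does not show the shape of a claim or how `isFact`, `isInference`, and `getHighConfidenceClaims` are implemented. The sketch below is one plausible reading, not the SDK's code. The `Claim` interface, its `kind` and `confidence` fields, and the helper names `isFactClaim`, `isInferenceClaim`, and `highConfidence` are all hypothetical.

```typescript
// Hypothetical claim shape for illustration only; the SDK's real type may differ.
type ClaimKind = 'fact' | 'inference' | 'computed';

interface Claim {
  kind: ClaimKind;
  text: string;
  confidence: number; // assumed to lie in [0, 1]
}

// Predicates usable with Array.prototype.filter.
const isFactClaim = (c: Claim): boolean => c.kind === 'fact';
const isInferenceClaim = (c: Claim): boolean => c.kind === 'inference';

// Keep only claims whose confidence meets or exceeds the threshold.
function highConfidence(claims: Claim[], threshold: number): Claim[] {
  return claims.filter((c) => c.confidence >= threshold);
}

// Sample data, invented for this sketch.
const sampleClaims: Claim[] = [
  { kind: 'fact', text: 'Order 12345 shipped on Monday', confidence: 0.98 },
  { kind: 'inference', text: 'Customer is likely frustrated', confidence: 0.6 },
  { kind: 'computed', text: 'Refund total is 150.00', confidence: 0.95 },
];

const sampleFacts = sampleClaims.filter(isFactClaim);
const sampleInferences = sampleClaims.filter(isInferenceClaim);
const sampleConfident = highConfidence(sampleClaims, 0.9);
```

Under these assumptions, the fact and the computed claim pass the 0.9 threshold and the inference does not.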
+
+ ### Calibration API (Prediction Accuracy)
+
+ ```typescript
+ import { calibration, calculateBrierScore, isWellCalibrated } from '@thinkhive/sdk';
+
+ // Get calibration status
+ const status = await calibration.getStatus(agentId);
+
+ // Calculate Brier score for predictions
+ const brierScore = calculateBrierScore(predictions, outcomes);
+
+ // Check if well calibrated
+ if (isWellCalibrated(status)) {
+   console.log('Agent predictions are well calibrated');
  }
  ```
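Reviewer note: the README names the Brier score without defining it. For binary outcomes it is the mean squared difference between each predicted probability and the observed outcome (0 or 1); lower is better, and 0 is a perfect score. The function below is a minimal standalone sketch of that standard definition. It is not the SDK's `calculateBrierScore`, whose signature and input validation are not shown in this diff, and the name `brierScoreSketch` is made up for the example.

```typescript
// Standard Brier score for binary outcomes: mean of (p_i - o_i)^2.
// predictions: probabilities in [0, 1]; outcomes: 0 or 1 for each prediction.
function brierScoreSketch(predictions: number[], outcomes: number[]): number {
  if (predictions.length === 0 || predictions.length !== outcomes.length) {
    throw new Error('predictions and outcomes must be non-empty and equal in length');
  }
  let sumSquaredError = 0;
  for (let i = 0; i < predictions.length; i++) {
    const diff = predictions[i] - outcomes[i];
    sumSquaredError += diff * diff;
  }
  return sumSquaredError / predictions.length;
}

// Always predicting 0.5 scores 0.25 regardless of what happens.
const coinFlipScore = brierScoreSketch([0.5, 0.5], [1, 0]);

// Confident and correct predictions score close to 0.
const sharpScore = brierScoreSketch([0.9, 0.1, 0.8], [1, 0, 1]);
```

A forecaster that always answers 0.5 gets 0.25, which is a common baseline: a score above it suggests the predictions are worse than uninformed guessing.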
 
- ### Providing Feedback
+ ### Business Metrics API
 
  ```typescript
- // After receiving user feedback
- await client.feedback({
-   traceId: result.traceId,
-   rating: 5,
-   wasHelpful: true,
-   comment: 'Very accurate response!'
+ import {
+   businessMetrics,
+   isMetricReady,
+   needsMoreTraces,
+   getStatusMessage
+ } from '@thinkhive/sdk';
+
+ // Get current metric value with status
+ const metric = await businessMetrics.current('agent-123', 'Deflection Rate');
+ console.log(`${metric.metricName}: ${metric.valueFormatted}`);
+
+ if (metric.status === 'insufficient_data') {
+   console.log(`Need ${metric.minTraceThreshold - metric.traceCount} more traces`);
+ }
+
+ // Get historical data for graphing
+ const history = await businessMetrics.history('agent-123', 'Deflection Rate', {
+   startDate: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000),
+   endDate: new Date(),
+   granularity: 'daily',
  });
 
- // When response was incorrect
- await client.feedback({
-   traceId: result.traceId,
-   rating: 2,
-   wasHelpful: false,
-   hadIssues: ['incorrect_info', 'too_long'],
-   correctedResponse: 'The correct answer is...'
+ console.log(`${history.dataPoints.length} data points`);
+ console.log(`Change: ${history.summary.changePercent}%`);
+
+ // Record external metric values (from CRM, surveys, etc.)
+ await businessMetrics.record('agent-123', {
+   metricName: 'CSAT/NPS',
+   value: 4.5,
+   unit: 'score',
+   periodStart: '2024-01-01T00:00:00Z',
+   periodEnd: '2024-01-07T23:59:59Z',
+   source: 'survey_system',
+   sourceDetails: { surveyId: 'survey_456', responseCount: 150 },
  });
  ```
 
+ #### Metric Status Types
+
+ | Status | Description |
+ |--------|-------------|
+ | `ready` | Metric calculated and ready to display |
+ | `insufficient_data` | Need more traces before calculation |
+ | `awaiting_external` | External data source not connected |
+ | `stale` | Data is older than expected |
+
+ ### Ticket Linking (Zendesk Integration)
+
+ ```typescript
+ import {
+   linking,
+   generateZendeskMarker,
+   linkRunToZendeskTicket
+ } from '@thinkhive/sdk';
+
+ // Generate a marker to embed in ticket
+ const marker = generateZendeskMarker(runId);
+ // Returns: <!-- thinkhive:run:abc123 -->
+
+ // Link a run to a ticket
+ await linkRunToZendeskTicket(runId, ticketId);
+
+ // Get best linking method
+ import { getBestLinkMethod } from '@thinkhive/sdk';
+ const method = getBestLinkMethod(runData);
+ // Returns: 'conversation_id' | 'subject_hash' | 'marker' | etc.
+ ```
+
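Reviewer note: the only concrete detail the README gives about marker-based linking is the marker format, `<!-- thinkhive:run:abc123 -->`. The sketch below shows how such a marker could be produced and later recovered from a ticket body. It is an illustration under that single assumption, not the SDK's implementation. The names `buildRunMarker` and `extractRunId` are hypothetical, and so is the guess that run IDs contain only letters, digits, underscores, and hyphens.

```typescript
// Matches the marker format shown in the README comment.
// The allowed run ID characters are an assumption made for this sketch.
const RUN_MARKER_PATTERN = /<!--\s*thinkhive:run:([A-Za-z0-9_-]+)\s*-->/;

// Build an HTML comment that is invisible in rendered ticket text.
function buildRunMarker(runId: string): string {
  return `<!-- thinkhive:run:${runId} -->`;
}

// Recover the run ID from a ticket body, or null when no marker is present.
function extractRunId(ticketBody: string): string | null {
  const match = RUN_MARKER_PATTERN.exec(ticketBody);
  return match?.[1] ?? null;
}

const exampleBody = `Customer asked about a refund.\n${buildRunMarker('abc123')}`;
const recoveredRunId = extractRunId(exampleBody);
```

Embedding the ID in an HTML comment keeps the link deterministic: it survives as long as the ticket system preserves the raw body, and no fuzzy matching on subject or timestamps is needed.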
  ### Auto-Instrumentation
 
  ```typescript
- import { init, autoInstrument } from '@thinkhive/sdk';
+ import { init } from '@thinkhive/sdk';
 
- // Initialize SDK
+ // Initialize with auto-instrumentation
  init({
-   apiKey: 'your_api_key',
+   apiKey: 'th_your_api_key',
    serviceName: 'my-ai-agent',
    autoInstrument: true,
-   frameworks: ['langchain', 'openai']
- });
-
- // Or manually instrument
- autoInstrument(client, {
-   frameworks: ['langchain', 'openai'],
-   capturePrompts: true,
-   captureResponses: true,
-   businessContext: { industry: 'saas' }
+   frameworks: ['langchain', 'openai', 'anthropic']
  });
 
- // Now all LangChain and OpenAI calls are automatically traced!
+ // Now all LangChain, OpenAI, and Anthropic calls are automatically traced!
  ```
 
  ## Analysis Tiers
 
- | Tier | Description | Latency | Cost |
- |------|-------------|---------|------|
- | `rule_based` | Pattern matching, keyword extraction | ~50ms | Free |
- | `fast_llm` | Quick LLM analysis (GPT-3.5) | ~500ms | Low |
- | `full_llm` | Complete analysis (GPT-4o) | ~3s | Standard |
- | `deep` | Multi-pass with validation | ~15s | Premium |
+ | Tier | Description | Use Case |
+ |------|-------------|----------|
+ | `fast` | Quick pattern-based analysis | High-volume, low-latency needs |
+ | `standard` | LLM-powered analysis | Default for most use cases |
+ | `deep` | Multi-pass with validation | Critical traces, root cause analysis |
 
  ## Environment Variables
 
  | Variable | Description |
  |----------|-------------|
  | `THINKHIVE_API_KEY` | Your ThinkHive API key |
- | `THINKHIVE_ENDPOINT` | Custom API endpoint (optional) |
+ | `THINKHIVE_ENDPOINT` | Custom API endpoint (default: https://demo.thinkhive.ai) |
  | `THINKHIVE_SERVICE_NAME` | Service name for traces (optional) |
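Reviewer note: the table lists three variables and, as of 3.3.0, a default endpoint, but the diff does not show how the SDK resolves them. The sketch below is one way an application could resolve the same settings itself before calling `init`. The `ResolvedEnvConfig` type and `resolveEnvConfig` function are hypothetical, and treating a missing API key as an error is an assumption of this sketch, not documented SDK behaviour.

```typescript
// Hypothetical resolved-settings shape; not an SDK type.
interface ResolvedEnvConfig {
  apiKey: string;
  endpoint: string;
  serviceName: string | undefined;
}

// Default endpoint as stated in the 3.3.0 README table.
const DEFAULT_ENDPOINT = 'https://demo.thinkhive.ai';

// Takes the environment as a plain record so the function stays testable;
// in Node.js the caller would pass process.env.
function resolveEnvConfig(env: Record<string, string | undefined>): ResolvedEnvConfig {
  const apiKey = env['THINKHIVE_API_KEY'];
  if (!apiKey) {
    // Assumption: this sketch fails fast when no key is configured.
    throw new Error('THINKHIVE_API_KEY is not set');
  }
  return {
    apiKey,
    endpoint: env['THINKHIVE_ENDPOINT'] ?? DEFAULT_ENDPOINT,
    serviceName: env['THINKHIVE_SERVICE_NAME'],
  };
}

const resolvedExample = resolveEnvConfig({ THINKHIVE_API_KEY: 'th_your_api_key' });
```

With only the API key set, the endpoint falls back to the documented default and the service name stays undefined.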
 
+ ## V3 Architecture
+
+ ### Key Concepts
+
+ **Run-Centric Model**: The atomic unit of work is a "Run" (not a trace). A run captures:
+ - Conversation messages
+ - Retrieved contexts
+ - Tool calls
+ - Outcome and metadata
+
+ **Facts vs Inferences**: Claims API separates:
+ - **Facts**: Verified information from retrieval or tool calls
+ - **Inferences**: LLM-generated conclusions
+ - **Computed**: Derived values from rules
+
+ **Calibrated Predictions**: Track prediction accuracy using:
+ - Brier scores for overall calibration
+ - ECE (Expected Calibration Error) for bucketed analysis
+
+ ### API Structure
332
+
333
+ | API | Description |
334
+ |-----|-------------|
335
+ | `runs` | Create and manage runs (atomic work units) |
336
+ | `claims` | Manage facts/inferences for runs |
337
+ | `calibration` | Track prediction accuracy |
338
+ | `analyzer` | User-selected trace analysis |
339
+ | `issues` | Clustered failure patterns |
340
+ | `linking` | Connect runs to support tickets |
341
+ | `customerContext` | Time-series customer snapshots |
342
+ | `apiKeys` | API key management |
343
+ | `businessMetrics` | Industry-driven metrics with historical tracking |
344
+ | `roiAnalytics` | Business ROI and financial impact analysis |
345
+ | `qualityMetrics` | RAG evaluation and hallucination detection |
346
+
347
+ ### New Evaluation APIs (v3.0)
348
+
349
+ | API | Description |
350
+ |-----|-------------|
351
+ | `humanReview` | Human-in-the-loop review queues |
352
+ | `nondeterminism` | Multi-sample reliability testing |
353
+ | `evalHealth` | Evaluation metric health monitoring |
354
+ | `deterministicGraders` | Rule-based evaluation |
355
+ | `conversationEval` | Multi-turn conversation evaluation |
356
+ | `transcriptPatterns` | Pattern detection in transcripts |
357
+
  ## API Reference
 
  See [API Documentation](https://docs.thinkhive.ai/sdk/javascript/reference) for complete type definitions.