@sap-ai-sdk/orchestration 1.6.1-20250120013121.0 → 1.6.1-20250121013052.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +120 -119
- package/package.json +3 -3
package/README.md
CHANGED

@@ -16,6 +16,7 @@ This package incorporates generative AI orchestration capabilities into your AI
 - [Data Masking](#data-masking)
 - [Grounding](#grounding)
 - [Using a JSON Configuration from AI Launchpad](#using-a-json-configuration-from-ai-launchpad)
+- [Streaming](#streaming)
 - [Using Resource Groups](#using-resource-groups)
 - [Custom Request Configuration](#custom-request-configuration)
 - [Custom Destination](#custom-destination)
@@ -86,125 +87,6 @@ The client allows you to combine various modules, such as templating and content
 
 In addition to the examples below, you can find more **sample code** [here](https://github.com/SAP/ai-sdk-js/blob/main/sample-code/src/orchestration.ts).
 
-### Streaming
-
-The `OrchestrationClient` supports streaming responses for chat completion requests based on the [Server-sent events](https://html.spec.whatwg.org/multipage/server-sent-events.html#server-sent-events) standard.
-
-Use the `stream()` method to receive a stream of chunk responses from the model.
-After consuming the stream, call the helper methods to get the finish reason and token usage information.
-
-```ts
-const orchestrationClient = new OrchestrationClient({
-  llm: {
-    model_name: 'gpt-4o',
-    model_params: { max_tokens: 50, temperature: 0.1 }
-  },
-  templating: {
-    template: [
-      { role: 'user', content: 'Give a long history of {{?country}}?' }
-    ]
-  }
-});
-
-const response = await orchestrationClient.stream({
-  inputParams: { country: 'France' }
-});
-
-for await (const chunk of response.stream) {
-  console.log(JSON.stringify(chunk));
-}
-
-const finishReason = response.getFinishReason();
-const tokenUsage = response.getTokenUsage();
-
-console.log(`Finish reason: ${finishReason}\n`);
-console.log(`Token usage: ${JSON.stringify(tokenUsage)}\n`);
-```
-
-#### Streaming the Delta Content
-
-The client provides a helper method to extract the text chunks as strings:
-
-```ts
-for await (const chunk of response.stream.toContentStream()) {
-  console.log(chunk); // will log the delta content
-}
-```
-
-Each chunk will be a string containing the delta content.
-
-#### Streaming with Abort Controller
-
-A streaming request can be aborted using the `AbortController` API.
-In case of an error, SAP Cloud SDK for AI automatically closes the stream.
-Additionally, you can abort it manually by calling the `stream()` method with an `AbortController` object.
-
-```ts
-const orchestrationClient = new OrchestrationClient({
-  llm: {
-    model_name: 'gpt-4o',
-    model_params: { max_tokens: 50, temperature: 0.1 }
-  },
-  templating: {
-    template: [
-      { role: 'user', content: 'Give a long history of {{?country}}?' }
-    ]
-  }
-});
-
-const controller = new AbortController();
-const response = await orchestrationClient.stream(
-  {
-    inputParams: { country: 'France' }
-  },
-  controller
-);
-
-// Abort the streaming request after one second
-setTimeout(() => {
-  controller.abort();
-}, 1000);
-
-for await (const chunk of response.stream) {
-  console.log(JSON.stringify(chunk));
-}
-```
-
-In this example, the streaming request will be aborted after one second.
-An abort controller can be useful, e.g., when the end user wants to stop the stream or refreshes the page.
-
-#### Stream Options
-
-The orchestration service offers multiple streaming options, which you can configure in addition to the LLM's streaming options.
-These include options like defining the maximum number of characters per chunk or modifying the output filter behavior.
-There are two ways to add specific streaming options to your client: either at initialization of the orchestration client, or when calling the stream API.
-
-Setting streaming options dynamically could be useful if an initialized orchestration client will also be used for streaming.
-
-You can check the list of available stream options in the [orchestration service's documentation](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/streaming).
-
-An example of setting the streaming options when calling the stream API looks like the following:
-
-```ts
-const response = orchestrationClient.stream(
-  {
-    inputParams: { country: 'France' }
-  },
-  controller,
-  {
-    llm: { include_usage: false },
-    global: { chunk_size: 10 },
-    outputFiltering: { overlap: 200 }
-  }
-);
-```
-
-Usage metrics are collected by default; if you do not want to receive them, set `include_usage` to `false`.
-If you don't want any streaming options as part of your call to the LLM, set `streamOptions.llm` to `null`.
-
-> [!NOTE]
-> When initializing a client with a JSON module config, providing streaming options is not possible.
-
 ### Templating
 
 Use the orchestration client with templating to pass a prompt containing placeholders that will be replaced with input parameters during a chat completion request.
@@ -501,6 +383,125 @@ const response = await new OrchestrationClient(jsonConfig).chatCompletion();
   return response;
 ```
 
+### Streaming
+
+The `OrchestrationClient` supports streaming responses for chat completion requests based on the [Server-sent events](https://html.spec.whatwg.org/multipage/server-sent-events.html#server-sent-events) standard.
+
+Use the `stream()` method to receive a stream of chunk responses from the model.
+After consuming the stream, call the helper methods to get the finish reason and token usage information.
+
+```ts
+const orchestrationClient = new OrchestrationClient({
+  llm: {
+    model_name: 'gpt-4o',
+    model_params: { max_tokens: 50, temperature: 0.1 }
+  },
+  templating: {
+    template: [
+      { role: 'user', content: 'Give a long history of {{?country}}?' }
+    ]
+  }
+});
+
+const response = await orchestrationClient.stream({
+  inputParams: { country: 'France' }
+});
+
+for await (const chunk of response.stream) {
+  console.log(JSON.stringify(chunk));
+}
+
+const finishReason = response.getFinishReason();
+const tokenUsage = response.getTokenUsage();
+
+console.log(`Finish reason: ${finishReason}\n`);
+console.log(`Token usage: ${JSON.stringify(tokenUsage)}\n`);
+```
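+
+As a minimal sketch, assuming OpenAI-style finish-reason values (e.g., `'length'` when the token limit is reached, which this diff does not itself confirm), you could react to the finish reason like this:
+
+```ts
+// 'length' is assumed here as the usual finish reason for a response
+// cut off by the max_tokens limit; adjust to the values your model returns.
+if (finishReason === 'length') {
+  console.warn('Response was truncated; consider raising max_tokens.');
+}
+```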
+
+#### Streaming the Delta Content
+
+The client provides a helper method to extract the text chunks as strings:
+
+```ts
+for await (const chunk of response.stream.toContentStream()) {
+  console.log(chunk); // will log the delta content
+}
+```
+
+Each chunk will be a string containing the delta content.
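+
+For example, a minimal sketch using only the `toContentStream()` helper shown above to accumulate the deltas into the full reply text (note that a stream can be consumed once, so use either this loop or the logging loop above):
+
+```ts
+// Collect each delta string into the complete response text.
+let fullText = '';
+for await (const chunk of response.stream.toContentStream()) {
+  fullText += chunk;
+}
+console.log(fullText);
+```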
+
+#### Streaming with Abort Controller
+
+A streaming request can be aborted using the `AbortController` API.
+In case of an error, SAP Cloud SDK for AI automatically closes the stream.
+Additionally, you can abort it manually by calling the `stream()` method with an `AbortController` object.
+
+```ts
+const orchestrationClient = new OrchestrationClient({
+  llm: {
+    model_name: 'gpt-4o',
+    model_params: { max_tokens: 50, temperature: 0.1 }
+  },
+  templating: {
+    template: [
+      { role: 'user', content: 'Give a long history of {{?country}}?' }
+    ]
+  }
+});
+
+const controller = new AbortController();
+const response = await orchestrationClient.stream(
+  {
+    inputParams: { country: 'France' }
+  },
+  controller
+);
+
+// Abort the streaming request after one second
+setTimeout(() => {
+  controller.abort();
+}, 1000);
+
+for await (const chunk of response.stream) {
+  console.log(JSON.stringify(chunk));
+}
+```
+
+In this example, the streaming request will be aborted after one second.
+An abort controller can be useful, e.g., when the end user wants to stop the stream or refreshes the page.
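+
+A sketch of the end-user case, assuming a browser environment with a hypothetical "stop" button (the DOM wiring is illustrative, not part of this SDK):
+
+```ts
+// Hypothetical UI wiring: a "Stop generating" button aborts the stream.
+const controller = new AbortController();
+const stopButton = document.querySelector('#stop-button'); // assumed element
+stopButton?.addEventListener('click', () => controller.abort());
+
+const response = await orchestrationClient.stream(
+  { inputParams: { country: 'France' } },
+  controller
+);
+```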
+
+#### Stream Options
+
+The orchestration service offers multiple streaming options, which you can configure in addition to the LLM's streaming options.
+These include options like defining the maximum number of characters per chunk or modifying the output filter behavior.
+There are two ways to add specific streaming options to your client: either at initialization of the orchestration client, or when calling the stream API.
+
+Setting streaming options dynamically could be useful if an initialized orchestration client will also be used for streaming.
+
+You can check the list of available stream options in the [orchestration service's documentation](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/streaming).
+
+An example of setting the streaming options when calling the stream API looks like the following:
+
+```ts
+const response = orchestrationClient.stream(
+  {
+    inputParams: { country: 'France' }
+  },
+  controller,
+  {
+    llm: { include_usage: false },
+    global: { chunk_size: 10 },
+    outputFiltering: { overlap: 200 }
+  }
+);
+```
+
+Usage metrics are collected by default; if you do not want to receive them, set `include_usage` to `false`.
+If you don't want any streaming options as part of your call to the LLM, set `streamOptions.llm` to `null`.
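+
+A minimal sketch of the latter, passing `null` exactly as described above to omit all LLM stream options from the call:
+
+```ts
+// Disable LLM-level stream options entirely by setting streamOptions.llm to null.
+const response = await orchestrationClient.stream(
+  { inputParams: { country: 'France' } },
+  controller,
+  { llm: null }
+);
+```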
+
+> [!NOTE]
+> When initializing a client with a JSON module config, providing streaming options is not possible.
+
 ### Using Resource Groups
 
 The resource group can be used as an additional parameter to pick the right orchestration deployment.
package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@sap-ai-sdk/orchestration",
-  "version": "1.6.1-20250120013121.0",
+  "version": "1.6.1-20250121013052.0",
   "description": "",
   "license": "Apache-2.0",
   "keywords": [
@@ -21,8 +21,8 @@
   ],
   "dependencies": {
     "@sap-cloud-sdk/util": "^3.25.0",
-    "@sap-ai-sdk/core": "^1.6.1-20250120013121.0",
-    "@sap-ai-sdk/ai-api": "^1.6.1-20250120013121.0"
+    "@sap-ai-sdk/core": "^1.6.1-20250121013052.0",
+    "@sap-ai-sdk/ai-api": "^1.6.1-20250121013052.0"
   },
   "devDependencies": {
     "@sap-cloud-sdk/http-client": "^3.25.0",