npm - @office-ai/aioncli-core - Versions diffs - 0.24.0 → 0.24.2 - Mend

@office-ai/aioncli-core 0.24.0 → 0.24.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

package/dist/docs/bedrock-integration-plan.md +595 -0
package/dist/docs/get-started/authentication.md +162 -0
package/dist/src/config/models.d.ts +26 -0
package/dist/src/config/models.js +134 -0
package/dist/src/config/models.js.map +1 -1
package/dist/src/core/anthropicContentGenerator.d.ts +28 -0
package/dist/src/core/anthropicContentGenerator.js +282 -0
package/dist/src/core/anthropicContentGenerator.js.map +1 -0
package/dist/src/core/bedrockContentGenerator.d.ts +72 -0
package/dist/src/core/bedrockContentGenerator.js +628 -0
package/dist/src/core/bedrockContentGenerator.js.map +1 -0
package/dist/src/core/contentGenerator.d.ts +4 -1
package/dist/src/core/contentGenerator.js +43 -0
package/dist/src/core/contentGenerator.js.map +1 -1
package/dist/src/generated/git-commit.d.ts +2 -2
package/dist/src/generated/git-commit.js +2 -2
package/dist/src/generated/git-commit.js.map +1 -1
package/dist/src/services/shellExecutionService.d.ts +5 -0
package/dist/src/services/shellExecutionService.js +12 -0
package/dist/src/services/shellExecutionService.js.map +1 -1
package/dist/src/tools/shell.js +8 -0
package/dist/src/tools/shell.js.map +1 -1
package/dist/src/utils/retry.js +25 -0
package/dist/src/utils/retry.js.map +1 -1
package/dist/tsconfig.tsbuildinfo +1 -1
package/package.json +4 -2

package/dist/docs/bedrock-integration-plan.md ADDED

@@ -0,0 +1,595 @@
+# AWS Bedrock Integration Implementation Plan
+## Overview
+Add AWS Bedrock support to aioncli using the unified Converse API, with priority
+support for Anthropic Claude model series and multi-region deployment.
+## Requirements Confirmation
+- **Region Support**: Multi-region (globally available)
+- **Priority Models**: Anthropic Claude (3.5/3.7 Sonnet)
+- **Implementation Scope**: Complete implementation (text generation, tool
+  calling, streaming responses, token counting, embedding)
+## Supported Model List
+**Phase 1: Anthropic Claude model series only**
+### Cross-Region Models (Cross-Region Inference Profiles, Recommended)
+These models use the `global.` prefix and can be called from any AWS region,
+providing optimal availability and fault tolerance.
+- `global.anthropic.claude-opus-4-5-20251101-v1:0` - Claude Opus 4.5 (most
+  powerful)
+- `global.anthropic.claude-sonnet-4-5-20250929-v1:0` - Claude Sonnet 4.5
+  (recommended, default)
+- `global.anthropic.claude-sonnet-4-20250514-v1:0` - Claude Sonnet 4
+- `global.anthropic.claude-haiku-4-5-20251001-v1:0` - Claude Haiku 4.5 (fastest)
+### Regional Models
+These models are only available in specific regions and provide backward
+compatibility with older Claude versions.
+- `anthropic.claude-3-5-sonnet-20241022-v2:0` - Claude 3.5 Sonnet v2
+- `anthropic.claude-3-5-sonnet-20240620-v1:0` - Claude 3.5 Sonnet v1
+- `anthropic.claude-3-opus-20240229-v1:0` - Claude 3 Opus
+- `anthropic.claude-3-haiku-20240307-v1:0` - Claude 3 Haiku
+**Future Phases**: Extend support to Amazon Titan, Meta Llama, and Mistral
+models as needed.
+## Core Architecture Design
+### 1. ContentGenerator Implementation
+Create `BedrockContentGenerator` class implementing the `ContentGenerator`
+interface:
+```typescript
+// packages/core/src/core/bedrockContentGenerator.ts
+export class BedrockContentGenerator implements ContentGenerator {
+  private client: BedrockRuntimeClient;
+  private model: string;
+  private region: string;
+  async generateContent(
+    request,
+    userPromptId,
+  ): Promise<GenerateContentResponse>;
+  async generateContentStream(
+    request,
+    userPromptId,
+  ): Promise<AsyncGenerator<GenerateContentResponse>>;
+  async countTokens(request): Promise<CountTokensResponse>;
+  async embedContent(request): Promise<EmbedContentResponse>;
+}
+```
+### 2. API Format Conversion
+#### Gemini → Bedrock Converse Request Format
+**Message Conversion**:
+- Gemini:
+  `{role: 'user'|'model', parts: [{text}|{functionCall}|{functionResponse}]}`
+- Bedrock:
+  `{role: 'user'|'assistant', content: [{text}|{toolUse}|{toolResult}]}`
+**Tool Definition Conversion**:
+- Gemini: `functionDeclarations` with `parameters` (JSON Schema)
+- Bedrock: `toolSpec` with `inputSchema.json` (JSON Schema)
+**System Instruction**:
+- Gemini: `systemInstruction` field
+- Bedrock: `system` array `[{text: '...'}]`
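As an illustration outside the diff, the message conversion listed above could be sketched roughly as follows. The type names (`GeminiContent`, `BedrockMessage`, and so on) are simplified, hypothetical stand-ins for the real SDK types, and the fallback of using the tool name when no `id` is present is an assumption, not something the plan specifies.

```typescript
// Simplified, hypothetical shapes; the real Gemini and Bedrock SDK types carry more fields.
type GeminiPart =
  | { text: string }
  | { functionCall: { id?: string; name: string; args: Record<string, unknown> } }
  | { functionResponse: { id?: string; name: string; response: Record<string, unknown> } };

interface GeminiContent {
  role: 'user' | 'model';
  parts: GeminiPart[];
}

type BedrockBlock =
  | { text: string }
  | { toolUse: { toolUseId: string; name: string; input: Record<string, unknown> } }
  | { toolResult: { toolUseId: string; content: Array<{ json: Record<string, unknown> }> } };

interface BedrockMessage {
  role: 'user' | 'assistant';
  content: BedrockBlock[];
}

function toBedrockBlock(part: GeminiPart): BedrockBlock {
  if ('text' in part) return { text: part.text };
  if ('functionCall' in part) {
    const { id, name, args } = part.functionCall;
    // Assumption: fall back to the tool name when Gemini supplies no id.
    return { toolUse: { toolUseId: id ?? name, name, input: args } };
  }
  const { id, name, response } = part.functionResponse;
  return { toolResult: { toolUseId: id ?? name, content: [{ json: response }] } };
}

function toBedrockMessages(contents: GeminiContent[]): BedrockMessage[] {
  return contents.map(
    (c): BedrockMessage => ({
      // Gemini calls the assistant role 'model'; Bedrock calls it 'assistant'.
      role: c.role === 'model' ? 'assistant' : 'user',
      content: c.parts.map(toBedrockBlock),
    }),
  );
}
```

Keeping the per-part conversion in its own function makes it easy to unit test each of the three part kinds separately.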
+#### Bedrock Converse → Gemini Response Format
+**Text Content**:
+- Bedrock: `{content: [{text: '...'}]}`
+- Gemini: `{parts: [{text: '...'}]}`
+**Tool Calls**:
+- Bedrock: `{content: [{toolUse: {toolUseId, name, input}}]}`
+- Gemini: `{parts: [{functionCall: {id, name, args}}]}`
+**Finish Reason Mapping**:
+- `end_turn` → `STOP`
+- `max_tokens` → `MAX_TOKENS`
+- `stop_sequence` → `STOP`
+- `tool_use` → `STOP`
+- `content_filtered` → `SAFETY`
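As an illustration outside the diff, the finish reason table above could be expressed as a small lookup. The `'OTHER'` fallback for unlisted or missing stop reasons is an assumption; the plan does not say how unknown values are handled, and `GeminiFinishReason` is a simplified stand-in for the SDK enum.

```typescript
// Simplified stand-in for the Gemini finish reason enum.
type GeminiFinishReason = 'STOP' | 'MAX_TOKENS' | 'SAFETY' | 'OTHER';

// Mirrors the mapping table from the plan, one entry per Bedrock stop reason.
const FINISH_REASON_MAP = new Map<string, GeminiFinishReason>([
  ['end_turn', 'STOP'],
  ['max_tokens', 'MAX_TOKENS'],
  ['stop_sequence', 'STOP'],
  ['tool_use', 'STOP'],
  ['content_filtered', 'SAFETY'],
]);

function mapStopReason(stopReason: string | undefined): GeminiFinishReason {
  // Assumption: unknown or missing reasons map to 'OTHER'.
  if (stopReason === undefined) return 'OTHER';
  return FINISH_REASON_MAP.get(stopReason) ?? 'OTHER';
}
```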
+### 3. Streaming Response Handling
+Use `ConverseStreamCommand` to process event streams:
+```typescript
+async *streamGenerator(stream) {
+  const toolCalls = new Map(); // Accumulate tool calls
+  for await (const event of stream) {
+    if (event.contentBlockStart) {
+      // Start new content block
+    }
+    if (event.contentBlockDelta) {
+      // Accumulate text/tool input
+      if (event.contentBlockDelta.delta?.text) {
+        yield convertTextDelta(event);
+      }
+      if (event.contentBlockDelta.delta?.toolUse) {
+        accumulateToolCall(event);
+      }
+    }
+    if (event.contentBlockStop) {
+      // Complete tool call, emit full result
+      yield finalizeToolCall(event);
+    }
+    if (event.metadata) {
+      // Token usage information
+      yield convertUsageMetadata(event);
+    }
+  }
+}
+```
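As an illustration outside the diff, the "accumulate tool calls" step in the generator above might look like the following sketch. `StreamEvent` is a simplified, hypothetical shape that covers only the fields read here, and `ToolCallAccumulator` is an illustrative name rather than a class from the package. The idea is that tool input arrives as partial JSON strings keyed by `contentBlockIndex` and can only be parsed once the block stops.

```typescript
// Simplified event shape covering only the fields this sketch reads.
interface StreamEvent {
  contentBlockStart?: {
    contentBlockIndex: number;
    start?: { toolUse?: { toolUseId: string; name: string } };
  };
  contentBlockDelta?: {
    contentBlockIndex: number;
    delta?: { text?: string; toolUse?: { input: string } };
  };
  contentBlockStop?: { contentBlockIndex: number };
}

interface FunctionCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

class ToolCallAccumulator {
  private pending = new Map<number, { id: string; name: string; json: string }>();

  /** Returns a completed call when its block stops, otherwise undefined. */
  handle(event: StreamEvent): FunctionCall | undefined {
    const start = event.contentBlockStart;
    if (start?.start?.toolUse) {
      const { toolUseId, name } = start.start.toolUse;
      this.pending.set(start.contentBlockIndex, { id: toolUseId, name, json: '' });
    }
    const delta = event.contentBlockDelta;
    const fragment = delta?.delta?.toolUse?.input;
    if (delta && fragment !== undefined) {
      // Append the partial JSON fragment to the matching block.
      const entry = this.pending.get(delta.contentBlockIndex);
      if (entry) entry.json += fragment;
    }
    const stop = event.contentBlockStop;
    if (stop) {
      const entry = this.pending.get(stop.contentBlockIndex);
      if (!entry) return undefined; // a text block ended, nothing to emit
      this.pending.delete(stop.contentBlockIndex);
      // An empty input string means the tool takes no arguments.
      return { id: entry.id, name: entry.name, args: entry.json ? JSON.parse(entry.json) : {} };
    }
    return undefined;
  }
}
```

Keying the pending map by block index lets several tool calls in one response be assembled independently.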
+### 4. Tool Call Handling
+**Multi-turn Conversation Support**:
+1. User message → Model returns toolUse
+2. Convert to Gemini functionCall → CLI executes tool
+3. Tool result converts to toolResult → Send back to Bedrock
+4. Bedrock returns final response
+**ID Matching**:
+- Bedrock `toolUseId` ↔ Gemini `functionCall.id`
+- Ensure tool call and response IDs are consistent
+## Key File Modifications
+### New Files
+1. **`/packages/core/src/core/bedrockContentGenerator.ts`** (~1800 lines)
+   - BedrockContentGenerator class implementation
+   - Format conversion methods
+   - Streaming response handling
+   - Error handling and retry logic
+2. **`/packages/core/src/core/bedrockContentGenerator.test.ts`** (~600 lines)
+   - Mock AWS SDK client
+   - Format conversion tests
+   - Tool calling tests
+   - Streaming response tests
+### Modified Files
+3. **`/packages/core/src/core/contentGenerator.ts`**
+   - Add `AuthType.USE_BEDROCK = 'bedrock'`
+   - Update `ContentGeneratorConfig` type to add `awsRegion?: string`
+   - Add Bedrock routing in `createContentGenerator()` factory function
+4. **`/packages/core/src/config/config.ts`**
+   - Add AWS environment variable detection in `createContentGeneratorConfig()`
+   - Read `AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
+5. **`/packages/core/src/config/models.ts`**
+   - Add Bedrock model constants and validation
+   - Define
+     `DEFAULT_BEDROCK_MODEL = 'global.anthropic.claude-sonnet-4-5-20250929-v1:0'`
+6. **`/packages/core/package.json`**
+   - Add dependencies:
+     - `"@aws-sdk/client-bedrock-runtime": "^3.700.0"`
+     - `"@aws-sdk/credential-providers": "^3.700.0"` (for AWS Profile
+       authentication)
+## Implementation Details
+### AWS Authentication
+**Simplest Approach: Fully rely on AWS SDK default credential chain**
+```typescript
+import { BedrockRuntimeClient } from '@aws-sdk/client-bedrock-runtime';
+// SDK automatically finds credentials, in priority order:
+// 1. Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN
+// 2. AWS_PROFILE specified profile (from ~/.aws/credentials)
+// 3. Default [default] profile (from ~/.aws/credentials)
+const client = new BedrockRuntimeClient({
+  region: process.env.AWS_REGION || 'us-east-1',
+});
+```
+**User Configuration Examples**:
+**Method 1: Environment Variables (suitable for temporary use or CI/CD)**
+```bash
+export AWS_REGION="us-east-1"
+export AWS_ACCESS_KEY_ID="AKIA..."
+export AWS_SECRET_ACCESS_KEY="..."
+npm run start
+```
+**Method 2: AWS Profile (recommended, supports multi-account switching)**
+```bash
+# ~/.aws/credentials file content:
+#   [default]
+#   aws_access_key_id = AKIA...
+#   aws_secret_access_key = ...
+#   [enterprise-ai]
+#   aws_access_key_id = AKIA...
+#   aws_secret_access_key = ...
+# Use default profile
+export AWS_REGION="us-east-1"
+npm run start
+# Use enterprise-ai profile
+export AWS_REGION="ap-southeast-1"
+export AWS_PROFILE="enterprise-ai"
+npm run start
+```
+**Advantages**:
+- Minimal code (~5 lines)
+- AWS SDK automatically handles all authentication logic
+- Supports all AWS standard authentication methods (env vars, profiles, IAM
+  roles, etc.)
+- Users don't need to learn new configuration methods
+### Token Counting Implementation
+**Background**:
+- Bedrock responses include precise `usage.inputTokens/outputTokens/totalTokens`
+- The `countTokens` method is mainly used for media content estimation and error
+  logging
+- It doesn't need to be particularly precise; a simple estimate is sufficient
+**Implementation**:
+```typescript
+async countTokens(request: CountTokensParameters): Promise<CountTokensResponse> {
+  // Extract all text content
+  const text = request.contents
+    .flatMap(c => c.parts)
+    .filter(p => 'text' in p)
+    .map(p => p.text)
+    .join('');
+  // Simple estimation: 1 token ≈ 4 characters (suitable for English and code)
+  // Actual tokens for Claude models will differ slightly, but sufficient for estimation
+  const totalTokens = Math.ceil(text.length / 4);
+  return { totalTokens };
+}
+```
+**Note**: This method is for estimation only; actual token usage is based on the
+usage in API responses.
+### Embedding Support
+**Phase 1: Not Implemented**:
+- Claude models don't provide embedding capability
+- `embedContent` method is not actually used in the CLI
+- Simply throw a "not supported" error
+```typescript
+async embedContent(request: EmbedContentParameters): Promise<EmbedContentResponse> {
+  throw new Error(
+    'Embedding is not supported for Claude models on Bedrock. ' +
+    'Consider using Amazon Titan Embed models in future versions.'
+  );
+}
+```
+**Future Extension**: If embedding support is needed, Amazon Titan Embed models
+can be added (using InvokeModel API).
+### Error Handling
+**Throttling Retry** (ThrottlingException):
+```typescript
+private async sendWithRetry(command, maxRetries = 3) {
+  for (let attempt = 0; attempt < maxRetries; attempt++) {
+    try {
+      return await this.client.send(command);
+    } catch (error) {
+      if (error.name === 'ThrottlingException' && attempt < maxRetries - 1) {
+        await sleep(Math.pow(2, attempt) * 1000); // Exponential backoff
+        continue;
+      }
+      if (error.name === 'ValidationException') {
+        throw new Error(`Bedrock validation error: ${error.message}`);
+      }
+      throw error;
+    }
+  }
+}
+```
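As an illustration outside the diff, here is a self-contained variant of the retry loop above. The names `backoffDelayMs`, `isThrottling`, and the injectable `wait` parameter are hypothetical additions that make the delay schedule easy to test; the snippet in the plan relies on a `sleep` helper it does not define.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/** Delay before retry number `attempt` (0-based): 1s, 2s, 4s, ... */
function backoffDelayMs(attempt: number, baseMs = 1000): number {
  return Math.pow(2, attempt) * baseMs;
}

function isThrottling(error: unknown): boolean {
  return error instanceof Error && error.name === 'ThrottlingException';
}

async function sendWithRetry<T>(
  send: () => Promise<T>,
  maxRetries = 3,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await send();
    } catch (error) {
      // Give up on non-throttling errors or once the attempt budget is spent.
      if (!isThrottling(error) || attempt >= maxRetries - 1) throw error;
      await wait(backoffDelayMs(attempt));
    }
  }
}
```

Passing a no-op `wait` in unit tests avoids real delays while still exercising the retry path.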
+### JSON Mode Support
+Bedrock doesn't support native JSON mode; use tool calling to simulate:
+```typescript
+if (request.config?.responseJsonSchema) {
+  const jsonTool = {
+    toolSpec: {
+      name: 'respond_in_schema',
+      description: 'Response in JSON schema',
+      inputSchema: { json: request.config.responseJsonSchema },
+    },
+  };
+  // Force use of this tool
+  const command = new ConverseCommand({
+    modelId: this.model,
+    messages,
+    toolConfig: {
+      tools: [jsonTool],
+      toolChoice: { tool: { name: 'respond_in_schema' } },
+    },
+  });
+}
+```
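As an illustration outside the diff, the other half of this technique is reading the JSON back out of the forced tool call. The sketch below assumes a simplified `ContentBlock` shape and a hypothetical helper name, `extractJsonFromToolUse`; it returns the tool input serialized as JSON text, or `undefined` when the model did not call the tool.

```typescript
// Simplified response block shape covering only the fields this sketch reads.
interface ContentBlock {
  text?: string;
  toolUse?: { toolUseId: string; name: string; input: unknown };
}

function extractJsonFromToolUse(
  content: ContentBlock[],
  toolName = 'respond_in_schema',
): string | undefined {
  // The forced tool call carries the schema-conforming object as its input.
  const block = content.find((b) => b.toolUse?.name === toolName);
  return block?.toolUse ? JSON.stringify(block.toolUse.input) : undefined;
}
```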
+## Configuration Examples
+### Environment Variable Configuration
+```bash
+# Use AWS access keys
+export AWS_REGION="us-east-1"
+export AWS_ACCESS_KEY_ID="AKIA..."
+export AWS_SECRET_ACCESS_KEY="..."
+# Or use AWS Profile
+export AWS_REGION="ap-southeast-1"
+export AWS_PROFILE="my-profile"
+# Start CLI
+npm run start
+```
+### Model Selection
+```bash
+# Use default cross-region Claude Sonnet 4.5 model
+npm run start
+# Specify specific cross-region model
+npm run start -- --model global.anthropic.claude-opus-4-5-20251101-v1:0
+# Use regional model
+npm run start -- --model anthropic.claude-3-5-sonnet-20241022-v2:0
+# Use Titan (planned for future phases; Phase 1 supports Claude models only)
+npm run start -- --model amazon.titan-text-premier-v1:0
+```
+### View Available Models
+```bash
+# List all Anthropic models in current region
+aws bedrock list-foundation-models \
+  --region $AWS_REGION \
+  --by-provider Anthropic
+```
+## Testing Strategy
+### Unit Tests (vitest)
+Mock AWS SDK client:
+```typescript
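+import { vi } from 'vitest';
+import { BedrockRuntimeClient } from '@aws-sdk/client-bedrock-runtime';
+// vi.mock calls are hoisted above the imports, so the import resolves to the mock
+vi.mock('@aws-sdk/client-bedrock-runtime', () => ({
+  BedrockRuntimeClient: vi.fn(),
+  ConverseCommand: vi.fn(),
+  ConverseStreamCommand: vi.fn(),
+}));
+const mockClient = {
+  send: vi.fn(),
+};
+vi.mocked(BedrockRuntimeClient).mockImplementation(
+  () => mockClient as unknown as BedrockRuntimeClient,
+);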
+```
+Test Coverage:
+- ✅ Request format conversion (Gemini → Bedrock)
+- ✅ Response format conversion (Bedrock → Gemini)
+- ✅ Tool definition conversion
+- ✅ Tool call/response conversion
+- ✅ Streaming response accumulation
+- ✅ Finish reason mapping
+- ✅ Error handling (throttling, validation errors)
+- ✅ Token counting estimation
+- ✅ Embedding calls
+### Integration Tests
+Requires real AWS credentials:
+```bash
+# Set test credentials
+export AWS_REGION="us-east-1"
+export AWS_ACCESS_KEY_ID="..."
+export AWS_SECRET_ACCESS_KEY="..."
+# Run integration tests
+npm run test:integration:bedrock
+```
+Test Scenarios:
+- Single-turn conversations
+- Multi-turn conversations
+- Tool calling (read files, execute commands)
+- Streaming responses
+- Different model families (Claude, Titan, Llama)
+## Validation Plan
+### End-to-End Testing
+1. **Basic Conversation**:
+   ```bash
+   npm run start
+   > Hello, please introduce yourself
+   # Verify: Normal response returned
+   ```
+2. **Tool Calling**:
+   ```bash
+   > Read the README.md file in the current directory
+   # Verify: Calls ReadFileTool, returns file content
+   ```
+3. **Multi-turn Conversation**:
+   ```bash
+   > Create a file named test.txt with content "Hello Bedrock"
+   # Verify: Calls WriteFileTool, confirms creation success
+   > Now read this file
+   # Verify: Calls ReadFileTool, returns correct content
+   ```
+4. **Streaming Response**:
+   ```bash
+   > Write a poem about cloud computing
+   # Verify: Character-by-character display, smooth experience
+   ```
+5. **Cross-Region Testing**:
+   ```bash
+   # Test Asia-Pacific region
+   export AWS_REGION="ap-southeast-1"
+   npm run start
+   # Test European region
+   export AWS_REGION="eu-west-1"
+   npm run start
+   ```
+### Performance Validation
+- Response latency < 2 seconds (non-streaming)
+- Streaming first byte latency < 500ms
+- Token counting error < 10%
+- Throttling retry success rate > 95%
+## Potential Challenges and Solutions
+### 1. Stricter Bedrock Throttling
+**Issue**: Claude models on Bedrock have stricter rate limits (e.g., 10 req/min)
+**Solutions**:
+- Implement exponential backoff retry
+- Provide friendly error messages suggesting users request quota increases
+- Support multi-region failover (if multiple regions configured)
+### 2. Significant Tool Call Format Differences
+**Issue**: Bedrock's toolUse/toolResult format differs significantly from Gemini
+**Solutions**:
+- Reference OpenAIContentGenerator's tool conversion logic
+- Establish complete ID mapping mechanism
+- Add detailed logging for debugging
+### 3. Different Schema Requirements for Different Models
+**Issue**: Claude and Llama have different levels of JSON Schema support
+**Solutions**:
+- Implement schema sanitization function to remove unsupported fields
+- Perform compatibility testing for each model family
+- Document limitations of each model in documentation
+### 4. Embedding Only Supports Titan
+**Issue**: Claude and Llama models don't provide embeddings
+**Solutions**:
+- Detect model type in `embedContent()`
+- Return clear error message if not a Titan embedding model
+- Suggest users switch to `amazon.titan-embed-text-v2:0`
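As an illustration outside the diff, the model check described above could be sketched as below. The prefix test is an assumption about how Titan embedding model IDs are named, and both function names are hypothetical.

```typescript
// Assumption: Titan embedding model IDs share the 'amazon.titan-embed' prefix.
const SUGGESTED_EMBEDDING_MODEL = 'amazon.titan-embed-text-v2:0';

function isTitanEmbeddingModel(modelId: string): boolean {
  return modelId.startsWith('amazon.titan-embed');
}

function assertEmbeddingSupported(modelId: string): void {
  if (!isTitanEmbeddingModel(modelId)) {
    throw new Error(
      `Model "${modelId}" does not provide embeddings. ` +
        `Consider switching to ${SUGGESTED_EMBEDDING_MODEL}.`,
    );
  }
}
```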
+## Implementation Priority
+### Phase 1 (Core Functionality)
+- ✅ BedrockContentGenerator basic structure
+- ✅ Text generation (generateContent)
+- ✅ Streaming responses (generateContentStream)
+- ✅ Basic error handling
+- ✅ Claude model support
+### Phase 2 (Tool Support)
+- ✅ Tool definition conversion
+- ✅ Tool calling and response handling
+- ✅ Multi-turn conversation support
+- ✅ Complete unit tests
+### Phase 3 (Enhanced Features)
+- ✅ Token counting implementation (simple estimation)
+- ✅ Throttling retry optimization
+- ✅ Single region configuration (via AWS_REGION environment variable)
+### Phase 4 (Production Ready)
+- ⏳ Integration tests (basic tests complete, can be extended)
+- ✅ Documentation improvements (authentication docs added)
+- ✅ Performance optimization (retry logic integrated)
+- ✅ Error message localization (enhanceError provides friendly error messages)
+## Estimated Effort
+**Phase 1 (Claude support only)**:
+- Core implementation: ~1500 lines of code (BedrockContentGenerator)
+- Test code: ~500 lines of code (unit tests + format conversion tests)
+- Configuration changes: ~100 lines of code (AuthType, factory, models config)
+- Documentation updates: ~300 lines of documentation
+Total: Approximately 2400 lines of code
+**Simplifications**:
+- ❌ No support for Titan/Llama/Mistral models
+- ❌ No Embedding functionality
+- ✅ Token counting uses simple estimation
+- ✅ Only implement environment variable and AWS Profile authentication