gemini-executor 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 mokasz
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,274 @@
1
+ # Gemini Executor
2
+
3
+ > **Seamlessly integrate Google's Gemini CLI with Claude Code for powerful AI collaboration**
4
+
5
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
6
+ [![Node.js Version](https://img.shields.io/badge/node-%3E%3D18.0.0-brightgreen)](https://nodejs.org/)
7
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.0+-blue)](https://www.typescriptlang.org/)
8
+
9
+ Gemini Executor is an integration layer that combines the strengths of **Claude Code** and **Google's Gemini CLI**, enabling them to work together as complementary AI assistants. Claude handles precise code editing and project operations, while Gemini provides creative reasoning, data analysis, and multimodal processing with its 1M token context window.
10
+
11
+ ## ✨ Key Features
12
+
13
+ - **🎯 Complementary AI Strengths** - Claude for code precision, Gemini for creative reasoning
14
+ - **📦 Context Isolation** - Gemini's large outputs don't pollute your main conversation
15
+ - **🔒 Security First** - Built-in sanitization, validation, and sensitive file detection
16
+ - **🚀 Zero Config** - Works with existing Gemini CLI installation
17
+ - **🎨 Multimodal Ready** - Process images, PDFs, audio, and video files
18
+ - **💰 Cost Effective** - Leverages Gemini's generous free tier
19
+
20
+ ## 🏗️ Architecture
21
+
22
+ ```
23
+ ┌─────────────────────────────────────────────────────────────────┐
24
+ │                           Claude Code                           │
25
+ │                                                                 │
26
+ │  ┌────────────────┐         ┌──────────────────┐                │
27
+ │  │  User Request  │────────▶│   Task Router    │                │
28
+ │  └────────────────┘         └──────────────────┘                │
29
+ │                                      │                          │
30
+ │                     ┌────────────────┴───────────┐              │
31
+ │                     │                            │              │
32
+ │             ┌───────▼────────┐          ┌────────▼──────────┐   │
33
+ │             │  Claude Agent  │          │  Gemini SubAgent  │   │
34
+ │             │                │          │                   │   │
35
+ │             │ • Code Editing │          │ • Complex Logic   │   │
36
+ │             │ • File Ops     │          │ • Data Analysis   │   │
37
+ │             │ • Git Ops      │          │ • Creative Tasks  │   │
38
+ │             └────────────────┘          └─────────┬─────────┘   │
39
+ │                                                   │             │
40
+ │                                        ┌──────────▼─────────┐   │
41
+ │                                        │     Gemini CLI     │   │
42
+ │                                        └────────────────────┘   │
43
+ └─────────────────────────────────────────────────────────────────┘
44
+ ```
45
+
46
+ ### Design Pattern: "Thin Skill + Universal SubAgent"
47
+
48
+ - **Thin Skill**: Lightweight `/gemini` command for user interaction
49
+ - **Universal SubAgent**: Reusable `gemini-executor` agent for programmatic delegation
50
+ - **Context Isolation**: Long outputs processed separately from main conversation
51
+
52
+ ## 📋 Prerequisites
53
+
54
+ - **Node.js** >= 18.0.0
55
+ - **Claude Code** (CLI tool)
56
+ - **Gemini CLI** >= 1.0.0
57
+
58
+ Install Gemini CLI:
59
+ ```bash
60
+ # macOS
61
+ brew install gemini-cli
62
+
63
+ # Or via npm
64
+ npm install -g @google/gemini-cli
65
+ ```
66
+
67
+ ## 🚀 Installation
68
+
69
+ ### Option 1: Clone and Install
70
+
71
+ ```bash
72
+ git clone https://github.com/mokasz/gemini-executor.git
73
+ cd gemini-executor
74
+ npm install
75
+ npm run build
76
+ ```
77
+
78
+ ### Option 2: npm (coming soon)
79
+
80
+ ```bash
81
+ npm install -g gemini-executor
82
+ ```
83
+
84
+ ## 📖 Usage
85
+
86
+ ### As a SubAgent (Programmatic)
87
+
88
+ Claude can automatically delegate tasks to Gemini:
89
+
90
+ ```typescript
91
+ // Claude internally uses:
92
+ Task({
93
+ subagent_type: 'gemini-executor',
94
+ prompt: 'Analyze this large codebase and identify architectural patterns',
95
+ description: 'Analyze codebase architecture'
96
+ })
97
+ ```
98
+
99
+ **Use Cases:**
100
+ - Large project analysis (leveraging 1M token context)
101
+ - Complex algorithm design
102
+ - Data analysis and interpretation
103
+ - Creative problem-solving
104
+
105
+ ### As a Skill (Command Line)
106
+
107
+ Users can directly invoke Gemini:
108
+
109
+ ```bash
110
+ # Simple query
111
+ /gemini Explain how dependency injection works
112
+
113
+ # With specific model
114
+ /gemini -m gemini-2.0-flash-thinking-exp Design a caching strategy
115
+
116
+ # JSON output
117
+ /gemini -o json Extract UI components from this image
118
+
119
+ # Interactive mode
120
+ /gemini -i Let's brainstorm ideas for...
121
+ ```
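The flags in the examples above (`-m`, `-o`, `-i`) suggest a simple mapping from options to CLI arguments. The following is a minimal sketch of that mapping, not the package's implementation: `SkillOptions` and `buildGeminiArgs` are hypothetical names, and passing the query as the final positional argument is an assumption.

```typescript
// Hypothetical option shape, modeled on the flags shown in the examples above.
interface SkillOptions {
  query: string;
  model?: string; // maps to -m
  outputFormat?: 'text' | 'json' | 'stream-json'; // maps to -o
  interactive?: boolean; // maps to -i
}

// Build an argument vector instead of a shell string, so the query travels
// as one argument and is never re-parsed by a shell.
function buildGeminiArgs(opts: SkillOptions): string[] {
  const args: string[] = [];
  if (opts.model) args.push('-m', opts.model);
  if (opts.outputFormat) args.push('-o', opts.outputFormat);
  if (opts.interactive) args.push('-i');
  // Assumption: the query is passed as the last positional argument.
  args.push(opts.query);
  return args;
}
```

An argument array like this can be handed to a process-spawning API that does not invoke a shell, which sidesteps most quoting problems.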
122
+
123
+ **Use Cases:**
124
+ - Quick questions and clarifications
125
+ - Multimodal file processing (images, PDFs, audio, video)
126
+ - Alternative perspectives on problems
127
+ - Iterative problem-solving
128
+
129
+ ## 🎨 Real-World Examples
130
+
131
+ ### Example 1: Analyze Large Codebase
132
+
133
+ ```text
134
+ # Claude delegates to Gemini for full project analysis
135
+ User: "Analyze the entire project structure and suggest improvements"
136
+ Claude: [Uses gemini-executor agent]
137
+ Gemini: [Reads all files with 1M context, provides comprehensive analysis]
138
+ Claude: [Implements suggested improvements]
139
+ ```
140
+
141
+ ### Example 2: Process UI Design Image
142
+
143
+ ```bash
144
+ # Direct skill invocation for multimodal task
145
+ /gemini Analyze this UI design and extract component specs design.png
146
+ ```
147
+
148
+ ### Example 3: Complex Algorithm Design
149
+
150
+ ```text
151
+ User: "Design an efficient LRU cache with O(1) operations"
152
+ Claude: "Let me consult Gemini for algorithm design"
153
+ Claude: [Delegates to gemini-executor]
154
+ Gemini: [Provides algorithm design with trade-offs]
155
+ Claude: [Implements the algorithm in code]
156
+ ```
157
+
158
+ ## 🔒 Security Features
159
+
160
+ - **Path Validation**: Prevents directory traversal attacks
161
+ - **Command Injection Prevention**: Sanitizes all shell inputs
162
+ - **Sensitive File Detection**: Warns before processing `.env`, credentials, etc.
163
+ - **API Key Protection**: Never logs or exposes API keys
164
+ - **Input Sanitization**: Escapes special characters in user input
165
+
166
+ ## 🎯 Complementary Strengths
167
+
168
+ | Capability | Claude | Gemini |
169
+ |------------|--------|--------|
170
+ | Code Editing | ⭐⭐⭐⭐⭐ | ⭐⭐ |
171
+ | File Operations | ⭐⭐⭐⭐⭐ | - |
172
+ | Git Operations | ⭐⭐⭐⭐⭐ | - |
173
+ | Multimodal Processing | ⭐ | ⭐⭐⭐⭐⭐ |
174
+ | Super-long Context (1M tokens) | ⭐⭐ | ⭐⭐⭐⭐⭐ |
175
+ | Free Tier | - | ⭐⭐⭐⭐⭐ |
176
+
177
+ ## 📂 Project Structure
178
+
179
+ ```
180
+ gemini-executor/
181
+ ├── agents/                  # SubAgent implementations
182
+ │   └── gemini-executor/     # Main Gemini executor agent
183
+ ├── skills/                  # Skill implementations
184
+ │   └── gemini/              # User-facing /gemini command
185
+ ├── docs/                    # Documentation
186
+ │   └── architecture.md      # Detailed architecture guide
187
+ ├── LICENSE                  # MIT License
188
+ ├── package.json             # Project metadata
189
+ └── README.md                # This file
190
+ ```
191
+
192
+ ## 🛠️ Development
193
+
194
+ ### Setup Development Environment
195
+
196
+ ```bash
197
+ # Install dependencies
198
+ npm install
199
+
200
+ # Run in watch mode
201
+ npm run dev
202
+
203
+ # Run tests
204
+ npm test
205
+
206
+ # Lint code
207
+ npm run lint
208
+
209
+ # Format code
210
+ npm run format
211
+ ```
212
+
213
+ ### Build
214
+
215
+ ```bash
216
+ npm run build
217
+ ```
218
+
219
+ Build output is written to the `dist/` directory.
220
+
221
+ ## 🤝 Contributing
222
+
223
+ Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
224
+
225
+ ### Development Roadmap
226
+
227
+ - [x] Architecture design
228
+ - [x] Documentation
229
+ - [ ] SubAgent implementation (`agents/gemini-executor/`)
230
+ - [ ] Skill implementation (`skills/gemini/`)
231
+ - [ ] Unit tests
232
+ - [ ] Integration tests
233
+ - [ ] CI/CD pipeline
234
+ - [ ] npm package publication
235
+
236
+ ## 📊 Status
237
+
238
+ | Component | Status | Progress |
239
+ |-----------|--------|----------|
240
+ | Architecture | ✅ Complete | 100% |
241
+ | Documentation | ✅ Complete | 100% |
242
+ | SubAgent Implementation | ⏳ Planned | 0% |
243
+ | Skill Implementation | ⏳ Planned | 0% |
244
+ | Tests | ⏳ Planned | 0% |
245
+
246
+ **Overall Progress**: 35% (Design phase complete, implementation pending)
247
+
248
+ ## 📚 Documentation
249
+
250
+ - [Architecture Overview](docs/architecture.md) - Detailed system design
251
+ - [Analysis Report](ANALYSIS_REPORT.md) - Comprehensive analysis from Gemini
252
+ - [Strategic Summary](STRATEGIC_SUMMARY.md) - Executive overview
253
+ - [Quick Reference](QUICK_REFERENCE.md) - Developer handbook
254
+ - [Analysis Index](ANALYSIS_INDEX.md) - Documentation navigation
255
+
256
+ ## 📝 License
257
+
258
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
259
+
260
+ ## 🙏 Acknowledgments
261
+
262
+ - **Claude Code** by Anthropic - For providing the foundation and integration capabilities
263
+ - **Google Gemini** - For the powerful AI capabilities and CLI tool
264
+ - The open-source community for inspiration and best practices
265
+
266
+ ## 📮 Contact
267
+
268
+ - **Author**: mokasz
269
+ - **Issues**: [GitHub Issues](https://github.com/mokasz/gemini-executor/issues)
270
+ - **Discussions**: [GitHub Discussions](https://github.com/mokasz/gemini-executor/discussions)
271
+
272
+ ---
273
+
274
+ **Made with ❤️ by the AI community**
@@ -0,0 +1,315 @@
1
+ # Gemini Executor SubAgent
2
+
3
+ A specialized agent for executing Google's Gemini CLI commands from within Claude Code.
4
+
5
+ ## Overview
6
+
7
+ This SubAgent provides a programmatic interface to Gemini CLI, allowing Claude to delegate tasks that benefit from Gemini's capabilities such as:
8
+
9
+ - Complex reasoning and creative problem-solving
10
+ - Large context analysis (1M tokens)
11
+ - Multimodal processing (images, PDFs, audio, video)
12
+ - Data analysis and interpretation
13
+
14
+ ## Architecture
15
+
16
+ ```
17
+ Claude Code
18
+
19
+ ├─ Task Tool
20
+ │   │
21
+ │   └─ subagent_type: 'gemini-executor'
22
+ │       │
23
+ │       └─ Gemini Executor SubAgent
24
+ │           │
25
+ │           └─ Gemini CLI
26
+ │               │
27
+ │               └─ Google Gemini API
28
+ ```
29
+
30
+ ## Usage
31
+
32
+ ### From Claude Code (Internal)
33
+
34
+ Claude can invoke this SubAgent via the Task tool:
35
+
36
+ ```typescript
37
+ Task({
38
+ subagent_type: 'gemini-executor',
39
+ prompt: JSON.stringify({
40
+ query: 'Analyze this codebase architecture',
41
+ files: ['/path/to/project'],
42
+ outputFormat: 'text'
43
+ }),
44
+ description: 'Analyze codebase architecture'
45
+ })
46
+ ```
47
+
48
+ ### Programmatic Usage
49
+
50
+ ```typescript
51
+ import { execute, checkGeminiCLI } from './agents/gemini-executor';
52
+
53
+ // Check if Gemini CLI is available
54
+ const isAvailable = await checkGeminiCLI();
55
+ if (!isAvailable) {
56
+ console.error('Gemini CLI is not installed');
57
+ process.exit(1);
58
+ }
59
+
60
+ // Execute a query
61
+ const result = await execute({
62
+ query: 'Explain how dependency injection works',
63
+ model: 'gemini-2.0-flash',
64
+ outputFormat: 'text'
65
+ });
66
+
67
+ if (result.success) {
68
+ console.log('Output:', result.output);
69
+ console.log('Metadata:', result.metadata);
70
+ } else {
71
+ console.error('Error:', result.error);
72
+ }
73
+ ```
74
+
75
+ ## API Reference
76
+
77
+ ### `execute(options, config?)`
78
+
79
+ Main execution function for Gemini CLI.
80
+
81
+ **Parameters:**
82
+ - `options: ExecutionOptions` - Execution options
83
+   - `query: string` - Query or prompt to send to Gemini (required)
84
+   - `model?: string` - Specific model to use
85
+   - `outputFormat?: 'text' | 'json' | 'stream-json'` - Output format
86
+   - `files?: string[]` - File paths to include in the prompt
87
+   - `workingDir?: string` - Working directory for file references
88
+   - `interactive?: boolean` - Enable interactive mode
89
+ - `config?: Partial<GeminiConfig>` - Configuration overrides
90
+
91
+ **Returns:** `Promise<ExecutionResult>`
92
+ - `success: boolean` - Success status
93
+ - `output?: string` - Output from Gemini
94
+ - `error?: string` - Error message if failed
95
+ - `metadata: object` - Execution metadata
96
+   - `model: string` - Model used
97
+   - `retries: number` - Number of retries
98
+   - `duration: number` - Execution duration in ms
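Read as TypeScript, the parameter and return lists above correspond roughly to the interfaces below. This is a sketch transcribed from the lists, not the package's actual type declarations. The `summarize` helper is hypothetical and only shows how the fields might be consumed.

```typescript
type OutputFormat = 'text' | 'json' | 'stream-json';

// Transcribed from the parameter list above.
interface ExecutionOptions {
  query: string;
  model?: string;
  outputFormat?: OutputFormat;
  files?: string[];
  workingDir?: string;
  interactive?: boolean;
}

// Transcribed from the return-value list above.
interface ExecutionResult {
  success: boolean;
  output?: string;
  error?: string;
  metadata: { model: string; retries: number; duration: number };
}

// Hypothetical helper: condense a result into a single log line.
function summarize(result: ExecutionResult): string {
  const { model, retries, duration } = result.metadata;
  const status = result.success ? 'ok' : `failed (${result.error ?? 'unknown error'})`;
  return `${status} | model=${model} retries=${retries} duration=${duration}ms`;
}
```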
99
+
100
+ ### `checkGeminiCLI(config?)`
101
+
102
+ Check if Gemini CLI is installed and accessible.
103
+
104
+ **Parameters:**
105
+ - `config?: Partial<GeminiConfig>` - Configuration overrides
106
+
107
+ **Returns:** `Promise<boolean>` - True if Gemini CLI is available
108
+
109
+ ## Configuration
110
+
111
+ Default configuration:
112
+
113
+ ```typescript
114
+ {
115
+ cliPath: '/opt/homebrew/bin/gemini',
116
+ defaultModel: 'gemini-2.0-flash',
117
+ maxRetries: 3,
118
+ timeout: 120000, // 2 minutes
119
+ yolo: true // Auto-confirm prompts
120
+ }
121
+ ```
122
+
123
+ Override configuration when calling `execute()`:
124
+
125
+ ```typescript
126
+ const result = await execute(
127
+ { query: 'Your query' },
128
+ {
129
+ cliPath: '/custom/path/to/gemini',
130
+ defaultModel: 'gemini-2.0-flash-thinking-exp',
131
+ maxRetries: 5,
132
+ timeout: 300000 // 5 minutes
133
+ }
134
+ );
135
+ ```
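One plausible way for overrides to combine with the defaults is a shallow merge. The sketch below assumes that behavior: `resolveConfig` is a hypothetical name, and the `GeminiConfig` shape is inferred from the default configuration shown earlier.

```typescript
// Shape inferred from the default configuration shown above.
interface GeminiConfig {
  cliPath: string;
  defaultModel: string;
  maxRetries: number;
  timeout: number; // milliseconds
  yolo: boolean;
}

const DEFAULT_CONFIG: GeminiConfig = {
  cliPath: '/opt/homebrew/bin/gemini',
  defaultModel: 'gemini-2.0-flash',
  maxRetries: 3,
  timeout: 120000,
  yolo: true,
};

// Hypothetical helper: later keys win, so any override replaces its default.
function resolveConfig(overrides: Partial<GeminiConfig> = {}): GeminiConfig {
  return { ...DEFAULT_CONFIG, ...overrides };
}
```

With this approach, fields that are not overridden (such as `yolo` in the example above) keep their default values.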
136
+
137
+ ## Security Features
138
+
139
+ ### Input Sanitization
140
+
141
+ All user inputs are sanitized to prevent command injection:
142
+
143
+ ```typescript
144
+ // Dangerous characters are removed or escaped
145
+ sanitizeInput('hello; rm -rf /') // → 'hello rm -rf /'
146
+ ```
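A minimal sketch of how such a sanitizer could work, assuming it simply strips common shell metacharacters. The character set and the name `sanitizeInputSketch` are illustrative assumptions, not the package's actual rules.

```typescript
// Assumption: these are the characters treated as dangerous.
const SHELL_METACHARACTERS = /[;&|`$<>\\]/g;

// Remove shell metacharacters and trim surrounding whitespace.
function sanitizeInputSketch(input: string): string {
  return input.replace(SHELL_METACHARACTERS, '').trim();
}

console.log(sanitizeInputSketch('hello; rm -rf /')); // prints "hello rm -rf /"
```

Stripping characters is a defense-in-depth measure. Passing arguments as an array to a non-shell process API avoids shell parsing altogether.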
147
+
148
+ ### Path Validation
149
+
150
+ File paths are validated to prevent directory traversal attacks:
151
+
152
+ ```typescript
153
+ validateFilePath('../../../etc/passwd') // → Throws error
154
+ validateFilePath('/legitimate/path') // → '/legitimate/path'
155
+ ```
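The traversal check can be sketched as "normalize, then confirm the result still sits under an allowed base directory". The version below is an illustration under stated assumptions: it handles POSIX-style paths only, does not resolve symlinks, and takes an explicit `baseDir` argument that the `validateFilePath` call above does not show. Both function names are hypothetical.

```typescript
// Normalize a POSIX-style path: drop '.' and empty segments, resolve '..'.
function normalizePosix(p: string): string {
  const out: string[] = [];
  for (const seg of p.split('/')) {
    if (seg === '' || seg === '.') continue;
    if (seg === '..') out.pop();
    else out.push(seg);
  }
  return '/' + out.join('/');
}

// Hypothetical validator: return the normalized path, or throw if it
// falls outside baseDir after normalization.
function validateWithinBase(filePath: string, baseDir: string): string {
  const base = normalizePosix(baseDir);
  const joined = filePath.startsWith('/') ? filePath : `${base}/${filePath}`;
  const resolved = normalizePosix(joined);
  const prefix = base === '/' ? '/' : base + '/';
  if (resolved !== base && !resolved.startsWith(prefix)) {
    throw new Error(`Path escapes base directory: ${filePath}`);
  }
  return resolved;
}
```

Comparing against `base + '/'` rather than `base` alone prevents a sibling such as `/project-secrets` from passing a check against `/project`.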
156
+
157
+ ### Sensitive File Detection
158
+
159
+ The SubAgent warns when processing potentially sensitive files:
160
+
161
+ ```typescript
162
+ // Detects patterns like:
163
+ // .env, credentials.json, *.key, *.pem, id_rsa, etc.
164
+ ```
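Detection of this kind is typically a match on the file's base name. The sketch below covers only the patterns listed above, plus `.env.*` variants as an added assumption. `SENSITIVE_PATTERNS` and `isSensitiveFile` are hypothetical names.

```typescript
// Patterns taken from the list above; the `.env.*` variant is an assumption.
const SENSITIVE_PATTERNS: RegExp[] = [
  /^\.env(\..+)?$/, // .env, .env.local
  /^credentials\.json$/,
  /\.key$/,
  /\.pem$/,
  /^id_rsa$/,
];

// Test only the final path segment against each pattern.
function isSensitiveFile(filePath: string): boolean {
  const name = filePath.split('/').pop() ?? '';
  return SENSITIVE_PATTERNS.some((re) => re.test(name));
}
```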
165
+
166
+ ## Error Handling
167
+
168
+ The SubAgent includes robust error handling:
169
+
170
+ ### Retry Logic
171
+
172
+ Automatically retries failed requests with exponential backoff:
173
+
174
+ ```typescript
175
+ // Attempt 1: immediate
176
+ // Attempt 2: 1 second delay
177
+ // Attempt 3: 2 second delay
178
+ // etc.
179
+ ```
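The schedule above (immediate, then 1 second, then 2 seconds) matches a doubling delay that starts at one second. A minimal sketch follows, assuming a cap on total attempts rather than on retries. `withRetry`, `backoffDelay`, and the injectable `sleep` parameter are hypothetical names.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before a 1-based attempt: 0, base, 2*base, 4*base, ...
function backoffDelay(attempt: number, baseMs = 1000): number {
  return attempt <= 1 ? 0 : baseMs * 2 ** (attempt - 2);
}

// Run `task` until it succeeds or maxAttempts is reached, waiting
// according to backoffDelay before each attempt after the first.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 3,
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const delay = backoffDelay(attempt);
    if (delay > 0) await sleep(delay);
    try {
      return await task();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```

Injecting `sleep` keeps the function testable without real waiting. Production code would likely also add jitter and retry only on transient errors.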
180
+
181
+ ### Timeout Protection
182
+
183
+ Commands that exceed the timeout are automatically terminated:
184
+
185
+ ```typescript
186
+ {
187
+ timeout: 120000 // 2 minutes default
188
+ }
189
+ ```
190
+
191
+ ### Detailed Error Messages
192
+
193
+ Errors include context for debugging:
194
+
195
+ ```typescript
196
+ {
197
+ success: false,
198
+ error: 'Gemini CLI execution failed: timeout exceeded',
199
+ metadata: {
200
+ model: 'gemini-2.0-flash',
201
+ retries: 3,
202
+ duration: 120045
203
+ }
204
+ }
205
+ ```
206
+
207
+ ## Output Formats
208
+
209
+ ### Text (Default)
210
+
211
+ Plain text response from Gemini:
212
+
213
+ ```typescript
214
+ await execute({
215
+ query: 'Explain async/await',
216
+ outputFormat: 'text'
217
+ });
218
+ // Returns plain text explanation
219
+ ```
220
+
221
+ ### JSON
222
+
223
+ Structured JSON response:
224
+
225
+ ```typescript
226
+ await execute({
227
+ query: 'Extract UI components from this design',
228
+ outputFormat: 'json'
229
+ });
230
+ // Returns parsed JSON object
231
+ ```
232
+
233
+ ### Stream JSON
234
+
235
+ Real-time JSON updates:
236
+
237
+ ```typescript
238
+ await execute({
239
+ query: 'Analyze large codebase',
240
+ outputFormat: 'stream-json'
241
+ });
242
+ // Returns streaming JSON for progress updates
243
+ ```
244
+
245
+ ## Examples
246
+
247
+ ### Example 1: Simple Query
248
+
249
+ ```typescript
250
+ const result = await execute({
251
+ query: 'What are the benefits of TypeScript?'
252
+ });
253
+ ```
254
+
255
+ ### Example 2: Code Analysis
256
+
257
+ ```typescript
258
+ const result = await execute({
259
+ query: 'Analyze this code for potential issues',
260
+ files: ['./src/index.ts'],
261
+ workingDir: '/path/to/project'
262
+ });
263
+ ```
264
+
265
+ ### Example 3: With Specific Model
266
+
267
+ ```typescript
268
+ const result = await execute({
269
+ query: 'Design an optimal caching strategy',
270
+ model: 'gemini-2.0-flash-thinking-exp',
271
+ outputFormat: 'json'
272
+ });
273
+ ```
274
+
275
+ ### Example 4: Large Context Analysis
276
+
277
+ ```typescript
278
+ const result = await execute({
279
+ query: 'Analyze the entire project structure and identify architectural patterns',
280
+ files: ['./src', './docs', './tests'],
281
+ workingDir: '/path/to/large/project',
282
+ model: 'gemini-2.0-flash'
283
+ });
284
+ ```
285
+
286
+ ## Debugging
287
+
288
+ Enable verbose logging:
289
+
290
+ ```typescript
291
+ // Set environment variable
292
+ process.env.DEBUG = 'gemini-executor';
293
+
294
+ // Or add console.log statements
295
+ console.log('Command:', command);
296
+ console.log('Result:', result);
297
+ ```
298
+
299
+ ## Limitations
300
+
301
+ - **Gemini CLI Required**: Must have Gemini CLI installed
302
+ - **File Size Limits**:
303
+   - Images: max 20MB
304
+   - PDFs/Audio/Video: max 100MB
305
+ - **Context Window**: max 1M tokens
306
+ - **Rate Limits**: Subject to Gemini API rate limits
307
+ - **Free Tier**: Daily usage limits apply
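The file size limits above could be enforced before a file is handed to Gemini. The sketch below assumes that "MB" means 1024 × 1024 bytes and that images are recognized by extension. The extension list and function names are hypothetical.

```typescript
const MB = 1024 * 1024; // assumption: binary megabytes

// Assumption: files with these extensions count as images.
const IMAGE_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif', '.webp'];

// Images are capped at 20MB; everything else (PDF, audio, video) at 100MB.
function maxBytesFor(fileName: string): number {
  const lower = fileName.toLowerCase();
  const isImage = IMAGE_EXTENSIONS.some((ext) => lower.endsWith(ext));
  return (isImage ? 20 : 100) * MB;
}

function fitsSizeLimit(fileName: string, sizeBytes: number): boolean {
  return sizeBytes <= maxBytesFor(fileName);
}
```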
308
+
309
+ ## Contributing
310
+
311
+ See the main [CONTRIBUTING.md](../../CONTRIBUTING.md) for contribution guidelines.
312
+
313
+ ## License
314
+
315
+ MIT License - see [LICENSE](../../LICENSE)