ox-ai-workers 0.9.5 → 0.9.6.1
- checksums.yaml +4 -4
- data/.cursor/rules/010-project-structure.mdc +89 -0
- data/.cursor/rules/998-clean-code.mdc +52 -0
- data/.cursor/rules/999-mdc-format.mdc +132 -0
- data/CHANGELOG.md +4 -1
- data/README.md +229 -1
- data/lib/oxaiworkers/assistant/module_base.rb +8 -0
- data/lib/oxaiworkers/assistant/painter.rb +0 -6
- data/lib/oxaiworkers/iterator.rb +33 -4
- data/lib/oxaiworkers/module_request.rb +3 -2
- data/lib/oxaiworkers/tool/pixels.rb +3 -23
- data/lib/oxaiworkers/version.rb +1 -1
- metadata +4 -2
- data/.cursorrules +0 -155
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 768550f6fde8e654917a2b01404856abcbe643429ba725d7c86c853b472f4bed
+  data.tar.gz: 9444f8c6869caa2e7039d3632605885579364942de95ba2fa6b201d5ca00c3e5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6a0c8f3622ab30d8a199b367ad3b0bab3c985a292cc786d3c38380515b5adbe02b945eaff4606073b199a59bf4237bb36c86a0a89b07f6c62c0bc0b738ed7593
+  data.tar.gz: 491720defa83f4b26cccc0d389975f1125c8636821c37314814fdbe9bf714d9233f22456a09b2f8ec890e2b52d3fcb96065295563b1272336316f34461fe6667
data/.cursor/rules/010-project-structure.mdc
ADDED
@@ -0,0 +1,89 @@
+---
+description:
+globs:
+alwaysApply: true
+---
+# Overview
+
+OxAiWorkers is a Ruby gem that implements a finite state machine (FSM, using the `state_machine` gem) to solve tasks using generative intelligence. This approach enhances the final result by utilizing internal monologue and external tools.
+
+## Core Components
+
+- `Request` and `DelayedRequest` - classes for executing API requests (immediate and delayed)
+- `ModuleRequest` - base class for all API requests with parsing and response handling ([module_request.rb](mdc:lib/oxaiworkers/module_request.rb))
+- `Iterator` - main class for iterative task execution with tools ([iterator.rb](mdc:lib/oxaiworkers/iterator.rb))
+- `Assistant::ModuleBase` - high-level wrappers over Iterator (Sysop, Coder, Localizer, etc.) ([module_base.rb](mdc:lib/oxaiworkers/assistant/module_base.rb))
+- `Tool` - tools that can be used during task execution (Eval, FileSystem, Database, Pixels, Pipeline)
+- `ToolDefinition` - module for declaring functions and methods for tools ([tool_definition.rb](mdc:lib/oxaiworkers/tool_definition.rb))
+- `StateTools` - base class for managing states and transitions ([state_tools.rb](mdc:lib/oxaiworkers/state_tools.rb))
+- `ContextualLogger` - logging system with contextual information support
+
+## Code Conventions
+
+- Use `snake_case` for method and variable names
+- All code comments, CHANGELOG, README, and other documentation must be written in English
+
+## Interaction Patterns
+
+- The system uses internal monologue (inner_monologue) for planning actions
+- External voice (outer_voice) is used for communication with the user
+- Execution flow management through finite state machine
+- Implementation of callback mechanisms for flexible event handling
+
+## Tools Architecture
+
+- Each tool should be a self-contained module
+- Tools are registered through the `define_function` interface
+- All tools should handle their own errors and return readable messages
+- Handle errors at the tool level, preventing them from interrupting the main execution flow
+
+## Finite State Machine Implementation
+
+- Core FSM based on `state_machine` gem with states: idle → prepared → requested → analyzed → finished → idle
+- State transitions managed by events: prepare, request, analyze, complete, iterate, end ([state_tools.rb](mdc:lib/oxaiworkers/state_tools.rb), [iterator.rb](mdc:lib/oxaiworkers/iterator.rb))
+- `StateTools` - base class for FSM implementation with event hooks and transition callbacks ([state_tools.rb](mdc:lib/oxaiworkers/state_tools.rb))
+- `StateBatch` - FSM extension for batch request processing with additional states
+- Automatic error recovery and retry mechanisms for failed API requests
+
+## Iterator Lifecycle
+
+- 3 core functions: inner_monologue, outer_voice, finish_it ([iterator.rb](mdc:lib/oxaiworkers/iterator.rb))
+- Configurable message queue for stateful conversation history
+- Callback system for processing each state transition
+- Context and milestone management for optimizing token usage
+- Support for custom steps and instruction templating
+
+## Assistants Details
+
+- `ModuleBase` - shared functionality for all assistant types ([module_base.rb](mdc:lib/oxaiworkers/assistant/module_base.rb))
+- `Sysop` - system administration and shell command execution ([sysop.rb](mdc:lib/oxaiworkers/assistant/sysop.rb), [file_system.rb](mdc:lib/oxaiworkers/tool/file_system.rb))
+- `Coder` - specialized for code generation and analysis ([coder.rb](mdc:lib/oxaiworkers/assistant/coder.rb), [eval.rb](mdc:lib/oxaiworkers/tool/eval.rb))
+- `Localizer` - translation and localization support ([localizer.rb](mdc:lib/oxaiworkers/assistant/localizer.rb))
+- `Orchestrator` - Coordinates multiple assistants to work together on complex tasks ([orchestrator.rb](mdc:lib/oxaiworkers/assistant/orchestrator.rb), [pipeline.rb](mdc:lib/oxaiworkers/tool/pipeline.rb))
+- `Painter` - Image generation and manipulation ([painter.rb](mdc:lib/oxaiworkers/assistant/painter.rb), [pixels.rb](mdc:lib/oxaiworkers/tool/pixels.rb))
+
+## Internationalization and Localization
+
+- All user-facing strings MUST be properly localized using I18n (config/locales/*.yml)
+- Use I18n.t for all text that will be shown to users or appears in assistant prompts
+- Store translations in YAML files within the config/locales directory
+- Follow the naming convention of language.namespace.key (e.g., en.oxaiworkers.assistant.role)
+- Use named parameters (%{variable}) instead of positional parameters (%s) in translation strings
+- Use the with_locale method to ensure proper locale context when processing localized text
+- Implement locale-aware classes by including the OxAiWorkers::LoadI18n module
+- Store the current locale on initialization and preserve it across method calls
+- Support multiple languages simultaneously through careful locale management
+- Default to English for developer-facing messages and logs
+- Ensure that all assistant classes properly handle localization in their format_role methods
+
+## LoadI18n Module Usage
+
+- The `OxAiWorkers::LoadI18n` module provides two key methods for localization:
+  - `store_locale` - saves the current locale at initialization time
+  - `with_locale` - executes a block of code in the context of the saved locale
+- Always include the `OxAiWorkers::LoadI18n` module in classes that need localization capabilities
+- Call `store_locale` in the initialization methods of locale-aware classes
+- Wrap all locale-dependent code in `with_locale` blocks
+- NEVER redefine the `with_locale` method in classes that include LoadI18n
+- All methods that produce user-visible text must use the locale context via `with_locale` blocks
+- Regular method calls from classes including LoadI18n do not require additional locale handling
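Reviewer note (illustrative, not part of the package): the `store_locale` / `with_locale` pattern that the rule file describes could look roughly like the sketch below. `LocaleSketch`, `LoadLocaleSketch`, and `Greeter` are hypothetical stand-ins so the example runs without the `i18n` gem; the real `OxAiWorkers::LoadI18n` module is not shown in this diff and may be implemented differently.

```ruby
# Hypothetical stand-in for a global locale setting (the gem itself relies on I18n).
module LocaleSketch
  class << self
    attr_accessor :current_locale
  end
  self.current_locale = :en
end

# Sketch of a LoadI18n-like mixin: remember the locale at construction time
# and temporarily restore it around locale-dependent code.
module LoadLocaleSketch
  def store_locale
    @locale = LocaleSketch.current_locale
  end

  def with_locale
    previous = LocaleSketch.current_locale
    LocaleSketch.current_locale = @locale
    yield
  ensure
    # Always put the caller's locale back, even if the block raises.
    LocaleSketch.current_locale = previous
  end
end

class Greeter
  include LoadLocaleSketch

  GREETINGS = { en: 'Hello', fr: 'Bonjour' }.freeze

  def initialize
    store_locale
  end

  def greet
    with_locale { GREETINGS.fetch(LocaleSketch.current_locale) }
  end
end

LocaleSketch.current_locale = :fr
greeter = Greeter.new             # captures :fr
LocaleSketch.current_locale = :en
puts greeter.greet                # resolved with the stored locale
```

Because the mixin restores the previous locale in `ensure`, objects created under different locales can be used side by side, which seems to be the intent behind the "support multiple languages simultaneously" rule.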
data/.cursor/rules/998-clean-code.mdc
ADDED
@@ -0,0 +1,52 @@
+---
+description: Guidelines for writing clean, maintainable, and human-readable code. Apply these rules when writing or reviewing code to ensure consistency and quality.
+globs:
+alwaysApply: false
+---
+# Clean Code Guidelines
+
+## Constants Over Magic Numbers
+- Replace hard-coded values with named constants
+- Use descriptive constant names that explain the value's purpose
+- Keep constants at the top of the file or in a dedicated constants file
+
+## Meaningful Names
+- Variables, functions, and classes should reveal their purpose
+- Names should explain why something exists and how it's used
+- Avoid abbreviations unless they're universally understood
+
+## Smart Comments
+- Don't comment on what the code does - make the code self-documenting
+- Use comments to explain why something is done a certain way
+- Document APIs, complex algorithms, and non-obvious side effects
+
+## Single Responsibility
+- Each function should do exactly one thing
+- Functions should be small and focused
+- If a function needs a comment to explain what it does, it should be split
+
+## DRY (Don't Repeat Yourself)
+- Extract repeated code into reusable functions
+- Share common logic through proper abstraction
+- Maintain single sources of truth
+
+## Clean Structure
+- Keep related code together
+- Organize code in a logical hierarchy
+- Use consistent file and folder naming conventions
+
+## Encapsulation
+- Hide implementation details
+- Expose clear interfaces
+- Move nested conditionals into well-named functions
+
+## Code Quality Maintenance
+- Refactor continuously
+- Fix technical debt early
+- Leave code cleaner than you found it
+- Follow the "Fail fast" principle for early error detection
+
+## Version Control
+- Write clear commit messages
+- Make small, focused commits
+- Use meaningful branch names
data/.cursor/rules/999-mdc-format.mdc
ADDED
@@ -0,0 +1,132 @@
+---
+description:
+globs: *.mdc,**/*.mdc
+alwaysApply: false
+---
+# MDC File Format Guide
+
+MDC (Markdown Configuration) files are used by Cursor to provide context-specific instructions to AI assistants. This guide explains how to create and maintain these files properly.
+
+## File Structure
+
+Each MDC file consists of two main parts:
+
+1. **Frontmatter** - Configuration metadata at the top of the file
+2. **Markdown Content** - The actual instructions in Markdown format
+
+### Frontmatter
+
+The frontmatter must be the first thing in the file and must be enclosed between triple-dash lines (`---`). Configuration should be based on the intended behavior:
+
+```
+---
+# Configure your rule based on desired behavior:
+
+description: Brief description of what the rule does
+globs: **/*.js, **/*.ts # Optional: Comma-separated list, not an array
+alwaysApply: false # Set to true for global rules
+---
+```
+
+> **Important**: Despite the appearance, the frontmatter is not strictly YAML formatted. The `globs` field is a comma-separated list and should NOT include brackets `[]` or quotes `"`.
+
+#### Guidelines for Setting Fields
+
+- **description**: Should be agent-friendly and clearly describe when the rule is relevant. Format as `<topic>: <details>` for best results.
+- **globs**:
+  - If a rule is only relevant in very specific situations, leave globs empty so it's loaded only when applicable to the user request.
+  - If the only glob would match all files (like `**/*`), leave it empty and set `alwaysApply: true` instead.
+  - Otherwise, be as specific as possible with glob patterns to ensure rules are only applied with relevant files.
+- **alwaysApply**: Use sparingly for truly global guidelines.
+
+#### Glob Pattern Examples
+
+- **/*.js - All JavaScript files
+- src/**/*.jsx - All JSX files in the src directory
+- **/components/**/*.vue - All Vue files in any components directory
+
+### Markdown Content
+
+After the frontmatter, the rest of the file should be valid Markdown:
+
+```markdown
+# Title of Your Rule
+
+## Section 1
+- Guidelines and information
+- Code examples
+
+## Section 2
+More detailed information...
+```
+
+## Special Features
+
+### File References
+
+You can reference other files from within an MDC file using the markdown link syntax:
+
+```
+[rule-name.mdc](mdc:location/of/the/rule.mdc)
+```
+
+When this rule is activated, the referenced file will also be included in the context.
+
+### Code Blocks
+
+Use fenced code blocks for examples:
+
+````markdown
+```javascript
+// Example code
+function example() {
+  return "This is an example";
+}
+```
+````
+
+## Best Practices
+
+1. **Clear Organization**
+   - Use numbered prefixes (e.g., `01-workflow.mdc`) for sorting rules logically
+   - Place task-specific rules in the `tasks/` subdirectory
+   - Use descriptive filenames that indicate the rule's purpose
+
+2. **Frontmatter Specificity**
+   - Be specific with glob patterns to ensure rules are only applied in relevant contexts
+   - Use `alwaysApply: true` for truly global guidelines
+   - Make descriptions clear and concise so AI knows when to apply the rule
+
+3. **Content Structure**
+   - Start with a clear title (H1)
+   - Use hierarchical headings (H2, H3, etc.) to organize content
+   - Include examples where appropriate
+   - Keep instructions clear and actionable
+
+4. **File Size Considerations**
+   - Keep files focused on a single topic or closely related topics
+   - Split very large rule sets into multiple files and link them with references
+   - Aim for under 300 lines per file when possible
+
+## Usage in Cursor
+
+When working with files in Cursor, rules are automatically applied when:
+
+1. The file you're working on matches a rule's glob pattern
+2. A rule has `alwaysApply: true` set in its frontmatter
+3. The agent thinks the rule's description matches the user request
+4. You explicitly reference a rule in a conversation with Cursor's AI
+
+## Creating/Renaming/Removing Rules
+
+- When a rule file is added/renamed/removed, update also the list under 010-workflow.mdc.
+- When changes are made to multiple `mdc` files from a single request, review also [999-mdc-format](mdc:.cursor/rules/999-mdc-format.mdc) to consider whether to update it too.
+
+## Updating Rules
+
+When updating existing rules:
+
+1. Maintain the frontmatter format
+2. Keep the same glob patterns unless intentionally changing the rule's scope
+3. Update the description if the purpose of the rule changes
+4. Consider whether changes should propagate to related rules (e.g., CE versions)
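Reviewer note (illustrative, not part of the package): the guide stresses that MDC frontmatter is not strict YAML and that `globs` is a comma-separated list. A minimal sketch of a parser that follows that description is shown below. `parse_mdc` is a hypothetical helper, it is not how Cursor itself reads these files, and it does not strip inline `#` comments.

```ruby
# Minimal sketch: split an MDC file into frontmatter fields and a Markdown body.
# The frontmatter is read as plain "key: value" lines rather than YAML, and
# `globs` is split on commas, as the format guide describes.
def parse_mdc(source)
  match = source.match(/\A---\n(.*?)\n---\n?(.*)\z/m)
  return [{}, source] unless match

  fields = {}
  match[1].each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#')

    key, value = line.split(':', 2)
    fields[key.strip] = value.to_s.strip
  end

  fields['globs'] = fields.fetch('globs', '').split(',').map(&:strip).reject(&:empty?)
  fields['alwaysApply'] = fields['alwaysApply'] == 'true'
  [fields, match[2]]
end

sample = <<~MDC
  ---
  description: Ruby style: conventions for lib files
  globs: lib/**/*.rb, spec/**/*.rb
  alwaysApply: false
  ---
  # Title of Your Rule
MDC

fields, body = parse_mdc(sample)
p fields['globs']
p fields['alwaysApply']
puts body
```

Splitting each line on the first colon only keeps descriptions of the recommended `<topic>: <details>` form intact.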
data/CHANGELOG.md
CHANGED
@@ -1,6 +1,9 @@
 
-## [0.9.
+## [0.9.6] - 2025-05-10
 
+- Added `add_file` for `Iterator` (only pdf for now)
+- Added `add_image` for `Iterator`
+- Added `add_file` and `add_image` for Assistants
 - Added `call_stack` for `Iterator` and `ModuleRequest`
 - Added `stop_double_calls` for `Iterator` and `ModuleRequest`
 
data/README.md
CHANGED
@@ -88,6 +88,7 @@ For a more robust setup, you can configure the gem with your API keys, for examp
 OxAiWorkers.configure do |config|
   config.access_token_openai = ENV.fetch("OPENAI")
   config.access_token_deepseek = ENV.fetch("DEEPSEEK")
+  config.access_token_stability = ENV.fetch("STABILITY")
   config.max_tokens = 4096 # Default
   config.temperature = 0.7 # Default
   config.wait_for_complete = true # Default
@@ -396,6 +397,74 @@ class MyTool
 end
 ```
 
+### Working with Files and Images
+
+You can easily add files and images to your assistants:
+
+```ruby
+# Add a PDF file
+iterator.add_file(
+  pdf: File.read('document.pdf'),
+  filename: 'document.pdf',
+  text: 'Here is the document you requested'
+)
+
+# Add image from URL
+iterator.add_image(
+  text: 'Here is the image',
+  url: 'https://example.com/image.jpg',
+  detail: 'auto' # 'auto', 'low', or 'high'
+)
+
+# Add image from binary data
+image_data = File.read('local_image.jpg')
+iterator.add_image(
+  text: 'Image from binary data',
+  binary: image_data,
+  mime_type: 'image/jpeg' # Defaults to 'image/png'
+)
+```
+
+#### Image Input Requirements
+
+When using images with the API, your input images must meet the following requirements:
+
+**Supported file types:**
+
+- PNG (.png)
+- JPEG (.jpeg and .jpg)
+- WEBP (.webp)
+- Non-animated GIF (.gif)
+
+**Size limits:**
+
+- Up to 20MB per image
+- Low-resolution: 512px x 512px
+- High-resolution: 768px (short side) x 2000px (long side)
+
+**Other requirements:**
+
+- No watermarks or logos
+- No text
+- No NSFW content
+- Clear enough for a human to understand
+
+**Image detail level:**
+
+The `detail` parameter controls what level of detail the model uses when processing the image:
+
+```ruby
+iterator.add_image(
+  text: 'Nature boardwalk image',
+  url: 'https://example.com/nature.jpg',
+  detail: 'high' # Options: 'auto', 'low', or 'high'
+)
+```
+
+- `detail: 'low'`: Uses less tokens (85) and processes a low-resolution 512px x 512px version of the image. Best for simple use cases like identifying dominant colors or shapes.
+- `detail: 'high'`: Provides better image understanding for complex tasks requiring higher resolution detail.
+- `detail: 'auto'`: Lets the model decide the appropriate detail level (default if not specified).
+
 ### Handling State Transitions with Callbacks
 
 You can track and respond to state transitions with callbacks:
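Reviewer note (illustrative, not part of the package): the file type and size limits listed in the new README section could be checked before calling `add_image`. The sketch below is one way to do that. `image_mime_type!` and both constants are hypothetical names, reading "20MB" as 20 * 1024 * 1024 bytes is an assumption, and the check does not detect animated GIFs.

```ruby
# Hypothetical pre-flight check based on the README limits: allowed extensions
# and a 20 MB size cap. None of this is part of the gem.
ALLOWED_IMAGE_TYPES = {
  '.png' => 'image/png',
  '.jpg' => 'image/jpeg',
  '.jpeg' => 'image/jpeg',
  '.webp' => 'image/webp',
  '.gif' => 'image/gif'
}.freeze
MAX_IMAGE_BYTES = 20 * 1024 * 1024 # assumption: "20MB" read as 20 MiB

# Returns the MIME type to pass as `mime_type:`, or raises ArgumentError.
def image_mime_type!(filename, binary)
  mime = ALLOWED_IMAGE_TYPES[File.extname(filename).downcase]
  raise ArgumentError, "unsupported image type: #{filename}" unless mime
  raise ArgumentError, 'image exceeds 20 MB' if binary.bytesize > MAX_IMAGE_BYTES

  mime
end

puts image_mime_type!('local_image.JPG', 'fake-bytes')
```

When loading the bytes from disk, `File.binread` is a safer choice than `File.read` for image data, since it reads in binary mode on every platform.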
@@ -470,6 +539,16 @@ OxAiWorkers provides several specialized assistant types:
 orchestrator.task = "Create a hello world application in C, save it to hello_world.c, compile, run, and verify it works."
 ```
 
+All assistants support working with files and images:
+
+```ruby
+# Add files and images to any assistant
+sysop.add_file(pdf: File.read('error_log.pdf'), filename: 'error_log.pdf', text: 'Error log file')
+sysop.add_image(text: 'Screenshot of the error', url: 'https://example.com/screenshot.png')
+```
+
+See the [Working with Files and Images](#working-with-files-and-images) section for full details.
+
 ### Available Tools
 
 OxAiWorkers provides several specialized tools to extend functionality:
@@ -481,7 +560,7 @@ OxAiWorkers provides several specialized tools to extend functionality:
 pixels = OxAiWorkers::Tool::Pixels.new(
   worker: worker, # Required: Request or DelayedRequest instance
   current_dir: Dir.pwd, # Optional: Directory to save generated images
-  image_model:
+  image_model: OxAiWorkers::Models::StabilityImages.new, # Optional, default is OpenaiDalle3
   only: [:generate_image] # Optional: Limit available functions
 )
 ```
@@ -529,6 +608,69 @@ OxAiWorkers provides several specialized tools to extend functionality:
 
 Additional tools like Database and Converter are available for specialized tasks and can be integrated using the same pattern.
 
+### Function Control Mechanisms
+
+OxAiWorkers provides two powerful mechanisms to control function execution behavior in iterators:
+
+#### Call Stack
+
+The `call_stack` parameter allows you to force the model to call specific functions in a predetermined order:
+
+```ruby
+iterator = OxAiWorkers::Iterator.new(
+  worker: worker,
+  tools: [my_tool],
+  call_stack: [
+    my_tool.full_function_name(:process_data),
+    OxAiWorkers::Iterator.full_function_name(:outer_voice),
+  ]
+)
+```
+
+This feature is particularly useful when:
+
+- You need to ensure a specific sequence of operations
+- Certain functions must be called before others
+- You want to guide the model through a predefined workflow
+- Complex operations require strict ordering of function calls
+
+The `call_stack` is processed sequentially, with each function being removed from the stack after it's called.
+
+#### Stop Double Calls
+
+The `stop_double_calls` parameter prevents the model from calling the same function twice in consecutive operations:
+
+```ruby
+iterator = OxAiWorkers::Iterator.new(
+  worker: worker,
+  tools: [my_tool],
+  stop_double_calls: [
+    my_tool.full_function_name(:expensive_operation)
+  ]
+)
+```
+
+This feature is valuable for:
+
+- Preventing redundant operations that could waste resources
+- Avoiding duplicate processing of the same data
+- Ensuring that certain operations are executed only once in sequence
+- Protecting against potential infinite loops in function calls
+
+When a function is called, its name is stored as the `last_call`. If the next function call matches both the `last_call` and is included in the `stop_double_calls` list, it will be excluded from the available tools for that request.
+
+By default, `stop_double_calls` is applied to the `inner_monologue` and `outer_voice` functions to prevent reasoning loops and repetitive responses. This default behavior helps models avoid getting stuck in circular thinking patterns.
+
+If you need to override this default behavior (for example, when consecutive monologue or voice calls are required for your specific use case), you can reset the stop_double_calls list **after** the iterator is created:
+
+```ruby
+# Clear the default stop_double_calls constraints
+@iterator.stop_double_calls = []
+
+# Or set your own custom constraints
+@iterator.stop_double_calls = [my_tool.full_function_name(:specific_function)]
+```
+
 ### Implementing Your Own Assistant
 
 Create custom assistants by inheriting from existing ones or composing with the Iterator:
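Reviewer note (illustrative, not part of the package): the README says the `call_stack` is consumed front to back, one entry per call. The sketch below models only that queue behavior. `CallStackSketch`, `next_forced_call`, and the function names are hypothetical, and how the gem passes the forced name to the API is not shown in this diff.

```ruby
# Sketch of a forced-call queue: each request takes the next function name
# from the front of the stack; once it is empty no call is forced.
class CallStackSketch
  def initialize(call_stack)
    @call_stack = call_stack.dup
  end

  # Returns the function name to force for this request, or nil for "no constraint".
  def next_forced_call
    return nil unless @call_stack&.any?

    forced = @call_stack.first
    @call_stack = @call_stack.drop(1)
    forced
  end
end

stack = CallStackSketch.new(%w[my_tool__process_data iterator__outer_voice])
3.times { p stack.next_forced_call }
```

Copying the array with `dup` in the constructor keeps the caller's list intact, so the same configuration can be reused for a later run.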
@@ -553,6 +695,92 @@ module OxAiWorkers
 end
 ```
 
+## Image Generation
+
+OxAiWorkers supports image generation through the Painter assistant and Pixels tool, with multiple AI image generation models.
+
+### Supported Image Models
+
+- **OpenaiDalle3** - OpenAI's DALL-E 3 model
+- **OpenaiGptImage** - OpenAI's GPT-Image-1 model
+- **StabilityImages** - Stability AI's image generation models
+
+### Using the Painter Assistant
+
+```ruby
+# Using DALL-E 3 (default)
+painter = OxAiWorkers::Assistant::Painter.new(current_dir: Dir.pwd)
+painter.task = "Create an image of a sunset over mountains"
+
+# Using GPT-Image-1
+painter = OxAiWorkers::Assistant::Painter.new(
+  image_model: OxAiWorkers::Models::OpenaiGptImage.new,
+  current_dir: Dir.pwd
+)
+painter.task = "Generate a photorealistic red apple"
+
+# Using Stability AI
+painter = OxAiWorkers::Assistant::Painter.new(
+  image_model: OxAiWorkers::Models::StabilityImages.new,
+  current_dir: Dir.pwd
+)
+painter.task = "Create a fantasy landscape with dragons"
+```
+
+### Using the Pixels Tool Directly
+
+For more direct control over image generation:
+
+```ruby
+# Initialize with DALL-E 3
+pixels = OxAiWorkers::Tool::Pixels.new(
+  worker: OxAiWorkers::Models::OpenaiDalle3.new,
+  current_dir: Dir.pwd
+)
+pixels.generate_image(
+  prompt: "A photorealistic red apple on a wooden table",
+  file_name: "apple.png",
+  size: "1024x1024",
+  quality: "hd"
+)
+
+# Initialize with GPT-Image-1
+pixels = OxAiWorkers::Tool::Pixels.new(
+  worker: OxAiWorkers::Models::OpenaiGptImage.new,
+  current_dir: Dir.pwd
+)
+pixels.generate_image(
+  prompt: "Futuristic cityscape at night",
+  file_name: "city.png",
+  size: "1536x1024",
+  quality: "high"
+)
+
+# Initialize with Stability AI
+pixels = OxAiWorkers::Tool::Pixels.new(
+  worker: OxAiWorkers::Models::StabilityImages.new,
+  current_dir: Dir.pwd
+)
+pixels.generate_image(
+  prompt: "Photorealistic mountain landscape",
+  file_name: "mountains.png"
+)
+```
+
+### Model-Specific Features
+
+- **OpenaiDalle3**
+  - Sizes: '1024x1024', '1024x1792', '1792x1024'
+  - Qualities: 'standard', 'hd'
+
+- **OpenaiGptImage**
+  - Sizes: 'auto', '1024x1024', '1536x1024', '1024x1536'
+  - Qualities: 'auto', 'low', 'medium', 'high'
+
+- **StabilityImages**
+  - Uses Stability AI's API with different engine options
+  - Configuration via options parameter
+
 ## Contributing
 
 Bug reports and pull requests are welcome on GitHub at <https://github.com/neonix20b/ox-ai-workers>. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/neonix20b/ox-ai-workers/blob/main/CODE_OF_CONDUCT.md).
data/lib/oxaiworkers/assistant/module_base.rb
CHANGED
@@ -34,6 +34,14 @@ module OxAiWorkers
         @iterator.clear_context
         @iterator.add_context context
       end
+
+      def add_file(pdf:, filename:, text:, role: :user)
+        @iterator.add_file(pdf:, filename:, text:, role:)
+      end
+
+      def add_image(text:, url: nil, binary: nil, role: :user, detail: 'auto', mime_type: 'image/jpeg')
+        @iterator.add_image(text:, url:, binary:, role:, detail:, mime_type:)
+      end
     end
   end
 end
data/lib/oxaiworkers/assistant/painter.rb
CHANGED
@@ -29,12 +29,6 @@ module OxAiWorkers
           on_outer_voice: ->(text:) { puts "voice: #{text}".colorize(:green) }
         )
       end
-
-      def cleanup
-        Dir.glob(File.join(@current_dir, '*.png')).each do |file|
-          File.delete(file) if File.exist?(file)
-        end
-      end
     end
   end
 end
data/lib/oxaiworkers/iterator.rb
CHANGED
@@ -117,16 +117,16 @@ module OxAiWorkers
       @worker.call_stack = @call_stack.dup
       @worker.stop_double_calls = @stop_double_calls
       @worker.messages = []
-      @worker.append(role: :system, content: @role) if @role.present?
+      @worker.append(role: :system, content: "<role>\n#{@role}\n</role>") if @role.present?
 
-      @
-      @worker.append(role: :
+      @worker.append(role: :system, content: "<instructions>\n#{valid_monologue.join("\n")}\n</instructions>")
+      @tasks.each { |task| @worker.append(role: :user, content: "<task>\n#{task}\n</task>") }
       @worker.append(messages: @context) if @context.present?
       @tools.each do |tool|
         @worker.append(role: :user, content: tool.context) if tool.respond_to?(:context) && tool.context.present?
       end
       @worker.append(messages: @messages)
-      @tasks.each { |task| @worker.append(role: :user, content: task) }
+      @tasks.each { |task| @worker.append(role: :user, content: "<task>\n#{task}\n</task>") }
       @worker.tools = function_schemas.to_openai_format(only: available_defs)
       return unless @tools.present?
 
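Reviewer note (illustrative, not part of the package): this hunk wraps the role, the instructions, and each task in XML-like tags before appending them as messages. The sketch below reproduces only that wrapping. `wrap` and `build_messages` are hypothetical helpers, and the context, tool, and history messages that the real method also appends are left out.

```ruby
# Sketch of the message layout built in the hunk: the role and the instructions
# become system messages and each task becomes a user message, all wrapped in
# XML-like tags. These helpers are illustrative and not gem API.
def wrap(tag, text)
  "<#{tag}>\n#{text}\n</#{tag}>"
end

def build_messages(role:, instructions:, tasks:)
  messages = []
  messages << { role: :system, content: wrap('role', role) } unless role.to_s.empty?
  messages << { role: :system, content: wrap('instructions', instructions.join("\n")) }
  tasks.each { |task| messages << { role: :user, content: wrap('task', task) } }
  messages
end

build_messages(
  role: 'You are a careful assistant',
  instructions: ['Plan first', 'Then act'],
  tasks: ['Summarize the report']
).each { |message| puts message.inspect }
```

Note that in the hunk as shown, the `<task>` messages are appended twice in the new version: once after the instructions and once after the message history.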
@@ -252,6 +252,35 @@ module OxAiWorkers
       add_raw_context({ role:, content: text })
     end
 
+    def add_file(pdf:, filename:, text:, role: :user)
+      content = []
+      content << { type: 'text', text: } if text.present?
+      content << {
+        type: 'file',
+        file: {
+          filename:,
+          file_data: Base64.strict_encode64(pdf)
+        }
+      }
+
+      add_raw_context({ role:, content: })
+    end
+
+    def add_image(text:, url: nil, binary: nil, role: :user, detail: 'auto', mime_type: 'image/png')
+      content = []
+      content << { type: 'text', text: } if text.present?
+
+      image_url = if binary.present?
+                    "data:#{mime_type};base64,#{Base64.strict_encode64(binary)}"
+                  else
+                    url
+                  end
+
+      content << { type: 'image_url', image_url: { url: image_url, detail: } }
+
+      add_raw_context({ role:, content: })
+    end
+
     def add_raw_context(c)
       @context << c
     end
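Reviewer note (illustrative, not part of the package): the new `add_image` turns binary input into a base64 data URL and otherwise passes the URL through. The standalone sketch below mirrors that logic without ActiveSupport, replacing `present?` with plain emptiness checks. `image_content` is a hypothetical name.

```ruby
require 'base64'

# Sketch of the content array assembled by `add_image`: binary data becomes
# a base64 data URL, otherwise the plain URL is used as-is.
def image_content(text:, url: nil, binary: nil, detail: 'auto', mime_type: 'image/png')
  content = []
  content << { type: 'text', text: text } unless text.to_s.empty?

  image_url = if binary && !binary.empty?
                "data:#{mime_type};base64,#{Base64.strict_encode64(binary)}"
              else
                url
              end

  content << { type: 'image_url', image_url: { url: image_url, detail: detail } }
  content
end

p image_content(text: 'Pixel', binary: 'abc', mime_type: 'image/jpeg')
```

One detail visible in this diff: `Iterator#add_image` defaults `mime_type` to `'image/png'`, while the assistant wrapper in `module_base.rb` defaults it to `'image/jpeg'`, so passing `mime_type:` explicitly for binary input avoids depending on either default.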
data/lib/oxaiworkers/module_request.rb
CHANGED
@@ -53,10 +53,11 @@ module OxAiWorkers
         frequency_penalty: @model.frequency_penalty
       }
       if @tools.present?
-        parameters[:tools] = @tools.
+        parameters[:tools] = @tools.select do |f|
           tool_name = f[:function][:name]
           tool_name == @last_call && @stop_double_calls.include?(tool_name)
         end
+        OxAiWorkers.logger.debug("tools: #{parameters[:tools]} last_call=#{@last_call} stop_double_calls=#{@stop_double_calls}", for: self.class)
         if @call_stack&.any?
           func1 = @call_stack.first
           @call_stack = @call_stack.drop(1)
@@ -146,7 +147,7 @@ module OxAiWorkers
             name: function['name'].split('__').last,
             args: args
           }
-          @last_call = function['name']
+          @last_call = function['name'].to_s
         end
       end
     end
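Reviewer note (illustrative, not part of the package): the README in this release says a function that matches `last_call` and appears in `stop_double_calls` is excluded from the next request. The first hunk for this file shows `select` with that predicate, which, taken literally, would keep only the matching tool. The sketch below follows the README wording and uses `reject`. Which behavior the released gem intends cannot be confirmed from this diff alone, and `available_tools` and the tool names are hypothetical.

```ruby
# Sketch of the filtering the README describes: drop a tool from the request
# when it was the last function called and is listed in stop_double_calls.
def available_tools(tools, last_call:, stop_double_calls:)
  tools.reject do |tool|
    name = tool[:function][:name]
    name == last_call && stop_double_calls.include?(name)
  end
end

tools = [
  { function: { name: 'iterator__inner_monologue' } },
  { function: { name: 'iterator__outer_voice' } }
]

remaining = available_tools(
  tools,
  last_call: 'iterator__inner_monologue',
  stop_double_calls: %w[iterator__inner_monologue iterator__outer_voice]
)
p remaining.map { |tool| tool[:function][:name] }
```

The comparison is between strings, which is presumably why the second hunk normalizes `@last_call` with `to_s`.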
data/lib/oxaiworkers/tool/pixels.rb
CHANGED
@@ -62,29 +62,9 @@ module OxAiWorkers
         end
       end
 
-      def edit_image(input_image:, prompt:, output_file_name: nil, size: nil, mask: nil)
-
-
-
-        response = @worker.client.images.edit(
-          parameters: {
-            image: input_image,
-            model: @image_model['model'],
-            prompt:,
-            size:,
-            mask:
-          }
-        )
-
-        @url = response.dig('data', 0, 'url')
-        revised_prompt = response.dig('data', 0, 'revised_prompt')
-        if output_file_name.present?
-          path = save_generated_image(file_name: output_file_name)
-          "url: #{@url}\nfile_name: #{path}\n\nrevised_prompt: #{revised_prompt}"
-        else
-          "url: #{@url}\n\nrevised_prompt: #{revised_prompt}"
-        end
-      end
+      # def edit_image(input_image:, prompt:, output_file_name: nil, size: nil, mask: nil)
+      #   # TODO: Implement edit_image
+      # end
 
       def save_generated_image(file_name:, binary:)
         unless @current_dir.present?
data/lib/oxaiworkers/version.rb
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: ox-ai-workers
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.9.
|
4
|
+
version: 0.9.6.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Denis Smolev
|
@@ -123,7 +123,9 @@ executables:
|
|
123
123
|
extensions: []
|
124
124
|
extra_rdoc_files: []
|
125
125
|
files:
|
126
|
-
- ".
|
126
|
+
- ".cursor/rules/010-project-structure.mdc"
|
127
|
+
- ".cursor/rules/998-clean-code.mdc"
|
128
|
+
- ".cursor/rules/999-mdc-format.mdc"
|
127
129
|
- ".ruby-version"
|
128
130
|
- CHANGELOG.md
|
129
131
|
- CODE_OF_CONDUCT.md
|
data/.cursorrules
DELETED
@@ -1,155 +0,0 @@
|
|
1
|
-
# Overview
|
2
|
-
|
3
|
-
OxAiWorkers is a Ruby gem that implements a finite state machine (using the `state_machine` gem) to solve tasks using generative intelligence (with the `ruby-openai` gem). This approach enhances the final result by utilizing internal monologue and external tools.
|
4
|
-
|
5
|
-
## Architecture Principles
|
6
|
-
|
7
|
-
- The library is built on the finite state machine (FSM) pattern using the 'state_machine' gem
|
8
|
-
- Integration with generative models is implemented using the 'ruby-openai' gem
|
9
|
-
- DRY (Don't Repeat Yourself) principle is applied throughout all components
|
10
|
-
- Modular structure with clear separation of responsibilities between classes
|
11
|
-
- Encapsulation of states and transitions in separate classes
|
12
|
-
- Implementation of the "Composition" pattern for flexible tool integration
|
13
|
-
|
14
|
-
## Core Components
|
15
|
-
|
16
|
-
- `Request` and `DelayedRequest` - classes for executing API requests (immediate and delayed)
|
17
|
-
- `Iterator` - main class for iterative task execution with tools
|
18
|
-
- `Assistant` - high-level wrappers over Iterator (Sysop, Coder, Localizer, etc.)
|
19
|
-
- `Tool` - tools that can be used during task execution (Eval, FileSystem, Database)
|
20
|
-
- `ToolDefinition` - module for declaring functions and methods for tools
|
21
|
-
- `StateTools` - base class for managing states and transitions
|
22
|
-
- `ContextualLogger` - logging system with contextual information support
|
23
|
-
|
24
|
-
## Code Conventions
|
25
|
-
|
26
|
-
- Use `snake_case` for method and variable names
|
27
|
-
- Functions for generative models should also be in `snake_case` (inner_monologue, outer_voice, etc.)
|
28
|
-
- All public methods must have documentation with usage examples
|
29
|
-
- Tests are mandatory for all new functions
|
30
|
-
- All code comments, CHANGELOG, README, and other documentation must be written in English
|
31
|
-
- Use YARD-style documentation for all public methods
|
32
|
-
- Maintain a unified code formatting style (Rubocop is recommended)
|
33
|
-
- Follow the "Fail fast" principle for early error detection
|
34
|
-
|
35
|
-
## Interaction Patterns
|
36
|
-
|
37
|
-
- The system uses internal monologue (inner_monologue) for planning actions
|
38
|
-
- External voice (outer_voice) is used for communication with the user
|
39
|
-
- Execution flow management through finite state machine
|
40
|
-
- Implementation of callback mechanisms for flexible event handling
|
41
|
-
- Isolation of error handling functions at the tool level
|
42
|
-
|
43
|
-
## Integration
|
44
|
-
|
45
|
-
- CLI interface through `oxaiworkers init` and `oxaiworkers run` commands
|
46
|
-
- Rails support via ActiveRecord for storing delayed requests
|
47
|
-
- Configuration through the `OxAiWorkers.configure` block
|
48
|
-
- Multilingual support via standard I18n
|
49
|
-
- Integration with external APIs through request client templates
|
50
|
-
- Delayed execution mechanism via DelayedRequest
|
51
|
-
- Support for various language models (OpenAI, Anthropic, Gemini)
|
52
|
-
|
53
|
-
## Best Practices
|
54
|
-
|
55
|
-
- Use callbacks to handle various states (on_inner_monologue, on_outer_voice)
|
56
|
-
- Handle errors at the tool level, preventing them from interrupting the main execution flow
|
57
|
-
- When creating new assistants, inherit from the base Assistant class
|
58
|
-
- Use the white_list mechanism to restrict available functions
|
59
|
-
- Separate language model requests from result processing logic
|
60
|
-
- Practice dependency injection to improve code testability
|
61
|
-
- Use localization mechanisms for multilingual support
|
62
|
-
|
63
|
-
## Tools Architecture
|
64
|
-
|
65
|
-
- Each tool should be a self-contained module
|
66
|
-
- Tools are registered through the `define_function` interface
|
67
|
-
- All tools should handle their own errors and return readable messages
|
68
|
-
- Use parameter validation at the function definition level
|
69
|
-
- Maintain a unified format for return values
|
70
|
-
|
71
|
-
## Performance and Scaling
|
72
|
-
|
73
|
-
- Cache API request results when possible
|
74
|
-
- Use asynchronous processing for long operations
|
75
|
-
- Apply backoff strategies for repeated requests
|
76
|
-
- Break large tasks into atomic operations
|
77
|
-
- Provide monitoring and profiling mechanisms
|
78
|
-
|
79
|
-
## Finite State Machine Implementation
|
80
|
-
|
81
|
-
- Core FSM based on `state_machine` gem with states: idle → prepared → requested → analyzed → finished → idle
|
82
|
-
- State transitions managed by events: prepare, request, analyze, complete, iterate, end
|
83
|
-
- `StateTools` - base class for FSM implementation with event hooks and transition callbacks
|
84
|
-
- `StateBatch` - FSM extension for batch request processing with additional states
|
85
|
-
- Automatic error recovery and retry mechanisms for failed API requests
|
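The state and event names listed above can be sketched as a small transition table. This is a plain-Ruby illustration rather than the gem's `state_machine` definition: the deleted rules file names the states and events but does not say which event drives which transition, so the pairing below (including `iterate` returning to `prepared`) is an assumption, and `TinyMachine` is a hypothetical class name.

```ruby
# Plain-Ruby stand-in for an FSM with the states and events named above.
class TinyMachine
  # Assumed mapping of event => { from_state => to_state }.
  TRANSITIONS = {
    prepare: { idle: :prepared },
    request: { prepared: :requested },
    analyze: { requested: :analyzed },
    complete: { analyzed: :finished },
    iterate: { analyzed: :prepared },
    end: { finished: :idle }
  }.freeze

  attr_reader :state

  def initialize
    @state = :idle
  end

  # Fire an event; raise if it is not allowed from the current state.
  def fire(event)
    @state = TRANSITIONS.fetch(event).fetch(@state) do
      raise ArgumentError, "cannot #{event} from #{@state}"
    end
  end
end

machine = TinyMachine.new
%i[prepare request analyze complete end].each { |event| machine.fire(event) }
puts machine.state
# prints "idle"
```

Firing an event from the wrong state, for example `request` while `idle`, raises `ArgumentError`, which matches the "fail fast" convention stated earlier in the same file.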
86
|
-
|
87
|
-
## Request Processing
|
88
|
-
|
89
|
-
- `ModuleRequest` - base class for all API requests with parsing and response handling
|
90
|
-
- Support for streaming responses with callback processing
|
91
|
-
- Built-in token usage tracking and truncation detection
|
92
|
-
- Error handling with automatic retries for server errors
|
93
|
-
|
94
|
-
## Iterator Lifecycle
|
95
|
-
|
96
|
-
- 3 core functions: inner_monologue, outer_voice, finish_it
|
97
|
-
- Configurable message queue for stateful conversation history
|
98
|
-
- Callback system for processing each state transition
|
99
|
-
- Context and milestone management for optimizing token usage
|
100
|
-
- Support for custom steps and instruction templating
|
101
|
-
|
102
|
-
## Additional Tools
|
103
|
-
|
104
|
-
- `Converter` - tools for data format conversion and transformation
|
105
|
-
- Support for custom tool development through inheritance and composition
|
106
|
-
- Automatic function name resolution and parameter validation
|
107
|
-
|
108
|
-
## Assistants Details
|
109
|
-
|
110
|
-
- `ModuleBase` - shared functionality for all assistant types
|
111
|
-
- `Sysop` - system administration and shell command execution
|
112
|
-
- `Coder` - specialized for code generation and analysis
|
113
|
-
- `Localizer` - translation and localization support
|
114
|
-
|
115
|
-
## Development Guidelines
|
116
|
-
|
117
|
-
- Use dependency injection for testability
|
118
|
-
- Follow the FSM pattern for all stateful operations
|
119
|
-
- Implement proper error boundaries at the tool level
|
120
|
-
- Use monologue for complex reasoning and planning
|
121
|
-
- Apply callbacks for event-driven architecture
|
122
|
-
- Utilize templates in the CLI for rapid prototyping
|
123
|
-
- Extend the base classes rather than modifying them
|
124
|
-
|
125
|
-
## Internationalization and Localization
|
126
|
-
|
127
|
-
- All code comments, variable names, and documentation MUST be written in English
|
128
|
-
- All user-facing strings MUST be properly localized using I18n
|
129
|
-
- Use I18n.t for all text that will be shown to users or appears in assistant prompts
|
130
|
-
- Store translations in YAML files within the config/locales directory
|
131
|
-
- Follow the naming convention of language.namespace.key (e.g., en.oxaiworkers.assistant.role)
|
132
|
-
- Use named parameters (%{variable}) instead of positional parameters (%s) in translation strings
|
133
|
-
- Use the with_locale method to ensure proper locale context when processing localized text
|
134
|
-
- Implement locale-aware classes by including the OxAiWorkers::LoadI18n module
|
135
|
-
- Store the current locale on initialization and preserve it across method calls
|
136
|
-
- Support multiple languages simultaneously through careful locale management
|
137
|
-
- Default to English for developer-facing messages and logs
|
138
|
-
- Ensure that all assistant classes properly handle localization in their format_role methods
|
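The rule above about named parameters can be illustrated without the `i18n` gem, because Ruby's built-in `format` accepts the same `%{variable}` placeholder syntax. The strings below are invented for the example and are not taken from the gem's locale files.

```ruby
# Named placeholders: each value is looked up by key, not by position.
named = 'Hello, %{name}! You have %{count} new messages.'
puts format(named, name: 'Ada', count: 3)
# prints "Hello, Ada! You have 3 new messages."

# A translation may need another word order; the call site stays the same.
reordered = '%{count} new messages are waiting for %{name}.'
puts format(reordered, name: 'Ada', count: 3)
# prints "3 new messages are waiting for Ada."

# Positional placeholders depend on the order of the arguments instead.
positional = 'Hello, %s! You have %s new messages.'
puts format(positional, 'Ada', 3)
# prints "Hello, Ada! You have 3 new messages."
```

Because translators often have to reorder words, the named form lets a translated template change freely without touching the code that supplies the values.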
139
|
-
|
140
|
-
## LoadI18n Module Usage
|
141
|
-
|
142
|
-
- The `OxAiWorkers::LoadI18n` module provides two key methods for localization:
|
143
|
-
- `store_locale` - saves the current locale at initialization time
|
144
|
-
- `with_locale` - executes a block of code in the context of the saved locale
|
145
|
-
- Always include the `OxAiWorkers::LoadI18n` module in classes that need localization capabilities
|
146
|
-
- Call `store_locale` in the initialization methods of locale-aware classes
|
147
|
-
- Wrap all locale-dependent code in `with_locale` blocks
|
148
|
-
- NEVER redefine the `with_locale` method in classes that include LoadI18n
|
149
|
-
- All methods that produce user-visible text must use the locale context via `with_locale` blocks
|
150
|
-
- Regular method calls from classes including LoadI18n do not require additional locale handling
|
151
|
-
|
152
|
-
## Multi-Language Support
|
153
|
-
|
154
|
-
- Use the store_locale and with_locale methods for consistent localization context
|
155
|
-
- All error messages should be localized and retrieved via I18n.t
|