ragdoll-cli 0.0.2

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: f19f7c89cad761c7da655010fdc868b7124ab164d0481cf2d97dea485df58317
4
+ data.tar.gz: c2c7be844cc5addcd4386c4ea0596c75abe9cbbb29167b218ee95826a9c98ace
5
+ SHA512:
6
+ metadata.gz: 27c20b905c6fa11a5941a8c1343b01b37affd6d84fd4e3884733988031c2e91f7336c351fc7da90474c68bc4b951cae37b3b3d2d6f85e25f010d19d3110ef43a
7
+ data.tar.gz: 121b9df6c28f2ed96ddfaf2f00468b7f53aced41b04ed77dca015ec8690e801e3fd10dfc8567737b3448835bb7010234b11630e53a68a494064b8f089a3654ba
data/README.md ADDED
@@ -0,0 +1,296 @@
1
+ <div align="center" style="background-color: yellow; color: black; padding: 20px; margin: 20px 0; border: 2px solid black; font-size: 48px; font-weight: bold;">
2
+ ⚠️ CAUTION ⚠️<br />
3
+ Software Under Development by a Crazy Man
4
+ </div>
5
+ <br />
6
+ <div align="center">
7
+ <table>
8
+ <tr>
9
+ <td width="50%">
10
+ <a href="https://research.ibm.com/blog/retrieval-augmented-generation-RAG" target="_blank">
11
+ <img src="ragdoll-cli.png" alt="Ragdoll CLI Putting the Puzzle Together" width="800">
12
+ </a>
13
+ </td>
14
+ <td width="50%" valign="top">
15
+ <p>Multi-modal RAG (Retrieval-Augmented Generation) is an architecture that integrates multiple data types (such as text, images, and audio) to enhance AI response generation. It combines retrieval-based methods, which fetch relevant information from a knowledge base, with generative large language models (LLMs) that create coherent and contextually appropriate outputs. This approach allows for more comprehensive and engaging user interactions, such as chatbots that respond with both text and images or educational tools that incorporate visual aids into learning materials. By leveraging various modalities, multi-modal RAG systems improve context understanding and user experience.</p>
16
+ </td>
17
+ </tr>
18
+ </table>
19
+ </div>
20
+
21
+ # Ragdoll::CLI
22
+
23
+ A standalone command-line interface for the Ragdoll RAG (Retrieval-Augmented Generation) system. It provides document import, search, and management through a simple CLI.
24
+
25
+ ## Installation
26
+
27
+ ```bash
28
+ gem install ragdoll-cli
29
+ ```
30
+
31
+ This will install the `ragdoll` command-line tool.
32
+
33
+ ## Quick Start
34
+
35
+ 1. **Initialize configuration:**
36
+ ```bash
37
+ ragdoll init
38
+ ```
39
+
40
+ 2. **Set your API key:**
41
+ ```bash
42
+ export OPENAI_API_KEY=your_api_key_here
43
+ ```
44
+
45
+ 3. **Import documents:**
46
+ ```bash
47
+ ragdoll import "docs/*.pdf" --recursive
48
+ ```
49
+
50
+ 4. **Search for content:**
51
+ ```bash
52
+ ragdoll search "What is machine learning?"
53
+ ```
54
+
55
+ ## Commands
56
+
57
+ ### Configuration
58
+
59
+ ```bash
60
+ # Initialize configuration
61
+ ragdoll init
62
+
63
+ # Show current configuration
64
+ ragdoll config show
65
+
66
+ # Set configuration values
67
+ ragdoll config set llm_provider openai
68
+ ragdoll config set chunk_size 1000
69
+
70
+ # Get configuration values
71
+ ragdoll config get embedding_model
72
+
73
+ # Show config file path
74
+ ragdoll config path
75
+
76
+ # Show database configuration and status
77
+ ragdoll config database
78
+ ```
79
+
80
+ ### Document Import
81
+
82
+ ```bash
83
+ # Import files matching a pattern
84
+ ragdoll import "documents/*.pdf"
85
+
86
+ # Import recursively from directory
87
+ ragdoll import "docs/**/*" --recursive
88
+
89
+ # Filter by document type
90
+ ragdoll import "files/*" --type pdf
91
+
92
+ # Available types: pdf, docx, txt, md, html
93
+ ```
94
+
95
+ ### Search
96
+
97
+ ```bash
98
+ # Basic search
99
+ ragdoll search "machine learning concepts"
100
+
101
+ # Limit number of results
102
+ ragdoll search "AI algorithms" --limit 5
103
+
104
+ # Different output formats
105
+ ragdoll search "deep learning" --format json
106
+ ragdoll search "AI" --format plain
107
+ ragdoll search "ML" --format table # default
108
+ ```
109
+
110
+ ### Document Management
111
+
112
+ ```bash
113
+ # Add a single document
114
+ ragdoll add <path>
115
+
116
+ # List all documents
117
+ ragdoll list
118
+
119
+ # Limit number of documents shown
120
+ ragdoll list --limit 10
121
+
122
+ # Different output formats
123
+ ragdoll list --format json
124
+ ragdoll list --format plain
125
+
126
+ # Check document status
127
+ ragdoll status <id>
128
+
129
+ # Update document metadata
130
+ ragdoll update <id> --title "New Title"
131
+
132
+ # Delete a document
133
+ ragdoll delete <id>
134
+ ragdoll delete <id> --force # Bypass confirmation
135
+
136
+ # Show system statistics
137
+ ragdoll stats
138
+ ragdoll stats --format json
139
+ ragdoll stats --format plain
140
+ ```
141
+
142
+ ### Retrieval Utilities
143
+
144
+ ```bash
145
+ # Get context for RAG applications
146
+ ragdoll context "<query>" --limit 5
147
+
148
+ # Enhance a prompt with context
149
+ ragdoll enhance "<prompt>" --context_limit 5
150
+ ```
151
+
152
+ ### Utilities
153
+
154
+ ```bash
155
+ # Show version information
156
+ ragdoll version
157
+
158
+ # Show help
159
+ ragdoll help
160
+ ragdoll help import # Help for specific command
161
+
162
+ # Check system health
163
+ ragdoll health
164
+ ```
165
+
166
+ ## Configuration
167
+
168
+ The CLI uses a YAML configuration file located at `~/.ragdoll/config.yml`. You can customize various settings:
169
+
170
+ ```yaml
171
+ llm_provider: openai
172
+ embedding_provider: openai
173
+ embedding_model: text-embedding-3-small
174
+ chunk_size: 1000
175
+ chunk_overlap: 200
176
+ search_similarity_threshold: 0.7
177
+ max_search_results: 10
178
+ storage_backend: file
179
+ storage_config:
180
+ directory: "~/.ragdoll"
181
+ api_keys:
182
+ openai: your_key_here
183
+ anthropic: your_key_here
184
+ ```
185
+
186
+ ### Environment Variables
187
+
188
+ API keys can be set via environment variables (recommended):
189
+
190
+ ```bash
191
+ export OPENAI_API_KEY=your_key_here
192
+ export ANTHROPIC_API_KEY=your_key_here
193
+ export GOOGLE_API_KEY=your_key_here
194
+ export AZURE_OPENAI_API_KEY=your_key_here
195
+ export HUGGINGFACE_API_KEY=your_key_here
196
+ export OLLAMA_ENDPOINT=http://localhost:11434
197
+ ```
198
+
199
+ ### Custom Configuration Location
200
+
201
+ ```bash
202
+ export RAGDOLL_CONFIG=/path/to/custom/config.yml
203
+ ```
204
+
205
+ ## Storage
206
+
207
+ Documents and embeddings are stored in a PostgreSQL database managed by the `ragdoll-core` gem for production performance. Configuration and log files are stored locally in `~/.ragdoll/`:
208
+
209
+ - `~/.ragdoll/config.yml` - Configuration settings
210
+ - `~/.ragdoll/ragdoll.log` - Log file (if configured)
211
+
212
+ ## Supported Document Types
213
+
214
+ - **PDF files** (`.pdf`) - Extracts text and metadata
215
+ - **Microsoft Word** (`.docx`) - Extracts text, tables, and metadata
216
+ - **Text files** (`.txt`) - Plain text import
217
+ - **Markdown** (`.md`, `.markdown`) - Markdown document import
218
+ - **HTML** (`.html`, `.htm`) - Strips HTML tags and imports text
219
+
220
+ ## Examples
221
+
222
+ ### Import a directory of documentation
223
+
224
+ ```bash
225
+ # Import all markdown files from a docs directory
226
+ ragdoll import "docs/**/*.md" --recursive
227
+
228
+ # Import mixed document types
229
+ ragdoll import "knowledge-base/*" --recursive
230
+ ```
231
+
232
+ ### Search and get enhanced prompts
233
+
234
+ ```bash
235
+ # Basic search
236
+ ragdoll search "How to configure SSL certificates?"
237
+
238
+ # Get detailed results
239
+ ragdoll search "database optimization" --format plain --limit 3
240
+ ```
241
+
242
+ ### Manage your knowledge base
243
+
244
+ ```bash
245
+ # See what's in your knowledge base
246
+ ragdoll stats
247
+ ragdoll list --limit 20
248
+
249
+ # Check status of a specific document
250
+ ragdoll status 123
251
+
252
+ # Update document title
253
+ ragdoll update 123 --title "Updated Document Title"
254
+
255
+ # Delete a document
256
+ ragdoll delete 123
257
+ ```
258
+
259
+ ## Integration with Other Tools
260
+
261
+ The CLI is designed to work well with other command-line tools:
262
+
263
+ ```bash
264
+ # Search and pipe to jq for JSON processing
265
+ ragdoll search "API documentation" --format json | jq '.results[0].content'
266
+
267
+ # Import files found by find command
268
+ find ./docs -name "*.pdf" -exec ragdoll import {} \;
269
+
270
+ # Use with xargs for batch processing
271
+ ls *.md | xargs -I {} ragdoll import {}
272
+ ```
273
+
274
+ ## Troubleshooting
275
+
276
+ ### Common Issues
277
+
278
+ 1. **No API key configured:**
279
+ ```
280
+ Error: Missing API key
281
+ Solution: Set the OPENAI_API_KEY environment variable or add the key under `api_keys` in the config file
282
+ ```
283
+
284
+ 2. **No documents found:**
285
+ ```
286
+ ragdoll stats # Check if documents are imported
287
+ ragdoll list # See what documents exist
288
+ ```
289
+
290
+ ## Contributing
291
+
292
+ Bug reports and pull requests are welcome on GitHub at https://github.com/MadBomber/ragdoll-cli.
293
+
294
+ ## License
295
+
296
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
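The README's Configuration section says the CLI reads `~/.ragdoll/config.yml` unless `RAGDOLL_CONFIG` points elsewhere. A minimal sketch of that lookup follows. The helper names `resolve_config_path` and `load_config` are hypothetical and are not taken from the gem, and the real `ConfigurationLoader` may behave differently.

```ruby
# frozen_string_literal: true

require 'yaml'

# Hypothetical helper (not part of ragdoll-cli): pick the config file path,
# preferring the RAGDOLL_CONFIG override described in the README.
def resolve_config_path(env = ENV)
  override = env['RAGDOLL_CONFIG']
  return File.expand_path(override) unless override.nil? || override.empty?

  File.join(Dir.home, '.ragdoll', 'config.yml')
end

# Hypothetical helper: load the YAML file, returning an empty hash when the
# file is missing or empty so callers can index into the result safely.
def load_config(path)
  return {} unless File.exist?(path)

  YAML.safe_load(File.read(path)) || {}
end
```

Passing the environment as an argument keeps the lookup easy to exercise without changing the real process environment, for example `resolve_config_path('RAGDOLL_CONFIG' => '/tmp/ragdoll.yml')`.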
data/Rakefile ADDED
@@ -0,0 +1,21 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'simplecov'
4
+ SimpleCov.start
5
+
6
+ # Suppress bundler/rubygems warnings
7
+ $VERBOSE = nil
8
+
9
+ require "bundler/gem_tasks"
10
+ require "rake/testtask"
11
+
12
+ Rake::TestTask.new(:test) do |t|
13
+ t.libs << "test"
14
+ t.libs << "lib"
15
+ t.test_files = FileList["test/**/*_test.rb"]
16
+ end
17
+
18
+ # Load any rake tasks defined under lib/tasks
19
+ Dir.glob("lib/tasks/*.rake").each { |r| load r }
20
+
21
+ task default: :test
data/bin/ragdoll ADDED
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require "ragdoll"
5
+ require_relative "../lib/ragdoll/cli"
6
+
7
+ Ragdoll::CLI::Main.start(ARGV)
@@ -0,0 +1,152 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'yaml'
4
+
5
+ module Ragdoll
6
+ module CLI
7
+ class Config < Thor
8
+ desc 'init', 'Initialize Ragdoll configuration'
9
+ def init
10
+ loader = ConfigurationLoader.new
11
+
12
+ if loader.config_exists?
13
+ puts "Configuration file already exists at: #{loader.config_path}"
14
+ if yes?('Overwrite existing configuration?')
15
+ loader.create_default_config
16
+ puts "Configuration file created at: #{loader.config_path}"
17
+ else
18
+ puts 'Configuration unchanged.'
19
+ return
20
+ end
21
+ else
22
+ loader.create_default_config
23
+ puts "Configuration file created at: #{loader.config_path}"
24
+ end
25
+
26
+ puts "\nDefault configuration created with PostgreSQL database."
27
+ puts 'You may need to:'
28
+ puts '1. Ensure PostgreSQL is installed and running'
29
+ puts '2. Create the database: createdb ragdoll_development'
30
+ puts '3. Set your API keys in environment variables:'
31
+ puts ' export OPENAI_API_KEY=your_key_here'
32
+ puts "4. Or add them to the config file under 'api_keys' section"
33
+ puts '5. For production, update the database configuration:'
34
+ puts ' ragdoll config set database_config.database ragdoll_production'
35
+ puts "6. Edit #{loader.config_path} to customize settings"
36
+ end
37
+
38
+ desc 'show', 'Show current configuration'
39
+ def show
40
+ loader = ConfigurationLoader.new
41
+
42
+ unless loader.config_exists?
43
+ puts "No configuration file found. Run 'ragdoll config init' to create one."
44
+ return
45
+ end
46
+
47
+ config = YAML.load_file(loader.config_path)
48
+ puts "Configuration from: #{loader.config_path}"
49
+ puts
50
+ puts YAML.dump(config)
51
+ end
52
+
53
+ desc 'set KEY VALUE', 'Set a configuration value'
54
+ def set(key, value)
55
+ loader = ConfigurationLoader.new
56
+
57
+ unless loader.config_exists?
58
+ puts "No configuration file found. Run 'ragdoll config init' to create one."
59
+ return
60
+ end
61
+
62
+ config = YAML.load_file(loader.config_path)
63
+
64
+ # Parse numeric and boolean values
65
+ if value.match?(/\A-?\d+\z/)
66
+ value = value.to_i
67
+ elsif value.match?(/\A-?\d+\.\d+\z/)
68
+ value = value.to_f
69
+ elsif value == 'true'
70
+ value = true
71
+ elsif value == 'false'
72
+ value = false
73
+ end
74
+
75
+ # Support nested keys with dot notation
76
+ keys = key.split('.')
77
+ current = config
78
+ keys[0..-2].each do |k|
79
+ current[k] = {} unless current[k].is_a?(Hash)
80
+ current = current[k]
81
+ end
82
+ current[keys.last] = value
83
+
84
+ File.write(loader.config_path, YAML.dump(config))
85
+ puts "Set #{key} = #{value}"
86
+ puts 'Note: Restart the CLI or reload configuration for changes to take effect.'
87
+ end
88
+
89
+ desc 'get KEY', 'Get a configuration value'
90
+ def get(key)
91
+ loader = ConfigurationLoader.new
92
+
93
+ unless loader.config_exists?
94
+ puts "No configuration file found. Run 'ragdoll config init' to create one."
95
+ return
96
+ end
97
+
98
+ config = YAML.load_file(loader.config_path)
99
+
100
+ # Support nested keys with dot notation
101
+ keys = key.split('.')
102
+ value = config
103
+ keys.each do |k|
104
+ value = value.is_a?(Hash) ? value[k] : nil
105
+ end
106
+
107
+ puts "#{key} = #{value}"
108
+ end
109
+
110
+ desc 'path', 'Show configuration file path'
111
+ def path
112
+ loader = ConfigurationLoader.new
113
+ puts loader.config_path
114
+ end
115
+
116
+ desc 'database', 'Show database configuration and status'
117
+ def database
118
+ loader = ConfigurationLoader.new
119
+
120
+ unless loader.config_exists?
121
+ puts "No configuration file found. Run 'ragdoll config init' to create one."
122
+ return
123
+ end
124
+
125
+ config = YAML.load_file(loader.config_path)
126
+ db_config = (config || {})['database_config'] || {}
127
+
128
+ puts 'Database Configuration:'
129
+ puts " Adapter: #{db_config['adapter']}"
130
+ puts " Database: #{db_config['database']}"
131
+ puts " Auto-migrate: #{db_config['auto_migrate']}"
132
+
133
+ if db_config['adapter'] == 'postgresql'
134
+ puts " Host: #{db_config['host'] || 'localhost'}"
135
+ puts " Port: #{db_config['port'] || 5432}"
136
+ puts " Username: #{db_config['username']}"
137
+ end
138
+
139
+ begin
140
+ client = StandaloneClient.new
141
+ if client.healthy?
142
+ puts "\nDatabase Status: ✓ Connected"
143
+ else
144
+ puts "\nDatabase Status: ✗ Connection failed"
145
+ end
146
+ rescue StandardError => e
147
+ puts "\nDatabase Status: ✗ Error - #{e.message}"
148
+ end
149
+ end
150
+ end
151
+ end
152
+ end
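The `set` and `get` commands above coerce string arguments and walk nested keys written in dot notation. A standalone sketch of the same idea, outside Thor, could look like this. The helper names `coerce_value`, `set_nested`, and `get_nested` are hypothetical and chosen for illustration only; they are not part of ragdoll-cli.

```ruby
# frozen_string_literal: true

# Hypothetical helper: turn a CLI string into an Integer, Float, or boolean
# when it looks like one; otherwise return the string unchanged.
def coerce_value(raw)
  case raw
  when /\A-?\d+\z/ then raw.to_i
  when /\A-?\d+\.\d+\z/ then raw.to_f
  when 'true' then true
  when 'false' then false
  else raw
  end
end

# Hypothetical helper: write value at a dotted path, creating (or replacing
# non-hash) intermediate nodes so the walk never indexes into a scalar.
def set_nested(config, dotted_key, value)
  *parents, leaf = dotted_key.split('.')
  target = parents.reduce(config) do |node, key|
    node[key] = {} unless node[key].is_a?(Hash)
    node[key]
  end
  target[leaf] = value
  config
end

# Hypothetical helper: read the value at a dotted path, returning nil as soon
# as a segment is missing or the path runs through a non-hash value.
def get_nested(config, dotted_key)
  dotted_key.split('.').reduce(config) do |node, key|
    return nil unless node.is_a?(Hash)

    node[key]
  end
end

config = { 'chunk_size' => 500 }
set_nested(config, 'chunk_size', coerce_value('1000'))
set_nested(config, 'database_config.database', coerce_value('ragdoll_production'))
puts get_nested(config, 'database_config.database')
```

Returning `nil` for an unresolved path, rather than the last hash reached, makes a mistyped key easy to tell apart from a real value.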
@@ -0,0 +1,37 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Ragdoll
4
+ module CLI
5
+ class Delete
6
+ def call(id, options)
7
+ client = StandaloneClient.new
8
+
9
+ puts "Deleting document ID: #{id}"
10
+ puts "Options: #{options.to_h}" unless options.to_h.empty?
11
+ puts
12
+
13
+ unless options[:force]
14
+ puts "Are you sure you want to delete document ID #{id}? This action cannot be undone."
15
+ return unless yes?('Confirm deletion?')
16
+ end
17
+
18
+ result = client.delete_document(id: id)
19
+
20
+ if result[:success]
21
+ puts "Document ID #{id} deleted successfully."
22
+ puts result[:message] if result[:message]
23
+ else
24
+ puts "Failed to delete document ID #{id}."
25
+ puts result[:message] if result[:message]
26
+ end
27
+ end
28
+
29
+ private
30
+
31
+ def yes?(question)
32
+ require 'highline/import'
33
+ agree("#{question} (y/n) ")
34
+ end
35
+ end
36
+ end
37
+ end
@@ -0,0 +1,22 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Ragdoll
4
+ module CLI
5
+ class Health
6
+ def call(_options)
7
+ client = StandaloneClient.new
8
+
9
+ puts 'Checking system health'
10
+ puts
11
+
12
+ if client.healthy?
13
+ puts 'System Status: ✓ Healthy'
14
+ puts 'The Ragdoll system is operational.'
15
+ else
16
+ puts 'System Status: ✗ Unhealthy'
17
+ puts 'There may be issues with the database or configuration.'
18
+ end
19
+ end
20
+ end
21
+ end
22
+ end
@@ -0,0 +1,57 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'json'
4
+
5
+ module Ragdoll
6
+ module CLI
7
+ class List
8
+ def call(options)
9
+ client = StandaloneClient.new
10
+
11
+ puts 'Listing documents'
12
+ puts "Options: #{options.to_h}" unless options.to_h.empty?
13
+ puts
14
+
15
+ list_options = {}
16
+ list_options[:limit] = options[:limit] if options[:limit]
17
+
18
+ documents = client.list_documents(**list_options)
19
+
20
+ if documents.empty?
21
+ puts 'No documents found.'
22
+ return
23
+ end
24
+
25
+ case options[:format]
26
+ when 'json'
27
+ puts JSON.pretty_generate(documents)
28
+ when 'plain'
29
+ documents.each_with_index do |doc, index|
30
+ puts "#{index + 1}. #{doc[:title] || 'Untitled'}"
31
+ puts " ID: #{doc[:id]}"
32
+ puts " Status: #{doc[:status] || 'N/A'}"
33
+ puts
34
+ end
35
+ else
36
+ # Table format (default)
37
+ puts "Found #{documents.length} documents:"
38
+ puts
39
+ puts 'Rank'.ljust(5) + 'Title'.ljust(30) + 'ID'.ljust(10) + 'Status'
40
+ puts '-' * 60
41
+
42
+ documents.each_with_index do |doc, index|
43
+ rank = (index + 1).to_s.ljust(5)
44
+ title = (doc[:title] || 'Untitled')[0..29].ljust(30)
45
+ id = doc[:id].to_s.ljust(10)
46
+ status = doc[:status] || 'N/A'
47
+
48
+ puts "#{rank}#{title}#{id}#{status}"
49
+ end
50
+
51
+ puts
52
+ puts 'Use --format=json for complete results or --format=plain for detailed view'
53
+ end
54
+ end
55
+ end
56
+ end
57
+ end
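The default table output above is built from `ljust` padding and fixed-width truncation. As a rough sketch of that technique in isolation, the hypothetical `format_table` helper below (not part of the gem) takes the column layout as data instead of hard-coding it.

```ruby
# frozen_string_literal: true

# Hypothetical helper: render an array of hashes as a fixed-width text table.
# Each column is [label, key, width]; cells are cut to width - 1 characters so
# neighbouring columns always keep at least one space between them.
def format_table(rows, columns)
  header = columns.map { |label, _key, width| label.ljust(width) }.join
  lines = [header.rstrip, '-' * columns.sum { |_label, _key, width| width }]

  rows.each do |row|
    cells = columns.map do |_label, key, width|
      row[key].to_s[0, width - 1].ljust(width)
    end
    lines << cells.join.rstrip
  end

  lines.join("\n")
end

columns = [['Rank', :rank, 5], ['Title', :title, 30], ['ID', :id, 10], ['Status', :status, 10]]
rows = [{ rank: 1, title: 'Getting Started', id: 123, status: 'processed' }]
puts format_table(rows, columns)
```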
@@ -0,0 +1,88 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'json'
4
+
5
+ module Ragdoll
6
+ module CLI
7
+ class Search
8
+ def call(query, options)
9
+ client = StandaloneClient.new
10
+
11
+ puts "Searching for: #{query}"
12
+ puts "Options: #{options.to_h}" unless options.to_h.empty?
13
+ puts
14
+
15
+ search_options = {}
16
+ search_options[:limit] = options[:limit] if options[:limit]
17
+ search_options[:content_type] = options[:content_type] if options[:content_type]
18
+ search_options[:classification] = options[:classification] if options[:classification]
19
+ search_options[:keywords] = options[:keywords].split(',').map(&:strip) if options[:keywords]
20
+ search_options[:tags] = options[:tags].split(',').map(&:strip) if options[:tags]
21
+
22
+ search_response = client.search(query, **search_options)
23
+
24
+ # Extract the actual results array from the response
25
+ results = search_response[:results] || search_response['results'] || []
26
+
27
+ if results.empty?
28
+ total = search_response[:total_results] || search_response['total_results'] || 0
29
+ puts "No results found for '#{query}'"
30
+ puts "(Total documents in system: #{total})" if total > 0
31
+ puts "Try adjusting your search terms or check if documents have been processed."
32
+ return
33
+ end
34
+
35
+ case options[:format]
36
+ when 'json'
37
+ puts JSON.pretty_generate(search_response)
38
+ when 'plain'
39
+ results.each_with_index do |result, index|
40
+ title = safe_string_value(result, [:title, :document_title], 'Untitled')
41
+ content = safe_string_value(result, [:content, :text], '')
42
+ puts "#{index + 1}. #{title}"
43
+ puts " ID: #{result[:document_id] || result[:id]}"
44
+ puts " Similarity: #{result[:similarity]&.round(3) || 'N/A'}"
45
+ puts " Content: #{content.length > 200 ? "#{content[0...200]}..." : content}"
46
+ puts
47
+ end
48
+ else
49
+ # Table format (default)
50
+ puts "Found #{results.length} results:"
51
+ puts
52
+ puts 'Rank'.ljust(5) + 'Title'.ljust(30) + 'Similarity'.ljust(12) + 'Content Preview'
53
+ puts '-' * 80
54
+
55
+ results.each_with_index do |result, index|
56
+ rank = (index + 1).to_s.ljust(5)
57
+ title = safe_string_value(result, [:title, :document_title], 'Untitled')[0..29].ljust(30)
58
+ similarity = (result[:similarity]&.round(3) || 'N/A').to_s.ljust(12)
59
+ content = safe_string_value(result, [:content, :text], '')
60
+ content = "#{content[0...50]}..." if content.length > 50
61
+
62
+ puts "#{rank}#{title}#{similarity}#{content}"
63
+ end
64
+
65
+ puts
66
+ puts 'Use --format=json for complete results or --format=plain for detailed view'
67
+ end
68
+ end
69
+
70
+ private
71
+
72
+ def safe_string_value(obj, keys, default)
73
+ return default.to_s unless obj.respond_to?(:[])
74
+
75
+ keys.each do |key|
76
+ begin
77
+ value = obj[key] || obj[key.to_s]
78
+ return value.to_s if value
79
+ rescue TypeError, NoMethodError
80
+ # Skip this key if access fails
81
+ next
82
+ end
83
+ end
84
+ default.to_s
85
+ end
86
+ end
87
+ end
88
+ end
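Two small techniques carry most of the result formatting above: looking up a value under several candidate keys (symbol or string), and shortening content for a preview. The sketch below isolates both. The names `first_present` and `preview` are hypothetical and do not come from the gem.

```ruby
# frozen_string_literal: true

# Hypothetical helper: return the first non-nil value found under any of the
# candidate keys, trying each key as given and then as a string.
def first_present(hash, keys, default = '')
  return default unless hash.is_a?(Hash)

  keys.each do |key|
    value = hash[key] || hash[key.to_s]
    return value.to_s if value
  end
  default
end

# Hypothetical helper: shorten text to at most max characters and append an
# ellipsis only when something was actually cut off.
def preview(text, max = 50)
  text = text.to_s
  text.length > max ? "#{text[0, max]}..." : text
end

result = { 'text' => 'Retrieval-augmented generation combines search with a language model.' }
puts preview(first_present(result, %i[content text]))
```

Comparing the full length against the limit before slicing avoids the off-by-one that comes from checking the length of an already truncated string.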