exa-ai 0.6.1 → 0.7.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 4f55e9efe411da4b9eaf3018791aa7819fc8872c69e32369627014594eb23670
-  data.tar.gz: 0d074ce4bb6eaa2902b80fe5df239be717f182c3703ebfcd6c45944de7cd1245
+  metadata.gz: 1a5fb5324a2ae6dfb4380d91bce7d8d41a3f3b14e471e43bf254a5e763ef0113
+  data.tar.gz: 539a8486ec5e639c4f43fd3e81071ff0f1e5951b3e42c31bc1915d8e00d1df36
 SHA512:
-  metadata.gz: 732d915bf1eadcabff77ae2dcf7c573d426021bea3dac1fe0b2013958dde5a53129c62675ff66279b5204233c970b988ea72ad29ef94bb4b5219cba55910145b
-  data.tar.gz: f4f73347c282e2ebb2209afd42a02e1fb253e9d682a5959ce1cd2ea27519f132c156fb897bbd3177d24f4b88bd06874afd76b81bf70ccb225dcb3ad64ec6249b
+  metadata.gz: dd87558e4a7428b56d9e513478e9d02f88cc845bf7d14fa9c0822e4042dd73d100e686ce3c151bc755d680cc283017311ba2b5109eedfcffef03bd99d4457700
+  data.tar.gz: 49eac46680edc76e6bfa9a2032b9871841095e052cc6e21eac174675f4060149ddc3a690cdd5b6506f39dbaf621f124d7a288f227f99d1637979729e23d63042
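Note on the checksums: a `.gem` file is a tar archive whose members include `metadata.gz` and `data.tar.gz`, and the values above are hex digests of those two members. A minimal sketch of checking an extracted member against a recorded digest follows. The helper name `digest_matches?` is hypothetical and not part of the gem, and it assumes the member has already been extracted to a local path.

```ruby
require "digest"

# Hypothetical helper (not part of exa-ai): true when the hex digest of the
# file at `path` equals `expected`. Defaults to SHA256; pass Digest::SHA512
# to check the second block of checksums.yaml.
def digest_matches?(path, expected, algorithm: Digest::SHA256)
  algorithm.file(path).hexdigest == expected.strip.downcase
end
```

For example, `digest_matches?("data.tar.gz", "539a8486...")` would be called with the full 64-character SHA256 value listed for 0.7.1.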
data/README.md CHANGED
@@ -2,6 +2,57 @@
2
2
 
3
3
  Ruby client for the Exa.ai API. Search and analyze web content using neural search, question answering, code discovery, and research automation.
4
4
 
5
+ ## Table of Contents
6
+
7
+ - [Requirements](#requirements)
8
+ - [Installation](#installation)
9
+ - [Configuration](#configuration)
10
+ - [Quick Start](#quick-start)
11
+ - [Features](#features)
12
+ - [Error Handling](#error-handling)
13
+ - [Documentation](#documentation)
14
+ - [Development](#development)
15
+ - [Testing](#testing)
16
+ - [Support](#support)
17
+ - [License](#license)
18
+
19
+ ## Requirements
20
+
21
+ - **Ruby 3.0.0 or higher**
22
+
23
+ ### Installing Ruby on macOS
24
+
25
+ If you're setting up on a fresh macOS laptop, the easiest way to get Ruby 3.x is through Homebrew:
26
+
27
+ **1. Install Homebrew** (if not already installed):
28
+
29
+ ```bash
30
+ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
31
+ ```
32
+
33
+ **2. Install Ruby:**
34
+
35
+ ```bash
36
+ brew install ruby
37
+ ```
38
+
39
+ **3. Add Homebrew's Ruby to your PATH** (follow the instructions Homebrew prints, usually adding to `~/.zshrc`):
40
+
41
+ ```bash
42
+ echo 'export PATH="/opt/homebrew/opt/ruby/bin:$PATH"' >> ~/.zshrc
43
+ source ~/.zshrc
44
+ ```
45
+
46
+ **4. Verify installation:**
47
+
48
+ ```bash
49
+ ruby -v # Should show Ruby 3.x
50
+ ```
51
+
52
+ **Alternative: Using a version manager**
53
+
54
+ For managing multiple Ruby versions, consider [rbenv](https://github.com/rbenv/rbenv) or [asdf](https://asdf-vm.com/).
55
+
5
56
  ## Installation
6
57
 
7
58
  Add to your Gemfile:
@@ -32,6 +83,20 @@ Get your API key from [dashboard.exa.ai](https://dashboard.exa.ai).
32
83
  export EXA_API_KEY="your-api-key-here"
33
84
  ```
34
85
 
86
+ **Using .env file (local development)**
87
+
88
+ Create a `.env` file in your project root:
89
+
90
+ ```bash
91
+ # Copy the example file
92
+ cp .env.example .env
93
+
94
+ # Edit .env and add your API key
95
+ EXA_API_KEY=your-api-key-here
96
+ ```
97
+
98
+ The gem automatically loads `.env` files in development when the `dotenv` gem is installed.
99
+
35
100
  **Ruby Code**
36
101
 
37
102
  ```ruby
@@ -193,6 +258,46 @@ See [CONTRIBUTING.md](./CONTRIBUTING.md) for:
193
258
  - Code conventions
194
259
  - Building and releasing
195
260
 
261
+ ## Testing
262
+
263
+ ### Running Tests
264
+
265
+ ```bash
266
+ # Run unit tests (integration tests skip by default)
267
+ bundle exec rake test
268
+
269
+ # Run integration tests (VCR-based, no real API calls)
270
+ RUN_INTEGRATION_TESTS=true bundle exec rake test
271
+
272
+ # Run CLI integration tests (real API calls, requires explicit opt-in)
273
+ RUN_CLI_INTEGRATION_TESTS=true bundle exec rake test
274
+ ```
275
+
276
+ ### Integration Tests
277
+
278
+ **Integration tests are skipped by default** to prevent accidental API calls.
279
+
280
+ **VCR-based integration tests (`RUN_INTEGRATION_TESTS`):**
281
+ - Use recorded HTTP interactions (VCR cassettes)
282
+ - No real API calls when replaying cassettes
283
+ - Set `RUN_INTEGRATION_TESTS=true` to run them
284
+ - Safe to run during development
285
+
286
+ **CLI integration tests (`RUN_CLI_INTEGRATION_TESTS`):**
287
+ - Make real API calls through shell commands
288
+ - Consume Exa's concurrent search quota
289
+ - Set `RUN_CLI_INTEGRATION_TESTS=true` AND `EXA_API_KEY` to run them
290
+ - **Warning:** Can exhaust API quota and trigger rate limits lasting 1-2 days
291
+
292
+ **When to run integration tests:**
293
+ - VCR tests: Anytime (safe, no real API calls)
294
+ - CLI tests: Only before releases or when testing CLI-specific functionality
295
+
296
+ **Test Coverage:**
297
+ - **Unit tests** - Fast, no API calls, always run
298
+ - **VCR integration tests** - Replay cassettes, skipped by default
299
+ - **CLI integration tests** - Real API calls via shell, skipped by default
300
+
196
301
  ## Support
197
302
 
198
303
  - **Documentation**: https://docs.exa.ai
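Note on the new Testing section: the README describes three tiers gated by environment variables, but the test helper that enforces the gating is not part of this diff. The opt-in rules as documented can be sketched like this. The function name `test_suites_to_run` and the suite symbols are hypothetical, chosen only for illustration.

```ruby
# Hypothetical helper mirroring the opt-in rules in the README:
# unit tests always run, VCR tests need RUN_INTEGRATION_TESTS=true, and CLI
# tests need RUN_CLI_INTEGRATION_TESTS=true plus a non-empty EXA_API_KEY.
def test_suites_to_run(env = ENV)
  suites = [:unit]
  suites << :vcr_integration if env["RUN_INTEGRATION_TESTS"] == "true"
  if env["RUN_CLI_INTEGRATION_TESTS"] == "true" && !env["EXA_API_KEY"].to_s.empty?
    suites << :cli_integration
  end
  suites
end
```

Taking the environment as a parameter keeps the rule testable with a plain hash instead of mutating `ENV`.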
data/exe/exa-ai-answer CHANGED
@@ -9,7 +9,8 @@ def parse_args(argv)
     output_format: "json",
     api_key: nil,
     text: false,
-    stream: false
+    stream: false,
+    skip_citations: false
   }
 
   # Extract query (first non-flag argument)
@@ -24,6 +25,9 @@ def parse_args(argv)
     when "--stream"
       args[:stream] = true
       i += 1
+    when "--skip-citations", "--no-citations"
+      args[:skip_citations] = true
+      i += 1
     when "--output-schema"
       args[:output_schema] = argv[i + 1]
       i += 2
@@ -48,6 +52,8 @@ def parse_args(argv)
         Options:
           --stream                Stream answer chunks in real-time
           --text                  Include full text content from sources
+          --skip-citations        Remove citations from output (saves tokens)
+          --no-citations          Alias for --skip-citations
           --output-schema JSON    JSON schema for structured output
           --system-prompt TEXT    System prompt to guide answer generation
           --api-key KEY           Exa API key (or set EXA_API_KEY env var)
@@ -123,7 +129,7 @@ begin
   else
     # Non-streaming mode - collect full response and format
     result = client.answer(args[:query], **answer_params)
-    output = Exa::CLI::Formatters::AnswerFormatter.format(result, output_format)
+    output = Exa::CLI::Formatters::AnswerFormatter.format(result, output_format, skip_citations: args[:skip_citations])
     puts output
     $stdout.flush
   end
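Note on `--skip-citations`: in the hunks shown, the flag is passed only in the non-streaming branch, and the `AnswerFormatter` changes later in this diff apply it to the `json` and `pretty` formats (and the fallback), not to `text` or `toon`. The new `format_json` calls `hash.delete(:citations)` on `result.to_h`. Whether `to_h` returns a fresh hash, and whether its keys are symbols, is not visible here. A defensive variant, as a sketch only, could look like this. The name `answer_json` is hypothetical.

```ruby
require "json"

# Hypothetical variant of format_json (not the gem's implementation): works on
# a shallow copy so the caller's hash is untouched, and removes the citations
# entry whether it is stored under a symbol or a string key.
def answer_json(answer, skip_citations: false)
  hash = answer.to_h.dup
  if skip_citations
    hash.delete(:citations)
    hash.delete("citations")
  end
  JSON.pretty_generate(hash)
end
```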
@@ -3,7 +3,7 @@
 
 require "exa-ai"
 
-VALID_FORMATS = %w[text url options].freeze
+VALID_FORMATS = Exa::Constants::Websets::ENRICHMENT_FORMATS
 
 # Recursively convert hash keys from strings to symbols
 def deep_symbolize_keys(obj)
data/exe/exa-ai-search CHANGED
@@ -2,189 +2,61 @@
2
2
  # frozen_string_literal: true
3
3
 
4
4
  require "exa-ai"
5
-
6
- # Parse command-line arguments
7
- def parse_args(argv)
8
- args = {
9
- output_format: "json",
10
- api_key: nil
11
- }
12
-
13
- # Extract query (first non-flag argument)
14
- query_parts = []
15
- i = 0
16
- while i < argv.length
17
- arg = argv[i]
18
- case arg
19
- when "--num-results"
20
- args[:num_results] = argv[i + 1].to_i
21
- i += 2
22
- when "--type"
23
- search_type = argv[i + 1]
24
- valid_types = ["fast", "deep", "keyword", "auto"]
25
- unless valid_types.include?(search_type)
26
- $stderr.puts "Error: Search type must be one of: #{valid_types.join(', ')}"
27
- exit 1
28
- end
29
- args[:type] = search_type
30
- i += 2
31
- when "--category"
32
- category = argv[i + 1]
33
- valid_categories = ["company", "research paper", "news", "pdf", "github", "tweet", "personal site", "linkedin profile", "financial report"]
34
- unless valid_categories.include?(category)
35
- $stderr.puts "Error: Category must be one of: #{valid_categories.map { |c| "\"#{c}\"" }.join(', ')}"
36
- exit 1
37
- end
38
- args[:category] = category
39
- i += 2
40
- when "--include-domains"
41
- args[:include_domains] = argv[i + 1].split(",").map(&:strip)
42
- i += 2
43
- when "--exclude-domains"
44
- args[:exclude_domains] = argv[i + 1].split(",").map(&:strip)
45
- i += 2
46
- when "--api-key"
47
- args[:api_key] = argv[i + 1]
48
- i += 2
49
- when "--output-format"
50
- args[:output_format] = argv[i + 1]
51
- i += 2
52
- when "--linkedin"
53
- linkedin_type = argv[i + 1]
54
- valid_types = ["company", "person", "all"]
55
- unless valid_types.include?(linkedin_type)
56
- $stderr.puts "Error: LinkedIn type must be one of: #{valid_types.join(', ')}"
57
- exit 1
58
- end
59
- args[:linkedin] = linkedin_type
60
- i += 2
61
- when "--start-published-date"
62
- args[:start_published_date] = argv[i + 1]
63
- i += 2
64
- when "--end-published-date"
65
- args[:end_published_date] = argv[i + 1]
66
- i += 2
67
- when "--start-crawl-date"
68
- args[:start_crawl_date] = argv[i + 1]
69
- i += 2
70
- when "--end-crawl-date"
71
- args[:end_crawl_date] = argv[i + 1]
72
- i += 2
73
- when "--include-text"
74
- args[:include_text] ||= []
75
- args[:include_text] << argv[i + 1]
76
- i += 2
77
- when "--exclude-text"
78
- args[:exclude_text] ||= []
79
- args[:exclude_text] << argv[i + 1]
80
- i += 2
81
- when "--text"
82
- args[:text] = true
83
- i += 1
84
- when "--text-max-characters"
85
- args[:text_max_characters] = argv[i + 1].to_i
86
- i += 2
87
- when "--include-html-tags"
88
- args[:include_html_tags] = true
89
- i += 1
90
- when "--summary"
91
- args[:summary] = true
92
- i += 1
93
- when "--summary-query"
94
- args[:summary_query] = argv[i + 1]
95
- i += 2
96
- when "--summary-schema"
97
- schema_arg = argv[i + 1]
98
- args[:summary_schema] = if schema_arg.start_with?("@")
99
- JSON.parse(File.read(schema_arg[1..]))
100
- else
101
- JSON.parse(schema_arg)
102
- end
103
- i += 2
104
- when "--context"
105
- args[:context] = true
106
- i += 1
107
- when "--context-max-characters"
108
- args[:context_max_characters] = argv[i + 1].to_i
109
- i += 2
110
- when "--subpages"
111
- args[:subpages] = argv[i + 1].to_i
112
- i += 2
113
- when "--subpage-target"
114
- args[:subpage_target] ||= []
115
- args[:subpage_target] << argv[i + 1]
116
- i += 2
117
- when "--links"
118
- args[:links] = argv[i + 1].to_i
119
- i += 2
120
- when "--image-links"
121
- args[:image_links] = argv[i + 1].to_i
122
- i += 2
123
- when "--help", "-h"
124
- puts <<~HELP
125
- Usage: exa-ai search QUERY [OPTIONS]
126
-
127
- Search the web using Exa AI
128
-
129
- Arguments:
130
- QUERY Search query (required)
131
-
132
- Options:
133
- --num-results N Number of results to return (default: 10)
134
- --type TYPE Search type: fast, deep, keyword, or auto (default: fast)
135
- --category CAT Focus on specific data category
136
- Options: "company", "research paper", "news", "pdf",
137
- "github", "tweet", "personal site", "linkedin profile",
138
- "financial report"
139
- --include-domains D Comma-separated list of domains to include
140
- --exclude-domains D Comma-separated list of domains to exclude
141
- --start-published-date DATE Filter by published date (ISO 8601 format)
142
- --end-published-date DATE Filter by published date (ISO 8601 format)
143
- --start-crawl-date DATE Filter by crawl date (ISO 8601 format)
144
- --end-crawl-date DATE Filter by crawl date (ISO 8601 format)
145
- --include-text PHRASE Include results with exact phrase (repeatable)
146
- --exclude-text PHRASE Exclude results with exact phrase (repeatable)
147
-
148
- Content Extraction:
149
- --text Include full webpage text
150
- --text-max-characters N Max characters for webpage text
151
- --include-html-tags Include HTML tags in text extraction
152
- --summary Include AI-generated summary
153
- --summary-query PROMPT Custom prompt for summary generation
154
- --summary-schema FILE JSON schema for summary structure (@file syntax)
155
- --context Format results as context for LLM RAG
156
- --context-max-characters N Max characters for context string
157
- --subpages N Number of subpages to crawl
158
- --subpage-target PHRASE Subpage target phrases (repeatable)
159
- --links N Number of links to extract per result
160
- --image-links N Number of image links to extract
161
-
162
- General Options:
163
- --linkedin TYPE Search LinkedIn: company, person, or all
164
- --api-key KEY Exa API key (or set EXA_API_KEY env var)
165
- --output-format FMT Output format: json, pretty, or text (default: json)
166
- --help, -h Show this help message
167
-
168
- Examples:
169
- exa-ai search "ruby programming"
170
- exa-ai search "machine learning" --num-results 5 --type deep
171
- exa-ai search "Latest LLM research" --category "research paper"
172
- exa-ai search "AI startups" --category company
173
- exa-ai search "Anthropic" --linkedin company
174
- exa-ai search "Dario Amodei" --linkedin person
175
- exa-ai search "AI" --linkedin all
176
- exa-ai search "AI research" --include-domains arxiv.org,scholar.google.com
177
- exa-ai search "tutorials" --output-format pretty
178
- HELP
179
- exit 0
180
- else
181
- query_parts << arg
182
- i += 1
183
- end
184
- end
185
-
186
- args[:query] = query_parts.join(" ")
187
- args
5
+ require_relative "../lib/exa/cli/search_parser"
6
+
7
+ def print_help
8
+ puts <<~HELP
9
+ Usage: exa-ai search QUERY [OPTIONS]
10
+
11
+ Search the web using Exa AI
12
+
13
+ Arguments:
14
+ QUERY Search query (required)
15
+
16
+ Options:
17
+ --num-results N Number of results to return (default: 10)
18
+ --type TYPE Search type: fast, deep, keyword, or auto (default: fast)
19
+ --category CAT Focus on specific data category
20
+ Options: "company", "research paper", "news", "pdf",
21
+ "github", "tweet", "personal site", "financial report",
22
+ "people"
23
+ --include-domains D Comma-separated list of domains to include
24
+ --exclude-domains D Comma-separated list of domains to exclude
25
+ --start-published-date DATE Filter by published date (ISO 8601 format)
26
+ --end-published-date DATE Filter by published date (ISO 8601 format)
27
+ --start-crawl-date DATE Filter by crawl date (ISO 8601 format)
28
+ --end-crawl-date DATE Filter by crawl date (ISO 8601 format)
29
+ --include-text PHRASE Include results with exact phrase (repeatable)
30
+ --exclude-text PHRASE Exclude results with exact phrase (repeatable)
31
+
32
+ Content Extraction:
33
+ --text Include full webpage text
34
+ --text-max-characters N Max characters for webpage text
35
+ --include-html-tags Include HTML tags in text extraction
36
+ --summary Include AI-generated summary
37
+ --summary-query PROMPT Custom prompt for summary generation
38
+ --summary-schema FILE JSON schema for summary structure (@file syntax)
39
+ --context Format results as context for LLM RAG
40
+ --context-max-characters N Max characters for context string
41
+ --subpages N Number of subpages to crawl
42
+ --subpage-target PHRASE Subpage target phrases (repeatable)
43
+ --links N Number of links to extract per result
44
+ --image-links N Number of image links to extract
45
+
46
+ General Options:
47
+ --api-key KEY Exa API key (or set EXA_API_KEY env var)
48
+ --output-format FMT Output format: json, pretty, or text (default: json)
49
+ --help, -h Show this help message
50
+
51
+ Examples:
52
+ exa-ai search "ruby programming"
53
+ exa-ai search "machine learning" --num-results 5 --type deep
54
+ exa-ai search "Latest LLM research" --category "research paper"
55
+ exa-ai search "AI startups" --category company
56
+ exa-ai search "Dario Amodei" --category people
57
+ exa-ai search "AI research" --include-domains arxiv.org,scholar.google.com
58
+ exa-ai search "tutorials" --output-format pretty
59
+ HELP
188
60
  end
189
61
 
190
62
  # Build contents parameter from extracted flags
@@ -238,15 +110,15 @@ end
238
110
 
239
111
  # Main execution
240
112
  begin
241
- args = parse_args(ARGV)
242
-
243
- # Validate query
244
- if args[:query].nil? || args[:query].empty?
245
- $stderr.puts "Error: Query is required"
246
- $stderr.puts "Run 'exa-ai search --help' for usage information"
247
- exit 1
113
+ # Handle help flag
114
+ if ARGV.include?("--help") || ARGV.include?("-h")
115
+ print_help
116
+ exit 0
248
117
  end
249
118
 
119
+ # Parse command-line arguments
120
+ args = Exa::CLI::SearchParser.parse(ARGV)
121
+
250
122
  # Resolve API key
251
123
  api_key = Exa::CLI::Base.resolve_api_key(args[:api_key])
252
124
 
@@ -272,17 +144,8 @@ begin
272
144
  contents = build_contents(args)
273
145
  search_params.merge!(contents) if contents
274
146
 
275
- # Execute search based on LinkedIn type
276
- result = case args[:linkedin]
277
- when "company"
278
- client.linkedin_company(args[:query], **search_params)
279
- when "person"
280
- client.linkedin_person(args[:query], **search_params)
281
- when "all"
282
- client.search(args[:query], includeDomains: ["linkedin.com"], **search_params)
283
- else
284
- client.search(args[:query], **search_params)
285
- end
147
+ # Execute search
148
+ result = client.search(args[:query], **search_params)
286
149
 
287
150
  # Format and output result
288
151
  output = Exa::CLI::Formatters::SearchFormatter.format(result, output_format)
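Note on error handling: in 0.6.1 the script printed validation failures to stderr and called `exit 1` itself, whereas `Exa::CLI::SearchParser` (added later in this diff) raises `ArgumentError`. The `rescue` clauses of the `begin` block above fall outside the hunks shown, so how that error is reported is not visible here. One way a caller might preserve the old behaviour is sketched below. `parse_query` is a simplified stand-in for the real parser, and `run_search_cli` is a hypothetical name.

```ruby
# Simplified stand-in for Exa::CLI::SearchParser.parse: joins the words that
# are not "--" flags into a query and raises ArgumentError when none remain.
def parse_query(argv)
  query = argv.reject { |word| word.start_with?("--") }.join(" ")
  raise ArgumentError, "Query is required" if query.empty?

  { query: query }
end

# Hypothetical entry point: returns an exit status instead of calling exit,
# and turns ArgumentError into the two-line message the 0.6.1 script printed.
def run_search_cli(argv, out: $stdout, err: $stderr)
  args = parse_query(argv)
  out.puts "searching for #{args[:query].inspect}"
  0
rescue ArgumentError => e
  err.puts "Error: #{e.message}"
  err.puts "Run 'exa-ai search --help' for usage information"
  1
end
```

A script would then finish with `exit run_search_cli(ARGV)`.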
@@ -7,6 +7,8 @@ require "exa-ai"
7
7
  webset_id = nil
8
8
  api_key = nil
9
9
  output_format = "json"
10
+ limit = nil
11
+ cursor = nil
10
12
 
11
13
  args = ARGV.dup
12
14
  while args.any?
@@ -16,6 +18,10 @@ while args.any?
16
18
  api_key = args.shift
17
19
  when "--output-format"
18
20
  output_format = args.shift
21
+ when "--limit"
22
+ limit = args.shift&.to_i
23
+ when "--cursor"
24
+ cursor = args.shift
19
25
  when "--help", "-h"
20
26
  puts <<~HELP
21
27
  Usage: exa-ai webset-item-list <webset_id> [OPTIONS]
@@ -26,14 +32,17 @@ while args.any?
26
32
  webset_id ID of the webset (required)
27
33
 
28
34
  Options:
35
+ --limit N Maximum number of items to return (default: 20)
36
+ --cursor CURSOR Cursor for pagination (use nextCursor from previous response)
29
37
  --api-key KEY Exa API key (or set EXA_API_KEY env var)
30
- --output-format FMT Output format: json, pretty, or text (default: json)
38
+ --output-format FMT Output format: json, pretty, text, or toon (default: json)
31
39
  --help, -h Show this help message
32
40
 
33
41
  Examples:
34
42
  exa-ai webset-item-list ws_123
43
+ exa-ai webset-item-list ws_123 --limit 10
44
+ exa-ai webset-item-list ws_123 --limit 5 --cursor "abc123"
35
45
  exa-ai webset-item-list ws_123 --output-format pretty
36
- exa-ai webset-item-list ws_123 --output-format text
37
46
  HELP
38
47
  exit 0
39
48
  else
@@ -63,11 +72,16 @@ begin
63
72
  # Build client
64
73
  client = Exa::CLI::Base.build_client(api_key)
65
74
 
75
+ # Build list params
76
+ list_params = {}
77
+ list_params[:limit] = limit if limit
78
+ list_params[:cursor] = cursor if cursor
79
+
66
80
  # List items
67
- items = client.list_items(webset_id: webset_id)
81
+ collection = client.list_items(webset_id: webset_id, **list_params)
68
82
 
69
83
  # Format and output
70
- output = Exa::CLI::Formatters::WebsetItemFormatter.format_collection(items, output_format)
84
+ output = Exa::CLI::Formatters::WebsetItemFormatter.format_collection(collection, output_format)
71
85
  puts output
72
86
  $stdout.flush
73
87
 
@@ -4,24 +4,30 @@ module Exa
   module CLI
     module Formatters
       class AnswerFormatter
-        def self.format(result, format)
+        def self.format(result, format, skip_citations: false)
           case format
           when "json"
-            JSON.pretty_generate(result.to_h)
+            format_json(result, skip_citations: skip_citations)
           when "pretty"
-            format_pretty(result)
+            format_pretty(result, skip_citations: skip_citations)
           when "text"
             format_text(result)
           when "toon"
             Exa::CLI::Base.encode_as_toon(result.to_h)
           else
-            JSON.pretty_generate(result.to_h)
+            format_json(result, skip_citations: skip_citations)
           end
         end
 
         private
 
-        def self.format_pretty(result)
+        def self.format_json(result, skip_citations: false)
+          hash = result.to_h
+          hash.delete(:citations) if skip_citations
+          JSON.pretty_generate(hash)
+        end
+
+        def self.format_pretty(result, skip_citations: false)
           output = []
           output << "Answer:"
           output << "-" * 60
@@ -34,15 +40,17 @@ module Exa
           end
           output << ""
 
-          if result.citations && !result.citations.empty?
-            output << "Citations:"
-            output << "-" * 60
-            result.citations.each_with_index do |citation, idx|
-              output << "[#{idx + 1}] #{citation['title']}"
-              output << " URL: #{citation['url']}"
-              output << " Author: #{citation['author']}" if citation['author']
-              output << " Date: #{citation['publishedDate']}" if citation['publishedDate']
-              output << ""
+          unless skip_citations
+            if result.citations && !result.citations.empty?
+              output << "Citations:"
+              output << "-" * 60
+              result.citations.each_with_index do |citation, idx|
+                output << "[#{idx + 1}] #{citation['title']}"
+                output << " URL: #{citation['url']}"
+                output << " Author: #{citation['author']}" if citation['author']
+                output << " Date: #{citation['publishedDate']}" if citation['publishedDate']
+                output << ""
+              end
             end
           end
 
@@ -19,16 +19,16 @@ module Exa
19
19
  end
20
20
  end
21
21
 
22
- def self.format_collection(items, output_format)
22
+ def self.format_collection(collection, output_format)
23
23
  case output_format
24
24
  when "json"
25
- JSON.generate(items)
25
+ JSON.generate(collection.to_h)
26
26
  when "pretty"
27
- format_collection_as_pretty(items)
27
+ format_collection_as_pretty(collection)
28
28
  when "text"
29
- format_collection_as_text(items)
29
+ format_collection_as_text(collection)
30
30
  when "toon"
31
- Exa::CLI::Base.encode_as_toon(items)
31
+ Exa::CLI::Base.encode_as_toon(collection.to_h)
32
32
  else
33
33
  raise ArgumentError, "Unknown output format: #{output_format}"
34
34
  end
@@ -74,12 +74,17 @@ module Exa
74
74
  end
75
75
  private_class_method :format_as_text
76
76
 
77
- def self.format_collection_as_pretty(items)
77
+ def self.format_collection_as_pretty(collection)
78
78
  lines = []
79
- lines << "Items (#{items.length})"
79
+ lines << "Webset Items (#{collection.data.length} items)"
80
+
81
+ if collection.has_more
82
+ lines << "Next Cursor: #{collection.next_cursor}"
83
+ end
84
+
80
85
  lines << ""
81
86
 
82
- items.each_with_index do |item, idx|
87
+ collection.data.each_with_index do |item, idx|
83
88
  lines << "" if idx > 0 # Blank line between items
84
89
 
85
90
  lines << "Item ID: #{item['id']}"
@@ -101,9 +106,9 @@ module Exa
101
106
  end
102
107
  private_class_method :format_collection_as_pretty
103
108
 
104
- def self.format_collection_as_text(items)
105
- lines = ["Items (#{items.length} total):"]
106
- items.each_with_index do |item, idx|
109
+ def self.format_collection_as_text(collection)
110
+ lines = ["Webset Items (#{collection.data.length} items):"]
111
+ collection.data.each_with_index do |item, idx|
107
112
  lines << "\n#{idx + 1}. #{item['id']}"
108
113
  lines << " URL: #{item['url']}" if item['url']
109
114
  lines << " Title: #{item['title']}" if item['title']
@@ -112,6 +117,11 @@ module Exa
112
117
  lines << " Entity: #{item['entity']['name']}"
113
118
  end
114
119
  end
120
+
121
+ if collection.has_more
122
+ lines << "\nMore available (cursor: #{collection.next_cursor})"
123
+ end
124
+
115
125
  lines.join("\n")
116
126
  end
117
127
  private_class_method :format_collection_as_text
@@ -0,0 +1,152 @@
+# frozen_string_literal: true
+
+module Exa
+  module CLI
+    class SearchParser
+      VALID_SEARCH_TYPES = ["fast", "deep", "keyword", "auto"].freeze
+      VALID_CATEGORIES = [
+        "company", "research paper", "news", "pdf", "github",
+        "tweet", "personal site", "financial report", "people"
+      ].freeze
+
+      def self.parse(argv)
+        new(argv).parse
+      end
+
+      def initialize(argv)
+        @argv = argv
+        @args = {
+          output_format: "json",
+          api_key: nil
+        }
+      end
+
+      def parse
+        parse_arguments
+        validate_query
+        @args
+      end
+
+      private
+
+      def parse_arguments
+        query_parts = []
+        i = 0
+
+        while i < @argv.length
+          arg = @argv[i]
+          case arg
+          when "--num-results"
+            @args[:num_results] = @argv[i + 1].to_i
+            i += 2
+          when "--type"
+            search_type = @argv[i + 1]
+            validate_search_type(search_type)
+            @args[:type] = search_type
+            i += 2
+          when "--category"
+            category = @argv[i + 1]
+            validate_category(category)
+            @args[:category] = category
+            i += 2
+          when "--include-domains"
+            @args[:include_domains] = @argv[i + 1].split(",").map(&:strip)
+            i += 2
+          when "--exclude-domains"
+            @args[:exclude_domains] = @argv[i + 1].split(",").map(&:strip)
+            i += 2
+          when "--api-key"
+            @args[:api_key] = @argv[i + 1]
+            i += 2
+          when "--output-format"
+            @args[:output_format] = @argv[i + 1]
+            i += 2
+          when "--start-published-date"
+            @args[:start_published_date] = @argv[i + 1]
+            i += 2
+          when "--end-published-date"
+            @args[:end_published_date] = @argv[i + 1]
+            i += 2
+          when "--start-crawl-date"
+            @args[:start_crawl_date] = @argv[i + 1]
+            i += 2
+          when "--end-crawl-date"
+            @args[:end_crawl_date] = @argv[i + 1]
+            i += 2
+          when "--include-text"
+            @args[:include_text] ||= []
+            @args[:include_text] << @argv[i + 1]
+            i += 2
+          when "--exclude-text"
+            @args[:exclude_text] ||= []
+            @args[:exclude_text] << @argv[i + 1]
+            i += 2
+          when "--text"
+            @args[:text] = true
+            i += 1
+          when "--text-max-characters"
+            @args[:text_max_characters] = @argv[i + 1].to_i
+            i += 2
+          when "--include-html-tags"
+            @args[:include_html_tags] = true
+            i += 1
+          when "--summary"
+            @args[:summary] = true
+            i += 1
+          when "--summary-query"
+            @args[:summary_query] = @argv[i + 1]
+            i += 2
+          when "--summary-schema"
+            schema_arg = @argv[i + 1]
+            @args[:summary_schema] = if schema_arg.start_with?("@")
+              JSON.parse(File.read(schema_arg[1..]))
+            else
+              JSON.parse(schema_arg)
+            end
+            i += 2
+          when "--context"
+            @args[:context] = true
+            i += 1
+          when "--context-max-characters"
+            @args[:context_max_characters] = @argv[i + 1].to_i
+            i += 2
+          when "--subpages"
+            @args[:subpages] = @argv[i + 1].to_i
+            i += 2
+          when "--subpage-target"
+            @args[:subpage_target] ||= []
+            @args[:subpage_target] << @argv[i + 1]
+            i += 2
+          when "--links"
+            @args[:links] = @argv[i + 1].to_i
+            i += 2
+          when "--image-links"
+            @args[:image_links] = @argv[i + 1].to_i
+            i += 2
+          else
+            query_parts << arg
+            i += 1
+          end
+        end
+
+        @args[:query] = query_parts.join(" ")
+      end
+
+      def validate_query
+        raise ArgumentError, "Query is required" if @args[:query].nil? || @args[:query].empty?
+      end
+
+      def validate_search_type(search_type)
+        return if VALID_SEARCH_TYPES.include?(search_type)
+
+        raise ArgumentError, "Search type must be one of: #{VALID_SEARCH_TYPES.join(', ')}"
+      end
+
+      def validate_category(category)
+        return if VALID_CATEGORIES.include?(category)
+
+        raise ArgumentError, "Category must be one of: #{VALID_CATEGORIES.map { |c| "\"#{c}\"" }.join(', ')}"
+      end
+    end
+  end
+end
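Note on flag values: the parser reads `@argv[i + 1]` without checking that a value is present. In Ruby, `nil.to_i` is `0` and `"abc".to_i` is also `0`, so `--num-results` given last or given a non-numeric value silently becomes `0`, while `--include-domains` given last raises `NoMethodError` from `nil.split`. A stricter lookup could be sketched as follows. Both helper names are hypothetical and not part of the gem.

```ruby
# Hypothetical helper: returns the value following the flag at index i, raising
# ArgumentError when it is missing or looks like another long flag.
def flag_value(argv, i)
  value = argv[i + 1]
  if value.nil? || value.start_with?("--")
    raise ArgumentError, "#{argv[i]} requires a value"
  end

  value
end

# Hypothetical helper: strict base-10 conversion, so "abc" is rejected rather
# than turned into 0.
def integer_flag_value(argv, i)
  raw = flag_value(argv, i)
  Integer(raw, 10, exception: false) ||
    raise(ArgumentError, "#{argv[i]} expects an integer, got #{raw.inspect}")
end
```

Raising `ArgumentError` matches the error class the parser already uses for its type and category checks.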
data/lib/exa/client.rb CHANGED
@@ -122,32 +122,6 @@ module Exa
       Services::Context.new(connection, query: query, **params).call
     end
 
-    # Search for LinkedIn company pages
-    #
-    # Convenience method that restricts search to LinkedIn company profiles
-    # using keyword search for precise name matching.
-    #
-    # @param query [String] Company name to search
-    # @param params [Hash] Additional search parameters
-    # @option params [Integer] :numResults Number of results to return
-    # @return [Resources::SearchResult] LinkedIn company results
-    def linkedin_company(query, **params)
-      search(query, type: "keyword", includeDomains: ["linkedin.com/company"], **params)
-    end
-
-    # Search for LinkedIn profiles
-    #
-    # Convenience method that restricts search to LinkedIn individual profiles
-    # using keyword search for precise name matching.
-    #
-    # @param query [String] Person name to search
-    # @param params [Hash] Additional search parameters
-    # @option params [Integer] :numResults Number of results to return
-    # @return [Resources::SearchResult] LinkedIn profile results
-    def linkedin_person(query, **params)
-      search(query, type: "keyword", includeDomains: ["linkedin.com/in"], **params)
-    end
-
     # List all websets
     #
     # @param params [Hash] Pagination parameters
@@ -314,9 +288,12 @@ module Exa
     # List all items in a webset
     #
     # @param webset_id [String] Webset ID
-    # @return [Array<Hash>] Array of items
-    def list_items(webset_id:)
-      Services::Websets::ListItems.new(connection, webset_id: webset_id).call
+    # @param params [Hash] Pagination parameters
+    # @option params [String] :cursor Cursor for pagination
+    # @option params [Integer] :limit Maximum number of items to return (default: 20)
+    # @return [Resources::WebsetItemCollection] Paginated list of items
+    def list_items(webset_id:, **params)
+      Services::Websets::ListItems.new(connection, webset_id: webset_id, **params).call
     end
 
     # List all imports
@@ -7,7 +7,7 @@ module Exa
       ENTITY_TYPES = %w[company person article research_paper custom].freeze
 
       # Valid enrichment formats
-      ENRICHMENT_FORMATS = %w[text date number options url].freeze
+      ENRICHMENT_FORMATS = %w[text date number options email phone url].freeze
 
       # Valid source types for imports and exclusions
       SOURCE_TYPES = %w[import webset].freeze
@@ -0,0 +1,33 @@
+# frozen_string_literal: true
+
+module Exa
+  module Resources
+    # Represents a paginated list of webset items from the Exa API
+    #
+    # This class wraps the JSON response from the GET /websets/v0/websets/{id}/items endpoint
+    # and provides pagination support.
+    class WebsetItemCollection < Struct.new(
+      :data,
+      :has_more,
+      :next_cursor,
+      keyword_init: true
+    )
+      def initialize(data:, has_more: false, next_cursor: nil)
+        super
+        freeze
+      end
+
+      def empty?
+        data.empty?
+      end
+
+      def to_h
+        {
+          data: data,
+          has_more: has_more,
+          next_cursor: next_cursor
+        }
+      end
+    end
+  end
+end
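Note on `freeze`: calling `freeze` in the initializer freezes the struct itself, so its setters raise `FrozenError`, but the freeze is shallow and the `data` array can still be modified in place. The sketch below uses a stand-in class named `Page` (hypothetical, same shape as `WebsetItemCollection`) to show both effects.

```ruby
# Stand-in with the same shape as WebsetItemCollection; Page is a hypothetical
# name used only for this illustration.
class Page < Struct.new(:data, :has_more, :next_cursor, keyword_init: true)
  def initialize(data:, has_more: false, next_cursor: nil)
    super   # forwards the keyword arguments to Struct's initializer
    freeze  # shallow: freezes the struct, not the objects it references
  end
end

page = Page.new(data: [{ "id" => "item_1" }])
page.frozen?       # => true
page.data.frozen?  # => false, the array itself is still mutable
```

A caller that wants the contents to be immutable as well could pass an already frozen array, for example `Page.new(data: items.freeze)`.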
@@ -4,15 +4,21 @@ module Exa
   module Services
     module Websets
       class ListItems
-        def initialize(connection, webset_id:)
+        def initialize(connection, webset_id:, **params)
           @connection = connection
           @webset_id = webset_id
+          @params = params
         end
 
         def call
-          response = @connection.get("/websets/v0/websets/#{@webset_id}/items")
+          response = @connection.get("/websets/v0/websets/#{@webset_id}/items", @params)
           body = response.body
-          body["data"] || []
+
+          Resources::WebsetItemCollection.new(
+            data: body["data"] || [],
+            has_more: body["hasMore"] || false,
+            next_cursor: body["nextCursor"]
+          )
         end
       end
     end
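Note on pagination: `list_items` now returns a `WebsetItemCollection` instead of an `Array`, so callers that iterated the return value directly need to go through `.data`, and fetching everything means following `next_cursor` while `has_more` is true. A minimal sketch of that loop is below. The helper name `all_webset_items` is hypothetical, and `client` is assumed to be any object whose `list_items(webset_id:, **params)` returns something responding to `data`, `has_more` and `next_cursor`.

```ruby
# Hypothetical helper (not part of the gem): collects the items from every page
# by passing each response's next_cursor to the following request.
def all_webset_items(client, webset_id:, limit: 20)
  items = []
  cursor = nil

  loop do
    params = { limit: limit }
    params[:cursor] = cursor if cursor
    page = client.list_items(webset_id: webset_id, **params)

    items.concat(page.data)
    # Stop when the API reports no further pages or gives no cursor to follow.
    break unless page.has_more && page.next_cursor

    cursor = page.next_cursor
  end

  items
end
```

Because the CLI integration tests described in the README make real API calls, a loop like this is best exercised against a stub client.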
data/lib/exa/version.rb CHANGED
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Exa
-  VERSION = "0.6.1"
+  VERSION = "0.7.1"
 end
data/lib/exa.rb CHANGED
@@ -17,6 +17,7 @@ require_relative "exa/resources/webset"
 require_relative "exa/resources/webset_search"
 require_relative "exa/resources/webset_enrichment"
 require_relative "exa/resources/webset_enrichment_collection"
+require_relative "exa/resources/webset_item_collection"
 require_relative "exa/resources/import"
 require_relative "exa/resources/import_collection"
 require_relative "exa/resources/monitor"
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: exa-ai
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.6.1
4
+ version: 0.7.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Benjamin Jackson
@@ -135,6 +135,20 @@ dependencies:
135
135
  - - "~>"
136
136
  - !ruby/object:Gem::Version
137
137
  version: '0.9'
138
+ - !ruby/object:Gem::Dependency
139
+ name: dotenv
140
+ requirement: !ruby/object:Gem::Requirement
141
+ requirements:
142
+ - - "~>"
143
+ - !ruby/object:Gem::Version
144
+ version: '3.0'
145
+ type: :development
146
+ prerelease: false
147
+ version_requirements: !ruby/object:Gem::Requirement
148
+ requirements:
149
+ - - "~>"
150
+ - !ruby/object:Gem::Version
151
+ version: '3.0'
138
152
  description: A Ruby gem for interacting with the Exa.ai search and discovery API
139
153
  email:
140
154
  - ben@hearmeout.co
@@ -208,6 +222,7 @@ files:
208
222
  - lib/exa/cli/formatters/webset_item_formatter.rb
209
223
  - lib/exa/cli/formatters/webset_search_formatter.rb
210
224
  - lib/exa/cli/polling.rb
225
+ - lib/exa/cli/search_parser.rb
211
226
  - lib/exa/client.rb
212
227
  - lib/exa/connection.rb
213
228
  - lib/exa/constants/websets.rb
@@ -230,6 +245,7 @@ files:
230
245
  - lib/exa/resources/webset_collection.rb
231
246
  - lib/exa/resources/webset_enrichment.rb
232
247
  - lib/exa/resources/webset_enrichment_collection.rb
248
+ - lib/exa/resources/webset_item_collection.rb
233
249
  - lib/exa/resources/webset_search.rb
234
250
  - lib/exa/services/answer.rb
235
251
  - lib/exa/services/answer_stream.rb