reducto_ai 0.1.3 → 0.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +33 -0
- data/README.md +61 -39
- data/lib/reducto_ai/client.rb +7 -1
- data/lib/reducto_ai/config.rb +8 -4
- data/lib/reducto_ai/engine.rb +1 -1
- data/lib/reducto_ai/errors.rb +15 -0
- data/lib/reducto_ai/job_status.rb +66 -0
- data/lib/reducto_ai/rails/request_verifier.rb +29 -0
- data/lib/reducto_ai/resources/async_payload.rb +48 -0
- data/lib/reducto_ai/resources/edit.rb +29 -9
- data/lib/reducto_ai/resources/extract.rb +7 -11
- data/lib/reducto_ai/resources/jobs.rb +105 -14
- data/lib/reducto_ai/resources/parse.rb +9 -15
- data/lib/reducto_ai/resources/pipeline.rb +7 -4
- data/lib/reducto_ai/resources/split.rb +7 -13
- data/lib/reducto_ai/version.rb +1 -1
- data/lib/reducto_ai/webhooks/event.rb +68 -0
- data/lib/reducto_ai/webhooks/verifier.rb +47 -0
- data/lib/reducto_ai.rb +7 -0
- data/sig/reducto_ai.rbs +93 -1
- metadata +34 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e94dd6ff1f85cd61aed7ca9c9a1cd6a62318a99b6caa941cd4e2cbee608d69f7
+  data.tar.gz: 80da8255e7d17627bda7fa1fdcf14bb5c2fd6795514868cf939d76f135bca8e5
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7429c001e1349f3ecdaaf60c3908c6b2a730d7b1780fed07df786c755cfafcab9952dd4a3673fd3e653e3dbd848510519c17cae52d4bae0be692ea90a5b0eff6
+  data.tar.gz: a9e0d1f1e03e80f420a0a77c98789ed6ad6df2b4b94c350c63ad76091298651a91dd923bea9e541e35b7a6786af142bcd280d225aa6a4bd705dfb7c547428033
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,36 @@
+## [Unreleased]
+
+## [0.2.0] - 2026-03-30
+
+### Added
+- Async payload support for parse/extract/split/edit/pipeline using current Reducto async options
+- `Resources::Jobs` lifecycle helpers: status normalization, predicates, and `wait`
+- Svix-backed webhook verification helpers and a thin Rails request verifier
+- New job/webhook error classes and configuration for webhook secrets
+- `RateLimitError` exception class for HTTP 429 responses
+- Path traversal protection in job_id validation
+- `AsyncPayload` module for shared async options handling
+- Comprehensive test coverage for async operations, webhook verification, and job lifecycle
+
+### Changed
+- README examples now reflect current async lifecycle and webhook portal behavior
+- Test suite expanded to cover async payload translation, job polling, webhook verification, and edge cases
+- `RequestVerifier` now extracts only webhook headers (svix-*) instead of full Rack environment
+- Consolidated `normalize_input` method into shared `AsyncPayload` module
+- Simplified `Jobs` class to use `include ReductoAI::JobStatus` instead of delegation
+- Lazy-load Svix gem only when webhook verification is used
+- Case-insensitive status normalization via downcase lookup
+- Replaced deprecated `Faraday::UploadIO` with `Faraday::Multipart::FilePart`
+- HTTP 403 now handled as standard client error
+
+### Fixed
+- Security: Path traversal vulnerability in cancel/retrieve job_id interpolation
+- Security: Rack environment variable leak in webhook header extraction
+- Error handling: Added user-friendly message for invalid Svix secret format
+- Error handling: HTTP 429 (rate limit) now raises dedicated exception
+- Error handling: HTTP 403 (forbidden) now properly caught as client error
+- Code quality: Removed dead `raise_exceptions` configuration attribute
+
 ## [0.1.3] - 2026-01-04
 
 ### Added
data/README.md
CHANGED
@@ -27,49 +27,51 @@ end
 - **Split**: Use after parsing when you need logical sections. Provide `split_description` names/rules to segment the parsed document into labeled ranges.
 - **Extract**: Run when you need structured answers (fields, JSON). Supply instructions or schema to pull values from raw input or an existing parse `job_id`.
 - **Edit**: Generate marked-up PDFs using `document_url` plus `edit_instructions` (PDF forms supported via `form_schema`).
-- **Pipeline**:
+- **Pipeline**: The current gem surface remains `steps:`-based for multi-step workflows.
 
 ### Async Operations
 
-
+Async variants return immediately with a `job_id`. Use `client.jobs.wait(...)` for polling or configure Svix-backed webhooks in your app.
+
+Notes:
+- Reducto prioritizes sync jobs over async jobs.
+- Async results may be deleted on Reducto's normal 12-hour cleanup cadence unless you persist them yourself or opt into `persist_results`.
+- `client.jobs.configure_webhook` returns a Svix portal URL string.
+- `client.jobs.wait` requires either `timeout:` or `max_attempts:` so it cannot poll forever by accident.
+- `client.edit.async` is the exception to the generic async shape: Reducto's `/edit_async` endpoint only accepts top-level `priority` and `webhook`, not async `metadata`.
 
 ```ruby
 client = ReductoAI::Client.new
 
-
-
-
-
+job = client.parse.async(
+  input: "https://example.com/large-doc.pdf",
+  output_formats: { markdown: true },
+  async: {
+    priority: false,
+    webhook: { mode: "svix", channels: ["production"] },
+    metadata: { document_id: "doc-123" }
+  },
+  settings: { persist_results: true }
+)
 
-#
-# {
-# "job_id" => "async-123",
-# "status" => "processing"
-# }
+# => { "job_id" => "async-123", "status" => "Pending" }
 
-
-# API Reference: https://docs.reducto.ai/api-reference/get-job
-result = client.jobs.retrieve(job_id: job_id)
+result = client.jobs.wait(job_id: job["job_id"], interval: 2, timeout: 300)
 
-#
-# {
-# "job_id" => "async-123",
-# "status" => "complete",
-# "result" => {...},
-# "usage" => {"credits" => 1.0}
-# }
+# => { "job_id" => "async-123", "status" => "Completed", "result" => {...} }
 
-
-#
-client.jobs.configure_webhook
+portal_url = client.jobs.configure_webhook
+# => "https://dashboard.svix.com/..."
 ```
 
-Available async
-- `client.parse.async(input:, **options)`
-- `client.extract.async(input:, instructions:, **options)`
-- `client.split.async(input:, **options)`
-- `client.edit.async(input:, instructions:, **options)`
-- `client.pipeline.async(input:, steps:, **options)`
+Available async helpers:
+- `client.parse.async(input:, async:, **options)`
+- `client.extract.async(input:, instructions:, async:, **options)`
+- `client.split.async(input:, async:, **options)`
+- `client.edit.async(input:, instructions:, async:, **options)` where `async:` may only include `priority` and `webhook`
+- `client.pipeline.async(input:, steps:, async:, **options)`
+- `client.jobs.wait(job_id:, interval: 2, timeout: nil, max_attempts: nil, raise_on_failure: true)`
+- `client.jobs.pending?/in_progress?/completing?/completed?/failed?/terminal?`
 
 ### Rails
 
@@ -78,14 +80,38 @@ Create `config/initializers/reducto_ai.rb`:
 ```ruby
 ReductoAI.configure do |c|
   c.api_key = Rails.application.credentials.dig(:reducto, :api_key)
+  c.webhook_secret = Rails.application.credentials.dig(:reducto, :webhook_secret)
   # c.base_url = "https://platform.reducto.ai"
   # c.open_timeout = 5; c.read_timeout = 30
 end
+```
 
-
-
+In your host app, own the route/controller/job:
+
+```ruby
+# config/routes.rb
+post "/webhooks/reducto", to: "reducto_webhooks#create"
 ```
 
+```ruby
+class ReductoWebhooksController < ActionController::API
+  def create
+    event = ReductoAI::Rails::RequestVerifier.verify!(request)
+
+    return head :ok if WebhookDelivery.exists?(provider: "reducto", delivery_id: event.svix_id)
+
+    WebhookDelivery.create!(provider: "reducto", delivery_id: event.svix_id, job_id: event.job_id)
+    ReductoWebhookJob.perform_later(event.job_id, event.svix_id)
+
+    head :ok
+  rescue ReductoAI::WebhookVerificationError
+    head :unauthorized
+  end
+end
+```
+
+Return 2xx quickly, dedupe on `svix-id`, and fetch/store final results in the background job.
+
 ### Quick Start
 
 ```ruby
@@ -99,7 +125,7 @@ job_id = parse["job_id"]
 # Response:
 # {
 # "job_id" => "abc-123",
-# "status" => "
+# "status" => "Completed",
 # "result" => {...}
 # }
 
@@ -141,7 +167,7 @@ parse = client.parse.sync(input: "https://example.com/invoices.pdf")
 # Response:
 # {
 # "job_id" => "parse-123",
-# "status" => "
+# "status" => "Completed",
 # "result" => {...}
 # }
 
@@ -362,11 +388,7 @@ job = client.parse.async(input: large_pdf_url)
 job_id = job["job_id"]
 
 # Poll or use webhooks
-
-  result = client.jobs.retrieve(job_id: job_id)
-  break if result["status"] == "complete"
-  sleep 2
-end
+result = client.jobs.wait(job_id: job_id, interval: 2, timeout: 300)
 
 # Then reuse the job_id for split/extract
 split = client.split.sync(input: job_id, split_description: [...])
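The README note that `client.jobs.wait` needs either `timeout:` or `max_attempts:` can be illustrated without the gem. The following is a minimal sketch under stated assumptions, not the gem's implementation: `poll_until_terminal`, `PollTimeout`, the `fetch` lambda, and the injected `sleeper` are hypothetical names, and the terminal statuses are hard-coded.

```ruby
# Hypothetical stand-in for a bounded polling loop; not the gem's code.
class PollTimeout < StandardError; end

def poll_until_terminal(fetch:, interval: 2, timeout: nil, max_attempts: nil,
                        sleeper: ->(seconds) { sleep(seconds) })
  # Refuse to start an unbounded loop.
  raise ArgumentError, "timeout or max_attempts is required" if timeout.nil? && max_attempts.nil?

  started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  attempts = 0

  loop do
    attempts += 1
    response = fetch.call
    return response if %w[Completed Failed].include?(response["status"])

    raise PollTimeout, "attempt limit reached" if max_attempts && attempts >= max_attempts

    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
    raise PollTimeout, "timeout exceeded" if timeout && elapsed >= timeout

    sleeper.call(interval)
  end
end

# Simulated job that reaches a terminal state on the third poll.
statuses = %w[Pending InProgress Completed]
fetch = -> { { "status" => statuses.shift } }
result = poll_until_terminal(fetch: fetch, interval: 0, max_attempts: 5)
puts result["status"] # prints "Completed"
```

Using a monotonic clock for the elapsed-time check keeps the bound stable if the wall clock is adjusted while polling.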
data/lib/reducto_ai/client.rb
CHANGED
@@ -3,6 +3,7 @@
 require "faraday"
 require "json"
 require "faraday/multipart"
+require_relative "resources/async_payload"
 require_relative "resources/parse"
 require_relative "resources/extract"
 require_relative "resources/split"
@@ -176,6 +177,7 @@ module ReductoAI
 
       parsed_body = parse_error_body(body)
       return handle_auth_error(parsed_body, status) if status == 401
+      return handle_rate_limit_error(parsed_body, status) if status == 429
       return handle_client_error(parsed_body, status) if client_error?(status)
       return handle_server_error(parsed_body, status) if server_error?(status)
 
@@ -214,7 +216,7 @@ module ReductoAI
     end
 
     def client_error?(status)
-      [400, 404, 422].include?(status)
+      [400, 403, 404, 422].include?(status)
     end
 
     def server_error?(status)
@@ -225,6 +227,10 @@ module ReductoAI
       raise AuthenticationError.new("Unauthorized (401): check API key", status: status, body: body)
     end
 
+    def handle_rate_limit_error(body, status)
+      raise RateLimitError.new(error_message(status, body), status: status, body: body)
+    end
+
     def handle_client_error(body, status)
       raise ClientError.new(error_message(status, body), status: status, body: body)
     end
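The status dispatch changed in `client.rb` can be read as a small decision table. The sketch below mirrors that order but returns a label instead of raising, so each branch is easy to inspect. `classify_status` and the `:unhandled` fallback are hypothetical, and the 5xx range is an assumption because the body of `server_error?` is not part of the diff.

```ruby
# Hypothetical mirror of the dispatch order in client.rb; labels replace raises.
CLIENT_STATUSES = [400, 403, 404, 422].freeze

def classify_status(status)
  return :authentication_error if status == 401
  return :rate_limit_error if status == 429 # dedicated branch added in 0.2.0
  return :client_error if CLIENT_STATUSES.include?(status)
  return :server_error if (500..599).cover?(status) # assumption: range not shown in the diff

  :unhandled # assumption: the real fallback is outside the hunk
end

[401, 403, 429, 422, 503].each do |status|
  puts "#{status} -> #{classify_status(status)}"
end
```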
data/lib/reducto_ai/config.rb
CHANGED
@@ -38,8 +38,11 @@ module ReductoAI
     # @return [Integer] Request read timeout in seconds (default: 30)
     attr_accessor :read_timeout
 
-    # @return [
-    attr_accessor :
+    # @return [String, nil] Svix webhook signing secret
+    attr_accessor :webhook_secret
+
+    # @return [Proc, nil] Proc that resolves webhook secret from request headers
+    attr_accessor :webhook_secret_resolver
 
     # @return [Logger] Logger instance for debugging
     attr_writer :logger
@@ -50,7 +53,8 @@ module ReductoAI
       @base_url = ENV.fetch("REDUCTO_BASE_URL", "https://platform.reducto.ai")
       @open_timeout = integer_or_default("REDUCTO_OPEN_TIMEOUT", 5)
       @read_timeout = integer_or_default("REDUCTO_READ_TIMEOUT", 30)
-      @
+      @webhook_secret = ENV.fetch("REDUCTO_WEBHOOK_SECRET", nil)
+      @webhook_secret_resolver = nil
     end
 
     # Returns the logger instance.
@@ -59,7 +63,7 @@ module ReductoAI
     #
     # @return [Logger] the logger instance
     def logger
-      @logger ||= (defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger) || Logger.new($stderr)
+      @logger ||= (defined?(::Rails) && ::Rails.respond_to?(:logger) && ::Rails.logger) || Logger.new($stderr)
     end
 
     private
data/lib/reducto_ai/engine.rb
CHANGED
data/lib/reducto_ai/errors.rb
CHANGED
@@ -57,6 +57,12 @@ module ReductoAI
   # # => ReductoAI::ClientError: HTTP 400: Invalid input URL
   class ClientError < Error; end
 
+  # Raised on 429 Too Many Requests responses.
+  #
+  # Indicates API rate limit has been exceeded. Consumers should implement
+  # retry logic with backoff.
+  class RateLimitError < ClientError; end
+
   # Raised on 5xx server errors.
   #
   # Indicates Reducto API internal errors or temporary failures.
@@ -77,4 +83,13 @@ module ReductoAI
   # client.parse.sync(input: "https://example.com/large-doc.pdf")
   # # => ReductoAI::NetworkError: Network error: execution expired
   class NetworkError < Error; end
+
+  # Raised when waiting for an async job exceeds the configured timeout or attempt limit.
+  class JobTimeoutError < Error; end
+
+  # Raised when an async job reaches a failed terminal state.
+  class JobFailedError < Error; end
+
+  # Raised when webhook signature verification fails.
+  class WebhookVerificationError < Error; end
 end
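The `RateLimitError` comment suggests that consumers retry with backoff. A minimal sketch of that idea follows, under stated assumptions: the `Sketch` module only mirrors the inheritance shown in the diff (with `Error < StandardError` assumed, since the base class is outside the hunk), and `with_backoff` with its `sleeper` parameter is a hypothetical helper, not part of the gem.

```ruby
# Stand-in classes mirroring the hierarchy in the diff (RateLimitError < ClientError).
module Sketch
  class Error < StandardError; end # assumption: base class is not shown in the hunk
  class ClientError < Error; end
  class RateLimitError < ClientError; end
end

# Hypothetical helper: retries only rate-limit errors, doubling the delay each time.
def with_backoff(max_retries: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield
  rescue Sketch::RateLimitError
    raise if attempt >= max_retries

    sleeper.call(base_delay * (2**attempt))
    attempt += 1
    retry
  end
end

delays = []
calls = 0
value = with_backoff(sleeper: ->(seconds) { delays << seconds }) do
  calls += 1
  raise Sketch::RateLimitError, "HTTP 429" if calls < 3

  :ok
end
p value  # prints :ok
p delays # prints [0.5, 1.0]
```

Because `RateLimitError` inherits from `ClientError`, a plain `rescue ClientError` would also catch 429 responses; code that wants to treat them differently has to rescue the subclass first.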
data/lib/reducto_ai/job_status.rb
ADDED
@@ -0,0 +1,66 @@
+# frozen_string_literal: true
+
+module ReductoAI
+  module JobStatus
+    extend self
+
+    STATUS_MAP = {
+      "pending" => "Pending",
+      "idle" => "Pending",
+      "inprogress" => "InProgress",
+      "processing" => "InProgress",
+      "running" => "InProgress",
+      "completing" => "Completing",
+      "completed" => "Completed",
+      "complete" => "Completed",
+      "succeeded" => "Completed",
+      "failed" => "Failed"
+    }.freeze
+
+    def normalize_status(value_or_response)
+      raw_status = extract_status(value_or_response)
+      return nil if raw_status.nil?
+
+      STATUS_MAP.fetch(raw_status.to_s.downcase, raw_status.to_s)
+    end
+
+    def pending?(value_or_response)
+      normalize_status(value_or_response) == "Pending"
+    end
+
+    def in_progress?(value_or_response)
+      normalize_status(value_or_response) == "InProgress"
+    end
+
+    def completing?(value_or_response)
+      normalize_status(value_or_response) == "Completing"
+    end
+
+    def completed?(value_or_response)
+      normalize_status(value_or_response) == "Completed"
+    end
+
+    def failed?(value_or_response)
+      normalize_status(value_or_response) == "Failed"
+    end
+
+    def terminal?(value_or_response)
+      completed?(value_or_response) || failed?(value_or_response)
+    end
+
+    private
+
+    def extract_status(value_or_response)
+      case value_or_response
+      when Hash
+        value_or_response["status"] || value_or_response[:status]
+      else
+        if value_or_response.respond_to?(:status)
+          value_or_response.status
+        else
+          value_or_response
+        end
+      end
+    end
+  end
+end
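The effect of the downcase lookup in `normalize_status` is easiest to see with a few inputs. The sketch below copies the lookup table from the diff into a standalone script; `normalize` is a hypothetical free function that skips the Hash/object extraction step and takes the raw status directly.

```ruby
# Copy of the lookup table from the diff, for illustration only.
STATUS_MAP = {
  "pending" => "Pending", "idle" => "Pending",
  "inprogress" => "InProgress", "processing" => "InProgress", "running" => "InProgress",
  "completing" => "Completing",
  "completed" => "Completed", "complete" => "Completed", "succeeded" => "Completed",
  "failed" => "Failed"
}.freeze

# Hypothetical helper: case-insensitive lookup with pass-through for unknown values.
def normalize(raw)
  return nil if raw.nil?

  STATUS_MAP.fetch(raw.to_s.downcase, raw.to_s)
end

p normalize("PROCESSING")  # => "InProgress"
p normalize(:complete)     # => "Completed"
p normalize("in_progress") # => "in_progress" (no matching key, returned unchanged)
p normalize(nil)           # => nil
```

Unknown statuses fall through unchanged, so every predicate simply returns false for them instead of raising.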
data/lib/reducto_ai/rails/request_verifier.rb
ADDED
@@ -0,0 +1,29 @@
+# frozen_string_literal: true
+
+module ReductoAI
+  module Rails
+    class RequestVerifier
+      WEBHOOK_HEADERS = %w[svix-id svix-timestamp svix-signature].freeze
+      private_constant :WEBHOOK_HEADERS
+
+      class << self
+        def verify!(request, secret: nil)
+          payload = request.raw_post
+          headers = extract_webhook_headers(request)
+          verified_payload = Webhooks::Verifier.verify!(payload: payload, headers: headers, secret: secret)
+
+          Webhooks::Event.parse(verified_payload, headers: headers)
+        end
+
+        private
+
+        def extract_webhook_headers(request)
+          WEBHOOK_HEADERS.each_with_object({}) do |key, h|
+            value = request.headers[key]
+            h[key] = value.to_s if value
+          end
+        end
+      end
+    end
+  end
+end
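The allowlist approach in `extract_webhook_headers` can be tried outside Rails. In this sketch a plain Hash stands in for `request.headers` (an assumption; the real object is a Rails headers collection), and the header values are made-up examples.

```ruby
WEBHOOK_HEADERS = %w[svix-id svix-timestamp svix-signature].freeze

# `headers` is any object that responds to #[]; a Hash stands in for request.headers here.
def extract_webhook_headers(headers)
  WEBHOOK_HEADERS.each_with_object({}) do |key, picked|
    value = headers[key]
    picked[key] = value.to_s if value
  end
end

# Made-up request headers, including values that should never reach the verifier.
incoming = {
  "svix-id" => "msg_123",
  "svix-timestamp" => 1_700_000_000,
  "svix-signature" => "v1,abc",
  "cookie" => "session=secret",
  "SERVER_SOFTWARE" => "puma"
}

picked = extract_webhook_headers(incoming)
puts picked.keys.join(", ") # prints "svix-id, svix-timestamp, svix-signature"
```

Only the three named headers are copied and each value is stringified, so cookies and server environment entries stay out of the hash handed to the verifier.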
data/lib/reducto_ai/resources/async_payload.rb
ADDED
@@ -0,0 +1,48 @@
+# frozen_string_literal: true
+
+module ReductoAI
+  module Resources
+    module AsyncPayload
+      private
+
+      def normalize_input(input)
+        return input unless input.is_a?(Hash)
+
+        input[:url] || input["url"] || input
+      end
+
+      def apply_async_payload!(payload, async)
+        normalized_async = normalize_async_payload(async)
+        payload[:async] = normalized_async unless normalized_async.nil?
+        payload
+      end
+
+      def normalize_async_payload(async)
+        case async
+        when nil, false
+          nil
+        when true
+          {}
+        when Hash
+          deep_compact(async)
+        else
+          raise ArgumentError, "async must be a Hash, true, false, or nil"
+        end
+      end
+
+      def deep_compact(value)
+        case value
+        when Hash
+          value.each_with_object({}) do |(key, child_value), compacted|
+            normalized_child = deep_compact(child_value)
+            compacted[key] = normalized_child unless normalized_child.nil?
+          end
+        when Array
+          value.map { |child_value| deep_compact(child_value) }.compact
+        else
+          value
+        end
+      end
+    end
+  end
+end
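To see what `deep_compact` does to a typical `async:` hash, the method can be run standalone. The sketch below repeats the method body from the diff as a top-level function; the sample options are made up for illustration.

```ruby
# Same logic as the private method in the diff, lifted to a top-level function.
def deep_compact(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), compacted|
      normalized = deep_compact(child)
      compacted[key] = normalized unless normalized.nil?
    end
  when Array
    value.map { |child| deep_compact(child) }.compact
  else
    value
  end
end

# Made-up async options with nils at several depths.
async = {
  priority: false,
  webhook: { mode: "svix", channels: ["production", nil], url: nil },
  metadata: nil
}

compacted = deep_compact(async)
p compacted.keys           # prints [:priority, :webhook]
p compacted[:webhook].keys # prints [:mode, :channels]
```

Only `nil` is dropped: `false` survives, and a nested hash whose values were all `nil` is kept as an empty hash, not removed.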
data/lib/reducto_ai/resources/edit.rb
CHANGED
@@ -18,6 +18,8 @@ module ReductoAI
     # @note Edit operations consume credits based on document size and
     # instruction complexity.
     class Edit
+      include AsyncPayload
+
       # @param client [Client] the Reducto API client
       # @api private
       def initialize(client)
@@ -33,7 +35,7 @@ module ReductoAI
       #
       # @return [Hash] Edit results with keys:
       # * "job_id" [String] - Job identifier
-      # * "status" [String] - Job status ("
+      # * "status" [String] - Job status ("Completed")
       # * "result" [Hash] - Contains "document_url" with marked PDF
       # * "usage" [Hash] - Credit usage details
       #
@@ -64,12 +66,14 @@ module ReductoAI
       #
       # @param input [String, Hash] Document URL or hash with :url key
       # @param instructions [String] Natural language editing instructions
-      # @param async [Boolean, nil] Async
+      # @param async [Boolean, Hash, nil] Async options. `true` keeps the legacy no-options call,
+      # while a hash is translated to Reducto's current top-level edit async fields.
+      # `/edit_async` only accepts `priority` and `webhook`, not generic async metadata.
       # @param options [Hash] Additional editing options
       #
       # @return [Hash] Job status with keys:
       # * "job_id" [String] - Job identifier for polling
-      # * "status" [String] - Initial status ("
+      # * "status" [String] - Initial status ("Pending")
       #
       # @raise [ArgumentError] if input or instructions are nil/empty
       #
@@ -88,8 +92,9 @@ module ReductoAI
           raise ArgumentError, "instructions are required"
         end
 
-        payload = build_payload(input, instructions,
-        payload
+        payload = build_payload(input, instructions, {})
+        payload.merge!(translate_async_options(async))
+        payload.merge!(options.compact)
 
         @client.post("/edit_async", payload)
       end
@@ -102,11 +107,26 @@ module ReductoAI
         { document_url: document_url, edit_instructions: instructions, **options }.compact
       end
 
-      #
-
-
+      # Edit API uses top-level async keys (priority, webhook) rather than
+      # the nested `async` object used by other resources. This mirrors the
+      # Reducto API design where edit_async accepts these fields at root level.
+      def translate_async_options(async)
+        case async
+        when nil, false, true
+          {}
+        when Hash
+          normalized_async = async.each_with_object({}) do |(key, value), normalized|
+            normalized[key.to_sym] = value
+          end
+          unsupported_keys = normalized_async.keys - %i[priority webhook]
+          unless unsupported_keys.empty?
+            raise ArgumentError, "unsupported async options: #{unsupported_keys.join(", ")}"
+          end
 
-
+          normalized_async.compact
+        else
+          raise ArgumentError, "async must be a Hash, true, false, or nil"
+        end
       end
     end
   end
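The difference between the nested `async` object used by most resources and the top-level fields used by `/edit_async` can be shown side by side. This is a sketch under stated assumptions: `nested_async_payload` and `edit_async_payload` are hypothetical helpers that only reproduce the payload shapes, and the URLs are placeholders.

```ruby
EDIT_ASYNC_KEYS = %i[priority webhook].freeze

# Other resources: async options travel as one nested `async` object.
def nested_async_payload(base, async)
  async.nil? ? base : base.merge(async: async)
end

# Edit: the same options are lifted to the top level; any other key is rejected.
def edit_async_payload(base, async)
  normalized = async.transform_keys(&:to_sym)
  unsupported = normalized.keys - EDIT_ASYNC_KEYS
  raise ArgumentError, "unsupported async options: #{unsupported.join(', ')}" unless unsupported.empty?

  base.merge(normalized.compact)
end

opts = { "priority" => true, webhook: { mode: "svix" } }
parse_payload = nested_async_payload({ input: "https://example.com/a.pdf" }, opts)
edit_payload = edit_async_payload({ document_url: "https://example.com/a.pdf" }, opts)

p parse_payload.keys # prints [:input, :async]
p edit_payload.keys  # prints [:document_url, :priority, :webhook]

begin
  edit_async_payload({}, { metadata: { document_id: "doc-123" } })
rescue ArgumentError => e
  puts e.message # prints "unsupported async options: metadata"
end
```

Rejecting unknown keys up front surfaces a mistaken `metadata:` as an `ArgumentError` at the call site, before any request is sent.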
data/lib/reducto_ai/resources/extract.rb
CHANGED
@@ -24,6 +24,8 @@ module ReductoAI
     # @note Extraction operations consume credits based on document complexity
     # and schema size.
     class Extract
+      include AsyncPayload
+
       # @param client [Client] the Reducto API client
       # @api private
       def initialize(client)
@@ -40,7 +42,7 @@ module ReductoAI
       #
       # @return [Hash] Extraction results with keys:
       # * "job_id" [String] - Job identifier
-      # * "status" [String] - Job status ("
+      # * "status" [String] - Job status ("Completed")
       # * "result" [Hash] - Extracted data matching schema
       # * "usage" [Hash] - Credit usage details
       #
@@ -74,12 +76,13 @@ module ReductoAI
       #
       # @param input [String, Hash] Document URL or hash with :url key
       # @param instructions [Hash, String] Extraction schema (same as {#sync})
-      # @param async [Boolean, nil] Async
+      # @param async [Boolean, Hash, nil] Async options. `true` becomes an empty async payload,
+      # while a hash is sent as Reducto's nested `async` object.
       # @param options [Hash] Additional extraction options
       #
       # @return [Hash] Job status with keys:
       # * "job_id" [String] - Job identifier for polling
-      # * "status" [String] - Initial status ("
+      # * "status" [String] - Initial status ("Pending")
       #
       # @raise [ArgumentError] if input or instructions are nil/empty
       #
@@ -99,7 +102,7 @@ module ReductoAI
         end
 
         payload = build_payload(input, instructions, options)
-        payload
+        apply_async_payload!(payload, async)
 
         @client.post("/extract_async", payload)
       end
@@ -114,13 +117,6 @@ module ReductoAI
         { input: normalized_input, instructions: normalized_instructions, **options }.compact
       end
 
-      # @private
-      def normalize_input(input)
-        return input unless input.is_a?(Hash)
-
-        input[:url] || input["url"] || input
-      end
-
       # @private
       def normalize_instructions(instructions)
         return { schema: instructions } unless instructions.is_a?(Hash)
|
@@ -13,7 +13,7 @@ module ReductoAI
|
|
|
13
13
|
#
|
|
14
14
|
# loop do
|
|
15
15
|
# status = client.jobs.retrieve(job_id: job["job_id"])
|
|
16
|
-
# break if status
|
|
16
|
+
# break if client.jobs.completed?(status)
|
|
17
17
|
# sleep 2
|
|
18
18
|
# end
|
|
19
19
|
# result = status["result"]
|
|
@@ -23,6 +23,8 @@ module ReductoAI
|
|
|
23
23
|
# document_url = upload_result["url"]
|
|
24
24
|
# client.parse.sync(input: document_url)
|
|
25
25
|
class Jobs
|
|
26
|
+
include ReductoAI::JobStatus
|
|
27
|
+
|
|
26
28
|
# @param client [Client] the Reducto API client
|
|
27
29
|
# @api private
|
|
28
30
|
def initialize(client)
|
|
@@ -43,7 +45,7 @@ module ReductoAI
|
|
|
43
45
|
# Lists jobs with optional filtering.
|
|
44
46
|
#
|
|
45
47
|
# @param options [Hash] Query parameters for filtering
|
|
46
|
-
# @option options [String] :status Filter by job status
|
|
48
|
+
# @option options [String] :status Filter by raw job status returned by Reducto
|
|
47
49
|
# @option options [Integer] :limit Maximum number of jobs to return
|
|
48
50
|
# @option options [Integer] :offset Pagination offset
|
|
49
51
|
#
|
|
@@ -59,7 +61,8 @@ module ReductoAI
|
|
|
59
61
|
# @see https://docs.reducto.ai/api-reference/jobs
|
|
60
62
|
def list(**options)
|
|
61
63
|
params = options.compact
|
|
62
|
-
@client.request(:get, "/jobs", params: params)
|
|
64
|
+
response = @client.request(:get, "/jobs", params: params)
|
|
65
|
+
normalize_job_list(response)
|
|
63
66
|
end
|
|
64
67
|
|
|
65
68
|
# Cancels a running async job.
|
|
@@ -76,7 +79,7 @@ module ReductoAI
|
|
|
76
79
|
#
|
|
77
80
|
# @see https://docs.reducto.ai/api-reference/cancel
|
|
78
81
|
def cancel(job_id:)
|
|
79
|
-
|
|
82
|
+
validate_job_id!(job_id)
|
|
80
83
|
|
|
81
84
|
@client.request(:post, "/cancel/#{job_id}")
|
|
82
85
|
end
|
|
@@ -90,9 +93,9 @@ module ReductoAI
|
|
|
90
93
|
#
|
|
91
94
|
# @return [Hash] Job status with keys:
|
|
92
95
|
# * "job_id" [String] - Job identifier
|
|
93
|
-
# * "status" [String] - Current status ("
|
|
94
|
-
# * "result" [Hash] - Results (only present when
|
|
95
|
-
# * "error" [String] - Error message (only present when
|
|
96
|
+
# * "status" [String] - Current raw Reducto status (for example "Pending", "Completed", "Failed")
|
|
97
|
+
# * "result" [Hash] - Results (only present when the job completed)
|
|
98
|
+
# * "error" [String] - Error message (only present when the job failed)
|
|
96
99
|
#
|
|
97
100
|
# @raise [ArgumentError] if job_id is nil or empty
|
|
98
101
|
# @raise [ClientError] if job doesn't exist
|
|
@@ -100,13 +103,13 @@ module ReductoAI
|
|
|
100
103
|
# @example Poll until complete
|
|
101
104
|
# loop do
|
|
102
105
|
# status = client.jobs.retrieve(job_id: job_id)
|
|
103
|
-
# break if
|
|
106
|
+
# break if client.jobs.terminal?(status)
|
|
104
107
|
# sleep 2
|
|
105
108
|
# end
|
|
106
109
|
#
|
|
107
110
|
# @see https://docs.reducto.ai/api-reference/job
|
|
108
111
|
def retrieve(job_id:)
|
|
109
|
-
|
|
112
|
+
validate_job_id!(job_id)
|
|
110
113
|
|
|
111
114
|
@client.request(:get, "/job/#{job_id}")
|
|
112
115
|
end
|
|
@@ -149,18 +152,107 @@ module ReductoAI
|
|
|
149
152
|
|
|
150
153
|
# Configures webhook notifications for async jobs.
|
|
151
154
|
#
|
|
152
|
-
# @return [
|
|
155
|
+
# @return [String] Svix portal URL for webhook configuration
|
|
153
156
|
#
|
|
154
157
|
# @example
|
|
155
158
|
# client.jobs.configure_webhook
|
|
156
159
|
#
|
|
157
|
-
# @see https://docs.reducto.ai/api-reference/
|
|
160
|
+
# @see https://docs.reducto.ai/api-reference/webhook-portal
|
|
158
161
|
def configure_webhook
|
|
159
|
-
@client.request(:post, "/configure_webhook")
|
|
162
|
+
response = @client.request(:post, "/configure_webhook")
|
|
163
|
+
normalize_webhook_portal_url(response)
|
|
164
|
+
end
|
|
165
|
+
|
|
166
|
+
def wait(job_id:, interval: 2, timeout: nil, max_attempts: nil, raise_on_failure: true)
|
|
167
|
+
validate_wait_arguments!(interval: interval, timeout: timeout, max_attempts: max_attempts)
|
|
168
|
+
|
|
169
|
+
started_at = monotonic_time
|
|
170
|
+
attempts = 0
|
|
171
|
+
|
|
172
|
+
loop do
|
|
173
|
+
attempts += 1
|
|
174
|
+
response = retrieve(job_id: job_id)
|
|
175
|
+
terminal_response = resolve_terminal_response(job_id, response, raise_on_failure)
|
|
176
|
+
return terminal_response if terminal_response
|
|
177
|
+
|
|
178
|
+
raise_if_attempt_limit_reached!(job_id, response, attempts, max_attempts)
|
|
179
|
+
raise_if_timeout_exceeded!(job_id, response, started_at, timeout)
|
|
180
|
+
sleep(interval)
|
|
181
|
+
end
|
|
160
182
|
end
|
|
161
183
|
|
|
162
184
|
private
|
|
163
185
|
|
|
186
|
+
def validate_job_id!(job_id)
|
|
187
|
+
raise ArgumentError, "job_id is required" if job_id.nil? || job_id.to_s.strip.empty?
|
|
188
|
+
raise ArgumentError, "job_id contains invalid characters" unless job_id.to_s.match?(/\A[\w\-.]+\z/)
|
|
189
|
+
end
|
|
190
|
+
|
|
191
|
+
def validate_wait_arguments!(interval:, timeout:, max_attempts:)
|
|
192
|
+
validate_wait_interval!(interval)
|
|
193
|
+
validate_wait_timeout!(timeout)
|
|
194
|
+
validate_wait_max_attempts!(max_attempts)
|
|
195
|
+
validate_wait_bounds!(timeout, max_attempts)
|
|
196
|
+
end
|
|
197
|
+
|
|
198
|
+
def validate_wait_interval!(interval)
|
|
199
|
+
raise ArgumentError, "interval must be non-negative" if interval.negative?
|
|
200
|
+
end
|
|
201
|
+
|
|
202
|
+
def validate_wait_timeout!(timeout)
|
|
203
|
+
raise ArgumentError, "timeout must be non-negative" if timeout&.negative?
|
|
204
|
+
end
|
|
205
|
+
|
|
206
|
+
def validate_wait_max_attempts!(max_attempts)
|
|
207
|
+
raise ArgumentError, "max_attempts must be positive" if max_attempts && max_attempts < 1
|
|
208
|
+
end
|
|
209
|
+
|
|
210
|
+
def validate_wait_bounds!(timeout, max_attempts)
|
|
211
|
+
raise ArgumentError, "timeout or max_attempts is required" if timeout.nil? && max_attempts.nil?
|
|
212
|
+
end
|
|
213
|
+
|
|
214
|
+
def resolve_terminal_response(job_id, response, raise_on_failure)
|
|
215
|
+
return response if completed?(response)
|
|
216
|
+
return nil unless failed?(response)
|
|
217
|
+
return response unless raise_on_failure
|
|
218
|
+
|
|
219
|
+
raise JobFailedError.new(response["error"] || "Job #{job_id} failed", body: response)
|
|
220
|
+
end
|
|
221
|
+
|
|
222
|
+
def raise_if_attempt_limit_reached!(job_id, response, attempts, max_attempts)
|
|
223
|
+
return unless max_attempts && attempts >= max_attempts
|
|
224
|
+
|
|
225
|
+
raise JobTimeoutError.new("Timed out waiting for job #{job_id}", body: response)
|
|
226
|
+
end
|
|
227
|
+
|
|
228
|
+
def raise_if_timeout_exceeded!(job_id, response, started_at, timeout)
|
|
229
|
+
return unless timeout && (monotonic_time - started_at) >= timeout
|
|
230
|
+
|
|
231
|
+
raise JobTimeoutError.new("Timed out waiting for job #{job_id}", body: response)
|
|
232
|
+
end
|
|
233
|
+
|
|
234
|
+
def monotonic_time
|
|
235
|
+
Process.clock_gettime(Process::CLOCK_MONOTONIC)
|
|
236
|
+
end
|
|
237
|
+
|
|
238
|
+
def normalize_job_list(response)
|
|
239
|
+
return response if response.is_a?(Hash)
|
|
240
|
+
return { "results" => response, "next_cursor" => nil } if response.is_a?(Array)
|
|
241
|
+
|
|
242
|
+
response
|
|
243
|
+
end
|
|
244
|
+
|
|
245
|
+
def normalize_webhook_portal_url(response)
|
|
246
|
+
return response if response.is_a?(String)
|
|
247
|
+
|
|
248
|
+
if response.is_a?(Hash)
|
|
249
|
+
portal_url = response["portal_url"] || response[:portal_url] || response["url"] || response[:url]
|
|
250
|
+
return portal_url if portal_url.is_a?(String)
|
|
251
|
+
end
|
|
252
|
+
|
|
253
|
+
raise ServerError.new("Unexpected webhook portal response", body: response)
|
|
254
|
+
end
|
|
255
|
+
|
|
164
256
|
# @private
|
|
165
257
|
def build_upload_io(file)
|
|
166
258
|
if file.is_a?(String)
|
|
@@ -173,9 +265,8 @@ module ReductoAI
|
|
|
173
265
|
else
|
|
174
266
|
"upload"
|
|
175
267
|
end
|
|
176
|
-
|
|
177
268
|
end
|
|
178
|
-
Faraday::
|
|
269
|
+
Faraday::Multipart::FilePart.new(file, "application/octet-stream", filename)
|
|
179
270
|
end
|
|
180
271
|
end
|
|
181
272
|
end
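The private helpers added to jobs.rb above (argument validation, attempt and timeout checks, a monotonic clock) suggest a bounded polling loop. The body of the public `wait` method is not visible in this part of the diff, so the following is only a minimal sketch of how such helpers could compose. `poll_until` and `PollTimeout` are hypothetical names used for illustration and are not part of the gem.

```ruby
# Hypothetical stand-in for the gem's JobTimeoutError.
class PollTimeout < StandardError; end

# Monotonic clock: unaffected by wall-clock adjustments such as NTP corrections.
def monotonic_time
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Yields the attempt number until the block returns a non-nil value.
# Mirrors the validations in the diff: at least one bound must be given.
def poll_until(interval: 0, timeout: nil, max_attempts: nil)
  raise ArgumentError, "interval must be non-negative" if interval.negative?
  raise ArgumentError, "timeout must be non-negative" if timeout&.negative?
  raise ArgumentError, "max_attempts must be positive" if max_attempts && max_attempts < 1
  raise ArgumentError, "timeout or max_attempts is required" if timeout.nil? && max_attempts.nil?

  started_at = monotonic_time
  attempts = 0

  loop do
    attempts += 1
    result = yield(attempts)
    return result unless result.nil?

    # Check both bounds before sleeping so the loop never waits needlessly.
    raise PollTimeout, "attempt limit reached" if max_attempts && attempts >= max_attempts
    raise PollTimeout, "timeout exceeded" if timeout && (monotonic_time - started_at) >= timeout

    sleep(interval)
  end
end
```

Requiring either `timeout` or `max_attempts` means the loop cannot run forever, and measuring elapsed time with `CLOCK_MONOTONIC` keeps the timeout stable if the system clock is adjusted while waiting.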
|
data/lib/reducto_ai/resources/parse.rb
CHANGED
|
@@ -29,6 +29,8 @@ module ReductoAI
|
|
|
29
29
|
# @note Each parse operation consumes credits based on document complexity.
|
|
30
30
|
# See Reducto documentation for pricing details.
|
|
31
31
|
class Parse
|
|
32
|
+
include AsyncPayload
|
|
33
|
+
|
|
32
34
|
# @param client [Client] the Reducto API client
|
|
33
35
|
# @api private
|
|
34
36
|
def initialize(client)
|
|
@@ -48,7 +50,7 @@ module ReductoAI
|
|
|
48
50
|
#
|
|
49
51
|
# @return [Hash] Parsed document with keys:
|
|
50
52
|
# * "job_id" [String] - Job identifier
|
|
51
|
-
# * "status" [String] - Job status ("
|
|
53
|
+
# * "status" [String] - Job status ("Completed")
|
|
52
54
|
# * "result" [Hash] - Parsed content by format (e.g., "markdown", "html")
|
|
53
55
|
# * "usage" [Hash] - Credit usage details
|
|
54
56
|
#
|
|
@@ -76,12 +78,13 @@ module ReductoAI
|
|
|
76
78
|
# Returns immediately with a job_id. Poll with {Jobs#retrieve} to get results.
|
|
77
79
|
#
|
|
78
80
|
# @param input [String, Hash] Document URL or hash with :url key
|
|
79
|
-
# @param async [Boolean, nil] Async
|
|
81
|
+
# @param async [Boolean, Hash, nil] Async options. `true` becomes an empty async payload,
|
|
82
|
+
# while a hash is sent as Reducto's nested `async` object.
|
|
80
83
|
# @param options [Hash] Additional parsing options (same as {#sync})
|
|
81
84
|
#
|
|
82
85
|
# @return [Hash] Job status with keys:
|
|
83
86
|
# * "job_id" [String] - Job identifier for polling
|
|
84
|
-
# * "status" [String] - Initial status ("
|
|
87
|
+
# * "status" [String] - Initial status ("Pending")
|
|
85
88
|
#
|
|
86
89
|
# @raise [ArgumentError] if input is nil
|
|
87
90
|
#
|
|
@@ -92,31 +95,22 @@ module ReductoAI
|
|
|
92
95
|
# # Poll for completion
|
|
93
96
|
# loop do
|
|
94
97
|
# status = client.jobs.retrieve(job_id: job_id)
|
|
95
|
-
# break if status
|
|
98
|
+
# break if client.jobs.completed?(status)
|
|
96
99
|
# sleep 2
|
|
97
100
|
# end
|
|
98
101
|
#
|
|
99
102
|
# @see Jobs#retrieve
|
|
100
|
-
# @see https://docs.reducto.ai/api-reference/parse
|
|
103
|
+
# @see https://docs.reducto.ai/api-reference/async-parse Reducto Async Parse
|
|
101
104
|
def async(input:, async: nil, **options)
|
|
102
105
|
raise ArgumentError, "input is required" if input.nil?
|
|
103
106
|
|
|
104
107
|
normalized_input = normalize_input(input)
|
|
105
108
|
payload = { input: normalized_input }
|
|
106
|
-
payload
|
|
109
|
+
apply_async_payload!(payload, async)
|
|
107
110
|
payload.merge!(options.compact)
|
|
108
111
|
|
|
109
112
|
@client.post("/parse_async", payload)
|
|
110
113
|
end
|
|
111
|
-
|
|
112
|
-
private
|
|
113
|
-
|
|
114
|
-
# @private
|
|
115
|
-
def normalize_input(input)
|
|
116
|
-
return input unless input.is_a?(Hash)
|
|
117
|
-
|
|
118
|
-
input[:url] || input["url"] || input
|
|
119
|
-
end
|
|
120
114
|
end
|
|
121
115
|
end
|
|
122
116
|
end
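The updated `@param async` documentation in parse.rb says that `true` becomes an empty async payload while a hash is sent as the nested `async` object. The `AsyncPayload` module itself is not shown in this part of the diff, so the sketch below is an assumption about how `apply_async_payload!` might behave. The module name `AsyncPayloadSketch`, the handling of `false`, and the `metadata` key in the test data are illustrative only.

```ruby
# Hypothetical re-creation of the documented behavior; the real module may differ.
module AsyncPayloadSketch
  def apply_async_payload!(payload, async)
    case async
    when nil, false
      # Assumption: nothing is added when async options are absent.
    when true
      payload[:async] = {}
    when Hash
      payload[:async] = async
    else
      raise ArgumentError, "async must be true, false, nil, or a Hash"
    end
    payload
  end
end

# Minimal host class so the mixin can be exercised on its own.
class AsyncPayloadDemo
  include AsyncPayloadSketch
end
```

Sharing this logic through a mixin would explain why Parse, Pipeline, and Split each gain an `include AsyncPayload` line and replace their inline payload handling with a single call.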
|
data/lib/reducto_ai/resources/pipeline.rb
CHANGED
|
@@ -20,6 +20,8 @@ module ReductoAI
|
|
|
20
20
|
#
|
|
21
21
|
# @note Pipeline operations consume credits based on all steps executed.
|
|
22
22
|
class Pipeline
|
|
23
|
+
include AsyncPayload
|
|
24
|
+
|
|
23
25
|
# @param client [Client] the Reducto API client
|
|
24
26
|
# @api private
|
|
25
27
|
def initialize(client)
|
|
@@ -36,7 +38,7 @@ module ReductoAI
|
|
|
36
38
|
#
|
|
37
39
|
# @return [Hash] Pipeline results with keys:
|
|
38
40
|
# * "job_id" [String] - Job identifier
|
|
39
|
-
# * "status" [String] - Job status ("
|
|
41
|
+
# * "status" [String] - Job status ("Completed")
|
|
40
42
|
# * "result" [Hash] - Contains "steps" array with each step's result
|
|
41
43
|
# * "usage" [Hash] - Credit usage details
|
|
42
44
|
#
|
|
@@ -68,12 +70,13 @@ module ReductoAI
|
|
|
68
70
|
#
|
|
69
71
|
# @param input [String, Hash] Document URL or hash with :url key
|
|
70
72
|
# @param steps [Array<Hash>] Array of step configurations (same as {#sync})
|
|
71
|
-
# @param async [Boolean, nil] Async
|
|
73
|
+
# @param async [Boolean, Hash, nil] Async options. `true` becomes an empty async payload,
|
|
74
|
+
# while a hash is sent as Reducto's nested `async` object.
|
|
72
75
|
# @param options [Hash] Additional pipeline options
|
|
73
76
|
#
|
|
74
77
|
# @return [Hash] Job status with keys:
|
|
75
78
|
# * "job_id" [String] - Job identifier for polling
|
|
76
|
-
# * "status" [String] - Initial status ("
|
|
79
|
+
# * "status" [String] - Initial status ("Pending")
|
|
77
80
|
#
|
|
78
81
|
# @raise [ArgumentError] if input or steps are nil/empty
|
|
79
82
|
#
|
|
@@ -94,7 +97,7 @@ module ReductoAI
|
|
|
94
97
|
raise ArgumentError, "steps are required" if steps.nil? || (steps.respond_to?(:empty?) && steps.empty?)
|
|
95
98
|
|
|
96
99
|
payload = { input: input, steps: steps }
|
|
97
|
-
payload
|
|
100
|
+
apply_async_payload!(payload, async)
|
|
98
101
|
payload.merge!(options.compact)
|
|
99
102
|
|
|
100
103
|
@client.post("/pipeline_async", payload)
|
data/lib/reducto_ai/resources/split.rb
CHANGED
|
@@ -18,6 +18,8 @@ module ReductoAI
|
|
|
18
18
|
#
|
|
19
19
|
# @note Split operations consume credits based on document size.
|
|
20
20
|
class Split
|
|
21
|
+
include AsyncPayload
|
|
22
|
+
|
|
21
23
|
# @param client [Client] the Reducto API client
|
|
22
24
|
# @api private
|
|
23
25
|
def initialize(client)
|
|
@@ -31,7 +33,7 @@ module ReductoAI
|
|
|
31
33
|
#
|
|
32
34
|
# @return [Hash] Split results with keys:
|
|
33
35
|
# * "job_id" [String] - Job identifier
|
|
34
|
-
# * "status" [String] - Job status ("
|
|
36
|
+
# * "status" [String] - Job status ("Completed")
|
|
35
37
|
# * "result" [Hash] - Sections with page ranges
|
|
36
38
|
# * "usage" [Hash] - Credit usage details
|
|
37
39
|
#
|
|
@@ -58,12 +60,13 @@ module ReductoAI
|
|
|
58
60
|
# Returns immediately with a job_id. Poll with {Jobs#retrieve} to get results.
|
|
59
61
|
#
|
|
60
62
|
# @param input [String, Hash] Document URL or hash with :url key
|
|
61
|
-
# @param async [Boolean, nil] Async
|
|
63
|
+
# @param async [Boolean, Hash, nil] Async options. `true` becomes an empty async payload,
|
|
64
|
+
# while a hash is sent as Reducto's nested `async` object.
|
|
62
65
|
# @param options [Hash] Additional splitting options
|
|
63
66
|
#
|
|
64
67
|
# @return [Hash] Job status with keys:
|
|
65
68
|
# * "job_id" [String] - Job identifier for polling
|
|
66
|
-
# * "status" [String] - Initial status ("
|
|
69
|
+
# * "status" [String] - Initial status ("Pending")
|
|
67
70
|
#
|
|
68
71
|
# @raise [ArgumentError] if input is nil
|
|
69
72
|
#
|
|
@@ -80,20 +83,11 @@ module ReductoAI
|
|
|
80
83
|
|
|
81
84
|
normalized_input = normalize_input(input)
|
|
82
85
|
payload = { input: normalized_input }
|
|
83
|
-
payload
|
|
86
|
+
apply_async_payload!(payload, async)
|
|
84
87
|
payload.merge!(options.compact)
|
|
85
88
|
|
|
86
89
|
@client.post("/split_async", payload)
|
|
87
90
|
end
|
|
88
|
-
|
|
89
|
-
private
|
|
90
|
-
|
|
91
|
-
# @private
|
|
92
|
-
def normalize_input(input)
|
|
93
|
-
return input unless input.is_a?(Hash)
|
|
94
|
-
|
|
95
|
-
input[:url] || input["url"] || input
|
|
96
|
-
end
|
|
97
91
|
end
|
|
98
92
|
end
|
|
99
93
|
end
|
data/lib/reducto_ai/version.rb
CHANGED
|
data/lib/reducto_ai/webhooks/event.rb
ADDED
|
@@ -0,0 +1,68 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "json"
|
|
4
|
+
|
|
5
|
+
module ReductoAI
|
|
6
|
+
module Webhooks
|
|
7
|
+
class Event
|
|
8
|
+
attr_reader :payload, :headers
|
|
9
|
+
|
|
10
|
+
def self.parse(payload, headers: {})
|
|
11
|
+
parsed_payload = payload.is_a?(String) ? JSON.parse(payload) : payload
|
|
12
|
+
new(parsed_payload, headers: headers)
|
|
13
|
+
end
|
|
14
|
+
|
|
15
|
+
def initialize(payload, headers: {})
|
|
16
|
+
@payload = stringify_keys(payload || {})
|
|
17
|
+
@headers = stringify_keys(headers || {})
|
|
18
|
+
end
|
|
19
|
+
|
|
20
|
+
def svix_id
|
|
21
|
+
headers["svix-id"] || headers["webhook-id"]
|
|
22
|
+
end
|
|
23
|
+
|
|
24
|
+
def job_id
|
|
25
|
+
payload["job_id"]
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
def status
|
|
29
|
+
payload["status"]
|
|
30
|
+
end
|
|
31
|
+
|
|
32
|
+
def metadata
|
|
33
|
+
payload["metadata"] || {}
|
|
34
|
+
end
|
|
35
|
+
|
|
36
|
+
def normalized_status
|
|
37
|
+
ReductoAI::JobStatus.normalize_status(status)
|
|
38
|
+
end
|
|
39
|
+
|
|
40
|
+
def completed?
|
|
41
|
+
ReductoAI::JobStatus.completed?(status)
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
def failed?
|
|
45
|
+
ReductoAI::JobStatus.failed?(status)
|
|
46
|
+
end
|
|
47
|
+
|
|
48
|
+
def to_h
|
|
49
|
+
payload
|
|
50
|
+
end
|
|
51
|
+
|
|
52
|
+
private
|
|
53
|
+
|
|
54
|
+
def stringify_keys(value)
|
|
55
|
+
case value
|
|
56
|
+
when Hash
|
|
57
|
+
value.each_with_object({}) do |(key, child_value), normalized|
|
|
58
|
+
normalized[key.to_s] = stringify_keys(child_value)
|
|
59
|
+
end
|
|
60
|
+
when Array
|
|
61
|
+
value.map { |child_value| stringify_keys(child_value) }
|
|
62
|
+
else
|
|
63
|
+
value
|
|
64
|
+
end
|
|
65
|
+
end
|
|
66
|
+
end
|
|
67
|
+
end
|
|
68
|
+
end
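`Event#initialize` runs both the payload and the headers through a recursive `stringify_keys`, so a symbol-keyed Ruby hash and a parsed JSON string should end up with the same shape. The standalone sketch below copies that helper's logic outside the class to show the effect. The sample `job_id` and `metadata` values are made up for illustration.

```ruby
require "json"

# Same recursion as the private helper in the diff, written as a free function.
def stringify_keys(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, child), normalized|
      normalized[key.to_s] = stringify_keys(child)
    end
  when Array
    value.map { |child| stringify_keys(child) }
  else
    value
  end
end

# Illustrative payloads: one built in Ruby, one as it might arrive over HTTP.
from_hash = stringify_keys({ job_id: "job_123", metadata: { tags: [{ name: "invoice" }] } })
from_json = JSON.parse('{"job_id":"job_123","metadata":{"tags":[{"name":"invoice"}]}}')

puts from_hash == from_json
```

Normalizing once at construction lets accessors such as `job_id`, `status`, and `metadata` read string keys only, regardless of how the caller built the payload.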
|
data/lib/reducto_ai/webhooks/verifier.rb
ADDED
|
@@ -0,0 +1,47 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "svix"
|
|
4
|
+
|
|
5
|
+
module ReductoAI
|
|
6
|
+
module Webhooks
|
|
7
|
+
class Verifier
|
|
8
|
+
class << self
|
|
9
|
+
def verify!(payload:, headers:, secret: nil)
|
|
10
|
+
normalized_headers = normalize_headers(headers)
|
|
11
|
+
resolved_secret = resolve_secret(secret, normalized_headers)
|
|
12
|
+
|
|
13
|
+
raise WebhookVerificationError, "webhook secret is required" if resolved_secret.to_s.strip.empty?
|
|
14
|
+
|
|
15
|
+
build_webhook(resolved_secret).verify(payload.to_s, normalized_headers)
|
|
16
|
+
rescue Svix::WebhookVerificationError => e
|
|
17
|
+
raise WebhookVerificationError, e.message
|
|
18
|
+
end
|
|
19
|
+
|
|
20
|
+
private
|
|
21
|
+
|
|
22
|
+
def build_webhook(secret)
|
|
23
|
+
Svix::Webhook.new(secret)
|
|
24
|
+
rescue ArgumentError => e
|
|
25
|
+
raise WebhookVerificationError, "invalid webhook secret format: #{e.message}"
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
def resolve_secret(secret, headers)
|
|
29
|
+
return secret unless secret.nil? || secret.to_s.strip.empty?
|
|
30
|
+
|
|
31
|
+
configuration = ReductoAI.config
|
|
32
|
+
return configuration.webhook_secret_resolver.call(headers) if configuration.webhook_secret_resolver
|
|
33
|
+
|
|
34
|
+
configuration.webhook_secret
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
def normalize_headers(headers)
|
|
38
|
+
return {} if headers.nil?
|
|
39
|
+
|
|
40
|
+
headers.each_with_object({}) do |(key, value), normalized|
|
|
41
|
+
normalized[key.to_s.downcase] = value.to_s
|
|
42
|
+
end
|
|
43
|
+
end
|
|
44
|
+
end
|
|
45
|
+
end
|
|
46
|
+
end
|
|
47
|
+
end
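`Verifier.resolve_secret` picks the signing secret in a fixed order: an explicit non-blank `secret:` argument, then the configured `webhook_secret_resolver` called with the normalized headers, then the static `webhook_secret`. The sketch below reproduces only that precedence and the header normalization, without the Svix signature check. `SketchConfig`, the `X-Tenant` header, and the `whsec_...` values are hypothetical placeholders.

```ruby
# Hypothetical stand-in for ReductoAI.config with the two webhook settings.
SketchConfig = Struct.new(:webhook_secret, :webhook_secret_resolver, keyword_init: true)

# Lowercases header names and stringifies values, as in the diff.
def normalize_headers(headers)
  return {} if headers.nil?

  headers.each_with_object({}) do |(key, value), normalized|
    normalized[key.to_s.downcase] = value.to_s
  end
end

# Explicit secret wins; otherwise the resolver (if any); otherwise the static secret.
def resolve_secret(secret, headers, config)
  return secret unless secret.nil? || secret.to_s.strip.empty?
  return config.webhook_secret_resolver.call(headers) if config.webhook_secret_resolver

  config.webhook_secret
end

headers = normalize_headers({ "Svix-Id" => "msg_1", "X-Tenant" => "acme" })
config = SketchConfig.new(
  webhook_secret: "whsec_default",
  webhook_secret_resolver: ->(h) { h["x-tenant"] == "acme" ? "whsec_acme" : nil }
)

puts resolve_secret(nil, headers, config)
```

As written in the diff, a configured resolver that returns `nil` does not fall back to `webhook_secret`. In that case the blank-secret check in `verify!` raises `WebhookVerificationError`.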
|
data/lib/reducto_ai.rb
CHANGED
|
@@ -3,6 +3,9 @@
|
|
|
3
3
|
require_relative "reducto_ai/version"
|
|
4
4
|
require_relative "reducto_ai/config"
|
|
5
5
|
require_relative "reducto_ai/errors"
|
|
6
|
+
require_relative "reducto_ai/job_status"
|
|
7
|
+
require_relative "reducto_ai/webhooks/verifier"
|
|
8
|
+
require_relative "reducto_ai/webhooks/event"
|
|
6
9
|
require_relative "reducto_ai/client"
|
|
7
10
|
require_relative "reducto_ai/engine"
|
|
8
11
|
|
|
@@ -24,6 +27,10 @@ require_relative "reducto_ai/engine"
|
|
|
24
27
|
# @see Client
|
|
25
28
|
# @see Config
|
|
26
29
|
module ReductoAI
|
|
30
|
+
module Rails
|
|
31
|
+
autoload :RequestVerifier, "reducto_ai/rails/request_verifier"
|
|
32
|
+
end
|
|
33
|
+
|
|
27
34
|
class << self
|
|
28
35
|
# Returns the global configuration instance.
|
|
29
36
|
#
|
data/sig/reducto_ai.rbs
CHANGED
|
@@ -1,4 +1,96 @@
|
|
|
1
1
|
module ReductoAI
|
|
2
2
|
VERSION: String
|
|
3
|
-
|
|
3
|
+
|
|
4
|
+
def self.config: () -> Config
|
|
5
|
+
def self.configure: () { (Config) -> void } -> void
|
|
6
|
+
def self.reset_configuration!: () -> void
|
|
7
|
+
|
|
8
|
+
class Config
|
|
9
|
+
attr_accessor api_key: String?
|
|
10
|
+
attr_accessor base_url: String
|
|
11
|
+
attr_accessor open_timeout: Integer
|
|
12
|
+
attr_accessor read_timeout: Integer
|
|
13
|
+
attr_accessor webhook_secret: String?
|
|
14
|
+
attr_accessor webhook_secret_resolver: Proc?
|
|
15
|
+
attr_writer logger: untyped
|
|
16
|
+
|
|
17
|
+
def logger: () -> untyped
|
|
18
|
+
end
|
|
19
|
+
|
|
20
|
+
class Error < StandardError
|
|
21
|
+
attr_reader status: Integer?
|
|
22
|
+
attr_reader body: untyped
|
|
23
|
+
end
|
|
24
|
+
|
|
25
|
+
class AuthenticationError < Error
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
class ClientError < Error
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
class RateLimitError < ClientError
|
|
32
|
+
end
|
|
33
|
+
|
|
34
|
+
class ServerError < Error
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
class NetworkError < Error
|
|
38
|
+
end
|
|
39
|
+
|
|
40
|
+
class JobTimeoutError < Error
|
|
41
|
+
end
|
|
42
|
+
|
|
43
|
+
class JobFailedError < Error
|
|
44
|
+
end
|
|
45
|
+
|
|
46
|
+
class WebhookVerificationError < Error
|
|
47
|
+
end
|
|
48
|
+
|
|
49
|
+
module Resources
|
|
50
|
+
class Jobs
|
|
51
|
+
def version: () -> Hash[untyped, untyped]
|
|
52
|
+
def list: (?status: String?, ?limit: Integer?, ?offset: Integer?) -> Hash[untyped, untyped]
|
|
53
|
+
def cancel: (job_id: String) -> Hash[untyped, untyped]
|
|
54
|
+
def retrieve: (job_id: String) -> Hash[untyped, untyped]
|
|
55
|
+
def upload: (file: untyped, ?extension: String?) -> Hash[untyped, untyped]
|
|
56
|
+
def normalize_status: (String | Hash[untyped, untyped] | untyped) -> String?
|
|
57
|
+
def pending?: (String | Hash[untyped, untyped] | untyped) -> bool
|
|
58
|
+
def in_progress?: (String | Hash[untyped, untyped] | untyped) -> bool
|
|
59
|
+
def completing?: (String | Hash[untyped, untyped] | untyped) -> bool
|
|
60
|
+
def completed?: (String | Hash[untyped, untyped] | untyped) -> bool
|
|
61
|
+
def failed?: (String | Hash[untyped, untyped] | untyped) -> bool
|
|
62
|
+
def terminal?: (String | Hash[untyped, untyped] | untyped) -> bool
|
|
63
|
+
def wait: (job_id: String, ?interval: Numeric, timeout: Numeric, ?max_attempts: Integer?, ?raise_on_failure: bool) -> Hash[untyped, untyped]
|
|
64
|
+
| (job_id: String, ?interval: Numeric, ?timeout: Numeric?, max_attempts: Integer, ?raise_on_failure: bool) -> Hash[untyped, untyped]
|
|
65
|
+
def configure_webhook: () -> String
|
|
66
|
+
end
|
|
67
|
+
end
|
|
68
|
+
|
|
69
|
+
module Webhooks
|
|
70
|
+
class Verifier
|
|
71
|
+
def self.verify!: (payload: String, headers: Hash[untyped, untyped], ?secret: String?) -> Hash[Symbol, untyped]
|
|
72
|
+
end
|
|
73
|
+
|
|
74
|
+
class Event
|
|
75
|
+
attr_reader payload: Hash[String, untyped]
|
|
76
|
+
attr_reader headers: Hash[String, untyped]
|
|
77
|
+
|
|
78
|
+
def self.parse: (String | Hash[untyped, untyped], ?headers: Hash[untyped, untyped]) -> Event
|
|
79
|
+
def initialize: (Hash[untyped, untyped], ?headers: Hash[untyped, untyped]) -> void
|
|
80
|
+
def svix_id: () -> String?
|
|
81
|
+
def job_id: () -> String?
|
|
82
|
+
def status: () -> String?
|
|
83
|
+
def metadata: () -> Hash[String, untyped]
|
|
84
|
+
def normalized_status: () -> String?
|
|
85
|
+
def completed?: () -> bool
|
|
86
|
+
def failed?: () -> bool
|
|
87
|
+
def to_h: () -> Hash[String, untyped]
|
|
88
|
+
end
|
|
89
|
+
end
|
|
90
|
+
|
|
91
|
+
module Rails
|
|
92
|
+
class RequestVerifier
|
|
93
|
+
def self.verify!: (untyped request, ?secret: String?) -> Webhooks::Event
|
|
94
|
+
end
|
|
95
|
+
end
|
|
4
96
|
end
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: reducto_ai
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 0.2.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- dpaluy
|
|
@@ -9,6 +9,20 @@ bindir: exe
|
|
|
9
9
|
cert_chain: []
|
|
10
10
|
date: 1980-01-02 00:00:00.000000000 Z
|
|
11
11
|
dependencies:
|
|
12
|
+
- !ruby/object:Gem::Dependency
|
|
13
|
+
name: base64
|
|
14
|
+
requirement: !ruby/object:Gem::Requirement
|
|
15
|
+
requirements:
|
|
16
|
+
- - ">="
|
|
17
|
+
- !ruby/object:Gem::Version
|
|
18
|
+
version: '0'
|
|
19
|
+
type: :runtime
|
|
20
|
+
prerelease: false
|
|
21
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
22
|
+
requirements:
|
|
23
|
+
- - ">="
|
|
24
|
+
- !ruby/object:Gem::Version
|
|
25
|
+
version: '0'
|
|
12
26
|
- !ruby/object:Gem::Dependency
|
|
13
27
|
name: faraday
|
|
14
28
|
requirement: !ruby/object:Gem::Requirement
|
|
@@ -37,6 +51,20 @@ dependencies:
|
|
|
37
51
|
- - "~>"
|
|
38
52
|
- !ruby/object:Gem::Version
|
|
39
53
|
version: '1.0'
|
|
54
|
+
- !ruby/object:Gem::Dependency
|
|
55
|
+
name: svix
|
|
56
|
+
requirement: !ruby/object:Gem::Requirement
|
|
57
|
+
requirements:
|
|
58
|
+
- - "~>"
|
|
59
|
+
- !ruby/object:Gem::Version
|
|
60
|
+
version: '1.0'
|
|
61
|
+
type: :runtime
|
|
62
|
+
prerelease: false
|
|
63
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
64
|
+
requirements:
|
|
65
|
+
- - "~>"
|
|
66
|
+
- !ruby/object:Gem::Version
|
|
67
|
+
version: '1.0'
|
|
40
68
|
description: ReductoAI provides a lightweight Faraday-based wrapper for Reducto's
|
|
41
69
|
Parse, Split, Extract, Edit, and Pipeline endpoints including async helpers and
|
|
42
70
|
Rails-friendly configuration.
|
|
@@ -59,6 +87,9 @@ files:
|
|
|
59
87
|
- lib/reducto_ai/config.rb
|
|
60
88
|
- lib/reducto_ai/engine.rb
|
|
61
89
|
- lib/reducto_ai/errors.rb
|
|
90
|
+
- lib/reducto_ai/job_status.rb
|
|
91
|
+
- lib/reducto_ai/rails/request_verifier.rb
|
|
92
|
+
- lib/reducto_ai/resources/async_payload.rb
|
|
62
93
|
- lib/reducto_ai/resources/edit.rb
|
|
63
94
|
- lib/reducto_ai/resources/extract.rb
|
|
64
95
|
- lib/reducto_ai/resources/jobs.rb
|
|
@@ -66,6 +97,8 @@ files:
|
|
|
66
97
|
- lib/reducto_ai/resources/pipeline.rb
|
|
67
98
|
- lib/reducto_ai/resources/split.rb
|
|
68
99
|
- lib/reducto_ai/version.rb
|
|
100
|
+
- lib/reducto_ai/webhooks/event.rb
|
|
101
|
+
- lib/reducto_ai/webhooks/verifier.rb
|
|
69
102
|
- sig/reducto_ai.rbs
|
|
70
103
|
homepage: https://github.com/dpaluy/reducto_ai
|
|
71
104
|
licenses:
|