ragie_ruby_sdk 1.0.7 → 1.0.9
- checksums.yaml +4 -4
- data/README.md +4 -4
- data/docs/BodyCreateDocument.md +1 -1
- data/docs/BodyUpdateDocumentFile.md +1 -1
- data/docs/DocumentsApi.md +6 -6
- data/lib/ragie_ruby_sdk/api/documents_api.rb +8 -8
- data/lib/ragie_ruby_sdk/models/body.rb +1 -2
- data/lib/ragie_ruby_sdk/models/body_create_document.rb +1 -1
- data/lib/ragie_ruby_sdk/models/body_update_document_file.rb +1 -1
- data/lib/ragie_ruby_sdk/models/create_partition_params_metadata_schema_value.rb +1 -2
- data/lib/ragie_ruby_sdk/models/data.rb +1 -2
- data/lib/ragie_ruby_sdk/models/metadata_value.rb +1 -2
- data/lib/ragie_ruby_sdk/models/metadata_value1.rb +1 -2
- data/lib/ragie_ruby_sdk/models/mode.rb +1 -2
- data/lib/ragie_ruby_sdk/models/mode1.rb +1 -2
- data/lib/ragie_ruby_sdk/models/mode2.rb +1 -2
- data/lib/ragie_ruby_sdk/models/partition_strategy.rb +1 -2
- data/lib/ragie_ruby_sdk/models/response_patchdocumentmetadata.rb +1 -2
- data/lib/ragie_ruby_sdk/models/source.rb +1 -2
- data/lib/ragie_ruby_sdk/models/validation_error_loc_inner.rb +1 -2
- data/lib/ragie_ruby_sdk/version.rb +1 -1
- data/spec/api/documents_api_spec.rb +4 -4
- metadata +1 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 901a454f4f01f2def36293ef0b034cc9d4c2acd01f61657e07b775bfe157181e
|
|
4
|
+
data.tar.gz: 46f82247aadc1dee53a03fa12350d433fa124a6241f73ed9b8e195e0ac82e8b3
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 135860facf2ab64fe7637675840d722c92bd7e03c96f7e80dc2426a8b4a01a6738bbb83e5a3f4cc7f0f7506f52106867f251a0b3b656185d885ca86662ca86b6
|
|
7
|
+
data.tar.gz: aa1485983b191ef7d4300e4efea135f66e06b94897b806581968fe9020890db510bdda95d9f55c23147f83888f2908615667e3746bdc8901f5b5ada6c1590d88
|
data/README.md
CHANGED
|
@@ -7,7 +7,7 @@ No description provided (generated by Openapi Generator https://github.com/opena
|
|
|
7
7
|
This SDK is automatically generated by the [OpenAPI Generator](https://openapi-generator.tech) project:
|
|
8
8
|
|
|
9
9
|
- API version: 1.0.0
|
|
10
|
-
- Package version: 1.0.
|
|
10
|
+
- Package version: 1.0.8
|
|
11
11
|
- Generator version: 7.16.0-SNAPSHOT
|
|
12
12
|
- Build package: org.openapitools.codegen.languages.RubyClientCodegen
|
|
13
13
|
|
|
@@ -24,16 +24,16 @@ gem build ragie_ruby_sdk.gemspec
|
|
|
24
24
|
Then either install the gem locally:
|
|
25
25
|
|
|
26
26
|
```shell
|
|
27
|
-
gem install ./ragie_ruby_sdk-1.0.
|
|
27
|
+
gem install ./ragie_ruby_sdk-1.0.8.gem
|
|
28
28
|
```
|
|
29
29
|
|
|
30
|
-
(for development, run `gem install --dev ./ragie_ruby_sdk-1.0.
|
|
30
|
+
(for development, run `gem install --dev ./ragie_ruby_sdk-1.0.8.gem` to install the development dependencies)
|
|
31
31
|
|
|
32
32
|
or publish the gem to a gem hosting service, e.g. [RubyGems](https://rubygems.org/).
|
|
33
33
|
|
|
34
34
|
Finally add this to the Gemfile:
|
|
35
35
|
|
|
36
|
-
gem 'ragie_ruby_sdk', '~> 1.0.
|
|
36
|
+
gem 'ragie_ruby_sdk', '~> 1.0.8'
|
|
37
37
|
|
|
38
38
|
### Install from Git
|
|
39
39
|
|
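As an aside on the `'~> 1.0.8'` constraint in the Gemfile line above: the following is a minimal sketch of how RubyGems evaluates that pessimistic operator. The helper name `allowed_by_constraint?` is hypothetical and not part of the SDK; only `Gem::Requirement` and `Gem::Version` from RubyGems are used.

```ruby
require 'rubygems'

# Hypothetical helper (not part of ragie_ruby_sdk): returns true when
# `version` satisfies the Gemfile-style `constraint`.
def allowed_by_constraint?(version, constraint = '~> 1.0.8')
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# '~> 1.0.8' means ">= 1.0.8 and < 1.1", so patch releases are accepted
# while older patches and the next minor release are not.
%w[1.0.7 1.0.8 1.0.9 1.1.0].each do |version|
  verdict = allowed_by_constraint?(version) ? 'allowed' : 'rejected'
  puts "#{version}: #{verdict}"
end
```

Under this constraint a Gemfile pinned at `'~> 1.0.8'` would also accept 1.0.9 without being edited.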
data/docs/BodyCreateDocument.md
CHANGED
|
@@ -6,7 +6,7 @@
|
|
|
6
6
|
| ---- | ---- | ----------- | ----- |
|
|
7
7
|
| **mode** | **String** | Partition strategy for the document. Different strategies exist for textual, audio and video file types and you can set the strategy you want for each file type, or just for textual types. For textual documents the options are `'hi_res'` or `'fast'`. When set to `'hi_res'`, images and tables will be extracted from the document. `'fast'` will only extract text. `'fast'` may be up to 20x faster than `'hi_res'`. `hi_res` is only applicable for Word documents, PDFs, Images, and PowerPoints. Images will always be processed in `hi_res`. If `hi_res` is set for an unsupported document type, it will be processed and billed in `fast` mode. For audio files, the options are true or false. True if you want to process audio, false otherwise. For video files, the options are `'audio_only'`, `'video_only'`, `'audio_video'`. `'audio_only'` will extract just the audio part of the video. `'video_only'` will similarly just extract the video part, ignoring audio. `'audio_video'` will extract both audio and video. To process all media types at the highest quality, use `'all'`. When you specify audio or video stategies, the format must be a JSON object. In this case, textual documents are denoted by the key \"static\". If you omit a key, that document type won't be processd. See examples below. Examples Textual documents only \"fast\" Video documents only { \"video\": \"audio_video\" } Specify multiple document types { \"static\": \"hi_res\", \"audio\": true, \"video\": \"video_only\" } Specify only textual or audio document types { \"static\": \"fast\", \"audio\": true } Highest quality processing for all media types \"all\" | [optional][default to 'fast'] |
|
|
8
8
|
| **metadata** | **String** | Metadata for the document. Keys must be strings. Values may be strings, numbers, booleans, or lists of strings. Numbers may be integers or floating point and will be converted to 64 bit floating point. 1000 total values are allowed. Each item in an array counts towards the total. The following keys are reserved for internal use: `document_id`, `document_type`, `document_source`, `document_name`, `document_uploaded_at`, `start_time`, `end_time`. | [optional][default to '{}'] |
|
|
9
|
-
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. | |
|
|
9
|
+
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode. | |
|
|
10
10
|
| **external_id** | **String** | An optional identifier for the document. A common value might be an id in an external system or the URL where the source file may be found. | [optional] |
|
|
11
11
|
| **partition** | **String** | An optional partition identifier. Documents can be scoped to a partition. Partitions must be lowercase alphanumeric and may only include the special characters `_` and `-`. A partition is created any time a document is created. | [optional] |
|
|
12
12
|
| **name** | **String** | An optional name for the document. If set, the document will have this name. Otherwise it will default to the file's name. | [optional] |
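The `mode` description above allows either a plain string for textual documents or a JSON object keyed by `static`, `audio`, and `video`. A minimal sketch of building that value follows; `build_mode` and the two constants are hypothetical names introduced for illustration, and how the SDK's own `Mode2` wrapper expects the value to be passed is not shown in this diff.

```ruby
require 'json'

# Allowed values as listed in the `mode` description.
STATIC_MODES = %w[hi_res fast].freeze
VIDEO_MODES  = %w[audio_only video_only audio_video].freeze

# Hypothetical helper (not part of ragie_ruby_sdk): returns the string to
# send as `mode`, following the rules in the table above.
def build_mode(static: nil, audio: nil, video: nil)
  raise ArgumentError, 'at least one strategy is required' if static.nil? && audio.nil? && video.nil?
  raise ArgumentError, "unknown static mode: #{static}" if static && !STATIC_MODES.include?(static)
  raise ArgumentError, "unknown video mode: #{video}" if video && !VIDEO_MODES.include?(video)
  raise ArgumentError, 'audio must be true or false' unless [nil, true, false].include?(audio)

  # Textual-only strategies may be given as a plain string.
  return static if audio.nil? && video.nil?

  # Audio or video strategies require a JSON object; keys left out here
  # correspond to document types that will not be processed.
  JSON.generate({ static: static, audio: audio, video: video }.compact)
end

puts build_mode(static: 'fast')
puts build_mode(static: 'hi_res', audio: true, video: 'video_only')
```

The special value `'all'` is left out of the helper on purpose; it can be passed directly as a string.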
|
data/docs/BodyUpdateDocumentFile.md
CHANGED
|
@@ -5,7 +5,7 @@
|
|
|
5
5
|
| Name | Type | Description | Notes |
|
|
6
6
|
| ---- | ---- | ----------- | ----- |
|
|
7
7
|
| **mode** | **String** | Partition strategy for the document. Different strategies exist for textual, audio and video file types and you can set the strategy you want for each file type, or just for textual types. For textual documents the options are `'hi_res'` or `'fast'`. When set to `'hi_res'`, images and tables will be extracted from the document. `'fast'` will only extract text. `'fast'` may be up to 20x faster than `'hi_res'`. `hi_res` is only applicable for Word documents, PDFs, Images, and PowerPoints. Images will always be processed in `hi_res`. If `hi_res` is set for an unsupported document type, it will be processed and billed in `fast` mode. For audio files, the options are true or false. True if you want to process audio, false otherwise. For video files, the options are `'audio_only'`, `'video_only'`, `'audio_video'`. `'audio_only'` will extract just the audio part of the video. `'video_only'` will similarly just extract the video part, ignoring audio. `'audio_video'` will extract both audio and video. To process all media types at the highest quality, use `'all'`. When you specify audio or video stategies, the format must be a JSON object. In this case, textual documents are denoted by the key \"static\". If you omit a key, that document type won't be processd. See examples below. Examples Textual documents only \"fast\" Video documents only { \"video\": \"audio_video\" } Specify multiple document types { \"static\": \"hi_res\", \"audio\": true, \"video\": \"video_only\" } Specify only textual or audio document types { \"static\": \"fast\", \"audio\": true } Highest quality processing for all media types \"all\" | [optional][default to 'fast'] |
|
|
8
|
-
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. | |
|
|
8
|
+
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode. | |
|
|
9
9
|
|
|
10
10
|
## Example
|
|
11
11
|
|
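The sentence added in this release says PDF files over 2000 pages are not supported in `hi_res` mode. One way a caller might guard against that before uploading is sketched below. This is an assumption-laden illustration: `textual_mode_for` and `HI_RES_PDF_PAGE_LIMIT` are hypothetical names, the page count has to come from the caller, and falling back to `'fast'` is a design choice made here, not behavior documented by the SDK.

```ruby
# Page limit taken from the sentence added to the `file` description.
HI_RES_PDF_PAGE_LIMIT = 2000

# Hypothetical helper (not part of ragie_ruby_sdk): picks the textual
# partition strategy for a file, given a page count supplied by the caller.
def textual_mode_for(filename, page_count:, preferred: 'hi_res')
  pdf = File.extname(filename).downcase == '.pdf'

  # Assumed fallback: use 'fast' when a PDF exceeds the hi_res page limit.
  return 'fast' if preferred == 'hi_res' && pdf && page_count > HI_RES_PDF_PAGE_LIMIT

  preferred
end

puts textual_mode_for('report.pdf', page_count: 2500)
puts textual_mode_for('report.pdf', page_count: 2000)
```

A caller could equally choose to split the PDF or reject the upload; the sketch only shows where such a check might sit.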
data/docs/DocumentsApi.md
CHANGED
|
@@ -42,7 +42,7 @@ RagieRubySdk.configure do |config|
|
|
|
42
42
|
end
|
|
43
43
|
|
|
44
44
|
api_instance = RagieRubySdk::DocumentsApi.new
|
|
45
|
-
file = File.new('/path/to/some/file') # File | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
45
|
+
file = File.new('/path/to/some/file') # File | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
46
46
|
opts = {
|
|
47
47
|
mode: RagieRubySdk::Mode2OneOf.new, # Mode2 |
|
|
48
48
|
metadata: { key: nil}, # Hash<String, MetadataValue1> | Metadata for the document. Keys must be strings. Values may be strings, numbers, booleans, or lists of strings. Numbers may be integers or floating point and will be converted to 64 bit floating point. 1000 total values are allowed. Each item in an array counts towards the total. The following keys are reserved for internal use: `document_id`, `document_type`, `document_source`, `document_name`, `document_uploaded_at`, `start_time`, `end_time`.
|
|
@@ -82,7 +82,7 @@ end
|
|
|
82
82
|
|
|
83
83
|
| Name | Type | Description | Notes |
|
|
84
84
|
| ---- | ---- | ----------- | ----- |
|
|
85
|
-
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. | |
|
|
85
|
+
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode. | |
|
|
86
86
|
| **mode** | [**Mode2**](Mode2.md) | | [optional] |
|
|
87
87
|
| **metadata** | [**Hash<String, MetadataValue1>**](Hash.md) | Metadata for the document. Keys must be strings. Values may be strings, numbers, booleans, or lists of strings. Numbers may be integers or floating point and will be converted to 64 bit floating point. 1000 total values are allowed. Each item in an array counts towards the total. The following keys are reserved for internal use: `document_id`, `document_type`, `document_source`, `document_name`, `document_uploaded_at`, `start_time`, `end_time`. | [optional] |
|
|
88
88
|
| **external_id** | **String** | An optional identifier for the document. A common value might be an id in an external system or the URL where the source file may be found. | [optional] |
|
|
@@ -109,7 +109,7 @@ end
|
|
|
109
109
|
|
|
110
110
|
Create Document From Url
|
|
111
111
|
|
|
112
|
-
Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
112
|
+
Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
113
113
|
|
|
114
114
|
### Examples
|
|
115
115
|
|
|
@@ -1016,7 +1016,7 @@ end
|
|
|
1016
1016
|
|
|
1017
1017
|
api_instance = RagieRubySdk::DocumentsApi.new
|
|
1018
1018
|
document_id = '38400000-8cf0-11bd-b23e-10b96e4ef00d' # String | The id of the document.
|
|
1019
|
-
file = File.new('/path/to/some/file') # File | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
1019
|
+
file = File.new('/path/to/some/file') # File | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
1020
1020
|
opts = {
|
|
1021
1021
|
partition: 'partition_example', # String | An optional partition to scope the request to. If omitted, accounts created after 1/9/2025 will have the request scoped to the default partition, while older accounts will have the request scoped to all partitions. Older accounts may opt in to strict partition scoping by contacting support@ragie.ai. Older accounts using the partitions feature are strongly recommended to scope the request to a partition.
|
|
1022
1022
|
mode: RagieRubySdk::Mode2OneOf.new # Mode2 |
|
|
@@ -1054,7 +1054,7 @@ end
|
|
|
1054
1054
|
| Name | Type | Description | Notes |
|
|
1055
1055
|
| ---- | ---- | ----------- | ----- |
|
|
1056
1056
|
| **document_id** | **String** | The id of the document. | |
|
|
1057
|
-
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. | |
|
|
1057
|
+
| **file** | **File** | The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode. | |
|
|
1058
1058
|
| **partition** | **String** | An optional partition to scope the request to. If omitted, accounts created after 1/9/2025 will have the request scoped to the default partition, while older accounts will have the request scoped to all partitions. Older accounts may opt in to strict partition scoping by contacting support@ragie.ai. Older accounts using the partitions feature are strongly recommended to scope the request to a partition. | [optional] |
|
|
1059
1059
|
| **mode** | [**Mode2**](Mode2.md) | | [optional] |
|
|
1060
1060
|
|
|
@@ -1078,7 +1078,7 @@ end
|
|
|
1078
1078
|
|
|
1079
1079
|
Update Document Url
|
|
1080
1080
|
|
|
1081
|
-
Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
1081
|
+
Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
1082
1082
|
|
|
1083
1083
|
### Examples
|
|
1084
1084
|
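The endpoint descriptions above list the document statuses and say when retrieval and summaries become available. A minimal sketch of those rules as predicates follows. The constant and method names are hypothetical, and treating the documented list order as the pipeline order is an assumption.

```ruby
# Statuses in the order the documentation lists them. Treating this order as
# the order of the ingest pipeline is an assumption.
DOCUMENT_STATUSES = %w[
  pending partitioning partitioned refined chunked
  indexed summary_indexed keyword_indexed ready failed
].freeze

# Hypothetical helper: true once the document has reached `indexed` or a
# later non-failed status, which the docs describe as usable for retrieval.
def retrievable?(status)
  position = DOCUMENT_STATUSES.index(status)
  raise ArgumentError, "unknown status: #{status}" if position.nil?
  return false if status == 'failed'

  position >= DOCUMENT_STATUSES.index('indexed')
end

# Hypothetical helper: the docs name exactly these two states as the ones
# in which the summary is available.
def summary_available?(status)
  %w[summary_indexed ready].include?(status)
end

DOCUMENT_STATUSES.each do |status|
  puts "#{status}: retrievable=#{retrievable?(status)} summary=#{summary_available?(status)}"
end
```

A polling loop around the SDK's document lookup could use these predicates to decide when to stop waiting; that loop is omitted because it depends on network calls.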
|
data/lib/ragie_ruby_sdk/api/documents_api.rb
CHANGED
|
@@ -21,7 +21,7 @@ module RagieRubySdk
|
|
|
21
21
|
end
|
|
22
22
|
# Create Document
|
|
23
23
|
# On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
24
|
-
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
24
|
+
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
25
25
|
# @param [Hash] opts the optional parameters
|
|
26
26
|
# @option opts [Mode2] :mode
|
|
27
27
|
# @option opts [Hash<String, MetadataValue1>] :metadata Metadata for the document. Keys must be strings. Values may be strings, numbers, booleans, or lists of strings. Numbers may be integers or floating point and will be converted to 64 bit floating point. 1000 total values are allowed. Each item in an array counts towards the total. The following keys are reserved for internal use: `document_id`, `document_type`, `document_source`, `document_name`, `document_uploaded_at`, `start_time`, `end_time`.
|
|
@@ -36,7 +36,7 @@ module RagieRubySdk
|
|
|
36
36
|
|
|
37
37
|
# Create Document
|
|
38
38
|
# On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
39
|
-
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
39
|
+
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
40
40
|
# @param [Hash] opts the optional parameters
|
|
41
41
|
# @option opts [Mode2] :mode
|
|
42
42
|
# @option opts [Hash<String, MetadataValue1>] :metadata Metadata for the document. Keys must be strings. Values may be strings, numbers, booleans, or lists of strings. Numbers may be integers or floating point and will be converted to 64 bit floating point. 1000 total values are allowed. Each item in an array counts towards the total. The following keys are reserved for internal use: `document_id`, `document_type`, `document_source`, `document_name`, `document_uploaded_at`, `start_time`, `end_time`.
|
|
@@ -104,7 +104,7 @@ module RagieRubySdk
|
|
|
104
104
|
end
|
|
105
105
|
|
|
106
106
|
# Create Document From Url
|
|
107
|
-
# Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
107
|
+
# Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
108
108
|
# @param create_document_from_url_params [CreateDocumentFromUrlParams]
|
|
109
109
|
# @param [Hash] opts the optional parameters
|
|
110
110
|
# @return [Document]
|
|
@@ -114,7 +114,7 @@ module RagieRubySdk
|
|
|
114
114
|
end
|
|
115
115
|
|
|
116
116
|
# Create Document From Url
|
|
117
|
-
# Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
117
|
+
# Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
118
118
|
# @param create_document_from_url_params [CreateDocumentFromUrlParams]
|
|
119
119
|
# @param [Hash] opts the optional parameters
|
|
120
120
|
# @return [Array<(Document, Integer, Hash)>] Document data, response status code and response headers
|
|
@@ -978,7 +978,7 @@ module RagieRubySdk
|
|
|
978
978
|
|
|
979
979
|
# Update Document File
|
|
980
980
|
# @param document_id [String] The id of the document.
|
|
981
|
-
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
981
|
+
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
982
982
|
# @param [Hash] opts the optional parameters
|
|
983
983
|
# @option opts [String] :partition An optional partition to scope the request to. If omitted, accounts created after 1/9/2025 will have the request scoped to the default partition, while older accounts will have the request scoped to all partitions. Older accounts may opt in to strict partition scoping by contacting support@ragie.ai. Older accounts using the partitions feature are strongly recommended to scope the request to a partition.
|
|
984
984
|
# @option opts [Mode2] :mode
|
|
@@ -990,7 +990,7 @@ module RagieRubySdk
|
|
|
990
990
|
|
|
991
991
|
# Update Document File
|
|
992
992
|
# @param document_id [String] The id of the document.
|
|
993
|
-
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
993
|
+
# @param file [File] The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
994
994
|
# @param [Hash] opts the optional parameters
|
|
995
995
|
# @option opts [String] :partition An optional partition to scope the request to. If omitted, accounts created after 1/9/2025 will have the request scoped to the default partition, while older accounts will have the request scoped to all partitions. Older accounts may opt in to strict partition scoping by contacting support@ragie.ai. Older accounts using the partitions feature are strongly recommended to scope the request to a partition.
|
|
996
996
|
# @option opts [Mode2] :mode
|
|
@@ -1056,7 +1056,7 @@ module RagieRubySdk
|
|
|
1056
1056
|
end
|
|
1057
1057
|
|
|
1058
1058
|
# Update Document Url
|
|
1059
|
-
# Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
1059
|
+
# Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
1060
1060
|
# @param document_id [String] The id of the document.
|
|
1061
1061
|
# @param update_document_from_url_params [UpdateDocumentFromUrlParams]
|
|
1062
1062
|
# @param [Hash] opts the optional parameters
|
|
@@ -1068,7 +1068,7 @@ module RagieRubySdk
|
|
|
1068
1068
|
end
|
|
1069
1069
|
|
|
1070
1070
|
# Update Document Url
|
|
1071
|
-
# Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
1071
|
+
# Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
1072
1072
|
# @param document_id [String] The id of the document.
|
|
1073
1073
|
# @param update_document_from_url_params [UpdateDocumentFromUrlParams]
|
|
1074
1074
|
# @param [Hash] opts the optional parameters
|
|
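Reviewer note: the doc comment in the two hunks above describes a status progression that callers typically poll. The following is a minimal sketch of such a loop, not SDK code. `wait_for_status` and its `fetch_status` callable are hypothetical names introduced here; the caller is assumed to supply whatever lookup returns the document's current status string.

```ruby
# Statuses exactly as listed in the doc comment above.
STATUSES = %w[
  pending partitioning partitioned refined chunked indexed
  summary_indexed keyword_indexed ready failed
].freeze

# Hypothetical helper (not part of ragie_ruby_sdk).
# fetch_status: any callable returning the current status string.
# accept:       statuses the caller treats as "good enough" to stop polling.
def wait_for_status(fetch_status, accept: %w[ready], attempts: 30, delay: 2.0)
  attempts.times do
    status = fetch_status.call
    raise ArgumentError, "unknown status #{status.inspect}" unless STATUSES.include?(status)
    return status if accept.include?(status)
    raise 'document processing failed' if status == 'failed'

    sleep(delay)
  end
  nil # gave up without reaching an accepted status
end

# Usage with a canned sequence standing in for real API lookups.
canned = %w[pending partitioning indexed summary_indexed ready]
puts wait_for_status(-> { canned.shift }, delay: 0)
```

Passing `accept: %w[indexed summary_indexed keyword_indexed ready]` would stop as soon as the document is usable for retrieval, at the cost of the summary possibly not being available yet, as the comment above points out.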
@@ -44,8 +44,7 @@ module RagieRubySdk
|
|
|
44
44
|
openapi_any_of.each do |klass|
|
|
45
45
|
begin
|
|
46
46
|
next if klass == :AnyType # "nullable: true"
|
|
47
|
-
|
|
48
|
-
return typed_data if typed_data
|
|
47
|
+
return find_and_cast_into_type(klass, data)
|
|
49
48
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
50
49
|
end
|
|
51
50
|
end
|
|
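Reviewer note: this hunk, repeated across the `anyOf`/`oneOf` model files below, swaps a guarded return for a direct one. The content of the first removed line is not visible in this rendering; the sketch below assumes it assigned `typed_data = find_and_cast_into_type(klass, data)`. Under that assumption the two versions differ for falsy matches such as `false`. The `find_and_cast_into_type` here is a simplified stand-in written for this note, not the generated helper, and `:no_match` is a sentinel chosen only to make the outcome visible.

```ruby
# Simplified stand-in: returns the data when it matches the candidate type,
# raises otherwise. The real generated helper is not shown in this diff.
def find_and_cast_into_type(klass, data)
  case klass
  when :String  then return data if data.is_a?(String)
  when :Float   then return data if data.is_a?(Float)
  when :Boolean then return data if data == true || data == false
  end
  raise ArgumentError, "#{data.inspect} doesn't match #{klass}"
end

CANDIDATES = %i[String Float Boolean].freeze

# Assumed 1.0.7 shape: a falsy cast result is treated as "no match".
def build_old(data)
  CANDIDATES.each do |klass|
    begin
      next if klass == :AnyType
      typed_data = find_and_cast_into_type(klass, data)
      return typed_data if typed_data
    rescue # keep iterating when the current candidate raises
    end
  end
  :no_match
end

# 1.0.9 shape: whatever the cast returns is returned as-is.
def build_new(data)
  CANDIDATES.each do |klass|
    begin
      next if klass == :AnyType
      return find_and_cast_into_type(klass, data)
    rescue # keep iterating when the current candidate raises
    end
  end
  :no_match
end

puts build_old(false).inspect
puts build_new(false).inspect
```

In this sketch only the new shape hands back `false` for a boolean candidate; truthy values behave the same in both.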
@@ -21,7 +21,7 @@ module RagieRubySdk
|
|
|
21
21
|
# Metadata for the document. Keys must be strings. Values may be strings, numbers, booleans, or lists of strings. Numbers may be integers or floating point and will be converted to 64 bit floating point. 1000 total values are allowed. Each item in an array counts towards the total. The following keys are reserved for internal use: `document_id`, `document_type`, `document_source`, `document_name`, `document_uploaded_at`, `start_time`, `end_time`.
|
|
22
22
|
attr_accessor :metadata
|
|
23
23
|
|
|
24
|
-
# The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
24
|
+
# The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
25
25
|
attr_accessor :file
|
|
26
26
|
|
|
27
27
|
# An optional identifier for the document. A common value might be an id in an external system or the URL where the source file may be found.
|
|
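Reviewer note: the `metadata` comment in this hunk states several constraints (string keys, allowed value types, a 1000-value budget where each array item counts, and reserved keys). A client-side pre-check could look like the sketch below. `metadata_problems` is a hypothetical helper written for this note; the SDK is not known to ship such a check, and the server remains the authority on what is accepted.

```ruby
# Reserved keys and the value limit are copied from the doc comment above.
RESERVED_KEYS = %w[
  document_id document_type document_source document_name
  document_uploaded_at start_time end_time
].freeze
MAX_VALUES = 1000

# Hypothetical helper: returns a list of human-readable problems (empty when none).
def metadata_problems(metadata)
  problems = []
  total = 0
  metadata.each do |key, value|
    problems << "key #{key.inspect} is not a String" unless key.is_a?(String)
    problems << "key #{key.inspect} is reserved" if RESERVED_KEYS.include?(key)
    case value
    when String, Numeric, true, false
      total += 1
    when Array
      unless value.all? { |item| item.is_a?(String) }
        problems << "list under #{key.inspect} must contain only strings"
      end
      total += value.length # each array item counts towards the total
    else
      problems << "value under #{key.inspect} has unsupported type #{value.class}"
    end
  end
  problems << "#{total} values exceed the limit of #{MAX_VALUES}" if total > MAX_VALUES
  problems
end

puts metadata_problems({ 'author' => 'jane', 'tags' => %w[a b], 'document_id' => 'x' })
```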
@@ -18,7 +18,7 @@ module RagieRubySdk
|
|
|
18
18
|
# Partition strategy for the document. Different strategies exist for textual, audio and video file types and you can set the strategy you want for each file type, or just for textual types. For textual documents the options are `'hi_res'` or `'fast'`. When set to `'hi_res'`, images and tables will be extracted from the document. `'fast'` will only extract text. `'fast'` may be up to 20x faster than `'hi_res'`. `hi_res` is only applicable for Word documents, PDFs, Images, and PowerPoints. Images will always be processed in `hi_res`. If `hi_res` is set for an unsupported document type, it will be processed and billed in `fast` mode. For audio files, the options are true or false. True if you want to process audio, false otherwise. For video files, the options are `'audio_only'`, `'video_only'`, `'audio_video'`. `'audio_only'` will extract just the audio part of the video. `'video_only'` will similarly just extract the video part, ignoring audio. `'audio_video'` will extract both audio and video. To process all media types at the highest quality, use `'all'`. When you specify audio or video strategies, the format must be a JSON object. In this case, textual documents are denoted by the key \"static\". If you omit a key, that document type won't be processed. See examples below. Examples Textual documents only \"fast\" Video documents only { \"video\": \"audio_video\" } Specify multiple document types { \"static\": \"hi_res\", \"audio\": true, \"video\": \"video_only\" } Specify only textual or audio document types { \"static\": \"fast\", \"audio\": true } Highest quality processing for all media types \"all\"
|
|
19
19
|
attr_accessor :mode
|
|
20
20
|
|
|
21
|
-
# The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
21
|
+
# The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
22
22
|
attr_accessor :file
|
|
23
23
|
|
|
24
24
|
# Attribute mapping from ruby-style variable name to JSON key.
|
|
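Reviewer note: the `mode` comment in this hunk describes two shapes, a bare string or a JSON object keyed by `static`, `audio`, and `video`. The sketch below checks a value against those shapes as the comment words them. `valid_mode?` is a hypothetical helper for illustration, and treating an empty object as invalid is an assumption of this note rather than something the comment states.

```ruby
require 'json'

STATIC_MODES = %w[hi_res fast].freeze
VIDEO_MODES  = %w[audio_only video_only audio_video].freeze

# Hypothetical helper: true when `mode` matches one of the documented shapes.
def valid_mode?(mode)
  case mode
  when String
    STATIC_MODES.include?(mode) || mode == 'all'
  when Hash
    return false if mode.empty? # assumption: an object with no keys is rejected

    mode.all? do |key, value|
      case key
      when 'static' then STATIC_MODES.include?(value)
      when 'audio'  then value == true || value == false
      when 'video'  then VIDEO_MODES.include?(value)
      else false
      end
    end
  else
    false
  end
end

# Usage: the "multiple document types" example from the comment above.
mode = { 'static' => 'hi_res', 'audio' => true, 'video' => 'video_only' }
puts JSON.generate(mode) if valid_mode?(mode)
```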
@@ -40,8 +40,7 @@ module RagieRubySdk
|
|
|
40
40
|
openapi_any_of.each do |klass|
|
|
41
41
|
begin
|
|
42
42
|
next if klass == :AnyType # "nullable: true"
|
|
43
|
-
|
|
44
|
-
return typed_data if typed_data
|
|
43
|
+
return find_and_cast_into_type(klass, data)
|
|
45
44
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
46
45
|
end
|
|
47
46
|
end
|
|
@@ -38,8 +38,7 @@ module RagieRubySdk
|
|
|
38
38
|
openapi_any_of.each do |klass|
|
|
39
39
|
begin
|
|
40
40
|
next if klass == :AnyType # "nullable: true"
|
|
41
|
-
|
|
42
|
-
return typed_data if typed_data
|
|
41
|
+
return find_and_cast_into_type(klass, data)
|
|
43
42
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
44
43
|
end
|
|
45
44
|
end
|
|
@@ -39,8 +39,7 @@ module RagieRubySdk
|
|
|
39
39
|
openapi_any_of.each do |klass|
|
|
40
40
|
begin
|
|
41
41
|
next if klass == :AnyType # "nullable: true"
|
|
42
|
-
|
|
43
|
-
return typed_data if typed_data
|
|
42
|
+
return find_and_cast_into_type(klass, data)
|
|
44
43
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
45
44
|
end
|
|
46
45
|
end
|
|
@@ -40,8 +40,7 @@ module RagieRubySdk
|
|
|
40
40
|
openapi_one_of.each do |klass|
|
|
41
41
|
begin
|
|
42
42
|
next if klass == :AnyType # "nullable: true"
|
|
43
|
-
|
|
44
|
-
return typed_data if typed_data
|
|
43
|
+
return find_and_cast_into_type(klass, data)
|
|
45
44
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
46
45
|
end
|
|
47
46
|
end
|
|
@@ -38,8 +38,7 @@ module RagieRubySdk
|
|
|
38
38
|
openapi_any_of.each do |klass|
|
|
39
39
|
begin
|
|
40
40
|
next if klass == :AnyType # "nullable: true"
|
|
41
|
-
|
|
42
|
-
return typed_data if typed_data
|
|
41
|
+
return find_and_cast_into_type(klass, data)
|
|
43
42
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
44
43
|
end
|
|
45
44
|
end
|
|
@@ -37,8 +37,7 @@ module RagieRubySdk
|
|
|
37
37
|
openapi_any_of.each do |klass|
|
|
38
38
|
begin
|
|
39
39
|
next if klass == :AnyType # "nullable: true"
|
|
40
|
-
|
|
41
|
-
return typed_data if typed_data
|
|
40
|
+
return find_and_cast_into_type(klass, data)
|
|
42
41
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
43
42
|
end
|
|
44
43
|
end
|
|
@@ -39,8 +39,7 @@ module RagieRubySdk
|
|
|
39
39
|
openapi_one_of.each do |klass|
|
|
40
40
|
begin
|
|
41
41
|
next if klass == :AnyType # "nullable: true"
|
|
42
|
-
|
|
43
|
-
return typed_data if typed_data
|
|
42
|
+
return find_and_cast_into_type(klass, data)
|
|
44
43
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
45
44
|
end
|
|
46
45
|
end
|
|
@@ -37,8 +37,7 @@ module RagieRubySdk
|
|
|
37
37
|
openapi_any_of.each do |klass|
|
|
38
38
|
begin
|
|
39
39
|
next if klass == :AnyType # "nullable: true"
|
|
40
|
-
|
|
41
|
-
return typed_data if typed_data
|
|
40
|
+
return find_and_cast_into_type(klass, data)
|
|
42
41
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
43
42
|
end
|
|
44
43
|
end
|
|
@@ -37,8 +37,7 @@ module RagieRubySdk
|
|
|
37
37
|
openapi_any_of.each do |klass|
|
|
38
38
|
begin
|
|
39
39
|
next if klass == :AnyType # "nullable: true"
|
|
40
|
-
|
|
41
|
-
return typed_data if typed_data
|
|
40
|
+
return find_and_cast_into_type(klass, data)
|
|
42
41
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
43
42
|
end
|
|
44
43
|
end
|
|
@@ -38,8 +38,7 @@ module RagieRubySdk
|
|
|
38
38
|
openapi_any_of.each do |klass|
|
|
39
39
|
begin
|
|
40
40
|
next if klass == :AnyType # "nullable: true"
|
|
41
|
-
|
|
42
|
-
return typed_data if typed_data
|
|
41
|
+
return find_and_cast_into_type(klass, data)
|
|
43
42
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
44
43
|
end
|
|
45
44
|
end
|
|
@@ -37,8 +37,7 @@ module RagieRubySdk
|
|
|
37
37
|
openapi_any_of.each do |klass|
|
|
38
38
|
begin
|
|
39
39
|
next if klass == :AnyType # "nullable: true"
|
|
40
|
-
|
|
41
|
-
return typed_data if typed_data
|
|
40
|
+
return find_and_cast_into_type(klass, data)
|
|
42
41
|
rescue # rescue all errors so we keep iterating even if the current item lookup raises
|
|
43
42
|
end
|
|
44
43
|
end
|
|
@@ -35,7 +35,7 @@ describe 'DocumentsApi' do
|
|
|
35
35
|
# unit tests for create_document
|
|
36
36
|
# Create Document
|
|
37
37
|
# On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
38
|
-
# @param file The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
38
|
+
# @param file The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
39
39
|
# @param [Hash] opts the optional parameters
|
|
40
40
|
# @option opts [Mode2] :mode
|
|
41
41
|
# @option opts [Hash<String, MetadataValue1>] :metadata Metadata for the document. Keys must be strings. Values may be strings, numbers, booleans, or lists of strings. Numbers may be integers or floating point and will be converted to 64 bit floating point. 1000 total values are allowed. Each item in an array counts towards the total. The following keys are reserved for internal use: `document_id`, `document_type`, `document_source`, `document_name`, `document_uploaded_at`, `start_time`, `end_time`.
|
|
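Reviewer note: the only textual change in this hunk is the new sentence about PDFs over 2000 pages in hi_res mode. A caller that already knows a file's page count could guard against it before uploading, roughly as sketched below. `hi_res_check` is a hypothetical helper; the extension list is an assumption derived from the supported-file list above combined with the `mode` comment in body_update_document_file.rb (Word documents, PDFs, images, PowerPoints), and the page count must come from the caller because nothing here reads the PDF.

```ruby
# Assumed set of hi_res-capable extensions (see the note above).
HI_RES_EXTENSIONS = %w[
  .doc .docx .pdf .ppt .pptx
  .png .webp .jpg .jpeg .tiff .bmp .heic
].freeze
MAX_HI_RES_PDF_PAGES = 2000

# Hypothetical helper. Returns one of:
#   :ok             hi_res looks applicable
#   :not_applicable file type is outside the assumed hi_res-capable set
#   :too_many_pages PDF is over the documented 2000-page limit
#   :unknown        PDF without a known page count
def hi_res_check(filename, page_count: nil)
  ext = File.extname(filename).downcase
  return :not_applicable unless HI_RES_EXTENSIONS.include?(ext)
  return :ok unless ext == '.pdf'
  return :unknown if page_count.nil?

  page_count > MAX_HI_RES_PDF_PAGES ? :too_many_pages : :ok
end

puts hi_res_check('handbook.pdf', page_count: 2400)
```

What the service does with an oversized PDF submitted in hi_res mode is not stated in this diff, so the sketch only reports the condition and leaves the decision to the caller.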
@@ -51,7 +51,7 @@ describe 'DocumentsApi' do
|
|
|
51
51
|
|
|
52
52
|
# unit tests for create_document_from_url
|
|
53
53
|
# Create Document From Url
|
|
54
|
-
# Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
54
|
+
# Ingest a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
55
55
|
# @param create_document_from_url_params
|
|
56
56
|
# @param [Hash] opts the optional parameters
|
|
57
57
|
# @return [Document]
|
|
@@ -219,7 +219,7 @@ describe 'DocumentsApi' do
|
|
|
219
219
|
# unit tests for update_document_file
|
|
220
220
|
# Update Document File
|
|
221
221
|
# @param document_id The id of the document.
|
|
222
|
-
# @param file The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`.
|
|
222
|
+
# @param file The binary file to upload, extract, and index for retrieval. The following file types are supported: Plain Text: `.eml` `.html` `.json` `.md` `.msg` `.rst` `.rtf` `.txt` `.xml` Images: `.png` `.webp` `.jpg` `.jpeg` `.tiff` `.bmp` `.heic` Documents: `.csv` `.doc` `.docx` `.epub` `.epub+zip` `.odt` `.pdf` `.ppt` `.pptx` `.tsv` `.xlsx` `.xls`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
223
223
|
# @param [Hash] opts the optional parameters
|
|
224
224
|
# @option opts [String] :partition An optional partition to scope the request to. If omitted, accounts created after 1/9/2025 will have the request scoped to the default partition, while older accounts will have the request scoped to all partitions. Older accounts may opt in to strict partition scoping by contacting support@ragie.ai. Older accounts using the partitions feature are strongly recommended to scope the request to a partition.
|
|
225
225
|
# @option opts [Mode2] :mode
|
|
@@ -232,7 +232,7 @@ describe 'DocumentsApi' do
|
|
|
232
232
|
|
|
233
233
|
# unit tests for update_document_from_url
|
|
234
234
|
# Update Document Url
|
|
235
|
-
# Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`.
|
|
235
|
+
# Updates a document from a publicly accessible URL. On ingest, the document goes through a series of steps before it is ready for retrieval. Each step is reflected in the status of the document which can be one of [`pending`, `partitioning`, `partitioned`, `refined`, `chunked`, `indexed`, `summary_indexed`, `keyword_indexed`, `ready`, `failed`]. The document is available for retrieval once it is in ready state. The summary index step can take a few seconds. You can optionally use the document for retrieval once it is in `indexed` state. However the summary will only be available once the state has changed to `summary_indexed` or `ready`. PDF files over 2000 pages are not supported in hi_res mode.
|
|
236
236
|
# @param document_id The id of the document.
|
|
237
237
|
# @param update_document_from_url_params
|
|
238
238
|
# @param [Hash] opts the optional parameters
|