PyPI - parallel-web-tools - Versions diffs - 0.1.2rc2__tar.gz → 0.2.0__tar.gz - Mend

parallel-web-tools 0.1.2rc2__tar.gz → 0.2.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (47)

{parallel_web_tools-0.1.2rc2 → parallel_web_tools-0.2.0}/PKG-INFO RENAMED

@@ -1,7 +1,7 @@
 Metadata-Version: 2.4
 Name: parallel-web-tools
-Version: 0.1.2rc2
-Summary: Parallel Tools: CLI and data enrichment utilities for the Parallel API
+Version: 0.2.0
+Summary: Parallel Tools: CLI and Python SDK for AI-powered web intelligence
 Project-URL: Homepage, https://github.com/parallel-web/parallel-web-tools
 Project-URL: Documentation, https://docs.parallel.ai
 Project-URL: Repository, https://github.com/parallel-web/parallel-web-tools
@@ -24,7 +24,7 @@ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Requires-Python: >=3.10
 Requires-Dist: click>=8.1.0
 Requires-Dist: httpx>=0.25.0
-Requires-Dist: parallel-web>=0.4.1
+Requires-Dist: parallel-web>=0.4.2
 Requires-Dist: python-dotenv>=1.0.0
 Requires-Dist: rich>=13.0.0
 Provides-Extra: all
@@ -98,6 +98,7 @@ CLI and data enrichment utilities for the [Parallel API](https://docs.parallel.a
 - **Web Search** - AI-powered search with domain filtering and date ranges
 - **Content Extraction** - Extract clean markdown from any URL
 - **Data Enrichment** - Enrich CSV, JSON, DuckDB, and BigQuery data with AI
+- **Follow-up Context** - Chain research and enrichment tasks using `--previous-interaction-id`
 - **AI-Assisted Planning** - Use natural language to define what data you want
 - **Multiple Integrations** - Polars, DuckDB, Snowflake, BigQuery, Spark
@@ -110,10 +111,14 @@ Requires **Python 3.10+**.
 Install the standalone `parallel-cli` binary for search, extract, enrichment, and deep research (no Python required):
 ```bash
+# macOS / Linux (Homebrew)
+brew install parallel-web/tap/parallel-cli
+# macOS / Linux (shell script)
 curl -fsSL https://parallel.ai/install.sh | bash
 ```
-This automatically detects your platform (macOS/Linux, x64/arm64) and installs to `~/.local/bin`.
+The shell script automatically detects your platform (macOS/Linux, x64/arm64) and installs to `~/.local/bin`.
 > **Note:** The standalone binary supports `search`, `extract`, `research`, and `enrich run` with CLI arguments, CSV files, and JSON files. For YAML config files, interactive planner, DuckDB/BigQuery sources, or deployment commands, use pip install.
@@ -150,7 +155,7 @@ pip install parallel-web-tools[all]
 ```
 parallel-cli
 ├── auth                    # Check authentication status
-├── login                   # OAuth login (or use PARALLEL_API_KEY env var)
+├── login                   # OAuth login (--device for SSH/containers/CI, or use PARALLEL_API_KEY)
 ├── logout                  # Remove stored credentials
 ├── search                  # Web search
 ├── extract / fetch         # Extract content from URLs
@@ -172,6 +177,9 @@ parallel-cli
 │   ├── status              # Check status of a FindAll run
 │   ├── poll                # Poll until completion
 │   ├── result              # Fetch results of a completed run
+│   ├── enrich              # Enrich existing FindAll results with new columns
+│   ├── extend              # Request additional candidates for a run
+│   ├── schema              # Get the schema for a FindAll run
 │   └── cancel              # Cancel a running FindAll
 └── monitor                 # Continuous web change tracking
     ├── create              # Create a new web monitor
@@ -189,9 +197,12 @@ parallel-cli
 ### 1. Authenticate
 ```bash
-# Interactive OAuth login
+# Interactive OAuth login (opens browser)
 parallel-cli login
+# Device authorization flow — for SSH, containers, CI, or headless environments
+parallel-cli login --device
 # Or set environment variable
 export PARALLEL_API_KEY=your_api_key
 ```
@@ -283,13 +294,41 @@ echo "What is the latest funding for Anthropic?" | parallel-cli search - --json
 echo "Research question" | parallel-cli research run - --json
 # Async: launch then poll separately
-parallel-cli research run "question" --no-wait --json   # returns run_id
+parallel-cli research run "question" --no-wait --json   # returns run_id + interaction_id
 parallel-cli research status trun_xxx --json             # check status
 parallel-cli research poll trun_xxx --json               # wait and get result
+# Follow-up: reuse context from a previous task
+parallel-cli research run "follow-up question" --previous-interaction-id trun_xxx --json
+parallel-cli enrich run --data '[...]' --previous-interaction-id trun_xxx --json
 # Exit codes: 0=ok, 2=bad input, 3=auth error, 4=api error, 5=timeout
 ```
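The exit codes listed in the comment above lend themselves to a small lookup in a wrapper script. The following is a minimal sketch that assumes only the five codes named in the README; `ExitCode` and `describe_exit` are hypothetical names for illustration and are not part of the package.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed in the README comment (no others are assumed)."""

    OK = 0
    BAD_INPUT = 2
    AUTH_ERROR = 3
    API_ERROR = 4
    TIMEOUT = 5


def describe_exit(returncode: int) -> str:
    """Return a short label for a return code; unlisted codes are reported as unknown."""
    try:
        return ExitCode(returncode).name.lower()
    except ValueError:
        return f"unknown ({returncode})"
```

A caller could pass the `returncode` attribute of a `subprocess.run(...)` result to `describe_exit` to decide whether to retry (timeout) or stop (bad input, auth error).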
+### Follow-up research with context reuse
+Tasks return an `interaction_id` that can be passed as `--previous-interaction-id` on a subsequent research or enrichment run. The new task inherits the context from the prior one, so follow-up questions can reference earlier results without repeating them.
+```bash
+# Step 1: Run initial research (interaction_id is in the JSON output)
+parallel-cli research run "What are the top 3 AI companies?" --json --processor lite-fast
+# → { "run_id": "trun_abc", "interaction_id": "trun_abc", ... }
+# Step 2: Follow-up research referencing the first task's context
+parallel-cli research run "What products does the #1 company make?" \
+    --previous-interaction-id trun_abc --json
+# Step 3: Use research context for enrichment
+parallel-cli enrich run \
+    --data '[{"company": "Anthropic"}, {"company": "OpenAI"}]' \
+    --target enriched.csv \
+    --source-columns '[{"name": "company", "description": "Company name"}]' \
+    --enriched-columns '[{"name": "products", "description": "Main products"}]' \
+    --previous-interaction-id trun_abc --json
+```
+The `interaction_id` is shown in both human-readable and `--json` output for `research run`, `research status`, and `research poll`.
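The chaining steps above can also be scripted. This is a minimal sketch that assumes the `--json` output is a single JSON object with a top-level `interaction_id` field, as the sample output in step 1 suggests; `followup_command` is a hypothetical helper, not part of the package, and it only builds the argument list rather than running the CLI.

```python
import json


def followup_command(previous_output: str, question: str) -> list[str]:
    """Build argv for a follow-up `research run` from a prior run's --json output."""
    payload = json.loads(previous_output)
    # Assumption: interaction_id is a top-level key of the JSON output.
    interaction_id = payload.get("interaction_id")
    if not interaction_id:
        raise ValueError("no interaction_id in previous output")
    return [
        "parallel-cli", "research", "run", question,
        "--previous-interaction-id", interaction_id,
        "--json",
    ]


# Sample output copied from the README example above.
sample = '{"run_id": "trun_abc", "interaction_id": "trun_abc"}'
print(followup_command(sample, "What products does the #1 company make?"))
```

Building an argument list instead of a shell string avoids quoting problems when the question contains characters such as `#` or quotes.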
 ### More examples
 ```bash
@@ -354,9 +393,11 @@ print(result.result)
 **DuckDB:**
 ```python
 import duckdb
-from parallel_web_tools.integrations.duckdb import enrich_table
+from parallel_web_tools.integrations.duckdb import enrich_table, findall_table
 conn = duckdb.connect()
+# Enrich an existing table
 conn.execute("CREATE TABLE companies AS SELECT 'Google' as name")
 result = enrich_table(
     conn,
@@ -365,6 +406,14 @@ result = enrich_table(
     output_columns=["CEO name", "Founding year"],
 )
 print(result.result.fetchdf())
+# Discover entities with FindAll
+result = findall_table(
+    conn,
+    "countries that have won the FIFA World Cup and their capital cities",
+    match_limit=10,
+)
+result.result.show()
 ```
 ## Programmatic Usage
@@ -385,6 +434,56 @@ run_enrichment_from_dict({
 })
 ```
+### Device Authorization (RFC 8628)
+For headless environments (SSH, containers, CI), use the device authorization flow:
+```python
+from parallel_web_tools import request_device_code, poll_device_token
+# Step 1: Request a device code
+device_info = request_device_code()
+print(f"Go to: {device_info.verification_uri_complete}")
+# Step 2: Poll until the user authorizes
+token = poll_device_token(device_info.device_code)
+```
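For readers unfamiliar with the flow, the polling step in RFC 8628 (section 3.5) can be sketched generically. This is not the implementation of `poll_device_token`; it is an illustration under stated assumptions. `fetch_token` is a hypothetical callable that performs one token request and returns the decoded JSON response, and the default `interval` and `expires_in` values are placeholders for what the device authorization response would supply.

```python
import time
from typing import Callable


def poll_for_token(
    fetch_token: Callable[[], dict],
    interval: float = 5.0,
    expires_in: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the user authorizes, handling the RFC 8628 error codes."""
    waited = 0.0
    while waited < expires_in:
        response = fetch_token()
        if "access_token" in response:
            return response["access_token"]
        error = response.get("error")
        if error == "slow_down":
            # RFC 8628: increase the polling interval by 5 seconds.
            interval += 5.0
        elif error != "authorization_pending":
            # access_denied, expired_token, or anything unexpected is terminal.
            raise RuntimeError(f"device authorization failed: {error}")
        sleep(interval)
        waited += interval
    raise TimeoutError("device code expired before authorization")
```

Injecting `sleep` keeps the loop testable without real delays; a production caller would leave it at the default.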
+### FindAll
+Discover entities from the web using natural language:
+```python
+from parallel_web_tools import run_findall
+# Discover entities (auto-enriches by default)
+result = run_findall("AI startups in healthcare", match_limit=20)
+# Post-run operations
+from parallel_web_tools import enrich_findall, extend_findall, get_findall_schema
+schema = get_findall_schema(result.run_id)
+enriched = enrich_findall(result.run_id, ["funding amount", "number of employees"])
+extended = extend_findall(result.run_id, additional_matches=10)
+```
+### Monitor
+Track web changes programmatically:
+```python
+from parallel_web_tools import create_monitor, list_monitors, get_monitor
+# Create a monitor
+monitor = create_monitor(query="Track Tesla SEC filings", cadence="daily")
+# List all monitors
+monitors = list_monitors()
+# Get monitor details and events
+details = get_monitor(monitor.monitor_id)
+```
 ## YAML Configuration Format
 ```yaml

{parallel_web_tools-0.1.2rc2 → parallel_web_tools-0.2.0}/README.md RENAMED

@@ -13,6 +13,7 @@ CLI and data enrichment utilities for the [Parallel API](https://docs.parallel.a
 - **Web Search** - AI-powered search with domain filtering and date ranges
 - **Content Extraction** - Extract clean markdown from any URL
 - **Data Enrichment** - Enrich CSV, JSON, DuckDB, and BigQuery data with AI
+- **Follow-up Context** - Chain research and enrichment tasks using `--previous-interaction-id`
 - **AI-Assisted Planning** - Use natural language to define what data you want
 - **Multiple Integrations** - Polars, DuckDB, Snowflake, BigQuery, Spark
@@ -25,10 +26,14 @@ Requires **Python 3.10+**.
 Install the standalone `parallel-cli` binary for search, extract, enrichment, and deep research (no Python required):
 ```bash
+# macOS / Linux (Homebrew)
+brew install parallel-web/tap/parallel-cli
+# macOS / Linux (shell script)
 curl -fsSL https://parallel.ai/install.sh | bash
 ```
-This automatically detects your platform (macOS/Linux, x64/arm64) and installs to `~/.local/bin`.
+The shell script automatically detects your platform (macOS/Linux, x64/arm64) and installs to `~/.local/bin`.
 > **Note:** The standalone binary supports `search`, `extract`, `research`, and `enrich run` with CLI arguments, CSV files, and JSON files. For YAML config files, interactive planner, DuckDB/BigQuery sources, or deployment commands, use pip install.
@@ -65,7 +70,7 @@ pip install parallel-web-tools[all]
 ```
 parallel-cli
 ├── auth                    # Check authentication status
-├── login                   # OAuth login (or use PARALLEL_API_KEY env var)
+├── login                   # OAuth login (--device for SSH/containers/CI, or use PARALLEL_API_KEY)
 ├── logout                  # Remove stored credentials
 ├── search                  # Web search
 ├── extract / fetch         # Extract content from URLs
@@ -87,6 +92,9 @@ parallel-cli
 │   ├── status              # Check status of a FindAll run
 │   ├── poll                # Poll until completion
 │   ├── result              # Fetch results of a completed run
+│   ├── enrich              # Enrich existing FindAll results with new columns
+│   ├── extend              # Request additional candidates for a run
+│   ├── schema              # Get the schema for a FindAll run
 │   └── cancel              # Cancel a running FindAll
 └── monitor                 # Continuous web change tracking
     ├── create              # Create a new web monitor
@@ -104,9 +112,12 @@ parallel-cli
 ### 1. Authenticate
 ```bash
-# Interactive OAuth login
+# Interactive OAuth login (opens browser)
 parallel-cli login
+# Device authorization flow — for SSH, containers, CI, or headless environments
+parallel-cli login --device
 # Or set environment variable
 export PARALLEL_API_KEY=your_api_key
 ```
@@ -198,13 +209,41 @@ echo "What is the latest funding for Anthropic?" | parallel-cli search - --json
 echo "Research question" | parallel-cli research run - --json
 # Async: launch then poll separately
-parallel-cli research run "question" --no-wait --json   # returns run_id
+parallel-cli research run "question" --no-wait --json   # returns run_id + interaction_id
 parallel-cli research status trun_xxx --json             # check status
 parallel-cli research poll trun_xxx --json               # wait and get result
+# Follow-up: reuse context from a previous task
+parallel-cli research run "follow-up question" --previous-interaction-id trun_xxx --json
+parallel-cli enrich run --data '[...]' --previous-interaction-id trun_xxx --json
 # Exit codes: 0=ok, 2=bad input, 3=auth error, 4=api error, 5=timeout
 ```
+### Follow-up research with context reuse
+Tasks return an `interaction_id` that can be passed as `--previous-interaction-id` on a subsequent research or enrichment run. The new task inherits the context from the prior one, so follow-up questions can reference earlier results without repeating them.
+```bash
+# Step 1: Run initial research (interaction_id is in the JSON output)
+parallel-cli research run "What are the top 3 AI companies?" --json --processor lite-fast
+# → { "run_id": "trun_abc", "interaction_id": "trun_abc", ... }
+# Step 2: Follow-up research referencing the first task's context
+parallel-cli research run "What products does the #1 company make?" \
+    --previous-interaction-id trun_abc --json
+# Step 3: Use research context for enrichment
+parallel-cli enrich run \
+    --data '[{"company": "Anthropic"}, {"company": "OpenAI"}]' \
+    --target enriched.csv \
+    --source-columns '[{"name": "company", "description": "Company name"}]' \
+    --enriched-columns '[{"name": "products", "description": "Main products"}]' \
+    --previous-interaction-id trun_abc --json
+```
+The `interaction_id` is shown in both human-readable and `--json` output for `research run`, `research status`, and `research poll`.
 ### More examples
 ```bash
@@ -269,9 +308,11 @@ print(result.result)
 **DuckDB:**
 ```python
 import duckdb
-from parallel_web_tools.integrations.duckdb import enrich_table
+from parallel_web_tools.integrations.duckdb import enrich_table, findall_table
 conn = duckdb.connect()
+# Enrich an existing table
 conn.execute("CREATE TABLE companies AS SELECT 'Google' as name")
 result = enrich_table(
     conn,
@@ -280,6 +321,14 @@ result = enrich_table(
     output_columns=["CEO name", "Founding year"],
 )
 print(result.result.fetchdf())
+# Discover entities with FindAll
+result = findall_table(
+    conn,
+    "countries that have won the FIFA World Cup and their capital cities",
+    match_limit=10,
+)
+result.result.show()
 ```
 ## Programmatic Usage
@@ -300,6 +349,56 @@ run_enrichment_from_dict({
 })
 ```
+### Device Authorization (RFC 8628)
+For headless environments (SSH, containers, CI), use the device authorization flow:
+```python
+from parallel_web_tools import request_device_code, poll_device_token
+# Step 1: Request a device code
+device_info = request_device_code()
+print(f"Go to: {device_info.verification_uri_complete}")
+# Step 2: Poll until the user authorizes
+token = poll_device_token(device_info.device_code)
+```
+### FindAll
+Discover entities from the web using natural language:
+```python
+from parallel_web_tools import run_findall
+# Discover entities (auto-enriches by default)
+result = run_findall("AI startups in healthcare", match_limit=20)
+# Post-run operations
+from parallel_web_tools import enrich_findall, extend_findall, get_findall_schema
+schema = get_findall_schema(result.run_id)
+enriched = enrich_findall(result.run_id, ["funding amount", "number of employees"])
+extended = extend_findall(result.run_id, additional_matches=10)
+```
+### Monitor
+Track web changes programmatically:
+```python
+from parallel_web_tools import create_monitor, list_monitors, get_monitor
+# Create a monitor
+monitor = create_monitor(query="Track Tesla SEC filings", cadence="daily")
+# List all monitors
+monitors = list_monitors()
+# Get monitor details and events
+details = get_monitor(monitor.monitor_id)
+```
 ## YAML Configuration Format
 ```yaml

{parallel_web_tools-0.1.2rc2 → parallel_web_tools-0.2.0}/parallel_web_tools/__init__.py RENAMED

@@ -29,7 +29,7 @@ from parallel_web_tools.core import (
     run_tasks,
 )
-__version__ = "0.1.2rc2"
+__version__ = "0.2.0"
 __all__ = [
     # Auth
