perplexity-webui-scraper 0.3.3__tar.gz → 0.3.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (17) hide show
  1. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/PKG-INFO +71 -23
  2. perplexity_webui_scraper-0.3.4/README.md +182 -0
  3. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/pyproject.toml +9 -3
  4. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/__init__.py +2 -2
  5. perplexity_webui_scraper-0.3.4/src/perplexity_webui_scraper/cli/get_perplexity_session_token.py +216 -0
  6. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/config.py +1 -1
  7. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/constants.py +9 -35
  8. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/core.py +60 -13
  9. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/http.py +3 -0
  10. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/limits.py +2 -5
  11. perplexity_webui_scraper-0.3.4/src/perplexity_webui_scraper/models.py +73 -0
  12. perplexity_webui_scraper-0.3.3/README.md +0 -134
  13. perplexity_webui_scraper-0.3.3/src/perplexity_webui_scraper/models.py +0 -58
  14. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/enums.py +0 -0
  15. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/exceptions.py +0 -0
  16. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/py.typed +0 -0
  17. {perplexity_webui_scraper-0.3.3 → perplexity_webui_scraper-0.3.4}/src/perplexity_webui_scraper/types.py +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: perplexity-webui-scraper
3
- Version: 0.3.3
3
+ Version: 0.3.4
4
4
  Summary: Python scraper to extract AI responses from Perplexity's web interface.
5
5
  Keywords: perplexity,ai,scraper,webui,api,client
6
6
  Author: henrique-coder
@@ -19,7 +19,7 @@ Classifier: Programming Language :: Python :: 3.14
19
19
  Classifier: Topic :: Internet :: WWW/HTTP
20
20
  Classifier: Topic :: Software Development :: Libraries :: Python Modules
21
21
  Classifier: Typing :: Typed
22
- Requires-Dist: curl-cffi>=0.13.0
22
+ Requires-Dist: curl-cffi>=0.14.0
23
23
  Requires-Dist: orjson>=3.11.5
24
24
  Requires-Dist: pydantic>=2.12.5
25
25
  Requires-Python: >=3.10
@@ -52,15 +52,44 @@ uv pip install perplexity-webui-scraper
52
52
 
53
53
  ## Requirements
54
54
 
55
- - **Perplexity Pro subscription**
56
- - **Session token** (`__Secure-next-auth.session-token` cookie from browser)
55
+ - **Perplexity Pro/Max account**
56
+ - **Session token** (`__Secure-next-auth.session-token` cookie from your browser)
57
57
 
58
58
  ### Getting Your Session Token
59
59
 
60
+ You can obtain your session token in two ways:
61
+
62
+ #### Option 1: Automatic (CLI Tool)
63
+
64
+ The package includes a CLI tool to automatically generate and save your session token:
65
+
66
+ ```bash
67
+ get-perplexity-session-token
68
+ ```
69
+
70
+ This interactive tool will:
71
+
72
+ 1. Ask for your Perplexity email
73
+ 2. Send a verification code to your email
74
+ 3. Accept either a 6-digit code or magic link
75
+ 4. Extract and display your session token
76
+ 5. Optionally save it to your `.env` file
77
+
78
+ **Features:**
79
+
80
+ - Secure ephemeral session (cleared on exit)
81
+ - Automatic `.env` file management
82
+ - Support for both OTP codes and magic links
83
+ - Clean terminal interface with status updates
84
+
85
+ #### Option 2: Manual (Browser)
86
+
87
+ If you prefer to extract the token manually:
88
+
60
89
  1. Log in at [perplexity.ai](https://www.perplexity.ai)
61
- 2. Open DevTools (`F12`) → Application → Cookies
62
- 3. Copy `__Secure-next-auth.session-token` value
63
- 4. Store in `.env`: `PERPLEXITY_SESSION_TOKEN=your_token`
90
+ 2. Open DevTools (`F12`) → Application/Storage → Cookies
91
+ 3. Copy the value of `__Secure-next-auth.session-token`
92
+ 4. Store in `.env`: `PERPLEXITY_SESSION_TOKEN="your_token"`
64
93
 
65
94
  ## Quick Start
66
95
 
@@ -117,25 +146,34 @@ conversation.ask("Latest AI research", files=["paper.pdf"])
117
146
 
118
147
  ### `Conversation.ask(query, model?, files?, citation_mode?, stream?)`
119
148
 
120
- | Parameter | Type | Default | Description |
121
- | --------------- | -------------- | ------------- | ------------------- |
122
- | `query` | `str` | | Question (required) |
123
- | `model` | `Model` | `Models.BEST` | AI model |
124
- | `files` | `list[str]` | `None` | File paths |
125
- | `citation_mode` | `CitationMode` | `CLEAN` | Citation format |
126
- | `stream` | `bool` | `False` | Enable streaming |
149
+ | Parameter | Type | Default | Description |
150
+ | --------------- | ----------------------- | ------------- | ------------------- |
151
+ | `query` | `str` | - | Question (required) |
152
+ | `model` | `Model` | `Models.BEST` | AI model |
153
+ | `files` | `list[str \| PathLike]` | `None` | File paths |
154
+ | `citation_mode` | `CitationMode` | `CLEAN` | Citation format |
155
+ | `stream` | `bool` | `False` | Enable streaming |
127
156
 
128
157
  ### Models
129
158
 
130
- | Model | Description |
131
- | ------------------------------ | ----------------- |
132
- | `Models.BEST` | Auto-select best |
133
- | `Models.RESEARCH` | Deep research |
134
- | `Models.SONAR` | Fast queries |
135
- | `Models.GPT_51` | OpenAI GPT-5.1 |
136
- | `Models.CLAUDE_45_SONNET` | Claude 4.5 Sonnet |
137
- | `Models.GEMINI_3_PRO_THINKING` | Gemini 3.0 Pro |
138
- | `Models.GROK_41` | xAI Grok 4.1 |
159
+ | Model | Description |
160
+ | ---------------------------------- | ------------------------------------------------------------------------- |
161
+ | `Models.RESEARCH` | Research - Fast and thorough for routine research |
162
+ | `Models.LABS` | Labs - Multi-step tasks with advanced troubleshooting |
163
+ | `Models.BEST` | Best - Automatically selects the most responsive model based on the query |
164
+ | `Models.SONAR` | Sonar - Perplexity's fast model |
165
+ | `Models.GPT_52` | GPT-5.2 - OpenAI's latest model |
166
+ | `Models.GPT_52_THINKING` | GPT-5.2 Thinking - OpenAI's latest model with thinking |
167
+ | `Models.CLAUDE_45_OPUS` | Claude Opus 4.5 - Anthropic's Opus reasoning model |
168
+ | `Models.CLAUDE_45_OPUS_THINKING` | Claude Opus 4.5 Thinking - Anthropic's Opus reasoning model with thinking |
169
+ | `Models.GEMINI_3_PRO` | Gemini 3 Pro - Google's newest reasoning model |
170
+ | `Models.GEMINI_3_FLASH` | Gemini 3 Flash - Google's fast reasoning model |
171
+ | `Models.GEMINI_3_FLASH_THINKING` | Gemini 3 Flash Thinking - Google's fast reasoning model with thinking |
172
+ | `Models.GROK_41` | Grok 4.1 - xAI's latest advanced model |
173
+ | `Models.GROK_41_THINKING` | Grok 4.1 Thinking - xAI's latest reasoning model |
174
+ | `Models.KIMI_K2_THINKING` | Kimi K2 Thinking - Moonshot AI's latest reasoning model |
175
+ | `Models.CLAUDE_45_SONNET` | Claude Sonnet 4.5 - Anthropic's newest advanced model |
176
+ | `Models.CLAUDE_45_SONNET_THINKING` | Claude Sonnet 4.5 Thinking - Anthropic's newest reasoning model |
139
177
 
140
178
  ### CitationMode
141
179
 
@@ -159,6 +197,16 @@ conversation.ask("Latest AI research", files=["paper.pdf"])
159
197
  | `timezone` | `None` | Timezone |
160
198
  | `coordinates` | `None` | Location (lat/lng) |
161
199
 
200
+ ## CLI Tools
201
+
202
+ ### Session Token Generator
203
+
204
+ ```bash
205
+ get-perplexity-session-token
206
+ ```
207
+
208
+ Interactive tool to automatically obtain your Perplexity session token via email authentication. The token can be automatically saved to your `.env` file for immediate use.
209
+
162
210
  ## Disclaimer
163
211
 
164
212
  This is an **unofficial** library. It uses internal APIs that may change without notice. Use at your own risk. Not for production use.
@@ -0,0 +1,182 @@
1
+ <div align="center">
2
+
3
+ # Perplexity WebUI Scraper
4
+
5
+ Python scraper to extract AI responses from [Perplexity's](https://www.perplexity.ai) web interface.
6
+
7
+ [![PyPI](https://img.shields.io/pypi/v/perplexity-webui-scraper?color=blue)](https://pypi.org/project/perplexity-webui-scraper)
8
+ [![Python](https://img.shields.io/pypi/pyversions/perplexity-webui-scraper)](https://pypi.org/project/perplexity-webui-scraper)
9
+ [![License](https://img.shields.io/github/license/henrique-coder/perplexity-webui-scraper?color=green)](./LICENSE)
10
+
11
+ </div>
12
+
13
+ ---
14
+
15
+ ## Installation
16
+
17
+ ```bash
18
+ uv pip install perplexity-webui-scraper
19
+ ```
20
+
21
+ ## Requirements
22
+
23
+ - **Perplexity Pro/Max account**
24
+ - **Session token** (`__Secure-next-auth.session-token` cookie from your browser)
25
+
26
+ ### Getting Your Session Token
27
+
28
+ You can obtain your session token in two ways:
29
+
30
+ #### Option 1: Automatic (CLI Tool)
31
+
32
+ The package includes a CLI tool to automatically generate and save your session token:
33
+
34
+ ```bash
35
+ get-perplexity-session-token
36
+ ```
37
+
38
+ This interactive tool will:
39
+
40
+ 1. Ask for your Perplexity email
41
+ 2. Send a verification code to your email
42
+ 3. Accept either a 6-digit code or magic link
43
+ 4. Extract and display your session token
44
+ 5. Optionally save it to your `.env` file
45
+
46
+ **Features:**
47
+
48
+ - Secure ephemeral session (cleared on exit)
49
+ - Automatic `.env` file management
50
+ - Support for both OTP codes and magic links
51
+ - Clean terminal interface with status updates
52
+
53
+ #### Option 2: Manual (Browser)
54
+
55
+ If you prefer to extract the token manually:
56
+
57
+ 1. Log in at [perplexity.ai](https://www.perplexity.ai)
58
+ 2. Open DevTools (`F12`) → Application/Storage → Cookies
59
+ 3. Copy the value of `__Secure-next-auth.session-token`
60
+ 4. Store in `.env`: `PERPLEXITY_SESSION_TOKEN="your_token"`
61
+
62
+ ## Quick Start
63
+
64
+ ```python
65
+ from perplexity_webui_scraper import Perplexity
66
+
67
+ client = Perplexity(session_token="YOUR_TOKEN")
68
+ conversation = client.create_conversation()
69
+
70
+ conversation.ask("What is quantum computing?")
71
+ print(conversation.answer)
72
+
73
+ # Follow-up
74
+ conversation.ask("Explain it simpler")
75
+ print(conversation.answer)
76
+ ```
77
+
78
+ ### Streaming
79
+
80
+ ```python
81
+ for chunk in conversation.ask("Explain AI", stream=True):
82
+ print(chunk.answer)
83
+ ```
84
+
85
+ ### With Options
86
+
87
+ ```python
88
+ from perplexity_webui_scraper import (
89
+ ConversationConfig,
90
+ Coordinates,
91
+ Models,
92
+ SourceFocus,
93
+ )
94
+
95
+ config = ConversationConfig(
96
+ model=Models.RESEARCH,
97
+ source_focus=[SourceFocus.WEB, SourceFocus.ACADEMIC],
98
+ language="en-US",
99
+ coordinates=Coordinates(latitude=40.7128, longitude=-74.0060),
100
+ )
101
+
102
+ conversation = client.create_conversation(config)
103
+ conversation.ask("Latest AI research", files=["paper.pdf"])
104
+ ```
105
+
106
+ ## API
107
+
108
+ ### `Perplexity(session_token, config?)`
109
+
110
+ | Parameter | Type | Description |
111
+ | --------------- | -------------- | ------------------ |
112
+ | `session_token` | `str` | Browser cookie |
113
+ | `config` | `ClientConfig` | Timeout, TLS, etc. |
114
+
115
+ ### `Conversation.ask(query, model?, files?, citation_mode?, stream?)`
116
+
117
+ | Parameter | Type | Default | Description |
118
+ | --------------- | ----------------------- | ------------- | ------------------- |
119
+ | `query` | `str` | - | Question (required) |
120
+ | `model` | `Model` | `Models.BEST` | AI model |
121
+ | `files` | `list[str \| PathLike]` | `None` | File paths |
122
+ | `citation_mode` | `CitationMode` | `CLEAN` | Citation format |
123
+ | `stream` | `bool` | `False` | Enable streaming |
124
+
125
+ ### Models
126
+
127
+ | Model | Description |
128
+ | ---------------------------------- | ------------------------------------------------------------------------- |
129
+ | `Models.RESEARCH` | Research - Fast and thorough for routine research |
130
+ | `Models.LABS` | Labs - Multi-step tasks with advanced troubleshooting |
131
+ | `Models.BEST` | Best - Automatically selects the most responsive model based on the query |
132
+ | `Models.SONAR` | Sonar - Perplexity's fast model |
133
+ | `Models.GPT_52` | GPT-5.2 - OpenAI's latest model |
134
+ | `Models.GPT_52_THINKING` | GPT-5.2 Thinking - OpenAI's latest model with thinking |
135
+ | `Models.CLAUDE_45_OPUS` | Claude Opus 4.5 - Anthropic's Opus reasoning model |
136
+ | `Models.CLAUDE_45_OPUS_THINKING` | Claude Opus 4.5 Thinking - Anthropic's Opus reasoning model with thinking |
137
+ | `Models.GEMINI_3_PRO` | Gemini 3 Pro - Google's newest reasoning model |
138
+ | `Models.GEMINI_3_FLASH` | Gemini 3 Flash - Google's fast reasoning model |
139
+ | `Models.GEMINI_3_FLASH_THINKING` | Gemini 3 Flash Thinking - Google's fast reasoning model with thinking |
140
+ | `Models.GROK_41` | Grok 4.1 - xAI's latest advanced model |
141
+ | `Models.GROK_41_THINKING` | Grok 4.1 Thinking - xAI's latest reasoning model |
142
+ | `Models.KIMI_K2_THINKING` | Kimi K2 Thinking - Moonshot AI's latest reasoning model |
143
+ | `Models.CLAUDE_45_SONNET` | Claude Sonnet 4.5 - Anthropic's newest advanced model |
144
+ | `Models.CLAUDE_45_SONNET_THINKING` | Claude Sonnet 4.5 Thinking - Anthropic's newest reasoning model |
145
+
146
+ ### CitationMode
147
+
148
+ | Mode | Output |
149
+ | ---------- | --------------------- |
150
+ | `DEFAULT` | `text[1]` |
151
+ | `MARKDOWN` | `text[1](url)` |
152
+ | `CLEAN` | `text` (no citations) |
153
+
154
+ ### ConversationConfig
155
+
156
+ | Parameter | Default | Description |
157
+ | ----------------- | ------------- | ------------------ |
158
+ | `model` | `Models.BEST` | Default model |
159
+ | `citation_mode` | `CLEAN` | Citation format |
160
+ | `save_to_library` | `False` | Save to library |
161
+ | `search_focus` | `WEB` | Search type |
162
+ | `source_focus` | `WEB` | Source types |
163
+ | `time_range` | `ALL` | Time filter |
164
+ | `language` | `"en-US"` | Response language |
165
+ | `timezone` | `None` | Timezone |
166
+ | `coordinates` | `None` | Location (lat/lng) |
167
+
168
+ ## CLI Tools
169
+
170
+ ### Session Token Generator
171
+
172
+ ```bash
173
+ get-perplexity-session-token
174
+ ```
175
+
176
+ Interactive tool to automatically obtain your Perplexity session token via email authentication. The token can be automatically saved to your `.env` file for immediate use.
177
+
178
+ ## Disclaimer
179
+
180
+ This is an **unofficial** library. It uses internal APIs that may change without notice. Use at your own risk. Not for production use.
181
+
182
+ By using this library, you agree to Perplexity AI's Terms of Service.
@@ -1,6 +1,6 @@
1
1
  [project]
2
2
  name = "perplexity-webui-scraper"
3
- version = "0.3.3"
3
+ version = "0.3.4"
4
4
  description = "Python scraper to extract AI responses from Perplexity's web interface."
5
5
  authors = [{ name = "henrique-coder", email = "henriquemoreira10fk@gmail.com" }]
6
6
  license = "MIT"
@@ -23,18 +23,21 @@ classifiers = [
23
23
  "Typing :: Typed",
24
24
  ]
25
25
  dependencies = [
26
- "curl-cffi>=0.13.0",
26
+ "curl-cffi>=0.14.0",
27
27
  "orjson>=3.11.5",
28
28
  "pydantic>=2.12.5",
29
29
  ]
30
30
 
31
31
  [dependency-groups]
32
32
  dev = [
33
+ "beautifulsoup4>=4.14.3",
34
+ "jsbeautifier>=1.15.4",
35
+ "lxml>=6.0.2",
33
36
  "python-dotenv>=1.2.1",
34
37
  "rich>=14.2.0",
35
38
  ]
36
39
  lint = [
37
- "ruff>=0.14.8",
40
+ "ruff>=0.14.10",
38
41
  ]
39
42
  tests = [
40
43
  "pytest>=9.0.2",
@@ -98,6 +101,9 @@ line-ending = "lf" # Unix-style line endings
98
101
  docstring-code-format = true # Format code examples inside docstrings
99
102
  skip-magic-trailing-comma = false # Preserve trailing commas as formatting hints
100
103
 
104
+ [project.scripts]
105
+ get-perplexity-session-token = "perplexity_webui_scraper.cli.get_perplexity_session_token:get_token"
106
+
101
107
  [build-system]
102
108
  requires = ["uv_build"]
103
109
  build-backend = "uv_build"
@@ -1,6 +1,6 @@
1
1
  """Extract AI responses from Perplexity's web interface."""
2
2
 
3
- from importlib.metadata import version
3
+ from importlib import metadata
4
4
 
5
5
  from .config import ClientConfig, ConversationConfig
6
6
  from .core import Conversation, Perplexity
@@ -16,7 +16,7 @@ from .models import Model, Models
16
16
  from .types import Coordinates, Response, SearchResultItem
17
17
 
18
18
 
19
- __version__: str = version("perplexity-webui-scraper")
19
+ __version__: str = metadata.version("perplexity-webui-scraper")
20
20
  __all__: list[str] = [
21
21
  "AuthenticationError",
22
22
  "CitationMode",
@@ -0,0 +1,216 @@
1
+ """CLI utility for secure Perplexity authentication and session extraction."""
2
+
3
+ from __future__ import annotations
4
+
5
+ from pathlib import Path
6
+ from sys import exit
7
+ from typing import NoReturn
8
+
9
+ from curl_cffi.requests import Session
10
+ from rich.console import Console
11
+ from rich.panel import Panel
12
+ from rich.prompt import Confirm, Prompt
13
+
14
+
15
+ # Constants
16
+ BASE_URL: str = "https://www.perplexity.ai"
17
+ ENV_KEY: str = "PERPLEXITY_SESSION_TOKEN"
18
+
19
+
20
+ # Initialize console on stderr to ensure secure alternate screen usage
21
+ console = Console(stderr=True, soft_wrap=True)
22
+
23
+
24
def update_env(token: str) -> bool:
    """
    Securely updates the .env file with the session token.

    Preserves existing content and comments. The variable is matched by its
    exact key name (not by prefix), so similarly named entries such as
    ``PERPLEXITY_SESSION_TOKEN_BACKUP`` are left untouched.

    Returns True on success, False on any filesystem error.
    """

    path = Path(".env")
    line_entry = f'{ENV_KEY}="{token}"'

    try:
        lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
        updated = False
        new_lines = []

        for line in lines:
            # Compare the exact key left of "=" (whitespace-trimmed) instead of
            # a prefix match, so keys that merely start with ENV_KEY survive.
            if line.split("=", 1)[0].strip() == ENV_KEY:
                new_lines.append(line_entry)
                updated = True
            else:
                new_lines.append(line)

        if not updated:
            # Keep a blank separator line before appending the new entry.
            if new_lines and new_lines[-1] != "":
                new_lines.append("")

            new_lines.append(line_entry)

        path.write_text("\n".join(new_lines) + "\n", encoding="utf-8")

        return True
    except Exception:
        # Best-effort: the caller falls back to just displaying the token.
        return False
57
+
58
+
59
def _initialize_session() -> tuple[Session, str]:
    """Create an impersonated HTTP session and fetch the NextAuth CSRF token."""

    http = Session(impersonate="chrome", headers={"Referer": BASE_URL, "Origin": BASE_URL})

    with console.status("[bold green]Initializing secure connection...", spinner="dots"):
        # Warm up cookies first, then ask the auth endpoint for the CSRF token.
        http.get(BASE_URL)
        csrf_token = http.get(f"{BASE_URL}/api/auth/csrf").json().get("csrfToken")

    if not csrf_token:
        raise ValueError("Failed to obtain CSRF token.")

    return http, csrf_token
73
+
74
+
75
def _request_verification_code(session: Session, csrf: str, email: str) -> None:
    """Ask Perplexity to email a one-time verification code to *email*."""

    payload = {
        "email": email,
        "csrfToken": csrf,
        "useNumericOtp": "true",
        "json": "true",
        "callbackUrl": f"{BASE_URL}/?login-source=floatingSignup",
    }

    with console.status("[bold green]Sending verification code...", spinner="dots"):
        response = session.post(
            f"{BASE_URL}/api/auth/signin/email?version=2.18&source=default",
            json=payload,
        )

    if response.status_code != 200:
        raise ValueError(f"Authentication request failed: {response.text}")
92
+
93
+
94
def _validate_and_get_redirect_url(session: Session, email: str, user_input: str) -> str:
    """Resolve *user_input* (a 6-digit OTP or a pasted magic link) to a redirect URL."""

    with console.status("[bold green]Validating...", spinner="dots"):
        # A pasted magic link already is the redirect target.
        if user_input.startswith("http"):
            return user_input

        response = session.post(
            f"{BASE_URL}/api/auth/otp-redirect-link",
            json={
                "email": email,
                "otp": user_input,
                "redirectUrl": f"{BASE_URL}/?login-source=floatingSignup",
                "emailLoginMethod": "web-otp",
            },
        )

        if response.status_code != 200:
            raise ValueError("Invalid verification code.")

        redirect_path = response.json().get("redirect")

        if not redirect_path:
            raise ValueError("No redirect URL received.")

        # Relative paths are resolved against the site root.
        if redirect_path.startswith("/"):
            return f"{BASE_URL}{redirect_path}"

        return redirect_path
120
+
121
+
122
def _extract_session_token(session: Session, redirect_url: str) -> str:
    """Follow the auth redirect and pull the session cookie from the cookie jar."""

    session.get(redirect_url)

    if token := session.cookies.get("__Secure-next-auth.session-token"):
        return token

    raise ValueError("Authentication successful, but token not found.")
132
+
133
+
134
def _display_and_save_token(token: str) -> None:
    """Print the token and offer to persist it into the local .env file."""

    console.print("\n[bold green]✅ Token generated successfully![/bold green]")
    console.print(f"\n[bold white]Your session token:[/bold white]\n[green]{token}[/green]\n")

    question = f"Save token to [bold yellow].env[/bold yellow] file ({ENV_KEY})?"

    if not Confirm.ask(question, default=True, console=console):
        return

    if update_env(token):
        console.print("[dim]Token saved to .env successfully.[/dim]")
    else:
        console.print("[red]Failed to save to .env file.[/red]")
147
+
148
+
149
def _show_header() -> None:
    """Render the welcome banner."""

    banner_body = (
        "[bold white]Perplexity WebUI Scraper[/bold white]\n\n"
        "Automatic session token generator via email authentication.\n"
        "[dim]All session data will be cleared on exit.[/dim]"
    )

    console.print(Panel(banner_body, title="🔐 Token Generator", border_style="cyan"))
161
+
162
+
163
def _show_exit_message() -> None:
    """Show the security note and block until the user confirms exit."""

    console.print("\n[bold yellow]⚠️ Security Note:[/bold yellow]")
    console.print("Press [bold white]ENTER[/bold white] to clear screen and exit.")
    console.input()
169
+
170
+
171
def get_token() -> NoReturn:
    """
    Executes the authentication flow within an ephemeral terminal screen.

    Handles CSRF, Email OTP/Link validation, and secure token display.
    Exits with status 0 on success or user interrupt, 1 on any error.
    """

    # console.screen() gives an alternate buffer so the token never lingers
    # in normal terminal scrollback after exit.
    with console.screen():
        try:
            _show_header()

            session, csrf = _initialize_session()

            console.print("\n[bold cyan]Step 1: Email Verification[/bold cyan]")
            email = Prompt.ask(" Enter your Perplexity email", console=console)
            _request_verification_code(session, csrf, email)

            console.print("\n[bold cyan]Step 2: Verification[/bold cyan]")
            console.print(" Check your email for a [bold]6-digit code[/bold] or [bold]magic link[/bold].")
            user_input = Prompt.ask(" Enter code or paste link", console=console).strip()

            redirect_url = _validate_and_get_redirect_url(session, email, user_input)
            token = _extract_session_token(session, redirect_url)

            _display_and_save_token(token)
            _show_exit_message()

            exit(0)
        except KeyboardInterrupt:
            # Ctrl+C is a normal way to leave — not an error.
            exit(0)
        except Exception as error:
            console.print(f"\n[bold red]⛔ Error:[/bold red] {error}")
            console.input("[dim]Press ENTER to exit...[/dim]")

            exit(1)


if __name__ == "__main__":
    get_token()
@@ -32,5 +32,5 @@ class ConversationConfig:
32
32
  class ClientConfig:
33
33
  """HTTP client settings."""
34
34
 
35
- timeout: int = 1800
35
+ timeout: int = 3600
36
36
  impersonate: str = "chrome"
@@ -1,8 +1,4 @@
1
- """Fixed constants and values for the Perplexity API.
2
-
3
- These are internal API values that should not be modified by users.
4
- They represent fixed parameters required by the Perplexity WebUI API.
5
- """
1
+ """Constants and values for the Perplexity internal API and HTTP interactions."""
6
2
 
7
3
  from __future__ import annotations
8
4
 
@@ -10,21 +6,14 @@ from re import Pattern, compile
10
6
  from typing import Final
11
7
 
12
8
 
13
- # =============================================================================
14
9
  # API Configuration
15
- # =============================================================================
16
-
17
10
  API_VERSION: Final[str] = "2.18"
18
11
  """Current API version used by Perplexity WebUI."""
19
12
 
20
13
  API_BASE_URL: Final[str] = "https://www.perplexity.ai"
21
14
  """Base URL for all API requests."""
22
15
 
23
-
24
- # =============================================================================
25
16
  # API Endpoints
26
- # =============================================================================
27
-
28
17
  ENDPOINT_ASK: Final[str] = "/rest/sse/perplexity_ask"
29
18
  """SSE endpoint for sending prompts."""
30
19
 
@@ -34,54 +23,39 @@ ENDPOINT_SEARCH_INIT: Final[str] = "/search/new"
34
23
  ENDPOINT_UPLOAD: Final[str] = "/rest/uploads/batch_create_upload_urls"
35
24
  """Endpoint for file upload URL generation."""
36
25
 
37
-
38
- # =============================================================================
39
26
  # API Fixed Parameters
40
- # =============================================================================
41
-
42
27
  SEND_BACK_TEXT: Final[bool] = True
43
- """Whether to receive full text in each streaming chunk.
28
+ """
29
+ Whether to receive full text in each streaming chunk.
44
30
 
45
31
  True = API sends complete text each chunk (replace mode).
46
32
  False = API sends delta chunks only (accumulate mode).
47
-
48
- Currently must be True for the parser to work correctly.
49
33
  """
50
34
 
51
35
  USE_SCHEMATIZED_API: Final[bool] = False
52
- """Whether to use the schematized API format.
53
-
54
- Currently must be False - schematized format is not supported.
55
- """
36
+ """Whether to use the schematized API format."""
56
37
 
57
38
  PROMPT_SOURCE: Final[str] = "user"
58
39
  """Source identifier for prompts."""
59
40
 
60
-
61
- # =============================================================================
62
- # Regex Patterns (Pre-compiled for performance)
63
- # =============================================================================
64
-
41
+ # Regex Patterns (Pre-compiled for performance in streaming parsing)
65
42
  CITATION_PATTERN: Final[Pattern[str]] = compile(r"\[(\d{1,2})\]")
66
- """Regex pattern for matching citation markers like [1], [2], etc.
43
+ """
44
+ Regex pattern for matching citation markers like [1], [2], etc.
67
45
 
68
46
  Uses word boundary to avoid matching things like [123].
69
- Pre-compiled for performance in streaming scenarios.
70
47
  """
71
48
 
72
49
  JSON_OBJECT_PATTERN: Final[Pattern[str]] = compile(r"^\{.*\}$")
73
50
  """Pattern to detect JSON object strings."""
74
51
 
75
-
76
- # =============================================================================
77
52
  # HTTP Headers
78
- # =============================================================================
79
-
80
53
  DEFAULT_HEADERS: Final[dict[str, str]] = {
81
54
  "Accept": "text/event-stream, application/json",
82
55
  "Content-Type": "application/json",
83
56
  }
84
- """Default HTTP headers for API requests.
57
+ """
58
+ Default HTTP headers for API requests.
85
59
 
86
60
  Referer and Origin are added dynamically based on BASE_URL.
87
61
  """
@@ -8,7 +8,7 @@ from pathlib import Path
8
8
  from typing import TYPE_CHECKING, Any
9
9
  from uuid import uuid4
10
10
 
11
- from orjson import loads
11
+ from orjson import JSONDecodeError, loads
12
12
 
13
13
 
14
14
  if TYPE_CHECKING:
@@ -57,10 +57,12 @@ class Perplexity:
57
57
 
58
58
  def create_conversation(self, config: ConversationConfig | None = None) -> Conversation:
59
59
  """Create a new conversation."""
60
+
60
61
  return Conversation(self._http, config or ConversationConfig())
61
62
 
62
63
  def close(self) -> None:
63
64
  """Close the client."""
65
+
64
66
  self._http.close()
65
67
 
66
68
  def __enter__(self) -> Perplexity:
@@ -103,55 +105,64 @@ class Conversation:
103
105
  @property
104
106
  def answer(self) -> str | None:
105
107
  """Last response text."""
108
+
106
109
  return self._answer
107
110
 
108
111
  @property
109
112
  def title(self) -> str | None:
110
113
  """Conversation title."""
114
+
111
115
  return self._title
112
116
 
113
117
  @property
114
118
  def search_results(self) -> list[SearchResultItem]:
115
119
  """Search results from last response."""
120
+
116
121
  return self._search_results
117
122
 
118
123
  @property
119
124
  def uuid(self) -> str | None:
120
125
  """Conversation UUID."""
126
+
121
127
  return self._backend_uuid
122
128
 
123
129
  def __iter__(self) -> Generator[Response, None, None]:
124
130
  if self._stream_generator is not None:
125
131
  yield from self._stream_generator
132
+
126
133
  self._stream_generator = None
127
134
 
128
135
  def ask(
129
136
  self,
130
137
  query: str,
131
138
  model: Model | None = None,
132
- files: list[str | PathLike[str]] | None = None,
139
+ files: list[str | PathLike] | None = None,
133
140
  citation_mode: CitationMode | None = None,
134
141
  stream: bool = False,
135
142
  ) -> Conversation:
136
143
  """Ask a question. Returns self for method chaining or streaming iteration."""
144
+
137
145
  effective_model = model or self._config.model or Models.BEST
138
146
  effective_citation = citation_mode if citation_mode is not None else self._config.citation_mode
139
147
  self._citation_mode = effective_citation
140
148
  self._execute(query, effective_model, files, stream=stream)
149
+
141
150
  return self
142
151
 
143
152
  def _execute(
144
153
  self,
145
154
  query: str,
146
155
  model: Model,
147
- files: list[str | PathLike[str]] | None,
156
+ files: list[str | PathLike] | None,
148
157
  stream: bool = False,
149
158
  ) -> None:
150
159
  """Execute a query."""
160
+
151
161
  self._reset_response_state()
152
162
 
153
163
  # Upload files
154
164
  file_urls: list[str] = []
165
+
155
166
  if files:
156
167
  validated = self._validate_files(files)
157
168
  file_urls = [self._upload_file(f) for f in validated]
@@ -172,15 +183,17 @@ class Conversation:
172
183
  self._raw_data = {}
173
184
  self._stream_generator = None
174
185
 
175
- def _validate_files(self, files: list[str | PathLike[str]] | None) -> list[_FileInfo]:
186
+ def _validate_files(self, files: list[str | PathLike] | None) -> list[_FileInfo]:
176
187
  if not files:
177
188
  return []
178
189
 
179
190
  seen: set[str] = set()
180
191
  file_list: list[Path] = []
192
+
181
193
  for item in files:
182
194
  if item and isinstance(item, (str, PathLike)):
183
195
  path = Path(item).resolve()
196
+
184
197
  if path.as_posix() not in seen:
185
198
  seen.add(path.as_posix())
186
199
  file_list.append(path)
@@ -203,11 +216,13 @@ class Conversation:
203
216
  raise FileValidationError(file_path, "Path is not a file")
204
217
 
205
218
  file_size = path.stat().st_size
219
+
206
220
  if file_size > MAX_FILE_SIZE:
207
221
  raise FileValidationError(
208
222
  file_path,
209
223
  f"File exceeds 50MB limit: {file_size / (1024 * 1024):.1f}MB",
210
224
  )
225
+
211
226
  if file_size == 0:
212
227
  raise FileValidationError(file_path, "File is empty")
213
228
 
@@ -224,10 +239,10 @@ class Conversation:
224
239
  )
225
240
  except FileValidationError:
226
241
  raise
227
- except (FileNotFoundError, PermissionError) as e:
228
- raise FileValidationError(file_path, f"Cannot access file: {e}") from e
229
- except OSError as e:
230
- raise FileValidationError(file_path, f"File system error: {e}") from e
242
+ except (FileNotFoundError, PermissionError) as error:
243
+ raise FileValidationError(file_path, f"Cannot access file: {error}") from error
244
+ except OSError as error:
245
+ raise FileValidationError(file_path, f"File system error: {error}") from error
231
246
 
232
247
  return result
233
248
 
@@ -255,8 +270,8 @@ class Conversation:
255
270
  raise FileUploadError(file_info.path, "No upload URL returned")
256
271
 
257
272
  return upload_url
258
- except FileUploadError:
259
- raise
273
+ except FileUploadError as error:
274
+ raise error
260
275
  except Exception as e:
261
276
  raise FileUploadError(file_info.path, str(e)) from e
262
277
 
@@ -301,6 +316,7 @@ class Conversation:
301
316
  if self._backend_uuid is not None:
302
317
  params["last_backend_uuid"] = self._backend_uuid
303
318
  params["query_source"] = "followup"
319
+
304
320
  if self._read_write_token:
305
321
  params["read_write_token"] = self._read_write_token
306
322
 
@@ -312,6 +328,7 @@ class Conversation:
312
328
 
313
329
  def replacer(m: Match[str]) -> str:
314
330
  num = m.group(1)
331
+
315
332
  if not num.isdigit():
316
333
  return m.group(0)
317
334
 
@@ -319,8 +336,10 @@ class Conversation:
319
336
  return ""
320
337
 
321
338
  idx = int(num) - 1
339
+
322
340
  if 0 <= idx < len(self._search_results):
323
341
  url = self._search_results[idx].url or ""
342
+
324
343
  if self._citation_mode == CitationMode.MARKDOWN and url:
325
344
  return f"[{num}]({url})"
326
345
 
@@ -330,8 +349,10 @@ class Conversation:
330
349
 
331
350
  def _parse_line(self, line: str | bytes) -> dict[str, Any] | None:
332
351
  prefix = b"data: " if isinstance(line, bytes) else "data: "
352
+
333
353
  if (isinstance(line, bytes) and line.startswith(prefix)) or (isinstance(line, str) and line.startswith(prefix)):
334
354
  return loads(line[6:])
355
+
335
356
  return None
336
357
 
337
358
  def _process_data(self, data: dict[str, Any]) -> None:
@@ -341,10 +362,25 @@ class Conversation:
341
362
  if self._read_write_token is None and "read_write_token" in data:
342
363
  self._read_write_token = data["read_write_token"]
343
364
 
344
- if "text" not in data:
345
- return
365
+ if "blocks" in data:
366
+ for block in data["blocks"]:
367
+ if block.get("intended_usage") == "web_results":
368
+ diff = block.get("diff_block", {})
369
+
370
+ for patch in diff.get("patches", []):
371
+ if patch.get("op") == "replace" and patch.get("path") == "/web_results":
372
+ pass
373
+
374
+ if "text" not in data and "blocks" not in data:
375
+ return None
376
+
377
+ try:
378
+ json_data = loads(data["text"])
379
+ except KeyError as e:
380
+ raise ValueError("Missing 'text' field in data") from e
381
+ except JSONDecodeError as e:
382
+ raise ValueError("Invalid JSON in 'text' field") from e
346
383
 
347
- json_data = loads(data["text"])
348
384
  answer_data: dict[str, Any] = {}
349
385
 
350
386
  if isinstance(json_data, list):
@@ -359,14 +395,18 @@ class Conversation:
359
395
  answer_data = raw_content
360
396
 
361
397
  self._update_state(data.get("thread_title"), answer_data)
398
+
362
399
  break
363
400
  elif isinstance(json_data, dict):
364
401
  self._update_state(data.get("thread_title"), json_data)
402
+ else:
403
+ raise ValueError("Unexpected JSON structure in 'text' field")
365
404
 
366
405
  def _update_state(self, title: str | None, answer_data: dict[str, Any]) -> None:
367
406
  self._title = title
368
407
 
369
408
  web_results = answer_data.get("web_results", [])
409
+
370
410
  if web_results:
371
411
  self._search_results = [
372
412
  SearchResultItem(
@@ -379,10 +419,12 @@ class Conversation:
379
419
  ]
380
420
 
381
421
  answer_text = answer_data.get("answer")
422
+
382
423
  if answer_text is not None:
383
424
  self._answer = self._format_citations(answer_text)
384
425
 
385
426
  chunks = answer_data.get("chunks", [])
427
+
386
428
  if chunks:
387
429
  self._chunks = chunks
388
430
 
@@ -402,16 +444,21 @@ class Conversation:
402
444
  def _complete(self, payload: dict[str, Any]) -> None:
403
445
  for line in self._http.stream_ask(payload):
404
446
  data = self._parse_line(line)
447
+
405
448
  if data:
406
449
  self._process_data(data)
450
+
407
451
  if data.get("final"):
408
452
  break
409
453
 
410
454
  def _stream(self, payload: dict[str, Any]) -> Generator[Response, None, None]:
411
455
  for line in self._http.stream_ask(payload):
412
456
  data = self._parse_line(line)
457
+
413
458
  if data:
414
459
  self._process_data(data)
460
+
415
461
  yield self._build_response()
462
+
416
463
  if data.get("final"):
417
464
  break
@@ -46,6 +46,7 @@ class HTTPClient:
46
46
  "Origin": API_BASE_URL,
47
47
  }
48
48
  cookies: dict[str, str] = {SESSION_COOKIE_NAME: session_token}
49
+
49
50
  self._session: Session = Session(
50
51
  headers=headers,
51
52
  cookies=cookies,
@@ -101,6 +102,7 @@ class HTTPClient:
101
102
  try:
102
103
  response = self._session.get(url, params=params)
103
104
  response.raise_for_status()
105
+
104
106
  return response
105
107
  except Exception as e:
106
108
  self._handle_error(e, f"GET {endpoint}: ")
@@ -132,6 +134,7 @@ class HTTPClient:
132
134
  try:
133
135
  response = self._session.post(url, json=json, stream=stream)
134
136
  response.raise_for_status()
137
+
135
138
  return response
136
139
  except Exception as e:
137
140
  self._handle_error(e, f"POST {endpoint}: ")
@@ -10,11 +10,8 @@ MAX_FILES: Final[int] = 30
10
10
  """Maximum number of files that can be attached to a single prompt."""
11
11
 
12
12
  MAX_FILE_SIZE: Final[int] = 50 * 1024 * 1024 # 50 MB in bytes
13
- """Maximum file size in bytes (50 MB)."""
13
+ """Maximum file size in bytes."""
14
14
 
15
15
  # Request Limits
16
16
  DEFAULT_TIMEOUT: Final[int] = 30 * 60 # 30 minutes in seconds
17
- """Default request timeout in seconds (30 minutes).
18
-
19
- Set high to accommodate complex models that may take longer to respond.
20
- """
17
+ """Default request timeout in seconds"""
@@ -0,0 +1,73 @@
1
+ """AI model definitions for Perplexity WebUI Scraper."""
2
+
3
+ from __future__ import annotations
4
+
5
+ from dataclasses import dataclass
6
+
7
+
8
+ @dataclass(frozen=True, slots=True)
9
+ class Model:
10
+ """AI model configuration.
11
+
12
+ Attributes:
13
+ identifier: Model identifier used by the API.
14
+ mode: Model execution mode. Default: "copilot".
15
+ """
16
+
17
+ identifier: str
18
+ mode: str = "copilot"
19
+
20
+
21
+ class Models:
22
+ """Available AI models with their configurations.
23
+
24
+ All models use the "copilot" mode which enables web search.
25
+ """
26
+
27
+ RESEARCH = Model(identifier="pplx_alpha")
28
+ """Research - Fast and thorough for routine research"""
29
+
30
+ LABS = Model(identifier="pplx_beta")
31
+ """Labs - Multi-step tasks with advanced troubleshooting"""
32
+
33
+ BEST = Model(identifier="pplx_pro_upgraded")
34
+ """Best - Automatically selects the most responsive model based on the query"""
35
+
36
+ SONAR = Model(identifier="experimental")
37
+ """Sonar - Perplexity's fast model"""
38
+
39
+ GPT_52 = Model(identifier="gpt52")
40
+ """GPT-5.2 - OpenAI's latest model"""
41
+
42
+ GPT_52_THINKING = Model(identifier="gpt52_thinking")
43
+ """GPT-5.2 Thinking - OpenAI's latest model with thinking"""
44
+
45
+ CLAUDE_45_OPUS = Model(identifier="claude45opus")
46
+ """Claude Opus 4.5 - Anthropic's Opus reasoning model"""
47
+
48
+ CLAUDE_45_OPUS_THINKING = Model(identifier="claude45opusthinking")
49
+ """Claude Opus 4.5 Thinking - Anthropic's Opus reasoning model with thinking"""
50
+
51
+ GEMINI_3_PRO = Model(identifier="gemini30pro")
52
+ """Gemini 3 Pro - Google's newest reasoning model"""
53
+
54
+ GEMINI_3_FLASH = Model(identifier="gemini30flash")
55
+ """Gemini 3 Flash - Google's fast reasoning model"""
56
+
57
+ GEMINI_3_FLASH_THINKING = Model(identifier="gemini30flash_high")
58
+ """Gemini 3 Flash Thinking - Google's fast reasoning model with enhanced thinking"""
59
+
60
+ GROK_41 = Model(identifier="grok41nonreasoning")
61
+ """Grok 4.1 - xAI's latest advanced model"""
62
+
63
+ GROK_41_THINKING = Model(identifier="grok41reasoning")
64
+ """Grok 4.1 Thinking - xAI's latest reasoning model"""
65
+
66
+ KIMI_K2_THINKING = Model(identifier="kimik2thinking")
67
+ """Kimi K2 Thinking - Moonshot AI's latest reasoning model"""
68
+
69
+ CLAUDE_45_SONNET = Model(identifier="claude45sonnet")
70
+ """Claude Sonnet 4.5 - Anthropic's newest advanced model"""
71
+
72
+ CLAUDE_45_SONNET_THINKING = Model(identifier="claude45sonnetthinking")
73
+ """Claude Sonnet 4.5 Thinking - Anthropic's newest reasoning model"""
@@ -1,134 +0,0 @@
1
- <div align="center">
2
-
3
- # Perplexity WebUI Scraper
4
-
5
- Python scraper to extract AI responses from [Perplexity's](https://www.perplexity.ai) web interface.
6
-
7
- [![PyPI](https://img.shields.io/pypi/v/perplexity-webui-scraper?color=blue)](https://pypi.org/project/perplexity-webui-scraper)
8
- [![Python](https://img.shields.io/pypi/pyversions/perplexity-webui-scraper)](https://pypi.org/project/perplexity-webui-scraper)
9
- [![License](https://img.shields.io/github/license/henrique-coder/perplexity-webui-scraper?color=green)](./LICENSE)
10
-
11
- </div>
12
-
13
- ---
14
-
15
- ## Installation
16
-
17
- ```bash
18
- uv pip install perplexity-webui-scraper
19
- ```
20
-
21
- ## Requirements
22
-
23
- - **Perplexity Pro subscription**
24
- - **Session token** (`__Secure-next-auth.session-token` cookie from browser)
25
-
26
- ### Getting Your Session Token
27
-
28
- 1. Log in at [perplexity.ai](https://www.perplexity.ai)
29
- 2. Open DevTools (`F12`) → Application → Cookies
30
- 3. Copy `__Secure-next-auth.session-token` value
31
- 4. Store in `.env`: `PERPLEXITY_SESSION_TOKEN=your_token`
32
-
33
- ## Quick Start
34
-
35
- ```python
36
- from perplexity_webui_scraper import Perplexity
37
-
38
- client = Perplexity(session_token="YOUR_TOKEN")
39
- conversation = client.create_conversation()
40
-
41
- conversation.ask("What is quantum computing?")
42
- print(conversation.answer)
43
-
44
- # Follow-up
45
- conversation.ask("Explain it simpler")
46
- print(conversation.answer)
47
- ```
48
-
49
- ### Streaming
50
-
51
- ```python
52
- for chunk in conversation.ask("Explain AI", stream=True):
53
- print(chunk.answer)
54
- ```
55
-
56
- ### With Options
57
-
58
- ```python
59
- from perplexity_webui_scraper import (
60
- ConversationConfig,
61
- Coordinates,
62
- Models,
63
- SourceFocus,
64
- )
65
-
66
- config = ConversationConfig(
67
- model=Models.RESEARCH,
68
- source_focus=[SourceFocus.WEB, SourceFocus.ACADEMIC],
69
- language="en-US",
70
- coordinates=Coordinates(latitude=40.7128, longitude=-74.0060),
71
- )
72
-
73
- conversation = client.create_conversation(config)
74
- conversation.ask("Latest AI research", files=["paper.pdf"])
75
- ```
76
-
77
- ## API
78
-
79
- ### `Perplexity(session_token, config?)`
80
-
81
- | Parameter | Type | Description |
82
- | --------------- | -------------- | ------------------ |
83
- | `session_token` | `str` | Browser cookie |
84
- | `config` | `ClientConfig` | Timeout, TLS, etc. |
85
-
86
- ### `Conversation.ask(query, model?, files?, citation_mode?, stream?)`
87
-
88
- | Parameter | Type | Default | Description |
89
- | --------------- | -------------- | ------------- | ------------------- |
90
- | `query` | `str` | — | Question (required) |
91
- | `model` | `Model` | `Models.BEST` | AI model |
92
- | `files` | `list[str]` | `None` | File paths |
93
- | `citation_mode` | `CitationMode` | `CLEAN` | Citation format |
94
- | `stream` | `bool` | `False` | Enable streaming |
95
-
96
- ### Models
97
-
98
- | Model | Description |
99
- | ------------------------------ | ----------------- |
100
- | `Models.BEST` | Auto-select best |
101
- | `Models.RESEARCH` | Deep research |
102
- | `Models.SONAR` | Fast queries |
103
- | `Models.GPT_51` | OpenAI GPT-5.1 |
104
- | `Models.CLAUDE_45_SONNET` | Claude 4.5 Sonnet |
105
- | `Models.GEMINI_3_PRO_THINKING` | Gemini 3.0 Pro |
106
- | `Models.GROK_41` | xAI Grok 4.1 |
107
-
108
- ### CitationMode
109
-
110
- | Mode | Output |
111
- | ---------- | --------------------- |
112
- | `DEFAULT` | `text[1]` |
113
- | `MARKDOWN` | `text[1](url)` |
114
- | `CLEAN` | `text` (no citations) |
115
-
116
- ### ConversationConfig
117
-
118
- | Parameter | Default | Description |
119
- | ----------------- | ------------- | ------------------ |
120
- | `model` | `Models.BEST` | Default model |
121
- | `citation_mode` | `CLEAN` | Citation format |
122
- | `save_to_library` | `False` | Save to library |
123
- | `search_focus` | `WEB` | Search type |
124
- | `source_focus` | `WEB` | Source types |
125
- | `time_range` | `ALL` | Time filter |
126
- | `language` | `"en-US"` | Response language |
127
- | `timezone` | `None` | Timezone |
128
- | `coordinates` | `None` | Location (lat/lng) |
129
-
130
- ## Disclaimer
131
-
132
- This is an **unofficial** library. It uses internal APIs that may change without notice. Use at your own risk. Not for production use.
133
-
134
- By using this library, you agree to Perplexity AI's Terms of Service.
@@ -1,58 +0,0 @@
1
- """AI model definitions for Perplexity WebUI Scraper."""
2
-
3
- from __future__ import annotations
4
-
5
- from dataclasses import dataclass
6
-
7
-
8
- @dataclass(frozen=True, slots=True)
9
- class Model:
10
- """AI model configuration.
11
-
12
- Attributes:
13
- identifier: Model identifier used by the API.
14
- mode: Model execution mode. Default: "copilot".
15
- """
16
-
17
- identifier: str
18
- mode: str = "copilot"
19
-
20
-
21
- class Models:
22
- """Available AI models with their configurations.
23
-
24
- All models use the "copilot" mode which enables web search.
25
- """
26
-
27
- LABS = Model(identifier="pplx_beta")
28
- """Create projects from scratch (turn your ideas into completed docs, slides, dashboards, and more)."""
29
-
30
- RESEARCH = Model(identifier="pplx_alpha")
31
- """Deep research on any topic (in-depth reports with more sources, charts, and advanced reasoning)."""
32
-
33
- BEST = Model(identifier="pplx_pro")
34
- """Automatically selects the best model based on the query. Recommended for most use cases."""
35
-
36
- SONAR = Model(identifier="experimental")
37
- """Perplexity's fast model. Good for quick queries."""
38
-
39
- GPT_51 = Model(identifier="gpt51")
40
- """OpenAI's latest model (GPT-5.1)."""
41
-
42
- GPT_51_THINKING = Model(identifier="gpt51_thinking")
43
- """OpenAI's latest model with extended reasoning capabilities."""
44
-
45
- CLAUDE_45_SONNET = Model(identifier="claude45sonnet")
46
- """Anthropic's Claude 4.5 Sonnet model."""
47
-
48
- CLAUDE_45_SONNET_THINKING = Model(identifier="claude45sonnetthinking")
49
- """Anthropic's Claude 4.5 Sonnet with extended reasoning capabilities."""
50
-
51
- GEMINI_3_PRO_THINKING = Model(identifier="gemini30pro")
52
- """Google's Gemini 3.0 Pro with reasoning capabilities."""
53
-
54
- GROK_41 = Model(identifier="grok41nonreasoning")
55
- """xAI's Grok 4.1 model."""
56
-
57
- KIMI_K2_THINKING = Model(identifier="kimik2thinking")
58
- """Moonshot AI's Kimi K2 reasoning model (hosted in the US)."""