PyPI - perplexity-web-mcp-cli - Versions diffs - 0.12.0__tar.gz → 0.12.2__tar.gz - Mend

perplexity-web-mcp-cli 0.12.0 (tar.gz) → 0.12.2 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (42)

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: perplexity-web-mcp-cli
-Version: 0.12.0
+Version: 0.12.2
 Summary: CLI, MCP server, and Anthropic/OpenAI API-compatible interface for Perplexity AI.
 Keywords: perplexity,ai,mcp,anthropic,api,client
 Author: Jacob BD
@@ -41,6 +41,13 @@ Description-Content-Type: text/markdown
 # Perplexity Web MCP & CLI
+[![PyPI version](https://img.shields.io/pypi/v/perplexity-web-mcp-cli)](https://pypi.org/project/perplexity-web-mcp-cli/)
+[![PyPI downloads](https://img.shields.io/pypi/dm/perplexity-web-mcp-cli)](https://pypistats.org/packages/perplexity-web-mcp-cli)
+[![Total downloads](https://static.pepy.tech/badge/perplexity-web-mcp-cli)](https://pepy.tech/projects/perplexity-web-mcp-cli)
+[![Python](https://img.shields.io/pypi/pyversions/perplexity-web-mcp-cli)](https://pypi.org/project/perplexity-web-mcp-cli/)
+[![License](https://img.shields.io/pypi/l/perplexity-web-mcp-cli)](https://github.com/jacob-bd/perplexity-web-mcp/blob/main/LICENSE)
+[![Buy Me a Coffee](https://img.shields.io/badge/Buy%20Me%20a%20Coffee-FFDD00?style=flat-square&logo=buy-me-a-coffee&logoColor=black)](https://buymeacoffee.com/jacobbd)
 <p align="center">
   <a href="https://youtu.be/9xyClDvmoZ0">
     <img src="https://img.youtube.com/vi/9xyClDvmoZ0/maxresdefault.jpg" alt="Perplexity Powerhouse + Model Council Demo" width="400">
@@ -233,7 +240,7 @@ pwm research "NVIDIA competitive landscape" -s finance --json
 Query multiple models in parallel and get a synthesized consensus. Each model costs 1 Pro Search. Default synthesis uses Sonar 2 (also 1 Pro Search).
 ```bash
-# Default: GPT-5.4, Claude Opus, Gemini Pro + Sonar 2 synthesis (4 Pro Searches)
+# Default: GPT-5.4, Claude Sonnet, Gemini Pro + Sonar 2 synthesis (4 Pro Searches)
 pwm council "What are best practices for microservices?"
 ```

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/README.md RENAMED

@@ -4,6 +4,13 @@
 # Perplexity Web MCP & CLI
+[![PyPI version](https://img.shields.io/pypi/v/perplexity-web-mcp-cli)](https://pypi.org/project/perplexity-web-mcp-cli/)
+[![PyPI downloads](https://img.shields.io/pypi/dm/perplexity-web-mcp-cli)](https://pypistats.org/packages/perplexity-web-mcp-cli)
+[![Total downloads](https://static.pepy.tech/badge/perplexity-web-mcp-cli)](https://pepy.tech/projects/perplexity-web-mcp-cli)
+[![Python](https://img.shields.io/pypi/pyversions/perplexity-web-mcp-cli)](https://pypi.org/project/perplexity-web-mcp-cli/)
+[![License](https://img.shields.io/pypi/l/perplexity-web-mcp-cli)](https://github.com/jacob-bd/perplexity-web-mcp/blob/main/LICENSE)
+[![Buy Me a Coffee](https://img.shields.io/badge/Buy%20Me%20a%20Coffee-FFDD00?style=flat-square&logo=buy-me-a-coffee&logoColor=black)](https://buymeacoffee.com/jacobbd)
 <p align="center">
   <a href="https://youtu.be/9xyClDvmoZ0">
     <img src="https://img.youtube.com/vi/9xyClDvmoZ0/maxresdefault.jpg" alt="Perplexity Powerhouse + Model Council Demo" width="400">
@@ -196,7 +203,7 @@ pwm research "NVIDIA competitive landscape" -s finance --json
 Query multiple models in parallel and get a synthesized consensus. Each model costs 1 Pro Search. Default synthesis uses Sonar 2 (also 1 Pro Search).
 ```bash
-# Default: GPT-5.4, Claude Opus, Gemini Pro + Sonar 2 synthesis (4 Pro Searches)
+# Default: GPT-5.4, Claude Sonnet, Gemini Pro + Sonar 2 synthesis (4 Pro Searches)
 pwm council "What are best practices for microservices?"
 ```

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "perplexity-web-mcp-cli"
-version = "0.12.0"
+version = "0.12.2"
 description = "CLI, MCP server, and Anthropic/OpenAI API-compatible interface for Perplexity AI."
 authors = [{ name = "Jacob BD" }]
 license = "MIT"

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/cli/ai_doc.py RENAMED

@@ -66,7 +66,7 @@ MODEL COUNCIL
   pwm council "query" --json                  Output as JSON
   Each model in the council costs 1 Pro Search, plus 1 for synthesis. Default = 4 Pro Searches.
-  Available models: gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26
+  Available models: sonar, gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26
   Thinking toggle: -t / --thinking (gpt54, gpt55, claude_sonnet, claude_opus, kimi_k26 support toggle;
     gemini_pro and nemotron are always thinking)
@@ -110,7 +110,7 @@ MODELS
 Name            Identifier              Thinking   Notes
 -----------     ----------------------  ---------  ---------------------------
 auto            pplx_pro                No         Auto-selects best model
-sonar           experimental            No         Sonar 2 (latest in-house)
+sonar           experimental            No         Sonar 2 (concise search mode for grounded responses)
 deep_research   pplx_alpha              No         In-depth reports (monthly quota)
 gpt54           gpt54                   Yes        OpenAI GPT-5.4 (versatile)
 gpt55           gpt55                   Yes        OpenAI GPT-5.5 (latest, Max tier)
@@ -168,12 +168,13 @@ QUERY TOOLS (each call costs 1 Pro Search query unless noted):
   pplx_ask(query, source_focus="web")
       Auto-selects best model. 1 PRO SEARCH per call.
-  pplx_council(query, source_focus="web", models="gpt54,claude_opus,gemini_pro",
+  pplx_council(query, source_focus="web", models="gpt54,claude_sonnet,gemini_pro",
                synthesize=True, thinking=False, chairman="sonar")
       Model Council — N PRO SEARCHES (1 per model selected).
       BEFORE CALLING: You MUST ask the user which models and how many.
-      Available: gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26.
-      Default: 3 models (GPT-5.4, Claude Opus, Gemini Pro) + synthesis = 4 Pro Searches.
+      Available: sonar, gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26.
+      Max-only: gpt55, claude_opus. Exclude these when Subscription is Pro.
+      Default: 3 Pro-compatible models (GPT-5.4, Claude Sonnet, Gemini Pro) + synthesis = 4 Pro Searches.
       Synthesis uses Sonar 2 by default. Set chairman to override.
       Non-sonar chairman costs 1 extra Pro Search.
       Set synthesize=False to skip synthesis entirely.
@@ -200,9 +201,9 @@ QUERY TOOLS (each call costs 1 Pro Search query unless noted):
   All query tools accept source_focus: "none", "web", "academic", "social",
   "finance", "all". Use "none" for model-only queries without web search.
-  All query tools also accept an optional `conversation_id` (str) parameter.
-  The server returns `[Conversation ID: <uuid>]` at the end of each response.
-  Extract this UUID and pass it to the next query to maintain context across
+  All query tools also accept an optional `conversation_id` (str) parameter.
+  The server returns `[Conversation ID: <uuid>]` at the end of each response.
+  Extract this UUID and pass it to the next query to maintain context across
   multiple turns. State is retained in memory for 1 hour.
 USAGE TOOL (1):
@@ -271,8 +272,8 @@ weekly pool (~300). Deep Research draws from a tiny monthly pool (~5-10).
 Wasting Pro queries on simple lookups means nothing left for real questions.
 COST MODEL:
-  Sonar 2 (pplx_sonar,     In-house model; still uses your session and Perplexity
-    quick intent)         counters — check pplx_usage() / pwm usage.
+  Sonar 2 (pplx_sonar,     In-house model; uses concise search mode to guarantee responses
+    quick intent)         are grounded. Decrements Perplexity session counters.
   Pro Search (standard,   Typically 1 from weekly Pro Search pool (~300/week
     detailed, pplx_ask,   on Pro/Max; exact rules are enforced by Perplexity).
     pplx_query, premium
@@ -283,11 +284,14 @@ COST MODEL:
 MANDATORY PROTOCOL:
   1. CHECK QUOTA FIRST: Call pplx_usage() before your first query each session.
   2. DEFAULT TO QUICK: Use pplx_smart_query(intent='quick') for most lookups.
-     It prefers Sonar 2 first and only escalates when the query needs a premium model.
+     It prefers Sonar 2 first (using concise mode to guarantee grounding) and
+     only escalates when the query needs a premium model.
   3. ESCALATE ONLY WHEN NEEDED: Use 'standard' for multi-source synthesis,
      'detailed' for complex analysis, 'research' only when user requests it.
   4. NEVER USE DEEP RESEARCH AUTONOMOUSLY — always ask the user first.
-  5. COUNCIL: Before calling pplx_council, ASK the user which models and how
+  5. SUBSCRIPTION-AWARE MODELS: Read the Subscription line from pplx_usage().
+     If it is Pro, exclude Max-only models: gpt55 and claude_opus.
+  6. COUNCIL: Before calling pplx_council, ASK the user which models and how
      many. Each model = 1 Pro Search. List available models for them to choose.
 WHEN TO USE EACH INTENT:
@@ -296,7 +300,7 @@ WHEN TO USE EACH INTENT:
   detailed  Complex analysis, deep reasoning, premium model       → 1 Pro
   research  Comprehensive reports (user must request explicitly)  → 1 Research
-DECISION RULE: Ask "Can Sonar 2 answer this?" If yes → quick. If no → standard.
+DECISION RULE: Ask "Can Sonar 2 answer this?" If yes → quick (grounded on concise search). If no → standard.
 Only use detailed/research when the complexity genuinely demands it.
 When in doubt, start with quick and escalate if the answer is insufficient.
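
The subscription-aware rule added as protocol step 5 can be sketched as a small filter. This is an illustrative sketch, not code from the package: `filter_for_subscription` and its `subscription` argument are hypothetical names, and the sketch assumes the Subscription value is a plain string such as "Pro" or "Max". Only the Max-only set (`gpt55`, `claude_opus`) comes from the diff. Treating every non-Max tier like Pro is also an assumption.

```python
# Hypothetical helper: the package only documents this rule for agents to follow.
MAX_ONLY = frozenset({"gpt55", "claude_opus"})  # Max-only models named in the diff


def filter_for_subscription(models: list[str], subscription: str) -> list[str]:
    """Drop Max-only models unless the subscription is Max (assumed string form)."""
    if subscription.strip().lower() == "max":
        return list(models)
    return [name for name in models if name not in MAX_ONLY]


requested = ["gpt55", "claude_sonnet", "claude_opus", "gemini_pro"]
print(filter_for_subscription(requested, "Pro"))  # ['claude_sonnet', 'gemini_pro']
print(filter_for_subscription(requested, "Max"))  # all four names, unchanged
```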

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/cli/main.py RENAMED

@@ -27,14 +27,16 @@ import rich_click as click
 from perplexity_web_mcp.exceptions import AuthenticationError, RateLimitError
 from perplexity_web_mcp.shared import (
+    COUNCIL_DEFAULT_MODELS_STR,
     COUNCIL_DISPLAY_NAMES,
+    COUNCIL_ELIGIBLE_MODEL_NAMES,
     MODEL_MAP,
     MODEL_NAMES,
     SOURCE_FOCUS_NAMES,
-    THINKING_TOGGLEABLE,
     Models,
     SourceFocusName,
     ask,
+    build_council_model_list,
     get_limit_cache,
     resolve_model,
 )
@@ -230,7 +232,7 @@ def _cmd_research_impl(query, source, json_output):
 # ── Council ────────────────────────────────────────────────────────────────
-COUNCIL_MODEL_NAMES = tuple(n for n in MODEL_NAMES if n not in {"auto", "sonar", "deep_research"})
+COUNCIL_MODEL_NAMES = COUNCIL_ELIGIBLE_MODEL_NAMES
 @cli.command()
@@ -239,7 +241,7 @@ COUNCIL_MODEL_NAMES = tuple(n for n in MODEL_NAMES if n not in {"auto", "sonar",
     "-m",
     "--models",
     "models_str",
-    default="gpt54,claude_opus,gemini_pro",
+    default=COUNCIL_DEFAULT_MODELS_STR,
     help=f"Comma-separated models ({', '.join(COUNCIL_MODEL_NAMES)}).",
 )
 @click.option("-t", "--thinking", is_flag=True, help="Enable extended thinking mode.")
@@ -301,14 +303,8 @@ def _cmd_council_impl(query, models_str, source, synthesize, json_output, thinki
         # Build model list (None = use defaults)
         model_list = None
-        if models_str != "gpt54,claude_opus,gemini_pro":
-            model_list = []
-            for name in model_names:
-                resolved = resolve_model(name, thinking=thinking)
-                display = COUNCIL_DISPLAY_NAMES.get(name, name)
-                if thinking and name in THINKING_TOGGLEABLE:
-                    display += " Thinking"
-                model_list.append((display, resolved))
+        if models_str != COUNCIL_DEFAULT_MODELS_STR:
+            model_list = build_council_model_list(model_names, thinking=thinking)
         synthesis_model = resolve_model(chairman) if chairman != "sonar" else None
@@ -454,16 +450,23 @@ def _cmd_usage_impl(refresh):
     # ── Account Info ───────────────────────────────────────────────────────
     settings = cache.get_user_settings(force_refresh=refresh)
-    if settings:
+    from perplexity_web_mcp.cli.auth import get_user_info
+    user_info = get_user_info(token)
+    if settings or user_info:
         table = Table(title="👤 Account", show_header=True, header_style="bold cyan")
         table.add_column("Field", style="bold")
         table.add_column("Value", justify="right")
-        tier = (settings.subscription_tier or "unknown").title()
-        status = settings.subscription_status
-        table.add_row("Subscription", f"[bold]{tier}[/] ({status})")
-        table.add_row("Total Queries", f"{settings.query_count:,}")
-        table.add_row("Pro Queries", f"{settings.query_count_copilot:,}")
+        if user_info:
+            table.add_row("Subscription", f"[bold]{user_info.tier_display}[/]")
+        if settings:
+            billing = settings.subscription_tier or "unknown"
+            status = settings.subscription_status
+            table.add_row("Billing", f"[bold]{billing}[/] ({status})")
+            table.add_row("Total Queries", f"{settings.query_count:,}")
+            table.add_row("Pro Queries", f"{settings.query_count_copilot:,}")
         console.print(table)
@@ -732,7 +735,7 @@ def _cmd_council(args: list[str]) -> int:
         return 1
     query = args[0]
-    models_str = "gpt54,claude_opus,gemini_pro"
+    models_str = COUNCIL_DEFAULT_MODELS_STR
     source: SourceFocusName = "web"
     synthesize = True
     json_output = False
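
Replacing the hard-coded exclusion set with `COUNCIL_ELIGIBLE_MODEL_NAMES` changes which names the CLI accepts. The sketch below compares the two derivations on a simplified stand-in for the `ModelDefinition` table in `shared.py`. The `Definition` class and the trimmed `METADATA` dict are hypothetical simplifications (no `Model` objects, fewer entries); the flags and tiers mirror the values shown in the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:  # simplified stand-in for ModelDefinition
    display_name: str
    minimum_tier: str = "pro"
    council_eligible: bool = True


METADATA = {
    "auto": Definition("Auto (Best)", council_eligible=False),
    "sonar": Definition("Sonar 2"),
    "deep_research": Definition("Deep Research", council_eligible=False),
    "gpt54": Definition("GPT-5.4"),
    "gpt55": Definition("GPT-5.5", minimum_tier="max"),
    "claude_sonnet": Definition("Claude Sonnet 4.6"),
    "claude_opus": Definition("Claude Opus 4.7", minimum_tier="max"),
    "gemini_pro": Definition("Gemini 3.1 Pro"),
}

# 0.12.0 style: names filtered through a hard-coded exclusion set.
old_council = tuple(n for n in METADATA if n not in {"auto", "sonar", "deep_research"})
# 0.12.2 style: names derived from the per-model flag.
new_council = tuple(n for n, d in METADATA.items() if d.council_eligible)
max_only = frozenset(n for n, d in METADATA.items() if d.minimum_tier == "max")

print(sorted(set(new_council) - set(old_council)))  # ['sonar']
print(sorted(max_only))  # ['claude_opus', 'gpt55']
```

Under these assumptions the only membership change is that `sonar` becomes a valid council member, which matches the updated "Available models" lists elsewhere in the diff.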

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/council.py RENAMED

@@ -16,6 +16,7 @@ from .config import ConversationConfig
 from .enums import CitationMode, SearchFocus, SourceFocus
 from .logging import get_logger
 from .models import Model, Models
+from .shared import COUNCIL_DEFAULT_MODEL_NAMES, build_council_model_list
 if TYPE_CHECKING:
@@ -29,19 +30,14 @@ logger = get_logger(__name__)
 # Default council composition
 # ---------------------------------------------------------------------------
-COUNCIL_DEFAULT_MODELS: list[tuple[str, Model]] = [
-    ("GPT-5.4", Models.GPT_54),
-    ("Claude Opus 4.7", Models.CLAUDE_47_OPUS),
-    ("Gemini 3.1 Pro", Models.GEMINI_31_PRO_THINKING),
-]
-"""Default models for the council (3 diverse providers)."""
+COUNCIL_DEFAULT_MODELS: list[tuple[str, Model]] = build_council_model_list(COUNCIL_DEFAULT_MODEL_NAMES)
+"""Default Pro-compatible models for the council (3 diverse providers)."""
-COUNCIL_DEFAULT_MODELS_THINKING: list[tuple[str, Model]] = [
-    ("GPT-5.4 Thinking", Models.GPT_54_THINKING),
-    ("Claude Opus 4.7 Thinking", Models.CLAUDE_47_OPUS_THINKING),
-    ("Gemini 3.1 Pro", Models.GEMINI_31_PRO_THINKING),
-]
-"""Default models for the council with extended thinking enabled."""
+COUNCIL_DEFAULT_MODELS_THINKING: list[tuple[str, Model]] = build_council_model_list(
+    COUNCIL_DEFAULT_MODEL_NAMES,
+    thinking=True,
+)
+"""Default Pro-compatible models for the council with extended thinking enabled."""
 # ---------------------------------------------------------------------------
@@ -220,7 +216,7 @@ def council_ask(
     Args:
         query: The question to ask all models.
         models: List of (display_name, Model) tuples. Defaults to
-                COUNCIL_DEFAULT_MODELS (GPT-5.4, Claude Opus, Gemini Pro).
+                COUNCIL_DEFAULT_MODELS (GPT-5.4, Claude Sonnet, Gemini Pro).
         source_focus: Source focus for all queries (none/web/academic/social/finance/all).
         synthesize: Whether to produce a synthesized consensus (adds 1 Sonar 2 synthesis query by default).
         thinking: Use thinking model variants for default council members.
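
The two default lists above are now produced by `build_council_model_list` instead of being written out by hand. A minimal sketch of how the display names come out, assuming plain strings in place of the package's `Model` objects and a simplified lookup in place of `resolve_model` (whose real behavior is not shown in this diff):

```python
# Stand-in strings replace Model objects; keys and display names mirror the diff.
MODEL_MAP = {
    "gpt54": ("GPT_54", "GPT_54_THINKING"),
    "claude_sonnet": ("CLAUDE_46_SONNET", "CLAUDE_46_SONNET_THINKING"),
    "gemini_pro": ("GEMINI_31_PRO_THINKING", "GEMINI_31_PRO_THINKING"),  # always thinking
}
DISPLAY = {
    "gpt54": "GPT-5.4",
    "claude_sonnet": "Claude Sonnet 4.6",
    "gemini_pro": "Gemini 3.1 Pro",
}
# The package compares with `is not`; `!=` is used here because the stand-ins are strings.
TOGGLEABLE = frozenset(
    name for name, (base, thinking) in MODEL_MAP.items() if thinking is not None and thinking != base
)


def build_council_model_list(names, thinking=False):
    """Return (display_name, model) pairs, adding ' Thinking' only for toggleable models."""
    pairs = []
    for name in names:
        base, thinking_model = MODEL_MAP[name]
        # Assumed resolution rule: use the thinking variant when requested and available.
        resolved = thinking_model if thinking and thinking_model is not None else base
        display = DISPLAY.get(name, name)
        if thinking and name in TOGGLEABLE:
            display += " Thinking"
        pairs.append((display, resolved))
    return pairs


for pair in build_council_model_list(("gpt54", "claude_sonnet", "gemini_pro"), thinking=True):
    print(pair)
```

Because `gemini_pro` maps to the same model in both slots, it is not toggleable and keeps the plain "Gemini 3.1 Pro" label, just as in the removed hand-written list.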

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/data/SKILL.md RENAMED

@@ -2,7 +2,7 @@
 name: perplexity-web-mcp
 description: "Search the web and query AI models via Perplexity AI using perplexity-web-mcp-cli. Supports CLI commands (pwm ask, pwm research), MCP tools (pplx_*), and Anthropic/OpenAI-compatible API server. Use when the user mentions \"perplexity\", \"pplx\", \"pwm\", \"web search with AI\", \"deep research\", \"search the internet\", or wants to query premium models like GPT-5.4, GPT-5.5, Claude, Gemini, Nemotron through Perplexity's web interface."
 metadata:
-  version: "0.10.7"
+  version: "0.12.2"
   author: "Jacob BD"
 ---
@@ -46,8 +46,9 @@ the weekly pool fast, leaving nothing for questions that actually need it.
 ### Before Every Session
 1. **Check quota first**: Call `pplx_usage()` (MCP) or `pwm usage` (CLI) before your first query.
-2. Review the remaining Pro and Research counts.
-3. If Pro < 20% remaining, restrict yourself to quick/Sonar 2 for everything except user-requested Pro queries.
+2. Review the remaining Pro and Research counts and the `Subscription` line.
+3. If Subscription is Pro, exclude Max-only models (`gpt55`, `claude_opus`) from model selection and councils.
+4. If Pro < 20% remaining, restrict yourself to quick/Sonar 2 for everything except user-requested Pro queries.
 ### Before Every Query: Choose the Lowest Sufficient Tier
@@ -83,8 +84,9 @@ Ask yourself: **"Can Sonar 2 answer this?"** If yes, use `quick`. Only escalate
 - The user needs high-confidence answers validated across multiple AI providers
 - Important decisions, fact-checking, or complex analysis
 - BEFORE calling: ASK the user which models and how many (each = 1 Pro Search)
-- Available models: gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26
-- Default: 3 models (GPT-5.4, Claude Opus, Gemini Pro) + synthesis = 4 Pro Searches
+- Available models: sonar, gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26
+- Max-only models: gpt55, claude_opus. Do not use these for Pro subscriptions.
+- Default: 3 Pro-compatible models (GPT-5.4, Claude Sonnet, Gemini Pro) + synthesis = 4 Pro Searches
 ### Decision Flowchart
@@ -127,7 +129,7 @@ The smart router automatically protects you:
 - **Healthy quota**: Uses the ideal model for your intent
 - **Low quota (<20% pro remaining)**: Response footer warns you to conserve
 - **Critical quota (<10% pro remaining)**: Downgrades detailed→auto to conserve
-- **Exhausted quota**: Falls back to Sonar 2 for everything except research
+- **Exhausted quota**: Falls back to Sonar 2 for everything except research (Sonar 2 is forced to concise mode to ensure grounded responses using search results)
 - **Research exhausted**: Falls back to premium Pro Search
 - Response metadata shows what model was used, why, and remaining quota
@@ -237,7 +239,8 @@ pwm ask "protein folding advances" -m gemini_pro -s academic --json
 ### Model Council
 Query multiple models in parallel and get a synthesized consensus.
-Each model in the council costs 1 Pro Search, plus 1 for Sonar 2 synthesis. Default: 3 models + synthesis = 4 Pro Searches.
+Each model in the council costs 1 Pro Search, plus 1 for Sonar 2 synthesis. Default: 3 Pro-compatible models + synthesis = 4 Pro Searches.
+Before selecting models, check `pplx_usage()` or `pwm usage`. If the subscription is Pro, exclude Max-only models (`gpt55`, `claude_opus`).
 ```bash
 pwm council "What are the best practices for microservices?"           # default 3 models
@@ -283,7 +286,7 @@ pwm usage --refresh         # Force-refresh from server
 | `pplx_sonar` | 1 Pro Search | Perplexity Sonar 2 |
 | `pplx_query` | 1 Pro | Explicit model selection with thinking toggle |
 | `pplx_ask` | 1 Pro | Quick Q&A (auto model) |
-| `pplx_council` | **N+1 Pro** (1 per model + 1 synthesis) | Model Council — **ASK USER which models first!** Supports `thinking=True` and `chairman` for synthesis model. |
+| `pplx_council` | **N+1 Pro** (1 per model + 1 synthesis) | Model Council — **ASK USER which models first!** Check subscription first; exclude Max-only `gpt55`/`claude_opus` on Pro. Supports `thinking=True` and `chairman` for synthesis model. |
 | `pplx_gpt54` / `_thinking` | 1 Pro | OpenAI GPT-5.4 (versatile) |
 | `pplx_gpt55` / `_thinking` | 1 Pro | OpenAI GPT-5.5 (latest, Max tier) |
 | `pplx_claude_sonnet` / `_think` | 1 Pro | Anthropic Claude 4.6 Sonnet |
@@ -309,7 +312,7 @@ For full MCP tool parameters: See [references/mcp-tools.md](references/mcp-tools
 | CLI Name | Provider | Thinking | Notes |
 |----------|----------|----------|-------|
 | auto | Perplexity | No | Auto-selects best |
-| sonar | Perplexity | No | Sonar 2 (API id `experimental`) |
+| sonar | Perplexity | No | Sonar 2 (API id `experimental`). Uses `mode="concise"` to ensure grounded answers. |
 | deep_research | Perplexity | No | Monthly quota |
 | gpt54 | OpenAI | Toggle | GPT-5.4 (versatile) |
 | gpt55 | OpenAI | Toggle | GPT-5.5 (latest, Max tier) |

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/data/references/mcp-tools.md RENAMED

@@ -120,7 +120,7 @@ Returns a summary including:
 - Deep Research remaining (monthly)
 - Create Files & Apps remaining (monthly)
 - Browser Agent remaining (monthly)
-- Subscription tier and account info
+- Subscription tier, billing detail, and account info
 ## Authentication Tools

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/data/references/models.md RENAMED

@@ -16,7 +16,7 @@ Complete list of models available through Perplexity Web MCP.
 - **Thinking:** No
 - **CLI:** `pwm ask "query" -m sonar`
 - **MCP:** `pplx_sonar(query)` or `pplx_query(query, model="sonar")`
-- **Notes:** Perplexity's latest in-house model.
+- **Notes:** Perplexity's latest in-house model. Settings default to concise search mode (`mode="concise"`), which bypasses interactive copilot to guarantee responses are grounded on retrieved search citations on all accounts (including Free tier fallback).
 ### deep_research (Deep Research)
 - **Identifier:** `pplx_alpha`

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/mcp/server.py RENAMED

@@ -15,11 +15,11 @@ from fastmcp import FastMCP
 from perplexity_web_mcp.models import Models
 from perplexity_web_mcp.shared import (
-    COUNCIL_DISPLAY_NAMES,
-    THINKING_TOGGLEABLE,
+    COUNCIL_DEFAULT_MODELS_STR,
     ModelName,
     SourceFocusName,
     ask,
+    build_council_model_list,
     council_ask,
     get_limit_cache,
     resolve_model,
@@ -39,6 +39,7 @@ mcp = FastMCP(
         "- pplx_deep_research: 1 DEEP RESEARCH each (small monthly pool, ~5-10 total)\n\n"
         "MANDATORY PROTOCOL:\n"
         "1. On your FIRST query of the session, call pplx_usage() to check remaining quotas.\n"
+        "   Read the Subscription line: Pro users must avoid Max-only models.\n"
         "2. DEFAULT to pplx_smart_query(intent='quick') for most lookups — it prefers Sonar 2 "
         "before premium models when that fits the question.\n"
         "3. Only use 'standard' or 'detailed' intent when the question requires synthesis, "
@@ -288,7 +289,7 @@ def pplx_smart_query(
 def pplx_council(
     query: str,
     source_focus: SourceFocusName = "web",
-    models: str = "gpt54,claude_opus,gemini_pro",
+    models: str = COUNCIL_DEFAULT_MODELS_STR,
     synthesize: bool = True,
     thinking: bool = False,
     chairman: ModelName = "sonar",
@@ -296,20 +297,22 @@ def pplx_council(
     """Model Council — query multiple models in parallel, get synthesized consensus.
     IMPORTANT — BEFORE calling this tool, you MUST:
-    1. Tell the user the available models: gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26
-    2. Ask the user WHICH models they want in their council and HOW MANY
-    3. Inform them of the cost: each council model = 1 Pro Search query, plus synthesis
+    1. Tell the user the available models: sonar, gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26
+    2. Check pplx_usage() first. If Subscription is Pro, do not include Max-only models: gpt55, claude_opus
+    3. Ask the user WHICH models they want in their council and HOW MANY
+    4. Inform them of the cost: each council model = 1 Pro Search query, plus synthesis
        (default chairman sonar = Sonar 2 pass — still counts as a normal query toward limits)
-    4. Get explicit confirmation before executing
+    5. Get explicit confirmation before executing
-    Default council: GPT-5.4, Claude Opus 4.7, Gemini 3.1 Pro (3 diverse providers).
+    Default council: GPT-5.4, Claude Sonnet 4.6, Gemini 3.1 Pro (Pro-compatible, 3 diverse providers).
     Args:
         query: The question to ask all council models
         source_focus: Source type for all models (none/web/academic/social/finance/all)
         models: Comma-separated model names to use as council members.
-                Available: gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26.
-                Default: "gpt54,claude_opus,gemini_pro" (3 models + synthesis = 4 Pro Searches)
+                Available: sonar, gpt54, gpt55, claude_sonnet, claude_opus, gemini_pro, nemotron, kimi_k26.
+                Default: "gpt54,claude_sonnet,gemini_pro" (3 models + synthesis = 4 Pro Searches)
+                Max-only: gpt55, claude_opus. Exclude these when pplx_usage shows a Pro subscription.
         synthesize: Whether to synthesize a consensus from all responses.
                     Set false to get only individual responses (saves 1 Sonar 2 call).
         thinking: Enable extended thinking for council models (gpt54, gpt55, claude_sonnet,
@@ -319,15 +322,9 @@ def pplx_council(
     """
     # Parse custom model list if provided
     model_list = None
-    if models != "gpt54,claude_opus,gemini_pro":
-        model_list = []
-        for name in models.split(","):
-            name = name.strip()
-            resolved = resolve_model(name, thinking=thinking)
-            display = COUNCIL_DISPLAY_NAMES.get(name, name)
-            if thinking and name in THINKING_TOGGLEABLE:
-                display += " Thinking"
-            model_list.append((display, resolved))
+    if models != COUNCIL_DEFAULT_MODELS_STR:
+        model_names = [name.strip() for name in models.split(",") if name.strip()]
+        model_list = build_council_model_list(model_names, thinking=thinking)
     synthesis_model = resolve_model(chairman) if chairman != "sonar" else None
@@ -376,12 +373,26 @@ def pplx_usage(refresh: bool = False) -> str:
     else:
         parts.append("WARNING: Could not fetch rate limits (network error or token issue).")
+    from perplexity_web_mcp.cli.auth import get_user_info
+    user_info = get_user_info(token)
     settings = cache.get_user_settings(force_refresh=refresh)
-    if settings:
+    if settings or user_info:
         parts.append("")
         parts.append("ACCOUNT INFO")
         parts.append("=" * 40)
-        parts.append(settings.format_summary())
+        if user_info:
+            parts.append(f"Subscription: {user_info.tier_display}")
+        if settings:
+            parts.append(f"Billing: {settings.subscription_tier} ({settings.subscription_status})")
+            parts.append(f"Total queries: {settings.query_count:,}")
+            parts.append(f"Pro queries: {settings.query_count_copilot:,}")
+            parts.append(f"Upload limit: {settings.upload_limit} files")
+            parts.append(f"Create limit: {settings.create_limit}")
+            parts.append(f"Pages limit: {settings.pages_limit}")
+            parts.append(f"Max files/user: {settings.max_files_per_user:,}")
+            parts.append(f"Max file size: {settings.connector_limits.max_file_size_mb} MB")
+            parts.append(f"Daily attachments: {settings.connector_limits.daily_attachment_limit}")
     credits = cache.get_credits(force_refresh=refresh)
     if credits:
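
The model-string handling in `pplx_council` above now strips whitespace and drops empty entries before building the council. A small sketch of that parsing step in isolation: `parse_models` is a hypothetical wrapper name, while the default string and the list comprehension are taken from the diff.

```python
from __future__ import annotations

COUNCIL_DEFAULT_MODELS_STR = "gpt54,claude_sonnet,gemini_pro"  # value per shared.py in this diff


def parse_models(models: str) -> list[str] | None:
    """Return None for the exact default string, else a cleaned list of names."""
    if models == COUNCIL_DEFAULT_MODELS_STR:
        return None  # caller falls back to the built-in default council
    return [name.strip() for name in models.split(",") if name.strip()]


print(parse_models("gpt54,claude_sonnet,gemini_pro"))    # None
print(parse_models("gpt54, kimi_k26,,nemotron,"))        # ['gpt54', 'kimi_k26', 'nemotron']
print(parse_models("gpt54, claude_sonnet, gemini_pro"))  # ['gpt54', 'claude_sonnet', 'gemini_pro']
```

The last call shows a consequence of comparing raw strings: the same three models written with spaces do not match the default and are built as a custom list instead.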

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/models.py RENAMED

@@ -25,7 +25,7 @@ class Models:
     BEST = Model(identifier="pplx_pro")
     """Best - Automatically selects the best model based on the query."""
-    SONAR = Model(identifier="experimental")
+    SONAR = Model(identifier="experimental", mode="concise")
     """Sonar 2 — Perplexity's latest in-house model (backend id: experimental)."""
     GEMINI_31_PRO_THINKING = Model(identifier="gemini31pro_high")

{perplexity_web_mcp_cli-0.12.0 → perplexity_web_mcp_cli-0.12.2}/src/perplexity_web_mcp/shared.py RENAMED

@@ -7,6 +7,7 @@ Both the MCP server (mcp/server.py) and CLI (cli/main.py) import from here.
 from __future__ import annotations
+from dataclasses import dataclass
 from threading import Lock
 from typing import TYPE_CHECKING, Literal
 from uuid import uuid4
@@ -22,6 +23,7 @@ from .token_store import get_token_or_raise, load_token
 if TYPE_CHECKING:
+    from .council import CouncilResponse
     from .types import SearchResultItem
@@ -29,6 +31,20 @@ if TYPE_CHECKING:
 # Model and source focus mappings (single source of truth)
 # ---------------------------------------------------------------------------
+SubscriptionMinimumTier = Literal["free", "pro", "max"]
+@dataclass(frozen=True, slots=True)
+class ModelDefinition:
+    """Metadata and model instances for one user-facing model key."""
+    base_model: Model
+    thinking_model: Model | None
+    display_name: str
+    provider: str
+    minimum_tier: SubscriptionMinimumTier = "pro"
+    council_eligible: bool = True
 SOURCE_FOCUS_MAP: dict[str, list[SourceFocus]] = {
     "none": [],
     "web": [SourceFocus.WEB],
@@ -38,18 +54,49 @@ SOURCE_FOCUS_MAP: dict[str, list[SourceFocus]] = {
     "all": [SourceFocus.WEB, SourceFocus.ACADEMIC, SourceFocus.SOCIAL],
 }
+MODEL_METADATA: dict[str, ModelDefinition] = {
+    "auto": ModelDefinition(Models.BEST, None, "Auto (Best)", "Perplexity", council_eligible=False),
+    "sonar": ModelDefinition(Models.SONAR, None, "Sonar 2", "Perplexity"),
+    "deep_research": ModelDefinition(
+        Models.DEEP_RESEARCH,
+        None,
+        "Deep Research",
+        "Perplexity",
+        council_eligible=False,
+    ),
+    "gpt54": ModelDefinition(Models.GPT_54, Models.GPT_54_THINKING, "GPT-5.4", "OpenAI"),
+    "gpt55": ModelDefinition(Models.GPT_55, Models.GPT_55_THINKING, "GPT-5.5", "OpenAI", minimum_tier="max"),
+    "claude_sonnet": ModelDefinition(
+        Models.CLAUDE_46_SONNET,
+        Models.CLAUDE_46_SONNET_THINKING,
+        "Claude Sonnet 4.6",
+        "Anthropic",
+    ),
+    "claude_opus": ModelDefinition(
+        Models.CLAUDE_47_OPUS,
+        Models.CLAUDE_47_OPUS_THINKING,
+        "Claude Opus 4.7",
+        "Anthropic",
+        minimum_tier="max",
+    ),
+    "gemini_pro": ModelDefinition(
+        Models.GEMINI_31_PRO_THINKING,
+        Models.GEMINI_31_PRO_THINKING,
+        "Gemini 3.1 Pro",
+        "Google",
+    ),
+    "nemotron": ModelDefinition(
+        Models.NEMOTRON_3_SUPER,
+        Models.NEMOTRON_3_SUPER,
+        "Nemotron 3 Super",
+        "NVIDIA",
+    ),
+    "kimi_k26": ModelDefinition(Models.KIMI_K2_6, Models.KIMI_K2_6_THINKING, "Kimi K2.6", "Moonshot"),
+}
+"""User-facing model metadata. Update this table when model names or tier availability changes."""
 MODEL_MAP: dict[str, tuple[Model, Model | None]] = {
-    # (base_model, thinking_model) - None if no thinking variant
-    "auto": (Models.BEST, None),
-    "sonar": (Models.SONAR, None),
-    "deep_research": (Models.DEEP_RESEARCH, None),
-    "gpt54": (Models.GPT_54, Models.GPT_54_THINKING),
-    "gpt55": (Models.GPT_55, Models.GPT_55_THINKING),
-    "claude_sonnet": (Models.CLAUDE_46_SONNET, Models.CLAUDE_46_SONNET_THINKING),
-    "claude_opus": (Models.CLAUDE_47_OPUS, Models.CLAUDE_47_OPUS_THINKING),
-    "gemini_pro": (Models.GEMINI_31_PRO_THINKING, Models.GEMINI_31_PRO_THINKING),
-    "nemotron": (Models.NEMOTRON_3_SUPER, Models.NEMOTRON_3_SUPER),
-    "kimi_k26": (Models.KIMI_K2_6, Models.KIMI_K2_6_THINKING),
+    name: (definition.base_model, definition.thinking_model) for name, definition in MODEL_METADATA.items()
 }
 SourceFocusName = Literal["none", "web", "academic", "social", "finance", "all"]
@@ -70,21 +117,39 @@ MODEL_NAMES: list[str] = list(MODEL_MAP.keys())
 SOURCE_FOCUS_NAMES: list[str] = list(SOURCE_FOCUS_MAP.keys())
 COUNCIL_DISPLAY_NAMES: dict[str, str] = {
-    "auto": "Auto (Best)",
-    "sonar": "Sonar 2",
-    "gpt54": "GPT-5.4",
-    "gpt55": "GPT-5.5",
-    "claude_sonnet": "Claude Sonnet 4.6",
-    "claude_opus": "Claude Opus 4.7",
-    "gemini_pro": "Gemini 3.1 Pro",
-    "nemotron": "Nemotron 3 Super",
-    "kimi_k26": "Kimi K2.6",
+    name: definition.display_name for name, definition in MODEL_METADATA.items()
 }
 THINKING_TOGGLEABLE: frozenset[str] = frozenset(
     name for name, (base, thinking) in MODEL_MAP.items() if thinking is not None and thinking is not base
 )
+MAX_ONLY_MODEL_NAMES: frozenset[str] = frozenset(
+    name for name, definition in MODEL_METADATA.items() if definition.minimum_tier == "max"
+)
+COUNCIL_ELIGIBLE_MODEL_NAMES: tuple[str, ...] = tuple(
+    name for name, definition in MODEL_METADATA.items() if definition.council_eligible
+)
+COUNCIL_DEFAULT_MODEL_NAMES: tuple[str, ...] = ("gpt54", "claude_sonnet", "gemini_pro")
+COUNCIL_DEFAULT_MODELS_STR = ",".join(COUNCIL_DEFAULT_MODEL_NAMES)
+def build_council_model_list(
+    model_names: tuple[str, ...] | list[str],
+    thinking: bool = False,
+) -> list[tuple[str, Model]]:
+    """Build display/model pairs for council execution from model metadata."""
+    model_list: list[tuple[str, Model]] = []
+    for name in model_names:
+        resolved = resolve_model(name, thinking=thinking)
+        display = COUNCIL_DISPLAY_NAMES.get(name, name)
+        if thinking and name in THINKING_TOGGLEABLE:
+            display += " Thinking"
+        model_list.append((display, resolved))
+    return model_list
 def resolve_model(name: str, thinking: bool = False) -> Model:
     """Resolve a model name string to a Model instance.