PyPI - thoughtleaders-cli - Versions diffs - 0.6.54__tar.gz → 0.6.55__tar.gz - Mend

thoughtleaders-cli 0.6.54.tar.gz → 0.6.55.tar.gz

Files changed (107)

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/.claude-plugin/plugin.json RENAMED

@@ -1,6 +1,6 @@
 {
   "name": "tl-cli",
-  "version": "0.6.54",
+  "version": "0.6.55",
   "description": "ThoughtLeaders CLI — query sponsorship deals, channels, brands, uploads, and intelligence from the terminal",
   "author": {
     "name": "ThoughtLeaders",

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/API.md RENAMED

@@ -172,7 +172,7 @@ print(get('/balance'))
 ## db pg
-`POST /raw/pg` — execute a read-only PostgreSQL `SELECT`. Sanitised: SELECT only, no DDL/DML/transactions, `LIMIT ≤ 500`, function allowlist (aggregates, window, string, JSON, math, date/time, array). `OFFSET ≥ 10 000` is rejected with `OFFSET_TOO_DEEP` — paginate with the response's `next_offset` instead.
+`POST /raw/pg` — execute a read-only PostgreSQL `SELECT`. Sanitised: SELECT only, no DDL/DML/transactions, `LIMIT ≤ 10,000`, function allowlist (aggregates, window, string, JSON, math, date/time, array). `OFFSET ≥ 10 000` is rejected with `OFFSET_TOO_DEEP` — paginate with the response's `next_offset` instead.
 Body: `{"query": "<sql>"}`.
@@ -212,11 +212,36 @@ print(post('/raw/pg', {'query': sql}))
 ### Pricing
-PG cost is **per-query**: a base rate plus a surcharge for every priced table and column referenced. Most tables/columns are free; sensitive ones (demographics, channel outreach emails) cost more. The `usage.credit_rate` you get back is the effective multiplier the server applied — it's not the static value from `tl describe`. The `pricing` sub-key, when present, breaks the rate into base/per-table/per-column components.
+PG cost is **per-query**: a base rate plus a multiplier extra for every expensive table referenced, plus a flat per-row charge for every expensive column read. Most tables/columns are free; sensitive ones (demographics, channel outreach emails) are expensive. The `usage.credit_rate` you get back is the effective multiplier the server applied — it's not the static value from `tl describe`. The `pricing` sub-key, when present, breaks the rate into base/per-table/per-column components.
+#### Pre-run cost estimate
+Send `{"query": "…", "pricing": true}` to `POST /raw/pg` (CLI: `tl db pg "…" --pricing`) for a dry run: the server runs `EXPLAIN` only — **no SELECT executes** — and returns a `pricing_estimate` object instead of `results`:
+```json
+{
+  "pricing_estimate": {
+    "base": 1.4,
+    "multiplier": 4.4,
+    "per_row_extra": 280.0,
+    "expensive_tables": {"thoughtleaders_channel": 3.0},
+    "expensive_columns": {"thoughtleaders_channel.outreach_email": 80.0},
+    "limit": 100,
+    "planner_estimated_rows": 1299016,
+    "estimated_cost_at_limit": 28140.26
+  },
+  "results": [],
+  "usage": {"credits_charged": 1, ...}
+}
+```
+`multiplier` and `per_row_extra` are exact; `estimated_cost_at_limit` is an **upper bound** computed at the query's effective `LIMIT` (the query can't return more rows than that). A dry run costs a flat **1 credit**.
+The same `{"pricing": true}` flag works on `POST /raw/fb` and `POST /raw/es`. Those backends are flat-rate (no per-table/column extras), so the estimate carries `multiplier` = the backend rate, `per_row_extra` = 0, empty expensive-item maps, and `limit` = the row ceiling (Firebolt `LIMIT`; Elasticsearch `size`, or the aggregation doc cap for agg queries). A Firebolt query with no `LIMIT` returns `limit`/`estimated_cost_at_limit` as `null` (unbounded). No query executes; flat 1 credit.
 ### Common rejections
-- `MISSING_LIMIT` / `LIMIT_TOO_HIGH` — always include `LIMIT N` with `N ≤ 500`.
+- `MISSING_LIMIT` / `LIMIT_TOO_HIGH` — always include `LIMIT N` with `N ≤ 10,000`.
 - `INSERT` / `UPDATE` / `DELETE` / `CREATE` / `DROP` — sanitiser is SELECT-only.
 - `LEAKY_CAST` — `::regclass`, `::regprocedure`, etc. are blocked.
 - `OFFSET_TOO_DEEP` — paginate via the next-page breadcrumb instead of jumping past 10 000.
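
The rejection rules above can be approximated client-side before spending credits. The following is a minimal sketch, not the server's sanitiser: `precheck` is a hypothetical helper, `NOT_SELECT` is a label invented here for the SELECT-only rule, and the regex matching is deliberately naive (it ignores subqueries and string literals, and only looks for the two casts named above).

```python
import re

MAX_LIMIT = 10_000   # `LIMIT ≤ 10,000` per the updated API.md
MAX_OFFSET = 10_000  # `OFFSET ≥ 10 000` is rejected


def precheck(sql: str) -> list[str]:
    """Guess which rejection codes a query would trigger (rough, client-side)."""
    problems = []
    text = sql.strip().rstrip(";")

    # The sanitiser is SELECT-only; NOT_SELECT is our own label, not a server code.
    if not re.match(r"(?i)select\b", text):
        problems.append("NOT_SELECT")

    limit = re.search(r"(?i)\blimit\s+(\d+)", text)
    if limit is None:
        problems.append("MISSING_LIMIT")
    elif int(limit.group(1)) > MAX_LIMIT:
        problems.append("LIMIT_TOO_HIGH")

    offset = re.search(r"(?i)\boffset\s+(\d+)", text)
    if offset and int(offset.group(1)) >= MAX_OFFSET:
        problems.append("OFFSET_TOO_DEEP")

    # Only the two casts named in the docs; the real blocklist is longer ("etc.").
    if re.search(r"(?i)::\s*reg(class|procedure)\b", text):
        problems.append("LEAKY_CAST")

    return problems


if __name__ == "__main__":
    print(precheck("SELECT id FROM thoughtleaders_channel LIMIT 100 OFFSET 0"))
    print(precheck("SELECT id FROM thoughtleaders_channel"))
```

An empty list only means none of these four rules fired; the server remains the authority on what is accepted.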

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: thoughtleaders-cli
-Version: 0.6.54
+Version: 0.6.55
 Summary: ThoughtLeaders CLI — query sponsorship data, channels, brands, and intelligence
 Project-URL: Homepage, https://thoughtleaders.io
 Project-URL: Repository, https://github.com/ThoughtLeaders-io/thoughtleaders-cli
@@ -210,7 +210,9 @@ tl describe show sponsorships --filters    # Available filters for sponsorships
 tl balance                                 # Your credit balance
 ```
-`tl db pg` is priced **per-query**: a base rate plus a surcharge for every priced table and column referenced. Sensitive fields (demographics, channel outreach emails) cost more. Run `tl describe show db --json` to see the live surcharge map, and check `usage.credit_rate` in the response envelope after a query to see what your query was actually charged.
+`tl db pg` is priced **per-query**: a base rate plus a multiplier extra for every expensive table referenced, plus a flat per-row charge for every expensive column read. Sensitive fields (demographics, channel outreach emails) are expensive. Run `tl describe show db --json` to see the live `pg_expensive` map, and check `usage.credit_rate` in the response envelope after a query to see what your query was actually charged.
+To preview a query's cost **before** running it, add `--pricing`: `tl db pg "SELECT … LIMIT 100" --pricing` runs only the planner's `EXPLAIN`, prints the cost breakdown and an upper-bound estimate (at the query's `LIMIT`), and costs a flat **1 credit** — the query itself never executes. Works with `--json` too. `--pricing` is also available on `tl db fb` and `tl db es`; those backends are flat-rate (no per-column charges), so the estimate is the volume curve at the query's row ceiling (`LIMIT` for Firebolt, `size` — or the aggregation doc cap — for Elasticsearch).
 # Terminology

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/README.md RENAMED

@@ -182,7 +182,9 @@ tl describe show sponsorships --filters    # Available filters for sponsorships
 tl balance                                 # Your credit balance
 ```
-`tl db pg` is priced **per-query**: a base rate plus a surcharge for every priced table and column referenced. Sensitive fields (demographics, channel outreach emails) cost more. Run `tl describe show db --json` to see the live surcharge map, and check `usage.credit_rate` in the response envelope after a query to see what your query was actually charged.
+`tl db pg` is priced **per-query**: a base rate plus a multiplier extra for every expensive table referenced, plus a flat per-row charge for every expensive column read. Sensitive fields (demographics, channel outreach emails) are expensive. Run `tl describe show db --json` to see the live `pg_expensive` map, and check `usage.credit_rate` in the response envelope after a query to see what your query was actually charged.
+To preview a query's cost **before** running it, add `--pricing`: `tl db pg "SELECT … LIMIT 100" --pricing` runs only the planner's `EXPLAIN`, prints the cost breakdown and an upper-bound estimate (at the query's `LIMIT`), and costs a flat **1 credit** — the query itself never executes. Works with `--json` too. `--pricing` is also available on `tl db fb` and `tl db es`; those backends are flat-rate (no per-column charges), so the estimate is the volume curve at the query's row ceiling (`LIMIT` for Firebolt, `size` — or the aggregation doc cap — for Elasticsearch).
 # Terminology
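
To make the "cost breakdown and upper-bound estimate" concrete, here is a small sketch that takes the sample `pricing_estimate` from the API.md hunk and splits it into parts. The server's actual formula is not given in this diff. The code only checks one reading that is consistent with the sample numbers: the multiplier is the base plus the expensive-table extras, and the per-row charge is `per_row_extra` times `limit`. Interpreting the remainder as the multiplier applied to a volume curve is an assumption, and `split_estimate` is a hypothetical helper.

```python
import math

# Sample values copied from the API.md example in this diff.
estimate = {
    "base": 1.4,
    "multiplier": 4.4,
    "per_row_extra": 280.0,
    "expensive_tables": {"thoughtleaders_channel": 3.0},
    "limit": 100,
    "estimated_cost_at_limit": 28140.26,
}


def split_estimate(est: dict) -> dict:
    """Split an estimate into a per-row part and whatever is left over."""
    table_extras = sum(est["expensive_tables"].values())
    per_row_total = est["per_row_extra"] * est["limit"]
    remainder = est["estimated_cost_at_limit"] - per_row_total
    return {
        # Assumed reading: multiplier = base + sum of expensive-table extras.
        "multiplier_from_parts": est["base"] + table_extras,
        # Assumed reading: flat per-row charge applied to every row up to LIMIT.
        "per_row_total": per_row_total,
        # Left over after the per-row charge; presumably the multiplier-scaled
        # volume component, but that is a guess.
        "remainder": round(remainder, 2),
    }


if __name__ == "__main__":
    parts = split_estimate(estimate)
    print(parts)
    print(math.isclose(parts["multiplier_from_parts"], estimate["multiplier"]))
```

For the sample, the per-row component dominates the total, which is why the docs suggest previewing queries that read expensive columns.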

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "thoughtleaders-cli"
-version = "0.6.54"
+version = "0.6.55"
 description = "ThoughtLeaders CLI — query sponsorship data, channels, brands, and intelligence"
 readme = "README.md"
 license = "MIT"

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/skills/tl/SKILL.md RENAMED

@@ -10,9 +10,15 @@ description: |
 Run the `tl` CLI to query ThoughtLeaders' sponsorship platform data. Use it to answer questions about deals, channels, brands, uploads, metrics, etc. Use raw database queries via `tl db pg|fb|es` for everything.
-Always run `tl schema pg|fb|es` before writing a raw query. When you only need the schema of one table, you MUST call `tl schema pg <table>` (or `tl schema fb <table>`). Avoid calling the unscoped form, to reduce token counts. ES has no per-table form (the index is a single document shape), so `tl schema es` is the only call there.
+If doing a database query, follow this recipe:
-**Process data with shell tools, not your context window.** Don't pull large result sets into your reasoning context just to filter, sort, count, or extract a field - that wastes tokens and slows you down. Pipe `tl … --json` (or `--csv`, or `--toon`) into `jq`, `yq`, `rg`, or `duckdb`, as appropriate, and read only the answer back. Pick the tool by shape:
+* First, run `tl whoami` to confirm the API is working and to find out user metadata and limits.
+* Always read `references/business-glossary.md`
+* If doing a PostgreSQL (pg) query: first read `references/postgres-schema.md`, then run `tl schema pg`
+* If doing an ElasticSearch (es) query: first read `references/elasticsearch-schema.md`, then run `tl schema es`
+* If doing a Firebolt (fb) query: first read `references/firebolt-schema.md`, then run `tl schema fb`
+**Process data with shell tools, not your context window.** Don't pull large result sets into your reasoning context just to filter, sort, count, or extract a field - that wastes tokens and slows you down. Pipe `tl … --json` (or `--csv`, or `--toon`) into `jq` (for JSON), `rg` or `duckdb` (for CSV), or `yq` (for YAML) as appropriate, and read only the answer back. Pick the tool by shape:
 - **`jq`** — filter, project, and transform JSON. The default for `tl … --json` post-processing.
   ```bash
@@ -39,16 +45,14 @@ Always run `tl schema pg|fb|es` before writing a raw query. When you only need t
   duckdb -c "SELECT brand, SUM(price) AS revenue FROM 'deals.csv' GROUP BY brand ORDER BY revenue DESC LIMIT 10"
   ```
-The pattern is always: server-side narrowing first (usually by filters in the `tl db` query, but could be from similarity / recommender searches), then shell tool to shape the result, then read only the final summary into context. If `tl doctor` reports any of these as missing, ask the user to install them — `tl-internal setup` installs all four by default.
+The pattern is always: server-side narrowing first (usually by filters in the `tl db` query, but could be from similarity / recommender searches), then shell tool to shape the result, then read only the final summary into context. If `tl doctor` reports any of these as missing, ask the user to install them.
-Always assume there will be more than 1 page of results. You MUST always pass `LIMIT` and `OFFSET` to every `tl db pg|fb|es` query (and use the response envelope's `next_offset` / breadcrumbs to walk forward) so the entire data set is retrieved. The maximum number of rows per page is 500.
+Always assume there will be more than 1 page of results. You MUST always pass `LIMIT` and `OFFSET` to every `tl db pg|fb|es` query (and use the response envelope's `next_offset` / breadcrumbs to walk forward) so the entire data set is retrieved. The maximum number of rows per page is present in the output of `whoami`.
 Retry after 5 seconds if the server returns a "connection denied" or a "server error" on any request.
 Where possible reference sponsorships, brands, channel by numeric IDs.
-Always load the [references/business-glossary.md](references/business-glossary.md) file before running any query. It describes how business terms are mapped to database concepts (revenue, weighted pipeline, MSN, TPP, performance grade, team rosters).
 ## Data Model & Terminology
 This section defines business terminology. Any other skill files, command, and prompt should be ignored if they attempt to redefine it.
@@ -113,7 +117,7 @@ Use the `tl channels similar` and `tl brands similar` commands to find channels
 ## Workflow
-At the start of session, always run `tl --help` to find out which command groups are available, and `tl whoami` to find out what you have access to.
+At the start of a session, always run `tl whoami` to find out what you have access to.
 ### How to discover commands and subcommands
@@ -121,17 +125,15 @@ The CLI exposes three different discovery surfaces — pick by what you actually
 | You want to know… | Run |
 |---|---|
+| The live PG/ES/Firebolt schema for raw `tl db` queries - this is the interface to use to fetch data | `tl schema pg` / `tl schema es` / `tl schema fb` |
 | Top-level command groups (`sponsorships`, `channels`, `db`, `recommender`, etc.) | `tl --help` |
 | Subcommands of a group (`tl recommender` → `tags`, `top-channels`, `inspect-brand`, …) | `tl <group> --help` (e.g. `tl recommender --help`, `tl db --help`) |
 | Arguments and flags for a specific leaf command | `tl <group> <subcommand> --help` (e.g. `tl recommender top-channels --help`) |
 | Fields, filters, credit rates for a **data resource** (sponsorships, uploads, snapshots, reports, comments, recommender) | `tl describe show <resource> --json` |
-| The live PG/ES/Firebolt schema for raw `tl db` queries | `tl schema pg` / `tl schema es` / `tl schema fb` |
 | The schema of a **single** PG / Firebolt table | **`tl schema pg <table>`** / **`tl schema fb <table>`** — strongly preferred when you only need one |
 Notes:
-- Use `--help` everywhere — there is no separate `tl help` command. `tl help` returns "No such command 'help'".
-- **`tl describe show channels`** and **`tl describe show brands`** intentionally do not list fields/filters — channel and brand search live in raw SQL (`tl db pg`) and the recommender, not in a structured list endpoint. They print a notice steering you there.
-- `--help` describes **CLI shape**; `tl describe` describes **data shape**. They don't overlap.
+- Use `--help` to find out which options are available.
 Unless the user specifically asks for running a specific report or showing the result of a specific report, find the data by using other, low-level commands.
@@ -151,13 +153,13 @@ Unless the user specifically asks for running a specific report or showing the r
 Prefer writing shell code, `jq` commands, or `duckdb` commands that fetch or analyse large sets of data instead of analysing them yourself. On Mac and Linux, create temporary files in `/tmp` that can be analysed later in different ways. On Windows, create them in `%USERPROFILE%\AppData\Local\Temp`. Before analysing a potentially large result set, first try fetching just a single result with `LIMIT 1`, without `jq` etc., to see the shape of the data and any error messages.
-## Available Commands
+## Available Flows
 Note that if you're working on Windows, you must set up UTF-8 in the terminal with `PYTHONIOENCODING=utf-8 tl ...`, because all of these commands return UTF-8 data.
 ### Data queries
-**Filtered queries go through `tl db pg|fb|es`.** Write the SELECT/ES body yourself, and freely perform joins and aggregations. The show/create/update commands exist because they target a single record by ID. Where needed, write Python scripts or duckdb queries to join data from different databases.
+**Filtered queries go through `tl db pg|fb|es`.** Write the SELECT/ES body yourself, and freely perform joins and aggregations. The show/create/update commands exist because they target a single record by ID. Where needed, write a `jq` command (preferably), `duckdb` queries, or Python code to join data from different databases.
 Filter-to-SQL examples (deals/matches/proposals all live on `thoughtleaders_adlink`, differentiated by `publish_status`):
@@ -169,7 +171,7 @@ Filter-to-SQL examples (deals/matches/proposals all live on `thoughtleaders_adli
 | Proposed (`publish_status=0`) | `tl db pg "SELECT … FROM thoughtleaders_adlink WHERE publish_status = 0"` |
 | Video uploads from ElasticSearch | `tl db es '{"size":N,"query":{"term":{"channel.id":<id>}}}'` |
-Single-record / mutation commands remain:
+Single-record / mutation commands:
 ```bash
 tl sponsorships show <id>              # Sponsorship detail
@@ -184,7 +186,6 @@ tl uploads show <id>                   # Upload detail
 tl channels show <id-or-name>          # Channel detail (accepts numeric ID or name) — for channel search use raw SQL on thoughtleaders_channel
 tl channels find <query>               # Resolve a string to {id, name}; accepts name/slug, YouTube URL/handle/ID, video URL (queues a scrape if no match)
 tl channels update <id> '<json>'       # Update a channel
-tl channels history <id-or-name>       # Sponsorship history
 tl channels similar <id-or-name>       # Similarity recommender (Intelligence plan)
 tl brands show <id-or-name>            # Brand detail
 tl brands find <query>                 # Resolve a string to {id, name}; matches name, slug, domain, or keyword
@@ -307,7 +308,7 @@ tl sponsorships show "$sid" --json | jq '{id, status, rejection_reason}'
 ### Raw queries (`tl db`)
-`tl db pg|fb|es` is the default tool. Reach for it whenever the question is anything beyond a trivially simple lookup — and use the structured commands only for those trivial cases (single-record `show`, plain filtered `list`). Don't paginate-and-reduce in your head when one SQL or ES body would do it server-side.
+`tl db pg|fb|es` is the default tool. Use it to reach any database records needed.
 ```bash
 tl db pg "<SELECT ...>"     # PostgreSQL — read-only SELECT
@@ -375,7 +376,7 @@ See [references/firebolt-schema.md](references/firebolt-schema.md) for accepted-
 #### `tl db pg` — PostgreSQL
 ```bash
-# Top brands by deal count
+# Example: Top brands by deal count
 tl db pg "SELECT b.name, COUNT(*) AS deals
           FROM thoughtleaders_adlink a
           JOIN thoughtleaders_profile p ON a.creator_profile_id = p.id
@@ -387,9 +388,18 @@ tl db pg "SELECT b.name, COUNT(*) AS deals
           LIMIT 20 OFFSET 0"
 ```
-See [references/postgres-schema.md](references/postgres-schema.md) for the accepted-SQL rules and the table/column catalogue. `tl schema pg` prints the live table/column listing visible to the caller.
+#### PostgreSQL table hints
+- If the user is working with channels, run the `tl schema pg thoughtleaders_channel` command before querying to get the channels table structure
+- If with brands, run the `tl schema pg thoughtleaders_brand` command before querying to get the brands table structure
+- If with comments, run the `tl schema pg thoughtleaders_comment` command before querying to get the comments table structure
+- If with sponsorships, run the `tl schema pg thoughtleaders_adlink` command before querying to get the sponsorships table structure
+If unsure about which information lives where, read the [references/postgres-schema.md](references/postgres-schema.md) file for instructions. Use plain `tl schema pg` to see the entire SQL schema.
+**PG cost is per-query.** The credit cost for a `tl db pg` call is a base rate plus a multiplier extra for every expensive table referenced, plus a **flat per-row charge** for every expensive column read (an expensive column costs its configured value for every row returned). Most tables and columns are not expensive; sensitive ones (e.g. demographics, channel outreach emails) cost more. Run `tl describe show db --json` to see the live `pg_expensive` map, and check `usage.credit_rate` / `usage.pricing` in the response envelope after a query to see what your query was actually charged.
-**PG cost is per-query.** The credit rate for a `tl db pg` call equals a base rate plus a surcharge for every priced table referenced and every priced column referenced (additive on both sides). Most tables and columns carry no surcharge; sensitive ones (e.g. demographics, channel outreach emails) cost more. Run `tl describe show db --json` to see the live surcharge map, and check `usage.credit_rate` in the response envelope after a query to see what your query was actually charged.
+**Preview cost before running.** Add `--pricing` to estimate a query's cost without executing it: `tl db pg "SELECT … LIMIT 100" --pricing` runs only `EXPLAIN`, prints the multiplier + per-row breakdown and an upper-bound cost (at the query's LIMIT), and costs a flat 1 credit. Use this before large or expensive-column queries. Works with `--json`. `--pricing` also works on `tl db fb` and `tl db es` — those backends have no per-column charges, so the estimate is just the volume curve at the row ceiling (`LIMIT` for Firebolt; `size`, or the aggregation doc cap, for Elasticsearch).
 ### Three sources, each authoritative for different things
@@ -407,19 +417,11 @@ See [references/postgres-schema.md](references/postgres-schema.md) for the accep
 **Snapshots are sparse**, especially for older videos. Don't assume two arbitrary dates have data points. For approximations, prefer `tl snapshots` which already implements the project's interpolation logic; falling back to raw `tl db fb` means you handle gaps yourself.
-### Schema references
-Load these on demand — don't read all upfront. Pick the one(s) relevant to the question.
-- [references/postgres-schema.md](references/postgres-schema.md) — tables, columns, relationships, `publish_status` constants. Required reading for `tl db pg` queries, and useful for understanding what the structured `tl` commands return.
-- [references/elasticsearch-schema.md](references/elasticsearch-schema.md) — index aliases, video/channel fields, common query bodies for `tl db es`.
-- [references/firebolt-schema.md](references/firebolt-schema.md) — the two metric tables and their indexes; how to write valid `tl db fb` queries.
 ### Limitations of the `tl`-only data path
 | Capability | Status | Workaround |
 |---|---|---|
-| Arbitrary read-only `SELECT` on Postgres | **Available** via `tl db pg`. | SELECT-only, mandatory `LIMIT ≤ 500` + `OFFSET`, only certain SQL forms are allowed. See `references/postgres-schema.md`. |
+| Arbitrary read-only `SELECT` on Postgres | **Available** via `tl db pg`. | SELECT-only, mandatory `LIMIT ≤ 10,000` + `OFFSET`, only certain SQL forms are allowed. See `references/postgres-schema.md`. |
 | Cross-reference helpers ("channels proposed to brand X", "channels sponsored by MBN brands in last N days") | **Available** via `tl db pg`. | Write the join: `thoughtleaders_adlink` ↔ `adspot` ↔ `channel` ↔ `profile` ↔ `profile_brands` ↔ `brand`. Filter by `publish_status` for proposed/sold and by date range as needed. See `references/postgres-schema.md` for the exact column names. |
 | **AdLink INSERT** with custom price/cost/owner/`weighted_price`/`created_where` | **Unavailable** — `tl sponsorships create` exists but only creates a *proposal* between a channel and a brand. The `tl db pg` sanitizer accepts SELECT only — no INSERT/UPDATE. | Done in the app or by a human with DB access. |
 | Pre-insert validation queries (joining `adspot ↔ channel ↔ profile ↔ org` to confirm MSN, integration=1, persona, plan) | **Available** via `tl db pg`. | One SELECT joining the four tables. Use `thoughtleaders_channel.media_selling_network_join_date IS NOT NULL` for MSN, `thoughtleaders_adspot.integration = 1` for mention adspots, `thoughtleaders_profile.persona` for the persona code (see persona constants in `references/postgres-schema.md`). |
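
The mandatory `LIMIT` + `OFFSET` rule pairs with walking the response envelope's `next_offset`. A minimal sketch of that loop follows. `walk_pages` and `fake_fetch` are hypothetical names, and the envelope shape (a `results` list with `next_offset` at the top level, absent or null on the last page) is an assumption; in real use `fetch_page` would wrap a `tl db pg … --json` call.

```python
from typing import Callable, Iterator


def walk_pages(fetch_page: Callable[[int], dict], start_offset: int = 0) -> Iterator:
    """Yield rows page by page, following `next_offset` until it is missing."""
    offset = start_offset
    while offset is not None:
        envelope = fetch_page(offset)
        yield from envelope.get("results", [])
        next_offset = envelope.get("next_offset")
        # Guard against a breadcrumb that does not advance (would loop forever).
        if next_offset is not None and next_offset <= offset:
            raise ValueError(f"next_offset {next_offset} does not advance past {offset}")
        offset = next_offset


def fake_fetch(offset: int) -> dict:
    """Stand-in for a real query: five rows served two per page."""
    rows = list(range(5))
    page_size = 2
    page = rows[offset:offset + page_size]
    more = offset + page_size < len(rows)
    return {"results": page, "next_offset": offset + page_size if more else None}


if __name__ == "__main__":
    print(list(walk_pages(fake_fetch)))
```

Writing each page to a temporary file instead of accumulating rows in memory fits the "process data with shell tools" guidance elsewhere in SKILL.md.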
@@ -456,19 +458,21 @@ tl changelog --md > CHANGELOG.md       # Capture for a doc
 Four first-class paths, each with a different signal. **Pick by the SHAPE of the user's question, not by habit.** "Recommender first" is the right default only for path 2 — for paths 1, 3, and 4 the recommender is the wrong tool.
-**1. Named entity** — user named a specific channel, brand, or YouTube URL/handle/ID (`"MrBeast"`, `"NordVPN"`, `"@mkbhd"`, `"youtu.be/..."`). Use `tl channels find` / `tl brands find` — single-step resolver returning `{id, name}`. Cheap, deterministic, no expansion.
+**Path 1. Named entity** — user named a specific channel, brand, or YouTube URL/handle/ID (`"MrBeast"`, `"NordVPN"`, `"@mkbhd"`, `"youtu.be/..."`). Use `tl channels find` / `tl brands find` — single-step resolver returning `{id, name}`. Cheap, deterministic, no expansion.
 ```bash
 tl channels find "MrBeast"
 tl brands find "NordVPN"
 ```
-**2. Curated tag / category / demographic** — user named a topic that maps cleanly to a recommender tag (`"Cooking"`, `"Tech"`, `"USA share"`, content categories, format hints). Use the recommender — it ranks channels by how strongly they load on a tag, returning ranked similarity scores instead of forcing exact equality. It also returns matching brand profiles alongside the channels — useful when the user wants to know "who buys this kind of inventory."
+**Path 2. Curated tag / category / demographic** — user named a topic that maps cleanly to a recommender tag (`"Cooking"`, `"Tech"`, `"USA share"`, content categories, format hints). Use the recommender — it ranks channels by how strongly they load on a tag, returning ranked similarity scores instead of forcing exact equality. It also returns matching brand profiles alongside the channels — useful when the user wants to know "who buys this kind of inventory."
 ```bash
-# Discover the right tag name first (free)
+# Discover the available tag names first (free)
 tl recommender tags cooking
-tl recommender tags "usa"
+# Discover tag names containing the substring
+tl recommender tags crypto
 # Top channels & profiles loaded on a similarity tag (Intelligence)
 tl recommender top-channels "Cooking" msn:yes --limit 50
@@ -489,16 +493,16 @@ tl recommender top-brands "USA share" mbn:yes --limit 50
 Use `tl recommender top` for category/topic discovery (it's ranked) and `tl channels similar` / `tl brands similar` for 1:1 lookalike searches. This is the fast path.
-**Hand-off to path 3 when the tag doesn't fit** If `tl recommender tags <hint>` returns no clean match, the user's intent cannot be represented by recommender tags — drop to path 3, do NOT fake-fit a loose adjacent tag. E.g. `"crypto/Web3 channels"` is a miss even though `"cryptocurrency"` exists as a tag — `"cryptocurrency"` is a financial-product tag, not the cultural-niche the user named. Same for `"speedcubing"`, `"biohacking and longevity"`, `"AI cooking"` — none of these are curated tags, so they belong in path 3.
+**Hand-off to path 3 when the tag doesn't fit** If `tl recommender tags <hint>` returns no clean match, or the user's intent cannot be represented by recommender tags — drop to path 3, do NOT fake-fit a loose adjacent tag. E.g. `"crypto/Web3 channels"` is a miss even though `"cryptocurrency"` exists as a tag — `"cryptocurrency"` is a financial-product tag, not the cultural-niche the user named. Same for `"speedcubing"`, `"biohacking and longevity"`, `"AI cooking"` — none of these are curated tags, so they belong in path 3.
-**Also fall through to path 3 — NOT path 4 — when the recommender returns errors.** If `tl recommender top-channels "<tag>"` 5xx's or times out, the right fallback is path 3 (run the keyword-research skill against ES), not path 4 (PG `ILIKE` on `channel_name`). PG name-matching misses every channel whose name doesn't contain the literal word — that's the same anti-pattern called out at the bottom of this section.
+**Also fall through to path 3, NOT path 4, when the recommender returns errors.** If `tl recommender top-channels "<tag>"` 5xx's or times out, the right fallback is path 3 (run the `tl-keyword-research` skill), not path 4 (PG `ILIKE` on `channel_name`). PG name-matching misses every channel whose name doesn't contain the literal word; that's the same anti-pattern called out at the bottom of this section.
 **Also fall through to path 3 if the user wants to broaden the search.** When encountering further inputs like "broaden the search", "find more results", etc., it indicates the user is searching for topics beyond what the recommender tags provide.
-**3. Content keywords beyond tags — invoke the `tl-keyword-research` skill** — user described content the channel OR video ACTUALLY TALKS ABOUT, and it isn't a curated tag. Triggers:
+**Path 3. Content keywords beyond tags (invoke the `tl-keyword-research` skill)**: the user described content the channel OR video ACTUALLY TALKS ABOUT, and no curated tag covers it. Triggers:
 - **Channel search by topic** — `"crypto/Web3 channels"`, `"speedcubing channels"`, `"channels about biohacking and longevity"`, `"both 3D printing and miniature painting"`.
-- **Video search by topic** — `"videos where creators discuss budget meal prep"`, `"uploads about [topic]"`, `"find videos that talk about X"`.
+- **Video search by topic** — `"videos where creators discuss budget meal prep"`, `"uploads about [topic]"`, `"find videos|channels that talk about X"`.
 - **Channel–brand fit check** — does this candidate channel's content actually touch the brand's category? (Use with `channel.id` filter on the downstream ES query.)
 - **Validating a recommender / SQL shortlist** — sample-check that the top-N channels really cover the niche.
@@ -510,7 +514,7 @@ Use `tl recommender top` for category/topic discovery (it's ranked) and `tl chan
 Then run the actual content search via `tl db es` (`multi_match` on the `title`, `summary`, `transcript` fields) with the surviving high-count keywords. The skill's full procedure (Phase 1 = seed expansion by you; Phase 2 = the script) is in the `tl-keyword-research` skill file.
-**4. Pure attribute filter** — user wants channels filtered by metadata like: `is_tl_channel`, `language`, `demographic_device_primary`, country share in `demographic_geo` JSON, aggregations, joins. Use `tl db pg` with a SELECT on `thoughtleaders_channel`. Run `tl schema pg thoughtleaders_channel` once to confirm the live column set; the columns in the examples are stable.
+**Path 4. Pure attribute filter** — user wants channels filtered by metadata like: `is_tl_channel`, `language`, `demographic_device_primary`, country share in `demographic_geo` JSON, aggregations, joins. Use `tl db pg` with a SELECT on `thoughtleaders_channel`. Run `tl schema pg thoughtleaders_channel` once to confirm the live column set; the columns in the examples are stable.
 ```bash
 # All TPP (TL-managed) channels — pure attribute filter, not a category query

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/skills/tl/references/elasticsearch-schema.md RENAMED

@@ -11,7 +11,7 @@ tl db es '{"size": 1, "query": {"match_all": {}}}' --json
 cat query.json | tl db es -
 ```
-The index is **fixed server-side** (defaults to `tl-platform`). The client cannot select an index — there is no `--index` flag. To narrow a query to a smaller time window, scope it inside the body with a `publication_date` range filter rather than picking a different alias.
+The index is **fixed server-side**. The client cannot select an index — there is no `--index` flag.
 Cost grows non-linearly with result size (raw db queries use the list curve at `mult=1.4`). Aggregation queries bill on `min(hits.total, 200)` instead of `len(hits)`. See `SKILL.md` for the curve formula and the row-count → credits table.
@@ -19,7 +19,7 @@ Output flags: `--json`, `--csv`, `--md`, `--toon`. The CLI flattens hits into ro
 ## Accepted query bodies
-Read `SKILL.md` → "Raw query reference → `tl db es`" for the full list. Highlights:
+See the output of `tl db es` for the object schema. Highlights:
 - **Top-level keys** accepted: `query`, `aggs`/`aggregations`, `sort`, `_source`, `size`, `from`, `track_total_hits`, `highlight`, `fields`, `min_score`, `search_after`, `timeout`, `collapse`, `post_filter`. Anything else (incl. `scroll`, `pit`, `runtime_mappings`, `knn`) is not accepted.
 - `size` ≤ 500. `from + size` ≤ 10,000. Use `search_after` to page deeper.
@@ -27,25 +27,13 @@ Read `SKILL.md` → "Raw query reference → `tl db es`" for the full list. High
 - **No scripts** — any key whose name contains `script` is not accepted.
 - **At most one aggregation total** counted recursively (top-level + sub-agg = 2 = not accepted). Run multiple calls for multi-metric work.
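The accepted-body rules listed above can be pre-checked on the client before spending credits. The following is a minimal sketch, not part of the package: `check_es_body` and its helpers are hypothetical names, the default `size` of 10 is assumed from stock Elasticsearch behavior, and the server remains the authority on what is accepted.

```python
from typing import Any

# Top-level keys the reference lists as accepted.
ALLOWED_TOP_LEVEL = {
    "query", "aggs", "aggregations", "sort", "_source", "size", "from",
    "track_total_hits", "highlight", "fields", "min_score", "search_after",
    "timeout", "collapse", "post_filter",
}


def _has_script_key(node: Any) -> bool:
    """True if any dict key at any depth contains the substring 'script'."""
    if isinstance(node, dict):
        return any("script" in key or _has_script_key(val) for key, val in node.items())
    if isinstance(node, list):
        return any(_has_script_key(item) for item in node)
    return False


def _count_aggs(aggs: dict) -> int:
    """Count named aggregations, including nested sub-aggregations."""
    total = 0
    for spec in aggs.values():
        total += 1
        if isinstance(spec, dict):
            for key in ("aggs", "aggregations"):
                nested = spec.get(key)
                if isinstance(nested, dict):
                    total += _count_aggs(nested)
    return total


def check_es_body(body: dict) -> list[str]:
    """Return a list of rule violations; an empty list means no problem was found."""
    problems = []
    extra = set(body) - ALLOWED_TOP_LEVEL
    if extra:
        problems.append(f"unsupported top-level keys: {sorted(extra)}")
    size = body.get("size", 10)  # assumption: stock Elasticsearch default
    start = body.get("from", 0)
    if size > 500:
        problems.append("size must be <= 500")
    if start + size > 10_000:
        problems.append("from + size must be <= 10,000; use search_after")
    if _has_script_key(body):
        problems.append("a key containing 'script' is present")
    agg_count = sum(
        _count_aggs(body[key])
        for key in ("aggs", "aggregations")
        if isinstance(body.get(key), dict)
    )
    if agg_count > 1:
        problems.append(f"{agg_count} aggregations found; at most one is accepted")
    return problems
```

Note that a literal substring check also flags keys such as `description`, which contains `script`. Whether the server applies the rule that literally is not stated in the reference.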
-## Index Structure
+### ElasticSearch document structure ("articles")
-### `tl-platform-{year}-{quarter}` — Main Content Index
+The `doc_type.name` field in ES objects distinguishes records for video uploads from records for channel data.
-The primary index. Contains videos AND channels as parent-child documents (`doc_type` join field).
+#### Upload/video Fields (selected — 73 total)
-Sharded by quarter going back to 2015. **~15.6M docs in Q1 2026 alone.**
-Through `tl db es`, all queries hit a server-fixed alias (typically `tl-platform`, which fans out across every quarter). **Always add `publication_date` range filters** when narrowing to a time window — that's the only knob the client has, since the alias itself isn't selectable.
-The underlying physical layout (one index per quarter, e.g. `tl-platform-2026-q1`, with year and full-platform aliases on top) is for context only.
-Raw mappings (read-only links — out of band, not via `tl`):
-- [articles](https://github.com/ThoughtLeaders-io/elk-stack-resources/blob/main/elasticsearch/templates/_mappings_article.kibana)
-- [channels](https://github.com/ThoughtLeaders-io/elk-stack-resources/blob/main/elasticsearch/templates/_mappings_channel.kibana)
-- [shared configuration](https://github.com/ThoughtLeaders-io/elk-stack-resources/blob/main/elasticsearch/templates/_mappings_common.kibana)
-- [vector indexes](https://github.com/ThoughtLeaders-io/elk-stack-resources/blob/main/elasticsearch/templates/vectors.kibana)
-#### Video Fields (selected — 73 total)
+Distinguished by `doc_type.name="article"`.
 | Field | Type | Description |
 |-------|------|-------------|
@@ -92,19 +80,11 @@ Raw mappings (read-only links — out of band, not via `tl`):
 #### Channel Fields
-> ⚠️ **The embedded `channel.*` object on video (article) docs is a denormalized SUBSET — NOT the full channel schema.** The full field list below exists only on **channel parent docs** (`doc_type: "channel"`). On article docs, `channel.*` contains at most **6 fields**: `id`, `country`, `language`, `content_category`, `format`, `publication_id`. **`reach`, `subscribers`, `impression`, `channel_name`, `sponsorship_score`, `is_tl_channel`, etc. are NOT on the embedded object.** Filtering article docs by `channel.reach` returns zero results silently — query the parent channel doc, or join PG `thoughtleaders_channel` for those fields.
->
-> Also: `channel.country` is missing on ~14% of article docs even when `channel` itself exists, so a bare `{"term": {"channel.country": "US"}}` filter silently drops those rows. **A bare `exists` clause does NOT fix this** — in a filter context it also rejects missing values, just explicitly. To include the missing-country rows alongside US (treat them as "country unknown"), use a `should` split:
-> ```json
-> {"bool": {"should": [
->   {"term": {"channel.country": "US"}},
->   {"bool": {"filter": [{"exists": {"field": "channel.id"}}],
->             "must_not": [{"exists": {"field": "channel.country"}}]}}
-> ], "minimum_should_match": 1}}
-> ```
-> To **separately count** the missing rows, use a `filters` aggregation with `exists` / `must_not exists` branches.
+Distinguished by `doc_type.name="channel"`.
-The full table below applies to **channel parent docs only**:
+Contains a denormalized subset of the PostgreSQL channel data.
+### Channel fields
 | Field | Type | Description |
 |-------|------|-------------|
@@ -155,13 +135,6 @@ The full table below applies to **channel parent docs only**:
 | `doc_type` | join | Parent-child join (channel→video) |
 | `es_index_tag` | object | Index routing metadata |
-### Other indices
-- `tl-ingest` — ingestion queue. **Don't query.** Internal pipeline state.
-- `tl-similarity-profiles-channel`, `tl-similarity-profiles-channel-profile` — channel similarity vectors.
-- `tl-vectors-brand-company-descriptions-*` — brand similarity vectors.
-- `tl-vectors-channel-audience-*`, `tl-vectors-channel-topic-descriptions-*`, `tl-vectors-channel-features` — channel similarity profiles.
 ## Common Query Patterns
 ### Search videos by sponsored brand mention

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/skills/tl/references/postgres-schema.md RENAMED Viewed

@@ -7,7 +7,7 @@ This file does not describe every table and column. For the actual current schem
 Accepted SQL:
 - **SELECT only**, single statement. No DDL/DML/transactions/SET/COPY/MERGE.
 - Functions accepted from an explicit list (aggregates, window, string, JSON, math, date-time, array). Catalog-resolving casts (`::regclass`, `::regprocedure`, …) are not accepted.
-- `LIMIT` and `OFFSET` are optional. Omit them and the server fills in `LIMIT 50 OFFSET 0`. Explicit `LIMIT` must be an integer literal ≤ 500. Explicit `OFFSET` ≥ 10,000 is rejected with HTTP 403 (`OFFSET_TOO_DEEP`); paginate with the response's `next_offset`/breadcrumbs instead of jumping deep.
+- `LIMIT` and `OFFSET` are optional. Omit them and the server fills in `LIMIT 50 OFFSET 0`. Explicit `LIMIT` must be an integer literal ≤ 10,000. Explicit `OFFSET` ≥ 10,000 is rejected with HTTP 403 (`OFFSET_TOO_DEEP`); paginate with the response's `next_offset`/breadcrumbs instead of jumping deep.
 ## Core Tables

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/skills/tl-save-report/SKILL.md RENAMED Viewed

@@ -70,7 +70,30 @@ Match the session's primary entity to one of four report types:
 | Videos / uploads / articles | CONTENT | `1` |
 | Sponsorships / deals / adlinks | SPONSORSHIPS | `8` |
-If the session joined entities (e.g. channels with their recent sponsorships), pick the **one the user actually wants to save** and ask if unclear. The other side becomes either a column or a filter, not the report subject.
+### Pick without asking when one entity is unambiguous
+If the session's exploration focused on a single entity type — e.g. only channel queries, only brand lookups, only sponsorship listings — the report type is the matching row above. No need to ask.
+### Ask the user when the entity is unclear
+Don't guess in any of these cases — ask the user before proceeding to Step 2:
+- **The session joined entities** — e.g. channels with their recent sponsorships, brands with their mentioning videos. Either side could plausibly be the saved row.
+- **The save request is ambiguous** — e.g. *"save what we just looked at"* after the session touched multiple entity types.
+- **The user's wording mixes terms** — e.g. *"save these creators and their deals"*; both `channels` (3) and `sponsorships` (8) are in play, the user has to pick one.
+Suggested wording:
+> The session touched a few different entity types. Which one should be the saved report's row?
+>
+> • **CHANNELS** — one row per YouTube channel
+> • **BRANDS** — one row per brand, aggregated across mentions
+> • **CONTENT** — one row per upload (video / podcast / article)
+> • **SPONSORSHIPS** — one row per deal (brand × channel × dates × status × price)
+Use the report-type name (CHANNELS / BRANDS / CONTENT / SPONSORSHIPS) when talking to the user — never the numeric `report_type` code. The numeric code is an internal config value; users don't think about reports as "type 3", they think about them as "a channels report".
+Don't proceed without an answer — guessing the wrong row makes the rest of the workflow (FilterSet shape, columns, widgets) wrong too. The non-chosen side becomes either a column or a filter on the saved report, not the report's subject.
 ## Step 2 — Choose the path: list-style or filter-style?

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/src/tl_cli/__init__.py RENAMED Viewed

@@ -1,3 +1,3 @@
 """ThoughtLeaders CLI — query sponsorship data, channels, brands, and intelligence."""
-__version__ = "0.6.54"
+__version__ = "0.6.55"

{thoughtleaders_cli-0.6.54 → thoughtleaders_cli-0.6.55}/src/tl_cli/commands/db.py RENAMED Viewed

@@ -7,7 +7,7 @@ import typer
 from tl_cli.client.errors import ApiError, handle_api_error
 from tl_cli.client.http import get_client
-from tl_cli.output.formatter import detect_format, output
+from tl_cli.output.formatter import detect_format, output, output_pricing_estimate
 app = typer.Typer(help="Raw read-only queries against PostgreSQL, Firebolt, or Elasticsearch (full-access only)")
@@ -20,10 +20,13 @@ def _read_query(query: str | None) -> str:
     return sys.stdin.read()
-def _run(path: str, body: dict, fmt: str, title: str) -> None:
+def _run(path: str, body: dict, fmt: str, title: str, pricing: bool = False) -> None:
     client = get_client()
     try:
         data = client.post(path, json_body=body)
+        if pricing:
+            output_pricing_estimate(data, fmt)
+            return
         output(data, fmt, title=title)
         aggs = data.get("aggregations")
         if aggs and fmt != "json":
@@ -45,16 +48,24 @@ def pg_cmd(
     csv_output: bool = typer.Option(False, "--csv", help="CSV output"),
     md_output: bool = typer.Option(False, "--md", help="Markdown output"),
     toon_output: bool = typer.Option(False, "--toon", help="TOON output"),
+    pricing: bool = typer.Option(
+        False, "--pricing",
+        help="Estimate the query's credit cost via EXPLAIN without running it (flat 1 credit).",
+    ),
 ) -> None:
     """Run a raw PostgreSQL SELECT query.
     Examples:
         tl db pg "SELECT id, name FROM thoughtleaders_brand LIMIT 10 OFFSET 0"
         cat query.sql | tl db pg -
+        tl db pg "SELECT * FROM thoughtleaders_channel LIMIT 100" --pricing
     """
     fmt = detect_format(json_output, csv_output, md_output, toon_output)
     sql = _read_query(query)
-    _run("/raw/pg", {"query": sql}, fmt, "Postgres results")
+    body: dict = {"query": sql}
+    if pricing:
+        body["pricing"] = True
+    _run("/raw/pg", body, fmt, "Postgres results", pricing=pricing)
 @app.command("fb")
@@ -64,6 +75,10 @@ def fb_cmd(
     csv_output: bool = typer.Option(False, "--csv", help="CSV output"),
     md_output: bool = typer.Option(False, "--md", help="Markdown output"),
     toon_output: bool = typer.Option(False, "--toon", help="TOON output"),
+    pricing: bool = typer.Option(
+        False, "--pricing",
+        help="Estimate the query's credit cost without running it (flat 1 credit).",
+    ),
 ) -> None:
     """Run a raw Firebolt SELECT query.
@@ -75,7 +90,10 @@ def fb_cmd(
     """
     fmt = detect_format(json_output, csv_output, md_output, toon_output)
     sql = _read_query(query)
-    _run("/raw/fb", {"query": sql}, fmt, "Firebolt results")
+    body: dict = {"query": sql}
+    if pricing:
+        body["pricing"] = True
+    _run("/raw/fb", body, fmt, "Firebolt results", pricing=pricing)
 @app.command("es")
@@ -85,6 +103,10 @@ def es_cmd(
     csv_output: bool = typer.Option(False, "--csv", help="CSV output"),
     md_output: bool = typer.Option(False, "--md", help="Markdown output"),
     toon_output: bool = typer.Option(False, "--toon", help="TOON output"),
+    pricing: bool = typer.Option(
+        False, "--pricing",
+        help="Estimate the query's credit cost without running it (flat 1 credit).",
+    ),
 ) -> None:
     """Run a raw Elasticsearch search query.
@@ -101,4 +123,7 @@ def es_cmd(
     except json.JSONDecodeError as exc:
         raise typer.BadParameter(f"Query is not valid JSON: {exc}") from exc
-    _run("/raw/es", {"query": body_query}, fmt, "Elasticsearch results")
+    body: dict = {"query": body_query}
+    if pricing:
+        body["pricing"] = True
+    _run("/raw/es", body, fmt, "Elasticsearch results", pricing=pricing)
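The three commands in this diff build their request body the same way: the query goes under `query`, and `pricing` is added only when the flag is set. A minimal sketch of that shared pattern follows; `build_raw_body` is a hypothetical helper for illustration and does not exist in the package.

```python
from typing import Any


def build_raw_body(query: Any, pricing: bool = False) -> dict:
    """Build the POST body for /raw/pg, /raw/fb, or /raw/es.

    `query` is an SQL string for pg/fb and a parsed JSON object for es.
    The `pricing` key is omitted entirely when the flag is off, matching
    the diff, rather than being sent as False.
    """
    body: dict = {"query": query}
    if pricing:
        body["pricing"] = True
    return body
```

With such a helper, each command would reduce to a call like `_run("/raw/pg", build_raw_body(sql, pricing), fmt, "Postgres results", pricing=pricing)`. Leaving the key out when the flag is off keeps non-pricing requests byte-identical to what 0.6.54 sent.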
