ifcraftcorpus 1.1.0__tar.gz → 1.2.1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/PKG-INFO +18 -1
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/README.md +17 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/agent-design/agent_prompt_engineering.md +183 -9
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/pyproject.toml +14 -1
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/cli.py +54 -5
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/embeddings.py +11 -7
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/index.py +26 -4
- ifcraftcorpus-1.2.1/src/ifcraftcorpus/logging_utils.py +84 -0
- ifcraftcorpus-1.2.1/src/ifcraftcorpus/mcp_server.py +806 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/providers.py +4 -4
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/search.py +60 -12
- ifcraftcorpus-1.2.1/subagents/README.md +198 -0
- ifcraftcorpus-1.2.1/subagents/if_genre_consultant.md +257 -0
- ifcraftcorpus-1.2.1/subagents/if_platform_advisor.md +306 -0
- ifcraftcorpus-1.2.1/subagents/if_prose_writer.md +187 -0
- ifcraftcorpus-1.2.1/subagents/if_quality_reviewer.md +245 -0
- ifcraftcorpus-1.2.1/subagents/if_story_architect.md +162 -0
- ifcraftcorpus-1.2.1/subagents/if_world_curator.md +280 -0
- ifcraftcorpus-1.1.0/src/ifcraftcorpus/mcp_server.py +0 -410
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/.gitignore +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/LICENSE +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/LICENSE-CONTENT +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/agent-design/multi_agent_patterns.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/audience-and-access/accessibility_guidelines.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/audience-and-access/audience_targeting.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/audience-and-access/localization_considerations.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/audio_visual_integration.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/collaborative_if_writing.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/creative_workflow_pipeline.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/diegetic_design.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/idea_capture_and_hooks.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/if_platform_tools.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/player_analytics_metrics.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/quality_standards_if.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/research_and_verification.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/craft-foundations/testing_interactive_fiction.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/emotional-design/conflict_patterns.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/emotional-design/emotional_beats.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/game-design/mechanics_design_patterns.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/genre-conventions/children_and_ya_conventions.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/genre-conventions/fantasy_conventions.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/genre-conventions/historical_fiction.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/genre-conventions/horror_conventions.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/genre-conventions/mystery_conventions.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/genre-conventions/sci_fi_conventions.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/branching_narrative_construction.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/branching_narrative_craft.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/endings_patterns.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/episodic_serialized_if.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/nonlinear_structure.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/pacing_and_tension.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/romance_and_relationships.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/scene_structure_and_beats.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/narrative-structure/scene_transitions.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/prose-and-language/character_voice.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/prose-and-language/dialogue_craft.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/prose-and-language/exposition_techniques.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/prose-and-language/narrative_point_of_view.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/prose-and-language/prose_patterns.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/prose-and-language/subtext_and_implication.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/prose-and-language/voice_register_consistency.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/scope-and-planning/scope_and_length.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/world-and-setting/canon_management.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/world-and-setting/setting_as_character.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/corpus/world-and-setting/worldbuilding_patterns.md +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/__init__.py +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/parser.py +0 -0
- {ifcraftcorpus-1.1.0 → ifcraftcorpus-1.2.1}/src/ifcraftcorpus/py.typed +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: ifcraftcorpus
|
|
3
|
-
Version: 1.1
|
|
3
|
+
Version: 1.2.1
|
|
4
4
|
Summary: Interactive fiction craft corpus with search library and MCP server
|
|
5
5
|
Project-URL: Homepage, https://pvliesdonk.github.io/if-craft-corpus
|
|
6
6
|
Project-URL: Repository, https://github.com/pvliesdonk/if-craft-corpus
|
|
@@ -124,6 +124,23 @@ results = corpus.search(
|
|
|
124
124
|
| agent-design | 2 | Multi-agent patterns, prompt engineering |
|
|
125
125
|
| game-design | 1 | Mechanics design patterns |
|
|
126
126
|
|
|
127
|
+
## Verbose Logging
|
|
128
|
+
|
|
129
|
+
Set `LOG_LEVEL` (e.g., `INFO`, `DEBUG`) or the convenience flag `VERBOSE=1`
|
|
130
|
+
before launching `ifcraftcorpus`, `ifcraftcorpus-mcp`, or the Docker image to
|
|
131
|
+
emit detailed logs to stderr. Example:
|
|
132
|
+
|
|
133
|
+
```bash
|
|
134
|
+
LOG_LEVEL=DEBUG ifcraftcorpus-mcp
|
|
135
|
+
|
|
136
|
+
# Docker
|
|
137
|
+
docker run -p 8000:8000 \
|
|
138
|
+
-e LOG_LEVEL=DEBUG \
|
|
139
|
+
ghcr.io/pvliesdonk/if-craft-corpus
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
Logs never touch stdout, so stdio transports remain compatible.
|
|
143
|
+
|
|
127
144
|
## Documentation
|
|
128
145
|
|
|
129
146
|
Full documentation: https://pvliesdonk.github.io/if-craft-corpus
|
|
@@ -71,6 +71,23 @@ results = corpus.search(
|
|
|
71
71
|
| agent-design | 2 | Multi-agent patterns, prompt engineering |
|
|
72
72
|
| game-design | 1 | Mechanics design patterns |
|
|
73
73
|
|
|
74
|
+
## Verbose Logging
|
|
75
|
+
|
|
76
|
+
Set `LOG_LEVEL` (e.g., `INFO`, `DEBUG`) or the convenience flag `VERBOSE=1`
|
|
77
|
+
before launching `ifcraftcorpus`, `ifcraftcorpus-mcp`, or the Docker image to
|
|
78
|
+
emit detailed logs to stderr. Example:
|
|
79
|
+
|
|
80
|
+
```bash
|
|
81
|
+
LOG_LEVEL=DEBUG ifcraftcorpus-mcp
|
|
82
|
+
|
|
83
|
+
# Docker
|
|
84
|
+
docker run -p 8000:8000 \
|
|
85
|
+
-e LOG_LEVEL=DEBUG \
|
|
86
|
+
ghcr.io/pvliesdonk/if-craft-corpus
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
Logs never touch stdout, so stdio transports remain compatible.
|
|
90
|
+
|
|
74
91
|
## Documentation
|
|
75
92
|
|
|
76
93
|
Full documentation: https://pvliesdonk.github.io/if-craft-corpus
|
|
@@ -285,6 +285,70 @@ Small models may interpret as "never validate" or "always validate."
|
|
|
285
285
|
|
|
286
286
|
---
|
|
287
287
|
|
|
288
|
+
## Sampling Parameters
|
|
289
|
+
|
|
290
|
+
Sampling parameters control the randomness and diversity of LLM outputs. The two most important are **temperature** and **top_p**. These can be set per API call, enabling different settings for different phases of a workflow.
|
|
291
|
+
|
|
292
|
+
### Temperature
|
|
293
|
+
|
|
294
|
+
Temperature controls the probability distribution over tokens. Lower values make the model more deterministic; higher values increase randomness and creativity.
|
|
295
|
+
|
|
296
|
+
| Temperature | Effect | Use Cases |
|
|
297
|
+
|-------------|--------|-----------|
|
|
298
|
+
| 0.0–0.2 | Highly deterministic, consistent | Structured output, tool calling, factual responses |
|
|
299
|
+
| 0.3–0.5 | Balanced, slight variation | General conversation, summarization |
|
|
300
|
+
| 0.6–0.8 | More creative, diverse | Brainstorming, draft generation |
|
|
301
|
+
| 0.9–1.0+ | High randomness, exploratory | Creative writing, idea exploration, poetry |
|
|
302
|
+
|
|
303
|
+
**How it works:** Temperature scales the logits (pre-softmax scores) before sampling. At T=0, the model always picks the highest-probability token. At T>1, probability differences flatten, making unlikely tokens more probable.
|
|
304
|
+
|
|
305
|
+
**Caveats:**
|
|
306
|
+
|
|
307
|
+
- Even T=0 isn't fully deterministic—hardware concurrency and floating-point variations can introduce tiny differences
|
|
308
|
+
- High temperature increases hallucination risk
|
|
309
|
+
- Temperature interacts with top_p; tuning both simultaneously requires care
|
|
310
|
+
|
|
311
|
+
### Top_p (Nucleus Sampling)
|
|
312
|
+
|
|
313
|
+
Top_p limits sampling to the smallest set of tokens whose cumulative probability exceeds p. This provides a different control over diversity than temperature.
|
|
314
|
+
|
|
315
|
+
| Top_p | Effect |
|
|
316
|
+
|-------|--------|
|
|
317
|
+
| 0.1–0.3 | Very focused, few token choices |
|
|
318
|
+
| 0.5–0.7 | Moderate diversity |
|
|
319
|
+
| 0.9–1.0 | Wide sampling, more variation |
|
|
320
|
+
|
|
321
|
+
**Temperature vs Top_p:**
|
|
322
|
+
|
|
323
|
+
- Temperature affects *all* token probabilities uniformly
|
|
324
|
+
- Top_p dynamically adjusts the candidate pool based on probability mass
|
|
325
|
+
- For most use cases, adjust one and leave the other at default
|
|
326
|
+
- Common pattern: low temperature (0.0–0.3) with top_p=1.0 for structured tasks
|
|
327
|
+
|
|
328
|
+
### Provider Temperature Ranges
|
|
329
|
+
|
|
330
|
+
| Provider | Range | Default | Notes |
|
|
331
|
+
|----------|-------|---------|-------|
|
|
332
|
+
| OpenAI | 0.0–2.0 | 1.0 | Values >1.0 increase randomness significantly |
|
|
333
|
+
| Anthropic | 0.0–1.0 | 1.0 | Cannot exceed 1.0 |
|
|
334
|
+
| Gemini | 0.0–2.0 | 1.0 | Similar to OpenAI |
|
|
335
|
+
| Ollama | 0.0–1.0+ | 0.7–0.8 | Model-dependent defaults |
|
|
336
|
+
|
|
337
|
+
### Phase-Specific Temperature
|
|
338
|
+
|
|
339
|
+
Since temperature can be set per API call, use different values for different workflow phases:
|
|
340
|
+
|
|
341
|
+
| Phase | Temperature | Rationale |
|
|
342
|
+
|-------|-------------|-----------|
|
|
343
|
+
| Brainstorming/Discuss | 0.7–1.0 | Encourage diverse ideas, exploration |
|
|
344
|
+
| Planning/Freeze | 0.3–0.5 | Balance creativity with coherence |
|
|
345
|
+
| Serialize/Tool calls | 0.0–0.2 | Maximize format compliance |
|
|
346
|
+
| Validation repair | 0.0–0.2 | Deterministic corrections |
|
|
347
|
+
|
|
348
|
+
This is particularly relevant for the **Discuss → Freeze → Serialize** pattern described below—each stage benefits from different temperature settings.
|
|
349
|
+
|
|
350
|
+
---
|
|
351
|
+
|
|
288
352
|
## Structured Output Pipelines
|
|
289
353
|
|
|
290
354
|
Many agent tasks end in a **strict artifact**—JSON/YAML configs, story plans, outlines—rather than free-form prose. Trying to get both *conversation* and *perfectly formatted output* from a single response is brittle, especially for small/local models.
|
|
@@ -297,21 +361,23 @@ A more reliable approach is to separate the flow into stages:
|
|
|
297
361
|
|
|
298
362
|
### Discuss → Freeze → Serialize
|
|
299
363
|
|
|
300
|
-
**Discuss
|
|
364
|
+
**Discuss** (temperature 0.7–1.0): Keep prompts focused on meaning, not field names. Explicitly tell the model *not* to output JSON/YAML during this phase. Higher temperature encourages diverse ideas and creative exploration.
|
|
301
365
|
|
|
302
|
-
**Freeze
|
|
366
|
+
**Freeze** (temperature 0.3–0.5): Compress decisions into a short summary:
|
|
303
367
|
|
|
304
368
|
- 10–30 bullets, one decision per line.
|
|
305
369
|
- No open questions, only resolved choices.
|
|
306
370
|
- Structured enough that a smaller model can follow it reliably.
|
|
371
|
+
- Moderate temperature balances coherence with flexibility.
|
|
307
372
|
|
|
308
|
-
**Serialize
|
|
373
|
+
**Serialize** (temperature 0.0–0.2): In a separate call:
|
|
309
374
|
|
|
310
375
|
- Provide the schema (JSON Schema, typed model, or tool definition).
|
|
311
|
-
- Instruct:
|
|
376
|
+
- Instruct: *"Output only JSON that matches this schema. No prose, no markdown fences."*
|
|
312
377
|
- Use constrained decoding/tool calling where available.
|
|
378
|
+
- Low temperature maximizes format compliance.
|
|
313
379
|
|
|
314
|
-
This separates conversational drift from serialization, which significantly improves reliability for structured outputs like story plans, world-bible slices, or configuration objects.
|
|
380
|
+
This separates conversational drift from serialization, which significantly improves reliability for structured outputs like story plans, world-bible slices, or configuration objects. The temperature gradient—high for exploration, low for precision—matches each phase's purpose.
|
|
315
381
|
|
|
316
382
|
### Tool-Gated Finalization
|
|
317
383
|
|
|
@@ -363,7 +429,108 @@ When a candidate fails validation, the repair prompt should:
|
|
|
363
429
|
|
|
364
430
|
> “Return a corrected JSON object that fixes **only** these errors. Do not change fields that are not mentioned. Output only JSON.”
|
|
365
431
|
|
|
366
|
-
For small models, keep error descriptions compact and concrete rather than abstract (
|
|
432
|
+
For small models, keep error descriptions compact and concrete rather than abstract ("string too long: 345 > max 200").
|
|
433
|
+
|
|
434
|
+
### Structured Validation Feedback
|
|
435
|
+
|
|
436
|
+
Rather than returning free-form error messages, use a structured feedback format that leverages attention patterns (status first, action last) and distinguishes error types clearly.
|
|
437
|
+
|
|
438
|
+
**Result Categories**
|
|
439
|
+
|
|
440
|
+
Use a semantic result enum rather than boolean success/failure:
|
|
441
|
+
|
|
442
|
+
| Result | Meaning | Model Action |
|
|
443
|
+
|--------|---------|--------------|
|
|
444
|
+
| `accepted` | Validation passed, artifact stored | Proceed to next step |
|
|
445
|
+
| `validation_failed` | Content issues the model can fix | Repair and resubmit |
|
|
446
|
+
| `tool_error` | Infrastructure failure | Retry unchanged or escalate |
|
|
447
|
+
|
|
448
|
+
This distinction matters: `validation_failed` tells the model its *content* was wrong (fixable), while `tool_error` indicates the tool itself failed (retry or give up).
|
|
449
|
+
|
|
450
|
+
**Error Categorization**
|
|
451
|
+
|
|
452
|
+
Group validation errors by type to help the model understand what went wrong:
|
|
453
|
+
|
|
454
|
+
```json
|
|
455
|
+
{
|
|
456
|
+
"result": "validation_failed",
|
|
457
|
+
"issues": {
|
|
458
|
+
"invalid": [
|
|
459
|
+
{"field": "estimated_passages", "value": 15, "requirement": "must be 1-10"}
|
|
460
|
+
],
|
|
461
|
+
"missing": ["protagonist_name", "setting"],
|
|
462
|
+
"unknown": ["passages"]
|
|
463
|
+
},
|
|
464
|
+
"issue_count": {"invalid": 1, "missing": 2, "unknown": 1},
|
|
465
|
+
"action": "Fix the 4 issues above and resubmit. Use exact field names from the schema."
|
|
466
|
+
}
|
|
467
|
+
```
|
|
468
|
+
|
|
469
|
+
| Category | Meaning | Common Cause |
|
|
470
|
+
|----------|---------|--------------|
|
|
471
|
+
| `invalid` | Field present but value wrong | Constraint violation, wrong type |
|
|
472
|
+
| `missing` | Required field not provided | Omission, incomplete output |
|
|
473
|
+
| `unknown` | Field not in schema | Typo, hallucinated field name |
|
|
474
|
+
|
|
475
|
+
The `unknown` category is particularly valuable—it catches near-misses like `passages` instead of `estimated_passages` that would otherwise appear as "missing" with no hint about the typo.
|
|
476
|
+
|
|
477
|
+
**Field Ordering (Primacy/Recency)**
|
|
478
|
+
|
|
479
|
+
Structure feedback to exploit the U-shaped attention curve:
|
|
480
|
+
|
|
481
|
+
1. **Result status** (first—immediate orientation)
|
|
482
|
+
2. **Issues by category** (middle—detailed content)
|
|
483
|
+
3. **Issue count** (severity summary)
|
|
484
|
+
4. **Action instructions** (last—what to do next)
|
|
485
|
+
|
|
486
|
+
**What NOT to Include**
|
|
487
|
+
|
|
488
|
+
| Avoid | Why |
|
|
489
|
+
|-------|-----|
|
|
490
|
+
| Full schema | Already in tool definition; wastes tokens in retry loops |
|
|
491
|
+
| Boolean `success` field | Ambiguous; use semantic result categories instead |
|
|
492
|
+
| Generic hints | Replace with actionable, field-specific instructions |
|
|
493
|
+
| Valid fields | Only describe what failed, not what succeeded |
|
|
494
|
+
|
|
495
|
+
**Example: Before and After**
|
|
496
|
+
|
|
497
|
+
Anti-pattern (vague, wastes tokens):
|
|
498
|
+
|
|
499
|
+
```
|
|
500
|
+
Error: Validation failed. Expected fields: type, title, protagonist_name,
|
|
501
|
+
setting, theme, estimated_passages, tone. Please check your submission
|
|
502
|
+
and ensure all required fields are present with valid values.
|
|
503
|
+
```
|
|
504
|
+
|
|
505
|
+
Better (specific, actionable):
|
|
506
|
+
|
|
507
|
+
```json
|
|
508
|
+
{
|
|
509
|
+
"result": "validation_failed",
|
|
510
|
+
"issues": {
|
|
511
|
+
"invalid": [{"field": "type", "value": "story", "requirement": "must be 'dream'"}],
|
|
512
|
+
"missing": ["protagonist_name"],
|
|
513
|
+
"unknown": ["passages"]
|
|
514
|
+
},
|
|
515
|
+
"action": "Fix these 3 issues. Did you mean 'estimated_passages' instead of 'passages'?"
|
|
516
|
+
}
|
|
517
|
+
```
|
|
518
|
+
|
|
519
|
+
The improved version:
|
|
520
|
+
|
|
521
|
+
- Names the exact fields that failed
|
|
522
|
+
- Suggests the likely typo (`passages` → `estimated_passages`)
|
|
523
|
+
- Doesn't repeat schema information already available to the model
|
|
524
|
+
- Ends with a clear action instruction (primacy/recency)
|
|
525
|
+
|
|
526
|
+
### Retry Budget and Token Efficiency
|
|
527
|
+
|
|
528
|
+
Validation loops consume tokens. Design for efficiency:
|
|
529
|
+
|
|
530
|
+
- **Cap retries**: 2-3 attempts is usually sufficient; more indicates a prompt or schema problem
|
|
531
|
+
- **Escalate gracefully**: After retry budget exhausted, surface a clear failure rather than looping
|
|
532
|
+
- **Track retry rates**: High retry rates signal opportunities for prompt improvement or schema simplification
|
|
533
|
+
- **Consider model capability**: Less capable models may need higher retry budgets but with simpler feedback
|
|
367
534
|
|
|
368
535
|
### Best Practices
|
|
369
536
|
|
|
@@ -528,9 +695,12 @@ Before deploying:
|
|
|
528
695
|
|
|
529
696
|
## Provider-Specific Optimizations
|
|
530
697
|
|
|
531
|
-
- **Anthropic**: Use `token-efficient-tools` beta header for up to 70% output token reduction
|
|
532
|
-
- **OpenAI**: Consider fine-tuning for frequently-used patterns
|
|
533
|
-
- **
|
|
698
|
+
- **Anthropic**: Use `token-efficient-tools` beta header for up to 70% output token reduction; temperature capped at 1.0
|
|
699
|
+
- **OpenAI**: Consider fine-tuning for frequently-used patterns; temperature range 0.0–2.0
|
|
700
|
+
- **Gemini**: Temperature range 0.0–2.0, similar behavior to OpenAI
|
|
701
|
+
- **Ollama/Local**: Tool retrieval essential—small models struggle with 10+ tools; default temperature varies by model (typically 0.7–0.8)
|
|
702
|
+
|
|
703
|
+
See [Sampling Parameters](#sampling-parameters) for detailed temperature guidance by use case.
|
|
534
704
|
|
|
535
705
|
---
|
|
536
706
|
|
|
@@ -549,6 +719,8 @@ Before deploying:
|
|
|
549
719
|
| Dynamic few-shot | Static example bloat | Retrieve relevant examples |
|
|
550
720
|
| Reflection | Quality failures | Draft → critique → refine |
|
|
551
721
|
| Context pruning | Context rot | Summarize and remove stale turns |
|
|
722
|
+
| Structured feedback | Vague validation errors | Categorize issues (invalid/missing/unknown) |
|
|
723
|
+
| Phase-specific temperature | Format errors in structured output | High temp for discuss, low for serialize |
|
|
552
724
|
|
|
553
725
|
| Model Class | Max Prompt | Max Tools | Strategy |
|
|
554
726
|
|-------------|------------|-----------|----------|
|
|
@@ -567,6 +739,8 @@ Before deploying:
|
|
|
567
739
|
| RAG-MCP (2025) | Two-stage selection reduces tokens 50%+, improves accuracy 3x |
|
|
568
740
|
| Anthropic Token-Efficient Tools | Schema optimization reduces output tokens 70% |
|
|
569
741
|
| Reflexion research | Self-correction improves quality on complex tasks |
|
|
742
|
+
| STROT Framework (2025) | Structured feedback loops achieve 95% first-attempt success |
|
|
743
|
+
| AWS Evaluator-Optimizer | Semantic reflection enables self-improving validation |
|
|
570
744
|
|
|
571
745
|
---
|
|
572
746
|
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
[project]
|
|
2
2
|
name = "ifcraftcorpus"
|
|
3
|
-
version = "1.1
|
|
3
|
+
version = "1.2.1"
|
|
4
4
|
description = "Interactive fiction craft corpus with search library and MCP server"
|
|
5
5
|
readme = "README.md"
|
|
6
6
|
license = {text = "MIT"}
|
|
@@ -80,6 +80,7 @@ build-backend = "hatchling.build"
|
|
|
80
80
|
include = [
|
|
81
81
|
"/src",
|
|
82
82
|
"/corpus",
|
|
83
|
+
"/subagents",
|
|
83
84
|
]
|
|
84
85
|
|
|
85
86
|
[tool.hatch.build.targets.wheel]
|
|
@@ -87,6 +88,7 @@ packages = ["src/ifcraftcorpus"]
|
|
|
87
88
|
|
|
88
89
|
[tool.hatch.build.targets.wheel.shared-data]
|
|
89
90
|
"corpus" = "share/ifcraftcorpus/corpus"
|
|
91
|
+
"subagents" = "share/ifcraftcorpus/subagents"
|
|
90
92
|
|
|
91
93
|
[tool.ruff]
|
|
92
94
|
line-length = 100
|
|
@@ -101,6 +103,17 @@ strict = true
|
|
|
101
103
|
warn_return_any = true
|
|
102
104
|
warn_unused_ignores = true
|
|
103
105
|
|
|
106
|
+
[[tool.mypy.overrides]]
|
|
107
|
+
module = [
|
|
108
|
+
"fastmcp",
|
|
109
|
+
"fastmcp.prompts",
|
|
110
|
+
"mcp.*",
|
|
111
|
+
"sentence_transformers",
|
|
112
|
+
"numpy",
|
|
113
|
+
"httpx",
|
|
114
|
+
]
|
|
115
|
+
ignore_missing_imports = true
|
|
116
|
+
|
|
104
117
|
[tool.pytest.ini_options]
|
|
105
118
|
testpaths = ["tests"]
|
|
106
119
|
addopts = "-v --tb=short"
|
|
@@ -17,8 +17,26 @@ from __future__ import annotations
|
|
|
17
17
|
|
|
18
18
|
import argparse
|
|
19
19
|
import json
|
|
20
|
+
import logging
|
|
20
21
|
import sys
|
|
21
22
|
from pathlib import Path
|
|
23
|
+
from typing import TYPE_CHECKING
|
|
24
|
+
|
|
25
|
+
if TYPE_CHECKING:
|
|
26
|
+
from ifcraftcorpus.providers import EmbeddingProvider
|
|
27
|
+
|
|
28
|
+
from ifcraftcorpus.logging_utils import configure_logging
|
|
29
|
+
|
|
30
|
+
configure_logging()
|
|
31
|
+
logger = logging.getLogger(__name__)
|
|
32
|
+
|
|
33
|
+
|
|
34
|
+
def _truncate(value: str, limit: int = 120) -> str:
|
|
35
|
+
"""Shorten long log values to keep CLI logs readable."""
|
|
36
|
+
|
|
37
|
+
if len(value) <= limit:
|
|
38
|
+
return value
|
|
39
|
+
return f"{value[:limit]}..."
|
|
22
40
|
|
|
23
41
|
|
|
24
42
|
def cmd_info(args: argparse.Namespace) -> int:
|
|
@@ -26,12 +44,19 @@ def cmd_info(args: argparse.Namespace) -> int:
|
|
|
26
44
|
from ifcraftcorpus import Corpus, __version__
|
|
27
45
|
|
|
28
46
|
corpus = Corpus()
|
|
47
|
+
clusters = corpus.list_clusters()
|
|
48
|
+
logger.info(
|
|
49
|
+
"CLI info command: version=%s docs=%s clusters=%s",
|
|
50
|
+
__version__,
|
|
51
|
+
corpus.document_count(),
|
|
52
|
+
len(clusters),
|
|
53
|
+
)
|
|
29
54
|
|
|
30
55
|
print(f"\nIF Craft Corpus v{__version__}")
|
|
31
56
|
print(f"Documents: {corpus.document_count()}")
|
|
32
|
-
print(f"Clusters: {len(
|
|
57
|
+
print(f"Clusters: {len(clusters)}")
|
|
33
58
|
print("\nClusters:")
|
|
34
|
-
for cluster in
|
|
59
|
+
for cluster in clusters:
|
|
35
60
|
docs = [d for d in corpus.list_documents() if d["cluster"] == cluster]
|
|
36
61
|
print(f" {cluster}: {len(docs)} file(s)")
|
|
37
62
|
|
|
@@ -43,6 +68,12 @@ def cmd_search(args: argparse.Namespace) -> int:
|
|
|
43
68
|
from ifcraftcorpus import Corpus
|
|
44
69
|
|
|
45
70
|
corpus = Corpus()
|
|
71
|
+
logger.info(
|
|
72
|
+
"CLI search query=%r cluster=%s limit=%s",
|
|
73
|
+
_truncate(args.query),
|
|
74
|
+
args.cluster,
|
|
75
|
+
args.limit,
|
|
76
|
+
)
|
|
46
77
|
results = corpus.search(
|
|
47
78
|
args.query,
|
|
48
79
|
limit=args.limit,
|
|
@@ -51,6 +82,7 @@ def cmd_search(args: argparse.Namespace) -> int:
|
|
|
51
82
|
)
|
|
52
83
|
|
|
53
84
|
if not results:
|
|
85
|
+
logger.info("CLI search returned no matches")
|
|
54
86
|
print("No results found.")
|
|
55
87
|
return 0
|
|
56
88
|
|
|
@@ -65,6 +97,7 @@ def cmd_search(args: argparse.Namespace) -> int:
|
|
|
65
97
|
content += "..."
|
|
66
98
|
print(f" {content}")
|
|
67
99
|
|
|
100
|
+
logger.info("CLI search returned %s results", len(results))
|
|
68
101
|
return 0
|
|
69
102
|
|
|
70
103
|
|
|
@@ -77,6 +110,7 @@ def cmd_embeddings_status(args: argparse.Namespace) -> int:
|
|
|
77
110
|
get_embedding_provider,
|
|
78
111
|
)
|
|
79
112
|
|
|
113
|
+
logger.debug("CLI embeddings status requested")
|
|
80
114
|
print("\n=== Embedding Providers ===\n")
|
|
81
115
|
|
|
82
116
|
# Check each provider
|
|
@@ -129,7 +163,7 @@ def cmd_embeddings_build(args: argparse.Namespace) -> int:
|
|
|
129
163
|
)
|
|
130
164
|
|
|
131
165
|
# Get provider
|
|
132
|
-
provider = None
|
|
166
|
+
provider: EmbeddingProvider | None = None
|
|
133
167
|
if args.provider:
|
|
134
168
|
if args.provider == "ollama":
|
|
135
169
|
provider = OllamaEmbeddings(model=args.model, host=args.ollama_host)
|
|
@@ -152,12 +186,19 @@ def cmd_embeddings_build(args: argparse.Namespace) -> int:
|
|
|
152
186
|
print(f"Provider {provider.provider_name} is not available.", file=sys.stderr)
|
|
153
187
|
return 1
|
|
154
188
|
|
|
189
|
+
logger.info(
|
|
190
|
+
"CLI embeddings build provider=%s model=%s output=%s",
|
|
191
|
+
provider.provider_name,
|
|
192
|
+
provider.model,
|
|
193
|
+
args.output,
|
|
194
|
+
)
|
|
155
195
|
print(f"Using provider: {provider.provider_name}")
|
|
156
196
|
print(f"Model: {provider.model} ({provider.dimension}d)")
|
|
157
197
|
|
|
158
198
|
# Build embeddings
|
|
159
199
|
corpus = Corpus()
|
|
160
|
-
|
|
200
|
+
doc_total = corpus.document_count()
|
|
201
|
+
print(f"\nBuilding embeddings for {doc_total} documents...")
|
|
161
202
|
|
|
162
203
|
# Use the corpus's internal index
|
|
163
204
|
embedding_index = EmbeddingIndex(provider=provider)
|
|
@@ -214,6 +255,12 @@ def cmd_embeddings_build(args: argparse.Namespace) -> int:
|
|
|
214
255
|
output_path = Path(args.output)
|
|
215
256
|
embedding_index.save(output_path)
|
|
216
257
|
|
|
258
|
+
logger.info(
|
|
259
|
+
"CLI embeddings build completed docs=%s sections=%s output=%s",
|
|
260
|
+
doc_count,
|
|
261
|
+
section_count,
|
|
262
|
+
output_path,
|
|
263
|
+
)
|
|
217
264
|
print(f"\nDone! Embedded {section_count} sections from {doc_count} documents.")
|
|
218
265
|
print(f"Saved to: {output_path}")
|
|
219
266
|
|
|
@@ -273,7 +320,9 @@ def main() -> int:
|
|
|
273
320
|
emb_parser.print_help()
|
|
274
321
|
return 0
|
|
275
322
|
|
|
276
|
-
|
|
323
|
+
logger.debug("CLI command executed: %s", args.command)
|
|
324
|
+
result: int = args.func(args)
|
|
325
|
+
return result
|
|
277
326
|
|
|
278
327
|
|
|
279
328
|
if __name__ == "__main__":
|
|
@@ -44,10 +44,13 @@ from __future__ import annotations
|
|
|
44
44
|
import json
|
|
45
45
|
import logging
|
|
46
46
|
from pathlib import Path
|
|
47
|
-
from typing import TYPE_CHECKING
|
|
47
|
+
from typing import TYPE_CHECKING, Any
|
|
48
48
|
|
|
49
49
|
import numpy as np
|
|
50
50
|
|
|
51
|
+
if TYPE_CHECKING:
|
|
52
|
+
from sentence_transformers import SentenceTransformer
|
|
53
|
+
|
|
51
54
|
if TYPE_CHECKING:
|
|
52
55
|
from ifcraftcorpus.index import CorpusIndex
|
|
53
56
|
from ifcraftcorpus.providers import EmbeddingProvider
|
|
@@ -107,7 +110,8 @@ class EmbeddingIndex:
|
|
|
107
110
|
"""
|
|
108
111
|
self._provider = provider
|
|
109
112
|
self._embeddings: np.ndarray | None = None
|
|
110
|
-
self._metadata: list[dict] = []
|
|
113
|
+
self._metadata: list[dict[str, Any]] = []
|
|
114
|
+
self._st_model: SentenceTransformer | None = None
|
|
111
115
|
|
|
112
116
|
# For backward compatibility / persistence
|
|
113
117
|
if provider:
|
|
@@ -117,7 +121,6 @@ class EmbeddingIndex:
|
|
|
117
121
|
self.model_name = model_name
|
|
118
122
|
self._provider_name = "sentence-transformers"
|
|
119
123
|
# Lazy-load sentence-transformers model
|
|
120
|
-
self._st_model = None
|
|
121
124
|
if not lazy_load:
|
|
122
125
|
self._load_st_model()
|
|
123
126
|
|
|
@@ -126,7 +129,7 @@ class EmbeddingIndex:
|
|
|
126
129
|
"""Get the provider name."""
|
|
127
130
|
return self._provider_name
|
|
128
131
|
|
|
129
|
-
def _load_st_model(self):
|
|
132
|
+
def _load_st_model(self) -> SentenceTransformer:
|
|
130
133
|
"""Load sentence-transformers model (fallback)."""
|
|
131
134
|
if self._st_model is None:
|
|
132
135
|
try:
|
|
@@ -148,12 +151,13 @@ class EmbeddingIndex:
|
|
|
148
151
|
else:
|
|
149
152
|
# Fallback to sentence-transformers
|
|
150
153
|
model = self._load_st_model()
|
|
151
|
-
|
|
154
|
+
embeddings = model.encode(texts, show_progress_bar=False, convert_to_numpy=True)
|
|
155
|
+
return np.asarray(embeddings)
|
|
152
156
|
|
|
153
157
|
def add_texts(
|
|
154
158
|
self,
|
|
155
159
|
texts: list[str],
|
|
156
|
-
metadata: list[dict],
|
|
160
|
+
metadata: list[dict[str, Any]],
|
|
157
161
|
) -> None:
|
|
158
162
|
"""Add texts with metadata to the index.
|
|
159
163
|
|
|
@@ -185,7 +189,7 @@ class EmbeddingIndex:
|
|
|
185
189
|
*,
|
|
186
190
|
top_k: int = 10,
|
|
187
191
|
cluster: str | None = None,
|
|
188
|
-
) -> list[tuple[dict, float]]:
|
|
192
|
+
) -> list[tuple[dict[str, Any], float]]:
|
|
189
193
|
"""Search for semantically similar texts.
|
|
190
194
|
|
|
191
195
|
Args:
|
|
@@ -48,10 +48,31 @@ from __future__ import annotations
|
|
|
48
48
|
import sqlite3
|
|
49
49
|
from dataclasses import dataclass
|
|
50
50
|
from pathlib import Path
|
|
51
|
+
from typing import Any
|
|
51
52
|
|
|
52
53
|
from ifcraftcorpus.parser import Document, parse_directory
|
|
53
54
|
|
|
54
55
|
|
|
56
|
+
def _sanitize_fts_query(query: str) -> str:
|
|
57
|
+
"""Sanitize a query string for the FTS5 MATCH clause.
|
|
58
|
+
|
|
59
|
+
This function replaces hyphens with spaces to prevent FTS5 from
|
|
60
|
+
interpreting them as the `NOT` operator. This is intended to correctly
|
|
61
|
+
handle natural language queries with hyphenated words, for example
|
|
62
|
+
transforming "haunted-house" into a search for "haunted house".
|
|
63
|
+
|
|
64
|
+
It also collapses any resulting multiple spaces into a single space.
|
|
65
|
+
|
|
66
|
+
Args:
|
|
67
|
+
query: Raw query string from user input.
|
|
68
|
+
|
|
69
|
+
Returns:
|
|
70
|
+
Sanitized query safe for FTS5 MATCH.
|
|
71
|
+
"""
|
|
72
|
+
# Replace hyphens and collapse whitespace in one go.
|
|
73
|
+
return " ".join(query.replace("-", " ").split())
|
|
74
|
+
|
|
75
|
+
|
|
55
76
|
@dataclass
|
|
56
77
|
class SearchResult:
|
|
57
78
|
"""A search result from the corpus FTS5 index.
|
|
@@ -379,8 +400,8 @@ class CorpusIndex:
|
|
|
379
400
|
... cluster="emotional-design",
|
|
380
401
|
... limit=5)
|
|
381
402
|
"""
|
|
382
|
-
# Build FTS5 query
|
|
383
|
-
fts_query = query
|
|
403
|
+
# Build FTS5 query - sanitize to handle special characters
|
|
404
|
+
fts_query = _sanitize_fts_query(query)
|
|
384
405
|
|
|
385
406
|
# Add cluster filter if specified
|
|
386
407
|
where_clause = ""
|
|
@@ -462,7 +483,7 @@ class CorpusIndex:
|
|
|
462
483
|
cursor = self.conn.execute("SELECT DISTINCT cluster FROM documents ORDER BY cluster")
|
|
463
484
|
return [row["cluster"] for row in cursor]
|
|
464
485
|
|
|
465
|
-
def get_document(self, name: str) -> dict | None:
|
|
486
|
+
def get_document(self, name: str) -> dict[str, Any] | None:
|
|
466
487
|
"""Get a document by name with all its sections.
|
|
467
488
|
|
|
468
489
|
Retrieves complete document data including metadata and all
|
|
@@ -535,7 +556,8 @@ class CorpusIndex:
|
|
|
535
556
|
Count of documents in the index.
|
|
536
557
|
"""
|
|
537
558
|
cursor = self.conn.execute("SELECT COUNT(*) FROM documents")
|
|
538
|
-
|
|
559
|
+
result = cursor.fetchone()
|
|
560
|
+
return int(result[0]) if result else 0
|
|
539
561
|
|
|
540
562
|
|
|
541
563
|
def build_index(corpus_dir: Path, output_path: Path) -> CorpusIndex:
|
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
"""Shared logging helpers for the IF Craft Corpus codebase."""
|
|
2
|
+
|
|
3
|
+
from __future__ import annotations
|
|
4
|
+
|
|
5
|
+
import logging
|
|
6
|
+
import os
|
|
7
|
+
import sys
|
|
8
|
+
from typing import Final
|
|
9
|
+
|
|
10
|
+
LOG_LEVEL_ENV: Final[str] = "LOG_LEVEL"
|
|
11
|
+
VERBOSE_ENV: Final[str] = "VERBOSE"
|
|
12
|
+
|
|
13
|
+
__all__ = ["configure_logging", "LOG_LEVEL_ENV", "VERBOSE_ENV"]
|
|
14
|
+
|
|
15
|
+
_TRUTHY_VALUES: Final[set[str]] = {"1", "true", "yes", "on"}
|
|
16
|
+
_configured: bool = False
|
|
17
|
+
_CHATTY_LOGGERS: Final[tuple[str, ...]] = (
|
|
18
|
+
"httpx",
|
|
19
|
+
"fakeredis",
|
|
20
|
+
"docket",
|
|
21
|
+
)
|
|
22
|
+
|
|
23
|
+
|
|
24
|
+
def _is_truthy(value: str | None) -> bool:
    """Return True if the string resembles a truthy flag.

    ``None`` and anything outside the recognized truthy spellings
    (case-insensitive, surrounding whitespace ignored) are falsy.
    """

    return value is not None and value.strip().lower() in _TRUTHY_VALUES
|
|
30
|
+
|
|
31
|
+
|
|
32
|
+
def _resolve_level(value: str | None) -> int | None:
|
|
33
|
+
"""Convert a logging level string (name or integer) to ``int``."""
|
|
34
|
+
|
|
35
|
+
if not value:
|
|
36
|
+
return None
|
|
37
|
+
candidate = value.strip()
|
|
38
|
+
if not candidate:
|
|
39
|
+
return None
|
|
40
|
+
if candidate.isdigit():
|
|
41
|
+
return int(candidate)
|
|
42
|
+
name = candidate.upper()
|
|
43
|
+
return getattr(logging, name, None)
|
|
44
|
+
|
|
45
|
+
|
|
46
|
+
def configure_logging(
    *,
    env_level: str = LOG_LEVEL_ENV,
    env_verbose: str = VERBOSE_ENV,
    fmt: str = "%(asctime)s [%(levelname)s] %(name)s: %(message)s",
) -> int | None:
    """Configure root logging when LOG_LEVEL/VERBOSE are set.

    Reads two environment variables: *env_level* (a level name or
    number, resolved via ``_resolve_level``) and *env_verbose* (a
    truthy flag per ``_is_truthy``). When neither is set, logging is
    left untouched. An unrecognized level value is reported on stderr
    and treated as INFO; a bare VERBOSE flag with no level means DEBUG.

    Args:
        env_level: Name of the environment variable holding the level.
        env_verbose: Name of the environment variable holding the
            verbose flag.
        fmt: Format string passed to ``logging.basicConfig``.

    Returns:
        The configured level when logging is enabled, ``None`` otherwise.
    """

    global _configured

    raw_level = os.getenv(env_level)
    level = _resolve_level(raw_level)
    verbose_flag = os.getenv(env_verbose)

    # The variable was set but didn't resolve to a level: warn on stderr
    # (logging itself isn't configured yet) and fall back to INFO.
    if raw_level and level is None:
        print(
            f"ifcraftcorpus: unknown log level '{raw_level}', defaulting to INFO",
            file=sys.stderr,
        )
        level = logging.INFO

    # Neither env var requests logging: leave the root logger alone.
    if level is None and not _is_truthy(verbose_flag):
        return None

    # VERBOSE was set without an explicit level: maximum verbosity.
    if level is None:
        level = logging.DEBUG

    root = logging.getLogger()
    # Call basicConfig only on the first configuration by this module;
    # afterwards (or when another library already attached handlers AND
    # we configured before) just adjust the root level in place.
    if not (root.handlers and _configured):
        logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
        _configured = True
    root.setLevel(level)

    # Keep known noisy third-party loggers at WARNING or above even in
    # debug mode, so application logs aren't drowned out.
    for name in _CHATTY_LOGGERS:
        logging.getLogger(name).setLevel(max(logging.WARNING, level))
    return level
|