ifcraftcorpus 1.1.0__py3-none-any.whl → 1.2.1__py3-none-any.whl

Files changed (65)
  1. ifcraftcorpus/cli.py +54 -5
  2. ifcraftcorpus/embeddings.py +11 -7
  3. ifcraftcorpus/index.py +26 -4
  4. ifcraftcorpus/logging_utils.py +84 -0
  5. ifcraftcorpus/mcp_server.py +418 -22
  6. ifcraftcorpus/providers.py +4 -4
  7. ifcraftcorpus/search.py +60 -12
  8. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/agent-design/agent_prompt_engineering.md +183 -9
  9. ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/README.md +198 -0
  10. ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/if_genre_consultant.md +257 -0
  11. ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/if_platform_advisor.md +306 -0
  12. ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/if_prose_writer.md +187 -0
  13. ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/if_quality_reviewer.md +245 -0
  14. ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/if_story_architect.md +162 -0
  15. ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/if_world_curator.md +280 -0
  16. {ifcraftcorpus-1.1.0.dist-info → ifcraftcorpus-1.2.1.dist-info}/METADATA +18 -1
  17. ifcraftcorpus-1.2.1.dist-info/RECORD +67 -0
  18. ifcraftcorpus-1.1.0.dist-info/RECORD +0 -59
  19. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/agent-design/multi_agent_patterns.md +0 -0
  20. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/audience-and-access/accessibility_guidelines.md +0 -0
  21. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/audience-and-access/audience_targeting.md +0 -0
  22. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/audience-and-access/localization_considerations.md +0 -0
  23. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/audio_visual_integration.md +0 -0
  24. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/collaborative_if_writing.md +0 -0
  25. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/creative_workflow_pipeline.md +0 -0
  26. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/diegetic_design.md +0 -0
  27. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/idea_capture_and_hooks.md +0 -0
  28. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/if_platform_tools.md +0 -0
  29. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/player_analytics_metrics.md +0 -0
  30. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/quality_standards_if.md +0 -0
  31. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/research_and_verification.md +0 -0
  32. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/craft-foundations/testing_interactive_fiction.md +0 -0
  33. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/emotional-design/conflict_patterns.md +0 -0
  34. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/emotional-design/emotional_beats.md +0 -0
  35. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/game-design/mechanics_design_patterns.md +0 -0
  36. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/genre-conventions/children_and_ya_conventions.md +0 -0
  37. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/genre-conventions/fantasy_conventions.md +0 -0
  38. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/genre-conventions/historical_fiction.md +0 -0
  39. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/genre-conventions/horror_conventions.md +0 -0
  40. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/genre-conventions/mystery_conventions.md +0 -0
  41. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/genre-conventions/sci_fi_conventions.md +0 -0
  42. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/branching_narrative_construction.md +0 -0
  43. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/branching_narrative_craft.md +0 -0
  44. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/endings_patterns.md +0 -0
  45. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/episodic_serialized_if.md +0 -0
  46. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/nonlinear_structure.md +0 -0
  47. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/pacing_and_tension.md +0 -0
  48. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/romance_and_relationships.md +0 -0
  49. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/scene_structure_and_beats.md +0 -0
  50. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/narrative-structure/scene_transitions.md +0 -0
  51. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/prose-and-language/character_voice.md +0 -0
  52. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/prose-and-language/dialogue_craft.md +0 -0
  53. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/prose-and-language/exposition_techniques.md +0 -0
  54. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/prose-and-language/narrative_point_of_view.md +0 -0
  55. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/prose-and-language/prose_patterns.md +0 -0
  56. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/prose-and-language/subtext_and_implication.md +0 -0
  57. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/prose-and-language/voice_register_consistency.md +0 -0
  58. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/scope-and-planning/scope_and_length.md +0 -0
  59. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/world-and-setting/canon_management.md +0 -0
  60. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/world-and-setting/setting_as_character.md +0 -0
  61. {ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/world-and-setting/worldbuilding_patterns.md +0 -0
  62. {ifcraftcorpus-1.1.0.dist-info → ifcraftcorpus-1.2.1.dist-info}/WHEEL +0 -0
  63. {ifcraftcorpus-1.1.0.dist-info → ifcraftcorpus-1.2.1.dist-info}/entry_points.txt +0 -0
  64. {ifcraftcorpus-1.1.0.dist-info → ifcraftcorpus-1.2.1.dist-info}/licenses/LICENSE +0 -0
  65. {ifcraftcorpus-1.1.0.dist-info → ifcraftcorpus-1.2.1.dist-info}/licenses/LICENSE-CONTENT +0 -0
ifcraftcorpus/search.py CHANGED
@@ -40,13 +40,26 @@ Classes:
 
  from __future__ import annotations
 
+ import logging
  from dataclasses import dataclass
  from pathlib import Path
- from typing import TYPE_CHECKING, Literal
+ from typing import TYPE_CHECKING, Any, Literal
 
  from ifcraftcorpus.index import CorpusIndex
 
+ logger = logging.getLogger(__name__)
+
+
+ def _truncate(value: str, limit: int = 120) -> str:
+     """Trim long query strings for readable logging."""
+
+     if len(value) <= limit:
+         return value
+     return f"{value[:limit]}..."
+
+
  if TYPE_CHECKING:
+     from ifcraftcorpus.embeddings import EmbeddingIndex
      from ifcraftcorpus.providers import EmbeddingProvider
 
 
@@ -190,7 +203,15 @@ class Corpus:
          self._use_bundled = use_bundled
 
          self._fts_index: CorpusIndex | None = None
-         self._embedding_index = None  # Lazy loaded
+         self._embedding_index: EmbeddingIndex | None = None  # Lazy loaded
+
+         logger.debug(
+             "Corpus init corpus_dir=%s index_path=%s embeddings_path=%s use_bundled=%s",
+             corpus_dir,
+             index_path,
+             embeddings_path,
+             use_bundled,
+         )
 
      def _get_corpus_dir(self) -> Path:
          """Get the corpus directory path.
@@ -202,6 +223,7 @@ class Corpus:
              ValueError: If no corpus directory can be found.
          """
          if self._corpus_dir:
+             logger.debug("Using provided corpus directory: %s", self._corpus_dir)
              return self._corpus_dir
 
          if self._use_bundled:
@@ -214,15 +236,17 @@ class Corpus:
                  # Check for installed shared data (pip install)
                  bundled = Path(sys.prefix) / "share" / "ifcraftcorpus" / "corpus"
                  if bundled.exists():
+                     logger.debug("Using bundled corpus directory: %s", bundled)
                      return bundled
 
                  # Check relative to package (development mode / editable install)
                  pkg_dir = Path(ifcraftcorpus.__file__).parent
                  dev_corpus = pkg_dir.parent.parent / "corpus"
                  if dev_corpus.exists():
+                     logger.debug("Using development corpus directory: %s", dev_corpus)
                      return dev_corpus
              except Exception:
-                 pass
+                 logger.debug("Failed to auto-detect bundled corpus directory", exc_info=True)
 
          raise ValueError(
              "No corpus directory found. Provide corpus_dir or install package with bundled corpus."
@@ -239,15 +263,17 @@ class Corpus:
          """
          if self._fts_index is None:
              if self._index_path and self._index_path.exists():
+                 logger.debug("Loading corpus index from %s", self._index_path)
                  self._fts_index = CorpusIndex(self._index_path)
              else:
                  # Build in-memory index
-                 self._fts_index = CorpusIndex()
                  corpus_dir = self._get_corpus_dir()
+                 logger.debug("Building in-memory corpus index from %s", corpus_dir)
+                 self._fts_index = CorpusIndex()
                  self._fts_index.build_from_directory(corpus_dir)
          return self._fts_index
 
-     def _get_embedding_index(self):
+     def _get_embedding_index(self) -> EmbeddingIndex | None:
          """Get the embedding index for semantic search.
 
          Lazily loads the embedding index if embeddings_path was provided.
@@ -258,6 +284,7 @@ class Corpus:
              EmbeddingIndex instance or None if unavailable.
          """
          if self._embedding_index is None and self._embeddings_path:
+             logger.debug("Attempting to load embeddings from %s", self._embeddings_path)
              try:
                  from ifcraftcorpus.embeddings import EmbeddingIndex
 
@@ -269,7 +296,9 @@ class Corpus:
                      self._embeddings_path, provider=self._embedding_provider
                  )
              except ImportError:
-                 pass  # embeddings not available
+                 logger.debug("Embedding support not installed", exc_info=True)
+         elif self._embedding_index is None and not self._embeddings_path:
+             logger.debug("No embeddings path configured; semantic search disabled")
          return self._embedding_index
 
      def build_embeddings(self, *, force: bool = False) -> int:
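The `except ImportError` branch in this hunk replaces a silent `pass` with a debug log that keeps the traceback. A minimal standalone sketch of that pattern follows; `load_optional`, the logger name, and the missing package name are hypothetical and do not come from the package.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

logger = logging.getLogger("demo.optional")  # hypothetical logger name


def load_optional(module_name: str) -> Optional[ModuleType]:
    """Import a module if it is installed; otherwise log at debug level and return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # exc_info=True attaches the traceback to the record instead of discarding it.
        logger.debug("Optional module %s not installed", module_name, exc_info=True)
        return None


json_module = load_optional("json")  # stdlib module, always importable
missing = load_optional("definitely_not_installed_pkg_xyz")  # hypothetical missing package
```

The caller still receives `None` for a missing dependency, as before, but the reason is now recoverable by enabling debug logging.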
@@ -309,12 +338,14 @@ class Corpus:
              and self._embeddings_path.exists()
              and (self._embeddings_path / "metadata.json").exists()
          ):
+             logger.info("Embeddings already exist at %s; skipping rebuild", self._embeddings_path)
              return 0
 
          from ifcraftcorpus.embeddings import EmbeddingIndex
 
          embedding_index = EmbeddingIndex(provider=self._embedding_provider)
 
+         logger.info("Building embeddings into %s", self._embeddings_path)
          count = 0
          for doc_info in self.list_documents():
              doc = self.get_document(doc_info["name"])
@@ -359,6 +390,7 @@ class Corpus:
              embedding_index.save(self._embeddings_path)
              self._embedding_index = embedding_index
 
+         logger.info("Saved embeddings (%s items) to %s", count, self._embeddings_path)
          return count
 
      def search(
@@ -407,6 +439,14 @@ class Corpus:
              >>> # Semantic search (if embeddings available)
              >>> results = corpus.search("scary atmosphere", mode="semantic")
          """
+         logger.debug(
+             "Corpus.search query=%r cluster=%s limit=%s mode=%s",
+             _truncate(query),
+             cluster,
+             limit,
+             mode,
+         )
+
          results: list[CorpusResult] = []
 
          if mode in ("keyword", "hybrid"):
@@ -445,18 +485,26 @@ class Corpus:
 
          # Deduplicate and sort by score
          if mode == "hybrid":
-             seen = set()
-             unique_results = []
-             for r in sorted(results, key=lambda x: x.score, reverse=True):
-                 key = (r.document_name, r.section_heading)
+             seen: set[tuple[str, str | None]] = set()
+             unique_results: list[CorpusResult] = []
+             sorted_results: list[CorpusResult] = sorted(
+                 results, key=lambda x: x.score, reverse=True
+             )
+             for result in sorted_results:
+                 key = (result.document_name, result.section_heading)
                  if key not in seen:
                      seen.add(key)
-                     unique_results.append(result)
+                     unique_results.append(result)
              results = unique_results[:limit]
 
+         logger.debug(
+             "Corpus.search returning %s results (mode=%s)",
+             len(results),
+             mode,
+         )
          return results
 
-     def get_document(self, name: str) -> dict | None:
+     def get_document(self, name: str) -> dict[str, Any] | None:
          """Get a document by name with all its sections.
 
          Retrieves complete document data including metadata and all
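The hybrid-mode loop in the hunk above keeps the best-scoring hit for each (document, section) pair. A self-contained sketch of the same logic is shown here; `Hit` is a hypothetical stand-in for `CorpusResult` with only the three fields the loop reads, and `dedupe_by_section` is an illustrative name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    """Hypothetical stand-in for CorpusResult with only the fields the loop uses."""

    document_name: str
    section_heading: Optional[str]
    score: float


def dedupe_by_section(results: list[Hit], limit: int) -> list[Hit]:
    """Keep the highest-scoring hit per (document, section) pair, best first."""
    seen: set[tuple[str, Optional[str]]] = set()
    unique: list[Hit] = []
    for hit in sorted(results, key=lambda h: h.score, reverse=True):
        key = (hit.document_name, hit.section_heading)
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique[:limit]


hits = [
    Hit("dialogue_craft", "Subtext", 0.4),  # e.g. a keyword match
    Hit("dialogue_craft", "Subtext", 0.9),  # e.g. a semantic match for the same section
    Hit("pacing_and_tension", None, 0.7),
]
top = dedupe_by_section(hits, limit=5)
```

Sorting before deduplicating is what guarantees that the surviving entry for each key is the higher-scoring one.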
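{ifcraftcorpus-1.1.0.data → ifcraftcorpus-1.2.1.data}/data/share/ifcraftcorpus/corpus/agent-design/agent_prompt_engineering.md CHANGED
@@ -285,6 +285,70 @@ Small models may interpret as "never validate" or "always validate."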
 
  ---
 
+ ## Sampling Parameters
+
+ Sampling parameters control the randomness and diversity of LLM outputs. The two most important are **temperature** and **top_p**. These can be set per API call, enabling different settings for different phases of a workflow.
+
+ ### Temperature
+
+ Temperature controls the probability distribution over tokens. Lower values make the model more deterministic; higher values increase randomness and creativity.
+
+ | Temperature | Effect | Use Cases |
+ |-------------|--------|-----------|
+ | 0.0–0.2 | Highly deterministic, consistent | Structured output, tool calling, factual responses |
+ | 0.3–0.5 | Balanced, slight variation | General conversation, summarization |
+ | 0.6–0.8 | More creative, diverse | Brainstorming, draft generation |
+ | 0.9–1.0+ | High randomness, exploratory | Creative writing, idea exploration, poetry |
+
+ **How it works:** Temperature scales the logits (pre-softmax scores) before sampling. At T=0, the model always picks the highest-probability token. At T>1, probability differences flatten, making unlikely tokens more probable.
+
+ **Caveats:**
+
+ - Even T=0 isn't fully deterministic—hardware concurrency and floating-point variations can introduce tiny differences
+ - High temperature increases hallucination risk
+ - Temperature interacts with top_p; tuning both simultaneously requires care
+
+ ### Top_p (Nucleus Sampling)
+
+ Top_p limits sampling to the smallest set of tokens whose cumulative probability exceeds p. This provides a different control over diversity than temperature.
+
+ | Top_p | Effect |
+ |-------|--------|
+ | 0.1–0.3 | Very focused, few token choices |
+ | 0.5–0.7 | Moderate diversity |
+ | 0.9–1.0 | Wide sampling, more variation |
+
+ **Temperature vs Top_p:**
+
+ - Temperature affects *all* token probabilities uniformly
+ - Top_p dynamically adjusts the candidate pool based on probability mass
+ - For most use cases, adjust one and leave the other at default
+ - Common pattern: low temperature (0.0–0.3) with top_p=1.0 for structured tasks
+
+ ### Provider Temperature Ranges
+
+ | Provider | Range | Default | Notes |
+ |----------|-------|---------|-------|
+ | OpenAI | 0.0–2.0 | 1.0 | Values >1.0 increase randomness significantly |
+ | Anthropic | 0.0–1.0 | 1.0 | Cannot exceed 1.0 |
+ | Gemini | 0.0–2.0 | 1.0 | Similar to OpenAI |
+ | Ollama | 0.0–1.0+ | 0.7–0.8 | Model-dependent defaults |
+
+ ### Phase-Specific Temperature
+
+ Since temperature can be set per API call, use different values for different workflow phases:
+
+ | Phase | Temperature | Rationale |
+ |-------|-------------|-----------|
+ | Brainstorming/Discuss | 0.7–1.0 | Encourage diverse ideas, exploration |
+ | Planning/Freeze | 0.3–0.5 | Balance creativity with coherence |
+ | Serialize/Tool calls | 0.0–0.2 | Maximize format compliance |
+ | Validation repair | 0.0–0.2 | Deterministic corrections |
+
+ This is particularly relevant for the **Discuss → Freeze → Serialize** pattern described below—each stage benefits from different temperature settings.
+
+ ---
+
  ## Structured Output Pipelines
 
  Many agent tasks end in a **strict artifact**—JSON/YAML configs, story plans, outlines—rather than free-form prose. Trying to get both *conversation* and *perfectly formatted output* from a single response is brittle, especially for small/local models.
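The "How it works" paragraph and the top_p definition in the Sampling Parameters hunk above can be made concrete with a small pure-Python sketch. This is an illustration under stated assumptions, not any provider's implementation: the function names and the three logits are made up, and the nucleus cutoff uses "reaches" (>=) where the text says "exceeds".

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat T=0 as greedy argmax instead")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs: list[float], top_p: float) -> list[int]:
    """Return indices of the smallest top-ranked set whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: list[int] = []
    cumulative = 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)
neutral = softmax_with_temperature(logits, 1.0)
hot = softmax_with_temperature(logits, 2.0)
```

With these inputs the top token's probability shrinks as temperature rises, and a small `top_p` keeps only the leading token while a larger one widens the candidate pool.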
@@ -297,21 +361,23 @@ A more reliable approach is to separate the flow into stages:
 
  ### Discuss → Freeze → Serialize
 
- **Discuss:** keep prompts focused on meaning, not field names. Explicitly tell the model *not* to output JSON/YAML during this phase.
+ **Discuss** (temperature 0.7–1.0): Keep prompts focused on meaning, not field names. Explicitly tell the model *not* to output JSON/YAML during this phase. Higher temperature encourages diverse ideas and creative exploration.
 
- **Freeze:** compress decisions into a short summary:
+ **Freeze** (temperature 0.3–0.5): Compress decisions into a short summary:
 
  - 10–30 bullets, one decision per line.
  - No open questions, only resolved choices.
  - Structured enough that a smaller model can follow it reliably.
+ - Moderate temperature balances coherence with flexibility.
 
- **Serialize:** in a separate call:
+ **Serialize** (temperature 0.0–0.2): In a separate call:
 
  - Provide the schema (JSON Schema, typed model, or tool definition).
- - Instruct: *“Output only JSON that matches this schema. No prose, no markdown fences.”*
+ - Instruct: *"Output only JSON that matches this schema. No prose, no markdown fences."*
  - Use constrained decoding/tool calling where available.
+ - Low temperature maximizes format compliance.
 
- This separates conversational drift from serialization, which significantly improves reliability for structured outputs like story plans, world-bible slices, or configuration objects.
+ This separates conversational drift from serialization, which significantly improves reliability for structured outputs like story plans, world-bible slices, or configuration objects. The temperature gradient—high for exploration, low for precision—matches each phase's purpose.
 
  ### Tool-Gated Finalization
 
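One way to apply the per-phase temperatures from this hunk is a small lookup that is clamped to the provider's documented range. The sketch below is hypothetical throughout: the phase keys, the `temperature_for` helper, and the `boost` parameter are illustrative, the phase values are midpoints of the ranges quoted in the diff, and the provider ceilings are taken from the provider table in the diff.

```python
# Midpoints of the phase ranges quoted in the diff; the phase keys are hypothetical names.
PHASE_TEMPERATURE = {
    "discuss": 0.85,
    "freeze": 0.4,
    "serialize": 0.1,
    "repair": 0.1,
}

# Upper bounds as listed in the provider table in the diff.
PROVIDER_MAX_TEMPERATURE = {
    "openai": 2.0,
    "anthropic": 1.0,
    "gemini": 2.0,
}


def temperature_for(phase: str, provider: str, boost: float = 0.0) -> float:
    """Pick a phase temperature, optionally nudged by `boost`, clamped to the provider's range."""
    base = PHASE_TEMPERATURE[phase] + boost
    ceiling = PROVIDER_MAX_TEMPERATURE[provider]
    return max(0.0, min(base, ceiling))


serialize_temp = temperature_for("serialize", "anthropic")
capped_temp = temperature_for("discuss", "anthropic", boost=0.5)
uncapped_temp = temperature_for("discuss", "openai", boost=0.5)
```

Keeping the clamp in one place means a workflow tuned against one provider does not send an out-of-range value when pointed at another.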
@@ -363,7 +429,108 @@ When a candidate fails validation, the repair prompt should:
 
  > “Return a corrected JSON object that fixes **only** these errors. Do not change fields that are not mentioned. Output only JSON.”
 
- For small models, keep error descriptions compact and concrete rather than abstract (string too long: 345 > max 200).
+ For small models, keep error descriptions compact and concrete rather than abstract ("string too long: 345 > max 200").
+
+ ### Structured Validation Feedback
+
+ Rather than returning free-form error messages, use a structured feedback format that leverages attention patterns (status first, action last) and distinguishes error types clearly.
+
+ **Result Categories**
+
+ Use a semantic result enum rather than boolean success/failure:
+
+ | Result | Meaning | Model Action |
+ |--------|---------|--------------|
+ | `accepted` | Validation passed, artifact stored | Proceed to next step |
+ | `validation_failed` | Content issues the model can fix | Repair and resubmit |
+ | `tool_error` | Infrastructure failure | Retry unchanged or escalate |
+
+ This distinction matters: `validation_failed` tells the model its *content* was wrong (fixable), while `tool_error` indicates the tool itself failed (retry or give up).
+
+ **Error Categorization**
+
+ Group validation errors by type to help the model understand what went wrong:
+
+ ```json
+ {
+   "result": "validation_failed",
+   "issues": {
+     "invalid": [
+       {"field": "estimated_passages", "value": 15, "requirement": "must be 1-10"}
+     ],
+     "missing": ["protagonist_name", "setting"],
+     "unknown": ["passages"]
+   },
+   "issue_count": {"invalid": 1, "missing": 2, "unknown": 1},
+   "action": "Fix the 4 issues above and resubmit. Use exact field names from the schema."
+ }
+ ```
+
+ | Category | Meaning | Common Cause |
+ |----------|---------|--------------|
+ | `invalid` | Field present but value wrong | Constraint violation, wrong type |
+ | `missing` | Required field not provided | Omission, incomplete output |
+ | `unknown` | Field not in schema | Typo, hallucinated field name |
+
+ The `unknown` category is particularly valuable—it catches near-misses like `passages` instead of `estimated_passages` that would otherwise appear as "missing" with no hint about the typo.
+
+ **Field Ordering (Primacy/Recency)**
+
+ Structure feedback to exploit the U-shaped attention curve:
+
+ 1. **Result status** (first—immediate orientation)
+ 2. **Issues by category** (middle—detailed content)
+ 3. **Issue count** (severity summary)
+ 4. **Action instructions** (last—what to do next)
+
+ **What NOT to Include**
+
+ | Avoid | Why |
+ |-------|-----|
+ | Full schema | Already in tool definition; wastes tokens in retry loops |
+ | Boolean `success` field | Ambiguous; use semantic result categories instead |
+ | Generic hints | Replace with actionable, field-specific instructions |
+ | Valid fields | Only describe what failed, not what succeeded |
+
+ **Example: Before and After**
+
+ Anti-pattern (vague, wastes tokens):
+
+ ```
+ Error: Validation failed. Expected fields: type, title, protagonist_name,
+ setting, theme, estimated_passages, tone. Please check your submission
+ and ensure all required fields are present with valid values.
+ ```
+
+ Better (specific, actionable):
+
+ ```json
+ {
+   "result": "validation_failed",
+   "issues": {
+     "invalid": [{"field": "type", "value": "story", "requirement": "must be 'dream'"}],
+     "missing": ["protagonist_name"],
+     "unknown": ["passages"]
+   },
+   "action": "Fix these 3 issues. Did you mean 'estimated_passages' instead of 'passages'?"
+ }
+ ```
+
+ The improved version:
+
+ - Names the exact fields that failed
+ - Suggests the likely typo (`passages` → `estimated_passages`)
+ - Doesn't repeat schema information already available to the model
+ - Ends with a clear action instruction (primacy/recency)
+
+ ### Retry Budget and Token Efficiency
+
+ Validation loops consume tokens. Design for efficiency:
+
+ - **Cap retries**: 2-3 attempts is usually sufficient; more indicates a prompt or schema problem
+ - **Escalate gracefully**: After retry budget exhausted, surface a clear failure rather than looping
+ - **Track retry rates**: High retry rates signal opportunities for prompt improvement or schema simplification
+ - **Consider model capability**: Less capable models may need higher retry budgets but with simpler feedback
 
  ### Best Practices
 
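The feedback format added in this hunk could be produced by a validator along the following lines. This is a minimal sketch, not the package's code: the toy `SCHEMA`, the `REQUIREMENTS` messages, and the `validate` function are assumptions, with field names borrowed from the example in the diff. Typo hints come from the standard library's `difflib.get_close_matches`.

```python
import difflib
from typing import Any, Callable

# Toy schema: field name -> predicate. Field names echo the diff's example; the rules are assumptions.
SCHEMA: dict[str, Callable[[Any], bool]] = {
    "protagonist_name": lambda v: isinstance(v, str) and bool(v),
    "setting": lambda v: isinstance(v, str) and bool(v),
    "estimated_passages": lambda v: isinstance(v, int) and 1 <= v <= 10,
}
REQUIREMENTS = {
    "protagonist_name": "must be a non-empty string",
    "setting": "must be a non-empty string",
    "estimated_passages": "must be 1-10",
}


def validate(submission: dict[str, Any]) -> dict[str, Any]:
    """Return feedback ordered as status, issues, counts, action."""
    invalid = [
        {"field": name, "value": submission[name], "requirement": REQUIREMENTS[name]}
        for name, check in SCHEMA.items()
        if name in submission and not check(submission[name])
    ]
    missing = [name for name in SCHEMA if name not in submission]
    unknown = [name for name in submission if name not in SCHEMA]
    total = len(invalid) + len(missing) + len(unknown)
    if total == 0:
        return {"result": "accepted"}

    # Suggest likely typos by matching unknown names against the missing ones.
    hints = []
    for name in unknown:
        close = difflib.get_close_matches(name, missing, n=1, cutoff=0.5)
        if close:
            hints.append(f"Did you mean '{close[0]}' instead of '{name}'?")
    action = " ".join([f"Fix the {total} issues above and resubmit."] + hints)
    return {
        "result": "validation_failed",
        "issues": {"invalid": invalid, "missing": missing, "unknown": unknown},
        "issue_count": {"invalid": len(invalid), "missing": len(missing), "unknown": len(unknown)},
        "action": action,
    }


feedback = validate({"protagonist_name": "Ada", "passages": 15})
```

Because Python dictionaries preserve insertion order, serializing `feedback` as JSON keeps the status first and the action last, which is the ordering the hunk recommends.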
@@ -528,9 +695,12 @@ Before deploying:
 
  ## Provider-Specific Optimizations
 
- - **Anthropic**: Use `token-efficient-tools` beta header for up to 70% output token reduction
- - **OpenAI**: Consider fine-tuning for frequently-used patterns
- - **Local models**: Tool retrieval essential—small models struggle with 10+ tools
+ - **Anthropic**: Use `token-efficient-tools` beta header for up to 70% output token reduction; temperature capped at 1.0
+ - **OpenAI**: Consider fine-tuning for frequently-used patterns; temperature range 0.0–2.0
+ - **Gemini**: Temperature range 0.0–2.0, similar behavior to OpenAI
+ - **Ollama/Local**: Tool retrieval essential—small models struggle with 10+ tools; default temperature varies by model (typically 0.7–0.8)
+
+ See [Sampling Parameters](#sampling-parameters) for detailed temperature guidance by use case.
 
  ---
 
@@ -549,6 +719,8 @@ Before deploying:
  | Dynamic few-shot | Static example bloat | Retrieve relevant examples |
  | Reflection | Quality failures | Draft → critique → refine |
  | Context pruning | Context rot | Summarize and remove stale turns |
+ | Structured feedback | Vague validation errors | Categorize issues (invalid/missing/unknown) |
+ | Phase-specific temperature | Format errors in structured output | High temp for discuss, low for serialize |
 
  | Model Class | Max Prompt | Max Tools | Strategy |
  |-------------|------------|-----------|----------|
@@ -567,6 +739,8 @@ Before deploying:
  | RAG-MCP (2025) | Two-stage selection reduces tokens 50%+, improves accuracy 3x |
  | Anthropic Token-Efficient Tools | Schema optimization reduces output tokens 70% |
  | Reflexion research | Self-correction improves quality on complex tasks |
+ | STROT Framework (2025) | Structured feedback loops achieve 95% first-attempt success |
+ | AWS Evaluator-Optimizer | Semantic reflection enables self-improving validation |
 
  ---
 
ifcraftcorpus-1.2.1.data/data/share/ifcraftcorpus/subagents/README.md ADDED
@@ -0,0 +1,198 @@
+ # IF Craft Corpus Subagents
+
+ Specialized agent templates for Interactive Fiction authoring workflows. These templates provide system prompts for LLM agents that can assist with different aspects of IF creation.
+
+ ## Overview
+
+ The subagents follow a **hub-and-spoke orchestration pattern** where specialized agents handle specific tasks:
+
+ | Agent | Archetype | Role |
+ |-------|-----------|------|
+ | **Story Architect** | Orchestrator | Plans narrative structure, decomposes projects, coordinates creation |
+ | **Prose Writer** | Creator | Writes narrative prose, dialogue, and scene text |
+ | **Quality Reviewer** | Validator | Reviews content for quality, consistency, and standards |
+ | **Genre Consultant** | Researcher | Provides genre-specific guidance on conventions and tropes |
+ | **World Curator** | Curator | Maintains world consistency, manages canon |
+ | **Platform Advisor** | Researcher | Guides tool/platform selection and technical implementation |
+
+ ## Usage
+
+ ### Via MCP Prompts (Recommended)
+
+ When using the IF Craft Corpus MCP server, subagents are exposed as **prompts** that can be retrieved and used as system prompts for agents:
+
+ ```python
+ # Using FastMCP client
+ from fastmcp import Client
+
+ async with Client("ifcraftcorpus-mcp") as client:
+     # List available subagents
+     prompts = await client.list_prompts()
+
+     # Get a specific prompt
+     result = await client.get_prompt(
+         "if_story_architect",
+         arguments={"project_name": "My IF Game", "genre": "mystery"}
+     )
+
+     # Use the prompt content as a system prompt
+     system_prompt = result.messages[0].content.text
+ ```
+
+ ### Via MCP Tool
+
+ You can also use the `list_subagents` tool to discover available agents:
+
+ ```python
+ subagents = await client.call_tool("list_subagents")
+ # Returns list of agents with name, description, archetype, and parameters
+ ```
+
+ ### Direct File Access
+
+ The markdown templates can also be read directly:
+
+ ```python
+ from pathlib import Path
+
+ # In development
+ template = Path("subagents/if_prose_writer.md").read_text()
+
+ # In installed package
+ import sys
+ template_path = Path(sys.prefix) / "share" / "ifcraftcorpus" / "subagents" / "if_prose_writer.md"
+ template = template_path.read_text()
+ ```
+
+ ## Agent Details
+
+ ### IF Story Architect
+
+ **Archetype:** Orchestrator
+ **Parameters:** `project_name`, `genre`
+
+ Plans and coordinates IF projects without writing content itself. Responsibilities:
+ - Design narrative topology (time cave, branch-and-bottleneck, QBN, etc.)
+ - Decompose projects into scenes and branches
+ - Plan emotional arcs across branches
+ - Create scene briefs for content creators
+
+ **When to use:** At project start to plan structure, or when restructuring.
+
+ ---
+
+ ### IF Prose Writer
+
+ **Archetype:** Creator
+ **Parameters:** `genre`, `pov`
+
+ Creates narrative content from briefs. Responsibilities:
+ - Write scene prose and dialogue
+ - Maintain character voice consistency
+ - Handle POV and exposition
+ - Create choice text
+
+ **When to use:** For actual content creation from scene briefs.
+
+ ---
+
+ ### IF Quality Reviewer
+
+ **Archetype:** Validator
+ **Parameters:** `focus_areas`
+
+ Reviews content for quality issues. Responsibilities:
+ - Check structural integrity (orphaned content, dead ends)
+ - Verify voice and style consistency
+ - Validate canon and continuity
+ - Audit accessibility compliance
+
+ **When to use:** After content creation, before publishing.
+
+ ---
+
+ ### IF Genre Consultant
+
+ **Archetype:** Researcher
+ **Parameters:** `primary_genre`, `secondary_genre`
+
+ Provides genre-specific guidance. Responsibilities:
+ - Explain genre conventions and expectations
+ - Suggest appropriate tropes and subversions
+ - Advise on cross-genre blending
+ - Guide tone and style
+
+ **When to use:** During planning, or when genre questions arise.
+
+ ---
+
+ ### IF World Curator
+
+ **Archetype:** Curator
+ **Parameters:** `world_name`, `setting_type`
+
+ Maintains world consistency. Responsibilities:
+ - Track canon facts across branches
+ - Manage timeline and character states
+ - Flag contradictions
+ - Maintain world bible
+
+ **When to use:** Throughout project to maintain consistency.
+
+ ---
+
+ ### IF Platform Advisor
+
+ **Archetype:** Researcher
+ **Parameters:** `target_platform`, `team_size`
+
+ Guides technical decisions. Responsibilities:
+ - Compare IF platforms (Twine, Ink, ChoiceScript, etc.)
+ - Recommend tools based on project needs
+ - Advise on workflow and collaboration
+ - Guide integration strategies
+
+ **When to use:** At project start for platform selection, or when evaluating tools.
+
+ ## Corpus Integration
+
+ All subagents are designed to use the IF Craft Corpus MCP tools:
+
+ - `search_corpus(query, cluster?, limit?)` - Find relevant guidance
+ - `get_document(name)` - Retrieve full document
+ - `list_documents(cluster?)` - Discover available guidance
+
+ Each template includes guidance on which corpus clusters are most relevant for that agent's work.
+
+ ## Web Research
+
+ Subagents are also encouraged to use web search for:
+ - Historical/factual accuracy
+ - Current platform documentation
+ - Published IF examples
+ - Domain-specific knowledge
+
+ ## Design Principles
+
+ These templates follow patterns from the corpus's own agent design documents:
+
+ 1. **Sandwich Pattern** - Critical constraints at start AND end of prompt
+ 2. **Menu + Consult** - Summary in prompt, retrieve details on demand
+ 3. **Clear Archetypes** - Each agent has a defined role and boundaries
+ 4. **Neutral Tool Descriptions** - Descriptive, not prescriptive
+
+ ## Extending
+
+ To create custom subagents:
+
+ 1. Copy an existing template as a starting point
+ 2. Modify the role, responsibilities, and workflow sections
+ 3. Update the corpus cluster references for your agent's domain
+ 4. Add any custom output formats needed
+ 5. Register as an MCP prompt if desired
+
+ ## License
+
+ These templates are part of the IF Craft Corpus package:
+ - **Code**: MIT License
+ - **Content**: CC-BY-4.0