mdcontext 0.1.0 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.changeset/config.json +9 -9
- package/.claude/settings.local.json +25 -0
- package/.github/workflows/claude-code-review.yml +44 -0
- package/.github/workflows/claude.yml +85 -0
- package/CONTRIBUTING.md +186 -0
- package/NOTES/NOTES +44 -0
- package/README.md +206 -3
- package/biome.json +1 -1
- package/dist/chunk-23UPXDNL.js +3044 -0
- package/dist/chunk-2W7MO2DL.js +1366 -0
- package/dist/chunk-3NUAZGMA.js +1689 -0
- package/dist/chunk-7TOWB2XB.js +366 -0
- package/dist/chunk-7XOTOADQ.js +3065 -0
- package/dist/chunk-AH2PDM2K.js +3042 -0
- package/dist/chunk-BNXWSZ63.js +3742 -0
- package/dist/chunk-BTL5DJVU.js +3222 -0
- package/dist/chunk-HDHYG7E4.js +104 -0
- package/dist/chunk-HLR4KZBP.js +3234 -0
- package/dist/chunk-IP3FRFEB.js +1045 -0
- package/dist/chunk-KHU56VDO.js +3042 -0
- package/dist/chunk-KRYIFLQR.js +85 -89
- package/dist/chunk-LBSDNLEM.js +287 -0
- package/dist/chunk-MNTQ7HCP.js +2643 -0
- package/dist/chunk-MUJELQQ6.js +1387 -0
- package/dist/chunk-MXJGMSLV.js +2199 -0
- package/dist/chunk-N6QJGC3Z.js +2636 -0
- package/dist/chunk-OBELGBPM.js +1713 -0
- package/dist/chunk-OT7R5XTA.js +3192 -0
- package/dist/chunk-P7X4RA2T.js +106 -0
- package/dist/chunk-PIDUQNC2.js +3185 -0
- package/dist/chunk-POGCDIH4.js +3187 -0
- package/dist/chunk-PSIEOQGZ.js +3043 -0
- package/dist/chunk-PVRT3IHA.js +3238 -0
- package/dist/chunk-QNN4TT23.js +1430 -0
- package/dist/chunk-RE3R45RJ.js +3042 -0
- package/dist/chunk-S7E6TFX6.js +718 -657
- package/dist/chunk-SG6GLU4U.js +1378 -0
- package/dist/chunk-SJCDV2ST.js +274 -0
- package/dist/chunk-SYE5XLF3.js +104 -0
- package/dist/chunk-T5VLYBZD.js +103 -0
- package/dist/chunk-TOQB7VWU.js +3238 -0
- package/dist/chunk-VFNMZ4ZQ.js +3228 -0
- package/dist/chunk-VVTGZNBT.js +1533 -1423
- package/dist/chunk-W7Q4RFEV.js +104 -0
- package/dist/chunk-XTYYVRLO.js +3190 -0
- package/dist/chunk-Y6MDYVJD.js +3063 -0
- package/dist/cli/main.js +4072 -629
- package/dist/index.d.ts +420 -33
- package/dist/index.js +8 -15
- package/dist/mcp/server.js +103 -7
- package/dist/schema-BAWSG7KY.js +22 -0
- package/dist/schema-E3QUPL26.js +20 -0
- package/dist/schema-EHL7WUT6.js +20 -0
- package/docs/019-USAGE.md +44 -5
- package/docs/020-current-implementation.md +8 -8
- package/docs/021-DOGFOODING-FINDINGS.md +1 -1
- package/docs/CONFIG.md +1123 -0
- package/docs/ERRORS.md +383 -0
- package/docs/summarization.md +320 -0
- package/justfile +40 -0
- package/package.json +39 -33
- package/research/INDEX.md +315 -0
- package/research/code-review/README.md +90 -0
- package/research/code-review/cli-error-handling-review.md +979 -0
- package/research/code-review/code-review-validation-report.md +464 -0
- package/research/code-review/main-ts-review.md +1128 -0
- package/research/config-docs/SUMMARY.md +357 -0
- package/research/config-docs/TEST-RESULTS.md +776 -0
- package/research/config-docs/TODO.md +542 -0
- package/research/config-docs/analysis.md +744 -0
- package/research/config-docs/fix-validation.md +502 -0
- package/research/config-docs/help-audit.md +264 -0
- package/research/config-docs/help-system-analysis.md +890 -0
- package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
- package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
- package/research/issue-review.md +603 -0
- package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
- package/research/llm-summarization/alternative-providers-2026.md +1428 -0
- package/research/llm-summarization/anthropic-2026.md +367 -0
- package/research/llm-summarization/claude-cli-integration.md +1706 -0
- package/research/llm-summarization/cli-integration-patterns.md +3155 -0
- package/research/llm-summarization/openai-2026.md +473 -0
- package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
- package/research/llm-summarization/opencode-cli-integration.md +1552 -0
- package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
- package/research/llm-summarization/prototype-results.md +56 -0
- package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
- package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
- package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
- package/research/mdcontext-pudding/01-index-embed.md +956 -0
- package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
- package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
- package/research/mdcontext-pudding/02-search.md +970 -0
- package/research/mdcontext-pudding/03-context.md +779 -0
- package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
- package/research/mdcontext-pudding/04-tree.md +704 -0
- package/research/mdcontext-pudding/05-config.md +1038 -0
- package/research/mdcontext-pudding/06-links-summary.txt +87 -0
- package/research/mdcontext-pudding/06-links.md +679 -0
- package/research/mdcontext-pudding/07-stats.md +693 -0
- package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
- package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
- package/research/mdcontext-pudding/README.md +168 -0
- package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
- package/research/research-quality-review.md +834 -0
- package/research/semantic-search/embedding-text-analysis.md +156 -0
- package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
- package/research/semantic-search/query-processing-analysis.md +207 -0
- package/research/semantic-search/root-cause-and-solution.md +114 -0
- package/research/semantic-search/threshold-validation-report.md +69 -0
- package/research/semantic-search/vector-search-analysis.md +63 -0
- package/research/test-path-issues.md +276 -0
- package/review/ALP-76/1-error-type-design.md +962 -0
- package/review/ALP-76/2-error-handling-patterns.md +906 -0
- package/review/ALP-76/3-error-presentation.md +624 -0
- package/review/ALP-76/4-test-coverage.md +625 -0
- package/review/ALP-76/5-migration-completeness.md +440 -0
- package/review/ALP-76/6-effect-best-practices.md +755 -0
- package/scripts/apply-branch-protection.sh +47 -0
- package/scripts/branch-protection-templates.json +79 -0
- package/scripts/prototype-summarization.ts +346 -0
- package/scripts/rebuild-hnswlib.js +32 -37
- package/scripts/setup-branch-protection.sh +64 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
- package/src/cli/argv-preprocessor.test.ts +2 -2
- package/src/cli/cli.test.ts +230 -33
- package/src/cli/commands/config-cmd.ts +642 -0
- package/src/cli/commands/context.ts +97 -9
- package/src/cli/commands/duplicates.ts +122 -0
- package/src/cli/commands/embeddings.ts +529 -0
- package/src/cli/commands/index-cmd.ts +210 -30
- package/src/cli/commands/index.ts +3 -0
- package/src/cli/commands/search.ts +894 -64
- package/src/cli/commands/stats.ts +3 -0
- package/src/cli/commands/tree.ts +26 -5
- package/src/cli/config-layer.ts +176 -0
- package/src/cli/error-handler.test.ts +235 -0
- package/src/cli/error-handler.ts +655 -0
- package/src/cli/flag-schemas.ts +66 -0
- package/src/cli/help.ts +209 -7
- package/src/cli/main.ts +348 -58
- package/src/cli/options.ts +10 -0
- package/src/cli/shared-error-handling.ts +199 -0
- package/src/cli/utils.ts +150 -17
- package/src/config/file-provider.test.ts +320 -0
- package/src/config/file-provider.ts +273 -0
- package/src/config/index.ts +72 -0
- package/src/config/integration.test.ts +667 -0
- package/src/config/precedence.test.ts +277 -0
- package/src/config/precedence.ts +451 -0
- package/src/config/schema.test.ts +414 -0
- package/src/config/schema.ts +603 -0
- package/src/config/service.test.ts +320 -0
- package/src/config/service.ts +243 -0
- package/src/config/testing.test.ts +264 -0
- package/src/config/testing.ts +110 -0
- package/src/core/types.ts +6 -33
- package/src/duplicates/detector.test.ts +183 -0
- package/src/duplicates/detector.ts +414 -0
- package/src/duplicates/index.ts +18 -0
- package/src/embeddings/embedding-namespace.test.ts +300 -0
- package/src/embeddings/embedding-namespace.ts +947 -0
- package/src/embeddings/heading-boost.test.ts +222 -0
- package/src/embeddings/hnsw-build-options.test.ts +198 -0
- package/src/embeddings/hyde.test.ts +272 -0
- package/src/embeddings/hyde.ts +264 -0
- package/src/embeddings/index.ts +2 -0
- package/src/embeddings/openai-provider.ts +332 -83
- package/src/embeddings/pricing.json +22 -0
- package/src/embeddings/provider-constants.ts +204 -0
- package/src/embeddings/provider-errors.test.ts +967 -0
- package/src/embeddings/provider-errors.ts +565 -0
- package/src/embeddings/provider-factory.test.ts +240 -0
- package/src/embeddings/provider-factory.ts +225 -0
- package/src/embeddings/provider-integration.test.ts +788 -0
- package/src/embeddings/query-preprocessing.test.ts +187 -0
- package/src/embeddings/semantic-search-threshold.test.ts +508 -0
- package/src/embeddings/semantic-search.ts +780 -93
- package/src/embeddings/types.ts +293 -16
- package/src/embeddings/vector-store.ts +486 -77
- package/src/embeddings/voyage-provider.ts +313 -0
- package/src/errors/errors.test.ts +845 -0
- package/src/errors/index.ts +533 -0
- package/src/index/ignore-patterns.test.ts +354 -0
- package/src/index/ignore-patterns.ts +305 -0
- package/src/index/indexer.ts +286 -48
- package/src/index/storage.ts +94 -30
- package/src/index/types.ts +40 -2
- package/src/index/watcher.ts +67 -9
- package/src/index.ts +22 -0
- package/src/integration/search-keyword.test.ts +678 -0
- package/src/mcp/server.ts +135 -6
- package/src/parser/parser.ts +18 -19
- package/src/parser/section-filter.test.ts +277 -0
- package/src/parser/section-filter.ts +125 -3
- package/src/search/__tests__/hybrid-search.test.ts +650 -0
- package/src/search/bm25-store.ts +366 -0
- package/src/search/cross-encoder.test.ts +253 -0
- package/src/search/cross-encoder.ts +406 -0
- package/src/search/fuzzy-search.test.ts +419 -0
- package/src/search/fuzzy-search.ts +273 -0
- package/src/search/hybrid-search.ts +448 -0
- package/src/search/path-matcher.test.ts +276 -0
- package/src/search/path-matcher.ts +33 -0
- package/src/search/searcher.test.ts +99 -1
- package/src/search/searcher.ts +189 -67
- package/src/search/wink-bm25.d.ts +30 -0
- package/src/summarization/cli-providers/claude.ts +202 -0
- package/src/summarization/cli-providers/detection.test.ts +273 -0
- package/src/summarization/cli-providers/detection.ts +118 -0
- package/src/summarization/cli-providers/index.ts +8 -0
- package/src/summarization/cost.test.ts +139 -0
- package/src/summarization/cost.ts +102 -0
- package/src/summarization/error-handler.test.ts +127 -0
- package/src/summarization/error-handler.ts +111 -0
- package/src/summarization/index.ts +102 -0
- package/src/summarization/pipeline.test.ts +498 -0
- package/src/summarization/pipeline.ts +231 -0
- package/src/summarization/prompts.test.ts +269 -0
- package/src/summarization/prompts.ts +133 -0
- package/src/summarization/provider-factory.test.ts +396 -0
- package/src/summarization/provider-factory.ts +178 -0
- package/src/summarization/types.ts +184 -0
- package/src/summarize/summarizer.ts +104 -35
- package/src/types/huggingface-transformers.d.ts +66 -0
- package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +4 -4
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +14 -0
- package/tests/integration/embed-index.test.ts +712 -0
- package/tests/integration/search-context.test.ts +469 -0
- package/tests/integration/search-semantic.test.ts +522 -0
- package/vitest.config.ts +1 -6
- package/AGENTS.md +0 -46
- package/tests/fixtures/cli/.mdcontext/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/vectors.meta.json +0 -1264
@@ -0,0 +1,473 @@

# OpenAI Models & Pricing Research - 2026

**Research Date:** January 26, 2026
**Focus:** Code understanding, summarization, and analysis capabilities

---

## Executive Summary

OpenAI has significantly expanded its model lineup in 2025-2026, with major releases including GPT-5 (August 2025), GPT-5.2 (January 2026), and specialized coding models. The company has also reduced pricing substantially, notably an 80% price drop for o3 reasoning models. For code summarization and understanding tasks, GPT-5.2-Codex represents the state of the art, while GPT-4o-mini and GPT-5-mini offer cost-effective alternatives.

---

## Current Model Lineup (2026)

### 1. GPT-5 Series - Flagship Models

#### GPT-5.2 (Latest - January 2026)
**Best for:** Coding, agentic tasks, document summarization, complex reasoning

- **Pricing:** $1.75/1M input tokens, $14.00/1M output tokens
- **Context Window:** Up to 400K tokens
- **Knowledge Cutoff:** August 2025
- **Key Features:**
  - Designed for deeper work and complex tasks
  - Excels at coding, summarizing long documents, file Q&A
  - Step-by-step math and logic reasoning
  - Flagship model for coding and agentic tasks across industries
  - Stronger multimodal performance than predecessors

#### GPT-5.2-Pro
- **Pricing:** $21/1M input tokens, $168/1M output tokens
- **Context Window:** Up to 400K tokens
- **Key Features:**
  - Smartest and most trustworthy option for difficult questions
  - Fewer major errors in testing
  - Stronger performance in complex domains like programming
  - Supports extended reasoning for top-quality execution

#### GPT-5 (Standard)
- **Pricing:** $1.25/1M input tokens, $10.00/1M output tokens
- **Context Window:** Up to 400K tokens
- **Release:** August 7, 2025
- **Key Features:**
  - 74.9% accuracy on SWE-bench Verified
  - 88% on Aider Polyglot benchmarks
  - State-of-the-art for real-world coding tasks

#### GPT-5 Variants
- **GPT-5 Pro Global:** $15/1M input, $120/1M output
- **GPT-5-mini:** $0.25/1M input, $2.00/1M output
- **GPT-5-nano:** $0.05/1M input, $0.40/1M output

### 2. GPT-5.2-Codex - Specialized Coding Model

**Best for:** Complex software engineering, large refactors, code migrations, agentic coding

- **Pricing:** $1.25/1M input tokens, $10.00/1M output tokens (estimated based on GPT-5 Codex)
- **Note:** Specific GPT-5.2-Codex API pricing not yet publicly finalized
- **Context Window:** Up to 400K tokens
- **Key Features:**
  - Most advanced agentic coding model for complex, real-world software engineering
  - Improvements in context compaction for long-horizon work
  - Stronger performance on large code changes (refactors, migrations)
  - Improved Windows environment support
  - **Significantly stronger cybersecurity capabilities**
  - State-of-the-art on SWE-Bench Pro and Terminal-Bench 2.0
  - Better at working in large repositories over extended sessions
  - Can reliably complete complex tasks without losing context

**Availability:**
- Currently available in ChatGPT Codex surfaces for paid users
- API access rolling out in coming weeks (as of January 2026)

**ChatGPT Subscription Access:**
- **Plus Plan:** Up to 160 messages with GPT-5.2 every 3 hours
- **Pro Plan:** $200/month with unlimited messages (subject to fair use)

### 3. GPT-4 Series - Previous Generation

#### GPT-4.1
- **Pricing:** $2.00/1M input tokens, $8.00/1M output tokens
- **Context Window:** Up to 1.0M tokens (largest context window)
- **Release:** April 14, 2025

#### GPT-4.1-mini
- **Pricing:** $0.40/1M input, $1.60/1M output tokens
- **Context Window:** 128K tokens

#### GPT-4o
- **Pricing:** $2.50/1M input tokens, $10.00/1M output tokens
- **Context Window:** Up to 128K tokens
- **Key Features:**
  - Text generation, summarization, knowledge-based Q&A
  - Reasoning, complex math, coding capabilities
  - Strong multimodal performance

#### GPT-4o-mini (Budget Option)
- **Pricing:** $0.150/1M input tokens, $0.600/1M output tokens
- **Context Window:** Up to 128K tokens
- **Key Features:**
  - Most cost-effective option
  - Suitable for high-volume applications
  - Good for straightforward code analysis tasks

### 4. Reasoning Models (o-series)

#### o3 (Latest Reasoning Model - 2026)
**Best for:** Complex reasoning, mathematical proofs, advanced problem-solving

- **Pricing:** $2.00/1M input tokens, $8.00/1M output tokens
- **Price Reduction:** 80% cheaper than previous rates
- **Cached Input:** additional discount of $0.50/1M tokens
- **Flex Mode:** $5/1M input, $20/1M output (synchronous processing)
- **Key Features:**
  - Extended reasoning capabilities
  - Suitable for complex logic and mathematical tasks

#### o3-pro
- **Availability:** ChatGPT Pro ($200/month) and Team users only
- **Pricing:** Significantly higher API costs (reflects extended compute time)
- **Note:** Best for the most demanding reasoning tasks

#### o4-mini
- **Pricing:** Not specified; positioned as cost-effective
- **Key Features:**
  - Remarkable reasoning performance at a fraction of the cost
  - Cost-effective for high-volume applications

#### o1 (Previous Generation - Historical Reference)
- **o1-preview:** $15/1M input, $60/1M output tokens
- **o1-mini:** $1.10/1M input, $4.40/1M output tokens

---

## Batch API Pricing

For non-urgent workloads processed within 24 hours:

- **50% discount on all models**
- Example: GPT-5 drops to $0.625/$5.00 per million tokens (vs $1.25/$10.00 standard)
- Ideal for bulk code analysis and summarization tasks

---

## Rate Limits & API Tiers

### Rate Limit Metrics
OpenAI measures limits in five ways:
- **RPM:** Requests per minute
- **RPD:** Requests per day
- **TPM:** Tokens per minute
- **TPD:** Tokens per day
- **IPM:** Images per minute

### 2026 Rate Limit Updates

**Significant Increases:**
- GPT-5 Tier 1: Increased from ~30,000 TPM to **500,000 TPM**
- GPT-5 Tier 1: Approximately **1,000 RPM**
- Higher tiers provide substantially more capacity

### Usage Tiers
- Users automatically graduate to higher tiers as API spending increases
- Most models available on Tiers 1-5 (subject to organization verification)
- Check current limits at: `platform.openai.com/settings/organization/limits`

### Best Practices for Rate Limits

1. **Error Handling:**
   - Watch for `429: Too Many Requests` errors
   - Implement `RateLimitError` exception handling

2. **Exponential Backoff:**
   - Use retries with randomized, increasing wait periods
   - Most effective strategy for handling rate limits

3. **Request Throttling:**
   - Reference OpenAI's `api_request_parallel_processor.py` script
   - Batch multiple small requests together

4. **Optimization:**
   - Set the `max_tokens` parameter realistically
   - Cache responses for frequently asked questions
   - Use the Batch API for non-urgent workloads
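
The backoff practice above can be sketched as a small retry wrapper. This is a minimal sketch, not the OpenAI SDK's built-in retry logic; `callApi`, `withRetries`, and the parameter defaults are illustrative names and values.

```typescript
// Sketch of retry-with-exponential-backoff for 429 responses.
// All names and defaults here are illustrative assumptions.

/** Delay (ms) before retry `attempt`: exponential growth, full jitter, capped. */
function backoffDelay(
  attempt: number,
  baseMs = 500,
  maxMs = 60_000,
  jitter: () => number = Math.random,
): number {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(exp * jitter()); // full jitter: uniform in [0, exp)
}

/** Retry `call` on rate-limit errors only, up to `maxRetries` times. */
async function withRetries<T>(call: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err: any) {
      // Assumes the error carries an HTTP status; rethrow anything else.
      if (err?.status !== 429 || attempt >= maxRetries) throw err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
    }
  }
}
```

The randomized jitter matters: without it, many clients that hit the limit together retry together and hit it again.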

---

## Recommendations for Code Summarization Use Cases

### Best Model Choices by Use Case

#### 1. Production Code Summarization at Scale
**Recommended:** GPT-4o-mini or GPT-5-mini
- **Why:** Extremely cost-effective ($0.15/$0.60 or $0.25/$2.00 per million tokens)
- **Use for:**
  - Summarizing function docstrings
  - Generating README files
  - Inline code comments
  - High-volume batch processing
- **Cost at 10M tokens/day:**
  - GPT-4o-mini: ~$7.50/day
  - GPT-5-mini: ~$22.50/day

#### 2. Complex Code Understanding & Analysis
**Recommended:** GPT-5.2 or GPT-5
- **Why:** Superior reasoning and code comprehension
- **Use for:**
  - Architecture documentation
  - Complex dependency analysis
  - Security vulnerability assessment
  - Technical debt analysis
- **Cost at 1M tokens/day:**
  - GPT-5.2: ~$15.75/day
  - GPT-5: ~$11.25/day

#### 3. Agentic Code Refactoring & Large Migrations
**Recommended:** GPT-5.2-Codex
- **Why:** Specifically optimized for multi-step coding tasks
- **Use for:**
  - Large-scale refactors
  - Framework migrations
  - Codebase modernization
  - Security-critical code changes
- **Key Advantage:** Maintains context over long sessions
- **Cost:** ~$11.25/day at 1M tokens (estimated)

#### 4. Deep Reasoning About Code Logic
**Recommended:** o3
- **Why:** Best reasoning capabilities, now 80% cheaper
- **Use for:**
  - Algorithm correctness verification
  - Complex bug root cause analysis
  - Performance optimization reasoning
  - Mathematical code verification
- **Cost at 1M tokens/day:** ~$10.00/day

#### 5. Budget-Conscious Batch Processing
**Recommended:** GPT-4o-mini or GPT-5-mini via Batch API
- **Why:** 50% discount on top of already low prices
- **Use for:**
  - Overnight documentation generation
  - Weekly codebase summaries
  - Non-urgent analysis tasks
- **Cost at 10M tokens/day:**
  - GPT-4o-mini Batch: ~$3.75/day
  - GPT-5-mini Batch: ~$11.25/day

### Cost Comparison Table

| Model | Input ($/1M) | Output ($/1M) | Batch Input | Batch Output | Best For |
|-------|-------------|---------------|-------------|--------------|----------|
| GPT-4o-mini | $0.15 | $0.60 | $0.075 | $0.30 | High-volume basic tasks |
| GPT-5-nano | $0.05 | $0.40 | $0.025 | $0.20 | Ultra-low-cost simple summaries |
| GPT-5-mini | $0.25 | $2.00 | $0.125 | $1.00 | Balanced cost/quality |
| GPT-5 | $1.25 | $10.00 | $0.625 | $5.00 | Standard complex tasks |
| GPT-5.2 | $1.75 | $14.00 | $0.875 | $7.00 | Latest flagship |
| GPT-5.2-Codex | ~$1.25 | ~$10.00 | ~$0.625 | ~$5.00 | Agentic coding |
| GPT-4.1 | $2.00 | $8.00 | $1.00 | $4.00 | Large context (1M) |
| o3 | $2.00 | $8.00 | $1.00 | $4.00 | Deep reasoning |
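
The per-day figures in this section all come from the same arithmetic, which is worth having as a helper. A minimal sketch: prices are the snapshot values listed in the table above, and `dailyCost` is an illustrative name, not an mdcontext API.

```typescript
// Daily-cost helper using the snapshot prices from the table above.
// Prices are $ per 1M tokens; `batch: true` applies the flat 50% Batch API discount.

type Price = { input: number; output: number };

const PRICES: Record<string, Price> = {
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
  "gpt-5-nano": { input: 0.05, output: 0.4 },
  "gpt-5-mini": { input: 0.25, output: 2.0 },
  "gpt-5": { input: 1.25, output: 10.0 },
  "gpt-5.2": { input: 1.75, output: 14.0 },
};

function dailyCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
  batch = false,
): number {
  const p = PRICES[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  const discount = batch ? 0.5 : 1;
  return ((inputTokens * p.input + outputTokens * p.output) / 1_000_000) * discount;
}
```

For example, `dailyCost("gpt-4o-mini", 10_000_000, 10_000_000, true)` reproduces the ~$3.75/day Batch figure used above.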

---

## Special Considerations for Code Analysis

### Context Window Strategy

1. **Small Files (<10K tokens):** Any model works
2. **Medium Projects (10K-128K tokens):** GPT-4o, GPT-5.2, GPT-5
3. **Large Codebases (128K-400K tokens):** GPT-5.2, GPT-5, GPT-4.1
4. **Massive Projects (>400K tokens):** GPT-4.1 (1M context) or chunking strategy
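
The rules of thumb above reduce to a small selection function. A sketch, assuming the thresholds and model choices stated in this document (where a tier lists several models, the first is returned):

```typescript
// Model picker following the context-window rules of thumb above.
// Thresholds and choices are the ones from this document; adjust to taste.

function pickByContext(totalTokens: number): string {
  if (totalTokens < 10_000) return "any";       // small files: anything works
  if (totalTokens <= 128_000) return "gpt-4o";  // medium projects
  if (totalTokens <= 400_000) return "gpt-5.2"; // large codebases
  return "gpt-4.1";                             // 1M context, or fall back to chunking
}
```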

### Token Optimization Tips

- **Minify Code:** Remove comments/whitespace before sending (if not needed for analysis)
- **Selective Inclusion:** Only send relevant files
- **Hierarchical Summarization:** Summarize small chunks, then summarize the summaries
- **Caching:** OpenAI offers prompt caching; reuse common context across requests
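
Hierarchical summarization is a map-reduce over the text. A minimal sketch, where `summarize` stands in for any model call (e.g. GPT-5-mini via the OpenAI SDK) and the character-based chunking is a simplifying assumption (a real implementation would chunk by tokens):

```typescript
// Map-reduce sketch of hierarchical summarization: summarize fixed-size
// chunks, then summarize the concatenated chunk summaries.

function chunk(text: string, maxChars: number): string[] {
  const out: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) out.push(text.slice(i, i + maxChars));
  return out;
}

async function summarizeHierarchically(
  text: string,
  summarize: (t: string) => Promise<string>,
  maxChars = 8_000,
): Promise<string> {
  if (text.length <= maxChars) return summarize(text);
  const partials = await Promise.all(chunk(text, maxChars).map(summarize));
  const combined = partials.join("\n");
  // Guard: stop recursing once the reduce step no longer shrinks the input.
  if (combined.length >= text.length) return summarize(combined);
  return summarizeHierarchically(combined, summarize, maxChars);
}
```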

### Security Considerations

- **Sensitive Code:** Ensure compliance with data handling policies
- **GPT-5.2-Codex:** Specifically enhanced for cybersecurity use cases
- **Private Deployment:** Consider Azure OpenAI for enterprise security requirements

---

## API Capabilities Relevant to Code Analysis

### Supported Features

1. **Function Calling:**
   - All GPT-4 and GPT-5 models support structured function calls
   - Useful for integrating code analysis into tools

2. **JSON Mode:**
   - Guaranteed JSON responses
   - Perfect for structured code summaries

3. **Vision (Multimodal):**
   - GPT-4o and GPT-5 series support image inputs
   - Analyze architecture diagrams, UI screenshots, flowcharts

4. **Fine-tuning:**
   - Available for GPT-4o and GPT-4o-mini
   - Can specialize for specific code summarization formats

5. **Embeddings (Separate Pricing):**
   - text-embedding-3-large: $0.13/1M tokens
   - text-embedding-3-small: $0.02/1M tokens
   - Useful for semantic code search
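
JSON mode (item 2 above) pairs naturally with a fixed summary schema. A sketch of the Chat Completions request body: `response_format: { type: "json_object" }` is the documented JSON-mode switch, while the `SummaryRequest` type, `buildSummaryRequest` helper, and the `purpose`/`exports`/`risks` schema are illustrative, not an OpenAI-defined shape.

```typescript
// Request-body sketch for a structured code summary via JSON mode.
// Field names in the summary schema are illustrative assumptions.

interface SummaryRequest {
  model: string;
  response_format: { type: "json_object" };
  messages: { role: "system" | "user"; content: string }[];
}

function buildSummaryRequest(model: string, code: string): SummaryRequest {
  return {
    model,
    response_format: { type: "json_object" }, // guarantees parseable JSON output
    messages: [
      {
        role: "system",
        // JSON mode requires the prompt itself to mention JSON.
        content:
          'Summarize the code as JSON: {"purpose": string, "exports": string[], "risks": string[]}',
      },
      { role: "user", content: code },
    ],
  };
}
```

The body can then be passed to the SDK's chat-completions call and the response parsed with `JSON.parse` without regex extraction.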

### Developer Experience

- **Streaming:** All models support streaming responses
- **Retry Logic:** Built-in exponential backoff recommended
- **SDKs:** Official Python, Node.js, and community libraries
- **Monitoring:** Usage dashboard at platform.openai.com

---

## Pricing Trends & Future Outlook

### Recent Changes
- **80% price reduction** for the o3 reasoning model (2026)
- **Significant rate limit increases** (GPT-5 TPM: 30K → 500K)
- Introduction of ultra-low-cost nano variants

### Competitive Landscape
- OpenAI faces competition from:
  - Anthropic Claude (Opus 4.5, Sonnet 4.5)
  - Google Gemini 3 Pro
  - DeepSeek R1 (reasoning model)
- Price reductions driven by market competition

### What to Expect
- Continued price decreases as competition intensifies
- More specialized models for specific tasks
- Larger context windows becoming standard
- Enhanced agentic capabilities across the model lineup

---

## Quick Decision Matrix

### Choose GPT-4o-mini if:
- Budget is the primary constraint
- Tasks are straightforward (basic summarization)
- Processing millions of tokens daily
- Quality requirements are moderate

### Choose GPT-5-mini if:
- Need better quality than GPT-4o-mini
- Still cost-sensitive
- Want newer model capabilities
- Willing to pay ~3-4x more for better results

### Choose GPT-5.2 if:
- Need the best general-purpose performance
- Code summarization requires deep understanding
- Working with complex codebases
- Can justify ~10x cost vs mini models

### Choose GPT-5.2-Codex if:
- Performing agentic coding tasks
- Large refactors or migrations
- Security-critical code work
- Need multi-step reasoning with code

### Choose o3 if:
- Need deep logical reasoning about code
- Algorithm verification required
- Mathematical correctness matters
- Budget allows reasoning-model costs

### Choose GPT-4.1 if:
- Need the 1M-token context window
- Processing entire large repositories
- Context window matters more than the latest features

---

## Sample Cost Scenarios

### Scenario 1: Daily Codebase Documentation
- **Task:** Summarize 100 files/day, avg 500 tokens each = 50K tokens
- **Model:** GPT-5-mini
- **Cost:** (50K × $0.25/1M) + (50K × $2/1M) = $0.0125 + $0.10 = **$0.11/day**
- **Monthly:** ~$3.30/month

### Scenario 2: Weekly Architecture Analysis
- **Task:** Deep analysis of 1M tokens once weekly
- **Model:** GPT-5.2
- **Cost:** ($1.75 + $14.00) = **$15.75/week**
- **Monthly:** ~$63/month

### Scenario 3: Continuous Code Review Pipeline
- **Task:** 10M tokens/day, basic summaries
- **Model:** GPT-4o-mini (Batch API)
- **Cost:** (10M × $0.075/1M) + (10M × $0.30/1M) = $0.75 + $3.00 = **$3.75/day**
- **Monthly:** ~$112.50/month

### Scenario 4: Enterprise Refactoring Project
- **Task:** 500K tokens/day for 30 days (major migration)
- **Model:** GPT-5.2-Codex
- **Cost:** (500K × $1.25/1M) + (500K × $10/1M) = $0.625 + $5.00 = **$5.625/day**
- **Monthly:** ~$168.75/month

---

## Resources & Links

### Official Documentation
- [OpenAI API Pricing](https://openai.com/api/pricing/)
- [Platform Pricing Docs](https://platform.openai.com/docs/pricing)
- [Rate Limits Guide](https://platform.openai.com/docs/guides/rate-limits)
- [Models Documentation](https://platform.openai.com/docs/models/)

### Model Announcements
- [Introducing GPT-5](https://openai.com/index/introducing-gpt-5/)
- [Introducing GPT-5.2](https://openai.com/index/introducing-gpt-5-2/)
- [Introducing GPT-5.2-Codex](https://openai.com/index/introducing-gpt-5-2-codex/)
- [Introducing o3 and o4-mini](https://openai.com/index/introducing-o3-and-o4-mini/)

### Developer Resources
- [GPT-5.2 Prompting Guide](https://cookbook.openai.com/examples/gpt-5/gpt-5-2_prompting_guide)
- [Rate Limit Handling](https://cookbook.openai.com/examples/how_to_handle_rate_limits)
- [OpenAI Academy](https://academy.openai.com/)

### Third-Party Resources
- [Price Per Token Calculator](https://pricepertoken.com/pricing-page/provider/openai)
- [Finout OpenAI Pricing Guide](https://www.finout.io/blog/openai-pricing-in-2026)
- [Cost Calculator](https://costgoat.com/pricing/openai-api)

---

## Conclusion & Recommendations

For **mdcontext** code summarization use cases:

1. **Start with GPT-5-mini** for basic file summarization
   - Excellent cost/quality balance
   - ~$0.25-$2.00 per million tokens
   - Suitable for 80%+ of summarization tasks

2. **Upgrade to GPT-5.2** for complex analysis
   - When understanding architecture matters
   - When context relationships are critical
   - When quality justifies the ~7x cost increase

3. **Use the Batch API** whenever possible
   - 50% cost savings for non-urgent tasks
   - Perfect for overnight documentation generation

4. **Reserve GPT-5.2-Codex** for agentic tasks
   - Large-scale refactoring projects
   - Security-critical analysis
   - Multi-step code transformations

5. **Monitor costs closely**
   - Set up billing alerts
   - Track token usage per operation
   - Optimize prompt engineering to minimize tokens

6. **Consider a hybrid approach**
   - Use mini models for the initial pass
   - Use flagship models for complex files flagged by the mini models
   - Can reduce costs 60-80% vs. using the flagship for everything
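
The hybrid approach reduces to a routing step between the two passes. A sketch under stated assumptions: the idea that the mini pass reports a confidence score, and the names `MiniResult` and `routeToFlagship`, are illustrative, not part of mdcontext or the OpenAI API.

```typescript
// Two-pass routing sketch for the hybrid approach: a mini model does the
// first pass; files it was unsure about are re-run on the flagship model.
// The confidence-score heuristic is an assumption for illustration.

interface MiniResult {
  file: string;
  summary: string;
  confidence: number; // 0..1, reported by the mini pass
}

function routeToFlagship(results: MiniResult[], threshold = 0.7): string[] {
  return results.filter((r) => r.confidence < threshold).map((r) => r.file);
}
```

With most files resolved by the mini pass and only the flagged remainder escalated, total spend tracks the mini model's rate rather than the flagship's.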

---

**Last Updated:** January 26, 2026
**Next Review:** April 2026 (quarterly review recommended)