@dizzlkheinz/ynab-mcpb 0.16.0 → 0.17.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (114)
  1. package/.code/agents/0098661e-0fa3-4990-beb9-c0cbf3f123aa/status.txt +1 -0
  2. package/.code/agents/1324/exec-call_tIpx9uV1TpARbAMZonRQm8AO.txt +757 -0
  3. package/.code/agents/1572/exec-call_GjVFBFOWcY7lE0idc5nWlLNh.txt +781 -0
  4. package/.code/agents/1846/exec-call_1YNAVD18RjrMN7JnfkkQhUP3.txt +766 -0
  5. package/.code/agents/1846/exec-call_lh3lDzE4WJAh1lFiomiiZ73D.txt +766 -0
  6. package/.code/agents/2038/exec-call_DYwOukaYsL8VCONWmV2rUW5u.txt +766 -0
  7. package/.code/agents/2038/exec-call_c7fOQ7UrpVcTtvdfGBRM146V.txt +652 -0
  8. package/.code/agents/2038/exec-call_ySNyq9Mm55jWE480s54r5QcA.txt +766 -0
  9. package/.code/agents/2256/exec-call_AtPcRWPmFPMcmX6qOFm1fCEY.txt +766 -0
  10. package/.code/agents/2454/exec-call_aFJpupwjfZeOBm7ixI5Vc8z2.txt +766 -0
  11. package/.code/agents/2454/exec-call_wogZ4HfXTodTEXvdgXlVUBpv.txt +766 -0
  12. package/.code/agents/2e905864-aa07-4314-bcf9-c5b32277e4ac/result.txt +36 -0
  13. package/.code/agents/3073/exec-call_Peeagc9DxGYLgE6pNdMZhqIE.txt +766 -0
  14. package/.code/agents/3073/exec-call_d2YSE3hXF08KRSoUM3qd8Z3x.txt +766 -0
  15. package/.code/agents/335aa031-466d-4fb7-925f-3cd864e264d0/result.txt +191 -0
  16. package/.code/agents/3364/exec-call_NbhIrsM5HhyDZDmJZG5CuCYL.txt +766 -0
  17. package/.code/agents/3364/exec-call_cKtJg0NrXiwXEFwlsE3uPZRA.txt +766 -0
  18. package/.code/agents/36d98414-5cde-4d9d-9a67-a240a18c1f07/result.txt +189 -0
  19. package/.code/agents/4604e866-b7b8-44f5-992f-2f683b0a523b/status.txt +1 -0
  20. package/.code/agents/5f8dc01c-47b3-4163-b0b3-aa31be89fcdc/status.txt +1 -0
  21. package/.code/agents/7/exec-call_HltHpkDox0Zm1vGEjdksUgpE.txt +1120 -0
  22. package/.code/agents/7/exec-call_LCATrOPPAgbxW9Q1z0XaVi2E.txt +2646 -0
  23. package/.code/agents/7/exec-call_W8DeRfNG9hvbgVFvf0clBf6R.txt +2646 -0
  24. package/.code/agents/94a0ddf3-a304-4ec3-913e-3cceef509948/error.txt +1 -0
  25. package/.code/agents/e2c752b7-711d-423a-af57-f53c809deb84/result.txt +160 -0
  26. package/.code/agents/e6601719-c31f-4a0e-8c71-d70787d0ab71/status.txt +1 -0
  27. package/.code/agents/f250b7ed-5bd5-4036-aa8c-ce63caee7d61/result.txt +20 -0
  28. package/AGENTS.md +1 -36
  29. package/CLAUDE.md +131 -51
  30. package/NUL +0 -1
  31. package/README.md +27 -14
  32. package/dist/bundle/index.cjs +41 -41
  33. package/dist/server/YNABMCPServer.js +28 -381
  34. package/dist/server/config.d.ts +2 -0
  35. package/dist/server/config.js +1 -0
  36. package/dist/tools/accountTools.d.ts +2 -0
  37. package/dist/tools/accountTools.js +45 -0
  38. package/dist/tools/adapters.d.ts +12 -0
  39. package/dist/tools/adapters.js +25 -0
  40. package/dist/tools/budgetTools.d.ts +2 -0
  41. package/dist/tools/budgetTools.js +30 -0
  42. package/dist/tools/categoryTools.d.ts +2 -0
  43. package/dist/tools/categoryTools.js +45 -0
  44. package/dist/tools/monthTools.d.ts +2 -0
  45. package/dist/tools/monthTools.js +32 -0
  46. package/dist/tools/payeeTools.d.ts +2 -0
  47. package/dist/tools/payeeTools.js +32 -0
  48. package/dist/tools/reconciliation/index.d.ts +2 -0
  49. package/dist/tools/reconciliation/index.js +33 -0
  50. package/dist/tools/schemas/common.d.ts +3 -0
  51. package/dist/tools/schemas/common.js +3 -0
  52. package/dist/tools/schemas/outputs/comparisonOutputs.d.ts +1 -1
  53. package/dist/tools/transactionTools.d.ts +2 -0
  54. package/dist/tools/transactionTools.js +129 -0
  55. package/dist/tools/utilityTools.d.ts +3 -1
  56. package/dist/tools/utilityTools.js +32 -2
  57. package/dist/types/index.d.ts +1 -0
  58. package/dist/types/toolRegistration.d.ts +27 -0
  59. package/dist/types/toolRegistration.js +1 -0
  60. package/package.json +2 -2
  61. package/scripts/run-domain-integration-tests.js +4 -1
  62. package/src/__tests__/workflows.e2e.test.ts +1 -7
  63. package/src/server/YNABMCPServer.ts +33 -519
  64. package/src/server/__tests__/toolRegistration.test.ts +236 -0
  65. package/src/server/config.ts +1 -0
  66. package/src/tools/__tests__/adapters.test.ts +113 -0
  67. package/src/tools/__tests__/transactionTools.test.ts +90 -17
  68. package/src/tools/__tests__/utilityTools.test.ts +7 -7
  69. package/src/tools/accountTools.ts +53 -0
  70. package/src/tools/adapters.ts +74 -0
  71. package/src/tools/budgetTools.ts +37 -0
  72. package/src/tools/categoryTools.ts +53 -0
  73. package/src/tools/monthTools.ts +39 -0
  74. package/src/tools/payeeTools.ts +39 -0
  75. package/src/tools/reconciliation/index.ts +45 -0
  76. package/src/tools/schemas/common.ts +18 -0
  77. package/src/tools/transactionTools.ts +150 -0
  78. package/src/tools/utilityTools.ts +42 -2
  79. package/src/types/index.ts +3 -0
  80. package/src/types/toolRegistration.ts +88 -0
  81. package/.dxtignore +0 -57
  82. package/.github/workflows/pr-description-check.yml +0 -88
  83. package/CODEREVIEW_RESPONSE.md +0 -128
  84. package/SCHEMA_IMPROVEMENT_SUMMARY.md +0 -120
  85. package/TESTING_NOTES.md +0 -217
  86. package/accountactivity-merged.csv +0 -149
  87. package/bundle-analysis.html +0 -13110
  88. package/docs/README.md +0 -72
  89. package/docs/getting-started/CONFIGURATION.md +0 -175
  90. package/docs/getting-started/INSTALLATION.md +0 -333
  91. package/docs/getting-started/QUICKSTART.md +0 -282
  92. package/docs/guides/ARCHITECTURE.md +0 -533
  93. package/docs/guides/DEPLOYMENT.md +0 -189
  94. package/docs/guides/INTEGRATION_TESTING.md +0 -730
  95. package/docs/guides/TESTING.md +0 -591
  96. package/docs/plans/2025-11-20-reloadable-config-token-validation.md +0 -93
  97. package/docs/plans/2025-11-21-fix-transaction-cached-property.md +0 -362
  98. package/docs/plans/2025-11-21-reconciliation-error-handling.md +0 -90
  99. package/docs/plans/2025-11-21-v014-hardening.md +0 -153
  100. package/docs/plans/reconciliation-v2-redesign.md +0 -1571
  101. package/docs/reconciliation-flow.md +0 -83
  102. package/docs/reference/EXAMPLES.md +0 -946
  103. package/docs/reference/TOOLS.md +0 -348
  104. package/docs/reference/TROUBLESHOOTING.md +0 -481
  105. package/fix-types.sh +0 -17
  106. package/test-csv-sample.csv +0 -28
  107. package/test-exports/sample_bank_statement.csv +0 -7
  108. package/test-reconcile-autodetect.js +0 -40
  109. package/test-reconcile-tool.js +0 -152
  110. package/test-reconcile-with-csv.cjs +0 -89
  111. package/test-statement.csv +0 -8
  112. package/test_debug.js +0 -47
  113. package/test_mcp_tools.mjs +0 -75
  114. package/test_simple.mjs +0 -16
@@ -1,591 +0,0 @@
1
- # YNAB MCP Server - Testing Guide
2
-
3
- Comprehensive testing guide covering automated tests, manual test scenarios, and quality assurance processes.
4
-
5
- ## Table of Contents
6
-
7
- - [Overview](#overview)
8
- - [Test Structure](#test-structure)
9
- - [Running Tests](#running-tests)
10
- - [Environment Setup](#environment-setup)
11
- - [Test Types](#test-types)
12
- - [Coverage Requirements](#coverage-requirements)
13
- - [Manual Test Scenarios](#manual-test-scenarios)
14
- - [Test Data Management](#test-data-management)
15
- - [Common Issues and Solutions](#common-issues-and-solutions)
16
-
17
- ## Overview
18
-
19
- The YNAB MCP Server includes both automated and manual testing capabilities:
20
-
21
- **Automated Tests**:
22
- 1. **Unit Tests** - Test individual components in isolation
23
- 2. **Integration Tests** - Test component interactions with mocked dependencies
24
- 3. **End-to-End Tests** - Test complete workflows with real YNAB API (optional)
25
- 4. **Performance Tests** - Test response times, memory usage, and load handling
26
-
27
- **Manual Testing**:
28
- - Comprehensive test scenarios for Claude Desktop integration
29
- - Feature verification workflows
30
- - Performance and reliability validation
31
-
32
- ## Test Structure
33
-
34
- ```
35
- src/
36
- ├── __tests__/ # Global test utilities and E2E tests
37
- │ ├── setup.ts # Test environment setup
38
- │ ├── testUtils.ts # Shared test utilities
39
- │ ├── testRunner.ts # Comprehensive test runner
40
- │ ├── workflows.e2e.test.ts # End-to-end workflow tests
41
- │ ├── comprehensive.integration.test.ts # Integration tests
42
- │ └── performance.test.ts # Performance and load tests
43
- ├── server/__tests__/ # Server component tests
44
- ├── tools/__tests__/ # Tool-specific tests
45
- └── types/__tests__/ # Type definition tests
46
- ```
47
-
48
- ## Running Tests
49
-
50
- ### Quick Test Commands
51
-
52
- ```bash
53
- # Run all tests with coverage
54
- npm test
55
-
56
- # Run specific test types
57
- npm run test:unit # Unit tests only (fast, mocked)
58
- npm run test:integration # Integration tests (mocked API)
59
- npm run test:e2e # End-to-end tests (real API)
60
- npm run test:performance # Performance tests
61
-
62
- # Generate coverage report
63
- npm run test:coverage
64
-
65
- # Run comprehensive test suite with detailed reporting
66
- npm run test:comprehensive
67
-
68
- # Watch mode for test development
69
- npm run test:watch
70
- ```
71
-
72
- ## Environment Setup
73
-
74
- ### For Unit and Integration Tests (using mocks):
75
- ```bash
76
- # Optional - will use mock token if not provided
77
- YNAB_ACCESS_TOKEN=your_test_token
78
- ```
79
-
80
- ### For End-to-End Tests (using real YNAB API):
81
- ```bash
82
- # Required for E2E tests
83
- YNAB_ACCESS_TOKEN=your_real_ynab_personal_access_token
84
-
85
- # Optional - specify test budget/account IDs
86
- TEST_BUDGET_ID=your_test_budget_id
87
- TEST_ACCOUNT_ID=your_test_account_id
88
-
89
- # Optional - skip E2E tests
90
- SKIP_E2E_TESTS=true
91
- ```
92
-
93
- ## Test Types
94
-
95
- ### Unit Tests
96
- - Test individual functions and classes in isolation
97
- - Use mocked dependencies
98
- - Fast execution (< 10 seconds)
99
- - No external API calls
100
-
101
- ### Integration Tests
102
- - Test component interactions
103
- - Use mocked YNAB API responses
104
- - Validate complete tool workflows
105
- - Medium execution time (10-30 seconds)
106
-
107
- ### End-to-End Tests
108
- - Test against real YNAB API
109
- - Validate complete user workflows
110
- - Slower execution (30-60 seconds)
111
- - **Warning**: Creates real data in your test budget
112
-
113
- ### Performance Tests
114
- - Test response times and memory usage
115
- - Validate performance under load
116
- - Test error handling performance
117
- - Medium execution time (15-30 seconds)
118
-
119
- ## Coverage Requirements
120
-
121
- The test suite enforces minimum coverage thresholds:
122
-
123
- - **Lines**: 80%
124
- - **Functions**: 80%
125
- - **Branches**: 80%
126
- - **Statements**: 80%
127
-
128
- ---
129
-
130
- # Manual Test Scenarios
131
-
132
- Comprehensive test scenarios for manually validating the YNAB MCP server with Claude Desktop.
133
-
134
- ## 1. Setup Verification Tests
135
-
136
- ### 1.1 Server Startup and Connection
137
-
138
- **Objective**: Verify the server starts successfully and Claude Desktop connects.
139
-
140
- **Steps**:
141
- 1. Build the project: `npm run build`
142
- 2. Configure Claude Desktop with MCP server settings
143
- 3. Restart Claude Desktop
144
- 4. Check MCP servers list in Claude Desktop
145
-
146
- **Expected Results**:
147
- - Build completes without errors
148
- - Claude Desktop shows "ynab-mcp-server" in connected servers
149
- - No connection errors in Claude Desktop logs
150
-
151
- **Success Criteria**: Server appears as connected in Claude Desktop interface
152
-
153
- ### 1.2 YNAB Token Authentication
154
-
155
- **Objective**: Verify YNAB Personal Access Token is valid and working.
156
-
157
- **Steps**:
158
- 1. Ask Claude: "Can you run the diagnostic_info tool?"
159
- 2. Check the returned authentication status
160
- 3. Verify user information is retrieved
161
-
162
- **Expected Results**:
163
- - Diagnostic info returns successfully
164
- - Authentication status shows "authenticated: true"
165
- - User information includes YNAB user details
166
-
167
- **Success Criteria**: No authentication errors, user data present
168
-
169
- ### 1.3 System Status Verification
170
-
171
- **Objective**: Verify all server components are initialized properly.
172
-
173
- **Steps**:
174
- 1. Run diagnostic_info tool
175
- 2. Review system configuration
176
- 3. Check cache initialization
177
- 4. Verify environment variables
178
-
179
- **Expected Results**:
180
- - All services report healthy status
181
- - Cache is initialized with correct settings
182
- - Environment variables loaded properly
183
- - Tool registry shows all tools
184
-
185
- **Success Criteria**: All system components report healthy status
186
-
187
- ## 2. Basic Functionality Tests
188
-
189
- ### 2.1 Budget Management
190
-
191
- **Objective**: Test basic budget listing and selection functionality.
192
-
193
- **Steps**:
194
- 1. Ask Claude: "List my YNAB budgets"
195
- 2. Note the budget names returned
196
- 3. Ask Claude: "Set my default budget to [budget_name]"
197
- 4. Ask Claude: "What is my current default budget?"
198
-
199
- **Expected Results**:
200
- - Budget list returns user's budgets with names and IDs
201
- - Default budget is set successfully
202
- - Cache warming is triggered automatically
203
- - Default budget query returns the selected budget
204
-
205
- **Success Criteria**: Budget operations work without errors, cache warming occurs
206
-
207
- ### 2.2 Account Listing
208
-
209
- **Objective**: Test account retrieval and caching behavior.
210
-
211
- **Steps**:
212
- 1. Ask Claude: "List my accounts" (first time)
213
- 2. Note response time
214
- 3. Ask Claude: "List my accounts" (second time)
215
- 4. Compare response times
216
- 5. Check diagnostic_info for cache hits
217
-
218
- **Expected Results**:
219
- - First request fetches from YNAB API
220
- - Second request is faster (cache hit)
221
- - Both requests return identical account data
222
- - Cache metrics show hit count increase
223
-
224
- **Success Criteria**: Caching improves response time, data consistency maintained
225
-
226
- ### 2.3 Transaction Retrieval
227
-
228
- **Objective**: Test transaction listing with various filters.
229
-
230
- **Steps**:
231
- 1. Ask Claude: "Show me recent transactions"
232
- 2. Ask Claude: "Show me transactions from a specific account"
233
- 3. Ask Claude: "Show me transactions from the last 30 days"
234
- 4. Ask Claude: "Show me uncategorized transactions"
235
-
236
- **Expected Results**:
237
- - All transaction queries return appropriate data
238
- - Filters work correctly (account, date range, categorization status)
239
- - Response times are reasonable
240
- - Data format is consistent
241
-
242
- **Success Criteria**: All transaction filters work correctly, consistent formatting
243
-
244
- ## 3. Enhanced Caching Tests
245
-
246
- ### 3.1 Cache Warming Verification
247
-
248
- **Objective**: Verify cache warming works after setting default budget.
249
-
250
- **Steps**:
251
- 1. Clear cache (restart server or use diagnostic tools)
252
- 2. Set default budget
253
- 3. Check cache metrics immediately after
254
- 4. Verify accounts, categories, and payees are cached
255
-
256
- **Expected Results**:
257
- - Cache warming triggers automatically
258
- - Accounts, categories, and payees are pre-loaded
259
- - Subsequent requests for these data types are fast
260
- - Cache hit rate improves dramatically
261
-
262
- **Success Criteria**: Cache warming pre-loads commonly used data
263
-
264
- ### 3.2 LRU Eviction Testing
265
-
266
- **Objective**: Test cache eviction when limits are reached.
267
-
268
- **Steps**:
269
- 1. Set cache limit to low value (via environment variables)
270
- 2. Request data for multiple different filters
271
- 3. Check cache metrics for evictions
272
- 4. Verify least recently used items are evicted first
273
-
274
- **Expected Results**:
275
- - Cache respects maximum entry limits
276
- - Older entries are evicted as new ones are added
277
- - Most recently accessed data remains cached
278
- - No memory leaks occur
279
-
280
- **Success Criteria**: LRU eviction maintains cache within limits
281
-
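A minimal sketch of the LRU behaviour this scenario checks, assuming a cache built on `Map` insertion order. `LruCache`, `maxEntries`, and the `evictions` counter are hypothetical names used for illustration; the server's actual cache and its metrics may look different.

```typescript
// Hypothetical LRU cache; not the server's real implementation.
class LruCache<K, V> {
  private entries = new Map<K, V>();
  private maxEntries: number;
  evictions = 0;

  constructor(maxEntries: number) {
    if (maxEntries < 1) throw new RangeError("maxEntries must be >= 1");
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    // Deleting first moves an existing key to the "most recent" end.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
      this.evictions += 1;
    }
  }

  has(key: K): boolean {
    return this.entries.has(key);
  }

  get size(): number {
    return this.entries.size;
  }
}

const lru = new LruCache<string, string[]>(2);
lru.set("accounts", ["checking"]);
lru.set("payees", ["grocer"]);
lru.get("accounts"); // touch "accounts" so "payees" becomes the oldest entry
lru.set("categories", ["food"]); // exceeds the limit, so "payees" is evicted
```

With a limit of two entries, reading `accounts` before adding `categories` is what keeps it cached; recency of access, not frequency, decides which entry is dropped.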
282
- ### 3.3 Stale-While-Revalidate Testing
283
-
284
- **Objective**: Test stale data serving while refreshing in background.
285
-
286
- **Steps**:
287
- 1. Cache some data and wait for it to become stale
288
- 2. Request the stale data
289
- 3. Verify immediate response with stale data
290
- 4. Confirm background refresh occurs
291
-
292
- **Expected Results**:
293
- - Stale data is served immediately for fast response
294
- - Background refresh updates the cache
295
- - User gets immediate response, cache stays fresh
296
- - No blocking on refresh operations
297
-
298
- **Success Criteria**: Stale-while-revalidate provides fast responses while maintaining freshness
299
-
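The stale-while-revalidate behaviour described above can be sketched as follows. This is an illustration under stated assumptions: `SwrCache`, `ttlMs`, and the injectable `now` clock are hypothetical names, and the real cache may track staleness differently.

```typescript
// Hypothetical stale-while-revalidate cache; not the server's real implementation.
type Fetcher<V> = () => Promise<V>;

interface Entry<V> {
  value: V;
  fetchedAt: number;
}

class SwrCache<V> {
  private entries = new Map<string, Entry<V>>();
  private refreshing = new Set<string>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(key: string, fetcher: Fetcher<V>): Promise<V> {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      // Cold miss: nothing to serve yet, so the caller has to wait.
      const value = await fetcher();
      this.entries.set(key, { value, fetchedAt: this.now() });
      return value;
    }
    if (this.now() - entry.fetchedAt > this.ttlMs) {
      // Stale: start a refresh but do not await it.
      this.refreshInBackground(key, fetcher);
    }
    return entry.value;
  }

  private refreshInBackground(key: string, fetcher: Fetcher<V>): void {
    if (this.refreshing.has(key)) return; // at most one refresh per key at a time
    this.refreshing.add(key);
    fetcher()
      .then((value) => {
        this.entries.set(key, { value, fetchedAt: this.now() });
      })
      .catch(() => {
        // Keep serving the stale value if the refresh fails.
      })
      .then(() => {
        this.refreshing.delete(key);
      });
  }
}
```

Injecting the clock lets a test make an entry stale without waiting for real time to pass, which is one way to automate step 1 of this scenario.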
300
- ## 4. Tool Registry Tests
301
-
302
- ### 4.1 All Tools Accessibility
303
-
304
- **Objective**: Verify all tools are accessible through Claude Desktop.
305
-
306
- **Steps**:
307
- 1. Ask Claude to list available YNAB tools
308
- 2. Test a selection of tools from different categories:
309
- - Budget management (list_budgets, set_default_budget)
310
- - Account management (list_accounts, get_account)
311
- - Transaction management (list_transactions, create_transaction)
312
- - Monthly data analysis (get_month, list_months)
313
- - Utility tools (diagnostic_info, convert_amount)
314
-
315
- **Expected Results**:
316
- - All tools are accessible and respond correctly
317
- - Parameter validation works consistently
318
- - Error handling is uniform across tools
319
- - Tool descriptions are helpful and accurate
320
-
321
- **Success Criteria**: All tools accessible with consistent behavior
322
-
323
- ### 4.2 Parameter Validation
324
-
325
- **Objective**: Test parameter validation across different tools.
326
-
327
- **Steps**:
328
- 1. Try tools with missing required parameters
329
- 2. Try tools with invalid parameter values
330
- 3. Try tools with correct parameters
331
- 4. Test optional parameter handling
332
-
333
- **Expected Results**:
334
- - Missing required parameters result in clear error messages
335
- - Invalid parameters are rejected with helpful guidance
336
- - Valid parameters are processed correctly
337
- - Optional parameters work when provided or omitted
338
-
339
- **Success Criteria**: Consistent parameter validation with helpful error messages
340
-
341
- ## 5. Transaction Management Tests
342
-
343
- ### 5.1 Transaction Creation
344
-
345
- **Objective**: Test creating new transactions.
346
-
347
- **Steps**:
348
- 1. Ask Claude: "Create a test transaction for $10.00 groceries"
349
- 2. Verify transaction appears in YNAB
350
- 3. Check transaction details
351
- 4. Clean up test transaction
352
-
353
- **Expected Results**:
354
- - Transaction is created successfully
355
- - All details are recorded correctly
356
- - Transaction appears in YNAB interface
357
- - Appropriate account and category are used
358
-
359
- **Success Criteria**: Transaction creation works with accurate data
360
-
361
- ### 5.2 Transaction Export
362
-
363
- **Objective**: Test transaction export functionality.
364
-
365
- **Steps**:
366
- 1. Ask Claude: "Export my transactions to a file"
367
- 2. Check export directory for file
368
- 3. Review exported data format
369
- 4. Verify data completeness
370
-
371
- **Expected Results**:
372
- - Export file is created in correct directory
373
- - File contains accurate transaction data
374
- - Format is readable and well-structured
375
- - All requested transactions are included
376
-
377
- **Success Criteria**: Export creates complete, accurate files
378
-
379
- ### 5.3 CSV Comparison
380
-
381
- **Objective**: Test CSV comparison functionality.
382
-
383
- **Steps**:
384
- 1. Use a sample CSV file
385
- 2. Ask Claude: "Compare this CSV with my YNAB transactions"
386
- 3. Review matching results
387
- 4. Check unmatched transaction identification
388
-
389
- **Expected Results**:
390
- - CSV parsing works correctly
391
- - Transaction matching algorithms function properly
392
- - Unmatched transactions are identified
393
- - Clear reporting of comparison results
394
-
395
- **Success Criteria**: CSV comparison accurately identifies matches and discrepancies
396
-
397
- ### 5.4 Account Reconciliation
398
-
399
- **Objective**: Test comprehensive account reconciliation.
400
-
401
- **Steps**:
402
- 1. Prepare CSV export from your bank
403
- 2. Ask Claude: "Reconcile my checking account with this CSV"
404
- 3. Review matching analysis
405
- 4. Check balance verification
406
- 5. Review recommendations
407
-
408
- **Expected Results**:
409
- - Smart duplicate matching works correctly
410
- - Automatic date adjustment handles timezone issues
411
- - Balance matching provides exact reconciliation
412
- - Comprehensive reporting shows all details
413
-
414
- **Success Criteria**: Reconciliation accurately matches transactions and balances
415
-
416
- ## 6. Error Handling Tests
417
-
418
- ### 6.1 Missing Budget ID Scenarios
419
-
420
- **Objective**: Test behavior when no default budget is set.
421
-
422
- **Steps**:
423
- 1. Clear default budget setting
424
- 2. Try tools that require budget context
425
- 3. Check error messages
426
- 4. Verify recovery guidance
427
-
428
- **Expected Results**:
429
- - Clear error messages about missing budget
430
- - Helpful guidance on setting default budget
431
- - No system crashes or unclear errors
432
- - Easy recovery path provided
433
-
434
- **Success Criteria**: Clear error messages with actionable recovery steps
435
-
436
- ### 6.2 Invalid Parameter Testing
437
-
438
- **Objective**: Test handling of invalid parameters.
439
-
440
- **Steps**:
441
- 1. Try tools with malformed parameters
442
- 2. Test with out-of-range values
443
- 3. Try with incorrect data types
444
- 4. Test with missing required fields
445
-
446
- **Expected Results**:
447
- - Validation catches all invalid parameters
448
- - Error messages clearly identify the problem
449
- - Suggestions for correct parameter format
450
- - No system instability from bad inputs
451
-
452
- **Success Criteria**: Robust parameter validation with helpful error messages
453
-
454
- ### 6.3 YNAB API Error Scenarios
455
-
456
- **Objective**: Test handling of YNAB API errors.
457
-
458
- **Steps**:
459
- 1. Test with expired token (if possible)
460
- 2. Test during YNAB API maintenance
461
- 3. Test with network connectivity issues
462
- 4. Test with rate limiting scenarios
463
-
464
- **Expected Results**:
465
- - Graceful handling of API errors
466
- - Clear error messages about external issues
467
- - No server crashes or hangs
468
- - Appropriate retry mechanisms
469
-
470
- **Success Criteria**: Robust error handling for external API issues
471
-
472
- ## 7. Performance Tests
473
-
474
- ### 7.1 Response Time Verification
475
-
476
- **Objective**: Verify acceptable response times with and without caching.
477
-
478
- **Steps**:
479
- 1. Measure response times for fresh requests
480
- 2. Measure response times for cached requests
481
- 3. Compare performance improvements
482
- 4. Test with large data sets
483
-
484
- **Expected Results**:
485
- - Fresh requests complete within reasonable time (< 5 seconds)
486
- - Cached requests are significantly faster (< 1 second)
487
- - Large data sets are handled efficiently
488
- - No performance degradation over time
489
-
490
- **Success Criteria**: Response times meet performance expectations
491
-
492
- ### 7.2 Concurrent Request Handling
493
-
494
- **Objective**: Test server behavior under concurrent load.
495
-
496
- **Steps**:
497
- 1. Make multiple simultaneous requests
498
- 2. Check for race conditions
499
- 3. Verify data consistency
500
- 4. Monitor resource usage
501
-
502
- **Expected Results**:
503
- - Concurrent requests handled properly
504
- - No race conditions in cache or data
505
- - Consistent results across all requests
506
- - Reasonable resource usage
507
-
508
- **Success Criteria**: Stable performance under concurrent load
509
-
510
- ### 7.3 Memory Usage Monitoring
511
-
512
- **Objective**: Verify memory usage remains stable during extended use.
513
-
514
- **Steps**:
515
- 1. Monitor baseline memory usage
516
- 2. Perform extended testing session
517
- 3. Check for memory leaks
518
- 4. Verify cache size limits are respected
519
-
520
- **Expected Results**:
521
- - Memory usage remains stable over time
522
- - No significant memory leaks detected
523
- - Cache eviction prevents unbounded growth
524
- - Resource usage stays within acceptable limits
525
-
526
- **Success Criteria**: Stable memory usage without leaks
527
-
528
- ## Test Data Management
529
-
530
- ### Mock Data
531
- - Unit and integration tests use mock data
532
- - Mock responses are defined in test files
533
- - No real API calls or data modification
534
-
535
- ### E2E Test Data
536
- - E2E tests create real data in your YNAB budget
537
- - Test transactions are automatically cleaned up
538
- - Test accounts cannot be deleted via API (manual cleanup required)
539
- - Use a dedicated test budget to avoid affecting real data
540
-
541
- ## Common Issues and Solutions
542
-
543
- ### E2E Tests Skipped
544
- ```
545
- ⏭️ E2E tests skipped (no API key or SKIP_E2E_TESTS=true)
546
- ```
547
- **Solution**: Set the `YNAB_ACCESS_TOKEN` environment variable and make sure `SKIP_E2E_TESTS` is not set to `true`
548
-
549
- ### Coverage Below Threshold
550
- ```
551
- ⚠️ Coverage below target (<80%)
552
- ```
553
- **Solution**: Add more tests or exclude untestable code from coverage measurement
554
-
555
- ### Performance Tests Failing
556
- ```
557
- ❌ Performance assertion failed: 1500ms > 1000ms
558
- ```
559
- **Solution**: Optimize code or adjust performance thresholds
560
-
561
- ### Connection Errors in Claude Desktop
562
- **Solution**:
563
- 1. Verify Node.js version (18+)
564
- 2. Check build completed successfully
565
- 3. Verify MCP server configuration
566
- 4. Restart Claude Desktop completely
567
-
568
- ## Test Execution Guidelines
569
-
570
- 1. **Prerequisites**: Ensure .env file is configured with valid YNAB token
571
- 2. **Environment**: Use development configuration for detailed logging
572
- 3. **Documentation**: Record results for each test scenario
573
- 4. **Issues**: Log any problems with steps to reproduce
574
- 5. **Performance**: Record timing measurements for performance tests
575
- 6. **Cleanup**: Clean up test data after testing (transactions, exports)
576
-
577
- ## Success Criteria Summary
578
-
579
- - ✅ All basic functionality works correctly
580
- - ✅ Enhanced caching provides performance improvements
581
- - ✅ Error handling is robust and helpful
582
- - ✅ All tools are accessible and functional
583
- - ✅ Transaction management works reliably
584
- - ✅ Performance meets or exceeds expectations
585
- - ✅ Integration with Claude Desktop is seamless
586
- - ✅ Security and reliability standards are met
587
-
588
- ---
589
-
590
- For automated testing implementation details, see the source code in `src/__tests__/`.
591
- For additional testing checklists, see `../development/TESTING_CHECKLIST.md`.
@@ -1,93 +0,0 @@
1
- # Reloadable Config & Token Validation Implementation Plan
2
-
3
- > **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
4
-
5
- **Goal:** Make config parsing reloadable for env-mutation tests/CI, harden token validation against malformed responses, and confirm integration runs with a valid YNAB token.
6
-
7
- **Architecture:** Parse env vars on-demand via `loadConfig()` (with a backward-compatible `config` singleton), inject per-server config instances instead of module globals, and wrap YNAB token validation failures (including non-JSON responses) into `AuthenticationError` with clear messaging.
8
-
9
- **Tech Stack:** Node + TypeScript, Zod, dotenv, Vitest, YNAB SDK, esbuild.
10
-
11
- ### Task 1: Reloadable config loader
12
-
13
- **Files:**
14
- - Modify: `src/server/config.ts`
15
- - Modify: `src/server/__tests__/config.test.ts`
16
-
17
- **Step 1: Write failing test**
18
- Add a test that calls `loadConfig()` twice after mutating `process.env.YNAB_ACCESS_TOKEN` (without re-importing the module) and expects the second call to return the updated token.
19
-
20
- **Step 2: Run test to verify failure**
21
- Run `npx vitest run src/server/__tests__/config.test.ts` and confirm the new test fails because the loader is still tied to the environment as it was at import time.
22
-
23
- **Step 3: Implement reloadable loader**
24
- Keep the Zod schema and the explicit `config` singleton, but ensure `loadConfig()` re-parses `process.env` on every call (optionally allowing an env override for tests) and throws `ValidationError` on failure. Keep `import 'dotenv/config'` so `.env` is loaded for Node execution.
25
-
26
- **Step 4: Re-run targeted test**
27
- Re-run `npx vitest run src/server/__tests__/config.test.ts` to confirm the reloadable behavior passes.
28
-
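A minimal sketch of the reloadable loader from Task 1. The plan calls for a Zod schema; this sketch swaps in a hand-written check so it stays dependency-free. `AppConfig`, the `Env` type, the error message, and this exact `loadConfig` signature are assumptions for illustration, not the package's real exports.

```typescript
// Hypothetical names throughout; the real loader validates with Zod.
class ValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ValidationError";
  }
}

interface AppConfig {
  YNAB_ACCESS_TOKEN: string;
}

type Env = Record<string, string | undefined>;

// Reads the environment on every call, so later mutations are visible.
// The optional `env` argument lets tests pass a plain object instead.
function loadConfig(env: Env = process.env): AppConfig {
  const token = env.YNAB_ACCESS_TOKEN?.trim();
  if (!token) {
    throw new ValidationError("YNAB_ACCESS_TOKEN is required");
  }
  return { YNAB_ACCESS_TOKEN: token };
}

// Two calls with different environments yield different configs,
// without re-importing the module.
const firstConfig = loadConfig({ YNAB_ACCESS_TOKEN: "token-a" });
const secondConfig = loadConfig({ YNAB_ACCESS_TOKEN: "token-b" });
```

Because nothing is cached at module scope, a test can mutate `process.env.YNAB_ACCESS_TOKEN` (or pass an override object) and see the change on the next call.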
29
- ### Task 2: Inject per-instance config into YNABMCPServer
30
-
31
- **Files:**
32
- - Modify: `src/server/YNABMCPServer.ts`
33
- - Modify: `src/server/__tests__/YNABMCPServer.test.ts`
34
- - Modify: `src/server/__tests__/server-startup.integration.test.ts`
35
-
36
- **Step 1: Add/adjust tests**
37
- Add coverage that changing `process.env.YNAB_ACCESS_TOKEN` before constructing a new `YNABMCPServer` produces a server wired to the new token (no module cache reset), and update expectations to align with `ValidationError` from `loadConfig()` where appropriate.
38
-
39
- **Step 2: Run tests to see failures**
40
- Run `npx vitest run src/server/__tests__/YNABMCPServer.test.ts src/server/__tests__/server-startup.integration.test.ts`.
41
-
42
- **Step 3: Apply code changes**
43
- Ensure the constructor stores `const configInstance = loadConfig()` and uses it for YNAB API creation, token validation, and tool execution auth; remove any lingering usage of the `config` singleton for runtime behavior.
44
-
45
- **Step 4: Re-run the affected tests**
46
- Re-run the same Vitest targets to verify per-instance config wiring passes.
47
-
48
- ### Task 3: Token validation resilience
49
-
50
- **Files:**
51
- - Modify: `src/server/YNABMCPServer.ts`
52
- - Modify: `src/server/__tests__/server-startup.integration.test.ts`
53
-
54
- **Step 1: Write failing test**
55
- Mock `ynab.API().user.getUser` to reject with a `SyntaxError`/HTML-shaped error and expect `validateToken()` to reject with `AuthenticationError("Unexpected response from YNAB during token validation")` instead of surfacing the raw syntax failure.
56
-
57
- **Step 2: Run test to confirm failure**
58
- Run `npx vitest run src/server/__tests__/server-startup.integration.test.ts`.
59
-
60
- **Step 3: Implement graceful handling**
61
- Wrap token validation to catch non-JSON/SyntaxError cases (or responses lacking expected shape) and throw `AuthenticationError` with the clear message while preserving existing 401/403 mapping.
62
-
63
- **Step 4: Re-run validation tests**
64
- Re-run the targeted integration test to ensure the new mapping passes.
65
-
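The mapping described in Task 3 might look like the sketch below. It is an illustration only: `AuthenticationError` is redefined locally, the injected `getUser` callback stands in for the SDK call, and the `{ data: { user: { id } } }` shape check is an assumption about what a well-formed user response contains. Errors other than `SyntaxError` are rethrown untouched so that any existing 401/403 handling is left alone.

```typescript
// Hypothetical sketch; the server's real error class and SDK wiring may differ.
class AuthenticationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "AuthenticationError";
  }
}

const UNEXPECTED_RESPONSE = "Unexpected response from YNAB during token validation";

// Returns true only when the value looks like { data: { user: { id: string } } }.
function hasUserShape(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const data = (value as { data?: unknown }).data;
  if (typeof data !== "object" || data === null) return false;
  const user = (data as { user?: unknown }).user;
  if (typeof user !== "object" || user === null) return false;
  return typeof (user as { id?: unknown }).id === "string";
}

async function validateToken(getUser: () => Promise<unknown>): Promise<void> {
  let response: unknown;
  try {
    response = await getUser();
  } catch (error) {
    // A non-JSON body (for example an HTML error page) typically surfaces as a SyntaxError.
    if (error instanceof SyntaxError) {
      throw new AuthenticationError(UNEXPECTED_RESPONSE);
    }
    throw error; // leave every other failure to the existing handling
  }
  if (!hasUserShape(response)) {
    throw new AuthenticationError(UNEXPECTED_RESPONSE);
  }
}
```

Passing the user lookup in as a callback keeps the sketch testable without mocking the SDK module; a test can hand in a function that rejects with `new SyntaxError(...)` and assert on the resulting error type and message.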
66
- ### Task 4: Test alignment & runner portability
67
-
68
- **Files:**
69
- - Modify: `src/server/__tests__/config.test.ts`
70
- - Modify: `scripts/run-throttled-integration-tests.js`
71
-
72
- **Step 1: Align config test patterns**
73
- Update any assertions relying on module-level parsing side effects to use `vi.resetModules()` + `loadConfig()` explicitly for reload checks; keep singleton expectations where intentional.
74
-
75
- **Step 2: Harden integration runner on Windows**
76
- Change the throttled runner to spawn Vitest via a platform-portable path (e.g., `node` + resolved `vitest` bin) to avoid `spawn EINVAL` with `.cmd` on Windows.
77
-
78
- **Step 3: Run quick smoke**
79
- Run `node scripts/run-throttled-integration-tests.js --help`, or run a single test file through the wrapper, to ensure it executes without path errors.
80
-
81
- ### Task 5: Full verification
82
-
83
- **Files/Commands:**
84
- - Commands: `npm test`, `npm run test:integration:core` (with `YNAB_ACCESS_TOKEN` set), optionally `npm run test:integration:domain`.
85
-
86
- **Step 1: Run unit suite**
87
- Execute `npm test` to ensure unit coverage stays green.
88
-
89
- **Step 2: Run core integrations with real token**
90
- Execute `npm run test:integration:core` using a known-good `YNAB_ACCESS_TOKEN`; capture any regressions.
91
-
92
- **Step 3: Optional extended coverage**
93
- If time permits, run `npm run test:integration:domain` for broader confidence; note any skips or rate-limit impacts.