evalvault 1.76.0__tar.gz → 1.77.0__tar.gz
This diff compares the contents of two publicly released package versions as they appear in their public registry. It is provided for informational purposes only.
- {evalvault-1.76.0 → evalvault-1.77.0}/.env.offline.example +5 -3
- evalvault-1.77.0/.env.offline.ollama.example +59 -0
- evalvault-1.77.0/.env.offline.vllm.example +83 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.gitignore +6 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/CLAUDE.md +54 -4
- {evalvault-1.76.0 → evalvault-1.77.0}/Dockerfile +3 -3
- {evalvault-1.76.0 → evalvault-1.77.0}/PKG-INFO +2 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/config/models.yaml +12 -0
- evalvault-1.77.0/docker-compose.offline.build.yml +28 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docker-compose.offline.yml +18 -15
- evalvault-1.77.0/docs/guides/DI_EXTRACTION_NEXT_STEPS.md +162 -0
- evalvault-1.77.0/docs/guides/OFFLINE_BUNDLE_WORK_IN_PROGRESS.md +292 -0
- evalvault-1.77.0/docs/guides/OFFLINE_DOCKER.md +385 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/OFFLINE_MODELS.md +5 -0
- evalvault-1.77.0/docs/guides/OFFLINE_VLLM_BUNDLE.md +118 -0
- evalvault-1.77.0/docs/guides/evalvault_presentation_case_study_ko.md +374 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/00_overview.md +22 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/04_operations.md +16 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/05_security.md +9 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/06_quality_and_testing.md +6 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/08_roadmap.md +21 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/EXTERNAL.md +8 -2
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/package-lock.json +144 -20
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/package.json +2 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/EvaluationStudio.tsx +35 -18
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/Settings.tsx +6 -9
- {evalvault-1.76.0 → evalvault-1.77.0}/pyproject.toml +2 -1
- evalvault-1.77.0/scripts/offline/build_full_offline_bundle.sh +201 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/bundle_datasets.sh +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/bundle_model_cache.sh +9 -1
- evalvault-1.77.0/scripts/offline/bundle_ollama_models.sh +43 -0
- evalvault-1.77.0/scripts/offline/bundle_vllm_models.sh +36 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/export_base_images.sh +1 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/export_images.sh +15 -6
- evalvault-1.77.0/scripts/offline/import_images.sh +32 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/load_base_images.sh +5 -3
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/restore_datasets.sh +0 -0
- evalvault-1.77.0/scripts/offline/restore_model_cache.sh +35 -0
- evalvault-1.77.0/scripts/offline/restore_ollama_models.sh +38 -0
- evalvault-1.77.0/scripts/offline/restore_vllm_models.sh +31 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/smoke_test.sh +2 -2
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/adapter.py +24 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/main.py +2 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/app.py +3 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/analyze.py +6 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/method.py +1 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/run.py +9 -4
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/run_helpers.py +18 -16
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/llm_report_module.py +515 -33
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/phoenix/sync_service.py +1 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/report/markdown_adapter.py +92 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/factory.py +1 -4
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracker/phoenix_adapter.py +25 -8
- evalvault-1.77.0/src/evalvault/config/runtime_services.py +122 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/README.md +2 -0
- evalvault-1.77.0/tests/fixtures/e2e/ab_test_baseline_korean.json +200 -0
- evalvault-1.77.0/tests/fixtures/e2e/ab_test_candidate_korean.json +201 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_mlflow_tracker.py +38 -36
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_phoenix_adapter.py +1 -1
- {evalvault-1.76.0 → evalvault-1.77.0}/uv.lock +373 -1
- evalvault-1.76.0/docs/guides/OFFLINE_DOCKER.md +0 -175
- evalvault-1.76.0/scripts/offline/build_full_offline_bundle.sh +0 -64
- evalvault-1.76.0/scripts/offline/import_images.sh +0 -16
- evalvault-1.76.0/scripts/offline/restore_model_cache.sh +0 -17
- {evalvault-1.76.0 → evalvault-1.77.0}/.dockerignore +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.env.example +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.github/workflows/ci.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.github/workflows/regression-gate.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.github/workflows/release.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.github/workflows/stale.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.pre-commit-config.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/.python-version +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/AGENTS.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/CHANGELOG.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/CODE_OF_CONDUCT.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/CONTRIBUTING.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/LICENSE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/README.en.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/SECURITY.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/agent.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/client.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/config.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/main.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/memory/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/memory/shared/decisions.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/memory/shared/dependencies.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/memory/templates/coordinator_guide.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/memory/templates/work_log_template.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/memory_integration.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/progress.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/app_spec.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/baseline.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/coding_prompt.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/existing_project_prompt.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/improvement/architecture_prompt.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/improvement/base_prompt.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/improvement/coordinator_prompt.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/improvement/observability_prompt.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/initializer_prompt.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/prompt_manifest.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts/system.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/prompts.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/requirements.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/agent/security.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/domains/insurance/memory.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/domains/insurance/terms_dictionary_en.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/domains/insurance/terms_dictionary_ko.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/methods.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/ragas_prompts_override.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/regressions/ci.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/regressions/default.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/regressions/ux.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/stage_metric_playbook.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/config/stage_metric_thresholds.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/dummy_test_dataset.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/insurance_qa_korean.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/insurance_qa_korean.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/insurance_qa_korean_2.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/insurance_qa_korean_3.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/ragas_ko90_en10.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/sample.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/visualization_20q_cluster_map.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/visualization_20q_korean.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/visualization_2q_cluster_map.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/datasets/visualization_2q_korean.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/kg/knowledge_graph.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/data/rag/user_guide_bm25.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/dataset_templates/dataset_template.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/dataset_templates/dataset_template.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/dataset_templates/dataset_template.xlsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/dataset_templates/method_input_template.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docker-compose.langfuse.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docker-compose.offline.modelcache.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docker-compose.phoenix.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docker-compose.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/INDEX.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/README.ko.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/ROADMAP.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/STATUS.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/adapters/inbound.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/adapters/outbound.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/config.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/domain/entities.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/domain/metrics.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/domain/services.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/ports/inbound.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/api/ports/outbound.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/architecture/open-rag-trace-collector.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/architecture/open-rag-trace-spec.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/getting-started/INSTALLATION.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/AGENTS_SYSTEM_GUIDE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/CHAINLIT_INTEGRATION_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/CI_REGRESSION_GATE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/CLI_MCP_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/CLI_PARALLEL_FEATURES_SPEC.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/CLI_UX_REDESIGN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/DEV_GUIDE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/DOCS_REFRESH_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/EVALVAULT_DIAGNOSTIC_PLAYBOOK.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/EVALVAULT_RUN_EXCEL_SHEETS.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/EVALVAULT_WORK_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/EXPERIMENT_TRACKING_STACK.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/EXTERNAL_TRACE_API_SPEC.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/Extension_2.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/Extension_Data_Difficulty_Profiling_Custom_Judge_Model.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/INSURANCE_SUMMARY_METRICS_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/LENA_MVP_IMPLEMENTATION_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/LENA_RAGAS_CALIBRATION_DEV_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/MULTITURN_EVAL_GUIDE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/NEXT_STEPS_EXECUTION_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/OPEN_RAG_TRACE_INTERNAL_ADAPTER.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/OPEN_RAG_TRACE_SAMPLES.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/P0_P3_EXECUTION_REPORT.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/P1_P4_WORK_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/PARALLEL_WORK_APPROVAL_RULES.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/PRD_LENA.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/PROJECT_STATUS_AND_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/RAGAS_HUMAN_FEEDBACK_CALIBRATION_GUIDE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/RAG_CLI_WORKFLOW_TEMPLATES.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/RAG_NOISE_REDUCTION_GUIDE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/RAG_PERFORMANCE_IMPLEMENTATION_LOG.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/RAG_PERFORMANCE_IMPROVEMENT_PROPOSAL.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/RAG_PGVECTOR_PREINDEX_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/RELEASE_CHECKLIST.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/USER_GUIDE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/WEBUI_CLI_ROLLOUT_PLAN.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/WORKLOG_LAST_2_DAYS.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/_DEPRECATED_NOTICE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/cli_process.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/prompt_suggestions_design.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/rag_human_feedback_calibration_implementation_plan.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/refactoring_strategy.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/guides/repeat_query.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/01_architecture.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/02_data_and_metrics.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/03_workflows.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/07_ux_and_product.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/CHAPTERS/09_competitive_positioning.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/INDEX.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/WORKLOG_DOCS_CLEANUP_2026-01-29.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/appendix-coverage-matrix.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/appendix-file-inventory.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/appendix-roadmap.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/handbook/appendix-taxonomy.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/mapping/component-to-whitepaper.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/00_frontmatter.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/01_overview.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/02_architecture.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/03_data_flow.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/04_components.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/05_expert_lenses.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/06_implementation.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/07_advanced.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/08_customization.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/09_quality.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/10_performance.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/11_security.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/12_operations.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/13_standards.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/14_roadmap.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/INDEX.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/new_whitepaper/STYLE_GUIDE.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/REFAC_000_master_plan.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/REFAC_010_agent_playbook.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/REFAC_020_logging_policy.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/REFAC_030_phase0_responsibility_map.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/REFAC_040_wbs_parallel_plan.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/logs/phase-0-baseline.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/logs/phase-1-evaluator.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/logs/phase-2-cli-run.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/refactor/logs/phase-3-analysis.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/security_audit_worklog.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/stylesheets/extra.css +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/dataset_template.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/dataset_template.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/dataset_template.xlsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/eval_report_templates.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/kg_template.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/otel_openinference_trace_example.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/ragas_dataset_example_ko90_en10.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/templates/retriever_docs_template.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/tools/generate-whitepaper.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/docs/web_ui_analysis_migration_plan.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/dummy_test_dataset.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/korean_rag/faithfulness_test.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/korean_rag/insurance_qa_100.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/korean_rag/keyword_extraction_test.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/korean_rag/retrieval_test.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/output/comparison.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/output/full_results.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/output/leaderboard.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/output/results_mteb.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/output/retrieval_result.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/benchmarks/run_korean_benchmark.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/kg_generator_demo.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/method_plugin_template/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/method_plugin_template/pyproject.toml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/method_plugin_template/src/method_plugin_template/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/method_plugin_template/src/method_plugin_template/methods.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/usecase/comprehensive_workflow_test.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/usecase/insurance_eval_dataset.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/examples/usecase/output/comprehensive_report.html +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/.env.example +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/.gitignore +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/Dockerfile +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/analysis-compare.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/analysis-lab.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/compare-runs.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/dashboard.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/domain-memory.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/evaluation-studio.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/judge-calibration.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/knowledge-base.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/mocks/intents.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/mocks/run_details.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/mocks/runs.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/e2e/run-details.spec.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/eslint.config.js +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/index.html +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/nginx.conf +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/playwright.config.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/public/vite.svg +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/App.css +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/App.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/assets/react.svg +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/AnalysisNodeOutputs.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/InsightSpacePanel.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/Layout.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/MarkdownContent.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/PrioritySummaryPanel.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/SpaceLegend.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/SpacePlot2D.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/SpacePlot3D.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/StatusBadge.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/ToastProvider.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/VirtualizedText.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/ai-elements/Conversation.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/ai-elements/Message.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/ai-elements/PromptInput.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/ai-elements/Response.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/components/ai-elements/index.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/config/ui.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/config.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/hooks/useInsightSpace.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/index.css +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/main.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/AiSdkChat.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/AnalysisCompareView.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/AnalysisLab.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/AnalysisResultView.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/Chat.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/CompareRuns.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/ComprehensiveAnalysis.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/CustomerReport.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/Dashboard.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/DomainMemory.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/JudgeCalibration.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/KnowledgeBase.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/RunDetails.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/Visualization.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/pages/VisualizationHome.tsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/services/api.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/types/plotly.d.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/utils/cliCommandBuilder.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/utils/clipboard.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/utils/format.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/utils/phoenix.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/utils/runAnalytics.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/utils/score.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/src/utils/summaryMetrics.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/tailwind.config.js +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/tsconfig.app.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/tsconfig.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/tsconfig.node.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/frontend/vite.config.ts +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/mkdocs.yml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/package-lock.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/prompts/system_override.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/.gitkeep +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/README.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/debug_report_r1_smoke.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/debug_report_r2_graphrag.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/debug_report_r2_graphrag_openai.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/debug_report_r3_bm25.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/debug_report_r3_bm25_langfuse3.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/debug_report_r3_dense_faiss.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/feature_verification_report.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/improvement_1d91a667-4288-4742-be3a-a8f5310c5140.md +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r2_graphrag_openai_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r2_graphrag_openai_stage_report.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r2_graphrag_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r2_graphrag_stage_report.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_bm25_langfuse2_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_bm25_langfuse3_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_bm25_langfuse_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_bm25_phoenix_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_bm25_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_bm25_stage_report.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_dense_faiss_stage_events.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/reports/r3_dense_faiss_stage_report.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/benchmark/download_kmmlu.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/ci/run_regression_gate.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev/open_rag_trace_demo.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev/open_rag_trace_integration_template.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev/otel-collector-config.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev/preindex_pgvector_runs.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev/start_web_ui_with_phoenix.sh +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev/validate_open_rag_trace.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev/verify_dashboard_endpoint.sh +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/dev_seed_pipeline_results.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/analyzer/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/analyzer/ast_scanner.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/analyzer/confidence_scorer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/analyzer/graph_builder.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/analyzer/side_effect_detector.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/generate_api_docs.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/models/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/models/schema.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/renderer/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/docs/renderer/html_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/export_api_base_only.sh +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/offline/predownload_nlp_models.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/ops/phoenix_watch.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/perf/backfill_langfuse_trace_url.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/perf/r3_dense_smoke.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/perf/r3_evalvault_run_dataset.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/perf/r3_retriever_docs.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/perf/r3_smoke_real.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/perf/r3_stage_events_sample.jsonl +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/pipeline_template_inspect.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/reports/generate_release_notes.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/run_with_timeout.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/test_full_evaluation.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/tests/run_regressions.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/tests/run_retriever_stage_report_smoke.sh +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/validate_tutorials.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/verify_ragas_compliance.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/scripts/verify_workflows.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/benchmark.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/calibration.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/chat.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/config.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/domain.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/knowledge.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/mcp.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/pipeline.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/api/routers/runs.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/agent.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/api.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/artifacts.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/benchmark.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/calibrate.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/calibrate_judge.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/compare.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/config.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/debug.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/domain.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/experiment.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/gate.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/generate.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/graph_rag.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/history.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/init.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/kg.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/langfuse.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/ops.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/phoenix.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/pipeline.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/profile_difficulty.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/prompts.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/regress.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/commands/stage.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/analysis_io.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/console.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/errors.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/formatters.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/options.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/presets.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/progress.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/cli/utils/validators.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/mcp/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/mcp/schemas.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/inbound/mcp/tools.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/analysis_report_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/base_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/bm25_searcher_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/causal_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/causal_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/common.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/comparison_pipeline_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/comparison_report_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/data_loader_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/dataset_feature_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/detailed_report_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/diagnostic_playbook_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/embedding_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/embedding_distribution_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/embedding_searcher_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/hybrid_rrf_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/hybrid_weighted_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/hypothesis_generator_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/low_performer_extractor_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/model_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/morpheme_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/morpheme_quality_checker_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/multiturn_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/network_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/nlp_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/nlp_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/pattern_detector_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/pipeline_factory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/pipeline_helpers.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/priority_summary_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/ragas_evaluator_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/retrieval_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/retrieval_benchmark_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/retrieval_quality_checker_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/root_cause_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/run_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/run_change_detector_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/run_comparator_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/run_loader_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/run_metric_comparator_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/search_comparator_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/statistical_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/statistical_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/statistical_comparator_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/summary_report_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/time_series_analyzer_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/timeseries_advanced_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/trend_detector_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/analysis/verification_report_module.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/artifact_fs.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/benchmark/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/benchmark/lm_eval_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/cache/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/cache/hybrid_cache.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/cache/memory_cache.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/base.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/csv_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/excel_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/json_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/loader_factory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/method_input_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/multiturn_json_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/streaming_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/templates.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/dataset/thresholds.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/debug/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/debug/report_renderer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/documents/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/documents/ocr/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/documents/ocr/paddleocr_backend.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/documents/pdf_extractor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/documents/versioned_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/domain_memory/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/domain_memory/domain_memory_schema.sql +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/domain_memory/factory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/domain_memory/postgres_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/domain_memory/postgres_domain_memory_schema.sql +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/domain_memory/sqlite_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/filesystem/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/filesystem/difficulty_profile_writer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/filesystem/ops_snapshot_writer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/improvement/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/improvement/insight_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/improvement/pattern_detector.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/improvement/playbook_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/improvement/stage_metric_playbook_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/judge_calibration_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/judge_calibration_reporter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/kg/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/kg/graph_rag_retriever.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/kg/networkx_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/kg/parallel_kg_builder.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/kg/query_strategies.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/anthropic_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/azure_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/base.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/factory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/instructor_factory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/llm_relation_augmenter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/ollama_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/openai_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/token_aware_chat.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/llm/vllm_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/methods/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/methods/baseline_oracle.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/methods/external_command.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/methods/registry.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/bm25_retriever.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/dense_retriever.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/document_chunker.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/hybrid_retriever.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/kiwi_tokenizer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/korean_evaluation.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/korean_stopwords.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/toolkit.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/nlp/korean/toolkit_factory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/ops/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/ops/report_renderer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/report/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/report/ci_report_formatter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/report/dashboard_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/report/llm_report_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/report/pr_comment_formatter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/retriever/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/retriever/graph_rag_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/retriever/pgvector_store.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/base_sql.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/benchmark_storage_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/postgres_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/postgres_schema.sql +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/schema.sql +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/storage/sqlite_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracer/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracer/open_rag_log_handler.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracer/open_rag_trace_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracer/open_rag_trace_decorators.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracer/open_rag_trace_helpers.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracer/phoenix_tracer_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracker/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracker/langfuse_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracker/log_sanitizer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/adapters/outbound/tracker/mlflow_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/agent_types.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/domain_config.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/instrumentation.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/langfuse_support.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/model_config.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/phoenix_support.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/playbooks/improvement_playbook.yaml +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/secret_manager.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/config/settings.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/debug_ragas.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/debug_ragas_real.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/analysis.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/analysis_pipeline.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/benchmark.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/benchmark_run.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/dataset.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/debug.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/experiment.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/feedback.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/graph_rag.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/improvement.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/judge_calibration.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/kg.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/memory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/method.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/multiturn.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/ops_report.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/prompt.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/prompt_suggestion.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/rag_trace.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/result.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/entities/stage.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/analysis_registry.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/confidence.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/contextual_relevancy.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/entity_preservation.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/insurance.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/multiturn_metrics.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/no_answer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/registry.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/retrieval_rank.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/summary_accuracy.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/summary_needs_followup.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/summary_non_definitive.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/summary_risk_coverage.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/terms_dictionary.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/metrics/text_match.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/analysis_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/artifact_lint_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/async_batch_executor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/batch_executor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/benchmark_report_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/benchmark_runner.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/benchmark_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/cache_metrics.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/cluster_map_builder.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/custom_metric_snapshot.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/dataset_preprocessor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/debug_report_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/difficulty_profile_reporter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/difficulty_profiling_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/document_chunker.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/document_versioning.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/domain_learning_hook.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/embedding_overlay.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/entity_extractor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/evaluator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/experiment_comparator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/experiment_manager.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/experiment_reporter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/experiment_repository.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/experiment_statistics.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/graph_rag_experiment.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/holdout_splitter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/improvement_guide_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/intent_classifier.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/judge_calibration_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/kg_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/memory_aware_evaluator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/memory_based_analysis.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/method_runner.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/multiturn_evaluator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/ops_report_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/ops_snapshot_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/pipeline_orchestrator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/pipeline_template_registry.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/prompt_candidate_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/prompt_manifest.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/prompt_registry.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/prompt_scoring_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/prompt_status.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/prompt_suggestion_reporter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/ragas_prompt_overrides.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/regression_gate_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/retrieval_metrics.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/retriever_context.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/run_comparison_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/satisfaction_calibration_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/stage_event_builder.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/stage_metric_guide_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/stage_metric_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/stage_summary_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/synthetic_qa_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/testset_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/threshold_profiles.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/unified_report_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/domain/services/visual_space_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/mkdocs_helpers.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/inbound/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/inbound/analysis_pipeline_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/inbound/evaluator_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/inbound/learning_hook_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/inbound/multiturn_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/inbound/web_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/analysis_cache_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/analysis_module_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/analysis_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/artifact_fs_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/benchmark_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/causal_analysis_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/comparison_pipeline_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/dataset_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/difficulty_profile_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/domain_memory_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/embedding_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/graph_retriever_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/improvement_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/intent_classifier_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/judge_calibration_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/korean_nlp_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/llm_factory_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/llm_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/method_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/nlp_analysis_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/ops_snapshot_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/relation_augmenter_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/report_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/stage_storage_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/storage_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/tracer_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/ports/outbound/tracker_port.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/reports/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/reports/release_notes.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/scripts/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/src/evalvault/scripts/regression_runner.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/conftest.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/benchmark/retrieval_ground_truth_min.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/benchmark/retrieval_ground_truth_multi.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/auto_insurance_qa_korean_full.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/callcenter_summary_5cases.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/comprehensive_dataset.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/edge_cases.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/edge_cases.xlsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/evaluation_test_sample.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/graphrag_benchmark.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/graphrag_multi_sample.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/graphrag_retriever_docs.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/graphrag_smoke.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_document.txt +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_qa_english.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_qa_english.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_qa_english.xlsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_qa_korean.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_qa_korean.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_qa_korean.xlsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/insurance_qa_korean_versioned_pdf.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/multiturn_benchmark.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/regression_baseline.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/run_mode_full_domain_memory.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/run_mode_simple.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/e2e/summary_eval_minimal.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/kg/minimal_graph.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/sample_dataset.csv +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/sample_dataset.json +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/fixtures/sample_dataset.xlsx +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/benchmark/test_benchmark_service_integration.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/conftest.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_cli_integration.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_data_flow.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_e2e_scenarios.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_evaluation_flow.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_full_workflow.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_langfuse_flow.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_phoenix_flow.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_pipeline_api_contracts.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_storage_flow.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/integration/test_summary_eval_fixture.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/optional_deps.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/inbound/mcp/test_execute_tools.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/inbound/mcp/test_read_tools.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/documents/test_pdf_extractor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/documents/test_versioned_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/improvement/__init__.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/improvement/test_insight_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/improvement/test_pattern_detector.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/improvement/test_playbook_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/improvement/test_stage_metric_playbook_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/kg/test_graph_rag_retriever.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/kg/test_parallel_kg_builder.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/retriever/test_graph_rag_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/adapters/outbound/storage/test_benchmark_storage_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/config/test_phoenix_support.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/conftest.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_analysis_metric_registry.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_confidence.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_contextual_relevancy.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_entity_preservation.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_metric_registry.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_multiturn_metrics.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_no_answer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_retrieval_rank.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/metrics/test_text_match.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_cache_metrics.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_claim_level.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_dataset_preprocessor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_document_versioning.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_evaluator_comprehensive.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_holdout_splitter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_improvement_guide_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_judge_calibration_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_ops_snapshot_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_regression_gate_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_retrieval_metrics.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_retriever_context.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_stage_event_builder.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_stage_metric_guide_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/services/test_synthetic_qa_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/test_embedding_overlay.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/test_prompt_manifest.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/domain/test_prompt_status.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/reports/test_release_notes.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/scripts/test_regression_runner.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_agent_types.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_analysis_entities.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_analysis_modules.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_analysis_pipeline.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_analysis_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_anthropic_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_artifact_lint_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_async_batch_executor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_azure_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_benchmark_helpers.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_benchmark_runner.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_causal_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_ci_gate_cli.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli_artifacts.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli_calibrate_judge.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli_domain.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli_init.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli_ops.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli_progress.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_cli_utils.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_data_loaders.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_difficulty_profiling_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_domain_config.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_domain_memory.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_entities.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_entities_kg.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_entity_extractor.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_evaluator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_experiment.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_hybrid_cache.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_instrumentation.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_insurance_metric.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_intent_classifier.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_kg_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_kg_networkx.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_kiwi_tokenizer.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_kiwi_warning_suppression.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_korean_dense.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_korean_evaluation.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_korean_retrieval.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_langfuse_tracker.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_llm_relation_augmenter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_lm_eval_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_markdown_report.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_memory_cache.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_memory_services.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_method_plugins.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_model_config.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_nlp_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_nlp_entities.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_ollama_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_openai_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_pipeline_orchestrator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_ports.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_postgres_storage.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_pr_comment_formatter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_prompt_candidate_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_rag_trace_entities.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_regress_cli.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_run_comparison_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_run_memory_helpers.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_run_mode_fixtures.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_settings.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_sqlite_storage.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_stage_cli.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_stage_event_schema.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_stage_metric_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_stage_storage.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_stage_summary_service.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_statistical_adapter.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_streaming_loader.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_summary_eval_fixture.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_testset_generator.py +0 -0
- {evalvault-1.76.0 → evalvault-1.77.0}/tests/unit/test_web_adapter.py +0 -0
|
@@ -4,11 +4,13 @@
|
|
|
4
4
|
# Usage:
|
|
5
5
|
# cp .env.offline.example .env.offline
|
|
6
6
|
# # Edit .env.offline with your air-gapped network settings
|
|
7
|
-
# docker compose -f docker-compose.offline.yml --env-file .env.offline up -d
|
|
7
|
+
# docker compose -f docker-compose.offline.yml --env-file .env.offline up -d --no-build --pull never
|
|
8
8
|
|
|
9
9
|
# ================================================
|
|
10
|
-
# Profile
|
|
10
|
+
# Profile (choose one)
|
|
11
11
|
# ================================================
|
|
12
|
+
# Ollama-based (dev/prod): dev or prod
|
|
13
|
+
# vLLM-based: vllm
|
|
12
14
|
EVALVAULT_PROFILE=dev
|
|
13
15
|
|
|
14
16
|
# ================================================
|
|
@@ -31,7 +33,7 @@ EVALVAULT_MEMORY_DB_PATH=data/db/evalvault_memory.db
|
|
|
31
33
|
CORS_ORIGINS=http://localhost:5173,http://127.0.0.1:5173
|
|
32
34
|
|
|
33
35
|
# ================================================
|
|
34
|
-
# Docker Base Images (
|
|
36
|
+
# Docker Base Images (online build only)
|
|
35
37
|
# ================================================
|
|
36
38
|
EVALVAULT_PYTHON_IMAGE=python:3.12.6-slim
|
|
37
39
|
EVALVAULT_UV_IMAGE=ghcr.io/astral-sh/uv:0.4.28
|
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
# EvalVault Offline Environment Template (Ollama)
|
|
2
|
+
# Copy to .env.offline and fill required values.
|
|
3
|
+
#
|
|
4
|
+
# Usage:
|
|
5
|
+
# cp .env.offline.ollama.example .env.offline
|
|
6
|
+
# # Edit .env.offline with your air-gapped network settings
|
|
7
|
+
# docker compose -f docker-compose.offline.yml --env-file .env.offline up -d --no-build --pull never
|
|
8
|
+
|
|
9
|
+
# ================================================
|
|
10
|
+
# Profile (Ollama)
|
|
11
|
+
# ================================================
|
|
12
|
+
EVALVAULT_PROFILE=dev
|
|
13
|
+
|
|
14
|
+
# ================================================
|
|
15
|
+
# PostgreSQL (core stack)
|
|
16
|
+
# ================================================
|
|
17
|
+
POSTGRES_IMAGE=pgvector/pgvector:0.8.0-pg16
|
|
18
|
+
POSTGRES_USER=evalvault
|
|
19
|
+
POSTGRES_PASSWORD=evalvault
|
|
20
|
+
POSTGRES_DB=evalvault
|
|
21
|
+
|
|
22
|
+
# ================================================
|
|
23
|
+
# Storage (SQLite paths for local file-based storage)
|
|
24
|
+
# ================================================
|
|
25
|
+
EVALVAULT_DB_PATH=data/db/evalvault.db
|
|
26
|
+
EVALVAULT_MEMORY_DB_PATH=data/db/evalvault_memory.db
|
|
27
|
+
|
|
28
|
+
# ================================================
|
|
29
|
+
# API / CORS
|
|
30
|
+
# ================================================
|
|
31
|
+
CORS_ORIGINS=http://localhost:5173,http://127.0.0.1:5173
|
|
32
|
+
|
|
33
|
+
# ================================================
|
|
34
|
+
# Docker Base Images (online build only)
|
|
35
|
+
# ================================================
|
|
36
|
+
EVALVAULT_PYTHON_IMAGE=python:3.12.6-slim
|
|
37
|
+
EVALVAULT_UV_IMAGE=ghcr.io/astral-sh/uv:0.4.28
|
|
38
|
+
EVALVAULT_NODE_IMAGE=node:20.19-alpine
|
|
39
|
+
EVALVAULT_NGINX_IMAGE=nginx:1.27.3-alpine
|
|
40
|
+
|
|
41
|
+
# API_AUTH_TOKENS=
|
|
42
|
+
# KNOWLEDGE_READ_TOKENS=
|
|
43
|
+
# KNOWLEDGE_WRITE_TOKENS=
|
|
44
|
+
|
|
45
|
+
# ================================================
|
|
46
|
+
# Ollama server (air-gapped network)
|
|
47
|
+
# ================================================
|
|
48
|
+
# IMPORTANT: Model weights are NOT shipped with EvalVault.
|
|
49
|
+
# You must provide the URL to your air-gapped Ollama server.
|
|
50
|
+
|
|
51
|
+
# OLLAMA_BASE_URL=
|
|
52
|
+
OLLAMA_TIMEOUT=120
|
|
53
|
+
# OLLAMA_TOOL_MODELS=
|
|
54
|
+
|
|
55
|
+
# ================================================
|
|
56
|
+
# Faithfulness fallback (optional)
|
|
57
|
+
# ================================================
|
|
58
|
+
# FAITHFULNESS_FALLBACK_PROVIDER=
|
|
59
|
+
# FAITHFULNESS_FALLBACK_MODEL=
|
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
# EvalVault Offline Environment Template (vLLM)
|
|
2
|
+
# Copy to .env.offline and fill required values.
|
|
3
|
+
#
|
|
4
|
+
# Usage:
|
|
5
|
+
# cp .env.offline.vllm.example .env.offline
|
|
6
|
+
# # Edit .env.offline with your air-gapped network settings
|
|
7
|
+
# docker compose -f docker-compose.offline.yml --env-file .env.offline up -d --no-build --pull never
|
|
8
|
+
|
|
9
|
+
# ================================================
|
|
10
|
+
# Profile (vLLM)
|
|
11
|
+
# ================================================
|
|
12
|
+
EVALVAULT_PROFILE=vllm
|
|
13
|
+
|
|
14
|
+
# ================================================
|
|
15
|
+
# PostgreSQL (core stack)
|
|
16
|
+
# ================================================
|
|
17
|
+
POSTGRES_IMAGE=pgvector/pgvector:0.8.0-pg16
|
|
18
|
+
POSTGRES_USER=evalvault
|
|
19
|
+
POSTGRES_PASSWORD=evalvault
|
|
20
|
+
POSTGRES_DB=evalvault
|
|
21
|
+
|
|
22
|
+
# ================================================
|
|
23
|
+
# Storage (SQLite paths for local file-based storage)
|
|
24
|
+
# ================================================
|
|
25
|
+
EVALVAULT_DB_PATH=data/db/evalvault.db
|
|
26
|
+
EVALVAULT_MEMORY_DB_PATH=data/db/evalvault_memory.db
|
|
27
|
+
|
|
28
|
+
# ================================================
|
|
29
|
+
# API / CORS
|
|
30
|
+
# ================================================
|
|
31
|
+
CORS_ORIGINS=http://localhost:5173,http://127.0.0.1:5173
|
|
32
|
+
|
|
33
|
+
# ================================================
|
|
34
|
+
# Docker Base Images (online build only)
|
|
35
|
+
# ================================================
|
|
36
|
+
EVALVAULT_PYTHON_IMAGE=python:3.12.6-slim
|
|
37
|
+
EVALVAULT_UV_IMAGE=ghcr.io/astral-sh/uv:0.4.28
|
|
38
|
+
EVALVAULT_NODE_IMAGE=node:20.19-alpine
|
|
39
|
+
EVALVAULT_NGINX_IMAGE=nginx:1.27.3-alpine
|
|
40
|
+
|
|
41
|
+
# API_AUTH_TOKENS=
|
|
42
|
+
# KNOWLEDGE_READ_TOKENS=
|
|
43
|
+
# KNOWLEDGE_WRITE_TOKENS=
|
|
44
|
+
|
|
45
|
+
# ================================================
|
|
46
|
+
# vLLM server (air-gapped network - internal models)
|
|
47
|
+
# ================================================
|
|
48
|
+
# IMPORTANT: Model weights are NOT shipped with EvalVault.
|
|
49
|
+
# You must provide the URL to your air-gapped vLLM server.
|
|
50
|
+
|
|
51
|
+
# VLLM_BASE_URL=http://vllm-server:8000/v1
|
|
52
|
+
# VLLM_API_KEY=
|
|
53
|
+
# VLLM_MODEL=Qwen/Qwen2.5-7B-Instruct
|
|
54
|
+
# VLLM_EMBEDDING_MODEL=BAAI/bge-m3
|
|
55
|
+
# VLLM_EMBEDDING_BASE_URL=http://vllm-embedding:8000/v1
|
|
56
|
+
|
|
57
|
+
# ================================================
|
|
58
|
+
# Closed-source models (requires network proxy)
|
|
59
|
+
# ================================================
|
|
60
|
+
# If your air-gapped network has proxy access to external APIs,
|
|
61
|
+
# you can use closed-source models via API keys.
|
|
62
|
+
|
|
63
|
+
# OpenAI
|
|
64
|
+
# OPENAI_API_KEY=sk-...
|
|
65
|
+
# OPENAI_MODEL=gpt-4o-mini
|
|
66
|
+
# OPENAI_BASE_URL=https://api.openai.com/v1
|
|
67
|
+
|
|
68
|
+
# Anthropic
|
|
69
|
+
# ANTHROPIC_API_KEY=sk-ant-...
|
|
70
|
+
# ANTHROPIC_MODEL=claude-sonnet-4-20250514
|
|
71
|
+
|
|
72
|
+
# Azure OpenAI
|
|
73
|
+
# AZURE_OPENAI_API_KEY=
|
|
74
|
+
# AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
|
|
75
|
+
# AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
|
|
76
|
+
# AZURE_OPENAI_API_VERSION=2024-02-15-preview
|
|
77
|
+
|
|
78
|
+
# ================================================
|
|
79
|
+
# Faithfulness fallback (optional)
|
|
80
|
+
# ================================================
|
|
81
|
+
# Use a more capable model for faithfulness metric if needed
|
|
82
|
+
# FAITHFULNESS_FALLBACK_PROVIDER=openai
|
|
83
|
+
# FAITHFULNESS_FALLBACK_MODEL=gpt-4o
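The two templates above leave the inference server URL commented out and state that it must be supplied. A pre-flight check for that could look like the sketch below. It is an illustration only, not code from EvalVault: the helper names `parse_env` and `missing_settings` are hypothetical, and the profile-to-variable mapping is an assumption drawn from the template comments (`dev`/`prod` for Ollama, `vllm` for vLLM).

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Assumption: each profile needs exactly one server URL, as the template
# comments suggest. This mapping is not taken from EvalVault's source.
REQUIRED_URL = {
    "dev": "OLLAMA_BASE_URL",
    "prod": "OLLAMA_BASE_URL",
    "vllm": "VLLM_BASE_URL",
}


def missing_settings(env: dict[str, str]) -> list[str]:
    """Return the names of settings that still need a value."""
    required = REQUIRED_URL.get(env.get("EVALVAULT_PROFILE", ""))
    if required is None:
        # Unknown or absent profile: report the profile itself.
        return ["EVALVAULT_PROFILE"]
    return [] if env.get(required) else [required]
```

Running `missing_settings(parse_env(open(".env.offline").read()))` before `docker compose ... up` would surface a URL that is still commented out.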
|
|
@@ -4,11 +4,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
|
|
|
4
4
|
|
|
5
5
|
## Project Overview
|
|
6
6
|
|
|
7
|
-
EvalVault is a RAG (Retrieval-Augmented Generation) evaluation system for Korean/English insurance documents. Built on Ragas +
|
|
7
|
+
EvalVault is a RAG (Retrieval-Augmented Generation) evaluation system for Korean/English insurance documents. Built on Ragas with dual-tracker support (MLflow + Phoenix) for experiment tracking and observability.
|
|
8
8
|
|
|
9
9
|
**Core Flow:**
|
|
10
10
|
```
|
|
11
|
-
Input (CSV/Excel/JSON) → Ragas Evaluation →
|
|
11
|
+
Input (CSV/Excel/JSON) → Ragas Evaluation → MLflow + Phoenix (dual logging) → Analysis
|
|
12
12
|
```
|
|
13
13
|
|
|
14
14
|
**Supported Metrics:**
|
|
@@ -53,6 +53,8 @@ src/evalvault/
|
|
|
53
53
|
| DatasetPort | CSV/Excel/JSON Loaders | ✅ Complete |
|
|
54
54
|
| TrackerPort | LangfuseAdapter | ✅ Complete |
|
|
55
55
|
| TrackerPort | MLflowAdapter | ✅ Complete |
|
|
56
|
+
| TrackerPort | PhoenixAdapter | ✅ Complete |
|
|
57
|
+
| TrackerPort | MultiTrackerAdapter | ✅ Complete (dual-logging: MLflow + Phoenix) |
|
|
56
58
|
| StoragePort | SQLiteAdapter | ✅ Complete |
|
|
57
59
|
| StoragePort | PostgreSQLAdapter | ✅ Complete |
|
|
58
60
|
| EvaluatorPort | RagasEvaluator | ✅ Complete |
|
|
@@ -69,6 +71,25 @@ src/evalvault/
|
|
|
69
71
|
- **Purpose**: Trace logging, score tracking, evaluation history
|
|
70
72
|
- **Credentials**: Inject via `LANGFUSE_PUBLIC_KEY` / `LANGFUSE_SECRET_KEY`
|
|
71
73
|
|
|
74
|
+
### Experiment Tracking (MLflow + Phoenix)
|
|
75
|
+
|
|
76
|
+
EvalVault uses dual-tracker logging by default:
|
|
77
|
+
- **MLflow**: Experiment management, metric comparison, model versioning
|
|
78
|
+
- **Phoenix**: Real-time observability, trace visualization, prompt debugging
|
|
79
|
+
|
|
80
|
+
```bash
|
|
81
|
+
# MLflow (required)
|
|
82
|
+
MLFLOW_TRACKING_URI=http://localhost:5000
|
|
83
|
+
MLFLOW_EXPERIMENT_NAME=evalvault-experiments
|
|
84
|
+
|
|
85
|
+
# Phoenix (required)
|
|
86
|
+
PHOENIX_ENDPOINT=http://localhost:6006
|
|
87
|
+
PHOENIX_PROJECT_NAME=evalvault
|
|
88
|
+
|
|
89
|
+
# Tracker selection (default: mlflow+phoenix)
|
|
90
|
+
TRACKER_PROVIDER=mlflow+phoenix # Options: mlflow, phoenix, langfuse, mlflow+phoenix
|
|
91
|
+
```
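The `TRACKER_PROVIDER` value above combines tracker names with `+`. One way such a value might be split and validated is sketched below. This is not EvalVault's actual parsing logic: the function name `split_tracker_provider` and the `KNOWN_TRACKERS` set are hypothetical, built only from the options listed in the comment (`mlflow`, `phoenix`, `langfuse`, `mlflow+phoenix`).

```python
# Assumption: the set of valid names mirrors the options listed in the
# TRACKER_PROVIDER comment; it is not read from EvalVault itself.
KNOWN_TRACKERS = {"mlflow", "phoenix", "langfuse"}


def split_tracker_provider(value: str = "mlflow+phoenix") -> list[str]:
    """Split a '+'-joined provider string into individual tracker names."""
    names = [part.strip().lower() for part in value.split("+") if part.strip()]
    unknown = [name for name in names if name not in KNOWN_TRACKERS]
    if not names or unknown:
        raise ValueError(f"unsupported TRACKER_PROVIDER value: {value!r}")
    return names
```

Each returned name would then map to one tracker adapter, with a multi-tracker wrapper fanning out log calls when more than one name is present.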
|
|
92
|
+
|
|
72
93
|
## Development Commands
|
|
73
94
|
|
|
74
95
|
```bash
|
|
@@ -188,14 +209,40 @@ OPENAI_MODEL=gpt-5-nano
|
|
|
188
209
|
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
|
|
189
210
|
# OPENAI_BASE_URL=https://api.openai.com/v1 # optional
|
|
190
211
|
|
|
191
|
-
# Langfuse (
|
|
212
|
+
# Langfuse (optional, if using langfuse tracker)
|
|
192
213
|
LANGFUSE_PUBLIC_KEY=pk-lf-...
|
|
193
214
|
LANGFUSE_SECRET_KEY=sk-lf-...
|
|
194
215
|
LANGFUSE_HOST=http://your-langfuse-host:port
|
|
216
|
+
|
|
217
|
+
# MLflow + Phoenix (default dual-tracker)
|
|
218
|
+
MLFLOW_TRACKING_URI=http://localhost:5000
|
|
219
|
+
PHOENIX_ENDPOINT=http://localhost:6006
|
|
220
|
+
TRACKER_PROVIDER=mlflow+phoenix
|
|
195
221
|
```
|
|
196
222
|
|
|
197
223
|
**Note:** Metric thresholds are defined in the **dataset JSON file**, not in environment variables.
|
|
198
224
|
|
|
225
|
+
## Offline Deployment
|
|
226
|
+
|
|
227
|
+
EvalVault supports air-gapped (offline) deployment with bundled models.
|
|
228
|
+
|
|
229
|
+
### Supported Configurations
|
|
230
|
+
- **Ollama**: Lightweight local inference (`llama3.2`, `llama3.1`, etc.)
|
|
231
|
+
- **vLLM**: High-performance GPU inference (recommended for production)
|
|
232
|
+
|
|
233
|
+
### Quick Start
|
|
234
|
+
```bash
|
|
235
|
+
# Build offline bundle
|
|
236
|
+
./scripts/offline/build_full_offline_bundle.sh
|
|
237
|
+
|
|
238
|
+
# Restore on target machine
|
|
239
|
+
./scripts/offline/import_images.sh
|
|
240
|
+
./scripts/offline/restore_model_cache.sh
|
|
241
|
+
docker compose -f docker-compose.offline.yml up -d
|
|
242
|
+
```
|
|
243
|
+
|
|
244
|
+
See [docs/guides/OFFLINE_DOCKER.md](docs/guides/OFFLINE_DOCKER.md) for detailed instructions.
|
|
245
|
+
|
|
199
246
|
## Data Format
|
|
200
247
|
|
|
201
248
|
### Input Dataset (JSON)
|
|
@@ -241,7 +288,8 @@ tc-001,"질문","답변","[""컨텍스트1"",""컨텍스트2""]","정답"
|
|
|
241
288
|
| RagasEvaluator | ✅ Complete | 6 metrics (Ragas v1.0) |
|
|
242
289
|
| LLM Adapters | ✅ Complete | OpenAI, Ollama, Azure, Anthropic |
|
|
243
290
|
| Storage Adapters | ✅ Complete | SQLite, PostgreSQL |
|
|
244
|
-
| Tracker Adapters | ✅ Complete | Langfuse, MLflow |
|
|
291
|
+
| Tracker Adapters | ✅ Complete | Langfuse, MLflow, Phoenix (dual-tracker) |
|
|
292
|
+
| Offline Bundle | ✅ Complete | Ollama/vLLM, Docker images, model cache |
|
|
245
293
|
| CLI | ✅ Complete | run, metrics, config, history, compare, export, generate, pipeline, benchmark |
|
|
246
294
|
| Testset Generation | ✅ Complete | Basic + Knowledge Graph |
|
|
247
295
|
| Experiment Management | ✅ Complete | A/B testing, comparison |
|
|
@@ -265,6 +313,8 @@ tc-001,"질문","답변","[""컨텍스트1"",""컨텍스트2""]","정답"
|
|
|
265
313
|
| [docs/ROADMAP.md](docs/ROADMAP.md) | 2026-2027 development roadmap (Phase 15-19+) |
|
|
266
314
|
| [docs/IMPROVEMENT_PLAN.md](docs/IMPROVEMENT_PLAN.md) | Code quality improvement plan (P1-P7 priorities, parallel agent workflow) |
|
|
267
315
|
| [docs/KG_IMPROVEMENT_PLAN.md](docs/KG_IMPROVEMENT_PLAN.md) | Knowledge Graph improvement plan |
|
|
316
|
+
| [docs/guides/OFFLINE_DOCKER.md](docs/guides/OFFLINE_DOCKER.md) | Offline Docker deployment guide |
|
|
317
|
+
| [docs/guides/OFFLINE_MODELS.md](docs/guides/OFFLINE_MODELS.md) | Offline model bundling guide |
|
|
268
318
|
| [agent/README.md](agent/README.md) | Autonomous agent system usage guide |
|
|
269
319
|
|
|
270
320
|
## Autonomous Agent System
|
|
@@ -18,15 +18,15 @@ WORKDIR /app
|
|
|
18
18
|
# Copy dependency files and README (required by pyproject.toml)
|
|
19
19
|
COPY pyproject.toml uv.lock README.md ./
|
|
20
20
|
|
|
21
|
-
# Install dependencies
|
|
22
|
-
RUN uv sync --frozen --no-dev --no-install-project
|
|
21
|
+
# Install dependencies (include postgres for PostgreSQL support)
|
|
22
|
+
RUN uv sync --frozen --no-dev --no-install-project --extra postgres
|
|
23
23
|
|
|
24
24
|
# Copy source code
|
|
25
25
|
COPY src/ ./src/
|
|
26
26
|
COPY config/ ./config/
|
|
27
27
|
|
|
28
28
|
# Install the project
|
|
29
|
-
RUN uv sync --frozen --no-dev
|
|
29
|
+
RUN uv sync --frozen --no-dev --extra postgres
|
|
30
30
|
|
|
31
31
|
|
|
32
32
|
# Stage 2: Runtime stage
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: evalvault
|
|
3
|
-
Version: 1.
|
|
3
|
+
Version: 1.77.0
|
|
4
4
|
Summary: RAG evaluation system using Ragas with Phoenix/Langfuse tracing
|
|
5
5
|
Project-URL: Homepage, https://github.com/ntts9990/EvalVault
|
|
6
6
|
Project-URL: Documentation, https://github.com/ntts9990/EvalVault#readme
|
|
@@ -66,6 +66,7 @@ Requires-Dist: ijson>=3.3.0; extra == 'dev'
|
|
|
66
66
|
Requires-Dist: kiwipiepy>=0.18.0; extra == 'dev'
|
|
67
67
|
Requires-Dist: langchain-anthropic; extra == 'dev'
|
|
68
68
|
Requires-Dist: lm-eval[api]>=0.4.0; extra == 'dev'
|
|
69
|
+
Requires-Dist: manim>=0.18.0; extra == 'dev'
|
|
69
70
|
Requires-Dist: mkdocs-material>=9.5.0; extra == 'dev'
|
|
70
71
|
Requires-Dist: mkdocs>=1.5.0; extra == 'dev'
|
|
71
72
|
Requires-Dist: mkdocstrings[python]>=0.24.0; extra == 'dev'
|
|
@@ -85,6 +85,18 @@ profiles:
|
|
|
85
85
|
provider: vllm
|
|
86
86
|
model: qwen3-embedding:0.6b
|
|
87
87
|
|
|
88
|
+
# ──────────────────────────────────────────────────────────────
|
|
89
|
+
# Qwen3 14B (for report writing)
|
|
90
|
+
# ──────────────────────────────────────────────────────────────
|
|
91
|
+
qwen3-14b:
|
|
92
|
+
description: "Qwen3 14B for report writing"
|
|
93
|
+
llm:
|
|
94
|
+
provider: ollama
|
|
95
|
+
model: qwen3:14b
|
|
96
|
+
embedding:
|
|
97
|
+
provider: ollama
|
|
98
|
+
model: qwen3-embedding:0.6b
|
|
99
|
+
|
|
88
100
|
# ──────────────────────────────────────────────────────────────
|
|
89
101
|
# Note: infrastructure settings
|
|
90
102
|
# ──────────────────────────────────────────────────────────────
|
|
@@ -0,0 +1,28 @@
|
|
|
1
|
+
# EvalVault Offline Build Overrides
|
|
2
|
+
# Online build only. Do not use in air-gapped runtime.
|
|
3
|
+
#
|
|
4
|
+
# IMPORTANT: Builds linux/amd64 images for cross-platform compatibility.
|
|
5
|
+
# Apple Silicon Macs will build amd64 images that run on Ubuntu/Intel.
|
|
6
|
+
|
|
7
|
+
services:
|
|
8
|
+
evalvault-api:
|
|
9
|
+
platform: linux/amd64
|
|
10
|
+
build:
|
|
11
|
+
context: .
|
|
12
|
+
dockerfile: Dockerfile
|
|
13
|
+
platforms:
|
|
14
|
+
- linux/amd64
|
|
15
|
+
args:
|
|
16
|
+
PYTHON_IMAGE: ${EVALVAULT_PYTHON_IMAGE:-python:3.12.6-slim}
|
|
17
|
+
UV_IMAGE: ${EVALVAULT_UV_IMAGE:-ghcr.io/astral-sh/uv:0.4.28}
|
|
18
|
+
|
|
19
|
+
evalvault-web:
|
|
20
|
+
platform: linux/amd64
|
|
21
|
+
build:
|
|
22
|
+
context: ./frontend
|
|
23
|
+
dockerfile: Dockerfile
|
|
24
|
+
platforms:
|
|
25
|
+
- linux/amd64
|
|
26
|
+
args:
|
|
27
|
+
NODE_IMAGE: ${EVALVAULT_NODE_IMAGE:-node:20.19-alpine}
|
|
28
|
+
NGINX_IMAGE: ${EVALVAULT_NGINX_IMAGE:-nginx:1.27.3-alpine}
|
|
@@ -1,9 +1,9 @@
|
|
|
1
1
|
# EvalVault Offline Deployment - Docker Compose Configuration
|
|
2
2
|
# Air-gapped environment: PostgreSQL + EvalVault API + Web UI
|
|
3
3
|
#
|
|
4
|
-
# Usage:
|
|
4
|
+
# Usage (air-gapped runtime only; no build):
|
|
5
5
|
# docker compose -f docker-compose.offline.yml --env-file .env.offline.example config
|
|
6
|
-
# docker compose -f docker-compose.offline.yml --env-file .env.offline up -d
|
|
6
|
+
# docker compose -f docker-compose.offline.yml --env-file .env.offline up -d --no-build
|
|
7
7
|
#
|
|
8
8
|
# Prerequisites:
|
|
9
9
|
# - External LLM server (Ollama/vLLM) must be accessible within the air-gapped network
|
|
@@ -18,6 +18,7 @@ services:
|
|
|
18
18
|
# PostgreSQL database for evaluation storage
|
|
19
19
|
postgres:
|
|
20
20
|
image: ${POSTGRES_IMAGE:-pgvector/pgvector:0.8.0-pg16}
|
|
21
|
+
platform: linux/amd64
|
|
21
22
|
container_name: evalvault-postgres
|
|
22
23
|
pull_policy: never
|
|
23
24
|
restart: unless-stopped
|
|
@@ -38,12 +39,7 @@ services:
|
|
|
38
39
|
# EvalVault API service (FastAPI backend)
|
|
39
40
|
evalvault-api:
|
|
40
41
|
image: evalvault-api:offline
|
|
41
|
-
|
|
42
|
-
context: .
|
|
43
|
-
dockerfile: Dockerfile
|
|
44
|
-
args:
|
|
45
|
-
PYTHON_IMAGE: ${EVALVAULT_PYTHON_IMAGE:-python:3.12.6-slim}
|
|
46
|
-
UV_IMAGE: ${EVALVAULT_UV_IMAGE:-ghcr.io/astral-sh/uv:0.4.28}
|
|
42
|
+
platform: linux/amd64
|
|
47
43
|
container_name: evalvault-api
|
|
48
44
|
restart: unless-stopped
|
|
49
45
|
pull_policy: never
|
|
@@ -75,6 +71,18 @@ services:
|
|
|
75
71
|
# Faithfulness fallback (optional)
|
|
76
72
|
FAITHFULNESS_FALLBACK_PROVIDER: ${FAITHFULNESS_FALLBACK_PROVIDER:-}
|
|
77
73
|
FAITHFULNESS_FALLBACK_MODEL: ${FAITHFULNESS_FALLBACK_MODEL:-}
|
|
74
|
+
# OpenAI (closed-source, requires proxy access)
|
|
75
|
+
OPENAI_API_KEY: ${OPENAI_API_KEY:-}
|
|
76
|
+
OPENAI_MODEL: ${OPENAI_MODEL:-}
|
|
77
|
+
OPENAI_BASE_URL: ${OPENAI_BASE_URL:-}
|
|
78
|
+
# Anthropic (closed-source, requires proxy access)
|
|
79
|
+
ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY:-}
|
|
80
|
+
ANTHROPIC_MODEL: ${ANTHROPIC_MODEL:-}
|
|
81
|
+
# Azure OpenAI (closed-source, requires proxy access)
|
|
82
|
+
AZURE_OPENAI_API_KEY: ${AZURE_OPENAI_API_KEY:-}
|
|
83
|
+
AZURE_OPENAI_ENDPOINT: ${AZURE_OPENAI_ENDPOINT:-}
|
|
84
|
+
AZURE_OPENAI_DEPLOYMENT: ${AZURE_OPENAI_DEPLOYMENT:-}
|
|
85
|
+
AZURE_OPENAI_API_VERSION: ${AZURE_OPENAI_API_VERSION:-}
|
|
78
86
|
# API authentication (optional)
|
|
79
87
|
API_AUTH_TOKENS: ${API_AUTH_TOKENS:-}
|
|
80
88
|
KNOWLEDGE_READ_TOKENS: ${KNOWLEDGE_READ_TOKENS:-}
|
|
@@ -86,7 +94,7 @@ services:
|
|
|
86
94
|
# Override entrypoint to run API server
|
|
87
95
|
command: ["serve-api", "--host", "0.0.0.0", "--port", "8000"]
|
|
88
96
|
healthcheck:
|
|
89
|
-
test: ["CMD-SHELL", "
|
|
97
|
+
test: ["CMD-SHELL", "python -c \"import urllib.request; urllib.request.urlopen('http://localhost:8000/health')\" || exit 1"]
|
|
90
98
|
interval: 30s
|
|
91
99
|
timeout: 10s
|
|
92
100
|
start_period: 10s
|
|
@@ -98,12 +106,7 @@ services:
|
|
|
98
106
|
# Nginx proxies /api/ to evalvault-api:8000
|
|
99
107
|
evalvault-web:
|
|
100
108
|
image: evalvault-web:offline
|
|
101
|
-
|
|
102
|
-
context: ./frontend
|
|
103
|
-
dockerfile: Dockerfile
|
|
104
|
-
args:
|
|
105
|
-
NODE_IMAGE: ${EVALVAULT_NODE_IMAGE:-node:20.19-alpine}
|
|
106
|
-
NGINX_IMAGE: ${EVALVAULT_NGINX_IMAGE:-nginx:1.27.3-alpine}
|
|
109
|
+
platform: linux/amd64
|
|
107
110
|
container_name: evalvault-web
|
|
108
111
|
restart: unless-stopped
|
|
109
112
|
pull_policy: never
|
|
@@ -0,0 +1,162 @@
|
|
|
1
|
+
# Dependency Injection / Module Separation: Overall Plan
|
|
2
|
+
|
|
3
|
+
This document provides both the overall plan and the detailed plan for improving module portability.
|
|
4
|
+
At this stage, it fixes the design and scope and documents the criteria for the step-by-step refactoring.
|
|
5
|
+
|
|
6
|
+
## Goals
|
|
7
|
+
- Restrict Settings dependencies to entry points only
|
|
8
|
+
- Move responsibility for creating Storage/LLM/Tracker to the outside
|
|
9
|
+
- Organize the domain/port layers so they can be split into separate packages
|
|
10
|
+
- Transition incrementally while keeping features and behavior backward compatible
|
|
11
|
+
|
|
12
|
+
## Success Criteria (Definition of Done)
|
|
13
|
+
- No domain service references Settings directly
|
|
14
|
+
- Direct Settings creation is removed from every inbound layer
|
|
15
|
+
- Adapters are created only at the entrypoint
|
|
16
|
+
- Existing CLI/API behavior is preserved after the domain/port packages are split
|
|
17
|
+
- Tests and CI pass
|
|
18
|
+
|
|
19
|
+
## Summary of All Steps
|
|
20
|
+
1) PR1: Container scaffolding + injection path (backward compatible)
|
|
21
|
+
2) PR2: Remove direct Settings calls from CLI commands
|
|
22
|
+
3) PR3: Remove direct Settings calls from the API adapter
|
|
23
|
+
4) PR4: Externalize Storage/LLM/Tracker factory calls
|
|
24
|
+
5) PR5: Package the domain/ports separately
|
|
25
|
+
|
|
26
|
+
## Detailed Plan
|
|
27
|
+
|
|
28
|
+
### PR1: Container scaffolding + injection path
|
|
29
|
+
**Goal**: Create an external injection path while keeping existing behavior.
|
|
30
|
+
|
|
31
|
+
**New files**
|
|
32
|
+
- `src/evalvault/app/context.py`
|
|
33
|
+
- `src/evalvault/app/container.py`
|
|
34
|
+
|
|
35
|
+
**Container interface**
|
|
36
|
+
- `AppContext` (dataclass)
|
|
37
|
+
- settings: Settings
|
|
38
|
+
- storage: StoragePort
|
|
39
|
+
- llm: LLMPort
|
|
40
|
+
- tracker: TrackerPort
|
|
41
|
+
- domain_memory: DomainMemoryPort
|
|
42
|
+
|
|
43
|
+
- `build_app_context(settings: Settings | None = None) -> AppContext`
|
|
44
|
+
- Creates Settings if none is given
|
|
45
|
+
- Creates dependencies with internal factories (temporary)
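
The container interface above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Settings` fields, the `Any` placeholders for the port types, and the dictionary stand-ins for the factory results are all assumptions made so the sketch runs on its own.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Settings:
    """Stand-in for the project's Settings class (both fields are hypothetical)."""
    storage_url: str = "sqlite:///evalvault.db"
    llm_model: str = "qwen3:14b"


@dataclass(frozen=True)
class AppContext:
    """Bundle of dependencies created once at the entry point."""
    settings: Settings
    storage: Any        # StoragePort in the plan
    llm: Any            # LLMPort in the plan
    tracker: Any        # TrackerPort in the plan
    domain_memory: Any  # DomainMemoryPort in the plan


def build_app_context(settings: Settings | None = None) -> AppContext:
    """Create Settings when none is given, then build the dependencies."""
    if settings is None:
        settings = Settings()
    # Placeholder values: the plan calls internal factories here (temporarily).
    return AppContext(
        settings=settings,
        storage={"url": settings.storage_url},
        llm={"model": settings.llm_model},
        tracker=None,
        domain_memory=None,
    )
```

Callers that pass nothing get default settings, while tests or alternative entry points can pass their own `Settings` instance.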
|
|
46
|
+
|
|
47
|
+
**Files to modify (add injection path)**
|
|
48
|
+
- `src/evalvault/adapters/inbound/cli/commands/run.py`
|
|
49
|
+
- Keep existing behavior; add a `context: AppContext | None = None` option
|
|
50
|
+
- `src/evalvault/adapters/inbound/api/adapter.py`
|
|
51
|
+
- Keep existing behavior; add a `context` injection path
|
|
52
|
+
- `src/evalvault/adapters/inbound/mcp/tools.py`
|
|
53
|
+
- Use storage/llm from context
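
One way to read the "optional `context`" idea is the backward-compatible fallback below. It is only a sketch: `InMemoryStorage`, the one-field `AppContext`, and `run_command` are hypothetical stand-ins, not the actual contents of `run.py`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InMemoryStorage:
    """Hypothetical storage stand-in that records saved run names."""
    saved: list[str] = field(default_factory=list)

    def save(self, run_name: str) -> None:
        self.saved.append(run_name)


@dataclass
class AppContext:
    """Reduced to a single field for brevity."""
    storage: InMemoryStorage


def build_app_context() -> AppContext:
    """Legacy path: the command builds its own dependencies."""
    return AppContext(storage=InMemoryStorage())


def run_command(run_name: str, context: AppContext | None = None) -> AppContext:
    """Use the injected context when given; otherwise keep the old behavior."""
    ctx = context if context is not None else build_app_context()
    ctx.storage.save(run_name)
    return ctx
```

Existing callers keep calling `run_command("name")` unchanged, while an entry point or a test can pass a shared context.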
|
|
54
|
+
|
|
55
|
+
**Verification**
|
|
56
|
+
- Existing CLI/API usage works unchanged
|
|
57
|
+
- Same results when a context is injected
|
|
58
|
+
|
|
59
|
+
---
|
|
60
|
+
|
|
61
|
+
### PR2: Remove direct Settings calls from the CLI
|
|
62
|
+
**Goal**: Remove direct Settings creation from the CLI layer
|
|
63
|
+
|
|
64
|
+
**Direction of change**
|
|
65
|
+
- Create Settings/Context in the CLI entrypoint
|
|
66
|
+
- Commands use dependencies from context instead of settings
|
|
67
|
+
|
|
68
|
+
**Key files**
|
|
69
|
+
- `src/evalvault/adapters/inbound/cli/commands/*.py`
|
|
70
|
+
- `src/evalvault/adapters/inbound/cli/app.py`
|
|
71
|
+
|
|
72
|
+
**Verification**
|
|
73
|
+
- All CLI features keep working
|
|
74
|
+
- Same results when environment variables are changed
|
|
75
|
+
|
|
76
|
+
---
|
|
77
|
+
|
|
78
|
+
### PR3: Remove direct Settings calls from the API
|
|
79
|
+
**Goal**: Remove direct Settings creation from the API layer
|
|
80
|
+
|
|
81
|
+
**Direction of change**
|
|
82
|
+
- Create Settings/Context in FastAPI create_app
|
|
83
|
+
- Routers/adapters access dependencies through context
|
|
84
|
+
|
|
85
|
+
**Key files**
|
|
86
|
+
- `src/evalvault/adapters/inbound/api/main.py`
|
|
87
|
+
- `src/evalvault/adapters/inbound/api/adapter.py`
|
|
88
|
+
- `src/evalvault/adapters/inbound/api/routers/*.py`
|
|
89
|
+
|
|
90
|
+
**Verification**
|
|
91
|
+
- All API features keep working
|
|
92
|
+
- CORS/Rate limit/Auth settings are preserved
|
|
93
|
+
|
|
94
|
+
---
|
|
95
|
+
|
|
96
|
+
### PR4: Externalize factory calls
|
|
97
|
+
**Goal**: Remove the dependency on internal factories (external wiring)
|
|
98
|
+
|
|
99
|
+
**Direction of change**
|
|
100
|
+
- Call build_storage_adapter/build_domain_memory_adapter from the entrypoint
|
|
101
|
+
- Services/handlers receive only port interfaces
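
A rough illustration of "services receive only port interfaces", using `typing.Protocol`. The method names on `StoragePort`, the `DictStorage` adapter, and `EvaluationService` are invented for this sketch; the project's real ports may look quite different.

```python
from __future__ import annotations

from typing import Protocol


class StoragePort(Protocol):
    """Hypothetical port: the only thing the service knows about storage."""

    def save_run(self, run_id: str, score: float) -> None: ...

    def get_score(self, run_id: str) -> float: ...


class DictStorage:
    """Example adapter; it satisfies StoragePort structurally."""

    def __init__(self) -> None:
        self._rows: dict[str, float] = {}

    def save_run(self, run_id: str, score: float) -> None:
        self._rows[run_id] = score

    def get_score(self, run_id: str) -> float:
        return self._rows[run_id]


class EvaluationService:
    """Domain service: receives a port, never calls a factory or reads Settings."""

    def __init__(self, storage: StoragePort) -> None:
        self._storage = storage

    def record(self, run_id: str, score: float) -> float:
        self._storage.save_run(run_id, score)
        return self._storage.get_score(run_id)


def main() -> float:
    """Entry point: the only place where a concrete adapter is chosen."""
    service = EvaluationService(storage=DictStorage())
    return service.record("run-1", 0.5)
```

Swapping `DictStorage` for another adapter changes only `main`, which is the kind of external wiring PR4 aims for.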
|
|
102
|
+
|
|
103
|
+
**Key files**
|
|
104
|
+
- `src/evalvault/adapters/outbound/storage/factory.py`
|
|
105
|
+
- `src/evalvault/adapters/outbound/domain_memory/factory.py`
|
|
106
|
+
- `src/evalvault/adapters/outbound/llm/factory.py`
|
|
107
|
+
- `src/evalvault/adapters/inbound/*` (remove calls)
|
|
108
|
+
|
|
109
|
+
**Verification**
|
|
110
|
+
- The same settings produce the same adapters
|
|
111
|
+
- Tests pass
|
|
112
|
+
|
|
113
|
+
---
|
|
114
|
+
|
|
115
|
+
### PR5: Package the domain/ports separately
|
|
116
|
+
**Goal**: Complete the module separation structure
|
|
117
|
+
|
|
118
|
+
**Proposed structure**
|
|
119
|
+
```
|
|
120
|
+
evalvault-core/
|
|
121
|
+
domain/
|
|
122
|
+
ports/
|
|
123
|
+
services/
|
|
124
|
+
|
|
125
|
+
evalvault-adapters/
|
|
126
|
+
outbound/
|
|
127
|
+
inbound/
|
|
128
|
+
|
|
129
|
+
evalvault-app/
|
|
130
|
+
cli/
|
|
131
|
+
api/
|
|
132
|
+
wiring/
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
**Packaging principles**
|
|
136
|
+
- core keeps external library dependencies to a minimum
|
|
137
|
+
- adapters depends on core, but core does not know about adapters
|
|
138
|
+
- app ties core and adapters together
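
The "core does not know about adapters" rule could be checked mechanically, for example by scanning imports with the standard `ast` module. This is a sketch under assumptions: the `evalvault.adapters` prefix mirrors the current source layout, and the function names are made up here; the plan itself does not prescribe such a check.

```python
from __future__ import annotations

import ast


def imported_modules(source: str) -> set[str]:
    """Return the dotted module names imported by a piece of Python source."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module)
    return names


def violates_core_rule(source: str, forbidden_prefix: str = "evalvault.adapters") -> bool:
    """True when core source imports anything under the adapters package."""
    return any(
        name == forbidden_prefix or name.startswith(forbidden_prefix + ".")
        for name in imported_modules(source)
    )
```

Running `violates_core_rule` over each file in the core package from a unit test would flag a dependency in the wrong direction before the packages are physically split.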
|
|
139
|
+
|
|
140
|
+
**Verification**
|
|
141
|
+
- core can be installed on its own
|
|
142
|
+
- Existing CLI/API behavior is preserved in adapters/app
|
|
143
|
+
|
|
144
|
+
---
|
|
145
|
+
|
|
146
|
+
## Risks and Mitigations
|
|
147
|
+
- Risk: options dropped while removing direct Settings access
|
|
148
|
+
- Mitigation: test/snapshot comparison on every PR
|
|
149
|
+
- Risk: initialization-order problems in the CLI/APIs
|
|
150
|
+
- Mitigation: document the AppContext initialization order
|
|
151
|
+
- Risk: import paths break when external dependencies are separated
|
|
152
|
+
- Mitigation: move packages in stages
|
|
153
|
+
|
|
154
|
+
## Test Strategy
|
|
155
|
+
- PR1~PR3: existing tests pass unchanged
|
|
156
|
+
- PR4: add unit/integration tests for storage/llm/tracker
|
|
157
|
+
- PR5: standalone install test for the core package
|
|
158
|
+
|
|
159
|
+
## Open Decisions
|
|
160
|
+
- Container naming: `AppContext` vs `AppContainer`
|
|
161
|
+
- Location of external wiring: CLI entrypoint vs API create_app
|
|
162
|
+
- Repository layout for the split (monorepo vs multi-repo)
|