PyPI - microeval - Versions diffs - 0.1.0__tar.gz - Mend

microeval 0.1.0__tar.gz

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (61)

microeval-0.1.0/.beads/.gitignore +29 -0
microeval-0.1.0/.beads/README.md +81 -0
microeval-0.1.0/.beads/beads.left.jsonl +7 -0
microeval-0.1.0/.beads/beads.left.meta.json +1 -0
microeval-0.1.0/.beads/config.yaml +56 -0
microeval-0.1.0/.beads/daemon.lock +7 -0
microeval-0.1.0/.beads/issues.jsonl +7 -0
microeval-0.1.0/.beads/metadata.json +5 -0
microeval-0.1.0/.env.example +9 -0
microeval-0.1.0/.gitattributes +3 -0
microeval-0.1.0/.gitignore +173 -0
microeval-0.1.0/LICENSE +21 -0
microeval-0.1.0/PKG-INFO +479 -0
microeval-0.1.0/README.md +446 -0
microeval-0.1.0/evals-consultant/prompts/candidate-skills.txt +59 -0
microeval-0.1.0/evals-consultant/prompts/candidate-summary.txt +29 -0
microeval-0.1.0/evals-consultant/queries/consultant.yaml +121 -0
microeval-0.1.0/evals-consultant/results/consultant-groq-llama-3.3-70b.yaml +72 -0
microeval-0.1.0/evals-consultant/runs/consultant-bedrock-nova-pro.yaml +12 -0
microeval-0.1.0/evals-consultant/runs/consultant-groq-llama-3.3-70b.yaml +12 -0
microeval-0.1.0/evals-consultant/runs/consultant-ollama-llama3.2.yaml +12 -0
microeval-0.1.0/evals-consultant/runs/consultant-openai-gpt-4o.yaml +12 -0
microeval-0.1.0/evals-engineer/prompts/candidate-skills.txt +59 -0
microeval-0.1.0/evals-engineer/prompts/candidate-summary.txt +29 -0
microeval-0.1.0/evals-engineer/queries/engineer.yaml +124 -0
microeval-0.1.0/evals-engineer/results/engineer-bedrock-nova-pro.yaml +76 -0
microeval-0.1.0/evals-engineer/results/engineer-groq-llama-3.3-70b.yaml +64 -0
microeval-0.1.0/evals-engineer/results/engineer-ollama-llama3.2.yaml +254 -0
microeval-0.1.0/evals-engineer/results/engineer-openai-gpt-4o.yaml +70 -0
microeval-0.1.0/evals-engineer/runs/engineer-bedrock-nova-pro.yaml +11 -0
microeval-0.1.0/evals-engineer/runs/engineer-groq-llama-3.3-70b.yaml +11 -0
microeval-0.1.0/evals-engineer/runs/engineer-ollama-llama3.2.yaml +11 -0
microeval-0.1.0/evals-engineer/runs/engineer-openai-gpt-4o.yaml +11 -0
microeval-0.1.0/microeval/__init__.py +0 -0
microeval-0.1.0/microeval/chat.py +62 -0
microeval-0.1.0/microeval/chat_client.py +1052 -0
microeval-0.1.0/microeval/cli.py +150 -0
microeval-0.1.0/microeval/config.json +14 -0
microeval-0.1.0/microeval/evaluator.py +330 -0
microeval-0.1.0/microeval/graph.py +133 -0
microeval-0.1.0/microeval/index.html +1306 -0
microeval-0.1.0/microeval/runner.py +139 -0
microeval-0.1.0/microeval/sample-evals/prompts/summarize.txt +13 -0
microeval-0.1.0/microeval/sample-evals/queries/summarize.yaml +13 -0
microeval-0.1.0/microeval/sample-evals/results/summarize-openai-gpt-4o.yaml +60 -0
microeval-0.1.0/microeval/sample-evals/runs/summarize-bedrock-nova.yaml +12 -0
microeval-0.1.0/microeval/sample-evals/runs/summarize-groq-llama.yaml +12 -0
microeval-0.1.0/microeval/sample-evals/runs/summarize-ollama-llama3.yaml +12 -0
microeval-0.1.0/microeval/sample-evals/runs/summarize-openai-gpt-4o.yaml +15 -0
microeval-0.1.0/microeval/schemas.py +127 -0
microeval-0.1.0/microeval/server.py +457 -0
microeval-0.1.0/microeval/setup_logger.py +51 -0
microeval-0.1.0/microeval/yaml_utils.py +55 -0
microeval-0.1.0/pyproject.toml +43 -0
microeval-0.1.0/sample-evals/prompts/summarize.txt +13 -0
microeval-0.1.0/sample-evals/queries/summarize.yaml +13 -0
microeval-0.1.0/sample-evals/results/summarize-groq-llama.yaml +60 -0
microeval-0.1.0/sample-evals/runs/summarize-bedrock-nova.yaml +12 -0
microeval-0.1.0/sample-evals/runs/summarize-groq-llama.yaml +12 -0
microeval-0.1.0/sample-evals/runs/summarize-ollama-llama3.yaml +12 -0
microeval-0.1.0/sample-evals/runs/summarize-openai-gpt-4o.yaml +12 -0
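
Each entry in the file list above has the shape `path +added -removed`, so the totals can be tallied mechanically. The following is a minimal sketch under that assumption; the names `STAT_LINE` and `parse_stat` are illustrative and are not part of microeval or the diff viewer.

```python
import re

# One file-list entry: "<path> +<added> -<removed>"
STAT_LINE = re.compile(r"^(?P<path>\S+) \+(?P<added>\d+) -(?P<removed>\d+)$")

def parse_stat(line):
    """Split one file-list entry into (path, added, removed)."""
    match = STAT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a file-list entry: {line!r}")
    return match["path"], int(match["added"]), int(match["removed"])

# Two entries copied from the list above.
sample = [
    "microeval-0.1.0/microeval/cli.py +150 -0",
    "microeval-0.1.0/microeval/runner.py +139 -0",
]

total_added = sum(parse_stat(entry)[1] for entry in sample)
print(total_added)  # 289
```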

microeval-0.1.0/.beads/.gitignore ADDED

@@ -0,0 +1,29 @@
+# SQLite databases
+*.db
+*.db?*
+*.db-journal
+*.db-wal
+*.db-shm
+# Daemon runtime files
+daemon.lock
+daemon.log
+daemon.pid
+bd.sock
+# Legacy database files
+db.sqlite
+bd.db
+# Merge artifacts (temporary files from 3-way merge)
+beads.base.jsonl
+beads.base.meta.json
+beads.left.jsonl
+beads.left.meta.json
+beads.right.jsonl
+beads.right.meta.json
+# Keep JSONL exports and config (source of truth for git)
+!issues.jsonl
+!metadata.json
+!config.json

microeval-0.1.0/.beads/README.md ADDED

@@ -0,0 +1,81 @@
+# Beads - AI-Native Issue Tracking
+Welcome to Beads! This repository uses **Beads** for issue tracking - a modern, AI-native tool designed to live directly in your codebase alongside your code.
+## What is Beads?
+Beads is issue tracking that lives in your repo, making it perfect for AI coding agents and developers who want their issues close to their code. No web UI required - everything works through the CLI and integrates seamlessly with git.
+**Learn more:** [github.com/steveyegge/beads](https://github.com/steveyegge/beads)
+## Quick Start
+### Essential Commands
+```bash
+# Create new issues
+bd create "Add user authentication"
+# View all issues
+bd list
+# View issue details
+bd show <issue-id>
+# Update issue status
+bd update <issue-id> --status in-progress
+bd update <issue-id> --status done
+# Sync with git remote
+bd sync
+```
+### Working with Issues
+Issues in Beads are:
+- **Git-native**: Stored in `.beads/issues.jsonl` and synced like code
+- **AI-friendly**: CLI-first design works perfectly with AI coding agents
+- **Branch-aware**: Issues can follow your branch workflow
+- **Always in sync**: Auto-syncs with your commits
+## Why Beads?
+✨ **AI-Native Design**
+- Built specifically for AI-assisted development workflows
+- CLI-first interface works seamlessly with AI coding agents
+- No context switching to web UIs
+🚀 **Developer Focused**
+- Issues live in your repo, right next to your code
+- Works offline, syncs when you push
+- Fast, lightweight, and stays out of your way
+🔧 **Git Integration**
+- Automatic sync with git commits
+- Branch-aware issue tracking
+- Intelligent JSONL merge resolution
+## Get Started with Beads
+Try Beads in your own projects:
+```bash
+# Install Beads
+curl -sSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash
+# Initialize in your repo
+bd init
+# Create your first issue
+bd create "Try out Beads"
+```
+## Learn More
+- **Documentation**: [github.com/steveyegge/beads/docs](https://github.com/steveyegge/beads/tree/main/docs)
+- **Quick Start Guide**: Run `bd quickstart`
+- **Examples**: [github.com/steveyegge/beads/examples](https://github.com/steveyegge/beads/tree/main/examples)
+---
+*Beads: Issue tracking that moves at the speed of thought* ⚡

microeval-0.1.0/.beads/beads.left.jsonl ADDED

@@ -0,0 +1,7 @@
+{"id":"eval-0so","title":"Add validation for config.json format","description":"Add schema validation for config.json to ensure chat_models and embed_models are properly formatted. Provide helpful error messages if config is invalid.","status":"open","priority":2,"issue_type":"task","created_at":"2025-11-30T15:55:31.313033+11:00","updated_at":"2025-11-30T15:55:31.313033+11:00"}
+{"id":"eval-519","title":"Add documentation for --evals-dir parameter usage","description":"Document the --evals-dir command-line parameter for both runner.py and server.py, including usage examples for switching between evals-engineer and evals-consultant.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-11-30T15:55:08.697823+11:00","updated_at":"2025-11-30T15:59:02.552661+11:00","closed_at":"2025-11-30T15:59:02.552661+11:00"}
+{"id":"eval-dx6","title":"Update README.md to reflect new evals-engineer/evals-consultant structure","description":"README.md still references old evals directory. Need to update documentation to show split into evals-engineer and evals-consultant directories.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-11-30T15:55:00.792861+11:00","updated_at":"2025-11-30T15:58:38.70658+11:00","closed_at":"2025-11-30T15:58:38.70658+11:00"}
+{"id":"eval-fy3","title":"Update graph.html to work with new directory structure","description":"Graph.html and graph-data.js need to be updated to read from either evals-engineer or evals-consultant directory structure. Currently references old evals/results path.","status":"open","priority":2,"issue_type":"task","created_at":"2025-11-30T15:55:22.977392+11:00","updated_at":"2025-11-30T15:55:22.977392+11:00"}
+{"id":"eval-mrs","title":"Add export/import functionality for evaluation configurations","description":"Add ability to export evaluation configurations (runs, queries, prompts) to a portable format and import them into different environments or share with other users.","status":"open","priority":3,"issue_type":"feature","created_at":"2025-11-30T15:55:38.520227+11:00","updated_at":"2025-11-30T15:55:38.520227+11:00"}
+{"id":"eval-rpx","title":"Add ability to switch between evals-engineer and evals-consultant in the UI","description":"Add a dropdown or toggle in the web UI to switch between evals-engineer and evals-consultant directories without restarting the server.","status":"open","priority":2,"issue_type":"feature","created_at":"2025-11-30T15:55:15.623639+11:00","updated_at":"2025-11-30T15:55:15.623639+11:00"}
+{"id":"eval-y1v","title":"Add comparison view for results across different models","description":"Create a UI view that allows side-by-side comparison of evaluation results across different models (e.g., compare gpt-4o vs llama3.2 vs bedrock-nova performance on same query).","status":"open","priority":3,"issue_type":"feature","created_at":"2025-11-30T15:55:46.05204+11:00","updated_at":"2025-11-30T15:55:46.05204+11:00"}

microeval-0.1.0/.beads/beads.left.meta.json ADDED

@@ -0,0 +1 @@
+{"version":"0.26.0","timestamp":"2025-12-14T10:15:10.811603+11:00","commit":"34a150f"}

microeval-0.1.0/.beads/config.yaml ADDED

@@ -0,0 +1,56 @@
+# Beads Configuration File
+# This file configures default behavior for all bd commands in this repository
+# All settings can also be set via environment variables (BD_* prefix)
+# or overridden with command-line flags
+# Issue prefix for this repository (used by bd init)
+# If not set, bd init will auto-detect from directory name
+# Example: issue-prefix: "myproject" creates issues like "myproject-1", "myproject-2", etc.
+# issue-prefix: ""
+# Use no-db mode: load from JSONL, no SQLite, write back after each command
+# When true, bd will use .beads/issues.jsonl as the source of truth
+# instead of SQLite database
+# no-db: false
+# Disable daemon for RPC communication (forces direct database access)
+# no-daemon: false
+# Disable auto-flush of database to JSONL after mutations
+# no-auto-flush: false
+# Disable auto-import from JSONL when it's newer than database
+# no-auto-import: false
+# Enable JSON output by default
+# json: false
+# Default actor for audit trails (overridden by BD_ACTOR or --actor)
+# actor: ""
+# Path to database (overridden by BEADS_DB or --db)
+# db: ""
+# Auto-start daemon if not running (can also use BEADS_AUTO_START_DAEMON)
+# auto-start-daemon: true
+# Debounce interval for auto-flush (can also use BEADS_FLUSH_DEBOUNCE)
+# flush-debounce: "5s"
+# Multi-repo configuration (experimental - bd-307)
+# Allows hydrating from multiple repositories and routing writes to the correct JSONL
+# repos:
+#   primary: "."  # Primary repo (where this database lives)
+#   additional:   # Additional repos to hydrate from (read-only)
+#     - ~/beads-planning  # Personal planning repo
+#     - ~/work-planning   # Work planning repo
+# Integration settings (access with 'bd config get/set')
+# These are stored in the database, not in this file:
+# - jira.url
+# - jira.project
+# - linear.url
+# - linear.api-key
+# - github.org
+# - github.repo
+# - sync.branch - Git branch for beads commits (use BEADS_SYNC_BRANCH env var or bd config set)

microeval-0.1.0/.beads/daemon.lock ADDED

@@ -0,0 +1,7 @@
+{
+  "pid": 2865,
+  "parent_pid": 2860,
+  "database": "/Users/boscoh/p/starteval/.beads/beads.db",
+  "version": "0.26.0",
+  "started_at": "2025-12-15T07:22:28.962716Z"
+}

microeval-0.1.0/.beads/issues.jsonl ADDED

@@ -0,0 +1,7 @@
+{"id":"eval-0so","title":"Add validation for config.json format","description":"Add schema validation for config.json to ensure chat_models and embed_models are properly formatted. Provide helpful error messages if config is invalid.","status":"open","priority":2,"issue_type":"task","created_at":"2025-11-30T15:55:31.313033+11:00","updated_at":"2025-11-30T15:55:31.313033+11:00"}
+{"id":"eval-519","title":"Add documentation for --evals-dir parameter usage","description":"Document the --evals-dir command-line parameter for both runner.py and server.py, including usage examples for switching between evals-engineer and evals-consultant.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-11-30T15:55:08.697823+11:00","updated_at":"2025-11-30T15:59:02.552661+11:00","closed_at":"2025-11-30T15:59:02.552661+11:00"}
+{"id":"eval-dx6","title":"Update README.md to reflect new evals-engineer/evals-consultant structure","description":"README.md still references old evals directory. Need to update documentation to show split into evals-engineer and evals-consultant directories.","status":"closed","priority":1,"issue_type":"task","created_at":"2025-11-30T15:55:00.792861+11:00","updated_at":"2025-11-30T15:58:38.70658+11:00","closed_at":"2025-11-30T15:58:38.70658+11:00"}
+{"id":"eval-fy3","title":"Update graph.html to work with new directory structure","description":"Graph.html and graph-data.js need to be updated to read from either evals-engineer or evals-consultant directory structure. Currently references old evals/results path.","status":"open","priority":2,"issue_type":"task","created_at":"2025-11-30T15:55:22.977392+11:00","updated_at":"2025-11-30T15:55:22.977392+11:00"}
+{"id":"eval-mrs","title":"Add export/import functionality for evaluation configurations","description":"Add ability to export evaluation configurations (runs, queries, prompts) to a portable format and import them into different environments or share with other users.","status":"open","priority":3,"issue_type":"feature","created_at":"2025-11-30T15:55:38.520227+11:00","updated_at":"2025-11-30T15:55:38.520227+11:00"}
+{"id":"eval-rpx","title":"Add ability to switch between evals-engineer and evals-consultant in the UI","description":"Add a dropdown or toggle in the web UI to switch between evals-engineer and evals-consultant directories without restarting the server.","status":"open","priority":2,"issue_type":"feature","created_at":"2025-11-30T15:55:15.623639+11:00","updated_at":"2025-11-30T15:55:15.623639+11:00"}
+{"id":"eval-y1v","title":"Add comparison view for results across different models","description":"Create a UI view that allows side-by-side comparison of evaluation results across different models (e.g., compare gpt-4o vs llama3.2 vs bedrock-nova performance on same query).","status":"open","priority":3,"issue_type":"feature","created_at":"2025-11-30T15:55:46.05204+11:00","updated_at":"2025-11-30T15:55:46.05204+11:00"}
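
The issue export above is plain JSON Lines, one issue object per line, so it can be summarized with the standard library alone. The following is a minimal sketch, assuming only the `status` field seen in the records; the helper name `summarize_issues` and the shortened sample records are illustrative and are not part of Beads or microeval.

```python
import json
from collections import Counter

def summarize_issues(lines):
    """Count issues per status from an iterable of JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        counts[json.loads(line)["status"]] += 1
    return dict(counts)

# Shortened stand-ins for three of the records above (most fields omitted).
sample = [
    '{"id":"eval-0so","status":"open","priority":2}',
    '{"id":"eval-519","status":"closed","priority":1}',
    '{"id":"eval-dx6","status":"closed","priority":1}',
]

print(summarize_issues(sample))  # {'open': 1, 'closed': 2}
```

Applied to the seven records in this file, the same count would be five open and two closed issues.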

microeval-0.1.0/.beads/metadata.json ADDED

@@ -0,0 +1,5 @@
+{
+  "database": "beads.db",
+  "jsonl_export": "issues.jsonl",
+  "last_bd_version": "0.26.0"
+}

microeval-0.1.0/.env.example ADDED

@@ -0,0 +1,9 @@
+# OpenAI API Configuration
+OPENAI_API_KEY=your_openai_api_key_here
+# Groq API Configuration
+GROQ_API_KEY=your_groq_api_key_here
+# AWS Configuration (recommended - configure via ~/.aws/credentials)
+AWS_PROFILE=your_aws_profile_name
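
A hedged sketch of how a script might check which of these variables are set before choosing a provider. The variable names come from `.env.example` above; the provider labels, the mapping of `AWS_PROFILE` to a "bedrock" label, and the function `configured_providers` are assumptions for illustration and do not describe how microeval itself reads its configuration.

```python
import os

# Variable names from .env.example; the provider labels are assumed.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "bedrock": "AWS_PROFILE",
}

def configured_providers(env=None):
    """Return the providers whose variable is set to a non-empty value."""
    env = os.environ if env is None else env
    return sorted(
        name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)
    )

# An explicit mapping keeps the example independent of the real environment.
print(configured_providers({"OPENAI_API_KEY": "sk-test", "GROQ_API_KEY": ""}))
# ['openai']
```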

microeval-0.1.0/.gitattributes ADDED

@@ -0,0 +1,3 @@
+# Use bd merge for beads JSONL files
+.beads/issues.jsonl merge=beads

microeval-0.1.0/.gitignore ADDED

@@ -0,0 +1,173 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+.pybuilder/
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# pytype static type analyzer
+.pytype/
+# Cython debug symbols
+cython_debug/
+# PyCharm
+.idea/
+*.iml
+# VS Code
+.vscode/
+*.code-workspace
+# macOS
+.DS_Store
+.AppleDouble
+.LSOverride
+# Windows
+Thumbs.db
+ehthumbs.db
+Desktop.ini
+$RECYCLE.BIN/
+# Linux
+*~
+# Local development
+*.local
+# Project specific
+*.db
+*.sqlite3
+*.log
+*.pid
+*.pid.lock
+uv.lock
+.ruff_cache

microeval-0.1.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 Bosco Ho
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.