agent_c 2.71828
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.rubocop.yml +10 -0
- data/.ruby-version +1 -0
- data/CLAUDE.md +21 -0
- data/README.md +360 -0
- data/Rakefile +16 -0
- data/TODO.md +12 -0
- data/agent_c.gemspec +38 -0
- data/docs/chat-methods.md +157 -0
- data/docs/cost-reporting.md +86 -0
- data/docs/pipeline-tips-and-tricks.md +71 -0
- data/docs/session-configuration.md +274 -0
- data/docs/testing.md +747 -0
- data/docs/tools.md +103 -0
- data/docs/versioned-store.md +840 -0
- data/lib/agent_c/agent/chat.rb +211 -0
- data/lib/agent_c/agent/chat_response.rb +32 -0
- data/lib/agent_c/agent/chats/anthropic_bedrock.rb +48 -0
- data/lib/agent_c/batch.rb +102 -0
- data/lib/agent_c/configs/repo.rb +90 -0
- data/lib/agent_c/context.rb +56 -0
- data/lib/agent_c/costs/data.rb +39 -0
- data/lib/agent_c/costs/report.rb +219 -0
- data/lib/agent_c/db/store.rb +162 -0
- data/lib/agent_c/errors.rb +19 -0
- data/lib/agent_c/pipeline.rb +188 -0
- data/lib/agent_c/processor.rb +98 -0
- data/lib/agent_c/prompts.yml +53 -0
- data/lib/agent_c/schema.rb +85 -0
- data/lib/agent_c/session.rb +207 -0
- data/lib/agent_c/store.rb +72 -0
- data/lib/agent_c/test_helpers.rb +173 -0
- data/lib/agent_c/tools/dir_glob.rb +46 -0
- data/lib/agent_c/tools/edit_file.rb +112 -0
- data/lib/agent_c/tools/file_metadata.rb +43 -0
- data/lib/agent_c/tools/grep.rb +119 -0
- data/lib/agent_c/tools/paths.rb +36 -0
- data/lib/agent_c/tools/read_file.rb +94 -0
- data/lib/agent_c/tools/run_rails_test.rb +87 -0
- data/lib/agent_c/tools.rb +60 -0
- data/lib/agent_c/utils/git.rb +75 -0
- data/lib/agent_c/utils/shell.rb +58 -0
- data/lib/agent_c/version.rb +5 -0
- data/lib/agent_c.rb +32 -0
- data/lib/versioned_store/base.rb +314 -0
- data/lib/versioned_store/config.rb +26 -0
- data/lib/versioned_store/stores/schema.rb +127 -0
- data/lib/versioned_store/version.rb +5 -0
- data/lib/versioned_store.rb +5 -0
- data/template/Gemfile +9 -0
- data/template/Gemfile.lock +152 -0
- data/template/README.md +61 -0
- data/template/Rakefile +50 -0
- data/template/bin/rake +27 -0
- data/template/lib/autoload.rb +10 -0
- data/template/lib/config.rb +59 -0
- data/template/lib/pipeline.rb +19 -0
- data/template/lib/prompts.yml +57 -0
- data/template/lib/store.rb +17 -0
- data/template/test/pipeline_test.rb +221 -0
- data/template/test/test_helper.rb +18 -0
- metadata +191 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 8321a1602d20f566b59365d641fecb934340b043d6544e43b01b2951e947282b
+  data.tar.gz: bf3cd0d58944294f4e83e080ac7aebfff8677798c96426411e559c8f7e7f6a7d
+SHA512:
+  metadata.gz: 5dbe7a3d1ca921db961a5a3685c01f4e539b0358506cb45f9d6a04c462cb0657efa45eff16eea4f65c00bdb6f113b40551b38e71f5d61ceadf8463cec1ea9007
+  data.tar.gz: 4b407bc2cf7086536bce703b209552b8753c6b3eef37b096437728095945d247a5b6643113487ad739339e407b6e4c43c6b0cdd72aeca3f75551ab9e1ac763e0
data/.rubocop.yml
ADDED
data/.ruby-version
ADDED
@@ -0,0 +1 @@
+3.2.9
data/CLAUDE.md
ADDED
@@ -0,0 +1,21 @@
+# Rules
+
+- Leave modules as modules if they do not need state. If they apply behaviors to other schemas/classes, leave them as modules.
+- Prefer not to store lambdas in variables unless necessary. If they are just going to be passed to other methods, leave them as blocks.
+- If a class is not trivial (e.g., more than one method and/or more than about 30 lines), extract it into its own file.
+- This project uses Zeitwerk. You should not use require_relative; just match the module names to the file path and it will load automatically.
+- When you commit, use the --no-gpg-sign flag. Start commit messages with "claude: ".
+- DO NOT add example scripts. Either add it to the README or make a test.
+- DO NOT add documentation outside of the README.
+- DO NOT program defensively. If something should respond_to?() a method, just invoke the method. An error is better than a false positive.
+- If you need to test a one-off script, write a test case and run it instead of writing a temporary file or using a giant shell script.
+- DO NOT edit the singleton class of an object. If you think you need to do this, ideas for avoiding it: inject an object, create a module and include it, or make a base class.
+
+# TESTING
+
+- We do not use stubbing in our tests. If you need to stub something (or monkey-patch it) to test it, that thing should be injectable.
+- Run tests with `bin/rake test`. You can pass TESTOPTS to run a specific file.
+
+# Style
+
+- For multiline Strings, always use a HEREDOC.
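The multiline-String rule above refers to standard Ruby behavior; a minimal sketch of the squiggly HEREDOC it asks for (plain Ruby, nothing from this gem):

```ruby
# The squiggly heredoc (<<~) strips the common leading indentation,
# which is why it reads cleanly even inside nested code.
message = <<~TEXT
  Hello,
  world!
TEXT

puts message  # => "Hello,\nworld!\n"
```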
data/README.md
ADDED
@@ -0,0 +1,360 @@
+# AgentC
+
+A small Ruby wrapper around [RubyLLM](https://github.com/alexrudall/ruby_llm) that helps you write a pipeline of AI prompts and run it many times, in bulk. Built for automating repetitive refactors across a large codebase.
+
+<small>Most of what's below is generated by an LLM. I take no responsibility for any of it, unless it's awesome... then it was pure prompting skills, which I will take credit for.</small>
+
+## Overview
+
+AgentC provides batch processing and pipeline orchestration for AI-powered tasks:
+
+- **Batch processing** - Execute pipelines across multiple records with automatic parallelization via worktrees
+- **Pipeline orchestration** - Define multi-step workflows with AI-powered agent steps and custom logic
+- **Resumable execution** - Automatically skip completed steps when pipelines are rerun
+- **Automatic query persistence** - All interactions saved to SQLite
+- **Cost tracking** - Detailed reports on token usage and costs
+- **Custom tools** - File operations, grep, Rails tests, and more
+- **Schema validation** - RubyLLM Schema support for structured responses
+
+## Installation
+
+This gem is not pushed to rubygems. Instead, you should add a git reference to your Gemfile (use a revision, because I'm going to make changes with complete disregard for backwards compatibility).
+
+## Example template
+
+See an [example template](./template) you can run in the `template/` directory of this repo. Poke around there after perusing this section.
+
+You can copy this template to start building your own.
+
+## Quick Start
+
+A "Pipeline" is a series of prompts for Claude to perform. Data gathered from prior steps is fed into subsequent steps (you'll define an ActiveRecord class to capture the data). If any step fails, the pipeline aborts.
+
+A "Batch" is a collection of pipelines to be run. They can be run against a single directory in series, or concurrently across multiple git worktrees. If a pipeline fails, the failure will be recorded but the batch will continue.
+
+### The necessary structures
+
+In this example, we'll have Claude choose a random file, summarize its contents in a language of our choosing, then write it to disk and commit.
+
+```ruby
+# Define the records your agent will interact with.
+# Normally you'd only have one record.
+#
+# A versioned store saves a full db backup per-transaction
+# so that you can recover from any step of the process.
+# Just trying to save tokens...
+class MyStore < VersionedStore::Base
+  include AgentC::Store
+
+  record(:summary) do
+    # the migration schema is defined inline
+    schema do |t|
+      # we'll input this data
+      t.string(:language)
+
+      # claude will generate this data
+      t.string(:input_path)
+      t.text(:summary_text)
+      t.text(:summary_path)
+    end
+
+    # this is the body of your ActiveRecord class;
+    # add methods here as needed
+  end
+end
+
+# A "pipeline" processes a single record
+class MyPipeline < AgentC::Pipeline
+  # The prompts for these steps will
+  # live in our prompts.yml file
+  agent_step(:analyze_code)
+  agent_step(:write_summary_to_file)
+
+  step(:finalize) do
+    repo.commit_all("claude: analyzed code")
+  end
+
+  # if this pipeline fails, we want to
+  # leave the repo in a clean state
+  # for the next pipeline.
+  on_failure do
+    repo.reset_hard_all
+  end
+end
+```
+
+```yaml
+# define your prompts in a prompts.yml file:
+
+en:
+
+  # the key names must match up to the `agent_step` invocation above
+  analyze_code:
+    # prompts here will be cached across pipelines.
+    # These prompts cannot interpolate any attributes.
+    # Suggested use is to put as much in the cached_prompts
+    # as possible and put variable data in the prompt.
+    cached_prompts:
+      - "Choose a random file. Read it and summarize it in the provided language."
+
+    # You can interpolate any attribute from your record class
+    prompt: "language: %{language}"
+
+    # Tools available:
+    # - dir_glob
+    # - read_file
+    # - edit_file
+    # - grep
+    # - run_rails_test
+    # you can add more...
+    tools: [read_file, dir_glob]
+
+    # The response schema defines what Claude will return.
+    # The keys must be attributes from your record. What Claude
+    # returns will automatically be saved to your record.
+    response_schema:
+      summary_text:
+        type: string # this is the default
+        required: true # this is the default
+        description: "The summary text"
+      input_path:
+        type: string # this is the default
+        required: true # this is the default
+        description: "The path of the file you summarized"
+
+  write_summary_to_file:
+    cached_prompts:
+      - |
+        You will be given some text.
+        Choose a well-named file and write the text to it.
+
+    prompt: "Here is the text to write: %{summary_text}"
+    tools: [edit_file]
+    response_schema:
+      summary_path:
+        description: "the path of the file you wrote"
+```
+
+Now, make a Batch and invoke it. A batch requires a lot of configuration related to data storage, where your repo is, and Claude API credentials:
+
+```ruby
+batch = Batch.new(
+  record_type: :summary,  # the record name you want to work on
+  pipeline: MyPipeline,   # the Pipeline class you made
+
+  # A batch has a "project" and a "run". These are ways
+  # to track Claude usage. Your Batch will have a
+  # "project". Each time you call Batch.new you get
+  # a new "run".
+  project: "TemplateProject",
+
+  # We'll set some spending limits. Once these are
+  # reached, the Batch will abort.
+  max_spend_project: 100.0,
+  max_spend_run: 20.0,
+
+  store: {
+    class: MyStore, # the Store class you made
+    config: {
+      logger: Logger.new("/dev/null"), # a logger for the store
+      dir: "/where/you/want/your/store/saved"
+    }
+  },
+
+  # Where Claude will work
+  workspace: {
+    dir: "/where/claude/will/be/working",
+    env: {
+      # available to your tools;
+      # only used by run_rails_test currently
+      SOME_ENV_VAR: "1"
+    }
+  },
+
+  # If you prefer, you can have the Batch manage
+  # some git worktrees for you. It will parallelize
+  # your tasks across your worktrees for MAXIMUM
+  # TOKEN BURN.
+  #
+  # Worktrees will be created for you if you are
+  # starting a new Batch. If you are continuing an
+  # existing Batch (after an error, for example),
+  # the worktrees will be left in their current
+  # state.
+  #
+  # You must pass *either* a workspace or a repo
+  repo: {
+    dir: "/path/to/your/repo",
+
+    # an existing git revision or branch name
+    initial_revision: "main",
+
+    # optional: limit Claude to a subdir of your repo
+    working_subdir: "./",
+
+    # Where to put your worktrees
+    worktrees_root_dir: "/tmp/example-worktrees",
+
+    # Each worktree gets a branch; they'll be suffixed
+    # with a counter
+    worktree_branch_prefix: "summary-examples",
+
+    # Currently, this defines how many worktrees to
+    # create. It's obnoxious, I know, but hey, it works.
+    worktree_envs: [{}, {}],
+  },
+
+  # The Claude configuration:
+  session: {
+    # all chats with Claude are saved to a sqlite db.
+    # This is separate from your Store's db, because
+    # why throw anything away? Can be useful for
+    # debugging why Claude did what it did.
+    agent_db_path: "/path/to/your/claude/db.sqlite",
+    logger: Logger.new("/dev/null"), # probably use the same logger for everything...
+    i18n_path: "/path/to/your/prompts.yml",
+
+    # as you debug your pipeline, you'll probably run it
+    # many times. We tag all Claude chat records with a
+    # project so you can track costs.
+    project: "SomeProject",
+
+    # only available for Bedrock...
+    ruby_llm: {
+      bedrock_api_key: ENV.fetch("AWS_ACCESS_KEY_ID"),
+      bedrock_secret_key: ENV.fetch("AWS_SECRET_ACCESS_KEY"),
+      bedrock_session_token: ENV.fetch("AWS_SESSION_TOKEN"),
+      bedrock_region: ENV.fetch("AWS_REGION", "us-west-2"),
+      default_model: ENV.fetch("LLM_MODEL", "us.anthropic.claude-sonnet-4-5-20250929-v1:0")
+    }
+  },
+)
+
+# WHEW, that's a lot of config.
+
+# Now we add some records for processing.
+# The batch's "store" is just a bunch of
+# ActiveRecord classes, but you reference
+# them by the name you gave them in the
+# store.
+#
+# We'll add some summary records.
+# This seeded data represents the input
+# into your pipelines.
+#
+# Because your batch can be stopped and
+# restarted, we need our data creation
+# to be idempotent.
+record_1 = (
+  batch
+    .store
+    .summary
+    .find_or_create_by!(language: "english")
+)
+record_2 = (
+  batch
+    .store
+    .summary
+    .find_or_create_by!(language: "spanish")
+)
+
+# Add the records to be processed.
+# add_task is idempotent
+batch.add_task(record_1)
+batch.add_task(record_2)
+
+batch.call
+
+# See the details of what happened
+puts batch.report
+# =>
+# Summary report:
+# Succeeded: 2
+# Pending: 0
+# Failed: 0
+# Run cost: $2.34
+# Project total cost: $10.40
+# ---
+# task: 1 - wrote summary to /tmp/example-worktrees/summary-examples-0/CHAT_TEST_SUMMARY.md
+# task: 2 - wrote summary to /tmp/example-worktrees/summary-examples-1/RESUMEN_BASE.md
+
+# Get a more detailed breakdown
+cost = batch.cost
+
+# Explore the records created
+tasks = batch.store.task.all
+summaries = batch.store.summary.all
+```
+
+You can tail your logs to see what's happening. The full text of your Claude chats is logged at DEBUG.
+
+If you just want to see your pipeline's progression:
+
+```shell
+# Only see INFO
+tail -f /path/to/log.log | grep INFO
+```
+
+#### Batch errors
+
+If your batch is interrupted (by an exception, or because you kill it), you can continue it by simply running your batch again. The progress is persisted in the Batch's store.
+
+If you need to correct any data or go back in time, you can peruse the store's versions:
+
+```ruby
+# see how many versions
+puts batch.store.versions.count
+
+# peruse your store:
+batch.store.versions[12].summary.count
+
+# restore a prior version
+batch.store.versions[12].restore
+
+# re-run the batch
+batch.call
+```
+
+### Resetting a Batch
+
+You can delete the sqlite database for your store.
+
+Delete the database you configured at `store: { config: { dir: "/path/to/db" } }`.
+
+### Debugging a Batch
+
+If you make multiple worktrees, they will be processed concurrently. This makes things hard to debug using `binding.irb`.
+
+I suggest making one worktree until it's running successfully.
+
+### Structuring your project
+
+I suggest following the structure of the [example template](./template).
+
+# Detailed Documentation
+
+Detailed guides for all features:
+
+- **[Main README](../README.md)** - Batch and Pipeline usage (primary approach)
+- **[Pipeline Tips and Tricks](docs/pipeline-tips-and-tricks.md)** - Useful patterns and techniques for pipelines
+- **[Chat Methods](docs/chat-methods.md)** - Using session.prompt and session.chat for direct interactions
+- **[Tools](docs/tools.md)** - Built-in tools for file operations and code interaction
+- **[Testing](docs/testing.md)** - Using TestHelpers::DummyChat for testing without real LLM calls
+- **[Cost Reporting](docs/cost-reporting.md)** - Track token usage and costs
+- **[Session Configuration](docs/session-configuration.md)** - All configuration options
+- **[Store Versioning](docs/versioned-store.md)** - Versioned store usage and recovery
+
+## Requirements
+
+- Ruby >= 3.0.0
+- AWS credentials configured for Bedrock access
+- SQLite3
+
+## License
+
+WTFPL - Do What The Fuck You Want To Public License
+
+## Author
+
+Pete Kinnecom (git@k7u7.com)
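The `%{language}` placeholder in the prompts.yml example above is plain Ruby format-string interpolation, which `String#%` performs with a Hash of named references; a minimal standalone sketch (attribute names taken from the example, no AgentC code involved):

```ruby
# String#% substitutes %{name} placeholders from a Hash
# of named references, the same mechanism that lets a
# prompt template interpolate record attributes.
template = "language: %{language}"
attributes = { language: "spanish" }

prompt = template % attributes
puts prompt  # => "language: spanish"
```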
data/Rakefile
ADDED
@@ -0,0 +1,16 @@
+# frozen_string_literal: true
+
+require "bundler/setup"
+require "rake/testtask"
+
+Rake::TestTask.new(:test) do |t|
+  t.libs << "test"
+  t.libs << "."
+  t.test_files = FileList["test/**/*_test.rb"]
+  t.verbose = false
+  t.warning = false
+  # Enable parallel test execution based on number of processors
+  ENV["MT_CPU"] ||= Etc.nprocessors.to_s
+end
+
+task default: :test
data/TODO.md
ADDED
@@ -0,0 +1,12 @@
+# TODOs
+
+Things I'd like to work on:
+
+- Make injecting a Chat record simpler.
+- Make injecting Git simpler (make injecting anything easier)
+- Add a request queue to AgentC::Chat so that we can rate-limit and retry on error
+- Use spring for run_rails_test, but add a timeout condition where it kills the
+  process if no stdout appears for a while and tries again without spring.
+- Tool calls should write the full results to a file (except for read_file) and pass
+  back a reference for future queries. For example, if RunRailsTest gives way too
+  much output, we have to truncate, but how to see the rest?
data/agent_c.gemspec
ADDED
@@ -0,0 +1,38 @@
+# frozen_string_literal: true
+
+require_relative "lib/agent_c/version"
+
+Gem::Specification.new do |spec|
+  spec.name = "agent_c"
+  spec.version = AgentC::VERSION
+  spec.authors = ["Pete Kinnecom"]
+  spec.email = ["git@k7u7.com"]
+
+  spec.summary = <<~TEXT.strip
+    Batch processing for pipelines of steps for AI. AgentC, get it?
+  TEXT
+  spec.homepage = "https://github.com/petekinnecom/agent_c"
+  spec.license = "WTFPL"
+  spec.required_ruby_version = ">= 3.0.0"
+  spec.metadata["allowed_push_host"] = "https://rubygems.org"
+  spec.metadata["homepage_uri"] = spec.homepage
+  spec.metadata["source_code_uri"] = spec.homepage
+  spec.metadata["changelog_uri"] = spec.homepage
+
+  spec.files = Dir.chdir(__dir__) do
+    `git ls-files -z`.split("\x0").reject do |f|
+      (File.expand_path(f) == __FILE__) ||
+        f.start_with?(*%w[bin/ test/ spec/ features/ .git .circleci appveyor Gemfile])
+    end
+  end
+  spec.bindir = "exe"
+  spec.executables = spec.files.grep(%r{\Aexe/}) { |f| File.basename(f) }
+  spec.require_paths = ["lib"]
+
+  spec.add_dependency("zeitwerk", "~> 2.7")
+  spec.add_dependency("activerecord", "~> 8.1")
+  spec.add_dependency("sqlite3", "~> 2.9")
+  spec.add_dependency("async", "~> 2.35")
+  spec.add_dependency("ruby_llm", "~> 1.9")
+  spec.add_dependency("json-schema", "~> 6.1")
+end
data/docs/chat-methods.md
ADDED
@@ -0,0 +1,157 @@
+# Chat Methods
+
+**Note:** For batch processing and structured workflows, use [Batch and Pipeline](../README.md) instead. The methods below are for direct chat interactions and one-off requests.
+
+AgentC provides several methods for interacting with LLMs, each optimized for different use cases.
+
+## Creating Chats
+
+```ruby
+# See the [configuration](./session-configuration.md) for session args
+session = Session.new(...)
+
+chat = session.chat(
+  tools: [:read_file, :edit_file],
+  cached_prompts: ["You are a helpful assistant"],
+  workspace_dir: Dir.pwd
+)
+```
+
+## Chat.ask(message)
+
+Basic interaction - send a message and get a response:
+
+```ruby
+chat = session.chat
+response = chat.ask("Explain recursion in simple terms")
+```
+
+## Chat.get(message, schema:, confirm:, out_of:)
+
+Get a structured response with optional confirmation:
+
+```ruby
+# Get a simple answer
+answer = chat.get("What is 2 + 2?")
+
+# Get a structured response using AgentC::Schema.result.
+# This creates a schema that accepts either success or error responses.
+#
+# You can make your own schema using RubyLLM::Schema, but
+# this is a pretty standard approach. It allows the LLM
+# to indicate that it could not fulfill your request and
+# give a reason why.
+#
+# The response will look like one of the following:
+# {
+#   status: "success",
+#   name: "...",
+#   email: "...",
+# }
+# OR:
+# {
+#   status: "error",
+#   message: "some reason why it couldn't do it"
+# }
+schema = AgentC::Schema.result do
+  string(:name, description: "Person's name")
+  string(:email, description: "Person's email")
+end
+
+result = chat.get(
+  "Extract the name and email from this text: 'Contact John at john@example.com'",
+  schema: schema
+)
+# => { "status" => "success", "name" => "John", "email" => "john@example.com" }
+
+# If the LLM can't complete the task, it returns an error response:
+# => { "status" => "error", "message" => "No email found in the text" }
+```
+
+### Using confirm and out_of for consensus
+
+LLMs are non-deterministic and can give different answers to the same question. The `confirm` feature asks the question multiple times and only accepts an answer when it appears at least `confirm` times out of `out_of` attempts. This gives you much higher confidence the answer isn't a hallucination or random variation.
+
+```ruby
+class YesOrNoSchema < RubyLLM::Schema
+  string(:value, enum: ["yes", "no"])
+end
+
+confirmed = chat.get(
+  "Is vanilla better than chocolate?",
+  confirm: 2, # Need 2 matching answers
+  out_of: 3,  # Out of 3 attempts max
+  schema: YesOrNoSchema
+)
+```
+
+## Chat.refine(message, schema:, times:)
+
+Iteratively refine a response by having the LLM review and improve its own answer.
+
+The refine feature asks your question, gets an answer, then asks the LLM to review that answer for accuracy and improvements. This repeats for the specified number of times. Each iteration gives the LLM a chance to catch mistakes, add detail, or improve quality.
+
+This works because LLMs are often better at *reviewing* content than generating it perfectly the first time - like having an editor review a draft. It's especially effective for creative tasks, complex analysis, or code generation where iterative improvement leads to higher quality outputs.
+
+```ruby
+HaikuSchema = RubyLLM::Schema.object(
+  haiku: RubyLLM::Schema.string
+)
+
+refined_answer = chat.refine(
+  "Write a haiku about programming",
+  schema: HaikuSchema,
+  times: 3 # LLM reviews and refines its answer 3 times
+)
+```
+
+## Session.prompt() - One-Off Requests
+
+For single-shot requests where you don't need a persistent chat, use `session.prompt()`:
+
+```ruby
+# See the [configuration](./session-configuration.md) for session args
+session = Session.new(...)
+
+# Simple one-off request
+result = session.prompt(
+  prompt: "What is the capital of France?",
+  schema: -> { string(:answer) }
+)
+# => ChatResponse with success/error status
+
+# With tools and custom settings
+result = session.prompt(
+  prompt: "Read the README file and summarize it",
+  schema: -> { string(:summary) },
+  tools: [:read_file],
+  tool_args: { workspace_dir: '/path/to/project' },
+  cached_prompt: ["You are a helpful documentation assistant"]
+)
+
+if result.success?
+  puts result.data['summary']
+else
+  puts "Error: #{result.error_message}"
+end
+```
+
+This is equivalent to creating a chat, calling `get()`, and handling the response, but more concise for one-off requests.
+
+## Cached Prompts
+
+To optimize token usage and reduce costs, you can use cached prompts. Cached prompts are stored in the API provider's cache and can significantly reduce the number of input tokens charged on subsequent requests.
+
+```ruby
+# Provide cached prompts that will be reused across conversations
+cached_prompts = [
+  "You are a helpful coding assistant specialized in Ruby.",
+  "Always write idiomatic Ruby code following Ruby community best practices."
+]
+
+chat = session.chat(cached_prompts: cached_prompts)
+response = chat.ask("Write a method to calculate fibonacci numbers")
+```
+
+The first request will incur cache creation costs, but subsequent requests with the same cached prompts will use significantly fewer tokens, reducing overall API costs.