structify 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+   metadata.gz: eb04b492770cd56dc4378bb73de9ff1e0ee525e89f30c942f87f20314990bbc9
4
+   data.tar.gz: 5061d34d693c475070680501b3bf78f0634437cefc6da8b914fcda8a56ef3470
5
+ SHA512:
6
+   metadata.gz: 0a1ee0ace2f35d460d244902a2de7699ce346ce2629c14e4151fb3d58e00f2f81444739faddd267f5eef2f53a9ecaf4334a1334e975b549fbf66993b7577c56f
7
+   data.tar.gz: 11cf2d7c6bbd538c8bc7fd944d289c3c303faa4e4fa08736ba0099ab276a4131ee973c2fc86a353e75c5ec13f25bb6a4409e4ee3c0a54b28be15167dbe3e3cdd
data/.gitignore ADDED
@@ -0,0 +1,11 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /_yardoc/
4
+ /coverage/
5
+ /doc/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
9
+
10
+ # rspec failure tracking
11
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,4 @@
1
+ --require spec_helper
2
+ --format documentation
3
+ --color
4
+ --order random
data/.travis.yml ADDED
@@ -0,0 +1,6 @@
1
+ ---
2
+ language: ruby
3
+ cache: bundler
4
+ rvm:
5
+ - 3.2.2
6
+ before_install: gem install bundler -v 2.1.4
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at kieranklaassen@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [https://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: https://contributor-covenant.org
74
+ [version]: https://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,16 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in structify.gemspec
4
+ gemspec
5
+
6
+ group :development, :test do
7
+ gem "rake", "~> 13.0"
8
+ gem "rspec", "~> 3.12"
9
+ gem "rspec-rails", "~> 6.1"
10
+ gem "activerecord", "~> 7.1.0"
11
+ gem "sqlite3", "~> 1.6.0" # For testing with ActiveRecord
12
+ gem "rubocop", "~> 1.21"
13
+ gem "rubocop-rspec", "~> 2.25"
14
+ gem "yard", "~> 0.9"
15
+ gem "debug", ">= 1.0.0"
16
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,195 @@
1
+ PATH
2
+ remote: .
3
+ specs:
4
+ structify (0.1.0)
5
+ activesupport (>= 7.1)
6
+ attr_json (~> 2.1)
7
+
8
+ GEM
9
+ remote: https://rubygems.org/
10
+ specs:
11
+ actionpack (7.1.5.1)
12
+ actionview (= 7.1.5.1)
13
+ activesupport (= 7.1.5.1)
14
+ nokogiri (>= 1.8.5)
15
+ racc
16
+ rack (>= 2.2.4)
17
+ rack-session (>= 1.0.1)
18
+ rack-test (>= 0.6.3)
19
+ rails-dom-testing (~> 2.2)
20
+ rails-html-sanitizer (~> 1.6)
21
+ actionview (7.1.5.1)
22
+ activesupport (= 7.1.5.1)
23
+ builder (~> 3.1)
24
+ erubi (~> 1.11)
25
+ rails-dom-testing (~> 2.2)
26
+ rails-html-sanitizer (~> 1.6)
27
+ activemodel (7.1.5.1)
28
+ activesupport (= 7.1.5.1)
29
+ activerecord (7.1.5.1)
30
+ activemodel (= 7.1.5.1)
31
+ activesupport (= 7.1.5.1)
32
+ timeout (>= 0.4.0)
33
+ activesupport (7.1.5.1)
34
+ base64
35
+ benchmark (>= 0.3)
36
+ bigdecimal
37
+ concurrent-ruby (~> 1.0, >= 1.0.2)
38
+ connection_pool (>= 2.2.5)
39
+ drb
40
+ i18n (>= 1.6, < 2)
41
+ logger (>= 1.4.2)
42
+ minitest (>= 5.1)
43
+ mutex_m
44
+ securerandom (>= 0.3)
45
+ tzinfo (~> 2.0)
46
+ ast (2.4.2)
47
+ attr_json (2.5.0)
48
+ activerecord (>= 6.0.0, < 8.1)
49
+ base64 (0.2.0)
50
+ benchmark (0.4.0)
51
+ bigdecimal (3.1.9)
52
+ builder (3.3.0)
53
+ concurrent-ruby (1.3.5)
54
+ connection_pool (2.5.0)
55
+ crass (1.0.6)
56
+ date (3.4.1)
57
+ debug (1.10.0)
58
+ irb (~> 1.10)
59
+ reline (>= 0.3.8)
60
+ diff-lcs (1.5.1)
61
+ drb (2.2.1)
62
+ erubi (1.13.1)
63
+ i18n (1.14.7)
64
+ concurrent-ruby (~> 1.0)
65
+ io-console (0.8.0)
66
+ irb (1.15.1)
67
+ pp (>= 0.6.0)
68
+ rdoc (>= 4.0.0)
69
+ reline (>= 0.4.2)
70
+ json (2.9.1)
71
+ language_server-protocol (3.17.0.4)
72
+ logger (1.6.5)
73
+ loofah (2.24.0)
74
+ crass (~> 1.0.2)
75
+ nokogiri (>= 1.12.0)
76
+ minitest (5.25.4)
77
+ mutex_m (0.3.0)
78
+ nokogiri (1.18.2-x86_64-darwin)
79
+ racc (~> 1.4)
80
+ parallel (1.26.3)
81
+ parser (3.3.7.0)
82
+ ast (~> 2.4.1)
83
+ racc
84
+ pp (0.6.2)
85
+ prettyprint
86
+ prettyprint (0.2.0)
87
+ psych (5.2.3)
88
+ date
89
+ stringio
90
+ racc (1.8.1)
91
+ rack (3.1.9)
92
+ rack-session (2.1.0)
93
+ base64 (>= 0.1.0)
94
+ rack (>= 3.0.0)
95
+ rack-test (2.2.0)
96
+ rack (>= 1.3)
97
+ rackup (2.2.1)
98
+ rack (>= 3)
99
+ rails-dom-testing (2.2.0)
100
+ activesupport (>= 5.0.0)
101
+ minitest
102
+ nokogiri (>= 1.6)
103
+ rails-html-sanitizer (1.6.2)
104
+ loofah (~> 2.21)
105
+ nokogiri (>= 1.15.7, != 1.16.7, != 1.16.6, != 1.16.5, != 1.16.4, != 1.16.3, != 1.16.2, != 1.16.1, != 1.16.0.rc1, != 1.16.0)
106
+ railties (7.1.5.1)
107
+ actionpack (= 7.1.5.1)
108
+ activesupport (= 7.1.5.1)
109
+ irb
110
+ rackup (>= 1.0.0)
111
+ rake (>= 12.2)
112
+ thor (~> 1.0, >= 1.2.2)
113
+ zeitwerk (~> 2.6)
114
+ rainbow (3.1.1)
115
+ rake (13.2.1)
116
+ rdoc (6.11.0)
117
+ psych (>= 4.0.0)
118
+ regexp_parser (2.10.0)
119
+ reline (0.6.0)
120
+ io-console (~> 0.5)
121
+ rspec (3.13.0)
122
+ rspec-core (~> 3.13.0)
123
+ rspec-expectations (~> 3.13.0)
124
+ rspec-mocks (~> 3.13.0)
125
+ rspec-core (3.13.2)
126
+ rspec-support (~> 3.13.0)
127
+ rspec-expectations (3.13.3)
128
+ diff-lcs (>= 1.2.0, < 2.0)
129
+ rspec-support (~> 3.13.0)
130
+ rspec-mocks (3.13.2)
131
+ diff-lcs (>= 1.2.0, < 2.0)
132
+ rspec-support (~> 3.13.0)
133
+ rspec-rails (6.1.5)
134
+ actionpack (>= 6.1)
135
+ activesupport (>= 6.1)
136
+ railties (>= 6.1)
137
+ rspec-core (~> 3.13)
138
+ rspec-expectations (~> 3.13)
139
+ rspec-mocks (~> 3.13)
140
+ rspec-support (~> 3.13)
141
+ rspec-support (3.13.2)
142
+ rubocop (1.71.1)
143
+ json (~> 2.3)
144
+ language_server-protocol (>= 3.17.0)
145
+ parallel (~> 1.10)
146
+ parser (>= 3.3.0.2)
147
+ rainbow (>= 2.2.2, < 4.0)
148
+ regexp_parser (>= 2.9.3, < 3.0)
149
+ rubocop-ast (>= 1.38.0, < 2.0)
150
+ ruby-progressbar (~> 1.7)
151
+ unicode-display_width (>= 2.4.0, < 4.0)
152
+ rubocop-ast (1.38.0)
153
+ parser (>= 3.3.1.0)
154
+ rubocop-capybara (2.21.0)
155
+ rubocop (~> 1.41)
156
+ rubocop-factory_bot (2.26.1)
157
+ rubocop (~> 1.61)
158
+ rubocop-rspec (2.31.0)
159
+ rubocop (~> 1.40)
160
+ rubocop-capybara (~> 2.17)
161
+ rubocop-factory_bot (~> 2.22)
162
+ rubocop-rspec_rails (~> 2.28)
163
+ rubocop-rspec_rails (2.29.1)
164
+ rubocop (~> 1.61)
165
+ ruby-progressbar (1.13.0)
166
+ securerandom (0.4.1)
167
+ sqlite3 (1.6.9-x86_64-darwin)
168
+ stringio (3.1.2)
169
+ thor (1.3.2)
170
+ timeout (0.4.3)
171
+ tzinfo (2.0.6)
172
+ concurrent-ruby (~> 1.0)
173
+ unicode-display_width (3.1.4)
174
+ unicode-emoji (~> 4.0, >= 4.0.4)
175
+ unicode-emoji (4.0.4)
176
+ yard (0.9.37)
177
+ zeitwerk (2.7.1)
178
+
179
+ PLATFORMS
180
+ x86_64-darwin-22
181
+
182
+ DEPENDENCIES
183
+ activerecord (~> 7.1.0)
184
+ debug (>= 1.0.0)
185
+ rake (~> 13.0)
186
+ rspec (~> 3.12)
187
+ rspec-rails (~> 6.1)
188
+ rubocop (~> 1.21)
189
+ rubocop-rspec (~> 2.25)
190
+ sqlite3 (~> 1.6.0)
191
+ structify!
192
+ yard (~> 0.9)
193
+
194
+ BUNDLED WITH
195
+ 2.4.20
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2025 Kieran Klaassen
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,221 @@
1
+ # Structify
2
+
3
+ [![Gem Version](https://badge.fury.io/rb/structify.svg)](https://badge.fury.io/rb/structify)
4
+
5
+ Structify is a Ruby gem that provides a simple DSL to define extraction schemas for LLM-powered models. It integrates seamlessly with Rails models, allowing you to specify versioning, assistant prompts, and field definitions—all in a clean, declarative syntax.
6
+
7
+ ## Features
8
+
9
+ - 🎯 Simple DSL for defining LLM extraction schemas
10
+ - 🔄 Built-in versioning for schema evolution
11
+ - 📝 Support for custom assistant prompts
12
+ - 🏗️ JSON Schema generation for LLM validation
13
+ - 🔌 Seamless Rails/ActiveRecord integration
14
+ - 💾 Automatic JSON attribute handling
15
+
16
+ ## Installation
17
+
18
+ Add this line to your application's Gemfile:
19
+
20
+ ```ruby
21
+ gem 'structify'
22
+ ```
23
+
24
+ And then execute:
25
+
26
+ ```bash
27
+ $ bundle install
28
+ ```
29
+
30
+ Or install it yourself as:
31
+
32
+ ```bash
33
+ $ gem install structify
34
+ ```
35
+
36
+ ## Usage
37
+
38
+ ### Basic Example
39
+
40
+ Here's a simple example of using Structify in a Rails model:
41
+
42
+ ```ruby
43
+ class Article < ApplicationRecord
44
+ include Structify::Model
45
+
46
+ schema_definition do
47
+ title "Article Extraction"
48
+ description "Extract key information from articles"
49
+ version 1
50
+
51
+ assistant_prompt "Extract the following fields from the article content"
52
+ llm_model "gpt-4"
53
+
54
+ field :title, :string, required: true
55
+ field :summary, :text, description: "A brief summary of the article"
56
+ field :category, :string, enum: ["tech", "business", "science"]
57
+ end
58
+ end
59
+ ```
60
+
61
+ ### Advanced Example
62
+
63
+ Here's a more complex example showing all available features:
64
+
65
+ ```ruby
66
+ class EmailSummary < ApplicationRecord
67
+ include Structify::Model
68
+
69
+ schema_definition do
70
+ version 2 # Increment this when making breaking changes
71
+ title "Email Thread Extraction"
72
+ description "Extracts key information from email threads"
73
+
74
+ assistant_prompt <<~PROMPT
75
+ You are an assistant that extracts concise metadata from email threads.
76
+ Focus on producing a clear summary, action items, and sentiment analysis.
77
+ If there are multiple participants, include their roles in the conversation.
78
+ PROMPT
79
+
80
+ llm_model "gpt-4" # Supports any LLM model
81
+
82
+ # Required fields
83
+ field :subject, :string,
84
+ required: true,
85
+ description: "The main topic or subject of the email thread"
86
+
87
+ field :summary, :text,
88
+ required: true,
89
+ description: "A concise summary of the entire thread"
90
+
91
+ # Optional fields with enums
92
+ field :sentiment, :string,
93
+ enum: ["positive", "neutral", "negative"],
94
+ description: "The overall sentiment of the conversation"
95
+
96
+ field :priority, :string,
97
+ enum: ["high", "medium", "low"],
98
+ description: "The priority level based on content and tone"
99
+
100
+ # Complex fields
101
+ field :participants, :json,
102
+ description: "List of participants and their roles"
103
+
104
+ field :action_items, :json,
105
+ description: "Array of action items extracted from the thread"
106
+
107
+ field :next_steps, :string,
108
+ description: "Recommended next steps based on the thread"
109
+ end
110
+
111
+ # You can still use regular ActiveRecord features
112
+ validates :subject, presence: true
113
+ validates :summary, length: { minimum: 10 }
114
+ end
115
+ ```
116
+
117
+ ### Accessing Schema Information
118
+
119
+ Structify provides several helper methods to access schema information:
120
+
121
+ ```ruby
122
+ # Get the JSON Schema
123
+ EmailSummary.json_schema
124
+ # => {
125
+ # name: "Email Thread Extraction",
126
+ # description: "Extracts key information from email threads",
127
+ # parameters: {
128
+ # type: "object",
129
+ # required: ["subject", "summary"],
130
+ # properties: {
131
+ # subject: { type: "string" },
132
+ # summary: { type: "text" },
133
+ # sentiment: {
134
+ # type: "string",
135
+ # enum: ["positive", "neutral", "negative"]
136
+ # },
137
+ # # ...
138
+ # }
139
+ # }
140
+ # }
141
+
142
+ # Get the current version
143
+ EmailSummary.extraction_version # => 2
144
+
145
+ # Get the assistant prompt
146
+ EmailSummary.extraction_assistant_prompt
147
+ # => "You are an assistant that extracts concise metadata..."
148
+
149
+ # Get the LLM model
150
+ EmailSummary.extraction_llm_model # => "gpt-4"
151
+ ```
152
+
153
+ ### Working with Extracted Data
154
+
155
+ Structify uses the `attr_json` gem to handle JSON attributes. All fields are stored in the `extracted_data` JSON column:
156
+
157
+ ```ruby
158
+ # Create a new record with extracted data
159
+ summary = EmailSummary.create(
160
+ subject: "Project Update",
161
+ summary: "Team discussed Q2 goals",
162
+ sentiment: "positive",
163
+ priority: "high",
164
+ participants: [
165
+ { name: "Alice", role: "presenter" },
166
+ { name: "Bob", role: "reviewer" }
167
+ ]
168
+ )
169
+
170
+ # Access fields directly
171
+ summary.subject # => "Project Update"
172
+ summary.sentiment # => "positive"
173
+ summary.participants # => [{ name: "Alice", ... }]
174
+
175
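+ # Enum values are recorded in the JSON schema; `field` in 0.1.0 adds no model validation for them
176
+ summary.sentiment = "invalid"
177
+ summary.valid? # => true, unless the model defines its own inclusion validation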
178
+ ```
179
+
180
+ ## Database Setup
181
+
182
+ Ensure your model has a JSON column named `extracted_data`:
183
+
184
+ ```ruby
185
+ class CreateEmailSummaries < ActiveRecord::Migration[7.1]
186
+ def change
187
+ create_table :email_summaries do |t|
188
+ t.json :extracted_data # Required by Structify
189
+ t.timestamps
190
+ end
191
+ end
192
+ end
193
+ ```
194
+
195
+ ## Development
196
+
197
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
198
+
199
+ To install this gem onto your local machine, run `bundle exec rake install`.
200
+
201
+ ## Contributing
202
+
203
+ 1. Fork it
204
+ 2. Create your feature branch (`git checkout -b feature/my-new-feature`)
205
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
206
+ 4. Push to the branch (`git push origin feature/my-new-feature`)
207
+ 5. Create a new Pull Request
208
+
209
+ Bug reports and pull requests are welcome on GitHub at https://github.com/kieranklaassen/structify.
210
+
211
+ ## License
212
+
213
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
214
+
215
+ ## Code of Conduct
216
+
217
+ Everyone interacting in the Structify project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](CODE_OF_CONDUCT.md).
218
+
219
+ ```
220
+
221
+ ```
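Reviewer note on data/README.md: the README states that every declared field is stored in the single `extracted_data` JSON column. The plain-Ruby sketch below illustrates that storage idea only. It is not how `attr_json` is implemented, it needs no Rails, and `ContainerBacked`, `json_field` and `FakeSummary` are hypothetical names introduced for illustration.

```ruby
require "json"

# Hypothetical stand-in for the "one JSON container" idea: each declared
# field gets a reader and a writer that go through a single hash.
module ContainerBacked
  def json_field(name)
    key = name.to_s
    define_method(name) { extracted_data[key] }
    define_method("#{name}=") { |value| extracted_data[key] = value }
  end
end

class FakeSummary
  extend ContainerBacked

  json_field :subject
  json_field :sentiment

  # The hash that would be serialized into the JSON column.
  def extracted_data
    @extracted_data ||= {}
  end
end

record = FakeSummary.new
record.subject = "Project Update"
record.sentiment = "positive"

# Both fields end up in one serialized document.
puts JSON.generate(record.extracted_data)
```

The point of the sketch is that adding a field changes only the contents of the container, which is why the migration in the README needs just one `extracted_data` column.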
data/Rakefile ADDED
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "structify"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start(__FILE__)
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/lib/structify/model.rb ADDED
@@ -0,0 +1,186 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "active_support/concern"
4
+ require "active_support/core_ext/class/attribute"
5
+ require "attr_json"
6
+
7
+ module Structify
8
+ # The Model module provides a DSL for defining LLM extraction schemas in your Rails models.
9
+ # It allows you to define fields, versioning, and assistant prompts for LLM-based data extraction.
10
+ #
11
+ # @example
12
+ # class Article < ApplicationRecord
13
+ # include Structify::Model
14
+ #
15
+ # schema_definition do
16
+ # title "Article Extraction"
17
+ # description "Extract article metadata"
18
+ # version 1
19
+ # assistant_prompt "Extract the following fields from the article"
20
+ # llm_model "gpt-4"
21
+ #
22
+ # field :title, :string, required: true
23
+ # field :summary, :text, description: "A brief summary of the article"
24
+ # field :category, :string, enum: ["tech", "business", "science"]
25
+ # end
26
+ # end
27
+ module Model
28
+ extend ActiveSupport::Concern
29
+
30
+ included do
31
+ include AttrJson::Record
32
+ class_attribute :schema_builder, instance_writer: false, default: nil
33
+
34
+ # Store all extracted data in the extracted_data JSON column
35
+ attr_json_config(default_container_attribute: :extracted_data)
36
+ end
37
+
38
+ # Class methods added to the including class
39
+ module ClassMethods
40
+ # Define the schema for LLM extraction
41
+ #
42
+ # @yield [void] The schema definition block
43
+ # @return [void]
44
+ def schema_definition(&block)
45
+ self.schema_builder ||= SchemaBuilder.new(self)
46
+ schema_builder.instance_eval(&block) if block_given?
47
+ end
48
+
49
+ # Get the JSON schema representation
50
+ #
51
+ # @return [Hash] The JSON schema
52
+ def json_schema
53
+ schema_builder&.to_json_schema
54
+ end
55
+
56
+ # Get the current extraction version
57
+ #
58
+ # @return [Integer] The version number
59
+ def extraction_version
60
+ schema_builder&.version_number
61
+ end
62
+
63
+ # Get the assistant prompt
64
+ #
65
+ # @return [String] The assistant prompt
66
+ def extraction_assistant_prompt
67
+ schema_builder&.assistant_prompt_str
68
+ end
69
+
70
+ # Get the LLM model name
71
+ #
72
+ # @return [String] The model name
73
+ def extraction_llm_model
74
+ schema_builder&.model_name
75
+ end
76
+ end
77
+ end
78
+
79
+ # Builder class for constructing the schema
80
+ class SchemaBuilder
81
+ # @return [Class] The model class
82
+ # @return [Array<Hash>] The field definitions
83
+ # @return [String] The schema title
84
+ # @return [String] The schema description
85
+ # @return [String] The assistant prompt
86
+ # @return [String] The LLM model name
87
+ # @return [Integer] The schema version
88
+ attr_reader :model, :fields, :title_str, :description_str,
89
+ :assistant_prompt_str, :model_name, :version_number
90
+
91
+ # Initialize a new SchemaBuilder
92
+ #
93
+ # @param model [Class] The model class
94
+ def initialize(model)
95
+ @model = model
96
+ @fields = []
97
+ @assistant_prompt_str = nil
98
+ @model_name = nil
99
+ @version_number = 1
100
+ end
101
+
102
+ # Set the schema title
103
+ #
104
+ # @param name [String] The title
105
+ # @return [void]
106
+ def title(name)
107
+ @title_str = name
108
+ end
109
+
110
+ # Set the schema description
111
+ #
112
+ # @param desc [String] The description
113
+ # @return [void]
114
+ def description(desc)
115
+ @description_str = desc
116
+ end
117
+
118
+ # Set the schema version
119
+ #
120
+ # @param num [Integer] The version number
121
+ # @return [void]
122
+ def version(num)
123
+ @version_number = num
124
+ model.attribute :version, :integer, default: num
125
+ end
126
+
127
+ # Set the assistant prompt
128
+ #
129
+ # @param prompt [String] The prompt text
130
+ # @return [void]
131
+ def assistant_prompt(prompt)
132
+ @assistant_prompt_str = prompt.strip
133
+ end
134
+
135
+ # Set the LLM model name
136
+ #
137
+ # @param name [String] The model name
138
+ # @return [void]
139
+ def llm_model(name)
140
+ @model_name = name
141
+ end
142
+
143
+ # Define a field in the schema
144
+ #
145
+ # @param name [Symbol] The field name
146
+ # @param type [Symbol] The field type
147
+ # @param required [Boolean] Whether the field is required
148
+ # @param description [String] The field description
149
+ # @param enum [Array] Possible values for the field
150
+ # @return [void]
151
+ def field(name, type, required: false, description: nil, enum: nil)
152
+ fields << {
153
+ name: name,
154
+ type: type,
155
+ required: required,
156
+ description: description,
157
+ enum: enum
158
+ }
159
+
160
+ model.attr_json name, type
161
+ end
162
+
163
+ # Generate the JSON schema representation
164
+ #
165
+ # @return [Hash] The JSON schema
166
+ def to_json_schema
167
+ required_fields = fields.select { |f| f[:required] }.map { |f| f[:name].to_s }
168
+ properties_hash = fields.each_with_object({}) do |f, hash|
169
+ prop = { type: f[:type].to_s }
170
+ prop[:description] = f[:description] if f[:description]
171
+ prop[:enum] = f[:enum] if f[:enum]
172
+ hash[f[:name].to_s] = prop
173
+ end
174
+
175
+ {
176
+ name: title_str,
177
+ description: description_str,
178
+ parameters: {
179
+ type: "object",
180
+ required: required_fields,
181
+ properties: properties_hash
182
+ }
183
+ }
184
+ end
185
+ end
186
+ end
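Reviewer note on `to_json_schema`: the method copies each DSL type symbol straight into `type` and keys `properties` by string, so a `:text` field is emitted as `type: "text"`, which is not one of the types JSON Schema defines. The standalone sketch below repeats the same steps without Rails so the output shape can be inspected. `FIELDS`, `build_parameters` and the `JSON_SCHEMA_TYPES` mapping are assumptions for illustration and are not part of the gem.

```ruby
# Field definitions shaped like the hashes SchemaBuilder#field stores.
FIELDS = [
  { name: :subject, type: :string, required: true,
    description: "The main topic or subject of the email thread", enum: nil },
  { name: :summary, type: :text, required: true,
    description: "A concise summary of the entire thread", enum: nil },
  { name: :sentiment, type: :string, required: false,
    description: nil, enum: %w[positive neutral negative] }
].freeze

# Hypothetical mapping (not in the gem): JSON Schema defines no "text" type.
JSON_SCHEMA_TYPES = { text: "string" }.freeze

# Same steps as to_json_schema; `normalize` toggles the hypothetical mapping.
def build_parameters(fields, normalize: false)
  required = fields.select { |f| f[:required] }.map { |f| f[:name].to_s }
  properties = fields.each_with_object({}) do |f, hash|
    type = f[:type].to_s
    type = JSON_SCHEMA_TYPES.fetch(f[:type], type) if normalize
    prop = { type: type }
    prop[:description] = f[:description] if f[:description]
    prop[:enum] = f[:enum] if f[:enum]
    hash[f[:name].to_s] = prop
  end
  { type: "object", required: required, properties: properties }
end

as_shipped = build_parameters(FIELDS)
normalized = build_parameters(FIELDS, normalize: true)

puts as_shipped[:properties]["summary"][:type]  # prints text
puts normalized[:properties]["summary"][:type]  # prints string
```

Whether a mapping like this belongs in the gem is a design decision for the maintainer; the sketch only shows where such a step could sit.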
data/lib/structify/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module Structify
2
+ VERSION = "0.1.0"
3
+ end
data/lib/structify.rb ADDED
@@ -0,0 +1,29 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "structify/version"
4
+ require_relative "structify/model"
5
+
6
+ # Structify is a DSL for defining extraction schemas for LLM-powered models.
7
+ # It provides a simple way to integrate with Rails models for LLM extraction,
8
+ # including versioning, assistant prompts, and more.
9
+ #
10
+ # @example
11
+ # class Article < ApplicationRecord
12
+ # include Structify::Model
13
+ #
14
+ # schema_definition do
15
+ # title "Article Extraction"
16
+ # description "Extract article metadata"
17
+ # version 1
18
+ # assistant_prompt "Extract the following fields from the article"
19
+ # llm_model "gpt-4"
20
+ #
21
+ # field :title, :string, required: true
22
+ # field :summary, :text, description: "A brief summary of the article"
23
+ # field :category, :string, enum: ["tech", "business", "science"]
24
+ # end
25
+ # end
26
+ module Structify
27
+ class Error < StandardError; end
28
+ # Your code goes here...
29
+ end
data/structify.gemspec ADDED
@@ -0,0 +1,31 @@
1
+ require_relative 'lib/structify/version'
2
+
3
+ Gem::Specification.new do |spec|
4
+ spec.name = "structify"
5
+ spec.version = Structify::VERSION
6
+ spec.authors = ["Kieran Klaassen"]
7
+ spec.email = ["kieranklaassen@gmail.com"]
8
+
9
+ spec.summary = %q{A DSL for defining extraction schemas for LLM-powered models}
10
+ spec.description = %q{Structify provides a simple DSL to integrate with Rails models for LLM extraction, including versioning, assistant prompts, and more}
11
+ spec.homepage = "https://github.com/kieranklaassen/structify"
12
+ spec.license = "MIT"
13
+ spec.required_ruby_version = Gem::Requirement.new(">= 2.3.0")
14
+
15
+ spec.metadata["homepage_uri"] = spec.homepage
16
+ spec.metadata["source_code_uri"] = spec.homepage
17
+ spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
18
+
19
+ # Specify which files should be added to the gem when it is released.
20
+ # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
21
+ spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
22
+ `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
23
+ end
24
+ spec.bindir = "exe"
25
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
26
+ spec.require_paths = ["lib"]
27
+
28
+ # Runtime dependencies
29
+ spec.add_dependency "activesupport", "~> 7.1"
30
+ spec.add_dependency "attr_json", "~> 2.1"
31
+ end
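Reviewer note on the runtime dependencies: the gemspec declares `activesupport ~> 7.1`, while data/Gemfile.lock records `activesupport (>= 7.1)` for structify. The two constraints differ once a new major version exists. The sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes to show the difference; the candidate version numbers are illustrative.

```ruby
require "rubygems"

pessimistic = Gem::Requirement.new("~> 7.1")  # as in structify.gemspec
open_ended  = Gem::Requirement.new(">= 7.1")  # as recorded in Gemfile.lock

%w[7.0.8 7.1.5.1 7.2.0 8.0.0].each do |candidate|
  version = Gem::Version.new(candidate)
  puts format("%-8s ~> 7.1: %-5s >= 7.1: %s",
              candidate,
              pessimistic.satisfied_by?(version),
              open_ended.satisfied_by?(version))
end
```

Only `8.0.0` separates the two: the pessimistic constraint rejects it and the open-ended one accepts it, so the lockfile entry looks like it predates the final gemspec constraint.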
metadata ADDED
@@ -0,0 +1,90 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: structify
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Kieran Klaassen
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2025-02-03 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: activesupport
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '7.1'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '7.1'
27
+ - !ruby/object:Gem::Dependency
28
+ name: attr_json
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '2.1'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '2.1'
41
+ description: Structify provides a simple DSL to integrate with Rails models for LLM
42
+ extraction, including versioning, assistant prompts, and more
43
+ email:
44
+ - kieranklaassen@gmail.com
45
+ executables: []
46
+ extensions: []
47
+ extra_rdoc_files: []
48
+ files:
49
+ - ".gitignore"
50
+ - ".rspec"
51
+ - ".travis.yml"
52
+ - CODE_OF_CONDUCT.md
53
+ - Gemfile
54
+ - Gemfile.lock
55
+ - LICENSE.txt
56
+ - README.md
57
+ - Rakefile
58
+ - bin/console
59
+ - bin/setup
60
+ - lib/structify.rb
61
+ - lib/structify/model.rb
62
+ - lib/structify/version.rb
63
+ - structify.gemspec
64
+ homepage: https://github.com/kieranklaassen/structify
65
+ licenses:
66
+ - MIT
67
+ metadata:
68
+ homepage_uri: https://github.com/kieranklaassen/structify
69
+ source_code_uri: https://github.com/kieranklaassen/structify
70
+ changelog_uri: https://github.com/kieranklaassen/structify/blob/main/CHANGELOG.md
71
+ post_install_message:
72
+ rdoc_options: []
73
+ require_paths:
74
+ - lib
75
+ required_ruby_version: !ruby/object:Gem::Requirement
76
+ requirements:
77
+ - - ">="
78
+ - !ruby/object:Gem::Version
79
+ version: 2.3.0
80
+ required_rubygems_version: !ruby/object:Gem::Requirement
81
+ requirements:
82
+ - - ">="
83
+ - !ruby/object:Gem::Version
84
+ version: '0'
85
+ requirements: []
86
+ rubygems_version: 3.4.10
87
+ signing_key:
88
+ specification_version: 4
89
+ summary: A DSL for defining extraction schemas for LLM-powered models
90
+ test_files: []