RubyGems - serialbench - Versions diffs - 0.1.2 → 0.1.3 - Mend

serialbench 0.1.2 → 0.1.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (58)

checksums.yaml +4 -4
data/.github/workflows/benchmark.yml +273 -228
data/.github/workflows/rake.yml +11 -0
data/.github/workflows/windows-debug.yml +171 -0
data/.gitignore +32 -0
data/.rubocop.yml +1 -0
data/.rubocop_todo.yml +274 -0
data/Gemfile +13 -1
data/README.adoc +36 -0
data/data/schemas/result.yml +29 -0
data/docs/PLATFORM_VALIDATION_FIX.md +79 -0
data/docs/SYCK_YAML_FIX.md +91 -0
data/docs/WEBSITE_COMPLETION_PLAN.md +440 -0
data/docs/WINDOWS_LIBXML_FIX.md +136 -0
data/docs/WINDOWS_SETUP.md +122 -0
data/lib/serialbench/benchmark_runner.rb +3 -3
data/lib/serialbench/cli/benchmark_cli.rb +74 -1
data/lib/serialbench/cli/environment_cli.rb +3 -3
data/lib/serialbench/cli/resultset_cli.rb +72 -26
data/lib/serialbench/cli/ruby_build_cli.rb +75 -88
data/lib/serialbench/cli/validate_cli.rb +88 -0
data/lib/serialbench/cli.rb +6 -2
data/lib/serialbench/config_manager.rb +15 -26
data/lib/serialbench/models/benchmark_config.rb +12 -0
data/lib/serialbench/models/benchmark_result.rb +39 -3
data/lib/serialbench/models/environment_config.rb +3 -2
data/lib/serialbench/models/platform.rb +56 -4
data/lib/serialbench/models/result.rb +28 -1
data/lib/serialbench/models/result_set.rb +8 -0
data/lib/serialbench/ruby_build_manager.rb +19 -23
data/lib/serialbench/runners/asdf_runner.rb +1 -1
data/lib/serialbench/runners/docker_runner.rb +2 -4
data/lib/serialbench/runners/local_runner.rb +71 -0
data/lib/serialbench/serializers/base_serializer.rb +1 -1
data/lib/serialbench/serializers/json/rapidjson_serializer.rb +1 -1
data/lib/serialbench/serializers/toml/base_toml_serializer.rb +0 -2
data/lib/serialbench/serializers/toml/toml_rb_serializer.rb +1 -1
data/lib/serialbench/serializers/toml/tomlib_serializer.rb +1 -1
data/lib/serialbench/serializers/xml/libxml_serializer.rb +4 -8
data/lib/serialbench/serializers/xml/nokogiri_serializer.rb +2 -2
data/lib/serialbench/serializers/xml/oga_serializer.rb +4 -8
data/lib/serialbench/serializers/xml/ox_serializer.rb +2 -2
data/lib/serialbench/serializers/xml/rexml_serializer.rb +3 -3
data/lib/serialbench/serializers/yaml/psych_serializer.rb +1 -1
data/lib/serialbench/serializers/yaml/syck_serializer.rb +1 -1
data/lib/serialbench/serializers.rb +2 -2
data/lib/serialbench/site_generator.rb +180 -2
data/lib/serialbench/templates/assets/css/format_based.css +1 -53
data/lib/serialbench/templates/assets/css/themes.css +5 -4
data/lib/serialbench/templates/assets/js/chart_helpers.js +44 -14
data/lib/serialbench/templates/assets/js/dashboard.js +14 -15
data/lib/serialbench/templates/format_based.liquid +480 -252
data/lib/serialbench/version.rb +1 -1
data/lib/serialbench/yaml_validator.rb +36 -0
data/serialbench.gemspec +11 -2
metadata +34 -23
data/.github/workflows/ci.yml +0 -74
data/.github/workflows/docker.yml +0 -272

data/docs/SYCK_YAML_FIX.md ADDED

@@ -0,0 +1,91 @@
+# Syck YAML Constant Fix
+## Problem Summary
+The GitHub Actions workflow failed with a cryptic error:
+```
+Error adding run to resultset: undefined method `platform_string' for nil
+```
+Immediate cause: the benchmark `results.yaml` files were only 2 bytes, containing `{}`, instead of the expected ~10KB files with complete benchmark data.
+## Root Cause Analysis
+After 4 rounds of progressive fixes, we discovered the Syck gem was overriding the YAML constant, causing Lutaml::Model's `to_yaml` method to produce empty output.
+### Fix Timeline
+1. **Fix #1 (Commit 53f1799)**: Corrected `memory_usage` → `memory` attribute name
+   - Result: Still produced empty `{}` files
+2. **Fix #2 (Commit fcd23f2)**: Added key_value blocks to BenchmarkResult models
+   - Result: Still produced empty `{}` files
+3. **Fix #3 (Commit d8fb78b)**: Added key_value blocks to all remaining models
+   - Result: Still produced empty `{}` files
+4. **Fix #4 (Commit 69a2bd7)**: **Restored YAML constant to Psych in LocalRunner**
+   - Result: ✅ **SUCCESS! 10KB files with complete data**
+## The Critical Fix
+Added to `lib/serialbench/runners/local_runner.rb`:
+```ruby
+# Restore YAML to use Psych for output, otherwise lutaml-model's to_yaml
+# will have no output (Syck gem overrides YAML constant)
+Object.const_set(:YAML, Psych)
+results_file = File.join(result_dir, 'results.yaml')
+results_model.to_file(results_file)
+```
+## Why This Happened
+- The `_docker_execute` method in `benchmark_cli.rb` already had this fix (lines 151-154)
+- Local testing with `serialbench benchmark _docker_execute` worked fine
+- GHA with `serialbench environment execute` (uses LocalRunner) failed
+- The Syck gem overrides the YAML constant during benchmark execution
+- Without restoring it to Psych, Lutaml::Model serialization produces empty output
+## Verification
+From successful GHA run #51:
+```
+File size: 10258 bytes
+✅ File size OK: 10258 bytes
+✅ YAML syntax valid
+---
+platform:
+  platform_string: local-3.4.7
+  kind: local
+  os: macos
+  arch: arm64
+  ruby_build_tag: 3.4.7
+metadata:
+  benchmark_config_path: config/benchmarks/short.yml
+  environment_config_path: config/environments/ci-ruby-3.4.yml
+  tags:
+  - local
+  - macos
+  - arm64
+  - ruby-3.4
+[...]
+```
+## Impact
+This fix ensures:
+1. Benchmark results are properly serialized in all execution contexts
+2. ResultSet aggregation works correctly
+3. No more cryptic nil errors
+4. Complete benchmark data flows through the entire pipeline
+## Lessons Learned
+1. Always check for gem interference with global constants
+2. The Syck gem is a known source of YAML constant conflicts
+3. Test in the same execution context as production (LocalRunner vs DockerRunner behavior differs)
+4. Even with proper model definitions, external factors can break serialization
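
The fix in this file calls `Object.const_set(:YAML, Psych)` unconditionally. A minimal sketch of a guarded variant follows. The helper name `ensure_psych_yaml!` is hypothetical and not part of serialbench. It reassigns `YAML` only when something other than Psych holds the constant, and it removes the old binding first so Ruby does not print an "already initialized constant" warning.

```ruby
require 'psych'

# Hypothetical helper (an assumption, not serialbench code): make sure the
# top-level YAML constant refers to Psych before any to_yaml call.
# Returns true if the constant had to be restored, false if it was already Psych.
def ensure_psych_yaml!
  if Object.const_defined?(:YAML) && Object.const_get(:YAML).equal?(Psych)
    return false
  end

  # Drop the overriding binding (for example, the one installed by Syck)
  # so that const_set does not emit a redefinition warning.
  Object.send(:remove_const, :YAML) if Object.const_defined?(:YAML)
  Object.const_set(:YAML, Psych)
  true
end
```

Calling such a helper immediately before `results_model.to_file(results_file)` would have the same effect as the fix described in this file, and its return value can be logged to show when an override was actually present.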

data/docs/WEBSITE_COMPLETION_PLAN.md ADDED

@@ -0,0 +1,440 @@
+# SerialBench Website Completion Plan
+## Executive Summary
+This document outlines the plan to complete the SerialBench results website,
+which will display performance benchmarks for Ruby serialization libraries
+across different formats (XML, JSON, YAML, TOML).
+## Current State
+### Completed Components
+1. **Core Infrastructure**
+   - Lutaml::Model-based data models for benchmark results
+   - CLI interface using Thor for benchmark execution
+   - Docker and ASDF runners for multi-environment testing
+   - Result storage and aggregation via ResultSet
+   - GitHub Actions workflow for weekly automated benchmarks
+2. **Website Generation**
+   - Liquid templating engine integration
+   - Base template structure (base.liquid)
+   - Format-based view template (format_based.liquid)
+   - CSS styling with theme support
+   - JavaScript for Chart.js integration
+   - Metadata population from benchmark data
+3. **Deployment**
+   - GitHub Pages deployment via GitHub Actions
+   - Weekly schedule (Sundays at 2 AM UTC)
+   - Multi-platform matrix (Ruby 3.1-3.4 × Ubuntu/macOS)
+### Current Issues
+1. **Limited Visualization Options**
+   - Only format-based view available
+   - No serializer-focused comparisons
+   - Missing historical trend analysis
+2. **Incomplete Documentation**
+   - Website lacks explanatory text
+   - No methodology documentation
+   - Missing interpretation guidelines
+3. **Limited Interactivity**
+   - Static charts only
+   - No filtering or search capabilities
+   - No download options for raw data
+## Proposed Enhancements
+### Phase 1: Core Website Functionality (Priority: HIGH)
+#### 1.1 Multiple View Templates
+**Objective**: Provide different perspectives on benchmark data
+**Tasks**:
+- Create `serializer_based.liquid` template
+  - Group by serializer instead of format
+  - Show cross-format performance for each serializer
+  - Useful for comparing Nokogiri vs Ox vs REXML
+- Create `platform_based.liquid` template
+  - Group by Ruby version and OS
+  - Show how performance varies across platforms
+  - Useful for identifying platform-specific optimizations
+- Create `historical.liquid` template
+  - Time-series visualization
+  - Show performance trends over weekly runs
+  - Identify improvements or regressions
+**Implementation**:
+```ruby
+# In lib/serialbench/site_generator.rb
+def generate_views
+  ['format_based', 'serializer_based', 'platform_based',
+   'historical'].each do |view|
+    generate_view(view)
+  end
+end
+```
+**Estimated Effort**: 2-3 days
+#### 1.2 Navigation and Layout
+**Objective**: Cohesive multi-page website structure
+**Tasks**:
+- Update `base.liquid` with:
+  - Top navigation menu
+  - View switcher tabs
+  - Footer with methodology link
+  - Responsive mobile layout
+- Add breadcrumb navigation
+- Add page metadata (titles, descriptions)
+**Implementation**:
+```html
+<nav class="main-navigation">
+  <ul>
+    <li><a href="index.html">Format View</a></li>
+    <li><a href="serializer_view.html">Serializer View</a></li>
+    <li><a href="platform_view.html">Platform View</a></li>
+    <li><a href="historical.html">Trends</a></li>
+    <li><a href="methodology.html">Methodology</a></li>
+  </ul>
+</nav>
+```
+**Estimated Effort**: 1 day
+#### 1.3 Enhanced Data Tables
+**Objective**: Sortable, searchable result tables
+**Tasks**:
+- Integrate DataTables.js or similar library
+- Add sorting by any column
+- Add search/filter functionality
+- Add CSV export option
+- Add copy-to-clipboard for sharing
+**Implementation**:
+```javascript
+$('.benchmark-table').DataTable({
+  order: [[2, 'desc']], // Sort by i/s, descending (assumes i/s is the third column)
+  pageLength: 25,
+  buttons: ['copy', 'csv', 'excel'] // Requires the DataTables Buttons extension
+});
+```
+**Estimated Effort**: 1 day
+### Phase 2: Documentation and Context (Priority: HIGH)
+#### 2.1 Methodology Page
+**Objective**: Explain how benchmarks are conducted
+**Tasks**:
+- Create `methodology.liquid` template
+- Document:
+  - Benchmark environment setup
+  - Test data characteristics
+  - Measurement methodology
+  - Statistical approach
+  - Limitations and caveats
+**Content Structure**:
+```markdown
+## Benchmark Methodology
+### Environment
+- Ruby versions tested
+- Operating systems
+- Hardware specifications (GitHub Actions runners)
+### Test Data
+- Sample size and structure
+- Complexity levels
+- Real-world representativeness
+### Measurements
+- Iterations per second (throughput)
+- Memory allocation (bytes)
+- Statistical significance
+### Limitations
+- Single-threaded only
+- Memory profiling overhead
+- GitHub Actions runner variability
+```
+**Estimated Effort**: 1 day
+#### 2.2 Interpretation Guide
+**Objective**: Help users understand results
+**Tasks**:
+- Add tooltips to charts
+- Add explanatory text to each view
+- Create "How to Read This Chart" sections
+- Add recommendations section
+**Example**:
+```html
+<div class="interpretation-guide">
+  <h3>Understanding the Results</h3>
+  <ul>
+    <li><strong>Higher i/s is better</strong>: More iterations per
+        second means faster serialization</li>
+    <li><strong>Lower memory is better</strong>: Less allocation
+        reduces GC pressure</li>
+    <li><strong>Trade-offs exist</strong>: Fastest may not be
+        most memory-efficient</li>
+  </ul>
+</div>
+```
+**Estimated Effort**: 0.5 days
+#### 2.3 Library Information
+**Objective**: Provide context about each serializer
+**Tasks**:
+- Add library descriptions
+- Link to official documentation
+- Show gem versions tested
+- List known issues or limitations
+**Implementation**:
+```ruby
+SERIALIZER_INFO = {
+  'nokogiri' => {
+    description: 'XML parsing using libxml2',
+    homepage: 'https://nokogiri.org',
+    features: ['XPath', 'CSS selectors', 'SAX parsing'],
+    notes: 'Most popular Ruby XML library'
+  },
+  # ... more serializers
+}.freeze
+```
+**Estimated Effort**: 1 day
+### Phase 3: Advanced Features (Priority: MEDIUM)
+#### 3.1 Historical Trend Analysis
+**Objective**: Track performance over time
+**Tasks**:
+- Store all historical benchmark runs
+- Create time-series database of results
+- Generate trend charts
+- Highlight significant changes
+- Add regression detection
+**Implementation**:
+```javascript
+// Chart.js line chart for trends
+const trendChart = new Chart(ctx, {
+  type: 'line',
+  data: {
+    labels: timestamps,
+    datasets: [{
+      label: 'Nokogiri (parse)',
+      data: nokogiriParseResults,
+      borderColor: 'rgb(75, 192, 192)'
+    }]
+  }
+});
+```
+**Estimated Effort**: 2 days
+#### 3.2 Comparison Matrix
+**Objective**: Side-by-side serializer comparisons
+**Tasks**:
+- Create comparison view
+- Allow selecting 2-4 serializers
+- Show radar chart of characteristics
+- Highlight strengths/weaknesses
+**Estimated Effort**: 1-2 days
+#### 3.3 Custom Benchmark Runs
+**Objective**: Allow users to run specific benchmarks
+**Tasks**:
+- Add configuration UI
+- Generate custom benchmark commands
+- Provide Docker commands for local execution
+- Document how to submit results
+**Note**: This would be documentation only, not actual execution on
+the website
+**Estimated Effort**: 1 day
+### Phase 4: Performance and Polish (Priority: LOW)
+#### 4.1 Performance Optimization
+**Tasks**:
+- Minimize JavaScript/CSS
+- Lazy load charts
+- Optimize image assets
+- Add service worker for caching
+- Compress result data
+**Estimated Effort**: 1 day
+#### 4.2 Accessibility
+**Tasks**:
+- Add ARIA labels
+- Ensure keyboard navigation
+- Test with screen readers
+- Add alternative text descriptions of charts
+- Ensure sufficient color contrast
+**Estimated Effort**: 1 day
+#### 4.3 Mobile Optimization
+**Tasks**:
+- Responsive chart sizes
+- Touch-friendly interactions
+- Mobile navigation menu
+- Optimize for smaller screens
+**Estimated Effort**: 0.5 days
+## Implementation Priority
+### Immediate (Next Sprint)
+1. Create serializer_based.liquid template
+2. Add methodology page
+3. Enhance navigation in base.liquid
+4. Add interpretation guides to existing charts
+### Short-term (1-2 Weeks)
+1. Implement historical trend tracking
+2. Add DataTables for sortable results
+3. Create platform_based view
+4. Add library information sections
+### Medium-term (1 Month)
+1. Comparison matrix feature
+2. Performance optimizations
+3. Accessibility improvements
+4. Mobile optimization
+### Long-term (Ongoing)
+1. Monitor weekly benchmark runs
+2. Add new serializers as they emerge
+3. Update methodology as needed
+4. Community feedback integration
+## Technical Requirements
+### Dependencies to Add
+```ruby
+# Gemfile additions for enhanced website
+gem 'rouge', '~> 4.0' # Syntax highlighting for code examples
+```
+### JavaScript Libraries
+- Chart.js (already included)
+- DataTables.js (for sortable tables)
+- Lodash (for data manipulation)
+### Build Process Updates
+```ruby
+# lib/serialbench/site_generator.rb enhancements
+class SiteGenerator
+  def generate
+    copy_assets
+    generate_views
+    generate_methodology_page
+    generate_index_redirect
+    optimize_assets
+  end
+  private
+  def generate_views
+    %w[format_based serializer_based platform_based
+       historical].each do |view|
+      generate_view(view)
+    end
+  end
+end
+```
+## Success Metrics
+1. **Functionality**
+   - All views generate without errors
+   - Charts render correctly in all major browsers
+   - Mobile experience is usable
+2. **Usability**
+   - Users can find information in < 3 clicks
+   - Charts are self-explanatory
+   - Methodology is clear and comprehensive
+3. **Performance**
+   - Page load < 2 seconds
+   - Time to interactive < 3 seconds
+   - Lighthouse score > 90
+4. **Accessibility**
+   - WCAG 2.1 Level AA compliance
+   - Screen reader compatible
+   - Keyboard navigation functional
+## Risk Mitigation
+1. **Data Volume Growth**
+   - Risk: Historical data grows unbounded
+   - Mitigation: Implement data retention policy (e.g., keep 52 weeks)
+2. **GitHub Pages Limits**
+   - Risk: Site exceeds 1GB limit
+   - Mitigation: Compress data, archive old results
+3. **Breaking Changes**
+   - Risk: Lutaml::Model API changes
+   - Mitigation: Pin versions, test before updates
+4. **Runner Variability**
+   - Risk: Inconsistent benchmark results
+   - Mitigation: Document variance, run multiple times, use statistics
+## Conclusion
+This plan provides a structured approach to completing the SerialBench
+results website. The phased approach allows for incremental delivery
+of value while maintaining quality and usability standards.
+**Next Immediate Action**: Push changes and verify GitHub Actions
+workflow executes successfully.
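
The retention policy listed under Risk Mitigation (keep 52 weeks of history) could be sketched as follows. The method name `prune_runs` and the run shape (a hash with an ISO 8601 `:timestamp`) are assumptions for illustration, not serialbench's actual data model.

```ruby
require 'time'

SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Hypothetical helper: keep only the runs whose timestamp falls inside the
# retention window. `runs` is assumed to be an array of hashes such as
# { timestamp: '2024-12-01T00:00:00Z' }.
def prune_runs(runs, weeks: 52, now: Time.now.utc)
  cutoff = now - (weeks * SECONDS_PER_WEEK)
  runs.select { |run| Time.iso8601(run[:timestamp]) >= cutoff }
end
```

Passing `now:` explicitly keeps the function deterministic, which makes it straightforward to test before wiring it into a weekly job.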

data/docs/WINDOWS_LIBXML_FIX.md ADDED

@@ -0,0 +1,136 @@
+# Windows libxml2 Installation Fix
+## Issue
+The `libxml-ruby` gem was failing to install on Windows in GitHub Actions for Ruby 3.1 and 3.4 due to missing native libxml2 libraries.
+**Error:**
+```
+extconf failure: Cannot find libxml2.
+Install the library or try one of the following options to extconf.rb:
+  --with-xml2-config=/path/to/xml2-config
+  --with-xml2-dir=/path/to/libxml2
+```
+**Root Cause:**
+The `libxml-ruby` gem requires native C libraries (libxml2) that are not available by default on Windows. The gem's native extension compilation fails because:
+1. libxml2 development headers are not found
+2. libxml2 shared libraries are not in the system path
+3. pkg-config cannot locate the libxml2 package
+## Solution
+### Approach
+Instead of excluding `libxml-ruby` from Windows builds, we install the required native libraries using the Chocolatey package manager, which is pre-installed on GitHub Actions Windows runners.
+### Implementation
+#### 1. New GitHub Actions Workflow (`.github/workflows/windows-setup.yml`)
+Created a dedicated Windows workflow that:
+- Installs libxml2 via Chocolatey before Ruby setup
+- Dynamically locates the libxml2 installation directory
+- Configures environment variables (PKG_CONFIG_PATH, PATH)
+- Passes library paths to bundler for gem compilation
+- Runs tests to verify the setup
+Key features:
+- **Dynamic path detection**: Uses PowerShell to find the actual libxml2 installation directory
+- **Robust configuration**: Sets multiple environment variables to ensure gem compilation succeeds
+- **Bundler configuration**: Explicitly passes include and lib paths to the gem build process
+#### 2. Documentation (`docs/WINDOWS_SETUP.md`)
+Comprehensive guide covering:
+- Problem explanation
+- Multiple installation options (Chocolatey, vcpkg)
+- Step-by-step setup instructions
+- Environment variable configuration
+- Troubleshooting common issues
+- CI/CD integration
+## Technical Details
+### Chocolatey Package
+The libxml2 Chocolatey package installs to:
+```
+C:\ProgramData\chocolatey\lib\libxml2\tools\libxml2-{version}-win32-x86_64\
+```
+Required components:
+- `bin/` - DLL files (libxml2.dll, etc.)
+- `include/libxml2/` - Header files for compilation
+- `lib/` - Link libraries and pkg-config files
+- `lib/pkgconfig/` - libxml-2.0.pc file
+### Environment Variables
+The workflow sets:
+- `PKG_CONFIG_PATH`: Points to the pkgconfig directory
+- `PATH`: Includes the bin directory for DLLs
+- `LIBXML2_INCLUDE`: Header file location for gem compilation
+- `LIBXML2_LIB`: Library file location for gem compilation
+### Bundler Configuration
+```powershell
+bundle config build.libxml-ruby "--with-xml2-include=<path> --with-xml2-lib=<path>"
+```
+This passes the library locations directly to the gem's extconf.rb during native extension compilation.
+## Testing
+The fix can be tested by:
+1. **Local Windows testing:**
+   ```powershell
+   choco install libxml2
+   choco install pkgconfiglite
+   bundle install
+   bundle exec rake spec
+   ```
+2. **GitHub Actions:**
+   - Push changes to trigger the `windows-setup` workflow
+   - Verify both Ruby 3.1 and 3.4 builds succeed
+   - Check that libxml-ruby tests pass
+## Benefits
+1. **Complete testing**: Windows builds now test libxml-ruby serializer
+2. **No platform exclusions**: Maintains consistency across all platforms
+3. **Automated setup**: CI/CD handles installation automatically
+4. **Documented process**: Clear guidance for local development
+## Alternative Approaches Considered
+### 1. Platform-conditional dependency (rejected)
+```ruby
+spec.add_dependency 'libxml-ruby' unless Gem.win_platform?
+```
+**Pros:** Simple, no setup required
+**Cons:** Reduces test coverage, inconsistent behavior across platforms
+### 2. Pre-built binary gems (not available)
+The libxml-ruby gem doesn't provide pre-built Windows binaries, requiring source compilation.
+### 3. WSL-only testing (rejected)
+**Pros:** Linux-like environment
+**Cons:** Doesn't test native Windows Ruby installations
+## Future Considerations
+1. **Caching**: Consider caching the Chocolatey packages to speed up CI runs
+2. **Version pinning**: May want to pin libxml2 version for reproducibility
+3. **Alternative packages**: Monitor for official pre-built binaries
+4. **Ruby 3.2/3.3**: Extend testing to additional Ruby versions if needed
+## References
+- GitHub Actions run: https://github.com/metanorma/serialbench/actions/runs/18628094589
+- libxml-ruby gem: https://github.com/xml4r/libxml-ruby
+- Chocolatey libxml2 package: https://community.chocolatey.org/packages/libxml2
+- Windows setup documentation: `docs/WINDOWS_SETUP.md`
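
The dynamic path detection described in this file is done in PowerShell in the workflow. The same idea can be sketched in Ruby, under some assumptions: the helper name `libxml2_build_flags` is hypothetical, the directory layout is the one listed under "Chocolatey Package", and when several versioned directories exist the lexicographically last one is chosen.

```ruby
# Hypothetical helper (not part of serialbench): find the versioned libxml2
# directory under a Chocolatey tools root and build the extconf flags shown
# in the "Bundler Configuration" section. Returns nil if nothing is found.
def libxml2_build_flags(tools_root)
  # Dir.glob treats backslashes as escapes, so normalise to forward slashes.
  root = tools_root.tr('\\', '/')
  candidates = Dir.glob(File.join(root, 'libxml2-*')).select { |path| File.directory?(path) }
  dir = candidates.max # assumption: lexicographically last directory wins
  return nil unless dir

  include_dir = File.join(dir, 'include', 'libxml2')
  lib_dir = File.join(dir, 'lib')
  "--with-xml2-include=#{include_dir} --with-xml2-lib=#{lib_dir}"
end
```

The returned string is what would be passed to `bundle config build.libxml-ruby`. A `nil` result signals that the Chocolatey package is not installed where expected.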