legal_summariser 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.rspec +3 -0
- data/CHANGELOG.md +46 -0
- data/Gemfile +6 -0
- data/README.md +281 -0
- data/Rakefile +12 -0
- data/exe/legal_summariser +121 -0
- data/lib/legal_summariser/clause_detector.rb +206 -0
- data/lib/legal_summariser/document_parser.rb +10 -0
- data/lib/legal_summariser/formatter.rb +213 -0
- data/lib/legal_summariser/risk_analyzer.rb +257 -0
- data/lib/legal_summariser/summariser.rb +230 -0
- data/lib/legal_summariser/text_extractor.rb +79 -0
- data/lib/legal_summariser/version.rb +5 -0
- data/lib/legal_summariser.rb +71 -0
- metadata +204 -0
 
    
        checksums.yaml
    ADDED
    
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 2a32e0da3e5422be003d79a333a6f3ea9417fadcc362164e3cef9cae0d84dafb
+  data.tar.gz: 3219d6167c936a2f056f43b5e2491bc4c67a697ef9d72169c7741be03f5a2726
+SHA512:
+  metadata.gz: 9481e9eb32d6770586b21f8c56ced7f37d99afe8c9ba162fd284cc086b8f02f71b042bef0200bd61104446c0309763da7c362a3e5abae202ccf295c04ef63281
+  data.tar.gz: c41d771b2ef842b185ebf0114de4921060ad6e55a17377acfc47412790237428fcab8a5ceff3efe2112b3341a9380a30160244c21b0b078eecc108181e9d4ce8
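
A digest recorded in checksums.yaml can be compared against a locally unpacked archive using Ruby's standard `digest` library. The following is a minimal sketch; the helper name `digest_matches?` and the local file path are assumptions for illustration and are not part of the gem.

```ruby
require "digest"

# Hypothetical helper: returns true when the file's SHA256 hex digest
# equals the expected value (for example, one copied from checksums.yaml).
def digest_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end
```

For example, `digest_matches?("data.tar.gz", "3219d6167c936a2f056f43b5e2491bc4c67a697ef9d72169c7741be03f5a2726")` would check a local copy of the data archive against the SHA256 entry for `data.tar.gz`.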
    
        data/.rspec
    ADDED
    
    
    
        data/CHANGELOG.md
    ADDED
    
@@ -0,0 +1,46 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [0.1.0] - 2024-09-09
+
+### Added
+- Initial release of Legal Summariser
+- Text extraction from PDF, DOCX, and TXT files
+- Basic legal document summarisation with plain English conversion
+- Clause detection for 8 key legal areas:
+  - Data Processing & Privacy
+  - Liability & Indemnification
+  - Confidentiality & Non-disclosure
+  - Termination & Cancellation
+  - Payment & Fees
+  - Intellectual Property
+  - Dispute Resolution
+  - Governing Law
+- Risk analysis system with high/medium risk detection
+- Compliance gap identification for GDPR and KVKK
+- Unfair terms detection
+- Multiple output formats (JSON, Markdown, Plain Text)
+- Command-line interface with Thor
+- Comprehensive test suite with RSpec
+- Document type auto-detection
+- Offline processing capabilities
+
+### Features
+- Rule-based clause extraction using regex patterns
+- Smart sentence scoring for summarisation
+- Legal language simplification
+- Risk scoring algorithm
+- Compliance checking framework
+- Multi-format document support
+- CLI demo mode with sample NDA
+
+### Technical
+- Ruby gem structure with proper gemspec
+- Modular architecture with separate classes for each function
+- Error handling for unsupported formats and missing files
+- Text cleaning and normalization
+- Comprehensive documentation and examples
    
        data/Gemfile
    ADDED
    
    
    
        data/README.md
    ADDED
    
@@ -0,0 +1,281 @@
+# Legal Summariser 📋⚖️
+
+> A Ruby-based AI-powered toolkit for legal document analysis that summarises contracts, extracts key clauses, flags risks, and translates legal jargon into plain English while preserving legal accuracy.
+
+**Created by [Ahmet KAHRAMAN](https://ahmetxhero.web.app)** 👨‍💻
+
+[](https://ruby-lang.org)
+[](LICENSE)
+[](#testing)
+
+---
+
+## 👋 About the Author
+
+**Ahmet KAHRAMAN** - Mobile Developer & Cyber Security Expert
+
+- 🌐 **Portfolio**: [ahmetxhero.web.app](https://ahmetxhero.web.app)
+- 🎥 **YouTube**: [@ahmetxhero](https://youtube.com/@ahmetxhero)
+- 💼 **LinkedIn**: [linkedin.com/in/ahmetxhero](https://linkedin.com/in/ahmetxhero)
+- 🐤 **Twitter**: [@ahmetxhero](https://x.com/ahmetxhero)
+- 📧 **Email**: ahmetxhero@gmail.com
+- 🏠 **Location**: Ankara, Turkey 🇹🇷
+
+*"Security first, innovation always"* - Building secure, innovative solutions for a better digital future 🚀
+
+---
+
+## Features
+
+- **Document Processing**: Supports PDF, DOCX, and plain text files
+- **Smart Summarisation**: Converts legal documents into concise plain English
+- **Clause Detection**: Automatically identifies key legal clauses including:
+  - Data Processing & Privacy (GDPR/KVKK compliance)
+  - Liability & Indemnification
+  - Confidentiality & Non-disclosure
+  - Termination & Cancellation
+  - Payment & Fees
+  - Intellectual Property
+  - Dispute Resolution
+  - Governing Law
+- **Risk Analysis**: Flags potential legal risks and unfair terms
+- **Compliance Checking**: Identifies gaps in GDPR, KVKK, and other regulations
+- **Multiple Output Formats**: JSON, Markdown, and plain text
+- **CLI Interface**: Command-line tool for batch processing
+- **Offline Processing**: Works without internet for sensitive documents
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'legal_summariser'
+```
+
+And then execute:
+
+```bash
+bundle install
+```
+
+Or install it yourself as:
+
+```bash
+gem install legal_summariser
+```
+
+## Usage
+
+### Ruby API
+
+```ruby
+require "legal_summariser"
+
+# Basic usage
+summary = LegalSummariser.summarise("contracts/nda.pdf")
+puts summary[:plain_text]
+# => "This Non-Disclosure Agreement establishes confidentiality, valid for 2 years. The company may terminate at any time..."
+
+# With options
+result = LegalSummariser.summarise("contract.pdf", {
+  format: 'markdown',
+  max_sentences: 3
+})
+
+# Access different parts of the analysis
+puts result[:key_points]        # Key contract points
+puts result[:clauses]           # Detected legal clauses
+puts result[:risks]             # Risk analysis
+puts result[:metadata]          # Document metadata
+```
+
+### Command Line Interface
+
+```bash
+# Analyze a document
+legal_summariser analyze contract.pdf
+
+# Specify output format
+legal_summariser analyze contract.pdf --format markdown
+
+# Save to file
+legal_summariser analyze contract.pdf --output summary.md --format markdown
+
+# Run demo
+legal_summariser demo
+
+# Show supported formats
+legal_summariser supported_formats
+
+# Show version
+legal_summariser version
+```
+
+## Example Output
+
+### Plain Text Summary
+```
+This Non-Disclosure Agreement establishes confidentiality obligations between parties.
+The agreement will remain valid for 2 years from the date of signing. Either party may
+terminate with 30 days written notice. The receiving party will be liable for any
+breach of confidentiality obligations.
+```
+
+### Risk Analysis
+```
+High Risks Found:
+- Unlimited Liability: Agreement may expose party to unlimited financial liability
+- Broad Indemnification: Very broad indemnification obligations that could be costly
+
+Compliance Gaps:
+- Missing GDPR Reference: Document processes personal data but lacks GDPR compliance language
+- Missing Data Subject Rights: No mention of data subject rights under GDPR
+```
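
Findings such as "Unlimited Liability" or "Missing GDPR Reference" could plausibly be produced by simple pattern rules. The sketch below is only an illustration under that assumption: the constant `HIGH_RISK_RULES`, the method names, and every regular expression are hypothetical and are not taken from the gem's `risk_analyzer.rb`.

```ruby
# Hypothetical rule table: risk label => pattern that suggests the risk.
HIGH_RISK_RULES = {
  "Unlimited Liability" => /unlimited\s+liability/i,
  "Broad Indemnification" => /indemnif(?:y|ies|ication)\b.*\b(?:any|all)\s+(?:claims|losses)/i
}.freeze

# Returns the labels of every rule whose pattern appears in the text.
def high_risks(text)
  HIGH_RISK_RULES.select { |_label, pattern| text.match?(pattern) }.keys
end

# Flags a gap when personal data is mentioned but GDPR is never referenced.
def compliance_gaps(text)
  gaps = []
  mentions_personal_data = text.match?(/personal\s+data/i)
  mentions_gdpr = text.match?(/\bGDPR\b|General Data Protection Regulation/i)
  gaps << "Missing GDPR Reference" if mentions_personal_data && !mentions_gdpr
  gaps
end
```

Keeping the rules in a frozen hash makes it straightforward to add or tune patterns without touching the matching logic, though real contracts would need far more careful patterns than these.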
+
+### Detected Clauses
+- **Confidentiality**: 3 clauses found
+- **Liability**: 2 clauses found
+- **Termination**: 1 clause found
+- **Data Processing**: 2 clauses found
+
+## Supported Document Types
+
+The system automatically detects and optimizes analysis for:
+
+- **Non-Disclosure Agreements (NDAs)**
+- **Service Agreements**
+- **Employment Contracts**
+- **Privacy Policies**
+- **License Agreements**
+- **General Contracts**
+
+## Supported File Formats
+
+### Input Formats
+- PDF (.pdf)
+- Microsoft Word (.docx)
+- Plain Text (.txt)
+
+### Output Formats
+- JSON (structured data)
+- Markdown (formatted report)
+- Plain Text (simple summary)
+
+## Target Users
+
+- **Law firms & compliance teams**: Faster contract reviews
+- **Startups & SMEs**: Understanding investor or supplier contracts
+- **Forensics experts**: Extracting critical legal clauses for reports
+- **Academics & NGOs**: Analysing legal policies and regulations
+
+## Technical Architecture
+
+### Hybrid Approach
+- **Rule-based extractors**: For structured clause detection
+- **NLP processing**: For summarisation and risk detection
+- **Pattern matching**: For compliance gap identification
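
To make the rule-based extraction idea concrete, here is a minimal sketch of regex-driven clause detection. It assumes a hypothetical `CLAUSE_PATTERNS` table and a `detect_clauses` method; the patterns are illustrative and may differ from whatever `clause_detector.rb` actually uses.

```ruby
# Hypothetical clause patterns, keyed by clause type.
CLAUSE_PATTERNS = {
  confidentiality: /\bconfidential(?:ity)?\b|\bnon-disclosure\b/i,
  termination: /\bterminat(?:e|es|ed|ion)\b/i,
  governing_law: /\bgoverned\s+by\b|\bgoverning\s+law\b/i
}.freeze

# Splits the text into sentences and groups them by the clause type they match.
# Clause types with no matching sentence are omitted from the result.
def detect_clauses(text)
  sentences = text.split(/(?<=[.!?])\s+/)
  CLAUSE_PATTERNS.each_with_object({}) do |(type, pattern), found|
    matches = sentences.select { |sentence| sentence.match?(pattern) }
    found[type] = matches unless matches.empty?
  end
end
```

Counting the sentences per key, for example `detect_clauses(text).transform_values(&:size)`, would give per-clause totals in the style of a "Detected Clauses" report.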
+
+### Key Components
+- **TextExtractor**: Multi-format document parsing
+- **Summariser**: Plain English conversion engine
+- **ClauseDetector**: Legal clause identification
+- **RiskAnalyzer**: Risk assessment and flagging
+- **Formatter**: Multi-format output generation
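
As an illustration of what a plain English conversion step with sentence scoring might look like, the sketch below ranks sentences by weighted keywords and then substitutes plainer wording. Everything here is an assumption for illustration: `PLAIN_ENGLISH`, `KEYWORD_WEIGHTS`, and `sketch_summary` are hypothetical names, and the gem's `summariser.rb` may work quite differently.

```ruby
# Hypothetical jargon replacements and keyword weights.
PLAIN_ENGLISH = {
  /\bhereinafter\b/i => "from now on",
  /\bshall\b/i => "will",
  /\bpursuant to\b/i => "under"
}.freeze

KEYWORD_WEIGHTS = { "terminate" => 2, "liable" => 2, "confidential" => 1 }.freeze

# Replaces each jargon pattern with its plainer equivalent.
def simplify(sentence)
  PLAIN_ENGLISH.reduce(sentence) { |out, (pattern, plain)| out.gsub(pattern, plain) }
end

# Scores a sentence by summing the weights of keywords its words start with.
def score(sentence)
  words = sentence.downcase.scan(/[a-z]+/)
  KEYWORD_WEIGHTS.sum { |keyword, weight| words.count { |w| w.start_with?(keyword) } * weight }
end

# Picks the highest-scoring sentences, keeps document order, then simplifies.
def sketch_summary(text, max_sentences: 2)
  sentences = text.split(/(?<=[.!?])\s+/)
  top = sentences.max_by(max_sentences) { |sentence| score(sentence) }
  sentences.select { |s| top.include?(s) }.map { |s| simplify(s) }.join(" ")
end
```

Selecting by score but emitting in original order keeps the summary readable, since legal sentences often depend on what came before them.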
| 
      
 184 
     | 
    
         
            +
             
     | 
| 
      
 185 
     | 
    
         
            +
            ## Development
         
     | 
| 
      
 186 
     | 
    
         
            +
             
     | 
| 
      
 187 
     | 
    
         
            +
            After checking out the repo, run `bundle install` to install dependencies. Then, run `rake spec` to run the tests.
         
     | 
| 
      
 188 
     | 
    
         
            +
             
     | 
| 
      
 189 
     | 
    
         
            +
            ```bash
         
     | 
| 
      
 190 
     | 
    
         
            +
            # Install dependencies
         
     | 
| 
      
 191 
     | 
    
         
            +
            bundle install
         
     | 
| 
      
 192 
     | 
    
         
            +
             
     | 
| 
      
 193 
     | 
    
         
            +
            # Run tests
         
     | 
| 
      
 194 
     | 
    
         
            +
            bundle exec rspec
         
     | 
| 
      
 195 
     | 
    
         
            +
             
     | 
| 
      
 196 
     | 
    
         
            +
            # Run linter
         
     | 
| 
      
 197 
     | 
    
         
            +
            bundle exec rubocop
         
     | 
| 
      
 198 
     | 
    
         
            +
             
     | 
| 
      
 199 
     | 
    
         
            +
            # Build gem
         
     | 
| 
      
 200 
     | 
    
         
            +
            gem build legal_summariser.gemspec
         
     | 
| 
      
 201 
     | 
    
         
            +
             
     | 
| 
      
 202 
     | 
    
         
            +
            # Install local gem
         
     | 
| 
      
 203 
     | 
    
         
            +
            gem install ./legal_summariser-*.gem
         
     | 
| 
      
 204 
     | 
    
         
            +
            ```
         
     | 
| 
      
 205 
     | 
    
         
            +
             
     | 
| 
      
 206 
     | 
    
         
            +
            ## Roadmap
         
     | 
| 
      
 207 
     | 
    
         
            +
             
     | 
| 
      
 208 
     | 
    
         
            +
            - **v0.1** ✅ Text extraction + basic summarisation
         
     | 
| 
      
 209 
     | 
    
         
            +
            - **v0.2** ✅ Clause detection + risk flagging  
         
     | 
| 
      
 210 
     | 
    
         
            +
            - **v0.3** 🔄 Plain language generator (fine-tuned models)
         
     | 
| 
      
 211 
     | 
    
         
            +
            - **v1.0** 📋 Multi-language support + PDF annotation output
         
     | 
| 
      
 212 
     | 
    
         
            +
             
     | 
| 
      
 213 
     | 
    
         
            +
            ## Contributing
         
     | 
| 
      
 214 
     | 
    
         
            +
             
     | 
| 
      
 215 
     | 
    
         
            +
            Bug reports and pull requests are welcome on GitHub at https://github.com/legal-summariser/legal_summariser.
         
     | 
| 
      
 216 
     | 
    
         
            +
             
     | 
| 
      
 217 
     | 
    
         
            +
            ## License
         
     | 
| 
      
 218 
     | 
    
         
            +
             
     | 
| 
      
 219 
     | 
    
         
            +
            The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
         
     | 
| 
      
 220 
     | 
    
         
            +
             
     | 
| 
      
 221 
     | 
    
         
            +
            ## Disclaimer
         
     | 
| 
      
 222 
     | 
    
         
            +
             
     | 
| 
      
 223 
     | 
    
         
            +
            This tool is designed to assist with legal document analysis but should not replace professional legal advice. Always consult with qualified legal professionals for important legal matters.
         
     | 
| 
      
 224 
     | 
    
         
            +
             
     | 
| 
      
 225 
     | 
    
         
            +
            ## 🌍 Global Impact
         
     | 
| 
      
 226 
     | 
    
         
            +
             
     | 
| 
      
 227 
     | 
    
         
            +
            - **Innovation**: Bridges AI with law and compliance in a developer-friendly way
         
     | 
| 
      
 228 
     | 
    
         
            +
            - **Contribution**: Open-source library for legal NLP in Ruby
         
     | 
| 
      
 229 
     | 
    
         
            +
            - **Public benefit**: Helps both professionals and citizens better understand their rights and obligations
         
     | 
| 
      
 230 
     | 
    
         
            +
            - **Global relevance**: Applicable across regulatory regimes (GDPR, KVKK, HIPAA, CCPA)
         
     | 
| 
      
 231 
     | 
    
         
            +
             
     | 
| 
      
 232 
     | 
    
         
            +
            ## 🛠️ Tech Stack
         
     | 
| 
      
 233 
     | 
    
         
            +
             
     | 
| 
      
 234 
     | 
    
         
            +
            This project leverages my expertise in:
         
     | 
| 
      
 235 
     | 
    
         
            +
             
     | 
| 
      
 236 
     | 
    
         
            +
            - **Ruby Development**: Gem architecture, modular design patterns
         
     | 
| 
      
 237 
     | 
    
         
            +
            - **AI & NLP**: Rule-based text analysis, pattern recognition
         
     | 
| 
      
 238 
     | 
    
         
            +
            - **Cybersecurity**: Compliance frameworks (GDPR, KVKK), risk assessment
         
     | 
| 
      
 239 
     | 
    
         
            +
            - **Digital Forensics**: Legal document analysis, evidence extraction
         
     | 
| 
      
 240 
     | 
    
         
            +
            - **Software Engineering**: Test-driven development, CLI tools
         
     | 
| 
      
 241 
     | 
    
         
            +
             
     | 
| 
      
 242 
     | 
    
         
            +
            ## 🎓 Professional Background
         
     | 
| 
      
 243 
     | 
    
         
            +
             
     | 
| 
      
 244 
     | 
    
         
            +
            As a **Mobile Developer & Cyber Security Expert** with 10+ years in Public Sector IT:
         
     | 
| 
      
 245 
     | 
    
         
            +
             
     | 
| 
      
 246 
     | 
    
         
            +
            - 🎓 **Master's in Forensic Informatics** - Gazi University (2021-2023)
         
     | 
| 
      
 247 
     | 
    
         
            +
            - 🏢 **Mobile Developer** - Gendarmerie General Command (2024-Present)
         
     | 
| 
      
 248 
     | 
    
         
            +
            - 🔒 **Certified Ethical Hacker (CEH)**
         
     | 
| 
      
 249 
     | 
    
         
            +
            - 📱 **iOS & Android Development Expert**
         
     | 
| 
      
 250 
     | 
    
         
            +
             
     | 
| 
      
 251 
     | 
    
         
            +
            ## 🤝 Connect & Collaborate
         
     | 
| 
      
 252 
     | 
    
         
            +
             
     | 
| 
      
 253 
     | 
    
         
            +
            | Platform | Link | Purpose |
         
     | 
| 
      
 254 
     | 
    
         
            +
            |----------|------|---------|
         
     | 
| 
      
 255 
     | 
    
         
            +
            | 🌐 **Portfolio** | [ahmetxhero.web.app](https://ahmetxhero.web.app) | Professional showcase |
         
     | 
| 
      
 256 
     | 
    
         
            +
            | 🎥 **YouTube** | [@ahmetxhero](https://youtube.com/@ahmetxhero) | Tech tutorials & content |
         
     | 
| 
      
 257 
     | 
    
         
            +
            | 💼 **LinkedIn** | [ahmetxhero](https://linkedin.com/in/ahmetxhero) | Professional network |
         
     | 
| 
      
 258 
     | 
    
         
            +
            | 🐤 **Twitter** | [@ahmetxhero](https://x.com/ahmetxhero) | Tech updates & thoughts |
         
     | 
| 
      
 259 
     | 
    
         
            +
            | 📝 **Medium** | [ahmetxhero.medium.com](https://ahmetxhero.medium.com) | Technical articles |
         
     | 
| 
      
 260 
     | 
    
         
            +
            | 📷 **Instagram** | [@ahmetxhero](https://instagram.com/ahmetxhero) | Behind the scenes |
         
     | 
| 
      
 261 
     | 
    
         
            +
             
     | 
| 
      
 262 
     | 
    
         
            +
            ## ☕ Support My Work
         
     | 
| 
      
 263 
     | 
    
         
            +
             
     | 
| 
      
 264 
     | 
    
         
            +
            If this project helps you or your organization, consider:
         
     | 
| 
      
 265 
     | 
    
         
            +
             
     | 
| 
      
 266 
     | 
    
         
            +
            - ⭐ **Star this repository** on GitHub
         
     | 
| 
      
 267 
     | 
    
         
            +
            - 🔄 **Share it** with your network
         
     | 
| 
      
 268 
     | 
    
         
            +
            - ☕ **Buy me a coffee** to support open source development
         
     | 
| 
      
 269 
     | 
    
         
            +
            - 🗣️ **Invite me to speak** at your tech event about legal tech innovation
         
     | 
| 
      
 270 
     | 
    
         
            +
             
     | 
| 
      
 271 
     | 
    
         
            +
            ## 🎯 Current Focus
         
     | 
| 
      
 272 
     | 
    
         
            +
             
     | 
| 
      
 273 
     | 
    
         
            +
            - 🔒 **Cybersecurity**: Developing secure applications with a privacy-first approach
         
     | 
| 
      
 274 
     | 
    
         
            +
            - 📱 **Mobile Development**: Creating innovative iOS and Android solutions
         
     | 
| 
      
 275 
     | 
    
         
            +
            - 🔍 **Digital Forensics**: Advancing forensic investigation techniques
         
     | 
| 
      
 276 
     | 
    
         
            +
            - 📚 **Knowledge Sharing**: Contributing to the tech community through open source
         
     | 
| 
      
 277 
     | 
    
         
            +
            - ⚖️ **Legal Tech**: Building tools that make legal processes more accessible
         
     | 
| 
      
 278 
     | 
    
         
            +
             
     | 
| 
      
 279 
     | 
    
         
            +
            ---
         
     | 
| 
      
 280 
     | 
    
         
            +
             
     | 
| 
      
 281 
     | 
    
         
            +
            *Building secure, innovative solutions for a better digital future* 🚀
         
     | 
    
        data/Rakefile
    ADDED
    
        data/exe/legal_summariser
    ADDED
    
    
| 
         @@ -0,0 +1,121 @@ 
     | 
|
| 
      
 1 
     | 
    
         
            +
            #!/usr/bin/env ruby
         
     | 
| 
      
 2 
     | 
    
         
            +
            # frozen_string_literal: true
         
     | 
| 
      
 3 
     | 
    
         
            +
             
     | 
| 
      
 4 
     | 
    
         
            +
            require_relative '../lib/legal_summariser'
         
     | 
| 
      
 5 
     | 
    
         
            +
            require 'thor'
         
     | 
| 
      
 6 
     | 
    
         
            +
             
     | 
| 
      
 7 
     | 
    
         
            +
            module LegalSummariser
         
     | 
| 
      
 8 
     | 
    
         
            +
              class CLI < Thor
         
     | 
| 
      
 9 
     | 
    
         
            +
                desc "analyze FILE", "Analyze a legal document and generate summary"
         
     | 
| 
      
 10 
     | 
    
         
            +
                option :format, aliases: '-f', default: 'text', desc: 'Output format (json, markdown, text)'
         
     | 
| 
      
 11 
     | 
    
         
            +
                option :output, aliases: '-o', desc: 'Output file path (optional)'
         
     | 
| 
      
 12 
     | 
    
         
            +
                option :max_sentences, type: :numeric, default: 5, desc: 'Maximum sentences in summary'
         
     | 
| 
      
 13 
     | 
    
         
            +
                def analyze(file_path)
         
     | 
| 
      
 14 
     | 
    
         
            +
                  begin
         
     | 
| 
      
 15 
     | 
    
         
            +
                    puts "Analyzing: #{file_path}"
         
     | 
| 
      
 16 
     | 
    
         
            +
                    puts "Format: #{options[:format]}"
         
     | 
| 
      
 17 
     | 
    
         
            +
                    puts "-" * 50
         
     | 
| 
      
 18 
     | 
    
         
            +
                    
         
     | 
| 
      
 19 
     | 
    
         
            +
                    # Perform analysis
         
     | 
| 
      
 20 
     | 
    
         
            +
                    results = LegalSummariser.summarise(file_path, {
         
     | 
| 
      
 21 
     | 
    
         
            +
                      format: options[:format],
         
     | 
| 
      
 22 
     | 
    
         
            +
                      max_sentences: options[:max_sentences]
         
     | 
| 
      
 23 
     | 
    
         
            +
                    })
         
     | 
| 
      
 24 
     | 
    
         
            +
                    
         
     | 
| 
      
 25 
     | 
    
         
            +
                    # Output results
         
     | 
| 
      
 26 
     | 
    
         
            +
                    if options[:output]
         
     | 
| 
      
 27 
     | 
    
         
            +
                      File.write(options[:output], results)
         
     | 
| 
      
 28 
     | 
    
         
            +
                      puts "Results saved to: #{options[:output]}"
         
     | 
| 
      
 29 
     | 
    
         
            +
                    else
         
     | 
| 
      
 30 
     | 
    
         
            +
                      puts results
         
     | 
| 
      
 31 
     | 
    
         
            +
                    end
         
     | 
| 
      
 32 
     | 
    
         
            +
                    
         
     | 
| 
      
 33 
     | 
    
         
            +
                  rescue LegalSummariser::DocumentNotFoundError => e
         
     | 
| 
      
 34 
     | 
    
         
            +
                    $stderr.puts "Error: #{e.message}"
         
     | 
| 
      
 35 
     | 
    
         
            +
                    exit 1
         
     | 
| 
      
 36 
     | 
    
         
            +
                  rescue LegalSummariser::UnsupportedFormatError => e
         
     | 
| 
      
 37 
     | 
    
         
            +
                    $stderr.puts "Error: #{e.message}"
         
     | 
| 
      
 38 
     | 
    
         
            +
                    exit 1
         
     | 
| 
      
 39 
     | 
    
         
            +
                  rescue StandardError => e
         
     | 
| 
      
 40 
     | 
    
         
            +
                    $stderr.puts "Unexpected error: #{e.message}"
         
     | 
| 
      
 41 
     | 
    
         
            +
                    puts e.backtrace if ENV['DEBUG']
         
     | 
| 
      
 42 
     | 
    
         
            +
                    exit 1
         
     | 
| 
      
 43 
     | 
    
         
            +
                  end
         
     | 
| 
      
 44 
     | 
    
         
            +
                end
         
     | 
| 
      
 45 
     | 
    
         
            +
             
     | 
| 
      
 46 
     | 
    
         
            +
                desc "version", "Show version information"
         
     | 
| 
      
 47 
     | 
    
         
            +
                def version
         
     | 
| 
      
 48 
     | 
    
         
            +
                  puts "Legal Summariser v#{LegalSummariser::VERSION}"
         
     | 
| 
      
 49 
     | 
    
         
            +
                  puts "Ruby-based AI-powered legal document analysis toolkit"
         
     | 
| 
      
 50 
     | 
    
         
            +
                end
         
     | 
| 
      
 51 
     | 
    
         
            +
             
     | 
| 
      
 52 
     | 
    
         
            +
                desc "supported_formats", "List supported document formats"
         
     | 
| 
      
 53 
     | 
    
         
            +
                def supported_formats
         
     | 
| 
      
 54 
     | 
    
         
            +
                  puts "Supported document formats:"
         
     | 
| 
      
 55 
     | 
    
         
            +
                  puts "- PDF (.pdf)"
         
     | 
| 
      
 56 
     | 
    
         
            +
                  puts "- Microsoft Word (.docx)"
         
     | 
| 
      
 57 
     | 
    
         
            +
                  puts "- Plain text (.txt)"
         
     | 
| 
      
 58 
     | 
    
         
            +
                  puts ""
         
     | 
| 
      
 59 
     | 
    
         
            +
                  puts "Output formats:"
         
     | 
| 
      
 60 
     | 
    
         
            +
                  puts "- JSON (json)"
         
     | 
| 
      
 61 
     | 
    
         
            +
                  puts "- Markdown (markdown, md)"
         
     | 
| 
      
 62 
     | 
    
         
            +
                  puts "- Plain text (text, txt)"
         
     | 
| 
      
 63 
     | 
    
         
            +
                end
         
     | 
| 
      
 64 
     | 
    
         
            +
             
     | 
| 
      
 65 
     | 
    
         
            +
                desc "demo", "Run demo analysis on sample documents"
         
     | 
| 
      
 66 
     | 
    
         
            +
                def demo
         
     | 
| 
      
 67 
     | 
    
         
            +
                  puts "Legal Summariser Demo"
         
     | 
| 
      
 68 
     | 
    
         
            +
                  puts "=" * 50
         
     | 
| 
      
 69 
     | 
    
         
            +
                  puts ""
         
     | 
| 
      
 70 
     | 
    
         
            +
                  
         
     | 
| 
      
 71 
     | 
    
         
            +
                  # Create sample NDA text for demo
         
     | 
| 
      
 72 
     | 
    
         
            +
                  sample_text = create_sample_nda
         
     | 
| 
      
 73 
     | 
    
         
            +
                  sample_file = "/tmp/sample_nda.txt"
         
     | 
| 
      
 74 
     | 
    
         
            +
                  File.write(sample_file, sample_text)
         
     | 
| 
      
 75 
     | 
    
         
            +
                  
         
     | 
| 
      
 76 
     | 
    
         
            +
                  puts "Analyzing sample NDA document..."
         
     | 
| 
      
 77 
     | 
    
         
            +
                  puts ""
         
     | 
| 
      
 78 
     | 
    
         
            +
                  
         
     | 
| 
      
 79 
     | 
    
         
            +
                  results = LegalSummariser.summarise(sample_file, { format: 'markdown' })
         
     | 
| 
      
 80 
     | 
    
         
            +
                  puts results
         
     | 
| 
      
 81 
     | 
    
         
            +
                  
         
     | 
| 
      
 82 
     | 
    
         
            +
                  # Clean up
         
     | 
| 
      
 83 
     | 
    
         
            +
                  File.delete(sample_file) if File.exist?(sample_file)
         
     | 
| 
      
 84 
     | 
    
         
            +
                end
         
     | 
| 
      
 85 
     | 
    
         
            +
             
     | 
| 
      
 86 
     | 
    
         
            +
                private
         
     | 
| 
      
 87 
     | 
    
         
            +
             
     | 
| 
      
 88 
     | 
    
         
            +
                def create_sample_nda
         
     | 
| 
      
 89 
     | 
    
         
            +
                  <<~NDA
         
     | 
| 
      
 90 
     | 
    
         
            +
                    NON-DISCLOSURE AGREEMENT
         
     | 
| 
      
 91 
     | 
    
         
            +
             
     | 
| 
      
 92 
     | 
    
         
            +
                    This Non-Disclosure Agreement ("Agreement") is entered into on [DATE] between Company ABC ("Disclosing Party") and John Doe ("Receiving Party").
         
     | 
| 
      
 93 
     | 
    
         
            +
             
     | 
| 
      
 94 
     | 
    
         
            +
                    1. CONFIDENTIAL INFORMATION
         
     | 
| 
      
 95 
     | 
    
         
            +
                    The Disclosing Party may disclose certain confidential and proprietary information to the Receiving Party. Confidential information includes all technical data, trade secrets, know-how, research, product plans, products, services, customers, customer lists, markets, software, developments, inventions, processes, formulas, technology, designs, drawings, engineering, hardware configuration information, marketing, finances, or other business information.
         
     | 
| 
      
 96 
     | 
    
         
            +
             
     | 
| 
      
 97 
     | 
    
         
            +
                    2. OBLIGATIONS
         
     | 
| 
      
 98 
     | 
    
         
            +
                    The Receiving Party agrees to hold and maintain the Confidential Information in strict confidence for a period of two (2) years from the date of disclosure. The Receiving Party shall not disclose any Confidential Information to third parties without prior written consent.
         
     | 
| 
      
 99 
     | 
    
         
            +
             
     | 
| 
      
 100 
     | 
    
         
            +
                    3. LIABILITY
         
     | 
| 
      
 101 
     | 
    
         
            +
                    The Receiving Party shall be liable for any breach of this Agreement and shall indemnify the Disclosing Party against all claims, damages, and expenses arising from such breach.
         
     | 
| 
      
 102 
     | 
    
         
            +
             
     | 
| 
      
 103 
     | 
    
         
            +
                    4. TERMINATION
         
     | 
| 
      
 104 
     | 
    
         
            +
                    This Agreement may be terminated by either party with thirty (30) days' written notice. Upon termination, all Confidential Information must be returned or destroyed.
         
     | 
| 
      
 105 
     | 
    
         
            +
             
     | 
| 
      
 106 
     | 
    
         
            +
                    5. GOVERNING LAW
         
     | 
| 
      
 107 
     | 
    
         
            +
                    This Agreement shall be governed by the laws of England and Wales. Any disputes shall be resolved through binding arbitration.
         
     | 
| 
      
 108 
     | 
    
         
            +
             
     | 
| 
      
 109 
     | 
    
         
            +
                    6. DATA PROTECTION
         
     | 
| 
      
 110 
     | 
    
         
            +
                    Both parties acknowledge their obligations under the General Data Protection Regulation (GDPR) regarding any personal data processed under this Agreement.
         
     | 
| 
      
 111 
     | 
    
         
            +
             
     | 
| 
      
 112 
     | 
    
         
            +
                    IN WITNESS WHEREOF, the parties have executed this Agreement as of the date first written above.
         
     | 
| 
      
 113 
     | 
    
         
            +
                  NDA
         
     | 
| 
      
 114 
     | 
    
         
            +
                end
         
     | 
| 
      
 115 
     | 
    
         
            +
              end
         
     | 
| 
      
 116 
     | 
    
         
            +
            end
         
     | 
| 
      
 117 
     | 
    
         
            +
             
     | 
| 
      
 118 
     | 
    
         
            +
            # Run CLI if called directly
         
     | 
| 
      
 119 
     | 
    
         
            +
            if __FILE__ == $0
         
     | 
| 
      
 120 
     | 
    
         
            +
              LegalSummariser::CLI.start(ARGV)
         
     | 
| 
      
 121 
     | 
    
         
            +
            end
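
The `demo` command above writes its sample NDA to a fixed path (`/tmp/sample_nda.txt`) and deletes it only if the analysis returns normally. A minimal sketch of a more defensive variant follows, using Ruby's standard `Tempfile`. The helper name `with_sample_file` is hypothetical and not part of the gem. The `.txt` suffix is kept on the assumption that the library selects its extractor by file extension. A word count stands in for `LegalSummariser.summarise` so the snippet runs on its own.

```ruby
require 'tempfile'

# Hypothetical helper (not part of the gem): writes `text` to a uniquely
# named temporary file, yields its path, and removes the file afterwards,
# even if the block raises.
def with_sample_file(text, basename: 'sample_nda')
  file = Tempfile.new([basename, '.txt'])
  begin
    file.write(text)
    file.flush # make sure the content is on disk before the path is read
    yield file.path
  ensure
    file.close
    file.unlink
  end
end

# Stand-in for the real analysis call: count the words in the sample file.
word_count = with_sample_file("NON-DISCLOSURE AGREEMENT\n") do |path|
  File.read(path).split.size
end
puts word_count
```

Inside `demo`, the block body would be the existing `LegalSummariser.summarise(path, { format: 'markdown' })` call. The unique file name avoids collisions between concurrent runs, and the `ensure` clause covers the case where the analysis raises.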
         
     | 
    
        data/lib/legal_summariser/clause_detector.rb
    ADDED
    
    
| 
         @@ -0,0 +1,206 @@ 
     | 
|
| 
      
 1 
     | 
    
         
            +
            # frozen_string_literal: true
         
     | 
| 
      
 2 
     | 
    
         
            +
             
     | 
| 
      
 3 
     | 
    
         
            +
            module LegalSummariser
         
     | 
| 
      
 4 
     | 
    
         
            +
              class ClauseDetector
         
     | 
| 
      
 5 
     | 
    
         
            +
                attr_reader :text
         
     | 
| 
      
 6 
     | 
    
         
            +
             
     | 
| 
      
 7 
     | 
    
         
            +
                def initialize(text)
         
     | 
| 
      
 8 
     | 
    
         
            +
                  @text = text.downcase
         
     | 
| 
      
 9 
     | 
    
         
            +
                end
         
     | 
| 
      
 10 
     | 
    
         
            +
             
     | 
| 
      
 11 
     | 
    
         
            +
                # Detect key legal clauses in the document
         
     | 
| 
      
 12 
     | 
    
         
            +
                # @return [Hash] Detected clauses with their content
         
     | 
| 
      
 13 
     | 
    
         
            +
                def detect
         
     | 
| 
      
 14 
     | 
    
         
            +
                  {
         
     | 
| 
      
 15 
     | 
    
         
            +
                    data_processing: detect_data_processing_clauses,
         
     | 
| 
      
 16 
     | 
    
         
            +
                    liability: detect_liability_clauses,
         
     | 
| 
      
 17 
     | 
    
         
            +
                    confidentiality: detect_confidentiality_clauses,
         
     | 
| 
      
 18 
     | 
    
         
            +
                    termination: detect_termination_clauses,
         
     | 
| 
      
 19 
     | 
    
         
            +
                    payment: detect_payment_clauses,
         
     | 
| 
      
 20 
     | 
    
         
            +
                    intellectual_property: detect_ip_clauses,
         
     | 
| 
      
 21 
     | 
    
         
            +
                    dispute_resolution: detect_dispute_resolution_clauses,
         
     | 
| 
      
 22 
     | 
    
         
            +
                    governing_law: detect_governing_law_clauses
         
     | 
| 
      
 23 
     | 
    
         
            +
                  }.compact
         
     | 
| 
      
 24 
     | 
    
         
            +
                end
         
     | 
| 
      
 25 
     | 
    
         
            +
             
     | 
| 
      
 26 
     | 
    
         
            +
                private
         
     | 
| 
      
 27 
     | 
    
         
            +
             
     | 
| 
      
 28 
     | 
    
         
            +
                # Detect data processing and privacy clauses
         
     | 
| 
      
 29 
     | 
    
         
            +
                # @return [Array<Hash>] Data processing clauses
         
     | 
| 
      
 30 
     | 
    
         
            +
                def detect_data_processing_clauses
         
     | 
| 
      
 31 
     | 
    
         
            +
                  patterns = [
         
     | 
| 
      
 32 
     | 
    
         
            +
                    /data\s+processing/,
         
     | 
| 
      
 33 
     | 
    
         
            +
                    /personal\s+data/,
         
     | 
| 
      
 34 
     | 
    
         
            +
                    /gdpr/,
         
     | 
| 
      
 35 
     | 
    
         
            +
                    /kvkk/,
         
     | 
| 
      
 36 
     | 
    
         
            +
                    /data\s+protection/,
         
     | 
| 
      
 37 
     | 
    
         
            +
                    /privacy\s+policy/,
         
     | 
| 
      
 38 
     | 
    
         
            +
                    /data\s+subject/,
         
     | 
| 
      
 39 
     | 
    
         
            +
                    /data\s+controller/,
         
     | 
| 
      
 40 
     | 
    
         
            +
                    /data\s+processor/
         
     | 
| 
      
 41 
     | 
    
         
            +
                  ]
         
     | 
| 
      
 42 
     | 
    
         
            +
             
     | 
| 
      
 43 
     | 
    
         
            +
                  find_clauses_by_patterns(patterns, "Data Processing")
         
     | 
| 
      
 44 
     | 
    
         
            +
                end
         
     | 
| 
      
 45 
     | 
    
         
            +
             
     | 
| 
      
 46 
     | 
    
         
            +
                # Detect liability and indemnification clauses
         
     | 
| 
      
 47 
     | 
    
         
            +
                # @return [Array<Hash>] Liability clauses
         
     | 
| 
      
 48 
     | 
    
         
            +
                def detect_liability_clauses
         
     | 
| 
      
 49 
     | 
    
         
            +
                  patterns = [
         
     | 
| 
      
 50 
     | 
    
         
            +
                    /liabilit/,
         
     | 
| 
      
 51 
     | 
    
         
            +
                    /liable/,
         
     | 
| 
      
 52 
     | 
    
         
            +
                    /indemnif/,
         
     | 
| 
      
 53 
     | 
    
         
            +
                    /damages/,
         
     | 
| 
      
 54 
     | 
    
         
            +
                    /limitation\s+of\s+liability/,
         
     | 
| 
      
 55 
     | 
    
         
            +
                    /exclude.*liability/,
         
     | 
| 
      
 56 
     | 
    
         
            +
                    /consequential\s+damages/,
         
     | 
| 
      
 57 
     | 
    
         
            +
                    /indirect\s+damages/
         
     | 
| 
      
 58 
     | 
    
         
            +
                  ]
         
     | 
| 
      
 59 
     | 
    
         
            +
             
     | 
| 
      
 60 
     | 
    
         
            +
                  find_clauses_by_patterns(patterns, "Liability")
         
     | 
| 
      
 61 
     | 
    
         
            +
                end
         
     | 
| 
      
 62 
     | 
    
         
            +
             
     | 
| 
      
 63 
     | 
    
         
            +
                # Detect confidentiality and non-disclosure clauses
         
     | 
| 
      
 64 
     | 
    
         
            +
                # @return [Array<Hash>] Confidentiality clauses
         
     | 
| 
      
 65 
     | 
    
         
            +
                def detect_confidentiality_clauses
         
     | 
| 
      
 66 
     | 
    
         
            +
                  patterns = [
         
     | 
| 
      
 67 
     | 
    
         
            +
                    /confidential/,
         
     | 
| 
      
 68 
     | 
    
         
            +
                    /non.?disclosure/,
         
     | 
| 
      
 69 
     | 
    
         
            +
                    /proprietary\s+information/,
         
     | 
| 
      
 70 
     | 
    
         
            +
                    /trade\s+secret/,
         
     | 
| 
      
 71 
     | 
    
         
            +
                    /confidentiality\s+agreement/,
         
     | 
| 
      
 72 
     | 
    
         
            +
                    /\bnda\b/
         
     | 
| 
      
 73 
     | 
    
         
            +
                  ]
         
     | 
| 
      
 74 
     | 
    
         
            +
             
     | 
| 
      
 75 
     | 
    
         
            +
                  find_clauses_by_patterns(patterns, "Confidentiality")
         
     | 
| 
      
 76 
     | 
    
         
            +
                end
         
     | 
| 
      
 77 
     | 
    
         
            +
             
     | 
| 
      
 78 
     | 
    
         
            +
                # Detect termination clauses
         
     | 
| 
      
 79 
     | 
    
         
            +
                # @return [Array<Hash>] Termination clauses
         
     | 
| 
      
 80 
     | 
    
         
            +
                def detect_termination_clauses
         
     | 
| 
      
 81 
     | 
    
         
            +
                  patterns = [
         
     | 
| 
      
 82 
     | 
    
         
            +
                    /terminat/,
         
     | 
| 
      
 83 
     | 
    
         
            +
                    /end\s+this\s+agreement/,
         
     | 
| 
      
 84 
     | 
    
         
            +
                    /breach.*agreement/,
         
     | 
| 
      
 85 
     | 
    
         
            +
                    /notice\s+of\s+termination/,
         
     | 
| 
      
 86 
     | 
    
         
            +
                    /expir/,
         
     | 
| 
      
 87 
     | 
    
         
            +
                    /cancel/
         
     | 
| 
      
 88 
     | 
    
         
            +
                  ]
         
     | 
| 
      
 89 
     | 
    
         
            +
             
     | 
| 
      
 90 
     | 
    
         
            +
                  find_clauses_by_patterns(patterns, "Termination")
         
     | 
| 
      
 91 
     | 
    
         
            +
                end
         
     | 
| 
      
 92 
     | 
    
         
            +
             
     | 
| 
      
 93 
     | 
    
         
            +
                # Detect payment and fee clauses
         
     | 
| 
      
 94 
     | 
    
         
            +
                # @return [Array<Hash>] Payment clauses
         
     | 
| 
      
 95 
     | 
    
         
            +
                def detect_payment_clauses
         
     | 
| 
      
 96 
     | 
    
         
            +
                  patterns = [
         
     | 
| 
      
 97 
     | 
    
         
            +
                    /payment/,
         
     | 
| 
      
 98 
     | 
    
         
            +
                    /\bfees?\b/,
         
     | 
| 
      
 99 
     | 
    
         
            +
                    /\$[\d,]+/,
         
     | 
| 
      
 100 
     | 
    
         
            +
                    /invoice/,
         
     | 
| 
      
 101 
     | 
    
         
            +
                    /billing/,
         
     | 
| 
      
 102 
     | 
    
         
            +
                    /compensation/,
         
     | 
| 
      
 103 
     | 
    
         
            +
                    /remuneration/,
         
     | 
| 
      
 104 
     | 
    
         
            +
                    /salary/,
         
     | 
| 
      
 105 
     | 
    
         
            +
                    /wage/
         
     | 
| 
      
 106 
     | 
    
         
            +
                  ]
         
     | 
| 
      
 107 
     | 
    
         
            +
             
     | 
| 
      
 108 
     | 
    
         
            +
                  find_clauses_by_patterns(patterns, "Payment")
         
     | 
| 
      
 109 
     | 
    
         
            +
                end
         
     | 
| 
      
+
+    # Detect intellectual property clauses
+    # @return [Array<Hash>] IP clauses
+    def detect_ip_clauses
+      patterns = [
+        /intellectual\s+property/,
+        /copyright/,
+        /trademark/,
+        /patent/,
+        /trade\s+mark/,
+        /proprietary\s+rights/,
+        /ownership/,
+        /license/,
+        /licensing/
+      ]
+
+      find_clauses_by_patterns(patterns, "Intellectual Property")
+    end
+
+    # Detect dispute resolution clauses
+    # @return [Array<Hash>] Dispute resolution clauses
+    def detect_dispute_resolution_clauses
+      patterns = [
+        /dispute/,
+        /arbitration/,
+        /mediation/,
+        /litigation/,
+        /court/,
+        /jurisdiction/,
+        /resolution\s+of\s+disputes/,
+        /legal\s+proceedings/
+      ]
+
+      find_clauses_by_patterns(patterns, "Dispute Resolution")
+    end
+
+    # Detect governing law clauses
+    # @return [Array<Hash>] Governing law clauses
+    def detect_governing_law_clauses
+      patterns = [
+        /governing\s+law/,
+        /applicable\s+law/,
+        /laws?\s+of/,
+        /jurisdiction/,
+        /governed\s+by/,
+        /subject\s+to.*law/
+      ]
+
+      find_clauses_by_patterns(patterns, "Governing Law")
+    end
+
+    # Find clauses matching given patterns
+    # @param patterns [Array<Regexp>] Regex patterns to match
+    # @param clause_type [String] Type of clause being detected
+    # @return [Array<Hash>] Found clauses
+    def find_clauses_by_patterns(patterns, clause_type)
+      clauses = []
+      sentences = extract_sentences
+
+      sentences.each_with_index do |sentence, index|
+        patterns.each do |pattern|
+          if sentence.match?(pattern)
+            clauses << {
+              type: clause_type,
+              content: sentence.strip,
+              position: index + 1,
+              keywords: extract_keywords(sentence, pattern)
+            }
+            break # Don't match multiple patterns for the same sentence
+          end
+        end
+      end
+
+      clauses.uniq { |clause| clause[:content] }
+    end
+
+    # Extract sentences from text
+    # @return [Array<String>] Array of sentences
+    def extract_sentences
+      # Split on sentence boundaries
+      sentences = text.split(/(?<=[.!?])\s+/)
+
+      # Filter out very short sentences
+      sentences.select { |s| s.length > 20 }
+               .map { |s| s.strip.gsub(/\s+/, ' ') }
+    end
+
+    # Extract relevant keywords from a sentence based on pattern
+    # @param sentence [String] The sentence
+    # @param pattern [Regexp] The matching pattern
+    # @return [Array<String>] Extracted keywords
+    def extract_keywords(sentence, pattern)
+      matches = sentence.scan(pattern).flatten
+      matches.map(&:strip).reject(&:empty?)
+    end
+  end
+end
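The detection flow in `find_clauses_by_patterns`, `extract_sentences`, and `extract_keywords` can be tried outside the gem with a minimal stand-alone sketch. The method names `split_sentences` and `find_clauses`, the sample text, and the trimmed pattern list are illustrative assumptions rather than part of the package, and `Array#find` stands in for the `each`/`break` loop (both stop at the first matching pattern). The sample text is deliberately lowercase: the patterns carry no `i` flag, and this part of the diff does not show how `text` is prepared.

```ruby
# Hypothetical stand-alone sketch of the detection flow shown in the diff.
# `split_sentences` and `find_clauses` are illustrative names, not gem API.

# Split on whitespace that follows ., ! or ?, drop fragments of 20 characters
# or fewer, then collapse internal whitespace.
def split_sentences(text)
  text.split(/(?<=[.!?])\s+/)
      .select { |s| s.length > 20 }
      .map { |s| s.strip.gsub(/\s+/, " ") }
end

# Record one clause per sentence, using the first pattern that matches.
def find_clauses(text, patterns, clause_type)
  clauses = []
  split_sentences(text).each_with_index do |sentence, index|
    pattern = patterns.find { |candidate| sentence.match?(candidate) }
    next unless pattern

    clauses << {
      type: clause_type,
      content: sentence,
      position: index + 1, # 1-based index among the retained sentences
      keywords: sentence.scan(pattern).flatten.map(&:strip).reject(&:empty?)
    }
  end
  # Identical sentences are reported only once.
  clauses.uniq { |clause| clause[:content] }
end

# Assumed sample input; lowercase because the patterns are case-sensitive.
SAMPLE = "either party may cancel this agreement with notice. " \
         "Ok. " \
         "the contract will expire after twelve months."
PATTERNS = [/terminat/, /expir/, /cancel/].freeze

RESULT = find_clauses(SAMPLE, PATTERNS, "Termination")
RESULT.each do |clause|
  puts "#{clause[:position]}: #{clause[:keywords].inspect} #{clause[:content]}"
end
```

Two behaviours are worth noting when reading the results. The short fragment "Ok." is dropped before positions are assigned, so `position` counts only the retained sentences, not every sentence in the original text. The `keywords` array holds the matched fragment itself (for example `"expir"`), not the full word that contains it.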