yara-ffi 4.0.0 → 4.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: fbff0f40a5c38903311cc68831538ee87fd1e88089188187560c25b401886ac4
4
- data.tar.gz: c554613d1be85c26e2e7ddc87ac9509a60f7b5b25296de64017b7b067ef13c4f
3
+ metadata.gz: f964b7eb475719ea6ceb456125b6ead06be8d9263a8839267ee96cdbe072b470
4
+ data.tar.gz: 36cc2f79374b0d00f245329f5d3e18423b6f16740e92f893c07d662f2183ce67
5
5
  SHA512:
6
- metadata.gz: f09c36c15253e02a2267373561064434b77fc9dcacdd445a1ec28357fa4e01cc09940496f5b645b1dbd920bc2f6c822d8d8215361ff71c8bebe73f42e01897ce
7
- data.tar.gz: 52e22658f2b1eb0f9adddc3642db98f715ed0db7c2d8291808547fa4be86e9f088998e3785204fe1655f2935d5e3dff62588f285120c9375a1603da36458193b
6
+ metadata.gz: 5636bffbece91a2b3d48568e7bcac3e75d0f7713231c1a0abfbcbf83a0436b1d950ca5dccb5dae98b82c6ab3686521f61b561a2802f1c26145f887ea64fccf37
7
+ data.tar.gz: b10394734d3c00d986a5f12a6a2682087a7641e2abb62108728fc699f1e79f395f4bf2c5d94842c440fb15c960690773760907be56b4f14edbe5c86c3589aa3c
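The hunk above only swaps digests, so it can be checked mechanically. Below is a minimal sketch, assuming the layout shown (algorithm name, then file name, then hex digest). `verify_checksums` is a hypothetical helper, and the member contents are placeholder strings rather than the real `metadata.gz` and `data.tar.gz` bytes.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: compares the hex digests recorded in a checksums.yaml
# document against digests computed from in-memory file contents.
def verify_checksums(checksums_yaml, members)
  recorded = YAML.safe_load(checksums_yaml)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }

  algorithms.flat_map do |name, digest_class|
    recorded.fetch(name, {}).map do |file, expected|
      actual = digest_class.hexdigest(members.fetch(file))
      [name, file, actual == expected]
    end
  end
end

# Placeholder contents standing in for the real archive members.
members = { "metadata.gz" => "meta-bytes", "data.tar.gz" => "data-bytes" }

# Build a checksums document with the same shape as the one in the diff.
checksums_yaml = {
  "SHA256" => members.transform_values { |bytes| Digest::SHA256.hexdigest(bytes) },
  "SHA512" => members.transform_values { |bytes| Digest::SHA512.hexdigest(bytes) }
}.to_yaml

report = verify_checksums(checksums_yaml, members)
report.each { |algo, file, ok| puts "#{algo} #{file}: #{ok ? 'ok' : 'MISMATCH'}" }
```

For a real gem, the member contents would come from reading the unpacked `.gem` archive instead of the placeholder hash.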
data/.github/copilot-instructions.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # yara-ffi AI Coding Instructions
2
2
 
3
- This Ruby gem provides FFI bindings to YARA-X (Rust-based YARA implementation) for malware/pattern detection.
3
+ This Ruby gem provides FFI bindings to YARA-X (Rust-based YARA implementation) for malware/pattern detection with advanced pattern matching analysis, rule compilation, serialization, and metadata support.
4
4
 
5
5
  ## Quick Development Guide
6
6
 
@@ -10,13 +10,84 @@ This Ruby gem provides FFI bindings to YARA-X (Rust-based YARA implementation) f
10
10
  3. Scanner lifecycle: `add_rule()` → `compile()` → `scan()` → `close()`
11
11
  4. Always use resource-safe patterns: `Scanner.open { |s| ... }` or manual `close()`
12
12
  5. Interactive testing: `docker run -it --mount type=bind,src="$(pwd)",dst=/app yara-ffi bin/console`
13
+ 6. **Documentation**: See `USAGE.md` for comprehensive examples and patterns
13
14
 
14
15
  ## Core Components (Read These Files First)
15
16
 
16
- - `lib/yara/scanner.rb`: Main API - compile-then-scan workflow, resource management
17
+ - `lib/yara/scanner.rb`: Main API - compile-then-scan workflow, resource management, global variables
18
+ - `lib/yara/compiler.rb`: Advanced rule compilation with globals, error diagnostics, serialization
19
+ - `lib/yara/scan_result.rb`: Enhanced result parsing with pattern matches, metadata, tags, namespaces
20
+ - `lib/yara/pattern_match.rb`: Detailed pattern match information with offsets and data extraction
17
21
  - `lib/yara/ffi.rb`: Raw FFI bindings with error codes (`YRX_SUCCESS = 0`)
18
- - `lib/yara/scan_result.rb`: Result parsing (temporary regex-based metadata extraction)
19
- - Tests in `test/scanner_test.rb`: Working examples of all patterns
22
+ - Tests in `test/`: Comprehensive test coverage for all features
23
+   - `test/scanner_test.rb`: Basic scanner patterns
24
+   - `test/scanner_pattern_match_test.rb`: Pattern matching analysis
25
+   - `test/compiler_test.rb`: Advanced compilation features
26
+   - `test/serialize_test.rb`: Rule serialization/deserialization
27
+   - `test/metadata_test.rb`: Metadata extraction
28
+   - `test/tags_test.rb`: Tag support
29
+   - `test/namespace_test.rb`: Namespace functionality
30
+
31
+ ## Key Features & APIs
32
+
33
+ ### Pattern Matching Analysis (NEW)
34
+ ```ruby
35
+ # Detailed pattern match information
36
+ results = scanner.scan(data)
37
+ result = results.first
38
+
39
+ # Access specific pattern matches
40
+ matches = result.matches_for_pattern(:$suspicious)
41
+ matches.each do |match|
42
+   puts "At offset #{match.offset}: #{match.matched_data(data)}"
43
+ end
44
+
45
+ # Pattern match convenience methods
46
+ result.pattern_matched?(:$api_call) # => true/false
47
+ result.total_matches # => 5
48
+ result.all_matches # => [PatternMatch, ...]
49
+ ```
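The offset and `matched_data(data)` calls above suggest that a match stores a position and length and slices the scanned buffer on demand. Here is a minimal sketch of that idea, without loading YARA-X. `SketchMatch` is a hypothetical stand-in, not the gem's `Yara::PatternMatch`, and the literal substring search merely simulates a pattern engine.

```ruby
# Stand-in for illustration only; not the gem's Yara::PatternMatch class.
SketchMatch = Struct.new(:offset, :length) do
  # Slice by bytes so offsets stay meaningful for binary input.
  def matched_data(data)
    data.byteslice(offset, length)
  end

  # First byte position after the match.
  def end_offset
    offset + length
  end
end

# ASCII-only sample, so character and byte offsets coincide.
data = "one day we were here and then we were not"
needle = "we were"

# Collect every occurrence of the needle as an offset/length pair.
matches = []
position = 0
while (found = data.index(needle, position))
  matches << SketchMatch.new(found, needle.bytesize)
  position = found + 1
end

matches.each do |match|
  puts "At offset #{match.offset}: #{match.matched_data(data)}"
end
```

Keeping only offsets and lengths avoids copying matched bytes until a caller asks for them, which may be why the documented `matched_data` takes the original data as an argument.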
50
+
51
+ ### Advanced Rule Compilation (NEW)
52
+ ```ruby
53
+ # Use Compiler for complex scenarios
54
+ compiler = Yara::Compiler.new
55
+ compiler.define_global_str("ENV", "production")
56
+ compiler.define_global_bool("DEBUG", false)
57
+ compiler.add_source(rule1, "rule1.yar")
58
+ compiler.add_source(rule2, "rule2.yar")
59
+
60
+ # Build and serialize
61
+ serialized = compiler.build_serialized
62
+ scanner = Yara::Scanner.from_serialized(serialized)
63
+ ```
64
+
65
+ ### Global Variables (NEW)
66
+ ```ruby
67
+ # Set individual globals
68
+ scanner.set_global_str("ENV", "production")
69
+ scanner.set_global_int("MAX_SIZE", 1000)
70
+ scanner.set_global_bool("DEBUG", false)
71
+
72
+ # Bulk setting with error handling
73
+ scanner.set_globals({
74
+   "ENV" => "production",
75
+   "RETRIES" => 3
76
+ }, strict: false)
77
+ ```
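The bulk call above takes mixed value types plus a `strict:` flag. One way such a call could map onto the typed setters is sketched below. The dispatch rules and the meaning of `strict` (raise versus skip) are assumptions, and `setter_for` and `plan_globals` are hypothetical helpers that only record the calls they would make.

```ruby
# Hypothetical dispatcher: maps a Ruby value to the name of a typed setter
# (set_global_str / _int / _bool / _float), or nil if the type is unsupported.
def setter_for(value)
  case value
  when String then :set_global_str
  when true, false then :set_global_bool
  when Integer then :set_global_int
  when Float then :set_global_float
  end
end

# Records planned calls instead of touching a real scanner. With strict: true
# an unsupported value raises; with strict: false it is skipped (assumed semantics).
def plan_globals(globals, strict:)
  globals.each_with_object([]) do |(name, value), calls|
    setter = setter_for(value)
    if setter
      calls << [setter, name.to_s, value]
    elsif strict
      raise ArgumentError, "unsupported global type for #{name}: #{value.class}"
    end
  end
end

globals = { "ENV" => "production", "RETRIES" => 3, "DEBUG" => false, "BAD" => nil }
calls = plan_globals(globals, strict: false)
calls.each { |setter, name, value| puts "#{setter}(#{name.inspect}, #{value.inspect})" }
```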
78
+
79
+ ### Metadata & Tags (NEW)
80
+ ```ruby
81
+ # Access metadata with type safety
82
+ result.rule_meta[:author] # Raw access
83
+ result.metadata_string(:author) # Type-safe String
84
+ result.metadata_int(:severity) # Type-safe Integer
85
+
86
+ # Tag support
87
+ result.tags # => ["malware", "trojan"]
88
+ result.has_tag?("malware") # => true
89
+ result.qualified_name # => "namespace.rule_name"
90
+ ```
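The comments above describe `metadata_string` and `metadata_int` as type-safe views over the raw `rule_meta` hash. A minimal sketch of what that could mean follows, assuming an accessor returns the value only when it has the requested type and `nil` otherwise. `SketchMetadata` is a hypothetical stand-in, and the gem's actual behavior on a type mismatch is not confirmed here.

```ruby
# Hypothetical stand-ins for the documented accessors: each returns the value
# only when it has the requested type, and nil otherwise (assumed behavior).
module SketchMetadata
  def self.metadata_string(meta, key)
    value = meta[key]
    value if value.is_a?(String)
  end

  def self.metadata_int(meta, key)
    value = meta[key]
    value if value.is_a?(Integer)
  end

  def self.metadata_bool(meta, key)
    value = meta[key]
    value if value == true || value == false
  end
end

# Shape borrowed from the rule_meta example in the 4.0.0 README (symbol keys).
rule_meta = { string_meta: "an example rule for testing", int_meta: 42, bool_meta: true }

puts SketchMetadata.metadata_string(rule_meta, :string_meta)
puts SketchMetadata.metadata_int(rule_meta, :int_meta)
p SketchMetadata.metadata_int(rule_meta, :string_meta) # wrong type for this key
```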
20
91
 
21
92
  ## Critical FFI Patterns
22
93
 
@@ -113,11 +184,45 @@ scanner.close # Required!
113
184
  ```
114
185
 
115
186
  **Error Handling:** Custom exceptions for different failure modes:
116
- - `CompilationError` - YARA rule syntax issues
117
- - `ScanError` - Runtime scanning failures
118
- - `NotCompiledError` - Scanning before compilation
187
+ - `Scanner::CompilationError` - YARA rule syntax issues
188
+ - `Scanner::ScanError` - Runtime scanning failures
189
+ - `Scanner::NotCompiledError` - Scanning before compilation
190
+ - `Compiler::CompileError` - Compilation errors with structured diagnostics
191
+
192
+ **Enhanced Result Processing:** ScanResult now provides:
193
+ - Structured metadata access via YARA-X API
194
+ - Detailed pattern match information with offsets/lengths
195
+ - Tag extraction and querying
196
+ - Namespace support
197
+ - Pattern match convenience methods
198
+
199
+ ## Performance & Advanced Features
200
+
201
+ **Rule Serialization for Production:**
202
+ ```ruby
203
+ # Compile once, use many times
204
+ compiler = Yara::Compiler.new
205
+ compiler.add_source(ruleset)
206
+ serialized = compiler.build_serialized
207
+
208
+ # Create multiple scanners from same rules
209
+ scanners = 10.times.map { Yara::Scanner.from_serialized(serialized) }
210
+ ```
119
211
 
120
- **Metadata Parsing:** ScanResult parses YARA rule metadata and strings via regex from rule source (temporary solution until YARA-X API improvements).
212
+ **Timeout Configuration:**
213
+ ```ruby
214
+ scanner.set_timeout(10000) # 10 seconds
215
+ ```
216
+
217
+ **Error Diagnostics:**
218
+ ```ruby
219
+ begin
220
+   compiler.build
221
+ rescue Yara::Compiler::CompileError
222
+   errors = compiler.errors_json
223
+   warnings = compiler.warnings_json
224
+ end
225
+ ```
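The error list and the `Scanner.open { |s| ... }` guidance in this file combine into one pattern: guard `scan` against use before `compile`, and release the scanner even when the block raises. Below is a minimal sketch of that pattern. `SketchScanner` is a hypothetical stand-in that does no real YARA-X work, and its toy `scan` only checks for literal substrings.

```ruby
# Stand-in class for illustration; it mimics the documented lifecycle
# (add_rule -> compile -> scan -> close) without loading YARA-X.
class SketchScanner
  class NotCompiledError < StandardError; end

  attr_reader :closed

  # Yields a scanner and always closes it, even if the block raises.
  def self.open(rule = nil)
    scanner = new
    scanner.add_rule(rule) if rule
    return scanner unless block_given?

    begin
      yield scanner
    ensure
      scanner.close
    end
  end

  def initialize
    @rules = []
    @compiled = false
    @closed = false
  end

  def add_rule(rule)
    @rules << rule
  end

  def compile
    @compiled = true
  end

  # Toy scan: reports which rule strings occur literally in the data.
  def scan(data)
    raise NotCompiledError, "call compile before scan" unless @compiled

    @rules.select { |rule| data.include?(rule) }
  end

  def close
    @closed = true
  end
end

leaked = nil
begin
  SketchScanner.open("were here") do |scanner|
    leaked = scanner
    scanner.scan("we were here") # compile was skipped on purpose
  end
rescue SketchScanner::NotCompiledError => e
  puts "rescued: #{e.message}"
end
puts "closed after error: #{leaked.closed}"
```

The `ensure` clause is what makes the block form safer than a manual `close`: cleanup no longer depends on every code path remembering to call it.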
121
226
 
122
227
  ## Adding New FFI Functions
123
228
 
@@ -138,8 +243,21 @@ end
138
243
  - `yrx_compile(src, rules_ptr)` - Compile rules from string
139
244
  - `yrx_scanner_create(rules, scanner_ptr)` - Create scanner from compiled rules
140
245
  - `yrx_scanner_scan(scanner, data, len)` - Scan data
246
+ - `yrx_scanner_set_global_*()` - Set global variables on scanner
247
+ - `yrx_scanner_set_timeout()` - Configure scan timeout
248
+ - `yrx_compiler_*()` - Advanced compilation functions
249
+ - `yrx_rules_serialize/deserialize()` - Rule serialization
250
+ - `yrx_rule_iter_*()` - Iterate rule components (patterns, metadata, tags)
251
+ - `yrx_pattern_iter_matches()` - Extract pattern match details
141
252
  - `yrx_last_error()` - Get last error message
142
- - Cleanup: `yrx_rules_destroy()`, `yrx_scanner_destroy()`
253
+ - Cleanup: `yrx_rules_destroy()`, `yrx_scanner_destroy()`, `yrx_compiler_destroy()`
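The instructions describe the raw bindings as returning error codes, with `YRX_SUCCESS = 0` and a message available from `yrx_last_error()`. A small wrapper can turn that convention into Ruby exceptions. This is a sketch under stated assumptions: `check_status` and `SketchFFIError` are hypothetical names, the lambdas stand in for real FFI functions, and the non-zero code `1` is arbitrary.

```ruby
YRX_SUCCESS = 0 # value stated in the instructions above

class SketchFFIError < StandardError; end

# Hypothetical wrapper: run an FFI-style call that returns a status code and
# raise with the last error message when the code is not YRX_SUCCESS.
def check_status(last_error)
  code = yield
  return code if code == YRX_SUCCESS

  raise SketchFFIError, "YARA-X call failed (code #{code}): #{last_error.call}"
end

# Stand-ins for real bindings such as yrx_compile and yrx_last_error.
fake_last_error = -> { "syntax error in rule" }
fake_compile = ->(src) { src.include?("condition:") ? YRX_SUCCESS : 1 }

# A successful call passes straight through.
check_status(fake_last_error) { fake_compile.call("rule r { condition: true }") }

# A failing call is converted into an exception carrying the message.
begin
  check_status(fake_last_error) { fake_compile.call("rule broken {") }
rescue SketchFFIError => e
  puts e.message
end
```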
254
+
255
+ ## Documentation Structure
256
+
257
+ - `README.md`: Project overview, installation, minimal usage example
258
+ - `USAGE.md`: Comprehensive usage guide with quick reference + detailed examples
259
+ - `DEVELOPMENT.md`: Development setup and contribution workflow
260
+ - `.github/copilot-instructions.md`: This file - AI coding guidance
143
261
 
144
262
  ## Dependencies & Constraints
145
263
 
data/CHANGELOG.md CHANGED
@@ -1,5 +1,36 @@
1
1
  ## [Unreleased]
2
2
 
3
+ ## [4.1.0] - 2025-08-20
4
+
5
+ - **NEW**: Added advanced `Yara::Compiler` API for complex rule compilation scenarios
6
+   - `Compiler.new` - Create a new compiler instance
7
+   - `Compiler#define_global_*` methods for setting globals before compilation
8
+   - `Compiler#add_source` for adding rules from multiple sources
9
+   - `Compiler#build` and `Compiler#build_serialized` for creating compiled rules
10
+   - `Compiler#errors_json` and `Compiler#warnings_json` for detailed diagnostics
11
+ - **NEW**: Added rule serialization and deserialization support
12
+   - `Scanner.from_serialized` - Create scanner from serialized rules
13
+   - `Scanner.from_rules` - Create scanner from pre-compiled rules
14
+   - Enables compile-once, use-many-times pattern for production deployments
15
+ - **NEW**: Enhanced pattern matching analysis with `Yara::PatternMatch`
16
+   - Detailed pattern match information with offsets and lengths
17
+   - `PatternMatch#offset`, `PatternMatch#length`, `PatternMatch#matched_data`
18
+   - `ScanResult#matches_for_pattern` - Get matches for specific patterns
19
+   - `ScanResult#pattern_matched?` - Check if specific pattern matched
20
+   - `ScanResult#total_matches` and `ScanResult#all_matches` for match analysis
21
+ - **NEW**: Added comprehensive metadata and tag support
22
+   - Type-safe metadata accessors: `metadata_string`, `metadata_int`, `metadata_bool`
23
+   - `ScanResult#tags` - Access rule tags as array
24
+   - `ScanResult#has_tag?` - Check for specific tags
25
+   - `ScanResult#qualified_name` - Get namespaced rule name
26
+ - **NEW**: Added global variable support for scanners
27
+   - `Scanner#set_global_str`, `Scanner#set_global_int`, `Scanner#set_global_bool`, `Scanner#set_global_float`
28
+   - `Scanner#set_globals` - Bulk setting with error handling options
29
+   - Enable dynamic rule behavior based on runtime variables
30
+ - **NEW**: Added scanner timeout configuration via `Scanner#set_timeout`
31
+ - **IMPROVED**: Enhanced documentation with comprehensive usage examples in `USAGE.md`
32
+ - **IMPROVED**: Updated development documentation and AI coding instructions
33
+
3
34
  ## [4.0.0] - 2025-08-19
4
35
 
5
36
  - **BREAKING**: Migrated from legacy libyara FFI bindings to YARA-X C API (`libyara_x_capi.so`) ([#24](https://github.com/jonmagic/yara-ffi/pull/24))
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- yara-ffi (4.0.0)
4
+ yara-ffi (4.1.0)
5
5
  ffi
6
6
 
7
7
  GEM
data/README.md CHANGED
@@ -7,6 +7,22 @@ A Ruby library for using [YARA-X](https://virustotal.github.io/yara-x/) via FFI
7
7
  - Ruby 3.0 or later
8
8
  - YARA-X C API library (`libyara_x_capi`) installed on your system
9
9
 
10
+ ## Major Features
11
+
12
+ **🔍 Pattern Matching Analysis**: Extract detailed pattern match information with exact offsets, lengths, and matched data - perfect for forensic analysis.
13
+
14
+ **🛠️ Advanced Rule Compilation**: Use the `Yara::Compiler` class for complex scenarios with global variables, structured error reporting, and multiple rule sources.
15
+
16
+ **💾 Rule Serialization**: Compile rules once, serialize for persistence or transport, then deserialize for instant scanning - eliminating compilation overhead.
17
+
18
+ **🏷️ Metadata & Tags**: Full access to rule metadata with type safety and tag-based rule categorization and filtering.
19
+
20
+ **🌐 Global Variables**: Set string, boolean, integer, and float globals at runtime to customize rule behavior dynamically.
21
+
22
+ **📁 Namespace Support**: Organize rules logically, avoid naming conflicts, and access qualified rule names in large rule sets.
23
+
24
+ **⚡ Performance**: Configurable scan timeouts, efficient resource management with automatic cleanup, and parallel scanning support.
25
+
10
26
  ## Installation
11
27
 
12
28
  Add this line to your application's Gemfile:
@@ -23,138 +39,31 @@ Or install it yourself as:
23
39
 
24
40
  $ gem install yara-ffi
25
41
 
26
- ## Usage
27
-
28
- ### Quick scanning with convenience methods
29
-
30
- ```ruby
31
- rule = <<-RULE
32
- rule ExampleRule
33
- {
34
- meta:
35
- description = "Example rule for testing"
36
-
37
- strings:
38
- $text_string = "we were here"
39
- $text_regex = /were here/
40
-
41
- condition:
42
- $text_string or $text_regex
43
- }
44
- RULE
45
-
46
- # Test a rule against data
47
- results = Yara.test(rule, "one day we were here and then we were not")
48
- puts results.first.match? # => true
49
- puts results.first.rule_name # => "ExampleRule"
50
-
51
- # Scan with a block for processing results
52
- Yara.scan(rule, "sample data") do |result|
53
- puts "Matched: #{result.rule_name}"
54
- end
55
- ```
56
-
57
- ### Manual scanner usage
42
+ ## Quick Start
58
43
 
59
44
  ```ruby
60
- rule = <<-RULE
61
- rule ExampleRule
62
- {
63
- meta:
64
- string_meta = "an example rule for testing"
65
- int_meta = 42
66
- bool_meta = true
67
-
68
- strings:
69
- $my_text_string = "we were here"
70
- $my_text_regex = /were here/
71
-
72
- condition:
73
- $my_text_string or $my_text_regex
74
- }
75
- RULE
76
-
77
- scanner = Yara::Scanner.new
78
- scanner.add_rule(rule)
79
- scanner.compile
80
-
81
- results = scanner.scan("one day we were here and then we were not")
82
- result = results.first
83
-
84
- puts result.match? # => true
85
- puts result.rule_name # => "ExampleRule"
86
- puts result.rule_meta # => {:string_meta=>"an example rule for testing", :int_meta=>42, :bool_meta=>true}
87
- puts result.rule_strings # => {:"$my_text_string"=>"we were here", :"$my_text_regex"=>"were here"}
88
-
89
- scanner.close # Clean up resources when done
90
- ```
45
+ require 'yara'
91
46
 
92
- ### Block-based scanner usage
47
+ # Simple test
48
+ results = Yara.test(rule_string, data)
49
+ puts "Matched: #{results.first.rule_name}" if results.first&.match?
93
50
 
94
- ```ruby
95
- # Automatically handles resource cleanup
51
+ # Resource-managed scanning
96
52
  Yara::Scanner.open(rule) do |scanner|
97
53
    scanner.compile
98
-   results = scanner.scan("test data")
99
-   # scanner is automatically closed when block exits
54
+   results = scanner.scan(data)
100
55
  end
101
56
  ```
102
57
 
103
- ### Multiple rules
104
-
105
- ```ruby
106
- rule1 = <<-RULE
107
- rule RuleOne
108
- {
109
- strings:
110
- $a = "pattern one"
111
- condition:
112
- $a
113
- }
114
- RULE
115
-
116
- rule2 = <<-RULE
117
- rule RuleTwo
118
- {
119
- strings:
120
- $b = "pattern two"
121
- condition:
122
- $b
123
- }
124
- RULE
125
-
126
- scanner = Yara::Scanner.new
127
- scanner.add_rule(rule1)
128
- scanner.add_rule(rule2)
129
- scanner.compile
130
-
131
- results = scanner.scan("text with pattern one and pattern two")
132
- puts results.map(&:rule_name) # => ["RuleOne", "RuleTwo"]
133
- scanner.close
134
- ```
135
-
136
- ## API Reference
137
-
138
- ### Yara module methods
58
+ **📖 For comprehensive usage examples, advanced features, and API documentation, see [USAGE.md](USAGE.md).**
139
59
 
140
- - `Yara.test(rule_string, data)` - Quick test of a rule against data, returns array of ScanResult objects
141
- - `Yara.scan(rule_string, data, &block)` - Scan data with rule, optionally yielding each result to block
60
+ ## API Overview
142
61
 
143
- ### Scanner class
62
+ **Core Classes**: `Yara`, `Yara::Scanner`, `Yara::Compiler`, `Yara::ScanResult`, `Yara::ScanResults`, `Yara::PatternMatch`
144
63
 
145
- - `Scanner.new` - Create a new scanner instance
146
- - `Scanner.open(rule_string, namespace: nil, &block)` - Create scanner with optional rule and namespace, auto-cleanup with block
147
- - `#add_rule(rule_string, namespace: nil)` - Add a YARA rule to the scanner
148
- - `#compile` - Compile all added rules (required before scanning)
149
- - `#scan(data, &block)` - Scan data and return ScanResults, or yield each result to block
150
- - `#close` - Free scanner resources
64
+ **Key Methods**: `Yara.test()`, `Yara.scan()`, `Scanner.open()`, `Scanner#scan()`, `ScanResult#pattern_matches`
151
65
 
152
- ### ScanResult class
153
-
154
- - `#match?` - Returns true if rule matched
155
- - `#rule_name` - Name of the matched rule
156
- - `#rule_meta` - Hash of rule metadata (keys are symbols)
157
- - `#rule_strings` - Hash of rule strings (keys are symbols with $ prefix)
66
+ **📖 For detailed API documentation, examples, and advanced usage patterns, see [USAGE.md](USAGE.md).**
158
67
 
159
68
  ## Installing YARA-X
160
69
 
@@ -170,12 +79,8 @@ See [DEVELOPMENT.md](DEVELOPMENT.md) for detailed development setup instructions
170
79
 
171
80
  ## Contributing
172
81
 
173
- Bug reports and pull requests are welcome on GitHub at https://github.com/jonmagic/yara-ffi. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/jonmagic/yara-ffi/blob/main/CODE_OF_CONDUCT.md).
82
+ Bug reports and pull requests are welcome on GitHub at https://github.com/jonmagic/yara-ffi. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](CODE_OF_CONDUCT.md).
174
83
 
175
84
  ## License
176
85
 
177
- The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
178
-
179
- ## Code of Conduct
180
-
181
- Everyone interacting in the yara-ffi project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/jonmagic/yara-ffi/blob/main/CODE_OF_CONDUCT.md).
86
+ The gem is available as open source under the terms of the [MIT License](LICENSE.txt).