xlg 0.4.2 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 487577ad7b07e1220696b70de189c3c098a39547c84dc8db0204b5420c4ae0af
- data.tar.gz: 8e58a4ba07793fd06036bbeb3c03069938dab6fe5f3eb0aa21dc38a7d94912bf
+ metadata.gz: 33bc5a5ea33c1b5b7a68bb4d7a7061c03323eb1be97cbb68c2b156361ec896d0
+ data.tar.gz: 5fd055f3b0c8bdd53d83400e4b1818e97ac84380abc11bcc894f8b477f3e5bb4
  SHA512:
- metadata.gz: 54c70a935c0b50d11d63e70f9d0203172afd17022f8bd058a2ddb8cc8e66de14524a561881d8070ca301d93aede144305c909c9a0fb9d389628bc69ee1dfeeff
- data.tar.gz: da330328bb724784d21f0d1e59322f4eec091fbe57c48fc55ab03cee9238470867c8b321a2b833894a96cf69c80b830f528bfa800188b44b7744aee66516ca21
+ metadata.gz: dedc5f0004b795af9b98782e350c9e7dc1ccaee8bdd094950dad05b2bab945c2e12d73833bf53506ae24fab55bb9b23a1934424e604d416921417f1a39441dbd
+ data.tar.gz: 5097f1343ad1a3bb517bd95267034598b120ca74da92e896b1b1338b78c03e2b2a0099ece782811a079bccf331675454c9a1b92adf7db18d103b6b97c9595cad
data/README.md CHANGED
@@ -5,6 +5,7 @@
  ## Features
  
  - Search text in Excel files (.xlsx, .xls, .xlsm)
+ - **Regular expression support** for advanced pattern matching
  - Support for single file, multiple files, and directory-based searches
  - Case-insensitive keyword matching
  - Grep-like output format: `filename:sheet:cell_ref:matched_text`
@@ -41,19 +42,49 @@ xlg 'keyword' file1.xlsx file2.xlsx
  xlg 'keyword' /path/to/directory/
  ```
  
+ ### Regular Expression Support
+
+ `xlg` supports Ruby-style regular expressions for advanced pattern matching:
+
+ ```bash
+ # Search for date patterns (YYYY-MM-DD format)
+ xlg '\d{4}-\d{2}-\d{2}' data.xlsx
+
+ # Search for email addresses
+ xlg '[a-zA-Z0-9]+@[a-zA-Z0-9]+\.[a-zA-Z]+' contacts.xlsx
+
+ # Search for strings starting with capital letters
+ xlg '^[A-Z].*' documents.xlsx
+
+ # Search for "test" OR "demo"
+ xlg 'test|demo' samples.xlsx
+
+ # Case-insensitive regex using flags
+ xlg '/hello/i' greetings.xlsx
+
+ # Search for phone numbers
+ xlg '\d{3}-\d{3}-\d{4}' phonebook.xlsx
+ ```
+
  ### Command Line Options
  
  ```
- xlg KEYWORD FILE.xlsx # Single file search
- xlg KEYWORD FILE1.xlsx FILE2.xlsx # Multiple file search
- xlg KEYWORD /path/to/directory/ # Directory search
+ xlg PATTERN FILE.xlsx # Single file search
+ xlg PATTERN FILE1.xlsx FILE2.xlsx # Multiple file search
+ xlg PATTERN /path/to/directory/ # Directory search
  
  Options:
  -h, --help Show help message
  ```
  
+ Where `PATTERN` can be:
+ - A simple text string (e.g., `'hello'`)
+ - A regular expression (e.g., `'\d+'` for numbers)
+ - A regex with flags (e.g., `'/pattern/i'` for case-insensitive)
+
  ### Examples
  
+ #### Text Search
  ```bash
  # Search for "test" in sample.xlsx
  xlg 'test' sample.xlsx
@@ -65,6 +96,24 @@ xlg 'データ' file1.xlsx file2.xlsx
  xlg 'キーワード' /home/user/documents/
  ```
  
+ #### Regular Expression Search
+ ```bash
+ # Find all dates in YYYY-MM-DD format
+ xlg '\d{4}-\d{2}-\d{2}' reports.xlsx
+
+ # Find email addresses
+ xlg '\w+@\w+\.\w+' contacts.xlsx
+
+ # Find cells starting with "Total"
+ xlg '^Total' financial.xlsx
+
+ # Find prices in dollar format
+ xlg '\$\d+\.\d{2}' prices.xlsx
+
+ # Case-insensitive search
+ xlg '/error/i' logs.xlsx
+ ```
+
  ## Output Format
  
  The output follows a grep-like format:
@@ -97,7 +146,7 @@ The tool consists of four main components:
  
  - `ExcelGrep` (lib/excel_grep.rb): Core search functionality for single files
  - `MultiFileSearcher` (lib/multi_file_searcher.rb): Handles multiple files and directories
- - `CellMatcher` (lib/cell_matcher.rb): Performs case-insensitive text matching
+ - `CellMatcher` (lib/cell_matcher.rb): Performs text and regular expression matching
  - `OutputFormatter` (lib/output_formatter.rb): Formats output in grep-like style
  
  ## Development
@@ -105,7 +154,13 @@ The tool consists of four main components:
  ### Running Tests
  
  ```bash
- ruby test/test_*.rb
+ # Run individual test files
+ RUBYLIB=lib bundle exec ruby test/test_cell_matcher.rb
+ RUBYLIB=lib bundle exec ruby test/test_regex_integration.rb
+ RUBYLIB=lib bundle exec ruby test/test_edge_cases.rb
+
+ # Or run all tests
+ RUBYLIB=lib bundle exec ruby test/test_*.rb
  ```
  
  ### Project Structure
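
The patterns added to the README are ordinary Ruby regular expressions, so they can be tried in plain Ruby before being run against a workbook. The sketch below is not part of the gem: `SAMPLES`, `first_match`, and the sample cell values are invented for illustration, and compiling with `Regexp::IGNORECASE` assumes the behaviour that `CellMatcher#compile_pattern` applies to patterns written without surrounding slashes.

```ruby
# README patterns paired with invented sample cell values.
SAMPLES = {
  '\d{4}-\d{2}-\d{2}' => 'Report dated 2025-08-12',
  '\w+@\w+\.\w+'      => 'contact: alice@example.com',
  '^Total'            => 'Total revenue',
  '\$\d+\.\d{2}'      => 'Price: $19.99',
  'test|demo'         => 'a demo sheet'
}.freeze

# Hypothetical helper: returns the first matched substring, or nil when nothing matches.
def first_match(pattern, text)
  m = Regexp.new(pattern, Regexp::IGNORECASE).match(text)
  m && m[0]
end

SAMPLES.each do |pattern, text|
  puts "#{pattern} -> #{first_match(pattern, text).inspect}"
end
```

Single-quoted Ruby strings keep the backslashes intact, which mirrors how the shell passes a single-quoted pattern to `xlg`.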
data/bin/xlg CHANGED
@@ -5,14 +5,15 @@ require 'multi_file_searcher'
  
  def show_help
  puts "使用方法:"
- puts " xlg KEYWORD FILE.xlsx # 単一ファイル"
- puts " xlg KEYWORD FILE1.xlsx FILE2.xlsx # 複数ファイル"
- puts " xlg KEYWORD /path/to/directory/ # ディレクトリ内全検索"
+ puts " xlg PATTERN FILE.xlsx # 単一ファイル"
+ puts " xlg PATTERN FILE1.xlsx FILE2.xlsx # 複数ファイル"
+ puts " xlg PATTERN /path/to/directory/ # ディレクトリ内全検索"
  puts ""
  puts "Excel ファイル内の文字列を検索し、grep形式で出力します。"
+ puts "検索パターンには通常の文字列または正規表現を指定できます。"
  puts ""
  puts "引数:"
- puts " KEYWORD 検索するキーワード"
+ puts " PATTERN 検索パターン(文字列または正規表現)"
  puts " FILE.xlsx 検索対象のExcelファイル(複数指定可能)"
  puts " /path/to/dir/ 検索対象のディレクトリ"
  puts ""
@@ -22,9 +23,15 @@ def show_help
  puts " .xlsm (マクロ有効Excel)"
  puts ""
  puts "例:"
- puts " xlg 'test' sample.xlsx"
- puts " xlg 'データ' file1.xlsx file2.xlsx"
- puts " xlg 'キーワード' /home/user/documents/"
+ puts " xlg 'test' sample.xlsx # 通常の文字列検索"
+ puts " xlg 'データ' file1.xlsx file2.xlsx # 複数ファイル検索"
+ puts " xlg 'キーワード' /home/user/documents/ # ディレクトリ検索"
+ puts ""
+ puts "正規表現の例:"
+ puts " xlg '\\d{4}-\\d{2}-\\d{2}' data.xlsx # 日付形式 (YYYY-MM-DD)"
+ puts " xlg '^[A-Z].*' data.xlsx # 大文字で始まる文字列"
+ puts " xlg 'test|demo' data.xlsx # testまたはdemoを含む"
+ puts " xlg '/pattern/i' data.xlsx # 大文字小文字を区別しない"
  puts ""
  puts "オプション:"
  puts " -h, --help このヘルプを表示"
@@ -38,12 +45,12 @@ def main
  
  if ARGV.length < 2
  $stderr.puts "エラー: 引数の数が正しくありません"
- $stderr.puts "使用方法: xlg KEYWORD FILE.xlsx [FILE2.xlsx ...]"
+ $stderr.puts "使用方法: xlg PATTERN FILE.xlsx [FILE2.xlsx ...]"
  $stderr.puts "詳細は 'xlg --help' を参照してください"
  exit 1
  end
  
- keyword = ARGV[0]
+ pattern = ARGV[0]
  paths = ARGV[1..-1]
  
  begin
@@ -52,16 +59,16 @@ def main
  path = paths[0]
  if File.directory?(path)
  # ディレクトリの場合はMultiFileSearcherを使用
- searcher = MultiFileSearcher.new(keyword, [path])
+ searcher = MultiFileSearcher.new(pattern, [path])
  success = searcher.search
  else
  # 単一ファイルの場合はExcelGrepを使用
- grep = ExcelGrep.new(keyword, path)
+ grep = ExcelGrep.new(pattern, path)
  success = grep.search
  end
  else
  # 複数パスの場合はMultiFileSearcherを使用
- searcher = MultiFileSearcher.new(keyword, paths)
+ searcher = MultiFileSearcher.new(pattern, paths)
  success = searcher.search
  end
  
@@ -69,6 +76,10 @@ def main
  rescue ArgumentError => e
  $stderr.puts "エラー: #{e.message}"
  exit 1
+ rescue RegexpError => e
+ $stderr.puts "正規表現エラー: 不正な正規表現パターンです - #{e.message}"
+ $stderr.puts "正規表現の構文については 'xlg --help' を参照してください"
+ exit 1
  rescue => e
  $stderr.puts "予期しないエラーが発生しました: #{e.message}"
  exit 1
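
The new `rescue RegexpError` clause (its message reads roughly "regular expression error: invalid regular expression pattern") is placed before the bare `rescue`. That ordering matters because `RegexpError` is a `StandardError` subclass and Ruby picks the first clause that matches. The sketch below illustrates the ordering only; `handled_by` is a hypothetical helper, not code from `bin/xlg`.

```ruby
# Hypothetical helper: reports which rescue clause handles an error raised by the block.
def handled_by
  yield
  :none
rescue ArgumentError
  :argument_error
rescue RegexpError
  :regexp_error
rescue StandardError
  :unexpected
end

puts handled_by { Regexp.new('[') }           # unterminated character class raises RegexpError
puts handled_by { raise ArgumentError, 'x' }
puts handled_by { raise 'boom' }              # RuntimeError falls through to the generic clause
```

Note that `CellMatcher#compile_pattern` in this same diff rescues `RegexpError` itself and falls back to literal matching, so within the code shown here the new clause looks reachable only if a `RegexpError` is raised somewhere else.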
data/lib/cell_matcher.rb CHANGED
@@ -1,22 +1,93 @@
  class CellMatcher
- def match?(cell_value, keyword)
- return false if cell_value.nil? || keyword.nil? || keyword.empty?
+ def initialize(pattern)
+ raise ArgumentError, "パターンが空です" if pattern.nil? || pattern.empty?
+ @pattern = pattern
+ @compiled_pattern = compile_pattern(pattern)
+ end
+
+ def match?(cell_value)
+ return false if cell_value.nil? || @pattern.nil? || @pattern.empty?
  
  cell_str = cell_value.to_s
  return false if cell_str.empty?
  
- cell_str.downcase.include?(keyword.downcase)
+ if is_regex?(@pattern)
+ if @compiled_pattern.is_a?(Regexp)
+ !(@compiled_pattern =~ cell_str).nil?
+ else
+ # Fallback to literal matching if regex compilation failed
+ cell_str.downcase.include?(@pattern.downcase)
+ end
+ else
+ cell_str.downcase.include?(@pattern.downcase)
+ end
  end
  
- def extract_match(cell_value, keyword)
- return nil unless match?(cell_value, keyword)
+ def extract_match(cell_value)
+ return nil unless match?(cell_value)
  
  cell_str = cell_value.to_s
- keyword_lower = keyword.downcase
  
- start_index = cell_str.downcase.index(keyword_lower)
- return nil if start_index.nil?
-
- cell_str[start_index, keyword.length]
+ if is_regex?(@pattern)
+ if @compiled_pattern.is_a?(Regexp)
+ match_result = @compiled_pattern.match(cell_str)
+ match_result ? match_result[0] : nil
+ else
+ # Fallback to literal matching if regex compilation failed
+ pattern_lower = @pattern.downcase
+ start_index = cell_str.downcase.index(pattern_lower)
+ return nil if start_index.nil?
+
+ cell_str[start_index, @pattern.length]
+ end
+ else
+ pattern_lower = @pattern.downcase
+ start_index = cell_str.downcase.index(pattern_lower)
+ return nil if start_index.nil?
+
+ cell_str[start_index, @pattern.length]
+ end
+ end
+
+ private
+
+ def is_regex?(pattern)
+ # Check if pattern looks like a regex (contains regex metacharacters)
+ pattern.match?(/[.*+?^${}()|\\]/) ||
+ pattern.include?('[') || pattern.include?(']') ||
+ pattern.start_with?('/') && pattern.end_with?('/') ||
+ pattern.start_with?('/') && pattern.match?(/\/[gimxo]*$/)
+ end
+
+ def compile_pattern(pattern)
+ begin
+ if is_regex?(pattern)
+ # Handle /pattern/flags format
+ if pattern.start_with?('/') && pattern.match?(/\/([gimxo]*)$/)
+ # Extract pattern and flags
+ match = pattern.match(/^\/(.*)\/([gimxo]*)$/)
+ if match
+ regex_pattern = match[1]
+ flags = match[2]
+
+ options = 0
+ options |= Regexp::IGNORECASE if flags.include?('i')
+ options |= Regexp::MULTILINE if flags.include?('m')
+ options |= Regexp::EXTENDED if flags.include?('x')
+
+ return Regexp.new(regex_pattern, options)
+ end
+ end
+
+ # Try to compile as-is for patterns with metacharacters
+ Regexp.new(pattern, Regexp::IGNORECASE)
+ else
+ # For plain text, return the pattern as-is
+ pattern
+ end
+ rescue RegexpError => e
+ # If regex compilation fails, treat as plain text
+ pattern
+ end
  end
  end
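
A condensed sketch of the `/pattern/flags` handling in `compile_pattern` may help when reading the hunk above. `parse_pattern` and `FLAG_OPTIONS` are hypothetical names, and the sketch deliberately leaves out the `is_regex?` heuristic and the literal-text fallback. It illustrates one consequence of the diff: a bare pattern is compiled with `Regexp::IGNORECASE`, while a slash-delimited pattern only gets the flags it lists, so `'^[A-Z].*'` passed without slashes would also match cells that start with a lowercase letter.

```ruby
# Hypothetical mapping of the flags that compile_pattern acts on.
# 'g' and 'o' are accepted by the flag regex in the diff but map to no option.
FLAG_OPTIONS = {
  'i' => Regexp::IGNORECASE,
  'm' => Regexp::MULTILINE,
  'x' => Regexp::EXTENDED
}.freeze

# Hypothetical condensed version of the flag handling in CellMatcher#compile_pattern.
def parse_pattern(pattern)
  if (m = pattern.match(%r{\A/(.*)/([gimxo]*)\z}))
    # Explicit /body/flags form: apply only the flags that were written.
    options = m[2].each_char.reduce(0) { |acc, flag| acc | FLAG_OPTIONS.fetch(flag, 0) }
    Regexp.new(m[1], options)
  else
    # Bare pattern: compiled case-insensitively, as in the diff.
    Regexp.new(pattern, Regexp::IGNORECASE)
  end
end

bare    = parse_pattern('^[A-Z].*')
slashed = parse_pattern('/^[A-Z].*/')
flagged = parse_pattern('/hello/i')

puts bare.match?('lowercase start')     # true: IGNORECASE widens [A-Z]
puts slashed.match?('lowercase start')  # false: no flags were given
puts flagged.match?('HELLO world')      # true
```

Under these assumptions, a case-sensitive search needs the slash form without the `i` flag.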
data/lib/excel_grep.rb CHANGED
@@ -3,15 +3,15 @@ require 'cell_matcher'
  require 'output_formatter'
  
  class ExcelGrep
- attr_reader :keyword, :file_path
+ attr_reader :pattern, :file_path
  
- def initialize(keyword, file_path)
- raise ArgumentError, "キーワードが空です" if keyword.nil? || keyword.empty?
+ def initialize(pattern, file_path)
+ raise ArgumentError, "検索パターンが空です" if pattern.nil? || pattern.empty?
  raise ArgumentError, "ファイルパスが空です" if file_path.nil? || file_path.empty?
  
- @keyword = keyword
+ @pattern = pattern
  @file_path = file_path
- @matcher = CellMatcher.new
+ @matcher = CellMatcher.new(pattern)
  @formatter = OutputFormatter.new
  end
  
@@ -45,9 +45,10 @@ class ExcelGrep
  next if cell.nil? || cell.value.nil?
  
  cell_value = cell.value.to_s
- if @matcher.match?(cell_value, @keyword)
+ if @matcher.match?(cell_value)
  cell_ref = RubyXL::Reference.ind2ref(row_index, col_index)
- puts @formatter.format(file_name, sheet_name, cell_ref, cell_value)
+ match_text = @matcher.extract_match(cell_value) || cell_value
+ puts @formatter.format(file_name, sheet_name, cell_ref, match_text)
  end
  end
  end
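
With this hunk, `ExcelGrep` prints the substring returned by `extract_match` instead of the whole cell value, falling back to the cell value when `extract_match` returns `nil`. The sketch below shows the visible effect on one output line. `format_line` is a hypothetical stand-in for `OutputFormatter#format`, whose implementation is not part of this diff; it assumes the colon-separated `filename:sheet:cell_ref:matched_text` layout described in the README. The file, sheet, and cell values are invented.

```ruby
# Hypothetical stand-in for OutputFormatter#format, based on the README's
# `filename:sheet:cell_ref:matched_text` description.
def format_line(file_name, sheet_name, cell_ref, text)
  [file_name, sheet_name, cell_ref, text].join(':')
end

# Simplified regex-only version of extract_match: matched substring or nil.
def extract_match(regex, cell_value)
  m = regex.match(cell_value.to_s)
  m && m[0]
end

cell_value = 'Invoice issued 2025-08-12 to ACME'
regex = /\d{4}-\d{2}-\d{2}/

# 0.4.2 behaviour: the whole cell value is printed.
puts format_line('reports.xlsx', 'Sheet1', 'B2', cell_value)

# 1.0.0 behaviour: only the matched text is printed, with the cell value as fallback.
puts format_line('reports.xlsx', 'Sheet1', 'B2', extract_match(regex, cell_value) || cell_value)
```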
data/lib/multi_file_searcher.rb CHANGED
@@ -1,13 +1,13 @@
  require 'excel_grep'
  
  class MultiFileSearcher
- attr_reader :keyword, :paths
+ attr_reader :pattern, :paths
  
- def initialize(keyword, paths)
- raise ArgumentError, "キーワードが空です" if keyword.nil? || keyword.empty?
+ def initialize(pattern, paths)
+ raise ArgumentError, "検索パターンが空です" if pattern.nil? || pattern.empty?
  raise ArgumentError, "パスが空です" if paths.nil? || paths.empty?
  
- @keyword = keyword
+ @pattern = pattern
  @paths = paths
  end
  
 
@@ -17,7 +17,7 @@ class MultiFileSearcher
17
17
 
18
18
  expanded_files.each do |file_path|
19
19
  begin
20
- grep = ExcelGrep.new(@keyword, file_path)
20
+ grep = ExcelGrep.new(@pattern, file_path)
21
21
  if grep.search
22
22
  success_count += 1
23
23
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: xlg
  version: !ruby/object:Gem::Version
- version: 0.4.2
+ version: 1.0.0
  platform: ruby
  authors:
  - Kazto TAKAHASHI
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2025-08-08 00:00:00.000000000 Z
+ date: 2025-08-12 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rubyXL