hizuke 0.0.3 → 0.0.4
- checksums.yaml +4 -4
- data/CHANGELOG.md +23 -1
- data/Gemfile.lock +1 -1
- data/README.md +88 -2
- data/hizuke.gemspec +36 -0
- data/lib/hizuke/parser.rb +122 -17
- data/lib/hizuke/version.rb +1 -1
- metadata +3 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 6a44032e09036e25847a067394cf50b60f47197e5fe4d29b068c0e822960db9d
+  data.tar.gz: 8f6a2d515fe0b1b64e970ef5ced8aad5aaa96febfd6c44448dc765d78521a709
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: be87cac6f8fdf4ca69ca42641ad214804787a498b019d80b80b263835ce5fed52215af0d1f6e04e30e12889c710cb56db7df19a2c20c056c4969c97cfddbbd90
+  data.tar.gz: 8e36da5a76ea3f9940b3970bab64a6dd2fa33a050caf02d9dd743e2052773ea4f28678f142dc82ce4b16dd436d0396b07afcb8bf961f9168e5d9d8fed2c733e6
data/CHANGELOG.md
CHANGED
@@ -5,7 +5,29 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [0.0.
+## [0.0.4] - 2025-04-29
+
+### Added
+- Support for time recognition in text:
+  - "at X" format (e.g., "at 10" for 10:00)
+  - "@ X" alternative syntax
+  - Time with minutes "at X:Y" (e.g., "at 10:30")
+  - Time with seconds "at X:Y:Z" (e.g., "at 10:30:45")
+  - AM/PM format "at Xam/pm" (e.g., "at 10am", "at 7pm")
+- New attributes in Result class:
+  - `time` attribute to access the extracted time
+  - `datetime` method to get a combined Time object of date and time
+- Created a dedicated `TimeOfDay` class for better time handling with cleaner display
+- Support for word-based time expressions:
+  - "at noon" - returns 12:00
+  - "at midnight" - returns 00:00
+  - "in the morning" - returns configurable time (default 08:00)
+  - "in the evening" - returns configurable time (default 20:00)
+- Configuration system to customize times for "morning" and "evening"
+- Comprehensive tests for time parsing functionality
+- Updated documentation with time parsing examples and supported formats
+
+## [0.0.3] - 2025-04-24
 
 ### Added
 - Support for quarterly date references:
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -1,6 +1,6 @@
 # Hizuke
 
-Hizuke is a simple Ruby gem that parses text containing date references like "yesterday", "today", and "tomorrow". It extracts the date and returns the clean text without the date reference.
+Hizuke is a simple Ruby gem that parses text containing date references like "yesterday", "today", and "tomorrow". It extracts the date and returns the clean text without the date reference. It can also recognize time references like "at 10" or "at 7pm".
 
 ## Installation
 
@@ -61,10 +61,76 @@ puts result.date # => <Date: 2024-01-01> (represents the first day of the next
 result = Hizuke.parse("hiking this weekend")
 puts result.text # => "hiking"
 puts result.date # => <Date: 2023-04-01> (represents the next Saturday)
+
+# Parse text with time
+result = Hizuke.parse("meeting tomorrow at 10")
+puts result.text # => "meeting"
+puts result.date # => <Date: 2023-04-01> (represents tomorrow's date)
+puts result.time # => 10:00 (represents the time)
+puts result.datetime # => 2023-04-01 10:00:00 (combines date and time)
+
+# Parse text with time including minutes
+result = Hizuke.parse("call client today at 14:30")
+puts result.text # => "call client"
+puts result.date # => <Date: 2023-03-31> (represents today's date)
+puts result.time # => 14:30 (represents the time)
+puts result.datetime # => 2023-03-31 14:30:00 (combines date and time)
+
+# Parse text with AM/PM time
+result = Hizuke.parse("lunch meeting tomorrow at 12pm")
+puts result.text # => "lunch meeting"
+puts result.date # => <Date: 2023-04-01> (represents tomorrow's date)
+puts result.time # => 12:00 (represents the time, noon)
+puts result.datetime # => 2023-04-01 12:00:00 (combines date and time)
+
+# Parse text with word-based time
+result = Hizuke.parse("dinner tomorrow at noon")
+puts result.text # => "dinner"
+puts result.date # => <Date: 2023-04-01> (represents tomorrow's date)
+puts result.time # => 12:00 (represents noon)
+
+result = Hizuke.parse("flight today at midnight")
+puts result.text # => "flight"
+puts result.date # => <Date: 2023-03-31> (represents today's date)
+puts result.time # => 00:00 (represents midnight)
+
+result = Hizuke.parse("breakfast tomorrow in the morning")
+puts result.text # => "breakfast"
+puts result.date # => <Date: 2023-04-01> (represents tomorrow's date)
+puts result.time # => 08:00 (default morning time)
+
+result = Hizuke.parse("dinner today in the evening")
+puts result.text # => "dinner"
+puts result.date # => <Date: 2023-03-31> (represents today's date)
+puts result.time # => 20:00 (default evening time)
 ```
 
 The parser is case-insensitive and can handle date references located anywhere in the text. It also supports date references with or without spaces (e.g., "nextweek" or "next week").
 
+## Configuration
+
+You can configure the time values for "in the morning" and "in the evening" expressions:
+
+```ruby
+Hizuke.configure do |config|
+  # Set morning time to 9:30
+  config.morning_time = { hour: 9, min: 30 }
+
+  # Set evening time to 7:00 PM
+  config.evening_time = { hour: 19, min: 0 }
+end
+
+# Now when parsing "in the morning", it will return 9:30
+result = Hizuke.parse("breakfast tomorrow in the morning")
+puts result.time # => 09:30
+
+# And "in the evening" will return 19:00
+result = Hizuke.parse("dinner today in the evening")
+puts result.time # => 19:00
+```
+
+By default, "in the morning" is set to 8:00 and "in the evening" is set to 20:00.
+
 ## Supported Date Keywords
 
 Currently, the following English date keywords are supported:
@@ -100,6 +166,26 @@ Currently, the following English date keywords are supported:
 - `next quarter` / `nextquarter` - returns the first day of the next quarter
 - `last quarter` / `lastquarter` - returns the first day of the last quarter
 
+## Supported Time Formats
+
+The following time formats are supported:
+
+### Numeric time formats
+- `at X` - where X is a number (e.g., "at 10" for 10:00)
+- `@ X` - alternative syntax with @ symbol
+- `at X:Y` - where X is hours and Y is minutes (e.g., "at 10:30")
+- `at X:Y:Z` - where X is hours, Y is minutes, and Z is seconds
+- `at Xam/pm` - with AM/PM indicator (e.g., "at 10am", "at 7pm")
+- `at X:Yam/pm` - with minutes and AM/PM indicator (e.g., "at 10:30am")
+
+### Word-based time formats
+- `at noon` - returns 12:00
+- `at midnight` - returns 00:00
+- `in the morning` - returns configurable time (default 08:00)
+- `in the evening` - returns configurable time (default 20:00)
+
+When time is included, you can access it through the `time` attribute of the result. The time is displayed in the format "HH:MM" or "HH:MM:SS" if seconds are present. Additionally, you can use the `datetime` attribute to get a Time object combining both the date and time information.
+
 ## Development
 
 After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
@@ -108,7 +194,7 @@ To install this gem onto your local machine, run `bundle exec rake install`. To
 
 ## Contributing
 
-Bug reports and pull requests are welcome on GitHub at https://github.com/
+Bug reports and pull requests are welcome on GitHub at https://github.com/majur/hizuke.
 
 ## License
 
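The Configuration section added to the README relies on a block-based configure call. As a rough, self-contained sketch of that pattern (not the gem's code): `ConfigSketch` is a hypothetical stand-in module, and only the attribute names and default values are taken from the diff.

```ruby
# Hypothetical stand-in for the gem's module, so the sketch runs on its own.
module ConfigSketch
  # Holds the configurable times, with the defaults the README states.
  class Configuration
    attr_accessor :morning_time, :evening_time

    def initialize
      @morning_time = { hour: 8, min: 0 }
      @evening_time = { hour: 20, min: 0 }
    end
  end

  # Memoized configuration object shared by all callers.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yields the shared configuration so callers can override values.
  def self.configure
    yield(configuration) if block_given?
  end
end

# Override only the morning time; the evening default is left alone.
ConfigSketch.configure do |config|
  config.morning_time = { hour: 9, min: 30 }
end

morning = ConfigSketch.configuration.morning_time
evening = ConfigSketch.configuration.evening_time
puts format("%02d:%02d", morning[:hour], morning[:min]) # prints 09:30
puts format("%02d:%02d", evening[:hour], evening[:min]) # prints 20:00
```

Because the configuration object is memoized at module level, an override stays in effect for the rest of the process, which is worth keeping in mind in test suites.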
data/hizuke.gemspec
ADDED
@@ -0,0 +1,36 @@
+# frozen_string_literal: true
+
+require_relative "lib/hizuke/version"
+
+Gem::Specification.new do |spec|
+  spec.name = "hizuke"
+  spec.version = Hizuke::VERSION
+  spec.authors = ["Juraj Maťaše"]
+  spec.email = ["juraj@hey.com"]
+
+  spec.summary = "A simple date parser for natural language time references"
+  spec.description = "Hizuke is a lightweight utility that extracts dates from text by recognizing common time expressions. It cleans the original text and returns both the parsed date and the text without the date reference."
+  spec.homepage = "https://github.com/majur/hizuke"
+  spec.license = "MIT"
+  spec.required_ruby_version = ">= 2.6.0"
+
+  spec.metadata["homepage_uri"] = spec.homepage
+  spec.metadata["source_code_uri"] = spec.homepage
+  spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
+
+  # Specify which files should be added to the gem when it is released.
+  # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
+  spec.files = Dir.chdir(__dir__) do
+    `git ls-files -z`.split("\x0").reject do |f|
+      (f == __FILE__) || f.match(%r{\A(?:(?:bin|test|spec|features)/|\.(?:git|travis|circleci)|appveyor)})
+    end
+  end
+  spec.bindir = "exe"
+  spec.executables = spec.files.grep(%r{\Aexe/}) { |f| File.basename(f) }
+  spec.require_paths = ["lib"]
+
+  # Add development dependencies here
+  spec.add_development_dependency "minitest", "~> 5.0"
+  spec.add_development_dependency "rake", "~> 13.0"
+  spec.add_development_dependency "rubocop", "~> 1.21"
+end
data/lib/hizuke/parser.rb
CHANGED
@@ -1,17 +1,71 @@
 # frozen_string_literal: true
 
 require "date"
+require "time"
 
 module Hizuke
-  #
+  # Simple class to represent a time of day without a date
+  class TimeOfDay
+    attr_reader :hour, :min, :sec
+
+    def initialize(hour, min = 0, sec = 0)
+      @hour = hour
+      @min = min
+      @sec = sec
+    end
+
+    def to_s
+      if sec == 0
+        format("%02d:%02d", hour, min)
+      else
+        format("%02d:%02d:%02d", hour, min, sec)
+      end
+    end
+
+    def inspect
+      to_s
+    end
+  end
+
+  # Result object containing the clean text and extracted date/time
   class Result
-    attr_reader :text, :date
+    attr_reader :text, :date, :time
 
-    def initialize(text, date)
+    def initialize(text, date, time = nil)
       @text = text
       @date = date
+      @time = time
+    end
+
+    def datetime
+      return nil unless @time
+
+      # Combine date and time into a Time object
+      Time.new(@date.year, @date.month, @date.day,
+               @time.hour, @time.min, @time.sec)
+    end
+  end
+
+  # Configuration class for Hizuke
+  class Configuration
+    attr_accessor :morning_time, :evening_time
+
+    def initialize
+      @morning_time = { hour: 8, min: 0 }
+      @evening_time = { hour: 20, min: 0 }
     end
   end
+
+  # Allows configuration of Hizuke
+  def self.configure
+    @configuration ||= Configuration.new
+    yield(@configuration) if block_given?
+  end
+
+  # Returns the configuration
+  def self.configuration
+    @configuration ||= Configuration.new
+  end
 
   # Parser class responsible for extracting dates from text
   class Parser
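The hunk above adds a time-of-day value and a `datetime` method that merges it with the parsed date. A minimal sketch of that combination, assuming nothing beyond Ruby's standard library: `SketchTimeOfDay` and `combine` are hypothetical names used only for illustration, not part of the gem.

```ruby
require "date"

# Hypothetical stand-in mirroring the TimeOfDay class added in this hunk.
SketchTimeOfDay = Struct.new(:hour, :min, :sec) do
  # "HH:MM" when seconds are zero, "HH:MM:SS" otherwise.
  def to_s
    if sec.zero?
      format("%02d:%02d", hour, min)
    else
      format("%02d:%02d:%02d", hour, min, sec)
    end
  end
end

# Combine a Date and a time of day into a local Time.
# Returns nil when no time of day was extracted.
def combine(date, time_of_day)
  return nil unless time_of_day

  Time.new(date.year, date.month, date.day,
           time_of_day.hour, time_of_day.min, time_of_day.sec)
end

date = Date.new(2023, 4, 1)
half_past = SketchTimeOfDay.new(14, 30, 0)
with_secs = SketchTimeOfDay.new(10, 30, 45)

puts half_past.to_s # prints 14:30
puts with_secs.to_s # prints 10:30:45
puts combine(date, half_past).strftime("%Y-%m-%d %H:%M:%S") # prints 2023-04-01 14:30:00
puts combine(date, nil).inspect # prints nil
```

`Time.new` without an explicit offset builds the value in the process's local time zone, so the same parsed text can map to different instants on machines in different zones.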
@@ -80,6 +134,15 @@ module Hizuke
     NEXT_DAY_PATTERN = /next (monday|tuesday|wednesday|thursday|friday|saturday|sunday)/i
     LAST_DAY_PATTERN = /last (monday|tuesday|wednesday|thursday|friday|saturday|sunday)/i
 
+    # Regex patterns for time references
+    TIME_PATTERN = /(?:at|@)\s*(\d{1,2})(?::(\d{1,2}))?(?::(\d{1,2}))?\s*(am|pm)?/i
+
+    # Regex patterns for word-based time references
+    NOON_PATTERN = /at\s+noon/i
+    MIDNIGHT_PATTERN = /at\s+midnight/i
+    MORNING_PATTERN = /in\s+the\s+morning/i
+    EVENING_PATTERN = /in\s+the\s+evening/i
+
     # Parse text containing time references and extract both
     # the clean text and the date.
     #
@@ -99,20 +162,62 @@ module Hizuke
       # Check if text is nil or empty
       raise ParseError, "Input text cannot be nil or empty" if text.nil? || text.empty?
 
+      # Extract time if present
+      extracted_time = nil
+      clean_text = text
+
+      # Try to match word-based time patterns first
+      if match = clean_text.match(NOON_PATTERN)
+        extracted_time = TimeOfDay.new(12, 0, 0)
+        clean_text = clean_text.gsub(match[0], "").strip
+      elsif match = clean_text.match(MIDNIGHT_PATTERN)
+        extracted_time = TimeOfDay.new(0, 0, 0)
+        clean_text = clean_text.gsub(match[0], "").strip
+      elsif match = clean_text.match(MORNING_PATTERN)
+        config = Hizuke.configuration
+        extracted_time = TimeOfDay.new(config.morning_time[:hour], config.morning_time[:min], 0)
+        clean_text = clean_text.gsub(match[0], "").strip
+      elsif match = clean_text.match(EVENING_PATTERN)
+        config = Hizuke.configuration
+        extracted_time = TimeOfDay.new(config.evening_time[:hour], config.evening_time[:min], 0)
+        clean_text = clean_text.gsub(match[0], "").strip
+      # Then try the numeric time pattern
+      elsif time_match = clean_text.match(TIME_PATTERN)
+        hour = time_match[1].to_i
+        min = time_match[2] ? time_match[2].to_i : 0
+        sec = time_match[3] ? time_match[3].to_i : 0
+
+        # Adjust for AM/PM
+        if time_match[4]&.downcase == "pm" && hour < 12
+          hour += 12
+        elsif time_match[4]&.downcase == "am" && hour == 12
+          hour = 0
+        end
+
+        extracted_time = TimeOfDay.new(hour, min, sec)
+
+        # Remove the time expression from the text
+        clean_text = clean_text.gsub(time_match[0], "").strip
+      end
+
       # Check for dynamic patterns first (in X days, X days ago)
-      result = check_dynamic_patterns(
-
+      result = check_dynamic_patterns(clean_text)
+      if result
+        return Result.new(result.text, result.date, extracted_time)
+      end
 
       # Check for day of week patterns (this Monday, next Tuesday, etc.)
-      result = check_day_of_week_patterns(
-
+      result = check_day_of_week_patterns(clean_text)
+      if result
+        return Result.new(result.text, result.date, extracted_time)
+      end
 
       # Try to find compound date expressions (like "next week")
       compound_matches = {}
 
       DATE_KEYWORDS.keys.select { |k| k.include?(" ") }.each do |compound_key|
-        if
-          start_idx =
+        if clean_text.downcase.include?(compound_key)
+          start_idx = clean_text.downcase.index(compound_key)
           end_idx = start_idx + compound_key.length - 1
           compound_matches[compound_key] = [start_idx, end_idx]
         end
@@ -128,15 +233,15 @@ module Hizuke
         date = calculate_date(date_value)
 
         # Remove the date expression from the text
-
-
-
+        final_text = clean_text.dup
+        final_text.slice!(indices[0]..indices[1])
+        final_text = final_text.strip
 
-        return Result.new(
+        return Result.new(final_text, date, extracted_time)
       end
 
       # Split the text into words (for single-word date references)
-      words =
+      words = clean_text.split
 
       # Find the first date keyword
       date_word_index = nil
@@ -152,7 +257,7 @@ module Hizuke
       end
 
       if date_word_index.nil?
-        raise ParseError, "No valid date reference found in '#{
+        raise ParseError, "No valid date reference found in '#{clean_text}'"
       end
 
       # Calculate the date based on the keyword
@@ -161,9 +266,9 @@ module Hizuke
       # Create the clean text by removing the date keyword
       clean_words = words.dup
       clean_words.delete_at(date_word_index)
-
+      final_text = clean_words.join(" ").strip
 
-      Result.new(
+      Result.new(final_text, date, extracted_time)
     end
 
     private
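The numeric time branch in this file matches `TIME_PATTERN`, normalizes AM/PM to a 24-hour value, and strips the matched expression from the text. The following standalone sketch reproduces that flow outside the gem. The regex is copied from the diff; `TIME_PATTERN_SKETCH` and `extract_time` are hypothetical names, and the return shape (an array rather than a `Result`) is an assumption made to keep the example short.

```ruby
# Same regex as TIME_PATTERN in the diff above.
TIME_PATTERN_SKETCH = /(?:at|@)\s*(\d{1,2})(?::(\d{1,2}))?(?::(\d{1,2}))?\s*(am|pm)?/i

# Hypothetical helper: returns [[hour, min, sec], cleaned_text] in 24-hour
# form, or nil when the text contains no numeric time expression.
def extract_time(text)
  match = text.match(TIME_PATTERN_SKETCH)
  return nil unless match

  hour = match[1].to_i
  min = match[2].to_i # nil.to_i is 0, so missing minutes default to 0
  sec = match[3].to_i
  meridiem = match[4]&.downcase

  # 12-hour to 24-hour adjustment: 1pm..11pm shift by 12, 12am becomes 0.
  if meridiem == "pm" && hour < 12
    hour += 12
  elsif meridiem == "am" && hour == 12
    hour = 0
  end

  # Remove the matched expression, as the parser does with gsub and strip.
  [[hour, min, sec], text.gsub(match[0], "").strip]
end

p extract_time("meeting tomorrow at 10")     # prints [[10, 0, 0], "meeting tomorrow"]
p extract_time("call client today at 14:30") # prints [[14, 30, 0], "call client today"]
p extract_time("lunch tomorrow @ 12pm")      # prints [[12, 0, 0], "lunch tomorrow"]
p extract_time("standup today at 12am")      # prints [[0, 0, 0], "standup today"]
p extract_time("no time here")               # prints nil
```

Two properties of the pattern as written are worth checking against real input: it has no word boundary before `at`, so a string such as "chat 5" also matches, and it does not range-check the captured numbers, so "at 99" yields an hour of 99 in this sketch.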
data/lib/hizuke/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: hizuke
 version: !ruby/object:Gem::Version
-  version: 0.0.
+  version: 0.0.4
 platform: ruby
 authors:
 - Juraj Maťaše
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2025-04-
+date: 2025-04-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: minitest
@@ -67,6 +67,7 @@ files:
 - LICENSE.txt
 - README.md
 - Rakefile
+- hizuke.gemspec
 - lib/hizuke.rb
 - lib/hizuke/parser.rb
 - lib/hizuke/version.rb