whatsapp-chat-parser 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: c63551e684919c385110d12bcbbe8e15bae0eef6e4ffe31a0617970f45f24bd6
4
+ data.tar.gz: a510d3a34c51154faae21a8f69d7629e6f0f0ca103a4cfddfbb7571013cc8d3c
5
+ SHA512:
6
+ metadata.gz: c35b864edcaef50faed6aff4c32d80c8d12e7a624819165766a3e935052b7148aff429dfb674c714dae32b938b2f0e3a654e0fbf8628f2147ead5c46e46dfaa6
7
+ data.tar.gz: 16c197ee4c6ffc1f1e5ecfc94aadabde22a2b5329ab0b7059ffc2897b18afeb3f6b879a799649b98ac162506162d9a965a0ecfcd6e9423c0f11192a2f9880d48
data/CHANGELOG.md ADDED
@@ -0,0 +1,16 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] - 2026-02-18
9
+
10
+ ### Added
11
+ - Initial release of `whatsapp-chat-parser`.
12
+ - Support for parsing exported WhatsApp chat `.txt` files from both **Android and iOS** platforms.
13
+ - Capability to parse both file streams and raw message strings.
14
+ - Automatic encoding normalization for cross-platform file compatibility.
15
+ - Support for multi-line messages.
16
+ - High-precision timestamp parsing (including second-level precision for iOS).
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025-2026 Emmanuel Akachukwu
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,146 @@
1
+ # WhatsApp Chat Parser
2
+
3
+ A Ruby library that parses exported WhatsApp chat `.txt` files and converts them into structured, machine-readable data. Designed for downstream processing such as analytics, ETL pipelines, storage, and transformation - not for rendering UI or interacting with the WhatsApp API.
4
+
5
+ ## Features
6
+
7
+ - **Platform support**: Handles both Android and iOS WhatsApp chat exports
8
+ - **Structured output**: Normalized message records suitable for JSON, databases, or further transformation
9
+ - **Robust parsing**: Detects platform-specific formats, normalizes timestamps, and groups multi-line messages
10
+ - **Deterministic**: No dependencies, explicit platform handling, predictable output structure
11
+ - **Fail-safe**: Skips or handles malformed lines when possible instead of aborting
12
+
13
+ ## Installation
14
+
15
+ Add to your Gemfile:
16
+
17
+ ```ruby
18
+ gem 'whatsapp-chat-parser'
19
+ ```
20
+
21
+ Then run:
22
+
23
+ ```bash
24
+ bundle install
25
+ ```
26
+
27
+ Or install directly:
28
+
29
+ ```bash
30
+ gem install whatsapp-chat-parser
31
+ ```
32
+
33
+ ## Usage
34
+
35
+ **Parse a single message string** (returns a `Message` or `nil` if malformed):
36
+
37
+ ```ruby
38
+ require 'whatsapp-chat-parser'
39
+
40
+ line = '12/15/25, 10:30 AM - John Doe: Hello World'
41
+ msg = WhatsappChatParser.parse_line(line)
42
+ puts "#{msg.timestamp} | #{msg.author}: #{msg.body}" if msg
43
+ ```
44
+
45
+ **Parse a file by path or IO** (returns an enumerable of `Message`; malformed lines are skipped):
46
+
47
+ ```ruby
48
+ messages = WhatsappChatParser.parse_file('path/to/chat.txt')
49
+ messages.each { |msg| puts "#{msg.timestamp} | #{msg.author}: #{msg.body}" }
50
+ ```
51
+
52
+ ```ruby
53
+ File.open('path/to/chat.txt') do |f|
54
+ WhatsappChatParser.parse_file(f).each { |msg| puts "#{msg.timestamp} | #{msg.author}: #{msg.body}" }
55
+ end
56
+ ```
57
+
58
+ Each message has `timestamp`, `author`, `body`, `platform` and `type`. The result is suitable for JSON, databases, or pipelines.
59
+
60
+ For a more comprehensive example, see [samples/example.rb](samples/example.rb).
61
+
62
+ ## Output format
63
+
64
+ Each parsed record includes:
65
+
66
+ | Field | Description |
67
+ |-------------|----------------------------------------------------|
68
+ | `timestamp` | Normalized date/time (consistent across platforms) |
69
+ | `author` | Sender name or identifier (when present) |
70
+ | `body` | Full message content (multi-line messages grouped) |
71
+ | `platform` | Platform where chat was exported from (Android/iOS) |
72
+ | `type` | e.g. user message, system message |
73
+
74
+ ## Design principles
75
+
76
+ - **Deterministic parsing** - Same input yields same output
77
+ - **No dependencies** - Self-contained Ruby
78
+ - **Explicit platform handling** - Android vs iOS format differences are handled explicitly
79
+ - **Predictable structure** - Stable, documented output schema
80
+
81
+ ## Use cases
82
+
83
+ - Chat analytics and reporting
84
+ - Data migration or archival
85
+ - ETL pipelines into databases or spreadsheets
86
+ - Automated processing of exported WhatsApp conversations
87
+
88
+ ## Non-goals
89
+
90
+ This library does **not**:
91
+
92
+ - Interact with WhatsApp APIs
93
+ - Require network access
94
+ - Perform message interpretation, sentiment analysis, or NLP
95
+ - Handle encrypted or proprietary WhatsApp data formats
96
+
97
+ Input must be unmodified exports from WhatsApp’s “Export Chat” feature.
98
+
99
+ ## How to export WhatsApp chats
100
+
101
+ To use this library you need a plain-text export of a WhatsApp conversation. Use WhatsApp’s built-in **Export Chat** and choose **Without media** so you get a single `.txt` file.
102
+
103
+ - [Android](https://faq.whatsapp.com/1180414079177245?cms_platform=android)
104
+ - [iOS](https://faq.whatsapp.com/1180414079177245/?cms_platform=iphone)
105
+
106
+ Use the exported `.txt` file as-is; do not edit the format. This library supports both Android and iOS export formats.
107
+
108
+ ## Development
109
+
110
+ ### Setup
111
+
112
+ Clone the repository and install dependencies:
113
+
114
+ ```bash
115
+ git clone https://github.com/emmaakachukwu/whatsapp-chat-parser-rb
116
+ cd whatsapp-chat-parser-rb
117
+ bundle install
118
+ ```
119
+
120
+ ### Running Tests
121
+
122
+ We use RSpec for testing. Ensure all tests pass before submitting changes:
123
+
124
+ ```bash
125
+ bundle exec rspec
126
+ ```
127
+
128
+ ### Linting
129
+
130
+ We use RuboCop to maintain code quality:
131
+
132
+ ```bash
133
+ bundle exec rubocop
134
+ ```
135
+
136
+ ## Contributing
137
+
138
+ Contributions are welcome. Please open an issue or pull request on the project repository.
139
+
140
+ 1. **Fork** the repository and create a feature branch.
141
+ 2. Ensure your code follows the **Development** steps above (tests and linting pass).
142
+ 3. **Submit a Pull Request** with a detailed description of your work.
143
+
144
+ ## License
145
+
146
+ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
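Reviewer sketch (not part of the package): the README says parsed records are suitable for JSON and pipelines. The snippet below shows one way that could look. `Record` is a hypothetical stand-in for the gem's `Message` so that the example runs without the gem installed; only the five field names come from the README.

```ruby
require 'json'

# Hypothetical stand-in for the gem's Message, carrying the five documented fields.
Record = Struct.new(:timestamp, :author, :body, :platform, :type, keyword_init: true)

records = [
  Record.new(timestamp: '2025-12-15 10:30:00', author: 'John Doe',
             body: 'Hello World', platform: :android, type: :user),
  Record.new(timestamp: '2025-12-15 10:31:00', author: nil,
             body: 'Messages are end-to-end encrypted.', platform: :android, type: :system)
]

# One JSON object per record (JSON Lines), a convenient shape for ETL jobs.
json_lines = records.map { |record| JSON.generate(record.to_h) }
puts json_lines
```

With the gem itself, the hash would presumably be built from the `timestamp`, `author`, `body`, `platform` and `type` accessors, since `Message` as shipped does not define `to_h`.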
data/lib/whatsapp-chat-parser/encoding.rb ADDED
@@ -0,0 +1,55 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ # Handles encoding detection and normalization for chat exports.
5
+ module Encoding
6
+ UTF8_BOM = "\xEF\xBB\xBF".b.freeze
7
+ UTF16LE_BOM = "\xFF\xFE".b.freeze
8
+ UTF16BE_BOM = "\xFE\xFF".b.freeze
9
+ FALLBACK_ENCODING = 'UTF-8'
10
+
11
+ class << self
12
+ # Normalizes a string to UTF-8, handling BOM and common encodings.
13
+ # @param line [String] The raw input string.
14
+ # @return [String] The normalized UTF-8 string.
15
+ def normalize_to_utf8(line)
16
+ enc = encoding_for(line)
17
+ str = line.dup.force_encoding(enc)
18
+ str = strip_bom(str, enc)
19
+ unless enc == ::Encoding::UTF_8
20
+ str = str.encode(
21
+ ::Encoding::UTF_8, invalid: :replace, undef: :replace
22
+ )
23
+ end
24
+
25
+ str
26
+ end
27
+
28
+ private
29
+
30
+ def encoding_for(line)
31
+ raw = line.dup.force_encoding(::Encoding::BINARY)
32
+ if raw.start_with?(UTF8_BOM)
33
+ ::Encoding::UTF_8
34
+ elsif raw.start_with?(UTF16LE_BOM)
35
+ ::Encoding::UTF_16LE
36
+ elsif raw.start_with?(UTF16BE_BOM)
37
+ ::Encoding::UTF_16BE
38
+ else
39
+ ::Encoding.find(FALLBACK_ENCODING)
40
+ end
41
+ end
42
+
43
+ def strip_bom(str, encoding)
44
+ case encoding
45
+ when ::Encoding::UTF_8
46
+ str.start_with?("\uFEFF") ? str.delete_prefix("\uFEFF") : str
47
+ when ::Encoding::UTF_16LE, ::Encoding::UTF_16BE
48
+ str.bytesize >= 2 ? str.byteslice(2..).force_encoding(encoding) : str
49
+ else
50
+ str
51
+ end
52
+ end
53
+ end
54
+ end
55
+ end
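Reviewer sketch (not part of the package): the module above picks an encoding from the byte-order mark and transcodes to UTF-8. A compact, standalone version of that idea might look like the following. `BOMS` and `to_utf8` are hypothetical names, and unlike the package code this sketch assumes it is handed the whole input rather than one line at a time.

```ruby
# Byte-order marks mapped to the encoding they announce.
BOMS = {
  "\xEF\xBB\xBF".b => Encoding::UTF_8,
  "\xFF\xFE".b => Encoding::UTF_16LE,
  "\xFE\xFF".b => Encoding::UTF_16BE
}.freeze

# Hypothetical helper: detect a BOM on the raw bytes, drop it, transcode to UTF-8.
def to_utf8(raw)
  bytes = raw.b # binary copy; the caller's string is left untouched
  bom, encoding = BOMS.find { |prefix, _| bytes.start_with?(prefix) }
  bytes = bytes.byteslice(bom.bytesize..) if bom
  text = bytes.force_encoding(encoding || Encoding::UTF_8)
  # scrub replaces any bytes that are still invalid after transcoding.
  text.encode(Encoding::UTF_8, invalid: :replace, undef: :replace).scrub
end

puts to_utf8("\xFF\xFEH\x00i\x00".b)
```

Matching on the binary copy keeps the check independent of whatever encoding label the input string happened to carry.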
data/lib/whatsapp-chat-parser/file_processor.rb ADDED
@@ -0,0 +1,56 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'stringio'
4
+
5
+ module WhatsappChatParser
6
+ # Handles reading and processing chat export files.
7
+ module FileProcessor
8
+ class << self
9
+ # Iterates through the source and yields parsed messages.
10
+ # @param source [String, IO] The file path or IO object.
11
+ # @yield [message] Yields each parsed message.
12
+ # @yieldparam message [WhatsappChatParser::Models::Message]
13
+ # @return [Enumerator] if no block is given.
14
+ def parse(source, &block)
15
+ return enum_for(__method__, source) unless block_given?
16
+
17
+ file = source.is_a?(StringIO) ? source : File.open(source)
18
+ parse_io(file, &block)
19
+ end
20
+
21
+ private
22
+
23
+ # Processes an IO object.
24
+ # @param io [IO] The input source.
25
+ # @yield [message]
26
+ def parse_io(io, &block)
27
+ return enum_for(__method__, io) unless block_given?
28
+
29
+ accumulate_messages(io, &block)
30
+ end
31
+
32
+ def accumulate_messages(io, &block)
33
+ message = +'' # unfrozen buffer: this file enables frozen_string_literal and the buffer is appended to
34
+
35
+ io.each_line do |line|
36
+ if message_starts_here?(line)
37
+ yield_message(message, &block) unless message.empty?
38
+ message = line
39
+ else
40
+ message << line
41
+ end
42
+ end
43
+
44
+ yield_message(message, &block) unless message.empty?
45
+ end
46
+
47
+ def yield_message(message)
48
+ yield(Platforms.parse(message))
49
+ end
50
+
51
+ def message_starts_here?(line)
52
+ !Platforms.parse(line).nil?
53
+ end
54
+ end
55
+ end
56
+ end
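Reviewer sketch (not part of the package): the processor above groups continuation lines with the message line that precedes them. The standalone example below illustrates that grouping. `MESSAGE_START` is an assumed, simplified Android-style check and `group_messages` is a hypothetical helper; the package itself delegates the start-of-message decision to its platform parsers.

```ruby
require 'stringio'

# Assumed, simplified start-of-message check (Android-style prefix only).
MESSAGE_START = %r{\A\d{1,2}/\d{1,2}/\d{2}, \d{1,2}:\d{2}\p{Space}*[AP]M - }

# Hypothetical helper: returns one string per message, continuation lines included.
def group_messages(io)
  messages = []
  io.each_line do |line|
    if line.match?(MESSAGE_START)
      messages << line.dup
    elsif messages.any?
      messages.last << line
    end
    # Lines seen before the first message start are dropped in this sketch.
  end
  messages.map(&:chomp)
end

export = StringIO.new(<<~CHAT)
  12/15/25, 10:30 AM - John Doe: Hello
  this line continues the first message
  12/15/25, 10:31 AM - Jane Doe: Hi
CHAT

grouped = group_messages(export)
puts grouped.length
```

Collecting into an array keeps the example short; a streaming variant would yield each message as soon as the next start line appears.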
data/lib/whatsapp-chat-parser/models/message.rb ADDED
@@ -0,0 +1,38 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ module Models
5
+ # Represents a single WhatsApp message.
6
+ class Message
7
+ # @return [String] The date and time the message was sent (standardized SQL format).
8
+ attr_accessor :timestamp
9
+ # @return [String, nil] The name or phone number of the message author, or nil for system messages.
10
+ attr_accessor :author
11
+ # @return [String] The content of the message.
12
+ attr_accessor :body
13
+ # @return [Symbol] The type of message (:user or :system).
14
+ attr_accessor :type
15
+ # @return [Symbol] The platform the message was exported from (:android or :ios).
16
+ attr_accessor :platform
17
+
18
+ # Initializes a new Message object.
19
+ # @param timestamp [String] The standardized timestamp.
20
+ # @param author [String, nil] The author of the message.
21
+ # @param body [String] The message body.
22
+ # @param platform [Symbol] The export platform.
23
+ def initialize(timestamp:, author:, body:, platform:)
24
+ @timestamp = timestamp
25
+ @author = author
26
+ @body = body
27
+ @platform = platform
28
+ @type = message_type
29
+ end
30
+
31
+ private
32
+
33
+ def message_type
34
+ @author.nil? ? :system : :user
35
+ end
36
+ end
37
+ end
38
+ end
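Reviewer sketch (not part of the package): the model above derives `type` from whether an author is present. A minimal standalone illustration of that rule, using a hypothetical `Entry` struct rather than the gem's class:

```ruby
# Hypothetical stand-in: type is derived from the author, never passed in.
Entry = Struct.new(:author, :body, keyword_init: true) do
  def type
    author.nil? ? :system : :user
  end
end

user_entry = Entry.new(author: 'John Doe', body: 'Hello World')
system_entry = Entry.new(author: nil, body: 'John Doe joined using an invite link')

puts user_entry.type
puts system_entry.type
```

One difference worth noting: the package computes `type` once in `initialize`, so changing `author` later through the accessor would not update it, whereas this sketch recomputes on every call.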
data/lib/whatsapp-chat-parser/platforms/android/pattern.rb ADDED
@@ -0,0 +1,47 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ module Platforms
5
+ module Android
6
+ # Regex patterns and builders for Android WhatsApp exports.
7
+ module Pattern
8
+ # rubocop:disable Layout/HashAlignment
9
+ PATTERNS = {
10
+ month: /(\d{1,2})/,
11
+ day: /(\d{1,2})/,
12
+ year: /(\d{2})/,
13
+ hour: /(\d{1,2})/,
14
+ minute: /(\d{2})/,
15
+ meridiem: /\p{Space}*([AP]M)/,
16
+ author: /(?:([^:]+): )?/,
17
+ body: /(.*)/
18
+ }.freeze
19
+ # rubocop:enable Layout/HashAlignment
20
+
21
+ class << self
22
+ # Returns the compiled regex for Android chat exports.
23
+ # @return [Regexp]
24
+ def regex
25
+ Regexp.new(
26
+ "#{date_pattern}, #{time_pattern} " \
27
+ "- #{PatternHelpers.source(PATTERNS, :author)}#{PatternHelpers.source(PATTERNS, :body)}",
28
+ Regexp::MULTILINE
29
+ )
30
+ end
31
+
32
+ private
33
+
34
+ def date_pattern
35
+ PATTERNS.fetch_values(:month, :day, :year)
36
+ .map(&:source)
37
+ .join('/')
38
+ end
39
+
40
+ def time_pattern
41
+ PatternHelpers.format_sources(PATTERNS, %i[hour minute meridiem], '%s:%s%s')
42
+ end
43
+ end
44
+ end
45
+ end
46
+ end
47
+ end
data/lib/whatsapp-chat-parser/platforms/android.rb ADDED
@@ -0,0 +1,86 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ module Platforms
5
+ # Parser for Android WhatsApp chat exports.
6
+ module Android
7
+ class << self
8
+ # Parses a line from an Android export.
9
+ # @param line [String] The exported line.
10
+ # @return [Models::Message, nil]
11
+ def parse(line)
12
+ match = line.match(Pattern.regex)
13
+ return unless match
14
+
15
+ timestamp = extract_timestamp(match)
16
+ author = extract(match, :author)
17
+ body = extract(match, :body)
18
+
19
+ Models::Message.new(timestamp: timestamp, author: author, body: body, platform: :android)
20
+ end
21
+
22
+ # Checks if a line matches the Android format.
23
+ # @param line [String]
24
+ # @return [Boolean]
25
+ def matches?(line)
26
+ Pattern.regex.match?(line)
27
+ end
28
+
29
+ private
30
+
31
+ def extract(match, key)
32
+ index = Pattern::PATTERNS.keys.index(key)
33
+ match[index + 1]
34
+ end
35
+
36
+ def extract_timestamp(match)
37
+ date_components = extract_date_components(match)
38
+ time_components = extract_time_components(match)
39
+
40
+ format_sql_timestamp(date_components, time_components)
41
+ end
42
+
43
+ def extract_date_components(match)
44
+ month = extract(match, :month).to_i
45
+ day = extract(match, :day).to_i
46
+ year = extract(match, :year).to_i + 2000
47
+
48
+ { month: month, day: day, year: year }
49
+ end
50
+
51
+ def extract_time_components(match)
52
+ hour = extract(match, :hour).to_i
53
+ minute = extract(match, :minute).to_i
54
+ meridiem = extract(match, :meridiem)
55
+ hour = convert_to_24_hour(hour, meridiem)
56
+
57
+ { hour: hour, minute: minute }
58
+ end
59
+
60
+ def convert_to_24_hour(hour, meridiem)
61
+ meridiem = meridiem.upcase
62
+ if meridiem == 'PM' && hour < 12
63
+ hour + 12
64
+ elsif meridiem == 'AM' && hour == 12
65
+ 0
66
+ else
67
+ hour
68
+ end
69
+ end
70
+
71
+ def format_sql_timestamp(date, time)
72
+ # rubocop:disable Layout/HashAlignment
73
+ format(
74
+ '%<year>04d-%<month>02d-%<day>02d %<hour>02d:%<minute>02d:00',
75
+ year: date[:year],
76
+ month: date[:month],
77
+ day: date[:day],
78
+ hour: time[:hour],
79
+ minute: time[:minute]
80
+ )
81
+ # rubocop:enable Layout/HashAlignment
82
+ end
83
+ end
84
+ end
85
+ end
86
+ end
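Reviewer sketch (not part of the package): the Android parser above converts a 12-hour clock reading to 24-hour form and renders an SQL-style timestamp. The standalone functions below restate that logic. The names `to_24_hour` and `sql_timestamp` are hypothetical.

```ruby
# 12 AM is midnight (0), 12 PM stays 12, other PM hours shift by 12.
def to_24_hour(hour, meridiem)
  case meridiem&.upcase
  when 'PM' then hour < 12 ? hour + 12 : hour
  when 'AM' then hour == 12 ? 0 : hour
  else hour # no meridiem: assume the value is already on a 24-hour clock
  end
end

# Renders "YYYY-MM-DD HH:MM:SS"; seconds default to 0 when the export omits them.
def sql_timestamp(year:, month:, day:, hour:, minute:, second: 0)
  format('%04d-%02d-%02d %02d:%02d:%02d', year, month, day, hour, minute, second)
end

puts sql_timestamp(year: 2025, month: 12, day: 15, hour: to_24_hour(10, 'PM'), minute: 30)
```

Passing integers to `format` keeps the zero padding predictable, which is why the arguments are converted before they reach this function.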
data/lib/whatsapp-chat-parser/platforms/ios/pattern.rb ADDED
@@ -0,0 +1,64 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ module Platforms
5
+ module Ios
6
+ # Regex patterns and builders for iOS WhatsApp exports.
7
+ module Pattern
8
+ # rubocop:disable Layout/HashAlignment
9
+ PATTERNS = {
10
+ day: /(\d{1,2})/,
11
+ month: /(\d{1,2})/,
12
+ year: /(\d{4})/,
13
+ hour: /(\d{1,2})/,
14
+ minute: /(\d{2})/,
15
+ second: /(?::(\d{2}))?/,
16
+ meridiem: /\p{Space}*([AP]M)?/,
17
+ author: /(?:([^:]+?)\p{Space}*:\p{Space}*)?/,
18
+ body: /(.*)/
19
+ }.freeze
20
+ # rubocop:enable Layout/HashAlignment
21
+
22
+ class << self
23
+ # Returns the compiled regex for iOS chat exports.
24
+ # @return [Regexp]
25
+ def regex
26
+ Regexp.new(
27
+ "#{square_bracket_open_pattern}" \
28
+ "#{date_pattern},#{space_pattern}" \
29
+ "#{time_pattern}" \
30
+ "#{square_bracket_close_pattern}" \
31
+ "#{space_pattern}#{/[-~]?/.source}#{space_pattern}" \
32
+ "#{PatternHelpers.source(PATTERNS, :author)}#{PatternHelpers.source(PATTERNS, :body)}",
33
+ Regexp::MULTILINE
34
+ )
35
+ end
36
+
37
+ private
38
+
39
+ def date_pattern
40
+ PatternHelpers.join_sources(PATTERNS, %i[day month year], '/')
41
+ end
42
+
43
+ def time_pattern
44
+ PatternHelpers.format_sources(
45
+ PATTERNS, %i[hour minute second meridiem], '%s:%s%s%s'
46
+ )
47
+ end
48
+
49
+ def space_pattern
50
+ /\p{Space}*/.source
51
+ end
52
+
53
+ def square_bracket_open_pattern
54
+ /\[?/.source
55
+ end
56
+
57
+ def square_bracket_close_pattern
58
+ /\]?/.source
59
+ end
60
+ end
61
+ end
62
+ end
63
+ end
64
+ end
data/lib/whatsapp-chat-parser/platforms/ios.rb ADDED
@@ -0,0 +1,87 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ module Platforms
5
+ # Parser for iOS WhatsApp chat exports.
6
+ module Ios
7
+ class << self
8
+ # Parses a line from an iOS export.
9
+ # @param line [String] The exported line.
10
+ # @return [Models::Message, nil]
11
+ def parse(line)
12
+ match = line.match(Pattern.regex)
13
+ return unless match
14
+
15
+ timestamp = extract_timestamp(match)
16
+ author = extract(match, :author)
17
+ body = extract(match, :body)
18
+
19
+ Models::Message.new(timestamp: timestamp, author: author, body: body, platform: :ios)
20
+ end
21
+
22
+ # Checks if a line matches the iOS format.
23
+ # @param line [String]
24
+ # @return [Boolean]
25
+ def matches?(line)
26
+ Pattern.regex.match?(line)
27
+ end
28
+
29
+ private
30
+
31
+ def extract(match, key)
32
+ index = Pattern::PATTERNS.keys.index(key)
33
+ match[index + 1]
34
+ end
35
+
36
+ def extract_timestamp(match)
37
+ date_components = extract_date_components(match)
38
+ time_components = extract_time_components(match)
39
+
40
+ format_sql_timestamp(date_components, time_components)
41
+ end
42
+
43
+ def extract_date_components(match)
44
+ month = extract(match, :month).to_i
45
+ day = extract(match, :day).to_i
46
+ year = extract(match, :year).to_i
47
+
48
+ { month: month, day: day, year: year }
49
+ end
50
+
51
+ def extract_time_components(match)
52
+ hour = extract(match, :hour).to_i
53
+ minute = extract(match, :minute).to_i
54
+ second = extract(match, :second).to_i
55
+ meridiem = extract(match, :meridiem)
56
+ hour = convert_to_24_hour(hour, meridiem)
57
+
58
+ { hour: hour, minute: minute, second: second }
59
+ end
60
+
61
+ def convert_to_24_hour(hour, meridiem)
62
+ if meridiem == 'PM' && hour < 12
63
+ hour + 12
64
+ elsif meridiem == 'AM' && hour == 12
65
+ 0
66
+ else
67
+ hour
68
+ end
69
+ end
70
+
71
+ def format_sql_timestamp(date, time)
72
+ # rubocop:disable Layout/HashAlignment
73
+ format(
74
+ '%<year>04d-%<month>02d-%<day>02d %<hour>02d:%<minute>02d:%<second>02d',
75
+ year: date[:year],
76
+ month: date[:month],
77
+ day: date[:day],
78
+ hour: time[:hour],
79
+ minute: time[:minute],
80
+ second: time[:second]
81
+ )
82
+ # rubocop:enable Layout/HashAlignment
83
+ end
84
+ end
85
+ end
86
+ end
87
+ end
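Reviewer sketch (not part of the package): regex captures arrive as strings, or as `nil` for optional groups such as the iOS seconds. `Kernel#format` with `%d` converts a String argument the way `Integer()` does, so a zero-padded capture like `"08"` is rejected as invalid octal. The standalone snippet below shows the pitfall and a conversion that avoids it. `padded` is a hypothetical helper.

```ruby
# Convert first, then format: nil.to_i is 0 and "08".to_i is 8.
def padded(capture)
  format('%02d', capture.to_i)
end

# Formatting the raw string capture directly raises for "08" and "09".
strict_failed =
  begin
    format('%02d', '08')
    false
  rescue ArgumentError
    true
  end

puts padded('08')
puts padded(nil)
puts strict_failed
```

The same conversion also covers a missing seconds group, which would otherwise reach `format` as `nil`.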
data/lib/whatsapp-chat-parser/platforms/pattern_helpers.rb ADDED
@@ -0,0 +1,23 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ module Platforms
5
+ # Shared utilities for building regex patterns.
6
+ module PatternHelpers
7
+ class << self
8
+ def join_sources(patterns, keys, separator)
9
+ patterns.fetch_values(*keys).map(&:source).join(separator)
10
+ end
11
+
12
+ def source(patterns, key)
13
+ patterns[key].source
14
+ end
15
+
16
+ def format_sources(patterns, keys, format_string)
17
+ values = patterns.fetch_values(*keys).map(&:source)
18
+ format_string % values
19
+ end
20
+ end
21
+ end
22
+ end
23
+ end
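Reviewer sketch (not part of the package): the helpers above assemble one regex from the `source` strings of smaller ones. The standalone example below shows the same composition for a time-of-day fragment. `PARTS` and `TIME` are hypothetical names, and the anchors are an addition for the sake of the example.

```ruby
PARTS = {
  hour: /(\d{1,2})/,
  minute: /(\d{2})/,
  meridiem: /\p{Space}*([AP]M)/
}.freeze

# Regexp#source returns the pattern text without delimiters or flags.
sources = PARTS.fetch_values(:hour, :minute, :meridiem).map(&:source)
TIME = Regexp.new("\\A#{format('%s:%s%s', *sources)}\\z")

match = TIME.match('10:30 AM')
p match.captures
```

Because each fragment contributes exactly one capture group, the capture order follows the key order, which is what lets the parsers look captures up by key position.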
data/lib/whatsapp-chat-parser/platforms.rb ADDED
@@ -0,0 +1,38 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative 'encoding'
4
+ require_relative 'platforms/android'
5
+ require_relative 'platforms/ios'
6
+ require_relative 'platforms/android/pattern'
7
+ require_relative 'platforms/ios/pattern'
8
+ require_relative 'platforms/pattern_helpers'
9
+
10
+ module WhatsappChatParser
11
+ # Registry and dispatcher for platform-specific chat parsers.
12
+ module Platforms
13
+ PLATFORMS = [Android, Ios].freeze
14
+
15
+ class << self
16
+ # Attempts to parse a message line by identifying its platform.
17
+ # @param line [String] The raw message line.
18
+ # @return [WhatsappChatParser::Models::Message, nil] The parsed message or nil.
19
+ def parse(line)
20
+ sanitized = sanitize(line)
21
+ platform = platform_for(sanitized)
22
+ return nil unless platform
23
+
24
+ platform.parse(sanitized)
25
+ end
26
+
27
+ private
28
+
29
+ def platform_for(line)
30
+ PLATFORMS.find { |platform| platform.matches?(line) }
31
+ end
32
+
33
+ def sanitize(line)
34
+ Encoding.normalize_to_utf8(line).strip.scrub(' ').squeeze(' ')
35
+ end
36
+ end
37
+ end
38
+ end
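Reviewer sketch (not part of the package): the dispatcher above tries each registered platform in order and hands the line to the first one whose `matches?` returns true. The standalone example below shows that pattern with two deliberately tiny, hypothetical parsers. Their formats are illustrative only and much looser than the package's regexes.

```ruby
# Hypothetical parser for lines that start with a bracketed stamp.
module BracketedFormat
  PATTERN = /\A\[(?<stamp>[^\]]+)\] (?<body>.*)\z/m

  def self.matches?(line)
    PATTERN.match?(line)
  end

  def self.parse(line)
    { format: :bracketed, body: PATTERN.match(line)[:body] }
  end
end

# Hypothetical parser for lines that separate stamp and body with " - ".
module DashedFormat
  PATTERN = /\A(?<stamp>[^-]+) - (?<body>.*)\z/m

  def self.matches?(line)
    PATTERN.match?(line)
  end

  def self.parse(line)
    { format: :dashed, body: PATTERN.match(line)[:body] }
  end
end

PARSERS = [BracketedFormat, DashedFormat].freeze

# First matching parser wins; nil means no registered format recognized the line.
def dispatch(line)
  parser = PARSERS.find { |candidate| candidate.matches?(line) }
  parser&.parse(line)
end

p dispatch('[15/12/2025, 10:30:00] hello')
```

Registration order matters whenever two patterns can match the same line, so the stricter format is usually listed first.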
data/lib/whatsapp-chat-parser/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module WhatsappChatParser
4
+ VERSION = '0.1.0'
5
+ end
data/lib/whatsapp-chat-parser.rb ADDED
@@ -0,0 +1,24 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative 'whatsapp-chat-parser/platforms'
4
+ require_relative 'whatsapp-chat-parser/models/message'
5
+ require_relative 'whatsapp-chat-parser/file_processor'
6
+
7
+ # Main entry point for the WhatsApp Chat Parser library.
8
+ module WhatsappChatParser
9
+ class << self
10
+ # Parses a single message line of a WhatsApp chat export.
11
+ # @param line [String] The line to parse.
12
+ # @return [WhatsappChatParser::Models::Message, nil] The parsed message or nil if message is malformed.
13
+ def parse_line(line)
14
+ Platforms.parse(line)
15
+ end
16
+
17
+ # Parses a WhatsApp chat export .txt file path or IO.
18
+ # @param source [String, IO] The path to the file or an IO object.
19
+ # @return [Enumerator<WhatsappChatParser::Models::Message>] Enumerator of parsed messages.
20
+ def parse_file(source)
21
+ FileProcessor.parse(source)
22
+ end
23
+ end
24
+ end
metadata ADDED
@@ -0,0 +1,137 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: whatsapp-chat-parser
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Emmanuel Akachukwu
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2026-02-18 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: rspec
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '3.13'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '3.13'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rubocop
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '1.84'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '1.84'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rubocop-performance
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '1.26'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '1.26'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rubocop-rspec
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '3.9'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '3.9'
69
+ - !ruby/object:Gem::Dependency
70
+ name: simplecov
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '0.22'
76
+ type: :development
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: '0.22'
83
+ description: |
84
+ WhatsappChatParser parses exported WhatsApp chat .txt files (Android and iOS)
85
+ and converts them into structured, machine-readable message objects.
86
+ It supports both file inputs and raw message strings.
87
+ email:
88
+ - emmanuelakachukwu1@gmail.com
89
+ executables: []
90
+ extensions: []
91
+ extra_rdoc_files: []
92
+ files:
93
+ - CHANGELOG.md
94
+ - LICENSE
95
+ - README.md
96
+ - lib/whatsapp-chat-parser.rb
97
+ - lib/whatsapp-chat-parser/encoding.rb
98
+ - lib/whatsapp-chat-parser/file_processor.rb
99
+ - lib/whatsapp-chat-parser/models/message.rb
100
+ - lib/whatsapp-chat-parser/platforms.rb
101
+ - lib/whatsapp-chat-parser/platforms/android.rb
102
+ - lib/whatsapp-chat-parser/platforms/android/pattern.rb
103
+ - lib/whatsapp-chat-parser/platforms/ios.rb
104
+ - lib/whatsapp-chat-parser/platforms/ios/pattern.rb
105
+ - lib/whatsapp-chat-parser/platforms/pattern_helpers.rb
106
+ - lib/whatsapp-chat-parser/version.rb
107
+ homepage: https://github.com/emmaakachukwu/whatsapp-chat-parser-rb
108
+ licenses:
109
+ - MIT
110
+ metadata:
111
+ homepage_uri: https://github.com/emmaakachukwu/whatsapp-chat-parser-rb
112
+ bug_tracker_uri: https://github.com/emmaakachukwu/whatsapp-chat-parser-rb/issues
113
+ changelog_uri: https://github.com/emmaakachukwu/whatsapp-chat-parser-rb/blob/v0.1.0/CHANGELOG.md
114
+ documentation_uri: https://www.rubydoc.info/gems/whatsapp-chat-parser/0.1.0
115
+ source_code_uri: https://github.com/emmaakachukwu/whatsapp-chat-parser-rb/tree/v0.1.0
116
+ keywords: whatsapp chat parser whatsapp-chat-parser text export android ios
117
+ rubygems_mfa_required: 'true'
118
+ post_install_message:
119
+ rdoc_options: []
120
+ require_paths:
121
+ - lib
122
+ required_ruby_version: !ruby/object:Gem::Requirement
123
+ requirements:
124
+ - - ">="
125
+ - !ruby/object:Gem::Version
126
+ version: 3.0.0
127
+ required_rubygems_version: !ruby/object:Gem::Requirement
128
+ requirements:
129
+ - - ">="
130
+ - !ruby/object:Gem::Version
131
+ version: '0'
132
+ requirements: []
133
+ rubygems_version: 3.5.22
134
+ signing_key:
135
+ specification_version: 4
136
+ summary: A Ruby library for parsing exported WhatsApp chat .txt files or message strings.
137
+ test_files: []