mbuzz 0.6.3 → 0.6.4
- checksums.yaml +4 -4
- data/.DS_Store +0 -0
- data/CHECK_BUG.md +168 -0
- data/lib/.DS_Store +0 -0
- data/lib/mbuzz/middleware/tracking.rb +38 -11
- data/lib/mbuzz/session/id_generator.rb +33 -0
- data/lib/mbuzz/version.rb +1 -1
- data/lib/mbuzz.rb +1 -0
- data/lib/specs/v0.7.0_deterministic_sessions.md +640 -0
- data/mbuzz-0.6.3.gem +0 -0
- metadata +7 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 897fe1df10d939d598328512ab1e8723abc68f1a77be50a3db7c976078f957b8
|
|
4
|
+
data.tar.gz: 56282a6cccb6d0f033835694f9773d2fc03286db5bed97f2ab258a712da407be
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: ec1e1f1ca91cfc9e1e557fbebea612be3bb5924505c39273f5d07dac4c4f442a057969c27cfcab87a428c3b7fa91c84a4ca399c11498427d31868c38fdfa8190
|
|
7
|
+
data.tar.gz: 835fa0595ebdb2056b5c78f47f85b75650ad5cf2014a524b2cc1d743c70403c563187f5f2fced9e42ef8b22626a40a6c161de2f58933745fc39b5b247fc9d0cd
|
data/.DS_Store
ADDED
|
Binary file
|
data/CHECK_BUG.md
ADDED
|
@@ -0,0 +1,168 @@
|
|
|
1
|
+
# Thread-Safety Bug Fix Verification
|
|
2
|
+
|
|
3
|
+
## Bug Summary
|
|
4
|
+
|
|
5
|
+
**Issue**: Middleware used instance variables (`@session_id`, `@visitor_id`, `@request`) shared across concurrent requests in multi-threaded servers (Puma). This caused race conditions where session/visitor IDs leaked between requests.
|
|
6
|
+
|
|
7
|
+
**Impact**: Pet Resorts Australia had **178,428 sessions** for only **172,000 visitors**. Some visitors had 1,500+ sessions because cookies were set with wrong session_ids under concurrent load.
|
|
8
|
+
|
|
9
|
+
**Root Cause**: Rack middleware is instantiated once and shared across all requests. Instance variables are not thread-safe.
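
The root cause can be reproduced outside a real server. The sketch below is a minimal illustration, not the gem's code: `UnsafeTracking`, `SafeTracking`, `SLOW_APP`, `leaked_ids`, and the `demo.session_id` env key are all hypothetical names. It assumes only the Rack convention that one middleware instance receives `call(env)` from many threads.

```ruby
# Hypothetical middleware classes for illustration; not the gem's actual code.
class UnsafeTracking
  def initialize(app)
    @app = app
  end

  def call(env)
    @session_id = env["demo.session_id"] # one ivar shared by every thread
    status, headers, body = @app.call(env)
    # By now another thread may have overwritten @session_id.
    [status, headers.merge("x-session-id" => @session_id), body]
  end
end

class SafeTracking
  def initialize(app)
    @app = app
  end

  def call(env)
    session_id = env["demo.session_id"] # local variable: one per call
    status, headers, body = @app.call(env)
    [status, headers.merge("x-session-id" => session_id), body]
  end
end

# A deliberately slow downstream app widens the race window.
SLOW_APP = ->(_env) { sleep 0.01; [200, {}, ["ok"]] }

# Sends concurrent requests through a single middleware instance and
# returns the IDs whose response carried somebody else's session ID.
def leaked_ids(middleware_class, requests: 20)
  stack = middleware_class.new(SLOW_APP)
  threads = Array.new(requests) do |i|
    Thread.new do
      expected = "session-#{i}"
      _status, headers, _body = stack.call({ "demo.session_id" => expected })
      headers["x-session-id"] == expected ? nil : expected
    end
  end
  threads.map(&:value).compact
end
```

`leaked_ids(SafeTracking)` returns an empty array because each call keeps its ID in a local variable. `leaked_ids(UnsafeTracking)` will typically report most IDs as leaked, although the exact count depends on thread scheduling.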
|
|
10
|
+
|
|
11
|
+
## Fix Details
|
|
12
|
+
|
|
13
|
+
| Field | Value |
|
|
14
|
+
|-------|-------|
|
|
15
|
+
| **Fixed in version** | 0.6.3 |
|
|
16
|
+
| **Commit** | `bdf4c64` |
|
|
17
|
+
| **Fix deployed** | 2025-12-22 ~21:30 UTC (2025-12-23 ~08:30 AEDT) |
|
|
18
|
+
| **Gem published** | 2025-12-23 |
|
|
19
|
+
|
|
20
|
+
## Verification Checklist
|
|
21
|
+
|
|
22
|
+
### After 24-48 hours (by 2025-12-25):
|
|
23
|
+
|
|
24
|
+
Run these queries in production Rails console:
|
|
25
|
+
|
|
26
|
+
```ruby
|
|
27
|
+
# 1. Check session creation rate AFTER fix
|
|
28
|
+
# Should see dramatically fewer sessions per hour
|
|
29
|
+
cutoff = Time.parse("2025-12-22 21:30:00 UTC") # Fix deployed ~21:30 UTC (08:30 AEDT on 2025-12-23); adjust to actual deploy time
|
|
30
|
+
|
|
31
|
+
puts "Sessions BEFORE fix (last 24h before deploy):"
|
|
32
|
+
before_sessions = Session.where(created_at: (cutoff - 24.hours)..cutoff).count
|
|
33
|
+
puts " Count: #{before_sessions}"
|
|
34
|
+
|
|
35
|
+
puts "\nSessions AFTER fix (24h after deploy):"
|
|
36
|
+
after_sessions = Session.where(created_at: cutoff..(cutoff + 24.hours)).count
|
|
37
|
+
puts " Count: #{after_sessions}"
|
|
38
|
+
|
|
39
|
+
puts "\nReduction: #{((before_sessions - after_sessions).to_f / before_sessions * 100).round(1)}%"
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
```ruby
|
|
43
|
+
# 2. Check sessions per visitor ratio
|
|
44
|
+
# Should be close to 1.0-1.5 for new visitors (was 1.05 overall but outliers had 1500+)
|
|
45
|
+
cutoff = Time.parse("2025-12-22 21:30:00 UTC") # 08:30 AEDT on 2025-12-23
|
|
46
|
+
|
|
47
|
+
new_visitors = Visitor.where("created_at > ?", cutoff)
|
|
48
|
+
new_visitor_ids = new_visitors.pluck(:id)
|
|
49
|
+
|
|
50
|
+
sessions_for_new = Session.where(visitor_id: new_visitor_ids).count
|
|
51
|
+
puts "New visitors since fix: #{new_visitors.count}"
|
|
52
|
+
puts "Sessions for new visitors: #{sessions_for_new}"
|
|
53
|
+
puts "Ratio: #{(sessions_for_new.to_f / new_visitors.count).round(2)}"
|
|
54
|
+
```
|
|
55
|
+
|
|
56
|
+
```ruby
|
|
57
|
+
# 3. Check for any new outliers (visitors with 10+ sessions in 24h)
|
|
58
|
+
cutoff = Time.parse("2025-12-22 21:30:00 UTC") # 08:30 AEDT on 2025-12-23
|
|
59
|
+
|
|
60
|
+
outliers = Session.where("created_at > ?", cutoff)
|
|
61
|
+
.group(:visitor_id)
|
|
62
|
+
.having("count(*) >= 10")
|
|
63
|
+
.count
|
|
64
|
+
|
|
65
|
+
puts "Visitors with 10+ sessions since fix: #{outliers.count}"
|
|
66
|
+
outliers.sort_by { |_, v| -v }.first(5).each do |vid, count|
|
|
67
|
+
puts " Visitor #{vid}: #{count} sessions"
|
|
68
|
+
end
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
### Expected Results After Fix:
|
|
72
|
+
|
|
73
|
+
- [ ] Session creation rate drops by 90%+
|
|
74
|
+
- [ ] Sessions per new visitor ratio < 2.0
|
|
75
|
+
- [ ] No new outliers with 100+ sessions
|
|
76
|
+
- [ ] Cookie session_id matches env session_id (verified by tests)
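
The numeric thresholds in this checklist can be turned into a small pass/fail helper. This is a sketch under stated assumptions: the method names and keyword arguments are hypothetical, and the counts are expected to come from console queries like the ones above. It also guards the divisions against zero counts.

```ruby
# Hypothetical helpers; the inputs are plain integers from the console queries.

# Percentage drop in session creation between two equal-length windows.
def reduction_percent(before, after)
  return 0.0 if before.zero?

  ((before - after).to_f / before * 100).round(1)
end

# Average number of sessions per visitor.
def sessions_per_visitor(sessions, visitors)
  return 0.0 if visitors.zero?

  (sessions.to_f / visitors).round(2)
end

# Applies the three numeric thresholds from the checklist.
def fix_verified?(before:, after:, new_sessions:, new_visitors:, max_sessions_for_one_visitor:)
  reduction_percent(before, after) >= 90.0 &&
    sessions_per_visitor(new_sessions, new_visitors) < 2.0 &&
    max_sessions_for_one_visitor < 100
end
```

For reference, `sessions_per_visitor(178_428, 172_000)` evaluates to `1.04`. The account-wide ratio therefore looked healthy even while individual visitors had 1,500+ sessions, which is why the outlier check is a separate criterion.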
|
|
77
|
+
|
|
78
|
+
---
|
|
79
|
+
|
|
80
|
+
## Other SDKs to Review
|
|
81
|
+
|
|
82
|
+
**CRITICAL**: Check all other SDKs for the same thread-safety bug!
|
|
83
|
+
|
|
84
|
+
### SDK Review Checklist:
|
|
85
|
+
|
|
86
|
+
| SDK | Location | Status | Reviewed By | Date |
|
|
87
|
+
|-----|----------|--------|-------------|------|
|
|
88
|
+
| mbuzz-ruby | `/Users/vlad/code/m/mbuzz-ruby` | FIXED | Claude | 2025-12-22 |
|
|
89
|
+
| mbuzz-python | `/Users/vlad/code/m/mbuzz-python` | SAFE | Claude | 2025-12-22 |
|
|
90
|
+
| mbuzz-php | `/Users/vlad/code/m/mbuzz-php` | SAFE | Claude | 2025-12-22 |
|
|
91
|
+
| mbuzz-node | `/Users/vlad/code/m/mbuzz-node` | SAFE | Claude | 2025-12-22 |
|
|
92
|
+
|
|
93
|
+
### Review Results:
|
|
94
|
+
|
|
95
|
+
**mbuzz-python**: SAFE
|
|
96
|
+
- Uses `contextvars.ContextVar` for thread-safe context storage
|
|
97
|
+
- Uses Flask's `g` object for request-scoped storage
|
|
98
|
+
- Local variables used throughout middleware
|
|
99
|
+
- Async session creation captures values in local variables before spawning thread
|
|
100
|
+
|
|
101
|
+
**mbuzz-php**: SAFE
|
|
102
|
+
- PHP is single-process per request by default
|
|
103
|
+
- No shared state between requests
|
|
104
|
+
- Each request gets fresh instance of everything
|
|
105
|
+
|
|
106
|
+
**mbuzz-node**: SAFE
|
|
107
|
+
- Uses `AsyncLocalStorage` from `node:async_hooks` for async request isolation
|
|
108
|
+
- Express middleware uses local variables (`visitor`, `session`, `secure`)
|
|
109
|
+
- Attaches data to request-scoped `req.mbuzz` object
|
|
110
|
+
- `createSessionAsync` captures values as function parameters before `setImmediate`
|
|
111
|
+
- Node.js is single-threaded, so race conditions are inherently less likely
|
|
112
|
+
|
|
113
|
+
### What to Look For:
|
|
114
|
+
|
|
115
|
+
1. **Middleware/Handler using instance variables or class variables for request-specific data**
|
|
116
|
+
- BAD: `self.session_id = ...` or `@session_id = ...`
|
|
117
|
+
- GOOD: Local variables passed through function calls
|
|
118
|
+
|
|
119
|
+
2. **Mutable shared state**
|
|
120
|
+
- BAD: Global or class-level dicts/hashes storing request data
|
|
121
|
+
- GOOD: Request-scoped context objects or local variables
|
|
122
|
+
|
|
123
|
+
3. **Thread-local storage without proper cleanup**
|
|
124
|
+
- Check that thread-local data is cleared after each request
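
For item 3, one way to guarantee cleanup is to set the value in a block-scoped helper and restore it in an `ensure` clause. The sketch below is illustrative only: `RequestScope` and its key are hypothetical names. Note that `Thread.current[]` is fiber-local in Ruby, which is usually what request-scoped data needs.

```ruby
# Hypothetical request-scoped storage with guaranteed cleanup.
module RequestScope
  KEY = :demo_request_context # hypothetical key name

  # Stores the context for the duration of the block, then restores the
  # previous value even if the block raises.
  def self.with(context)
    previous = Thread.current[KEY]
    Thread.current[KEY] = context
    yield
  ensure
    Thread.current[KEY] = previous
  end

  # Returns the context for the current request, or nil outside a request.
  def self.current
    Thread.current[KEY]
  end
end
```

A middleware would wrap its downstream call in `RequestScope.with(context) { @app.call(env) }`, so that a pooled thread never carries one request's data into the next.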
|
|
125
|
+
|
|
126
|
+
### Python-specific concerns:
|
|
127
|
+
- Check for module-level variables
|
|
128
|
+
- Check Flask/Django middleware for shared state
|
|
129
|
+
- WSGI apps can have similar issues with global state
|
|
130
|
+
|
|
131
|
+
### PHP-specific concerns:
|
|
132
|
+
- PHP is typically single-threaded per request, so likely SAFE
|
|
133
|
+
- But check for any persistent worker modes (Swoole, RoadRunner, FrankenPHP)
|
|
134
|
+
|
|
135
|
+
### Node.js-specific concerns:
|
|
136
|
+
- Node is single-threaded, so likely SAFE
|
|
137
|
+
- But check for any shared state in closures or module scope
|
|
138
|
+
|
|
139
|
+
---
|
|
140
|
+
|
|
141
|
+
## Data Cleanup (Optional)
|
|
142
|
+
|
|
143
|
+
After verifying the fix works, consider cleaning up the bad data:
|
|
144
|
+
|
|
145
|
+
```ruby
|
|
146
|
+
# Find sessions with no events (likely created by the bug)
|
|
147
|
+
# BE CAREFUL - only run after thorough analysis
|
|
148
|
+
|
|
149
|
+
# Count empty sessions by account
|
|
150
|
+
Account.find_each do |account|
|
|
151
|
+
session_ids_with_events = account.events.distinct.pluck(:session_id)
|
|
152
|
+
empty_sessions = account.sessions.where.not(session_id: session_ids_with_events).count
|
|
153
|
+
total_sessions = account.sessions.count
|
|
154
|
+
|
|
155
|
+
next if empty_sessions == 0
|
|
156
|
+
|
|
157
|
+
puts "#{account.name}: #{empty_sessions}/#{total_sessions} empty sessions (#{(empty_sessions.to_f/total_sessions*100).round(1)}%)"
|
|
158
|
+
end
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
---
|
|
162
|
+
|
|
163
|
+
## Notes
|
|
164
|
+
|
|
165
|
+
- Bug was discovered via dashboard metrics investigation (avg visits showing 28.5 with 0.6 avg days)
|
|
166
|
+
- Traced to Pet Resorts Australia account (PetPro360)
|
|
167
|
+
- Logs showed session creation every few seconds with different session_ids
|
|
168
|
+
- Test added: `test_race_condition_with_slow_app` - 49/50 failures before fix, 0 after
|
data/lib/.DS_Store
ADDED
|
Binary file
|
data/lib/mbuzz/middleware/tracking.rb
CHANGED
|
@@ -49,19 +49,22 @@ module Mbuzz
|
|
|
49
49
|
# Build all request-specific context as a frozen hash
|
|
50
50
|
# This ensures thread-safety by using local variables only
|
|
51
51
|
def build_request_context(request)
|
|
52
|
-
visitor_id = visitor_id_from_cookie(request) || Visitor::Identifier.generate
|
|
53
|
-
session_id = session_id_from_cookie(request) || generate_session_id
|
|
54
|
-
user_id = user_id_from_session(request)
|
|
55
|
-
new_session = session_id_from_cookie(request).nil?
|
|
56
|
-
|
|
57
52
|
{
|
|
58
|
-
visitor_id:
|
|
59
|
-
session_id:
|
|
60
|
-
user_id:
|
|
61
|
-
new_session: new_session
|
|
53
|
+
visitor_id: resolve_visitor_id(request),
|
|
54
|
+
session_id: resolve_session_id(request),
|
|
55
|
+
user_id: user_id_from_session(request),
|
|
56
|
+
new_session: new_session?(request)
|
|
62
57
|
}.freeze
|
|
63
58
|
end
|
|
64
59
|
|
|
60
|
+
def resolve_visitor_id(request)
|
|
61
|
+
visitor_id_from_cookie(request) || Visitor::Identifier.generate
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
def resolve_session_id(request)
|
|
65
|
+
session_id_from_cookie(request) || generate_session_id(request)
|
|
66
|
+
end
|
|
67
|
+
|
|
65
68
|
def visitor_id_from_cookie(request)
|
|
66
69
|
request.cookies[VISITOR_COOKIE_NAME]
|
|
67
70
|
end
|
|
@@ -74,8 +77,32 @@ module Mbuzz
|
|
|
74
77
|
request.session[SESSION_USER_ID_KEY] if request.session
|
|
75
78
|
end
|
|
76
79
|
|
|
77
|
-
def
|
|
78
|
-
|
|
80
|
+
def new_session?(request)
|
|
81
|
+
session_id_from_cookie(request).nil?
|
|
82
|
+
end
|
|
83
|
+
|
|
84
|
+
def generate_session_id(request)
|
|
85
|
+
existing_visitor_id = visitor_id_from_cookie(request)
|
|
86
|
+
|
|
87
|
+
if existing_visitor_id
|
|
88
|
+
Session::IdGenerator.generate_deterministic(visitor_id: existing_visitor_id)
|
|
89
|
+
else
|
|
90
|
+
Session::IdGenerator.generate_from_fingerprint(
|
|
91
|
+
client_ip: client_ip(request),
|
|
92
|
+
user_agent: user_agent(request)
|
|
93
|
+
)
|
|
94
|
+
end
|
|
95
|
+
end
|
|
96
|
+
|
|
97
|
+
def client_ip(request)
|
|
98
|
+
request.env["HTTP_X_FORWARDED_FOR"]&.split(",")&.first&.strip ||
|
|
99
|
+
request.env["HTTP_X_REAL_IP"] ||
|
|
100
|
+
request.ip ||
|
|
101
|
+
"unknown"
|
|
102
|
+
end
|
|
103
|
+
|
|
104
|
+
def user_agent(request)
|
|
105
|
+
request.user_agent || "unknown"
|
|
79
106
|
end
|
|
80
107
|
|
|
81
108
|
# Session creation
|
data/lib/mbuzz/session/id_generator.rb
ADDED
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "digest"
|
|
4
|
+
require "securerandom"
|
|
5
|
+
|
|
6
|
+
module Mbuzz
|
|
7
|
+
module Session
|
|
8
|
+
class IdGenerator
|
|
9
|
+
SESSION_TIMEOUT_SECONDS = 1800
|
|
10
|
+
SESSION_ID_LENGTH = 64
|
|
11
|
+
FINGERPRINT_LENGTH = 32
|
|
12
|
+
|
|
13
|
+
class << self
|
|
14
|
+
def generate_deterministic(visitor_id:, timestamp: Time.now.to_i)
|
|
15
|
+
time_bucket = timestamp / SESSION_TIMEOUT_SECONDS
|
|
16
|
+
raw = "#{visitor_id}_#{time_bucket}"
|
|
17
|
+
Digest::SHA256.hexdigest(raw)[0, SESSION_ID_LENGTH]
|
|
18
|
+
end
|
|
19
|
+
|
|
20
|
+
def generate_from_fingerprint(client_ip:, user_agent:, timestamp: Time.now.to_i)
|
|
21
|
+
fingerprint = Digest::SHA256.hexdigest("#{client_ip}|#{user_agent}")[0, FINGERPRINT_LENGTH]
|
|
22
|
+
time_bucket = timestamp / SESSION_TIMEOUT_SECONDS
|
|
23
|
+
raw = "#{fingerprint}_#{time_bucket}"
|
|
24
|
+
Digest::SHA256.hexdigest(raw)[0, SESSION_ID_LENGTH]
|
|
25
|
+
end
|
|
26
|
+
|
|
27
|
+
def generate_random
|
|
28
|
+
SecureRandom.hex(32)
|
|
29
|
+
end
|
|
30
|
+
end
|
|
31
|
+
end
|
|
32
|
+
end
|
|
33
|
+
end
|
data/lib/mbuzz/version.rb
CHANGED
data/lib/mbuzz.rb
CHANGED
|
@@ -3,6 +3,7 @@
|
|
|
3
3
|
require_relative "mbuzz/version"
|
|
4
4
|
require_relative "mbuzz/configuration"
|
|
5
5
|
require_relative "mbuzz/visitor/identifier"
|
|
6
|
+
require_relative "mbuzz/session/id_generator"
|
|
6
7
|
require_relative "mbuzz/request_context"
|
|
7
8
|
require_relative "mbuzz/api"
|
|
8
9
|
require_relative "mbuzz/client"
|
data/lib/specs/v0.7.0_deterministic_sessions.md
ADDED
|
@@ -0,0 +1,640 @@
|
|
|
1
|
+
# mbuzz SDK v0.7.0 - Deterministic Session IDs
|
|
2
|
+
|
|
3
|
+
**Status**: Proposed
|
|
4
|
+
**Last Updated**: 2025-12-29
|
|
5
|
+
**Breaking Change**: No (backward compatible)
|
|
6
|
+
**Affects**: All SDKs (Ruby, Python, PHP, Node)
|
|
7
|
+
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
## Problem Statement
|
|
11
|
+
|
|
12
|
+
### The Race Condition
|
|
13
|
+
|
|
14
|
+
When a page loads, multiple concurrent HTTP requests can hit the server before the first response sets cookies. Each request generates a new random session ID, creating duplicate sessions:
|
|
15
|
+
|
|
16
|
+
```
|
|
17
|
+
User clicks link → Page starts loading
|
|
18
|
+
|
|
19
|
+
Request 1 (HTML) ──────────────────▶ No cookies
|
|
20
|
+
Server: session_id = random_1
|
|
21
|
+
|
|
22
|
+
Request 2 (Turbo/fetch) ───────────▶ No cookies (response 1 not back yet)
|
|
23
|
+
Server: session_id = random_2
|
|
24
|
+
|
|
25
|
+
Request 3 (XHR) ───────────────────▶ No cookies
|
|
26
|
+
Server: session_id = random_3
|
|
27
|
+
|
|
28
|
+
Result: 3 different sessions created for the same page load!
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
### Production Evidence
|
|
32
|
+
|
|
33
|
+
From Pet Resorts Australia (visitor #139):
|
|
34
|
+
- 65 timestamps with multiple sessions created at the exact same second
|
|
35
|
+
- 5 sessions created within 35ms, all with different session_ids
|
|
36
|
+
- Same visitor_id preserved (2-year cookie works)
|
|
37
|
+
- 94.8% of sessions have 0 page views (phantom sessions)
|
|
38
|
+
|
|
39
|
+
### Impact on Attribution
|
|
40
|
+
|
|
41
|
+
1. **Inflated "Direct" channel**: Internal navigations create new sessions classified as "direct"
|
|
42
|
+
2. **Broken Last Touch**: Real acquisition channels get overwritten by phantom "direct" sessions
|
|
43
|
+
3. **Skewed Metrics**: Average visits per conversion is 35.1 instead of ~1-2
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
## Solution: Deterministic Session ID Generation
|
|
48
|
+
|
|
49
|
+
### Core Concept
|
|
50
|
+
|
|
51
|
+
Instead of generating random session IDs, generate **deterministic** IDs based on:
|
|
52
|
+
1. Visitor ID (from cookie - has 2-year expiry)
|
|
53
|
+
2. Time bucket (30-minute windows matching session timeout)
|
|
54
|
+
3. Request fingerprint (IP + User-Agent for new visitors)
|
|
55
|
+
|
|
56
|
+
All concurrent requests will generate the **same session ID**.
|
|
57
|
+
|
|
58
|
+
### Algorithm
|
|
59
|
+
|
|
60
|
+
```
|
|
61
|
+
IF visitor_id cookie exists:
|
|
62
|
+
session_id = SHA256(visitor_id + "_" + time_bucket)[0:64]
|
|
63
|
+
ELSE:
|
|
64
|
+
fingerprint = SHA256(client_ip + "|" + user_agent)[0:32]
|
|
65
|
+
session_id = SHA256(fingerprint + "_" + time_bucket)[0:64]
|
|
66
|
+
|
|
67
|
+
WHERE:
|
|
68
|
+
time_bucket = floor(unix_timestamp / SESSION_TIMEOUT_SECONDS)
|
|
69
|
+
SESSION_TIMEOUT_SECONDS = 1800 (30 minutes)
|
|
70
|
+
```
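
The time-bucket arithmetic can be checked with a few lines of Ruby. This is a minimal sketch with hypothetical helper names (`time_bucket`, `deterministic_session_id`). It assumes the `_` separator used by the detailed specification, and the timestamps are chosen to sit on either side of a bucket boundary.

```ruby
require "digest"

SESSION_TIMEOUT_SECONDS = 1800

# Integer division floors for non-negative timestamps.
def time_bucket(timestamp)
  timestamp / SESSION_TIMEOUT_SECONDS
end

# Hypothetical standalone version of the returning-visitor branch.
def deterministic_session_id(visitor_id, timestamp)
  Digest::SHA256.hexdigest("#{visitor_id}_#{time_bucket(timestamp)}")
end

same_bucket_a = deterministic_session_id("visitor_abc", 1_735_500_000) # bucket 964166
same_bucket_b = deterministic_session_id("visitor_abc", 1_735_500_599) # bucket 964166
next_bucket   = deterministic_session_id("visitor_abc", 1_735_500_600) # bucket 964167
```

`same_bucket_a` and `same_bucket_b` are identical, while `next_bucket` differs. Buckets are fixed windows rather than sliding ones, so two requests one second apart can still land in different buckets if they straddle a boundary. A SHA-256 hex digest is already 64 characters long, so the `[0:64]` truncation leaves it unchanged.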
|
|
71
|
+
|
|
72
|
+
### Why This Works
|
|
73
|
+
|
|
74
|
+
| Scenario | Visitor Cookie | Result |
|
|
75
|
+
|----------|----------------|--------|
|
|
76
|
+
| Returning visitor, expired session | ✅ Exists | All concurrent requests get same session_id |
|
|
77
|
+
| New visitor, first page load | ❌ Missing | All concurrent requests from same IP+UA get same session_id |
|
|
78
|
+
| Active session | ✅ Exists + Session cookie | Cookie used directly (no generation needed) |
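
The three scenarios in the table map onto one resolution function. The sketch below is an illustration rather than any SDK's implementation: `resolve_session_id`, the `cookies` hash, and its `:visitor_id` and `:session_id` keys are hypothetical.

```ruby
require "digest"

BUCKET_SECONDS = 1800

def sha(raw)
  Digest::SHA256.hexdigest(raw)
end

# cookies is a plain Hash with optional :visitor_id and :session_id keys
# (hypothetical names); now is a unix timestamp in seconds.
def resolve_session_id(cookies:, client_ip:, user_agent:, now:)
  # Active session: the cookie value is used directly.
  return cookies[:session_id] if cookies[:session_id]

  bucket = now / BUCKET_SECONDS
  if cookies[:visitor_id]
    # Returning visitor with an expired session.
    sha("#{cookies[:visitor_id]}_#{bucket}")
  else
    # New visitor: fall back to the IP + User-Agent fingerprint.
    fingerprint = sha("#{client_ip}|#{user_agent}")[0, 32]
    sha("#{fingerprint}_#{bucket}")
  end
end
```

Calling it several times with the same arguments returns the same ID, which is what lets concurrent requests agree without coordination. One consequence of the second row is that two distinct new visitors behind the same IP with identical User-Agent strings would also share an ID until their visitor cookies are set.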
|
|
79
|
+
|
|
80
|
+
### Security Considerations
|
|
81
|
+
|
|
82
|
+
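1. **Session IDs do not expose their inputs**: SHA256 is one-way, so the visitor ID or fingerprint cannot be recovered from a session ID. The IDs are deterministic, though, so anyone who knows a visitor ID can recompute them.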
|
|
83
|
+
2. **No PII in session ID**: Only hashed values stored
|
|
84
|
+
3. **Time bucket prevents replay**: Old session IDs naturally expire
|
|
85
|
+
4. **IP+UA fingerprint is ephemeral**: Not stored, only used for generation
|
|
86
|
+
|
|
87
|
+
---
|
|
88
|
+
|
|
89
|
+
## Implementation Specification
|
|
90
|
+
|
|
91
|
+
### Constants
|
|
92
|
+
|
|
93
|
+
```
|
|
94
|
+
SESSION_TIMEOUT_SECONDS = 1800 # 30 minutes
|
|
95
|
+
SESSION_ID_LENGTH = 64 # Characters (256 bits as hex)
|
|
96
|
+
FINGERPRINT_LENGTH = 32 # Characters for IP+UA hash
|
|
97
|
+
HASH_ALGORITHM = "SHA256"
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
### New Module: `SessionIdGenerator`
|
|
101
|
+
|
|
102
|
+
Each SDK must implement a `SessionIdGenerator` with these methods:
|
|
103
|
+
|
|
104
|
+
#### `generate_deterministic(visitor_id, timestamp)`
|
|
105
|
+
|
|
106
|
+
Generate session ID for **returning visitors** (have visitor cookie).
|
|
107
|
+
|
|
108
|
+
```
|
|
109
|
+
Input:
|
|
110
|
+
- visitor_id: string (64 hex chars)
|
|
111
|
+
- timestamp: unix timestamp (seconds)
|
|
112
|
+
|
|
113
|
+
Output:
|
|
114
|
+
- session_id: string (64 hex chars)
|
|
115
|
+
|
|
116
|
+
Algorithm:
|
|
117
|
+
time_bucket = floor(timestamp / SESSION_TIMEOUT_SECONDS)
|
|
118
|
+
raw = visitor_id + "_" + string(time_bucket)
|
|
119
|
+
session_id = SHA256(raw).hexdigest()[0:64]
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
#### `generate_from_fingerprint(client_ip, user_agent, timestamp)`
|
|
123
|
+
|
|
124
|
+
Generate session ID for **new visitors** (no visitor cookie).
|
|
125
|
+
|
|
126
|
+
```
|
|
127
|
+
Input:
|
|
128
|
+
- client_ip: string (e.g., "192.168.1.1" or "2001:db8::1")
|
|
129
|
+
- user_agent: string (browser user agent)
|
|
130
|
+
- timestamp: unix timestamp (seconds)
|
|
131
|
+
|
|
132
|
+
Output:
|
|
133
|
+
- session_id: string (64 hex chars)
|
|
134
|
+
|
|
135
|
+
Algorithm:
|
|
136
|
+
fingerprint = SHA256(client_ip + "|" + user_agent).hexdigest()[0:32]
|
|
137
|
+
time_bucket = floor(timestamp / SESSION_TIMEOUT_SECONDS)
|
|
138
|
+
raw = fingerprint + "_" + string(time_bucket)
|
|
139
|
+
session_id = SHA256(raw).hexdigest()[0:64]
|
|
140
|
+
```
|
|
141
|
+
|
|
142
|
+
#### `generate_random()`
|
|
143
|
+
|
|
144
|
+
Fallback for edge cases (no IP available, etc.).
|
|
145
|
+
|
|
146
|
+
```
|
|
147
|
+
Output:
|
|
148
|
+
- session_id: string (64 hex chars, cryptographically random)
|
|
149
|
+
```
|
|
150
|
+
|
|
151
|
+
---
|
|
152
|
+
|
|
153
|
+
## SDK Implementation Details
|
|
154
|
+
|
|
155
|
+
### Ruby (mbuzz-ruby)
|
|
156
|
+
|
|
157
|
+
**File**: `lib/mbuzz/session/id_generator.rb`
|
|
158
|
+
|
|
159
|
+
```ruby
|
|
160
|
+
# frozen_string_literal: true
|
|
161
|
+
|
|
162
|
+
require "digest"
|
|
163
|
+
require "securerandom"
|
|
164
|
+
|
|
165
|
+
module Mbuzz
|
|
166
|
+
module Session
|
|
167
|
+
class IdGenerator
|
|
168
|
+
SESSION_TIMEOUT_SECONDS = 1800
|
|
169
|
+
SESSION_ID_LENGTH = 64
|
|
170
|
+
FINGERPRINT_LENGTH = 32
|
|
171
|
+
|
|
172
|
+
class << self
|
|
173
|
+
def generate_deterministic(visitor_id:, timestamp: Time.now.to_i)
|
|
174
|
+
time_bucket = timestamp / SESSION_TIMEOUT_SECONDS
|
|
175
|
+
raw = "#{visitor_id}_#{time_bucket}"
|
|
176
|
+
Digest::SHA256.hexdigest(raw)[0, SESSION_ID_LENGTH]
|
|
177
|
+
end
|
|
178
|
+
|
|
179
|
+
def generate_from_fingerprint(client_ip:, user_agent:, timestamp: Time.now.to_i)
|
|
180
|
+
fingerprint = Digest::SHA256.hexdigest("#{client_ip}|#{user_agent}")[0, FINGERPRINT_LENGTH]
|
|
181
|
+
time_bucket = timestamp / SESSION_TIMEOUT_SECONDS
|
|
182
|
+
raw = "#{fingerprint}_#{time_bucket}"
|
|
183
|
+
Digest::SHA256.hexdigest(raw)[0, SESSION_ID_LENGTH]
|
|
184
|
+
end
|
|
185
|
+
|
|
186
|
+
def generate_random
|
|
187
|
+
SecureRandom.hex(32)
|
|
188
|
+
end
|
|
189
|
+
end
|
|
190
|
+
end
|
|
191
|
+
end
|
|
192
|
+
end
|
|
193
|
+
```
|
|
194
|
+
|
|
195
|
+
**File**: `lib/mbuzz/middleware/tracking.rb` (updated)
|
|
196
|
+
|
|
197
|
+
```ruby
|
|
198
|
+
def build_request_context(request)
|
|
199
|
+
visitor_id = visitor_id_from_cookie(request)
|
|
200
|
+
is_new_visitor = visitor_id.nil?
|
|
201
|
+
visitor_id ||= Visitor::Identifier.generate
|
|
202
|
+
|
|
203
|
+
session_id = session_id_from_cookie(request) || generate_session_id(
|
|
204
|
+
visitor_id: is_new_visitor ? nil : visitor_id,
|
|
205
|
+
request: request
|
|
206
|
+
)
|
|
207
|
+
|
|
208
|
+
# ... rest unchanged
|
|
209
|
+
end
|
|
210
|
+
|
|
211
|
+
def generate_session_id(visitor_id:, request:)
|
|
212
|
+
if visitor_id
|
|
213
|
+
Session::IdGenerator.generate_deterministic(visitor_id: visitor_id)
|
|
214
|
+
else
|
|
215
|
+
Session::IdGenerator.generate_from_fingerprint(
|
|
216
|
+
client_ip: client_ip(request),
|
|
217
|
+
user_agent: user_agent(request)
|
|
218
|
+
)
|
|
219
|
+
end
|
|
220
|
+
end
|
|
221
|
+
|
|
222
|
+
def client_ip(request)
|
|
223
|
+
request.env["HTTP_X_FORWARDED_FOR"]&.split(",")&.first&.strip ||
|
|
224
|
+
request.env["HTTP_X_REAL_IP"] ||
|
|
225
|
+
request.ip ||
|
|
226
|
+
"unknown"
|
|
227
|
+
end
|
|
228
|
+
|
|
229
|
+
def user_agent(request)
|
|
230
|
+
request.user_agent || "unknown"
|
|
231
|
+
end
|
|
232
|
+
```
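
The header precedence in `client_ip` can be exercised without Rack by passing a plain Hash in place of `request.env`. The helper below is a hypothetical standalone variant, not the gem's method. Unlike the original, it also skips blank header values, which is an assumption about the desired behavior.

```ruby
# Hypothetical standalone variant of client_ip that takes a plain env Hash.
# fallback_ip stands in for request.ip.
def client_ip_from(env, fallback_ip = nil)
  # X-Forwarded-For may hold a comma-separated chain; the first entry is
  # the address reported for the original client.
  first_hop = env["HTTP_X_FORWARDED_FOR"]&.split(",")&.first&.strip
  candidates = [first_hop, env["HTTP_X_REAL_IP"], fallback_ip]
  candidates.find { |value| value && !value.empty? } || "unknown"
end

client_ip_from({ "HTTP_X_FORWARDED_FOR" => "203.0.113.7, 10.0.0.1" }) # => "203.0.113.7"
client_ip_from({}, "192.0.2.1")                                       # => "192.0.2.1"
```

Clients can set `X-Forwarded-For` themselves, so the first entry is only as trustworthy as the proxy chain in front of the application. For fingerprinting new visitors that mainly affects how sessions are grouped, but it is worth keeping in mind.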
|
|
233
|
+
|
|
234
|
+
---
|
|
235
|
+
|
|
236
|
+
### Node.js (mbuzz-node)
|
|
237
|
+
|
|
238
|
+
**File**: `src/utils/sessionId.ts`
|
|
239
|
+
|
|
240
|
+
```typescript
|
|
241
|
+
import { createHash, randomBytes } from 'node:crypto';
|
|
242
|
+
|
|
243
|
+
const SESSION_TIMEOUT_SECONDS = 1800;
|
|
244
|
+
const SESSION_ID_LENGTH = 64;
|
|
245
|
+
const FINGERPRINT_LENGTH = 32;
|
|
246
|
+
|
|
247
|
+
export function generateDeterministic(
|
|
248
|
+
visitorId: string,
|
|
249
|
+
timestamp: number = Math.floor(Date.now() / 1000)
|
|
250
|
+
): string {
|
|
251
|
+
const timeBucket = Math.floor(timestamp / SESSION_TIMEOUT_SECONDS);
|
|
252
|
+
const raw = `${visitorId}_${timeBucket}`;
|
|
253
|
+
return createHash('sha256').update(raw).digest('hex').slice(0, SESSION_ID_LENGTH);
|
|
254
|
+
}
|
|
255
|
+
|
|
256
|
+
export function generateFromFingerprint(
|
|
257
|
+
clientIp: string,
|
|
258
|
+
userAgent: string,
|
|
259
|
+
timestamp: number = Math.floor(Date.now() / 1000)
|
|
260
|
+
): string {
|
|
261
|
+
const fingerprint = createHash('sha256')
|
|
262
|
+
.update(`${clientIp}|${userAgent}`)
|
|
263
|
+
.digest('hex')
|
|
264
|
+
.slice(0, FINGERPRINT_LENGTH);
|
|
265
|
+
const timeBucket = Math.floor(timestamp / SESSION_TIMEOUT_SECONDS);
|
|
266
|
+
const raw = `${fingerprint}_${timeBucket}`;
|
|
267
|
+
return createHash('sha256').update(raw).digest('hex').slice(0, SESSION_ID_LENGTH);
|
|
268
|
+
}
|
|
269
|
+
|
|
270
|
+
export function generateRandom(): string {
|
|
271
|
+
return randomBytes(32).toString('hex');
|
|
272
|
+
}
|
|
273
|
+
```
|
|
274
|
+
|
|
275
|
+
**File**: `src/middleware/express.ts` (updated)
|
|
276
|
+
|
|
277
|
+
```typescript
|
|
278
|
+
import { generateDeterministic, generateFromFingerprint, generateRandom } from '../utils/sessionId';
|
|
279
|
+
|
|
280
|
+
const getClientIp = (req: Request): string => {
|
|
281
|
+
const forwarded = req.headers['x-forwarded-for'];
|
|
282
|
+
if (typeof forwarded === 'string') {
|
|
283
|
+
return forwarded.split(',')[0].trim();
|
|
284
|
+
}
|
|
285
|
+
return req.ip || req.socket.remoteAddress || 'unknown';
|
|
286
|
+
};
|
|
287
|
+
|
|
288
|
+
const getUserAgent = (req: Request): string => {
|
|
289
|
+
return req.headers['user-agent'] || 'unknown';
|
|
290
|
+
};
|
|
291
|
+
|
|
292
|
+
const getSessionId = (req: Request, visitorId: string | null): { id: string; isNew: boolean } => {
|
|
293
|
+
const existing = req.cookies?.[SESSION_COOKIE];
|
|
294
|
+
if (existing) {
|
|
295
|
+
return { id: existing, isNew: false };
|
|
296
|
+
}
|
|
297
|
+
|
|
298
|
+
const id = visitorId
|
|
299
|
+
? generateDeterministic(visitorId)
|
|
300
|
+
: generateFromFingerprint(getClientIp(req), getUserAgent(req));
|
|
301
|
+
|
|
302
|
+
return { id, isNew: true };
|
|
303
|
+
};
|
|
304
|
+
```
|
|
305
|
+
|
|
306
|
+
---
|
|
307
|
+
|
|
308
|
+
### Python (mbuzz-python)
|
|
309
|
+
|
|
310
|
+
**File**: `src/mbuzz/utils/session_id.py`
|
|
311
|
+
|
|
312
|
+
```python
|
|
313
|
+
"""Deterministic session ID generation."""
|
|
314
|
+
|
|
315
|
+
import hashlib
|
|
316
|
+
import secrets
|
|
317
|
+
import time
|
|
318
|
+
|
|
319
|
+
SESSION_TIMEOUT_SECONDS = 1800
|
|
320
|
+
SESSION_ID_LENGTH = 64
|
|
321
|
+
FINGERPRINT_LENGTH = 32
|
|
322
|
+
|
|
323
|
+
|
|
324
|
+
def generate_deterministic(visitor_id: str, timestamp: int | None = None) -> str:
|
|
325
|
+
"""Generate session ID for returning visitors."""
|
|
326
|
+
if timestamp is None:
|
|
327
|
+
timestamp = int(time.time())
|
|
328
|
+
time_bucket = timestamp // SESSION_TIMEOUT_SECONDS
|
|
329
|
+
raw = f"{visitor_id}_{time_bucket}"
|
|
330
|
+
return hashlib.sha256(raw.encode()).hexdigest()[:SESSION_ID_LENGTH]
|
|
331
|
+
|
|
332
|
+
|
|
333
|
+
def generate_from_fingerprint(
|
|
334
|
+
client_ip: str,
|
|
335
|
+
user_agent: str,
|
|
336
|
+
timestamp: int | None = None
|
|
337
|
+
) -> str:
|
|
338
|
+
"""Generate session ID for new visitors using IP+UA fingerprint."""
|
|
339
|
+
if timestamp is None:
|
|
340
|
+
timestamp = int(time.time())
|
|
341
|
+
fingerprint = hashlib.sha256(
|
|
342
|
+
f"{client_ip}|{user_agent}".encode()
|
|
343
|
+
).hexdigest()[:FINGERPRINT_LENGTH]
|
|
344
|
+
time_bucket = timestamp // SESSION_TIMEOUT_SECONDS
|
|
345
|
+
raw = f"{fingerprint}_{time_bucket}"
|
|
346
|
+
return hashlib.sha256(raw.encode()).hexdigest()[:SESSION_ID_LENGTH]
|
|
347
|
+
|
|
348
|
+
|
|
349
|
+
def generate_random() -> str:
|
|
350
|
+
"""Generate random session ID (fallback)."""
|
|
351
|
+
return secrets.token_hex(32)
|
|
352
|
+
```
|
|
353
|
+
|
|
354
|
+
**File**: `src/mbuzz/middleware/flask.py` (updated)
|
|
355
|
+
|
|
356
|
+
```python
|
|
357
|
+
from flask import request
|
|
358
|
+
from ..utils.session_id import generate_deterministic, generate_from_fingerprint
|
|
359
|
+
|
|
360
|
+
|
|
361
|
+
def _get_client_ip() -> str:
|
|
362
|
+
"""Get client IP from request headers."""
|
|
363
|
+
forwarded = request.headers.get('X-Forwarded-For', '')
|
|
364
|
+
if forwarded:
|
|
365
|
+
return forwarded.split(',')[0].strip()
|
|
366
|
+
return request.remote_addr or 'unknown'
|
|
367
|
+
|
|
368
|
+
|
|
369
|
+
def _get_user_agent() -> str:
|
|
370
|
+
"""Get user agent from request."""
|
|
371
|
+
return request.headers.get('User-Agent', 'unknown')
|
|
372
|
+
|
|
373
|
+
|
|
374
|
+
def _get_or_create_session_id(visitor_id: str | None) -> str:
|
|
375
|
+
"""Get session ID from cookie or generate deterministic one."""
|
|
376
|
+
existing = request.cookies.get(SESSION_COOKIE)
|
|
377
|
+
if existing:
|
|
378
|
+
return existing
|
|
379
|
+
|
|
380
|
+
if visitor_id:
|
|
381
|
+
return generate_deterministic(visitor_id)
|
|
382
|
+
else:
|
|
383
|
+
return generate_from_fingerprint(_get_client_ip(), _get_user_agent())
|
|
384
|
+
```
|
|
385
|
+
|
|
386
|
+
---
|
|
387
|
+
|
|
388
|
+
### PHP (mbuzz-php)
|
|
389
|
+
|
|
390
|
+
**File**: `src/Mbuzz/SessionIdGenerator.php`
|
|
391
|
+
|
|
392
|
+
```php
|
|
393
|
+
<?php
|
|
394
|
+
|
|
395
|
+
declare(strict_types=1);
|
|
396
|
+
|
|
397
|
+
namespace Mbuzz;
|
|
398
|
+
|
|
399
|
+
final class SessionIdGenerator
|
|
400
|
+
{
|
|
401
|
+
private const SESSION_TIMEOUT_SECONDS = 1800;
|
|
402
|
+
private const SESSION_ID_LENGTH = 64;
|
|
403
|
+
private const FINGERPRINT_LENGTH = 32;
|
|
404
|
+
|
|
405
|
+
/**
|
|
406
|
+
* Generate session ID for returning visitors (have visitor cookie).
|
|
407
|
+
*/
|
|
408
|
+
public static function generateDeterministic(
|
|
409
|
+
string $visitorId,
|
|
410
|
+
?int $timestamp = null
|
|
411
|
+
): string {
|
|
412
|
+
$timestamp = $timestamp ?? time();
|
|
413
|
+
$timeBucket = intdiv($timestamp, self::SESSION_TIMEOUT_SECONDS);
|
|
414
|
+
$raw = "{$visitorId}_{$timeBucket}";
|
|
415
|
+
return substr(hash('sha256', $raw), 0, self::SESSION_ID_LENGTH);
|
|
416
|
+
}
|
|
417
|
+
|
|
418
|
+
/**
|
|
419
|
+
* Generate session ID for new visitors using IP+UA fingerprint.
|
|
420
|
+
*/
|
|
421
|
+
public static function generateFromFingerprint(
|
|
422
|
+
string $clientIp,
|
|
423
|
+
string $userAgent,
|
|
424
|
+
?int $timestamp = null
|
|
425
|
+
): string {
|
|
426
|
+
$timestamp = $timestamp ?? time();
|
|
427
|
+
$fingerprint = substr(
|
|
428
|
+
hash('sha256', "{$clientIp}|{$userAgent}"),
|
|
429
|
+
0,
|
|
430
|
+
self::FINGERPRINT_LENGTH
|
|
431
|
+
);
|
|
432
|
+
$timeBucket = intdiv($timestamp, self::SESSION_TIMEOUT_SECONDS);
|
|
433
|
+
$raw = "{$fingerprint}_{$timeBucket}";
|
|
434
|
+
return substr(hash('sha256', $raw), 0, self::SESSION_ID_LENGTH);
|
|
435
|
+
}
|
|
436
|
+
|
|
437
|
+
/**
|
|
438
|
+
* Generate random session ID (fallback).
|
|
439
|
+
*/
|
|
440
|
+
public static function generateRandom(): string
|
|
441
|
+
{
|
|
442
|
+
return bin2hex(random_bytes(32));
|
|
443
|
+
}
|
|
444
|
+
}
|
|
445
|
+
```
|
|
446
|
+
|
|
447
|
+
**File**: `src/Mbuzz/Context.php` (updated to use new generator)
|
|
448
|
+
|
|
449
|
+
```php
|
|
450
|
+
<?php
|
|
451
|
+
|
|
452
|
+
// In the session ID resolution logic:
|
|
453
|
+
|
|
454
|
+
private function resolveSessionId(?string $visitorId): string
|
|
455
|
+
{
|
|
456
|
+
$existing = $this->cookieManager->getSessionId();
|
|
457
|
+
if ($existing !== null) {
|
|
458
|
+
return $existing;
|
|
459
|
+
}
|
|
460
|
+
|
|
461
|
+
if ($visitorId !== null) {
|
|
462
|
+
return SessionIdGenerator::generateDeterministic($visitorId);
|
|
463
|
+
}
|
|
464
|
+
|
|
465
|
+
return SessionIdGenerator::generateFromFingerprint(
|
|
466
|
+
$this->getClientIp(),
|
|
467
|
+
$this->getUserAgent()
|
|
468
|
+
);
|
|
469
|
+
}
|
|
470
|
+
|
|
471
|
+
private function getClientIp(): string
|
|
472
|
+
{
|
|
473
|
+
return !empty($_SERVER['HTTP_X_FORWARDED_FOR'])
|
|
474
|
+
? trim(explode(',', $_SERVER['HTTP_X_FORWARDED_FOR'])[0])
|
|
475
|
+
: ($_SERVER['HTTP_X_REAL_IP'] ?? $_SERVER['REMOTE_ADDR'] ?? 'unknown');
|
|
476
|
+
}
|
|
477
|
+
|
|
478
|
+
private function getUserAgent(): string
|
|
479
|
+
{
|
|
480
|
+
return $_SERVER['HTTP_USER_AGENT'] ?? 'unknown';
|
|
481
|
+
}
|
|
482
|
+
```
|
|
483
|
+
|
|
484
|
+
---
|
|
485
|
+
|
|
486
|
+
## Server-Side Handling (multibuzz API)
|
|
487
|
+
|
|
488
|
+
The API should handle idempotent session creation:
|
|
489
|
+
|
|
490
|
+
### Sessions Endpoint Update
|
|
491
|
+
|
|
492
|
+
```ruby
|
|
493
|
+
# app/services/sessions/creation_service.rb
|
|
494
|
+
|
|
495
|
+
def run
|
|
496
|
+
return existing_session_result if session_exists?
|
|
497
|
+
|
|
498
|
+
# Create new session...
|
|
499
|
+
end
|
|
500
|
+
|
|
501
|
+
def session_exists?
|
|
502
|
+
# Check if session with this session_id already exists
|
|
503
|
+
@existing_session = account.sessions.find_by(
|
|
504
|
+
session_id: session_id,
|
|
505
|
+
visitor: visitor
|
|
506
|
+
)
|
|
507
|
+
end
|
|
508
|
+
|
|
509
|
+
def existing_session_result
|
|
510
|
+
success_result(
|
|
511
|
+
visitor_id: visitor.visitor_id,
|
|
512
|
+
session_id: @existing_session.session_id,
|
|
513
|
+
channel: @existing_session.channel,
|
|
514
|
+
existing: true
|
|
515
|
+
)
|
|
516
|
+
end
|
|
517
|
+
```
|
|
518
|
+
|
|
519
|
+
This ensures:
|
|
520
|
+
1. Multiple requests with same deterministic session_id don't create duplicates
|
|
521
|
+
2. First request's data (UTM, referrer) is preserved
|
|
522
|
+
3. Subsequent requests are no-ops that return the existing session
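
The three guarantees above can be illustrated with a minimal in-memory sketch. `SessionStore` and its fields are hypothetical stand-ins for the sessions table, not part of the multibuzz API; the mutex plays the role that a unique index on the session key would presumably play in a real database, since a plain find-then-create is racy under concurrent requests.

```ruby
# Hypothetical in-memory stand-in for the sessions table (illustration only).
class SessionStore
  Session = Struct.new(:session_id, :visitor_id, :utm_source, :referrer,
                       keyword_init: true)

  def initialize
    @sessions = {}
    @lock = Mutex.new
  end

  # Returns [session, existing]. The first call for a given
  # (visitor_id, session_id) pair creates the session; later calls return
  # it unchanged, so the first request's attribution data is preserved.
  def find_or_create(visitor_id:, session_id:, utm_source: nil, referrer: nil)
    key = [visitor_id, session_id]
    @lock.synchronize do
      existing = @sessions[key]
      return [existing, true] if existing

      created = Session.new(session_id: session_id, visitor_id: visitor_id,
                            utm_source: utm_source, referrer: referrer)
      @sessions[key] = created
      [created, false]
    end
  end

  def count
    @lock.synchronize { @sessions.size }
  end
end

store = SessionStore.new
store.find_or_create(visitor_id: "v1", session_id: "s1", utm_source: "newsletter")
session, existing = store.find_or_create(visitor_id: "v1", session_id: "s1")
puts "existing=#{existing} utm_source=#{session.utm_source} count=#{store.count}"
```

The second call carries no UTM data, yet the stored session keeps the value from the first call and no duplicate row is created.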
|
|
523
|
+
|
|
524
|
+
---
|
|
525
|
+
|
|
526
|
+
## Migration Path
|
|
527
|
+
|
|
528
|
+
### Backward Compatibility
|
|
529
|
+
|
|
530
|
+
- Random session IDs still work (cookie-based sessions unaffected)
|
|
531
|
+
- No changes to cookie format or names
|
|
532
|
+
- No changes to API endpoints
|
|
533
|
+
- Existing sessions continue to function
|
|
534
|
+
|
|
535
|
+
### Rollout Strategy
|
|
536
|
+
|
|
537
|
+
1. **Phase 1**: Deploy API changes (idempotent session creation)
|
|
538
|
+
2. **Phase 2**: Release SDK updates (v0.7.0 for all SDKs)
|
|
539
|
+
3. **Phase 3**: Monitor metrics for reduced duplicate sessions
|
|
540
|
+
|
|
541
|
+
### Version Matrix
|
|
542
|
+
|
|
543
|
+
| SDK | Current | After |
|
|
544
|
+
|-----|---------|-------|
|
|
545
|
+
| mbuzz-ruby | 0.6.x | 0.7.0 |
|
|
546
|
+
| mbuzz-node | 0.6.x | 0.7.0 |
|
|
547
|
+
| mbuzz-python | 0.6.x | 0.7.0 |
|
|
548
|
+
| mbuzz-php | 0.6.x | 0.7.0 |
|
|
549
|
+
|
|
550
|
+
---
|
|
551
|
+
|
|
552
|
+
## Testing Requirements
|
|
553
|
+
|
|
554
|
+
### Unit Tests
|
|
555
|
+
|
|
556
|
+
Each SDK must test:
|
|
557
|
+
|
|
558
|
+
1. **Deterministic generation is consistent**
|
|
559
|
+
```
|
|
560
|
+
generate_deterministic("visitor_abc", 1735500000) == generate_deterministic("visitor_abc", 1735500000)
|
|
561
|
+
```
|
|
562
|
+
|
|
563
|
+
2. **Different visitors get different IDs**
|
|
564
|
+
```
|
|
565
|
+
generate_deterministic("visitor_a", t) != generate_deterministic("visitor_b", t)
|
|
566
|
+
```
|
|
567
|
+
|
|
568
|
+
3. **Time bucket boundaries work correctly**
|
|
569
|
+
```
|
|
570
|
+
# Same 30-minute window = same ID
|
|
571
|
+
generate_deterministic("v", 1735500000) == generate_deterministic("v", 1735500001)
|
|
572
|
+
|
|
573
|
+
# Different window = different ID
|
|
574
|
+
generate_deterministic("v", 1735500000) != generate_deterministic("v", 1735501800)
|
|
575
|
+
```
|
|
576
|
+
|
|
577
|
+
4. **Fingerprint generation is consistent**
|
|
578
|
+
```
|
|
579
|
+
generate_from_fingerprint("1.2.3.4", "Mozilla/5.0", t) ==
|
|
580
|
+
generate_from_fingerprint("1.2.3.4", "Mozilla/5.0", t)
|
|
581
|
+
```
|
|
582
|
+
|
|
583
|
+
5. **Different fingerprints get different IDs**
|
|
584
|
+
```
|
|
585
|
+
generate_from_fingerprint("1.2.3.4", "UA1", t) !=
|
|
586
|
+
generate_from_fingerprint("1.2.3.4", "UA2", t)
|
|
587
|
+
```
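
The five properties above can be exercised with a small runnable sketch. The module name `SessionIdSketch` is hypothetical, and the hashing scheme (SHA-256 over `"<id>_<bucket>"`, with a 32-character fingerprint prefix) is taken from the appendix of this spec rather than from the gem's confirmed implementation. Note that 1735500000 sits 1200 seconds into its bucket, so the exact boundary is between 1735500599 and 1735500600.

```ruby
require "digest/sha2"

# Hypothetical generator following the spec's pseudocode (illustration only).
module SessionIdSketch
  BUCKET_SECONDS = 1800 # 30-minute window

  module_function

  def generate_deterministic(visitor_id, timestamp = Time.now.to_i)
    bucket = timestamp / BUCKET_SECONDS # Integer division floors
    Digest::SHA256.hexdigest("#{visitor_id}_#{bucket}")
  end

  def generate_from_fingerprint(client_ip, user_agent, timestamp = Time.now.to_i)
    fingerprint = Digest::SHA256.hexdigest("#{client_ip}|#{user_agent}")[0, 32]
    bucket = timestamp / BUCKET_SECONDS
    Digest::SHA256.hexdigest("#{fingerprint}_#{bucket}")
  end
end

g = SessionIdSketch
t = 1_735_500_000

checks = {
  "consistent"             => g.generate_deterministic("visitor_abc", t) == g.generate_deterministic("visitor_abc", t),
  "visitors differ"        => g.generate_deterministic("visitor_a", t) != g.generate_deterministic("visitor_b", t),
  "same window"            => g.generate_deterministic("v", 1_735_500_599) == g.generate_deterministic("v", t),
  "next window"            => g.generate_deterministic("v", 1_735_500_600) != g.generate_deterministic("v", t),
  "fingerprint consistent" => g.generate_from_fingerprint("1.2.3.4", "Mozilla/5.0", t) == g.generate_from_fingerprint("1.2.3.4", "Mozilla/5.0", t),
  "fingerprints differ"    => g.generate_from_fingerprint("1.2.3.4", "UA1", t) != g.generate_from_fingerprint("1.2.3.4", "UA2", t)
}
checks.each { |name, ok| puts "#{name}: #{ok ? 'pass' : 'FAIL'}" }
```

Passing the timestamp explicitly, as the spec's examples do, keeps these checks independent of the wall clock.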
|
|
588
|
+
|
|
589
|
+
### Integration Tests
|
|
590
|
+
|
|
591
|
+
1. Concurrent requests from same visitor get same session
|
|
592
|
+
2. Session cookie is set correctly on first response
|
|
593
|
+
3. Subsequent requests use cookie (not regenerated)
|
|
594
|
+
|
|
595
|
+
---
|
|
596
|
+
|
|
597
|
+
## Metrics to Monitor
|
|
598
|
+
|
|
599
|
+
After deployment, track:
|
|
600
|
+
|
|
601
|
+
1. **Sessions per visitor ratio** - Should decrease toward 1.0
|
|
602
|
+
2. **Duplicate session timestamps** - Should approach 0
|
|
603
|
+
3. **"Direct" channel percentage** - Should decrease if inflated
|
|
604
|
+
4. **Average visits per conversion** - Should normalize
|
|
605
|
+
|
|
606
|
+
### Success Criteria
|
|
607
|
+
|
|
608
|
+
| Metric | Before | Target |
|
|
609
|
+
|--------|--------|--------|
|
|
610
|
+
| Sessions per new visitor | 2.0+ | < 1.2 |
|
|
611
|
+
| Timestamps with duplicates | 65+ | < 5 |
|
|
612
|
+
| Empty sessions (0 page views) | 94.8% | < 10% |
|
|
613
|
+
|
|
614
|
+
---
|
|
615
|
+
|
|
616
|
+
## Open Questions
|
|
617
|
+
|
|
618
|
+
1. **IPv6 handling**: Should we normalize IPv6 addresses before hashing?
|
|
619
|
+
2. **Proxy detection**: Should we try multiple headers (X-Real-IP, CF-Connecting-IP)?
|
|
620
|
+
3. **Bot detection**: Should known bot user agents bypass deterministic generation?
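
For the IPv6 question, one possible approach (an assumption for discussion, not a decision recorded in this spec) is to canonicalize the address with Ruby's standard `IPAddr` before hashing, so that different textual spellings of the same address produce the same fingerprint. The helper name `normalize_ip` is hypothetical.

```ruby
require "ipaddr"

# Hypothetical helper: canonicalize an IP string before fingerprint hashing.
# Falls back to the trimmed, downcased input when the value is not a valid
# address (for example the "unknown" placeholder used by the SDK snippets).
def normalize_ip(raw)
  IPAddr.new(raw.to_s.strip).to_s
rescue IPAddr::Error
  raw.to_s.strip.downcase
end

long_form  = normalize_ip("2001:0DB8:0000:0000:0000:0000:0000:0001")
short_form = normalize_ip("2001:db8::1")
puts long_form == short_form
```

Without a step like this, the two spellings would hash to different fingerprints and the same client could be assigned two session IDs within one time bucket.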
|
|
621
|
+
|
|
622
|
+
---
|
|
623
|
+
|
|
624
|
+
## Appendix: Hash Examples
|
|
625
|
+
|
|
626
|
+
```
|
|
627
|
+
# Returning visitor
|
|
628
|
+
visitor_id = "a1b2c3d4e5f6..."
|
|
629
|
+
timestamp = 1735500000
|
|
630
|
+
time_bucket = floor(1735500000 / 1800) = 964166
|
|
631
|
+
raw = "a1b2c3d4e5f6..._964166"
|
|
632
|
+
session_id = SHA256(raw)[0:64] = "7f8e9d0c1b2a..."
|
|
633
|
+
|
|
634
|
+
# New visitor
|
|
635
|
+
client_ip = "203.0.113.42"
|
|
636
|
+
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)..."
|
|
637
|
+
fingerprint = SHA256("203.0.113.42|Mozilla/5.0...")[0:32] = "abc123..."
|
|
638
|
+
raw = "abc123..._964166"
|
|
639
|
+
session_id = SHA256(raw)[0:64] = "9e8d7c6b5a4..."
|
|
640
|
+
```
|
data/mbuzz-0.6.3.gem
ADDED
|
Binary file
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: mbuzz
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.6.
|
|
4
|
+
version: 0.6.4
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- mbuzz team
|
|
@@ -32,10 +32,13 @@ executables: []
|
|
|
32
32
|
extensions: []
|
|
33
33
|
extra_rdoc_files: []
|
|
34
34
|
files:
|
|
35
|
+
- ".DS_Store"
|
|
35
36
|
- CHANGELOG.md
|
|
37
|
+
- CHECK_BUG.md
|
|
36
38
|
- LICENSE.txt
|
|
37
39
|
- README.md
|
|
38
40
|
- Rakefile
|
|
41
|
+
- lib/.DS_Store
|
|
39
42
|
- lib/mbuzz.rb
|
|
40
43
|
- lib/mbuzz/api.rb
|
|
41
44
|
- lib/mbuzz/client.rb
|
|
@@ -48,6 +51,7 @@ files:
|
|
|
48
51
|
- lib/mbuzz/middleware/tracking.rb
|
|
49
52
|
- lib/mbuzz/railtie.rb
|
|
50
53
|
- lib/mbuzz/request_context.rb
|
|
54
|
+
- lib/mbuzz/session/id_generator.rb
|
|
51
55
|
- lib/mbuzz/version.rb
|
|
52
56
|
- lib/mbuzz/visitor/identifier.rb
|
|
53
57
|
- lib/specs/old/SPECIFICATION.md
|
|
@@ -56,7 +60,9 @@ files:
|
|
|
56
60
|
- lib/specs/old/v0.2.0_breaking_changes.md
|
|
57
61
|
- lib/specs/old/v2.0.0_sessions_upgrade.md
|
|
58
62
|
- lib/specs/v0.5.0_four_call_model.md
|
|
63
|
+
- lib/specs/v0.7.0_deterministic_sessions.md
|
|
59
64
|
- mbuzz-0.6.0.gem
|
|
65
|
+
- mbuzz-0.6.3.gem
|
|
60
66
|
- sig/mbuzz.rbs
|
|
61
67
|
homepage: https://mbuzz.co
|
|
62
68
|
licenses:
|