async-container-supervisor 0.6.3 → 0.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: ad71770c6717fe41a49b28a2f4614e7f539d572332805e44cc8f857d0c6cc404
-  data.tar.gz: 2f2ed9f22cce34b84f85c132f87b5c3296b539530ea26d0127a24dacdc4b5aac
+  metadata.gz: a2da6a39261568dcfdcd067bcbd7364397df19aad34826ec9d7b6745e74aa198
+  data.tar.gz: b8510b8f17ac2fea393f12223604c96a8f9303d6e87b1fc2ac54fc6b0cdfdaeb
 SHA512:
-  metadata.gz: 360ca6f0fe692937d47579307d9abd59f4879ce4eef88d927b32441eb71f40ac827ea63bc66e2c45b9b489072385de13f49c2bfb2698ab0ff878853d6a34d768
-  data.tar.gz: 8c655ae5432df6b742660ca02251b8f7d25ca2b8a2937aa59d66cf485b562e7abc9f1bd7154cdded585f0e773de40dc9c1bb93871871a56f8cda5288ae61fc60
+  metadata.gz: bf79321c826f009edac43b3c1bf1313710ed6881863aa56dea8c7909c7c231cbb1e07b92b0ed2241f1f76cd137e934ec17fb0b6ed00b30ebe8778c430c61735c
+  data.tar.gz: c469d508ec02830abe705ec935b6778b45f78923ed592e9af652f07189ad1929e2c61be389bed2ab2076697e7349398e199a43dd238b486e6ebbc899e4b5f9f7
checksums.yaml.gz.sig CHANGED
Binary file
@@ -29,6 +29,25 @@ def status
 	end
 end
 
+# Sample memory allocations from a worker over a time period.
+#
+# This is useful for identifying memory leaks by tracking allocations
+# that are retained after garbage collection.
+#
+# @parameter duration [Integer] The duration in seconds to sample for (default: 10).
+# @parameter connection_id [String] The connection ID to target a specific worker.
+def memory_sample(duration: 10, connection_id:)
+	client do |connection|
+		Console.info(self, "Sampling memory from worker...", duration: duration, connection_id: connection_id)
+
+		# Build the operation request:
+		operation = {do: :memory_sample, duration: duration}
+
+		# Use the forward operation to proxy the request to a worker:
+		return connection.call(do: :forward, operation: operation, connection_id: connection_id)
+	end
+end
+
 private
 
 def endpoint
@@ -139,13 +139,46 @@ The {ruby Async::Container::Supervisor::MemoryMonitor} will periodically check w
 
 The supervisor can collect various diagnostics from workers on demand:
 
-- **Memory dumps**: Full heap dumps for memory analysis
-- **Thread dumps**: Stack traces of all threads
+- **Memory dumps**: Full heap dumps for memory analysis via `ObjectSpace.dump_all`.
+- **Memory samples**: Lightweight sampling to identify memory leaks.
+- **Thread dumps**: Stack traces of all threads.
 - **Scheduler dumps**: Async fiber hierarchy
 - **Garbage collection profiles**: GC performance data
 
 These can be triggered programmatically or via command-line tools (when available).
 
+#### Memory Leak Diagnosis
+
+To identify memory leaks, you can use the memory sampling feature which is much lighter weight than a full memory dump. It tracks allocations over a time period and focuses on retained objects.
+
+**Using the bake task:**
+
+```bash
+# Sample for 30 seconds and print report to console
+$ bake async:container:supervisor:memory_sample duration=30
+```
+
+**Programmatically:**
+
+```ruby
+# Assuming you have a connection to a worker:
+result = connection.call(do: :memory_sample, duration: 30)
+puts result[:data]
+```
+
+This will sample memory allocations for the specified duration, then force a garbage collection and return a JSON report showing what objects were allocated during that period and retained after GC. Late-lifecycle allocations that are retained are likely memory leaks.
+
+The JSON report includes:
+- `total_allocated`: Total allocated memory and count
+- `total_retained`: Total retained memory and count
+- `by_gem`: Breakdown by gem/library
+- `by_file`: Breakdown by source file
+- `by_location`: Breakdown by specific file:line locations
+- `by_class`: Breakdown by object class
+- `strings`: String allocation analysis
+
+This is much more efficient than `do: :memory_dump` which uses `ObjectSpace.dump_all` and can be slow and blocking on large heaps. The JSON format also makes it easy to integrate with monitoring and analysis tools.
+
 ## Advanced Usage
 
 ### Custom Monitors
@@ -45,7 +45,7 @@ module Async
 
 				# Run the client in a loop, reconnecting if necessary.
 				def run
-					Async do
+					Async(annotation: "Supervisor Client", transient: true) do
 						loop do
 							connection = connect!
 
@@ -71,6 +71,27 @@ module Async
 						@queue.closed?
 					end
 
+					# Forward this call to another connection, proxying all responses back.
+					#
+					# This provides true streaming forwarding - intermediate responses flow through
+					# in real-time rather than being buffered.
+					#
+					# @parameter target_connection [Connection] The connection to forward the call to.
+					# @parameter operation [Hash] The operation request to forward (must include :do key).
+					def forward(target_connection, operation)
+						# Forward the operation in an async task to avoid blocking
+						Async do
+							# Make the call to the target connection and stream responses back:
+							Call.call(target_connection, **operation) do |response|
+								# Push each response through our queue:
+								self.push(**response)
+							end
+						ensure
+							# Close our queue to signal completion:
+							@queue.close
+						end
+					end
+
 					def self.dispatch(connection, target, id, message)
 						Async do
 							call = self.new(connection, id, message)
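
The streaming behaviour described in the `forward` comments above (each intermediate response is pushed through as it arrives, and the queue is closed to signal completion) can be illustrated without the gem. The following is a minimal sketch only: it swaps the Async task for a plain `Thread` and the call's internal queue for the standard library `Queue`, and `FakeTarget` and `forward_to` are hypothetical names that do not exist in async-container-supervisor.

```ruby
# Hypothetical stand-in for a target connection. It yields three
# intermediate responses followed by one final response.
class FakeTarget
	def call(**operation)
		3.times do |index|
			yield({do: operation[:do], chunk: index, finished: false})
		end
		
		yield({do: operation[:do], finished: true})
	end
end

# Forward an operation to the target on a background thread, pushing each
# response onto the queue as soon as it arrives rather than buffering them.
def forward_to(target, operation, queue)
	Thread.new do
		target.call(**operation) do |response|
			queue.push(response)
		end
	ensure
		# Closing the queue signals completion to the consumer, even if the
		# target raised an error part way through:
		queue.close
	end
end

queue = Queue.new
forward_to(FakeTarget.new, {do: :memory_sample, duration: 1}, queue)

# Queue#pop returns nil once the queue is closed and drained:
responses = []
while response = queue.pop
	responses << response
end
```

Closing the queue in `ensure` is the part that matters for the consumer: without it, a failure in the target would leave the reader blocked on `pop` indefinitely.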
@@ -10,15 +10,19 @@ module Async
 	module Container
 		module Supervisor
 			class MemoryMonitor
+				MEMORY_SAMPLE = {duration: 60, timeout: 60+20}
+
 				# Create a new memory monitor.
 				#
 				# @parameter interval [Integer] The interval at which to check for memory leaks.
 				# @parameter total_size_limit [Integer] The total size limit of all processes, or nil for no limit.
 				# @parameter options [Hash] Options to pass to the cluster when adding processes.
-				def initialize(interval: 10, total_size_limit: nil, **options)
+				def initialize(interval: 10, total_size_limit: nil, memory_sample: MEMORY_SAMPLE, **options)
 					@interval = interval
 					@cluster = Memory::Leak::Cluster.new(total_size_limit: total_size_limit)
 
+					@memory_sample = memory_sample
+
 					# We use these options when adding processes to the cluster:
 					@options = options
 
@@ -74,6 +78,23 @@ module Async
 				# @parameter monitor [Memory::Leak::Monitor] The monitor that detected the memory leak.
 				# @returns [Boolean] True if the process was killed.
 				def memory_leak_detected(process_id, monitor)
+					Console.info(self, "Memory leak detected!", child: {process_id: process_id}, monitor: monitor)
+
+					if @memory_sample
+						Console.info(self, "Capturing memory sample...", child: {process_id: process_id}, memory_sample: @memory_sample)
+
+						# We are tracking multiple connections to the same process:
+						connections = @processes[process_id]
+
+						# Try to capture a memory sample:
+						connections.each do |connection|
+							result = connection.call(do: :memory_sample, **@memory_sample)
+
+							Console.info(self, "Memory sample completed:", child: {process_id: process_id}, result: result)
+						end
+					end
+
+					# Kill the process gently:
 					Console.info(self, "Killing process!", child: {process_id: process_id})
 					Process.kill(:INT, process_id)
 
@@ -3,6 +3,8 @@
 # Released under the MIT License.
 # Copyright, 2025, by Samuel Williams.
 
+require "securerandom"
+
 require_relative "connection"
 require_relative "endpoint"
 require_relative "dispatchable"
@@ -17,15 +19,23 @@ module Async
 				def initialize(monitors: [], endpoint: Supervisor.endpoint)
 					@monitors = monitors
 					@endpoint = endpoint
+
+					@connections = {}
 				end
 
 				attr :monitors
+				attr :connections
 
 				include Dispatchable
 
 				def do_register(call)
 					call.connection.state.merge!(call.message[:state])
 
+					connection_id = SecureRandom.uuid
+					call.connection.state[:connection_id] = connection_id
+
+					@connections[connection_id] = call.connection
+
 					@monitors.each do |monitor|
 						monitor.register(call.connection)
 					rescue => error
@@ -35,6 +45,31 @@ module Async
 					call.finish
 				end
 
+				# Forward an operation to a worker connection.
+				#
+				# @parameter call [Connection::Call] The call to handle.
+				# @parameter operation [Hash] The operation to forward, must include :do key.
+				# @parameter connection_id [String] The connection ID to target.
+				def do_forward(call)
+					operation = call[:operation]
+					connection_id = call[:connection_id]
+
+					unless connection_id
+						call.fail(error: "Missing 'connection_id' parameter")
+						return
+					end
+
+					connection = @connections[connection_id]
+
+					unless connection
+						call.fail(error: "Connection not found", connection_id: connection_id)
+						return
+					end
+
+					# Forward the call to the target connection
+					call.forward(connection, operation)
+				end
+
 				# Restart the current process group, usually including the supervisor and any other processes.
 				#
 				# @parameter signal [Symbol] The signal to send to the process group.
@@ -48,14 +83,26 @@ module Async
 				end
 
 				def do_status(call)
+					connections = @connections.map do |connection_id, connection|
+						{
+							connection_id: connection_id,
+							process_id: connection.state[:process_id],
+							state: connection.state,
+						}
+					end
+
 					@monitors.each do |monitor|
 						monitor.status(call)
 					end
 
-					call.finish
+					call.finish(connections: connections)
 				end
 
 				def remove(connection)
+					if connection_id = connection.state[:connection_id]
+						@connections.delete(connection_id)
+					end
+
 					@monitors.each do |monitor|
 						monitor.remove(connection)
 					rescue => error
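
Taken together, the server hunks above add a connection registry: each registering connection is assigned a UUID, `do_forward` looks the target up by that ID and fails if it is missing or unknown, and `remove` deletes the entry. The sketch below shows that lifecycle in isolation. `ConnectionRegistry` and its `[:ok, ...]` / `[:error, ...]` return convention are assumptions made for illustration and are not part of the gem, which reports errors through `call.fail` instead.

```ruby
require "securerandom"

# Hypothetical registry mirroring the @connections hash in the diff.
class ConnectionRegistry
	def initialize
		@connections = {}
	end
	
	# Assign a fresh UUID to the connection and remember it.
	def register(connection)
		connection_id = SecureRandom.uuid
		@connections[connection_id] = connection
		
		return connection_id
	end
	
	# Forget a connection. Unknown IDs are ignored.
	def remove(connection_id)
		@connections.delete(connection_id)
	end
	
	# Resolve a connection ID, distinguishing a missing ID from an unknown one.
	def lookup(connection_id)
		return [:error, "Missing 'connection_id' parameter"] unless connection_id
		
		connection = @connections[connection_id]
		return [:error, "Connection not found"] unless connection
		
		return [:ok, connection]
	end
	
	# Summarise registered connections, similar in spirit to do_status.
	def status
		@connections.map do |connection_id, connection|
			{connection_id: connection_id, connection: connection}
		end
	end
end

registry = ConnectionRegistry.new
worker = Object.new # Stand-in for a real connection object.
connection_id = registry.register(worker)
```

Random UUIDs make the IDs unguessable and unique per registration, so a worker that reconnects receives a new ID and a stale ID resolves to "Connection not found" rather than to the wrong worker.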
@@ -6,7 +6,7 @@
 module Async
 	module Container
 		module Supervisor
-			VERSION = "0.6.3"
+			VERSION = "0.7.0"
 		end
 	end
 end
@@ -53,6 +53,44 @@ module Async
 					end
 				end
 
+				# Sample memory allocations over a time period to identify potential leaks.
+				#
+				# This method is much lighter weight than {do_memory_dump} and focuses on
+				# retained objects allocated during the sampling period. Late-lifecycle
+				# allocations that are retained are likely memory leaks.
+				#
+				# @parameter call [Connection::Call] The call to respond to.
+				# @parameter duration [Numeric] The duration in seconds to sample for (default: 10).
+				def do_memory_sample(call)
+					require "memory"
+
+					unless duration = call[:duration] and duration.positive?
+						raise ArgumentError, "Positive duration is required!"
+					end
+
+					Console.info(self, "Starting memory sampling...", duration: duration)
+
+					# Create a sampler to track allocations
+					sampler = Memory::Sampler.new
+
+					# Start sampling
+					sampler.start
+
+					# Sample for the specified duration
+					sleep(duration)
+
+					# Stop sampling
+					sampler.stop
+
+					Console.info(self, "Memory sampling completed, generating report...", sampler: sampler)
+
+					# Generate a report focused on retained objects (likely leaks):
+					report = sampler.report
+					call.finish(report: report.as_json)
+				ensure
+					GC.start
+				end
+
 				def do_thread_dump(call)
 					dump(call) do |file|
 						Thread.list.each do |thread|
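
The duration guard in `do_memory_sample` relies on `and` binding more loosely than `=`, so the assignment happens first and the result is then tested with `positive?`. A small standalone sketch of the same guard follows; `validate_duration` is a hypothetical helper, not a method of the gem, and it takes a plain Hash where the gem reads from the call.

```ruby
# Hypothetical helper reproducing the guard from do_memory_sample.
def validate_duration(message)
	# Parsed as: (duration = message[:duration]) and duration.positive?
	# A missing duration is nil (falsy), so positive? is never called on it.
	unless duration = message[:duration] and duration.positive?
		raise ArgumentError, "Positive duration is required!"
	end
	
	return duration
end

# Collect the outcome for a few representative inputs:
outcomes = [{duration: 30}, {duration: 0.5}, {duration: 0}, {duration: -1}, {}].map do |message|
	validate_duration(message)
rescue ArgumentError => error
	error.message
end
```

Note that, as written, the guard rejects a missing, zero, or negative duration, so the "default: 10" mentioned in the method comment is supplied by callers such as the bake task rather than by this method.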
@@ -68,11 +106,11 @@ module Async
 				end
 
 				def do_garbage_profile_stop(call)
-					GC::Profiler.disable
-
 					dump(connection, message) do |file|
 						file.puts GC::Profiler.result
 					end
+				ensure
+					GC::Profiler.disable
 				end
 
 				protected def connected!(connection)
data/readme.md CHANGED
@@ -22,6 +22,14 @@ Please see the [project documentation](https://socketry.github.io/async-containe
 
 Please see the [project releases](https://socketry.github.io/async-container-supervisor/releases/index) for all releases.
 
+### v0.7.0
+
+- If a memory leak is detected, sample memory usage for 60 seconds before exiting.
+
+### v0.6.4
+
+- Make client task (in supervised worker) transient, so that it doesn't keep the reactor alive unnecessarily. It also won't be stopped by default when SIGINT is received, so that the worker will remain connected to the supervisor until the worker is completely terminated.
+
 ### v0.6.3
 
 - Add agent context documentation.
data/releases.md CHANGED
@@ -1,5 +1,13 @@
 # Releases
 
+## v0.7.0
+
+- If a memory leak is detected, sample memory usage for 60 seconds before exiting.
+
+## v0.6.4
+
+- Make client task (in supervised worker) transient, so that it doesn't keep the reactor alive unnecessarily. It also won't be stopped by default when SIGINT is received, so that the worker will remain connected to the supervisor until the worker is completely terminated.
+
 ## v0.6.3
 
 - Add agent context documentation.
data.tar.gz.sig CHANGED
Binary file
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: async-container-supervisor
 version: !ruby/object:Gem::Version
-  version: 0.6.3
+  version: 0.7.0
 platform: ruby
 authors:
 - Samuel Williams
@@ -66,6 +66,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: memory
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.7'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '0.7'
 - !ruby/object:Gem::Dependency
   name: memory-leak
   requirement: !ruby/object:Gem::Requirement
metadata.gz.sig CHANGED
Binary file