rflow 1.0.1 → 1.1.0

Sign up to get free protection for your applications and to get access to all the features.
data/NOTES DELETED
@@ -1,187 +0,0 @@
1
- RFlow Manager
2
-
3
- Components
4
- Input Ports
5
- Output Ports
6
-
7
- Connections
8
- Input Ports
9
- Output Ports
10
-
11
- rflow <config file>
12
- figure out a work directory
13
- make sure that it has the right subdirectories (can be overridden)
14
- run tmp logs schemas components
15
- Set up logging to logs/rflow.log
16
- Load all schemas
17
- Verify all component installation
18
- Initialize components
19
- Start components running and make sure that they "daemonize" correctly
20
- - place pid files in deployment's run directory
21
- Configure components via zmq
22
- Daemonize self
23
-
24
-
25
-
26
- class Component
27
- def self.input_port
28
- end
29
-
30
- def self.output_port
31
- end
32
-
33
- attr_accessor :state
34
-
35
- def initialize(config, run_directory)
36
-
37
- end
38
-
39
- def run
40
-
41
- end
42
-
43
- def configure
44
-
45
- end
46
-
47
-
48
-
49
- class PassThrough < Component
50
- input_port [:in]
51
- input_port :another_in
52
- output_port :out
53
- output_port :another_out
54
-
55
- def initialize(config, run_directory)
56
- # This will initialize the ports
57
- super
58
- # Do stuff to initialize component
59
- end
60
-
61
- end
62
-
63
-
64
- Computation Requirements:
65
- Initial startup with:
66
- - management bus connection information
67
- - group and instance UUID
68
- - beacon interval
69
- - run directory, containing
70
- - PID files
71
- - log dir + logs
72
- - computation-specific configuration (conf dir)
73
- Needs to process the following messages from mgmt bus:
74
- - CONFIGURE (ports)
75
- - RUN
76
- - SHUTDOWN
77
- Needs to send the following messages to mgmt bus:
78
- - LOG
79
- - BEACON (state machine of the below submessages)
80
- - STARTED
81
- - CONFIGURED
82
- - RUNNING
83
- - STOPPING
84
- - STOPPED
85
- - ERROR
86
- On startup:
87
- - listen to mgmt bus
88
- - publish BEACON + state to mgmt bus every (beacon interval) seconds (default to 1 sec)
89
-
90
-
91
- External Computations:
92
- - Given (out-of-band) startup info (mgmt bus, UUIDs, beacon interval)
93
- -
94
-
95
-
96
- RFlow
97
- - Will need a DB for config
98
- - Initial startup will need to resolve all remaining outstanding items (ports, UUIDs, etc) and store in config DB
99
- - MVC, Mongrel2-like?
100
-
101
- Translate
102
- - Need to add <associated type="objtype" name="myname"> where name attr can be used in later XML templates
103
-
104
-
105
-
106
-
107
-
108
-
109
- ----------------
110
- Plugins:
111
- an externally defined plugin needs access to all current data types, as well as being able to define its own and tell the system about that.
112
- - necessary to tell system?
113
- - need a protocol for defining schema transfer
114
- - each message has attached schema
115
-
116
-
117
- lib/rflow/message.rb
118
-
119
- RFlow::Config
120
-
121
- RFlow::Management
122
- - Somewhere for external people to register new computations with running system
123
- - computation says that its running and asks for Connection configuration
124
- - how will it specify where in the workflow it wants to run????
125
-
126
- RFlow::Message(complete on-the-wire Avro message format)
127
- data_type, provenance, external_ids, empty, data (see below)
128
-
129
- RFlow::Data::(various message data blocks)
130
-
131
- RFlow::Computation
132
- uuid, name, class, input_ports, output_ports
133
-
134
-
135
- RFlow::Connection
136
- encapsulates link knowlege and provides an API for sending and receiving
137
- each computation will have one for each port
138
- each computation will call into the connection to send (possibly via a Port object) and recieve
139
-
140
- RFlow::Connection::AMQP
141
- will manage connections to an AMQP server
142
-
143
- RFlow::Connection::ZMQ
144
-
145
-
146
-
147
-
148
- computation_a.output_port -> (connection.incoming -> connection.outgoing) -> computation_b.input_port
149
-
150
- AMQP::Topic - responsible for setting up a topic -> queue binding
151
- r.incoming = amqp connection, channel, vhost, login, password, topic
152
- r.outgoing = amqp connection, channel, vhost, login, password, queue name
153
- behavior -> n x m, "round-robin" among the connected outgoing
154
- incoming behavior will need to set topic/key, uses the data type in the RFlow::Message
155
-
156
-
157
- ZMQ::PubSub - device-less, responsible for assigning ip/port and assigning one client to bind the port
158
- r.incoming = zmq connection string (tcp://ip:port), type pub
159
- r.outgoing = zmq connection string (tcp://ip:port), type sub
160
- behavior -> n x m, broadcast sending,
161
-
162
- ZMQ::PushPull - device-less, responsible for assigning ip/port and assigning one client to bind the port
163
- r.incoming = zmq connection string (tcp://ip:port), type push
164
- r.outgoing = zmq connection string (tcp://ip:port), type pull
165
-
166
-
167
- Startup
168
-
169
- RFlow.run is the management process for the workflow
170
-
171
- computations = config.computations.map do |c|
172
- instantiate_computation(c)
173
- # Check for errors here, which would be evident if a computation couldn't be found/created
174
- # Just creating single process ruby objects here to check for errors
175
- end
176
-
177
- computations.each do |c|
178
- c.configure # with what????
179
- # Still single ruby process to set and deconflict all the configuration parameters
180
- end
181
-
182
- computations.each do |c|
183
- c.run
184
- end
185
-
186
- listen_for_management_events_from_old_computations
187
- listen_for_new_computation_registration