karafka 0.5.0

Files changed (68)
  1. checksums.yaml +7 -0
  2. data/.gitignore +68 -0
  3. data/.ruby-gemset +1 -0
  4. data/.ruby-version +1 -0
  5. data/.travis.yml +6 -0
  6. data/CHANGELOG.md +202 -0
  7. data/Gemfile +8 -0
  8. data/Gemfile.lock +216 -0
  9. data/MIT-LICENCE +18 -0
  10. data/README.md +831 -0
  11. data/Rakefile +17 -0
  12. data/bin/karafka +7 -0
  13. data/karafka.gemspec +34 -0
  14. data/lib/karafka.rb +73 -0
  15. data/lib/karafka/app.rb +45 -0
  16. data/lib/karafka/base_controller.rb +162 -0
  17. data/lib/karafka/base_responder.rb +118 -0
  18. data/lib/karafka/base_worker.rb +41 -0
  19. data/lib/karafka/capistrano.rb +2 -0
  20. data/lib/karafka/capistrano/karafka.cap +84 -0
  21. data/lib/karafka/cli.rb +52 -0
  22. data/lib/karafka/cli/base.rb +74 -0
  23. data/lib/karafka/cli/console.rb +23 -0
  24. data/lib/karafka/cli/flow.rb +46 -0
  25. data/lib/karafka/cli/info.rb +26 -0
  26. data/lib/karafka/cli/install.rb +45 -0
  27. data/lib/karafka/cli/routes.rb +39 -0
  28. data/lib/karafka/cli/server.rb +59 -0
  29. data/lib/karafka/cli/worker.rb +26 -0
  30. data/lib/karafka/connection/consumer.rb +29 -0
  31. data/lib/karafka/connection/listener.rb +54 -0
  32. data/lib/karafka/connection/message.rb +17 -0
  33. data/lib/karafka/connection/topic_consumer.rb +48 -0
  34. data/lib/karafka/errors.rb +50 -0
  35. data/lib/karafka/fetcher.rb +40 -0
  36. data/lib/karafka/helpers/class_matcher.rb +77 -0
  37. data/lib/karafka/helpers/multi_delegator.rb +31 -0
  38. data/lib/karafka/loader.rb +77 -0
  39. data/lib/karafka/logger.rb +52 -0
  40. data/lib/karafka/monitor.rb +82 -0
  41. data/lib/karafka/params/interchanger.rb +33 -0
  42. data/lib/karafka/params/params.rb +102 -0
  43. data/lib/karafka/patches/dry/configurable/config.rb +37 -0
  44. data/lib/karafka/process.rb +61 -0
  45. data/lib/karafka/responders/builder.rb +33 -0
  46. data/lib/karafka/responders/topic.rb +43 -0
  47. data/lib/karafka/responders/usage_validator.rb +59 -0
  48. data/lib/karafka/routing/builder.rb +89 -0
  49. data/lib/karafka/routing/route.rb +80 -0
  50. data/lib/karafka/routing/router.rb +38 -0
  51. data/lib/karafka/server.rb +53 -0
  52. data/lib/karafka/setup/config.rb +57 -0
  53. data/lib/karafka/setup/configurators/base.rb +33 -0
  54. data/lib/karafka/setup/configurators/celluloid.rb +20 -0
  55. data/lib/karafka/setup/configurators/sidekiq.rb +34 -0
  56. data/lib/karafka/setup/configurators/water_drop.rb +19 -0
  57. data/lib/karafka/setup/configurators/worker_glass.rb +13 -0
  58. data/lib/karafka/status.rb +23 -0
  59. data/lib/karafka/templates/app.rb.example +26 -0
  60. data/lib/karafka/templates/application_controller.rb.example +5 -0
  61. data/lib/karafka/templates/application_responder.rb.example +9 -0
  62. data/lib/karafka/templates/application_worker.rb.example +12 -0
  63. data/lib/karafka/templates/config.ru.example +13 -0
  64. data/lib/karafka/templates/sidekiq.yml.example +26 -0
  65. data/lib/karafka/version.rb +6 -0
  66. data/lib/karafka/workers/builder.rb +49 -0
  67. data/log/.gitkeep +0 -0
  68. metadata +267 -0
@@ -0,0 +1,18 @@
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,831 @@
+ # Karafka
+
+ [![Build Status](https://travis-ci.org/karafka/karafka.png)](https://travis-ci.org/karafka/karafka)
+ [![Code Climate](https://codeclimate.com/github/karafka/karafka/badges/gpa.svg)](https://codeclimate.com/github/karafka/karafka)
+ [![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+
+ A framework that simplifies the development of Apache Kafka based Ruby applications.
+
+ It allows programmers to use an approach similar to "the Rails way" when working with asynchronous Kafka messages.
+
+ Karafka not only handles incoming messages but also provides tools for building complex data-flow applications that receive and send messages.
+
+ ## Table of Contents
+
+ - [Table of Contents](#table-of-contents)
+ - [Support](#support)
+ - [Requirements](#requirements)
+ - [How does it work](#how-does-it-work)
+ - [Installation](#installation)
+ - [Setup](#setup)
+   - [Application](#application)
+   - [Configurators](#configurators)
+   - [Environment variables settings](#environment-variables-settings)
+   - [Kafka brokers auto-discovery](#kafka-brokers-auto-discovery)
+ - [Usage](#usage)
+   - [Karafka CLI](#karafka-cli)
+   - [Routing](#routing)
+     - [Topic](#topic)
+     - [Group](#group)
+     - [Worker](#worker)
+     - [Parser](#parser)
+     - [Interchanger](#interchanger)
+     - [Responder](#responder)
+   - [Receiving messages](#receiving-messages)
+     - [Processing messages directly (without Sidekiq)](#processing-messages-directly-without-sidekiq)
+   - [Sending messages from Karafka](#sending-messages-from-karafka)
+     - [Using responders (recommended)](#using-responders-recommended)
+     - [Using WaterDrop directly](#using-waterdrop-directly)
+ - [Important components](#important-components)
+   - [Controllers](#controllers)
+     - [Controllers callbacks](#controllers-callbacks)
+   - [Responders](#responders)
+     - [Registering topics](#registering-topics)
+     - [Responding on topics](#responding-on-topics)
+     - [Response validation](#response-validation)
+ - [Monitoring and logging](#monitoring-and-logging)
+   - [Example monitor with Errbit/Airbrake support](#example-monitor-with-errbitairbrake-support)
+   - [Example monitor with NewRelic support](#example-monitor-with-newrelic-support)
+ - [Deployment](#deployment)
+   - [Capistrano](#capistrano)
+   - [Docker](#docker)
+ - [Sidekiq Web UI](#sidekiq-web-ui)
+ - [Concurrency](#concurrency)
+ - [Integrating with other frameworks](#integrating-with-other-frameworks)
+   - [Integrating with Ruby on Rails](#integrating-with-ruby-on-rails)
+   - [Integrating with Sinatra](#integrating-with-sinatra)
+ - [Articles and other references](#articles-and-other-references)
+   - [Libraries and components](#libraries-and-components)
+   - [Articles and references](#articles-and-references)
+ - [Note on Patches/Pull Requests](#note-on-patchespull-requests)
+
+ ## How does it work
+
+ Karafka provides a higher-level abstraction than raw Kafka Ruby drivers, such as Kafka-Ruby and Poseidon. Instead of focusing on single-topic consumption, it provides developers with a set of tools dedicated to building multi-topic applications, similar to how Rails applications are built.
+
+ ## Support
+
+ If you have any questions about using Karafka, feel free to join our [Gitter](https://gitter.im/karafka/karafka) chat channel.
+
+ ## Requirements
+
+ To use the Karafka framework, you need to have:
+
+ - Zookeeper (required by Kafka)
+ - Kafka (at least 0.9.0)
+ - Ruby (at least 2.3.0)
+
+ ## Installation
+
+ Karafka does not have a full installation shell command. To install it, please follow the steps below.
+
+ Create a directory for your project:
+
+ ```bash
+ mkdir app_dir
+ cd app_dir
+ ```
+
+ Create a **Gemfile** with Karafka:
+
+ ```ruby
+ source 'https://rubygems.org'
+
+ gem 'karafka', github: 'karafka/karafka'
+ ```
+
+ and run the Karafka install CLI task:
+
+ ```
+ bundle exec karafka install
+ ```
+
+ ## Setup
+
+ ### Application
+
+ Karafka has the following configuration options:
+
+ | Option | Required | Value type | Description |
+ |------------------------|----------|-------------------|---------------------------------------------------------------------------------------------|
+ | name | true | String | Application name |
+ | redis | true | Hash | Hash with Redis configuration options |
+ | monitor | false | Object | Monitor instance (defaults to Karafka::Monitor) |
+ | logger | false | Object | Logger instance (defaults to Karafka::Logger) |
+ | kafka.hosts | false | Array<String> | Kafka server hosts. If 1 provided, Karafka will discover the cluster structure automatically |
+
+ To apply this configuration, you need to use the *setup* method from the Karafka::App class (app.rb):
+
+ ```ruby
+ class App < Karafka::App
+   setup do |config|
+     config.kafka.hosts = %w( 127.0.0.1:9092 )
+     config.redis = {
+       url: 'redis://redis.example.com:7372/1'
+     }
+     config.name = 'my_application'
+     config.logger = MyCustomLogger.new # not required
+   end
+ end
+ ```
+
+ Note: You can use any library like [Settingslogic](https://github.com/binarylogic/settingslogic) to handle your application configuration.
+
+ ### Configurators
+
+ If you need additional configuration once the setup above is done, add a dedicated file to the config directory. It needs to inherit from Karafka::Config::Base and implement a setup method; after that, everything will happen automatically.
+
+ Example configuration class:
+
+ ```ruby
+ class ExampleConfigurator < Base
+   def setup
+     ExampleClass.logger = Karafka.logger
+     ExampleClass.redis = config.redis
+   end
+ end
+ ```
+
+ ### Environment variables settings
+
+ There are several env settings you can use:
+
+ | ENV name | Default | Description |
+ |-------------------|-----------------|-------------------------------------------------------------------------------|
+ | KARAFKA_ENV | development | In what mode this application should boot (production/development/test/etc) |
+ | KARAFKA_BOOT_FILE | app_root/app.rb | Path to a file that contains Karafka app configuration and booting procedures |
+
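Both settings are ordinary environment variables read at boot, so they can be combined on the command line. The boot file path below is purely illustrative:

```shell
# Boot the server in production mode with a non-default boot file
# (the path is just an example)
KARAFKA_ENV=production KARAFKA_BOOT_FILE=./config/karafka_app.rb bundle exec karafka server
```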
+ ### Kafka brokers auto-discovery
+
+ Karafka supports Kafka broker auto-discovery during startup and on failures. You need to provide at least one Kafka broker, from which the entire Kafka cluster will be discovered. Karafka will refresh the list of available brokers if something goes wrong, which keeps it aware of changes in the infrastructure (adding and removing nodes).
+
+ ## Usage
+
+ ### Karafka CLI
+
+ Karafka has a simple CLI built in. It provides the following commands:
+
+ | Command | Description |
+ |----------------|----------------------------------------------------------------------------|
+ | help [COMMAND] | Describe available commands or one specific command |
+ | console | Start the Karafka console (shortcut alias: "c") |
+ | flow | Print application data flow (incoming => outgoing) |
+ | info | Print configuration details and other options of your application |
+ | install | Install all required things for a Karafka application in current directory |
+ | routes | Print out all defined routes in alphabetical order |
+ | server | Start the Karafka server (shortcut alias: "s") |
+ | worker | Start the Karafka Sidekiq worker (shortcut alias: "w") |
+
+ All the commands are executed the same way:
+
+ ```
+ bundle exec karafka [COMMAND]
+ ```
+
+ If you need more details about any of the CLI commands, you can execute the following command:
+
+ ```
+ bundle exec karafka help [COMMAND]
+ ```
+
+ ### Routing
+
+ The routing engine provides an interface to describe how messages from all the topics should be handled. To start using it, just use the *draw* method on routes:
+
+ ```ruby
+ App.routes.draw do
+   topic :example do
+     controller ExampleController
+   end
+ end
+ ```
+
+ The basic route description requires providing the *topic* and the *controller* that should handle it (Karafka will create a separate controller instance for each request).
+
+ There are also several other methods available (optional):
+
+ - *group* - symbol/string with a group name. Groups are used to cluster applications
+ - *worker* - Class name - name of a worker class that we want to use to schedule perform code
+ - *parser* - Class name - name of a parser class that we want to use to parse incoming data
+ - *interchanger* - Class name - name of an interchanger class that we want to use to format data that we put/fetch into/from *#perform_async*
+ - *responder* - Class name - name of a responder that we want to use to generate responses to other Kafka topics based on our processed data
+
+ ```ruby
+ App.routes.draw do
+   topic :binary_video_details do
+     group :composed_application
+     controller Videos::DetailsController
+     worker Workers::DetailsWorker
+     parser Parsers::BinaryToJson
+     interchanger Interchangers::Binary
+     responder BinaryVideoProcessingResponder
+   end
+
+   topic :new_videos do
+     controller Videos::NewVideosController
+   end
+ end
+ ```
+
+ See the description below for more details on each of them.
+
+ ##### Topic
+
+ - *topic* - symbol/string with a topic that we want to route
+
+ ```ruby
+ topic :incoming_messages do
+   # Details about how to handle this topic should go here
+ end
+ ```
+
+ Topic is the root point of each route. Keep in mind that:
+
+ - All topic names must be unique in a single Karafka application
+ - Topic names are validated because Kafka does not accept some characters
+ - If you don't specify a group, it will be built based on the topic and application name
+
+ ##### Group
+
+ - *group* - symbol/string with a group name. Groups are used to cluster applications
+
+ Optionally you can use the **group** method to define a group for this topic. Use it if you want to build many applications that share the same Kafka group. Otherwise the group will be built based on the **topic** and application name. If you plan to load-balance messages between the processes of a single application (rather than between many different applications), you may prefer not to define the group and let the framework define it for you.
+
+ ```ruby
+ topic :incoming_messages do
+   group :load_balanced_group
+   controller MessagesController
+ end
+ ```
+
+ Note that a single group can be used only in a single topic.
+
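As a rough illustration of the fallback described above, a default group name is derived from the application name and the topic. The exact naming scheme is an internal Karafka detail and may differ from this sketch, which only shows the idea:

```ruby
# Hypothetical sketch of a default group name built from the app name and
# topic. The real format Karafka uses may differ.
app_name = 'my_application'
topic    = :incoming_messages

default_group = "#{app_name}_#{topic}"
puts default_group # => "my_application_incoming_messages"
```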
+ ##### Worker
+
+ - *worker* - Class name - name of a worker class that we want to use to schedule perform code
+
+ By default, Karafka will build a worker that corresponds to each of your controllers (so you will have a pair: a controller and a worker). All of them will inherit from **ApplicationWorker** and share all its settings.
+
+ To run Sidekiq you should have a sidekiq.yml file in the *config* folder. An example sidekiq.yml file will be generated at config/sidekiq.yml.example once you run **bundle exec karafka install**.
+
+ However, if you want to use a raw Sidekiq worker (without any additional Karafka magic), or you want to use SidekiqPro (or any other queuing engine that has the same API as Sidekiq), you can assign your own custom worker:
+
+ ```ruby
+ topic :incoming_messages do
+   controller MessagesController
+   worker MyCustomWorker
+ end
+ ```
+
+ Note that even then, you need to specify a controller that will schedule the background task.
+
+ Custom workers need to provide a **#perform_async** method. It needs to accept two arguments:
+
+ - *topic* - first argument is the current topic from which a given message comes
+ - *params* - all the params that came from Kafka plus additional metadata. This data format might be changed if you use custom interchangers. Otherwise it will be an instance of Karafka::Params::Params.
+
+ Keep in mind that params might be in two states when passed to #perform_async: parsed or unparsed. This means that if you use custom interchangers and/or custom workers, you might want to look into Karafka's sources to see exactly how it works.
+
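A minimal object satisfying that contract could look like the sketch below. The class name and the in-memory job list are purely illustrative; only the #perform_async(topic, params) signature comes from the text above:

```ruby
# Hypothetical custom worker: any object with a matching #perform_async works.
class MyCustomWorker
  # Collected jobs, stored in memory purely for illustration;
  # a real worker would enqueue to Sidekiq or a similar backend here.
  def self.jobs
    @jobs ||= []
  end

  # topic  - the topic the message came from
  # params - params from Kafka plus metadata (format depends on the interchanger)
  def self.perform_async(topic, params)
    jobs << { topic: topic, params: params }
  end
end

MyCustomWorker.perform_async(:incoming_messages, 'message' => 'hello')
MyCustomWorker.jobs.size # => 1
```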
+ ##### Parser
+
+ - *parser* - Class name - name of a parser class that we want to use to parse incoming data
+
+ By default, Karafka will parse messages with a JSON parser. If you want to change this behaviour you need to set a custom parser for each route. A parser needs to have a #parse method and raise an error that is a ::Karafka::Errors::ParserError descendant when a problem appears during the parsing process.
+
+ ```ruby
+ class XmlParser
+   class ParserError < ::Karafka::Errors::ParserError; end
+
+   def self.parse(message)
+     Hash.from_xml(message)
+   rescue REXML::ParseException
+     raise ParserError
+   end
+ end
+
+ App.routes.draw do
+   topic :binary_video_details do
+     controller Videos::DetailsController
+     parser XmlParser
+   end
+ end
+ ```
+
+ Note that a parsing failure won't stop the application flow. Instead, Karafka will assign the raw message to the :message key of params. That way you can handle the raw message inside the Sidekiq worker (you can implement error detection, etc - any "heavy" parsing logic can and should be implemented there).
+
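The fallback behaviour described above can be sketched in plain Ruby. This mimics the idea (failed parses keep the raw payload under :message), not Karafka's actual implementation:

```ruby
require 'json'

# When parsing succeeds we get a hash; when it fails, the raw payload is
# preserved under the :message key so the worker can still handle it.
def parse_or_wrap(raw)
  JSON.parse(raw)
rescue JSON::ParserError
  { message: raw }
end

parse_or_wrap('{"user_id":1}') # => {"user_id"=>1}
parse_or_wrap('not json')      # => {:message=>"not json"}
```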
+ ##### Interchanger
+
+ - *interchanger* - Class name - name of an interchanger class that we want to use to format data that we put/fetch into/from #perform_async.
+
+ Custom interchangers target issues with non-standard (binary, etc) data that we want to store when we do #perform_async. This data might be corrupted when fetched in a worker (see [this](https://github.com/karafka/karafka/issues/30) issue). With custom interchangers, you can encode/compress data before it is passed to scheduling and decode/decompress it when it gets into the worker.
+
+ **Warning**: if you decide to use slow interchangers, they might significantly slow down Karafka.
+
+ ```ruby
+ class Base64Interchanger
+   class << self
+     def load(params)
+       Base64.encode64(Marshal.dump(params))
+     end
+
+     def parse(params)
+       Marshal.load(Base64.decode64(params))
+     end
+   end
+ end
+
+ topic :binary_video_details do
+   controller Videos::DetailsController
+   interchanger Base64Interchanger
+ end
+ ```
+
+ ##### Responder
+
+ - *responder* - Class name - name of a responder that we want to use to generate responses to other Kafka topics based on our processed data.
+
+ Responders are used to design the response that should be generated and sent to proper Kafka topics once processing is done. This allows programmers to build not only data-consuming apps, but also apps that consume data and then, based on the business logic output, send the processed data onwards (similarly to how Bash pipelines work).
+
+ ```ruby
+ class Responder < ApplicationResponder
+   topic :users_created
+   topic :profiles_created
+
+   def respond(user, profile)
+     respond_to :users_created, user
+     respond_to :profiles_created, profile
+   end
+ end
+ ```
+
+ For more details about responders, please go to the [using responders](#using-responders-recommended) section.
+
+ ### Receiving messages
+
+ The Karafka framework has a long-running server process that is responsible for receiving messages.
+
+ To start the Karafka server process, use the following CLI command:
+
+ ```bash
+ bundle exec karafka server
+ ```
+
+ The Karafka server can be daemonized with the **--daemon** flag:
+
+ ```
+ bundle exec karafka server --daemon
+ ```
+
+ #### Processing messages directly (without Sidekiq)
+
+ If you don't want to use Sidekiq for processing and you would rather process messages directly in the main Karafka server process, you can do that using the *before_enqueue* callback inside the controller:
+
+ ```ruby
+ class UsersController < ApplicationController
+   before_enqueue :perform_directly
+
+   # By throwing the abort signal, Karafka will not schedule a background #perform task.
+   def perform_directly
+     User.create(params[:user])
+     throw(:abort)
+   end
+ end
+ ```
+
+ Note: doing heavy lifting this way can significantly slow Karafka down.
+
+ ### Sending messages from Karafka
+
+ When using Kafka, it's quite common to treat applications as parts of a bigger pipeline (similarly to a Bash pipeline) and forward processing results to other applications. Karafka provides two ways of dealing with that:
+
+ - Using responders
+ - Using WaterDrop directly
+
+ Each of them has its own advantages and disadvantages, and which one is better strongly depends on your application's business logic. The recommended (and far more elegant) way is to use responders.
+
+ #### Using responders (recommended)
+
+ One of the main differences between responding to a Kafka message and an HTTP response is that the response can be sent to many topics (instead of one HTTP response per request) and that the data being sent can differ between topics. That's why a simple **respond_to** would not be enough.
+
+ To go beyond this limitation, Karafka uses responder objects that are responsible for sending data to other Kafka topics.
+
+ By default, if you name a responder with the same name as a controller, it will be detected automatically:
+
+ ```ruby
+ module Users
+   class CreateController < ApplicationController
+     def perform
+       # You can provide as many objects as you want to respond_with as long as the responder's
+       # #respond method accepts the same amount
+       respond_with User.create(params[:user])
+     end
+   end
+
+   class CreateResponder < ApplicationResponder
+     topic :user_created
+
+     def respond(user)
+       respond_to :user_created, user
+     end
+   end
+ end
+ ```
+
+ The appropriate responder will be used automatically when you invoke the **respond_with** controller method.
+
+ Why did we separate the response layer from the controller layer? Because responding to multiple topics conditionally can get really complex, and that logic is far easier to manage and test in isolation.
+
+ For more details about the responders DSL, please visit the [responders](#responders) section.
+
+ #### Using WaterDrop directly
+
+ It is not recommended (as it breaks responder validations and makes it harder to track data flow), but if you want to send messages outside of Karafka responders, you can use the **waterdrop** gem directly.
+
+ Example usage:
+
+ ```ruby
+ message = WaterDrop::Message.new('topic', 'message')
+ message.send!
+
+ message = WaterDrop::Message.new('topic', { user_id: 1 }.to_json)
+ message.send!
+ ```
+
+ Please follow the [WaterDrop README](https://github.com/karafka/waterdrop/blob/master/README.md) for more details on how to use it.
+
+
+ ## Important components
+
+ Apart from the internal implementation, Karafka is composed of the following components that programmers will mostly work with:
+
+ - Controllers - objects that are responsible for processing incoming messages (similar to Rails controllers)
+ - Responders - objects that are responsible for sending responses based on the processed data
+ - Workers - objects that execute data processing using the Sidekiq backend
+
+ ### Controllers
+
+ Controllers should inherit from **ApplicationController** (or any other controller that inherits from **Karafka::BaseController**). If you don't want to use custom workers (and except in some particular cases, you don't need to), you need to define a **#perform** method that will execute your business logic code in the background.
+
+ ```ruby
+ class UsersController < ApplicationController
+   # Method execution will be enqueued in Sidekiq
+   # Karafka will automatically schedule a proper job and execute this logic in the background
+   def perform
+     User.create(params[:user])
+   end
+ end
+ ```
+
+ #### Controllers callbacks
+
+ You can add any number of *before_enqueue* callbacks. Each can be a method or a block.
+ before_enqueue acts in a similar way to Rails' before_action, so it should perform "lightweight" operations. You have access to params inside it, and based on them you can decide which data to accept and which to reject.
+
+ **Warning**: keep in mind that all *before_enqueue* blocks/methods are executed right after a message is received, not inside Sidekiq. This means that if you perform "heavy duty" operations there, Karafka might slow down significantly.
+
+ If any of the callbacks throws :abort, the *perform* method will not be enqueued to the worker (the execution chain will stop).
+
+ Once you run the consumer, messages from the Kafka server will be sent to the proper controller (based on the topic name).
+
+ The example controller presented below will accept incoming messages from a Kafka topic named :karafka_topic
+
+ ```ruby
+ class TestController < ApplicationController
+   # before_enqueue has access to received params.
+   # You can modify them before they are enqueued to the Sidekiq queue.
+   before_enqueue {
+     params.merge!(received_time: Time.now.to_s)
+   }
+
+   before_enqueue :validate_params
+
+   # Method execution will be enqueued in Sidekiq.
+   def perform
+     Service.new.add_to_queue(params[:message])
+   end
+
+   # Define this method if you want to use Sidekiq reentrancy.
+   # Logic to run if the Sidekiq worker fails (because of an exception, timeout, etc)
+   def after_failure
+     Service.new.remove_from_queue(params[:message])
+   end
+
+   private
+
+   # We will not enqueue to Sidekiq messages that came from the sum method
+   # or whose value is too low for our purposes.
+   def validate_params
+     throw(:abort) unless params['message'].to_i > 50 && params['method'] != 'sum'
+   end
+ end
+ ```
+
+ ### Responders
+
+ Responders are used to design and control the response flow that comes from a single controller action. You might be familiar with the #respond_with Rails controller method. In Karafka, it is an entry point to the responder's *#respond* method.
+
+ Having a responders layer helps you prevent bugs when you design receive-respond applications that handle multiple incoming and outgoing topics. Responders also provide a safety layer that lets you verify that the flow is as you intended. They will raise an exception if you didn't respond to all the topics that you wanted to respond to.
+
+ Here's a simple responder example:
+
+ ```ruby
+ class ExampleResponder < ApplicationResponder
+   topic :users_notified
+
+   def respond(user)
+     respond_to :users_notified, user
+   end
+ end
+ ```
+
+ Note: You can use responders outside of the controllers' scope, however it is not recommended because then they won't be listed when executing the **karafka flow** CLI command.
+
+ #### Registering topics
+
+ To maintain order in topic organization, you need to register a topic before you can send data to it. To do that, just execute the *#topic* method with a topic name and optional settings during responder initialization:
+
+ ```ruby
+ class ExampleResponder < ApplicationResponder
+   topic :regular_topic
+   topic :optional_topic, required: false
+   topic :multiple_use_topic, multiple_usage: true
+ end
+ ```
+
+ The *#topic* method accepts the following settings:
+
+ | Option | Type | Default | Description |
+ |----------------|---------|---------|------------------------------------------------------------------------------------------------------------|
+ | required | Boolean | true | Should we raise an error when a topic was not used (if required) |
+ | multiple_usage | Boolean | false | Should we raise an error when, during a single response flow, we sent more than one message to a given topic |
+
+ #### Responding on topics
+
+ When you receive a single HTTP request, you generate a single HTTP response. This logic does not apply to Karafka. You can respond to as many topics as you want (or to none).
+
+ To handle responding, you need to define a *#respond* instance method. This method should accept the same number of arguments as passed into the *#respond_with* method.
+
+ To send a message to a given topic, use the *#respond_to* method, which accepts two arguments:
+
+ - topic name (Symbol)
+ - data you want to send (if the data is not a string, the responder will try to run the #to_json method on it)
+
+ ```ruby
+ # respond_with user, profile
+
+ class ExampleResponder < ApplicationResponder
+   topic :regular_topic
+   topic :optional_topic, required: false
+
+   def respond(user, profile)
+     respond_to :regular_topic, user
+
+     if user.registered?
+       respond_to :optional_topic, profile
+     end
+   end
+ end
+ ```
+
+ #### Response validation
+
+ To ensure the dataflow is as intended, the responder validates what was sent and where, making sure that:
+
+ - Only topics that were registered were used (no typos, etc)
+ - Only a single message was sent to a topic that was registered without the **multiple_usage** flag
+ - Any topic that was registered with the **required** flag (default behavior) has been used
+
+ This is an automatic process and does not require any triggers.
+
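The three rules can be illustrated with a small plain-Ruby sketch. This is illustrative only, not Karafka's actual validator:

```ruby
# Registered topics with their settings, and a tally of how many messages
# were sent to each topic during a single #respond flow.
registered = {
  regular_topic:  { required: true,  multiple_usage: false },
  optional_topic: { required: false, multiple_usage: false }
}
used = { regular_topic: 2 } # :regular_topic was responded to twice

errors = []
# Rule 1: only registered topics may be used
used.each_key { |t| errors << "unregistered: #{t}" unless registered.key?(t) }
registered.each do |topic, opts|
  count = used.fetch(topic, 0)
  # Rule 3: required topics must be used at least once
  errors << "unused required: #{topic}" if opts[:required] && count.zero?
  # Rule 2: single usage only, unless multiple_usage was set
  errors << "multiple usage: #{topic}" if !opts[:multiple_usage] && count > 1
end

errors # => ["multiple usage: regular_topic"]
```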
+ ## Monitoring and logging
+
+ Karafka provides a simple monitor (Karafka::Monitor) with a really small API. You can use it to develop your own monitoring system (using, for example, NewRelic). By default, the only thing hooked up to this monitoring is the Karafka logger (Karafka::Logger), which is based on a standard [Ruby logger](http://ruby-doc.org/stdlib-2.2.3/libdoc/logger/rdoc/Logger.html).
+
+ To change the monitor or the logger, assign a new one during setup:
+
+ ```ruby
+ class App < Karafka::App
+   setup do |config|
+     # Other setup stuff...
+     config.logger = MyCustomLogger.new
+     config.monitor = CustomMonitor.new
+   end
+ end
+ ```
+
+ Keep in mind that if you replace the monitor with a custom one, you will have to implement logging as well, because the default monitor is responsible for both monitoring and logging.
+
### Example monitor with Errbit/Airbrake support

Here's a simple example of a monitor used to log errors into Airbrake/Errbit:

```ruby
class AppMonitor < Karafka::Monitor
  def notice_error(caller_class, e)
    super
    Airbrake.notify_or_ignore(e)
  end
end
```

### Example monitor with NewRelic support

Here's a simple example of a monitor used to log events and errors into NewRelic. It will send metrics with information about the number of processed messages per topic and how many of them were scheduled to be performed async.

```ruby
# NewRelic example monitor for Karafka
class AppMonitor < Karafka::Monitor
  # @param caller_class [Class] caller of this notice
  # @param options [Hash] hash with options for this notice
  def notice(caller_class, options = {})
    # Use default Karafka monitor logging
    super
    # Pass to NewRelic only the actions that we want to monitor
    return unless respond_to?(caller_label, true)
    send(caller_label, options[:topic])
  end

  # @param caller_class [Class] caller of this notice error
  # @param e [Exception] error that happened
  def notice_error(caller_class, e)
    super
    NewRelic::Agent.notice_error(e)
  end

  private

  # Log that a message for a given topic was consumed
  # @param topic [String] topic name
  def consume(topic)
    record_count metric_key(topic, __method__)
  end

  # Log that a message for a topic was scheduled to be performed async
  # @param topic [String] topic name
  def perform_async(topic)
    record_count metric_key(topic, __method__)
  end

  # Log that a message for a topic was performed async
  # @param topic [String] topic name
  def perform(topic)
    record_count metric_key(topic, __method__)
  end

  # @param topic [String] topic name
  # @param action [String] action that we want to log (consume/perform_async/perform)
  # @return [String] a proper metric key for NewRelic
  # @example
  #   metric_key('videos', 'perform_async') #=> 'Custom/videos/perform_async'
  def metric_key(topic, action)
    "Custom/#{topic}/#{action}"
  end

  # Records an occurrence of a given event
  # @param key [String] key under which we want to log
  def record_count(key)
    NewRelic::Agent.record_metric(key, count: 1)
  end
end
```

## Deployment

Karafka is currently being used in production with the following deployment methods:

- Capistrano
- Docker

Since the only long-running thing is the Karafka server, it shouldn't be hard to make it work with other deployment and CD tools.

### Capistrano

Use the built-in Capistrano recipe for easy Karafka server start, stop and restart with deploys.

In your **Capfile** file:

```ruby
require 'karafka/capistrano'
```

Take a look at the [load:defaults task](https://github.com/karafka/karafka/blob/master/lib/karafka/capistrano/karafka.cap) (top of the file) for options you can set. For example, to specify a pidfile other than the default:

```ruby
set :karafka_pid, -> { File.join(shared_path, 'tmp', 'pids', 'karafka0') }
```
715
+
716
+ ### Docker
717
+
718
+ Karafka can be dockerized as any other Ruby/Rails app. To execute **karafka server** command in your Docker container, just put this into your Dockerfile:
719
+
720
+ ```bash
721
+ ENV KARAFKA_ENV production
722
+ CMD bundle exec karafka server
723
+ ```

## Sidekiq Web UI

Karafka comes with a Sidekiq Web UI application that can display the current state of a Sidekiq installation. If you installed Karafka based on the install instructions, you will have a **config.ru** file that allows you to run a standalone Puma instance with the Sidekiq Web UI.

To be able to use it (since Karafka does not depend on Puma and Sinatra), add both of them to your Gemfile:

```ruby
gem 'puma'
gem 'sinatra'
```

Bundle and run:

```
bundle exec rackup
# Puma starting...
# * Min threads: 0, max threads: 16
# * Environment: development
# * Listening on tcp://localhost:9292
```

You can then navigate to the displayed URL to check your Sidekiq status. The Sidekiq Web UI is password protected by default. To check (or change) your login and password, please review the **config.ru** file in your application.
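
For reference, that password protection is typically plain Rack basic auth wrapped around the Sidekiq Web application. A simplified **config.ru** fragment could look like the sketch below; the environment variable names are placeholders, and your generated file may differ:

```ruby
# config.ru (simplified fragment; credential sources are placeholders)
require 'sidekiq/web'

Sidekiq::Web.use Rack::Auth::Basic do |username, password|
  username == ENV['SIDEKIQ_USERNAME'] && password == ENV['SIDEKIQ_PASSWORD']
end

run Sidekiq::Web
```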

## Concurrency

Karafka uses [Celluloid](https://celluloid.io/) actors to handle listening to incoming connections. Since each topic and group requires a separate connection (which means we have a connection per controller), we do this concurrently. This means that for each route, you will have one additional thread running.
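
As a rough, purely illustrative model of that fan-out (plain threads instead of Celluloid actors, with made-up route names, and not Karafka's actual code):

```ruby
# Illustrative only: one listener per route, each in its own thread,
# mirroring the "one additional thread per route" behavior described above.
routes = %i[videos_topic users_topic]

listeners = routes.map do |route|
  Thread.new do
    # In Karafka, this thread would block on the Kafka connection for
    # this route and hand incoming messages to the matching controller.
    route
  end
end

listeners.each(&:join)
```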

## Integrating with other frameworks

Want to use Karafka with Ruby on Rails or Sinatra? It can be done!

### Integrating with Ruby on Rails

Add Karafka to your Ruby on Rails application Gemfile:

```ruby
gem 'karafka', github: 'karafka/karafka'
```

Copy the **app.rb** file from your Karafka application into your Rails app (if you don't have this file, just create an empty Karafka app and copy it). This file is responsible for booting up the Karafka framework. To make it work with Ruby on Rails, you need to load the whole Rails application in this file. To do so, replace:

```ruby
ENV['RACK_ENV'] ||= 'development'
ENV['KARAFKA_ENV'] = ENV['RACK_ENV']

Bundler.require(:default, ENV['KARAFKA_ENV'])
```

with

```ruby
ENV['RAILS_ENV'] ||= 'development'
ENV['KARAFKA_ENV'] = ENV['RAILS_ENV']

require ::File.expand_path('../config/environment', __FILE__)
Rails.application.eager_load!
```

and you are ready to go!

### Integrating with Sinatra

Sinatra applications differ from one another. There are single-file applications and apps with a structure similar to Rails'. That's why we cannot provide a single, simple tutorial. Here are some guidelines that you should follow in order to integrate Karafka with a Sinatra based application:

Add Karafka to your Sinatra application Gemfile:

```ruby
gem 'karafka', github: 'karafka/karafka'
```

After that, make sure that your whole application is loaded before setting up and booting Karafka (see the Ruby on Rails integration for more details).
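
By analogy with the Rails **app.rb** changes above, a hedged sketch for a single-file Sinatra app could keep the Rack-style environment setup and simply require your application file before Karafka boots; the `application` path below is an assumption about your project layout:

```ruby
# app.rb (sketch; adjust the require path to your project layout)
ENV['RACK_ENV'] ||= 'development'
ENV['KARAFKA_ENV'] = ENV['RACK_ENV']

Bundler.require(:default, ENV['KARAFKA_ENV'])

# Load your whole Sinatra application before Karafka is set up and booted
require ::File.expand_path('../application', __FILE__)
```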
796
+
797
+ ## Articles and other references
798
+
799
+ ### Libraries and components
800
+
801
+ * [Karafka framework](https://github.com/karafka/karafka)
802
+ * [Waterdrop](https://github.com/karafka/waterdrop)
803
+ * [Worker Glass](https://github.com/karafka/worker-glass)
804
+ * [Envlogic](https://github.com/karafka/envlogic)
805
+ * [Apache Kafka](http://kafka.apache.org/)
806
+ * [Apache ZooKeeper](https://zookeeper.apache.org/)
807
+ * [Ruby-Kafka](https://github.com/zendesk/ruby-kafka)
808
+
809
+ ### Articles and references
810
+
811
+ * [Karafka – Ruby micro-framework for building Apache Kafka message-based applications](http://dev.mensfeld.pl/2015/08/karafka-ruby-micro-framework-for-building-apache-kafka-message-based-applications/)
812
+ * [Benchmarking Karafka – how does it handle multiple TCP connections](http://dev.mensfeld.pl/2015/11/benchmarking-karafka-how-does-it-handle-multiple-tcp-connections/)
813
+ * [Karafka – Ruby framework for building Kafka message based applications (presentation)](http://mensfeld.github.io/karafka-framework-introduction/)
814
+ * [Karafka example application](https://github.com/karafka/karafka-example-app)
815
+ * [Karafka Travis CI](https://travis-ci.org/karafka/karafka)
816
+ * [Karafka Code Climate](https://codeclimate.com/github/karafka/karafka)
817
+
818
+ ## Note on Patches/Pull Requests
819
+
820
+ Fork the project.
821
+ Make your feature addition or bug fix.
822
+ Add tests for it. This is important so I don't break it in a future version unintentionally.
823
+ Commit, do not mess with Rakefile, version, or history. (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull). Send me a pull request. Bonus points for topic branches.
824
+
825
+ Each pull request must pass our quality requirements. To check if everything is as it should be, we use [PolishGeeks Dev Tools](https://github.com/polishgeeks/polishgeeks-dev-tools) that combine multiple linters and code analyzers. Please run:
826
+
827
+ ```bash
828
+ bundle exec rake
829
+ ```
830
+
831
+ to check if everything is in order. After that you can submit a pull request.