neoid 0.0.51 → 0.1

@@ -1,3 +1,38 @@
1
+ ## v0.1
2
+
3
+ * Added batch support, for much faster initialization of the current DB or reindexing of the whole DB.
4
+ * Dropped indexes per model, instead, using `node_auto_index` and `relationship_auto_index`, letting Neo4j auto index objects.
5
+ * One `neo_save` method instead of `neo_create` and `neo_update`. It takes care of inserting or updating.
6
+
7
+ ### Breaking changes:
8
+
9
+ Model indexes (such as `users_index`) are now turned off by default. Instead, Neoid uses Neo4j's auto indexing feature.
10
+
11
+ In order to have the model indexes back, use this in your configuration:
12
+
13
+ ```ruby
14
+ Neoid.configure do |c|
15
+ c.enable_per_model_indexes = true
16
+ end
17
+ ```
18
+
19
+ This turns per-model indexes on for all models.
20
+
21
+ You can turn the index off for a specific model with:
22
+
23
+ ```ruby
24
+ class User < ActiveRecord::Base
25
+ include Neoid::Node
26
+
27
+ neoidable enable_model_index: false do |c|
28
+ end
29
+ end
30
+ ```
31
+
32
+ ## v0.0.51
33
+
34
+ * Releasing Neoid as a gem.
35
+
1
36
  ## v0.0.41
2
37
 
3
38
  * fixed really annoying bug caused by Rails design -- Rails doesn't call `after_destroy` when assigning many to many relationships to a model, like `user.movies = [m1, m2, m3]` or `user.update_attributes(params[:user])` where it contains `params[:user][:movie_ids]` list (say from checkboxes), but it DOES CALL after_create for the new relationships. the fix adds after_remove callback to the has_many relationships, ensuring neo4j is up to date with all changes, no matter how they were committed
data/README.md CHANGED
@@ -3,7 +3,6 @@
3
3
  [![Build Status](https://secure.travis-ci.org/elado/neoid.png)](http://travis-ci.org/elado/neoid)
4
4
 
5
5
 
6
-
7
6
  Make your ActiveRecords stored and searchable on Neo4j graph database, in order to make fast graph queries that MySQL would crawl while doing them.
8
7
 
9
8
  Neoid to Neo4j is like Sunspot to Solr. You get the benefits of Neo4j speed while keeping your schema on your plain old RDBMS.
@@ -12,6 +11,11 @@ Neoid doesn't require JRuby. It's based on the great [Neography](https://github.
12
11
 
13
12
  Neoid offers querying Neo4j for IDs of objects and then fetch them from your RDBMS, or storing all desired data on Neo4j.
14
13
 
14
+ **Important: Heroku support is not available because Heroku's Neo4j add-on doesn't support Gremlin. Until further notice, the easiest way is to self-host a Neo4j instance on EC2 in the same zone and connect to it from your dyno.**
15
+
16
+ ## Changelog
17
+
18
+ [See Changelog](https://github.com/elado/neoid/blob/master/CHANGELOG.md)
15
19
 
16
20
 
17
21
  ## Installation
@@ -19,11 +23,9 @@ Neoid offers querying Neo4j for IDs of objects and then fetch them from your RDB
19
23
  Add to your Gemfile and run the `bundle` command to install it.
20
24
 
21
25
  ```ruby
22
- gem 'neoid', '~> 0.0.51'
26
+ gem 'neoid', '~> 0.1'
23
27
  ```
24
28
 
25
- Future versions may have breaking changes but will arrive with migration code.
26
-
27
29
  **Requires Ruby 1.9.2 or later.**
28
30
 
29
31
  ## Usage
@@ -51,6 +53,11 @@ Neography.configure do |c|
51
53
  end
52
54
 
53
55
  Neoid.db = $neo
56
+
57
+ Neoid.configure do |c|
58
+ # should Neoid create sub-reference from the ref node (id#0) to every node-model? default: true
59
+ c.enable_subrefs = true
60
+ end
54
61
  ```
55
62
 
56
63
  `01_` in the file name is in order to get this file loaded first, before the models (initializers are loaded alphabetically).
@@ -71,9 +78,9 @@ class User < ActiveRecord::Base
71
78
  end
72
79
  ```
73
80
 
74
- This will help to create a corresponding node on Neo4j when a user is created, delete it when a user is destroyed, and update it if needed.
81
+ This will help to create/update/destroy a corresponding node on Neo4j when changes are made to a User model.
75
82
 
76
- Then, you can customize what fields will be saved on the node in Neo4j, inside neoidable configuration:
83
+ Then, you can customize what fields will be saved on the node in Neo4j, inside `neoidable` configuration, using `field`. You can also pass blocks to save content that's not a real column:
77
84
 
78
85
  ```ruby
79
86
  class User < ActiveRecord::Base
@@ -89,7 +96,6 @@ class User < ActiveRecord::Base
89
96
  end
90
97
  ```
91
98
 
92
-
93
99
  #### Relationships
94
100
 
95
101
  Let's assume that a `User` can `Like` `Movie`s:
@@ -151,7 +157,7 @@ class Like < ActiveRecord::Base
151
157
  end
152
158
  ```
153
159
 
154
- Neoid adds `neo_node` and `neo_relationships` to nodes and relationships, respectively.
160
+ Neoid adds the methods `neo_node` and `neo_relationships` to instances of nodes and relationships, respectively.
155
161
 
156
162
  So you could do:
157
163
 
@@ -169,38 +175,52 @@ rel.end_node # user.movies.first.neo_node
169
175
  rel.rel_type # 'likes'
170
176
  ```
171
177
 
172
- ## Index for Full-Text Search
178
+ #### Disabling auto saving to Neo4j:
173
179
 
174
- Using `search` block inside a `neoidable` block, you can store certain fields.
180
+ If you'd like to save nodes manually rather than after_save, use `auto_index: false`:
175
181
 
176
182
  ```ruby
177
- # movie.rb
178
-
179
- class Movie < ActiveRecord::Base
183
+ class User < ActiveRecord::Base
180
184
  include Neoid::Node
181
-
182
- neoidable do |c|
183
- c.field :slug
184
- c.field :name
185
-
186
- c.search do |s|
187
- # full-text index fields
188
- s.fulltext :name
189
- s.fulltext :description
190
-
191
- # just index for exact matches
192
- s.index :year
193
- end
185
+
186
+ neoidable auto_index: false do |c|
194
187
  end
195
188
  end
196
- ```
197
189
 
198
- Records will be automatically indexed when inserted or updated.
190
+ user = User.create!(name: "Elad") # no node is created in Neo4j!
191
+
192
+ user.neo_save # now there is!
193
+ ```
199
194
 
200
195
  ## Querying
201
196
 
202
197
  You can query with all [Neography](https://github.com/maxdemarzi/neography)'s API: `traverse`, `execute_query` for Cypher, and `execute_script` for Gremlin.
203
198
 
199
+ ### Basics:
200
+
201
+ #### Finding a node by ID
202
+
203
+ Nodes and relationships are auto indexed in the `node_auto_index` and `relationship_auto_index` indexes, where the key is `Neoid::UNIQUE_ID_KEY` (which is 'neoid_unique_id') and the value is a combination of the class name and model id, such as `Movie:43`. This value is accessible with `model.neo_unique_id`. Use the constant and this method; never rely on assembling those values on your own, because they might change in the future.
204
+
205
+ That means, you can query like this:
206
+
207
+ ```ruby
208
+ Neoid.db.get_node_auto_index(Neoid::UNIQUE_ID_KEY, user.neo_unique_id)
209
+ # => returns a Neography hash
210
+
211
+ Neoid::Node.from_hash(Neoid.db.get_node_auto_index(Neoid::UNIQUE_ID_KEY, user.neo_unique_id))
212
+ # => returns a Neography::Node
213
+ ```
214
+
215
+ #### Finding all nodes of type
216
+
217
+ If Subreferences are enabled, you can get the subref node and then get all attached nodes:
218
+
219
+ ```ruby
220
+ Neoid.ref_node.outgoing('users_subref').first.outgoing('users_subref').to_a
221
+ # => this, according to Neography, returns an array of Neography::Node so no conversion is needed
222
+ ```
223
+
204
224
  ### Gremlin Example:
205
225
 
206
226
  These examples query Neo4j using Gremlin for IDs of objects, and then fetches them from ActiveRecord with an `in` query.
@@ -208,7 +228,7 @@ These examples query Neo4j using Gremlin for IDs of objects, and then fetches th
208
228
  Of course, you can store using the `neoidable do |c| c.field ... end` all the data you need in Neo4j and avoid querying ActiveRecord.
209
229
 
210
230
 
211
- **Most popular categories**
231
+ **Most liked movies**
212
232
 
213
233
  ```ruby
214
234
  gremlin_query = <<-GREMLIN
@@ -228,15 +248,18 @@ movie_ids = Neoid.db.execute_script(gremlin_query)
228
248
  Movie.where(id: movie_ids)
229
249
  ```
230
250
 
231
- Assuming we have another `Friendship` model which is a relationship with start/end nodes of `user` and type of `friends`,
251
+ *Side note: the resulting movies won't be sorted by like count, because the RDBMS doesn't necessarily return rows in the order of the IDs we passed. You can sort them yourself with array manipulation, since you have the IDs.*
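The reordering mentioned in the side note can be sketched in plain Ruby. This is only an illustration: `SketchMovie` and `sort_by_id_order` are hypothetical names standing in for an ActiveRecord model and a helper of your own, and neither is part of Neoid.

```ruby
# Hypothetical stand-in for an ActiveRecord model; only `id` matters here.
SketchMovie = Struct.new(:id, :name)

# Returns `records` reordered to follow `ids` (e.g. the ID list Gremlin returned).
# IDs with no matching record are skipped.
def sort_by_id_order(records, ids)
  by_id = records.each_with_object({}) { |record, index| index[record.id] = record }
  ids.map { |id| by_id[id] }.compact
end

# The RDBMS may return rows in any order, for example by primary key:
fetched = [SketchMovie.new(1, "Alien"), SketchMovie.new(2, "Brazil"), SketchMovie.new(3, "Clue")]
sorted = sort_by_id_order(fetched, [3, 1, 2])
puts sorted.map(&:name).inspect # prints ["Clue", "Alien", "Brazil"]
```

Building the lookup hash once keeps the reordering linear in the number of records, instead of searching the array for every ID.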
252
+
232
253
 
233
254
  **Movies of user friends that the user doesn't have**
234
255
 
256
+ Let's assume we have another `Friendship` model, which is a relationship with start/end nodes of `user` and a type of `friends`:
257
+
235
258
  ```ruby
236
259
  user = User.find(1)
237
260
 
238
261
  gremlin_query = <<-GREMLIN
239
- u = g.idx('users_index')[[ar_id:user_id]].next()
262
+ u = g.idx('node_auto_index').get(unique_id_key, user_unique_id).next()
240
263
  movies = []
241
264
 
242
265
  u
@@ -246,15 +269,42 @@ gremlin_query = <<-GREMLIN
246
269
  .except(movies).collect{it.ar_id}
247
270
  GREMLIN
248
271
 
249
- movie_ids = Neoid.db.execute_script(gremlin_query, user_id: user.id)
272
+ movie_ids = Neoid.db.execute_script(gremlin_query, unique_id_key: Neoid::UNIQUE_ID_KEY, user_unique_id: user.neo_unique_id)
250
273
 
251
274
  Movie.where(id: movie_ids)
252
275
  ```
253
276
 
254
- `.next()` is in order to get a vertex object which we can actually query on.
277
+ ## Full Text Search
255
278
 
279
+ ### Index for Full-Text Search
256
280
 
257
- ### Full Text Search
281
+ Using `search` block inside a `neoidable` block, you can store certain fields.
282
+
283
+ ```ruby
284
+ # movie.rb
285
+
286
+ class Movie < ActiveRecord::Base
287
+ include Neoid::Node
288
+
289
+ neoidable do |c|
290
+ c.field :slug
291
+ c.field :name
292
+
293
+ c.search do |s|
294
+ # full-text index fields
295
+ s.fulltext :name
296
+ s.fulltext :description
297
+
298
+ # just index for exact matches
299
+ s.index :year
300
+ end
301
+ end
302
+ end
303
+ ```
304
+
305
+ Records will be automatically indexed when inserted or updated.
306
+
307
+ ### Querying a Full-Text Search index
258
308
 
259
309
  ```ruby
260
310
  # will match all movies with full-text match for name/description. returns ActiveRecord instances
@@ -270,14 +320,63 @@ Neoid.neo_search([Movie, User], "hello")
270
320
  Movie.neo_search(year: 2013).results
271
321
  ```
272
322
 
323
+ Full text search with Neoid is very limited and is unlikely to grow beyond this basic functionality. I strongly recommend using gems like Sunspot over Solr.
324
+
325
+ ## Batches
326
+
327
+ Neoid supports batches, which are useful for mass updating/inserting of nodes/relationships. It sends batched requests to Neography, and takes care of type conversion (a Neography batch returns hashes and other primitive types) and "after" actions (via promises).
328
+
329
+ A few examples, easy to complex:
330
+
331
+ ```ruby
332
+ Neoid.batch(batch_size: 100) do
333
+ User.all.each(&:neo_save)
334
+ end
335
+ ```
336
+ With `then`:
337
+
338
+ ```ruby
339
+ User.first.name # => "Elad"
340
+
341
+ Neoid.batch(batch_size: 100) do
342
+ User.all.each(&:neo_save)
343
+ end.then do |results|
344
+ # results is an array of the script results from neo4j REST.
345
+
346
+ results[0].name # => "Elad"
347
+ end
348
+ ```
349
+
350
+ *Nodes and relationships in the results are automatically converted to Neography::Node and Neography::Relationship, respectively.*
351
+
352
+ With individual `then` as well as `then` for the entire batch:
353
+
354
+ ```ruby
355
+ Neoid.batch(batch_size: 30) do |batch|
356
+ (1..90).each do |i|
357
+ (batch << [:create_node, { name: "Hello #{i}" }]).then { |result| puts result.name }
358
+ end
359
+ end.then do |results|
360
+ puts results.collect(&:name)
361
+ end
362
+ ```
363
+
364
+ When in a batch, `neo_save` adds gremlin scripts to a batch, instead of running them immediately. The batch flushes whenever the `batch_size` option is met.
365
+ So even if you have 20000 users, Neoid will insert/update in smaller batches. Default `batch_size` is 200.
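The flushing behaviour described above can be illustrated with a toy class. This is a sketch, not Neoid's `Neoid::Batch`: `SketchBatch` and its executor block are hypothetical, and the example only shows how queued commands are sent in chunks of `batch_size`.

```ruby
# Hypothetical illustration of size-based flushing; not Neoid's implementation.
class SketchBatch
  attr_reader :flushes

  # The block plays the role of the server call: it receives an array of
  # queued commands and returns an array of results.
  def initialize(batch_size: 200, &executor)
    @batch_size = batch_size
    @executor = executor
    @pending = []
    @results = []
    @flushes = 0
  end

  # Queue a command and flush as soon as batch_size commands are pending.
  def <<(command)
    @pending << command
    flush if @pending.size >= @batch_size
    self
  end

  # Send whatever is pending in one request and collect the results.
  def flush
    return if @pending.empty?
    @results.concat(@executor.call(@pending))
    @pending = []
    @flushes += 1
  end

  # Flush the remainder (if any) and return all results in insertion order.
  def results
    flush
    @results
  end
end

batch = SketchBatch.new(batch_size: 30) { |commands| commands.map(&:upcase) }
(1..90).each { |i| batch << "hello #{i}" }
puts batch.flushes      # prints 3 (one flush per 30 commands)
puts batch.results.size # prints 90
```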
366
+
367
+
273
368
  ## Inserting records of existing app
274
369
 
275
- If you have an existing database and just want to integrate Neoid, configure the `neoidable`s and run in a rake task or console
370
+ If you have an existing database and just want to integrate Neoid, configure the `neoidable`s and run in a rake task or console.
371
+
372
+ Use batches! It's free, and much faster. Also, you should use `includes` to eager-load the relationship edges on relationship entities, so it doesn't query the DB for each relationship.
276
373
 
277
374
  ```ruby
278
- [ Like.includes(:user).includes(:movie), OtherRelationshipModel ].each { |model| model.all.each(&:neo_update) }
375
+ Neoid.batch do
376
+ [ Like.includes(:user).includes(:movie), OtherRelationshipModel.includes(:from_model).includes(:to_model) ].each { |model| model.all.each(&:neo_save) }
279
377
 
280
- NodeModel.all.each(&:neo_update)
378
+ NodeModel.all.each(&:neo_save)
379
+ end
281
380
  ```
282
381
 
283
382
  This will loop through all of your relationship records and generate the two edge nodes along with a relationship (eager loading for better performance).
@@ -289,30 +388,32 @@ Better interface for that in the future.
289
388
 
290
389
  ## Behind The Scenes
291
390
 
292
- Whenever the `neo_node` on nodes or `neo_relationship` on relationships is called, Neoid checks if there's a corresponding node/relationship in Neo4j. If not, it does the following:
391
+ Whenever the `neo_node` on nodes or `neo_relationship` on relationships is called, Neoid checks if there's a corresponding node/relationship in Neo4j (with the auto indexes). If not, it does the following:
293
392
 
294
393
  ### For Nodes:
295
394
 
296
- 1. Ensures there's a sub reference node (read [here](http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-index.html) about sub reference nodes)
395
+ 1. Ensures there's a sub reference node (read [here](http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-index.html) about sub references), if that option is on.
297
396
  2. Creates a node based on the ActiveRecord, with the `id` attribute and all other attributes from `neoidable`'s field list
298
397
  3. Creates a relationship between the sub reference node and the newly created node
299
- 4. Adds the ActiveRecord `id` to a node index, pointing to the Neo4j node id, for fast lookup in the future
398
+ 4. Auto indexes a node in the auto index, for fast lookup in the future
300
399
 
301
- Then, when it needs to find it again, it just seeks the node index with that ActiveRecord id for its neo node id.
400
+ Then, when it needs to find it again, it just seeks the auto index with that ActiveRecord id.
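The find-or-create flow for nodes can be sketched with an in-memory stand-in. Everything here is an assumption made for illustration: `SketchGraph` is hypothetical, a Hash plays the role of the auto index, and no Neography or Neo4j call is made. Sub reference nodes are left out.

```ruby
# Hypothetical in-memory stand-in for Neo4j plus its node auto index.
class SketchGraph
  UNIQUE_ID_KEY = 'neoid_unique_id' # the key name mentioned in the README

  attr_reader :nodes

  def initialize
    @nodes = []
    @auto_index = {} # unique id value => node
  end

  # Looks the node up in the index first; creates and indexes it only if missing.
  def find_or_create_node(unique_id, properties = {})
    @auto_index[unique_id] ||= begin
      node = properties.merge(UNIQUE_ID_KEY => unique_id)
      @nodes << node
      node
    end
  end
end

graph = SketchGraph.new
first = graph.find_or_create_node('User:1', 'name' => 'Elad')
again = graph.find_or_create_node('User:1', 'name' => 'ignored')
puts first.equal?(again) # prints true: the second call hit the index
puts graph.nodes.size    # prints 1
```

The literal `'User:1'` is sample data only; in a real app the value should come from `neo_unique_id`, as the README advises.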
302
401
 
303
402
  ### For Relationships:
304
403
 
305
- Like Nodes, it uses an index (relationship index) to look up a relationship by ActiveRecord id
404
+ Like nodes, relationships use an auto index to look up a relationship by ActiveRecord id.
306
405
 
307
406
  1. With the options passed in the `neoidable`, it fetches the `start_node` and `end_node`
308
407
  2. Then, it calls `neo_node` on both, in order to create the Neo4j nodes if they're not created yet, and creates the relationship with the type from the options.
309
- 3. Add the relationship to the relationship index.
408
+ 3. Adds the relationship to the relationship index.
310
409
 
311
410
  ## Testing
312
411
 
313
412
  In order to test your app or this gem, you need a running Neo4j database, dedicated to tests.
314
413
 
315
- I use port 7574 for this. To run another database locally:
414
+ I use port 7574 for testing.
415
+
416
+ To run another database locally (read [here](http://docs.neo4j.org/chunked/1.9.M03/server-installation.html#_multiple_server_instances_on_one_machine) too):
316
417
 
317
418
  Copy the entire Neo4j database folder to a different location,
318
419
 
@@ -344,7 +445,7 @@ end
344
445
 
345
446
  ## Testing This Gem
346
447
 
347
- Just run `rake` from the gem folder.
448
+ Run the Neo4j DB on port 7574, and run `rake` from the gem folder.
348
449
 
349
450
  ## Contributing
350
451
 
@@ -356,9 +457,9 @@ Please create a [new issue](https://github.com/elado/neoid/issues) if you run in
356
457
  Unfortunately, as for now, Neo4j add-on on Heroku doesn't support Gremlin. Therefore, this gem won't work on Heroku's add on. You should self-host a Neo4j instance on an EC2 or any other server.
357
458
 
358
459
 
359
- ## To Do
460
+ ## TO DO
360
461
 
361
- [To Do](https://github.com/elado/neoid/blob/master/TODO.md)
462
+ [TO DO](https://github.com/elado/neoid/blob/master/TODO.md)
362
463
 
363
464
 
364
465
  ---
data/TODO.md CHANGED
@@ -1,6 +1,4 @@
1
1
  # Neoid - To Do
2
2
 
3
- * Allow to disable sub reference nodes through options
4
3
  * Execute queries/scripts from model and not Neography (e.g. `Movie.neo_gremlin(gremlin_query)` with query that outputs IDs, returns a list of `Movie`s)
5
4
  * Rake task to index all nodes and relationships in Neo4j
6
- * Test update node
@@ -1,24 +1,27 @@
1
+ require 'neography'
1
2
  require 'neoid/version'
3
+ require 'neoid/config'
2
4
  require 'neoid/model_config'
3
5
  require 'neoid/model_additions'
4
6
  require 'neoid/search_session'
5
7
  require 'neoid/node'
6
8
  require 'neoid/relationship'
9
+ require 'neoid/batch'
7
10
  require 'neoid/database_cleaner'
8
11
  require 'neoid/railtie' if defined?(Rails)
9
12
 
10
13
  module Neoid
11
- DEFAULT_FULLTEXT_SEARCH_INDEX_NAME = 'neoid_default_search_index'
14
+ DEFAULT_FULLTEXT_SEARCH_INDEX_NAME = :neoid_default_search_index
15
+ NODE_AUTO_INDEX_NAME = 'node_auto_index'
16
+ RELATIONSHIP_AUTO_INDEX_NAME = 'relationship_auto_index'
17
+ UNIQUE_ID_KEY = 'neoid_unique_id'
12
18
 
13
19
  class << self
14
20
  attr_accessor :db
15
21
  attr_accessor :logger
16
22
  attr_accessor :ref_node
17
23
  attr_accessor :env_loaded
18
-
19
- def models
20
- @models ||= []
21
- end
24
+ attr_reader :config
22
25
 
23
26
  def node_models
24
27
  @node_models ||= []
@@ -29,20 +32,42 @@ module Neoid
29
32
  end
30
33
 
31
34
  def config
32
- @config ||= {}
35
+ @config ||= begin
36
+ c = Neoid::Config.new
37
+
38
+ # default
39
+ c.enable_subrefs = true
40
+ c.enable_per_model_indexes = false
41
+
42
+ c
43
+ end
44
+ end
45
+
46
+ def configure
47
+ yield config
33
48
  end
34
49
 
35
50
  def initialize_all
36
51
  @env_loaded = true
37
- relationship_models.each do |rel_model|
38
- Relationship.initialize_relationship(rel_model)
39
- end
52
+ logger.info "Neoid initialize_all"
53
+ initialize_relationships
54
+ initialize_server
55
+ end
56
+
57
+ def initialize_server
58
+ initialize_auto_index
59
+ initialize_subrefs
60
+ initialize_per_model_indexes
40
61
  end
41
62
 
42
63
  def db
43
64
  raise "Must set Neoid.db with a Neography::Rest instance" unless @db
44
65
  @db
45
66
  end
67
+
68
+ def batch(options={}, &block)
69
+ Neoid::Batch.new(options, &block).run
70
+ end
46
71
 
47
72
  def logger
48
73
  @logger ||= Logger.new(ENV['NEOID_LOG'] ? ENV['NEOID_LOG_FILE'] || $stdout : '/dev/null')
@@ -53,10 +78,7 @@ module Neoid
53
78
  end
54
79
 
55
80
  def reset_cached_variables
56
- Neoid.models.each do |klass|
57
- klass.instance_variable_set(:@_neo_subref_node, nil)
58
- end
59
- $neo_ref_node = nil
81
+ initialize_subrefs
60
82
  end
61
83
 
62
84
  def clean_db(confirm)
@@ -83,6 +105,19 @@ module Neoid
83
105
  self.enabled = old
84
106
  end
85
107
 
108
+ def execute_script_or_add_to_batch(gremlin_query, script_vars)
109
+ if Neoid::Batch.current_batch
110
+ # returns a SingleResultPromiseProxy!
111
+ Neoid::Batch.current_batch << [:execute_script, gremlin_query, script_vars]
112
+ else
113
+ value = Neoid.db.execute_script(gremlin_query, script_vars)
114
+
115
+ value = yield(value) if block_given?
116
+
117
+ Neoid::BatchPromiseProxy.new(value)
118
+ end
119
+ end
120
+
86
121
  # create a fulltext index if not exists
87
122
  def ensure_default_fulltext_search_index
88
123
  Neoid.db.create_node_index(DEFAULT_FULLTEXT_SEARCH_INDEX_NAME, 'fulltext', 'lucene') unless (indexes = Neoid.db.list_node_indexes) && indexes[DEFAULT_FULLTEXT_SEARCH_INDEX_NAME]
@@ -155,5 +190,47 @@ module Neoid
155
190
 
156
191
  "(" + term.split(/\s+/).reject(&:empty?).map{ |t| "#{field}#{fulltext}:#{sanitize_term(t)}" }.join(" AND ") + ")"
157
192
  end
193
+
194
+ def initialize_relationships
195
+ logger.info "Neoid initialize_relationships"
196
+ relationship_models.each do |rel_model|
197
+ Relationship.initialize_relationship(rel_model)
198
+ end
199
+ end
200
+
201
+ def initialize_auto_index
202
+ logger.info "Neoid initialize_auto_index"
203
+ Neoid.db.set_node_auto_index_status(true)
204
+ Neoid.db.add_node_auto_index_property(UNIQUE_ID_KEY)
205
+
206
+ Neoid.db.set_relationship_auto_index_status(true)
207
+ Neoid.db.add_relationship_auto_index_property(UNIQUE_ID_KEY)
208
+ end
209
+
210
+ def initialize_subrefs
211
+ return unless config.enable_subrefs
212
+
213
+ node_models.each do |klass|
214
+ klass.reset_neo_subref_node
215
+ end
216
+
217
+ logger.info "Neoid initialize_subrefs"
218
+ batch do
219
+ node_models.each(&:neo_subref_node)
220
+ end.then do |results|
221
+ node_models.zip(results).each do |klass, subref|
222
+ klass.neo_subref_node = subref
223
+ end
224
+ end
225
+ end
226
+
227
+ def initialize_per_model_indexes
228
+ return unless config.enable_per_model_indexes
229
+
230
+ logger.info "Neoid initialize_per_model_indexes"
231
+ batch do
232
+ node_models.each(&:neo_model_index)
233
+ end
234
+ end
158
235
  end
159
236
  end