neoid 0.0.51 → 0.1

@@ -1,3 +1,38 @@
1
+ ## v0.1
2
+
3
+ * Added batch support, for much faster initialization of an existing DB or reindexing of the whole DB.
4
+ * Dropped per-model indexes in favor of `node_auto_index` and `relationship_auto_index`, letting Neo4j auto index objects.
5
+ * One `neo_save` method instead of `neo_create` and `neo_update`. It takes care of inserting or updating.
6
+
7
+ ### Breaking changes:
8
+
9
+ Model indexes (such as `users_index`) are now turned off by default. Instead, Neoid uses Neo4j's auto indexing feature.
10
+
11
+ In order to have the model indexes back, use this in your configuration:
12
+
13
+ ```ruby
14
+ Neoid.configure do |c|
15
+ c.enable_per_model_indexes = true
16
+ end
17
+ ```
18
+
19
+ This turns the model indexes on for all models.
20
+
21
+ You can turn the index off for a specific model with:
22
+
23
+ ```ruby
24
+ class User < ActiveRecord::Base
25
+ include Neoid::Node
26
+
27
+ neoidable enable_model_index: false do |c|
28
+ end
29
+ end
30
+ ```
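One way to read the two settings above is that a per-model option, when given, wins over the global default. The sketch below only illustrates that precedence rule; `model_index_enabled?` is a hypothetical helper and not part of Neoid, and how Neoid actually resolves the two options is an assumption here.

```ruby
# Hypothetical helper, not part of Neoid: a per-model option, when present,
# takes precedence over the global default.
def model_index_enabled?(global_default, model_options)
  model_options.fetch(:enable_model_index, global_default)
end

global = true # as if `c.enable_per_model_indexes = true` had been configured

inherits_default = model_index_enabled?(global, {})
opted_out = model_index_enabled?(global, { enable_model_index: false })

puts inherits_default
puts opted_out
```

Using `fetch` with a default, rather than `||`, keeps an explicit `false` from being mistaken for "not set".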
31
+
32
+ ## v0.0.51
33
+
34
+ * Releasing Neoid as a gem.
35
+
1
36
  ## v0.0.41
2
37
 
3
38
  * fixed really annoying bug caused by Rails design -- Rails doesn't call `after_destroy` when assigning many to many relationships to a model, like `user.movies = [m1, m2, m3]` or `user.update_attributes(params[:user])` where it contains `params[:user][:movie_ids]` list (say from checkboxes), but it DOES CALL after_create for the new relationships. the fix adds after_remove callback to the has_many relationships, ensuring neo4j is up to date with all changes, no matter how they were committed
data/README.md CHANGED
@@ -3,7 +3,6 @@
3
3
  [![Build Status](https://secure.travis-ci.org/elado/neoid.png)](http://travis-ci.org/elado/neoid)
4
4
 
5
5
 
6
-
7
6
  Make your ActiveRecords stored and searchable on Neo4j graph database, in order to make fast graph queries that MySQL would crawl while doing them.
8
7
 
9
8
  Neoid to Neo4j is like Sunspot to Solr. You get the benefits of Neo4j speed while keeping your schema on your plain old RDBMS.
@@ -12,6 +11,11 @@ Neoid doesn't require JRuby. It's based on the great [Neography](https://github.
12
11
 
13
12
  Neoid offers querying Neo4j for IDs of objects and then fetch them from your RDBMS, or storing all desired data on Neo4j.
14
13
 
14
+ **Important: Heroku support is not available because Heroku doesn't support Gremlin. Until further notice, the easiest way is to self-host Neo4j on EC2 in the same zone and connect to it from your dyno.**
15
+
16
+ ## Changelog
17
+
18
+ [See Changelog](https://github.com/elado/neoid/blob/master/CHANGELOG.md)
15
19
 
16
20
 
17
21
  ## Installation
@@ -19,11 +23,9 @@ Neoid offers querying Neo4j for IDs of objects and then fetch them from your RDB
19
23
  Add to your Gemfile and run the `bundle` command to install it.
20
24
 
21
25
  ```ruby
22
- gem 'neoid', '~> 0.0.51'
26
+ gem 'neoid', '~> 0.1'
23
27
  ```
24
28
 
25
- Future versions may have breaking changes but will arrive with migration code.
26
-
27
29
  **Requires Ruby 1.9.2 or later.**
28
30
 
29
31
  ## Usage
@@ -51,6 +53,11 @@ Neography.configure do |c|
51
53
  end
52
54
 
53
55
  Neoid.db = $neo
56
+
57
+ Neoid.configure do |c|
58
+ # should Neoid create sub-reference from the ref node (id#0) to every node-model? default: true
59
+ c.enable_subrefs = true
60
+ end
54
61
  ```
55
62
 
56
63
  `01_` in the file name is in order to get this file loaded first, before the models (initializers are loaded alphabetically).
@@ -71,9 +78,9 @@ class User < ActiveRecord::Base
71
78
  end
72
79
  ```
73
80
 
74
- This will help to create a corresponding node on Neo4j when a user is created, delete it when a user is destroyed, and update it if needed.
81
+ This will help to create/update/destroy a corresponding node on Neo4j when changes are made to a User model.
75
82
 
76
- Then, you can customize what fields will be saved on the node in Neo4j, inside neoidable configuration:
83
+ Then, you can customize what fields will be saved on the node in Neo4j, inside `neoidable` configuration, using `field`. You can also pass blocks to save content that's not a real column:
77
84
 
78
85
  ```ruby
79
86
  class User < ActiveRecord::Base
@@ -89,7 +96,6 @@ class User < ActiveRecord::Base
89
96
  end
90
97
  ```
91
98
 
92
-
93
99
  #### Relationships
94
100
 
95
101
  Let's assume that a `User` can `Like` `Movie`s:
@@ -151,7 +157,7 @@ class Like < ActiveRecord::Base
151
157
  end
152
158
  ```
153
159
 
154
- Neoid adds `neo_node` and `neo_relationships` to nodes and relationships, respectively.
160
+ Neoid adds the methods `neo_node` and `neo_relationships` to instances of nodes and relationships, respectively.
155
161
 
156
162
  So you could do:
157
163
 
@@ -169,38 +175,52 @@ rel.end_node # user.movies.first.neo_node
169
175
  rel.rel_type # 'likes'
170
176
  ```
171
177
 
172
- ## Index for Full-Text Search
178
+ #### Disabling auto saving to Neo4j:
173
179
 
174
- Using `search` block inside a `neoidable` block, you can store certain fields.
180
+ If you'd like to save nodes manually rather than after_save, use `auto_index: false`:
175
181
 
176
182
  ```ruby
177
- # movie.rb
178
-
179
- class Movie < ActiveRecord::Base
183
+ class User < ActiveRecord::Base
180
184
  include Neoid::Node
181
-
182
- neoidable do |c|
183
- c.field :slug
184
- c.field :name
185
-
186
- c.search do |s|
187
- # full-text index fields
188
- s.fulltext :name
189
- s.fulltext :description
190
-
191
- # just index for exact matches
192
- s.index :year
193
- end
185
+
186
+ neoidable auto_index: false do |c|
194
187
  end
195
188
  end
196
- ```
197
189
 
198
- Records will be automatically indexed when inserted or updated.
190
+ user = User.create!(name: "Elad") # no node is created in Neo4j!
191
+
192
+ user.neo_save # now there is!
193
+ ```
199
194
 
200
195
  ## Querying
201
196
 
202
197
  You can query with all [Neography](https://github.com/maxdemarzi/neography)'s API: `traverse`, `execute_query` for Cypher, and `execute_script` for Gremlin.
203
198
 
199
+ ### Basics:
200
+
201
+ #### Finding a node by ID
202
+
203
+ Nodes and relationships are auto indexed in the `node_auto_index` and `relationship_auto_index` indexes. The key is `Neoid::UNIQUE_ID_KEY` (which is 'neoid_unique_id') and the value is a combination of the class name and model id, such as `Movie:43`; this value is accessible with `model.neo_unique_id`. Use the constant and this method, and never rely on assembling those values on your own, because they might change in the future.
204
+
205
+ That means, you can query like this:
206
+
207
+ ```ruby
208
+ Neoid.db.get_node_auto_index(Neoid::UNIQUE_ID_KEY, user.neo_unique_id)
209
+ # => returns a Neography hash
210
+
211
+ Neoid::Node.from_hash(Neoid.db.get_node_auto_index(Neoid::UNIQUE_ID_KEY, user.neo_unique_id))
212
+ # => returns a Neography::Node
213
+ ```
214
+
215
+ #### Finding all nodes of type
216
+
217
+ If Subreferences are enabled, you can get the subref node and then get all attached nodes:
218
+
219
+ ```ruby
220
+ Neoid.ref_node.outgoing('users_subref').first.outgoing('users_subref').to_a
221
+ # => this, according to Neography, returns an array of Neography::Node so no conversion is needed
222
+ ```
223
+
204
224
  ### Gremlin Example:
205
225
 
206
226
  These examples query Neo4j using Gremlin for IDs of objects, and then fetches them from ActiveRecord with an `in` query.
@@ -208,7 +228,7 @@ These examples query Neo4j using Gremlin for IDs of objects, and then fetches th
208
228
  Of course, you can store using the `neoidable do |c| c.field ... end` all the data you need in Neo4j and avoid querying ActiveRecord.
209
229
 
210
230
 
211
- **Most popular categories**
231
+ **Most liked movies**
212
232
 
213
233
  ```ruby
214
234
  gremlin_query = <<-GREMLIN
@@ -228,15 +248,18 @@ movie_ids = Neoid.db.execute_script(gremlin_query)
228
248
  Movie.where(id: movie_ids)
229
249
  ```
230
250
 
231
- Assuming we have another `Friendship` model which is a relationship with start/end nodes of `user` and type of `friends`,
251
+ *Side note: the resulting movies won't be sorted by like count, because the RDBMS doesn't necessarily preserve the order of the IDs we passed. You can sort them yourself with array manipulation, since you have the ids.*
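A minimal sketch of that array manipulation, in plain Ruby. The `Movie` struct is a hypothetical stand-in for the ActiveRecord model and the ID values are made up; only the re-ordering idea is the point.

```ruby
# Hypothetical stand-in for an ActiveRecord model; only `id` matters here.
Movie = Struct.new(:id, :name)

# IDs in the order the graph query returned them (most liked first).
movie_ids = [7, 3, 5]

# Rows as an RDBMS might return them for an `IN` query: arbitrary order.
unsorted = [Movie.new(3, "B"), Movie.new(5, "C"), Movie.new(7, "A")]

# Map each ID to its position in the graph result, then sort by that position.
rank = movie_ids.each_with_index.to_h
sorted = unsorted.sort_by { |movie| rank.fetch(movie.id) }

puts sorted.map(&:id).inspect
```

Building the `rank` hash once keeps the sort from scanning `movie_ids` for every record.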
252
+
232
253
 
233
254
  **Movies of user friends that the user doesn't have**
234
255
 
256
+ Let's assume we have another `Friendship` model which is a relationship with start/end nodes of `user` and type of `friends`:
257
+
235
258
  ```ruby
236
259
  user = User.find(1)
237
260
 
238
261
  gremlin_query = <<-GREMLIN
239
- u = g.idx('users_index')[[ar_id:user_id]].next()
262
+ u = g.idx('node_auto_index').get(unique_id_key, user_unique_id).next()
240
263
  movies = []
241
264
 
242
265
  u
@@ -246,15 +269,42 @@ gremlin_query = <<-GREMLIN
246
269
  .except(movies).collect{it.ar_id}
247
270
  GREMLIN
248
271
 
249
- movie_ids = Neoid.db.execute_script(gremlin_query, user_id: user.id)
272
+ movie_ids = Neoid.db.execute_script(gremlin_query, unique_id_key: Neoid::UNIQUE_ID_KEY, user_unique_id: user.neo_unique_id)
250
273
 
251
274
  Movie.where(id: movie_ids)
252
275
  ```
253
276
 
254
- `.next()` is in order to get a vertex object which we can actually query on.
277
+ ## Full Text Search
255
278
 
279
+ ### Index for Full-Text Search
256
280
 
257
- ### Full Text Search
281
+ Using `search` block inside a `neoidable` block, you can store certain fields.
282
+
283
+ ```ruby
284
+ # movie.rb
285
+
286
+ class Movie < ActiveRecord::Base
287
+ include Neoid::Node
288
+
289
+ neoidable do |c|
290
+ c.field :slug
291
+ c.field :name
292
+
293
+ c.search do |s|
294
+ # full-text index fields
295
+ s.fulltext :name
296
+ s.fulltext :description
297
+
298
+ # just index for exact matches
299
+ s.index :year
300
+ end
301
+ end
302
+ end
303
+ ```
304
+
305
+ Records will be automatically indexed when inserted or updated.
306
+
307
+ ### Querying a Full-Text Search index
258
308
 
259
309
  ```ruby
260
310
  # will match all movies with full-text match for name/description. returns ActiveRecord instances
@@ -270,14 +320,63 @@ Neoid.neo_search([Movie, User], "hello")
270
320
  Movie.neo_search(year: 2013).results
271
321
  ```
272
322
 
323
+ Full text search with Neoid is very limited and is likely not to develop more than this basic functionality. I strongly recommend using gems like Sunspot over Solr.
324
+
325
+ ## Batches
326
+
327
+ Neoid can batch operations, which is useful for mass updating/inserting of nodes and relationships. It sends batched requests to Neography and takes care of type conversion (a Neography batch returns hashes and other primitive types) and "after" actions (via promises).
328
+
329
+ A few examples, easy to complex:
330
+
331
+ ```ruby
332
+ Neoid.batch(batch_size: 100) do
333
+ User.all.each(&:neo_save)
334
+ end
335
+ ```
336
+ With `then`:
337
+
338
+ ```ruby
339
+ User.first.name # => "Elad"
340
+
341
+ Neoid.batch(batch_size: 100) do
342
+ User.all.each(&:neo_save)
343
+ end.then do |results|
344
+ # results is an array of the script results from neo4j REST.
345
+
346
+ results[0].name # => "Elad"
347
+ end
348
+ ```
349
+
350
+ *Nodes and relationships in the results are automatically converted to Neography::Node and Neography::Relationship, respectively.*
351
+
352
+ With individual `then` as well as `then` for the entire batch:
353
+
354
+ ```ruby
355
+ Neoid.batch(batch_size: 30) do |batch|
356
+ (1..90).each do |i|
357
+ (batch << [:create_node, { name: "Hello #{i}" }]).then { |result| puts result.name }
358
+ end
359
+ end.then do |results|
360
+ puts results.collect(&:name)
361
+ end
362
+ ```
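The `then` callbacks in the examples above behave like promises: a callback is stored until the result is known and runs immediately if the result has already arrived. The toy class below sketches that idea only; `SketchPromise` and `resolve` are hypothetical names, and this is not Neoid's own promise proxy.

```ruby
# Toy promise, not Neoid's implementation: callbacks run once a value is set.
class SketchPromise
  def initialize
    @callbacks = []
    @resolved = false
  end

  # Register a callback; run it right away if the value is already known.
  def then(&block)
    if @resolved
      block.call(@value)
    else
      @callbacks << block
    end
    self
  end

  # Would be called when the batch flushes and this command's result arrives.
  def resolve(value)
    @value = value
    @resolved = true
    @callbacks.each { |callback| callback.call(value) }
    @callbacks.clear
    self
  end
end

seen = []
promise = SketchPromise.new
promise.then { |result| seen << "early: #{result}" } # stored, not run yet
promise.resolve("Hello 1")                            # runs the stored callback
promise.then { |result| seen << "late: #{result}" }  # runs immediately

puts seen.inspect
```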
363
+
364
+ When in a batch, `neo_save` adds gremlin scripts to a batch, instead of running them immediately. The batch flushes whenever the `batch_size` option is met.
365
+ So even if you have 20000 users, Neoid will insert/update in smaller batches. Default `batch_size` is 200.
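The flush-on-size behavior described above can be sketched with a plain buffer. `SketchBatch` is a hypothetical toy, not `Neoid::Batch`: it records each flushed group instead of sending it to Neo4j, and the final explicit `flush` for the remainder is an assumption about how a batch would finish.

```ruby
# A toy buffer, NOT Neoid::Batch: it only shows the flush-on-size idea.
class SketchBatch
  attr_reader :flushes

  def initialize(batch_size: 200)
    @batch_size = batch_size
    @pending = []
    @flushes = [] # each entry is one group that would go out in a single request
  end

  # Queue a command; flush as soon as the buffer reaches batch_size.
  def <<(command)
    @pending << command
    flush if @pending.size >= @batch_size
    self
  end

  # "Send" whatever is queued (here: just record it) and clear the buffer.
  def flush
    return if @pending.empty?

    @flushes << @pending
    @pending = []
  end
end

batch = SketchBatch.new(batch_size: 30)
(1..70).each { |i| batch << [:create_node, { name: "Hello #{i}" }] }
batch.flush # send the remainder, if any

puts batch.flushes.map(&:size).inspect
```

With 70 commands and a `batch_size` of 30, the buffer flushes twice on its own and once more for the last 10 commands.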
366
+
367
+
273
368
  ## Inserting records of existing app
274
369
 
275
- If you have an existing database and just want to integrate Neoid, configure the `neoidable`s and run in a rake task or console
370
+ If you have an existing database and just want to integrate Neoid, configure the `neoidable`s and run in a rake task or console.
371
+
372
+ Use batches! It's free, and much faster. Also, use `includes` to include the relationship edges on relationship entities, so it doesn't query the DB on each relationship.
276
373
 
277
374
  ```ruby
278
- [ Like.includes(:user).includes(:movie), OtherRelationshipModel ].each { |model| model.all.each(&:neo_update) }
375
+ Neoid.batch do
376
+ [ Like.includes(:user).includes(:movie), OtherRelationshipModel.includes(:from_model).includes(:to_model) ].each { |model| model.all.each(&:neo_save) }
279
377
 
280
- NodeModel.all.each(&:neo_update)
378
+ NodeModel.all.each(&:neo_save)
379
+ end
281
380
  ```
282
381
 
283
382
  This will loop through all of your relationship records and generate the two edge nodes along with a relationship (eager loading for better performance).
@@ -289,30 +388,32 @@ Better interface for that in the future.
289
388
 
290
389
  ## Behind The Scenes
291
390
 
292
- Whenever the `neo_node` on nodes or `neo_relationship` on relationships is called, Neoid checks if there's a corresponding node/relationship in Neo4j. If not, it does the following:
391
+ Whenever the `neo_node` on nodes or `neo_relationship` on relationships is called, Neoid checks if there's a corresponding node/relationship in Neo4j (with the auto indexes). If not, it does the following:
293
392
 
294
393
  ### For Nodes:
295
394
 
296
- 1. Ensures there's a sub reference node (read [here](http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-index.html) about sub reference nodes)
395
+ 1. Ensures there's a sub reference node (read [here](http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-index.html) about sub references), if that option is on.
297
396
  2. Creates a node based on the ActiveRecord, with the `id` attribute and all other attributes from `neoidable`'s field list
298
397
  3. Creates a relationship between the sub reference node and the newly created node
299
- 4. Adds the ActiveRecord `id` to a node index, pointing to the Neo4j node id, for fast lookup in the future
398
+ 4. Auto indexes a node in the auto index, for fast lookup in the future
300
399
 
301
- Then, when it needs to find it again, it just seeks the node index with that ActiveRecord id for its neo node id.
400
+ Then, when it needs to find it again, it just seeks the auto index with that ActiveRecord id.
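The node steps above amount to a find-or-create keyed by the auto index. The sketch below mimics that flow with in-memory stand-ins; `SketchGraph` and `find_or_create_node` are hypothetical names, the hash-based index is an assumption, and the unique id is passed in as a ready-made value (in real code it would come from `model.neo_unique_id`).

```ruby
# In-memory stand-ins for Neo4j; every name here is hypothetical.
class SketchGraph
  attr_reader :nodes, :relationships, :auto_index

  def initialize
    @nodes = []
    @relationships = []
    @auto_index = {} # unique id => node
  end

  def create_node(properties)
    node = properties.dup
    @nodes << node
    # Mimic auto indexing: any node carrying the key property gets indexed.
    key = node[:neoid_unique_id]
    @auto_index[key] = node if key
    node
  end

  def relate(from, type, to)
    @relationships << [from, type, to]
  end
end

# Look the node up in the index first; create and attach it only if missing.
def find_or_create_node(graph, subref, unique_id, fields)
  existing = graph.auto_index[unique_id]
  return existing if existing

  node = graph.create_node(fields.merge({ neoid_unique_id: unique_id }))
  graph.relate(subref, "users_subref", node)
  node
end

graph = SketchGraph.new
subref = graph.create_node({ name: "users_subref" })
first = find_or_create_node(graph, subref, "User:1", { ar_id: 1, name: "Elad" })
second = find_or_create_node(graph, subref, "User:1", { ar_id: 1, name: "Elad" })

puts first.equal?(second)
puts graph.nodes.size
```

The second call finds the node through the index, so no duplicate node or relationship is created.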
302
401
 
303
402
  ### For Relationships:
304
403
 
305
- Like Nodes, it uses an index (relationship index) to look up a relationship by ActiveRecord id
404
+ Like Nodes, it uses an auto index to look up a relationship by ActiveRecord id
306
405
 
307
406
  1. With the options passed in the `neoidable`, it fetches the `start_node` and `end_node`
308
407
  2. Then, it calls `neo_node` on both, in order to create the Neo4j nodes if they're not created yet, and creates the relationship with the type from the options.
309
- 3. Add the relationship to the relationship index.
408
+ 3. Adds the relationship to the relationship index.
310
409
 
311
410
  ## Testing
312
411
 
313
412
  In order to test your app or this gem, you need a running Neo4j database, dedicated to tests.
314
413
 
315
- I use port 7574 for this. To run another database locally:
414
+ I use port 7574 for testing.
415
+
416
+ To run another database locally (read [here](http://docs.neo4j.org/chunked/1.9.M03/server-installation.html#_multiple_server_instances_on_one_machine) too):
316
417
 
317
418
  Copy the entire Neo4j database folder to a different location,
318
419
 
@@ -344,7 +445,7 @@ end
344
445
 
345
446
  ## Testing This Gem
346
447
 
347
- Just run `rake` from the gem folder.
448
+ Run the Neo4j DB on port 7574, and run `rake` from the gem folder.
348
449
 
349
450
  ## Contributing
350
451
 
@@ -356,9 +457,9 @@ Please create a [new issue](https://github.com/elado/neoid/issues) if you run in
356
457
  Unfortunately, as for now, Neo4j add-on on Heroku doesn't support Gremlin. Therefore, this gem won't work on Heroku's add on. You should self-host a Neo4j instance on an EC2 or any other server.
357
458
 
358
459
 
359
- ## To Do
460
+ ## TO DO
360
461
 
361
- [To Do](https://github.com/elado/neoid/blob/master/TODO.md)
462
+ [TO DO](https://github.com/elado/neoid/blob/master/TODO.md)
362
463
 
363
464
 
364
465
  ---
data/TODO.md CHANGED
@@ -1,6 +1,4 @@
1
1
  # Neoid - To Do
2
2
 
3
- * Allow to disable sub reference nodes through options
4
3
  * Execute queries/scripts from model and not Neography (e.g. `Movie.neo_gremlin(gremlin_query)` with query that outputs IDs, returns a list of `Movie`s)
5
4
  * Rake task to index all nodes and relationships in Neo4j
6
- * Test update node
@@ -1,24 +1,27 @@
1
+ require 'neography'
1
2
  require 'neoid/version'
3
+ require 'neoid/config'
2
4
  require 'neoid/model_config'
3
5
  require 'neoid/model_additions'
4
6
  require 'neoid/search_session'
5
7
  require 'neoid/node'
6
8
  require 'neoid/relationship'
9
+ require 'neoid/batch'
7
10
  require 'neoid/database_cleaner'
8
11
  require 'neoid/railtie' if defined?(Rails)
9
12
 
10
13
  module Neoid
11
- DEFAULT_FULLTEXT_SEARCH_INDEX_NAME = 'neoid_default_search_index'
14
+ DEFAULT_FULLTEXT_SEARCH_INDEX_NAME = :neoid_default_search_index
15
+ NODE_AUTO_INDEX_NAME = 'node_auto_index'
16
+ RELATIONSHIP_AUTO_INDEX_NAME = 'relationship_auto_index'
17
+ UNIQUE_ID_KEY = 'neoid_unique_id'
12
18
 
13
19
  class << self
14
20
  attr_accessor :db
15
21
  attr_accessor :logger
16
22
  attr_accessor :ref_node
17
23
  attr_accessor :env_loaded
18
-
19
- def models
20
- @models ||= []
21
- end
24
+ attr_reader :config
22
25
 
23
26
  def node_models
24
27
  @node_models ||= []
@@ -29,20 +32,42 @@ module Neoid
29
32
  end
30
33
 
31
34
  def config
32
- @config ||= {}
35
+ @config ||= begin
36
+ c = Neoid::Config.new
37
+
38
+ # default
39
+ c.enable_subrefs = true
40
+ c.enable_per_model_indexes = false
41
+
42
+ c
43
+ end
44
+ end
45
+
46
+ def configure
47
+ yield config
33
48
  end
34
49
 
35
50
  def initialize_all
36
51
  @env_loaded = true
37
- relationship_models.each do |rel_model|
38
- Relationship.initialize_relationship(rel_model)
39
- end
52
+ logger.info "Neoid initialize_all"
53
+ initialize_relationships
54
+ initialize_server
55
+ end
56
+
57
+ def initialize_server
58
+ initialize_auto_index
59
+ initialize_subrefs
60
+ initialize_per_model_indexes
40
61
  end
41
62
 
42
63
  def db
43
64
  raise "Must set Neoid.db with a Neography::Rest instance" unless @db
44
65
  @db
45
66
  end
67
+
68
+ def batch(options={}, &block)
69
+ Neoid::Batch.new(options, &block).run
70
+ end
46
71
 
47
72
  def logger
48
73
  @logger ||= Logger.new(ENV['NEOID_LOG'] ? ENV['NEOID_LOG_FILE'] || $stdout : '/dev/null')
@@ -53,10 +78,7 @@ module Neoid
53
78
  end
54
79
 
55
80
  def reset_cached_variables
56
- Neoid.models.each do |klass|
57
- klass.instance_variable_set(:@_neo_subref_node, nil)
58
- end
59
- $neo_ref_node = nil
81
+ initialize_subrefs
60
82
  end
61
83
 
62
84
  def clean_db(confirm)
@@ -83,6 +105,19 @@ module Neoid
83
105
  self.enabled = old
84
106
  end
85
107
 
108
+ def execute_script_or_add_to_batch(gremlin_query, script_vars)
109
+ if Neoid::Batch.current_batch
110
+ # returns a SingleResultPromiseProxy!
111
+ Neoid::Batch.current_batch << [:execute_script, gremlin_query, script_vars]
112
+ else
113
+ value = Neoid.db.execute_script(gremlin_query, script_vars)
114
+
115
+ value = yield(value) if block_given?
116
+
117
+ Neoid::BatchPromiseProxy.new(value)
118
+ end
119
+ end
120
+
86
121
  # create a fulltext index if not exists
87
122
  def ensure_default_fulltext_search_index
88
123
  Neoid.db.create_node_index(DEFAULT_FULLTEXT_SEARCH_INDEX_NAME, 'fulltext', 'lucene') unless (indexes = Neoid.db.list_node_indexes) && indexes[DEFAULT_FULLTEXT_SEARCH_INDEX_NAME]
@@ -155,5 +190,47 @@ module Neoid
155
190
 
156
191
  "(" + term.split(/\s+/).reject(&:empty?).map{ |t| "#{field}#{fulltext}:#{sanitize_term(t)}" }.join(" AND ") + ")"
157
192
  end
193
+
194
+ def initialize_relationships
195
+ logger.info "Neoid initialize_relationships"
196
+ relationship_models.each do |rel_model|
197
+ Relationship.initialize_relationship(rel_model)
198
+ end
199
+ end
200
+
201
+ def initialize_auto_index
202
+ logger.info "Neoid initialize_auto_index"
203
+ Neoid.db.set_node_auto_index_status(true)
204
+ Neoid.db.add_node_auto_index_property(UNIQUE_ID_KEY)
205
+
206
+ Neoid.db.set_relationship_auto_index_status(true)
207
+ Neoid.db.add_relationship_auto_index_property(UNIQUE_ID_KEY)
208
+ end
209
+
210
+ def initialize_subrefs
211
+ return unless config.enable_subrefs
212
+
213
+ node_models.each do |klass|
214
+ klass.reset_neo_subref_node
215
+ end
216
+
217
+ logger.info "Neoid initialize_subrefs"
218
+ batch do
219
+ node_models.each(&:neo_subref_node)
220
+ end.then do |results|
221
+ node_models.zip(results).each do |klass, subref|
222
+ klass.neo_subref_node = subref
223
+ end
224
+ end
225
+ end
226
+
227
+ def initialize_per_model_indexes
228
+ return unless config.enable_per_model_indexes
229
+
230
+ logger.info "Neoid initialize_per_model_indexes"
231
+ batch do
232
+ node_models.each(&:neo_model_index)
233
+ end
234
+ end
158
235
  end
159
236
  end