bigrecord 0.1.0 → 0.1.1
- data/README.rdoc +23 -11
- data/VERSION +1 -1
- data/guides/bigrecord_specs.rdoc +9 -3
- data/guides/cassandra_install.rdoc +65 -0
- data/guides/deployment.rdoc +12 -5
- data/guides/getting_started.rdoc +48 -62
- data/guides/hbase_install.rdoc +48 -0
- data/guides/storage-conf.rdoc +310 -0
- data/lib/big_record/connection_adapters/cassandra_adapter.rb +34 -65
- data/spec/connections/bigrecord.yml +2 -2
- metadata +9 -3
data/README.rdoc
CHANGED
@@ -1,6 +1,7 @@
 = Big Record
 
-A Ruby Object/Data Mapper for distributed column-oriented data stores (inspired by BigTable) such as HBase. Intended
+A Ruby Object/Data Mapper for distributed column-oriented data stores (inspired by BigTable) such as HBase. Intended
+to work as a drop-in for Rails applications.
 
 == Features
 * Dynamic schemas (due to the schema-less design of BigTable).
@@ -12,22 +13,33 @@ A Ruby Object/Data Mapper for distributed column-oriented data stores (inspired
 
 == Motivations
 
-BigTable, and by extension, Bigrecord isn't right for everyone. A great introductory article discussing this topic can
+BigTable, and by extension, Bigrecord isn't right for everyone. A great introductory article discussing this topic can
+be found at http://blog.rapleaf.com/dev/?p=26 explaining why you would or wouldn't use BigTable. The rule of thumb,
+however, is that if your data model is simple or can fit into a standard RDBMS, then you probably don't need it.
 
 Beyond this though, there are two basic motivations that almost immediately demand a BigTable model database:
-1. Your data is highly dynamic in nature and would not fit in a schema bound model, or you cannot define a schema ahead
-
+1. Your data is highly dynamic in nature and would not fit in a schema bound model, or you cannot define a schema ahead
+of time.
+2. You know that your database will grow to tens or hundreds of gigabytes, and can't afford big iron servers. Instead,
+you'd like to scale horizontally across many commodity servers.
 
-==
+== Components
 
-*
-* Big Record Driver: JRuby application that bridges Ruby and Java (through JRuby's Drb protocol) to interact with Java-based data stores and their native APIs. Required for HBase and Cassandra. This application can be run from a separate server than your Rails application.
-* JRuby 1.1.6+ is needed to run Big Record Driver.
-* Any other requirements needed to run Hadoop, HBase or your data store of choice.
+* Bigrecord: Ruby Object/Data Mapper. Inspired and architected similarly to Active Record.
 
-== Optional
+== Optional Component
 
-*
+* Bigrecord Driver: Consists of a JRuby server component that bridges Ruby and Java (through the DRb protocol) to
+interact with Java-based data stores and their native APIs. Clients that connect to the DRb server can be of any Ruby
+type (JRuby, MRI, etc). Currently, this is used only for HBase to serve as a connection alternative to Thrift or
+Stargate. This application can be run from a separate server than your Rails application.
+
+* Bigindex [http://github.com/openplaces/bigindex]: Due to the nature of BigTable databases, some limitations are
+present while using Bigrecord standalone when compared to Active Record. Some major limitations include the inability
+to query for data other than with the row ID, indexing, searching, and dynamic finders (find_by_attribute_name). Since
+these data access patterns are vital for most Rails applications to function, Bigindex was created to address these
+issues, and bring the feature set more up to par with Active Record. Please refer to the <tt>Bigindex</tt> package for
+more information and its requirements.
 
 == Getting Started
 
data/VERSION
CHANGED
@@ -1 +1 @@
-0.1.
+0.1.1
data/guides/bigrecord_specs.rdoc
CHANGED
@@ -2,11 +2,15 @@
 
 == Data store information
 
-The default settings for the Bigrecord specs can be found at spec/connections/bigrecord.yml with each environment
+The default settings for the Bigrecord specs can be found at spec/connections/bigrecord.yml with each environment
+broken down by the data store type (Hbase and Cassandra at the time of writing). These are the minimal settings
+required to connect to each data store, and should be modified freely to reflect your own system configurations.
 
 == Data store migration
 
-There are also migrations to create the necessary tables for the specs to run. To ensure migrations are functioning
+There are also migrations to create the necessary tables for the specs to run. To ensure migrations are functioning
+properly before actually running the migrations, you can run: spec spec/unit/migration_spec.rb. Alternatively, you
+can manually create the tables according to the migration files under: spec/lib/migrations
 
 Migrations have their own log file for debugging purposes. It's created under: bigrecord/migrate.log
 
@@ -31,6 +35,8 @@ To run a specific spec, you can run the following command from the bigrecord roo
 
 == Debugging
 
-If any problems or failures arise during the unit tests, please refer to the log files before submitting it as an
+If any problems or failures arise during the unit tests, please refer to the log files before submitting it as an
+issue. Often, it's a simple matter of forgetting to turn on BigrecordDriver, the tables weren't created, or
+configurations weren't set properly.
 
 The log file for specs is created under: <bigrecord root>/spec/debug.log
data/guides/cassandra_install.rdoc
ADDED
@@ -0,0 +1,65 @@
+== Setting up Cassandra
+
+To quickly get started with development, you can set up Cassandra to run as a single node cluster on your local system.
+
+(1) Download and unpack the most recent release of Cassandra from http://cassandra.apache.org/download/
+
+(2) Add a <Keyspace></Keyspace> entry into your (cassandra-dir)/conf/storage-conf.xml configuration file named after
+your application, and create <ColumnFamily> entries corresponding to each model you wish to add. The following is an
+example of the Bigrecord keyspace used to run the spec suite against:
+
+<Keyspace Name="Bigrecord">
+  <ColumnFamily Name="animals" CompareWith="UTF8Type" />
+  <ColumnFamily Name="books" CompareWith="UTF8Type" />
+  <ColumnFamily Name="companies" CompareWith="UTF8Type" />
+  <ColumnFamily Name="employees" CompareWith="UTF8Type" />
+  <ColumnFamily Name="novels" CompareWith="UTF8Type" />
+  <ColumnFamily Name="zoos" CompareWith="UTF8Type" />
+
+  <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
+
+  <ReplicationFactor>1</ReplicationFactor>
+
+  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
+</Keyspace>
+
+You can also see {file:guides/storage-conf.rdoc guides/storage-conf.rdoc} for an example of a full configuration. More
+documentation on setting up Cassandra can be found at http://wiki.apache.org/cassandra/GettingStarted
+
+(3) Install the Cassandra Rubygem:
+
+$ [sudo] gem install cassandra
+
+(4) Start up Cassandra:
+$ (cassandra-dir)/bin/cassandra -f
+
+
+== Setting up Bigrecord
+
+(1) Add the following line into the Rails::Initializer.run do |config| block:
+
+config.gem "bigrecord", :source => "http://gemcutter.org"
+
+and run the following command to install all the gems listed for your Rails app:
+
+[sudo] rake gems:install
+
+(2) Bootstrap Bigrecord into your project:
+
+script/generate bigrecord
+
+(3) Edit the config/bigrecord.yml[.sample] file in your Rails root to the information corresponding to your Cassandra
+install (keyspace should correspond to the one you defined in step 2 of "Setting up Cassandra" above):
+
+development:
+  adapter: cassandra
+  keyspace: Bigrecord
+  servers: localhost:9160
+production:
+  adapter: cassandra
+  keyspace: Bigrecord
+  servers:
+    - server1:9160
+    - server2:9160
+
+Note: 9160 is the default port for Cassandra's Thrift server.
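For a quick end-to-end check of the setup above, the following is an illustrative sketch (not part of the diffed guide) that talks to Cassandra directly through the cassandra gem, reusing the keyspace and server values from bigrecord.yml; the row key and column value are invented for the example.

  require 'rubygems'
  require 'cassandra'

  # Connect to the local node using the keyspace defined in storage-conf.xml.
  client = Cassandra.new('Bigrecord', 'localhost:9160')

  # Round-trip a value through one of the column families created for the specs.
  client.insert(:books, 'test-row', { 'attribute:title' => 'Hello Thar' })
  p client.get(:books, 'test-row')   # => {"attribute:title"=>"Hello Thar"}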
data/guides/deployment.rdoc
CHANGED
@@ -1,15 +1,22 @@
-= Deploying Big Record
+= Deploying Big Record with HBase
 
-Stargate is a new implementation for HBase's web service front-end, and as such, is not currently recommended for
+Stargate is a new implementation for HBase's web service front-end, and as such, is not currently recommended for
+deployment.
 
-We here at Openplaces have developed Bigrecord Driver, which uses JRuby to interact with HBase via the native
+We here at Openplaces have developed Bigrecord Driver, which uses JRuby to interact with HBase via the native
+Java API and connect to Bigrecord through the DRb protocol. This method is slightly more complicated to setup,
+but preliminary benchmarks show that it runs faster (especially for scanner functionality).
 
 == Instructions
-* Your database should already be set up (please refer to the database's own documentation) with the required
+* Your database should already be set up (please refer to the database's own documentation) with the required
+information known such as the zookeeper quorum/port, etc. in order for Bigrecord to connect to it.
+
 * Bigrecord Driver (if your database requires it for connecting)
 * JRuby 1.1.6+ is needed to run Bigrecord Driver.
 
-Install the Bigrecord Driver gem and its dependencies, then start up a DRb server. Please refer the Bigrecord Driver
+Install the Bigrecord Driver gem and its dependencies, then start up a DRb server. Please refer the Bigrecord Driver
+documentation for more detailed instructions.
+(http://github.com/openplaces/bigrecord/blob/master/bigrecord-driver/README.rdoc)
 
 Edit your bigrecord.yml config file as follows:
 
data/guides/getting_started.rdoc
CHANGED
@@ -1,50 +1,10 @@
 = Getting Started
 
+== Install HBase or Cassandra
 
-
-
-To quickly get started with development, you can set up HBase to run as a single server on your local computer, along with Stargate, its RESTful web service front-end.
-
-(1) Download and unpack the most recent release of HBase from http://hadoop.apache.org/hbase/releases.html#Download
-
-(2) Edit (hbase-dir)/conf/hbase-env.sh and uncomment/modify the following line to correspond to your Java home path:
-export JAVA_HOME=/usr/lib/jvm/java-6-sun
-
-(3) Copy (hbase-dir)/contrib/stargate/hbase-<version>-stargate.jar into <hbase-dir>/lib
-
-(4) Copy all the files in the (hbase-dir)/contrib/stargate/lib folder into <hbase-dir>/lib
-
-(5) Start up HBase:
-$ (hbase-dir)/bin/start-hbase.sh
-
-(6)Start up Stargate (append "-p 1234" at the end if you want to change the port):
-$ (hbase-dir)/bin/hbase org.apache.hadoop.hbase.stargate.Main
-
-
-== Setting up Bigrecord
-
-(1) Install the Bigrecord Driver gem and its dependencies, then start up a DRb server. Please see the Bigrecord Driver documentation for more detailed instructions. (http://github.com/openplaces/bigrecord/blob/master/bigrecord-driver/README.rdoc)
-
-(2) Add the following line into the Rails::Initializer.run do |config| block:
-
-config.gem "bigrecord", :source => "http://gemcutter.org"
-
-and run the following command to install all the gems listed for your Rails app:
-
-[sudo] rake gems:install
-
-(3) Bootstrap Bigrecord into your project:
-
-script/generate bigrecord
-
-(4) Edit the config/bigrecord.yml[.sample] file in your Rails root to the information corresponding to the Stargate server.
-
-development:
-  adapter: hbase_rest
-  api_address: http://localhost:8080
-
-Note: 8080 is the default port that Stargate starts up on. Make sure you modify this if you changed the port from the default.
+* HBase: {file:guides/hbase_install.rdoc guides/hbase_install.rdoc}
 
+* Cassandra: {file:guides/cassandra_install.rdoc guides/cassandra_install.rdoc}
 
 == Usage
 
@@ -54,7 +14,8 @@ Once Bigrecord is working in your Rails project, you can use the following gener
 
 script/generate bigrecord_model ModelName
 
-This will add a model in app/models and a migration file in db/bigrecord_migrate. Note: This generator does not
+This will add a model in app/models and a migration file in db/bigrecord_migrate. Note: This generator does not
+accept attributes.
 
 script/generate bigrecord_migration MigrationName
 
@@ -62,11 +23,19 @@ Creates a Bigrecord specific migration and adds it into db/bigrecord_migrate
 
 === {BigRecord::Migration Migration File}
 
-
+Note: Cassandra doesn't have the capability to modify the ColumnFamily schema while running, and can only be edited
+from the storage-conf.xml configuration while the cluster is down. Future versions of Cassandra have this planned.
+
+Although column-oriented databases are generally schema-less, certain ones (like Hbase) require the creation of
+tables and column families ahead of time. The individual columns, however, are defined in the model itself and can
+be modified dynamically without the need for migrations.
 
-Unless you're familiar with column families, the majority of use cases work perfectly fine within one column family.
+Unless you're familiar with column families, the majority of use cases work perfectly fine within one column family.
+When you generate a bigrecord_model, it will default to creating the :attribute column family.
 
-The following is a standard migration file that creates a table called "Books" with the default column family
+The following is a standard migration file that creates a table called "Books" with the default column family
+:attribute that has the following option of 100 versions and uses the 'lzo' compression scheme. Leave any options
+blank for the default value.
 
 class CreateBooks < BigRecord::Migration
   def self.up
@@ -80,12 +49,15 @@ The following is a standard migration file that creates a table called "Books" w
   end
 end
 
-
+==== HBase column family options (HBase specific)
 
-* versions: integer. By default, Hbase will store 3 versions of changes for any column family. Changing this value on
-
+* versions: integer. By default, Hbase will store 3 versions of changes for any column family. Changing this value on
+the creation will change this behavior.
 
-
+* compression: 'none', 'gz', 'lzo'. Defaults to 'none'. Since Hbase 0.20, column families can be stored using
+compression. The compression scheme you define here must be installed on the Hbase servers!
+
+==== Migrating
 
 Run the following rake task to migrate your tables and column families up to the latest version:
 
@@ -93,7 +65,8 @@ Run the following rake task to migrate your tables and column families up to the
 
 === {BigRecord::ConnectionAdapters::Column Column and Attribute Definition}
 
-Now that you have your tables and column families all set up, you can begin adding columns to your model. The
+Now that you have your tables and column families all set up, you can begin adding columns to your model. The
+following is an example of a model named book.rb
 
 class Book < BigRecord::Base
   column 'attribute:title', :string
|
   column :links, :string, :collection => true
 end
 
-This simple model defines 4 columns of type string. An important thing to notice here is that the first column
+This simple model defines 4 columns of type string. An important thing to notice here is that the first column
+'attribute:title' had the column family prepended to it. This is identical to just passing the symbol :title to
+the column method, and the default behaviour is to prepend the column family (attribute) automatically if one is
+not defined. Furthermore, in Hbase, there's the option of storing collections for a given column. This will return
+an array for the links attribute on a Book record.
 
 === {BigRecord::BrAssociations Associations}
 
-There are also associations available in Bigrecord, as well as the ability to associate to Activerecord models. The
+There are also associations available in Bigrecord, as well as the ability to associate to Activerecord models. The
+following are a few models demonstrating this:
 
 animal.rb
 class Animal < BigRecord::Base
@@ -124,13 +102,18 @@ animal.rb
   belongs_to :trainer, :foreign_key => :trainer_id
 end
 
-In this example, an Animal is related to Zoo and Trainer. Both Animal and Zoo are Bigrecord models, and Trainer is
+In this example, an Animal is related to Zoo and Trainer. Both Animal and Zoo are Bigrecord models, and Trainer is
+an Activerecord model. Notice here that we need to define both the association field for storing the information and
+the association itself. It's also important to remember that Bigrecord models have their IDs stored as string, and
+Activerecord models use integers.
 
-Once the association columns are defined, you define the associations themselves with either belongs_to_bigrecord or
+Once the association columns are defined, you define the associations themselves with either belongs_to_bigrecord or
+belongs_to_many and defining the :foreign_key (this is required for all associations).
 
 === {BigRecord::ConnectionAdapters::View Specifying return columns}
 
-There are two ways to define specific columns to be returned with your models: 1) at the model level and 2) during
+There are two ways to define specific columns to be returned with your models: 1) at the model level and 2) during
+the query.
 
 (1) At the model level, a collection of columns are called named views, and are defined like the following:
 
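Pulling the pieces of the hunk above together, a minimal sketch of the pattern being described is shown below; only the belongs_to line appears verbatim in the diff, while the column declarations and the belongs_to_bigrecord call are assumptions based on the surrounding text.

  class Animal < BigRecord::Base
    # Association fields: Bigrecord row IDs are strings, Activerecord IDs are integers.
    column :zoo_id,     :string
    column :trainer_id, :integer

    # The associations themselves, each with an explicit :foreign_key.
    belongs_to_bigrecord :zoo,     :foreign_key => :zoo_id
    belongs_to           :trainer, :foreign_key => :trainer_id
  end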
@@ -147,7 +130,8 @@ There are two ways to define specific columns to be returned with your models: 1
   view :default, :title, :author, :description
 end
 
-Now, whenever you work with a Book record, it will only returned the columns you specify according to the view option
+Now, whenever you work with a Book record, it will only returned the columns you specify according to the view option
+you pass. i.e.
 
 >> Book.find(:first, :view => :front_page)
 => #<Book id: "2e13f182-1085-495e-9841-fe5c84ae9992", attribute:title: "Hello Thar", attribute:author: "Greg">
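For orientation, a model carrying both named views mentioned above might look like the sketch below; the column names are taken from the earlier book.rb example, so treat it as illustrative rather than a verbatim excerpt of the guide.

  class Book < BigRecord::Base
    column :title,       :string
    column :author,      :string
    column :description, :string

    # :front_page returns a subset of columns; :default controls what find returns
    # when no :view option is given.
    view :front_page, :title, :author
    view :default,    :title, :author, :description
  end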
@@ -158,10 +142,11 @@ Now, whenever you work with a Book record, it will only returned the columns you
 >> Book.find(:first, :view => :default)
 => #<Book id: "2e13f182-1085-495e-9841-fe5c84ae9992", attribute:description: "Masterpiece!", attribute:title: "Hello Thar", attribute:links: ["link1", "link2", "link3", "link4"], attribute:author: "Greg">
 
-Note: A Bigrecord model will return all the columns within the default column family (when :view option is left blank,
-
+Note: A Bigrecord model will return all the columns within the default column family (when :view option is left blank,
+for example). You can override the :default name view to change this behaviour.
 
-(2) If you don't want to define named views ahead of time, you can just pass an array of columns to the :columns
+(2) If you don't want to define named views ahead of time, you can just pass an array of columns to the :columns
+option and it will return only those attributes:
 
 >> Book.find(:first, :columns => [:author, :description])
 => #<Book id: "2e13f182-1085-495e-9841-fe5c84ae9992", attribute:description: "Masterpiece!", attribute:author: "Greg">
@@ -170,4 +155,5 @@ As you may have noticed, this functionality is synonymous with the :select optio
 
 === {BigRecord::Embedded Embedded Records}
 
-=== At this point, usage patterns for a Bigrecord model are similar to that of an Activerecord model, and much of that
+=== At this point, usage patterns for a Bigrecord model are similar to that of an Activerecord model, and much of that
+documentation applies as well. Please refer to those and see if they work!
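The migration hunks earlier in this file only show the edges of the CreateBooks example. For orientation, a minimal sketch of what such a migration could look like is given below, assuming a create_table/family style DSL that matches the prose description; the helper names themselves are an assumption and are not shown in this diff.

  class CreateBooks < BigRecord::Migration
    def self.up
      create_table :books, :force => true do |t|
        # :versions and :compression are the HBase column family options described above.
        t.family :attribute, :versions => 100, :compression => 'lzo'
      end
    end

    def self.down
      drop_table :books
    end
  end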
data/guides/hbase_install.rdoc
ADDED
@@ -0,0 +1,48 @@
+== Setting up HBase and Stargate
+
+To quickly get started with development, you can set up HBase to run as a single server on your local computer,
+along with Stargate, its RESTful web service front-end.
+
+(1) Download and unpack the most recent release of HBase from http://hadoop.apache.org/hbase/releases.html#Download
+
+(2) Edit (hbase-dir)/conf/hbase-env.sh and uncomment/modify the following line to correspond to your Java home path:
+export JAVA_HOME=/usr/lib/jvm/java-6-sun
+
+(3) Copy (hbase-dir)/contrib/stargate/hbase-<version>-stargate.jar into <hbase-dir>/lib
+
+(4) Copy all the files in the (hbase-dir)/contrib/stargate/lib folder into <hbase-dir>/lib
+
+(5) Start up HBase:
+$ (hbase-dir)/bin/start-hbase.sh
+
+(6)Start up Stargate (append "-p 1234" at the end if you want to change the port):
+$ (hbase-dir)/bin/hbase org.apache.hadoop.hbase.stargate.Main
+
+
+== Setting up Bigrecord
+
+(1) Install the Bigrecord Driver gem and its dependencies, then start up a DRb server. Please see the Bigrecord Driver
+documentation for more detailed instructions.
+(http://github.com/openplaces/bigrecord/blob/master/bigrecord-driver/README.rdoc)
+
+(2) Add the following line into the Rails::Initializer.run do |config| block:
+
+config.gem "bigrecord", :source => "http://gemcutter.org"
+
+and run the following command to install all the gems listed for your Rails app:
+
+[sudo] rake gems:install
+
+(3) Bootstrap Bigrecord into your project:
+
+script/generate bigrecord
+
+(4) Edit the config/bigrecord.yml[.sample] file in your Rails root to the information corresponding to the Stargate
+server.
+
+development:
+  adapter: hbase_rest
+  api_address: http://localhost:8080
+
+Note: 8080 is the default port that Stargate starts up on. Make sure you modify this if you changed the port from
+the default.
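Before editing config/bigrecord.yml it can help to confirm that Stargate is actually listening. The check below is illustrative; it assumes the REST gateway's version resource at /version and the default port used above.

  require 'net/http'
  require 'uri'

  # Expect an HTTP 200 with version information if Stargate is up.
  response = Net::HTTP.get_response(URI.parse("http://localhost:8080/version"))
  puts "#{response.code}: #{response.body}"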
data/guides/storage-conf.rdoc
ADDED
@@ -0,0 +1,310 @@
+Example storage-conf.xml file:
+
+<!--
+ ~ Licensed to the Apache Software Foundation (ASF) under one
+ ~ or more contributor license agreements. See the NOTICE file
+ ~ distributed with this work for additional information
+ ~ regarding copyright ownership. The ASF licenses this file
+ ~ to you under the Apache License, Version 2.0 (the
+ ~ "License"); you may not use this file except in compliance
+ ~ with the License. You may obtain a copy of the License at
+ ~
+ ~ http://www.apache.org/licenses/LICENSE-2.0
+ ~
+ ~ Unless required by applicable law or agreed to in writing,
+ ~ software distributed under the License is distributed on an
+ ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ~ KIND, either express or implied. See the License for the
+ ~ specific language governing permissions and limitations
+ ~ under the License.
+-->
+<Storage>
+<!--======================================================================-->
+<!-- Basic Configuration -->
+<!--======================================================================-->
+
+<!--
+ ~ The name of this cluster. This is mainly used to prevent machines in
+ ~ one logical cluster from joining another.
+-->
+<ClusterName>Local Testing</ClusterName>
+
+<!--
+ ~ Turn on to make new [non-seed] nodes automatically migrate the right data
+ ~ to themselves. (If no InitialToken is specified, they will pick one
+ ~ such that they will get half the range of the most-loaded node.)
+ ~ If a node starts up without bootstrapping, it will mark itself bootstrapped
+ ~ so that you can't subsequently accidently bootstrap a node with
+ ~ data on it. (You can reset this by wiping your data and commitlog
+ ~ directories.)
+ ~
+ ~ Off by default so that new clusters and upgraders from 0.4 don't
+ ~ bootstrap immediately. You should turn this on when you start adding
+ ~ new nodes to a cluster that already has data on it. (If you are upgrading
+ ~ from 0.4, start your cluster with it off once before changing it to true.
+ ~ Otherwise, no data will be lost but you will incur a lot of unnecessary
+ ~ I/O before your cluster starts up.)
+-->
+<AutoBootstrap>false</AutoBootstrap>
+
+<!--
+ ~ Keyspaces and ColumnFamilies:
+ ~ A ColumnFamily is the Cassandra concept closest to a relational
+ ~ table. Keyspaces are separate groups of ColumnFamilies. Except in
+ ~ very unusual circumstances you will have one Keyspace per application.
+
+ ~ There is an implicit keyspace named 'system' for Cassandra internals.
+-->
+<Keyspaces>
+<Keyspace Name="Bigrecord">
+<ColumnFamily Name="animals" CompareWith="UTF8Type" />
+<ColumnFamily Name="books" CompareWith="UTF8Type" />
+<ColumnFamily Name="companies" CompareWith="UTF8Type" />
+<ColumnFamily Name="employees" CompareWith="UTF8Type" />
+<ColumnFamily Name="novels" CompareWith="UTF8Type" />
+<ColumnFamily Name="zoos" CompareWith="UTF8Type" />
+
+<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
+
+<ReplicationFactor>1</ReplicationFactor>
+
+<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
+</Keyspace>
+</Keyspaces>
+
+<!--
+ ~ Authenticator: any IAuthenticator may be used, including your own as long
+ ~ as it is on the classpath. Out of the box, Cassandra provides
+ ~ org.apache.cassandra.auth.AllowAllAuthenticator and,
+ ~ org.apache.cassandra.auth.SimpleAuthenticator
+ ~ (SimpleAuthenticator uses access.properties and passwd.properties by
+ ~ default).
+ ~
+ ~ If you don't specify an authenticator, AllowAllAuthenticator is used.
+-->
+<Authenticator>org.apache.cassandra.auth.AllowAllAuthenticator</Authenticator>
+
+<!--
+ ~ Partitioner: any IPartitioner may be used, including your own as long
+ ~ as it is on the classpath. Out of the box, Cassandra provides
+ ~ org.apache.cassandra.dht.RandomPartitioner,
+ ~ org.apache.cassandra.dht.OrderPreservingPartitioner, and
+ ~ org.apache.cassandra.dht.CollatingOrderPreservingPartitioner.
+ ~ (CollatingOPP colates according to EN,US rules, not naive byte
+ ~ ordering. Use this as an example if you need locale-aware collation.)
+ ~ Range queries require using an order-preserving partitioner.
+ ~
+ ~ Achtung! Changing this parameter requires wiping your data
+ ~ directories, since the partitioner can modify the sstable on-disk
+ ~ format.
+-->
+<Partitioner>org.apache.cassandra.dht.RandomPartitioner</Partitioner>
+
+<!--
+ ~ If you are using an order-preserving partitioner and you know your key
+ ~ distribution, you can specify the token for this node to use. (Keys
+ ~ are sent to the node with the "closest" token, so distributing your
+ ~ tokens equally along the key distribution space will spread keys
+ ~ evenly across your cluster.) This setting is only checked the first
+ ~ time a node is started.
+
+ ~ This can also be useful with RandomPartitioner to force equal spacing
+ ~ of tokens around the hash space, especially for clusters with a small
+ ~ number of nodes.
+-->
+<InitialToken></InitialToken>
+
+<!--
+ ~ Directories: Specify where Cassandra should store different data on
+ ~ disk. Keep the data disks and the CommitLog disks separate for best
+ ~ performance
+-->
+<CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
+<DataFileDirectories>
+<DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
+</DataFileDirectories>
+
+
+<!--
+ ~ Addresses of hosts that are deemed contact points. Cassandra nodes
+ ~ use this list of hosts to find each other and learn the topology of
+ ~ the ring. You must change this if you are running multiple nodes!
+-->
+<Seeds>
+<Seed>127.0.0.1</Seed>
+</Seeds>
+
+
+<!-- Miscellaneous -->
+
+<!-- Time to wait for a reply from other nodes before failing the command -->
+<RpcTimeoutInMillis>10000</RpcTimeoutInMillis>
+<!-- Size to allow commitlog to grow to before creating a new segment -->
+<CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>
+
+
+<!-- Local hosts and ports -->
+
+<!--
+ ~ Address to bind to and tell other nodes to connect to. You _must_
+ ~ change this if you want multiple nodes to be able to communicate!
+ ~
+ ~ Leaving it blank leaves it up to InetAddress.getLocalHost(). This
+ ~ will always do the Right Thing *if* the node is properly configured
+ ~ (hostname, name resolution, etc), and the Right Thing is to use the
+ ~ address associated with the hostname (it might not be).
+-->
+<ListenAddress>localhost</ListenAddress>
+<!-- internal communications port -->
+<StoragePort>7000</StoragePort>
+
+<!--
+ ~ The address to bind the Thrift RPC service to. Unlike ListenAddress
+ ~ above, you *can* specify 0.0.0.0 here if you want Thrift to listen on
+ ~ all interfaces.
+ ~
+ ~ Leaving this blank has the same effect it does for ListenAddress,
+ ~ (i.e. it will be based on the configured hostname of the node).
+-->
+<ThriftAddress>localhost</ThriftAddress>
+<!-- Thrift RPC port (the port clients connect to). -->
+<ThriftPort>9160</ThriftPort>
+<!--
+ ~ Whether or not to use a framed transport for Thrift. If this option
+ ~ is set to true then you must also use a framed transport on the
+ ~ client-side, (framed and non-framed transports are not compatible).
+-->
+<ThriftFramedTransport>false</ThriftFramedTransport>
+
+
+<!--======================================================================-->
+<!-- Memory, Disk, and Performance -->
+<!--======================================================================-->
+
+<!--
+ ~ Access mode. mmapped i/o is substantially faster, but only practical on
+ ~ a 64bit machine (which notably does not include EC2 "small" instances)
+ ~ or relatively small datasets. "auto", the safe choice, will enable
+ ~ mmapping on a 64bit JVM. Other values are "mmap", "mmap_index_only"
+ ~ (which may allow you to get part of the benefits of mmap on a 32bit
+ ~ machine by mmapping only index files) and "standard".
+ ~ (The buffer size settings that follow only apply to standard,
+ ~ non-mmapped i/o.)
+-->
+<DiskAccessMode>auto</DiskAccessMode>
+
+<!--
+ ~ Size of compacted row above which to log a warning. (If compacted
+ ~ rows do not fit in memory, Cassandra will crash. This is explained
+ ~ in http://wiki.apache.org/cassandra/CassandraLimitations and is
+ ~ scheduled to be fixed in 0.7.)
+-->
+<RowWarningThresholdInMB>512</RowWarningThresholdInMB>
+
+<!--
+ ~ Buffer size to use when performing contiguous column slices. Increase
+ ~ this to the size of the column slices you typically perform.
+ ~ (Name-based queries are performed with a buffer size of
+ ~ ColumnIndexSizeInKB.)
+-->
+<SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
+
+<!--
+ ~ Buffer size to use when flushing memtables to disk. (Only one
+ ~ memtable is ever flushed at a time.) Increase (decrease) the index
+ ~ buffer size relative to the data buffer if you have few (many)
+ ~ columns per key. Bigger is only better _if_ your memtables get large
+ ~ enough to use the space. (Check in your data directory after your
+ ~ app has been running long enough.) -->
+<FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
+<FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
+
+<!--
+ ~ Add column indexes to a row after its contents reach this size.
+ ~ Increase if your column values are large, or if you have a very large
+ ~ number of columns. The competing causes are, Cassandra has to
+ ~ deserialize this much of the row to read a single column, so you want
+ ~ it to be small - at least if you do many partial-row reads - but all
+ ~ the index data is read for each access, so you don't want to generate
+ ~ that wastefully either.
+-->
+<ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
+
+<!--
+ ~ Flush memtable after this much data has been inserted, including
+ ~ overwritten data. There is one memtable per column family, and
+ ~ this threshold is based solely on the amount of data stored, not
+ ~ actual heap memory usage (there is some overhead in indexing the
+ ~ columns).
+-->
+<MemtableThroughputInMB>64</MemtableThroughputInMB>
+<!--
+ ~ Throughput setting for Binary Memtables. Typically these are
+ ~ used for bulk load so you want them to be larger.
+-->
+<BinaryMemtableThroughputInMB>256</BinaryMemtableThroughputInMB>
+<!--
+ ~ The maximum number of columns in millions to store in memory per
+ ~ ColumnFamily before flushing to disk. This is also a per-memtable
+ ~ setting. Use with MemtableThroughputInMB to tune memory usage.
+-->
+<MemtableOperationsInMillions>0.3</MemtableOperationsInMillions>
+<!--
+ ~ The maximum time to leave a dirty memtable unflushed.
+ ~ (While any affected columnfamilies have unflushed data from a
+ ~ commit log segment, that segment cannot be deleted.)
+ ~ This needs to be large enough that it won't cause a flush storm
+ ~ of all your memtables flushing at once because none has hit
+ ~ the size or count thresholds yet. For production, a larger
+ ~ value such as 1440 is recommended.
+-->
+<MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
+
+<!--
+ ~ Unlike most systems, in Cassandra writes are faster than reads, so
+ ~ you can afford more of those in parallel. A good rule of thumb is 2
+ ~ concurrent reads per processor core. Increase ConcurrentWrites to
+ ~ the number of clients writing at once if you enable CommitLogSync +
+ ~ CommitLogSyncDelay. -->
+<ConcurrentReads>8</ConcurrentReads>
+<ConcurrentWrites>32</ConcurrentWrites>
+
+<!--
+ ~ CommitLogSync may be either "periodic" or "batch." When in batch
+ ~ mode, Cassandra won't ack writes until the commit log has been
+ ~ fsynced to disk. It will wait up to CommitLogSyncBatchWindowInMS
+ ~ milliseconds for other writes, before performing the sync.
+
+ ~ This is less necessary in Cassandra than in traditional databases
+ ~ since replication reduces the odds of losing data from a failure
+ ~ after writing the log entry but before it actually reaches the disk.
+ ~ So the other option is "periodic," where writes may be acked immediately
+ ~ and the CommitLog is simply synced every CommitLogSyncPeriodInMS
+ ~ milliseconds.
+-->
+<CommitLogSync>periodic</CommitLogSync>
+<!--
+ ~ Interval at which to perform syncs of the CommitLog in periodic mode.
+ ~ Usually the default of 10000ms is fine; increase it if your i/o
+ ~ load is such that syncs are taking excessively long times.
+-->
+<CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
+<!--
+ ~ Delay (in milliseconds) during which additional commit log entries
+ ~ may be written before fsync in batch mode. This will increase
+ ~ latency slightly, but can vastly improve throughput where there are
+ ~ many writers. Set to zero to disable (each entry will be synced
+ ~ individually). Reasonable values range from a minimal 0.1 to 10 or
+ ~ even more if throughput matters more than latency.
+-->
+<!-- <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS> -->
+
+<!--
+ ~ Time to wait before garbage-collection deletion markers. Set this to
+ ~ a large enough value that you are confident that the deletion marker
+ ~ will be propagated to all replicas by the time this many seconds has
+ ~ elapsed, even in the face of hardware failures. The default value is
+ ~ ten days.
+-->
+<GCGraceSeconds>10</GCGraceSeconds>
+</Storage>
data/lib/big_record/connection_adapters/cassandra_adapter.rb
CHANGED
@@ -68,7 +68,7 @@ module BigRecord
 def update_raw(table_name, row, values, timestamp)
   result = nil
   log "UPDATE #{table_name} SET #{values.inspect if values} WHERE ROW=#{row};" do
-    result = @connection.insert(table_name, row,
+    result = @connection.insert(table_name, row, values, {:consistency => Cassandra::Consistency::QUORUM})
   end
   result
 end
@@ -84,8 +84,7 @@ module BigRecord
 def get_raw(table_name, row, column, options={})
   result = nil
   log "SELECT (#{column}) FROM #{table_name} WHERE ROW=#{row};" do
-
-    result = @connection.get(table_name, row, super_column, name)
+    result = @connection.get(table_name, row, column)
   end
   result
 end
@@ -103,33 +102,33 @@ module BigRecord
 
 def get_columns_raw(table_name, row, columns, options={})
   result = {}
-
+
   log "SELECT (#{columns.join(", ")}) FROM #{table_name} WHERE ROW=#{row};" do
-
-
+    prefix_mode = false
+    prefixes = []
 
-
-
+    columns.each do |name|
+      prefix, name = name.split(":")
+      prefixes << prefix+":" unless prefixes.include?(prefix+":")
+      prefix_mode = name.blank?
+    end
 
-
+    if prefix_mode
+      prefixes.sort!
+      values = @connection.get(table_name, row, {:start => prefixes.first, :finish => prefixes.last + "~"})
 
-      result["id"] = row if values && values.
-
-
-      result[
+      result["id"] = row if values && values.size > 0
+
+      values.each do |key,value|
+        result[key] = value unless value.blank?
       end
     else
-      values = @connection.get_columns(table_name, row,
+      values = @connection.get_columns(table_name, row, columns)
+
       result["id"] = row if values && values.compact.size > 0
-
-
-
-      values[id].each do |column_name, value|
-        next if value.nil?
-
-        full_key = super_columns[id] + ":" + column_name
-        result[full_key] = value
-      end
+
+      columns.each_index do |id|
+        result[columns[id].to_s] = values[id] unless values[id].blank?
       end
     end
   end
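To make the prefix handling in the hunk above easier to follow, here is a small standalone illustration in plain Ruby (using nil/empty checks in place of Rails' blank?) of how asking for a bare column family such as "attribute:" switches the adapter into prefix mode and becomes a start/finish range.

  columns = ["attribute:"]                    # a whole column family was requested
  prefix, name = columns.first.split(":")     # => ["attribute", nil]
  prefix_mode  = name.nil? || name.empty?     # blank? equivalent => true

  # The range scans every column from "attribute:" up to "attribute:~"; "~" sorts
  # after the characters normally used in column names under UTF8Type comparison.
  range = { :start => "#{prefix}:", :finish => "#{prefix}:~" }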
@@ -144,11 +143,11 @@ module BigRecord
 row_cols.each do |key,value|
   begin
     result[key] =
-
-
-
-
-
+      if key == 'id'
+        value
+      else
+        deserialize(value)
+      end
   rescue Exception => e
     puts "Could not load column value #{key} for row=#{row.name}"
   end
@@ -160,9 +159,9 @@ module BigRecord
 result = []
 log "SCAN (#{columns.join(", ")}) FROM #{table_name} WHERE START_ROW=#{start_row} AND STOP_ROW=#{stop_row} LIMIT=#{limit};" do
   options = {}
-  options[:start] = start_row
-  options[:finish] = stop_row
-  options[:count] = limit
+  options[:start] = start_row unless start_row.blank?
+  options[:finish] = stop_row unless stop_row.blank?
+  options[:count] = limit unless limit.blank?
 
   keys = @connection.get_range(table_name, options)
 
@@ -172,14 +171,9 @@ module BigRecord
 row = {}
 row["id"] = key.key
 
-key.columns.each do |
-
-
-
-  super_column.columns.each do |column|
-    full_key = super_column_name + ":" + column.name
-    row[full_key] = column.value
-  end
+key.columns.each do |col|
+  column = col.column
+  row[column.name] = column.value
 end
 
 result << row if row.keys.size > 1
@@ -266,31 +260,6 @@ module BigRecord
 
 protected
 
-def data_to_cassandra_format(data = {})
-  super_columns = {}
-
-  data.each do |name, value|
-    super_column, column = name.split(":")
-    super_columns[super_column.to_s] = {} unless super_columns.has_key?(super_column.to_s)
-    super_columns[super_column.to_s][column.to_s] = value
-  end
-
-  return super_columns
-end
-
-def columns_to_cassandra_format(column_names = [])
-  super_columns = {}
-
-  column_names.each do |name|
-    super_column, sub_column = name.split(":")
-
-    super_columns[super_column.to_s] = [] unless super_columns.has_key?(super_column.to_s)
-    super_columns[super_column.to_s] << sub_column
-  end
-
-  return super_columns
-end
-
 def log(str, name = nil)
   if block_given?
     if @logger and @logger.level <= Logger::INFO
@@ -346,4 +315,4 @@ module BigRecord
       end
     end
   end
-end
+end
metadata
CHANGED
@@ -5,8 +5,8 @@ version: !ruby/object:Gem::Version
 segments:
 - 0
 - 1
--
-version: 0.1.
+- 1
+version: 0.1.1
 platform: ruby
 authors:
 - openplaces.org
@@ -14,7 +14,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2010-
+date: 2010-05-05 00:00:00 -04:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
@@ -77,8 +77,11 @@ extra_rdoc_files:
 - LICENSE
 - README.rdoc
 - guides/bigrecord_specs.rdoc
+- guides/cassandra_install.rdoc
 - guides/deployment.rdoc
 - guides/getting_started.rdoc
+- guides/hbase_install.rdoc
+- guides/storage-conf.rdoc
 files:
 - Rakefile
 - VERSION
@@ -92,8 +95,11 @@ files:
 - generators/bigrecord_model/templates/model.rb
 - generators/bigrecord_model/templates/model_spec.rb
 - guides/bigrecord_specs.rdoc
+- guides/cassandra_install.rdoc
 - guides/deployment.rdoc
 - guides/getting_started.rdoc
+- guides/hbase_install.rdoc
+- guides/storage-conf.rdoc
 - init.rb
 - install.rb
 - lib/big_record.rb