abacus 0.0.1

@@ -0,0 +1,119 @@
1
+ Abacus
2
+ ======
3
+
4
+ Abacus is an xdxf parser and semantic toolset for Ruby.
5
+
6
+ Installation
7
+ ------------
8
+
9
+ gem install abacus
10
+
11
+ Abacus uses a SQLite database to store dictionary data. A config.yml file determines the path of the database file; the default config.yml lives inside the gem directory, in the lib folder. You can point to a different config.yml file by setting the constant ABACUS\_CONFIG\_FILEPATH before requiring the gem.
12
+
13
+ The default config.yml file sets the database path to an ".abacus\_db" folder within the current user's home directory. For a production environment you may, as stated above, set the ABACUS\_CONFIG\_FILEPATH constant before requiring the gem in order to use a different config.yml file.
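As a rough illustration of the lookup described above, the following sketch returns a user-supplied path when the constant is defined and a default otherwise. The helper name `abacus_config_path` and the default path are assumptions made for this example; they are not part of the gem.

```ruby
# Hypothetical default; the gem ships its own config.yml inside lib.
DEFAULT_ABACUS_CONFIG = File.join(Dir.pwd, 'lib', 'config.yml')

# Return the path held by ABACUS_CONFIG_FILEPATH when that constant has
# been defined, otherwise fall back to the given default.
def abacus_config_path(default = DEFAULT_ABACUS_CONFIG)
  if Object.const_defined?(:ABACUS_CONFIG_FILEPATH)
    Object.const_get(:ABACUS_CONFIG_FILEPATH)
  else
    default
  end
end

puts abacus_config_path
```

In an application this would amount to writing `ABACUS_CONFIG_FILEPATH = '/path/to/config.yml'` on the line before `require 'abacus'`.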
14
+
15
+ Config.yml also stores configuration parameters for some tools of the toolset; these are described later.
16
+
17
+ After installing Abacus you may want to import some dictionaries in order to start using the semantic tools. The first time you do that, run from the command line:
18
+
19
+ abacus db:create
20
+
21
+ in order to create the database file. To use a different config file, set the ABACUS\_CONFIG\_FILEPATH environment variable:
22
+
23
+ ABACUS_CONFIG_FILEPATH=/where/you/want abacus db:create
24
+
25
+ Then use the import function to load xdxf dictionaries (you can choose from a broad selection here: http://xdxf.revdanica.com/down/ and I used this one: http://downloads.sourceforge.net/xdxf/comn\_sdict05\_eng\_eng\_main.tar.bz2). At the moment the parser is pretty naive and supports only some tags, but the raw markup of every article is already stored as 'raw_text' during the import, so future releases may bring improvements to already imported dictionaries as well. The syntax to import a dict file is:
26
+
27
+ abacus db:xdxf:import filename.xdxf
28
+
29
+ The import populates the db tables with the contents of the dictionary file. Multiple dictionaries can be added by running the above command multiple times.
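To give an idea of what the import reads, here is a minimal sketch that extracts the dictionary attributes and the keys of each article from a tiny hand-written xdxf fragment. It uses REXML rather than the Nokogiri parser the gem depends on, and the sample XML and variable names are invented for the example.

```ruby
require 'rexml/document'

# Hand-written sample; a real xdxf file holds many more <ar> elements.
XDXF_SAMPLE = <<~XML
  <xdxf lang_from="ENG" lang_to="ENG">
    <full_name>Sample dictionary</full_name>
    <ar><k>ironic</k> adj. using or displaying irony.</ar>
    <ar><k>ruby</k> n. a rare precious stone.</ar>
  </xdxf>
XML

doc  = REXML::Document.new(XDXF_SAMPLE)
root = doc.root

# Dictionary-level information: the full name and the language pair.
dictionary = {
  full_name: root.elements['full_name'].text,
  lang_from: root.attributes['lang_from'],
  lang_to:   root.attributes['lang_to']
}

# One entry per <ar>: its keys (<k> children) and the untouched markup.
articles = root.get_elements('ar').map do |ar|
  { keys: ar.get_elements('k').map(&:text), raw_text: ar.to_s }
end

puts dictionary[:full_name]
articles.each { |article| puts article[:keys].join(',') }
```

Each hash in `articles` corresponds roughly to one Article row together with its ArticleKey records.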
30
+
31
+ Navigate the dictionary
32
+ -----------------------
33
+
34
+ To navigate the dictionary, do the following:
35
+
36
+ >> require 'abacus'
37
+ => true
38
+ >> include Abacus
39
+ => Object
40
+ >> Dictionary.all
41
+ => [#<Abacus::Dictionary id: 1, full_name: "English explanatory dictionary (main)", lang_from: "ENG", lang_to: "ENG", description: nil>]
42
+ >> Dictionary.first.articles[1000..1002]
43
+ => [#<Abacus::Article id: 1001, dictionary_id: 1, raw_text: "Camberwell Beauty\nn. a deep purple butterfly, Nymph...">, #<Abacus::Article id: 1002, dictionary_id: 1, raw_text: "Cambodian\nn. & adj. --n. 1 a a native or national o...">, #<Abacus::Article id: 1003, dictionary_id: 1, raw_text: "Cambrian\n\313\210k\303\246mbr\311\252\311\231n adj. & n. --adj. 1 Welsh. 2 ...">]
44
+ >> Dictionary.first.articles.find(13000).article_keys
45
+ => [#<Abacus::ArticleKey id: 13000, the_key: "decipher", raw_text: "decipher">]
46
+ >> ArticleKey.find_by_the_key("ruby").articles.first
47
+ => #<Abacus::Article id: 34912, dictionary_id: 1, raw_text: "ruby\n\313\210ru:b\311\252 n., adj., & v. --n. (pl. -ies) 1 a ra...">
48
+ >> ArticleKey.find_by_the_key("ruby").articles.first.raw_text
49
+ => "ruby\n\313\210ru:b\311\252 n., adj., & v. --n. (pl. -ies) 1 a rare precious stone consisting of corundum with a colour varying from deep crimson or purple to pale rose. 2 a glowing purple-tinged red colour. --adj. of this colour. --v.tr. (-ies, -ied) dye or tinge ruby-colour. \303\270ruby glass glass coloured with oxides of copper, iron, lead, tin, etc. ruby-tail a wasp, Chrysis ignita, with a ruby-coloured hinder part. ruby wedding the fortieth anniversary of a wedding. [ME f. OF rubi f. med.L rubinus (lapis) red (stone), rel. to L rubeus red]"
50
+
51
+
52
+ There are two main models, Article and ArticleKey. Article is the hub for all the article properties and contains the raw text (attribute raw_text) taken from the XML:
53
+
54
+ XML FILE:
55
+ <ar><k>ironic</k>
56
+ <tr>aɪˈrɔnɪk</tr> adj. (also ironical) 1 using or displaying irony. 2 in the nature of irony. øøironically adv. [F ironique or LL ironicus f. Gk eironikos dissembling (as IRONY(1))]</ar>
57
+
58
+ IRB:
59
+ >> ArticleKey.find_by_the_key('ironic').articles[0].raw_text
60
+ => "ironic\na\311\252\313\210r\311\224n\311\252k adj. (also ironical) 1 using or displaying irony. 2 in the nature of irony. \303\270\303\270ironically adv. [F ironique or LL ironicus f. Gk eironikos dissembling (as IRONY(1))]"
61
+
62
+ Each article may have one or more article_keys, which contain the linguistic identifiers of the article itself. Each article key can in turn be related to more than one article, but from different dictionaries.
63
+
64
+
65
+ FIRST TOOL: HERIGONE MNEMONIC SYSTEM
66
+ ------------------------------------
67
+
68
+ (For a detailed explanation of this technique see Wikipedia: http://en.wikipedia.org/wiki/Herigone%27s\_mnemonic_system) Within config.yml you can set a list of associations between numbers and letters (the standard one is already included in the default config file):
69
+
70
+ Here's a sample config.yml file (this is also the default one):
71
+
72
+ database:
73
+ adapter: sqlite3
74
+ database: <%=File.join(ENV['HOME'] || ENV['USERPROFILE'] || (Abacus::LIB_ROOT + File::SEPARATOR + ".."),'.abacus_db','abacus')%>
75
+ timeout: 5000
76
+
77
+ system:
78
+ default:
79
+ 0:
80
+ z,s
81
+ 1:
82
+ t,d,th
83
+ 2:
84
+ n
85
+ 3:
86
+ m
87
+ 4:
88
+ r
89
+ 5:
90
+ l
91
+ 6:
92
+ j,ch,sh,dge
93
+ 7:
94
+ k
95
+ 8:
96
+ f,ph,v
97
+ 9:
98
+ p,b
99
+
100
+ Following the above example you can create your own config file and specify your own system. Multiple systems are supported: simply list them one below the other under the system key in config.yml.
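The sketch below shows one way such a file could be read: the text is rendered through ERB first and then parsed as YAML, and each comma-separated value is split into a list. The second system, called "simple", and all variable names are made up for the illustration.

```ruby
require 'yaml'
require 'erb'

# Sample config with a second, made-up system called "simple".
CONFIG_TEXT = <<~'CONFIG'
  database:
    adapter: sqlite3
    database: <%= File.join(ENV['HOME'] || '.', '.abacus_db', 'abacus') %>
    timeout: 5000

  system:
    default:
      0: z,s
      1: t,d,th
    simple:
      0: s
      1: t
CONFIG

# Render the ERB tags, then parse the result as YAML.
config = YAML.load(ERB.new(CONFIG_TEXT).result)

# Split each comma-separated value into a list of letter groups.
systems = config['system'].transform_values do |mapping|
  mapping.transform_values { |letters| letters.to_s.split(',') }
end

puts systems.keys.inspect
puts systems['default'][1].inspect
```

Each top-level key under `system` becomes one named system, which is the name passed to the generate command.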
101
+
102
+ To enhance the imported dictionaries with the Hérigone system, run from the command line:
103
+
104
+ abacus db:herigone:generate default [replace default with your system name]
105
+
106
+ Then you can perform some interesting queries, as follows:
107
+
108
+ >> HerigoneNumber.find_by_number(357)
109
+ => #<Abacus::HerigoneNumber id: 19081, system: "default", number: 357>
110
+ >> HerigoneNumber.find_by_number(357).article_keys
111
+ => [#<Abacus::ArticleKey id: 20159, the_key: "hemlock", raw_text: "hemlock">, #<Abacus::ArticleKey id: 26062, the_key: "milk", raw_text: "milk">, #<Abacus::ArticleKey id: 26068, the_key: "milky", raw_text: "milky">]
112
+ >> HerigoneNumber.find_by_number(357).article_keys.map{|a| a.the_key}
113
+ => ["hemlock", "milk", "milky"]
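To see why these three words share the number 357, here is a small stand-alone sketch of the word-to-number conversion using the default system from the config above. It is an illustration rather than the gem's own code: the constant names are invented, and this version deliberately tries longer letter groups first (so "sh" is matched before "s"), which may differ from what the gem does.

```ruby
# Default system as listed in config.yml: digit => letter groups.
SYSTEM = {
  0 => %w[z s], 1 => %w[t d th], 2 => %w[n], 3 => %w[m], 4 => %w[r],
  5 => %w[l], 6 => %w[j ch sh dge], 7 => %w[k], 8 => %w[f ph v], 9 => %w[p b]
}.freeze

# Reverse lookup: letter group => digit.
SOUND_TO_DIGIT = SYSTEM.flat_map { |digit, sounds|
  sounds.map { |sound| [sound, digit] }
}.to_h.freeze

# Alternation of all letter groups, longest first (assumption of this sketch).
SOUND_PATTERN = Regexp.union(SOUND_TO_DIGIT.keys.sort_by { |sound| -sound.length })

# Scan the word for known letter groups and concatenate their digits.
# Letters that belong to no group (vowels, for example) are skipped.
def to_number(word)
  word.downcase.scan(SOUND_PATTERN).map { |sound| SOUND_TO_DIGIT[sound] }.join
end

%w[hemlock milk milky].each { |word| puts "#{word} -> #{to_number(word)}" }
```

In "hemlock" only m, l and k belong to a letter group, giving 3, 5 and 7; the same holds for "milk" and "milky".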
114
+
115
+
116
+ CONCLUSIONS
117
+ -----------
118
+
119
+ If you need more information please have a look at the source code, or send me an email at sandro dot paganotti at gmail dot com.
@@ -0,0 +1,51 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ #= Synopsis
4
+ # Abacus helps you search and navigate inside xdxf files,
5
+ # use this command line utility to populate the database
6
+ #
7
+ #= Examples
8
+ # Import a dictionary with this command
9
+ # abacus db:xdxf:import 'path/to/dictionary.xdxf'
10
+ #
11
+ # Generate Herigone numbers for a system with
12
+ # abacus db:herigone:generate default
13
+ #
14
+ #= Usage
15
+ # abacus {db:create|db:xdxf:import|db:herigone:generate} [argument]
16
+ #
17
+ #= Options
18
+ # -h --help print this help page
19
+ # -v --version print the application version
20
+ # no options at the moment
21
+ #
22
+ #= Author
23
+ # Sandro Paganotti
24
+ #
25
+ #= Copyright
26
+ # Copyright (c) 2010 Sandro Paganotti. Licensed under the MIT License:
27
+ # http://www.opensource.org/licenses/mit-license.php
28
+
29
+ ABACUS_CONFIG_FILEPATH = ENV["ABACUS_CONFIG_FILEPATH"] if !ENV["ABACUS_CONFIG_FILEPATH"].nil?
30
+ require File.join(File.dirname(__FILE__),'..','lib','abacus')
31
+
32
+
33
+ case ARGV.shift
34
+ when 'db:create'
35
+ # Create the database if it does not exist yet
36
+ Abacus::MigrationUtilities.create_database! if !File.exist?(Abacus::Abacus::CONNECTION_STRING['database'])
37
+
38
+ when 'db:xdxf:import'
39
+ # Import an xdxf file into the database
40
+ Abacus::Abacus.import(ARGV.shift, :verbose=>true)
41
+
42
+ when 'db:herigone:generate'
43
+ # Generate Herigone numbers for a specific system
44
+ Abacus::Herigone.new(ARGV.shift).generate_on_db(:verbose=>true)
45
+
46
+
47
+ end
48
+
49
+
50
+
51
+
@@ -0,0 +1,15 @@
1
+ require 'rubygems'
2
+ require 'nokogiri'
3
+ require 'open-uri'
4
+ require 'sqlite3'
5
+ require 'active_record'
6
+ require 'fileutils'
7
+ require 'yaml'
8
+ require 'erb'
9
+
10
+ module Abacus
11
+ LIB_ROOT = File.dirname(__FILE__)
12
+ end
13
+
14
+ require File.join(File.dirname(__FILE__),'abacus','main')
15
+
@@ -0,0 +1,45 @@
1
+ module Abacus
2
+
3
+ class Herigone
4
+
5
+ attr_reader :name
6
+
7
+ def initialize(name = 'default')
8
+ @name, @system = name, Hash[*(CONFIG_FILE['system'][name].to_a.flatten.map{|v| "#{v}"=~/,/ ? v.split(',') : v })]
9
+ @rev_system = Hash[*(@system.to_a.map{|t| [*t[1]].map{|m|[m,t[0]]}}.flatten)]
10
+ end
11
+
12
+ def generate_on_db(*args)
13
+ dictionaries = args.first.is_a?(Hash) ? Dictionary.all : args.shift
14
+ options = args.shift || {}
15
+
16
+ Abacus.transaction do
17
+ dictionaries.each do |dict|
18
+ dict.articles.each do |art|
19
+ art.article_keys.each do |ak|
20
+ if (en = HerigoneNumber.first(:conditions=>{:number=>(n=to_number(ak.the_key)),:system=>@name})).nil?
21
+ ak.herigone_numbers.create(:number=>n,:system=>@name)
22
+ else
23
+ ak.herigone_numbers << en
24
+ end
25
+ puts "<< #{ak.the_key} to #{n}" if options[:verbose]
26
+ end
27
+ end
28
+ end
29
+ end
30
+ end
31
+
32
+ def to_word(string)
33
+ raise TypeError.new("Fatal: #{string} is not a number") unless (Float(string) rescue false)
34
+
35
+ end
36
+
37
+ def to_number(string)
38
+ string.downcase.scan(Regexp.compile(@system.values.join('|'))).inject('') do |memo,nr_match|
39
+ "#{memo}#{@rev_system[nr_match]}"
40
+ end
41
+ end
42
+
43
+ end
44
+
45
+ end
@@ -0,0 +1,23 @@
1
+ # if (Object.const_get("ABACUS_CONFIG_FILEPATH") rescue true)
2
+ # ABACUS_CONFIG_FILEPATH = File.join(File.dirname(__FILE__),'..','config.yml')
3
+ # end
4
+ # puts ABACUS_CONFIG_FILEPATH
5
+ Object.const_get("ABACUS_CONFIG_FILEPATH") rescue begin
6
+ ABACUS_CONFIG_FILEPATH = File.join(File.dirname(__FILE__),'..','config.yml')
7
+ end
8
+
9
+ module Abacus
10
+ CONFIG_FILE = YAML::load(ERB.new(IO.read(ABACUS_CONFIG_FILEPATH)).result)
11
+ end
12
+
13
+ require File.join(File.dirname(__FILE__), 'models','abacus')
14
+ require File.join(File.dirname(__FILE__), 'models','article')
15
+ require File.join(File.dirname(__FILE__), 'models','article_key')
16
+ require File.join(File.dirname(__FILE__), 'models','article_key_article_join')
17
+ require File.join(File.dirname(__FILE__), 'models','dictionary')
18
+ require File.join(File.dirname(__FILE__), 'models','article_key_herigone_number_join')
19
+ require File.join(File.dirname(__FILE__), 'models','herigone_number')
20
+
21
+ require File.join(File.dirname(__FILE__), 'apps','herigone')
22
+
23
+ require File.join(File.dirname(__FILE__), 'utils')
@@ -0,0 +1,42 @@
1
+ module Abacus
2
+ class Abacus < ActiveRecord::Base
3
+
4
+ self.abstract_class = true
5
+ CONNECTION_STRING = CONFIG_FILE['database']
6
+
7
+ establish_connection CONNECTION_STRING
8
+
9
+ class << self
10
+
11
+ # Import an xdxf file into the database
12
+ def import(filename, options = {:verbose=>false})
13
+ doc = Nokogiri::XML::Document.parse(File.read(filename))
14
+
15
+ Abacus.transaction do
16
+
17
+ # Info regarding the dict type
18
+ dic = Dictionary.create!(
19
+ :full_name => doc.css('full_name').first.content,
20
+ :lang_from => doc.css('xdxf').first.attributes['lang_from'].value,
21
+ :lang_to => doc.css('xdxf').first.attributes['lang_to'].value
22
+ )
23
+
24
+ # Articles
25
+ doc.css('ar').each do |ar|
26
+ (ardb = dic.articles.create!(
27
+ :raw_text => ar.to_s
28
+ )).article_keys = ar.css('k').map{ |k|
29
+ ArticleKey.find_or_create_by_the_key(
30
+ :the_key => k.xpath('text()').to_s,
31
+ :raw_text => k.to_s
32
+ )
33
+ }
34
+ puts "<< #{ardb.article_keys.map{|ak| ak.the_key}.join(",")}" if options[:verbose]
35
+ end
36
+
37
+ end
38
+
39
+ end
40
+ end
41
+ end
42
+ end
@@ -0,0 +1,11 @@
1
+ module Abacus
2
+ class Article < Abacus
3
+
4
+ validates_presence_of :dictionary_id, :raw_text
5
+
6
+ belongs_to :dictionary
7
+ has_many :article_key_article_joins, :dependent => :destroy, :class_name=>"Abacus::ArticleKeyArticleJoin"
8
+ has_many :article_keys, :through=>:article_key_article_joins, :class_name=>"Abacus::ArticleKey"
9
+
10
+ end
11
+ end
@@ -0,0 +1,13 @@
1
+ module Abacus
2
+ class ArticleKey < Abacus
3
+
4
+ validates_presence_of :the_key
5
+ validates_uniqueness_of :the_key
6
+
7
+ has_many :article_key_article_joins, :dependent => :destroy, :class_name=>"Abacus::ArticleKeyArticleJoin"
8
+ has_many :articles, :through=>:article_key_article_joins, :class_name=>"Abacus::Article"
9
+ has_many :article_key_herigone_number_joins, :dependent => :destroy, :class_name=>"Abacus::ArticleKeyHerigoneNumberJoin"
10
+ has_many :herigone_numbers, :through=>:article_key_herigone_number_joins, :class_name=>"Abacus::HerigoneNumber"
11
+
12
+ end
13
+ end
@@ -0,0 +1,8 @@
1
+ module Abacus
2
+ class ArticleKeyArticleJoin < Abacus
3
+
4
+ belongs_to :article, :class_name => "Abacus::Article"
5
+ belongs_to :article_key, :class_name => "Abacus::ArticleKey"
6
+
7
+ end
8
+ end
@@ -0,0 +1,9 @@
1
+ module Abacus
2
+ class ArticleKeyHerigoneNumberJoin < Abacus
3
+
4
+ belongs_to :herigone_number, :class_name => "Abacus::HerigoneNumber"
5
+ belongs_to :article_key, :class_name => "Abacus::ArticleKey"
6
+
7
+ end
8
+ end
9
+
@@ -0,0 +1,8 @@
1
+ module Abacus
2
+ class Dictionary < Abacus
3
+
4
+ validates_presence_of :lang_from, :lang_to, :full_name
5
+ has_many :articles, :dependent => :destroy, :class_name=>"Abacus::Article"
6
+
7
+ end
8
+ end
@@ -0,0 +1,11 @@
1
+ module Abacus
2
+ class HerigoneNumber < Abacus
3
+
4
+ validates_presence_of :system, :number
5
+ validates_uniqueness_of :number, :scope=>:system
6
+
7
+ has_many :article_key_herigone_number_joins, :dependent => :destroy, :class_name=>"Abacus::ArticleKeyHerigoneNumberJoin"
8
+ has_many :article_keys, :through=>:article_key_herigone_number_joins,:class_name=>"Abacus::ArticleKey"
9
+
10
+ end
11
+ end
@@ -0,0 +1,57 @@
1
+
2
+ # TOOLS TO MANAGE Abacus DATABASE
3
+ module Abacus
4
+
5
+ class MigrationUtilities
6
+
7
+ class << self
8
+
9
+ # Creates the database and runs the migrations
10
+ def create_database!
11
+
12
+ FileUtils.mkdir_p(File.split(Abacus::CONNECTION_STRING['database']).first)
13
+
14
+ ActiveRecord::Base.establish_connection Abacus::CONNECTION_STRING
15
+ ActiveRecord::Base.connection
16
+
17
+ ActiveRecord::Schema.define do
18
+ create_table :dictionaries do |t|
19
+ t.column :full_name, :string
20
+ t.column :lang_from, :string
21
+ t.column :lang_to, :string
22
+ t.column :description, :text
23
+ end
24
+
25
+ create_table :articles do |t|
26
+ t.column :dictionary_id, :integer
27
+ t.column :raw_text, :text
28
+ end
29
+
30
+ create_table :article_key_article_joins do |t|
31
+ t.column :article_id, :integer
32
+ t.column :article_key_id, :integer
33
+ end
34
+
35
+ create_table :article_keys do |t|
36
+ t.column :the_key, :string
37
+ t.column :raw_text, :text
38
+ end
39
+
40
+ create_table :herigone_numbers do |t|
41
+ t.column :system, :string
42
+ t.column :number, :integer, :limit=>8
43
+ end
44
+
45
+ create_table :article_key_herigone_number_joins do |t|
46
+ t.column :article_key_id, :integer
47
+ t.column :herigone_number_id, :integer
48
+ end
49
+
50
+ add_index :herigone_numbers, :number
51
+ add_index :article_keys, :the_key
52
+ end
53
+ end
54
+
55
+ end
56
+ end
57
+ end
@@ -0,0 +1,27 @@
1
+ database:
2
+ adapter: sqlite3
3
+ database: <%=File.join(ENV['HOME'] || ENV['USERPROFILE'] || (Abacus::LIB_ROOT + File::SEPARATOR + ".."),'.abacus_db','abacus')%>
4
+ timeout: 5000
5
+
6
+ system:
7
+ default:
8
+ 0:
9
+ z,s
10
+ 1:
11
+ t,d,th
12
+ 2:
13
+ n
14
+ 3:
15
+ m
16
+ 4:
17
+ r
18
+ 5:
19
+ l
20
+ 6:
21
+ j,ch,sh,dge
22
+ 7:
23
+ k
24
+ 8:
25
+ f,ph,v
26
+ 9:
27
+ p,b
metadata ADDED
@@ -0,0 +1,116 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: abacus
3
+ version: !ruby/object:Gem::Version
4
+ prerelease: false
5
+ segments:
6
+ - 0
7
+ - 0
8
+ - 1
9
+ version: 0.0.1
10
+ platform: ruby
11
+ authors:
12
+ - Sandro Paganotti
13
+ autorequire:
14
+ bindir: bin
15
+ cert_chain: []
16
+
17
+ date: 2010-02-28 00:00:00 +01:00
18
+ default_executable:
19
+ dependencies:
20
+ - !ruby/object:Gem::Dependency
21
+ name: nokogiri
22
+ prerelease: false
23
+ requirement: &id001 !ruby/object:Gem::Requirement
24
+ requirements:
25
+ - - ">="
26
+ - !ruby/object:Gem::Version
27
+ segments:
28
+ - 1
29
+ - 3
30
+ - 3
31
+ version: 1.3.3
32
+ type: :runtime
33
+ version_requirements: *id001
34
+ - !ruby/object:Gem::Dependency
35
+ name: sqlite3-ruby
36
+ prerelease: false
37
+ requirement: &id002 !ruby/object:Gem::Requirement
38
+ requirements:
39
+ - - ">="
40
+ - !ruby/object:Gem::Version
41
+ segments:
42
+ - 1
43
+ - 2
44
+ - 5
45
+ version: 1.2.5
46
+ type: :runtime
47
+ version_requirements: *id002
48
+ - !ruby/object:Gem::Dependency
49
+ name: activerecord
50
+ prerelease: false
51
+ requirement: &id003 !ruby/object:Gem::Requirement
52
+ requirements:
53
+ - - ">="
54
+ - !ruby/object:Gem::Version
55
+ segments:
56
+ - 2
57
+ - 3
58
+ - 5
59
+ version: 2.3.5
60
+ type: :runtime
61
+ version_requirements: *id003
62
+ description:
63
+ email: sandro.paganotti@gmail.com
64
+ executables:
65
+ - abacus
66
+ extensions: []
67
+
68
+ extra_rdoc_files:
69
+ - README.markdown
70
+ files:
71
+ - bin/abacus
72
+ - lib/abacus/apps/herigone.rb
73
+ - lib/abacus/main.rb
74
+ - lib/abacus/models/abacus.rb
75
+ - lib/abacus/models/article.rb
76
+ - lib/abacus/models/article_key.rb
77
+ - lib/abacus/models/article_key_article_join.rb
78
+ - lib/abacus/models/article_key_herigone_number_join.rb
79
+ - lib/abacus/models/dictionary.rb
80
+ - lib/abacus/models/herigone_number.rb
81
+ - lib/abacus/utils.rb
82
+ - lib/abacus.rb
83
+ - lib/config.yml
84
+ - README.markdown
85
+ has_rdoc: true
86
+ homepage: http://github.com/sandropaganotti/Abacus
87
+ licenses: []
88
+
89
+ post_install_message:
90
+ rdoc_options: []
91
+
92
+ require_paths:
93
+ - lib
94
+ required_ruby_version: !ruby/object:Gem::Requirement
95
+ requirements:
96
+ - - ">="
97
+ - !ruby/object:Gem::Version
98
+ segments:
99
+ - 0
100
+ version: "0"
101
+ required_rubygems_version: !ruby/object:Gem::Requirement
102
+ requirements:
103
+ - - ">="
104
+ - !ruby/object:Gem::Version
105
+ segments:
106
+ - 0
107
+ version: "0"
108
+ requirements: []
109
+
110
+ rubyforge_project:
111
+ rubygems_version: 1.3.6
112
+ signing_key:
113
+ specification_version: 3
114
+ summary: Abacus is an xdxf parser and semantic toolset for Ruby.
115
+ test_files: []
116
+