rbtagger 0.3.2 → 0.4.0

Files changed (59)
  1. data/README +44 -0
  2. data/Rakefile +78 -4
  3. data/ext/rule_tagger/registry.c +4 -4
  4. data/ext/rule_tagger/registry.h +1 -1
  5. data/ext/word_tagger/rtagger.cc +23 -1
  6. data/ext/word_tagger/tagger.cc +9 -4
  7. data/ext/word_tagger/tagger.h +2 -0
  8. data/ext/word_tagger/test.rb +2 -2
  9. data/lib/brill/brown/{LEXICON → Lexicon.rb} +0 -0
  10. data/lib/brill/tagger.rb +1 -1
  11. data/lib/rbtagger.rb +0 -3
  12. data/lib/rbtagger/version.rb +2 -2
  13. data/lib/word/tagger.rb +2 -1
  14. metadata +38 -101
  15. data/COPYING +0 -21
  16. data/History.txt +0 -4
  17. data/License.txt +0 -20
  18. data/Manifest.txt +0 -82
  19. data/PostInstall.txt +0 -1
  20. data/README.txt +0 -51
  21. data/config/hoe.rb +0 -74
  22. data/config/requirements.rb +0 -15
  23. data/ext/rule_tagger/mkmf.log +0 -46
  24. data/ext/word_tagger/mkmf.log +0 -24
  25. data/ext/word_tagger/test/Makefile +0 -22
  26. data/ext/word_tagger/test/doc.txt +0 -87
  27. data/lib/brill/brown/CONTEXTUALRULEFILE +0 -284
  28. data/lib/brill/brown/LEXICALRULEFILE +0 -148
  29. data/script/console +0 -10
  30. data/script/destroy +0 -14
  31. data/script/generate +0 -14
  32. data/script/txt2html +0 -82
  33. data/setup.rb +0 -1585
  34. data/tasks/deployment.rake +0 -34
  35. data/tasks/environment.rake +0 -7
  36. data/tasks/extconf.rake +0 -18
  37. data/tasks/extconf/rule_tagger.rake +0 -43
  38. data/tasks/extconf/word_tagger.rake +0 -43
  39. data/tasks/website.rake +0 -17
  40. data/test/docs/doc0.txt +0 -20
  41. data/test/docs/doc1.txt +0 -11
  42. data/test/docs/doc2.txt +0 -52
  43. data/test/docs/doc3.txt +0 -128
  44. data/test/docs/doc4.txt +0 -337
  45. data/test/docs/doc5.txt +0 -497
  46. data/test/docs/doc6.txt +0 -116
  47. data/test/docs/doc7.txt +0 -101
  48. data/test/docs/doc8.txt +0 -25
  49. data/test/docs/doc9.txt +0 -84
  50. data/test/fixtures/tags.txt +0 -976
  51. data/test/test_helper.rb +0 -5
  52. data/test/test_rule_tagger.rb +0 -151
  53. data/test/test_word_tagger.rb +0 -47
  54. data/tools/rakehelp.rb +0 -113
  55. data/website/index.html +0 -231
  56. data/website/index.txt +0 -70
  57. data/website/javascripts/rounded_corners_lite.inc.js +0 -285
  58. data/website/stylesheets/screen.css +0 -138
  59. data/website/template.html.erb +0 -184
data/test/test_helper.rb DELETED
@@ -1,5 +0,0 @@
- require 'test/unit'
- $:.unshift File.join(File.dirname(__FILE__),'..','ext','rule_tagger')
- $:.unshift File.join(File.dirname(__FILE__),'..','ext','word_tagger')
- $:.unshift File.join(File.dirname(__FILE__),'..','lib')
- require 'rbtagger'
data/test/test_rule_tagger.rb DELETED
@@ -1,151 +0,0 @@
- require File.dirname(__FILE__) + '/test_helper'
-
-
- class TestRuleTagger< Test::Unit::TestCase
-   SAMPLE_DOC=%q(
- Take an active role in your care
- When it comes to making decisions about the goals and direction of treatment, don't sit back. Work closely and actively with your oncologist and the rest of your medical team.
- Dont overlook clinical trials
- If youre eligible to enroll in clinical trials, select an oncologist who participates in them. Patients who enroll in clinical studies receive closer follow-up, the highest standard-of-care treatment and access to experimental therapies at no extra cost.
- Maximize your nutrition strategy
- Doing your best to eat a healthy, well-balanced diet is vital to prompt healing after surgery and for recovery from radiation or chemotherapy. Many oncology practices employ registered dieticians who can help you optimize your nutrition.
- Steer clear of "natural cures"
- Before trying nutritional supplements or herbal remedies, be sure to discuss your plans with a doctor. Most have not been tested in clinical studies, and some may actually interfere with your treatment.
- Build a stronger body
- Even walking regularly is can help you minimize long-term muscle weakness caused by illness or de-conditioning.
- Focus on overall health
- Patients may be cured of cancer but still face life-threatening medical problems that are underemphasized during cancer treatments, such as diabetes, high blood pressure and heart disease. Continue to monitor your overall health.
- Put the fire out for good
- Smoking impairs healing after surgery and radiation and increases your risk of cardiovascular disease and many types of cancers. Ask your doctor for help identifying and obtaining the most appropriate cessation aids.
- Map a healthy future
- Once youve completed treatment, discuss appropriate follow-up plans with your doctor and keep track of them yourself. Intensified screening over many years is frequently recommended to identify and treat a recurrence early on.
- Share your feelings
- Allow yourself time to discuss the emotional consequences of your illness and treatment with family, friends, your doctor and, if necessary, a professional therapist. Many patients also find antidepressants helpful during treatment.
- Stay connected
- Although many newly diagnosed patients fear they will not be able to keep working during treatment, this is usually not the case. Working, even at a reduced schedule, helps you maintain valuable social connections and weekly structure.
- )
-   SAMPLE_DOC2=%q(
- Britney Spears was granted a change in her visitation schedule with her sons Sean Preston and Jayden James at a hearing Tuesday.
- "There was a change in visitation status that was ordered by Commissioner Gordon this morning," Los Angeles Superior Court spokesperson Alan Parachini confirmed after the hearing, which both Kevin Federline and her father (and co-conservator) Jamie Spears attended. (Britney and Kevin did not address each other during the hearing.)
- The details of her visitation, however, are unclear.
- "I'm not at liberty to answer any questions about the nature of that change," Parachini said. (TMZ.com had reported that Spears wanted overnight visits.)
- Asked by Us if she were happy with the court outcome, Spears (clutching an Ed Hardy purse) smiled and told Us, "Yes."
- Next up: A status hearing set for July 15.
- The couple last appeared in court May 6. Spears was granted extended visitation — three days a week from 9 a.m. to 5 p.m. — of Sean Preston, 2, and Jayden James, 20 months.
- )
-   SAMPLE_DOC3=%q(
- TMZ.com: Britney celebrated getting overnights with her kids by going on a wild shopping trip for herself.With L.A.'s finest at her service, it was a total clusterf**k outside of Fred Segal as Brit Brit made her way out. The scene was crazy -- and it was all... Read more
- )
-   def setup
-     if !defined?($tagger)
-       $rtagger = Brill::Tagger.new
-     end
-   end
-
-   def test_simple_tagger
-     pairs = tagger.tag( SAMPLE_DOC )
-     tags = [["", ")"], ["", ")"], ["Take", "VB"], ["an", "DT"], ["active", "JJ"], ["role", "NN"], ["in", "IN"],
-       ["your", "PRP$"], ["care", "NN"], ["When", "WRB"], ["it", "PRP"], ["comes", "VBZ"], ["to", "TO"],
-       ["making", "VBG"], ["decisions", "NNS"], ["about", "IN"], ["the", "DT"], ["goals", "NNS"], ["and", "CC"],
-       ["direction", "NN"], ["of", "IN"], ["treatment", "NN"], [",", ","], ["", ")"], ["do", "VBP"], ["", ")"],
-       ["n't", "RB"], ["sit", "VB"], ["back.", "CD"], ["Work", "NN"], ["closely", "RB"], ["and", "CC"],
-       ["actively", "RB"], ["with", "IN"], ["your", "PRP$"], ["oncologist", "NN"], ["and", "CC"], ["the", "DT"],
-       ["rest", "NN"], ["of", "IN"], ["your", "PRP$"], ["medical", "JJ"], ["team.", "JJ"], ["Do", "VBP"],
-       ["", ")"], ["n't", "RB"], ["overlook", "VB"], ["clinical", "JJ"], ["trials", "NNS"], ["If", "IN"],
-       ["you", "PRP"], ["'re", "VBP"], ["eligible", "JJ"], ["to", "TO"], ["enroll", "VB"], ["in", "IN"],
-       ["clinical", "JJ"], ["trials", "NNS"], [",", ","], ["", ")"], ["select", "VB"], ["an", "DT"],
-       ["oncologist", "NN"], ["who", "WP"], ["participates", "VBZ"], ["in", "IN"], ["them.", "JJ"],
-       ["Patients", "NNS"], ["who", "WP"], ["enroll", "VBP"], ["in", "IN"], ["clinical", "JJ"],
-       ["studies", "NNS"], ["receive", "VBP"], ["closer", "JJR"], ["follow-up", "NN"], [",", ","], ["", ")"],
-       ["the", "DT"], ["highest", "JJS"], ["standard-of-care", "JJ"], ["treatment", "NN"], ["and", "CC"],
-       ["access", "NN"], ["to", "TO"], ["experimental", "JJ"], ["therapies", "NNS"], ["at", "IN"], ["no", "DT"],
-       ["extra", "JJ"], ["cost.", "NNP"], ["Maximize", "NNP"], ["your", "PRP$"], ["nutrition", "NN"],
-       ["strategy", "NN"], ["Doing", "NNP"], ["your", "PRP$"], ["best", "JJS"], ["to", "TO"], ["eat", "VB"],
-       ["a", "DT"], ["healthy", "JJ"], [",", ","], ["", ")"], ["well-balanced", "JJ"], ["diet", "NN"],
-       ["is", "VBZ"], ["vital", "JJ"], ["to", "TO"], ["prompt", "VB"], ["healing", "NN"], ["after", "IN"],
-       ["surgery", "NN"], ["and", "CC"], ["for", "IN"], ["recovery", "NN"], ["from", "IN"], ["radiation", "NN"],
-       ["or", "CC"], ["chemotherapy.", "JJ"], ["Many", "JJ"], ["oncology", "NN"], ["practices", "NNS"],
-       ["employ", "VBP"], ["registered", "VBN"], ["dieticians", "NNS"], ["who", "WP"], ["can", "MD"],
-       ["help", "VB"], ["you", "PRP"], ["optimize", "VB"], ["your", "PRP$"], ["nutrition.", "JJ"],
-       ["Steer", "VB"], ["clear", "JJ"], ["of", "IN"], ["", ")"], ["``", "``"], ["natural", "JJ"],
-       ["cures", "NNS"], ["''", "''"], ["", ")"], ["Before", "IN"], ["trying", "VBG"], ["nutritional", "JJ"],
-       ["supplements", "NNS"], ["or", "CC"], ["herbal", "JJ"], ["remedies", "NNS"], [",", ","], ["", ")"],
-       ["be", "VB"], ["sure", "JJ"], ["to", "TO"], ["discuss", "VB"], ["your", "PRP$"], ["plans", "NNS"],
-       ["with", "IN"], ["a", "DT"], ["doctor.", "JJ"], ["Most", "JJS"], ["have", "VBP"], ["not", "RB"],
-       ["been", "VBN"], ["tested", "VBN"], ["in", "IN"], ["clinical", "JJ"], ["studies", "NNS"], [",", ","],
-       ["", ")"], ["and", "CC"], ["some", "DT"], ["may", "MD"], ["actually", "RB"], ["interfere", "VB"],
-       ["with", "IN"], ["your", "PRP$"], ["treatment.", "JJ"], ["Build", "VB"], ["a", "DT"], ["stronger", "JJR"],
-       ["body", "NN"], ["Even", "RB"], ["walking", "VBG"], ["regularly", "RB"], ["is", "VBZ"], ["can", "MD"],
-       ["help", "VB"], ["you", "PRP"], ["minimize", "VB"], ["long-term", "JJ"], ["muscle", "NN"],
-       ["weakness", "NN"], ["caused", "VBN"], ["by", "IN"], ["illness", "NN"], ["or", "CC"],
-       ["de-conditioning.", "NNP"], ["Focus", "NNP"], ["on", "IN"], ["overall", "JJ"], ["health", "NN"],
-       ["Patients", "NNS"], ["may", "MD"], ["be", "VB"], ["cured", "VBN"], ["of", "IN"], ["cancer", "NN"],
-       ["but", "CC"], ["still", "JJ"], ["face", "NN"], ["life-threatening", "JJ"], ["medical", "JJ"],
-       ["problems", "NNS"], ["that", "WDT"], ["are", "VBP"], ["underemphasized", "JJ"], ["during", "IN"],
-       ["cancer", "NN"], ["treatments", "NNS"], [",", ","], ["", ")"], ["such", "JJ"], ["as", "IN"],
-       ["diabetes", "NN"], [",", ","], ["", ")"], ["high", "JJ"], ["blood", "NN"], ["pressure", "NN"],
-       ["and", "CC"], ["heart", "NN"], ["disease.", "JJ"], ["Continue", "VB"], ["to", "TO"], ["monitor", "VB"],
-       ["your", "PRP$"], ["overall", "JJ"], ["health.", "JJ"], ["Put", "NN"], ["the", "DT"], ["fire", "NN"],
-       ["out", "IN"], ["for", "IN"], ["good", "JJ"], ["Smoking", "NNP"], ["impairs", "NNS"], ["healing", "NN"],
-       ["after", "IN"], ["surgery", "NN"], ["and", "CC"], ["radiation", "NN"], ["and", "CC"], ["increases", "NNS"],
-       ["your", "PRP$"], ["risk", "NN"], ["of", "IN"], ["cardiovascular", "JJ"], ["disease", "NN"], ["and", "CC"],
-       ["many", "JJ"], ["types", "NNS"], ["of", "IN"], ["cancers.", "CD"], ["Ask", "VB"], ["your", "PRP$"],
-       ["doctor", "NN"], ["for", "IN"], ["help", "NN"], ["identifying", "VBG"], ["and", "CC"], ["obtaining", "VBG"],
-       ["the", "DT"], ["most", "RBS"], ["appropriate", "JJ"], ["cessation", "NN"], ["aids.", "NNP"], ["Map", "NNP"],
-       ["a", "DT"], ["healthy", "JJ"], ["future", "NN"], ["Once", "RB"], ["youve", "VBP"], ["completed", "VBN"],
-       ["treatment", "NN"], [",", ","], ["", ")"], ["discuss", "VB"], ["appropriate", "JJ"], ["follow-up", "NN"],
-       ["plans", "NNS"], ["with", "IN"], ["your", "PRP$"], ["doctor", "NN"], ["and", "CC"], ["keep", "VB"],
-       ["track", "NN"], ["of", "IN"], ["them", "PRP"], ["yourself.", "CD"], ["Intensified", "JJ"], ["screening", "NN"],
-       ["over", "IN"], ["many", "JJ"], ["years", "NNS"], ["is", "VBZ"], ["frequently", "RB"], ["recommended", "VBN"],
-       ["to", "TO"], ["identify", "VB"], ["and", "CC"], ["treat", "VB"], ["a", "DT"], ["recurrence", "NN"], ["early", "JJ"],
-       ["on.", "CD"], ["Share", "VB"], ["your", "PRP$"], ["feelings", "NNS"], ["Allow", "VB"], ["yourself", "PRP"],
-       ["time", "NN"], ["to", "TO"], ["discuss", "VB"], ["the", "DT"], ["emotional", "JJ"], ["consequences", "NNS"],
-       ["of", "IN"], ["your", "PRP$"], ["illness", "NN"], ["and", "CC"], ["treatment", "NN"], ["with", "IN"],
-       ["family", "NN"], [",", ","], ["", ")"], ["friends", "NNS"], [",", ","], ["", ")"], ["your", "PRP$"],
-       ["doctor", "NN"], ["and", "CC"], [",", ","], ["", ")"], ["if", "IN"], ["necessary", "JJ"], [",", ","],
-       ["", ")"], ["a", "DT"], ["professional", "JJ"], ["therapist.", "JJ"], ["Many", "JJ"], ["patients", "NNS"],
-       ["also", "RB"], ["find", "VBP"], ["antidepressants", "NNS"], ["helpful", "JJ"], ["during", "IN"],
-       ["treatment.", "JJ"], ["Stay", "VB"], ["connected", "VBN"], ["Although", "IN"], ["many", "JJ"],
-       ["newly", "RB"], ["diagnosed", "VBN"], ["patients", "NNS"], ["fear", "VBP"], ["they", "PRP"], ["will", "MD"],
-       ["not", "RB"], ["be", "VB"], ["able", "JJ"], ["to", "TO"], ["keep", "VB"], ["working", "VBG"], ["during", "IN"],
-       ["treatment", "NN"], [",", ","], ["", ")"], ["this", "DT"], ["is", "VBZ"], ["usually", "RB"], ["not", "RB"],
-       ["the", "DT"], ["case.", "CD"], ["Working", "NNP"], [",", ","], ["", ")"], ["even", "RB"], ["at", "IN"],
-       ["a", "DT"], ["reduced", "VBN"], ["schedule", "NN"], [",", ","], ["", ")"], ["helps", "VBZ"], ["you", "PRP"],
-       ["maintain", "VBP"], ["valuable", "JJ"], ["social", "JJ"], ["connections", "NNS"], ["and", "CC"],
-       ["weekly", "JJ"], ["structure", "NN"], [".", "."]]
-     assert_equal tags, pairs
-   end
118
-
119
- def test_multiple_docs
120
- #timer = Time.now
121
- count = 0
122
- Dir["#{File.dirname(__FILE__)}/docs/doc*"].each do|doc|
123
- tagger.tag( File.read( doc ) )
124
- count += 1
125
- end
126
- #duration = Time.now - timer
127
- #puts "time: #{duration} sec #{count.to_f/duration} docs/sec"
128
- end
129
-
130
- def test_suggest
131
- results = tagger.suggest( SAMPLE_DOC )
132
- # puts results.inspect
133
- assert results.include?(["treatment", "NN", 5])
134
- results = tagger.suggest( SAMPLE_DOC2 )
135
- assert results.include?(["Britney Spears", "NNP", 6])
136
- assert results.include?(["Jamie Spears", "NNP", 12])
137
- # puts results.inspect
138
- results = tagger.suggest( SAMPLE_DOC3, 5 )
139
- #puts results.inspect
140
- end
141
-
142
- def test_adjectives
143
- results = tagger.adjectives("So happy i get to bring my baby boy home tomorrow. Hospital tv is horrible, ten channels no one watches")
144
- assert_equal [["happy", "JJ"], ["horrible", "JJ"]], results
145
- end
146
-
147
- private
148
- def tagger
149
- $rtagger
150
- end
151
- end
data/test/test_word_tagger.rb DELETED
@@ -1,47 +0,0 @@
- require File.dirname(__FILE__) + '/test_helper'
-
- class TestWordTagger < Test::Unit::TestCase
-
-   def setup
-     if !defined?($wtagger)
-       $wtagger = Word::Tagger.new( File.join(File.dirname(__FILE__),'fixtures','tags.txt'), :words => 4 )
-     end
-   end
-
-   def test_basic
-     #timer = Time.now
-     text = "This is a sa'mple doc[]ument lets see how cancer ngrams 4 works out for this interesting text!"
-     tags = $wtagger.execute( text )
-     assert_equal ['cancer','work'], tags
-     #puts "Duration: #{Time.now - timer} sec"
-   end
-
-   def test_sample_bug
-     tags = ["foo", "bar", "baz", "squishy", "yummy"]
-     txt = 'This is some sample text. Foo walked into a bar. The bartender said "What can I get you?" Foo said he wanted something yummy - like a baz.'
-     tagger = Word::Tagger.new tags, :words => 4
-     result_tags = tagger.execute( txt )
-     assert_equal ["bar", "baz", "foo", "yummy"], result_tags
-   end
-
-   def test_ngram_size3
-     #timer = Time.now
-     text = "This body of text contains something like ventricular septal defect"
-     tags = $wtagger.execute( text )
-     assert_equal ['ventricular septal defect'], tags
-     #puts "Duration: #{Time.now - timer} sec"
-   end
-
-   def test_cat_and_the_hat
-     tagger = Word::Tagger.new( ['Cat','hat'], :words => 4 )
-     tags = tagger.execute( 'the cAt and the hat' )
-     assert_equal( ["Cat", "hat"], tags )
-   end
-
-   def test_freq_counts
-     tagger = Word::Tagger.new( ['Cat','hat'], :words => 4 )
-     tags = tagger.freq( 'the cAt and the hat the cAt and the hat the cAt and the hat the cAt and the hat' )
-     assert_equal( {"Cat"=>4, "hat"=>4}, tags )
-   end
-
- end
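Note on the deleted word tagger tests: the assertions above imply case-insensitive matching that reports each tag in the spelling it was registered with, sorted, plus per-tag counts from `freq`. A minimal pure-Ruby sketch of that behaviour follows, assuming single-word tags only. `SketchWordTagger` is a hypothetical name used for illustration; the gem's `Word::Tagger` is a compiled extension, and its n-gram and stemming logic (for example `works` matching `work`) is not reproduced here.

```ruby
# Hypothetical stand-in for illustration only; this is NOT the gem's Word::Tagger.
class SketchWordTagger
  def initialize(tags)
    # Map the lowercased form of each tag to the spelling it was registered with.
    @tags = tags.each_with_object({}) { |tag, map| map[tag.downcase] = tag }
  end

  # Count how often each registered tag occurs in +text+, ignoring case.
  def freq(text)
    counts = Hash.new(0)
    text.downcase.scan(/[a-z0-9']+/) do |word|
      tag = @tags[word]
      counts[tag] += 1 if tag
    end
    counts
  end

  # Return the matched tags in sorted order, without counts.
  def execute(text)
    freq(text).keys.sort
  end
end

tagger = SketchWordTagger.new(%w[Cat hat])
p tagger.execute('the cAt and the hat') # prints ["Cat", "hat"]
```

Whole-word lookup keeps `bartender` from matching the tag `bar`, which is the case `test_sample_bug` guards against.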
data/tools/rakehelp.rb DELETED
@@ -1,113 +0,0 @@
- # This final came directly from mongrel 1.0.1 source
- # with a few modifications to support some of my network tests
- # Also, i have figured out yet if this should remain so much a clone of the mongrel tree
- # or become a plugin, need to review more closely how that works
-
- def make(makedir)
-   Dir.chdir(makedir) do
-     sh(PLATFORM =~ /win32/ ? 'nmake' : 'make')
-   end
- end
-
- def extconf(dir)
-   Dir.chdir(dir) do ruby "extconf.rb" end
- end
-
- def setup_tests
-   Rake::TestTask.new do |t|
-     t.test_files = FileList["test/*_test.rb"]
-     t.verbose = true
-   end
- end
-
-
- def setup_clean otherfiles
-   files = ['build/*', '**/*.o', '**/*.so', '**/*.a', 'lib/*-*', '**/*.log'] + otherfiles
-   CLEAN.include(files)
- end
-
-
- def setup_rdoc files
-   Rake::RDocTask.new do |rdoc|
-     rdoc.rdoc_dir = 'doc/rdoc'
-     rdoc.options << '--line-numbers'
-     rdoc.rdoc_files.add(files)
-   end
- end
-
-
- def setup_extension(dir, extension)
-   ext = "ext/#{dir}"
-   ext_so = "#{ext}/#{extension}.#{Config::CONFIG['DLEXT']}"
-   ext_files = FileList[
-     "#{ext}/*.c",
-     "#{ext}/*.h",
-     "#{ext}/extconf.rb",
-     "#{ext}/Makefile",
-     "lib"
-   ]
-
-   task "lib" do
-     directory "lib"
-   end
-
-   desc "Builds just the #{extension} extension"
-   task extension.to_sym => ["#{ext}/Makefile", ext_so ]
-
-   file "#{ext}/Makefile" => ["#{ext}/extconf.rb"] do
-     extconf "#{ext}"
-   end
-
-   file ext_so => ext_files do
-     make "#{ext}"
-     cp ext_so, "lib"
-   end
- end
-
-
- def base_gem_spec(pkg_name, pkg_version)
-   rm_rf "test/coverage"
-   pkg_version = pkg_version
-   pkg_name = pkg_name
-   pkg_file_name = "#{pkg_name}-#{pkg_version}"
-   Gem::Specification.new do |s|
-     s.name = pkg_name
-     s.version = pkg_version
-     s.platform = Gem::Platform::RUBY
-     s.has_rdoc = true
-     s.extra_rdoc_files = [ "README" ]
-
-     s.files = %w(COPYING LICENSE README Rakefile) +
-       Dir.glob("{bin,doc/rdoc,test}/**/*") +
-       Dir.glob("ext/**/*.{h,c,rb,rl}") +
-       Dir.glob("{examples,tools,lib}/**/*.rb")
-
-     s.require_path = "lib"
-     s.extensions = FileList["ext/**/extconf.rb"].to_a
-     s.bindir = "bin"
-   end
- end
-
- def setup_gem(pkg_name, pkg_version)
-   spec = base_gem_spec(pkg_name, pkg_version)
-   yield spec if block_given?
-
-   Rake::GemPackageTask.new(spec) do |p|
-     p.gem_spec = spec
-     p.need_tar = true if RUBY_PLATFORM !~ /mswin/
-   end
- end
-
- # Conditional require rcov/rcovtask if present
- begin
-   require 'rcov/rcovtask'
-
-   Rcov::RcovTask.new do |t|
-     t.test_files = FileList['test/unit/*_test.rb'] + FileList["test/integration/*_test.rb"]
-     t.rcov_opts << "-x /usr"
-     t.output_dir = "test/coverage"
-     t.verbose = true
-   end
- rescue Object => e
-   puts e.message
- end
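Note on the deleted rakehelp.rb: it relies on the top-level `PLATFORM` constant and the `Config` module, neither of which later Ruby releases define; `RUBY_PLATFORM` and `RbConfig` are the current equivalents. Below is a hedged sketch of the same two lookups. The helper names `make_command` and `extension_path` are hypothetical and do not appear in the gem.

```ruby
require 'rbconfig'

# Hypothetical helper: pick the make tool from a platform string.
# Defaults to the running interpreter's RUBY_PLATFORM.
def make_command(platform = RUBY_PLATFORM)
  platform =~ /mswin|win32/ ? 'nmake' : 'make'
end

# Hypothetical helper: path of a compiled extension under ext/<dir>,
# using the shared-library suffix reported by RbConfig ("so", "bundle", ...).
def extension_path(dir, extension)
  File.join('ext', dir, "#{extension}.#{RbConfig::CONFIG['DLEXT']}")
end

puts make_command
puts extension_path('rule_tagger', 'rule_tagger')
```

Taking the platform string as a parameter keeps the choice testable without running on Windows.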
data/website/index.html DELETED
@@ -1,231 +0,0 @@
- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
- <head>
- <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
- <title>rbtagger</title>
- <style type="text/css">
- body {
-   background-color: #F1F1F1;
-   font-family: "Georgia", sans-serif;
-   font-size: 16px;
-   line-height: 1.6em;
-   padding: 1.6em 0 0 0;
-   color: #333;
- }
- h1, h2, h3, h4, h5, h6 {
-   color: #444;
- }
- h1 {
-   font-family: sans-serif;
-   font-weight: normal;
-   font-size: 4em;
-   line-height: 0.8em;
-   letter-spacing: -0.1ex;
-   margin: 5px;
- }
- li {
-   padding: 0;
-   margin: 0;
-   list-style-type: square;
- }
- a {
-   color: #5E5AFF;
-   background-color: #DAC;
-   font-weight: normal;
-   text-decoration: underline;
- }
- blockquote {
-   font-size: 90%;
-   font-style: italic;
-   border-left: 1px solid #111;
-   padding-left: 1em;
- }
- .caps {
-   font-size: 80%;
- }
-
- #main {
-   width: 45em;
-   padding: 0;
-   margin: 0 auto;
- }
- .coda {
-   text-align: right;
-   color: #77f;
-   font-size: smaller;
- }
-
- table {
-   font-size: 90%;
-   line-height: 1.4em;
-   color: #ff8;
-   background-color: #111;
-   padding: 2px 10px 2px 10px;
-   border-style: dashed;
- }
-
- th {
-   color: #fff;
- }
-
- td {
-   padding: 2px 10px 2px 10px;
- }
-
- .success {
-   color: #0CC52B;
- }
-
- .failed {
-   color: #E90A1B;
- }
-
- .unknown {
-   color: #995000;
- }
- pre, code {
-   font-family: monospace;
-   font-size: 90%;
-   line-height: 1.4em;
-   color: #ff8;
-   background-color: #111;
-   padding: 2px 10px 2px 10px;
- }
- .comment { color: #aaa; font-style: italic; }
- .keyword { color: #eff; font-weight: bold; }
- .punct { color: #eee; font-weight: bold; }
- .symbol { color: #0bb; }
- .string { color: #6b4; }
- .ident { color: #ff8; }
- .constant { color: #66f; }
- .regex { color: #ec6; }
- .number { color: #F99; }
- .expr { color: #227; }
-
- #version {
-   float: right;
-   text-align: right;
-   font-family: sans-serif;
-   font-weight: normal;
-   background-color: #B3ABFF;
-   color: #141331;
-   padding: 15px 20px 10px 20px;
-   margin: 0 auto;
-   margin-top: 15px;
-   border: 3px solid #141331;
-   display:block;
-   -moz-border-radius-bottomleft:10px;
-   -moz-border-radius-bottomright:10px;
-   -moz-border-radius-topleft:10px;
-   -moz-border-radius-topright:10px;
-   -webkit-border-bottom-left-radius:10px;
-   -webkit-border-bottom-right-radius:10px;
-   -webkit-border-top-left-radius:10px;
-   -webkit-border-top-right-radius:10px;
- }
-
- #version .numbers {
-   display: block;
-   font-size: 4em;
-   line-height: 0.8em;
-   letter-spacing: -0.1ex;
-   margin-bottom: 15px;
- }
-
- #version p {
-   text-decoration: none;
-   color: #141331;
-   background-color: #B3ABFF;
-   margin: 0;
-   padding: 0;
- }
-
- #version a {
-   text-decoration: none;
-   color: #141331;
-   background-color: #B3ABFF;
- }
-
- .clickable {
-   cursor: pointer;
-   cursor: hand;
- }
-
- </style>
- </head>
- <body>
158
- <div id="main">
159
-
160
- <h1>rbtagger</h1>
161
- <div id="version" class="clickable" onclick='document.location = "http://rubyforge.org/projects/rbtagger"; return false'>
162
- <p>Get Version</p>
163
- <a href="http://rubyforge.org/projects/rbtagger" class="numbers">0.3.1</a>
164
- </div>
165
- <h4 style="float:right;padding-right:10px;"> &amp;#x2192; &#8216;rbtagger&#8217;</h4>
166
- <h2>What</h2>
167
- <p>A Simple Ruby Rule-Based Part of Speech Tagger</p>
168
- <p>This work is based on the work of Eric Brill</p>
169
- <h2>Installing</h2>
170
- <p><pre class='syntax'>
171
- gem install rbtagger
172
- </pre></p>
173
- <h2>The basics</h2>
174
- <h4>Using the rule tagger</h4>
175
- <p><pre class='syntax'>
176
- <span class="ident">require</span> <span class="punct">'</span><span class="string">rbtagger</span><span class="punct">'</span>
177
-
178
- <span class="ident">tagger</span> <span class="punct">=</span> <span class="constant">Brill</span><span class="punct">::</span><span class="constant">Tagger</span><span class="punct">.</span><span class="ident">new</span>
179
- <span class="ident">docs</span><span class="punct">.</span><span class="ident">each</span> <span class="keyword">do</span><span class="punct">|</span><span class="ident">doc</span><span class="punct">|</span>
180
- <span class="ident">tagger</span><span class="punct">.</span><span class="ident">tag</span><span class="punct">(</span> <span class="constant">File</span><span class="punct">.</span><span class="ident">read</span><span class="punct">(</span> <span class="ident">doc</span> <span class="punct">)</span> <span class="punct">)</span>
181
- <span class="keyword">end</span>
182
-
183
- <span class="ident">tagger</span><span class="punct">.</span><span class="ident">suggest</span><span class="punct">(</span> <span class="constant">File</span><span class="punct">.</span><span class="ident">read</span><span class="punct">(&quot;</span><span class="string">sample.txt</span><span class="punct">&quot;)</span> <span class="punct">)</span>
184
- <span class="punct">=&gt;</span> <span class="punct">[[&quot;</span><span class="string">doctor</span><span class="punct">&quot;,</span> <span class="punct">&quot;</span><span class="string">NN</span><span class="punct">&quot;,</span> <span class="number">3</span><span class="punct">],</span> <span class="punct">[&quot;</span><span class="string">treatment</span><span class="punct">&quot;,</span> <span class="punct">&quot;</span><span class="string">NN</span><span class="punct">&quot;,</span> <span class="number">5</span><span class="punct">]]</span>
185
-
186
- <span class="ident">tagger</span><span class="punct">.</span><span class="ident">nouns</span>
187
- <span class="ident">tagger</span><span class="punct">.</span><span class="ident">adjectives</span>
188
- </pre></p>
189
- <h4>Using the word tagger</h4>
190
- <p><pre class='syntax'>
191
- <span class="ident">require</span> <span class="punct">'</span><span class="string">rbtagger</span><span class="punct">'</span>
192
-
193
- <span class="ident">tagger</span> <span class="punct">=</span> <span class="constant">Word</span><span class="punct">::</span><span class="constant">Tagger</span><span class="punct">.</span><span class="ident">new</span><span class="punct">(</span> <span class="punct">['</span><span class="string">cat</span><span class="punct">','</span><span class="string">hat</span><span class="punct">'],</span> <span class="symbol">:words</span> <span class="punct">=&gt;</span> <span class="number">4</span> <span class="punct">)</span>
194
- <span class="ident">tags</span> <span class="punct">=</span> <span class="ident">tagger</span><span class="punct">.</span><span class="ident">execute</span><span class="punct">(</span> <span class="punct">'</span><span class="string">the cat and the hat</span><span class="punct">'</span> <span class="punct">)</span>
195
- <span class="ident">assert_equal</span><span class="punct">(</span> <span class="punct">[&quot;</span><span class="string">cat</span><span class="punct">&quot;,</span> <span class="punct">&quot;</span><span class="string">hat</span><span class="punct">&quot;],</span> <span class="ident">tags</span> <span class="punct">)</span>
196
- </pre></p>
197
- <h2>Forum</h2>
198
- <p><a href="http://groups.google.com/group/rb-brill-tagger">http://groups.google.com/group/rb-brill-tagger</a></p>
199
- <h2>How to submit patches</h2>
200
- <p>Read the <a href="http://drnicwilliams.com/2007/06/01/8-steps-for-fixing-other-peoples-code/">8 steps for fixing other people&#8217;s code</a> and for section <a href="http://drnicwilliams.com/2007/06/01/8-steps-for-fixing-other-peoples-code/#8b-google-groups">8b: Submit patch to Google Groups</a>, use the Google Group above.</p>
201
- <ul>
202
- <li>github: <a href="http://github.com/taf2/rb-brill-tagger/tree/master">http://github.com/taf2/rb-brill-tagger/tree/master</a></li>
203
- </ul>
204
- <pre>git clone git://github.com/taf2/rb-brill-tagger.git</pre>
205
- <h3>Build and test instructions</h3>
206
- <pre>cd rb-brill-tagger
207
- rake test
208
- rake install_gem</pre>
209
- <h2>License</h2>
210
- <p>This code is free to use under the terms of the <span class="caps">MIT</span> license.</p>
211
- <h2>Contact</h2>
212
- <p>Comments are welcome. Send an email to <a href="mailto:rb-brill-tagger@googlegroups.com">Todd A. Fisher</a> email via the <a href="http://groups.google.com/group/rb-brill-tagger">forum</a></p>
213
- <p class="coda">
214
- <a href="http://xullicious.blogspot.com/">Todd A. Fisher</a>, 21st May 2009<br>
215
- Theme extended from <a href="http://rb2js.rubyforge.org/">Paul Battley</a>
216
- </p>
217
- </div>
218
-
219
- <!-- insert site tracking codes here, like Google Urchin -->
220
- <script type="text/javascript">
221
- var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
222
- document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
223
- </script>
224
- <script type="text/javascript">
225
- var pageTracker = _gat._getTracker("UA-246931-6");
226
- pageTracker._initData();
227
- pageTracker._trackPageview();
228
- </script>
229
-
230
- </body>
231
- </html>