classifier-reborn 2.0.4 → 2.3.0

Files changed (36)
  1. checksums.yaml +5 -5
  2. data/LICENSE +74 -1
  3. data/README.markdown +57 -207
  4. data/data/stopwords/ar +104 -0
  5. data/data/stopwords/bn +362 -0
  6. data/data/stopwords/hi +97 -0
  7. data/data/stopwords/ja +43 -0
  8. data/data/stopwords/ru +420 -0
  9. data/data/stopwords/tr +199 -30
  10. data/data/stopwords/vi +647 -0
  11. data/data/stopwords/zh +125 -0
  12. data/lib/classifier-reborn/backends/bayes_memory_backend.rb +77 -0
  13. data/lib/classifier-reborn/backends/bayes_redis_backend.rb +109 -0
  14. data/lib/classifier-reborn/backends/no_redis_error.rb +14 -0
  15. data/lib/classifier-reborn/bayes.rb +141 -65
  16. data/lib/classifier-reborn/category_namer.rb +6 -4
  17. data/lib/classifier-reborn/extensions/hasher.rb +22 -39
  18. data/lib/classifier-reborn/extensions/token_filter/stemmer.rb +24 -0
  19. data/lib/classifier-reborn/extensions/token_filter/stopword.rb +48 -0
  20. data/lib/classifier-reborn/extensions/token_filter/symbol.rb +20 -0
  21. data/lib/classifier-reborn/extensions/tokenizer/token.rb +36 -0
  22. data/lib/classifier-reborn/extensions/tokenizer/whitespace.rb +28 -0
  23. data/lib/classifier-reborn/extensions/vector.rb +35 -28
  24. data/lib/classifier-reborn/extensions/vector_serialize.rb +10 -10
  25. data/lib/classifier-reborn/extensions/zero_vector.rb +7 -0
  26. data/lib/classifier-reborn/lsi/cached_content_node.rb +6 -5
  27. data/lib/classifier-reborn/lsi/content_node.rb +35 -25
  28. data/lib/classifier-reborn/lsi/summarizer.rb +7 -5
  29. data/lib/classifier-reborn/lsi/word_list.rb +5 -6
  30. data/lib/classifier-reborn/lsi.rb +166 -94
  31. data/lib/classifier-reborn/validators/classifier_validator.rb +170 -0
  32. data/lib/classifier-reborn/version.rb +3 -1
  33. data/lib/classifier-reborn.rb +12 -1
  34. metadata +98 -17
  35. data/bin/bayes.rb +0 -36
  36. data/bin/summarize.rb +0 -16
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
-   metadata.gz: 193d6c53d76559337140f192fc69910418015abe
-   data.tar.gz: 745ec353d12ad84aaf74cca6b235401127f00075
+ SHA256:
+   metadata.gz: '0100803f158326f660f53694ff5d0d400440792bb5174a10d80ae7eb780c5b6b'
+   data.tar.gz: 1f5a249471e67beb8796a0a61f47ea18fa2f0a252e832f03cb7e7b1937921fa5
  SHA512:
-   metadata.gz: c9891b16c6e9fb2ddfffd32a2335f59bfc55a5e97dae675b652acaa9122a2fc268bfc0d4c2be945456fe46397601127dd2a39782a99767718adf7fee373bdae2
-   data.tar.gz: af844b19d90186a6e3866cdcbfa3deb6ba8492e4646298d6712168c1a329f34fbd24c551ff241a561711a8516cf57577dd7fe1a96bb069190e8545a09a570ff0
+   metadata.gz: e63b40492f9d35092353c198822f2ce444d05dec7613572048c3f420eecda4040c84026fe621ccb6c316e9862bc25258d47e32663168eb8f67c2b29b41733c57
+   data.tar.gz: abad42c42694cea59acf4bb59184a8f2aaa1d909826b126b4917b67b350c3ca9a14a3b688bd648e7ff8bba241a72e7846c749b91092e7ea91b5bc373c793b24f
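Note: the hunk above replaces the SHA1 digests with SHA256 ones and refreshes the SHA512 values. To illustrate what those entries represent, here is a minimal Ruby sketch that computes the SHA256 and SHA512 hex digests of a file so they can be compared by hand with the values in checksums.yaml. The helper name `file_checksums` is hypothetical, and this is only an assumption-laden illustration, not how RubyGems itself verifies packages.

```ruby
require 'digest'

# Hypothetical helper (not part of the gem or of RubyGems):
# returns the hex digests of one file, keyed by algorithm name.
def file_checksums(path)
  {
    'SHA256' => Digest::SHA256.file(path).hexdigest,
    'SHA512' => Digest::SHA512.file(path).hexdigest
  }
end

# Example call, assuming a data.tar.gz extracted from the .gem archive:
#   file_checksums('data.tar.gz').each { |algo, hex| puts "#{algo}: #{hex}" }
```

The digests are computed over the files inside the .gem archive (metadata.gz and data.tar.gz), not over the .gem file itself.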
data/LICENSE CHANGED
@@ -426,4 +426,77 @@ the Free Software Foundation.
  14. If you wish to incorporate parts of the Library into other free
  programs whose distribution conditions are incompatible with these,
  write to the author to ask for permission. For software which is
- copyrighted by
+ copyrighted by the Free Software Foundation, write to the Free
+ Software Foundation; we sometimes make exceptions for this. Our
+ decision will be guided by the two goals of preserving the free status
+ of all derivatives of our free software and of promoting the sharing
+ and reuse of software generally.
+
+ NO WARRANTY
+
+ 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
+ WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
+ EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
+ OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
+ KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
+ LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
+ THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
+ WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
+ AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
+ FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
+ CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
+ LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
+ RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
+ FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
+ SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
+ DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Libraries
+
+ If you develop a new library, and you want it to be of the greatest
+ possible use to the public, we recommend making it free software that
+ everyone can redistribute and change. You can do so by permitting
+ redistribution under these terms (or, alternatively, under the terms of the
+ ordinary General Public License).
+
+ To apply these terms, attach the following notices to the library. It is
+ safest to attach them to the start of each source file to most effectively
+ convey the exclusion of warranty; and each file should have at least the
+ "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the library's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ This library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with this library; if not, write to the Free Software
+ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+ Also add information on how to contact you by electronic and paper mail.
+
+ You should also get your employer (if you work as a programmer) or your
+ school, if any, to sign a "copyright disclaimer" for the library, if
+ necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the
+ library `Frob' (a library for tweaking knobs) written by James Random Hacker.
+
+ <signature of Ty Coon>, 1 April 1990
+ Ty Coon, President of Vice
+
+ That's all there is to it!
data/README.markdown CHANGED
@@ -1,230 +1,80 @@
1
- ## Welcome to Classifier Reborn
1
+ # Classifier Reborn
2
2
 
3
- Classifier is a general module to allow Bayesian and other types of classifications.
3
+ [![Gem Version](https://badge.fury.io/rb/classifier-reborn.svg)](https://rubygems.org/gems/classifier-reborn)
4
+ [![Build Status](https://img.shields.io/travis/jekyll/classifier-reborn/master.svg)](https://travis-ci.org/jekyll/classifier-reborn)
5
+ ---
4
6
 
5
- Classifier Reborn is a fork of cardmagic/classifier under more active development.
7
+ ## [Read the Docs](https://jekyll.github.io/classifier-reborn/)
6
8
 
7
- ## Download
9
+ ## Getting Started
8
10
 
9
- Add this line to your application's Gemfile:
11
+ Classifier Reborn is a general classifier module to allow Bayesian and other types of classifications.
12
+ It is a fork of [cardmagic/classifier](https://github.com/cardmagic/classifier) under more active development.
13
+ Currently, it has [Bayesian Classifier](https://en.wikipedia.org/wiki/Naive_Bayes_classifier) and [Latent Semantic Indexer (LSI)](https://en.wikipedia.org/wiki/Latent_semantic_analysis) implemented.
10
14
 
11
- gem 'classifier-reborn'
15
+ Here is a quick illustration of the Bayesian classifier.
12
16
 
13
- And then execute:
14
-
15
- $ bundle
16
-
17
- Or install it yourself as:
18
-
19
- $ gem install classifier-reborn
20
-
21
- ## Dependencies
22
-
23
- The only runtime dependency you'll need to install is Roman Shterenzon's fast-stemmer gem:
24
-
25
- gem install fast-stemmer
26
-
27
- This should install automatically with RubyGems.
28
-
29
- If you would like to speed up LSI classification by at least 10x, please install the following libraries:
30
-
31
- * [GNU GSL](http://www.gnu.org/software/gsl)
32
- * [rb-gsl](https://rubygems.org/gems/rb-gsl)
33
-
34
- Notice that LSI will work without these libraries, but as soon as they are installed, Classifier will make use of them. No configuration changes are needed, we like to keep things ridiculously easy for you.
35
-
36
- ## Bayes
37
-
38
- A Bayesian classifier by Lucas Carlson. Bayesian Classifiers are accurate, fast, and have modest memory requirements.
39
-
40
- ### Usage
41
-
42
- ```ruby
43
- require 'classifier-reborn'
44
- classifier = ClassifierReborn::Bayes.new 'Interesting', 'Uninteresting'
45
- classifier.train_interesting "here are some good words. I hope you love them"
46
- classifier.train_uninteresting "here are some bad words, I hate you"
47
- classifier.classify "I hate bad words and you" # returns 'Uninteresting'
48
-
49
- classifier_snapshot = Marshal.dump classifier
50
- # This is a string of bytes, you can persist it anywhere you like
51
-
52
- File.open("classifier.dat", "w") {|f| f.write(classifier_snapshot) }
53
- # Or Redis.current.save "classifier", classifier_snapshot
54
-
55
- # This is now saved to a file, and you can safely restart the application
56
- data = File.read("classifier.dat")
57
- # Or data = Redis.current.get "classifier"
58
- trained_classifier = Marshal.load data
59
- trained_classifier.classify "I love" # returns 'Interesting'
17
+ ```bash
18
+ $ gem install classifier-reborn
19
+ $ irb
20
+ irb(main):001:0> require 'classifier-reborn'
21
+ irb(main):002:0> classifier = ClassifierReborn::Bayes.new 'Ham', 'Spam'
22
+ irb(main):003:0> classifier.train "Ham", "Sunday is a holiday. Say no to work on Sunday!"
23
+ irb(main):004:0> classifier.train "Spam", "You are the lucky winner! Claim your holiday prize."
24
+ irb(main):005:0> classifier.classify "What's the plan for Sunday?"
25
+ #=> "Ham"
60
26
  ```
61
27
 
62
- Beyond the basic example, the constructor and trainer can be used in a more
63
- flexible way to accomidate non-trival applications. Consider the following
64
- program:
65
-
66
- ```ruby
67
- #!/usr/bin/env ruby
68
- # classifier_reborn_demo.rb
69
-
70
- require 'classifier-reborn'
71
-
72
- training_set = DATA.read.split("\n")
73
- categories = training_set.shift.split(',').map{|c| c.strip}
74
-
75
- classifier = ClassifierReborn::Bayes.new categories
76
-
77
- training_set.each do |a_line|
78
- next if a_line.empty? || '#' == a_line.strip[0]
79
- parts = a_line.strip.split(':')
80
- classifier.train(parts.first, parts.last)
81
- end
82
-
83
- puts classifier.classify "I hate bad words and you" #=> 'Uninteresting'
84
- puts classifier.classify "I hate javascript" #=> 'Uninteresting'
85
- puts classifier.classify "javascript is bad" #=> 'Uninteresting'
86
-
87
- puts classifier.classify "all you need is ruby" #=> 'Interesting'
88
- puts classifier.classify "i love ruby" #=> 'Interesting'
89
-
90
- puts classifier.classify "which is better dogs or cats" #=> 'dog'
91
- puts classifier.classify "what do I need to kill rats and mice" #=> 'cat'
92
-
93
- __END__
94
- Interesting, Uninteresting
95
- interesting: here are some good words. I hope you love them
96
- interesting: all you need is love
97
- interesting: the love boat, soon we will be taking another ride
98
- interesting: ruby don't take your love to town
99
-
100
- uninteresting: here are some bad words, I hate you
101
- uninteresting: bad bad leroy brown badest man in the darn town
102
- uninteresting: the good the bad and the ugly
103
- uninteresting: java, javascript, css front-end html
104
- #
105
- # train categories that were not pre-described
106
- #
107
- dog: dog days of summer
108
- dog: a man's best friend is his dog
109
- dog: a good hunting dog is a fine thing
110
- dog: man my dogs are tired
111
- dog: dogs are better than cats in soooo many ways
112
-
113
- cat: the fuzz ball spilt the milk
114
- cat: got rats or mice get a cat to kill them
115
- cat: cats never come when you call them
116
- cat: That dang cat keeps scratching the furniture
28
+ Now, let's build an LSI, classify some text, and find a cluster of related documents.
29
+
30
+ ```bash
31
+ irb(main):006:0> lsi = ClassifierReborn::LSI.new
32
+ irb(main):007:0> lsi.add_item "This text deals with dogs. Dogs.", :dog
33
+ irb(main):008:0> lsi.add_item "This text involves dogs too. Dogs!", :dog
34
+ irb(main):009:0> lsi.add_item "This text revolves around cats. Cats.", :cat
35
+ irb(main):010:0> lsi.add_item "This text also involves cats. Cats!", :cat
36
+ irb(main):011:0> lsi.add_item "This text involves birds. Birds.", :bird
37
+ irb(main):012:0> lsi.classify "This text is about dogs!"
38
+ #=> :dog
39
+ irb(main):013:0> lsi.find_related("This text is around cats!", 2)
40
+ #=> ["This text revolves around cats. Cats.", "This text also involves cats. Cats!"]
117
41
  ```
118
42
 
119
- #### Knowing the Score
120
-
121
- When you ask a bayesian classifier to classify text against a set of trained categories it does so by generating a score (as a Float) for each possible category. The higher the score the closer the fit your text has with that category. The category with the highest score is returned as the best matching category.
122
-
123
- In *ClassifierReborn* the methods *classifications* and *classify_with_score* give you access to the calculated scores. The method *classify* only returns the best matching category.
124
-
125
- Knowing the score allows you to do some interesting things. For example if your application is to generate tags for a blog post you could use the *classifications* method to get a hash of the categories and their scores. You would sort on score and take only the top 3 or 4 categories as your tags for the blog post.
43
+ There is much more that can be done using Bayes and LSI beyond these quick examples.
44
+ For more information read the following documentation topics.
126
45
 
127
- You could within your application establish the smallest acceptable score and only use those categories whose score is greater than or equal to your smallest acceptable score as your tags for the blog post.
46
+ * [Installation and Dependencies](https://jekyll.github.io/classifier-reborn/)
47
+ * [Bayesian Classifier](https://jekyll.github.io/classifier-reborn/bayes)
48
+ * [Latent Semantic Indexer (LSI)](https://jekyll.github.io/classifier-reborn/lsi)
49
+ * [Classifier Validation](https://jekyll.github.io/classifier-reborn/validation)
50
+ * [Development and Contributions](https://jekyll.github.io/classifier-reborn/development) (*Optional Docker instructions included*)
128
51
 
129
- But what if you only use the *classify* method? It does not show you the score of the best category. How do you know that the best category is really any good?
130
-
131
- You can use the threshold.
132
-
133
- #### Using the Threshold
134
-
135
- Some applications can have only one category. The application wants to know if the text being classified is of that category or not. For example consider a list of normal free text responses to some question or maybe a URL string coming to your web application. You know what a normal response looks like; but, you have no idea how people might mis-use the response. So what you want to do is create a bayesian classifier that just has one category, for example 'Good' and you want to know wither your text is classified as Good or Not Good.
136
-
137
- Or suppose you just want the ability to have multiple categories and a 'None of the Above' as a possibility.
138
-
139
- ##### Threshold
140
-
141
- When you initialize the *ClassifierReborn::Bayes* classifier there are several options which can be set that control threshold processing.
52
+ ### Notes on JRuby support
142
53
 
143
54
  ```ruby
144
- b = ClassifierRebor::Bayes.new(
145
- 'good', # one or more categories
146
- enable_threshold: true, # default: false
147
- threshold: -10.0 # default: 0.0
148
- )
149
- b.train_good 'good stuff from Dobie Gillis'
150
- # ...
151
- text = 'bad junk from Maynard G. Krebs'
152
- result = b.classify text
153
- if result.nil?
154
- STDERR.puts "ALERT: This is not good: #{text}"
155
- let_loose_the_dogs_of_war! # method definition left to the reader
156
- end
157
-
55
+ gem 'classifier-reborn-jruby', platforms: :java
158
56
  ```
159
57
 
160
- In the *classify* method when the best category for the text has a score that is either less than the established threshold or is Float::INIFINITY, a nil category is returned. When you see a nil value returned from the *classify* method it means that none of the trained categories (regardless or how many categories were trained) has a score that is above or equal to the established threshold.
161
-
162
- #### Other Threshold-related Convience Methods
58
+ While experimental, this gem should work on JRuby without any kind of additional changes. Unfortunately, you will **not** be able to use C bindings to GNU/GSL or similar performance-enhancing native code. Additionally, we do not use `fast_stemmer`, but rather [an implementation](https://tartarus.org/martin/PorterStemmer/java.txt) of the [Porter Stemming](https://tartarus.org/martin/PorterStemmer/) algorithm. Stemming will differ between MRI and JRuby, however you may choose to [disable stemming](https://tartarus.org/martin/PorterStemmer/) and do your own manual preprocessing (or use some other [popular Java library](https://opennlp.apache.org/)).
163
59
 
164
- ```ruby
165
- b.threshold # get the current threshold
166
- b.threshold = -10.0 # set the threshold
167
- b.threshold_enabled? # Boolean: is the threshold enabled?
168
- b.threshold_disabled? # Boolean: is the threshold disabled?
169
- b.enable_threshold # enables threshold processing
170
- b.disable_threshold # disables threshold processing
171
- ```
172
-
173
- Using these convience methods your applications can dynamically adjust threshold processing as required.
174
-
175
- ### Bayesian Classification
176
-
177
- * https://en.wikipedia.org/wiki/Naive_Bayes_classifier
178
- * http://www.process.com/precisemail/bayesian_filtering.htm
179
- * http://en.wikipedia.org/wiki/Bayesian_filtering
180
- * http://www.paulgraham.com/spam.html
181
-
182
- ## LSI
183
-
184
- A Latent Semantic Indexer by David Fayram. Latent Semantic Indexing engines
185
- are not as fast or as small as Bayesian classifiers, but are more flexible, providing
186
- fast search and clustering detection as well as semantic analysis of the text that
187
- theoretically simulates human learning.
188
-
189
- ### Usage
190
-
191
- ```ruby
192
- require 'classifier-reborn'
193
- lsi = ClassifierReborn::LSI.new
194
- strings = [ ["This text deals with dogs. Dogs.", :dog],
195
- ["This text involves dogs too. Dogs! ", :dog],
196
- ["This text revolves around cats. Cats.", :cat],
197
- ["This text also involves cats. Cats!", :cat],
198
- ["This text involves birds. Birds.",:bird ]]
199
- strings.each {|x| lsi.add_item x.first, x.last}
200
-
201
- lsi.search("dog", 3)
202
- # returns => ["This text deals with dogs. Dogs.", "This text involves dogs too. Dogs! ",
203
- # "This text also involves cats. Cats!"]
204
-
205
- lsi.find_related(strings[2], 2)
206
- # returns => ["This text revolves around cats. Cats.", "This text also involves cats. Cats!"]
207
-
208
- lsi.classify "This text is also about dogs!"
209
- # returns => :dog
210
- ```
60
+ If you encounter a problem, please submit your issue with `[JRuby]` in the title.
211
61
 
212
- Please see the ClassifierReborn::LSI documentation for more information. It is possible to index, search and classify
213
- with more than just simple strings.
62
+ ## Code of Conduct
214
63
 
215
- ### Latent Semantic Indexing
64
+ In order to have a more open and welcoming community, `Classifier Reborn` adheres to the `Jekyll`
65
+ [code of conduct](https://github.com/jekyll/jekyll/blob/master/CODE_OF_CONDUCT.markdown) adapted from the `Ruby on Rails` code of conduct.
216
66
 
217
- * http://www.c2.com/cgi/wiki?LatentSemanticIndexing
218
- * http://www.chadfowler.com/index.cgi/Computing/LatentSemanticIndexing.rdoc
219
- * http://en.wikipedia.org/wiki/Latent_semantic_analysis
67
+ Please adhere to this code of conduct in any interactions you have in the `Classifier` community.
68
+ If you encounter someone violating these terms, please let [Chase Gilliam](https://github.com/Ch4s3) know and we will address it as soon as possible.
220
69
 
221
- ## Authors
70
+ ## Authors and Contributors
222
71
 
223
- * Lucas Carlson (lucas@rufy.com)
224
- * David Fayram II (dfayram@gmail.com)
225
- * Cameron McBride (cameron.mcbride@gmail.com)
226
- * Ivan Acosta-Rubio (ivan@softwarecriollo.com)
227
- * Parker Moore (email@byparker.com)
228
- * Chase Gilliam (chase.gilliam@gmail.com)
72
+ * [Lucas Carlson](mailto:lucas@rufy.com)
73
+ * [David Fayram II](mailto:dfayram@gmail.com)
74
+ * [Cameron McBride](mailto:cameron.mcbride@gmail.com)
75
+ * [Ivan Acosta-Rubio](mailto:ivan@softwarecriollo.com)
76
+ * [Parker Moore](mailto:email@byparker.com)
77
+ * [Chase Gilliam](mailto:chase.gilliam@gmail.com)
78
+ * and [many more](https://github.com/jekyll/classifier-reborn/graphs/contributors)...
229
79
 
230
- This library is released under the terms of the GNU LGPL. See LICENSE for more details.
80
+ The Classifier Reborn library is released under the terms of the [GNU LGPL-2.1](https://github.com/jekyll/classifier-reborn/blob/master/LICENSE).
data/data/stopwords/ar ADDED
@@ -0,0 +1,104 @@
1
+
2
+ فى
3
+ في
4
+ كل
5
+ لم
6
+ لن
7
+ له
8
+ من
9
+ هو
10
+ هي
11
+ قوة
12
+ كما
13
+ لها
14
+ منذ
15
+ وقد
16
+ ولا
17
+ لقاء
18
+ مقابل
19
+ هناك
20
+ وقال
21
+ وكان
22
+ وقالت
23
+ وكانت
24
+ فيه
25
+ لكن
26
+ وفي
27
+ ولم
28
+ ومن
29
+ وهو
30
+ وهي
31
+ يوم
32
+ فيها
33
+ منها
34
+ يكون
35
+ يمكن حيث
36
+ االا
37
+ اما
38
+ االتى
39
+ التي
40
+ اكثر
41
+ ايضا
42
+ الذى
43
+ الذي
44
+ الان
45
+ الذين
46
+ ابين
47
+ ذلك
48
+ دون
49
+ حول
50
+ حين
51
+ الى
52
+ انه
53
+ اول
54
+ انها
55
+ ف
56
+ و
57
+ و6
58
+ قد
59
+ لا
60
+ ما
61
+ مع
62
+ هذا
63
+ واحد
64
+ واضاف
65
+ واضافت
66
+ فان
67
+ قبل
68
+ قال
69
+ كان
70
+ لدى
71
+ نحو
72
+ هذه
73
+ وان
74
+ واكد
75
+ كانت
76
+ واوضح
77
+ ب
78
+ ا
79
+ أ
80
+ ،
81
+ عن
82
+ عند
83
+ عندما
84
+ على
85
+ عليه
86
+ عليها
87
+ تم
88
+ ضد
89
+ بعد
90
+ بعض
91
+ حتى
92
+ اذا
93
+ احد
94
+ بان
95
+ اجل
96
+ غير
97
+ بن
98
+ به
99
+ ثم
100
+ اف
101
+ ان
102
+ او
103
+ اي
104
+ بها
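Note: the stopword files added in this release list one entry per line. As a rough sketch of how such a list might be used, the following Ruby loads the list into a `Set` and drops matching tokens. This is an illustration under stated assumptions, not the gem's own `token_filter/stopword.rb` implementation (which is not shown in this section); the names `load_stopwords` and `remove_stopwords` and the sample sentence are hypothetical.

```ruby
require 'set'

# Hypothetical helper: parse stopword text (one entry per line) into a Set,
# trimming whitespace and skipping blank lines.
def load_stopwords(text)
  text.each_line.map(&:strip).reject(&:empty?).to_set
end

# Hypothetical helper: keep only the tokens that are not in the stopword set.
def remove_stopwords(tokens, stopwords)
  tokens.reject { |token| stopwords.include?(token) }
end

# Three entries taken from the Arabic list above, preceded by a blank line
# like the file itself.
stopwords = load_stopwords("\nفي\nمن\nهو\n")
tokens = 'هو من القاهرة'.split
p remove_stopwords(tokens, stopwords)
```

A `Set` keeps each membership check at roughly constant cost, which matters once a list grows to several hundred entries, as the `vi` and `ru` files in this release do.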