fpgrowth 0.0.1 → 0.0.2
- data/README.md +11 -7
- data/fpgrowth-ruby.gemspec +1 -1
- data/lib/fpgrowth/fp_tree/builder/first_pass.rb +0 -1
- data/lib/fpgrowth/miner/conditional_tree_builder.rb +22 -4
- data/lib/fpgrowth/miner.rb +1 -1
- data/lib/fpgrowth/version.rb +1 -1
- data/test/enel/3_canada_i_processi_e_i_prodotti_1.csv +14 -0
- data/test/enel/Produzione_Impianti_Termoelettrici_en.csv +46 -0
- data/test/tc_conditional_tree_builder.rb +135 -0
- data/test/tc_open_data_enel.rb +73 -0
- data/test/tc_open_data_sondage_montreal.rb +14 -14
- data/test/tc_open_data_velo_montreal.rb +8 -20
- metadata +11 -3
data/README.md
CHANGED
@@ -7,7 +7,7 @@ Ruby implementation of FP-Growth
|
|
7
7
|
FP-Growth is an algorithm used for mining frequent patterns in an item set. Such patterns are then used to build association rules.
|
8
8
|
Examples from the literature are { Potatoes, Onions } => { Burger } or { Beer, Chips } => { Dippers }.
|
9
9
|
|
10
|
-
FP-Growth is known as a solution for Mining without Candidate generation (http://dl.acm.org/citation.cfm?id=335372).
|
10
|
+
FP-Growth is known as a solution for Mining without Candidate generation (<http://dl.acm.org/citation.cfm?id=335372>).
|
11
11
|
The main alternative to FP-Growth is Apriori, which is pretty much a naive solution. Apriori consists of generating candidates, then scanning the database looking for them.
|
12
12
|
|
13
13
|
The FP-Growth solution is about reducing the database to one simple tree structure: the FP-Tree. The FP-Tree makes it easy to extract frequent patterns.
|
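As a rough illustration of the FP-Tree idea described in the README, here is a minimal, self-contained Ruby sketch. It does not use the gem: `SketchNode` and `build_sketch_tree` are hypothetical names introduced only for this example, and the gem's own builder may well differ in its details.

```ruby
# Hypothetical illustration only; not the fpgrowth gem's implementation.
SketchNode = Struct.new(:item, :count, :children)

def build_sketch_tree(transactions)
  # First pass: count how many transactions contain each item.
  supports = Hash.new(0)
  transactions.each { |t| t.uniq.each { |item| supports[item] += 1 } }

  # Second pass: insert each transaction with its most frequent items first,
  # so that transactions sharing frequent items also share a path prefix.
  root = SketchNode.new(nil, 0, {})
  transactions.each do |t|
    ordered = t.uniq.sort_by { |item| [-supports[item], item.to_s] }
    node = root
    ordered.each do |item|
      node.children[item] ||= SketchNode.new(item, 0, {})
      node = node.children[item]
      node.count += 1
    end
  end
  root
end

tree = build_sketch_tree([['a', 'b'], ['b'], ['b', 'c'], ['a', 'b']])
```

With these four transactions, every path starts at the single child `'b'`, which is why the tree is much smaller than the raw transaction list when items are shared.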
@@ -26,13 +26,13 @@ Imagine a web app allowing you to connect to other users. Such a website want to
|
|
26
26
|
|
27
27
|
Imagine a commercial website showing some products to its users. It will scan users' actions and make suggestions like: "People who like this product often like this other one!"
|
28
28
|
|
29
|
-
Those applications are now easy thanks to fpgrowth
|
29
|
+
Those applications are now easy thanks to fpgrowth for Ruby !
|
30
30
|
|
31
|
-
|
31
|
+
# Installation
|
32
32
|
|
33
33
|
Add this line to your application's Gemfile:
|
34
34
|
|
35
|
-
gem 'fpgrowth
|
35
|
+
gem 'fpgrowth'
|
36
36
|
|
37
37
|
And then execute:
|
38
38
|
|
@@ -59,23 +59,27 @@ The larger is the number of transactions, the smaller should be the threshold. I
|
|
59
59
|
|
60
60
|
```ruby
|
61
61
|
transactions = [['a', 'b'], ['b'], ['b', 'c'], ['a', 'b']]
|
62
|
-
fp_tree = FpGrowth::FpTree.build(transactions, 30)
|
62
|
+
fp_tree = FpGrowth::FpTree.build(transactions, 30)
|
63
|
+
# 30 stands for 30% of transactions. Here, 'c' would be pruned.
|
63
64
|
FpGrowth::Miner.fp_growth(fp_tree)
|
64
65
|
|
65
66
|
```
|
66
67
|
|
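To make the "30% of transactions" comment concrete, the arithmetic can be sketched without the gem. This assumes the threshold is a percentage of the transaction count, as the README comment states; the comparison used by the gem itself (`>=` versus `>`) is an assumption here, and `minimum_count`, `kept_items` and `pruned_items` are names made up for the example.

```ruby
# Hypothetical illustration; does not call the fpgrowth gem.
transactions = [['a', 'b'], ['b'], ['b', 'c'], ['a', 'b']]
threshold_percent = 30

# Count the number of transactions containing each item.
supports = Hash.new(0)
transactions.each { |t| t.uniq.each { |item| supports[item] += 1 } }

# 30% of 4 transactions is 1.2 transactions.
minimum_count = transactions.size * threshold_percent / 100.0

# Split items into those that reach the minimum and those that do not.
kept, pruned = supports.partition { |_item, count| count >= minimum_count }
kept_items = kept.map(&:first).sort
pruned_items = pruned.map(&:first).sort
```

Here `'a'` appears in 2 transactions and `'b'` in 4, so both stay, while `'c'` appears in only 1 transaction, below the 1.2 minimum, and is the item that would be pruned.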
68
|
+
### Examples
|
69
|
+
|
70
|
+
You can find a few concrete examples using Open Data in the test repository.
|
67
71
|
|
68
72
|
|
69
73
|
## Development : Next steps
|
70
74
|
|
71
75
|
As we said, the worst case is a data set with long transactions where each item is significantly frequent. A solution would be to raise the threshold level, which would result in data loss; critical data might be lost.
|
72
76
|
|
73
|
-
A better solution is described in the following articles: [http://dl.acm.org/citation.cfm?id=1133907 , http://link.springer.com/chapter/10.1007/978-3-540-24775-3_19]
|
77
|
+
A better solution is described in the following articles: [<http://dl.acm.org/citation.cfm?id=1133907> , <http://link.springer.com/chapter/10.1007/978-3-540-24775-3_19>]
|
74
78
|
The main concept is pruning the tree once it is built, in order to remove the less significant patterns. This is necessary to allow developers to prune their tree, losing the least frequent patterns, in order to quickly obtain the most frequent ones.
|
75
79
|
|
76
80
|
This is the next step in our roadmap.
|
77
81
|
|
78
|
-
This is also a necessary step for allowing a Top-Down FP-Growth implementation as described in : http://link.springer.com/chapter/10.1007/3-540-47887-6_34
|
82
|
+
This is also a necessary step for allowing a Top-Down FP-Growth implementation as described in : <http://link.springer.com/chapter/10.1007/3-540-47887-6_34>
|
79
83
|
|
80
84
|
This last implementation is more scalable and more efficient than the current one.
|
81
85
|
|
data/fpgrowth-ruby.gemspec
CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
|
|
10
10
|
spec.email = ["dam.cavailles@laposte.net"]
|
11
11
|
spec.description = %q{FP-Growth implementation}
|
12
12
|
spec.summary = %q{FP-Growth is mean to detect}
|
13
|
-
spec.homepage = ""
|
13
|
+
spec.homepage = "https://github.com/thedamfr/fpgrowth"
|
14
14
|
spec.license = "MIT"
|
15
15
|
|
16
16
|
spec.files = `git ls-files`.split($/)
|
data/lib/fpgrowth/miner/conditional_tree_builder.rb
CHANGED
@@ -74,12 +74,12 @@ module FpGrowth
|
|
74
74
|
end
|
75
75
|
|
76
76
|
|
77
|
-
def sort_by_support(pattern_base)
|
78
|
-
lookup =
|
79
|
-
|
77
|
+
def sort_by_support(pattern_base, fp_tree=@fp_tree)
|
78
|
+
lookup = fp_tree.item_order_lookup
|
80
79
|
pattern_base.content.sort_by! do |item|
|
81
|
-
|
80
|
+
fp_tree.supports.fetch(item, fp_tree.supports.size + 1)
|
82
81
|
end
|
82
|
+
pattern_base.content.reverse!
|
83
83
|
return pattern_base
|
84
84
|
end
|
85
85
|
|
@@ -117,6 +117,24 @@ module FpGrowth
|
|
117
117
|
traverse(cursor_tree, Pattern.new(pattern_base.content[1..-1], pattern_base.support))
|
118
118
|
end
|
119
119
|
|
120
|
+
# function used only for the tests
|
121
|
+
def test_execute_threshold(value = 1)
|
122
|
+
if value == @threshold
|
123
|
+
then return true
|
124
|
+
end
|
125
|
+
return false
|
126
|
+
end
|
127
|
+
|
128
|
+
def test_execute_pattern_base(value = [])
|
129
|
+
if value == @pattern_base
|
130
|
+
then return true
|
131
|
+
end
|
132
|
+
return false
|
133
|
+
end
|
134
|
+
|
135
|
+
|
136
|
+
|
137
|
+
|
120
138
|
|
121
139
|
end
|
122
140
|
end
|
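The `sort_by_support` change above orders a pattern's items by decreasing support. A minimal sketch of that ordering on a plain Hash, independent of the gem's `FpTree` and `Pattern` classes, could look like the following. `sort_items_by_support` is a hypothetical helper, and unlike the method in the diff it treats unknown items as having support 0, which is an assumption made for this example.

```ruby
# Hypothetical illustration only; not the gem's sort_by_support.
def sort_items_by_support(items, supports)
  # Sort by decreasing support; ties keep their original relative order.
  # Items absent from `supports` are treated as support 0 (assumption).
  items.each_with_index
       .sort_by { |item, index| [-supports.fetch(item, 0), index] }
       .map(&:first)
end

supports = { 'a' => 1, 'b' => 5, 'c' => 4 }
two_items = sort_items_by_support(['a', 'b'], supports)
three_items = sort_items_by_support(['a', 'b', 'c'], supports)
```

These are the same inputs as in `test_sort_by_support`, where the expected orders are `['b', 'a']` and `['b', 'c', 'a']`.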
data/lib/fpgrowth/miner.rb
CHANGED
data/lib/fpgrowth/version.rb
CHANGED
data/test/enel/3_canada_i_processi_e_i_prodotti_1.csv
ADDED
@@ -0,0 +1,14 @@
|
|
1
|
+
ENEL - Canada - I processi e i prodotti Unità di misura 2007 2008 2009 2010 2011
|
2
|
+
Produzione di energia elettrica (netta)
|
3
|
+
Da fonti rinnovabili milioni di kWh 323 180 251 281 285
|
4
|
+
biomassa e parte biodegradabile dei rifiuti milioni di kWh 175 172 149 182 175
|
5
|
+
combinata con produzione di calore milioni di kWh 175 172 149 182 175
|
6
|
+
idrica da apporti naturali milioni di kWh 148 0 0 0 0
|
7
|
+
eolica milioni di kWh 0 7.82 102 99.0 110
|
8
|
+
Totale milioni di kWh 323 180 251 281 285
|
9
|
+
semplice milioni di kWh 148 7.82 102 99.0 110
|
10
|
+
combinata con produzione di calore milioni di kWh 175 172 149 182 175
|
11
|
+
"Produzione utile di calore (combinata con
|
12
|
+
produzione di energia elettrica)"
|
13
|
+
In impianti termoelettrici (biomassa e parte biodegradabile dei rifiuti) milioni di kcal "29,626" "30,149" "23,042" "32,524" "29,117"
|
14
|
+
In impianti termoelettrici (biomassa e parte biodegradabile dei rifiuti) milioni di kWh 34.5 35.1 26.8 37.8 33.9
|
data/test/enel/Produzione_Impianti_Termoelettrici_en.csv
ADDED
@@ -0,0 +1,46 @@
|
|
1
|
+
Regione Provincia Comune Centrale Tipo Sezioni Potenza Lorda (MW)
|
2
|
+
Piemonte Vercelli Trino "Termoelettrica ""Galileo Ferraris"" Trino (VC)" Ciclo Combinato 2 744
|
3
|
+
Alessandria Alessandria Termoelettrica Alessandria Turbogas 2 176
|
4
|
+
Veneto Rovigo Porto Tolle Termoelettrica Porto Tolle Olio 4 "2,640"
|
5
|
+
Venezia Venezia "Termoelettrica ""Andrea Palladio"" Fusina (VE)" Carbone / Metano / CDR / Olio / Idrogeno 6 "1,148"
|
6
|
+
Venezia Venezia Termoelettrica Porto Marghera Carbone 2 140
|
7
|
+
Emilia Romagna Piacenza Castel San Giovanni "Termoelettrica ""Edoardo Amaldi"" La Casella (PC)" Ciclo Combinato 4 "1,558"
|
8
|
+
Ravenna Porto Corsini "Termoelettrica ""Teodora"" Porto Corsini (RA)" Ciclo Combinato 2 761
|
9
|
+
Modena Carpi Termoelettrica Carpi Modenese Turbogas 2 176
|
10
|
+
Liguria La Spezia La Spezia "Termoelettrica ""Eugenio Montale"" La Spezia (SP)" Ciclo Combinato / Carbone 3 "1,282"
|
11
|
+
Genova Genova Termoelettrica Genova Carbone 3 295
|
12
|
+
Toscana Livorno Piombino Termoelettrica Piombino Olio 4 "1,280"
|
13
|
+
Livorno Livorno Termoelettrica Livorno Olio 2 310
|
14
|
+
Livorno Porto Ferraio Termoelettrica Porto Ferraio Diesel 1 17
|
15
|
+
Livorno Isola Capraia Termoelettrica Capraia Diesel 4 1.7
|
16
|
+
Arezzo Cavriglia "Termoelettrica ""Santa Barbara"" Cavriglia (AR)" Ciclo Combinato 1 394
|
17
|
+
Marche Ancona Camerata Picena Termoelettrica Camerata Picena Turbogas 4 104
|
18
|
+
Umbria Perugia Piegaro "Termoelettrica ""Franco Rasetti"" Pietrafitta - Piegaro (PG)" Turbogas / Ciclo Combinato 3 535
|
19
|
+
Perugia Gualdo Cattaneo "Termoelettrica ""Pietro Vannucci"" Bastardo - Giano dell' Umbria (PG)" Carbone 2 150
|
20
|
+
Lazio Viterbo Montalto di Castro "Termoelettrica ""Alessandro Volta"" Montalto di Castro (VT)" Olio / Gas Naturale 4 "3,580"
|
21
|
+
Roma Civitavecchia Termoelettrica Torrevaldaliga Nord Carbone 3 "1,980"
|
22
|
+
Latina Isola Ventotene Termoelettrica Ventotene Diesel 4 1.92
|
23
|
+
Molise Campobasso Larino Termoelettrica Larino Turbogas 2 250
|
24
|
+
Campobasso Campomarino Termoelettrica Campomarino Turbogas 1 88
|
25
|
+
Campania Napoli Giugliano in Campania Termoelettrica Giugliano Turbogas 4 352
|
26
|
+
Caserta Maddaloni Termoelettrica Maddaloni Turbogas 4 352
|
27
|
+
Puglia Brindisi Brindisi "Termoelettrica ""Federico II"" Brindisi" Carbone 4 "2,640"
|
28
|
+
Bari Bari Termoelettrica Bari Olio - Gas 3 206
|
29
|
+
Calabria Cosenza Rossano Calabro Termoelettrica Rossano Olio - Gas 4 "1,738"
|
30
|
+
Cosenza Laino Borgo Termoelettrica Mercure Biomasse 1 41
|
31
|
+
Sicilia Palermo Termini Imerese "Termoelettrica ""Ettore Majorana""" Gas / Turbogas / Ciclo Combinato 4 "1,330"
|
32
|
+
Siracusa Priolo Gargallo "Termoelettrica ""Archimede""" Ciclo Combinato 2 780
|
33
|
+
Siracusa Augusta Termoelettrica Augusta Olio 3 210
|
34
|
+
Agrigento Porto Empedocle Termoelettrica Porto Empedocle Olio 2 140
|
35
|
+
"Sicilia
|
36
|
+
( Isole EOLIE)" Messina Lipari Termoelettrica Vulcano Diesel 6 8.96
|
37
|
+
Messina Lipari Termoelettrica Panarea Diesel 6 4
|
38
|
+
Messina Lipari Termoelettrica Stromboli Diesel 6 3.1
|
39
|
+
Messina Lipari Termoelettrica Filicudi Diesel 6 1.5
|
40
|
+
Messina Lipari Termoelettrica Alicudi Diesel 3 0.5
|
41
|
+
Messina Santa Maria Salina Termoelettrica Santa Maria Salina Diesel 5 4.4
|
42
|
+
Messina Malfa Termoelettrica Malfa Diesel 2 2.2
|
43
|
+
Sardegna Carbonia-Iglesias Portoscuso "Termoelettrica ""Grazia Deledda"" Sulcis " Carbone / Biomasse 2 590
|
44
|
+
Carbonia-Iglesias Portoscuso Termoelettrica Portoscuso Olio 2 320
|
45
|
+
Cagliari Assemini Termoelettrica Assemini Turbogas 2 177
|
46
|
+
Totale (MW) "26,510"
|
data/test/tc_conditional_tree_builder.rb
ADDED
@@ -0,0 +1,135 @@
|
|
1
|
+
require 'test/unit'
|
2
|
+
require "fpgrowth/miner"
|
3
|
+
require "fpgrowth/fp_tree"
|
4
|
+
|
5
|
+
class TestConditionalTreeBuilder < Test::Unit::TestCase
|
6
|
+
|
7
|
+
# Called before every test method runs. Can be used
|
8
|
+
# to set up fixture information.
|
9
|
+
def setup
|
10
|
+
|
11
|
+
@tableau_pattern = []
|
12
|
+
@tableau_pattern << FpGrowth::Miner::Pattern.new(['a', 'b'], 2)
|
13
|
+
@tableau_pattern << FpGrowth::Miner::Pattern.new(['a', 'b', 'c'], 2)
|
14
|
+
@tableau_pattern << FpGrowth::Miner::Pattern.new(['b', 'c'], 1)
|
15
|
+
|
16
|
+
@tableau_pattern_one_element = []
|
17
|
+
@tableau_pattern_one_element << FpGrowth::Miner::Pattern.new(['a'], 12)
|
18
|
+
|
19
|
+
@supports_exemple = {'a' => 1, 'b' => 5, 'c' => 4}
|
20
|
+
@pattern_exemple = FpGrowth::Miner::Pattern.new(['a', 'b'], 2)
|
21
|
+
end
|
22
|
+
|
23
|
+
# Called after every test method runs. Can be used to tear
|
24
|
+
# down fixture information.
|
25
|
+
|
26
|
+
def teardown
|
27
|
+
# Do nothing
|
28
|
+
end
|
29
|
+
|
30
|
+
# test the initializer
|
31
|
+
def test_initialize
|
32
|
+
conditional_tree_builder = nil
|
33
|
+
|
34
|
+
assert_nothing_raised { conditional_tree_builder = FpGrowth::Miner::ConditionalTreeBuilder.new() }
|
35
|
+
assert_equal(true, conditional_tree_builder.test_execute_threshold(1))
|
36
|
+
assert_equal(true, conditional_tree_builder.test_execute_pattern_base())
|
37
|
+
|
38
|
+
# initialization with one argument
|
39
|
+
|
40
|
+
assert_nothing_raised { conditional_tree_builder = FpGrowth::Miner::ConditionalTreeBuilder.new(@tableau_pattern) }
|
41
|
+
assert_equal(true, conditional_tree_builder.test_execute_pattern_base(@tableau_pattern))
|
42
|
+
assert_equal(true, conditional_tree_builder.test_execute_threshold)
|
43
|
+
|
44
|
+
# initialization with two arguments
|
45
|
+
|
46
|
+
assert_nothing_raised { conditional_tree_builder = FpGrowth::Miner::ConditionalTreeBuilder.new(@tableau_pattern, 3) }
|
47
|
+
assert_equal(true, conditional_tree_builder.test_execute_pattern_base(@tableau_pattern))
|
48
|
+
assert_equal(true, conditional_tree_builder.test_execute_threshold(3))
|
49
|
+
|
50
|
+
end
|
51
|
+
|
52
|
+
|
53
|
+
def test_execute
|
54
|
+
|
55
|
+
|
56
|
+
end
|
57
|
+
|
58
|
+
def test_second_pass
|
59
|
+
|
60
|
+
end
|
61
|
+
|
62
|
+
def test_first_pass
|
63
|
+
|
64
|
+
end
|
65
|
+
|
66
|
+
def test_scan
|
67
|
+
|
68
|
+
# with no argument
|
69
|
+
conditional_tree_builder = nil
|
70
|
+
supports = nil
|
71
|
+
assert_nothing_raised { conditional_tree_builder = FpGrowth::Miner::ConditionalTreeBuilder.new() }
|
72
|
+
assert_nothing_raised { supports = conditional_tree_builder.scan() }
|
73
|
+
assert_equal(Hash.new(0), supports)
|
74
|
+
|
75
|
+
#with one argument
|
76
|
+
assert_nothing_raised { conditional_tree_builder = FpGrowth::Miner::ConditionalTreeBuilder.new(@tableau_pattern) }
|
77
|
+
assert_nothing_raised { supports = conditional_tree_builder.scan() }
|
78
|
+
assert_equal(4, supports['a'])
|
79
|
+
assert_equal(5, supports['b'])
|
80
|
+
assert_equal(3, supports['c'])
|
81
|
+
|
82
|
+
end
|
83
|
+
|
84
|
+
def test_pruning
|
85
|
+
|
86
|
+
end
|
87
|
+
|
88
|
+
def test_sort
|
89
|
+
conditional_tree_builder = nil
|
90
|
+
supports = nil
|
91
|
+
|
92
|
+
assert_nothing_raised { conditional_tree_builder = FpGrowth::Miner::ConditionalTreeBuilder.new() }
|
93
|
+
|
94
|
+
|
95
|
+
#with one argument
|
96
|
+
assert_nothing_raised { supports = conditional_tree_builder.sort(@supports_exemple) }
|
97
|
+
assert_equal(1, supports['a'])
|
98
|
+
assert_equal(5, supports['b'])
|
99
|
+
assert_equal(4, supports['c'])
|
100
|
+
assert_equal('b', supports.keys[0])
|
101
|
+
assert_equal('c', supports.keys[1])
|
102
|
+
assert_equal('a', supports.keys[2])
|
103
|
+
end
|
104
|
+
|
105
|
+
def test_sort_by_support
|
106
|
+
conditional_tree_builder = nil
|
107
|
+
pattern_base = nil
|
108
|
+
fp_tree = nil
|
109
|
+
# One element
|
110
|
+
assert_nothing_raised { fp_tree = FpGrowth::FpTree::FpTree.new(@supports_exemple) }
|
111
|
+
assert_nothing_raised { conditional_tree_builder = FpGrowth::Miner::ConditionalTreeBuilder.new(@tableau_pattern_one_element) }
|
112
|
+
assert_nothing_raised { pattern_base = conditional_tree_builder.sort_by_support(@tableau_pattern_one_element.first, fp_tree) }
|
113
|
+
assert_equal(@tableau_pattern_one_element.first, pattern_base)
|
114
|
+
|
115
|
+
#Two element
|
116
|
+
assert_nothing_raised { pattern_base = conditional_tree_builder.sort_by_support(@pattern_exemple, fp_tree) }
|
117
|
+
|
118
|
+
assert_equal(['b', 'a'], pattern_base.content, "en plus, #{fp_tree.supports.to_s}")
|
119
|
+
assert_equal(2, pattern_base.support)
|
120
|
+
|
121
|
+
# Three elements
|
122
|
+
assert_nothing_raised { pattern_base = conditional_tree_builder.sort_by_support(FpGrowth::Miner::Pattern.new(['a','b','c'],2), fp_tree) }
|
123
|
+
|
124
|
+
assert_equal(['b', 'c', 'a'], pattern_base.content, "en plus, #{fp_tree.supports.to_s}")
|
125
|
+
assert_equal(2, pattern_base.support)
|
126
|
+
|
127
|
+
|
128
|
+
end
|
129
|
+
|
130
|
+
def test_traverse
|
131
|
+
|
132
|
+
end
|
133
|
+
|
134
|
+
|
135
|
+
end
|
data/test/tc_open_data_enel.rb
ADDED
@@ -0,0 +1,73 @@
|
|
1
|
+
require 'test/unit'
|
2
|
+
|
3
|
+
class TestEnelOpenData < Test::Unit::TestCase
|
4
|
+
|
5
|
+
|
6
|
+
def setup
|
7
|
+
@transactions_canada_processi_prodotti = []
|
8
|
+
CSV.foreach("test/enel/3_canada_i_processi_e_i_prodotti_1.csv", {:headers => true, :header_converters => :symbol, :header_converters => :symbol, :converters => :all, :col_sep => "\t"}) do |row|
|
9
|
+
@transactions_canada_processi_prodotti << row.to_a
|
10
|
+
end
|
11
|
+
@transactions_produzione_impianti_termoelecttrici = []
|
12
|
+
CSV.foreach("test/enel/Produzione_Impianti_Termoelettrici_en.csv", {:headers => true, :header_converters => :symbol, :header_converters => :symbol, :converters => :all, :col_sep => "\t"}) do |row|
|
13
|
+
transaction = [[:tipo, [row[:tipo]]], [:sezioni, row[:sezioni]]]
|
14
|
+
potenza_lorda = row[:potenza_lorda_mw].to_s.gsub(',', '').to_i
|
15
|
+
if potenza_lorda < 10
|
16
|
+
transaction << [:lorda_order, "<10"]
|
17
|
+
elsif potenza_lorda > 10 and potenza_lorda < 100
|
18
|
+
transaction << [:lorda_order, ">10"]
|
19
|
+
elsif potenza_lorda > 100 and potenza_lorda < 1000
|
20
|
+
transaction << [:lorda_order, ">100"]
|
21
|
+
elsif potenza_lorda > 1000
|
22
|
+
transaction << [:lorda_order, ">1000"]
|
23
|
+
end
|
24
|
+
transaction.delete_if { |item| item[1]==nil }
|
25
|
+
@transactions_produzione_impianti_termoelecttrici << transaction
|
26
|
+
end
|
27
|
+
end
|
28
|
+
|
29
|
+
def teardown
|
30
|
+
|
31
|
+
end
|
32
|
+
|
33
|
+
def test_canada_processi_prodotti
|
34
|
+
|
35
|
+
start = Time.now
|
36
|
+
fp_tree = FpGrowth::FpTree.build(@transactions_canada_processi_prodotti, 1)
|
37
|
+
loop = Time.now
|
38
|
+
puts "Tree built in #{loop - start}"
|
39
|
+
|
40
|
+
patterns = FpGrowth::Miner.fp_growth(fp_tree)
|
41
|
+
|
42
|
+
finish = Time.now
|
43
|
+
puts "Tree Mined in #{finish -start}"
|
44
|
+
|
45
|
+
patterns.sort! { |a, b| a.support <=> b.support }.reverse!
|
46
|
+
|
47
|
+
#patterns.each { |pattern| puts "<#{pattern.content}:#{pattern.support}>"}
|
48
|
+
|
49
|
+
|
50
|
+
assert_not_equal(0, patterns.size)
|
51
|
+
|
52
|
+
end
|
53
|
+
|
54
|
+
def test_canada_produzione_impianti_termoelecttrici
|
55
|
+
|
56
|
+
start = Time.now
|
57
|
+
fp_tree = FpGrowth::FpTree.build(@transactions_produzione_impianti_termoelecttrici, 1)
|
58
|
+
loop = Time.now
|
59
|
+
puts "Tree built in #{loop - start}"
|
60
|
+
patterns = FpGrowth::Miner.fp_growth(fp_tree)
|
61
|
+
finish = Time.now
|
62
|
+
puts "Tree Mined in #{finish - start}"
|
63
|
+
|
64
|
+
patterns.sort! { |a, b| a.support <=> b.support }
|
65
|
+
patterns.sort! { |a, b| a.content.length <=> b.content.length }
|
66
|
+
|
67
|
+
assert_not_equal(0, patterns.size)
|
68
|
+
|
69
|
+
#patterns.each { |pattern| puts "<#{pattern.content}:#{pattern.support}>" if pattern.support > 1}
|
70
|
+
|
71
|
+
end
|
72
|
+
|
73
|
+
end
|
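The ENEL test above turns the gross power column into coarse buckets before mining. Its strict comparisons leave values of exactly 10, 100 or 1000 without a bucket, and `to_i` truncates values such as 1.7. A possible variation with contiguous ranges is sketched below; `power_bucket` and the `>=` labels are hypothetical and are not part of the gem or its tests.

```ruby
# Hypothetical illustration; a variation on the bucketing in the test above.
def power_bucket(raw_value)
  # Values such as "2,640" use a comma as thousands separator.
  megawatts = raw_value.to_s.delete(',').to_f
  if megawatts < 10
    '<10'
  elsif megawatts < 100
    '>=10'
  elsif megawatts < 1000
    '>=100'
  else
    '>=1000'
  end
end

buckets = ['2,640', 744, 1.7, 100].map { |value| power_bucket(value) }
```

Each row then contributes an item such as `[:lorda_order, power_bucket(row[:potenza_lorda_mw])]` to its transaction, so every plant lands in exactly one bucket.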
data/test/tc_open_data_sondage_montreal.rb
CHANGED
@@ -24,8 +24,8 @@ class TestOpenDataMTLSondage < Test::Unit::TestCase
|
|
24
24
|
# Do nothing
|
25
25
|
end
|
26
26
|
|
27
|
-
|
28
|
-
def
|
27
|
+
|
28
|
+
def fp_growth
|
29
29
|
|
30
30
|
d = Time.now
|
31
31
|
puts "Start time : "+d.to_s
|
@@ -33,17 +33,17 @@ class TestOpenDataMTLSondage < Test::Unit::TestCase
|
|
33
33
|
fp_tree = FpGrowth::FpTree.build(@transactions, 1)
|
34
34
|
|
35
35
|
puts "Tree built at :"+Time.now.to_s
|
36
|
-
patterns = FpGrowth::Miner.fp_growth(fp_tree)
|
37
|
-
|
38
|
-
f=Time.now
|
39
|
-
puts "Mining took #{(f-d)}sec"
|
40
|
-
|
41
|
-
patterns.sort! { |a, b| a.support <=> b.support }.reverse!
|
42
|
-
|
43
|
-
for pattern in patterns
|
44
|
-
|
45
|
-
end
|
46
|
-
|
47
|
-
assert_not_equal(0, patterns.size)
|
36
|
+
#patterns = FpGrowth::Miner.fp_growth(fp_tree)
|
37
|
+
#
|
38
|
+
#f=Time.now
|
39
|
+
#puts "Mining took #{(f-d)}sec"
|
40
|
+
#
|
41
|
+
#patterns.sort! { |a, b| a.support <=> b.support }.reverse!
|
42
|
+
#
|
43
|
+
#for pattern in patterns
|
44
|
+
# puts "#{pattern.content} #{pattern.support}"
|
45
|
+
#end
|
46
|
+
#
|
47
|
+
#assert_not_equal(0, patterns.size)
|
48
48
|
end
|
49
49
|
end
|
data/test/tc_open_data_velo_montreal.rb
CHANGED
@@ -10,26 +10,7 @@ class TestOpenDataVelo < Test::Unit::TestCase
|
|
10
10
|
@transactions = []
|
11
11
|
CSV.foreach("test/montreal-velos-comptage/2012.csv", {:headers => true, :header_converters => :symbol, :header_converters => :symbol, :converters => :all, :encoding => 'windows-1251:utf-8', :col_sep => ";"}) do |row|
|
12
12
|
date = Date.parse(row[:date])
|
13
|
-
@transactions << [
|
14
|
-
end
|
15
|
-
|
16
|
-
|
17
|
-
CSV.foreach("test/montreal-velos-comptage/2011.csv", {:headers => true, :header_converters => :symbol, :header_converters => :symbol, :converters => :all, :encoding => 'windows-1251:utf-8', :col_sep => ";"}) do |row|
|
18
|
-
|
19
|
-
date = Date.parse(row[:date])
|
20
|
-
mois = date.month
|
21
|
-
mois = 2 if row[:date].to_s.match("/f.vr/")
|
22
|
-
mois = 3 if row[:date].to_s.include?("avr")
|
23
|
-
mois = 5 if row[:date].to_s.include?("mai")
|
24
|
-
mois = 6 if row[:date].to_s.include?("juin")
|
25
|
-
mois = 7 if row[:date].to_s.include?("juil")
|
26
|
-
mois = 8 if row[:date].to_s.match("/ao.t/")
|
27
|
-
mois = 9 if row[:date].to_s.match("/d.c/")
|
28
|
-
date = Date.new(2011, mois, date.day)
|
29
|
-
transaction = [ [:wday, date.wday], [:month, date.month], [:date, row[:date]], [:berri, row[:berri]], [:ctesaintecatherine, row[:ctesaintecatherine]], [:maisonneuve_1, row[:maisonneuve_1]], [:maisonneuve_2, row[:maisonneuve_2]], [:du_parc, row[:du_parc]], [:pierredupuy, row[:pierredupuy]], [:rachel, row[:rachel]]]
|
30
|
-
|
31
|
-
transaction.delete_if { |x| x[1]==nil }
|
32
|
-
@transactions << transaction
|
13
|
+
@transactions << [[:wday, date.wday], [:month, date.month], [:date, row[:date]], [:berri, row[:berri_1]], [:ctesaintecatherine, row[:ctesaintecatherine]], [:maisonneuve_1, row[:maisonneuve_1]], [:maisonneuve_2, row[:maisonneuve_2]], [:du_parc, row[:du_parc]], [:pierredupuy, row[:pierredupuy]], [:rachel, row[:rachel1]]]
|
33
14
|
end
|
34
15
|
|
35
16
|
|
@@ -42,9 +23,16 @@ class TestOpenDataVelo < Test::Unit::TestCase
|
|
42
23
|
|
43
24
|
# Fake test
|
44
25
|
def test_fp_growth
|
26
|
+
start = Time.now
|
45
27
|
fp_tree = FpGrowth::FpTree.build(@transactions, 1)
|
28
|
+
loop = Time.now
|
29
|
+
puts "Tree built in #{loop - start}"
|
30
|
+
|
46
31
|
patterns = FpGrowth::Miner.fp_growth(fp_tree)
|
47
32
|
|
33
|
+
finish = Time.now
|
34
|
+
puts "Tree Mined in #{finish -start}"
|
35
|
+
|
48
36
|
patterns.sort! { |a, b| a.support <=> b.support }.reverse!
|
49
37
|
|
50
38
|
=begin
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: fpgrowth
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.2
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,7 +9,7 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2013-05-
|
12
|
+
date: 2013-05-30 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: bundler
|
@@ -75,21 +75,25 @@ files:
|
|
75
75
|
- lib/fpgrowth/miner/pattern.rb
|
76
76
|
- lib/fpgrowth/miner/pattern_base_extractor.rb
|
77
77
|
- lib/fpgrowth/version.rb
|
78
|
+
- test/enel/3_canada_i_processi_e_i_prodotti_1.csv
|
79
|
+
- test/enel/Produzione_Impianti_Termoelettrici_en.csv
|
78
80
|
- test/montreal-sondage/mtlsat12.csv
|
79
81
|
- test/montreal-velos-comptage/2009.csv
|
80
82
|
- test/montreal-velos-comptage/2010.csv
|
81
83
|
- test/montreal-velos-comptage/2011.csv
|
82
84
|
- test/montreal-velos-comptage/2012.csv
|
83
85
|
- test/tc_builder.rb
|
86
|
+
- test/tc_conditional_tree_builder.rb
|
84
87
|
- test/tc_first_pass.rb
|
85
88
|
- test/tc_fp_tree.rb
|
86
89
|
- test/tc_miner.rb
|
87
90
|
- test/tc_node.rb
|
91
|
+
- test/tc_open_data_enel.rb
|
88
92
|
- test/tc_open_data_sondage_montreal.rb
|
89
93
|
- test/tc_open_data_velo_montreal.rb
|
90
94
|
- test/tc_pattern.rb
|
91
95
|
- test/tc_second_pass.rb
|
92
|
-
homepage:
|
96
|
+
homepage: https://github.com/thedamfr/fpgrowth
|
93
97
|
licenses:
|
94
98
|
- MIT
|
95
99
|
post_install_message:
|
@@ -115,16 +119,20 @@ signing_key:
|
|
115
119
|
specification_version: 3
|
116
120
|
summary: FP-Growth is mean to detect
|
117
121
|
test_files:
|
122
|
+
- test/enel/3_canada_i_processi_e_i_prodotti_1.csv
|
123
|
+
- test/enel/Produzione_Impianti_Termoelettrici_en.csv
|
118
124
|
- test/montreal-sondage/mtlsat12.csv
|
119
125
|
- test/montreal-velos-comptage/2009.csv
|
120
126
|
- test/montreal-velos-comptage/2010.csv
|
121
127
|
- test/montreal-velos-comptage/2011.csv
|
122
128
|
- test/montreal-velos-comptage/2012.csv
|
123
129
|
- test/tc_builder.rb
|
130
|
+
- test/tc_conditional_tree_builder.rb
|
124
131
|
- test/tc_first_pass.rb
|
125
132
|
- test/tc_fp_tree.rb
|
126
133
|
- test/tc_miner.rb
|
127
134
|
- test/tc_node.rb
|
135
|
+
- test/tc_open_data_enel.rb
|
128
136
|
- test/tc_open_data_sondage_montreal.rb
|
129
137
|
- test/tc_open_data_velo_montreal.rb
|
130
138
|
- test/tc_pattern.rb
|