reckon 0.4.4 → 0.5.0

Files changed (43)
  1. checksums.yaml +5 -5
  2. data/.ruby-version +1 -1
  3. data/.travis.yml +10 -2
  4. data/CHANGELOG.md +197 -0
  5. data/Gemfile +0 -1
  6. data/Gemfile.lock +33 -15
  7. data/README.md +2 -5
  8. data/lib/reckon.rb +10 -8
  9. data/lib/reckon/app.rb +92 -116
  10. data/lib/reckon/cosine_similarity.rb +119 -0
  11. data/lib/reckon/csv_parser.rb +57 -27
  12. data/lib/reckon/ledger_parser.rb +194 -30
  13. data/lib/reckon/money.rb +3 -4
  14. data/reckon.gemspec +6 -5
  15. data/spec/data_fixtures/73-sample.csv +2 -0
  16. data/spec/data_fixtures/73-tokens.yml +8 -0
  17. data/spec/data_fixtures/73-transactions.ledger +7 -0
  18. data/spec/data_fixtures/austrian_example.csv +13 -0
  19. data/spec/data_fixtures/bom_utf8_file.csv +1 -0
  20. data/spec/data_fixtures/broker_canada_example.csv +12 -0
  21. data/spec/data_fixtures/chase.csv +9 -0
  22. data/spec/data_fixtures/danish_kroner_nordea_example.csv +6 -0
  23. data/spec/data_fixtures/english_date_example.csv +3 -0
  24. data/spec/data_fixtures/french_example.csv +9 -0
  25. data/spec/data_fixtures/german_date_example.csv +3 -0
  26. data/spec/data_fixtures/harder_date_example.csv +5 -0
  27. data/spec/data_fixtures/ing.csv +3 -0
  28. data/spec/data_fixtures/intuit_mint_example.csv +7 -0
  29. data/spec/data_fixtures/invalid_header_example.csv +6 -0
  30. data/spec/data_fixtures/inversed_credit_card.csv +16 -0
  31. data/spec/data_fixtures/nationwide.csv +4 -0
  32. data/spec/data_fixtures/simple.csv +2 -0
  33. data/spec/data_fixtures/some_other.csv +9 -0
  34. data/spec/data_fixtures/spanish_date_example.csv +3 -0
  35. data/spec/data_fixtures/suntrust.csv +7 -0
  36. data/spec/data_fixtures/two_money_columns.csv +5 -0
  37. data/spec/data_fixtures/yyyymmdd_date_example.csv +1 -0
  38. data/spec/reckon/app_spec.rb +66 -34
  39. data/spec/reckon/csv_parser_spec.rb +79 -201
  40. data/spec/reckon/ledger_parser_spec.rb +62 -9
  41. data/spec/spec_helper.rb +3 -0
  42. metadata +62 -19
  43. data/CHANGES.md +0 -9
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: cf1d814fe911f299cf58401c3b36734e29a172f1
4
- data.tar.gz: 9130495690de1f73cfa3570a6e3386f38df86830
2
+ SHA256:
3
+ metadata.gz: b58a745c730c0dfaf022a98c19dfaf8568c0b3091548d70c9202f8f7c0d78ada
4
+ data.tar.gz: 030d2036524a19c5eff981608133e8aac15f185b57a891e91a6f1fdabd3796c7
5
5
  SHA512:
6
- metadata.gz: b7c3e23f96983cd79844ba37524f1b843a1dbba230198cf813b83c7c65a076506fb2c66ab663dee0970db2d42e1d3e0b03adf41ee163d08f3b38dbb163c373a6
7
- data.tar.gz: d74ab72db15d54bc0bfe548369ec9a8b3903760bc09fbc16340a6f64c929a69d595c4840900e8a425996bc580e5924321ece84b10d3e033b193b066f2520f7c2
6
+ metadata.gz: 1daf76a19078f1707e847a7ac54aad5836ee3d49ce9c2f612bfefeb86f8c1bf9c9d1f97b7bbecb309f9f44f8097f8f2dbe0d6b6062bc376a62fc88f45bab1dfc
7
+ data.tar.gz: 4784ac871dbdb6160dc53ef8a7f5d69f31ccc7f7f7ea876b97bf60941abb2d13a968f5229747dd74d0b574ecc12326485e1eaba7ea368a91162d36d3ef4f9f86
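The hunk above replaces the SHA1 entries in checksums.yaml with SHA256 ones and keeps SHA512. As a rough illustration of how digests of this shape can be computed, the sketch below uses Ruby's standard Digest library on an in-memory string. The helper name `checksums_for` and the sample payload are hypothetical, and this is not necessarily how RubyGems produces the file.

```ruby
require 'digest'

# Hypothetical helper: returns hex digests for a blob of bytes, keyed by the
# algorithm names that appear in checksums.yaml.
def checksums_for(bytes)
  {
    'SHA256' => Digest::SHA256.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# The string below only stands in for the contents of data.tar.gz.
sums = checksums_for('example payload standing in for data.tar.gz')
sums.each { |algo, hex| puts "#{algo}: #{hex}" }
```

A SHA256 hex digest is 64 characters long and a SHA512 one is 128, which matches the lengths of the new entries in the hunk.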
data/.ruby-version CHANGED
@@ -1 +1 @@
1
- ruby-2.2.0
1
+ 2.5
data/.travis.yml CHANGED
@@ -1,5 +1,13 @@
1
1
  language: ruby
2
2
  rvm:
3
- - 1.9.3
4
- - 2.0.0
3
+ # Mac High Sierra
4
+ - 2.0.0-p648
5
+ # Mac Mojave
6
+ - 2.3.7
7
+ # Ubuntu 19.10
8
+ - 2.5
9
+ # Mac Catalina
10
+ - 2.6
5
11
  script: "bundle exec rake"
12
+ before_install:
13
+ - sudo apt-get -y install ledger
data/CHANGELOG.md ADDED
@@ -0,0 +1,197 @@
1
+ # Changelog
2
+
3
+ ## [v0.5.0](https://github.com/cantino/reckon/tree/v0.5.0) (2020-02-19)
4
+
5
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.4...v0.5.0)
6
+
7
+ **Closed issues:**
8
+
9
+ - g [\#75](https://github.com/cantino/reckon/issues/75)
10
+ - Learn-from not working [\#74](https://github.com/cantino/reckon/issues/74)
11
+ - Tokens YAML fails to match [\#73](https://github.com/cantino/reckon/issues/73)
12
+ - Missing or stray quote in line error [\#71](https://github.com/cantino/reckon/issues/71)
13
+ - Support ISO 8601 formatting of dates in ledger file [\#70](https://github.com/cantino/reckon/issues/70)
14
+ - Looking for a new maintainer for Reckon [\#68](https://github.com/cantino/reckon/issues/68)
15
+ - Reckon undefined method to\_h when trying to parse csv file [\#66](https://github.com/cantino/reckon/issues/66)
16
+ - Runtime error [\#65](https://github.com/cantino/reckon/issues/65)
17
+ - Reckon doesn't learn from multiple sources [\#63](https://github.com/cantino/reckon/issues/63)
18
+ - problem of importing file [\#59](https://github.com/cantino/reckon/issues/59)
19
+ - Problem with file in which every column is quoted. [\#58](https://github.com/cantino/reckon/issues/58)
20
+ - Error in reckon for the same format csv file [\#57](https://github.com/cantino/reckon/issues/57)
21
+ - Parsing account names does not work if currency symbol is different from $ [\#56](https://github.com/cantino/reckon/issues/56)
22
+ - Problem reading csv file [\#55](https://github.com/cantino/reckon/issues/55)
23
+ - Problem with mint file [\#53](https://github.com/cantino/reckon/issues/53)
24
+ - --money-column [\#43](https://github.com/cantino/reckon/issues/43)
25
+
26
+ **Merged pull requests:**
27
+
28
+ - Fix bugs in ledger file parsing. Fixes \#56. [\#81](https://github.com/cantino/reckon/pull/81) ([benprew](https://github.com/benprew))
29
+ - Better file encoding suggestions [\#80](https://github.com/cantino/reckon/pull/80) ([benprew](https://github.com/benprew))
30
+ - :bug: fix matching algorithm, add logging and a spec helper. Fixes \#73 [\#79](https://github.com/cantino/reckon/pull/79) ([benprew](https://github.com/benprew))
31
+ - bug: invalid header lines should be ignored, not parsed. [\#78](https://github.com/cantino/reckon/pull/78) ([benprew](https://github.com/benprew))
32
+ - convert default date format to iso8601 [\#77](https://github.com/cantino/reckon/pull/77) ([benprew](https://github.com/benprew))
33
+ - Fix rspec failure for ruby 2.3 and 2.4 [\#69](https://github.com/cantino/reckon/pull/69) ([BlackEdder](https://github.com/BlackEdder))
34
+ - Allow setting of money and date columns by index [\#67](https://github.com/cantino/reckon/pull/67) ([cantino](https://github.com/cantino))
35
+
36
+ ## [v0.4.4](https://github.com/cantino/reckon/tree/v0.4.4) (2015-12-02)
37
+
38
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.3...v0.4.4)
39
+
40
+ **Merged pull requests:**
41
+
42
+ - Regexp support in the tokens file [\#54](https://github.com/cantino/reckon/pull/54) ([vzctl](https://github.com/vzctl))
43
+
44
+ ## [v0.4.3](https://github.com/cantino/reckon/tree/v0.4.3) (2015-08-16)
45
+
46
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.2...v0.4.3)
47
+
48
+ ## [v0.4.2](https://github.com/cantino/reckon/tree/v0.4.2) (2015-08-08)
49
+
50
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.1...v0.4.2)
51
+
52
+ **Merged pull requests:**
53
+
54
+ - Ignore empty description columns [\#52](https://github.com/cantino/reckon/pull/52) ([vzctl](https://github.com/vzctl))
55
+
56
+ ## [v0.4.1](https://github.com/cantino/reckon/tree/v0.4.1) (2015-07-08)
57
+
58
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.0...v0.4.1)
59
+
60
+ **Closed issues:**
61
+
62
+ - Unattended [\#50](https://github.com/cantino/reckon/issues/50)
63
+ - Debit/Credit Columns from SunTrust [\#42](https://github.com/cantino/reckon/issues/42)
64
+
65
+ **Merged pull requests:**
66
+
67
+ - \[RFC\] Fix \#42: Work with suntrust double column csv files [\#48](https://github.com/cantino/reckon/pull/48) ([BlackEdder](https://github.com/BlackEdder))
68
+
69
+ ## [v0.4.0](https://github.com/cantino/reckon/tree/v0.4.0) (2015-06-05)
70
+
71
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.10...v0.4.0)
72
+
73
+ **Implemented enhancements:**
74
+
75
+ - Tab completion for transactions [\#40](https://github.com/cantino/reckon/issues/40)
76
+ - feature: "unattended" mode [\#3](https://github.com/cantino/reckon/issues/3)
77
+
78
+ **Closed issues:**
79
+
80
+ - Missing or stray quote error [\#38](https://github.com/cantino/reckon/issues/38)
81
+
82
+ **Merged pull requests:**
83
+
84
+ - Better ISO 8601 dates support [\#49](https://github.com/cantino/reckon/pull/49) ([vzctl](https://github.com/vzctl))
85
+ - Unattended mode and custom tokens support [\#47](https://github.com/cantino/reckon/pull/47) ([vzctl](https://github.com/vzctl))
86
+ - \[RFC\] Implement issue \#40: Tab completion [\#46](https://github.com/cantino/reckon/pull/46) ([BlackEdder](https://github.com/BlackEdder))
87
+ - set readline to allow for backspace in ask dialog [\#44](https://github.com/cantino/reckon/pull/44) ([mrtazz](https://github.com/mrtazz))
88
+
89
+ ## [v0.3.10](https://github.com/cantino/reckon/tree/v0.3.10) (2014-08-16)
90
+
91
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.9...v0.3.10)
92
+
93
+ **Merged pull requests:**
94
+
95
+ - Fix --encoding option [\#41](https://github.com/cantino/reckon/pull/41) ([mamciek](https://github.com/mamciek))
96
+ - Bumped version number [\#37](https://github.com/cantino/reckon/pull/37) ([BlackEdder](https://github.com/BlackEdder))
97
+
98
+ ## [v0.3.9](https://github.com/cantino/reckon/tree/v0.3.9) (2014-02-20)
99
+
100
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.8...v0.3.9)
101
+
102
+ **Closed issues:**
103
+
104
+ - Idea/discussion: csv parser [\#25](https://github.com/cantino/reckon/issues/25)
105
+ - Silently misinterprets UK dates [\#18](https://github.com/cantino/reckon/issues/18)
106
+
107
+ **Merged pull requests:**
108
+
109
+ - Added spec for csv files from Broker Canada [\#36](https://github.com/cantino/reckon/pull/36) ([BlackEdder](https://github.com/BlackEdder))
110
+ - Date format [\#35](https://github.com/cantino/reckon/pull/35) ([BlackEdder](https://github.com/BlackEdder))
111
+ - Added example from a french bank [\#34](https://github.com/cantino/reckon/pull/34) ([BlackEdder](https://github.com/BlackEdder))
112
+ - Austrian example [\#33](https://github.com/cantino/reckon/pull/33) ([BlackEdder](https://github.com/BlackEdder))
113
+ - Ing csv [\#30](https://github.com/cantino/reckon/pull/30) ([BlackEdder](https://github.com/BlackEdder))
114
+ - Further improvements in nationwide csv handling [\#29](https://github.com/cantino/reckon/pull/29) ([BlackEdder](https://github.com/BlackEdder))
115
+ - Refactor: Add money class [\#28](https://github.com/cantino/reckon/pull/28) ([BlackEdder](https://github.com/BlackEdder))
116
+ - Initial split of CSVparser from class App [\#27](https://github.com/cantino/reckon/pull/27) ([BlackEdder](https://github.com/BlackEdder))
117
+ - Updated version of pull request 24: Allow for other currency symbols while calculating money\_score [\#26](https://github.com/cantino/reckon/pull/26) ([BlackEdder](https://github.com/BlackEdder))
118
+ - Change double column detection [\#23](https://github.com/cantino/reckon/pull/23) ([BlackEdder](https://github.com/BlackEdder))
119
+ - Added optional argument to contains\_header to skip multiple header lines [\#22](https://github.com/cantino/reckon/pull/22) ([BlackEdder](https://github.com/BlackEdder))
120
+ - Add a Bitdeli Badge to README [\#20](https://github.com/cantino/reckon/pull/20) ([bitdeli-chef](https://github.com/bitdeli-chef))
121
+ - Update README to show latest usage info [\#19](https://github.com/cantino/reckon/pull/19) ([purcell](https://github.com/purcell))
122
+
123
+ ## [v0.3.8](https://github.com/cantino/reckon/tree/v0.3.8) (2013-07-03)
124
+
125
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.7...v0.3.8)
126
+
127
+ **Implemented enhancements:**
128
+
129
+ - Support other currencies [\#7](https://github.com/cantino/reckon/issues/7)
130
+
131
+ **Closed issues:**
132
+
133
+ - Add support for dates in spanish dd/mm/yyyy [\#13](https://github.com/cantino/reckon/issues/13)
134
+ - Problems with my csv file [\#8](https://github.com/cantino/reckon/issues/8)
135
+
136
+ **Merged pull requests:**
137
+
138
+ - add support for spanish dates dd/mm/yyyy closes \#13 [\#14](https://github.com/cantino/reckon/pull/14) ([mauromorales](https://github.com/mauromorales))
139
+ - fix issue showing true when parsing the currency option related to \#7 [\#12](https://github.com/cantino/reckon/pull/12) ([mauromorales](https://github.com/mauromorales))
140
+
141
+ ## [v0.3.7](https://github.com/cantino/reckon/tree/v0.3.7) (2013-06-27)
142
+
143
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.6...v0.3.7)
144
+
145
+ **Merged pull requests:**
146
+
147
+ - Updated the sources to allow for custom curreny [\#11](https://github.com/cantino/reckon/pull/11) ([ghost](https://github.com/ghost))
148
+ - Add --account option on the commandline [\#10](https://github.com/cantino/reckon/pull/10) ([copiousfreetime](https://github.com/copiousfreetime))
149
+
150
+ ## [v0.3.6](https://github.com/cantino/reckon/tree/v0.3.6) (2013-04-30)
151
+
152
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.5...v0.3.6)
153
+
154
+ **Closed issues:**
155
+
156
+ - iso-8859-1 CSV with accented chars =\> invalid byte sequence in UTF-8 \(ArgumentError\) [\#5](https://github.com/cantino/reckon/issues/5)
157
+ - Ruby 2.0 compatibility [\#4](https://github.com/cantino/reckon/issues/4)
158
+
159
+ **Merged pull requests:**
160
+
161
+ - Recognize yyyymmdd date in Reckon::App\#date\_for. [\#9](https://github.com/cantino/reckon/pull/9) ([mhoogendoorn](https://github.com/mhoogendoorn))
162
+
163
+ ## [v0.3.5](https://github.com/cantino/reckon/tree/v0.3.5) (2013-03-24)
164
+
165
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.4...v0.3.5)
166
+
167
+ **Closed issues:**
168
+
169
+ - backtrace trying to run reckon -f [\#2](https://github.com/cantino/reckon/issues/2)
170
+
171
+ **Merged pull requests:**
172
+
173
+ - Inverse mode [\#6](https://github.com/cantino/reckon/pull/6) ([nathankot](https://github.com/nathankot))
174
+
175
+ ## [v0.3.4](https://github.com/cantino/reckon/tree/v0.3.4) (2013-02-16)
176
+
177
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.3...v0.3.4)
178
+
179
+ **Merged pull requests:**
180
+
181
+ - adds support for Nordea csv files [\#1](https://github.com/cantino/reckon/pull/1) ([x2q](https://github.com/x2q))
182
+
183
+ ## [v0.3.3](https://github.com/cantino/reckon/tree/v0.3.3) (2013-01-13)
184
+
185
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.1...v0.3.3)
186
+
187
+ ## [v0.3.1](https://github.com/cantino/reckon/tree/v0.3.1) (2012-07-30)
188
+
189
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.2...v0.3.1)
190
+
191
+ ## [v0.3.2](https://github.com/cantino/reckon/tree/v0.3.2) (2012-07-30)
192
+
193
+ [Full Changelog](https://github.com/cantino/reckon/compare/5c07bea3fe63f9b909b4b76bd49f22fd8faf7a29...v0.3.2)
194
+
195
+
196
+
197
+ \* *This Changelog was automatically generated by [github_changelog_generator](https://github.com/github-changelog-generator/github-changelog-generator)*
data/Gemfile CHANGED
@@ -1,5 +1,4 @@
1
1
  source "http://rubygems.org"
2
-
3
2
  gemspec
4
3
 
5
4
  gem 'rake'
data/Gemfile.lock CHANGED
@@ -1,34 +1,52 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- reckon (0.4.4)
4
+ reckon (0.5.0)
5
5
  chronic (>= 0.3.0)
6
- fastercsv (>= 1.5.1)
7
6
  highline (>= 1.5.2)
7
+ rchardet (>= 1.8.0)
8
8
  terminal-table (>= 1.4.2)
9
9
 
10
10
  GEM
11
11
  remote: http://rubygems.org/
12
12
  specs:
13
13
  chronic (0.10.2)
14
- diff-lcs (1.1.3)
15
- fastercsv (1.5.5)
16
- highline (1.6.21)
17
- rake (10.0.4)
18
- rspec (2.11.0)
19
- rspec-core (~> 2.11.0)
20
- rspec-expectations (~> 2.11.0)
21
- rspec-mocks (~> 2.11.0)
22
- rspec-core (2.11.1)
23
- rspec-expectations (2.11.2)
24
- diff-lcs (~> 1.1.3)
25
- rspec-mocks (2.11.1)
26
- terminal-table (1.4.5)
14
+ coderay (1.1.2)
15
+ diff-lcs (1.3)
16
+ highline (2.0.3)
17
+ method_source (0.9.2)
18
+ pry (0.12.2)
19
+ coderay (~> 1.1.0)
20
+ method_source (~> 0.9.0)
21
+ rake (12.3.3)
22
+ rantly (1.2.0)
23
+ rchardet (1.8.0)
24
+ rspec (3.9.0)
25
+ rspec-core (~> 3.9.0)
26
+ rspec-expectations (~> 3.9.0)
27
+ rspec-mocks (~> 3.9.0)
28
+ rspec-core (3.9.1)
29
+ rspec-support (~> 3.9.1)
30
+ rspec-expectations (3.9.0)
31
+ diff-lcs (>= 1.2.0, < 2.0)
32
+ rspec-support (~> 3.9.0)
33
+ rspec-mocks (3.9.1)
34
+ diff-lcs (>= 1.2.0, < 2.0)
35
+ rspec-support (~> 3.9.0)
36
+ rspec-support (3.9.2)
37
+ terminal-table (1.8.0)
38
+ unicode-display_width (~> 1.1, >= 1.1.1)
39
+ unicode-display_width (1.6.1)
27
40
 
28
41
  PLATFORMS
29
42
  ruby
30
43
 
31
44
  DEPENDENCIES
45
+ pry (>= 0.12.2)
32
46
  rake
47
+ rantly (= 1.2.0)
33
48
  reckon!
34
49
  rspec (>= 1.2.9)
50
+
51
+ BUNDLED WITH
52
+ 1.17.3
data/README.md CHANGED
@@ -1,8 +1,8 @@
1
1
  # Reckon
2
2
 
3
- [![Build Status](https://travis-ci.org/cantino/reckon.png)](https://travis-ci.org/cantino/reckon)
3
+ [![Build Status](https://travis-ci.org/cantino/reckon.png?branch=master)](https://travis-ci.org/cantino/reckon)
4
4
 
5
- Reckon automagically converts CSV files for use with the command-line accounting tool [Ledger](https://github.com/jwiegley/ledger/wiki). It also helps you to select the correct accounts associated with the CSV data using Bayesian machine learning.
5
+ Reckon automagically converts CSV files for use with the command-line accounting tool [Ledger](http://www.ledger-cli.org/). It also helps you to select the correct accounts associated with the CSV data using Bayesian machine learning.
6
6
 
7
7
  ## Installation
8
8
 
@@ -114,6 +114,3 @@ You can override them with `--default_outof_account` and `--default_into_account
114
114
  Copyright (c) 2013 Andrew Cantino. See LICENSE for details.
115
115
 
116
116
  Thanks to @BlackEdder for many contributions!
117
-
118
- [![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/cantino/reckon/trend.png)](https://bitdeli.com/free "Bitdeli Badge")
119
-
data/lib/reckon.rb CHANGED
@@ -1,19 +1,21 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
3
  require 'rubygems'
4
- if RUBY_VERSION =~ /^1\.9/ || RUBY_VERSION =~ /^2/
5
- require 'csv'
6
- else
7
- require 'fastercsv'
8
- end
4
+ require 'rchardet'
5
+ require 'chronic'
6
+ require 'csv'
9
7
  require 'highline/import'
10
8
  require 'optparse'
11
- require 'chronic'
12
- require 'time'
13
9
  require 'terminal-table'
10
+ require 'time'
11
+ require 'logger'
12
+
13
+ LOGGER = Logger.new(STDOUT)
14
+ LOGGER.level = Logger::ERROR
14
15
 
16
+ require_relative 'reckon/version'
17
+ require_relative 'reckon/cosine_similarity'
15
18
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "app"))
16
19
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "ledger_parser"))
17
20
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "csv_parser"))
18
21
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "money"))
19
-
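The hunk above introduces a global `LOGGER` that defaults to `Logger::ERROR`, and `App#initialize` lowers it to `Logger::INFO` when `options[:verbose]` is set. The sketch below shows the effect of those two levels using Ruby's standard Logger. It writes to a `StringIO` buffer instead of STDOUT purely so the result is easy to inspect; that substitution and the message texts are assumptions made for illustration.

```ruby
require 'logger'
require 'stringio'

# In-memory sink, used here only so the output can be examined afterwards.
buffer = StringIO.new
logger = Logger.new(buffer)

# Default threshold from the diff: only ERROR and above are written.
logger.level = Logger::ERROR
logger.info('skipped: below the ERROR threshold')
logger.error('kept: at the ERROR threshold')

# Lowered threshold, as done when the verbose option is set.
logger.level = Logger::INFO
logger.info('kept: INFO is now enabled')

puts buffer.string
```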
data/lib/reckon/app.rb CHANGED
@@ -1,21 +1,20 @@
1
- #coding: utf-8
1
+ # coding: utf-8
2
2
  require 'pp'
3
3
  require 'yaml'
4
4
 
5
5
  module Reckon
6
6
  class App
7
- VERSION = "Reckon 0.4.4"
8
- attr_accessor :options, :accounts, :tokens, :seen, :csv_parser, :regexps
7
+ attr_accessor :options, :seen, :csv_parser, :regexps, :matcher
9
8
 
10
9
  def initialize(options = {})
10
+ LOGGER.level = Logger::INFO if options[:verbose]
11
11
  self.options = options
12
- self.tokens = {}
13
12
  self.regexps = {}
14
- self.accounts = {}
15
13
  self.seen = {}
16
14
  self.options[:currency] ||= '$'
17
15
  options[:string] = File.read(options[:file]) unless options[:string]
18
16
  @csv_parser = CSVParser.new( options )
17
+ @matcher = CosineSimilarity.new(options)
19
18
  learn!
20
19
  end
21
20
 
@@ -24,21 +23,44 @@ module Reckon
24
23
  puts str
25
24
  end
26
25
 
26
+ def learn!
27
+ learn_from_account_tokens(options[:account_tokens_file])
28
+
29
+ ledger_file = options[:existing_ledger_file]
30
+ return unless ledger_file
31
+ fail "#{ledger_file} doesn't exist!" unless File.exists?(ledger_file)
32
+ learn_from(File.read(ledger_file))
33
+ end
34
+
35
+ def learn_from_account_tokens(filename)
36
+ return unless filename
37
+
38
+ fail "#{filename} doesn't exist!" unless File.exists?(filename)
39
+
40
+ extract_account_tokens(YAML.load_file(filename)).each do |account, tokens|
41
+ tokens.each do |t|
42
+ if t.start_with?('/')
43
+ add_regexp(account, t)
44
+ else
45
+ @matcher.add_document(account, t)
46
+ end
47
+ end
48
+ end
49
+ end
50
+
27
51
  def learn_from(ledger)
28
52
  LedgerParser.new(ledger).entries.each do |entry|
29
53
  entry[:accounts].each do |account|
30
- learn_about_account( account[:name],
31
- [entry[:desc], account[:amount]].join(" ") ) unless account[:name] == options[:bank_account]
32
- seen[entry[:date]] ||= {}
33
- seen[entry[:date]][@csv_parser.pretty_money(account[:amount])] = true
54
+ str = [entry[:desc], account[:amount]].join(" ")
55
+ @matcher.add_document(account[:name], str) unless account[:name] == options[:bank_account]
56
+ pretty_date = entry[:date].iso8601
57
+ seen[pretty_date] ||= {}
58
+ seen[pretty_date][@csv_parser.pretty_money(account[:amount])] = true
34
59
  end
35
60
  end
36
61
  end
37
62
 
38
- def already_seen?(row)
39
- seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]]
40
- end
41
-
63
+ # Add tokens from account_tokens_file to accounts
42
64
  def extract_account_tokens(subtree, account = nil)
43
65
  if subtree.nil?
44
66
  puts "Warning: empty #{account} tree"
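In the `learn_from` change above, `seen` is now keyed by `entry[:date].iso8601` rather than by the raw date, while `already_seen?` looks rows up by `row[:pretty_date]`. The lookup therefore only hits when both sides use the same string format. The sketch below illustrates that with hypothetical entries and rows; it assumes `pretty_date` is an ISO 8601 string, which this part of the diff does not show directly.

```ruby
require 'date'

# Hypothetical stand-ins for parsed ledger entries: a Date plus an
# already-formatted amount string.
entries = [
  { date: Date.new(2020, 2, 19), pretty_money: '$12.50' },
  { date: Date.new(2020, 2, 20), pretty_money: '$3.00' }
]

# seen[iso_date][pretty_money] = true, the same shape learn_from builds.
seen = {}
entries.each do |entry|
  key = entry[:date].iso8601
  seen[key] ||= {}
  seen[key][entry[:pretty_money]] = true
end

# A row counts as already seen when both its date and its amount are recorded.
def already_seen?(seen, row)
  !!(seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]])
end

puts already_seen?(seen, { pretty_date: '2020-02-19', pretty_money: '$12.50' })
puts already_seen?(seen, { pretty_date: '2020/02/19', pretty_money: '$12.50' })
```

The second lookup misses only because the date string is formatted differently, which is the kind of mismatch a single shared date format avoids.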
@@ -46,50 +68,26 @@ module Reckon
46
68
  elsif subtree.is_a?(Array)
47
69
  { account => subtree }
48
70
  else
49
- at = subtree.map { |k, v| extract_account_tokens(v, [account, k].compact.join(':')) }
50
- at.inject({}) { |k, v| k = k.merge(v)}
51
- end
52
- end
53
-
54
- def learn!
55
- if options[:account_tokens_file]
56
- fail "#{options[:account_tokens_file]} doesn't exist!" unless File.exists?(options[:account_tokens_file])
57
- extract_account_tokens(YAML.load_file(options[:account_tokens_file])).each do |account, tokens|
58
- tokens.each { |t| learn_about_account(account, t, true) }
71
+ at = subtree.map do |k, v|
72
+ merged_acct = [account, k].compact.join(':')
73
+ extract_account_tokens(v, merged_acct)
59
74
  end
75
+ at.inject({}) { |memo, e| memo.merge!(e)}
60
76
  end
61
- return unless options[:existing_ledger_file]
62
- fail "#{options[:existing_ledger_file]} doesn't exist!" unless File.exists?(options[:existing_ledger_file])
63
- ledger_data = File.read(options[:existing_ledger_file])
64
- learn_from(ledger_data)
65
77
  end
66
78
 
67
- def learn_about_account(account, data, parse_regexps = false)
68
- accounts[account] ||= 0
69
- if parse_regexps && data.start_with?('/')
70
- # https://github.com/tenderlove/psych/blob/master/lib/psych/visitors/to_ruby.rb
71
- match = data.match(/^\/(.*)\/([ix]*)$/m)
72
- fail "failed to parse regexp #{data}" unless match
73
- options = 0
74
- (match[2] || '').split('').each do |option|
75
- case option
76
- when 'x' then options |= Regexp::EXTENDED
77
- when 'i' then options |= Regexp::IGNORECASE
78
- end
79
- end
80
- regexps[Regexp.new(match[1], options)] = account
81
- else
82
- tokenize(data).each do |token|
83
- tokens[token] ||= {}
84
- tokens[token][account] ||= 0
85
- tokens[token][account] += 1
86
- accounts[account] += 1
79
+ def add_regexp(account, regex_str)
80
+ # https://github.com/tenderlove/psych/blob/master/lib/psych/visitors/to_ruby.rb
81
+ match = regex_str.match(/^\/(.*)\/([ix]*)$/m)
82
+ fail "failed to parse regexp #{regex_str}" unless match
83
+ options = 0
84
+ (match[2] || '').split('').each do |option|
85
+ case option
86
+ when 'x' then options |= Regexp::EXTENDED
87
+ when 'i' then options |= Regexp::IGNORECASE
87
88
  end
88
89
  end
89
- end
90
-
91
- def tokenize(str)
92
- str.downcase.split(/[\s\-]/)
90
+ regexps[Regexp.new(match[1], options)] = account
93
91
  end
94
92
 
95
93
  def walk_backwards
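The two hunks above split token loading into `extract_account_tokens`, which flattens the nested tokens YAML into colon-joined account names, and `add_regexp`, which turns tokens written as `/pattern/flags` into `Regexp` objects. The following is a simplified, standalone sketch of that flow. The helper names `flatten_tokens` and `parse_regexp_token` and the sample YAML are hypothetical, and unlike the diff this version returns `nil` for plain tokens instead of failing.

```ruby
require 'yaml'

# Flatten a nested account tree into { "A:B" => [tokens] }.
def flatten_tokens(subtree, account = nil)
  return {} if subtree.nil?
  return { account => subtree } if subtree.is_a?(Array)

  subtree.each_with_object({}) do |(name, child), memo|
    memo.merge!(flatten_tokens(child, [account, name].compact.join(':')))
  end
end

# Turn a "/pattern/flags" string into a Regexp; return nil for plain tokens.
def parse_regexp_token(token)
  match = token.match(/\A\/(.*)\/([ix]*)\z/m)
  return nil unless match

  flags = 0
  flags |= Regexp::IGNORECASE if match[2].include?('i')
  flags |= Regexp::EXTENDED if match[2].include?('x')
  Regexp.new(match[1], flags)
end

# Hypothetical tokens file contents (single-quoted heredoc keeps \s literal).
tree = YAML.safe_load(<<~'YAML')
  Expenses:
    Food:
      - grocery
      - /coffee\s+shop/i
  Income:
    Salary:
      - payroll
YAML

flatten_tokens(tree).each do |account, tokens|
  tokens.each do |token|
    regexp = parse_regexp_token(token)
    kind = regexp ? "regexp #{regexp.inspect}" : 'plain token'
    puts "#{account}: #{token} (#{kind})"
  end
end
```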
@@ -107,8 +105,7 @@ module Reckon
107
105
  seen_anything_new = true
108
106
  end
109
107
 
110
- possible_answers = most_specific_regexp_match(row)
111
- possible_answers = weighted_account_match( row ).map! { |a| a[:account] } if possible_answers.empty?
108
+ possible_answers = suggest(row)
112
109
 
113
110
  ledger = if row[:money] > 0
114
111
  if options[:unattended]
@@ -156,15 +153,19 @@ module Reckon
156
153
  end
157
154
  end
158
155
 
159
- def finish
160
- options[:output_file].close unless options[:output_file] == STDOUT
161
- interactive_output "Exiting."
162
- exit
163
- end
164
-
165
- def output(ledger_line)
166
- options[:output_file].puts ledger_line
167
- options[:output_file].flush
156
+ def each_row_backwards
157
+ rows = []
158
+ (0...@csv_parser.columns.first.length).to_a.each do |index|
159
+ rows << { :date => @csv_parser.date_for(index),
160
+ :pretty_date => @csv_parser.pretty_date_for(index),
161
+ :pretty_money => @csv_parser.pretty_money_for(index),
162
+ :pretty_money_negated => @csv_parser.pretty_money_for(index, :negate),
163
+ :money => @csv_parser.money_for(index),
164
+ :description => @csv_parser.description_for(index) }
165
+ end
166
+ rows.sort { |a, b| a[:date] <=> b[:date] }.each do |row|
167
+ yield row
168
+ end
168
169
  end
169
170
 
170
171
  def most_specific_regexp_match( row )
@@ -176,41 +177,9 @@ module Reckon
176
177
  matches.sort_by! { |account, matched_text| matched_text.length }.map(&:first)
177
178
  end
178
179
 
179
- # Weigh accounts by how well they match the row
180
- def weighted_account_match( row )
181
- query_tokens = tokenize(row[:description])
182
-
183
- search_vector = []
184
- account_vectors = {}
185
-
186
- query_tokens.each do |token|
187
- idf = Math.log((accounts.keys.length + 1) / ((tokens[token] || {}).keys.length.to_f + 1))
188
- tf = 1.0 / query_tokens.length.to_f
189
- search_vector << tf*idf
190
-
191
- accounts.each do |account, total_terms|
192
- tf = (tokens[token] && tokens[token][account]) ? tokens[token][account] / total_terms.to_f : 0
193
- account_vectors[account] ||= []
194
- account_vectors[account] << tf*idf
195
- end
196
- end
197
-
198
- # Should I normalize the vectors? Probably unnecessary due to tf-idf and short documents.
199
-
200
- account_vectors = account_vectors.to_a.map do |account, account_vector|
201
- { :cosine => (0...account_vector.length).to_a.inject(0) { |m, i| m + search_vector[i] * account_vector[i] },
202
- :account => account }
203
- end
204
- account_vectors.sort! {|a, b| b[:cosine] <=> a[:cosine] }
205
-
206
- # Return empty set if no accounts matched so that we can fallback to the defaults in the unattended mode
207
- if options[:unattended]
208
- if account_vectors.first && account_vectors.first[:account]
209
- account_vectors = [] if account_vectors.first[:cosine] == 0
210
- end
211
- end
212
-
213
- return account_vectors
180
+ def suggest(row)
181
+ most_specific_regexp_match(row) +
182
+ @matcher.find_similar(row[:description]).map { |n| n[:account] }
214
183
  end
215
184
 
216
185
  def ledger_format(row, line1, line2)
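The hunk above removes the inline tf-idf scoring and has `suggest` return regexp matches first, followed by accounts from `@matcher.find_similar`. The `CosineSimilarity` class itself lives in lib/reckon/cosine_similarity.rb, which is not shown in this part of the diff, so the sketch below is not that implementation. It is a generic bag-of-words cosine similarity over raw term counts, with a hypothetical class name `TinyMatcher`, that exposes the same two calls used above (`add_document` and `find_similar`).

```ruby
# Generic bag-of-words cosine similarity; NOT the gem's CosineSimilarity class.
class TinyMatcher
  def initialize
    @docs = Hash.new { |h, k| h[k] = Hash.new(0) } # account => term counts
  end

  def add_document(account, text)
    tokenize(text).each { |t| @docs[account][t] += 1 }
  end

  # Returns [{ account:, similarity: }, ...] best first, zero scores dropped.
  def find_similar(text)
    query = Hash.new(0)
    tokenize(text).each { |t| query[t] += 1 }

    @docs.map { |account, counts| { account: account, similarity: cosine(query, counts) } }
         .select { |r| r[:similarity] > 0 }
         .sort_by { |r| -r[:similarity] }
  end

  private

  def tokenize(str)
    str.downcase.split(/[\s\-]+/).reject(&:empty?)
  end

  # cos(a, b) = (a . b) / (|a| * |b|), treating term counts as vectors.
  def cosine(a, b)
    dot = a.sum { |term, n| n * b.fetch(term, 0) }
    norm = Math.sqrt(a.values.sum { |n| n * n }) * Math.sqrt(b.values.sum { |n| n * n })
    norm.zero? ? 0.0 : dot / norm
  end
end

matcher = TinyMatcher.new
matcher.add_document('Expenses:Food', 'whole foods market grocery')
matcher.add_document('Expenses:Transport', 'city metro transit pass')
p matcher.find_similar('WHOLE FOODS #123').map { |r| r[:account] }
```

Dropping zero-similarity accounts means a description with no known tokens yields an empty list, so a caller can fall back to default accounts.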
@@ -220,6 +189,21 @@ module Reckon
220
189
  out
221
190
  end
222
191
 
192
+ def output(ledger_line)
193
+ options[:output_file].puts ledger_line
194
+ options[:output_file].flush
195
+ end
196
+
197
+ def already_seen?(row)
198
+ seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]]
199
+ end
200
+
201
+ def finish
202
+ options[:output_file].close unless options[:output_file] == STDOUT
203
+ interactive_output "Exiting."
204
+ exit
205
+ end
206
+
223
207
  def output_table
224
208
  output = Terminal::Table.new do |t|
225
209
  t.headings = 'Date', 'Amount', 'Description'
@@ -230,21 +214,6 @@ module Reckon
230
214
  interactive_output output
231
215
  end
232
216
 
233
- def each_row_backwards
234
- rows = []
235
- (0...@csv_parser.columns.first.length).to_a.each do |index|
236
- rows << { :date => @csv_parser.date_for(index),
237
- :pretty_date => @csv_parser.pretty_date_for(index),
238
- :pretty_money => @csv_parser.pretty_money_for(index),
239
- :pretty_money_negated => @csv_parser.pretty_money_for(index, :negate),
240
- :money => @csv_parser.money_for(index),
241
- :description => @csv_parser.description_for(index) }
242
- end
243
- rows.sort { |a, b| a[:date] <=> b[:date] }.each do |row|
244
- yield row
245
- end
246
- end
247
-
248
217
  def self.parse_opts(args = ARGV)
249
218
  options = { :output_file => STDOUT }
250
219
  parser = OptionParser.new do |opts|
@@ -255,7 +224,7 @@ module Reckon
255
224
  options[:file] = file
256
225
  end
257
226
 
258
- opts.on("-a", "--account name", "The Ledger Account this file is for") do |a|
227
+ opts.on("-a", "--account NAME", "The Ledger Account this file is for") do |a|
259
228
  options[:bank_account] = a
260
229
  end
261
230
 
@@ -283,6 +252,14 @@ module Reckon
283
252
  options[:ignore_columns] = ignore.split(",").map { |i| i.to_i }
284
253
  end
285
254
 
255
+ opts.on("", "--money-column 2", Integer, "Specify the money column instead of letting Reckon guess - the first column is column 1") do |column_number|
256
+ options[:money_column] = column_number
257
+ end
258
+
259
+ opts.on("", "--date-column 3", Integer, "Specify the date column instead of letting Reckon guess - the first column is column 1") do |column_number|
260
+ options[:date_column] = column_number
261
+ end
262
+
286
263
  opts.on("", "--contains-header [N]", "The first row of the CSV is a header and should be skipped. Optionally add the number of rows to skip.") do |contains_header|
287
264
  options[:contains_header] = 1
288
265
  options[:contains_header] = contains_header.to_i if contains_header
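The new `--money-column` and `--date-column` options above pass `Integer` to `opts.on`, which asks OptionParser to coerce and validate the argument. The sketch below isolates that behavior with Ruby's standard OptionParser. The wrapper method `parse_columns` and the shortened help strings are hypothetical; how the gem later translates the 1-based column number into an index is not shown in this part of the diff.

```ruby
require 'optparse'

# Hypothetical wrapper: parses only the two column options from the diff.
def parse_columns(args)
  options = {}
  OptionParser.new do |opts|
    # Passing Integer makes OptionParser convert the value and reject non-numbers.
    opts.on('--money-column N', Integer, 'Money column (first column is 1)') do |n|
      options[:money_column] = n
    end
    opts.on('--date-column N', Integer, 'Date column (first column is 1)') do |n|
      options[:date_column] = n
    end
  end.parse!(args.dup)
  options
end

p parse_columns(%w[--money-column 2 --date-column 3])

begin
  parse_columns(%w[--money-column two])
rescue OptionParser::InvalidArgument => e
  puts "rejected: #{e.message}"
end
```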
@@ -316,11 +293,11 @@ module Reckon
316
293
  options[:account_tokens_file] = a
317
294
  end
318
295
 
319
- opts.on("", "--default-into-account name", "Default into account") do |a|
296
+ opts.on("", "--default-into-account NAME", "Default into account") do |a|
320
297
  options[:default_into_account] = a
321
298
  end
322
299
 
323
- opts.on("", "--default-outof-account name", "Default 'out of' account") do |a|
300
+ opts.on("", "--default-outof-account NAME", "Default 'out of' account") do |a|
324
301
  options[:default_outof_account] = a
325
302
  end
326
303
 
@@ -351,7 +328,6 @@ module Reckon
351
328
  end
352
329
 
353
330
  unless options[:bank_account]
354
-
355
331
  fail "Please specify --account for the unattended mode" if options[:unattended]
356
332
 
357
333
  options[:bank_account] = ask("What is the account name of this bank account in Ledger? ") do |q|