reckon 0.4.4 → 0.5.0

Files changed (43)
  1. checksums.yaml +5 -5
  2. data/.ruby-version +1 -1
  3. data/.travis.yml +10 -2
  4. data/CHANGELOG.md +197 -0
  5. data/Gemfile +0 -1
  6. data/Gemfile.lock +33 -15
  7. data/README.md +2 -5
  8. data/lib/reckon.rb +10 -8
  9. data/lib/reckon/app.rb +92 -116
  10. data/lib/reckon/cosine_similarity.rb +119 -0
  11. data/lib/reckon/csv_parser.rb +57 -27
  12. data/lib/reckon/ledger_parser.rb +194 -30
  13. data/lib/reckon/money.rb +3 -4
  14. data/reckon.gemspec +6 -5
  15. data/spec/data_fixtures/73-sample.csv +2 -0
  16. data/spec/data_fixtures/73-tokens.yml +8 -0
  17. data/spec/data_fixtures/73-transactions.ledger +7 -0
  18. data/spec/data_fixtures/austrian_example.csv +13 -0
  19. data/spec/data_fixtures/bom_utf8_file.csv +1 -0
  20. data/spec/data_fixtures/broker_canada_example.csv +12 -0
  21. data/spec/data_fixtures/chase.csv +9 -0
  22. data/spec/data_fixtures/danish_kroner_nordea_example.csv +6 -0
  23. data/spec/data_fixtures/english_date_example.csv +3 -0
  24. data/spec/data_fixtures/french_example.csv +9 -0
  25. data/spec/data_fixtures/german_date_example.csv +3 -0
  26. data/spec/data_fixtures/harder_date_example.csv +5 -0
  27. data/spec/data_fixtures/ing.csv +3 -0
  28. data/spec/data_fixtures/intuit_mint_example.csv +7 -0
  29. data/spec/data_fixtures/invalid_header_example.csv +6 -0
  30. data/spec/data_fixtures/inversed_credit_card.csv +16 -0
  31. data/spec/data_fixtures/nationwide.csv +4 -0
  32. data/spec/data_fixtures/simple.csv +2 -0
  33. data/spec/data_fixtures/some_other.csv +9 -0
  34. data/spec/data_fixtures/spanish_date_example.csv +3 -0
  35. data/spec/data_fixtures/suntrust.csv +7 -0
  36. data/spec/data_fixtures/two_money_columns.csv +5 -0
  37. data/spec/data_fixtures/yyyymmdd_date_example.csv +1 -0
  38. data/spec/reckon/app_spec.rb +66 -34
  39. data/spec/reckon/csv_parser_spec.rb +79 -201
  40. data/spec/reckon/ledger_parser_spec.rb +62 -9
  41. data/spec/spec_helper.rb +3 -0
  42. metadata +62 -19
  43. data/CHANGES.md +0 -9
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: cf1d814fe911f299cf58401c3b36734e29a172f1
4
- data.tar.gz: 9130495690de1f73cfa3570a6e3386f38df86830
2
+ SHA256:
3
+ metadata.gz: b58a745c730c0dfaf022a98c19dfaf8568c0b3091548d70c9202f8f7c0d78ada
4
+ data.tar.gz: 030d2036524a19c5eff981608133e8aac15f185b57a891e91a6f1fdabd3796c7
5
5
  SHA512:
6
- metadata.gz: b7c3e23f96983cd79844ba37524f1b843a1dbba230198cf813b83c7c65a076506fb2c66ab663dee0970db2d42e1d3e0b03adf41ee163d08f3b38dbb163c373a6
7
- data.tar.gz: d74ab72db15d54bc0bfe548369ec9a8b3903760bc09fbc16340a6f64c929a69d595c4840900e8a425996bc580e5924321ece84b10d3e033b193b066f2520f7c2
6
+ metadata.gz: 1daf76a19078f1707e847a7ac54aad5836ee3d49ce9c2f612bfefeb86f8c1bf9c9d1f97b7bbecb309f9f44f8097f8f2dbe0d6b6062bc376a62fc88f45bab1dfc
7
+ data.tar.gz: 4784ac871dbdb6160dc53ef8a7f5d69f31ccc7f7f7ea876b97bf60941abb2d13a968f5229747dd74d0b574ecc12326485e1eaba7ea368a91162d36d3ef4f9f86
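The hunk above replaces the SHA1 entries in checksums.yaml with SHA256 ones and keeps SHA512. As a minimal sketch of how such an entry could be checked against an unpacked gem member, the helper below compares a file's digest with the recorded value. The name `checksum_matches?` and its argument order are assumptions made for illustration; this is not RubyGems' own verification code.

```ruby
require 'digest'
require 'yaml'

# Hypothetical helper: compare a file's digest with the value recorded in a
# checksums.yaml-style document. Returns true when the two hex strings match.
def checksum_matches?(checksums_yaml, algorithm, entry, path)
  # Look up e.g. checksums["SHA256"]["data.tar.gz"]; fetch raises KeyError if absent.
  expected = YAML.safe_load(checksums_yaml).fetch(algorithm).fetch(entry).to_s

  actual = case algorithm
           when 'SHA256' then Digest::SHA256.file(path).hexdigest
           when 'SHA512' then Digest::SHA512.file(path).hexdigest
           else raise ArgumentError, "unsupported algorithm #{algorithm}"
           end

  expected == actual
end
```

A call such as `checksum_matches?(File.read('checksums.yaml'), 'SHA256', 'data.tar.gz', 'data.tar.gz')` would then report whether the archive matches the recorded digest, assuming both files sit in the current directory.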
data/.ruby-version CHANGED
@@ -1 +1 @@
1
- ruby-2.2.0
1
+ 2.5
data/.travis.yml CHANGED
@@ -1,5 +1,13 @@
1
1
  language: ruby
2
2
  rvm:
3
- - 1.9.3
4
- - 2.0.0
3
+ # Mac High Sierra
4
+ - 2.0.0-p648
5
+ # Mac Mojave
6
+ - 2.3.7
7
+ # Ubuntu 19.10
8
+ - 2.5
9
+ # Mac Catalina
10
+ - 2.6
5
11
  script: "bundle exec rake"
12
+ before_install:
13
+ - sudo apt-get -y install ledger
data/CHANGELOG.md ADDED
@@ -0,0 +1,197 @@
1
+ # Changelog
2
+
3
+ ## [v0.5.0](https://github.com/cantino/reckon/tree/v0.5.0) (2020-02-19)
4
+
5
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.4...v0.5.0)
6
+
7
+ **Closed issues:**
8
+
9
+ - g [\#75](https://github.com/cantino/reckon/issues/75)
10
+ - Learn-from not working [\#74](https://github.com/cantino/reckon/issues/74)
11
+ - Tokens YAML fails to match [\#73](https://github.com/cantino/reckon/issues/73)
12
+ - Missing or stray quote in line error [\#71](https://github.com/cantino/reckon/issues/71)
13
+ - Support ISO 8601 formatting of dates in ledger file [\#70](https://github.com/cantino/reckon/issues/70)
14
+ - Looking for a new maintainer for Reckon [\#68](https://github.com/cantino/reckon/issues/68)
15
+ - Reckon undefined method to\_h when trying to parse csv file [\#66](https://github.com/cantino/reckon/issues/66)
16
+ - Runtime error [\#65](https://github.com/cantino/reckon/issues/65)
17
+ - Reckon doesn't learn from multiple sources [\#63](https://github.com/cantino/reckon/issues/63)
18
+ - problem of importing file [\#59](https://github.com/cantino/reckon/issues/59)
19
+ - Problem with file in which every column is quoted. [\#58](https://github.com/cantino/reckon/issues/58)
20
+ - Error in reckon for the same format csv file [\#57](https://github.com/cantino/reckon/issues/57)
21
+ - Parsing account names does not work if currency symbol is different from $ [\#56](https://github.com/cantino/reckon/issues/56)
22
+ - Problem reading csv file [\#55](https://github.com/cantino/reckon/issues/55)
23
+ - Problem with mint file [\#53](https://github.com/cantino/reckon/issues/53)
24
+ - --money-column [\#43](https://github.com/cantino/reckon/issues/43)
25
+
26
+ **Merged pull requests:**
27
+
28
+ - Fix bugs in ledger file parsing. Fixes \#56. [\#81](https://github.com/cantino/reckon/pull/81) ([benprew](https://github.com/benprew))
29
+ - Better file encoding suggestions [\#80](https://github.com/cantino/reckon/pull/80) ([benprew](https://github.com/benprew))
30
+ - :bug: fix matching algorithm, add logging and a spec helper. Fixes \#73 [\#79](https://github.com/cantino/reckon/pull/79) ([benprew](https://github.com/benprew))
31
+ - bug: invalid header lines should be ignored, not parsed. [\#78](https://github.com/cantino/reckon/pull/78) ([benprew](https://github.com/benprew))
32
+ - convert default date format to iso8601 [\#77](https://github.com/cantino/reckon/pull/77) ([benprew](https://github.com/benprew))
33
+ - Fix rspec failure for ruby 2.3 and 2.4 [\#69](https://github.com/cantino/reckon/pull/69) ([BlackEdder](https://github.com/BlackEdder))
34
+ - Allow setting of money and date columns by index [\#67](https://github.com/cantino/reckon/pull/67) ([cantino](https://github.com/cantino))
35
+
36
+ ## [v0.4.4](https://github.com/cantino/reckon/tree/v0.4.4) (2015-12-02)
37
+
38
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.3...v0.4.4)
39
+
40
+ **Merged pull requests:**
41
+
42
+ - Regexp support in the tokens file [\#54](https://github.com/cantino/reckon/pull/54) ([vzctl](https://github.com/vzctl))
43
+
44
+ ## [v0.4.3](https://github.com/cantino/reckon/tree/v0.4.3) (2015-08-16)
45
+
46
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.2...v0.4.3)
47
+
48
+ ## [v0.4.2](https://github.com/cantino/reckon/tree/v0.4.2) (2015-08-08)
49
+
50
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.1...v0.4.2)
51
+
52
+ **Merged pull requests:**
53
+
54
+ - Ignore empty description columns [\#52](https://github.com/cantino/reckon/pull/52) ([vzctl](https://github.com/vzctl))
55
+
56
+ ## [v0.4.1](https://github.com/cantino/reckon/tree/v0.4.1) (2015-07-08)
57
+
58
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.4.0...v0.4.1)
59
+
60
+ **Closed issues:**
61
+
62
+ - Unattended [\#50](https://github.com/cantino/reckon/issues/50)
63
+ - Debit/Credit Columns from SunTrust [\#42](https://github.com/cantino/reckon/issues/42)
64
+
65
+ **Merged pull requests:**
66
+
67
+ - \[RFC\] Fix \#42: Work with suntrust double column csv files [\#48](https://github.com/cantino/reckon/pull/48) ([BlackEdder](https://github.com/BlackEdder))
68
+
69
+ ## [v0.4.0](https://github.com/cantino/reckon/tree/v0.4.0) (2015-06-05)
70
+
71
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.10...v0.4.0)
72
+
73
+ **Implemented enhancements:**
74
+
75
+ - Tab completion for transactions [\#40](https://github.com/cantino/reckon/issues/40)
76
+ - feature: "unattended" mode [\#3](https://github.com/cantino/reckon/issues/3)
77
+
78
+ **Closed issues:**
79
+
80
+ - Missing or stray quote error [\#38](https://github.com/cantino/reckon/issues/38)
81
+
82
+ **Merged pull requests:**
83
+
84
+ - Better ISO 8601 dates support [\#49](https://github.com/cantino/reckon/pull/49) ([vzctl](https://github.com/vzctl))
85
+ - Unattended mode and custom tokens support [\#47](https://github.com/cantino/reckon/pull/47) ([vzctl](https://github.com/vzctl))
86
+ - \[RFC\] Implement issue \#40: Tab completion [\#46](https://github.com/cantino/reckon/pull/46) ([BlackEdder](https://github.com/BlackEdder))
87
+ - set readline to allow for backspace in ask dialog [\#44](https://github.com/cantino/reckon/pull/44) ([mrtazz](https://github.com/mrtazz))
88
+
89
+ ## [v0.3.10](https://github.com/cantino/reckon/tree/v0.3.10) (2014-08-16)
90
+
91
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.9...v0.3.10)
92
+
93
+ **Merged pull requests:**
94
+
95
+ - Fix --encoding option [\#41](https://github.com/cantino/reckon/pull/41) ([mamciek](https://github.com/mamciek))
96
+ - Bumped version number [\#37](https://github.com/cantino/reckon/pull/37) ([BlackEdder](https://github.com/BlackEdder))
97
+
98
+ ## [v0.3.9](https://github.com/cantino/reckon/tree/v0.3.9) (2014-02-20)
99
+
100
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.8...v0.3.9)
101
+
102
+ **Closed issues:**
103
+
104
+ - Idea/discussion: csv parser [\#25](https://github.com/cantino/reckon/issues/25)
105
+ - Silently misinterprets UK dates [\#18](https://github.com/cantino/reckon/issues/18)
106
+
107
+ **Merged pull requests:**
108
+
109
+ - Added spec for csv files from Broker Canada [\#36](https://github.com/cantino/reckon/pull/36) ([BlackEdder](https://github.com/BlackEdder))
110
+ - Date format [\#35](https://github.com/cantino/reckon/pull/35) ([BlackEdder](https://github.com/BlackEdder))
111
+ - Added example from a french bank [\#34](https://github.com/cantino/reckon/pull/34) ([BlackEdder](https://github.com/BlackEdder))
112
+ - Austrian example [\#33](https://github.com/cantino/reckon/pull/33) ([BlackEdder](https://github.com/BlackEdder))
113
+ - Ing csv [\#30](https://github.com/cantino/reckon/pull/30) ([BlackEdder](https://github.com/BlackEdder))
114
+ - Further improvements in nationwide csv handling [\#29](https://github.com/cantino/reckon/pull/29) ([BlackEdder](https://github.com/BlackEdder))
115
+ - Refactor: Add money class [\#28](https://github.com/cantino/reckon/pull/28) ([BlackEdder](https://github.com/BlackEdder))
116
+ - Initial split of CSVparser from class App [\#27](https://github.com/cantino/reckon/pull/27) ([BlackEdder](https://github.com/BlackEdder))
117
+ - Updated version of pull request 24: Allow for other currency symbols while calculating money\_score [\#26](https://github.com/cantino/reckon/pull/26) ([BlackEdder](https://github.com/BlackEdder))
118
+ - Change double column detection [\#23](https://github.com/cantino/reckon/pull/23) ([BlackEdder](https://github.com/BlackEdder))
119
+ - Added optional argument to contains\_header to skip multiple header lines [\#22](https://github.com/cantino/reckon/pull/22) ([BlackEdder](https://github.com/BlackEdder))
120
+ - Add a Bitdeli Badge to README [\#20](https://github.com/cantino/reckon/pull/20) ([bitdeli-chef](https://github.com/bitdeli-chef))
121
+ - Update README to show latest usage info [\#19](https://github.com/cantino/reckon/pull/19) ([purcell](https://github.com/purcell))
122
+
123
+ ## [v0.3.8](https://github.com/cantino/reckon/tree/v0.3.8) (2013-07-03)
124
+
125
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.7...v0.3.8)
126
+
127
+ **Implemented enhancements:**
128
+
129
+ - Support other currencies [\#7](https://github.com/cantino/reckon/issues/7)
130
+
131
+ **Closed issues:**
132
+
133
+ - Add support for dates in spanish dd/mm/yyyy [\#13](https://github.com/cantino/reckon/issues/13)
134
+ - Problems with my csv file [\#8](https://github.com/cantino/reckon/issues/8)
135
+
136
+ **Merged pull requests:**
137
+
138
+ - add support for spanish dates dd/mm/yyyy closes \#13 [\#14](https://github.com/cantino/reckon/pull/14) ([mauromorales](https://github.com/mauromorales))
139
+ - fix issue showing true when parsing the currency option related to \#7 [\#12](https://github.com/cantino/reckon/pull/12) ([mauromorales](https://github.com/mauromorales))
140
+
141
+ ## [v0.3.7](https://github.com/cantino/reckon/tree/v0.3.7) (2013-06-27)
142
+
143
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.6...v0.3.7)
144
+
145
+ **Merged pull requests:**
146
+
147
+ - Updated the sources to allow for custom curreny [\#11](https://github.com/cantino/reckon/pull/11) ([ghost](https://github.com/ghost))
148
+ - Add --account option on the commandline [\#10](https://github.com/cantino/reckon/pull/10) ([copiousfreetime](https://github.com/copiousfreetime))
149
+
150
+ ## [v0.3.6](https://github.com/cantino/reckon/tree/v0.3.6) (2013-04-30)
151
+
152
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.5...v0.3.6)
153
+
154
+ **Closed issues:**
155
+
156
+ - iso-8859-1 CSV with accented chars =\> invalid byte sequence in UTF-8 \(ArgumentError\) [\#5](https://github.com/cantino/reckon/issues/5)
157
+ - Ruby 2.0 compatibility [\#4](https://github.com/cantino/reckon/issues/4)
158
+
159
+ **Merged pull requests:**
160
+
161
+ - Recognize yyyymmdd date in Reckon::App\#date\_for. [\#9](https://github.com/cantino/reckon/pull/9) ([mhoogendoorn](https://github.com/mhoogendoorn))
162
+
163
+ ## [v0.3.5](https://github.com/cantino/reckon/tree/v0.3.5) (2013-03-24)
164
+
165
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.4...v0.3.5)
166
+
167
+ **Closed issues:**
168
+
169
+ - backtrace trying to run reckon -f [\#2](https://github.com/cantino/reckon/issues/2)
170
+
171
+ **Merged pull requests:**
172
+
173
+ - Inverse mode [\#6](https://github.com/cantino/reckon/pull/6) ([nathankot](https://github.com/nathankot))
174
+
175
+ ## [v0.3.4](https://github.com/cantino/reckon/tree/v0.3.4) (2013-02-16)
176
+
177
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.3...v0.3.4)
178
+
179
+ **Merged pull requests:**
180
+
181
+ - adds support for Nordea csv files [\#1](https://github.com/cantino/reckon/pull/1) ([x2q](https://github.com/x2q))
182
+
183
+ ## [v0.3.3](https://github.com/cantino/reckon/tree/v0.3.3) (2013-01-13)
184
+
185
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.1...v0.3.3)
186
+
187
+ ## [v0.3.1](https://github.com/cantino/reckon/tree/v0.3.1) (2012-07-30)
188
+
189
+ [Full Changelog](https://github.com/cantino/reckon/compare/v0.3.2...v0.3.1)
190
+
191
+ ## [v0.3.2](https://github.com/cantino/reckon/tree/v0.3.2) (2012-07-30)
192
+
193
+ [Full Changelog](https://github.com/cantino/reckon/compare/5c07bea3fe63f9b909b4b76bd49f22fd8faf7a29...v0.3.2)
194
+
195
+
196
+
197
+ \* *This Changelog was automatically generated by [github_changelog_generator](https://github.com/github-changelog-generator/github-changelog-generator)*
data/Gemfile CHANGED
@@ -1,5 +1,4 @@
1
1
  source "http://rubygems.org"
2
-
3
2
  gemspec
4
3
 
5
4
  gem 'rake'
data/Gemfile.lock CHANGED
@@ -1,34 +1,52 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- reckon (0.4.4)
4
+ reckon (0.5.0)
5
5
  chronic (>= 0.3.0)
6
- fastercsv (>= 1.5.1)
7
6
  highline (>= 1.5.2)
7
+ rchardet (>= 1.8.0)
8
8
  terminal-table (>= 1.4.2)
9
9
 
10
10
  GEM
11
11
  remote: http://rubygems.org/
12
12
  specs:
13
13
  chronic (0.10.2)
14
- diff-lcs (1.1.3)
15
- fastercsv (1.5.5)
16
- highline (1.6.21)
17
- rake (10.0.4)
18
- rspec (2.11.0)
19
- rspec-core (~> 2.11.0)
20
- rspec-expectations (~> 2.11.0)
21
- rspec-mocks (~> 2.11.0)
22
- rspec-core (2.11.1)
23
- rspec-expectations (2.11.2)
24
- diff-lcs (~> 1.1.3)
25
- rspec-mocks (2.11.1)
26
- terminal-table (1.4.5)
14
+ coderay (1.1.2)
15
+ diff-lcs (1.3)
16
+ highline (2.0.3)
17
+ method_source (0.9.2)
18
+ pry (0.12.2)
19
+ coderay (~> 1.1.0)
20
+ method_source (~> 0.9.0)
21
+ rake (12.3.3)
22
+ rantly (1.2.0)
23
+ rchardet (1.8.0)
24
+ rspec (3.9.0)
25
+ rspec-core (~> 3.9.0)
26
+ rspec-expectations (~> 3.9.0)
27
+ rspec-mocks (~> 3.9.0)
28
+ rspec-core (3.9.1)
29
+ rspec-support (~> 3.9.1)
30
+ rspec-expectations (3.9.0)
31
+ diff-lcs (>= 1.2.0, < 2.0)
32
+ rspec-support (~> 3.9.0)
33
+ rspec-mocks (3.9.1)
34
+ diff-lcs (>= 1.2.0, < 2.0)
35
+ rspec-support (~> 3.9.0)
36
+ rspec-support (3.9.2)
37
+ terminal-table (1.8.0)
38
+ unicode-display_width (~> 1.1, >= 1.1.1)
39
+ unicode-display_width (1.6.1)
27
40
 
28
41
  PLATFORMS
29
42
  ruby
30
43
 
31
44
  DEPENDENCIES
45
+ pry (>= 0.12.2)
32
46
  rake
47
+ rantly (= 1.2.0)
33
48
  reckon!
34
49
  rspec (>= 1.2.9)
50
+
51
+ BUNDLED WITH
52
+ 1.17.3
data/README.md CHANGED
@@ -1,8 +1,8 @@
1
1
  # Reckon
2
2
 
3
- [![Build Status](https://travis-ci.org/cantino/reckon.png)](https://travis-ci.org/cantino/reckon)
3
+ [![Build Status](https://travis-ci.org/cantino/reckon.png?branch=master)](https://travis-ci.org/cantino/reckon)
4
4
 
5
- Reckon automagically converts CSV files for use with the command-line accounting tool [Ledger](https://github.com/jwiegley/ledger/wiki). It also helps you to select the correct accounts associated with the CSV data using Bayesian machine learning.
5
+ Reckon automagically converts CSV files for use with the command-line accounting tool [Ledger](http://www.ledger-cli.org/). It also helps you to select the correct accounts associated with the CSV data using Bayesian machine learning.
6
6
 
7
7
  ## Installation
8
8
 
@@ -114,6 +114,3 @@ You can override them with `--default_outof_account` and `--default_into_account
114
114
  Copyright (c) 2013 Andrew Cantino. See LICENSE for details.
115
115
 
116
116
  Thanks to @BlackEdder for many contributions!
117
-
118
- [![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/cantino/reckon/trend.png)](https://bitdeli.com/free "Bitdeli Badge")
119
-
data/lib/reckon.rb CHANGED
@@ -1,19 +1,21 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
3
  require 'rubygems'
4
- if RUBY_VERSION =~ /^1\.9/ || RUBY_VERSION =~ /^2/
5
- require 'csv'
6
- else
7
- require 'fastercsv'
8
- end
4
+ require 'rchardet'
5
+ require 'chronic'
6
+ require 'csv'
9
7
  require 'highline/import'
10
8
  require 'optparse'
11
- require 'chronic'
12
- require 'time'
13
9
  require 'terminal-table'
10
+ require 'time'
11
+ require 'logger'
12
+
13
+ LOGGER = Logger.new(STDOUT)
14
+ LOGGER.level = Logger::ERROR
14
15
 
16
+ require_relative 'reckon/version'
17
+ require_relative 'reckon/cosine_similarity'
15
18
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "app"))
16
19
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "ledger_parser"))
17
20
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "csv_parser"))
18
21
  require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "money"))
19
-
data/lib/reckon/app.rb CHANGED
@@ -1,21 +1,20 @@
1
- #coding: utf-8
1
+ # coding: utf-8
2
2
  require 'pp'
3
3
  require 'yaml'
4
4
 
5
5
  module Reckon
6
6
  class App
7
- VERSION = "Reckon 0.4.4"
8
- attr_accessor :options, :accounts, :tokens, :seen, :csv_parser, :regexps
7
+ attr_accessor :options, :seen, :csv_parser, :regexps, :matcher
9
8
 
10
9
  def initialize(options = {})
10
+ LOGGER.level = Logger::INFO if options[:verbose]
11
11
  self.options = options
12
- self.tokens = {}
13
12
  self.regexps = {}
14
- self.accounts = {}
15
13
  self.seen = {}
16
14
  self.options[:currency] ||= '$'
17
15
  options[:string] = File.read(options[:file]) unless options[:string]
18
16
  @csv_parser = CSVParser.new( options )
17
+ @matcher = CosineSimilarity.new(options)
19
18
  learn!
20
19
  end
21
20
 
@@ -24,21 +23,44 @@ module Reckon
24
23
  puts str
25
24
  end
26
25
 
26
+ def learn!
27
+ learn_from_account_tokens(options[:account_tokens_file])
28
+
29
+ ledger_file = options[:existing_ledger_file]
30
+ return unless ledger_file
31
+ fail "#{ledger_file} doesn't exist!" unless File.exists?(ledger_file)
32
+ learn_from(File.read(ledger_file))
33
+ end
34
+
35
+ def learn_from_account_tokens(filename)
36
+ return unless filename
37
+
38
+ fail "#{filename} doesn't exist!" unless File.exists?(filename)
39
+
40
+ extract_account_tokens(YAML.load_file(filename)).each do |account, tokens|
41
+ tokens.each do |t|
42
+ if t.start_with?('/')
43
+ add_regexp(account, t)
44
+ else
45
+ @matcher.add_document(account, t)
46
+ end
47
+ end
48
+ end
49
+ end
50
+
27
51
  def learn_from(ledger)
28
52
  LedgerParser.new(ledger).entries.each do |entry|
29
53
  entry[:accounts].each do |account|
30
- learn_about_account( account[:name],
31
- [entry[:desc], account[:amount]].join(" ") ) unless account[:name] == options[:bank_account]
32
- seen[entry[:date]] ||= {}
33
- seen[entry[:date]][@csv_parser.pretty_money(account[:amount])] = true
54
+ str = [entry[:desc], account[:amount]].join(" ")
55
+ @matcher.add_document(account[:name], str) unless account[:name] == options[:bank_account]
56
+ pretty_date = entry[:date].iso8601
57
+ seen[pretty_date] ||= {}
58
+ seen[pretty_date][@csv_parser.pretty_money(account[:amount])] = true
34
59
  end
35
60
  end
36
61
  end
37
62
 
38
- def already_seen?(row)
39
- seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]]
40
- end
41
-
63
+ # Add tokens from account_tokens_file to accounts
42
64
  def extract_account_tokens(subtree, account = nil)
43
65
  if subtree.nil?
44
66
  puts "Warning: empty #{account} tree"
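In the hunk above, `learn_from` now keys the `seen` hash by `entry[:date].iso8601` rather than by the date object itself. The sketch below illustrates why that matters for the `already_seen?` lookup. It assumes that `row[:pretty_date]` is an ISO 8601 string (in line with PR #77) and uses free-standing helper functions that are hypothetical, not the methods of `Reckon::App`.

```ruby
require 'date'

# Hypothetical helper: record a (date, amount) pair from an existing ledger
# entry, keyed by the ISO 8601 date string.
def mark_seen(seen, date, pretty_money)
  key = date.iso8601            # e.g. "2020-02-19"
  seen[key] ||= {}
  seen[key][pretty_money] = true
end

# Hypothetical helper: has a CSV row with this date and amount been recorded?
# The row's :pretty_date is assumed to be an ISO 8601 string as well.
def already_seen?(seen, row)
  by_date = seen[row[:pretty_date]]
  !!(by_date && by_date[row[:pretty_money]])
end
```

Because both sides use the same string format, a CSV row and a ledger entry for the same day and amount land on the same key. If the hash were keyed by a `Date` object, a string lookup would miss it.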
@@ -46,50 +68,26 @@ module Reckon
46
68
  elsif subtree.is_a?(Array)
47
69
  { account => subtree }
48
70
  else
49
- at = subtree.map { |k, v| extract_account_tokens(v, [account, k].compact.join(':')) }
50
- at.inject({}) { |k, v| k = k.merge(v)}
51
- end
52
- end
53
-
54
- def learn!
55
- if options[:account_tokens_file]
56
- fail "#{options[:account_tokens_file]} doesn't exist!" unless File.exists?(options[:account_tokens_file])
57
- extract_account_tokens(YAML.load_file(options[:account_tokens_file])).each do |account, tokens|
58
- tokens.each { |t| learn_about_account(account, t, true) }
71
+ at = subtree.map do |k, v|
72
+ merged_acct = [account, k].compact.join(':')
73
+ extract_account_tokens(v, merged_acct)
59
74
  end
75
+ at.inject({}) { |memo, e| memo.merge!(e)}
60
76
  end
61
- return unless options[:existing_ledger_file]
62
- fail "#{options[:existing_ledger_file]} doesn't exist!" unless File.exists?(options[:existing_ledger_file])
63
- ledger_data = File.read(options[:existing_ledger_file])
64
- learn_from(ledger_data)
65
77
  end
66
78
 
67
- def learn_about_account(account, data, parse_regexps = false)
68
- accounts[account] ||= 0
69
- if parse_regexps && data.start_with?('/')
70
- # https://github.com/tenderlove/psych/blob/master/lib/psych/visitors/to_ruby.rb
71
- match = data.match(/^\/(.*)\/([ix]*)$/m)
72
- fail "failed to parse regexp #{data}" unless match
73
- options = 0
74
- (match[2] || '').split('').each do |option|
75
- case option
76
- when 'x' then options |= Regexp::EXTENDED
77
- when 'i' then options |= Regexp::IGNORECASE
78
- end
79
- end
80
- regexps[Regexp.new(match[1], options)] = account
81
- else
82
- tokenize(data).each do |token|
83
- tokens[token] ||= {}
84
- tokens[token][account] ||= 0
85
- tokens[token][account] += 1
86
- accounts[account] += 1
79
+ def add_regexp(account, regex_str)
80
+ # https://github.com/tenderlove/psych/blob/master/lib/psych/visitors/to_ruby.rb
81
+ match = regex_str.match(/^\/(.*)\/([ix]*)$/m)
82
+ fail "failed to parse regexp #{regex_str}" unless match
83
+ options = 0
84
+ (match[2] || '').split('').each do |option|
85
+ case option
86
+ when 'x' then options |= Regexp::EXTENDED
87
+ when 'i' then options |= Regexp::IGNORECASE
87
88
  end
88
89
  end
89
- end
90
-
91
- def tokenize(str)
92
- str.downcase.split(/[\s\-]/)
90
+ regexps[Regexp.new(match[1], options)] = account
93
91
  end
94
92
 
95
93
  def walk_backwards
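The hunk above does two things: `extract_account_tokens` flattens a nested account tree into `"Parent:Child"` keys, and `add_regexp` turns `/pattern/flags` strings into `Regexp` objects. A compact standalone sketch of both steps follows. The function names are hypothetical, and the pattern is anchored with `\A` and `\z` here instead of the `^` and `$` used in the diff.

```ruby
# Hypothetical helper: flatten a nested Hash of accounts into
# { "Parent:Child" => [tokens] }. Arrays are leaves; nil subtrees are skipped.
def flatten_account_tokens(subtree, account = nil)
  case subtree
  when nil   then {}
  when Array then { account => subtree }
  else
    subtree.each_with_object({}) do |(name, child), memo|
      memo.merge!(flatten_account_tokens(child, [account, name].compact.join(':')))
    end
  end
end

# Hypothetical helper: turn a "/pattern/flags" string into a Regexp.
# Only the i (ignore case) and x (extended) flags are recognised.
def parse_regexp_token(token)
  match = token.match(/\A\/(.*)\/([ix]*)\z/m)
  raise ArgumentError, "failed to parse regexp #{token}" unless match

  flags = 0
  flags |= Regexp::IGNORECASE if match[2].include?('i')
  flags |= Regexp::EXTENDED   if match[2].include?('x')
  Regexp.new(match[1], flags)
end
```

Tokens that start with `/` would go through `parse_regexp_token`, while plain strings would be handed to the similarity matcher, mirroring the branch in `learn_from_account_tokens`.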
@@ -107,8 +105,7 @@ module Reckon
107
105
  seen_anything_new = true
108
106
  end
109
107
 
110
- possible_answers = most_specific_regexp_match(row)
111
- possible_answers = weighted_account_match( row ).map! { |a| a[:account] } if possible_answers.empty?
108
+ possible_answers = suggest(row)
112
109
 
113
110
  ledger = if row[:money] > 0
114
111
  if options[:unattended]
@@ -156,15 +153,19 @@ module Reckon
156
153
  end
157
154
  end
158
155
 
159
- def finish
160
- options[:output_file].close unless options[:output_file] == STDOUT
161
- interactive_output "Exiting."
162
- exit
163
- end
164
-
165
- def output(ledger_line)
166
- options[:output_file].puts ledger_line
167
- options[:output_file].flush
156
+ def each_row_backwards
157
+ rows = []
158
+ (0...@csv_parser.columns.first.length).to_a.each do |index|
159
+ rows << { :date => @csv_parser.date_for(index),
160
+ :pretty_date => @csv_parser.pretty_date_for(index),
161
+ :pretty_money => @csv_parser.pretty_money_for(index),
162
+ :pretty_money_negated => @csv_parser.pretty_money_for(index, :negate),
163
+ :money => @csv_parser.money_for(index),
164
+ :description => @csv_parser.description_for(index) }
165
+ end
166
+ rows.sort { |a, b| a[:date] <=> b[:date] }.each do |row|
167
+ yield row
168
+ end
168
169
  end
169
170
 
170
171
  def most_specific_regexp_match( row )
@@ -176,41 +177,9 @@ module Reckon
176
177
  matches.sort_by! { |account, matched_text| matched_text.length }.map(&:first)
177
178
  end
178
179
 
179
- # Weigh accounts by how well they match the row
180
- def weighted_account_match( row )
181
- query_tokens = tokenize(row[:description])
182
-
183
- search_vector = []
184
- account_vectors = {}
185
-
186
- query_tokens.each do |token|
187
- idf = Math.log((accounts.keys.length + 1) / ((tokens[token] || {}).keys.length.to_f + 1))
188
- tf = 1.0 / query_tokens.length.to_f
189
- search_vector << tf*idf
190
-
191
- accounts.each do |account, total_terms|
192
- tf = (tokens[token] && tokens[token][account]) ? tokens[token][account] / total_terms.to_f : 0
193
- account_vectors[account] ||= []
194
- account_vectors[account] << tf*idf
195
- end
196
- end
197
-
198
- # Should I normalize the vectors? Probably unnecessary due to tf-idf and short documents.
199
-
200
- account_vectors = account_vectors.to_a.map do |account, account_vector|
201
- { :cosine => (0...account_vector.length).to_a.inject(0) { |m, i| m + search_vector[i] * account_vector[i] },
202
- :account => account }
203
- end
204
- account_vectors.sort! {|a, b| b[:cosine] <=> a[:cosine] }
205
-
206
- # Return empty set if no accounts matched so that we can fallback to the defaults in the unattended mode
207
- if options[:unattended]
208
- if account_vectors.first && account_vectors.first[:account]
209
- account_vectors = [] if account_vectors.first[:cosine] == 0
210
- end
211
- end
212
-
213
- return account_vectors
180
+ def suggest(row)
181
+ most_specific_regexp_match(row) +
182
+ @matcher.find_similar(row[:description]).map { |n| n[:account] }
214
183
  end
215
184
 
216
185
  def ledger_format(row, line1, line2)
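The hunk above removes the inline tf-idf code and delegates to `@matcher.find_similar`, whose implementation lives in `cosine_similarity.rb` and is not shown in this part of the diff. To make the idea concrete, here is a minimal sketch of cosine similarity over plain term counts. It is an illustration under stated assumptions, not the gem's `CosineSimilarity` class: the function names, the `documents` Hash shape, and the absence of idf weighting are all choices made for this example. The tokenizer follows the removed `tokenize` method.

```ruby
# Lower-case and split on whitespace and hyphens, as the removed tokenize did.
def tokenize(str)
  str.downcase.split(/[\s\-]+/).reject(&:empty?)
end

# Term-frequency vector represented as a Hash of token => count.
def term_vector(str)
  tokenize(str).each_with_object(Hash.new(0)) { |token, vec| vec[token] += 1 }
end

# Cosine of the angle between two sparse vectors; 0.0 when either is empty.
def cosine(a, b)
  dot  = a.sum { |token, count| count * b.fetch(token, 0) }
  norm = Math.sqrt(a.values.sum { |c| c * c }) * Math.sqrt(b.values.sum { |c| c * c })
  norm.zero? ? 0.0 : dot / norm
end

# Hypothetical stand-in for find_similar: rank accounts by similarity to the
# query and drop accounts that share no tokens with it.
def find_similar(documents, query)
  q = term_vector(query)
  documents
    .map { |account, text| { account: account, similarity: cosine(q, term_vector(text)) } }
    .select { |m| m[:similarity] > 0 }
    .sort_by { |m| -m[:similarity] }
end
```

Returning only non-zero matches means a caller like `suggest` gets an empty list when nothing matches, so the unattended mode can fall back to the default accounts.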
@@ -220,6 +189,21 @@ module Reckon
220
189
  out
221
190
  end
222
191
 
192
+ def output(ledger_line)
193
+ options[:output_file].puts ledger_line
194
+ options[:output_file].flush
195
+ end
196
+
197
+ def already_seen?(row)
198
+ seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]]
199
+ end
200
+
201
+ def finish
202
+ options[:output_file].close unless options[:output_file] == STDOUT
203
+ interactive_output "Exiting."
204
+ exit
205
+ end
206
+
223
207
  def output_table
224
208
  output = Terminal::Table.new do |t|
225
209
  t.headings = 'Date', 'Amount', 'Description'
@@ -230,21 +214,6 @@ module Reckon
230
214
  interactive_output output
231
215
  end
232
216
 
233
- def each_row_backwards
234
- rows = []
235
- (0...@csv_parser.columns.first.length).to_a.each do |index|
236
- rows << { :date => @csv_parser.date_for(index),
237
- :pretty_date => @csv_parser.pretty_date_for(index),
238
- :pretty_money => @csv_parser.pretty_money_for(index),
239
- :pretty_money_negated => @csv_parser.pretty_money_for(index, :negate),
240
- :money => @csv_parser.money_for(index),
241
- :description => @csv_parser.description_for(index) }
242
- end
243
- rows.sort { |a, b| a[:date] <=> b[:date] }.each do |row|
244
- yield row
245
- end
246
- end
247
-
248
217
  def self.parse_opts(args = ARGV)
249
218
  options = { :output_file => STDOUT }
250
219
  parser = OptionParser.new do |opts|
@@ -255,7 +224,7 @@ module Reckon
255
224
  options[:file] = file
256
225
  end
257
226
 
258
- opts.on("-a", "--account name", "The Ledger Account this file is for") do |a|
227
+ opts.on("-a", "--account NAME", "The Ledger Account this file is for") do |a|
259
228
  options[:bank_account] = a
260
229
  end
261
230
 
@@ -283,6 +252,14 @@ module Reckon
283
252
  options[:ignore_columns] = ignore.split(",").map { |i| i.to_i }
284
253
  end
285
254
 
255
+ opts.on("", "--money-column 2", Integer, "Specify the money column instead of letting Reckon guess - the first column is column 1") do |column_number|
256
+ options[:money_column] = column_number
257
+ end
258
+
259
+ opts.on("", "--date-column 3", Integer, "Specify the date column instead of letting Reckon guess - the first column is column 1") do |column_number|
260
+ options[:date_column] = column_number
261
+ end
262
+
286
263
  opts.on("", "--contains-header [N]", "The first row of the CSV is a header and should be skipped. Optionally add the number of rows to skip.") do |contains_header|
287
264
  options[:contains_header] = 1
288
265
  options[:contains_header] = contains_header.to_i if contains_header
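The two options added above pass `Integer` to `opts.on`, so OptionParser converts the argument and rejects non-numeric input before the block runs. A small standalone sketch of that behaviour is shown below. The wrapper `parse_column_options` is a hypothetical helper, and the placeholder `N` replaces the literal `2` and `3` used in the diff.

```ruby
require 'optparse'

# Hypothetical helper: parse only the two column options and return them as a Hash.
def parse_column_options(args)
  options = {}
  parser = OptionParser.new do |opts|
    # Integer makes OptionParser coerce the value and raise on invalid input.
    opts.on('--money-column N', Integer, 'Money column (first column is 1)') do |n|
      options[:money_column] = n
    end
    opts.on('--date-column N', Integer, 'Date column (first column is 1)') do |n|
      options[:date_column] = n
    end
  end
  parser.parse(args) # non-destructive: args itself is left untouched
  options
end
```

With this sketch, `parse_column_options(%w[--money-column 2 --date-column 3])` yields integer values, while a value such as `abc` raises `OptionParser::InvalidArgument`.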
@@ -316,11 +293,11 @@ module Reckon
316
293
  options[:account_tokens_file] = a
317
294
  end
318
295
 
319
- opts.on("", "--default-into-account name", "Default into account") do |a|
296
+ opts.on("", "--default-into-account NAME", "Default into account") do |a|
320
297
  options[:default_into_account] = a
321
298
  end
322
299
 
323
- opts.on("", "--default-outof-account name", "Default 'out of' account") do |a|
300
+ opts.on("", "--default-outof-account NAME", "Default 'out of' account") do |a|
324
301
  options[:default_outof_account] = a
325
302
  end
326
303
 
@@ -351,7 +328,6 @@ module Reckon
351
328
  end
352
329
 
353
330
  unless options[:bank_account]
354
-
355
331
  fail "Please specify --account for the unattended mode" if options[:unattended]
356
332
 
357
333
  options[:bank_account] = ask("What is the account name of this bank account in Ledger? ") do |q|