reckon 0.4.4 → 0.5.0
- checksums.yaml +5 -5
- data/.ruby-version +1 -1
- data/.travis.yml +10 -2
- data/CHANGELOG.md +197 -0
- data/Gemfile +0 -1
- data/Gemfile.lock +33 -15
- data/README.md +2 -5
- data/lib/reckon.rb +10 -8
- data/lib/reckon/app.rb +92 -116
- data/lib/reckon/cosine_similarity.rb +119 -0
- data/lib/reckon/csv_parser.rb +57 -27
- data/lib/reckon/ledger_parser.rb +194 -30
- data/lib/reckon/money.rb +3 -4
- data/reckon.gemspec +6 -5
- data/spec/data_fixtures/73-sample.csv +2 -0
- data/spec/data_fixtures/73-tokens.yml +8 -0
- data/spec/data_fixtures/73-transactions.ledger +7 -0
- data/spec/data_fixtures/austrian_example.csv +13 -0
- data/spec/data_fixtures/bom_utf8_file.csv +1 -0
- data/spec/data_fixtures/broker_canada_example.csv +12 -0
- data/spec/data_fixtures/chase.csv +9 -0
- data/spec/data_fixtures/danish_kroner_nordea_example.csv +6 -0
- data/spec/data_fixtures/english_date_example.csv +3 -0
- data/spec/data_fixtures/french_example.csv +9 -0
- data/spec/data_fixtures/german_date_example.csv +3 -0
- data/spec/data_fixtures/harder_date_example.csv +5 -0
- data/spec/data_fixtures/ing.csv +3 -0
- data/spec/data_fixtures/intuit_mint_example.csv +7 -0
- data/spec/data_fixtures/invalid_header_example.csv +6 -0
- data/spec/data_fixtures/inversed_credit_card.csv +16 -0
- data/spec/data_fixtures/nationwide.csv +4 -0
- data/spec/data_fixtures/simple.csv +2 -0
- data/spec/data_fixtures/some_other.csv +9 -0
- data/spec/data_fixtures/spanish_date_example.csv +3 -0
- data/spec/data_fixtures/suntrust.csv +7 -0
- data/spec/data_fixtures/two_money_columns.csv +5 -0
- data/spec/data_fixtures/yyyymmdd_date_example.csv +1 -0
- data/spec/reckon/app_spec.rb +66 -34
- data/spec/reckon/csv_parser_spec.rb +79 -201
- data/spec/reckon/ledger_parser_spec.rb +62 -9
- data/spec/spec_helper.rb +3 -0
- metadata +62 -19
- data/CHANGES.md +0 -9
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
|
-
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
2
|
+
SHA256:
|
3
|
+
metadata.gz: b58a745c730c0dfaf022a98c19dfaf8568c0b3091548d70c9202f8f7c0d78ada
|
4
|
+
data.tar.gz: 030d2036524a19c5eff981608133e8aac15f185b57a891e91a6f1fdabd3796c7
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 1daf76a19078f1707e847a7ac54aad5836ee3d49ce9c2f612bfefeb86f8c1bf9c9d1f97b7bbecb309f9f44f8097f8f2dbe0d6b6062bc376a62fc88f45bab1dfc
|
7
|
+
data.tar.gz: 4784ac871dbdb6160dc53ef8a7f5d69f31ccc7f7f7ea876b97bf60941abb2d13a968f5229747dd74d0b574ecc12326485e1eaba7ea368a91162d36d3ef4f9f86
|
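The checksums.yaml change above records SHA256 and SHA512 digests for the two archives inside the gem. As a rough illustration of how such digests could be computed and compared, here is a minimal sketch using Ruby's standard `digest` library. The helper names `archive_digests` and `digests_match?` are hypothetical, and the sample bytes are a placeholder rather than the real `metadata.gz` or `data.tar.gz`.

```ruby
require 'digest'

# Compute the two digests that a checksums.yaml-style listing records.
# `bytes` stands in for the contents of metadata.gz or data.tar.gz.
def archive_digests(bytes)
  {
    'SHA256' => Digest::SHA256.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Compare freshly computed digests against previously recorded ones.
def digests_match?(bytes, recorded)
  archive_digests(bytes) == recorded
end

sample = 'example archive bytes' # placeholder data, not the real gem
recorded = archive_digests(sample)

puts digests_match?(sample, recorded)       # same bytes
puts digests_match?(sample + 'x', recorded) # tampered bytes
```

A SHA256 hex digest is 64 characters long and a SHA512 hex digest is 128, which matches the lengths of the values listed in the diff.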
data/.ruby-version
CHANGED
@@ -1 +1 @@
|
|
1
|
-
|
1
|
+
2.5
|
data/.travis.yml
CHANGED
data/CHANGELOG.md
ADDED
@@ -0,0 +1,197 @@
|
|
1
|
+
# Changelog
|
2
|
+
|
3
|
+
## [v0.5.0](https://github.com/cantino/reckon/tree/v0.5.0) (2020-02-19)
|
4
|
+
|
5
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.4.4...v0.5.0)
|
6
|
+
|
7
|
+
**Closed issues:**
|
8
|
+
|
9
|
+
- g [\#75](https://github.com/cantino/reckon/issues/75)
|
10
|
+
- Learn-from not working [\#74](https://github.com/cantino/reckon/issues/74)
|
11
|
+
- Tokens YAML fails to match [\#73](https://github.com/cantino/reckon/issues/73)
|
12
|
+
- Missing or stray quote in line error [\#71](https://github.com/cantino/reckon/issues/71)
|
13
|
+
- Support ISO 8601 formatting of dates in ledger file [\#70](https://github.com/cantino/reckon/issues/70)
|
14
|
+
- Looking for a new maintainer for Reckon [\#68](https://github.com/cantino/reckon/issues/68)
|
15
|
+
- Reckon undefined method to\_h when trying to parse csv file [\#66](https://github.com/cantino/reckon/issues/66)
|
16
|
+
- Runtime error [\#65](https://github.com/cantino/reckon/issues/65)
|
17
|
+
- Reckon doesn't learn from multiple sources [\#63](https://github.com/cantino/reckon/issues/63)
|
18
|
+
- problem of importing file [\#59](https://github.com/cantino/reckon/issues/59)
|
19
|
+
- Problem with file in which every column is quoted. [\#58](https://github.com/cantino/reckon/issues/58)
|
20
|
+
- Error in reckon for the same format csv file [\#57](https://github.com/cantino/reckon/issues/57)
|
21
|
+
- Parsing account names does not work if currency symbol is different from $ [\#56](https://github.com/cantino/reckon/issues/56)
|
22
|
+
- Problem reading csv file [\#55](https://github.com/cantino/reckon/issues/55)
|
23
|
+
- Problem with mint file [\#53](https://github.com/cantino/reckon/issues/53)
|
24
|
+
- --money-column [\#43](https://github.com/cantino/reckon/issues/43)
|
25
|
+
|
26
|
+
**Merged pull requests:**
|
27
|
+
|
28
|
+
- Fix bugs in ledger file parsing. Fixes \#56. [\#81](https://github.com/cantino/reckon/pull/81) ([benprew](https://github.com/benprew))
|
29
|
+
- Better file encoding suggestions [\#80](https://github.com/cantino/reckon/pull/80) ([benprew](https://github.com/benprew))
|
30
|
+
- :bug: fix matching algorithm, add logging and a spec helper. Fixes \#73 [\#79](https://github.com/cantino/reckon/pull/79) ([benprew](https://github.com/benprew))
|
31
|
+
- bug: invalid header lines should be ignored, not parsed. [\#78](https://github.com/cantino/reckon/pull/78) ([benprew](https://github.com/benprew))
|
32
|
+
- convert default date format to iso8601 [\#77](https://github.com/cantino/reckon/pull/77) ([benprew](https://github.com/benprew))
|
33
|
+
- Fix rspec failure for ruby 2.3 and 2.4 [\#69](https://github.com/cantino/reckon/pull/69) ([BlackEdder](https://github.com/BlackEdder))
|
34
|
+
- Allow setting of money and date columns by index [\#67](https://github.com/cantino/reckon/pull/67) ([cantino](https://github.com/cantino))
|
35
|
+
|
36
|
+
## [v0.4.4](https://github.com/cantino/reckon/tree/v0.4.4) (2015-12-02)
|
37
|
+
|
38
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.4.3...v0.4.4)
|
39
|
+
|
40
|
+
**Merged pull requests:**
|
41
|
+
|
42
|
+
- Regexp support in the tokens file [\#54](https://github.com/cantino/reckon/pull/54) ([vzctl](https://github.com/vzctl))
|
43
|
+
|
44
|
+
## [v0.4.3](https://github.com/cantino/reckon/tree/v0.4.3) (2015-08-16)
|
45
|
+
|
46
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.4.2...v0.4.3)
|
47
|
+
|
48
|
+
## [v0.4.2](https://github.com/cantino/reckon/tree/v0.4.2) (2015-08-08)
|
49
|
+
|
50
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.4.1...v0.4.2)
|
51
|
+
|
52
|
+
**Merged pull requests:**
|
53
|
+
|
54
|
+
- Ignore empty description columns [\#52](https://github.com/cantino/reckon/pull/52) ([vzctl](https://github.com/vzctl))
|
55
|
+
|
56
|
+
## [v0.4.1](https://github.com/cantino/reckon/tree/v0.4.1) (2015-07-08)
|
57
|
+
|
58
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.4.0...v0.4.1)
|
59
|
+
|
60
|
+
**Closed issues:**
|
61
|
+
|
62
|
+
- Unattended [\#50](https://github.com/cantino/reckon/issues/50)
|
63
|
+
- Debit/Credit Columns from SunTrust [\#42](https://github.com/cantino/reckon/issues/42)
|
64
|
+
|
65
|
+
**Merged pull requests:**
|
66
|
+
|
67
|
+
- \[RFC\] Fix \#42: Work with suntrust double column csv files [\#48](https://github.com/cantino/reckon/pull/48) ([BlackEdder](https://github.com/BlackEdder))
|
68
|
+
|
69
|
+
## [v0.4.0](https://github.com/cantino/reckon/tree/v0.4.0) (2015-06-05)
|
70
|
+
|
71
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.10...v0.4.0)
|
72
|
+
|
73
|
+
**Implemented enhancements:**
|
74
|
+
|
75
|
+
- Tab completion for transactions [\#40](https://github.com/cantino/reckon/issues/40)
|
76
|
+
- feature: "unattended" mode [\#3](https://github.com/cantino/reckon/issues/3)
|
77
|
+
|
78
|
+
**Closed issues:**
|
79
|
+
|
80
|
+
- Missing or stray quote error [\#38](https://github.com/cantino/reckon/issues/38)
|
81
|
+
|
82
|
+
**Merged pull requests:**
|
83
|
+
|
84
|
+
- Better ISO 8601 dates support [\#49](https://github.com/cantino/reckon/pull/49) ([vzctl](https://github.com/vzctl))
|
85
|
+
- Unattended mode and custom tokens support [\#47](https://github.com/cantino/reckon/pull/47) ([vzctl](https://github.com/vzctl))
|
86
|
+
- \[RFC\] Implement issue \#40: Tab completion [\#46](https://github.com/cantino/reckon/pull/46) ([BlackEdder](https://github.com/BlackEdder))
|
87
|
+
- set readline to allow for backspace in ask dialog [\#44](https://github.com/cantino/reckon/pull/44) ([mrtazz](https://github.com/mrtazz))
|
88
|
+
|
89
|
+
## [v0.3.10](https://github.com/cantino/reckon/tree/v0.3.10) (2014-08-16)
|
90
|
+
|
91
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.9...v0.3.10)
|
92
|
+
|
93
|
+
**Merged pull requests:**
|
94
|
+
|
95
|
+
- Fix --encoding option [\#41](https://github.com/cantino/reckon/pull/41) ([mamciek](https://github.com/mamciek))
|
96
|
+
- Bumped version number [\#37](https://github.com/cantino/reckon/pull/37) ([BlackEdder](https://github.com/BlackEdder))
|
97
|
+
|
98
|
+
## [v0.3.9](https://github.com/cantino/reckon/tree/v0.3.9) (2014-02-20)
|
99
|
+
|
100
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.8...v0.3.9)
|
101
|
+
|
102
|
+
**Closed issues:**
|
103
|
+
|
104
|
+
- Idea/discussion: csv parser [\#25](https://github.com/cantino/reckon/issues/25)
|
105
|
+
- Silently misinterprets UK dates [\#18](https://github.com/cantino/reckon/issues/18)
|
106
|
+
|
107
|
+
**Merged pull requests:**
|
108
|
+
|
109
|
+
- Added spec for csv files from Broker Canada [\#36](https://github.com/cantino/reckon/pull/36) ([BlackEdder](https://github.com/BlackEdder))
|
110
|
+
- Date format [\#35](https://github.com/cantino/reckon/pull/35) ([BlackEdder](https://github.com/BlackEdder))
|
111
|
+
- Added example from a french bank [\#34](https://github.com/cantino/reckon/pull/34) ([BlackEdder](https://github.com/BlackEdder))
|
112
|
+
- Austrian example [\#33](https://github.com/cantino/reckon/pull/33) ([BlackEdder](https://github.com/BlackEdder))
|
113
|
+
- Ing csv [\#30](https://github.com/cantino/reckon/pull/30) ([BlackEdder](https://github.com/BlackEdder))
|
114
|
+
- Further improvements in nationwide csv handling [\#29](https://github.com/cantino/reckon/pull/29) ([BlackEdder](https://github.com/BlackEdder))
|
115
|
+
- Refactor: Add money class [\#28](https://github.com/cantino/reckon/pull/28) ([BlackEdder](https://github.com/BlackEdder))
|
116
|
+
- Initial split of CSVparser from class App [\#27](https://github.com/cantino/reckon/pull/27) ([BlackEdder](https://github.com/BlackEdder))
|
117
|
+
- Updated version of pull request 24: Allow for other currency symbols while calculating money\_score [\#26](https://github.com/cantino/reckon/pull/26) ([BlackEdder](https://github.com/BlackEdder))
|
118
|
+
- Change double column detection [\#23](https://github.com/cantino/reckon/pull/23) ([BlackEdder](https://github.com/BlackEdder))
|
119
|
+
- Added optional argument to contains\_header to skip multiple header lines [\#22](https://github.com/cantino/reckon/pull/22) ([BlackEdder](https://github.com/BlackEdder))
|
120
|
+
- Add a Bitdeli Badge to README [\#20](https://github.com/cantino/reckon/pull/20) ([bitdeli-chef](https://github.com/bitdeli-chef))
|
121
|
+
- Update README to show latest usage info [\#19](https://github.com/cantino/reckon/pull/19) ([purcell](https://github.com/purcell))
|
122
|
+
|
123
|
+
## [v0.3.8](https://github.com/cantino/reckon/tree/v0.3.8) (2013-07-03)
|
124
|
+
|
125
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.7...v0.3.8)
|
126
|
+
|
127
|
+
**Implemented enhancements:**
|
128
|
+
|
129
|
+
- Support other currencies [\#7](https://github.com/cantino/reckon/issues/7)
|
130
|
+
|
131
|
+
**Closed issues:**
|
132
|
+
|
133
|
+
- Add support for dates in spanish dd/mm/yyyy [\#13](https://github.com/cantino/reckon/issues/13)
|
134
|
+
- Problems with my csv file [\#8](https://github.com/cantino/reckon/issues/8)
|
135
|
+
|
136
|
+
**Merged pull requests:**
|
137
|
+
|
138
|
+
- add support for spanish dates dd/mm/yyyy closes \#13 [\#14](https://github.com/cantino/reckon/pull/14) ([mauromorales](https://github.com/mauromorales))
|
139
|
+
- fix issue showing true when parsing the currency option related to \#7 [\#12](https://github.com/cantino/reckon/pull/12) ([mauromorales](https://github.com/mauromorales))
|
140
|
+
|
141
|
+
## [v0.3.7](https://github.com/cantino/reckon/tree/v0.3.7) (2013-06-27)
|
142
|
+
|
143
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.6...v0.3.7)
|
144
|
+
|
145
|
+
**Merged pull requests:**
|
146
|
+
|
147
|
+
- Updated the sources to allow for custom curreny [\#11](https://github.com/cantino/reckon/pull/11) ([ghost](https://github.com/ghost))
|
148
|
+
- Add --account option on the commandline [\#10](https://github.com/cantino/reckon/pull/10) ([copiousfreetime](https://github.com/copiousfreetime))
|
149
|
+
|
150
|
+
## [v0.3.6](https://github.com/cantino/reckon/tree/v0.3.6) (2013-04-30)
|
151
|
+
|
152
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.5...v0.3.6)
|
153
|
+
|
154
|
+
**Closed issues:**
|
155
|
+
|
156
|
+
- iso-8859-1 CSV with accented chars =\> invalid byte sequence in UTF-8 \(ArgumentError\) [\#5](https://github.com/cantino/reckon/issues/5)
|
157
|
+
- Ruby 2.0 compatibility [\#4](https://github.com/cantino/reckon/issues/4)
|
158
|
+
|
159
|
+
**Merged pull requests:**
|
160
|
+
|
161
|
+
- Recognize yyyymmdd date in Reckon::App\#date\_for. [\#9](https://github.com/cantino/reckon/pull/9) ([mhoogendoorn](https://github.com/mhoogendoorn))
|
162
|
+
|
163
|
+
## [v0.3.5](https://github.com/cantino/reckon/tree/v0.3.5) (2013-03-24)
|
164
|
+
|
165
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.4...v0.3.5)
|
166
|
+
|
167
|
+
**Closed issues:**
|
168
|
+
|
169
|
+
- backtrace trying to run reckon -f [\#2](https://github.com/cantino/reckon/issues/2)
|
170
|
+
|
171
|
+
**Merged pull requests:**
|
172
|
+
|
173
|
+
- Inverse mode [\#6](https://github.com/cantino/reckon/pull/6) ([nathankot](https://github.com/nathankot))
|
174
|
+
|
175
|
+
## [v0.3.4](https://github.com/cantino/reckon/tree/v0.3.4) (2013-02-16)
|
176
|
+
|
177
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.3...v0.3.4)
|
178
|
+
|
179
|
+
**Merged pull requests:**
|
180
|
+
|
181
|
+
- adds support for Nordea csv files [\#1](https://github.com/cantino/reckon/pull/1) ([x2q](https://github.com/x2q))
|
182
|
+
|
183
|
+
## [v0.3.3](https://github.com/cantino/reckon/tree/v0.3.3) (2013-01-13)
|
184
|
+
|
185
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.1...v0.3.3)
|
186
|
+
|
187
|
+
## [v0.3.1](https://github.com/cantino/reckon/tree/v0.3.1) (2012-07-30)
|
188
|
+
|
189
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/v0.3.2...v0.3.1)
|
190
|
+
|
191
|
+
## [v0.3.2](https://github.com/cantino/reckon/tree/v0.3.2) (2012-07-30)
|
192
|
+
|
193
|
+
[Full Changelog](https://github.com/cantino/reckon/compare/5c07bea3fe63f9b909b4b76bd49f22fd8faf7a29...v0.3.2)
|
194
|
+
|
195
|
+
|
196
|
+
|
197
|
+
\* *This Changelog was automatically generated by [github_changelog_generator](https://github.com/github-changelog-generator/github-changelog-generator)*
|
data/Gemfile
CHANGED
data/Gemfile.lock
CHANGED
@@ -1,34 +1,52 @@
|
|
1
1
|
PATH
|
2
2
|
remote: .
|
3
3
|
specs:
|
4
|
-
reckon (0.
|
4
|
+
reckon (0.5.0)
|
5
5
|
chronic (>= 0.3.0)
|
6
|
-
fastercsv (>= 1.5.1)
|
7
6
|
highline (>= 1.5.2)
|
7
|
+
rchardet (>= 1.8.0)
|
8
8
|
terminal-table (>= 1.4.2)
|
9
9
|
|
10
10
|
GEM
|
11
11
|
remote: http://rubygems.org/
|
12
12
|
specs:
|
13
13
|
chronic (0.10.2)
|
14
|
-
|
15
|
-
|
16
|
-
highline (
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
|
14
|
+
coderay (1.1.2)
|
15
|
+
diff-lcs (1.3)
|
16
|
+
highline (2.0.3)
|
17
|
+
method_source (0.9.2)
|
18
|
+
pry (0.12.2)
|
19
|
+
coderay (~> 1.1.0)
|
20
|
+
method_source (~> 0.9.0)
|
21
|
+
rake (12.3.3)
|
22
|
+
rantly (1.2.0)
|
23
|
+
rchardet (1.8.0)
|
24
|
+
rspec (3.9.0)
|
25
|
+
rspec-core (~> 3.9.0)
|
26
|
+
rspec-expectations (~> 3.9.0)
|
27
|
+
rspec-mocks (~> 3.9.0)
|
28
|
+
rspec-core (3.9.1)
|
29
|
+
rspec-support (~> 3.9.1)
|
30
|
+
rspec-expectations (3.9.0)
|
31
|
+
diff-lcs (>= 1.2.0, < 2.0)
|
32
|
+
rspec-support (~> 3.9.0)
|
33
|
+
rspec-mocks (3.9.1)
|
34
|
+
diff-lcs (>= 1.2.0, < 2.0)
|
35
|
+
rspec-support (~> 3.9.0)
|
36
|
+
rspec-support (3.9.2)
|
37
|
+
terminal-table (1.8.0)
|
38
|
+
unicode-display_width (~> 1.1, >= 1.1.1)
|
39
|
+
unicode-display_width (1.6.1)
|
27
40
|
|
28
41
|
PLATFORMS
|
29
42
|
ruby
|
30
43
|
|
31
44
|
DEPENDENCIES
|
45
|
+
pry (>= 0.12.2)
|
32
46
|
rake
|
47
|
+
rantly (= 1.2.0)
|
33
48
|
reckon!
|
34
49
|
rspec (>= 1.2.9)
|
50
|
+
|
51
|
+
BUNDLED WITH
|
52
|
+
1.17.3
|
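The lockfile above mixes exact pins such as `rantly (= 1.2.0)` with pessimistic constraints such as `rspec-core (~> 3.9.0)`. The sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes to show which versions those constraints accept. The `satisfies?` helper is a hypothetical name introduced only for this illustration.

```ruby
require 'rubygems'

# Returns true when `version` satisfies every requirement string given.
def satisfies?(version, *requirements)
  Gem::Requirement.new(*requirements).satisfied_by?(Gem::Version.new(version))
end

# rspec-core is locked at 3.9.1 under "~> 3.9.0" (>= 3.9.0 and < 3.10).
puts satisfies?('3.9.1', '~> 3.9.0')
puts satisfies?('3.10.0', '~> 3.9.0')

# diff-lcs is locked at 1.3 under ">= 1.2.0, < 2.0".
puts satisfies?('1.3', '>= 1.2.0', '< 2.0')
```

The `~>` operator only lets the last listed segment grow, so `~> 3.9.0` allows patch releases while `~> 1.1` allows any 1.x release from 1.1 upwards.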
data/README.md
CHANGED
@@ -1,8 +1,8 @@
|
|
1
1
|
# Reckon
|
2
2
|
|
3
|
-
[![Build Status](https://travis-ci.org/cantino/reckon.png)](https://travis-ci.org/cantino/reckon)
|
3
|
+
[![Build Status](https://travis-ci.org/cantino/reckon.png?branch=master)](https://travis-ci.org/cantino/reckon)
|
4
4
|
|
5
|
-
Reckon automagically converts CSV files for use with the command-line accounting tool [Ledger](
|
5
|
+
Reckon automagically converts CSV files for use with the command-line accounting tool [Ledger](http://www.ledger-cli.org/). It also helps you to select the correct accounts associated with the CSV data using Bayesian machine learning.
|
6
6
|
|
7
7
|
## Installation
|
8
8
|
|
@@ -114,6 +114,3 @@ You can override them with `--default_outof_account` and `--default_into_account
|
|
114
114
|
Copyright (c) 2013 Andrew Cantino. See LICENSE for details.
|
115
115
|
|
116
116
|
Thanks to @BlackEdder for many contributions!
|
117
|
-
|
118
|
-
[![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/cantino/reckon/trend.png)](https://bitdeli.com/free "Bitdeli Badge")
|
119
|
-
|
data/lib/reckon.rb
CHANGED
@@ -1,19 +1,21 @@
|
|
1
1
|
#!/usr/bin/env ruby
|
2
2
|
|
3
3
|
require 'rubygems'
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
require 'fastercsv'
|
8
|
-
end
|
4
|
+
require 'rchardet'
|
5
|
+
require 'chronic'
|
6
|
+
require 'csv'
|
9
7
|
require 'highline/import'
|
10
8
|
require 'optparse'
|
11
|
-
require 'chronic'
|
12
|
-
require 'time'
|
13
9
|
require 'terminal-table'
|
10
|
+
require 'time'
|
11
|
+
require 'logger'
|
12
|
+
|
13
|
+
LOGGER = Logger.new(STDOUT)
|
14
|
+
LOGGER.level = Logger::ERROR
|
14
15
|
|
16
|
+
require_relative 'reckon/version'
|
17
|
+
require_relative 'reckon/cosine_similarity'
|
15
18
|
require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "app"))
|
16
19
|
require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "ledger_parser"))
|
17
20
|
require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "csv_parser"))
|
18
21
|
require File.expand_path(File.join(File.dirname(__FILE__), "reckon", "money"))
|
19
|
-
|
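The lib/reckon.rb change above introduces a global logger that defaults to the ERROR level. The following minimal sketch shows the effect of that default and of raising the level to INFO, using only Ruby's standard `Logger`. The `build_logger` helper and the `verbose:` keyword are hypothetical names for this illustration, and a `StringIO` replaces `STDOUT` so the output can be inspected.

```ruby
require 'logger'
require 'stringio'

# Build a logger that is quiet by default and chattier when verbose.
def build_logger(io, verbose: false)
  logger = Logger.new(io)
  logger.level = verbose ? Logger::INFO : Logger::ERROR
  logger
end

quiet_io = StringIO.new
quiet = build_logger(quiet_io)
quiet.info('parsing row 1')         # dropped: below ERROR
quiet.error('could not parse date') # written

verbose_io = StringIO.new
verbose = build_logger(verbose_io, verbose: true)
verbose.info('parsing row 1')       # written: INFO is enabled

puts quiet_io.string
puts verbose_io.string
```

Keeping the default at ERROR means informational messages stay out of the way unless the user asks for them.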
data/lib/reckon/app.rb
CHANGED
@@ -1,21 +1,20 @@
|
|
1
|
-
#coding: utf-8
|
1
|
+
# coding: utf-8
|
2
2
|
require 'pp'
|
3
3
|
require 'yaml'
|
4
4
|
|
5
5
|
module Reckon
|
6
6
|
class App
|
7
|
-
|
8
|
-
attr_accessor :options, :accounts, :tokens, :seen, :csv_parser, :regexps
|
7
|
+
attr_accessor :options, :seen, :csv_parser, :regexps, :matcher
|
9
8
|
|
10
9
|
def initialize(options = {})
|
10
|
+
LOGGER.level = Logger::INFO if options[:verbose]
|
11
11
|
self.options = options
|
12
|
-
self.tokens = {}
|
13
12
|
self.regexps = {}
|
14
|
-
self.accounts = {}
|
15
13
|
self.seen = {}
|
16
14
|
self.options[:currency] ||= '$'
|
17
15
|
options[:string] = File.read(options[:file]) unless options[:string]
|
18
16
|
@csv_parser = CSVParser.new( options )
|
17
|
+
@matcher = CosineSimilarity.new(options)
|
19
18
|
learn!
|
20
19
|
end
|
21
20
|
|
@@ -24,21 +23,44 @@ module Reckon
|
|
24
23
|
puts str
|
25
24
|
end
|
26
25
|
|
26
|
+
def learn!
|
27
|
+
learn_from_account_tokens(options[:account_tokens_file])
|
28
|
+
|
29
|
+
ledger_file = options[:existing_ledger_file]
|
30
|
+
return unless ledger_file
|
31
|
+
fail "#{ledger_file} doesn't exist!" unless File.exists?(ledger_file)
|
32
|
+
learn_from(File.read(ledger_file))
|
33
|
+
end
|
34
|
+
|
35
|
+
def learn_from_account_tokens(filename)
|
36
|
+
return unless filename
|
37
|
+
|
38
|
+
fail "#{filename} doesn't exist!" unless File.exists?(filename)
|
39
|
+
|
40
|
+
extract_account_tokens(YAML.load_file(filename)).each do |account, tokens|
|
41
|
+
tokens.each do |t|
|
42
|
+
if t.start_with?('/')
|
43
|
+
add_regexp(account, t)
|
44
|
+
else
|
45
|
+
@matcher.add_document(account, t)
|
46
|
+
end
|
47
|
+
end
|
48
|
+
end
|
49
|
+
end
|
50
|
+
|
27
51
|
def learn_from(ledger)
|
28
52
|
LedgerParser.new(ledger).entries.each do |entry|
|
29
53
|
entry[:accounts].each do |account|
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
seen[
|
54
|
+
str = [entry[:desc], account[:amount]].join(" ")
|
55
|
+
@matcher.add_document(account[:name], str) unless account[:name] == options[:bank_account]
|
56
|
+
pretty_date = entry[:date].iso8601
|
57
|
+
seen[pretty_date] ||= {}
|
58
|
+
seen[pretty_date][@csv_parser.pretty_money(account[:amount])] = true
|
34
59
|
end
|
35
60
|
end
|
36
61
|
end
|
37
62
|
|
38
|
-
|
39
|
-
seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]]
|
40
|
-
end
|
41
|
-
|
63
|
+
# Add tokens from account_tokens_file to accounts
|
42
64
|
def extract_account_tokens(subtree, account = nil)
|
43
65
|
if subtree.nil?
|
44
66
|
puts "Warning: empty #{account} tree"
|
@@ -46,50 +68,26 @@ module Reckon
|
|
46
68
|
elsif subtree.is_a?(Array)
|
47
69
|
{ account => subtree }
|
48
70
|
else
|
49
|
-
at = subtree.map
|
50
|
-
|
51
|
-
|
52
|
-
end
|
53
|
-
|
54
|
-
def learn!
|
55
|
-
if options[:account_tokens_file]
|
56
|
-
fail "#{options[:account_tokens_file]} doesn't exist!" unless File.exists?(options[:account_tokens_file])
|
57
|
-
extract_account_tokens(YAML.load_file(options[:account_tokens_file])).each do |account, tokens|
|
58
|
-
tokens.each { |t| learn_about_account(account, t, true) }
|
71
|
+
at = subtree.map do |k, v|
|
72
|
+
merged_acct = [account, k].compact.join(':')
|
73
|
+
extract_account_tokens(v, merged_acct)
|
59
74
|
end
|
75
|
+
at.inject({}) { |memo, e| memo.merge!(e)}
|
60
76
|
end
|
61
|
-
return unless options[:existing_ledger_file]
|
62
|
-
fail "#{options[:existing_ledger_file]} doesn't exist!" unless File.exists?(options[:existing_ledger_file])
|
63
|
-
ledger_data = File.read(options[:existing_ledger_file])
|
64
|
-
learn_from(ledger_data)
|
65
77
|
end
|
66
78
|
|
67
|
-
def
|
68
|
-
|
69
|
-
|
70
|
-
|
71
|
-
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
when 'x' then options |= Regexp::EXTENDED
|
77
|
-
when 'i' then options |= Regexp::IGNORECASE
|
78
|
-
end
|
79
|
-
end
|
80
|
-
regexps[Regexp.new(match[1], options)] = account
|
81
|
-
else
|
82
|
-
tokenize(data).each do |token|
|
83
|
-
tokens[token] ||= {}
|
84
|
-
tokens[token][account] ||= 0
|
85
|
-
tokens[token][account] += 1
|
86
|
-
accounts[account] += 1
|
79
|
+
def add_regexp(account, regex_str)
|
80
|
+
# https://github.com/tenderlove/psych/blob/master/lib/psych/visitors/to_ruby.rb
|
81
|
+
match = regex_str.match(/^\/(.*)\/([ix]*)$/m)
|
82
|
+
fail "failed to parse regexp #{regex_str}" unless match
|
83
|
+
options = 0
|
84
|
+
(match[2] || '').split('').each do |option|
|
85
|
+
case option
|
86
|
+
when 'x' then options |= Regexp::EXTENDED
|
87
|
+
when 'i' then options |= Regexp::IGNORECASE
|
87
88
|
end
|
88
89
|
end
|
89
|
-
|
90
|
-
|
91
|
-
def tokenize(str)
|
92
|
-
str.downcase.split(/[\s\-]/)
|
90
|
+
regexps[Regexp.new(match[1], options)] = account
|
93
91
|
end
|
94
92
|
|
95
93
|
def walk_backwards
|
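The hunk above flattens a nested tokens file into `account => tokens` pairs and turns tokens of the form `/pattern/flags` into `Regexp` objects. The sketch below restates both steps as standalone functions so they can be tried in isolation. The names `flatten_account_tokens` and `parse_regexp_token` are hypothetical, the sample YAML is invented, and returning an empty hash for an empty subtree is an assumption, since that branch is not fully visible in the diff.

```ruby
require 'yaml'

# Flatten nested account names into "Parent:Child" keys mapped to token lists.
def flatten_account_tokens(subtree, account = nil)
  case subtree
  when nil then {} # assumption: an empty subtree contributes nothing
  when Array then { account => subtree }
  else
    subtree.each_with_object({}) do |(key, value), memo|
      memo.merge!(flatten_account_tokens(value, [account, key].compact.join(':')))
    end
  end
end

# Turn a "/pattern/flags" string into a Regexp; only i and x flags are handled.
# This sketch anchors with \A and \z, whereas the diff uses ^ and $.
def parse_regexp_token(token)
  match = token.match(%r{\A/(.*)/([ix]*)\z}m)
  raise ArgumentError, "failed to parse regexp #{token}" unless match

  flags = match[2].each_char.reduce(0) do |acc, flag|
    case flag
    when 'x' then acc | Regexp::EXTENDED
    when 'i' then acc | Regexp::IGNORECASE
    else acc
    end
  end
  Regexp.new(match[1], flags)
end

tokens = YAML.load("Expenses:\n  Dining:\n    - coffee\n    - '/pizza/i'\nIncome:\n  - salary\n")
flat = flatten_account_tokens(tokens)
puts flat.inspect
puts parse_regexp_token('/pizza/i').match?('PIZZA HUT')
```

Tokens that start with `/` would go through the regexp path, and everything else would be handed to the similarity matcher as plain text.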
@@ -107,8 +105,7 @@ module Reckon
|
|
107
105
|
seen_anything_new = true
|
108
106
|
end
|
109
107
|
|
110
|
-
possible_answers =
|
111
|
-
possible_answers = weighted_account_match( row ).map! { |a| a[:account] } if possible_answers.empty?
|
108
|
+
possible_answers = suggest(row)
|
112
109
|
|
113
110
|
ledger = if row[:money] > 0
|
114
111
|
if options[:unattended]
|
@@ -156,15 +153,19 @@ module Reckon
|
|
156
153
|
end
|
157
154
|
end
|
158
155
|
|
159
|
-
def
|
160
|
-
|
161
|
-
|
162
|
-
|
163
|
-
|
164
|
-
|
165
|
-
|
166
|
-
|
167
|
-
|
156
|
+
def each_row_backwards
|
157
|
+
rows = []
|
158
|
+
(0...@csv_parser.columns.first.length).to_a.each do |index|
|
159
|
+
rows << { :date => @csv_parser.date_for(index),
|
160
|
+
:pretty_date => @csv_parser.pretty_date_for(index),
|
161
|
+
:pretty_money => @csv_parser.pretty_money_for(index),
|
162
|
+
:pretty_money_negated => @csv_parser.pretty_money_for(index, :negate),
|
163
|
+
:money => @csv_parser.money_for(index),
|
164
|
+
:description => @csv_parser.description_for(index) }
|
165
|
+
end
|
166
|
+
rows.sort { |a, b| a[:date] <=> b[:date] }.each do |row|
|
167
|
+
yield row
|
168
|
+
end
|
168
169
|
end
|
169
170
|
|
170
171
|
def most_specific_regexp_match( row )
|
@@ -176,41 +177,9 @@ module Reckon
|
|
176
177
|
matches.sort_by! { |account, matched_text| matched_text.length }.map(&:first)
|
177
178
|
end
|
178
179
|
|
179
|
-
|
180
|
-
|
181
|
-
|
182
|
-
|
183
|
-
search_vector = []
|
184
|
-
account_vectors = {}
|
185
|
-
|
186
|
-
query_tokens.each do |token|
|
187
|
-
idf = Math.log((accounts.keys.length + 1) / ((tokens[token] || {}).keys.length.to_f + 1))
|
188
|
-
tf = 1.0 / query_tokens.length.to_f
|
189
|
-
search_vector << tf*idf
|
190
|
-
|
191
|
-
accounts.each do |account, total_terms|
|
192
|
-
tf = (tokens[token] && tokens[token][account]) ? tokens[token][account] / total_terms.to_f : 0
|
193
|
-
account_vectors[account] ||= []
|
194
|
-
account_vectors[account] << tf*idf
|
195
|
-
end
|
196
|
-
end
|
197
|
-
|
198
|
-
# Should I normalize the vectors? Probably unnecessary due to tf-idf and short documents.
|
199
|
-
|
200
|
-
account_vectors = account_vectors.to_a.map do |account, account_vector|
|
201
|
-
{ :cosine => (0...account_vector.length).to_a.inject(0) { |m, i| m + search_vector[i] * account_vector[i] },
|
202
|
-
:account => account }
|
203
|
-
end
|
204
|
-
account_vectors.sort! {|a, b| b[:cosine] <=> a[:cosine] }
|
205
|
-
|
206
|
-
# Return empty set if no accounts matched so that we can fallback to the defaults in the unattended mode
|
207
|
-
if options[:unattended]
|
208
|
-
if account_vectors.first && account_vectors.first[:account]
|
209
|
-
account_vectors = [] if account_vectors.first[:cosine] == 0
|
210
|
-
end
|
211
|
-
end
|
212
|
-
|
213
|
-
return account_vectors
|
180
|
+
def suggest(row)
|
181
|
+
most_specific_regexp_match(row) +
|
182
|
+
@matcher.find_similar(row[:description]).map { |n| n[:account] }
|
214
183
|
end
|
215
184
|
|
216
185
|
def ledger_format(row, line1, line2)
|
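The hunk above replaces the inline tf-idf scoring with a call to `@matcher.find_similar`. The body of `lib/reckon/cosine_similarity.rb` is not shown in this section, so the following is only a generic sketch of cosine similarity over raw term counts, not the gem's implementation. Every name in it (`tokenize`, `term_counts`, `cosine`, `rank_accounts`) is hypothetical, and it deliberately skips the idf weighting that the removed code used.

```ruby
# Split a description into lowercase terms on whitespace and hyphens.
def tokenize(text)
  text.downcase.split(/[\s\-]+/).reject(&:empty?)
end

# Count how often each term occurs.
def term_counts(text)
  tokenize(text).each_with_object(Hash.new(0)) { |term, counts| counts[term] += 1 }
end

# Cosine of the angle between two term-count vectors (0.0 when either is empty).
def cosine(a, b)
  dot = a.sum { |term, count| count * b.fetch(term, 0) }
  norm = ->(v) { Math.sqrt(v.values.sum { |c| c * c }) }
  denom = norm.call(a) * norm.call(b)
  denom.zero? ? 0.0 : dot / denom
end

# Rank accounts by similarity to a description, dropping zero scores.
def rank_accounts(documents, description)
  query = term_counts(description)
  documents
    .map { |account, text| { account: account, similarity: cosine(query, term_counts(text)) } }
    .select { |result| result[:similarity] > 0 }
    .sort_by { |result| -result[:similarity] }
end

documents = {
  'Expenses:Coffee' => 'starbucks coffee',
  'Expenses:Rent' => 'monthly rent payment'
}
puts rank_accounts(documents, 'STARBUCKS #123').inspect
```

Dropping zero-similarity accounts mirrors the intent of the removed unattended-mode check, where an empty result lets the caller fall back to the default accounts.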
@@ -220,6 +189,21 @@ module Reckon
|
|
220
189
|
out
|
221
190
|
end
|
222
191
|
|
192
|
+
def output(ledger_line)
|
193
|
+
options[:output_file].puts ledger_line
|
194
|
+
options[:output_file].flush
|
195
|
+
end
|
196
|
+
|
197
|
+
def already_seen?(row)
|
198
|
+
seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]]
|
199
|
+
end
|
200
|
+
|
201
|
+
def finish
|
202
|
+
options[:output_file].close unless options[:output_file] == STDOUT
|
203
|
+
interactive_output "Exiting."
|
204
|
+
exit
|
205
|
+
end
|
206
|
+
|
223
207
|
def output_table
|
224
208
|
output = Terminal::Table.new do |t|
|
225
209
|
t.headings = 'Date', 'Amount', 'Description'
|
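The `seen` hash in the hunks above is keyed first by ISO 8601 date and then by formatted amount, and `already_seen?` looks a row up in it. Here is a minimal standalone sketch of that lookup. The `remember` helper and the sample values are hypothetical, and `already_seen?` takes the hash as an explicit argument here rather than reading instance state.

```ruby
require 'date'

# Record that a transaction with this date and formatted amount exists.
def remember(seen, date, pretty_money)
  key = date.iso8601
  seen[key] ||= {}
  seen[key][pretty_money] = true
end

# True when a row's date and amount were recorded earlier.
def already_seen?(seen, row)
  !!(seen[row[:pretty_date]] && seen[row[:pretty_date]][row[:pretty_money]])
end

seen = {}
remember(seen, Date.new(2020, 2, 19), '$12.50')

puts already_seen?(seen, { pretty_date: '2020-02-19', pretty_money: '$12.50' })
puts already_seen?(seen, { pretty_date: '2020-02-19', pretty_money: '$99.00' })
```

One consequence of this keying is that two distinct transactions on the same day for the same amount are indistinguishable, so the second would be treated as already imported.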
@@ -230,21 +214,6 @@ module Reckon
|
|
230
214
|
interactive_output output
|
231
215
|
end
|
232
216
|
|
233
|
-
def each_row_backwards
|
234
|
-
rows = []
|
235
|
-
(0...@csv_parser.columns.first.length).to_a.each do |index|
|
236
|
-
rows << { :date => @csv_parser.date_for(index),
|
237
|
-
:pretty_date => @csv_parser.pretty_date_for(index),
|
238
|
-
:pretty_money => @csv_parser.pretty_money_for(index),
|
239
|
-
:pretty_money_negated => @csv_parser.pretty_money_for(index, :negate),
|
240
|
-
:money => @csv_parser.money_for(index),
|
241
|
-
:description => @csv_parser.description_for(index) }
|
242
|
-
end
|
243
|
-
rows.sort { |a, b| a[:date] <=> b[:date] }.each do |row|
|
244
|
-
yield row
|
245
|
-
end
|
246
|
-
end
|
247
|
-
|
248
217
|
def self.parse_opts(args = ARGV)
|
249
218
|
options = { :output_file => STDOUT }
|
250
219
|
parser = OptionParser.new do |opts|
|
@@ -255,7 +224,7 @@ module Reckon
|
|
255
224
|
options[:file] = file
|
256
225
|
end
|
257
226
|
|
258
|
-
opts.on("-a", "--account
|
227
|
+
opts.on("-a", "--account NAME", "The Ledger Account this file is for") do |a|
|
259
228
|
options[:bank_account] = a
|
260
229
|
end
|
261
230
|
|
@@ -283,6 +252,14 @@ module Reckon
|
|
283
252
|
options[:ignore_columns] = ignore.split(",").map { |i| i.to_i }
|
284
253
|
end
|
285
254
|
|
255
|
+
opts.on("", "--money-column 2", Integer, "Specify the money column instead of letting Reckon guess - the first column is column 1") do |column_number|
|
256
|
+
options[:money_column] = column_number
|
257
|
+
end
|
258
|
+
|
259
|
+
opts.on("", "--date-column 3", Integer, "Specify the date column instead of letting Reckon guess - the first column is column 1") do |column_number|
|
260
|
+
options[:date_column] = column_number
|
261
|
+
end
|
262
|
+
|
286
263
|
opts.on("", "--contains-header [N]", "The first row of the CSV is a header and should be skipped. Optionally add the number of rows to skip.") do |contains_header|
|
287
264
|
options[:contains_header] = 1
|
288
265
|
options[:contains_header] = contains_header.to_i if contains_header
|
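The `--money-column` and `--date-column` options added above pass `Integer` to `OptionParser`, so the block receives a number rather than a string. The sketch below isolates that behaviour. `parse_column_options` and `zero_based` are hypothetical helpers, and the conversion to a zero-based array index is an assumption about how a one-based column number might be consumed, not something shown in this diff.

```ruby
require 'optparse'

# Parse only the two column options; values arrive as Integers.
def parse_column_options(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.on('--money-column N', Integer, 'Money column, counting from 1') do |n|
      options[:money_column] = n
    end
    opts.on('--date-column N', Integer, 'Date column, counting from 1') do |n|
      options[:date_column] = n
    end
  end
  parser.parse!(argv.dup) # dup so the caller's array is left untouched
  options
end

# Hypothetical helper: turn a one-based column number into an array index.
def zero_based(column_number)
  column_number - 1
end

options = parse_column_options(%w[--money-column 2 --date-column 3])
puts options.inspect
puts zero_based(options[:money_column])
```

Because of the `Integer` coercion, a non-numeric value such as `--money-column abc` is rejected by `OptionParser` with `OptionParser::InvalidArgument` before the block runs.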
@@ -316,11 +293,11 @@ module Reckon
|
|
316
293
|
options[:account_tokens_file] = a
|
317
294
|
end
|
318
295
|
|
319
|
-
opts.on("", "--default-into-account
|
296
|
+
opts.on("", "--default-into-account NAME", "Default into account") do |a|
|
320
297
|
options[:default_into_account] = a
|
321
298
|
end
|
322
299
|
|
323
|
-
opts.on("", "--default-outof-account
|
300
|
+
opts.on("", "--default-outof-account NAME", "Default 'out of' account") do |a|
|
324
301
|
options[:default_outof_account] = a
|
325
302
|
end
|
326
303
|
|
@@ -351,7 +328,6 @@ module Reckon
|
|
351
328
|
end
|
352
329
|
|
353
330
|
unless options[:bank_account]
|
354
|
-
|
355
331
|
fail "Please specify --account for the unattended mode" if options[:unattended]
|
356
332
|
|
357
333
|
options[:bank_account] = ask("What is the account name of this bank account in Ledger? ") do |q|
|