chronicle-etl 0.4.0 → 0.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45) hide show
  1. checksums.yaml +4 -4
  2. data/.github/workflows/ruby.yml +2 -2
  3. data/.rubocop.yml +3 -0
  4. data/README.md +156 -81
  5. data/chronicle-etl.gemspec +3 -0
  6. data/lib/chronicle/etl/cli/cli_base.rb +31 -0
  7. data/lib/chronicle/etl/cli/connectors.rb +4 -11
  8. data/lib/chronicle/etl/cli/jobs.rb +49 -22
  9. data/lib/chronicle/etl/cli/main.rb +32 -1
  10. data/lib/chronicle/etl/cli/plugins.rb +62 -0
  11. data/lib/chronicle/etl/cli/subcommand_base.rb +1 -1
  12. data/lib/chronicle/etl/cli.rb +3 -0
  13. data/lib/chronicle/etl/config.rb +7 -4
  14. data/lib/chronicle/etl/configurable.rb +15 -2
  15. data/lib/chronicle/etl/exceptions.rb +29 -2
  16. data/lib/chronicle/etl/extractors/csv_extractor.rb +24 -17
  17. data/lib/chronicle/etl/extractors/extractor.rb +5 -5
  18. data/lib/chronicle/etl/extractors/file_extractor.rb +33 -13
  19. data/lib/chronicle/etl/extractors/helpers/input_reader.rb +76 -0
  20. data/lib/chronicle/etl/extractors/json_extractor.rb +21 -12
  21. data/lib/chronicle/etl/job.rb +7 -1
  22. data/lib/chronicle/etl/job_definition.rb +32 -6
  23. data/lib/chronicle/etl/loaders/csv_loader.rb +35 -8
  24. data/lib/chronicle/etl/loaders/helpers/encoding_helper.rb +18 -0
  25. data/lib/chronicle/etl/loaders/json_loader.rb +44 -0
  26. data/lib/chronicle/etl/loaders/loader.rb +24 -1
  27. data/lib/chronicle/etl/loaders/table_loader.rb +13 -26
  28. data/lib/chronicle/etl/logger.rb +6 -2
  29. data/lib/chronicle/etl/models/base.rb +3 -0
  30. data/lib/chronicle/etl/models/entity.rb +8 -2
  31. data/lib/chronicle/etl/models/raw.rb +26 -0
  32. data/lib/chronicle/etl/registry/connector_registration.rb +5 -0
  33. data/lib/chronicle/etl/registry/plugin_registry.rb +75 -0
  34. data/lib/chronicle/etl/registry/registry.rb +27 -14
  35. data/lib/chronicle/etl/runner.rb +35 -17
  36. data/lib/chronicle/etl/serializers/jsonapi_serializer.rb +6 -0
  37. data/lib/chronicle/etl/serializers/raw_serializer.rb +10 -0
  38. data/lib/chronicle/etl/serializers/serializer.rb +2 -1
  39. data/lib/chronicle/etl/transformers/null_transformer.rb +1 -1
  40. data/lib/chronicle/etl/version.rb +1 -1
  41. data/lib/chronicle/etl.rb +11 -4
  42. metadata +53 -6
  43. data/lib/chronicle/etl/extractors/helpers/filesystem_reader.rb +0 -104
  44. data/lib/chronicle/etl/loaders/stdout_loader.rb +0 -14
  45. data/lib/chronicle/etl/models/generic.rb +0 -23
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 5fd411a9a41a645b85780230c79b09f361e121d0e8ca7f3270ca8eba55a76ca8
4
- data.tar.gz: c09053715910ab4f027fbdc3a5b7d10c042eee962f7fa93c6571ce8359f51009
3
+ metadata.gz: f2b6fdca3723ec52287c070a0dd08d0cfaf825f5e8f46da0d5a34172c0008573
4
+ data.tar.gz: e15181ba7edc1698404af8ff8c05d5367786ea809360393e825ca5ee5eef6c75
5
5
  SHA512:
6
- metadata.gz: 2c9ec14b6c0a51f1c5ec77ee8d9a7f016d16bdc35db5634f9fa5d38aabc30dec201cd4b8bef06a31b86773a0c1cda2d271d7008dcb247a86d956c094919f3c0f
7
- data.tar.gz: 0dca41e1654e5b2b98a148f853492a67126cdac767000b3c5f97c5c8ff88b77464e17a2fab38b72c1f014f3515c911e5f3f391eaf68d64e73dcfcff5d8e6cb6a
6
+ metadata.gz: c5508dcfc5e1367122ebbc191dc60a76cc7b1f088ae9ebd52bae41e07420f54ea7977a35f2b25839933b06d4673e2a120feecbd958efee07e0d313eaa7a5d167
7
+ data.tar.gz: adcb90549af364189c5ae3b0811c039277aba7dc6fbf2fbd6de8b89c572948d0dc00ba8da59db03252f4770737423e62c3a1175ef018742ef8bd7aee14837f63
@@ -9,9 +9,9 @@ name: Ruby
9
9
 
10
10
  on:
11
11
  push:
12
- branches: [ master ]
12
+ branches: [ main ]
13
13
  pull_request:
14
- branches: [ master ]
14
+ branches: [ main ]
15
15
 
16
16
  jobs:
17
17
  test:
data/.rubocop.yml CHANGED
@@ -27,6 +27,9 @@ Style/OpenStructUse:
27
27
  Style/Copyright:
28
28
  Enabled: false
29
29
 
30
+ Style/MissingElse:
31
+ Enabled: false
32
+
30
33
  Style/SymbolArray:
31
34
  EnforcedStyle: brackets
32
35
 
data/README.md CHANGED
@@ -1,125 +1,200 @@
1
- # Chronicle::ETL
1
+ ## A CLI toolkit for extracting and working with your digital history
2
+
3
+ ![chronicle-etl-banner](https://user-images.githubusercontent.com/6291/157330518-0f934c9a-9ec4-43d9-9cc2-12f156d09b37.png)
2
4
 
3
5
  [![Gem Version](https://badge.fury.io/rb/chronicle-etl.svg)](https://badge.fury.io/rb/chronicle-etl) [![Ruby](https://github.com/chronicle-app/chronicle-etl/actions/workflows/ruby.yml/badge.svg)](https://github.com/chronicle-app/chronicle-etl/actions/workflows/ruby.yml)
4
6
 
5
- Chronicle ETL is a utility that helps you archive and processes personal data. You can *extract* it from a variety of sources, *transform* it, and *load* it to an external API, file, or stdout.
7
+ Are you trying to archive your digital history or incorporate it into your own projects? You’ve probably discovered how frustrating it is to get machine-readable access to your own data. While [building a memex](https://hyfen.net/memex/), I learned first-hand what great efforts must be made before you can begin using the data in interesting ways.
6
8
 
7
- This tool is an adaptation of Andrew Louis's experimental [Memex project](https://hyfen.net/memex) and the dozens of existing importers are being migrated to Chronicle.
9
+ If you don’t want to spend all your time writing scrapers, reverse-engineering APIs, or parsing takeout data, this project is for you! (*If you do enjoy these things, please see the [open issues](https://github.com/chronicle-app/chronicle-etl/issues).*)
8
10
 
9
- ## Installation
11
+ **`chronicle-etl` is a CLI tool that gives you a unified interface for accessing your personal data.** It uses the ETL pattern to *extract* it from a source (e.g. your local browser history, a directory of images, goodreads.com reading history), *transform* it (into a given schema), and *load* it to a source (e.g. a CSV file, JSON, external API).
10
12
 
11
- ```bash
12
- $ gem install chronicle-etl
13
+ ## What does `chronicle-etl` give you?
14
+ * **CLI tool for working with personal data**. You can monitor progress of exports, manipulate the output, set up recurring jobs, manage credentials, and more.
15
+ * **Plugins for many third-party providers**. A plugin system allows you to access data from third-party providers and hook it into the shared CLI infrastructure.
16
+ * **A common, opinionated schema**: You can normalize different datasets into a single schema so that, for example, all your iMessages and emails are stored in a common schema. Don’t want to use the schema? `chronicle-etl` always allows you to fall back on working with the raw extraction data.
17
+
18
+ ## Installation
19
+ ```sh
20
+ # Install chronicle-etl
21
+ gem install chronicle-etl
13
22
  ```
14
23
 
15
- ## Usage
24
+ After installation, the `chronicle-etl` command will be available in your shell. Homebrew support [is coming soon](https://github.com/chronicle-app/chronicle-etl/issues/13).
16
25
 
17
- After installing the gem, `chronicle-etl` is available to run in your shell.
26
+ ## Basic usage and running jobs
18
27
 
19
- ```bash
20
- # read test.csv and display it as a table
21
- $ chronicle-etl jobs:run --extractor csv --extractor-opts filename:test.csv --loader table
28
+ ```sh
29
+ # Display help
30
+ $ chronicle-etl help
22
31
 
23
- # Display help for the jobs:run command
24
- $ chronicle-etl jobs help run
32
+ # Basic job usage
33
+ $ chronicle-etl --extractor NAME --transformer NAME --loader NAME
34
+
35
+ # Read test.csv and display it to stdout as a table
36
+ $ chronicle-etl --extractor csv --input ./data.csv --loader table
25
37
  ```
26
38
 
27
- ## Connectors
39
+ ### Common options
40
+ ```sh
41
+ Options:
42
+ -j, [--name=NAME] # Job configuration name
43
+ -e, [--extractor=EXTRACTOR-NAME] # Extractor class. Default: stdin
44
+ [--extractor-opts=key:value] # Extractor options
45
+ -t, [--transformer=TRANFORMER-NAME] # Transformer class. Default: null
46
+ [--transformer-opts=key:value] # Transformer options
47
+ -l, [--loader=LOADER-NAME] # Loader class. Default: stdout
48
+ [--loader-opts=key:value] # Loader options
49
+ -i, [--input=FILENAME] # Input filename or directory
50
+ [--since=DATE] # Load records SINCE this date. Overrides job's `load_since` configuration option in extractor's options
51
+ [--until=DATE] # Load records UNTIL this date
52
+ [--limit=N] # Only extract the first LIMIT records
53
+ -o, [--output=OUTPUT] # Output filename
54
+ [--fields=field1 field2 ...] # Output only these fields
55
+ [--log-level=LOG_LEVEL] # Log level (debug, info, warn, error, fatal)
56
+ # Default: info
57
+ -v, [--verbose], [--no-verbose] # Set log level to verbose
58
+ [--silent], [--no-silent] # Silence all output
59
+ ```
28
60
 
61
+ ## Connectors
29
62
  Connectors are available to read, process, and load data from different formats or external services.
30
63
 
31
- ```bash
64
+ ```sh
32
65
  # List all available connectors
33
66
  $ chronicle-etl connectors:list
34
-
35
- # Install a connector
36
- $ chronicle-etl connectors:install imessage
37
67
  ```
38
68
 
39
- Built in connectors:
40
-
41
- ### Extractors
42
- - `stdin` - (default) Load records from line-separated stdin
43
- - `csv`
44
- - `file` - load from a single file or directory (with a glob pattern)
45
-
46
- ### Transformers
47
- - `null` - (default) Don't do anything
48
-
49
- ### Loaders
50
- - `stdout` - (default) output records to stdout serialized as JSON
51
- - `csv` - Load records to a csv file
52
- - `rest` - Serialize records with [JSONAPI](https://jsonapi.org/) and send to a REST API
53
- - `table` - Output an ascii table of records. Useful for debugging.
69
+ ### Built-in Connectors
70
+ `chronicle-etl` comes with several built-in connectors for common formats and sources.
54
71
 
55
- ### Provider-specific importers
72
+ #### Extractors
73
+ - [`csv`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/extractors/csv_extractor.rb) - Load records from CSV files or stdin
74
+ - [`json`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/extractors/json_extractor.rb) - Load JSON (either [line-separated objects](https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON) or one object)
75
+ - [`file`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/extractors/file_extractor.rb) - load from a single file or directory (with a glob pattern)
56
76
 
57
- In addition to the built-in importers, importers for third-party platforms are available. They are packaged as individual Ruby gems.
77
+ #### Transformers
78
+ - [`null`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/transformers/null_transformer.rb) - (default) Don’t do anything and pass on raw extraction data
58
79
 
59
- - [email](https://github.com/chronicle-app/chronicle-email). Extractors for `mbox` and other email files
60
- - [shell](https://github.com/chronicle-app/chronicle-shell). Extract shell history from Bash or Zsh`
61
- - [imessage](https://github.com/chronicle-app/chronicle-imessage). Extract iMessage messages from a local macOS installation
80
+ #### Loaders
81
+ - [`table`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/table_loader.rb) - (default) Output an ascii table of records. Useful for exploring data.
82
+ - [`csv`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/extractors/csv_extractor.rb) - Load records to CSV
83
+ - [`json`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/json_loader.rb) - Load records serialized as JSON
84
+ - [`rest`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/rest_loader.rb) - Serialize records with [JSONAPI](https://jsonapi.org/) and send to a REST API
62
85
 
63
- To install any of these, run `gem install chronicle-PROVIDER`.
86
+ ### Plugins
87
+ Plugins provide access to data from third-party platforms, services, or formats.
64
88
 
65
- If you don't want to use the available rubygem importers, `chronicle-etl` can use `stdin` as an Extractor source (newline separated records). You can also use `stdout` as a loader — transformed records will be outputted separated by newlines.
66
-
67
- I'll be open-sourcing more importers. Please [contact me](mailto:andrew@hyfen.net) to chat about what will be available!
68
-
69
- ## Full commands
89
+ ```bash
90
+ # Install a plugin
91
+ $ chronicle-etl plugins:install NAME
70
92
 
71
- ```
72
- $ chronicle-etl help
73
-
74
- ALL COMMANDS
75
- help # This help menu
76
- connectors help [COMMAND] # Describe subcommands or one specific subcommand
77
- connectors:install NAME # Installs connector NAME
78
- connectors:list # Lists available connectors
79
- jobs help [COMMAND] # Describe subcommands or one specific subcommand
80
- jobs:create # Create a job
81
- jobs:list # List all available jobs
82
- jobs:run # Start a job
83
- jobs:show # Show details about a job
84
- ```
93
+ # Install the imessage plugin
94
+ $ chronicle-etl plugins:install imessage
85
95
 
86
- ### Running a job
96
+ # List installed plugins
97
+ $ chronicle-etl plugins:list
87
98
 
99
+ # Uninstall a plugin
100
+ $ chronicle-etl plugins:uninstall NAME
88
101
  ```
89
- Usage:
90
- chronicle-etl jobs:run
91
102
 
92
- Options:
93
- [--log-level=LOG_LEVEL] # Log level (debug, info, warn, error, fatal)
94
- # Default: info
95
- -v, [--verbose], [--no-verbose] # Set log level to verbose
96
- [--dry-run], [--no-dry-run] # Only run the extraction and transform steps, not the loading
97
- -e, [--extractor=extractor-name] # Extractor class. Default: stdin
98
- [--extractor-opts=key:value] # Extractor options
99
- -t, [--transformer=transformer-name] # Transformer class. Default: null
100
- [--transformer-opts=key:value] # Transformer options
101
- -l, [--loader=loader-name] # Loader class. Default: stdout
102
- [--loader-opts=key:value] # Loader options
103
- -j, [--name=NAME] # Job configuration name
104
-
105
-
106
- Runs an ETL job
103
+ A few dozen importers exist [in my Memex project](https://hyfen.net/memex/) and they’re being ported over to the Chronicle system. This table shows what’s available now and what’s coming. Rows are sorted in very rough order of priority.
104
+
105
+ If you want to work together on a connector, please [get in touch](#get-in-touch)!
106
+
107
+ | Name | Description | Availability |
108
+ |-----------------------------------------------------------------|---------------------------------------------------------------------------------------------|----------------------------------|
109
+ | [imessage](https://github.com/chronicle-app/chronicle-imessage) | iMessage messages and attachments | Available |
110
+ | [shell](https://github.com/chronicle-app/chronicle-shell) | Shell command history | Available (zsh support pending) |
111
+ | [email](https://github.com/chronicle-app/chronicle-email) | Emails and attachments from IMAP or .mbox files | Available (imap support pending) |
112
+ | [pinboard](https://github.com/chronicle-app/chronicle-email) | Bookmarks and tags | Available |
113
+ | [safari](https://github.com/chronicle-app/chronicle-safari) | Browser history from local sqlite db | Available |
114
+ | github | Github user and repo activity | In progress |
115
+ | chrome | Browser history from local sqlite db | Needs porting |
116
+ | whatsapp | Messaging history (via individual chat exports) or reverse-engineered local desktop install | Unstarted |
117
+ | anki | Studying and card creation history | Needs porting |
118
+ | facebook | Messaging and history posting via data export files | Needs porting |
119
+ | twitter | History via API or export data files | Needs porting |
120
+ | foursquare | Location history via API | Needs porting |
121
+ | goodreads | Reading history via export csv (RIP goodreads API) | Needs porting |
122
+ | lastfm | Listening history via API | Needs porting |
123
+ | images | Process image files | Needs porting |
124
+ | arc | Location history from synced icloud backup files | Needs porting |
125
+ | firefox | Browser history from local sqlite db | Needs porting |
126
+ | fitbit | Personal analytics via API | Needs porting |
127
+ | git | Commit history on a repo | Needs porting |
128
+ | google-calendar | Calendar events via API | Needs porting |
129
+ | instagram | Posting and messaging history via export data | Needs porting |
130
+ | shazam | Song tags via reverse-engineered API | Needs porting |
131
+ | slack | Messaging history via API | Need rethinking |
132
+ | strava | Activity history via API | Needs porting |
133
+ | things | Task activity via local sqlite db | Needs porting |
134
+ | bear | Note taking activity via local sqlite db | Needs porting |
135
+ | youtube | Video activity via takeout data and API | Needs porting |
136
+
137
+ ### Writing your own connector
138
+
139
+ Additional connectors are packaged as separate ruby gems. You can view the [iMessage plugin](https://github.com/chronicle-app/chronicle-imessage) for an example.
140
+
141
+ If you want to load a custom connector without creating a gem, you can help by [completing this issue](https://github.com/chronicle-app/chronicle-etl/issues/23).
142
+
143
+ If you want to work together on a connector, please [get in touch](#get-in-touch)!
144
+
145
+ #### Sample custom Extractor class
146
+ ```ruby
147
+ module Chronicle
148
+ module FooService
149
+ class FooExtractor < Chronicle::ETL::Extractor
150
+ register_connector do |r|
151
+ r.identifier = 'foo'
152
+ r.description = 'From foo.com'
153
+ end
154
+
155
+ setting :access_token, required: true
156
+
157
+ def prepare
158
+ @records = # load from somewhere
159
+ end
160
+
161
+ def extract
162
+ @records.each do |record|
163
+ yield Chronicle::ETL::Extraction.new(data: row.to_h)
164
+ end
165
+ end
166
+ end
167
+ end
168
+ end
107
169
  ```
108
170
 
109
171
  ## Development
110
-
111
172
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
112
173
 
113
174
  To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
114
175
 
115
- ## Contributing
176
+ ### Additional development commands
177
+ ```bash
178
+ # run tests
179
+ bundle exec rake spec
180
+
181
+ # generate docs
182
+ bundle exec rake yard
183
+
184
+ # use Guard to run specs automatically
185
+ bundle exec guard
186
+ ```
116
187
 
188
+ ## Get in touch
189
+ - [@hyfen](https://twitter.com/hyfen) on Twitter
190
+ - [@hyfen](https://github.com/hyfen) on Github
191
+ - Email: andrew@hyfen.net
192
+
193
+ ## Contributing
117
194
  Bug reports and pull requests are welcome on GitHub at https://github.com/chronicle-app/chronicle-etl. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
118
195
 
119
196
  ## License
120
-
121
197
  The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
122
198
 
123
199
  ## Code of Conduct
124
-
125
- Everyone interacting in the Chronicle::ETL project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/chronicle-app/chronicle-etl/blob/master/CODE_OF_CONDUCT.md).
200
+ Everyone interacting in the Chronicle::ETL project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/chronicle-app/chronicle-etl/blob/main/CODE_OF_CONDUCT.md).
@@ -47,8 +47,11 @@ Gem::Specification.new do |spec|
47
47
  spec.add_dependency "sequel", "~> 5.35"
48
48
  spec.add_dependency "sqlite3", "~> 1.4"
49
49
  spec.add_dependency "thor", "~> 1.2"
50
+ spec.add_dependency "thor-hollaback", "~> 0.2"
50
51
  spec.add_dependency "tty-progressbar", "~> 0.17"
52
+ spec.add_dependency "tty-spinner"
51
53
  spec.add_dependency "tty-table", "~> 0.11"
54
+ spec.add_dependency "tty-prompt", "~> 0.23"
52
55
 
53
56
  spec.add_development_dependency "bundler", "~> 2.1"
54
57
  spec.add_development_dependency "pry-byebug", "~> 3.9"
@@ -0,0 +1,31 @@
1
+ module Chronicle
2
+ module ETL
3
+ module CLI
4
+ # Base class for CLI commands
5
+ class CLIBase < ::Thor
6
+ no_commands do
7
+ # Shorthand for cli_exit(status: :failure)
8
+ def cli_fail(message: nil, exception: nil)
9
+ cli_exit(status: :failure, message: message, exception: exception)
10
+ end
11
+
12
+ # Exit from CLI
13
+ #
14
+ # @params status Can be eitiher :success or :failure
15
+ # @params message to print
16
+ # @params exception stacktrace if log_level is set to debug
17
+ def cli_exit(status: :success, message: nil, exception: nil)
18
+ exit_code = status == :success ? 0 : 1
19
+ log_level = status == :success ? :info : :fatal
20
+
21
+ message = message.red if status != :success
22
+
23
+ Chronicle::ETL::Logger.debug(exception.full_message) if exception
24
+ Chronicle::ETL::Logger.send(log_level, message) if message
25
+ exit(exit_code)
26
+ end
27
+ end
28
+ end
29
+ end
30
+ end
31
+ end
@@ -8,11 +8,6 @@ module Chronicle
8
8
  default_task 'list'
9
9
  namespace :connectors
10
10
 
11
- desc "install NAME", "Installs connector NAME"
12
- def install(name)
13
- Chronicle::ETL::Registry.install_connector(name)
14
- end
15
-
16
11
  desc "list", "Lists available connectors"
17
12
  # Display all available connectors that chronicle-etl has access to
18
13
  def list
@@ -44,21 +39,19 @@ module Chronicle
44
39
  desc "show PHASE IDENTIFIER", "Show information about a connector"
45
40
  def show(phase, identifier)
46
41
  unless ['extractor', 'transformer', 'loader'].include?(phase)
47
- puts "phase argument must be one of: [extractor, transformer, loader]"
48
- return
42
+ cli_fail(message: "Phase argument must be one of: [extractor, transformer, loader]")
49
43
  end
50
44
 
51
45
  begin
52
46
  connector = Chronicle::ETL::Registry.find_by_phase_and_identifier(phase.to_sym, identifier)
53
- rescue Chronicle::ETL::ConnectorNotAvailableError
54
- puts "Could not find #{phase} #{identifier}"
55
- return
47
+ rescue Chronicle::ETL::ConnectorNotAvailableError, Chronicle::ETL::PluginError => e
48
+ cli_fail(message: "Could not find #{phase} #{identifier}", exception: e)
56
49
  end
57
50
 
58
51
  puts connector.klass.to_s.bold
59
52
  puts " #{connector.descriptive_phrase}"
60
53
  puts
61
- puts "OPTIONS"
54
+ puts "Settings:"
62
55
 
63
56
  headers = ['name', 'default', 'required'].map{ |h| h.to_s.upcase.bold }
64
57
 
@@ -1,4 +1,5 @@
1
1
  require 'pp'
2
+ require 'tty-prompt'
2
3
 
3
4
  module Chronicle
4
5
  module ETL
@@ -10,30 +11,26 @@ module Chronicle
10
11
 
11
12
  class_option :name, aliases: '-j', desc: 'Job configuration name'
12
13
 
13
- class_option :extractor, aliases: '-e', desc: "Extractor class. Default: stdin", banner: 'extractor-name'
14
+ class_option :extractor, aliases: '-e', desc: "Extractor class. Default: stdin", banner: 'NAME'
14
15
  class_option :'extractor-opts', desc: 'Extractor options', type: :hash, default: {}
15
- class_option :transformer, aliases: '-t', desc: 'Transformer class. Default: null', banner: 'transformer-name'
16
+ class_option :transformer, aliases: '-t', desc: 'Transformer class. Default: null', banner: 'NAME'
16
17
  class_option :'transformer-opts', desc: 'Transformer options', type: :hash, default: {}
17
- class_option :loader, aliases: '-l', desc: 'Loader class. Default: stdout', banner: 'loader-name'
18
+ class_option :loader, aliases: '-l', desc: 'Loader class. Default: table', banner: 'NAME'
18
19
  class_option :'loader-opts', desc: 'Loader options', type: :hash, default: {}
19
20
 
20
21
  # This is an array to deal with shell globbing
21
22
  class_option :input, aliases: '-i', desc: 'Input filename or directory', default: [], type: 'array', banner: 'FILENAME'
22
- class_option :since, desc: "Load records SINCE this date. Overrides job's `load_since` configuration option in extractor's options", banner: 'DATE'
23
+ class_option :since, desc: "Load records SINCE this date", banner: 'DATE'
23
24
  class_option :until, desc: "Load records UNTIL this date", banner: 'DATE'
24
25
  class_option :limit, desc: "Only extract the first LIMIT records", banner: 'N'
25
26
 
26
27
  class_option :output, aliases: '-o', desc: 'Output filename', type: 'string'
27
28
  class_option :fields, desc: 'Output only these fields', type: 'array', banner: 'field1 field2 ...'
28
-
29
- class_option :log_level, desc: 'Log level (debug, info, warn, error, fatal)', default: 'info'
30
- class_option :verbose, aliases: '-v', desc: 'Set log level to verbose', type: :boolean
29
+ class_option :header_row, desc: 'Output the header row of tabular output', type: 'boolean'
31
30
 
32
31
  # Thor doesn't like `run` as a command name
33
32
  map run: :start
34
33
  desc "run", "Start a job"
35
- option :log_level, desc: 'Log level (debug, info, warn, error, fatal)', default: 'info'
36
- option :verbose, aliases: '-v', desc: 'Set log level to verbose', type: :boolean
37
34
  option :dry_run, desc: 'Only run the extraction and transform steps, not the loading', type: :boolean
38
35
  long_desc <<-LONG_DESC
39
36
  This will run an ETL job. Each job needs three parts:
@@ -48,25 +45,41 @@ module Chronicle
48
45
  LONG_DESC
49
46
  # Run an ETL job
50
47
  def start
51
- setup_log_level
52
48
  job_definition = build_job_definition(options)
53
- job = Chronicle::ETL::Job.new(job_definition)
54
- runner = Chronicle::ETL::Runner.new(job)
55
- runner.run!
49
+
50
+ if job_definition.plugins_missing?
51
+ missing_plugins = job_definition.errors[:plugins]
52
+ .select { |error| error.is_a?(Chronicle::ETL::PluginLoadError) }
53
+ .map(&:name)
54
+ .uniq
55
+ install_missing_plugins(missing_plugins)
56
+ end
57
+
58
+ run_job(job_definition)
59
+ rescue Chronicle::ETL::JobDefinitionError => e
60
+ cli_fail(message: "Error running job.\n#{job_definition.errors}", exception: e)
56
61
  end
57
62
 
58
63
  desc "create", "Create a job"
59
64
  # Create an ETL job
60
65
  def create
61
66
  job_definition = build_job_definition(options)
67
+ job_definition.validate!
68
+
62
69
  path = File.join('chronicle', 'etl', 'jobs', options[:name])
63
70
  Chronicle::ETL::Config.write(path, job_definition.definition)
71
+ rescue Chronicle::ETL::JobDefinitionError => e
72
+ cli_fail(message: "Job definition error", exception: e)
64
73
  end
65
74
 
66
75
  desc "show", "Show details about a job"
67
76
  # Show an ETL job
68
77
  def show
69
- puts Chronicle::ETL::Job.new(build_job_definition(options))
78
+ job_definition = build_job_definition(options)
79
+ job_definition.validate!
80
+ puts Chronicle::ETL::Job.new(job_definition)
81
+ rescue Chronicle::ETL::JobDefinitionError => e
82
+ cli_fail(message: "Job definition error", exception: e)
70
83
  end
71
84
 
72
85
  desc "list", "List all available jobs"
@@ -86,19 +99,32 @@ LONG_DESC
86
99
 
87
100
  headers = ['name', 'extractor', 'transformer', 'loader'].map { |h| h.upcase.bold }
88
101
 
102
+ puts "Available jobs:"
89
103
  table = TTY::Table.new(headers, job_details)
90
104
  puts table.render(indent: 0, padding: [0, 2])
105
+ rescue Chronicle::ETL::ConfigError => e
106
+ cli_fail(message: "Config error. #{e.message}", exception: e)
91
107
  end
92
108
 
93
109
  private
94
110
 
95
- def setup_log_level
96
- if options[:verbose]
97
- Chronicle::ETL::Logger.log_level = Chronicle::ETL::Logger::DEBUG
98
- elsif options[:log_level]
99
- level = Chronicle::ETL::Logger.const_get(options[:log_level].upcase)
100
- Chronicle::ETL::Logger.log_level = level
101
- end
111
+ def run_job(job_definition)
112
+ job = Chronicle::ETL::Job.new(job_definition)
113
+ runner = Chronicle::ETL::Runner.new(job)
114
+ runner.run!
115
+ end
116
+
117
+ # TODO: probably could merge this with something in cli/plugin
118
+ def install_missing_plugins(missing_plugins)
119
+ prompt = TTY::Prompt.new
120
+ message = "Plugin#{'s' if missing_plugins.count > 1} specified by job not installed.\n"
121
+ message += "Do you want to install "
122
+ message += missing_plugins.map { |name| "chronicle-#{name}".bold}.join(", ")
123
+ message += " and start the job?"
124
+ will_install = prompt.yes?(message)
125
+ cli_fail(message: "Must install #{missing_plugins.join(", ")} plugin to run job") unless will_install
126
+
127
+ Chronicle::ETL::CLI::Plugins.new.install(*missing_plugins)
102
128
  end
103
129
 
104
130
  # Create job definition by reading config file and then overwriting with flag options
@@ -116,7 +142,7 @@ LONG_DESC
116
142
  # Takes flag options and turns them into a runner config
117
143
  def process_flag_options options
118
144
  extractor_options = options[:'extractor-opts'].merge({
119
- filename: (options[:input] if options[:input].any?),
145
+ input: (options[:input] if options[:input].any?),
120
146
  since: options[:since],
121
147
  until: options[:until],
122
148
  limit: options[:limit],
@@ -126,6 +152,7 @@ LONG_DESC
126
152
 
127
153
  loader_options = options[:'loader-opts'].merge({
128
154
  output: options[:output],
155
+ header_row: options[:header_row],
129
156
  fields: options[:fields]
130
157
  }.compact)
131
158
 
@@ -4,7 +4,15 @@ module Chronicle
4
4
  module ETL
5
5
  module CLI
6
6
  # Main entrypoint for CLI app
7
- class Main < ::Thor
7
+ class Main < Chronicle::ETL::CLI::CLIBase
8
+ class_before :set_log_level
9
+ class_before :set_color_output
10
+
11
+ class_option :log_level, desc: 'Log level (debug, info, warn, error, fatal, silent)', default: 'info'
12
+ class_option :verbose, aliases: '-v', desc: 'Set log level to verbose', type: :boolean
13
+ class_option :silent, desc: 'Silence all output', type: :boolean
14
+ class_option :'no-color', desc: 'Disable colour output', type: :boolean
15
+
8
16
  default_task "jobs"
9
17
 
10
18
  desc 'connectors:COMMAND', 'Connectors available for ETL jobs', hide: true
@@ -13,6 +21,9 @@ module Chronicle
13
21
  desc 'jobs:COMMAND', 'Configure and run jobs', hide: true
14
22
  subcommand 'jobs', Jobs
15
23
 
24
+ desc 'plugins:COMMAND', 'Configure plugins', hide: true
25
+ subcommand 'plugins', Plugins
26
+
16
27
  # Entrypoint for the CLI
17
28
  def self.start(given_args = ARGV, config = {})
18
29
  # take a subcommand:command and splits them so Thor knows how to hand off to the subcommand class
@@ -79,6 +90,26 @@ module Chronicle
79
90
  shell.say
80
91
  end
81
92
  end
93
+
94
+ no_commands do
95
+ def testb
96
+ puts "hi"
97
+ end
98
+ def set_color_output
99
+ String.disable_colorization true if options[:'no-color'] || ENV['NO_COLOR']
100
+ end
101
+
102
+ def set_log_level
103
+ if options[:silent]
104
+ Chronicle::ETL::Logger.log_level = Chronicle::ETL::Logger::SILENT
105
+ elsif options[:verbose]
106
+ Chronicle::ETL::Logger.log_level = Chronicle::ETL::Logger::DEBUG
107
+ elsif options[:log_level]
108
+ level = Chronicle::ETL::Logger.const_get(options[:log_level].upcase)
109
+ Chronicle::ETL::Logger.log_level = level
110
+ end
111
+ end
112
+ end
82
113
  end
83
114
  end
84
115
  end