chronicle-etl 0.4.0 → 0.4.3

Files changed (45)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/ruby.yml +2 -2
  3. data/.rubocop.yml +3 -0
  4. data/README.md +156 -81
  5. data/chronicle-etl.gemspec +3 -0
  6. data/lib/chronicle/etl/cli/cli_base.rb +31 -0
  7. data/lib/chronicle/etl/cli/connectors.rb +4 -11
  8. data/lib/chronicle/etl/cli/jobs.rb +49 -22
  9. data/lib/chronicle/etl/cli/main.rb +32 -1
  10. data/lib/chronicle/etl/cli/plugins.rb +62 -0
  11. data/lib/chronicle/etl/cli/subcommand_base.rb +1 -1
  12. data/lib/chronicle/etl/cli.rb +3 -0
  13. data/lib/chronicle/etl/config.rb +7 -4
  14. data/lib/chronicle/etl/configurable.rb +15 -2
  15. data/lib/chronicle/etl/exceptions.rb +29 -2
  16. data/lib/chronicle/etl/extractors/csv_extractor.rb +24 -17
  17. data/lib/chronicle/etl/extractors/extractor.rb +5 -5
  18. data/lib/chronicle/etl/extractors/file_extractor.rb +33 -13
  19. data/lib/chronicle/etl/extractors/helpers/input_reader.rb +76 -0
  20. data/lib/chronicle/etl/extractors/json_extractor.rb +21 -12
  21. data/lib/chronicle/etl/job.rb +7 -1
  22. data/lib/chronicle/etl/job_definition.rb +32 -6
  23. data/lib/chronicle/etl/loaders/csv_loader.rb +35 -8
  24. data/lib/chronicle/etl/loaders/helpers/encoding_helper.rb +18 -0
  25. data/lib/chronicle/etl/loaders/json_loader.rb +44 -0
  26. data/lib/chronicle/etl/loaders/loader.rb +24 -1
  27. data/lib/chronicle/etl/loaders/table_loader.rb +13 -26
  28. data/lib/chronicle/etl/logger.rb +6 -2
  29. data/lib/chronicle/etl/models/base.rb +3 -0
  30. data/lib/chronicle/etl/models/entity.rb +8 -2
  31. data/lib/chronicle/etl/models/raw.rb +26 -0
  32. data/lib/chronicle/etl/registry/connector_registration.rb +5 -0
  33. data/lib/chronicle/etl/registry/plugin_registry.rb +75 -0
  34. data/lib/chronicle/etl/registry/registry.rb +27 -14
  35. data/lib/chronicle/etl/runner.rb +35 -17
  36. data/lib/chronicle/etl/serializers/jsonapi_serializer.rb +6 -0
  37. data/lib/chronicle/etl/serializers/raw_serializer.rb +10 -0
  38. data/lib/chronicle/etl/serializers/serializer.rb +2 -1
  39. data/lib/chronicle/etl/transformers/null_transformer.rb +1 -1
  40. data/lib/chronicle/etl/version.rb +1 -1
  41. data/lib/chronicle/etl.rb +11 -4
  42. metadata +53 -6
  43. data/lib/chronicle/etl/extractors/helpers/filesystem_reader.rb +0 -104
  44. data/lib/chronicle/etl/loaders/stdout_loader.rb +0 -14
  45. data/lib/chronicle/etl/models/generic.rb +0 -23
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 5fd411a9a41a645b85780230c79b09f361e121d0e8ca7f3270ca8eba55a76ca8
4
- data.tar.gz: c09053715910ab4f027fbdc3a5b7d10c042eee962f7fa93c6571ce8359f51009
3
+ metadata.gz: f2b6fdca3723ec52287c070a0dd08d0cfaf825f5e8f46da0d5a34172c0008573
4
+ data.tar.gz: e15181ba7edc1698404af8ff8c05d5367786ea809360393e825ca5ee5eef6c75
5
5
  SHA512:
6
- metadata.gz: 2c9ec14b6c0a51f1c5ec77ee8d9a7f016d16bdc35db5634f9fa5d38aabc30dec201cd4b8bef06a31b86773a0c1cda2d271d7008dcb247a86d956c094919f3c0f
7
- data.tar.gz: 0dca41e1654e5b2b98a148f853492a67126cdac767000b3c5f97c5c8ff88b77464e17a2fab38b72c1f014f3515c911e5f3f391eaf68d64e73dcfcff5d8e6cb6a
6
+ metadata.gz: c5508dcfc5e1367122ebbc191dc60a76cc7b1f088ae9ebd52bae41e07420f54ea7977a35f2b25839933b06d4673e2a120feecbd958efee07e0d313eaa7a5d167
7
+ data.tar.gz: adcb90549af364189c5ae3b0811c039277aba7dc6fbf2fbd6de8b89c572948d0dc00ba8da59db03252f4770737423e62c3a1175ef018742ef8bd7aee14837f63
data/.github/workflows/ruby.yml CHANGED
@@ -9,9 +9,9 @@ name: Ruby
9
9
 
10
10
  on:
11
11
  push:
12
- branches: [ master ]
12
+ branches: [ main ]
13
13
  pull_request:
14
- branches: [ master ]
14
+ branches: [ main ]
15
15
 
16
16
  jobs:
17
17
  test:
data/.rubocop.yml CHANGED
@@ -27,6 +27,9 @@ Style/OpenStructUse:
27
27
  Style/Copyright:
28
28
  Enabled: false
29
29
 
30
+ Style/MissingElse:
31
+ Enabled: false
32
+
30
33
  Style/SymbolArray:
31
34
  EnforcedStyle: brackets
32
35
 
data/README.md CHANGED
@@ -1,125 +1,200 @@
1
- # Chronicle::ETL
1
+ ## A CLI toolkit for extracting and working with your digital history
2
+
3
+ ![chronicle-etl-banner](https://user-images.githubusercontent.com/6291/157330518-0f934c9a-9ec4-43d9-9cc2-12f156d09b37.png)
2
4
 
3
5
  [![Gem Version](https://badge.fury.io/rb/chronicle-etl.svg)](https://badge.fury.io/rb/chronicle-etl) [![Ruby](https://github.com/chronicle-app/chronicle-etl/actions/workflows/ruby.yml/badge.svg)](https://github.com/chronicle-app/chronicle-etl/actions/workflows/ruby.yml)
4
6
 
5
- Chronicle ETL is a utility that helps you archive and processes personal data. You can *extract* it from a variety of sources, *transform* it, and *load* it to an external API, file, or stdout.
7
+ Are you trying to archive your digital history or incorporate it into your own projects? You’ve probably discovered how frustrating it is to get machine-readable access to your own data. While [building a memex](https://hyfen.net/memex/), I learned first-hand what great efforts must be made before you can begin using the data in interesting ways.
6
8
 
7
- This tool is an adaptation of Andrew Louis's experimental [Memex project](https://hyfen.net/memex) and the dozens of existing importers are being migrated to Chronicle.
9
+ If you don’t want to spend all your time writing scrapers, reverse-engineering APIs, or parsing takeout data, this project is for you! (*If you do enjoy these things, please see the [open issues](https://github.com/chronicle-app/chronicle-etl/issues).*)
8
10
 
9
- ## Installation
11
+ **`chronicle-etl` is a CLI tool that gives you a unified interface for accessing your personal data.** It uses the ETL pattern to *extract* data from a source (e.g. your local browser history, a directory of images, goodreads.com reading history), *transform* it (into a given schema), and *load* it to a destination (e.g. a CSV file, JSON, external API).
10
12
 
11
- ```bash
12
- $ gem install chronicle-etl
13
+ ## What does `chronicle-etl` give you?
14
+ * **CLI tool for working with personal data**. You can monitor progress of exports, manipulate the output, set up recurring jobs, manage credentials, and more.
15
+ * **Plugins for many third-party providers**. A plugin system allows you to access data from third-party providers and hook it into the shared CLI infrastructure.
16
+ * **A common, opinionated schema**: You can normalize different datasets into a single schema so that, for example, all your iMessages and emails are stored in a common schema. Don’t want to use the schema? `chronicle-etl` always allows you to fall back on working with the raw extraction data.
17
+
18
+ ## Installation
19
+ ```sh
20
+ # Install chronicle-etl
21
+ gem install chronicle-etl
13
22
  ```
14
23
 
15
- ## Usage
24
+ After installation, the `chronicle-etl` command will be available in your shell. Homebrew support [is coming soon](https://github.com/chronicle-app/chronicle-etl/issues/13).
16
25
 
17
- After installing the gem, `chronicle-etl` is available to run in your shell.
26
+ ## Basic usage and running jobs
18
27
 
19
- ```bash
20
- # read test.csv and display it as a table
21
- $ chronicle-etl jobs:run --extractor csv --extractor-opts filename:test.csv --loader table
28
+ ```sh
29
+ # Display help
30
+ $ chronicle-etl help
22
31
 
23
- # Display help for the jobs:run command
24
- $ chronicle-etl jobs help run
32
+ # Basic job usage
33
+ $ chronicle-etl --extractor NAME --transformer NAME --loader NAME
34
+
35
+ # Read data.csv and display it to stdout as a table
36
+ $ chronicle-etl --extractor csv --input ./data.csv --loader table
25
37
  ```
26
38
 
27
- ## Connectors
39
+ ### Common options
40
+ ```sh
41
+ Options:
42
+ -j, [--name=NAME] # Job configuration name
43
+ -e, [--extractor=EXTRACTOR-NAME] # Extractor class. Default: stdin
44
+ [--extractor-opts=key:value] # Extractor options
45
+ -t, [--transformer=TRANSFORMER-NAME] # Transformer class. Default: null
46
+ [--transformer-opts=key:value] # Transformer options
47
+ -l, [--loader=LOADER-NAME] # Loader class. Default: table
48
+ [--loader-opts=key:value] # Loader options
49
+ -i, [--input=FILENAME] # Input filename or directory
50
+ [--since=DATE] # Load records SINCE this date. Overrides job's `load_since` configuration option in extractor's options
51
+ [--until=DATE] # Load records UNTIL this date
52
+ [--limit=N] # Only extract the first LIMIT records
53
+ -o, [--output=OUTPUT] # Output filename
54
+ [--fields=field1 field2 ...] # Output only these fields
55
+ [--log-level=LOG_LEVEL] # Log level (debug, info, warn, error, fatal)
56
+ # Default: info
57
+ -v, [--verbose], [--no-verbose] # Set log level to verbose
58
+ [--silent], [--no-silent] # Silence all output
59
+ ```
28
60
 
61
+ ## Connectors
29
62
  Connectors are available to read, process, and load data from different formats or external services.
30
63
 
31
- ```bash
64
+ ```sh
32
65
  # List all available connectors
33
66
  $ chronicle-etl connectors:list
34
-
35
- # Install a connector
36
- $ chronicle-etl connectors:install imessage
37
67
  ```
38
68
 
39
- Built in connectors:
40
-
41
- ### Extractors
42
- - `stdin` - (default) Load records from line-separated stdin
43
- - `csv`
44
- - `file` - load from a single file or directory (with a glob pattern)
45
-
46
- ### Transformers
47
- - `null` - (default) Don't do anything
48
-
49
- ### Loaders
50
- - `stdout` - (default) output records to stdout serialized as JSON
51
- - `csv` - Load records to a csv file
52
- - `rest` - Serialize records with [JSONAPI](https://jsonapi.org/) and send to a REST API
53
- - `table` - Output an ascii table of records. Useful for debugging.
69
+ ### Built-in Connectors
70
+ `chronicle-etl` comes with several built-in connectors for common formats and sources.
54
71
 
55
- ### Provider-specific importers
72
+ #### Extractors
73
+ - [`csv`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/extractors/csv_extractor.rb) - Load records from CSV files or stdin
74
+ - [`json`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/extractors/json_extractor.rb) - Load JSON (either [line-separated objects](https://en.wikipedia.org/wiki/JSON_streaming#Line-delimited_JSON) or one object)
75
+ - [`file`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/extractors/file_extractor.rb) - load from a single file or directory (with a glob pattern)
56
76
 
57
- In addition to the built-in importers, importers for third-party platforms are available. They are packaged as individual Ruby gems.
77
+ #### Transformers
78
+ - [`null`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/transformers/null_transformer.rb) - (default) Don’t do anything and pass on raw extraction data
58
79
 
59
- - [email](https://github.com/chronicle-app/chronicle-email). Extractors for `mbox` and other email files
60
- - [shell](https://github.com/chronicle-app/chronicle-shell). Extract shell history from Bash or Zsh`
61
- - [imessage](https://github.com/chronicle-app/chronicle-imessage). Extract iMessage messages from a local macOS installation
80
+ #### Loaders
81
+ - [`table`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/table_loader.rb) - (default) Output an ascii table of records. Useful for exploring data.
82
+ - [`csv`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/csv_loader.rb) - Load records to CSV
83
+ - [`json`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/json_loader.rb) - Load records serialized as JSON
84
+ - [`rest`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/rest_loader.rb) - Serialize records with [JSONAPI](https://jsonapi.org/) and send to a REST API
62
85
 
63
- To install any of these, run `gem install chronicle-PROVIDER`.
86
+ ### Plugins
87
+ Plugins provide access to data from third-party platforms, services, or formats.
64
88
 
65
- If you don't want to use the available rubygem importers, `chronicle-etl` can use `stdin` as an Extractor source (newline separated records). You can also use `stdout` as a loader — transformed records will be outputted separated by newlines.
66
-
67
- I'll be open-sourcing more importers. Please [contact me](mailto:andrew@hyfen.net) to chat about what will be available!
68
-
69
- ## Full commands
89
+ ```bash
90
+ # Install a plugin
91
+ $ chronicle-etl plugins:install NAME
70
92
 
71
- ```
72
- $ chronicle-etl help
73
-
74
- ALL COMMANDS
75
- help # This help menu
76
- connectors help [COMMAND] # Describe subcommands or one specific subcommand
77
- connectors:install NAME # Installs connector NAME
78
- connectors:list # Lists available connectors
79
- jobs help [COMMAND] # Describe subcommands or one specific subcommand
80
- jobs:create # Create a job
81
- jobs:list # List all available jobs
82
- jobs:run # Start a job
83
- jobs:show # Show details about a job
84
- ```
93
+ # Install the imessage plugin
94
+ $ chronicle-etl plugins:install imessage
85
95
 
86
- ### Running a job
96
+ # List installed plugins
97
+ $ chronicle-etl plugins:list
87
98
 
99
+ # Uninstall a plugin
100
+ $ chronicle-etl plugins:uninstall NAME
88
101
  ```
89
- Usage:
90
- chronicle-etl jobs:run
91
102
 
92
- Options:
93
- [--log-level=LOG_LEVEL] # Log level (debug, info, warn, error, fatal)
94
- # Default: info
95
- -v, [--verbose], [--no-verbose] # Set log level to verbose
96
- [--dry-run], [--no-dry-run] # Only run the extraction and transform steps, not the loading
97
- -e, [--extractor=extractor-name] # Extractor class. Default: stdin
98
- [--extractor-opts=key:value] # Extractor options
99
- -t, [--transformer=transformer-name] # Transformer class. Default: null
100
- [--transformer-opts=key:value] # Transformer options
101
- -l, [--loader=loader-name] # Loader class. Default: stdout
102
- [--loader-opts=key:value] # Loader options
103
- -j, [--name=NAME] # Job configuration name
104
-
105
-
106
- Runs an ETL job
103
+ A few dozen importers exist [in my Memex project](https://hyfen.net/memex/) and they’re being ported over to the Chronicle system. This table shows what’s available now and what’s coming. Rows are sorted in very rough order of priority.
104
+
105
+ If you want to work together on a connector, please [get in touch](#get-in-touch)!
106
+
107
+ | Name | Description | Availability |
108
+ |-----------------------------------------------------------------|---------------------------------------------------------------------------------------------|----------------------------------|
109
+ | [imessage](https://github.com/chronicle-app/chronicle-imessage) | iMessage messages and attachments | Available |
110
+ | [shell](https://github.com/chronicle-app/chronicle-shell) | Shell command history | Available (zsh support pending) |
111
+ | [email](https://github.com/chronicle-app/chronicle-email) | Emails and attachments from IMAP or .mbox files | Available (imap support pending) |
112
+ | [pinboard](https://github.com/chronicle-app/chronicle-pinboard) | Bookmarks and tags | Available |
113
+ | [safari](https://github.com/chronicle-app/chronicle-safari) | Browser history from local sqlite db | Available |
114
+ | github | GitHub user and repo activity | In progress |
115
+ | chrome | Browser history from local sqlite db | Needs porting |
116
+ | whatsapp | Messaging history (via individual chat exports) or reverse-engineered local desktop install | Unstarted |
117
+ | anki | Studying and card creation history | Needs porting |
118
+ | facebook | Messaging and posting history via data export files | Needs porting |
119
+ | twitter | History via API or export data files | Needs porting |
120
+ | foursquare | Location history via API | Needs porting |
121
+ | goodreads | Reading history via export csv (RIP goodreads API) | Needs porting |
122
+ | lastfm | Listening history via API | Needs porting |
123
+ | images | Process image files | Needs porting |
124
+ | arc | Location history from synced icloud backup files | Needs porting |
125
+ | firefox | Browser history from local sqlite db | Needs porting |
126
+ | fitbit | Personal analytics via API | Needs porting |
127
+ | git | Commit history on a repo | Needs porting |
128
+ | google-calendar | Calendar events via API | Needs porting |
129
+ | instagram | Posting and messaging history via export data | Needs porting |
130
+ | shazam | Song tags via reverse-engineered API | Needs porting |
131
+ | slack | Messaging history via API | Needs rethinking |
132
+ | strava | Activity history via API | Needs porting |
133
+ | things | Task activity via local sqlite db | Needs porting |
134
+ | bear | Note taking activity via local sqlite db | Needs porting |
135
+ | youtube | Video activity via takeout data and API | Needs porting |
136
+
137
+ ### Writing your own connector
138
+
139
+ Additional connectors are packaged as separate ruby gems. You can view the [iMessage plugin](https://github.com/chronicle-app/chronicle-imessage) for an example.
140
+
141
+ If you want to load a custom connector without creating a gem, you can help by [completing this issue](https://github.com/chronicle-app/chronicle-etl/issues/23).
142
+
143
+ If you want to work together on a connector, please [get in touch](#get-in-touch)!
144
+
145
+ #### Sample custom Extractor class
146
+ ```ruby
147
+ module Chronicle
148
+ module FooService
149
+ class FooExtractor < Chronicle::ETL::Extractor
150
+ register_connector do |r|
151
+ r.identifier = 'foo'
152
+ r.description = 'From foo.com'
153
+ end
154
+
155
+ setting :access_token, required: true
156
+
157
+ def prepare
158
+ @records = [] # placeholder: load records from somewhere
159
+ end
160
+
161
+ def extract
162
+ @records.each do |record|
163
+ yield Chronicle::ETL::Extraction.new(data: record.to_h)
164
+ end
165
+ end
166
+ end
167
+ end
168
+ end
107
169
  ```
108
170
 
109
171
  ## Development
110
-
111
172
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
112
173
 
113
174
  To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
114
175
 
115
- ## Contributing
176
+ ### Additional development commands
177
+ ```bash
178
+ # run tests
179
+ bundle exec rake spec
180
+
181
+ # generate docs
182
+ bundle exec rake yard
183
+
184
+ # use Guard to run specs automatically
185
+ bundle exec guard
186
+ ```
116
187
 
188
+ ## Get in touch
189
+ - [@hyfen](https://twitter.com/hyfen) on Twitter
190
+ - [@hyfen](https://github.com/hyfen) on GitHub
191
+ - Email: andrew@hyfen.net
192
+
193
+ ## Contributing
117
194
  Bug reports and pull requests are welcome on GitHub at https://github.com/chronicle-app/chronicle-etl. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
118
195
 
119
196
  ## License
120
-
121
197
  The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
122
198
 
123
199
  ## Code of Conduct
124
-
125
- Everyone interacting in the Chronicle::ETL project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/chronicle-app/chronicle-etl/blob/master/CODE_OF_CONDUCT.md).
200
+ Everyone interacting in the Chronicle::ETL project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/chronicle-app/chronicle-etl/blob/main/CODE_OF_CONDUCT.md).
data/chronicle-etl.gemspec CHANGED
@@ -47,8 +47,11 @@ Gem::Specification.new do |spec|
47
47
  spec.add_dependency "sequel", "~> 5.35"
48
48
  spec.add_dependency "sqlite3", "~> 1.4"
49
49
  spec.add_dependency "thor", "~> 1.2"
50
+ spec.add_dependency "thor-hollaback", "~> 0.2"
50
51
  spec.add_dependency "tty-progressbar", "~> 0.17"
52
+ spec.add_dependency "tty-spinner"
51
53
  spec.add_dependency "tty-table", "~> 0.11"
54
+ spec.add_dependency "tty-prompt", "~> 0.23"
52
55
 
53
56
  spec.add_development_dependency "bundler", "~> 2.1"
54
57
  spec.add_development_dependency "pry-byebug", "~> 3.9"
data/lib/chronicle/etl/cli/cli_base.rb ADDED
@@ -0,0 +1,31 @@
1
+ module Chronicle
2
+ module ETL
3
+ module CLI
4
+ # Base class for CLI commands
5
+ class CLIBase < ::Thor
6
+ no_commands do
7
+ # Shorthand for cli_exit(status: :failure)
8
+ def cli_fail(message: nil, exception: nil)
9
+ cli_exit(status: :failure, message: message, exception: exception)
10
+ end
11
+
12
+ # Exit from CLI
13
+ #
14
+ # @param status Can be either :success or :failure
15
+ # @param message Message to print
16
+ # @param exception Exception whose stacktrace is logged if log_level is set to debug
17
+ def cli_exit(status: :success, message: nil, exception: nil)
18
+ exit_code = status == :success ? 0 : 1
19
+ log_level = status == :success ? :info : :fatal
20
+
21
+ message = message.red if message && status != :success
22
+
23
+ Chronicle::ETL::Logger.debug(exception.full_message) if exception
24
+ Chronicle::ETL::Logger.send(log_level, message) if message
25
+ exit(exit_code)
26
+ end
27
+ end
28
+ end
29
+ end
30
+ end
31
+ end
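Note on `cli_exit`: the helper added above maps a status symbol to a process exit code and a log level before logging and exiting. The following is a minimal standalone sketch of that mapping, without Thor or `Chronicle::ETL::Logger`. The method name `exit_details` is hypothetical and exists only for illustration; it returns the values instead of calling `exit`, so the behaviour can be inspected.

```ruby
# Hypothetical helper (not part of the gem) mirroring the status handling
# in cli_exit. It returns the computed values rather than logging and exiting.
def exit_details(status: :success, message: nil)
  {
    exit_code: status == :success ? 0 : 1,          # shell convention: 0 means success
    log_level: status == :success ? :info : :fatal, # failures are logged as fatal
    message: message
  }
end

p exit_details(status: :failure, message: "Could not find extractor foo")
p exit_details
```

Any status other than `:success` is treated as a failure, which matches the ternaries in the diff.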
data/lib/chronicle/etl/cli/connectors.rb CHANGED
@@ -8,11 +8,6 @@ module Chronicle
8
8
  default_task 'list'
9
9
  namespace :connectors
10
10
 
11
- desc "install NAME", "Installs connector NAME"
12
- def install(name)
13
- Chronicle::ETL::Registry.install_connector(name)
14
- end
15
-
16
11
  desc "list", "Lists available connectors"
17
12
  # Display all available connectors that chronicle-etl has access to
18
13
  def list
@@ -44,21 +39,19 @@ module Chronicle
44
39
  desc "show PHASE IDENTIFIER", "Show information about a connector"
45
40
  def show(phase, identifier)
46
41
  unless ['extractor', 'transformer', 'loader'].include?(phase)
47
- puts "phase argument must be one of: [extractor, transformer, loader]"
48
- return
42
+ cli_fail(message: "Phase argument must be one of: [extractor, transformer, loader]")
49
43
  end
50
44
 
51
45
  begin
52
46
  connector = Chronicle::ETL::Registry.find_by_phase_and_identifier(phase.to_sym, identifier)
53
- rescue Chronicle::ETL::ConnectorNotAvailableError
54
- puts "Could not find #{phase} #{identifier}"
55
- return
47
+ rescue Chronicle::ETL::ConnectorNotAvailableError, Chronicle::ETL::PluginError => e
48
+ cli_fail(message: "Could not find #{phase} #{identifier}", exception: e)
56
49
  end
57
50
 
58
51
  puts connector.klass.to_s.bold
59
52
  puts " #{connector.descriptive_phrase}"
60
53
  puts
61
- puts "OPTIONS"
54
+ puts "Settings:"
62
55
 
63
56
  headers = ['name', 'default', 'required'].map{ |h| h.to_s.upcase.bold }
64
57
 
data/lib/chronicle/etl/cli/jobs.rb CHANGED
@@ -1,4 +1,5 @@
1
1
  require 'pp'
2
+ require 'tty-prompt'
2
3
 
3
4
  module Chronicle
4
5
  module ETL
@@ -10,30 +11,26 @@ module Chronicle
10
11
 
11
12
  class_option :name, aliases: '-j', desc: 'Job configuration name'
12
13
 
13
- class_option :extractor, aliases: '-e', desc: "Extractor class. Default: stdin", banner: 'extractor-name'
14
+ class_option :extractor, aliases: '-e', desc: "Extractor class. Default: stdin", banner: 'NAME'
14
15
  class_option :'extractor-opts', desc: 'Extractor options', type: :hash, default: {}
15
- class_option :transformer, aliases: '-t', desc: 'Transformer class. Default: null', banner: 'transformer-name'
16
+ class_option :transformer, aliases: '-t', desc: 'Transformer class. Default: null', banner: 'NAME'
16
17
  class_option :'transformer-opts', desc: 'Transformer options', type: :hash, default: {}
17
- class_option :loader, aliases: '-l', desc: 'Loader class. Default: stdout', banner: 'loader-name'
18
+ class_option :loader, aliases: '-l', desc: 'Loader class. Default: table', banner: 'NAME'
18
19
  class_option :'loader-opts', desc: 'Loader options', type: :hash, default: {}
19
20
 
20
21
  # This is an array to deal with shell globbing
21
22
  class_option :input, aliases: '-i', desc: 'Input filename or directory', default: [], type: 'array', banner: 'FILENAME'
22
- class_option :since, desc: "Load records SINCE this date. Overrides job's `load_since` configuration option in extractor's options", banner: 'DATE'
23
+ class_option :since, desc: "Load records SINCE this date", banner: 'DATE'
23
24
  class_option :until, desc: "Load records UNTIL this date", banner: 'DATE'
24
25
  class_option :limit, desc: "Only extract the first LIMIT records", banner: 'N'
25
26
 
26
27
  class_option :output, aliases: '-o', desc: 'Output filename', type: 'string'
27
28
  class_option :fields, desc: 'Output only these fields', type: 'array', banner: 'field1 field2 ...'
28
-
29
- class_option :log_level, desc: 'Log level (debug, info, warn, error, fatal)', default: 'info'
30
- class_option :verbose, aliases: '-v', desc: 'Set log level to verbose', type: :boolean
29
+ class_option :header_row, desc: 'Output the header row of tabular output', type: 'boolean'
31
30
 
32
31
  # Thor doesn't like `run` as a command name
33
32
  map run: :start
34
33
  desc "run", "Start a job"
35
- option :log_level, desc: 'Log level (debug, info, warn, error, fatal)', default: 'info'
36
- option :verbose, aliases: '-v', desc: 'Set log level to verbose', type: :boolean
37
34
  option :dry_run, desc: 'Only run the extraction and transform steps, not the loading', type: :boolean
38
35
  long_desc <<-LONG_DESC
39
36
  This will run an ETL job. Each job needs three parts:
@@ -48,25 +45,41 @@ module Chronicle
48
45
  LONG_DESC
49
46
  # Run an ETL job
50
47
  def start
51
- setup_log_level
52
48
  job_definition = build_job_definition(options)
53
- job = Chronicle::ETL::Job.new(job_definition)
54
- runner = Chronicle::ETL::Runner.new(job)
55
- runner.run!
49
+
50
+ if job_definition.plugins_missing?
51
+ missing_plugins = job_definition.errors[:plugins]
52
+ .select { |error| error.is_a?(Chronicle::ETL::PluginLoadError) }
53
+ .map(&:name)
54
+ .uniq
55
+ install_missing_plugins(missing_plugins)
56
+ end
57
+
58
+ run_job(job_definition)
59
+ rescue Chronicle::ETL::JobDefinitionError => e
60
+ cli_fail(message: "Error running job.\n#{job_definition.errors}", exception: e)
56
61
  end
57
62
 
58
63
  desc "create", "Create a job"
59
64
  # Create an ETL job
60
65
  def create
61
66
  job_definition = build_job_definition(options)
67
+ job_definition.validate!
68
+
62
69
  path = File.join('chronicle', 'etl', 'jobs', options[:name])
63
70
  Chronicle::ETL::Config.write(path, job_definition.definition)
71
+ rescue Chronicle::ETL::JobDefinitionError => e
72
+ cli_fail(message: "Job definition error", exception: e)
64
73
  end
65
74
 
66
75
  desc "show", "Show details about a job"
67
76
  # Show an ETL job
68
77
  def show
69
- puts Chronicle::ETL::Job.new(build_job_definition(options))
78
+ job_definition = build_job_definition(options)
79
+ job_definition.validate!
80
+ puts Chronicle::ETL::Job.new(job_definition)
81
+ rescue Chronicle::ETL::JobDefinitionError => e
82
+ cli_fail(message: "Job definition error", exception: e)
70
83
  end
71
84
 
72
85
  desc "list", "List all available jobs"
@@ -86,19 +99,32 @@ LONG_DESC
86
99
 
87
100
  headers = ['name', 'extractor', 'transformer', 'loader'].map { |h| h.upcase.bold }
88
101
 
102
+ puts "Available jobs:"
89
103
  table = TTY::Table.new(headers, job_details)
90
104
  puts table.render(indent: 0, padding: [0, 2])
105
+ rescue Chronicle::ETL::ConfigError => e
106
+ cli_fail(message: "Config error. #{e.message}", exception: e)
91
107
  end
92
108
 
93
109
  private
94
110
 
95
- def setup_log_level
96
- if options[:verbose]
97
- Chronicle::ETL::Logger.log_level = Chronicle::ETL::Logger::DEBUG
98
- elsif options[:log_level]
99
- level = Chronicle::ETL::Logger.const_get(options[:log_level].upcase)
100
- Chronicle::ETL::Logger.log_level = level
101
- end
111
+ def run_job(job_definition)
112
+ job = Chronicle::ETL::Job.new(job_definition)
113
+ runner = Chronicle::ETL::Runner.new(job)
114
+ runner.run!
115
+ end
116
+
117
+ # TODO: probably could merge this with something in cli/plugin
118
+ def install_missing_plugins(missing_plugins)
119
+ prompt = TTY::Prompt.new
120
+ message = "Plugin#{'s' if missing_plugins.count > 1} specified by job not installed.\n"
121
+ message += "Do you want to install "
122
+ message += missing_plugins.map { |name| "chronicle-#{name}".bold}.join(", ")
123
+ message += " and start the job?"
124
+ will_install = prompt.yes?(message)
125
+ cli_fail(message: "Must install #{missing_plugins.join(", ")} plugin to run job") unless will_install
126
+
127
+ Chronicle::ETL::CLI::Plugins.new.install(*missing_plugins)
102
128
  end
103
129
 
104
130
  # Create job definition by reading config file and then overwriting with flag options
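Note on the missing-plugin check in `start`: the chain `select` / `map(&:name)` / `uniq` narrows the job definition's plugin errors down to a de-duplicated list of plugin names to offer for installation. Below is a rough standalone sketch of that chain. The error classes are hypothetical stand-ins (the real ones live in `exceptions.rb`, which is not shown in this part of the diff), and `PluginConflictError` is invented purely to show that other error types are filtered out.

```ruby
# Hypothetical stand-ins for the gem's plugin error classes.
class PluginError < StandardError
  attr_reader :name

  def initialize(name)
    super("Plugin error: #{name}")
    @name = name
  end
end

class PluginLoadError < PluginError; end
class PluginConflictError < PluginError; end # invented for this sketch

errors = {
  plugins: [
    PluginLoadError.new('imessage'),
    PluginLoadError.new('imessage'),   # same plugin referenced by two phases
    PluginConflictError.new('shell')   # not a load error, so it is skipped
  ]
}

missing_plugins = errors[:plugins]
  .select { |error| error.is_a?(PluginLoadError) }
  .map(&:name)
  .uniq

p missing_plugins
```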
@@ -116,7 +142,7 @@ LONG_DESC
116
142
  # Takes flag options and turns them into a runner config
117
143
  def process_flag_options options
118
144
  extractor_options = options[:'extractor-opts'].merge({
119
- filename: (options[:input] if options[:input].any?),
145
+ input: (options[:input] if options[:input].any?),
120
146
  since: options[:since],
121
147
  until: options[:until],
122
148
  limit: options[:limit],
@@ -126,6 +152,7 @@ LONG_DESC
126
152
 
127
153
  loader_options = options[:'loader-opts'].merge({
128
154
  output: options[:output],
155
+ header_row: options[:header_row],
129
156
  fields: options[:fields]
130
157
  }.compact)
131
158
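Note on `process_flag_options`: flag values are merged over the `--loader-opts` hash only after `compact` has dropped the flags that were not passed (and are therefore `nil`). A small sketch of that behaviour follows, using a plain hash as a hypothetical stand-in for Thor's options object; the `separator` key is illustrative and not taken from the gem.

```ruby
# Hypothetical stand-in for the Thor options hash; key names follow the diff.
options = {
  'loader-opts': { separator: ';' }, # illustrative loader option
  output: 'out.csv',
  header_row: nil,                   # flag not passed
  fields: nil                        # flag not passed
}

# Hash#compact removes nil values only, so unset flags never reach the loader.
loader_options = options[:'loader-opts'].merge({
  output: options[:output],
  header_row: options[:header_row],
  fields: options[:fields]
}.compact)

p loader_options
```

Because `Hash#compact` removes only `nil`, an explicit `false` (for example from a negated boolean flag) would still be passed through to the loader.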
 
data/lib/chronicle/etl/cli/main.rb CHANGED
@@ -4,7 +4,15 @@ module Chronicle
4
4
  module ETL
5
5
  module CLI
6
6
  # Main entrypoint for CLI app
7
- class Main < ::Thor
7
+ class Main < Chronicle::ETL::CLI::CLIBase
8
+ class_before :set_log_level
9
+ class_before :set_color_output
10
+
11
+ class_option :log_level, desc: 'Log level (debug, info, warn, error, fatal, silent)', default: 'info'
12
+ class_option :verbose, aliases: '-v', desc: 'Set log level to verbose', type: :boolean
13
+ class_option :silent, desc: 'Silence all output', type: :boolean
14
+ class_option :'no-color', desc: 'Disable colour output', type: :boolean
15
+
8
16
  default_task "jobs"
9
17
 
10
18
  desc 'connectors:COMMAND', 'Connectors available for ETL jobs', hide: true
@@ -13,6 +21,9 @@ module Chronicle
13
21
  desc 'jobs:COMMAND', 'Configure and run jobs', hide: true
14
22
  subcommand 'jobs', Jobs
15
23
 
24
+ desc 'plugins:COMMAND', 'Configure plugins', hide: true
25
+ subcommand 'plugins', Plugins
26
+
16
27
  # Entrypoint for the CLI
17
28
  def self.start(given_args = ARGV, config = {})
18
29
  # take a subcommand:command and splits them so Thor knows how to hand off to the subcommand class
@@ -79,6 +90,26 @@ module Chronicle
79
90
  shell.say
80
91
  end
81
92
  end
93
+
94
+ no_commands do
95
+ def testb
96
+ puts "hi"
97
+ end
98
+ def set_color_output
99
+ String.disable_colorization true if options[:'no-color'] || ENV['NO_COLOR']
100
+ end
101
+
102
+ def set_log_level
103
+ if options[:silent]
104
+ Chronicle::ETL::Logger.log_level = Chronicle::ETL::Logger::SILENT
105
+ elsif options[:verbose]
106
+ Chronicle::ETL::Logger.log_level = Chronicle::ETL::Logger::DEBUG
107
+ elsif options[:log_level]
108
+ level = Chronicle::ETL::Logger.const_get(options[:log_level].upcase)
109
+ Chronicle::ETL::Logger.log_level = level
110
+ end
111
+ end
112
+ end
82
113
  end
83
114
  end
84
115
  end
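Note on `set_log_level`: the `if`/`elsif` chain gives `--silent` precedence over `--verbose`, which in turn takes precedence over `--log-level`. The sketch below restates that precedence as a pure function. `resolve_log_level` is a hypothetical name, and it returns symbols rather than the `Chronicle::ETL::Logger` constants used in the diff.

```ruby
# Hypothetical pure function mirroring the precedence in set_log_level.
# Returns a symbol instead of assigning a Logger constant.
def resolve_log_level(options)
  if options[:silent]
    :silent
  elsif options[:verbose]
    :debug
  elsif options[:log_level]
    options[:log_level].downcase.to_sym
  end
end

p resolve_log_level(silent: true, verbose: true, log_level: 'warn')
p resolve_log_level(verbose: true, log_level: 'warn')
p resolve_log_level(log_level: 'warn')
```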