chronicle-etl 0.4.4 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 2f035ef95ebae675973ce505c71345c0c2da640b20a3e88050f4c88c76caf656
-  data.tar.gz: '0486e4ce5bfdb85ad6ccb5a792ac7aa5a897afecf839c759bb78a2f33136d34e'
+  metadata.gz: 951fca4c6238d773ec8bc2b9ea474a0cffdabf0c2f5d0c925f78b91b35836224
+  data.tar.gz: 908a7f01fb215cca9936f072b71315c3b62e0d00b3c8f7ffd938682a4cabe42c
 SHA512:
-  metadata.gz: f9a1ba3cb4a9abd3bc8a499012b3456b1a2b4cf1f55bed1213f0b1baa6ea96d0ad6e54a470425fa5aa4961061630095218a31f64ef4a39bea15c547219f9a7a8
-  data.tar.gz: d82ff59fd2875d55b079b7814b6a028f98f80f17d3ae2bb3291e5ae6cfb7e1b06f571e16fc73c83dea41d7682f24eb9b7ee3fa6ae7cc709ede57e12011e6a0be
+  metadata.gz: 28fc97935e5bd9538877a2057f3201170fdb1eb574385ae6d94901b21abfa5f923618d5fb2caf94395503ec70c0052b607a939b363f27630aaca26df6ca93722
+  data.tar.gz: 0b8e4dedb79e6cbd23487e2c4482d9a8ad9d1653e015593e6b83cac854d94a6cd4702862eebb11ec9f41e63b774eb6f41db929a1ceb12055af6cb08209a6b8eb
data/.rubocop.yml CHANGED
@@ -11,6 +11,9 @@ Style/StringLiterals:
 Layout/MultilineAssignmentLayout:
   Enabled: false
 
+Layout/MultilineMethodCallIndentation:
+  EnforcedStyle: indented
+
 Layout/RedundantLineBreak:
   Enabled: false
 
data/README.md CHANGED
@@ -34,28 +34,41 @@ $ chronicle-etl --extractor NAME --transformer NAME --loader NAME
34
34
 
35
35
  # Read data.csv and display it to stdout as a table
36
36
  $ chronicle-etl --extractor csv --input ./data.csv --loader table
37
+
38
+ # Retrieve shell commands run in the last 5 hours
39
+ $ chronicle-etl -e shell --since 5h
40
+
41
+ # Get email senders from an .mbox email archive file
42
+ $ chronicle-etl --extractor email:mbox -i sample-email-archive.mbox -t email --fields actor.slug
43
+
44
+ # Save an access token as a secret and use it in a job
45
+ $ chronicle-etl secrets:set pinboard access_token username:foo123
46
+ $ chronicle-etl secrets:list # Verify that it's available
47
+ $ chronicle-etl -e pinboard --since 1mo # Used automatically based on plugin name
37
48
  ```
38
49
 
39
50
  ### Common options
40
51
  ```sh
41
52
  Options:
42
- -j, [--name=NAME] # Job configuration name
43
- -e, [--extractor=EXTRACTOR-NAME] # Extractor class. Default: stdin
44
- [--extractor-opts=key:value] # Extractor options
45
- -t, [--transformer=TRANFORMER-NAME] # Transformer class. Default: null
46
- [--transformer-opts=key:value] # Transformer options
47
- -l, [--loader=LOADER-NAME] # Loader class. Default: stdout
48
- [--loader-opts=key:value] # Loader options
49
- -i, [--input=FILENAME] # Input filename or directory
50
- [--since=DATE] # Load records SINCE this date. Overrides job's `load_since` configuration option in extractor's options
51
- [--until=DATE] # Load records UNTIL this date
52
- [--limit=N] # Only extract the first LIMIT records
53
- -o, [--output=OUTPUT] # Output filename
54
- [--fields=field1 field2 ...] # Output only these fields
55
- [--log-level=LOG_LEVEL] # Log level (debug, info, warn, error, fatal)
56
- # Default: info
57
- -v, [--verbose], [--no-verbose] # Set log level to verbose
58
- [--silent], [--no-silent] # Silence all output
53
+ -j, [--name=NAME] # Job configuration name
54
+ -e, [--extractor=NAME] # Extractor class. Default: stdin
55
+ [--extractor-opts=key:value] # Extractor options
56
+ -t, [--transformer=NAME] # Transformer class. Default: null
57
+ [--transformer-opts=key:value] # Transformer options
58
+ -l, [--loader=NAME] # Loader class. Default: table
59
+ [--loader-opts=key:value] # Loader options
60
+ -i, [--input=FILENAME] # Input filename or directory
61
+ [--since=DATE] # Load records SINCE this date (or fuzzy time duration)
62
+ [--until=DATE] # Load records UNTIL this date (or fuzzy time duration)
63
+ [--limit=N] # Only extract the first LIMIT records
64
+ -o, [--output=OUTPUT] # Output filename
65
+ [--fields=field1 field2 ...] # Output only these fields
66
+ [--header-row], [--no-header-row] # Output the header row of tabular output
67
+
68
+ [--log-level=LOG_LEVEL] # Log level (debug, info, warn, error, fatal)
69
+ # Default: info
70
+ -v, [--verbose], [--no-verbose] # Set log level to verbose
71
+ [--silent], [--no-silent] # Silence all output
59
72
  ```
60
73
 
61
74
  ## Connectors
@@ -83,58 +96,50 @@ $ chronicle-etl connectors:list
83
96
  - [`json`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/json_loader.rb) - Load records serialized as JSON
84
97
  - [`rest`](https://github.com/chronicle-app/chronicle-etl/blob/main/lib/chronicle/etl/loaders/rest_loader.rb) - Serialize records with [JSONAPI](https://jsonapi.org/) and send to a REST API
85
98
 
86
- ### Plugins
87
- Plugins provide access to data from third-party platforms, services, or formats.
99
+ ## Chronicle Plugins
100
+ Plugins provide access to data from third-party platforms, services, or formats. Plugins are packaged as separate rubygems and can be installed through `$ gem install` or through the CLI itself.
101
+
102
+ ### Plugin usage
88
103
 
89
104
  ```bash
90
105
  # Install a plugin
91
106
  $ chronicle-etl plugins:install NAME
92
107
 
93
- # Install the imessage plugin
94
- $ chronicle-etl plugins:install imessage
95
-
96
108
  # List installed plugins
97
109
  $ chronicle-etl plugins:list
98
110
 
111
+ # Use a plugin
112
+ $ chronicle-etl plugins:install shell
113
+ $ chronicle-etl --extractor shell:history --limit 10
114
+
99
115
  # Uninstall a plugin
100
116
  $ chronicle-etl plugins:uninstall NAME
101
117
  ```
102
118
 
103
- A few dozen importers exist [in my Memex project](https://hyfen.net/memex/) and they’re being ported over to the Chronicle system. This table shows what’s available now and what’s coming. Rows are sorted in very rough order of priority.
119
+ ### Status
120
+
121
+ A few dozen importers exist [in my Memex project](https://hyfen.net/memex/) and I'm porting them over to the Chronicle system. The [Chronicle Plugin Tracker](https://github.com/orgs/chronicle-app/projects/1/views/1) lets you keep track of what's available and what's coming soon.
104
122
 
105
- If you want to work together on a connector, please [get in touch](#get-in-touch)!
123
+ If you don't see a plugin for a third-party provider or data source that you're interested in using with `chronicle-etl`, [please open an issue](https://github.com/chronicle-app/chronicle-etl/issues/new). If you want to work together on a plugin, please [get in touch](#get-in-touch)!
124
+
125
+ #### Currently available
106
126
 
107
127
  | Name | Description | Availability |
108
128
  |-----------------------------------------------------------------|---------------------------------------------------------------------------------------------|----------------------------------|
109
129
  | [imessage](https://github.com/chronicle-app/chronicle-imessage) | iMessage messages and attachments | Available |
110
- | [shell](https://github.com/chronicle-app/chronicle-shell) | Shell command history | Available (zsh support pending) |
111
- | [email](https://github.com/chronicle-app/chronicle-email) | Emails and attachments from IMAP or .mbox files | Available (imap support pending) |
130
+ | [shell](https://github.com/chronicle-app/chronicle-shell) | Shell command history | Available (still needs zsh support) |
131
+ | [email](https://github.com/chronicle-app/chronicle-email) | Emails and attachments from IMAP or .mbox files | Available (still needs IMAP support) |
112
132
  | [pinboard](https://github.com/chronicle-app/chronicle-email) | Bookmarks and tags | Available |
113
133
  | [safari](https://github.com/chronicle-app/chronicle-safari) | Browser history from local sqlite db | Available |
114
- | github | Github user and repo activity | In progress |
115
- | chrome | Browser history from local sqlite db | Needs porting |
116
- | whatsapp | Messaging history (via individual chat exports) or reverse-engineered local desktop install | Unstarted |
117
- | anki | Studying and card creation history | Needs porting |
118
- | facebook | Messaging and history posting via data export files | Needs porting |
119
- | twitter | History via API or export data files | Needs porting |
120
- | foursquare | Location history via API | Needs porting |
121
- | goodreads | Reading history via export csv (RIP goodreads API) | Needs porting |
122
- | lastfm | Listening history via API | Needs porting |
123
- | images | Process image files | Needs porting |
124
- | arc | Location history from synced icloud backup files | Needs porting |
125
- | firefox | Browser history from local sqlite db | Needs porting |
126
- | fitbit | Personal analytics via API | Needs porting |
127
- | git | Commit history on a repo | Needs porting |
128
- | google-calendar | Calendar events via API | Needs porting |
129
- | instagram | Posting and messaging history via export data | Needs porting |
130
- | shazam | Song tags via reverse-engineered API | Needs porting |
131
- | slack | Messaging history via API | Need rethinking |
132
- | strava | Activity history via API | Needs porting |
133
- | things | Task activity via local sqlite db | Needs porting |
134
- | bear | Note taking activity via local sqlite db | Needs porting |
135
- | youtube | Video activity via takeout data and API | Needs porting |
136
-
137
- ### Writing your own connector
134
+
135
+ #### Coming soon
136
+
137
+ In summary, the following **are coming soon**:
138
+ anki, arc, bear, chrome, facebook, firefox, fitbit, foursquare, git, github, goodreads, google-calendar, images, instagram, lastfm, shazam, slack, strava, things, twitter, whatsapp, youtube.
139
+
140
+ Please check the [Chronicle Plugin Tracker](https://github.com/orgs/chronicle-app/projects/1/views/1) for details.
141
+
142
+ ### Writing your own plugin
138
143
 
139
144
  Additional connectors are packaged as separate ruby gems. You can view the [iMessage plugin](https://github.com/chronicle-app/chronicle-imessage) for an example.
140
145
 
@@ -149,7 +154,7 @@ module Chronicle
149
154
  class FooExtractor < Chronicle::ETL::Extractor
150
155
  register_connector do |r|
151
156
  r.identifier = 'foo'
152
- r.description = 'From foo.com'
157
+ r.description = 'from foo.com'
153
158
  end
154
159
 
155
160
  setting :access_token, required: true
@@ -168,6 +173,45 @@ module Chronicle
168
173
  end
169
174
  ```
170
175
 
176
+ ## Secrets Management
177
+
178
+ If your job needs secrets such as access tokens or passwords, `chronicle-etl` has a built-in secret management system.
179
+
180
+ Secrets are organized in namespaces. Typically, you use one namespace per plugin (`pinboard` secrets for the `pinboard` plugin). When you run a job that uses the `pinboard` plugin extractor, for example, the secrets from that namespace will automatically be included in the extractor's options. To override which secrets get included, set `secrets: ALT-NAMESPACE` in the connector options.
181
+
182
+ Under the hood, secrets are stored in `~/.config/chronicle/etl/secrets/NAMESPACE.yml` with 0600 permissions on each file.
183
+
184
+ ### Using the secret manager
185
+
186
+ ```sh
187
+ # Save a secret under the 'pinboard' namespace
188
+ $ chronicle-etl secrets:set pinboard access_token username:foo123
189
+
190
+ # Set a secret using stdin
191
+ $ echo -n "username:foo123" | chronicle-etl secrets:set pinboard access_token
192
+
193
+ # List available secretes
194
+ $ chronicle-etl secrets:list
195
+
196
+ # Use 'pinboard' secrets in the pinboard extractor's options (happens automatically)
197
+ $ chronicle-etl -e pinboard --since 1mo
198
+
199
+ # Use a custom secrets namespace
200
+ $ chronicle-etl secrets:set pinboard-alt access_token different-username:foo123
201
+ $ chronicle-etl -e pinboard --extractor-opts secrets:pinboard-alt --since 1mo
202
+
203
+ # Remove a secret
204
+ $ chronicle-etl secrets:unset pinboard access_token
205
+ ```
206
+
207
+ ## Roadmap
208
+
209
+ - Add **homebrew formula** for easier installation. #13
210
+ - Keep tackling **new plugins**. See: [Chronicle Plugin Tracker](https://github.com/orgs/chronicle-app/projects/1)
211
+ - Add support for **incremental extractions** #37
212
+ - **Improve stdin extractor and shell command transformer** (#5) so that users can easily integrate their own scripts/tools into jobs
213
+ - **Add documentation for Chronicle Schema**. It's found throughout this project but never explained.
214
+
171
215
  ## Development
172
216
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
173
217
 
@@ -43,22 +43,23 @@ Gem::Specification.new do |spec|
43
43
  spec.add_dependency "marcel", "~> 1.0.2"
44
44
  spec.add_dependency "mini_exiftool", "~> 2.10"
45
45
  spec.add_dependency "nokogiri", "~> 1.13"
46
- spec.add_dependency "runcom", ">= 6.0"
47
46
  spec.add_dependency "sequel", "~> 5.35"
48
47
  spec.add_dependency "sqlite3", "~> 1.4"
49
48
  spec.add_dependency "thor", "~> 1.2"
50
49
  spec.add_dependency "thor-hollaback", "~> 0.2"
51
50
  spec.add_dependency "tty-progressbar", "~> 0.17"
51
+ spec.add_dependency "tty-prompt", "~> 0.23"
52
52
  spec.add_dependency "tty-spinner"
53
53
  spec.add_dependency "tty-table", "~> 0.11"
54
- spec.add_dependency "tty-prompt", "~> 0.23"
54
+ spec.add_dependency "xdg", ">= 4.0"
55
55
 
56
56
  spec.add_development_dependency "bundler", "~> 2.1"
57
+ spec.add_development_dependency "guard-rspec", "~> 4.7.3"
58
+ spec.add_development_dependency "fakefs"
57
59
  spec.add_development_dependency "pry-byebug", "~> 3.9"
58
60
  spec.add_development_dependency "rake", "~> 13.0"
59
61
  spec.add_development_dependency "rspec", "~> 3.9"
62
+ spec.add_development_dependency "rubocop", "~> 1.25.1"
60
63
  spec.add_development_dependency "simplecov", "~> 0.21"
61
- spec.add_development_dependency "guard-rspec", "~> 4.7.3"
62
64
  spec.add_development_dependency "yard", "~> 0.9.7"
63
- spec.add_development_dependency "rubocop", "~> 1.25.1"
64
65
  end
@@ -4,6 +4,8 @@ module Chronicle
4
4
  module ETL
5
5
  module CLI
6
6
  # CLI commands for working with ETL connectors
7
+ #
8
+ # @todo make this work with new plugin system (i.e. no loading of all plugins)
7
9
  class Connectors < SubcommandBase
8
10
  default_task 'list'
9
11
  namespace :connectors
@@ -11,8 +13,6 @@ module Chronicle
11
13
  desc "list", "Lists available connectors"
12
14
  # Display all available connectors that chronicle-etl has access to
13
15
  def list
14
- Chronicle::ETL::Registry.load_all!
15
-
16
16
  connector_info = Chronicle::ETL::Registry.connectors.map do |connector_registration|
17
17
  {
18
18
  identifier: connector_registration.identifier,
@@ -20,8 +20,8 @@ module Chronicle
20
20
 
21
21
  # This is an array to deal with shell globbing
22
22
  class_option :input, aliases: '-i', desc: 'Input filename or directory', default: [], type: 'array', banner: 'FILENAME'
23
- class_option :since, desc: "Load records SINCE this date", banner: 'DATE'
24
- class_option :until, desc: "Load records UNTIL this date", banner: 'DATE'
23
+ class_option :since, desc: "Load records SINCE this date (or fuzzy time duration)", banner: 'DATE'
24
+ class_option :until, desc: "Load records UNTIL this date (or fuzzy time duration)", banner: 'DATE'
25
25
  class_option :limit, desc: "Only extract the first LIMIT records", banner: 'N'
26
26
 
27
27
  class_option :output, aliases: '-o', desc: 'Output filename', type: 'string'
@@ -49,7 +49,7 @@ LONG_DESC
49
49
 
50
50
  if job_definition.plugins_missing?
51
51
  missing_plugins = job_definition.errors[:plugins]
52
- .select { |error| error.is_a?(Chronicle::ETL::PluginLoadError) }
52
+ .select { |error| error.is_a?(Chronicle::ETL::PluginNotInstalledError) }
53
53
  .map(&:name)
54
54
  .uniq
55
55
  install_missing_plugins(missing_plugins)
@@ -57,7 +57,11 @@ LONG_DESC
57
57
 
58
58
  run_job(job_definition)
59
59
  rescue Chronicle::ETL::JobDefinitionError => e
60
- cli_fail(message: "Error running job.\n#{job_definition.errors}", exception: e)
60
+ message = ""
61
+ job_definition.errors.each_pair do |category, errors|
62
+ message << "Problem with #{category}:\n - #{errors.map(&:to_s).join("\n -")}"
63
+ end
64
+ cli_fail(message: "Error running job.\n#{message}", exception: e)
61
65
  end
62
66
 
63
67
  desc "create", "Create a job"
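The rescue branch in this hunk folds the per-category `errors` hash into one message. Below is a minimal Ruby sketch of that idea, not the gem's code: the `format_errors` helper and the sample error strings are hypothetical. It builds the message with `map`/`join` instead of appending to a string literal, so it would also work in a file that enables `# frozen_string_literal: true`, and it puts each category on its own line.

```ruby
# Hypothetical errors hash, shaped like job_definition.errors (category => list of errors)
ERRORS = {
  plugins: ['Plugin foo is not installed'],
  extractor: ['Missing setting: access_token', 'Bad value for since']
}.freeze

# Render one "Problem with <category>" block per category, one bullet per error
def format_errors(errors)
  errors.map do |category, messages|
    "Problem with #{category}:\n - #{messages.map(&:to_s).join("\n - ")}"
  end.join("\n")
end

MESSAGE = format_errors(ERRORS)
puts MESSAGE
```

The joined string can then be passed to something like `cli_fail(message: "Error running job.\n#{MESSAGE}")`.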
@@ -66,8 +70,7 @@ LONG_DESC
66
70
  job_definition = build_job_definition(options)
67
71
  job_definition.validate!
68
72
 
69
- path = File.join('chronicle', 'etl', 'jobs', options[:name])
70
- Chronicle::ETL::Config.write(path, job_definition.definition)
73
+ Chronicle::ETL::Config.write("jobs", options[:name], job_definition.definition)
71
74
  rescue Chronicle::ETL::JobDefinitionError => e
72
75
  cli_fail(message: "Job definition error", exception: e)
73
76
  end
@@ -88,7 +91,7 @@ LONG_DESC
88
91
  jobs = Chronicle::ETL::Config.available_jobs
89
92
 
90
93
  job_details = jobs.map do |job|
91
- r = Chronicle::ETL::Config.load("chronicle/etl/jobs/#{job}.yml")
94
+ r = Chronicle::ETL::Config.load("jobs", job)
92
95
 
93
96
  extractor = r[:extractor][:name] if r[:extractor]
94
97
  transformer = r[:transformer][:name] if r[:transformer]
@@ -109,6 +112,9 @@ LONG_DESC
109
112
  private
110
113
 
111
114
  def run_job(job_definition)
115
+ # FIXME: clumsy to make CLI responsible for setting secrets here. Think about a better way to do this
116
+ job_definition.apply_default_secrets
117
+
112
118
  job = Chronicle::ETL::Job.new(job_definition)
113
119
  runner = Chronicle::ETL::Runner.new(job)
114
120
  runner.run!
@@ -136,21 +142,22 @@ LONG_DESC
136
142
  end
137
143
 
138
144
  def load_job_config name
139
- Chronicle::ETL::Config.load_job_from_config(name)
145
+ Chronicle::ETL::Config.read_job(name)
140
146
  end
141
147
 
142
148
  # Takes flag options and turns them into a runner config
149
+ # TODO: this needs a lot of refactoring
143
150
  def process_flag_options options
144
- extractor_options = options[:'extractor-opts'].merge({
151
+ extractor_options = options[:'extractor-opts'].transform_keys(&:to_sym).merge({
145
152
  input: (options[:input] if options[:input].any?),
146
153
  since: options[:since],
147
154
  until: options[:until],
148
- limit: options[:limit],
155
+ limit: options[:limit]
149
156
  }.compact)
150
157
 
151
- transformer_options = options[:'transformer-opts']
158
+ transformer_options = options[:'transformer-opts'].transform_keys(&:to_sym)
152
159
 
153
- loader_options = options[:'loader-opts'].merge({
160
+ loader_options = options[:'loader-opts'].transform_keys(&:to_sym).merge({
154
161
  output: options[:output],
155
162
  header_row: options[:header_row],
156
163
  fields: options[:fields]
@@ -24,6 +24,9 @@ module Chronicle
24
24
  desc 'plugins:COMMAND', 'Configure plugins', hide: true
25
25
  subcommand 'plugins', Plugins
26
26
 
27
+ desc 'secrets:COMMAND', 'Manage secrets', hide: true
28
+ subcommand 'secrets', Secrets
29
+
27
30
  # Entrypoint for the CLI
28
31
  def self.start(given_args = ARGV, config = {})
29
32
  # take a subcommand:command and splits them so Thor knows how to hand off to the subcommand class
@@ -15,15 +15,25 @@ module Chronicle
15
15
  def install(*plugins)
16
16
  cli_fail(message: "Please specify a plugin to install") unless plugins.any?
17
17
 
18
- spinner = TTY::Spinner.new("[:spinner] Installing #{plugins.join(", ")}...", format: :dots_2)
18
+ installed, not_installed = plugins.partition do |plugin|
19
+ Chronicle::ETL::Registry::PluginRegistry.installed?(plugin)
20
+ end
21
+
22
+ puts "Already installed: #{installed.join(", ")}" if installed.any?
23
+ cli_exit unless not_installed.any?
24
+
25
+ spinner = TTY::Spinner.new("[:spinner] Installing #{not_installed.join(", ")}...", format: :dots_2)
19
26
  spinner.auto_spin
20
- plugins.each do |plugin|
27
+
28
+ not_installed.each do |plugin|
21
29
  spinner.update(title: "Installing #{plugin}")
22
30
  Chronicle::ETL::Registry::PluginRegistry.install(plugin)
31
+
23
32
  rescue Chronicle::ETL::PluginError => e
24
33
  spinner.error("Error".red)
25
34
  cli_fail(message: "Plugin '#{plugin}' could not be installed", exception: e)
26
35
  end
36
+
27
37
  spinner.success("(#{'successful'.green})")
28
38
  end
29
39
 
@@ -0,0 +1,69 @@
+# frozen_string_literal: true
+
+require "tty-prompt"
+
+module Chronicle
+  module ETL
+    module CLI
+      # CLI commands for working with ETL secrets
+      class Secrets < SubcommandBase
+        default_task 'list'
+        namespace :secrets
+
+        desc "set NAMESPACE KEY [VALUE]", "Add a secret. VALUE can be set as argument or from stdin"
+        def set(namespace, key, value=nil)
+          validate_namespace(namespace)
+
+          if value
+            # came as argument
+          elsif $stdin.respond_to?(:stat) && $stdin.stat.pipe?
+            value = $stdin.read
+          else
+            prompt = TTY::Prompt.new
+            value = prompt.mask("Please enter #{key} for #{namespace}:")
+          end
+
+          Chronicle::ETL::Secrets.set(namespace, key, value.strip)
+          cli_exit(message: "Secret set")
+        rescue TTY::Reader::InputInterrupt
+          cli_fail(message: "\nSecret not set")
+        end
+
+        desc "unset NAMESPACE KEY", "Remove a secret"
+        def unset(namespace, key)
+          validate_namespace(namespace)
+
+          Chronicle::ETL::Secrets.unset(namespace, key)
+          cli_exit(message: "Secret unset")
+        end
+
+        desc "list", "List available secrets"
+        def list(namespace=nil)
+          all_secrets = Chronicle::ETL::Secrets.all(namespace)
+          cli_exit(message: "No secrets are stored") unless all_secrets.any?
+
+          rows = []
+          all_secrets.each do |namespace, secrets|
+            rows += secrets.map do |key, value|
+              # hidden_value = (value[0..5] + ("*" * [0, [value.length - 5, 30].min].max)).truncate(30)
+              truncated_value = value.truncate(30)
+              [namespace, key, truncated_value]
+            end
+          end
+
+          headers = ['namespace', 'key', 'value'].map { |h| h.upcase.bold }
+
+          puts "Available secrets:"
+          table = TTY::Table.new(headers, rows)
+          puts table.render(indent: 0, padding: [0, 2])
+        end
+
+        private
+
+        def validate_namespace(namespace)
+          cli_fail(message: "'#{namespace}' is not a valid namespace") unless Chronicle::ETL::Secrets.valid_namespace_name?(namespace)
+        end
+      end
+    end
+  end
+end
@@ -7,4 +7,5 @@ require 'chronicle/etl/cli/subcommand_base'
7
7
  require 'chronicle/etl/cli/connectors'
8
8
  require 'chronicle/etl/cli/jobs'
9
9
  require 'chronicle/etl/cli/plugins'
10
+ require 'chronicle/etl/cli/secrets'
10
11
  require 'chronicle/etl/cli/main'
@@ -1,55 +1,67 @@
1
- require 'runcom'
1
+ require 'fileutils'
2
+ require 'yaml'
2
3
 
3
4
  module Chronicle
4
5
  module ETL
5
6
  # Utility methods to read, write, and access config files
6
7
  module Config
7
- module_function
8
+ extend self
8
9
 
9
- # Loads a yml config file
10
- def load(path)
11
- config = Runcom::Config.new(path)
12
- # FIXME: hack to deeply symbolize keys
13
- JSON.parse(config.to_h.to_json, symbolize_names: true)
10
+ attr_accessor :xdg_environment
11
+
12
+ def load(type, identifier)
13
+ base = config_pathname_for_type(type)
14
+ path = base.join("#{identifier}.yml")
15
+ return {} unless path.exist?
16
+
17
+ YAML.safe_load(File.read(path), symbolize_names: true, permitted_classes: [Symbol, Date, Time])
14
18
  end
15
19
 
16
20
  # Writes a hash as a yml config file
17
- def write(path, data)
18
- config = Runcom::Config.new(path)
19
- filename = config.all[0].to_s + '.yml'
20
- File.open(filename, 'w') do |f|
21
- f << data.to_yaml
21
+ def write(type, identifier, data)
22
+ base = config_pathname_for_type(type)
23
+ path = base.join("#{identifier}.yml")
24
+ FileUtils.mkdir_p(File.dirname(path))
25
+ File.open(path, 'w', 0o600) do |f|
26
+ # Ruby likes to add --- separators when writing yaml files
27
+ f << data.to_yaml.gsub(/^-+\n/, '')
22
28
  end
23
29
  end
24
30
 
25
31
  # Returns all jobs available in ~/.config/chronicle/etl/jobs/*.yml
26
32
  def available_jobs
27
- Dir.glob(File.join(config_directory("jobs"), "*.yml")).map do |filename|
33
+ Dir.glob(File.join(config_pathname_for_type("jobs"), "*.yml")).map do |filename|
28
34
  File.basename(filename, ".*")
29
35
  end
30
36
  end
31
37
 
32
- # Returns all available credentials available in ~/.config/chronicle/etl/credentials/*.yml
33
- def available_credentials
34
- Dir.glob(File.join(config_directory("credentials"), "*.yml")).map do |filename|
38
+ def available_configs(type)
39
+ Dir.glob(File.join(config_pathname_for_type(type), "*.yml")).map do |filename|
35
40
  File.basename(filename, ".*")
36
41
  end
37
42
  end
38
43
 
39
44
  # Load a job definition from job config directory
40
- def load_job_from_config(job_name)
41
- definition = self.load("chronicle/etl/jobs/#{job_name}.yml")
42
- definition[:name] = job_name
43
- definition
45
+ def read_job(job_name)
46
+ load('jobs', job_name)
44
47
  end
45
48
 
46
- def load_credentials(name)
47
- config = self.load("chronicle/etl/credentials/#{name}.yml")
49
+ def config_pathname
50
+ base = Pathname.new(xdg_config.config_home)
51
+ base.join('chronicle', 'etl')
48
52
  end
49
53
 
50
- def config_directory(type)
51
- path = "chronicle/etl/#{type}"
52
- Runcom::Config.new(path).current || raise(Chronicle::ETL::ConfigError, "Could not access config directory (#{path})")
54
+ def config_pathname_for_type(type)
55
+ config_pathname.join(type)
56
+ end
57
+
58
+ def xdg_config
59
+ # Only used for overriding ENV['HOME'] for XDG-related specs
60
+ if @xdg_environment
61
+ XDG::Environment.new(environment: @xdg_environment)
62
+ else
63
+ XDG::Environment.new
64
+ end
53
65
  end
54
66
  end
55
67
  end
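The rewritten `Config` module resolves `<config home>/chronicle/etl/<type>/<identifier>.yml`, writes it with mode `0600`, and reads it back with `YAML.safe_load`. The sketch below shows that round trip under stated assumptions: `ConfigSketch`, `path_for`, and the explicit `root` argument are hypothetical stand-ins (the gem derives the root from `XDG::Environment`), and the `|| {}` fallback for an empty file is an addition of this sketch.

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'
require 'yaml'

# Hypothetical stand-in for Chronicle::ETL::Config, rooted at an explicit
# directory instead of the XDG config home used by the gem.
module ConfigSketch
  module_function

  def path_for(root, type, identifier)
    Pathname.new(root).join('chronicle', 'etl', type, "#{identifier}.yml")
  end

  def write(root, type, identifier, data)
    path = path_for(root, type, identifier)
    FileUtils.mkdir_p(path.dirname)
    # The 0o600 mode is applied when the file is created; an existing file keeps its mode
    File.open(path, 'w', 0o600) do |f|
      # Strip the leading "---" document separator, as the diff does
      f << data.to_yaml.gsub(/^-+\n/, '')
    end
  end

  def load(root, type, identifier)
    path = path_for(root, type, identifier)
    return {} unless path.exist?

    # Symbol must be permitted because keys are written as YAML symbols (":access_token")
    YAML.safe_load(File.read(path), symbolize_names: true, permitted_classes: [Symbol]) || {}
  end
end

ROUND_TRIP = Dir.mktmpdir do |root|
  ConfigSketch.write(root, 'secrets', 'pinboard', { access_token: 'username:foo123' })
  ConfigSketch.load(root, 'secrets', 'pinboard')
end

MISSING = Dir.mktmpdir { |root| ConfigSketch.load(root, 'jobs', 'nope') }
```

Passing the root explicitly plays the same role as the `xdg_environment` override in the diff: specs can point the module at a throwaway directory.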
@@ -1,6 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  require "ostruct"
4
+ require "chronic_duration"
4
5
 
5
6
  module Chronicle
6
7
  module ETL
@@ -57,7 +58,9 @@ module Chronicle
57
58
 
58
59
  options.each do |name, value|
59
60
  setting = self.class.all_settings[name]
60
- raise(Chronicle::ETL::ConnectorConfigurationError, "Unrecognized setting: #{name}") unless setting
61
+
62
+ # Do nothing with a given option if it's not a connector setting
63
+ next unless setting
61
64
 
62
65
  @config[name] = coerced_value(setting, value)
63
66
  end
@@ -83,6 +86,8 @@ module Chronicle
83
86
 
84
87
  def coerced_value(setting, value)
85
88
  setting.type ? __send__("coerce_#{setting.type}", value) : value
89
+ rescue StandardError
90
+ raise(Chronicle::ETL::ConnectorConfigurationError, "Could not coerce #{value} into a #{setting.type}")
86
91
  end
87
92
 
88
93
  def coerce_string(value)
@@ -103,11 +108,15 @@ module Chronicle
103
108
  end
104
109
 
105
110
  def coerce_time(value)
106
- # TODO: handle durations like '3h'
107
- if value.is_a?(String)
108
- Time.parse(value)
111
+ return value unless value.is_a?(String)
112
+
113
+ # Hacky check for duration strings like "60m"
114
+ if value.match(/[a-z]+/)
115
+ ChronicDuration.raise_exceptions = true
116
+ duration_ago = ChronicDuration.parse(value)
117
+ Time.now - duration_ago
109
118
  else
110
- value
119
+ Time.parse(value)
111
120
  end
112
121
  end
113
122
  end
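The new `coerce_time` treats any string containing lowercase letters as a duration and subtracts it from the current time; everything else goes to `Time.parse`. The sketch below mirrors that control flow using only the standard library. `duration_seconds` and `UNIT_SECONDS` are hypothetical, much simpler stand-ins for `ChronicDuration.parse` (one `<number><unit>` token only, and a month assumed to be 30 days), and the `now:` keyword is added here so the result is deterministic.

```ruby
require 'time'

# Hypothetical, simplified replacement for ChronicDuration.parse:
# understands a single "<number><unit>" token such as "5h" or "1mo".
UNIT_SECONDS = {
  's' => 1, 'm' => 60, 'h' => 3600, 'd' => 86_400,
  'w' => 604_800, 'mo' => 2_592_000 # assumption: 30-day month
}.freeze

def duration_seconds(value)
  match = value.match(/\A(\d+)\s*([a-z]+)\z/)
  raise ArgumentError, "Not a duration: #{value}" unless match && UNIT_SECONDS.key?(match[2])

  match[1].to_i * UNIT_SECONDS[match[2]]
end

def coerce_time(value, now: Time.now)
  return value unless value.is_a?(String)

  # Same heuristic as the diff: any lowercase letter means "duration"
  if value.match?(/[a-z]+/)
    now - duration_seconds(value)
  else
    Time.parse(value)
  end
end

NOW = Time.utc(2022, 3, 1, 12, 0, 0)
FIVE_HOURS_AGO = coerce_time('5h', now: NOW)
ABSOLUTE = coerce_time('2022-02-01 10:00:00 UTC')
```

One consequence of the lowercase-letter heuristic: a date written as `1 Feb 2022` also takes the duration branch, so numeric date formats are the safer input for `--since` and `--until`.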
@@ -2,6 +2,8 @@ module Chronicle
2
2
  module ETL
3
3
  class Error < StandardError; end
4
4
 
5
+ class SecretsError < Error; end
6
+
5
7
  class ConfigError < Error; end
6
8
 
7
9
  class RunnerTypeError < Error; end
@@ -23,6 +25,7 @@ module Chronicle
23
25
  end
24
26
  end
25
27
 
28
+ class PluginNotInstalledError < PluginError; end
26
29
  class PluginConflictError < PluginError; end
27
30
  class PluginNotAvailableError < PluginError; end
28
31
  class PluginLoadError < PluginError; end
@@ -45,8 +45,10 @@ module Chronicle
45
45
  def plugins_missing?
46
46
  validate
47
47
 
48
- @errors[:plugins] || []
49
- .filter { |e| e.instance_of?(Chronicle::ETL::PluginLoadError) }
48
+ return false unless @errors[:plugins]&.any?
49
+
50
+ @errors[:plugins]
51
+ .filter { |e| e.instance_of?(Chronicle::ETL::PluginNotInstalledError) }
50
52
  .any?
51
53
  end
52
54
 
@@ -62,6 +64,30 @@ module Chronicle
62
64
  load_credentials
63
65
  end
64
66
 
67
+ # For each connector in this job, mix in secrets into the options
68
+ def apply_default_secrets
69
+ Chronicle::ETL::Registry::PHASES.each do |phase|
70
+ # If the options have a `secrets` key, we look up those secrets and
71
+ # mix them in. If not, use the connector's plugin name and look up
72
+ # secrets with the same namespace
73
+ if @definition[phase][:options][:secrets]
74
+ namespace = @definition[phase][:options][:secrets]
75
+ else
76
+ # We don't want to do this lookup for built-in connectors
77
+ next if __send__("#{phase}_klass".to_sym).connector_registration.built_in?
78
+
79
+ # infer plugin name from connector name and use it for secrets
80
+ # namespace
81
+ namespace = @definition[phase][:name].split(":").first
82
+ end
83
+
84
+ # Reverse merge secrets into connector's options (we want to preserve
85
+ # options that came from job file or CLI options)
86
+ secrets = Chronicle::ETL::Secrets.read(namespace)
87
+ @definition[phase][:options] = secrets.merge(@definition[phase][:options])
88
+ end
89
+ end
90
+
65
91
  # Is this job continuing from a previous run?
66
92
  def incremental?
67
93
  @definition[:incremental]
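`apply_default_secrets` relies on two small behaviours: the namespace is the part of the connector name before the first `:`, and the "reverse merge" `secrets.merge(options)` lets explicit options win over stored secrets. The sketch below shows both with made-up values; `infer_namespace` is a hypothetical helper name, and in the gem the secrets hash comes from `Chronicle::ETL::Secrets.read`.

```ruby
# Hypothetical helper: plugin name is the part before the first colon
def infer_namespace(connector_name)
  connector_name.split(':').first
end

# Made-up values standing in for a secrets file and CLI/job-file options
SECRETS = { access_token: 'from-secrets-file', username: 'foo' }.freeze
OPTIONS = { access_token: 'from-cli', since: '1mo' }.freeze

# Hash#merge keeps the argument's value on key collisions, so options override secrets
MERGED = SECRETS.merge(OPTIONS)

puts infer_namespace('email:mbox')
puts MERGED.inspect
```

Merging in this direction means a token passed with `--extractor-opts` is never silently replaced by a stored secret.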
@@ -1,5 +1,6 @@
1
- require 'sequel'
2
1
  require 'forwardable'
2
+ require 'sequel'
3
+ require 'xdg'
3
4
 
4
5
  module Chronicle
5
6
  module ETL
@@ -35,8 +36,8 @@ module Chronicle
35
36
  end
36
37
 
37
38
  def self.db_filename
38
- data = Runcom::Data.new "chronicle/etl/job_log.db"
39
- filename = data.all[0].to_s
39
+ base = Pathname.new(XDG::Data.new.home)
40
+ base.join('job_log.db')
40
41
  end
41
42
 
42
43
  def self.initialize_db
@@ -13,8 +13,8 @@ module Chronicle
13
13
  module PluginRegistry
14
14
  # Does this plugin exist?
15
15
  def self.exists?(name)
16
- # TODO: implement this. Could query rubygems.org or have a
17
- # hardcoded approved list
16
+ # TODO: implement this. Could query rubygems.org or use a hardcoded
17
+ # list somewhere
18
18
  true
19
19
  end
20
20
 
@@ -31,6 +31,12 @@ module Chronicle
31
31
  .values
32
32
  end
33
33
 
34
+ # Check whether a given plugin is installed
35
+ def self.installed?(name)
36
+ gem_name = "chronicle-#{name}"
37
+ all_installed.map(&:name).include?(gem_name)
38
+ end
39
+
34
40
  # Activate a plugin with given name by `require`ing it
35
41
  def self.activate(name)
36
42
  # By default, activates the latest available version of a gem
@@ -39,14 +45,17 @@ module Chronicle
39
45
  rescue Gem::ConflictError => e
40
46
  # TODO: figure out if there's more we can do here
41
47
  raise Chronicle::ETL::PluginConflictError.new(name), "Plugin '#{name}' couldn't be loaded. #{e.message}"
42
- rescue LoadError => e
43
- raise Chronicle::ETL::PluginLoadError.new(name), "Plugin '#{name}' couldn't be loaded" if exists?(name)
44
-
45
- raise Chronicle::ETL::PluginNotAvailableError.new(name), "Plugin #{name} doesn't exist"
48
+ rescue StandardError, LoadError => e
49
+ # StandardError to catch random non-loading problems that might occur
50
+ # when requiring the plugin (eg class macro invoked the wrong way)
51
+ # TODO: decide if this should be separated
52
+ raise Chronicle::ETL::PluginLoadError.new(name), "Plugin '#{name}' couldn't be loaded"
46
53
  end
47
54
 
48
55
  # Install a plugin to local gems
49
56
  def self.install(name)
57
+ return if installed?(name)
58
+
50
59
  gem_name = "chronicle-#{name}"
51
60
  raise(Chronicle::ETL::PluginNotAvailableError.new(gem_name), "Plugin #{name} doesn't exist") unless exists?(gem_name)
52
61
 
@@ -9,18 +9,7 @@ module Chronicle
  class << self
    attr_accessor :connectors

-   def load_all!
-     load_connectors_from_gems
-   end
-
-   def load_connectors_from_gems
-     Gem::Specification.filter{|s| s.name.match(/^chronicle/) }.each do |gem|
-       require_str = gem.name.gsub('chronicle-', 'chronicle/')
-       require require_str rescue LoadError
-     end
-   end
-
-   def register connector
+   def register(connector)
      connectors << connector
    end

@@ -28,9 +17,14 @@ module Chronicle
    @connectors ||= []
  end

- def find_by_phase_and_identifier(phase, identifier)
-   # Simple case: built in connector
+ # Find connector from amongst those currently loaded
+ def find_by_phase_and_identifier_local(phase, identifier)
    connector = connectors.find { |c| c.phase == phase && c.identifier == identifier }
+ end
+
+ # Find connector and load relevant plugin to find it if necessary
+ def find_by_phase_and_identifier(phase, identifier)
+   connector = find_by_phase_and_identifier_local(phase, identifier)
    return connector if connector

    # if not available in built-in connectors, try to activate a
@@ -44,6 +38,8 @@ module Chronicle
    plugin = identifier
  end

+ raise(Chronicle::ETL::PluginNotInstalledError.new(plugin)) unless PluginRegistry.installed?(plugin)
+
  PluginRegistry.activate(plugin)

  candidates = connectors.select { |c| c.phase == phase && c.plugin == plugin }
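
The lookup flow after this change is: try the already-loaded connectors, and only if that fails check that the plugin is installed, activate it, and search again. The sketch below illustrates that order under several assumptions. `SketchRegistry`, `Connector`, `NotInstalledError` and the `loadable` map are hypothetical; the `plugin:name` identifier split is inferred from CLI usage such as `email:mbox`, since the parsing lines are elided from the hunk; and returning the first candidate is a simplification.

```ruby
# Hypothetical connector record with the three attributes the diff queries.
Connector = Struct.new(:phase, :identifier, :plugin)

# Hypothetical stand-in for Chronicle::ETL::PluginNotInstalledError.
class NotInstalledError < StandardError; end

class SketchRegistry
  # installed_plugins: names of plugins considered installed
  # loadable: plugin name => connectors that activating it would register
  def initialize(installed_plugins:, loadable:)
    @connectors = []
    @installed_plugins = installed_plugins
    @loadable = loadable
  end

  def register(connector)
    @connectors << connector
  end

  # Search only the connectors that are currently loaded.
  def find_local(phase, identifier)
    @connectors.find { |c| c.phase == phase && c.identifier == identifier }
  end

  # Local lookup first; otherwise require the plugin to be installed,
  # activate it, then pick a connector it registered for this phase.
  def find(phase, identifier)
    connector = find_local(phase, identifier)
    return connector if connector

    # Assumption: identifiers look like "plugin:name" or just "plugin".
    plugin = identifier.split(":").first
    raise NotInstalledError, "Plugin '#{plugin}' is not installed" unless @installed_plugins.include?(plugin)

    activate(plugin)
    @connectors.find { |c| c.phase == phase && c.plugin == plugin }
  end

  private

  # Simulates `require`ing a plugin, which registers its connectors.
  def activate(plugin)
    @loadable.fetch(plugin, []).each { |c| register(c) }
  end
end

registry = SketchRegistry.new(
  installed_plugins: ["email"],
  loadable: { "email" => [Connector.new(:extractor, "mbox", "email")] }
)
registry.register(Connector.new(:extractor, "csv", nil))

puts registry.find(:extractor, "csv").identifier
puts registry.find(:extractor, "email:mbox").plugin
```

Checking for installation before activation lets the caller distinguish "not installed" from "installed but failed to load", which the broadened `rescue StandardError, LoadError` in `activate` would otherwise merge into one error.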
@@ -0,0 +1,55 @@
+ module Chronicle
+   module ETL
+     # Secret management module
+     module Secrets
+       module_function
+
+       # Save a setting to a namespaced config file
+       def set(namespace, key, value)
+         config = read(namespace)
+         config[key.to_sym] = value
+         write(namespace, config)
+       end
+
+       # Remove a setting from a namespaced config file
+       def unset(namespace, key)
+         config = read(namespace)
+         config.delete(key.to_sym)
+         write(namespace, config)
+       end
+
+       # Retrieve all secrets from all namespaces
+       def all(namespace = nil)
+         namespaces = namespace.nil? ? available_secrets : [namespace]
+         namespaces
+           .to_h { |namespace| [namespace.to_sym, read(namespace)] }
+           .delete_if { |_, v| v.empty? }
+       end
+
+       # Return whether a namespace name is valid (lowercase alphanumeric and -)
+       def valid_namespace_name?(namespace)
+         namespace.match(/^[a-z0-9\-]+$/)
+       end
+
+       # Read secrets from a config file
+       def read(namespace)
+         definition = Chronicle::ETL::Config.load("secrets", namespace)
+         definition[:secrets] || {}
+       end
+
+       # Write secrets to a config file
+       def write(namespace, secrets)
+         data = {
+           secrets: (secrets || {}).transform_keys(&:to_s),
+           chronicle_etl_version: Chronicle::ETL::VERSION
+         }.transform_keys(&:to_s) # Should I implement deeply_transform_keys...?
+         Chronicle::ETL::Config.write("secrets", namespace, data)
+       end
+
+       # Which config files are available in ~/.config/chronicle/etl/secrets
+       def available_secrets
+         Chronicle::ETL::Config.available_configs('secrets')
+       end
+     end
+   end
+ end
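
The new `Secrets` module reads a namespace's settings, changes one key, and writes the result back. The sketch below shows that read-modify-write cycle without touching the filesystem. `SketchSecrets` and its in-memory `STORE` are hypothetical stand-ins for the module and for `Chronicle::ETL::Config`, whose implementation is not part of this hunk. The `\A`/`\z` anchors in `valid_namespace_name?` are a deliberate variation from the diff: in Ruby, `^` and `$` match at line boundaries, so a multi-line string whose first line is valid would pass the original pattern.

```ruby
module SketchSecrets
  # Hypothetical in-memory replacement for the namespaced config files.
  STORE = {}

  module_function

  # Save a setting under a namespace (read, modify, write back).
  def set(namespace, key, value)
    config = read(namespace)
    config[key.to_sym] = value
    write(namespace, config)
  end

  # Remove a setting from a namespace.
  def unset(namespace, key)
    config = read(namespace)
    config.delete(key.to_sym)
    write(namespace, config)
  end

  # Collect secrets for one namespace or all of them, dropping empty ones.
  def all(namespace = nil)
    namespaces = namespace.nil? ? STORE.keys : [namespace]
    namespaces
      .to_h { |ns| [ns.to_sym, read(ns)] }
      .delete_if { |_, v| v.empty? }
  end

  # Variation on the diff: \A and \z anchor to the whole string.
  def valid_namespace_name?(namespace)
    namespace.match?(/\A[a-z0-9\-]+\z/)
  end

  # Return a copy so callers cannot mutate the store directly.
  def read(namespace)
    STORE.fetch(namespace.to_s, {}).dup
  end

  def write(namespace, secrets)
    STORE[namespace.to_s] = secrets
  end
end

SketchSecrets.set("pinboard", "access_token", "username:foo123")
p SketchSecrets.all
SketchSecrets.unset("pinboard", "access_token")
p SketchSecrets.all
```

This mirrors the CLI example `chronicle-etl secrets:set pinboard access_token username:foo123` from the README, with the namespace, key, and value passed in that order.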
@@ -1,5 +1,5 @@
  module Chronicle
    module ETL
-     VERSION = "0.4.4"
+     VERSION = "0.5.0"
    end
  end
data/lib/chronicle/etl.rb CHANGED
@@ -14,6 +14,7 @@ require_relative 'etl/models/base'
  require_relative 'etl/models/raw'
  require_relative 'etl/models/entity'
  require_relative 'etl/runner'
+ require_relative 'etl/secrets'
  require_relative 'etl/serializers/serializer'
  require_relative 'etl/utils/binary_attachments'
  require_relative 'etl/utils/hash_utilities'
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: chronicle-etl
  version: !ruby/object:Gem::Version
-   version: 0.4.4
+   version: 0.5.0
  platform: ruby
  authors:
  - Andrew Louis
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2022-03-16 00:00:00.000000000 Z
+ date: 2022-03-24 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activesupport
@@ -94,20 +94,6 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '1.13'
- - !ruby/object:Gem::Dependency
-   name: runcom
-   requirement: !ruby/object:Gem::Requirement
-     requirements:
-     - - ">="
-       - !ruby/object:Gem::Version
-         version: '6.0'
-   type: :runtime
-   prerelease: false
-   version_requirements: !ruby/object:Gem::Requirement
-     requirements:
-     - - ">="
-       - !ruby/object:Gem::Version
-         version: '6.0'
  - !ruby/object:Gem::Dependency
    name: sequel
    requirement: !ruby/object:Gem::Requirement
@@ -178,6 +164,20 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '0.17'
+ - !ruby/object:Gem::Dependency
+   name: tty-prompt
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.23'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.23'
  - !ruby/object:Gem::Dependency
    name: tty-spinner
    requirement: !ruby/object:Gem::Requirement
@@ -207,19 +207,19 @@ dependencies:
        - !ruby/object:Gem::Version
          version: '0.11'
  - !ruby/object:Gem::Dependency
-   name: tty-prompt
+   name: xdg
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ">="
        - !ruby/object:Gem::Version
-         version: '0.23'
+         version: '4.0'
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
+     - - ">="
        - !ruby/object:Gem::Version
-         version: '0.23'
+         version: '4.0'
  - !ruby/object:Gem::Dependency
    name: bundler
    requirement: !ruby/object:Gem::Requirement
@@ -234,6 +234,34 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '2.1'
+ - !ruby/object:Gem::Dependency
+   name: guard-rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 4.7.3
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 4.7.3
+ - !ruby/object:Gem::Dependency
+   name: fakefs
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  - !ruby/object:Gem::Dependency
    name: pry-byebug
    requirement: !ruby/object:Gem::Requirement
@@ -277,33 +305,33 @@ dependencies:
        - !ruby/object:Gem::Version
          version: '3.9'
  - !ruby/object:Gem::Dependency
-   name: simplecov
+   name: rubocop
    requirement: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '0.21'
+         version: 1.25.1
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: '0.21'
+         version: 1.25.1
  - !ruby/object:Gem::Dependency
-   name: guard-rspec
+   name: simplecov
    requirement: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: 4.7.3
+         version: '0.21'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - "~>"
        - !ruby/object:Gem::Version
-         version: 4.7.3
+         version: '0.21'
  - !ruby/object:Gem::Dependency
    name: yard
    requirement: !ruby/object:Gem::Requirement
@@ -318,20 +346,6 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: 0.9.7
- - !ruby/object:Gem::Dependency
-   name: rubocop
-   requirement: !ruby/object:Gem::Requirement
-     requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: 1.25.1
-   type: :development
-   prerelease: false
-   version_requirements: !ruby/object:Gem::Requirement
-     requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: 1.25.1
  description: Chronicle-ETL allows you to extract personal data from a variety of services,
    transformer it, and load it.
  email:
@@ -364,6 +378,7 @@ files:
  - lib/chronicle/etl/cli/jobs.rb
  - lib/chronicle/etl/cli/main.rb
  - lib/chronicle/etl/cli/plugins.rb
+ - lib/chronicle/etl/cli/secrets.rb
  - lib/chronicle/etl/cli/subcommand_base.rb
  - lib/chronicle/etl/config.rb
  - lib/chronicle/etl/configurable.rb
@@ -396,6 +411,7 @@ files:
  - lib/chronicle/etl/registry/registry.rb
  - lib/chronicle/etl/registry/self_registering.rb
  - lib/chronicle/etl/runner.rb
+ - lib/chronicle/etl/secrets.rb
  - lib/chronicle/etl/serializers/jsonapi_serializer.rb
  - lib/chronicle/etl/serializers/raw_serializer.rb
  - lib/chronicle/etl/serializers/serializer.rb