dratools 0.0.2 → 0.0.3
- checksums.yaml +4 -4
- data/docs/environment.md +16 -11
- data/docs/usage.md +1 -1
- data/lib/dratools/command_line_interface.rb +21 -5
- data/lib/dratools/commands/size_command.rb +2 -1
- data/lib/dratools/config.rb +4 -4
- data/lib/dratools/ddbj_resource_client.rb +7 -1
- data/lib/dratools/progress_reporter.rb +87 -0
- data/lib/dratools/run_record_collector.rb +2 -1
- data/lib/dratools/version.rb +1 -1
- data/lib/dratools.rb +1 -0
- metadata +2 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 580e0395df351cc56d957e756ad8368b663d9902f4722983e268093e97ec0809
|
|
4
|
+
data.tar.gz: 39206bc915245cea263f476cf7e5397f7e1027c1c21de3500e0c0c573f1c2c41
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 91bfee6e903c4334818f4dce7da9a30df66db16412cefcfa91ed384e5e0f031acedb69d1f17a6765fd86602072219c5e10a80043ced268d9b966779673c3fef0
|
|
7
|
+
data.tar.gz: 000f9ca102c3deacd8ae0b1c1d9f8edea16b596f65383c3f0fb925f413ca773390e0d1a7e9615f728b6f8b9d41afe7d5b8cdc0be4814cb319af34ce61fea1dd5
|
data/docs/environment.md
CHANGED
|
@@ -6,10 +6,10 @@
|
|
|
6
6
|
|
|
7
7
|
| Environment variable | Default | Role |
|
|
8
8
|
| --- | ---: | --- |
|
|
9
|
-
| `DRATOOLS_MAX_RECURSIVE_NON_RUN_XREFS` | `
|
|
10
|
-
| `DRATOOLS_TREE_MAX_DIRECT_RUNS` | `
|
|
11
|
-
| `DRATOOLS_URL_MAX_DIRECT_RUNS` | `
|
|
12
|
-
| `DRATOOLS_SIZE_MAX_DIRECT_RUNS` | `
|
|
9
|
+
| `DRATOOLS_MAX_RECURSIVE_NON_RUN_XREFS` | `500` | Maximum number of non-run records (experiment, sample, study, and so on) that `runs` and similar commands follow recursively from a parent record that has no direct runs |
|
|
10
|
+
| `DRATOOLS_TREE_MAX_DIRECT_RUNS` | `200` | Maximum number of direct run records that `tree` fetches individually and expands down to URLs. Above this limit, only the count is shown as a summary |
|
|
11
|
+
| `DRATOOLS_URL_MAX_DIRECT_RUNS` | `200` | Maximum number of direct runs that `url` implicitly expands from a single parent accession to resolve URLs |
|
|
12
|
+
| `DRATOOLS_SIZE_MAX_DIRECT_RUNS` | `200` | Maximum number of direct runs that `size` implicitly expands from a single parent accession and sends `HEAD` requests for |
|
|
13
13
|
|
|
14
14
|
`unlimited` is accepted only by the four limit settings above.
|
|
15
15
|
|
|
@@ -32,24 +32,27 @@
|
|
|
32
32
|
|
|
33
33
|
## Examples
|
|
34
34
|
|
|
35
|
-
`tree`
|
|
35
|
+
The default limit for direct run expansion in `tree` is 200.
|
|
36
|
+
To raise it to 500:
|
|
36
37
|
|
|
37
38
|
```sh
|
|
38
|
-
DRATOOLS_TREE_MAX_DIRECT_RUNS=
|
|
39
|
+
DRATOOLS_TREE_MAX_DIRECT_RUNS=500 dratools tree PRJDB12740
|
|
39
40
|
```
|
|
40
41
|
|
|
41
42
|
If you lower this value, direct runs that are not expanded are shown only as a summary.
|
|
42
43
|
|
|
43
|
-
`size`
|
|
44
|
+
The default limit for implicit direct run expansion in `size` is 200.
|
|
45
|
+
To raise it to 500:
|
|
44
46
|
|
|
45
47
|
```sh
|
|
46
|
-
DRATOOLS_SIZE_MAX_DIRECT_RUNS=
|
|
48
|
+
DRATOOLS_SIZE_MAX_DIRECT_RUNS=500 dratools size PRJDB12740
|
|
47
49
|
```
|
|
48
50
|
|
|
49
|
-
`url`
|
|
51
|
+
The default limit for implicit direct run expansion in `url` is 200.
|
|
52
|
+
To raise it to 500:
|
|
50
53
|
|
|
51
54
|
```sh
|
|
52
|
-
DRATOOLS_URL_MAX_DIRECT_RUNS=
|
|
55
|
+
DRATOOLS_URL_MAX_DIRECT_RUNS=500 dratools url --tsv PRJDB12740
|
|
53
56
|
```
|
|
54
57
|
|
|
55
58
|
To remove the limit on recursive non-run expansion:
|
|
@@ -83,6 +86,8 @@ DRATOOLS_DOWNLOAD_COMMAND=aria2c dratools get -O ~/Downloads DRR000001
|
|
|
83
86
|
|
|
84
87
|
These are advanced settings. Raising a limit increases the number of requests to the DDBJ Search API. It also increases the number of HTTP `HEAD` requests that `size` makes.
|
|
85
88
|
|
|
86
|
-
`DRATOOLS_URL_MAX_DIRECT_RUNS` and `DRATOOLS_SIZE_MAX_DIRECT_RUNS` limit the number of direct runs. Runs found via an experiment, sample, or study
|
|
89
|
+
`DRATOOLS_URL_MAX_DIRECT_RUNS` and `DRATOOLS_SIZE_MAX_DIRECT_RUNS` limit the number of direct runs. They do not limit the total number of runs found via an experiment, sample, or study. Setting `*_MAX_DIRECT_RUNS=unlimited` alone does not lift the `DRATOOLS_MAX_RECURSIVE_NON_RUN_XREFS` limit.
|
|
90
|
+
|
|
91
|
+
First check the structure with `meta` or `tree`. If needed, narrow the accessions with `runs`. Only then run the heavy operations.
|
|
87
92
|
|
|
88
93
|
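Consider what happens when the download settings are set too low. Even against a healthy server, downloads fail before they start or during transfer. Use short values for testing behavior or for isolating network problems. For normal downloads, use the defaults.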
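The limit settings described above take either an integer or `unlimited`. The following is a minimal sketch of how such a value could be read, not the code in `config.rb`: `parse_limit` and `over_limit?` are hypothetical helpers, and rejecting non-positive values and treating an empty value as unset are assumptions made for illustration.

```ruby
# Hypothetical helper, not part of dratools: reads a limit that is either
# a positive integer or the literal string "unlimited".
def parse_limit(raw, default:)
  return default if raw.nil? || raw.strip.empty?
  return Float::INFINITY if raw.strip == 'unlimited'

  value = Integer(raw.strip, 10) # raises ArgumentError on non-numeric input
  # Assumption: zero and negative limits are rejected.
  raise ArgumentError, "limit must be positive: #{raw}" unless value.positive?

  value
end

# Hypothetical check: an unlimited limit (infinity) is never exceeded.
def over_limit?(count, limit)
  count > limit
end

tree_limit = parse_limit(ENV['DRATOOLS_TREE_MAX_DIRECT_RUNS'], default: 200)
puts "tree expands at most #{tree_limit} direct runs"
```

Representing `unlimited` as `Float::INFINITY` keeps the comparison in `over_limit?` free of special cases, which is one possible design and not necessarily the one dratools uses.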
|
data/docs/usage.md
CHANGED
|
@@ -67,7 +67,7 @@ dratools runs PRJNA341783
|
|
|
67
67
|
dratools runs PRJNA341783 | dratools get -O ~/Downloads
|
|
68
68
|
```
|
|
69
69
|
|
|
70
|
-
A Study or BioProject can contain many experiments and samples. `runs` does not follow them without limit. It stops with an error when the limit is exceeded. When there are direct links to runs,
|
|
70
|
+
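A Study or BioProject can contain many experiments and samples. `runs` does not follow them without limit. It stops with an error when the limit is exceeded. Runs that are linked directly are exempt from the limit, even when there are more than 500 of them. For large records, check the structure with `tree` or `meta` first. Narrow down to an experiment or sample before using `runs`.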
|
|
71
71
|
|
|
72
72
|
## Check the total size (`size`)
|
|
73
73
|
|
data/lib/dratools/command_line_interface.rb
CHANGED
|
@@ -2,7 +2,9 @@
|
|
|
2
2
|
|
|
3
3
|
require_relative 'version'
|
|
4
4
|
require_relative 'accession_resolver'
|
|
5
|
+
require_relative 'ddbj_resource_client'
|
|
5
6
|
require_relative 'download_service'
|
|
7
|
+
require_relative 'progress_reporter'
|
|
6
8
|
require_relative 'commands/url_command'
|
|
7
9
|
require_relative 'commands/get_command'
|
|
8
10
|
require_relative 'commands/probe_command'
|
|
@@ -67,18 +69,22 @@ module Dratools
|
|
|
67
69
|
|
|
68
70
|
def initialize(
|
|
69
71
|
argv,
|
|
70
|
-
resolver:
|
|
72
|
+
resolver: nil,
|
|
71
73
|
downloader: DownloadService.new,
|
|
72
74
|
stdout: $stdout,
|
|
73
75
|
stderr: $stderr,
|
|
74
76
|
stdin: $stdin
|
|
75
77
|
)
|
|
76
78
|
@argv = argv
|
|
77
|
-
@resolver = resolver
|
|
78
|
-
@downloader = downloader
|
|
79
79
|
@stdout = stdout
|
|
80
80
|
@stderr = stderr
|
|
81
81
|
@stdin = stdin
|
|
82
|
+
@progress = ProgressReporter.new(
|
|
83
|
+
io: stderr,
|
|
84
|
+
enabled: interactive?(stderr)
|
|
85
|
+
)
|
|
86
|
+
@resolver = resolver || default_resolver
|
|
87
|
+
@downloader = downloader
|
|
82
88
|
end
|
|
83
89
|
|
|
84
90
|
def run
|
|
@@ -108,14 +114,24 @@ module Dratools
|
|
|
108
114
|
@argv.drop(1),
|
|
109
115
|
resolver: @resolver,
|
|
110
116
|
downloader: @downloader,
|
|
111
|
-
stdout: @stdout,
|
|
112
|
-
stderr: @
|
|
117
|
+
stdout: @progress.clearing_io(@stdout),
|
|
118
|
+
stderr: @progress.clearing_io,
|
|
113
119
|
stdin: @stdin
|
|
114
120
|
).run
|
|
121
|
+
ensure
|
|
122
|
+
@progress.finish
|
|
115
123
|
end
|
|
116
124
|
|
|
117
125
|
private
|
|
118
126
|
|
|
127
|
+
def default_resolver
|
|
128
|
+
AccessionResolver.new(client: DdbjResourceClient.new(progress: @progress))
|
|
129
|
+
end
|
|
130
|
+
|
|
131
|
+
def interactive?(io)
|
|
132
|
+
io.respond_to?(:tty?) && io.tty?
|
|
133
|
+
end
|
|
134
|
+
|
|
119
135
|
def print_help(stream)
|
|
120
136
|
stream.puts "Usage: #{COMMAND_NAME} <command> [options] [ACCESSION ...]"
|
|
121
137
|
stream.puts ''
|
|
data/lib/dratools/commands/size_command.rb
CHANGED
|
@@ -157,7 +157,8 @@ module Dratools
|
|
|
157
157
|
raise InvalidRecordError,
|
|
158
158
|
"#{accession.to_s.upcase} has #{direct_run_count} direct runs; " \
|
|
159
159
|
"size expands at most #{max_direct_runs} direct runs from one parent accession. " \
|
|
160
|
-
"Use `#{Dratools::NAME} runs #{accession}` and pass narrower accessions
|
|
160
|
+
"Use `#{Dratools::NAME} runs #{accession}` and pass narrower accessions, " \
|
|
161
|
+
"or set #{Config::SIZE_MAX_DIRECT_RUNS_ENV}=unlimited."
|
|
161
162
|
end
|
|
162
163
|
end
|
|
163
164
|
end
|
data/lib/dratools/config.rb
CHANGED
|
@@ -16,10 +16,10 @@ module Dratools
|
|
|
16
16
|
DOWNLOAD_RETRY_WAIT_ENV = 'DRATOOLS_DOWNLOAD_RETRY_WAIT'
|
|
17
17
|
DOWNLOAD_COMMAND_ENV = 'DRATOOLS_DOWNLOAD_COMMAND'
|
|
18
18
|
|
|
19
|
-
DEFAULT_MAX_RECURSIVE_NON_RUN_XREFS =
|
|
20
|
-
DEFAULT_TREE_MAX_DIRECT_RUNS =
|
|
21
|
-
DEFAULT_URL_MAX_DIRECT_RUNS =
|
|
22
|
-
DEFAULT_SIZE_MAX_DIRECT_RUNS =
|
|
19
|
+
DEFAULT_MAX_RECURSIVE_NON_RUN_XREFS = 500
|
|
20
|
+
DEFAULT_TREE_MAX_DIRECT_RUNS = 200
|
|
21
|
+
DEFAULT_URL_MAX_DIRECT_RUNS = 200
|
|
22
|
+
DEFAULT_SIZE_MAX_DIRECT_RUNS = 200
|
|
23
23
|
DEFAULT_DOWNLOAD_CONNECT_TIMEOUT_SECONDS = 30
|
|
24
24
|
DEFAULT_DOWNLOAD_STALL_TIMEOUT_SECONDS = 60
|
|
25
25
|
DEFAULT_DOWNLOAD_STALL_SPEED_BYTES_PER_SECOND = 1024
|
data/lib/dratools/ddbj_resource_client.rb
CHANGED
|
@@ -8,6 +8,7 @@ require 'uri'
|
|
|
8
8
|
require_relative 'errors'
|
|
9
9
|
require_relative 'version'
|
|
10
10
|
require_relative 'ddbj_record_fields'
|
|
11
|
+
require_relative 'progress_reporter'
|
|
11
12
|
|
|
12
13
|
module Dratools
|
|
13
14
|
# A thin HTTP client that calls the DDBJ Search API.
|
|
@@ -26,17 +27,20 @@ module Dratools
|
|
|
26
27
|
DEFAULT_READ_TIMEOUT_SECONDS = 30
|
|
27
28
|
|
|
28
29
|
def initialize(base_url: DDBJ_SEARCH_API_BASE_URL, open_timeout: DEFAULT_OPEN_TIMEOUT_SECONDS,
|
|
29
|
-
read_timeout: DEFAULT_READ_TIMEOUT_SECONDS)
|
|
30
|
+
read_timeout: DEFAULT_READ_TIMEOUT_SECONDS, progress: ProgressReporter.new)
|
|
30
31
|
@base_url = base_url.delete_suffix('/')
|
|
31
32
|
@open_timeout = open_timeout
|
|
32
33
|
@read_timeout = read_timeout
|
|
34
|
+
@progress = progress
|
|
33
35
|
end
|
|
34
36
|
|
|
35
37
|
def fetch_resource_record(type, accession)
|
|
38
|
+
@progress.report("fetching #{type} #{accession}")
|
|
36
39
|
fetch_json("#{@base_url}/#{ENTRIES_PATH}/#{type}/#{accession}#{ENTRY_RECORD_EXTENSION}")
|
|
37
40
|
end
|
|
38
41
|
|
|
39
42
|
def fetch_db_links(type, accession, target: nil)
|
|
43
|
+
@progress.report("linking #{type} #{accession}")
|
|
40
44
|
request_uri = URI("#{@base_url}/#{DBLINK_PATH}/#{type}/#{accession}")
|
|
41
45
|
request_uri.query = URI.encode_www_form(target: target) if target
|
|
42
46
|
fetch_json(request_uri.to_s).fetch(DdbjRecordFields::DB_XREFS_KEY, [])
|
|
@@ -59,6 +63,7 @@ module Dratools
|
|
|
59
63
|
private
|
|
60
64
|
|
|
61
65
|
def fetch_db_link_counts_chunk(items)
|
|
66
|
+
@progress.report("counting links (#{items.length})")
|
|
62
67
|
request_url = "#{@base_url}/#{DBLINK_PATH}/counts"
|
|
63
68
|
payload = post_json(request_url, items: items)
|
|
64
69
|
payload.fetch('items', []).to_h do |item|
|
|
@@ -67,6 +72,7 @@ module Dratools
|
|
|
67
72
|
end
|
|
68
73
|
|
|
69
74
|
def fetch_resource_records_bulk_chunk(type, accessions, include_db_xrefs:)
|
|
75
|
+
@progress.report("fetching #{type} bulk (#{accessions.length})")
|
|
70
76
|
request_uri = URI("#{@base_url}/#{ENTRIES_PATH}/#{type}/bulk")
|
|
71
77
|
request_uri.query = URI.encode_www_form(includeDbXrefs: include_db_xrefs)
|
|
72
78
|
payload = post_json(request_uri.to_s, ids: accessions)
|
data/lib/dratools/progress_reporter.rb
ADDED
|
@@ -0,0 +1,87 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module Dratools
|
|
4
|
+
# Lightweight reporter that prints single-line progress to stderr only on an interactive terminal.
|
|
5
|
+
# Data goes to stdout, so progress does not mix into it even when stdout is piped.
|
|
6
|
+
# On a non-TTY (redirect, pipe, CI) it emits no control characters and stays completely silent.
|
|
7
|
+
class ProgressReporter
|
|
8
|
+
CLEAR_LINE = "\r\e[K"
|
|
9
|
+
|
|
10
|
+
# IO wrapper that clears the displayed progress line just before normal output.
|
|
11
|
+
class ClearingIO
|
|
12
|
+
def initialize(io, reporter)
|
|
13
|
+
@io = io
|
|
14
|
+
@reporter = reporter
|
|
15
|
+
end
|
|
16
|
+
|
|
17
|
+
def puts(*args)
|
|
18
|
+
@reporter.finish
|
|
19
|
+
@io.puts(*args)
|
|
20
|
+
end
|
|
21
|
+
|
|
22
|
+
def print(*args)
|
|
23
|
+
@reporter.finish
|
|
24
|
+
@io.print(*args)
|
|
25
|
+
end
|
|
26
|
+
|
|
27
|
+
def write(*args)
|
|
28
|
+
@reporter.finish
|
|
29
|
+
@io.write(*args)
|
|
30
|
+
end
|
|
31
|
+
|
|
32
|
+
def flush
|
|
33
|
+
@io.flush
|
|
34
|
+
end
|
|
35
|
+
|
|
36
|
+
def tty?
|
|
37
|
+
@io.respond_to?(:tty?) && @io.tty?
|
|
38
|
+
end
|
|
39
|
+
|
|
40
|
+
def method_missing(name, ...)
|
|
41
|
+
return super unless @io.respond_to?(name)
|
|
42
|
+
|
|
43
|
+
@io.public_send(name, ...)
|
|
44
|
+
end
|
|
45
|
+
|
|
46
|
+
def respond_to_missing?(name, include_private = false)
|
|
47
|
+
@io.respond_to?(name, include_private) || super
|
|
48
|
+
end
|
|
49
|
+
end
|
|
50
|
+
|
|
51
|
+
def initialize(io: $stderr, enabled: nil)
|
|
52
|
+
@io = io
|
|
53
|
+
@enabled = enabled.nil? ? interactive?(io) : enabled
|
|
54
|
+
@count = 0
|
|
55
|
+
@active = false
|
|
56
|
+
end
|
|
57
|
+
|
|
58
|
+
def clearing_io(io = @io)
|
|
59
|
+
ClearingIO.new(io, self)
|
|
60
|
+
end
|
|
61
|
+
|
|
62
|
+
# Reports one progress step. Clears the previous line, then overwrites it.
|
|
63
|
+
def report(label)
|
|
64
|
+
return unless @enabled
|
|
65
|
+
|
|
66
|
+
@count += 1
|
|
67
|
+
@io.print("#{CLEAR_LINE}#{label} (#{@count})")
|
|
68
|
+
@io.flush
|
|
69
|
+
@active = true
|
|
70
|
+
end
|
|
71
|
+
|
|
72
|
+
# Clears any remaining progress line. Call at command exit (on success or failure).
|
|
73
|
+
def finish
|
|
74
|
+
return unless @active
|
|
75
|
+
|
|
76
|
+
@io.print(CLEAR_LINE)
|
|
77
|
+
@io.flush
|
|
78
|
+
@active = false
|
|
79
|
+
end
|
|
80
|
+
|
|
81
|
+
private
|
|
82
|
+
|
|
83
|
+
def interactive?(io)
|
|
84
|
+
io.respond_to?(:tty?) && io.tty?
|
|
85
|
+
end
|
|
86
|
+
end
|
|
87
|
+
end
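To make the byte stream produced by the reporter above easier to see, here is a minimal sketch of the same carriage-return technique writing into a `StringIO`. `MiniProgress` is a hypothetical, simplified stand-in defined only for this example (it omits the `ClearingIO` wrapper and TTY detection), and `enabled:` is passed explicitly because a `StringIO` is not a terminal.

```ruby
require 'stringio'

# Hypothetical simplified stand-in for ProgressReporter, for illustration only.
class MiniProgress
  CLEAR_LINE = "\r\e[K" # carriage return, then ANSI "erase to end of line"

  def initialize(io:, enabled:)
    @io = io
    @enabled = enabled
    @count = 0
    @active = false
  end

  # Overwrites the current line with the label and a running counter.
  def report(label)
    return unless @enabled

    @count += 1
    @io.print("#{CLEAR_LINE}#{label} (#{@count})")
    @active = true
  end

  # Erases a leftover progress line, if one is showing.
  def finish
    return unless @active

    @io.print(CLEAR_LINE)
    @active = false
  end
end

captured = StringIO.new
progress = MiniProgress.new(io: captured, enabled: true)
progress.report('fetching run DRR000001')
progress.report('linking run DRR000001')
progress.finish
p captured.string

silent = StringIO.new
quiet = MiniProgress.new(io: silent, enabled: false)
quiet.report('fetching run DRR000001')
quiet.finish
p silent.string
```

When disabled, nothing is written at all, including the clear sequence, which matches the stated goal of emitting no control characters on a non-TTY.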
|
data/lib/dratools/run_record_collector.rb
CHANGED
|
@@ -94,7 +94,8 @@ module Dratools
|
|
|
94
94
|
accession = record_accession(ddbj_record) || 'record'
|
|
95
95
|
raise InvalidRecordError,
|
|
96
96
|
"#{accession} has #{non_run_xrefs.length} linked non-run records; " \
|
|
97
|
-
'refine to an experiment/sample accession before run expansion'
|
|
97
|
+
'refine to an experiment/sample accession before run expansion, ' \
|
|
98
|
+
"or set #{Config::MAX_RECURSIVE_NON_RUN_XREFS_ENV}=unlimited"
|
|
98
99
|
end
|
|
99
100
|
|
|
100
101
|
def child_bioprojects(ddbj_record)
|
data/lib/dratools/version.rb
CHANGED
data/lib/dratools.rb
CHANGED
|
@@ -10,6 +10,7 @@ require_relative 'dratools/traversal_node'
|
|
|
10
10
|
require_relative 'dratools/tree_renderer'
|
|
11
11
|
require_relative 'dratools/download_candidate_builder'
|
|
12
12
|
require_relative 'dratools/checksum_verifier'
|
|
13
|
+
require_relative 'dratools/progress_reporter'
|
|
13
14
|
require_relative 'dratools/ddbj_resource_client'
|
|
14
15
|
require_relative 'dratools/accession_resource_type_classifier'
|
|
15
16
|
require_relative 'dratools/run_record_collector'
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: dratools
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.0.
|
|
4
|
+
version: 0.0.3
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- kojix2
|
|
@@ -47,6 +47,7 @@ files:
|
|
|
47
47
|
- lib/dratools/download_service.rb
|
|
48
48
|
- lib/dratools/errors.rb
|
|
49
49
|
- lib/dratools/external_command_runner.rb
|
|
50
|
+
- lib/dratools/progress_reporter.rb
|
|
50
51
|
- lib/dratools/run_record_collector.rb
|
|
51
52
|
- lib/dratools/traversal_node.rb
|
|
52
53
|
- lib/dratools/tree_renderer.rb
|