htslib 0.0.8 → 0.0.10
- checksums.yaml +4 -4
- data/README.md +25 -24
- data/lib/hts/bam/cigar.rb +5 -4
- data/lib/hts/bam/flag.rb +1 -1
- data/lib/hts/bam/header.rb +1 -1
- data/lib/hts/bam/record.rb +5 -11
- data/lib/hts/bam.rb +56 -46
- data/lib/hts/bcf/format.rb +1 -1
- data/lib/hts/bcf/header.rb +4 -4
- data/lib/hts/bcf/info.rb +1 -1
- data/lib/hts/bcf/record.rb +4 -3
- data/lib/hts/bcf.rb +39 -27
- data/lib/hts/faidx.rb +6 -4
- data/lib/hts/hts.rb +56 -0
- data/lib/hts/libhts/bgzf.rb +10 -5
- data/lib/hts/libhts/constants.rb +22 -4
- data/lib/hts/libhts/cram.rb +297 -0
- data/lib/hts/libhts/hfile.rb +19 -11
- data/lib/hts/libhts/hts.rb +2 -2
- data/lib/hts/libhts/sam_funcs.rb +2 -1
- data/lib/hts/libhts.rb +4 -3
- data/lib/hts/tabix.rb +42 -17
- data/lib/hts/version.rb +1 -1
- data/lib/htslib.rb +5 -5
- metadata +4 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 2ecf4c183a1720313371e493fbf7876a08d17639ad853b1d26faeff3addb7c6e
|
4
|
+
data.tar.gz: cd1ed908a5dd8be184c8d9195c6fb5fabda5487d3c35e11476d9bd6d15c7dae8
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 4980914c7b92528055d2d0cf62fad6ca5edcde8ba1b9d426740538aee42fab53c7ef96ebe53adcaa9663368f024546c048b71760b68bd88e7cbb7647e200847c
|
7
|
+
data.tar.gz: a94b5d01702cc9873fdf6d1498658cff15c44ba8c90b08cafe514a883560c6aaa90cfbd83cede8a4ae1a5095c6f99710e76a8aacb0e0851e628c607c37719bd5
|
data/README.md
CHANGED
@@ -6,10 +6,7 @@
|
|
6
6
|
[![DOI](https://zenodo.org/badge/247078205.svg)](https://zenodo.org/badge/latestdoi/247078205)
|
7
7
|
[![Docs Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://rubydoc.info/gems/htslib)
|
8
8
|
|
9
|
-
|
10
|
-
|
11
|
-
Ruby-htslib is the Ruby bindings to HTSlib, a C library for processing high throughput sequencing (HTS) data.
|
12
|
-
It will provide APIs to read and write file formats such as [SAM, BAM, VCF, and BCF](http://samtools.github.io/hts-specs/).
|
9
|
+
Ruby-htslib provides [Ruby](https://www.ruby-lang.org) bindings to [HTSlib](https://github.com/samtools/htslib), a C library for high-throughput sequencing data formats. It allows you to read and write file formats commonly used in genomics, such as [SAM, BAM, VCF, and BCF](http://samtools.github.io/hts-specs/), in the Ruby language.
|
13
10
|
|
14
11
|
:apple: Feel free to fork it out if you can develop it!
|
15
12
|
|
@@ -41,34 +38,35 @@ export HTSLIBDIR="/your/path/to/htslib" # libhts.so
|
|
41
38
|
|
42
39
|
### High level API
|
43
40
|
|
44
|
-
|
45
|
-
Classes such as `Cram` `Bam` `Bcf` `Faidx` `Tabix` are partially implemented.
|
46
|
-
|
47
|
-
Read SAM / BAM - Sequence Alignment Map file
|
41
|
+
Read SAM / BAM / CRAM - Sequence Alignment Map file
|
48
42
|
|
49
43
|
```ruby
|
50
44
|
require 'htslib'
|
51
45
|
|
52
|
-
bam = HTS::Bam.
|
46
|
+
bam = HTS::Bam.open("a.bam")
|
53
47
|
|
54
48
|
bam.each do |r|
|
55
49
|
p name: r.qname,
|
56
50
|
flag: r.flag,
|
51
|
+
chr: r.chrom,
|
57
52
|
pos: r.start + 1,
|
58
|
-
mpos: r.mate_start + 1,
|
59
53
|
mqual: r.mapping_quality,
|
60
|
-
seq: r.sequence,
|
61
54
|
cigar: r.cigar.to_s,
|
62
|
-
|
55
|
+
mchr: r.mate_chrom,
|
56
|
+
mpos: r.mate_start + 1,
|
57
|
+
isize: r.insert_size,
|
58
|
+
seq: r.sequence,
|
59
|
+
qual: r.base_qualities.map { |i| (i + 33).chr }.join,
|
60
|
+
tagMC: r.tag("MC")
|
63
61
|
end
|
64
62
|
|
65
63
|
bam.close
|
66
64
|
```
|
67
65
|
|
68
|
-
Read VCF / BCF - Variant Call Format
|
66
|
+
Read VCF / BCF - Variant Call Format file
|
69
67
|
|
70
68
|
```ruby
|
71
|
-
bcf = HTS::Bcf.
|
69
|
+
bcf = HTS::Bcf.open("b.bcf")
|
72
70
|
|
73
71
|
bcf.each do |r|
|
74
72
|
p chrom: r.chrom,
|
@@ -78,16 +76,16 @@ bcf.each do |r|
|
|
78
76
|
ref: r.ref,
|
79
77
|
alt: r.alt,
|
80
78
|
filter: r.filter
|
79
|
+
# info: r.info
|
80
|
+
# format: r.format
|
81
81
|
end
|
82
82
|
|
83
83
|
bcf.close
|
84
84
|
```
|
85
85
|
|
86
|
-
The methods for reading are implemented first. Methods for writing will be implemented in the coming days.
|
87
|
-
|
88
86
|
### Low level API
|
89
87
|
|
90
|
-
`HTS::LibHTS` provides native functions.
|
88
|
+
`HTS::LibHTS` provides native C functions.
|
91
89
|
|
92
90
|
```ruby
|
93
91
|
require 'htslib'
|
@@ -98,19 +96,20 @@ p b[:category]
|
|
98
96
|
p b[:format]
|
99
97
|
```
|
100
98
|
|
101
|
-
Note: Only
|
99
|
+
Note: htslib makes extensive use of macro functions for speed. You cannot use C macro functions in Ruby unless they are reimplemented in ruby-htslib. Only a small number of C structs are implemented with FFI's ManagedStruct, which frees memory when Ruby's garbage collection fires. Other structs will need to be freed manually.
|
102
100
|
|
103
101
|
### Need more speed?
|
104
102
|
|
105
|
-
Try [htslib.cr](https://github.com/bio-crystal/htslib.cr)
|
103
|
+
Try Crystal. [htslib.cr](https://github.com/bio-crystal/htslib.cr) is implemented in the Crystal language and provides an API compatible with ruby-htslib. Crystal is not as flexible as Ruby: you cannot use eval methods, and you must always be aware of the types. It is not very suitable for writing one-time scripts or experimenting with different code. However, if you have already written code in ruby-htslib, have a clear idea of the manipulations you want to do, and need to execute them many times, then by all means try to implement the command line tool using htslib.cr. The Crystal language is very fast and can perform almost as well as the Rust and C languages.
|
106
104
|
|
107
105
|
## Documentation
|
108
106
|
|
107
|
+
* [API Documentation (develop branch)](https://kojix2.github.io/ruby-htslib/)
|
109
108
|
* [RubyDoc.info - HTSlib](https://rdoc.info/gems/htslib)
|
110
109
|
|
111
110
|
## Development
|
112
111
|
|
113
|
-
To get started with development
|
112
|
+
To get started with development:
|
114
113
|
|
115
114
|
```sh
|
116
115
|
git clone --recursive https://github.com/kojix2/ruby-htslib
|
@@ -120,10 +119,13 @@ bundle exec rake htslib:build
|
|
120
119
|
bundle exec rake test
|
121
120
|
```
|
122
121
|
|
122
|
+
[GNU Autotools](https://en.wikipedia.org/wiki/GNU_Autotools) is required to compile htslib.
|
123
|
+
|
123
124
|
Many macro functions are used in HTSlib. Since these macro functions cannot be called using FFI, they must be reimplemented in Ruby.
|
124
125
|
|
125
126
|
* Actively use the advanced features of Ruby.
|
126
|
-
*
|
127
|
+
* Remain compatible with [htslib.cr](https://github.com/bio-crystal/htslib.cr).
|
128
|
+
* The most difficult part is the return value. In the Crystal language, it is convenient for a method to return only one type. In the Ruby language, on the other hand, it is more convenient to return multiple classes. For example, in the Crystal language, it is confusing that a return value can take four types: Int32, Float32, Nil, and String. In Ruby, on the other hand, it is very common and does not cause any problems.
|
127
129
|
|
128
130
|
#### FFI Extensions
|
129
131
|
|
@@ -131,7 +133,6 @@ Many macro functions are used in HTSlib. Since these macro functions cannot be c
|
|
131
133
|
|
132
134
|
#### Automatic generation or automatic validation (Future plan)
|
133
135
|
|
134
|
-
|
135
136
|
+ [c2ffi](https://github.com/rpav/c2ffi) is a tool to create JSON format metadata from C header files. It is planned to use c2ffi to automatically generate bindings or tests.
|
136
137
|
|
137
138
|
## Contributing
|
@@ -145,14 +146,14 @@ Ruby-htslib is a library under development, so even small improvements like typo
|
|
145
146
|
* [financial contributions](https://github.com/sponsors/kojix2)
|
146
147
|
|
147
148
|
```
|
148
|
-
Do you need commit rights to
|
149
|
+
Do you need commit rights to ruby-htslib repository?
|
149
150
|
Do you want to get admin rights and take over the project?
|
150
151
|
If so, please feel free to contact us @kojix2.
|
151
152
|
```
|
152
153
|
|
153
154
|
#### Why do you implement htslib in a language like Ruby, which is not widely used in bioinformatics?
|
154
155
|
|
155
|
-
One of the greatest joys of using a minor language like Ruby in bioinformatics is that there is nothing stopping you from reinventing the wheel. Reinventing the wheel can be fun. But with languages like Python and R, where many bioinformatics masters work, there is no chance left for beginners to create htslib bindings. Bioinformatics file formats, libraries and tools are very complex and I don't know how to understand them. So I wanted to implement the HTSLib binding to better understand how
|
156
|
+
One of the greatest joys of using a minor language like Ruby in bioinformatics is that there is nothing stopping you from reinventing the wheel. Reinventing the wheel can be fun. But with languages like Python and R, where many bioinformatics masters work, there is no chance left for beginners to create htslib bindings. Bioinformatics file formats, libraries and tools are very complex and I don't know how to understand them. So I wanted to implement the HTSLib binding myself to better understand how the pioneers of bioinformatics felt when establishing the file format and how they created their tools. And that effort is still going on today...
|
156
157
|
|
157
158
|
## Links
|
158
159
|
|
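The `qual:` line in the README example above turns numeric base qualities into the printable string seen in SAM and FASTQ files by adding 33 to each value (the Phred+33 convention). A minimal sketch of that conversion on plain arrays follows; the helper names `phred_to_ascii` and `ascii_to_phred` are hypothetical and are not part of ruby-htslib.

```ruby
# Hypothetical helpers illustrating Phred+33 encoding; not part of ruby-htslib.

# Convert an array of integer Phred scores to a quality string.
def phred_to_ascii(qualities)
  qualities.map { |q| (q + 33).chr }.join
end

# Convert a quality string back to integer Phred scores.
def ascii_to_phred(string)
  string.bytes.map { |b| b - 33 }
end

puts phred_to_ascii([0, 10, 20, 30, 40]) # prints !+5?I
p ascii_to_phred("!+5?I")                # prints [0, 10, 20, 30, 40]
```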
data/lib/hts/bam/cigar.rb
CHANGED
@@ -4,7 +4,7 @@
|
|
4
4
|
# https://github.com/quinlan-lab/hts-python
|
5
5
|
|
6
6
|
module HTS
|
7
|
-
class Bam
|
7
|
+
class Bam < Hts
|
8
8
|
class Cigar
|
9
9
|
include Enumerable
|
10
10
|
|
@@ -18,7 +18,7 @@ module HTS
|
|
18
18
|
end
|
19
19
|
|
20
20
|
def to_s
|
21
|
-
|
21
|
+
map { |op, len| "#{len}#{op}" }.join
|
22
22
|
end
|
23
23
|
|
24
24
|
def each
|
@@ -26,8 +26,9 @@ module HTS
|
|
26
26
|
|
27
27
|
@n_cigar.times do |i|
|
28
28
|
c = @pointer[i].read_uint32
|
29
|
-
|
30
|
-
|
29
|
+
op = LibHTS.bam_cigar_opchr(c)
|
30
|
+
len = LibHTS.bam_cigar_oplen(c)
|
31
|
+
yield [op, len]
|
31
32
|
end
|
32
33
|
end
|
33
34
|
end
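The `Cigar#each` change above yields `[op, len]` pairs obtained from `LibHTS.bam_cigar_opchr` and `LibHTS.bam_cigar_oplen`. In HTSlib's C headers these are macros over a packed 32-bit integer: the low 4 bits index into the operation string `MIDNSHP=XB`, and the remaining upper bits hold the length. The pure-Ruby sketch below illustrates that packing only; the `CigarSketch` module and its method names are assumptions made for illustration and are not the gem's implementation, which this diff does not show.

```ruby
# Illustrative decoding of BAM's packed CIGAR values; not the gem's own code.
module CigarSketch
  BAM_CIGAR_STR   = "MIDNSHP=XB" # operation characters, indexed by op code
  BAM_CIGAR_SHIFT = 4            # length is stored above the low 4 bits
  BAM_CIGAR_MASK  = 0xf          # low 4 bits hold the op code

  module_function

  # Operation character for a packed value ("?" for unassigned op codes).
  def opchr(c)
    BAM_CIGAR_STR[c & BAM_CIGAR_MASK] || "?"
  end

  # Operation length for a packed value.
  def oplen(c)
    c >> BAM_CIGAR_SHIFT
  end

  # Pack a length and an operation character (helper for the demo below).
  def pack(len, op)
    (len << BAM_CIGAR_SHIFT) | BAM_CIGAR_STR.index(op)
  end

  # Same formatting as the to_s in the diff: length followed by operation.
  def cigar_string(packed)
    packed.map { |c| "#{oplen(c)}#{opchr(c)}" }.join
  end
end

packed = [CigarSketch.pack(76, "M"), CigarSketch.pack(2, "I"), CigarSketch.pack(22, "S")]
puts CigarSketch.cigar_string(packed) # prints 76M2I22S
```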
|
data/lib/hts/bam/flag.rb
CHANGED
data/lib/hts/bam/header.rb
CHANGED
data/lib/hts/bam/record.rb
CHANGED
@@ -4,10 +4,12 @@
|
|
4
4
|
# https://github.com/quinlan-lab/hts-python
|
5
5
|
|
6
6
|
module HTS
|
7
|
-
class Bam
|
7
|
+
class Bam < Hts
|
8
8
|
class Record
|
9
9
|
SEQ_NT16_STR = "=ACMGRSVTWYHKDBN"
|
10
10
|
|
11
|
+
attr_reader :header
|
12
|
+
|
11
13
|
def initialize(bam1_t, header)
|
12
14
|
@bam1 = bam1_t
|
13
15
|
@header = header
|
@@ -21,16 +23,6 @@ module HTS
|
|
21
23
|
@bam1.to_ptr
|
22
24
|
end
|
23
25
|
|
24
|
-
attr_reader :header
|
25
|
-
|
26
|
-
# def initialize_copy
|
27
|
-
# super
|
28
|
-
# end
|
29
|
-
|
30
|
-
def self.rom_sam_str; end
|
31
|
-
|
32
|
-
def tags; end
|
33
|
-
|
34
26
|
# returns the query name.
|
35
27
|
def qname
|
36
28
|
LibHTS.bam_get_qname(@bam1).read_string
|
@@ -189,6 +181,8 @@ module HTS
|
|
189
181
|
end
|
190
182
|
end
|
191
183
|
|
184
|
+
# def tags; end
|
185
|
+
|
192
186
|
def to_s
|
193
187
|
kstr = LibHTS::KString.new
|
194
188
|
raise "Failed to format bam record" if LibHTS.sam_format1(@header.struct, @bam1, kstr) == -1
|
data/lib/hts/bam.rb
CHANGED
@@ -3,6 +3,9 @@
|
|
3
3
|
# Based on hts-python
|
4
4
|
# https://github.com/quinlan-lab/hts-python
|
5
5
|
|
6
|
+
require_relative "../htslib"
|
7
|
+
|
8
|
+
require_relative "hts"
|
6
9
|
require_relative "bam/header"
|
7
10
|
require_relative "bam/cigar"
|
8
11
|
require_relative "bam/flag"
|
@@ -12,10 +15,10 @@ module HTS
|
|
12
15
|
class Bam
|
13
16
|
include Enumerable
|
14
17
|
|
15
|
-
attr_reader :
|
18
|
+
attr_reader :file_name, :index_path, :mode, :header
|
16
19
|
|
17
|
-
def self.open(
|
18
|
-
file = new(
|
20
|
+
def self.open(*args, **kw)
|
21
|
+
file = new(*args, **kw) # do not yield
|
19
22
|
return file unless block_given?
|
20
23
|
|
21
24
|
begin
|
@@ -26,22 +29,23 @@ module HTS
|
|
26
29
|
file
|
27
30
|
end
|
28
31
|
|
29
|
-
def initialize(
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
if mode[0] == "r" && !File.exist?(file_path)
|
35
|
-
message = "No such SAM/BAM file - #{file_path}"
|
32
|
+
def initialize(file_name, mode = "r", index: nil, fai: nil, threads: nil,
|
33
|
+
create_index: false)
|
34
|
+
if block_given?
|
35
|
+
message = "HTS::Bam.new() does not take a block; please use HTS::Bam.open() instead"
|
36
36
|
raise message
|
37
37
|
end
|
38
38
|
|
39
|
+
# NOTE: Do not check for the existence of local files, since file_names may be remote URIs.
|
40
|
+
|
41
|
+
@file_name = file_name
|
39
42
|
@mode = mode
|
40
|
-
@hts_file = LibHTS.hts_open(
|
43
|
+
@hts_file = LibHTS.hts_open(@file_name, mode)
|
44
|
+
|
45
|
+
raise Errno::ENOENT, "Failed to open #{@file_name}" if @hts_file.null?
|
41
46
|
|
42
47
|
if fai
|
43
|
-
|
44
|
-
r = LibHTS.hts_set_fai_filename(@hts_file, fai_path)
|
48
|
+
r = LibHTS.hts_set_fai_filename(@hts_file, fai)
|
45
49
|
raise "Failed to load fasta index: #{fai}" if r < 0
|
46
50
|
end
|
47
51
|
|
@@ -50,46 +54,54 @@ module HTS
|
|
50
54
|
raise "Failed to set number of threads: #{threads}" if r < 0
|
51
55
|
end
|
52
56
|
|
53
|
-
return if mode[0] == "w"
|
57
|
+
return if @mode[0] == "w"
|
54
58
|
|
55
59
|
@header = Bam::Header.new(@hts_file)
|
56
60
|
|
57
|
-
create_index if
|
61
|
+
create_index(index) if create_index
|
62
|
+
|
63
|
+
@idx = load_index(index)
|
58
64
|
|
59
|
-
|
60
|
-
@idx = LibHTS.sam_index_load(@hts_file, file_path)
|
65
|
+
@start_position = tell
|
61
66
|
end
|
62
67
|
|
63
|
-
def create_index
|
64
|
-
|
65
|
-
|
66
|
-
|
67
|
-
|
68
|
+
def create_index(index_name = nil)
|
69
|
+
if index_name
|
70
|
+
warn "Create index for #{@file_name} to #{index_name}"
|
71
|
+
LibHTS.sam_index_build2(@file_name, index_name, -1)
|
72
|
+
else
|
73
|
+
warn "Create index for #{@file_name}"
|
74
|
+
LibHTS.sam_index_build(@file_name, -1)
|
75
|
+
end
|
68
76
|
end
|
69
77
|
|
70
|
-
def
|
71
|
-
|
78
|
+
def load_index(index_name = nil)
|
79
|
+
if index_name
|
80
|
+
LibHTS.sam_index_load2(@hts_file, @file_name, index_name)
|
81
|
+
else
|
82
|
+
LibHTS.sam_index_load3(@hts_file, @file_name, nil, 2) # should be 3 ? (copy remote file to local?)
|
83
|
+
end
|
72
84
|
end
|
73
85
|
|
74
|
-
def
|
75
|
-
|
86
|
+
def index_loaded?
|
87
|
+
!@idx.null?
|
76
88
|
end
|
77
89
|
|
78
90
|
# Close the current file.
|
79
91
|
def close
|
80
|
-
LibHTS.hts_idx_destroy(@idx) if @idx
|
92
|
+
LibHTS.hts_idx_destroy(@idx) if @idx && !@idx.null?
|
81
93
|
@idx = nil
|
82
94
|
LibHTS.hts_close(@hts_file)
|
83
95
|
@hts_file = nil
|
84
96
|
end
|
85
97
|
|
86
98
|
def closed?
|
87
|
-
@hts_file.nil?
|
99
|
+
@hts_file.nil? || @hts_file.null?
|
88
100
|
end
|
89
101
|
|
90
102
|
def write_header(header)
|
91
103
|
@header = header.dup
|
92
|
-
LibHTS.hts_set_fai_filename(@hts_file, @
|
104
|
+
LibHTS.hts_set_fai_filename(@hts_file, @file_name)
|
93
105
|
LibHTS.sam_hdr_write(@hts_file, header)
|
94
106
|
end
|
95
107
|
|
@@ -98,9 +110,17 @@ module HTS
|
|
98
110
|
LibHTS.sam_write1(@hts_file, header, aln_dup) > 0 || raise
|
99
111
|
end
|
100
112
|
|
101
|
-
#
|
102
|
-
|
103
|
-
|
113
|
+
# Iterate over each record.
|
114
|
+
# Generate a new Record object each time.
|
115
|
+
# Slower than each.
|
116
|
+
def each_copy
|
117
|
+
return to_enum(__method__) unless block_given?
|
118
|
+
|
119
|
+
while LibHTS.sam_read1(@hts_file, header, bam1 = LibHTS.bam_init1) != -1
|
120
|
+
record = Record.new(bam1, header)
|
121
|
+
yield record
|
122
|
+
end
|
123
|
+
self
|
104
124
|
end
|
105
125
|
|
106
126
|
# Iterate over each record.
|
@@ -114,24 +134,14 @@ module HTS
|
|
114
134
|
|
115
135
|
bam1 = LibHTS.bam_init1
|
116
136
|
record = Record.new(bam1, header)
|
117
|
-
yield record while LibHTS.sam_read1(@hts_file, header, bam1)
|
118
|
-
|
119
|
-
|
120
|
-
# Iterate over each record.
|
121
|
-
# Generate a new Record object each time.
|
122
|
-
# Slower than each.
|
123
|
-
def each_copy
|
124
|
-
return to_enum(__method__) unless block_given?
|
125
|
-
|
126
|
-
while LibHTS.sam_read1(@hts_file, header, bam1 = LibHTS.bam_init1) > 0
|
127
|
-
record = Record.new(bam1, header)
|
128
|
-
yield record
|
129
|
-
end
|
137
|
+
yield record while LibHTS.sam_read1(@hts_file, header, bam1) != -1
|
138
|
+
self
|
130
139
|
end
|
131
140
|
|
132
141
|
# query [WIP]
|
133
142
|
def query(region)
|
134
|
-
|
143
|
+
raise "Index file is required to call the query method." unless index_loaded?
|
144
|
+
|
135
145
|
qiter = LibHTS.sam_itr_querys(@idx, header, region)
|
136
146
|
begin
|
137
147
|
bam1 = LibHTS.bam_init1
|
data/lib/hts/bcf/format.rb
CHANGED
data/lib/hts/bcf/header.rb
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
2
|
|
3
3
|
module HTS
|
4
|
-
class Bcf
|
4
|
+
class Bcf < Hts
|
5
5
|
class Header
|
6
6
|
def initialize(hts_file)
|
7
7
|
@bcf_hdr = LibHTS.bcf_hdr_read(hts_file)
|
@@ -19,14 +19,14 @@ module HTS
|
|
19
19
|
LibHTS.bcf_hdr_get_version(@bcf_hdr)
|
20
20
|
end
|
21
21
|
|
22
|
-
def
|
22
|
+
def nsamples
|
23
23
|
LibHTS.bcf_hdr_nsamples(@bcf_hdr)
|
24
24
|
end
|
25
25
|
|
26
|
-
def
|
26
|
+
def samples
|
27
27
|
# bcf_hdr_id2name is macro function
|
28
28
|
@bcf_hdr[:samples]
|
29
|
-
.read_array_of_pointer(
|
29
|
+
.read_array_of_pointer(nsamples)
|
30
30
|
.map(&:read_string)
|
31
31
|
end
|
32
32
|
|
data/lib/hts/bcf/info.rb
CHANGED
data/lib/hts/bcf/record.rb
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
2
|
|
3
3
|
module HTS
|
4
|
-
class Bcf
|
4
|
+
class Bcf < Hts
|
5
5
|
class Record
|
6
6
|
def initialize(bcf_t, header)
|
7
7
|
@bcf1 = bcf_t
|
@@ -109,8 +109,9 @@ module HTS
|
|
109
109
|
|
110
110
|
private
|
111
111
|
|
112
|
-
def initialize_copy
|
113
|
-
|
112
|
+
def initialize_copy(orig)
|
113
|
+
@header = orig.header
|
114
|
+
@bcf1 = LibHTS.bcf_dup(orig.struct)
|
114
115
|
end
|
115
116
|
end
|
116
117
|
end
|
data/lib/hts/bcf.rb
CHANGED
@@ -3,19 +3,22 @@
|
|
3
3
|
# Based on hts-python
|
4
4
|
# https://github.com/quinlan-lab/hts-python
|
5
5
|
|
6
|
+
require_relative "../htslib"
|
7
|
+
|
8
|
+
require_relative "hts"
|
6
9
|
require_relative "bcf/header"
|
7
10
|
require_relative "bcf/info"
|
8
11
|
require_relative "bcf/format"
|
9
12
|
require_relative "bcf/record"
|
10
13
|
|
11
14
|
module HTS
|
12
|
-
class Bcf
|
15
|
+
class Bcf < Hts
|
13
16
|
include Enumerable
|
14
17
|
|
15
|
-
attr_reader :
|
18
|
+
attr_reader :file_name, :index_path, :mode, :header
|
16
19
|
|
17
|
-
def self.open(
|
18
|
-
file = new(
|
20
|
+
def self.open(*args, **kw)
|
21
|
+
file = new(*args, **kw) # do not yield
|
19
22
|
return file unless block_given?
|
20
23
|
|
21
24
|
begin
|
@@ -26,40 +29,34 @@ module HTS
|
|
26
29
|
file
|
27
30
|
end
|
28
31
|
|
29
|
-
def initialize(
|
30
|
-
|
31
|
-
|
32
|
-
|
33
|
-
|
34
|
-
if mode[0] == "r" && !File.exist?(file_path)
|
35
|
-
message = "No such VCF/BCF file - #{file_path}"
|
32
|
+
def initialize(file_name, mode = "r", index: nil, fai: nil, threads: nil,
|
33
|
+
create_index: false)
|
34
|
+
if block_given?
|
35
|
+
message = "HTS::Bcf.new() does not take a block; please use HTS::Bcf.open() instead"
|
36
36
|
raise message
|
37
37
|
end
|
38
38
|
|
39
|
+
# NOTE: Do not check for the existence of local files, since file_names may be remote URIs.
|
40
|
+
|
41
|
+
@file_name = file_name
|
39
42
|
@mode = mode
|
40
|
-
@hts_file = LibHTS.hts_open(
|
43
|
+
@hts_file = LibHTS.hts_open(@file_name, mode)
|
44
|
+
|
45
|
+
raise Errno::ENOENT, "Failed to open #{@file_name}" if @hts_file.null?
|
41
46
|
|
42
47
|
if threads&.> 0
|
43
48
|
r = LibHTS.hts_set_threads(@hts_file, threads)
|
44
49
|
raise "Failed to set number of threads: #{threads}" if r < 0
|
45
50
|
end
|
46
51
|
|
47
|
-
return if mode[0] == "w"
|
52
|
+
return if @mode[0] == "w"
|
48
53
|
|
49
54
|
@header = Bcf::Header.new(@hts_file)
|
50
55
|
end
|
51
56
|
|
52
|
-
def struct
|
53
|
-
@hts_file
|
54
|
-
end
|
55
|
-
|
56
|
-
def to_ptr
|
57
|
-
@hts_file.to_ptr
|
58
|
-
end
|
59
|
-
|
60
57
|
def write_header
|
61
58
|
@header = header.dup
|
62
|
-
LibHTS.hts_set_fai_filename(header, @
|
59
|
+
LibHTS.hts_set_fai_filename(header, @file_name)
|
63
60
|
LibHTS.bcf_hdr_write(@hts_file, header.struct)
|
64
61
|
end
|
65
62
|
|
@@ -78,15 +75,18 @@ module HTS
|
|
78
75
|
@hts_file.nil?
|
79
76
|
end
|
80
77
|
|
81
|
-
def
|
82
|
-
header.
|
78
|
+
def nsamples
|
79
|
+
header.nsamples
|
83
80
|
end
|
84
81
|
|
85
|
-
def
|
86
|
-
header.
|
82
|
+
def samples
|
83
|
+
header.samples
|
87
84
|
end
|
88
85
|
|
89
|
-
|
86
|
+
# Iterate over each record.
|
87
|
+
# Generate a new Record object each time.
|
88
|
+
# Slower than each.
|
89
|
+
def each_copy
|
90
90
|
return to_enum(__method__) unless block_given?
|
91
91
|
|
92
92
|
while LibHTS.bcf_read(@hts_file, header, bcf1 = LibHTS.bcf_init) != -1
|
@@ -95,5 +95,17 @@ module HTS
|
|
95
95
|
end
|
96
96
|
self
|
97
97
|
end
|
98
|
+
|
99
|
+
# Iterate over each record.
|
100
|
+
# Record object is reused.
|
101
|
+
# Faster than each_copy.
|
102
|
+
def each
|
103
|
+
return to_enum(__method__) unless block_given?
|
104
|
+
|
105
|
+
bcf1 = LibHTS.bcf_init
|
106
|
+
record = Record.new(bcf1, header)
|
107
|
+
yield record while LibHTS.bcf_read(@hts_file, header, bcf1) != -1
|
108
|
+
self
|
109
|
+
end
|
98
110
|
end
|
99
111
|
end
|
data/lib/hts/faidx.rb
CHANGED
@@ -3,17 +3,19 @@
|
|
3
3
|
# Based on hts-python
|
4
4
|
# https://github.com/quinlan-lab/hts-python
|
5
5
|
|
6
|
+
require_relative "../htslib"
|
7
|
+
|
6
8
|
module HTS
|
7
9
|
class Faidx
|
8
|
-
attr_reader :
|
10
|
+
attr_reader :file_name
|
9
11
|
|
10
12
|
class << self
|
11
13
|
alias open new
|
12
14
|
end
|
13
15
|
|
14
|
-
def initialize(
|
15
|
-
@
|
16
|
-
@fai = LibHTS.fai_load(
|
16
|
+
def initialize(file_name)
|
17
|
+
@file_name = file_name
|
18
|
+
@fai = LibHTS.fai_load(@file_name)
|
17
19
|
|
18
20
|
# IO like API
|
19
21
|
if block_given?
|