minimap2 0.0.4 → 0.2.21

Sign up to get free protection for your applications and to get access to all the features.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: c5ef413b53423059804163320f8d5cd9ca0366fa7473603b06a857c109b99395
4
- data.tar.gz: 64cdc52d1961e793f9dc0a966c9f63f357aa3992c4cf8f3341d1cd64c0eea01a
3
+ metadata.gz: 4bd850f529cb82950c16581735bdd74f232e0ef3490e5cb5b6f7045faa1fe696
4
+ data.tar.gz: 40d00cf14886a35f831b593d541cf9e72f8e5cf07d87be31116c215799449f62
5
5
  SHA512:
6
- metadata.gz: cf3a7c4294279f7a79a6f931f50c243f72b17658a69f4e6eec13e8584b61d503a6da164be3b01ca81cb52679d893efd903ac41387ec418f43c016e392d027ed4
7
- data.tar.gz: 4cac45c87e639ec4b698990e80e9a64aad902a368f44e35223a31ce162eeafa46137b06faa99b345f8e57a99cce4b9e39bb2959dd29d8a5c09d746439431df21
6
+ metadata.gz: 669bd6d5a4eb0dc37f12ee4c0f9653bfe76afec70b8d592e291269cb97b90b493b398b8d68ebacb64ba2ce28187a32a32fdb3fb77ef070023ffa27983f479929
7
+ data.tar.gz: 12c2fd1ace06a7e6a1734cb27f09091851f3fe917714156b27a003a168815dbef83eabc00c56c701bdcd5f982db873346bca375b3e8f05764b7fb797d2d5c898
data/README.md CHANGED
@@ -1,4 +1,4 @@
1
- # Minimap2
1
+ # ruby-minimap2
2
2
 
3
3
  [![Gem Version](https://img.shields.io/gem/v/minimap2?color=brightgreen)](https://rubygems.org/gems/minimap2)
4
4
  [![CI](https://github.com/kojix2/ruby-minimap2/workflows/CI/badge.svg)](https://github.com/kojix2/ruby-minimap2/actions)
@@ -12,7 +12,7 @@
12
12
 
13
13
  ## Installation
14
14
 
15
- You need to install ruby-minimap2 from the source code. Because you need to build minimap2 and create a shared library. Open your terminal and type the following commands in order.
15
+ Open your terminal and type the following commands in order. You need to build minimap2 on your own because you need to create a shared library that contains cmappy functions.
16
16
 
17
17
  Build
18
18
 
@@ -29,7 +29,7 @@ Install
29
29
  bundle exec rake install
30
30
  ```
31
31
 
32
- Ruby-minimap2 is tested on Ubuntu and macOS.
32
+ Ruby-minimap2 is [tested on Ubuntu and macOS](https://github.com/kojix2/ruby-minimap2/actions).
33
33
 
34
34
  ## Quick Start
35
35
 
@@ -37,99 +37,119 @@ Ruby-minimap2 is tested on Ubuntu and macOS.
37
37
  require "minimap2"
38
38
  ```
39
39
 
40
- create aligner
40
+ Create aligner
41
41
 
42
42
  ```ruby
43
43
  aligner = Minimap2::Aligner.new("minimap2/test/MT-human.fa")
44
44
  ```
45
45
 
46
- retrieve a subsequence from the index
46
+ Retrieve a subsequence from the index
47
47
 
48
48
  ```ruby
49
49
  seq = aligner.seq("MT_human", 100, 200)
50
50
  ```
51
51
 
52
- mapping
52
+ Mapping
53
53
 
54
54
  ```ruby
55
55
  hits = aligner.align(seq)
56
- pp hits[0].to_h
57
- # {:ctg => "MT_human",
58
- # :ctg_len => 16569,
59
- # :r_st => 100,
60
- # :r_en => 200,
61
- # :strand => 1,
62
- # :trans_strand => 0,
63
- # :blen => 100,
64
- # :mlen => 100,
65
- # :nm => 0,
66
- # :primary => 1,
67
- # :q_st => 0,
68
- # :q_en => 100,
69
- # :mapq => 60,
70
- # :cigar => [[100, 0]],
71
- # :read_num => 1,
72
- # :cs => "",
73
- # :md => "",
74
- # :cigar_str => "100M"}
56
+ pp hits[0]
57
+ ```
58
+
59
+ ```
60
+ =>
61
+ #<Minimap2::Alignment:0x000055fe18223f50
62
+ @blen=100,
63
+ @cigar=[[100, 0]],
64
+ @cigar_str="100M",
65
+ @cs="",
66
+ @ctg="MT_human",
67
+ @ctg_len=16569,
68
+ @mapq=60,
69
+ @md="",
70
+ @mlen=100,
71
+ @nm=0,
72
+ @primary=1,
73
+ @q_en=100,
74
+ @q_st=0,
75
+ @r_en=200,
76
+ @r_st=100,
77
+ @read_num=1,
78
+ @strand=1,
79
+ @trans_strand=0>
75
80
  ```
76
81
 
77
82
  ## APIs Overview
78
83
 
79
- See the [RubyDoc.info document](https://rubydoc.info/gems/minimap2) for details.
84
+ API is based on [Mappy](https://github.com/lh3/minimap2/tree/master/python), the official Python binding for Minimap2.
85
+
86
+ Note: `Aligner#map` has been changed to `aligne`, because `map` means iterator in Ruby.
80
87
 
81
88
  ```markdown
82
89
  * Minimap2 module
83
- - fastx_read
84
- - revcomp
90
+ - fastx_read Read fasta/fastq file.
91
+ - revcomp Reverse complement sequence.
85
92
 
86
93
  * Aligner class
87
94
  * attributes
88
- - index
89
- - idx_opt
90
- - map_opt
95
+ - index Returns the value of attribute index.
96
+ - idx_opt Returns the value of attribute idx_opt.
97
+ - map_opt Returns the value of attribute map_opt.
91
98
  * methods
92
- - new(path, preset: nil)
93
- - align
99
+ - new(path, preset: nil) Create a new aligner. (presets: sr, map-pb, map-out, map-hifi, splice, asm5, etc.)
100
+ - align Maps and returns alignments.
101
+ - seq Retrieve a subsequence from the index.
94
102
 
95
103
  * Alignment class
96
104
  * attributes
97
- - ctg
98
- - ctg_len
99
- - r_st
100
- - r_en
101
- - strand
102
- - trans_strand
103
- - blen
104
- - mlen
105
- - nm
106
- - primary
107
- - q_st
108
- - q_en
109
- - mapq
110
- - cigar
111
- - read_num
112
- - cs
113
- - md
114
- - cigar_str
105
+ - ctg Returns name of the reference sequence the query is mapped to.
106
+ - ctg_len Returns total length of the reference sequence.
107
+ - r_st Returns start positions on the reference.
108
+ - r_en Returns end positions on the reference.
109
+ - strand Returns +1 if on the forward strand; -1 if on the reverse strand.
110
+ - trans_strand Returns transcript strand. +1 if on the forward strand; -1 if on the reverse strand; 0 if unknown.
111
+ - blen Returns length of the alignment, including both alignment matches and gaps but excluding ambiguous bases.
112
+ - mlen Returns length of the matching bases in the alignment, excluding ambiguous base matches.
113
+ - nm Returns number of mismatches, gaps and ambiguous poistions in the alignment.
114
+ - primary Returns if the alignment is primary (typically the best and the first to generate).
115
+ - q_st Returns start positions on the query.
116
+ - q_en Returns end positions on the query.
117
+ - mapq Returns mapping quality.
118
+ - cigar Returns CIGAR returned as an array of shape (n_cigar,2). The two numbers give the length and the operator of each CIGAR operation.
119
+ - read_num Returns read number that the alignment corresponds to; 1 for the first read and 2 for the second read.
120
+ - cs Returns the cs tag.
121
+ - md Returns the MD tag as in the SAM format. It is an empty string unless the md argument is applied when calling Aligner#align.
122
+ - cigar_str Returns CIGAR string.
115
123
  * methods
116
- - to_h
117
- - to_s
124
+ - to_h Convert Alignment to hash.
125
+ - to_s Convert to the PAF format without the QueryName and QueryLength columns.
118
126
 
119
- * FFI module
120
- * IdxOpt class
121
- * MapOpt class
127
+ ## FFI module
128
+ * IdxOpt class Indexing options.
129
+ * MapOpt class Mapping options.
122
130
  ```
123
131
 
124
- The ruby-minimap2 API is compliant with mappy, the official Python binding for Minimap2. However, there are a few differences. For example, the `map` method has been renamed to `align` since map is the common name for iterators in Ruby.
132
+ This is not all. See the [RubyDoc.info documentation](https://rubydoc.info/gems/minimap2/) for more details.
125
133
 
126
- * [Mappy: Minimap2 Python Binding](https://github.com/lh3/minimap2/tree/master/python)
134
+ ruby-minimap2 is built on top of [Ruby-FFI](https://github.com/ffi/ffi).
135
+ Native functions can be called from the FFI module. FFI also provides the way to access some C structs.
127
136
 
128
- ruby-minimap2 is built on top of [Ruby-FFI](https://github.com/ffi/ffi). Native functions can be called from the FFI module, which also provides a way to access some C structs such as IdxOpt and MapOpt.
137
+ ```ruby
138
+ aligner.idx_opt.members
139
+ # => [:k, :w, :flag, :bucket_bits, :mini_batch_size, :batch_size]
140
+ aligner.kds_opt.values
141
+ # => [15, 10, 0, 14, 50000000, 9223372036854775807]
142
+ aligner.idx_opt[:k]
143
+ # => 15
144
+ aligner.idx_opt[:k] = 14
145
+ aligner.idx_opt[:k]
146
+ # => 14
147
+ ```
129
148
 
130
149
  ## Development
131
150
 
132
- Fork your repository and clone.
151
+ Fork your repository.
152
+ then clone.
133
153
 
134
154
  ```sh
135
155
  git clone --recursive https://github.com/kojix2/ruby-minimap2
@@ -138,7 +158,7 @@ git clone --recursive https://github.com/kojix2/ruby-minimap2
138
158
  # git submodule update -i
139
159
  ```
140
160
 
141
- Build.
161
+ Build Minimap2 and Mappy.
142
162
 
143
163
  ```sh
144
164
  cd ruby-minimap2
@@ -146,6 +166,13 @@ bundle install # Install dependent packages including Ruby-FFI
146
166
  bundle exec rake minimap2:build
147
167
  ```
148
168
 
169
+ A shared library will be created in the vendor directory.
170
+
171
+ ```
172
+ └── vendor
173
+ └── libminimap2.so
174
+ ```
175
+
149
176
  Run tests.
150
177
 
151
178
  ```
@@ -166,3 +193,7 @@ ruby-minimap2 is a library under development and there are many points to be imp
166
193
  ## License
167
194
 
168
195
  [MIT License](https://opensource.org/licenses/MIT).
196
+
197
+ ## Acknowledgements
198
+
199
+ I would like to thank Heng Li for making Minimap2, and all the readers who read the README to the end.
data/lib/minimap2.rb CHANGED
@@ -34,7 +34,7 @@ module Minimap2
34
34
 
35
35
  # methods from mappy
36
36
  class << self
37
- # read fasta/fastq file
37
+ # Read fasta/fastq file.
38
38
  # @param [String] file_path
39
39
  # @param [Boolean] read_comment If false or nil, the comment will not be read.
40
40
  # @yield [name, seq, qual, comment]
@@ -57,7 +57,7 @@ module Minimap2
57
57
  FFI.mm_fastx_close(ks)
58
58
  end
59
59
 
60
- # reverse complement sequence
60
+ # Reverse complement sequence.
61
61
  # @param [String] seq
62
62
  # @return [string] seq
63
63
 
@@ -68,7 +68,7 @@ module Minimap2
68
68
  FFI.mappy_revcomp(l, bseq)
69
69
  end
70
70
 
71
- # set verbosity level
71
+ # Set verbosity level.
72
72
  # @param [Integer] level
73
73
 
74
74
  def verbose(level = -1)
@@ -4,11 +4,21 @@ module Minimap2
4
4
  class Aligner
5
5
  attr_reader :idx_opt, :map_opt, :index
6
6
 
7
- # Create a new aligner
7
+ # Create a new aligner.
8
8
  #
9
9
  # @param fn_idx_in [String] index or sequence file name.
10
10
  # @param seq [String] a single sequence to index.
11
11
  # @param preset [String] minimap2 preset.
12
+ # * map-pb : PacBio CLR genomic reads
13
+ # * map-ont : Oxford Nanopore genomic reads
14
+ # * map-hifi : PacBio HiFi/CCS genomic reads (v2.19 or later)
15
+ # * asm20 : PacBio HiFi/CCS genomic reads (v2.18 or earlier)
16
+ # * sr : short genomic paired-end reads
17
+ # * splice : spliced long reads (strand unknown)
18
+ # * splice:hq : Final PacBio Iso-seq or traditional cDNA
19
+ # * asm5 : intra-species asm-to-asm alignment
20
+ # * ava-pb : PacBio read overlap
21
+ # * ava-ont : Nanopore read overlap
12
22
  # @param k [Integer] k-mer length, no larger than 28.
13
23
  # @param w [Integer] minimizer window size, no larger than 255.
14
24
  # @param min_cnt [Integer] mininum number of minimizers on a chain.
@@ -101,6 +111,7 @@ module Minimap2
101
111
  end
102
112
 
103
113
  # Explicitly releases the memory of the index object.
114
+
104
115
  def free_index
105
116
  FFI.mm_idx_destroy(index) unless index.null?
106
117
  end
@@ -184,10 +195,10 @@ module Minimap2
184
195
  alignments
185
196
  end
186
197
 
187
- # retrieve a subsequence from the index.
188
- # @params name
189
- # @params start
190
- # @params stop
198
+ # Retrieve a subsequence from the index.
199
+ # @param name
200
+ # @param start
201
+ # @param stop
191
202
 
192
203
  def seq(name, start = 0, stop = 0x7fffffff)
193
204
  lp = ::FFI::MemoryPointer.new(:int)
@@ -1,7 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Minimap2
4
- # Alignment result
4
+ # Alignment result.
5
5
  #
6
6
  # @!attribute ctg
7
7
  # @return [String] name of the reference sequence the query is mapped to.
@@ -73,17 +73,21 @@ module Minimap2
73
73
  @cs = cs
74
74
  @md = md
75
75
 
76
- @cigar_str = cigar.map { |x| x[0].to_s + 'MIDNSH'[x[1]] }.join
76
+ @cigar_str = cigar.map { |x| x[0].to_s + FFI::CIGAR_STR[x[1]] }.join
77
77
  end
78
78
 
79
79
  def primary?
80
80
  @primary == 1
81
81
  end
82
82
 
83
+ # Convert Alignment to hash.
84
+
83
85
  def to_h
84
86
  self.class.keys.map { |k| [k, __send__(k)] }.to_h
85
87
  end
86
88
 
89
+ # Convert to the PAF format without the QueryName and QueryLength columns.
90
+
87
91
  def to_s
88
92
  strand = if @strand.positive?
89
93
  '+'
@@ -34,6 +34,7 @@ module Minimap2
34
34
  NO_END_FLT = 0x10000000
35
35
  HARD_MLEVEL = 0x20000000
36
36
  SAM_HIT_ONLY = 0x40000000
37
+ RMQ = 0x80000000 # LL
37
38
 
38
39
  HPC = 0x1
39
40
  NO_SEQ = 0x2
@@ -43,6 +44,8 @@ module Minimap2
43
44
 
44
45
  MAX_SEG = 255
45
46
 
47
+ CIGAR_STR = 'MIDNSHP=XB'
48
+
46
49
  # emulate 128-bit integers
47
50
  class MM128 < ::FFI::Struct
48
51
  layout \
@@ -1,5 +1,6 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Minimap2
4
- VERSION = '0.0.4'
4
+ # Minimap2-2.21 (r1071).
5
+ VERSION = '0.2.21'
5
6
  end
Binary file
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: minimap2
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.4
4
+ version: 0.2.21
5
5
  platform: ruby
6
6
  authors:
7
7
  - kojix2
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2021-05-27 00:00:00.000000000 Z
11
+ date: 2021-07-06 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: ffi
@@ -112,6 +112,7 @@ files:
112
112
  - lib/minimap2/ffi/mappy.rb
113
113
  - lib/minimap2/ffi_helper.rb
114
114
  - lib/minimap2/version.rb
115
+ - vendor/libminimap2.so
115
116
  homepage: https://github.com/kojix2/ruby-minimap2
116
117
  licenses:
117
118
  - MIT