bio-samtools 0.2.0
- data/.document +5 -0
- data/Gemfile +16 -0
- data/Gemfile.lock +24 -0
- data/LICENSE.txt +702 -0
- data/README.rdoc +85 -0
- data/Rakefile +59 -0
- data/VERSION +1 -0
- data/bio-samtools.gemspec +105 -0
- data/ext/mkrf_conf.rb +65 -0
- data/lib/bio-samtools.rb +2 -0
- data/lib/bio/.DS_Store +0 -0
- data/lib/bio/db/sam.rb +325 -0
- data/lib/bio/db/sam/bam.rb +210 -0
- data/lib/bio/db/sam/external/COPYING +21 -0
- data/lib/bio/db/sam/external/VERSION +1 -0
- data/lib/bio/db/sam/faidx.rb +21 -0
- data/lib/bio/db/sam/library.rb +25 -0
- data/lib/bio/db/sam/sam.rb +84 -0
- data/test/basictest.rb +308 -0
- data/test/coverage.rb +26 -0
- data/test/coverage_plot.rb +28 -0
- data/test/feature.rb +0 -0
- data/test/helper.rb +18 -0
- data/test/samples/small/ids2.txt +1 -0
- data/test/samples/small/sorted.bam +0 -0
- data/test/samples/small/test +0 -0
- data/test/samples/small/test.bam +0 -0
- data/test/samples/small/test.fa +20 -0
- data/test/samples/small/test.fai +0 -0
- data/test/samples/small/test.sai +0 -0
- data/test/samples/small/test.tam +10 -0
- data/test/samples/small/test_chr.fasta +1000 -0
- data/test/samples/small/test_chr.fasta.amb +2 -0
- data/test/samples/small/test_chr.fasta.ann +3 -0
- data/test/samples/small/test_chr.fasta.bwt +0 -0
- data/test/samples/small/test_chr.fasta.fai +1 -0
- data/test/samples/small/test_chr.fasta.pac +0 -0
- data/test/samples/small/test_chr.fasta.rbwt +0 -0
- data/test/samples/small/test_chr.fasta.rpac +0 -0
- data/test/samples/small/test_chr.fasta.rsa +0 -0
- data/test/samples/small/test_chr.fasta.sa +0 -0
- data/test/samples/small/testu.bam +0 -0
- data/test/samples/small/testu.bam.bai +0 -0
- data/test/test_bio-samtools.rb +7 -0
- metadata +185 -0
data/README.rdoc
ADDED
@@ -0,0 +1,85 @@
= bio-samtools

The original project samtools-ruby belongs to Ricardo H. Ramirez @ https://github.com/homonecloco/samtools-ruby

== Introduction

Documentation and code come from that project, and we will adapt them for better integration with BioRuby.

A Ruby binding for samtools, built on top of FFI.

This project was born from the need to add support for BAM files to
the gee_fu genome browser (http://github.com/danmaclean/gee_fu).

== Installation
At the moment, the only way to "install" the module is to copy the content
of the lib folder into any lib folder you have. You must also copy your libbam.a (Linux)
or libbam.dll (Windows) from samtools. If you have a Mac, add the following variable
to the makefile of samtools:

  DYFLAGS = -dynamiclib -lz

and the following target:

  dylib: libbam.dylib
  libbam.dylib: $(LOBJS)
      $(CC) $(CFLAGS) $(DYFLAGS) -o libbam.dylib $(LOBJS)

Then, run

  make dylib

Finally, copy the .dylib file to your lib folder.

== Usage
The easiest way to see samtools-ruby in action is to run

  rake test
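Beyond the test suite, the methods that lib/bio/db/sam.rb defines in this release (Bio::DB::Sam.new, open, fetch, average_coverage, close) can be combined roughly as sketched here. This is only an illustration: region_summary and FakeSam are hypothetical names, FakeSam merely mimics the two methods used so that the snippet runs without libbam, and the chromosome name "chr_1" is an assumption about the sample data.

```ruby
# Hypothetical helper: summarises a region using only two methods that
# Bio::DB::Sam defines in this release: fetch(chromosome, qstart, qend)
# and average_coverage(chromosome, qstart, len).
def region_summary(sam, chromosome, qstart, qend)
  alignments = sam.fetch(chromosome, qstart, qend)
  {
    :reads         => alignments.size,
    :mean_coverage => sam.average_coverage(chromosome, qstart, qend - qstart)
  }
end

# Stand-in object so the sketch runs without libbam (not part of the gem).
class FakeSam
  def fetch(chromosome, qstart, qend)
    ["read1", "read2", "read3"]
  end

  def average_coverage(chromosome, qstart, len)
    1.5
  end
end

summary = region_summary(FakeSam.new, "chr_1", 0, 100)
puts "#{summary[:reads]} reads, mean coverage #{summary[:mean_coverage]}"

# With the real library the call would presumably look like this
# (file names taken from the test samples shipped with the gem):
#   require 'bio/db/sam'
#   sam = Bio::DB::Sam.new(:bam   => "test/samples/small/testu.bam",
#                          :fasta => "test/samples/small/test_chr.fasta")
#   sam.open
#   p region_summary(sam, "chr_1", 0, 100)
#   sam.close
```

Passing the Sam object in as an argument keeps the helper independent of how the file was opened, which also makes it easy to exercise without the native library.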


== Dependencies:
- FFI (http://github.com/ffi/ffi)
- libbam, which can be obtained from samtools (http://samtools.sourceforge.net/)

== FAQ.
I'm getting a segmentation fault, what did I do wrong?

There are two known causes of segmentation faults at the moment:
- loading a text file as a binary file
- loading a binary file as a text file

If this is not the problem, or you have any other question, don't hesitate to
drop a line to
Ricardo dot Ramirez-Gonzalez at bbsrc dot ac dot uk
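
One way to avoid the two known causes is to look at the file before choosing between the :bam and :tam options. The sketch below is not part of bio-samtools; the helper names are hypothetical. It assumes that BAM files are BGZF (gzip-compatible) streams and therefore start with the gzip magic bytes 0x1f 0x8b, so a gzipped text file would be misclassified as binary.

```ruby
require 'tempfile'

# First two bytes of any gzip/BGZF stream, as a binary string.
GZIP_MAGIC = "\x1f\x8b".b

# Hypothetical helper: true if the file starts with the gzip magic bytes.
def looks_binary_alignment?(path)
  File.open(path, "rb") { |f| f.read(2) } == GZIP_MAGIC
end

# Returns the option key (:bam or :tam) to pass to Bio::DB::Sam.new.
def alignment_option_for(path)
  looks_binary_alignment?(path) ? :bam : :tam
end

# Demonstration on two throw-away files.
Tempfile.create("binary_demo") do |f|
  f.binmode
  f.write("\x1f\x8b\x08\x04".b)   # start of a gzip header
  f.flush
  puts alignment_option_for(f.path)
end

Tempfile.create("text_demo") do |f|
  f.write("@HD\tVN:1.0\n")        # a text header line
  f.flush
  puts alignment_option_for(f.path)
end
```

The result could then be used as Bio::DB::Sam.new(alignment_option_for(path) => path), so that a text file is never handed to the binary reader or the other way round.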

== TODO
- Write a gem to install it
- Add a filter to the fetching algorithm (a condition that has to be satisfied to add the alignment to the list)
- Examples of how to use it, besides the test folder
- Operating system independent, DONE (test needed)

== IMPORTANT NOTE
- Libraries are compiled for 64 bit machines
- The gem/archive comes with precompiled .a and .dylib libraries; the .dll is missing because I didn't compile it.
- Library .a: I didn't check whether it works on Linux (test needed)
- Library libbam.so.1: simply compiled with make dylib, without any special parameter, on
  CentOS5.5 x86_64, gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-48).

    $ openssl dgst libbam.so.1
    MD5(libbam.so.1)= c45cfccfb41ffeb2730ee4b227d244c4


== Contributing to bio-samtools

* Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet
* Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it
* Fork the project
* Start a feature/bugfix branch
* Commit and push until you are happy with your contribution
* Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
* Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.

== Copyright

Copyright (c) 2011 Raoul J.P. Bonnal. See LICENSE.txt for
further details.
data/Rakefile
ADDED
@@ -0,0 +1,59 @@
require 'rubygems'
require 'bundler'


begin
  Bundler.setup(:default, :development)
rescue Bundler::BundlerError => e
  $stderr.puts e.message
  $stderr.puts "Run `bundle install` to install missing gems"
  exit e.status_code
end
require 'rake'

require 'jeweler'
Jeweler::Tasks.new do |gem|
  # gem is a Gem::Specification... see http://docs.rubygems.org/read/chapter/20 for more options
  gem.name = "bio-samtools"
  gem.homepage = "http://github.com/helios/bioruby-samtools"
  gem.license = "MIT"
  gem.summary = %Q{Binder of samtools for ruby, on the top of FFI.}
  gem.description = %Q{Binder of samtools for ruby, on the top of FFI.

This project was born from the need to add support of BAM files to
the gee_fu genome browser (http://github.com/danmaclean/gee_fu).}
  gem.email = "ilpuccio.febo@gmail.com"
  gem.authors = ["Ricardo Ramirez-Gonzalez","Dan MacLean","Raoul J.P. Bonnal"]
  # Include your dependencies below. Runtime dependencies are required when using your gem,
  # and development dependencies are only needed for development (ie running rake tasks, tests, etc)
  #  gem.add_runtime_dependency 'jabber4r', '> 0.1'
  #  gem.add_development_dependency 'rspec', '> 1.2.3'
  gem.extensions = "ext/mkrf_conf.rb"
end
Jeweler::RubygemsDotOrgTasks.new

require 'rake/testtask'
Rake::TestTask.new(:test) do |test|
  test.libs << 'lib' << 'test'
  test.pattern = 'test/**/test_*.rb'
  test.verbose = true
end

require 'rcov/rcovtask'
Rcov::RcovTask.new do |test|
  test.libs << 'test'
  test.pattern = 'test/**/test_*.rb'
  test.verbose = true
end

task :default => :test

require 'rake/rdoctask'
Rake::RDocTask.new do |rdoc|
  version = File.exist?('VERSION') ? File.read('VERSION') : ""

  rdoc.rdoc_dir = 'rdoc'
  rdoc.title = "bio-samtools #{version}"
  rdoc.rdoc_files.include('README*')
  rdoc.rdoc_files.include('lib/**/*.rb')
end
data/VERSION
ADDED
@@ -0,0 +1 @@
0.2.0
data/bio-samtools.gemspec
ADDED
@@ -0,0 +1,105 @@
# Generated by jeweler
# DO NOT EDIT THIS FILE DIRECTLY
# Instead, edit Jeweler::Tasks in Rakefile, and run 'rake gemspec'
# -*- encoding: utf-8 -*-

Gem::Specification.new do |s|
  s.name = %q{bio-samtools}
  s.version = "0.2.0"

  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.authors = ["Ricardo Ramirez-Gonzalez", "Dan MacLean", "Raoul J.P. Bonnal"]
  s.date = %q{2011-05-23}
  s.description = %q{Binder of samtools for ruby, on the top of FFI.

This project was born from the need to add support of BAM files to
the gee_fu genome browser (http://github.com/danmaclean/gee_fu).}
  s.email = %q{ilpuccio.febo@gmail.com}
  s.extensions = ["ext/mkrf_conf.rb"]
  s.extra_rdoc_files = [
    "LICENSE.txt",
    "README.rdoc"
  ]
  s.files = [
    ".document",
    "Gemfile",
    "Gemfile.lock",
    "LICENSE.txt",
    "README.rdoc",
    "Rakefile",
    "VERSION",
    "bio-samtools.gemspec",
    "ext/mkrf_conf.rb",
    "lib/bio-samtools.rb",
    "lib/bio/.DS_Store",
    "lib/bio/db/sam.rb",
    "lib/bio/db/sam/bam.rb",
    "lib/bio/db/sam/external/COPYING",
    "lib/bio/db/sam/external/VERSION",
    "lib/bio/db/sam/faidx.rb",
    "lib/bio/db/sam/library.rb",
    "lib/bio/db/sam/sam.rb",
    "test/basictest.rb",
    "test/coverage.rb",
    "test/coverage_plot.rb",
    "test/feature.rb",
    "test/helper.rb",
    "test/samples/small/ids2.txt",
    "test/samples/small/sorted.bam",
    "test/samples/small/test",
    "test/samples/small/test.bam",
    "test/samples/small/test.fa",
    "test/samples/small/test.fai",
    "test/samples/small/test.sai",
    "test/samples/small/test.tam",
    "test/samples/small/test_chr.fasta",
    "test/samples/small/test_chr.fasta.amb",
    "test/samples/small/test_chr.fasta.ann",
    "test/samples/small/test_chr.fasta.bwt",
    "test/samples/small/test_chr.fasta.fai",
    "test/samples/small/test_chr.fasta.pac",
    "test/samples/small/test_chr.fasta.rbwt",
    "test/samples/small/test_chr.fasta.rpac",
    "test/samples/small/test_chr.fasta.rsa",
    "test/samples/small/test_chr.fasta.sa",
    "test/samples/small/testu.bam",
    "test/samples/small/testu.bam.bai",
    "test/test_bio-samtools.rb"
  ]
  s.homepage = %q{http://github.com/helios/bioruby-samtools}
  s.licenses = ["MIT"]
  s.require_paths = ["lib"]
  s.rubygems_version = %q{1.5.0}
  s.summary = %q{Binder of samtools for ruby, on the top of FFI.}

  if s.respond_to? :specification_version then
    s.specification_version = 3

    if Gem::Version.new(Gem::VERSION) >= Gem::Version.new('1.2.0') then
      s.add_runtime_dependency(%q<ffi>, [">= 0"])
      s.add_development_dependency(%q<shoulda>, [">= 0"])
      s.add_development_dependency(%q<bundler>, ["~> 1.0.0"])
      s.add_development_dependency(%q<jeweler>, [">= 0"])
      s.add_development_dependency(%q<rcov>, [">= 0"])
      s.add_development_dependency(%q<bio>, [">= 1.4.1"])
      s.add_development_dependency(%q<ffi>, [">= 0"])
    else
      s.add_dependency(%q<ffi>, [">= 0"])
      s.add_dependency(%q<shoulda>, [">= 0"])
      s.add_dependency(%q<bundler>, ["~> 1.0.0"])
      s.add_dependency(%q<jeweler>, [">= 0"])
      s.add_dependency(%q<rcov>, [">= 0"])
      s.add_dependency(%q<bio>, [">= 1.4.1"])
      s.add_dependency(%q<ffi>, [">= 0"])
    end
  else
    s.add_dependency(%q<ffi>, [">= 0"])
    s.add_dependency(%q<shoulda>, [">= 0"])
    s.add_dependency(%q<bundler>, ["~> 1.0.0"])
    s.add_dependency(%q<jeweler>, [">= 0"])
    s.add_dependency(%q<rcov>, [">= 0"])
    s.add_dependency(%q<bio>, [">= 1.4.1"])
    s.add_dependency(%q<ffi>, [">= 0"])
  end
end

data/ext/mkrf_conf.rb
ADDED
@@ -0,0 +1,65 @@
#(c) Copyright 2011 Raoul Bonnal. All Rights Reserved.

# create Rakefile for shared library compilation



path = File.expand_path(File.dirname(__FILE__))

path_external = File.join(path, "../lib/bio/db/sam/external")

version = File.open(File.join(path_external,"VERSION"),'r')
# strip a possible trailing newline so the version can be used in URLs and paths
Version = version.read.strip
version.close

url = "http://sourceforge.net/projects/samtools/files/samtools/#{Version}/samtools-#{Version}.tar.bz2/download"
SamToolsFile = "samtools-#{Version}.tar.bz2"

File.open(File.join(path,"Rakefile"),"w") do |rakefile|
rakefile.write <<-RAKE
require 'rbconfig'
require 'open-uri'
require 'fileutils'
include FileUtils::Verbose
require 'rake/clean'

URL = "#{url}"

task :download do
  open(URL) do |uri|
    File.open("#{SamToolsFile}",'wb') do |fout|
      fout.write(uri.read)
    end #fout
  end #uri
end

task :compile do
  sh "tar xvfj #{SamToolsFile}"
  cd("samtools-#{Version}") do
    sh "make"
    cp("libbam.a","#{path_external}")
    # RbConfig is the current name of the module formerly also exposed as Config
    case RbConfig::CONFIG['host_os']
      when /linux/
        sh "make libbam.so.1-local"
        cp("libbam.so.1","#{path_external}")
      when /darwin/
        sh "make libbam.1.dylib-local"
        cp("libbam.1.dylib","#{path_external}")
      when /mswin|mingw/ then raise NotImplementedError, "samtools library (libbam) is not available for Windows platform"
    end #case
  end #cd
end

task :clean do
  cd("samtools-#{Version}") do
    sh "make clean"
  end
  rm("#{SamToolsFile}")
  rm_rf("samtools-#{Version}")
end

task :default => [:download, :compile, :clean]

RAKE

end
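The platform switch in the generated Rakefile above can be read as a mapping from host_os to the shared library file that gets copied next to the Ruby sources. A minimal sketch of that mapping follows; libbam_for is a hypothetical helper, and the explicit else branch is an addition for illustration (the script above simply does nothing on unrecognised platforms).

```ruby
require 'rbconfig'

# Hypothetical helper mirroring the case statement in the generated Rakefile:
# maps a host_os string to the shared library file the compile task copies.
def libbam_for(host_os)
  case host_os
  when /linux/       then "libbam.so.1"
  when /darwin/      then "libbam.1.dylib"
  when /mswin|mingw/ then raise NotImplementedError, "libbam is not built for Windows here"
  else raise NotImplementedError, "unsupported platform: #{host_os}"
  end
end

puts libbam_for("linux-gnu")
puts libbam_for("darwin10.7.0")

# On the current machine (raises NotImplementedError on other platforms):
#   libbam_for(RbConfig::CONFIG['host_os'])
```

Note that NotImplementedError descends from ScriptError rather than StandardError, so a bare rescue will not catch it; callers need to rescue it by name.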
data/lib/bio-samtools.rb
ADDED
data/lib/bio/.DS_Store
ADDED
Binary file
|
data/lib/bio/db/sam.rb
ADDED
@@ -0,0 +1,325 @@
require 'bio/db/sam/library'
require 'bio/db/sam/bam'
require 'bio/db/sam/faidx'
require 'bio/db/sam/sam'

module LibC
  extend FFI::Library
  ffi_lib FFI::Library::LIBC
  attach_function :free, [ :pointer ], :void
  # call #attach_function to attach to malloc, free, memcpy, bcopy, etc.
end

module Bio
  class DB
    class Sam
      attr_reader :sam_file

      def initialize(optsa={})
        opts = { :fasta => nil, :bam => nil,:tam => nil, :compressed => true, :write => false }.merge!(optsa)



        @fasta_path = opts[:fasta]
        @compressed = opts[:compressed]
        @write = opts[:write]
        bam = opts[:bam]
        tam = opts[:tam]

        if bam == nil && tam == nil && @fasta_path == nil then
          raise SAMException.new(), "No alignment or reference file"
        elsif bam != nil && tam != nil then
          raise SAMException.new(), "Alignment has to be in either text or binary format, not both"
        elsif bam != nil then
          @binary = true
          @sam = bam
        elsif tam != nil then
          @sam = tam
          @binary = false

        end
        @fasta_file = nil
        @sam_file = nil

        ObjectSpace.define_finalizer(self, self.class.method(:finalize).to_proc)
      end

      def open()

        raise SAMException.new(), "Writing not supported yet" if @write
        raise SAMException.new(), "No SAM file specified" unless @sam

        opts = @write ? "w" : "r"
        if @binary then
          opts += "b"
          if @write then
            unless @compressed then
              opts += "u"
            end
          end
        end
        valid = ["r", "w", "wh", "rb", "wb" , "wbu"]
        unless valid.include?(opts) then
          raise SAMException.new(), "Invalid options for samopen: " + opts
        end

        samFile = Bio::DB::SAM::Tools.samopen(@sam, opts, nil)
        if samFile.null? then
          @sam_file = nil
          raise SAMException.new(), "File not opened: " + @sam
        end
        @sam_file = Bio::DB::SAM::Tools::SamfileT.new(samFile)

      end

      def to_s()
        # interpolation avoids a TypeError when @sam or @fasta_path is nil
        "#{@binary ? 'Binary' : 'Text'} file: #{@sam} with fasta: #{@fasta_path}"
      end

      def close()
        Bio::DB::SAM::Tools.fai_destroy(@fasta_index) unless @fasta_index.nil? || @fasta_index.null?
        Bio::DB::SAM::Tools.bam_index_destroy(@sam_index) unless @sam_index.nil? || @sam_index.null?
        Bio::DB::SAM::Tools.samclose(@sam_file) unless @sam_file.nil?
        @sam_file = nil
        @fasta_index = nil
      end

      # NOTE: finalizer procs are called with the object's id (an Integer),
      # not the object itself, so id.close() cannot reach the instance;
      # call #close explicitly when done with the file.
      def Sam.finalize(id)
        id.close()
        puts "Finalizing #{id} at #{Time.new}"
      end

      def load_index()
        raise SAMException.new(), "Indexes are only supported by BAM files, please use samtools to convert your SAM file" unless @binary
        @sam_index = Bio::DB::SAM::Tools.bam_index_load(@sam)
        if @sam_index.null? then
          p "Generating index for: " + @sam
          Bio::DB::SAM::Tools.bam_index_build(@sam)
          @sam_index = Bio::DB::SAM::Tools.bam_index_load(@sam)
          raise SAMException.new(), "Unable to generate bam index for: " + @sam if @sam_index.nil? || @sam_index.null?
        end
      end

      def load_reference()
        raise SAMException.new(), "No path for the reference fasta file. " if @fasta_path.nil?

        @fasta_index = Bio::DB::SAM::Tools.fai_load(@fasta_path)

        if @fasta_index.null? then
          p "Generating index for: " + @fasta_path
          Bio::DB::SAM::Tools.fai_build(@fasta_path)
          @fasta_index = Bio::DB::SAM::Tools.fai_load(@fasta_path)
          raise SAMException.new(), "Unable to generate fasta index for: " + @fasta_path if @fasta_index.nil? || @fasta_index.null?
        end

      end

      def average_coverage(chromosome, qstart, len)

        #reference = fetch_reference(chromosome, qstart,len)
        # len = reference.length if len > reference.length


        coverages = chromosome_coverage(chromosome, qstart, len)
        total = 0
        len.times{ |i| total= total + coverages[i] }
        avg_cov = total.to_f / len
        #LibC.free reference
        avg_cov
      end

      def chromosome_coverage(chromosome, qstart, len)
        # reference = fetch_reference(chromosome, qstart,len)
        # len = reference.length if len > reference.length
        #p qend.to_s + "-" + qstart.to_s + "framesize " + (qend - qstart).to_s
        coverages = Array.new(len, 0)

        chr_cov_proc = Proc.new do |alignment|
          #last = qstart + len
          #first = qstart
          #last = alignment.calend if last > alignment.calend
          #first = alignment.pos if first < alignment.pos
          # p first
          last = alignment.calend - qstart
          first = alignment.pos - qstart
          # swap so that first <= last (the saved value goes into first)
          if last < first
            tmp = last
            last = first
            first = tmp
          end

          # STDERR.puts "#{first} #{last}\n"
          first.upto(last-1) { |i|

            coverages[i-1] = 1 + coverages[i-1] if i-1 < len && i > 0
          }
        end

        fetch_with_function(chromosome, qstart, qstart+len, chr_cov_proc)
        #p coverages
        coverages
      end

      def fetch_reference(chromosome, qstart,qend)
        load_reference if @fasta_index.nil? || @fasta_index.null?
        query = query_string(chromosome, qstart,qend)
        len = FFI::MemoryPointer.new :int
        reference = Bio::DB::SAM::Tools.fai_fetch(@fasta_index, query, len)
        raise SAMException.new(), "Unable to get sequence for reference: "+query if reference.nil?

        reference
      end

      def query_string(chromosome, qstart,qend)
        query = chromosome + ":" + qstart.to_s + "-" + qend.to_s
        query
      end

      def fetch(chromosome, qstart, qend)
        als = Array.new
        fetchAlignment = Proc.new do |alignment|
          als.push(alignment.clone)
          0
        end
        fetch_with_function(chromosome, qstart, qend, fetchAlignment)
        als
      end

      def fetch_with_function(chromosome, qstart, qend, function)
        load_index if @sam_index.nil? || @sam_index.null?
        chr = FFI::MemoryPointer.new :int
        beg = FFI::MemoryPointer.new :int
        last = FFI::MemoryPointer.new :int
        query = query_string(chromosome, qstart,qend)
        qpointer = FFI::MemoryPointer.from_string(query)
        header = @sam_file[:header]
        Bio::DB::SAM::Tools.bam_parse_region(header,qpointer, chr, beg, last)
        #raise SAMException.new(), "invalid query: " + query if(chr.read_int < 0)
        count = 0;

        fetchAlignment = Proc.new do |bam_alignment, data|
          alignment = Alignment.new
          alignment.set(bam_alignment, header)
          function.call(alignment)
          count = count + 1
          0
        end
        Bio::DB::SAM::Tools.bam_fetch(@sam_file[:x][:bam], @sam_index,chr.read_int,beg.read_int, last.read_int, nil, fetchAlignment)
        #LibC.free chr
        #LibC.free beg
        #LibC.free last
        #LibC.free qpointer
        count
      end

    end

    class Tag
      attr_accessor :tag, :type, :value
      def set(str)
        v = str.split(":")
        @tag = v[0]
        @type = v[1]
        @value = v[2]
      end
    end

    class Alignment

      def initialize
        ObjectSpace.define_finalizer(self,
                                     self.class.method(:finalize).to_proc)
      end
      def Alignment.finalize(object_id)

        # puts "Object #{object_id} dying at #{Time.new}"
        # p "?" . object_id.al
        # p object_id.al
        LibC.free object_id.al
        LibC.free object_id.sam
        LibC.free object_id.calend
        LibC.free object_id.qlen

        LibC.free object_id.samstr
      end

      #Attributes from the format
      attr_accessor :qname, :flag, :rname,:pos,:mapq,:cigar, :mrnm, :mpos, :isize, :seq, :qual, :tags, :al, :samstr
      #Attributes pulled with the C library
      attr_accessor :calend, :qlen
      #Attributes from the flag field (see chapter 2.2.2 of the sam file documentation)
      #query_strand and mate_strand are true if they are forward. It is the opposite to the definition in the BAM format for clarity.
      #primary is the negation of is_negative from the BAM format
      attr_accessor :is_paired, :is_mapped, :query_unmapped, :mate_unmapped, :query_strand, :mate_strand, :first_in_pair,:second_in_pair, :primary, :failed_quality, :is_duplicate

      def set(bam_alignment, header)
        #Create the FFI object
        @al = Bio::DB::SAM::Tools::Bam1T.new(bam_alignment)

        #set the raw data
        tmp_str = Bio::DB::SAM::Tools.bam_format1(header,al)
        #self.sam = tmp_str
        #ObjectSpace.define_finalizer(self, proc {|id| puts "Finalizer one on #{id}" })
        self.sam = String.new(tmp_str)
        #LibC.free tmp_str
        #Set values calculated by libbam
        core = al[:core]
        cigar = al[:data][core[:l_qname]]#define bam1_cigar(b) ((uint32_t*)((b)->data + (b)->core.l_qname))
        @calend = Bio::DB::SAM::Tools.bam_calend(core,cigar)
        @qlen = Bio::DB::SAM::Tools.bam_cigar2qlen(core,cigar)

        #process the flags
        @is_paired = @flag & 0x0001 > 0
        @is_mapped = @flag & 0x0002 > 0
        @query_unmapped = @flag & 0x0004 > 0
        @mate_unmapped = @flag & 0x0008 > 0
        @query_strand = !(@flag & 0x0010 > 0)
        @mate_strand = !(@flag & 0x0020 > 0)
        @first_in_pair = @flag & 0x0040 > 0
        @second_in_pair = @flag & 0x0080 > 0
        @primary = !(@flag & 0x0100 > 0)
        @failed_quality = @flag & 0x0200 > 0
        @is_duplicate = @flag & 0x0400 > 0

      end


      def sam=(sam)
        #p sam
        s = sam.split("\t")
        self.qname = s[0]
        self.flag = s[1].to_i
        self.rname = s[2]
        self.pos = s[3].to_i
        self.mapq = s[4].to_i
        self.cigar = s[5]
        self.mrnm = s[6]
        self.mpos = s[7].to_i
        self.isize = s[8].to_i
        self.seq = s[9]
        self.qual = s[10]
        self.tags = {}
        11.upto(s.size-1) {|n|
          t = Tag.new
          t.set(s[n])
          tags[t.tag] = t
        }


        #<QNAME> <FLAG> <RNAME> <POS> <MAPQ> <CIGAR> <MRNM> <MPOS> <ISIZE> <SEQ> <QUAL> \
        #[<TAG>:<VTYPE>:<VALUE> [...]]

      end

    end

    class SAMException < RuntimeError
      #we can add further variables to give information of the exception
      def initialize()

      end
    end
  end
end
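
The flag handling in Alignment#set above can be tried without libbam. The following stand-alone sketch applies the same bit masks to a plain integer; decode_flag and the two constants are hypothetical names, and the attribute names are copied from the source (note that bit 0x0002, called is_mapped there, is the "properly aligned pair" bit in the SAM specification).

```ruby
# Bit masks as used in Alignment#set; values follow the SAM flag field.
FLAG_BITS = {
  :is_paired      => 0x0001,
  :is_mapped      => 0x0002,  # "properly paired" in the SAM specification
  :query_unmapped => 0x0004,
  :mate_unmapped  => 0x0008,
  :query_strand   => 0x0010,
  :mate_strand    => 0x0020,
  :first_in_pair  => 0x0040,
  :second_in_pair => 0x0080,
  :primary        => 0x0100,
  :failed_quality => 0x0200,
  :is_duplicate   => 0x0400
}

# Attributes the source stores negated (true means the bit is NOT set).
INVERTED = [:query_strand, :mate_strand, :primary]

# Hypothetical helper: decodes an integer flag into a hash of booleans.
def decode_flag(flag)
  decoded = {}
  FLAG_BITS.each do |name, mask|
    bit_set = (flag & mask) > 0
    decoded[name] = INVERTED.include?(name) ? !bit_set : bit_set
  end
  decoded
end

# 99 = 0x01 + 0x02 + 0x20 + 0x40
flags = decode_flag(99)
puts flags[:is_paired]      # bit 0x0001 is set
puts flags[:mate_strand]    # bit 0x0020 is set, so the mate is on the reverse strand
puts flags[:first_in_pair]  # bit 0x0040 is set
```

Keeping the masks in one table makes the three negated attributes explicit, which is the part of the original that is easiest to misread.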
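
The per-base counting behind chromosome_coverage and average_coverage can also be illustrated in pure Ruby. This is a simplified sketch, not the library's code: window_coverage and mean_coverage are hypothetical names, reads are given as [start, stop) pairs in 0-based reference coordinates, and that convention is an assumption which may not match the offsets used in the source.

```ruby
# Hypothetical stand-alone version of the counting idea: for every read,
# add one to each position of the window that the read overlaps.
def window_coverage(reads, qstart, len)
  coverages = Array.new(len, 0)
  reads.each do |first, last|
    # normalise reversed intervals with a parallel assignment
    first, last = last, first if last < first
    lo = [first - qstart, 0].max    # clip to the start of the window
    hi = [last - qstart, len].min   # clip to the end of the window
    (lo...hi).each { |i| coverages[i] += 1 }
  end
  coverages
end

# Mean of the per-base counts over the window.
def mean_coverage(reads, qstart, len)
  window_coverage(reads, qstart, len).inject(0) { |sum, c| sum + c }.to_f / len
end

reads = [[0, 4], [2, 6], [5, 3]]   # the last interval is given reversed
p window_coverage(reads, 0, 6)
p window_coverage(reads, 2, 3)
puts mean_coverage(reads, 0, 6)
```

Clipping each read to the window before counting avoids the per-position bounds check and keeps negative indices from silently wrapping around to the end of the array.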