ngs_server 0.1 → 0.2

Files changed (248)
  1. data/bin/ngs_server +72 -50
  2. data/ext/bamtools/extconf.rb +3 -3
  3. data/ext/vcftools/Makefile +28 -0
  4. data/ext/vcftools/README.txt +36 -0
  5. data/ext/vcftools/cpp/.svn/all-wcprops +125 -0
  6. data/ext/vcftools/cpp/.svn/dir-prop-base +6 -0
  7. data/ext/vcftools/cpp/.svn/entries +708 -0
  8. data/ext/vcftools/cpp/.svn/text-base/Makefile.svn-base +46 -0
  9. data/ext/vcftools/cpp/.svn/text-base/dgeev.cpp.svn-base +146 -0
  10. data/ext/vcftools/cpp/.svn/text-base/dgeev.h.svn-base +43 -0
  11. data/ext/vcftools/cpp/.svn/text-base/output_log.cpp.svn-base +79 -0
  12. data/ext/vcftools/cpp/.svn/text-base/output_log.h.svn-base +34 -0
  13. data/ext/vcftools/cpp/.svn/text-base/parameters.cpp.svn-base +535 -0
  14. data/ext/vcftools/cpp/.svn/text-base/parameters.h.svn-base +154 -0
  15. data/ext/vcftools/cpp/.svn/text-base/vcf_entry.cpp.svn-base +497 -0
  16. data/ext/vcftools/cpp/.svn/text-base/vcf_entry.h.svn-base +190 -0
  17. data/ext/vcftools/cpp/.svn/text-base/vcf_entry_getters.cpp.svn-base +421 -0
  18. data/ext/vcftools/cpp/.svn/text-base/vcf_entry_setters.cpp.svn-base +482 -0
  19. data/ext/vcftools/cpp/.svn/text-base/vcf_file.cpp.svn-base +495 -0
  20. data/ext/vcftools/cpp/.svn/text-base/vcf_file.h.svn-base +184 -0
  21. data/ext/vcftools/cpp/.svn/text-base/vcf_file_diff.cpp.svn-base +1282 -0
  22. data/ext/vcftools/cpp/.svn/text-base/vcf_file_filters.cpp.svn-base +1215 -0
  23. data/ext/vcftools/cpp/.svn/text-base/vcf_file_format_convert.cpp.svn-base +1138 -0
  24. data/ext/vcftools/cpp/.svn/text-base/vcf_file_index.cpp.svn-base +171 -0
  25. data/ext/vcftools/cpp/.svn/text-base/vcf_file_output.cpp.svn-base +3012 -0
  26. data/ext/vcftools/cpp/.svn/text-base/vcftools.cpp.svn-base +107 -0
  27. data/ext/vcftools/cpp/.svn/text-base/vcftools.h.svn-base +25 -0
  28. data/ext/vcftools/cpp/Makefile +46 -0
  29. data/ext/vcftools/cpp/dgeev.cpp +146 -0
  30. data/ext/vcftools/cpp/dgeev.h +43 -0
  31. data/ext/vcftools/cpp/output_log.cpp +79 -0
  32. data/ext/vcftools/cpp/output_log.h +34 -0
  33. data/ext/vcftools/cpp/parameters.cpp +535 -0
  34. data/ext/vcftools/cpp/parameters.h +154 -0
  35. data/ext/vcftools/cpp/vcf_entry.cpp +497 -0
  36. data/ext/vcftools/cpp/vcf_entry.h +190 -0
  37. data/ext/vcftools/cpp/vcf_entry_getters.cpp +421 -0
  38. data/ext/vcftools/cpp/vcf_entry_setters.cpp +482 -0
  39. data/ext/vcftools/cpp/vcf_file.cpp +495 -0
  40. data/ext/vcftools/cpp/vcf_file.h +184 -0
  41. data/ext/vcftools/cpp/vcf_file_diff.cpp +1282 -0
  42. data/ext/vcftools/cpp/vcf_file_filters.cpp +1215 -0
  43. data/ext/vcftools/cpp/vcf_file_format_convert.cpp +1138 -0
  44. data/ext/vcftools/cpp/vcf_file_index.cpp +171 -0
  45. data/ext/vcftools/cpp/vcf_file_output.cpp +3012 -0
  46. data/ext/vcftools/cpp/vcftools.cpp +107 -0
  47. data/ext/vcftools/cpp/vcftools.h +25 -0
  48. data/ext/vcftools/examples/.svn/all-wcprops +185 -0
  49. data/ext/vcftools/examples/.svn/dir-prop-base +6 -0
  50. data/ext/vcftools/examples/.svn/entries +1048 -0
  51. data/ext/vcftools/examples/.svn/prop-base/perl-api-1.pl.svn-base +5 -0
  52. data/ext/vcftools/examples/.svn/text-base/annotate-test.vcf.svn-base +37 -0
  53. data/ext/vcftools/examples/.svn/text-base/annotate.out.svn-base +23 -0
  54. data/ext/vcftools/examples/.svn/text-base/annotate.txt.svn-base +7 -0
  55. data/ext/vcftools/examples/.svn/text-base/annotate2.out.svn-base +52 -0
  56. data/ext/vcftools/examples/.svn/text-base/annotate3.out.svn-base +23 -0
  57. data/ext/vcftools/examples/.svn/text-base/cmp-test-a-3.3.vcf.svn-base +12 -0
  58. data/ext/vcftools/examples/.svn/text-base/cmp-test-a.vcf.svn-base +12 -0
  59. data/ext/vcftools/examples/.svn/text-base/cmp-test-b-3.3.vcf.svn-base +12 -0
  60. data/ext/vcftools/examples/.svn/text-base/cmp-test-b.vcf.svn-base +12 -0
  61. data/ext/vcftools/examples/.svn/text-base/cmp-test.out.svn-base +53 -0
  62. data/ext/vcftools/examples/.svn/text-base/concat-a.vcf.svn-base +21 -0
  63. data/ext/vcftools/examples/.svn/text-base/concat-b.vcf.svn-base +13 -0
  64. data/ext/vcftools/examples/.svn/text-base/concat-c.vcf.svn-base +19 -0
  65. data/ext/vcftools/examples/.svn/text-base/concat.out.svn-base +39 -0
  66. data/ext/vcftools/examples/.svn/text-base/invalid-4.0.vcf.svn-base +31 -0
  67. data/ext/vcftools/examples/.svn/text-base/isec-n2-test.vcf.out.svn-base +19 -0
  68. data/ext/vcftools/examples/.svn/text-base/merge-test-a.vcf.svn-base +17 -0
  69. data/ext/vcftools/examples/.svn/text-base/merge-test-b.vcf.svn-base +17 -0
  70. data/ext/vcftools/examples/.svn/text-base/merge-test-c.vcf.svn-base +15 -0
  71. data/ext/vcftools/examples/.svn/text-base/merge-test.vcf.out.svn-base +31 -0
  72. data/ext/vcftools/examples/.svn/text-base/perl-api-1.pl.svn-base +46 -0
  73. data/ext/vcftools/examples/.svn/text-base/query-test.out.svn-base +6 -0
  74. data/ext/vcftools/examples/.svn/text-base/shuffle-test.vcf.svn-base +12 -0
  75. data/ext/vcftools/examples/.svn/text-base/subset.SNPs.out.svn-base +10 -0
  76. data/ext/vcftools/examples/.svn/text-base/subset.indels.out.svn-base +18 -0
  77. data/ext/vcftools/examples/.svn/text-base/subset.vcf.svn-base +21 -0
  78. data/ext/vcftools/examples/.svn/text-base/valid-3.3.vcf.svn-base +30 -0
  79. data/ext/vcftools/examples/.svn/text-base/valid-4.0.vcf.stats.svn-base +104 -0
  80. data/ext/vcftools/examples/.svn/text-base/valid-4.0.vcf.svn-base +34 -0
  81. data/ext/vcftools/examples/.svn/text-base/valid-4.1.vcf.svn-base +37 -0
  82. data/ext/vcftools/examples/annotate-test.vcf +37 -0
  83. data/ext/vcftools/examples/annotate.out +23 -0
  84. data/ext/vcftools/examples/annotate.txt +7 -0
  85. data/ext/vcftools/examples/annotate2.out +52 -0
  86. data/ext/vcftools/examples/annotate3.out +23 -0
  87. data/ext/vcftools/examples/cmp-test-a-3.3.vcf +12 -0
  88. data/ext/vcftools/examples/cmp-test-a.vcf +12 -0
  89. data/ext/vcftools/examples/cmp-test-b-3.3.vcf +12 -0
  90. data/ext/vcftools/examples/cmp-test-b.vcf +12 -0
  91. data/ext/vcftools/examples/cmp-test.out +53 -0
  92. data/ext/vcftools/examples/concat-a.vcf +21 -0
  93. data/ext/vcftools/examples/concat-b.vcf +13 -0
  94. data/ext/vcftools/examples/concat-c.vcf +19 -0
  95. data/ext/vcftools/examples/concat.out +39 -0
  96. data/ext/vcftools/examples/invalid-4.0.vcf +31 -0
  97. data/ext/vcftools/examples/isec-n2-test.vcf.out +19 -0
  98. data/ext/vcftools/examples/merge-test-a.vcf +17 -0
  99. data/ext/vcftools/examples/merge-test-b.vcf +17 -0
  100. data/ext/vcftools/examples/merge-test-c.vcf +15 -0
  101. data/ext/vcftools/examples/merge-test.vcf.out +31 -0
  102. data/ext/vcftools/examples/perl-api-1.pl +46 -0
  103. data/ext/vcftools/examples/query-test.out +6 -0
  104. data/ext/vcftools/examples/shuffle-test.vcf +12 -0
  105. data/ext/vcftools/examples/subset.SNPs.out +10 -0
  106. data/ext/vcftools/examples/subset.indels.out +18 -0
  107. data/ext/vcftools/examples/subset.vcf +21 -0
  108. data/ext/vcftools/examples/valid-3.3.vcf +30 -0
  109. data/ext/vcftools/examples/valid-4.0.vcf +34 -0
  110. data/ext/vcftools/examples/valid-4.0.vcf.stats +104 -0
  111. data/ext/vcftools/examples/valid-4.1.vcf +37 -0
  112. data/ext/vcftools/extconf.rb +2 -0
  113. data/ext/vcftools/perl/.svn/all-wcprops +149 -0
  114. data/ext/vcftools/perl/.svn/entries +844 -0
  115. data/ext/vcftools/perl/.svn/prop-base/fill-aa.svn-base +5 -0
  116. data/ext/vcftools/perl/.svn/prop-base/fill-an-ac.svn-base +5 -0
  117. data/ext/vcftools/perl/.svn/prop-base/fill-ref-md5.svn-base +5 -0
  118. data/ext/vcftools/perl/.svn/prop-base/tab-to-vcf.svn-base +5 -0
  119. data/ext/vcftools/perl/.svn/prop-base/test.t.svn-base +5 -0
  120. data/ext/vcftools/perl/.svn/prop-base/vcf-annotate.svn-base +5 -0
  121. data/ext/vcftools/perl/.svn/prop-base/vcf-compare.svn-base +5 -0
  122. data/ext/vcftools/perl/.svn/prop-base/vcf-concat.svn-base +5 -0
  123. data/ext/vcftools/perl/.svn/prop-base/vcf-convert.svn-base +5 -0
  124. data/ext/vcftools/perl/.svn/prop-base/vcf-fix-newlines.svn-base +5 -0
  125. data/ext/vcftools/perl/.svn/prop-base/vcf-isec.svn-base +5 -0
  126. data/ext/vcftools/perl/.svn/prop-base/vcf-merge.svn-base +5 -0
  127. data/ext/vcftools/perl/.svn/prop-base/vcf-query.svn-base +5 -0
  128. data/ext/vcftools/perl/.svn/prop-base/vcf-shuffle-cols.svn-base +5 -0
  129. data/ext/vcftools/perl/.svn/prop-base/vcf-sort.svn-base +5 -0
  130. data/ext/vcftools/perl/.svn/prop-base/vcf-stats.svn-base +5 -0
  131. data/ext/vcftools/perl/.svn/prop-base/vcf-subset.svn-base +5 -0
  132. data/ext/vcftools/perl/.svn/prop-base/vcf-to-tab.svn-base +5 -0
  133. data/ext/vcftools/perl/.svn/prop-base/vcf-validator.svn-base +5 -0
  134. data/ext/vcftools/perl/.svn/text-base/ChangeLog.svn-base +84 -0
  135. data/ext/vcftools/perl/.svn/text-base/FaSlice.pm.svn-base +214 -0
  136. data/ext/vcftools/perl/.svn/text-base/Makefile.svn-base +12 -0
  137. data/ext/vcftools/perl/.svn/text-base/Vcf.pm.svn-base +2853 -0
  138. data/ext/vcftools/perl/.svn/text-base/VcfStats.pm.svn-base +681 -0
  139. data/ext/vcftools/perl/.svn/text-base/fill-aa.svn-base +103 -0
  140. data/ext/vcftools/perl/.svn/text-base/fill-an-ac.svn-base +56 -0
  141. data/ext/vcftools/perl/.svn/text-base/fill-ref-md5.svn-base +204 -0
  142. data/ext/vcftools/perl/.svn/text-base/tab-to-vcf.svn-base +92 -0
  143. data/ext/vcftools/perl/.svn/text-base/test.t.svn-base +376 -0
  144. data/ext/vcftools/perl/.svn/text-base/vcf-annotate.svn-base +1099 -0
  145. data/ext/vcftools/perl/.svn/text-base/vcf-compare.svn-base +1193 -0
  146. data/ext/vcftools/perl/.svn/text-base/vcf-concat.svn-base +310 -0
  147. data/ext/vcftools/perl/.svn/text-base/vcf-convert.svn-base +180 -0
  148. data/ext/vcftools/perl/.svn/text-base/vcf-fix-newlines.svn-base +97 -0
  149. data/ext/vcftools/perl/.svn/text-base/vcf-isec.svn-base +660 -0
  150. data/ext/vcftools/perl/.svn/text-base/vcf-merge.svn-base +577 -0
  151. data/ext/vcftools/perl/.svn/text-base/vcf-query.svn-base +272 -0
  152. data/ext/vcftools/perl/.svn/text-base/vcf-shuffle-cols.svn-base +89 -0
  153. data/ext/vcftools/perl/.svn/text-base/vcf-sort.svn-base +79 -0
  154. data/ext/vcftools/perl/.svn/text-base/vcf-stats.svn-base +160 -0
  155. data/ext/vcftools/perl/.svn/text-base/vcf-subset.svn-base +206 -0
  156. data/ext/vcftools/perl/.svn/text-base/vcf-to-tab.svn-base +112 -0
  157. data/ext/vcftools/perl/.svn/text-base/vcf-validator.svn-base +145 -0
  158. data/ext/vcftools/perl/ChangeLog +84 -0
  159. data/ext/vcftools/perl/FaSlice.pm +214 -0
  160. data/ext/vcftools/perl/Makefile +12 -0
  161. data/ext/vcftools/perl/Vcf.pm +2853 -0
  162. data/ext/vcftools/perl/VcfStats.pm +681 -0
  163. data/ext/vcftools/perl/fill-aa +103 -0
  164. data/ext/vcftools/perl/fill-an-ac +56 -0
  165. data/ext/vcftools/perl/fill-ref-md5 +204 -0
  166. data/ext/vcftools/perl/tab-to-vcf +92 -0
  167. data/ext/vcftools/perl/test.t +376 -0
  168. data/ext/vcftools/perl/vcf-annotate +1099 -0
  169. data/ext/vcftools/perl/vcf-compare +1193 -0
  170. data/ext/vcftools/perl/vcf-concat +310 -0
  171. data/ext/vcftools/perl/vcf-convert +180 -0
  172. data/ext/vcftools/perl/vcf-fix-newlines +97 -0
  173. data/ext/vcftools/perl/vcf-isec +660 -0
  174. data/ext/vcftools/perl/vcf-merge +577 -0
  175. data/ext/vcftools/perl/vcf-query +286 -0
  176. data/ext/vcftools/perl/vcf-shuffle-cols +89 -0
  177. data/ext/vcftools/perl/vcf-sort +79 -0
  178. data/ext/vcftools/perl/vcf-stats +160 -0
  179. data/ext/vcftools/perl/vcf-subset +206 -0
  180. data/ext/vcftools/perl/vcf-to-tab +112 -0
  181. data/ext/vcftools/perl/vcf-validator +145 -0
  182. data/ext/vcftools/website/.svn/all-wcprops +41 -0
  183. data/ext/vcftools/website/.svn/entries +238 -0
  184. data/ext/vcftools/website/.svn/prop-base/VCF-poster.pdf.svn-base +5 -0
  185. data/ext/vcftools/website/.svn/prop-base/favicon.ico.svn-base +5 -0
  186. data/ext/vcftools/website/.svn/prop-base/favicon.png.svn-base +5 -0
  187. data/ext/vcftools/website/.svn/text-base/Makefile.svn-base +6 -0
  188. data/ext/vcftools/website/.svn/text-base/README.svn-base +2 -0
  189. data/ext/vcftools/website/.svn/text-base/VCF-poster.pdf.svn-base +0 -0
  190. data/ext/vcftools/website/.svn/text-base/default.css.svn-base +250 -0
  191. data/ext/vcftools/website/.svn/text-base/favicon.ico.svn-base +0 -0
  192. data/ext/vcftools/website/.svn/text-base/favicon.png.svn-base +0 -0
  193. data/ext/vcftools/website/Makefile +6 -0
  194. data/ext/vcftools/website/README +2 -0
  195. data/ext/vcftools/website/VCF-poster.pdf +0 -0
  196. data/ext/vcftools/website/default.css +250 -0
  197. data/ext/vcftools/website/favicon.ico +0 -0
  198. data/ext/vcftools/website/favicon.png +0 -0
  199. data/ext/vcftools/website/img/.svn/all-wcprops +53 -0
  200. data/ext/vcftools/website/img/.svn/entries +300 -0
  201. data/ext/vcftools/website/img/.svn/prop-base/bg.gif.svn-base +5 -0
  202. data/ext/vcftools/website/img/.svn/prop-base/bgcode.gif.svn-base +5 -0
  203. data/ext/vcftools/website/img/.svn/prop-base/bgcontainer.gif.svn-base +5 -0
  204. data/ext/vcftools/website/img/.svn/prop-base/bgul.gif.svn-base +5 -0
  205. data/ext/vcftools/website/img/.svn/prop-base/header.gif.svn-base +5 -0
  206. data/ext/vcftools/website/img/.svn/prop-base/li.gif.svn-base +5 -0
  207. data/ext/vcftools/website/img/.svn/prop-base/quote.gif.svn-base +5 -0
  208. data/ext/vcftools/website/img/.svn/prop-base/search.gif.svn-base +5 -0
  209. data/ext/vcftools/website/img/.svn/text-base/bg.gif.svn-base +0 -0
  210. data/ext/vcftools/website/img/.svn/text-base/bgcode.gif.svn-base +0 -0
  211. data/ext/vcftools/website/img/.svn/text-base/bgcontainer.gif.svn-base +0 -0
  212. data/ext/vcftools/website/img/.svn/text-base/bgul.gif.svn-base +0 -0
  213. data/ext/vcftools/website/img/.svn/text-base/header.gif.svn-base +0 -0
  214. data/ext/vcftools/website/img/.svn/text-base/li.gif.svn-base +0 -0
  215. data/ext/vcftools/website/img/.svn/text-base/quote.gif.svn-base +0 -0
  216. data/ext/vcftools/website/img/.svn/text-base/search.gif.svn-base +0 -0
  217. data/ext/vcftools/website/img/bg.gif +0 -0
  218. data/ext/vcftools/website/img/bgcode.gif +0 -0
  219. data/ext/vcftools/website/img/bgcontainer.gif +0 -0
  220. data/ext/vcftools/website/img/bgul.gif +0 -0
  221. data/ext/vcftools/website/img/header.gif +0 -0
  222. data/ext/vcftools/website/img/li.gif +0 -0
  223. data/ext/vcftools/website/img/quote.gif +0 -0
  224. data/ext/vcftools/website/img/search.gif +0 -0
  225. data/ext/vcftools/website/src/.svn/all-wcprops +53 -0
  226. data/ext/vcftools/website/src/.svn/entries +300 -0
  227. data/ext/vcftools/website/src/.svn/text-base/docs.inc.svn-base +202 -0
  228. data/ext/vcftools/website/src/.svn/text-base/index.inc.svn-base +52 -0
  229. data/ext/vcftools/website/src/.svn/text-base/index.php.svn-base +80 -0
  230. data/ext/vcftools/website/src/.svn/text-base/license.inc.svn-base +27 -0
  231. data/ext/vcftools/website/src/.svn/text-base/links.inc.svn-base +13 -0
  232. data/ext/vcftools/website/src/.svn/text-base/options.inc.svn-base +654 -0
  233. data/ext/vcftools/website/src/.svn/text-base/perl_module.inc.svn-base +249 -0
  234. data/ext/vcftools/website/src/.svn/text-base/specs.inc.svn-base +18 -0
  235. data/ext/vcftools/website/src/docs.inc +202 -0
  236. data/ext/vcftools/website/src/index.inc +52 -0
  237. data/ext/vcftools/website/src/index.php +80 -0
  238. data/ext/vcftools/website/src/license.inc +27 -0
  239. data/ext/vcftools/website/src/links.inc +13 -0
  240. data/ext/vcftools/website/src/options.inc +654 -0
  241. data/ext/vcftools/website/src/perl_module.inc +249 -0
  242. data/ext/vcftools/website/src/specs.inc +18 -0
  243. data/lib/config.ru +9 -0
  244. data/lib/ngs_server/add.rb +9 -0
  245. data/lib/ngs_server/version.rb +1 -1
  246. data/lib/ngs_server.rb +55 -3
  247. data/ngs_server.gemspec +5 -2
  248. metadata +296 -6
@@ -0,0 +1,206 @@
+ #!/usr/bin/env perl
+ #
+ # Author: petr.danecek@sanger
+ #
+
+ use strict;
+ use warnings;
+ use Carp;
+ use Vcf;
+
+ my $opts = parse_params();
+ vcf_subset($opts);
+
+ exit;
+
+ #--------------------------------
+
+ sub error
+ {
+     my (@msg) = @_;
+     if ( scalar @msg )
+     {
+         croak @msg;
+     }
+     die
+         "Usage: vcf-subset [OPTIONS] in.vcf.gz > out.vcf\n",
+         "Options:\n",
+         " -c, --columns <string> File or comma-separated list of columns to keep in the vcf file. If file, one column per row\n",
+         " -e, --exclude-ref Exclude rows not containing variants.\n",
+         " -f, --force Proceed anyway even if VCF does not contain some of the samples.\n",
+         " -p, --private Print only rows where only the subset columns carry an alternate allele.\n",
+         " -r, --replace-with-ref Replace the excluded types with reference allele instead of dot.\n",
+         " -t, --type <list> Comma-separated list of variant types to include: SNPs,indels.\n",
+         " -u, --keep-uncalled Do not exclude rows without calls.\n",
+         " -h, -?, --help This help message.\n",
+         "Examples:\n",
+         " cat in.vcf | vcf-subset -r -t indels -e -c SAMPLE1 > out.vcf\n",
+         "\n";
+ }
+
+
+ sub parse_params
+ {
+     my $opts = { exclude_ref=>0, keep_uncalled=>0, replace_with_ref=>0, private=>0, args=>[$0, @ARGV] };
+     while (my $arg=shift(@ARGV))
+     {
+         if ( $arg eq '-t' || $arg eq '--type' )
+         {
+             my %known = ( SNPs=>'s', indels=>'i' );
+             my $types = shift(@ARGV);
+             for my $t (split(/,/,$types))
+             {
+                 if ( !(exists($known{$t})) ) { error("Unknown type [$t] with -t [$types]\n"); }
+                 $$opts{types}{$known{$t}} = 1;
+             }
+             next;
+         }
+         if ( $arg eq '-e' || $arg eq '--exclude-ref' ) { $$opts{'exclude_ref'} = 1; next }
+         if ( $arg eq '-f' || $arg eq '--force' ) { $$opts{'force'} = 1; next }
+         if ( $arg eq '-p' || $arg eq '--private' ) { $$opts{'private'} = 1; next }
+         if ( $arg eq '-r' || $arg eq '--replace-with-ref' ) { $$opts{'replace_with_ref'} = 1; next }
+         if ( $arg eq '-u' || $arg eq '--keep-uncalled' ) { $$opts{'keep_uncalled'} = 1; next }
+         if ( $arg eq '-c' || $arg eq '--columns' ) { $$opts{'columns_file'} = shift(@ARGV); next }
+         if ( $arg eq '-?' || $arg eq '-h' || $arg eq '--help' ) { error(); }
+         if ( -e $arg ) { $$opts{file} = $arg; next }
+         if ( -e $arg or $arg=~m{^(?:ftp|http)://} ) { $$opts{file}=$arg; next; }
+         error("Unknown parameter \"$arg\". Run -h for help.\n");
+     }
+     if ( !$$opts{exclude_ref} && !$$opts{'columns_file'} && !exists($$opts{'types'}) ) { error("Missing the -c or -t or -e option.\n") }
+     if ( exists($$opts{types}) ) { $$opts{types}{r}=1; }
+     return $opts;
+ }
+
+
+ sub read_columns
+ {
+     my ($fname) = @_;
+     my @columns;
+     if ( !-e $fname )
+     {
+         @columns = split(/,/,$fname);
+         return \@columns;
+     }
+     open(my $fh,'<',$fname) or error("$fname: $!");
+     while (my $line=<$fh>)
+     {
+         chomp($line);
+         $line=~s/\s+//g;
+         push @columns, $line;
+     }
+     close($fh);
+     return \@columns;
+ }
+
+ sub check_columns
+ {
+     my ($opts,$vcf,$columns) = @_;
+     my @out;
+     for my $col (@$columns)
+     {
+         if ( exists($$vcf{has_column}{$col}) )
+         {
+             push @out, $col;
+             next;
+         }
+
+         my $msg = qq[No such column in the VCF file: "$col"\n];
+         if ( $$opts{force} ) { warn($msg); }
+         else { error($msg); }
+     }
+     return \@out;
+ }
+
+ sub vcf_subset
+ {
+     my ($opts) = @_;
+
+     my $vcf = $$opts{file} ? Vcf->new(file=>$$opts{file}) : Vcf->new(fh=>\*STDIN);
+     $vcf->parse_header();
+
+     # Init requested column info. If not present, include all columns.
+     my $columns = exists($$opts{columns_file}) ? read_columns($$opts{columns_file}) : [];
+     $columns = check_columns($opts,$vcf,$columns);
+     if ( !@$columns && (my $ncols=@{$$vcf{columns}})>9 )
+     {
+         push @$columns, @{$$vcf{columns}}[9..($ncols-1)];
+     }
+     my %has_col = map { $_ => 1 } @$columns;
+
+     $vcf->add_header_line({key=>'source',value=>join(' ',@{$$opts{args}})},append=>'timestamp');
+     $vcf->set_samples(include=>$columns) unless $$opts{private};
+     print $vcf->format_header($columns);
+
+     my $check_private = $$opts{private};
+     while (my $x=$vcf->next_data_hash())
+     {
+         my $site_has_call = 0;
+         my $site_has_nonref = 0;
+         my $site_is_private = 1;
+         my $ref = $$x{REF};
+
+         for my $col (keys %{$$x{gtypes}})
+         {
+             if ( !$has_col{$col} && ($site_is_private==0 || !$check_private) )
+             {
+                 # This column is not to be printed
+                 delete($$x{gtypes}{$col});
+                 next;
+             }
+
+             my ($alleles,$seps,$is_phased,$is_empty) = $vcf->parse_haplotype($x,$col);
+             my $sample_has_call = 0;
+             my $sample_has_nonref = 0;
+             my @out_alleles;
+
+             for (my $i=0; $i<@$alleles; $i++)
+             {
+                 my ($type,$len,$ht) = $vcf->event_type($ref,$$alleles[$i]);
+                 $out_alleles[$i] = $$alleles[$i];
+
+                 # Exclude unwanted variant types if requested
+                 if ( exists($$opts{types}) )
+                 {
+                     if ( !exists($$opts{types}{$type}) )
+                     {
+                         $out_alleles[$i] = $$opts{replace_with_ref} ? $ref : '.';
+                         next;
+                     }
+                     $sample_has_call = 1;
+                 }
+                 elsif ( !$is_empty ) { $sample_has_call = 1; }
+                 if ( $type ne 'r' ) { $site_has_nonref = 1; $sample_has_nonref = 1; }
+             }
+             if ( $check_private && !$has_col{$col} )
+             {
+                 if ( $sample_has_nonref ) { $site_is_private=0; }
+                 delete($$x{gtypes}{$col});
+                 next;
+             }
+             if ( !$sample_has_call )
+             {
+                 if ( $$opts{replace_with_ref} )
+                 {
+                     for (my $i=0; $i<@$alleles; $i++) { $out_alleles[$i] = $ref; }
+                 }
+                 else
+                 {
+                     for (my $i=0; $i<@$alleles; $i++) { $out_alleles[$i] = '.'; }
+                 }
+             }
+             else
+             {
+                 $site_has_call = 1;
+             }
+             $$x{gtypes}{$col}{GT} = $vcf->format_haplotype(\@out_alleles,$seps);
+         }
+
+         if ( !$site_has_call && !$$opts{keep_uncalled} ) { next; }
+         if ( !$site_has_nonref && $$opts{exclude_ref} ) { next; }
+         if ( $check_private && (!$site_is_private || !$site_has_nonref) ) { next; }
+
+         $vcf->format_genotype_strings($x);
+         print $vcf->format_line($x,$columns);
+     }
+ }
+
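The per-sample type filter in vcf_subset can be hard to follow inside the nested loops, so here is a minimal sketch of that one step, written in Python rather than Perl. It is an illustration under stated assumptions, not the script's code: `event_type` below is a hypothetical stand-in for the Vcf.pm method of the same name (it only compares allele lengths), `mask_alleles` is a name introduced here, and the handling of a missing `.` allele is an assumption.

```python
def event_type(ref, allele):
    """Hypothetical stand-in for Vcf.pm's event_type.

    Returns 'r' for the reference allele, 's' for a same-length
    substitution (SNP) and 'i' for anything else (indel).
    """
    if allele == ref:
        return "r"
    return "s" if len(allele) == len(ref) else "i"


def mask_alleles(ref, alleles, keep_types, replace_with_ref=False):
    """Filter one sample's alleles by variant type.

    Returns (out_alleles, has_call, has_nonref). Alleles whose type is not
    requested are replaced by the reference allele or by '.'.
    """
    keep = set(keep_types) | {"r"}  # reference alleles are always kept
    filler = ref if replace_with_ref else "."
    out, has_call, has_nonref = [], False, False
    for allele in alleles:
        if allele == ".":  # assumption: a missing allele stays missing
            out.append(".")
            continue
        kind = event_type(ref, allele)
        if kind not in keep:
            out.append(filler)
            continue
        out.append(allele)
        has_call = True
        if kind != "r":
            has_nonref = True
    if not has_call:
        # No usable call at all: blank out the whole genotype.
        out = [filler] * len(alleles)
    return out, has_call, has_nonref
```

With `keep_types={"s"}`, a genotype carrying one reference allele and one insertion keeps the reference allele and masks the insertion, which mirrors what `-t SNPs` (optionally with `-r`) asks for.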
@@ -0,0 +1,112 @@
+ #!/usr/bin/env perl
+
+ use strict;
+ use warnings;
+ use Carp;
+ use Vcf;
+
+ my $opts = parse_params();
+ convert_to_tab($opts);
+
+ exit;
+
+ #--------------------------------
+
+ sub error
+ {
+     my (@msg) = @_;
+     if ( scalar @msg )
+     {
+         croak @msg;
+     }
+     die
+         "Usage: vcf-to-tab [OPTIONS] < in.vcf > out.tab\n",
+         "Options:\n",
+         " -h, -?, --help This help message.\n",
+         " -i, --iupac Use one-letter IUPAC codes\n",
+         "\n";
+ }
+
+
+ sub parse_params
+ {
+     my $opts = { iupac=>0 };
+     while (my $arg=shift(@ARGV))
+     {
+         if ( $arg eq '-?' || $arg eq '-h' || $arg eq '--help' ) { error(); }
+         if ( $arg eq '-i' || $arg eq '--iupac' ) { $$opts{iupac}=1; next; }
+         error("Unknown parameter \"$arg\". Run -h for help.\n");
+     }
+
+     if ( $$opts{iupac} )
+     {
+         $$opts{iupac} =
+         {
+             'GG' => 'G',
+             'CC' => 'C',
+             'TT' => 'T',
+             'AA' => 'A',
+
+             'GT' => 'K',
+             'TG' => 'K',
+             'AC' => 'M',
+             'CA' => 'M',
+             'CG' => 'S',
+             'GC' => 'S',
+             'AG' => 'R',
+             'GA' => 'R',
+             'AT' => 'W',
+             'TA' => 'W',
+             'CT' => 'Y',
+             'TC' => 'Y',
+
+             '..' => '.',
+         };
+     }
+
+     return $opts;
+ }
+
+
+ sub convert_to_tab
+ {
+     my ($opts) = @_;
+
+     my $iupac;
+     if ( $$opts{iupac} ) { $iupac=$$opts{iupac}; }
+
+     my $vcf = Vcf->new(fh=>\*STDIN);
+     $vcf->parse_header();
+
+     my $header_printed=0;
+
+     while (my $x=$vcf->next_data_hash())
+     {
+         if ( !$header_printed )
+         {
+             print "#CHROM\tPOS\tREF";
+             for my $col (sort keys %{$$x{gtypes}})
+             {
+                 print "\t$col";
+             }
+             print "\n";
+
+             $header_printed = 1;
+         }
+
+         print "$$x{CHROM}\t$$x{POS}\t$$x{REF}";
+         for my $col (sort keys %{$$x{gtypes}})
+         {
+             my ($al1,$sep,$al2) = exists($$x{gtypes}{$col}{GT}) ? $vcf->parse_alleles($x,$col) : ('.','/','.');
+             my $gt = $al1.'/'.$al2;
+             if ( $iupac )
+             {
+                 if ( !exists($$iupac{$al1.$al2}) ) { error(qq[Unknown IUPAC code for "$al1$sep$al2" .. $$x{CHROM}:$$x{POS} $col\n]); }
+                 $gt = $$iupac{$al1.$al2};
+             }
+             print "\t".$gt;
+         }
+         print "\n";
+     }
+ }
+
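The IUPAC table in parse_params is keyed by the two alleles concatenated without a separator (`AG`, `..`), so any lookup has to build its key the same way. The following Python sketch shows that lookup in isolation; `iupac_code` is a name introduced here for illustration and is not part of vcf-to-tab.

```python
# Same pairs as the Perl hash: homozygous calls map to the base itself,
# heterozygous calls to the IUPAC ambiguity code, '..' to a missing call.
IUPAC = {
    "AA": "A", "CC": "C", "GG": "G", "TT": "T",
    "GT": "K", "TG": "K",
    "AC": "M", "CA": "M",
    "CG": "S", "GC": "S",
    "AG": "R", "GA": "R",
    "AT": "W", "TA": "W",
    "CT": "Y", "TC": "Y",
    "..": ".",
}


def iupac_code(al1, al2):
    """Collapse a diploid genotype into a one-letter IUPAC code.

    The key is al1 + al2 with no '/' or '|' in between, matching the
    table's keys. Unknown pairs (for example indel alleles) raise an error.
    """
    key = al1 + al2
    if key not in IUPAC:
        raise ValueError(f"Unknown IUPAC code for {al1}/{al2}")
    return IUPAC[key]
```

Only single-base alleles are covered, so a genotype containing an indel allele is reported as an error rather than silently mapped.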
@@ -0,0 +1,145 @@
+ #!/usr/bin/env perl
+ #
+ # Author: petr.danecek@sanger
+ #
+
+ use strict;
+ use warnings;
+ use Carp;
+ use Vcf;
+ use IPC::Open3 'open3';
+ use IO::Select;
+
+ my $opts = parse_params();
+ do_validation($opts);
+
+ exit;
+
+ #--------------------------------
+
+ sub error
+ {
+     my (@msg) = @_;
+     if ( scalar @msg )
+     {
+         croak @msg;
+     }
+     die
+         "Usage: vcf-validator [OPTIONS] file.vcf.gz\n",
+         "Options:\n",
+         " -d, --duplicates Warn about duplicate positions.\n",
+         " -u, --unique-messages Output all messages only once.\n",
+         " -h, -?, --help This help message.\n",
+         "\n";
+ }
+
+
+ sub parse_params
+ {
+     my $opts = { unique=>0, duplicates=>0 };
+     while (my $arg=shift(@ARGV))
+     {
+         if ( $arg eq '-d' || $arg eq '--duplicates' ) { $$opts{duplicates}=1; next; }
+         if ( $arg eq '-u' || $arg eq '--unique-messages' ) { $$opts{unique}=1; next; }
+         if ( $arg eq '-?' || $arg eq '-h' || $arg eq '--help' ) { error(); }
+         if ( -e $arg && !exists($$opts{file}) ) { $$opts{file}=$arg; next; }
+         error("Unknown parameter or non-existent file: \"$arg\". Run -h for help.\n");
+     }
+     return $opts;
+ }
+
+ sub do_validation
+ {
+     my ($opts) = @_;
+
+     my %opts = $$opts{file} ? (file=>$$opts{file}) : (fh=>\*STDIN);
+     my $vcf = Vcf->new(%opts, warn_duplicates=>$$opts{duplicates});
+
+     if ( !$$opts{unique} )
+     {
+         $vcf->run_validation();
+         return;
+     }
+
+     my ($kid_in,$kid_out,$kid_err);
+
+     my $pid = open3($kid_in,$kid_out,$kid_err,'-');
+     if ( !defined $pid ) { error("Cannot fork: $!"); }
+
+     if ($pid)
+     {
+         $$opts{known_lines} = [];
+
+         my $sel = new IO::Select;
+         $sel->add($kid_out,$kid_err);
+
+         while(my @ready = $sel->can_read)
+         {
+             foreach my $fh (@ready)
+             {
+                 my $line = <$fh>;
+                 if (not defined $line)
+                 {
+                     $sel->remove($fh);
+                     next;
+                 }
+                 print_or_discard_line($opts,$line);
+             }
+         }
+         print_summary($opts);
+     }
+     else
+     {
+         $vcf->run_validation();
+         return;
+     }
+ }
+
+ sub print_or_discard_line
+ {
+     my ($opts,$line) = @_;
+
+     my @items = split(/\s+/,$line);
+     my $nitems = scalar @items;
+
+     for my $known (@{$$opts{known_lines}})
+     {
+         if ( @items != @{$$known{line}} ) { next; }
+
+         my $nmatches = 0;
+         for (my $i=0; $i<$nitems; $i++)
+         {
+             if ( $items[$i] eq $$known{line}[$i] ) { $nmatches++ }
+         }
+
+         if ( $nitems-$nmatches<3 )
+         {
+             $$known{n}++;
+             return;
+         }
+     }
+
+     push @{$$opts{known_lines}}, { line=>\@items, n=>1 };
+     print $line;
+ }
+
+ sub print_summary
+ {
+     my ($opts) = @_;
+     my $n = 0;
+     for my $error (@{$$opts{known_lines}})
+     {
+         $n += $$error{n};
+     }
+     print "\n\n------------------------\n";
+     print "Summary:\n";
+     printf "\t%d errors total \n\n", $n;
+
+     $n = 0;
+     for my $error (sort {$$b{n}<=>$$a{n}} @{$$opts{known_lines}})
+     {
+         if ( $n++ > 50 ) { print "\n\nand more...\n"; last; }
+         printf "\t%d\t..\t%s\n", $$error{n},join(' ',@{$$error{line}});
+     }
+ }
+
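The `-u` mode relies on print_or_discard_line treating two messages as the same when they have the same number of whitespace-separated tokens and differ in fewer than three of them. A minimal Python sketch of that matching rule follows; the class and method names are introduced here for illustration, and Python's `str.split()` treats leading whitespace slightly differently from Perl's `split(/\s+/, ...)`, so this is an approximation rather than a port.

```python
class MessageDeduplicator:
    """Collapse near-identical messages, keeping a count per representative."""

    def __init__(self, max_differences=3):
        # Two lines match when fewer than max_differences tokens differ.
        self.max_differences = max_differences
        self.known = []  # each entry is [tokens, count]

    def offer(self, line):
        """Return True if the line is new and should be printed."""
        items = line.split()
        for entry in self.known:
            tokens = entry[0]
            if len(tokens) != len(items):
                continue
            mismatches = sum(1 for a, b in zip(items, tokens) if a != b)
            if mismatches < self.max_differences:
                entry[1] += 1  # count it, but do not print it again
                return False
        self.known.append([items, 1])
        return True

    def total(self):
        """Total number of messages seen, including suppressed ones."""
        return sum(count for _, count in self.known)
```

Comparing token by token is what lets messages that differ only in a position and a value collapse into one line, at the cost of a linear scan over all known messages for each new line.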
@@ -0,0 +1,84 @@
+ 2011-04-04 14:00 petr.danecek@sanger
+     * VCFtools now supports VCFv4.1
+     * fill-ref-md5: New tool backfilling sequence MD5s into VCF header
+     * Renamed merge-vcf, compare-vcf etc. to consistent naming vcf-merge, vcf-compare
+     * vcf-merge: Now merging also GL and other Number=[AG] tags
+     * vcf-compare: Comparing indel haplotypes
+
+ 2011-02-21 12:31 petr.danecek@sanger
+     * vcf-stats: new -s option to speed up parsing when stats computed for selected samples only
+     * merge-vcf: allow to merge arbitrary chunks; -c option now deprecated, use -r instead
+     * compare-vcf: change in output format and more detailed comparison
+
+ 2011-02-17 17:36 petr.danecek@sanger
+     * vcf-stats: allow querying stats of individual samples
+
+ 2011-02-16 12:07 petr.danecek@sanger
+     * vcf-stats: major revision
+     * vcf-annotate: more filtering options
+
+ 2011-02-04 14:43 petr
+     * merge-vcf: if possible, calculate AC,AN even for sites without genotypes
+
+ 2011-02-03 15:04 petr
+     * merge-vcf: fixed a bug introduced by the previous fix.
+
+ 2011-02-02 21:02 petr
+     * merge-vcf: fixed a bug in merging indel ALTs. Only VCFs without samples were affected.
+
+ 2011-01-28 15:38 petr
+     * vcf-subset: new option for printing rows with calls private to the subset group
+
+ 2011-01-24 13:38 petr
+     * Vcf.pm: uppercase floating point number expressions (such as
+       1.0382033E-6) now pass validation
+
+ 2011-01-20 08:28 petr
+     * vcf-concat: print header also for empty VCFs with the -s option
+
+ 2011-01-04 08:59 petr
+     * vcf-isec, vcf-sort, Vcf.pm: replaced "zcat" by "gunzip -c"
+
+ 2010-12-22 14:18 petr
+     * vcf-annotate: New --SnpCluster option
+     * Vcf.pm: new sub add_filter()
+
+ 2010-12-15 13:44 petr
+     * vcf-isec: By default output records from all files with unique positions
+       (duplicate records from the same file still should be printed). With the -o
+       switch, only positions from the left-most file will be printed.
+
+ 2010-12-09 14:48 petr
+     * query-vcf: Output 'True' for Flag tags when present and . when absent
+     * vcf-annotate: Fix: the command line eats quotes when they are not escaped
+
+ 2010-12-08 12:06 petr
+     * Vcf.pm: throw an error when tabix fails.
+     * query-vcf: enable streaming of files when region is not specified.
+
+ 2010-12-02 11:53 petr
+     * Vcf.pm: allow ALT alleles which are not present in samples
+     * vcf-isec: Multiple files can be created simultaneously with all possible
+       isec combinations. Suitable for Venn Diagram analysis.
+     * merge-vcf: Do not remove ALT alleles if no samples are present
+     * merge-vcf: Do FILTER merging more intelligently.
+     * merge-vcf: Join the QUAL column: use average value weighted by the number of samples.
+
+ 2010-11-28 08:34 petr
+     * vcf-concat: Partial sort
+     * vcf-validator: Added -u option
+     * VcfStats.pm: dump_counts
+
+ 2010-11-27 13:04 petr
+     * vcf-subset: Filter variants by type
+
+ 2010-11-26 09:08 petr
+     * vcf-annotate: Added possibility to read header descriptions from a file
+
+ 2010-11-24 13:25 petr
+     * Fix in Vcf.pm:fill_ref_alt_mapping. VCF files processed with merge-vcf were
+       affected when containing IDs in the ALT column.
+
+ 2010-11-23 13:12 petr
+     * Major revamp of Vcf.pm to allow better inheritance. Problems likely.
+
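The 2010-12-02 entry says merge-vcf joins the QUAL column as an average weighted by the number of samples. As a worked illustration of that arithmetic only (the merge-vcf source is not part of this diff, so the function name, the input shape and the handling of an empty input are all assumptions), a Python sketch could look like this:

```python
def merged_qual(quals_and_sample_counts):
    """Sample-count-weighted mean of QUAL values.

    quals_and_sample_counts: iterable of (qual, n_samples) pairs, one per
    input file. Returns None when no samples contribute (assumed behaviour).
    """
    pairs = list(quals_and_sample_counts)
    total_samples = sum(n for _, n in pairs)
    if total_samples == 0:
        return None
    return sum(qual * n for qual, n in pairs) / total_samples
```

For example, a site with QUAL 30 in a two-sample file and QUAL 60 in a one-sample file gives (30*2 + 60*1) / 3 = 40.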