ngs_server 0.1 → 0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (248)
  1. data/bin/ngs_server +72 -50
  2. data/ext/bamtools/extconf.rb +3 -3
  3. data/ext/vcftools/Makefile +28 -0
  4. data/ext/vcftools/README.txt +36 -0
  5. data/ext/vcftools/cpp/.svn/all-wcprops +125 -0
  6. data/ext/vcftools/cpp/.svn/dir-prop-base +6 -0
  7. data/ext/vcftools/cpp/.svn/entries +708 -0
  8. data/ext/vcftools/cpp/.svn/text-base/Makefile.svn-base +46 -0
  9. data/ext/vcftools/cpp/.svn/text-base/dgeev.cpp.svn-base +146 -0
  10. data/ext/vcftools/cpp/.svn/text-base/dgeev.h.svn-base +43 -0
  11. data/ext/vcftools/cpp/.svn/text-base/output_log.cpp.svn-base +79 -0
  12. data/ext/vcftools/cpp/.svn/text-base/output_log.h.svn-base +34 -0
  13. data/ext/vcftools/cpp/.svn/text-base/parameters.cpp.svn-base +535 -0
  14. data/ext/vcftools/cpp/.svn/text-base/parameters.h.svn-base +154 -0
  15. data/ext/vcftools/cpp/.svn/text-base/vcf_entry.cpp.svn-base +497 -0
  16. data/ext/vcftools/cpp/.svn/text-base/vcf_entry.h.svn-base +190 -0
  17. data/ext/vcftools/cpp/.svn/text-base/vcf_entry_getters.cpp.svn-base +421 -0
  18. data/ext/vcftools/cpp/.svn/text-base/vcf_entry_setters.cpp.svn-base +482 -0
  19. data/ext/vcftools/cpp/.svn/text-base/vcf_file.cpp.svn-base +495 -0
  20. data/ext/vcftools/cpp/.svn/text-base/vcf_file.h.svn-base +184 -0
  21. data/ext/vcftools/cpp/.svn/text-base/vcf_file_diff.cpp.svn-base +1282 -0
  22. data/ext/vcftools/cpp/.svn/text-base/vcf_file_filters.cpp.svn-base +1215 -0
  23. data/ext/vcftools/cpp/.svn/text-base/vcf_file_format_convert.cpp.svn-base +1138 -0
  24. data/ext/vcftools/cpp/.svn/text-base/vcf_file_index.cpp.svn-base +171 -0
  25. data/ext/vcftools/cpp/.svn/text-base/vcf_file_output.cpp.svn-base +3012 -0
  26. data/ext/vcftools/cpp/.svn/text-base/vcftools.cpp.svn-base +107 -0
  27. data/ext/vcftools/cpp/.svn/text-base/vcftools.h.svn-base +25 -0
  28. data/ext/vcftools/cpp/Makefile +46 -0
  29. data/ext/vcftools/cpp/dgeev.cpp +146 -0
  30. data/ext/vcftools/cpp/dgeev.h +43 -0
  31. data/ext/vcftools/cpp/output_log.cpp +79 -0
  32. data/ext/vcftools/cpp/output_log.h +34 -0
  33. data/ext/vcftools/cpp/parameters.cpp +535 -0
  34. data/ext/vcftools/cpp/parameters.h +154 -0
  35. data/ext/vcftools/cpp/vcf_entry.cpp +497 -0
  36. data/ext/vcftools/cpp/vcf_entry.h +190 -0
  37. data/ext/vcftools/cpp/vcf_entry_getters.cpp +421 -0
  38. data/ext/vcftools/cpp/vcf_entry_setters.cpp +482 -0
  39. data/ext/vcftools/cpp/vcf_file.cpp +495 -0
  40. data/ext/vcftools/cpp/vcf_file.h +184 -0
  41. data/ext/vcftools/cpp/vcf_file_diff.cpp +1282 -0
  42. data/ext/vcftools/cpp/vcf_file_filters.cpp +1215 -0
  43. data/ext/vcftools/cpp/vcf_file_format_convert.cpp +1138 -0
  44. data/ext/vcftools/cpp/vcf_file_index.cpp +171 -0
  45. data/ext/vcftools/cpp/vcf_file_output.cpp +3012 -0
  46. data/ext/vcftools/cpp/vcftools.cpp +107 -0
  47. data/ext/vcftools/cpp/vcftools.h +25 -0
  48. data/ext/vcftools/examples/.svn/all-wcprops +185 -0
  49. data/ext/vcftools/examples/.svn/dir-prop-base +6 -0
  50. data/ext/vcftools/examples/.svn/entries +1048 -0
  51. data/ext/vcftools/examples/.svn/prop-base/perl-api-1.pl.svn-base +5 -0
  52. data/ext/vcftools/examples/.svn/text-base/annotate-test.vcf.svn-base +37 -0
  53. data/ext/vcftools/examples/.svn/text-base/annotate.out.svn-base +23 -0
  54. data/ext/vcftools/examples/.svn/text-base/annotate.txt.svn-base +7 -0
  55. data/ext/vcftools/examples/.svn/text-base/annotate2.out.svn-base +52 -0
  56. data/ext/vcftools/examples/.svn/text-base/annotate3.out.svn-base +23 -0
  57. data/ext/vcftools/examples/.svn/text-base/cmp-test-a-3.3.vcf.svn-base +12 -0
  58. data/ext/vcftools/examples/.svn/text-base/cmp-test-a.vcf.svn-base +12 -0
  59. data/ext/vcftools/examples/.svn/text-base/cmp-test-b-3.3.vcf.svn-base +12 -0
  60. data/ext/vcftools/examples/.svn/text-base/cmp-test-b.vcf.svn-base +12 -0
  61. data/ext/vcftools/examples/.svn/text-base/cmp-test.out.svn-base +53 -0
  62. data/ext/vcftools/examples/.svn/text-base/concat-a.vcf.svn-base +21 -0
  63. data/ext/vcftools/examples/.svn/text-base/concat-b.vcf.svn-base +13 -0
  64. data/ext/vcftools/examples/.svn/text-base/concat-c.vcf.svn-base +19 -0
  65. data/ext/vcftools/examples/.svn/text-base/concat.out.svn-base +39 -0
  66. data/ext/vcftools/examples/.svn/text-base/invalid-4.0.vcf.svn-base +31 -0
  67. data/ext/vcftools/examples/.svn/text-base/isec-n2-test.vcf.out.svn-base +19 -0
  68. data/ext/vcftools/examples/.svn/text-base/merge-test-a.vcf.svn-base +17 -0
  69. data/ext/vcftools/examples/.svn/text-base/merge-test-b.vcf.svn-base +17 -0
  70. data/ext/vcftools/examples/.svn/text-base/merge-test-c.vcf.svn-base +15 -0
  71. data/ext/vcftools/examples/.svn/text-base/merge-test.vcf.out.svn-base +31 -0
  72. data/ext/vcftools/examples/.svn/text-base/perl-api-1.pl.svn-base +46 -0
  73. data/ext/vcftools/examples/.svn/text-base/query-test.out.svn-base +6 -0
  74. data/ext/vcftools/examples/.svn/text-base/shuffle-test.vcf.svn-base +12 -0
  75. data/ext/vcftools/examples/.svn/text-base/subset.SNPs.out.svn-base +10 -0
  76. data/ext/vcftools/examples/.svn/text-base/subset.indels.out.svn-base +18 -0
  77. data/ext/vcftools/examples/.svn/text-base/subset.vcf.svn-base +21 -0
  78. data/ext/vcftools/examples/.svn/text-base/valid-3.3.vcf.svn-base +30 -0
  79. data/ext/vcftools/examples/.svn/text-base/valid-4.0.vcf.stats.svn-base +104 -0
  80. data/ext/vcftools/examples/.svn/text-base/valid-4.0.vcf.svn-base +34 -0
  81. data/ext/vcftools/examples/.svn/text-base/valid-4.1.vcf.svn-base +37 -0
  82. data/ext/vcftools/examples/annotate-test.vcf +37 -0
  83. data/ext/vcftools/examples/annotate.out +23 -0
  84. data/ext/vcftools/examples/annotate.txt +7 -0
  85. data/ext/vcftools/examples/annotate2.out +52 -0
  86. data/ext/vcftools/examples/annotate3.out +23 -0
  87. data/ext/vcftools/examples/cmp-test-a-3.3.vcf +12 -0
  88. data/ext/vcftools/examples/cmp-test-a.vcf +12 -0
  89. data/ext/vcftools/examples/cmp-test-b-3.3.vcf +12 -0
  90. data/ext/vcftools/examples/cmp-test-b.vcf +12 -0
  91. data/ext/vcftools/examples/cmp-test.out +53 -0
  92. data/ext/vcftools/examples/concat-a.vcf +21 -0
  93. data/ext/vcftools/examples/concat-b.vcf +13 -0
  94. data/ext/vcftools/examples/concat-c.vcf +19 -0
  95. data/ext/vcftools/examples/concat.out +39 -0
  96. data/ext/vcftools/examples/invalid-4.0.vcf +31 -0
  97. data/ext/vcftools/examples/isec-n2-test.vcf.out +19 -0
  98. data/ext/vcftools/examples/merge-test-a.vcf +17 -0
  99. data/ext/vcftools/examples/merge-test-b.vcf +17 -0
  100. data/ext/vcftools/examples/merge-test-c.vcf +15 -0
  101. data/ext/vcftools/examples/merge-test.vcf.out +31 -0
  102. data/ext/vcftools/examples/perl-api-1.pl +46 -0
  103. data/ext/vcftools/examples/query-test.out +6 -0
  104. data/ext/vcftools/examples/shuffle-test.vcf +12 -0
  105. data/ext/vcftools/examples/subset.SNPs.out +10 -0
  106. data/ext/vcftools/examples/subset.indels.out +18 -0
  107. data/ext/vcftools/examples/subset.vcf +21 -0
  108. data/ext/vcftools/examples/valid-3.3.vcf +30 -0
  109. data/ext/vcftools/examples/valid-4.0.vcf +34 -0
  110. data/ext/vcftools/examples/valid-4.0.vcf.stats +104 -0
  111. data/ext/vcftools/examples/valid-4.1.vcf +37 -0
  112. data/ext/vcftools/extconf.rb +2 -0
  113. data/ext/vcftools/perl/.svn/all-wcprops +149 -0
  114. data/ext/vcftools/perl/.svn/entries +844 -0
  115. data/ext/vcftools/perl/.svn/prop-base/fill-aa.svn-base +5 -0
  116. data/ext/vcftools/perl/.svn/prop-base/fill-an-ac.svn-base +5 -0
  117. data/ext/vcftools/perl/.svn/prop-base/fill-ref-md5.svn-base +5 -0
  118. data/ext/vcftools/perl/.svn/prop-base/tab-to-vcf.svn-base +5 -0
  119. data/ext/vcftools/perl/.svn/prop-base/test.t.svn-base +5 -0
  120. data/ext/vcftools/perl/.svn/prop-base/vcf-annotate.svn-base +5 -0
  121. data/ext/vcftools/perl/.svn/prop-base/vcf-compare.svn-base +5 -0
  122. data/ext/vcftools/perl/.svn/prop-base/vcf-concat.svn-base +5 -0
  123. data/ext/vcftools/perl/.svn/prop-base/vcf-convert.svn-base +5 -0
  124. data/ext/vcftools/perl/.svn/prop-base/vcf-fix-newlines.svn-base +5 -0
  125. data/ext/vcftools/perl/.svn/prop-base/vcf-isec.svn-base +5 -0
  126. data/ext/vcftools/perl/.svn/prop-base/vcf-merge.svn-base +5 -0
  127. data/ext/vcftools/perl/.svn/prop-base/vcf-query.svn-base +5 -0
  128. data/ext/vcftools/perl/.svn/prop-base/vcf-shuffle-cols.svn-base +5 -0
  129. data/ext/vcftools/perl/.svn/prop-base/vcf-sort.svn-base +5 -0
  130. data/ext/vcftools/perl/.svn/prop-base/vcf-stats.svn-base +5 -0
  131. data/ext/vcftools/perl/.svn/prop-base/vcf-subset.svn-base +5 -0
  132. data/ext/vcftools/perl/.svn/prop-base/vcf-to-tab.svn-base +5 -0
  133. data/ext/vcftools/perl/.svn/prop-base/vcf-validator.svn-base +5 -0
  134. data/ext/vcftools/perl/.svn/text-base/ChangeLog.svn-base +84 -0
  135. data/ext/vcftools/perl/.svn/text-base/FaSlice.pm.svn-base +214 -0
  136. data/ext/vcftools/perl/.svn/text-base/Makefile.svn-base +12 -0
  137. data/ext/vcftools/perl/.svn/text-base/Vcf.pm.svn-base +2853 -0
  138. data/ext/vcftools/perl/.svn/text-base/VcfStats.pm.svn-base +681 -0
  139. data/ext/vcftools/perl/.svn/text-base/fill-aa.svn-base +103 -0
  140. data/ext/vcftools/perl/.svn/text-base/fill-an-ac.svn-base +56 -0
  141. data/ext/vcftools/perl/.svn/text-base/fill-ref-md5.svn-base +204 -0
  142. data/ext/vcftools/perl/.svn/text-base/tab-to-vcf.svn-base +92 -0
  143. data/ext/vcftools/perl/.svn/text-base/test.t.svn-base +376 -0
  144. data/ext/vcftools/perl/.svn/text-base/vcf-annotate.svn-base +1099 -0
  145. data/ext/vcftools/perl/.svn/text-base/vcf-compare.svn-base +1193 -0
  146. data/ext/vcftools/perl/.svn/text-base/vcf-concat.svn-base +310 -0
  147. data/ext/vcftools/perl/.svn/text-base/vcf-convert.svn-base +180 -0
  148. data/ext/vcftools/perl/.svn/text-base/vcf-fix-newlines.svn-base +97 -0
  149. data/ext/vcftools/perl/.svn/text-base/vcf-isec.svn-base +660 -0
  150. data/ext/vcftools/perl/.svn/text-base/vcf-merge.svn-base +577 -0
  151. data/ext/vcftools/perl/.svn/text-base/vcf-query.svn-base +272 -0
  152. data/ext/vcftools/perl/.svn/text-base/vcf-shuffle-cols.svn-base +89 -0
  153. data/ext/vcftools/perl/.svn/text-base/vcf-sort.svn-base +79 -0
  154. data/ext/vcftools/perl/.svn/text-base/vcf-stats.svn-base +160 -0
  155. data/ext/vcftools/perl/.svn/text-base/vcf-subset.svn-base +206 -0
  156. data/ext/vcftools/perl/.svn/text-base/vcf-to-tab.svn-base +112 -0
  157. data/ext/vcftools/perl/.svn/text-base/vcf-validator.svn-base +145 -0
  158. data/ext/vcftools/perl/ChangeLog +84 -0
  159. data/ext/vcftools/perl/FaSlice.pm +214 -0
  160. data/ext/vcftools/perl/Makefile +12 -0
  161. data/ext/vcftools/perl/Vcf.pm +2853 -0
  162. data/ext/vcftools/perl/VcfStats.pm +681 -0
  163. data/ext/vcftools/perl/fill-aa +103 -0
  164. data/ext/vcftools/perl/fill-an-ac +56 -0
  165. data/ext/vcftools/perl/fill-ref-md5 +204 -0
  166. data/ext/vcftools/perl/tab-to-vcf +92 -0
  167. data/ext/vcftools/perl/test.t +376 -0
  168. data/ext/vcftools/perl/vcf-annotate +1099 -0
  169. data/ext/vcftools/perl/vcf-compare +1193 -0
  170. data/ext/vcftools/perl/vcf-concat +310 -0
  171. data/ext/vcftools/perl/vcf-convert +180 -0
  172. data/ext/vcftools/perl/vcf-fix-newlines +97 -0
  173. data/ext/vcftools/perl/vcf-isec +660 -0
  174. data/ext/vcftools/perl/vcf-merge +577 -0
  175. data/ext/vcftools/perl/vcf-query +286 -0
  176. data/ext/vcftools/perl/vcf-shuffle-cols +89 -0
  177. data/ext/vcftools/perl/vcf-sort +79 -0
  178. data/ext/vcftools/perl/vcf-stats +160 -0
  179. data/ext/vcftools/perl/vcf-subset +206 -0
  180. data/ext/vcftools/perl/vcf-to-tab +112 -0
  181. data/ext/vcftools/perl/vcf-validator +145 -0
  182. data/ext/vcftools/website/.svn/all-wcprops +41 -0
  183. data/ext/vcftools/website/.svn/entries +238 -0
  184. data/ext/vcftools/website/.svn/prop-base/VCF-poster.pdf.svn-base +5 -0
  185. data/ext/vcftools/website/.svn/prop-base/favicon.ico.svn-base +5 -0
  186. data/ext/vcftools/website/.svn/prop-base/favicon.png.svn-base +5 -0
  187. data/ext/vcftools/website/.svn/text-base/Makefile.svn-base +6 -0
  188. data/ext/vcftools/website/.svn/text-base/README.svn-base +2 -0
  189. data/ext/vcftools/website/.svn/text-base/VCF-poster.pdf.svn-base +0 -0
  190. data/ext/vcftools/website/.svn/text-base/default.css.svn-base +250 -0
  191. data/ext/vcftools/website/.svn/text-base/favicon.ico.svn-base +0 -0
  192. data/ext/vcftools/website/.svn/text-base/favicon.png.svn-base +0 -0
  193. data/ext/vcftools/website/Makefile +6 -0
  194. data/ext/vcftools/website/README +2 -0
  195. data/ext/vcftools/website/VCF-poster.pdf +0 -0
  196. data/ext/vcftools/website/default.css +250 -0
  197. data/ext/vcftools/website/favicon.ico +0 -0
  198. data/ext/vcftools/website/favicon.png +0 -0
  199. data/ext/vcftools/website/img/.svn/all-wcprops +53 -0
  200. data/ext/vcftools/website/img/.svn/entries +300 -0
  201. data/ext/vcftools/website/img/.svn/prop-base/bg.gif.svn-base +5 -0
  202. data/ext/vcftools/website/img/.svn/prop-base/bgcode.gif.svn-base +5 -0
  203. data/ext/vcftools/website/img/.svn/prop-base/bgcontainer.gif.svn-base +5 -0
  204. data/ext/vcftools/website/img/.svn/prop-base/bgul.gif.svn-base +5 -0
  205. data/ext/vcftools/website/img/.svn/prop-base/header.gif.svn-base +5 -0
  206. data/ext/vcftools/website/img/.svn/prop-base/li.gif.svn-base +5 -0
  207. data/ext/vcftools/website/img/.svn/prop-base/quote.gif.svn-base +5 -0
  208. data/ext/vcftools/website/img/.svn/prop-base/search.gif.svn-base +5 -0
  209. data/ext/vcftools/website/img/.svn/text-base/bg.gif.svn-base +0 -0
  210. data/ext/vcftools/website/img/.svn/text-base/bgcode.gif.svn-base +0 -0
  211. data/ext/vcftools/website/img/.svn/text-base/bgcontainer.gif.svn-base +0 -0
  212. data/ext/vcftools/website/img/.svn/text-base/bgul.gif.svn-base +0 -0
  213. data/ext/vcftools/website/img/.svn/text-base/header.gif.svn-base +0 -0
  214. data/ext/vcftools/website/img/.svn/text-base/li.gif.svn-base +0 -0
  215. data/ext/vcftools/website/img/.svn/text-base/quote.gif.svn-base +0 -0
  216. data/ext/vcftools/website/img/.svn/text-base/search.gif.svn-base +0 -0
  217. data/ext/vcftools/website/img/bg.gif +0 -0
  218. data/ext/vcftools/website/img/bgcode.gif +0 -0
  219. data/ext/vcftools/website/img/bgcontainer.gif +0 -0
  220. data/ext/vcftools/website/img/bgul.gif +0 -0
  221. data/ext/vcftools/website/img/header.gif +0 -0
  222. data/ext/vcftools/website/img/li.gif +0 -0
  223. data/ext/vcftools/website/img/quote.gif +0 -0
  224. data/ext/vcftools/website/img/search.gif +0 -0
  225. data/ext/vcftools/website/src/.svn/all-wcprops +53 -0
  226. data/ext/vcftools/website/src/.svn/entries +300 -0
  227. data/ext/vcftools/website/src/.svn/text-base/docs.inc.svn-base +202 -0
  228. data/ext/vcftools/website/src/.svn/text-base/index.inc.svn-base +52 -0
  229. data/ext/vcftools/website/src/.svn/text-base/index.php.svn-base +80 -0
  230. data/ext/vcftools/website/src/.svn/text-base/license.inc.svn-base +27 -0
  231. data/ext/vcftools/website/src/.svn/text-base/links.inc.svn-base +13 -0
  232. data/ext/vcftools/website/src/.svn/text-base/options.inc.svn-base +654 -0
  233. data/ext/vcftools/website/src/.svn/text-base/perl_module.inc.svn-base +249 -0
  234. data/ext/vcftools/website/src/.svn/text-base/specs.inc.svn-base +18 -0
  235. data/ext/vcftools/website/src/docs.inc +202 -0
  236. data/ext/vcftools/website/src/index.inc +52 -0
  237. data/ext/vcftools/website/src/index.php +80 -0
  238. data/ext/vcftools/website/src/license.inc +27 -0
  239. data/ext/vcftools/website/src/links.inc +13 -0
  240. data/ext/vcftools/website/src/options.inc +654 -0
  241. data/ext/vcftools/website/src/perl_module.inc +249 -0
  242. data/ext/vcftools/website/src/specs.inc +18 -0
  243. data/lib/config.ru +9 -0
  244. data/lib/ngs_server/add.rb +9 -0
  245. data/lib/ngs_server/version.rb +1 -1
  246. data/lib/ngs_server.rb +55 -3
  247. data/ngs_server.gemspec +5 -2
  248. metadata +296 -6
@@ -0,0 +1,206 @@
+ #!/usr/bin/env perl
+ #
+ # Author: petr.danecek@sanger
+ #
+
+ use strict;
+ use warnings;
+ use Carp;
+ use Vcf;
+
+ my $opts = parse_params();
+ vcf_subset($opts);
+
+ exit;
+
+ #--------------------------------
+
+ sub error
+ {
+     my (@msg) = @_;
+     if ( scalar @msg )
+     {
+         croak @msg;
+     }
+     die
+         "Usage: vcf-subset [OPTIONS] in.vcf.gz > out.vcf\n",
+         "Options:\n",
+         " -c, --columns <string> File or comma-separated list of columns to keep in the vcf file. If file, one column per row\n",
+         " -e, --exclude-ref Exclude rows not containing variants.\n",
+         " -f, --force Proceed anyway even if VCF does not contain some of the samples.\n",
+         " -p, --private Print only rows where only the subset columns carry an alternate allele.\n",
+         " -r, --replace-with-ref Replace the excluded types with reference allele instead of dot.\n",
+         " -t, --type <list> Comma-separated list of variant types to include: SNPs,indels.\n",
+         " -u, --keep-uncalled Do not exclude rows without calls.\n",
+         " -h, -?, --help This help message.\n",
+         "Examples:\n",
+         " cat in.vcf | vcf-subset -r -t indels -e -c SAMPLE1 > out.vcf\n",
+         "\n";
+ }
+
+
+ sub parse_params
+ {
+     my $opts = { exclude_ref=>0, keep_uncalled=>0, replace_with_ref=>0, private=>0, args=>[$0, @ARGV] };
+     while (my $arg=shift(@ARGV))
+     {
+         if ( $arg eq '-t' || $arg eq '--type' )
+         {
+             my %known = ( SNPs=>'s', indels=>'i' );
+             my $types = shift(@ARGV);
+             for my $t (split(/,/,$types))
+             {
+                 if ( !(exists($known{$t})) ) { error("Unknown type [$t] with -t [$types]\n"); }
+                 $$opts{types}{$known{$t}} = 1;
+             }
+             next;
+         }
+         if ( $arg eq '-e' || $arg eq '--exclude-ref' ) { $$opts{'exclude_ref'} = 1; next }
+         if ( $arg eq '-f' || $arg eq '--force' ) { $$opts{'force'} = 1; next }
+         if ( $arg eq '-p' || $arg eq '--private' ) { $$opts{'private'} = 1; next }
+         if ( $arg eq '-r' || $arg eq '--replace-with-ref' ) { $$opts{'replace_with_ref'} = 1; next }
+         if ( $arg eq '-u' || $arg eq '--keep-uncalled' ) { $$opts{'keep_uncalled'} = 1; next }
+         if ( $arg eq '-c' || $arg eq '--columns' ) { $$opts{'columns_file'} = shift(@ARGV); next }
+         if ( $arg eq '-?' || $arg eq '-h' || $arg eq '--help' ) { error(); }
+         if ( -e $arg ) { $$opts{file} = $arg; next }
+         if ( -e $arg or $arg=~m{^(?:ftp|http)://} ) { $$opts{file}=$arg; next; }
+         error("Unknown parameter \"$arg\". Run -h for help.\n");
+     }
+     if ( !$$opts{exclude_ref} && !$$opts{'columns_file'} && !exists($$opts{'types'}) ) { error("Missing the -c or -t or -r option.\n") }
+     if ( exists($$opts{types}) ) { $$opts{types}{r}=1; }
+     return $opts;
+ }
+
+
+ sub read_columns
+ {
+     my ($fname) = @_;
+     my @columns;
+     if ( !-e $fname )
+     {
+         @columns = split(/,/,$fname);
+         return \@columns;
+     }
+     open(my $fh,'<',$fname) or error("$fname: $!");
+     while (my $line=<$fh>)
+     {
+         chomp($line);
+         $line=~s/\s+//g;
+         push @columns, $line;
+     }
+     close($fh);
+     return \@columns;
+ }
+
+ sub check_columns
+ {
+     my ($opts,$vcf,$columns) = @_;
+     my @out;
+     for my $col (@$columns)
+     {
+         if ( exists($$vcf{has_column}{$col}) )
+         {
+             push @out, $col;
+             next;
+         }
+
+         my $msg = qq[No such column in the VCF file: "$col"\n];
+         if ( $$opts{force} ) { warn($msg); }
+         else { error($msg); }
+     }
+     return \@out;
+ }
+
+ sub vcf_subset
+ {
+     my ($opts) = @_;
+
+     my $vcf = $$opts{file} ? Vcf->new(file=>$$opts{file}) : Vcf->new(fh=>\*STDIN);
+     $vcf->parse_header();
+
+     # Init requested column info. If not present, include all columns.
+     my $columns = exists($$opts{columns_file}) ? read_columns($$opts{columns_file}) : [];
+     $columns = check_columns($opts,$vcf,$columns);
+     if ( !@$columns && (my $ncols=@{$$vcf{columns}})>9 )
+     {
+         push @$columns, @{$$vcf{columns}}[9..($ncols-1)];
+     }
+     my %has_col = map { $_ => 1 } @$columns;
+
+     $vcf->add_header_line({key=>'source',value=>join(' ',@{$$opts{args}})},append=>'timestamp');
+     $vcf->set_samples(include=>$columns) unless $$opts{private};
+     print $vcf->format_header($columns);
+
+     my $check_private = $$opts{private};
+     while (my $x=$vcf->next_data_hash())
+     {
+         my $site_has_call = 0;
+         my $site_has_nonref = 0;
+         my $site_is_private = 1;
+         my $ref = $$x{REF};
+
+         for my $col (keys %{$$x{gtypes}})
+         {
+             if ( !$has_col{$col} && ($site_is_private==0 || !$check_private) )
+             {
+                 # This column is not to be printed
+                 delete($$x{gtypes}{$col});
+                 next;
+             }
+
+             my ($alleles,$seps,$is_phased,$is_empty) = $vcf->parse_haplotype($x,$col);
+             my $sample_has_call = 0;
+             my $sample_has_nonref = 0;
+             my @out_alleles;
+
+             for (my $i=0; $i<@$alleles; $i++)
+             {
+                 my ($type,$len,$ht) = $vcf->event_type($ref,$$alleles[$i]);
+                 $out_alleles[$i] = $$alleles[$i];
+
+                 # Exclude unwanted variant types if requested
+                 if ( exists($$opts{types}) )
+                 {
+                     if ( !exists($$opts{types}{$type}) )
+                     {
+                         $out_alleles[$i] = $$opts{replace_with_ref} ? $ref : '.';
+                         next;
+                     }
+                     $sample_has_call = 1;
+                 }
+                 elsif ( !$is_empty ) { $sample_has_call = 1; }
+                 if ( $type ne 'r' ) { $site_has_nonref = 1; $sample_has_nonref = 1; }
+             }
+             if ( $check_private && !$has_col{$col} )
+             {
+                 if ( $sample_has_nonref ) { $site_is_private=0; }
+                 delete($$x{gtypes}{$col});
+                 next;
+             }
+             if ( !$sample_has_call )
+             {
+                 if ( $$opts{replace_with_ref} )
+                 {
+                     for (my $i=0; $i<@$alleles; $i++) { $out_alleles[$i] = $ref; }
+                 }
+                 else
+                 {
+                     for (my $i=0; $i<@$alleles; $i++) { $out_alleles[$i] = '.'; }
+                 }
+             }
+             else
+             {
+                 $site_has_call = 1;
+             }
+             $$x{gtypes}{$col}{GT} = $vcf->format_haplotype(\@out_alleles,$seps);
+         }
+
+         if ( !$site_has_call && !$$opts{keep_uncalled} ) { next; }
+         if ( !$site_has_nonref && $$opts{exclude_ref} ) { next; }
+         if ( $check_private && (!$site_is_private || !$site_has_nonref) ) { next; }
+
+         $vcf->format_genotype_strings($x);
+         print $vcf->format_line($x,$columns);
+     }
+ }
+
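The `--private` rule in the vcf-subset hunk above (print a row only when the subset columns, and no others, carry an alternate allele) can be illustrated in isolation. The following is a minimal sketch, not the script's own code: the helper name `site_is_private` and its input layout (sample name mapped to a list of allele indexes, with `0` for REF and `.` for no call) are assumptions made for this example, whereas the script itself classifies alleles through `Vcf.pm`.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from vcf-subset.
# $gts:    hashref, sample name => arrayref of allele indexes ('0' = REF, '.' = no call)
# $subset: hashref, sample name => 1 for every sample in the subset
sub site_is_private
{
    my ($gts,$subset) = @_;
    my $subset_has_alt = 0;
    for my $sample (keys %$gts)
    {
        # Count alleles that are neither REF nor uncalled
        my $has_alt = grep { $_ ne '.' && $_ ne '0' } @{$$gts{$sample}};
        next if !$has_alt;
        return 0 if !$$subset{$sample};   # a sample outside the subset carries ALT
        $subset_has_alt = 1;
    }
    # Private only if at least one subset sample carries ALT
    return $subset_has_alt;
}

my %subset = ( SAMPLE1 => 1 );
print site_is_private({ SAMPLE1=>['0','1'], SAMPLE2=>['0','0'] }, \%subset), "\n";   # 1
print site_is_private({ SAMPLE1=>['0','1'], SAMPLE2=>['1','1'] }, \%subset), "\n";   # 0
```

A site where every sample is homozygous reference also returns 0, which mirrors the script's final check that skips rows lacking any non-reference allele when `-p` is given.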
@@ -0,0 +1,112 @@
+ #!/usr/bin/env perl
+
+ use strict;
+ use warnings;
+ use Carp;
+ use Vcf;
+
+ my $opts = parse_params();
+ convert_to_tab($opts);
+
+ exit;
+
+ #--------------------------------
+
+ sub error
+ {
+     my (@msg) = @_;
+     if ( scalar @msg )
+     {
+         croak @msg;
+     }
+     die
+         "Usage: vcf-to-tab [OPTIONS] < in.vcf > out.tab\n",
+         "Options:\n",
+         " -h, -?, --help This help message.\n",
+         " -i, --iupac Use one-letter IUPAC codes\n",
+         "\n";
+ }
+
+
+ sub parse_params
+ {
+     my $opts = { iupac=>0 };
+     while (my $arg=shift(@ARGV))
+     {
+         if ( $arg eq '-?' || $arg eq '-h' || $arg eq '--help' ) { error(); }
+         if ( $arg eq '-i' || $arg eq '--iupac' ) { $$opts{iupac}=1; next; }
+         error("Unknown parameter \"$arg\". Run -h for help.\n");
+     }
+
+     if ( $$opts{iupac} )
+     {
+         $$opts{iupac} =
+         {
+             'GG' => 'G',
+             'CC' => 'C',
+             'TT' => 'T',
+             'AA' => 'A',
+
+             'GT' => 'K',
+             'TG' => 'K',
+             'AC' => 'M',
+             'CA' => 'M',
+             'CG' => 'S',
+             'GC' => 'S',
+             'AG' => 'R',
+             'GA' => 'R',
+             'AT' => 'W',
+             'TA' => 'W',
+             'CT' => 'Y',
+             'TC' => 'Y',
+
+             '..' => '.',
+         };
+     }
+
+     return $opts;
+ }
+
+
+ sub convert_to_tab
+ {
+     my ($opts) = @_;
+
+     my $iupac;
+     if ( $$opts{iupac} ) { $iupac=$$opts{iupac}; }
+
+     my $vcf = Vcf->new(fh=>\*STDIN);
+     $vcf->parse_header();
+
+     my $header_printed=0;
+
+     while (my $x=$vcf->next_data_hash())
+     {
+         if ( !$header_printed )
+         {
+             print "#CHROM\tPOS\tREF";
+             for my $col (sort keys %{$$x{gtypes}})
+             {
+                 print "\t$col";
+             }
+             print "\n";
+
+             $header_printed = 1;
+         }
+
+         print "$$x{CHROM}\t$$x{POS}\t$$x{REF}";
+         for my $col (sort keys %{$$x{gtypes}})
+         {
+             my ($al1,$sep,$al2) = exists($$x{gtypes}{$col}{GT}) ? $vcf->parse_alleles($x,$col) : ('.','/','.');
+             my $gt = $al1.'/'.$al2;
+             if ( $iupac )
+             {
+                 if ( !exists($$iupac{$gt}) ) { error(qq[Unknown IUPAC code for "$al1$sep$al2" .. $$x{CHROM}:$$x{POS} $col\n]); }
+                 $gt = $$iupac{$gt};
+             }
+             print "\t".$gt;
+         }
+         print "\n";
+     }
+ }
+
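The IUPAC table in the vcf-to-tab hunk above lists each heterozygous pair in both allele orders. A minimal sketch of the same mapping, assuming a hypothetical helper `iupac_code` that sorts the two alleles first so each pair needs only one table entry, might look as follows. Note that in the hunk as shown the lookup key is built as `$al1.'/'.$al2`, while the table keys contain no separator; this sketch uses separator-free keys and is not a statement about how the released script behaves.

```perl
use strict;
use warnings;

# Hypothetical lookup table: one entry per unordered allele pair.
my %IUPAC = (
    'AA' => 'A', 'CC' => 'C', 'GG' => 'G', 'TT' => 'T',
    'AC' => 'M', 'AG' => 'R', 'AT' => 'W',
    'CG' => 'S', 'CT' => 'Y', 'GT' => 'K',
    '..' => '.',
);

# Hypothetical helper, not part of vcf-to-tab.
# Returns the one-letter code, or undef for pairs the table does not cover
# (for example indel alleles).
sub iupac_code
{
    my ($al1,$al2) = @_;
    my $key = join('', sort { $a cmp $b } ($al1,$al2));
    return $IUPAC{$key};
}

print iupac_code('T','G'), "\n";   # K
print iupac_code('A','A'), "\n";   # A
```

Returning `undef` for unknown pairs leaves the caller free to raise an error, as the script does, or to fall back to printing the raw genotype.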
@@ -0,0 +1,145 @@
+ #!/usr/bin/env perl
+ #
+ # Author: petr.danecek@sanger
+ #
+
+ use strict;
+ use warnings;
+ use Carp;
+ use Vcf;
+ use IPC::Open3 'open3';
+ use IO::Select;
+
+ my $opts = parse_params();
+ do_validation($opts);
+
+ exit;
+
+ #--------------------------------
+
+ sub error
+ {
+     my (@msg) = @_;
+     if ( scalar @msg )
+     {
+         croak @msg;
+     }
+     die
+         "Usage: vcf-validator [OPTIONS] file.vcf.gz\n",
+         "Options:\n",
+         " -d, --duplicates Warn about duplicate positions.\n",
+         " -u, --unique-messages Output all messages only once.\n",
+         " -h, -?, --help This help message.\n",
+         "\n";
+ }
+
+
+ sub parse_params
+ {
+     my $opts = { unique=>0, duplicates=>0 };
+     while (my $arg=shift(@ARGV))
+     {
+         if ( $arg eq '-d' || $arg eq '--duplicates' ) { $$opts{duplicates}=1; next; }
+         if ( $arg eq '-u' || $arg eq '--unique-messages' ) { $$opts{unique}=1; next; }
+         if ( $arg eq '-?' || $arg eq '-h' || $arg eq '--help' ) { error(); }
+         if ( -e $arg && !exists($$opts{file}) ) { $$opts{file}=$arg; next; }
+         error("Unknown parameter or non-existent file: \"$arg\". Run -h for help.\n");
+     }
+     return $opts;
+ }
+
+ sub do_validation
+ {
+     my ($opts) = @_;
+
+     my %opts = $$opts{file} ? (file=>$$opts{file}) : (fh=>\*STDIN);
+     my $vcf = Vcf->new(%opts, warn_duplicates=>$$opts{duplicates});
+
+     if ( !$$opts{unique} )
+     {
+         $vcf->run_validation();
+         return;
+     }
+
+     my ($kid_in,$kid_out,$kid_err);
+
+     my $pid = open3($kid_in,$kid_out,$kid_err,'-');
+     if ( !defined $pid ) { error("Cannot fork: $!"); }
+
+     if ($pid)
+     {
+         $$opts{known_lines} = [];
+
+         my $sel = new IO::Select;
+         $sel->add($kid_out,$kid_err);
+
+         while(my @ready = $sel->can_read)
+         {
+             foreach my $fh (@ready)
+             {
+                 my $line = <$fh>;
+                 if (not defined $line)
+                 {
+                     $sel->remove($fh);
+                     next;
+                 }
+                 print_or_discard_line($opts,$line);
+             }
+         }
+         print_summary($opts);
+     }
+     else
+     {
+         $vcf->run_validation();
+         return;
+     }
+ }
+
+ sub print_or_discard_line
+ {
+     my ($opts,$line) = @_;
+
+     my @items = split(/\s+/,$line);
+     my $nitems = scalar @items;
+
+     for my $known (@{$$opts{known_lines}})
+     {
+         if ( @items != @{$$known{line}} ) { next; }
+
+         my $nmatches = 0;
+         for (my $i=0; $i<$nitems; $i++)
+         {
+             if ( $items[$i] eq $$known{line}[$i] ) { $nmatches++ }
+         }
+
+         if ( $nitems-$nmatches<3 )
+         {
+             $$known{n}++;
+             return;
+         }
+     }
+
+     push @{$$opts{known_lines}}, { line=>\@items, n=>1 };
+     print $line;
+ }
+
+ sub print_summary
+ {
+     my ($opts) = @_;
+     my $n = 0;
+     for my $error (@{$$opts{known_lines}})
+     {
+         $n += $$error{n};
+     }
+     print "\n\n------------------------\n";
+     print "Summary:\n";
+     printf "\t%d errors total \n\n", $n;
+
+     $n = 0;
+     for my $error (sort {$$b{n}<=>$$a{n}} @{$$opts{known_lines}})
+     {
+         if ( $n++ > 50 ) { print "\n\nand more...\n"; last; }
+         printf "\t%d\t..\t%s\n", $$error{n},join(' ',@{$$error{line}});
+     }
+ }
+
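The `-u` mode in the vcf-validator hunk above treats two messages as the same when they have an equal number of whitespace-separated tokens and differ in fewer than three of them. The comparison can be sketched on its own as below; `is_near_duplicate` is a hypothetical name introduced for this example, and the sample messages are invented rather than real validator output.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the matching rule used by
# print_or_discard_line: same token count, fewer than 3 differing tokens.
sub is_near_duplicate
{
    my ($known,$items) = @_;          # two arrayrefs of tokens
    return 0 if @$items != @$known;
    # grep in scalar context counts the positions where tokens agree
    my $nmatches = grep { $$items[$_] eq $$known[$_] } 0..$#$items;
    return ( @$items - $nmatches < 3 ) ? 1 : 0;
}

# Invented messages: they differ only in the sample name and the position.
my @first  = split(/\s+/, "column NA001 at 1:100 .. bad GT");
my @second = split(/\s+/, "column NA002 at 1:250 .. bad GT");
print is_near_duplicate(\@first,\@second) ? "duplicate\n" : "new\n";   # duplicate
```

Because the threshold is a fixed count of three tokens rather than a fraction of the line, very short messages tend to collapse into one another while long ones stay distinct only when several fields change.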
@@ -0,0 +1,84 @@
+ 2011-04-04 14:00 petr.danecek@sanger
+     * VCFtools now support VCFv4.1
+     * fill-ref-md5: New tool backfilling sequence MD5s into VCF header
+     * Renamed merge-vcf, compare-vcf etc. to consistent naming vcf-merge, vcf-compare
+     * vcf-merge: Now merging also GL and other Number=[AG] tags
+     * vcf-compare: Comparing indel haplotypes
+
+ 2011-02-21 12:31 petr.danecek@sanger
+     * vcf-stats: new -s option to speed up parsing when stats computed for selected samples only
+     * merge-vcf: allow to merge arbitrary chunks; -c option now deprecated, use -r instead
+     * compare-vcf: change in output format and more detailed comparison
+
+ 2011-02-17 17:36 petr.danecek@sanger
+     * vcf-stats: allow querying stats of individual samples
+
+ 2011-02-16 12:07 petr.danecek@sanger
+     * vcf-stats: major revision
+     * vcf-annotate: more filtering options
+
+ 2011-02-04 14:43 petr
+     * merge-vcf: if possible, calculate AC,AN even for sites without genotypes
+
+ 2011-02-03 15:04 petr
+     * merge-vcf: fixed a bug introduced by the previous fix.
+
+ 2011-02-02 21:02 petr
+     * merge-vcf: fixed a bug in merging indel ALTs. Only VCFs without samples were affected.
+
+ 2011-01-28 15:38 petr
+     * vcf-subset: new option for printing rows with calls private to the subset group
+
+ 2011-01-24 13:38 petr
+     * Vcf.pm: uppercase floating point number expressions (such as
+       1.0382033E-6) now pass validation
+
+ 2011-01-20 08:28 petr
+     * vcf-concat: print header also for empty VCFs with the -s option
+
+ 2011-01-04 08:59 petr
+     * vcf-isec, vcf-sort, Vcf.pm: replaced "zcat" by "gunzip -c"
+
+ 2010-12-22 14:18 petr
+     * vcf-annotate: New --SnpCluster option
+     * Vcf.pm: new sub add_filter()
+
+ 2010-12-15 13:44 petr
+     * vcf-isec: By default output records from all files with unique positions
+       (duplicate records from the same file still should be printed). With the -o
+       switch, only positions from the left-most file will be printed.
+
+ 2010-12-09 14:48 petr
+     * query-vcf: Output 'True' for Flag tags when present and . when absent
+     * vcf-annotate: Fix: the command line eats quotes when they are not escaped
+
+ 2010-12-08 12:06 petr
+     * Vcf.pm: throw an error when tabix fails.
+     * query-vcf: enable streaming of files when region is not specified.
+
+ 2010-12-02 11:53 petr
+     * Vcf.pm: allow ALT alleles which are not present in samples
+     * vcf-isec: Multiple files can be created simultaneously with all possible
+       isec combinations. Suitable for Venn Diagram analysis.
+     * merge-vcf: Do not remove ALT alleles if no samples are present
+     * merge-vcf: Do FILTER merging more intelligently.
+     * merge-vcf: Join the QUAL column: use average value weighted by the number of samples.
+
+ 2010-11-28 08:34 petr
+     * vcf-concat: Partial sort
+     * vcf-validator: Added -u option
+     * VcfStats.pm: dump_counts
+
+ 2010-11-27 13:04 petr
+     * vcf-subset: Filter variants by type
+
+ 2010-11-26 09:08 petr
+     * vcf-annotate: Added possibility to read header descriptions from a file
+
+ 2010-11-24 13:25 petr
+     * Fix in Vcf.pm:fill_ref_alt_mapping. VCF files processed with merge-vcf were
+       affected when containing IDs in the ALT column.
+
+ 2010-11-23 13:12 petr
+     * Major revamp of Vcf.pm to allow better inheritance. Problems likely.
+
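The 2010-12-02 ChangeLog entry above says merge-vcf joins the QUAL column using an average weighted by the number of samples. The arithmetic can be sketched as follows; this is an illustration of the stated rule only, not code from merge-vcf. The helper name `weighted_qual`, the record layout, and the choice to skip records whose QUAL is missing are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating a sample-weighted QUAL average.
# Each record is a hashref: { qual => number or '.', nsamples => count }.
sub weighted_qual
{
    my (@records) = @_;
    my ($sum,$n) = (0,0);
    for my $rec (@records)
    {
        # Assumption: records without a QUAL value do not contribute
        next if !defined($$rec{qual}) || $$rec{qual} eq '.';
        $sum += $$rec{qual} * $$rec{nsamples};
        $n   += $$rec{nsamples};
    }
    return $n ? $sum/$n : '.';
}

# One file with 1 sample and QUAL 30, one with 2 samples and QUAL 60:
# (30*1 + 60*2) / 3 = 50
print weighted_qual({ qual=>30, nsamples=>1 }, { qual=>60, nsamples=>2 }), "\n";
```

Weighting by sample count means a QUAL computed from a large cohort is not diluted by a value that came from a single-sample file.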