PhyResSE
Exports
Intro
If your session comprises more than one sample, the 'Reports' page also provides exports. They appear in the upper right corner only after all files have been processed. Each export summarizes all samples.
Handling
VCF (VCF): All samples are merged into one VCF file (by VCFtools: vcf-merge).
VCF-stats (Stats): Some statistics on the merged VCF file (also by VCFtools: vcf-stats).
VCF-tab (TSV): The merged VCF file is "flattened" into tab-delimited text format (also by VCFtools: vcf-to-tab, displaying at most two alleles).
FASTA (Fasta): The tab-delimited file is converted further into a FASTA file in which each sample is represented by one sequence (vcf_tab_to_fasta_alignment.pl by Christina Bergey, with only one allele at each position). Both VCF-tab and FASTA comprise only SNP genome positions. Moreover, the FASTA file currently ignores all multi-allele positions. We are working on decomposing mixed samples into separate (component) sequences and on estimating their shares (percentage of the whole sample, excluding contamination reads).
Tree: Last, a maximum likelihood tree is generated from the FASTA file by FastTree and rendered by jstree. The latter allows you to
- re-root
Click on the tree figure, uncheck "Circular", click "Draw tree", pick the new root (e.g. REF) by clicking on the blue square next to it, and choose "reroot" (both the Newick tree in the text window and the graph are updated). Then click on the tree again and re-check "Circular". You can also re-root the circular tree directly, but there the nodes are harder to select (i.e. to hit precisely with a mouse click).
- highlight
To highlight samples, click on the tree, enter a search string, and click "Search". All sample names containing the entered substring will be highlighted.
- style
To style the figure, click on the tree figure and adapt figure width, font size, and spacing.
- remove
To remove samples or whole subtrees, click on the blue box (after unchecking "Circular"). You can also collapse, ladderize, swap, multifurcate, move, or color-code any subtree. As above, all of this is also possible in circular mode, except that the nodes (no blue boxes here) are harder to hit with a mouse click.
- edit manually
Edit the Newick tree in the text window, click on the graph, and choose "Draw tree" (or copy and paste the generated Newick tree into a more powerful tree viewer).
- also on a train/plane without internet
Save the HTML page and unpack jstree.zip in the same directory.
Variants: All samples' variants in one comma-separated spreadsheet (import into e.g. Excel). It also contains base counts, i.e. occurrences of all nucleotides reliably observed (with base quality > 13) at the particular position. For unfiltered base counts see the VCF.
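The step from the VCF-tab export to the FASTA export can be illustrated with a minimal sketch, written here in Python rather than the pipeline's Perl. It is not the code of vcf_tab_to_fasta_alignment.pl. It assumes the usual vcf-to-tab layout (columns CHROM, POS, REF, then one genotype column per sample, with alleles separated by "/"); the function names, sample names, and example rows are hypothetical. Positions at which any sample shows more than one allele are dropped entirely, and missing calls are written as N; both choices are assumptions of this sketch.

```python
# Hypothetical example input in vcf-to-tab style (tab-delimited).
EXAMPLE_TAB = (
    "#CHROM\tPOS\tREF\tS1\tS2\n"
    "chr\t10\tA\tA/A\tG/G\n"
    "chr\t20\tC\tC/T\tC/C\n"   # S1 shows two alleles: position is skipped
    "chr\t30\tG\tT/T\t./.\n"   # S2 has no call: written as N
)


def tab_to_sequences(tab_text, include_ref=True):
    """Build one sequence per sample from vcf-to-tab style text.

    Only single-base positions with exactly one allele per sample are kept.
    """
    rows = [line.split("\t") for line in tab_text.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    samples = header[3:]  # first three columns are CHROM, POS, REF
    names = (["REF"] if include_ref else []) + samples
    columns = {name: [] for name in names}

    for fields in body:
        ref, calls = fields[2], fields[3:]
        bases = []
        for call in calls:
            alleles = {a for a in call.replace("|", "/").split("/") if a}
            if len(alleles) != 1:      # mixed (multi-allele) call
                bases = None
                break
            allele = alleles.pop()
            bases.append("N" if allele == "." else allele)
        # Skip mixed positions and anything that is not a single-base SNP.
        if bases is None or len(ref) != 1 or any(len(b) != 1 for b in bases):
            continue
        if include_ref:
            columns["REF"].append(ref)
        for name, base in zip(samples, bases):
            columns[name].append(base)

    return {name: "".join(seq) for name, seq in columns.items()}


def to_fasta(sequences):
    """Format a {name: sequence} mapping as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in sequences.items())


print(to_fasta(tab_to_sequences(EXAMPLE_TAB)))
```

Because every sample contributes exactly one base per retained position, all sequences have the same length and can be passed to a tree builder as an alignment without further processing.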
Computational steps
# code snippet producing all exports (embedding Newick tree in HTML for visualization by JavaScript)
while ($file=<$ARGV[0]*.bam.flt.vcf>){ # feed all single-sample vcf files
$res=`/usr/bin/bgzip -c $file > ${file}.gz`; print "$res\n";
$res=`/usr/bin/tabix -p vcf ${file}.gz`; print "$res\n";
$filecount++; $files=$files.${file}.'.gz ';
}
if ($filecount>1){ # merge and tree from min 2 files
print "summarizing (exports and tree)\n";
$res=`/usr/bin/vcf-merge ${files} > $ARGV[0]export.vcf`; print "$res\n";
$res=`/usr/bin/vcf-stats $ARGV[0]export.vcf > $ARGV[0]export.stats`; print "$res\n";
$res=`/bin/cat $ARGV[0]export.vcf | /usr/bin/vcf-to-tab > $ARGV[0]export.tab`; print "$res\n";
$res=`/usr/bin/vcf_tab_to_fasta_alignment.pl --exclude_het --output_ref -i $ARGV[0]export.tab > $ARGV[0]export.fa`; print "$res\n";
$res=`/usr/bin/FastTreeMP -nt -quiet $ARGV[0]export.fa`;
$res=~s/,/,\n/g; $res=~s/\)/\n\)/g; # break into many lines for easy copy/paste
open(Fout,"> $ARGV[0]export.html");
print Fout <<"END";
<html>
[...]
<textarea id="nhx-ex" style="display: none">
END
print Fout $res;
print Fout '</textarea></body></html>'."\n\n";
close Fout;
}
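The two substitutions applied to the FastTree output in the snippet above (a line break after every comma and before every closing parenthesis) can be sketched in Python as follows. This is only an illustration of the reformatting step; the function names and the example tree are hypothetical, and the reverse step assumes that sample names contain no whitespace.

```python
def break_newick(newick):
    """Insert a newline after each ',' and before each ')' for easy copy/paste."""
    return newick.replace(",", ",\n").replace(")", "\n)")


def join_newick(text):
    """Undo the line breaks by removing all whitespace (assumes labels have none)."""
    return "".join(text.split())


# Hypothetical three-leaf tree with branch lengths and one support value.
tree = "((A:0.1,B:0.2)0.95:0.05,REF:0.0);"
broken = break_newick(tree)
print(broken)
```

Since the added line breaks are plain whitespace, the multi-line form can be collapsed back into a single-line tree before pasting it into a viewer that expects one, as `join_newick` does here.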
Versions
VCFtools v0.1.13
tabix v1.7.2
vcf_tab_to_fasta_alignment.pl by Christina Bergey (Bergey CM (2012). vcf-tab-to-fasta; http://code.google.com/p/vcf-tab-to-fasta)
FastTree v2.1.10 as multi-threaded executable (+SSE +OpenMP)
jstree (no versioning found, from http://lh3lh3.users.sourceforge.net/jstree.shtml)