PoPoolation DB, Institute of Population Genetics, University of Veterinary Medicine Vienna
 
User Manual for PoPoolation DB
1. PoPoolation DB query page
 
Users can query PoPoolation DB in several ways, as shown in Figure 1. To run a query, follow these steps:
[1] Choose a query option (Query by region / Query by gene / Query by fasta sequence / Query by gene GTF) and enter a search term (its format depends on the query option).

[2] Select a population ID (default: Dmel_Portugal_all).

[3] Select a reference genome (needed for a query by fasta sequence, where it is used to run BLAT).

[4] Choose data to retrieve (Pi, Theta, D or Only Polymorphism).

[5] Enter the length of the 5' and 3' flanking regions to add to the query (default: 0 base pairs).

[6] Input values for the Advanced Parameters. These parameters define the properties of the data to be queried.

Minimum Count: Minimum number of times an allele has to appear to be included in the measurements (default: 2).
Minimum Quality Score: Minimum base quality required for a base to be considered in the analysis (default: 20).
Minimum Coverage: Minimum number of allele observations at a nucleotide position (default: 4).
Maximum Coverage: Maximum number of allele observations at a nucleotide position (default: 300).
Minimum Covered Fraction: Proportion of the window that must have sufficient coverage for the statistic to be calculated (default: 0.6).
Window Size: Length in bp of the window within which the measurement is calculated (default: 1,000 bp).
Step Size: Number of bp by which the window slides to define the next window (default: 100 bp).

Variability search input page [http://www.popoolation.at/pgt/]
Figure 1: shows PoPoolation DB natural polymorphism query page
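
The Window Size, Step Size and coverage parameters described in step [6] can be illustrated with a short Python sketch. It is not the PoPoolation DB implementation: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive, thresholds are assumed to be inclusive, and a window that does not fit completely inside the queried region is assumed to be skipped. The region is the one used as an example later in this manual.

```python
def sliding_windows(start, end, window_size=1000, step_size=100):
    """Yield (window_start, window_end) pairs covering start..end.

    Assumption: 1-based inclusive coordinates; trailing partial
    windows are dropped.
    """
    pos = start
    while pos + window_size - 1 <= end:
        yield (pos, pos + window_size - 1)
        pos += step_size


def window_is_usable(coverages, min_coverage=4, max_coverage=300,
                     min_covered_fraction=0.6):
    """Return True if enough positions in the window are sufficiently covered.

    `coverages` holds one coverage value per position in the window.
    Assumption: a position counts as covered when its coverage lies
    between min_coverage and max_coverage (both inclusive).
    """
    if not coverages:
        return False
    covered = sum(1 for c in coverages if min_coverage <= c <= max_coverage)
    return covered / len(coverages) >= min_covered_fraction


# Windows for the example region on chromosome X with the default settings.
windows = list(sliding_windows(2044135, 2053016))
print(windows[0])    # first window: (2044135, 2045134)
print(len(windows))  # number of complete windows in the region
```

With the defaults, consecutive windows overlap by 900 bp, so neighboring values in the resulting track are not independent.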
2. PoPoolation DB result page
 
An example output of a natural polymorphism search in PoPoolation DB is shown in Figure 2. The result page consists of 5 parts.
[1] Visual output of the pi, theta or D track, displayed in any of three browsers: UCSC Genome Browser / FlyBase Genome Browser / FlyBase RNA-Seq Genome Browser

[2] Quantiles for chromosome: Under Quantiles 95% (pi) and Quantiles 99% (pi), PoPoolation DB lists precalculated chromosome-wise quantiles for the parameter of interest. Lower and Upper refer to the lower and upper quantiles of the chromosome-wise parameter estimate.

[3] SNP Table: Tabular format displaying information on each of the SNPs in the queried region. For more information on the SNP Table see below (5. SNP Table).

[4] Indel Table: Tabular format displaying information on each of the Insertions and Deletions in the queried region. For more information on the Indel Table see below (6. Indel Table).

[5] Natural Polymorphism Search Criteria: An important feature of PoPoolation DB is that its results are reproducible, which promotes transparency in data analysis. For each analysis performed, PoPoolation DB prints a log file with the parameters defined by the user. A sample output is shown in Figure 5.

Alternatively, the Retrieve option Only Polymorphism displays its results on a page containing only the SNP Table and Indel Table mentioned above. For more information on these tables see sections 5 and 6.
Figure 2: shows a sample PoPoolation DB natural polymorphism output window
 
3. View Track In The UCSC Genome Browser Or In FlyBase Genome Browser
 
An example of the visual output of a query for population variation (pi) is displayed in Figures 3 and 4 in the UCSC and FlyBase Genome Browsers, respectively. It shows a sliding window analysis of pi in a Portuguese D. melanogaster population for the region X: 2044135-2053016. The pronounced drop in variability around the gene crm has been described previously and suggests that at least one favorable mutation has recently spread in cosmopolitan D. melanogaster populations.
Figure 3: shows a sample pi track in UCSC Genome Browser
 
Figure 4: shows a sample pi track in FlyBase Genome Browser
 
4. View Track In FlyBase RNA-seq Genome Browser
 
An example of the visual output of a query for population variation (pi) is displayed in Figure 5 in the FlyBase RNA-seq Genome Browser. It shows a sliding window analysis of pi in a Portuguese D. melanogaster population for the region X: 2044135-2053016. The pronounced drop in variability around the gene crm has been described previously and suggests that at least one favorable mutation has recently spread in cosmopolitan D. melanogaster populations.
Figure 5: shows a sample pi track in FlyBase RNA-Seq Genome Browser
 
5. SNP table
 
To complement the results on population variation parameters or Tajima's D, PoPoolation DB provides a description of all SNPs in the queried region in a tabular format. Alternatively, the users can specify that only this information be retrieved from the database by choosing the Retrieve option: Only Polymorphism.

For each SNP in the queried region, the SNP table contains the counts of each allelic state in the sequenced population, the coverage at that position and the allelic state in the D. melanogaster reference genome. The SNP table also contains the amino acid state in the reference genome and indicates whether the polymorphism present in the sequenced population changes this character state. For each position in the queried region, PoPoolation DB prints the annotated features available for the fragment (e.g. introns, CDS) as stored in the GFF3 files in FlyBase. It also provides a hyperlink to the corresponding position in FlyBase.
Figure 6: shows a sample SNP table from PoPoolation DB
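
As a rough illustration of how the allele counts and coverage in the SNP table relate to the Minimum Count, Minimum Coverage and Maximum Coverage parameters, the following Python sketch classifies a single position. It is a sketch under stated assumptions, not the database's own code: the function name, the return labels and the example counts are hypothetical, coverage is assumed to be the sum of all allele counts, and the thresholds are assumed to be inclusive.

```python
def classify_site(counts, min_count=2, min_coverage=4, max_coverage=300):
    """Classify one position from its allele counts.

    `counts` maps each allele (e.g. "A") to the number of times it
    was observed. Returns a (label, alleles) tuple, where `alleles`
    lists the alleles that reach min_count.
    """
    # Assumption: coverage is the total number of allele observations.
    coverage = sum(counts.values())
    if coverage < min_coverage or coverage > max_coverage:
        return "insufficient coverage", []
    # Keep only alleles seen at least min_count times.
    alleles = sorted(a for a, n in counts.items() if n >= min_count)
    label = "SNP" if len(alleles) >= 2 else "not polymorphic"
    return label, alleles


# Hypothetical counts: the single T is below the default Minimum Count of 2.
print(classify_site({"A": 30, "C": 0, "G": 12, "T": 1}))
```

In this example the position is still reported as a SNP because both A and G reach the minimum count, while the single T observation is ignored.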
6. Indel table
 
To complement the results on population variation parameters or Tajima's D, PoPoolation DB also provides a description of the Insertions and Deletions (Indels) in the queried region in a tabular format. As for the SNP Table, the users can specify that only this information be retrieved from the database by choosing the Retrieve option: Only Polymorphism.

For each position with an indel, PoPoolation DB shows the nucleotide sequence that is inserted or deleted. Deletions are marked with a minus sign and a number indicating how many nucleotides were deleted (e.g. a deleted A results in -1A). Insertions are reported in the same way but with a plus sign (e.g. an insertion of four nucleotides results in +4ATCG). For positions with complex indels (deletions and insertions at the same position), PoPoolation DB prints a separate row for each type of change. For each indel, the position displayed in the table is the position in the reference genome immediately before the indel (e.g. 1401 -1T refers to the nucleotide position before the indel). To assess the coverage around the indel, PoPoolation DB calculates the average coverage of the 5 neighboring nucleotides on both the 5' and 3' sides of the indel.
Figure 7: shows a sample Indel table from PoPoolation DB
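
The indel notation described above (for example -1A or +4ATCG) can be split into its parts with a few lines of Python. This is a minimal sketch for users who post-process the Indel table themselves. The function name is hypothetical, and the sketch assumes that the sequence consists only of the letters A, C, G, T or N and that the number always equals the length of the sequence.

```python
import re

# Sign, length and sequence, e.g. "+4ATCG" -> "+", "4", "ATCG".
INDEL_PATTERN = re.compile(r"^([+-])(\d+)([ACGTN]+)$", re.IGNORECASE)


def parse_indel(token):
    """Split an indel token into (kind, length, sequence)."""
    match = INDEL_PATTERN.match(token.strip())
    if match is None:
        raise ValueError(f"not a valid indel token: {token!r}")
    sign, length, sequence = match.groups()
    # Assumption: the stated length must match the sequence length.
    if int(length) != len(sequence):
        raise ValueError(f"length does not match sequence: {token!r}")
    kind = "insertion" if sign == "+" else "deletion"
    return kind, int(length), sequence.upper()


print(parse_indel("-1A"))     # ('deletion', 1, 'A')
print(parse_indel("+4ATCG"))  # ('insertion', 4, 'ATCG')
```

Rows for complex indels can be parsed one at a time, since each type of change is printed on its own row.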
