Data access

Introduction
This page gives access to the data generated in Zanini et al., eLife, e11282.
The data are available in two forms:

Download


Four kinds of data are available for download:
  1. Clinical and summary data: Tables containing clinical information (viral load, CD4+ counts, HLA types, estimates of time of infection), the patient reference sequence for mapping, and the single nucleotide polymorphism (SNP) frequency tables.
  2. Haplotypes: Extended genetic variants, including minor ones, covering standard or custom genomic regions.
  3. Sequencing reads: The filtered sequencing reads, mapped against the individual fragments of the patient-specific reference. Download the GenBank (GB) version of the reference to find the PCR fragment annotations.
  4. Counts of pairs of SNPs: The counts of pairs of single nucleotide polymorphisms, for each sample and amplicon. These data are useful to investigate linkage.

RESTful API


Use the API for online analysis or for embedding the data into another website. See the API section below for details.
Clinical and summary data
Patient  Samples  Time of infection  Viral load  CD4+ counts  Reference   Estimated depth  SNP counts
p1       12       TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p2       6        TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p3       10       TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p5       7        TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p6       7        TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p7       11       TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p8       7        TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p9       8        TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p10      9        TSV                TSV         TSV          GB | Fasta  TSV              ZIP
p11      7        TSV                TSV         TSV          GB | Fasta  TSV              ZIP
Notes:
  • Time of infection, Viral load, CD4+ counts, and HLA type: Download a tab-separated values (TSV) text file with the respective data for each patient.
  • Reference: Download the consensus sequence of the first time point in annotated GenBank (GB) or FASTA format.
  • Depth: Download a TSV file with RNA template number estimates and fragment-specific estimates of effective depth.
  • Single nucleotide polymorphism (SNP) counts: Download a set of (big) tables containing the number of ACGT-N observations at each position in the genome.
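As a rough illustration of how the SNP count tables can be turned into frequencies, the sketch below normalises the six counts at one position. The alphabet order ACGT-N follows the note above; the function name and the example numbers are hypothetical and are not taken from the data files, whose exact layout should be checked after download.

```python
# Alphabet as described in the notes: four nucleotides, gap, and N.
ALPHABET = ("A", "C", "G", "T", "-", "N")


def counts_to_frequencies(counts):
    """Convert one position's six counts (ACGT-N order) into frequencies."""
    if len(counts) != len(ALPHABET):
        raise ValueError("expected one count per symbol in ACGT-N")
    total = sum(counts)
    if total == 0:
        # No coverage at this position: frequencies are undefined.
        return None
    return {symbol: count / total for symbol, count in zip(ALPHABET, counts)}


# Hypothetical counts at a single genome position (not real data).
example = counts_to_frequencies([90, 0, 10, 0, 0, 0])
```

Positions without coverage return None rather than zeros, so they cannot be mistaken for positions where a variant is truly absent.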
Haplotypes

Precompiled alignments

These alignments are premade for regions of general interest.

New alignments

Generate alignments for a custom genomic region.

From/To require HXB2 nucleotide coordinates (both ends included).

NOTE: A few minutes might elapse while we prepare your data.
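The From/To convention (both ends included) differs from the half-open ranges used by many programming languages. The sketch below shows one way to convert such a range into a Python slice, assuming that coordinates are counted from 1. The function names and the toy sequence are hypothetical.

```python
def inclusive_range_to_slice(start, end):
    """Map a 1-based range with both ends included to a Python slice."""
    if start < 1 or end < start:
        raise ValueError("need 1 <= start <= end")
    # Python slices are 0-based and exclude the end, so only start shifts.
    return slice(start - 1, end)


def region_length(start, end):
    """Number of positions in a range with both ends included."""
    return end - start + 1


sequence = "ACGTACGTAC"  # hypothetical toy sequence, not HIV data
region = sequence[inclusive_range_to_slice(3, 6)]
```

With both ends included, a region from 3 to 6 covers four positions, not three.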

Sequencing Reads
Patient p1 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
122 BAM BAM BAM BAM BAM BAM
562 BAM BAM BAM BAM BAM BAM
1084 BAM BAM BAM BAM BAM BAM
1254 BAM BAM BAM BAM BAM BAM
1282 BAM BAM BAM BAM BAM BAM
1393 BAM BAM BAM BAM BAM BAM
1861 BAM BAM BAM BAM BAM BAM
2303 BAM BAM BAM BAM BAM BAM
2578 BAM BAM BAM BAM BAM BAM
2639 BAM BAM BAM BAM BAM BAM
2922 BAM BAM BAM BAM BAM BAM
2966 BAM BAM BAM BAM BAM BAM
Patient p2 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
74 BAM BAM BAM BAM BAM BAM
561 BAM BAM BAM BAM BAM BAM
936 BAM BAM BAM BAM BAM BAM
1255 BAM BAM BAM BAM BAM BAM
1628 BAM BAM BAM BAM - BAM
2018 BAM BAM BAM BAM BAM BAM
Patient p3 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
146 BAM BAM BAM BAM BAM BAM
501 BAM BAM BAM BAM BAM BAM
797 BAM BAM BAM BAM BAM BAM
1126 BAM BAM BAM BAM - BAM
1476 BAM BAM BAM BAM BAM BAM
1934 BAM BAM BAM BAM - BAM
2006 BAM BAM BAM BAM BAM BAM
2344 BAM BAM BAM BAM BAM BAM
2727 BAM BAM BAM BAM - BAM
3079 BAM BAM BAM BAM BAM BAM
Patient p5 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
134 - BAM BAM BAM BAM BAM
303 BAM BAM BAM BAM - BAM
713 BAM BAM BAM BAM BAM BAM
1057 BAM BAM BAM BAM BAM BAM
1414 BAM BAM BAM BAM BAM BAM
1813 BAM BAM BAM BAM BAM BAM
2149 BAM BAM BAM BAM BAM BAM
Patient p6 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
62 BAM BAM BAM BAM BAM BAM
118 BAM BAM BAM BAM BAM BAM
974 BAM BAM BAM BAM - BAM
1293 BAM BAM BAM BAM BAM BAM
1724 BAM BAM BAM BAM - BAM
2178 BAM BAM BAM BAM BAM BAM
2556 BAM BAM BAM BAM BAM BAM
Patient p7 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
1905 - - - - - -
2248 - - - - - -
2311 - - - - - -
2668 BAM - BAM BAM - BAM
2970 BAM BAM BAM BAM BAM BAM
3437 BAM BAM BAM BAM BAM BAM
3934 BAM BAM BAM BAM BAM BAM
4445 BAM BAM BAM BAM BAM BAM
5034 BAM BAM BAM BAM BAM BAM
5392 BAM BAM BAM BAM BAM BAM
5811 BAM BAM BAM BAM BAM BAM
Patient p8 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
87 BAM BAM BAM BAM BAM BAM
200 BAM BAM BAM BAM BAM BAM
570 BAM BAM BAM BAM BAM BAM
1003 BAM BAM BAM BAM BAM BAM
1437 BAM BAM BAM BAM BAM BAM
1810 BAM BAM BAM BAM BAM BAM
2208 BAM BAM BAM BAM BAM BAM
Patient p9 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
106 BAM BAM BAM BAM BAM BAM
227 BAM BAM BAM BAM BAM BAM
813 BAM BAM BAM BAM BAM BAM
1193 BAM BAM BAM BAM - BAM
1815 BAM BAM BAM BAM BAM BAM
2214 BAM BAM BAM BAM BAM BAM
2608 BAM BAM BAM BAM - BAM
2955 BAM BAM BAM BAM BAM BAM
Patient p10 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
33 BAM BAM BAM BAM BAM BAM
66 BAM BAM BAM BAM - BAM
68 BAM BAM BAM BAM - -
374 - - - - - -
530 - - - - - -
912 - - - BAM - -
1004 BAM BAM BAM BAM - BAM
2229 BAM BAM BAM - - -
2256 - - - - - -
Patient p11 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
209 BAM BAM BAM BAM BAM BAM
332 BAM BAM BAM BAM BAM BAM
572 BAM BAM BAM BAM BAM BAM
1026 BAM BAM BAM BAM BAM BAM
1396 BAM BAM BAM BAM BAM BAM
1750 BAM BAM BAM BAM - BAM
2043 BAM BAM BAM BAM BAM BAM
Notes:
  • The tables above list one patient each, in the same order as the Clinical and summary data table (p1, p2, p3, p5, p6, p7, p8, p9, p10, p11).
  • The word BAM in the table links to the mapped reads for each time point and amplified fragment.
  • For each BAM file, the mapping reference is the corresponding fragment of the GenBank (GB) patient reference sequences above. For instance, if a certain read has position 100 in amplicon F5 and patient p1, one must open the GB reference file from patient p1, browse the annotations for F5, and add 100 to the start position of F5 to obtain the position in the patient reference.
  • A dash indicates a fragment that we were unable to sequence.
  • Some of these files are large!
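The coordinate conversion described in the notes above can be sketched as follows. The start coordinate used here is a hypothetical placeholder; the real value must be read from the F5 annotation in the patient's GB reference file. The notes do not state whether positions are counted from 0 or 1, so the sketch simply applies the rule as written (start of the fragment plus position within the fragment).

```python
def fragment_to_reference(fragment_start, position_in_fragment):
    """Translate a position within an amplicon into a reference position."""
    if position_in_fragment < 0:
        raise ValueError("position within the fragment cannot be negative")
    return fragment_start + position_in_fragment


# Hypothetical fragment start; look up the real one in the GB annotations.
fragment_starts = {"F5": 5000}

# A read at position 100 in amplicon F5, as in the example in the notes.
reference_position = fragment_to_reference(fragment_starts["F5"], 100)
```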
Counts of pairs of SNPs
Patient p1 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
122 ZIP ZIP ZIP ZIP ZIP ZIP
562 ZIP ZIP ZIP ZIP ZIP ZIP
1084 ZIP ZIP ZIP ZIP ZIP ZIP
1254 ZIP ZIP ZIP ZIP ZIP ZIP
1282 ZIP ZIP ZIP ZIP ZIP ZIP
1393 ZIP ZIP ZIP ZIP ZIP ZIP
1861 ZIP ZIP ZIP ZIP ZIP ZIP
2303 ZIP ZIP ZIP ZIP ZIP ZIP
2578 ZIP ZIP ZIP ZIP ZIP ZIP
2639 ZIP ZIP ZIP ZIP ZIP ZIP
2922 ZIP ZIP ZIP ZIP ZIP ZIP
2966 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p2 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
74 ZIP ZIP ZIP ZIP ZIP ZIP
561 ZIP ZIP ZIP ZIP ZIP ZIP
936 ZIP ZIP ZIP ZIP ZIP ZIP
1255 ZIP ZIP ZIP ZIP ZIP ZIP
1628 ZIP ZIP ZIP ZIP - ZIP
2018 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p3 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
146 ZIP ZIP ZIP ZIP ZIP ZIP
501 ZIP ZIP ZIP ZIP ZIP ZIP
797 ZIP ZIP ZIP ZIP ZIP ZIP
1126 ZIP ZIP ZIP ZIP - ZIP
1476 ZIP ZIP ZIP ZIP ZIP ZIP
1934 ZIP ZIP ZIP ZIP - ZIP
2006 ZIP ZIP ZIP ZIP ZIP ZIP
2344 ZIP ZIP ZIP ZIP ZIP ZIP
2727 ZIP ZIP ZIP ZIP - ZIP
3079 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p5 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
134 - ZIP ZIP ZIP ZIP ZIP
303 ZIP ZIP ZIP ZIP - ZIP
713 ZIP ZIP ZIP ZIP ZIP ZIP
1057 ZIP ZIP ZIP ZIP ZIP ZIP
1414 ZIP ZIP ZIP ZIP ZIP ZIP
1813 ZIP ZIP ZIP ZIP ZIP ZIP
2149 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p6 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
62 ZIP ZIP ZIP ZIP ZIP ZIP
118 ZIP ZIP ZIP ZIP ZIP ZIP
974 ZIP ZIP ZIP ZIP - ZIP
1293 ZIP ZIP ZIP ZIP ZIP ZIP
1724 ZIP ZIP ZIP ZIP - ZIP
2178 ZIP ZIP ZIP ZIP ZIP ZIP
2556 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p7 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
1905 - - - - - -
2248 - - - - - -
2311 - - - - - -
2668 ZIP - ZIP ZIP - ZIP
2970 ZIP ZIP ZIP ZIP ZIP ZIP
3437 ZIP ZIP ZIP ZIP ZIP ZIP
3934 ZIP ZIP ZIP ZIP ZIP ZIP
4445 ZIP ZIP ZIP ZIP ZIP ZIP
5034 ZIP ZIP ZIP ZIP ZIP ZIP
5392 ZIP ZIP ZIP ZIP ZIP ZIP
5811 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p8 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
87 ZIP ZIP ZIP ZIP ZIP ZIP
200 ZIP ZIP ZIP ZIP ZIP ZIP
570 ZIP ZIP ZIP ZIP ZIP ZIP
1003 ZIP ZIP ZIP ZIP ZIP ZIP
1437 ZIP ZIP ZIP ZIP ZIP ZIP
1810 ZIP ZIP ZIP ZIP ZIP ZIP
2208 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p9 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
106 ZIP ZIP ZIP ZIP ZIP ZIP
227 ZIP ZIP ZIP ZIP ZIP ZIP
813 ZIP ZIP ZIP ZIP ZIP ZIP
1193 ZIP ZIP ZIP ZIP - ZIP
1815 ZIP ZIP ZIP ZIP ZIP ZIP
2214 ZIP ZIP ZIP ZIP ZIP ZIP
2608 ZIP ZIP ZIP ZIP - ZIP
2955 ZIP ZIP ZIP ZIP ZIP ZIP
Patient p10 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
33 ZIP ZIP ZIP ZIP ZIP ZIP
66 ZIP ZIP ZIP ZIP - ZIP
68 ZIP ZIP ZIP ZIP - -
374 - - - - - -
530 - - - - - -
912 - - - ZIP - -
1004 ZIP ZIP ZIP ZIP - ZIP
2229 ZIP ZIP ZIP - - -
2256 - - - - - -
Patient p11 (columns F1 to F6 are PCR amplicons)
Sample [days since infection] F1 F2 F3 F4 F5 F6
209 ZIP ZIP ZIP ZIP ZIP ZIP
332 ZIP ZIP ZIP ZIP ZIP ZIP
572 ZIP ZIP ZIP ZIP ZIP ZIP
1026 ZIP ZIP ZIP ZIP ZIP ZIP
1396 ZIP ZIP ZIP ZIP ZIP ZIP
1750 ZIP ZIP ZIP ZIP - ZIP
2043 ZIP ZIP ZIP ZIP ZIP ZIP
Notes:
  • The tables above list one patient each, in the same order as the Clinical and summary data table (p1, p2, p3, p5, p6, p7, p8, p9, p10, p11).
  • A dash indicates a fragment that we were unable to sequence.
  • Each zip archive contains a single-column text file. The format of this file is specified in the first lines (header).
  • Once unzipped and reshaped, the pair counts matrices have dimension 6x6xLxL, where 6 is the length of the nucleotide alphabet, which is ACGT-N, and L is the length of the fragment.
  • Each file is around 10 MB.
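To make the 6x6xLxL shape concrete, the sketch below computes where entry (a, b, i, j) would sit in a flat, single-column list. It assumes row-major ordering (last index varying fastest), which is only an assumption: the header of each file is the authoritative description of the layout. The function names and the toy length are hypothetical.

```python
N_SYMBOLS = 6  # length of the alphabet A, C, G, T, -, N


def expected_size(length):
    """Number of entries in a 6 x 6 x L x L pair-count array."""
    return N_SYMBOLS * N_SYMBOLS * length * length


def flat_index(a, b, i, j, length):
    """Flat index of entry [a][b][i][j], ASSUMING row-major ordering.

    a, b: symbol indices (0 to 5) at the two positions
    i, j: positions within the fragment (0 to length - 1)
    """
    return ((a * N_SYMBOLS + b) * length + i) * length + j


# Toy example with a fragment of length 3 and placeholder values.
length = 3
flat = list(range(expected_size(length)))
value = flat[flat_index(1, 2, 0, 1, length)]
```

Comparing the number of values in an unzipped file against expected_size(L) is a quick sanity check before reshaping.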
API

All data shown on this website can be accessed via a RESTful API for easy web-compatible manipulation. The base URL is

/api/data/

To this root you can append parameters of interest:

Data                             URL                                 Example
Trees                            tree/<patientname>/<region>         tree/p1/V3
Viral load                       viralLoad/<patientname>             viralLoad/p1
CD4+ cell count                  cellCount/<patientname>             cellCount/p1
Number of template molecules     numberTemplates/<patientname>       numberTemplates/p1
Reference sequences              referenceSequence/<patientname>     referenceSequence/p1
Divergence/diversity             divdiv/<patientname>/<region>       divdiv/p1/V3
Single nucleotide polymorphisms  snp/<patientname>/<region>          snp/p1/genomewide
Haplotypes                       haplotypes/<patientname>/<region>   haplotypes/p1/V3
Notes:
  • The API returns JSON only. Any "Accept" HTTP header is ignored.
  • The API accepts HTTP GET requests only. No data upload/edit is possible.
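A minimal sketch of querying the API from Python, using only the standard library, might look like this. The host name is a hypothetical placeholder, since this page gives only the relative root /api/data/, and the structure of the returned JSON is not assumed here. The helper names build_url and fetch_json are illustrative.

```python
import json
from urllib.request import urlopen

API_ROOT = "/api/data/"  # base URL given on this page


def build_url(host, *parts):
    """Join the host, the API root, and the path parts into one URL."""
    return host.rstrip("/") + API_ROOT + "/".join(parts)


def fetch_json(url, timeout=30):
    """Send an HTTP GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


HOST = "https://example.org"  # hypothetical placeholder, replace with the real host
url = build_url(HOST, "viralLoad", "p1")
# data = fetch_json(url)  # uncomment to query the live server
```

The same helper covers the two-parameter endpoints, for example build_url(HOST, "tree", "p1", "V3").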