|Analysis Name||Gossypium hirsutum (AD1) Genome NAU-NBI Assembly v1.1 & Annotation v1.1 |
|Method||SOAPdenovo |
|Source||Illumina HiSeq 2000 reads from various insert size libraries (NAU-NBI) |
|Date performed||2015-04-20 |
About the assembly
An allohaploid plant was derived from the allotetraploid cotton (TM-1) and used for genome sequencing. A total of 612 Gb (245× genome equivalent) of high-quality Illumina reads were produced and assembled using SOAPdenovo. The resulting contigs and scaffolds were integrated using 174,454 pairs of Sanger-sequenced BAC-end sequences comprising 116.5 Mb, and assembled into the TM-1 genome sequence (V1.0). To correct misassemblies, classify the homoeologous segments and order the scaffolds, an ultradense genetic map was developed using genotyping by sequencing of 59 F2 individuals derived from TM-1 and G. barbadense cv. Hai7124. The map consisted of 4,999,048 single-nucleotide polymorphism (SNP) loci and 4,049 recombination bins spanning 4,042 cM in 26 linkage groups. Using the map, 218 misassembled scaffolds (442.2 Mb, or 17.6%, of the genome sequence) were corrected in assembly V1.0; most of these misassemblies were caused by ambiguous homoeologous sequences. The final assembly (V1.1) comprised 265,279 contigs (N50 = 34.0 kb) and 40,407 scaffolds (N50 = 1.6 Mb). The total scaffold length (2.4 Gb) spanned ~96% of the estimated allotetraploid genome (2.5 Gb), of which 6,146 scaffolds (2.3 Gb) were aligned and organized into 26 pseudochromosomes, including 1.5 Gb (4,635 scaffolds) in the A subgenome and 0.8 Gb (1,511 scaffolds) in the D subgenome. Furthermore, 1.9 Gb (79.2%) was oriented based on linkage maps.
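The summary statistics quoted above (fold coverage, N50) follow standard definitions, and a minimal Python sketch may help readers recompute them from their own length lists. The function names `fold_coverage` and `n50` are illustrative and not part of any CottonGen or SOAPdenovo tooling; the only figures taken from this page are the 612 Gb of reads and the 2.5 Gb genome size estimate.

```python
def fold_coverage(read_bases, genome_size):
    """Sequencing depth: total sequenced bases divided by genome size (same units)."""
    return read_bases / genome_size


def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50() needs at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# 612 Gb of reads over an estimated 2.5 Gb genome, as reported on this page.
depth = fold_coverage(612, 2.5)

# Toy scaffold lengths (hypothetical, for illustration only).
toy_n50 = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

With the page's numbers, `fold_coverage(612, 2.5)` comes to about 245, which matches the reported 245× genome equivalent. Real contig or scaffold lengths would normally be read from the assembly FASTA file rather than typed in by hand.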
|Oriented scaffold number
|Oriented scaffold size
|Total gene length
Zhang et al., Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nature Biotechnology. 33, 531–537. 2015
The chromosomes (pseudomolecules) and scaffolds for Gossypium hirsutum (AD1) Genome NAU-NBI Assembly v1.1
|Assembly pseudomolecules (FASTA format)
Marker alignments were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map marker sequences from CottonGen to the G. hirsutum genome assembly. Markers were required to align with 90% identity over 97% of their length. For SSRs and RFLPs, gap size was restricted to 1000 bp or less, with fewer than 2 gaps. For dbSNPs and indels, gap size was restricted to 2 bp, with fewer than 2 gaps. The available files are in GFF3 format. Markers available in CottonGen and CMap are linked to JBrowse.
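The marker criteria above can be read as a filter over BLAT output. The following Python sketch shows one possible interpretation, assuming the alignments are in BLAT's tab-separated PSL format. The page does not say how identity, coverage or gaps were computed, so the definitions here are assumptions: identity as matching bases over aligned bases, coverage as the aligned span of the marker over its length, and gaps as insertions on the genome side. The names `PslHit`, `parse_psl_line`, `passes_marker_filter` and `MAX_GAP_BP` are hypothetical.

```python
from dataclasses import dataclass

# Maximum gap size in bp per marker type, taken from the criteria on this page.
MAX_GAP_BP = {"SSR": 1000, "RFLP": 1000, "SNP": 2, "INDEL": 2}


@dataclass
class PslHit:
    matches: int
    mismatches: int
    rep_matches: int
    t_num_insert: int   # number of gaps on the target (genome) side
    t_base_insert: int  # total bases in those gaps
    q_size: int
    q_start: int
    q_end: int


def parse_psl_line(line):
    """Parse one headerless PSL record into the fields this filter needs."""
    f = line.rstrip("\n").split("\t")
    return PslHit(
        matches=int(f[0]), mismatches=int(f[1]), rep_matches=int(f[2]),
        t_num_insert=int(f[6]), t_base_insert=int(f[7]),
        q_size=int(f[10]), q_start=int(f[11]), q_end=int(f[12]),
    )


def passes_marker_filter(hit, marker_type, min_identity=0.90, min_coverage=0.97):
    """Apply the identity, coverage and gap thresholds (assumed definitions)."""
    aligned = hit.matches + hit.mismatches + hit.rep_matches
    if aligned == 0 or hit.q_size == 0:
        return False
    identity = (hit.matches + hit.rep_matches) / aligned
    coverage = (hit.q_end - hit.q_start) / hit.q_size
    # Fewer than 2 gaps means at most one, so the total inserted
    # bases equal the size of that single gap.
    return (
        identity >= min_identity
        and coverage >= min_coverage
        and hit.t_num_insert < 2
        and hit.t_base_insert <= MAX_GAP_BP[marker_type]
    )
```

A record would be checked with `passes_marker_filter(parse_psl_line(line), "SSR")`. The same alignment can pass as an SSR and fail as a SNP, because only the allowed gap size differs between marker types.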
Transcript alignments were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the G. hirsutum genome assembly. Alignments covering 97% of the transcript length with 90% identity were retained. The available files are in GFF3 format.
Protein alignments available below were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool 'exonerate' was used to map protein sequences onto the G. hirsutum NBI v1.1 genome. Only alignments with a percent identity of at least 90% were retained.
Protein homology was performed by the CottonGen Team of Main Bioinformatics Lab at WSU. Proteins from the G. hirsutum assembly were mapped against proteins from other genomes and databases using blastp with an e-value cutoff of 1e-6. Only the best match was kept. The available files are in Excel 2007 format.
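The best-match selection described above can be sketched in Python. This is only an illustration under stated assumptions: it expects blastp results in the common 12-column tabular layout (BLAST+ `-outfmt 6`, with e-value in column 11 and bit score in column 12), which this page does not confirm, and it takes "best" to mean lowest e-value with ties broken by highest bit score. The function name `best_hits` is hypothetical.

```python
def best_hits(lines, max_evalue=1e-6):
    """Return {query_id: subject_id} keeping one best hit per query.

    Hits with an e-value above max_evalue are discarded. Among the rest,
    the hit with the lowest e-value wins; ties go to the higher bit score.
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        rank = (evalue, -bitscore)  # smaller tuple means better hit
        if query not in best or rank < best[query][0]:
            best[query] = (rank, subject)
    return {query: subject for query, (_, subject) in best.items()}
```

Passing an open file handle works, for example `best_hits(open("results.tsv"))`, where the file name is a placeholder. Queries with no hit at or below the cutoff are simply absent from the returned dictionary.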
||Tianzhen Zhang, Yan Hu, Wenkai Jiang, Lei Fang, Xueying Guan, Jiedan Chen, Jinbo Zhang, Christopher A Saski, Brian E Scheffler, David M Stelly, Amanda M Hulse-Kemp, Qun Wan, Bingliang Liu, Chunxiao Liu, Sen Wang, Mengqiao Pan, Yangkun Wang, Dawei Wang, Wenxue Ye, Lijing Chang, Wenpan Zhang, Qingxin Song, Ryan C Kirkbride, Xiaoya Chen, Elizabeth Dennis, Danny J Llewellyn, Daniel G Peterson, Peggy Thaxton, Don C Jones, Qiong Wang, Xiaoyang Xu, Hua Zhang, Huaitong Wu, Lei Zhou, Gaofu Mei, Shuqi Chen, Yue Tian, Dan Xiang, Xinghe Li, Jian Ding, Qiyang Zuo, Linna Tao, Yunchao Liu, Ji Li, Yu Lin, Yuanyuan Hui, Zhisheng Cao, Caiping Cai, Xiefei Zhu, Zhi Jiang, Baoliang Zhou, Wangzhen Guo, Ruiqiang Li & Z Jeffrey Chen
||Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement
Functional annotation for Gossypium hirsutum (AD1) Genome NBI Assembly v1.1 (Performed by NBI)
Functional annotation for Gossypium hirsutum (AD1) Genome NBI Assembly v1.1 (Performed by the CottonGen Team of the Main Bioinformatics Lab at WSU.)
All assembly and annotation files are available for download by selecting the desired data type in the left-hand "Resources" sidebar. Each data type page provides a description of the available files and links to download them. Alternatively, you can browse all available files on the FTP repository.