Gossypium barbadense (AD2) Genome HAU-NBI Assembly v1.0 & Annotation v1.0
About the assembly
Here, we report the genome sequence of the superior fibre quality tetraploid cotton, G. barbadense acc. 3-79 using a whole-genome shotgun approach with large fragments of DNA Paired-End Tag (DNA-PET) sequencing data. This is a high-quality assembly of the 2.57 gigabase genome of G. barbadense, including 80,876 protein-coding genes. The double-sized genome of the A (or At) (1.50 Gb) against D (or Dt) (853 Mb) primarily resulted from the expansion of Gypsy elements, including Peabody and Retrosat2 subclades in the Del clade, and the Athila subclade in the Athila/Tat clade. Substantial gene expansion and contraction were observed and rich homoeologous gene pairs with biased expression patterns were identified, suggesting abundant gene sub-functionalization occurred by allopolyploidization. More specifically, the CesA gene family has adapted differentially temporal expression patterns, suggesting an integrated regulatory mechanism of CesA genes from At and Dt subgenomes for the primary and secondary cellulose biosynthesis of cotton fibre in a “relay race”-like fashion. It is anticipated that the G. barbadense genome sequence will advance the understanding of the mechanism of genome polyploidization and underpin genome-wide comparison research in this genus.
Yuan et al., The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres. Sci Rep. 2015 Dec 4;5:17662. doi: 10.1038/srep17662.
The chromosomes (pseudomolecules) and scaffolds for Gossypium barbadense (AD2) Genome HAU-NBI Assembly v1.0
The predicted genes and proteins for Gossypium barbadense (AD2) Genome HAU-NBI Assembly v1.0
Marker alignments were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map marker sequences from CottonGen to the G. barbadense genome assembly. Markers were required to align with at least 90% identity over at least 97% of their length. For SSRs and RFLPs, gap size was restricted to 1000 bp or less, with fewer than 2 gaps. For dbSNPs and indels, gap size was restricted to 2 bp, with fewer than 2 gaps. The available files are in GFF3 format. Markers available in CottonGen and CMap are linked to JBrowse.
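The marker thresholds described above can be illustrated with a minimal Python sketch that filters BLAT hits in PSL format. The column positions follow the standard 21-column PSL layout. The identity and coverage formulas, the choice to count gaps on the target (genome) side, and the names PslHit, parse_psl_line and passes_marker_filter are assumptions made for illustration; the exact procedure used by the CottonGen Team is not stated here.

```python
from dataclasses import dataclass


@dataclass
class PslHit:
    matches: int
    mismatches: int
    rep_matches: int
    t_num_insert: int   # number of gaps on the target (genome) side
    t_base_insert: int  # total bases in those gaps
    q_name: str
    q_size: int


def parse_psl_line(line):
    """Parse one tab-separated PSL data line (header lines must be skipped first)."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 21:
        raise ValueError(f"expected 21 PSL columns, got {len(f)}")
    return PslHit(
        matches=int(f[0]),
        mismatches=int(f[1]),
        rep_matches=int(f[2]),
        t_num_insert=int(f[6]),
        t_base_insert=int(f[7]),
        q_name=f[9],
        q_size=int(f[10]),
    )


def passes_marker_filter(hit, max_gap_bp, min_identity=0.90,
                         min_coverage=0.97, max_gaps=1):
    """Apply identity, coverage and gap thresholds (assumed definitions)."""
    aligned = hit.matches + hit.mismatches + hit.rep_matches
    if aligned == 0 or hit.q_size == 0:
        return False
    identity = (hit.matches + hit.rep_matches) / aligned
    coverage = aligned / hit.q_size
    # "Fewer than 2 gaps" means at most one gap, so the total inserted
    # bases equal the size of that single gap.
    return (identity >= min_identity
            and coverage >= min_coverage
            and hit.t_num_insert <= max_gaps
            and hit.t_base_insert <= max_gap_bp)


# Hypothetical hit: a 200 bp marker aligned with one 500 bp gap.
example = "\t".join([
    "195", "5", "0", "0", "0", "0", "1", "500", "+",
    "example_marker", "200", "0", "200",
    "example_chr", "1000000", "100", "800",
    "2", "100,100,", "0,100,", "100,700,",
])
hit = parse_psl_line(example)
ssr_ok = passes_marker_filter(hit, max_gap_bp=1000)  # SSR/RFLP rule
snp_ok = passes_marker_filter(hit, max_gap_bp=2)     # dbSNP/indel rule
```

Under these assumptions the same hit would be kept under the SSR/RFLP gap limit and rejected under the dbSNP/indel limit, because only the allowed gap size differs between the two marker classes.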
Transcript alignments were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the G. barbadense genome assembly. Only alignments with an alignment length of 97% and an identity of 90% were retained. The available files are in GFF3 format.
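Because the marker and transcript alignments are distributed as GFF3, a short Python sketch of reading one GFF3 feature line may help when working with the downloaded files. It relies only on the general GFF3 layout (nine tab-separated columns, with percent-encoded key=value attributes in the ninth). The function name parse_gff3_line and the example record are hypothetical and are not taken from the CottonGen files.

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Return a dict for one GFF3 feature line, or None for comments and blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 GFF3 columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols

    attributes = {}
    if attrs != ".":  # "." means no attributes
        for pair in attrs.split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            # Multiple values are comma-separated; reserved characters
            # inside values are percent-encoded.
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]

    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),   # 1-based, inclusive
        "end": int(end),       # 1-based, inclusive
        "score": None if score == "." else float(score),
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }


# Hypothetical record for illustration only.
record = parse_gff3_line(
    "example_chr\tBLAT\tmatch\t1001\t1200\t.\t+\t.\t"
    "ID=example_transcript;Note=first%3Bsecond"
)
```

Keeping attribute values as lists avoids special-casing keys such as Parent, which may legitimately carry more than one value.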
Protein alignments available below were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool 'exonerate' was used to map protein sequences onto the G. barbadense NBI v1.0 genome. Only alignments with a percent identity of at least 90% were retained.
Protein homology searches were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. Proteins from the G. barbadense assembly were mapped against proteins from other genomes and databases using blastp with an e-value cutoff of 1e-6. Only the best match for each protein was kept. The available files are in Excel 2007 format.
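The best-match selection described above can be sketched in Python for blastp results in the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). This sketch assumes that "best" means the highest bit score among hits at or below the e-value cutoff; the criterion actually used is not stated here. The function name best_hits and the example identifiers are hypothetical.

```python
def best_hits(lines, max_evalue=1e-6):
    """Keep the highest-scoring hit per query among hits passing the e-value cutoff."""
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        current = best.get(query)
        # Assumption: ties keep the first hit encountered.
        if current is None or bitscore > current[2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical tabular hits for illustration only.
example_hits = [
    "gene1\tprotA\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200",
    "gene1\tprotB\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-60\t250",
    "gene2\tprotC\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.01\t30",
]
result = best_hits(example_hits)
```

In this example, gene1 would be assigned protB (the higher bit score), and gene2 would have no match because its only hit exceeds the e-value cutoff.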
Functional annotation for Gossypium barbadense (AD2) Genome NBI Assembly v1.0 (Performed by NBI)
Functional annotation for Gossypium barbadense (AD2) Genome NBI Assembly v1.0 (Performed by the CottonGen Team of the Main Bioinformatics Lab at WSU.)
All assembly and annotation files are available for download by selecting the desired data type in the left-hand "Resources" side bar. Each data type page will provide a description of the available files and links to download. Alternatively, you can browse all available files on the FTP repository.