Gossypium barbadense (AD2) 'AZB375' genome HAU_v1

Overview
Analysis Name: Gossypium barbadense (AD2) 'AZB375' genome HAU_v1
Method: HiFi, HiFiasm, BUSCO, LTR Assembly Index
Source: Y2010.zip
Date performed: 2024-03-08

We selected 12 representative G. barbadense accessions from around the world, including 7 primitively domesticated accessions from South America, 2 Sea Island landrace accessions from the Caribbean region, and 3 cultivated accessions. An average of 63.1 Gb of high-fidelity (HiFi) reads was generated per accession. The reads were initially assembled with HiFiasm into individual genomes ranging from 2.21 to 2.25 Gb in size (Table 1), with contig N50 values ranging from 55.0 Mb to 102.7 Mb (average = 70.4 Mb), longer than that of the previously published G. barbadense genome. On average, 98.40% (range 97.48% to 98.93%) of the assembled contigs were further anchored and ordered into 26 pseudo-chromosomes based on the 3-79 reference genome. Assembly completeness was high for all assemblies, each of which contained more than 99.5% complete BUSCOs. The LTR Assembly Index (LAI) scores ranged from 13.78 to 15.18 per genome, a range considered reference quality (Table 1).

Table 1. Statistics of the genomic assembly and annotation of 12 G. barbadense accessions

Sample_ID                  Y2003     Y2005     Y2010     Y2013     Y2016     Y2029     Y2031     Y2032     Y2033     Y2034     Y2036     Y3048
Germplasm Name             GB249     AZB51     AZB375    Yuma      AZK101    Giza 7    AZB339    AZB634    GB660     GB776     Junhai-1  CEG
Total length (Mb)          2,240     2,299     2,237     2,244     2,213     2,237     2,247     2,252     2,235     2,269     2,267     2,251
Anchored and oriented (%)  99.5      99.5      99.6      99.6      99.6      99.6      99.5      99.6      99.5      99.5      99.6      99.6
Contig N50 (Mb)            64.72     67.25     77.88     102.72    65.91     56.54     61.99     102.73    56.31     61.89     68.79     64.62
GC content (%)             34.42     34.56     34.37     34.39     34.18     34.36     34.43     34.54     34.41     34.63     34.55     34.44
BUSCO (%)                  99.5      99.5      99.6      99.6      99.6      99.6      99.5      99.6      99.5      99.5      99.6      99.6
LTR Assembly Index (LAI)   14.51     14.69     15.01     14.46     14.83     14.17     14.18     14.54     14.08     13.78     15.18     15.11
Repeat content (%)         72.63     69.88     69.61     72.63     68.92     72.62     72.69     71.46     70.97     72.16     72.89     72.51
Number of genes            71,549    74,094    71,963    74,046    71,312    76,012    72,247    72,870    72,281    74,527    75,755    75,755
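
The contig N50 values reported in Table 1 follow the usual definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The Python sketch below illustrates that definition only. The function name contig_n50 and the toy lengths are hypothetical and are not taken from the AZB375 assembly or from the pipeline used to produce it.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contig lengths supplied")


# Toy lengths (hypothetical, not from any assembly on this page).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(contig_n50(toy_lengths))  # prints 70
```

To apply this to a real assembly, the lengths would be taken from the sequence records of the contig-level FASTA file rather than from a hard-coded list.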
Assembly

The chromosomes (pseudomolecules) for the Gossypium barbadense 'AZB375' genome are available for download below. These files belong to the Gossypium barbadense (AD2) 'AZB375' genome HAU_v1.

Chromosomes (FASTA format) G.barbadense_HAU-12GB-AZB375.genome.fasta
Functional Analysis

Functional annotation files for the Gossypium barbadense AZB375 Genome v1.0 are available for download below. The Gossypium barbadense AZB375 Genome v1.0 proteins were analyzed using InterProScan to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).

Downloads

GO assignments from InterProScan AD2_Y2010_v1_genes2GO.xlsx.gz
IPR assignments from InterProScan AD2_Y2010_v1_genes2IPR.xlsx.gz
Proteins mapped to KEGG Orthologs AD2_Y2010_v1_KEGG-orthologis.xlsx.gz
Proteins mapped to KEGG Pathways AD2_Y2010_v1_KEGG-pathways.xlsx.gz
Genes

The predicted gene models, their alignments, and proteins for the Gossypium barbadense 'AZB375' genome are available for download below. These files belong to the Gossypium barbadense (AD2) 'AZB375' genome HAU_v1.

Predicted gene models with exons (GFF3 format) G.barbadense_HAU_AZB375.gene.gff.gz
Coding sequences, CDS (FASTA format) G.barbadense_HAU_AZB375.gene.cds.fa.gz
Protein sequences (FASTA format) G.barbadense_HAU_AZB375.gene.pep.fa.gz
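
A quick sanity check on the gene model download is to tally feature types in the GFF3 file and compare the gene count with the number reported for this accession in Table 1. The Python sketch below relies only on the standard GFF3 layout of nine tab-separated columns, with the feature type in column 3. The function name count_feature_types and the sample records are hypothetical, and the feature type names actually used in the AZB375 annotation are an assumption.

```python
import io


def count_feature_types(handle):
    """Count GFF3 records per feature type (column 3)."""
    counts = {}
    for line in handle:
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        feature_type = fields[2]
        counts[feature_type] = counts.get(feature_type, 0) + 1
    return counts


# Hypothetical records for illustration only.
sample_gff = (
    "##gff-version 3\n"
    "chr1\tsrc\tgene\t1\t900\t.\t+\t.\tID=g1\n"
    "chr1\tsrc\tmRNA\t1\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n"
    "chr1\tsrc\texon\t1\t400\t.\t+\t.\tParent=g1.t1\n"
    "chr1\tsrc\texon\t500\t900\t.\t+\t.\tParent=g1.t1\n"
    "chr1\tsrc\tgene\t2000\t2500\t.\t-\t.\tID=g2\n"
)
feature_counts = count_feature_types(io.StringIO(sample_gff))
print(feature_counts["gene"])  # prints 2
```

For the compressed download, the same function can read from a handle opened with gzip.open(path, "rt") from the Python standard library.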
Homology

Homology of the Gossypium barbadense AZB375 genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value (E-value) cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11, 2022-09), UniProtKB/SwissProt (Release 2023-07), and UniProtKB/TrEMBL (Release 2023-07) databases. The best hit reports are available for download in Excel format.
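
As an illustration of how best-hit reports of this kind can be derived, the Python sketch below keeps, for each query protein, the single hit with the lowest E-value under the 1e-6 cutoff. It assumes the standard 12-column tabular BLAST output (E-value in column 11, bit score in column 12), which may differ from the format used to build the downloads on this page. The function name best_hits and the sample identifiers are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # cutoff stated in the text above


def best_hits(handle, cutoff=EVALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query.

    Assumes 12-column tab-separated BLAST output; this is an assumption,
    not a description of the files offered for download.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue  # keep only hits with E-value below the cutoff
        current = best.get(query)
        # Prefer the lower E-value; break ties with the higher bit score.
        if current is None or (evalue, -bitscore) < (current[1], -current[2]):
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits for illustration only.
sample_hits = (
    "geneA\tAT1G01010.1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t180\n"
    "geneA\tAT2G02020.1\t60.0\t150\t60\t0\t1\t150\t1\t150\t1e-10\t70\n"
    "geneB\tAT3G03030.1\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25\n"
)
hits = best_hits(io.StringIO(sample_hits))
print(hits["geneA"][0])  # prints AT1G01010.1
print("geneB" in hits)   # prints False
```

Queries that end up with no entry in the returned dictionary correspond to the proteins listed in the "without" FASTA files, under the same assumptions.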

Protein Homologs

G. barbadense AZB375 Genome v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) AD2_Y2010_v1_vs_tair.xlsx.gz
G. barbadense AZB375 Genome v1.0 proteins with Arabidopsis (Araport11) (FASTA file) AD2_Y2010_v1_vs_tair_hit.fasta.gz
G. barbadense AZB375 Genome v1.0 proteins without Arabidopsis (Araport11) (FASTA file) AD2_Y2010_v1_vs_tair_noHit.fasta.gz
G. barbadense AZB375 Genome v1.0 proteins with SwissProt homologs (EXCEL file) AD2_Y2010_v1_vs_swissprot.xlsx.gz
G. barbadense AZB375 Genome v1.0 proteins with SwissProt (FASTA file) AD2_Y2010_v1_vs_swissprot_hit.fasta.gz
G. barbadense AZB375 Genome v1.0 proteins without SwissProt (FASTA file) AD2_Y2010_v1_vs_swissprot_noHit.fasta.gz
G. barbadense AZB375 Genome v1.0 proteins with TrEMBL homologs (EXCEL file) AD2_Y2010_v1_vs_trembl.xlsx.gz
G. barbadense AZB375 Genome v1.0 proteins with TrEMBL (FASTA file) AD2_Y2010_v1_vs_trembl_hit.fasta.gz
G. barbadense AZB375 Genome v1.0 proteins without TrEMBL (FASTA file) AD2_Y2010_v1_vs_trembl_noHit.fasta.gz