- Data Overview
- Marker - Over 276,221 genetic markers, including 3,541 RFLPs, 78,340 SSRs, and 183,035 SNPs. Marker information includes sequence, germplasm, DNA library, repeat motif, restriction enzyme, reference, etc.
- Map - 49 map data sets with 34,559 loci, consisting of 44 genetic, 2 association, 1 bin, 1 consensus, and 1 in-silico maps, covering cotton genome groups AD, A, D, and G.
- QTL - 988 QTLs for 200 traits from 7 QTL datasets; another 1,000 QTLs are coming soon.
- Polymorphism - 3,800 marker polymorphisms from 5 different projects; marker types include SSR (2,516), SNP (1,005), and InDel (279).
- Species - 50 Gossypium species with detailed information such as geographic origins, interspecies compatibilities, etc.
- Germplasm and Collection - 16,052 germplasm records, including 97 populations and 15,955 individual entries, identified from nearly 49,000 names or entries obtained from 16 collections. The 16 collections are from the United States, Uzbekistan, and China.
- Trait - A total of 491,824 trait scores from germplasm evaluations: US National Cotton Germplasm Collection (383,935 scores of 8,9764 entries for 50 traits in 39 environments, from before 2006 through 2015), GRIN obsolete (73,296 scores of 6,871 entries for 49 traits), China (22,439 scores of 2,957 entries for 34 traits), and Uzbekistan (22,154 scores of 847 entries for 39 traits).
- Images - 12,269 digital images of 2,015 germplasm, from the USDA-ARS National Cotton Germplasm Characterization Project.
- Sequence - 610,246 sequences, consisting mostly of GenBank sequence records (06/12/2015). The GenBank nucleotide sequence records contribute 9,456 genes, 8,909 gene products, and 181 DNA libraries.
- Reference - 15,155 references from journal articles, conference proceedings, patents, book chapters, and theses.
- Contact - 576 contacts.
- ESTs - CottonGen Gossypium Unigene v1.0 was built from 442,954 Gossypium ESTs downloaded from GenBank (09/16/2012), which were filtered for quality and contamination and assembled with CAP3, resulting in 21,698 contigs and 128,218 singletons.
- Gene - 1,224 Generic Genes from GenBank sequences (07/18/2014) and 305,798 CDS records from the G. raimondii genome projects BGI v1.0 (40,976) and JGI annot v2.1 (77,267), G. arboreum BGI v2.0 (40,134), and G. hirsutum NBI v1.1 (70,478) and BGI v1.0 (76,943).
Gossypium hirsutum (AD1) genome CGP-BGI v1.1 assembly & v1.0 annotation
Gossypium hirsutum (AD1) genome NAU-NBI v1.1 assembly & v1.1 annotation
Gossypium arboreum A2 BGI-CGP assembly v2.0 & annotation v1.0
Gossypium raimondii D genome JGI assembly v2.0 and annotation v2.1
Gossypium raimondii Draft D Genome v1.0 (BGI-CGP) Assembly & Annotation
- CMap - CMap is a graphical interface that enables users to view and compare genetic and physical maps between and among species. Currently there are 49 maps in the cotton CMap.
- CottonCyc - a collection of Pathway/Genome Databases that provides a reference on the genomes and metabolic pathways of sequenced Gossypium genomes. It currently includes Cyc pathways for the JGI v2.0 G. raimondii D5 genome assembly.
- GBrowse - a graphical interface that enables users to view genome annotations. CottonGen GBrowse currently includes G. raimondii genome sequence and annotation data of both versions from BGI (v1.0) and JGI (v2.0 annot v2.1), G. arboreum genome sequence and annotation data of BGI (v2.0), chloroplast sequences of G. hirsutum, G. barbadense, G. arboreum, G. raimondii and Arabidopsis thaliana (TAIR10).
- NCBI BLAST finds regions of local similarity between sequences. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families. Datasets available for searching include Gossypium UniProt and nr proteins, Gossypium dbESTs, Gossypium SSRs, CottonGen dbEST unigene v1, Gossypium EST contigs from Flagel et al., 2012, PlantGDB cotton unigenes, and the CDS and protein sequences from the Chinese BGI D5-genome assembly v1.0, JGI D5-genome assembly v2.0, and BGI-A2 genome assembly v2.0.
- Batch BLAST runs a wrapper for BLAST, parsing the BLAST output into an Excel file and producing a FASTA file of those sequences with database homologs. Users will be notified by email when the job is complete and directed to a website to download the result files. The same datasets are available in Batch BLAST as in the CottonGen NCBI BLAST.
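
The Batch BLAST step of producing a FASTA file of only those query sequences that have database homologs can be approximated locally. The following is a minimal sketch, not CottonGen's actual wrapper: it assumes the BLAST results are in the standard 12-column tabular format (query ID in column 1, e-value in column 11), and the function names and the `max_evalue` cutoff are hypothetical choices made for illustration. The Excel export step is omitted.

```python
import csv
import io


def parse_fasta(text):
    """Return {sequence_id: sequence} from FASTA-formatted text."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            current_id = line[1:].split()[0]
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


def queries_with_hits(blast_tabular, max_evalue=1e-5):
    """Return the set of query IDs with at least one hit at or below max_evalue.

    Assumes 12-column tabular BLAST output; '#' comment lines are skipped.
    """
    hits = set()
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        if float(row[10]) <= max_evalue:  # column 11 holds the e-value
            hits.add(row[0])
    return hits


def fasta_of_hits(fasta_text, blast_tabular, max_evalue=1e-5):
    """Return FASTA text containing only the queries that had a qualifying hit."""
    sequences = parse_fasta(fasta_text)
    hits = queries_with_hits(blast_tabular, max_evalue)
    return "".join(
        f">{seq_id}\n{seq}\n" for seq_id, seq in sequences.items() if seq_id in hits
    )
```

Given the query FASTA text and the tabular results as strings, `fasta_of_hits(fasta_text, blast_tabular)` returns the filtered FASTA. The e-value threshold is an illustrative default and should be adjusted to the stringency a given analysis needs.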