Gossypium hirsutum CottonGen RefTrans V1

Analysis Name: Gossypium hirsutum CottonGen RefTrans V1
Method: reftrans (1.0)
Date performed: 2016-10-25

Materials & Methods

CottonGen Gossypium hirsutum RefTrans V1 combines peer-reviewed published RNA-Seq and EST data sets to create a reference transcriptome (RefTrans, 93,449 sequences) for Gossypium hirsutum and provides putative gene function identified by homology to known proteins.

In Gossypium hirsutum RefTrans V1, 2.76 billion RNA-Seq reads from publicly available peer-reviewed G. hirsutum RNA-Seq data sets (Ma et al. 2016 [SRP049434], Bedre et al. 2015 [SRP055046], Li et al. 2015 [SRP055709], Naoumkina et al. 2015 [SRP052863], Wang et al. 2015 [SRP010021], Fang et al. 2014 [SRP047139], Naoumkina et al. 2014 [SRP026301], Xiao et al. 2014 [SRP033354], Yoo and Wendel 2014 [SRP017061], Zhang et al. 2014 [SRP012138], Zhang et al. 2014 [SRP041153], Yang et al. 2014 [SRP042128], Bowman et al. 2013 [SRP026618], Chen et al. 2013 [SRP030288], Jiao et al. 2013 [SRP026000], Sun et al. 2013 [SRP011398]) and 337,811 ESTs were downloaded from the NCBI Short Read Archive database and the NCBI dbEST database, respectively. The RNA-Seq reads and ESTs were assembled using the Mainlab RefTrans pipeline (manuscript in preparation; details of the pipeline are available on request ahead of publication). The RefTrans sequences were functionally characterized by pairwise comparison using the BLASTX algorithm against the Swiss-Prot (UniProtKB/Swiss-Prot Release 2015_10) and TrEMBL (UniProtKB/TrEMBL Release 2015_10) protein databases. Information on the top 25 matches with an expectation (E) value of ≤ 1E-06 was recorded and stored in CottonGen together with the RefTrans sequences. InterPro domains and Gene Ontology assignments were made to Gossypium hirsutum RefTrans V1 using InterProScan at the EBI through Blast2GO. The transcriptome and associated annotation are available to download, to search by name, keyword (functional description), or mapped location, and to view on the genome through JBrowse.




Additional information about this analysis:
Property Name: Value
JBrowse URL: https://www.cottongen.org/jbrowse/index.html?data=data%2FGh_Tx_JGIv1.1&loc=
Analysis Type: reftrans


RefTrans in FASTA format (93,449 sequences): Gossypium hirsutum RefTrans v1 FASTA format
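As a quick sanity check after downloading, the number of records in the FASTA file can be compared with the 93,449 sequences stated above. The following is a minimal sketch that counts header lines; the sequence identifiers in the example are made up for illustration and do not come from the RefTrans file.

```python
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count sequences by counting header lines, which start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


# Tiny in-memory example; these identifiers are hypothetical.
example = [
    ">example_seq_1",
    "ATGGCTAGCTAG",
    "GGCTA",
    ">example_seq_2",
    "ATGCCC",
]
print(count_fasta_records(example))
```

For a downloaded file, the same function can be applied to an open file handle, for example `count_fasta_records(open(path))`, where `path` is wherever the FASTA file was saved locally.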


Homology Analysis 

Homology was determined by comparing Gossypium hirsutum RefTrans V1 against the Swiss-Prot (UniProtKB/Swiss-Prot Release 2015_10) and TrEMBL (UniProtKB/TrEMBL Release 2015_10) protein databases using the BLASTX algorithm with an E-value cutoff of 1E-06. Only the best match was kept.
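The filtering step described above can be sketched as follows. This is an illustration only, not the pipeline's actual code: it assumes the BLASTX results are available in the standard 12-column tabular format of BLAST+ (`-outfmt 6`), which the text does not state, and it assumes hits are ranked by E-value and then by bit score. The function name `top_hits` and the example identifiers are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

E_VALUE_CUTOFF = 1e-6  # cutoff stated in the text

Hit = Tuple[str, float, float]  # (subject id, E-value, bit score)


def top_hits(
    lines: Iterable[str],
    top_n: int = 1,
    max_evalue: float = E_VALUE_CUTOFF,
) -> Dict[str, List[Hit]]:
    """Keep up to top_n hits per query with E-value <= max_evalue.

    Assumes tab-separated columns in BLAST+ outfmt 6 order:
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    hits: Dict[str, List[Hit]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue <= max_evalue:
            hits[query].append((subject, evalue, bitscore))
    # Assumed ranking: lowest E-value first, ties broken by highest bit score.
    return {
        query: sorted(found, key=lambda h: (h[1], -h[2]))[:top_n]
        for query, found in hits.items()
    }


# Made-up rows for illustration only.
rows = [
    "tx1\tP1\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-30\t200",
    "tx1\tP2\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-10\t90",
    "tx1\tP3\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30",
    "tx2\tP4\t35.0\t40\t26\t0\t1\t120\t1\t40\t1e-3\t35",
]
best = top_hits(rows)
print(best)
```

With the default `top_n=1` the function keeps only the best match per sequence; calling it with `top_n=25` would instead retain up to 25 matches per sequence. Queries with no hit at or below the cutoff are absent from the result, which corresponds to the "without homologies" set.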


BLAST of RefTrans to Swiss-Prot, Excel format (71% of RefTrans with homologies): Gossypium hirsutum RefTrans v1 vs Swissprot
RefTrans with homologies, FASTA format: Gossypium hirsutum RefTrans v1 vs Swissprot_hit
RefTrans without homologies, FASTA format: Gossypium hirsutum RefTrans v1 vs Swissprot_noHit


BLAST of RefTrans to TrEMBL, Excel format (97% of RefTrans with homologies): Gossypium hirsutum RefTrans v1 vs TrEMBL
RefTrans with homologies, FASTA format: Gossypium hirsutum RefTrans v1 vs TrEMBL_hit
RefTrans without homologies, FASTA format: Gossypium hirsutum RefTrans v1 vs TrEMBL_noHit


InterProScan Analysis

InterPro domains and Gene Ontology assignments were made to Gossypium hirsutum RefTrans V1 using InterProScan at the EBI through Blast2GO.
Gene Ontology annotations by RefTrans, Excel format: Gossypium hirsutum RefTrans V1 Gene Ontology annotations
InterPro annotations by RefTrans, Excel format: Gossypium hirsutum RefTrans V1 InterPro annotations



The alignment tool 'BLAT' was used to map Gossypium hirsutum RefTrans V1 to the Gossypium hirsutum (AD1) acc. 'TM-1' genome NAU-NBI v1.1 assembly. Alignments with at least 95% alignment length and at least 90% identity were preserved.
BLAT of RefTrans to AD1, Excel format: Gossypium hirsutum RefTrans V1_AD
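The alignment filter described above could be applied to BLAT output roughly as sketched below. Several points are assumptions rather than statements from the source: that the alignments are in BLAT's default PSL format, that "alignment length" means the fraction of the query sequence covered by the alignment, that identity is computed as matching bases over aligned bases, and that both thresholds are minimums. The function names and the example records are hypothetical.

```python
from typing import Iterable, List

MIN_COVERAGE = 0.95  # assumed meaning of "alignment length of 95%"
MIN_IDENTITY = 0.90  # assumed meaning of "90% identity"


def passes_filter(
    psl_line: str,
    min_coverage: float = MIN_COVERAGE,
    min_identity: float = MIN_IDENTITY,
) -> bool:
    """Return True if one PSL record meets both thresholds.

    PSL columns used (0-based): 0 matches, 1 misMatches, 2 repMatches,
    10 qSize, 11 qStart, 12 qEnd.
    """
    f = psl_line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = int(f[0]), int(f[1]), int(f[2])
    q_size, q_start, q_end = int(f[10]), int(f[11]), int(f[12])
    aligned = matches + mismatches + rep_matches
    if aligned == 0 or q_size == 0:
        return False
    coverage = (q_end - q_start) / q_size  # fraction of the query aligned
    identity = (matches + rep_matches) / aligned  # matching / aligned bases
    return coverage >= min_coverage and identity >= min_identity


def filter_psl(lines: Iterable[str]) -> List[str]:
    """Keep passing records; PSL header lines do not start with a digit."""
    return [ln for ln in lines if ln[:1].isdigit() and passes_filter(ln)]


def example_record(matches: int, mismatches: int, q_end: int) -> str:
    """Build a made-up 21-column PSL record for a 1000-base query."""
    fields = [
        str(matches), str(mismatches), "0", "0", "0", "0", "0", "0", "+",
        "example_tx", "1000", "0", str(q_end),
        "example_chr", "5000000", "100", str(100 + q_end),
        "1", f"{q_end},", "0,", "100,",
    ]
    return "\t".join(fields)


good = example_record(970, 30, 1000)        # full length, 97% identity
short = example_record(500, 0, 500)         # only half of the query aligned
divergent = example_record(850, 150, 1000)  # full length, 85% identity
kept = filter_psl(["psLayout version 3", good, short, divergent])
print(len(kept))
```

Other definitions of coverage and identity are in common use (for example, counting gaps or using block sizes), so the two formulas here should be adjusted to match whatever definition the pipeline actually applies.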