We are interested in using transcriptome data, generated with next-generation sequencing technology, to investigate the evolutionary trends of specific genes and their associated expression in mayflies. We generated an additional transcriptome for mayflies. RNA was extracted from a freshly frozen specimen preserved in RNAlater® (Ambion) using TRIzol® Reagent (Ambion), and cDNA libraries were prepared from mRNA. RNA-seq data were generated using a paired-end protocol (PE100) on an Illumina HiSeq2000, with an expected yield of 60 million reads. To investigate this large number of sequences effectively, we created a bioinformatics workflow that analyzes the newly generated transcriptome data along with previous data for mayflies. The workflow consists of three main steps: assembling the transcripts, identifying candidate coding regions, and searching biological sequence databases for homologous sequences.
Because no reference genome is available for mayflies, a de novo transcriptome assembly will be incorporated into the workflow. The workflow will automate the comparison of sequences against known insect reference genomes and the NCBI protein database. Ultimately, the workflow will mine the transcriptome data for genes involved in evolutionary character trends, such as opsins and genes associated with gills and wings.
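
As an illustration of the second workflow step (identifying candidate coding regions), the following is a minimal Python sketch that scans an assembled transcript in all six reading frames and reports the longest open reading frame running from an ATG to an in-frame stop codon. The function names `reverse_complement` and `longest_orf` and the `min_codons` threshold are hypothetical and chosen only for this example. This is a simplified stand-in and not the tool used in the workflow, which would normally rely on a dedicated coding-region predictor.

```python
from typing import Optional, Tuple

# Translation table for complementing DNA bases (hypothetical helper for this sketch).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq: str, min_codons: int = 3) -> Optional[Tuple[str, int, str]]:
    """Return (strand, frame, orf) for the longest ATG-to-stop ORF, or None.

    The frame is the 0-based offset within the scanned strand. The ORF string
    includes the stop codon. ORFs lacking an in-frame stop are ignored, which
    is a simplifying assumption; partial ORFs are common in real assemblies.
    """
    best = None
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            # Walk the strand codon by codon in this reading frame.
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    orf = s[start:i + 3]
                    long_enough = len(orf) // 3 >= min_codons
                    if long_enough and (best is None or len(orf) > len(best[2])):
                        best = (strand, frame, orf)
                    start = None
    return best


if __name__ == "__main__":
    # Toy transcript, not real mayfly data.
    print(longest_orf("CCATGAAATTTGGGTAACC"))
```

In a full workflow, the reported ORF would be translated to protein and passed to the homology search step; only complete ORFs are kept here to keep the sketch short.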