Class
Article
College
College of Agriculture and Applied Sciences
Department
Plants, Soils, and Climate Department
Faculty Mentor
Rakesh Kaundal
Presentation Type
Poster Presentation
Abstract
With the advent of Next-Generation Sequencing (NGS) technologies, vast amounts of data are being generated every day; however, streamlined analysis remains a major hurdle to using the technology efficiently. A large number of algorithms, statistical methods, and software tools have been developed in recent years to perform the individual analysis steps of various NGS applications. As a result, the data analysis procedures for some NGS applications are very complex, requiring several programs to be downloaded for their various processing steps. There is significant room for the development of scalable computing environments that link the individual software components into automated workflows to conduct complex genome-wide analyses efficiently and reproducibly. We have developed a Python package (pySeqRNA) that is capable of running NGS data analysis from start to finish reproducibly and efficiently. The package provides a uniform workflow interface and supports running on a High-Performance Computing Cluster (HPCC) as well as on local computers. It is an extensible pipeline for performing end-to-end analysis with automated report generation for various NGS applications such as RNA-Seq, single-cell RNA-Seq, and dual RNA-Seq. To simplify the analysis of these applications, the package provides pre-configured analysis and report templates. The pySeqRNA workflow consists of quality checking and pre-processing of raw sequence reads; accurate mapping of millions of short sequencing reads to a reference genome, including the identification of splicing events; quantification of the expression levels of genes, transcripts, and exons in two ways, using (i) uniquely mapped reads and (ii) multi-mapped groups; differential analysis of gene expression among different biological conditions; and biological interpretation of differentially expressed genes, including functional enrichment analysis. This package accelerates the retrieval of reproducible results from NGS experiments.
By integrating several command-line tools and custom Python scripts, it allows existing software and tools to be used effectively alongside newly written Python scripts, without restricting users to a collection of pre-defined methods and environments. pySeqRNA is freely available at http://bioinfo.usu.edu/pySeqRNA/.
Presentation Time: Wednesday, 1-2 p.m.
Zoom link: https://usu-edu.zoom.us/j/87892002075?pwd=Ym1Tcy9NOVhaaGZWczZWY1JCL3owUT09
Location
Logan, UT
Start Date
4-11-2021 12:00 AM
Included in
Climate Commons, Plant Sciences Commons, Soil Science Commons
pySeqRNA: An Automated Python Package for Advanced RNA Sequencing Data Analysis and Annotation