Publications

2015

Zhao H, Ye X, Li QQ. Characterization of plant polyadenylation complexes by using tandem affinity purification. Methods in molecular biology (Clifton, N.J.). 2015;1255:69–78. doi:10.1007/978-1-4939-2175-1_7

Messenger RNA in eukaryotic cells is initially produced as a nascent transcript (pre-mRNA) without a polyadenine [poly(A)] tail at the 3' end. The precise cleavage of the pre-mRNA and the addition of a poly(A) tract require communication between cis-elements in the pre-mRNA sequence and the trans-acting protein factors that recognize them. Based on homology analyses, the Arabidopsis cleavage and polyadenylation specificity factor (AtCPSF) complex should play a critical role in pre-mRNA 3' end processing. Here we describe the isolation of the AtCPSF complex by using a tandem affinity purification (TAP) method. We demonstrate that TAP is an effective approach for isolating protein complexes and is suitable for downstream protein identification by mass spectrometry.

Wu X, Ji G, Li QQ. Computational analysis of plant polyadenylation signals. Methods in molecular biology (Clifton, N.J.). 2015;1255:3–11. doi:10.1007/978-1-4939-2175-1_1

Messenger RNA polyadenylation in eukaryotes marks the end of a transcript, and the process is associated with transcription termination. Increasing evidence reveals the potential for gene expression regulation through alternative polyadenylation. The site of poly(A) addition is defined by poly(A) signals that reside in the transcribed pre-mRNA. To gain further insight into poly(A) signals and their functions in defining alternative polyadenylation sites that lie within different genomic regions, SignalSleuth2 was developed to extract and analyze cis-elements from a dataset with known poly(A) sites. After obtaining the sequences surrounding the poly(A) sites, the program performs an exhaustive search for short sequence motifs of variable sizes within a specified range of the nucleotide sequences, tallies their occurrence frequencies, and ranks the detected motifs accordingly. It also has new functions, including Position-Specific Scoring Matrix (PSSM) score calculation and multiple scanning modes. The program is powerful in revealing underlying sequence motifs surrounding any target regions in a given dataset.
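The exhaustive motif search and frequency ranking described in this abstract can be illustrated with a minimal Python sketch. This is not SignalSleuth2's code: the function name `rank_motifs`, its parameters, and the toy sequences are assumptions made only for illustration, and the window coordinates are plain string offsets rather than positions relative to a poly(A) site.

```python
from collections import Counter


def rank_motifs(sequences, k, start, end):
    """Count every k-mer lying fully inside seq[start:end] and rank by frequency.

    Hypothetical helper for illustration only; not part of SignalSleuth2.
    """
    counts = Counter()
    for seq in sequences:
        window = seq[start:end]
        # Slide a window of width k across the selected region.
        for i in range(len(window) - k + 1):
            counts[window[i:i + k]] += 1
    # most_common() returns (motif, count) pairs, most frequent first.
    return counts.most_common()


# Toy input: two short sequences that share the hexamer AATAAA.
toy_sequences = ["AATAAAGT", "CAATAAAC"]
ranked = rank_motifs(toy_sequences, k=6, start=0, end=8)
print(ranked[0])
```

Running the search for several values of `k` would correspond to the variable motif sizes mentioned in the abstract.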

Cao J, Li QQ. Poly(A) tag library construction from 10 ng total RNA. Methods in molecular biology (Clifton, N.J.). 2015;1255:185–94. doi:10.1007/978-1-4939-2175-1_16

Alternative polyadenylation has been demonstrated as a tier of gene expression regulation in eukaryotes. However, its role has not been elucidated at the cellular level. With techniques to isolate single cells by fluorescence-activated cell sorting (FACS) and laser capture microdissection, analysis of alternative polyadenylation in specific cell types becomes possible. We present a method to generate poly(A) tag libraries for high-throughput sequencing (PAT-seq) from very low amounts of total RNA. This protocol targets the junction of the 3'-UTR and the poly(A) tail of transcripts. Ten nanograms of total RNA isolated from FACS-sorted cells was reverse-transcribed into double-stranded cDNA with an anchored oligo dT(18) primer containing maximal T7 promoter sequence. Then, an RNA amplification step using in vitro transcription by T7 RNA polymerase was carried out. The resulting cRNA was fragmented by partial digestion. First-strand synthesis was carried out by using a partial adaptor sequence with a random 9-nt primer to introduce the adaptor at the 5' end. An anchored oligo dT primer containing the adaptor sequence for the 3' end was introduced through second-strand cDNA synthesis. This new method has been applied to investigate polyadenylation using nanogram amounts of total RNA from Arabidopsis cells.

Wu X, Ji G, Li QQ. Prediction of plant mRNA polyadenylation sites. Methods in molecular biology (Clifton, N.J.). 2015;1255:13–23. doi:10.1007/978-1-4939-2175-1_2

Messenger RNA polyadenylation is one of the essential processing steps during eukaryotic gene expression. The site of polyadenylation [poly(A) site] marks the end of a transcript, which is also the end of a gene in most cases. A computational program that is able to recognize poly(A) sites would be useful not only for finding gene ends during genome annotation, but also for predicting alternative poly(A) sites. PASS [Poly(A) Site Sleuth] and PAC [Poly(A) site Classifier] were developed to predict poly(A) sites in plants. PASS was built on the Generalized Hidden Markov Model (GHMM) and consists of four functional modules: an input module, a poly(A) site recognition module, a graphic process module, and an output module. PAC is a classification model that integrates several features that define poly(A) sites, including K-gram patterns, Z-curve, position-specific scoring matrix, and a first-order inhomogeneous Markov sub-model. PAC can be used to predict poly(A) sites in species whose polyadenylation profile is unknown. PASS and PAC each output a few files, one of which contains the score or probability of being a poly(A) site for each position of a given sequence. Although the models were built mostly on poly(A) profile data from Arabidopsis, they are also functional in other higher plants because their profiles are quite similar.
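One of the features named above, the position-specific scoring matrix, can be sketched in a few lines of Python. This is only an illustration of the general PSSM technique, not the PASS or PAC implementation: the function names `build_pssm` and `scan`, the pseudocount, the uniform background of 0.25, and the toy aligned sites are all assumptions.

```python
import math

ALPHABET = "ACGT"


def build_pssm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length aligned sequences.

    Hypothetical helper; pseudocount and uniform background are assumptions.
    """
    length = len(aligned_sites[0])
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in aligned_sites]
        total = len(column) + pseudocount * len(ALPHABET)
        # Log2 ratio of the smoothed base frequency to the background frequency.
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pssm


def scan(sequence, pssm):
    """Return one PSSM score per window start position in the sequence."""
    width = len(pssm)
    return [
        sum(pssm[j][sequence[i + j]] for j in range(width))
        for i in range(len(sequence) - width + 1)
    ]


# Toy aligned sites resembling an AATAAA-like signal (made-up data).
sites = ["AATAAA", "AATAAA", "ATTAAA", "AATGAA"]
pssm = build_pssm(sites)
scores = scan("GGAATAAAGG", pssm)
best_start = scores.index(max(scores))
print(best_start)
```

A per-position score list of this kind is analogous in shape to the per-position output the abstract describes, although the real programs combine several features rather than a single matrix.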

Xing D, Li QQ. RADPRE: a computational program for identification of differential mRNA processing including alternative polyadenylation. Methods in molecular biology (Clifton, N.J.). 2015;1255:57–66. doi:10.1007/978-1-4939-2175-1_6

Genome-wide studies have revealed the prevalence of multiple transcripts resulting from alternative polyadenylation (APA) of a single gene in higher eukaryotes. Several studies in the past few years have attempted to address how those APA events are regulated and what the biological consequences of that regulation are. Common to these efforts is the comparison of unbiased transcriptome data, derived either from whole-genome tiling arrays or from next-generation sequencing, to identify the specific APA events in a given condition. RADPRE (Ratio-based Analysis of Differential mRNA Processing and Expression) is an R program developed to serve this purpose using data from whole-genome tiling arrays. RADPRE takes a set of tiling array data as input and performs a series of calculations, including a correction for probe affinity variation, a hierarchy of statistical tests, and an estimation of the false discovery rate (FDR) of the differentially processed genes (DPG). The output is a few tabular files that list the DPG and their corresponding FDR. This chapter is written for scientists with limited programming experience.
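The abstract does not say how RADPRE estimates the FDR, and RADPRE itself is written in R. As a stand-in, the following Python sketch shows the widely used Benjamini-Hochberg adjustment, which converts a list of p-values into FDR-adjusted values. The function name and the example p-values are assumptions for illustration and are not taken from RADPRE.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Stand-in technique for illustration; not RADPRE's documented method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four hypothetical genes.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```

Genes whose adjusted value falls below a chosen threshold would then be reported as differentially processed at that FDR level.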

Liu M, Wu X, Li QQ. DNA/RNA hybrid primer mediated poly(A) tag library construction for Illumina sequencing. Methods in molecular biology (Clifton, N.J.). 2015;1255:175–84. doi:10.1007/978-1-4939-2175-1_15

Alternative polyadenylation is widespread in eukaryotes and has demonstrated roles in gene expression regulation. Owing to deep DNA sequencing technologies, global analyses of alternative polyadenylation and its functions have become possible. We present a method to generate poly(A) tag libraries for high-throughput sequencing (PAT-seq). This protocol targets the junction of the 3'-UTR and the poly(A) tail of a transcript so that it can be positively identified as a poly(A) site. After zinc-mediated limited digestion of total RNA, RNA fragments with a poly(A) tail are isolated and 5'-end repaired. A DNA/RNA hybrid adaptor is ligated to the 5' end as an anchor. The library is then generated by reverse transcription with an oligo(dT)-adapter, followed by PCR amplification. Such a custom poly(A) tag library can be generated from any source of poly(A)-containing RNA and is suitable for both single- and paired-end sequencing on any Illumina sequencing platform. This new method has been applied to investigate mRNA polyadenylation in Arabidopsis.

Guan J, Fu J, Wu M, Chen L, Ji G, Li QQ, Wu X. VAAPA: a web platform for visualization and analysis of alternative polyadenylation. Computers in biology and medicine. 2015;57:20–5. doi:10.1016/j.compbiomed.2014.11.010

Polyadenylation [poly(A)] is an essential process during the maturation of most mRNAs in eukaryotes. Alternative polyadenylation (APA) has been increasingly recognized as an important layer of gene expression regulation in various species. Here, a web platform for visualization and analysis of alternative polyadenylation (VAAPA) was developed. This platform can visualize the distribution of poly(A) sites and poly(A) clusters of a gene or a section of a chromosome. It can also highlight genes with switched APA sites among different conditions. VAAPA is an easy-to-use web-based tool that provides poly(A) site query, data uploading, downloading, and APA site visualization. It was designed with a multi-tier architecture and developed on Smart GWT (Google Web Toolkit) using Java as the development language. VAAPA will be a valuable addition to the community for the comprehensive study of APA, not only by making high-quality poly(A) site data more accessible, but also by providing users with numerous functions for poly(A) site analysis and visualization.

Zhao H, Li QQ. In vitro analysis of cleavage and polyadenylation in Arabidopsis. Methods in molecular biology (Clifton, N.J.). 2015;1255:79–89. doi:10.1007/978-1-4939-2175-1_8

In eukaryotes, pre-messenger RNA (pre-mRNA) cleavage and polyadenylation is one of the necessary processing steps that produce a mature and functional mRNA. Regulation of pre-mRNA cleavage and polyadenylation affects other processes such as mRNA translocation, stability, and translation. The process of pre-mRNA cleavage and polyadenylation, and its relationship with RNA splicing and translation, has been extensively studied because of its importance in vivo. A successful in vitro system has provided an enormous amount of information for the study of cleavage and polyadenylation in mammalian and yeast systems. Here, we describe an in vitro pre-mRNA cleavage system that faithfully cleaves a pre-mRNA substrate using Arabidopsis cell/tissue cultures.

Ji G, Li L, Li QQ, Wu X, Fu J, Chen G, Wu X. PASPA: a web server for mRNA poly(A) site predictions in plants and algae. Bioinformatics (Oxford, England). 2015;31(10):1671–3. doi:10.1093/bioinformatics/btv004

Polyadenylation is an essential process during eukaryotic gene expression. Prediction of poly(A) sites helps to define the 3' end of genes, which is important for gene annotation and for elucidating gene regulation mechanisms. However, because of limited knowledge of poly(A) signals, it is still challenging to predict poly(A) sites in plants and algae. PASPA is a web server for Poly(A) Site prediction in Plants and Algae, which integrates many in-house tools as add-ons to facilitate poly(A) site prediction, visualization, and mining. This server can predict poly(A) sites for ten species, including seven species whose poly(A) signals were not previously characterized, with sensitivity and specificity ranging between 0.80 and 0.95.
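For readers unfamiliar with the two reported metrics, the short Python sketch below computes them from a confusion matrix using the standard definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). It is an assumption that the paper uses exactly these definitions, and the counts are made up purely for illustration.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute (sensitivity, specificity) from confusion-matrix counts.

    Standard definitions; assumed, not confirmed, to match the paper's usage.
    """
    sensitivity = tp / (tp + fn)  # fraction of true poly(A) sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites correctly rejected
    return sensitivity, specificity


# Made-up counts: 100 true sites and 100 non-sites.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(sens, spec)
```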