Publications

2012

Hunt AG, Xing D, Li QQ. Plant polyadenylation factors: conservation and variety in the polyadenylation complex in plants. BMC genomics. 2012;13:641. doi:10.1186/1471-2164-13-641

BACKGROUND: Polyadenylation, an essential step in eukaryotic gene expression, requires both cis-elements and a plethora of trans-acting polyadenylation factors. The polyadenylation factors are largely conserved across mammals and fungi. Analyses of Arabidopsis polyadenylation factors suggest that this conservation also extends to plants. To extend this observation, we systematically identified the orthologs of yeast and human polyadenylation factors from 10 plant species chosen based on both the availability of their genome sequences and their positions in the evolutionary tree, which render them representatives of different plant lineages.

RESULTS: The evolutionary trajectories revealed several interesting features of plant polyadenylation factors. First, the number of genes encoding plant polyadenylation factors was clearly increased from "lower" to "higher" plants. Second, the gene expansion in higher plants was biased to some polyadenylation factors, particularly those involved in RNA binding. Finally, while there are clear commonalities, the differences in the polyadenylation apparatus were obvious across different species, suggesting an ongoing process of evolutionary change. These features lead to a model in which the plant polyadenylation complex consists of a conserved core, which is rather rigid in terms of evolutionary conservation, and a panoply of peripheral subunits, which are less conserved and associated with the core in various combinations, forming a collection of somewhat distinct complex assemblies.

CONCLUSIONS: The multiple forms of the plant polyadenylation complex, together with the diversified poly(A) signals, may explain the intensive alternative polyadenylation (APA) and its regulatory role in biological functions of higher plants.

2011

Xing D, Li QQ. Alternative polyadenylation and gene expression regulation in plants. Wiley interdisciplinary reviews. RNA. 2011;2(3):445–58. doi:10.1002/wrna.59

Functioning as an essential step of pre-mRNA processing, polyadenylation has been recognized in recent years as playing an important regulatory role during eukaryotic gene expression. Such regulation occurs mostly through the use of alternative polyadenylation (APA) sites and generates different transcripts with altered coding capacity for proteins and/or RNA. However, the molecular mechanisms that underlie APA are poorly understood. Besides APA cases demonstrated in animal embryo development, cancers, and other diseases, there are a number of APA examples reported in plants. The best-known ones are related to flowering time control pathways and stress responses. Genome-wide studies have revealed that plants use APA extensively to generate diversity in their transcriptomes. Although each transcript produced by RNA polymerase II has a poly(A) tail, over 50% of plant genes studied possess multiple APA sites in their transcripts. The signals defining poly(A) sites in plants were mostly studied through classical genetic means. Our understanding of these poly(A) signals is enhanced by the tallies of whole plant transcriptomes. The profiles of these signals have been used to build computer models that can predict poly(A) sites in newly sequenced genomes, predict potential APA sites in genes of interest, and/or identify, and then mutate, unwanted poly(A) sites in target transgenes to facilitate crop improvements. In this review, we provide readers with an update on recent research advances that shed light on polyadenylation, APA, and its role in gene expression regulation in plants.

Zheng J, Xing D, Wu X, Shen Y, Kroll DM, Ji G, Li QQ. Ratio-based analysis of differential mRNA processing and expression of a polyadenylation factor mutant pcfs4 using Arabidopsis tiling microarray. PloS one. 2011;6(2):e14719. doi:10.1371/journal.pone.0014719

BACKGROUND: Alternative polyadenylation as a mechanism in gene expression regulation has been widely recognized in recent years. Arabidopsis polyadenylation factor PCFS4 was shown to function in leaf development and in flowering time control. The function of PCFS4 in controlling flowering time was correlated with the alternative polyadenylation of FCA, a flowering time regulator. However, genetic evidence suggested additional targets of PCFS4 that may mediate its function in both flowering time and leaf development.

METHODOLOGY/PRINCIPAL FINDINGS: To identify further targets, we investigated the whole transcriptome of a PCFS4 mutant using the Affymetrix Arabidopsis genomic tiling 1.0R array and developed a data analysis pipeline, termed RADPRE (Ratio-based Analysis of Differential mRNA Processing and Expression). In RADPRE, ratios of normalized probe intensities between wild type Columbia and a pcfs4 mutant were first generated. By doing so, one of the major problems of tiling array data (variation caused by differential probe affinity) was significantly alleviated. With the probe ratios as inputs, a hierarchy of statistical tests was carried out to identify differentially processed genes (DPG) and differentially expressed genes (DEG). The false discovery rate (FDR) of this analysis was estimated by using the balanced random combinations of Col/pcfs4 and pcfs4/Col ratios as inputs. Gene Ontology (GO) analysis of the DPGs and DEGs revealed potential new roles of PCFS4 in stress responses besides flowering time regulation.

CONCLUSION/SIGNIFICANCE: We identified 68 DPGs and 114 DEGs with FDR at 1% and 2%, respectively. Most of the 68 DPGs were subjected to alternative polyadenylation, splicing or transcription initiation. Quantitative PCR analysis of a set of DPGs confirmed that most of these genes were truly differentially processed in pcfs4 mutant plants. The enriched GO term "regulation of flower development" among PCFS4 targets further indicated the efficacy of the RADPRE pipeline. This simple but effective program is available upon request.
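
The ratio step described in this abstract can be illustrated with a minimal sketch. It is not the RADPRE code (which the authors offer on request); it only assumes that probe affinity acts as a multiplicative factor on the measured intensity, so that it cancels when wild-type and mutant intensities of the same probe are divided. The function name, the affinity values, and the expression levels below are hypothetical.

```python
import math


def log_ratios(wild_type, mutant):
    """Return per-probe log2(wild type / mutant) for paired, normalized intensities."""
    if len(wild_type) != len(mutant):
        raise ValueError("probe lists must have equal length")
    return [math.log2(w / m) for w, m in zip(wild_type, mutant)]


# Hypothetical example: three probes on one transcript with very different
# affinities (assumed multiplicative), and a transcript that is twice as
# abundant in wild type as in the mutant.
affinity = [0.5, 2.0, 8.0]
true_wild_type = 100.0
true_mutant = 50.0

wild_type_intensity = [a * true_wild_type for a in affinity]
mutant_intensity = [a * true_mutant for a in affinity]

# Raw intensities differ 16-fold between probes, but the ratios agree.
ratios = log_ratios(wild_type_intensity, mutant_intensity)
print(ratios)
```

Under this assumption every probe reports the same log ratio even though the raw intensities differ widely, which is why ratios are a more comparable input for downstream statistical tests than raw intensities.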

Shen Y, Venu RC, Nobuta K, Wu X, Notibala V, Demirci C, Meyers BC, Wang G-L, Ji G, Li QQ. Transcriptome dynamics through alternative polyadenylation in developmental and environmental responses in plants revealed by deep sequencing. Genome research. 2011;21(9):1478–86. doi:10.1101/gr.114744.110

Polyadenylation sites mark the ends of mRNA transcripts. Alternative polyadenylation (APA) may alter sequence elements and/or the coding capacity of transcripts, a mechanism that has been demonstrated to regulate gene expression and transcriptome diversity. To study the role of APA in transcriptome dynamics, we analyzed a large-scale data set of RNA "tags" that signify poly(A) sites and expression levels of mRNA. These tags were derived from a wide range of tissues and developmental stages that were mutated or exposed to environmental treatments, and generated using digital gene expression (DGE)-based protocols of the massively parallel signature sequencing (MPSS-DGE) and the Illumina sequencing-by-synthesis (SBS-DGE) sequencing platforms. The data offer a global view of APA and how it contributes to transcriptome dynamics. Upon analysis of these data, we found that ∼60% of Arabidopsis genes have multiple poly(A) sites. Likewise, ∼47% and 82% of rice genes use APA, supported by MPSS-DGE and SBS-DGE tags, respectively. In both species, ∼49%-66% of APA events were mapped upstream of annotated stop codons. Interestingly, 10% of the transcriptomes are made up of APA transcripts that are differentially distributed among developmental stages and in tissues responding to environmental stresses, providing an additional level of transcriptome dynamics. Examples of pollen-specific APA switching and salicylic acid treatment-specific APA clearly demonstrated such dynamics. The significance of these APAs is more evident in the 3034 genes that have conserved APA events between rice and Arabidopsis.

Wu X, Liu M, Downie B, Liang C, Ji G, Li QQ, Hunt AG. Genome-wide landscape of polyadenylation in Arabidopsis provides evidence for extensive alternative polyadenylation. Proceedings of the National Academy of Sciences of the United States of America. 2011;108(30):12533–8. doi:10.1073/pnas.1019732108

Alternative polyadenylation (APA) has been shown to play an important role in gene expression regulation in animals and plants. However, the extent of sense and antisense APA at the genome level is not known. We developed a deep-sequencing protocol that queries the junctions of 3'UTR and poly(A) tails and confidently maps the poly(A) tags to the annotated genome. The results of this mapping show that 70% of Arabidopsis genes use more than one poly(A) site, excluding microheterogeneity. Analysis of the poly(A) tags reveals extensive APA in introns and coding sequences, which can significantly alter transcript sequences and the proteins they encode. Although the interplay of intron splicing and polyadenylation potentially defines poly(A) site use in introns, the polyadenylation signals leading to the use of poly(A) sites in protein-coding regions (CDS) are distinct from those in the rest of the genome. Interestingly, a large number of poly(A) sites correspond to putative antisense transcripts that overlap with the promoter of the associated sense transcript, a mode previously demonstrated to regulate sense gene expression. Our results suggest that APA plays a far greater role in gene expression in plants than previously expected.
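
To make the phrase "more than one poly(A) site, excluding microheterogeneity" concrete, the following is a minimal sketch of one way such a count could be made. It is an illustration only, not the authors' pipeline: the clustering rule, the 24-nt window, the function names, and the toy gene identifiers are all assumptions chosen for the example.

```python
def cluster_sites(positions, window=24):
    """Group poly(A) positions into clusters.

    A site joins the current cluster when it lies within `window` nt of the
    previous site; otherwise it starts a new cluster. The window size is a
    hypothetical value used to absorb microheterogeneity.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= window:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


def fraction_with_apa(sites_by_gene, window=24):
    """Fraction of genes (with at least one site) that have more than one cluster."""
    genes = [gene for gene, sites in sites_by_gene.items() if sites]
    if not genes:
        return 0.0
    multi = sum(
        1 for gene in genes
        if len(cluster_sites(sites_by_gene[gene], window)) > 1
    )
    return multi / len(genes)


# Toy data with made-up gene names and coordinates.
toy_sites = {
    "geneA": [100, 105, 110],      # one cluster: microheterogeneity only
    "geneB": [200, 400],           # two clusters: counted as APA
    "geneC": [50],                 # single site
    "geneD": [10, 20, 300, 310],   # two clusters: counted as APA
}

apa_fraction = fraction_with_apa(toy_sites)
print(apa_fraction)
```

The choice of window determines how much nearby scatter is treated as a single site, so a reported APA percentage is only comparable across studies that use a similar clustering rule.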

Zhao H, Zheng J, Li QQ. A novel plant in vitro assay system for pre-mRNA cleavage during 3'-end formation. Plant physiology. 2011;157(3):1546–54. doi:10.1104/pp.111.179465

Messenger RNA (mRNA) maturation in eukaryotic cells requires the formation of the 3' end, which includes two tightly coupled steps: the committing cleavage reaction that requires both correct cis-element signals and cleavage complex formation, and the polyadenylation step that adds a polyadenosine [poly(A)] tract to the newly generated 3' end. An in vitro biochemical assay plays a critical role in studying this process. The lack of such an assay system in plants hampered the study of plant mRNA 3'-end formation for the last two decades. To address this, we have now established and characterized a plant in vitro cleavage assay system, in which nuclear protein extracts from Arabidopsis (Arabidopsis thaliana) suspension cell cultures can accurately cleave different pre-mRNAs at expected in vivo authenticated poly(A) sites. The specific activity is dependent on appropriate cis-elements on the substrate RNA. When complemented by yeast (Saccharomyces cerevisiae) poly(A) polymerase, about 150-nucleotide poly(A) tracts were added specifically to the newly cleaved 3' ends in a cooperative manner. The reconstituted polyadenylation reaction is indicative that authentic cleavage products were generated. Our results not only provide a novel plant pre-mRNA cleavage assay system, but also suggest a cross-kingdom functional complementation of yeast poly(A) polymerase in a plant system.

2010

Ji G, Wu X, Shen Y, Huang J, Li QQ. A classification-based prediction model of messenger RNA polyadenylation sites. Journal of theoretical biology. 2010;265(3):287–96. doi:10.1016/j.jtbi.2010.05.015

Messenger RNA polyadenylation is one of the essential processing steps during eukaryotic gene expression. The site of polyadenylation [poly(A) site] marks the end of a transcript, which is also the end of a gene. A computational program that is able to recognize poly(A) sites would prove useful not only for genome annotation in finding gene ends, but also for predicting alternative poly(A) sites. Features that define the poly(A) sites can now be extracted from the poly(A) site datasets to build such predictive models. Using methods including K-gram pattern, Z-curve, position-specific scoring matrix and first-order inhomogeneous Markov sub-model, numerous features were generated and placed in an original feature space. To select the most useful features, attribute selection algorithms, such as information gain and entropy, were employed. A training model was then built based on the Bayesian network to determine a subset of the optimal features. Test models corresponding to the training models were built to predict poly(A) sites in Arabidopsis and rice. Thus, a prediction model, termed Poly(A) site classifier, or PAC, was constructed. The uniqueness of the model lies in its structure in that each sub-model can be replaced or expanded, while feature generation, selection and classification are all independent processes. Its modular design makes it easily adaptable to different species or datasets. The algorithm's high specificity and sensitivity were demonstrated by testing several datasets and, at the best combinations, they both reached 95%. The software package may be used for genome annotation and optimizing transgene structure.
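
Two of the ingredients named in this abstract, K-gram features and information-gain-based attribute selection, can be sketched in a few lines. This is a generic illustration of those techniques, not the PAC software: the function names, the toy sequences, the labels, and the choice of the AATAAA hexamer as the candidate feature are assumptions made for the example.

```python
import math
from collections import Counter


def kgram_counts(seq, k):
    """Count overlapping k-grams (substrings of length k) in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy remaining after splitting on the feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Toy windows (hypothetical): label 1 = contains a poly(A) site, 0 = does not.
sequences = ["GGAATAAAGG", "CCAATAAACC", "GGCCGGCCGG", "CCGGCCGGCC"]
labels = [1, 1, 0, 0]

# Candidate binary feature: does the window contain the hexamer AATAAA?
has_hexamer = ["AATAAA" in kgram_counts(s, 6) for s in sequences]

gain = information_gain(has_hexamer, labels)
print(has_hexamer, gain)
```

In this toy set the feature separates the two classes perfectly, so its information gain equals the full label entropy. On real data, ranking many candidate K-grams by such a score is one way to keep only the most informative features before training a classifier.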

2009

Xing D, Ni S, Kennedy MA, Li QQ. Identification of a plant-specific Zn2+-sensitive ribonuclease activity. Planta. 2009;230(4):819–25. doi:10.1007/s00425-009-0986-3

Ribonucleases (RNases) play a variety of cellular and biological roles in all three domains of life. In an attempt to perform RNA immuno-precipitation assays of Arabidopsis proteins, we found an EDTA-dependent RNase activity in Arabidopsis suspension tissue cultures. Further investigations showed that the EDTA-dependent RNase activity was plant specific. Characterization of the RNase activity indicated that it was insensitive to low pH and high concentrations of NaCl. In the process of isolating the activity with cation exchange chromatography, we found that the EDTA dependency of the activity was lost. This led us to speculate that some metal ions that inhibited the RNase activity may have been removed during cation exchange chromatography, releasing the nuclease activity. The EDTA dependency of the activity could be due to the ability of EDTA to chelate those metal ions, mimicking the effect of the cation exchange chromatography. Indeed, Zn2+ strongly inhibited the activity, and the inhibition could be released by EDTA based on both in-solution and in-gel assays. In-gel assays identified two RNase activity bands. Mass spectrometry assays of those activity bands revealed more than 20 proteins. However, none of them has an apparent known nuclease domain, suggesting that one or more of those proteins might possess a currently uncharacterized nuclease domain. Our results may shed light on RNA metabolism in plants by introducing a novel plant-specific RNase activity.

Xing D, Li QQ. Alternative polyadenylation: a mechanism maximizing transcriptome diversity in higher eukaryotes. Plant signaling & behavior. 2009;4(5):440–2. doi:10.1104/pp.108.129817

Based on comparative genome analyses, the increases in protein-coding gene number could not account for the increases in morphological and behavioral complexity of higher eukaryotes. Transcriptional regulation, alternative splicing and the involvement of non-coding RNA in gene expression regulation have been credited for the drastic increase in transcriptome complexity. However, an emerging theme is another mechanism, alternative polyadenylation (APA), which contributes to the formation of alternative mRNA 3'-ends. First, recent studies indicated that APA is a widespread phenomenon across the transcriptomes of higher eukaryotes and is regulated by developmental and environmental cues. Second, our characterization of the Arabidopsis polyadenylation factors suggested that plant polyadenylation has also evolved to regulate the expression of specific genes by means of APA, and therefore specific biological functions. Finally, phylogenetic analyses of eukaryotic polyadenylation factors from several organisms revealed that the number of polyadenylation factors tends to increase in higher eukaryotes, which provides the potential for their functional differentiation in regulating gene expression through APA. Based on the above evidence, we hypothesize that APA, serving as an additional mechanism, contributes to the complexity of higher eukaryotes.

Zhao H, Xing D, Li QQ. Unique features of plant cleavage and polyadenylation specificity factor revealed by proteomic studies. Plant physiology. 2009;151(3):1546–56. doi:10.1104/pp.109.142729

Cleavage and polyadenylation of precursor mRNA is an essential process for mRNA maturation. Among the 15 to 20 protein factors required for this process, a subgroup of proteins is needed for both cleavage and polyadenylation in plants and animals. This subgroup of proteins is known as the cleavage and polyadenylation specificity factor (CPSF). To explore the in vivo structural features of plant CPSF, we used tandem affinity purification methods to isolate the interacting protein complexes for each component of the CPSF subunits using Arabidopsis (Arabidopsis thaliana ecotype Landsberg erecta) suspension culture cells. The proteins in these complexes were identified by mass spectrometry and western immunoblots. By compiling the in vivo interaction data from tandem affinity purification tagging as well as other available yeast two-hybrid data, we propose an in vivo plant CPSF model in which the Arabidopsis CPSF possesses AtCPSF30, AtCPSF73-I, AtCPSF73-II, AtCPSF100, AtCPSF160, AtFY, and AtFIPS5. Among them, AtCPSF100 serves as a core with which all other factors, except AtFIPS5, are associated. These results show that plant CPSF possesses distinct features, such as AtCPSF73-II and AtFY, while sharing other ortholog components with its yeast and mammalian counterparts. Interestingly, these two unique plant CPSF components have been associated with embryo development and flowering time controls, both of which involve plant-specific biological processes.