Publications

2008

BACKGROUND: In plant functional genomic studies, gene cloning into binary vectors for plant transformation is a routine procedure. Traditionally, gene cloning has relied on restriction enzyme digestion and ligation. In recent years, however, Gateway(R) cloning technology (Invitrogen Co.) has provided a fast and reliable alternative that uses a phage recombination strategy. While many Gateway-compatible vectors are available, we frequently encounter cases in which two recombinant vectors carry the same antibiotic resistance gene for bacterial selection. Under these conditions, it is difficult, and sometimes impossible, to use antibiotic resistance to select the desired transformants. We have, therefore, developed a practical procedure to solve this problem.

RESULTS: An integrated protocol for cloning genes of interest from PCR to Agrobacterium transformants via the Gateway(R) System was developed. The protocol takes advantage of unique characteristics of the replication origins of plasmids used and eliminates the necessity for restriction enzyme digestion in plasmid selections.

CONCLUSION: The protocol presented here is a streamlined procedure for fast and reliable cloning of genes of interest from PCR to Agrobacterium via the Gateway(R) System. This protocol overcomes a key problem in which two recombinant vectors carry the same antibiotic selection marker. In addition, the protocol could be adapted for high-throughput applications.

Liang C, Liu Y, Liu L, Davis AC, Shen Y, Li QQ. Expressed sequence tags with cDNA termini: previously overlooked resources for gene annotation and transcriptome exploration in Chlamydomonas reinhardtii. Genetics. 2008;179(1):83–93. doi:10.1534/genetics.107.085605

Many Chlamydomonas reinhardtii expressed sequence tags (ESTs) in GenBank dbEST and community EST assemblies were either over- or undertrimmed in terms of their cDNA termini, which are defined as the diagnostic sequence elements that delineate 3'/5' ends of mRNA transcripts. Overtrimming represents a loss of directional, positional, and structural information of transcript ends, whereas undertrimming leaves spurious sequences in ESTs that have deleterious effects on downstream EST-based applications. We examined 309,278 raw EST sequencing trace files of C. reinhardtii and found that only 57% had cDNA termini that matched the expected structures specified in their cDNA library constructions while satisfying our minimum length requirement for their final clean sequences. Using GMAP, 156,963 individual ESTs were mapped to the genome successfully, with their in silico-verified cDNA termini anchored to the genome. Our data analysis suggested strong macro- and microheterogeneity of 3'/5' end positions of individual transcripts derived from the same genes in C. reinhardtii. This work annotating differential ends of individual transcripts in the draft genome presents the research community with a new stream of data that will facilitate accurate determination of gene structures, genome annotation, and exploration of the transcriptome and mRNA metabolism in C. reinhardtii.

Shen Y, Liu Y, Liu L, Liang C, Li QQ. Unique features of nuclear mRNA poly(A) signals and alternative polyadenylation in Chlamydomonas reinhardtii. Genetics. 2008;179(1):167–76. doi:10.1534/genetics.108.088971

To understand nuclear mRNA polyadenylation mechanisms in the model alga Chlamydomonas reinhardtii, we generated a data set of 16,952 in silico-verified poly(A) sites from EST sequencing traces based on Chlamydomonas Genome Assembly v.3.1. Analysis of this data set revealed a unique and complex polyadenylation signal profile that sets Chlamydomonas apart from other organisms. In contrast to the high-AU content in the 3'-UTRs of other organisms, Chlamydomonas shows a high-guanylate content that transitions to high-cytidylate around the poly(A) site. The average length of the 3'-UTR is 595 nucleotides (nt), significantly longer than that of Arabidopsis and rice. The dominant poly(A) signal, UGUAA, was found in 52% of the near-upstream elements, and its occurrence may be positively correlated with higher gene expression levels. The UGUAA signal also exists in Arabidopsis and in some mammalian genes but mainly in the far-upstream elements, suggesting a shift in function. The C-rich region after poly(A) sites with unique signal elements is a characteristic downstream element that is lacking in higher plants. We also found a high level of alternative polyadenylation in the Chlamydomonas genome, with up to 33% of the 4057 genes analyzed having at least two unique poly(A) sites and approximately 1% of these genes having poly(A) sites residing in predicted coding sequences, introns, and 5'-UTRs. These potentially contribute to transcriptome diversity and gene expression regulation.
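
As a rough illustration of how the frequency of a signal such as UGUAA in the near-upstream region could be tallied, the following Python sketch checks a fixed window upstream of each poly(A) site. The window bounds (10 to 30 nt upstream), the function names, and the input layout are assumptions made for illustration and are not taken from the paper.

```python
def signal_in_near_upstream(seq, site, signal="UGUAA", near=10, far=30):
    """Return True if `signal` lies entirely within the window that spans
    `far` to `near` nt upstream of the poly(A) site (0-based index `site`).

    The default window is an assumed placeholder, not a published value.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA alphabets
    window = rna[max(0, site - far):max(0, site - near)]
    return signal in window


def signal_fraction(records, **kwargs):
    """Fraction of (sequence, site) records whose window contains the signal."""
    if not records:
        return 0.0
    hits = sum(signal_in_near_upstream(seq, site, **kwargs) for seq, site in records)
    return hits / len(records)


# Toy sequences for demonstration only; these are not real 3'-UTRs.
with_signal = "G" * 25 + "TGTAA" + "C" * 30
without_signal = "G" * 60
print(signal_fraction([(with_signal, 50), (without_signal, 50)]))
```

Applied to a set of verified poly(A) sites, a tally of this kind would yield a percentage comparable in form to the 52% figure reported above, although the exact value depends on how the near-upstream window is defined.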

Xing D, Zhao H, Xu R, Li QQ. Arabidopsis PCFS4, a homologue of yeast polyadenylation factor Pcf11p, regulates FCA alternative processing and promotes flowering time. The Plant journal : for cell and molecular biology. 2008;54(5):899–910. doi:10.1111/j.1365-313X.2008.03455.x

The timely transition from vegetative to reproductive growth is vital for reproductive success in plants. It has been suggested that messenger RNA 3'-end processing plays a role in this transition. Specifically, two autonomous factors in the Arabidopsis thaliana flowering time control pathway, FY and FCA, are required for the alternative polyadenylation of FCA pre-mRNA. In this paper we provide evidence that Pcf11p-similar protein 4 (PCFS4), an Arabidopsis homologue of yeast polyadenylation factor Protein 1 of Cleavage Factor 1 (Pcf11p), regulates FCA alternative polyadenylation and promotes flowering as a novel factor in the autonomous pathway. First, the mutants of PCFS4 show delayed flowering under both long-day and short-day conditions and still respond to vernalization treatment. Next, gene expression analyses indicate that the delayed flowering in pcfs4 mutants is mediated by Flowering Locus C (FLC). Moreover, the expression profile of the known FCA transcripts, which result from alternative polyadenylation, was altered in the pcfs4 mutants, suggesting the role of PCFS4 in FCA alternative polyadenylation and control of flowering time. In agreement with these observations, using yeast two-hybrid assays and TAP-tagged protein pull-down analyses, we also revealed that PCFS4 forms a complex in vivo with FY and other polyadenylation factors. The PCFS4 promoter activity assay indicated that the transcription of PCFS4 is temporally and spatially regulated, suggesting its non-essential nature in plant growth and development.

Hunt AG, Xu R, Addepalli B, Rao S, Forbes KP, Meeks LR, Xing D, Mo M, Zhao H, Bandyopadhyay A, et al. Arabidopsis mRNA polyadenylation machinery: comprehensive analysis of protein-protein interactions and gene expression profiling. BMC genomics. 2008;9:220. doi:10.1186/1471-2164-9-220

BACKGROUND: The polyadenylation of mRNA is one of the critical processing steps during expression of almost all eukaryotic genes. It is tightly integrated with transcription, particularly its termination, as well as other RNA processing events, i.e. capping and splicing. The poly(A) tail protects the mRNA from unregulated degradation, and it is required for nuclear export and translation initiation. In recent years, it has been demonstrated that the polyadenylation process is also involved in the regulation of gene expression. The polyadenylation process requires two components, the cis-elements on the mRNA and a group of protein factors that recognize the cis-elements and produce the poly(A) tail. Here we report a comprehensive pairwise protein-protein interaction mapping and gene expression profiling of the mRNA polyadenylation protein machinery in Arabidopsis.

RESULTS: By protein sequence homology search using human and yeast polyadenylation factors, we identified 28 proteins that may be components of Arabidopsis polyadenylation machinery. To elucidate the protein network and their functions, we first tested their protein-protein interaction profiles. Out of 320 pair-wise protein-protein interaction assays done using the yeast two-hybrid system, 56 (approximately 17%) showed positive interactions. Fifteen of these interactions were further tested, and all were confirmed by co-immunoprecipitation and/or in vitro co-purification. These interactions organize into three distinct hubs involving the Arabidopsis polyadenylation factors. These hubs are centered around AtCPSF100, AtCLPS, and AtFIPS. The first two are similar to complexes seen in mammals, while the third one stands out as unique to plants. When comparing the gene expression profiles extracted from publicly available microarray datasets, some of the polyadenylation-related genes showed tissue-specific expression, suggesting potentially different polyadenylation complex configurations.
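
The positive-interaction rate quoted above, and the idea of an interaction hub, can be made concrete with a short Python sketch. The counts 56 and 320 come from the abstract. The pair list and the protein names P1 to P3 are hypothetical placeholders rather than data from the study, and node degree is used here only as one simple way to rank hub candidates.

```python
from collections import Counter


def positive_rate(positives, assays):
    """Fraction of assays that scored positive."""
    return positives / assays


def degree_by_protein(pairs):
    """Count interaction partners per protein; a self-interaction counts once."""
    degree = Counter()
    for a, b in pairs:
        degree[a] += 1
        if b != a:
            degree[b] += 1
    return degree


# Counts taken from the abstract: 56 positives out of 320 assays.
rate = positive_rate(56, 320)
print(f"{rate:.1%}")

# Hypothetical placeholder interactions, for illustration only.
toy_pairs = [("P1", "P2"), ("P1", "P3"), ("P1", "P1"), ("P2", "P3")]
print(degree_by_protein(toy_pairs).most_common(1))
```

The rate works out to 17.5%, consistent with the "approximately 17%" stated in the abstract. With real yeast two-hybrid results in place of the toy pairs, the highest-degree proteins would be natural candidates for hub centers.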

CONCLUSION: An extensive protein network was revealed for plant polyadenylation machinery, in which all predicted proteins were found to be connected to the complex. The gene expression profiles indicate that specialized sub-complexes may be formed to carry out targeted processing of mRNA in different developmental stages and tissue types. These results offer a roadmap for further functional characterizations of the protein factors, and for building models when testing the genetic contributions of these genes in plant growth and development.

Shen Y, Ji G, Haas BJ, Wu X, Zheng J, Reese GJ, Li QQ. Genome level analysis of rice mRNA 3’-end processing signals and alternative polyadenylation. Nucleic acids research. 2008;36(9):3150–61. doi:10.1093/nar/gkn158

The position of a poly(A) site of eukaryotic mRNA is determined by sequence signals in pre-mRNA and a group of polyadenylation factors. To reveal rice poly(A) signals at a genome level, we constructed a dataset of 55,742 authenticated poly(A) sites and characterized the poly(A) signals. This resulted in identifying the typical tripartite cis-elements, including FUE, NUE and CE, as previously observed in Arabidopsis. The average size of the 3'-UTR was 289 nucleotides. When mapped to the genome, however, 15% of these poly(A) sites were found to be located in the currently annotated intergenic regions. Moreover, an extensive alternative polyadenylation profile was evident where 50% of the genes analyzed had more than one unique poly(A) site (excluding microheterogeneity sites), and 13% had four or more poly(A) sites. About 4% of the analyzed genes possessed alternative poly(A) sites at their introns, 5'-UTRs, or protein coding regions. The authenticity of these alternative poly(A) sites was partially confirmed using MPSS data. Analysis of nucleotide profile and signal patterns indicated that there may be a different set of poly(A) signals for those poly(A) sites found in the coding regions. Based on the features of rice poly(A) signals, an updated algorithm termed PASS-Rice was designed to predict poly(A) sites.
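
To illustrate what counting unique poly(A) sites "excluding microheterogeneity sites" might involve, the Python sketch below merges nearby cleavage positions into one site before counting per gene. The 30-nt merging distance, the single-linkage grouping, and all names are assumptions chosen for illustration. The abstract does not state the criterion that was actually used.

```python
def cluster_sites(positions, max_gap=30):
    """Group sorted positions so that neighbours within `max_gap` nt share a cluster.

    `max_gap` is an assumed threshold for microheterogeneity, not a published one.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)  # close to the previous site: same cluster
        else:
            clusters.append([pos])    # far from the previous site: new unique site
    return clusters


def apa_summary(sites_by_gene, max_gap=30):
    """Return the fractions of genes with more than one, and with four or more, unique sites."""
    counts = [len(cluster_sites(p, max_gap)) for p in sites_by_gene.values()]
    total = len(counts)
    multi = sum(c > 1 for c in counts) / total
    four_plus = sum(c >= 4 for c in counts) / total
    return multi, four_plus


# Toy coordinates for demonstration only; these are not rice data.
toy = {
    "g1": [100, 105, 110],      # one site with microheterogeneity
    "g2": [100, 200],           # two unique sites
    "g3": [10, 100, 200, 300],  # four unique sites
    "g4": [50],                 # one site
}
print(apa_summary(toy))
```

On a genome-wide table of authenticated sites, the two returned fractions correspond in form to the 50% and 13% figures reported above. The values would shift with the merging distance chosen.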

Polyadenylation factor CLP1 is essential for mRNA 3'-end processing in yeast and mammals. The Arabidopsis (Arabidopsis thaliana) CLP1-SIMILAR PROTEIN3 (CLPS3) is an ortholog of human hCLP1. CLPS3 was previously found to be a subunit in the affinity-purified PCFS4-TAP (tandem affinity purification) complex involved in the alternative polyadenylation of FCA and flowering time control in Arabidopsis. In this article, we further explored the components in the affinity-purified CLPS3-TAP complex, from which Arabidopsis cleavage and polyadenylation specificity factor (CPSF) subunits AtCPSF100 and AtCPSF160 were found. This result implies that CLPS3 may bridge CPSF to the PCFS4 complex. Characterization of the CLPS3 mutant revealed that CLPS3 was essential for embryo development and important for female gametophyte transmission. Overexpression of CLPS3-TAP fusion caused a range of postembryonic development abnormalities, including early flowering time, altered phyllotaxy, and abnormal numbers and shapes of flower organs. These phenotypes are associated with the altered gene expression levels of FCA, WUS, and CUC1. The decreased ratio of FCA-beta to FCA-gamma in the overexpression plants suggests that CLPS3 favored the usage of FCA regular poly(A) site over the alternative site. These observations indicate that Arabidopsis CLPS3 might be involved in the processing of pre-mRNAs encoded by a distinct subset of genes that are important in plant development.

Zhang J, Addepalli B, Yun K-Y, Hunt AG, Xu R, Rao S, Li QQ, Falcone DL. A polyadenylation factor subunit implicated in regulating oxidative signaling in Arabidopsis thaliana. PloS one. 2008;3(6):e2410. doi:10.1371/journal.pone.0002410

BACKGROUND: Plants respond to many unfavorable environmental conditions via signaling mediated by altered levels of various reactive oxygen species (ROS). To gain additional insight into oxidative signaling responses, Arabidopsis mutants that exhibited tolerance to oxidative stress were isolated. We describe herein the isolation and characterization of one such mutant, oxt6.

METHODOLOGY/PRINCIPAL FINDINGS: The oxt6 mutation is due to the disruption of a complex gene (At1g30460) that encodes the Arabidopsis ortholog of the 30-kD subunit of the cleavage and polyadenylation specificity factor (CPSF30) as well as a larger, related 65-kD protein. Expression of mRNAs encoding Arabidopsis CPSF30 alone was able to restore wild-type growth and stress susceptibility to the oxt6 mutant. Transcriptional profiling and single gene expression studies show elevated constitutive expression of a subset of genes that encode proteins containing thioredoxin- and glutaredoxin-related domains in the oxt6 mutant, suggesting that stress can be ameliorated by these gene classes. Bulk poly(A) tail length was not seemingly affected in the oxt6 mutant, but poly(A) site selection was different, indicating a subtle effect on polyadenylation in the mutant.

CONCLUSIONS/SIGNIFICANCE: These results implicate the Arabidopsis CPSF30 protein in the posttranscriptional control of the responses of plants to stress, and in particular to the expression of a set of genes that suffices to confer tolerance to oxidative stress.

2007

Ji G, Zheng J, Shen Y, Wu X, Jiang R, Lin Y, Loke JC, Davis KM, Reese GJ, Li QQ. Predictive modeling of plant messenger RNA polyadenylation sites. BMC bioinformatics. 2007;8:43.

BACKGROUND: One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) tract protects mRNA from unregulated degradation, and indicates the integrity of mRNA through recognition by mRNA export and translation machinery. The position of a poly(A) site is predetermined by signals in the pre-mRNA sequence that are recognized by a complex of polyadenylation factors. These signals are generally tripartite sequence patterns around the cleavage site that serves as the future poly(A) site. In plants, there is little sequence conservation among these signal elements, which makes it difficult to develop an accurate algorithm to predict the poly(A) site of a given gene. We attempted to solve this problem.

RESULTS: Based on our current working model and the profile of nucleotide sequence distribution of the poly(A) signals and around poly(A) sites in Arabidopsis, we have devised a Generalized Hidden Markov Model based algorithm to predict potential poly(A) sites. The high specificity and sensitivity of the algorithm were demonstrated by testing several datasets, and at the best combinations, both reach 97%. The accuracy of the program, called poly(A) site sleuth or PASS, has been demonstrated by the prediction of many validated poly(A) sites. PASS also predicted the changes of poly(A) site efficiency in poly(A) signal mutants that were constructed and characterized by traditional genetic experiments. The efficacy of PASS was demonstrated by predicting poly(A) sites within long genomic sequences.
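
For readers unfamiliar with the two metrics quoted above, the Python sketch below computes sensitivity and specificity from paired true and predicted labels using the standard definitions. It does not reproduce PASS or its Generalized Hidden Markov Model. The function name and the toy labels are illustrative assumptions.

```python
def sensitivity_specificity(truth, predicted):
    """Compute (sensitivity, specificity) from parallel sequences of 0/1 labels.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    Both classes must be present in `truth`, otherwise a ratio is undefined.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = sum(t == 1 and p == 1 for t, p in zip(truth, predicted))
    fn = sum(t == 1 and p == 0 for t, p in zip(truth, predicted))
    tn = sum(t == 0 and p == 0 for t, p in zip(truth, predicted))
    fp = sum(t == 0 and p == 1 for t, p in zip(truth, predicted))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both positive and negative examples are required")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels for demonstration only: 1 = poly(A) site, 0 = not a site.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(sensitivity_specificity(truth, predicted))
```

In an evaluation of a site predictor, `truth` would hold the validated status of each candidate position and `predicted` the calls made at a given score threshold. Varying that threshold trades one metric against the other.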

CONCLUSION: Based on the features of plant poly(A) signals, a computational model was built to effectively predict the poly(A) sites in Arabidopsis genes. The algorithm will be useful in gene annotation because a poly(A) site signifies the end of the transcript. This algorithm can also be used to predict alternative poly(A) sites in known genes, and will be useful in the design of transgenes for crop genetic engineering by predicting and eliminating undesirable poly(A) sites.

2006

Delaney KJ, Xu R, Zhang J, Li Q, Yun K-Y, Falcone DL, Hunt AG. Calmodulin interacts with and regulates the RNA-binding activity of an Arabidopsis polyadenylation factor subunit. Plant physiology. 2006;140(4):1507–21.

The Arabidopsis (Arabidopsis thaliana) gene that encodes the probable ortholog of the 30-kD subunit of the mammalian cleavage and polyadenylation specificity factor (CPSF) is a complex one, encoding small (approximately 28 kD) and large (approximately 68 kD) polypeptides. The small polypeptide (AtCPSF30) corresponds to CPSF30 and is the focus of this study. Recombinant AtCPSF30 was purified from Escherichia coli and found to possess RNA-binding activity. Mutational analysis indicated that an evolutionarily conserved central core of AtCPSF30 is involved in RNA binding, but that RNA binding also requires a short sequence adjacent to the N terminus of the central core. AtCPSF30 was found to bind calmodulin, and calmodulin inhibited the RNA-binding activity of the protein in a calcium-dependent manner. Mutational analysis showed that a small part of the protein, again adjacent to the N terminus of the conserved core, is responsible for calmodulin binding; point mutations in this region abolished both binding to and inhibition of RNA binding by calmodulin. Interestingly, AtCPSF30 was capable of self-interactions. This property also mapped to the central conserved core of the protein. However, calmodulin had no discernible effect on the self-association. These results show that the central portion of AtCPSF30 is involved in a number of important functions, and they raise interesting possibilities for both the interplay between splicing and polyadenylation and the regulation of these processes by stimuli that act through calmodulin.