Publications
2024
-
(2024) Cell Host and Microbe. 32, 10, p. 1744-1757.e2 Abstract
The genetic diversity of the gut microbiota has a central role in host health. Here, we created pangenomes for 728 human gut prokaryotic species, quadrupling the genes of strain-specific genomes. Each of these species has a core set of a thousand genes, differing even between closely related species, and an accessory set of genes unique to the different strains. Functional analysis shows high strain variability associates with sporulation, whereas low variability is linked with antibiotic resistance. We further map the antibiotic resistome across the human gut population and find 237 cases of extreme resistance even to last-resort antibiotics, with a predominance among Enterobacteriaceae. Lastly, the presence of specific genes in the microbiota relates to host age and sex. Our study underscores the genetic complexity of the human gut microbiota, emphasizing its significant implications for host health. The pangenomes and antibiotic resistance map constitute a valuable resource for further research.
-
(2024) Journal of Experimental Medicine. 221, 5, e20231686. Abstract
The mycobiota are a critical part of the gut microbiome, but host-fungal interactions and specific functional contributions of commensal fungi to host fitness remain incompletely understood. Here, we report the identification of a new fungal commensal, Kazachstania heterogenica var. weizmannii, isolated from murine intestines. K. weizmannii exposure prevented Candida albicans colonization and significantly reduced the commensal C. albicans burden in colonized animals. Following immunosuppression of C. albicans-colonized mice, competitive fungal commensalism thereby mitigated fatal candidiasis. Metagenome analysis revealed K. heterogenica or K. weizmannii presence among human commensals. Our results reveal competitive fungal commensalism within the intestinal microbiota, independent of bacteria and immune responses, that could bear potential therapeutic value for the management of C. albicans-mediated diseases.
-
(2024) Cell Reports. 43, 5, 114203. Abstract
Leishmania is the causative agent of cutaneous and visceral diseases affecting millions of individuals worldwide. Pseudouridine (Ψ), the most abundant modification on rRNA, changes during the parasite life cycle. Alterations in the level of a specific Ψ in helix 69 (H69) affected ribosome function. To decipher the molecular mechanism of this phenotype, we determine the structure of ribosomes lacking the single Ψ and its parental strain at ∼2.43 Å resolution using cryo-EM. Our findings demonstrate the significance of a single Ψ on H69 to its structure and the importance for its interactions with helix 44 and specific tRNAs. Our study suggests that rRNA modification affects translation of mRNAs carrying codon bias due to selective accommodation of tRNAs by the ribosome. Based on the high-resolution structures, we propose a mechanism explaining how the ribosome selects specific tRNAs.
-
(2024) PLoS Biology. 22, 3, e3002570. Abstract
Some drugs increase the mutation rate of their target pathogen, a potentially concerning mechanism as the pathogen might evolve faster toward an undesired phenotype. We suggest a four-step assessment of evolutionary safety for the approval of such treatments.
-
(2024) Molecular Biology and Evolution. 41, 3, msae052. Abstract
Aneuploidy is common in eukaryotes, often leading to decreased fitness. However, evidence from fungi and human tumor cells suggests that specific aneuploidies can be beneficial under stressful conditions and facilitate adaptation. In a previous evolutionary experiment with yeast, populations evolving under heat stress became aneuploid, only to later revert to euploidy after beneficial mutations accumulated. It was therefore suggested that aneuploidy is a “stepping stone” on the path to adaptation. Here, we test this hypothesis. We use Bayesian inference to fit an evolutionary model with both aneuploidy and mutation to the experimental results. We then predict the genotype frequency dynamics during the experiment, demonstrating that most of the evolved euploid population likely did not descend from aneuploid cells, but rather from the euploid wild-type population. Our model shows how the beneficial mutation supply (the product of population size and beneficial mutation rate) determines the evolutionary dynamics: with low supply, much of the evolved population descends from aneuploid cells; but with high supply, beneficial mutations are generated fast enough to outcompete aneuploidy due to its inherent fitness cost. Our results suggest that despite its potential fitness benefits under stress, aneuploidy can be an evolutionary “diversion” rather than a “stepping stone”: it can delay, rather than facilitate, the adaptation of the population, and cells that become aneuploid may leave fewer descendants compared to cells that remain diploid.
2023
-
(2023) Nature Communications. 14, 5384. Abstract
Diabetes and associated comorbidities are a global health threat on the rise. We conducted a six-month dietary intervention in pre-diabetic individuals (NCT03222791), to mitigate the hyperglycemia and enhance metabolic health. The current work explores early diabetes markers in the 200 individuals who completed the trial. We find 166 of 2,803 measured features, including oral and gut microbial species and pathways, serum metabolites and cytokines, show significant change in response to a personalized postprandial glucose-targeting diet or the standard of care Mediterranean diet. These changes include established markers of hyperglycemia as well as novel features that can now be investigated as potential therapeutic targets. Our results indicate the microbiome mediates the effect of diet on glycemic, metabolic and immune measurements, with gut microbiome compositional change explaining 12.25% of serum metabolites variance. Although the gut microbiome displays greater compositional changes compared to the oral microbiome, the oral microbiome demonstrates more changes at the genetic level, with trends dependent on environmental richness and species prevalence in the population. In conclusion, our study shows dietary interventions can affect the microbiome, cardiometabolic profile and immune response of the host, and that these factors are well associated with each other, and can be harnessed for new therapeutic modalities.
-
(2023) Genetics. 225, 1, iyad111. Abstract
The mutation rate plays an important role in adaptive evolution. It can be modified by mutator and anti-mutator alleles. Recent empirical evidence hints that the mutation rate may vary among genetically identical individuals: evidence from bacteria suggests that the mutation rate can be affected by expression noise of a DNA repair protein and potentially also by translation errors in various proteins. Importantly, this non-genetic variation may be heritable via a transgenerational epigenetic mode of inheritance, giving rise to a mutator phenotype that is independent from mutator alleles. Here, we investigate mathematically how the rate of adaptive evolution is affected by the rate of mutation rate phenotype switching. We model an asexual population with two mutation rate phenotypes, non-mutator and mutator. An offspring may switch from its parental phenotype to the other phenotype. We find that switching rates that correspond to so-far empirically described non-genetic systems of inheritance of the mutation rate lead to higher rates of adaptation on both artificial and natural fitness landscapes. These switching rates can maintain within the same individuals both a mutator phenotype and intermediary mutations, a combination that facilitates adaptation. Moreover, non-genetic inheritance increases the proportion of mutators in the population, which in turn increases the probability of hitchhiking of the mutator phenotype with adaptive mutations. This in turn facilitates the acquisition of additional adaptive mutations. Our results rationalize recently observed noise in the expression of proteins that affect the mutation rate and suggest that non-genetic inheritance of this phenotype may facilitate evolutionary adaptive processes.
-
(2023) PLoS Biology. 21, 8, e3002214. Abstract
Nucleoside analogs are a major class of antiviral drugs. Some act by increasing the viral mutation rate causing lethal mutagenesis of the virus. Their mutagenic capacity, however, may lead to an evolutionary safety concern. We define evolutionary safety as a probabilistic assurance that the treatment will not generate an increased number of mutants. We develop a mathematical framework to estimate the total mutant load produced with and without mutagenic treatment. We predict rates of appearance of such virus mutants as a function of the timing of treatment and the immune competence of patients, employing realistic assumptions about the vulnerability of the viral genome and its potential to generate viable mutants. We focus on the case study of Molnupiravir, which is an FDA-approved treatment against Coronavirus Disease-2019 (COVID-19). We estimate that Molnupiravir is narrowly evolutionarily safe, subject to the current estimate of parameters. Evolutionary safety can be improved by restricting treatment with this drug to individuals with a low immunological clearance rate and, in future, by designing treatments that lead to a greater increase in mutation rate. We report a simple mathematical rule to determine the fold increase in mutation rate required to obtain evolutionary safety that is also applicable to other pathogen-treatment combinations.
2022
-
(2022) Cell. 185, 20, p. 3789-3806 Abstract
Cancer-microbe associations have been explored for centuries, but cancer-associated fungi have rarely been examined. Here, we comprehensively characterize the cancer mycobiome within 17,401 patient tissue, blood, and plasma samples across 35 cancer types in four independent cohorts. We report fungal DNA and cells at low abundances across many major human cancers, with community compositions that differ among cancer types, even when accounting for technical background. Fungal histological staining of tissue microarrays supported intratumoral presence and frequent spatial association with cancer cells and macrophages. Comparing intratumoral fungal communities with matched bacteriomes and immunomes revealed co-occurring bi-domain ecologies, often with permissive, rather than competitive, microenvironments and distinct immune responses. Clinically focused assessments suggested prognostic and diagnostic capacities of the tissue and plasma mycobiomes, even in stage I cancers, and synergistic predictive performance with bacteriomes.
-
(2022) Molecular Biology and Evolution. 39, 9, msac178. Abstract
Fitness landscape mapping and the prediction of evolutionary trajectories on these landscapes are major tasks in evolutionary biology research. Evolutionary dynamics is tightly linked to the landscape topography, but this relation is not straightforward. Here, we analyze a fitness landscape of a yeast tRNA gene, previously measured under four different conditions. We find that the wild type allele is sub-optimal, and 8–10% of its variants are fitter. We rule out the possibilities that the wild type is fittest on average across these four conditions or located on a local fitness maximum. Notwithstanding, we cannot exclude the possibility that the wild type might be fittest in some of the many conditions in the complex ecology that yeast lives in. Instead, we find that the wild type is mutationally robust (“flat”), while more fit variants are typically mutationally fragile. Similar observations of mutational robustness or flatness have been so far made in very few cases, predominantly in viral genomes.
-
(2022) PLoS Computational Biology. 18, 8, e1010391. Abstract
The COVID-19 pandemic demonstrated that the process of global vaccination against a novel virus can be a prolonged one. Social distancing measures, which are initially adopted to control the pandemic, are gradually relaxed as vaccination progresses and population immunity increases. The result is a prolonged period of high disease prevalence combined with a fitness advantage for vaccine-resistant variants, which together lead to a considerably increased probability for vaccine escape. A spatial vaccination strategy is proposed that has the potential to dramatically reduce this risk. Rather than dispersing the vaccination effort evenly throughout a country, distinct geographic regions of the country are sequentially vaccinated, quickly bringing each to effective herd immunity. Regions with high vaccination rates will then have low infection rates and vice versa. Since people primarily interact within their own region, spatial vaccination reduces the number of encounters between infected individuals (the source of mutations) and vaccinated individuals (who facilitate the spread of vaccine-resistant strains). Thus, spatial vaccination may help mitigate the global risk of vaccine-resistant variants.
-
(2022) The Journal of Biological Chemistry. 298, 7, 102141. Abstract
Trypanosoma brucei, the parasite that causes sleeping sickness, cycles between an insect and a mammalian host. However, the effect of RNA modifications such as pseudouridinylation on its ability to survive in these two different host environments is unclear. Here, two genome-wide approaches were applied for mapping pseudouridinylation sites (Ψs) on small nucleolar RNA (snoRNA), 7SL RNA, vtRNA, and tRNAs from T. brucei. We show, using HydraPsiSeq and RiboMeth-seq, that the Ψ on C/D snoRNA guiding 2'-O-methylation increased the efficiency of the guided modification on its target, rRNA. We found differential levels of Ψs on these ncRNAs in the two life stages (insect host and mammalian host) of the parasite. Furthermore, tRNA isoform abundance and Ψ modifications were characterized in these two life stages, demonstrating stage-specific regulation. We conclude that the differential Ψ modifications identified here may contribute to modulating the function of non-coding (nc)RNAs involved in rRNA processing, rRNA modification, protein synthesis, and protein translocation during cycling of the parasite between its two hosts.
-
(2022) Nature Human Behaviour. 6, 2, p. 193-206 Abstract
The greatest hope for a return to normalcy following the COVID-19 pandemic is worldwide vaccination. Yet, a relaxation of social distancing that allows increased transmissibility, coupled with selection pressure due to vaccination, will probably lead to the emergence of vaccine resistance. We analyse the evolutionary dynamics of COVID-19 in the presence of dynamic contact reduction and in response to vaccination. We use infection and vaccination data from six different countries. We show that under slow vaccination, resistance is very likely to appear even if social distancing is maintained. Under fast vaccination, the emergence of mutants can be prevented if social distancing is maintained during vaccination. We analyse multiple human factors that affect the evolutionary potential of the virus, including the extent of dynamic social distancing, vaccination campaigns, vaccine design, boosters and vaccine hesitancy. We provide guidelines for policies that aim to minimize the probability of emergence of vaccine-resistant variants.
-
(2022) Default journal. p. 1-6 Abstract
What can punctuation tell us about cultural historical trends? Here I analyze the change in frequency of the punctuation marks '?' and '!' in six languages over the last two centuries by a culturomics study of the Google Books online repository. I found that in German, Italian and Spanish the ratio of usage of question marks to exclamation marks sharply declines towards the Second World War, and steadily increases thereafter, whereas in English '?' was always more heavily used. A common trend in all languages is a rise in '?' compared to '!' in the second half of the 20th century. Furthermore, over the last decades, the usage in English of 'Why', 'How' and 'What' (open-ended questions, often with no definitive answer) has tended to increase in frequency more sharply than 'Where', 'When', and 'Who'. I propose that the relative usage of question marks and the type of questions asked may serve as a meaningful dynamic measure of the cultural state of societies. Culturomics is a form of computational lexicology that studies human behavior, language, cultural and historical trends through the quantitative analysis of texts (Michel et al., 2011). A major source of culturomics data has been Google Books, a Google service that allows searching the full texts of books and magazines scanned and converted to digitized text and stored in a database. This database contains books printed in nine different languages from the year 1500, though predominantly from 1800 until 2009. The main output of an inquiry is a temporal representation profile of each word (the so-called "n-gram"), depicting the number of its appearances in the corpus of scanned books in any given year, normalized to the total number of words (or n-grams) scanned during that year. The Ngram Viewer enables browsing of the data online.
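The normalization described in this abstract (appearances of an n-gram in a given year divided by the total number of words scanned that year) and the question-to-exclamation ratio can be sketched as follows. This is a minimal illustration only: the function names and the example counts are hypothetical and are not taken from the study or from the Google Books data.

```python
def normalized_frequency(occurrences: int, total_tokens: int) -> float:
    """Occurrences of an n-gram in one year divided by all tokens scanned that year."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return occurrences / total_tokens


def mark_ratio(question_count: int, exclamation_count: int, total_tokens: int) -> float:
    """Ratio of the normalized '?' frequency to the normalized '!' frequency for one year."""
    q = normalized_frequency(question_count, total_tokens)
    e = normalized_frequency(exclamation_count, total_tokens)
    if e == 0:
        raise ValueError("no exclamation marks counted for this year")
    return q / e


# Hypothetical counts for a single year (illustrative values, not real data).
example_ratio = mark_ratio(question_count=300, exclamation_count=150, total_tokens=1_000_000)
```

Within a single year the total cancels out of the ratio, so the normalization matters mainly when comparing one mark's frequency across years.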
2021
-
(2021) Nature Cancer. 2, 10, p. 1055-1070 Abstract
Stochastic transition of cancer cells between drug-sensitive and drug-tolerant persister phenotypes has been proposed to play a key role in non-genetic resistance to therapy. Yet, we show here that cancer cells actually possess a highly stable inherited chance to persist (CTP) during therapy. This CTP is non-stochastic, determined pre-treatment and has a unimodal distribution ranging from 0 to almost 100%. Notably, CTP is drug specific. We found that differential serine/threonine phosphorylation of the insulin receptor substrate 1 (IRS1) protein determines the CTP of lung and of head and neck cancer cells under epidermal growth factor receptor inhibition, both in vitro and in vivo. Indeed, the first-in-class IRS1 inhibitor NT219 was highly synergistic with anti-epidermal growth factor receptor therapy across multiple in vitro and in vivo models. Elucidation of drug-specific mechanisms that determine the degree and stability of cellular CTP may establish a framework for the elimination of cancer persisters, using new rationally designed drug combinations.
-
(2021) Proceedings of the National Academy of Sciences - PNAS. 118, 42, e210655611. Abstract
The tRNA pool determines the efficiency, throughput, and accuracy of translation. Previous studies have identified dynamic changes in the tRNA (transfer RNA) supply and mRNA (messenger RNA) demand during cancerous proliferation. Yet dynamic changes may also occur during physiologically normal proliferation, and these are less well characterized. We examined the tRNA and mRNA pools of T cells during their vigorous proliferation and differentiation upon triggering their antigen receptor. We observed a global signature of switch in demand for codons at the early proliferation phase of the response, accompanied by corresponding changes in tRNA expression levels. In the later phase, upon differentiation, the response of the tRNA pool relaxed back to the basal level, potentially restraining excessive proliferation. Sequencing of tRNAs allowed us to evaluate their diverse base-modifications. We found that two types of tRNA modifications, wybutosine and ms2t6A, are reduced dramatically during T cell activation. These modifications occur in the anticodon loops of two tRNAs that decode “slippery codons,” which are prone to ribosomal frameshifting. Attenuation of these frameshift-protective modifications is expected to increase the potential for proteome-wide frameshifting during T cell proliferation. Indeed, human cell lines deleted of a wybutosine writer showed increased ribosomal frameshifting, as detected with an HIV gag-pol frameshifting site reporter. These results may explain HIV's specific tropism toward proliferating T cells since it requires ribosomal frameshift exactly on the corresponding codon for infection. The changes in tRNA expression and modifications uncover a layer of translation regulation during T cell proliferation and expose a potential tradeoff between cellular growth and translation fidelity.
-
(2021) PLoS Genetics. 17, 9, e1009805. Abstract
RNA splicing is a key process in eukaryotic gene expression, in which an intron is spliced out of a pre-mRNA molecule to eventually produce a mature mRNA. Most intron-containing genes are constitutively spliced, hence efficient splicing of an intron is crucial for efficient regulation of gene expression. Here we use a large synthetic oligo library of ~20,000 variants to explore how different intronic sequence features affect splicing efficiency and mRNA expression levels in S. cerevisiae. Introns are defined by three functional sites, the 5' donor site, the branch site, and the 3' acceptor site. Using a combinatorial design of synthetic introns, we demonstrate how non-consensus splice site sequences in each of these sites affect splicing efficiency. We then show that S. cerevisiae splicing machinery tends to select alternative 3' splice sites downstream of the original site, and we suggest that this tendency created a selective pressure, leading to the avoidance of cryptic splice site motifs near introns' 3' ends. We further use natural intronic sequences from other yeast species, whose splicing machineries have diverged to various extents, to show how intron architectures in the various species have been adapted to the organism's splicing machinery. We suggest that the observed tendency for cryptic splicing is a result of a loss of a specific splicing factor, U2AF1. Lastly, we show that synthetic sequences containing two introns give rise to alternative RNA isoforms in S. cerevisiae, demonstrating that merely a synthetic fusion of two introns might suffice to facilitate alternative splicing in yeast. Our study reveals novel mechanisms by which introns are shaped in evolution to allow cells to regulate their transcriptome. In addition, it provides a valuable resource to study the regulation of constitutive and alternative splicing in a model organism.
2020
-
(2020) eLife. 9, e58461. Abstract
Different subsets of the tRNA pool in human cells are expressed in different cellular conditions. The 'proliferation-tRNAs' are induced upon normal and cancerous cell division, while the 'differentiation-tRNAs' are active in non-dividing, differentiated cells. Here we examine the essentiality of the various tRNAs upon cellular growth and arrest. We established a CRISPR-based editing procedure with sgRNAs that each target a tRNA family. We measured tRNA essentiality for cellular growth and found that most proliferation-tRNAs are essential compared to differentiation-tRNAs in rapidly growing cell lines. Yet in more slowly dividing lines, the differentiation-tRNAs were more essential. In addition, we measured the essentiality of each tRNA family upon response to cell cycle arresting signals. Here we detected a more complex behavior with both proliferation-tRNAs and differentiation-tRNAs showing various levels of essentiality. These results provide the so-far most comprehensive functional characterization of human tRNAs with intricate roles in various cellular states.
-
(2020) Genetics. 216, 2, p. 543-558 Abstract
Tracing evolutionary processes that lead to fixation of genomic variation in wild bacterial populations is a prime challenge in molecular evolution. In particular, the relative contribution of horizontal gene transfer (HGT) vs. de novo mutations during adaptation to a new environment is poorly understood. To gain a better understanding of the dynamics of HGT and its effect on adaptation, we subjected several populations of competent Bacillus subtilis to a serial dilution evolution on a high-salt-containing medium, either with or without foreign DNA from diverse pre-adapted or naturally salt tolerant species. Following 504 generations of evolution, all populations improved growth yield on the medium. Sequencing of evolved populations revealed extensive acquisition of foreign DNA from close Bacillus donors but not from more remote donors. HGT occurred in bursts, whereby a single bacterial cell appears to have acquired dozens of fragments at once. In the largest burst, close to 2% of the genome has been replaced by HGT. Acquired segments tend to be clustered in integration hotspots. Other than HGT, genomes also acquired spontaneous mutations. Many of these mutations occurred within, and seem to alter, the sequence of flagellar proteins. Finally, we show that, while some HGT fragments could be neutral, others are adaptive and accelerate evolution.
-
(2020) Nature Communications. 11, 1, 3061. Abstract
Programmed ribosomal frameshifting (PRF) is the controlled slippage of the translating ribosome to an alternative frame. This process is widely employed by human viruses such as HIV and SARS coronavirus and is critical for their replication. Here, we developed a high-throughput approach to assess the frameshifting potential of a sequence. We designed and tested >12,000 sequences based on 15 viral and human PRF events, allowing us to systematically dissect the rules governing ribosomal frameshifting and discover novel regulatory inputs based on amino acid properties and tRNA availability. We assessed the natural variation in HIV gag-pol frameshifting rates by testing >500 clinical isolates and identified subtype-specific differences and associations between viral load in patients and the optimality of PRF rates. We devised computational models that accurately predict frameshifting potential and frameshifting rates, including subtle differences between HIV isolates. This approach can contribute to the development of antiviral agents targeting PRF.
2019
-
(2019) Nature Communications. 10, 4572. Abstract
Most human genes are alternatively spliced, allowing for a large expansion of the proteome. The multitude of regulatory inputs to splicing limits the potential to infer general principles from investigating native sequences. Here, we create a rationally designed library of >32,000 splicing events to dissect the complexity of splicing regulation through systematic sequence alterations. Measuring RNA and protein splice isoforms allows us to investigate both cause and effect of splicing decisions, quantify diverse regulatory inputs and accurately predict (R² = 0.73-0.85) isoform ratios from sequence and secondary structure. By profiling individual cells, we measure the cell-to-cell variability of splicing decisions and show that it can be encoded in the DNA and influenced by regulatory inputs, opening the door for a novel, single-cell perspective on splicing regulation.
-
(2019) PLoS Biology. 17, 8, e3000423. Abstract
Splicing expands, reshapes, and regulates the transcriptome of eukaryotic organisms. Despite its importance, key questions remain unanswered, including the following: Can splicing evolve when organisms adapt to new challenges? How does evolution optimize inefficiency of introns' splicing and of the splicing machinery? To explore these questions, we evolved yeast cells that were engineered to contain an inefficiently spliced intron inside a gene whose protein product was under selection for an increased expression level. We identified a combination of mutations in Cis (within the gene of interest) and in Trans (in mRNA-maturation machinery). Surprisingly, the mutations in Cis resided outside of known intronic functional sites and improved the intron's splicing efficiency potentially by easing tight mRNA structures. One of these mutations hampered a protein's domain that was not under selection, demonstrating the evolutionary flexibility of multi-domain proteins as one domain functionality was improved at the expense of the other domain. The Trans adaptations resided in two proteins, Npl3 and Gbp2, that bind pre-mRNAs and are central to their maturation. Interestingly, these mutations either increased or decreased the affinity of these proteins to mRNA, presumably allowing faster spliceosome recruitment or increased time before degradation of the pre-mRNAs, respectively. Altogether, our work reveals various mechanistic pathways toward optimizations of intron splicing to ultimately adapt gene expression patterns to novel demands.
-
(2019) Molecular Cell. 75, 3, p. 427-441 Abstract
The translation machinery and the genes it decodes co-evolved to achieve production throughput and accuracy. Nonetheless, translation errors are frequent, and they affect physiology and protein evolution. Mapping translation errors in proteomes and understanding their causes is hindered by lack of a proteome-wide experimental methodology. We present the first methodology for systematic detection and quantification of errors in entire proteomes. Following proteome mass spectrometry, we identify, in E. coli and yeast, peptides whose mass indicates specific amino acid substitutions. Most substitutions result from codon-anticodon mispairing. Errors occur at sites that evolve rapidly and that minimally affect energetic stability, indicating selection for high translation fidelity. Ribosome density data show that errors occur at sites where ribosome velocity is higher, demonstrating a trade-off between speed and accuracy. Treating bacteria with an aminoglycoside antibiotic or deprivation of specific amino acids resulted in particular patterns of errors. These results reveal a mechanistic and evolutionary basis for translation fidelity.
-
(2019) PLoS Genetics. 15, 7, e1008248. Abstract
The localization of mRNAs encoding secreted/membrane proteins (mSMPs) to the endoplasmic reticulum (ER) likely facilitates the co-translational translocation of secreted proteins. However, studies have shown that mSMP recruitment to the ER in eukaryotes can occur in a manner that is independent of the ribosome, translational control, and the signal recognition particle, although the mechanism remains largely unknown. Here, we identify a cis-acting RNA sequence motif that enhances mSMP localization to the ER and appears to increase mRNA stability, and both the synthesis and secretion of secretome proteins. Termed SECReTE, for secretion-enhancing cis regulatory targeting element, this motif is enriched in mRNAs encoding secretome proteins translated on the ER in eukaryotes and on the inner membrane of prokaryotes. SECReTE consists of >= 10 nucleotide triplet repeats enriched with pyrimidine (C/U) every third base (i.e. NNY, where N = any nucleotide, Y = pyrimidine) and can be present in the untranslated as well as the coding regions of the mRNA. Synonymous mutations that elevate the SECReTE count in a given mRNA (e.g. SUC2, HSP150, and CCW12) lead to an increase in protein secretion in yeast, while a reduction in count led to less secretion and physiological defects. Moreover, the addition of SECReTE to the 3'UTR of an mRNA for an exogenously expressed protein (e.g. GFP) led to its increased secretion from yeast cells. Thus, SECReTE constitutes a novel RNA motif that facilitates ER-localized mRNA translation and protein secretion.
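The motif definition given in this abstract (at least 10 consecutive NNY triplets, with a pyrimidine at every third base) can be illustrated with a short scanner. This is a minimal sketch under stated assumptions and not the authors' published scoring method: the function name `count_secrete_runs` is hypothetical, each of the three reading frames is scanned independently, T is accepted alongside C and U so that DNA-style sequences also work, and each maximal run of qualifying triplets is counted once.

```python
PYRIMIDINES = frozenset("CUT")  # T accepted so DNA-style input also works (assumption)


def count_secrete_runs(seq: str, min_repeats: int = 10) -> int:
    """Count maximal runs of >= min_repeats consecutive NNY triplets in all three frames."""
    seq = seq.upper()
    total = 0
    for frame in range(3):
        run = 0  # consecutive triplets in this frame whose third base is a pyrimidine
        # Visit the third base of each triplet in this frame.
        for i in range(frame + 2, len(seq), 3):
            if seq[i] in PYRIMIDINES:
                run += 1
            else:
                if run >= min_repeats:
                    total += 1
                run = 0
        if run >= min_repeats:  # a run that reaches the end of the sequence
            total += 1
    return total


# Hypothetical sequences, for illustration only.
one_run = count_secrete_runs("AAC" * 10)                   # one qualifying run
two_runs = count_secrete_runs("AAC" * 10 + "AAG" + "GGU" * 10)  # two runs split by a purine
```

Counting maximal runs, rather than every overlapping window of 10 triplets, is a design choice made here for simplicity; a different counting convention would give different totals for long pyrimidine-rich stretches.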
-
(2019) PLoS Biology. 17, 6, Abstract
Technological breakthroughs in the past two decades have ushered in a new era of biomedical research, turning it into an information-rich and technology-driven science. This scientific revolution, though evident to the research community, remains opaque to nonacademic audiences. Such knowledge gaps are likely to persist without revised strategies for science education and public outreach. To address this challenge, we developed a unique outreach program to actively engage over 100 high-school students in the investigation of multidrug-resistant bacteria. Our program uses robotic automation and interactive web-based tools to bridge geographical distances, scale up the number of participants, and reduce overall cost. Students and teachers demonstrated high engagement and interest throughout the project and valued its unique approach. This educational model can be leveraged to advance the massive open online courses movement that is already transforming science education.
-
(2019) PLoS Biology. 17, 3, e3000182. Abstract
In experimental evolution, scientists evolve organisms in the lab, typically by challenging them to new environmental conditions. How best to evolve a desired trait? Should the challenge be applied abruptly, gradually, periodically, sporadically? Should one apply chemical mutagenesis, and do strains with high innate mutation rate evolve faster? What are ideal population sizes of evolving populations? There are endless strategies, beyond those that can be exposed by individual labs. We therefore arranged a community challenge, Evolthon, in which students and scientists from different labs were asked to evolve Escherichia coli or Saccharomyces cerevisiae for an abiotic stress: low temperature. About 30 participants from around the world explored diverse environmental and genetic regimes of evolution. After a period of evolution in each lab, all strains of each species were competed with one another. In yeast, the most successful strategies were those that used mating, underscoring the importance of sex in evolution. In bacteria, the fittest strain used a strategy based on exploration of different mutation rates. Different strategies displayed variable levels of performance and stability across additional challenges and conditions. This study therefore uncovers principles of effective experimental evolutionary regimens and might prove useful also for biotechnological developments of new strains and for understanding natural strategies in evolutionary arms races between species. Evolthon constitutes a model for community-based scientific exploration that encourages creativity and cooperation.
-
(2019) Cell Stem Cell. 24, 2, p. 328-341.e9 Abstract
The epigenetic dynamics of induced pluripotent stem cell (iPSC) reprogramming in correctly reprogrammed cells at high resolution and throughout the entire process remain largely undefined. Here, we characterize conversion of mouse fibroblasts into iPSCs using Gatad2a-Mbd3/NuRD-depleted and highly efficient reprogramming systems. Unbiased high-resolution profiling of dynamic changes in levels of gene expression, chromatin engagement, DNA accessibility, and DNA methylation were obtained. We identified two distinct and synergistic transcriptional modules that dominate successful reprogramming, which are associated with cell identity and biosynthetic genes. The pluripotency module is governed by dynamic alterations in epigenetic modifications to promoters and binding by Oct4, Sox2, and Klf4, but not Myc. Early DNA demethylation at certain enhancers prospectively marks cells fated to reprogram. Myc activity drives expression of the essential biosynthetic module and is associated with optimized changes in tRNA codon usage. Our functional validations highlight interweaved epigenetic- and Myc-governed essential reconfigurations that rapidly commission and propel deterministic reprogramming toward naive pluripotency.
2018
-
(2018) Annual Review of Cell and Developmental Biology. 34, p. 239-264 Abstract
The pool of transfer RNA (tRNA) molecules in cells allows the ribosome to decode genetic information. This repertoire of molecular decoders is positioned in the crossroad of the genome, the transcriptome, and the proteome. Omics and systems biology now allow scientists to explore the entire repertoire of tRNAs of many organisms, revealing basic exciting biology. The tRNA gene set of hundreds of species is now characterized, in addition to the tRNA genes of organelles and viruses. Genes encoding tRNAs for certain anticodon types appear in dozens of copies in a genome, while others are universally absent from any genome. Transcriptome measurement of tRNAs is challenging, but in recent years new technologies have allowed researchers to determine the dynamic expression patterns of tRNAs. These advances reveal that availability of ready-to-translate tRNA molecules is highly controlled by several transcriptional and posttranscriptional regulatory processes. This regulation shapes the proteome according to the cellular state. The tRNA pool profoundly impacts many aspects of cellular and organismal life, including protein expression level, translation accuracy, adequacy of folding, and even mRNA stability. As a result, the shape of the tRNA pool affects organismal health and may participate in causing conditions such as cancer and neurological conditions.
-
(2018) RNA Biology. 15, 7, p. 863-867 Abstract
DNA harbors the blueprint for life. However, the instructions stored in the DNA could be altered at the RNA level before they are executed. One of these processes is RNA editing, which was shown to modify RNA sequences in many organisms. The most abundant modification is the deamination of adenosine (A) into inosine (I). In turn, inosine can be identified as a guanosine (G) by the ribosome and other cellular machineries such as reverse transcriptase. In multicellular organisms, enzymes from the ADAR (adenosine deaminase acting on RNA) family mediate RNA editing in mRNA, whereas enzymes from the ADAT family mediate A-to-I editing on tRNAs. In bacteria however, until recently, only one editing site was described, in tRNA(Arg), but never in mRNA. The tRNA site was shown to be modified by tadA (tRNA specific adenosine deaminase) which is believed to be the ancestral enzyme for the RNA editing family of enzymes. In our recent work, we have shown for the first time, editing on multiple sites in bacterial mRNAs and identified tadA as the enzyme responsible for this editing activity. Focusing on one of the identified targets - the self-killing toxin hokB, we found that editing is physiologically regulated and that it increases protein activity. Here we discuss possible modes of regulation on hokB editing, potential roles of RNA editing in bacteria, possible implications, and future research directions.
-
(2018) Proceedings of the National Academy of Sciences of the United States of America-Biological Sciences. 115, 21, p. E4940-E4949 Abstract
Although the genetic code is redundant, synonymous codons for the same amino acid are not used with equal frequencies in genomes, a phenomenon termed "codon usage bias." Previous studies have demonstrated that synonymous changes in a coding sequence can exert significant cis effects on the gene's expression level. However, whether the codon composition of a gene can also affect the translation efficiency of other genes has not been thoroughly explored. To study how codon usage bias influences the cellular economy of translation, we massively converted abundant codons to their rare synonymous counterpart in several highly expressed genes in Escherichia coli. This perturbation reduces both the cellular fitness and the translation efficiency of genes that have high initiation rates and are naturally enriched with the manipulated codon, in agreement with theoretical predictions. Interestingly, we could alleviate the observed phenotypes by increasing the supply of the tRNA for the highly demanded codon, thus demonstrating that the codon usage of highly expressed genes was selected in evolution to maintain the efficiency of global protein translation.
-
(2018) Cell. 172, 3, p. 391-392 Abstract
In the era of genome engineering, a new study returns to classical genetics to decipher genotype-phenotype relationships in unprecedented throughput and with unprecedented accuracy. Capitalizing on natural variation in yeast strains and frequent meiotic recombination, She and Jarosz (2018) dissect and map to nucleotide resolution, simple and complex determinants of diverse phenotypic traits.
2017
-
(2017) Genome Research. 27, 10, p. 1696-1703 Abstract
Adenosine (A) to inosine (I) RNA editing is widespread in eukaryotes. In prokaryotes, however, A-to-I RNA editing was only reported to occur in tRNAs but not in protein-coding genes. By comparing DNA and RNA sequences of Escherichia coli, we show for the first time that A-to-I editing occurs also in prokaryotic mRNAs and has the potential to affect the translated proteins and cell physiology. We found 15 novel A-to-I editing events, of which 12 occurred within known protein-coding genes where they always recode a tyrosine (TAC) into a cysteine (TGC) codon. Furthermore, we identified the tRNA-specific adenosine deaminase (tadA) as the editing enzyme of all these editing sites, thus making it the first identified RNA editing enzyme that modifies both tRNAs and mRNAs. Interestingly, several of the editing targets are self-killing toxins that belong to evolutionarily conserved toxin-antitoxin pairs. We focused on hokB, a toxin that confers antibiotic tolerance by growth inhibition, as it demonstrated the highest level of such mRNA editing. We identified a correlated mutation pattern between the edited and a DNA hard-coded Cys residue positions in the toxin and demonstrated that RNA editing occurs in hokB in two additional bacterial species. Thus, not only the toxin is evolutionarily conserved but also the editing itself within the toxin is. Finally, we found that RNA editing in hokB increases as a function of cell density and enhances its toxicity. Our work thus demonstrates the occurrence, regulation, and functional consequences of RNA editing in bacteria.
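The DNA-versus-RNA comparison described in this abstract can be sketched in miniature. Because inosine is read as guanosine, an edited site appears as an A in the DNA and a G in the sequenced RNA. The function name `find_a_to_g_sites` is hypothetical, and the sketch assumes two pre-aligned, equal-length, in-frame coding sequences; real analyses work from sequencing reads and need statistical filtering that is not shown here.

```python
def find_a_to_g_sites(dna: str, rna_as_cdna: str):
    """Return (position, dna_codon, rna_codon) for every A->G mismatch.

    Hypothetical helper: both inputs are assumed to be aligned, of equal
    length, and to start at the first base of a codon.
    """
    dna = dna.upper()
    rna_as_cdna = rna_as_cdna.upper()
    if len(dna) != len(rna_as_cdna):
        raise ValueError("sequences must be aligned and of equal length")
    hits = []
    for pos, (d, r) in enumerate(zip(dna, rna_as_cdna)):
        if d == "A" and r == "G":
            start = pos - pos % 3  # first base of the codon containing pos
            hits.append((pos, dna[start:start + 3], rna_as_cdna[start:start + 3]))
    return hits
```

For example, `find_a_to_g_sites("ATGTACGGT", "ATGTGCGGT")` reports a single site whose codon changes from TAC (tyrosine) to TGC (cysteine), the recoding pattern the abstract reports.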
-
(2017) Molecular Cell. 65, 1, p. 142-153 Abstract
Gene expression burdens cells by consuming resources and energy. While numerous studies have investigated regulation of expression level, little is known about gene design elements that govern expression costs. Here, we ask how cells minimize production costs while maintaining a given protein expression level and whether there are gene architectures that optimize this process. We measured fitness of ~14,000 E. coli strains, each expressing a reporter gene with a unique 5' architecture. By comparing cost-effective and ineffective architectures, we found that cost per protein molecule could be minimized by lowering transcription levels, regulating translation speeds, and utilizing amino acids that are cheap to synthesize and that are less hydrophobic. We then examined natural E. coli genes and found that highly expressed genes have evolved more forcefully to minimize costs associated with their expression. Our study thus elucidates gene design elements that improve the economy of protein expression in natural and heterologous systems.
2016
-
(2016) PLoS Biology. 14, 9, e1002557. Abstract
The mitochondrial ribosome, which translates all mitochondrial DNA (mtDNA)-encoded proteins, should be tightly regulated pre- and post-transcriptionally. Recently, we found RNA-DNA differences (RDDs) at human mitochondrial 16S (large) rRNA position 947 that were indicative of post-transcriptional modification. Here, we show that these 16S rRNA RDDs result from a 1-methyladenosine (m1A) modification introduced by TRMT61B, thus being the first vertebrate methyltransferase that modifies both tRNA and rRNAs. m1A947 is conserved in humans and all vertebrates having adenine at the corresponding mtDNA position (90% of vertebrates). However, this mtDNA base is a thymine in 10% of the vertebrates and a guanine in the 23S rRNA of 95% of bacteria, suggesting alternative evolutionary solutions. m1A, uridine, or guanine may stabilize the local structure of mitochondrial and bacterial ribosomes. Experimental assessment of genome-edited Escherichia coli showed that unmodified adenine caused impaired protein synthesis and growth. Our findings revealed a conserved mechanism of rRNA modification that has been selected instead of DNA mutations to enable proper mitochondrial ribosome function.
-
(2016) PLoS Genetics. 12, 8, e1006264. Abstract
Codon usage bias affects protein translation because tRNAs that recognize synonymous codons differ in their abundance. Although the current dogma states that tRNA expression is exclusively regulated by intrinsic control elements (A- and B-box sequences), we revealed, using a reporter that monitors the levels of individual tRNA genes in Caenorhabditis elegans, that eight tryptophan tRNA genes, 100% identical in sequence, are expressed in different tissues and change their expression dynamically. Furthermore, the expression levels of the sup-7 tRNA gene at day 6 were found to predict the animal's lifespan. We discovered that the expression of tRNAs that reside within introns of protein-coding genes is affected by the host gene's promoter. Pairing between specific Pol II genes and the tRNAs that are contained in their introns is most likely adaptive, since a genome-wide analysis revealed that the presence of specific intronic tRNAs within specific orthologous genes is conserved across Caenorhabditis species.
-
(2016) BMC Genomics. 17, 1, 674. Abstract
Background: Cells constantly adapt to changes in their environment. When environment shifts between conditions that were previously encountered during the course of evolution, evolutionary-programmed responses are possible. Cells, however, may also encounter a new environment to which a novel response is required. To characterize the first steps in adaptation to a novel condition, we studied budding yeast growth on xylulose, a sugar that is very rarely found in the wild. Results: We previously reported that growth on xylulose induces the expression of amino acid biosynthesis genes in multiple natural yeast isolates. This induction occurs despite the presence of amino acids in the growth medium and is a unique response to xylulose, not triggered by naturally available carbon sources. Propagating these strains for ~300 generations on xylulose significantly improved their growth rate. Notably, the most significant change in gene expression was the loss of amino acid biosynthesis gene induction. Furthermore, the reduction in amino-acid biosynthesis gene expression on xylulose was tightly correlated with the improvement in growth rate, suggesting that internal depletion of amino-acids presented a major bottleneck limiting growth in xylulose. Conclusions: We discuss the possible implications of our results for explaining how cells maintain the balance between supply and demand of amino acids during growth in evolutionary 'familiar' vs. 'novel' conditions.
-
(2016) eLife. 5, APRIL2016, e14424. Abstract
Correlation does not imply causation. If two variables, say A and B, are correlated, it could be because A causes B, or that B causes A, or because a third factor affects them both. We suggest that in many cases in biology, the causal link might be bi-directional: A causes B through a fast-acting physiological process, while B causes A through a slowly accumulating evolutionary process. Furthermore, many trained biologists tend to consistently focus at first on the fast-acting direction, and overlook the slower process in the opposite direction. We analyse several examples from modern biology that demonstrate this bias (codon usage optimality and gene expression, gene duplication and genetic dispensability, stem cell division and cancer risk, and the microbiome and host metabolism) and also discuss an example from linguistics. These examples demonstrate mutual effects between the fast physiological processes and the slow evolutionary ones. We believe that building awareness of inference biases among biologists who tend to prefer one causal direction over another could improve scientific reasoning.
-
(2016) PLoS Genetics. 12, 2, e1005879. Abstract
Most mammalian genes often feature alternative polyadenylation (APA) sites and hence diverse 3'UTR lengths. Proliferating cells were reported to favor APA sites that result in shorter 3'UTRs. One consequence of such shortening is escape of mRNAs from targeting by microRNAs (miRNAs) whose binding sites are eliminated. Such a mechanism might provide proliferation-related genes with an expression gain during normal or cancerous proliferation. Notably, miRNA sites tend to be more active when located near both ends of the 3'UTR compared to those located more centrally. Accordingly, miRNA sites located near the center of the full 3'UTR might become more active upon 3'UTR shortening. To address this conjecture we performed 3' sequencing to determine the 3' ends of all human UTRs in several cell lines. Remarkably, we found that conserved miRNA binding sites are preferentially enriched immediately upstream to APA sites, and this enrichment is more prominent in pro-differentiation/anti-proliferative genes. Binding sites of the miR17-92 cluster, upregulated in rapidly proliferating cells, are particularly enriched just upstream to APA sites, presumably conferring stronger inhibitory activity upon shortening. Thus 3'UTR shortening appears not only to enable escape from inhibition of growth promoting genes but also to potentiate repression of anti-proliferative genes.
2015
-
(2015) Cell. 163, 3, p. 549-559 Abstract
Adaptation is the process in which organisms improve their fitness by changing their phenotype using genetic or non-genetic mechanisms. The adaptation toolbox consists of varied molecular and genetic means that we posit span an almost continuous "adaptation spectrum." Different adaptations are characterized by the time needed for organisms to attain them and by their duration. We suggest that organisms often adapt by progressing the adaptation spectrum, starting with rapidly attained physiological and epigenetic adaptations and culminating with slower long-lasting genetic ones. A tantalizing possibility is that earlier adaptations facilitate realization of later ones.
-
(2015) Biology Direct. 10, 1, 34. Abstract
"Which came first, the Chicken or the Egg?" We suggest this question is not a paradox. The Modern Synthesis envisions speciation through genetic changes in germ cells via random mutations, an "Egg first" scenario, but perhaps epigenetic inheritance mechanisms can transmit adaptive changes initiated in the soma ("Chicken first"). Reviewers: The article was reviewed by Dr. Eugene Koonin, Dr. Itai Yanai, Dr. Laura Landweber.
2014
-
(2014) eLife. 3, August2014, p. 1-19 03440. Abstract
In all living organisms, ribosomes translating membrane proteins are targeted to membrane translocons early in translation, by the ubiquitous signal recognition particle (SRP) system. In eukaryotes, the SRP Alu domain arrests translation elongation of membrane proteins until targeting is complete. Curiously, however, the Alu domain is lacking in most eubacteria. In this study, by analyzing genome-wide data on translation rates, we identified a potential compensatory mechanism in E. coli that serves to slow down the translation during membrane protein targeting. The underlying mechanism is likely programmed into the coding sequence, where Shine-Dalgarno-like elements trigger elongation pauses at strategic positions during the early stages of translation. We provide experimental evidence that slow translation during targeting improves membrane protein production fidelity, as it correlates with better folding of overexpressed membrane proteins. Thus, slow elongation is important for membrane protein targeting in E. coli, which utilizes mechanisms different from the eukaryotic one to control the translation speed.
-
(2014) Journal of Molecular Cell Biology. 6, 3, p. 192-197 Abstract
p53 is a transcription factor that governs numerous stress response pathways within the cell. Maintaining the right levels of p53 is crucial for cell survival and proper cellular homeostasis. The tight regulation of p53 involves many cellular components, most notably its major negative regulators Mdm2 and Mdm4, which maintain p53 protein amount and activity in tight check. microRNAs (miRNAs) are small non-coding RNAs that target specific mRNAs to translational arrest and degradation. miRNAs are also key components of the normal p53 pathway, joining forces with Mdm2 and Mdm4 to maintain proper p53 activity. Here we review the current knowledge of miRNAs targeting Mdm2 and Mdm4, and their importance in different tissues and in pathological states such as cancer. In addition, we address the role of Alu sequences, highly abundant retroelements spread throughout the human genome, and their impact on gene regulation via the miRNA machinery. Alus occupy a significant portion of genes' 3'UTR, and as such they have the potential to impact mRNA regulation. Since Alus are primate-specific, they introduce a new regulatory layer into primate genomes. Alus can influence and alter gene regulation, creating primate-specific cancer-preventive regulatory mechanisms to sustain the transition to longer life span in primates. We review the possible influence of Alu sequences on miRNA functionality in general and specifically within the p53 network.
-
(2014) Developmental Neurobiology. 74, 3, p. 365-381 Abstract
RNA localization is a regulatory mechanism that is conserved from bacteria to mammals. Yet, little is known about the mechanism and the logic that govern the distribution of RNA transcripts within the cell. Here, we present a novel organ culture system, which enables the isolation of RNA specifically from NGF-dependent re-growing peripheral axons of mouse embryo sensory neurons. In combination with massive parallel sequencing technology, we determine the subcellular localization of most transcripts in the transcriptome. We found that the axon is enriched in mRNAs that encode secreted proteins, transcription factors, and the translation machinery. In contrast, the axon was largely depleted from mRNAs encoding transmembrane proteins, a particularly interesting finding, since many of these gene products are specifically expressed in the tip of the axon at the protein level. Comparison of the mitochondrial mRNAs encoded in the nucleus with those encoded in the mitochondria uncovered a completely different localization pattern, with the latter much enriched in the axon fraction. This discovery is intriguing since the protein products encoded by the nuclear and mitochondrial genome form large co-complexes. Finally, focusing on alternative splice variants that are specific to axonal fractions, we find short sequence motifs that are enriched in the axonal transcriptome. Together our findings shed light on the extensive role of RNA localization and its characteristics.
-
(2014) Cell Death and Differentiation. 21, 2, p. 302-309 Abstract
The p53 pathway is pivotal in tumor suppression. Cellular p53 activity is subject to tight regulation, in which the two related proteins Mdm2 and Mdm4 have major roles. The delicate interplay between the levels of Mdm2, Mdm4 and p53 is crucial for maintaining proper cellular homeostasis. microRNAs (miRNAs) are short non-coding RNAs that downregulate the level and translatability of specific target mRNAs. We report that miR-661, a primate-specific miRNA, can target both Mdm2 and Mdm4 mRNA in a cell type-dependent manner. miR-661 interacts with Mdm2 and Mdm4 RNA within living cells. The inhibitory effect of miR-661 is more prevalent on Mdm2 than on Mdm4. Interestingly, the predicted miR-661 targets in both mRNAs reside mainly within Alu elements, suggesting a primate-specific mechanism for regulatory diversification during evolution. Downregulation of Mdm2 and Mdm4 by miR-661 augments p53 activity and inhibits cell cycle progression in p53-proficient cells. Correspondingly, low miR-661 expression correlates with bad outcome in breast cancers that typically express wild-type p53. In contrast, the miR-661 locus tends to be amplified in tumors harboring p53 mutations, and miR-661 promotes migration of cells derived from such tumors. Thus, miR-661 may either suppress or promote cancer aggressiveness, depending on p53 status.
-
(2014) Cell. 158, 6, p. 1281-1292 Abstract
A dichotomous choice for metazoan cells is between proliferation and differentiation. Measuring tRNA pools in various cell types, we found two distinct subsets, one that is induced in proliferating cells, and repressed otherwise, and another with the opposite signature. Correspondingly, we found that genes serving cell-autonomous functions and genes involved in multicellularity obey distinct codon usage. Proliferation-induced and differentiation-induced tRNAs often carry anticodons that correspond to the codons enriched among the cell-autonomous and the multicellularity genes, respectively. Because mRNAs of cell-autonomous genes are induced in proliferation and cancer in particular, the concomitant induction of their codon-enriched tRNAs suggests coordination between transcription and translation. Histone modifications indeed change similarly in the vicinity of cell-autonomous genes and their corresponding tRNAs, and in multicellularity genes and their tRNAs, suggesting the existence of transcriptional programs coordinating tRNA supply and demand. Hence, we describe the existence of two distinct translation programs that operate during proliferation and differentiation.
-
(2014) PLoS Genetics. 10, 1, e1004084. Abstract
Deciphering the architecture of the tRNA pool is a prime challenge in translation research, as tRNAs govern the efficiency and accuracy of the process. Towards this challenge, we created a systematic tRNA deletion library in Saccharomyces cerevisiae, aimed at dissecting the specific contribution of each tRNA gene to the tRNA pool and to the cell's fitness. By harnessing this resource, we observed that the majority of tRNA deletions show no appreciable phenotype in rich medium, yet under more challenging conditions, additional phenotypes were observed. Robustness to tRNA gene deletion was often facilitated through extensive backup compensation within and between tRNA families. Interestingly, we found that within tRNA families, genes carrying identical anti-codons can contribute differently to the cellular fitness, suggesting the importance of the genomic surrounding to tRNA expression. Characterization of the transcriptome response to deletions of tRNA genes exposed two disparate patterns: in single-copy families, deletions elicited a stress response; in deletions of genes from multi-copy families, expression of the translation machinery increased. Our results uncover the complex architecture of the tRNA pool and pave the way towards complete understanding of their role in cell physiology.
2013
-
(2013) eLife. 2013, 2, e01339. Abstract
Changes in expression patterns may occur when organisms are presented with new environmental challenges, for example following migration or genetic changes. To elucidate the mechanisms by which the translational machinery adapts to such changes, we perturbed the tRNA pool of Saccharomyces cerevisiae by tRNA gene deletion. We then evolved the deletion strain and observed that the genetic adaptation was recurrently based on a strategic mutation that changed the anticodon of other tRNA genes to match that of the deleted one. Strikingly, a systematic search in hundreds of genomes revealed that anticodon mutations occur throughout the tree of life. We further show that the evolution of the tRNA pool also depends on the need to properly couple translation to protein folding. Together, our observations shed light on the evolution of the tRNA pool, demonstrating that mutation in the anticodons of tRNA genes is a common adaptive mechanism when meeting new translational demands.
-
(2013) Bioinformatics. 29, 7, p. 894-902 Abstract
Motivation: The massive spread of repetitive elements in the human genome presents a substantial challenge to the organism, as such elements may accidentally contain seemingly functional motifs. A striking example is offered by the roughly one million copies of Alu repeats in the genome, of which ∼0.5% reside within genes' untranslated regions (UTRs), presenting ∼30 000 novel potential targets for highly conserved microRNAs (miRNAs). Here, we examine the functionality of miRNA targets within Alu elements in 3'UTRs in the human genome. Results: Using a comprehensive dataset of miRNA overexpression assays, we show that mRNAs with miRNA targets within Alus are significantly less responsive to the miRNA effects compared with mRNAs that have the same targets outside Alus. Using Ago2-binding mRNA profiling, we confirm that the miRNA machinery avoids miRNA targets within Alus, as opposed to the highly efficient binding of targets outside Alus. We propose three features that prevent potential miRNA sites within Alus from being recognized by the miRNA machinery: (i) Alu repeats that contain miRNA targets and genuine functional miRNA targets appear to reside in distinct mutually exclusive territories within 3'UTRs; (ii) Alus have tight secondary structure that may limit access to the miRNA machinery; and (iii) A-to-I editing of Alu-derived mRNA sequences may divert miRNA targets. The combination of these features is proposed to allow toleration of Alu insertions into mRNAs. Nonetheless, a subset of miRNA targets within Alus appears not to possess any of the aforementioned features, and thus may represent cases where Alu insertion in the genome has introduced novel functional miRNA targets.
-
(2013) PLoS Computational Biology. 9, 3, e1002934. Abstract
A full understanding of gene regulation requires an understanding of the contributions that the various regulatory regions have on gene expression. Although it is well established that sequences downstream of the main promoter can affect expression, our understanding of the scale of this effect and how it is encoded in the DNA is limited. Here, to measure the effect of native S. cerevisiae 3' end sequences on expression, we constructed a library of 85 fluorescent reporter strains that differ only in their 3' end region. Notably, despite being driven by the same strong promoter, our library spans a continuous twelve-fold range of expression values. These measurements correlate with endogenous mRNA levels, suggesting that the 3' end contributes to constitutive differences in mRNA levels. We used deep sequencing to map the 3'UTR ends of our strains and show that determination of polyadenylation sites is intrinsic to the local 3' end sequence. Polyadenylation mapping was followed by sequence analysis; we found that increased A/T content upstream of the main polyadenylation site correlates with higher expression, both in the library and genome-wide, suggesting that native genes differ by the encoded efficiency of 3' end processing. Finally, we use single-cell fluorescence measurements, in different promoter activation levels, to show that 3' end sequences modulate protein expression dynamics differently than promoters, by predominantly affecting the size of protein production bursts as opposed to the frequency at which these bursts occur. Altogether, our results lead to a more complete understanding of gene regulation by demonstrating that 3' end regions have a unique and sequence-dependent effect on gene expression.
2012
-
(2012) Proceedings of the National Academy of Sciences of the United States of America. 109, 51, p. 21010-21015 Abstract
Aneuploidy, an abnormal number of chromosomes, is a widespread phenomenon found in unicellulars such as yeast, as well as in plants and in mammalians, especially in cancer. Aneuploidy is a genome-scale aberration that imposes a severe burden on the cell, yet under stressful conditions specific aneuploidies confer a selective advantage. This dual nature of aneuploidy raises the question of whether it can serve as a stable and sustainable evolutionary adaptation. To clarify this, we conducted a set of laboratory evolution experiments in yeast and followed the long-term dynamics of aneuploidy under diverse conditions. Here we show that chromosomal duplications are first acquired as a crude solution to stress, yet only as transient solutions that are eliminated and replaced by more efficient solutions obtained at the individual gene level. These transient dynamics of aneuploidy were repeatedly observed in our laboratory evolution experiments; chromosomal duplications gained under stress were eliminated not only when the stress was relieved, but even if it persisted. Furthermore, when stress was applied gradually rather than abruptly, alternative solutions appear to have emerged, but not aneuploidy. Our findings indicate that chromosomal duplication is a first evolutionary line of defense, that retains survivability under strong and abrupt selective pressures, yet it merely serves as a "quick fix," whereas more refined and sustainable solutions take over. Thus, in the perspective of genome evolution trajectory, aneuploidy is a useful yet short-lived intermediate that facilitates further adaptation.
-
(2012) Nucleic Acids Research. 40, 20, p. 10053-10063 Abstract
Translation of a gene is assumed to be efficient if the supply of the tRNAs that translate it is high. Yet high-abundance tRNAs are often also at high demand since they correspond to preferred codons in genomes. Thus to fully model translational efficiency one must gauge the supply-to-demand ratio of the tRNAs that are required by the transcriptome at a given time. The tRNAs' supply is often approximated by their gene copy number in the genome. Yet neither the demand for each tRNA nor the extent to which its concentration changes across environmental conditions has been extensively examined. Here we compute changes in the codon usage of the transcriptome across different conditions in several organisms by inspecting conventional mRNA expression data. We find recurring dynamics of codon usage in the transcriptome in multiple stressful conditions. In particular, codons that are translated by rare tRNAs become over-represented in the transcriptome in response to stresses. These results raise the possibility that the tRNA pool might dynamically change upon stress to support efficient translation of stress-transcribed genes. Alternatively, stress genes may be typically translated with low efficiency, presumably due to lack of sufficient evolutionary optimization pressure on their codon usage.
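As a rough illustration of the kind of computation this abstract describes, the sketch below weights each gene's codon counts by its mRNA level to obtain transcriptome-wide codon usage in one condition. The function name, the toy sequences, and the expression values are hypothetical and are not taken from the paper; the actual analysis may differ in normalization and data handling.

```python
from collections import Counter

def transcriptome_codon_usage(sequences, expression):
    """Return codon frequencies weighted by mRNA abundance.

    sequences:  dict mapping gene name -> coding sequence (read in triplets)
    expression: dict mapping gene name -> mRNA level in one condition
    """
    totals = Counter()
    for gene, seq in sequences.items():
        level = expression.get(gene, 0.0)  # genes without data contribute nothing
        usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
        for i in range(0, usable, 3):
            totals[seq[i:i + 3]] += level
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {codon: count / grand_total for codon, count in totals.items()}

# Toy data (hypothetical): two genes whose relative expression flips under stress.
seqs = {"geneA": "GCTGCTGCC", "geneB": "GCCGCC"}
unstressed = transcriptome_codon_usage(seqs, {"geneA": 10.0, "geneB": 1.0})
stressed = transcriptome_codon_usage(seqs, {"geneA": 1.0, "geneB": 10.0})
```

Comparing `unstressed` and `stressed` codon by codon shows how a shift in expression alone changes the demand placed on each tRNA, even though no sequence has changed.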
-
(2012) PLoS Computational Biology. 8, 8, e1002644. Abstract
The intrinsic stochasticity of gene expression leads to cell-to-cell variations, noise, in protein abundance. Several processes, including transcription, translation, and degradation of mRNA and proteins, can contribute to these variations. Recent single cell analyses of gene expression in yeast have uncovered a general trend where expression noise scales with protein abundance. This trend is consistent with a stochastic model of gene expression where mRNA copy number follows the random birth and death process. However, some deviations from this basic trend have also been observed, prompting questions about the contribution of gene-specific features to such deviations. For example, recent studies have pointed to the TATA box as a sequence feature that can influence expression noise by facilitating expression bursts. Transcription-originated noise can be potentially further amplified in translation. Therefore, we asked to what extent sequence features known or postulated to accompany translation efficiency can also be associated with an increase in noise strength and, on average, how such an increase compares to the amplification associated with the TATA box. Untangling different components of expression noise is highly nontrivial, as they may be gene or gene-module specific. In particular, focusing on codon usage as one of the sequence features associated with efficient translation, we found that ribosomal genes display a different relationship between expression noise and codon usage as compared to other genes. Within nonribosomal genes we found that high codon usage is correlated with increased noise relative to the average noise of proteins with the same abundance. Interestingly, by projecting the data on a theoretical model of gene expression, we found that the amplification of noise strength associated with codon usage is comparable to that of the TATA box, suggesting that the effect of translation on noise in eukaryotic gene expression might be more prominent than previously appreciated.
-
(2012) EMBO Journal. 31, 6, p. 1350-1363 Abstract
Retrograde axonal injury signalling stimulates cell body responses in lesioned peripheral neurons. The involvement of importins in retrograde transport suggests that transcription factors (TFs) might be directly involved in axonal injury signalling. Here, we show that multiple TFs are found in axons and associate with dynein in axoplasm from injured nerve. Biochemical and functional validation for one TF family establishes that axonal STAT3 is locally translated and activated upon injury, and is transported retrogradely with dynein and importin α5 to modulate survival of peripheral sensory neurons after injury. Hence, retrograde transport of TFs from axonal lesion sites provides a direct link between axon and nucleus.
2011
-
(2011) PLoS Genetics. 7, 9, e1002273. Abstract
Transcriptome dynamics is governed by two opposing processes, mRNA production and degradation. Recent studies found that changes in these processes are frequently coordinated and that the relationship between them shapes transcriptome kinetics. Specifically, when transcription changes are counteracted by changes in mRNA stability, transient fast-relaxing transcriptome kinetics is observed. A possible molecular mechanism underlying such coordinated regulation might lie in two RNA polymerase II (Pol II) subunits, Rpb4 and Rpb7, which are recruited to mRNAs during transcription and later affect their degradation in the cytoplasm. Here we used a yeast strain carrying a mutant Pol II which poorly recruits these subunits. We show that this mutant strain is impaired in its ability to modulate mRNA stability in response to stress. The normal negative coordinated regulation is lost in the mutant, resulting in abnormal transcriptome profiles both with respect to magnitude and kinetics of responses. These results reveal an important role for Pol II in regulation of both mRNA synthesis and degradation, and also in coordinating between them. We propose a simple model for production-degradation coupling that accounts for our observations. The model shows how a simple manipulation of the rates of co-transcriptional mRNA imprinting by Pol II may govern genome-wide transcriptome kinetics in response to environmental changes.
-
(2011) Nucleic Acids Research. 39, 14, p. 6016-6028 Abstract
Effective translation of the viral genome during the infection cycle most likely enhances its fitness. In this study, we reveal two different strategies employed by cyanophages, viruses infecting cyanobacteria, to enhance their translation efficiency. Cyanophages of the T7-like Podoviridae family adjust their GC content and codon usage to those of their hosts. In contrast, cyanophages of the T4-like Myoviridae family maintain genomes with low GC content, thus sometimes differing from that of their hosts. By introducing their own specific set of tRNAs, they appear to modulate the tRNA pools of hosts with tRNAs that fit the viral low GC preferred codons. We assessed the possible effects of those viral tRNAs on cyanophages and cyanobacterial genomes using the tRNA adaptation index, which measures the extent to which a given pool of tRNAs efficiently translates particular genes. We found a strong selective pressure to gain and maintain tRNAs that will boost translation of myoviral genes when infecting a high GC host, contrasted by a negligible effect on the host genes. Thus, myoviral tRNAs may represent an adaptive strategy to enhance fitness when infecting high GC hosts, thereby potentially broadening the spectrum of hosts while alleviating the need to adjust global parameters such as GC content for each specific host.
-
(2011) Trends in Genetics. 27, 8, p. 316-322 Abstract
Gene expression comprises multiple stages, from transcription to protein degradation. Although much is known about the regulation of each stage separately, an understanding of the regulatory coupling between the different stages is only beginning to emerge. For example, there is a clear crosstalk between translation and transcription, and the localization and stability of an mRNA in the cytoplasm could already be determined during transcription in the nucleus. We review a diversity of mechanisms discovered in recent years that couple the different stages of gene expression. We then speculate on the functional and evolutionary significance of this coupling and suggest certain systems-level functionalities that might be optimized via the various coupling modes. In particular, we hypothesize that coupling is often an economic strategy that allows biological systems to respond robustly and precisely to genetic and environmental perturbations.
-
(2011) Yeast Systems Biology. p. 407-425 Abstract
Genetic regulatory circuits are often regarded as precise machines that accurately determine the level of expression of each protein. Most experimental technologies used to measure gene expression levels are incapable of testing and challenging this notion, as they often measure levels averaged over entire populations of cells. Yet, when expression levels are measured at the single cell level of even genetically identical cells, substantial cell-to-cell variation (or “noise”) may be observed. Sometimes different genes in a given genome may display different levels of noise; even the same gene, expressed under different environmental conditions, may display greater cell-to-cell variability in specific conditions and tighter control in other situations. While at first glance noise may seem to be an undesired property of biological networks, it might be beneficial in some cases. For instance, noise will increase functional heterogeneity in a population of microorganisms facing variable, often unpredictable, environmental changes, increasing the probability that some cells may survive the stress. In that respect, we can speculate that the population is implementing a risk distribution strategy, long before genetic heterogeneity could be acquired. Organisms may have evolved to regulate not only the averaged gene expression levels but also the extent of allowed deviations from such an average, setting it at the desired level for every gene under each specific condition. Here we review the evolving understanding of noise, its molecular underpinnings, and its effect on phenotype and fitness (when it can be detrimental, beneficial, or neutral), and which regulatory tools eukaryotic cells may use to optimally control it.
-
(2011) Molecular Biology and Evolution. 28, 5, p. 1545-1551 Abstract
MicroRNAs (miRs) are considered major contributors to the evolution of animal morphological complexity. Multiple bursts of novel miR families were documented throughout animal evolution, yet, their evolutionary origins are not understood. Here, we discuss two alternative genomic sources for novel miR families, namely, transposable elements, which were previously described, and a newly proposed origin: CpG islands. We show that these two origins are evolutionarily distinct and that they correspond to marked differences in several functional and genomic characteristics. Together, our results shed light on the intriguing origin of one of the major constituents of regulatory networks in animals, miRs.
-
(2011) Proceedings of the National Academy of Sciences of the United States of America. 108, 17, p. 7271-7276 Abstract
Survival in natural habitats selects for microorganisms that are well-adapted to a wide range of conditions. Recent studies revealed that cells evolved innovative response strategies that extend beyond merely sensing a given stimulus and responding to it on encounter. A diversity of microorganisms, including Escherichia coli, Vibrio cholerae, and several yeast species, were shown to use a predictive regulation strategy that uses the appearance of one stimulus as a cue for the likely arrival of a subsequent one. A better understanding of such a predictive strategy requires elucidating the interplay between key biological and environmental forces. Here, we describe a mathematical framework to address this challenge. We base this framework on experimental systems featuring early preparation to either a stress or an exposure to improvement in the growth medium. Our model calculates the fitness advantage originating under each regulation strategy in a given habitat. We conclude that, although a predictive response strategy might be advantageous under some ecologies, its costs might exceed the benefit in others. The combined theoretical-experimental treatment presented here helps assess the potential of natural ecologies to support a predictive behavior.
-
The role of codon selection in regulation of translation efficiency deduced from synthetic libraries (2011) Genome Biology. 12, 2, R12. Abstract
Background: Translation efficiency is affected by a diversity of parameters, including secondary structure of the transcript and its codon usage. Here we examine the effects of codon usage on translation efficiency by re-analysis of previously constructed synthetic expression libraries in Escherichia coli. Results: We define the region in a gene that takes the longest time to translate as the bottleneck. We found that localization of the bottleneck at the beginning of a transcript promoted a high level of expression, especially if the computed dwell time of the ribosome within this region was sufficiently long. The location and translation time of the bottleneck were not correlated with the cost of expression, approximated by the fitness of the host cell, yet utilization of specific codons was. Particularly, enhanced usage of the codons UCA and CAU was correlated with increased cost of production, potentially due to sequestration of their corresponding rare tRNAs. Conclusions: The distribution of codons along the genes appears to affect translation efficiency, consistent with analysis of natural genes. This study demonstrates how synthetic biology complements bioinformatics by providing a set-up for well controlled experiments in biology.
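The bottleneck definition in this abstract (the region of a gene that takes the longest to translate) can be sketched as a sliding-window maximum over per-codon translation times. This is only a minimal illustration: the function name, the window length, and the dwell-time values below are assumptions made for the example, and the abstract does not specify how the study computed translation times.

```python
def find_bottleneck(codons, dwell_time, window=3):
    """Return (start_index, total_time) of the slowest window of codons.

    codons:     list of codon strings along the transcript
    dwell_time: dict mapping codon -> estimated translation time (arbitrary units)
    window:     number of consecutive codons treated as one region (assumed)
    """
    if len(codons) < window:
        raise ValueError("sequence is shorter than the window")
    times = [dwell_time[c] for c in codons]
    current = sum(times[:window])
    best_start, best_time = 0, current
    for start in range(1, len(times) - window + 1):
        # Slide the window one codon to the right.
        current += times[start + window - 1] - times[start - 1]
        if current > best_time:
            best_start, best_time = start, current
    return best_start, best_time

# Hypothetical dwell times; the values are made up for illustration only.
dwell = {"AUG": 1.0, "GCU": 1.0, "UCA": 4.0, "CAU": 3.0, "GAA": 1.0}
gene = ["AUG", "UCA", "CAU", "GCU", "GAA", "GAA"]
start, total = find_bottleneck(gene, dwell, window=2)
```

With these toy values the slowest two-codon window starts at the second codon, so `start` gives the bottleneck's position along the transcript and `total` its translation time.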
-
(2011) Molecular Systems Biology. 7, 481. Abstract
Proper functioning of biological cells requires that the process of protein expression be carried out with high efficiency and fidelity. Given an amino-acid sequence of a protein, multiple degrees of freedom still remain that may allow evolution to tune efficiency and fidelity for each gene under various conditions and cell types. Particularly, the redundancy of the genetic code allows the choice between alternative codons for the same amino acid, which, although 'synonymous,' may exert dramatic effects on the process of translation. Here we review modern developments in genomics and systems biology that have revolutionized our understanding of the multiple means by which translation is regulated. We suggest new means to model the process of translation in a richer framework that will incorporate information about gene sequences, the tRNA pool of the organism and the thermodynamic stability of the mRNA transcripts. A practical demonstration of a better understanding of the process would be a more accurate prediction of the proteome, given the transcriptome at a diversity of biological conditions.
2010
-
(2010) Science Signaling. 3, 130, p. ra53 Abstract
Retrograde signaling from axon to soma activates intrinsic regeneration mechanisms in lesioned peripheral sensory neurons; however, the links between axonal injury signaling and the cell body response are not well understood. Here, we used phosphoproteomics and microarrays to implicate ∼900 phosphoproteins in retrograde injury signaling in rat sciatic nerve axons in vivo and ∼4500 transcripts in the in vivo response to injury in the dorsal root ganglia. Computational analyses of these data sets identified ∼400 redundant axonal signaling networks connected to 39 transcription factors implicated in the sensory neuron response to axonal injury. Experimental perturbation of individual overrepresented signaling hub proteins, including Abl, AKT, p38, and protein kinase C, affected neurite outgrowth in sensory neurons. Paradoxically, however, combined perturbation of Abl together with other hub proteins had a reduced effect relative to perturbation of individual proteins. Our data indicate that nerve injury responses are controlled by multiple regulatory components, and suggest that network redundancies provide robustness to the injury response.
-
(2010) Genome Biology. 11, 6, R58. Abstract
Background: Early embryos contain mRNA transcripts expressed from two distinct origins: those expressed from the mother's genome and deposited in the oocyte (maternal) and those expressed from the embryo's genome after fertilization (zygotic). The transition from maternal to zygotic control occurs at different times in different animals according to the extent and form of maternal contributions, which likely reflect evolutionary and ecological forces. Maternally deposited transcripts rely on post-transcriptional regulatory mechanisms for precise spatial and temporal expression in the embryo, whereas zygotic transcripts can use both transcriptional and post-transcriptional regulatory mechanisms. The differences in maternal contributions between animals may be associated with gene regulatory changes detectable by the size and complexity of the associated regulatory regions. Results: We have used genomic data to identify and compare maternal and/or zygotic expressed genes from six different animals and find evidence for selection acting to shape gene regulatory architecture in thousands of genes. We find that mammalian maternal genes are enriched for complex regulatory regions, suggesting an increase in expression specificity, while egg-laying animals are enriched for maternal genes that lack transcriptional specificity. Conclusions: We propose that this lack of specificity for maternal expression in egg-laying animals indicates that a large fraction of maternal genes are expressed non-functionally, providing only supplemental nutritional content to the developing embryo. These results provide clear predictive criteria for analysis of additional genomes.
-
(2010) Science Signaling. 3, 124, p. ra43 Abstract
Epidermal growth factor (EGF) stimulates cells by launching gene expression programs that are frequently deregulated in cancer. MicroRNAs, which attenuate gene expression by binding complementary regions in messenger RNAs, are broadly implicated in cancer. Using genome-wide approaches, we showed that EGF stimulation initiates a coordinated transcriptional program of microRNAs and transcription factors. The earliest event involved a decrease in the abundance of a subset of 23 microRNAs. This step permitted rapid induction of oncogenic transcription factors, such as c-FOS, encoded by immediate early genes. In line with roles as suppressors of EGF receptor (EGFR) signaling, we report that the abundance of this early subset of microRNAs is decreased in breast and in brain tumors driven by the EGFR or the closely related HER2. These findings identify specific microRNAs as attenuators of growth factor signaling and oncogenesis.
-
(2010) Trends in Genetics. 26, 6, p. 253-259 Abstract
MicroRNAs (miRNAs) appear to be key players in the maintenance of genomic integrity. Recent evidence implies that cancers often avoid miRNA-mediated regulation, and global repression of miRNAs is associated with increased tumorigenicity. Here we suggest that miRNAs are directly involved in the maintenance of genomic integrity through global repression of transposable elements (TEs), whose expression and transposition are well-documented causes of genomic instability in mammalian somatic tissues. Hence, one outcome of the tumor's ability to avoid miRNA-mediated regulation might be the enhancement of genomic instability and mutability due to derepression of TEs. We outline possible mechanisms underlying TE repression by miRNAs, including post-transcriptional silencing and transcriptional silencing through DNA and histone methylation. This hypothesis calls into consideration the need to study the role of miRNAs and the RNAi machinery in the nucleus, and specifically their impact on the maintenance of genomic integrity in the context of cancer.
-
(2010) Cell. 141, 2, p. 344-354 Abstract
Recent years have seen intensive progress in measuring protein translation. However, the contributions of coding sequences to the efficiency of the process remain unclear. Here, we identify a universally conserved profile of translation efficiency along mRNAs computed based on adaptation between coding sequences and the tRNA pool. In this profile, the first ∼30-50 codons are, on average, translated with a low efficiency. Additionally, in eukaryotes, the last ∼50 codons show the highest efficiency over the full coding sequence. The profile accurately predicts position-dependent ribosomal density along yeast genes. These data suggest that translation speed and, as a consequence, ribosomal density are encoded by coding sequences and the tRNA pool. We suggest that the slow "ramp" at the beginning of mRNAs serves as a late stage of translation initiation, forming an optimal and robust means to reduce ribosomal traffic jams, thus minimizing the cost of protein expression.
-
(2010) Physical Review E. 81, 3, 031924. Abstract
Single-cell experiments of simple regulatory networks can markedly differ from cell population experiments. Such differences arise from stochastic events in individual cells that are averaged out in cell populations. For instance, while individual cells may show sustained oscillations in the concentrations of some proteins, such oscillations may appear damped in the population average. In this paper we investigate the role of RNA stochastic fluctuations as a leading force to produce a sustained excitatory behavior at the single-cell level. As opposed to some previous models, we build a fully stochastic model of a negative feedback loop that explicitly takes into account the RNA stochastic dynamics. We find that messenger RNA random fluctuations can be amplified during translation and produce sustained pulses of protein expression. Motivated by the recent appreciation of the importance of noncoding regulatory RNAs in post-transcription regulation, we also consider the possibility that a regulatory RNA transcript could bind to the messenger RNA and repress translation. Our findings show that the regulatory transcript helps reduce gene expression variability both at the single-cell level and at the cell population level.
-
(2010) Cell Death and Differentiation. 17, 2, p. 236-245 Abstract
Aberrant oncogene activation induces cellular senescence, an irreversible growth arrest that acts as a barrier against tumorigenesis. To identify microRNAs (miRNAs) involved in oncogene-induced senescence, we examined the expression of miRNAs in primary human TIG3 fibroblasts after constitutive activation of B-RAF. Among the regulated miRNAs, both miR-34a and miR-146a were strongly induced during senescence. Although members of the miR-34 family are known to be transcriptionally regulated by p53, we find that miR-34a is regulated independently of p53 during oncogene-induced senescence. Instead, upregulation of miR-34a is mediated by the ETS family transcription factor, ELK1. During senescence, miR-34a targets the important proto-oncogene MYC and our data suggest that miR-34a thereby coordinately controls a set of cell cycle regulators. Hence, in addition to its integration in the p53 pathway, we show that alternative cancer-related pathways regulate miR-34a, emphasising its significance as a tumour suppressor.
2009
-
Coupling transcriptional and post-transcriptional miRNA regulation in the control of cell fate (2009) AGING-US. 1, 9, p. 762-770 Abstract
miRNAs function as a critical regulatory layer in development, differentiation, and the maintenance of cell fate. Depletion of miRNAs from embryonic stem cells impairs their differentiation capacity. Total elimination of miRNAs leads to premature senescence in normal cells and tissues through activation of the DNA-damage checkpoint, whereas ablation of miRNAs in cancer cell lines results in an opposite effect, enhancing their tumorigenic potential. Here we compile evidence from the literature that points to miRNAs as key players in the maintenance of genomic integrity and proper cell fate. There is an apparent gap between our understanding of the subtle way by which miRNAs modulate protein levels, and their profound impact on cell fate. We propose that examining miRNAs in the context of the regulatory transcriptional and post-transcriptional networks they are embedded in may provide a broader view of their role in controlling cell fate.
-
(2009) PLoS Computational Biology. 5, 8, e1000477. Abstract
Injury to nerve axons induces diverse responses in neuronal cell bodies, some of which are influenced by the distance from the site of injury. This suggests that neurons have the capacity to estimate the distance of the injury site from their cell body. Recent work has shown that the molecular motor dynein transports importin-mediated retrograde signaling complexes from axonal lesion sites to cell bodies, raising the question of whether dynein-based mechanisms enable axonal distance estimations in injured neurons. We used computer simulations to examine mechanisms that may provide nerve cells with dynein-dependent distance assessment capabilities. A multiple-signals model was postulated based on the time delay between the arrival of two or more signals produced at the site of injury: a rapid signal carried by action potentials or similar mechanisms and slower signals carried by dynein. The time delay between the arrivals of these two types of signals should reflect the distance traversed, and simulations of this model show that it can indeed provide a basis for distance measurements in the context of nerve injuries. The analyses indicate that the suggested mechanism can allow nerve cells to discriminate between distances differing by 10% or more of their total axon length, and suggest that dynein-based retrograde signaling in neurons can be utilized for this purpose over different scales of nerves and organisms. Moreover, such a mechanism might also function in synapse to nucleus signaling in uninjured neurons. This could potentially allow a neuron to dynamically sense the relative lengths of its processes on an ongoing basis, enabling appropriate metabolic output from cell body to processes.
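The core idea of the multiple-signals model, that the delay between a fast and a slow signal reflects the distance travelled, can be illustrated with a deterministic toy calculation. The sketch assumes both signals move at constant velocities, so the delay is d / v_slow - d / v_fast; the function names and the velocity values are hypothetical, and the study itself relied on simulations rather than this simplified closed form.

```python
def arrival_delay(distance, v_fast, v_slow):
    """Delay between the slow and fast signal arriving from `distance` away."""
    return distance / v_slow - distance / v_fast

def distance_from_delay(delay, v_fast, v_slow):
    """Invert the delay to recover the distance to the injury site."""
    if not (v_fast > v_slow > 0):
        raise ValueError("expected v_fast > v_slow > 0")
    return delay / (1.0 / v_slow - 1.0 / v_fast)

# Hypothetical values: 0.1 m axon, fast signal at 1 m/s, slow transport at 1e-6 m/s.
true_distance = 0.1
delay = arrival_delay(true_distance, v_fast=1.0, v_slow=1e-6)
estimate = distance_from_delay(delay, v_fast=1.0, v_slow=1e-6)
```

Because the slow signal dominates the delay, the estimate is nearly proportional to the delay itself, which is why variability in the slow transport step limits how finely distances can be resolved.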
-
(2009) Nature. 460, 7252, p. 220-224 Abstract
Natural habitats of some microorganisms may fluctuate erratically, whereas others, which are more predictable, offer the opportunity to prepare in advance for the next environmental change. In analogy to classical Pavlovian conditioning, microorganisms may have evolved to anticipate environmental stimuli by adapting to their temporal order of appearance. Here we present evidence for environmental change anticipation in two model microorganisms, Escherichia coli and Saccharomyces cerevisiae. We show that anticipation is an adaptive trait, because pre-exposure to the stimulus that typically appears early in the ecology improves the organism's fitness when it encounters a second stimulus. Additionally, we observe loss of the conditioned response in E. coli strains that were repeatedly exposed in a laboratory evolution experiment only to the first stimulus. Focusing on the molecular level reveals that the natural temporal order of stimuli is embedded in the wiring of the regulatory network: early stimuli pre-induce genes that would be needed for later ones, yet later stimuli only induce genes needed to cope with them. Our work indicates that environmental anticipation is an adaptive trait that was repeatedly selected for during evolution and thus may be ubiquitous in biology.
-
(2009) Cell. 136, 3, p. 389-392 Abstract
Many crucial components of signal transduction, developmental, and metabolic pathways have functionally redundant copies. Further, these redundancies show surprising evolutionary stability over prolonged time scales. We propose that redundancies are not just archeological leftovers of ancient gene duplications, but rather that synergy arising from feedback between redundant copies may serve as an information processing element that facilitates signal transduction and the control of gene expression.
2008
-
(2008) PLoS Genetics. 4, 3, e1000018. Abstract
Transcription factors (TFs) regulate gene expression through specific interactions with short promoter elements. The same regulatory protein may recognize a variety of related sequences. Moreover, once they are detected it is hard to predict whether highly similar sequence motifs will be recognized by the same TF and regulate similar gene expression patterns, or serve as binding sites for distinct regulatory factors. We developed computational measures to assess the functional implications of variations on regulatory motifs and to compare the functions of related sites. We have developed computational means for estimating the functional outcome of substituting a single position within a binding site and applied them to a collection of putative regulatory motifs. We predict the effects of nucleotide variations within motifs on gene expression patterns. In cases where such predictions could be compared to suitable published experimental evidence, we found very good agreement. We further accumulated statistics from multiple substitutions across various binding sites in an attempt to deduce general properties that characterize nucleotide substitutions that are more likely to alter expression. We found that substitutions involving Adenine are more likely to retain the expression pattern and that substitutions involving Guanine are more likely to alter expression compared to the rest of the substitutions. Our results should facilitate the prediction of the expression outcomes of binding site variations. One typical important implication is expected to be the ability to predict the phenotypic effect of variation in regulatory motifs in promoters.
-
(2008) Molecular Systems Biology. 4, 4. Abstract
The state of the transcriptome reflects a balance between mRNA production and degradation. Yet how these two regulatory arms interact in shaping the kinetics of the transcriptome in response to environmental changes is not known. We subjected yeast to two stresses, one that induces a fast and transient response, and another that triggers a slow enduring response. We then used microarrays following transcriptional arrest to measure genome-wide decay profiles under each condition. We found condition-specific changes in mRNA decay rates and coordination between mRNA production and degradation. In the transient response, most induced genes were surprisingly destabilized, whereas repressed genes were somewhat stabilized, exhibiting counteraction between production and degradation. This strategy can reconcile high steady-state level with short response time among induced genes. In contrast, the stress that induces the slow response displays the more expected behavior, whereby most induced genes are stabilized, and repressed genes are destabilized. Our results show genome-wide interplay between mRNA production and degradation, and that alternative modes of such interplay determine the kinetics of the transcriptome in response to stress.
-
(2008) Molecular Systems Biology. 4, 229. Abstract
Normal cell growth is governed by a complicated biological system, featuring multiple levels of control, often deregulated in cancers. The role of microRNAs (miRNAs) in the control of gene expression is now increasingly appreciated, yet their involvement in controlling cell proliferation is still not well understood. Here we investigated the mammalian cell proliferation control network consisting of transcriptional regulators, E2F and p53, their targets and a family of 15 miRNAs. Indicative of their significance, expression of these miRNAs is downregulated in senescent cells and in breast cancers harboring wild-type p53. These miRNAs are repressed by p53 in an E2F1-mediated manner. Furthermore, we show that these miRNAs silence antiproliferative genes, which themselves are E2F1 targets. Thus, miRNAs and transcriptional regulators appear to cooperate in the framework of a multi-gene transcriptional and post-transcriptional feed-forward loop. Finally, we show that, similarly to p53 inactivation, overexpression of representative miRNAs promotes proliferation and delays senescence, manifesting the detrimental phenotypic consequence of perturbations in this circuit. Taken together, these findings position miRNAs as novel key players in the mammalian cellular proliferation network.
-
(2008) Proceedings of the National Academy of Sciences of the United States of America. 105, 4, p. 1243-1248 Abstract
The widely observed dispensability of duplicate genes is typically interpreted to suggest that a proportion of the duplicate pairs are at least partially redundant in their functions, thus allowing for compensatory effects. However, because redundancy is expected to be evolutionarily short lived, there is currently debate on both the proportion of redundant duplicates and their functional importance. Here, we examined these compensatory interactions by relying on a genome-wide data analysis, followed by experiments and literature mining in yeast. Our data, thus, strongly suggest that compensated duplicates are not randomly distributed within the protein interaction network but are rather strategically allocated to the most highly connected proteins. This design is appealing because it suggests that many of the potentially vulnerable nodes that would otherwise be highly sensitive to mutations are often protected by redundancy. Furthermore, divergence analyses show that this association between redundancy and protein connectivity becomes even more significant among the ancient duplicates, suggesting that these functional overlaps have undergone purifying selection. Our results suggest an intriguing conclusion - although redundancy is typically transient on evolutionary time scales, it tends to be preserved among some of the central proteins in the cellular interaction network.
2007
-
(2007) Bioinformatics. 23, 13, p. i440-i449 Abstract
Motivation: Current methodologies for the selection of putative transcription factor binding sites (TFBS) rely on various assumptions such as over-representation of motifs occurring on gene promoters, and the use of motif descriptions such as consensus or position-specific scoring matrices (PSSMs). In order to avoid bias introduced by such assumptions, we apply an unsupervised motif extraction (MEX) algorithm to sequences of promoters. The extracted motifs are assessed for their likely cis-regulatory function by calculating the expression coherence (EC) of the corresponding genes, across a set of biological conditions. Results: Applying MEX to all Saccharomyces cerevisiae promoters, followed by EC analysis across 40 biological conditions, we obtained a high percentage of putative cis-regulatory motifs. We clustered motifs that obtained highly significant EC scores, based on both their sequence similarity and similarity in the biological conditions these motifs appear to regulate. We describe 20 clusters, some of which regroup known TFBS. The clusters display different mRNA expression profiles, correlated with typical changes in the nucleotide composition of their relevant motifs. In several cases, a variation of a single nucleotide is shown to lead to distinct differences in expression patterns. These results are confronted with additional information, such as binding of transcription factors to groups of genes. Detailed analysis is presented for clusters related to MCB/SCB, STRE and PAC. In the first two cases, we provide evidence for different binding mechanisms of different clusters of motifs. For PAC-related motifs we uncover a new cluster that has so far been overshadowed by the stronger effects of known PAC motifs.
-
(2007) PLoS Computational Biology. 3, 7, p. 1291-1304 Abstract
microRNAs (miRs) are small RNAs that regulate gene expression at the posttranscriptional level. It is anticipated that, in combination with transcription factors (TFs), they span a regulatory network that controls thousands of mammalian genes. Here we set out to uncover local and global architectural features of the mammalian miR regulatory network. Using evolutionarily conserved potential binding sites of miRs in human targets, and conserved binding sites of TFs in promoters, we uncovered two regulation networks. The first depicts combinatorial interactions between pairs of miRs with many shared targets. The network reveals several levels of hierarchy, whereby a few miRs interact with many other lowly connected miR partners. We revealed hundreds of "target hub" genes, each potentially subject to massive regulation by dozens of miRs. Interestingly, many of these target hub genes are transcription regulators and they are often related to various developmental processes. The second network consists of miR-TF pairs that coregulate large sets of common targets. We discovered that the network consists of several recurring motifs. Most notably, in a significant fraction of the miR-TF coregulators the TF appears to regulate the miR, or to be regulated by the miR, forming a diversity of feed-forward loops. Together these findings provide new insights on the architecture of the combined transcriptional-posttranscriptional regulatory network.
-
(2007) Nature Genetics. 39, 3, p. 415-421 Abstract
A major challenge in comparative genomics is to understand how phenotypic differences between species are encoded in their genomes. Phenotypic divergence may result from differential transcription of orthologous genes, yet less is known about the involvement of differential translation regulation in species phenotypic divergence. In order to assess translation effects on divergence, we analyzed ∼2,800 orthologous genes in nine yeast genomes. For each gene in each species, we predicted translation efficiency, using a measure of the adaptation of its codons to the organism's tRNA pool. Mining this data set, we found hundreds of genes and gene modules with correlated patterns of translational efficiency across the species. One signal encompassed entire modules that are either needed for oxidative respiration or fermentation and are efficiently translated in aerobic or anaerobic species, respectively. In addition, the efficiency of translation of the mRNA splicing machinery strongly correlates with the number of introns in the various genomes. Altogether, we found extensive selection on synonymous codon usage that modulates translation according to gene function and organism phenotype. We conclude that, like factors such as transcription regulation, translation efficiency affects and is affected by the process of species divergence.
-
Characterization of the effects of TF binding site variations on gene expression towards predicting the functional outcomes of regulatory SNPs (2007) Systems Biology And Regulatory Genomics. 4023, p. 51-61 Abstract
This work addresses a central question in medical genetics - the distinction between disease-causing SNPs and neutral variations. Unlike previous studies that focused mainly on coding SNPs, our efforts were centered around variations in regulatory regions and specifically within transcription factor (TF) binding sites. We have compiled a comprehensive collection of genome wide TF binding sites and developed computational measures to estimate the effects of binding site variations on the expression profiles of the regulated genes. Applying these measures to binding sites of known TFs, we were able to make predictions that were in line with published experimental evidence and with structural data on DNA-protein interactions. We attempted to generalize the properties of expression-altering substitutions by accumulating statistics from many substitutions across multiple binding sites. We found that in the yeast genome substitutions that abolish a G or a C are on average more severe than substitutions that abolish an A or a T. This may be attributed to the low GC content of the yeast genome, in which G and C may be important for conferring specificity. We found additional factors that are correlated with the severity of a substitution. Such factors can be integrated in order to create a set of rules for the prioritization of regulatory SNPs according to their disease-causing potential.
-
Examination of the tRNA adaptation index as a predictor of protein expression levels (2007) Systems Biology And Regulatory Genomics. 4023, p. 107-118 Abstract
Phenotypic differences between closely-related species may arise from differential expression regimes, rather than different gene complements. Knowledge of cellular protein levels across a species sample would thus be useful for the inference of the genes underlying such phenotypic differences. dos Reis et al [1] recently proposed the tRNA Adaptation Index to score the optimality of a coding sequence with respect to a species' cellular tRNA pools. As a preliminary step towards a multi-species analysis that would utilize this index, we examine in this paper its performance in predicting protein expression levels in the yeast S. cerevisiae and find that it likely predicts maximal potential levels of proteins. We also show that tAI profiles of genes across species carry functional information regarding the interactions between proteins.
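As a rough illustration of how a codon-adaptation score of this kind is computed, the sketch below takes the geometric mean of per-codon weights over a coding sequence, which is the form of the tRNA Adaptation Index. The weights here are hypothetical placeholders; in the published index they are derived from tRNA gene copy numbers and codon-anticodon pairing, which this sketch does not attempt to reproduce.

```python
import math

def tai(coding_sequence, weights):
    """Geometric mean of per-codon weights over a coding sequence.

    Codons without a weight (for example stop codons) are skipped.
    """
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no scorable codons in sequence")
    # Geometric mean, computed in log space for numerical stability.
    return math.exp(sum(math.log(w) for w in scored) / len(scored))

# Hypothetical relative-adaptiveness weights in (0, 1]; not real values.
toy_weights = {"GCT": 1.0, "GCC": 0.25, "AAA": 0.5, "AAG": 1.0}

if __name__ == "__main__":
    print(tai("GCTAAG", toy_weights))  # codons with the best weights only
    print(tai("GCTGCC", toy_weights))  # mixes a well and a poorly adapted codon
```

A gene built only from the best-adapted codons scores 1 under these toy weights, and each poorly adapted codon pulls the score down multiplicatively.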
2006
-
(2006) EMBO Reports. 7, 12, p. 1216-1222 Abstract
Many genomic loci contain transcription units on both strands, therefore two oppositely oriented transcripts can overlap. Often, one strand codes for a protein, whereas the transcript from the other strand is non-encoding. Such natural antisense transcripts (NATs) can negatively regulate the conjugated sense transcript. NATs are highly prevalent in a wide range of species - for example, around 15% of human protein-encoding genes have an associated NAT. The regulatory mechanisms by which NATs act are diverse, as are the means to control their expression. Here, we review the current understanding of NAT function and its mechanistic basis, which has been gathered from both individual gene cases and genome-wide studies. In parallel, we survey findings about the regulation of NAT transcription. Finally, we hypothesize that the regulation of antisense transcription might be tailored to its mode of action. According to this model, the observed relationship between the expression patterns of NATs and their targets might indicate the regulatory mechanism that is in action.
-
(2006) Proceedings of the National Academy of Sciences of the United States of America. 103, 31, p. 11653-11658 Abstract
Functional redundancies, generated by gene duplications, are highly widespread throughout all known genomes. One consequence of these redundancies is a tremendous increase in the robustness of organisms to mutations and other stresses. Yet, this very robustness also renders redundancy evolutionarily unstable, and it is, thus, predicted to have only a transient lifetime. In contrast, numerous reports describe instances of functional overlaps that have been conserved throughout extended evolutionary periods. More interestingly, many such backed-up genes were shown to be transcriptionally responsive to the intactness of their redundant partner and are up-regulated if the latter is mutationally inactivated. By manual inspection of the literature, we have compiled a list of such "responsive backup circuits" in a diverse list of species. Reviewing these responsive backup circuits, we extract recurring principles characterizing their regulation. We then apply modeling approaches to explore further their dynamic properties. Our results demonstrate that responsive backup circuits may function as ideal devices for filtering nongenetic noise from transcriptional pathways and obtaining regulatory precision. We thus challenge the view that such redundancies are simply leftovers of ancient duplications and suggest they are an additional component to the sophisticated machinery of cellular regulation. In this respect, we suggest that compensation for gene loss is merely a side effect of sophisticated design principles using functional redundancy.
-
(2006) Nature Genetics. 38, 6, p. 636-643 Abstract
Noise in gene expression is generated at multiple levels, such as transcription and translation, chromatin remodeling and pathway-specific regulation. Studies of individual promoters have suggested different dominating noise sources, raising the question of whether a general trend exists across a large number of genes and conditions. We examined the variation in the expression levels of 43 Saccharomyces cerevisiae proteins, in cells grown under 11 experimental conditions. For all classes of genes and under all conditions, the expression variance was approximately proportional to the mean; the same scaling was observed at steady state and during the transient responses to the perturbations. Theoretical analysis suggests that this scaling behavior reflects variability in mRNA copy number, resulting from random 'birth and death' of mRNA molecules or from promoter fluctuations. Deviation of coexpressed genes from this general trend, including high noise in stress-related genes and low noise in proteasomal genes, may indicate fluctuations in pathway-specific regulators or a differential activation pattern of the underlying gene promoters.
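One simple way to check the kind of scaling reported above is to fit the slope of log(variance) against log(mean): a slope near 1 corresponds to variance proportional to the mean. The following is a minimal sketch on made-up numbers, not the analysis used in the paper; the function name and data are hypothetical.

```python
import math

def loglog_slope(means, variances):
    """Least-squares slope of log(variance) versus log(mean)."""
    xs = [math.log(m) for m in means]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical expression means and two alternative noise models.
means = [10.0, 50.0, 200.0, 1000.0]
proportional = [3.0 * m for m in means]    # variance proportional to mean
quadratic = [0.1 * m ** 2 for m in means]  # variance proportional to mean squared

if __name__ == "__main__":
    print(loglog_slope(means, proportional))
    print(loglog_slope(means, quadratic))
```

With real measurements the fitted slope would carry uncertainty, so a confidence interval would be needed before drawing conclusions.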
-
(2006) Clinical Cancer Research. 12, 7 I, p. 2014-2024 Abstract
Purpose: The aim of this study was to investigate the role of p53 in regulating micro-RNA (miRNA) expression due to its function as a transcription factor. In addition, p53 may also affect other cellular mRNA gene expression at the translational level either via its mediated miRNAs or due to its RNA-binding function. Experimental Design: The possible interaction between p53 and miRNAs in regulating gene expression was investigated using human colon cancer HCT-116 (wt-p53) and HCT-116 (null-p53) cell lines. The effect of p53 on the expression of miRNAs was investigated using miRNA expression array and real-time quantitative reverse transcription-PCR analysis. Results: Our investigation indicated that the expression levels of a number of miRNAs were affected by wt-p53. Down-regulation of wt-p53 via small interfering RNA abolished the effect of wt-p53 in regulating miRNAs in HCT-116 (wt-p53) cells. Global sequence analysis revealed that over 46% of the 326 miRNA putative promoters contain potential p53-binding sites, suggesting that some of these miRNAs were potentially regulated directly by wt-p53. In addition, the expression levels of steady-state total mRNAs and actively translated mRNA transcripts were quantified by high-density microarray gene expression analysis. The results indicated that nearly 200 cellular mRNA transcripts were regulated at the posttranscriptional level, and sequence analysis revealed that some of these mRNAs may be potential targets of miRNAs, including translation initiation factor eIF-5A, eIF-4A, and protein phosphatase 1. Conclusion: To the best of our knowledge, this is the first report demonstrating that wt-p53 and miRNAs interact in influencing gene expression and providing insights into how p53 regulates genes at multiple levels via unique mechanisms.
2005
-
(2005) Biochemistry. 44, 45, p. 14870-14880 Abstract
MdfA is an Escherichia coli multidrug transporter of the major facilitator superfamily (MFS) of secondary transporters. Although several aspects of multidrug recognition by MdfA have been characterized, a better understanding of the detailed mechanism of its function requires structural information. Previous studies have modeled the 3D structures of MFS proteins, based on the X-ray structure of LacY and GlpT. However, because of poor sequence homology between LacY, GlpT, and MdfA, additional constraints were required for reliable homology modeling. Using an algorithm that predicts the angular orientation of each transmembrane helix (TM) (kPROT), we obtained a remarkably similar pattern for the 12 TMs of MdfA and those of GlpT and LacY, suggesting that they all have similar helix packing. Consequently, a 3D model was constructed for MdfA by structural alignment with LacY and GlpT, using the kPROT results as an additional constraint. Further refinement and a preliminary evaluation of the model were achieved by correlated mutation analysis and the available experimental data. Surprisingly, in addition to the previously characterized membrane-embedded glutamate at position 26, the model suggests that Asp34 and Arg112 are located within the membrane, on the same face of the cavity as Glu26. Importantly, Arg112 is evolutionarily conserved in secondary drug transporters, and here we show that a positive charge at this position is absolutely essential for multidrug transport by MdfA.
-
(2005) Molecular Systems Biology. 1, p. 2005.0022 Abstract
Deciphering regulatory events that drive malignant transformation represents a major challenge for systems biology. Here, we analyzed genome-wide transcription profiling of an in vitro cancerous transformation process. We focused on a cluster of genes whose expression levels increased as a function of p53 and p16(INK4A) tumor suppressors inactivation. This cluster predominantly consists of cell cycle genes and constitutes a signature of a diversity of cancers. By linking expression profiles of the genes in the cluster with the dynamic behavior of p53 and p16(INK4A), we identified a promoter architecture that integrates signals from the two tumor suppressive channels and that maps their activity onto distinct levels of expression of the cell cycle genes, which, in turn, correspond to different cellular proliferation rates. Taking components of the mitotic spindle as an example, we experimentally verified our predictions that p53-mediated transcriptional repression of several of these novel targets is dependent on the activities of p21, NFY, and E2F. Our study demonstrates how a well-controlled transformation process allows linking between gene expression, promoter architecture, and activity of upstream signaling molecules.
-
(2005) Nucleic Acids Research. 33, 2, p. 605-615 Abstract
Deciphering gene regulatory network architecture amounts to the identification of the regulators, conditions in which they act, genes they regulate, cis-acting motifs they bind, expression profiles they dictate and more complex relationships between alternative regulatory partnerships and alternative regulatory motifs that give rise to sub-modalities of expression profiles. The 'location data' in yeast is a comprehensive resource that provides transcription factor-DNA interaction information in vivo. Here, we provide two contributions: first, we developed means to assess the extent of noise in the location data, and consequently for extracting signals from it. Second, we couple signal extraction with better characterization of the genetic network architecture. We apply two methods for the detection of combinatorial associations between transcription factors (TFs), the integration of which provides a global map of combinatorial regulatory interactions. We discover the capacity of regulatory motifs and TF partnerships to dictate fine-tuned expression patterns of subsets of genes, which are clearly distinct from those displayed by most genes assigned to the same TF. Our findings provide carefully prioritized, high-quality assignments between regulators and regulated genes and as such should prove useful for experimental and computational biologists alike.
-
(2005) Genome Biology. 6, 10, p. R86 Abstract
BACKGROUND: In recent years, intensive computational efforts have been directed towards the discovery of promoter motifs that correlate with mRNA expression profiles. Nevertheless, it is still not always possible to predict steady-state mRNA expression levels based on promoter signals alone, suggesting that other factors may be involved. Other genic regions, in particular 3' UTRs, which are known to exert regulatory effects especially through controlling RNA stability and localization, were less comprehensively investigated, and deciphering regulatory motifs within them is thus crucial. RESULTS: By analyzing 3' UTR sequences and mRNA decay profiles of Saccharomyces cerevisiae genes, we derived a catalog of 53 sequence motifs that may be implicated in stabilization or destabilization of mRNAs. Some of the motifs correspond to known RNA-binding protein sites, and one of them may act in destabilization of ribosome biogenesis genes during stress response. In addition, we present for the first time a catalog of 23 motifs associated with subcellular localization. A significant proportion of the 3' UTR motifs is highly conserved in orthologous yeast genes, and some of the motifs are strikingly similar to recently published mammalian 3' UTR motifs. We classified all genes into those regulated only at transcription initiation level, only at degradation level, and those regulated by a combination of both. Interestingly, different biological functionalities and expression patterns correspond to such classification. CONCLUSION: The present motif catalogs are a first step towards the understanding of the regulation of mRNA degradation and subcellular localization, two important processes which, together with transcription regulation, determine the cell transcriptome.
-
(2005) Nature Genetics. 37, 3, p. 295-299 Abstract
A key question in molecular genetics is why severe mutations often do not result in a detectably abnormal phenotype. This robustness was partially ascribed to redundant paralogs [1,2] that may provide backup for one another in case of mutation. Mining mutant viability and mRNA expression data in Saccharomyces cerevisiae, we found that backup was provided predominantly by paralogs that are expressed dissimilarly in most growth conditions. We considered that this apparent inconsistency might be resolved by a transcriptional reprogramming mechanism that allows the intact paralog to rescue the organism upon mutation of its counterpart. We found that in wild-type cells, partial coregulation across growth conditions predicted the ability of paralogs to alter their transcription patterns and to provide backup for one another. Notably, the sets of regulatory motifs that controlled the paralogs with the most efficient backup activity deliberately overlapped only partially; paralogs with highly similar or dissimilar sets of motifs had suboptimal backup activity. Such an arrangement of partially shared regulatory motifs reconciles the differential expression of paralogs with their ability to back each other up.
2003
-
(2003) Nucleic Acids Research. 31, 13, p. 3824-3828 Abstract
We have generated a WWW interface for automated comprehensive analyses of promoter regulatory motifs and the effect they exert on mRNA expression profiles. The server provides a wide spectrum of analysis tools that allow de novo discovery of regulatory motifs, along with refinement and in-depth investigation of fully or partially characterized motifs. The presented discovery and analysis tools are fundamentally different from existing tools in their basic rationale, statistical background and specificity and sensitivity towards true regulatory elements. We thus anticipate that the service will be of great importance to the experimental and computational biology communities alike. The motif discovery and diagnosis workbench is available at http://longitude.weizmann.ac.il/rMotif/.
2002
-
(2002) Genome Research. 12, 11, p. 1723-31 Abstract
Combinatorial regulation is an important feature of eukaryotic transcription. However, only a limited number of studies have characterized this aspect on a whole-genome level. We have conducted a genome-wide computational survey to identify cis-regulatory motif pairs that co-occur in a significantly high number of promoters in the S. cerevisiae genome. A pair of novel motifs, mRRPE and PAC, co-occur most highly in the genome, primarily in the promoters of genes involved in rRNA transcription and processing. The two motifs show significant positional and orientational bias with mRRPE being closer to the ATG than PAC in most promoters. Two additional rRNA-related motifs, mRRSE3 and mRRSE10, also co-occur with mRRPE and PAC. mRRPE and PAC are the primary determinants of expression profiles while mRRSE3 and mRRSE10 modulate these patterns. We describe a new computational approach for studying the functional significance of the physical locations of promoter elements that combines analyses of genome sequence and microarray data. Applying this methodology to the regulatory cassette containing the four rRNA motifs demonstrates that the relative promoter locations of these elements have a profound effect on the expression patterns of the downstream genes. These findings provide a function for these novel motifs and insight into the mechanism by which they regulate gene expression. The methodology introduced here should prove particularly useful for analyzing transcriptional regulation in more complex genomes.
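A common way to ask whether two motifs co-occur in more promoters than expected by chance is a one-sided hypergeometric test. The abstract above does not state which statistic was used, so the sketch below is an assumed stand-in rather than the authors' method; the function name and counts are hypothetical.

```python
from math import comb

def cooccurrence_pvalue(n_promoters, n_a, n_b, n_both):
    """P(overlap >= n_both) under the hypergeometric null.

    n_promoters: total promoters surveyed
    n_a, n_b:    promoters containing motif A and motif B, respectively
    n_both:      promoters containing both motifs
    """
    total = comb(n_promoters, n_b)
    upper = min(n_a, n_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_a, i) * comb(n_promoters - n_a, n_b - i)
        for i in range(n_both, upper + 1)
    )
    return tail / total

if __name__ == "__main__":
    # Hypothetical counts: 6000 promoters, 300 with A, 250 with B, 60 with both.
    print(cooccurrence_pvalue(6000, 300, 250, 60))
```

In a genome-wide survey many motif pairs are tested, so a multiple-testing correction would be needed on top of this per-pair value.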
-
(2002) Molecular Biology of the Cell. 13, 5, p. 1608-14 Abstract
Ohno [Ohno, S. (1970) in Evolution by Gene Duplication, Springer, New York] proposed that gene duplication with subsequent divergence of paralogs could be a major force in the evolution of new gene functions. In practice the functional differences between closely related homologues produced by duplications can be subtle and difficult to separate experimentally. Here we show that DNA microarrays can distinguish the functions of two closely related homologues from the yeast Saccharomyces cerevisiae, Yap1p and Yap2p. Although Yap1p and Yap2p are both bZIP transcription factors involved in multiple stress responses and are 88% identical in their DNA binding domains, our work shows that these proteins activate nonoverlapping sets of genes. Yap1p controls a set of genes involved in detoxifying the effects of reactive oxygen species, whereas Yap2p controls a set of genes overrepresented for the function of stabilizing proteins. In addition we show that the binding sites in the promoters of the Yap1p-dependent genes differ from the sites in the promoters of Yap2p-dependent genes and we validate experimentally that these differences are important for regulation by Yap1p. We conclude that while Yap1p and Yap2p may have some overlapping functions they are clearly not redundant and, more generally, that DNA microarray analysis will be an important tool for distinguishing the functions of the large numbers of highly conserved genes found in all eukaryotic genomes.
-
(2002) Journal of Molecular Biology. 318, 1, p. 71-81 Abstract
While microarray-based expression profiling has facilitated the use of computational methods to find potential cis-regulatory promoter elements, few current in silico approaches explicitly link regulatory motifs with the transcription factors that bind them. We have thus developed a TF-centric clustering (TFCC) algorithm that may provide such missing information through incorporation of biological knowledge about TFs. TFCC is a semi-supervised clustering algorithm which relies on the assumption that the expression profiles of some TFs may be related to those of the genes under their control. We examined this premise and found the vicinities of TFs in expression space are often enriched with the genes they regulate. So, instead of clustering genes based on the mutual similarity of their expression profiles to each other, we used TFs as seeds to group together genes whose expression patterns correlate with that of a particular TF. Then a Gibbs sampling algorithm was applied to search for shared cis-regulatory elements in promoters of clustered genes. Our working hypothesis was that if a TF-centric cluster indeed contains many targets of the seeding TF, at least one of the discovered motifs would be the site bound by the very same TF. We tested the TFCC approach on eight cell cycle and sporulation regulating TFs whose binding sites have been previously characterized in Saccharomyces cerevisiae, and correctly identified binding site motifs for half of them. In addition, we also made de novo predictions for some unknown TF binding sites.
2001
-
(2001) Nature Genetics. 29, 2, p. 153-9 Abstract
Several computational methods based on microarray data are currently used to study genome-wide transcriptional regulation. Few studies, however, address the combinatorial nature of transcription, a well-established phenomenon in eukaryotes. Here we describe a new approach using microarray data to uncover novel functional motif combinations in the promoters of Saccharomyces cerevisiae. In addition to identifying novel motif combinations that affect expression patterns during the cell cycle, sporulation and various stress responses, we observed regulatory cross-talk among several of these processes. We have also generated motif-association maps that provide a global view of transcription networks. The maps are highly connected, suggesting that a small number of transcription factors are responsible for a complex set of expression patterns in diverse conditions. This approach may be useful for modeling transcriptional regulatory networks in more complex eukaryotes.
-
(2001) Genomics. 71, 3, p. 296-306 Abstract
The olfactory receptor (OR) subgenome harbors the largest known gene family in mammals, disposed in clusters on numerous chromosomes. One of the best characterized OR clusters, located at human chromosome 17p13.3, has previously been studied by us in human and in other primates, revealing a conserved set of 17 OR genes. Here, we report the identification of a syntenic OR cluster in the mouse and the partial DNA sequence of many of its OR genes. A probe for the mouse M5 gene, orthologous to one of the OR genes in the human cluster (OR17-25), was used to isolate six PAC clones, all mapping by in situ hybridization to mouse chromosome 11B3-11B5, a region of shared synteny with human chromosome 17p13.3. Thirteen mouse OR sequences amplified and sequenced from these PACs allowed us to construct a putative physical map of the OR gene cluster at the mouse Olfr1 locus. Several points of evidence, including a strong similarity in subfamily composition and at least four cases of gene orthology, suggest that the mouse Olfr1 and the human 17p13.3 clusters are orthologous. A detailed comparison of the OR sequences within the two clusters helps trace their independent evolutionary history in the two species. Two types of evolutionary scenarios are discerned: cases of "true orthologous genes" in which high sequence similarity suggests a shared conserved function, as opposed to instances in which orthologous genes may have undergone independent diversification in the realm of "free reign" repertoire expansion.
-
(2001) Human Genetics. 108, 1, p. 1-13 Abstract
Olfactory receptors (ORs) constitute the largest multigene family in multicellular organisms. Their evolutionary proliferation has been driven by the need to provide recognition capacity for millions of potential odorants with arbitrary chemical configurations. Human genome sequencing has provided a highly informative picture of the "olfactory subgenome", the repertoire of OR genes. We describe here an analysis of 224 human OR genes, a much larger number than hitherto systematically analyzed. These are derived by literature survey, data mining at 14 genomic clusters, and by an OR-targeted experimental sequencing strategy. The presented set contains at least 53% pseudogenes and is minimally divided into 11 gene families. One of these (no. 7) has undergone a particularly extensive expansion in primates. The analysis of this collection leads to insight into the origin of OR genes, suggesting a graded expansion through mammalian evolution. It also allows us to delineate a structural map of the respective proteins. A sequence database and analysis package is provided (http://bioinformatics.weizmann.ac.il/HORDE), which will be useful for analyzing human OR sequences genome-wide.
2000
-
(2000) Mammalian Genome. 11, 11, p. 1016-1023 Abstract
The vertebrate olfactory receptor (OR) subgenome harbors the largest known gene family, which has been expanded by the need to provide recognition capacity for millions of potential odorants. We implemented an automated procedure to identify all OR coding regions from published sequences. This led us to the identification of 831 OR coding regions (including pseudogenes) from 24 vertebrate species. The resulting dataset was subjected to neighbor-joining phylogenetic analysis and classified into 32 distinct families, 14 of which include only genes from tetrapodan species (Class II ORs). We also report here the first identification of OR sequences from a marsupial (koala) and a monotreme (platypus). Analysis of these OR sequences suggests that the ancestral mammal had a small OR repertoire, which expanded independently in all three mammalian subclasses. Classification of 'fish-like' (Class I) ORs indicates that some of these ancient ORs were maintained and even expanded in mammals. A nomenclature system for the OR gene superfamily is proposed, based on a divergence evolutionary model. The nomenclature consists of the root symbol 'OR', followed by a family numeral, subfamily letter(s), and a numeral representing the individual gene within the subfamily. For example, OR3A1 is an OR gene of family 3, subfamily A, and OR7E12P is an OR pseudogene of family 7, subfamily E. The symbol is to be preceded by a species indicator. We have assigned the proposed nomenclature symbols for all 330 human OR genes in the database. A WWW tool for automated name assignment is provided.
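The symbol layout described above (root 'OR', family numeral, subfamily letters, gene numeral) lends itself to a small parser. The sketch below is illustrative only: the trailing "P" for pseudogenes is inferred from the OR7E12P example, the species indicator is not handled because its format is not given here, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple

class ORSymbol(NamedTuple):
    family: int
    subfamily: str
    member: int
    pseudogene: bool

# Root 'OR', family numeral, subfamily letter(s), gene numeral, optional 'P'.
_PATTERN = re.compile(r"OR(\d+)([A-Z]+)(\d+)(P?)")

def parse_or_symbol(symbol):
    """Split an OR gene symbol such as 'OR7E12P' into its components."""
    match = _PATTERN.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not a recognized OR symbol: {symbol!r}")
    family, subfamily, member, suffix = match.groups()
    return ORSymbol(int(family), subfamily, int(member), suffix == "P")

if __name__ == "__main__":
    print(parse_or_symbol("OR3A1"))
    print(parse_or_symbol("OR7E12P"))
```

Because the gene numeral always separates the subfamily letters from the optional suffix, the pattern stays unambiguous even for multi-letter subfamilies.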
1999
-
(1999) Journal of Molecular Biology. 294, 4, p. 921-935 Abstract
Modeling of integral membrane proteins and the prediction of their functional sites requires the identification of transmembrane (TM) segments and the determination of their angular orientations. Hydrophobicity scales predict accurately the location of TM helices, but are less accurate in computing angular disposition. Estimating lipid-exposure propensities of the residues from statistics of solved membrane protein structures has the disadvantage of relying on relatively few proteins. As an alternative, we propose here a scale of knowledge-based Propensities for Residue Orientation in Transmembrane segments (kPROT), derived from the analysis of more than 5000 non-redundant protein sequences. We assume that residues that tend to be exposed to the membrane are more frequent in TM segments of single-span proteins, while residues that prefer to be buried in the transmembrane bundle interior are present mainly in multi-span TMs. The kPROT value for each residue is thus defined as the logarithm of the ratio of its proportions in single and multiple TM spans. The scale is refined further by defining it for three discrete sections of the TM segment; namely, extracellular, central, and intracellular. The capacity of the kPROT scale to predict angular helical orientation was compared to that of alternative methods in a benchmark test, using a diversity of multi-span α-helical transmembrane proteins with a solved 3D structure. kPROT yielded an average angular error of 41°, significantly lower than that of alternative scales (62°-68°). The new scale thus provides a useful general tool for modeling and prediction of functional residues in membrane proteins. A WWW server (http://bioinfo.weizmann.ac.il/kPROT) is available for automatic helix orientation prediction with kPROT.
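The definition of the kPROT value, the logarithm of the ratio of a residue's proportions in single-span and multi-span TM segments, can be sketched as follows. This is an illustrative sketch only: the function name `kprot_scale`, the use of the natural logarithm, and the pseudocount that avoids division by zero are assumptions, and the refinement into extracellular, central, and intracellular sections is omitted.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kprot_scale(single_span_segments, multi_span_segments, pseudocount=1.0):
    """Log-ratio of residue proportions in single- vs multi-span TM segments.

    The pseudocount and the natural logarithm are assumptions of this
    sketch; the abstract does not state how zero counts are handled or
    which logarithm base is used.
    """
    single = Counter("".join(single_span_segments))
    multi = Counter("".join(multi_span_segments))
    # Totals over the 20 standard residues, including pseudocounts.
    n_single = sum(single[a] + pseudocount for a in AMINO_ACIDS)
    n_multi = sum(multi[a] + pseudocount for a in AMINO_ACIDS)
    scale = {}
    for a in AMINO_ACIDS:
        p_single = (single[a] + pseudocount) / n_single
        p_multi = (multi[a] + pseudocount) / n_multi
        # Positive value: enriched in single-span segments, which the
        # abstract takes as a sign of a lipid-exposed residue.
        scale[a] = math.log(p_single / p_multi)
    return scale
```

Under the assumption stated in the abstract, positive values suggest a preference for facing the membrane, and negative values a preference for the interior of the helix bundle.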
-
(1999) Genomics. 61, 1, p. 24-36 Abstract
The olfactory receptor (OR) subgenome harbors the largest known gene family in mammals, disposed in clusters on numerous chromosomes. We have carried out a comparative evolutionary analysis of the best characterized genomic OR gene cluster, on human chromosome 17p13. Fifteen orthologs from chimpanzee (localized to chromosome 19p15), as well as key OR counterparts from other primates, have been identified and sequenced. Comparison among orthologs and paralogs revealed a multiplicity of gene conversion events, which occurred exclusively within OR subfamilies. These appear to lead to segment shuffling in the odorant binding site, an evolutionary process reminiscent of somatic combinatorial diversification in the immune system. We also demonstrate that the functional mammalian OR repertoire has undergone a rapid decline in the past 10 million years: while for the common ancestor of all great apes an intact OR cluster is inferred, in present-day humans and great apes the cluster includes nearly 40% pseudogenes.
-
-
(1999) Molecular Biology of the Brain. p. 93-104 Abstract
In order to elicit an olfactory response, a substance has to partition into the gas phase and diffuse into the nose. Such odorant molecules, usually low molecular-mass hydrophobic compounds, encounter the ciliated endings of sensory neuronal dendrites, which protrude into a mucus layer at the surface of the olfactory epithelium in the nasal cavity. Embedded in the membranes of such cilia are olfactory receptor (OR) proteins, which recognize odorants and elicit a transduction cascade that underlies the nerve cell response. The sensory axons project to the olfactory bulb in the brain, where they converge into synaptic structures called glomeruli. The specific convergence patterns of olfactory axons, which depend on OR expression, provide a model system for neuronal network development. Here, initial processing of odour information occurs, which is followed by additional analysis in higher olfactory brain centres.
-
(1999) Protein Science. 8, 5, p. 969-977 Abstract
The accumulation of hundreds of olfactory receptor (OR) sequences, along with the recent availability of detailed models of other G-protein-coupled receptors, allows us to analyze the OR amino acid variability patterns in a structural context. A Fourier analysis of 197 multiply aligned olfactory receptor sequences showed an α-helical periodicity in the variability profile. This was particularly pronounced in the more variable transmembranal segments 3, 4, and 5. Rhodopsin-based homology modeling demonstrated that the inferred variable helical faces largely point to the interior of the receptor barrel. We propose that a set of 17 hypervariable residues, which point to the barrel interior and are more extracellularly disposed, constitute the odorant complementarity determining regions. While 12 of these residues coincide with established ligand-binding contact positions in other G-protein-coupled receptors, the rest are suggested to form an olfactory-unique aspect of the binding pocket. Highly conserved olfactory receptor-specific sequence motifs, found in the second and third intracellular loops, may comprise the G-protein recognition epitope. The prediction of olfactory receptor functional sites provides concrete suggestions of site-directed mutagenesis experiments for altering ligand and G-protein specificity.
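A Fourier analysis of a per-position variability profile can be sketched as below. This is a generic illustration and not necessarily the authors' exact procedure: the function names and the scan over whole-degree angles are assumptions. It relies on the standard fact that an α-helix has about 3.6 residues per turn, which corresponds to a rotation of about 100° per residue.

```python
import cmath
import math


def periodicity_power(profile, angle_deg):
    """Fourier power of a variability profile at a given per-residue angle."""
    mean = sum(profile) / len(profile)
    omega = math.radians(angle_deg)
    # Subtract the mean so that a constant offset does not contribute.
    total = sum((v - mean) * cmath.exp(1j * omega * k)
                for k, v in enumerate(profile))
    return abs(total) ** 2


def dominant_angle(profile, angles=range(1, 181)):
    """Angle in degrees per residue at which the power is highest."""
    return max(angles, key=lambda a: periodicity_power(profile, a))
```

A profile whose dominant angle lies near 100° per residue is consistent with one face of an α-helix being more variable than the other.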
-
(1999) Instruments, Methods, and Missions for Astrobiology II. 3755, p. 144-162 Abstract
The Graded Autocatalysis Replication Domain (GARD) model described here depicts an early primordial scenario, prior to the emergence of biopolymers, such as RNA or proteins. The model describes, with the help of statistical chemistry computer simulations, a collection of organic molecular species capable of rudimentary selection and evolution. The GARD model provides a rigorous kinetic analysis of simple sets of chemicals that manifest mutual catalysis. It is shown that catalytic closure can sustain self replication up to a critical dilution rate, related to the extent of mutual catalysis. The capacity for self replication in a mutually catalytic set is shown to be a graded property, quantitated by a critical parameter lambda(ci). GARD could be a simple model for a primordial scenario, in which replication and catalysis are performed by the same set of molecules. GARDobes are proposed to be entities that embody a GARD system, endowed with a non-DNA "compositional genome", and are presumed to have replicated slowly and imperfectly through mutually catalytic networks. Therefore, they are not bound by the standard cellular size constraints: GARDobes may be as small as a few nanometers, with 20-50 nanometers being rather large and elaborate. Active GARDobes, if ever found on earth or on other planets, would be distinguished by a highly biased organic chemistry, i.e. having only a small subset of the possible molecules of any given class. Their fossils might still bear the hallmarks of such a bias, with narrow spectra of molecules such as Polycyclic Aromatic Hydrocarbons or even with enantiomeric excesses.
1998
-
(1998) Physica A. 249, 1-4, p. 558-564 Abstract
A thorough outlook on the origin of life needs to delineate a chemically rigorous, self-consistent path from highly heterogeneous, random ensembles of relatively simple organic molecules, to an entity that has rudimentary life-like characteristics. Such an entity should be endowed with a capacity to express variation, undergo mutation-like changes and manifest a simple evolutionary process. To simulate such a system we developed the Graded Autocatalysis Replication Domain (GARD) model for explicit kinetic analysis of mutual catalysis in sets of random oligomers derived from energized precursor monomers. The kinetic properties of the GARD model are based on vesicle enclosure and expansion. With the additional assumption of spontaneous vesicle splitting, a GARD evolution scenario is envisaged as a consequence of pure chemical kinetics. Here we show how the GARD model can serve as a platform for investigating the dynamics of self-organization mechanisms in molecular evolutionary processes.
-
(1998) Origins of Life and Evolution of the Biosphere. 28, 4-6, p. 501-514 Abstract
A Graded Autocatalysis Replication Domain (GARD) model is proposed, which provides a rigorous kinetic analysis of simple chemical sets that manifest mutual catalysis. It is shown that catalytic closure can sustain self-replication up to a critical dilution rate, λc, related to the graded extent of mutual catalysis. We explore the behaviour of vesicles containing GARD species whose mutual catalysis is governed by a previously published statistical distribution. In the population thus generated, some GARD vesicles display a significantly higher replication efficiency than most others. GARD thus represents a simple model for primordial chemical selection of mutually catalytic sets.
-
(1998) Olfaction and Taste XII: An International Symposium. 855, p. 182-193 Abstract
The human olfactory subgenome represents several hundred olfactory receptor (OR) genes in a dozen or more clusters on several chromosomes. One OR gene cluster on human chromosome 17 has been characterized by us in detail. Based on a large-scale DNA sequence analysis, we have identified events of gene duplication and fusion as well as the generation of pseudogenes. The latter instances of 'gene death' could underlie the widespread phenomenon of human specific anosmias. Sixteen OR coding regions were found on this cluster, and six of them are pseudogenes. One of these pseudogenes, OR17-23, was found to be an intact open reading frame in an old world monkey. This may be a reflection of an OR repertoire diminution in man. A homology model of the OR protein was constructed by utilizing the rich information available on ~200 OR sequences. The putative odorant complementarity determining regions (CDRs) were found to consist of 20 hypervariable residues facing an interior cavity defined by transmembrane helices 3, 4 and 5. Such a model could be useful in analyzing additional OR gene sequences in the human genome in terms of odorant binding.
1994
-
(1994) Berichte der Bunsengesellschaft/Physical Chemistry Chemical Physics. 98, 9, p. 1166-1169 Abstract
We examined the behavior of auto-catalytic sets of polymers by a computer simulation. Polymers are allowed to interact with each other, whereby each polymer molecule may catalyze the formation and degradation of others. The system is subjected to a set of thermodynamic and kinetic constraints, including a constant influx of free energy, which keeps the system away from chemical equilibrium and thus enables the effect of catalysis. The system is found to continuously change and probe many possible values in the composition space. In this simulation we make use of a Receptor Affinity Distribution (RAD) model to predict the probabilities of interaction and catalysis. Our results indicate that initially random sets of polymers, under the assumptions of the model, might accumulate information (i.e., clustering in the composition space). Sets will occupy a limited region of composition space, and temporarily reproduce themselves or disperse and give rise to other sets.