My favorite subject-specific journal is Molecular Biology and Evolution (MBE). This journal publishes on topics primarily related to molecular evolution and evolutionary genomics, which are among my favorite subjects in biology. I’m happy to report that the latest issue of MBE is out today, and there are lots of great articles that I think will be of interest to folks here, many of which are open-access.
I sadly don’t have time to write up any of these articles, but I thought it might be useful to “sample” a few in case any of you would like to read and discuss them. Here are a handful that seem particularly interesting:
Population Structure Shapes Copy Number Variation in Malaria Parasites (open-access)
Abstract:
If copy number variants (CNVs) are predominantly deleterious, we would expect them to be more efficiently purged from populations with a large effective population size (Ne) than from populations with a small Ne. Malaria parasites (Plasmodium falciparum) provide an excellent organism to examine this prediction, because this protozoan shows a broad spectrum of population structures within a single species, with large, stable, outbred populations in Africa, small unstable inbred populations in South America and with intermediate population characteristics in South East Asia. We characterized 122 single-clone parasites, without prior laboratory culture, from malaria-infected patients in seven countries in Africa, South East Asia and South America using a high-density single-nucleotide polymorphism/CNV microarray. We scored 134 high-confidence CNVs across the parasite exome, including 33 deletions and 102 amplifications, which ranged in size from <500 bp to 59 kb, as well as 10,107 flanking, biallelic single-nucleotide polymorphisms. Overall, CNVs were rare, small, and skewed toward low frequency variants, consistent with the deleterious model. Relative to African and South East Asian populations, CNVs were significantly more common in South America, showed significantly less skew in allele frequencies, and were significantly larger. On this background of low frequency CNV, we also identified several high-frequency CNVs under putative positive selection using an FST outlier analysis. These included known adaptive CNVs containing rh2b and pfmdr1, and several other CNVs (e.g., DNA helicase and three conserved proteins) that require further investigation. Our data are consistent with a significant impact of genetic structure on CNV burden in an important human pathogen.
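The prediction in the first sentence of this abstract follows from a standard population-genetics result: selection removes a deleterious variant efficiently only when the product of effective population size and selection coefficient is large. As a rough illustration (not the paper's analysis), the sketch below uses Kimura's diffusion approximation for the fixation probability of a new mutation. It assumes a diploid population whose Ne equals its census size, and the selection coefficient and population sizes are made-up values.

```python
import math

def fixation_probability(n, s):
    """Kimura's diffusion approximation for a new mutation that starts at
    frequency 1/(2n) in a diploid population, assuming Ne == n."""
    if s == 0:
        return 1.0 / (2 * n)  # neutral case: fixation by drift alone
    # (1 - exp(-2s)) / (1 - exp(-4ns)), written with expm1 for accuracy
    return math.expm1(-2 * s) / math.expm1(-4 * n * s)

def relative_to_neutral(n, s):
    """Fixation probability as a multiple of the neutral expectation 1/(2n)."""
    return fixation_probability(n, s) * 2 * n

s = -0.001  # hypothetical fitness cost of a slightly deleterious CNV
for n in (100, 10_000):  # hypothetical small and large populations
    print(n, relative_to_neutral(n, s))
```

With these assumed values, the slightly deleterious variant behaves almost neutrally in the small population and is essentially never fixed in the large one, which is the pattern the abstract's prediction relies on.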
No Accumulation of Transposable Elements in Asexual Arthropods (open-access)
Abstract:
Transposable elements (TEs) and other repetitive DNA can accumulate in the absence of recombination, a process contributing to the degeneration of Y-chromosomes and other nonrecombining genome portions. A similar accumulation of repetitive DNA is expected for asexually reproducing species, given their entire genome is effectively nonrecombining. We tested this expectation by comparing the whole-genome TE loads of five asexual arthropod lineages and their sexual relatives, including asexual and sexual lineages of crustaceans (Daphnia water fleas), insects (Leptopilina wasps), and mites (Oribatida). Surprisingly, there was no evidence for increased TE load in genomes of asexual as compared to sexual lineages, neither for all classes of repetitive elements combined nor for specific TE families. Our study therefore suggests that nonrecombining genomes do not accumulate TEs like nonrecombining genomic regions of sexual lineages. Even if a slight but undetected increase of TEs were caused by asexual reproduction, it appears to be negligible compared to variance between species caused by processes unrelated to reproductive mode. It remains to be determined if molecular mechanisms underlying genome regulation in asexuals hamper TE activity. Alternatively, the differences in TE dynamics between nonrecombining genomes in asexual lineages versus nonrecombining genome portions in sexual species might stem from selection for benign TEs in asexual lineages because of the lack of genetic conflict between TEs and their hosts and/or because asexual lineages may only arise from sexual ancestors with particularly low TE loads.
Evolution of Prdm Genes in Animals: Insights from Comparative Genomics (open-access)
Abstract:
Prdm genes encode transcription factors with a subtype of SET domain known as the PRDF1-RIZ (PR) homology domain and a variable number of zinc finger motifs. These genes are involved in a wide variety of functions during animal development. As most Prdm genes have been studied in vertebrates, especially in mice, little is known about the evolution of this gene family. We searched for Prdm genes in the fully sequenced genomes of 93 different species representative of all the main metazoan lineages. A total of 976 Prdm genes were identified in these species. The number of Prdm genes per species ranges from 2 to 19. To better understand how the Prdm gene family has evolved in metazoans, we performed phylogenetic analyses using this large set of identified Prdm genes. These analyses allowed us to define 14 different subfamilies of Prdm genes and to establish, through ancestral state reconstruction, that 11 of them are ancestral to bilaterian animals. Three additional subfamilies were acquired during early vertebrate evolution (Prdm5, Prdm11, and Prdm17). Several gene duplication and gene loss events were identified and mapped onto the metazoan phylogenetic tree. By studying a large number of nonmetazoan genomes, we confirmed that Prdm genes likely constitute a metazoan-specific gene family. Our data also suggest that Prdm genes originated before the diversification of animals through the association of a single ancestral SET domain encoding gene with one or several zinc finger encoding genes.
This next one is on a topic that comes up here from time to time, and I think it will be of interest to many of you. Sadly, it’s paywalled, but if you don’t have access through a university library, feel free to send me a pm.
Are Human Translated Pseudogenes Functional?
Abstract:
By definition, pseudogenes are relics of former genes that no longer possess biological functions. Operationally, they are identified based on disruptions of open reading frames (ORFs) or presumed losses of promoter activities. Intriguingly, a recent human proteomic study reported peptides encoded by 107 pseudogenes. These peptides may play currently unrecognized physiological roles. Alternatively, they may have resulted from accidental translations of pseudogene transcripts and possess no function. Comparing between human and macaque orthologs, we show that the nonsynonymous to synonymous substitution rate ratio (ω) is significantly smaller for translated pseudogenes than other pseudogenes. In particular, five of 34 translated pseudogenes amenable to evolutionary analysis have ω values significantly lower than 1, indicative of the action of purifying selection. This and other findings demonstrate that some but not all translated pseudogenes have selected functions at the protein level. Hence, neither ORF disruption nor presence of protein product disproves or proves gene functionality at the protein level.
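For readers unfamiliar with ω, here is a minimal sketch of how such a ratio can be computed from pairwise counts. It is a simplified counting approach with a Jukes-Cantor correction, not necessarily the method used in the paper, and it assumes the numbers of synonymous and nonsynonymous sites and differences have already been tallied elsewhere. The function names and the example counts are hypothetical.

```python
import math

def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple substitutions."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def omega(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of nonsynonymous to synonymous substitution rates (dN/dS)."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    return d_n / d_s

# Hypothetical counts for one human-macaque ortholog pair.
print(omega(nonsyn_diffs=10, nonsyn_sites=700, syn_diffs=20, syn_sites=300))
```

A value near 1 is what neutral evolution would produce, while a value well below 1 suggests that amino-acid-changing substitutions are being removed by purifying selection. Deciding whether a value is significantly below 1 needs a statistical test, which this sketch does not attempt.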
There are a lot of other interesting papers in this issue, but for the sake of brevity, I’ll stop here. Happy reading!
Also, apologies to Ira Flatow for shameless theft of his excellent radio show’s title.
As Former President of the SMBE, and thus a current member of its Council, let me thank you for the endorsement of our journal.
In fact, as I write this, I am participating in an online Skype/cell phone conference connecting to the SMBE’s winter council meeting in Florida. They just changed the page charges (now 1st 10 pages free).
One NIH researcher teaching my RNA class asserts that pseudo genes are part of the RNA-interference regulatory schema and act as regulatory decoys.
This is of importance to the NIH since they study hereditary disease and they see damaged pseudogenes as related to damage in health.
https://en.wikipedia.org/wiki/RNA_interference
Example:
http://www.pnas.org/content/108/20/8345.full
and
My professor specifically cited this:
http://www.nature.com/nrg/journal/v11/n8/full/nrg2835.html
stcordova,
Oh, Sal, Sal. That’s a typical method of those who want to claim that there’s no such thing as junk DNA: point to a few members of a class that have acquired functions and jump from there to the inference that all members of that class are functional.
But the MBE paper offers a solution. Note that it provides a test for function of translated pseudogenes: the ratio of nonsynonymous to synonymous changes. And it finds that a small minority of translated pseudogenes pass the test of function.
stcordova,
Sal, I’ve got to agree with John. I don’t think that there is much resistance to the idea that some fraction of operationally-defined pseudogenes might currently have – or in the future take on – functional roles. In such cases, perhaps these “pseudogenes” should be given a different name if we want to retain lack of function in the definition of a pseudogene.
But from basic molecular biology, we know that genes duplicate with some regularity, and we know that fitness-impairing mutations can accumulate if not prevented by selection. Unless we’re willing to argue that all products of gene duplication events (I’m ignoring unitary pseudogenes) will be under strong purifying selection, what would prevent the accumulation of pseudogenes in a genome?
But you don’t know it doesn’t have function. We barely have the experimental apparatus to elucidate what little we know. These guys work years on an experiment on one gene or complex, and in some cases one measly amino acid residue on a histone tail (not that that relates to pseudogenes immediately, but just to highlight the difficulty of molecular biology).
The hard work to find the molecular reality is in the lab, and it moves at a snail’s pace, but it is real scientific progress. There are a buzzillion pseudo genes in the biological world, and we’ve only scratched the surface in terms of actually testable experiments.
What are the non-functional pseudogenes translated to?
Of the putative pseudogenes examined in this study, most do not have evidence that they are translated into anything. Those that are translated are, by definition, translated into amino acid polypeptides. A majority of the translated pseudogenes appear to be evolving at a rate consistent with neutral evolution, while there is evidence that a handful of them are evolving under purifying selection.
Joe Felsenstein,
Thanks, Joe! I hope that in the not-too-distant future, I will have produced research worth submitting to MBE.
Think how many pebbles there are on the beach, and how many haven’t been tested for perfect sphericity.
Dave Carlson,
How was it determined that gene duplication is a stochastic process?
They follow known probability distributions.
Bubba Gump’s First Law of Mutation.
Mutations are mostly detrimental, and they are either programmed to produce increases in complexity, or are tweaked by an invisible designer to produce desired outcomes.
From the TE paper:
I actually don’t find this surprising at all. A significant mode of TE transmission is the alternating diploid/haploid state. A TE in asexuals can’t jump between lineages.
Allan Miller,
Allan,
I’m not sure I follow that. Can you elaborate?
I hope this isn’t off-topic. I found this while wandering around related stuff.
https://www.researchgate.net/publication/255980753_A_Tale_of_Two_Crocoducks_Creationist_Misuses_of_Molecular_Evolution
There’s a free download button near the bottom of the page.
I deny that lab experiments are the only way to discover lack of function. We can see that in the general case, genome-wide, there are very good arguments that there is lots of junk (around 90% in the case of the human genome). In individual cases, there are tests for function that don’t involve watching that function happen experimentally, one of which the MBE paper mentioned: comparative substitution rates. That paper assayed a bunch of the class of translated pseudogenes and found that most failed the test for function. OK, maybe they have unknown functions that can’t be tested for. But maybe there’s an invisible, intangible elephant in your living room.
John Harshman,
I have very little knowledge that would allow me to assess the positions on junk DNA.
Encore says 80% and you say 10% is a good estimate. I honestly have never seen an argument with such a wide gap. What is causing this huge range of opinions among scientists? Is the 80% claim a sales pitch to get more genome research funding?
Sounds like hype, and has mostly been backtracked. 90 percent is not conserved, which means that whatever function it might have is unrelated to specific sequence.
“Encore”? I think you mean ENCODE?
ENCODE’s definition of what makes a gene “function” boils down to it’s not 100% biochemically inert. Well, okay, I don’t really have a problem with saying that 80% of the genome does, indeed, retain some degree (however small) of biochemical activity. What I do have a problem with, is saying that some degree of biochemical activity is synonymous with functional gene.
That’s what I would have thought. But how does that answer Sal who I presume was talking about something completely different?
ETA: It seems rather clear to me anyways that Sal was talking about RNA not proteins.
What do you think Sal said that wasn’t answered?
colewd,
No one knows much of anything. But medical researchers trying to find cures are under the working assumption much of the heritable disease issues will be associated with non-coding DNA.
Why? One fact is that 90% of heritable diseases associated with Single Nucleotide Polymorphisms (a change in a single DNA letter), are in the non-coding regions — and the non-coding is 98% of the genome. It would not be a bad working hypothesis to assume function until otherwise proven wrong given such facts.
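The two percentages above can be put on a per-base footing with a little arithmetic. The sketch below takes both figures at face value and assumes, as a simplification, that every disease-associated SNP falls in exactly one of the two categories. The function name is hypothetical.

```python
def per_base_enrichment(share_of_hits, share_of_genome):
    """How over- or under-represented a region is relative to its size.
    1.0 means hits are spread in proportion to the region's length."""
    return share_of_hits / share_of_genome

# Figures quoted in the comment: 90% of disease-associated SNPs are
# non-coding, and non-coding DNA makes up 98% of the genome.
noncoding = per_base_enrichment(0.90, 0.98)
coding = per_base_enrichment(0.10, 0.02)
print(round(noncoding, 2), round(coding, 2))
```

On these numbers, non-coding DNA carries slightly fewer associated SNPs per base than a uniform spread would give, and coding DNA carries about five times more.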
The NIH ENCODE, Roadmap, and E4 projects (perhaps 600 million budgeted) are driven by medical researchers who see heritable disease frequently associated with parts of the genome where the evolutionary biologists said there is functionless junk. Money keeps flowing because they are making their case and advancing medical science vs. closing their eyes and saying, “no point in investigating since this is junk.”
What if the NIH is wrong? I’d say humanity has less to lose than it stands to gain by making a working assumption most of the genome has some purpose in establishing and maintaining health.
To the nay-sayers who say 90% is junk, I’d respond by saying, “Do you think you’d survive if that 90% were knocked out of your genome?”
In 2012:
All I can say is that the position I take in this discussion is pretty much the line I hear when I interact with NIH researchers.
Yes, amazing isn’t it. So much for the issue being settled science either way. No one knows for sure what the answer is.
What’s the harm in assuming the NIH and medical community are right? For some reason some evolutionary biologists don’t like the idea that biological systems are exquisite Rube Goldberg machines that have excessive extravagance and go through more gyrations than absolutely needed for survival.
stcordova,
It’s an absurdly common thing for people who don’t understand what “junk DNA” means to equate it with “non-coding DNA”, as you do and as the abstract you quote also does. Do you understand the problem with that?
Do you understand that a mutation to junk DNA can result in phenotypic effects?
Do you understand that a SNP in junk DNA can “be associated” with a disease even if it has nothing to do with causing the disease, as long as it’s in the same linkage group as the actual cause?
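The linkage point can be made concrete with the usual r² measure of linkage disequilibrium between two loci. This is a minimal sketch under stated assumptions: haplotype counts are known, one locus is a hypothetical neutral marker SNP and the other a hypothetical causal variant, and the counts are invented for illustration.

```python
def ld_r_squared(n_both, n_marker_only, n_causal_only, n_neither):
    """r^2 between a marker allele and a causal allele, computed from
    counts of the four possible two-locus haplotypes."""
    total = n_both + n_marker_only + n_causal_only + n_neither
    p_marker = (n_both + n_marker_only) / total
    p_causal = (n_both + n_causal_only) / total
    if p_marker in (0.0, 1.0) or p_causal in (0.0, 1.0):
        raise ValueError("both loci must be polymorphic")
    d = n_both / total - p_marker * p_causal  # disequilibrium coefficient D
    return d * d / (p_marker * (1 - p_marker) * p_causal * (1 - p_causal))

# Marker always travels with the causal variant: r^2 is 1, so the marker
# "is associated" with the disease without causing anything.
print(ld_r_squared(50, 0, 0, 50))
# Marker and causal variant are independent: r^2 is 0.
print(ld_r_squared(25, 25, 25, 25))
```

When r² is high, an association study cannot distinguish the marker from the causal variant by statistics alone.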
The magazine is not intellectually accurate. It does not deal with molecular evolution, or with biology, from a scientific viewpoint.
The operative word is COMPARATIVE.
They just assume an evolution paradigm and then do comparative speculation until the cows come home.
I say you would never find a single article dealing with scientific evidence for conclusions about the evolution of things from molecular evidence, except in very minor cases.
If not, just name three.
These cats just can’t see that it’s not science just because you compare DNA to draw relationships.
They can’t see that they must FIRST prove evolution has taken place before doing DNA trees.
Remove the presumption of evolution and this mag will be found void of scientific discovery, innovation, or insight.
At least they don’t use fossils, but it’s still the same flaw of reasoning.
Why are you asking me? Salvador asked about mRNA not translated to protein. John responded by appealing to mRNA translated to protein.
If you don’t understand what is being discussed just say so.
You do know that not all DNA that is transcribed into RNA is translated into a protein, don’t you?
That has nothing to do with the discussion. The percentage having any function remains about 10 percent. The part Sal talked about is a fraction of one percent.
90 percent is not conserved. If it has function, the function is insensitive to sequence and mutates without being subject to purifying selection.
I’ll just take it on your say so.
Feel free to correct my numbers.
Yes I do. Why do you ask?
Sorry that’s a terrible way to detect function. I just pointed out pseudo gene RNAs can act as decoys in a regulatory schema, so “synonymous” is meaningless since they don’t code for proteins since they are untranslated.
Synonymous changes have probably zero to do with transcribed but untranslated pseudo genes that act as microRNA decoys.
For example, one can’t deduce the following sort of pseudogene function with the MBE study you cited, it has to be done through molecular biology, not evolutionary biology:
and the following won’t be detected by the synonymous substitution test either,
NOTES:
As an aside, perhaps partly because of all the post-transcriptional modifications, 10% of the transcriptome, ahem, may not even correspond directly to DNA!
Dave Carlson,
A TE can, in principle, jump onto any chromosome. In a sexual population, any such locus can then ‘colonise’ the population, as a subgenome fragment. Since haploids come together and then separate as gametes, a TE can hop from haploid to haploid as a kind of sexually transmitted disease. It may well be that the only genetic contribution of the individual in question is that very fragment.
In an asexual, however, a TE jumping to another chromosome takes place in an undivided, nonrecombining genome. It can’t get outside that lineage. The only way that this can ‘take over’ the population – ie, can fix – is if the entire genome in which it resides eliminates all competing genomes. But of course there are barriers to this, linkage effects and interference, and the selective effect of the TE itself, which will be more damaging in this confined space.
An active TE in an asexual lineage is more likely to degrade its genome, and hence that is likely to be eliminated, by its effect on organismal fitness. The asexual genome is ill equipped to respond to this threat, because there is no subgenome fragmentation – responsive alleles occur more frequently in ‘the wrong place’, and cannot recombine into infected genomes. Any response to the threat must occur in the lineage in which the threat exists. In a sexual genome, conversely, there is locus-level selection. An allele that deals with the TE, (or selection for reduced virulence), will be favoured in the TE-infested population, even if it has its origin in a non-infected line. So, you get something of a stand-off. TEs spread more readily, but so do counter-measures, leading (IMO) to an expectation of higher chance of locating a TE in a sexual population.
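The "spread" half of this argument can be illustrated with a toy deterministic model. It is a sketch under loud assumptions and is not taken from the TE paper: a single TE insertion is treated as an allele, carriers pay a dominant fitness cost `s`, and in the sexual population heterozygotes pass the TE to a fraction `k` of their gametes (more than half, because the element copies itself). All parameter values are hypothetical, and the counter-measures half of the argument is not modeled.

```python
def sexual_step(p, k, s):
    """One generation of random mating for a TE allele at frequency p.
    Carriers have fitness 1 - s; heterozygotes transmit the TE to a
    fraction k of gametes (k = 0.5 would be ordinary Mendelian inheritance)."""
    q = 1.0 - p
    w_carrier = 1.0 - s
    mean_w = (p * p + 2 * p * q) * w_carrier + q * q
    return (p * p * w_carrier + 2 * p * q * w_carrier * k) / mean_w

def asexual_step(p, s):
    """One generation for a TE-bearing clone at frequency p. The TE cannot
    leave its lineage, so only the fitness cost matters."""
    return p * (1.0 - s) / (1.0 - p * s)

def run(step, p0, generations, **params):
    """Iterate a one-generation update and return the final frequency."""
    p = p0
    for _ in range(generations):
        p = step(p, **params)
    return p

print(run(sexual_step, 0.01, 50, k=0.9, s=0.1))  # TE spreads despite its cost
print(run(asexual_step, 0.01, 50, s=0.1))        # TE-bearing clone declines
```

In this toy model the same costly element invades the sexual population whenever 2k(1 - s) exceeds 1, but can only lose ground in the asexual one.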
stcordova,
You are taking the rather absurd position that evolutionary considerations are leading people to tell medical researchers – or they are telling themselves – to ignore 90% of the genome.
Their test would not be applicable in the cases you mention because it was designed specifically to look for selective constraints in putative pseudogenes that are translated. I’m not sure why you would expect the method to apply to an entirely different set of questions that the paper didn’t ask.
Allan Miller,
Thanks, Allan. That makes sense and is consistent with some ideas regarding TE proliferation in recombining vs. non-recombining genomes I’ve encountered previously. Interesting stuff!
It’s a way to detect function in translated pseudogenes. I used it as an example because translated pseudogenes were what the article was about. Now you did change the subject to pseudogenes in general, but the test merely needs to be changed to sequence conservation in general. A transcript that interferes with other transcripts must have a sequence similar to that transcript and thus will be subject to purifying selection. It will accumulate changes more slowly than the neutral rate.
You ignored my other questions, so let me restore them:
It’s an absurdly common thing for people who don’t understand what “junk DNA” means to equate it with “non-coding DNA”, as you do and as the abstract you quote also does. Do you understand the problem with that?
Do you understand that a mutation to junk DNA can result in phenotypic effects?
Do you understand that a SNP in junk DNA can “be associated” with a disease even if it has nothing to do with causing the disease, as long as it’s in the same linkage group as the actual cause?
John Harshman,
Then it isn’t junk DNA
Transposons carry within their sequence the coding for two of the enzymes required for them to do their thing (move around). How does natural selection, drift or neutral construction account for that?
How would you define “junk DNA”?
Frankie,
They don’t. There are mechanisms of genetic transmission beyond those mediated solely through differential vertical inheritance. Which you’d know, of course, ‘cos you ‘understand science’.
Nice non-response. Transposons exist and your position doesn’t have a mechanism capable of explaining that existence. Which you would know if you understood evolutionism and science.
If it has some effect it isn’t junk.
But it doesn’t… it’s only that it can mutate and gain function.
Don’t confuse FrankenJoe by stating facts.
So if it has no effect until the mutation occurs, isn’t it junk up until that point?
So if you remove it and nothing happens, is it junk?
Frankie,
Of course it does.
What’s yours, meanwhile? No, let me guess – Design. Where, when, why, how? A smidgeon of that detail you repetitiously and pugnaciously demand would be ever so welcome. In your own time.
stcordova,
Thanks for the information. I really don’t understand this debate. Why call part of the genome that we don’t understand junk? How much of the function of the genome do we really understand? 1%? From my study of vitamin D and cancer I have come to realize that some of transcription is controlled by regulation of proteins. Beta-catenin, a protein that drives transcription of many cell cycle control proteins, is regulated by vitamin D and by a protein complex including APC and GKP that breaks it down. When excess beta-catenin is in the cell, the cell cycle can go out of control, causing cancer. A real detailed description of how this process works is still lacking, but I know that the control mechanisms of the chromosomes, such as histones, transcriptional proteins, and intron codes, and the way they go from a tightly wound chromosome to transcription, are complex yet very tightly controlled.
colewd,
Origin of the term
The link to Ohno’s original paper is via a junk-hostile site (Pellionisz, an apparent crank who turned up at Sandwalk from time to time). But the reasoning is interesting, and sound.