IMAGINE your excitement as a budding young researcher, taking on your first piece of research as part of an undergraduate summer studentship. The project is to characterise a gene that, within medically important bacteria, encodes resistance to a key group of antibiotics—the tetracyclines. The gene in question is described in a peer-reviewed specialist journal, but no-one is quite sure how the gene works.
So why are we interested? Well, if we’re to understand and address the problem of antibiotic resistance, one of the many things we need to do is understand how resistance genes work: their mechanisms of resistance. This gene appears very different from any other gene that performs a similar function, and because of this it has been classed into its own ‘family’ of resistance determinant, which appears in reviews and textbooks. It has also been screened for, and found, in a notable ‘superbug’, VRSA (the vancomycin-resistant big brother of MRSA). The presence of this gene may also have influenced whether or not a patient was given tetracycline, because if a genetic screen comes back positive for a tetracycline resistance gene, then you’re not likely to recommend tetracycline (which may have been the drug of choice).
The only problem is, the gene in question is not an antibiotic resistance gene, but we won’t know this until we’ve spent the summer working on it. Indeed, it won’t be known until the project is inherited as a pet project by a postdoc. The fact is, the gene had already been recognised for what it was over a decade earlier, though this was never reported. The person who immediately dismissed the gene’s published function all that time ago was in fact the one-time PhD supervisor of the postdoc who picked up the project, and a world expert on the family of genes to which our mystery gene belongs, but I’ll come back to that.
My post last week, ‘On publishing negative results…’, briefly described the issue of positive publication bias in scientific and medical literature, and was a preamble to the story of my own experience publishing negative results. So let me now tell you about how I tried, and succeeded, in getting ostensibly negative results published.
Obviously I was the postdoc I describe above, and the project to study this ‘tetracycline resistance gene’ had burned through the summer of an undergraduate and a graduate student, and then through another six months of my time (on and off). It is not uncommon as a postdoc to undertake several smaller projects in parallel to the major project for which the postdoc is employed. Sometimes you win, sometimes you lose. Despite having accumulated plenty of ‘file drawer’ datasets and potential papers over the years, this was one we tried—out of principle—to follow through to publication.
In 1996 a paper from the laboratory of Dr Lolita Daneo-Moore, Temple University PA, reported the isolation of a novel tetracycline resistance gene on a plasmid found within the gut bacterium Enterococcus faecium:
Ridenhour et al. (1996) A novel tetracycline-resistant determinant, tet(U), is encoded on the plasmid pKq10 in Enterococcus faecium. Plasmid 35: 71-80 [DOI]
For information, plasmids are rings of DNA that exist within bacterial cells, but are distinct from the cell’s own DNA. They encode a library of weird and wonderful (though sometimes cryptic) utilities that bacteria can inherit or share between themselves—these utilities often include antibiotic- and antiseptic-resistance, among others.
Whilst I cannot hope to give a full rundown of all the details of both this paper and my own paper, in essence Ridenhour’s work represents a classic piece of reductive scientific investigation. Their aim was to find out what made several E. faecium isolates resistant to the antibiotic tetracycline. They homed in on a single small plasmid, ‘pKQ10’, present in all the resistant isolates, and having determined the DNA sequence of the plasmid, they found a candidate gene that they called tet(U). As is the norm, once you have a DNA sequence for something, you can compare this sequence, or the amino acid sequence that the DNA ultimately translates to, with others in databases such as GenBank, to see if anything like it already exists. Based on this sort of computer-based analysis, the authors suggested that it shared some similarity with other types of tetracycline resistance genes, but was different enough to be assigned its own class, Tet(U). Here we use a capitalised ‘Tet(U)’ to describe the protein family, and the lowercase italics ‘tet(U)’ to describe the gene that encodes it.
It was on this basis, many years later, that our lab decided to figure out how the protein produced by tet(U) actually worked. Our lab has had some success picking apart resistance mechanisms in this way, using a biochemical approach to purify the protein away from the cells and then test its activity. That takes a sentence to say, but took several months to do!
Tetracycline is an antibiotic that targets the protein-making machinery of bacteria. So to study its activity, we can add our purified Tet(U) protein to a small tube that contains an active protein-making soup extracted from bacteria, and see whether its presence protects this protein production from the effects of tetracycline. It didn’t. Nor in fact could we identify any kind of resistance associated with this gene in any other context.
Repeating the original experiments
When I inherited the project, the first thing I did was go back to the original paper and scrutinise both the report, and the DNA sequences they’d submitted to the database. What struck me was that the gene in question, tet(U), looked very familiar. I spent all of 5 minutes double checking the DNA sequence against GenBank and it became clear why. In terms of familiarity, tet(U) was as obvious to me in function as a double-decker bus is to most people. But perhaps the confusion had been that it wasn’t a whole gene I was expecting—it was just the back half. This is because somewhere along the line, when the original lab determined the DNA sequence, two DNA mutations (or sequencing errors) crept in that split a single obvious gene into two, smaller, more cryptic genes (at least to those less familiar with them).
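To make that concrete, here is a toy sketch in Python of how a single wrong base can split one gene into two smaller ones. Everything in it is an assumption for illustration: the sequence is invented (it is not the real pKQ10 sequence), the helper `find_orfs` is hypothetical, and it only scans one reading frame on one strand, which is far cruder than a real gene-finding tool.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) pairs for ATG...stop stretches in reading frame 0.

    Hypothetical helper for illustration: one strand, one frame only.
    """
    orfs = []
    start = None
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if start is None and codon == "ATG":
            # First start codon seen since the last stop: open a new ORF.
            start = i
        elif start is not None and codon in STOP_CODONS:
            # Stop codon closes the ORF; keep it if it is long enough.
            if (i - start) // 3 >= min_codons:
                orfs.append((start, i + 3))
            start = None
    return orfs


# Invented toy sequence (NOT the real plasmid): one uninterrupted gene.
intact = "ATG" "GCT" "AAA" "CAA" "ATG" "GGT" "CTT" "GAA" "TAA"

# A single C->T error at position 9 turns the codon CAA into the stop codon TAA.
damaged = intact[:9] + "T" + intact[10:]

print(find_orfs(intact))   # one ORF spanning the whole sequence
print(find_orfs(damaged))  # two shorter ORFs: a front half and a back half
```

In the toy, the internal ATG that was just another codon in the intact gene becomes the apparent start of a second, shorter gene once the premature stop appears upstream of it. That is the sense in which tet(U) was only the back half of something bigger.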
We have to remember that in 1996 sequencing was rather more error-prone than it is today. Also, for the sake of ease (and I can’t blame them), the authors had chosen an error-prone way to make enough DNA for the sequencing. It’s also true that the GenBank database has had many more submissions in the past 15 years, making the identification of ‘tet(U)’ more obvious now than back then. It also didn’t help that their bioinformatic analysis stretched the imagination into the patently bizarre: the similarities they reported between tet(U) and other tetracycline resistance genes seemed only possible with wanton desire, rather than empiricism.
So what is tet(U)? It’s actually the back end of a replication gene. This codes for a protein that the plasmid produces so that the plasmid can essentially make more of itself, i.e. replicate. It’s a pretty important protein, and not one that, in this family of proteins, you would expect to find in two pieces and still functional. It’s also from the same family of replication proteins in which my former PhD supervisor has world-class expertise.
Given that other researchers and clinical microbiologists were continuing to screen for the presence of tet(U), again potentially informing treatment choices, we felt it prudent to inform the community that it should not be considered a tetracycline resistance determinant. However, with myself and the lab head being self-doubting and meticulous chaps, we thought we’d run through their original experiments, just in case there was something obvious we were missing.
The start of the real problems
The real problem with repeating this work was the fact that Dr Daneo-Moore—senior author of the original paper—is sadly deceased, and the lab and materials have long since dispersed. No-one I contacted had any of the original bacterial isolates, or indeed the plasmid! All we had was the DNA sequence of their plasmid deposited in the database, and the knowledge that tet(U) has cropped up time and again where people have screened for it. By screening, we mean that people are looking for one small section of the gene, rather than trying to understand the genetic context where they find it, i.e. determining whether it’s a distinct gene or in fact part of a (much larger) replication gene. In fact, the saving grace was a recent mass DNA sequencing project that had inadvertently delivered up the sequences I’d need, but I’ll come back to that.
For the time being, I did what I could. I had the whole plasmid synthesised according to the original published DNA sequence, and then dutifully performed all the cuttings and splicings as Ridenhour et al. described, yet saw none of the tetracycline resistance that they did. So this was the point that we decided we’d write a short article to update the literature and have done with it.
The peer review hurdle
I have been a reviewer for six scientific publications for several years, and have performed my duties rigorously and in as scientific a manner as possible, looking for sound logic, methodology and substantiated interpretation. The peer review of our paper was a little less than this: one reviewer saw what we were saying, and barring a few stylistic points was happy with it. The other reviewer cost me another five months’ work.
You see, the thing is, despite pointing at the gene in question and shouting very loudly, “LOOK, IT’S CLEARLY THE BACK END OF A BUS ☞” (i.e. it’s a replication protein, it’s identical to other replication proteins, this is just a small plasmid with one gene and that gene is a replication protein, end of…), the burden of proof was put upon us to disprove that tet(U) was a tetracycline resistance gene. This is of course the essence of Popperism, but we felt we had essentially done this. However, it had been hard to draw a line and say, “tet(U) does not confer tetracycline resistance because it is not a tetracycline resistance gene”, because at the back of my mind I’m thinking, “hmm, does it not confer tetracycline resistance because I am crap and can’t make it do what it’s supposed to do?”
This, in essence, is the big reviewer ‘fob off’. What researcher has the time to work through so many petty experiments to try to show that the gene isn’t the one thing, all the while ignoring the fairly strong evidence that it’s actually another thing altogether?
Sadly, one of the reviewers wasn’t a biochemist, so proceeded to ‘inform’ me (a biochemist) that I couldn’t use the sorts of techniques I’d been using. This is despite my having used them for my entire career, and despite their having been used successfully in biochemistry research for over 15 years, including by others to do just the sort of experiments we’d described. That reviewer also wanted another control experiment, namely to do the same experiments on a totally different tetracycline resistance determinant, Tet(M). This is a ‘real’ tetracycline resistance determinant that the original authors reported as sharing some similarity with Tet(U), but actually it shares no such similarity. Comically, I was also told, in a clairvoyant-like manner, that the biochemical approach I’d taken would “not work” for Tet(M) either. But of course the techniques worked beautifully with Tet(M), and this protein did everything that Tet(U) was supposed to do but actually didn’t, i.e. confer resistance to tetracycline. So we had at least vindicated the approach!
I was also told that the only proper experiment I could do was to work with the original material used by the lab, and ignore the sequence they had reported in their paper. So I was being told something that would mean that the hypothesis is now untestable by science, because I could not go back in time and work with exactly whatever it was the original authors were working with. Obviously this is ridiculous: the sequence in their paper is synonymous with tet(U), and is the basis on which it has been accepted in the literature and used to screen for current tet(U) signals in bacteria. If they have failed to include some other important factor, then they have not in fact described the role of tet(U) (or the true source of resistance) as it stood.
Having performed many more experiments and permutations to generate yet more negative results, I returned to the original tet(U) DNA sequence and compared it against the database again, just for good measure, as it’d been 6 months since my last analysis. Wouldn’t you know, an identical match appeared. I got excited and then inputted the whole sequence of the plasmid, and got an almost identical match with just a few differences. Those differences would be the ones I had predicted would turn an apparent plasmid replication gene into two smaller cryptic genes. Our tet(U) was a mere constituent of a gene labelled as a replication gene on this new matching sequence; this sequence had been submitted to GenBank as part of a partial genome of Enterococcus faecium (the same species as the original host of the plasmid) from a laboratory at University Medical Center Utrecht in the Netherlands.
I just assumed that a plasmid similar to mine had gotten sick of its independent life and inserted itself into the genome of the bacterium, which often happens. I intended to include it as a mere footnote. However, a few days later I found a scientist with a familiar name following me on Twitter, Dr Willem van Schaik (@WvSchaik). It took me a while to twig, but I realised that coincidentally it was his lab that was doing this sequencing. I sent @WvSchaik a tweet to say that I’d been poring through his sequence data (as you do), but had a question about one contig (the unsorted section of sequence where I’d seen my plasmid). He told me it was in fact a separate plasmid, not part of the genome as I’d thought.
This is where I got more excited and asked @WvSchaik if he wouldn’t mind sending me the bacterial strain this sequence came from. As promised, Willem sent the strain. I grew it up, extracted just the plasmid DNA, and found there were actually three plasmids, one of which looked about the right size. A few quick tests later and I had the plasmid, but not just any plasmid: it was pKQ10, the original plasmid! It had first been studied 16 years previously in a lab in Pennsylvania, and had since been lost to science. Now it had turned up in Utrecht, in an Enterococcus strain isolated from an infection in an old man.
And do you know what was interesting about pKQ10? Absolutely nothing. All it does is replicate itself. It is an end unto itself, it exists merely to exist, and does not confer resistance to tetracycline.
Caryl et al. (2012) “tet(U)” is not a tetracycline resistance determinant. Antimicrob Agents Chemother 56: 3378-3379 [DOI]
The enterococcal plasmid pKQ10 has been reported to carry a poorly characterized tetracycline resistance determinant designated tet(U). However, in a series of studies intended to further characterize this determinant, we have been unable to substantiate the claim that tet(U) confers resistance to tetracyclines. In line with these results, bioinformatic analysis provides compelling evidence that “tet(U)” is in fact the misannotated 3′ end of a gene encoding a rolling-circle replication initiator (Rep) protein.
If only you knew how careful we had to be with every.choice.of.word throughout the manuscript so that we weren’t accused of being ‘judgemental’.
N.B. No disrespect is intended, or should be ascribed, to the authors of the original paper. This is simply the iterative nature of science as it should happen, and is a product of progress in molecular and bioinformatic tools, and some stubborn determination to publish negative results.
[This post was restored from a WayBackWhen archive. It was originally posted to a blog called ‘The Gene Gym’ that began life on the Nature Network in 2010, and then moved to Spektrum’s SciLogs platform.]