Discovery of a New Way that Bacteria Regulate Their Genes


The amino acid sequence of proteins is encoded in DNA in the form of three-letter words (codons). Most amino acids can be encoded by more than one codon, and for a given amino acid, each organism tends to use some codons more frequently than the alternatives. This codon usage preference, particularly near the start of genes, has a strong influence on gene expression, but its causes and precise effects are unclear. Scientists at Harvard University analyzed thousands of synthetic gene constructs containing either frequent or infrequent codons near their start. Using next-generation sequencing to measure gene expression and fluorescence-based cell sorting to assess protein abundance, the investigators concluded that the presence of infrequent codons near the start of genes dramatically increases protein expression. Furthermore, using computational methods to predict RNA structure, the authors showed that the nucleotide sequences of infrequent codons reduce the formation of secondary structure in the messenger RNA (mRNA), the molecule that is translated into protein, which facilitates translation. This reduction in mRNA structure is in large part responsible for the observed increase in expression of genes with infrequent codons. These results have important implications for the design of synthetic genes that can be expressed more efficiently in engineered organisms for the production of new bioproducts such as biofuels.
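As a rough illustration of what "frequent" and "infrequent" codons mean, the following Python sketch computes the relative usage of each codon among the codons for the same amino acid, then reports what fraction of the first few codons of a gene fall below a usage cutoff. It is not the analysis used in the study: the function names, the 10-codon window, and the 0.2 cutoff are hypothetical choices made for this example, and the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(seq):
    """Split a coding DNA sequence into in-frame codons, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_codon_usage(seqs):
    """Return, for each observed codon, its share among all codons for the same amino acid."""
    counts = Counter(c for s in seqs for c in codons(s) if c in CODON_TABLE)
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


def n_terminal_rare_fraction(seq, usage, window=10, threshold=0.2):
    """Fraction of the first `window` codons whose relative usage is below `threshold`.

    The window size and threshold are illustrative assumptions.
    Codons absent from `usage` are treated as having usage 0.0 (rare).
    """
    head = codons(seq)[:window]
    if not head:
        return 0.0
    rare = sum(1 for c in head if usage.get(c, 0.0) < threshold)
    return rare / len(head)
```

In practice, `relative_codon_usage` would be given the coding sequences of a whole genome, and `n_terminal_rare_fraction` would then be applied to each gene or synthetic construct of interest.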


Goodman, D. B., G. M. Church, and S. Kosuri. 2013. “Causes and Effects of N-Terminal Codon Bias in Bacterial Genes,” Science 342(6157), 475–79. DOI:10.1126/science.1241934.