
The Operon

Within its tiny cell, the bacterium E. coli contains all the genetic information it needs to metabolize, grow, and reproduce. It can synthesize every organic molecule it needs from glucose and a number of inorganic ions.

Many of the genes in E. coli are expressed constitutively; that is, they are always turned "on". Others, however, are active only when their products are needed by the cell, so their expression must be regulated.

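Two examples, both discussed below: the five enzymes needed to synthesize tryptophan, which the cell stops making when tryptophan is available, and the three enzymes (β-galactosidase among them) that the cell makes in quantity only when lactose is added to the culture medium.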

The lac operon

The capacity to respond to the presence of lactose was always there. The genes for the three induced enzymes are part of the genome of the cell. But until lactose was added to the culture medium, these genes were not expressed (β-galactosidase was expressed weakly — just enough to convert lactose into allolactose).

The most direct way to control the expression of a gene is to regulate its rate of transcription; that is, the rate at which RNA polymerase transcribes the gene into molecules of messenger RNA (mRNA).


Gene transcription begins at a particular nucleotide shown in the figure as "+1". RNA polymerase actually binds to a site "upstream" (i.e., on the 5' side) of this site and opens the double helix so that transcription of one strand can begin.

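The binding site for RNA polymerase is called the promoter. In bacteria, two features of the promoter appear to be important: a sequence close to TATAAT centered about 10 base pairs upstream of the +1 site (the −10 region), and a sequence close to TTGACA centered about 35 base pairs upstream (the −35 region).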

The exact DNA sequence between the two regions does not seem to be important.
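A promoter defined by two short conserved regions separated by a spacer of arbitrary sequence can be searched for with a simple scan. The sketch below is only an illustration: the hexamers TTGACA and TATAAT are the commonly cited consensus sequences for the −35 and −10 regions, and the allowed spacer lengths (15 to 19 bases) and mismatch threshold are assumptions of the example, not values taken from this page. The function name `find_promoters` is hypothetical.

```python
from typing import List, Tuple

MINUS_35 = "TTGACA"  # assumed consensus for the -35 region
MINUS_10 = "TATAAT"  # assumed consensus for the -10 region


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq: str, max_mismatch: int = 1,
                   spacers: range = range(15, 20)) -> List[Tuple[int, int, int]]:
    """Return (start of -35 region, spacer length, total mismatches) per hit.

    The spacer sequence itself is ignored; only its length matters here.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        for gap in spacers:
            j = i + 6 + gap          # where the -10 region would start
            if j + 6 > len(seq):
                break
            m10 = mismatches(seq[j:j + 6], MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, gap, m35 + m10))
    return hits


# Toy sequence: a perfect -35 region, 17 arbitrary bases, a perfect -10 region.
toy = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGGGGA"
print(find_promoters(toy))  # [(2, 17, 0)]
```

Real promoters rarely match the consensus perfectly, which is why the scan tolerates mismatches rather than requiring an exact string match.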

Each of the three enzymes synthesized in response to lactose is encoded by a separate gene. The three genes are arranged in tandem on the bacterial chromosome.

The lac operon. In the absence of lactose, the repressor protein encoded by the I gene binds to the lac operator and prevents transcription. Binding of allolactose to the repressor causes it to leave the operator. This enables RNA polymerase to transcribe the three genes of the operon. The single mRNA molecule that results is then translated into the three proteins.

The lac repressor binds to a specific sequence of two dozen nucleotides called the operator. Most of the operator is downstream of the promoter. When the repressor is bound to the operator, RNA polymerase is unable to proceed downstream with its task of gene transcription.

The lac repressor represents only a tiny fraction of the proteins in the E. coli cell.

The operon is the combination of the promoter, the operator, and the structural genes they control.

The gene encoding the lac repressor is called the I gene. It happens to be located just upstream of the lac promoter. However, its precise location is probably not important because it achieves its effect by means of its protein product, which is free to diffuse throughout the cell. And, in fact, the genes for some repressors are not located close to the operators they control.

Although repressors are free to diffuse through the cell, how does the lac repressor, for example, find the single stretch of 24 base pairs of the operator out of the 4.6 million base pairs of DNA in the E. coli genome? It turns out the repressor is free to bind anywhere on the DNA, using contacts with the sugar-phosphate backbone that do not depend on the base sequence.

Once astride the DNA, the repressor can move along it until it encounters the operator sequence. Now an allosteric change in the tertiary structure of the protein allows the same amino acids to establish bonds — mostly hydrogen bonds and hydrophobic interactions — with particular bases in the operator sequence.
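A short calculation shows why a 24-base-pair operator is long enough to be unique in a genome of 4.6 million base pairs. The sketch assumes, unrealistically, that every base is equally likely and independent of its neighbors, and it treats the genome as linear; both are simplifications made only for this estimate.

```python
GENOME_BP = 4_600_000   # size of the E. coli genome given in the text
OPERATOR_BP = 24        # length of the lac operator given in the text

# Assumption: four equally likely bases at every position.
possible_sequences = 4 ** OPERATOR_BP

# A 24-base window can start at each position on either strand.
windows = 2 * (GENOME_BP - OPERATOR_BP + 1)

# Expected number of windows that match a given 24-mer purely by chance.
expected_chance_matches = windows / possible_sequences

print(f"possible 24-base sequences: {possible_sequences:.2e}")
print(f"expected chance matches:    {expected_chance_matches:.1e}")
```

Under these assumptions the expected number of chance matches is far below one, so a sequence of this length is effectively a unique address in the genome.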

The lac repressor is made up of four identical polypeptides (thus a "homotetramer"). Part of the molecule has a site (or sites) that enable it to recognize and bind to the 24 base pairs of the lac operator. Another part of the repressor contains sites that bind to allolactose. When allolactose unites with the repressor, it causes a change in the shape of the molecule, so that it can no longer remain attached to the DNA sequence of the operator. Thus, when lactose is added to the culture medium, the allolactose formed from it binds the repressor, the repressor leaves the operator, and RNA polymerase can transcribe the three genes of the operon.

Transcription has hardly begun before ribosomes attach to the growing mRNA molecule and move down it to translate the message into the three proteins. You can see why punctuation codons (the stop codons UAA, UAG, or UGA) are needed to terminate translation between the portions of the mRNA coding for each of the three enzymes.
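The role of the stop codons in a single mRNA that encodes several proteins can be illustrated with a toy scan. This is a deliberately simplified sketch: it treats every AUG found after the previous stop codon as a start site, whereas a real ribosome also needs a ribosome-binding site. The function name `coding_regions` and the toy message are invented for the example.

```python
from typing import List

STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"


def coding_regions(mrna: str) -> List[str]:
    """Return each stretch running from a start codon through its stop codon."""
    regions = []
    pos = 0
    while True:
        start = mrna.find(START_CODON, pos)
        if start == -1:
            break
        # Read codon by codon in the reading frame set by the start codon.
        for i in range(start, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                regions.append(mrna[start:i + 3])
                pos = i + 3      # resume the search after the stop codon
                break
        else:
            break                # reached the end without a stop codon
    return regions


# Toy message with three short coding regions separated by spacer bases.
toy_mrna = "GG" + "AUGGCUUAA" + "CC" + "AUGAAAUAG" + "C" + "AUGUUUUGA"
for region in coding_regions(toy_mrna):
    print(region)
```

Without the stop codons, the first reading frame would simply run on into the next coding region, and the three products would not be made as separate proteins.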

This mechanism is characteristic of bacteria, but differs in several respects from that found in eukaryotes.


As mentioned above, the synthesis of tryptophan from precursors available in the cell requires 5 enzymes. The genes encoding these are clustered together in a single operon with its own promoter and operator. In this case, however, the presence of tryptophan in the cell shuts down the operon. When Trp is present, it binds to a site on the Trp repressor and enables the Trp repressor to bind to the operator. When Trp is not present, the repressor leaves its operator, and transcription of the 5 enzyme-encoding genes begins.

Stereo view of the tryptophan repressor (right side of each panel) bound to its operator DNA (left side). The repressor is a homodimer of two identical polypeptides (on either side of the horizontal red line). Binding to DNA occurs only when a molecule of tryptophan (red rings) is bound to each monomer of the repressor. You may be able to fuse the images by holding a stiff paper or cardboard between the views so that your left eye sees only the left-hand image, your right eye the right-hand image. Persevere; the results are worth it! (Image courtesy of P. B. Sigler.)

The usefulness to the cell of this control mechanism is clear. The presence in the cell of an essential metabolite, in this case tryptophan, turns off its own manufacture and thus stops unneeded protein synthesis.
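The two repressors described so far respond to their small-molecule partners in opposite ways, and the contrast can be written as two small Boolean functions. This is a sketch of the logic only: it ignores concentrations and binding kinetics, and the function names are hypothetical.

```python
def lac_operon_transcribed(allolactose_present: bool) -> bool:
    """Inducible: the lac repressor sits on the operator unless allolactose is bound."""
    repressor_on_operator = not allolactose_present
    return not repressor_on_operator


def trp_operon_transcribed(tryptophan_present: bool) -> bool:
    """Repressible: the Trp repressor sits on the operator only when Trp is bound."""
    repressor_on_operator = tryptophan_present
    return not repressor_on_operator


for present in (False, True):
    print(f"small molecule present={present}: "
          f"lac on={lac_operon_transcribed(present)}, "
          f"trp on={trp_operon_transcribed(present)}")
```

In both cases the repressor blocks transcription when it is on the operator; what differs is whether the small molecule removes the repressor (lac) or enables it to bind (trp).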

As their name suggests, repressors are negative control mechanisms, shutting down operons. However, some gene transcription in E. coli is under positive control.

Positive Control of Transcription: CAP

Absence of the lac repressor is essential but not sufficient for effective transcription of the lac operon. The activity of RNA polymerase also depends on the presence of another DNA-binding protein called catabolite activator protein or CAP. Like the lac repressor, CAP has two types of binding sites: one that binds cyclic AMP (cAMP) and one that binds DNA.

However, CAP can bind to DNA only when cAMP is bound to CAP, so when cAMP levels in the cell are low, CAP fails to bind DNA and thus RNA polymerase cannot begin its work, even in the absence of the repressor.

So the lac operon is under both negative (the repressor) and positive (CAP) control.


It turns out that it is not simply a matter of belt and suspenders. This dual system enables the cell to make choices. What, for example, should the cell do when fed both glucose and lactose? Presented with such a choice, E. coli (for reasons about which we can only speculate) chooses glucose. It makes its choice by using the interplay between these two control devices. Without CAP, binding of RNA polymerase is inhibited even though there is no repressor to interfere with it if it could bind. The molecular basis for its choices is shown in the graphic.
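The interplay of the two controls can be laid out as a truth table. One assumption in this sketch is not stated in the text above: that glucose in the medium keeps cAMP levels low. That is the usual textbook account, but it is labeled as an assumption here. The function name and the yes/no simplification of "effective transcription" are also choices made for the example.

```python
def lac_effectively_transcribed(lactose: bool, glucose: bool) -> bool:
    """Combine negative control (repressor) and positive control (CAP)."""
    # Assumption (not from the text above): glucose keeps cAMP low.
    camp_high = not glucose
    # From the text: CAP binds DNA only when cAMP is bound to it.
    cap_on_dna = camp_high
    # From the text: allolactose, formed from lactose, removes the repressor.
    repressor_on_operator = not lactose
    # Absence of the repressor is essential but not sufficient; CAP is also needed.
    return (not repressor_on_operator) and cap_on_dna


print("lactose  glucose  effective transcription")
for lactose in (False, True):
    for glucose in (False, True):
        result = lac_effectively_transcribed(lactose, glucose)
        print(f"{lactose!s:<8} {glucose!s:<8} {result}")
```

Only the combination of lactose present and glucose absent gives effective transcription, which is how the cell ends up using glucose first when both sugars are offered.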

CAP consists of two identical polypeptides (hence it is a homodimer). Toward the C-terminus, each has two regions of alpha helix with a sharp bend between them. The longer of these is called the recognition helix because it is responsible for recognizing and binding to a particular sequence of bases in DNA.

The graphic shows a model of CAP. The two monomers are identical. Each monomer recognizes a sequence of nucleotides in DNA by means of the region of alpha helix labeled F. Note that the two recognition helices are spaced 34 Å apart, which is the distance that it takes the DNA molecule (on the left) to make precisely one complete turn.

The recognition helices of each polypeptide of CAP are, of course, identical. But their orientation in the dimer is such that the sequence of bases they recognize must run in the opposite direction for each recognition helix to bind properly. This arrangement of two identical sequences of base pairs running in opposite directions is called an inverted repeat.
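An inverted repeat is easy to test for: the second half-site, read 5' to 3', must equal the reverse complement of the first. The sketch below checks this for a pair of toy half-sites; the sequences are illustrative and are not claimed to be the actual CAP binding site.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(left: str, right: str) -> bool:
    """True if the two half-sites are the same sequence on opposite strands."""
    return right.upper() == reverse_complement(left)


left_half, right_half = "TGTGA", "TCACA"   # toy half-sites
print(reverse_complement(left_half))                 # TCACA
print(is_inverted_repeat(left_half, right_half))     # True
print(is_inverted_repeat(left_half, left_half))      # False
```

Because each subunit of a dimer faces the opposite way, each recognition helix meets the same sequence of bases when it reads its own half-site.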

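The strategy illustrated by CAP and its binding site has turned out to be used widely. As more and more DNA-regulating proteins have been discovered, many turn out to share the traits we find in CAP: they are built of two identical subunits, they use a segment of alpha helix to recognize DNA, and they bind to an inverted repeat.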


Protein repressors and corepressors are not the only way in which bacteria control gene transcription. It turns out that the level of certain metabolites can also be regulated by riboswitches. A riboswitch is a section of the 5'-untranslated region (5'-UTR) in a molecule of messenger RNA (mRNA) which has a specific binding site for the metabolite (or a close relative).

A number of different metabolites bind to riboswitches. In each case, the riboswitch regulates transcription of genes involved in the metabolism of that molecule. The metabolite binds to the growing mRNA and induces an allosteric change that, depending on the gene, either terminates transcription early or allows it to be completed.

Some riboswitches control mRNA translation rather than its transcription.

It has been suggested that these regulatory mechanisms, which do not involve any protein, are a relict from an "RNA world".


19 August 2013