How many nitrogenous bases code for 56 amino acids? What is the genetic code: general information

DNA and RNA nucleotides
  1. Purines: adenine, guanine
  2. Pyrimidines: cytosine, thymine (uracil)

Codon: a triplet of nucleotides encoding a specific amino acid.

Table 1. Amino acids commonly found in proteins
No.  Name            Abbreviation
1.   Alanine         Ala
2.   Arginine        Arg
3.   Asparagine      Asn
4.   Aspartic acid   Asp
5.   Cysteine        Cys
6.   Glutamic acid   Glu
7.   Glutamine       Gln
8.   Glycine         Gly
9.   Histidine       His
10.  Isoleucine      Ile
11.  Leucine         Leu
12.  Lysine          Lys
13.  Methionine      Met
14.  Phenylalanine   Phe
15.  Proline         Pro
16.  Serine          Ser
17.  Threonine       Thr
18.  Tryptophan      Trp
19.  Tyrosine        Tyr
20.  Valine          Val

The genetic code, also called the amino acid code, is a system for recording information about the sequence of amino acids in a protein using the sequence of nucleotide residues in DNA, each of which contains one of 4 nitrogenous bases: adenine (A), guanine (G), cytosine (C) and thymine (T). However, since the double-stranded DNA helix is not directly involved in protein synthesis (the protein is assembled on RNA copied from one of the DNA strands), the code is written in the language of RNA, in which uracil (U) replaces thymine. For the same reason, it is customary to say that the code is a sequence of nucleotides, not of base pairs.

The genetic code is represented by certain code words - codons.

The first code word was deciphered by Nirenberg and Matthaei in 1961. They obtained an extract from E. coli containing ribosomes and the other factors necessary for protein synthesis. The result was a cell-free system for protein synthesis, which could assemble a protein from amino acids if the necessary mRNA was added to the medium. By adding synthetic RNA consisting only of uracils to the medium, they found that a protein consisting only of phenylalanine (polyphenylalanine) was formed. Thus it was established that the nucleotide triplet UUU (a codon) corresponds to phenylalanine. Over the next 5-6 years, all codons of the genetic code were determined.

The genetic code is a kind of dictionary that translates a text written with four nucleotides into a protein text written with 20 amino acids. The other amino acids found in proteins are modifications of one of these 20 amino acids.

Properties of the genetic code

The genetic code has the following properties.

  1. Triplet nature. Each amino acid corresponds to a triplet of nucleotides. It is easy to calculate that there are 4^3 = 64 codons. Of these, 61 are sense codons and 3 are nonsense codons (terminating, or stop, codons).
  2. Continuity (there are no separating characters between nucleotides): the absence of intragenic punctuation marks.

    Within a gene, each nucleotide is part of a significant codon. In 1961 Seymour Benzer and Francis Crick experimentally proved the triplet nature of the code and its continuity (compactness).

    The essence of the experiment: a "+" mutation is the insertion of one nucleotide; a "-" mutation is the loss of one nucleotide.

    A single mutation ("+" or "-") at the beginning of a gene, or a double mutation ("++" or "--"), disrupts the entire gene.

    A triple mutation ("+++" or "---") at the beginning of a gene disrupts only part of the gene.

    A quadruple "+" or "-" mutation again disrupts the entire gene.

    The experiment was carried out on two adjacent phage genes and showed that

    1. the code is triplet and there are no punctuation marks inside the gene
    2. there are punctuation marks between genes
  3. Presence of intergenic punctuation marks: among the triplets there are initiating codons (they begin protein biosynthesis) and terminator codons (they indicate the end of protein biosynthesis).

    Conventionally, the AUG codon, the first after the leader sequence, also belongs to the punctuation marks. It performs the function of a capital letter. In this position, it codes for formylmethionine (in prokaryotes).

    At the end of each gene encoding a polypeptide, there is at least one of 3 termination codons, or stop signals: UAA, UAG, UGA. They terminate translation.

  4. Collinearity: correspondence between the linear sequence of mRNA codons and the sequence of amino acids in the protein.
  5. Specificity: each amino acid corresponds only to certain codons, which cannot be used for another amino acid.
  6. Unidirectionality: codons are read in one direction, from the first nucleotide to the next.
  7. Degeneracy, or redundancy: several triplets can encode one amino acid (there are 20 amino acids and 64 possible triplets, 61 of them sense codons, i.e., on average each amino acid corresponds to about 3 codons); the exceptions are methionine (Met) and tryptophan (Trp), each encoded by a single codon.

    The reason for the degeneracy of the code is that the main semantic load is carried by the first two nucleotides in the triplet, and the third is not so important. Hence the code degeneracy rule: if two codons have identical first two nucleotides, and their third nucleotides belong to the same class (purine or pyrimidine), then they code for the same amino acid.

    However, there are two exceptions to this ideal rule. These are the AUA codon, which should correspond not to isoleucine, but to methionine, and the UGA codon, which is the terminator, while it should correspond to tryptophan. The degeneracy of the code obviously has an adaptive value.

  8. Universality: all the properties of the genetic code listed above are characteristic of all living organisms.

    Codon   Universal code   Mitochondrial codes
                             Vertebrates   Invertebrates   Yeast   Plants
    UGA     STOP             Trp           Trp             Trp     STOP
    AUA     Ile              Met           Met             Met     Ile
    CUA     Leu              Leu           Leu             Thr     Leu
    AGA     Arg              STOP          Ser             Arg     Arg
    AGG     Arg              STOP          Ser             Arg     Arg

    Recently, the principle of the universality of the code has been shaken by the discovery by Barrell in 1979 of the ideal code of human mitochondria, in which the code degeneracy rule is fulfilled. In the mitochondrial code, the UGA codon corresponds to tryptophan and AUA to methionine, as required by the code degeneracy rule.

    Perhaps, at the beginning of evolution, all the simplest organisms had the same code as the mitochondria, and then it underwent slight deviations.

  9. Non-overlapping: each triplet of the genetic text is independent of the others, and each nucleotide is part of only one triplet. The figure shows the difference between overlapping and non-overlapping codes.

    In 1976 φX174 phage DNA was sequenced. It has a single stranded circular DNA of 5375 nucleotides. The phage was known to encode 9 proteins. For 6 of them, genes located one after another were identified.

    It turned out that there is an overlap. The E gene is completely within the D gene. Its start codon appears as a result of a one nucleotide shift in the reading. The J gene starts where the D gene ends. The start codon of the J gene overlaps with the stop codon of the D gene by a two-nucleotide shift. The design is called "reading frame shift" by a number of nucleotides that is not a multiple of three. To date, overlap has only been shown for a few phages.

  10. Noise immunity: the ratio of the number of conservative substitutions to the number of radical substitutions.

    Mutations of nucleotide substitutions that do not lead to a change in the class of the encoded amino acid are called conservative. Mutations of nucleotide substitutions that lead to a change in the class of the encoded amino acid are called radical.

    Since the same amino acid can be encoded by different triplets, some substitutions in triplets do not lead to a change in the encoded amino acid (for example, UUU -> UUC leaves phenylalanine). Some substitutions change an amino acid to another from the same class (non-polar, polar, basic, acidic), other substitutions also change the class of the amino acid.

    In each triplet, 9 single substitutions can be made: the position to change can be chosen in three ways (1st, 2nd or 3rd), and the selected letter (nucleotide) can be changed to 4 - 1 = 3 other letters (nucleotides). The total number of possible nucleotide substitutions in the 61 sense codons is 61 × 9 = 549.

    By direct counting on the table of the genetic code, one can verify that of these: 23 nucleotide substitutions lead to the appearance of codons - translation terminators. 134 substitutions do not change the encoded amino acid. 230 substitutions do not change the class of the encoded amino acid. 162 substitutions lead to a change in the amino acid class, i.e. are radical. Of the 183 substitutions of the 3rd nucleotide, 7 lead to the appearance of translation terminators, and 176 are conservative. Of the 183 substitutions of the 1st nucleotide, 9 lead to the appearance of terminators, 114 are conservative and 60 are radical. Of the 183 substitutions of the 2nd nucleotide, 7 lead to the appearance of terminators, 74 are conservative, and 102 are radical.
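
    The direct count described above can be reproduced with a short script. The following is a minimal sketch in Python, assuming the standard ("universal") code table; the names `CODE` and `count_substitutions` are illustrative and not part of the original text. It reproduces only the figures that do not depend on how amino acids are grouped into classes: the total number of substitutions, the number leading to terminators at each position, and the number that leave the amino acid unchanged.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...; "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(code=CODE):
    """Classify every single-nucleotide substitution in the sense codons.

    Returns one dict per codon position (index 0, 1, 2) with the number of
    substitutions that create a stop codon, keep the amino acid, or change it.
    """
    counts = [{"stop": 0, "same": 0, "changed": 0} for _ in range(3)]
    for codon, amino in code.items():
        if amino == "*":
            continue  # only the 61 sense codons are mutated
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                if code[mutant] == "*":
                    counts[pos]["stop"] += 1
                elif code[mutant] == amino:
                    counts[pos]["same"] += 1
                else:
                    counts[pos]["changed"] += 1
    return counts


counts = count_substitutions()
total = sum(sum(c.values()) for c in counts)
print("total substitutions:", total)
print("to stop codons, by position:", [c["stop"] for c in counts])
print("amino acid unchanged:", sum(c["same"] for c in counts))
```

    Splitting the "changed" group into conservative and radical substitutions would additionally require a table assigning each amino acid to a class (non-polar, polar, basic, acidic), which is deliberately left out of this sketch.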


In any cell and organism, all anatomical, morphological and functional features are determined by the structure of the proteins they contain. The hereditary property of an organism is the ability to synthesize certain proteins. The biological characteristics of a protein depend on the order of amino acids in its polypeptide chain.
Each cell has its own sequence of nucleotides in the DNA polynucleotide chain. This is the genetic code of DNA. Through it, information about the synthesis of certain proteins is recorded. This article describes what the genetic code is, what its properties are, and what genetic information is.

A bit of history

The idea that a genetic code might exist was formulated by G. Gamow and A. Down in the middle of the twentieth century. They suggested that the nucleotide sequence responsible for the synthesis of a particular amino acid contains at least three units. Later the exact number, three nucleotides, was proved; this unit of the genetic code was called a triplet or codon. There are sixty-four codons in total, because the nucleic acid molecule (DNA or RNA) consists of residues of four different nucleotides.

What is the genetic code

The way of encoding the amino acid sequence of proteins by a sequence of nucleotides is common to all living cells and organisms. That is what the genetic code is.
There are four nucleotides in DNA:

  • adenine - A;
  • guanine - G;
  • cytosine - C;
  • thymine - T.

They are indicated by capital letters in Latin or (in Russian-language literature) Russian.
RNA also has four nucleotides, but one of them is different from DNA:

  • adenine - A;
  • guanine - G;
  • cytosine - C;
  • uracil - U.

All nucleotides line up in chains; in DNA the result is a double helix, and in RNA a single strand.
Proteins are built from amino acids, which, arranged in a certain sequence, determine the protein's biological properties.

Properties of the genetic code

Triplet nature. The unit of the genetic code consists of three letters, i.e., it is a triplet. This means that each of the twenty existing amino acids is coded for by three specific nucleotides, called a codon or triplet. Sixty-four combinations can be created from four nucleotides. This number is more than enough to encode twenty amino acids.
Degeneracy. Each amino acid corresponds to more than one codon, with the exception of methionine and tryptophan.
Unambiguity. One codon codes for one amino acid. For example, in the gene of a healthy person carrying information about the beta chain of hemoglobin, the triplets GAG and GAA code for glutamic acid, whereas in everyone who has sickle cell anemia one nucleotide is changed.
Collinearity. The amino acid sequence always corresponds to the nucleotide sequence that the gene contains.
Continuity. The genetic code is continuous and compact, which means that it does not have "punctuation marks". That is, starting at a certain codon, reading is continuous. For example, AUGGUGCUUAAUGUG will be read as AUG, GUG, CUU, AAU, GUG, and not as AUG, UGG, and so on, or in any other way.
Universality. It is the same for absolutely all terrestrial organisms, from humans to fish, fungi and bacteria.
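
Continuous reading without punctuation, and the effect of "+" mutations on the reading frame, can be illustrated with a small sketch. The Python code below is only an illustration under simple assumptions: the helper names `codons` and `insert_base` are hypothetical, the inserted nucleotide "A" and the insertion positions are arbitrary choices, and the sequence is the short example used in the text (written as AUGGUGCUUAAUGUG).

```python
def codons(seq, frame=0):
    """Split seq into consecutive triplets starting at offset `frame`.

    An incomplete triplet at the end of the sequence is dropped.
    """
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def insert_base(seq, pos, base="A"):
    """Return seq with one extra nucleotide inserted before index pos (a "+" mutation)."""
    return seq[:pos] + base + seq[pos:]


original = "AUGGUGCUUAAUGUG"

# Reading from the first nucleotide gives the codons listed in the text.
print(codons(original))           # frame 0
print(codons(original, frame=1))  # a shifted frame gives different triplets

# One "+" mutation shifts every codon downstream of the insertion.
one_plus = insert_base(original, 3)
print(codons(one_plus))

# Three "+" mutations restore the original frame after the last insertion.
three_plus = insert_base(insert_base(one_plus, 7), 11)
print(codons(three_plus))
```

With a single insertion every codon after the insertion point changes, while after three insertions the tail of the sequence is read in the original frame again, which is the behaviour the "+" and "-" mutation experiment relied on.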

Table

Not all available amino acids are present in the presented table. Hydroxyproline, hydroxylysine, phosphoserine, iodo derivatives of tyrosine, cystine, and some others are absent, since they are derivatives of other amino acids encoded by mRNA and formed after protein modification as a result of translation.
From the properties of the genetic code, it is known that one codon codes for one amino acid. The exceptions are codons that perform additional functions: those coding for valine and methionine. When such a codon stands at the beginning of an mRNA, it binds a tRNA that carries formylmethionine. Upon completion of synthesis, the formyl residue is split off, leaving a methionine residue. Thus, these codons initiate the synthesis of a polypeptide chain. If they are not at the beginning, they are no different from other codons.

Genetic information

This concept means a program of properties that is transmitted from ancestors. It is embedded in heredity as a genetic code.
The genetic code is implemented during protein synthesis by:

  • informational (messenger) RNA, i-RNA or mRNA;
  • ribosomal RNA, rRNA.

Information is transmitted by direct communication (DNA-RNA-protein) and reverse (environment-protein-DNA).
Organisms can receive, store, transfer it and use it most effectively.
Being inherited, information determines the development of an organism. But due to interaction with the environment, the reaction of the latter is distorted, due to which evolution and development take place. Thus, new information is laid in the body.


The calculation of the laws of molecular biology and the discovery of the genetic code illustrated the need to combine genetics with Darwin's theory, on the basis of which a synthetic theory of evolution emerged - non-classical biology.
Heredity, variability and Darwin's natural selection are complemented by genetically determined selection. Evolution is realized at the genetic level through random mutations and inheritance of the most valuable traits that are most adapted to the environment.

Deciphering the human code

In the nineties, the Human Genome Project was launched, as a result of which, in the 2000s, fragments of the genome containing 99.99% of human genes were sequenced. Fragments that are not involved in protein synthesis and do not code for proteins remained unexplored. Their role is still unknown.

Chromosome 1, the last to be sequenced (in 2006), is the longest in the genome. More than three hundred and fifty diseases, including cancer, arise as a result of disorders and mutations in it.

The role of such research can hardly be overestimated. When they discovered what the genetic code is, it became known according to what patterns development occurs, how the morphological structure, the psyche, predisposition to certain diseases, metabolism and vices of individuals are formed.

In the body's metabolism, the leading role belongs to proteins and nucleic acids.
Protein substances form the basis of all vital cell structures, have an unusually high reactivity, and are endowed with catalytic functions.
Nucleic acids are part of the most important organelle of the cell, the nucleus, as well as of the cytoplasm, ribosomes, mitochondria, etc. Nucleic acids play an important, primary role in heredity, variability, and protein synthesis.

The plan for protein synthesis is stored in the cell nucleus, while synthesis itself occurs outside the nucleus, so a delivery service is needed to carry the encoded plan from the nucleus to the site of synthesis. This delivery service is performed by RNA molecules.

The process starts in the cell nucleus: part of the DNA "ladder" unwinds and opens. The RNA letters then form bonds with the exposed DNA letters of one of the DNA strands. An enzyme moves along the RNA letters, connecting them into a thread. So the letters of DNA are "rewritten" into the letters of RNA. The newly formed RNA chain separates, and the DNA "ladder" twists again. The process of reading information from DNA and synthesizing RNA on its template is called transcription, and the synthesized RNA is called informational RNA, or i-RNA.

After further modifications, this encoded i-RNA is ready. The i-RNA leaves the nucleus and goes to the site of protein synthesis, where the letters of the i-RNA are deciphered. Each set of three letters of i-RNA forms a "letter" that stands for one particular amino acid.

Another type of RNA looks for this amino acid, captures it with the help of an enzyme, and delivers it to the site of protein synthesis. This RNA is called transfer RNA, or tRNA. As the mRNA message is read and translated, the chain of amino acids grows. This chain twists and folds into a unique shape, creating one kind of protein. Even the process of protein folding is remarkable: a computer calculating all the options would need 10^27 (!) years to fold a medium-sized protein consisting of 100 amino acids. Yet the formation of a chain of 20 amino acids in the body takes no more than one second, and this process occurs continuously in all cells of the body.

Genes, genetic code and its properties.

About 7 billion people live on Earth. Apart from 25-30 million pairs of identical twins, all people are genetically different: each is unique, with unique hereditary characteristics, character traits, abilities and temperament.

Such differences are explained by differences in genotypes, the sets of genes of an organism; each one is unique. The genetic traits of a particular organism are embodied in proteins; consequently, the structure of one person's proteins differs, although only slightly, from that of another person's.

This does not mean that people have no proteins that are exactly the same. Proteins that perform the same functions may be identical or differ very slightly, by one or two amino acids. But there are no two people on Earth (with the exception of identical twins) in whom all proteins are the same.

Information about the primary structure of a protein is encoded as a sequence of nucleotides in a section of a DNA molecule, a gene, which is a unit of hereditary information of an organism. Each DNA molecule contains many genes. The totality of all the genes of an organism makes up its genotype. Thus,

A gene is a unit of hereditary information of an organism, which corresponds to a separate section of DNA.

Hereditary information is encoded using the genetic code, which is universal for all organisms and differs only in the alternation of nucleotides that form genes and code for the proteins of specific organisms.

The genetic code consists of triplets of DNA nucleotides, combined in different sequences (AAT, GCA, ACG, TGC, etc.), each of which encodes a specific amino acid (which will be built into the polypeptide chain).

Strictly speaking, the code is considered to be the sequence of nucleotides in the i-RNA molecule, because i-RNA copies information from DNA (the process of transcription) and translates it into a sequence of amino acids in the molecules of the synthesized proteins (the process of translation).
The composition of mRNA includes the nucleotides A, C, G and U, the triplets of which are called codons: the DNA triplet CGT becomes the triplet GCA on mRNA, and the DNA triplet AAG becomes the triplet UUC. It is the i-RNA codons that represent the genetic code in written form.
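
The rewriting of DNA triplets into mRNA codons follows the complementarity rule and can be sketched in a few lines. The Python example below is a simplified illustration, assuming base-by-base pairing and, like the text, ignoring strand direction; the names `COMPLEMENT` and `transcribe` are hypothetical.

```python
# Pairing of DNA template bases with mRNA bases (A pairs with U in RNA).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template):
    """Return the mRNA bases complementary to a DNA template, base by base."""
    return "".join(COMPLEMENT[base] for base in template)


# The two DNA triplets used as examples in the text.
for dna_triplet in ("CGT", "AAG"):
    print(dna_triplet, "->", transcribe(dna_triplet))
```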

Thus, the genetic code is a unified system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides. The genetic code is based on an alphabet of only four nucleotide letters that differ in their nitrogenous bases: A, T, G, C.

The main properties of the genetic code:

1. The genetic code is triplet. A triplet (codon) is a sequence of three nucleotides that codes for one amino acid. Since proteins contain 20 amino acids, it is obvious that each of them cannot be encoded by one nucleotide (since there are only four types of nucleotides in DNA, 16 amino acids would remain uncoded). Two nucleotides are also not enough, since in this case only 4^2 = 16 amino acids could be encoded. This means that the smallest number of nucleotides encoding one amino acid is three. In this case, the number of possible nucleotide triplets is 4^3 = 64.

2. Redundancy (degeneracy) of the code is a consequence of its triplet nature and means that one amino acid can be encoded by several triplets (since there are 20 amino acids and 64 triplets), with the exception of methionine and tryptophan, which are each encoded by only one triplet. In addition, some triplets perform specific functions: in the mRNA molecule, the triplets UAA, UAG and UGA are terminating codons, i.e. stop signals that halt the synthesis of the polypeptide chain. The triplet corresponding to methionine (AUG), standing at the beginning of the mRNA chain, besides encoding methionine, performs the function of initiating reading.

3. Unambiguity of the code: along with redundancy, the code has the property of unambiguity, i.e. each codon corresponds to only one specific amino acid.

4. Collinearity of the code, i.e. the sequence of nucleotides in a gene exactly corresponds to the sequence of amino acids in the protein.

5. The genetic code is non-overlapping and compact, i.e. it contains no "punctuation marks". This means that reading does not allow codons (triplets) to overlap, and, starting at a certain codon, reading proceeds continuously, triplet by triplet, until a stop signal (termination codon).

6. The genetic code is universal, i.e., the nuclear genes of all organisms encode information about proteins in the same way, regardless of the level of organization and the systematic position of these organisms.
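
The arithmetic behind the triplet property, and the question in the title of how many nitrogenous bases code for 56 amino acids, can be checked with a short sketch. The Python functions below are illustrative only; the names `min_codon_length` and `coding_nucleotides` are hypothetical, and the count for 56 amino acids assumes that only the coding triplets are counted, without a stop codon.

```python
def min_codon_length(n_amino_acids=20, n_bases=4):
    """Smallest codon length whose number of combinations covers all amino acids."""
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


def coding_nucleotides(n_residues, codon_length=3):
    """Number of nucleotides (nitrogenous bases) that code for n_residues amino acids."""
    return n_residues * codon_length


print([4 ** n for n in (1, 2, 3)])  # combinations for codon lengths 1, 2, 3
print(min_codon_length())           # shortest codon able to cover 20 amino acids
print(coding_nucleotides(56))       # bases coding for 56 amino acids
```

If the terminating codon is counted as well, three more nucleotides are added to the result for 56 amino acids.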

There are genetic code tables for deciphering i-RNA codons and building the chains of protein molecules.

Matrix synthesis reactions.

In living systems, there are reactions unknown in inanimate nature - matrix synthesis reactions.

In technology, the term "matrix" denotes the form used for casting coins, medals and typographic type: the hardened metal exactly reproduces all the details of the form used for casting. Matrix synthesis resembles casting on a matrix: new molecules are synthesized in strict accordance with the plan laid down in the structure of already existing molecules.

The matrix principle underlies the most important synthetic reactions of the cell, such as the synthesis of nucleic acids and proteins. In these reactions, an exact, strictly specific sequence of monomeric units in the synthesized polymers is provided.

In these reactions, monomers are drawn in a directed way to a specific place in the cell: to the molecules that serve as the matrix on which the reaction takes place. If such reactions occurred as a result of random collisions of molecules, they would proceed infinitely slowly. The synthesis of complex molecules based on the matrix principle is carried out quickly and accurately. In matrix reactions, the role of the matrix is played by macromolecules of the nucleic acids DNA or RNA.

The monomeric molecules from which the polymer is synthesized (nucleotides or amino acids) are arranged and fixed on the matrix in a strictly defined, predetermined order, in accordance with the principle of complementarity.

Then the monomer units are "crosslinked" into a polymer chain, and the finished polymer is released from the matrix.

After that, the matrix is ready for the assembly of a new polymer molecule. It is clear that just as only one coin or one letter can be cast in a given mold, so only one polymer can be "assembled" on a given matrix molecule.

Matrix-type reactions are a specific feature of the chemistry of living systems. They are the basis of the fundamental property of all living things: the ability to reproduce their own kind.

Matrix synthesis reactions

1. DNA replication (from Lat. replicatio, renewal) is the process of synthesis of a daughter molecule of deoxyribonucleic acid on the matrix of the parent DNA molecule. During the subsequent division of the mother cell, each daughter cell receives one copy of a DNA molecule that is identical to the DNA of the original mother cell. This process ensures the accurate transmission of genetic information from generation to generation. DNA replication is carried out by a complex of 15-20 different proteins called the replisome. The material for synthesis is free nucleotides present in the cytoplasm of cells. The biological meaning of replication lies in the exact transfer of hereditary information from the parent molecule to the daughter ones, which normally occurs during the division of somatic cells.

The DNA molecule consists of two complementary strands. These chains are held together by weak hydrogen bonds that can be broken by enzymes. The DNA molecule is capable of self-doubling (replication), and a new half of it is synthesized on each old half of the molecule.
In addition, an mRNA molecule can be synthesized on a DNA molecule, which then transfers the information received from DNA to the site of protein synthesis.

Information transfer and protein synthesis follow a matrix principle, comparable to the work of a printing press in a printing house. Information from DNA is copied over and over again. If errors occur during copying, they will be repeated in all subsequent copies.

True, some errors made when a DNA molecule is copied can be corrected; the process of eliminating errors is called repair. The first of the reactions in the process of information transfer is the replication of the DNA molecule and the synthesis of new DNA strands.

2. Transcription (from Latin transcriptio - rewriting) - the process of RNA synthesis using DNA as a template, occurring in all living cells. In other words, it is the transfer of genetic information from DNA to RNA.

Transcription is catalyzed by the enzyme DNA-dependent RNA polymerase. RNA polymerase moves along the DNA molecule in the 3' → 5' direction. Transcription consists of the steps of initiation, elongation and termination. The unit of transcription is the operon, a fragment of the DNA molecule consisting of a promoter, a transcribed part, and a terminator. i-RNA consists of one strand and is synthesized on DNA in accordance with the rule of complementarity, with the participation of an enzyme that activates the beginning and end of the synthesis of the i-RNA molecule.

The finished mRNA molecule enters the cytoplasm and binds to ribosomes, where the synthesis of polypeptide chains takes place.

3. Translation (from Lat. translatio, transfer, movement) is the process of protein synthesis from amino acids on the matrix of informational (messenger) RNA (i-RNA, mRNA), carried out by the ribosome. In other words, this is the process of translating the information contained in the nucleotide sequence of i-RNA into the sequence of amino acids in the polypeptide.

4. Reverse transcription is the process of forming double-stranded DNA based on information from single-stranded RNA. This process is called reverse transcription because the transfer of genetic information occurs in the "reverse" direction relative to transcription. The idea of reverse transcription was initially very unpopular, as it contradicted the central dogma of molecular biology, which assumed that DNA is transcribed into RNA and then translated into proteins.

However, in 1970, Temin and Baltimore independently discovered an enzyme called reverse transcriptase (revertase), and the possibility of reverse transcription was finally confirmed. In 1975, Temin and Baltimore were awarded the Nobel Prize in Physiology or Medicine. Some viruses (such as the human immunodeficiency virus, which causes HIV infection) are able to transcribe RNA into DNA. HIV has an RNA genome that is copied into DNA. As a result, the DNA of the virus can be combined with the genome of the host cell. The main enzyme responsible for the synthesis of DNA from RNA is called revertase. One of the functions of revertase is to create complementary DNA (cDNA) from the viral genome. The associated ribonuclease cleaves the RNA, and revertase completes the cDNA into a DNA double helix. The cDNA is integrated into the host cell genome by integrase. The result is the synthesis by the host cell of viral proteins, which form new viruses. In the case of HIV, apoptosis (cell death) of T-lymphocytes is also programmed. In other cases, the cell may remain a distributor of viruses.

The sequence of matrix reactions in protein biosynthesis can be represented as a diagram.

Thus, protein biosynthesis is one of the types of plastic metabolism, during which the hereditary information encoded in DNA genes is realized as a certain sequence of amino acids in protein molecules.

Protein molecules are essentially polypeptide chains made up of individual amino acids. But amino acids are not active enough to connect with each other on their own. Therefore, before they combine with each other and form a protein molecule, amino acids must be activated. This activation occurs under the action of special enzymes.

As a result of activation, the amino acid becomes more labile and, under the action of the same enzyme, binds to tRNA. Each amino acid corresponds to a strictly specific tRNA, which finds "its" amino acid and carries it to the ribosome.

Therefore, the ribosome receives various activated amino acids linked to their tRNAs. The ribosome is like a conveyor for assembling a protein chain from the various amino acids entering it.

Simultaneously with the tRNA on which its own amino acid "sits", the ribosome receives a "signal" from the DNA contained in the nucleus. In accordance with this signal, one or another protein is synthesized in the ribosome.

The directing influence of DNA on protein synthesis is not exerted directly, but with the help of a special intermediary, matrix or messenger RNA (mRNA or i-RNA), which is synthesized in the nucleus under the influence of DNA, so its composition reflects the composition of DNA. The RNA molecule is, as it were, a cast from the form of DNA. The synthesized mRNA enters the ribosome and, as it were, hands this structure a plan: in what order the activated amino acids entering the ribosome should be combined with each other in order to synthesize a certain protein. In other words, the genetic information encoded in DNA is transferred to mRNA and then to protein.

The mRNA molecule enters the ribosome and threads through it. The segment of it that is currently in the ribosome, defined by a codon (triplet), interacts in a completely specific way with a structurally matching triplet (anticodon) in the transfer RNA that brought the amino acid into the ribosome.

A transfer RNA with its amino acid approaches a certain codon of the mRNA and binds to it; another tRNA with a different amino acid joins the next, neighboring site of the mRNA, and so on until the entire mRNA chain is read and all the amino acids are strung together in the appropriate order, forming a protein molecule. The tRNA that has delivered its amino acid to a specific site of the polypeptide chain is released from the amino acid and exits the ribosome.

Then, back in the cytoplasm, the required amino acid can join it again, and it will again carry it to the ribosome. In the process of protein synthesis, not one but several ribosomes (polyribosomes) are involved simultaneously.
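
The codon-by-codon reading described above can be imitated in code. The following Python sketch is a simplified model, not a description of the real ribosome: it assumes the standard code table, starts reading at the first AUG it finds, stops at the first terminating codon, and ignores strand direction when pairing a codon with its anticodon. The names `CODE`, `anticodon` and `translate` are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter amino acid symbols; "*" = stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Bases of a tRNA anticodon pairing with the codon, position by position."""
    return "".join(PAIR[base] for base in codon)


def translate(mrna):
    """Translate mRNA from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiating codon, nothing is synthesized
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODE[mrna[i:i + 3]]
        if amino == "*":
            break  # terminating codon: the chain is released
        peptide.append(amino)
    return "".join(peptide)


print(anticodon("AUG"))
print(translate("AUGGUGCUUAAUGUGUAA"))
print(translate("GGAUGUUUUGA"))
```

The one-letter symbols in the output correspond to the amino acids of Table 1 (for example, M is methionine, V valine, L leucine, N asparagine, F phenylalanine).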

The main stages of the transfer of genetic information:

1. Synthesis of mRNA on DNA as a template (transcription).
2. Synthesis of the polypeptide chain in ribosomes according to the program contained in the i-RNA (translation).

The stages are universal for all living beings, but the temporal and spatial relationships of these processes differ in pro- and eukaryotes.

In prokaryotes, transcription and translation can occur simultaneously because the DNA is located in the cytoplasm. In eukaryotes, transcription and translation are strictly separated in space and time: the synthesis of various RNAs occurs in the nucleus, after which the RNA molecules must leave the nucleus, passing through the nuclear membrane. The RNA is then transported in the cytoplasm to the site of protein synthesis.

Genetic code

The proteins of almost all living organisms are built from only 20 types of amino acids. These amino acids are called canonical. Each protein is a chain or several chains of amino acids connected in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties.

In some proteins, non-standard amino acids such as selenocysteine and pyrrolysine are inserted by a ribosome reading a stop codon; whether this happens depends on sequences in the mRNA. Selenocysteine is now regarded as the 21st, and pyrrolysine as the 22nd, amino acid found in proteins.

Despite these exceptions, the genetic code of all living organisms has common features: a codon consists of three nucleotides, of which the first two are the determining ones, and codons are translated by tRNA and ribosomes into a sequence of amino acids.

The history of ideas about the genetic code

In the early 1960s, however, new data revealed the failure of the "comma-free code" hypothesis. Experiments then showed that codons Crick had considered meaningless can trigger protein synthesis in a test tube, and by 1965 the meaning of all 64 triplets had been established. It turned out that the code is redundant: a number of amino acids are encoded by two, four, or even six triplets.

Wikimedia Foundation. 2010 .

The genetic code is a method, inherent in all living organisms, of encoding the sequence of amino acid residues in proteins by means of the sequence of nucleotides in a nucleic acid.

Four nitrogenous bases are used in DNA: adenine (A), guanine (G), cytosine (C), and thymine (T); in Russian-language literature they are denoted by the letters А, Г, Ц, and Т. These letters make up the alphabet of the genetic code. RNA uses the same nucleotides, except that the thymine-containing nucleotide is replaced by a similar one containing uracil, denoted by the letter U (У in Russian-language literature). In DNA and RNA molecules, nucleotides are lined up in chains, giving sequences of genetic letters.

Genetic information is realized in living cells (that is, the protein encoded by a gene is synthesized) through two template-directed processes: transcription (synthesis of mRNA on a DNA template) and translation of the genetic code into an amino acid sequence (synthesis of a polypeptide chain on mRNA). Three consecutive nucleotides are enough to encode the 20 amino acids and the stop signal that marks the end of the protein sequence: with four bases, pairs of nucleotides give only 4 × 4 = 16 combinations, whereas triplets give 4 × 4 × 4 = 64. A set of three nucleotides is called a triplet. The accepted abbreviations for amino acids and codons are shown in the table below. The term "genetic code" does not refer to the particular sequence of triplets (codons) in a nucleic acid, nor, consequently, to the sequence of amino acid residues in a protein molecule. The genetic code is a way of recording, not the content of the record.
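The claim that three nucleotides are enough can be checked by simple counting. The sketch below enumerates all words of a given length over the four RNA bases and compares the count with the 21 signals that must be encoded (20 amino acids plus a stop); the names `codons` and `SIGNALS` are hypothetical and used only for this illustration.

```python
from itertools import product

BASES = "ACGU"
SIGNALS = 20 + 1  # 20 canonical amino acids plus one stop signal

def codons(length: int) -> list[str]:
    """All nucleotide words of the given length over the four bases."""
    return ["".join(word) for word in product(BASES, repeat=length)]

for n in (1, 2, 3):
    count = len(codons(n))
    verdict = "enough" if count >= SIGNALS else "too few"
    print(f"length {n}: {count} combinations ({verdict})")
```

Words of length 1 and 2 give 4 and 16 combinations, which is fewer than 21; length 3 gives 64, which is why a triplet code has room to spare.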

The genetic code common to most prokaryotes and eukaryotes. The table lists all 64 codons and the corresponding amino acids. Bases are written from the 5' end to the 3' end of the mRNA.

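Standard genetic code
1st   2nd base                                                                                                3rd
base  U                         C                         A                         G                         base
U     UUU Phe/F Phenylalanine   UCU Ser/S Serine          UAU Tyr/Y Tyrosine        UGU Cys/C Cysteine        U
      UUC Phe/F Phenylalanine   UCC Ser/S Serine          UAC Tyr/Y Tyrosine        UGC Cys/C Cysteine        C
      UUA Leu/L Leucine         UCA Ser/S Serine          UAA Stop (Ochre)          UGA Stop (Opal)           A
      UUG Leu/L Leucine         UCG Ser/S Serine          UAG Stop (Amber)          UGG Trp/W Tryptophan      G
C     CUU Leu/L Leucine         CCU Pro/P Proline         CAU His/H Histidine       CGU Arg/R Arginine        U
      CUC Leu/L Leucine         CCC Pro/P Proline         CAC His/H Histidine       CGC Arg/R Arginine        C
      CUA Leu/L Leucine         CCA Pro/P Proline         CAA Gln/Q Glutamine       CGA Arg/R Arginine        A
      CUG Leu/L Leucine         CCG Pro/P Proline         CAG Gln/Q Glutamine       CGG Arg/R Arginine        G
A     AUU Ile/I Isoleucine      ACU Thr/T Threonine       AAU Asn/N Asparagine      AGU Ser/S Serine          U
      AUC Ile/I Isoleucine      ACC Thr/T Threonine       AAC Asn/N Asparagine      AGC Ser/S Serine          C
      AUA Ile/I Isoleucine      ACA Thr/T Threonine       AAA Lys/K Lysine          AGA Arg/R Arginine        A
      AUG[A] Met/M Methionine   ACG Thr/T Threonine       AAG Lys/K Lysine          AGG Arg/R Arginine        G
G     GUU Val/V Valine          GCU Ala/A Alanine         GAU Asp/D Aspartic acid   GGU Gly/G Glycine         U
      GUC Val/V Valine          GCC Ala/A Alanine         GAC Asp/D Aspartic acid   GGC Gly/G Glycine         C
      GUA Val/V Valine          GCA Ala/A Alanine         GAA Glu/E Glutamic acid   GGA Gly/G Glycine         A
      GUG Val/V Valine          GCG Ala/A Alanine         GAG Glu/E Glutamic acid   GGG Gly/G Glycine         G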
[A] The AUG codon encodes methionine and is also the site of translation initiation: the first AUG codon in the mRNA coding region serves as the start of protein synthesis.

[Figure: sector version of the code table; the inner circle is the 1st base of the codon (from the 5' end).]

Reverse table (codons for each amino acid are indicated, as well as stop codons)
Ala/A  GCU, GCC, GCA, GCG               Leu/L  UUA, UUG, CUU, CUC, CUA, CUG
Arg/R  CGU, CGC, CGA, CGG, AGA, AGG     Lys/K  AAA, AAG
Asn/N  AAU, AAC                         Met/M  AUG
Asp/D  GAU, GAC                         Phe/F  UUU, UUC
Cys/C  UGU, UGC                         Pro/P  CCU, CCC, CCA, CCG
Gln/Q  CAA, CAG                         Ser/S  UCU, UCC, UCA, UCG, AGU, AGC
Glu/E  GAA, GAG                         Thr/T  ACU, ACC, ACA, ACG
Gly/G  GGU, GGC, GGA, GGG               Trp/W  UGG
His/H  CAU, CAC                         Tyr/Y  UAU, UAC
Ile/I  AUU, AUC, AUA                    Val/V  GUU, GUC, GUA, GUG
START  AUG                              STOP   UAG, UGA, UAA
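A reverse table of this kind can be derived mechanically from the forward codon table, which also makes the redundancy of the code easy to count. The sketch below is one possible way to do it; the compact 64-letter encoding of the standard code (bases ordered U, C, A, G) and the names `reverse` and `degeneracy` are assumptions made for this example.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code, one letter per codon; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Group codons by the amino acid (one-letter code) they encode.
reverse = defaultdict(list)
for i, a in enumerate(BASES):
    for j, b in enumerate(BASES):
        for k, c in enumerate(BASES):
            reverse[AMINO[16 * i + 4 * j + k]].append(a + b + c)

print(reverse["V"])  # ['GUU', 'GUC', 'GUA', 'GUG']
print(reverse["*"])  # ['UAA', 'UAG', 'UGA']

# Number of codons per amino acid (degeneracy of the code).
degeneracy = {amino_acid: len(codons) for amino_acid, codons in reverse.items()}
print(sorted(set(degeneracy.values())))  # [1, 2, 3, 4, 6]
```

Only methionine and tryptophan have a single codon; leucine, serine and arginine have six each.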
Deviations from the standard genetic code
Example                                                               Codon           Usual meaning  Reads as
Some yeasts of the genus Candida                                      CUG             Leucine        Serine
Mitochondria, in particular in Saccharomyces cerevisiae               CU(U, C, A, G)  Leucine        Threonine
Mitochondria of higher plants                                         CGG             Arginine       Tryptophan
Mitochondria (in all studied organisms without exception)             UGA             Stop           Tryptophan
Nuclear genome of Euplotes ciliates                                   UGA             Stop           Cysteine or selenocysteine
Mitochondria of mammals, Drosophila, S. cerevisiae and many protozoa  AUA             Isoleucine     Methionine = Start
Prokaryotes                                                           GUG             Valine         Start
Eukaryotes (rare)                                                     CUG             Leucine        Start
Eukaryotes (rare)                                                     GUG             Valine         Start
Prokaryotes (rare)                                                    UUG             Leucine        Start
Eukaryotes (rare)                                                     ACG             Threonine      Start
Drosophila mitochondria                                               AGA             Arginine       Serine
Mammalian mitochondria                                                AG(A, G)        Arginine       Stop
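One way to work with such deviations in code is to treat a variant code as the standard table plus a few overrides. The sketch below does this for mammalian mitochondria, using only the rows of the table above that apply to them (UGA, AUA, AGA and AGG). The function name `translate`, the table names and the sample mRNA are hypothetical, and reading simply starts at the first base rather than searching for a start codon.

```python
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Variant code = standard code with a handful of reassigned codons.
MAMMALIAN_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}

def translate(mrna: str, table: dict[str, str]) -> str:
    """Translate from the first base, stopping at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = "AUGAUAUGAAGAUUU"  # hypothetical sequence: AUG AUA UGA AGA UUU
print(translate(mrna, STANDARD))        # MI
print(translate(mrna, MAMMALIAN_MITO))  # MMW
```

The same message yields different peptides under the two tables, which is why the code table in use has to be stated whenever a sequence is translated.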