Characteristics of the genetic code. The main properties of the genetic code and their meaning

Every living organism has its own particular set of proteins. Certain combinations of nucleotides and their sequence in the DNA molecule form the genetic code, which conveys information about the structure of proteins. Genetics long relied on the concept that one gene corresponds to one enzyme (polypeptide). Nucleic acids and proteins have been studied for a fairly long time. This article takes a closer look at the genetic code and its properties and gives a brief chronology of the research.

Terminology

The genetic code is a way of encoding the amino acid sequence of proteins by means of a sequence of nucleotides. This way of recording information is characteristic of all living organisms. Proteins are natural organic substances of high molecular weight that are present in all living organisms. They are built from 20 types of amino acids, which are called canonical. The amino acids are joined into a chain in a strictly defined sequence, and this sequence determines the structure of the protein and its biological properties. A protein may also consist of several amino acid chains.

DNA and RNA

Deoxyribonucleic acid (DNA) is a macromolecule. It is responsible for the storage, transmission and implementation of hereditary information. DNA uses four nitrogenous bases: adenine (A), guanine (G), cytosine (C) and thymine (T). RNA consists of the same nucleotides, except that the thymine-containing nucleotide is replaced by one containing uracil (U). RNA and DNA molecules are nucleotide chains. Thanks to this structure, sequences are formed from the letters of the "genetic alphabet".

Implementation of information

The synthesis of a protein encoded by a gene begins with the synthesis of mRNA on a DNA template (transcription). The genetic code is then translated into a sequence of amino acids, that is, the polypeptide chain is synthesized on the mRNA (translation). Three consecutive nucleotides are enough to encode all the amino acids and to signal the end of the protein sequence. Such a group of three nucleotides is called a triplet.

Research History

Proteins and nucleic acids have been studied for a long time, but the first ideas about the nature of the genetic code appeared only in the middle of the 20th century. By the early 1950s it had been shown that proteins are made up of sequences of amino acids. However, the exact number of these amino acids had not yet been determined, and there was much dispute about it. In 1953, Watson and Crick published two papers. The first described the secondary structure of DNA; the second discussed how DNA might be copied by template synthesis. In addition, it emphasized that a particular sequence of bases is a code that carries hereditary information. The Soviet-born American physicist George Gamow accepted the coding hypothesis and proposed a way to test it. In 1954 he published a paper proposing a correspondence between amino acid side chains and diamond-shaped "holes" as a coding mechanism. This scheme became known as the diamond (rhombic) code. In explaining his work, Gamow suggested that the genetic code could be triplet. The physicist's work was one of the first to be regarded as close to the truth.

Classification

Over the following years, various models of the genetic code were proposed. They fell into two types: overlapping and non-overlapping. In overlapping codes, one nucleotide is part of several codons. The triangular, sequential and major-minor codes belong to this type. The non-overlapping type includes the combination code and the "code without commas". In the combination code, an amino acid is encoded by a nucleotide triplet, and what matters is the composition of the triplet rather than the order of its nucleotides. According to the "code without commas", certain triplets correspond to amino acids, while the rest do not. It was assumed that if meaningful triplets are arranged one after another, the triplets that appear in any other reading frame are meaningless. Scientists believed that it was possible to select a set of triplets that meets these requirements, and that there are exactly 20 such triplets.

Although Gamow and his colleagues questioned this model, it was considered the most correct for the next five years. In the early 1960s, new data revealed shortcomings in the "code without commas". Codons that the model treated as meaningless were found to induce protein synthesis in vitro. By 1965, the meaning of all 64 triplets had been established. As a result, some codons turned out to be redundant. In other words, a number of amino acids are each encoded by several triplets.

Distinctive features

The properties of the genetic code include:

Variations

The first deviation of the genetic code from the standard was discovered in 1979 during the study of human mitochondrial genes. Further variants were identified later, including many alternative mitochondrial codes. One example is the reading of the stop codon UGA as tryptophan in mycoplasmas. In archaea and bacteria, GUG and UUG are often used as start codons. Sometimes a gene codes for a protein from a start codon that differs from the one normally used by that species. Also, in some proteins the non-standard amino acids selenocysteine and pyrrolysine are inserted by the ribosome when it reads a stop codon; whether this happens depends on sequences found in the mRNA. Selenocysteine is currently considered the 21st and pyrrolysine the 22nd amino acid present in proteins.

General features of the genetic code

However, all these exceptions are rare. In general, the genetic code of living organisms has a number of common features. These include the composition of the codon, which consists of three nucleotides (the first two of which are the determining ones), and the translation of codons by tRNA and ribosomes into an amino acid sequence.

In DNA and RNA molecules, nucleotides line up in chains, and thus sequences of genetic letters are obtained.

Genetic code

The proteins of almost all living organisms are built from only 20 types of amino acids. These amino acids are called canonical. Each protein is a chain or several chains of amino acids connected in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties.

C

CUU (Leu/L) Leucine
CUC (Leu/L) Leucine
CUA (Leu/L) Leucine
CUG (Leu/L) Leucine

In some proteins, non-standard amino acids such as selenocysteine and pyrrolysine are inserted by the ribosome when it reads a stop codon, which depends on sequences in the mRNA. Selenocysteine is now considered the 21st, and pyrrolysine the 22nd, amino acid that makes up proteins.

Despite these exceptions, the genetic code of all living organisms has common features: a codon consists of three nucleotides, of which the first two are the determining ones, and codons are translated by tRNA and ribosomes into a sequence of amino acids.

Deviations from the standard genetic code

Example | Codon | Usual meaning | Read as
Some yeasts of the genus Candida | CUG | Leucine | Serine
Mitochondria, in particular of Saccharomyces cerevisiae | CU(U, C, A, G) | Leucine | Serine
Mitochondria of higher plants | CGG | Arginine | Tryptophan
Mitochondria (in all studied organisms without exception) | UGA | Stop | Tryptophan
Mitochondria of mammals, Drosophila, S. cerevisiae and many protozoa | AUA | Isoleucine | Methionine = Start
Prokaryotes | GUG | Valine | Start
Eukaryotes (rare) | CUG | Leucine | Start
Eukaryotes (rare) | GUG | Valine | Start
Prokaryotes (rare) | UUG | Leucine | Start
Eukaryotes (rare) | ACG | Threonine | Start
Mammalian mitochondria | AGC, AGU | Serine | Stop
Drosophila mitochondria | AGA | Arginine | Stop
Mammalian mitochondria | AG(A, G) | Arginine | Stop
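A variant code of this kind can be pictured as the standard codon table with a few entries overridden. The sketch below is a minimal Python illustration under stated assumptions: the names `STANDARD`, `MAMMALIAN_MITO` and `translate` are hypothetical, only the three mammalian mitochondrial rows for UGA, AUA and AG(A, G) are modelled, and reading simply starts at the first nucleotide of the string.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Assumption: only the overrides listed for mammalian mitochondria are applied here.
MAMMALIAN_MITO = {**STANDARD, "UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}


def translate(mrna, table):
    """Read codons from the first nucleotide until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


message = "AUAUGAAGA"
standard_reading = translate(message, STANDARD)
mito_reading = translate(message, MAMMALIAN_MITO)
```

With the standard table the message stops at UGA after a single isoleucine, while the mitochondrial table reads methionine and tryptophan and stops at AGA.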

The history of ideas about the genetic code

Nevertheless, in the early 1960s, new data revealed the failure of the "comma-free code" hypothesis. Then experiments showed that codons, considered by Crick to be meaningless, can provoke protein synthesis in a test tube, and by 1965 the meaning of all 64 triplets was established. It turned out that some codons are simply redundant, that is, a number of amino acids are encoded by two, four or even six triplets.

Wikimedia Foundation. 2010 .

Gene classification

1) By the nature of the interaction in the allelic pair:

- dominant (a gene capable of suppressing the manifestation of an allelic recessive gene); - recessive (a gene whose manifestation is suppressed by an allelic dominant gene).

2) Functional classification:

2) The genetic code is a set of certain combinations of nucleotides and the sequence of their location in the DNA molecule. It is a way of encoding the amino acid sequence of proteins using a sequence of nucleotides, and it is characteristic of all living organisms.

Four nucleotides are used in DNA: adenine (A), guanine (G), cytosine (C) and thymine (T). These letters make up the alphabet of the genetic code. RNA uses the same nucleotides, except that thymine is replaced by a similar nucleotide, uracil, which is denoted by the letter U. In DNA and RNA molecules, nucleotides line up in chains, and thus sequences of genetic letters are obtained.

Genetic code

There are 20 different amino acids used in nature to build proteins. Each protein is a chain or several chains of amino acids in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties. The set of amino acids is also universal for almost all living organisms.

The implementation of genetic information in living cells (i.e., the synthesis of a protein encoded by a gene) is carried out using two matrix processes: transcription (i.e., mRNA synthesis on a DNA template) and translation of the genetic code into an amino acid sequence (synthesis of a polypeptide chain on an mRNA template). Three consecutive nucleotides are enough to encode 20 amino acids, as well as the stop signal, which means the end of the protein sequence. A set of three nucleotides is called a triplet. Accepted abbreviations corresponding to amino acids and codons are shown in the figure.
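The two template processes described above can be sketched in a few lines of Python. This is a minimal illustration rather than a bioinformatics tool: the function names `transcribe` and `translate` and the example sequence are assumptions made here, and the DNA string is taken to be the coding strand written 5′ to 3′, so that transcription reduces to replacing T with U. The table is the standard code, with `*` marking a stop codon.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA string (assumed 5' to 3')."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
protein = translate(mrna)
```

Here the hypothetical DNA fragment gives the mRNA AUGGCCUGGUAA, which is read as methionine, alanine, tryptophan and then a stop signal.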

Properties of the genetic code

1. Triplet nature: the meaningful unit of the code is a combination of three nucleotides (a triplet, or codon).

2. Continuity: there are no punctuation marks between triplets, that is, the information is read continuously.

3. Discreteness (non-overlapping): the same nucleotide cannot be part of two or more triplets at the same time.

4. Specificity: a given codon corresponds to only one amino acid.

5. Degeneracy (redundancy): several codons can correspond to the same amino acid.

6. Universality: the genetic code works the same way in organisms of different levels of complexity, from viruses to humans (genetic engineering methods are based on this).

3) Transcription is the process of RNA synthesis using DNA as a template; it occurs in all living cells. In other words, it is the transfer of genetic information from DNA to RNA.

Transcription is catalyzed by the enzyme DNA-dependent RNA polymerase. RNA is synthesized in the direction from the 5′ end to the 3′ end, that is, RNA polymerase moves along the template DNA strand in the 3′ → 5′ direction.
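The direction rule can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name `transcribe_template` and the example strand are hypothetical, and the template is given as a string written in the conventional 5′ to 3′ direction. Because the polymerase reads the template from its 3′ end, the string is walked backwards while each base is paired with its RNA complement.

```python
# RNA base that pairs with each DNA template base.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5to3):
    """Return the mRNA (5' to 3') synthesized on a template strand given 5' to 3'."""
    # The polymerase moves 3' -> 5' along the template, so start from the string's end.
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3.upper()))


mrna = transcribe_template("TTAGGCCAT")
```

For the hypothetical template 5′-TTAGGCCAT-3′ the result is the mRNA 5′-AUGGCCUAA-3′.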

Transcription consists of the stages of initiation, elongation and termination.

Transcription initiation is a complex process that depends on the DNA sequence near the transcribed sequence (and in eukaryotes also on more distant parts of the genome, enhancers and silencers) and on the presence or absence of various protein factors.

Elongation: further unwinding of the DNA and RNA synthesis along the template strand continue. Like DNA synthesis, RNA synthesis proceeds in the 5′ → 3′ direction.

Termination: as soon as the polymerase reaches the terminator, it immediately detaches from the DNA, the local DNA-RNA hybrid is destroyed, and the newly synthesized RNA is transported from the nucleus to the cytoplasm, at which point transcription is complete.

Processing is a set of reactions that convert the primary products of transcription and translation into functioning molecules. Processing acts on functionally inactive precursor molecules of various ribonucleic acids (tRNA, rRNA, mRNA) and of many proteins.

In prokaryotes, the synthesis of catabolic enzymes (enzymes that break down substrates) is inducible. This allows the cell to adapt to environmental conditions and to save energy by stopping the synthesis of an enzyme when it is no longer needed.
The following conditions are required to induce the synthesis of catabolic enzymes:

1. The enzyme is synthesized only when the cell needs to break down the corresponding substrate.
2. The substrate concentration in the medium must exceed a certain level before the corresponding enzyme can be formed.

The mechanism of regulation of gene expression in Escherichia coli is best studied using the example of the lac operon, which controls the synthesis of three catabolic enzymes that break down lactose. If there is a lot of glucose and little lactose in the cell, the promoter remains inactive and the repressor protein sits on the operator, so transcription of the lac operon is blocked. When the amount of glucose in the environment, and therefore in the cell, decreases and the amount of lactose increases, the following events occur: the amount of cyclic adenosine monophosphate (cAMP) increases, and it binds to the CAP protein; this complex activates the promoter, to which RNA polymerase binds. At the same time, excess lactose binds to the repressor protein and releases the operator from it. The path for RNA polymerase is now open, and transcription of the structural genes of the lac operon begins. Lactose thus acts as an inducer of the synthesis of the enzymes that break it down.
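The logic of this regulation can be reduced to two yes/no conditions. The sketch below is a deliberately simplified Boolean model and not a quantitative description: the function name `lac_operon_transcribed` and its two parameters are assumptions introduced for illustration, and real cells respond to concentrations rather than to true/false values.

```python
def lac_operon_transcribed(glucose_high, lactose_present):
    """Return True when the structural genes of the lac operon are transcribed."""
    # cAMP accumulates only when glucose is scarce; the cAMP-CAP complex
    # then activates the promoter so that RNA polymerase can bind.
    promoter_active = not glucose_high
    # Lactose binds the repressor protein and releases the operator.
    operator_free = lactose_present
    return promoter_active and operator_free


truth_table = {
    (glucose, lactose): lac_operon_transcribed(glucose, lactose)
    for glucose in (True, False)
    for lactose in (True, False)
}
```

In this model transcription proceeds only in the one case described in the text, when glucose is low and lactose is present.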

5) Regulation of gene expression in eukaryotes is much more complex. Different cell types of a multicellular eukaryotic organism synthesize a number of identical proteins, and at the same time they differ from one another in a set of proteins specific to cells of a given type. The level of production depends on the cell type as well as on the stage of development of the organism. Gene expression is regulated at the level of the cell and at the level of the organism. The genes of eukaryotic cells are divided into two main types: the first determines universal cellular functions, the second determines specialized cellular functions. The functions of genes of the first group are manifested in all cells. To carry out differentiated functions, specialized cells must express a specific set of genes.
Chromosomes, genes, and operons of eukaryotic cells have a number of structural and functional features, which explains the complexity of gene expression.
1. Operons of eukaryotic cells have several genes - regulators, which can be located on different chromosomes.
2. Structural genes that control the synthesis of enzymes of one biochemical process can be concentrated in several operons located not only in one DNA molecule, but also in several.
3. Complex sequence of the DNA molecule. There are informative and non-informative sections, unique and repeatedly repeated informative nucleotide sequences.
4. Eukaryotic genes consist of exons and introns, and mRNA maturation is accompanied by the excision of introns from the corresponding primary RNA transcripts (pre-mRNA), i.e. splicing.
5. The process of gene transcription depends on the state of chromatin. Local compaction of DNA completely blocks RNA synthesis.
6. Transcription in eukaryotic cells is not always associated with translation. The synthesized mRNA can be stored as informosomes for a long time. Transcription and translation occur in different compartments.
7. Some eukaryotic genes have non-permanent localization (labile genes or transposons).
8. Methods of molecular biology revealed the inhibitory effect of histone proteins on the synthesis of mRNA.
9. In the process of development and differentiation of organs, the activity of genes depends on hormones circulating in the body and causing specific reactions in certain cells. In mammals, the action of sex hormones is important.
10. In eukaryotes, 5-10% of genes are expressed at each stage of ontogenesis, the rest should be blocked.

6) Repair of genetic material

Genetic repair is the process of eliminating genetic damage and restoring the hereditary apparatus, which occurs in the cells of living organisms under the action of special enzymes. The ability of cells to repair genetic damage was first discovered in 1949 by the American geneticist A. Kelner. Repair is a special function of cells: the ability to correct chemical damage and breaks in DNA molecules that arise during normal DNA biosynthesis in the cell or as a result of exposure to physical or chemical agents. It is carried out by special enzyme systems of the cell. A number of hereditary diseases (e.g. xeroderma pigmentosum) are associated with impaired repair systems.

Types of repair:

Direct repair is the simplest way to eliminate damage in DNA. It usually involves specific enzymes that can quickly (usually in one step) repair the corresponding damage, restoring the original structure of the nucleotides. This is how, for example, O6-methylguanine-DNA methyltransferase acts: it transfers the methyl group from the nitrogenous base to one of its own cysteine residues.

Ministry of Education and Science of the Russian Federation Federal Agency for Education

State Educational Institution of Higher Professional Education "Altai State Technical University named after I.I. Polzunov"

Department of Natural Science and System Analysis

Essay on the topic "Genetic code"

1. The concept of the genetic code

2. Properties of the genetic code

3. Genetic information

4. Deciphering the human genetic code

Bibliography


1. The concept of the genetic code

The genetic code is a unified system for recording hereditary information in nucleic acid molecules in the form of a sequence of nucleotides; it is characteristic of living organisms. Each nucleotide is denoted by the capital letter that begins the name of the nitrogenous base it contains: A for adenine, G for guanine, C for cytosine, T for thymine (in DNA) or U for uracil (in mRNA).

The implementation of the genetic code in the cell occurs in two stages: transcription and translation.

The first of these takes place in the nucleus; it consists in the synthesis of mRNA molecules on the corresponding sections of DNA. In this case, the DNA nucleotide sequence is "rewritten" into the RNA nucleotide sequence. The second stage takes place in the cytoplasm, on ribosomes; here the nucleotide sequence of the mRNA is translated into the sequence of amino acids in the protein. This stage proceeds with the participation of transfer RNA (tRNA) and the corresponding enzymes.

2. Properties of the genetic code

1. Triplet nature

Each amino acid is encoded by a sequence of 3 nucleotides.

A triplet or codon is a sequence of three nucleotides that codes for one amino acid.


The code cannot be singlet, since 4 (the number of different nucleotides in DNA) is less than 20. The code cannot be doublet, because 16 (the number of ordered pairs of 4 nucleotides, 4 × 4) is less than 20. The code can be triplet, because 64 (the number of ordered triples of 4 nucleotides, 4 × 4 × 4) is greater than 20.
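The counting argument can be checked by enumerating the code words directly. The following Python sketch is only an illustration; the names `codeword_count`, `counts` and `minimal_length` are assumptions introduced here, not established terminology.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
AMINO_ACID_COUNT = 20


def codeword_count(length):
    """Count the ordered nucleotide sequences of the given length."""
    return sum(1 for _ in product(NUCLEOTIDES, repeat=length))


# Number of possible code words for singlet, doublet and triplet codes.
counts = {length: codeword_count(length) for length in (1, 2, 3)}

# Shortest code word length that offers at least 20 distinct words.
minimal_length = min(n for n, count in counts.items() if count >= AMINO_ACID_COUNT)
```

The enumeration gives 4, 16 and 64 code words, so three is the shortest length that can cover 20 amino acids.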

2. Degeneracy.

All amino acids, with the exception of methionine and tryptophan, are encoded by more than one triplet:

2 amino acids × 1 triplet = 2
9 amino acids × 2 triplets = 18
1 amino acid × 3 triplets = 3
5 amino acids × 4 triplets = 20
3 amino acids × 6 triplets = 18

In total, 61 triplets code for 20 amino acids.
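This tally can be reproduced from the standard codon table. The sketch below is a minimal Python check; the variable names are assumptions made for illustration, and the table is the standard code written with one-letter amino acid symbols, where `*` marks the three stop codons.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# How many codons encode each amino acid (stop codons are excluded).
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# How many amino acids have 1, 2, 3, 4 or 6 codons.
degeneracy_classes = Counter(codons_per_amino_acid.values())

sense_codons = sum(codons_per_amino_acid.values())
```

The single-codon amino acids are `M` (methionine) and `W` (tryptophan), and the three remaining triplets out of 64 are the stop codons.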

3. The presence of intergenic punctuation marks.

A gene is a section of DNA that codes for one polypeptide chain or one molecule of tRNA, rRNA, or sRNA.

The tRNA, rRNA, and sRNA genes do not code for proteins.

At the end of each gene encoding a polypeptide, there is at least one of 3 termination codons, or stop signals: UAA, UAG, UGA. They terminate translation.

Conventionally, the AUG codon also belongs to punctuation marks - the first after the leader sequence. It performs the function of a capital letter. In this position, it codes for formylmethionine (in prokaryotes).

4. Uniqueness.

Each triplet encodes only one amino acid or is a translation terminator.

The exception is the AUG codon. In prokaryotes, in the first position (capital letter) it codes for formylmethionine, and in any other position it codes for methionine.

5. Compactness, or the absence of intragenic punctuation marks.

Within a gene, each nucleotide is part of a significant codon.

In 1961 Francis Crick, Sydney Brenner and their colleagues showed experimentally that the code is triplet and compact.

The essence of the experiment: "+" mutation - the insertion of one nucleotide. "-" mutation - loss of one nucleotide. A single "+" or "-" mutation at the beginning of a gene corrupts the entire gene. A double "+" or "-" mutation also spoils the entire gene. A triple "+" or "-" mutation at the beginning of the gene spoils only part of it. A quadruple "+" or "-" mutation again spoils the entire gene.

The experiment proves that the code is triplet and there are no punctuation marks inside the gene. The experiment was carried out on two adjacent phage genes and showed, in addition, the presence of punctuation marks between the genes.
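The reasoning behind the experiment can be imitated on a toy sequence. The sketch below is an illustration only: the sequence `ORIGINAL`, the inserted base and the function names are hypothetical, and "the rest of the gene is intact" is approximated by checking whether the last five codons are still read the same way after one to four nucleotides are inserted right after the first codon.

```python
def codons(seq):
    """Split a sequence into complete triplets, reading from the first nucleotide."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def insert_bases(seq, position, count, base="C"):
    """Model 'count' successive '+' mutations at the same position."""
    return seq[:position] + base * count + seq[position:]


# Hypothetical ten-codon message: AUG GCU CAU AAG GAU UUC GGA UGG CUG UAA
ORIGINAL = "AUGGCUCAUAAGGAUUUCGGAUGGCUGUAA"


def tail_preserved(count, tail=5):
    """True if the last 'tail' codons are unchanged after inserting 'count' bases."""
    mutated = insert_bases(ORIGINAL, 3, count)
    return codons(mutated)[-tail:] == codons(ORIGINAL)[-tail:]


results = {count: tail_preserved(count) for count in (1, 2, 3, 4)}
```

Only the triple insertion leaves the downstream codons unchanged, because it adds exactly one extra codon and restores the reading frame; one, two or four inserted nucleotides shift every codon that follows.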

3. Genetic information

Genetic information is a program of the properties of an organism, received from ancestors and embedded in hereditary structures in the form of a genetic code.

It is assumed that the formation of genetic information proceeded according to the scheme: geochemical processes - mineral formation - evolutionary catalysis (autocatalysis).

It is possible that the first primitive genes were microcrystals of clay, in which each new layer of clay was laid down in accordance with the structural features of the previous one, as if receiving information about the structure from it.

Genetic information is realized in the process of synthesis of protein molecules with the help of three types of RNA: messenger (mRNA), transfer (tRNA) and ribosomal (rRNA). Information is transferred through the channel of direct communication, DNA - RNA - protein, and through the feedback channel, environment - protein - DNA.

Living organisms are able to receive, store and transmit information. Moreover, living organisms tend to use the information they receive about themselves and the world around them as efficiently as possible. Hereditary information that is embedded in genes and that a living organism needs for existence, development and reproduction is transmitted from each individual to its descendants. This information determines the direction of development of the organism, and in the course of its interaction with the environment the individual's response can be altered, which makes the evolution of its descendants possible. In the course of evolution, new information arises and is retained, and the value of information to the organism increases.

In the course of the implementation of hereditary information under certain environmental conditions, the phenotype of organisms of a given biological species is formed.

Genetic information determines the morphological structure, growth, development, metabolism, mental makeup, predisposition to diseases and genetic defects of the body.

Many scientists, rightly emphasizing the role of information in the formation and evolution of living things, have noted this circumstance as one of the main criteria of life. Thus, V.I. Karagodin believes: "The living is such a form of existence of information and the structures encoded by it, which ensures the reproduction of this information in suitable environmental conditions." The connection of information with life is also noted by A.A. Lyapunov: "Life is a highly ordered state of matter that uses information encoded by the states of individual molecules to develop persistent reactions." The well-known astrophysicist N.S. Kardashev also emphasizes the informational component of life: "Life arises due to the possibility of synthesizing a special kind of molecules that are able to remember and use at first the simplest information about the environment and their own structure, which they use for self-preservation, for reproduction and, which is especially important for us, for obtaining more information." The physicist F. Tipler draws attention to this ability of living organisms to store and transmit information in his book "The Physics of Immortality": "I define life as some kind of coded information that is preserved by natural selection." Moreover, he believes that if this is so, then the life-information system is eternal, infinite and immortal.

The discovery of the genetic code and the establishment of the laws of molecular biology showed the need to combine modern genetics and the Darwinian theory of evolution. Thus, a new biological paradigm was born - the synthetic theory of evolution (STE), which can already be considered as non-classical biology.

The main ideas of Darwinian evolution with its triad of heredity, variability and natural selection are supplemented, in the modern view of the evolution of the living world, by the idea that selection is genetically determined. The development of the synthetic, or general, theory of evolution can be said to begin with the work of S.S. Chetverikov on population genetics, which showed that selection acts not on individual traits and individuals but on the genotype of the entire population, although it is carried out through the phenotypic traits of individual organisms. This leads to the spread of beneficial changes throughout the population. Thus, the mechanism of evolution is implemented both through random mutations at the genetic level and through the inheritance of the most valuable traits (the value of information!), which determine the adaptation of mutational traits to the environment and provide the most viable offspring.

Seasonal climate changes and various natural or man-made disasters lead, on the one hand, to changes in gene frequencies in populations and, as a result, to a decrease in hereditary variability; this process is sometimes called genetic drift. On the other hand, they lead to changes in the concentration of various mutations and to a decrease in the diversity of genotypes in the population, which can change the direction and intensity of selection.


4. Deciphering the human genetic code

In May 2006, scientists working on sequencing the human genome published a complete genetic map of chromosome 1, which was the last incompletely sequenced human chromosome.

A preliminary human genetic map was published in 2003, marking the formal end of the Human Genome Project. Within its framework, genome fragments containing 99% of human genes were sequenced. The accuracy of gene identification was 99.99%. However, at the end of the project, only four of the 24 chromosomes had been fully sequenced. The fact is that in addition to genes, chromosomes contain fragments that do not encode any traits and are not involved in protein synthesis. The role that these fragments play in the life of the organism is still unknown, but more and more researchers are inclined to believe that their study requires the closest attention.

A gene is a structural and functional unit of heredity that controls the development of a particular trait or property. Parents pass on a set of genes to their offspring during reproduction. A great contribution to the study of the gene was made by Russian scientists: Simashkevich E.A., Gavrilova Yu.A., Bogomazova O.V. (2011)

Currently, in molecular biology, it has been established that genes are sections of DNA that carry any integral information - about the structure of one protein molecule or one RNA molecule. These and other functional molecules determine the development, growth and functioning of the organism.

At the same time, each gene is characterized by a number of specific regulatory DNA sequences, such as promoters, which are directly involved in regulating the expression of the gene. Regulatory sequences can be located either in the immediate vicinity of the open reading frame that encodes the protein or of the beginning of the RNA sequence, as is the case with promoters (the so-called cis-regulatory elements), or at a distance of many millions of base pairs (nucleotides), as in the case of enhancers, insulators and suppressors (sometimes classified as trans-regulatory elements). Thus, the concept of a gene is not limited to the coding region of DNA, but is a broader concept that includes regulatory sequences.

Originally the term gene appeared as a theoretical unit for the transmission of discrete hereditary information. The history of biology remembers disputes about which molecules can be carriers of hereditary information. Most researchers believed that only proteins can be such carriers, since their structure (20 amino acids) allows you to create more options than the structure of DNA, which is composed of only four types of nucleotides. Later, it was experimentally proved that it is DNA that includes hereditary information, which was expressed as the central dogma of molecular biology.

Genes can undergo mutations - random or purposeful changes in the sequence of nucleotides in the DNA chain. Mutations can change the sequence, and therefore the biological characteristics, of a protein or RNA, which in turn can result in general or local altered or abnormal functioning of the organism. In some cases such mutations are pathogenic, since they result in disease, or are lethal at the embryonic stage. However, not all changes in the nucleotide sequence change the protein structure (because of the degeneracy of the genetic code) or change the sequence significantly, and such changes are not pathogenic. In particular, the human genome is characterized by single nucleotide polymorphisms and copy number variations, such as deletions and duplications, which make up about 1% of the entire human nucleotide sequence. Single nucleotide polymorphisms, in particular, define different alleles of the same gene.
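The effect of degeneracy on point mutations can be illustrated with a short sketch. It assumes the standard codon table written in the DNA alphabet (coding strand) with one-letter amino acid symbols and "*" for stop; the function name classify_substitution and its labels are hypothetical names chosen for this illustration.

```python
from itertools import product

# Standard genetic code, coding-strand DNA alphabet.
# Codons are enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify a single-nucleotide substitution inside one codon.

    position is 0, 1 or 2 (index within the codon).
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"   # protein unchanged, thanks to degeneracy
    if before == "*":
        return "stop-lost"    # a stop codon now encodes an amino acid
    if after == "*":
        return "nonsense"     # premature stop codon
    return "missense"         # a different amino acid is encoded


print(classify_substitution("GAA", 2, "G"))  # GAA -> GAG, both glutamate
print(classify_substitution("GAG", 1, "T"))  # GAG -> GTG, glutamate -> valine
print(classify_substitution("TAC", 2, "A"))  # TAC -> TAA, tyrosine -> stop
```

In this simplified model only the first case leaves the protein sequence unchanged; whether a missense change is actually pathogenic depends on the protein and cannot be read off the table.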

The monomers that make up each of the DNA chains are complex organic compounds. Each includes a nitrogenous base - adenine (A), thymine (T), cytosine (C) or guanine (G) - a five-carbon sugar (the pentose deoxyribose, from which DNA takes its name) and a phosphoric acid residue. These compounds are called nucleotides.

Gene properties

  1. discreteness - immiscibility of genes;
  2. stability - the ability to maintain a structure;
  3. lability - the ability to repeatedly mutate;
  4. multiple allelism - many genes exist in a population in a variety of molecular forms;
  5. allelism - in the genotype of a diploid organism, only two forms of a gene are present;
  6. specificity - each gene encodes its own trait;
  7. pleiotropy - multiple effect of a gene;
  8. expressivity - the degree of expression of a gene in a trait;
  9. penetrance - the frequency of manifestation of a gene in the phenotype;
  10. amplification - an increase in the number of copies of a gene.

Classification

  1. Structural genes - unique components of the genome, each representing a single sequence that encodes a specific protein or some type of RNA.
  2. Functional genes - regulate the work of structural genes.

Genetic code - a method, common to all living organisms, of encoding the amino acid sequence of proteins using a sequence of nucleotides.

Four nucleotides are used in DNA: adenine (A), guanine (G), cytosine (C) and thymine (T). These letters make up the alphabet of the genetic code. RNA uses the same nucleotides, except that thymine is replaced by a similar nucleotide, uracil, denoted by the letter U. In DNA and RNA molecules, nucleotides line up in chains, producing sequences of genetic letters.

Genetic code

There are 20 different amino acids used in nature to build proteins. Each protein is a chain or several chains of amino acids in a strictly defined sequence. This sequence determines the structure of the protein, and therefore all its biological properties. The set of amino acids is also universal for almost all living organisms.

The implementation of genetic information in living cells (that is, the synthesis of a protein encoded by a gene) is carried out by two template-directed processes: transcription (the synthesis of mRNA on a DNA template) and translation of the genetic code into an amino acid sequence (the synthesis of a polypeptide chain on mRNA). Three consecutive nucleotides are enough to encode the 20 amino acids as well as the stop signal, which marks the end of the protein sequence. A set of three nucleotides is called a triplet. Accepted abbreviations corresponding to amino acids and codons are shown in the figure.
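Why three nucleotides are enough, and two are not, follows from simple counting: a word of length n over four letters has 4^n variants, and at least 21 signals are needed (20 amino acids plus stop). A minimal sketch of that arithmetic, with helper names that are illustrative only:

```python
from itertools import product

BASES = "ACGU"           # the four RNA nucleotides
SIGNALS_NEEDED = 20 + 1  # 20 amino acids plus one stop signal


def codon_count(length: int) -> int:
    """Number of distinct nucleotide words of the given length (4 ** length)."""
    return sum(1 for _ in product(BASES, repeat=length))


def min_codon_length() -> int:
    """Shortest word length that provides enough distinct codons."""
    n = 1
    while codon_count(n) < SIGNALS_NEEDED:
        n += 1
    return n


for n in (1, 2, 3):
    print(n, codon_count(n))   # 1 4, then 2 16, then 3 64
print(min_codon_length())      # 3
```

Since 64 codons are available for 21 signals, most amino acids can be written by more than one triplet, which is the source of the degeneracy discussed among the properties of the code.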

Properties

  1. Triplet nature - the meaningful unit of the code is a combination of three nucleotides (a triplet, or codon).
  2. Continuity - there are no punctuation marks between triplets, that is, the information is read continuously.
  3. Non-overlapping - the same nucleotide cannot be part of two or more triplets at the same time (this does not hold for some overlapping genes of viruses, mitochondria and bacteria, which encode several proteins read with a frameshift).
  4. Unambiguity (specificity) - a given codon corresponds to only one amino acid (however, the UGA codon in Euplotes crassus codes for two amino acids - cysteine and selenocysteine).
  5. Degeneracy (redundancy) - several codons can correspond to the same amino acid.
  6. Universality - the genetic code works in the same way in organisms of different levels of complexity, from viruses to humans (genetic engineering methods are based on this; there are a number of exceptions, shown in the table in the "Variations of the standard genetic code" section below).
  7. Noise immunity - nucleotide substitutions that do not change the class of the encoded amino acid are called conservative; nucleotide substitutions that change the class of the encoded amino acid are called radical.
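Several of the properties listed above can be checked directly against the standard codon table. The sketch below assumes the standard code in the RNA alphabet with one-letter amino acid symbols and "*" for stop; it does not model the exceptions mentioned in the list, and read_triplets is a hypothetical helper name.

```python
from collections import Counter
from itertools import product

# Standard genetic code; codons are enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def read_triplets(mrna: str) -> list[str]:
    """Continuous, non-overlapping reading: each nucleotide falls into
    exactly one triplet and nothing separates neighbouring triplets."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Unambiguity: a dictionary lookup yields exactly one meaning per codon.
print([CODON_TABLE[c] for c in read_triplets("AUGGCCUAA")])  # ['M', 'A', '*']

# Degeneracy: count how many codons stand for each amino acid or stop.
degeneracy = Counter(CODON_TABLE.values())
print(degeneracy["L"], degeneracy["M"], degeneracy["*"])     # 6 1 3
```

The counts show the asymmetry of the code: leucine is written by six different codons, while methionine has a single one.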

Protein biosynthesis and its steps

Protein biosynthesis - a complex multi-stage process in which a polypeptide chain is synthesized from amino acid residues; it takes place on the ribosomes of the cells of living organisms with the participation of mRNA and tRNA molecules.

Protein biosynthesis can be divided into the stages of transcription, processing and translation. During transcription, the genetic information encoded in DNA molecules is read and written into mRNA molecules. During a series of successive processing steps, fragments that are not needed at later stages are removed from the mRNA, and the nucleotide sequences are edited. After the mRNA is transported from the nucleus to the ribosomes, the actual synthesis of protein molecules takes place by attaching individual amino acid residues to the growing polypeptide chain.
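The transcription step can be sketched as a base-by-base complementary copy of the DNA template, with uracil taking the place of thymine. This is a simplified illustration: it assumes the template strand is given in the 3'-to-5' direction, so that the resulting mRNA reads 5'-to-3', and the function name transcribe is hypothetical.

```python
# Complementary pairing used during transcription:
# template A -> U, C -> G, G -> C, T -> A in the new RNA strand.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a DNA template strand
    that is written in the 3'->5' direction."""
    return template_3_to_5.upper().translate(DNA_TO_RNA)


print(transcribe("TACCGGATT"))  # AUGGCCUAA
```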

Between transcription and translation, the mRNA molecule undergoes a series of successive changes that ensure the maturation of a functioning template for the synthesis of the polypeptide chain. A cap is attached to the 5' end, and a poly-A tail is attached to the 3' end, which increases the lifespan of the mRNA. With the advent of processing in a eukaryotic cell, it became possible to combine gene exons to obtain a greater variety of proteins encoded by a single DNA nucleotide sequence - alternative splicing.
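How one nucleotide sequence can yield several mature mRNAs can be shown with a toy model of alternative splicing. The sketch assumes the simplest scheme, exon skipping: the first and last exons are always kept, any internal exon may be left out, and the order of exons is preserved. Real splicing is regulated and far more varied; the function name splice_variants and the exon sequences are invented for the illustration.

```python
from itertools import combinations


def splice_variants(exons: list[str]) -> list[str]:
    """Enumerate mature mRNAs under a toy exon-skipping model."""
    if len(exons) < 2:
        return ["".join(exons)]
    first, *internal, last = exons
    variants = []
    # Keep k internal exons at a time, from all of them down to none;
    # combinations() preserves the original exon order.
    for k in range(len(internal), -1, -1):
        for kept in combinations(internal, k):
            variants.append(first + "".join(kept) + last)
    return variants


# Three hypothetical exons: the middle one is optional.
for mrna in splice_variants(["AUGGCC", "UGG", "GAAUAA"]):
    print(mrna)
```

With n internal exons this model gives 2^n variants, which conveys why splicing increases the variety of proteins obtainable from a single gene.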

Translation is the synthesis of a polypeptide chain in accordance with the information encoded in messenger RNA. The amino acid sequence is assembled with the help of transfer RNA (tRNA) molecules, which form complexes with amino acids - aminoacyl-tRNAs. Each amino acid has its own tRNA, which carries an anticodon that "matches" the mRNA codon. During translation, the ribosome moves along the mRNA as the polypeptide chain grows. Energy for protein synthesis is provided by ATP.
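The movement of the ribosome along the mRNA and the matching of codons by anticodons can be sketched as follows. The model is deliberately simplified: it assumes the standard codon table, starts reading at the first AUG, stops at the first stop codon, and treats the anticodon as the exact reverse complement of the codon (wobble pairing is ignored). The function names are hypothetical.

```python
from itertools import product

# Standard genetic code; codons are enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon(codon: str) -> str:
    """tRNA anticodon (written 5'->3') that pairs with the given codon."""
    return codon.translate(RNA_COMPLEMENT)[::-1]


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon from the first AUG to the first stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":      # stop codon: the chain is released
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(anticodon("AUG"))               # CAU
print(translate("GGAUGGCCUGGUAAAC"))  # MAW
```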

The finished protein molecule is then cleaved from the ribosome and transported to the right place in the cell. Some proteins require additional post-translational modification to reach their active state.