The normal (Gaussian) distribution law

Random variables are associated with random events. We speak of random events when the outcome obtained under given conditions cannot be predicted unambiguously.

Suppose we are tossing an ordinary coin. The result of this procedure is not uniquely determined: one can only say with certainty that one of two things will happen, either heads or tails will come up. Either of these events is random. We can introduce a variable that describes the outcome of this random event. Obviously, this variable takes two discrete values: heads and tails. Since we cannot predict in advance which of the two possible values the variable will take, we are dealing with a random variable.

Now suppose that in an experiment we measure the subject's reaction time to the presentation of some stimulus. As a rule, even when the experimenter takes every measure to standardize the experimental conditions, minimizing or even eliminating possible variation in the presentation of the stimulus, the measured reaction times will still differ. In this case we say that the subject's reaction time is described by a random variable. Since in principle any value of the reaction time can be obtained in the experiment - the set of possible values of the reaction time turns out to be infinite - the random variable is said to be continuous.

The question arises: are there any regularities in the behavior of random variables? The answer turns out to be affirmative.

Thus, if one tosses the same coin a very large number of times, one finds that the number of occurrences of each of the two sides is approximately the same - unless, of course, the coin is biased or bent. To capture this regularity, the concept of the probability of a random event is introduced. It is clear that in a coin toss one of the two possible events occurs without fail; this means that their total probability is 100%. If we assume that the two events associated with tossing the coin occur with equal probability, then the probability of each outcome separately is obviously 50%. Thus, theoretical considerations allow us to describe the behavior of this random variable. Such a description is denoted in mathematical statistics by the term "distribution of a random variable".
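The regularity just described can be illustrated with a short simulation (a sketch; the seed and the sample sizes are arbitrary choices, not from the text):

```python
import random

# As the number of tosses grows, the observed frequency of heads
# approaches the probability 0.5.
random.seed(1)

for n in (100, 10_000, 1_000_000):
    heads = sum(random.randint(0, 1) for _ in range(n))
    print(n, heads / n)
```

With a fair (unbiased) simulated coin, the printed frequencies drift toward 0.5 as n grows.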

The situation is more complicated with a random variable that does not have a well-defined set of values, i.e. is continuous. But even in this case important regularities in its behavior can be noted. When measuring a subject's reaction time, one notices that different intervals of reaction duration occur with different probabilities. It is rare for the subject to react too quickly: in semantic decision tasks, for example, subjects practically never manage to respond more or less accurately in under 500 ms (half a second). Similarly, a subject faithfully following the experimenter's instructions is unlikely to delay the response greatly; in semantic decision tasks, responses longer than about 5 s are usually considered unreliable. Nevertheless, with 100% certainty it can be assumed that the reaction time will lie in the range from 0 to +∞. This total probability is made up of the probabilities over all possible values of the random variable. Therefore, the distribution of a continuous random variable can be described by a continuous function y = f(x).

If we are dealing with a discrete random variable whose possible values are all known in advance, as in the coin example, it is usually not very difficult to build a model of its distribution: it suffices to introduce some reasonable assumptions, as we did in the example under consideration. The situation is more complicated for the distribution of continuous quantities, which take a number of values not known in advance. Of course, if we had, for example, developed a theoretical model describing the behavior of a subject in a reaction-time experiment with a semantic decision task, we could try to describe the theoretical distribution of the specific reaction-time values of the same subject upon presentation of one and the same stimulus. However, this is not always possible. Therefore, the experimenter may be forced to assume that the distribution of the random variable of interest is described by some law already studied in advance. Most often, although this may not always be absolutely correct, the so-called normal distribution is used for this purpose; it acts as a standard for the distribution of any random variable, regardless of its nature. This distribution was first described mathematically in the first half of the 18th century by de Moivre.

The normal distribution occurs when the phenomenon of interest is subject to the influence of an infinite number of random factors that balance one another. Formally, the normal distribution, as de Moivre showed, can be described by the following relation:

P(X) = (1/(σ√(2π))) e^(−(X − μ)²/(2σ²)),  (1.1)

where X represents the random variable of interest whose behavior we study; P is the probability density associated with this random variable; π and e are the well-known mathematical constants describing, respectively, the ratio of a circle's circumference to its diameter and the base of the natural logarithm; μ and σ² are the parameters of the normal distribution of the random variable, respectively the mathematical expectation and the variance of the random variable X.

To describe the normal distribution, it turns out to be necessary and sufficient to define only the parameters μ and σ2.

Therefore, if we have a random variable whose behavior is described by equation (1.1) with particular values of μ and σ², we can denote it as N(μ, σ²) without recalling all the details of this equation.
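As an illustration, equation (1.1) can be evaluated numerically; the function name and the default parameter values below are illustrative assumptions, not from the text:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of N(mu, sigma^2) as given by equation (1.1)."""
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# The curve peaks at x = mu with height 1/(sigma*sqrt(2*pi)),
# which is about 0.3989 for sigma = 1.
print(round(normal_pdf(0.0), 4))  # → 0.3989
```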

Fig. 1.1.

Any distribution can be represented visually as a graph. Graphically, the normal distribution has the form of a bell-shaped curve whose exact shape is determined by the parameters of the distribution, i.e. the mathematical expectation and the variance. The parameters of the normal distribution can take almost any values, limited only by the measurement scale used by the experimenter. In theory, the mathematical expectation can be any number from −∞ to +∞, and the variance any non-negative number. Therefore, there is an infinite number of different normal distributions and, accordingly, an infinite number of curves representing them (all, however, of a similar bell shape). Clearly, it is impossible to describe them all. However, if the parameters of a particular normal distribution are known, it can be converted to the so-called unit normal distribution, whose mathematical expectation is zero and whose variance is one. This normal distribution is also called the standard or z-distribution. The graph of the unit normal distribution is shown in Fig. 1.1, from which it is evident that the peak of the bell-shaped curve corresponds to the value of the mathematical expectation. The other parameter of the normal distribution - the variance - characterizes the degree of "spread" of the bell-shaped curve along the horizontal (abscissa) axis.
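The conversion to the unit (z) distribution mentioned above is just a linear rescaling; a minimal sketch, with invented reaction-time numbers used purely for illustration:

```python
def to_z(x, mu, sigma):
    """Convert a value of N(mu, sigma^2) to the unit (z) scale."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Inverse transformation back to the original scale."""
    return mu + z * sigma

# Hypothetical reaction-time model N(700, 100^2) in milliseconds
# (the numbers 700 and 100 are invented for this example):
z = to_z(900.0, 700.0, 100.0)
print(z)  # → 2.0
```

A reading of 900 ms under this assumed model lies exactly two standard deviations above the mean.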

The normal distribution occupies a special position compared to other types of distributions. Its main feature is that other distribution laws tend to it as the number of trials is repeated without bound. How is this distribution obtained?

Imagine that, taking a hand dynamometer, you station yourself in the most crowded place in your city and offer everyone who passes by to measure their strength by squeezing the dynamometer with the right or left hand. You carefully record the readings. After some time, with a sufficiently large number of trials, you plot the dynamometer readings on the abscissa axis and the number of people who "squeezed out" each reading on the ordinate axis, connecting the points with a smooth line. The result is the curve shown in Fig. 9.8. The shape of this curve will not change much as the experiment continues; moreover, from some point on, new values will only refine the curve without changing its shape.


Fig. 9.8.

Now let us move with our dynamometer to a gym and repeat the experiment. This time the maximum of the curve shifts to the right, its left tail is somewhat stretched out, while its right end is steeper (Fig. 9.9).


Fig. 9.9.

Note that the maximum frequency for the second distribution (point B) is lower than the maximum frequency for the first distribution (point A). This is explained by the fact that the total number of people visiting the gym is smaller than the number of people who passed near the experimenter in the first case (a crowded place in the city center). The maximum has shifted to the right because gyms are attended by people physically stronger than the general population.

And finally, we will visit schools, kindergartens and nursing homes with the same goal: to measure the hand strength of the visitors of these places. Again the distribution curve will have a similar shape, but now, obviously, its left tail will be steeper and its right tail more stretched out. As in the second case, the maximum (point C) will be lower than point A (Fig. 9.10).


Fig. 9.10.

This remarkable property of the normal distribution - preserving the shape of the probability density curve (Figs. 9.8-9.10) - was noticed and described in 1733 by de Moivre and later investigated by Gauss.

In scientific research, in technology, and in mass phenomena or experiments, when it comes to repeated realizations of random variables under constant experimental conditions, one says that the test results experience random scatter obeying the law of the normal distribution curve

f(x) = (1/(σ√(2π))) e^(−(x − a)²/(2σ²)),  (21)

where a is the most frequently occurring value (the center of the distribution) and σ is the standard deviation. As a rule, in formula (21) the arithmetic mean of the test results is used instead of the parameter a; the longer the experimental series, the less it will differ from the mathematical expectation. The area under the curve (Fig. 9.11) is taken equal to one. The area corresponding to any interval of the x-axis is numerically equal to the probability of a random result falling into this interval.
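The statement about areas can be checked numerically; a sketch using simple midpoint integration of the density (21):

```python
import math

def normal_area(lo, hi, a, sigma, n=20_000):
    """Midpoint-rule integral of the normal density (21) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += math.exp(-((x - a) ** 2) / (2.0 * sigma ** 2))
    return total * h / (sigma * math.sqrt(2.0 * math.pi))

# The total area under the curve is 1; the area over an interval equals
# the probability of a result falling into that interval.
print(round(normal_area(-50.0, 50.0, 0.0, 1.0), 4))  # → 1.0
print(round(normal_area(-1.0, 1.0, 0.0, 1.0), 4))    # → 0.6827
```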


Fig. 9.11.

The normal distribution function has the form


F(x) = (1/(σ√(2π))) ∫_{−∞}^{x} e^(−(t − a)²/(2σ²)) dt.  (22)

Note that the normal curve (Fig. 9.11) is symmetric with respect to the straight line x = a and asymptotically approaches the Ox axis as x → ±∞.

Calculate the mathematical expectation for the normal law


M(X) = ∫_{−∞}^{+∞} x f(x) dx = (1/(σ√(2π))) ∫_{−∞}^{+∞} x e^(−(x − a)²/(2σ²)) dx = a.  (23)

Properties of the normal distribution

Let us consider the main properties of this most important distribution.

Property 1. The density function of the normal distribution (21) is defined on the entire x-axis.

Property 2. The density function of the normal distribution (21) is greater than zero for any x from the domain of definition (f(x) > 0).

Property 3. As x increases (or decreases) without bound, the density function (21) tends to zero: f(x) → 0 as x → ±∞.

Property 4. At x = a, the density function (21) attains its largest value, equal to

f(a) = 1/(σ√(2π)).  (24)

Property 5. The graph of the function f(x) (Fig. 9.11) is symmetric with respect to the straight line x = a.

Property 6. The graph of the function (Fig. 9.11) has two inflection points symmetric about the straight line x = a:

x = a ± σ,  f(a ± σ) = 1/(σ√(2πe)).  (25)

Property 7. All odd central moments are equal to zero. Note that, using property 7, the asymmetry (skewness) of the distribution is determined by the formula A = μ₃/σ³. If A = 0, one concludes that the distribution under study is symmetric with respect to the straight line x = a. If A > 0, the series is said to be shifted to the right (a more gently sloping or stretched-out right branch of the graph). If A < 0, it is considered that the series is shifted to the left (a more flattened left branch of the graph in Fig. 9.12).


Fig. 9.12.

Property 8. The kurtosis of the distribution is 3. In practice, the excess kurtosis E = μ₄/σ⁴ − 3 is often calculated, and the degree of "compression" or "blurring" of the graph is judged by the proximity of this value to zero (Fig. 9.13). Since it is related to σ, it ultimately characterizes the degree of dispersion of the data frequencies.
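Properties 7 and 8 can be checked on simulated data (a sketch; the seed, the σ value and the sample size are arbitrary choices, not from the text):

```python
import math
import random

# For normally distributed data the sample skewness is ≈ 0
# and the sample kurtosis is ≈ 3.
random.seed(0)
sigma = 2.0
sample = [random.gauss(0.0, sigma) for _ in range(200_000)]

def central_moment(data, k):
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

s = math.sqrt(central_moment(sample, 2))
skewness = central_moment(sample, 3) / s ** 3
kurtosis = central_moment(sample, 4) / s ** 4
print(abs(skewness) < 0.05, abs(kurtosis - 3.0) < 0.1)  # → True True
```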

The most famous and most frequently used law in probability theory is the normal distribution law, or Gauss's law.

The main feature of the normal distribution law is that it is the limiting law for other distribution laws.

Note that for a normal distribution, the integral function has the form:

F(x) = (1/(σ√(2π))) ∫_{−∞}^{x} e^(−(t − a)²/(2σ²)) dt.

Let us now show that the probabilistic meaning of the parameters a and σ is as follows: a is the mathematical expectation and σ is the standard deviation (that is, σ² is the variance) of the normal distribution:

a) by definition of the mathematical expectation of a continuous random variable, we have

M(X) = ∫_{−∞}^{+∞} x f(x) dx = (1/(σ√(2π))) ∫_{−∞}^{+∞} x e^(−(x − a)²/(2σ²)) dx.

Substituting t = (x − a)/(σ√2), so that x = σ√2·t + a and dx = σ√2 dt, we obtain

M(X) = (σ√2/√π) ∫_{−∞}^{+∞} t e^(−t²) dt + (a/√π) ∫_{−∞}^{+∞} e^(−t²) dt.

Indeed,

∫_{−∞}^{+∞} t e^(−t²) dt = 0,

since there is an odd function under the integral sign and the limits of integration are symmetric with respect to the origin; and

∫_{−∞}^{+∞} e^(−t²) dt = √π — the Poisson integral.

So, the mathematical expectation of the normal distribution is equal to the parameter a: M(X) = a.

b) by definition of the variance of a continuous random variable, and taking into account that M(X) = a, we can write

D(X) = ∫_{−∞}^{+∞} (x − a)² f(x) dx = (1/(σ√(2π))) ∫_{−∞}^{+∞} (x − a)² e^(−(x − a)²/(2σ²)) dx.

The same substitution t = (x − a)/(σ√2) gives D(X) = (2σ²/√π) ∫_{−∞}^{+∞} t² e^(−t²) dt. Integrating by parts, setting u = t and dv = t e^(−t²) dt, we find

D(X) = (2σ²/√π) · (√π/2) = σ².

Consequently, σ(X) = √D(X) = σ.
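The result D(X) = σ² can be verified by direct numerical integration (a sketch with illustrative parameter values):

```python
import math

def normal_var_numeric(a, sigma, n=200_000, span=12.0):
    """Midpoint-rule approximation of D(X) = ∫ (x - a)^2 f(x) dx."""
    lo, hi = a - span * sigma, a + span * sigma
    h = (hi - lo) / n
    s = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        s += (x - a) ** 2 * math.exp(-((x - a) ** 2) / (2.0 * sigma ** 2))
    return s * h / (sigma * math.sqrt(2.0 * math.pi))

# D(X) should come out as sigma^2, so the standard deviation is sigma itself.
print(round(normal_var_numeric(0.0, 1.5), 6))  # → 2.25
```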

So, the standard deviation of the normal distribution is equal to the parameter σ.

If a = 0 and σ = 1, the normal distribution is called the normalized (or standard normal) distribution. Then, obviously, the normalized density (differential) function and the normalized integral distribution function are written respectively in the form:

φ(x) = (1/√(2π)) e^(−x²/2),   Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^(−t²/2) dt.

(The function Φ(x), as is known, is called the Laplace function (see LECTURE 5) or the probability integral. Both functions, φ(x) and Φ(x), are tabulated and their values are recorded in the corresponding tables.)

Normal distribution properties (normal curve properties):

1. Obviously, the function f(x) is defined on the entire real line.

2. f(x) > 0, that is, the normal curve lies above the Ox axis.

3. f(x) → 0 as x → ±∞, that is, the Ox axis serves as the horizontal asymptote of the graph.

4. The normal curve is symmetric about the straight line x = a (accordingly, as a function of (x − a), the graph is symmetric about the Oy axis).

Therefore, we can write: f(a − x) = f(a + x).

5. The function attains its maximum at x = a, equal to f(a) = 1/(σ√(2π)).

6. It is easy to show that the points x = a − σ and x = a + σ are the inflection points of the normal curve (prove it yourself).

7. It is obvious that μ₁ = M(X − a) = 0; but since the density is symmetric about x = a, the integrand of every odd central moment is an odd function of (x − a), and therefore all odd central moments are equal to zero: μ₂ₖ₊₁ = 0.

For the even moments, we can write: μ₂ₖ = (2k − 1)!! σ^(2k); in particular, μ₂ = σ² and μ₄ = 3σ⁴.

8. The asymmetry (skewness) is zero: A = μ₃/σ³ = 0.

9. The excess kurtosis is zero: E = μ₄/σ⁴ − 3 = 0.

10. P(α < X < β) = Φ((β − a)/σ) − Φ((α − a)/σ), where Φ is the normalized integral distribution function (the Laplace function).

11. For negative values of the argument: Φ(−x) = 1 − Φ(x), where Φ is the normalized integral distribution function.


13. The probability that the random variable falls in an interval symmetric about the center of the distribution is equal to:

P(|X − a| < δ) = 2Φ(δ/σ) − 1.

EXAMPLE 3. Show that a normally distributed random variable X deviates from its expectation M(X) by no more than 3σ, as a rule.

Solution. For a normal distribution: P(|X − a| < 3σ) = 2Φ(3) − 1 = 2 · 0.99865 − 1 = 0.9973.

In other words, the probability that the absolute value of the deviation will exceed triple the standard deviation is very small, namely 0.0027. This means that only in 0.27% of cases this can happen. Such events, based on the principle of the impossibility of unlikely events, can be considered practically impossible.

So, an event with a probability of 0.9973 can be considered practically certain; that is, the random variable deviates from the mathematical expectation by no more than 3σ.
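The same 0.9973 can be computed directly through the error function, to which the Laplace function reduces (a sketch):

```python
import math

def within_k_sigma(k):
    """P(|X - a| < k*sigma) for any normal variable, via the error function."""
    return math.erf(k / math.sqrt(2.0))

print(round(within_k_sigma(3), 4))      # → 0.9973
print(round(1 - within_k_sigma(3), 4))  # → 0.0027
```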

EXAMPLE 4. Knowing the characteristics of the normal distribution of the random variable X (the tensile strength of steel): the mathematical expectation a (kg/mm²) and the standard deviation σ (kg/mm²), find the probability of obtaining steel with a tensile strength from 31 kg/mm² to 35 kg/mm².

Solution.

3. The exponential distribution law

The exponential distribution is the probability distribution of a continuous random variable X described by the differential function (distribution density)

f(x) = λ e^(−λx) for x ≥ 0,  f(x) = 0 for x < 0,

where λ is a constant positive value.

The exponential distribution is defined by a single parameter λ. This feature indicates its advantage over distributions that depend on a larger number of parameters: usually the parameters are unknown and one has to find their estimates (approximate values), and it is, of course, easier to estimate one parameter than two, three, etc.

It is easy to write the integral function of the exponential distribution:

F(x) = 1 − e^(−λx) for x ≥ 0,  F(x) = 0 for x < 0.

We have defined the exponential distribution using a differential function; it is clear that it can be determined using the integral function.

Comment. Consider a continuous random variable T - the duration of the product's failure-free operation - and denote its values by t, t ≥ 0. The cumulative distribution function F(t) = 1 − e^(−λt) defines the probability of failure of the product over a time interval of length t. Therefore, the probability of failure-free operation over the same time t, that is, the probability of the opposite event, is equal to R(t) = 1 − F(t) = e^(−λt).
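The failure/reliability relationship can be sketched as follows (the failure-rate value used here is illustrative, not from the text):

```python
import math

def failure_cdf(t, lam):
    """F(t) = 1 - exp(-lam*t): probability of failure before time t."""
    return 1.0 - math.exp(-lam * t)

def reliability(t, lam):
    """R(t) = exp(-lam*t): probability of failure-free operation up to time t."""
    return math.exp(-lam * t)

# An assumed failure rate of lam = 0.1 per hour:
print(round(reliability(10.0, 0.1), 4))  # → 0.3679
```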

Definition. Normal is the name given to the probability distribution of a continuous random variable described by the probability density

f(x) = (1/(σ√(2π))) e^(−(x − m)²/(2σ²)).

The normal distribution is also called Gauss law.

The normal distribution law is central to the theory of probability. This is due to the fact that this law manifests itself in all cases when a random variable is the result of the action of a large number of different factors. All other distribution laws approach the normal law.

It can easily be shown that the parameters m and σ in the distribution density are, respectively, the mathematical expectation and the standard deviation of the random variable X.

Find the distribution function F(x):

F(x) = (1/(σ√(2π))) ∫_{−∞}^{x} e^(−(t − m)²/(2σ²)) dt.

The normal distribution density plot is called normal curve or Gaussian curve.

A normal curve has the following properties:

1) The function is defined on the entire number axis.

2) For all x the density function takes only positive values.

3) The Ox axis is the horizontal asymptote of the probability density graph, since as the absolute value of the argument x increases without bound, the value of the function tends to zero.

4) Find the extremum of the function.

Since y′ > 0 for x < m and y′ < 0 for x > m, at the point x = m the function has a maximum equal to 1/(σ√(2π)).

5) The function is symmetric with respect to the straight line x = m, because the difference (x − m) enters the distribution density function only squared.

6) To find the inflection points of the graph, we find the second derivative of the density function.

At x = m + σ and x = m − σ the second derivative equals zero, and on passing through these points it changes sign, i.e. the function has inflection points there.

At these points the value of the function is 1/(σ√(2πe)).

Let's build a graph of the distribution density function.

The graphs were built for m = 0 and three possible values of the standard deviation: σ = 1, σ = 2 and σ = 7. As can be seen, as the standard deviation increases, the graph becomes flatter and the maximum value decreases.

If the mathematical expectation is positive, the graph shifts in the positive direction; if it is negative, in the negative direction.

At m = 0 and σ = 1 the curve is called normalized. The equation of the normalized curve:

y = (1/√(2π)) e^(−x²/2).

For brevity, we say that the RV X obeys the law N(m, σ), i.e. X ~ N(m, σ). The parameters m and σ coincide with the main characteristics of the distribution: m = m_X, σ = σ_X = √D_X. If RV X ~ N(0, 1), it is called a standardized normal variable. The distribution function of a standardized normal variable is called the Laplace function and is denoted Φ(x). It can be used to calculate interval probabilities for the normal distribution N(m, σ):

P(x₁ ≤ X < x₂) = Φ((x₂ − m)/σ) − Φ((x₁ − m)/σ).

When solving problems on the normal distribution, one often needs tabulated values of the Laplace function. Since the Laplace function satisfies the relation Φ(−x) = 1 − Φ(x), it suffices to have tabulated values of Φ(x) for positive arguments only.

For the probability of hitting an interval symmetric with respect to the mathematical expectation, the following formula holds: P(|X − m_X| < ε) = 2Φ(ε/σ) − 1.

The central moments of the normal distribution satisfy the recursive relation: μ_{n+2} = (n + 1)σ² μ_n, n = 1, 2, … . It follows that all central moments of odd order are equal to zero (since μ₁ = 0).

Let us find the probability that a random variable distributed according to the normal law falls into a given interval (α, β):

P(α < X < β) = (1/(σ√(2π))) ∫_{α}^{β} e^(−(x − m)²/(2σ²)) dx.

Denote t = (x − m)/(σ√2). Because the integral ∫ e^(−t²) dt is not expressed in terms of elementary functions, the function

Φ(x) = (2/√π) ∫_{0}^{x} e^(−t²) dt

is introduced into consideration, which is called the Laplace function or the probability integral. In this notation,

P(α < X < β) = (1/2)[Φ((β − m)/(σ√2)) − Φ((α − m)/(σ√2))].

The values of this function for various x have been calculated and are presented in special tables.

Below is a graph of the Laplace function.

The Laplace function has the following properties:

2) Φ(−x) = −Φ(x);

The Laplace function is also called the error function and is denoted erf x.

The normalized Laplace function is also in use; it is related to the Laplace function by the relation

Φ_norm(x) = (1/2)[1 + Φ(x/√2)]

(this is the distribution function of the normalized normal law).

Below is a plot of the normalized Laplace function.

When considering the normal distribution, an important special case is singled out, known as the three sigma rule.

Let us write down the probability that the deviation of a normally distributed random variable from the mathematical expectation is less than a given value Δ:

P(|X − m| < Δ) = Φ(Δ/(σ√2)).

If we take Δ = 3σ, then, using the tables of values of the Laplace function, we obtain:

P(|X − m| < 3σ) = Φ(3/√2) ≈ 0.9973.

That is, the probability that a random variable deviates from its mathematical expectation by more than three times the standard deviation is practically zero.

This rule is called three sigma rule.

In practice, it is considered that if the three sigma rule is satisfied for some random variable, then this random variable has a normal distribution.

Example. A train consists of 100 wagons. The mass of each wagon is a random variable distributed according to the normal law with mathematical expectation a = 65 t and standard deviation σ = 0.9 t. A locomotive can pull a train of mass at most 6600 t; otherwise a second locomotive must be attached. Find the probability that the second locomotive will not be required.

The second locomotive is not required if the deviation of the mass of the train from the expected mass (100 · 65 = 6500 t) does not exceed 6600 − 6500 = 100 t.

Since the mass of each wagon has a normal distribution, the mass of the whole train is also normally distributed.

We get: the standard deviation of the train mass is σ√100 = 0.9 · 10 = 9 t, and the required probability is

P(|X − 6500| < 100) = Φ(100/(9√2)) ≈ 1,

so the second locomotive will almost certainly not be required.
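The computation can be reproduced as follows (using the standard normal distribution function expressed through the error function):

```python
import math

n, m, s = 100, 65.0, 0.9
mean_total = n * m              # 6500 t
sigma_total = s * math.sqrt(n)  # 9 t: variances of independent wagons add

def phi(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Probability that the total mass does not exceed 6600 t:
p = phi((6600.0 - mean_total) / sigma_total)
print(p > 0.999999)  # → True
```

The threshold lies about 11 standard deviations above the mean, so the probability is indistinguishable from 1 in floating point.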

Example. A normally distributed random variable X is given by its parameters: a = 2 (the mathematical expectation) and σ = 1 (the standard deviation). It is required to write the probability density and plot it, to find the probability that X takes a value from the interval (1; 3), and to find the probability that X deviates (in absolute value) from the mathematical expectation by no more than 2.

The distribution density has the form:

f(x) = (1/√(2π)) e^(−(x − 2)²/2).

Let's build a graph:

Let us find the probability of the random variable falling in the interval (1; 3):

P(1 < X < 3) = (1/2)[Φ(1/√2) − Φ(−1/√2)] = Φ(1/√2) ≈ 0.6827.

Find the probability that the random variable deviates from the mathematical expectation by no more than 2:

P(|X − 2| < 2) = Φ(2/√2) = Φ(√2) ≈ 0.9545.

The same result can be obtained using the normalized Laplace function.
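Both probabilities can be reproduced through the standard normal distribution function (a sketch):

```python
import math

a, sigma = 2.0, 1.0

def phi(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

p_interval = phi((3.0 - a) / sigma) - phi((1.0 - a) / sigma)  # P(1 < X < 3)
p_deviation = 2.0 * phi(2.0 / sigma) - 1.0                    # P(|X - 2| <= 2)
print(round(p_interval, 4), round(p_deviation, 4))  # → 0.6827 0.9545
```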

Lecture 8. The law of large numbers (Section 2)

Lecture plan

Central limit theorem (general formulation and particular formulation for independent identically distributed random variables).

Chebyshev's inequality.

The law of large numbers in the form of Chebyshev.

The concept of event frequency.

Statistical understanding of probability.

The law of large numbers in Bernoulli form.

The study of statistical regularities made it possible to establish that under certain conditions the total behavior of a large number of random variables almost loses its random character and becomes regular (in other words, random deviations from some average behavior cancel each other out). In particular, if the influence on the sum of individual terms is uniformly small, the law of distribution of the sum approaches normal. The mathematical formulation of this statement is given in a group of theorems called law of large numbers.

THE LAW OF LARGE NUMBERS is a general principle by virtue of which the combined action of random factors leads, under certain very general conditions, to a result almost independent of chance. The first example of this principle at work is the convergence of the frequency of occurrence of a random event to its probability as the number of trials increases (often used in practice, for example, when the frequency of occurrence of some quality of respondents in a sample is used as a sample estimate of the corresponding probability).

The essence of the law of large numbers is that with a large number of independent experiments the frequency of occurrence of an event is close to its probability.

Central Limit Theorem (CLT) (in A. M. Lyapunov's formulation for identically distributed RVs). If pairwise independent RVs X₁, X₂, …, Xₙ, … have the same distribution law with finite numerical characteristics M(Xᵢ) = m and D(Xᵢ) = σ², then as n → ∞ the distribution law of the sum X₁ + … + Xₙ approaches the normal law N(n·m, σ√n) without limit.

Corollary. If under the conditions of the CLT we consider the arithmetic mean Y = (X₁ + … + Xₙ)/n, then as n → ∞ the distribution law of the RV Y approaches the normal law N(m, σ/√n) without limit.

De Moivre-Laplace theorem. Let RV K be the number of "successes" in n trials in the Bernoulli scheme. Then, as n → ∞ with a fixed probability of "success" p in a single trial, the distribution law of RV K approaches the normal law N(n·p, √(n·p·q)) without limit, where q = 1 − p.

Corollary. If under the conditions of the theorem we consider, instead of RV K, the RV K/n - the frequency of "successes" in n trials in the Bernoulli scheme - then its distribution law for n → ∞ and fixed p approaches the normal law N(p, √(p·q/n)) without limit.

Comment. Let RV K be the number of "successes" in n trials in the Bernoulli scheme. The distribution law of such an RV is the binomial law. As n → ∞, the binomial law has two limit distributions:

- the Poisson distribution (for n → ∞ and λ = n·p = const);

- the Gaussian distribution N(n·p, √(n·p·q)) (for n → ∞ and p = const).

Example. The probability of "success" in a single trial is p = 0.8. How many trials must be made so that, with probability at least 0.9, one can expect the observed frequency of "success" in trials in the Bernoulli scheme to deviate from the probability p by no more than ε = 0.01?

Solution. For comparison, we solve the problem in two ways.
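One of the two ways - the de Moivre-Laplace (CLT) approximation - can be sketched as a direct search for the smallest suitable n:

```python
import math

p, eps, conf = 0.8, 0.01, 0.9

def phi(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# De Moivre-Laplace: K/n is approximately N(p, sqrt(p*q/n)), so
# P(|K/n - p| <= eps) ≈ 2*phi(eps * sqrt(n / (p*(1 - p)))) - 1.
# Find the smallest n for which this reaches the required confidence.
n = 1
while 2.0 * phi(eps * math.sqrt(n / (p * (1.0 - p)))) - 1.0 < conf:
    n += 1
print(n)  # → 4329
```

Equivalently, n ≥ z²·p·q/ε² with z ≈ 1.645 (the 0.9 two-sided quantile), which gives the same order of several thousand trials.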

In probability theory, a fairly large number of different distribution laws are considered. For solving problems related to the construction of control charts, only some of them are of interest. The most important of these is the normal distribution law, which is used to build control charts for quantitative control, i.e. when we deal with a continuous random variable. The normal distribution law occupies a special position among distribution laws. This is explained by the fact that, first, it is encountered in practice most often and, second, it is the limiting law which other distribution laws approach under very commonly met typical conditions. As to the second circumstance, it has been proved in probability theory that the sum of a sufficiently large number of independent (or weakly dependent) random variables subject to arbitrary distribution laws (under certain very mild restrictions) approximately obeys the normal law, and this holds the more accurately, the greater the number of random variables summed. Most random variables encountered in practice, such as measurement errors, can be represented as the sum of a very large number of comparatively small terms - elementary errors, each caused by the action of a separate cause independent of the others. The normal law arises when the random variable X is the result of a large number of different factors, each of which individually has only a slight influence on X, and none of which can be singled out as influencing it more than the others.

The normal distribution (Laplace-Gauss distribution) is the probability distribution of a continuous random variable X whose probability density for −∞ < x < +∞ is

f(x) = (1/(σ√(2π))) exp(−(x − m)²/(2σ²)).  (3)

That is, the normal distribution is characterized by two parameters m and σ, where m is the mathematical expectation and σ is the standard deviation of the normal distribution.

The value σ² is the variance of the normal distribution.

The mathematical expectation m characterizes the position of the distribution center, and the standard deviation σ (RMS) is a characteristic of dispersion (Fig. 3).



Figure 3 - Density functions of the normal distribution with: a) different mathematical expectations m; b) different RMS σ.

Thus, the value m determines the position of the distribution curve on the x-axis; the dimension of m is the same as that of the random variable X. As the mathematical expectation increases, both functions shift in parallel to the right. As the variance σ² decreases, the density concentrates more and more around m, while the distribution function becomes steeper and steeper.

The value of σ determines the shape of the distribution curve. Since the area under the distribution curve must always remain equal to one, the curve becomes flatter as σ increases. Fig. 3.1 shows three curves for different σ: σ₁ = 0.5; σ₂ = 1.0; σ₃ = 2.0.

Figure 3.1 - Density functions of the normal distribution with different RMS σ.

The distribution function (integral function) has the form (Fig. 4):

F(x) = (1/(σ√(2π))) ∫_{−∞}^{x} exp(−(t − m)²/(2σ²)) dt.  (4)

Figure 4 - Integral (a) and differential (b) normal distribution functions

Of particular importance is the linear transformation of a normally distributed random variable X after which one obtains a random variable Z with mathematical expectation 0 and variance 1. Such a transformation is called normalization:

Z = (X − m)/σ.

It can be performed for any random variable. Normalization allows all possible variants of the normal distribution to be reduced to a single case: m = 0, σ = 1.

The normal distribution with m = 0 and σ = 1 is called the normalized (standardized) normal distribution.

The standard normal distribution (standard Laplace-Gauss distribution, or normalized normal distribution) is the probability distribution of the standardized normal random variable Z, whose distribution density is equal to:

f(z) = (1/√(2π)) e^(−z²/2)  for −∞ < z < +∞.

The values of the function Φ(z) are determined by the formula:

Φ(z) = (1/√(2π)) ∫_{−∞}^{z} e^(−t²/2) dt.  (7)

The values of the function Φ(z) and of the density f(z) of the normalized normal distribution have been calculated and summarized in tables (tabulated). The table is compiled only for positive values of z; that is why

Φ(−z) = 1 − Φ(z).  (8)

Using these tables, one can determine not only the values of the function and density of the normalized normal distribution for a given z, but also the values of the general normal distribution function, since

F(x) = Φ((x − m)/σ);  (9)

f(x) = (1/σ) φ((x − m)/σ),  (10)

where φ(z) denotes the normalized density.
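Relations (9) and (10) can be sketched directly (Φ is computed through the error function; the sample values below are illustrative):

```python
import math

def phi_density(z):
    """Density of the normalized normal distribution."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def phi_cdf(z):
    """Distribution function of the normalized normal distribution, via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, m, s):
    """Relation (9): F(x) = Phi((x - m)/s)."""
    return phi_cdf((x - m) / s)

def normal_density(x, m, s):
    """Relation (10): f(x) = phi((x - m)/s) / s."""
    return phi_density((x - m) / s) / s

# One standard deviation above the mean of N(2, 4^2):
print(round(normal_cdf(6.0, 2.0, 4.0), 4))  # → 0.8413
```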

In many problems involving normally distributed random variables, one has to determine the probability that a random variable X, subject to the normal law with parameters m and σ, falls within a certain region. Such a region can be, for example, the tolerance field of a parameter, from the upper value U to the lower value L.

The probability of falling into the interval from X₁ to X₂ can be determined by the formula:

P(X₁ < X < X₂) = Φ((X₂ − m)/σ) − Φ((X₁ − m)/σ).

Thus, the probability that the random variable (parameter value) X falls within the tolerance field is determined by the formula

P(L < X < U) = Φ((U − m)/σ) − Φ((L − m)/σ).

One can also find the probability that the random variable X lies within m ± kσ. The values obtained for k = 1, 2 and 3 are as follows (see also Fig. 5): for k = 1, 0.6827; for k = 2, 0.9545; for k = 3, 0.9973.
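These fractions can be checked empirically (a sketch; the seed and sample size are arbitrary choices):

```python
import random

# Fraction of normally distributed values falling within m ± k*sigma.
random.seed(42)
m, sigma, n = 0.0, 1.0, 1_000_000
sample = [random.gauss(m, sigma) for _ in range(n)]

for k in (1, 2, 3):
    inside = sum(1 for x in sample if abs(x - m) < k * sigma)
    print(k, inside / n)
```

The printed fractions approach 0.6827, 0.9545 and 0.9973 respectively.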

Thus, if some value appears outside the three-sigma region, which contains 99.73% of all possible values - and the probability of such an event is very small (about 1:370) - it should be considered that the value in question turned out to be too small or too large not because of random variation, but because of a significant disturbance in the process itself, capable of changing the nature of the distribution.

The region lying inside the three-sigma boundaries is also called the statistical tolerance region of the corresponding machine or process.