The distribution function contains complete information about a random variable. In practice, the distribution function cannot always be determined, and sometimes such exhaustive knowledge is not required. Partial information about a random variable is given by its numerical characteristics, which, depending on the type of information they carry, are divided into the following groups.
1. Characteristics of the position of a random variable on the number axis (mode Mo, median Me, expected value M(X)).
2. Characteristics of the spread of a random variable around its mean value (variance D(X), standard deviation σ(X)).
3. Characteristics of the shape of the curve y = φ(x) (asymmetry As, kurtosis Ek).
Let's take a closer look at each of these characteristics.
The expected value (mathematical expectation) of a random variable X is the average value around which all possible values of X are grouped. For a discrete random variable that takes only a finite number of possible values x_1, ..., x_n with probabilities p_1, ..., p_n, the mathematical expectation is the sum of the products of all possible values of the random variable and their probabilities:
$M(X) = \sum_{i=1}^{n} x_i p_i$. (2.4)
For a continuous random variable X with distribution density φ(x), the mathematical expectation is the integral
$M(X) = \int_{-\infty}^{+\infty} x \varphi(x)\,dx$. (2.5)
Here it is assumed that the improper integral converges absolutely, i.e. that $\int_{-\infty}^{+\infty} |x| \varphi(x)\,dx$ exists.
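As a minimal sketch of formulas (2.4) and (2.5), the Python below computes the expectation of a discrete variable as a weighted sum and approximates the integral for a continuous one with a midpoint rule. The function names, the fair-die distribution and the uniform density on [0, 2] are illustrative assumptions, not part of the text above.

```python
from fractions import Fraction


def expectation_discrete(values, probs):
    """M(X) = sum of x_i * p_i over all possible values (formula 2.4)."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))


def expectation_continuous(density, a, b, n=10_000):
    """Approximate the integral of x * density(x) over [a, b] (formula 2.5)
    with the midpoint rule; the density is assumed to vanish outside [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += x * density(x)
    return total * h


# Hypothetical example 1: a fair six-sided die.
die_mean = expectation_discrete([1, 2, 3, 4, 5, 6], [Fraction(1, 6)] * 6)

# Hypothetical example 2: uniform density on [0, 2], i.e. density 1/2 there.
uniform_mean = expectation_continuous(lambda x: 0.5, 0.0, 2.0)

print(die_mean)                # 7/2
print(round(uniform_mean, 6))  # 1.0
```

Exact fractions are used in the discrete case so that the check that the probabilities sum to 1 is not disturbed by rounding.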
Properties of mathematical expectation:
1. M(C) = C, where C = const;
2. M(CX) = CM(X);
3. M(X ± Y) = M(X) ± M(Y), where X and Y are any random variables;
4. M(XY) = M(X)∙M(Y), where X and Y are independent random variables.
Two random variables are called independent if the distribution law of one of them does not depend on which of its possible values the other has taken.
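Properties 3 and 4 can be checked on a small example. The sketch below assumes two hypothetical independent discrete variables and builds their joint distribution as the product of the marginal probabilities; the names `X`, `Y`, `joint` and `mean` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distributions given as value -> probability.
X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
Y = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}


def mean(dist):
    """Mathematical expectation of a discrete distribution."""
    return sum(x * p for x, p in dist.items())


# Independence means the joint probability factorises:
# P(X = x, Y = y) = P(X = x) * P(Y = y).
joint = {(x, y): px * py for (x, px), (y, py) in product(X.items(), Y.items())}

mean_sum = sum((x + y) * p for (x, y), p in joint.items())
mean_product = sum(x * y * p for (x, y), p in joint.items())

print(mean_sum == mean(X) + mean(Y))      # True (property 3)
print(mean_product == mean(X) * mean(Y))  # True (property 4)
```

Property 3 would hold for any joint distribution; only property 4 relies on the factorised joint probabilities.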
The mode of a discrete random variable, denoted Mo, is its most probable value (Fig. 2.3); the mode of a continuous random variable is the value at which the probability density is maximal (Fig. 2.4).

Fig. 2.3 Fig. 2.4
The median of a continuous random variable X is the value Me for which the random variable is equally likely to turn out smaller or larger than Me, i.e.
P(X < Me) = P(X > Me)
From the definition of the median it follows that P(X < Me) = 0.5, i.e. F(Me) = 0.5. Geometrically, the median is the abscissa of the point at which the ordinate of φ(x) bisects the area bounded by the distribution curve (Fig. 2.5). For a symmetric unimodal distribution, the median coincides with the mode and the mathematical expectation (Fig. 2.6).

Fig. 2.5 Fig. 2.6
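The two definitions can be illustrated numerically. The sketch below picks the mode of a discrete distribution as the value with the largest probability and finds the median of a continuous one by solving F(Me) = 0.5 with bisection. The helper names, the sample discrete distribution and the exponential law F(x) = 1 - exp(-x) are assumptions made for this example.

```python
import math


def mode_discrete(dist):
    """Most probable value of a distribution given as value -> probability."""
    return max(dist, key=dist.get)


def median_by_bisection(cdf, lo, hi, tol=1e-12):
    """Solve F(Me) = 0.5 for a continuous, increasing distribution function F
    on a bracket [lo, hi] with F(lo) < 0.5 < F(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical discrete distribution: the value 2 has the largest probability.
mo = mode_discrete({1: 0.2, 2: 0.5, 3: 0.3})

# Hypothetical continuous example: exponential law with F(x) = 1 - exp(-x),
# whose median is known to be ln 2.
me = median_by_bisection(lambda x: 1.0 - math.exp(-x), 0.0, 50.0)

print(mo)                            # 2
print(abs(me - math.log(2)) < 1e-9)  # True
```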


The variance of a random variable is a measure of the spread of that random variable, that is, of its deviation from the mathematical expectation. It is denoted D[X] in Russian-language literature and Var(X) (from the English "variance") elsewhere. In statistics, the notation $\sigma_X^2$ or $\sigma^2$ is often used. The square root of the variance, equal to $\sigma$, is called the standard deviation (also the root-mean-square deviation or standard spread). The standard deviation is measured in the same units as the random variable itself, and the variance is measured in the squares of those units.

It follows from Chebyshev's inequality that a random variable deviates from its mathematical expectation by more than k standard deviations with probability less than 1/k². So, for example, in at least 75% of cases a random variable lies within two standard deviations of its mean, and in at least about 89% of cases within three.
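The bound can be compared with an exact tail probability. The sketch below uses a hypothetical two-point distribution with M(X) = 1 and σ = 3 and the function name `tail_probability`, both chosen here for illustration. It computes P(|X - M(X)| > kσ) and prints it next to 1/k².

```python
from fractions import Fraction


def tail_probability(dist, k):
    """P(|X - M(X)| > k * sigma) for a discrete distribution value -> probability.
    Squared deviations are compared with k^2 * D(X) to avoid square roots."""
    m = sum(x * p for x, p in dist.items())
    var = sum((x - m) ** 2 * p for x, p in dist.items())
    return sum(p for x, p in dist.items() if (x - m) ** 2 > k * k * var)


# Hypothetical skewed distribution: M(X) = 1, D(X) = 9, sigma = 3.
dist = {0: Fraction(9, 10), 10: Fraction(1, 10)}

for k in (2, 3):
    bound = Fraction(1, k * k)
    print(k, tail_probability(dist, k), "<", bound)
# 2 1/10 < 1/4
# 3 0 < 1/9
```

The bound is usually far from tight. It holds for every distribution with finite variance, which is why the guaranteed percentages are so conservative.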

The variance of a random variable is the mathematical expectation of the square of its deviation from its mathematical expectation:
$D(X) = M\left[(X - M(X))^2\right]$.
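The variance of a random variable X is conveniently calculated by the formula:
a) for a discrete random variable
$D(X) = \sum_{i=1}^{n} x_i^2 p_i - [M(X)]^2$; (2.6)
b) for a continuous random variable
$D(X) = \int_{-\infty}^{+\infty} x^2 \varphi(x)\,dx - [M(X)]^2$. (2.7)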
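A short check that the computational formula (2.6) agrees with the definition of the variance, sketched for a hypothetical fair six-sided die (the variable names are illustrative):

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = sum(x * p for x, p in zip(values, probs))

# Definition: expectation of the squared deviation from the mean.
var_by_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Formula (2.6): expectation of X^2 minus the square of the expectation.
var_by_shortcut = sum(x * x * p for x, p in zip(values, probs)) - mean ** 2

print(var_by_definition, var_by_shortcut)  # 35/12 35/12
```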
The variance has the following properties:
1. D(C) = 0, where C = const;
2. D(C∙X) = C²∙D(X);
3. D(X ± Y) = D(X) + D(Y), if X and Y are independent random variables.
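Property 3 is easy to misapply: the variances add even for a difference. The sketch below checks properties 2 and 3 exactly on two hypothetical independent discrete variables; the helper names `variance` and `distribution_of` are assumptions introduced for this example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical independent distributions given as value -> probability.
X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
Y = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}


def variance(dist):
    """D(X) as the expectation of the squared deviation from the mean."""
    m = sum(x * p for x, p in dist.items())
    return sum((x - m) ** 2 * p for x, p in dist.items())


def distribution_of(func, dist_x, dist_y):
    """Distribution of func(X, Y) for independent X and Y."""
    result = defaultdict(Fraction)
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        result[func(x, y)] += px * py
    return dict(result)


difference = distribution_of(lambda x, y: x - y, X, Y)
scaled = {3 * x: p for x, p in X.items()}

print(variance(difference) == variance(X) + variance(Y))  # True (property 3)
print(variance(scaled) == 3 ** 2 * variance(X))           # True (property 2)
```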
The standard deviation of a random variable X is the arithmetic (non-negative) square root of its variance, i.e.
$\sigma(X) = \sqrt{D(X)}$.
Note that the dimension of σ(X) coincides with the dimension of the random variable X itself, so the standard deviation is more convenient for characterizing spread.
A generalization of the main numerical characteristics of random variables is the concept of moments of a random variable.
The initial moment of order k, $\alpha_k$, of a random variable X is the mathematical expectation of $X^k$, i.e. $\alpha_k = M(X^k)$.
The initial moment of the first order is the mathematical expectation of the random variable.
The central moment of order k, $\mu_k$, of a random variable X is the mathematical expectation of $(X - M(X))^k$, i.e. $\mu_k = M\left[(X - M(X))^k\right]$.
The central moment of the second order is the variance of the random variable.
For a discrete random variable, the initial moment is given by the sum $\alpha_k = \sum_{i} x_i^k p_i$ and the central moment by the sum $\mu_k = \sum_{i} (x_i - M(X))^k p_i$, where $p_i = P(X = x_i)$. For the initial and central moments of a continuous random variable, the following equalities hold:
$\alpha_k = \int_{-\infty}^{+\infty} x^k \varphi(x)\,dx$, $\mu_k = \int_{-\infty}^{+\infty} (x - M(X))^k \varphi(x)\,dx$,
where φ(x) is the distribution density of the random variable X.
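For the discrete case the two sums translate directly into code. The sketch below, with hypothetical function names and a fair die as the example distribution, confirms that the first initial moment is the expectation and the second central moment is the variance.

```python
from fractions import Fraction


def initial_moment(dist, k):
    """alpha_k = sum of x_i^k * p_i for a distribution value -> probability."""
    return sum(x ** k * p for x, p in dist.items())


def central_moment(dist, k):
    """mu_k = sum of (x_i - M(X))^k * p_i."""
    m = initial_moment(dist, 1)
    return sum((x - m) ** k * p for x, p in dist.items())


# Hypothetical example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(initial_moment(die, 1))  # 7/2, the expectation
print(central_moment(die, 1))  # 0, always zero for the first central moment
print(central_moment(die, 2))  # 35/12, the variance
```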
The quantity $As = \mu_3 / \sigma^3$ is called the asymmetry coefficient (skewness).
If the asymmetry coefficient is negative, negative deviations have the larger influence on the value of $\mu_3$. In this case the distribution curve (Fig. 2.7) is flatter to the left of M(X). If As is positive, the influence of positive deviations prevails and the distribution curve (Fig. 2.7) is flatter on the right. In practice, the sign of the asymmetry is determined from the position of the distribution curve relative to the mode (the maximum point of the density function).

Fig. 2.7
The kurtosis Ek is the quantity
$Ek = \mu_4 / \sigma^4 - 3$.
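Both shape characteristics follow from the central moments. The sketch below computes As = μ3/σ³ and Ek = μ4/σ⁴ - 3 for two hypothetical discrete distributions, a symmetric one (a fair die) and one with a long right tail. The function names and the example distributions are assumptions made for illustration.

```python
from fractions import Fraction


def central_moment(dist, k):
    """mu_k for a discrete distribution given as value -> probability."""
    m = sum(x * p for x, p in dist.items())
    return sum((x - m) ** k * p for x, p in dist.items())


def asymmetry(dist):
    """As = mu_3 / sigma^3, with sigma^3 = D(X)^(3/2)."""
    return float(central_moment(dist, 3)) / float(central_moment(dist, 2)) ** 1.5


def kurtosis(dist):
    """Ek = mu_4 / sigma^4 - 3, with sigma^4 = D(X)^2 (exact for Fractions)."""
    return central_moment(dist, 4) / central_moment(dist, 2) ** 2 - 3


die = {x: Fraction(1, 6) for x in range(1, 7)}      # symmetric
skewed = {0: Fraction(9, 10), 10: Fraction(1, 10)}  # long right tail

print(asymmetry(die))         # 0.0
print(kurtosis(die))          # -222/175
print(asymmetry(skewed) > 0)  # True
```

The negative kurtosis of the die reflects a distribution that is flatter than the normal law, for which Ek = 0.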

Question 24: Correlation

Correlation (correlation dependence) is a statistical relationship between two or more random variables (or variables that can be treated as such to an acceptable degree of accuracy), in which changes in the values of one or more of these variables are accompanied by a systematic change in the values of the other variable or variables. The mathematical measure of the correlation of two random variables is the correlation ratio or the correlation coefficient (commonly denoted R or r). If a change in one random variable does not lead to a regular change in the other random variable but instead leads to a change in some other statistical characteristic of that variable, the relationship is not considered a correlation, although it is statistical.

The term “correlation” was first introduced into scientific use by the French paleontologist Georges Cuvier in the 18th century. He developed the "law of correlation" of the parts and organs of living beings, with the help of which the appearance of a fossil animal can be reconstructed from only a part of its remains. In statistics, the word "correlation" was first used by the English biologist and statistician Francis Galton at the end of the 19th century.

Some types of correlation coefficients can be positive or negative (it is also possible that there is no statistical relationship, as, for example, with independent random variables). If a strict order relation is defined on the values of the variables, then a negative correlation is one in which an increase in one variable is associated with a decrease in the other, and the correlation coefficient may be negative; a positive correlation under the same conditions is one in which an increase in one variable is associated with an increase in the other, and the correlation coefficient may be positive.
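The text does not say which correlation coefficient is meant. As one common choice, the sketch below assumes Pearson's sample coefficient (the covariance divided by the product of the standard deviations) and applies it to two made-up data sets, one increasing and one decreasing. The function name `pearson_r` and the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)


xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2 * x + 1 for x in xs])  # y grows with x
r_negative = pearson_r(xs, [10 - x for x in xs])     # y falls as x grows

print(round(r_positive, 6))  # 1.0
print(round(r_negative, 6))  # -1.0
```

Values of exactly ±1 arise here only because the made-up data are perfectly linear; real data typically give values strictly between -1 and 1.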

Example. The mathematical expectations and variances of two independent random variables X and Y are known: M(X)=8, M(Y)=7, D(X)=9, D(Y)=6. Find the mathematical expectation and variance of the random variable Z=9X-8Y+7.
Solution. Based on the properties of mathematical expectation: M(Z) = M(9X-8Y+7) = 9*M(X) - 8*M(Y) + M(7) = 9*8 - 8*7 + 7 = 23.
Based on the properties of variance: D(Z) = D(9X-8Y+7) = D(9X) + D(8Y) + D(7) = 9^2*D(X) + 8^2*D(Y) + 0 = 81*9 + 64*6 = 1113. Note that the variances of independent terms add even when the terms themselves are subtracted.
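A calculation of this kind can be checked by encoding the two rules that follow from the properties listed earlier: M(aX + bY + c) = aM(X) + bM(Y) + c and, for independent X and Y, D(aX + bY + c) = a²D(X) + b²D(Y). The helper `linear_combination` below is a hypothetical name introduced for this sketch.

```python
def linear_combination(a, mean_x, var_x, b, mean_y, var_y, c):
    """Mean and variance of Z = a*X + b*Y + c for independent X and Y."""
    mean_z = a * mean_x + b * mean_y + c
    var_z = a ** 2 * var_x + b ** 2 * var_y  # the constant c contributes 0
    return mean_z, var_z


# Values from the example: M(X)=8, D(X)=9, M(Y)=7, D(Y)=6, Z = 9X - 8Y + 7.
mean_z, var_z = linear_combination(9, 8, 9, -8, 7, 6, 7)
print(mean_z, var_z)  # 23 1113
```

Because the coefficient b enters the variance squared, its sign does not matter: b = -8 contributes 64·D(Y), the same as b = 8.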

