Uniform distribution of a two-dimensional random variable. Systems of random variables

Definition 2.7. A two-dimensional random variable is a pair of random numbers (X, Y), or a random point on the coordinate plane (Fig. 2.11).

Fig. 2.11.

A two-dimensional random variable is a special case of a multidimensional random variable, or random vector.

Definition 2.8. A random vector is a random function X(t) with a finite set of possible values of the argument t, whose value for any value of t is a random variable.

A two-dimensional random variable is called continuous if its coordinates are continuous, and discrete if its coordinates are discrete.

To specify the distribution law of a two-dimensional random variable means to establish a correspondence between its possible values and the probabilities of these values. By the method of specification, random variables are divided into continuous and discrete, although there are general ways to specify the distribution law of any random variable (RV).

Discrete two-dimensional random variable

A discrete two-dimensional random variable is specified using a distribution table (Table 2.1).

Table 2.1

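Distribution table (joint distribution) of the RV (X, Y)

Table elements are defined by the formula

$$p_{ij} = P(X = x_i,\ Y = y_j),\quad i = 1, \dots, n,\ j = 1, \dots, m.$$

Properties of the distribution table elements:

$$p_{ij} \ge 0,\qquad \sum_{i=1}^{n}\sum_{j=1}^{m} p_{ij} = 1.$$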

The distribution over each coordinate is called one-dimensional or marginal:

$p_i^{(1)} = P(X = x_i)$ - marginal distribution of the RV X;

$p_j^{(2)} = P(Y = y_j)$ - marginal distribution of the RV Y.

Relationship between the joint distribution of the RVs X and Y, given by the set of probabilities $\{p_{ij}\}$, $i = 1, \dots, n$, $j = 1, \dots, m$ (the distribution table), and the marginal distributions:

$$p_i^{(1)} = \sum_{j=1}^{m} p_{ij}.$$

Similarly, for the RV Y: $p_j^{(2)} = \sum_{i=1}^{n} p_{ij}$.

Problem 2.14. Given:

Continuous 2D random variable

$f(x, y)\,dx\,dy$ - the probability element of a two-dimensional random variable (X, Y): the probability that the random variable (X, Y) falls into a rectangle with sides dx, dy as dx, dy → 0:

$$P(x \le X < x + dx,\ y \le Y < y + dy) \approx f(x, y)\,dx\,dy.$$

$f(x, y)$ - the distribution density of the two-dimensional random variable (X, Y). Specifying $f(x, y)$ gives complete information about the distribution of the two-dimensional random variable.

Marginal distributions are specified as follows: for X, by the distribution density $f_1(x)$ of the RV X; for Y, by the distribution density $f_2(y)$ of the RV Y.

Setting the distribution law of a two-dimensional random variable by the distribution function

A universal way to specify the distribution law for a discrete or continuous two-dimensional random variable is the distribution function F(x, y).

Definition 2.9. The distribution function $F(x, y)$ is the probability of the joint occurrence of the events $\{X < x\}$ and $\{Y < y\}$, i.e. $F(x_0, y_0) = P(X < x_0,\ Y < y_0)$ is the probability that a random point (X, Y) thrown onto the coordinate plane falls into the infinite quadrant with vertex at the point $M(x_0, y_0)$ (the shaded area in Fig. 2.12).

Fig. 2.12. Illustration of the distribution function F(x, y)

Properties of the function F(x, y)

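  • 1) $0 \le F(x, y) \le 1$;
  • 2) $F(-\infty, -\infty) = F(x, -\infty) = F(-\infty, y) = 0$; $F(\infty, \infty) = 1$;
  • 3) $F(x, y)$ is non-decreasing in each argument;
  • 4) $F(x, y)$ is continuous from the left in x and from below in y, i.e. left-continuous in each argument;
  • 5) consistency of distributions: $F(x, \infty) = F_1(x)$ is the marginal distribution of X; $F(\infty, y) = F_2(y)$ is the marginal distribution of Y.

Connection of $f(x, y)$ with $F(x, y)$:

$$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\,\partial y},\qquad F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du.$$

Relationship between the joint density and the marginal densities. Given $f(x, y)$, we obtain the marginal distribution densities $f_1(x)$, $f_2(y)$:

$$f_1(x) = \int_{-\infty}^{\infty} f(x, y)\,dy,\qquad f_2(y) = \int_{-\infty}^{\infty} f(x, y)\,dx.$$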


The case of independent coordinates of a two-dimensional random variable

Definition 2.10. The RVs X and Y are independent if any events associated with each of these RVs are independent. From the definition of independent RVs it follows that:

  • 1) $p_{ij} = p_i^{(1)} p_j^{(2)}$;
  • 2) $F(x, y) = F_1(x) F_2(y)$.

It turns out that for independent RVs X and Y the following also holds:

3) $f(x, y) = f_1(x) f_2(y)$.

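Let us prove that for the RVs X and Y, 2) ⇔ 3). Proof. a) Let 2) hold, i.e.

$$F(x, y) = F_1(x) F_2(y) = \int_{-\infty}^{x} f_1(u)\,du \int_{-\infty}^{y} f_2(v)\,dv = \int_{-\infty}^{x}\int_{-\infty}^{y} f_1(u) f_2(v)\,dv\,du;$$

at the same time $F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du$, whence 3) follows;

b) now let 3) hold; then

$$F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du = \int_{-\infty}^{x} f_1(u)\,du \int_{-\infty}^{y} f_2(v)\,dv = F_1(x) F_2(y),$$

i.e. 2) is true.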

Let us consider some problems.

Problem 2.15. The distribution is given by the following table:

We build marginal distributions:

We get $P(X = 3,\ Y = 4) = 0.17 \ne P(X = 3) P(Y = 4) = 0.1485$, so the RVs X and Y are dependent.

Distribution function:


Problem 2.16. The distribution is given by the following table:

We get $p_{11} = 0.2 \cdot 0.3 = 0.06$; $p_{12} = 0.2 \cdot 0.7 = 0.14$; $p_{21} = 0.8 \cdot 0.3 = 0.24$; $p_{22} = 0.8 \cdot 0.7 = 0.56$, so the RVs X and Y are independent.
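The independence criterion $p_{ij} = p_i^{(1)} p_j^{(2)}$ used in Problems 2.15 and 2.16 can be checked mechanically. The following is a minimal sketch, not part of the original text: the function names `marginals` and `is_independent` are illustrative, the first table is the product table from Problem 2.16, and the second table (used only as a dependent contrast) is the table of Example 1 that appears later in this text.

```python
def marginals(table):
    """Return (row sums, column sums) of a joint probability table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    return row_sums, col_sums


def is_independent(table, tol=1e-9):
    """True if every cell equals the product of its marginals (within tol)."""
    px, py = marginals(table)
    return all(
        abs(table[i][j] - px[i] * py[j]) <= tol
        for i in range(len(px))
        for j in range(len(py))
    )


# Joint table of Problem 2.16: marginals (0.2, 0.8) and (0.3, 0.7).
joint_216 = [[0.06, 0.14],
             [0.24, 0.56]]

# Table of Example 1 (rows X = -1, 0, 1; columns Y = 2, 3).
joint_example1 = [[0.15, 0.25],
                  [0.28, 0.13],
                  [0.09, 0.10]]

print(is_independent(joint_216))
print(is_independent(joint_example1))
```

A tolerance is used because the table entries are floating-point numbers; with exact fractions the comparison could be made exact.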

Problem 2.17. Given $f(x, y) = \frac{1}{\pi} \exp\left[-0.5\left(x^2 + 2xy + 5y^2\right)\right]$. Find $f_1(x)$ and $f_2(y)$.

Solution

(calculate yourself).
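For readers who want to check their own answer to Problem 2.17, here is a hedged numerical sketch. It assumes the density reads $f(x, y) = \frac{1}{\pi}\exp[-0.5(x^2 + 2xy + 5y^2)]$ and integrates it numerically with a simple trapezoidal rule. The closed forms in `f1_closed` and `f2_closed` are what completing the square suggests; they are included only for comparison and are not taken from the original text.

```python
import math


def f(x, y):
    """Joint density assumed for Problem 2.17."""
    return math.exp(-0.5 * (x * x + 2 * x * y + 5 * y * y)) / math.pi


def integrate(g, a=-12.0, b=12.0, n=4800):
    """Composite trapezoidal rule on [a, b]; the tails outside are negligible here."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for k in range(1, n):
        total += g(a + k * h)
    return total * h


def f1_numeric(x):
    """Marginal density of X: integrate the joint density over y."""
    return integrate(lambda y: f(x, y))


def f2_numeric(y):
    """Marginal density of Y: integrate the joint density over x."""
    return integrate(lambda x: f(x, y))


def f1_closed(x):
    """Candidate closed form: normal density with variance 1.25."""
    return math.sqrt(2 / (5 * math.pi)) * math.exp(-0.4 * x * x)


def f2_closed(y):
    """Candidate closed form: normal density with variance 0.25."""
    return math.sqrt(2 / math.pi) * math.exp(-2 * y * y)


for t in (0.0, 0.7, -1.3):
    print(t, f1_numeric(t), f1_closed(t), f2_numeric(t), f2_closed(t))
```

If the two columns for each marginal agree to several decimal places, the hand calculation is most likely right.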

Quite often, when studying random variables, one has to deal with two, three, or even more random variables at once. For example, the two-dimensional random variable $\left(X,\ Y\right)$ describes the point of impact of a projectile, where the random variables $X,\ Y$ are the abscissa and the ordinate, respectively. The performance of a randomly chosen student during an exam session is characterized by an $n$-dimensional random variable $\left(X_1,\ X_2,\ \dots ,\ X_n\right)$, where $X_1,\ X_2,\ \dots ,\ X_n$ are the grades recorded in the grade book for the various disciplines.

The set of $n$ random variables $\left(X_1,\ X_2,\ \dots ,\ X_n\right)$ is called a random vector. We restrict ourselves to the case $\left(X,\ Y\right)$.

Let $X$ be a discrete random variable with possible values $x_1,x_2,\ \dots ,\ x_n$, and $Y$ be a discrete random variable with possible values $y_1,y_2,\ \dots ,\ y_m$.

Then the discrete two-dimensional random variable $\left(X,\ Y\right)$ can take the values $\left(x_i,\ y_j\right)$ with probabilities $p_{ij}=P\left(\left(X=x_i\right)\left(Y=y_j\right)\right)=P\left(X=x_i\right)P\left(Y=y_j|X=x_i\right)$. Here $P\left(Y=y_j|X=x_i\right)$ is the conditional probability that the random variable $Y$ takes the value $y_j$ given that the random variable $X$ takes the value $x_i$.

The probability that the random variable $X$ takes the value $x_i$ is equal to $p_i=\sum_j p_{ij}$. The probability that the random variable $Y$ takes the value $y_j$ is equal to $q_j=\sum_i p_{ij}$.

$$P\left(X=x_i|Y=y_j\right)=\frac{P\left(\left(X=x_i\right)\left(Y=y_j\right)\right)}{P\left(Y=y_j\right)}=\frac{p_{ij}}{q_j}.$$

$$P\left(Y=y_j|X=x_i\right)=\frac{P\left(\left(X=x_i\right)\left(Y=y_j\right)\right)}{P\left(X=x_i\right)}=\frac{p_{ij}}{p_i}.$$

Example 1 . The distribution of a two-dimensional random variable is given:

$\begin{array}{|c|c|c|}
\hline
X\backslash Y & 2 & 3 \\
\hline
-1 & 0.15 & 0.25 \\
\hline
0 & 0.28 & 0.13 \\
\hline
1 & 0.09 & 0.1 \\
\hline
\end{array}$

Let us define the distribution laws for the random variables $X$ and $Y$. Let us find the conditional distributions of the random variable $X$ under the condition $Y=2$ and the random variable $Y$ under the condition $X=0$.

Let's fill in the following table:

$\begin{array}{|c|c|c|c|c|}
\hline
X\backslash Y & 2 & 3 & p_i & p_{i1}/q_1 \\
\hline
-1 & 0.15 & 0.25 & 0.4 & 0.29 \\
\hline
0 & 0.28 & 0.13 & 0.41 & 0.54 \\
\hline
1 & 0.09 & 0.1 & 0.19 & 0.17 \\
\hline
q_j & 0.52 & 0.48 & 1 & \\
\hline
p_{2j}/p_2 & 0.68 & 0.32 & & \\
\hline
\end{array}$

Let's explain how the table is filled. The values in the first three columns of the first four rows are taken from the problem statement. The sum of the numbers in the 2nd and 3rd columns of the 2nd (3rd) row is written in the 4th column of the 2nd (3rd) row. The sum of the numbers in the 2nd and 3rd columns of the 4th row is written in the 4th column of the 4th row.

The sum of the numbers in the 2nd, 3rd and 4th rows of the 2nd (3rd) column is written in the 5th row of the 2nd (3rd) column. Each number in the 2nd column is divided by $q_1=0.52$; the result is rounded to two decimal places and written in the 5th column. The numbers in the 2nd and 3rd columns of the 3rd row are divided by $p_2=0.41$; the result is rounded to two decimal places and written in the last row.

Then the law of distribution of the random variable $X$ has the following form.

$\begin{array}{|c|c|c|c|}
\hline
X & -1 & 0 & 1 \\
\hline
p_i & 0.4 & 0.41 & 0.19 \\
\hline
\end{array}$

The law of distribution of the random variable $Y$.

$\begin{array}{|c|c|c|}
\hline
Y & 2 & 3 \\
\hline
q_j & 0.52 & 0.48 \\
\hline
\end{array}$

The conditional distribution of the random variable $X$ under the condition $Y=2$ has the following form.

$\begin{array}{|c|c|c|c|}
\hline
X & -1 & 0 & 1 \\
\hline
p_{i1}/q_1 & 0.29 & 0.54 & 0.17 \\
\hline
\end{array}$

The conditional distribution of the random variable $Y$ under the condition $X=0$ has the following form.

$\begin{array}{|c|c|c|}
\hline
Y & 2 & 3 \\
\hline
p_{2j}/p_2 & 0.68 & 0.32 \\
\hline
\end{array}$
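The marginal and conditional distributions of Example 1 can be reproduced with a few lines of code. This is an illustrative sketch only; the variable names (`joint`, `p`, `q`, `x_given_y2`, `y_given_x0`) are not from the original text.

```python
xs = [-1, 0, 1]   # values of X (rows of the table)
ys = [2, 3]       # values of Y (columns of the table)

# Joint probabilities p_ij from Example 1.
joint = [[0.15, 0.25],
         [0.28, 0.13],
         [0.09, 0.10]]

# Marginal laws: p_i sums a row, q_j sums a column.
p = [sum(row) for row in joint]
q = [sum(col) for col in zip(*joint)]

# Conditional law of X given Y = 2 (first column divided by q_1).
x_given_y2 = [joint[i][0] / q[0] for i in range(len(xs))]

# Conditional law of Y given X = 0 (second row divided by p_2).
y_given_x0 = [joint[1][j] / p[1] for j in range(len(ys))]

print([round(v, 2) for v in p])
print([round(v, 2) for v in q])
print([round(v, 2) for v in x_given_y2])
print([round(v, 2) for v in y_given_x0])
```

Each conditional law is just a row or column of the joint table rescaled so that it sums to 1.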

Example 2 . We have six pencils, two of which are red. We put pencils into two boxes: $2$ into the first one and two into the second one. $X$ is the number of red pencils in the first box, and $Y$ is the number of red pencils in the second. Write the distribution law for the system of random variables $(X,\ Y)$.

Let the discrete random variable $X$ be the number of red pencils in the first box, and the discrete random variable $Y$ be the number of red pencils in the second box. The possible values of the random variables $X,\ Y$ are $X:0,\ 1,\ 2$ and $Y:0,\ 1,\ 2$, respectively. Then the discrete two-dimensional random variable $\left(X,\ Y\right)$ can take the values $\left(x,\ y\right)$ with probabilities $P\left(\left(X=x\right)\times \left(Y=y\right)\right)=P\left(X=x\right)\times P\left(Y=y|X=x\right)$, where $P\left(Y=y|X=x\right)$ is the conditional probability that the random variable $Y$ takes the value $y$, provided that the random variable $X$ takes the value $x$. Let us represent the correspondence between the values $\left(x,\ y\right)$ and the probabilities $P\left(\left(X=x\right)\times \left(Y=y\right)\right)$ in the following table.

$\begin{array}{|c|c|c|c|}
\hline
X\backslash Y & 0 & 1 & 2 \\
\hline
0 & \frac{1}{15} & \frac{4}{15} & \frac{1}{15} \\
\hline
1 & \frac{4}{15} & \frac{4}{15} & 0 \\
\hline
2 & \frac{1}{15} & 0 & 0 \\
\hline
\end{array}$

The rows of this table correspond to the values of $X$ and the columns to the values of $Y$; the probabilities $P\left(\left(X=x\right)\times \left(Y=y\right)\right)$ are given at the intersection of the corresponding row and column. Let's calculate the probabilities using the classical definition of probability and the multiplication theorem for the probabilities of dependent events.

$$P\left(\left(X=0\right)\left(Y=0\right)\right)=\frac{C^2_4}{C^2_6}\cdot \frac{C^2_2}{C^2_4}=\frac{6}{15}\cdot \frac{1}{6}=\frac{1}{15};$$

$$P\left(\left(X=0\right)\left(Y=1\right)\right)=\frac{C^2_4}{C^2_6}\cdot \frac{C^1_2\cdot C^1_2}{C^2_4}=\frac{6}{15}\cdot \frac{2\cdot 2}{6}=\frac{4}{15};$$

$$P\left(\left(X=0\right)\left(Y=2\right)\right)=\frac{C^2_4}{C^2_6}\cdot \frac{C^2_2}{C^2_4}=\frac{6}{15}\cdot \frac{1}{6}=\frac{1}{15};$$

$$P\left(\left(X=1\right)\left(Y=0\right)\right)=\frac{C^1_2\cdot C^1_4}{C^2_6}\cdot \frac{C^2_3}{C^2_4}=\frac{2\cdot 4}{15}\cdot \frac{3}{6}=\frac{4}{15};$$

$$P\left(\left(X=1\right)\left(Y=1\right)\right)=\frac{C^1_2\cdot C^1_4}{C^2_6}\cdot \frac{C^1_1\cdot C^1_3}{C^2_4}=\frac{2\cdot 4}{15}\cdot \frac{1\cdot 3}{6}=\frac{4}{15};$$

$$P\left(\left(X=2\right)\left(Y=0\right)\right)=\frac{C^2_2}{C^2_6}\cdot \frac{C^2_4}{C^2_4}=\frac{1}{15}\cdot 1=\frac{1}{15}.$$

Since in the distribution law (the resulting table) the entire set of events forms a complete group of events, the sum of the probabilities should be equal to 1. Let's check this:

$$\sum_{i,\ j} p_{ij}=\frac{1}{15}+\frac{4}{15}+\frac{1}{15}+\frac{4}{15}+\frac{4}{15}+\frac{1}{15}=1.$$
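The table of Example 2 can also be obtained by brute-force enumeration, which is a useful cross-check on the combinatorial formulas. The sketch below is an illustration rather than the method of the original text: it lists every way to put two of the six pencils into the first box and two of the remaining four into the second, treating all placements as equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Two red ("R") and four non-red ("N") pencils, indexed 0..5.
pencils = ["R", "R", "N", "N", "N", "N"]
indices = range(len(pencils))

counts = Counter()
total = 0
for first in combinations(indices, 2):            # contents of the first box
    rest = [k for k in indices if k not in first]
    for second in combinations(rest, 2):          # contents of the second box
        x = sum(pencils[k] == "R" for k in first)   # red pencils in box 1
        y = sum(pencils[k] == "R" for k in second)  # red pencils in box 2
        counts[(x, y)] += 1
        total += 1

# Exact joint law as fractions.
law = {xy: Fraction(c, total) for xy, c in counts.items()}

for xy in sorted(law):
    print(xy, law[xy])
```

Pairs such as (1, 2) or (2, 1) never occur in the enumeration, which matches the zeros in the table.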

Distribution function of a two-dimensional random variable

The distribution function of a two-dimensional random variable $\left(X,\ Y\right)$ is the function $F\left(x,\ y\right)$ which, for any real numbers $x$ and $y$, is equal to the probability of the joint occurrence of the two events $\left\{X < x\right\}$ and $\left\{Y < y\right\}$. Thus, by definition,

$$F\left(x,\ y\right)=P\left\{X < x,\ Y < y\right\}.$$

For a discrete two-dimensional random variable, the distribution function is found by summing all probabilities $p_{ij}$ for which $x_i < x,\ y_j < y$, that is,

$$F\left(x,\ y\right)=\sum_{x_i < x}{\sum_{y_j < y}{p_{ij}}}.$$

Properties of the distribution function of a two-dimensional random variable.

1 . The distribution function $F\left(x,\ y\right)$ is bounded, that is, $0\le F\left(x,\ y\right)\le 1$.

2 . $F\left(x,\ y\right)$ non-decreasing for each of its arguments with the other fixed, i.e. $F\left(x_2,\ y\right)\ge F\left(x_1,\ y\right )$ for $x_2>x_1$, $F\left(x,\ y_2\right)\ge F\left(x,\ y_1\right)$ for $y_2>y_1$.

3 . If at least one of the arguments takes the value $-\infty $, then the distribution function is equal to zero, i.e. $F\left(-\infty ,\ y\right)=F\left(x,\ -\infty \right)=F\left(-\infty ,\ -\infty \right)=0$.

4 . If both arguments take the value $+\infty $, then the distribution function is equal to $1$, i.e. $F\left(+\infty ,\ +\infty \right)=1$.

5 . When exactly one of the arguments takes the value $+\infty $, the distribution function $F\left(x,\ y\right)$ becomes the distribution function of the random variable corresponding to the other argument, i.e. $F\left(x,\ +\infty \right)=F_1\left(x\right)=F_X\left(x\right),\ F\left(+\infty ,\ y\right)=F_2\left(y\right)=F_Y\left(y\right)$.

6 . $F\left(x,\ y\right)$ is left-continuous in each of its arguments, i.e.

$$\lim_{x\to x_0-0} F\left(x,\ y\right)=F\left(x_0,\ y\right),\ \lim_{y\to y_0-0} F\left(x,\ y\right)=F\left(x,\ y_0\right).$$

Example 3 . Let a discrete two-dimensional random variable $\left(X,\ Y\right)$ be given by a distribution series.

$\begin{array}{|c|c|c|}
\hline
X\backslash Y & 0 & 1 \\
\hline
0 & \frac{1}{6} & \frac{2}{6} \\
\hline
1 & \frac{2}{6} & \frac{1}{6} \\
\hline
\end{array}$

Then the distribution function:

$F(x,y)=\begin{cases}
0, & x\le 0,\ y\le 0 \\
0, & x\le 0,\ 0< y\le 1 \\
0, & x\le 0,\ y>1 \\
0, & 0< x\le 1,\ y\le 0 \\
\frac{1}{6}, & 0< x\le 1,\ 0 < y\le 1 \\
\frac{1}{6}+\frac{2}{6}=\frac{1}{2}, & 0< x\le 1,\ y>1 \\
0, & x>1,\ y\le 0 \\
\frac{1}{6}+\frac{2}{6}=\frac{1}{2}, & x>1,\ 0< y\le 1 \\
\frac{1}{6}+\frac{2}{6}+\frac{2}{6}+\frac{1}{6}=1, & x>1,\ y>1
\end{cases}$
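The piecewise distribution function of Example 3 follows directly from the rule "sum all $p_{ij}$ with $x_i < x$ and $y_j < y$". A minimal sketch of that rule is given below; the helper name `make_cdf` is illustrative and not part of the original text.

```python
from fractions import Fraction


def make_cdf(xs, ys, joint):
    """Build F(x, y) = P(X < x, Y < y) for a discrete joint table."""
    def F(x, y):
        return sum(
            joint[i][j]
            for i, xi in enumerate(xs)
            for j, yj in enumerate(ys)
            if xi < x and yj < y      # strict inequalities, as in the definition
        )
    return F


# Table of Example 3: rows X = 0, 1; columns Y = 0, 1.
xs = [0, 1]
ys = [0, 1]
joint = [[Fraction(1, 6), Fraction(2, 6)],
         [Fraction(2, 6), Fraction(1, 6)]]

F = make_cdf(xs, ys, joint)

# One sample point from several regions of the piecewise formula.
for point in [(0, 5), (0.5, 0.5), (1, 2), (2, 1), (2, 2)]:
    print(point, F(*point))
```

Because the inequalities are strict, the value at a boundary such as $x = 1$ belongs to the region $0 < x \le 1$, which is why $F$ is left-continuous.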


Often the result of an experiment is described by several random variables $X_1, X_2, \dots, X_n$. For example, the weather at a given place at a certain time of day can be characterized by the following random variables: $X_1$ - temperature, $X_2$ - pressure, $X_3$ - air humidity, $X_4$ - wind speed.

In this case, one speaks of a multidimensional random variable or a system of random variables.

Consider a two-dimensional random variable $(X, Y)$ whose possible values are pairs of numbers $(x, y)$. Geometrically, a two-dimensional random variable can be interpreted as a random point on a plane.

If the components X and Y are discrete random variables, then $(X, Y)$ is a discrete two-dimensional random variable, and if X and Y are continuous, then $(X, Y)$ is a continuous two-dimensional random variable.

The probability distribution law of a two-dimensional random variable is the correspondence between its possible values and their probabilities.

The distribution law of a two-dimensional discrete random variable can be given in the form of a double-entry table (see Table 6.1.1), where $p_{ij} = P(X = x_i,\ Y = y_j)$ is the probability that the component X takes the value $x_i$ and the component Y takes the value $y_j$.

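Table 6.1.1.

$\begin{array}{|c|c|c|c|c|c|c|}
\hline
X\backslash Y & y_1 & y_2 & \dots & y_j & \dots & y_m \\
\hline
x_1 & p_{11} & p_{12} & \dots & p_{1j} & \dots & p_{1m} \\
\hline
x_2 & p_{21} & p_{22} & \dots & p_{2j} & \dots & p_{2m} \\
\hline
\dots & \dots & \dots & \dots & \dots & \dots & \dots \\
\hline
x_i & p_{i1} & p_{i2} & \dots & p_{ij} & \dots & p_{im} \\
\hline
\dots & \dots & \dots & \dots & \dots & \dots & \dots \\
\hline
x_n & p_{n1} & p_{n2} & \dots & p_{nj} & \dots & p_{nm} \\
\hline
\end{array}$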

Since the events $\{X = x_i,\ Y = y_j\}$ form a complete group of pairwise incompatible events, the sum of the probabilities is equal to 1, i.e. $\sum_{i=1}^{n}\sum_{j=1}^{m} p_{ij} = 1$.

From Table 6.1.1 you can find the distribution laws of the one-dimensional components X and Y.

Example 6.1.1 . Find the laws of distribution of components X and Y, if the distribution of a two-dimensional random variable is given in the form of table 6.1.2.

Table 6.1.2.

If we fix the value of one of the arguments, for example $Y = y_j$, then the resulting distribution of the quantity X is called a conditional distribution. The conditional distribution of Y is defined similarly.

Example 6.1.2 . According to the distribution of a two-dimensional random variable given in Table. 6.1.2, find: a) the conditional distribution law of the component X given that; b) conditional distribution law Y provided that.

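Solution. The conditional probabilities of the components X and Y are calculated by the formulas

$$P(X = x_i \mid Y = y_j) = \frac{p_{ij}}{P(Y = y_j)},\qquad P(Y = y_j \mid X = x_i) = \frac{p_{ij}}{P(X = x_i)}.$$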

Conditional distribution law X condition has the form

The control: .

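The distribution law of a two-dimensional random variable can be given by the distribution function $F(x, y)$, which determines for each pair of numbers $(x, y)$ the probability that X takes a value less than x and, at the same time, Y takes a value less than y:

$$F(x, y) = P(X < x,\ Y < y).$$

Geometrically, the function $F(x, y)$ is the probability of a random point $(X, Y)$ falling into the infinite quadrant with vertex at the point $(x, y)$ (Fig. 6.1.1).

Let's note the properties of $F(x, y)$.

  • 1. The range of the function $F(x, y)$ is $[0, 1]$, i.e. $0 \le F(x, y) \le 1$.
  • 2. The function $F(x, y)$ is non-decreasing in each argument.
  • 3. The following limit relations hold: $F(-\infty, y) = 0$, $F(x, -\infty) = 0$, $F(-\infty, -\infty) = 0$, $F(+\infty, +\infty) = 1$.

At $y = +\infty$, the distribution function of the system becomes equal to the distribution function of the component X, i.e. $F(x, +\infty) = F_1(x)$.

Likewise, $F(+\infty, y) = F_2(y)$.

Knowing $F(x, y)$, you can find the probability of a random point falling within the rectangle ABCD with sides $x_1 \le x < x_2$, $y_1 \le y < y_2$.

Namely,

$$P(x_1 \le X < x_2,\ y_1 \le Y < y_2) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1).$$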
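The probability of hitting a rectangle can be expressed through four values of the distribution function: $F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1)$ for the rectangle $x_1 \le X < x_2$, $y_1 \le Y < y_2$. The sketch below illustrates this standard identity; the symbols $x_1, x_2, y_1, y_2$ and the function names are assumptions introduced here, and the 2×2 table with entries 1/6, 2/6, 2/6, 1/6 is used purely as sample data.

```python
from fractions import Fraction

# Sample discrete law: rows X = 0, 1; columns Y = 0, 1.
xs = [0, 1]
ys = [0, 1]
joint = [[Fraction(1, 6), Fraction(2, 6)],
         [Fraction(2, 6), Fraction(1, 6)]]


def F(x, y):
    """Distribution function F(x, y) = P(X < x, Y < y)."""
    return sum(
        joint[i][j]
        for i, xi in enumerate(xs)
        for j, yj in enumerate(ys)
        if xi < x and yj < y
    )


def rect_prob(x1, x2, y1, y2):
    """P(x1 <= X < x2, y1 <= Y < y2) via inclusion-exclusion on F."""
    return F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)


def rect_prob_direct(x1, x2, y1, y2):
    """The same probability by summing the table cells inside the rectangle."""
    return sum(
        joint[i][j]
        for i, xi in enumerate(xs)
        for j, yj in enumerate(ys)
        if x1 <= xi < x2 and y1 <= yj < y2
    )


print(rect_prob(0.5, 1.5, -0.5, 1.5), rect_prob_direct(0.5, 1.5, -0.5, 1.5))
print(rect_prob(0, 1, 1, 2), rect_prob_direct(0, 1, 1, 2))
```

The half-open rectangle matches the strict inequalities in the definition of $F$; with a different convention for $F$ the boundary handling would change.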

Example 6.1.3. Bivariate discrete random variable defined by distribution table

Find the distribution function.

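Solution. For discrete components X and Y, the value $F(x, y)$ is found by summing all probabilities $p_{ij}$ with indices i and j for which $x_i < x$ and $y_j < y$. If x or y does not exceed the smallest possible value of the corresponding component, no terms enter the sum and $F(x, y) = 0$ (the events $\{X < x\}$ and $\{Y < y\}$ cannot occur together). Similarly, we get: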

if and then;

if and then;

if and then;

if and then;

if and then;

if and then;

if and then;

if and then;

if and then.

The results obtained are presented in the form of a table (6.1.3) of values:

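For a two-dimensional continuous random variable, the concept of probability density is introduced:

$$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\,\partial y}.$$

Geometrically, the probability density is a distribution surface in space.

A two-dimensional probability density has the following properties:

1. $f(x, y) \ge 0$.

2. $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x, y)\,dx\,dy = 1$.

3. The distribution function can be expressed in terms of $f(x, y)$ by the formula $F(x, y) = \int_{-\infty}^{x}\int_{-\infty}^{y} f(u, v)\,dv\,du$.

4. The probability of a continuous random variable $(X, Y)$ falling into a region D is equal to $P((X, Y) \in D) = \iint_D f(x, y)\,dx\,dy$.

5. In accordance with property (4) of the function $F(x, y)$, the following formulas hold:

$$F_1(x) = \int_{-\infty}^{x}\int_{-\infty}^{+\infty} f(u, v)\,dv\,du,\qquad F_2(y) = \int_{-\infty}^{y}\int_{-\infty}^{+\infty} f(u, v)\,du\,dv.$$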

Example 6.1.4. The distribution function of a two-dimensional random variable is given

An ordered pair (X, Y) of random variables X and Y is called a two-dimensional random variable, or a random vector of a two-dimensional space. A two-dimensional random variable (X, Y) is also called a system of random variables X and Y. The set of all possible values of a discrete random variable together with their probabilities is called the distribution law of this random variable. A discrete two-dimensional random variable (X, Y) is considered given if its distribution law is known:

P(X=x_i, Y=y_j) = p_ij, i = 1, 2, ..., n, j = 1, 2, ..., m

Purpose of the service. Using the service, for a given distribution law you can find:

  • the distribution series of X and Y, the mathematical expectations M[X], M[Y], the variances D[X], D[Y];
  • the covariance cov(X,Y), the correlation coefficient r_xy, the conditional distribution series of X, the conditional expectation M[X/Y=y_j];
In addition, an answer is given to the question, "Are the random variables X and Y dependent?".


Example #1. A two-dimensional discrete random variable has a distribution table:

X \ Y   1      2      3      4
10      0      0.11   0.12   0.03
20      0      0.13   0.09   0.02
30      0.02   0.11   0.08   0.01
40      0.03   0.11   0.05   q
Find the q value and the correlation coefficient of this random variable.

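Solution. We find the value q from the condition Σp_ij = 1
Σp_ij = 0.02 + 0.03 + 0.11 + … + 0.03 + 0.02 + 0.01 + q = 1
0.91 + q = 1, whence q = 0.09

Summing the joint probabilities over j, ∑_j P(x_i, y_j) = p_i, we find the distribution series of X (values 10, 20, 30, 40, the rows of the table): 0.26, 0.24, 0.22, 0.28, which gives M[X] = 25.2 and σ(x) = 11.531. Summing over i gives the distribution series of Y (values 1, 2, 3, 4, the columns of the table): 0.05, 0.46, 0.34, 0.15.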

Mathematical expectation M[Y].
M[y] = 1*0.05 + 2*0.46 + 3*0.34 + 4*0.15 = 2.59
Variance D[Y] = 1^2*0.05 + 2^2*0.46 + 3^2*0.34 + 4^2*0.15 - 2.59^2 = 0.6419 ≈ 0.64
Standard deviation σ(y) = sqrt(D[Y]) = sqrt(0.6419) = 0.801

Covariance cov(X,Y) = M[XY] - M[X]*M[Y] = 2*10*0.11 + 3*10*0.12 + 4*10*0.03 + 2*20*0.13 + 3*20*0.09 + 4*20*0.02 + 1*30*0.02 + 2*30*0.11 + 3*30*0.08 + 4*30*0.01 + 1*40*0.03 + 2*40*0.11 + 3*40*0.05 + 4*40*0.09 - 25.2*2.59 = -0.068
Correlation coefficient r_xy = cov(X,Y)/(σ(x)*σ(y)) = -0.068/(11.531*0.801) = -0.00736
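The arithmetic of Example #1 can be re-run in code as a check. This is a minimal sketch under the reading that the rows of the table are the values 10, 20, 30, 40 of X and the columns are the values 1, 2, 3, 4 of Y (which is what the calculation of M[Y] implies); all variable names are illustrative.

```python
import math

x_vals = [10, 20, 30, 40]   # row values (X)
y_vals = [1, 2, 3, 4]       # column values (Y)

# Known entries of the table; None marks the unknown q.
known = [[0.00, 0.11, 0.12, 0.03],
         [0.00, 0.13, 0.09, 0.02],
         [0.02, 0.11, 0.08, 0.01],
         [0.03, 0.11, 0.05, None]]

# q follows from the requirement that all probabilities sum to 1.
q = 1 - sum(v for row in known for v in row if v is not None)
joint = [[q if v is None else v for v in row] for row in known]

# Marginal distribution series.
px = [sum(row) for row in joint]
py = [sum(col) for col in zip(*joint)]

# Means, variances and standard deviations.
mx = sum(x * p for x, p in zip(x_vals, px))
my = sum(y * p for y, p in zip(y_vals, py))
dx = sum(x * x * p for x, p in zip(x_vals, px)) - mx ** 2
dy = sum(y * y * p for y, p in zip(y_vals, py)) - my ** 2
sx, sy = math.sqrt(dx), math.sqrt(dy)

# Covariance M[XY] - M[X]M[Y] and correlation coefficient.
mxy = sum(x_vals[i] * y_vals[j] * joint[i][j]
          for i in range(len(x_vals)) for j in range(len(y_vals)))
cov = mxy - mx * my
r = cov / (sx * sy)

print(round(q, 2), round(mx, 2), round(my, 2))
print(round(sx, 3), round(sy, 3), round(cov, 3), round(r, 5))
```

The correlation coefficient is very close to zero, but that alone does not establish independence; the factorization $p_{ij} = p_i q_j$ has to be checked cell by cell.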

Example 2 . The data of statistical processing of information regarding two indicators X and Y are reflected in the correlation table. Required:

  1. write distribution series for X and Y and calculate sample means and sample standard deviations for them;
  2. write conditional distribution series Y/x and calculate conditional averages Y/x;
  3. graphically depict the dependence of the conditional averages Y/x on the values of X;
  4. calculate the sample correlation coefficient of Y and X;
  5. write the sample linear regression equation;
  6. represent geometrically the data of the correlation table and build a regression line.
Solution. The correlation table gives the frequencies n_ij with which the pairs (x_i, y_j) were observed:
X \ Y   20   30   40   50   60
11      2    0    0    0    0
16      4    6    0    0    0
21      0    3    6    2    0
26      0    0    45   8    4
31      0    0    4    6    7
36      0    0    0    0    3
The frequencies sum to 100, so the relative frequencies p_ij = n_ij/100 play the role of the probabilities P(X=x_i, Y=y_j). The events (X=x_i, Y=y_j) form a complete group of events, so the sum of all p_ij (i = 1, 2, ..., n, j = 1, 2, ..., m) is equal to 1.
1. Dependence of the random variables X and Y.
Find the distribution series of X and Y.
Summing the joint probabilities over j, ∑_j P(x_i, y_j) = p_i, we find the distribution series of X; summing over i gives the distribution series of Y.
Mathematical expectation M[Y].
M[y] = (20*6 + 30*9 + 40*55 + 50*16 + 60*14)/100 = 42.3
Variance D[Y].
D[Y] = (20^2*6 + 30^2*9 + 40^2*55 + 50^2*16 + 60^2*14)/100 - 42.3^2 = 99.71
Standard deviation σ(y).
σ(y) = sqrt(D[Y]) = sqrt(99.71) = 9.99

Since P(X=11, Y=20) = 0.02 ≠ P(X=11)*P(Y=20) = 0.02*0.06 = 0.0012, the random variables X and Y are dependent.
2. Conditional distribution law X.
Conditional distribution law X(Y=20).
P(X=11/Y=20) = 2/6 = 0.33
P(X=16/Y=20) = 4/6 = 0.67
P(X=21/Y=20) = 0/6 = 0
P(X=26/Y=20) = 0/6 = 0
P(X=31/Y=20) = 0/6 = 0
P(X=36/Y=20) = 0/6 = 0
Conditional expectation M[X/Y=20] = 11*0.33 + 16*0.67 + 21*0 + 26*0 + 31*0 + 36*0 = 14.33
Conditional variance D[X/Y=20] = 11^2*0.33 + 16^2*0.67 + 21^2*0 + 26^2*0 + 31^2*0 + 36^2*0 - 14.33^2 = 5.56
Conditional distribution law X(Y=30).
P(X=11/Y=30) = 0/9 = 0
P(X=16/Y=30) = 6/9 = 0.67
P(X=21/Y=30) = 3/9 = 0.33
P(X=26/Y=30) = 0/9 = 0
P(X=31/Y=30) = 0/9 = 0
P(X=36/Y=30) = 0/9 = 0
Conditional expectation M[X/Y=30] = 11*0 + 16*0.67 + 21*0.33 + 26*0 + 31*0 + 36*0 = 17.67
Conditional variance D[X/Y=30] = 11^2*0 + 16^2*0.67 + 21^2*0.33 + 26^2*0 + 31^2*0 + 36^2*0 - 17.67^2 = 5.56
Conditional distribution law X(Y=40).
P(X=11/Y=40) = 0/55 = 0
P(X=16/Y=40) = 0/55 = 0
P(X=21/Y=40) = 6/55 = 0.11
P(X=26/Y=40) = 45/55 = 0.82
P(X=31/Y=40) = 4/55 = 0.0727
P(X=36/Y=40) = 0/55 = 0
Conditional expectation M[X/Y=40] = 11*0 + 16*0 + 21*0.11 + 26*0.82 + 31*0.0727 + 36*0 = 25.82
Conditional variance D[X/Y=40] = 11^2*0 + 16^2*0 + 21^2*0.11 + 26^2*0.82 + 31^2*0.0727 + 36^2*0 - 25.82^2 = 4.51
Conditional distribution law X(Y=50).
P(X=11/Y=50) = 0/16 = 0
P(X=16/Y=50) = 0/16 = 0
P(X=21/Y=50) = 2/16 = 0.13
P(X=26/Y=50) = 8/16 = 0.5
P(X=31/Y=50) = 6/16 = 0.38
P(X=36/Y=50) = 0/16 = 0
Conditional expectation M[X/Y=50] = 11*0 + 16*0 + 21*0.13 + 26*0.5 + 31*0.38 + 36*0 = 27.25
Conditional variance D[X/Y=50] = 11^2*0 + 16^2*0 + 21^2*0.13 + 26^2*0.5 + 31^2*0.38 + 36^2*0 - 27.25^2 = 10.94
Conditional distribution law X(Y=60).
P(X=11/Y=60) = 0/14 = 0
P(X=16/Y=60) = 0/14 = 0
P(X=21/Y=60) = 0/14 = 0
P(X=26/Y=60) = 4/14 = 0.29
P(X=31/Y=60) = 7/14 = 0.5
P(X=36/Y=60) = 3/14 = 0.21
Conditional expectation M[X/Y=60] = 11*0 + 16*0 + 21*0 + 26*0.29 + 31*0.5 + 36*0.21 = 30.64
Conditional variance D[X/Y=60] = 11^2*0 + 16^2*0 + 21^2*0 + 26^2*0.29 + 31^2*0.5 + 36^2*0.21 - 30.64^2 = 12.37
3. Conditional distribution law Y.
Conditional distribution law Y(X=11).
P(Y=20/X=11) = 2/2 = 1
P(Y=30/X=11) = 0/2 = 0
P(Y=40/X=11) = 0/2 = 0
P(Y=50/X=11) = 0/2 = 0
P(Y=60/X=11) = 0/2 = 0
Conditional expectation M[Y/X=11] = 20*1 + 30*0 + 40*0 + 50*0 + 60*0 = 20
Conditional variance D[Y/X=11] = 20^2*1 + 30^2*0 + 40^2*0 + 50^2*0 + 60^2*0 - 20^2 = 0
Conditional distribution law Y(X=16).
P(Y=20/X=16) = 4/10 = 0.4
P(Y=30/X=16) = 6/10 = 0.6
P(Y=40/X=16) = 0/10 = 0
P(Y=50/X=16) = 0/10 = 0
P(Y=60/X=16) = 0/10 = 0
Conditional expectation M[Y/X=16] = 20*0.4 + 30*0.6 + 40*0 + 50*0 + 60*0 = 26
Conditional variance D[Y/X=16] = 20^2*0.4 + 30^2*0.6 + 40^2*0 + 50^2*0 + 60^2*0 - 26^2 = 24
Conditional distribution law Y(X=21).
P(Y=20/X=21) = 0/11 = 0
P(Y=30/X=21) = 3/11 = 0.27
P(Y=40/X=21) = 6/11 = 0.55
P(Y=50/X=21) = 2/11 = 0.18
P(Y=60/X=21) = 0/11 = 0
Conditional expectation M[Y/X=21] = 20*0 + 30*0.27 + 40*0.55 + 50*0.18 + 60*0 = 39.09
Conditional variance D[Y/X=21] = 20^2*0 + 30^2*0.27 + 40^2*0.55 + 50^2*0.18 + 60^2*0 - 39.09^2 = 44.63
Conditional distribution law Y(X=26).
P(Y=20/X=26) = 0/57 = 0
P(Y=30/X=26) = 0/57 = 0
P(Y=40/X=26) = 45/57 = 0.79
P(Y=50/X=26) = 8/57 = 0.14
P(Y=60/X=26) = 4/57 = 0.0702
Conditional expectation M[Y/X=26] = 20*0 + 30*0 + 40*0.79 + 50*0.14 + 60*0.0702 = 42.81
Conditional variance D[Y/X=26] = 20^2*0 + 30^2*0 + 40^2*0.79 + 50^2*0.14 + 60^2*0.0702 - 42.81^2 = 34.23
Conditional distribution law Y(X=31).
P(Y=20/X=31) = 0/17 = 0
P(Y=30/X=31) = 0/17 = 0
P(Y=40/X=31) = 4/17 = 0.24
P(Y=50/X=31) = 6/17 = 0.35
P(Y=60/X=31) = 7/17 = 0.41
Conditional expectation M[Y/X=31] = 20*0 + 30*0 + 40*0.24 + 50*0.35 + 60*0.41 = 51.76
Conditional variance D[Y/X=31] = 20^2*0 + 30^2*0 + 40^2*0.24 + 50^2*0.35 + 60^2*0.41 - 51.76^2 = 61.59
Conditional distribution law Y(X=36).
P(Y=20/X=36) = 0/3 = 0
P(Y=30/X=36) = 0/3 = 0
P(Y=40/X=36) = 0/3 = 0
P(Y=50/X=36) = 0/3 = 0
P(Y=60/X=36) = 3/3 = 1
Conditional expectation M[Y/X=36] = 20*0 + 30*0 + 40*0 + 50*0 + 60*1 = 60
Conditional variance D[Y/X=36] = 20^2*0 + 30^2*0 + 40^2*0 + 50^2*0 + 60^2*1 - 60^2 = 0
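All of the conditional distributions, expectations and variances listed above follow one pattern: take a row or a column of the frequency table, normalize it, and compute the moments. The sketch below shows that pattern; the helper name `conditional_moments` is illustrative and not part of the original solution. It works with unrounded probabilities, so intermediate values may differ slightly from the rounded figures in the text.

```python
x_vals = [11, 16, 21, 26, 31, 36]   # row values (X)
y_vals = [20, 30, 40, 50, 60]       # column values (Y)

# Frequency table n_ij: freq[i][j] counts the pair (x_vals[i], y_vals[j]).
freq = [[2, 0, 0, 0, 0],
        [4, 6, 0, 0, 0],
        [0, 3, 6, 2, 0],
        [0, 0, 45, 8, 4],
        [0, 0, 4, 6, 7],
        [0, 0, 0, 0, 3]]


def conditional_moments(values, weights):
    """Normalize weights to probabilities; return (probs, mean, variance)."""
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum(v * v * p for v, p in zip(values, probs)) - mean ** 2
    return probs, mean, var


# X given Y = y_j uses column j of the table.
x_given_y = {
    y: conditional_moments(x_vals, [row[j] for row in freq])
    for j, y in enumerate(y_vals)
}

# Y given X = x_i uses row i of the table.
y_given_x = {
    x: conditional_moments(y_vals, freq[i])
    for i, x in enumerate(x_vals)
}

for y, (_, mean, var) in x_given_y.items():
    print("X | Y =", y, round(mean, 2), round(var, 2))
for x, (_, mean, var) in y_given_x.items():
    print("Y | X =", x, round(mean, 2), round(var, 2))
```

The conditional means of Y for each x are exactly the points that item 3 of the task asks to plot against the values of X.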
Covariance.
cov(X,Y) = M[XY] - M[X]*M[Y]
cov(X,Y) = (20*11*2 + 20*16*4 + 30*16*6 + 30*21*3 + 40*21*6 + 50*21*2 + 40*26*45 + 50*26*8 + 60*26*4 + 40*31*4 + 50*31*6 + 60*31*7 + 60*36*3)/100 - 25.3*42.3 = 38.11
If the random variables are independent, then their covariance is zero. In our case cov(X,Y) ≠ 0.
Correlation coefficient.
r_xy = cov(X,Y)/(σ(x)*σ(y)) = 38.11/(4.9*9.99) ≈ 0.78

The linear regression equation of y on x is:
y_x = ȳ + r_xy*(σ_y/σ_x)*(x - x̄)
The linear regression equation of x on y is:
x_y = x̄ + r_xy*(σ_x/σ_y)*(y - ȳ)

Find the necessary numerical characteristics. In the regression calculations below, x denotes the column variable of the table (values 20, ..., 60) and y the row variable (values 11, ..., 36), that is, the reverse of the labels X and Y used above.
Sample means:
x̄ = (20(2 + 4) + 30(6 + 3) + 40(6 + 45 + 4) + 50(2 + 8 + 6) + 60(4 + 7 + 3))/100 = 42.3
ȳ = (11(2) + 16(4 + 6) + 21(3 + 6 + 2) + 26(45 + 8 + 4) + 31(4 + 6 + 7) + 36(3))/100 = 25.3
Variances:
σ_x^2 = (20^2(2 + 4) + 30^2(6 + 3) + 40^2(6 + 45 + 4) + 50^2(2 + 8 + 6) + 60^2(4 + 7 + 3))/100 - 42.3^2 = 99.71
σ_y^2 = (11^2(2) + 16^2(4 + 6) + 21^2(3 + 6 + 2) + 26^2(45 + 8 + 4) + 31^2(4 + 6 + 7) + 36^2(3))/100 - 25.3^2 = 24.01
From which we get the standard deviations:
σ_x = 9.99 and σ_y = 4.9
and the covariance:
Cov(x,y) = (20*11*2 + 20*16*4 + 30*16*6 + 30*21*3 + 40*21*6 + 50*21*2 + 40*26*45 + 50*26*8 + 60*26*4 + 40*31*4 + 50*31*6 + 60*31*7 + 60*36*3)/100 - 42.3*25.3 = 38.11
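Let's determine the correlation coefficient:
r_xy = Cov(x,y)/(σ_x*σ_y) = 38.11/(9.99*4.9) ≈ 0.78

Let's write down the equation of the regression line y(x):
y_x = ȳ + r_xy*(σ_y/σ_x)*(x - x̄), where the slope is r_xy*(σ_y/σ_x) = Cov(x,y)/σ_x^2 = 38.11/99.71 = 0.382,
and calculating, we get:
y_x = 0.38x + 9.14
Let's write the equation of the regression line x(y):
x_y = x̄ + r_xy*(σ_x/σ_y)*(y - ȳ), where the slope is r_xy*(σ_x/σ_y) = Cov(x,y)/σ_y^2 = 38.11/24.01 = 1.587,
and calculating, we get:
x_y = 1.59y + 2.15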
If we build the points defined by the table and the regression lines, we will see that both lines pass through the point with coordinates (42.3; 25.3) and the points are located close to the regression lines.
Significance of the correlation coefficient.
The observed value of the test statistic is t_obs = r_xy*sqrt(n-2)/sqrt(1 - r_xy^2); with r_xy ≈ 0.78 and n = 100 this gives t_obs ≈ 12.3.
According to Student's table with significance level α=0.05 and degrees of freedom k = 100-m-1 = 98 we find t_crit:
t_crit(n-m-1; α/2) = t_crit(98; 0.025) = 1.984
where m = 1 is the number of explanatory variables.
If t_obs > t_crit, then the obtained value of the correlation coefficient is recognized as significant (the null hypothesis asserting that the correlation coefficient is equal to zero is rejected).
Since t_obs > t_crit, we reject the hypothesis that the correlation coefficient is equal to 0. In other words, the correlation coefficient is statistically significant.
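The sample statistics, both regression lines and the significance test can be recomputed from the correlation table in one pass. The following is a sketch under two assumptions that are not spelled out in the original: the regression part labels the column variable (20, ..., 60) as x and the row variable (11, ..., 36) as y, and the test statistic is the usual t = r·sqrt(n-2)/sqrt(1-r²). The critical value 1.984 is copied from the text rather than computed.

```python
import math

x_vals = [20, 30, 40, 50, 60]        # column values, called x in the regression part
y_vals = [11, 16, 21, 26, 31, 36]    # row values, called y in the regression part

# freq[i][j] counts the pair (y_vals[i], x_vals[j]).
freq = [[2, 0, 0, 0, 0],
        [4, 6, 0, 0, 0],
        [0, 3, 6, 2, 0],
        [0, 0, 45, 8, 4],
        [0, 0, 4, 6, 7],
        [0, 0, 0, 0, 3]]

n = sum(map(sum, freq))
cells = [(x_vals[j], y_vals[i], freq[i][j])
         for i in range(len(y_vals)) for j in range(len(x_vals))]

# Sample means, variances and covariance (divisor n, as in the text).
mean_x = sum(x * w for x, _, w in cells) / n
mean_y = sum(y * w for _, y, w in cells) / n
var_x = sum(x * x * w for x, _, w in cells) / n - mean_x ** 2
var_y = sum(y * y * w for _, y, w in cells) / n - mean_y ** 2
cov = sum(x * y * w for x, y, w in cells) / n - mean_x * mean_y

# Correlation coefficient.
r = cov / math.sqrt(var_x * var_y)

# Regression of y on x: y = b_yx * x + a_yx.
b_yx = cov / var_x
a_yx = mean_y - b_yx * mean_x

# Regression of x on y: x = b_xy * y + a_xy.
b_xy = cov / var_y
a_xy = mean_x - b_xy * mean_y

# Observed t statistic for H0: correlation equals zero.
t_obs = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
t_crit = 1.984   # value quoted in the text for 98 degrees of freedom

print(round(r, 2), round(b_yx, 2), round(a_yx, 2), round(b_xy, 2), round(a_xy, 2))
print(round(t_obs, 1), t_obs > t_crit)
```

Small differences in the last digit of the intercepts compared with the text come from rounding the slopes before the intercepts are computed.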

Exercise. The numbers of pairs of values of the random variables X and Y falling into the corresponding intervals are given in the table. From these data, find the sample correlation coefficient and the sample equations of the linear regression of Y on X and of X on Y.

Example. The probability distribution of a two-dimensional random variable (X, Y) is given by a table. Find the laws of distribution of the component quantities X, Y and the correlation coefficient p(X, Y).

Exercise. The two-dimensional discrete value (X, Y) is given by the distribution law. Find the distribution laws of the X and Y components, covariance and correlation coefficient.