Continuous random variable, distribution function and probability density. Distribution function of a random variable

Expected value

The variance of a continuous random variable X, whose possible values occupy the entire Ox axis, is defined by the equality:

Service assignment. The online calculator solves problems in which either the distribution density f(x) or the distribution function F(x) is given (see the example). Such problems usually require finding the mathematical expectation and the standard deviation, and plotting the functions f(x) and F(x).

Instruction. Select the type of input data: the distribution density f(x) or the distribution function F(x).


Example. A continuous random variable is defined by a probability density
(the Rayleigh distribution, used in radio engineering). Find M(X) and D(X).

A random variable X is called continuous if its distribution function F(x) = P(X < x) is continuous and has a derivative.
The distribution function of a continuous random variable is used to calculate the probability that the random variable falls into a given interval:
P(α < X < β) = F(β) − F(α),
and for a continuous random variable it does not matter whether the boundaries of the interval are included or not:
P(α < X < β) = P(α ≤ X < β) = P(α ≤ X ≤ β).
The distribution density of a continuous random variable is the function
f(x) = F′(x), the derivative of the distribution function.
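These relations can be checked numerically. The sketch below assumes, purely for illustration (the text does not fix a density here), the exponential density f(x) = e^(−x) for x ≥ 0, whose distribution function is F(x) = 1 − e^(−x):

```python
import math

# Illustrative density (an assumption, not from the text): f(x) = e^{-x} for x >= 0
def f(x):
    return math.exp(-x) if x >= 0 else 0.0

# Its distribution function F(x) = 1 - e^{-x}, so that f(x) = F'(x)
def F(x):
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def integrate(g, a, b, n=100_000):
    # simple midpoint rule
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

alpha, beta = 0.5, 2.0
p_interval = F(beta) - F(alpha)        # P(alpha < X < beta) = F(beta) - F(alpha)
p_by_area = integrate(f, alpha, beta)  # the same probability as the area under f
```

Both computations give the same number, which is exactly the statement P(α < X < β) = F(β) − F(α).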

Distribution Density Properties

1. The distribution density of a random variable is non-negative: f(x) ≥ 0 for all values of x.
2. Normalization condition:

The geometric meaning of the normalization condition: the area under the distribution density curve is equal to one.
3. The probability that the random variable X falls in the interval from α to β can be calculated by the formula

Geometrically, the probability that a continuous random variable X falls into the interval (α, β) is equal to the area of ​​the curvilinear trapezoid under the distribution density curve based on this interval.
4. The distribution function is expressed in terms of density as follows:

The value of the distribution density at a point x is not equal to the probability of taking that value; for a continuous random variable one can only speak of the probability of falling into a given interval. Let
x ∈ [a, b], (4)

where a and b are not necessarily finite. For example, for the modulus of the velocity vector of a gas molecule, V ∈ [0, ∞). Consider the event that x falls into a small interval:
x ∈ [x, x + Δx] ⊂ [a, b]. (5)

Then the probability ΔW(x, Δx) that x falls into interval (5) is equal to

Here N is the total number of measurements of x, and Δn(x, Δx) is the number of results that fall into interval (5).

The probability ΔW naturally depends on two arguments: x, the position of the interval within [a, b], and Δx, its length (it is assumed, though not at all necessary, that Δx > 0). For example, the probability of obtaining the exact value x, in other words the probability of x falling into an interval of zero length, is the probability of an impossible event and therefore equals zero: ΔW(x, 0) = 0.

On the other hand, the probability of obtaining the value x somewhere (no matter where) within the entire interval [a, b] is the probability of a certain event (something always happens) and therefore equals one (it is assumed that b > a): ΔW(a, b − a) = 1.

Let Δx be small. The criterion of sufficient smallness depends on the specific properties of the system described by the probability distribution ΔW(x, Δx). If Δx is small, the function ΔW(x, Δx) can be expanded in a series in powers of Δx:

If we plot the dependence of ΔW(x, Δx) on its second argument Δx, then replacing the exact dependence by the approximate expression (7) means replacing (in a small region) the exact curve by a piece of a parabola.

In (7) the first term is exactly zero, and the third and subsequent terms can be omitted if Δx is sufficiently small. Introducing the notation

gives the important result ΔW(x, Δx) ≈ ρ(x) Δx. (8)

Relation (8), which becomes more accurate the smaller Δx is, means that for a short interval the probability of falling into it is proportional to its length.

One can pass from a small but finite Δx to a formally infinitesimal dx, simultaneously replacing ΔW(x, Δx) by dW(x). The approximate equality (8) then becomes exact: dW(x) = ρ(x) dx. (9)

The proportionality coefficient ρ(x) has a simple meaning. As can be seen from (8) and (9), ρ(x) is numerically equal to the probability of x falling into an interval of unit length. Hence one of the names of the function ρ(x): the probability distribution density of the variable x.

The function ρ(x) contains all the information about how the probability dW(x) of x falling into an interval of a given length dx depends on the location of that interval, i.e., it shows how the probability is distributed over x. The function ρ(x) is therefore commonly called the distribution function of the variable x, and thus the distribution function of the physical system whose spectrum of states the variable x was introduced to describe. The terms "probability density" and "distribution function" are used interchangeably in statistical physics.
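The frequency definition (6) and the density approximation (8) can be compared directly in a simulation. A minimal sketch, assuming (as an illustrative choice, not from the text) that x is uniform on [a, b] = [0, 2], so that ρ(x) = 1/(b − a):

```python
import random

random.seed(1)
N = 200_000
a, b = 0.0, 2.0
measurements = [random.uniform(a, b) for _ in range(N)]  # N measurements of x

x0, dx = 0.7, 0.05  # the small interval [x0, x0 + dx] of (5)
dn = sum(1 for v in measurements if x0 <= v <= x0 + dx)  # Delta n(x, dx)
dW = dn / N                                              # Delta W(x, dx) from (6)

rho = 1.0 / (b - a)   # true density of the uniform distribution
approx = rho * dx     # rho(x) * dx, the right-hand side of (8)
```

The empirical frequency dW and the product ρ(x)Δx agree to within statistical noise, which is the content of relation (8).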

We can generalize the definition of probability (6) and of the distribution function (9) to the case of, for example, three variables; the generalization to an arbitrarily large number of variables proceeds in exactly the same way.

Let the state of a physical system randomly varying in time be determined by the values ​​of three variables x, y and z with continuous spectrum:

x ∈ [a, b]

y ∈ [c, d]

z ∈ [e, f] (10)

where a, b, …, f are, as before, not necessarily finite. The variables x, y, and z can be, for example, the coordinates of the center of mass of a gas molecule, or the components of its velocity vector (x → Vx, y → Vy, z → Vz) or momentum, etc. An event is understood as the simultaneous occurrence of all three variables in intervals of length Δx, Δy, and Δz respectively, i.e.:

x ∈ [x, x + Δx]

y ∈ [y, y + Δy]

z ∈ [z, z + Δz] (11)

The probability of an event (11) can be determined similarly to (6)

with the difference that now Δn is the number of measurements of x, y, and z whose results simultaneously satisfy relations (11). A series expansion similar to (7) gives

dW(x, y, z) = ρ(x, y, z) dx dy dz, (13)

where ρ(x, y, z) is the joint distribution function of the three variables x, y, and z.

In the mathematical theory of probability, the term "distribution function" denotes a quantity different from ρ(x). Namely, let x be some value of the random variable. The function Ф(x), which gives the probability that the random variable takes a value not exceeding x, is called the distribution function. The functions ρ and Ф have different meanings but are related. The addition theorem for probabilities gives (here a is the left end of the range of possible values of x; see PROBABILITY THEORY)
Ф(x) = ∫[a, x] ρ(t) dt, (14)
whence

Using the approximate relation (8) gives ΔW(x, Δx) ≈ ρ(x) Δx.

Comparison with the exact expression (15) shows that using (8) is equivalent to replacing the integral in (16) with the product of the integrand ρ(x) and the length of the integration interval Δx:

Relation (17) is exact if ρ = const; hence the error in replacing (16) by (17) is small when the integrand changes little over the length of the integration interval Δx.

One can introduce Δx_eff, the length of the interval over which the distribution function ρ(x) changes significantly, i.e., by an amount of the order of the function itself, or over which the increment Δρ_eff is of the order of ρ in absolute value. Using the Lagrange formula, we can write:

whence an estimate (19) of Δx_eff follows, valid for any function ρ:

The distribution function can be considered "almost constant" over a certain interval of variation of the argument if its increment |Δρ| on that interval is much smaller in absolute value than the function itself at the points of the interval. The requirement |Δρ_eff| ~ ρ (for a distribution function, ρ ≥ 0) gives

Δx ≪ Δx_eff (20)

i.e., the length of the integration interval should be small compared with the interval over which the integrand changes significantly. This is illustrated in Fig. 1.

The integral on the left-hand side of (17) equals the area under the curve. The product on the right-hand side of (17) is the area of the shaded column in Fig. 1. The criterion for the smallness of the difference between the corresponding areas is inequality (20). This can be verified by substituting into the integral (17) the first terms of the expansion of the function ρ(x) in a power series:

The requirement that the correction (the second term on the right-hand side of (21)) be small compared with the first term gives inequality (20) with Δx_eff from (19).

Below are examples of distribution functions that play an important role in statistical physics.

The Maxwell distribution for the projection of the velocity vector of a molecule onto a given direction (for example, the direction of the OX axis):

Here m is the mass of a gas molecule, T is its temperature, and k is the Boltzmann constant.

Maxwell distribution for the modulus of the velocity vector:

The Maxwell distribution for the energy of translational motion of molecules, ε = mV²/2:

The Boltzmann distribution, or more precisely the so-called barometric formula, which gives the concentration of molecules or the air pressure as a function of height h above some "zero level" under the assumption that the air temperature does not depend on height (the isothermal atmosphere model). In fact, the temperature in the lower layers of the atmosphere drops noticeably with increasing altitude.
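The barometric formula n(h) = n₀ exp(−mgh/kT) is easy to evaluate. A sketch with illustrative numbers (an "air" molecule of about 29 atomic mass units at T = 273 K; these values are assumptions, not taken from the text):

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K

def concentration(n0, m, h, T, g=9.81):
    """Relative concentration at height h in the isothermal atmosphere model:
    n(h) = n0 * exp(-m * g * h / (k * T))."""
    return n0 * math.exp(-m * g * h / (K_BOLTZMANN * T))

m_air = 29 * 1.66054e-27     # kg, an average "air" molecule (~29 u), assumed value
n_sea = concentration(1.0, m_air, 0.0, 273.0)      # 1.0 at the zero level
n_8km = concentration(1.0, m_air, 8000.0, 273.0)   # noticeably smaller at 8 km
```

At about 8 km the concentration falls to roughly 1/e of its ground value, which is why this height is sometimes called the scale height of the isothermal atmosphere.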

To find the distribution functions of random variables, one must study this field of knowledge systematically. There are several methods for finding the quantities in question, including the change-of-variable method and moment generating functions. A distribution is characterized by quantities such as the variance; however, the variance describes only the degree of scatter.

Especially important are functions of random variables that are independent and identically distributed. For example, if X1 is the weight of a randomly selected individual from a population of males, X2 the weight of another, ..., and Xn the weight of yet another person from that population, then we need to know how their average is distributed. Here the classical central limit theorem applies: it shows that for large n the distribution of the suitably normalized average approaches the standard normal distribution.

Functions of one random variable

The central limit theorem is also used to approximate discrete distributions such as the binomial and the Poisson. Distribution functions of random variables are considered, first of all, for simple functions of one variable. For example, let X be a continuous random variable with a given probability distribution. We explore how to find the density function of a function of X using two different approaches: the distribution function method and the change-of-variable method. First only one-to-one functions are considered; then the change-of-variable technique is modified to handle more general functions. Finally, one learns how the cumulative distribution can help model random numbers that follow a given distribution.

The distribution function method

The distribution function method is used to find the density of a function of a random variable. With this method the cumulative distribution function is calculated first; differentiating it then gives the probability density. With the distribution function method in hand, we can look at a few more examples. Let X be a continuous random variable with a certain probability density.

What is the probability density function of Y = X²? If you graph the function y = x² (top right), you can see that it is increasing for x ≥ 0.

In the last example, care was taken to index the cumulative functions and the probability densities with either X or Y to indicate which random variable they belonged to. For example, when finding the cumulative distribution function of Y, we expressed it through that of X. To obtain the density of Y, one then simply differentiates.
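The distribution function method can be carried through explicitly for Y = X² when X is, say, uniform on (0, 1) (an illustrative choice; the text leaves the density of X unspecified). Then F_Y(y) = P(X² ≤ y) = P(X ≤ √y) = √y for 0 ≤ y ≤ 1, and differentiating gives f_Y(y) = 1/(2√y):

```python
import math
import random

def F_Y(y):
    # Cumulative function first: F_Y(y) = P(X^2 <= y) = P(X <= sqrt(y)) = sqrt(y)
    return math.sqrt(y)

def f_Y(y):
    # Differentiating the cumulative function gives the density: 1 / (2 sqrt(y))
    return 1.0 / (2.0 * math.sqrt(y))

# Monte Carlo check of the cumulative function at y = 0.25
random.seed(0)
N = 100_000
empirical = sum(1 for _ in range(N) if random.random() ** 2 <= 0.25) / N
```

The empirical frequency of {X² ≤ 0.25} is close to F_Y(0.25) = 0.5, confirming the formula obtained by the method.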

The change-of-variable technique

Let X be a continuous random variable with density f(x), and let Y = u(X) with inverse X = v(Y). The distribution function of the continuous random variable Y is then
F_Y(y) = P(Y ≤ y) = P(u(X) ≤ y) = P(X ≤ v(y)) = F_X(v(y)).
Here the first and second equalities hold by the definition of the cumulative distribution function of Y; the third holds because, for increasing u, the event {u(X) ≤ y} coincides with {X ≤ v(y)}; and the last uses the distribution function of the continuous random variable X. Taking the derivative of F_Y(y), the cumulative distribution function of Y, then gives the probability density of Y.

Generalization to decreasing functions

Let X be a continuous random variable with density f(x) defined over c1 < x < c2.

To address such questions, quantitative data can be collected and an empirical cumulative distribution function can be used. With this information one can estimate sample means, standard deviations, medians, and so on.

Similarly, even a fairly simple probabilistic model can have a huge number of outcomes. For example, flip a coin 332 times: the number of possible flip sequences is on the order of a googol (10^100), a number at least 100 quintillion times larger than the number of elementary particles in the known universe. We are not interested in an analysis that gives an answer for every possible outcome; a simpler summary is needed, such as the number of heads or the longest run of tails. To focus on the quantities of interest, a mapping of outcomes to numbers is introduced. The definition is as follows: a random variable is a real-valued function on a probability space.

The range S of a random variable is sometimes called the state space. If X is a random variable, then so are X², exp(X), X² + 1, tan² X, ⌊X⌋, and so on. The last of these, rounding X down to the nearest whole number, is called the floor function.

Distribution functions

Once a random variable X of interest is defined, the question usually becomes: "What are the chances that X falls into some subset of values B?" For example, B = {odd numbers}, B = {greater than 1}, or B = {between 2 and 7}. We write {X ∈ B} for the set of outcomes whose value of X lies in B. In the above example, the events can be described as follows:

{X is an odd number}, {X is greater than 1} = {X > 1}, {X is between 2 and 7} = {2 < X < 7}.

Random variables and distribution functions

Thus the probability that the random variable X takes a value in an interval can be calculated by subtracting values of its distribution function; attention must be paid to whether the endpoints are included or excluded.

We call a random variable discrete if it has a finite or countably infinite state space. For example, let X be the number of heads in three independent flips of a biased coin that lands heads with probability p. We need to find the cumulative distribution function FX of the discrete random variable X. Or let X be the number of spades in a hand of three cards; then Y = X³ can be described through FX. FX starts at 0, ends at 1, and does not decrease as x increases. The cumulative distribution function FX of a discrete random variable X is constant except for jumps; at a jump, FX is right-continuous. The right-continuity of the distribution function can be proved from the continuity property of probability using the definition. By contrast, a continuous random variable has a cumulative distribution function FX with no jumps.
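For the coin example above, FX can be written down directly. A sketch, taking p = 0.6 as an assumed illustrative value:

```python
from math import comb

def pmf(k, p, n=3):
    # P(X = k) = C(n, k) p^k (1 - p)^(n - k): heads in n independent biased flips
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def cdf(x, p, n=3):
    # F_X(x) = P(X <= x): a step function, constant except for jumps at k = 0..n
    return sum(pmf(k, p, n) for k in range(n + 1) if k <= x)

p = 0.6
steps = [cdf(k, p) for k in range(4)]   # F at the jump points 0, 1, 2, 3
```

The values in `steps` are non-decreasing, start below 1, and reach exactly 1 at k = 3, matching the description of FX above.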

To show how this can happen, consider an example: a dart board of unit radius, with the dart assumed to land uniformly over its area. Another example is an exponential waiting time with some λ > 0. In both cases the distribution functions of these continuous random variables increase smoothly, and FX has all the properties of a distribution function.

A man waits at a bus stop until the bus arrives, having decided that he will give up when the wait reaches 20 minutes; let T be the time he spends at the bus stop. Here we need to find the cumulative distribution function of T. Although the cumulative distribution function is defined for every random variable, other characteristics are used quite often: the mass function for a discrete variable and the probability density function for a continuous one. Usually a random variable is specified through one of these two.

Mass Functions

Mass functions have the following general properties. The first is based on the fact that probabilities are non-negative. The second follows from the observation that the sets {X = x}, for all x in S, the state space of X, form a partition of the probability space. Example: toss a biased coin whose outcomes are independent, and continue until the first head appears. Let X denote the number of tails before the first head, and let p denote the probability of heads on any given toss.

The mass probability function then has a characteristic form: because its terms form a geometric sequence, X is called a geometric random variable. A geometric progression c, cr, cr², ..., crⁿ has a partial sum sₙ, and for |r| < 1 the partial sums sₙ have a limit as n → ∞; the infinite sum is this limit.

The mass function above forms a geometric sequence with ratio 1 − p. For natural numbers a and b, the difference of values of the distribution function equals the sum of the values of the mass function between them.
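Both claims are easy to verify numerically. A sketch for the geometric random variable with an assumed value p = 0.25:

```python
# Geometric random variable: X = number of tails before the first head,
# P(X = k) = (1 - p)^k * p for k = 0, 1, 2, ...

def pmf(k, p):
    return (1 - p) ** k * p

def cdf(x, p):
    # Summing the geometric series gives F_X(x) = 1 - (1 - p)^(x + 1), integer x >= 0
    return 1 - (1 - p) ** (int(x) + 1) if x >= 0 else 0.0

p = 0.25
total = sum(pmf(k, p) for k in range(400))        # partial sums approach 1
a, b = 2, 5
mass = sum(pmf(k, p) for k in range(a, b + 1))    # P(a <= X <= b) from the mass function
diff = cdf(b, p) - cdf(a - 1, p)                  # the same from the distribution function
```

The partial sums converge to 1 (normalization), and the difference of the distribution function reproduces the summed masses exactly.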

The density is defined as follows: X is a random variable whose distribution function FX has a derivative. The function fX satisfying FX(x) = ∫[−∞, x] fX(t) dt is called the probability density function, and X is called a continuous random variable. By the fundamental theorem of calculus, the density function is the derivative of the distribution function, and probabilities can be computed as definite integrals.

Because data are collected as multiple observations, more than one random variable at a time must be considered to model experimental procedures. We therefore consider the joint distribution of two variables X1 and X2. For discrete random variables, joint probability mass functions are defined; for continuous ones, a joint probability density fX1,X2 is considered.

Independent random variables

Two random variables X1 and X2 are independent if any two events associated with them are independent. In words, the probability that the two events {X1 ∈ B1} and {X2 ∈ B2} occur simultaneously equals the product of the probabilities that each occurs individually. For independent discrete random variables, the joint probability mass function is the product of the marginal mass functions; for independent continuous random variables, the joint probability density function is the product of the marginal densities. Finally, consider n independent observations x1, x2, ..., xn arising from an unknown density or mass function f — for example, with an unknown parameter, as for an exponential random variable describing the waiting time for a bus.

Simulation of random variables

The main goal of this theoretical field is to provide the tools needed to develop inferential procedures based on sound principles of statistical science. Thus, one very important use case for software is the ability to generate pseudo-data to mimic actual information. This makes it possible to test and improve analysis methods before having to use them in real databases. This is required in order to explore the properties of the data through modeling. For many commonly used families of random variables, R provides commands for generating them. For other circumstances, methods for modeling a sequence of independent random variables that have a common distribution will be needed.

Discrete random variables and the sample command. The sample command is used to create simple and stratified random samples. If a sequence x is entered, sample(x, 40) selects 40 records from x such that all choices of size 40 have the same probability; by default R samples without replacement. The command can also be used to model discrete random variables: provide the state space in the vector x and the mass function in a vector f. The argument replace = TRUE indicates that sampling occurs with replacement. Then, to obtain a sample of n independent random variables having common mass function f, one uses sample(x, n, replace = TRUE, prob = f).

Suppose 1 is the smallest value represented and 4 the largest. If the argument prob = f is omitted, the sample is drawn uniformly from the values in the vector x. You can check the simulation against the mass function that generated the data by using the double equals sign, ==, to count the observations taking each possible value of x, and tabulating the result. Repeat this for 1000 draws and compare the simulation with the corresponding mass function.
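The text describes R's sample command; the same experiment can be sketched in Python with random.choices (a translation assumed here, not R itself), using an illustrative mass function on the states 1..4:

```python
import random
from collections import Counter

# Analogue of R's sample(x, n, replace = TRUE, prob = f)
random.seed(42)
x = [1, 2, 3, 4]            # state space: 1 is the smallest value, 4 the largest
f = [0.1, 0.2, 0.3, 0.4]    # an illustrative mass function (an assumption)

n = 10_000
draws = random.choices(x, weights=f, k=n)   # sampling with replacement

# Tabulate the draws (the analogue of counting with == and table() in R)
counts = Counter(draws)
rel_freq = {v: counts[v] / n for v in x}    # should be close to f
```

Comparing `rel_freq` with `f` plays the role of checking the simulation against the mass function that generated the data.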

Illustrating Probability Transformation

First, simulate uniform random variables u1, u2, ..., un on the interval [0, 1]. About 10% of the numbers should fall within any subinterval of length 0.1; this corresponds to 10% of the simulations of a random variable with distribution function FX landing in the corresponding interval on the x-axis. These x-axis values can be obtained by applying the inverse of FX. If X is a continuous random variable with density fX positive everywhere on its domain, then the distribution function is strictly increasing, and FX has an inverse FX⁻¹ known as the quantile function: FX(x) ≥ u exactly when x ≥ FX⁻¹(u). The probability transformation follows from analyzing the random variable U = FX(X).

FX ranges from 0 to 1; it cannot take values below 0 or above 1, so u lies between 0 and 1. If U can be modeled, then a random variable with distribution function FX can be simulated via the quantile function. Taking the derivative shows that the density of U is constant and equal to 1; since U has constant density over the interval of its possible values, it is called uniform on that interval, and it is modeled in R with the runif command. The identity U = FX(X) is called the probability transformation. You can see how it works in the dart board example: for X between 0 and 1 the distribution function is u = FX(x) = x², and hence the quantile function is x = FX⁻¹(u) = √u. One can thus model independent observations of the distance from the center of the dart board by generating uniform random variables U1, U2, ..., Un. A plot of the distribution function against the empirical distribution function can be based on 100 such simulated throws. For an exponential random variable, u = FX(x) = 1 − exp(−x), and hence x = −ln(1 − u). Sometimes an argument consists of equivalent statements, and the two parts of the argument must be combined; a similar identity holds for intersections of events. When the union of sets Ci equals the state space S and the Ci are pairwise mutually exclusive, the probabilities can be checked against the axioms, each verification resting on the corresponding probability P for the subset in question. Using such an identity one can make sure the answer does not depend on whether the interval endpoints are included.

Exponential function and its variables

For events built from all outcomes, the second property, the continuity of probabilities, is ultimately used; it is taken as axiomatic. The distribution law of a function of a random variable shows that each such problem has its own solution and answer.

The probability distribution function of a random variable and its properties.

Consider the function F(x), defined on the entire number axis as follows: for each x, the value F(x) equals the probability that the discrete random variable ξ takes a value less than x, i.e.

(18)

This function is called the probability distribution function or, briefly, the distribution function.

Example 1. Find the distribution function of the random variable given in Example 1 of Item 1.

Solution: It is clear that if x ≤ 1, then F(x) = 0, since ξ does not take values less than one. If 1 < x ≤ 2, then F(x) = 1/6; if 2 < x ≤ 3, then F(x) = P(ξ < 3). But the event {ξ < 3} in this case is the sum of two incompatible events: {ξ = 1} and {ξ = 2}. Consequently,

So for 2 < x ≤ 3 we have F(x) = 1/3. The values of the function on the remaining intervals are calculated similarly. Finally, if x > 6, then F(x) = 1, since in this case every possible value (1, 2, 3, 4, 5, 6) is less than x. The graph of F(x) is shown in Fig. 4.
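The step function of this example (a fair die with values 1..6, each of probability 1/6) can be written out as a one-line sketch, using the convention F(x) = P(ξ < x) of formula (18):

```python
def F_die(x):
    # F(x) = P(X < x): jumps of 1/6 at each of the values 1..6
    return sum(1 for k in range(1, 7) if k < x) / 6

values = [F_die(1), F_die(1.5), F_die(3), F_die(6.5)]
# F(1) = 0, F(1.5) = 1/6, F(3) = 1/3, and F(x) = 1 for x > 6
```

Note that with the strict inequality of (18) the function is left-continuous: F(3) counts only the values 1 and 2, giving 1/3 exactly as in the solution above.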

Example 2. Find the distribution function of the random variable given in Example 2 of Item 1.

Solution: It's obvious that

The graph of F(x) is shown in Fig. 5.

Knowing the distribution function F(x), it is easy to find the probability that a random variable satisfies the inequalities x1 ≤ ξ < x2.
Consider the event that the random variable takes a value less than x2. This event breaks down into the sum of two incompatible events: 1) the random variable takes a value less than x1, i.e., ξ < x1; 2) the random variable takes a value satisfying the inequalities x1 ≤ ξ < x2. Using the addition axiom, we get

But by the definition of the distribution function F(x) [see formula (18)], we have P(ξ < x2) = F(x2) and P(ξ < x1) = F(x1); therefore,

(19)

Thus, the probability that a discrete random variable falls into an interval equals the increment of the distribution function on that interval.

Consider the main properties of the distribution function.
1°. The distribution function is non-decreasing.
Indeed, let x1 < x2. Since the probability of any event is non-negative, P(x1 ≤ ξ < x2) ≥ 0. Therefore, from formula (19) it follows that F(x2) − F(x1) ≥ 0, i.e., F(x1) ≤ F(x2).

2°. The values of the distribution function satisfy the inequalities 0 ≤ F(x) ≤ 1.
This property stems from the fact that F(x) is defined as a probability [see formula (18)]. It is clear that* F(−∞) = 0 and F(+∞) = 1.

3°. The probability that a discrete random variable takes one of the possible values ​​xi is equal to the jump in the distribution function at the point xi.
Indeed, let xi be a value taken by the discrete random variable, and let p(xi) = P(ξ = xi). Setting x1 = xi and x2 = xi + 0 in formula (19), we obtain

i.e., the value p(xi) equals the jump of the function** F(x) at the point xi. This property is clearly illustrated in Fig. 4 and Fig. 5.

*Here and in what follows, the notations F(−∞) = lim F(x) as x → −∞ and F(+∞) = lim F(x) as x → +∞ are used.
** It can be shown that F(xi) = F(xi − 0), i.e., the function F(x) is left-continuous at the point xi.

3. Continuous random variables.

In addition to discrete random variables, whose possible values form a finite or infinite sequence of numbers that does not completely fill any interval, one often encounters random variables whose possible values form a whole interval. An example of such a random variable is the deviation from the nominal value of some dimension of a part under a properly established technological process. Random variables of this kind cannot be specified by a probability distribution law p(x); they can, however, be specified by the probability distribution function F(x). This function is defined in exactly the same way as in the case of a discrete random variable:

Thus, here too the function F(x) is defined on the whole number axis, and its value at the point x equals the probability that the random variable takes a value less than x.
Formula (19) and properties 1° and 2° are valid for the distribution function of any random variable. The proof is carried out similarly to the case of a discrete quantity.
A random variable is called continuous if there exists a non-negative piecewise-continuous function* f(x) satisfying, for any value of x, the equality

Based on the geometric meaning of the integral as an area, we can say that the probability of fulfilling the inequalities x1 ≤ ξ < x2 equals the area of the curvilinear trapezoid with base [x1, x2] bounded above by the curve y = f(x) (Fig. 6).

Since F(+∞) = 1, on the basis of formula (22) we obtain

Note that for a continuous random variable the distribution function F(x) is continuous at every point x where the function f(x) is continuous. This follows from the fact that F(x) is differentiable at those points.
Based on formula (23), setting x1 = x and x2 = x + Δx, we have

Due to the continuity of the function F(x) we get that

Consequently

Thus, the probability that a continuous random variable takes any single value x is zero.
It follows from this that the events consisting in the fulfillment of each of the inequalities
x1 ≤ ξ < x2, x1 < ξ < x2, x1 < ξ ≤ x2, x1 ≤ ξ ≤ x2
have the same probability, i.e.

Indeed, for example,

Because

Comment. As we know, if an event is impossible, then the probability of its occurrence is zero. Under the classical definition of probability, when the number of test outcomes is finite, the converse also holds: if the probability of an event is zero, then the event is impossible, since in this case none of the test outcomes favors it. In the case of a continuous random variable, the number of possible values is infinite. The probability that the variable takes any particular value x1, as we have seen, equals zero. Yet it does not follow that this event is impossible, since as a result of the test the random variable may, in particular, take the value x1. Therefore, for a continuous random variable it makes sense to speak of the probability of falling into an interval, not of the probability of taking one particular value.
So, for example, in the manufacture of a roller, we are not interested in the probability that its diameter will be equal to the nominal value. For us, the probability that the diameter of the roller does not go out of tolerance is important.
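The roller example can be made concrete. Assuming, purely for illustration, that the deviation X of the diameter from the nominal size is normally distributed with standard deviation σ, the probability of staying within tolerance ±tol is P(|X| ≤ tol) = erf(tol / (σ√2)):

```python
import math

def p_within_tolerance(sigma, tol):
    # P(|X - nominal| <= tol) for a normal deviation with standard deviation sigma
    return math.erf(tol / (sigma * math.sqrt(2.0)))

# Illustrative (assumed) numbers: sigma = 0.01 mm, tolerance = +/- 0.02 mm
p = p_within_tolerance(0.01, 0.02)   # the "two-sigma" case, about 0.954
```

Note that the probability of hitting the nominal value exactly is zero (tol = 0 gives erf(0) = 0), while the probability of staying inside a tolerance band is what matters in practice — exactly the point of the comment above.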

The distribution function of a random variable X is the function F(x) expressing, for each x, the probability that the random variable X takes a value smaller than x:

Example 2.5. The distribution series of a random variable is given:

Find and graphically depict its distribution function. Solution. According to the definition

F(x) = 0 for x not exceeding the smallest value of the series; F(x) = 0.4 on the next interval; F(x) = 0.4 + 0.1 = 0.5 for 4 < x ≤ 5; F(x) = 0.5 + 0.5 = 1 for x > 5.

So (see Fig. 2.1):


Distribution function properties:

1. The distribution function of a random variable is a non-negative function enclosed between zero and one: 0 ≤ F(x) ≤ 1.

2. The distribution function of a random variable is a non-decreasing function on the entire number axis: for x2 > x1, F(x2) ≥ F(x1).

3. At minus infinity the distribution function equals zero; at plus infinity it equals one, i.e.

4. The probability that the random variable X falls in the interval [a, b] equals the definite integral of its probability density from a to b (see Fig. 2.2), i.e.


Fig. 2.2

3. The distribution function of a continuous random variable (see Fig. 2.3) can be expressed in terms of the probability density using the formula:

F(x) = ∫[−∞, x] p(t) dt. (2.10)

4. The improper integral over infinite limits of the probability density of a continuous random variable equals one:

Geometrically, properties 1 and 4 of the probability density mean that its plot, the distribution curve, lies no lower than the x-axis, and that the total area of the figure bounded by the distribution curve and the x-axis equals one.

For a continuous random variable X, the expected value M(X) and the variance D(X) are determined by the formulas:

(if the integral converges absolutely); or

(if the reduced integrals converge).
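These integral formulas can be evaluated numerically. A sketch for the Rayleigh density from the earlier example, f(x) = (x/σ²)·exp(−x²/(2σ²)), taking σ = 1 as an assumed value (the known closed forms for σ = 1 are M = √(π/2) and D = 2 − π/2):

```python
import math

def f(x, sigma=1.0):
    # Rayleigh density, the radio-engineering example; sigma = 1 assumed here
    return (x / sigma**2) * math.exp(-x**2 / (2 * sigma**2)) if x >= 0 else 0.0

def integrate(g, a, b, n=200_000):
    # midpoint rule; the tail beyond b = 12 is negligible for this density
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

M = integrate(lambda x: x * f(x), 0.0, 12.0)              # M(X) = integral of x f(x)
D = integrate(lambda x: x * x * f(x), 0.0, 12.0) - M * M  # D(X) = M(X^2) - M(X)^2
```

The numerical values match the closed forms to several decimal places, which also serves as a check that the density is normalized.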

Along with the numerical characteristics noted above, the concept of quantiles and percentage points is used to describe a random variable.

The quantile of level q (or q-quantile) is the value x_q of the random variable at which its distribution function takes the value q, i.e., F(x_q) = q.

The 100q%-point is the quantile x_(1−q).

Example 2.8.

Using Example 2.6, find the quantile x_0.3 and the 30% point of the random variable X.

Solution. By definition (2.16), F(x_0.3) = 0.3, i.e.

x_0.3 / 2 = 0.3, whence the quantile x_0.3 = 0.6. The 30% point of the random variable X is the quantile x_(1−0.3) = x_0.7, found similarly from the equation x_0.7 / 2 = 0.7, whence x_0.7 = 1.4.
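The arithmetic of this example is consistent with a distribution function F(x) = x/2 on [0, 2] (an assumption inferred from the numbers 0.3 → 0.6 and 0.7 → 1.4; the original statement of Example 2.6 is not reproduced here):

```python
def F(x):
    # Assumed distribution function matching the example: F(x) = x / 2 on [0, 2]
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x / 2

def quantile(q):
    # x_q solves F(x_q) = q, hence x_q = 2 q
    return 2 * q

x_03 = quantile(0.3)   # the quantile x_0.3
x_07 = quantile(0.7)   # the 30% point, i.e. the quantile x_0.7
```

Applying F to each quantile returns the corresponding level, which is the defining property F(x_q) = q.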

Among the numerical characteristics of a random variable one distinguishes the initial moments νk and the central moments μk of order k, determined for discrete and continuous random variables by the formulas: