Goodness-of-fit criteria in statistical innovation technologies. Pearson goodness-of-fit test

In this section we consider one of the issues related to testing the plausibility of hypotheses, namely, the consistency between theoretical and statistical distributions.

Let us assume that a given statistical distribution has been fitted (leveled) with some theoretical curve f(x) (Fig. 7.6.1). No matter how well the theoretical curve is chosen, some discrepancies between it and the statistical distribution are inevitable. The question naturally arises: are these discrepancies due only to random circumstances associated with the limited number of observations, or are they significant, i.e. related to the fact that the chosen curve fits this statistical distribution poorly? To answer this question, so-called goodness-of-fit criteria are used.




The idea behind applying the goodness-of-fit criteria is as follows.

Based on the given statistical material, we have to test the hypothesis H that the random variable X obeys some definite distribution law. This law can be given in one form or another: for example, as a distribution function F(x), as a distribution density f(x), or as a set of probabilities p_i, where p_i is the probability that the value X falls into the i-th interval.

Since the distribution function F(x) is the most general of these forms and determines any other, we formulate the hypothesis H as stating that the value X has the distribution function F(x).

To accept or reject the hypothesis H, consider some quantity U characterizing the degree of discrepancy between the theoretical and statistical distributions. The quantity U can be chosen in various ways: for example, as U one can take the sum of the squared deviations of the theoretical probabilities p_i from the corresponding frequencies p_i*, or the sum of the same squares with some coefficients ("weights"), or the maximum deviation of the statistical distribution function F*(x) from the theoretical F(x), etc. Suppose the quantity U has been chosen in one way or another. Obviously, it is a random variable. The distribution law of this random variable depends on the distribution law of the random variable X, over which the experiments were carried out, and on the number of experiments n. If the hypothesis H is true, then the distribution law of the quantity U is determined by the distribution law of the quantity X (the function F(x)) and the number n.

Let us assume that this distribution law is known to us. Suppose that, as a result of the given series of experiments, the measure of discrepancy U we have chosen took on some value u. The question is whether this can be explained by random causes, or whether the discrepancy is too large and indicates a significant difference between the theoretical and statistical distributions and, therefore, the unsuitability of the hypothesis H. To answer this question, assume that the hypothesis H is correct and, under this assumption, calculate the probability that, due to random causes associated with the insufficient amount of experimental material, the measure of discrepancy U turns out to be no less than the value u observed in the experiment, i.e. we calculate the probability of the event

P(U ≥ u).

If this probability is very small, the hypothesis H should be rejected as implausible; if this probability is significant, it should be recognized that the experimental data do not contradict the hypothesis H.

The question arises: how should the measure of discrepancy U be chosen? It turns out that for some ways of choosing it, the distribution law of the quantity U has very simple properties and, for sufficiently large n, is practically independent of the function F(x). It is precisely such measures of discrepancy that are used in mathematical statistics as goodness-of-fit criteria.

Let us consider one of the most commonly used goodness-of-fit criteria, the so-called Pearson χ² criterion.

Assume that n independent experiments have been performed, in each of which the random variable X took on a certain value. The results of the experiments are grouped into k intervals and presented in the form of a statistical series.

The null (basic) hypothesis is the hypothesis put forward about the form of the unknown distribution, or about the parameters of known distributions. A competing (alternative) hypothesis is a hypothesis that contradicts the null one.

For example, if the null hypothesis is the assumption that the random variable X is distributed according to a certain law A, then the competing hypothesis may be the assumption that the random variable X is distributed according to a different law.

A statistical criterion (or simply criterion) is a random variable K that serves to test the null hypothesis.

After a certain criterion K has been chosen, the set of all its possible values is divided into two non-overlapping subsets: one of them contains the values of the criterion under which the null hypothesis is rejected, and the other those under which it is accepted.

The critical region is the set of values of the criterion for which the null hypothesis is rejected. The acceptance region of the hypothesis is the set of values of the criterion for which the hypothesis is accepted. Critical points are the points separating the critical region from the acceptance region of the null hypothesis.

For our example, a value of the criterion calculated from the sample that falls into the acceptance region supports the hypothesis that the random variable is distributed according to the law A. If the calculated value falls into the critical region, the hypothesis about the distribution of the random variable according to the law A is rejected.

In the case of the χ² distribution, the critical region is determined by the inequality χ² > χ²_cr, and the acceptance region of the null hypothesis by the inequality χ² < χ²_cr.
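As an illustration of how a critical point divides the values of a criterion into an acceptance region and a critical region, here is a minimal Python sketch; the significance level α = 0.05, the 3 degrees of freedom and the observed value are assumptions made only for this example:

```python
from scipy.stats import chi2

alpha = 0.05   # significance level (assumed for illustration)
nu = 3         # degrees of freedom (assumed for illustration)

chi2_cr = chi2.ppf(1 - alpha, nu)  # critical point separating the two regions

k_observed = 2.1  # hypothetical criterion value computed from a sample
if k_observed > chi2_cr:
    print(f"{k_observed:.2f} > {chi2_cr:.2f}: critical region, reject the null hypothesis")
else:
    print(f"{k_observed:.2f} <= {chi2_cr:.2f}: acceptance region, do not reject it")
```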

2.6.3. Pearson's goodness-of-fit criterion.

One of the tasks of zootechnics and veterinary genetics is breeding new breeds and strains with required characteristics: for example, increased immunity, disease resistance, or a change in fur color.

In practice, when analyzing results, it often turns out that the actual results more or less correspond to some theoretical distribution law. There is a need to assess the degree of correspondence between the actual (empirical) data and the theoretical (hypothetical) ones. To do this, a null hypothesis is put forward: the resulting population is distributed according to the law A. The hypothesis about the proposed distribution law is tested using a specially selected random variable, a goodness-of-fit criterion.

A goodness-of-fit criterion is a criterion for testing the hypothesis about the assumed law of an unknown distribution.

There are several goodness-of-fit criteria: Pearson, Kolmogorov, Smirnov, etc. Pearson's goodness-of-fit test is the most commonly used.

Consider the application of the Pearson criterion using the example of testing the hypothesis that the general population follows the normal distribution law. To this end, we compare the empirical frequencies with the theoretical ones (calculated on the assumption of a normal distribution).

There is usually some difference between theoretical and empirical frequencies. For example:

Empirical frequencies 7 15 41 93 113 84 25 13 5

Theoretical frequencies 5 13 36 89 114 91 29 14 6

Consider two cases:

The discrepancy between the theoretical and empirical frequencies is random (insignificant), i.e. it is possible to adopt the assumption that the empirical frequencies are distributed according to the normal law;

The discrepancy between the theoretical and empirical frequencies is not random (significant), i.e. the theoretical frequencies were calculated on the basis of an incorrect hypothesis about the normal distribution of the general population.

With the help of Pearson's goodness-of-fit criterion it is possible to determine whether the discrepancy between the theoretical and empirical frequencies is random or not, i.e. with a given confidence probability to decide whether or not the general population is distributed according to the normal law.

So, let the empirical distribution be obtained for a sample of size n:

variants: x_1, x_2, …, x_k;

empirical frequencies: n_1, n_2, …, n_k.

Let us assume that the theoretical frequencies n_i' have been calculated under the assumption of a normal distribution. At the significance level α it is required to test the null hypothesis: the general population is normally distributed.

As a criterion for testing the null hypothesis, we take the random variable

χ² = Σ (n_i − n_i')² / n_i',   (*)

where n_i are the empirical and n_i' the theoretical frequencies.

This value is random, since it takes different, previously unknown values in different experiments. Clearly, the less the empirical and theoretical frequencies differ, the smaller the value of the criterion; consequently, it characterizes to a certain extent the closeness of the empirical and theoretical distributions.

It is proved that, as n → ∞, the distribution law of the random variable (*), regardless of the distribution law of the general population, tends to the χ² distribution with ν degrees of freedom. For this reason the random variable (*) is denoted by χ², and the criterion itself is called the "chi-square" goodness-of-fit test.

Let us denote the value of the criterion calculated from the observational data by χ²_obs, and the tabulated critical value of the criterion for a given significance level α and number of degrees of freedom ν by χ²_cr. The number of degrees of freedom is determined from the equality ν = k − 1 − r, where k is the number of groups (partial intervals or classes) of the sample and r is the number of parameters of the proposed distribution. The normal distribution has two parameters, the mathematical expectation and the standard deviation; therefore, the number of degrees of freedom for a normal distribution is found from the equality ν = k − 3.

If the calculated and tabulated values satisfy the inequality χ²_obs < χ²_cr, the null hypothesis of a normal distribution of the general population is accepted. If χ²_obs ≥ χ²_cr, the null hypothesis is rejected and the alternative hypothesis (the general population is not normally distributed) is accepted.

Comment. When using Pearson's goodness-of-fit test, the sample size must be at least 30, and each group must contain at least 5 variants. Groups with fewer than 5 frequencies are combined with neighbouring groups.
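As a rough illustration of the scheme just described, the sketch below applies criterion (*) to the empirical and theoretical frequencies quoted earlier in this section; the significance level α = 0.05 is an assumption:

```python
import numpy as np
from scipy.stats import chi2

# Empirical and theoretical frequencies from the example above
n_emp = np.array([7, 15, 41, 93, 113, 84, 25, 13, 5])
n_theo = np.array([5, 13, 36, 89, 114, 91, 29, 14, 6])

# Criterion (*): chi2 = sum (n_i - n_i')^2 / n_i'
chi2_obs = np.sum((n_emp - n_theo) ** 2 / n_theo)

# For a normal law, nu = k - 3 (k groups, 2 estimated parameters)
k = len(n_emp)
nu = k - 3

alpha = 0.05                        # assumed significance level
chi2_cr = chi2.ppf(1 - alpha, nu)   # tabulated critical value

print(f"chi2_obs = {chi2_obs:.2f}, chi2_cr = {chi2_cr:.2f}")
if chi2_obs < chi2_cr:
    print("The discrepancy may be considered random: H0 is not rejected.")
else:
    print("The discrepancy is significant: H0 is rejected.")
```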

In general, the number of degrees of freedom for a chi-square distribution is defined as the total number of values from which the corresponding measures are calculated, minus the number of conditions that link these values, i.e. reduce the possibility of variation between them. In the simplest cases the number of degrees of freedom equals the number of classes reduced by one. So, for example, dihybrid splitting gives 4 classes, but only the first class is obtained independently; the subsequent ones are already linked to the previous ones. Therefore, for dihybrid splitting the number of degrees of freedom is ν = 4 − 1 = 3.

Example 1. Determine the degree of correspondence between the actual distribution of groups by the number of cows with tuberculosis and the theoretically expected one, calculated on the assumption of a normal distribution. The initial data are summarized in the table:

Solution.

For the significance level α and the number of degrees of freedom ν, from the table of critical points of the χ² distribution (see Appendix 4) we find χ²_cr. Since χ²_obs < χ²_cr, we can conclude that the difference between the theoretical and actual frequencies is random. Thus, the actual distribution of groups by the number of cows with tuberculosis corresponds to the theoretically expected one.

Example 2. The theoretical distribution by phenotype of individuals obtained in the second generation by dihybrid crossing of rabbits according to Mendel's law is 9 : 3 : 3 : 1. It is required to assess how well the empirical distribution fits for rabbits obtained by crossing black individuals with normal hair and downy albino animals. In the second generation, 120 offspring were obtained: 45 black short-haired, 30 black downy, 25 white short-haired, and 20 white downy rabbits.

Solution. The theoretically expected segregation in the offspring should correspond to the ratio of four phenotypes 9 : 3 : 3 : 1. Calculate the theoretical frequencies (numbers of head) for each class: since 9 + 3 + 3 + 1 = 16, we can expect black short-haired 120 · 9/16 = 67.5; black downy 120 · 3/16 = 22.5; white short-haired 120 · 3/16 = 22.5; white downy 120 · 1/16 = 7.5.

The empirical (actual) phenotypic distribution was as follows: 45; 30; 25; 20.

Let's summarize all this data in the following table:

Using Pearson's goodness-of-fit test, we calculate the value of χ²_obs:

χ²_obs = (45 − 67.5)²/67.5 + (30 − 22.5)²/22.5 + (25 − 22.5)²/22.5 + (20 − 7.5)²/7.5 ≈ 7.5 + 2.5 + 0.28 + 20.83 ≈ 31.1.

The number of degrees of freedom in a dihybrid cross is ν = 4 − 1 = 3. For the significance level α = 0.05 we find χ²_cr ≈ 7.8 (the conclusion is the same for any customary level). Since χ²_obs ≈ 31.1 > χ²_cr, we can conclude that the difference between the theoretical and actual frequencies is not accidental. Consequently, the resulting group of rabbits deviates, in terms of the distribution of phenotypes, from Mendel's law for dihybrid crossing and reflects the influence of certain factors that change the type of splitting in the phenotype in the second generation of hybrids.
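The arithmetic of Example 2 can be checked with SciPy's chisquare function (a sketch; the expected counts follow the 9 : 3 : 3 : 1 ratio as in the solution above):

```python
from scipy.stats import chisquare

observed = [45, 30, 25, 20]   # black short-haired, black downy, white short-haired, white downy
ratio = [9, 3, 3, 1]          # Mendelian dihybrid ratio
n = sum(observed)             # 120 offspring
expected = [n * r / 16 for r in ratio]   # [67.5, 22.5, 22.5, 7.5]

stat, p_value = chisquare(observed, f_exp=expected)   # dof = 4 - 1 = 3 by default
print(f"chi2 = {stat:.2f}, p = {p_value:.5f}")
# chi2 is about 31.1, far above the critical value 7.8 at alpha = 0.05,
# so the deviation from the 9 : 3 : 3 : 1 splitting is significant.
```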

Pearson's chi-square goodness-of-fit test can also be used to compare two homogeneous empirical distributions, i.e. distributions with the same class boundaries. The null hypothesis here is that the two unknown distribution functions are equal. In such cases the chi-square statistic is computed by the formula

χ² = n1 n2 Σ (n1i/n1 − n2i/n2)² / (n1i + n2i),   (**)

where n1 and n2 are the volumes (sizes) of the compared distributions, and n1i, n2i are the frequencies of the corresponding classes.

Consider a comparison of two empirical distributions using the following example.

Example 3. The length of cuckoo eggs was measured in two territorial zones. In the first zone a sample of 76 eggs was examined (n1 = 76), in the second a sample of 54 (n2 = 54). The following results were obtained:

(Table: egg length in mm with the class frequencies of the two samples; several classes of the second sample are empty.)

At the significance level α it is required to test the null hypothesis that both samples of eggs belong to the same cuckoo population.
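Since the frequency table of Example 3 is not reproduced above, the sketch below only illustrates how formula (**) could be computed; the class frequencies freq1 and freq2 are invented for the illustration and are not the data of the example:

```python
import numpy as np
from scipy.stats import chi2

def chi2_two_samples(freq1, freq2):
    # Statistic (**) for two empirical distributions with the same class boundaries
    f1, f2 = np.asarray(freq1, float), np.asarray(freq2, float)
    n1, n2 = f1.sum(), f2.sum()
    mask = (f1 + f2) > 0                      # skip classes empty in both samples
    f1, f2 = f1[mask], f2[mask]
    return n1 * n2 * np.sum((f1 / n1 - f2 / n2) ** 2 / (f1 + f2))

# Invented class frequencies (sums give n1 = 76 and n2 = 54 as in the example)
freq1 = [6, 14, 22, 18, 10, 6]
freq2 = [4, 10, 15, 13, 8, 4]

stat = chi2_two_samples(freq1, freq2)
nu = len(freq1) - 1                            # k - 1 degrees of freedom
chi2_cr = chi2.ppf(0.95, nu)                   # alpha = 0.05 assumed
print(f"chi2 = {stat:.2f}, critical value = {chi2_cr:.2f}")
```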

Introduction

The relevance of this topic is as follows: while studying the basics of biostatistics, we assumed that the distribution law of the general population is known. But what if the distribution law is unknown, yet there is reason to assume it has a certain form (call it A)? Then the null hypothesis is tested: the general population is distributed according to the law A. This hypothesis is tested using a specially selected random variable, a goodness-of-fit criterion.

Goodness-of-fit tests are criteria for testing hypotheses about the correspondence of the empirical distribution to the theoretical probability distribution. These criteria fall into two categories:

  • General goodness-of-fit criteria apply to the most general formulation of a hypothesis, namely the hypothesis that the observed results agree with any a priori assumed probability distribution.
  • Special goodness-of-fit tests imply special null hypotheses that formulate agreement with a certain form of probability distribution.

Goodness-of-fit criteria

The most common goodness-of-fit tests are the omega-square, chi-square, Kolmogorov and Kolmogorov-Smirnov criteria.

The non-parametric Kolmogorov, Smirnov, and omega-square goodness-of-fit tests are widely used. However, they are also associated with widespread errors in the application of statistical methods.

The fact is that the listed criteria were developed to test agreement with a fully known theoretical distribution. Calculation formulas, distribution tables and critical values are widely available. The main idea of the Kolmogorov, omega-square and similar criteria is to measure the distance between the empirical distribution function and the theoretical distribution function. These criteria differ in the form of the distance in the space of distribution functions.

Pearson's χ² goodness-of-fit test for a simple hypothesis

K. Pearson's theorem refers to independent trials with a finite number of outcomes, i.e. to Bernoulli trials (in a somewhat extended sense). It allows one to judge whether the frequencies of these outcomes observed in a large number of trials are consistent with their hypothesized probabilities.

In many practical problems, the exact distribution law is unknown. Therefore, a hypothesis is put forward about the correspondence of the existing empirical law, built on the basis of observations, to some theoretical one. This hypothesis requires statistical testing, the results of which will either be confirmed or refuted.

Let X be the random variable under study. It is required to test the hypothesis H0 that this random variable obeys the distribution law F(x). To do this, it is necessary to draw a sample of n independent observations and build from it an empirical distribution law F*(x). To compare the empirical and hypothetical laws, a rule called a goodness-of-fit criterion is used. One of the most popular is K. Pearson's chi-square criterion, in which the statistic

χ² = n Σ_{i=1}^{N} (p_i^e − p_i^t)² / p_i^t

is calculated, where N is the number of intervals over which the empirical distribution law was built (the number of columns of the corresponding histogram), i is the number of an interval, p_i^t is the probability that the value of the random variable falls into the i-th interval under the theoretical distribution law, and p_i^e is the corresponding empirical interval frequency. This statistic must obey the chi-square distribution.

If the calculated value of the statistic exceeds the quantile of the chi-square distribution with k − p − 1 degrees of freedom at the given significance level, the hypothesis H0 is rejected; otherwise it is accepted at the given significance level. Here k is the number of intervals and p is the number of estimated parameters of the distribution law.
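A compact sketch of this decision rule; the sample size, the interval probabilities and the empirical frequencies below are all assumptions made for the illustration:

```python
import numpy as np
from scipy.stats import chi2

n = 200                                           # sample size (assumed)
p_theor = np.array([0.1, 0.2, 0.4, 0.2, 0.1])     # theoretical interval probabilities (assumed)
p_emp = np.array([0.12, 0.18, 0.41, 0.19, 0.10])  # empirical interval frequencies (assumed)

# chi2 = n * sum (p_e - p_t)^2 / p_t
stat = n * np.sum((p_emp - p_theor) ** 2 / p_theor)

p_estimated = 0                        # simple hypothesis: no parameters estimated
nu = len(p_theor) - p_estimated - 1    # degrees of freedom
alpha = 0.05
quantile = chi2.ppf(1 - alpha, nu)

print("reject H0" if stat > quantile else "do not reject H0 at this level")
```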

Let us look at the statistic more closely:

χ² = Σ_{i=1}^{r} (m_i − n p_i)² / (n p_i).

This statistic χ² is called Pearson's chi-square statistic for the simple hypothesis.

It is clear that χ² is the square of a certain distance between two r-dimensional vectors: the relative frequency vector (m_1/n, …, m_r/n) and the probability vector (p_1, …, p_r). This distance differs from the Euclidean distance only in that the different coordinates enter it with different weights.

Let us discuss the behavior of the χ² statistic when the hypothesis H is true and when H is false. If H is true, the asymptotic behavior of χ² as n → ∞ is given by K. Pearson's theorem. To understand what happens when H is false, note that, by the law of large numbers, m_i/n tends to the true probability p_i* of the i-th outcome as n → ∞, for i = 1, …, r. Therefore, as n → ∞,

χ²/n → Σ_{i=1}^{r} (p_i* − p_i)² / p_i.

If H is true, this limit is equal to 0; if H is false, it is strictly positive, and therefore χ² → ∞ as n → ∞.

It follows from what has been said that H should be rejected if the value of χ² obtained in the experiment is too large. Here, as always, "too large" means that the observed value of χ² exceeds the critical value, which in this case can be taken from the chi-square distribution tables. In other words, the probability P(χ² ≥ χ²_cr) is small, and it is therefore unlikely to obtain by chance a discrepancy between the frequency vector and the probability vector equal to, or even greater than, the one observed in the experiment.

The asymptotic nature of K. Pearson's theorem, which underlies this rule, requires caution in its practical use. It can be relied upon only for large n. To judge whether n is large enough, one must take into account the probabilities p_1, …, p_r. It cannot be said, for example, that one hundred observations will always suffice, since not only must n be large, but the products np_1, …, np_r (the expected frequencies) must not be small either. The problem of approximating the discrete distribution of the χ² statistic by the continuous χ² distribution therefore turned out to be difficult. A combination of theoretical and experimental arguments led to the belief that this approximation is applicable if all the expected frequencies satisfy np_i ≥ 10; as the number r of distinct outcomes increases, the limit for np_i can be lowered (to 5 or even to 3 if r is of the order of several tens). To meet these requirements, in practice it is sometimes necessary to combine several outcomes, i.e. to pass to a Bernoulli scheme with smaller r, as the sketch below illustrates.
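A possible sketch of such merging: adjacent outcomes are combined until every expected frequency reaches the chosen limit (the limit of 5 and the toy frequencies are assumptions):

```python
import numpy as np

def merge_bins(observed, expected, min_expected=5.0):
    # Combine adjacent bins until each expected count reaches min_expected
    obs, exp = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            obs.append(acc_o)
            exp.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0:              # fold any small tail into the last bin
        obs[-1] += acc_o       # (assumes at least one full bin was formed)
        exp[-1] += acc_e
    return np.array(obs), np.array(exp)

obs, exp = merge_bins([1, 3, 9, 20, 9, 2, 1], [2, 4, 10, 18, 10, 4, 2])
print(obs)   # [ 4.  9. 20.  9.  3.]
print(exp)   # [ 6. 10. 18. 10.  6.]
```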

The described method of checking agreement can be applied not only to Bernoulli trials but also to random samples. The observations must first be converted into Bernoulli trials by grouping: the observation space is divided into a finite number of non-overlapping regions, and the observed frequency and the hypothetical probability are then calculated for each region.

In this case, one more difficulty is added to those listed above: the choice of a reasonable partition of the original space. Care must be taken that the rule for testing the hypothesis about the original distribution of the sample is sufficiently sensitive to the possible alternatives. Finally, I note that statistical criteria based on reduction to the Bernoulli scheme are, as a rule, not consistent against all alternatives, so this method of checking agreement is of limited value.

The Kolmogorov-Smirnov goodness-of-fit test in its classical form is more powerful than the χ² test and can be used to test the hypothesis that the empirical distribution corresponds to any theoretical continuous distribution F(x) with known parameters. The latter circumstance imposes restrictions on the wide practical application of this criterion in analyzing the results of mechanical tests, since the parameters of the distribution function of mechanical-property characteristics are, as a rule, estimated from the data of the sample itself.

The Kolmogorov-Smirnov criterion is used for ungrouped data, or for grouped data when the interval width is small (for example, equal to the scale division of a force gauge, a load-cycle counter, etc.). Let the results of testing a series of n specimens form a variation series of mechanical-property characteristics

x_1 ≤ x_2 ≤ … ≤ x_i ≤ … ≤ x_n. (3.93)

It is required to test the null hypothesis that the sample distribution (3.93) belongs to the theoretical law F(x).

The Kolmogorov-Smirnov criterion is based on the distribution of the maximum deviation of the accumulated relative frequency from the value of the distribution function. When using it, one calculates the statistic

D_n = max |F*(x_i) − F(x_i)|,

which is the statistic of the Kolmogorov test. If the inequality

D_n √n ≤ λ_β (3.97)

for large sample sizes (n > 35) or

D_n (√n + 0.12 + 0.11/√n) ≤ λ_β (3.98)

for n ≤ 35, the null hypothesis is not rejected.

If inequality (3.97) or (3.98) is not satisfied, the alternative hypothesis is accepted: the sample (3.93) belongs to an unknown distribution.

The critical values of λ_β are: λ_0.1 = 1.22; λ_0.05 = 1.36; λ_0.01 = 1.63.
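A sketch of check (3.97) for a fully specified theoretical law; the standard normal F(x), the generated data and the level 0.05 are assumptions of the illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.sort(rng.normal(size=100))   # n > 35, so inequality (3.97) applies

# D_n: maximum deviation of the empirical CDF from the hypothesized F(x)
n = len(x)
theor = norm.cdf(x)                 # fully known theoretical law (standard normal assumed)
d_plus = np.max(np.arange(1, n + 1) / n - theor)
d_minus = np.max(theor - np.arange(0, n) / n)
d_n = max(d_plus, d_minus)

lam = 1.36                          # lambda_0.05 from the table of critical values above
if d_n * np.sqrt(n) <= lam:
    print("The null hypothesis is not rejected.")
else:
    print("The null hypothesis is rejected.")
```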

If the parameters of the function F(x) are not known in advance but are estimated from the sample data, the Kolmogorov-Smirnov criterion loses its universality and can only be used to check the agreement of the experimental data with certain specific distribution functions.

When the null hypothesis is that the experimental data belong to a normal or log-normal distribution, the statistic

D_n = max |F*(x_i) − Φ(z_i)|

is calculated, where Φ(z_i) is the value of the Laplace (normal distribution) function at z_i = (x_i − x̄)/s. The Kolmogorov-Smirnov criterion for any sample size n is then written as

D_n (√n + 0.12 + 0.11/√n) ≤ λ_β.

The critical values of λ_β in this case are: λ_0.1 = 0.82; λ_0.05 = 0.89; λ_0.01 = 1.04.

If the hypothesis being tested is that the sample follows the exponential distribution, whose parameter is estimated from the experimental data, a similar statistic is calculated,


and the Kolmogorov-Smirnov criterion is formed in the same way.

The critical values of λ_β for this case are: λ_0.1 = 0.99; λ_0.05 = 1.09; λ_0.01 = 1.31.

To test the hypothesis that an empirical distribution corresponds to a theoretical distribution law, special statistical indicators are used: goodness-of-fit criteria (also called compliance criteria). These include the criteria of Pearson, Kolmogorov, Romanovsky, Yastremsky, etc. Most goodness-of-fit criteria are based on the deviations of the empirical frequencies from the theoretical ones. Obviously, the smaller these deviations, the better the theoretical distribution matches (describes) the empirical one.

Goodness-of-fit criteria are criteria for testing hypotheses about the correspondence of an empirical distribution to a theoretical probability distribution. Such criteria are divided into two classes: general and special. General goodness-of-fit criteria apply to the most general formulation of a hypothesis, namely the hypothesis that the observed results agree with any a priori assumed probability distribution. Special goodness-of-fit criteria imply special null hypotheses that formulate agreement with a certain form of probability distribution.

Based on the established distribution law, goodness-of-fit criteria make it possible to establish when the discrepancies between the theoretical and empirical frequencies should be recognized as insignificant (random) and when as significant (non-random). Thus, goodness-of-fit criteria make it possible to reject or confirm the hypothesis put forward when leveling the series about the nature of the distribution in the empirical series, and to answer whether a model expressed by some theoretical distribution law can be accepted for the given empirical distribution.

Pearson's goodness-of-fit test χ² (chi-square) is one of the main goodness-of-fit criteria. It was proposed by the English mathematician Karl Pearson (1857-1936) to assess the randomness (significance) of the discrepancies between the frequencies of empirical and theoretical distributions:

χ² = Σ (f − f')² / f',

where f are the empirical and f' the theoretical frequencies.

The scheme for applying the χ² criterion to assess the consistency of theoretical and empirical distributions is as follows:

1. The calculated measure of discrepancy χ²_calc is determined.

2. The number of degrees of freedom ν is determined.

3. Using a special table, the critical value χ²_table is found for the significance level α and the number of degrees of freedom ν.

4. If χ²_calc > χ²_table, then for the given significance level α and number of degrees of freedom ν the hypothesis of insignificance (randomness) of the discrepancies is rejected. Otherwise the hypothesis can be recognized as not contradicting the obtained experimental data, and with probability (1 − α) it can be asserted that the discrepancies between the theoretical and empirical frequencies are random.

The significance level is the probability of erroneously rejecting the hypothesis put forward, i.e. the probability that a correct hypothesis will be rejected. Depending on the importance and responsibility of the problems being solved, the following three significance levels are used in statistical studies:

1) α = 0.1, then P = 0.9;

2) α = 0.05, then P = 0.95;

3) α = 0.01, then P = 0.99.

When using the goodness-of-fit criterion χ², the following conditions must be observed:

1. The volume of the studied population must be sufficiently large (N ≥ 50), and the frequency or size of each group must be at least 5. If this condition is violated, small frequencies (less than 5) must first be merged.

2. The empirical distribution should consist of data obtained as a result of random selection, i.e. they must be independent.

A disadvantage of Pearson's goodness-of-fit criterion is the loss of part of the initial information, associated with the need to group the observation results into intervals and to combine individual intervals with a small number of observations. In this regard, it is recommended to supplement the check of distribution correspondence by the χ² criterion with other criteria. This is especially necessary when the sample size is relatively small (n ≈ 100).

In statistics, Kolmogorov's goodness-of-fit test (also known as the Kolmogorov-Smirnov goodness-of-fit test) is used to determine whether two empirical distributions obey the same law, or whether an obtained distribution obeys a proposed model. The Kolmogorov criterion is based on determining the maximum difference between the accumulated frequencies or relative frequencies of the empirical and theoretical distributions. The Kolmogorov criterion is calculated by the following formulas:

λ = D / √N  and  λ = d √N,

where D and d are, respectively, the maximum difference between the accumulated frequencies (f − f') and between the accumulated relative frequencies (p − p') of the empirical and theoretical distribution series, and N is the number of units in the population.

Having calculated the value of λ, one determines from a special table the probability with which it can be asserted that the deviations of the empirical frequencies from the theoretical ones are random. If λ takes values up to 0.3, this means a practically complete coincidence of the frequencies. With a large number of observations the Kolmogorov test is able to detect any deviation from the hypothesis: any difference of the sample distribution from the theoretical one will be detected if there are enough observations. The practical significance of this property is small, since in most cases it is difficult to count on obtaining a large number of observations under constant conditions, the theoretical idea of the distribution law that the sample must obey is always approximate, and the accuracy of statistical checks should not exceed the accuracy of the chosen model.
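A small sketch of this calculation from accumulated relative frequencies; the two cumulative series and N are invented for the illustration (scipy.special.kolmogorov gives the probability of exceeding a given λ):

```python
import numpy as np
from scipy.special import kolmogorov

# Accumulated relative frequencies of the empirical and theoretical series (invented)
p_emp = np.array([0.10, 0.28, 0.55, 0.80, 0.95, 1.00])
p_theor = np.array([0.08, 0.30, 0.52, 0.78, 0.94, 1.00])
N = 100                                   # number of units in the population (assumed)

d = np.max(np.abs(p_emp - p_theor))       # max difference of accumulated relative frequencies
lam = d * np.sqrt(N)                      # lambda = d * sqrt(N)

print(f"lambda = {lam:.2f}, P(lambda) = {kolmogorov(lam):.3f}")
# lambda = 0.30 here and the probability is practically 1:
# the frequencies essentially coincide.
```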

Romanovsky's goodness-of-fit criterion is based on the use of Pearson's criterion, i.e. the already found value of χ², and the number of degrees of freedom:

C = (χ² − ν) / √(2ν),

where ν is the number of degrees of freedom of variation.

The Romanovsky criterion is convenient in the absence of χ² tables. If C < 3, the discrepancies between the distributions are random; if C > 3, they are not random, and the theoretical distribution cannot serve as a model for the empirical distribution under study.

B. S. Yastremsky used in his goodness-of-fit criterion not the number of degrees of freedom but the number of groups (k), a special value q that depends on the number of groups, and the chi-square value. Yastremsky's criterion has the same meaning as the Romanovsky criterion and is expressed by the formula

L = |χ² − k| / √(2k + 4q),

where χ² is Pearson's goodness-of-fit criterion, k is the number of groups, and q is a coefficient equal to 0.6 for a number of groups less than 20.

If L_fact > 3, the discrepancies between the theoretical and empirical distributions are not random, i.e. the empirical distribution does not meet the requirements of a normal distribution. If L_fact < 3, the discrepancies between the empirical and theoretical distributions are considered random.
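A sketch of both auxiliary criteria under the formulas reconstructed above; the χ², ν and k values are placeholders:

```python
import math

def romanovsky(chi2_value, nu):
    # C = (chi2 - nu) / sqrt(2 * nu)
    return (chi2_value - nu) / math.sqrt(2 * nu)

def yastremsky(chi2_value, k, q=0.6):
    # L = |chi2 - k| / sqrt(2k + 4q); q = 0.6 for fewer than 20 groups
    return abs(chi2_value - k) / math.sqrt(2 * k + 4 * q)

chi2_value, nu, k = 4.2, 6, 9           # placeholder values
print(romanovsky(chi2_value, nu) < 3)   # True: discrepancies are random
print(yastremsky(chi2_value, k) < 3)    # True: discrepancies are random
```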

By processing independent measurements of the random variable ξ, we can construct a statistical distribution function F*(x). From the form of this function one can put forward the hypothesis that the true theoretical distribution function is F(x). The independent measurements (x_1, x_2, …, x_n) forming the sample can then be considered as identically distributed random variables with the hypothetical distribution function F(x).

Obviously, there will be some discrepancies between the functions F*(x) and F(x). The question arises whether these discrepancies are due to the limited sample size or to the fact that our hypothesis is incorrect, i.e. the actual distribution function is not F(x) but some other one. To resolve this question, goodness-of-fit criteria are used, the essence of which is as follows. A certain quantity Δ(F, F*) is chosen that characterizes the degree of discrepancy between F*(x) and F(x), for example Δ(F, F*) = sup_x |F(x) − F*(x)|, the least upper bound over x of the modulus of the difference.

Assuming the hypothesis is correct, i.e. knowing the distribution function F(x), one can find the distribution law of the random variable Δ(F, F*) (we shall not touch on how to do this). We choose a number p_0 so small that the event {Δ(F, F*) > Δ_0} with this probability will be considered practically impossible. From the condition

∫_{Δ_0}^{∞} f(x) dx = p_0

we find the value Δ_0. Here f(x) is the distribution density of Δ(F, F*).

Let us now calculate the value Δ(F, F*) = Δ_1 from the results of the sample, i.e. find one of the possible values of the random variable Δ(F, F*). If Δ_1 ≥ Δ_0, this means that a practically impossible event has occurred, which can be explained by the fact that our hypothesis is incorrect. So, if Δ_1 ≥ Δ_0, the hypothesis is rejected; if Δ_1 < Δ_0, the hypothesis may still be incorrect, but the probability of this is small.

Various quantities can be taken as the measure of discrepancy Δ(F, F*). Depending on this choice, different goodness-of-fit criteria are obtained: for example, the Kolmogorov, von Mises, or Pearson (chi-square) criteria.

Let the results of n measurements be presented as a grouped statistical series with k bins.

Suppose the bins are (x_0, x_1), …, (x_6, x_7); in fact, we assume that the measurement errors are distributed uniformly over a certain segment. Then the probability of falling into each of the seven bins is the same, p_i = 1/7. Using the grouped series from §11, we calculate Δ(F, F*) = Δ_1 by formula (1).

Since the hypothetical distribution law includes two unknown parameters, α and β (the beginning and the end of the segment), the number of degrees of freedom is 7 − 1 − 2 = 4. From the chi-square distribution table, with the chosen probability p_0 = 10⁻³, we find Δ_0 ≈ 18. Since Δ_1 > Δ_0, the hypothesis of a uniform distribution of the measurement error has to be rejected.
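The table lookup in this example can be reproduced numerically; chi2.ppf is SciPy's quantile function (a sketch):

```python
from scipy.stats import chi2

p0 = 1e-3
nu = 7 - 1 - 2                    # 7 bins, 2 estimated parameters
delta0 = chi2.ppf(1 - p0, nu)
print(f"Delta_0 = {delta0:.1f}")  # about 18.5, matching the Delta_0 = 18 quoted above
```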