Relative linear deviation in Excel. Variance calculation in Microsoft Excel

The STDEV.S function returns the standard deviation calculated for a specified range of numeric values that form a sample.

The STDEV.P function returns the standard deviation of a set of numeric values, assuming that the values passed in are the entire population, not a sample.

The STDEVA function returns the standard deviation for a range of values that are a sample, not the entire population; text and logical values in the range are included in the calculation.

STDEVPA returns the standard deviation for the entire population passed as its arguments, likewise including text and logical values.

Examples of using STDEV.S, STDEV.P, STDEVA, and STDEVPA

Example 1. The company has two customer acquisition managers. Data on the number of clients served per day by each manager is recorded in an Excel spreadsheet. Determine which of the two employees works more efficiently.

Initial data table:

First, let's calculate the average number of clients that managers worked with daily:

AVERAGE(B2:B11)

This function calculates the arithmetic mean for the range B2:B11 containing the number of clients received daily by the first manager. Similarly, we calculate the average number of clients per day for the second manager. We get:

Based on the obtained values, both managers appear to work about equally efficiently. However, the first manager's daily client counts are visibly widely scattered. Let's calculate the standard deviation using the formula:


STDEV.S(B2:B11)

B2:B11 - the range of the studied values. Similarly, we determine the standard deviation for the second manager and get the following results:


As you can see, the first manager's figures vary widely, so the arithmetic mean alone does not reflect the real picture of his work efficiency. The second manager's deviation of 1.2 indicates more stable and, therefore, more efficient work.
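The comparison in Example 1 can be sketched outside Excel as well. The snippet below is only an illustration: the daily client counts are hypothetical (the article's table is not reproduced here), and Python's statistics.stdev is assumed to play the role of STDEV.S because both divide by n - 1.

```python
from statistics import mean, stdev

# Hypothetical daily client counts for ten working days (not the article's data).
manager_1 = [2, 9, 3, 8, 1, 10, 4, 7, 2, 9]
manager_2 = [5, 6, 5, 6, 5, 6, 5, 6, 5, 6]

for name, clients in (("Manager 1", manager_1), ("Manager 2", manager_2)):
    # mean() mirrors AVERAGE; stdev() mirrors STDEV.S (n - 1 in the denominator).
    print(f"{name}: mean = {mean(clients):.2f}, stdev = {stdev(clients):.2f}")
```

Both lists have the same average, so only the standard deviation separates the erratic series from the steady one.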



An example of using the STDEVA function in Excel

Example 2. In two different groups of college students, an exam was held in the same discipline. Assess student performance.

Initial data table:

Let's determine the standard deviation of the values for the first group using the formula:


STDEVA(A2:A11)

Let's make a similar calculation for the second group. As a result, we get:


The obtained values indicate that the students of the second group were much better prepared for the exam, since the spread of their grades is relatively small. Note that the STDEVA function converts a text value in the range (here, the mark for an exam that was not passed) to the numeric value 0 (zero) and takes it into account in the calculations.
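The text-as-zero conversion described above can be sketched as follows. This is a rough imitation of the documented STDEVA coercion rules, not Excel's implementation; the function name stdeva and the grade list are hypothetical.

```python
from statistics import stdev

def stdeva(values):
    """Sample standard deviation with STDEVA-like coercion (a sketch)."""
    numbers = []
    for v in values:
        if v is None:              # empty cell: skipped
            continue
        if isinstance(v, bool):    # TRUE counts as 1, FALSE as 0
            numbers.append(1.0 if v else 0.0)
        elif isinstance(v, str):   # text in a referenced cell counts as 0
            numbers.append(0.0)
        else:
            numbers.append(float(v))
    return stdev(numbers)

grades = [5, 4, "fail", 3, 5, 4]   # hypothetical exam results
print(round(stdeva(grades), 3))
```

Because the text entry becomes a zero, a single failed exam widens the spread noticeably compared with simply leaving that student out.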

Example of the STDEV.P function in Excel

Example 3. Determine the effectiveness of preparing students for the exam for all groups of the university.

Note: unlike the previous example, not a sample (several groups) will be analyzed, but the entire number of students - the general population. Students who fail the exam are not counted.

Fill in the data table:

To evaluate the effectiveness, we will operate with two indicators: the average score and the spread of values. To determine the arithmetic mean, we use the function:

AVERAGE(B2:B21)

To determine the deviation, we introduce the formula:


STDEV.P(B2:B21)

As a result, we get:


The data obtained indicates a performance slightly below average (<4); the size of the spread reflects the fairly large number of students who received grades of 5 and 3 (keeping in mind that only grades in the range from 3 to 5 were analyzed).

Example of STDEVPA function in Excel

Example 4. Analyze students' performance based on the results of passing the exam, taking into account those students who failed to pass this exam.

Datasheet:

In this example, we are also analyzing the population, but some of the data fields contain text values. To determine the standard deviation, we use the function:


STDEVPA(B2:B21)

As a result, we get:

A high spread of values ​​in the sequence indicates a large number of students who did not pass the exam.

Features of using STDEV.S, STDEV.P, STDEVA and STDEVPA

The STDEVA and STDEVPA functions share the same syntax:

FUNCTION(value1; [value2];…)

Description:

  • FUNCTION - one of the two functions named above;
  • value1 is a required argument: one of the values of the sample (or of the population);
  • [value2] is an optional argument: the second value of the studied range.

Notes:

  1. Names, numeric values, arrays, references to ranges of numeric data, logical values and references to them can be passed as arguments to the functions.
  2. Both functions ignore empty cells in the passed range. Text values in a referenced range are counted as 0, and the logical values TRUE and FALSE as 1 and 0.
  3. The functions return the #VALUE! error code if error values or text that cannot be converted to a number are passed as arguments.

The STDEV.S and STDEV.P functions have the following syntax:

FUNCTION(number1,[number2],…)

Description:

  • FUNCTION - either of the STDEV.S or STDEV.P functions;
  • number1 is a required argument: a numeric value taken from the sample or from the entire population;
  • [number2] is an optional argument: the second numeric value of the studied range.

Note: neither function counts numbers stored as text or the logical values TRUE and FALSE found in a referenced range.

Notes:

  1. The standard deviation is widely used in statistical calculations when the average of a range of values alone does not give a correct idea of the distribution of the data. It shows how the values are spread around the mean in a particular sample or in the entire population. Example 1 illustrates the practical use of this parameter.
  2. The STDEVA and STDEV.S functions should be used to analyze only a part of the population and calculate by the first formula in note 4, while STDEV.P and STDEVPA take data on the entire population as input and calculate by the second formula.
  3. Excel also contains the built-in functions STDEV and STDEVP, retained for compatibility with older versions of Microsoft Office. They may not be included in later versions of the program, so their use is not recommended.
  4. Two common formulas are used to find the standard deviation: $S=\sqrt{\frac{\sum_{i=1}^{n}(x_i-x_{av})^2}{n-1}}$ and $S=\sqrt{\frac{\sum_{i=1}^{n}(x_i-x_{av})^2}{n}}$, where:
  • S is the desired value of the standard deviation;
  • n is the number of values in the range under consideration (the sample or the population);
  • x_i is a single value from the sample;
  • x_av is the arithmetic mean for the range under consideration.
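The two formulas can be written out directly and checked against a library. The sketch below uses hypothetical grades; sample_sd and population_sd are illustrative names, and statistics.stdev / statistics.pstdev are assumed to correspond to the n - 1 and n denominators respectively.

```python
import math
from statistics import pstdev, stdev

def sample_sd(xs):
    """First formula: divide the sum of squared deviations by n - 1."""
    x_av = sum(xs) / len(xs)
    return math.sqrt(sum((x - x_av) ** 2 for x in xs) / (len(xs) - 1))

def population_sd(xs):
    """Second formula: divide the sum of squared deviations by n."""
    x_av = sum(xs) / len(xs)
    return math.sqrt(sum((x - x_av) ** 2 for x in xs) / len(xs))

data = [3, 4, 4, 5, 3, 5, 4, 4]  # hypothetical grades

print(sample_sd(data), stdev(data))       # the two values should agree
print(population_sd(data), pstdev(data))  # the two values should agree
```

For the same data the sample figure is always the larger of the two, and the gap shrinks as n grows.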

Statistics uses a huge number of indicators, and one of them is the variance, which can be calculated in Excel. Doing it by hand takes a lot of time and invites mistakes. Today we will look at how to break the mathematical formulas down into simple functions. Let's look at some of the simplest, fastest and most convenient calculation methods, which will let you do everything in a matter of minutes.

Computing the variance

The variance of a random variable is the mathematical expectation of the squared deviation of the random variable from its mathematical expectation.

Calculating for the population

To calculate the population variance in the program, the VAR.P function is used; its syntax is "=VAR.P(Number1; Number2; ...)".

It is possible to apply a maximum of 255 arguments, no more. Arguments can be simple numbers or references to the cells in which they are specified. Let's look at how to calculate the variance in Microsoft Excel:

1. The first step is to select the cell where the result of the calculations will be displayed, and then click on the "Insert function" button.

2. The Function Wizard will open. There you need to find the function "VAR.P", which is in the category "Statistical" or "Full alphabetical list". When it is found, select it and click OK.


3. The function arguments window will open. In it, click the "Number1" field and select the range of cells containing the series of numbers on the sheet.


4. After that, in the cell where the function was entered, the results of the calculations will be displayed.

This is how you can easily find the variance in Excel.

Calculating for a sample

In this case, the sample variance in Excel is calculated with a denominator equal not to the total count of numbers but to one less. This reduces the error of the estimate. The special function VAR.S is used, with the syntax =VAR.S(Number1;Number2;…). The steps are:

  • As in the previous method, select a cell for the result.
  • In the Function Wizard, find "VAR.S" in the category "Full alphabetical list" or "Statistical".


  • Next, a window will appear, and you should proceed in the same way as in the previous method.
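The difference between the two variance calculations is only the denominator. As a small sketch with hypothetical numbers, Python's statistics.pvariance divides by n (the population case) and statistics.variance divides by n - 1 (the sample case):

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical series; the mean is 5

# Sum of squared deviations from the mean, shared by both calculations.
squared_deviations = sum((x - 5) ** 2 for x in data)

print(pvariance(data), squared_deviations / len(data))       # population: divide by n
print(variance(data), squared_deviations / (len(data) - 1))  # sample: divide by n - 1
```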

Conclusion

The variance in Excel is calculated very simply, much faster and more conveniently than by hand, because the manual calculation through mathematical expectations is laborious and can take a lot of time and effort.

The standard deviation function already belongs to higher mathematics, to the part related to statistics. In Excel, there are several variants of the standard deviation function:

  • STDEV function.
  • STDEVA function.
  • STDEVP function.

We will need these functions in sales statistics to identify the stability of sales (XYZ analysis). This data can be used both for pricing and for the formation (adjustment) of the assortment matrix and for other useful sales analyzes, which I will definitely talk about in future articles.

Foreword

Let's look at the formulas first in mathematical language, and then (below in the text) we will analyze the formula in Excel in detail and how the resulting result is applied in the analysis of sales statistics.

So, the standard deviation is an estimate of the root-mean-square deviation of a random variable x from its mathematical expectation, based on an unbiased estimate of its variance. Do not be afraid of the unfamiliar words; be patient and you will understand everything!

Description of the formula: the root-mean-square deviation is measured in the units of the random variable itself and is used when calculating the standard error of the arithmetic mean, constructing confidence intervals, statistically testing hypotheses, and measuring a linear relationship between random variables. It is defined as the square root of the variance of the random variable: $\sigma=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}$

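Now the standard deviation proper is an estimate of the root-mean-square deviation of a random variable x from its mathematical expectation, based on an unbiased estimate of its variance:

$s=\sqrt{\frac{n}{n-1}\sigma^2}=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$, where

$\sigma^2$ is the variance;

$x_i$ is the i-th sample element;

$n$ is the sample size;

$\bar{x}$ is the sample arithmetic mean.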

It should be noted that both estimates are biased. In the general case, it is impossible to construct an unbiased estimate. However, an estimate based on an unbiased variance estimate is consistent.

Three-sigma rule ($3\sigma$): almost all values of a normally distributed random variable lie in the interval $(\bar{x}-3\sigma;\ \bar{x}+3\sigma)$. More strictly, with a probability of approximately 0.9973, the value of a normally distributed random variable lies in this interval (provided that $\bar{x}$ is the true value and not one obtained by processing a sample). We will use a rounded interval of 0.1

If the true value of $\sigma$ is unknown, then you should use not $\sigma$ but s. Thus the three-sigma rule becomes the rule of three s. It is this rule that will help us determine the stability of sales, but more on that later...
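The 0.9973 figure can be checked numerically. This is a small sketch using the standard identity P(|X - μ| < kσ) = erf(k/√2) for a normal variable; prob_within is an illustrative name.

```python
import math

def prob_within(k):
    """Probability that a normal variable falls within k sigmas of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```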

Now Standard Deviation Function in Excel

I hope I didn't overwhelm you with math? Perhaps someone will need this information for an abstract or some other purpose. Now let's chew on how these formulas work in Excel...

To determine the stability of sales, we do not need to delve into all the options for standard deviation functions. We will use only one:

STDEVP function

STDEVP(number1;number2;... )

Number1, Number2, ... - from 1 to 30 numeric arguments corresponding to the population.

Now let's look at an example:

Let's create a book and a makeshift spreadsheet. You can download this example in Excel at the end of the article.

To be continued!!!

Hello again. Well!? Got a free minute. Let's continue?

So, sales stability with the help of the STDEVP function

For clarity, let's take a few improvised goods:

In analytics, whether it is a forecast, research, or something else related to statistics, it is always necessary to take three periods. It can be a week, month, quarter or year. It is possible and even best to take as many periods as possible, but not less than three.

I specifically showed exaggerated sales, where you can see with the naked eye what is being sold consistently and what is not. This will make it easier to understand how the formulas work.

So we have the sales; now we need to calculate the average sales per period.

The average is calculated with the formula AVERAGE(period data); in my case, the formula looks like this: =AVERAGE(C6:E6)

We extend the formula to all products. This can be done by dragging the bottom-right corner of the selected cell down to the end of the list. Or put the cursor in the column with the products and use the following key combinations:

Ctrl + Down moves the cursor to the bottom of the list.

Ctrl + Right moves the cursor to the right edge of the table. Press Right once more and we get to the column with the formula.

Now hold Ctrl + Shift and press Up. This selects the area to fill with the formula.

And the key combination Ctrl + D fills the function down where we need it.

Remember these combinations, they really increase your speed in Excel, especially when you work with large arrays.

The next step is the standard deviation function itself; as I said, we will use only one, STDEVP

We enter the function and pass it the sales values of each period. If the sales sit next to each other in the table, you can use a range, as in my formula =STDEVP(C6:E6), or list the required cells separated by semicolons: =STDEVP(C6;D6;E6)

Here are all the calculations and ready. But how do you know what sells consistently and what doesn't? Let's just put down the convention XYZ where,

X is stable

Y - with small deviations

Z - not stable

To do this, we use error intervals for the variation, that is, the standard deviation divided by the average. If fluctuations stay within 10%, we will assume that sales are stable.

If between 10 and 25 percent, it will be Y.

And if the variation exceeds 25%, this is not stability.

To set the letters correctly for each product, we will use the IF function. In my table, this function looks like this:

=IF(H6<0,1;"X";IF(H6<0,25;"Y";"Z"))

Accordingly, we stretch all the formulas for all names.

I will try to immediately answer the question, Why the intervals of 10% and 25%?

In fact, the intervals may be different; it all depends on the specific task. I deliberately showed you exaggerated sales values, where the difference is visible to the naked eye. It is obvious that product 1 is not sold consistently, but the dynamics show growing sales. We leave this item alone...

But product 2 is plainly unstable. Our calculations show Z, which tells us that its sales are not stable. Item 3 and Item 5 show stable performance; note that their variation is within 10%.

That is, Item 5 with sales of 45, 46, and 45 shows a 1% variation, which is a stable number series.

But Product 2 with sales of 10, 50, and 5 shows a 93% variation, which is NOT a stable number series.

After all the calculations, you can put a filter and filter out the stability, so if your table consists of several thousand items, you can easily select which are not stable in sales or, on the contrary, which ones are stable.

No product in my table received a "Y"; I think one should be added to make the number series clearer. I will add Item 6...

You see, the number series 40, 50 and 30 shows a 20% variation. The error does not seem large, but the spread is still significant ...

And so to sum it up:

10, 50, 5 - Z, not stable. Variation over 25%

40, 50, 30 - Y, worth paying attention to this product and improving its sales. Variation below 25% but above 10%

45, 46, 45 - X, stable; nothing needs to be done with this product yet. Variation below 10%
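The whole XYZ classification fits in a few lines. The sketch below assumes that the variation is the population standard deviation divided by the mean; that assumption reproduces the 93%, 20% and 1% figures quoted for the three series. The function names are illustrative.

```python
from statistics import mean, pstdev

def variation(sales):
    """Population standard deviation relative to the mean (assumed definition)."""
    return pstdev(sales) / mean(sales)

def xyz_class(sales):
    """X below 10% variation, Y below 25%, Z otherwise (thresholds from the text)."""
    v = variation(sales)
    if v < 0.10:
        return "X"
    if v < 0.25:
        return "Y"
    return "Z"

for series in ([10, 50, 5], [40, 50, 30], [45, 46, 45]):
    print(series, f"{variation(series):.0%}", xyz_class(series))
```

Note that with the sample standard deviation instead, the series 40, 50, 30 would land exactly on 25% and fall into Z, so the choice of function matters near the thresholds.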

That's all! I hope I explained everything clearly; if not, ask about whatever is unclear. I will be grateful for every comment, whether praise or criticism. That way I will know that you are reading me and, which is very IMPORTANT, that you find it interesting. And new lessons will appear accordingly.

Management intervention is needed to identify the causes of deviations.

To build a control chart, I use the original data, the mean (μ) and the standard deviation (σ). In Excel: μ = AVERAGE($F$3:$F$15), σ = STDEV($F$3:$F$15)

The control chart itself includes: raw data, mean (μ), lower control limit (μ - 2σ) and upper control limit (μ + 2σ):
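As a rough sketch of the same calculation outside Excel, the limits can be computed from the mean and the sample standard deviation. The thirteen overhead shares below are hypothetical stand-ins for F3:F15, and control_limits is an illustrative name.

```python
from statistics import mean, stdev

def control_limits(values, k=2):
    """Return (lower, center, upper) = (mu - k*sigma, mu, mu + k*sigma)."""
    mu = mean(values)      # AVERAGE in Excel
    sigma = stdev(values)  # STDEV in Excel (n - 1 in the denominator)
    return mu - k * sigma, mu, mu + k * sigma

# Hypothetical overhead shares for 13 periods (not the note's data).
shares = [0.095, 0.091, 0.093, 0.088, 0.086, 0.087, 0.083,
          0.080, 0.082, 0.078, 0.075, 0.076, 0.072]

lower, center, upper = control_limits(shares)
outside = [x for x in shares if not lower <= x <= upper]
print(f"limits: {lower:.4f} .. {upper:.4f}, points outside: {outside}")
```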

Looking at this chart, I noticed that the original data shows a very distinct linear trend towards a decrease in the overhead share:

To add a trend line, select the data row on the chart (in our example, green dots), right-click and select the "Add trend line" option. In the Format Trendline window that opens, experiment with the options. I settled on a linear trend.

If the initial data are not scattered around the average value in accordance with a normal distribution, then it is not quite correct to describe them by the parameters μ and σ. Instead of the average value, a linear trend line and control limits equidistant from that trend line describe them better.

Excel allows you to build a trend line using the FORECAST function. We will need an additional range A3:A15 so that the known X values form a continuous series (the quarter numbers do not form such a series). Instead of the average value in column H, we enter the FORECAST function:

The standard deviation σ (STDEV function in Excel) is calculated by the formula:

Unfortunately, I did not find a function in Excel for such a definition of the standard deviation (relative to the trend). The problem can be solved using an array formula. If you are not familiar with array formulas, I suggest reading about them first.

An array formula can return a single value or an array. In our case, the array formula will return a single value:

Let's take a closer look at how the array formula works in cell G3

SUM(($F$3:$F$15-$H$3:$H$15)^2) gives the sum of squared differences; in effect, the formula calculates the sum (F3 - H3)^2 + (F4 - H4)^2 + ... + (F15 - H15)^2

COUNT($F$3:$F$15) - the number of values in the range F3:F15

SQRT(SUM(($F$3:$F$15-$H$3:$H$15)^2)/(COUNT($F$3:$F$15)-1)) = σ

The value of 6.2% is the point of the lower control limit = 8.3% - 2 σ
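The same idea, a standard deviation measured around a trend line rather than around the mean, can be sketched in Python. The data are hypothetical; linear_trend stands in for what FORECAST computes (a least-squares line), and the n - 1 denominator simply follows the array formula above.

```python
import math

def linear_trend(xs, ys):
    """Least-squares line through (xs, ys); returns the fitted y for each x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x for x in xs]

def sd_about_trend(ys, fitted):
    """Square root of the summed squared residuals divided by n - 1."""
    squared = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return math.sqrt(squared / (len(ys) - 1))

periods = list(range(1, 14))  # continuous X values, like the helper range A3:A15
shares = [0.095, 0.091, 0.093, 0.088, 0.086, 0.087, 0.083,
          0.080, 0.082, 0.078, 0.075, 0.076, 0.072]  # hypothetical data

trend = linear_trend(periods, shares)
sigma = sd_about_trend(shares, trend)
lower = [t - 2 * sigma for t in trend]  # lower control limit follows the trend
upper = [t + 2 * sigma for t in trend]  # upper control limit follows the trend
print(f"sigma about trend = {sigma:.4f}")
```

For data with a clear trend, this sigma is much smaller than the one measured around the overall mean, which is why the limits hug the data more closely.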

Curly braces on either side of a formula indicate that it is an array formula. To create an array formula, after entering the formula in cell G3:

H4 - 2*SQRT(SUM(($F$3:$F$15-$H$3:$H$15)^2)/(COUNT($F$3:$F$15)-1))

you need to press not Enter, but Ctrl + Shift + Enter. Don't try to type curly braces on the keyboard - the array formula won't work. If you want to edit an array formula, do it in the same way as with a regular formula, but again, after editing, press Ctrl + Shift + Enter instead of Enter.

An array formula that returns a single value can be "dragged" just like a normal formula.

As a result, we got a control chart built for data with a downward trend.

P.S. After the note was written, I was able to refine the formulas used to calculate the standard deviation for data with a trend. You can get acquainted with them in the Excel file.

In statistical work we have to deal with the calculation of such values as variance, standard deviation and, of course, the coefficient of variation. The calculation of the last one deserves special attention. It is very important that every beginner who is just starting to work with a spreadsheet editor can quickly calculate the relative scatter of values.

What is the coefficient of variation and why is it needed?

So, it seems to me that a short theoretical digression on the nature of the coefficient of variation would be useful. This indicator reflects the spread of the data relative to the average value. In other words, it shows the ratio of the standard deviation to the mean. It is customary to express the coefficient of variation as a percentage and to use it to show the homogeneity of a time series.

The coefficient of variation becomes an indispensable assistant when you need to make a forecast based on data from a given sample. This indicator highlights the main ranges of values that will be most useful for subsequent forecasting and helps clear the sample of insignificant factors. So, if the value of the coefficient is 0%, you can state with confidence that the series is homogeneous, meaning that all values in it are equal to one another. If the coefficient of variation exceeds 33%, you are dealing with a heterogeneous series in which individual values differ significantly from the sample average.

How to find the standard deviation?

Since we need to use the standard deviation to calculate the variation indicator in Excel, it would be quite appropriate to figure out how we calculate this parameter.

From the school algebra course, we know that the standard deviation is the square root extracted from the variance, that is, this indicator determines the degree of deviation of a particular indicator of the total sample from its average value. With its help, we can measure the absolute measure of fluctuation of the trait under study and interpret it clearly.

Calculate the coefficient in Excel

Unfortunately, Excel does not have a standard formula that would allow you to calculate the variation indicator automatically. But this does not mean that you have to do the calculations in your head. The absence of a template in the "Formula Bar" in no way detracts from Excel's abilities, so you can easily force the program to perform the calculation you need by manually typing the appropriate command.

To calculate the coefficient of variation in Excel, recall the school math course and divide the standard deviation by the sample mean. That is, the formula looks like this: =STDEV(specified data range)/AVERAGE(specified data range). Enter this formula in the Excel cell where you want to get the result.
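The same ratio can be sketched in Python, together with the 33% homogeneity threshold mentioned earlier. The function names and sample series are illustrative; statistics.stdev is assumed to match Excel's STDEV (n - 1 in the denominator).

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a fraction."""
    return stdev(values) / mean(values)

def is_homogeneous(values, threshold=0.33):
    """True when the coefficient of variation does not exceed the threshold."""
    return coefficient_of_variation(values) <= threshold

for series in ([5, 5, 5], [40, 50, 30], [10, 50, 5]):
    cv = coefficient_of_variation(series)
    print(series, f"{cv:.1%}", "homogeneous" if is_homogeneous(series) else "heterogeneous")
```

Formatting the fraction with a percent specifier plays the same role as applying the Percentage cell format in Excel.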

Keep in mind that since the coefficient is expressed as a percentage, the cell with the formula will need to be formatted accordingly. You can do this in the following way:

  1. Open the Home tab.
  2. Find the "Format Cells" category in it and select the required option.

Alternatively, you can set the percentage format for the cell by right-clicking the active table cell. In the context menu that appears, as in the algorithm above, select the "Format Cells" category and set the required value.

Select "Percentage" and optionally enter the number of decimal places

Perhaps the above algorithm will seem complicated to someone. In fact, calculating the coefficient is as simple as adding two natural numbers. Once you complete this task in Excel, you will never return to tedious multi-syllabic solutions in a notebook.

Still not able to make a qualitative comparison of the degree of scatter in the data? Lost in sample size? Then right now get down to business and master in practice all the theoretical material that was presented above! Let the statistical analysis and development of the forecast no longer cause you fear and negativity. Save your energy and time with