Methods for smoothing and equalizing time series. Developing a Forecast Using the Moving Average Method

An in-depth analysis of time series requires more advanced methods of mathematical statistics. If a time series contains a significant random error (noise), one of two simple methods is applied: smoothing, or leveling by enlarging the intervals and computing group averages. Enlarging the intervals makes the series easier to read when most of the noise falls inside the intervals. However, if the noise does not match the chosen periodicity, the resulting levels are coarse, which limits any detailed analysis of how the phenomenon changes over time.

More accurate characteristics are obtained with moving averages, a widely used method for smoothing the levels of a series. It replaces the original values of the series with averages taken over a certain time interval. The interval slides along the time series as each successive value is calculated.

A moving average is useful when the trend of a time series is unclear or when the series is strongly affected by cyclical fluctuations or outliers (interventions).

The larger the smoothing interval, the smoother the moving-average chart. The interval should be chosen with regard to the length of the time series and the substantive meaning of the dynamics it reflects. A long series with many initial points allows larger smoothing intervals (5, 7, 10, etc.). When a moving average is used to smooth a non-seasonal series, the interval is most often taken as 3 or 5.

Let's give an example of calculating the moving average number of farms with high yields (more than 30 kg / ha) (Table 10.3).

Table 10.3. Smoothing a time series by enlarging intervals and by moving average

Columns: accounting year | number of farms with high yields | sums for three years | rolling sums over three years | moving averages

Values surviving in the source, in order: 90.0; 89.7; 1984; 88.7; 87.3; 87.3; 87.0; 86.7; 83.0; 83.0; 82.3; 82.3; 82.6; 82.7; 82.7

Moving average calculation examples:

1982: (84 + 94 + 92) / 3 = 90.0;

1983: (94 + 92 + 83) / 3 = 89.7;

1984: (92 + 83 + 91) / 3 = 88.7;

1985: (83 + 91 + 88) / 3 = 87.3.
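The four calculations above can be reproduced with a short Python sketch. The yearly counts are the ones that appear in the examples. The year labels 1981-1986 are an assumption, inferred from the fact that each average is assigned to the middle year of its window. The function name `moving_average_3` is purely illustrative.

```python
# Yield counts taken from the calculation examples in the text.
# The year labels are an assumption: 1982 is the centre of the first
# window, so the first value is taken to belong to 1981.
farms = {1981: 84, 1982: 94, 1983: 92, 1984: 83, 1985: 91, 1986: 88}


def moving_average_3(series):
    """Return {year: centred three-year moving average}."""
    years = sorted(series)
    result = {}
    for prev, cur, nxt in zip(years, years[1:], years[2:]):
        result[cur] = (series[prev] + series[cur] + series[nxt]) / 3
    return result


smoothed = moving_average_3(farms)
for year, value in smoothed.items():
    print(year, round(value, 1))
```

The first and last years have no complete window, so they receive no moving average.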

A graph is then plotted. Years are marked on the abscissa (horizontal axis) and the number of farms with high yields on the ordinate (vertical axis). The actual numbers of farms are plotted and the points are connected by a broken line. Then the moving averages for each year are plotted and connected by a smooth bold line.

A more complex and efficient method is to smooth (level) the time series with approximating functions. These produce a smooth line that represents the general trend, the main axis of the dynamics.

The most effective method of smoothing with mathematical functions is simple exponential smoothing. This method takes into account all previous observations of the series according to the formula:

$$S_t = \alpha X_t + (1 - \alpha) S_{t-1},$$

where $S_t$ is the new smoothed value at time $t$; $S_{t-1}$ is the smoothed value at the previous time $t-1$; $X_t$ is the actual value of the series at time $t$; and $\alpha$ is the smoothing parameter.

If α = 1, previous observations are ignored completely; if α = 0, the current observation is ignored; values of α between 0 and 1 give intermediate results. By varying this parameter, you can select the most acceptable smoothing. The optimal α is chosen either by comparing plots of the original and smoothed curves or by minimizing the sum of squared errors of the calculated points. In practice the method is applied on a computer, for example in MS Excel. The exponential smoothing function can also be used to obtain a mathematical expression of the pattern in the data dynamics.
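As a minimal sketch of this procedure, the recurrence and a simple search for α can be written in Python. Several things here are assumptions rather than part of the source: the starting value S_0 is set to the first observation, the error is measured one step ahead (S_{t-1} is treated as the forecast of X_t), the candidate grid is 0.1 to 0.9, and the sample values are illustrative only.

```python
def exponential_smoothing(x, alpha):
    """Apply S_t = alpha * X_t + (1 - alpha) * S_{t-1}.

    Assumption: S_0 is set to the first observation X_0.
    """
    smoothed = [x[0]]
    for value in x[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def one_step_sse(x, alpha):
    """Sum of squared errors when S_{t-1} is used as the forecast of X_t."""
    s = exponential_smoothing(x, alpha)
    return sum((x[t] - s[t - 1]) ** 2 for t in range(1, len(x)))


series = [84, 94, 92, 83, 91, 88]            # illustrative values only
candidates = [i / 10 for i in range(1, 10)]  # alpha = 0.1 ... 0.9
best_alpha = min(candidates, key=lambda a: one_step_sse(series, a))
print(best_alpha, exponential_smoothing(series, best_alpha))
```

The one-step-ahead error is used because comparing S_t with X_t directly would always favor α = 1, which performs no smoothing at all.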

Econometrics. Module 1
1. In what law were the patterns of demand clarified on the basis of the relationship between the grain harvest and grain prices?
in King's Law
2. What is the name of the measure of the spread of a random variable?
variance
3. When studying which models, econometric research can include the identification of trends, lags, and cyclical components?
time series models
4. Which of the following scales does not belong to the main scales of qualitative features?
ratio scale
5. Who founded the journal "Econometrics"?
R. Frisch
6. Which of the following can include econometric research at the present stage of development in the study of models from independent random observations?
estimation of model parameters
7. Which scale has a natural unit of measure, but no natural reference point?
the difference scale
8. Which scientists created the theory of integrated autoregressive moving average (ARIMA) models?
J. Box and G. Jenkins
9. In what system is each explained variable considered as a function of the same set of factors?
in the system of independent equations
10. What measurement scale refers to the scales of quantitative traits?
interval scale
11. What econometric models were developed in the 1980s and early 1990s by R. Engle, T. Bollerslev, and Nelson?
autoregressive conditional heteroscedasticity models
12. What measurement scales are the most common and convenient?
ratio scales
13. Which scientist was awarded the Nobel Prize in 1980 for applying econometric models to the analysis of economic fluctuations and economic policy?
L. Klein
14. In which country was the first international econometric society created?
in the USA
15. Which of the following is a constant component of a random variable?
arithmetic mean
16. What is the purpose of econometrics as a science (according to E. Malinvaud)?
empirical analysis of economic laws
17. Which researcher gave a broad interpretation of econometrics, treating it as any application of mathematics or statistical methods to the study of economic phenomena?
E. Malinvaud
18. What components are included in the composition of random variables in the analysis process?
constant and random components
19. What is the average of a random component or remainder?
0
20. Who first introduced the term "econometrics"?
P. Ciompa
21. Which of the domestic scientists at the Union level described the dynamics of grain crop yields with equations with a small number of parameters?
V. Obukhov
22. What sections does econometrics contain?
modeling of data not ordered in time, and time series theory
23. What characteristics of the economy cannot be measured directly?
latent characteristics
24. Which of the scientists dealt with the problem of cyclicity?
K. Juglar
25. Who is the author of the first book on econometrics, The Laws of Wages: Essays in Statistical Economics?
H. Moore

Module 2
1. If the regression is significant, then
Fobs>Fcrit
2. What does the value of the regression coefficient show?
average change in result with a change in factor by one unit
3. What does the coincidence of the average of the sample estimate with the unknown value of the corresponding parameter for the general population mean?
unbiasedness
4. What is the regression if k= 2?
multiple
5. What characterizes the dispersion (deviation) of observation points relative to the regression curve?
residual variance
6. What coefficient is an indicator of the tightness of the relationship?
linear correlation coefficient
7. What value is simply the average of the sum of the squares of the residuals (deviations)?
residual variance
8. What expression determines the correlation coefficient, which is a measure of the linear relationship between random variables x and y?
r(x, y)=…
9. What value should the average approximation error not exceed?
7-8%
10. Who coined the term "regression"?
F. Galton
11. What factor in the consumption function is used to calculate the multiplier?
regression coefficient
12. What coefficient is used to determine the quality of the selection of a linear function?
using the coefficient of determination
13. What expression determines the sample correlation coefficient?
r(x,y) with squares
14. What is called an effective feature in regression analysis?
dependent variable
15. Variance of what variable is analyzed by analysis of variance?
dependent variable
16. What regression is characterized by a transparent interpretation of model parameters?
linear regression
17. What coefficient characterizes the proportion of the variance explained by regression in the total variance of the resulting feature y?
coefficient of determination
18. What coefficient shows by how many percent, on average, will the result y change from its average value when the factor x changes by 1% from its (factor x) average value?
elasticity coefficient
19. What is the value of the residual variance if the actual values of the effective feature coincide with the theoretical or calculated values?
0
20. What method is used to estimate the parameters a, b of the regression equation?
least squares method (OLS)
21. What method is based on the requirement to minimize the sum of squared deviations of the actual values of the effective attribute from the calculated ones?
least squares method
22. At what value of k is the regression called paired?
k=1
23. Which of the following does not apply to non-linear regressions on the estimated parameters?
exponential function
24. The essence of what theorem is that if a random variable is the general result of the interaction of a large number of other random variables, none of which has a predominant effect on the overall result, then such a resulting random variable will be described by an approximately normal distribution?
central limit theorem
25. What equation describes linear regression?
y = a + bx + ε
(3 mistakes)

Module 3 (1 error)
1. How is the heteroscedasticity of models checked in the asymptotic Breusch-Pagan test?
by the χ²(r) criterion
2. What criterion allows you to choose the best model from many different specifications and is numerically constructed in such a way as to take into account the influence of two opposite trends on the quality of fitting of the model?
Schwarz criterion
3. By what value is the quality of the model judged?
by average relative error of approximation
4. What expression describes the condition of homogeneity (homoscedasticity) of observations?
σ²(y_u) = σ²(η_u + ε_u) = σ²(ε_u) = σ²
5. What method is applicable under the condition that the error vector covariance matrix is diagonal?
least squares method
6. What expression determines the absolute approximation error?
yi-y1i=e
7. What is meant by multicollinearity?
high degree of correlation of explanatory variables
8. Which variables are the original variables from which the corresponding means are subtracted and the resulting difference is divided by the standard deviation?
standardized variables
9. What error on the control sample indicates the good quality of the constructed model?
4-9%
10. What method can be used to evaluate the significance of multicollinearity factors?
method of testing the hypothesis of independence of variables
11. Which variable should be expressed as a linear function of an unknown variable?
substitute variable
12. Dispersions and covariances of observational errors in the generalized linear model of multiple regression
can be arbitrary
13. What is the second approach to solving the problem of heteroscedasticity?
in building models that take into account the heteroscedasticity of observational errors
14. What is the standardized regression coefficient in the simplest case of pairwise regression?
linear correlation coefficient
15. Which of the following is used to test the hypothesis if the researcher assumes that during the observation period there have been sharp structural changes in the form of relationships between the dependent and independent variables?
Chow test
16. What is the matrix determinant if there is a complete linear dependence between the factors and all correlation coefficients are equal to 1?
0
17. What formula is used to calculate the coefficients of the model when using the ridge regression method?
$b_{gr} = (X^T X + D_{gr} I_{k+1})^{-1} X^T Y$
18. According to Aitken's theorem, what formula is used to estimate the coefficients of the model?
$b = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$
19. Which of the following tests does not require the assumption that the distribution of regression residuals is normal?
Spearman's rank correlation test
20. What is the name of the variable that should be in the model according to the correct theory?
significant
21. The closer to one the value of the determinant of the interfactorial correlation matrix, the
less multicollinearity of factors
22. What criterion is used to evaluate the significance of the regression equation as a whole?
Fisher F-test
23. What indicator fixes the proportion of the explained variation of the effective attribute due to the factors considered in the regression?
determination indicator
24. What coefficients allow excluding duplicate factors from the model?
intercorrelation coefficients
25. What is the number of degrees of freedom of the residual sum of squares in linear regression?
n- 2
Module 4
1. What are the steps involved in the process of structural modeling?
all of the above steps
2. The essence of what method is the partial replacement of an unusable explanatory variable with a variable that is not correlated with a random member?
instrumental variable method
3. What does the variable x in the expression represent?
disturbing process
4. Under what condition does the general solution of a difference equation of the form have an "explosive" character?
for |a1|> 2
5. What are the names of interdependent variables that are determined within the model (within the system itself) and denoted by y?
endogenous variables
6. In which model, based on the coefficients of the reduced form, can two or more values of one structural coefficient be obtained?
in an over-identified model
7. What coefficients are called structural coefficients of the model?
coefficients for endogenous and exogenous variables in the structural form of the model
8. What method, with limited information, is called the least dispersion ratio method?
maximum likelihood method
9. What are the names of variables related to previous points in time?
lag variables
10. If a set of numbers X is related to another set of numbers Y by Y = 4X, then the variance of Y must be
16 times larger than X variance
11. What method is used to solve the identified system?
indirect least squares
12. What variables are understood as predefined variables?
exogenous variables and lagged endogenous variables
13. What method is used if you just need to clarify the nature of the relationships of variables?
path analysis method
14. What allows you to do the construction of models of the correlation structure?
test the hypothesis that the correlation matrix has a certain form
15. What is the model if all its structural coefficients are uniquely determined by the coefficients of the reduced form of the model and the number of parameters in both forms of the model is the same?
identifiable
16. What expression determines the dependence of consumption in the year with number t on income in the previous period y(t- 1)?
C(t)=b+cy(t- 1)
17. What are the names of independent variables that are determined outside the system and are denoted as x?
exogenous variables
18. Under what condition is the entire model considered identifiable?
if at least one equation of the system is identifiable
19. When is a model unidentifiable?
if the number of reduced coefficients is less than the number of structural coefficients
20. What variables often need to be introduced to account for the influence of qualitative factors?
dummy variables
21. What allows you to do the construction of models of the structure of averages?
explore the structure of means simultaneously with the analysis of variances and covariances
22. What variables might include causal models?
explicit and latent variables
23. Under what condition is the equation unidentifiable?
if the number of predefined variables not in the equation but present in the system, increased by one, is less than the number of endogenous variables in the equation
24. When solving the expression by the method of moving “backward”, the errors ei
accumulate
25. What can be done by modeling the covariance structure?
test the hypothesis that the covariance matrix has a certain form

Module 4
1. What do large values, close to 1, of (1 - a1) in the error correction model (ECM) indicate?
that economic factors strongly change the outcome
2. How many segments is the sequence divided into to check the stationarity condition for the series?
into two sections
3. To reduce the oscillation amplitude of the smoothed series Y(t), it is necessary
increase the width of the smoothing interval m
4. Which assumption is one of the prior assumptions when applying parametric tests to test stationarity?
assumption about the normal law of distribution of time series values
5. What is called a time series?
a sequence of values of an indicator taken at several consecutive time points or periods
6. How does the variance of the Y(t) series smoothed by a quadratic polynomial change with an increase in the number m of equations?
decreases
7. What trends correlate with each other?
time trends
8. Which of the following is used to test the stationarity of a time series?
serial stationarity criterion
9. What is the name of the correlation dependence between successive levels of the time series?
autocorrelation of the levels of the series
10. What is the name of a random variable with variable variance?
heteroscedastic
11. Under what condition is the smoothing of a series called centered?
for k=l
12. How can the time trend be excluded from the resulting variable?
by building a regression of that variable over time and moving on to residuals that form a new stationary variable already trend-free
13. By what formula are the coefficients calculated if we take a straight line as a smoothing polynomial?
ar= 1/m
14. What component explains deviations from the trend with a frequency of 2 to 10 years?
cyclic component
15. What is the parameter L in the expression?
likelihood function
16. What sequence is white noise?
if each random variable of the sequence has zero mean and is uncorrelated with other elements of the sequence
17. What class does a series belong to if it contains unit roots and is integrable with order d?
I(d)
18. What is the name of a stochastic variable with constant variance?
homoscedastic variable
19. What principle of development of forecasts implies compliance, maximum approximation of theoretical models to real production and economic processes?
adequacy of forecasting
20. What is the name of the number of values of the original series that simultaneously participate in smoothing?
smoothing interval width
21. What are the basic principles for developing forecasts?
consistency, adequacy, alternativeness
22. What is the serial criterion of stationarity used for?
to check the stationarity of the time series
23. What is the view model called?
autoregressive conditional heteroscedasticity model (ARCH model)
24. What does the equation represent?
ARMA process for the (ε_t²) sequence
25. What variables are used in the random walk process?
uncorrelated non-stationary variables


16.02.15 Viktor Gavrilov


A time series is a sequence of values that change over time. In this article I will try to describe some simple but effective approaches to working with such sequences. Examples of such data abound: currency quotes, sales volumes, customer requests, data from various applied sciences (sociology, meteorology, geology, observations in physics), and much more.

Series are a common and important form of data description, as they allow us to observe the entire history of the value we are interested in. This gives us the opportunity to judge the "typical" behavior of the quantity and the deviations from such behavior.

I was faced with the task of choosing a data set on which it would be possible to clearly demonstrate the features of the time series. I decided to use international passenger traffic statistics because this data set is quite descriptive and has become somewhat of a standard (http://robjhyndman.com/tsdldata/data/airpass.dat , source Time Series Data Library, R. J. Hyndman). The series describes the number of international airline passengers per month (in thousands) from 1949 to 1960.

Since I always have Prognoz Platform at hand, and it has an interesting tool for working with series, I will use it. Before importing the data, you need to add to the file a column with a date, so that the values are bound to time, and a column with the name of the series for each observation. Below you can see what my source file looks like; I imported it into Prognoz Platform using the import wizard directly from the time series analysis tool.

The first thing we usually do with a time series is plot it on a chart. Prognoz Platform allows you to build a graph by simply dragging and dropping a series into a workbook.

Time series on the chart

The symbol 'M' at the end of the series name means that the series is monthly (the interval between observations is one month).

Already from the graph, we can see that the series demonstrates two features:

  • trend: on our chart, this is the long-term increase in the observed values. The trend is visibly almost linear.
  • seasonality: on the graph, these are the periodic fluctuations in the value. In the next article on time series, we will learn how to calculate the period.

Our series is quite “neat”, however, there are often series that, in addition to the two characteristics described above, demonstrate one more thing - the presence of “noise”, i.e. random variations in one form or another. An example of such a series can be seen in the chart below. This is a sinusoidal signal mixed with a random variable.

When analyzing series, we are interested in identifying their structure and evaluating all the main components - trend, seasonality, noise, and other features, as well as the ability to make forecasts of changes in magnitude in future periods.

When working with series, the presence of noise often makes it difficult to analyze the structure of the series. To exclude its influence and better see the structure of the series, you can use the methods of smoothing the series.

The simplest method for smoothing a series is the moving average. The idea is to take a window with an odd number of points, 2k + 1, and replace the central point of the window with the arithmetic mean of the points in the window:

$$s_i = \frac{1}{2k+1} \sum_{j=-k}^{k} x_{i+j},$$

where $x_i$ is the original series and $s_i$ is the smoothed series.

Below you can see the result of applying this algorithm to our two series. By default, Prognoz Platform suggests smoothing with a window of 5 points (k in the formula above equals 2). Note that the smoothed signal is much less affected by noise, but along with the noise some useful information about the dynamics of the series naturally disappears as well. You can also see that the smoothed series lacks the first and last k points. This is because smoothing is performed for the central point of the window (in our case, the third point), after which the window shifts by one point and the calculation is repeated. For the second, random series I used a window of 30 to reveal its structure better: the series is "high-frequency" and has a lot of points.
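The windowed average can be sketched in a few lines of Python, outside of Prognoz Platform. Everything specific here is an assumption for illustration: the test signal is a synthetic sinusoid with Gaussian noise, the seed and period are arbitrary, and the window is 31 points (k = 15) rather than 30, because a centered window needs an odd width.

```python
import math
import random


def centered_moving_average(x, k):
    """Mean over the window x[i-k .. i+k] for every i that has a full window."""
    width = 2 * k + 1
    return [sum(x[i - k:i + k + 1]) / width for i in range(k, len(x) - k)]


def mse(a, b):
    """Mean squared difference between two equally long sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


random.seed(0)  # fixed seed so the sketch is reproducible
n, period, k = 400, 200, 15
clean = [math.sin(2 * math.pi * i / period) for i in range(n)]
noisy = [c + random.gauss(0, 0.5) for c in clean]

smooth = centered_moving_average(noisy, k)

print(len(noisy), len(smooth))              # the smoothed series is 2k points shorter
print(mse(noisy[k:n - k], clean[k:n - k]))  # error of the raw signal
print(mse(smooth, clean[k:n - k]))          # error after smoothing
```

The length difference shows directly why the first and last k points are missing from the smoothed series.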

The moving average method has certain disadvantages:

  • The moving average is inefficient when calculated directly. For each point, the average is recalculated from scratch, and the result calculated for the previous point is not reused.
  • The moving average cannot be extended to the first and last points of the series. This can cause a problem if we are interested in exactly these points.
  • The moving average is not defined outside of the series and, as a result, cannot be used for forecasting.

Exponential Smoothing

A more advanced smoothing method that can also be used for prediction is exponential smoothing, also sometimes called the Holt-Winters method after the names of its creators.

There are several variants of this method:

  • single smoothing for series that do not have a trend and seasonality;
  • double smoothing for series that have a trend but no seasonality;
  • triple smoothing for series that have both trend and seasonality.

The exponential smoothing method calculates the values of the smoothed series by updating the values calculated at the previous step with information from the current step. Information from the previous and current steps is taken with different weights, which can be controlled.

In the simplest variant, single smoothing, the relation is:

$$s_i = \alpha x_i + (1 - \alpha) s_{i-1}.$$

The parameter α sets the balance between the unsmoothed value at the current step and the smoothed value from the previous step. With α = 1 we take only the points of the original series, i.e. there is no smoothing. With α = 0 we take only the smoothed values from previous steps, i.e. the series becomes a constant.

To understand why the smoothing is called exponential, we need to expand the relation recursively:

$$s_i = \alpha x_i + \alpha (1 - \alpha) x_{i-1} + \alpha (1 - \alpha)^2 x_{i-2} + \dots$$

The relation shows that all previous values of the series contribute to the current smoothed value, but their contribution decays exponentially as the power of the factor (1 - α) grows.
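The exponential decay can be checked numerically. Unrolling the single-smoothing recurrence s_i = α x_i + (1 - α) s_{i-1} gives the value x_{i-j} the weight α(1 - α)^j. The sketch below compares the recursive result with the explicit weighted sum. The starting value s_0 = x_0, the sample data, and the function names are assumptions made for illustration.

```python
def smoothing_weights(alpha, n):
    """Weight of x_{i-j} in s_i for j = 0 .. n-1: alpha * (1 - alpha) ** j."""
    return [alpha * (1 - alpha) ** j for j in range(n)]


def smooth_last(x, alpha):
    """Last smoothed value from the recurrence, starting from s_0 = x_0 (assumption)."""
    s = x[0]
    for value in x[1:]:
        s = alpha * value + (1 - alpha) * s
    return s


alpha = 0.3
x = [5.0, 7.0, 6.0, 9.0, 8.0]
w = smoothing_weights(alpha, len(x) - 1)

# Explicit weighted sum; the starting value x_0 keeps the leftover weight
# (1 - alpha) ** (n - 1), so all the weights together add up to one.
explicit = sum(wj * xj for wj, xj in zip(w, reversed(x[1:])))
explicit += (1 - alpha) ** (len(x) - 1) * x[0]

print(w)
print(smooth_last(x, alpha), explicit)
```

Each weight is (1 - α) times the previous one, so the larger α is, the faster old observations are forgotten.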

However, if the data contain a trend, simple smoothing will "lag behind" it (or you will have to take values of α close to 1, but then the smoothing will be insufficient). In that case double exponential smoothing is needed.

Double smoothing uses two equations. One estimates the trend as the difference between the current and previous smoothed values and then smooths that trend with simple smoothing. The other performs smoothing as in the simple case, except that its second term uses the sum of the previous smoothed value and the trend.
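The two equations can be sketched in Python as follows. This follows the commonly used Holt form of double smoothing and may differ in detail from what Prognoz Platform implements. The second weight β, the initial values (level equal to the first observation, trend equal to the first difference), and the function names are all assumptions.

```python
def double_exponential_smoothing(x, alpha, beta):
    """Holt-style double smoothing.

    level:  s_i = alpha * x_i + (1 - alpha) * (s_{i-1} + b_{i-1})
    trend:  b_i = beta * (s_i - s_{i-1}) + (1 - beta) * b_{i-1}

    Assumed initialisation: s_0 = x_0, b_0 = x_1 - x_0.
    """
    level, trend = x[0], x[1] - x[0]
    levels, trends = [level], [trend]
    for value in x[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        levels.append(level)
        trends.append(trend)
    return levels, trends


def forecast(levels, trends, steps):
    """Extrapolate the last level along the last trend."""
    return [levels[-1] + h * trends[-1] for h in range(1, steps + 1)]


series = [2.0 * i + 1.0 for i in range(10)]  # a pure linear trend
levels, trends = double_exponential_smoothing(series, alpha=0.5, beta=0.3)
print(levels[-1], trends[-1], forecast(levels, trends, 3))
```

On a purely linear series the smoothed level follows the data without lag, which is exactly what single smoothing cannot do.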

Triple smoothing includes another component, seasonality, and uses another equation. At the same time, two variants of the seasonal component are distinguished - additive and multiplicative. In the first case, the amplitude of the seasonal component is constant and does not depend on the base amplitude of the series over time. In the second case, the amplitude changes along with the change in the base amplitude of the series. This is just our case, as can be seen from the graph. As the series grows, the amplitude of seasonal fluctuations increases.
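The difference between the two seasonal variants is easy to see on synthetic data. The sketch below does not implement triple smoothing itself. It only builds one additive and one multiplicative series from a hypothetical linear trend and a 12-point seasonal wave, then compares the seasonal swing in the first and the tenth period. All numbers and names are illustrative assumptions.

```python
import math

PERIOD = 12


def trend(i):
    """Hypothetical linear base level of the series."""
    return 100.0 + 2.0 * i


def additive(i):
    # Seasonal swing has a fixed size, whatever the level of the series.
    return trend(i) + 10.0 * math.sin(2 * math.pi * i / PERIOD)


def multiplicative(i):
    # Seasonal swing is a fixed share of the level, so it grows with it.
    return trend(i) * (1.0 + 0.1 * math.sin(2 * math.pi * i / PERIOD))


def seasonal_amplitude(model, year):
    """Peak-to-trough range of (series - trend) within one period."""
    start = year * PERIOD
    detrended = [model(i) - trend(i) for i in range(start, start + PERIOD)]
    return max(detrended) - min(detrended)


for name, model in (("additive", additive), ("multiplicative", multiplicative)):
    print(name, seasonal_amplitude(model, 0), seasonal_amplitude(model, 9))
```

The additive series keeps the same swing in both periods, while the multiplicative one swings more widely as the level rises, which matches the behavior of the passenger series described here.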

Since our first series has both trend and seasonality, I decided to adjust the triple smoothing parameters for it. In Prognoz Platform, this is quite easy to do, because when the parameter value is updated, the platform immediately redraws the graph of the smoothed series, and visually you can immediately see how well it describes our original series. I settled on the following values:

How I calculated the period, we will look at in the next article on time series.

Typically, values between 0.2 and 0.4 can be taken as a first approximation. Prognoz Platform also uses a model with an additional parameter φ, which dampens the trend so that it approaches a constant in the future. For φ I took the value 1, which corresponds to the usual model.

I also made a forecast of the values ​​of the series by this method for the last 2 years. In the figure below, I marked the start point of the forecast by drawing a line through it. As you can see, the original series and the smoothed one coincide quite well, including on the forecasting period - not bad for such a simple method!

Prognoz Platform can also select the optimal parameter values automatically, by a systematic search over the parameter space that minimizes the sum of squared deviations of the smoothed series from the original.

The methods described are quite simple, easy to apply, and a good starting point for structure analysis and time series forecasting.

Read more about time series in the next article.

Ministry of Education of the Russian Federation

All-Russian Correspondence Institute of Finance and Economics

Yaroslavl branch

Department of Statistics

Course work

by discipline:

"Statistics"

task number 19

Student: Kurashova Anastasia Yurievna

Specialty "Finance and Credit"

3 course, periphery

Head: Sergeev V.P.

Yaroslavl, 2002

1. Introduction

2. Theoretical part

2.1 Basic concepts of time series

2.2 Methods for smoothing and equalizing time series

2.2.1 Methods of "mechanical smoothing"

2.2.2 Methods of "analytical" alignment

3. Calculation part

4. Analytical part

5. Conclusion

6. References

7. Appendices


Introduction

Complete and reliable statistical information is the necessary foundation of economic management. All information of national economic significance is ultimately processed and analyzed using statistics.

It is the statistical data that make it possible to determine the volume of gross domestic product and national income, to identify the main trends in the development of economic sectors, to assess the level of inflation, to analyze the state of financial and commodity markets, to study the standard of living of the population and other socio-economic phenomena and processes.

Mastering statistical methodology is one of the conditions for understanding market conditions, studying trends and forecasting, and making optimal decisions at all levels of activity.

The final, analytical stage of the study is complex, time-consuming and demanding. At this stage, average indicators and distribution indicators are calculated, the structure of the population is analyzed, and the dynamics of the studied phenomena and processes and the relationships between them are examined.

At all stages of research, statistics uses different methods. The methods of statistics are special techniques and methods for studying mass social phenomena.

I. Theoretical part.

1.1 Basic concepts of series of dynamics.

Series of dynamics are statistical data that reflect the development of the phenomenon under study over time. They are also called dynamic series or time series.

There are two main elements in each series of dynamics:

1) time indicator t;

2) the corresponding levels of development of the studied phenomenon y;

As indications of time in the series of dynamics, either certain dates (moments) or separate periods (years, quarters, months, days) are used.

The levels of the series of dynamics display a quantitative assessment (measure) of the development of the studied phenomenon in time. They can be expressed as absolute, relative or average values.

Dynamic series differ in the following ways:

1) By time. Depending on the nature of the phenomenon under study, the levels of the series of dynamics can refer either to certain dates (moments) in time, or to individual periods. In accordance with this, the series of dynamics are divided into moment and interval.

Moment series of dynamics reflect the state of the studied phenomena at certain dates (points) in time. An example of a moment series of dynamics is the following information on the payroll number of store employees in 1991 (Table 1):

Table 1

Payroll number of store employees in 1991

A feature of the moment series of dynamics is that its levels can include the same units of the studied population. Although a moment series has intervals (the gaps between adjacent dates in the series), the value of a given level does not depend on the length of the period between two dates. For example, most of the store's staff counted in the headcount as of 01/01/1991 continue to work during the year and therefore also appear in the levels for subsequent dates. Consequently, summing the levels of a moment series leads to repeated counting.

In trade, moment series of dynamics are used to study commodity stocks, the state of personnel, the amount of equipment and other indicators that reflect the state of the studied phenomena at certain dates (points) in time.

Interval series of dynamics reflect the results of the development (functioning) of the studied phenomena for certain periods (intervals) of time.

An example of an interval series is data on the retail turnover of a store in 1987-1991 (Table 2):

Table 2

The volume of retail turnover of the store in 1987-1991

Year                                1987    1988    1989    1990     1991
Retail turnover, thousand rubles    885.7   932.6   980.1   1028.7   1088.4

Each level of the interval series is already the sum of levels for shorter periods of time. In this case, the unit of the population, which is part of one level, is not included in other levels.

A feature of the interval series of dynamics is that each of its levels is made up of data for shorter intervals (sub-periods) of time. For example, summing the turnover for the first three months of the year gives its volume for the first quarter, and summing the turnover for four quarters gives its value for the year. Other things being equal, the longer the interval to which a level belongs, the greater that level.

The property of summing levels for successive time intervals makes it possible to obtain series of dynamics of more enlarged periods.
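As an illustration of this summation property, the following Python sketch enlarges a monthly interval series into quarterly and annual levels. The monthly figures and the function name `enlarge_by_sum` are hypothetical and serve only to show the arithmetic.

```python
def enlarge_by_sum(levels, group_size):
    """Sum consecutive groups of `group_size` levels of an interval series."""
    if group_size < 1 or len(levels) % group_size != 0:
        raise ValueError("series length must be a multiple of group_size")
    return [sum(levels[i:i + group_size])
            for i in range(0, len(levels), group_size)]


# Hypothetical monthly turnover for one year (arbitrary units).
monthly = [10, 12, 11, 13, 14, 15, 16, 15, 14, 13, 12, 17]
quarterly = enlarge_by_sum(monthly, 3)   # three months per quarter
annual = enlarge_by_sum(quarterly, 4)    # four quarters per year
print(quarterly, annual)
```

The same operation is meaningless for a moment series, because its levels overlap and summing them would count the same units repeatedly.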

In trade, interval series of dynamics are used to study changes over time in the receipt and sale of goods, the amount of distribution costs and other indicators that reflect the results of the functioning of the phenomenon under study for certain periods.

Dynamic series structure:

Any series of dynamics can theoretically be represented as a combination of the following components:

1) trend - the main tendency in the development of a dynamic series (toward an increase or decrease in its levels);

2) cyclical (periodic) fluctuations, including seasonal ones;

3) random fluctuations.

1.2 Methods for smoothing and equalizing time series.

The elimination of random fluctuations in the values of the levels of the series is carried out by finding "averaged" values. Ways to eliminate random factors are divided into two groups:

1. Ways of "mechanical" smoothing of fluctuations by averaging the values of the series relative to other, adjacent, levels of the series.

2. Methods of "analytical" alignment, i.e., determining first the functional expression of the trend of the series, and then the new, calculated values of the series.

1.2.1 Methods of "mechanical" smoothing.

These include:

a. Method of averaging over two halves of a series. The series is divided into two parts, and the average level of each part is calculated; the trend of the series is then determined graphically from these two values. Obviously, such a trend does not fully reflect the main regularity of the development of the phenomenon.

b. Method of enlargement of intervals, in which the length of the time intervals is increased and new values of the levels of the series are calculated.

c. Moving average method. This method is used to characterize the development trend of the studied statistical population and is based on the calculation of the average levels of the series for a certain period. The sequence for determining the moving average:

1) The smoothing interval, i.e. the number of levels included in it, is set. If three levels are taken into account when calculating the average, the moving average is called three-term; with five levels it is called five-term, and so on. If small, irregular fluctuations in the levels of the series are to be smoothed out, the interval (the number of terms in the moving average) is increased. If the waves are to be preserved, the number of terms is reduced.

2) The first average level is calculated as a simple arithmetic mean:

$\bar{y}_1 = \frac{1}{m} \sum_{i=1}^{m} y_i$, where

$y_i$ - the i-th level of the series;

$m$ - the number of terms in the moving average.

3) The first level is discarded, and the level following the last level used in the previous calculation is included in the calculation of the next average. The process continues until the last level of the studied series of dynamics, $y_n$, is included in the calculation of the average.

According to a series of dynamics built from average levels, a general trend in the development of the phenomenon is revealed.

The negative side of using the moving average method is the formation of shifts in fluctuations in the levels of the series, due to the "sliding" of the enlargement intervals. Smoothing with a moving average can lead to “reverse” fluctuations, when a convex “wave” is replaced by a concave one.

Recently, the adaptive moving average has come into use. It differs in that the average value, calculated as described above, is assigned not to the middle of the smoothing interval but to its last time period. Moreover, it is assumed that the adaptive average depends on earlier levels to a lesser extent than on the current one. That is, the more time intervals separate a level of the series from the average value, the less influence that level has on the value of the average.
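The moving average procedure described in item c can be sketched in Python as follows. The function name `moving_average` is my own; the data are the retail turnover levels from Table 2 (1987-1991), smoothed with a three-term average.

```python
def moving_average(levels, m):
    """Return the m-term simple moving averages of `levels`.

    The k-th result averages levels[k : k + m]. For odd m it is
    conventionally assigned to the middle period of that window.
    """
    if m < 1 or m > len(levels):
        raise ValueError("m must be between 1 and the length of the series")
    return [sum(levels[k:k + m]) / m for k in range(len(levels) - m + 1)]


# Retail turnover from Table 2, thousand rubles, 1987-1991.
turnover = [885.7, 932.6, 980.1, 1028.7, 1088.4]
smoothed = moving_average(turnover, 3)
print([round(v, 1) for v in smoothed])
```

Rounded to one decimal place, the three averages are 932.8, 980.5 and 1032.4, which a centered moving average assigns to 1988, 1989 and 1990. The smoothed series is shorter than the original by m - 1 levels, so the first and last years have no smoothed value.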

d. Exponential average method. An exponential average is an adaptive moving average calculated using weights that depend on how far the individual levels of the series are from the average value. The weight decreases according to an exponential function as the level moves away from the average value along the time axis, which is why such an average is called exponential. In practice, multiple exponential smoothing of the time series is applied to forecast the development of the phenomenon.
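A minimal Python sketch of the exponential average, assuming the standard recursive form in which each smoothed value is alpha times the current level plus (1 - alpha) times the previous smoothed value. The parameter name `alpha`, the choice of the first level as the starting value, and the sample data are assumptions made for illustration.

```python
def exponential_average(levels, alpha):
    """Single exponential smoothing of `levels` with parameter alpha in (0, 1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [levels[0]]  # start from the first level (an assumption)
    for y in levels[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed


def exponential_weights(alpha, n):
    """Weight given to the level k periods back from the current one, k = 0..n-1."""
    return [alpha * (1 - alpha) ** k for k in range(n)]


print(exponential_average([10, 20, 30], 0.5))
print(exponential_weights(0.5, 3))
```

The second function shows why the average is called exponential: each step back in time multiplies the weight by (1 - alpha), so older levels influence the average less and less.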

Conclusion: because of the calculation methods used, the methods in the first group give the researcher a very simplified, imprecise picture of the trend in a series of dynamics. Nevertheless, applying these methods correctly requires the researcher to have a deep knowledge of the dynamics of various socio-economic phenomena.