Optimal system behavior

General scheme of decision making. Types and parameters of economic problems of optimization and control

Any decision-making task is characterized by the presence of a certain number of persons who have certain capabilities and pursue certain goals. Therefore, in order to build a decision-making model, it is necessary to answer the following questions:

who makes the decision;

what goals the decision pursues;

what the decision making consists of;

what the set of available options is;

under what conditions the decision is made.

In order to construct a model, some notation must be introduced.

Let N be the set of all decision makers: $N = \{1, 2, \dots, n\}$, i.e. there are n participants. Each participant is called a decision maker (an individual or a legal entity).

Suppose the set $X_i$ of all feasible decisions of the i-th decision maker has been studied in advance and described mathematically (for example, by a system of inequalities).

Let $x_1, x_2, \dots, x_n$ denote the chosen alternatives. The decision-making process then reduces to the following: each decision maker chooses a specific element from his set of feasible decisions, i.e. $x_i \in X_i$.

The resulting tuple $x = (x_1, x_2, \dots, x_n)$ is called a situation.

To evaluate the situation $x$ in terms of the goals pursued, an objective function $f_i$ is built for each decision maker; it assigns a numerical value (estimate) $f_i(x)$ to each situation. Examples are the income of the firms in the situation $x$, or the costs of the same firms in that situation.

Based on the above, the goal of the i-th decision maker can be formulated as follows: choose $x_i \in X_i$ such that the number $f_i(x)$ in the situation $x$ is either maximal or minimal.

However, the influence of the other parties on the situation complicates the process, i.e. the interests of the participants intersect. A conflict arises, expressed in the fact that the function $f_i$ depends not only on $x_i$ but also on $x_j$, $j \ne i$. Therefore, in decision-making models with several participants, their goals have to be formalized differently than as maximizing (minimizing) the values of the function $f_i(x)$.

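Thus, the general scheme of the decision-making problem, referred to below as (*), consists of four elements: the set N of decision makers, the set of feasible decisions of each decision maker, the objective function of each decision maker, and the set of all characteristics (conditions) under which a decision has to be made.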

If in the scheme (*) the set N consists of only one element, and all the conditions and prerequisites of the original real problem can be described as a set of feasible solutions X, then we obtain the structure of an optimization (extremal) problem: find the extremum of an objective function $f(x)$ over $x \in X$.

This scheme is used by the decision maker as a planning scheme, and it describes two extremal problems: maximization, $f(x) \to \max_{x \in X}$, and minimization, $f(x) \to \min_{x \in X}$.

If the time factor is taken into account in this problem, it is called an optimal control problem.


If the decision maker has several goals, then the scheme (*) contains several objective functions $f_1(x), f_2(x), \dots$, all defined on the same set X. Such problems are called multiobjective optimization problems.

Some decision-making problems are named after their area of application: queuing systems, network and scheduling problems, reliability theory, etc.

If the elements of the model (*) do not depend on time, i.e. the decision-making process is instantaneous, then the problem is called static; otherwise it is dynamic.

If the elements of (*) do not contain random variables, then the problem is deterministic; otherwise it is stochastic.

Examples of problems:

1. The optimal cutting problem

A company manufactures products, each assembled from p kinds of parts; the number of parts of each kind in one product is known. The parts are cut from m batches of material, and the i-th batch contains $b_i$ units of material. Each unit of material can be cut in n ways. Cutting a unit of the i-th batch in the j-th way yields $a_{ijk}$ parts of the k-th kind. It is required to draw up a cutting plan that yields the maximum number of complete products.

2. The transportation problem

There are n suppliers and m consumers of the same product. The output of each supplier and the demand of each consumer are known, as are the costs of transporting the product from each supplier to each consumer. It is required to build a transportation plan with minimum total transportation cost, taking into account the supply of the suppliers and the demand of the consumers.
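One common way to write this problem down is the linear program below. It is only a sketch under assumed notation that does not appear in the text: $a_i$ is the output of supplier i, $b_j$ the demand of consumer j, $c_{ij}$ the cost of shipping one unit from i to j, and $x_{ij}$ the amount shipped.

```latex
\min_{x} \; \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij}\, x_{ij}
\quad \text{subject to} \quad
\sum_{j=1}^{m} x_{ij} \le a_i \;\; (i = 1, \dots, n), \qquad
\sum_{i=1}^{n} x_{ij} \ge b_j \;\; (j = 1, \dots, m), \qquad
x_{ij} \ge 0 .
```

The problem is solvable only if total output covers total demand, i.e. $\sum_i a_i \ge \sum_j b_j$.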

3. The job assignment problem

There are n jobs and n workers. The cost of worker j performing job i is $c_{ij}$. It is required to assign the workers to the jobs so as to minimize the total labor cost.
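For very small n the assignment problem can be solved by simply trying every permutation. The sketch below does exactly that; the function name `assign_jobs` and the cost matrix are hypothetical and chosen only for illustration, and brute force is not how the problem is solved in practice (the running time grows as n!).

```python
from itertools import permutations


def assign_jobs(cost):
    """Brute-force the assignment problem for a small square cost matrix.

    cost[i][j] is the cost of job i when done by worker j.
    Returns (total_cost, assignment), where assignment[i] is the worker for job i.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical 3x3 cost matrix, for illustration only.
example_cost = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]
best_cost, best_assignment = assign_jobs(example_cost)
print(best_cost, best_assignment)  # prints: 9 [1, 0, 2]
```

Here job 0 goes to worker 1, job 1 to worker 0, and job 2 to worker 2.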

4. The investment allocation problem

There are n projects. For the j-th project, the expected effect $d_j$ of its implementation and the required capital investment $g_j$ are known. The total capital investment must not exceed the specified value b. It is required to determine which projects should be implemented so that the total effect is the greatest.
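When the investments are whole numbers, this is the classical 0/1 knapsack problem and can be solved by dynamic programming. The sketch below makes that assumption explicitly; the function name `select_projects` and the sample data are hypothetical.

```python
def select_projects(effects, costs, budget):
    """0/1 knapsack by dynamic programming over integer budgets.

    effects[j] is the expected effect d_j, costs[j] the investment g_j
    (assumed to be non-negative integers), budget is the limit b.
    Returns (best_total_effect, sorted list of chosen project indices).
    """
    n = len(effects)
    # best[j][c] = best total effect using only the first j projects with budget c
    best = [[0] * (budget + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        d, g = effects[j - 1], costs[j - 1]
        for c in range(budget + 1):
            best[j][c] = best[j - 1][c]
            if g <= c and best[j - 1][c - g] + d > best[j][c]:
                best[j][c] = best[j - 1][c - g] + d
    # Trace back which projects were taken.
    chosen, c = [], budget
    for j in range(n, 0, -1):
        if best[j][c] != best[j - 1][c]:
            chosen.append(j - 1)
            c -= costs[j - 1]
    return best[n][budget], sorted(chosen)


# Hypothetical data: four projects, budget b = 10.
total_effect, chosen_projects = select_projects([10, 7, 12, 4], [5, 4, 6, 3], 10)
print(total_effect, chosen_projects)  # prints: 19 [1, 2]
```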

5. The production location problem

It is planned to produce m types of products, which can be manufactured at n enterprises. The production and sales costs per unit of product, the planned annual output, and the planned unit cost of each type of product are known. It is required to choose m of the n enterprises, each of which will produce one type of product.

In decision-making problems, the principle of optimality is understood as the set of rules by which the decision maker determines his actions so as to contribute as much as possible to the achievement of his goal. A decision chosen in this way is called optimal.

The ultimate goal of studying any decision-making problem is to find optimal decisions for all decision makers.

The principle of optimality is chosen taking into account the specific conditions of decision making (the number of participants, their goals and capabilities, the nature of the conflict of interests).

Formalization of optimal behavior is one of the most difficult stages of mathematical modeling.

The development of any principle of optimality is justified if it meets the following requirements:

2. Existence of an optimal solution under various additional assumptions.

3. The possibility of identifying distinctive features of optimal solutions by which they can be recognized (necessary and sufficient conditions of optimality).

4. Availability of methods for calculating the optimal solution (exact or approximate).

In decision theory, a large number of formal principles of optimal behavior have been developed:

1. The principle of maximization (minimization) is used mainly in mathematical programming problems, where the maximum or minimum of an objective function is sought.

2. The principle of criteria convolution is used mainly when many criteria are optimized by one coordinating center (the multi-criteria optimization problem).

Each of the criteria (objective functions) $f_1(x), \dots, f_n(x)$ is assigned a weight $\alpha_i$ by experts; every weight is positive and the weights sum to 1: $\alpha_i > 0$, $\sum_{i=1}^{n} \alpha_i = 1$. Each weight shows the importance (significance) of its criterion. The decision to be made must maximize or minimize the convolution of the criteria, $\sum_{i=1}^{n} \alpha_i f_i(x)$, where the decision $x$ is selected from the set $X$.
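The convolution rule described above can be sketched in a few lines. The function `weighted_sum_choice`, the two sample criteria and the weights are hypothetical; the sketch assumes a finite set of alternatives and criteria that are all to be maximized.

```python
def weighted_sum_choice(alternatives, criteria, weights):
    """Pick the alternative that maximizes the weighted sum of the criteria.

    criteria is a list of functions f_i(x); weights must be positive and sum to 1.
    """
    if any(w <= 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be positive and sum to 1")

    def convolution(x):
        return sum(w * f(x) for w, f in zip(weights, criteria))

    return max(alternatives, key=convolution)


def profit(x):
    """Hypothetical criterion that grows with the output x."""
    return 3 * x


def quality(x):
    """Hypothetical criterion that is best at x = 4."""
    return -((x - 4) ** 2)


# Choose x from {0, ..., 10} with expert weights 0.4 and 0.6.
best_x = weighted_sum_choice(range(11), [profit, quality], [0.4, 0.6])
print(best_x)  # prints: 5
```

The result is a compromise between the two criteria: profit alone would pick x = 10 and quality alone x = 4. Changing the weights shifts the compromise, which is why the choice of weights is left to experts.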

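3. The principle of lexicographic preference. First, the optimality criteria are ranked by importance and arranged as an ordered set of objective functions $f_1, f_2, \dots, f_n$. A solution $x$ is preferable to a solution $y$ if one of the following n + 1 conditions is met (written here for criteria that are maximized):

1) $f_1(x) > f_1(y)$;

2) $f_1(x) = f_1(y)$, $f_2(x) > f_2(y)$;

...

n) $f_i(x) = f_i(y)$ for $i = 1, \dots, n-1$, and $f_n(x) > f_n(y)$;

n + 1) $f_i(x) = f_i(y)$ for all $i = 1, \dots, n$, i.e. all the values coincide.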
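Lexicographic preference is easy to illustrate because Python compares tuples element by element, in exactly this order of importance. The sketch below assumes that all criteria are maximized; the function name and the sample offers are hypothetical.

```python
def lexicographic_best(alternatives, criteria):
    """Return the alternative whose criteria vector is lexicographically largest.

    criteria must be ordered from the most to the least important.
    """
    # Tuples are compared by the first element, then the second, and so on.
    return max(alternatives, key=lambda x: tuple(f(x) for f in criteria))


# Hypothetical offers: (name, reliability, price). Reliability matters first;
# price is negated so that a lower price counts as a larger value.
offers = [("A", 3, 100), ("B", 5, 140), ("C", 5, 120)]
best_offer = lexicographic_best(offers, [lambda o: o[1], lambda o: -o[2]])
print(best_offer[0])  # prints: C
```

Offers B and C tie on the most important criterion, so the second criterion decides between them. Offer A is never considered on price, although it is the cheapest.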

4. The minimax principle is applied when the interests of opposing sides clash, i.e. in a conflict. Each decision maker calculates the guaranteed (worst-case) outcome for each of his strategies and then chooses the strategy for which this guaranteed outcome is the largest. Such an action does not yield the maximum gain, but it is the only reasonable principle in a conflict; in particular, it excludes any risk.

5. The Nash equilibrium principle generalizes the minimax principle to the case when many parties participate in the interaction, each pursuing its own goal, but there is no direct confrontation. If the number of decision makers is n, then the situation $(x_1, x_2, \dots, x_n)$ chosen by them is called an equilibrium if a unilateral deviation of any participant from it can only decrease his payoff. In an equilibrium situation the participants do not necessarily receive the maximum payoff, but they are compelled to adhere to it.
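For two participants with finitely many strategies, equilibrium situations in pure strategies can be found by direct enumeration. The sketch below uses the common weak form of the condition (no participant can gain by deviating alone); the function name and the prisoner's-dilemma payoffs are illustrative assumptions.

```python
def pure_nash_equilibria(payoff_a, payoff_b):
    """Find all pure-strategy Nash equilibria of a two-player game.

    payoff_a[i][j] and payoff_b[i][j] are the payoffs of players A and B
    when A plays row i and B plays column j.
    """
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            # A cannot gain by switching rows while B keeps column j.
            a_ok = all(payoff_a[k][j] <= payoff_a[i][j] for k in range(rows))
            # B cannot gain by switching columns while A keeps row i.
            b_ok = all(payoff_b[i][k] <= payoff_b[i][j] for k in range(cols))
            if a_ok and b_ok:
                equilibria.append((i, j))
    return equilibria


# Prisoner's dilemma with illustrative payoffs; 0 = cooperate, 1 = defect.
pd_a = [[-1, -3], [0, -2]]
pd_b = [[-1, 0], [-3, -2]]
pd_equilibria = pure_nash_equilibria(pd_a, pd_b)
print(pd_equilibria)  # prints: [(1, 1)]
```

The only equilibrium is mutual defection with payoffs (-2, -2), although mutual cooperation would give both players -1. This shows that an equilibrium need not give the participants their maximum payoff.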

6. The principle of Pareto optimality treats as optimal those situations in which it is impossible to improve the payoff of any one participant without worsening the payoffs of other participants. This principle imposes weaker requirements on the concept of optimality than the Nash equilibrium principle, so Pareto-optimal situations almost always exist.
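For a finite list of situations, the Pareto-optimal ones can be found by filtering out every situation that is dominated by another. This is a minimal sketch assuming that payoffs are maximized; the function name and the payoff pairs are illustrative.

```python
def pareto_optimal(outcomes):
    """Return the payoff tuples that are not dominated by any other outcome.

    An outcome p dominates q if p is at least as good for every participant
    and strictly better for at least one (payoffs are maximized).
    """
    def dominates(p, q):
        return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

    return [q for q in outcomes if not any(dominates(p, q) for p in outcomes)]


# Payoff pairs of the four situations in an illustrative prisoner's dilemma.
pd_outcomes = [(-1, -1), (-3, 0), (0, -3), (-2, -2)]
pareto_set = pareto_optimal(pd_outcomes)
print(pareto_set)  # prints: [(-1, -1), (-3, 0), (0, -3)]
```

Three of the four situations are Pareto-optimal, which illustrates how weak the requirement is. Only (-2, -2) is excluded, because (-1, -1) is better for both participants.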

7. The principle of non-dominated outcomes is one representative of the many principles of optimality in problems of collective decision making; it leads to the concept of the core. In this case all participants unite and, by joint coordinated actions, maximize the total gain. The principle of non-dominance is one of the principles of fair division of the total gain among the participants: a situation arises in which none of the participants can reasonably object to the proposed division.

8. The principle of stability (threats and counter-threats). Each team of participants puts forward its proposal with certain conditions. If these conditions are not met, certain sanctions will follow. The optimal solution is when there is a counter-threat from the other team against any threat.

9. Arbitration schemes are based on an analysis of the conflict situation and on its resolution with the help of an arbitrator. The optimal solution is built using a system of axioms that includes several principles of optimality.

10. The principle of extreme pessimism, or the Wald criterion. According to this principle, a game against nature (decision making under uncertainty) is played as if against a rational, aggressive opponent who does everything to prevent our success.

11. The principle of minimax risk (the Savage criterion) is also pessimistic in nature, but when choosing the optimal strategy it focuses not on the gain but on the risk, defined as the difference between the maximum gain attainable under the given conditions and the actual gain. The strategy with the smallest maximum risk is considered optimal.

12. The principle of pessimism-optimism, or the Hurwicz criterion. The principle maximizes a weighted average of the outcomes of extreme optimism and extreme pessimism. The weights are chosen from subjective considerations, depending on how dangerous the situation is.
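The three criteria for decisions under uncertainty (Wald, minimax risk, Hurwicz) can be compared on one payoff matrix. The sketch below is illustrative only: the matrix is hypothetical, and placing the weight `alpha` on the pessimistic outcome is one of two common conventions, assumed here.

```python
def wald(payoffs):
    """Maximin: index of the strategy with the best worst-case payoff."""
    return max(range(len(payoffs)), key=lambda i: min(payoffs[i]))


def savage(payoffs):
    """Minimax risk: index of the strategy with the smallest maximum risk."""
    n_states = len(payoffs[0])
    # Best achievable payoff in each state of nature.
    best = [max(row[j] for row in payoffs) for j in range(n_states)]
    risk = [[best[j] - row[j] for j in range(n_states)] for row in payoffs]
    return min(range(len(payoffs)), key=lambda i: max(risk[i]))


def hurwicz(payoffs, alpha):
    """Index maximizing alpha * worst + (1 - alpha) * best, with alpha in [0, 1]."""
    return max(
        range(len(payoffs)),
        key=lambda i: alpha * min(payoffs[i]) + (1 - alpha) * max(payoffs[i]),
    )


# Hypothetical payoff matrix: rows are strategies, columns are states of nature.
payoffs = [
    [5, 5, 5],
    [0, 12, 9],
    [2, 14, 3],
]
print(wald(payoffs), savage(payoffs), hurwicz(payoffs, 0.5))  # prints: 0 1 2
```

On this matrix the three principles pick three different strategies. With `alpha = 1` the Hurwicz criterion coincides with the Wald criterion.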

The concept of dynamic stability is as follows. All the above principles are formulated for static problems, so their application in dynamic problems involves complications: a principle of optimality chosen in the initial state must remain optimal until the end of the dynamic process. This property is called dynamic stability, and it can be regarded as the principle of realizability of the static principles of optimal behavior in dynamic decision-making models.

Organizational activity. Alternative paradigms of the organizational process.

The whole variety of approaches to organizational activity can be represented in the form of two alternative paradigms (Table 5.1), which reflect two fundamentally different approaches. The first can be conditionally called the approach of coercion: effort must be applied to create and maintain the organization, and as soon as this effort stops, the system returns to its original state. One can construct as many artificial organizational schemes as one likes, but they will be fragile and inefficient. History knows many such examples: collective farms, economic councils, production associations, and so on.

Table 5.1

Alternative Organizational Process Paradigms

The second approach relies on the natural processes of organization, which unfold over a time long enough to leave room for human will. Human goals that fall outside the range of natural development (for example, the creation of collective farms) are doomed to failure, no matter what resources are committed to achieving them. At the same time, there is no fatalism here: a person, with his goal setting and volitional activity, is not excluded from the development process. Only one condition must be fulfilled: the space of human goals must coincide with the range of directions of natural (in principle possible) development. An orientation towards natural development can also be found in the works of A. Smith, who argued that peace, light taxes and tolerance in management are necessary for the socio-economic development of society, and that everything else will be done by the natural course of things.

Control system - cybernetic approach. Control principles: principle of open control; the principle of open control with disturbance compensation; the principle of closed control; single control principle.

Organization as a process of organizing is one of the main functions of management. The management function is understood as a set of repetitive management actions, united by the unity of content. Since the organization (as a process) serves as a management function, any management is an organizational activity, although it is not limited to it.

Management is a specially oriented influence on the system, which ensures that it is given the required properties or states. One of the state attributes is structure.

To organize means, first of all, to create (or change) a structure.

Despite the differences in approaches to the construction of control systems, there are general patterns established in cybernetics. From the point of view of the cybernetic approach, a control system is an integral set consisting of the control subject (the controlling system), the control object (the controlled system), and the direct and feedback links between them. It is also assumed that the control system interacts with the external environment.

The basic classification feature in building control systems, which determines the type of system and its potential capabilities, is the method of organizing the control loop. Accordingly, several principles of organizing the control loop are distinguished.

The principle of open-loop (program) control. This principle is based on the idea of acting on the system autonomously, regardless of the conditions of its operation. Obviously, the practical application of this principle presupposes reliable knowledge of the state of the environment and of the system over the entire interval of its operation. It is then possible to predetermine the reaction of the system to the calculated action, which is programmed in advance as a function of time (Fig. 5.1).

Fig. 5.1. Open-loop control principle

If the actual action on the system differs from the expected one, deviations in the behavior of the output coordinates follow immediately, i.e. the system is unprotected from disturbances in the literal sense of the word. Therefore this principle is used only when there is confidence in the reliability of information about the operating conditions of the system. For organizational systems, for example, such confidence is justified when executive discipline is high and an issued order needs no follow-up control. Such management is sometimes called directive. The undoubted advantage of this scheme is the simplicity of organizing control.

The principle of open-loop control with disturbance compensation. The aim of the approach is to remove the limitation of the first scheme, i.e. the uncontrolled impact of disturbances on the functioning of the system. The possibility of compensating disturbances, and hence of removing the unreliability of a priori information, rests on the disturbances being accessible to measurement (Fig. 5.2).


Fig. 5.2. Control principle with disturbance compensation

The measurement of disturbances makes it possible to determine a compensating control that counteracts their consequences. Usually the system is subjected to the program action along with the corrective control. In practice, however, it is far from always possible to obtain information about external disturbances, not to mention monitoring deviations of system parameters or unexpected structural changes. When information about disturbances is available, the principle of compensating them by a compensating control is of practical interest.

The principle of closed-loop control. The principles discussed above belong to the class of open control loops: the magnitude of the control action does not depend on the behavior of the object but is a function of time or of the disturbance. The class of closed control loops is formed by systems with negative feedback, which embody the basic principle of cybernetics.

In such systems it is not the input action that is programmed in advance but the required state of the system, i.e. the consequence of all actions on the object, including control. Consequently, a disturbance may even have a positive effect on the dynamics of the system if it brings its state closer to the desired one. To implement the principle, the program law of change of the system state in time, $C_{pr}(t)$, is determined a priori, and the task of the system is formulated as bringing the actual state $C(t)$ close to the desired one (Fig. 5.3). This is achieved by determining the difference between the desired state and the actual one:

$\Delta C(t) = C_{pr}(t) - C(t)$.


Fig. 5.3. Closed-loop control principle

This difference is used for control to minimize the detected mismatch. This ensures the approximation of the controlled coordinate to the program function, regardless of the reasons that caused the appearance of the difference, be it disturbances of various origins or control errors. The quality of control affects the nature of the transient process and the steady-state error - the discrepancy between the program and the actual final state.
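The effect of feeding the mismatch back into the control can be seen on a toy discrete-time model. Everything in the sketch below is an assumption made for illustration: the scalar plant `state += control + disturbance`, the unit-ramp program state, the proportional gain, and the constant disturbance.

```python
def simulate(steps, gain, disturbance):
    """Simulate C[k+1] = C[k] + u[k] + disturbance for a toy scalar system.

    The program state is C_pr[k] = k (a unit ramp). The control is the
    programmed increment plus proportional feedback on the mismatch;
    gain = 0 gives pure open-loop (program) control.
    Returns the final mismatch C_pr - C.
    """
    state = 0.0
    for k in range(steps):
        program = float(k)
        mismatch = program - state        # Delta C = C_pr - C
        control = 1.0 + gain * mismatch   # programmed increment + feedback
        state = state + control + disturbance
    return float(steps) - state


open_loop_error = simulate(steps=50, gain=0.0, disturbance=-0.2)
closed_loop_error = simulate(steps=50, gain=0.5, disturbance=-0.2)
print(round(open_loop_error, 3), round(closed_loop_error, 3))  # prints: 10.0 0.4
```

Without feedback the unmeasured disturbance accumulates, and the mismatch grows to 10. With feedback it settles at a small steady-state error (disturbance divided by gain), whatever the source of the deviation.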

Depending on the input signal, control theory distinguishes:

■ program control systems (the case under consideration);

■ stabilization systems, when $C_{pr}(t) = 0$;

■ tracking systems, when the input signal is not known a priori.

This detailing does not affect the implementation of the principle in any way, but introduces specifics into the technique of building the system.

The widespread use of this principle in natural and artificial systems is explained by the efficiency of the loop organization: the control problem is effectively solved at the conceptual level due to the introduction of negative feedback.

So far we have considered the case in which the change of the system state in time, $C_{pr}(t)$, is programmed, which means a preliminary calculation of the trajectory in the state space. The question of how to do this, however, has been left aside. The answer is constrained by two requirements on the trajectory, which must:

1) pass through the target;

2) satisfy the extremum of the quality criterion, i.e. be optimal.

In formalized dynamic systems, to find such a trajectory, the calculus of variations or its modern modifications are used: the maximum principle of L. Pontryagin or dynamic programming of R. Bellman. In the case when the problem is reduced to the search for unknown parameters (coefficients) of the system, mathematical programming methods are used to solve it - it is required to find the extremum of the quality function (indicator) in the parameter space. To solve poorly formalized problems, it remains to rely on heuristic solutions based on futurological forecasts, or on the results of simulation mathematical modeling. It is difficult to assess the accuracy of such solutions.

Let us return to the problem of programming. If there is a way to calculate a program trajectory for formalized problems, then it is natural to require that the control system be given only the target and find the program change of the system state directly in the course of control (terminal control). Such an organization of the system will, of course, complicate the control algorithm, but it minimizes the initial information required and thus makes control more efficient. In the 1960s a similar problem was solved theoretically by Professor E. Gorbatov for controlling the motion of ballistic missiles and spacecraft.

With regard to the formulation and solution of the optimal control problem, the following fundamental circumstance should be taken into account.

It is possible to choose the optimal behavior of the system only if the behavior of the object under study is reliably known over the entire control interval and the conditions under which the movement occurs.

Optimal solutions can also be obtained by fulfilling other, additional assumptions, but the point is that each case should be specified separately, the solution will be valid “up to conditions”.

Let us illustrate the formulated position on the example of the behavior of a runner who strives to achieve a high result. If we are talking about a short distance (100, 200 m), then a trained athlete aims to ensure maximum speed at any given time. When running over longer distances, success is determined by his ability to properly distribute forces on the track, and for this he must clearly understand his capabilities, the terrain of the route and the characteristics of his rivals. In conditions of limited resources, there can be no question of any maximum speed at any moment.

It is quite obvious that the above constraint is satisfied only within the deterministic formulation of the problem, i.e. when everything is known a priori. Such conditions turn out to be excessive for real problems: the Procrustean bed of determinism does not correspond to the actual conditions of the system functioning. The a priori nature of our knowledge is extremely doubtful both in relation to the system itself and the environment and its interaction with one or another object. The reliability of a priori information is the less, the more complex the system, which does not add optimism to researchers conducting the synthesis procedure.

Such uncertainty has led to the emergence of a whole trend in control theory based on taking into account the stochastic conditions for the existence of the system. The most constructive results were obtained in the development of the principles of adaptive and self-adjusting systems.

Control optimization. Adaptive and self-adjusting systems.

Adaptive systems allow you to cope with uncertainty by obtaining additional information about the state of the object and its interaction with the environment in the process of management, followed by restructuring the system structure and changing its parameters when operating conditions deviate from a priori known (Fig. 5.4). In this case, as a rule, the purpose of transformations is to approximate the characteristics of the system to the a priori ones used in the synthesis of control. Thus, adaptation is focused on maintaining the homeostasis of the system under perturbations.


Fig. 5.4. Adaptive system

One of the most difficult constructive components of this task is obtaining information about the state of the environment, without which it is difficult to carry out adaptation.

An example of successfully obtaining information about the state of the environment is the Pitot tube, with which almost all aircraft are equipped. The tube measures the velocity head (dynamic pressure), the most important characteristic, on which all aerodynamic forces directly depend. The measurement results are used to adjust the autopilot. A similar role in social systems is played by sociological surveys, which make it possible to correct solutions to domestic and foreign policy problems.

An effective technique for studying the dynamics of a control object is the dual control method, once proposed by A. Feldbaum. Its essence lies in the fact that, along with control commands, special testing signals are sent to the object, the reaction to which is predetermined for the a priori model. By the deviation of the reaction of the object from the reference, the interaction of the model with the external environment is judged.

A similar technique was used in Russian counterintelligence during the First World War to identify a spy. A circle of employees suspected of betrayal was singled out, and each of this circle was “trusted” with important, but false information of a unique nature. The reaction of the enemy was observed, according to which the traitor was identified.

Self-adjusting systems are singled out as a class within adaptive systems; they are tuned in the process of adaptation. At the accepted level of generality, however, the structure of a self-adjusting system is similar to that of an adaptive system (see Fig. 5.4).

Regarding the processes of adaptation and self-tuning, it can be noted that their feasibility in specific cases is mainly determined by the purpose of the system and its technical implementation. The theory of such systems is replete with illustrations but does not seem to contain generalizing results.

Another way to overcome the insufficiency of a priori data on the control process is to combine the control process with the procedure for its synthesis. Traditionally, the control algorithm is the result of a synthesis based on the assumption of a deterministic description of the motion model. But deviations of the actual motion from the adopted model obviously affect the accuracy of reaching the goal and the quality of the processes, i.e. they lead to a deviation from the extremum of the criterion. It follows that control should be built as terminal control, calculating the trajectory in real time and updating the information about the object model and the motion conditions. Of course, in this case it is also necessary to extrapolate the motion conditions over the entire remaining control interval, but as the goal is approached, the accuracy of the extrapolation increases, and with it the quality of control.

There is an analogy here with the actions of a government that is unable to meet planned targets, such as budget ones. The conditions in which the economy functions change in unplanned ways, contrary to forecasts, so the plan has to be adjusted constantly in an effort to achieve the final indicators, in particular by sequestering the budget. Deviations from a priori assumptions can be so great that the available resources and the management measures taken can no longer ensure the achievement of the goal. Then the target has to be "brought closer", placing it inside the new reachable region. Note that the described scheme is valid only for a stable system. Poor organization of management can lead to destabilization and, as a result, to the destruction of the entire system.

Let us dwell on one more control principle underlying the developed theory of operations research.

The single control principle. A wide range of practically significant problems requires a single act of management, namely making a decision whose consequences are felt for a long time. Of course, traditional management can also be interpreted as a sequence of one-time decisions. Here we again encounter the problem of discreteness and continuity, the boundary between which is as blurred as that between static and dynamic systems. A difference nevertheless exists: in classical control theory it is assumed that the action on the system is a process, a function of time or of the state parameters, and not a one-time procedure.

Another distinctive feature of operations research is that this science operates with controls - constants, system parameters. Then, if in dynamic problems a mathematical construction is used as a criterion - a functional that estimates the movement of the system, then in the study of operations, the criterion has the form of a function specified on the sets of the studied parameters of the system.

The area of practical problems covered by operations research is very extensive and includes resource allocation, route selection, planning, inventory management, queuing problems, etc. In solving these problems, the methodology described above is used, with its categories of model, state, goal, criterion, and control. The optimization problem is formulated and solved in the same way: it consists in finding the extremum of the criterion function in the parameter space. Problems are solved in both deterministic and stochastic settings.

Since the procedure for operating with constants is much simpler than operating with functions, the theory of operations research turned out to be more advanced than the general theory of systems and, in particular, the theory of control of dynamical systems. Operations research offers a larger arsenal of mathematical tools, sometimes very sophisticated, for solving a wide range of practically significant problems. The whole set of mathematical methods serving the research of operations has received the name of mathematical programming. So, within the framework of operations research, the theory of decision making is developing - an extremely relevant area.

Decision theory, in essence, considers the optimization procedure under a detailed description of a vector criterion and the specifics of finding its extreme value. Thus the problem statement is characterized by a criterion consisting of several components, i.e. it is a multi-criteria problem.

To emphasize the subjectivity of the criterion and of the decision-making process, a decision maker (DM) is introduced into consideration, who has an individual view of the problem. When solutions are studied by formal methods, this manifests itself through a system of preferences in evaluating one or another component of the criterion.

As a rule, the decision maker is given several options for action, each of which has been evaluated. This approach is as close as possible to the real conditions in which a responsible person in an organizational system acts when choosing one of the options prepared by the staff. Behind each option is a study (analytical or by simulation modeling) of a possible course of events with an analysis of the final results, i.e. a scenario. To make responsible decisions more convenient, situation rooms are organized, equipped with visual means of displaying scenarios on displays or screens. For this purpose, specialists (operations analysts) are involved who have mastered not only the mathematical methods of analyzing situations and preparing decisions but also the subject area.

Clearly, the result of applying operations research, and decision theory in particular, to an object is some optimal plan of action. Consequently, a block "stuffed" with an optimization algorithm, built by the appropriate method of mathematical programming from a model of the situation, receives the following information at its input: the initial state, the goal, the quality criterion, the list of variable parameters, and the constraints. (The system model is used when constructing the algorithm.) The output of the block is the desired plan. From the point of view of cybernetics, such a construction is classified as an open control loop, since the output information does not affect the input signal.

In principle, the considered approach can also be applied to the case of closed-loop control. To do this, an iterative process in time must be organized: after the plan has been implemented, the new state of the system is entered as the initial condition and the cycle is repeated. If the problem allows, the planning period can be shortened by bringing the goal closer to the initial state of the system. One can then see the analogy between these actions and the iterative procedure of terminal control considered above, which is also based on periodic updating of the initial information. Moreover, a dynamic problem operating with processes can be reduced to the approximation of functions by functional series. The parameters of such series then become the variable parameters, which means that the apparatus of operations research is applicable. (Similar things have been done in probability theory, where random processes are described by a canonical expansion.)
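The iterative scheme just described (plan, execute a step, take the resulting state as the new initial condition, plan again) can be sketched on a toy problem. All details below are illustrative assumptions: the scalar state, the trivial "plan" that splits the remaining distance equally over the remaining steps, and the disturbance values.

```python
def run(target, steps, disturbances, replan):
    """Move a scalar state toward target in a fixed number of steps.

    A 'plan' here is an equal split of the remaining distance over the
    remaining steps. With replan=False the plan is computed once (open loop);
    with replan=True it is recomputed from the actual state before every step.
    """
    state = 0.0
    step_size = (target - state) / steps
    for k in range(steps):
        if replan:
            step_size = (target - state) / (steps - k)
        state += step_size + disturbances[k]
    return state


noise = [0.3, -0.1, 0.2, 0.1, 0.0]  # hypothetical disturbances, one per step
one_shot = run(10.0, 5, noise, replan=False)
replanned = run(10.0, 5, noise, replan=True)
print(round(one_shot, 3), round(replanned, 3))  # prints: 10.5 10.0
```

The one-shot plan misses the target by the sum of all disturbances. With replanning only the disturbance of the last step remains uncorrected, because every earlier deviation is absorbed when the plan is recomputed.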

The described methodology began to find application in the theory of artificial intelligence in the synthesis of situational control.

Attention should be drawn to the danger of decision theory being applied in practice by people who are not sufficiently competent in systems theory. In organizational systems (state institutions, firms, financial organizations), decision making is often absolutized and reduced to operating with numerous indicators and to the optimal implementation of a one-time management act. The consequences of the action for the system are then overlooked. People forget that they control the system, not the criterion, and fail to take into account the multi-stage nature of the closed process: from the system to its state, then through the indicators to the decision, and back to the system. On this long path many mistakes are made, both objective and subjective, and these are enough to cause a serious deviation from the planned results.

The principle of optimality is understood as the set of rules by which the decision maker determines his action (decision, alternative, strategy, managerial decision) that best contributes to the achievement of his goal. The principle of optimality is chosen based on the specific decision-making conditions: the number of participants, their capabilities and goals, the nature of the conflict of interests (antagonism, non-antagonism, cooperation, etc.).

In decision making models, especially in game theory, a large number of formal principles of optimal behavior have been developed. We will only focus on a few of them here.

Principle of maximization (minimization). This principle is applied mainly in problems of mathematical programming (see (2) - (4)).

Criteria convolution principle. It is used when many criteria are "optimized" by one coordinating center (multi-criteria optimization problem (5)). Each of the criteria (objective functions)

f_1(x), ..., f_n(x)

is assigned a "weight" (a number) by expert judgment:

α_1, ..., α_n

where α_i shows the "importance or significance" of the criterion f_i. Next, the solution x* from the set of feasible solutions X is chosen so as to maximize (or minimize) the convolution of the criteria, which in its usual form is the weighted sum

F(x) = α_1 f_1(x) + ... + α_n f_n(x)
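The convolution of criteria can be illustrated with a minimal sketch, assuming the common weighted-sum form of the convolution and a finite set of feasible solutions. The function name `weighted_sum_choice`, the sample plans, and the weights are hypothetical and serve only as an illustration.

```python
def weighted_sum_choice(alternatives, criteria, weights):
    """Pick the alternative that maximizes the weighted sum of the criteria.

    alternatives: finite list of feasible solutions (an assumption of this sketch)
    criteria:     list of objective functions f_i
    weights:      expert-assigned weights alpha_i, one per criterion
    """
    def convolution(x):
        return sum(a * f(x) for a, f in zip(weights, criteria))

    return max(alternatives, key=convolution)


# Hypothetical data: each plan is a pair (profit score, reliability score).
plans = [(4, 1), (3, 3), (1, 5)]
criteria = [lambda p: p[0], lambda p: p[1]]

profit_minded = weighted_sum_choice(plans, criteria, [0.7, 0.3])
safety_minded = weighted_sum_choice(plans, criteria, [0.3, 0.7])
print(profit_minded, safety_minded)
```

The same set of plans yields different "optimal" solutions under different weights, which is why the choice of weights is left to experts.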

The principle of lexicographic preference. This is another principle of optimality in multiobjective optimization problems. First, the criteria are ranked by "importance". Let this ranking be:

f_1(x), f_2(x), ..., f_n(x)

A solution x* ∈ X is "better" than a solution x ∈ X in terms of lexicographic preference if one of the following n+1 conditions is met (the last condition covers the case in which the two solutions are equivalent):

1) f_1(x*) > f_1(x);

2) f_1(x*) = f_1(x), f_2(x*) > f_2(x);

3) f_1(x*) = f_1(x), f_2(x*) = f_2(x), f_3(x*) > f_3(x);

………………

n) f_i(x*) = f_i(x) for i = 1, …, n-1, f_n(x*) > f_n(x);

n+1) f_i(x*) = f_i(x) for i = 1, …, n.
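The n+1 conditions amount to comparing the criterion vectors position by position, from the most important criterion to the least important. A minimal sketch, assuming the criteria are already ordered by importance and that larger values are better; the names `lex_no_worse` and `lex_best` and the sample data are hypothetical.

```python
def lex_no_worse(fx_star, fx):
    """Return True if one of the n+1 lexicographic conditions holds.

    fx_star, fx: criterion vectors (f_1, ..., f_n) of solutions x* and x,
    listed from the most important criterion to the least important.
    """
    for a, b in zip(fx_star, fx):
        if a > b:       # conditions 1..n: the first differing criterion favors x*
            return True
        if a < b:       # the first differing criterion favors x
            return False
    return True         # condition n+1: all criteria are equal


def lex_best(alternatives, criteria):
    """Pick a lexicographically best alternative from a finite list.

    Python compares tuples lexicographically, so the built-in ordering
    of the criterion vectors can be used directly.
    """
    return max(alternatives, key=lambda x: tuple(f(x) for f in criteria))


# Hypothetical data: the first coordinate is the more important criterion.
options = [(2, 9), (3, 1), (3, 4)]
criteria = [lambda p: p[0], lambda p: p[1]]
print(lex_best(options, criteria))
```

Note that the second criterion matters only among solutions that tie on the first one, so (3, 1) is preferred to (2, 9) despite the large gap in the second criterion.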

Minimax principle. It is used when the interests of two opposing sides clash (antagonistic conflict). Each decision maker first calculates a "guaranteed" result for each of his strategies (alternatives), that is, the worst result the opponent can impose on him, and then chooses the strategy for which this guaranteed result is the largest. Such a choice does not give the decision maker the "maximum gain", but it is the only reasonable principle of optimality under antagonistic conflict. In particular, any risk of receiving less than the guaranteed result is excluded.
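The two-step procedure described above (compute the guaranteed result of each strategy, then take the largest) can be sketched for a finite payoff matrix. This is an illustration under the assumption that rows are the decision maker's pure strategies and columns are the opponent's; the function name `maximin_strategy` and the matrix are hypothetical.

```python
def maximin_strategy(payoff):
    """Return (index of the chosen row strategy, its guaranteed payoff).

    payoff[i][j] is the decision maker's gain when he plays strategy i
    and the opponent plays strategy j.
    """
    # Step 1: the guaranteed result of a strategy is its worst-case payoff.
    guaranteed = [min(row) for row in payoff]
    # Step 2: choose the strategy with the largest guaranteed result.
    best = max(range(len(payoff)), key=lambda i: guaranteed[i])
    return best, guaranteed[best]


# Hypothetical payoff matrix.
payoff = [
    [3, -1, 2],
    [1, 0, 1],
    [4, -3, 0],
]
print(maximin_strategy(payoff))
```

Here the third strategy offers the largest possible gain (4) but can also lose 3, so the cautious choice is the second strategy, which never yields less than 0.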

Equilibrium principle. This is a generalization of the minimax principle to the case where many parties participate in the interaction, each pursuing its own goal (there is no direct confrontation). Let the number of decision makers (participants in a non-antagonistic conflict) be n. A set of chosen strategies (a situation) x_1*, x_2*, …, x_n* is called an equilibrium if a unilateral deviation of any decision maker from this situation can only lead to a decrease in his own "gain". In an equilibrium situation the participants do not necessarily receive the "maximum" payoff, but they are forced to adhere to it.
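For two participants with finitely many strategies, the equilibrium condition can be checked directly by testing every unilateral deviation. The sketch below is only an illustration: it looks for equilibria in pure strategies, treats a deviation that leaves the payoff unchanged as acceptable, and uses a hypothetical payoff table of the "prisoner's dilemma" type. The name `pure_equilibria` is likewise an assumption of this example.

```python
def pure_equilibria(a, b):
    """List the pure-strategy equilibrium situations of a two-person game.

    a[i][j]: payoff of the first participant when strategies (i, j) are chosen
    b[i][j]: payoff of the second participant in the same situation
    """
    rows, cols = len(a), len(a[0])
    found = []
    for i in range(rows):
        for j in range(cols):
            # The first participant cannot gain by changing his row alone.
            first_stays = all(a[i][j] >= a[k][j] for k in range(rows))
            # The second participant cannot gain by changing his column alone.
            second_stays = all(b[i][j] >= b[i][m] for m in range(cols))
            if first_stays and second_stays:
                found.append((i, j))
    return found


# Hypothetical payoffs; strategy 0 = cooperate, strategy 1 = defect.
a = [[-1, -10],
     [0, -5]]
b = [[-1, 0],
     [-10, -5]]
print(pure_equilibria(a, b))
```

The only equilibrium is mutual defection, where each participant receives -5, although both would receive -1 under mutual cooperation. This illustrates the remark that participants in an equilibrium do not necessarily obtain the "maximum" payoff.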

Pareto's principle of optimality. This principle regards as optimal those situations (sets of strategies x_1, …, x_n) in which the "payoff" of an individual participant cannot be improved without worsening the "payoffs" of other participants. This principle imposes weaker requirements on the concept of optimality than the equilibrium principle. Therefore, Pareto-optimal situations almost always exist.
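When the set of possible situations is finite, the Pareto-optimal ones can be found by discarding every situation that is dominated by another. A minimal sketch, assuming each situation is represented by its vector of payoffs and that larger payoffs are better; the function names and the sample payoffs are hypothetical.

```python
def dominates(p, q):
    """True if payoff vector p is at least as good as q for every
    participant and strictly better for at least one."""
    return (all(x >= y for x, y in zip(p, q))
            and any(x > y for x, y in zip(p, q)))


def pareto_optimal(outcomes):
    """Keep the outcomes that no other outcome dominates."""
    return [p for p in outcomes
            if not any(dominates(q, p) for q in outcomes)]


# Hypothetical payoff vectors of two participants in four situations.
outcomes = [(-1, -1), (-10, 0), (0, -10), (-5, -5)]
print(pareto_optimal(outcomes))
```

In this example (-5, -5) is excluded because (-1, -1) is better for both participants, while the remaining three situations cannot be improved for one participant without harming the other.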

Principle of non-dominated outcomes. This principle is representative of many principles of optimality in cooperative games (collective decision making) and leads to the notion of a "core" of decisions. All participants unite and, by joint coordinated actions, maximize the "total gain". The principle of non-dominance is one of the principles of a "fair" division between participants: a division (an element of the "core") to which none of the participants can reasonably object. There are other principles for the "optimal" division of the total payoff.

Principles of stability (threats and counter-threats). The idea behind all principles of stability based on threats and counter-threats is as follows. Each coalition of participants puts forward its proposal, accompanying it with a real threat: if the proposal is not accepted by the other participants, actions will be taken that worsen the position of the other participants and do not worsen (and possibly improve) the position of the threatening coalition. The optimal solution is one in which, against any threat by any coalition, there is a counter-threat from some coalition.

Arbitration schemes. Economic conflicts call for a "public arbiter". It is undesirable for conflicts of interest to turn, for example, into open threats and counter-threats. There must be social mechanisms that take into account the preferences and strategic capabilities of each participant and ensure a "fair" resolution of the conflict. Such a mechanism, whether it is an individual or a voting system, is called an arbiter. In game theory, an optimal decision in the sense of an arbitration scheme is constructed using a system of axioms that includes such concepts as the status quo, Pareto optimality, linearity of alternatives, independence from "ranks", etc.

Consider further the issues of optimal decision making under uncertainty. To develop the optimal behavior of the decision maker, it is useful to model such a situation as an antagonistic game of two persons, where nature is considered as the opponent of the decision maker. The latter is endowed with all conceivable possibilities under the given conditions.

In "games with nature" there are specific (albeit reminiscent of the minimax principle) principles for the optimal choice of solution.

The principle of extreme pessimism (Wald's criterion). According to this principle, the game with nature (decision making under uncertainty) is played as a game with a reasonable, aggressive opponent who does everything to prevent us from achieving success. The decision maker's strategy is considered optimal if it guarantees a payoff no less than the largest one "permitted by nature", that is, if it maximizes the worst-case payoff.

Minimax risk principle (Savage's criterion). This principle is also pessimistic, but when choosing the optimal strategy it advises focusing not on the "winnings" but on the risk. The risk is defined as the difference between the maximum payoff the decision maker could obtain (with complete information about the state of nature) and the actual payoff (without knowledge of the state of nature). The optimal strategy is the one that minimizes the maximum risk.

Principle of pessimism-optimism (Hurwicz criterion). This criterion recommends that, when choosing a solution, one should be guided neither by extreme pessimism ("always expect the worst!") nor by extreme optimism ("perhaps luck will see us through!"). According to this criterion, the weighted average of the payoffs of extreme pessimism and extreme optimism is maximized. The "weight" is chosen from subjective considerations about how dangerous the situation is.
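The three criteria for "games with nature" can be compared on one payoff table. The following is a sketch under stated assumptions: rows are the decision maker's strategies, columns are states of nature, entries are payoffs, and the Hurwicz weight (here called `pessimism`) multiplies the worst-case payoff. The function names and the numbers are hypothetical.

```python
def wald(payoff):
    """Extreme pessimism: maximize the worst-case payoff."""
    return max(range(len(payoff)), key=lambda i: min(payoff[i]))


def savage(payoff):
    """Minimax risk: minimize the largest regret over states of nature."""
    states = range(len(payoff[0]))
    # Best achievable payoff in each state if the state were known in advance.
    best_in_state = [max(row[j] for row in payoff) for j in states]
    risk = [[best_in_state[j] - row[j] for j in states] for row in payoff]
    return min(range(len(payoff)), key=lambda i: max(risk[i]))


def hurwicz(payoff, pessimism):
    """Maximize a weighted average of the worst and the best payoff.

    pessimism = 1 reproduces Wald's criterion, pessimism = 0 is extreme optimism.
    """
    def score(i):
        return pessimism * min(payoff[i]) + (1 - pessimism) * max(payoff[i])
    return max(range(len(payoff)), key=score)


# Hypothetical payoffs: three strategies, three states of nature.
payoff = [
    [5, 5, 5],
    [2, 9, 6],
    [0, 12, 3],
]
print(wald(payoff), savage(payoff), hurwicz(payoff, 0.2))
```

On this table the three criteria recommend three different strategies (indices 0, 1 and 2 respectively), which shows that the choice of criterion is itself a substantive decision.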

The concept of dynamic stability. All the above principles of optimality are formulated in relation to static decision-making problems. An attempt to apply them in dynamic problems can be accompanied by all sorts of complications.

The main difficulty lies in the specific features of dynamic processes. A principle of optimality chosen in the initial state of the process (at the initial moment of time) must remain optimal in any current state (at any moment of time) until the end of the dynamic process. This property is called dynamic stability.

Behavior. Law of Optimal Behavior



So, we can state the discovery of the Law of optimal behavior, the same Law that reflects the general principle inherent in the behavior of any person.

From the Law of Optimal Behavior it is clear that a person is not able to go against himself, i.e. against his own interests.

It would seem that a person should live in complete harmony with the outside world - nature and society. But this is far from true. Apparently, there is a certain reason for disharmony, which we will have to identify, given that people's behavior, being a consequence of their thinking, is subject to an objective Law - the Law of optimal behavior.

It cannot be otherwise, because people's behavior is subject to the Law of optimal behavior, and it is only possible to control it by introducing various conditions.

First, it is obvious that the regulatory conditions within which ordinary workers are placed do not determine for them all, without exception, favorable consequences in the case of their good work and unfavorable ones in the case of poor work, i.e. introduce uncertainty into the sphere of labor relations. Employees, obeying the Law of Optimal Behavior, follow the path of least resistance and choose the type of behavior that is optimal for them at the moment, i.e. allows them to avoid those adverse consequences that are somehow determined by existing conditions. But despite this kind of partial satisfaction of their interests, workers are not able to choose the type of behavior that the administration expects from them, because their behavior is subject not to the intentions, not to the requirements of the administration, but to the Law. Undoubtedly, workers are able to give much more than they give under existing conditions, and, as a rule, they are well aware of this. All criticism of the workers against the administration is nothing but an expression on their part of the demand to supplement the regulatory conditions for the fullest satisfaction of their interests in productive work. In fact, employees unconsciously strive for certainty in labor relations, i.e. to ensure that all favorable and unfavorable consequences for them from one or another of their actions were always clear.

Moreover, through the introduction of any regulatory conditions, it is possible to one degree or another - depending on the degree of completeness of these conditions - to control human behavior. In fact, this is what happens in all spheres of public life, because the Law of optimal behavior is universal for human society.

From now on, we know that the main property inherent in any person, and hence in any employee of an organization, is to always act optimally, with the greatest benefit for oneself, taking into account all the consequences determined by the regulatory conditions. We also know the Law of Optimal Behavior, which we cannot change. Only one thing is left for us: to purposefully change the regulatory conditions so that a person naturally, precisely because of this main property, always acts rationally, with the greatest benefit for the organization. Only in this case does a person become a quality labor resource, entirely manageable. As an object of management, he will find it beneficial that management is always aimed at the rational use of all available resources.

On the other hand, these same people, being subject to the Law of Optimal Behavior and having committed an act that they ultimately came to regret, certainly faced a series of adverse consequences: a negative reaction of the external environment. Their optimal behavior turned out to be unreasonable (irrational) in relation to it.

Indeed, since any person is objectively subject to the Law of Optimal Behavior, it can be unequivocally stated that not a single person will act for the benefit of the external environment until this leads him to receive benefits for himself, until the rational in relation to the external environment becomes optimal for him.

If always R - 1, i.e. the initial degree of internal rationality is due to the operation of the Law of optimal behavior, then this or that actual degree of general rationality (R external environment, person, daily,

The employee's behavior is formed under the influence of the Law of Optimal Behavior.

In accordance with the Law of Optimal Behavior, the intellect of each individual is tirelessly guarding his own interests. At the moment of infringement of these interests, all his intellectual potential is objectively directed to their protection. And if the interests of the two subjects of labor relations - the entrepreneur and the employee - contradict each other, constructive and productive work in such a situation is simply impossible to organize, and even unthinkable.

Each person is individual, but, regardless of certain traits of his character, anyone is always inclined to justify his actions. If something goes wrong, a person, as a rule, considers his failure to be a consequence of the erroneous actions of the people around him. And in this he is right in his own way, because his behavior is always built taking into account his own interests - it is always subject to the Law of optimal behavior.

At first glance, the proposed situation is paradoxical. It is completely unclear who is actually right and who is wrong. The law of optimal behavior justifies everyone.

Thus, in the absence of criterial conditions, the manifestation of the Law of Optimal Behavior becomes negative, "destructive": everyone justifies only himself (and, as it seems to him, quite reasonably) and blames others (just as reasonably). Behavior that is irrational in relation to others is in this case optimal.

The reason for the pattern correctly noticed by Parkinson can be understood, again, by knowing the Law of Optimal Behavior.

So, only the presence of criterial conditions makes it possible to avoid the negative manifestation of the Law of optimal behavior, and it is the presence of these conditions that leads to the fact that the Law begins to "perform its creative work" in all spheres of social relations, without exception, where such conditions are introduced.

In order to more visually imagine the negative manifestation of the Law of Optimal Behavior in the field of labor relations, we will consider the negative consequences generated by the power of this Law, using the example of the most burning problems that exist today in this area.

Behavior that appears to be clearly adaptive, or well-planned, may either be the result of the animal using simple rules of thumb for behavior, or it may be cognitive or intentional behavior (see Section 26.7). For example, a child may cross the street under strict traffic rules. If the child is well trained, then his behavior when crossing the road will be automated. An adult person who has not been trained in these rules, for example, a foreigner, will think about how to cross the street, evaluate the speed and nature of the approaching traffic, etc. The external picture of the behavior of a child and an adult when crossing the street can be practically indistinguishable, but in one case this behavior is carried out on the basis of the simplest empirical rules, and in the other - on the basis of cognition.

It is possible to ensure optimal behavior through a simple set of rules. We find an example of this kind in the work of Green (Green, 1983), who analyzed the stopping rules that should ensure the optimality of foraging behavior. In his work, Green suggested that prey animals are distributed over different patches of land, which vary in quality, and on the best of them, predators catch their prey much faster. In different environmental conditions, the distribution of plots by quality will be different. It is assumed that a predator is able to distinguish between types of feeding areas only by evaluating its success in each of them. The predator does not return to the site where it has already been, and systematically examines each site until it decides to leave it and move to another.

An optimal foraging strategy can be characterized by a stopping rule that determines exactly when a predator should leave a given area. At any time, a predator can decide whether to leave or to stay in the area and continue searching for prey. Green shows that the best stopping rule is one based on the amount of prey obtained as a function of the time spent searching a given area. Alternative stopping rules include: the naive strategy, in which the predator relies on knowing the average probability of finding prey in each area; the omniscient strategy, in which the predator can evaluate the quality of each site without examining it and can thus avoid areas poor in prey; and, finally, the instantaneous-rate strategy, in which the predator leaves the hunting area when its instantaneous rate of obtaining food drops below a critical level. The best strategy, according to Green, involves assessing the quality of the site as it is searched. This strategy is more productive than the naive strategy and the instantaneous-rate strategy. It is also preferable to the omniscient strategy in that it places fewer demands on the ability of an individual animal to make calculations. Green's strategy can be represented as a simple rule: stay on the site as long as more than half of the places searched yield prey, otherwise leave. This strategy can be implemented through a simple mechanism.
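The simple rule attributed to Green above can be written down in a few lines. This is a literal reading of the sentence "stay as long as more than half of the places searched yield prey", not Green's own formulation: the function name, the representation of a patch as a sequence of hits and misses, and the sample patches are all assumptions made for illustration.

```python
def places_searched_before_leaving(patch):
    """Apply the 'more than half' stopping rule to one patch.

    patch: sequence of booleans, True if the searched place yielded prey.
    Returns how many places the predator searches before it leaves
    (or the patch size, if it never leaves before the patch is exhausted).
    """
    found = 0
    for searched, has_prey in enumerate(patch, start=1):
        if has_prey:
            found += 1
        # Leave as soon as the hits are no longer more than half of the places.
        if 2 * found <= searched:
            return searched
    return len(patch)


# Hypothetical patches of different quality.
rich_patch = [True, True, False, True, True, False, True]
poor_patch = [True, False, False, True]
print(places_searched_before_leaving(rich_patch))
print(places_searched_before_leaving(poor_patch))
```

The predator needs to keep only two counters (places searched and prey found), which is the sense in which the rule places small demands on the animal. Under this literal reading a miss at the very first place makes the predator leave at once; a real animal may well require a minimum amount of sampling first.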

The Green (1980; 1983) and Waage (1979) models give similar results. However, it is important to remember that Green's model is functional: it defines exactly what an animal should do in order to achieve the best result. Waage's model is mechanistic: it is built on ideas about the immediate causes of behavior.

One way to determine whether an animal follows certain fixed rules in making its decisions is to intervene selectively in its behavior. For example, when studying the behavior of burrowing wasps (Ammophila campestris), Baerends (1941) found that before laying an egg the female digs a burrow, kills or paralyzes a butterfly caterpillar, carries it to the burrow, lays an egg on the caterpillar and hides it in the burrow. The female wasp then repeats this procedure when laying the second and each subsequent egg. Meanwhile, the first egg hatches, and the larva begins to devour the caterpillar. Now the wasp returns to the first burrow and adds new caterpillars to it. After that, depending on the circumstances, she may start digging a new burrow or supply caterpillars to the second burrow. Thus, the female wasp can serve up to five nests at the same time (Fig. 25.16).

Baerends discovered that the wasps inspect all the burrows every morning before heading out to their "hunting grounds". By taking caterpillars out of a burrow, Baerends could force the wasp to bring more food than usual; by adding caterpillars, he could make her bring less food. However, he could control the behavior of the wasp in this way only if he made the changes in the nest before the wasp's first daily visit to the burrow. Changes made later in the day had no effect. Apparently, the female wasp is guided by some simple rules. There is a standard procedure for laying an egg, which involves digging a burrow and providing a caterpillar. In addition, there is a standard early-morning inspection routine for all burrows, during which it is usually established which nest should be provisioned during the day. Finally, there is a standard termination procedure in which the wasp closes the nest entrance when enough caterpillars are present. Although she is able to estimate the amount of food stored in a nest when she visits it, she does not always use this ability. Moreover, each of the standard sequences of actions, once started, continues to completion. So, for example, a wasp will keep bringing caterpillars to the nest if they are systematically removed each time, as soon as she has brought them. This example shows that complex behavior can be programmed as a set of rigid rules. The wasp behaves like an automaton, although it has some standard behavior programs that allow it to get out of a difficult situation, for example, to remove obstacles from a burrow.

Fig. 25.16. Diagram of nesting behavior of a burrowing wasp (Ammophila). (After Baerends, 1941.)

As we have seen before, the interruption of an animal's behavior under certain circumstances unmasks the behavior that would otherwise have remained hidden. Such a time-sharing situation suggests that the animal follows certain rules that determine the organization and priority of behavioral acts in the overall pattern of behavior. Let us consider a specific example. When a hungry dove (Streptopelia) eats, either picking grains from a pile or receiving food in a Skinner box, typical pauses lasting several minutes can be observed in its behavior (see Fig. 25.11). What the dove does during these pauses depends on the circumstances. If the bird has access to water, it will drink. Otherwise, it may preen its feathers or just stand still. Under experimental conditions, it has been shown that the timing of these pauses is not affected by the manipulation of second-priority motivational factors, such as changes in the level of thirst. In one experiment, a paper clip was attached to each wing of hungry doves. While feeding, the doves paid no attention to the paper clips, whereas during the pauses they tried to get rid of them. However, the presence of the paper clips did not affect the nature of the feeding behavior and did not change the temporal distribution of the pauses (McFarland, 1970b). One gets the impression that pauses at strictly defined times are, as it were, programmed into the feeding behavior of the doves, and that the rules governing this behavior are not influenced by other motivational factors, such as thirst or the urge to preen, unless these tendencies become stronger than the tendency to feed. This is a typical case of the time-sharing phenomenon.


Fig. 25.17. The boundary between the animal states dominated by hunger and thirst.

If in some way the feeding behavior of a hungry turtledove is interrupted, then usually after the break it will continue this behavior. But if the process of drinking water is interrupted, then, as a rule, it will be masked if the break is long enough (McFarland, Lloyd, 1973). In an experimental situation with instrumental behavior, where turtle doves must peck at luminous keys to obtain food and water, interrupting the current activity can be achieved by simply turning off the backlight of the key. Birds will quickly learn to stop pecking when these keys are not lit. Under conditions of free eating and drinking behavior, interruption of behavior can be achieved if the experimental room is plunged into darkness for about a minute. When compared, it turned out that these two types of interruption in the activity of turtledoves have the same effect (Larkin and McFarland, 1978).

The division of time between the eating and drinking behavior of turtledoves has been the subject of numerous experiments, the purpose of which was to discover the rules on the basis of which the bird decides whether to eat or drink. The results show that, first, either drinking or eating can be the dominant activity in experiments (McFarland and Lloyd, 1973; McFarland, 1974). Secondly, the line reflecting the boundary (Fig. 25.17) between the dominance of hunger and the dominance of thirst does not change its position with repeated experiments, with different initial levels of hunger and thirst, or with changes in the results of eating and drinking behavior (Sibly and McCleery, 1976). However, if the motivational state of the bird is changed during the experiment, it may seem that the line representing the boundary between dominant states rotates (Fig. 25.18). A theoretical analysis of this situation shows that there is no real change in the position of the boundary. The apparent change is due to the frame of reference used by the experimenter: the animal's motivational state is usually depicted in two dimensions, while other dimensions must also be taken into account (McFarland and Sibly, 1975). The magnitude of this apparent rotation of the dominance boundary has proven to be a useful measure of the strength of motivational factors, such as the attractiveness of food and drink rewards (Sibly, 1975), the effectiveness of external stimuli that signal the availability of food and water (McFarland and Sibly, 1975; Beardsley, 1983), and the costs (as assessed by the bird itself) of switching from eating to drinking and vice versa (Larkin and McFarland, 1978). In general, it seems that both internal and external factors have some influence on the tendencies to eat and to drink and that these tendencies compete for dominance (McFarland, 1974). Having taken a dominant position, the winning system periodically provides time for other (subdominant) activities. Why behavior is organized in this way remains a mystery.

It is possible that pauses in the feeding behavior of turtledoves are part of a behavioral strategy aimed at detecting predators. Being in a flock, individual birds have the opportunity to spend more time foraging and less time to watch for predators (Barnard, 1980; Bertram, 1980; Elgar and Catterall, 1981). Lendrem (1983) found that solitary turtledoves spend about 25% of their two-minute feeding period looking around, and about 20% when other birds are around. However, this difference was much more pronounced if the turtledoves had seen a predator (ferret) nearby shortly before. In this situation, lone turtledoves spend about half the time looking around, while in the presence of two other birds, they spend only 25% of the time looking around. The time spent not foraging decreased even more as the number of birds in the flock increased. As the size of the flock increased, turtledoves received food faster, while at the same time, the overall foraging rate decreased if they had recently seen a predator. As a detailed analysis of the feeding behavior of turtledoves shows, in situations of risk, their rate of obtaining food decreases, while the pauses between meals increase. Thus, they feed more slowly when alone in unfamiliar surroundings and shortly after seeing a predator. In this case, the period of time after each peck especially increases, when the dove stands with its head raised; it is possible that this enhances the bird's ability to spot predators.

The rate of obtaining food is also reduced when turtle doves have to distinguish suitable food from unsuitable food. By adding lithium chloride to the diet of turtledoves, in combination with certain feeding conditions, these birds, like many other animals, can be taught to avoid wheat grains that are dyed a certain color (Lendrem and McFarland, 1985). Trained birds behave as if grains of this color were poisonous. For example, some birds avoid yellow grains, while others avoid red ones. When such turtledoves are given a mixture of red and yellow grains, they have to distinguish between the two types of grains in order to avoid the grains of the color for which they have developed an aversion. Birds that forage from a mixture of "poisonous" and harmless grains feed more slowly than birds that are given a mixture of "non-poisonous" grains of various colors (Lendrem and McFarland, 1985).


If the turtledove feeds more slowly than usual, because she has to distinguish between harmless grains and poisonous ones, then it can be thought that she has a weakened ability to detect predators, since she pays more attention to food. In fact, turtle doves' response rate to a model hawk flying overhead increases if turtle doves are given a mixture of poisonous and harmless grains (Lendrem and McFarland, 1985). Birds that have been previously shown a predator (and therefore peck grain at a reduced rate) respond faster to the hawk model than birds that have to distinguish between harmless food and poisonous food. Thus, it appears that slower feeding, whatever the cause, increases the bird's ability to detect predators. These data are consistent with the notion that a high rate of foraging (or other behavior) is associated with high costs.

What happens if we further complicate the task of distinguishing grains by placing them against a background where they are hard to see? As expected, there is a further decrease in the rate of feeding (Fig. 25.19). This may partly be because the birds have to pay more attention to foraging, but it could also be an active tactic for staying vigilant. Turtledoves that select harmless grains from a mixture with poisonous grains against a background where they are difficult to distinguish notice the hawk model faster than birds that select easily distinguished grains (Fig. 25.20) (Lendrem and McFarland, 1985). However, birds that select suitable grains in conditions of low visibility make more errors (eat more poisonous grains) and pause less in their feeding than birds that feed on clearly visible grains. Thus, it is quite clear that there is a certain balance between the requirements of vigilance and the requirements of foraging.

In conclusion, it should be said that, in all likelihood, turtledoves that eat food quickly are less likely to spot predators. When turtledoves are alert, i.e. when they are in an unfamiliar environment, or alone, or in a situation where they have recently seen a predator, they eat more slowly. However, turtledoves have a number of different ways in which they can reduce their overall rate of eating. For example, they can pause more often, lengthen the pauses, or slow down the actual rate of pecking. These ways can increase the chances of spotting a predator. There are some indications that these different methods cancel each other out (Lendrem and McFarland, 1985). It is quite possible that turtledoves rely on being able to detect unusual movement as they raise their heads after each peck, and pause to look around from time to time. It is possible that by pausing in pecking, a bird may spend some time preening its feathers or drinking, an example of the phenomenon called time-sharing. At present, we do not have sufficient knowledge of avian vision to support these hypotheses. We also do not know whether birds use some complex set of decision rules or whether their behavior is regulated through cognitive processes.

Fig. 25.19. The rate of food intake by turtledoves that were offered a mixture of "poisonous" and harmless grains, in conditions where the grains of these two types are difficult (low visibility) or easy (high visibility) to distinguish from each other. (Lendrem and McFarland, 1985.)

Fig. 25.20. Latent periods of reaction to the hawk model in turtledoves that feed in conditions of low and high visibility of food (Fig. 25.19). Note that turtledoves feeding on poorly visible grains, although they feed more slowly, react faster to a potential predator. These results suggest that the reduced rate of food intake when eating poorly distinguishable grains is not due to the birds having to concentrate on telling the grains apart, but rather to the fact that this situation is more dangerous (because of the increased likelihood of ingesting "poisonous" grains), so that the doves pay more attention to the environment in general. (After Lendrem and McFarland, 1985.)

POINTS TO REMEMBER

1. Animals can make decisions based on simple rules of thumb that help them adapt to specific environmental conditions.

2. If manipulation of the second priority activity changes the time distribution of the animal's switching from one activity to another, then we can conclude that these switchings are due to the competition of motivations. If this distribution does not change, then such switching is caused by disinhibition.

3. In the case when the moment of the beginning and the duration of the manifestation of some activity are regulated by another activity, we can say that the behavior is organized in a time-sharing mode.

4. The adoption of an optimal decision by an animal is realized in a sequence of behavioral acts that maximizes a certain indicator of the organism's fitness under existing conditions. Any violation of the mutual fit between an animal and its environment will result in such maximum fitness rarely being achieved. However, animals can use such decision rules that their behavior will be close to optimal.

Krebs J. R., McCleery R. H., 1984. Optimization in behavioral ecology. In: Krebs J. R., Davies N. B. (eds), Behavioral Ecology, 2nd edn, Oxford, Blackwell Scientific Publications.