Zipf's law and the fractal nature of social and economic phenomena

What is Zipf's law

Zipf's law is an empirical regularity in the distribution of the frequencies of the words of a natural language: if all the words of a language (or merely of a sufficiently long text) are ordered by decreasing frequency of use, then the frequency of the n-th word in such a list is approximately inversely proportional to its ordinal number n (the so-called rank of the word). For example, the second most used word occurs about half as often as the first, the third about a third as often, and so on.
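As a minimal illustration (the toy corpus below is invented for the example, not taken from any real text), the law predicts that frequency multiplied by rank stays roughly constant down the ranked list:

```python
from collections import Counter

def zipf_table(text, top=5):
    """Rank words by frequency and report frequency * rank for each.

    Under Zipf's law, frequency * rank should be roughly constant."""
    words = text.lower().split()
    counts = Counter(words).most_common(top)
    return [(rank, word, freq, freq * rank)
            for rank, (word, freq) in enumerate(counts, start=1)]

# A toy corpus built to follow the law exactly: "a" 6 times, "b" 3, "c" 2.
toy = "a a a a a a b b b c c"
for rank, word, freq, product in zipf_table(toy):
    print(rank, word, freq, product)  # product is 6 on every row
```

On real texts the product drifts rather than staying exactly constant, but the same ranking procedure applies.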

History of creation

The pattern was first discovered by the French stenographer Jean-Baptiste Estoup, who described it in 1908 in his Gammes Sténographiques. The law was first applied to describe the distribution of city sizes by the German physicist Felix Auerbach in his 1913 work "The Law of Population Concentration", and it bears the name of the American linguist George Zipf, who actively popularized the regularity in 1949 and was the first to propose using it to describe the distribution of economic forces and social status.

An explanation of Zipf's law based on the correlation properties of additive Markov chains (with a step-like memory function) was given in 2005.

Zipf's law is described mathematically by the Pareto distribution. It is one of the basic laws used in informetrics.

Applications of the law

In 1949 George Zipf first showed the distribution of people's incomes by size: the richest person has twice as much money as the next richest, and so on. This statement turned out to hold for a number of countries (England, France, Denmark, Holland, Finland, Germany, the USA) in the period from 1926 to 1936.

The law also holds for the distribution of cities within a country: the city with the largest population is twice as large as the next largest city, and so on. If all the cities of a country are listed in descending order of population, each city can be assigned a rank, i.e., its number in that list. Population size and rank then obey a simple pattern expressed by the formula:

P_n = P_1 / n,

where P_n is the population of the city of rank n, and P_1 is the population of the country's largest city (rank 1).
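A short sketch of the formula in code (the 8-million figure is a made-up example, not data from the article):

```python
def zipf_city_population(p1, n):
    """Population predicted for the city of rank n under P_n = P_1 / n,
    where p1 is the population of the largest (rank-1) city."""
    return p1 / n

# Hypothetical country whose largest city has 8 million inhabitants:
# rank n is predicted to hold 8,000,000 / n people.
predicted = [round(zipf_city_population(8_000_000, n)) for n in range(1, 5)]
print(predicted)  # [8000000, 4000000, 2666667, 2000000]
```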

Empirical studies support this assertion.

In 1999 the economist Xavier Gabaix described Zipf's law as an instance of a power law: if cities grow randomly, with the same expected growth rate and the same standard deviation, then in the limit the distribution of their sizes converges to Zipf's law.
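A rough simulation of this mechanism can be sketched as follows (the parameter values, and the lower reflecting barrier that Gabaix's argument uses to obtain convergence, are illustrative assumptions of this sketch):

```python
import random

def simulate_city_growth(n_cities=1000, steps=2000, sigma=0.05,
                         floor=1.0, seed=0):
    """Gibrat-style proportional random growth with a lower reflecting
    barrier, the mechanism Gabaix showed yields Zipf's law in the limit.

    Returns city sizes sorted in descending order (i.e., by rank)."""
    rng = random.Random(seed)
    sizes = [1.0] * n_cities
    for _ in range(steps):
        for i in range(n_cities):
            # Every city draws a growth shock with the same variance ...
            sizes[i] *= rng.lognormvariate(0.0, sigma)
            # ... but cannot shrink below a common minimum size.
            if sizes[i] < floor:
                sizes[i] = floor
    return sorted(sizes, reverse=True)
```

For long runs, plotting log(size) against log(rank) for the returned list should give a slope near minus one, the Zipf signature.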

According to researchers' findings on urban settlement in the Russian Federation, in accordance with Zipf's law:

  • most Russian cities lie above the ideal Zipf curve, so the expected trend is a continued decline in the number and population of medium-sized and small towns owing to migration to big cities;
  • accordingly, the seven cities with over a million inhabitants (St. Petersburg, Novosibirsk, Yekaterinburg, Nizhny Novgorod, Kazan, Chelyabinsk, Omsk) that lie below the ideal Zipf curve have a significant reserve for population growth and can expect it;
  • there is a risk of depopulation of the first-ranked city (Moscow), since the second city (St. Petersburg) and the subsequent large cities lag far behind the ideal Zipf curve, owing to declining demand for labor combined with a rising cost of living, above all the cost of buying and renting housing.

Criticism

The American bioinformatician Wentian Li proposed a statistical explanation of Zipf's law, showing that a random sequence of characters also obeys it. Li concludes that Zipf's law is apparently a purely statistical phenomenon that has nothing to do with the semantics of a text and bears only a superficial relation to linguistics.
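This argument (often called the "monkey typing" model) is easy to reproduce: type letters and spaces uniformly at random, treat the runs between spaces as words, and the rank-frequency curve comes out heavy-tailed and roughly Zipf-like with no semantics involved. The alphabet and sample size below are arbitrary choices for the sketch:

```python
import random
from collections import Counter

def random_text_frequencies(n_chars=200_000, alphabet="abcde ", seed=42):
    """Generate random text over a small alphabet that includes the
    space, split it into 'words', and return word frequencies in
    descending order."""
    rng = random.Random(seed)
    text = "".join(rng.choice(alphabet) for _ in range(n_chars))
    words = [w for w in text.split(" ") if w]
    return sorted(Counter(words).values(), reverse=True)

freqs = random_text_frequencies()
# Short 'words' are exponentially more likely than long ones, so the
# top ranks dominate a long tail, much as in real texts.
print(freqs[:5], len(freqs))
```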

Zipf-Pareto law, new quantum technologies and the philosophy of the unconscious

I.V. Danilevsky

Using a quantum model of human psychology, the article explains the mechanism behind the so-called hyperbolic distributions in economics, politics, culture and other fields, the best known of which are the laws of Zipf, Pareto, Lotka, Bradford and Willis. The quantum model of the collective unconscious makes it possible to offer a new explanation of a number of contested philosophical questions by drawing on the concepts of quantum nonlocality, quantum cryptography, and the like.

In studies of systems of various classes, the so-called hyperbolic distributions, often called "Zipfian", are well known. These are the distributions (or laws) of Zipf, Pareto, Lotka, Willis, Bradford and others. All of them are described by essentially the same formula, in which only the exponent varies (the formula can be written in two forms, frequency and rank, but this is not fundamental). For example, Pareto's law states that approximately 80 percent of wealth belongs to 20 percent of the population, 80 percent of the work is done by 20 percent of the employees, 20 percent of customers bring in 80 percent of the profits, and so on; Zipf's law establishes an equally asymmetric use of words (and likewise of phonemes and syllables) in complete texts of large volume. Auerbach's law, which Zipf resurrected with his research, shows a similar asymmetry in the distribution of population across cities. But probably the most rigorously confirmed, in the form in which it was once discovered, is Lotka's law. It concerns the distribution of the scientific productivity of scientists, expressed in the number of their publications. In 1926 the American mathematician Alfred Lotka counted the number of scientists who had written one, two, and so on papers cited in a chemistry abstract journal over ten years, and obtained a distribution in which the exponent equaled one. His results met with a great response, inspiring others to similar studies, and very soon it came to the point that the validity of Lotka's law could be tested on the number of publications devoted to Lotka's law itself. Moreover, an almost anecdotal situation began to emerge, since distributions of the same, sharply asymmetric nature turned out to describe, for example, ability at golf, the results of mathematics examinations, and the number of estate owners (by annual income) who took part in the Jacobite rising of 1717. After all these discoveries, experts could not but admit that a new class of distributions had been found. They were called "non-Gaussian", emphasizing their difference from the symmetric distributions named after the German mathematician, and the question of explaining them was placed on the agenda.

Explanations were offered all along. Attempts were made both by the authors of these discoveries and by other specialists, but all of them were, to one degree or another, recognized as unsatisfactory, because a link was always missing. Most often the reasoning went as follows: the Zipf-Pareto law is the result of the action of two mutually opposed factors. For the number of publications in scientific journals, for example, these factors are the desire to publish and the limited throughput of journals. However, as A. Lotka showed, the law he discovered also describes the number of discoveries in physics for the period from 1600 to 1900, as verified by the proceedings of the Royal Society of London. An author such as D. Crane points out that discoveries and inventions in other fields obey this law too, yet their advancement is not affected by the throughput of journals. Today specialists in synergetics, G. Malinetsky for example, interpret these patterns as "self-organized criticality". This means, first, that the elements of a system obeying Zipf's law are interconnected, and the system itself is highly adapted to rapidly changing conditions; the price of such self-organization is "criticality": a small change in conditions causes avalanche-like changes (see about this). Indeed, the distribution of the number of particles, grains of sand for example, in descending avalanches also obeys this law, and the study of models of avalanches, turbulence and the like is, as is well known, typical of synergetics. But how can such an explanation be applied to the same fact, the subordination of the advancement of discoveries and inventions to the Zipf-Pareto law? In that case it turns out that the thoughts and ideas of the most diverse people, not interconnected in everyday life, prove to be linked, as in an ordinary material system. Approximately the same thing happens, incidentally, in the situation considered by V. Pareto, when 20 percent of workers, completely regardless of the notorious Marxist "form of ownership of the means of production", do 80 percent of the work in a team. Somehow "by itself" it turns out that the individual contributions of the workers eventually even out, fitting the Pareto formula. Of course, this is less striking than when different scientists and inventors, often without even suspecting each other's existence, do what seems to them purely individual work that in fact turns out to obey a collective formula, but still. So are our thoughts, to some extent, not only our thoughts? And if so, how is this possible? To answer this question, one must delve into the mechanics of how our thoughts, and (as a special case) our ideas, arise.

If we believe that consciousness is responsible for the creations of our psyche, we find ourselves at an explanatory impasse: in the situations described above, people did not consciously interact with each other. If, by inertia, we continue to argue that consciousness is primarily responsible for our thoughts and the unconscious is only an auxiliary part of our "I", the situation does not change. But if we accept that the unconscious is the leading force in our mental apparatus, a completely different picture appears.

Discoveries and inventions are primarily unconscious: they are an intuitive breakthrough, merely prepared and completed by conscious processing. The composition of literary and other works is likewise primarily an unconscious process, like any creativity. Our everyday life, including economic and working life, consists primarily of informal and only secondarily of formal relations, and informal relations are what spontaneous reactions directly shape, i.e., they too rest on the unconscious as their foundation. The whole question, therefore, is how to interpret the structure of our unconscious. If we focus primarily or exclusively on the individual unconscious, this will clear up little more than taking only consciousness into account. But if we remember that there is also a collective unconscious, about which Carl Gustav Jung wrote much and fruitfully, we obtain the first key to the problem of why social relations are permeated by Zipf-Pareto distributions.

Jung, as is well known, wrote about the collective unconscious mainly in the context of "archetypes", but with all our greatest respect for this scientist and thinker, one cannot help admitting that this approach is quite phenomenological. The "late" Jung gradually shifted the focus of his interests from archetypes to the problem of so-called "synchronicity", the semantic identity of events in the absence of causal relations between them, and, interestingly, he did so during his collaboration with one of the creators of quantum mechanics, W. Pauli.

Let us ask ourselves a question (or questions): what kind of situation is it when the total result of the behavior of many people turns out the same regardless of personalities, cultures and eras? What does this picture look like, both in the case of the Pareto-Zipf law and in the case of the invariant basis of the "mentality" (the collective subconscious, i.e., that part of the collective unconscious capable of becoming conscious) of a particular nation, unchanged over the centuries? Is there any analogy to this in the natural world?

The answer is simple, and perhaps unexpected from the point of view of conventional approaches: the situation resembles the so-called "Einstein-Podolsky-Rosen paradox" of quantum mechanics.

In 1935 Einstein and two of his collaborators published a paper which they hoped would refute the brainchild of Bohr, Heisenberg and Schrödinger. The essence of the paradox can be conveyed as follows. If two particles interact with each other, a so-called "entangled" state is formed between them, i.e., a correlated state with common total characteristics: momentum, the so-called "spin", and so on. The particles then fly apart to any conceivable distance, in the limit the size of the Universe itself. If the total spin is zero and a measurement finds the spin of one particle to be "minus one", then the spin of the other at the same moment takes the value "plus one". Since particles in the microworld tend to decay and interconvert in the most various ways, limited only by conservation laws, their total characteristics should remain common through all further interactions. There seems to be nothing paradoxical in this. The whole point, however, is that a kind of "Berkeleianism" reigns in the microworld: quantum mechanics demonstrates in striking fashion (above all, of course, to "convinced materialists") that in a certain sense Bishop Berkeley turned out to be absolutely right. The specific values of many particle characteristics are determined only at the moments of observation; before they are observed, these characteristics, contrary to dialectical and other "materialist fundamentalism" (a kind of Islamic fundamentalism in philosophy, with matter in the role of Allah), simply do not exist "objectively". Therefore the experimenter who catches the second particle is not at all obliged to find its characteristics correlated with the previously determined characteristics of the first particle, and yet this is exactly what he always finds. Einstein believed that such action at a distance in the microworld is impossible, if only because nothing can propagate faster than light, and that quantum mechanics must therefore fail to take something into account. For many, the experimental results turned out to be unexpected, but the fact remained: particles that have interacted at least once "feel" each other (see about this).

Comparison is not proof, critical readers may object. On what grounds is the quantum analogy invoked, rather than some other? And what of the arguments that opponents of physicalism have actively advanced for at least the last few decades against the striving of many positivist-minded authors to reduce human thinking to physico-chemical processes in the brain? the same critics will surely say.

Let us answer the first question first. It is interesting how dominant paradigms act on the minds of scientists! None of the authors dealing with the philosophical questions of consciousness and psychology in general, as far as we know, disputes Louis de Broglie's conjecture of the corpuscular-wave dualism of matter, according to which localized particles of matter are at the same time nonlocal waves. Yet for some reason only a few consider it necessary to admit that, in general, the same should hold for the quantum level of organization of thought processes in the material substrate of the brain! Hypotheses interpreting the human psyche as a kind of quantum or quantum-like formation have existed since the late 1950s. They entered philosophy in the mid-1970s as a reaction to three circumstances: a series of statements by Niels Bohr, the results of experiments with hallucinogenic drugs (mainly LSD), and the publication of Fritjof Capra's book The Tao of Physics, which needs no introduction. After Capra, D. Bohm's already existing theory of so-called "nonlocal hidden variables" determining the behavior of microparticles, the mechanics of holograms (which required quantum technology in the form of lasers), and the "delightful unusualness" of quantum theory itself acquired the status of the ideological basis of a number of unorthodox scientific (S. Grof) but mainly parascientific theories. Academic science to this day limits itself to using ideas of the psyche as a quantum system mainly in the form of attempts, inspired by the Nobel laureate Eccles, to connect human thinking not only with biochemical processes in neurons but also with quantum processes in synapses, continuing to work within the old "biochemical paradigm", even a cursory acquaintance with the results of whose dominance (for example, see) is enough to understand its futility. A philosophical proof of the failure of this paradigm is given in almost all the books of the remarkable Saratov author E.M. Ivanov (see, for example). Hypotheses exist arguing that macroscopic quantum processes of superfluidity and superconductivity occur in the human brain (see about this), that the nerve cell is a quantum biocomputer, and others (the currently best-known is the hypothesis of Stuart Hameroff and Roger Penrose that the tubulin microtubules of neurons support large-scale quantum processes in the brain; see). But there are as yet no experimental confirmations of these views capable of convincing skeptics, and the skeptics themselves, of course, are in no hurry to change their once-chosen paradigm. We believe that additional indirect evidence of a quantum-like organization of at least a significant part of thought processes can be found in facts of a social character: both those fixed by the Zipf-Pareto law and many others (for example, analogies between myths, magical practices and nonlocal quantum effects); see more about this.

Let us briefly recall what a quantum computer is. It is a fundamentally new type of computer which has not yet been built, though its theoretical basis has long been available (see). The main idea is as follows: since a quantum object (an atom, for example), unlike any object familiar to us, is able to be in many mutually exclusive states simultaneously (yes-no, 0-1, etc.), each of these states can be "assigned" a piece of computational work, yielding a parallel computational process and a huge gain in speed for a number of tasks: factoring large numbers (used in encryption), searching for required information in a huge database, and so on. For example, where a typical modern computer would need on the order of ten to the twenty-fifth power years to find the factors of a thousand-digit number, a quantum computer would solve the same problem in a few hours. (As they say, "feel the difference"! - ed.)
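The arithmetic behind this parallelism is easy to check: an n-qubit register is described by 2^n complex amplitudes. The figures below are a sketch of that scaling, not tied to any particular quantum algorithm:

```python
def amplitudes(n_qubits):
    """An n-qubit register is described by 2**n complex amplitudes;
    this exponentially large state space is the source of the speedups
    described above."""
    return 2 ** n_qubits

print(amplitudes(10))  # 1024
# Around 300 qubits already index more basis states than the roughly
# 10**80 atoms estimated to exist in the observable universe.
print(amplitudes(300) > 10 ** 80)  # True
```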

Let us see what happens when Zipf's law is satisfied for texts, i.e., when the number of words used in writing a text (or of Chinese characters: Zipf checked that variant too) turns out to be distributed according to a certain hyperbolic regularity. Obviously such work is never done consciously and is therefore carried out only unconsciously. But in that case the unconscious acts like a computer which, first, translates the symbols of any language, English, Russian or Chinese, into numerical form and, second, controls the correlation of word use with the ideological intent of the text from the very beginning to the end of its writing (studies of Zipf's law especially emphasize the need for the integrity of the text: the law does not work for arbitrary passages). On the other hand, coordinating the economic or purely intellectual activity of a huge number of people requires both a mechanism for accessing the thoughts (ideas) of these people "in real time" and their almost instantaneous calculation and processing. Since the number of atoms in the Universe does not exceed ten to the eightieth power, while a quantum computer solves the problem of enumerating ten to the five-hundredth power different options in minutes(!) (see), the question "If at least part of our unconscious acts like a quantum computer, receiving the information it needs through the effect fixed by the Einstein-Podolsky-Rosen paradox (so-called quantum nonlocality), can such a computer (more precisely, computers: by E. Lieberman's hypothesis, every neuron is one) calculate and 'average' the activity of several billion people according to the Zipf-Pareto formula?" becomes rhetorical. It would take it, on average, those same minutes or even seconds.

As for the accusation that quantum theories of consciousness are "physicalist", it is, strange as it may seem, indeed justified, but it has absolutely nothing to do with our hypothesis. In consciousness, as the opponents of physicalism emphasize, one distinguishes value content, the capacity to be represented for the subject in the form of experience, and so on, and physical systems have nothing of the kind. This is absolutely true, but, first, the brain is a biological and not merely a physical system (see N. Cartwright's article in), and, second (most importantly!), this only means that consciousness as such remains terra incognita for modern, in particular physical, science, which apparently lacks some very important link; and these characteristics, value representation, the form of subjective experience and so on, do not per se belong to the unconscious (although Zipf's law demonstrates that the unconscious, having received a signal from consciousness, is able to translate all manner of "meanings", "intentionality" and other characteristics of consciousness traditional in philosophy into a certain mathematical form and then process it like a quantum computer).

Therefore the unconscious can be modeled by an "impersonal" physical theory, whose frontier is quantum mechanics. Moreover, the "principle of indistinguishability" of particles operates in quantum mechanics: one electron is no different from all other electrons, one photon from other photons, and this could hardly be more suitable for modeling a collective unconscious common to all mankind.

What does all of the above mean for philosophy? In particular, it means the following:

1) It becomes possible to speak of structuralism not in the sense given to the term by C. Lévi-Strauss (i.e., a purely semiotic sense) but in a kind of "physico-mathematical" one. On the other hand, Lévi-Strauss quite rightly emphasized what structuralism should be as a science-based philosophy: a system of views aimed at finding universal patterns operating in all areas of human life. In this he is close to Jung, although he disagreed with him (unfairly, as is now clear) in assessing the existence of so-called archetypes common to all people at all times. And we see that these regularities (in particular, the Zipf-Pareto ones) do exist. Therefore all the statements of the post-structuralists and postmodernists on this point are incorrect. For example: "There is no universal form of the unconscious, as psychoanalysis insists" (Baudrillard); "Why there is no talk of following Jung" (Derrida).

2) The human unconscious and, in particular, the collective unconscious is predominantly organized as a kind of quantum biocomputer (see E. Lieberman's hypothesis about neurons). The quantum (more precisely, quantum-like) nature of its structure cannot but entail a chain of further consequences, no less fantastic than cutting the computation time of certain problems from ten to the twenty-fifth power years to several hours. For example, the structures of the unconscious, including the collective unconscious, must possess the property of reversibility in time, since, first, in the world of elementary particles there are no fundamental laws prohibiting temporal reversibility and, second, this reversibility must follow from the absence (by definition!) of a consciousness observing the unconscious, so that the so-called reduction of the wave function does not occur (if, of course, such reduction exists at all: D. Deutsch, M.B. Mensky and many other prominent authors think otherwise). The absence of wave-function reduction in the unconscious would explain why under medical hypnosis a person can be brought many times into the same initial state, while under state hypnosis (in the totalitarian states of the twentieth century and the mediacratic states of the twenty-first) the masses can be made to believe almost anything, or weaned from thinking almost anything that is not in the interests of the ruling elites.

3) Continuing the above, it follows that interpreting Kant's a priori as structures of the unconscious (S. Abramov, for example, calls them "compositional forms of the unconscious") is most likely unjustified. Kant's a priori are structures of consciousness, not of the unconscious. In the quantum world, for example, contrary to the conviction of consciousness, the effect can precede the cause.

4) If we take as the initial paradigm for modeling collective-unconscious processes the many-worlds interpretation of quantum mechanics, which, as emerged from the correspondence of the author of these lines with M.B. Mensky, cannot be refuted by purely logical means, then, as M.B. Mensky rightly emphasizes, the function of consciousness is indeed the choice among the many Everettian worlds (Mensky himself formulated the idea even more strongly: consciousness and the separation of the alternative classical "worlds" are one and the same). But this function, like any other, is carried out by consciousness in unity with the unconscious sphere, and the leading role, if we trust the conclusions of C.G. Jung and the other creators of psychoanalysis, still belongs to the unconscious. Consciousness really does find itself in one of the many possible worlds, and it is known to be holistic: even in schizophrenia, sometimes one "personality" appears and sometimes another, but not both at once. The unconscious, however, as a quantum-like object, is capable of being in a so-called superposition, so it is logical to assume that for it all possible alternatives are preserved in time.

5) The fact that the Zipf-Pareto law in its original Pareto version is equally valid for the distribution of wealth among different peoples in different eras and for the gravitational density of stellar systems (the exponent in the formula Pareto derived is the same) suggests that, beyond the quantum-likeness and quantum nonlocality standing behind all this, the circumstance can be explained in two ways. The first option: the very existence of hyperbolic distributions is a consequence of the hyperbolic distribution of gravitational density in the Universe. This explanation would probably please Roger Penrose, who is precisely seeking the influence of gravity on objective reduction in consciousness. However, first, such an explanation would be physical reductionism, whose philosophical inconsistency has long been proven, and, second, a number of objections can be raised against it. For example: why does gravity "bend to itself" the intellectual activity of people, or the distribution of the number of biological species across genera, but fail to do the same in those cases where the results of human activity or biological processes obey the so-called "golden section"? It would be more correct to recognize the justice of Pythagorean-Platonic metaphysics: the justice of the view that our world is embraced, as if by hoops, by certain mathematical structures which, though they manifest themselves in it, do not themselves belong to our world. Incidentally, it is surprising that Penrose calls himself a staunch supporter of Plato yet tries to combine in his searches two poorly compatible conceptions: Platonic ontology and modern (albeit updated in his own authorial way) physical reductionism.

We consider these and other questions related to all of the above in detail in the monograph. C.G. Jung did not have time to take the decisive step of explaining his own interpretation of the collective unconscious (archetypal and especially synchronistic) as a quantum or quantum-like system, although, collaborating with Pauli, he had already begun to move along this path. But after the Einstein-Podolsky-Rosen paradox was confirmed experimentally in the eighties of the last century and the possibility of creating quantum computers was substantiated, and in the nineties so-called quantum teleportation was discovered (the instantaneous transfer of the state of one particle to another through the interaction of a third particle with them, due to the same quantum nonlocality fixed in the EPR paradox), the time has come to "dynamize" the theory of the collective unconscious: to move from phenomenological statics to physico-mathematical dynamics. And almost the only thing that can prevent this, oddly enough, lies in that same quantum or quantum-like essence of the unconscious, whether individual or collective (although, strictly speaking, a completely "individual" unconscious should not exist at all, owing to the quantum-like nonlocality of the latter; even the synchronization of dreams or personal complexes in different people unconnected in everyday life is extremely difficult to track).

We mean the effect of so-called "quantum cryptography", commercial samples of which are already on sale. The new cryptography rests on a circumstance characteristic of quantum mechanics: any measurement, i.e., in fact any observation, of a quantum system in which information is encoded in the states of microparticles causes irreversible changes in it. Therefore any attempt to tap a cable carrying information encoded in this way will, first, be detected immediately and, second, still not make it possible to use the resulting "mixture" of particle states. Is it not by this "quantum-cryptographic effect" that the secrets of our own inner world, whose imminent "decoding" was written about so much in the seventies of the last century, are kept closed to us? (In our country, for example, this was promised on behalf of neurophysiology by N. Bekhtereva and on behalf of philosophy by D. Dubrovsky.) If so, the affairs of science are complicated in the most radical way. Recall Freud's oft-criticized idea: with the help of an artificially induced "transference" (the transfer onto the doctor of the patient's emotions and so on) to retrieve the patient's memories and determine them. It is just as in modern quantum teleportation (see): particles 1 and 2 have interacted; an entangled state has formed between them; and now, if a third particle in an unknown state is coupled to particle 1, that unknown state is transferred to particle 2 and can be determined by the experimenters. Psychoanalysis in general, and Freud in particular, have at all times been accused of being unscientific, of the psychoanalyst often, perhaps always, implanting the memories he expects into the heads of his patients, and in most cases this is the pure truth. But the whole point is that it is very difficult to observe the conditions of the experiment so as not to perform, oneself, an analogue of a quantum-teleportation act into the patient's past. Raised on the Cartesian-Newtonian picture of the world, Freud could not have known that in the quantum world a later observation can determine the results of an earlier one. Nowadays, it turns out, new "quantum-cryptographic" difficulties are added to this. Indeed: suppose one of the adherents of the idea of a stepwise decoding of brain codes (the same D.I. Dubrovsky, for example) managed to decipher the desired codes, say those of his own brain. Then Dubrovsky could, by running the program he had found on a computer, learn what he himself, D.I. Dubrovsky, is to think or do in the near future. But since he now knows this, can he not think or do something else instead (as they say, "out of spite")? Of course he can. But that would mean that the codes he found in his own brain were wrong. Thus D.I. Dubrovsky arrives at an insoluble contradiction: the codes of the brain cannot be deciphered in principle! (This, incidentally, also follows from the theorems of Gödel and Tarski: since within any sufficiently complex axiomatic system there are undecidable statements, consciousness can be known only by a system of a higher level, i.e., a superconsciousness, and this also requires a language of a higher level.) But then the following question must be posed: does the matter known to physical science today really allow the existence of "encrypted" information messages whose code organization we, having "connected" to them, will never and under no circumstances be able to recognize (unless, of course, under the influence of certain factors we suddenly become Superhumans)? Yes, it does. This is quantum cryptography.

In conclusion, then, I would like to say the following: contrary to what the postmodernists claim, a person, at least in his core, his deep essence, is not something that can be "read like a text." Even if a person is a text, it is a text that cannot be deciphered in the usual textual way.

Literature

  1. Abramov S.S. Implicit Subjectivity (An Essay in Philosophical Research). Tomsk: Tomsk University Press, 1991. 208 p.
  2. Bannikov V.S., Vedensky O.Yu., Ermak G.P., Kolesnik O.L., Shestopalov V.P. The Josephson effect in biomolecular structures // Reports of the Academy of Sciences of the Ukrainian SSR. Ser. A. 1990. No. 9. Pp. 46-50.
  3. Belokurov V.V., Timofeevskaya O.D., Khrustalev O.A. Quantum Teleportation: An Ordinary Miracle. Izhevsk: Research Center "Regular and Chaotic Dynamics", 2000. 256 p.
  4. Baudrillard J. The Transparency of Evil. Moscow: Dobrosvet, 2000. 263 p.
  5. Borisyuk G.N., Borisyuk R.M., Kazanovich Ya.B., Ivanitsky G.R. // UFN. 2002. Vol. 172. Pp. 1189-1214.
  6. Valiev K.A., Kokin A.A. From the Results of the Twentieth Century: From Quanta to Quantum Computers. (http://aakokin.chat.en/).
  7. Danilevsky I.V. Structures of the Collective Unconscious: A Quantum-Like Social Reality. 2nd ed. Moscow: URSS, 2005. 376 p.
  8. Derrida J. Writing and Difference. St. Petersburg: Academic Project, 2000. 432 p.
  9. Deutsch D. The Fabric of Reality. Trans. from English. Izhevsk: Research Center "Regular and Chaotic Dynamics", 2001. 400 p.
  10. Ivanov E.M. The Physical and the Subjective: The Search for an Analogy. Saratov: Saratov University Press, 1997. 56 p.
  11. Mensky M.B. The concept of consciousness in the context of quantum mechanics // UFN. 2005. Vol. 175. No. 4. Pp. 413-435.
  12. Mitina S.V., Liberman E.A. Input and output channels of a quantum biocomputer // Biophysics. 1990. Vol. 5. Issue 1. Pp. 132-135.
  13. Penrose R., Shimony A., Cartwright N., Hawking S. The Large, the Small and the Human Mind. Trans. from English. Moscow: Mir, 2004. 191 p.
  14. Petrov V.M., Yablonsky A.I. Mathematics and Social Processes. Moscow: Znanie, 1980. 64 p.
  15. Sosnin E.A., Poizner B.N. A Laser Model of Creativity (From the Theory of the Dominant to the Synergetics of Culture). Tomsk: TSU Press, 1997. 150 p. (http://spkurdyumov.people.en/CULTURE.htm).
  16. Philosophical Studies of the Foundations of Quantum Mechanics: On the 25th Anniversary of Bell's Inequalities. Moscow: Philosophical Society of the USSR, 1990. 183 p.
  17. Yablonsky A.I. Models and Methods of the Study of Science. Moscow: Editorial URSS, 2001. 400 p.
  18. Fröhlich H. Long range coherence and energy storage in biological systems // Int. J. of Quantum Chemistry. 1968. No. 2. Pp. 56-58.

Among the criteria for assessing the quality of a text, naturalness is considered the main one. This indicator can be checked using a mathematical method discovered by the American linguist George Zipf.

The Zipf's law test is a method for assessing the naturalness of a text: it checks the regularity of word usage, in which the frequency of a word is inversely proportional to its rank.

Zipf's first law "rank - frequency"

C = (frequency of occurrence of a word × rank of that frequency) / number of words.

If we multiply a word's frequency of occurrence by its frequency rank and divide by the number of words, the value C remains nearly unchanged. This holds for a document in any language; within each language group the value is constant.

The words that are significant for the document and determine its subject matter lie in the middle of the hyperbola. The most frequently used words, as well as low-frequency ones, do not carry decisive semantic meaning.
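As a rough illustration (this sketch is added here and is not part of the original article), the first-law constant can be computed for any text. The function name and the toy sentence below are my own assumptions:

```python
from collections import Counter

def zipf_constants(text):
    """C = (frequency of a word * its frequency rank) / total word count."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    # most_common() orders words by descending frequency; ranks start at 1
    return [(word, freq * rank / total)
            for rank, (word, freq) in enumerate(counts.most_common(), start=1)]

sample = "the cat sat on the mat and the dog sat near the cat"
constants = zipf_constants(sample)
```

On a real corpus the values of C would cluster around a constant; on a toy sentence this is only indicative of the calculation itself.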

Zipf's second law "quantity - frequency"

The frequency of a word and the number of words with that frequency are also related to each other. If you plot a graph where X is the frequency of a word and Y is the number of words of that frequency, the shape of the curve remains unchanged.
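The "quantity-frequency" relation can be tabulated directly. The sketch below (an illustration I am adding, with a made-up input string) builds the frequency spectrum: for each frequency f, how many distinct words occur exactly f times:

```python
from collections import Counter

def frequency_spectrum(text):
    """Map each frequency f to the number of distinct words occurring f times."""
    word_counts = Counter(text.lower().split())
    # Counting the counts themselves gives the "quantity - frequency" curve
    return dict(Counter(word_counts.values()))

spectrum = frequency_spectrum("a b b c c c d d d d")
```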

The resulting principle of good writing is that a text should be made as understandable as possible using the fewest words.

The law reflects a property common to all languages: there will always be a certain number of most frequently occurring words.

SEO text needs to be checked for naturalness if keywords were used in writing it, so that it remains interesting and understandable to a wide audience of readers. This indicator also matters when search engines rank sites: they determine how well a text matches key queries, distributing its words into groups of important, random, and auxiliary ones.

More:

  • The relationship between the frequency of occurrence of a word in the text, f, and its place in the frequency dictionary (its rank, r) is inversely proportional. The higher the rank of a word (the farther it is from the beginning of the dictionary), the lower its frequency of occurrence in the text.
  • The graph of this dependence is a hyperbola, which at small values of the rank decreases very sharply and then, in the region of low frequencies of occurrence f, stretches very far, decreasing gradually and almost imperceptibly as the rank r grows.
  • If the frequency of occurrence of one word is 4 per million and that of another is 3 per million, it hardly matters that the ranks of these words differ a thousandfold. Both words are used so rarely that many native speakers have never even heard them.
  • However, this distant region is remarkable in that a word located here can very easily reduce its rank many times over. Even the smallest increase in a word's frequency of occurrence shifts its position dramatically toward the beginning of the frequency dictionary.
  • In terms of this law, the measure of a word's popularity is its position in the frequency dictionary of the language: a more popular word stands closer to the top of the dictionary than a less popular one.
  • The law reflects the dependence of a word's frequency of use in the language on its place in the frequency dictionary. Popular words are used more often. From a mathematical point of view, the graph of this dependence is a hyperbola with a sharp rise as it approaches the origin of coordinates and a long, gentle, almost horizontal "tail." Most of the words of the language lie in this tail, where a word's frequency of use changes hardly at all as its position in the frequency dictionary changes.
  • But as soon as a word's position in the frequency dictionary reaches the place on the hyperbola where, approaching the origin, the curve begins to rise significantly, the situation changes. Now a small change in the frequency of occurrence no longer leads to significant changes in rank; the word's position in the frequency dictionary stops changing. This means that the growth of the word's popularity has slowed down. For it to continue, special measures must be taken to increase the word's frequency of occurrence: if the word is the name of a product, for example, money must be spent on an advertising campaign.

I first met a description of Zipf's law while reading. The essence of the law: if the words of any text are ranked by frequency of use, then the product of the rank and the frequency is a constant:

F * R = C, where:

F is the frequency of occurrence of the word in the text (its share of all the words in it);

R is the word's rank (the most frequently used word gets rank 1, the next rank 2, and so on);

C is a constant.

For those who still remember a little algebra :), it is easy to recognize in the above formula the equation of a hyperbola. Zipf experimentally determined that C ≈ 0.1, so the graph of Zipf's law looks approximately like this:

Fig. 1. The hyperbola of Zipf's law.


Hyperbolas have a remarkable property: if we take a logarithmic scale on both axes, the hyperbola looks like a straight line:

Fig. 2. The same hyperbola, but plotted on logarithmic scales.
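The straight-line behavior on logarithmic axes can be checked numerically. The sketch below is an illustration added here (the function and its data are my own assumptions): it fits the slope of log-frequency against log-rank by least squares, and for frequencies generated from an exact Zipf hyperbola the slope comes out at -1:

```python
import math

def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    For an ideal Zipf hyperbola f = C / r, log f = log C - log r,
    so the points lie on a straight line of slope -1."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Frequencies generated from an exact Zipf hyperbola with C = 0.1
ideal = [0.1 / rank for rank in range(1, 101)]
slope = loglog_slope(ideal)
```

A real text would give a slope only approximately equal to -1, which is exactly what naturalness checkers look at.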

A question may arise: what does all this have to do with search engine optimization? It turns out that artificially generated texts stuffed with keywords do not fit the law. Search engines (Google, Yandex) check texts for "naturalness," that is, for compliance with Zipf's law, and either lower the ranking of sites with "suspicious" texts or ban such sites outright.

I met Zipf's law for the second time in Benoit Mandelbrot's book. And I liked that little section so much that allow me to quote it in full.

Unexpected power law

In 1950, I was a young mathematics student at the University of Paris, looking for a topic for my dissertation. My uncle Szolem was the local textbook example of a professor of mathematics: a deep theorist, very conservative and, despite having been born in Poland, a pillar of the French scientific establishment. At the age of 31 he had already been elected a full professor at the prestigious Collège de France.

That was the era of Nicolas Bourbaki, the collective pseudonym behind which hid a mathematical "club" that, like Dada in art or existentialism in literature, spread from France and for a time became extremely influential on the world stage. Abstraction and pure mathematics, mathematics for the sake of mathematics, were elevated to the rank of a cult; members of the "club" despised pragmatism, applied mathematics, and even mathematics as a tool of science. This approach was dogma for French mathematicians and, for me, perhaps the reason to leave France and go to work at IBM. To my uncle's dismay, I was a young rebel. While working on my doctoral dissertation I often dropped into his office at the end of the day to chat, and often those conversations turned into arguments. Once, trying somehow to brighten the long and boring subway ride home, I asked him for something to read on the way. He reached into the wastebasket and pulled out several crumpled sheets of paper.

“Here, take this,” my uncle muttered. “The sort of stupid article you love.”

It was a review of a book by the sociologist George Kingsley Zipf. Zipf, a man rich enough not to have to think about his daily bread, lectured at Harvard University on a discipline of his own invention, which he called statistical human ecology. In his book Human Behavior and the Principle of Least Effort, power laws were seen as ubiquitous structures of the social sciences. In physics, power laws are quite common and act as a form of what I now call fractal self-repetition across scales. Seismologists have a mathematical formula for the power-law dependence of the number of earthquakes on their strength on the famous Richter scale. In other words: weak earthquakes are frequent, while strong ones are rare, and the frequency and strength of earthquakes are related by an exact formula. At that time there were few such examples, and they were known to only a handful of people. Zipf, an encyclopedist, was obsessed with the idea that power laws operate not only in the physical sciences; all manifestations of human behavior, organization, and even anatomy are subject to them, down to the size of the genitals.

Fortunately, the review my uncle gave me limited itself to a single, unusually elegant example: word frequencies. In text or speech, some words, such as the English the (the definite article) or this, occur often; others, milreis or momus, appear rarely or never at all (for the most inquisitive: the first is an old Portuguese coin, the second a synonym for "critic"). Zipf proposed the following exercise: take any text and count how many times each word appears in it. Then assign each word a rank: 1 for the most frequently used word, 2 for the second most frequent, and so on. Finally, plot a graph showing, for each rank, the number of occurrences of that word. The result is an amazing picture. The curve does not decline uniformly from the most common word in the text to the rarest. At first it falls with dizzying speed, after which it decreases more slowly, repeating the trajectory of a skier who jumps from a springboard and then lands and glides down the relatively gentle slope of a snow-covered mountain: an example of a classic non-uniform scale. Zipf, fitting the curve to his diagrams, came up with a formula for it.

I was stunned. By the end of my long subway ride I already had a topic for half of my doctoral dissertation. I knew exactly how to explain the mathematical foundations of the frequency distribution of words, which Zipf, not being a mathematician, could not have done. Amazing discoveries awaited me in the following months. Using the above equation one could create a powerful tool for social research. An improved version of Zipf's formula made it possible to quantify and rank the richness of anyone's vocabulary: a high value, a rich vocabulary; a low value, a poor one. With such a scale one could measure differences in vocabulary between texts or speakers; it became possible to quantify erudition. True, my friends and advisers were horrified by my determination to tackle so strange a topic. Zipf, they told me, was a crank. They showed me his book, and I agreed it was disgusting. Word counting is not real mathematics, they assured me. By taking up this topic, I would never find a good job, and it would not be easy for me to become a professor either.

But I remained deaf to their wise advice. Moreover, I wrote my dissertation without any advisers at all and even persuaded one of the university's bureaucrats to certify it with a seal. I was determined to follow the chosen path to the end and apply Zipf's ideas in economics, for it is not only speech that can be reduced to a power law. Whether we are rich or poor, prosperous or starving, all of this too seemed to me the subject of a power law.

Mandelbrot slightly modified Zipf's formula:

F = C * R^(-1/a), where

a is a coefficient characterizing the richness of the vocabulary: the greater the value of a, the richer the vocabulary of the text, since the curve of the dependence of each word's frequency of occurrence on its rank decreases more slowly; rare words, for example, appear more often than at smaller values of a. It was this property that Mandelbrot intended to use to assess erudition.
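The effect of the coefficient a can be made concrete with a small sketch (an illustration added here; the function name and sample values are my own assumptions, not Mandelbrot's code):

```python
def mandelbrot_frequency(rank, c=0.1, a=1.0):
    """Mandelbrot's modification F = C * R**(-1/a).

    With a = 1 this reduces to the plain Zipf hyperbola F = C / R;
    a larger a makes the curve fall off more slowly, modelling a
    richer vocabulary in which rare words appear more often."""
    return c * rank ** (-1.0 / a)

# A rank-100 word is predicted to be ten times more frequent
# under a = 2 than under the plain Zipf case a = 1:
plain = mandelbrot_frequency(100, a=1.0)   # 0.1 / 100
rich = mandelbrot_frequency(100, a=2.0)    # 0.1 / 10
```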

Not everything is so smooth with Zipf's law, and in specific applications one cannot always rely on an experimentally determined coefficient a. At the same time, Zipf's law is nothing other than Pareto's law "in reverse": both are special cases of power laws, or, if you like, manifestations of the fractal nature of economic and social systems.

For myself, I formulated the essence of the fractal nature of economic systems as follows. On the one hand, there is gaming randomness: roulette, the throwing of dice. On the other hand, there is technological or physical randomness: variation in the diameter of a shaft turned on a lathe, variation in the height of adults. All these phenomena are described by the normal (Gaussian) distribution. But there is a range of phenomena that do not obey this distribution: the wealth of countries and of individuals, fluctuations of stock prices and exchange rates, the frequency of use of words, the strength of earthquakes. For such phenomena it is characteristic that the average value depends heavily on the sample. For example, if you take a hundred random people of different heights, adding the tallest man on Earth to them will not greatly change the group's average height. But if you compute the average income of a hundred random people, adding the richest person on the planet, Carlos Slim Helú (and not Bill Gates, as many might think :)), will raise everyone's average wealth dramatically, to about 500 million dollars!
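The contrast between the two kinds of randomness is easy to verify numerically. The sketch below is purely illustrative (the height, sample size, and fortune figures are assumptions chosen for the sake of the arithmetic, not real data):

```python
from statistics import mean

# Height: adding even the tallest man ever measured barely moves the mean.
heights_cm = [175.0] * 100
height_shift = mean(heights_cm + [272.0]) - mean(heights_cm)  # under 1 cm

# Wealth: one extreme fortune dominates the mean completely.
wealth_usd = [50_000.0] * 100        # a hypothetical "random" sample
richest = 50e9                       # a fortune on the order of $50 billion
wealth_mean = mean(wealth_usd + [richest])  # hundreds of millions per head
```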

Another manifestation of fractality is a significant stratification of the sample. Consider, for example, the distribution of wealth across the countries of the world.

You must agree, the pattern presented resembles the Zipf curve like two peas in a pod!

One of the properties of fractality is self-repetition. Of the 192 countries of the world on the list, 80% of the world's wealth is concentrated in just 18 countries, i.e. 9.4% of them (18/192). If we now consider only these 18 countries, their total wealth of 46 trillion dollars is distributed just as unevenly: 80% of those 46 trillion are concentrated in fewer than half of these countries, and so on.
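This self-repetition is easy to reproduce on synthetic data. The sketch below is an illustration I am adding (the 1/r "wealth" profile is an assumption, not the actual country figures, so the exact fractions differ from 18/192): a minority holds 80% of the total, and the same skew reappears inside the top group itself:

```python
def top_share(values, share=0.8):
    """Return the fraction of items that together hold `share` of the total."""
    ordered = sorted(values, reverse=True)
    threshold = share * sum(ordered)
    running = 0.0
    for k, v in enumerate(ordered, start=1):
        running += v
        if running >= threshold:
            return k / len(ordered)
    return 1.0

# A Zipf-like "wealth of nations": country r holds wealth proportional to 1/r
wealth = [1.0 / r for r in range(1, 193)]
outer = top_share(wealth)       # a minority of countries holds 80% of the total

# Self-repetition: restrict to that top group and the skew appears again
top_group = sorted(wealth, reverse=True)[: round(outer * len(wealth))]
inner = top_share(top_group)
```

With an even distribution both fractions would be 0.8; the power-law profile pushes them well below that at every level of zoom.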

You may ask: what practical conclusions follow from all this? I would put it this way:

  1. Social and economic systems are not described by a Gaussian. These patterns obey power laws (synonymously: they have a fractal nature).
  2. Outliers from the mean are substantially more likely than the Gaussian bell curve predicts. Moreover, outliers are intrinsic to the system: they are regular, not random.
  3. Risk assessments cannot be based on normal-distribution probabilities of rare unwanted events.
  4. … I won't lie, I cannot think of anything else yet… but that does not mean there are no more practical conclusions; it is just that my own knowledge ends here…

... but you must admit, beautiful patterns!

For more on fractality, see Benoit Mandelbrot.

It should be noted that data from different sources vary greatly, but this is not relevant to the topic discussed here.

Hello, dear readers! Zipf's law, it is said, helps to check a text for naturalness. At least, that is the common opinion. Where did this "naturalness" descend on our heads from? Do we need to control this indicator too, and how important is it for website promotion? Do online services measure it correctly? It would be good to sort all these questions out. On the net there are various, sometimes sharply opposing, opinions on the subject. I will throw in my "five cents" as well and try to set out my own approach to this Zipf.

Why do I suddenly speak of the law in the feminine gender? Because I badly want to compare the brainchild of the linguist and philologist George Kingsley Zipf to a cunning fox that, by hook or by crook, slips into our "bast hut" of copywriting and starts laying down the law there. But first, a little background with mathematics and statistics. Don't be afraid, friends: I am not much of a calculator myself, so I will torment neither you nor myself.

Zipf's law and global patterns

J. K. Zipf described himself as a statistical human... ecologist. An interesting combination, isn't it? He tried to study the patterns of social phenomena by means of statistics and the mathematics of large numbers, and to some extent he succeeded. For example, comparing the frequency of use of words in the English language with their place in a "table of ranks," the scientist found an inversely proportional dependence: roughly speaking, the word in second place on the frequency list is used half as often as the first, the third a third as often, and so on. From a mathematical point of view, this functional dependence is described by the Pareto distribution. For each language, of course, its own constants and coefficients are introduced.

The same pattern can be traced in certain economic categories, for example, the distribution of the incomes of the world's richest people. Moreover, the populations of the largest cities of most countries of the world also line up according to the same Zipf, with some deviations caused by all sorts of disturbing factors, yet somehow the law works. I do not want to dwell on this phenomenon for long. What interests us about the mysterious Zipf beast is not even its linguistics but its applicability to small samples of words, which is what our articles are.

Is it worth checking texts according to Zipf's law

Note, friends, that in the previous section we spoke of growing megacities and the capital of the richest, using superlatives. On one site I even found information that Zipf's calculations do not work for cities with average populations. The same goes for the economy: for firms with revenues below $10 million a year, the rank/frequency law does not work either. As for linguistic research, a whole language group is no small sample: English, for example, has about a million words, and there, yes, the relation between the frequency and rank of those words builds an ideal hyperbola. But nowhere could I find any restriction on applying Zipf to small samples of words.

However, a simple sense of logic suggests that if mid-sized cities (with populations in the hundreds of thousands) or firms with incomes under 10 million (poor things!) cannot serve as apologists for Zipfian calculations, then why torment our texts? After all, they rarely contain even a thousand words: an average article of 3,000 characters without spaces holds roughly 400-500 words. What regularity are we trying to find in such a group?

No, it is possible that the developers of online services for checking texts against Zipf's law did try to account somehow for the fact that our articles can hardly be called semantic mega-samples. But had they succeeded, it would smell of a Nobel Prize! Such an amendment to the famous scientist's discovery would surely deserve at least the addition of the prodigy's name, something like the Zipf-Pupkin law. Sounds good, doesn't it? But no fanfares have been heard.

And again, logic coupled with some life experience suggests that the developers of search-ranking algorithms have simply overplayed. I understand their difficult position: every member of the team must constantly prove his effectiveness and creativity and gush with ideas. So they have gushed, over our heads.

Experiments of zealous optimizers

Well, there is no need to fire a cannon at our sparrow-sized articles: our opuses are not suitable for your experiments with Zipf, dear developers. On small samples these patterns are far-fetched. That, of course, is purely my opinion. On the net I have also come across the opposite: Zipf's law, they say, improved a site's position in the search results, the texts became noticeably more interesting, and so on in the same vein. Many people try to analyze the TOP for compliance with the Zipf distribution and draw conclusions from that. Stop, gentlemen! Against the background of the roughly eight hundred factors that search engines take into account when ranking, you are trying to track the influence of a single one? That won't do! Research is not conducted that way, and such results cannot be recognized as valid.

With all my negative attitude, not toward Zipf himself (I respect science) but toward unjustified attempts to once again verify harmony with algebra, I have repeatedly analyzed my work for naturalness in online services, at the request of customers, of course. I can say that a lively human language, free of officialese, clichés, and tautologies, clears the Zipfian barriers very easily. Achieving 70-80% naturalness of a text is not difficult at all. Those who wish can check their texts, for example. I do not think it necessary to do this all the time, and you certainly should not stake your promotion on the Zipf fox. Honestly, friends, do not waste time and energy on unscientific experiments.

This text has 87% naturalness. That is enough. I think that even if I pushed the figure up to 98%, it would not affect the article's position in the search results at all. By my forecast, the TOP does not await this article anyway. Well, never mind: it has said what it wanted to say.

Goodbye friends.

Your guide through the land of copywriting, GALANT.