
Google has bought London-based artificial intelligence company DeepMind. Sources put the value of the deal at more than $500 million, and Google representatives have officially confirmed the purchase.


What will this acquisition give Google? First, it will allow the company to compete with other large technology firms in the field of deep learning. Facebook, for example, recently hired Professor Yann LeCun to lead its own artificial intelligence effort. IBM's Watson supercomputer is now focused specifically on deep learning, and Yahoo recently acquired the photo-analysis startup LookFlow, which is also moving in this direction.

DeepMind was founded by neuroscientist and former chess prodigy Demis Hassabis; Skype and Kazaa developer Jaan Tallinn; and researcher Shane Legg.

The move will allow the tech giant to fill out its own roster of artificial intelligence experts, and the acquisition was personally overseen by Google CEO Larry Page, sources say. If all three founders join Google, they will work alongside inventor, entrepreneur, author, and futurist Ray Kurzweil, who in 2012 was hired to lead Google's work on machine learning and language processing.

Kurzweil stated that he wanted to build a search engine so perfect that it could become a real "cybernetic friend."

Since the Nest acquisition earlier this month, critics have raised concerns about how much user data will flow to Google. The purchase of Boston Dynamics last month likewise sparked debate about whether Google plans to become a robot maker.

Nevertheless, Google seems well prepared to allay fears about its latest acquisition: sources say it has agreed to establish an ethics board that will oversee the development of artificial intelligence within DeepMind.

However, the company will still have to clarify what exactly DeepMind's artificial intelligence does. DeepMind's website currently consists of a landing page with a relatively vague description calling it "a company on the cutting edge" that builds the algorithms of the future for simulations, e-commerce, and games. As of December, the startup had 75 employees.

The startup's main investors are Founders Fund and Horizons Ventures. DeepMind was founded three years ago.

In 2012, Carnegie Mellon University professor Larry Wasserman wrote that "a startup is going to build a system that thinks. I thought it was pure madness until I found out how many famous billionaires had invested in the company."

December 6, 2016 at 00:41

DeepMind has opened free access to a virtual machine learning environment

  • Popular Science
  • Artificial intelligence
  • Games and game consoles

Recently, representatives of DeepMind (now part of the Alphabet holding company) announced that developers would get free access to the source code of the DeepMind Lab platform. This machine learning service, built on the Quake III Arena game engine, is designed to train artificial intelligence: specifically, to teach it to solve problems in three-dimensional space without human intervention.

Inside the game world, the AI takes the form of a sphere that can fly and study the surrounding space. The developers' goal is to teach a weak form of AI to "understand" what is happening and respond to the various situations occurring in the virtual world. The "character" can perform a number of actions, move through a maze, and explore its immediate environment.

“We are trying to develop various forms of AI that can perform a range of tasks from simply exploring the game world to taking any actions and analyzing their consequences,” says Shane Legg, Chief Scientist at DeepMind.

The researchers hope the AI will be able to learn by trial and error, and games are nearly ideal for this. DeepMind, for example, has previously used (and still uses) Atari games to teach its neural networks the sequential actions needed to play.

But an open, modifiable 3D world provides a much more promising learning environment than the flat world of graphically simple Atari games. In the 3D world, the AI is given clear tasks that change in sequence, so that the experience gained in solving each task proves useful in solving the next one.

The advantage of a 3D environment is that it can be used to train computer systems to respond to the kinds of problems a robot can expect in the real world. Industrial robots can be trained in such a simulator without difficulty, and working with a virtual environment is in some cases far easier than training such systems "by hand".

At the same time, most modern neural networks are developed to solve one specific problem (image processing, for example). The developers of the new platform promise that it will help create a universal form of AI capable of solving a large number of tasks, with no human assistance needed. The environment presented to the neural network is randomly generated each time.


According to its developers, the platform helps AI learn in much the same way that children do. "It is like how you or I explored the world as a child," one DeepMind employee offered by way of example. "The machine learning community has always been very open. We publish about 100 articles a year, and we open-source many of our projects."

Google DeepMind has now published the DeepMind Lab source code on GitHub, so anyone can download the platform and modify it to suit their needs. Project representatives say participating specialists can create new game levels on their own and upload their projects to GitHub, helping the entire community work toward the goal faster and more efficiently.

This is not DeepMind's only such project. Last month it entered into a cooperation agreement with Activision Blizzard Inc., with the goal of turning the StarCraft 2 environment into a testing ground for artificial intelligence. Other game developers may soon join. Notably, the AI gets no advantage over its opponent in the game environment: like a human player, it relies only on what it can observe in order to advance.

In practice, this means that Google's AI will need to predict what the opponent is doing at any given moment in order to respond adequately to the "enemy's" actions, and it will have to react quickly to anything that does not go according to plan. All of this will test the next level of artificial intelligence capabilities. "Ultimately, we want to apply these abilities to solve global problems," said Demis Hassabis, founder of DeepMind (bought by Google in 2014; Google's AI work now builds on the acquired company's achievements).

AI experts give the project cautious approval. "The good thing is that they provide a large number of environment types," said OpenAI co-founder Ilya Sutskever. "The more environments a system encounters, the faster it will evolve," he continued. Indeed, the 3D learning environment contains over 1,000 levels and environment types.

Zoubin Ghahramani, a professor at Cambridge, believes that DeepMind Lab and other platforms for accelerating the development of artificial intelligence drive progress by giving researchers access to a ready-made environment.

Google DeepMind researchers have unveiled a new type of artificial intelligence system, the so-called Differentiable Neural Computer (DNC). The system combines the trainability of neural networks with the deductive abilities of traditional AI. Its description was published in the journal Nature, a commentary in the same issue is devoted to the new work, and a brief summary can be found on the DeepMind blog.

The simplest neural networks are prediction or regression systems whose task is to map input data to some answer. For example, a simple neural network can recognize characters from their images. In this sense, a neural network can be regarded as a mathematical function, and a differentiable one at that. To train a neural network in this paradigm means to optimize that function using standard mathematical methods.
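To make the "neural network as a differentiable function" point concrete, here is a minimal sketch, illustrative only and with an invented toy dataset: a single-neuron classifier trained by plain gradient descent, the same optimization principle used for far larger networks.

```python
import numpy as np

# A one-neuron "network" y = sigmoid(w.x + b) trained on a toy, linearly
# separable two-class problem. Not DeepMind's code, just the principle.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
labels = (X[:, 0] + X[:, 1] > 0).astype(float)  # target: which side of a line

w, b, lr = np.zeros(2), 0.0, 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    p = sigmoid(X @ w + b)            # forward pass through the function
    grad = p - labels                 # d(cross-entropy)/d(logit)
    w -= lr * (X.T @ grad) / len(X)   # gradient step on the weights
    b -= lr * grad.mean()             # gradient step on the bias

accuracy = ((sigmoid(X @ w + b) > 0.5) == labels).mean()
```

Because every operation above is differentiable, the chain rule gives exact gradients, and "training" is nothing more than sliding downhill on the loss.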

The ability to learn from data without direct human programming is the main advantage of neural networks. However, the simplest neural networks are not Turing-complete: they cannot do everything that traditional algorithmic programs can (which does not mean they cannot do some of those things better than programs can). One reason for this is that neural networks lack a memory in which to manipulate input data and store local variables.

Relatively recently, a more complex type of neural network appeared in which this drawback is mitigated: the recurrent neural network. Such networks store not only the learned state (the matrix of neuron weights) but also information about the previous state of the neurons themselves. As a result, the response of such a network is influenced not only by the input data and the weight matrix but also by its immediate history. The simplest network of this type can, for example, "intelligently" predict the next character in a text: trained on dictionary data, it can answer "l" for the input character "l" if the previous characters were "h", "e", and "l", but answer "o" if the previous characters were "h", "e", "l", and again "l" (spelling out the word "hello"; see inset).
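The "hello" example can be shown mechanically with a tiny recurrent cell. The weights below are random and untrained, purely to demonstrate the point: because the hidden state carries history, the same input character produces different outputs after different prefixes.

```python
import numpy as np

# Vocabulary for the "hello" example.
vocab = ['h', 'e', 'l', 'o']
V, H = len(vocab), 8  # vocabulary size, hidden-state size

rng = np.random.default_rng(0)
W_xh = rng.normal(scale=0.1, size=(H, V))  # input -> hidden
W_hh = rng.normal(scale=0.1, size=(H, H))  # hidden -> hidden (the recurrence)
W_hy = rng.normal(scale=0.1, size=(V, H))  # hidden -> output scores

def one_hot(ch):
    x = np.zeros(V)
    x[vocab.index(ch)] = 1.0
    return x

def rnn_scores(prefix):
    """Feed the prefix one character at a time and return output scores."""
    h = np.zeros(H)
    for ch in prefix:
        h = np.tanh(W_xh @ one_hot(ch) + W_hh @ h)  # state depends on history
    return W_hy @ h

# The last input is 'l' in both cases, yet the outputs differ,
# because the hidden state remembers how many l's came before:
y_after_hel = rnn_scores("hel")
y_after_hell = rnn_scores("hell")
```

A trained network of exactly this shape would push `y_after_hel` toward "l" and `y_after_hell` toward "o"; the untrained sketch only shows that history changes the answer at all.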

An example of a recurrent neural network with one hidden layer, showing how feeding in data changes the state of the network. The trained neuron weights are stored in the matrices W_xh and W_hy, along with a special matrix W_hh that is characteristic of recurrent networks.

Andrej Karpathy blog

Recurrent neural networks have performed very well at generating music or text "in the style" of an author on whose corpus they were trained, and more recently in a range of other applied systems.

Formally speaking, even the simplest recurrent neural networks are Turing-complete, but an important drawback lies in their implicit use of memory. In a Turing machine, the memory and the computational unit are separate (which allows their architectures to be varied independently), whereas in recurrent neural networks, even the most advanced of them (LSTM), the size of memory and the way it is handled are determined by the architecture of the network itself.

To correct this inherent flaw of LSTM networks, DeepMind scientists (all of whom are among the authors of the new article) recently proposed the Neural Turing Machine architecture. In it, the controller and the memory are separated, as in a conventional Turing machine, yet the system retains the properties of a differentiable function, which means it can be trained on examples (using backpropagation) rather than explicitly programmed. The new system, the differentiable neural computer (DNC), is based on the same architecture, but communication between controller and memory is organized far more flexibly: it implements not only memorization but also contextual retrieval and forgetting (a separate section of the new article is devoted to comparing the two systems).

Simplified, the operation of the DNC can be described as follows. The system consists of a controller, whose role can be played by almost any recurrent neural network, and a memory. The controller has special modules for accessing memory, and on top of the memory sits a special "overlay" in the form of a matrix that stores the history of its use (more on this below). The memory itself is an N×M matrix whose N rows are the basic cells into which data are written (as vectors of dimension M).


DNC architecture: data streams are shown as rows of black and white squares, which simply represent positive and negative numbers in a vector. Reading has three modes of operation, C, B, and F (content-based, backward, and forward), which are different ways of matching the input vector against the vectors in memory cells. The memory is an N×M matrix. On the far right, an N×N "meta-memory" matrix that stores the sequence of memory accesses is shown schematically.

The main difference between the DNC and related systems is the way it handles memory, which simultaneously implements several new or recently emerged concepts: selective attention, contextual search, recall by association, and forgetting. Whereas ordinary computers address memory explicitly ("write such-and-such data to such-and-such cell"), in the DNC a write formally occurs to all cells at once, with the influence of the new data on the old determined by attention weights over the cells. This implementation of the concept is called "soft" attention, and it is precisely what provides differentiability: systems with hard attention do not satisfy the requirement of continuity and cannot be trained by backpropagation (reinforcement learning is used instead). In practice, however, even the DNC's soft attention is implemented "rather hard", so one can still speak of writing to or reading from a particular row of the memory matrix.
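What "soft" writing and reading mean can be sketched in a few lines of numpy. This is our simplification of the scheme described in the paper, not DeepMind's code: every row of memory is touched by every operation, scaled by an attention weight.

```python
import numpy as np

N, M = 4, 3  # memory rows, word size
memory = np.zeros((N, M))

def soft_write(memory, w, erase, add):
    """Weighted write: every row is updated in proportion to attention w[i]."""
    memory = memory * (1 - np.outer(w, erase))  # erase old content softly
    return memory + np.outer(w, add)            # then blend in new content

def soft_read(memory, w):
    """Weighted read: a convex combination of all rows."""
    return w @ memory

# Attention focused almost entirely (but not exclusively) on row 2:
w = np.array([0.01, 0.01, 0.97, 0.01])
memory = soft_write(memory, w, erase=np.ones(M), add=np.array([1.0, 2.0, 3.0]))
r = soft_read(memory, w)  # close to [1, 2, 3], slightly blurred by soft weights
```

Because both operations are smooth functions of the weights `w`, gradients flow through them, which is exactly the property hard (discrete) addressing lacks.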

"Soft" attention is implemented in the system in three modes. The first is contextual search, which allows the DNC to complete partial data. For example, when a fragment of a sequence resembling one already stored in memory is fed to the controller's input, the read operator in contextual-search mode finds the row closest in content and "mixes" it with the input data.

Second, attention to different parts of memory can be determined by the history of its use. This history is stored in an N×N matrix in which each cell (i, j) holds a value close to 1 if a write to row i was followed by a write to row j (and close to zero otherwise). This "meta-memory matrix" is one of the fundamental differences between the new DNC and the older NTM. It allows the system to "remember" blocks of data in sequence if they frequently occur in each other's context.
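A sketch of how such a meta-memory matrix could be maintained. This is a simplified version of the temporal-link update from the paper, using hard one-hot write weights for clarity rather than the real soft weightings.

```python
import numpy as np

N = 4
link = np.zeros((N, N))  # link[i, j] ~ 1 if row j was written right after row i
prev_w = np.zeros(N)     # write weighting from the previous step

def update_links(link, prev_w, w):
    """Simplified temporal-link update: decay links that touch rows being
    written now, then record that the previous write precedes this one."""
    link = link * np.outer(1 - w, 1 - w)   # forget stale links around row w
    link = link + np.outer(prev_w, w)      # prev row -> current row
    np.fill_diagonal(link, 0.0)            # a row does not precede itself
    return link

# Write row 0, then row 1:
w1 = np.array([1.0, 0.0, 0.0, 0.0])
link = update_links(link, prev_w, w1); prev_w = w1
w2 = np.array([0.0, 1.0, 0.0, 0.0])
link = update_links(link, prev_w, w2); prev_w = w2
# link[0, 1] is now ~1: row 1 was written immediately after row 0.
```

Following high entries of `link` forward replays data in the order it was stored; following them backward replays it in reverse, which is what the B and F read modes in the figure do.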

Third, a special attention mode lets the system control writing to different rows of memory: retaining the important and erasing the unimportant. A row is considered fuller the more times it has been written to, while reading from a row can, on the contrary, lead to its gradual release. The usefulness of this function becomes obvious in the example of training a simple repeater on top of the DNC (the neural network must exactly reproduce the sequence of data fed to it). With the ability to erase, even a small memory is enough to repeat an unlimited amount of data. It should be noted that a repeater is trivial to implement programmatically, but building one from a neural network trained on examples is a much harder task.
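The erase-on-read idea can be shown with a toy allocator. This is a deliberately crude simplification of our own, not the DNC's actual differentiable usage mechanism: with only two memory rows, freeing a row after it is read is enough to echo a stream of any length.

```python
import numpy as np

# Toy repeater memory: write each input to the least-used row and free the
# row once it has been read back.
N, M = 2, 4                        # just two memory rows of width 4
memory = np.zeros((N, M))
usage = np.zeros(N)                # 1.0 = occupied, 0.0 = free

def write(vec):
    row = int(np.argmin(usage))    # allocate the least-used row
    memory[row] = vec
    usage[row] = 1.0
    return row

def read(row):
    vec = memory[row].copy()
    usage[row] = 0.0               # reading releases the row for reuse
    return vec

# Despite having only two rows, the memory can echo a stream of any length:
stream = [np.full(M, float(i)) for i in range(10)]
echoed = [read(write(v)) for v in stream]
```

The DNC learns a smooth version of this policy from examples, rather than having it hard-coded as above.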


Operation of a repeater implemented on top of the DNC. Time runs from left to right. The top band shows the data the controller receives as input: first a column of ten black bars (all zeros), then several white and black ones, then again several white and black bars in a different sequence. Below it, the controller's output is displayed in the same way: first black bars, then an almost exact reproduction of the input patterns (the same white blotches as in the input). Then a new sequence is fed to the input, and after a delay it too is reproduced at the output. The middle band shows what happens in the memory cells during this time. Green squares are writes, pink squares are reads, and saturation shows the "strength of attention" to a particular cell. One can see the system first write the received patterns to cell 0, then 1, and so on up to 4. At the next step, it again receives only zeros (a black field), so it stops recording and begins playing the patterns back, reading them from the cells in the order in which they arrived. At the very bottom, the activation of the gates controlling the freeing of memory is shown.

Alex Graves et al., Nature, 2016

The scientists tested the resulting system on several benchmark tasks. The first was bAbI, a recently developed standardized text-comprehension test created by Facebook researchers. In it, the AI system is given a short text in which several characters act, and it must then answer a question about the text ("John went to the garden. Mary took a bottle of milk. John returned to the house. Question: where is John?").

On this synthetic test, the new system showed a record-low error rate: 3.8 percent versus the previous record of 7.5 percent, outperforming both LSTM networks and the NTM. Interestingly, all the system received as input was a sequence of words which, to an untrained neural network, initially meant nothing. By contrast, the traditional AI systems that had previously passed this test were given well-formalized sentences with a rigid structure: action, agent, truth value, and so on. The recurrent neural network with dedicated memory worked out the roles of the words in those same sentences entirely on its own.

A significantly harder test was graph comprehension. It, too, was presented as a sequence of sentences, but this time they described the structure of a network: the real London Underground, or a typical family tree. The similarity with bAbI is that the characters in a standardized text can also be represented as graph nodes and their relations as edges. In bAbI texts, however, the graph is rather primitive, incomparable to the size of the London Underground (the difficulty of a neural network understanding the Underground map becomes clearer when you recall that its description is given in words, not as an image: try memorizing the metro map of any large city yourself and learning to answer questions about it).

After training on a million examples, the DNC learned to answer Underground questions with 98.8 percent accuracy, while the LSTM-based system barely coped with the task, giving only 37 percent correct answers (the figures are for the simplest kind of task, such as "where will I end up if I pass so many stations on such-and-such a line, transfer, and pass so many more stations?". The problem of the shortest route between two stations proved harder, but the DNC handled that as well).

A similar experiment was carried out with a family tree: the program was given a sequence of formal sentences about kinship in a large family and had to answer questions like "who is Masha's second cousin on her mother's side?". Both problems reduce to finding a path in a graph, which is easily solved in the traditional way. The value of the work lies in the fact that here the neural network found a solution entirely on its own, relying not on algorithms known from mathematics but on examples and a reward signal during training.

Learning-speed curves on the SHRDLU task for the DNC (green) and LSTM (blue) systems.

The third test was a slightly simplified version of the "classic" SHRDLU task, in which virtual objects must be moved around a virtual space to reach a specified final configuration. The DNC again received a description of the current state of the virtual space as formalized sentences, was then given a task in the same form, and answered with coherent text describing how to move the objects. As in the other tests, the DNC proved significantly more efficient than LSTM systems, as the learning-rate graphs clearly show.

At the risk of repeating obvious things, I must emphasize that the apparent simplicity of the tasks on which the DNC was tested is just that: apparent. It does not reflect the complexity of the real problems a system like the DNC will eventually be able to handle. From the point of view of existing algorithms, of course, finding a route on the metro is trivial: anyone can download a phone app that does it, complete with transfer times and advice on which car to board. But all such programs have so far been written by humans, whereas in the DNC the solution is "born" by itself in the process of learning from examples.

In fact, there is one very important thing worth saying about the simplicity of the test tasks. One of the biggest challenges in machine learning is where to get the data on which to train a system. Obtaining the data "by hand", that is, creating it yourself or with hired help, is too expensive. Any machine learning project needs a simple algorithm that can easily and cheaply create gigabytes of new training data (or access to ready-made databases). A classic example: to train character-recognition systems, people do not write out letter after letter by hand; they use a simple program that distorts existing images. If you do not have a good algorithm for obtaining a training sample (or such an algorithm cannot be created in principle), development will be about as successful as in medical bioinformatics, where researchers are forced to work only with real, and therefore scarce, "gold standard" data (in a nutshell: not very successful).
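The character-distortion trick mentioned above fits in a few lines. The 8×8 "letter" below is a made-up stand-in for real handwriting data:

```python
import numpy as np

# Cheap data augmentation: instead of collecting new handwritten characters,
# distort existing ones with random shifts and noise.
rng = np.random.default_rng(42)

def distort(image, max_shift=1, noise=0.05):
    """Return a randomly shifted, noisy copy of a 2-D grayscale image."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    shifted = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
    return np.clip(shifted + rng.normal(0, noise, image.shape), 0.0, 1.0)

base = np.zeros((8, 8))
base[2:6, 3] = 1.0  # a crude vertical stroke, standing in for a letter

# One original yields as many distinct training examples as we like:
augmented = [distort(base) for _ in range(1000)]
```

Each distorted copy is still recognizably the "same" character to a human, which is exactly what makes it a valid, nearly free training label.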

This is where ready-made graph algorithms came in handy for the authors: they provided millions of correct question-answer pairs. There is no doubt that the ease of creating a training sample shaped the nature of the tests used on the new system. It is important to remember, though, that the DNC architecture itself is in no way limited by the simplicity of these tests. After all, even the most primitive recurrent neural networks can not only translate texts and describe images but also write music or generate sketches (to the author's ear, of course, with varying success). All the more can be expected of an advanced, genuinely "smart" system like the DNC.
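As an illustration of how cheaply such training pairs can be produced (the toy graph and station names below are invented), an ordinary breadth-first search serves as the "teacher" that labels every question with a ground-truth answer:

```python
from collections import deque
import random

# A tiny stand-in for a metro network: station -> neighboring stations.
graph = {
    "A": ["B", "C"], "B": ["A", "D"],
    "C": ["A", "D"], "D": ["B", "C", "E"], "E": ["D"],
}

def shortest_path(src, dst):
    """Breadth-first search returning one shortest path as a list of nodes."""
    prev, frontier = {src: None}, deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in prev:
                prev[nxt] = node
                frontier.append(nxt)
    return None

def make_example(rng):
    """Generate one labelled (question, answer) training pair for free."""
    src, dst = rng.sample(sorted(graph), 2)
    return (f"shortest path from {src} to {dst}?", shortest_path(src, dst))

rng = random.Random(0)
question, answer = make_example(rng)
```

Run in a loop, `make_example` mints an effectively unlimited supply of correct supervision, which is precisely why graph tasks made such convenient benchmarks.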

Alexander Ershov

Many companies are currently developing artificial intelligence (AI), and its simplest forms, capable of carrying out primitive mental operations, have already been created.

The Internet giant Google is actively engaged in AI development. In 2014 the company acquired the startup DeepMind Technologies for a reported $400 million. Notably, it was DeepMind Technologies that developed a device combining the properties of a neural network with the computing capabilities of a computer. Scientists are confident that this development brings humanity closer to the creation of a full-fledged artificial intelligence.

The DeepMind Technologies device is a computer that reproduces the way the human brain stores and manages information, specifically its short-term memory. At the core of the device is a kind of neural network whose structure resembles that of the human brain, consisting of interconnected neurons. The peculiarity of this AI is that after completing a series of simple tasks, the computer can use the stored data to perform more complex ones. AI thus has the capacity for self-learning and a drive toward evolution, which could ultimately lead to confrontation between AI and humans.

According to the world-famous physicist Stephen Hawking, artificial intelligence poses a threat to humanity. He stated this in an interview with the BBC: "The primitive forms of artificial intelligence that exist today have proven their usefulness. However, I think that the development of a fully fledged artificial intelligence could end the human race. Sooner or later, man will create a machine that will get out of control and surpass its creator. Such a mind will take the initiative and improve itself at an ever-increasing rate. The possibilities of people are limited by too-slow evolution; we will not be able to compete with the speed of the machines, and we will lose."

Hawking's opinion is shared by other scientists and specialists, including Elon Musk, the well-known American IT entrepreneur and founder of Tesla and SpaceX. Musk has said that AI could be more dangerous than nuclear weapons and poses a serious threat to the existence of mankind.

Google has set itself the goal of creating a superintelligence by 2030, one embedded in computer systems, and in particular in the Internet. When a user searches for information, the superintelligence would analyze that person's psychological profile and give them the information it considers appropriate. Eric Schmidt, chairman of Google's board of directors, writes about this in his book, and proposes that those who refuse to connect to the system be regarded as potentially dangerous to the state. It is assumed that a legislative framework for the system's operation will be prepared at the state level.

Thus, the superintelligence so developed would become a global instrument of control over humanity. With its advent, people would stop doing science; that would be done by the superintelligence, which would surpass the human brain many times over in every aspect.

Reference:

A superintelligence is any mind vastly superior to the leading minds of humanity in almost every area, including scientific research, social skills, and other fields.

The result of creating a superintelligence would be that the human species ceases to be the most intelligent form of life in the known part of the universe. Some researchers believe that creating a superintelligence is the final stage of human evolution, as well as the last invention humanity will ever need to make, since superintelligences are expected to take care of subsequent scientific and technological progress far more efficiently than people can.

Food for thought:

Since 2007, a British hotel has hosted the annual Google Zeitgeist conference. Notably, its participants include not only high-tech specialists but also representatives of transnational corporations and international banks. One may conclude that the leaders of transcontinental corporations and international banks are interested in the creation of a superintelligence, and possibly finance the project.

Rasul Girayalaev

It seems quite likely that artificial intelligence (AI) will herald the next technological revolution. If AI evolves to the point where it can learn, think, and even "feel", all without any human input, everything we know about the world will change almost overnight, and the era of truly intelligent artificial intelligence will arrive.


That is why we are so interested in tracking the major milestones in AI development happening today, including the development of Google's DeepMind neural network. This neural network has already beaten humans in games, and a new Google study shows its creators testing whether AIs prefer aggressive or cooperative behavior.

The Google team created two relatively simple scenarios to test whether neural networks can work together or start destroying each other when they face a shortage of resources.

Gathering resources

In the first scenario, called Gathering, two participating versions of DeepMind, red and blue, were given the task of harvesting green "apples" inside an enclosed space. But the researchers were interested in more than just who would reach the finish line first. Both versions of DeepMind were armed with lasers they could fire at the opponent at any time to temporarily disable it. These conditions suggested two basic outcomes: one version of DeepMind would destroy the other and collect all the apples, or they would let each other gather roughly equal amounts.

Running the simulation at least a thousand times, the Google researchers found that DeepMind was quite peaceful and cooperative as long as plenty of apples remained in the enclosed space. But as resources dwindled, the red and blue versions of DeepMind began attacking and disabling each other, a situation strongly reminiscent of the real life of most animals, including humans.

More significantly, the smaller and less "intelligent" neural networks were more inclined to cooperate throughout, while the larger, more complex networks tended toward betrayal and selfishness across the series of experiments.
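A back-of-the-envelope model (entirely our own construction, not DeepMind's analysis) shows why scarcity tilts the incentive toward aggression: zapping freezes the rival for a fixed number of steps, and the fewer apples there are in total, the larger the share of them that window captures.

```python
def payoff(total_apples, zap, timeout=25):
    """Apples collected by one of two equally fast agents in a toy model."""
    if not zap:
        return total_apples / 2              # peaceful even split
    solo = min(total_apples, timeout)        # collected while the rival is frozen
    return solo + (total_apples - solo) / 2  # then split whatever remains

# With scarce apples, zapping doubles the haul; with plenty, it barely helps:
adv_scarce = payoff(10, zap=True) / payoff(10, zap=False)      # 2.0
adv_plenty = payoff(1000, zap=True) / payoff(1000, zap=False)  # ~1.025
```

The learned networks of course discovered this trade-off from raw experience rather than from any such formula, which is what makes the result striking.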

Search for "victim"

In the second scenario, called Wolfpack, the red and blue versions were asked to track down a nondescript "prey" shape. Each could try to catch it alone, but it was more profitable to do so together: after all, it is much easier to corner the prey when working in pairs.

While the results were mixed for the smaller networks, the larger versions quickly realized that cooperation rather than competition would be more beneficial in this situation.

"Prisoner's Dilemma"

So what do these two simple variations on the prisoner's dilemma show us? DeepMind knows it is best to cooperate when tracking down a target, but when resources are scarce, it is betrayal that works well.
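For reference, the structure underlying both experiments is the textbook prisoner's dilemma. The payoff values below are the standard textbook ones, not DeepMind's reward settings: defection is each player's best reply in a single round, even though mutual cooperation beats mutual defection.

```python
# Classic one-shot prisoner's dilemma payoffs:
# (my move, their move) -> my reward, with C = cooperate, D = defect.
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def best_response(their_move):
    """Pick the move that maximizes my one-shot payoff against theirs."""
    return max("CD", key=lambda me: PAYOFF[(me, their_move)])

# Defecting dominates either way...
defect_vs_c = best_response("C")  # "D"
defect_vs_d = best_response("D")  # "D"
# ...even though both players would earn more under (C, C) than (D, D).
```

The Gathering and Wolfpack results fit this template: change the payoffs (scarce apples, a prey that needs two hunters) and the rational move flips between defection and cooperation.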

Perhaps the most unsettling thing about these results is how closely the "instincts" of artificial intelligence resemble human ones, and we know well where those sometimes lead.