An artificial intelligence-based system developed by Microsoft has achieved the maximum possible score in the Ms. Pac-Man game, 999,990. The system was built by a team at Maluuba, a Canadian deep learning startup acquired by Microsoft earlier this year. The team used a divide-and-conquer method that could have broad implications for teaching AI agents to do complex tasks. This is a significant achievement because AI researchers have long found Ms. Pac-Man among the most difficult games to crack. The Maluuba team calls the technique Hybrid Reward Architecture. Read about it in detail below.
The technique used more than 150 agents, each working in parallel with the others to master Ms. Pac-Man. For example, some agents were rewarded for successfully finding one specific pellet, while others were tasked with staying out of the way of ghosts. The researchers then created a top agent, sort of like a senior manager at a company, which took suggestions from all the agents and used them to decide where to move Ms. Pac-Man.
The top agent took into account how many agents advocated for going in a certain direction, but it also looked at the intensity with which they wanted to make that move. For example, if 100 agents wanted to go right because that was the best path to their pellet, but three wanted to go left because there was a deadly ghost to the right, it would give more weight to the ones who had noticed the ghost and go left.
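The weighing described above can be sketched in a few lines of Python. This is only an illustration and not Maluuba's implementation: the `choose_action` and `majority_vote` helpers, the score values, and the rule of summing scores per direction are assumptions chosen to reproduce the 100-versus-3 example.

```python
from collections import Counter, defaultdict

ACTIONS = ("up", "down", "left", "right")

def choose_action(agent_preferences):
    """Sum every agent's score per action and return the highest total.

    Each agent reports a dict mapping action -> score. The sign and size
    of a score stand in for how intensely the agent wants (or fears)
    that move. Hypothetical aggregation rule, for illustration only.
    """
    totals = defaultdict(float)
    for prefs in agent_preferences:
        for action, score in prefs.items():
            totals[action] += score
    return max(ACTIONS, key=lambda a: totals[a])

def majority_vote(agent_preferences):
    """Count only each agent's favorite action, ignoring intensity."""
    votes = Counter(max(p, key=p.get) for p in agent_preferences)
    return votes.most_common(1)[0][0]

# 100 pellet agents mildly prefer "right" (made-up score of 1.0 each).
pellet_agents = [{"right": 1.0} for _ in range(100)]
# 3 ghost agents strongly prefer "left" and object to "right"
# (made-up scores of +/-40.0 each).
ghost_agents = [{"left": 40.0, "right": -40.0} for _ in range(3)]

all_agents = pellet_agents + ghost_agents
chosen = choose_action(all_agents)    # intensity-weighted decision
counted = majority_vote(all_agents)   # head count only
print(chosen, counted)
```

With these made-up numbers a plain head count favors going right, while the intensity-weighted sum goes left, which is the behavior the example in the text describes.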
The technique is particularly interesting because many complex tasks that would normally be too difficult for a machine learning system can be broken down into multiple simpler tasks. That has significant implications for the amount and type of work AI will soon be able to displace.