It comes naturally to humans to tell where a photo was taken if the place is not new to them. But Google’s deep-learning machines are closer than ever to telling where you snapped that selfie last summer. And they do not need famous tourist attractions in the background to do so.
Google engineers explained that humans need a lifetime to build up their knowledge of different places, architecture, vegetation, and the natural and artificial settings around them, all of which are commonly used cues for working out where an image was taken.
So you may think that machines would never be capable of such a feat. Think again. A team of computer vision specialists at Google has trained a supercomputer to recognize the place where an image was taken just by looking at its pixels.
Developers say that the machine is so powerful that it can outperform even the savviest human. It can also tell the location of most indoor pictures and of images that show no recognizable background, only a pet or a random object.
The Google team had the computer divide the planet into a grid whose density varies with the number of photos taken in each location. The team said that polar regions and oceans were not mapped, as few people ever take pictures there. Over large urban areas, where photos are most plentiful, the grid is at its densest.
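To make the idea of a variable-density grid concrete, here is a minimal sketch in Python. It is not Google's code: the quadtree-style splitting on latitude and longitude, the names `Cell`, `build_grid` and `locate`, and the `max_photos`, `min_photos` and `max_depth` settings are all assumptions chosen for illustration. A cell is split into four whenever it holds too many photos, and cells left with too few photos are dropped, which is one way oceans and polar regions could end up unmapped.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    photos: list = field(default_factory=list)  # (lat, lon) pairs in this cell


def build_grid(photos, max_photos=2, min_photos=1, max_depth=8):
    """Split the world recursively so photo-dense areas get smaller cells.

    All three limits are illustrative assumptions, not published values.
    """
    root = Cell(-90.0, 90.0, -180.0, 180.0, list(photos))
    leaves = []
    stack = [(root, 0)]
    while stack:
        cell, depth = stack.pop()
        if len(cell.photos) > max_photos and depth < max_depth:
            lat_mid = (cell.lat_min + cell.lat_max) / 2
            lon_mid = (cell.lon_min + cell.lon_max) / 2
            quadrants = [
                Cell(cell.lat_min, lat_mid, cell.lon_min, lon_mid),
                Cell(cell.lat_min, lat_mid, lon_mid, cell.lon_max),
                Cell(lat_mid, cell.lat_max, cell.lon_min, lon_mid),
                Cell(lat_mid, cell.lat_max, lon_mid, cell.lon_max),
            ]
            # Hand each photo to the quadrant that contains it.
            for lat, lon in cell.photos:
                row = 0 if lat < lat_mid else 1
                col = 0 if lon < lon_mid else 1
                quadrants[2 * row + col].photos.append((lat, lon))
            stack.extend((quadrant, depth + 1) for quadrant in quadrants)
        elif len(cell.photos) >= min_photos:
            leaves.append(cell)
        # Cells with fewer than min_photos photos are dropped from the grid.
    return leaves


def locate(leaves, lat, lon):
    """Return the grid cell containing a point, or None if it is unmapped."""
    for cell in leaves:
        in_lat = cell.lat_min <= lat < cell.lat_max
        in_lon = cell.lon_min <= lon < cell.lon_max
        if in_lat and in_lon:
            return cell
    return None
```

Fed a handful of photos clustered in one city plus a single photo elsewhere, this sketch yields a small cell around the cluster, a large cell around the lone photo, and no cell at all over empty regions.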
Google engineers also created a monster database containing 126 million geolocated photographs collected from the internet. Each image was automatically matched with the place where it was taken.
Next, the researchers trained the supercomputer, which works as a huge artificial neural network, to work out where on the grid 91 million of those photos were taken simply by scanning each picture.
They used the remaining 34 million pictures to validate the network, which they dubbed “PlaNet.” Finally, they tested the machine to see how well it works.
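A split between training and validation photos like the one described above can be sketched in a few lines of Python. This is an illustration only: the article does not say how the researchers divided their data, so the hash-based approach, the function name `assign_split` and the photo IDs are assumptions. Only the 91-to-34 proportion comes from the figures quoted here.

```python
import hashlib

# Share of photos used for training, from the 91 million / 34 million figures.
TRAIN_SHARE = 91 / (91 + 34)


def assign_split(photo_id: str) -> str:
    """Assign a photo to 'train' or 'validation' based on a hash of its ID.

    Hashing keeps the assignment stable: the same photo always lands in the
    same set, however many times the split is recomputed.
    """
    digest = hashlib.sha256(photo_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "train" if bucket < TRAIN_SHARE else "validation"
```

Keeping the two sets strictly apart matters because a network checked against photos it has already seen would look more accurate than it really is.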
For this purpose, they ‘fed’ the computer 2.3 million geotagged photos harvested from Flickr. The network pinpointed the location at street level in 3.6 percent of cases and at city level in 10.1 percent of cases. In 28 percent of cases the machine correctly identified the country, and in 48 percent of cases it correctly identified the continent.
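Percentages like these are typically worked out by measuring how far each guess lands from the photo's true geotag and counting the guesses that fall within a given distance. The Python sketch below shows one way to do that. The distance cut-offs in `THRESHOLDS_KM` are assumptions made for illustration, since the article does not say what counts as street, city, country or continent level, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

# Assumed cut-offs in kilometers; the article does not state the real ones.
THRESHOLDS_KM = {"street": 1, "city": 25, "country": 750, "continent": 2500}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the valid range.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def accuracy_by_level(predicted, actual):
    """Share of guesses that fall within each distance threshold.

    `predicted` and `actual` are equal-length lists of (lat, lon) pairs.
    """
    errors = [haversine_km(*p, *a) for p, a in zip(predicted, actual)]
    return {
        level: sum(error <= limit for error in errors) / len(errors)
        for level, limit in THRESHOLDS_KM.items()
    }
```

Under these assumed cut-offs, a guess of Paris for a photo taken in London would miss at street and city level but count as a hit at country and continent level, which shows how much the reported percentages depend on where the lines are drawn.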
The researchers also pitted the machine against humans. They picked participants with extensive travel experience and asked them to locate random street-level images from Google Street View.
In the end, PlaNet won the competition, coming out ahead in 28 of 50 rounds. The network also had a localization error of 703 miles, while the humans had an average localization error of 1,441 miles. Its creators deemed the machine ‘superhuman.’
Image Source: Flickr