Visual recognition – Our world from AI’s point of view

We develop our vision in the womb. However, once we first see daylight, we can perceive our environment only partially. Recognising shades of colour, focusing on objects, and perceiving depth are big challenges for small people, as babies are physiologically premature at birth compared with other mammals. But after about eight months we’ve done it: our eyesight now resembles that of an adult. In research labs, the vision of AI has been in development for quite some time, and current systems are already achieving remarkable results.

How does that work?

Many researchers see machine vision as one of the most complicated tasks for artificial intelligence. The goal is to understand images and videos, with human vision as the benchmark. Yet even the way we see biologically raises many questions and has only been partially researched. Nevertheless, scientists have developed a model of how machine image analysis proceeds.

Model according to Marr (1982)

As a first step, the algorithm creates a raw sketch (Marr’s »primal sketch«) from the actual image. It describes and locates different brightness levels in order to find the different objects in the image. Where neighbouring areas differ strongly in brightness, the AI recognises an edge. Next, the algorithm derives a 2½-D sketch with the first recognisable depths: evaluating shadows, textures, and overlaps yields information about depth. The result is an initial three-dimensional picture of our world. The AI uses this to create an overall picture of the scene that is independent of the viewer’s perspective. Individual 3D objects are then matched against a database and identified.
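The first step, finding edges where brightness changes sharply, can be sketched in a few lines of Python. This is only a minimal illustration under assumptions: the function name `find_edges`, the tiny test image, and the threshold of 50 are invented for this example and are not part of Marr's model, which works with far more refined filters.

```python
def find_edges(image, threshold=50):
    """Mark pixels whose brightness differs strongly from a neighbour.

    `image` is a list of rows of grey values (0-255). The threshold is an
    assumed cut-off chosen for this example only.
    """
    height, width = len(image), len(image[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Brightness difference to the right and lower neighbour
            # (treated as 0 at the image border).
            dx = image[y][x + 1] - image[y][x] if x + 1 < width else 0
            dy = image[y + 1][x] - image[y][x] if y + 1 < height else 0
            if max(abs(dx), abs(dy)) > threshold:
                edges[y][x] = True
    return edges


# A tiny 4x4 test image: dark left half, bright right half.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edges = find_edges(image)
for row in edges:
    print("".join("#" if is_edge else "." for is_edge in row))
```

The sketch marks the column where dark meets bright. Real systems usually smooth the image first and use more robust operators, but the underlying idea of comparing neighbouring brightness values is the same.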

Image analysis

Once individual objects have been recognised, the next step is to understand the scene. Which interpretation is appropriate depends decisively on the question being asked.

Possible interpretations of this image:

A black and white photo
3 circles, 1 arc, and 20 lines
A person and a car
Emotion: Happy, smile
A person is crossing the street

Thus, there is no single, general approach to image analysis. Thanks to machine learning, the analyses become “smarter” with each application, as the artificial intelligence refines the links between its artificial synapses.

Who can do that?

All major artificial intelligence players offer dedicated image recognition tools, although no machine vision system yet matches what we humans can do. IBM equips its artificial intelligence Watson with Visual Recognition, Google offers its powerful Cloud Vision API, Amazon processes billions of images every day with Rekognition, and Microsoft Azure also offers its own machine vision software.
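To give an impression of what using such a tool looks like, here is a hedged sketch based on the label detection call of Amazon's image analysis service (Amazon Rekognition). The helper `top_labels` and the class `FakeClient` are hypothetical names for this example, and the canned labels are invented so that the code runs offline without an account.

```python
def top_labels(client, image_bytes, min_confidence=80.0):
    """Return (label, confidence) pairs the service reports for an image."""
    response = client.detect_labels(
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


class FakeClient:
    """Stand-in for the real client; returns invented labels."""

    def detect_labels(self, Image, MinConfidence):
        canned = [
            {"Name": "Person", "Confidence": 98.1},
            {"Name": "Car", "Confidence": 91.4},
            {"Name": "Tree", "Confidence": 42.0},
        ]
        # Mimic the service-side filter on the confidence score.
        kept = [item for item in canned if item["Confidence"] >= MinConfidence]
        return {"Labels": kept}


labels = top_labels(FakeClient(), b"not a real image")
print(labels)
```

Against the real service, the client would typically be created with `boto3.client("rekognition")`, which requires AWS credentials, and `image_bytes` would be the contents of an actual image file. The other providers offer comparable calls under their own names.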

What can I do with that?

But what use are these novel models if we cannot yet benefit from them? Companies have already found exciting applications in which artificial intelligence can support us with its »machine eyes«. The image analysis capabilities of the AIs are tuned to the respective situation so that they can carry out their tasks in a targeted way.

Google Lens – the people’s image analyser

Google makes machine image analysis tangible for everyone. With its app »Google Lens«, you can photograph objects with a smartphone and send the pictures to Google. Google Assistant then tries to interpret the image and provide relevant information. This way you get data on buildings, the species of a plant in the living room, or the breed of the neighbour’s dog. An image analyser for basically everyone.

Safety first!

For some time, Amazon has been offering a version of its Fire tablet for the youngest among us, promising peace of mind where children are concerned. Kids get their own picture and video content, with unsafe and inappropriate content blocked. Naturally, for the most part this selection is not made by a real human but by an artificial intelligence. Of course, Amazon uses its own tool »Amazon Rekognition« here.

Your new co-worker: Watson

At John Deere, one of the largest agricultural equipment manufacturers in the world, more than 2,000 tractors leave the production lines every month at the Mannheim plant. Achieving these numbers requires a strong workforce. Watson has become part of it, a colleague worthy of awards such as Employee of the Month. Its »machine eyes« scan components for defects and assist with picking in the warehouse. The error rate approaches zero.

Organising images

One of the simplest tasks for artificial intelligence is sorting photos by content. In the future, huge libraries will no longer have to be tagged and sorted by interns; an analysis tool will simply comb through them. This saves time and nerves and is, above all, more accurate. You can see how such tags work directly at Google or IBM. Whether this will also replace historians, art historians, and librarians, who have to master the subtler skills of contextualisation, remains to be seen.
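Once an analysis tool has assigned tags to each photo, sorting the library is mostly bookkeeping. The following sketch assumes the tags already exist; the file names, the tags, and the function `group_by_tag` are invented for this example and do not come from any particular product.

```python
from collections import defaultdict


def group_by_tag(photo_tags):
    """Build an index from each tag to the photos that carry it."""
    index = defaultdict(list)
    for filename, tags in photo_tags.items():
        for tag in tags:
            index[tag].append(filename)
    # Sort the file names so the result is stable and easy to read.
    return {tag: sorted(files) for tag, files in index.items()}


# Hypothetical output of an image tagging tool.
photo_tags = {
    "holiday_01.jpg": ["beach", "person"],
    "holiday_02.jpg": ["beach", "dog"],
    "office.jpg": ["person", "desk"],
}
index = group_by_tag(photo_tags)
print(index["beach"])
```

With such an index, a search for »beach« returns the matching photos directly, without anyone having tagged the files by hand.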

From image recognition to surveillance mania

The Ministry for State Security would have been delighted: with intelligent image recognition, we are not far from complete surveillance. With »Rekognition«, Amazon is already working with the police to find suspects. Fed with mugshots, the system searches and finds its way even in large crowds. China shows a similar trend: by 2020, over 500 million surveillance cameras are to be installed there, synchronising their data with face recognition databases.

Finding new trends

AI-based image recognition has long been more than a trend: it is a huge market with ambivalent potential, and definitely a topic that nobody can ignore. Successful digital brands are keen to make themselves fit for digital transformation and to remain attractive employers. But how is innovation managed, and what opportunities do one’s own digital platform or voice commerce offer, for example? Handelskraft 2018 »Moving Towards Digital Excellence« covers all of this and is now available for download exclusively for retailers and manufacturers.