OPLIN 4Cast #310: Getting the picture

Librarians know about metadata. So do search engines. The words that describe something are often critical to being able to find it. Searching a library catalog for a specific title (part of the metadata) works very well, as does searching for things on the Internet that have clear and accurate descriptions. But we’ve all experienced the library patron who says, “I don’t remember the title, but it was a big, red book with a green leaf on the cover,” or some similarly unhelpful description. Like many people, they have a good visual memory of what the book looks like, even if they can’t remember the words on the cover. Search engines and other applications have the same problem when trying to find a poorly described image or video on the Internet, but computer techies are working hard to find a solution.

Seeking a better way to find web images (New York Times/John Markoff) “Now, along with computer scientists from Princeton, Dr. Li, 36, has built the world’s largest visual database in an effort to mimic the human vision system. With more than 14 million labeled objects, from obsidian to orangutans to ocelots, the database has become a vital resource for computer vision researchers. The labels were created by humans. But now machines can learn from the vast database to recognize similar, unlabeled objects, making possible a striking increase in recognition accuracy.”
The midnight epiphany that changed Like.com from an over-hyped failure to a $100 million acquisition (Fast Company/Sindya N. Bhano) “This second iteration of the Riya’s technology allowed users to find an image, say of a strappy red shoe, and request Like.com to do a ‘Likeness search’ to find similar items. Users could find variations of products in different colors, shop for clothing similar to what celebrities were wearing, and upload images of their favorite items, then scour the web for similar items.”
gazeMetrix using image recognition tech to find branded Instagram photos (BetaKit/Humayun Khan) “The company’s technology uses an algorithm that breaks down the unique characteristics of a brand’s logo, everything from the corners, shapes, lines, shadows, and colors, to create a brand signature. From there each photo is processed using what Singh termed ‘fuzzy matching,’ which means that even if the logo in the image is only partially showing, is on a wrinkled t-shirt or any piece of clothing or anywhere else, it will still pick it up and match it to the brand. What brands can then do is aggregate all the images containing their logo and eventually will be able to interact with those who uploaded the photos to boost brand engagement.”
DARPA seeks breakthroughs in computer vision (EE Times/Rick Merritt) “The Mind’s Eye program aims to develop breakthrough algorithms for automatically recognizing and describing human activities. Donlon showed small steps forward—and a few bloopers—from his first 18 months of work on the three-year effort. For example, efforts of a dozen systems failed to recognize a running dog; one described a collision between two shopping carts as ‘the car left.’ In particular, current algorithms have difficulty detecting forearm motions that are key to activities of high interest such as giving and taking.”

Images fact:
If a picture is worth a thousand words, there’s an incredible amount of non-textual information on the Internet. Five million images a day are uploaded to Instagram alone.

