MIT-IBM scientists develop an AI algorithm to recommend topics based on preferences

This article was first published on our sister site, Whats New On The Net.

There’s a vast amount of text available on the Internet. Books, blogs & news articles abound, so deciding what to read can be a challenge. This is especially true for companies such as Amazon or Flipboard, which make it their business to recommend reading material to their customers.

There’s a need to cut through the noise, evaluate the merit of each piece of writing & decide whether a particular person, based on their preferences, might enjoy it.

Researchers with the MIT-IBM Watson Artificial Intelligence Laboratory & the MIT Geometric Data Processing Group believe they’ve developed the very best AI algorithm to achieve this objective.

The MIT-IBM team uses a combination of well-respected AI techniques to parse text on the Internet & break it down into usable sections, from which they determine what the content is about. They then ‘score’ the topics found in each document & use those scores to recommend articles to consumers.
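To make the idea of ‘scoring’ topics a little more concrete, here is a minimal sketch in Python. It is not the team’s code: it swaps in plain keyword counting where a real system would learn its topics from data, and the `TOPIC_WORDS` vocabularies and the `topic_scores` function are hypothetical names invented for this illustration.

```python
import re
from collections import Counter

# Hypothetical topic vocabularies, invented for this example.
# A real system would learn its topics from a large text collection.
TOPIC_WORDS = {
    "sports": {"match", "team", "score", "league"},
    "technology": {"software", "chip", "algorithm", "data"},
}

def topic_scores(text, topic_words=TOPIC_WORDS):
    """Return each topic's share of the topic-related words found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for topic, vocab in topic_words.items():
            if word in vocab:
                counts[topic] += 1
    total = sum(counts.values())
    if total == 0:
        # No topic words found: every topic gets a score of zero.
        return {topic: 0.0 for topic in topic_words}
    return {topic: counts[topic] / total for topic in topic_words}

print(topic_scores("The algorithm runs on a new chip and software"))
```

The scores form a simple profile of what a document is about, which is the kind of summary a recommender can compare against a reader’s history.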

The scientists employ two major techniques, embeddings & optimal transport, which allow them to analyze millions of documents at hitherto unheard-of speeds. The information obtained in this manner can then be used to offer articles to people based solely on their historical preferences.

To give a little more detail about how the algorithm achieves this aim, assistant professor Justin Solomon has explained elsewhere that a text collection is summarized into topics based on the commonly used words that define each topic. Once summarized by theme, each document is further divided into its 5 to 15 most relevant or important topics. The algorithm also represents words numerically (embeddings), so that words with similar meanings have similar representations. Optimal transport, the second technique used, calculates the most efficient way to move objects (or data points) between multiple destinations (generally used for comparisons).
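For readers who like to see the moving parts, the comparison step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team’s implementation: the 2-D topic vectors & the document weights are made up, & the function `sinkhorn_cost` uses Sinkhorn iteration, a standard way of approximating optimal transport that was chosen here only because it fits in a short example.

```python
import math

def euclidean(u, v):
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def sinkhorn_cost(a, b, cost, reg=0.2, iters=300):
    """Approximate optimal-transport cost between two histograms.

    a, b : topic weights of two documents (positive, each summing to 1)
    cost : cost[i][j] is the price of moving one unit of weight
           from topic i to topic j
    reg  : entropic smoothing; smaller values track exact optimal
           transport more closely but converge more slowly
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns so the plan's totals match a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan

# Hypothetical 2-D topic vectors, invented for illustration only.
topics = {"sports": (0.0, 1.0), "technology": (1.0, 0.0), "science": (0.9, 0.2)}
names = list(topics)
cost = [[euclidean(topics[p], topics[q]) for q in names] for p in names]

# Made-up topic weights for three documents, in the order of `names`.
doc_tech = [0.1, 0.6, 0.3]
doc_science = [0.1, 0.3, 0.6]
doc_sports = [0.8, 0.1, 0.1]

near, _ = sinkhorn_cost(doc_tech, doc_science, cost)
far, _ = sinkhorn_cost(doc_tech, doc_sports, cost)
print(f"technology vs science: {near:.3f}")
print(f"technology vs sports:  {far:.3f}")
```

Because the technology & science topics sit close together in this toy setup, shifting weight between them is cheap, so those two documents come out as more alike than the technology & sports pair. A recommender could rank unread articles by this kind of distance from what a reader has already enjoyed.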

To demonstrate the effectiveness of their algorithm, the MIT-IBM team analyzed 1,720 pairs of titles from Project Gutenberg. The algorithm compared them all in just one second, more than 800 times faster than the next-best method. The team’s system is also excellent at sorting documents into categories & providing lists of themes & topics that are easy for humans to understand. All this makes it much simpler for people to find relevant documents, or even products they might like, which the algorithm could identify from, for instance, product reviews.


 
