Diving into Data Mining

January 23, 2015 | March 20th, 2015 | Blog Posts

As part of the TEAM program, students are encouraged to take up to three technical elective courses. These electives are a perfect complement to the core entrepreneurship management courses, as they let us get in touch with the latest technologies and focus on the specific areas we might need to develop our future startups. In my case, coming from a computer science background and being deeply interested in building web and mobile apps, I decided to take Data Mining and Human Computer Interaction during the fall semester. I can say today that I could not have made a better decision. I learned a lot from both courses and, more importantly, I was able to put my knowledge into practice by working on small projects that drew on my data science and computer science skills. On top of that, after the semester ended, both professors asked me to do research for their courses, which means I will be able to dive further into the stuff I love and will have plenty of time to do it!

In this post, I am going to talk about Data Mining and will leave Human Computer Interaction for my next article. Data Mining is the extraction of interesting patterns from large sets of data, using techniques drawn from artificial intelligence, statistics, machine learning, and database systems. These patterns are then translated into useful information that end users or clients can easily understand.

For example, a supermarket might use data mining to analyze its customers’ purchases and make smart decisions based on the results. The analysis might help it understand the different kinds of customers who shop there, what they look for, how much they spend on average, and which products they usually buy. It might also help to find correlations between products (e.g., whether most of the customers who buy gin also buy tonic and lemons), or even to predict whether putting a new product on a specific shelf would be a good decision or whether it should go somewhere else.

There is a story that years ago, Walmart found that many young American males bought diapers and beer together, a correlation that is unexpected and not obvious at first glance. You would expect customers buying diapers to also buy baby food, for example, but not beer. After doing some analysis, researchers reached the conclusion that these were young new fathers who were asked by their wives to go out to buy some diapers and, once they were at the shop, took the chance to also get some beer!
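To make the idea of a product correlation a bit more concrete, here is a minimal sketch in Python of how a rule such as "customers who buy gin also buy tonic" is commonly scored, using the standard support, confidence, and lift measures. The baskets below are made up for illustration, and the function name `rule_stats` is my own; this is not data or code from any real supermarket.

```python
def rule_stats(baskets, antecedent, consequent):
    """Score the rule antecedent -> consequent over a list of baskets.

    Returns (support, confidence, lift):
      support    = share of baskets containing both sides of the rule
      confidence = share of antecedent baskets that also contain the consequent
      lift       = confidence divided by the overall share of consequent baskets
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(baskets)
    n_a = sum(1 for b in baskets if a <= b)         # baskets with the antecedent
    n_c = sum(1 for b in baskets if c <= b)         # baskets with the consequent
    n_ac = sum(1 for b in baskets if (a | c) <= b)  # baskets with both
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical shopping baskets (invented for this example).
baskets = [
    {"gin", "tonic", "lemons"},
    {"gin", "tonic"},
    {"gin", "tonic", "lemons", "bread"},
    {"bread", "milk"},
    {"gin", "milk"},
    {"tonic", "bread"},
    {"diapers", "beer"},
    {"milk", "lemons"},
]

support, confidence, lift = rule_stats(baskets, {"gin"}, {"tonic"})
print(f"support={support:.3f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two products show up together more often than they would if purchases were independent, which is the kind of signal behind stories like the diapers-and-beer one.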

There is plenty of literature you can read about this fascinating discipline, so I will stop talking about Data Mining in general and tell you about the project I worked on at the end of the course. Christmas was coming and I wanted to give a friend a book as a gift, so I faced a problem: what books would this friend like? Gustame is an app I developed to address this problem. By signing in to the site with their Facebook accounts, users can see a list of their friends and, for each of those friends, get recommendations based on the book-related Facebook pages those friends already like. Sounds tricky, huh? Then I should not mention that it uses an open knowledge database (Freebase) to disambiguate redundant entities and that it uses a Slope One algorithm based on your closed graph of contacts to make the recommendation… but I will give you an example so you can figure out how it works.

  • Your friend Albert has liked the Facebook pages “The Hobbit – the book” and “Alice in Wonderland,” so we might assume he has read those two books.
  • Your other friends Bob and Claire also liked those two pages, as well as “The Chronicles of Narnia Books”, which Albert hasn’t liked. This means they have similar tastes to Albert and that they have read a book Albert might enjoy reading too.
  • You get a recommendation to give Albert “The Chronicles of Narnia” books as a gift.
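The example above can be sketched in a few lines of Python. To be clear, this is not the Slope One algorithm and not the app's actual code: it is a simpler stand-in that counts pages liked by friends who share likes with the target, weighted by how many likes they share. The function name `recommend`, the `likes` dictionary, and the extra friend "Dan" are all hypothetical, made up for illustration.

```python
from collections import Counter


def recommend(target, likes, friends, top_n=3):
    """Suggest pages for `target` based on friends with overlapping likes.

    likes   : dict mapping a person's name to the set of pages they like
    friends : names of the people to learn from
    Each candidate page is scored by the summed overlap of the friends
    who like it; friends with no likes in common are ignored.
    """
    target_likes = likes.get(target, set())
    scores = Counter()
    for friend in friends:
        if friend == target:
            continue
        friend_likes = likes.get(friend, set())
        overlap = len(target_likes & friend_likes)
        if overlap == 0:
            continue  # no shared taste, so this friend tells us nothing
        for page in friend_likes - target_likes:
            scores[page] += overlap
    return [page for page, _ in scores.most_common(top_n)]


# Hypothetical data mirroring the Albert, Bob, and Claire example.
likes = {
    "Albert": {"The Hobbit", "Alice in Wonderland"},
    "Bob": {"The Hobbit", "Alice in Wonderland", "The Chronicles of Narnia"},
    "Claire": {"The Hobbit", "Alice in Wonderland", "The Chronicles of Narnia"},
    "Dan": {"Cookbook Classics"},
}

suggestions = recommend("Albert", likes, ["Bob", "Claire", "Dan"])
print(suggestions)
```

With this data, Bob and Claire each share two likes with Albert, so the Narnia page gets a score of 4, while Dan's page is never considered because he has nothing in common with Albert.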

Better yet, I invite you to try the app yourself by visiting www.gustame.com! Any suggestions are welcome!

 

– Agustin Baretto ’15 (MS)
