Big Data: Principles and Examples Vol. 1

Big Data has become the subject of Big Hype, much as Social Media and Mobile were recently. Our goal today is to peel back the hype and discover some of the key principles behind Big Data so we can make the best possible decisions about when, where, and how to apply it.

My background with Big Data has predominantly been in retail, as Principal Engineer in Personalization at Amazon, and now Chief Scientist at RichRelevance, so I will use several retail examples. However, the principles behind these examples are without question more broadly applicable. These principles are:

  1. Before we look at any data, we have to have a clear and well-defined goal. Otherwise we are likely to find very clever solutions to the wrong problems.
  2. Smart data science requires the same fundamental scientific method—hypothesis, experimentation, and analysis—as every other science.
  3. Correlation is not causation. We all know this, but in a big data world it is much easier to confuse the two.
  4. Data are economic assets. Understanding them as such helps us understand how to motivate all participants in the data economy, from individuals to corporations to governments and non-profits.

The Netflix Prize

The Netflix Prize has done more to bring Big Data, and data science in general, to the public mind than any other event. This has been great for increasing the visibility of the field but, I’m sad to say, miserable for actual practice. The saddest part is that the winning algorithms are not in use at Netflix today, and are unlikely ever to be.

What went wrong? Fundamentally, the contest violated Principle 1. It did not ask contestants to optimize the right thing, which is which films to recommend to customers. Instead, the contest judged algorithms by how well they predicted the ratings customers would give to movies. So far that doesn’t sound completely illogical. If you know what I will rate highly, you can recommend it to me.

Unfortunately, the algorithms were judged across the entire ratings scale. So being able to tell the difference between something to which I would give one star and something to which I would give two stars was just as important as being able to tell the difference between something to which I would give four stars and something to which I would give five.

Why does this matter? Well, chances are I would never recommend either the one-star or the two-star film. Does it really matter that I can tell the difference between films you despise and films you merely don’t like? It almost certainly does not. And it certainly does not matter nearly as much as knowing the difference between films you will love and those you will merely like.
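To make the point concrete, here is a small sketch in Python. Everything in it is made up for illustration: the eight ratings, the two hypothetical models, and the `top_k_hit_rate` helper are my assumptions, not contest data or anything Netflix used. It scores both models with root mean squared error, which weighs every part of the ratings scale equally, and then checks how many of each model's top two picks are films the customer actually rated five stars.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error over every rating, low or high."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def top_k_hit_rate(actual, predicted, k, loved=5):
    """Fraction of the k highest-predicted films that the customer
    actually rated `loved` stars or more (hypothetical helper)."""
    ranked = sorted(range(len(predicted)),
                    key=lambda i: predicted[i], reverse=True)
    top = ranked[:k]
    return sum(1 for i in top if actual[i] >= loved) / k

# Illustrative ratings for eight films from one customer (invented data).
actual = [1, 2, 1, 2, 4, 5, 4, 5]

# Hypothetical model A: perfect on disliked films, confuses 4 and 5 stars.
model_a = [1, 2, 1, 2, 5, 4, 5, 4]

# Hypothetical model B: confuses 1 and 2 stars, perfect on liked films.
model_b = [2, 1, 2, 1, 4, 5, 4, 5]

for name, predicted in [("A", model_a), ("B", model_b)]:
    print(name,
          "RMSE:", round(rmse(actual, predicted), 4),
          "top-2 hit rate:", top_k_hit_rate(actual, predicted, k=2))
```

Both models make four one-star mistakes, so their error scores are identical, yet only model B puts the films this customer loves at the top of the list. A metric that cannot separate these two models is measuring the wrong thing for a recommender.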

So what went wrong? Principle 1 was violated, the data scientists were unleashed, and we got great solutions to the wrong problem. Netflix got a lot of press, the winners got some cash, but the solution never went into production.
