Data Preprocessing – Normalization - Real time example
In this article, we are going to see normalization in action in a popular web application. If you are not familiar with normalization, please refer to my previous post.
Let us look at another interesting thing here by rephrasing our query. I want to find out who was the most popular tennis player in 2009 among Roger Federer, Serena Williams and Venus Williams. You may think that obviously everyone knows Federer is going to be the most popular, so what is interesting about this? Come on, be patient! I swear it's going to be interesting. Look at the new graph displayed below.
It doesn't end with this. I am going to share a critical secret about normalization with you. We programmers and computer scientists are very familiar with the concept of encapsulation, which is nothing but hiding the actual implementation details. Think about what Google is doing here: Google doesn't want to share the actual search volume counts, but it does want to convey the relative significance of the keywords. Yes, I think you've got it. Normalization can be considered a method of encapsulating the actual data. Say, for example, you want a data mining expert to analyse your business but you do not want to reveal the actual revenue your business generates. Just normalize the data to a scale of, say, [0-100000] and hand the data to him. You get your analysis done, and he never gets an opportunity to learn your actual revenue as long as you keep your normalization algorithm a secret. Of course, we need to understand that there may be some errors depending on the normalization technique that we adopt.
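To make the revenue idea concrete, here is a minimal sketch of one possible technique, min-max scaling onto [0, 100000]. This is only an assumption about what such a normalization could look like; it is not what Google actually does. The function name min_max_scale and the revenue figures are made up for illustration.

```python
def min_max_scale(values, new_min=0.0, new_max=100_000.0):
    """Linearly rescale values so the smallest maps to new_min
    and the largest maps to new_max."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # All values are identical, so there is no spread to preserve;
        # map everything to the bottom of the target range.
        return [new_min for _ in values]
    span = old_max - old_min
    return [new_min + (v - old_min) * (new_max - new_min) / span
            for v in values]


# Hypothetical monthly revenue figures, invented purely for illustration.
revenue = [12_500.0, 18_000.0, 15_250.0, 23_500.0]

# The expert receives only `scaled`; the original minimum and maximum
# stay with the business owner.
scaled = min_max_scale(revenue)
print(scaled)
```

The scaled list keeps the ordering and the relative spacing of the original figures, which is what the analysis needs, but the actual amounts cannot be read off it without knowing the original minimum and maximum.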
I appreciate your feedback and comments.
Good one with an understandable example using tennis players .good job.
kewl man!! This blog goes on to my fav list now..
Good one.
You might be interested to take a look at the collection of Tutorials and videos on Data mining.
Tutorials: http://www.dataminingtools.net/browsetutorials.php
Videos:
http://www.dataminingtools.net/browse.php
nice area to know the data mining concept