Friday, September 23, 2011

Data Mining Tools Catching Up with Big Data

With the data explosion growing every day, the well-established data mining tools are getting ready to take on Big Data with the help of the Hadoop framework. Hadoop is an open-source implementation of Google's MapReduce model, written in Java. It provides a framework for massively parallel, distributed computing on commodity hardware.
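To give a feel for the MapReduce programming model, here is a minimal sketch in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and everything runs in a single process. It only shows the shape of the idea: a map step emits key/value pairs, the framework groups them by key, and a reduce step combines each group.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, standing in for what the framework does
    between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data needs big tools", "data mining tools"]
    print(word_count(docs))
```

In a real cluster the map calls would run in parallel on different machines and the grouping would happen over the network. The sketch keeps everything local so the data flow is easy to follow.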

Mahout is a machine learning framework built on top of the Hadoop framework; it implements a few machine learning algorithms.

R is well known among statisticians and is also widely used by data miners. Revolution Analytics is integrating R with Hadoop. For more details, visit the link below, where a white paper and a presentation are available for download: REVOLUTION WEBINAR: LEVERAGING R IN HADOOP ENVIRONMENTS

Rapid Miner is another good data mining tool, available for free to the community and practitioners. Rapid-I is working on integrating it with Hadoop and has come up with Radoop, which connects Rapid Miner with Hadoop and Mahout and aims to provide an easy user interface to Mahout and Big Data analytics.

Everyone is rushing to keep pace with Big Data. It would be great if Weka, a widely used open-source machine learning tool that is implemented in Java and works in memory, were also rewritten to take advantage of Hadoop. Of course, we should keep in mind that not all algorithms can be implemented using MapReduce, and integrating Weka with Hadoop could be a daunting task.
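As a rough illustration of what does fit the MapReduce style, here is a small sketch in plain Python (again no Hadoop; the names `summarize`, `merge` and `mean_and_variance` are hypothetical). It computes a mean and a population variance by reducing each data partition to a small summary and then merging the summaries. This works because the summaries can be combined in any order. Algorithms that need many passes over the data with shared state between passes do not decompose this neatly, which is part of why porting an in-memory tool could be hard.

```python
from functools import reduce


def summarize(chunk):
    """Map step: reduce one partition to (count, sum, sum of squares)."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))


def merge(a, b):
    """Reduce step: combine two partial summaries. The operation is
    associative and commutative, so partitions can be merged in any order."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_variance(partitions):
    """Combine per-partition summaries into a mean and population variance.
    Assumes at least one non-empty partition."""
    n, total, total_sq = reduce(merge, map(summarize, partitions), (0, 0.0, 0.0))
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, variance


if __name__ == "__main__":
    # Two partitions, as if the data were split across two machines.
    print(mean_and_variance([[1.0, 2.0], [3.0, 4.0]]))
```

The sum-of-squares formula is used here because it merges trivially; it can lose precision on large or badly scaled data, so treat it as a teaching example rather than a recommended implementation.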
