WEKA: My Magic 8 Ball

by woodsps56 on 02-22-2010 09:43 PM

I had to pry myself away from Lucene to explore WEKA, an open-source machine learning tool that has been around a while and, like all good open source, just keeps getting better. And the book that introduces the concepts, theory, and practice couldn’t be better written.




 


If you’re like me, when you think of data mining you think of databases, but that’s only a part of what data mining is. WEKA is about machine learning. It’s like having your own Magic 8 Ball: ask a question, shake the ball, and out comes the answer.


 


Here’s my first question.


 


If I only know the counts of key words in a legacy source code module, can I determine the functionality within that code?



So I retrieved a legacy application, counted the key words in each module, classified the key words by functionality, calculated the percentage for each functionality category, and told WEKA what the outcomes were.
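To make the data preparation concrete, here is a minimal sketch of how the counting and percentage step might look. The key words, the functionality categories, and the function names are all hypothetical (the post does not list the ones actually used); the only fixed point is the ARFF text layout, which is the file format WEKA reads.

```python
import re
from collections import Counter

# Hypothetical key-word-to-functionality map; the real lists are not given in the post.
KEYWORD_CATEGORIES = {
    "OPEN": "file_io",
    "READ": "file_io",
    "WRITE": "file_io",
    "COMPUTE": "calculation",
    "ADD": "calculation",
    "DISPLAY": "screen",
    "ACCEPT": "screen",
}
CATEGORIES = sorted(set(KEYWORD_CATEGORIES.values()))


def category_percentages(source):
    """Return the share (in percent) of counted key words per category."""
    words = re.findall(r"[A-Za-z][A-Za-z0-9-]*", source.upper())
    counts = Counter(KEYWORD_CATEGORIES[w] for w in words if w in KEYWORD_CATEGORIES)
    total = sum(counts.values())
    return {c: (100.0 * counts[c] / total if total else 0.0) for c in CATEGORIES}


def to_arff(rows, relation="modules"):
    """Render (percentages, known_outcome) pairs as ARFF text for WEKA."""
    labels = sorted({label for _, label in rows})
    lines = ["@relation " + relation]
    lines += ["@attribute %s numeric" % c for c in CATEGORIES]
    lines.append("@attribute functionality {%s}" % ",".join(labels))
    lines.append("@data")
    for pct, label in rows:
        values = ["%.1f" % pct[c] for c in CATEGORIES]
        lines.append(",".join(values + [label]))
    return "\n".join(lines)


# One made-up module with a made-up outcome label.
module_source = "OPEN F. READ F. COMPUTE X = 1. DISPLAY X."
row = (category_percentages(module_source), "reporting")
print(to_arff([row]))
```

Each module becomes one data row: the numeric columns are the category percentages, and the last column is the outcome I already know, which is what WEKA learns to predict.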


WEKA crunched the numbers and found actual rules I could use to make future module classifications:


Correctly Classified Instances         116               65.9091 %


Incorrectly Classified Instances        60               34.0909 %
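For anyone checking the arithmetic, the two percentages follow directly from the instance counts in the summary above. A small sketch (the variable names are mine, not WEKA's):

```python
# Instance counts taken from the WEKA summary lines above.
correct = 116
incorrect = 60

total = correct + incorrect              # every module WEKA evaluated
accuracy = 100.0 * correct / total       # percent correctly classified
error_rate = 100.0 * incorrect / total   # percent incorrectly classified

print("Total instances: %d" % total)
print("Correctly classified: %.4f %%" % accuracy)
print("Incorrectly classified: %.4f %%" % error_rate)
```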


 


WEKA even built a graphical decision tree for me:



 


How’d WEKA do? It correctly classified 66% of the cases (116 of 176). Can it get better? Sure: as I collect more classification data, WEKA will learn to produce better rules.


The best part is that we can build WEKA into VI Explorer. More to come on this exciting technology…


 



Comments
by datakid1(anon) on 07-31-2010 05:38 PM

 

You might be interested in taking a look at this collection of tutorials and videos on WEKA.

Tutorials:

http://www.dataminingtools.net/browsetutorials.php?tag=weka

Videos:

http://www.dataminingtools.net/videos.php?id=6

 

 

by woodsps56 on 10-11-2010 03:53 PM

Thanks, great tutorials.
