Saturday, April 28, 2007

Java Data Mining

Java Data Mining is a book that introduces basic data mining concepts, along with a Java package built for this purpose. You can find more details on JavaWorld.

Friday, April 27, 2007


PMML stands for Predictive Model Markup Language. Its authoritative site is . PMML is the industry standard for storing and representing predictive models in the data mining community. It is an XML-based format, and it can now represent most commonly used models, such as linear regression, support vector machines, tree models, neural networks, and naive Bayes.
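
To give a feel for what "XML-based" means here, below is a rough sketch of a tiny linear regression model written in PMML style and scored with Python's standard XML parser. The element and attribute names follow the PMML regression layout as far as I remember it, but the document is simplified (no XML namespace, not validated against the schema), and the model y = 1.0 + 2.0 * x, the version number, and the `predict` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, PMML-style document for the toy model y = 1.0 + 2.0 * x.
# Assumption: namespace omitted and schema validation skipped for brevity.
PMML_DOC = """<PMML version="3.1">
  <Header description="toy linear regression"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.0">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def predict(pmml_text, inputs):
    """Score one record with the first regression table in the document.

    Hypothetical helper: it handles only plain numeric predictors and
    ignores optional features such as exponents or categorical terms.
    """
    root = ET.fromstring(pmml_text)
    table = root.find("./RegressionModel/RegressionTable")
    result = float(table.get("intercept"))
    for term in table.findall("NumericPredictor"):
        result += float(term.get("coefficient")) * inputs[term.get("name")]
    return result


if __name__ == "__main__":
    print(predict(PMML_DOC, {"x": 3.0}))
```

The point is that the model travels as plain data: any tool that understands the format can read the coefficients and score new records without knowing which tool trained the model.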

Training provided to new hires in the company

Recently, several new graduates joined our component team at the SPSS R&D center. It is exciting to see new blood coming into the team. As an experienced member of the team, I am responsible for hosting a series of training courses for them.

I hope they can take on full responsibility ASAP!

Poster presentation at Canadian AI 2007

I have been invited to give a poster presentation at the Canadian AI conference, 2007. However, due to a tight project schedule, it seems I won't be able to make it.

IAMB is really really data inefficient

Recently, I implemented IAMB in C++. IAMB is an algorithm for finding the Markov blanket (MB) of a given target, assuming that faithfulness holds and that the conditional independence (CI) tests are correct. However, it is quite data inefficient. Why? Look at the test CI(T, X | MB) during the growing phase. As variables are added to the MB, the amount of data required for a reliable CI test grows exponentially with the size of the MB. So, quite quickly, I have to stop the growing step because one of the basic assumptions no longer holds.
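
A back-of-the-envelope sketch of that growth is below. It assumes all variables are discrete with the same number of values and uses a rule of thumb of about five samples per cell of the contingency table; both the rule of thumb and the function names are my own assumptions for illustration, not part of IAMB itself.

```python
def min_samples_for_ci_test(mb_size, arity=2, per_cell=5):
    """Rough sample size needed for a CI test of T and X given the MB.

    The contingency table has one cell for every joint value of T, X and
    the mb_size conditioning variables, i.e. arity ** (mb_size + 2) cells.
    The per_cell rule of thumb is an assumption, not a fixed standard.
    """
    return per_cell * arity ** (mb_size + 2)


def max_reliable_mb_size(n_samples, arity=2, per_cell=5):
    """Largest conditioning set the data can support under the rule above.

    Returns -1 when even the unconditional test lacks enough data.
    """
    size = -1
    while min_samples_for_ci_test(size + 1, arity, per_cell) <= n_samples:
        size += 1
    return size


if __name__ == "__main__":
    for k in (0, 3, 10):
        print(k, min_samples_for_ci_test(k))
    print(max_reliable_mb_size(1000))
```

With binary variables and 1000 records, this estimate already caps the conditioning set at five variables, which matches how early the growing phase has to stop in practice.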

I am still working on finding a better solution, and PCMB seems to be a reasonable one. But PCMB is very cautious in its checks, which costs a lot of computing time.

Bayes' Theorem

Bayes's Theorem is a simple mathematical formula used for calculating conditional probabilities. It figures prominently in subjectivist or Bayesian approaches to epistemology, statistics, and inductive logic. Subjectivists, who maintain that rational belief is governed by the laws of probability, lean heavily on conditional probabilities in their theories of evidence and their models of empirical learning. Bayes's Theorem is central to these enterprises both because it simplifies the calculation of conditional probabilities and because it clarifies significant features of the subjectivist position. Indeed, the Theorem's central insight, that a hypothesis is confirmed by any body of data that its truth renders probable, is the cornerstone of all subjectivist methodology.
1. Conditional Probabilities and Bayes's Theorem
2. Special Forms of Bayes's Theorem
3. The Role of Bayes's Theorem in Subjectivist Accounts of Evidence
4. The Role of Bayes's Theorem in Subjectivist Models of Learning
Other Internet Resources
Related Entries
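
For a quick feel of the formula before reading the full entry, here is a small sketch of Bayes's Theorem for a single hypothesis H and evidence E: P(H | E) = P(E | H) P(H) / P(E), with P(E) expanded over H and not-H. The function names and the numbers are made up for illustration.

```python
def evidence_probability(prior, p_e_given_h, p_e_given_not_h):
    """P(E) by the law of total probability over H and not-H."""
    return p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)


def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H | E) = P(E | H) * P(H) / P(E)."""
    p_e = evidence_probability(prior, p_e_given_h, p_e_given_not_h)
    return p_e_given_h * prior / p_e


if __name__ == "__main__":
    # Illustrative numbers: a rare hypothesis and fairly strong evidence.
    prior = 0.01
    post = posterior(prior, p_e_given_h=0.9, p_e_given_not_h=0.05)
    print(prior, post)
    # The evidence is more probable if H is true than overall,
    # so observing it raises the probability of H.
    print(post > prior)
```

This is the "central insight" in miniature: whenever P(E | H) is larger than P(E), the posterior comes out larger than the prior.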

You can read the whole text on

Enjoy it.