Web and text mining interest me, but I haven't worked with them much.
Lucene is a good starting point because 1) its basic idea is easy to understand; 2) it is open source, so I can study the code to see what happens behind the scenes; 3) search technology is popular and is a foundation for more advanced text mining.
1 comment:
http://nlp.stanford.edu/IR-book/pdf/irbookprint.pdf
A free book about information retrieval.