Sunday, December 31, 2006

Papers about Google Technology

Below is a partial list of papers written by people at Google, organized by category.

Achieving Anonymity via Clustering in a Metric Space, Gagan Aggarwal, Tomas Feder, Krishnaram Kenthapadi, Samir Khuller, Rina Panigrahy, Dilys Thomas, An Zhu, PODS, 2006
An O(log n) Approximation Ratio for the Asymmetric Traveling Salesman Path Problem, Chandra Chekuri, Martin Pál, Proceedings of APPROX 2006, 2006
Knapsack auctions, Gagan Aggarwal, Jason D. Hartline, SODA, 2006
Efficient Computation of the Relative Entropy of Probabilistic Automata, Corinna Cortes, Mehryar Mohri, Ashish Rastogi, Michael Riley, Proceedings of the 7th Latin American Symposium (LATIN 2006), 2006
On the Computation of Some Standard Distances between Probabilistic Automata, Corinna Cortes, Mehryar Mohri, Ashish Rastogi, Proceedings of the 11th International Conference on Implementation and Application of Automata (CIAA 2006), 2006
A Loopless Gray Code for Minimal Signed-Binary Representations, Gurmeet Singh Manku, Joe Sawada, European Symposium on Algorithms, 2005
Generalized Opinion Pooling, Ashutosh Garg, T. S. Jayram, Shivakumar Vaithyanathan, Huaiyu Zhu, AMAI, 2004
On the Streaming Model Augmented with a Sorting Primitive, Gagan Aggarwal, Mayur Datar, Sridhar Rajagopalan, Matthias Ruhl, FOCS, 2004

Artificial Intelligence
Reasoning about Partially Observed Actions, Megan Nance, Adam Vogel, Eyal Amir, AAAI, 2006
Special Review Issue, Donald Perlis, Peter Norvig, Artif. Intell., 2005
Artificial Intelligence: A Modern Approach, Stuart Russell, Peter Norvig, 2002

Audio Processing
Social- and Interactive-Television Applications Based on Real-Time Ambient-Audio Identification, Michael Fink, Michele Covell, Shumeet Baluja, European Interactive TV Conference (Euro-ITV), 2006
Predicting EMG Data from M1 Neurons with Variational Bayesian Least Squares, Jo-Anne Ting, Aaron D'Souza, Kenji Yamamoto, Toshinori Yoshioka, Donna Hoffman, Shinji Kakei, Lauren Sergio, John Kalaska, Mitsuo Kawato, Peter Strick, Stefan Schaal, Advances in Neural Information Processing Systems 18, 2006
Simple Reconstruction of Binary Near-Perfect Phylogenetic Trees, Srinath Sridhar, Kedar Dhamdhere, Guy E. Blelloch, Eran Halperin, R. Ravi, Russell Schwartz, International Conference on Computational Science (2), 2006
Finite-State Transducers in Computational Biology, Corinna Cortes, Mehryar Mohri, Tutorial presented at the 13th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB 2005), 2005

Computational Science and Modeling
Oral Mucosal Microvascular Network Abnormalities in De Novo Mutation Achondroplasia, C. D. Felice, S. Parrini, G. D. Maggio, R. N. Laurini, K. Shirriff, Fractals, 2005
Computer Architecture
The Price of Performance, Luiz Andre Barroso, ACM Queue, 2005
Computer Graphics
The Definitive Guide to ImageMagick, Michael Still, 2005
Computer Vision
Large Scale Image-Based Adult-Content Filtering, Henry A. Rowley, Yushi Jing, Shumeet Baluja, 1st International Conference on Computer Vision Theory, 2006
Boosting Sex Identification Performance, Shumeet Baluja, Henry A. Rowley, AAAI, 2005
Large Scale Performance Measurement of Content-Based Automated Image-Orientation Detection, Shumeet Baluja, Henry A. Rowley, International Conference on Image Processing, 2005

Data Compression
Index Coding with Side Information, Ziv Bar-Yossef, Yitzhak Birk, T. S. Jayram, Tomer Kol, FOCS, 2006
Data Mining
Dense Subgraph Extraction, David Gibson, Ravi Kumar, Kevin S. McCurley, Andrew Tomkins, in: Mining Graph Data, 2006
Mining the Web to Determine Similarity Between Words, Objects, and Communities, Mehran Sahami, Proceedings of the 19th International FLAIRS Conference (FLAIRS-2006), 2006
Adaptive Product Normalization: Using Online Learning for Record Linkage in Comparison Shopping, Mikhail Bilenko, Sugato Basu, Mehran Sahami, Proceedings of the 5th IEEE International Conference on Data Mining, 2005
Evaluating similarity measures: a large-scale study in the orkut social network, Ellen Spertus, Mehran Sahami, Orkut Buyukkokten, Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2005), 2005
A social network caught in the Web, Lada A. Adamic, Orkut Buyukkokten, Eytan Adar, First Monday, 2003
Mining Optimized Gain Rules for Numeric Attributes, Sergey Brin, Rajeev Rastogi, Kyuseok Shim, IEEE Trans. Knowl. Data Eng., 2003
Scalable Techniques for Mining Causal Structures, Craig Silverstein, Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Data Min. Knowl. Discov., 2000
Beyond Market Baskets: Generalizing Association Rules to Dependence Rules, Craig Silverstein, Sergey Brin, Rajeev Motwani, Data Min. Knowl. Discov., 1998
Extracting Patterns and Relations from the World Wide Web, Sergey Brin, WebDB, 1998
Scalable Techniques for Mining Causal Structures, Craig Silverstein, Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, VLDB, 1998

Data and System Management
Data management projects at Google, Wilson Hsieh, Jayant Madhavan, Rob Pike, SIGMOD Conference, 2006
Growth Codes: Maximizing Sensor Network Data Persistence, Abhinav Kamra, Jon Feldman, Vishal Misra, Dan Rubenstein, Proc. Special Interest Group on Data Communication (SIGCOMM), to appear, 2006
Sender Reputation in a Large Webmail Service, Bradley Taylor, Third Conference on Email and Anti-Spam (CEAS 2006), 2006
Networking proposal for TR2, Gerhard Wesp, 2005
Distributed Systems and Parallel Computing
A Tool for Prioritizing DAGMan Jobs and Its Evaluation, Grzegorz Malewicz, Ian Foster, Arnold Rosenberg, Michael Wilde, Proceedings of the IEEE International Symposium on High-Performance Distributed Computing (HPDC06), 2006
An Autonomic Routing Framework for Sensor Networks, Yu He, Cauligi S. Raghavendra, Steven Berson, Robert Braden, Cluster Computing, Special Issue on Autonomic Computing (Kluwer Academic Publishers), 2006
An Experimental Study of the Skype Peer-to-Peer VoIP System, Saikat Guha, Neil Daswani, Ravi Jain, Proceedings of The 5th International Workshop on Peer-to-Peer Systems (IPTPS '06), 2006
Bigtable: A Distributed Storage System for Structured Data, Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber, 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006
The Chubby lock service for loosely-coupled distributed systems, Mike Burrows, 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006
Tolerating Dependences Between Large Speculative Threads Via Sub-Threads, Christopher B. Colohan, Anastassia Ailamaki, J. Gregory Steffan, Todd C. Mowry, International Symposium on Computer Architecture (ISCA), 2006
Decentralized algorithms using both local and random probes for P2P load balancing, Krishnaram Kenthapadi, Gurmeet Singh Manku, SPAA, 2005
Interpreting the Data: Parallel Analysis with Sawzall, Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan, Scientific Programming Journal, 2005
Papillon: Greedy Routing in Rings, Ittai Abraham, Dahlia Malkhi, Gurmeet Singh Manku, DISC, 2005
MapReduce: Simplified Data Processing on Large Clusters, Jeffrey Dean, Sanjay Ghemawat, OSDI'04: Sixth Symposium on Operating System Design and Implementation, 2004
Web Search for a Planet: The Google Cluster Architecture, Luiz Andre Barroso, Jeffrey Dean, Urs Holzle, IEEE Micro, 2003

Electronic Commerce
Truthful auctions for pricing search keywords, Gagan Aggarwal, Ashish Goel, Rajeev Motwani, ACM Conference on Electronic Commerce, 2006
File Systems
The Google File System, Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung, Proceedings of the 19th ACM Symposium on Operating Systems Principles, 2003
Human-Computer Interaction
Scaling the card sort method to over 500 items: restructuring the Google AdWords Help Center, Yelena Nakhimovsky, Rudy Schusteritsch, Kerry Rodden, Proceedings of ACM CHI 2006, 2006
Hypertext and the Web
A Web-based Kernel Function for Measuring the Similarity of Short Text Snippets, Mehran Sahami, Tim Heilman, Proceedings of the Fifteenth International World Wide Web Conference, 2006
Browsing on Small Screens: Recasting Web-Page Segmentation into an Efficient Machine Learning Framework, Shumeet Baluja, Proceedings of the Fifteenth International World Wide Web Conference, 2006
Hyperlink analysis on the world wide web, Monika Rauch Henzinger, Hypertext, 2005
Thresher: automating the unwrapping of semantic content from the World Wide Web, Andrew Hogue, David Karger, WWW '05: Proceedings of the 14th international conference on World Wide Web, 2005
Eye-tracking analysis of user behavior in WWW search, Laura A. Granka, Thorsten Joachims, Geri Gay, SIGIR, 2004
Extracting knowledge from the World Wide Web, Monika Henzinger, Steve Lawrence, Mapping Knowledge Domains, 2003
Patterns on the Web, Krishna Bharat, SPIRE, 2003
Who Links to Whom: Mining Linkage between Web Sites, Krishna Bharat, Bay-Wei Chang, Monika Henzinger, Matthias Ruhl, IEEE International Conference on Data Mining (ICDM '01), 2001
A Comparison of Techniques to Find Mirrored Hosts on the WWW, Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, Monika Rauch Henzinger, IEEE Data Eng. Bull., 2000
The Anatomy of a Large-Scale Hypertextual Web Search Engine, Sergey Brin, Lawrence Page, Computer Networks, 1998
What can you do with a Web in your Pocket?, Sergey Brin, Rajeev Motwani, Lawrence Page, Terry Winograd, IEEE Data Eng. Bull., 1998

Image and Video Processing
Foreground object segmentation from binocular stereo video, Kevin Law, Stan Sclaroff, Intelligent Robots and Computer Vision XXIII: Algorithms, Techniques, and Active Vision. Edited by Casasent, David P.; Hall, Ernest L.; Röning, Juha. Proceedings of the SPIE, 2005
Efficient Face Orientation Discrimination, Shumeet Baluja, Mehran Sahami, Henry A. Rowley, International Conference on Image Processing (ICIP-2004), 2004
Information Retrieval
Retroactive Answering of Search Queries, B. Yang, G. Jeh, Proceedings of the Fifteenth International World Wide Web Conference (WWW-2006), 2006
Using annotations in enterprise search, Pavel A. Dmitriev, Nadav Eiron, Marcus Fontoura, Eugene Shekita, WWW, 2006
Concept-based interactive query expansion, Bruno M. Fonseca, Paulo Braz Golgher, Bruno Possas, Berthier A. Ribeiro-Neto, Nivio Ziviani, CIKM, 2005
Current trends in the integration of searching and browsing, Andrei Z. Broder, Yoelle S. Maarek, Krishna Bharat, Susan T. Dumais, Steve Papa, Jan O. Pedersen, Prabhakar Raghavan, WWW (Special interest tracks and posters), 2005
Internet Searching, Peter Norvig, Computer Science: Reflections on the Field, Reflections from the Field, 2004
Learning to find answers to questions on the Web, Eugene Agichtein, Steve Lawrence, Luis Gravano, ACM Trans. Internet Techn., 2004
The Happy Searcher: Challenges in Web Information Retrieval, Mehran Sahami, Vibhu Mittal, Shumeet Baluja, Henry Rowley, The Eighth Pacific Rim International Conference on Artificial Intelligence (PRICAI-2004), 2004
The Past, Present and Future of Web Information Retrieval, Monika Rauch Henzinger, PODS, 2004
The Past, Present, and Future of Web Search Engines, Monika Rauch Henzinger, ICALP, 2004
Query-Free News Search, Monika Henzinger, Bay-Wei Chang, Brian Milch, Sergey Brin, Proceedings of the 12th International World Wide Web Conference (WWW-2003), 2003
Knowledge Discovery
Unweaving a web of documents, R. Guha, Ravi Kumar, D. Sivakumar, Ravi Sundaram, KDD, 2005

Machine Learning
Bayesian Regression with Input Noise for High-Dimensional Data, Jo-Anne Ting, Aaron D'Souza, Stefan Schaal, In Proceedings of the 23rd International Conference on Machine Learning, 2006
Dependency trees in sub-linear time and bounded memory, Dan Pelleg, Andrew W. Moore, VLDB J., 2006
Efficient Learning of Label Ranking by Soft Projections onto Polyhedra, S. Shalev-Shwartz, Y. Singer, Journal of Machine Learning Research, 2006
Online Learning meets Optimization in the Dual, S. Shalev-Shwartz, Y. Singer, Proceedings of the Nineteenth Annual Conference on Computational Learning Theory, 2006
Online Multiclass Learning by Interclass Hypothesis Sharing, Michael Fink, Shai Shalev-Shwartz, Yoram Singer, Shimon Ullman, Proceedings of the 23rd International Conference on Machine Learning, 2006
Online Passive Aggressive Algorithms, K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, Y. Singer, Journal of Machine Learning Research, 2006
Online multitask learning, Ofer Dekel, Philip M. Long, Yoram Singer, The 19th Annual Conference on Learning Theory, 2006
PAC Learning Mixtures of Gaussians with No Separation Assumption, Jon Feldman, Ryan O'Donnell, Rocco A. Servedio, Proc. 19th Annual Conference on Learning Theory (COLT), 2006
Predicting Electricity Distribution Feeder Failures Using Machine Learning Susceptibility Analysis, Philip Gross, Albert Boulanger, Marta Arias, David L. Waltz, Philip M. Long, Charles Lawson, Roger Anderson, Matthew Koenig, Mark Mastrocinque, William Fairechio, John A. Johnson, Serena Lee, Frank Doherty, Arthur Kressner, IAAI, 2006
Learning Linearly Separable Languages, Leonid Kontorovich, Corinna Cortes, Mehryar Mohri, Proceedings of The 17th International Conference on Algorithmic Learning Theory (ALT 2006), 2006
A New Perspective on an Old Perceptron Algorithm, S. Shalev-Shwartz, Y. Singer, Proceedings of the Eighteenth Annual Conference on Computational Learning Theory, 2005
Data-Driven Online to Batch Conversions, Ofer Dekel, Yoram Singer, NIPS, 2005
Loss Bounds for Online Category Ranking, K. Crammer, Y. Singer, Proceedings of the Eighteenth Annual Conference on Computational Learning Theory, 2005
Online Multiclass Learning with k-Way Limited Feedback and an Application to Utterance Classification, Hiyan Alshawi, Machine Learning, 2005
Online Ranking by Projecting, K. Crammer, Y. Singer, Neural Computation, 2005
Phoneme Alignment Based on Discriminative Learning, J. Keshet, S. Shalev-Shwartz, Y. Singer, D. Chazan, Interspeech, 2005
The Forgetron: A Kernel-Based Perceptron on a Fixed Budget, Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer, NIPS, 2005
A General Regression Technique for Learning Transductions, Corinna Cortes, Mehryar Mohri, Jason Weston, Proceedings of the Twenty-Second International Conference on Machine Learning (ICML 2005), 2005
Confidence Intervals for the Area under the ROC Curve, Corinna Cortes, Mehryar Mohri, Advances in Neural Information Processing Systems (NIPS 2004), 2005
Margin-Based Ranking Meets Boosting in the Middle, Cynthia Rudin, Corinna Cortes, Mehryar Mohri, Robert E. Schapire, Proceedings of The 18th Annual Conference on Computational Learning Theory (COLT 2005), 2005
Moment Kernels for Regular Distributions, Corinna Cortes, Mehryar Mohri, Machine Learning, 2005
Learning Theory, 17th Annual Conference on Learning Theory, COLT 2004, Banff, Canada, July 1-4, 2004, Proceedings, John Shawe-Taylor, Yoram Singer, Editor, COLT, 2004
The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees, O. Dekel, S. Shalev-Shwartz, Y. Singer, Advances in Neural Information Processing Systems 17, 2004
AUC Optimization vs. Error Rate Minimization, Corinna Cortes, Mehryar Mohri, Advances in Neural Information Processing Systems (NIPS 2003), 2004
Distribution Kernels Based on Moments of Counts, Corinna Cortes, Mehryar Mohri, Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004
Rational Kernels: Theory and Algorithms, Corinna Cortes, Patrick Haffner, Mehryar Mohri, Journal of Machine Learning Research (JMLR), 2004
Eliminating Dependent Pattern Matching, Healfdene Goguen, Conor McBride, James McKinna, Essays Dedicated to Joseph A. Goguen, 2006

Mobile Computing
A Large Scale Study of Wireless Search Behavior: Google Mobile Search, Maryam Kamvar, Shumeet Baluja, Proceedings of the SIGCHI conference on Human Factors in computing systems (CHI), 2006
Mobile search with text messages: designing the user experience for Google SMS, Rudy Schusteritsch, Shailendra Rao, Kerry Rodden, Proceedings of ACM CHI 2005, 2005
Report on the Mobile Search Workshop at WWW 2002, Aya Soffer, Yoelle S. Maarek, Bay-Wei Chang, SIGMOD Record, 2002
Natural Language Processing
Comparative Experiments on Sentiment Classification for Online Product Reviews, Hang Cui, Vibhu Mittal, Mayur Datar, Proceedings of the 21st National Conference on Artificial Intelligence, 2006
Integrating probabilistic extraction models and data mining to discover relations and patterns in text, Aron Culotta, Andrew McCallum, Jonathan Betz, HLT-NAACL, 2006
Names and Similarities on the Web: Fact Extraction in the Fast Lane, Marius Pasca, Dekang Lin, Jeffrey Bigham, Andrei Lifchits, Alpa Jain, Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics (COLING-ACL-06), 2006
Organizing and Searching the World Wide Web of Facts - Step One: the One-Million Fact Extraction Challenge, Marius Pasca, Dekang Lin, Jeffrey Bigham, Andrei Lifchits, Alpa Jain, Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), 2006
Probabilistic Context-Free Grammar Induction Based on Structural Zeros, Mehryar Mohri, Brian Roark, Proceedings of the Seventh Meeting of the Human Language Technology conference - North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2006), 2006
Using Encyclopedic Knowledge for Named Entity Disambiguation, Razvan Bunescu, Marius Pasca, Proceedings of the 11th Conference of the European Chapter of the Association of Computational Linguistics (EACL-2006), 2006
Vertex covering by paths on trees with its applications in machine translation, Guohui Lin, Zhipeng Cai, Dekang Lin, Inf. Process. Lett., 2006
Aligning Needles in a Haystack: Paraphrase Acquisition Across the Web, Marius Pasca, Peter Dienes, Proceedings of the 2nd International Joint Conference on Natural Language Processing (IJCNLP-2005), 2005
Finding Instance Names and Alternative Glosses on the Web: WordNet Reloaded, Marius Pasca, Proceedings of the 6th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing-2005), 2005
Local Grammar Algorithms, Mehryar Mohri, Inquiries into Words, Constraints, and Contexts. Festschrift in Honour of Kimmo Koskenniemi on his 60th Birthday, 2005
Mining Paraphrases from Self-Anchored Web Sentence Fragments, Marius Pasca, Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD-2005), 2005
A Generalized Construction of Integrated Speech Recognition Transducers, Cyril Allauzen, Mehryar Mohri, Brian Roark, Michael Riley, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), 2004
A Nearest-Neighbor Method for Resolving PP-Attachment Ambiguity, Shaojun Zhao, Dekang Lin, IJCNLP, 2004
Acquisition of Categorized Named Entities for Web search, Marius Pasca, Proceedings of the 13th ACM Conference on Information and Knowledge Management (CIKM-04), 2004
Fast and optimal decoding for machine translation, Ulrich Germann, Michael Jahr, Kevin Knight, Daniel Marcu, Kenji Yamada, Artif. Intell., 2004
Statistical Modeling for Unit Selection in Speech Synthesis, Cyril Allauzen, Mehryar Mohri, Michael Riley, 42nd Meeting of the Association for Computational Linguistics (ACL 2004), Proceedings of the Conference, 2004
Searching the Web by Voice, Alexander Franz, Brian Milch, Proceedings of the 19th International Conference on Computational Linguistics (COLING), 2002
Data Reduction for the Scalable Automated Analysis of Distributed Darknet Traffic, Michael Bailey, Evan Cooke, Farnam Jahanian, Niels Provos, Karl Rosaen, David Watson, Proceedings of the 2005 Internet Measurement Conference, 2005
Trickle: A Userland Bandwidth Shaper for Unix-like Systems, Marius Eriksen, USENIX Annual Technical Conference, FREENIX Track, 2005
Topology discovery in heterogeneous IP networks: the, Yuri Breitbart, Minos N. Garofalakis, Ben Jai, Cliff Martin, Rajeev Rastogi, Avi Silberschatz, IEEE/ACM Trans. Netw., 2004

Return of Gonzo Gizmos, Simon Quellen Field, 2006
Gonzo Gizmos, Simon Quellen Field, 2002
Power Management
High-efficiency power supplies for home computers and servers, Urs Hoelzle, Bill Weihl, 2006
Programming Languages
Parallel Assignments in Software Model Checking, Murray Stokely, Sagar Chaki, Joel Ouaknine, Electr. Notes Theor. Comput. Sci., 2006
Independently Extensible Solutions to the Expression Problem, Matthias Zenger, Martin Odersky, FOOL, 2005
Java Puzzlers: Traps, Pitfalls, and Corner Cases, Joshua Bloch, Neal Gafter, 2005
Scalable Component Abstractions, Martin Odersky, Matthias Zenger, OOPSLA, 2005
Search Engine Design
Algorithmic Aspects of Web Search Engines, Monika Rauch Henzinger, ESA, 2004
Security, Cryptography, and Privacy
A Method for Making Password-Based Key Exchange Resilient to Server Compromise, Craig Gentry, Philip MacKenzie, Zulfikar Ramzan, Advances in Cryptology - CRYPTO 2006, 2006
Cookies Along Trust-Boundaries (CAT): Accurate and Deployable Flood Protection, Martin Casado, Aditya Akella, Pei Cao, Niels Provos, Scott Shenker, In Proceedings of Steps To Reduce Unwanted Traffic From The Internet, 2006
Flow-Cookies: Using Bandwidth Amplification to Defend Against DDoS Flooding Attacks, Martin Casado, Pei Cao, Aditya Akella, Niels Provos, Proceedings of the IEEE Workshop on QoS, 2006
Language Modeling and Encryption on Packet Switched Networks, Kevin S. McCurley, Advances in Cryptology: Proc. Eurocrypt 2006, 2006
Limits to Anti Phishing, Jeff Nelson, David Jeske, Proceedings of the W3c Security and Usability Workshop, 2006
Resource Fairness and Composability of Cryptographic Protocols, Juan Garay, Philip MacKenzie, Manoj Prabhakaran, Ke Yang, Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, 2006
A Virtual Honeypot Framework, Niels Provos, USENIX Security Symposium, 2004
Countering Code-Injection Attacks With Instruction-Set Randomization, Gaurav S. Kc, Angelos D. Keromytis, Vassilis Prevelakis, ACM Conference on Computer and Communications Security, 2003
Forward-Security in Private-Key Cryptography, Mihir Bellare, Bennet S. Yee, CT-RSA, 2003
Monotonicity and Partial Results Protection for Mobile Agents, Bennet S. Yee, ICDCS, 2003
Software Engineering
LEVER: A Tool for Learning Based Verification (Tool Paper), Abhay Vardhan, Mahesh Viswanathan, Proceedings of the 18th International Conference on Computer-Aided Verification (CAV'06), 2006
Modular Software Upgrades for Distributed Systems, Sameer Ajmani, Barbara Liskov, Liuba Shrira, European Conference on Object-Oriented Programming (ECOOP), 2006
Hancock: A language for analyzing transactional data streams, Corinna Cortes, Kathleen Fisher, Daryl Pregibon, Anne Rogers, Frederick Smith, ACM Trans. Program. Lang. Syst., 2004
Jscheme: A Dialect of Scheme for Scripting in Java, Ken Anderson, Tim Hickey, Peter Norvig, Proceedings of the MIT Dynamic Languages Seminar, 2001

Theory and Foundations
Approximate reasoning for real-time probabilistic processes, Vineet Gupta, Radha Jagadeesan, Prakash Panangaden, Logical Methods in Computer Science, 2006
Head Normal Form Bisimulation for Pairs and the Lambda Mu-Calculus (Extended Abstract), Soren B. Lassen, Proceedings of the 21st Annual IEEE Symposium on Logic in Computer Science (LICS' 06), 2006
Normal Form Simulation for McCarthy's Amb, Soren B. Lassen, Proceedings of the 21st Annual Conference on Mathematical Foundations of Programming Semantics (MFPS XXI), 2006
Programmable clustering, Sreenivas Gollapudi, Ravi Kumar, D. Sivakumar, PODS, 2006
Using Many Machines to Handle an Enormous Error-Correcting Code, Jon Feldman, Proc. IEEE Information Theory Workshop (ITW), 2006
Eager Normal Form Bisimulation, Soren B. Lassen, Proceedings of the 20th Annual IEEE Symposium on Logic in Computer Science (LICS' 05), 2005

Three sources of classification error

  1. Bayes or Indistinguishability Error: The error due to overlapping densities. This error is an inherent property of the problem and can never be eliminated.
  2. Model Error: The error due to having an incorrect model. This error can only be eliminated if the designer specifies a model that includes the true model which generated the data. Designers generally choose the model based on knowledge of the problem domain rather than on the subsequent estimation method, and thus the model error in maximum-likelihood and Bayes methods rarely differs.
  3. Estimation Error: The error arising from the fact that the parameters are estimated from a finite sample. This error can best be reduced by increasing the training data.

These three points are from "Pattern Classification" by Duda, Hart, and Stork. In my view, the second is the most common in practice: it comes down to choosing or learning the model structure. The third, parameter learning, normally operates on the model produced by the second.
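To make the first and third error sources concrete, here is a minimal sketch of my own (not from the book). It assumes two equally likely classes with unit-variance Gaussian densities at means 0 and 2, and a classifier that thresholds at the midpoint of the estimated means. Because the assumed model family contains the true one, the model error is zero here; what remains is the Bayes error plus whatever the finite sample adds.

```python
import math
import random

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Assumed toy problem: equal priors, unit variance, these class means.
MU0, MU1 = 0.0, 2.0

def true_error(threshold):
    """Exact error of the rule 'predict class 1 if x > threshold'."""
    miss0 = 1.0 - phi(threshold - MU0)  # class 0 sample falls above the threshold
    miss1 = phi(threshold - MU1)        # class 1 sample falls below the threshold
    return 0.5 * miss0 + 0.5 * miss1

def fitted_threshold(n, rng):
    """Estimate both class means from n samples each; return their midpoint."""
    m0 = sum(rng.gauss(MU0, 1.0) for _ in range(n)) / n
    m1 = sum(rng.gauss(MU1, 1.0) for _ in range(n)) / n
    return 0.5 * (m0 + m1)

# With equal priors and equal variances the optimal threshold is the midpoint
# of the true means, so this is the Bayes error of the problem.
bayes_error = true_error(0.5 * (MU0 + MU1))

rng = random.Random(0)
for n in (5, 50, 5000):
    excess = true_error(fitted_threshold(n, rng)) - bayes_error
    print(f"n={n:5d}  estimation error (excess over Bayes) = {excess:.6f}")
print(f"Bayes error = {bayes_error:.4f}")
```

The excess over the Bayes error is never negative and tends to shrink as n grows, while the Bayes error itself stays put no matter how much data is used.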

Monday, December 11, 2006

Greedy search-based structure learning algorithm for Bayesian Network

This is a score-based search algorithm for learning the structure of a Bayesian network (BN). The steps are as follows:

1. Select an initial DAG D0 from which to start the search;
2. Calculate Bayes factors between D0 and all possible networks that differ from it by only one arrow, that is:
(1) One arrow is added to D0;
(2) One arrow in D0 is deleted;
(3) One arrow in D0 is turned (reversed).
3. Among all these networks, select the one that increases the Bayes factor the most;
4. If the Bayes factor is not increased, stop the search. Otherwise, let the chosen network be D0 and repeat from step 2.

This algorithm is borrowed from Heckerman et al., as adapted in deal, a BN package for R.
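The search loop can be sketched in Python roughly as follows. This is my own illustration, not the deal code. The scoring function is a hypothetical stand-in: a real implementation would compute the log network score from data, and the log Bayes factor between two networks is then the difference of their log scores. The names `is_acyclic`, `neighbours`, `greedy_search`, and `toy_log_score` are all made up for this sketch.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes with no incoming arrow; a cycle leaves some behind."""
    remaining, arrows = set(nodes), set(edges)
    while remaining:
        roots = {n for n in remaining if not any(v == n for _, v in arrows)}
        if not roots:
            return False  # every remaining node has a parent, so there is a cycle
        remaining -= roots
        arrows = {(u, v) for u, v in arrows if u not in roots}
    return True

def neighbours(nodes, dag):
    """Yield every DAG that differs from `dag` by exactly one arrow."""
    for u, v in permutations(nodes, 2):
        if (u, v) in dag:
            yield dag - {(u, v)}                  # one arrow is deleted
            turned = (dag - {(u, v)}) | {(v, u)}  # one arrow is turned
            if is_acyclic(nodes, turned):
                yield turned
        elif (v, u) not in dag:
            added = dag | {(u, v)}                # one arrow is added
            if is_acyclic(nodes, added):
                yield added

def greedy_search(nodes, log_score, start=frozenset()):
    """Move to the best one-arrow neighbour until no neighbour scores higher."""
    current = frozenset(start)
    current_score = log_score(current)
    while True:
        best = max(neighbours(nodes, current), key=log_score, default=None)
        # The log Bayes factor of `best` against `current` is the score difference.
        if best is None or log_score(best) <= current_score:
            return current, current_score
        current, current_score = best, log_score(best)

# Hypothetical stand-in score: minus the number of arrows that disagree with
# a fixed target structure. A real score would be computed from data.
nodes = ["A", "B", "C"]
target = frozenset({("A", "B"), ("B", "C")})

def toy_log_score(dag):
    return -len(dag ^ target)

best_dag, best_score = greedy_search(nodes, toy_log_score)
print(sorted(best_dag), best_score)
```

Like any greedy search, this can stop in a local optimum, so in practice it is usually restarted from several initial DAGs.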

Thursday, December 07, 2006




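Suitable for: software developers only. The most comprehensive forum for technical development. You can meet many experts there, and its boards cover everything you would expect, such as J2EE and .NET. You can raise almost any question here, and it is also the most active forum. However, the answers usually argue both sides, so in the end you still have to solve the problem yourself.
我爱研发网 ("I Love R&D"), as the name suggests, is aimed at R&D staff. It currently ranks first in China for RF, communications, and mobile phone R&D. It has many experts, and questions are discussed thoroughly. The forum rules are user-friendly, and there is a resource exchange area with a very large amount of valuable material. Rating: strong; speed is acceptable.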

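Suitable for: electronics engineers. Good traffic and fairly broad coverage. It is a long-established site, but somewhat out of step with the times in both design and content.
A sub-site of Global Sources (环球资源). The page design is polished, and its technical articles are fairly timely and authoritative. A rare good site. Rating: strong; speed is acceptable.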

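Suitable for: cabling/network engineers. Very popular. Its distinguishing feature is a compact layout. It is authoritative on structured cabling, narrow and specialized, and has been around for a very long time. The color scheme is bright, but the content is too low-end.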

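Suitable for: business planners, CIOs. The well-known knowledge site "唐人社区" (Tangren Community). IT management consultants may want to take a look. It has many users, but unfortunately it is all downloads, and the substantive content needs improvement. I remember it as a very professional site. Rating: good; speed is also acceptable.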

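Suitable for: most people. Mostly commentary. Basically any commentary you see in other media will also appear here. If you want to follow how IT is developing, take a look here.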

Tuesday, December 05, 2006


ICANN 2007 welcomes contributions on the theory, algorithms and applications in the following broad areas:
• Computational neuroscience
• Connectionist cognitive science
• Data analysis and pattern recognition
• Graphical network models, Bayesian networks
• Hardware implementations and embedded systems
• Intelligent Multimedia and the Semantic Web
• Neural and hybrid architectures and learning algorithms
• Neural control, planning and robotics applications
• Neural dynamics and complex systems
• Neuroinformatics
• Real world applications
• Self-organization
• Sequential and structured information processing
• Signal and time series processing, blind source separation
• Vision and image processing
Ideas and nominations for interesting tutorials, special sessions, workshops and experts willing to organize various session tracks are called for. Most active experts will be included in the scientific committee of the conference.

Proceedings of ICANN will be published in Springer's Lecture Notes in Computer Science series. Paper length is restricted to a maximum of 10 pages, including figures. Papers will be blind reviewed. Instructions for authors are given here.

An extended version of selected ICANN 2007 papers will be published in a Special Issue of Elsevier's journal Neural Networks.

Thursday, November 30, 2006


May 28 - 30, 2007, Montreal, Québec, Canada
Full paper submission due: January 15th, 2007

AI'07, the twentieth Canadian Conference on Artificial Intelligence, invites papers that present original work in all areas of Artificial Intelligence, either theoretical or applied such as

• Natural Language
• Agent and Multi-Agent System
• Machine Learning
• User Modeling
• Search
• AI applications
• Constraint Satisfaction
• Smart Graphics
• Knowledge Representation
• E-Commerce
• Planning
• Information Processing
• Automated Reasoning
• Bioinformatics
• Neural Nets
• Web Applications
• Reasoning under Uncertainty
• Education
• Data Mining
• Games
• Robotics
• Case-based reasoning

Papers will be reviewed by the Program Committee members and judged according to their originality, technical merit and clarity of presentation. Each accepted paper will be allocated a maximum of 12 pages in the proceedings. All accepted papers for which one of the authors will have registered and present at the conference will be published in the conference proceedings as Lecture Notes in Artificial Intelligence - Springer. An award will be given for the best paper of the conference. Another prize will be given for the best paper for which the main author is a student. Papers submitted to AI'2007 must not have been accepted for publication elsewhere or be under review for another conference.

Authors are invited to submit electronically, by January 15th 2007, full papers in PDF, Postscript or MS-Word RTF. All papers must be written in English (only). Papers of up to 12 pages in length must be formatted according to Springer LNCS style. The use of the LaTeX2e style file available from Springer is strongly encouraged.

Important dates:
Full paper submission due: January 15th, 2007
Notification of acceptance: February 26th, 2007
Final paper due: March 15th, 2007

Monday, November 27, 2006

The XML Decade

I found this article celebrating the XML decade on the IBM site, and I am sharing its first paragraph here:
"XML is approaching 10 years old. How closely depends on how you're counting. The W3C Recommendation Extensible Markup Language (XML) 1.0 was published on 10 February 1998. Work on XML started around 1996, however, rooted in almost thirty years of SGML. The design principles for XML, which guided its development were published on 25 August 1996. The first working draft, published on 14 November 1996 defined documents very similar to the majority of XML you might see today. Many of the changes between that first draft and the final recommendation were in more obscure areas of the standard. The basic idea of labeled, balanced, hierarchical tags and clearly defined text encoding were well in place in 1996, and so IBM Systems Journal accounts 2006 the year of XML's decade. Regardless of whether you agree with their counting, it is a volume well worth a thorough read by all XML professionals as it combines an interesting retrospective of XML with some useful articles discussing specific techniques and development, providing a glimpse into the future of the technology, and thus our profession. In this article I offer some comment and expansion on the treatment in IBM Systems Journal, focusing on the keynote article "Technical context and cultural consequences of XML" and one of the other contained papers, "Emerging patterns in the use of XML for information modeling in vertical industries". The latter paper is concerned with a common theme of Thinking XML--the development and adoption of industry-specific XML vocabularies. "

For the whole article, please refer to

Borrow one logo for my usage

Sunday, November 26, 2006

About Weka

Weka is a famous open-source collection of machine learning algorithms for data mining tasks. It originated at the University of Waikato in New Zealand.

Data mining related job positions is a good resource to find exciting data mining related jobs provided by known or emerging companies.

The KDNUGGETS.COM collection of data mining software contains almost all existing software packages related to data mining, text mining, web mining, and predictive analysis.

Good links for agile modeling - collects many articles specifically related to iterative and agile methods, plus links; - articles on agile modeling; - has specialized for years in object technology; - Brad Appleton maintains a large collection of links on software engineering, including iterative methods; - The Chinese front page links to an English version, with a search engine referencing iterative and agile articles.

Thursday, November 23, 2006

COLT 2007

The Twentieth Annual Conference on Learning Theory (COLT 2007) will take place on 13-15 June, 2007, in San Diego, California, as part of the 2007 Federated Computing Research Conference (FCRC).

Electronic submission of papers: January 16, 2007 (5:59pm PST)
Electronic submission of two-page open problems: February 15, 2007
Notification of acceptance or rejection: March 12, 2007
Final submission of all papers: March 23, 2007
Conference dates: June 13-15, 2007

Interested in raising your ranking in the search results

Please visit for any interesting information, though it is not encouraged.

About the PMML standard

For those engaged in data mining and predictive application development, I recommend following the PMML standard from It is supported by mainstream companies in this field, such as SPSS, SAS, IBM, Microsoft, etc. The newest version is 3.1, but as far as I know, v3.2 is due soon.

Wednesday, November 22, 2006


The Industrial/Government Applications Track of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2007) will highlight challenges, lessons, concerns, and research issues arising out of deploying applications of KDD technology. The focus would be on promoting the exchange of ideas between researchers and practitioners of data mining.
The KDD-2007 Industrial/Government Applications (I/G) Track seeks to:
provide a forum for exchanging ideas between KDD practitioners, researchers, companies, and government organizations;
help commercial and government organizations highlight successful KDD applications; and
raise interesting (research) challenges and other concerns more specific to industry and government -- customer privacy issues, analysis of data not generally available in academia, issues of scale that arise more heavily in a corporate setting, etc.
The I/G Applications Track solicits papers describing attempts to deploy KDD solutions relevant to commercial or government challenges. The primary emphasis is on papers that advance our understanding of practical, applied, or pragmatic issues and perhaps highlight new research challenges in real KDD applications. Applications can be in any field including scientific, engineering, commercial, governmental, social, or political. The I/G Applications Track will consist of competitively-selected contributed papers - presented in oral and poster form - as well as invited talks. The full conference will also feature keynote presentations, workshops, tutorials, research track papers, and the KDD Cup competition. We envision submissions along four sub-areas:
Emerging applications, technology, and issues;
Deployed KDD case studies;
Product and experience descriptions; and
Pragmatic issues and research considerations in fielding real applications.
Emerging application, technology, and issue papers discuss prototype applications, tools for focused domains or tasks, useful techniques or methods, useful system architectures, scalability enablers, tool evaluations, or integration of KDD and other technologies. Case studies describe deployed projects with measurable benefits that include KDD technology. Such papers need to demonstrate the importance and impact of the work clearly. Product submissions clearly describe KDD technology embedded in commercial products (without otherwise being a product advertisement). Pragmatic issues and considerations include important practical and research considerations, approaches, and architectures that enable successful applications. Submitters are encouraged (but not required) to select one (or more) of these sub-areas for their papers. In their submission, authors are required to explain why the application is important, the specific need for KDD technology to solve the problem (including why other methods perhaps not based on data mining may fall short), and any innovations or lessons learned in the solution.
For submission details and organizers, see
Important Dates:
Abstracts due: 23 February, 2007
Paper submissions due: 28 February, 2007 (9 pages)

A good example for LaTeX explanation

Anyone who wants to publish papers frequently has to learn LaTeX. It is built on TeX, the great invention of computer scientist Donald Knuth, who is also the author of "The Art of Computer Programming".

About SPSS

SPSS has become a leader in predictive analytics technologies through a combination of commitment to innovation and dedication to customers. You will find SPSS customers in virtually every industry, including telecommunications, banking, finance, insurance, healthcare, manufacturing, retail, consumer packaged goods, higher education, government, and market research.
Customers use SPSS predictive analytics software to anticipate change, manage both daily operations and special initiatives more effectively, and realize positive, measurable benefits. By incorporating predictive analytics into their daily operations, they become Predictive Enterprises—able to direct and automate decisions to meet business goals and achieve measurable competitive advantage.
Today, SPSS is shipping great tools and solutions to customers all over the world, such as SPSS for statistical analysis, Clementine for data mining, and Dimensions for survey research.

Microsoft XBN Introduction

XBN is the XML-based Bayesian Network storage format. Like XMLBIF, it originates from its ancestor BNIF, which can be traced back to UAI'96. It will replace Microsoft's old DSC format, which is used by the MSBN tool. For more detail, please refer to

KDD 2007

SIGKDD 2007 will be held in San Jose, California, USA.

Overview of Clementine

Clementine is an integrated data mining workbench. It provides a wide array of data mining techniques, along with pre-built vertical solutions, in an integrated and comprehensive manner, with a special focus on visualisation and ease-of-use. The latest release (Clementine 11.0) incorporates many enhancements, which fall into a number of areas. Productivity is probably the most important of these. One of the key concerns within the business community about analytic activity is the time that is taken. In Clementine 11.0, there are many enhancements that are targeted specifically at enhancing the productivity of the analyst. In addition, data mining is no longer a back-room activity - its results are now widely deployed throughout the enterprise and are leveraged by many users – this new release includes a number of significant features designed to assist in this enterprise-wide deployment.


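Google's search mechanism works as follows. Several distributed crawlers (automated fetching programs) work at the same time, "crawling" the web, while a URL server supplies the crawlers with lists of URLs. The pages the crawlers find are sent to the store server, which compresses them and stores them in a repository. Every web page has an associated ID, the doc ID, which is assigned whenever a new URL is parsed out of a web page. The indexer and the sorter are responsible for building the index. The indexer reads records from the repository, uncompresses the documents, and parses them. Each document is converted into a set of word occurrences called hits. A hit records the word, its position in the document, the font size, capitalization, and so on. The indexer distributes these hits into a set of "barrels", producing a partially sorted index. The indexer also parses all the links in every web page and stores the important information about them in an anchors file, which contains enough information to determine where each link points from and to.

The URL resolver reads the anchors file, converts relative URLs into absolute URLs, and generates doc IDs. It also indexes the anchor text and associates it with the doc ID that the anchor points to. At the same time, it produces a database made up of pairs of doc IDs. This links database is used to compute the PageRank of all documents.

The sorter reads the barrels and generates the inverted index according to the list of word IDs. A program called DumpLexicon combines this list with the lexicon produced by the indexer to generate a new lexicon for the searcher. The searcher runs on a web server and uses the lexicon generated by DumpLexicon, together with the inverted index and the PageRanks, to answer user queries.

From Google's architecture and search principles, we can see that the key is to use the URL resolver to obtain the link information and to apply an algorithm that derives the page rank information. This is exactly web structure mining.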
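To make the last step more concrete, here is a minimal sketch of how page ranks could be computed from a links database of doc ID pairs. It is only an illustration under stated assumptions: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the normalized form of the update (ranks sum to 1, and rank held by pages without out-links is spread evenly) are my own choices, not a description of Google's actual implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration sketch over (source_doc_id, target_doc_id) pairs.

    All names and defaults here are illustrative assumptions.
    """
    links = list(links)
    nodes = sorted({doc_id for pair in links for doc_id in pair})
    out_links = {doc_id: [] for doc_id in nodes}
    for src, dst in links:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {doc_id: 1.0 / n for doc_id in nodes}
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread evenly over all pages.
        dangling = sum(rank[d] for d in nodes if not out_links[d])
        new_rank = {d: (1.0 - damping) / n + damping * dangling / n for d in nodes}
        for src in nodes:
            if out_links[src]:
                # Each page passes an equal share of its rank to every page it links to.
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # A toy links database: doc 4 links to doc 1, but nothing links to doc 4.
    toy_links = [(1, 2), (2, 3), (3, 1), (4, 1)]
    for doc_id, score in sorted(pagerank(toy_links).items()):
        print(doc_id, round(score, 4))
```

In this toy graph, doc 4 ends up with the lowest score because no page links to it, which is the intuition behind using link structure as a ranking signal.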

Tuesday, November 21, 2006


XMLBIF - XML-based BayesNets Interchange Format. It is an XML format for storing a Bayesian network, aiming to become a commonly agreed standard in this community. It was originally defined by Fabio Cozman and members of the GeNie project, notably Marek Druzdzel and Daniel Garcia. Its newest version is 0.3, which is implemented in the JavaBayes and GeNie systems. Possibly, Netica and Hugin will support this format too.

The goal of the current XMLBIF format is to represent directed acyclic graphs (DAGs) that can be associated with conditional probability measures for discrete variables, with the possibility that decision and utility variables are present in the graph. The Bayesian network is one of the best-known DAG-based models today.
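As a rough illustration of what such a format has to capture, the sketch below holds a small discrete network in memory and checks that it really is a DAG with well-formed conditional probability tables. It does not reproduce the XMLBIF schema. The class name `DiscreteNetwork`, its methods, and the Rain/WetGrass example are hypothetical, and decision and utility variables are left out.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+
from itertools import product


class DiscreteNetwork:
    """Hypothetical in-memory model: a DAG plus one CPT per discrete variable."""

    def __init__(self):
        self.values = {}   # variable name -> tuple of possible values
        self.parents = {}  # variable name -> tuple of parent variable names
        self.cpt = {}      # variable name -> {parent values: {value: probability}}

    def add_variable(self, name, values, parents=(), cpt=None):
        self.values[name] = tuple(values)
        self.parents[name] = tuple(parents)
        self.cpt[name] = dict(cpt or {})

    def validate(self):
        """Return a parents-first ordering, or raise if the model is malformed."""
        # static_order() raises CycleError when the graph is not acyclic.
        order = tuple(TopologicalSorter(self.parents).static_order())
        for name in order:
            parent_domains = [self.values[p] for p in self.parents[name]]
            # One CPT row is required for every combination of parent values.
            for combo in product(*parent_domains):
                row = self.cpt[name].get(combo)
                if row is None or abs(sum(row.values()) - 1.0) > 1e-9:
                    raise ValueError(f"bad CPT row for {name} given {combo}")
        return order


if __name__ == "__main__":
    net = DiscreteNetwork()
    net.add_variable("Rain", ["yes", "no"], cpt={(): {"yes": 0.2, "no": 0.8}})
    net.add_variable(
        "WetGrass", ["yes", "no"], parents=["Rain"],
        cpt={("yes",): {"yes": 0.9, "no": 0.1},
             ("no",): {"yes": 0.1, "no": 0.9}},
    )
    print(net.validate())
```

An interchange file then only needs to serialize these three pieces of information per variable: its values, its parents, and its probability table.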

I will introduce the details of XMLBIF in the coming days, and I welcome any suggestions and exchange of ideas with you, my friend.

CS Conference Ranking

AREA: Artificial Intelligence and Related Subjects
Rank 1:
AAAI: American Association for AI National Conference
CVPR: IEEE Conf on Comp Vision and Pattern Recognition
IJCAI: Intl Joint Conf on AI
ICCV: Intl Conf on Computer Vision
ICML: Intl Conf on Machine Learning
KDD: Knowledge Discovery and Data Mining
KR: Intl Conf on Principles of KR & Reasoning
NIPS: Neural Information Processing Systems
UAI: Conference on Uncertainty in AI
ICAA: International Conference on Autonomous Agents
ACL: Annual Meeting of the ACL (Association of Computational Linguistics)
Rank 2:
NAACL: North American Chapter of the ACL
AID: Intl Conf on AI in Design
AI-ED: World Conference on AI in Education
CAIP: Intl Conf on Comp. Analysis of Images and Patterns
CSSAC: Cognitive Science Society Annual Conference
ECCV: European Conference on Computer Vision
EAI: European Conf on AI
EML: European Conf on Machine Learning
GP: Genetic Programming Conference
IAAI: Innovative Applications in AI
ICIP: Intl Conf on Image Processing
ICNN/IJCNN: Intl (Joint) Conference on Neural Networks
ICPR: Intl Conf on Pattern Recognition
ICDAR: International Conference on Document Analysis and Recognition
ICTAI: IEEE conference on Tools with AI
AMAI: Artificial Intelligence and Maths
DAS: International Workshop on Document Analysis Systems
WACV: IEEE Workshop on Apps of Computer Vision
COLING: International Conference on Computational Linguistics
EMNLP: Empirical Methods in Natural Language Processing
EACL: Annual Meeting of European Association Computational Linguistics
CoNLL: Conference on Natural Language Learning
Rank 3:
PRICAI: Pacific Rim Intl Conf on AI
AAI: Australian National Conf on AI
ACCV: Asian Conference on Computer Vision
AI*IA: Congress of the Italian Assoc for AI
ANNIE: Artificial Neural Networks in Engineering
ANZIIS: Australian/NZ Conf on Intelligent Inf. Systems
CAIA: Conf on AI for Applications
CAAI: Canadian Artificial Intelligence Conference
ASADM: Chicago ASA Data Mining Conf: A Hard Look at DM
EPIA: Portuguese Conference on Artificial Intelligence
FCKAML: French Conf on Know. Acquisition & Machine Learning
ICANN: International Conf on Artificial Neural Networks
ICCB: International Conference on Case-Based Reasoning
ICGA: International Conference on Genetic Algorithms
ICONIP: Intl Conf on Neural Information Processing
IEA/AIE: Intl Conf on Ind. & Eng. Apps of AI & Expert Sys
ICMS: International Conference on Multiagent Systems
ICPS: International conference on Planning Systems
IWANN: Intl Work-Conf on Art & Natural Neural Networks
PACES: Pacific Asian Conference on Expert Systems
SCAI: Scandinavian Conference on Artificial Intelligence
SPICIS: Singapore Intl Conf on Intelligent System
PAKDD: Pacific-Asia Conf on Know. Discovery & Data Mining
SMC: IEEE Intl Conf on Systems, Man and Cybernetics
PAKDDM: Practical App of Knowledge Discovery & Data Mining
WCNN: The World Congress on Neural Networks
WCES: World Congress on Expert Systems
INBS: IEEE Intl Symp on Intell. in Neural & Bio Systems
ASC: Intl Conf on AI and Soft Computing
PACLIC: Pacific Asia Conference on Language, Information and Computation
ICCC: International Conference on Chinese Computing
ICADL: International Conference on Asian Digital Libraries
RANLP: Recent Advances in Natural Language Processing
NLPRS: Natural Language Pacific Rim Symposium
ICRA: IEEE Intl Conf on Robotics and Automation
NNSP: Neural Networks for Signal Processing
ICASSP: IEEE Intl Conf on Acoustics, Speech and SP
GCCCE: Global Chinese Conference on Computers in Education
ICAI: Intl Conf on Artificial Intelligence
AEN: IASTED Intl Conf on AI, Exp Sys & Neural Networks
WMSCI: World Multiconfs on Sys, Cybernetics & Informatics
LREC: Language Resources and Evaluation Conference

AREA: Hardware and Architecture

Rank 1:
ASPLOS: Architectural Support for Prog Lang and OS
ISCA: ACM/IEEE Symp on Computer Architecture
ICCAD: Intl Conf on Computer-Aided Design
DAC: Design Automation Conf
MICRO: Intl Symp on Microarchitecture
HPCA: IEEE Symp on High-Perf Comp Architecture
Rank 2:
FCCM: IEEE Symposium on Field Programmable Custom Computing Machines
SUPER: ACM/IEEE Supercomputing Conference
ICS: Intl Conf on Supercomputing
ISSCC: IEEE Intl Solid-State Circuits Conf
HCS: Hot Chips Symp
VLSI: IEEE Symp VLSI Circuits
ISSS: International Symposium on System Synthesis
DATE: IEEE/ACM Design, Automation & Test in Europe Conference
Rank 3:
ICA3PP: Algs and Archs for Parall Proc
EuroMICRO: New Frontiers of Information Technology
ACS: Australian Supercomputing Conf
Advanced Research in VLSI
International Symposium on System Synthesis
International Symposium on Computer Design
International Symposium on Circuits and Systems
Asia Pacific Design Automation Conference
International Symposium on Physical Design
International Conference on VLSI Design

AREA: Applications

Rank 1:
I3DG: ACM-SIGGRAPH Interactive 3D Graphics
ACM-MM: ACM Multimedia Conference
DCC: Data Compression Conf
SIGMETRICS: ACM Conf on Meas. & Modelling of Comp Sys
SIGIR: ACM SIGIR Conf on Information Retrieval
PECCS: IFIP Intl Conf on Perf Eval of Comp & Comm Sys
WWW: World-Wide Web Conference
Rank 2:
EUROGRAPH: European Graphics Conference
CGI: Computer Graphics International
CANIM: Computer Animation
PG: Pacific Graphics
NOSSDAV: Network and OS Support for Digital A/V
PADS: ACM/IEEE/SCS Workshop on Parallel & Dist Simulation
WSC: Winter Simulation Conference
ASS: IEEE Annual Simulation Symposium
MASCOTS: Symp Model Analysis & Sim of Comp & Telecom Sys
PT: Perf Tools - Intl Conf on Model Tech & Tools for CPE
NetStore: Network Storage Symposium
MMCN: ACM/SPIE Multimedia Computing and Networking
JCDL: Joint Conference on Digital Libraries
Rank 3:
ACM-HPC: ACM Hypertext Conf
MMM: Multimedia Modelling
ICME: Intl Conf on MMedia & Expo
DSS: Distributed Simulation Symposium
SCSC: Summer Computer Simulation Conference
WCSS: World Congress on Systems Simulation
ESS: European Simulation Symposium
ESM: European Simulation Multiconference
HPCN: High-Performance Computing and Networking
Geometry Modeling and Processing
DS-RT: Distributed Simulation and Real-time Applications
IEEE Intl Wshop on Dist Int Simul and Real-Time Applications
ECIR: European Colloquium on Information Retrieval
DVAT: IS&T/SPIE Conf on Dig Video Compression Alg & Tech
MME: IEEE Intl Conf. on Multimedia in Education
ICMSO: Intl Conf on Modelling, Simulation and Optimisation
ICMS: IASTED Intl Conf on Modelling and Simulation

AREA: System Technology (Including networking and security)

Rank 1:
SIGCOMM: ACM Conf on Comm Architectures, Protocols & Apps
SPAA: Symp on Parallel Algms and Architecture
PODC: ACM Symp on Principles of Distributed Computing
PPoPP: Principles and Practice of Parallel Programming
MassPar: Symp on Frontiers of Massively Parallel Proc
RTSS: Real Time Systems Symp
SOSP: ACM SIGOPS Symp on OS Principles
OSDI: Usenix Symp on OS Design and Implementation
CCS: ACM Conf on Comp and Communications Security
S&P (Oakland): IEEE Symposium on Security and Privacy
MOBICOM: ACM Intl Conf on Mobile Computing and Networking
MOBIHOC: ACM International Symposium on Mobile Ad Hoc Networking and Computing
ICNP: Intl Conf on Network Protocols
OPENARCH: IEEE Conf on Open Arch and Network Prog
PACT: Intl Conf on Parallel Arch and Compil Tech
INFOCOM: Annual Joint Conf IEEE Comp & Comm Soc
Rank 2:
USENIX Symp on Internet Tech and Sys
CC: Compiler Construction
IPDPS: Intl Parallel and Dist Processing Symp
MOBISYS: International Conference on Mobile Systems, Applications, and Services
SenSys: ACM Conference on Embedded Networked Sensor Systems
ICPP: Intl Conf on Parallel Processing
ICDCS: IEEE Intl Conf on Distributed Comp Systems
SRDS: Symp on Reliable Distributed Systems
MPPOI: Massively Par Proc Using Opt Interconns
ASAP: Intl Conf on Apps for Specific Array Processors
Euro-Par: European Conf. on Parallel Computing
Usenix Security Symposium
NDSS: ISOC Network and Distributed System Security Symposium
ESORICS: European Symposium on Research in Computer Security
RAID: International Symposium on Recent Advances in Intrusion Detection
DSN: The International Conference on Dependable Systems and Networks
ACSAC: Annual Computer Security Applications Conference
WCW: Web Caching Workshop
LCN: IEEE Annual Conference on Local Computer Networks
IPCCC: IEEE Intl Phoenix Conf on Comp & Communications
CCC: Cluster Computing Conference
ICC: Intl Conf on Comm
WCNC: IEEE Wireless Communications and Networking Conference
IPSN: International Conference on Information Processing in Sensor Networks
IPTPS: Annual International Workshop on Peer-To-Peer Systems
CSFW: IEEE Computer Security Foundations Workshop
Rank 3:
MPCS: Intl. Conf. on Massively Parallel Computing Systems
GLOBECOM: Global Comm
IMC: Internet Measurement Conference
IC3N: Intl Conf on Comp Comm and Networks
ICCC: Intl Conf on Comp Communication
NOMS: IEEE Network Operations and Management Symp
CONPAR: Intl Conf on Vector and Parallel Processing
VAPP: Vector and Parallel Processing
ICPADS: Intl Conf. on Parallel and Distributed Systems
Public Key Cryptosystems
Fast Software Encryption
SecureComm: Int. Conf on Security and Privacy for Emerging Areas in Communication Networks
AsiaCCS: ACM Symposium on Information, Computer and Communications Security
ACNS: International Conference on Applied Cryptography and Network Security
Annual Workshop on Selected Areas in Cryptography
Australasia Conference on Information Security and Privacy
Int. Conf on Inform and Comm. Security
Financial Cryptography
Workshop on Information Hiding
Smart Card Research and Advanced Application Conference
ICON: Intl Conf on Networks
IMSA: Intl Conf on Internet and MMedia Sys
NCC: Nat Conf Comm
IN: IEEE Intell Network Workshop
Softcomm: Conf on Software in Tcomms and Comp Networks
INET: Internet Society Conf
Workshop on Security and Privacy in E-commerce
EEE: IEEE Conference on e-Technology, e-Commerce and e-Service (Suggested by Roy Grønmo. Thanks)

PARCO: Parallel Computing
SE: Intl Conf on Systems Engineering

AREA: Programming Languages and Software Engineering

Rank 1:
POPL: ACM-SIGACT Symp on Principles of Prog Langs
PLDI: ACM-SIGPLAN Symp on Prog Lang Design & Impl
OOPSLA: OO Prog Systems, Langs and Applications
ICFP: Intl Conf on Functional Programming
JICSLP/ICLP/ILPS: (Joint) Intl Conf/Symp on Logic Prog
ICSE: Intl Conf on Software Engineering
FSE: ACM Conference on the Foundations of Software Engineering (inc: ESEC-FSE when held jointly)
FM/FME: Formal Methods, World Congress/Europe
CAV: Computer Aided Verification
Rank 2:
CP: Intl Conf on Principles & Practice of Constraint Prog
TACAS: Tools and Algos for the Const and An of Systems
ESOP: European Conf on Programming
ICCL: IEEE Intl Conf on Computer Languages
PEPM: Symp on Partial Evaluation and Prog Manipulation
SAS: Static Analysis Symposium
RTA: Rewriting Techniques and Applications
ESEC: European Software Engineering Conf
IWSSD: Intl Workshop on S/W Spec & Design
CAiSE: Intl Conf on Advanced Info System Engineering
ITC: IEEE Intl Test Conf
IWCASE: Intl Workshop on Computer-Aided Software Eng
SSR: ACM SIGSOFT Working Conf on Software Reusability
SEKE: Intl Conf on S/E and Knowledge Engineering
ICSR: IEEE Intl Conf on Software Reuse
ASE: Automated Software Engineering Conference
PADL: Practical Aspects of Declarative Languages
ISRE: Requirements Engineering
ICECCS: IEEE Intl Conf on Eng. of Complex Computer Systems
IEEE Intl Conf on Formal Engineering Methods
Intl Conf on Integrated Formal Methods
FOSSACS: Foundations of Software Science and Comp Struct
Rank 3:
FASE: Fund Appr to Soft Eng
APSEC: Asia-Pacific S/E Conf
PAP/PACT: Practical Aspects of PROLOG/Constraint Tech
ALP: Intl Conf on Algebraic and Logic Programming
PLILP: Prog, Lang Implementation & Logic Programming
LOPSTR: Intl Workshop on Logic Prog Synthesis & Transf
ICCC: Intl Conf on Compiler Construction
COMPSAC: Intl. Computer S/W and Applications Conf
CSM: Conf on Software Maintenance
TAPSOFT: Intl Joint Conf on Theory & Pract of S/W Dev
WCRE: SIGSOFT Working Conf on Reverse Engineering
AQSDT: Symp on Assessment of Quality S/W Dev Tools
IFIP Intl Conf on Open Distributed Processing
Intl Conf of Z Users
IFIP Joint Int'l Conference on Formal Description Techniques and Protocol Specification, Testing, And Verification
PSI (Ershov conference)
UML: International Conference on the Unified Modeling Language
EDOC: IEEE Conference on Enterprise Computing (Suggested by Roy Grønmo.)
Australian Software Engineering Conference
IEEE Int. W'shop on Object-oriented Real-time Dependable Sys. (WORDS)
IEEE International Symposium on High Assurance Systems Engineering
The Northern Formal Methods Workshops
Formal Methods Pacific
Int. Workshop on Formal Methods for Industrial Critical Systems
JFPLC - International French Speaking Conference on Logic and Constraint Programming
L&L - Workshop on Logic and Learning
SFP - Scottish Functional Programming Workshop
HASKELL - Haskell Workshop
LCCS - International Workshop on Logic and Complexity in Computer Science
VLFM - Visual Languages and Formal Methods
NASA LaRC Formal Methods Workshop
(1) FATES - A Satellite workshop on Formal Approaches to Testing of Software
(1) Workshop On Java For High-Performance Computing
(1) DSLSE - Domain-Specific Languages for Software Engineering
(1) FTJP - Workshop on Formal Techniques for Java Programs
(*) WFLP - International Workshop on Functional and (Constraint) Logic Programming
(*) FOOL - International Workshop on Foundations of Object-Oriented Languages
(*) SREIS - Symposium on Requirements Engineering for Information Security
(*) HLPP - International workshop on High-level parallel programming and applications
(*) INAP - International Conference on Applications of Prolog
(*) MPOOL - Workshop on Multiparadigm Programming with OO Languages
(*) PADO - Symposium on Programs as Data Objects
(*) TOOLS: Int'l Conf Technology of Object-Oriented Languages and Systems
(*) Australasian Conference on Parallel And Real-Time Systems

AREA: Algorithms and Theory

Rank 1:
STOC: ACM Symp on Theory of Computing
FOCS: IEEE Symp on Foundations of Computer Science
COLT: Computational Learning Theory
LICS: IEEE Symp on Logic in Computer Science
SCG: ACM Symp on Computational Geometry
SODA: ACM/SIAM Symp on Discrete Algorithms
SPAA: ACM Symp on Parallel Algorithms and Architectures
PODC: ACM Symp on Principles of Distributed Computing
ISSAC: Intl. Symp on Symbolic and Algebraic Computation
CRYPTO: Advances in Cryptology
EUROCRYPT: European Conf on Cryptography
Rank 2:
CONCUR: International Conference on Concurrency Theory
ICALP: Intl Colloquium on Automata, Languages and Prog
STACS: Symp on Theoretical Aspects of Computer Science
CC: IEEE Symp on Computational Complexity
WADS: Workshop on Algorithms and Data Structures
MFCS: Mathematical Foundations of Computer Science
SWAT: Scandinavian Workshop on Algorithm Theory
ESA: European Symp on Algorithms
IPCO: MPS Conf on integer programming & comb optimization
LFCS: Logical Foundations of Computer Science
ALT: Algorithmic Learning Theory
EUROCOLT: European Conf on Learning Theory
WDAG: Workshop on Distributed Algorithms
ISTCS: Israel Symp on Theory of Computing and Systems
ISAAC: Intl Symp on Algorithms and Computation
FST&TCS: Foundations of S/W Tech & Theoretical CS
LATIN: Intl Symp on Latin American Theoretical Informatics
RECOMB: Annual Intl Conf on Comp Molecular Biology
CADE: Conf on Automated Deduction
IEEEIT: IEEE Symposium on Information Theory
Rank 3:
MEGA: Methods Effectives en Geometrie Algebrique
ASIAN: Asian Computing Science Conf
CCCG: Canadian Conf on Computational Geometry
FCT: Fundamentals of Computation Theory
WG: Workshop on Graph Theory
CIAC: Italian Conf on Algorithms and Complexity
ICCI: Advances in Computing and Information
AWTI: Argentine Workshop on Theoretical Informatics
CATS: The Australian Theory Symp
COCOON: Annual Intl Computing and Combinatorics Conf
UMC: Unconventional Models of Computation
MCU: Universal Machines and Computations
GD: Graph Drawing
SIROCCO: Structural Info & Communication Complexity
ALEX: Algorithms and Experiments
ALG: ENGG Workshop on Algorithm Engineering
LPMA: Intl Workshop on Logic Programming and Multi-Agents
EWLR: European Workshop on Learning Robots
CITB: Complexity & info-theoretic approaches to biology
FTP: Intl Workshop on First-Order Theorem Proving (FTP)
CSL: Annual Conf on Computer Science Logic (CSL)
AAAAECC: Conf On Applied Algebra, Algebraic Algms & ECC
DMTCS: Intl Conf on Disc Math and TCS
Information Theory Workshop

AREA: Data Bases

Rank 1:
SIGMOD: ACM SIGMOD Conf on Management of Data
PODS: ACM SIGMOD Conf on Principles of DB Systems
VLDB: Very Large Data Bases
ICDE: Intl Conf on Data Engineering
ICDT: Intl Conf on Database Theory
Rank 2:
SSD: Intl Symp on Large Spatial Databases
DEXA: Database and Expert System Applications
FODO: Intl Conf on Foundation on Data Organization
EDBT: Extending DB Technology
DOOD: Deductive and Object-Oriented Databases
DASFAA: Database Systems for Advanced Applications
CIKM: Intl. Conf on Information and Knowledge Management
SSDBM: Intl Conf on Scientific and Statistical DB Mgmt
CoopIS - Conference on Cooperative Information Systems
ER - Intl Conf on Conceptual Modeling (ER)
Rank 3:
COMAD: Intl Conf on Management of Data
BNCOD: British National Conference on Databases
ADC: Australasian Database Conference
ADBIS: Symposium on Advances in DB and Information Systems
DaWaK - Data Warehousing and Knowledge Discovery
RIDE Workshop
IFIP-DS: IFIP-DS Conference
IFIP-DBSEC - IFIP Workshop on Database Security
NGDB: Intl Symp on Next Generation DB Systems and Apps
ADTI: Intl Symp on Advanced DB Technologies and Integration
FEWFDB: Far East Workshop on Future DB Systems
MDM - Int. Conf. on Mobile Data Access/Management (MDA/MDM)
ICDM - IEEE International Conference on Data Mining
VDB - Visual Database Systems
IDEAS - International Database Engineering and Application Symposium
ARTDB - Active and Real-Time Database Systems
CODAS: Intl Symp on Cooperative DB Systems for Adv Apps
DBPL - Workshop on Database Programming Languages
EFIS/EFDBS - Engineering Federated Information (Database) Systems
KRDB - Knowledge Representation Meets Databases
NDB - National Database Conference (China)
NLDB - Applications of Natural Language to Data Bases
KDDMBD - Knowledge Discovery and Data Mining in Biological Databases Meeting
FQAS - Flexible Query-Answering Systems
IDC(W) - International Database Conference (HK CS)
RTDB - Workshop on Real-Time Databases
SBBD: Brazilian Symposium on Databases
WebDB - International Workshop on the Web and Databases
WAIM: International Conference on Web Age Information Management
(1) DASWIS - Data Semantics in Web Information Systems
(1) DMDW - Design and Management of Data Warehouses
(1) DOLAP - International Workshop on Data Warehousing and OLAP
(1) DMKD - Workshop on Research Issues in Data Mining and Knowledge Discovery
(1) KDEX - Knowledge and Data Engineering Exchange Workshop
(1) NRDM - Workshop on Network-Related Data Management
(1) MobiDE - Workshop on Data Engineering for Wireless and Mobile Access
(1) MDDS - Mobility in Databases and Distributed Systems
(1) MEWS - Mining for Enhanced Web Search
(1) TAKMA - Theory and Applications of Knowledge Management
(1) WIDM: International Workshop on Web Information and Data Management
(1) W2GIS - International Workshop on Web and Wireless Geographical Information Systems
* CDB - Constraint Databases and Applications
* DTVE - Workshop on Database Technology for Virtual Enterprises
* IWDOM - International Workshop on Distributed Object Management
* IW-MMDBMS - Int. Workshop on Multi-Media Data Base Management Systems
* OODBS - Workshop on Object-Oriented Database Systems
* PDIS: Parallel and Distributed Information Systems

--- The original source is :

Sunday, November 05, 2006

Useful pattern for data mining software development (I)

STRATEGY is a good candidate since it encourages HAS-A relationships. A data mining algorithm HAS various parts, including settings, input, output, variable list ... Besides, it enables us to switch dynamically among specific implementation choices. For example, we can choose equal-interval binning, equal-frequency binning, or a dynamic binning method. I found the pros and cons of applying STRATEGY online and share them with you here:

Benefits in using Strategy Pattern
- A family of algorithms can be defined as a class hierarchy and can be used interchangeably to alter application behavior without changing its architecture.
- By encapsulating the algorithm separately, new algorithms complying with the same interface can be easily introduced.
- The application can switch strategies at run-time.
- Strategy enables the clients to choose the required algorithm, without using a "switch" statement or a series of "if-else" statements.
- The data structures used for implementing the algorithm are completely encapsulated in Strategy classes. Therefore, the implementation of an algorithm can be changed without affecting the Context class.
- Strategy Pattern can be used instead of sub-classing the Context class. Inheritance hardwires the behavior with the Context and the behavior cannot be changed dynamically.
- The same Strategy object can be strategically shared between different Context objects. However, the shared Strategy object should not maintain states across invocations.
Drawbacks in using Strategy Pattern
- The application must be aware of all the strategies to select the right one for the right situation.
- Strategy and Context classes may be tightly coupled. The Context must supply the relevant data to the Strategy for implementing the algorithm and sometimes, all the data passed by the Context may not be relevant to all the Concrete Strategies.
- Context and the Strategy classes normally communicate through the interface specified by the abstract Strategy base class. Strategy base class must expose interface for all the required behaviors, which some concrete Strategy classes might not implement.
- In most cases, the application configures the Context with the required Strategy object. Therefore, the application needs to create and maintain two objects in place of one.
Since the Strategy object is created by the application in most cases, the Context has no control over the lifetime of the Strategy object. The Context can make a local copy of the Strategy object, but this increases the memory requirement and is sure to affect performance.
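To tie the pattern back to the binning example, here is a minimal sketch of STRATEGY applied to discretization. The names `BinningStrategy`, `EqualWidthBinning`, `EqualFrequencyBinning`, and `Discretizer` are hypothetical and chosen only for illustration. They are not taken from Clementine, Weka, or any other product, and the cut-point rules are deliberately simple.

```python
from abc import ABC, abstractmethod
from bisect import bisect_right


class BinningStrategy(ABC):
    """Strategy interface: decide where the cut points between bins go."""

    @abstractmethod
    def cut_points(self, values, bins):
        ...


class EqualWidthBinning(BinningStrategy):
    """Equal-interval binning: every bin covers the same value range."""

    def cut_points(self, values, bins):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins
        return [lo + i * width for i in range(1, bins)]


class EqualFrequencyBinning(BinningStrategy):
    """Equal-frequency binning: every bin holds roughly the same number of values."""

    def cut_points(self, values, bins):
        ordered = sorted(values)
        return [ordered[len(ordered) * i // bins] for i in range(1, bins)]


class Discretizer:
    """Context: HAS-A strategy, which can be swapped at run time."""

    def __init__(self, strategy):
        self.strategy = strategy

    def transform(self, values, bins):
        cuts = self.strategy.cut_points(values, bins)
        # A value's bin index is the number of cut points at or below it.
        return [bisect_right(cuts, v) for v in values]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]
    discretizer = Discretizer(EqualWidthBinning())
    print(discretizer.transform(data, 2))   # [0, 0, 0, 0, 1]
    discretizer.strategy = EqualFrequencyBinning()  # switch strategy at run time
    print(discretizer.transform(data, 2))   # [0, 0, 1, 1, 1]
```

The Context never needs an if-else chain over binning methods, and a new method only has to implement `cut_points`. The sketch also shows one of the drawbacks listed above: the caller still has to know which strategies exist in order to pick one.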


It is to be held in Beijing next year. Anyone interested in data management and data mining related research can visit and submit something. SIGMOD 2007 welcomes both research and industry track papers and demos. Link to SIGMOD'07

Sunday, October 22, 2006

One minute to introduce myself

My name is Shunkai Fu. In May 2006, I joined SPSS as an R&D engineer, working primarily on Clementine(r), a world-class data mining platform. Meanwhile, I am a Google Chinese search engine quality rater (part-time). I would like to share my experience, ideas, and knowledge with you here! Welcome, and have fun!