Information Mining Worker (傅顺开): April 2008

Sunday, April 27, 2008

Topic Detection and Tracking

1) http://www.itl.nist.gov/iaui/894.01/tests/tdt/
2) http://projects.ldc.upenn.edu/TDT/

Monday, April 21, 2008

Content-based Image Retrieval (CBIR)

Most image retrieval today relies on metadata such as captions or keywords, so it is really text-based retrieval.

"Content-based" means that the search will analyze the actual contents of the image. The term 'content' in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself.

Potential uses for CBIR include:

Art collections

Photograph archives

Retail catalogs

Medical diagnosis

Crime prevention

The military

Intellectual property

Architectural and engineering design

Geographical information and remote sensing systems

Query techniques:

Query by example. An example image is provided to the CBIR system, and the underlying search engine returns images that share common elements with the provided example. This query technique removes the difficulties that can arise when trying to describe images with words.

Semantic retrieval. The user makes a request like "find pictures of dogs" or even "find pictures of Abraham Lincoln", which is quite difficult for a computer to perform. Current CBIR systems generally make use of lower-level features like texture, color, and shape, although some systems take advantage of very common higher-level features like faces. Not every CBIR system is generic. Some systems are designed for a specific domain.
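The query-by-example idea above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular CBIR system works: the index contents, the three-element feature vectors, and the function names `euclidean` and `query_by_example` are all assumptions made up for the example, and a real system would extract the features from the images themselves.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(example_features, index, top_k=3):
    """Return the names of the top_k indexed images closest to the example."""
    ranked = sorted(
        index.items(),
        key=lambda item: euclidean(example_features, item[1]),
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical index: image name -> precomputed low-level feature vector.
index = {
    "sunset.jpg": [0.9, 0.4, 0.1],
    "forest.jpg": [0.1, 0.8, 0.2],
    "beach.jpg": [0.8, 0.5, 0.3],
    "night.jpg": [0.05, 0.05, 0.2],
}

# Features of the example image supplied by the user (also hypothetical).
results = query_by_example([0.88, 0.42, 0.12], index, top_k=2)
print(results)  # ['sunset.jpg', 'beach.jpg']
```

The user never has to describe the image in words; the ranking depends only on how close the feature vectors are.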

Content comparison techniques:

Color. It retrieves images based on color similarity, e.g. by computing a color histogram for each image that identifies the proportion of pixels within an image holding specific values. This is one of the most widely used techniques because it does not depend on image size or orientation.

Texture. It looks for visual patterns in images and how they are spatially defined. Textures are represented by texels, which are then placed into a number of sets, depending on how many textures are detected in the image. These sets define not only the texture, but also where in the image the texture is located.

Shape. It refers to the shape of a particular region that is being sought out. Shapes are often determined by first applying segmentation or edge detection to an image.
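To make the color technique above more concrete, here is a rough Python sketch of a normalized color histogram and one common way to compare two histograms (histogram intersection). The tiny hand-made images, the bin count, and the helper names are assumptions for illustration only; the notes above do not prescribe a particular similarity measure.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized color histogram of an image.

    pixels is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    Each entry of the result is the proportion of pixels in that color bin.
    """
    counts = [0] * (bins_per_channel ** 3)
    total = 0
    for row in pixels:
        for r, g, b in row:
            rb = r * bins_per_channel // 256
            gb = g * bins_per_channel // 256
            bb = b * bins_per_channel // 256
            counts[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
            total += 1
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color proportions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rotate90(pixels):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
image = [[RED, RED, BLUE],
         [RED, BLUE, BLUE]]
rotated = rotate90(image)
# Enlarge the image by repeating every pixel twice in both directions.
scaled = [[p for p in row for _ in range(2)] for row in image for _ in range(2)]
green = [[GREEN, GREEN], [GREEN, GREEN]]

h = color_histogram(image)
print(histogram_intersection(h, color_histogram(rotated)))  # 1.0
print(histogram_intersection(h, color_histogram(scaled)))   # 1.0
print(histogram_intersection(h, color_histogram(green)))    # 0.0
```

Because the histogram stores proportions rather than pixel positions or raw counts, the rotated and enlarged copies compare as identical to the original, which is the size and orientation independence mentioned above.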

Lustre File System

What is Lustre?

Lustre is a scalable, secure, robust, highly-available cluster file system. It is designed, developed and maintained by Sun Microsystems, Inc.

The central goal is the development of a next-generation cluster file system which can serve clusters with tens of thousands of nodes, provide petabytes of storage, and move hundreds of GB/sec with state-of-the-art security and management infrastructure.

Lustre runs on many of the largest Linux clusters in the world, and is included by Sun's partners as a core component of their cluster offerings (examples include HP StorageWorks SFS, and the Cray XT3 and XD1 supercomputers). Today's users have also demonstrated that Lustre scales down as well as it scales up, and runs in production on clusters as small as 4 and as large as 25,000 nodes.

Reference Resource:

Lustre wiki on Sun
Sun's official entrance

WebKDD 2008 CFP

10th SIGKDD Workshop on Web Mining and Web Usage Analysis (WEBKDD'08)

submission deadline: May 26, 2008
conference date: August 24 - 27, 2008
conference venue: Las Vegas, NV, USA

Monday, April 14, 2008

ProActive: A powerful middleware for cluster computing

ProActive is a middleware for parallel, distributed and multi-threaded computing. It provides a comprehensive framework and programming model to simplify the programming and execution of parallel applications: within multi-core processors, distributed on LAN, on clusters and data centers, on intranet and Internet Grids.

The core part of ProActive is its Active Object Model. Programming with ProActive primarily means dealing with active objects. A distributed or concurrent application built using ProActive is composed of a number of active objects.
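The general active object pattern can be sketched roughly as follows. Note that ProActive itself is a Java library, and this Python sketch does not use or imitate its API; the `ActiveObject` wrapper, its `call` and `stop` methods, and the `Counter` class are all hypothetical names chosen for illustration. The assumption here is only the usual shape of the pattern: each active object has its own thread and request queue, and a caller gets back a future instead of blocking on the call.

```python
import queue
import threading
from concurrent.futures import Future


class ActiveObject:
    """Wraps a plain object; every call is queued and served by one thread."""

    def __init__(self, target):
        self._target = target
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        # Serve requests one at a time, in the order they were queued.
        while True:
            item = self._requests.get()
            if item is None:  # sentinel: shut down
                break
            future, name, args = item
            try:
                future.set_result(getattr(self._target, name)(*args))
            except Exception as exc:
                future.set_exception(exc)

    def call(self, name, *args):
        """Queue a method call and return immediately with a future."""
        future = Future()
        self._requests.put((future, name, args))
        return future

    def stop(self):
        """Finish the queued requests, then stop the serving thread."""
        self._requests.put(None)
        self._thread.join()


class Counter:
    """Ordinary, non-thread-safe object to be made active."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n
        return self.value


counter = ActiveObject(Counter())
futures = [counter.call("add", i) for i in range(1, 5)]  # does not block
results = [f.result() for f in futures]                  # wait for the answers
counter.stop()
print(results)  # [1, 3, 6, 10]
```

Since only the object's own thread ever touches its state, the wrapped `Counter` needs no locks, and an application can be composed of several such objects exchanging asynchronous calls.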

Saturday, April 12, 2008

WI 2008 CFP

Overview
The 2008 IEEE/WIC/ACM International Conference on Web Intelligence (WI-08) will be jointly held with the 2008 IEEE/WIC/ACM International Conference on Intelligent Agent Technology (IAT-08). The IEEE/WIC/ACM 2008 joint conferences are organized by the University of Technology Sydney, Australia, and sponsored by the IEEE Computer Society Technical Committee on Intelligent Informatics (TCII), the Web Intelligence Consortium (WIC), and ACM-SIGART.

Important Dates
Workshop proposal submission: April 10, 2008
Electronic submission of full papers: July 10, 2008
Tutorial proposal submission: July 10, 2008
Workshop paper submission: July 30, 2008
Notification of paper acceptance: September 3, 2008
Camera-ready of accepted papers: September 30, 2008
Workshops: December 9, 2008
Conference: December 9 - 12, 2008
