Qiang Zhu's Homepage::Research

Projects

Mining Historical Manuscripts with Local Color Patches (ICDM 2010)

Focused on local region-of-interest patches, other than the common global atomic images
Introduced a simple color measure which both addresses and exploits the rich color information of images in historical manuscripts
Proposed a novel and tight lower bound for color matching, to cheaply prune off unpromising candidatesto and enable the efficient mining of massive archives
Built several highler-level data mining tools including motif discovery and link analyses, beyond the fast similarity search
Demonstrated on extensive manuscripts dating back to the fifteenth century
Designed an image retrieval framework which utilizes both the existing web image search engine and our color distance measurement

Fast and Flexible Multivariate Time-Series Search (ICDM 2010, work with NASA Ames Research Center)

Conducted the first work of MTS subsequence matching supporting queries on any subset of variables with time-shifting between them
Designed a novel indexing method allowing fast search on very large datasets (the C-MAPSS and Conex datasets we indexed are much larger than any other datasets considered in the literature of time-series subsequence search)
Algorithms and codes will be used within multiple NASA projects, including the Integrated Vehicle Health Management project

Using CAPTCHAs to Index Cultural Artifacts (IDA 2010)

Proposed a novel language-independent CAPTCHA which considers inherently real-valued data (photographs of rock art) and expects real-valued responses (mouse movements)
The first real-valued-response CAPTCHA for crowdsourcing in data mining
Very easy for humans, but extremely hard or even impossible for current machines
Human efforts spent solving the CAPTCHAs (usually wasted) now can be utilized on another Human Computation project, which helps to extract useful data from incredibly heterogeneous and noisy datasets
Made indexing all the world’s rock art possible

Mining and Indexing of Petroglyphs (KDD 2009, CAA 2009, DMKD 2010)

Considered, for the first time, the problem of data mining large collections of rock art
Introduced an explicit framing of Generalized Hough Transform (GHT) as the similarity measure
Estimated a lower bound distance based on one dimensional signatures extracted from original data
Proposed algorithms which allow efficient and effective mining of rock art, e.g.: finding repeated motifs, clustering, and enabling query-by-content
Working on supporting rotation invariance and partial shape matching

Exact Discovery of Time Series Motifs (SDM 2009, DMKD 2010)

Presented a tractable exact algorithm to find time series motifs, which are most similar pairs of time series
Faster than all current approximate algorithms and up to three orders of magnitude faster than the brute-force search in large datasets
Proposed a novel use of time series motifs in anytime classification

Interactive and Intelligent Searching of Biological Images

Focused on identification of nematodes, which are particularly difficult to identify and have direct and significant effect on humans
Constructed multiple graphs for nematodes based on different features and similarity functions
Built an interactive navigator (use guest/guest to login) which makes nematode identification a simple process of point and click

Surface Depth Hallucination

Implemented a perceptually validated model for depth hallucination
Enabled acquiring surface detail for texturing by a standard digital camera
Estimated a 3D model that can be viewed from any angle under any illumination using two single-view 2D images
Source code, user manual, etc can be downloaded here

Credit Evaluation of Banking Customers

Constructed a hybrid evaluation model based on clustering (SOM and K-means) and probabilistic neural network (PNN)
Performed better than simplex PNN and other 14 classic evaluation models on two standard datasets