Dimitrios Papadopoulos
Dept. of Computer Science and Engineering, UC Riverside, Riverside, CA 92521
edu!ucr!cs!dimitris http://www.cs.ucr.edu/~dimitris/
Education
* University of California Riverside, Riverside, CA.
PhD in Computer Science (completed on Jan 3, 2005)
Thesis title: Clustering and Indexing Methods for High Dimensional Data and Moving Objects
* University of California Riverside, Riverside, CA.
M.S. in Computer Science, Aug. 2003
Project title: Clustering Gene Expression Data in SQL Using Locally Adaptive Metrics
* University of Ioannina, Greece
B.S. in Computer Science, Sep. 1998
Work Experience
* Research Assistant, Database Lab, CS Dept.
Univ. California Riverside, Spring 2001 - Winter 2005
Advisor: Prof. Dimitrios Gunopulos
* Teaching Assistant CS Dept.
Univ. California Riverside F1999, W2000, S2000, S2001, W2004
Classes: CS8 Introduction to Computing, CS10/CS12 Introduction to Computer
Science (C++ / Visual Studio .NET), CS130 Computer Graphics (OpenGL), CS141
Algorithms
* Software Engineer MEDLAB, E.U. Project TEMeTeN
Univ. of Ioannina, Greece. Sep. 1998 - Sep. 1999.
Led the group that implemented the back-end application server and configured
the database servers of a distributed computer-based patient record system.
Publications
* Carlotta Domeniconi, Dimitrios Gunopulos, Sheng Ma, Bojun Yan, Muna Al-Razgan,
Dimitris Papadopoulos: "Locally adaptive metrics for clustering high dimensional
data", Data Min. Knowl. Discov. 14(1): 63-97 (2007)
* Maria Halkidi, Vana Kalogeraki, Dimitrios Gunopulos, Dimitris Papadopoulos,
Demetris Zeinalipour-Yazti, Michalis Vlachos:
"Efficient Online State Tracking Using Sensor Networks",
7th International Conference on Mobile Data Management (MDM'06), Nara, Japan, May 2006
* Sharmila Subramaniam, Themis Palpanas, Dimitris Papadopoulos, Vana Kalogeraki,
Dimitrios Gunopulos: "Online Outlier Detection in Sensor Data Using Non-Parametric
Models", In Proc. VLDB 2006: 187-198
* Maria Halkidi, Dimitris Papadopoulos, Vana Kalogeraki, Dimitrios Gunopulos:
"Resilient and Energy Efficient Tracking in Sensor Networks", International Journal of
Wireless and Mobile Computing (accepted)
* George Kollios, Dimitris Papadopoulos, Dimitrios Gunopulos, Vassilis J. Tsotras:
"Indexing Mobile Objects Using Dual Transformations",
The VLDB Journal Vol. 14(2): 238-256 (Apr. 2005) (Online First, Sep. 2004)
* Carlotta Domeniconi, Dimitris Papadopoulos, Dimitrios Gunopulos, Sheng Ma:
"Subspace Clustering of High Dimensional Data", SIAM International Conference
on Data Mining (SDM), Apr. 2004
* Themistoklis Palpanas, Dimitris Papadopoulos, Vana Kalogeraki, Dimitrios Gunopulos:
"Distributed Deviation Detection in Sensor Networks", SIGMOD Record,
Vol. 32, No. 4, Dec. 2003
* Dimitris Papadopoulos, Carlotta Domeniconi, Dimitrios Gunopulos, Sheng Ma:
"Clustering Gene Expression Data in SQL Using Locally Adaptive Metrics", 8th
ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge
Discovery (DMKD), Jun. 2003
* Dimitris Papadopoulos, George Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras:
"Indexing Mobile Objects on the Plane", 5th DEXA International Workshop on
Mobility in Databases and Distributed Systems (MDDS), Sep. 2002
* Dimitris Papadopoulos, George Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras:
"Indexing Mobile Objects Using Duality Transforms", IEEE Data Engineering
Bulletin 25(2): 18-24 (Jun. 2002)
Research Interests
My research interests fall into the area of managing and processing multidimensional
data. In particular, I have been involved in projects related to multidimensional
indexing techniques, clustering algorithms, and data mining in static databases, as well
as in sensor networks and streaming environments. An outline of current and future
paths of interest follows:
* The task of indexing moving objects in order to answer various spatio-temporal
queries inherently involves handling and processing data with many attributes. A
query like "Give the IDs of the objects that will be in a particular area A during a
specific time interval t in the future" appears in real-life applications such as
predicting future congestion areas in a highway system or allocating more bandwidth
to areas where a high volume of mobile phone usage is imminent.
* Clustering is another data mining task that becomes hard to apply when the
data is high dimensional. One promising approach is to focus on
a different, possibly overlapping, subset of the objects' features in order to form
each cluster, i.e. perform subspace clustering. In datasets where each object is
described by hundreds or thousands of attributes (e.g. document corpora like the
REUTERS dataset), it is extremely unlikely that the objects are correlated along
every attribute. In these cases, subspace clustering algorithms try to identify
clusters that are formed on a subset of dimensions. These techniques can be
applied to a diverse set of domains. For instance, biologists seek to identify genes
that are co-expressed under certain conditions or stimuli. Subspace clustering
algorithms can tackle this problem efficiently. In the telecommunication industry,
call-detail records offer a wealth of customer behavior information. Identifying sets
of customers that share behavior patterns is another useful application of subspace
clustering algorithms.
* Data mining tasks are usually harder to carry out when the data is processed
in a streaming fashion. The unique nature of data stream processing ("You get to
see the data only once!"), along with computational constraints that are often
inherent in such settings (e.g. small memory of sensors), contributes to the
difficulty of devising such data mining frameworks. In the setting of sensornets, I
am interested in developing data mining and in-network processing techniques
that function in a distributed and online fashion and can be efficiently deployed
on memory-, CPU-, and power-constrained devices.
Projects
* Implementation of clustering algorithms in SQL [DMKD'03] and C/C++ [SDM'04, DMKD'07]
* C/C++ implementation of moving object indexing techniques in external memory
[VLDBJ'05, MDDS'02]
* Implementation of outlier detection techniques for sensor networks; evaluation and
simulation in Java [VLDB'06]
* Implementation of distributed algorithms for target tracking in sensornets;
evaluation and simulation in Java [IJWMC'06, MDM'06]
* Shared Storage API library, multi-threaded server and client application: The
server provided storage services with support for locking, caching, user
authentication, encryption, and Unicode strings.
* Implemented various compiler optimization techniques, using C++ and STL.
* Measurements of multicast tree characteristics: Focused on the characteristics of
multicast trees in IP Multicast, i.e. out-degree of nodes and distances between
receivers. Built a graph visualization tool using the GFC toolkit by IBM Alphaworks.
* TPC-H benchmark on DB2 installations under Linux and Windows NT.
* C++ implementation of the Apriori algorithm for finding frequent itemsets.
* Design and implementation of the back-end application server and configuration
of the database servers for a distributed computer-based patient record (CPR)
system, named Pandora. This project was undertaken under the auspices of the
TEMeTeN (Towards a European Medical and Teleworking Network) consortium.
TEMeTeN involved 19 partners from the 5 European regions of Crete (Greece),
Balears (Spain), Epirus (Greece), Sardegna (Italy), and Satakunta (Finland).
(see http://europa.eu.int/comm/regional_policy/innovation/innovating/risi2/055.htm)
The objective of the Pandora CPR system was to provide a unified view of
patients' medical records, across many points of access. The architecture of the
system adhered to the three-tier model and adopted the CORBA framework for
distributed service invocation. Application server instances were deployed at
multiple locations (i.e. hospitals) providing access to the databases (DB2 UDB). The
deployment involved configuring the DB2 servers to replicate the portion of the
patients' records that contained demographic data across installations, following
the Update Anywhere replication scheme. Each node (i.e. app server) of the system
was capable of managing data stored in remote locations by requesting services
of the app server instance running at that particular remote location. Thus, each
app server instance had a dual role: to serve the requests issued by client
applications directly, as well as the requests from remote app server instances. The app
server was implemented in Java and exported CORBA interfaces of all supported
functionalities. The system supported access to MEDPACS (also developed by
MEDLAB), which is an autonomous PACS system capable of managing DICOM
image sets. (see http://medlab.cs.uoi.gr/pandora.asp)
Skills
C++ / C, Java, CORBA, SQL, Linux administration, DB2 administration, JavaScript,
Matlab, Python, UML, SQL Server administration
Scholarships & Awards
* Dean's Fellowship, College of Engineering, UC Riverside
* Scholarship awards (1994, 1995) during undergraduate studies, National
Scholarships Foundation (IKY), Greece
Professional Activities
Reviewer for the ACM SIGMOD, VLDB, IEEE ICDE, ACM SIGKDD, IEEE ICDM,
ACM SAC, IEEE ICPS, MDM, PAKDD, SSTD, SSDBM, WAIM conferences & symposia, and
the VLDB Journal, IEEE TKDE, ACM TODS, GeoInformatica and IEEE TPDS journals.
Community Service
Greek Army Service: Nov. 2005 - Aug. 2006
References
Available upon request