Warren Shen
I am a Ph.D. student working with AnHai Doan.
My main research interest is in applying AI, database, and web technologies to
data management problems, especially with regard to
Community Information Management (CIM) and
data integration.
Toward these goals, my recent research has focused on inferring structure from unstructured
data, including key problems such as information extraction and entity matching (a.k.a. record linkage or entity resolution).
Additionally, I have been a key architect for
DBLife, a prototype CIM
system for the database research community.
Publications
- Web 2.0 Style Schema Matching. R. McCann, W. Shen, A. Doan. ICDE-08.
- Building Community Wikipedias: A Human-Machine Approach. P. DeRose, X. Chai, B. Gao, W. Shen, A. Doan, P. Bohannon, J. Zhu. ICDE-08.
- Declarative Information Extraction Using Datalog with Embedded Extraction Predicates. W. Shen, A. Doan, J. Naughton, R. Ramakrishnan. VLDB-07.
- Building Structured Web Community Portals: A Top-Down, Compositional, and Incremental Approach. P. DeRose, W. Shen, F. Chen, A. Doan, R. Ramakrishnan. VLDB-07.
- Source-aware Entity Matching: A Compositional Approach. W. Shen, P. DeRose, L. Vu, A. Doan, R. Ramakrishnan. ICDE-07. 122/659 = 18%.
- User-Centric Research Challenges in Community Information Management Systems. A. Doan, P. Bohannon, R. Ramakrishnan, X. Chai, P. DeRose, B. Gao, W. Shen. IEEE Data Engineering Bulletin, special issue on data management in social networks. 2007.
- Community Information Management. A. Doan, R. Ramakrishnan, F. Chen, P. DeRose, Y. Lee, R. McCann, M. Sayyadian, W. Shen. IEEE Data Engineering Bulletin, Special Issue on Probabilistic Databases, 29(1), 2006.
- Constraint-Based Entity Matching. W. Shen, X. Li, A. Doan. AAAI-05 (Nat. Conf. on AI). 148/803 = 18%. PPT slides.
- Integrating Data from Disparate Sources: A Mass Collaboration Approach. R. McCann, A. Kramnik, W. Shen, V. Varadarajan, O. Sobulo, A. Doan. ICDE-05. Poster. 100/521 = 19%.
- Collective Integration of Information for Virtual Organizations. R. McCann, W. Shen, A. Doan. SIGMOD Workshop on Databases in Virtual Organizations (DIVO), 2004.
Education
Ph.D. Computer Science, University of Illinois, Urbana-Champaign, 2008 (expected)
M.S. Computer Science, University of Illinois, Urbana-Champaign, 2005
B.S. Computer Science, Stanford University, 2002