Tao Cheng

Ph.D. expected in 2010
Department of Computer Science
Univ. of Illinois at Urbana-Champaign
201 N. Goodwin Avenue
Urbana, IL 61801, USA
E-mail: tcheng3[at]cs.uiuc.edu
Complete CV: HTML PDF

Hi there. I am a Ph.D. candidate in the Department of Computer Science, University of Illinois at Urbana-Champaign. My advisor is Prof. Kevin Chen-Chuan Chang.


Research Interests   

My research interests lie in large-scale data management, especially search and mining over the ultimate data repository, the World Wide Web. I am particularly interested in leveraging information extraction techniques to enrich the Web and thereby advance the state of the art in Web search and mining. I enjoy the process of building novel, useful, and large-scale search systems, as well as identifying and solving the real research problems that emerge in the process.


Recent Experiences  
  • Research Assistant, Database and Information Systems Lab, University of Illinois at Urbana-Champaign, IL, 2004-present
    WISDM project: http://wisdm.cs.uiuc.edu
    Aims to design and build a novel data-aware search engine that goes beyond document retrieval and searches over data entities on the Web.

  • Intern, Search Labs at Microsoft Research, 05/2008-08/2008
    Worked on entity synonym generation to support structured Web search. The work resulted in (a) the design and implementation of an entity synonym generation system (transferred as a feature into "Bing" search), and (b) a US patent and a research paper describing the invention.

  • Intern, Cazoodle Inc., 05/2007-12/2007
    Worked as the key architect of several search products towards data-aware search. Specifically: (a) co-designed and co-implemented a general distributed crawling, extraction, and indexing framework; (b) implemented a large-scale entity search engine prototype and helped build a specialized geographic search engine, GeoEngine.


Selected Publications  
  • T. Cheng, H. Lauw, S. Paparizos, "Fuzzy Matching of Web Queries to Structured Data," in the Proceedings of the 26th International Conference on Data Engineering (ICDE 2010), Short Paper, Long Beach, USA, Mar 2010. [PDF]

  • M. Zhou, T. Cheng, K. C.-C. Chang, "Data-oriented Content Query System: Searching for Data in Text on the Web," in the Proceedings of the Third International Conference on Web Search and Data Mining (WSDM 2010), New York, USA, Feb 2010. [PDF]

  • T. Cheng, X. Yan and K. C.-C. Chang, "EntityRank: Searching Entities Directly and Holistically," in the Proceedings of the 33rd International Conference on Very Large Data Bases (VLDB 2007), Vienna, Austria, 2007. [PDF] [PPT]

  • T. Cheng and K. C.-C. Chang, "Entity Search Engine: Towards Large Scale Information Integration on the Web," in the Proceedings of the 3rd Conference on Innovative Data Systems Research (CIDR 2007), Extended Demo Paper, Asilomar, Jan 7-10, 2007. [PDF] [PPT]

  • T. Dalamagas, T. Cheng, K. J. Winkel and T. Sellis, "A Methodology for Clustering XML Documents by Structure," in Information Systems, vol. 33, no. 3, pages 187-228, 2006. [PDF]

Patents  
  • K. C.-C. Chang, T. Cheng and X. Yan, "SYSTEM FOR ENTITY SEARCH AND A METHOD FOR ENTITY SCORING IN A LINKED DOCUMENT DATABASE," US Patent 20090083262 by University of Illinois at Urbana-Champaign, 2009.

  • S. Paparizos, T. Cheng and H. Lauw, "GENERATING SYNONYMS BASED ON QUERY LOG DATA," US Patent by Microsoft, 2008.

Selected Awards  
  • Yahoo Key Technical Challenge Award, 2007 (one of twelve selected nationwide).
  • Conference on Innovative Data Systems Research (CIDR) Scholarship, 2007.
  • Excellent TA Award, Department of Computer Science, UCSB, 2004.
  • Distinguished Graduate Honor Certificate, ZJU, 2003.
  • Outstanding Graduation Thesis, ZJU, 2003.

Hobbies   

In my spare time, I like to travel with friends, play basketball, watch football, and take photos.

Last Modified: Nov 2009