Victoria University

Techniques for Improving Web Search by Understanding Queries

dc.contributor.advisor Andreae, Peter
dc.contributor.advisor Gao, Xiaoying
dc.contributor.author Crabtree, Daniel Wayne
dc.date.accessioned 2011-07-28T21:43:39Z
dc.date.available 2011-07-28T21:43:39Z
dc.date.copyright 2011
dc.date.issued 2011
dc.identifier.uri http://researcharchive.vuw.ac.nz/handle/10063/1711
dc.description.abstract This thesis investigates the refinement of web search results with a special focus on the use of clustering and the role of queries. It presents a collection of new methods for evaluating clustering methods, performing clustering effectively, and for performing query refinement. The thesis identifies different types of query, the situations where refinement is necessary, and the factors affecting search difficulty. It then analyses hard searches and argues that many of them fail because users and search engines have different query models. The thesis identifies best practice for evaluating web search results and search refinement methods. It finds that none of the commonly used evaluation measures for clustering meet all of the properties of good evaluation measures. It then presents new quality and coverage measures that satisfy all the desired properties and that rank clusterings correctly in all web page clustering situations. The thesis argues that current web page clustering methods work well when different interpretations of the query have distinct vocabulary, but still have several limitations and often produce incomprehensible clusters. It then presents a new clustering method that uses the query to guide the construction of semantically meaningful clusters. The new clustering method significantly improves performance. Finally, the thesis explores how searches and queries are composed of different aspects and shows how to use aspects to reduce the distance between the query models of search engines and users. It then presents fully automatic methods that identify query aspects, identify underrepresented aspects, and predict query difficulty. Used in combination, these methods have many applications — the thesis describes methods for two of them. 
The first method improves the search results for hard queries with underrepresented aspects by automatically expanding the query using semantically orthogonal keywords related to the underrepresented aspects. The second method helps users refine hard ambiguous queries by identifying the different query interpretations using a clustering of a diverse set of refinements. Both methods significantly outperform existing methods. en_NZ
dc.language.iso en_NZ
dc.publisher Victoria University of Wellington en_NZ
dc.subject Web page clustering en_NZ
dc.subject Query refinement en_NZ
dc.subject Vocabulary model en_NZ
dc.title Techniques for Improving Web Search by Understanding Queries en_NZ
dc.type Text en_NZ
vuwschema.contributor.unit School of Engineering and Computer Science en_NZ
vuwschema.subject.marsden 280213 Other Artificial Intelligence en_NZ
vuwschema.subject.marsden 280205 Text Processing en_NZ
vuwschema.subject.marsden 280103 Information Storage, Retrieval and Management en_NZ
vuwschema.type.vuw Awarded Doctoral Thesis en_NZ
thesis.degree.discipline Computer Science en_NZ
thesis.degree.grantor Victoria University of Wellington en_NZ
thesis.degree.level Doctoral en_NZ
thesis.degree.name Doctor of Philosophy en_NZ
vuwschema.subject.anzsrcfor 089999 Information and Computing Sciences not elsewhere classified en_NZ

