School of Engineering

Upgraded SemIndex Prototype supporting Intelligent Database Keyword Queries through Disambiguation, Query As You Type, and Parallel Search Algorithms

Processing keyword-based queries is a fundamental problem in Information Retrieval (IR). A standard containment keyword-based query, which retrieves the textual identities that contain a given set of keywords, is generally supported by a full-text index. The inverted index is considered one of the most useful full-text indexing techniques for very large textual collections. It is supported by many relational DBMSs and has since been extended to semi-structured and unstructured data to support keyword-based queries.
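As a rough illustration of the standard containment query described above, the following minimal Python sketch builds a toy inverted index and intersects the posting sets of the query keywords. The function names, the whitespace tokenizer, and the sample documents are assumptions made for this example only; they are not taken from SemIndex or from any particular DBMS.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenization (an assumption; real systems
        # normally apply stemming, stop-word removal, and so on).
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def containment_query(index, keywords):
    """Return the ids of documents that contain every keyword."""
    postings = [index.get(keyword.lower(), set()) for keyword in keywords]
    if not postings:
        return set()
    return set.intersection(*postings)


# Hypothetical sample collection.
docs = {
    1: "sound of music",
    2: "light and sound waves",
    3: "music theory basics",
}
index = build_inverted_index(docs)
result = containment_query(index, ["sound", "music"])
print(result)
```

Only documents whose posting sets overlap for all keywords are returned, which is why a purely syntactic index misses documents that express the same concept with different words.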

In a previous study [1], we proposed SemIndex: a semantic-aware inverted index model designed to process semantic-aware queries. An extended query model with different levels of semantic awareness was defined, so that both semantic-aware queries and standard containment queries can be processed within the same framework. Fig. 1 illustrates the overall framework of the SemIndex approach and its main components. Briefly, the Indexer manages index generation and maintenance, while the Query Processor uses the SemIndex component to process and answer the semantic-aware (or standard) queries issued by the user.

While the study in [1] introduced the core logical design of SemIndex, the goal of the current paper is to shed light on upgrades to the SemIndex framework and its components. At the indexer level, we add: i) dedicated weight functions, associated with the different elements of the SemIndex graph, allowing more sophisticated selection and ranking of semantic query results, coupled with ii) a dedicated relevance scoring measure, required in the query evaluation process to retrieve and rank relevant query answers. At the query processing level, we develop iii) several alternative query processing algorithms (in addition to the main algorithm), and iv) a dedicated graphical user interface (GUI) allowing users to easily manipulate the prototype system.
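To make the idea of weighted, semantic-aware ranking more concrete, here is a small Python sketch under stated assumptions. The term graph, the edge weights, the hop limit standing in for a "level of semantic awareness", and the scoring rule (product of edge weights along a path, summed over query keywords) are all hypothetical choices for illustration; the actual SemIndex weight functions and relevance measure are not specified in this text and may differ.

```python
# Hypothetical weighted term graph: each edge weight lies in (0, 1] and
# stands for the strength of a semantic link between two terms.
TERM_GRAPH = {
    "car": {"automobile": 0.9, "vehicle": 0.5},
    "automobile": {"car": 0.9},
    "vehicle": {"car": 0.5, "truck": 0.5},
    "truck": {"vehicle": 0.5},
}

# Hypothetical postings: term -> ids of documents containing it.
POSTINGS = {
    "car": {1},
    "automobile": {2},
    "vehicle": {3},
    "truck": {3, 4},
}


def expand(term, max_hops):
    """Return the best path weight for every term within max_hops links.

    A path weight is the product of its edge weights (an assumption).
    """
    best = {term: 1.0}
    frontier = {term: 1.0}
    for _ in range(max_hops):
        next_frontier = {}
        for node, weight in frontier.items():
            for neighbor, edge_weight in TERM_GRAPH.get(node, {}).items():
                candidate = weight * edge_weight
                if candidate > best.get(neighbor, 0.0):
                    best[neighbor] = candidate
                    next_frontier[neighbor] = candidate
        frontier = next_frontier
    return best


def search(keywords, max_hops):
    """Rank documents by the sum, over keywords, of their best match weight."""
    scores = {}
    for keyword in keywords:
        per_doc = {}
        for term, weight in expand(keyword, max_hops).items():
            for doc_id in POSTINGS.get(term, ()):
                per_doc[doc_id] = max(per_doc.get(doc_id, 0.0), weight)
        for doc_id, weight in per_doc.items():
            scores[doc_id] = scores.get(doc_id, 0.0) + weight
    # Highest score first; ties broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


standard = search(["car"], max_hops=0)   # behaves like a plain containment lookup
semantic = search(["car"], max_hops=2)   # also reaches semantically related terms
print(standard)
print(semantic)
```

With a hop limit of zero the sketch degenerates to a standard keyword lookup, while larger limits trade precision for broader semantic coverage, which mirrors the coverage parameter varied in the experiments mentioned below.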

Preliminary experiments highlight SemIndex querying effectiveness and efficiency, considering different querying algorithms, different semantic coverages, and a varying number of query keywords.

Fig. 1. SemIndex Framework

[1] Chbeir R. et al., SemIndex: Semantic-Aware Inverted Index. East-European Conf. on Advances in Databases and Information Systems (ADBIS’14), 2014, pp. 290-307.

Copyright 1997–2021 Lebanese American University, Lebanon.