Pseudo-Relevance Feedback in the Era of Dense Retrieval

Nicola Tonellotto

Abstract: Pseudo-relevance feedback mechanisms, from Rocchio to the relevance models, have shown the usefulness of expanding and re-weighting a user's initial query using information occurring in an initial set of retrieved documents, known as the pseudo-relevant set. Recently, dense retrieval, which uses neural contextual language models such as BERT to analyse the contents of documents and queries and to compute their relevance scores, has shown promising performance on several information retrieval tasks that still rely on the traditional inverted index to identify documents relevant to a query. Two different dense retrieval families have emerged: one uses a single embedded representation for each passage and query (e.g. BERT's [CLS] token), while the other uses multiple representations (e.g. an embedding for each token of the query and document). In this talk, we will discuss the first study into the potential for multiple representation dense retrieval to be enhanced using pseudo-relevance feedback, jointly carried out by the University of Pisa and the University of Glasgow.
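
To make the distinction between the two dense retrieval families concrete, the following is a minimal sketch rather than the method presented in the talk. It assumes toy embeddings given as plain lists of floats, and it assumes that the multiple representation score sums, for each query token, its best match among the passage tokens; the abstract does not specify a scoring rule. The Rocchio-style update is the classic positive-feedback form, and the function names and the weights alpha and beta are illustrative assumptions.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def single_representation_score(query: Vector, passage: Vector) -> float:
    """Single representation: one embedding per query and per passage,
    scored with one dot product."""
    return dot(query, passage)


def multi_representation_score(
    query_tokens: List[Vector], passage_tokens: List[Vector]
) -> float:
    """Multiple representations: one embedding per token.
    Assumed scoring rule: each query token takes its best-matching
    passage token, and the per-token maxima are summed."""
    return sum(
        max(dot(q, p) for p in passage_tokens) for q in query_tokens
    )


def rocchio_update(
    query: Vector,
    pseudo_relevant: List[Vector],
    alpha: float = 1.0,
    beta: float = 0.5,
) -> List[float]:
    """Rocchio-style feedback without a negative term: move the query
    towards the centroid of the pseudo-relevant set.
    alpha and beta are illustrative weights, not tuned values."""
    n = len(pseudo_relevant)
    centroid = [sum(doc[i] for doc in pseudo_relevant) / n
                for i in range(len(query))]
    return [alpha * q + beta * c for q, c in zip(query, centroid)]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, invented for illustration only.
    print(single_representation_score([1.0, 0.0], [0.5, 0.5]))
    print(multi_representation_score(
        [[1.0, 0.0], [0.0, 1.0]],
        [[0.5, 0.5], [1.0, 0.0]],
    ))
    print(rocchio_update([1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]]))
```

In the single representation case, the feedback step can act directly on the one query vector. In the multiple representation case the query is a set of token embeddings, so it is less obvious how feedback information should be folded in, which is the question the study addresses.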

Short Bio: Dr. Nicola Tonellotto (male) has been an assistant professor in the Information Engineering Department of the University of Pisa since 2019. From 2002 to 2019 he was a researcher at the Information Science and Technologies Institute “A. Faedo” of the National Research Council of Italy. His main research interests include Cloud Computing, Web Search, Information Retrieval and Deep Learning. He has co-authored more than 80 papers on these topics in peer-reviewed international journals and conferences. He is a co-recipient of the ACM SIGIR 2015 Best Paper Award.
