The workshop on Personalization, Recommendation and Search (PRS) aims to bring together practitioners and researchers in these three domains. The goal of this workshop is to facilitate the sharing of information and practices, to find bridges between these communities, and to promote discussion. For the workshop we have set up a full-day program with 7 invited speakers who are well-known practitioners and scientists in the space of recommendations and search.
The workshop is organized by:
Roelof van Zwol - roelofvanzwol[at]netflix.com
Yves Raimond - yraimond[at]netflix.com
Tony Jebara - tjebara[at]netflix.com
Maarten de Rijke, University of Amsterdam
Semantic Entity Search (slides)
Entities, such as people, products, and organizations, are the ingredients around which most conversations are built. A very large fraction of the queries submitted to search engines revolve around entities. No wonder that the information retrieval community continues to devote a lot of attention to entity search. In the talk I will discuss recent advances in entity retrieval. Most of the talk will focus on unsupervised semantic matching methods for entities that are able to learn from raw textual evidence associated with entities alone. I will then point out challenges (and partial solutions) in learning such representations in a dynamic setting and in improving such representations using interaction data.
The talk is based on joint work with Christophe Van Gysel, Daan Odijk, Evangelos Kanoulas, and Masrour Zoghi.
Maarten de Rijke is Professor of Computer Science at the Informatics Institute of the University of Amsterdam. Together with a team of PhD students and postdocs he works on problems in semantic search and on- and offline learning to rank for information retrieval. He is (co)editor in chief of ACM Transactions on Information Systems and of Foundations and Trends in Information Retrieval. He was a co-chair for SIGIR 2013 and CIKM 2015, and is general co-chair for WSDM 2017 and ICTIR 2017.
Debora Donato, StumbleUpon
Combining matrix factorization and LDA topic modeling for rating prediction and learning user interest profiles (slides)
Matrix Factorization through Latent Dirichlet Allocation (fLDA) is a generative model for concurrent rating prediction and topic/persona extraction. It learns topic structure of URLs and topic affinity vectors for users, and predicts ratings as well. The fLDA model achieves several goals for StumbleUpon in a single framework: it allows for unsupervised inference of latent topics in the URLs served to users and for users to be represented as mixtures over the same topics learned from the URLs (in the form of affinity vectors generated by the model).
In this talk, I will present an ongoing effort, inspired by the fLDA framework, to extend the original approach to an industrial environment. The current implementation uses a (much faster) expectation maximization method for parameter estimation instead of Gibbs sampling as in the original work, and implements a modified version in which topic distributions are learned independently using LDA prior to training the main model. This is an ongoing effort, but we have very interesting results.
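To make the idea of predicting ratings from topic affinity vectors concrete, here is a minimal sketch. It assumes that a rating is scored as a user bias plus a URL bias plus the inner product of the user's affinity vector and the URL's topic mixture. The function name, the bias terms, and all numbers are illustrative assumptions, not details taken from the talk or from the StumbleUpon implementation.

```python
def predict_rating(user_affinity, url_topics, user_bias=0.0, url_bias=0.0):
    """Score a (user, URL) pair from topic-space representations.

    user_affinity: one weight per topic (hypothetical learned values).
    url_topics:    the URL's topic mixture, e.g. proportions from LDA.
    """
    if len(user_affinity) != len(url_topics):
        raise ValueError("user and URL vectors must cover the same topics")
    # Inner product of the two topic-space vectors.
    interaction = sum(a * t for a, t in zip(user_affinity, url_topics))
    return user_bias + url_bias + interaction


# Made-up example with three topics.
url_topics = [0.7, 0.2, 0.1]      # URL is mostly about topic 0
user_affinity = [1.5, -0.5, 0.0]  # user likes topic 0, dislikes topic 1
score = predict_rating(user_affinity, url_topics)
print(score)
```

Because users and URLs share the same topic dimensions, the user vector can also be read directly as an interest profile.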
Debora Donato is Sr. Director of Personalization and Principal Data Scientist at StumbleUpon. Before moving to StumbleUpon, Debora was Senior Scientist at Yahoo! Labs. Her research interests include User Behavior Analysis, Recommendation Systems, Web Information Retrieval, Link Analysis, Algorithms for the Characterization of the Web, Complex Networks and Social Networks. Debora obtained a Ph.D. in Computer Engineering in 2005 from the University of Rome "La Sapienza". She has published more than 50 scientific papers and has served on the program committees of top-tier conferences in the areas of Data Mining and Information Retrieval. She is coordinating R&D projects in the areas of User Modeling, Content Understanding, and Recommendation Algorithms. The main ongoing efforts are devoted to improving recommendation performance by leveraging implicit user feedback and co-modeling users and content in the same (tag-based) dimensional space.
Jennifer Neville, Purdue University
Exploiting User Relationships to Accurately Predict Preferences in Large Scale Networks (slides)
The popularity of social networks and social media has increased the amount of information available about users' behavior online--including current activities and interactions among friends and family. This rich relational information can be used to predict user interests and preferences even when individual data is sparse, since the characteristics of friends are often correlated. Although relational data offer several opportunities to improve predictions about users, the characteristics of online social network data also present a number of challenges to accurately incorporate the network information into machine learning systems. This talk will outline some of the algorithmic and statistical challenges that arise due to partially-observed, large-scale networks, and describe methods for semi-supervised learning and active exploration that address the challenges.
Aish Fenton, Netflix
Why would you recommend me THAT!? (slides)
With so many advances in machine learning recently, it’s not unreasonable to ask: why aren’t my recommendations perfect by now? Aish provides a walkthrough of the open problems in the area of recommender systems, especially as they apply to Netflix’s personalization and recommender algorithms. He also provides a brief overview of recommender systems, and sketches out some tentative solutions for the problems he presents.
David Ross, Google
Diversity in Radio
Many services offer streaming radio stations seeded by an artist or song, but what does that mean? To get specific, what fraction of the songs in “Taylor Swift Radio” should be by Taylor Swift? I’ll provide a short introduction to the YouTube Radio project, and dive into the diversity problem, sharing some insights we’ve learned from live experiments and human evals.
David Ross leads the Radio project at YouTube, providing sequential music recommendations for the new Music app and youtube.com. David received his Ph.D. in computer science from the University of Toronto, and has contributed to various research areas, including visual tracking, face recognition and cover-song matching.
Deborah Estrin and Andy Hsieh, Cornell Tech
Immersive Recommendation Using Personal Digital Traces (slides)
From topics referred to in Twitter or email, to web browser histories, to videos watched and products purchased online, our digital traces (small data) reflect who we are, what we do, and what we are interested in. In this talk, we present a new user-centric recommendation model, called Immersive Recommendation, that incorporates cross-platform, diverse personal digital traces into recommendations. We discuss techniques that infer users' interests from personal digital traces while suppressing context-specified noise replete in these traces, and propose a hybrid collaborative filtering algorithm to fuse the user interests with content and rating information to achieve superior recommendation performance throughout a user's lifetime, including in cold-start situations. We illustrate this idea with personalized news and local event recommendations. Finally, we discuss future research directions and applications that incorporate richer multimodal user-generated data into recommendations, and the potential benefits of turning such systems into tools for awareness and aspiration.
Olivier Chapelle, Criteo
Response prediction for display advertising (slides, paper)
Click-through and conversion rate estimation are two core prediction tasks in display advertising. I will present a machine learning framework based on logistic regression that is specifically designed to tackle the specifics of display advertising. The resulting system has the following characteristics: it is easy to implement and deploy; it is highly scalable (we have trained it on terabytes of data); and it provides models with state-of-the-art accuracy.
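As a rough illustration of click prediction with logistic regression, the sketch below trains a model on sparse categorical features using plain stochastic gradient descent. It is a toy under stated assumptions, not the system described in the talk: the feature strings, the learning rate, the number of epochs, and the four training examples are all made up.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, features):
    """Estimated click probability for a list of active binary features."""
    return sigmoid(sum(weights.get(f, 0.0) for f in features))


def train(examples, epochs=50, lr=0.1):
    """Fit weights by SGD on the log loss; examples are (features, clicked)."""
    weights = {}
    for _ in range(epochs):
        for features, clicked in examples:
            gradient = predict(weights, features) - clicked
            for f in features:
                weights[f] = weights.get(f, 0.0) - lr * gradient
    return weights


# Hypothetical impressions: (active features, 1 if clicked else 0).
examples = [
    (["site=news", "ad=shoes"], 1),
    (["site=news", "ad=cars"], 0),
    (["site=blog", "ad=shoes"], 1),
    (["site=blog", "ad=cars"], 0),
]
weights = train(examples)
p_shoes = predict(weights, ["site=news", "ad=shoes"])
p_cars = predict(weights, ["site=news", "ad=cars"])
print(p_shoes, p_cars)
```

Storing weights in a dictionary keyed by feature string keeps the model sparse, which suits data where each impression activates only a handful of categorical features.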
Olivier Chapelle is a principal research scientist at Criteo, where he works on machine learning for display advertising. Prior to that, he was part of the machine learning group of Yahoo! Research and before that worked at the Max Planck Institute in Tübingen. His main research interests include kernel machines, semi-supervised learning, ranking and large scale learning. He graduated in theoretical computer science from the Ecole Normale Supérieure de Lyon in 1999 and received his PhD from the University of Paris 6 in 2002. He has authored over 80 publications and has been granted more than 10 patents. He has served as an associate editor for the Machine Learning Journal and Transactions on Pattern Analysis and Machine Intelligence.
The workshop is held on the Netflix campus, Building D, 121 Albright Way, Los Gatos, CA. The new Netflix campus is close to the intersection of 17 and 85, 10 minutes south of San Jose, and a 45-minute drive from San Francisco. We have arranged a shuttle from San Francisco to Netflix, as several of the participants will be in the Bay Area for the WSDM conference. Please contact us by email for the pick-up and drop-off details or if you want to sign up.
Parking space is available at the back of the building, very close to the theater entrance.