Christopher Quinn, Assistant Professor in the Department of Computer Science, received a National Science Foundation (NSF) grant in the amount of $250,000 for his collaborative research project titled, "Sequential Decision Making Under Uncertainty with Submodular Rewards."
The project is a collaborative proposal with a collaborator at Purdue, Vaneet Aggarwal. The proposal also included work from Quinn's advisee, Guanyu Nie, who was supported by the department's Summer 2021 (Round 2) Seed funding.
"That seed funding support was invaluable to help get the proposal to a stage where it was accepted on the first try. We are grateful for that support," Quinn said.
Project abstract: Many companies, government agencies, and individuals make sequences of challenging decisions over time, for which they must choose from among many possible options, may have limited knowledge about the outcomes of their decisions, and will receive limited feedback. For example, search engines and content providers make decisions for what sets of websites, products, or media to recommend each time a user logs on to their system or submits a query, in some cases having limited knowledge of the users’ underlying preferences. If users' privacy is protected, then only users' past actions, such as which links or media were selected by earlier users, will be available as feedback to inform the search engine or content provider on what to recommend next. This project aims to develop provably good strategies that decision-makers can use in such settings, aiding their decision-making under uncertainty and with limited feedback. This project will also develop strategies for the more challenging setting where multiple decision-makers must coordinate with each other on such problems, but have limited communication available to do so. Furthermore, this project will support undergraduate and graduate research training, as well as graduate-level course development, in machine learning and artificial intelligence, preparing students for careers in advanced technical fields.
The goal of this project is to develop novel, provably good strategies for solving sequential decision problems (multi-armed bandit problems) when the actions available have a combinatorial structure (such as choosing subsets of products to recommend), the rewards have a diminishing returns property (submodularity), and there is no side-information available -- the only feedback comes from the reward itself. The proposed work builds on the rich literature of multi-armed bandits and of submodular optimization. The technical aims of the project are divided into two thrusts. The first thrust focuses on developing algorithms and identifying their regret bounds for combinatorial multi-armed bandit problems with submodular rewards and no additional feedback. The second thrust extends those strategies and regret analyses to a decentralized setting, where multiple agents coordinate to solve combinatorial multi-armed bandit problems, despite limited resources for communication.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Congratulations to all involved!