Sources of big data are now near-universal, e.g., across social media, financial markets, the Internet of Things, or scientific data. Thus understanding how to effectively and efficiently summarize data is a crucial step toward both understanding past events and predicting future events. At the same time, as big data algorithms have been integrated into more applications, we often require additional specific properties or functionality, such as sublinear-space or sublinear-runtime algorithms, the ability to handle time-sensitive data, robustness to either noise or adversarial input, incorporation of advice, and security/privacy. In this talk, we will discuss new theoretical tools and practical optimizations to bridge the gap between classical algorithms and evolving demands from real-world applications for big data algorithms, in particular touching on three areas: sublinear algorithms for processing big data, new tools for traditional data science, and security/privacy for data science.
Samson is a postdoctoral researcher at UC Berkeley and Rice University, hosted by Jelani Nelson and Vladimir Braverman. His current research interests are in the theoretical foundations of data science, including sublinear algorithms, machine learning, security/privacy, and numerical linear algebra. Prior to his current position, he was a postdoctoral researcher at Carnegie Mellon University, hosted by David Woodruff. He received a PhD in computer science from Purdue University and bachelor's degrees in mathematics and computer science from MIT.