Title: Dynamic In-Memory Processing and Large Scale Execution
Date/Time: November 20th, 2015 @ 12:30 PM
Place: 213 Atanasoff Hall
Faculty Advisor: Professor Shashi Gadia
MapReduce is the most popular framework for large-scale data analytics. It is scalable, fault-tolerant, easy to program, and flexible. Despite these merits, it does not deliver acceptable performance for several kinds of processing tasks. It cannot return quick approximate results, and it lacks support for interactive or real-time processing. Querying the complete dataset requires a long wait, and the results may still turn out not to be useful. The design of MapReduce itself also hinders efficient support for interactive or real-time processing, which requires fast response times.
To overcome these problems, I propose the Dynamic In-Memory Processing and Large Scale Execution system, a novel dual query execution engine for both interactive and batch-oriented data analysis that enables fast, interactive ad hoc analysis of large datasets. I designed this system taking into account the trade-off between the accuracy of the results and the latency of query execution.
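
The accuracy/latency trade-off can be illustrated with a minimal sketch. It is not the proposed system's implementation: the function names (`exact_mean`, `approximate_mean`, `run_query`), the `mode` and `sample_size` parameters, and the use of uniform random sampling with a normal-approximation 95% confidence interval are all assumptions made for illustration. The interactive path answers quickly from a sample and reports an error bound; the batch path scans everything and is exact.

```python
import math
import random
import statistics


def exact_mean(values):
    """Batch path (hypothetical): scan every value and return the exact mean."""
    return sum(values) / len(values)


def approximate_mean(values, sample_size, seed=0):
    """Interactive path (hypothetical): estimate the mean from a uniform sample.

    Returns (estimate, half_width), where half_width is the half-width of an
    approximate 95% confidence interval based on the normal approximation.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    estimate = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, 1.96 * std_error


def run_query(values, mode="interactive", sample_size=1000):
    """Dispatch to the exact or the approximate path.

    Falls back to the exact path when the sample would cover the whole dataset.
    """
    if mode == "batch" or sample_size >= len(values):
        return exact_mean(values), 0.0
    return approximate_mean(values, sample_size)


if __name__ == "__main__":
    data = list(range(1, 100001))
    print("batch:", run_query(data, mode="batch"))
    print("interactive:", run_query(data, mode="interactive", sample_size=1000))
```

Raising `sample_size` narrows the confidence interval at the cost of reading more data, which is the kind of trade-off the abstract refers to.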