Hridesh Rajan and Tien Nguyen Lead NSF-funded Project to Make Big Data-enabled Software Engineering More Accessible

June 12, 2015

The US National Science Foundation has awarded a three-year, $1,426,917 research grant to ISU Computer Science faculty members Hridesh Rajan and Tien N. Nguyen and BGSU Computer Science faculty member Robert Dyer for research on Big Data in Software Engineering. 

Big Data in Software Engineering research leverages the vast amount of open source code in large repositories such as SourceForge, GitHub, Bitbucket, and Google Code to improve our understanding of software and software development processes, with the goals of improving productivity, decreasing errors, and gathering actionable items to drive software engineering research.

This project will focus on enhancing the Boa infrastructure, which is the world's first publicly available end-to-end infrastructure for big data in software engineering research. Boa is a research infrastructure that consists of a domain-specific language, its compiler and data updating tools, terabytes (and growing) of raw data from open source repositories, a backend based on map-reduce to effectively analyze this dataset, a compute cluster, and a web-based frontend for writing analysis programs. Within Boa, research questions concerning human and technical aspects of open source software development can be answered by writing programs, often short ones, that the infrastructure automatically parallelizes to process an already curated dataset. This significantly lowers the barrier to entry for such research, improves scalability, and reduces the complexity and size of analysis programs, which allows researchers to focus on their essential tasks. Since standardized datasets are available within Boa, collaboration and comparison of research results are facilitated. Reproducing an experiment conducted using Boa is just a matter of re-running the Boa programs provided by previous researchers. In summary, Boa makes Big Data in Software Engineering research both easy and fast. More information is available at

Inquiries about this project may be directed to Hridesh Rajan.