SMR Research Projects
Current Projects

Software Integration & Merging
Multiple versions of a software system can exist for various reasons, such as developing a software product line (SPL) or simply forking or branching a repository to work on a given feature. At some point, these versions need to be integrated. Such integration is not easy, since the changes may conflict textually, syntactically, or semantically. In this work, we look at how we can facilitate such integrations and help developers merge their code more easily and with fewer conflicts.

Helping Developers Select and Use Software Libraries
This project helps developers select the best library to use based on their current task and needs. We explore how we can mine software repositories to extract information that can be used to compare libraries and their corresponding APIs.

Code Recommender Systems (Helping Developers use APIs)
Do you often spend time searching for how to use a specific library to accomplish your programming task? Do you wish there were a concise code example that you could just integrate into your project? Recommender systems save developers some of this time and pain. In this line of work, we investigate various types of recommender systems (code search, code completion, code generation, documentation navigation, etc.) to help developers write better code faster.

API Misuse Detection
When developers use Application Programming Interfaces (APIs), they often make mistakes that can lead to bugs, system crashes, or security vulnerabilities. We refer to such mistakes as misuses. One example of a misuse is forgetting to call close() after opening a FileInputStream and reading from it. In this line of research, we aim to automatically mine API usage rules that can be used to detect various types of API misuses.
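
The close() misuse mentioned above can be illustrated with a minimal sketch. The class and method names (CloseMisuseExample, firstByteLeaky, firstByteSafe) are hypothetical and chosen only for illustration; the sketch shows one misuse and one common way to avoid it, not the rules this research mines.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseMisuseExample {

    // Misuse: close() is never called, so the file handle may stay open
    // long after the method returns.
    public static int firstByteLeaky(Path file) throws IOException {
        FileInputStream in = new FileInputStream(file.toFile());
        return in.read(); // stream is never closed
    }

    // One correct usage: try-with-resources calls close() on every path,
    // including when read() throws.
    public static int firstByteSafe(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            return in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("misuse-demo", ".txt");
        Files.write(tmp, new byte[] {65});
        System.out.println(firstByteSafe(tmp)); // prints 65
        Files.delete(tmp);
    }
}
```

A usage rule such as "every opened FileInputStream is eventually closed" is the kind of rule a detector could check firstByteLeaky against.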

Managing Software Variability
Software reuse is essential to build software faster. Version-control systems and social coding platforms offer more systematic reuse mechanisms, such as pull requests and cross-project traceability. In this project, we explore how software families (i.e., a group of related software systems that vary slightly in terms of the functionality they offer) make use of these mechanisms. As a first step, we study variability management in the Android ecosystem.

Past Projects

CPP Usage In Practice
The C preprocessor (CPP) has been strongly criticized in academia for, among other things, poor separation of concerns, error-proneness, and code obfuscation, yet it is widely used in practice. Many (mostly academic) alternatives to the preprocessor exist but have not been adopted in practice. Since developers continue to use the preprocessor despite this criticism and research, we ask how practitioners perceive the C preprocessor.

Reverse-engineering Configuration Constraints
One of the challenges of developing and maintaining highly configurable software is reasoning about configuration constraints (aka feature dependencies). For example, some features do not work well together or some features require other features to be present. These constraints are essential for reasoning about valid configurations of the software, but unfortunately are not always documented. In this project, we develop a framework that analyzes the implementation of existing highly configurable software to identify configuration constraints.
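
To make the two kinds of constraints above concrete ("requires" and "does not work with"), here is a minimal sketch that checks a configuration against a hand-written constraint list. The feature names and the ConstraintCheckExample class are hypothetical. The sketch only shows what such constraints look like once they are known; it does not show how the framework extracts them from an implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ConstraintCheckExample {

    // One pairwise constraint: "a requires b" or "a excludes b".
    public static final class Constraint {
        final String a;
        final boolean requires; // true: a requires b, false: a excludes b
        final String b;

        public Constraint(String a, boolean requires, String b) {
            this.a = a;
            this.requires = requires;
            this.b = b;
        }

        boolean holds(Set<String> enabled) {
            if (requires) {
                return !enabled.contains(a) || enabled.contains(b);
            }
            return !(enabled.contains(a) && enabled.contains(b));
        }

        @Override
        public String toString() {
            return a + (requires ? " requires " : " excludes ") + b;
        }
    }

    // Returns a description of every constraint the configuration violates.
    public static List<String> violations(List<Constraint> constraints, Set<String> enabled) {
        List<String> out = new ArrayList<>();
        for (Constraint c : constraints) {
            if (!c.holds(enabled)) {
                out.add(c.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical features, for illustration only.
        List<Constraint> constraints = Arrays.asList(
                new Constraint("ENCRYPTION", true, "NETWORKING"),
                new Constraint("LOW_MEMORY", false, "CACHING"));
        Set<String> enabled = new HashSet<String>(Arrays.asList("ENCRYPTION", "CACHING"));
        System.out.println(violations(constraints, enabled));
        // prints [ENCRYPTION requires NETWORKING]
    }
}
```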

Variability Implementation Mechanisms for C++
The goal of this project is to help C++ developers with their variability implementation decisions. Specifically, in collaboration with IBM, we explore the IBM OMR project, which uses static polymorphism as its main variability implementation strategy.

Identifying Causes and Fixes of Linux Variability Anomalies
To prevent variability anomalies from occurring in the first place, we need to understand what causes them. To provide automated solutions for such anomalies, we need to understand how developers usually fix them. This project mines commit information from the Linux Git repository to identify causes and fixes of variability anomalies.

Analyzing Linux Kbuild to Detect Variability Anomalies
Although build systems control what code gets compiled into the final built product, they are often overlooked when studying software variability. The Linux kernel is one of the biggest open-source software systems supporting variability and contains over 10,000 configurable features described in its KCONFIG files. To understand the role of the build system in variability implementation, we use Linux as a case study. We study its build system, KBUILD, and extract the variability constraints in its Makefiles.
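
As a rough illustration of how a Makefile ties a file to a feature, the sketch below handles only the simplest KBUILD pattern, lines of the form `obj-$(CONFIG_X) += file.o`, and records which configuration symbol controls each object file. The class name, method name, and sample lines are hypothetical. Real KBUILD Makefiles also contain composite objects, conditionals, and subdirectory descents, none of which this sketch covers, so it should not be read as the project's actual extraction approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KbuildLineExample {

    // Matches e.g. "obj-$(CONFIG_FOO) += foo.o bar.o".
    private static final Pattern OBJ_LINE =
            Pattern.compile("^obj-\\$\\((CONFIG_[A-Za-z0-9_]+)\\)\\s*\\+=\\s*(.+)$");

    // Maps each object file to the configuration symbols that control it.
    public static Map<String, List<String>> presenceConditions(List<String> makefileLines) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String line : makefileLines) {
            Matcher m = OBJ_LINE.matcher(line.trim());
            if (!m.matches()) {
                continue; // unconditional or unsupported line: ignored in this sketch
            }
            for (String obj : m.group(2).trim().split("\\s+")) {
                result.computeIfAbsent(obj, k -> new ArrayList<>()).add(m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical Makefile lines, for illustration only.
        List<String> lines = Arrays.asList(
                "obj-y += core.o",
                "obj-$(CONFIG_FOO) += foo.o");
        System.out.println(presenceConditions(lines));
        // prints {foo.o=[CONFIG_FOO]}
    }
}
```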

Root Cause Analysis & Change Impact Analysis using CMDBs
Many IT systems use Configuration Management Databases (CMDBs) to keep track of which hardware and software are installed, as well as any problems that occur. Over time, CMDBs thus collect large amounts of valuable data that can be used for decision support. This project proposes mining historical data from a CMDB to detect common co-changes that can be used to support change impact analysis.
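
One simple way to picture co-change detection is to count how often two configuration items appear in the same change record and keep the pairs that recur. The sketch below does exactly that. The item names, the CoChangeExample class, and the plain pair-counting technique are assumptions made for illustration; the description above does not specify which mining technique the project uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class CoChangeExample {

    // Counts unordered pairs of items that changed together and keeps
    // only the pairs seen at least minCount times.
    public static Map<String, Integer> frequentPairs(List<Set<String>> changeSets, int minCount) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> changeSet : changeSets) {
            List<String> items = new ArrayList<>(changeSet);
            Collections.sort(items); // canonical order so (a, b) and (b, a) share a key
            for (int i = 0; i < items.size(); i++) {
                for (int j = i + 1; j < items.size(); j++) {
                    counts.merge(items.get(i) + " & " + items.get(j), 1, Integer::sum);
                }
            }
        }
        counts.values().removeIf(v -> v < minCount);
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical change records, for illustration only.
        List<Set<String>> history = new ArrayList<>();
        history.add(new HashSet<String>(Arrays.asList("db-server", "app-server")));
        history.add(new HashSet<String>(Arrays.asList("db-server", "app-server", "load-balancer")));
        history.add(new HashSet<String>(Arrays.asList("load-balancer")));
        System.out.println(frequentPairs(history, 2));
        // prints {app-server & db-server=2}
    }
}
```

In a change impact analysis setting, a frequent pair suggests that a planned change to one item may also require attention to the other.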