API Misuse | Sarah Nadi

When developers use Application Programming Interfaces (APIs), they often make mistakes that can lead to bugs, system crashes, or security vulnerabilities. We refer to such mistakes as misuses. One example of a misuse is forgetting to call close() after opening a FileInputStream and reading from it.
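The close() example can be made concrete with a small sketch. The class and method names below (CloseExample, firstByteLeaky, firstByteSafe, demo) are made up for illustration; only FileInputStream and the standard java.nio.file calls come from the JDK.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseExample {

    // Misuse: the stream is opened but never explicitly closed.
    static int firstByteLeaky(Path file) throws IOException {
        FileInputStream in = new FileInputStream(file.toFile());
        return in.read(); // no in.close() on any path
    }

    // Correct use: try-with-resources closes the stream even if read() throws.
    static int firstByteSafe(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            return in.read();
        }
    }

    // Writes a one-byte temp file and reads it back with the safe variant.
    static int demo() {
        try {
            Path tmp = Files.createTempFile("misuse-demo", ".txt");
            try {
                Files.write(tmp, "A".getBytes(StandardCharsets.US_ASCII));
                return firstByteSafe(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 65, the byte value of 'A'
    }
}
```

Both variants return the same value on the happy path, which is part of why this kind of misuse tends to go unnoticed in testing.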

We study various types of API misuse.

General Java API Misuse

We created MUBench, a benchmark of existing Java API misuses against which misuse detectors can be evaluated. We systematically compared existing Java API-misuse detectors and identified their weaknesses. This allowed us to design a new API-misuse detector, MuDetect, that can achieve higher recall and precision. MuDetect mines API usage rules that involve method calls and preconditions, and these rules are then used to find misuses in target projects. MuDetect uses a graph representation called an API Usage Graph (AUG) to capture different aspects of a method call, such as the parameters a method requires, the types of those parameters, the order in which method calls are invoked, the exceptions that method calls throw, and the objects that method calls return.
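As a rough illustration of a usage rule that combines a method call with a precondition, consider the familiar convention that Iterator.next() should be guarded by hasNext(). This example is our own and is not taken from MUBench or from rules mined by MuDetect; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorPrecondition {

    // Misuse: calls next() without checking hasNext() first.
    static String firstUnchecked(List<String> items) {
        Iterator<String> it = items.iterator();
        return it.next(); // throws NoSuchElementException on an empty list
    }

    // Correct use: the call to next() is guarded by the hasNext() precondition.
    static String firstChecked(List<String> items, String fallback) {
        Iterator<String> it = items.iterator();
        return it.hasNext() ? it.next() : fallback;
    }

    public static void main(String[] args) {
        List<String> empty = Collections.emptyList();
        System.out.println(firstChecked(empty, "none")); // prints "none"
        try {
            firstUnchecked(empty);
        } catch (NoSuchElementException e) {
            System.out.println("misuse triggered NoSuchElementException");
        }
    }
}
```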

Annotation Misuse in Java

While MuDetect focuses on method calls, there are other categories of API misuse as well, such as misuses that involve annotations. We built a human-in-the-loop approach that focuses on producing accurate Java annotation usage rules. For ease of use, these usage rules are packaged into a Maven plugin that can be used to catch bugs (similar to SpotBugs). Our tool is a complete pipeline that provides an easy way to mine and validate usage rules and to generate a misuse detector from the confirmed rules.
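To give a feel for what checking an annotation usage rule can look like, here is a minimal reflection-based sketch. It is not how our Maven plugin works internally. The annotations @Audited and @Owner and the rule "every @Audited method must also carry @Owner" are hypothetical and defined locally for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class AnnotationRuleCheck {

    // Hypothetical annotations, defined here only for illustration.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface Audited {}

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface Owner {}

    static class GoodClient {
        @Audited @Owner void fetch() {}
    }

    static class BadClient {
        @Audited void fetch() {} // violates the rule: @Owner is missing
    }

    // Hypothetical rule: every method annotated with @Audited must also carry @Owner.
    static List<String> findViolations(Class<?> type) {
        List<String> violations = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Audited.class) && !m.isAnnotationPresent(Owner.class)) {
                violations.add(type.getSimpleName() + "." + m.getName());
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println(findViolations(GoodClient.class)); // prints []
        System.out.println(findViolations(BadClient.class));  // prints [BadClient.fetch]
    }
}
```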

Java Cryptography Misuse

By analyzing StackOverflow posts and GitHub repositories and conducting two surveys with a total of 48 application developers, we collected the problems developers face with current cryptography APIs and their suggestions for improvement. Among our findings, developers have trouble choosing the correct algorithm and want higher-level abstractions such as tasks. To address these issues, we looked more closely at the cryptography domain and realized that there is a wide variety of cryptographic components and algorithms (e.g., ciphers, digests, and signatures) and that each component comes with its own variability. For example, a cipher can be symmetric or asymmetric. If it is symmetric, it can operate on blocks or streams. Additionally, there are different modes of operation (e.g., ECB vs. CBC) as well as different padding schemes. To deal with this huge variability space, we model cryptographic components using concepts from feature modeling. However, such components have many attributes, and some cryptography solutions use multiple components at the same time. We therefore need modeling notations beyond those offered by basic feature modeling.
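The choice of mode is one place where this variability has visible consequences. The sketch below, which uses only the standard javax.crypto API, encrypts two identical plaintext blocks with AES and checks whether the first two ciphertext blocks match. The class and method names (CipherModeDemo, repeatsBlocks) are our own for this illustration, and the key handling is simplified and not meant as a recommendation.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class CipherModeDemo {

    // Encrypts two identical 16-byte plaintext blocks and reports whether
    // the first two ciphertext blocks are identical.
    static boolean repeatsBlocks(String transformation) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance("AES");
            keyGen.init(128);
            SecretKey key = keyGen.generateKey();

            Cipher cipher = Cipher.getInstance(transformation);
            if (transformation.contains("/CBC/")) {
                // CBC needs an initialization vector; use a fresh random one.
                byte[] iv = new byte[16];
                new SecureRandom().nextBytes(iv);
                cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            } else {
                cipher.init(Cipher.ENCRYPT_MODE, key);
            }

            byte[] plaintext = new byte[32]; // two all-zero AES blocks
            byte[] ct = cipher.doFinal(plaintext);
            return Arrays.equals(Arrays.copyOfRange(ct, 0, 16),
                                 Arrays.copyOfRange(ct, 16, 32));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // ECB encrypts each block independently, so equal blocks stay equal.
        System.out.println(repeatsBlocks("AES/ECB/PKCS5Padding")); // prints true
        // CBC chains blocks, so a repeat is possible only with negligible probability.
        System.out.println(repeatsBlocks("AES/CBC/PKCS5Padding"));
    }
}
```

The two transformation strings differ by three characters, yet only one of them hides repeated plaintext blocks. This is the kind of choice developers reported struggling with.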

CogniCrypt was built on the insights derived from these studies.

Related Publications

2023

SecDev

Securing Your Crypto-API Usage Through Tool Support - A Usability Study

Stefan Krüger, Michael Reif, Anna-Katharina Wickert, Sarah Nadi, Karim Ali, Eric Bodden, Yasemin Acar, Mira Mezini, and Sascha Fahl

In IEEE Secure Development Conference (SecDev), 2023

PDF

2022

CASCON

A Human-in-the-loop Approach to Generate Annotation Usage Rules: A Case Study with MicroProfile

Mansur Gulami, Ajay Kumar Jha, Sarah Nadi, Karim Ali, Emily Jiang, and Yee-Kang Chang

In Annual International Conference on Computer Science and Software Engineering (CASCON ’22), 2022

PDF

ICSME

Mining Annotation Usage Rules: A Case Study with MicroProfile

Batyr Nuryyev, Ajay Kumar Jha, Sarah Nadi, Yee-Kang Chang, Emily Jiang, and Vijay Sundaresan

In Proceedings of the 38th IEEE International Conference on Software Maintenance and Evolution – Industry Track, 2022

PDF

2019

MSR

Investigating Next-Steps in Static API-Misuse Detection

Sven Amann, Hoan A. Nguyen, Sarah Nadi, Tien N. Nguyen, and Mira Mezini

In Proceedings of the 16th International Conference on Mining Software Repositories (MSR ’19), 2019

PDF

2018

TSE

A Systematic Evaluation of Static API-Misuse Detectors

Sven Amann, Hoan A. Nguyen, Sarah Nadi, Tien N. Nguyen, and Mira Mezini

IEEE Transactions on Software Engineering, 2018

PDF

2017

ASE

CogniCrypt: Supporting Developers in using Cryptography

Stefan Krüger, Sarah Nadi, Michael Reif, Karim Ali, Mira Mezini, Eric Bodden, Florian Göpfert, Felix Günther, Christian Weinert, Daniel Demmler, and Ram Kamath

In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE ’17) – Tool Demo Track, 2017

PDF

2016

ICSE

"Jumping Through Hoops": Why do Java Developers Struggle with Cryptography APIs?

Sarah Nadi, Stefan Krüger, Mira Mezini, and Eric Bodden

In Proceedings of the 38th International Conference on Software Engineering (ICSE ’16), 2016

PDF

VaMoS

Variability Modeling of Cryptographic Components (Clafer Experience Report)

Sarah Nadi and Stefan Krüger

2016

PDF

MSR

MUBench: A Benchmark for API-Misuse Detectors

Sven Amann, Sarah Nadi, Hoan A. Nguyen, Tien N. Nguyen, and Mira Mezini

In Proceedings of the 13th International Conference on Mining Software Repositories – Data Showcase Track (MSR ’16), 2016

PDF

2015

ONWARD

Towards Secure Integration of Cryptographic Software

Steven Arzt, Sarah Nadi, Karim Ali, Eric Bodden, Sebastian Erdweg, and Mira Mezini

In Proceedings of the SIGPLAN Symposium on New Ideas in Programming and Reflections on Software at SPLASH (ONWARD ’15), 2015

(Acceptance Rate: 17/37 = 35%)

PDF