API Misuse
Ensuring that library APIs are correctly used
When developers use Application Programming Interfaces (APIs), they often make mistakes that can lead to bugs, system crashes, or security vulnerabilities. We refer to such mistakes as misuses. One example of a misuse is forgetting to call close() after opening a FileInputStream and reading from it.
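The close() misuse mentioned above can be illustrated with a minimal sketch. The class and method names (CloseExample, firstByteLeaky, firstByteSafe) are hypothetical and chosen only for this illustration; the sketch contrasts a stream that is never closed with the same read wrapped in try-with-resources.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not taken from any of the tools described on this page.
class CloseExample {

    // Misuse: the stream is opened and read, but close() is never called,
    // so the underlying file handle leaks.
    static int firstByteLeaky(Path file) throws IOException {
        FileInputStream in = new FileInputStream(file.toFile());
        return in.read();
    }

    // Correct usage: try-with-resources calls close() on every exit path,
    // including when read() throws.
    static int firstByteSafe(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            return in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("misuse-demo", ".txt");
        Files.write(tmp, "A".getBytes(StandardCharsets.US_ASCII));
        System.out.println(firstByteSafe(tmp)); // prints 65, the byte value of 'A'
        Files.delete(tmp);
    }
}
```

Both methods return the same value; the difference only shows up as leaked file handles, which is part of why such misuses are easy to overlook.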
We study various types of API misuse.
API Misuse of Data-centric Python Libraries
General Java API Misuse
We created MUBench, a benchmark of existing Java API misuses against which misuse detectors can be evaluated. Using it, we systematically compared existing Java API-misuse detectors and identified their weaknesses. This allowed us to design a new API-misuse detector, MuDetect, that can achieve higher recall and precision. MuDetect mines API usage rules that involve method calls and preconditions; these usage rules are then used to find misuses in target projects. MuDetect uses a graph representation called an API Usage Graph (AUG) to capture different aspects of a method call: the parameters that a method requires, the types of those parameters, the order in which method calls are invoked, the exceptions that method calls throw, and the objects that method calls return.
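To give a rough feel for the idea of comparing a usage against a mined rule, here is a toy sketch. It is not MuDetect's implementation: the class UsageGraphSketch, the string encoding of edges, and the edge labels "condition" and "receiver" are all assumptions made for this illustration. A real detector would extract graphs from source code and match subgraphs, whereas this sketch only compares flat sets of labelled edges.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy illustration only; names and edge labels are hypothetical.
class UsageGraphSketch {

    // Encodes a labelled edge between two nodes as a single string.
    static String edge(String from, String label, String to) {
        return from + " -[" + label + "]-> " + to;
    }

    // Returns the pattern edges that the usage lacks.
    // An empty result means the usage contains every edge of the pattern.
    static Set<String> missingEdges(Set<String> pattern, Set<String> usage) {
        Set<String> missing = new LinkedHashSet<>(pattern);
        missing.removeAll(usage);
        return missing;
    }

    public static void main(String[] args) {
        // Assumed rule: next() is called on an iterator and guarded by hasNext().
        Set<String> pattern = new LinkedHashSet<>();
        pattern.add(edge("Collection.iterator()", "receiver", "Iterator.next()"));
        pattern.add(edge("Iterator.hasNext()", "condition", "Iterator.next()"));

        // A usage that calls next() without the hasNext() guard.
        Set<String> usage = new LinkedHashSet<>();
        usage.add(edge("Collection.iterator()", "receiver", "Iterator.next()"));

        // Prints the single missing edge: the hasNext() precondition.
        System.out.println(missingEdges(pattern, usage));
    }
}
```

The missing "condition" edge corresponds to a violated precondition, which is the kind of rule (method calls plus preconditions) described above.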
Annotation Misuse in Java
While MuDetect focuses on method calls, there are other categories of API misuse as well, such as misuses that involve annotations. We built a human-in-the-loop approach that focuses on producing accurate Java annotation usage rules. For ease of use, these usage rules are packaged into a Maven plugin that can be used to catch bugs (similar to SpotBugs). Our tool is a complete pipeline that provides an easy way to mine and validate usage rules and to generate a misuse detector from the confirmed rules.
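As an illustration of what an annotation usage rule might look like, the sketch below checks one made-up rule: a method annotated with @Endpoint must be declared in a class annotated with @Service. Both annotations, the rule itself, and the class AnnotationRuleSketch are hypothetical and are not taken from MicroProfile or from the pipeline described above. The sketch also uses runtime reflection for brevity, which is an assumption of this example rather than a description of how the Maven plugin works.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical annotations and rule, for illustration only.
class AnnotationRuleSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Service {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Endpoint {}

    // Follows the assumed rule: @Endpoint method inside a @Service class.
    @Service
    static class GoodResource {
        @Endpoint public void get() {}
    }

    // Violates the assumed rule: @Endpoint method, but no @Service on the class.
    static class BadResource {
        @Endpoint public void get() {}
    }

    // Returns "Class.method" for every @Endpoint method whose declaring
    // class lacks @Service.
    static List<String> violations(Class<?> type) {
        List<String> found = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Endpoint.class)
                    && !type.isAnnotationPresent(Service.class)) {
                found.add(type.getSimpleName() + "." + m.getName());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(violations(GoodResource.class)); // prints []
        System.out.println(violations(BadResource.class));  // prints [BadResource.get]
    }
}
```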
Java Cryptography Misuse
By analyzing StackOverflow posts and GitHub repositories and by conducting two surveys with a total of 48 application developers, we collected the problems developers face with current cryptography APIs as well as their suggestions for improvement. Among our findings, developers have trouble choosing the correct algorithm to use and want higher-level abstractions such as tasks. To address these issues, we looked more closely at the cryptography domain and found that there is a wide variety of cryptographic components and algorithms (e.g., ciphers, digests, and signatures) and that each of these components comes with its own variability. For example, a cipher can be symmetric or asymmetric. If it is symmetric, it can operate on blocks or streams. Additionally, there are different modes of operation (e.g., ECB vs. CBC) as well as different padding schemes. To deal with this huge variability space, we model cryptographic components using concepts from feature modeling. However, such components have many attributes, and some cryptography solutions use multiple components at the same time. We therefore need modeling notations beyond those offered by basic feature modeling.
CogniCrypt was built on the insights derived from these studies.
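The sketch below illustrates why the choice among cipher variants such as ECB and CBC matters. It uses the standard javax.crypto API and is unrelated to CogniCrypt's own code; the class CipherModeSketch and the method repeatsBlocks are hypothetical names for this example. It encrypts two identical plaintext blocks under each mode and checks whether the resulting ciphertext blocks are also identical.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical demo class; only the javax.crypto calls are real API.
class CipherModeSketch {

    // Encrypts two identical all-zero 16-byte blocks and reports whether
    // the first two ciphertext blocks are equal.
    static boolean repeatsBlocks(String transformation, SecretKey key) throws Exception {
        byte[] plaintext = new byte[32];
        Cipher cipher = Cipher.getInstance(transformation);
        if (transformation.contains("/CBC/")) {
            // CBC needs an initialization vector; use a fresh random one.
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        } else {
            cipher.init(Cipher.ENCRYPT_MODE, key);
        }
        byte[] ct = cipher.doFinal(plaintext);
        return Arrays.equals(Arrays.copyOfRange(ct, 0, 16),
                             Arrays.copyOfRange(ct, 16, 32));
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();

        // ECB encrypts each block independently, so equal plaintext blocks
        // yield equal ciphertext blocks: prints true.
        System.out.println(repeatsBlocks("AES/ECB/PKCS5Padding", key));

        // CBC chains each block with the previous ciphertext block, so the
        // blocks differ except with negligible probability: prints false.
        System.out.println(repeatsBlocks("AES/CBC/PKCS5Padding", key));
    }
}
```

The transformation string itself (algorithm/mode/padding) reflects the variability discussed above: each of the three parts is a separate choice a developer has to get right.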
Related Resources
- MUBench Repository
- MUDetect Repository
- Annotation Usage Rule Generation Pipeline Repository
- CogniCrypt Project
Related Publications
2024
- ESEM: An Empirical Study of API Misuses of Data-Centric Libraries. In Proceedings of the ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM ’24), 2024
2023
- SecDev: Securing Your Crypto-API Usage Through Tool Support - A Usability Study. In IEEE Secure Development Conference (SecDev), 2023
2022
- CASCON: A Human-in-the-loop Approach to Generate Annotation Usage Rules: A Case Study with MicroProfile. In Annual International Conference on Computer Science and Software Engineering (CASCON ’22), 2022
- ICSME: Mining Annotation Usage Rules: A Case Study with MicroProfile. In Proceedings of the 38th IEEE International Conference on Software Maintenance and Evolution – Industry Track, 2022
2019
- MSR: Investigating Next-Steps in Static API-Misuse Detection. In Proceedings of the 16th International Conference on Mining Software Repositories (MSR ’19), 2019
2018
- TSE: A Systematic Evaluation of Static API-Misuse Detectors. IEEE Transactions on Software Engineering, 2018
2017
- ASE: CogniCrypt: Supporting Developers in using Cryptography. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE ’17) – Tool Demo Track, 2017
2016
- ICSE: "Jumping Through Hoops": Why do Java Developers Struggle with Cryptography APIs? In Proceedings of the 38th International Conference on Software Engineering (ICSE ’16), 2016
- VaMoS
- MSR: MUBench: A Benchmark for API-Misuse Detectors. In Proceedings of the 13th International Conference on Mining Software Repositories – Data Showcase Track (MSR ’16), 2016
2015
- ONWARD: Towards Secure Integration of Cryptographic Software. In Proceedings of the SIGPLAN Symposium on New Ideas in Programming and Reflections on Software at SPLASH (ONWARD ’15), 2015 (Acceptance Rate: 17/37 = 35%)