Libraries offer reusable functionality through Application Programming Interfaces (APIs) with usage constraints such as call conditions or call orders. Constraint violations, i.e. API misuses, commonly lead to bugs and security issues. Although researchers have developed various API misuse detectors over the past few decades, recent studies show that API misuse remains prevalent in real-world projects, especially for secure socket layer (SSL) certificate validation, which is completely broken in many security-critical applications and libraries. In this paper, we introduce SSLDoc to effectively detect API misuse bugs, specifically for SSL API libraries. The key insight behind SSLDoc is a constraint-directed static analysis technique powered by a domain-specific language (DSL) for specifying API usage constraints. Through studying real-world API misuse bugs, we propose the ISpec DSL, which covers most types of API usage constraints and enables simple but precise specification. Furthermore, we design and implement SSLDoc to automatically parse ISpec into checking targets and employ a static analysis engine to identify potential API misuses and prune false positives with rich semantics. We have instantiated SSLDoc for OpenSSL APIs and applied it to large-scale open-source programs. SSLDoc found 45 previously unknown security-sensitive bugs in the OpenSSL implementation and in applications in Ubuntu. Up to now, 35 have been confirmed by the corresponding development communities and 27 have been fixed in the master branch.
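To make the notion of a call-order constraint concrete, the following is a minimal sketch of checking such constraints against a flat sequence of call names. It is not ISpec syntax and not SSLDoc's analysis: the rule table `ORDER_RULES`, the function `find_order_violations` and the two example call sequences are hypothetical, introduced only for illustration. The OpenSSL function names are real, and the first rule reflects the common guidance that a certificate obtained with `SSL_get_peer_certificate` should be checked with `SSL_get_verify_result`.

```python
# Toy call-order constraints: each key must eventually be followed by its value.
# This rule format is an assumption for illustration; it is not ISpec.
ORDER_RULES = {
    "SSL_get_peer_certificate": "SSL_get_verify_result",
    "SSL_new": "SSL_free",
}


def find_order_violations(calls, rules=ORDER_RULES):
    """Return (index, call, missing_followup) for each rule that is not met."""
    violations = []
    for i, name in enumerate(calls):
        required = rules.get(name)
        # A rule is violated if the required follow-up never appears later.
        if required is not None and required not in calls[i + 1:]:
            violations.append((i, name, required))
    return violations


# Hypothetical call sequences extracted from one code path.
buggy = ["SSL_new", "SSL_connect", "SSL_get_peer_certificate", "SSL_free"]
fixed = [
    "SSL_new",
    "SSL_connect",
    "SSL_get_peer_certificate",
    "SSL_get_verify_result",
    "SSL_free",
]

print(find_order_violations(buggy))
print(find_order_violations(fixed))
```

A real static analyzer would work on control-flow paths and track values rather than a single linear trace, so this sketch only shows what "a constraint on call order" means, not how false positives are pruned.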
APIs play a crucial role in contemporary software development, streamlining implementation and maintenance processes. However, improper API usage can result in significant issues such as unexpected outcomes, security vulnerabilities and system crashes. To detect API misuses, current methods primarily rely on comparing established API usage patterns with target points for automated detection, mainly based on pre-validated datasets. Nonetheless, there is a scarcity of publicly available datasets on API misuses and their corresponding fixes, which hinders data-driven research. Moreover, most existing techniques concentrate on statically typed languages, such as Java and C, with only a few addressing dynamic languages like Python effectively, due to difficulties in handling dynamic features. Therefore, it is essential to identify Python API misuses and their fixes automatically and promptly. In this paper, we introduce HatPAM, a Hybrid Analysis and Attention-based Python API-Misuse Miner, which (a) provides a method for automatically mining true-positive commits related to Python API-misuse fixes from GitHub and (b) presents the subsequent processing for classifying Python API misuses in true-positive cases. Particularly, HatPAM applies hybrid static analysis and introduces a structure-based attention mechanism to examine syntax, semantics and structural features in Python code context, and considers the consistency between code and developers’ natural intent to significantly reduce false-positive cases. Evaluation on six popular Python projects reveals that HatPAM outperforms various state-of-the-art baselines, achieving up to 92.2% Precision, 86.7% Recall and 89.3% F1-score, indicating its capability to identify and classify Python API-misuse commits.
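As a rough illustration of the static side of such mining, the sketch below compares the API calls in two versions of a Python snippet and reports the callees whose keyword arguments changed. This is not HatPAM's pipeline: it involves no attention model and no commit-message analysis, and the helper names `call_signatures` and `changed_calls` and the before/after snippets are assumptions made for this example. The snippets use the well-known PyYAML case of calling `yaml.load` without an explicit `Loader`; the code is only parsed, never executed, so PyYAML does not need to be installed.

```python
import ast


def call_signatures(source):
    """Collect (callee, keyword-names) pairs for every call in the source."""
    sigs = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            callee = ast.unparse(node.func)  # ast.unparse needs Python 3.9+
            keywords = tuple(sorted(kw.arg for kw in node.keywords if kw.arg))
            sigs.add((callee, keywords))
    return sigs


def changed_calls(before, after):
    """Return the callees whose call shape differs between two versions."""
    old, new = call_signatures(before), call_signatures(after)
    # The symmetric difference keeps signatures present in only one version.
    return sorted({callee for callee, _ in old ^ new})


# Hypothetical pre-fix and post-fix versions of a file touched by a commit.
before = "import yaml\nconfig = yaml.load(text)\n"
after = "import yaml\nconfig = yaml.load(text, Loader=yaml.SafeLoader)\n"

print(changed_calls(before, after))
```

A change in call shape is only a candidate signal. Deciding whether the commit actually fixes a misuse would still require the semantic and intent checks the abstract describes.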