Improving Test Case Generation for Python Native Libraries Through Constraints on Input Data Structures

06/28/2022
by   Xin Zhang, et al.

Modern Python projects run their computational functions in native libraries and expose them through Python interfaces to boost execution speed; testing these libraries is therefore critical to a project's robustness. One challenge is that existing approaches use coverage to guide test generation, but native libraries run as black boxes to Python code and provide no execution information. Another is that dynamic binary instrumentation reduces testing performance because it must monitor both the native libraries and the Python virtual machine (VM). To address these challenges, we propose an automated test case generation approach that works at the Python code layer. Our insight is that many path conditions in native libraries exist to process input data structures through interactions with the VM. In our approach, we instrument the Python interpreter to monitor the interactions between native libraries and the VM, derive constraints on the structures, and then use the constraints to guide test case generation. We implement our approach in a tool named PyCing and apply it to six widely used Python projects. The experimental results show that, with structure constraint guidance, PyCing covers more execution paths than existing test cases and state-of-the-art tools. In addition, with the checkers in the testing framework Pytest, PyCing identifies segmentation faults in 10 Python interfaces and memory leaks in 9. Our instrumentation strategy also has an acceptable influence on testing efficiency.
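The idea of deriving structure constraints from observed interactions can be illustrated with a rough, pure-Python analogy. This is not PyCing's implementation, which instruments the interpreter itself; it is only a sketch under stated assumptions. Every name here (`Probe`, `toy_interface`, `derive_constraints`, `generate_input`) is hypothetical, and the "native" function is an ordinary Python stand-in. A probe object records which operations the callee attempts on its input, and those records are then used to build a concrete input of a matching shape.

```python
class Probe:
    """Hypothetical stand-in input that records how a callee uses it."""

    def __init__(self, length=2):
        self.length = length
        self.observed = []  # recorded operations, in call order

    def __len__(self):
        self.observed.append(("len",))
        return self.length

    def __getitem__(self, index):
        self.observed.append(("getitem", index))
        if isinstance(index, int) and 0 <= index < self.length:
            return 0
        raise IndexError(index)


def toy_interface(data):
    """Hypothetical stand-in for a native interface (assumption: it
    expects a sequence with at least two numeric items)."""
    if len(data) < 2:
        raise ValueError("need at least two items")
    return data[0] + data[1]


def derive_constraints(func):
    """Call func with a probe and return the operations it attempted."""
    probe = Probe()
    try:
        func(probe)
    except Exception:
        pass  # the recorded operations are still useful after a failure
    return probe.observed


def generate_input(constraints):
    """Build a concrete input whose shape satisfies the observed accesses."""
    indices = [c[1] for c in constraints
               if c[0] == "getitem" and isinstance(c[1], int)]
    if indices:
        # A list long enough to cover the largest index that was read.
        return [0] * (max(indices) + 1)
    if ("len",) in constraints:
        return []
    return 0  # no structural access observed; fall back to a scalar


constraints = derive_constraints(toy_interface)
candidate = generate_input(constraints)
print(constraints)
print(candidate, toy_interface(candidate))
```

In this toy setting the recorded operations play the role of structure constraints: they tell the generator that the input must be a sequence with at least two elements, so it avoids wasting attempts on scalars or empty lists.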


