Automating Test Case Identification in Open Source Projects on GitHub

02/23/2021
by Matej Madeja, et al.

Software testing is an important component of Quality Assurance (QA). Many researchers study the testing process in terms of tester motivation and how tests should or should not be written. However, such recommendations do not reveal how tests are actually written in real projects. This paper investigates: (i) the denotation of the word "test" in different natural languages; (ii) whether the word "test" correlates with the presence of test cases; and (iii) which testing frameworks are used most. The analysis was performed on 38 open source GitHub repositories carefully selected from a set of 4.3M GitHub projects. We analyzed 20,340 test cases in 803 classes manually and 170k classes using an automated approach. The results show that: (i) there is a weak correlation (r = 0.655) between the word "test" and the presence of test cases in a class; (ii) the proposed algorithm using static file analysis correctly detected 95% of test cases; (iii) 15% of the analyzed classes used a main() function, representing regular Java programs that test the production code without using any third-party framework. The identification rate for such tests is very low due to implementation diversity. The results may be leveraged to identify and locate test cases in a repository more quickly, to understand practices in customized testing solutions, and to mine tests to improve program comprehension in the future.
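To make the idea of static file analysis concrete, the following is a minimal sketch in Python, not the authors' algorithm. It assumes three illustrative heuristics that the abstract does not spell out: the word "test" in the class name, JUnit-style `@Test` annotations, and a `main()` method. It then computes a Pearson correlation between word presence and test presence on a tiny made-up sample. All names (`analyze_class`, `pearson`, `SAMPLES`) and the sample sources are hypothetical.

```python
import re
from math import sqrt

# Illustrative heuristics only; the paper's real detection rules are not
# described in the abstract, so these patterns are assumptions.
TEST_ANNOTATION = re.compile(r"@Test\b")
MAIN_METHOD = re.compile(r"public\s+static\s+void\s+main\s*\(")
TEST_WORD = re.compile(r"test", re.IGNORECASE)


def analyze_class(name, source):
    """Return (has_test_word, annotated_test_count, has_main) for one Java class."""
    has_word = bool(TEST_WORD.search(name))
    test_count = len(TEST_ANNOTATION.findall(source))
    has_main = bool(MAIN_METHOD.search(source))
    return has_word, test_count, has_main


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up Java sources keyed by class name (hypothetical sample, not paper data).
SAMPLES = {
    "UserServiceTest": (
        "import org.junit.Test;\n"
        "class UserServiceTest {\n"
        "  @Test public void savesUser() {}\n"
        "  @Test public void deletesUser() {}\n"
        "}"
    ),
    "ParserCheck": (
        "class ParserCheck {\n"
        "  public static void main(String[] args) { /* custom checks */ }\n"
        "}"
    ),
    "UserService": "class UserService { void save() {} }",
    "TestDataFactory": "class TestDataFactory { Object build() { return null; } }",
}

results = {name: analyze_class(name, src) for name, src in SAMPLES.items()}

# Binary vectors: does the name contain "test", and does the class hold annotated tests?
word_flags = [int(has_word) for has_word, _, _ in results.values()]
test_flags = [int(count > 0) for _, count, _ in results.values()]
r = pearson(word_flags, test_flags)

for name, (has_word, count, has_main) in results.items():
    print(f"{name}: word={has_word} annotated_tests={count} main={has_main}")
print(f"r = {r:.3f}")
```

The sample deliberately includes a class named `TestDataFactory` with no tests and a `ParserCheck` class that tests through `main()` only. These are the two cases the abstract points to: the word "test" alone is an imperfect signal, and framework-free tests are hard to recognize by simple patterns.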


