Studying the Relationship between Exception Handling Practices and Post-release Defects

Abstract

Modern programming languages, such as Java and C#, typically provide features that handle exceptions. These features separate error-handling code from regular source code and aim to assist in the practice of software comprehension and maintenance. Nevertheless, their misuse can still cause reliability degradation or even catastrophic software failures. Prior studies on exception handling revealed the suboptimal practices of the exception handling flows and the prevalence of their anti-patterns. However, little is known about the relationship between exception handling practices and software quality. In this work, we investigate the relationship between software quality (measured by the chance of having post-release defects) and: (i) exception flow characteristics and (ii) 17 exception handling anti-patterns. We perform a case study on three Java and C# open-source projects. By building statistical models of the chance of post-release defects using traditional software metrics and metrics that are associated with exception handling practice, we study whether exception flow characteristics and exception handling anti-patterns have a statistically significant relationship with post-release defects. We find that exception flow characteristics in Java projects have a significant relationship with post-release defects. In addition, although the majority of the exception handling anti-patterns are not significant in the models, there exist anti-patterns that can provide significant explanatory power to the chance of post-release defects. Therefore, development teams should consider allocating more resources to improving their exception handling practices, and avoid the anti-patterns that are found to have a relationship with post-release defects. Our findings also highlight the need for techniques that assist in handling exceptions in the software development practice.

Read our paper: Paper

Evaluated Subject Projects

The subject projects are three open-source software projects. To reproduce our study, use the releases we analysed, which are linked below.

Project	Release	Latest Post-Release
C# Umbraco-CMS	release-7.6.0	release-7.6.12
Java Hadoop	rel/release-2.6.0	rel/release-2.6.5
Java Hibernate	5.0.0.Final	5.0.16

Exception Flow Analyzer

The source code and binary files of the two source-code analysis tools that we used are available below.

Java Binary and Source
C# Binary and Source

Extracted metrics and aggregations

Before modeling, the metrics are not all at the file level. The link below lists every metric, where it was extracted, and how it was aggregated to the next level.

Metrics and aggregation rules
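
As a rough illustration of what aggregating to the next level can look like, the sketch below rolls method-level values up to the file level. It is written in Python for brevity, while our actual scripts are R Notebooks; the file paths, the metric values, and the choice of `sum` and `max` as rules are hypothetical and are not taken from the linked list.

```python
from collections import defaultdict

# Hypothetical method-level observations: (file, method, metric value).
# Paths and values are invented for illustration only.
method_rows = [
    ("src/A.java", "read", 2),
    ("src/A.java", "write", 5),
    ("src/B.java", "run", 1),
]


def aggregate_to_file_level(rows, rule):
    """Group method-level values by file and reduce each group with `rule`."""
    grouped = defaultdict(list)
    for file_path, _method, value in rows:
        grouped[file_path].append(value)
    return {file_path: rule(values) for file_path, values in grouped.items()}


# Two possible aggregation rules (assumed here, not the documented ones).
file_sum = aggregate_to_file_level(method_rows, sum)
file_max = aggregate_to_file_level(method_rows, max)
print(file_sum)
print(file_max)
```

Which rule applies to which metric is defined in the linked list; the sketch only shows the mechanics of the roll-up.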

Data Input at File Level, Scripts and Output

Refer to the paper for further explanation of what the scripts do.

The input data comes from five different sources (i.e., different metrics/columns). This file merges all data sources and excludes every observation that has missing data in any column.
Single CSV with merged data and no missing values
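
The merge-and-filter step can be pictured with the minimal sketch below. It is a Python illustration only, since our scripts are written in R; the two sources, the column names `loc` and `catch_blocks`, and the values are hypothetical and do not come from the data set.

```python
def merge_complete_cases(sources):
    """Join per-file metric dicts on file path and keep only complete rows."""
    merged = {}
    expected_columns = set()
    for source in sources:
        for file_path, metrics in source.items():
            merged.setdefault(file_path, {}).update(metrics)
            expected_columns.update(metrics)
    # Drop any file that lacks a value in at least one column.
    return {
        file_path: row
        for file_path, row in merged.items()
        if all(row.get(column) is not None for column in expected_columns)
    }


# Hypothetical sources keyed by file path (names and values are invented).
traditional = {"src/A.java": {"loc": 120}, "src/B.java": {"loc": 40}}
exception_flow = {
    "src/A.java": {"catch_blocks": 3},
    "src/B.java": {"catch_blocks": None},  # missing value, so B is dropped
}

complete = merge_complete_cases([traditional, exception_flow])
print(complete)
```

A file that is absent from one of the sources is dropped in the same way, because it has no value for that source's columns.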

The input data comes from five different sources (i.e., different metrics/columns). This package contains the separate data sources and all scripts required and used to obtain the results presented in the paper. It is a full replication package for modeling.
Input, Scripts and Output

Model construction and analysis steps

From the R Notebooks (i.e., the scripts above), we knitted the results into the HTML files below:

Data processing and missing data analysis
Model construction and analysis

More on R and Regression Modeling Strategies

To understand more about our modeling approach and the R packages we used, see:

Dr. Frank Harrell’s RMS package, course and book
Dr. Frank Harrell’s Hmisc package
Dr. Max Kuhn’s caret package