What is SOBER?
SOBER is a statistical debugging tool that automatically localizes software faults without any prior knowledge of program semantics. More information can be found in the following papers.
- C. Liu, X. Yan, L. Fei, J. Han and S. Midkiff, “SOBER: Statistical Model-based Bug Localization,” In Proc. of the 5th joint meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE'05), Lisbon, Portugal, Sept. 2005. [pdf]
- C. Liu, L. Fei, X. Yan, J. Han and S. Midkiff, “Statistical Debugging: A Hypothesis Testing-based Approach,” IEEE Transactions on Software Engineering, Vol. 32, No. 10, pp. 831-848, Oct. 2006. [pdf]
Motivation
At its current stage, SOBER is still a research prototype, so this site is not meant as a mature package release. Instead, its main purpose is to help researchers reproduce the results presented in the above papers. Because the evaluation of fault localization tools is quite subjective, we describe the experiment details below, together with downloads.
Experiment Details and Downloads
Bug Benchmark
- We used the Siemens program suite in our experiments, which is available upon request from the Subject Infrastructure Repository. Because T-score-based evaluation requires identifying the defect set of each buggy program, we manually prepared a file named "faulty" for each faulty version of the Siemens suite; it lists the line(s) of code that could be the root cause, i.e., the bug. We prepared each "faulty" file according to our programming experience, with no bias. The Siemens suite with the "faulty" file for each version can be downloaded locally here.
- The 8,000 inputs used to crash bc-1.06 are available here: input set 1 and input set 2. The programs used to generate random inputs for bc-1.06 are InputGenNodeType.java and InputGen.java.
Instrumentation
- The localization quality can be sensitive to the instrumentation strategy. The instrumented Siemens suite can be downloaded here. One important note: our instrumentation treats each conditional as one inseparable instrumentation unit and does not treat each subclause individually. Our experiments on the Siemens suite indicate that this non-split strategy generally yields better localization quality than splitting each conditional into separate subclauses.
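To make the difference between the two strategies concrete, here is a minimal sketch. It is not the instrumentation used in our experiments (the Siemens programs are written in C); it is a small Python illustration, and the helper names `record`, `check_nonsplit`, and `check_split`, as well as the predicate labels, are hypothetical. It counts how often each instrumented predicate evaluates to true or false, once with the whole conditional as a single unit and once with each subclause as its own unit.

```python
def record(counts, pred_id, outcome):
    """Count one evaluation of a predicate and return its outcome unchanged."""
    slot = counts.setdefault(pred_id, [0, 0])  # [times true, times false]
    slot[0 if outcome else 1] += 1
    return outcome


def check_nonsplit(counts, x, y):
    # Non-split: the whole conditional is one instrumentation unit.
    return record(counts, "P1: x > 0 and y > 0", x > 0 and y > 0)


def check_split(counts, x, y):
    # Split: each subclause is its own unit. Because of short-circuit
    # evaluation, the second subclause is recorded only when the first is true.
    return (record(counts, "P1a: x > 0", x > 0)
            and record(counts, "P1b: y > 0", y > 0))


# Illustrative inputs only; they are not taken from any benchmark.
inputs = [(1, 1), (1, -1), (-1, 1)]
nonsplit, split = {}, {}
for x, y in inputs:
    # Both strategies leave the program's behavior unchanged.
    assert check_nonsplit(nonsplit, x, y) == check_split(split, x, y)

print(nonsplit)
print(split)
```

Under the non-split strategy there is a single predicate whose true/false counts describe the conditional as a whole; under the split strategy the same runs produce two predicates, and the second one is observed less often because it is skipped whenever the first subclause is false.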
SOBER Algorithm Implementation
- Our Matlab implementation of SOBER can be downloaded here.
T-Score-based Evaluation
- The T-score-based evaluation is done within CodeSurfer 1.9 with patch 3, using the factory-default options.
Acknowledgements
- We would like to thank Gregg Rothermel for making the Siemens program suite available.
- Andreas Zeller, Holger Cleve and Manos Reneris generously shared with us their evaluation frameworks.
- GrammaTech Inc. offered us a free copy of CodeSurfer.