Wrapper for building classification models using Covering Arrays

Language

en

Review Status

Peer Review

Access Rights

Open Access

Usage Rights

CC-BY-4.0

Citation

Dorado, Hugo; Cobos, Carlos; Torres-Jimenez, Jose; Burra, Dharani Dhar; Mendoza, Martha & Jimenez, Daniel. (2019). Wrapper for building classification models using Covering Arrays. IEEE Access. 1-16 p.

Abstract/Description

Wrapper methods are a type of feature selection method that find a subset of variables that improves the performance of a classifier by removing redundant and irrelevant variables. Using a wrapper implies that each time a candidate solution is explored, the classifier is evaluated on the selected quality measures (e.g., accuracy or precision). Though robust, this iteration across many candidate solutions can become computationally intensive and time-consuming. In this paper we propose a wrapper based on binary Covering Arrays (CAs) and binary Incremental Covering Arrays (ICAs), structures that have been widely used for experimental design and for fault detection in software and hardware testing. The new wrapper was evaluated with six classifiers on seven data sets. The results show that CAs and ICAs of strength 6 significantly improve performance and reduce the number of variables required by the classifier. A comparative analysis against wrappers based on other search approaches, such as genetic algorithms (GA) and particle swarm optimization (PSO), shows that the proposed method yields results similar to GA but does not match PSO, although the difference in accuracy relative to PSO is below 0.04 in the majority of cases. This accuracy gap is offset by the fact that the user does not need to fine-tune algorithm parameters such as velocity ranges, timing, cognitive coefficient, and social coefficient, and the proposed wrapper is also much easier to implement in parallel.
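The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each row of a binary covering array is read as a feature mask (1 = keep the variable, 0 = drop it) and that every row is scored independently. The names `is_covering_array`, `wrapper_search`, `toy_score`, and the weights inside `toy_score` are hypothetical; in a real wrapper the score would be a classifier quality measure such as cross-validated accuracy. The example array has strength 2 over four variables for brevity, whereas the paper reports results for strength 6.

```python
from itertools import combinations, product


def is_covering_array(rows, strength):
    """Return True if every choice of `strength` columns contains
    all 2**strength binary patterns in at least one row."""
    n_cols = len(rows[0])
    needed = set(product((0, 1), repeat=strength))
    for cols in combinations(range(n_cols), strength):
        seen = {tuple(row[c] for c in cols) for row in rows}
        if seen != needed:
            return False
    return True


def wrapper_search(rows, score):
    """Score each row as a feature mask and return (best_score, best_mask).
    Rows are independent of one another, so this loop could be run in parallel."""
    best_score, best_mask = None, None
    for row in rows:
        if not any(row):  # an empty feature subset cannot train a classifier
            continue
        s = score(row)
        if best_score is None or s > best_score:
            best_score, best_mask = s, row
    return best_score, best_mask


# A small binary covering array: 5 rows, 4 variables, strength 2.
CA = [
    (0, 0, 0, 0),
    (0, 1, 1, 1),
    (1, 0, 1, 1),
    (1, 1, 0, 1),
    (1, 1, 1, 0),
]


def toy_score(mask):
    """Hypothetical stand-in for classifier accuracy: variables 0 and 2 help,
    variable 1 hurts, variable 3 is irrelevant."""
    weights = (5, -2, 3, 0)
    return sum(w * m for w, m in zip(weights, mask))


best_score, best_mask = wrapper_search(CA, toy_score)
```

Because the set of candidate subsets is fixed by the array before any classifier is trained, the search in this sketch has no tuning parameters of its own, which is consistent with the trade-off against PSO described in the abstract.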
