Abstract (EN):
This article presents an alternative approach to the problem of regression. The methodology we describe allows classification algorithms to be used in regression tasks. From a practical point of view, this makes a wide range of existing machine learning (ML) systems available for regression problems, since most widely available systems deal with classification. Our method works as a pre-processing step in which the values of the continuous goal variable are discretised into a set of intervals. We use misclassification costs to reflect the implicit ordering among these intervals. We describe a set of alternative discretisation methods and, based on our experimental results, justify the need for a search-based approach to choosing the best method. The discretisation process is isolated from the classification algorithm and is therefore applicable to virtually any existing system. The implemented system (RECLA) can thus be seen as a generic pre-processing tool. We have tested RECLA with three different classification systems and evaluated it on several regression data sets. Our experimental results confirm the validity of our search-based approach to class discretisation and reveal the accuracy benefits of adding misclassification costs. © 1997 Elsevier Science B.V.
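The pre-processing idea can be illustrated with a minimal sketch. The abstract does not say which discretisation methods RECLA implements or how its costs are computed, so everything below is an assumption made for illustration: equal-frequency intervals, the interval median as class representative, and a cost equal to the absolute distance between medians. The function names are hypothetical and are not RECLA's API.

```python
from statistics import median


def equal_frequency_intervals(values, k):
    """Split the target values into k ordered groups of nearly equal size.

    Assumption: equal-frequency binning, one possible discretisation method.
    Requires k <= len(values) so that no group is empty.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Index boundaries that spread the n values as evenly as possible.
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(k)]


def cost_matrix(groups):
    """Cost of predicting class j when the true class is i.

    Assumption: cost is the absolute distance between interval medians,
    so confusing distant intervals is penalised more than adjacent ones.
    """
    reps = [median(g) for g in groups]
    return [[abs(a - b) for b in reps] for a in reps]


def to_class(y, groups):
    """Map a continuous target value to the index of its interval."""
    for idx, group in enumerate(groups):
        if y <= group[-1]:  # group[-1] is the largest value in the interval
            return idx
    return len(groups) - 1  # values above every interval go to the last class


targets = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
groups = equal_frequency_intervals(targets, 3)
costs = cost_matrix(groups)
labels = [to_class(y, groups) for y in targets]
```

With these class labels and the cost matrix, any cost-sensitive classifier could be trained in place of a regressor, and a predicted class could be mapped back to a number through its interval median.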
Language:
English
Type (Professor's evaluation):
Scientific