DaSSWeb - Imbalanced regression and extreme value prediction
May 24th | 14:30
DaSSWeb, Data Science and Statistics Webinar
Imbalanced regression and extreme value prediction
Rita P. Ribeiro
University of Porto & INESC TEC
Research in imbalanced domain learning has focused almost exclusively on classification tasks, where the goal is the accurate prediction of cases labelled with a rare class. Approaches for addressing such problems in regression tasks are still scarce, due to two main factors. First, standard regression tasks assume that all domain values are equally important. Second, standard evaluation metrics focus on assessing the performance of models on the most common values of data distributions. In this paper, we present an approach to tackle imbalanced regression tasks where the objective is to predict extreme (rare) values. We propose an approach to formalise such tasks and to optimise and evaluate predictive models, overcoming the two factors mentioned above as well as issues in related work. We present an automatic and non-parametric method to obtain relevance functions, building on the concept of relevance as the mapping of target values into non-uniform domain preferences. Then, we propose SERA, a new evaluation metric capable of both assessing the effectiveness of models and optimising them towards the prediction of extreme values, while penalising severe model bias. An experimental study demonstrates how SERA provides valid and useful insights into the performance of models in imbalanced regression tasks.
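The abstract does not spell out how SERA is computed. As a rough illustration only, the sketch below assumes one common reading of the metric: for each relevance threshold t in [0, 1], sum the squared errors of the cases whose relevance is at least t, then take the area under that curve. The function names, the toy data, and the linear `toy_relevance` function are hypothetical stand-ins; in particular, `toy_relevance` is not the automatic, non-parametric relevance method described in the abstract.

```python
import numpy as np


def toy_relevance(y, low, high):
    """Hypothetical stand-in relevance function (NOT the method from the talk):
    0 at or below `low`, 1 at or above `high`, linear in between."""
    y = np.asarray(y, dtype=float)
    return np.clip((y - low) / (high - low), 0.0, 1.0)


def sera(y_true, y_pred, relevance, steps=1000):
    """Approximate the area under the curve t -> SER_t for t in [0, 1],
    where SER_t is the sum of squared errors over cases with relevance >= t.
    This definition is an assumption; the abstract does not state it."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    sq_err = (y_pred - y_true) ** 2
    thresholds = np.linspace(0.0, 1.0, steps + 1)
    # One SER_t value per threshold.
    ser = np.array([sq_err[relevance >= t].sum() for t in thresholds])
    # Trapezoidal rule, written out by hand.
    return float(np.sum((ser[1:] + ser[:-1]) / 2.0 * np.diff(thresholds)))


# Toy data: three common target values and one extreme value.
y_true = np.array([1.0, 2.0, 3.0, 10.0])
phi = toy_relevance(y_true, low=3.0, high=10.0)

pred_a = np.array([1.0, 2.0, 3.0, 8.0])   # misses the extreme value by 2
pred_b = np.array([3.0, 2.0, 3.0, 10.0])  # misses a common value by 2

mse_a = float(np.mean((pred_a - y_true) ** 2))
mse_b = float(np.mean((pred_b - y_true) ** 2))
sera_a = sera(y_true, pred_a, phi)
sera_b = sera(y_true, pred_b, phi)

print("MSE :", mse_a, mse_b)
print("SERA:", sera_a, sera_b)
```

Under these assumptions the two sets of predictions have the same mean squared error, but the one that misses the extreme value receives a much larger SERA score, which is the kind of distinction the abstract says standard metrics fail to make.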
Rita P. Ribeiro is an Assistant Professor at the Department of Computer Science of the Faculty of Sciences of the University of Porto (FCUP) and a Researcher at the Artificial Intelligence and Decision Support Lab (LIAAD) of the Institute for Systems and Computer Engineering, Technology and Science (INESC TEC). She holds a PhD in Computer Science from the University of Porto. Her main research topics are imbalanced domain learning, outlier detection, evaluation issues in learning tasks, and problems related to social good. She has been involved in several research projects concerning environmental problems, fraud detection, and predictive maintenance applications. She is a member of the program committee of several conferences, serves as a reviewer for several journals, and has been involved in the organization of some scientific events. Currently, she is also the director of the Master's in Data Science at FCUP.