Abstract (EN):
Class overlap presents a significant challenge to machine learning algorithms, especially when class imbalance is also present. Together, these factors contribute substantially to the complexity of classification tasks, particularly in real-world scenarios. Measuring overlap is therefore crucial, yet it remains difficult to quantify because it can manifest and be measured in multiple ways. To help mitigate this, recent research has proposed a new taxonomy of class overlap measures, divided into multiple families, which gives researchers a more complete overview of dataset complexity. In line with this research, we introduce pycol, a new Python package for class overlap measurement. The package implements 29 overlap measures, divided into four overlap families, specifically designed to capture class overlap in imbalanced real-world scenarios. This makes pycol an essential tool for researchers dealing with complex classification problems, providing robust solutions to quantify the joint effect of class overlap and class imbalance.
Language:
English
Type (Faculty Evaluation):
Scientific
No. of pages:
10