Publication

mdatagen: A python library for the artificial generation of missing data

Type
Article in International Scientific Journal
Year
2025
Authors
Mangussi, AD
(Author)
Other
Santos, MS
(Author)
FCUP
Lopes, FL
(Author)
Other
Pereira, RC
(Author)
Other
Lorena, AC
(Author)
Other
Journal
Title: Neurocomputing
Vol. 625
ISSN: 0925-2312
Publisher: Elsevier
Other information
Authenticus ID: P-017-ZVT
Abstract (EN): Missing data is characterized by absent values in a dataset (i.e., missing values) and is currently categorized into three mechanisms: Missing Completely At Random, Missing At Random, and Missing Not At Random. When performing missing data experiments and evaluating techniques to handle absent values, these mechanisms are often artificially generated (a process referred to as data amputation) to assess the robustness and behavior of the methods used. Because no standard benchmark for data amputation exists, related research relies on different implementations of the mechanisms (some of which are not disclosed), which prevents results from being reproduced and leads to unfair or inaccurate comparisons between existing and new methods. Moreover, for users outside the field, experimenting with missing data or simulating the appearance of missing values in real-world domains is unfeasible, impairing stress testing of machine learning systems. This work introduces mdatagen, an open-source Python library for generating missing data mechanisms across 20 distinct scenarios, following different univariate and multivariate implementations of the established mechanisms. The package therefore fosters reproducible results across missing data experiments and enables the simulation of artificial missing data under flexible configurations, making it versatile enough to mimic several real-world applications involving missing data. The source code and detailed documentation for mdatagen are available at https://github.com/ArthurMangussi/pymdatagen.
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 10
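Illustration: the three mechanisms named in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch and not the mdatagen API; the function names (`ampute_mcar`, `ampute_mar`, `ampute_mnar`) and the "remove the lowest values" rule are assumptions chosen for clarity, and each function amputes a single column (a univariate scenario).

```python
import numpy as np

def ampute_mcar(X, column, rate, rng):
    """MCAR: every row has the same chance of losing its value (hypothetical helper)."""
    X = np.array(X, dtype=float)           # work on a copy
    mask = rng.random(X.shape[0]) < rate   # independent of any data value
    X[mask, column] = np.nan
    return X

def ampute_mar(X, column, observed_column, rate):
    """MAR: missingness in `column` depends on another, fully observed column."""
    X = np.array(X, dtype=float)
    n_missing = int(round(rate * X.shape[0]))
    # Assumed rule: rows with the lowest values of the observed column are amputed.
    order = np.argsort(X[:, observed_column], kind="stable")
    X[order[:n_missing], column] = np.nan
    return X

def ampute_mnar(X, column, rate):
    """MNAR: missingness depends on the very values that are removed."""
    X = np.array(X, dtype=float)
    n_missing = int(round(rate * X.shape[0]))
    # Assumed rule: the lowest values of the column itself are amputed.
    order = np.argsort(X[:, column], kind="stable")
    X[order[:n_missing], column] = np.nan
    return X

# Usage on synthetic data: ampute about 20% of column 0 under each mechanism.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X_mcar = ampute_mcar(X, column=0, rate=0.2, rng=rng)
X_mar = ampute_mar(X, column=0, observed_column=1, rate=0.2)
X_mnar = ampute_mnar(X, column=0, rate=0.2)
```

Under MCAR the realized missing rate only approximates the requested one, because each row is drawn independently. The MAR and MNAR sketches remove exactly the requested share of rows, at the cost of using a deterministic threshold rather than a probabilistic one.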
Related Publications

Of the same journal

The vitality of pattern recognition and image analysis (2015)
Another Publication in an International Scientific Journal
Luisa Mico; Joao M Sanches; Jaime S Cardoso
ydata-profiling: Accelerating data-centric AI with high-quality data (2023)
Article in International Scientific Journal
Clemente, F; Ribeiro, GM; Quemy, A; Santos, MS; Pereira, RC; Barros, A
The vitality of pattern recognition and image analysis (2015)
Article in International Scientific Journal
Micó, L; Sanches, JM; Jaime S Cardoso
Pre-processing approaches for imbalanced distributions in regression (2019)
Article in International Scientific Journal
Branco, P; Torgo, L; Rita Ribeiro
Predicting satisfaction: perceived decision quality by decision-makers in Web-based group decision support systems (2019)
Article in International Scientific Journal
João Carneiro; Pedro Saraiva; Luís Conceição; Ricardo Santos; Goreti Marreiros; Paulo Novais
