A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients

Title
A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients
Type
Article in International Scientific Journal
Year
2015
Authors
Santos, MS
(Author)
Garcia Laencina, PJ
(Author)
Simao, A
(Author)
Carvalho, A
(Author)
Journal
Vol. 58
Pages: 49-59
ISSN: 1532-0464
Publisher: Elsevier
Other information
Authenticus ID: P-00G-WW5
Abstract (EN): Liver cancer is the sixth most frequently diagnosed cancer and, particularly, Hepatocellular Carcinoma (HCC) represents more than 90% of primary liver cancers. Clinicians assess each patient's treatment on the basis of evidence-based medicine, which may not always apply to a specific patient, given the biological variability among individuals. Over the years, and for the particular case of Hepatocellular Carcinoma, some research studies have been developing strategies for assisting clinicians in decision making, using computational methods (e.g. machine learning techniques) to extract knowledge from the clinical data. However, these studies have some limitations that have not yet been addressed: some do not focus entirely on Hepatocellular Carcinoma patients, others have strict application boundaries, and none considers the heterogeneity between patients nor the presence of missing data, a common drawback in healthcare contexts. In this work, a real complex Hepatocellular Carcinoma database composed of heterogeneous clinical features is studied. We propose a new cluster-based oversampling approach robust to small and imbalanced datasets, which accounts for the heterogeneity of patients with Hepatocellular Carcinoma. The preprocessing procedures of this work are based on data imputation considering appropriate distance metrics for both heterogeneous and missing data (HEOM) and clustering studies to assess the underlying patient groups in the studied dataset (K-means). The final approach is applied in order to diminish the impact of underlying patient profiles with reduced sizes on survival prediction. It is based on K-means clustering and the SMOTE algorithm to build a representative dataset and use it as training example for different machine learning procedures (logistic regression and neural networks). 
The results are evaluated in terms of survival prediction and compared across baseline approaches that do not consider clustering and/or oversampling using the Friedman rank test. Our proposed methodology coupled with neural networks outperformed all others, suggesting an improvement over the classical approaches currently used in Hepatocellular Carcinoma prediction models.
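Illustrative sketch (not the authors' code): the abstract names two building blocks, a HEOM distance that tolerates mixed and missing features, and SMOTE-style oversampling applied inside clusters. The Python below shows one plausible reading of each, using only the standard library. The function names `heom` and `smote_within_clusters`, the use of `None` for missing values, the choice to grow every cluster to the size of the largest one, and the assumption that cluster labels are already available (for example from K-means on imputed, numeric data) are all assumptions made for illustration; the record does not specify these details.

```python
import math
import random


def heom(x, y, ranges, nominal):
    """Heterogeneous Euclidean-Overlap Metric between two records.

    x, y    : sequences of feature values; None marks a missing value
    ranges  : per-feature (max - min) for numeric features (ignored for nominal)
    nominal : set of indices of nominal (categorical) features
    """
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0                        # missing value: maximal distance
        elif a in nominal:
            d = 0.0 if xa == ya else 1.0   # overlap metric for categories
        else:
            d = abs(xa - ya) / ranges[a] if ranges[a] else 0.0
        total += d * d
    return math.sqrt(total)


def smote_within_clusters(X, labels, k=3, seed=0):
    """Create synthetic points so each cluster reaches the largest cluster's size.

    Assumes X holds complete numeric rows (i.e. after imputation) and that
    labels holds one cluster id per row. Returns (point, cluster_id) pairs.
    """
    rng = random.Random(seed)
    clusters = {}
    for row, lab in zip(X, labels):
        clusters.setdefault(lab, []).append(row)
    target = max(len(rows) for rows in clusters.values())

    synthetic = []
    for lab, rows in clusters.items():
        if len(rows) < 2:
            continue                       # nothing to interpolate with
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            # k nearest neighbours of `base` inside the same cluster
            neighbours = sorted(
                (r for r in rows if r is not base),
                key=lambda r: math.dist(base, r),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()             # position along the segment
            point = [b + gap * (o - b) for b, o in zip(base, other)]
            synthetic.append((point, lab))
    return synthetic


if __name__ == "__main__":
    # One numeric, one nominal, and one missing feature.
    print(heom([1.0, "a", None], [3.0, "b", 5.0],
               ranges=[4.0, None, 10.0], nominal={1}))
    X = [[0, 0], [1, 1], [0, 1], [10, 10], [11, 11]]
    print(smote_within_clusters(X, [0, 0, 0, 1, 1]))
```

Interpolating only between members of the same cluster keeps synthetic patients inside an existing patient profile, which matches the abstract's stated aim of reducing the impact of small underlying profiles. The resulting augmented set would then be used to train a classifier such as logistic regression or a neural network.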
Language: English
Type (Professor's evaluation): Scientific
No. of pages: 11
Related Publications

Of the same journal

Relational machine learning for electronic health record-driven phenotyping (2014)
Article in International Scientific Journal
Peggy L Peissig; Vitor Santos Costa; Michael D Caldwell; Carla Rottscheit; Richard L Berg; Eneida A Mendonca; David Page
Multisource and temporal variability in Portuguese hospital administrative datasets: Data quality implications (2022)
Article in International Scientific Journal
Souza, J; Caballero, I; Santos, JV; Lobo, M; Pinto, A; Viana, J; Saez, C; Lopes, F; Freitas A
Healthcare hashtag index development: Identifying global impact in social media. (2016)
Article in International Scientific Journal
Ana Luisa Neves
Facegram - Objective quantitative analysis in facial reconstructive surgery (2016)
Article in International Scientific Journal
Geros, A; Horta, R; Paulo Aguiar
CMIID: A comprehensive medical information identifier for clinical search harmonization in Data Safe Havens (2021)
Article in International Scientific Journal
Domingues, MAP; Rui Camacho; Pedro Pereira Rodrigues
