Master's Dissertation Defense: Impact of Subdomain Identification on the Efficiency of Machine Learning Models
Speakers
Student: Gerardo Samuel Rojas Torres
Useful information
Advisor:
Fabio André Machado Porto - Laboratório Nacional de Computação Científica - LNCC
Examination Committee:
Fabio André Machado Porto - Laboratório Nacional de Computação Científica - LNCC (chair)
Marcio Rentes Borges - Laboratório Nacional de Computação Científica - LNCC
Eduardo Ogasawara - CEFET
Alternates:
Gilson Antônio Giraldi - Laboratório Nacional de Computação Científica - LNCC
Eduardo Bezerra da Silva - Centro Federal de Educação Tecnológica Celso Suckow da Fonseca - CEFET-RJ
Abstract: This research explores the use of clustering techniques to enhance predictive modeling performance for multivariate time series data. The objective is to determine whether training models on clusters can achieve results that are comparable to or better than those of a single global model trained on the entire dataset.
K-medoids and quadtree-based algorithms were employed to create clusters, utilizing Dynamic Time Warping (DTW) as the dissimilarity measure for both methods. For the quadtree approach, entropy was additionally used as a criterion for partitioning the input space.
Long Short-Term Memory (LSTM) networks were trained on each cluster and evaluated, with their performance compared against that of the global model. This setup tests the hypothesis that models trained on clustered subsets can match or improve on the predictive accuracy of the global model while potentially offering computational or analytical advantages.
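
To make the clustering step mentioned in the abstract concrete, here is a minimal sketch of k-medoids over a DTW dissimilarity matrix. It is not the dissertation's implementation: the function names `dtw_distance` and `k_medoids`, the Euclidean local cost, the farthest-first initialization, and the simple alternating update are all assumptions chosen for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two multivariate series.

    Each series is a list of feature vectors (one tuple per time step).
    The local cost is Euclidean distance, an assumption made for this sketch.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def k_medoids(series, k, max_iter=100):
    """Alternating k-medoids on a precomputed DTW matrix (illustrative only)."""
    n = len(series)
    # Pairwise DTW dissimilarities; the matrix is symmetric with a zero diagonal.
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])

    def assign(meds):
        # Label each series with the position of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    # Deterministic farthest-first initialization (an assumption, not the
    # author's stated choice): start at index 0, then repeatedly add the
    # series farthest from the medoids chosen so far.
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if members:
                # New medoid: the member with the smallest total distance to its cluster.
                new_medoids.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
            else:
                new_medoids.append(medoids[c])  # keep the old medoid if the cluster emptied
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)
```

Calling `k_medoids(series, k=2)` returns the medoid indices and one cluster label per series; a separate model could then be trained on the series sharing each label. The pairwise DTW matrix costs quadratic time in the number of series, so a sketch like this only suits small collections.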