Use this identifier to cite or link to this record: https://hdl.handle.net/1822/90257

Title: Subgroup mining for performance analysis of regression models
Author(s): Pimentel, Joao; Azevedo, Paulo J.; Torgo, Luis
Keywords: interpretability; machine learning; performance; regression
Date: 2022
Publisher: Wiley
Journal: Expert Systems
Citation: Pimentel, J., Azevedo, P. J., & Torgo, L. (2022, August 9). Subgroup mining for performance analysis of regression models. Expert Systems. Wiley. http://doi.org/10.1111/exsy.13118
Abstract: Machine learning algorithms have shown several advantages compared to humans, namely in terms of the scale of data that can be analysed, delivering high speed and precision. However, it is not always possible to understand how algorithms work. As a result of the complexity of some algorithms, users started to feel the need to ask for explanations, boosting the relevance of Explainable Artificial Intelligence. This field aims to explain and interpret models with the use of specific analytical methods that usually analyse how their predicted values and/or errors behave. While prediction analysis is widely studied, performance analysis has limitations for regression models. This paper proposes a rule-based approach, Error Distribution Rules (EDRs), to uncover atypical error regions, while considering multivariate feature interactions without size restrictions. Extracting EDRs is a form of subgroup mining. EDRs are model agnostic and a drill-down technique to evaluate regression models, which consider multivariate interactions between predictors. EDRs uncover regions of the input space with deviating performance providing an interpretable description of these regions. They can be regarded as a complementary tool to the standard reporting of the expected average predictive performance. Moreover, by providing interpretable descriptions of these specific regions, EDRs allow end users to understand the dangers of using regression tools for some specific cases that fall on these regions, that is, they improve the accountability of models. The performance of several models from different problems was studied, showing that our proposal allows the analysis of many situations and direct model comparison. In order to facilitate the examination of rules, two visualization tools based on boxplots and density plots were implemented. A network visualization tool is also provided to rapidly check interactions of every feature condition.
An additional tool is provided by using a grid of boxplots, where comparison between quartiles o
Type: Article
Description: The data that support the findings of this study are openly available in Kaggle at https://www.kaggle.com/datasets/mohansacharya/graduate-admissions?select=Admission_Predict_Ver1.1.csv
URI: https://hdl.handle.net/1822/90257
DOI: 10.1111/exsy.13118
ISSN: 0266-4720
Publisher version: https://onlinelibrary.wiley.com/doi/10.1111/exsy.13118
Peer reviewed: yes
Access: Open access
Appears in collections: HASLab - Artigos em revistas internacionais

Files in this record:
File: Expert Systems - 2022 - Pimentel - Subgroup mining for performance analysis of regression models.pdf
Size: 734.08 kB
Format: Adobe PDF
