Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/31235
Appears in Collections: Computing Science and Mathematics Conference Papers and Proceedings
Author(s): Pimenta, Cristiano G
de Sá, Alex G C
Ochoa, Gabriela
Pappa, Gisele L
Title: Fitness Landscape Analysis of Automated Machine Learning Search Spaces
Editor(s): Paquete, Luís
Zarges, Christine
Citation: Pimenta CG, de Sá AGC, Ochoa G & Pappa GL (2020) Fitness Landscape Analysis of Automated Machine Learning Search Spaces. In: Paquete L & Zarges C (eds.) Evolutionary Computation in Combinatorial Optimization. EvoCOP 2020. Lecture Notes in Computer Science, 12102. EvoCOP 2020: Evolutionary Computation in Combinatorial Optimization, Seville, Spain, 15.04.2020-17.04.2020. Cham, Switzerland: Springer International Publishing, pp. 114-130. https://doi.org/10.1007/978-3-030-43680-3_8
Issue Date: 2020
Date Deposited: 3-Jun-2020
Series/Report no.: Lecture Notes in Computer Science, 12102
Conference Name: EvoCOP 2020: Evolutionary Computation in Combinatorial Optimization
Conference Dates: 2020-04-15 - 2020-04-17
Conference Location: Seville, Spain
Abstract: The main goal of Automated Machine Learning (AutoML) is to automate the creation of complete Machine Learning (ML) pipelines for any dataset without requiring deep user expertise in ML. Several AutoML methods have been proposed so far, but none clearly stands out. Furthermore, there is a lack of studies on the characteristics of the fitness landscape of AutoML search spaces. Such analysis may help to understand the performance of different optimization methods for AutoML and how to improve them. This paper adapts classic fitness landscape analysis measures to the context of AutoML. This is a challenging task, as AutoML search spaces include discrete, continuous, categorical and conditional hyperparameters. We propose an ML pipeline representation, a neighborhood definition and a distance metric between pipelines, and use them in the evaluation of the fitness distance correlation (FDC) and the neutrality ratio for a given AutoML search space. Results of FDC are counter-intuitive and require a more in-depth analysis of a range of search spaces. Results of neutrality, in turn, show a strong positive correlation between the mean neutrality ratio and the fitness value.
Status: AM - Accepted Manuscript
Rights: This is a post-peer-review, pre-copyedit version of a paper published in Zarges C & Paquete L (eds.) Evolutionary Computation in Combinatorial Optimization. EvoCOP 2020. Lecture Notes in Computer Science, 12102. EvoCOP 2020: European Conference on Evolutionary Computation in Combinatorial Optimization, Seville, Spain, 15.04.2020-17.04.2020. Cham, Switzerland: Springer, pp. 197-213. The final authenticated version is available online at: https://doi.org/10.1007/978-3-030-43680-3_8
Licence URL(s): https://storre.stir.ac.uk/STORREEndUserLicence.pdf

Files in This Item:
File                     Description                   Size        Format
FLAutoMLEvoCOP2020.pdf   Fulltext - Accepted Version   517.24 kB   Adobe PDF

This item is protected by original copyright

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/

If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk with details, and we will remove the Work from public display in STORRE and investigate your claim.