Ingeniería y Desarrollo
Print version ISSN 0122-3461
On-line version ISSN 2145-9371
Abstract
RAMIREZ, Juan Sebastián and DUQUE-MENDEZ, Néstor. Evaluation of Unsupervised Machine Learning Algorithms with Climate Data. Ing. Desarro. [online]. 2022, vol.40, n.2, pp.131-165. Epub Apr 10, 2023. ISSN 0122-3461. https://doi.org/10.14482/inde.40.02.622.553.
When using climate data, researchers have difficulty determining which clustering algorithm and which parameters perform best for a specific dataset. We evaluated the following unsupervised machine learning algorithms: K-means, K-medoids, and complete-linkage hierarchical clustering. They are applied to three datasets with climatological variables (temperature, rainfall, relative humidity, and solar radiation) from three meteorological stations located in the department of Caldas, Colombia, at different elevations above sea level. For each meteorological station, five scenarios are defined for 2, 3, and 5 clusters for each of the two partitional algorithms, and five scenarios for the hierarchical algorithm. The scenarios use different numbers and groupings of variables, with Euclidean distance as the distance measure. Cluster quality is evaluated with the Davies-Bouldin index. Normalization techniques such as range transformation and Z-transformation are applied, along with several iterations of the algorithm and dimensionality reduction with PCA. In addition, the computational cost is evaluated. This study can guide researchers on certain decisions in cluster analysis of meteorological data, and help identify the algorithm and parameters that matter most for the best performance, according to particular conditions and requirements.
Keywords: Climate; clustering; machine learning; K-means; K-medoids.
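To make the evaluation steps named in the abstract concrete, the following is a minimal Python sketch of range transformation, Z-transformation, and the Davies-Bouldin index with Euclidean distance. It is an illustration under stated assumptions, not the authors' implementation: the function names and the sample values are hypothetical, the cluster labels are supplied by hand rather than produced by K-means, K-medoids, or complete linkage, and the index follows the standard definition (lower values indicate more compact, better separated clusters).

```python
import math


def range_transform(rows):
    """Rescale every column to [0, 1] (range transformation)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def z_transform(rows):
    """Center every column at 0 with unit population standard deviation."""
    cols = list(zip(*rows))
    mean = [sum(c) / len(c) for c in cols]
    std = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        for c, m in zip(cols, mean)
    ]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mean, std)]
        for row in rows
    ]


def davies_bouldin(rows, labels):
    """Davies-Bouldin index with Euclidean distance; lower is better.

    Assumes at least two clusters with distinct centroids.
    """
    clusters = {}
    for row, label in zip(rows, labels):
        clusters.setdefault(label, []).append(row)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    # Centroid of each cluster and mean distance of its points to it.
    centroids = {
        k: [sum(c) / len(pts) for c in zip(*pts)] for k, pts in clusters.items()
    }
    scatter = {
        k: sum(math.dist(p, centroids[k]) for p in pts) / len(pts)
        for k, pts in clusters.items()
    }

    # For each cluster, take its worst (largest) similarity ratio to any
    # other cluster, then average those worst cases.
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Hypothetical readings (temperature in degrees C, relative humidity in %),
# invented for illustration only.
readings = [[12.0, 90.0], [13.0, 88.0], [24.0, 60.0], [25.0, 62.0]]
scaled = z_transform(readings)
score_good = davies_bouldin(scaled, [0, 0, 1, 1])  # cool/humid vs warm/dry
score_poor = davies_bouldin(scaled, [0, 1, 0, 1])  # mixed grouping
```

Comparing `score_good` with `score_poor` shows how the index ranks candidate partitions of the same normalized data; in a fuller evaluation the labels would come from each clustering algorithm and each scenario.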