Fuzzy Inference Systems Tuning with Optimization Algorithms for Solar Flares Classification

Ramos, Liz Angélica; Bustos Pinzón, Alex Francisco; Melgarejo R, Miguel A; Vargas Domínguez, Santiago

doi:10.18180/tecciencia.2017.23.5

Tecciencia

Print version ISSN 1909-3667

Tecciencia vol. 12 no. 23, Bogotá, Jul./Dec. 2017

Liz Angélica Ramos¹*

Alex Francisco Bustos Pinzón¹

Miguel A Melgarejo R¹

Santiago Vargas Domínguez²

¹ Universidad Distrital Francisco José de Caldas, Bogotá, Colombia

² Universidad Nacional de Colombia, Bogotá, Colombia

Abstract

In this work we describe the implementation and analysis of different optimization algorithms used to find the best set of parameters for a Fuzzy Inference System (FIS) intended to classify solar flares. Each algorithm searches for the parameters within a universe of possible solutions, and the resulting systems are tested on the specific task of classifying solar flares.

Keywords: ANFIS; EBDF; Solar Flares; Fuzzy Sets

Introduction

The Sun is the main driver of the varying conditions of the interplanetary medium, particularly in the space surrounding our planet, in what is commonly known as space weather. Multiple solar phenomena show up at many spatial and temporal scales, and are studied through observations, theoretical models and simulations. Solar flares are among the most energetic phenomena in the solar system. They are transient events associated with the activity of the star, in which certain regions of the solar atmosphere can emit a vast amount of energy, up to 10²⁵ Joules.

These regions of the solar atmosphere are associated with the presence of dark spots on the solar surface (photosphere) called sunspots. Sunspots are the manifestation of intense magnetic fields emerging from the solar interior and crossing the photosphere, inhibiting the normal convection of solar plasma and thus reducing the radiation emission.

For this reason the temperature in sunspots is approximately 2000 K lower than in the non-active photosphere, known as the quiet Sun. Sunspots are proxies of solar activity: their number on the solar disk was used to discover the solar cycle in 1843 [1], and they are the main constituents of the so-called solar active regions.

Table 1. Classification of Solar Flares. Source: Based on [3].

Solar activity has become a very important research topic due to its connection with space weather and the possible impact of energetic phenomena on a technological society that relies on satellites, which could be affected by intense solar emissions [2].

Depending on the amount of energy released (flux in W m⁻²) during the intensity peak of the flaring event, solar flares are classified as A, B, C, M or X, as listed in Table 1. The effects of a flare also depend on its class [3].
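As an illustration of this classification, the following minimal Python sketch maps a peak flux to a class letter. It assumes the standard GOES soft X-ray decade thresholds (class X at 10⁻⁴ W m⁻² and above, down to class A below 10⁻⁷ W m⁻²), which are expected to agree with Table 1; the names `GOES_LOWER_BOUNDS` and `goes_class` are hypothetical and not part of the original work.

```python
# Lower bounds of the standard GOES soft X-ray classes, in W/m^2.
# Assumption: these decade thresholds match Table 1 of the article.
GOES_LOWER_BOUNDS = [("X", 1e-4), ("M", 1e-5), ("C", 1e-6), ("B", 1e-7)]


def goes_class(peak_flux):
    """Return the flare class letter for a peak flux given in W/m^2."""
    if peak_flux <= 0:
        raise ValueError("peak flux must be positive")
    # Check the classes from the most to the least energetic.
    for letter, lower_bound in GOES_LOWER_BOUNDS:
        if peak_flux >= lower_bound:
            return letter
    return "A"  # anything below 1e-7 W/m^2


if __name__ == "__main__":
    for flux in (5e-8, 3e-7, 5e-6, 1e-5, 2.3e-4):
        print(flux, goes_class(flux))
```

Only classes C, M and X appear in the database used in this work, so a classifier for this problem needs just the three upper classes.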

The main goal of this work is to choose the best Fuzzy Inference System (FIS) among those produced by several tuning methods, using a validation index. Taking the sunspot characteristics and the number of sunspots on the solar disk as inputs, each FIS produces a classification of the associated solar flares as its output.

The parameters of each system were tuned using five methods: Manual Tuning, Adaptive Neuro-Fuzzy Inference System (ANFIS) with random initialization [4], Compact Genetic Algorithm (CGA) [5], Differential Evolution (DE) [6] and Stochastic Hill Climbing (SHC) with random initialization [7].

The flow chart that describes the problem is shown in Figure 1, in which the "Problem in Nature" is the unknown mechanism that relates the input values to the output values observed in the Sun's behavior. This behavior should be emulated by the FIS. The validation index is a function of the expected output, generated by the Problem in Nature, and of the output obtained by the FIS.

The sunspot features and their associated flares were obtained by generating a database according to [2], through a cross search in the sunspot and solar flare catalogs of the National Geophysical Data Center (NGDC). The cross search used a time window of 6 hours on the records from 1999 to 2002, covering the activity peak of Solar Cycle 23, and yielded a total of 1391 individual samples.

The number of samples in each class obtained with these parameters is given in Table 2.

Figure 1 Flow chart of the Global description for the Artificial Intelligence problem.

Note that the generated data is imbalanced: the number of class C (common) flares is large compared to the number of class M (moderate) flares. Similarly, class M has more data than class X (extreme), as expected from the activity the Sun displays during its cycle of approximately 11 years.

For brevity, the inputs of the database were numbered as follows:

1. Modified Zurich Class
2. Penumbra: Largest Spot
3. Sunspot Distribution
4. Normalized number of Sunspots

Scatter plots of pairs of inputs, such as those in Figure 2, show that no linear function separates the classes.

Figure 2 also makes clear that class M seems to be "absorbed" by class C. Furthermore, class X, having the smallest amount of data, is almost indistinguishable from class M. Therefore, attention is focused on classifying the class X solar flares.

2. Methodological Considerations

2.1 Fuzzy Inference System

A FIS consists of five components: a fuzzy rule base, a database that defines the membership functions of the fuzzy sets used in the fuzzy rules, the fuzzy inference engine, the fuzzifier and the defuzzifier [4].

Table 2. Data used by class.

Figure 2 Scatter plots of possible combinations from pairs of the inputs.

The FIS can be represented with a fuzzy basis function expansion in which an input vector x is related to a crisp output y, such that y = f(x). Thus, it is possible to represent the inference process of a FIS in a compact manner, and the resulting function is a universal approximator [5].

The FIS represented by (1) has the following characteristics:

Fuzzification: Singleton
Membership Functions: Gaussian
Implication: Product
Defuzzification: Average of centers

The index l refers to the l-th rule, M being the total number of rules. Likewise, the index i refers to the i-th input, and N is the total number of inputs. The membership function (MF) $\mu_{A_i^l}(x_i)$ is then unique for each input in every rule. Similarly, the center of the consequent set $y_l$ is unique in every rule [5].

The membership functions $\mu_{A_i^l}(x_i)$ are of Gaussian type, and can be written as (2).

Every MF in (2) has its own mean value c and standard deviation σ.
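As a reference for (1) and (2), the standard fuzzy basis function expansion for a FIS with singleton fuzzification, Gaussian membership functions, product implication and average-of-centers defuzzification can be written as follows. This is a reconstruction from the characteristics listed above rather than a transcription of the authors' equations; in particular, the factor 2 in the Gaussian exponent is an assumption, chosen so that σ acts as a standard deviation (some formulations omit it).

```latex
% (1) Fuzzy basis function expansion: M rules, N inputs
f(\mathbf{x}) =
  \frac{\sum_{l=1}^{M} y_l \prod_{i=1}^{N} \mu_{A_i^l}(x_i)}
       {\sum_{l=1}^{M} \prod_{i=1}^{N} \mu_{A_i^l}(x_i)}

% (2) Gaussian membership function of input i in rule l
\mu_{A_i^l}(x_i) =
  \exp\left( -\frac{\left(x_i - c_i^l\right)^2}{2\left(\sigma_i^l\right)^2} \right)
```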

The total number of parameters that define a FIS of the form (1) is given by (3), bearing in mind that, for each input in every rule, there are two parameters from the antecedent set (c and σ), and that each rule has one additional parameter, the center of its consequent.
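The description above implies a parameter count of M(2N + 1), and the inference itself reduces to a weighted average of the consequent centers. The following Python sketch evaluates such a FIS under those assumptions; the function names and the nested-list layout of the parameters are hypothetical, and the Gaussian uses the 2σ² convention, which the source does not confirm.

```python
import math


def gaussian_mf(x, c, sigma):
    """Gaussian membership degree of x for a set with mean c and deviation sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def fis_output(x, centers, sigmas, consequents):
    """Evaluate the FIS for one input vector.

    x:           list of N input values
    centers:     M x N nested list with the means c of the antecedent sets
    sigmas:      M x N nested list with the deviations sigma
    consequents: list of M consequent centers y_l
    """
    numerator = 0.0
    denominator = 0.0
    for c_row, s_row, y_l in zip(centers, sigmas, consequents):
        # Product implication: the firing strength of a rule is the product
        # of the membership degrees of all its inputs.
        firing = 1.0
        for x_i, c, s in zip(x, c_row, s_row):
            firing *= gaussian_mf(x_i, c, s)
        numerator += y_l * firing
        denominator += firing
    # Average of centers. The denominator can underflow to zero when x is
    # very far from every rule; this sketch does not guard against that.
    return numerator / denominator


def parameter_count(n_inputs, n_rules):
    """Two antecedent parameters per input and rule, plus one consequent per rule."""
    return n_rules * (2 * n_inputs + 1)


if __name__ == "__main__":
    # Four inputs and eight rules, as used later in this work.
    print(parameter_count(4, 8))
```

With the four inputs of the database and the eight rules selected later in this work, each individual is described by 72 real parameters.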

2.2 Manual Tuning Method

Starting from the authors' perceptions of the data and of the relations that may be present in it, it is possible to create an initial FIS with fuzzy sets for each of the inputs, crisp output values, and a rule base that links the fuzzy sets of the inputs to those outputs.

The purpose of this method is to gain insight into the problem, recognizing possible relationships among features and revealing preliminary classification rules.

Although a valid solution can be found, the most important result of this method is the knowledge derived from approaching the problem.

The software used initially was GNU Octave, loading the packages "io" and "fuzzy-logic-toolkit". The first allows Octave to read the generated CSV dataset, and the second is used to design, test and verify the manually tuned FIS.

Although the following algorithms were implemented in MATLAB, the final FIS created with Octave was migrated to MATLAB through the Fuzzy Logic Designer, a graphical tool that is part of the Fuzzy Logic Toolbox, with the sole purpose of using the same software tool at the final validation stage.

2.3 Adaptive Neuro-Fuzzy Inference System (ANFIS) with random initialization

ANFIS, a FIS based on adaptive networks, follows a supervised learning model: given a set of input/output pairs (x, y) related by an unknown function f, an apprentice learns f while a supervisor uses a validation metric to evaluate the apprentice's results and correct them. The algorithm uses a hybrid scheme that combines the least squares method with gradient descent (back-propagation).

In this case the apprentice is a fuzzy system that can be written as the fuzzy basis function expansion for a Sugeno type system shown in (1). The parameters to be determined are $y_l$, $c_i^l$ and $\sigma_i^l$ [4]. The validation metric is the root mean square error (RMSE) between the output value of the fuzzy apprentice system and the output value y of the data pairs [5]. The process aims at minimizing the error for the inputs in a training set comprising part of the available data, generally about 70% of it. To check how well the apprentice generalizes, it is validated with the remaining 30% of the database.

In addition to the individual (the apprentice system with its parameters and rules) to be adjusted, ANFIS requires initial conditions such as the number of rules, the number of inputs and the initial learning rate. In the case at hand, the number of inputs stays constant and the other two parameters are tuned. Because ANFIS fits the parameters of an existing individual, which implies a local search, it is executed several times, each time starting from an individual whose parameters are initialized to random values, with the aim of approximating (depending on the randomization) a global search over the whole universe of possible solutions.
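The 70/30 partition mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the split is a plain random shuffle, and whether the authors stratified it by flare class is not stated in the text; the function name, the `seed` argument and the rounding rule are hypothetical.

```python
import random


def split_train_validation(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # 1391 is the number of samples in the database described earlier.
    train, validation = split_train_validation(range(1391))
    print(len(train), len(validation))
```

With an imbalanced database such as this one, a purely random split can leave very few class X samples in the validation set, so a per-class (stratified) split may be worth considering.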

Algorithm 1 Pseudo code for the MATLAB implementation using the ANFIS function.

2.4 Compact Genetic Algorithm (CGA)

This method belongs to a family of algorithms known as Probabilistic Model Building Genetic Algorithms (PMBGAs) [8], which are characterized by identifying the attributes that contribute significantly to the construction of an optimal individual. The validation index that determines the performance of an individual is the "Fitness" function, which in turn depends on the problem to be solved. The implementation considers the individual with the best performance to be the one that minimizes this function.

Because this work deals with a classification problem, we decided to consider the classification error and the correlation in addition to the RMSE. With that in mind, we can assemble an initial outline of a fitness function (4).

And

Where:

Every classification error E_Cx has its respective weight w_x. As the database is inherently imbalanced, every weight w_x was assigned to be greater than the ratio of the amount of data belonging to class C to the amount of data from the other classes:

Therefore, the weight associated with class X solar flares, for which the amount of data is lowest, has the highest value. In this way, a misclassified sample that belongs to this class produces a more significant increase in the first factor of (4) than a misclassified sample of class C, in the final fitness function factors (6).

To explain the root mean square error E_Rmse in (4), suppose that the problem is not a classification problem but a prediction problem. Conceptually, E_Rmse gives an idea of how far the individual is from "following" the expected sequence of the training data [5].

Then, assuming that the data also depends on some time unit, a bad predictor will have a greater E_Rmse value than one that gets closer to the output values of the database. The root mean square error is mathematically described as:

Where

v_o is the obtained value
v_e is the expected value
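The two error terms discussed above can be sketched in Python as follows. The RMSE is the standard definition, the square root of the mean of (v_o - v_e)² over all samples. The weighted classification error is only an assumption about the shape of the first factor of (4): a sum of per-class error rates E_Cx multiplied by their weights w_x. How the authors combine these terms with the correlation is not shown here, and the weight values in the example are made up for illustration.

```python
import math


def rmse(obtained, expected):
    """Root mean square error between obtained and expected values."""
    if len(obtained) != len(expected) or not obtained:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((v_o - v_e) ** 2 for v_o, v_e in zip(obtained, expected))
    return math.sqrt(squared / len(obtained))


def class_error(predicted, expected, label):
    """Fraction of the samples of class `label` that were misclassified."""
    total = sum(1 for e in expected if e == label)
    if total == 0:
        return 0.0
    wrong = sum(1 for p, e in zip(predicted, expected) if e == label and p != label)
    return wrong / total


def weighted_class_error(predicted, expected, weights):
    """Assumed form of the classification term: sum over classes of w_x * E_Cx."""
    return sum(w * class_error(predicted, expected, label)
               for label, w in weights.items())


if __name__ == "__main__":
    expected = ["C", "C", "M", "X"]
    predicted = ["C", "M", "M", "C"]
    # Hypothetical weights: the rarest class gets the largest weight.
    print(weighted_class_error(predicted, expected, {"C": 1.0, "M": 2.0, "X": 4.0}))
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```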

The number of rules was taken from the result obtained with the ANFIS algorithm, R = 8 rules. To run the algorithm, the parameter n, which adjusts the convergence speed of the probability vector, must be tuned. Since its optimal value is unknown, it is assigned randomly, following [5], in the MATLAB implementation. The process of randomly varying n and running the algorithm is repeated several times (w = number of experiments). Finally, among the best solutions, the one yielding the lowest value of (4) with (6) is selected.

Algorithm 2 Pseudo code for CGA. Based on [5].

The parameters describing every FIS (individual) are converted from real to binary values, because the method adjusts every bit.
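The bit-level search described above can be sketched with the textbook compact genetic algorithm, in which a probability vector replaces the population and is shifted by 1/n towards the winner of each pairwise competition. This is a generic sketch, not the authors' Algorithm 2: the fixed-point decoding range, the stopping rule and all names are assumptions, and the variant in [5] may differ.

```python
import random


def decode(bits, n_params, bits_per_param, low, high):
    """Map a bit list to n_params fixed-point values in [low, high]."""
    scale = (high - low) / (2 ** bits_per_param - 1)
    values = []
    for k in range(n_params):
        chunk = bits[k * bits_per_param:(k + 1) * bits_per_param]
        integer = int("".join(str(b) for b in chunk), 2)
        values.append(low + integer * scale)
    return values


def compact_ga(fitness, n_bits, n, max_iters=20000, seed=0):
    """Minimize fitness(bits); n sets the update step 1/n of the probability vector."""
    rng = random.Random(seed)
    p = [0.5] * n_bits  # probability of each bit being 1
    best, best_fit = None, float("inf")
    for _ in range(max_iters):
        # Sample two individuals from the probability vector and compare them.
        a = [1 if rng.random() < p_i else 0 for p_i in p]
        b = [1 if rng.random() < p_i else 0 for p_i in p]
        fit_a, fit_b = fitness(a), fitness(b)
        winner, loser = (a, b) if fit_a <= fit_b else (b, a)
        if min(fit_a, fit_b) < best_fit:
            best, best_fit = winner, min(fit_a, fit_b)
        # Shift the vector towards the winner wherever the two bits differ.
        for i in range(n_bits):
            if winner[i] != loser[i]:
                step = 1.0 / n if winner[i] == 1 else -1.0 / n
                p[i] = min(1.0, max(0.0, p[i] + step))
        # Stop once every probability has saturated at 0 or 1.
        if all(p_i <= 0.0 or p_i >= 1.0 for p_i in p):
            break
    return best, best_fit


if __name__ == "__main__":
    # Toy problem: find the 8-bit fixed-point value in [0, 1] closest to 0.25.
    toy = lambda bits: (decode(bits, 1, 8, 0.0, 1.0)[0] - 0.25) ** 2
    print(compact_ga(toy, n_bits=8, n=41))
```

A larger n makes each update smaller, so the probability vector converges more slowly but is less likely to commit early to a poor bit value.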

2.5 Differential Evolution

This is an algorithm based on the evolution of a population of vectors (individuals) with real parameters, which represent solutions in the searching space.

The differential evolution algorithm is basically composed of four steps, as follows:

Initialization: Every vector (individual) of the population is randomly initialized.
Mutation: For every target vector, a mutant vector is created by perturbing a population member with the scaled difference of other members.
Crossover: Every mutant vector is combined with its target vector to produce a trial vector.
Selection: The trial vector competes with the target vector through the evaluation of the Fitness function, and the better one survives [6].
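The four steps above can be sketched with the common DE/rand/1/bin scheme. This is a generic illustration, not the authors' Algorithm 3: the strategy, the values of the scale factor `f` and crossover rate `cr`, the clipping to bounds and all names are assumptions.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Minimize fitness(vector) with a DE/rand/1/bin sketch."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialization: random vectors inside the given bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for t in range(pop_size):
            # Mutation: base vector plus the scaled difference of two others.
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != t], 3)
            mutant = [pop[r1][j] + f * (pop[r2][j] - pop[r3][j]) for j in range(dim)]
            # Crossover: mix mutant and target; j_rand forces at least one mutant gene.
            j_rand = rng.randrange(dim)
            trial = [mutant[j] if (rng.random() < cr or j == j_rand) else pop[t][j]
                     for j in range(dim)]
            trial = [min(hi, max(lo, v)) for v, (lo, hi) in zip(trial, bounds)]
            # Selection: the trial replaces the target if it is not worse.
            trial_fit = fitness(trial)
            if trial_fit <= fit[t]:
                pop[t], fit[t] = trial, trial_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]


if __name__ == "__main__":
    sphere = lambda v: sum(x * x for x in v)
    print(differential_evolution(sphere, [(-5.0, 5.0)] * 3, generations=200))
```

For the FIS problem, each vector would hold the real-valued parameters of one individual and `fitness` would be the function (4) evaluated on the training data.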

Algorithm 3 Pseudo code for DE. Source: Based on [6].

2.6 Stochastic Hill Climbing (SHC) with random initialization

Stochastic Hill Climbing consists of taking a FIS (1) and repeatedly evaluating solutions in its vicinity [7], [9], up to a maximum number of iterations. The parameters of the input FIS are randomly initialized.

Algorithm 4 Pseudo code for Stochastic Hill Climbing [10].

Here:

Imax: Maximum number of iterations
S: A particular solution (such as Current or Candidate)
Cost(S): Fitness function, obeys (2)

RandomNeighbor(Current) in Algorithm 4 also requires the center and deviation variations, which are the maximum absolute changes allowed in the related parameters when searching for a neighbor. For example, if a parameter has the value 0.6 and its specified variation is 0.1, then the neighbor will take a uniformly distributed random value between 0.5 and 0.7 for that parameter.

Every separate experiment consists of a single run of a program that implements Algorithm 4 and yields a single final individual, so n individuals can be obtained by running n experiments. Afterwards, the individuals can be evaluated with (4) on the validation base in order to choose the best of the n individuals.
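The neighbor search and the acceptance loop described above can be sketched as follows. This is a minimal illustration and not the authors' Algorithm 4: it assumes that a candidate is accepted whenever its cost is not worse than the current one, and the function names mirror the pseudocode labels only loosely.

```python
import random


def random_neighbor(current, variations, rng):
    """Perturb each parameter uniformly within +/- its allowed variation."""
    return [value + rng.uniform(-delta, delta)
            for value, delta in zip(current, variations)]


def stochastic_hill_climbing(cost, initial, variations, i_max, seed=0):
    """Keep a single solution and move to a random neighbor when it is not worse."""
    rng = random.Random(seed)
    current, current_cost = list(initial), cost(initial)
    for _ in range(i_max):
        candidate = random_neighbor(current, variations, rng)
        candidate_cost = cost(candidate)
        if candidate_cost <= current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost


if __name__ == "__main__":
    # A parameter at 0.6 with variation 0.1 yields neighbors between 0.5 and 0.7.
    rng = random.Random(1)
    print(random_neighbor([0.6], [0.1], rng))
    # Toy cost with its minimum at 0.3.
    print(stochastic_hill_climbing(lambda v: (v[0] - 0.3) ** 2, [0.6], [0.1], i_max=2000))
```

Running the function with different seeds corresponds to the separate experiments described above, each producing one individual.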

2.7 Confusion Matrices

The classifier output takes one of C values, corresponding to the classes ω₁, ω₂, …, ω_C. Because erroneous classifications occasionally occur, the multiclass classifier is evaluated through a (C × C)-dimensional confusion matrix showing the classification errors between classes (off-diagonal elements) and the correct classifications (diagonal elements) [11].

Table 3 shows an example of a confusion matrix for a total of C = 3 classes. Each element $C_{\omega_i \omega_j}$ is the number of data from class ω_i that were classified as elements of class ω_j.
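A confusion matrix of this kind can be built with a few lines of Python. The sketch below follows the convention stated above (rows are expected classes, columns are assigned classes); the function name and the sample labels are made up for illustration.

```python
def confusion_matrix(expected, predicted, classes):
    """Count, for each expected class (row), how samples were classified (column)."""
    index = {label: k for k, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for e, p in zip(expected, predicted):
        matrix[index[e]][index[p]] += 1
    return matrix


if __name__ == "__main__":
    expected = ["C", "C", "M", "X", "X"]
    predicted = ["C", "M", "M", "C", "X"]
    for row in confusion_matrix(expected, predicted, ["C", "M", "X"]):
        print(row)
```

The diagonal holds the correct classifications, so the per-class error rate is one minus the diagonal element divided by the sum of its row.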

3. Parameters for the Algorithms

Excluding the manually tuned FIS, and in order to allow similar results to be reproduced, we briefly present the parameters used for the algorithms. For the CGA, DE and SHC algorithms, the number of rules was taken from the best ANFIS result, as shown in Table 3.

Table 3 Confusion Matrix for a three-class classification problem. Source: Based on [11].

Table 4 Classification of Solar Flares.

Table 5 Initialization Parameters for the implementation of ANFIS with random initialization.

Table 6. Initialization Parameters for the Compact Genetic Algorithm (CGA) implementation.

Table 7. Initialization Parameters for the Differential Evolution (DE) Algorithm implementation.

3.1 Manual Tuning

As the parameters for this method follow human perceptions of the problem, only the main features are shown in Table 4. For the same reason, this method was applied only as an exercise comparing human and machine performance in building a FIS that solves the classification problem. These values are not normative, precisely because the parameters were based on the authors' perceptions; other values may be tested, but the manual tuning method takes too much time to produce a single FIS.

Tables 5-8 show the initialization parameters for each implementation.

Table 8 Parameters for the implementation of the Stochastic Hill Climbing (SHC) with random initialization algorithm.

4. Results

In this section we show first the best results for every method and their analysis. This analysis includes a comparison of their performance.

4.1 Confusion Matrices

The best FIS obtained by each algorithm was evaluated using the whole database. With the evaluated output values and the expected output values a confusion matrix can be filled as shown in Table 3 to obtain the matrices shown in Tables 9, 11, 12, 13 and 14.

In the case of ANFIS, the individual with the lowest validation error was selected for each of the different combinations of number of rules and initial learning rate (LR) as shown in Table 10.

Table 9 Confusion Matrix for the manual tuned FIS.

Table 10 List of the lowest validation error (RMSE) for every 𝑛 test.

Table 11 Confusion Matrix for the best ANFIS individual chosen.

Table 12 Confusion Matrix for the best individual obtained by CGA.

Table 13 Confusion Matrix for the best individual obtained by the DE Algorithm.

Table 14 Confusion Matrix for the best individual obtained by the SHC Algorithm.

Table 15 Validation Errors for the best functions obtained.

From Table 10 the best individual is chosen to build the confusion matrix shown in Table 11. In order to compare the results with the same metric, this individual was evaluated with (4) and its results are part of Table 15. The chosen individual was obtained with the following parameters:

Rules = 8
Learning Rate (LR) = 1

The best FIS obtained by the CGA occurred in experiment w = 175, for a value n = 41 of the probability adjustment parameter.

Table 16 Results for the Welch's t-test between DE and CGA.

Table 17 Results for the Welch's t-test between DE and ANFIS with Random Initialization.

Table 18 Results for the Welch's t-test between DE and SHC.

4.2 Final Result by the Validation Metric

Table 15 lists the most relevant metrics for the individuals in every scheme. The final individual was the one with the lowest value of the Fitness function (4), using the validation database.

4.3 Statistical Analysis

To perform a statistical analysis of the implemented algorithms, Welch's two-sample t-test, which assumes unequal variances, was used to test the null hypothesis that both methods provide similar analytical results [12].

Comparing DE with the CGA and ANFIS algorithms, as shown in Tables 16 and 17 respectively, it is possible to reject the null hypothesis and conclude that the methods provide different analytical results with a 99% confidence level.

On the other hand, Table 18 shows that, although the best solution was achieved with the DE algorithm, the average and the variance of the fitness of the individuals obtained with SHC are better than those obtained with DE. This result makes sense in the light of the no free lunch theorems [13], which state that optimization methods perform similarly on average over the entire set of possible optimization problems.

The result of Welch's t-test shows that the null hypothesis should not be rejected, because the confidence level for rejection is less than 20% in the two-tailed case and less than 60% in the one-tailed case. Therefore, there is no evidence that the average results of both methods differ, and the observed differences are consistent with random variation.
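The statistic behind these comparisons can be sketched with the Python standard library. The code computes Welch's t value and the Welch-Satterthwaite degrees of freedom for two samples of fitness values; the sample data are made up, and the p-value is left out because it requires the t distribution (for instance via `scipy.stats.ttest_ind(a, b, equal_var=False)`, if SciPy is available).

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean (unbiased variances).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical fitness values from two tuning methods.
    print(welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0]))
```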

Conclusions

In this section we summarize the results obtained and discuss different aspects of the methods' performance.

Due to the imbalance in the database, the systems and algorithms used in the present work have limited opportunities to learn from class M, and far fewer from class X.

Additionally, for ANFIS, and for the same reason, the RMSE validation metric is not adequate for solving the problem since it ignores the classification error; as a result, the best individual obtained with this method is an optimal class C classifier, but not a good classifier for the remaining classes.

Although the Compact Genetic Algorithm has a simple description and requires little memory, it significantly restricts the space of solutions, since it works with parameters represented in fixed point, a smaller universe than that of a floating-point representation.

From the items listed above, and from Table 4, alternative formulations of the problem cannot be ruled out: either distinguishing whether or not class C solar flares occur (modifying the generation parameters of the database), or distinguishing between class M and class X solar flares. As future work, the problem can be addressed with other machine learning algorithms instead of FISs, e.g. Cascade-Correlation Neural Networks (CCNNs), Support Vector Machines (SVMs) and Radial Basis Function Networks (RBFNs) [2], in order to determine whether it is feasible to obtain a better classifier and thereby extend the problem to estimating the occurrence of solar flares.

References

[1] Heinrich Schwabe. "Sonnenbeobachtungen im Jahre 1843". (German) [Observations of the Sun in the year 1843]. In: Astronomische Nachrichten 21 (1843), pp. 233-236. DOI: 10.1002/asna.18440211505.

[2] R. Qahwaji and T. Colak. "Automatic Short-Term Solar Flare Prediction Using Machine Learning and Sunspot Associations". In: Solar Physics 241 (2005), pp. 195-211. DOI: 10.1007/s11207-006-0272-5.

[3] T. Bai and P. A. Sturrock. "Classification of solar flares". In: Annual Review of Astronomy and Astrophysics 27 (1989), pp. 421-467. DOI: 10.1146/annurev.aa.27.090189.002225.

[4] Jyh-Shing Roger Jang. "ANFIS: adaptive-network-based fuzzy inference system". In: IEEE Transactions on Systems, Man, and Cybernetics 23 (1993), pp. 665-685. DOI: 10.1109/21.256541.

[5] Miguel Melgarejo, Alvaro Prieto, and Carlos Ruiz. "Modelado de sistemas difusos basado en el algoritmo genético compacto". (Spanish) [Modeling of fuzzy systems based on the compact genetic algorithm]. In: Proceedings of ASAI 2011, Argentine Symposium on Artificial Intelligence. Universidad de Palermo, Buenos Aires, Argentina (2011), pp. 180-191.

[6] Andrea Villate, David Rincón, and Miguel Melgarejo. "Evolución diferencial aplicada a la sintonización de clasificadores difusos para el reconocimiento del lenguaje de señas". (Spanish) [Applying Differential Evolution to Tune Fuzzy Classifiers Intended for Sign-Language Recognition]. In: Ingeniería y Universidad: Engineering for Development 16 (2012), pp. 397-413.

[7] Stephan Rudlof and Mario Köppen. "Stochastic Hill Climbing with Learning by Vectors of Normal Distributions". In: Proceedings for Nagoya 1996, Online Workshop on Soft Computing (WSC) no. 1 (1996), pp. 60-70.

[8] Kumara Sastry and David E. Goldberg. "Probabilistic Model Building and Competent Genetic Programming". In: Genetic Programming Series vol. 6 (2003), pp. 205-220. DOI: 10.1007/978-1-4419-8983-3_13.

[9] Stuart J. Russell and Peter Norvig. "Artificial Intelligence: A Modern Approach". Pearson Education, 2003. ISBN: 0137903952.

[10] Jason Brownlee. "Clever Algorithms: Nature-Inspired Programming Recipes". Jason Brownlee, 2011. ISBN: 9781446785065.

[11] Thomas C.W. Landgrebe and Robert P.W. Duin. "Efficient Multiclass ROC Approximation by Decomposition via Confusion Matrix Perturbation Analysis". In: IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (2008), pp. 810-822. DOI: 10.1109/TPAMI.2007.70740.

[12] M. W. Fagerland and L. Sandvik. "Performance of five two-sample location tests for skewed distributions with unequal variances". In: Contemporary Clinical Trials 30, no. 5 (2009), pp. 490-496. DOI: 10.1016/j.cct.2009.06.007.

[13] D. H. Wolpert and W. G. Macready. "No free lunch theorems for optimization". In: IEEE Transactions on Evolutionary Computation 1, no. 1 (1997), pp. 67-82. DOI: 10.1109/4235.585893.

How to cite: Ramos, L., Bustos, A., Melgarejo, M., Vargas, S., Fuzzy Inference Systems Tuning with Optimization Algorithms for Solar Flares Classification, TECCIENCIA, Vol. 12 No. 23, 33-42, 2017. DOI: http://dx.doi.org/10.18180/tecciencia.2017.23.5

Received: May 07, 2017; Accepted: June 06, 2017

*Corresponding Author. E-mail: laramosm@correo.udistrital.edu.co

This is an open-access article distributed under the terms of the Creative Commons Attribution License