Geochemical characterization of oceanic basalts using Artificial Neural Network

Geochemical discrimination diagrams help to distinguish volcanics recovered from different tectonic settings, but these diagrams tend to group the ocean floor basalts (OFB) under one class, i.e., as mid-oceanic ridge basalts (MORB). Hence, a method is needed to specifically identify the OFB as normal (N-MORB), enriched (E-MORB) and ocean island basalts (OIB). We have applied an Artificial Neural Network (ANN) technique, the supervised Learning Vector Quantisation (LVQ), to identify the inherent geochemical signatures present in the Central Indian Ocean Basin (CIOB) basalts. A range of N-MORB, E-MORB and OIB data was used for training and testing of the network. Although the identification of the characters as N-MORB, E-MORB and OIB depends entirely on the training data set for the LVQ, to a significant extent this method is found to be successful in identifying the characters within the CIOB basalts. The study helped to geochemically delineate the CIOB basalts as N-MORB with perceptible imprints of E-MORB and OIB characteristics in the form of moderately enriched rare earth and incompatible elements. Even though the underlying magmatic processes are difficult to decipher, the architecture performs satisfactorily.


Introduction
Several discrimination diagrams have been proposed to classify the ocean floor basalts (OFB), recovered from different tectonic settings, into ocean island basalts (OIB), mid-oceanic ridge basalts (MORB) and island arc basalts (IAB). These diagrams are constructed by considering a variety of oxides and/or their ratios, for instance the triangular diagrams Ti/100-Zr-Y×3 of Pearce & Cann [1], Hf/3-Th-Ta of Wood et al. [2], TiO2-MnO×10-P2O5×10 of Mullen [3] and 2Nb-Zr/4-Y of Meschede [4]. Model-based geochemical studies classify the MORB into three types, namely normal, enriched (or plume) and transitional MORB (i.e., N-MORB, E- or P-MORB and T-MORB, respectively), or as OIB [5][6][7]. The discrimination diagrams provide a broad picture of the type of basalts, but it is difficult to determine the basic characters involved in the geochemical classification of OFB based solely on the above-mentioned elements and oxides. Recently, Sheth [8] considered several log-ratio and discriminant-analysis based diagrams to evaluate and classify the basalts into OIB, IAB and MORB. The suggested discrimination diagrams helped to distinguish the volcanics recovered from different tectonic settings but group the OFB under one class, i.e., as MORB. Hence, a method is needed to specifically identify the OFB as N-MORB, E/P-MORB and OIB. Therefore, beyond the conventional discrimination plots, a methodology is explored for an improved technique to characterise and evaluate the various basaltic characters in a geochemical dataset. We found that a hybrid Artificial Neural Network (ANN) architecture known as Learning Vector Quantisation (LVQ), which is a supervised network, could better help to characterise the OFB. As a supervised method, LVQ uses a known target output classification for each input pattern.
The LVQ architecture has been used extensively, for instance, for pattern recognition and seafloor classification [9] and for the characterisation of seafloor sediments [10]. In this communication we use the LVQ approach to determine the inherent geochemical characters of the Central Indian Ocean Basin (CIOB) basalts and to classify them.

Learning Vector Quantisation (LVQ) Architecture
An ANN is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is its novel structure, composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well.
LVQ constitutes a powerful and intuitive method for adaptive nearest-prototype classification. The LVQ architecture is based on a weight-updating rule to obtain the characteristics of the learning data. In a feed-forward ANN (Fig. 1) the data travel one way, from input to output, with no feedback loops, i.e., the output of any layer does not affect that same layer. A feed-forward ANN is thus a straightforward network that associates inputs with outputs. LVQ algorithms do not approximate density functions of class samples, as is the case for Vector Quantisation or Probabilistic Neural Networks, but directly define the class boundaries based on prototypes, a nearest-neighbour rule and a 'winner-takes-all' paradigm [11]. LVQ is an algorithm for learning classifiers from labeled data samples. Instead of modeling the class densities, it models the discrimination function defined by a set of labeled codebook vectors (CVs) and a nearest-neighbour search between the codebook and the data. During classification, a data point x_i is assigned to a class according to the class label of the closest CV. The training algorithm involves an iterative gradient update of the winner unit. The direction of the gradient update depends on the correctness of the classification using a nearest-neighbour rule in Euclidean space. If a data sample is correctly classified (i.e., the labels of the winner unit and the sample are the same), the model vector closest to the sample is attracted towards the sample; if incorrectly classified, the sample has a repulsive effect on the model vector.
The objective of LVQ is to cover the input space of samples with CVs, each representing a region labeled with a class. A CV can be considered a prototype of a class member, localized in the center of a class or decision region in the input space. A class can be represented by an arbitrary number of CVs, but one CV represents one class only. In terms of neural networks, an LVQ is a feed-forward net with one hidden layer of neurons, fully connected with the input layer. A CV can be seen as a hidden neuron ('Kohonen neuron') [11] or as the vector of weights between all input neurons and the concerned Kohonen neuron [12], respectively (Fig. 1). Here 'weights' refers to the values of the individual vectors in the matrix. In contrast to the standard LVQ, where the winner unit (neuron) is defined with a nearest-neighbour rule in Euclidean space, we now have a winner unit which minimizes the negative log-likelihood of the data. Equivalently, this maximum-likelihood unit m_c is defined by

m_c = arg max_k p(x | θ_k) = arg min_k [−log p(x | θ_k)]   (1)

where θ_k is the weight vector of the k-th CV.
'Learning' means modifying the values of the CVs in accordance with adaptation rules [11] and, therefore, changing the position of a CV in the input space. Since class boundaries are built of piecewise-linear segments of the mid-planes between CVs of neighbouring classes, these boundaries are adjusted during the learning process. The tessellation (a tessellation, or tiling, of the plane is a collection of figures that fills the plane with no overlaps and no gaps) induced by the set of CVs is optimal if all data within one cell indeed belong to the same class. Classification after learning is based on a presented sample's vicinity to the CVs: the classifier assigns to all samples that fall into the same tessellation cell the same class label, i.e., the label of the cell's prototype (the CV nearest to the sample).
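The nearest-prototype rule described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the two-dimensional CV positions and class labels are invented for demonstration, whereas the actual network used 21-dimensional geochemical vectors.

```python
import numpy as np

# Illustrative codebook: three CVs in a 2-D input space, each carrying
# a class label (placeholder values, not geochemical data).
codebook = np.array([[0.0, 0.0],
                     [1.0, 1.0],
                     [1.0, 0.0]])
cv_labels = np.array([0, 1, 1])

def classify(x):
    """Assign x the label of the nearest CV (Euclidean nearest-neighbour)."""
    dists = np.linalg.norm(codebook - x, axis=1)
    return int(cv_labels[np.argmin(dists)])

print(classify(np.array([0.2, 0.1])))  # nearest CV is [0, 0] -> prints 0
```

Note that a class may own several CVs (here class 1 owns two), but each CV belongs to exactly one class, matching the text.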
The core of the heuristics [11] is based on a distance function, usually the Euclidean distance, for comparison between an input vector and the class representatives. The Euclidean distance d(i) is calculated as

d(i) = ||x − w_i|| = sqrt( Σ_j (x_j − w_ij)² )   (2)

where w_i is the i-th CV. This distance expresses the degree of similarity between the presented input vector and the CVs: a shorter distance corresponds to a higher degree of similarity and a higher probability that the presented vector is a member of the class represented by the nearest CV. Therefore, the definition of class boundaries by LVQ depends strongly on the distance function, the start positions of the CVs, their adjustment rules and the pre-selection of distinctive input features. During the learning phase the winning CV w_c, selected by the nearest-neighbour rule for a data sample x(t), is updated according to

w_c(t+1) = w_c(t) ± α(t)[x(t) − w_c(t)]   (3)

where the sign depends on whether the data sample is correctly classified (+) or misclassified (−). The learning rate α(t) ∈ [0, 1] decreases monotonically with time. For different picks of data samples from our training set, this procedure is repeated iteratively until convergence occurs. Kohonen [12] also presents an optimized-learning-rate LVQ, in which the learning rate is individually optimized for each codebook. The learning function α for LVQ1 [10][11][12] uses small values and was optimized to 0.1/t^0.1 for right and 0.1/t^0.06 for wrong classifications. During the training and testing of LVQ1, the randomly generated weight matrix was tuned for a particular character in the data set. The LVQ1 network learns all the possible variations for a particular data set; to obtain the optimum number of iterations, we progressively changed the number of iteration steps from a small number to a large one while continuously observing the classification of the data. It was noticed that, irrespective of the number of neurons, 30 iterations were optimum for classifying the CIOB basalts.
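A single LVQ1 update combining equations 2 and 3 with the two learning-rate schedules quoted above can be sketched as below. The function name and argument layout are illustrative; only the update rule and the rates 0.1/t^0.1 and 0.1/t^0.06 come from the text.

```python
import numpy as np

def lvq1_step(codebook, cv_labels, x, label, t):
    """One LVQ1 update: attract the winning CV towards the sample if the
    labels match (+ sign in equation 3), repel it otherwise (- sign)."""
    # Winner by the nearest-neighbour rule (equation 2).
    winner = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
    correct = cv_labels[winner] == label
    # Learning rates as given in the text: 0.1/t^0.1 (right), 0.1/t^0.06 (wrong).
    alpha = 0.1 / t**0.1 if correct else 0.1 / t**0.06
    sign = 1.0 if correct else -1.0
    codebook[winner] += sign * alpha * (x - codebook[winner])  # equation 3
    return winner
```

Iterating this step over the training set, with t increasing so that α(t) decays, reproduces the monotonically decreasing learning rate described above.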
The basic LVQ algorithm, i.e., LVQ1, rewards correct classifications by moving the CV towards the presented input vector, whereas incorrect classifications are punished by moving the CV in the opposite direction. The magnitudes of these weight adjustments are controlled by a learning rate [11], which can be lowered over time so as to achieve finer movements in a later learning phase. Improved versions of LVQ1 are Kohonen's OLVQ1 (with a different learning rate for each CV in order to obtain faster convergence) and LVQ2, LVQ2.1 and LVQ3. Since LVQ1 tends to push CVs away from decision surfaces, a better approximation can be expected from adjusting two CVs belonging to adjacent classes. Therefore, in LVQ2 adaptation occurs only in regions with a few cases of mis-classification, in order to achieve finer and better class boundaries. While LVQ2 allows adaptation of correctly classifying CVs, LVQ3 leads to even more weight-adjusting operations due to its less restrictive adaptation rules.
The accuracy of classification and, therefore, the generalization and the speed of learning depend on several factors. Generally, the developer of an LVQ has to prepare a learning schedule and a plan as to which LVQ algorithm(s) (LVQ1, OLVQ1, LVQ2.1, etc.) should be used, with values for the main parameters during the different training phases. Also, the number of CVs for each class must be decided in order to reach a high classification accuracy and generalization while avoiding under- or over-fitting of the CVs. Additionally, the rules for stopping the learning process, as well as the initialization method (e.g., random values, values of randomly selected samples), determine the results.
In this study we have implemented the LVQ1 network to classify the CIOB basalts without placing emphasis on the geographical locations of the samples. The LVQ1 algorithm is such that if the class labels of the input vector and the closest matching reconstruction vector are the same, the weights are moved closer to the input vector; conversely, a mismatch between the two causes the weights to move away from the input vector. This concept is termed 'reward-punishment'. A randomly generated weight matrix is used as the initial weight distribution for LVQ1. The weight-update equations are implemented on the winning neuron for each input vector presented, with alternate testing and training throughout the dataset. The weight updating takes place following equation 3 above.
The LVQ1 was used as a single layer for classification of the CIOB basalts, and thirty-five samples, each containing twenty-one variables, were used to train the network. The LVQ1 testing was carried out on known and previously classified basalt data sets [13][14][15][16][17][18][19][20][21] so as to optimize the weight matrix and to store the characters of the training data. Optimization is a basic step that helps the network to classify unknown basalt data. It was found that an output-neuron grid of size 25 × 1, which represents the different classes, is most favorable for this study; an increase in the number of neurons would lead to more time to perform a specified task (Table 1). If the number of classified groups increases, then to avoid overlap the number of neurons can be increased, and LVQ2 and LVQ3 can be implemented to strengthen the classification. The LVQ1 architecture was written using Matlab 6.1 and the program was run on a P4 (1.70 GHz) computer with 256 MB RAM.
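The training configuration above (35 samples of 21 variables, a 25 × 1 output grid, 30 iterations) can be sketched end to end as follows. This is a hedged reconstruction in Python rather than the authors' Matlab code: the synthetic data, the random seed and the round-robin assignment of CVs to the three classes are placeholders, not the study's values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions from the text: 35 training samples, 21 variables each,
# a 25 x 1 output grid and 30 iterations.
n_train, n_vars, n_cvs, n_iter = 35, 21, 25, 30
X = rng.random((n_train, n_vars))        # stand-in for geochemical data
y = rng.integers(0, 3, n_train)          # 0 = N-MORB, 1 = E/P-MORB, 2 = OIB
codebook = rng.random((n_cvs, n_vars))   # random initial weight matrix
cv_labels = np.arange(n_cvs) % 3         # placeholder class assignment

for t in range(1, n_iter + 1):           # 30 iterations were found optimal
    for x, label in zip(X, y):
        winner = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
        correct = cv_labels[winner] == label
        alpha = 0.1 / t**0.1 if correct else 0.1 / t**0.06
        codebook[winner] += (1.0 if correct else -1.0) * alpha * (x - codebook[winner])

# Re-classify the training set with the tuned (final) weight matrix.
pred = cv_labels[np.argmin(
    np.linalg.norm(codebook[None, :, :] - X[:, None, :], axis=2), axis=1)]
print((pred == y).mean())                # fraction correctly re-classified
```

With real, geochemically coherent classes the re-classification rate would be expected in the 95-100% range reported below; with this random stand-in data the printed value is meaningless and serves only to show the testing step.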

Use of LVQ to Classify Oceanic Basalts
As stated earlier, based on geochemical data the OFB have been classified as N-, E/P-or T-MORB or OIB.
Recently, Lacassie et al. [22] used a self-organizing map (SOM) based ANN to classify volcanic rocks. However, it is difficult to determine the inherent geochemical characters of the samples with respect to N-MORB, E/P-MORB and OIB unless the network has pre-defined parameters to separate the geochemical characters of the data. Therefore, an attempt is made to introduce the LVQ method for classification and to unravel discrete geochemical traits of the OFB by using certain characteristic elemental concentrations of these basalts. In order to classify the OFB we considered one major oxide (K2O), seven trace elements (Sc, Rb, Sr, Y, Zr, Nb and Ba), six rare earth elements (REE) (La, Ce, Nd, Sm, Eu and Yb) and seven elemental ratios (Zr/Nb, Y/Nb, Ba/Nb, Zr/Y, Sm/Nd, La/Yb and Ce/Y). These elements and ratios were chosen because they carry the geochemical signatures of the individual OFB types, i.e., N-MORB, E/P-MORB and OIB [7]. A criterion that we considered while selecting the samples for training and testing was that the data should not come solely from one sampled site in the CIOB (Table 2). The variation in a few elements could suggest a combination of geochemical makeup of the MORB, and this can be deciphered by using the LVQ method.
An initial matrix of CVs of unbiased random numbers of size 25 × 21, 50 × 21 and 75 × 21 (Fig. 2) was generated, saved and subsequently used for training of the LVQ1 architecture prior to classification. Figure 3 represents the trained weight matrices of 25, 50 and 75 neurons for N-MORB, E-MORB and OIB, respectively. To use the LVQ technique, separate data sets for training and testing were arranged in a 21 × n pattern for N-MORB, E/P-MORB and OIB. Here '21' represents the properties of the basalts in terms of the elements and their ratios, and 'n' indicates the number of data strips used in the study. Data previously classified as N-MORB, E/P-MORB and OIB using classical geochemical criteria were selected for the study and divided into two sets, one for training the network and the other for observing the performance of the network. During training of the network, the CVs are updated and classify the basalts into the different categories. From the network of 25 neurons we selected output neurons 4 to 6, 10 to 12 and 16 to 18 to designate the N-MORB, E/P-MORB and OIB, respectively.
Three different weight matrices of CVs were used for the three types of basalts. The initial weight matrix of CVs was updated during training of the network, and when the network reached its optimum efficiency the final weight matrix was saved and used to classify the unknown data. In all cases the network showed a satisfactory result, classifying between 95% and 100% of the known data correctly. Due to the 100% classification of the known N-MORB, E/P-MORB and OIB data, there was no need to use the LVQ2 and LVQ3 architectures. The LVQ1 architecture with 25 neurons performed very satisfactorily, whereas with 50 and 75 neurons the possibility of mis-classification of E/P-MORB and OIB increased (Fig. 4). Since it also took less time to complete the classification, the architecture of 25 neurons was used (Table 1).
To help identify the characters involved in the data set, filters were designed using the optimized, final weight matrix of the CVs. The filters are similar to the testing part of the LVQ1 architecture. While passing through the filters, the network identifies the individual characters of the unknown data; this recognition depends upon the characters of the basalts available in the form of CVs in the weight matrix.
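A filter pass of this kind can be sketched as below: the saved weight matrix is applied to unknown samples exactly like the testing stage, and the per-class counts summarise which basaltic characters the network recognises. The function name, the small two-dimensional codebook in the usage example and the class-to-label mapping are illustrative assumptions.

```python
import numpy as np

class_names = {0: 'N-MORB', 1: 'E/P-MORB', 2: 'OIB'}

def filter_unknown(unknown, codebook, cv_labels):
    """Assign each unknown sample to the class of its nearest CV and
    return the per-class counts (the 'filter' output)."""
    dists = np.linalg.norm(codebook[None, :, :] - unknown[:, None, :], axis=2)
    assigned = cv_labels[np.argmin(dists, axis=1)]
    return {class_names[k]: int((assigned == k).sum()) for k in class_names}
```

For example, with a toy codebook of three CVs labelled 0, 1 and 2, two unknown samples lying near the first and third CVs would be counted as one N-MORB and one OIB.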

Classification Of Unknown Basalt Data
Sampling in the CIOB recovered a variety of rocks such as basalts, ferrobasalts, spilites and pumice clasts [23]. Basalts occur as pillows, large outcrops and as fragments.
Compositionally, the basalts are Normal-MORB (N-MORB) similar to those from the Mid-Atlantic Ridge and East Pacific Rise [24]. Ferrobasalts, recovered near topographic highs and high amplitude magnetic zones, consist of plagioclase (predominant), sometimes olivine and frequently small euhedral magnetite and hematite grains [25]. Spilites, occurring near the Indrani fracture zone (79°E), show fine to medium grains of albitic plagioclase, clinopyroxene and olivine while epidote, hematite, chlorite and ore minerals form minor constituents. Pumices encompass a large field and are trachyandesite to rhyodacite in composition [26].
In general, the CIOB basalts have distinct incompatible element ratios, e.g., Zr/Nb = 25-125, Y/Nb = 7-63 and (La/Sm)_N = 0.5-1.5 [27,28]. The binary plots of Zr, Rb, Ce, Sr and Ba show a variable distribution against Nb (Fig. 5): Zr and Ce show a strong positive correlation with increasing Nb, whereas Rb and Ba show a scattered distribution [29,30]. An important aspect of the geochemical signatures of the CIOB basalts is the significant fractionation among the highly incompatible elements. For example, the Ba/La ratio is a factor of ~5 higher than in typical N-MORB, whereas a moderately incompatible element ratio such as Sm/La (0.4 to 1.1) is very close to that of N-MORB [27,28].
The Zr/Nb ratio provides useful information for identifying the nature of the MORB. The CIOB basalts have high Zr/Nb (>25) [28], similar to typical N-MORB (>30) [7]. The plots of Ce/Y vs Zr/Nb and La/Yb vs Zr/Nb indicate a close association of the CIOB basalts with the Southeast Indian Ridge (Fig. 6a, b). The plot of (La/Sm)_N vs Zr/Nb (Fig. 6c) indicates that, although the CIOB basalts are typical N-MORB, faint signatures of E/P-type MORB are noticeable in the mixing relation between N- and P-types, and this may be indicative of a low degree of partial melting of the source rock. The La/Yb and Ce/Y ratios (~0.7-2.7 and ~0.15-0.62, respectively) of the CIOB basalts are close to the chondrite values (La/Yb ≈ 1.39 and Ce/Y ≈ 0.39) [31] and indicate that these ratios were affected by the fractional crystallization of olivine and pyroxene. The chondrite-normalized REE of the CIOB basalts also attest to the N-MORB nature of these basalts ([La/Yb]_N of ~1.0) [28]. Interestingly, the CIOB basalts show an enriched LREE and a relatively flat HREE pattern (Fig. 7).
The LVQ analysis of the geochemical data of the CIOB basalts produced the results summarized in Fig. 8 and Table 3.

Conclusion
It is well recognized that the geochemical study of basalts, together with discrimination plots of selected elements and their ratios, can help to identify basic volcanics vis-à-vis their tectonic settings. The purpose of this work, however, was to highlight the development of a suitable real-time program to help classify oceanic basalts on the basis of their discrete geochemical characters, which may not be fully revealed in the classical discrimination diagrams. In this respect, soft computing techniques (like ANN) are useful and faster.
The present study indicates that the supervised LVQ1 architecture performs satisfactorily in identifying the geochemical characters in the data, and that the possibility of mis-characterization is minimal. Further work could help to refine the model through a possible reduction in the number of variables needed for the classification scheme.