Factor Analysis and Mathematical Modeling in Determining the Quality of Coal

Coal material of three types originating from three different Polish hard coal mines (types 31, 34.2 and 35 according to Polish nomenclature, i.e. steam coal, semi-coking coal and coking coal) was separated into particle size fractions and then into particle density fractions. The following parameters were measured for each size-density fraction: combustion heat, ash content, sulfur content, volatile parts content and analytical moisture. In this way a 7-dimensional vector of data was created. Factor analysis was used to select the important features of coal that determine its membership in an individual type. The appropriateness of the method was evaluated with Bartlett's sphericity test and the Kaiser-Meyer-Olkin (KMO) coefficient, and the significant factors were selected with the Kaiser criterion and Cattell's scree test. The results were compared with those obtained in previous works by means of the observation tunnels method. They show which features are crucial for defining the type of coal, which is also important for selecting an appropriate method of its enrichment. Furthermore, the construction of a mathematical model of the relations between these properties and particle size and density is presented. Because particles of a certain size or density may occur in neighboring fractions, three sorts of relations were examined on the basis of regression analysis. The analysis was conducted for all three coal types. Because the models contain different numbers of independent variables, the R2 coefficient, the mean squared error (MSE) and Mallows's Cp statistic were applied to evaluate and compare the obtained results.


Introduction
Mineral raw materials that are beneficiated before use are characterized by many factors describing their features. In the case of coal, these features include ash content, sulfur content, combustion heat, volatile parts content and analytical moisture. These features determine coal quality, also in the economic sense. For this reason, the precision with which their values are determined is very important.
The most frequently investigated properties of coal are combustion heat, ash content, sulfur content, volatile parts content and moisture. These features are very often highly correlated, but they can also vary independently. The goal of the paper is to select the factors that influence the individual properties. For this purpose three types of coal (according to Polish nomenclature: types 31 (steam coal), 34.2 (gas-coking coal) and 35 (orto-coking coal)) were selected for investigation and divided into particle size and density fractions. The classification of coals is presented in Table 1.
The whole group of considered features was measured for each size-density fraction [14].
The following variables were considered (X_i, i = 1, 2, …, 5): combustion heat, ash content, sulfur content, volatile parts content and analytical moisture. Knowledge about these features can also serve to evaluate the beneficiation process (Brożek, 1984; Dobosz, 2001; Foszcz et al., 2016; Głowiak 2019a; b; Niedoba, 2013a; Stanisz, 2007; Stępiński, 1964; Tumidajski and Saramak, 2009). The dependence of ash content, sulfur content and volatile parts content on particle size and particle density was also investigated by means of the kriging method (Niedoba, 2013a). The application of non-conventional statistical methods can be very beneficial in obtaining precise information (Foszcz et al.). The presented work is an attempt to construct a new mathematical model describing the relation between ash content and particle size and density.

Materials and methods
The considered types of coal originated from three different Polish coal mines. All of them were initially screened on a set of sieves of the following sizes: -1.00, -3.15, -6.30, -8.00, -10.00, -12.50, -14.00, -16.00 and -20.00 mm. The size fractions were then separated into density fractions in dense media, using aqueous zinc chloride solutions of various densities (1.3, 1.4, 1.5, 1.6, 1.7, 1.8 and 1.9 g/cm3). These fractions were the basis for further consideration, and additional coal features were determined by means of chemical analysis. To identify the coal type appropriately, many parameters describing coal quality are measured. For each size-density fraction, combustion heat, ash content, sulfur content, volatile parts content and analytical moisture were determined, making up, together with the mass of these fractions, seven different features for each coal.
Submission date: 30-11-2019 | Review date: 28-03-2020
The example of obtained data is presented in Table 2.
The measurements of X_i were performed for each size-density fraction. Because the individual features were measured in different units, they were standardized.
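The standardization step can be sketched as follows. This is a minimal illustration, assuming the usual z-score form (subtract the column mean, divide by the sample standard deviation); the text does not state which variant was used, and the numeric values below are hypothetical, not taken from Table 2.

```python
import numpy as np

def standardize(data):
    """Return column-wise z-scores: (x - mean) / sample standard deviation."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)  # assumption: sample (n - 1) standard deviation
    return (data - mean) / std

# Hypothetical rows = fractions; columns = combustion heat, ash content, sulfur content.
example = np.array([
    [30.1, 5.2, 0.61],
    [27.4, 12.8, 0.75],
    [21.9, 28.3, 1.10],
    [15.2, 47.5, 1.42],
])

z = standardize(example)
print(z.round(3))
```

After this transformation every column has zero mean and unit variance, so features recorded in different units contribute on a comparable scale to the correlation matrix.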
Factor analysis was applied to select the significant factors influencing the individual variables. Two criteria were used to evaluate whether factor analysis is adequate for this problem: Bartlett's test and the Kaiser-Meyer-Olkin (KMO) coefficient (Comrey, 1973; Dobosz, 2001; Kline, 1994; Lawley and Maxwell, 1971; Tumidajski and Saramak, 2009).
The number of variables is reduced using Cattell's scree criterion and the criterion of sufficient proportion, which suggests retaining as many factors as are needed to explain together at least 85% of the variance of all observed variables (Stanisz, 2007).

Factor analysis
Bartlett's test showed that in all investigated cases the value of the test statistic was considerably higher than the critical value at the significance level α = 0.0005. The lowest value of the test statistic U was obtained for coal type 35 in the particle density fraction (1.9-2.0) and was equal to 84.74, while the critical value at this level is 31.42. The null hypothesis (that the correlation matrix is an identity matrix) should therefore be rejected for all particle size and density fractions.
Furthermore, in almost all cases the value of the KMO coefficient was higher than 0.5. Only for the density fraction below 1.3 g/cm3 for coal type 34.2 and for the density fraction (1.6-1.7) for coal type 35 was it slightly lower than 0.5. Taken together, the results of Bartlett's test and the values of the KMO coefficient gave a strong basis for applying factor analysis.
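Both adequacy checks can be sketched as below. This is a minimal illustration using the textbook forms of Bartlett's sphericity statistic and the overall KMO coefficient, which are assumed here because the text does not print them; the data are synthetic. For five variables the statistic has p(p - 1)/2 = 10 degrees of freedom, which appears consistent with the critical value 31.42 quoted above for α = 0.0005.

```python
import numpy as np

def bartlett_sphericity(data):
    """Bartlett's statistic -(n - 1 - (2p + 5)/6) * ln|R| and its degrees of freedom."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof

def kmo(data):
    """Overall KMO: squared correlations vs. squared partial correlations."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                      # partial correlation matrix
    off = ~np.eye(corr.shape[0], dtype=bool)    # off-diagonal mask
    r2 = np.sum(corr[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)

# Synthetic data (assumption): 40 fractions, 5 features sharing one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 1))
data = latent + 0.3 * rng.normal(size=(40, 5))

statistic, dof = bartlett_sphericity(data)
CRITICAL_VALUE = 31.42  # value quoted in the text for alpha = 0.0005
print(statistic > CRITICAL_VALUE, dof, round(kmo(data), 3))
```

A statistic above the critical value rejects the hypothesis of an identity correlation matrix, and a KMO value above 0.5 is the threshold the text uses for accepting factor analysis.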
The matrix of correlations between the factors Z_j and the variables X_i is obtained by constructing the matrix Z, whose elements z_ij are defined by equation (1), where λ_i is the i-th eigenvalue of the correlation matrix and a_ji are the elements of the matrix A obtained from the decomposition of R, the correlation matrix of the variables X_j.
The square of the number z_ij is the share of the variance of the variable explained by the factor Z_j. For example, for coal type 31 in the particle size fraction (10-12.5), the matrix Z takes the form given in equation (2). The eigenvalues of the correlation matrix are in this case λ_1 = 3.8177, λ_2 = 1.0355, λ_3 = 0.0875, λ_4 = 0.0488 and λ_5 = 0.0105.
The scree plot is presented in Figure 1.
According to Cattell's scree criterion, only those factors are retained that lie to the left of the point at which the decline of the eigenvalues becomes mild. In this case these are the factors Z1 and Z2.
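The selection can be checked numerically on the eigenvalues reported for coal type 31, fraction (10-12.5). The short sketch below applies the Kaiser criterion (eigenvalue greater than 1) and the cumulative share of total variance; the variable names are illustrative only.

```python
# Eigenvalues reported in the text for coal type 31, size fraction (10-12.5).
eigenvalues = [3.8177, 1.0355, 0.0875, 0.0488, 0.0105]

total = sum(eigenvalues)  # equals the number of standardized variables (5)

# Kaiser criterion: keep factors whose eigenvalue exceeds 1.
kaiser_count = sum(1 for value in eigenvalues if value > 1.0)

# Cumulative share of total variance explained by the first k factors.
cumulative = []
running = 0.0
for value in eigenvalues:
    running += value
    cumulative.append(running / total)

# Smallest number of factors explaining at least 85% of total variance.
n85 = next(k for k, share in enumerate(cumulative, start=1) if share >= 0.85)

print(kaiser_count, n85, [round(share, 4) for share in cumulative])
```

Both criteria point to two factors here, in agreement with the scree plot: the first factor alone explains about 76% of the total variance and the first two together about 97%.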
The group of factors (Z1, Z2) explains 98.07% of the variability of combustion heat, 97.12% of the variability of ash content, 99.71% of the variability of sulfur content, 96.33% of the variability of volatile parts content and 93.62% of the variability of moisture.
It follows that factor Z1 is responsible for the variables {X1, X2, X4, X5} and factor Z2 for the variable X3.
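The way such percentages and factor assignments arise can be sketched as follows. This is an illustration under stated assumptions: loadings are computed in the common principal-component form (eigenvector times the square root of its eigenvalue), which may differ in indexing from equation (1), and the correlation matrix is hypothetical, built so that four variables are mutually correlated and the third is independent.

```python
import numpy as np

def factor_loadings(corr):
    """Eigenvalues (descending) and loadings = eigenvector * sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    eigvecs = eigvecs[:, order]
    return eigvals, eigvecs * np.sqrt(eigvals)

# Hypothetical correlation matrix: X1, X2, X4, X5 correlated at 0.9, X3 independent.
a = 0.9
corr = np.array([
    [1, a, 0, a, a],
    [a, 1, 0, a, a],
    [0, 0, 1, 0, 0],
    [a, a, 0, 1, a],
    [a, a, 0, a, 1],
], dtype=float)

eigvals, loadings = factor_loadings(corr)

k = 2  # number of retained factors
# Share of each variable's variance explained by the first k factors.
communalities = np.sum(loadings[:, :k] ** 2, axis=1)
# Index of the retained factor with the largest absolute loading per variable.
dominant = np.argmax(np.abs(loadings[:, :k]), axis=1)

print(eigvals.round(3), communalities.round(3), dominant)
```

In this constructed case the first factor carries the four correlated variables and the second carries the independent one, which mirrors the pattern reported above for Z1 and Z2.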
For another of the analyzed fractions, Cattell's scree plot suggests taking the factors Z1, Z2 and Z3 into consideration. The same factors explain a sufficient percentage of the variability of all observed variables. The group of factors (Z1, Z2, Z3) explains 93.25% of the variability of combustion heat, 92.41% of ash content, 94.32% of sulfur content, 90.33% of volatile parts content and 97.99% of moisture. Factor Z1 is related to the variables X1, X2, X3 and X4, factor Z2 to the variables X2, X3 and X4, and factor Z3 to the variable X5.
Another criterion for limiting the number of factors is the percentage of total variance explained by the chosen factors (most often it is required to be not lower than 85%). The influences of the individual factors on the considered variables in all fractions of the individual types of coal are presented in Tables 3-8. It was assumed that at least 85% of the variability of each feature should be explained by the factors.

Mathematical modeling
On the basis of one-dimensional and multidimensional regression analysis, four models were constructed that present the relations between ash content in a certain particle size fraction (or density fraction), particle density (or particle size) and ash content in the neighboring size or density fractions.
The general forms of the proposed models are:
• One-dimensional model: $y = a x_1$ (4)
• Two-dimensional models: $y = a_1 x_1 + a_2 x_2 + b$ (5) and $y = a_1 x_1 + a_3 x_3 + b$ (6)
• Three-dimensional model: $y = a_1 x_1 + a_2 x_2 + a_3 x_3 + b$ (7)
where: y is the ash content in a certain particle size (or particle density) fraction; x_1 is the particle size or particle density; x_2 is the ash content in the previous particle size (or density) fraction; x_3 is the ash content in the following particle size (or density) fraction.
Because particles from other fractions transfer to the considered fraction during the separation process, the ash content in the neighboring fractions was taken into account in the two- and three-dimensional models and its influence was evaluated.
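Fitting the four model variants can be sketched with ordinary least squares. This is a minimal illustration, not the authors' procedure: the ash values are hypothetical, only interior fractions (those with both neighbors) are used, and an intercept b is included in every variant, which is an assumption for the one-dimensional case.

```python
import numpy as np

def fit_linear(columns, y):
    """Least-squares fit of y = sum(a_k * x_k) + b.

    Returns (coefficients, intercept, residual sum of squares).
    """
    design = np.column_stack(list(columns) + [np.ones(len(y))])
    solution, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ solution
    return solution[:-1], solution[-1], float(residuals @ residuals)

# Separation densities from the text (g/cm3); ash contents (%) are hypothetical.
density = np.array([1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9])
ash = np.array([3.1, 8.4, 17.9, 27.2, 36.0, 44.8, 55.3])

y = ash[1:-1]       # ash content in fraction j
x1 = density[1:-1]  # particle density of fraction j
x2 = ash[:-2]       # ash content in the previous fraction (j - 1)
x3 = ash[2:]        # ash content in the following fraction (j + 1)

models = {
    "one-dimensional": [x1],
    "two-dimensional (previous)": [x1, x2],
    "two-dimensional (following)": [x1, x3],
    "three-dimensional": [x1, x2, x3],
}

sse = {}
for name, columns in models.items():
    coefficients, intercept, sse[name] = fit_linear(columns, y)
    print(name, coefficients.round(3), round(float(intercept), 3), round(sse[name], 4))
```

Because the models are nested, adding neighboring-fraction terms can only lower the residual sum of squares, which is why criteria that penalize the number of variables are needed to compare them.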
The analysis was conducted for all three types of coal. The results of the analyses are presented in Tables 9-14.

Investigation of models quality
To evaluate the quality of the models obtained from the general formulas in equations (4)-(7), the R2 coefficient, the mean squared error (MSE) and Mallows's Cp statistic were calculated according to the formulas given in equations (8)-(10) (Stanisz, 2007; Tumidajski and Saramak, 2009), where q is the number of independent variables occurring in the considered function and MSE4 is the mean squared error calculated for y_4.
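The three measures can be sketched as follows. Equations (8)-(10) are not reproduced here, so the code uses the standard textbook definitions as an assumption: R2 = 1 - SSE/SST, MSE = SSE/(n - q - 1), and Cp = SSE/MSE_full - n + 2(q + 1), where MSE_full plays the role of MSE4 (the error of the largest model). The observed and fitted values are hypothetical.

```python
def sum_squared_error(y, y_hat):
    """Residual sum of squares between observed and fitted values."""
    return sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))

def r_squared(y, y_hat):
    """Coefficient of determination, 1 - SSE / SST."""
    mean = sum(y) / len(y)
    total = sum((obs - mean) ** 2 for obs in y)
    return 1.0 - sum_squared_error(y, y_hat) / total

def mean_squared_error(y, y_hat, q):
    """SSE divided by n - q - 1 (q independent variables plus an intercept)."""
    return sum_squared_error(y, y_hat) / (len(y) - q - 1)

def mallows_cp(y, y_hat, q, mse_full):
    """Mallows's Cp relative to the mean squared error of the full model."""
    return sum_squared_error(y, y_hat) / mse_full - len(y) + 2 * (q + 1)

# Hypothetical observed ash contents and fitted values from two models.
y = [5.0, 9.0, 14.0, 22.0, 31.0, 39.0]
fit_one = [3.5, 10.4, 17.3, 24.2, 31.1, 38.0]   # one-dimensional model, q = 1
fit_full = [4.8, 9.3, 14.1, 21.6, 31.2, 39.0]   # three-dimensional model, q = 3

mse_full = mean_squared_error(y, fit_full, q=3)
cp_one = mallows_cp(y, fit_one, q=1, mse_full=mse_full)
cp_full = mallows_cp(y, fit_full, q=3, mse_full=mse_full)
print(r_squared(y, fit_one), r_squared(y, fit_full), cp_one, cp_full)
```

Under these definitions the full model always has Cp equal to q + 1, so the statistic is informative mainly for the smaller models, where a value close to q + 1 suggests that omitting variables has not introduced substantial bias.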
The obtained results of calculated errors are presented in Tables 15 and 16.

Conclusions
Because three factors occur most often in the individual fractions, and considering the strength of the relations between the individual properties, the investigated variables can be divided into three subsets. The first contains combustion heat, ash content and volatile parts content, the second contains sulfur content and the third contains moisture. In previous works [3,4,5,6,7,8,9,12,13,14,15,16,17,18,19,20,23,24], in which various visualization methods were applied, it was claimed that the features sufficient to identify coal type are sulfur content, moisture and volatile parts content. The conducted analysis confirms these results. The selection of variable X_4 (volatile parts content) results from the fact that this variable is explained by a factor other than the common factor it shares with moisture and combustion heat.
As regards the mathematical models, when grained material (in this case coal) is separated into particle size or density fractions, some particles from the neighboring fractions (j-1 or j+1) occur in the j-th fraction. It therefore seems justified to take this into account when constructing a mathematical model that describes ash content by means of particle size or density.
In the paper four models are proposed:
• One-dimensional, which does not consider the influence of neighboring fractions;
• Two-dimensional, which takes the influence of one of the neighboring fractions into consideration (two models of this type);
• Three-dimensional, which takes the influence of both neighboring fractions into consideration.
These models were verified on the basis of three measures: the R2 coefficient, the mean squared error (MSE) and Mallows's Cp statistic.
The value of the R2 coefficient is relatively high (above 0.9) for all considered models. R2 reaches higher values when the separation is done according to particle density than when it is done according to particle size (apart from coal type 34.2).
Furthermore, the value of the mean squared error indicates that the models are well fitted, but (apart from coal type 34.2) a significantly better fit to the empirical results is achieved when the separation is done according to particle density. To compare models of different dimensions, Mallows's Cp statistic was used, which suggests that the best model is the one whose value of Cp is close to q+1, where q is the number of independent variables occurring in the model. From Tables 15 and 16 it can be stated that the best model is the three-dimensional one, but in some cases, for example coal type 35 separated according to particle size, the two-dimensional models have a value of Cp around q+1 = 3.
The analyzed cases indicate that, despite the satisfactory results of the one-dimensional approximation, better models are obtained when the values of the investigated feature in the neighboring fractions are also taken into account.