
Soft Sensors in the Primary Aluminum Production Process Based on Neural Networks Using Clustering Methods.

Alan Marcel Fernandes de Souza1, Fábio Mendes Soares1, Marcos Antonio Gomes de Castro2, Nilton Freixo Nagem3, Afonso Henrique de Jesus Bitencourt4, Carolina de Mattos Affonso1, Roberto Célio Limão de Oliveira1.   

Abstract

Primary aluminum production is an uninterrupted and complex process that must operate in a closed loop, hindering possibilities for experiments to improve production. It is therefore important to have ways to simulate this process computationally without acting directly on the plant, since such direct intervention could be dangerous, expensive, and time-consuming. This problem is addressed in this paper by combining real data, the artificial neural network technique, and clustering methods to create soft sensors that estimate the temperature, the aluminum fluoride percentage in the electrolytic bath, and the metal level of aluminum reduction cells (pots). An innovative strategy splits the entire dataset by section and lifespan of pots, with automatic clustering for soft sensors. The soft sensors created by this methodology have small estimation mean squared errors with high generalization power. Results demonstrate the effectiveness and feasibility of the proposed approach to soft sensors in the aluminum industry, which may improve process control and save resources.

Keywords:  clustering methods; estimation; neural network; primary aluminum production; real data; soft sensor

Year:  2019        PMID: 31795370      PMCID: PMC6929109          DOI: 10.3390/s19235255

Source DB:  PubMed          Journal:  Sensors (Basel)        ISSN: 1424-8220            Impact factor:   3.576


1. Introduction

Although aluminum (Al) is one of nature's most abundant elements, it is extremely difficult to extract in pure form, and extraction is not possible without some chemical reaction: Al is always bound to other chemical elements in the form of salts or oxides, which makes separation necessary. In the 1880s, the young students Charles Hall and Paul Héroult used electrolysis to separate Al from oxygen in alumina (Al2O3) grains dissolved in salt fluxes such as cryolite (Na3AlF6). This is the Hall–Héroult process [1,2], by which the primary aluminum industry can obtain Al of up to 99.9% purity. Basically, the process separates alumina into aluminum and oxygen, but it also requires other elements such as flux salts, gases, and chemical additives to maintain stability, which makes it more complex [1,3]. For complex industrial processes, mathematical modeling is also a complex task, to the point that representing a process in a completely analytical way becomes impracticable. Approximate and hybrid representations produce very satisfactory results, although they do not scale beyond a certain point [4]. With the scientific improvement of modeling and identification techniques [5], this task has become easier across various areas of knowledge, although the great difficulty of dynamically modeling nonlinear processes remains. The modeling and identification of dynamic nonlinear systems has advanced considerably with artificial intelligence and machine learning techniques, which have been applied in the last few decades with excellent results [6,7,8,9,10]. The success of these “intelligent” paradigms in modeling dynamic systems is due to the little prior knowledge they require (only a reasonable amount of data) compared to analytical modeling, and to the fact that they are naturally nonlinear models.
Among the “intelligent” techniques used for nonlinear dynamic modeling [11,12], one of the most widely used is the artificial neural network. The use of artificial intelligence in data-based dynamic modeling is often referred to as soft sensing. Soft sensors are computationally implemented, data-driven models that provide online estimates of process variables that cannot be continuously and/or reliably measured online for technological and/or economic reasons [4,13]. These models use process variables that are measured and recorded reliably online by available physical sensors or offline through laboratory analysis. Data-driven soft sensors are widely successful in industry because of their practicability, robustness, and flexibility, being applicable to a wide range of processes and independent of a mathematical process model [14,15]. There are a number of methods for implementing data-driven soft sensors for industrial processes. Some of the most commonly used linear methods are multivariate statistical regression algorithms, such as principal component analysis (PCA) [16,17,18,19] and partial least squares (PLS) [20,21,22,23]. These methods see more practical applications because of their simplicity and can tolerate some invariance in time; however, they are prone to errors in the presence of data impurities (missing values and outliers) and are inadequate for dealing with nonlinearities. Nonlinear processes are usually modeled with nonlinear structures such as artificial neural networks (ANN) [24,25,26,27,28], neuro-fuzzy systems [29,30,31], Gaussian process regression [32,33,34], and support vector machines [35,36,37]. The most common types of ANN are the multi-layer perceptron (MLP) and radial basis function networks (RBFN). The literature has shown that ANNs are especially suitable for implementing soft sensors, and they have indeed been used for this purpose [38,39,40,41,42,43,44,45,46,47].
More recently, deep learning has also been used successfully to create soft sensors [48,49,50,51]. Due to the complexity of the primary aluminum production process, it is attractive to use data-driven soft sensors to measure its most important variables, since it is a nonlinear, time-variant, distributed-parameter dynamic process. Moreover, since the electrolytic reduction of alumina is very aggressive, real-time temperature measurement is not possible, because the chemical bath corrodes the thermocouple (usually a thermocouple can perform about 50 measurements every 24 h). ANNs have been used as a powerful artificial intelligence technique to construct data-based models in the Al industry [52,53,54,55], and in this way they are also widely used to implement soft sensors. In the Al smelting process, ANNs have been used to a lesser extent to simulate and model processes [56,57,58], while in parallel other techniques such as clustering help to identify pots with common behaviors, enhancing the knowledge derived from the data [59]. For the most part, mathematical techniques have been used to create models that emulate the Al production process [60,61,62,63,64]. An industrial Al plant has hundreds of pots working simultaneously, which makes the production process more complex as a whole and often requires many human interventions [3]. Methodologically, neural modeling can be applied in one of the following ways: (i) a single ANN for all electrolysis pots, whose results are barely satisfactory, since it is very difficult for one ANN to capture the behavioral differences among all pots; (ii) one ANN for each pot, which might be too complex and difficult to apply, since hundreds of ANNs would have to be tuned; or (iii) one ANN for each cluster of pots presenting similar behaviors.
This paper describes the design of soft sensors using the third approach, which offers the best trade-off between complexity and quality of results. Engineering expertise is used to determine the key process variables to include, and the ANN technique supports the indirect estimation of variables in electrolytic bath furnace modeling using real data from an Al smelter plant. This paper's major contributions are as follows: clustering data by pot section; considering three different pot phases based on lifespan division; and comparing and proposing neural network estimators as soft sensors to replace manual measurements with automatic ones. The results show this is possible, since the models generate estimates with small errors. It is important to highlight that the ANN models created are dynamic, because delayed inputs were considered to estimate the current outputs. Briefly, the flowchart of the proposed method is presented in Figure 1.
Figure 1

Flowchart of the proposed method.

The rest of this work is organized as follows. Section 2 describes the primary Al production process and describes the layout of the Al smelter concerned in this paper. Section 3 addresses in detail the design of the ANN-based estimation models. Results and discussions are presented in Section 4. Finally, Section 5 provides the conclusions.

2. Brief Description of the Primary Aluminum Production Process

Softness, lightness, high thermal conductivity, and high recyclability are important properties of Al. A wide variety of products are derived from this metal, which has helped it become the most frequently consumed nonferrous metal around the world [64]. The primary Al production process is complex, because it handles variables from multiple disciplines, such as electrical, chemical, and physical [65]. The raw material of Al is alumina. Direct Al extraction from alumina requires a temperature over 2000 °C [66]. The machinery needed to maintain such a high temperature is expensive, and so is the energy spent under these requirements. Since the late nineteenth century, the Hall–Héroult process has been used as an alternative to produce Al, as it consumes less energy and requires a lower temperature (about 960 °C) [1,2,3]. To reduce the heat, cryolite is used as an electrolytic bath and several chemical components are added together with alumina [67]. This process is widely known as Al smelting, which uses electrolysis pots, also named pots or reduction pots [68]. A pot (Figure 2) consists of a steel shell with a lining of fireclay brick for heat insulation, which, in turn, is lined with carbon bricks to hold the molten electrolyte. Steel bars carry the electric current through the insulating bricks into the carbon cathode floor of the pot. Carbon anode blocks are hooked onto steel rods and immersed in the electrolyte. Alumina molecules are dissolved by the heat and decomposed into Al and oxygen (O) by the electric current that flows through the electrolyte [69]. In modern smelters, process-control computers connected to remote sensors ensure optimal operation of electrolysis pots [70]. Electrolysis furnaces are organized within reduction rooms; standard Al smelting uses around four reduction rooms and between 900 and 1200 pots in total, depending on the smelter.
Figure 2

Example of a pot and its parts.

According to the stoichiometric relation (Equation (1)), alumina is consumed in the production process together with the solid carbon of the anodes. Theoretically, 1.89 kg of Al2O3 and 0.33 kg of carbon are consumed for each 1.00 kg of Al produced, with 1.22 kg of carbon dioxide (CO2) emitted. In practice, typical values are 1.93 kg of Al2O3 and between 0.40 and 0.45 kg of carbon per 1.00 kg of Al, with an emission of about 1.50 kg of CO2 [69]. Several sensors monitor the entire process continuously, acquiring data from the whole plant. Data are stored and organized in databases, which have become a rich patrimony of the plants, as they hold the historical information of each production pot. This data collection supports the building of automatic decision-making systems and guides for the engineers [71,72,73,74]. Many control systems display the acquired data in real time for permanent monitoring of the process. Plant control systems for Al smelting have two modes of operation [74,75]. Automatic control: data are collected and processed by computers and/or microcontrollers, which then drive a control action on the plant without direct human intervention; examples are the control of the electrical resistance of the pot via the anode–cathode distance (ACD), using pulse width modulation (PWM) to drive the lifting/lowering of anodes, and the control of the alumina added to the electrolytic bath through mathematical models. Manual control: data are collected by plant floor sensors or measured manually by process operators, but the control output is computed by the process engineers, taking into account mathematical models and their expertise; examples are the thermocouple measurement of pot temperature (Figure 3), the percentage of aluminum fluoride in the bath (a laboratory result), the metal level of the pot, the replacement of anodes, and Al tapping from the pot.
Figure 3

Pot temperature measurement: (a) human operator; and (b) thermocouple connected to display the temperature value.
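As a sanity check on the theoretical figures above, the mass ratios follow directly from the molar masses in the overall reaction 2Al2O3 + 3C → 4Al + 3CO2. A minimal sketch (the rounded molar masses are standard values, not taken from the paper):

```python
# Verify the theoretical Hall-Heroult mass balance: 2 Al2O3 + 3 C -> 4 Al + 3 CO2
AL, O, C = 26.98, 16.00, 12.01  # approximate molar masses (g/mol)

m_al2o3 = 2 * (2 * AL + 3 * O)  # mass of 2 mol Al2O3
m_c     = 3 * C                 # mass of 3 mol C
m_al    = 4 * AL                # mass of 4 mol Al produced
m_co2   = 3 * (C + 2 * O)       # mass of 3 mol CO2 emitted

# kg of each reactant/product per 1.00 kg of Al produced
ratio_alumina = m_al2o3 / m_al
ratio_carbon  = m_c / m_al
ratio_co2     = m_co2 / m_al

print(round(ratio_alumina, 2), round(ratio_carbon, 2), round(ratio_co2, 2))
# 1.89 0.33 1.22
```

The computed ratios match the theoretical 1.89 kg Al2O3, 0.33 kg C, and 1.22 kg CO2 per kg of Al cited in the text.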

The experiments conducted in this paper use real data from a Brazilian Al smelter. The pots are arranged in four reductions, each of which has two rooms, and each room has 120 pots, totaling 960 pots. Figure 4 shows the overall layout of this factory.
Figure 4

Overall layout of the smelter made up of four reductions, eight rooms, and 960 pots.

Electrically, Al reduction pots are connected in series. This connection ensures that the direct electric current (approximately 180 kA) is the same in all pots. Each room has two electrical lines, each line composed of two sections of 30 pots, resulting in 32 different sections for the entire smelter. Figure 5 outlines the arrangement of the sections for reduction I and the first room. The same organization is present in all rooms of the smelter concerned, and this disposition of the pots was used empirically as clusters: each section is a cluster.
Figure 5

Section layout by room.
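The section clustering amounts to grouping 30 consecutive pots. A hypothetical sketch, assuming a plant-wide sequential pot numbering (the actual plant numbering scheme is not given in the text):

```python
def section_of(pot_id: int) -> int:
    """Map a plant-wide pot index (0-959) to its section (0-31),
    assuming pots are numbered sequentially, 30 per section.
    Illustrative only: the smelter's real numbering may differ."""
    if not 0 <= pot_id < 960:
        raise ValueError("pot_id out of range")
    return pot_id // 30

# each section groups 30 consecutive pots; 960 pots -> 32 sections
print(section_of(0), section_of(29), section_of(30), section_of(959))
# 0 0 1 31
```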

3. Design of Estimation Models

The full database has hundreds of thousands of samples and hundreds of process features (variables), recorded from 2006 to 2016. The following subsection describes the preprocessing steps performed on the original database to generate the datasets used in this work.

3.1. Data Extraction, Imputation, and Split

Data extraction considered the entire life of each pot, that is, a lifespan from 1 to 1500 days, corresponding to an average of five years of operation. Table 1 shows all variables available in the database. Feature selection considered the Pearson correlation coefficient (R) between input and output to rank variables by degree of importance. It is important to note that some variables had a large number of null values and were therefore discarded. R is calculated as

R = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / ( √(Σᵢ₌₁ⁿ (xᵢ − x̄)²) · √(Σᵢ₌₁ⁿ (yᵢ − ȳ)²) ),

where n is the sample size, xᵢ and yᵢ are the individual sample points indexed by i, and x̄ and ȳ are the sample averages.
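The Pearson-based ranking can be sketched as follows; this is a minimal, self-contained implementation of the standard sample correlation coefficient, not the authors' code:

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

# perfectly linearly related series give |R| = 1
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # ~1.0
```

Variables would be ranked by the absolute value of R against each candidate output before validation by the process engineers.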
Table 1

All variables available in the database.

Abbreviation | Complete Name | Unit
%CaO | Calcium Oxide Percentage | %
%Fe2O3 | Iron Oxide Percentage | %
%MnO | Manganese Dioxide Percentage | %
%Na2O | Sodium Oxide Percentage | %
%P2O5 | Phosphorus Pentoxide Percentage | %
%SiO2 | Silicon Oxide Percentage | %
%TiO2 | Titanium Dioxide Percentage | %
%V2O5 | Vanadium Pentoxide Percentage | %
%ZnO | Zinc Oxide Percentage | %
<325 m | <325 Mesh | %
>100 m | >100 Mesh | %
>200 m | >200 Mesh | %
CR | Friction Index | %
CRF | Thin Crust | %
DA | Apparent Density | g/cm3
LOI1 | Loss on ignition (300–1000 °C) | %
LOI2 | Loss on ignition (110–1000 °C) | %
LOI3 | Loss on ignition (110–300 °C) | %
SE | Specific Surface | m2/g
%FE | Iron Content in Metal | ppm
%Ga | Gallium Content | %
%Mn | Manganese Content | %
%Na | Sodium Content in Metal | %
%Ni | Nickel Content | %
%P | Metal Phosphorus Content | ppm
%SI | Silicon Content in Metal | ppm
%TBase | Percentage of Time on Base Feed | %
%TChk | Check Feed Time Percentage | %
%TInic | Percentage of Initial Feeding Time | %
%TOthers | Percentage of Time Other Feeding Modes | %
%TOV | Percentage of Feeding Over Time | %
%TUN | Percentage of Feeding Time Under | %
%V_ | Vanadium Content | %
A%1 | Feeding (Al2O3) | %
ALF | Aluminum Fluoride (% in Bath) | %
ALF3A | Amount of AlF3 Added | kg/Misc
ALF3AB | AlF3–Base Addition–Total | kg/Misc
ALF3ABF | AlF3–Base Addition–ABF | kg/t Al
ALF3ABFC | AlF3–Base Addition–Factor C | kg/t Al
ALF3ABN | AlF3–Base Addition–Na2O | kg/t Al
ALF3ABT | AlF3–Base Addition–Total | kg/Misc
ALF3ABV | AlF3–Base Addition–Life | kg/Misc
ALF3Ac | Amount of AlF3 Added–Correction | kg/Misc
ALF3AE | ALF3A–Extra Addition | kg/Misc
ALF3Ah | Amount of AlF3 Added–Historic | kg/Misc
ALF3Am | Amount of AlF3 Added–Maintenance | kg/Misc
ALF3AR | AlF3 Deviation Reference | kg/Misc
ALF3ARB | ALF3A–[Real–Base] | kg/Misc
ALF3AS | AlF3–Hopper Balance Correction | kg/Misc
ALF3At | Amount of AlF3 Added–Trend | kg/Misc
ALF3ATS | Hopper Balance | kg/Misc
ALF3ATSAc | Accumulated Hopper Balance | kg/Misc
ALF3CA | AlF3–% AlF3 Correction | kg/Misc
ALF3CM | AlF3 Quantity–Manual Correction | kg/Misc
ALF3CT | AlF3–Temperature Correction | kg/Misc
ALF3DA | AlF3 Added–Cumulative Deviation | kg
ALF3DALI | AlF3–Accumulated Deviation–Lower Limit | kg
ALF3DALS | AlF3–Accumulated Deviation–Upper Limit | kg
ALF3LC | AlF3–Limit Check Correction | kg/Misc
ALFca | Aluminum Fluoride for CA | %
ALFcalc | Calculated Aluminum Fluoride | %
ALM | Feeder | Kg
CAF | Calcium Fluoride (% in Bath) | %
CAF2A | Amount of CaF2 Added | kg
CAF2CM | CaF2 Quantity–Manual Correction | kg
CAN | Anode Coverage | cm
CE | Specific Energy Consumption | kWh/kg Al
CoLiq | Liquid Column | cm
CQB-Efetiv | Chemical Bath Control—Effectiveness | %
DeltaR | Resistance Delta | uOhm
DeltaT | Super Heat | °C
DeltaT1 | Super Heat | °C
DeltaTM | Super Heat Measured | °C
DeltRCI | DeltaR–Instability Calculation | uOhm
DesAnodCAR | Anode Descent in CAR | un
DesAutAnod | Automatic Anode Descent | un
DifNME | Metal Level (Real-Set) | cm
DifRMR | Rreal-Rset | uOhm
DifRSO | Rtarget-Rset | uOhm
DRPTro | Post-Trade Resistance Delta | uOhm
EaEnergL | Anode Effect (AE)–Net Energy | Kwh/EA
EAN | Unscheduled Anode Effect | EA/d
EAP | Scheduled Anode Effect | ea/d
EaDurPol | AE–Polarization Duration | seg/Ea
EaDurPolTot | AE–Total Duration of Polarization | seg/F/Day
EaVBruta | AE–Gross Voltage | V/Ea
EaVLiq | AE–Liquid Voltage | V/Ea
EaVMax | AE–Maximum Voltage | V
EaVPol | AE–Voltage Polarization | V/Ea
ECO | Current Efficiency | %
FAB | AlF3 Base Addition | kg/Misc
FARB | Addition (Real + Extra − Base) | kg/Misc
IMx | Current Intensity | kA
IncCTAlim | Increment–CTFeed | uOhm
IncCTOsc | Increment–CTOsc | uOhm
IncOp | Increment–Operation | uOhm
IncOs | Increment–Oscillation | uOhm
IncTm | Increment–Temperature | uOhm
IncTr | Increment–Anode Exchange | uOhm
Na | Sodium Content in Metal (PPM) | ppm
NA2CO3A | Added Amount of Na2CO3 | kg
NA2CO3CM | Na2CO3 Quantity–Manual Correction | kg
NBA | Bath Level | cm
NBAA | Bath Addition | Kg
NBAc | Bath Control | Kg
NBAR | Bath Removal | Kg
NCicSEA | SEA Cycle Number | Ciclos/SEA
NEA | Total Anode Effect | ea/d
NEARecorr | Total Recurrent Anode Effect | EA/d
NME | Metal Level | cm
NOV | Number of Overs | un
NSA | Number of Feed Shots | un
NTR | Number of Tracks | -
NumOverUnder | Number of Overs Followed by Unders | un
PAN | Anodic Loss | uOhm
PCA | Cathodic Loss | mV
PCO | Cathodic Loss (uOhms) | mOhm
PHV | Loss Rod Beam | uOhm
PreEA | Anode Pre-Effect | ea/d
PrvEA | Anode Effect Prediction | ea/d
PUR | Metal Purity (% Al) | %
QALr | Feed Quantity (Real) | kg
QALt | Feed Quantity (Theoretical) | kg
QME | Amount of Flushed Metal (Real) | ton
RMR | Real Resistance | uOhm
RS | Resistance Setpoint | uOhm
RSO | Target Resistance | uOhm
SetNBA | Bath Level Setpoint | cm
SetNME | Metal Level Setpoint | cm
SILO | Alf3 Silo Filling Control | -
SIM | Impossible Anode Effect Suppression | %
SIMTot | Impossible Total Anode Effect Suppression | %
SPEA | Anode Pre-Suppression | ea/d
SPEAIM | Impossible Anode Pre-Effect Suppression | %
SubAnodCAR | CAR Anode Rise | un
SubAutAnod | Automatic Anode Rise | un
SWF | Strong Oscillation | %
SWT | Total Oscillation | %
TAS | Suspended Feed Time | min
TC1 | Check Time | min
TEA | Anode Effect Time | min
TMP | Bath Temperature | °C
TMPcat | CA Bath Temperature | °C
TMPLI | Bath Temperature–Lower Limit | °C
TMPLiq | Liquid Temperature | °C
TMPLS | Bath Temperature–Upper Limit | °C
TMT | Track Time | min
TOV | Over Time | min
TUN | Under Time | min
VIDA | Pot Life | days
WF | Real Consumption of Oven | kW
WFA | Oven Target Consumption | kW
AF | Fresh Alum Silo Level | %
af%F | Adsorbed Fluoride (Fluorinated Alumina) | %
af%F(Cor) | Corrected Plant Fluoridation | %
af%Na2O | Sodium Oxide (Fluorinated Alumina) | %
af%UM | Moisture (Fluorinated Alumina) | %
Af < 325 m | <325 Mesh (Fluorinated Alumina) | %
Af < 400 m | <400 Mesh (Fluorinated Alumina) | %
Af > 100 m | >100 Mesh (Fluorinated Alumina) | %
Af > 200 m | >200 Mesh (Fluorinated Alumina) | %
afDA | Apparent Density (Fluorinated Alumina) | g/cm3
afLOI1 | L.O.I. (110–300 °C; AF) | %
AluT | Transported Alumina | T
Na2Odif | Sodium Oxide (Fluorinated Alumina–Virgin) | %
SPVZ | Fresh Alumina Flow Setpoint | T/h
VZ | Fresh Alumina Flow | T/h
af%UMx | Moisture (Fluorinated Alumina) | %
ALF LI | Lower Limit ALF | %
ALF LS | ALF Upper Limit | %
IA | Target Current | kA
IM | Current Intensity | kA
IMBB | Booster Current Intensity | kA
IMC | Current Intensity (Pot) | kA
IMRB | Current Intensity | kA
VL | Line Voltage | V
WL | Actual Line Consumption | MW
ECp | Predicted Current Efficiency | %
ECr | Real Current Efficiency | %
PRODReal | Real Production | t
Table 2 lists the most important inputs associated with the output variables selected to create the estimation models. Firstly, the inputs were determined by a Pearson correlation study (Equation (2)). Then, the process engineers validated the selected features for the model. It is important to note that all input variables are delayed by one step, because the neural models emulate a first-order dynamic system with delayed inputs used to estimate the current output. The final selected dataset had about 1,728,000 samples, with eleven inputs and three outputs.
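The one-step input delay can be reproduced by shifting the input series against the outputs, so that each output y[t] is paired with the inputs x[t−1]. A minimal sketch (the variable layout is illustrative):

```python
def make_delayed_dataset(inputs, outputs):
    """Pair each output sample y[t] with the previous step's inputs x[t-1],
    emulating the one-step-delayed inputs described in the text.
    `inputs` and `outputs` are equally long, time-ordered lists of samples."""
    X = inputs[:-1]   # x[0] .. x[n-2]
    Y = outputs[1:]   # y[1] .. y[n-1]
    return X, Y

# toy example: three daily samples of one input and one output (e.g., TMP)
x = [[0.1], [0.2], [0.3]]
y = [960.0, 961.0, 962.0]
X, Y = make_delayed_dataset(x, y)
print(X, Y)  # [[0.1], [0.2]] [961.0, 962.0]
```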
Table 2

Variables used for the modeling.

ID | Type | Variable | Abbreviation | Unit | Delay | R w/ TMP | R w/ ALF | R w/ NME
1 | Input | Gross Voltage | VMR-1 | V | 1-step | −0.49 | 0.43 | 0.30
2 | Input | Gross Resistance | RMR-1 | uOhm | 1-step | −0.48 | 0.41 | 0.24
3 | Input | Bath Level | NBA-1 | cm | 1-step | 0.58 | −0.41 | −0.69
4 | Input | Calcium Fluoride (% in the Bath) | CAF-1 | % | 1-step | −0.53 | −0.49 | 0.37
5 | Input | Percentage of Sodium Oxide | PNA2O-1 | % | 1-step | −0.52 | −0.67 | 0.31
6 | Input | Percentage of Calcium Oxide | PCAO-1 | % | 1-step | −0.57 | 0.72 | 0.32
7 | Input | Amount of AlF3 Added | ALF3A-1 | kg/misc | 1-step | 0.40 | −0.46 | −0.30
8 | Input | Amount Fed (Real) | QALR-1 | kg | 1-step | −0.35 | 0.32 | 0.52
9 | Input | Temperature | TMP-1 | °C | 1-step | 0.88 | −0.79 | 0.32
10 | Input | Aluminum Fluoride (% in the Bath) | ALF-1 | % | 1-step | −0.78 | 0.94 | 0.25
11 | Input | Metal Level | NME-1 | cm | 1-step | −0.41 | 0.34 | 0.94
12 | Output | Temperature | TMP | °C | - | - | - | -
13 | Output | Aluminum Fluoride (% in the Bath) | ALF | % | - | - | - | -
14 | Output | Metal Level | NME | cm | - | - | - | -
Some variables, such as temperature, percentage of aluminum fluoride, and metal level, are collected manually through physical sensors or laboratory analysis, producing different sampling frequencies. Other variables, for instance real resistance and gross voltage, are collected online via sensors without human interference. Most variables are sampled on a daily basis; however, the manually collected variables have other sampling frequencies. As a result, null values appear between measurements when variables with different samplings are combined. Missing data were imputed by linear interpolation between the previous and subsequent measurements, according to each variable's sampling. According to the process engineers, linear interpolation fits well, because the chemical process is slow, and this approach had been validated before. Figure 6 shows an imputation example for bath temperature. The soft sensors described in this work have the additional advantage of being able to estimate missing data once properly trained.
Figure 6

Example of data imputation for bath temperature.
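The imputation step can be sketched as follows; this is a minimal linear interpolation over gaps between known measurements, not the authors' implementation:

```python
def impute_linear(series):
    """Fill None gaps between two known measurements by linear interpolation.
    Leading/trailing gaps are left untouched (no extrapolation)."""
    out = list(series)
    known = [i for i, v in enumerate(out) if v is not None]
    for a, b in zip(known, known[1:]):
        step = (out[b] - out[a]) / (b - a)
        for i in range(a + 1, b):
            out[i] = out[a] + step * (i - a)
    return out

# daily bath temperature (°C) with two missing days between measurements
print(impute_linear([960.0, None, None, 963.0]))
# [960.0, 961.0, 962.0, 963.0]
```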

Process engineers also agree that pots show three different types of behavior according to their lifespan: a lifespan of 1–100 days is considered the “starting point”; 101–1200 days the “stationary regime”; and 1201–1500 days the “shutdown point”. This lifespan division is the second method used to cluster the entire dataset (the first, clustering by section, was explained above). These ranges may vary from pot to pot, but they hold on average. Figure 7 summarizes the behaviors and the amount of data for each lifespan division.
Figure 7

Description of each lifespan division.

The different behaviors may also be verified by statistically analyzing the dataset of each group. Figure 8 shows histograms of each input variable for each group. The ALF3A variable has only zero values at the starting point, because it is not observed in this phase, so it may be discarded when models for this phase are created. At the starting point, most PNA2O samples lie below 0.4; in the stationary regime and at the shutdown point, the samples concentrate above 0.4. The behavior of the input variables is similar between the stationary regime and the shutdown point.
Figure 8

Input variables histogram: (a) starting point; (b) stationary regime; and (c) shutdown regime.

Analyzing the output variable histograms for each behavior (Figure 9), the TMP variable at the starting and shutdown points spans a wider range of values than in the stationary regime, confirming the instability hypothesis. The NME variable also behaves differently: at the starting point, samples accumulate strongly around 24, whereas in the stationary and shutdown phases they accumulate around 25. The ALF variable at the starting point concentrates below 10; in the other two phases it concentrates above 10.
Figure 9

Output variables histogram: (a) starting point; (b) stationary regime; and (c) shutdown point.

Besides the histograms, the difference in TMP variation across the three phases can be observed in Figure 10. At the starting point, the mean equals 970.5 °C, because the pot must be heated up; in the stationary regime, the mean decreases to 963.7 °C, the standard value for the plant; and at the shutdown point, it decreases further to 958.8 °C, since the pot is being cooled down for shutdown. TMP was chosen for this analysis because it is one of the most closely monitored process variables.
Figure 10

Bath temperature variation of pot 5.

The following subsection describes the steps performed to generate the resulting models.

3.2. Strategy for Modeling

Data clustered by section and by lifespan division were used to build models estimating TMP, ALF, and NME with the ANN technique. It is important to note that each ANN model has only one of the three outputs and that two different training algorithms were used: Levenberg–Marquardt (LM) and back propagation (BP). In addition, the following data-splitting strategies were used for each algorithm: (i) take 70% of the data from each cluster to train, 15% to validate, and 15% to test the models; and (ii) take the data from all pots of one entire section to train the models, except for one pot of that section, which is used to test the model. These strategies were applied both to the section clustering and to the lifespan division. The dataset was standardized using the z-score method, which produces a standardized dataset with average 0 and standard deviation 1 and is expressed by

z = (x − μ) / σ,

where x is the value to be standardized, μ is the average of the variable, and σ is its standard deviation. Table 3 shows the division of the complete dataset for the modeling process: for each lifespan division (or the whole dataset) and two different learning algorithms, across 32 different pot sections (plus the whole dataset) and three outputs, initially resulting in 594 different models.
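The z-score standardization can be sketched as follows (a minimal implementation, not the authors' MATLAB code):

```python
from math import sqrt

def zscore(values):
    """Standardize a series to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mu = sum(values) / n
    sigma = sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]

# toy bath temperatures (°C): standardized series has mean 0, std 1
z = zscore([958.0, 960.0, 962.0, 964.0])
print(round(sum(z) / len(z), 10))  # 0.0
```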
Table 3

Complete modeling process.

Lifespan Division | Training Algorithm | Models (clustered data) | Models (all dataset)
Starting point | ANN-LM | 32 sections × 3 outputs = 96 | all dataset × 3 outputs = 3
Starting point | ANN-BP | 32 sections × 3 outputs = 96 | all dataset × 3 outputs = 3
Stationary regime | ANN-LM | 32 sections × 3 outputs = 96 | all dataset × 3 outputs = 3
Stationary regime | ANN-BP | 32 sections × 3 outputs = 96 | all dataset × 3 outputs = 3
Shutdown point | ANN-LM | 32 sections × 3 outputs = 96 | all dataset × 3 outputs = 3
Shutdown point | ANN-BP | 32 sections × 3 outputs = 96 | all dataset × 3 outputs = 3
TOTAL | | 576 models (clustered data) | 18 models (all dataset); 594 models overall
Each model was trained ten times, because the initial weights of the neural network and the division of training and validation data are random, following a Gaussian probability density function. In total, 5760 neural networks were created from the clustered data, of which 2880 use the LM algorithm and 2880 the BP algorithm. The pseudocode (Algorithm 1) summarizes the entire modeling process. The mean squared error (MSE) and the R between target and estimated values were used as quality metrics of the models. MSE is defined as

MSE = (1/n) Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²,

where n is the number of samples, and yᵢ and ŷᵢ are the target and model-estimated values, respectively.
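The MSE metric can be sketched as:

```python
def mse(targets, estimates):
    """Mean squared error between target values and model estimates."""
    n = len(targets)
    return sum((t - e) ** 2 for t, e in zip(targets, estimates)) / n

# toy example: two of three estimates off by 0.5 -> MSE = 0.5/3
print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.5]))  # ~0.1667
```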

3.3. Parameter Learning for ANN Models

Empirical attempts were made to define the number of neurons in the hidden layer and the transfer functions in the hidden and output layers. Trials considering 2, 4, 8, 16, 32, 64, and 128 neurons in the hidden layer, with alternating transfer functions, resulted in a variation of only 0.5% in the training, validation, and testing MSE. Therefore, it was decided to generate simpler models according to the parameters explained in Table 4.
Table 4

Artificial neural network (ANN) model details.

Parameter | Value | Justification
Number of hidden layers | 1 | Empirical attempts.
Number of neurons in the hidden layer | 2 | Empirical attempts.
Transfer function in the hidden layer | Symmetric sigmoid | Empirical attempts.
Transfer function in the output layer | Linear | Empirical attempts.
Learning algorithm | LM | To build models faster, because this algorithm uses an approximation of Newton's method involving a matrix of second-order derivatives and a first-order derivative matrix (the Jacobian); on the other hand, it uses more memory to calculate the optimal weights [76,77].
Learning algorithm | BP | To create models based on the most traditional learning algorithm, gradient descent. It is slower than LM, but uses less memory [78,79].
The models were generated using MATLAB® version R2018a (The MathWorks Inc., Natick, MA, USA) on a computer with an Intel® Core™ i7-3537U CPU at 2.00 GHz, 8 GB of RAM, and a solid-state drive (SSD).
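For illustration, the architecture in Table 4 (one hidden layer with two symmetric-sigmoid neurons, i.e., tanh, and a linear output) amounts to the following forward pass. The weights shown are arbitrary placeholders, not trained values; the authors' actual models were built in MATLAB:

```python
from math import tanh

def mlp_forward(x, W1, b1, w2, b2):
    """Forward pass of a 1-hidden-layer MLP: tanh hidden units, linear output."""
    hidden = [tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
              for row, bi in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

# 3 standardized inputs -> 2 hidden neurons -> 1 output (placeholder weights)
W1 = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]
b1 = [0.0, 0.1]
w2 = [1.2, -0.7]
b2 = 0.05
print(mlp_forward([0.0, 0.0, 0.0], W1, b1, w2, b2))
```

In practice the weights and biases would be fitted by the LM or BP algorithms on the standardized, clustered data.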

4. Results and Discussion

This section presents and discusses the results of the experiments. Figure 11 shows the time spent on each set of experiments, by lifespan division and training algorithm. Since there were 32 different sections, three different outputs, and ten runs, each point represents the training of 960 different models. All experiments together took over two and a half hours, with the LM algorithm almost twice as fast as BP.
Figure 11

Time spent on ANN- Levenberg–Marquardt (LM) and ANN-back propagation (BP) experiments.

Figure 12 exemplifies the evolution of the training, validation, and testing of the neural network creation process for the TMP output, considering starting point data. It can be verified that LM converges faster and is more accurate than BP. The same behavior was identified for the other outputs and lifespan divisions.
Figure 12

Examples of the evolution of training, validation, and testing of the neural network creation process for the TMP output: (a) LM algorithm; and (b) BP algorithm.

Since the reduction pot always operates under closed-loop control, the available data are closed-loop data; in other words, the variable estimates made by the soft sensors are closed-loop estimates. Thus, the estimates obtained show bias deviations and an inherent error in the frequency domain [72,73,74,75,76]. Since the reduction pot cannot operate in open loop, these errors are inherent in the estimates obtained, but the estimates remain sufficiently useful for control [73,76]. It is therefore possible that the data are affected by changes in the controller transfer function. Figure 13 shows the MSE and R values for 2880 models considering all pots in the starting, stationary, and shutdown phases, ANN-LM, the three output variables, and normalized data. Most models present low MSE values and high R values (the blue line is the average), demonstrating that the described modeling strategy worked properly.
Figure 13

Mean squared error (MSE) and R values of ANN-LM based models considering the 2880 models: (a) MSE for starting point; (b) R for starting point; (c) MSE for stationary regime; (d) R for stationary regime; (e) MSE for shutdown point; and (f) R for shutdown point.

Figure 14 shows MSE and R values for the other 2880 models, considering all the characteristics and pots mentioned above but the ANN-BP training algorithm. On average, MSE values were larger and R values lower than those of ANN-LM, with considerably greater variance; the high variance within each section is noteworthy.
Figure 14

MSE and R values of ANN-BP-based models considering the 2880 models: (a) MSE for starting point; (b) R for starting point; (c) MSE for stationary regime; (d) R for stationary regime; (e) MSE for shutdown point; and (f) R for shutdown point.

Figure 15 shows MSE and R values for the models created from all data, for both ANN-LM and ANN-BP. On average, these models show higher MSE and lower R than the previous models.
Figure 15

MSE and R values of ANN-LM- and ANN-BP-based models considering models created by all data: (a) MSE for ANN-LM; (b) R for ANN-LM; (c) MSE for ANN-BP; and (d) R for ANN-BP.

Table 5 outlines the global average (avg) and standard deviation (std) of MSE and R, as well as the minimum and maximum MSE and R values, over all 5760 models. The LM algorithm generates more accurate models in all cases; the quality of the estimation is much better with LM, as can be seen from the high avg and std values of BP.
Table 5

Compendium of MSE and R global values considering all models.

| Lifespan Division | Algorithm | Output | MSE avg (std) | R avg (std) | MSE min; max | R min; max |
|---|---|---|---|---|---|---|
| Starting point | LM | TMP | 0.182 (0.001) | 0.903 (0.0006) | 0.031; 0.639 | 0.623; 0.986 |
| Starting point | LM | ALF | 0.124 (0.002) | 0.935 (0.0009) | 0.015; 0.899 | 0.568; 0.993 |
| Starting point | LM | NME | 0.110 (0.0008) | 0.927 (0.0005) | 0.001; 0.496 | 0.727; 0.997 |
| Starting point | BP | TMP | 31.833 (13.102) | 0.618 (0.013) | 0.053; 424.58 | 2.5 × 10−5; 0.973 |
| Starting point | BP | ALF | 28.133 (22.021) | 0.675 (0.017) | 0.029; 460.52 | 0.0002; 0.988 |
| Starting point | BP | NME | 69.322 (23.053) | 0.333 (0.011) | 0.005; 668.16 | 8.6 × 10−6; 0.971 |
| Stationary regime | LM | TMP | 0.196 (0.0001) | 0.896 (8.5 × 10−5) | 0.093; 0.326 | 0.821; 0.952 |
| Stationary regime | LM | ALF | 0.105 (5.5 × 10−5) | 0.945 (3.0 × 10−5) | 0.041; 0.205 | 0.891; 0.979 |
| Stationary regime | LM | NME | 0.129 (7.9 × 10−5) | 0.932 (3.6 × 10−5) | 0.002; 0.299 | 0.839; 0.982 |
| Stationary regime | BP | TMP | 12.45 (12.84) | 0.731 (0.042) | 0.109; 310.31 | 0.0002; 0.943 |
| Stationary regime | BP | ALF | 4.84 (11.96) | 0.817 (0.041) | 0.057; 234.28 | 0.0005; 0.970 |
| Stationary regime | BP | NME | 41.15 (39.82) | 0.526 (0.015) | 0.015; 946.94 | 7.7 × 10−5; 0.972 |
| Shutdown point | LM | TMP | 0.213 (0.0004) | 0.886 (0.0003) | 0.018; 0.503 | 0.705; 0.991 |
| Shutdown point | LM | ALF | 0.112 (0.0003) | 0.941 (0.0001) | 0.010; 0.283 | 0.850; 0.996 |
| Shutdown point | LM | NME | 0.184 (0.0003) | 0.897 (0.0001) | 0.001; 0.462 | 0.742; 0.998 |
| Shutdown point | BP | TMP | 11.36 (17.93) | 0.730 (0.033) | 0.047; 342.54 | 0.0008; 0.976 |
| Shutdown point | BP | ALF | 14.34 (27.38) | 0.742 (0.025) | 0.017; 634.69 | 5.1 × 10−5; 0.991 |
| Shutdown point | BP | NME | 11.36 (17.93) | 0.581 (0.015) | 0.006; 725.00 | 2.3 × 10−5; 0.990 |
| All data | LM | TMP | 0.80 (0.25) | 0.70 (0.26) | 0.241; 0.990 | 0.061; 0.890 |
| All data | LM | ALF | 0.83 (0.15) | 0.82 (0.03) | 0.534; 0.945 | 0.772; 0.909 |
| All data | LM | NME | 0.50 (0.32) | 0.83 (0.08) | 0.131; 0.969 | 0.730; 0.932 |
| All data | BP | TMP | 1.07 (0.04) | 0.30 (0.18) | 1.020; 1.160 | 0.084; 0.585 |
| All data | BP | ALF | 0.88 (0.08) | 0.79 (0.06) | 0.756; 0.996 | 0.612; 0.833 |
| All data | BP | NME | 2.75 (0.23) | 0.30 (0.22) | 2.359; 3.252 | 0.061; 0.649 |
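Each cell of Table 5 aggregates the per-model test errors of one (lifespan division, algorithm, output) combination into avg, std, min, and max. A minimal sketch with hypothetical per-model MSE values, not the actual experiment results:

```python
import numpy as np

# Hypothetical MSE values for the models in one Table 5 cell.
mses = np.array([0.12, 0.18, 0.09, 0.25, 0.15])

summary = {
    "avg": float(mses.mean()),  # global average over the cell's models
    "std": float(mses.std()),   # spread across models
    "min": float(mses.min()),
    "max": float(mses.max()),
}
```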
Comparative graphs between target values and values estimated by the models were generated after creating the estimating models and selecting the best ones. Since there were 32 models for three different lifespan divisions, models based on all data, three outputs (TMP, ALF, and NME), and two ANN learning algorithms, a single pot (pot 5) was selected to visualize this similarity. Figure 16 displays the comparisons for ANN-LM-based models considering non-standardized data. The models based on lifespan division (red line) estimate the dynamics of the process very well for all output variables, whereas the models based on all data (green line) did not learn to estimate the values, especially for the ALF output. The respective MSE and R values are shown next to the graphs.
Figure 16

Comparison between target and estimated values for ANN-LM-based models and by clustered and all data: (a) starting point; (b) stationary regime; and (c) shutdown point.

Figure 17 shows the comparisons for ANN-BP-based models. The estimated values also follow the target values, but the accuracy is lower than that of the ANN-LM-based models for most variables. The models based on all data again failed to learn with the neural network parameters cited above.
Figure 17

Comparison between target and estimated values for ANN-BP-based models and by lifespan division: (a) starting point; (b) stationary regime; and (c) shutdown point.

Table 6 displays the MSE and R values for the comparisons between target and estimated values for the ANN-LM- and ANN-BP-based models, by clustered and all data, plotted in Figure 16 and Figure 17. It confirms the advantage of the proposed method. It is important to note that the data used for these comparisons were not used in the neural network creation process.
Table 6

MSE and R values by training algorithm, lifespan division, and data type.

| Algorithm | Lifespan Division | Data Type | MSE (TMP / ALF / NME) | R (TMP / ALF / NME) |
|---|---|---|---|---|
| LM | Starting point | Clustered | 9.939 / 0.083 / 0.014 | 0.977 / 0.996 / 0.999 |
| LM | Starting point | All data | 73.18 / 5.39 / 0.54 | 0.809 / 0.867 / 0.913 |
| LM | Stationary regime | Clustered | 14.37 / 0.179 / 0.007 | 0.941 / 0.989 / 0.999 |
| LM | Stationary regime | All data | 53.12 / 6.92 / 1.00 | 0.874 / 0.733 / 0.905 |
| LM | Shutdown point | Clustered | 15.669 / 0.1652 / 0.018 | 0.940 / 0.991 / 0.998 |
| LM | Shutdown point | All data | 48.58 / 6.92 / 0.83 | 0.888 / 0.757 / 0.839 |
| BP | Starting point | Clustered | 10.96 / 0.077 / 0.012 | 0.975 / 0.996 / 0.999 |
| BP | Starting point | All data | 139.13 / 5.19 / 3.17 | −0.760 / 0.779 / 0.818 |
| BP | Stationary regime | Clustered | 14.06 / 0.177 / 0.010 | 0.942 / 0.989 / 0.999 |
| BP | Stationary regime | All data | 141.94 / 6.57 / 3.51 | −0.663 / 0.782 / 0.775 |
| BP | Shutdown point | Clustered | 16.624 / 0.158 / 0.020 | 0.935 / 0.992 / 0.998 |
| BP | Shutdown point | All data | 137.31 / 6.60 / 3.53 | −0.542 / 0.863 / 0.831 |
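Using the ANN-LM TMP MSE values reported in Table 6, the advantage of clustering can be quantified as the ratio between the all-data error and the clustered error in each lifespan division:

```python
# TMP MSE values taken from Table 6 (ANN-LM rows).
clustered = {"starting": 9.939, "stationary": 14.37, "shutdown": 15.669}
all_data = {"starting": 73.18, "stationary": 53.12, "shutdown": 48.58}

# How many times larger the all-data error is than the clustered error.
improvement = {k: all_data[k] / clustered[k] for k in clustered}
```

The clustered models reduce the TMP error by a factor of roughly three to seven, depending on the lifespan division.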
A further evaluation of the results was performed by analyzing the residual plots in all phases, considering the best cluster-based model. Figure 18 shows that most TMP residuals lie between −5 °C and 5 °C, most ALF residuals between −1% and 1%, and NME residuals between −0.5 cm and 0.5 cm. These error ranges are perfectly acceptable to the process engineers. The red lines display the std ranges.
Figure 18

Residual plots: (a) starting point; (b) stationary regime; and (c) shutdown point.
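The residual analysis of Figure 18 amounts to computing target minus estimate and marking std bands around zero. A sketch with synthetic values standing in for the TMP series (the real series come from the plant data):

```python
import numpy as np

# Synthetic stand-ins for target temperature and soft-sensor estimate (°C).
rng = np.random.default_rng(1)
target = 960.0 + rng.normal(0.0, 2.0, 500)
estimate = target + rng.normal(0.0, 1.5, 500)

resid = target - estimate                     # residual series
std = resid.std()                             # width of the std band ("red lines")
inside = np.mean(np.abs(resid) <= 2.0 * std)  # fraction within +/- 2 std
```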

5. Conclusions

In this work, an innovative approach to creating soft sensors that estimate the TMP, ALF, and NME variables of primary Al production was presented. After testing different neural network topologies with two different training algorithms, training and testing 5940 different models, the best model for each output variable was selected. These models have high generalization power and very small errors that are fully tolerated by the process engineers. In all cases, models based on section clustering and lifespan division produced more accurate estimates than models without clustering. LM created more accurate neural networks than the BP algorithm and was also faster to train.

TMP, ALF, and NME are the most important variables for controlling the proper functioning of the pots. Clustering the dataset by lifespan and section produced models specialized in the behavior of the respective clusters of pots, reducing errors and increasing the precision of the estimating soft sensors. ANNs were chosen because they can generate models with high generalization power and can learn the nonlinearity of the process from experimental plant data. MATLAB® was used to develop the models, but a computer system will be created to integrate the soft sensors with data acquired in real time, enabling engineers to estimate the behavior of the pots virtually rather than through manual or laboratory measurements. It is also planned to use these soft sensors to control the pots.
References:  7 in total

1.  Training feedforward networks with the Marquardt algorithm.

Authors:  M T Hagan; M B Menhaj
Journal:  IEEE Trans Neural Netw       Date:  1994

2.  Soft sensor modeling of chemical process based on self-organizing recurrent interval type-2 fuzzy neural network.

Authors:  Taoyan Zhao; Ping Li; Jiangtao Cao
Journal:  ISA Trans       Date:  2018-10-12       Impact factor: 5.468

3.  Estimation of fungal biomass using multiphase artificial neural network based dynamic soft sensor.

Authors:  Chitra Murugan; Pappa Natarajan
Journal:  J Microbiol Methods       Date:  2019-02-05       Impact factor: 2.363

4.  Development of soft sensor for neural network based control of distillation column.

Authors:  Asha Rani; Vijander Singh; J R P Gupta
Journal:  ISA Trans       Date:  2013-01-30       Impact factor: 5.468

5.  Feed-Forward Neural Network Prediction of the Mechanical Properties of Sandcrete Materials.

Authors:  Panagiotis G Asteris; Panayiotis C Roussis; Maria G Douvika
Journal:  Sensors (Basel)       Date:  2017-06-09       Impact factor: 3.576

6.  Back-propagation neural network-based reconstruction algorithm for diffuse optical tomography.

Authors:  Jinchao Feng; Qiuwan Sun; Zhe Li; Zhonghua Sun; Kebin Jia
Journal:  J Biomed Opt       Date:  2018-12       Impact factor: 3.170

7.  Levenberg-Marquardt Neural Network Algorithm for Degree of Arteriovenous Fistula Stenosis Classification Using a Dual Optical Photoplethysmography Sensor.

Authors:  Yi-Chun Du; Alphin Stephanus
Journal:  Sensors (Basel)       Date:  2018-07-17       Impact factor: 3.576

Cited by:  1 in total

1.  Industrial Semi-Supervised Dynamic Soft-Sensor Modeling Approach Based on Deep Relevant Representation Learning.

Authors:  Jean Mário Moreira de Lima; Fábio Meneghetti Ugulino de Araújo
Journal:  Sensors (Basel)       Date:  2021-05-14       Impact factor: 3.576

