Francesco Miniati, Gianluca Gregori.
Transport processes ruled by complex micro-physics and impractical to theoretical investigation may exhibit emergent behavior describable by mathematical expressions. Such information, while implicitly contained in the results of microscopic-scale numerical simulations close to first principles or experiments is not in a form suitable for macroscopic modelling. Here we present a machine learning approach that leverages such information to deploy micro-physics informed transport flux representations applicable to a continuum mechanics description. One issue with deep neural networks, arguably providing the most generic of such representations, is their noisiness which is shown to break the performance of numerical schemes. The matter is addressed and a methodology suitable for schemes characterised by second order convergence rate is presented. The capability of the methodology is demonstrated through an idealized study of the long standing problem of heat flux suppression relevant to fusion and cosmic plasmas. Symbolic representations, although potentially less generic, are straightforward to use in numerical schemes and theoretical analysis, and can be even more accurate as shown by the application to the same problem of an advanced symbolic regression tool. These results are a promising initial step to filling the gap between micro and macro in this important area of modeling.Entities:
Year: 2022 PMID: 35810177 PMCID: PMC9271097 DOI: 10.1038/s41598-022-15416-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Taylor expansion's error: residual error from Taylor expansions of order 0, 1, and 2 (as labelled in the legend) of the analytic flux function q in Eq. (13) (red) and of its MLP representation (blue). Each panel corresponds to variations of one individual thermodynamic variable, with the shaded regions spanning the range between the absolute value of the mean (lower boundary) and one standard deviation (upper boundary), for a sample of 30 randomly chosen expansion points.
Figure 2. Smoothness test: plot of the scaled residual of a first-order Taylor expansion with respect to each thermodynamic variable (no sample averaging over the expansion point was computed). The open grey pentagons represent the three overlapping expansions for the analytic flux function q in Eq. (13). The blue, red and olive symbols (circles with dashed lines) show the expansions with respect to each of the three thermodynamic variables, respectively, for the MLP representation. The black solid line is the parabolic curve that all scaled residual errors are expected to follow; this expectation is fulfilled by the analytic case, but by the MLP representation only over a very short interval.
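The parabolic expectation follows directly from Taylor's theorem: for a smooth flux q, the residual of a first-order expansion about a point x₀ scales quadratically with the step h, which is exactly what a noisy MLP representation fails to reproduce beyond the smallest steps:

```latex
q(x_0 + h) = q(x_0) + q'(x_0)\,h + R_1(h),
\qquad
R_1(h) = \tfrac{1}{2}\, q''(\xi)\, h^2 = \mathcal{O}(h^2),
\quad \xi \in (x_0,\, x_0 + h).
```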
Figure 3. Trainable MLP representing the flux-gradient function. The first layer embeds the input features via Random Fourier Features (RFFs), which are then fed to the first hidden layer. The number of RFFs equals the number of hidden units, which is constant across the hidden layers. Nonlinearity is introduced by applying a ReLU activation function to the affine mapping returned by the hidden units. The RFF embeddings are also fed, through skip connections, to every other hidden layer except the last. The output layer consists of as many regression units as there are gradient components, with no activation function.
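A minimal PyTorch sketch of this architecture follows. The layer count, width, input/output dimensions and RFF scale `sigma` are illustrative placeholders (the paper's values come from the hyperparameter search reported below), not the authors' implementation.

```python
import math
import torch
import torch.nn as nn

class RFFMLP(nn.Module):
    """MLP of Figure 3: Random Fourier Feature (RFF) input embedding,
    constant-width ReLU hidden layers, RFF skip connections into every
    hidden layer except the last, and a linear output layer."""

    def __init__(self, in_dim=3, out_dim=3, width=512, n_hidden=5, sigma=0.7):
        super().__init__()
        # Fixed Gaussian projection; cos/sin together give `width` RFFs,
        # matching the number of hidden units as stated in the caption.
        self.register_buffer("B", torch.randn(in_dim, width // 2) * sigma)
        self.first = nn.Linear(width, width)               # takes the RFFs
        self.middle = nn.ModuleList(                       # RFF skip inputs
            [nn.Linear(2 * width, width) for _ in range(n_hidden - 2)]
        )
        self.last = nn.Linear(width, width)                # no skip connection
        self.out = nn.Linear(width, out_dim)               # one unit/component

    def rff(self, x):
        z = 2.0 * math.pi * (x @ self.B)
        return torch.cat([torch.cos(z), torch.sin(z)], dim=-1)

    def forward(self, x):
        e = self.rff(x)
        h = torch.relu(self.first(e))
        for layer in self.middle:
            h = torch.relu(layer(torch.cat([h, e], dim=-1)))
        h = torch.relu(self.last(h))
        return self.out(h)                                 # linear output
```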
Grid of parameter space: range and spacing of the grids of plasma parameter values at which the heat flux function (top section) and its gradient (bottom three sections) are evaluated, at different sampling densities (points per decade).
| Parameter | Min | Max | Spacing |
|---|---|---|---|
| – | – | – | Uniform |
| – | 1 | 10 | Uniform |
| – | – | 10 | Uniform |
| Suppression factor | – | 1.0 | – |
| – | – | – | Uniform |
| – | 1.8 | 9.2 | Uniform |
| – | – | 9.5 | Uniform |
| Suppression factor | – | 0.9 | – |
| – | – | – | Uniform |
| – | 2.4 | 8.6 | Uniform |
| – | – | 9.1 | Uniform |
| Suppression factor | – | 0.8 | – |
| – | – | – | Uniform |
| – | 3.3 | 7.8 | Uniform |
| – | 1.6 | 8.5 | Uniform |
| Suppression factor | 0.02 | 0.6 | – |
In each section of the table, the last line shows the range of values of the heat-flux suppression factor. The temperature gradient length is held fixed.
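For concreteness, a minimal sketch of how a log-uniform grid with a given number of points per decade can be generated; the ranges and the three-variable setup below are placeholder assumptions, not the paper's values.

```python
import numpy as np

def log_grid(vmin, vmax, ppd):
    """Uniform-in-log grid with `ppd` points per decade over [vmin, vmax]."""
    n = int(np.floor(ppd * np.log10(vmax / vmin))) + 1
    return np.logspace(np.log10(vmin), np.log10(vmax), n)

# Example: a 3-D plasma-parameter grid at 10 points per decade; the outer
# product of the 1-D grids gives the flux-evaluation points.
grids = [log_grid(1.0, 10.0, ppd=10) for _ in range(3)]
points = np.stack(np.meshgrid(*grids, indexing="ij"), axis=-1).reshape(-1, 3)
```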
Datasets: the columns give the dataset name, the percentage of random relative noise added to the flux function, the sampling density (points-per-decade parameter), the total number of buffer grid points, and the total number of flux function evaluations.
| Name | Noise (%) | Points/decade | Buffer points | Flux evals | Gradient total | Training | Eval. | Test |
|---|---|---|---|---|---|---|---|---|
| A.0 | 0 | 20 | 4 | 46,464 | 32,000 | 21,760 | 5440 | 4800 |
| A.1 | 1 | 20 | 4 | 46,464 | 32,000 | 21,760 | 5440 | 4800 |
| A.5 | 5 | 20 | 4 | 46,464 | 32,000 | 21,760 | 5440 | 4800 |
| A.10 | 10 | 20 | 4 | 46,464 | 32,000 | 21,760 | 5440 | 4800 |
| A.20 | 20 | 20 | 4 | 46,464 | 32,000 | 21,760 | 5440 | 4800 |
| B.0 | 0 | 10 | 4 | 8064 | 4000 | 2720 | 680 | 600 |
| B.1 | 1 | 10 | 4 | 8064 | 4000 | 2720 | 680 | 600 |
| B.5 | 5 | 10 | 4 | 8064 | 4000 | 2720 | 680 | 600 |
| B.10 | 10 | 10 | 4 | 8064 | 4000 | 2720 | 680 | 600 |
| B.20 | 20 | 10 | 4 | 8064 | 4000 | 2720 | 680 | 600 |
| C.0 | 0 | 5 | 4 | 1764 | 500 | 340 | 85 | 75 |
| C.1 | 1 | 5 | 4 | 1764 | 500 | 340 | 85 | 75 |
| C.5 | 5 | 5 | 4 | 1764 | 500 | 340 | 85 | 75 |
| C.10 | 10 | 5 | 4 | 1764 | 500 | 340 | 85 | 75 |
| C.20 | 20 | 5 | 4 | 1764 | 500 | 340 | 85 | 75 |
The last four columns refer to the gradient-function datasets (three components each), giving the total number of evaluations and their partition into training (68%), evaluation (17%) and test (15%) sets, respectively.
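A minimal sketch of the dataset preparation implied by the table: multiplicative relative noise added to the flux values, followed by a 68/17/15 split. The uniform noise model and the helper names are assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise_and_split(q, noise_pct, fractions=(0.68, 0.17, 0.15)):
    """Add `noise_pct` percent random relative noise to flux values `q`,
    then split the indices into training/evaluation/test sets."""
    q_noisy = q * (1.0 + (noise_pct / 100.0) * rng.uniform(-1.0, 1.0, q.shape))
    idx = rng.permutation(len(q))
    n_train = int(fractions[0] * len(q))   # e.g. 21,760 of 32,000
    n_eval = int(fractions[1] * len(q))    # e.g. 5440 of 32,000
    train, evaln, test = (idx[:n_train],
                          idx[n_train:n_train + n_eval],
                          idx[n_train + n_eval:])
    return q_noisy, train, evaln, test
```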
Figure 4. Single regularization result: regularization results for the dataset B.10. Top panels: histograms of the relative error distribution of the regularised and unregularised flux function (left) and of each gradient component (next three panels), respectively (see legend for details). The blue shaded regions correspond to the regularised error distribution expanded by a factor of 10. Bottom panels: the corresponding cumulative error distributions for the histograms in the top panels, in particular for the regularised and unregularised flux function (blue and red) and for the individual gradient components (green and olive), respectively.
Figure 5. Regularization results: for each parameter combination, the blue and red curves show the cumulative distributions of the relative error of the flux function, while the green and olive curves show the cumulative distributions of the flux-gradient relative error, estimated as the ratio of the Euclidean norm of the flux log-gradient error to the Euclidean norm of the correct flux log-gradient (see the main text for the definitions of log-gradient and Euclidean norm).
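The regularization step referenced in Figures 4 and 5 can be illustrated with a generic 1-D Tikhonov smoother; the second-difference penalty and the scalar `lam` below are assumptions, a sketch of the idea rather than the paper's exact multi-dimensional formulation.

```python
import numpy as np

def tikhonov_smooth(f_noisy, lam):
    """Denoise samples on a uniform 1-D grid by minimising
    ||u - f||^2 + lam * ||D2 u||^2, with D2 the second-difference operator."""
    n = len(f_noisy)
    D2 = np.diff(np.eye(n), n=2, axis=0)   # (n-2, n) second differences
    A = np.eye(n) + lam * D2.T @ D2        # normal equations of the minimiser
    return np.linalg.solve(A, f_noisy)
```

Larger `lam` trades fidelity to the noisy samples for smoothness; Figure 5 scans exactly this kind of trade-off across parameter combinations.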
Reduced search space for the final tuning of the hyperparameters characterising the MLP model.
| Hyperparameter | Search space | Space type |
|---|---|---|
| Number of hidden layers | {4, 5, 6} | exhaustive |
| Number of hidden units | {128, 256, 512, 1024} | exhaustive |
| Learning rate | – | log-uniform sampling |
| RFF parameter | [0.1, 5.0] | log-uniform sampling |
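A sketch of how such a mixed exhaustive/log-uniform search can be driven; the learning-rate bounds and the number of random draws per grid cell are placeholder assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def sample_configs(n_random=20, lr_lo=1e-5, lr_hi=1e-2):
    """Exhaustive grid over layers/units, crossed with log-uniform draws
    of the learning rate and of the RFF parameter in [0.1, 5.0]."""
    for layers, units in itertools.product([4, 5, 6], [128, 256, 512, 1024]):
        for _ in range(n_random):
            lr = 10 ** rng.uniform(np.log10(lr_lo), np.log10(lr_hi))
            sigma = 10 ** rng.uniform(np.log10(0.1), np.log10(5.0))
            yield {"layers": layers, "units": units,
                   "learning_rate": lr, "rff_sigma": sigma}
```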
Best model selection: from left to right, the columns give the model name, the name of the training set as listed in Table 2, the learning rate, the number of hidden layers and of hidden units, the parameter used to generate the Random Fourier Feature embeddings, the regularization type and parameter, and finally the RMS and Max statistics of the evaluation errors.
| Model | Dataset | Learning rate | Layers | Units | RFF param. | Reg. type | Reg. param. | RMS error | Max error |
|---|---|---|---|---|---|---|---|---|---|
| MA.0 | A.0 | – | 4 | 1024 | 0.963 | L1 | – | – | – |
| MA.1 | A.1 | – | 4 | 512 | 0.733 | – | – | – | – |
| MA.5 | A.5 | – | 5 | 256 | 0.601 | L1 | – | – | – |
| MA.10 | A.10 | – | 6 | 512 | 0.663 | – | – | – | – |
| MA.20 | A.20 | – | 6 | 1024 | 1.723 | L1 | – | – | – |
| MB.0 | B.0 | – | 4 | 512 | 0.790 | L2 | – | – | – |
| MB.1 | B.1 | – | 6 | 512 | 0.739 | – | – | – | – |
| MB.5 | B.5 | – | 5 | 512 | 0.564 | L2 | – | – | – |
| MB.10 | B.10 | – | 4 | 512 | 0.965 | L2 | – | – | – |
| MB.20 | B.20 | – | 6 | 256 | 0.844 | L1 | – | – | – |
| MC.0 | C.0 | – | 5 | 1024 | 0.444 | L1 | – | – | – |
| MC.1 | C.1 | – | 5 | 128 | 0.515 | – | – | – | – |
| MC.5 | C.5 | – | 6 | 256 | 0.545 | – | – | – | – |
| MC.10 | C.10 | – | 5 | 128 | 0.483 | L1 | – | – | – |
| MC.20 | C.20 | – | 5 | 128 | 0.547 | L2 | – | – | – |
The table is divided into three subtables, one for each value of the sampling-density parameter characterising the training data.
Figure 6. MLP model test errors: histograms of the test errors of the MLP models in Table 3, each rescaled to peak at 1. Each row corresponds to a different sampling density (and corresponding parameter-range domain, PR), while each column corresponds to a different value of the noise added to the flux dataset before applying Tikhonov's regularization. The errors are computed with respect to the noiseless test sets, i.e. the A.0, B.0 and C.0 test sets for models of the A-, B- and C-series, respectively. The legend shows the mean and standard deviation of the histogram in each panel. The light-blue shapes in some panels correspond to histograms of the errors magnified by a factor of 10.
Figure 7. Test-error statistics: RMS (blue dashed line), Max (red dashed line) and Bias (yellow thin dashed line) statistics of the model prediction errors, and their trend with the pre-regularization noise, for the different sampling-density cases. From top to bottom, the rows correspond to errors computed using the noiseless test sets of the C-, B- and A-series, respectively, while from left to right the columns correspond to models trained with datasets sampled at 20, 10 and 5 points per decade, respectively. The half-filled points threaded by the black dashed line correspond to the case of relative error equal to input noise (the identity line, which would be diagonal in a linear plot).
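For reference, the three statistics plotted here, computed on relative errors (a sketch; the exact definitions used in the paper may differ):

```python
import numpy as np

def error_stats(pred, truth):
    """RMS, Max and Bias of the relative prediction error, matching the
    three statistics of Figure 7 (assumes nonzero truth values)."""
    rel = (pred - truth) / truth
    return {"RMS": float(np.sqrt(np.mean(rel**2))),
            "Max": float(np.max(np.abs(rel))),
            "Bias": float(np.mean(rel))}
```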
Figure 8. Trend with data sampling density: the RMS (left), Max (centre) and Mean (right) test-error statistics presented in the top three panels of Fig. 7, relative to the PR-5 domain, plotted as a function of the sampling density. Different symbols correspond to models trained with data characterised by different pre-regularization noise (see the figure's legend). The error statistics roughly decrease as the inverse of the sampling-density parameter (black dashed line).
Figure 9. Convergence test: error norms for the implementation using an MLP model of the flux gradient (blue and red for the two norms, respectively) and for the implementation using an analytic expression instead (grey and cyan, respectively, with the cyan curve multiplied by 1.15 to make it visible). The plotted errors are averages over a sample of 30 runs using different, randomly chosen, unperturbed values of the thermodynamic parameters. Models trained with data characterised by different pre-regularization noise are represented by different symbols (see legend), though they are difficult to distinguish as their error data points mostly overlap. The error norm of the initial conditions is also shown (olive). The expected error drop rate for a second-order accurate scheme is indicated by the black dashed line.
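The convergence rate is read off such a plot by comparing error norms at successive resolutions: halving the grid spacing h should reduce the error of a second-order scheme by a factor of four, i.e.

```latex
p \;\approx\; \log_{2}\!\frac{\lVert e_{h} \rVert}{\lVert e_{h/2} \rVert}
\;\longrightarrow\; 2
\qquad \text{for a second-order accurate scheme as } h \to 0 .
```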
Results from the symbolic regression analysis.
| Dataset | Symbolic expression | Function RMS error | Function Max error | Gradient RMS error | Gradient Max error |
|---|---|---|---|---|---|
| A.1 | – | – | – | – | – |
| A.5 | – | – | – | – | – |
| A.10 | – | – | – | – | – |
| A.20 | – | – | – | – | – |
From left to right, the columns indicate the name of the dataset from which the training data were sampled, the symbolic expression found for the flux function, and the RMS and Max statistics of the relative error of the function and of its gradient on a random sample of 10 entries.
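For illustration only, and not necessarily the tool used by the authors, the open-source PySR package offers a scikit-learn-style interface for this kind of symbolic search; the data and operator set below are stand-in assumptions.

```python
import numpy as np
from pysr import PySRRegressor

# X: plasma parameters, y: flux values from one of the A-series datasets;
# random stand-ins here for a runnable example.
X = np.random.rand(500, 3)
y = np.log(1.0 + X[:, 0] * X[:, 1] / (1.0 + X[:, 2]))

model = PySRRegressor(
    niterations=100,
    binary_operators=["+", "-", "*", "/"],
    unary_operators=["log", "exp"],
)
model.fit(X, y)        # evolutionary search for a symbolic expression
print(model.sympy())   # best expression found, as a SymPy object
```

Unlike an MLP, the resulting closed-form expression can be differentiated analytically and dropped directly into a numerical scheme, which is the advantage the abstract highlights.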