Literature DB >> 31738778

Randomized methods to characterize large-scale vortical flow networks.

Zhe Bai¹, N Benjamin Erichson², Muralikrishnan Gopalakrishnan Meena³, Kunihiko Taira³, Steven L Brunton⁴.

Abstract

We demonstrate the effective use of randomized methods for linear algebra to perform network-based analysis of complex vortical flows. Network theoretic approaches can reveal the connectivity structures among a set of vortical elements and analyze their collective dynamics. These approaches have recently been generalized to analyze high-dimensional turbulent flows, for which network computations can become prohibitively expensive. In this work, we propose efficient methods to approximate network quantities, such as the leading eigendecomposition of the adjacency matrix, using randomized methods. Specifically, we use the Nyström method to approximate the leading eigenvalues and eigenvectors, achieving significant computational savings and reduced memory requirements. The effectiveness of the proposed technique is demonstrated on two high-dimensional flow fields: two-dimensional flow past an airfoil and two-dimensional turbulence. We find that quasi-uniform column sampling outperforms uniform column sampling, while both feature the same computational complexity.

Entities: Chemical Disease Species

Mesh：

Year: 2019 PMID： 31738778 PMCID： PMC6860431 DOI： 10.1371/journal.pone.0225265

Source DB: PubMed Journal: PLoS One ISSN： 1932-6203 Impact factor: 3.240

Introduction

Fluid dynamics is a rich and challenging field at the intersection of physics and engineering. There is still a tremendous amount that we do not understand about turbulence, yet working fluids are at the heart of nearly every major industry, including health, defense, transportation, and energy. One cannot overestimate the significance and impact that a better understanding of fluid flows would have in our ability to predict and manipulate their behavior in the real world. The challenge of modeling and controlling fluids stems from the fact that they are nonlinear with complex multi-scale interactions over a large range of spatial and temporal scales. Moreover, data from experiments and simulations are generally represented as exceedingly high-dimensional measurements that may obscure underlying patterns. With advances in simulation capabilities and measurement techniques, the volume and quality of such data are rapidly increasing. Despite the complexity of the governing equations and the overwhelming volume of data, it is often observed that flows evolve on a low-dimensional attractor defined by a few dominant coherent structures [1]. There are a wealth of modal decomposition techniques to mine and characterize these structures from experimental data and numerical solvers [2, 3]. The majority of techniques, such as proper orthogonal decomposition (POD) [1, 4] and dynamic mode decomposition [5-7] are fundamentally linear, as are many of the standard techniques in control theory [8]. Linear control has been widely applied for flow control [9-11], for example to stabilize boundary layers. However, many flows are fundamentally nonlinear, limiting the broad use of linear control theory. Low-order nonlinear models may be obtained by Galerkin projection of the Navier Stokes equations onto POD modes, and these models have had great success for uncovering underling mechanisms that drive flows [4]. Regardless, there are issues with Galerkin models, such as instability and mode deformation with changing parameters and boundary conditions [12-14], limiting the success of POD-Galerkin modeling for turbulence. Recently, network theoretic approaches have been increasingly leveraged to analyze complex, fluid flow systems [15-21]. Network science [22] characterizes the structure and dynamics on a graph consisting of nodes and the edges connecting them. It is possible to cast a fluid flow in this context, where each grid cell of vorticity is a node, and the induced velocity from the Biot-Savart interaction between each node establishes the edge connections [15]. In this way, individual packets of vorticity interact and evolve according to physics that is encoded in the graph. Network analysis provides a complementary perspective to classical techniques in fluid dynamics, especially for flow control. There have been many powerful advances in network control theory [22, 23] surrounding multi-agent systems [23, 24] in the past two decades. Multi-agent control is designed to be local, efficient to operate, and scalable to extremely large systems, such as the internet [25, 26] or the electric grid [27]. The multi-agent control and underlying network dynamics may also be strongly nonlinear, and these controllers are built to handle time-delays and communication failures, which are typical limiting factors for robust performance in classical control [8]. In particular, networks are often characterized by a large collection of elements, represented by nodes on a graph, that each execute their own set of local protocols in response to external stimulus. This analogy holds quite well for a number of large networked dynamical systems, including animals flocking [28, 29], multi-robotic cooperative control systems [30], sensor networks [31, 32], biological regulatory networks [33, 34], and the internet [25, 26], to name a few. A long-term goal of characterizing vortical networks is the eventual application of multi-agent control to school turbulence into a beneficial configuration for an engineering advantage. In contrast, past closed-loop flow control efforts have commonly involved applying linear control techniques to a suitable reduced-order model of the fluid [10].

Motivation

The application of network theory in fluid dynamics faces the challenge of an exceedingly large number of degrees of freedom (nodes) for a fully turbulent flow, leading to an even larger adjacency matrix of edges. Indeed, the adjacency matrix for a vortical flow network scales as the square of the number of fluid grid points, quickly becoming intractable to store and analyze using classical techniques from linear algebra. Recent approaches have been developed to manage this complexity, including building a graph on the modes of a flow [35], given by dominant fluid coherent structures from POD, rather than the nodes, given by fluid grid elements. However, there is still interest in developing scalable methods to directly analyze the vortical network in the original ambient measurement space, where nodes are grid elements. Randomized methods for linear algebra provide an emerging alternative to efficiently compute an approximate eigendecomposition of large-scale matrices. These methods work with a reduced representation, a so-called sketch, of the input matrix that captures the essential spectral information. This sketch is obtained either through random sampling or by random projections. The sketch can then be used to efficiently compute a desired low-rank approximation, such as the singular value decomposition. For a comprehensive survey and implementation details see [36-40].

Contributions

In this work, we use randomized methods to analyze large networks arising in fluid mechanics. Specifically we are interested in using randomized methods to obtain a qualitatively correct view of the dominant graph structures, which can then be used for the downstream tasks of sensor and actuator placement, optimization, and control. This application domain is the subject of the present work, where we explore the use of random sampling and randomized linear algebra to generate qualitatively accurate estimates of dominant flow structures for cases where the adjacency matrix is exceedingly large. We compare both uniform and quasi-uniform sampling, and demonstrate this approach on two high-dimensional vortical flow networks. A high-level principle sketch is provided in Fig 1.

Fig 1

Identification of dominant eigenvalues and eigenvectors of the adjacency matrix obtained from a fluid flow field using randomized methods.

Path I randomly samples columns of the adjacency matrix to approximate the leading eigenvalues and eigenvectors; path II uses both column and row sampling to form a sketch. Note, it is not required to explicitly construct the full adjacency matrix.

Identification of dominant eigenvalues and eigenvectors of the adjacency matrix obtained from a fluid flow field using randomized methods.

Vortical interaction network

Recently, network analysis has been employed to represent and analyze vortical interactions in flows [15, 17, 19, 35, 41]. In Nair and Taira [15], spectral sparsification was used to identify a low-dimensional skeleton underlying the high-dimensional, nonlinear fluid dynamics. Taira et al. [17] later demonstrated scale-free behavior of the vortex interaction network for two-dimensional isotropic turbulence. Network analysis has also been used by Meena et al. [19] for community detection and force prediction in an unsteady wake. More recently, interaction networks based on the energy transfer between modes has been used for modeling and control [35, 41]. These studies provide compelling evidence that the emerging network perspective may complement well-established flow modeling techniques. A graph (network) consists of three components: a set of vertices (nodes) , a set of edges that connect these vertices, and the associated edge weights . In the present work, the vortical elements in a flow field represent the nodes of the vortical flow network, following the formulation of Nair and Taira [15]. The interaction among vortical elements can be quantified by the induced velocity. Following the Biot-Savart law, the induced velocity from element i to j in a two-dimensional flow is given by where Γ = ω ΔxΔy is the strength of vortical element i with vorticity ω and grid area ΔxΔy, and d is the Euclidean distance between the vortical elements. This induced velocity represents the edge weights, and may be used to construct an adjacency matrix which mathematically represents the interactions in the vortical network. Since a vortex cannot induce motion upon itself, there are no diagonal entries for the adjacency matrix. Here we have an undirected network with a symmetric adjacency matrix A to enable the use of the algorithm described later. However, it is possible to define an asymmetric adjacency matrix for a directed network. Given the vorticity field of a flow, the corresponding vortical interaction network can be formulated as above. This also suggests that the size of the adjacency matrix scales as where n is the number of grid points or vortical elements in the flow field. For a high-fidelity simulation or experimental measurements of a flow field, n can exceed , making the computation and analysis of the vortical network computationally expensive. Below, we discuss a few network measures that are particularly useful for understanding the interaction-based characteristics of the flow. The current work focuses on computing these measures for large, dense vortical networks in a computationally tractable manner.

Network measures and computations

The adjacency matrix is the basis for a number of useful network measures, such as eigenvector centrality, Katz centrality, and PageRank [22, 42]. The centrality of a network describes the most important or central elements based on measures that quantify the connection strength of the nodes. One of the most basic centrality measures is the node strength, given by s = ∑ A, which measures the overall interaction strength of a node. The eigenvector centrality is another fundamental network property, which also considers the strength of the neighboring elements of a node and is used to identify the most influential nodes in the network [43]. It is computed as the eigenvector of A corresponding to the largest eigenvalue λ: Entries with large absolute value in the vector u correspond to influential nodes. The Katz centrality and PageRank are related and also based on computing the dominant eigenvector of the adjacency matrix. Network analysis is also useful for graph partitioning and clustering [22], where the nodes in the network are grouped into clusters based on their close connectivity. Spectral partitioning or clustering is a common approach based on the dominant eigenvector of the adjacency matrix. Nodes can be divided and sub-divided into groups based on the sign of the corresponding elements in the dominant eigenvectors of A. Spectral partitioning can also be used in community detection algorithms to identify groups of closely interacting nodes, called communities [44]. These algorithms perform a hierarchical grouping of nodes into communities based on maximizing the overall modular measure of the network. This hierarchical grouping can be performed using the dominant eigenvectors of A [45]. In vortical flows, these network-based measures are useful for identifying the most influential vortical elements and clustering them into coherent structures or communities. Due to the importance of the spectral information of the adjacency matrix, considerable effort has gone into developing accurate and efficient algorithms to compute dominant eigenvalues for large adjacency matrices. Provided that the largest eigenvalue is real and distinct, the dominant eigenvector can be efficiently computed using the power method [46]. A symmetric adjacency matrix with nonnegative entries will have purely real eigenvalues. For instance, the power method is commonly used to compute the dominant eigenvector of an adjacency matrix [47], and this method is particularly efficient for large sparse adjacency matrices. The computational costs of the power method for computing the dominant eigenvector scales as .

Random sampling-based matrix approximation

As stated above, we are interested in computing the eigendecomposition of a real symmetric matrix that takes the form where the columns of the orthogonal matrix are eigenvectors and the entries of the diagonal matrix are the corresponding eigenvalues λ1 ≥ ⋯ ≥ λ. More concretely, we are interested in computing only the dominant k eigenvectors and eigenvalues (where k ≪ n) that yield the rank-k approximation where contains only the k columns of U that correspond to the k largest eigenvalues. Here, we use tools from the field of randomized numerical linear algebra [36, 38–40, 48–54] to compute such a low-rank approximation efficiently. Indeed, randomized methods are widely used for computing low-rank approximations [55-61]. Scalability is achieved by forming a sketched representation of the input (adjacency) matrix which extracts the inherent spectral information. A sketch can be constructed by post-multiplying the adjacency matrix with a an arbitrary ‘sketching’ matrix . The random matrix S represents either a random projection or a random sampling process. Projection-based methods construct a ‘sketch’ by forming a set of k randomly weighted linear combinations of the columns of the input (adjacency) matrix [39, 48, 49, 62]. This approach can substantially reduce the computational demands when computing a low-rank approximation. However, projection-based methods generally require at least a single pass over the entire data matrix. Sampling-based methods aim to approximate the low-rank structure of the input (adjacency) matrix from a small random subsample of actual columns or rows of the matrix [63-65]. The sampling process is described in more detail below. In practice, sampling-based methods may bypass the construction of the full adjacency matrix, while sacrificing some accuracy for improved scalability. In what follows, we generate qualitatively accurate estimates of dominant flow structures for cases where the adjacency matrix is exceedingly large. In this problem context, we require a qualitatively correct view of the dominant graph structures, which can be used for the downstream tasks of sensor and actuator placement, optimization, and control. Therefore, we investigate the use of column sampling to compute a sketched singular value decomposition and the Nyström Method to approximate the leading k eigenvalues and eigenvectors. Both techniques have been successfully used to approximate large-scale matrices in other applications [66]. Further, sampling-based methods provide tunable error bounds that are based on the number of sampled columns/rows, making them viable as an alternative to well-established deterministic algorithms. It is also important to contrast randomized sampling approaches to uniform undersampling, which is the most obvious approach to reduce computations. However, uniform sampling has well-known limitations, including systematic aliasing of high-frequency spatial structures. Randomized methods avoid this aliasing, providing both efficiency and accuracy.

Sketched singular value decomposition

Column sampling for low-rank matrix approximation dates back to the pioneering work of Frieze et al. [62]. Instead of computing the full eigendecomposition of A, it is possible to form a rank-k approximation by first sampling l ≥ k columns from the input matrix where the matrix consists of a subset of l columns of A and S describes the corresponding random sampling process. In practice, it is only necessary to form a list J comprised of l column indices There are several sampling strategies to choose good columns, and the approximation error depends on the both the sampling process and the number of columns sampled. In our problem setup, each column of A corresponds to the network interaction of a single grid element in the original fluid flow field with every other grid element. Here, we compare uniform random sampling against (quasi-random) Halton sampling [67, 68]. Halton sampling provides better coverage of the domain, as illustrated in Fig 2, and our empirical results show that this translates into faster convergence of the approximation error with respect to the number of columns sampled, as discussed in the next section.

Fig 2

Examples of two different random sampling approaches based on uniform sampling and Halton random sampling.

Halton sampling provides a better coverage of the domain than uniform sampling does.

Examples of two different random sampling approaches based on uniform sampling and Halton random sampling.

Halton sampling provides a better coverage of the domain than uniform sampling does. If the columns of C are carefully chosen, then C provides a basis for the column space of the adjacency matrix A. Note that here we assume that A is a symmetric matrix. Thus, the dominant k left singular vectors of , provide an approximation for the dominant k eigenvectors of A, i.e., we have . The corresponding eigenvalues are approximated by scaling the singular values of C: The cost of constructing the sketch matrix using uniform or Halton sampling is . Then, the cost of computing the SVD on C is . For parallel computations, it is possible to compute U and Σ by taking the SVD of C C, which requires to be generated and for the corresponding SVD. Because the SVD is the computational engine for snapshot POD, randomized techniques have been used to accelerate this computation as well.

The Nyström method

The Nyström method provides an efficient approach for low-rank matrix approximations in large-scale learning applications. It was initially introduced as a quadrature method for numerical integration to approximate eigenfunction solutions. Recent work has streamlined these computations and provided theoretical foundations for the use of various sampling schemes [69-72]. In addition to column sampling, the Nyström method further constructs a square matrix by selecting the J rows and columns of A: When A is symmetric and positive-semidefinite (SPSD), W is also SPSD. The small matrix W can be used to efficiently compute the dominant eigenvectors and eigenvalues of A. Following [73], we first compute the eigendecomposition: Then, we reconstruct the dominant eigenvalues as and the corresponding eigenvectors The Nyström method has a computational complexity of .

Results

We now demonstrate the use of sampling-based randomized decomposition techniques to characterize the vortical interaction networks for two example flows: the laminar wake flow past a NACA 0012 airfoil with a Gurney flap attached to the trailing edge [74], and a two-dimensional homogeneous decaying isotropic turbulence [17].

Numerical simulations

The first example is the flow past a NACA 0012 airfoil with a chord-based Reynolds number of Re = 1,000, at an angle of attack of 20°, and with a Gurney flap of 10% chord length (c). This setup generates an unsteady 2P wake [74, 75], with periodic shedding of two pairs of positive and negative vortices. This flow is solved using direct numerical simulations (DNS) via the immersed boundary projection method [76, 77], following the computational setup of Gopalakrishnan Meena et al. [74]. The computational domain consists of five nested levels of multidomains with the finest level of size (x/c, y/c) ∈ [−1, 1] × [−1, 1] and the largest being (x/c, y/c) ∈ [−16, 16] × [−16, 16] in size. All the domains have a grid resolution of m × m = 360 × 360. The time step is limited to a maximum Courant-Friedrichs-Lewy (CFL) number of 0.3. Fig 3(a) shows the vorticity field in a sub-domain with a grid resolution of 1261 × 1261 and size 7 × 7, nondimensionalized by the chord length of the airfoil.

Fig 3

Example flow fields.

Example flow fields.

(a) Vorticity field of two-dimensional DNS of the flow over a NACA 0012 airfoil with Gurney flap at an angle of attack of 20° and flap height of 0.1 chord length at Re = 1000; the full illustration can be found in [74]. A subdomain of the original vorticity field is used in this example. (b) Vorticity field of two-dimensional decaying homogeneous isotropic turbulence, initialized by an integral length-scale based Reynolds number of Re(t0) = 814. The second example is two-dimensional decaying homogeneous isotropic turbulence, which is a canonical high-dimensional, mutiscale turbulent flow that exhibits complex nonlinear interaction of vortices over a wide range of length scales [78]. The vorticity field is obtained from a two-dimensional incompressible DNS, solving the vorticity transport equation without forcing [17]. The simulation, based on the Fourier spectral method and the fourth-order Runge-Kutta time integration scheme, is performed on a square biperiodic computational domain with grid resolution of m × m = 1024 × 1024. The vorticity field is initialized with 100 vortices with random strengths, core sizes, and centers such that the kinetic energy spectra satisfies with k0 = 26.5 [79]. For this study, the flow field is initialized by an integral length-scale based Reynolds number of Re(t0) = 814, shown in Fig 3(b). The vorticity snapshots of both flows are used to construct the adjacency matrices of the corresponding vortical networks using Eq 2. The size of the adjacency matrices are of the order and for the airfoil and turbulent flow, respectively, emphasizing the need for dimensionality reduction and computationally tractable analysis tools.

Computational performance

A naive application of network theory in fluid dynamics can be challenging, because the adjacency matrix is often too large to be stored or analyzed using classical techniques from linear algebra. There are two primary challenges associated with large vortical networks: The size of the adjacency matrix, which describes the network edges, scales as the square of the number of fluid grid points; Unlike social graphs, the vortical adjacency matrix is dense. Here, we use randomized methods to accelerate computations based on the vortical adjacency matrix. We sample a small number of columns from the adjacency matrix to compute a sketched SVD or Nyström approximation. Thus, the approximation quality can be controlled via the number of samples used to compute the low-rank approximation. In the context of networks arising in fluid mechanics, it turns out that sampling a small number of columns is sufficient for computing an approximation that is qualitatively useful. This leads to tremendous memory and computational savings, since the costs are proportional to the number of columns that we sample. We run all of our experiments using Amazon AWS, working on a memory optimized instance ‘x1.16xlarge’ that is powered by two Intel Xeon E7 8880 v3 (Haswell) processors with 64 virtual cores and 976 GB fast memory. Our algorithms are implemented in Python powered by MKL accelerated linear algebra routines, and C extensions for constructing the elements of the adjacency matrix. We show in Fig 4 the approximation error as a function of the percentage of columns sampled and the corresponding computational time required.

Fig 4

Computational performance for the flow over an airfoil using two different spatial resolutions.

The results are averaged over 20 runs using different random seeds. The shaded region indicates the variance of approximation error in multiple runs.

Computational performance for the flow over an airfoil using two different spatial resolutions.

The results are averaged over 20 runs using different random seeds. The shaded region indicates the variance of approximation error in multiple runs. Here, we consider two cropped spatial grids of different dimensions from which we construct the adjacency matrix. The error is reported as the acute angle between the true leading eigenvector v and the approximate eigenvector , which provides a useful summary of how closely the two high-dimensional eigenvectors align. Since the algorithms are fundamentally based on random sampling, we average the results over 20 initializations and report the distribution of errors. Halton sampling results in considerably lower error than uniform sampling at the same percentage of columns sampled, which is consistent with the observation that it is sampling the flow domain more efficiently. The results based on Halton sampling also have considerably lower variance, indicating improved numerical robustness. Fig 4a and 4b illustrate how substantially the computational time increases when the spatial resolution of the flow field is increased only slightly. In the 421 × 421 case, only about 5% of the columns need to be sampled for 5% error compared between the approximate and true eigenvectors. In other words, we use only about 5% of the information to compute the low-rank approximation, and this reduces the memory requirements for the example in Fig 4b from about 215GB to roughly 10.5GB. At the same time, the computational time is reduced since it is not necessary to construct the full adjacency matrix. The computational time is reduced further by using the Nyström method, while only sacrificing a small amount of accuracy. For comparison, Table 1 summarizes the computational demands and performance for the deterministic power iteration method. Fig 5 shows similar results for the turbulent flow. Again, we see that Halton sampling is favorable and that only a small fraction of columns is required to yield a good approximation of the leading eigenvector.

Table 1

Summary of computational results for constructing the full adjacency matrix and computing the dominant eigenvector using the deterministic power method.

Here the computational bottleneck is the memory required to construct the adjacency matrix. Note that the power method does not generally require that the full adjacency matrix is constructed explicitly; however, the power method is not efficient because the adjacency matrices we consider are dense.

	Dimensions of adjacency matrix	Time to construct full adjacency matrix	Storage required	Time to compute deterministic eigenvector
airfoil	316² × 316²	197.76 (sec)	74.29 (GB)	60.84 (sec)
airfoil	421² × 421²	715.78 (sec)	214.67 (GB)	197.87 (sec)
turbulent	256² × 256²	158.44 (sec)	32.00 (GB)	27.51 (sec)
turbulent	342² × 342²	272.09 (sec)	101.92 (GB)	85.01 (sec)

Fig 5

Computational performance for the isotropic turbulent flow using two different spatial resolutions.

The results are averaged over 20 runs using different random seeds. The shaded region indicates the variance of approximation error in multiple runs.

Computational performance for the isotropic turbulent flow using two different spatial resolutions.

The results are averaged over 20 runs using different random seeds. The shaded region indicates the variance of approximation error in multiple runs.

Summary of computational results for constructing the full adjacency matrix and computing the dominant eigenvector using the deterministic power method.

Approximate eigenvectors and spectral clustering

Fig 6 shows the leading eigenvectors of the adjacency matrix for the two flow examples, computed using deterministic and randomized methods. By visual inspection, the Nyström method with quasi-random Halton sampling provides an excellent approximation of the leading eigenvector using only 10% of the columns of the adjacency matrix. The resulting eigenvector centrality of the matrix A highlights strong vortex cores in both flows, as shown in Fig 6(b). Vortices with lower strength have less centrality, emphasizing their decreased role in influencing the flow field. The 2P wake structure is clearly visible in the flow past an airfoil, and some shear layer structures are present in the 2D turbulent flow. The irrotational regions in both flows have the lowest centrality measure. These observations complement the traditional fluid dynamics literature regarding highly influential structures in a vortical flow. The approximate eigenvector centrality measures capture both the highly influential vortex cores as well as the smaller, less influential vortices, as seen in both flows. Moreover, the network-based measures take into account the spatial arrangements of the vortices to assess their influence. The error between the approximate and true eigenvectors are shown in Fig 7.

Fig 6

Qualitative comparison of the deterministic (b) and approximate (c) dominant eigenvector of the adjacency matrix generated from the flow field in (a).

Fig 7

Relative approximation error between the deterministic and approximate dominant eigenvector as in Fig 6, distributed in the spatial domain: (a) airfoil wake flow, (b) two-dimensional isotropic turbulent flow.

Qualitative comparison of the deterministic (b) and approximate (c) dominant eigenvector of the adjacency matrix generated from the flow field in (a).

The top row shows the results for the airfoil wake and the bottom shows the results for the two-dimensional isotropic turbulence. The approximation in (c) uses the Nyström method with Halton sampling with 10% of the columns of the adjacency matrix. In the computational comparison above, we only considered relatively low resolution flow fields because it was computationally prohibitive to compute the ground truth eigenvectors using deterministic methods. However, with randomized techniques it is possible to compute the leading eigenvector of the adjacency matrix for a 512 × 512 resolution grid and a 1024 × 1024 resolution grid, respectively, as shown in Fig 8. The adjacency matrix for the 1024 × 1024 grid would require about 8 TB of storage, if explicitly constructed. Instead, we sample only about 10% of the columns to compute the Nyström approximation. While we cannot quantify the approximation error, we can see by visual inspection that the eigenvector captures dominant flow structures.

Fig 8

Approximated leading eigenvectors of the adjacency matrices for higher-resolution isotropic flow fields.

Approximated leading eigenvectors of the adjacency matrices for higher-resolution isotropic flow fields.

Here, we are using the Nyström method with Halton sampling. By visual inspection we can see that sampling only about 10% of the columns of the full adjacency matrix is sufficient for computing an approximated eigenvector that captures the dominant graph structure. Finally, we investigate spectral clustering, which is one of the common uses of the leading eigenvectors of the adjacency matrix, providing a principled approach to cluster nodes into common communities. Fig 9 shows the results of spectral clustering based on the three leading eigenvectors obtained from the adjacency matrix. In both flow fields, the approximate clusters obtained using randomized methods closely resemble the clusters computed via deterministic eigendecomposition. The right panels in Fig 9 further shows the data plotted in these first three eigenvector coordinates to visualize the clusters and subspaces. Typically, these leading eigenvectors are quite useful for determining relevant community structures within a network. These communities have an important physical interpretation in vortical flow networks, hierarchically identifying and organizing vortex cores within the flow. Because these communities are often determined from the few dominant eigenvectors, the randomized approach here can provide considerable computational advantages. Further, since high-vorticity nodes have a large influence on the network interactions, it may be possible to improve convergence by preferential sampling using these community structures.

Fig 9

Spectral clustering using the top three eigenvectors of the adjacency matrix for the airfoil wake flow.

In (a) we show the clusters that we found using the exact eigenvectors. It can be seen that the approximate eigenvectors allow us to reveal the same seven clusters, as shown in (b). In (c) and (d) we show the first three approximated eigenvector coordinates to visualize the clusters and subspaces.

Spectral clustering using the top three eigenvectors of the adjacency matrix for the airfoil wake flow.

Discussion

In this work, we have demonstrated the effective use of scalable algorithms from randomized linear algebra to accelerate network computations for large-scale fluid flows, as demonstrated on the wake behind an airfoil and two-dimensional isotropic turbulent flow. In particular, we have used sampling-based methods to approximate the leading eigenvectors of the adjacency matrix of the vortical interaction network for cases where the data is so large that even single-pass algorithms are prohibitively expensive. Combining importance sampling, based on the probability distribution of detected communities, and the Nyström method, the leading eigenvector can be accurately computed at a fraction of the computational cost and memory requirement, bypassing the need to construct or query the full A matrix. We also showed that quasi-random Halton sampling techniques outperform uniform sampling in both examples as they provide more comprehensive sampling coverage of the spatial domain. There are a number of interesting future directions based on this work. Further study is required to apply these randomized network based analysis techniques to more complex three-dimensional turbulent flows. There are a two main aspects to consider: the high-dimensionality of 3D flows, and the broadband spectrum of turbulent flow fields. The high-dimensionality of 3D flows does not present a challenge for randomized methods, and in fact, would present an even bigger opportunity for computational savings. However, if the increase of dimensionality is accompanied by an increase in the number of energetic modes in the SVD, then this may present a challenge for randomized methods, which rely on the existence of low-rank structure and a gap between these singular values and those characterized by turbulent fluctuations. However, even in fully turbulent flows with broadband frequency content, there often exist dominant coherent structures, and it is likely that these structures will still be identified with randomized algorithms. In addition, there are similar scaling issues in the related fields of almost invariant sets and set-oriented methods [80-84], where the state-transition operator may be viewed as a graph on the high-dimensional state-space. In the transfer operator case, machine learning has been used to identify a data-driven discretization of the state-space into clusters, which serve as nodes that scale with the intrinsic rank of the attractor, rather than the high-dimensional measurement dimension [85]; similar clustering has been used to identify subspaces for POD–Galerkin models [86]. These methods have been used to determine eigenvectors of the Perron–Frobenius operator, the dual of the Koopman operator, establishing a strong connection between sensitivity and coherence in fluid flows. It will be interesting to mathematically connect those approaches with the randomized methods considered in the present study. Further, since the adjacency matrix depends on the instantaneous flow field, it will be interesting to extend the structures derived in this analysis to time-varying flows. Previously, this would have been prohibitively expensive, but the randomized methods presented here may make this more tractable. There is also an opportunity to incorporate regularization in the Biot-Savart computations, as vorticity elements that are very close may cause numerical instability.

Data set of the example flow fields used in the manuscript, as shown in Fig 3.

(ZIP) Click here for additional data file.

10 in total

10. Characterization and detection of thermoacoustic combustion oscillations based on statistical complexity and complex-network theory.

Authors: Shogo Murayama; Hikaru Kinugawa; Isao T Tokuda; Hiroshi Gotoda
Journal: Phys Rev E Date: 2018-02 Impact factor: 2.529

10 in total

Randomized methods to characterize large-scale vortical flow networks.

Introduction

Motivation

Contributions

Identification of dominant eigenvalues and eigenvectors of the adjacency matrix obtained from a fluid flow field using randomized methods.

Vortical interaction network

Network measures and computations

Random sampling-based matrix approximation

Sketched singular value decomposition

Examples of two different random sampling approaches based on uniform sampling and Halton random sampling.

The Nyström method

Results

Numerical simulations

Example flow fields.

Computational performance

Computational performance for the flow over an airfoil using two different spatial resolutions.

Computational performance for the isotropic turbulent flow using two different spatial resolutions.

Summary of computational results for constructing the full adjacency matrix and computing the dominant eigenvector using the deterministic power method.

Approximate eigenvectors and spectral clustering

Qualitative comparison of the deterministic (b) and approximate (c) dominant eigenvector of the adjacency matrix generated from the flow field in (a).

Approximated leading eigenvectors of the adjacency matrices for higher-resolution isotropic flow fields.

Spectral clustering using the top three eigenvectors of the adjacency matrix for the airfoil wake flow.

Discussion

Data set of the example flow fields used in the manuscript, as shown in Fig 3.

1. Network motifs: simple building blocks of complex networks.

2. Genomic analysis of regulatory network dynamics reveals large topological changes.

3. Transport in time-dependent dynamical systems: finite-time coherent sets.

4. Finding community structure in very large networks.

5. The "robust yet fragile" nature of the Internet.

6. Randomized algorithms for the low-rank approximation of matrices.

7. Community structure in directed networks.

8. Network community-based model reduction for vortical flows.

9. Networked-oscillator-based modeling and control of unsteady wake flows.

10. Characterization and detection of thermoacoustic combustion oscillations based on statistical complexity and complex-network theory.