Kristian Loewe, Sarah E. Donohue, Mircea A. Schoenfeld, Rudolf Kruse, Christian Borgelt.
Abstract
The functioning of the human brain relies on the interplay and integration of numerous individual units within a complex network. To identify network configurations characteristic of specific cognitive tasks or mental illnesses, functional connectomes can be constructed based on the assessment of synchronous fMRI activity at separate brain sites, and then analyzed using graph-theoretical concepts. In most previous studies, relatively coarse parcellations of the brain were used to define regions as graphical nodes. Such parcellated connectomes are highly dependent on parcellation quality because regional and functional boundaries need to be relatively consistent for the results to be interpretable. In contrast, dense connectomes are not subject to this limitation, since the parcellation inherent to the data is used to define graphical nodes, also allowing for a more detailed spatial mapping of connectivity patterns. However, dense connectomes are associated with considerable computational demands in terms of both time and memory requirements. The memory required to explicitly store dense connectomes in main memory can render their analysis infeasible, especially when considering high-resolution data or analyses across multiple subjects or conditions. Here, we present an object-based matrix representation that achieves a very low memory footprint by computing matrix elements on demand instead of explicitly storing them. In doing so, memory required for a dense connectome is reduced to the amount needed to store the underlying time series data. Based on theoretical considerations and benchmarks, different matrix object implementations and additional programs (based on available Matlab functions and Matlab-based third-party software) are compared with regard to their computational efficiency. 
The matrix implementation based on on-demand computations has very low memory requirements, thus enabling analyses that would otherwise be infeasible to conduct due to insufficient memory. An open source software package containing the created programs is available for download.
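The on-demand strategy described in the abstract can be sketched as follows. This is an illustrative Python/NumPy sketch, not the authors' Matlab-based implementation; the class and variable names are hypothetical. The key idea is that, after z-scoring each node's time series once, any Pearson correlation reduces to a dot product, so individual matrix elements can be computed when accessed and memory stays at O(N*T) instead of O(N^2):

```python
import numpy as np

class OnDemandFCMatrix:
    """Illustrative on-demand functional connectivity 'matrix' (hypothetical name).

    Stores only the N x T time series (normalized once) and computes
    Pearson correlations element-wise when requested, so memory stays
    proportional to N*T instead of N^2.
    """

    def __init__(self, timeseries):
        ts = np.asarray(timeseries, dtype=np.float32)       # shape (N, T)
        ts = ts - ts.mean(axis=1, keepdims=True)            # remove per-node mean
        ts = ts / np.linalg.norm(ts, axis=1, keepdims=True) # unit norm per node
        self._ts = ts  # after normalization, r(i, j) is a plain dot product

    def __getitem__(self, idx):
        i, j = idx
        # Pearson r, computed on demand; nothing is cached or stored
        return float(self._ts[i] @ self._ts[j])


# Usage: 100 nodes, 256 timepoints -> no 100 x 100 matrix is ever materialized
rng = np.random.default_rng(0)
fc = OnDemandFCMatrix(rng.standard_normal((100, 256)))
r = fc[3, 7]   # single element computed at access time
```

The trade-off is the one the paper benchmarks: repeated accesses pay the dot-product cost each time, which is why the authors also evaluate half-stored and cache-based variants.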
Keywords: big data; dense connectome analysis; functional connectivity; graph theoretical analysis; resting-state fMRI
Year: 2016 PMID: 27965565 PMCID: PMC5126118 DOI: 10.3389/fninf.2016.00050
Source DB: PubMed Journal: Front Neuroinform ISSN: 1662-5196 Impact factor: 4.081
Figure 1: Performance comparison with respect to time and memory efficiency. The compared programs for degree computation are based on corrcoef (1), corr (2), IPN_fastCorr (3), IPN_calLCAM (4), and the proposed functional connectivity matrix object FCMAT using the half-stored (5 and 8), the on-demand (6 and 9), and the cache-based variant (7). Programs 1-7 use Pearson's r as the functional connectivity estimate; programs 8 and 9 use the tetrachoric correlation coefficient. Comparisons were conducted on two machines: a desktop computer with an Intel Core i7-3960X CPU and 64 GB of main memory, and a server with two Intel Xeon E5-2697 v2 CPUs and 256 GB of main memory. The number of threads was varied between 1 and 6 on the desktop computer and between 1 and 48 on the server. For each program, only the best result is reported, i.e., the result for the number of threads that yielded the shortest elapsed time for that program. See Figure 2 for more detailed results on the performance gained by each program through multi-threading. The reported results are averages over 10 runs. For details, see text. The number of timepoints T was fixed at T = 256. N: number of nodes; mem [GiB]: peak memory in GiB; time [s]: elapsed time in seconds.
Figure 2: Performance gained through multi-threading. The compared programs for degree computation and the two machines on which the comparisons were conducted are the same as in Figure 1. The reported results are averages over 10 runs on each machine. The number of timepoints T was fixed at T = 256. N: number of nodes; perf [elem./s]: performance in number of computed elements per second.
Figure 3: Speed vs. cache misses. The compared programs for degree computation and the two machines on which the comparisons were conducted are the same as in Figures 1 and 2. The reported results are averages over 10 runs on each machine. The number of timepoints T was fixed at T = 256. N: number of nodes; time [s]: elapsed time in seconds.
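The degree computations benchmarked in Figures 1-3 can be sketched in the same on-demand spirit. This is a rough Python/NumPy illustration rather than the authors' Matlab/C implementation; the function name, threshold, and block size are arbitrary choices for the sketch. Only a small block of correlation rows exists in memory at any time, and each node's degree is the count of suprathreshold connections:

```python
import numpy as np

def degree_on_demand(timeseries, threshold=0.5, block=64):
    """Node degrees of a thresholded correlation graph, computed blockwise.

    At most `block` rows of the N x N correlation matrix are held in
    memory at once; the full matrix is never materialized.
    """
    ts = np.asarray(timeseries, dtype=np.float32)
    ts = ts - ts.mean(axis=1, keepdims=True)
    ts = ts / np.linalg.norm(ts, axis=1, keepdims=True)
    n = ts.shape[0]
    deg = np.zeros(n, dtype=np.int64)
    for start in range(0, n, block):
        rows = ts[start:start + block] @ ts.T      # block x N correlations
        hits = np.abs(rows) > threshold
        # zero out the diagonal entries so self-correlations are not counted
        local = np.arange(rows.shape[0])
        hits[local, start + local] = False
        deg[start:start + block] = hits.sum(axis=1)
    return deg


deg = degree_on_demand(np.random.default_rng(1).standard_normal((500, 256)))
```

The block size is where the cache behavior examined in Figure 3 enters: larger blocks amortize the matrix product better, while smaller blocks keep the working set closer to cache.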
Expected and measured memory usage.

| Formula | Expected [GiB] | | | Measured [GiB] | | |
|---|---|---|---|---|---|---|
| 4( | 13.47 | 53.76 | 214.81 | 26.94 | 107.52 | |
| 4( | 13.47 | 53.76 | 214.81 | 26.94 | 107.52 | |
| 4( | 13.47 | 53.76 | 214.81 | 26.99 | 107.63 | |
| ? + 4 | 6.23 | 24.21 | 44.33 | | | |
| 2 | 6.82 | 27.05 | 107.75 | 6.83 | 27.06 | 107.75 |
| | 6.76 | 26.94 | 107.52 | 6.77 | 26.94 | 107.52 |
| 8 | 0.11 | 0.23 | 0.46 | 0.12 | 0.24 | 0.46 |
| | 0.06 | 0.12 | 0.24 | 0.06 | 0.12 | 0.24 |
| 4 | 0.36 | 0.48 | 0.71 | 0.37 | 0.48 | 0.71 |
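To put the scale of these memory figures in context, a back-of-the-envelope calculation helps. This sketch assumes single-precision (4-byte) values, T = 256 timepoints, and an illustrative N = 60,000 nodes; these sizes are chosen for illustration and are not taken from the table:

```python
GIB = 2 ** 30
N, T = 60_000, 256   # illustrative sizes, not taken from the table
BYTES = 4            # single-precision float

full_matrix = BYTES * N * N           # explicit N x N correlation matrix
half_stored = BYTES * N * (N - 1) // 2  # upper triangle only, no diagonal
time_series = BYTES * N * T           # on-demand variant: just the data

print(f"full matrix: {full_matrix / GIB:7.2f} GiB")  # ~13.4 GiB
print(f"half-stored: {half_stored / GIB:7.2f} GiB")  # ~6.7 GiB
print(f"time series: {time_series / GIB:7.4f} GiB")  # ~0.057 GiB
```

Because the full-matrix cost grows as N^2 while the time-series cost grows only as N*T, doubling N quadruples the first two figures but merely doubles the third, which is why the on-demand representation makes high-resolution and multi-subject analyses feasible.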