László Keresztes, Evelin Szögi, Bálint Varga, Vince Grolmusz.
Abstract
Gaussian blurring is a well-established method for image data augmentation: it can generate a large set of images from a small set of pictures for training and testing Artificial Intelligence (AI) applications. For AI applied to non-image-like biological data, hardly any comparable method exists. Here we introduce "Newtonian blurring" for human braingraph (or connectome) augmentation: starting from a dataset of 1053 subjects from the public release of the Human Connectome Project, we first repeat a probabilistic weighted braingraph construction algorithm 10 times to describe the connections of distinct cerebral areas; then, for every possible set of 7 of these graphs, we delete the lower and upper extremes and average the remaining 7 - 2 = 5 edge weights for the data of each subject. Because 7 graphs can be chosen from 10 in 120 ways, this augments the 1053-graph set to 120 × 1053 = 126,360 graphs. In augmentation techniques, it is an important requirement that no artificial additions be introduced into the dataset. Both Gaussian blurring and Newtonian blurring satisfy this requirement. The resulting dataset of 126,360 graphs, each in 5 resolutions (i.e., 631,800 graphs in total), is freely available at https://braingraph.org/cms/download-pit-group-connectomes/ . Augmenting with Newtonian blurring may also be applicable in other non-image-related fields where probabilistic processing and data averaging are used.
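The averaging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `newtonian_blur`, the representation of a graph as a dictionary from edge to weight, and the treatment of an edge missing from a run as weight 0 are all assumptions made for illustration.

```python
from itertools import combinations

def newtonian_blur(runs, k=7):
    """Return one averaged graph per k-subset of the repeated runs.

    runs: list of dicts mapping an edge (any hashable) to its weight,
          one dict per repetition of the probabilistic construction.
    """
    edges = set().union(*runs)  # every edge seen in at least one run
    augmented = []
    for subset in combinations(runs, k):
        graph = {}
        for edge in edges:
            # Assumption: an edge absent from a run counts as weight 0.
            weights = sorted(run.get(edge, 0.0) for run in subset)
            trimmed = weights[1:-1]  # drop the lowest and the highest value
            graph[edge] = sum(trimmed) / len(trimmed)
        augmented.append(graph)
    return augmented

# Toy input: 10 runs of a single-edge graph with weights 1.0 .. 10.0.
runs = [{("A", "B"): float(i)} for i in range(1, 11)]
augmented = newtonian_blur(runs)
```

With 10 runs and k = 7, the sketch yields 120 graphs per subject, matching the factor of 120 in the abstract.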
Year: 2022 PMID: 35197486 PMCID: PMC8866411 DOI: 10.1038/s41598-022-06697-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The distribution of the Jaccard distances of all pairs formed from the augmented graphs of the two closest (measured in Jaccard distance) of the 1053 graphs: graphs No. 101915 and 654350, each with 120 augmented graphs. From the 120 + 120 = 240 augmented graphs, we can form 28,680 pairs. The pairs are partitioned into three classes: Red class: both members of the pair belong to subject 101915; Blue class: both members of the pair belong to subject 654350; Green class: one member belongs to subject 101915, the other to subject 654350. The x-axis gives the Jaccard distance; the y-axis gives the number of graph pairs with that Jaccard distance (i.e., the plot is a histogram).
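The pair counts in this caption can be checked with a short Python sketch. The helper names are hypothetical, and the Jaccard distance is written here for unweighted edge sets, which is an assumption: the caption does not state which variant was used.

```python
from collections import Counter
from itertools import combinations

def jaccard_distance(edges_a, edges_b):
    """Jaccard distance between two edge sets (assumed unweighted here)."""
    union = edges_a | edges_b
    if not union:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - len(edges_a & edges_b) / len(union)

def pair_class_counts(subject_labels):
    """Count graph pairs by the (sorted) pair of subjects they come from."""
    return Counter(tuple(sorted(pair)) for pair in combinations(subject_labels, 2))

# 120 augmented graphs for each of the two subjects named in the caption.
labels = ["101915"] * 120 + ["654350"] * 120
counts = pair_class_counts(labels)
```

Under these assumptions, `counts` holds 7,140 pairs for each same-subject class and 14,400 mixed pairs, which sum to the 28,680 pairs stated in the caption.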
Figure 2. Comparison of train and test accuracies with and without augmentation, as a function of the logarithm of the parameter C. With augmentation, the test accuracy is better over almost the entire domain of log C.