Spontaneous resting-state neural activity or hemodynamics has been used to reveal functional connectivity in the brain. However, most of the commonly used clustering algorithms for functional parcellation are time-consuming, especially for high-resolution imaging data. We propose a density center-based fast clustering (DCBFC) method that can rapidly perform the functional parcellation of isocortex. DCBFC was validated using both simulation data and the spontaneous calcium signals from widefield fluorescence imaging of excitatory neuron-expressing transgenic mice (Vglut2-GCaMP6s). Compared to commonly used clustering methods such as k-means, hierarchical, and spectral, DCBFC showed a higher adjusted Rand index when the signal-to-noise ratio was greater than −8 dB for simulated data and higher silhouette coefficient for in vivo mouse data. The resting-state functional connectivity (RSFC) patterns obtained by DCBFC were compared with the anatomic axonal projection density (PDs) maps derived from the voxel-scale model. The results showed a high spatial correlation between RSFC patterns and PDs.
Spontaneous resting-state neural activity or hemodynamics has been used to reveal functional connectivity in the brain.1–5 The network topology of functional connectivity can deepen the understanding of the relationship between brain function and structure. Resting-state functional magnetic resonance imaging obtains whole brain cerebral blood oxygen level-dependent signals to analyze the functional connection and provides important references for diagnoses of neurophysiological disease.6–8 However, fMRI reflects hemodynamic fluctuations, which does not directly indicate neural activity. Investigating resting-state functional connectivity (RSFC) using direct neural signals, such as spontaneous fluctuation of calcium signals, can enable understanding of the mechanisms underlying the link between functional connectivity and brain structure.9–11 Because datasets for neural optical imaging usually have high spatiotemporal resolution, automatic fast clustering methods for widefield fluorescent calcium-signal imaging can help to improve the efficiency of RSFC analysis.
Current data-driven functional clustering methods can be grouped into two broad categories. The first is the nonclustering family, which includes algorithms based on probability distribution models, regional growth methods, decomposed signal and extracted components, self-organization mapping, dictionary learning,12 and edge detection.13 The probability density partition algorithm models the brain neural activity and uses an optimization method to find the optimal solution.14–16 Regional growth chooses a few specific pixels or even all pixels in the cerebral cortex as seed points, and then the growth area is merged based on similar criteria until the region size is greater than the set threshold.17,18 The most common algorithm for extracting brain signal components used in fMRI is the independent component correlation algorithm,19–21 which extracts blind source information from original data and can perform denoising. This method requires repeated calculations to obtain the optimal number of components. Sparse decomposition generates as few atoms as possible to represent the original brain information by learning an overcomplete dictionary22,23 and can remove redundant information and improve analysis efficiency. Self-organizing maps show results as a neural network output layer.24
The other category is the clustering family that includes k-means,15,25–28 hierarchical clustering,28,29 spectral clustering,27,30–32 affinity propagation,23,33 fuzzy c-means,34 and density-based clustering.35,36 Some of these methods are sensitive to the initial selection (e.g., k-means, spectral clustering, and fuzzy c-means). As the same data may produce different clustering results, the method may repeat the calculation several times to obtain the best solution and require the iterative convergence to a cluster center. Some methods hierarchically integrate similar brain activity pixels into a cluster tree to identify the hierarchical relationship of clusters. However, it also leads to longer calculation time (e.g., hierarchical clustering). The methods based on density clustering treat clusters as high-density regions separated by low-density regions.35,36 Rodriguez and Laio35 proposed a density-based algorithm that locates cluster centers using a criterion based on density peak, which can automatically calculate the cluster number and offers fast calculation. However, this method calculates the density for each pixel by summing the number of links whose geometric distances to the remaining pixels are less than the threshold, which is not suitable for analyzing the characteristics of neural activity signals, as the latter represents functional rather than distance clustering, and the correlation map of adjacent seed pixels are usually spatially continuous.
In this study, we propose a density center-based fast clustering algorithm (DCBFC) that is suitable for calcium-sensitive optical fluorescent imaging. To make the density peak-based clustering strategy useful for clustering the functional connectivity, the temporal Pearson’s correlation is used to calculate the similarity between the time courses of pixel pairs in the DCBFC. Furthermore, a strategy of adaptive threshold is proposed for calculating the candidate cluster center based on the characteristics of in vivo calcium imaging data of a mouse cortex to improve the homogeneity of the intracluster and the heterogeneity of the intercluster. We tested our method using both simulated data and in vivo mouse data and compared our algorithm with the other clustering algorithms [k-means, hierarchical, spectral, principal components analysis (PCA)-k-means, PCA-hierarchical, and spectral-threshold clustering]. Using the adjusted Rand index (ARI)37 as a criterion, we evaluated the clustering performance of the algorithms on simulated data. The silhouette coefficient (SI)28,31,38–40 was used to quantify the homogeneity and heterogeneity of functional clusters on in vivo mouse data, and the processing time was also evaluated. The results show that the DCBFC has good performance and requires only a short calculation time. Finally, the functional connection maps obtained by the DCBFC are compared with the axonal projections labeled by recombinant adeno-associated virus (rAAV)-mediated tracing methods in Allen Mouse Connectivity. We used voxel-scale model41 to construct the whole isocortical projection connectivity at a scale of voxels. The average connection diversity/average connection strength (ACD/ACS) relationships of projection connectivity are calculated and compared with the ACD/ACS relationships of RSFC obtained by DCBFC. Because some clusters obtained by the DCBFC may cover more than one brain function area defined by cytoarchitecture, the main brain regions involved in each cluster are collectively referred to as “functional module” for convenience of expression. The codes used to generate the results are publicly available at https://github.com/miuleee/DCBFC.git and https://github.com/miuleee/FCvsPD.git.
Materials and Methods
Density Center-Based Fast Clustering
Our method attempts to rapidly obtain the functional parcellation by characterizing the cluster centers based on the following two criteria of density peak derived from Ref. 35.
• A cluster center is assumed to be the cortical functional connection hub that tightly connects the surrounding pixels in a single functional module. The average Pearson’s correlation coefficients between one central point, and its connected pixels should preferably be as high as possible.
• The correlation coefficient between two central hubs should be as low as possible. The temporal correlation coefficient between each pair of candidate centers must be smaller than the average correlation coefficient of any functional module.
Based on the two aforementioned properties, we propose an approach that attempts to directly obtain the central pixels by calculating a composite variable Eq. (8), which represents the ability of a pixel to become a candidate center. A pixel with a greater value is more likely to become a central point. A schematic of DCBFC is given in Fig. 1 and described as follows.
Functional similarity matrix and threshold for determining cluster centers
Neural signals in the same functional module fluctuate synchronously. After preprocessing of the measured data, we quantify the synchronization of calcium oscillations by calculating the functional similarity matrix [Fig. 1(c)]. Similarity between two pixels can be defined in many ways, and Pearson’s correlation coefficient between time series of the recorded pixels was used to measure similarity in this study, given by the following equation:
A threshold is applied to the similarity matrix to extract the strong connections used to determine the candidate central pixels. The threshold value for the correlation matrix will affect the final clustering results. An appropriate threshold should balance the compactness within a cluster and the heterogeneity between different clusters. To determine the threshold, clustering tests were conducted on the resting-state data of the calcium signal recorded from mice brains at 0.1 to 1 and 0.1 to 4 Hz filtering bands with thresholds ranging from to at intervals of , where is the average of all elements in and is the corresponding standard deviation. The SI (see Sec. 5.3)28,31,38–40 and average cluster number are used as the measures for clustering performance. For the functional connectivity patterns obtained from the spontaneous calcium signal, a high SI value means that the correlation coefficients of the calcium-signal time series of the pixels contained in a functional module are as high as possible, and the correlation coefficients between the pixels in other functional modules are as low as possible. The closer the SI is to 1, the better the clustering effect. As shown in Fig. 2, the blue and orange-dotted lines show average SI and average cluster number, respectively, for different thresholds, where the cluster number increases gradually with . We expected to be able to obtain a greater number of clustering numbers on the premise that its corresponding SI value is sufficiently large. The mean SI of all thresholds is shown as the green-dotted line in Fig. 2. It can be seen that the SI decreases at the beginning and approximately flattens around the green-dotted line in the interval from to . It then gradually decreases as the threshold increases. We then chose the threshold value corresponding to the transition of the SI curve from the flat to the declining period [i.e., ]. The was used for both simulated and in vivo mouse data to ensure a high performance and large number of clusters simultaneously.
Determining cluster centers
Equations (2) and (3) were used to obtain the thresholded similarity matrix , in which all the elements whose absolute values were below the threshold were set to 0. The element in row and column of is denoted as , which represents the thresholded correlation coefficient between pixels and (). DCBFC clusters data by splitting the similarity matrix . Here is the number of nonzero values of row ’th of given as in Eq. (4). If is lower than the cutoff index (usually set as 1% to 2% of 35), then pixel may not be considered as a candidate cluster center. In Eq. (5), we set to zero to prevent pixel from being selected.
Two quantities were defined as criteria to determine the clustering center: the rank degree (the average of the absolute correlation coefficients that are greater than the between the ’th pixel and all other pixels) and the maximum correlation coefficient between pixel and the other pixels whose values are higher than the value of pixel . The reason we use the average correlation coefficient instead of the sum of multiple correlation coefficients is to ensure there is a chance for those pixels that have a specific seed-point correlation map, but with a small probability of occurrences in the correlation maps of all pixels, to be selected as cluster centers. Then, the pixel with the largest value is screened out and its value is set to 0.
Finally, a composite index is defined asFig. 1(d)] rapidly falls off at the beginning and decreases slowly thereafter. This sorted curve can be fitted by a formula such as through an experimental test. Let be the largest rational number of . We then calculate the threshold by the formula . Pixels with values greater than are cluster central candidates. If and both encounter 0, will be set to zero. This pixel may represent a slave point in one cluster.
To determine exhaustively the central pixels using the similarity matrix , we initialize a dataset that contains all pixels and perform a screening of the cluster centers in each loop: the pixels whose are greater than are extracted from current dataset as candidate centers. Then, the correlation coefficients between all pairs of candidate centers are evaluated. For the two candidates whose correlation is greater than , the candidate with a lower will be removed from the list of candidate centers. The candidates that have not been deleted will then be selected as cluster centers. Before the next loop begins, the pixels that are most similar to each central point are removed from the dataset together with the determined cluster centers. In addition, the pixels whose correlation coefficient with the removed pixels is greater than are also removed from the dataset . Then, the loop is repeated for new centers until all cluster centers are screened out (Fig. 1, steps iii to v).
Clustering all pixels to form the functional modules
After all cluster centers are obtained [Fig. 1(f)], all other pixels must be assembled to the clusters corresponding to the obtained central pixels. A pixel will be assigned to a center pixel with which it has the highest correlation among all the central pixels. Considering that the signal-to-noise ratio (SNR) of the spontaneous calcium activity signal with a single central pixel may be low, assembling a cluster based on it may reduce the compactness of the intracluster. Therefore, we used an averaged calcium signal for each cluster. To determine the pixel sets () used to obtain the averaged calcium signal, the values of all the pixels of the image are first sorted in descending order. Then, the number of pixels (i.e., less than 1% of the total number of pixels) were allocated (i.e., according to the order of the values, from high to low) to the central pixel if its correlation coefficient with the central pixel is greater than . The mean spontaneous calcium activity signal over the pixels in the pixel set is regarded as the calcium activity signal sequence of the cluster corresponding to a central pixel. Then, each pixel is clustered into its most correlated cluster (i.e., with the highest correlation coefficient with the averaged calcium signal) [Fig. 1(g)]. Finally, the seed functional connection maps corresponding to all pixels in the same cluster are averaged to construct the RSFC of isocortex [Fig. 1(h)].
In this manner, DCBFC can automatically obtain the main resting-state functional modules, which makes further functional connection analysis more convenient.
To evaluate the feasibility of DCBFC in processing resting-state optical imaging data, the simulated data were constructed using a linear superposition of the simulated cortical neural activity images and the background cortical images. The simulated cortical neural activity for each pixel consisted of multispike time series, as introduced by Pnevmatikakis et al.42 The background images were 1800 frames of autofluorescence images of resting-state C57BL/6J mouse cortex collected by the same optical imaging system used for the following calcium-signal imaging.
To validate DCBFC, we generated a simulated dataset containing the artificial functional modules as labeled ground truth in the following manner. First, we created a template pattern with a specific number of functional modules in the cerebral cortex and generated simulated cortical neural activity signals for each pixel. Pixels within the same module were highly correlated. Second, the background cortical images containing the physiological noises were superimposed on the simulated cortical neural activity images.
To evaluate the performance of DCBFC under different levels of SNR, we changed the amplitude ratio between the neural and background signals to obtain simulation data with different SNR ranges. Simulated data were also produced with different numbers of functional modules and different image sizes (, , and ) as follows.
i. Number of functional modules: 7, SNR level: to 10 dB, time points: 1800, repeat times: 3.
ii. Number of functional modules: 11, SNR level: 5 dB, time points: 1800, repeat times: 3.
iii. Number of functional modules: 11, SNR level: , time points: 1800, repeat times: 3.
iv. Number of functional modules: 50, SNR level: 5 dB, time points: 1800, repeat times: 3.
v. Number of functional modules: 50, SNR level: , time points: 1800, repeat times: 3.
Animal data acquisition and preprocessing
All procedures were approved by the Committee for the Care and Use of Laboratory Animals at Huazhong University of Science and Technology. We generated Vglut2-GCaMP6s transgenic mice (, at approximately 6 weeks of age) by crossing the homozygous Vglut2-ires-cre (Jax. #016963) and RCL-GCaMP6s (Ai96; Jax. #024106) strains. The presence of GCaMP expression was determined by genotyping each animal with polymerase chain reaction amplification.
Mice were anesthetized with isoflurane (1.5% in ) and placed in a stereotaxic apparatus. The body temperature was maintained at . We made an intact skull window according to Ref. 43. A homemade fixed steel sheet was attached to the occipital bone with dental cement. The intact skull was directly covered with a thick layer of dental adhesive (C&B-Metabond, Parkell, Edgewood, New York). Before the mixture became solid, a circular cover glass was gently placed on the skull without forming bubbles.
Widefield fluorescent calcium imaging was recorded through a transparent window (). We focused the camera below the surface to reduce the effects of blood vessels. The cranial window was illuminated with a blue light-emitting diode (480 nm, FF01-480/40-25, Semrock), and the fluorescence GCaMP signal was acquired through a green light bandpass filter (530 nm, FF01-535/50-25, Semrock). We captured the spontaneous calcium-signal images using a SCMOS camera (16 bits, , HAMAMATSU ORCA-Flash4.0 V3 C13440-20CU) at 10 Hz for 3 min each epoch.
A mouse cortical atlas was adapted from the Allen Institute Brain Atlas. We registered the Vglut2-GCaMP6s mice brain by marking bregma and an arbitrary marker on the sagittal suture according to the Allen Atlas. To remove the contribution of global physiological characteristics and other global fluctuations during imaging, the spontaneous activity of each pixel point in the recording sequence was subtracted from their mean fluorescence values for the entire time course. Then, a bandpass filter was used to the demean time courses to custom frequency bands temporally (e.g., 0.1 to 4 Hz). Finally, we used global signal regression to remove potential global sources of noise (e.g., respiratory and cardiac noises).44
Clustering Performance Evaluation
The performance of DCBFC was assessed in terms of calculation time, receiver operating characteristic curve (ROC),45 ARI,37,39 SI,28,31,38–40 and Dice’s coefficient.31,46 We used both simulation and animal experimental data to evaluate the performance of DCBFC and compared it with commonly used clustering methods, including k-means, hierarchical, and spectral clustering. It should be noted that DCBFC clustering uses a thresholded similarity matrix to calculate the candidate central pixels when determining the cluster center, which can be regarded as dimensionality reduction of data. For better evaluation of the speedup of DCBFC as compared with other algorithms, we used PCA to reduce the dimensionality of the time series for all pixel points for k-means and hierarchical clustering (referenced as PCA-k-means, PCA-hierarchical), in which the first principal components corresponding to 90% of the total eigenvalues were chosen, resulting in a reduction of the data from to . For spectral clustering, singular value decomposition (SVD) was used to extract the eigenvectors of the Laplacian matrix and conduct k-means clustering for the extracted eigenvectors. Thus, SVD reduces the dimensionality from to . In addition, we used the thresholded similarity matrix () used in DCBFC as the weighted matrix for spectral clustering (referred to as spectral-threshold clustering).
All clustering programs described in the following sections were run on the same PC [Intel® Core (Xeon®) CPU E5-2687W v3 @3.10 GHz, RAM 256 GB], and all methods were tested using MATLAB code.
The ROC curves (see Sec. 5.1) were plotted to analyze the relationship between artificial functional modules [Fig. 3(a)] and the functional connectivity maps obtained by DCBFC using the simulated data.
Given the truth labels and clustering assignments, ARI (see Sec. 5.2) can measure the similarity between them. In the range of ARI as [, 1], indicates a bad clustering implement and 1 indicates a perfect match. Because the real clustering labels of simulation data are known, ARI is usually used to evaluate the performance of clustering algorithms with truth label assignments. K-means clustering, PCA-k-means clustering, and spectral clustering were repeated 50 times each turn, and the best clustering result was used. In this manner, we could compare the antinoise abilities and computation times of the different algorithms at diverse scales. We also tested the performance of seven clustering methods on different cluster numbers (see Fig. S1 in the Supplementary Material).
SI and Dice’s coefficient test
Because no ground truth labels exist when clustering in vivo neural activity of brain regions, the clustering performance evaluation ARI is not applicable. Therefore, we adopted the SI (see Sec. 5.3) as a metric.28,38–40
To compare the reproducibility of the clustering results, Dice’s coefficients for different methods were calculated to evaluate the similarity between the clustering results obtained at the group and individual levels.31,46 The range of Dice’s coefficient was [0, 1], where 0 means no similarity and 1 means a complete match.
The functional modules were clustered to obtain the individual-level clustering result for each epoch. Each individual-level clustering result was then converted into an adjacency matrix , in which if pixels and were in the same cluster; otherwise, .31 All adjacency matrices of the individual level were subsequently summed to obtain group adjacency matrix . Then, the correlation coefficient matrix of was clustered by DCBFC to obtain the clustering result for the group level.
Comparing RSFC with Anatomic Connectivity of Axonal Projection
To further validate the functional connectivity obtained, we compared RSFC with the anatomic connectivity of axonal projection obtained using the voxel-scale connectivity model proposed by Knox et al.41 to construct the whole isocortical connectivity at a scale of voxels. The model interpolates projection intensity in source space and enabled us to easily analyze the connectivity both in voxel and region scales.
The axonal projection data were downloaded from the Allen Mouse Brain Connectivity Atlas website;47 the detailed information is listed in Table S2 in the Supplementary Material. Because no anatomy projection experimental data on the isocortex are found for the Vglut2-IRES-Cre line (widely expressed in most areas of the brain, except very sparse expression in the striatum and restricted populations within cerebellum, medulla, and pons), we used projection data from 185 experiments using several other types of mice, including C57BL/6J and excitatory Cre driver mice, which express in intratelencephalic neurons (project to both ipsilateral and contralateral cortex).48 All experiments used rAAV tracers for the labeling of axonal projections. These data were used to fit the voxel-scale model and derive the projection connectivity of the whole isocortex. Experiments were selected based on the following characteristics: (1) the experiment should not have large volume saturated in the injection region, and (2) the experiment should not have only few projection volumes outside of the injection site.
We used the Nadaraya–Watson connectome estimator with a radial basis function (RBF) kernel introduced by Knox et al.41 to estimate the projection connectivity ( https://github.com/AllenInstitute/mouse_connectivity_models). The voxel-scale connectivity model was constructed to include the connectivity projections from all voxels within the isocortical. The model fits the bandwidth of the RBF kernel by employing nested cross validation with held-out injection experiments.
To explore the relationship between RSFC and projection connectivity, we first calculated the spatial correlation between RSFC patterns obtained by different clustering methods and the projection density (PD) maps11 (see Sec. 5.5). We filtered the coordinate positions of mice isocortex according to information about brain regions provided in the Allen Mouse 3D Reference Model annotation and based on the coordinates of structure centers. The two-dimensional (2-D) normalized PDs were derived from the voxel-scale model41 and considered only the projection pathways of the brain regions on the isocortex. We compared the RSFC map with axonal PD patterns both in region and voxel scales: (1) the regional RSFC maps, , versus regional PD patterns, (Fig. 6). (2) The RSFC maps of central seed obtained by DCBFC, , versus the PD patterns, , of the same cortical location (Fig. S2 in the Supplementary Material).
In addition, to explore quantitatively whether the functional links between the functional modules obtained by DCBFC had the anatomical connections, we calculated the functional connection strength (see Sec. 5.4) between different functional modules49,50 and compared them with the normalized projection connection density (see Sec. 5.6) between these functional modules. The average connection strength (ACS) of one functional module is used to quantify the overall strength of its connections across the whole isocortex. In addition, the average connection diversity (ACD) of different functional modules was calculated (see Secs. 5.4 and 5.6), which was derived from the Shannon entropy51 and has been used in fMRI.49,50 Connection diversity quantifies the uncertainty that one functional module will connect to another. Module with low ACD prefers to connect with only a few functional modules. Module with high ACD trends to evenly connect with other regions across the whole isocortex. Here, “high” and “low” are relative values (i.e., relative to other functional modules, the connection strength or diversity is high or low). We normalized the ACS and the ACD to [0, 1] and regarded the normalized value that was greater than 0.5 as relatively high; otherwise, it was relatively low.
To conduct this analysis, DCBFC was used to obtain automatically the functional connection patterns of spontaneous neural activities in Vglut2- GCaMP6s mice (), and the functional modules are formed at the group level. In the bilateral hemisphere, some symmetrical regions are clustered into one functional module [e.g., top of Fig. 4(c), white arrow], whereas others are divided into two. We merged the symmetrical regions as a functional module. Eight modules, represented as M2, M/S, BF, RSP/M, V, HL, FL, and PtA/SSp-tr, were then analyzed [Fig. 8(a)].
Clustering Performance Evaluation
Figure 3 provides the ROC curves of functional connectivity patterns obtained by DCBFC at different SNR levels in the simulated data. The results show that a strong correspondence existed between the functional connection map and simulated functional module, even when the SNR ratio is very low. For all modules [see Figs. 3(b)–3(h)], the clustering results and true labels had the strongest correspondence (100% sensitivity and 100% specificity) in a broad range of thresholds (0.1130 to 0.7070) with SNR equal to 10 dB. When the SNR was , the strongest correspondence of module 1 [Fig. 3(b)] exhibited 73.8% sensitivity and 64.4% specificity. The values for the other regions (at ) were as follows: module 2 [Fig. 3(c), 91.6% sensitivity and 88.9% specificity], module 3 [Fig. 3(d), 68.2% sensitivity and 50.5% specificity], module 4 [Fig. 3(e), 56.8% sensitivity and 62.4% specificity], module 5 [Fig. 3(f), 71.3% sensitivity and 65.8% specificity], module 6 [Fig. 3(g), 63.8% sensitivity and 57.6% specificity], and module 7 [Fig. 3(h), 65.6% sensitivity and 62% specificity].
Simulated cortical neural activity data with different image sizes (, , and ) were clustered using different methods. Table 1 shows the ARI values of the results obtained by the clustering algorithms for the aforementioned three image sizes, with as the similarity matrix. It was observed that DCBFC had significantly higher ARI values than those of k-means [two-way ANOVA tests, false discovery rate (FDR) correction, ], spectral clustering (), PCA-k-means (), and spectral-threshold clustering ( for , for the others).
Simulation testing: comparing the ARI of DCBFC with the other clustering methods using different picture size.
The clustering performances of different methods for various numbers of clusters (4, 11, and 50 functional modules, respectively) are shown in Fig. S1 in the Supplementary Material. When the SNR level was greater than , the ARI value of DCBFC approximated 1. With the addition of spatial background noise, DCBFC had the highest ARI value [blue line in Fig. S1(a) in the Supplementary Material].
Functional parcellation obtained with in vivo data
Figure 4 shows the group-level functional clustering results of the Vglut2-GCaMP6s mice data ( image size, ) using different clustering methods (DCBFC, k-means, hierarchical, and spectral clustering). The results were registered in the Allen Mouse Brain Atlas [Fig. 4(b)] using the two markers on the skull [Fig. 4(a), red points]. In this study, DCBFC obtained 12 functional modules at the group level [upper left of Fig. 4(c), ]. Because the other clustering methods needed to set a specific number of clusters, the same number of functional modules as the DCBFC was used for k-means, hierarchical, and spectral clustering.
Figure 4(c) shows the functional modules clustered by DCBFC, k-means, hierarchical, and spectral clustering, and the corresponding functional connectivity maps are shown in the right of Figs. 6(a)–6(d). The functional connectivity maps illustrate that a correlation of spontaneous neural activity existed between the bilateral brain regions. Some regions that symmetrically connect between the bilateral hemispheres may have been clustered into the same functional connection module [upper left of Fig. 4(c), white arrow]. Some functional connection modules clustered by DCBFC were highly correlated to the anatomy brain areas (i.e., visual areas and primary somatosensory barrel field), whereas others may have covered more than one brain area, such as primary motor cortex, secondary motor cortex, and somatosensory area [upper left of Fig. 4(c), green arrow], which reflect the interconnection between different brain function areas.
SI and Dice’s coefficient test
The average SI values of all experimental epochs for different clustering algorithms were 0.35, 0.33, 0.31, 0.32, 0.33, 0.31, and 0.34, for DCBFC, k-means, hierarchical, spectral, PCA-k-means, PCA-hierarchical, and spectral-threshold clustering, respectively, which indicate that the functional modules obtained by DCBFC had the highest synchronization between intercluster heterogeneity and intracluster homogeneity. The average Dice’s coefficients under all experimental epochs for the aforementioned methods were 0.65, 0.64, 0.58, 0.64, 0.63, 0.59, and 0.64, respectively, indicating that DCBFC had the highest reproducibility.
Figure 5 illustrates the relative speedup performance of DCBFC compared to the other methods for three image sizes of the simulated data. We observed that hierarchical clustering, which links cluster objects step by step to generate a complete tree, had the longest computation time. The ratio of computing time between hierarchical clustering and DCBFC was in the range of 211.9 to . Because k-means clustering must be calculated 50 times to obtain an optimal result, its calculation time was longer than that of DCBFC 3.7 to . Sometimes the iterations of k-means clustering were difficult to converge [Fig. 5(c), blue bar, ], which incurs a very long clustering time. The computation time for spectral clustering was affected by the number of functional modules, which was 1.1 to longer than DCBFC in the case of . After PCA was used to reduce the dimensionality of data, the computing times of k-means (PCA-k-means, Fig. 5, steel blue bar, 2.0 to ) and hierarchical clustering (PCA-hierarchical, Fig. 5, sky blue bar, 41.7 to ) were greatly reduced. Spectral-threshold clustering (1.3 to ) was not always faster than spectral clustering (Fig. 5 dark blue and black bars), which means that the threshold similarity matrix used in DCBFC was not the key to improving the speed. DCBFC still had a very short calculation time. In all cases, the calculation time of DCBFC was lower than those of the other methods.
Table 2 shows the average computing time of each algorithm for clustering Vglut2-GCaMP6s transgenic mice data. For the image size, the computation speedup of DCBFC was 51.3, 837.5, 1.6, 16.0, 289.2, and faster than that of k-means, hierarchical, spectral, PCA-k-means, PCA-hierarchical, and spectral-threshold clustering, respectively. To better demonstrate the advantages of DCBFC in terms of computing time, we clustered the images at a higher resolution (, ). DCBFC was 13.0, 1406.0, 1.1, 2.1, 1308.1, and faster than the other aforementioned methods, respectively.
Experimental testing: comparing the average computing time of DCBFC to the other methods under different image size.
Relationship between RSFC and Axonal Projection Connectivity
Spatial correlation between isocortical axonal projection maps and RSFC maps
In Fig. 6, the left plot of each subfigure shows the normalized regional PD maps corresponding to 12 functional modules derived by the clustering methods [Fig. 4(c)]. The right side of the subfigures shows the RSFC maps corresponding to the modules. Figures 6(#1)–6(#12) represent M2 (secondary motor area), M/S (portions of motor areas combined with portions of primary somatosensory areas), L_BF (left primary somatosensory barrel field), R_BF (right primary somatosensory barrel field), HL (primary somatosensory lower limb), L_FL (left primary somatosensory upper limb), R_FL (right primary somatosensory upper limb), RSP/M (retrosplenial area combined with motor areas), L_V (left visual areas), R_V (right visual areas), L_PtA/SSp-tr (left posterior parietal association areas combined with primary somatosensory trunk), and R_SSp-tr/PtA (right posterior parietal association areas combined with primary somatosensory trunk). Figure 6(a) shows that all functional connectivity maps obtained by DCBFC had similar axonal projection patterns, in which the minimum and maximum spatial correlation coefficients were 0.75 and 0.91, respectively. It is worthy to note that when the contralateral hemispheric connections existed in the RSFC map, the corresponding axonal projections also exhibited these long-range connections.
The spatial correlation coefficients between the regional RSFC maps obtained by different clustering methods and the corresponding regional PD patterns were calculated. The average spatial correlation coefficients for each method were 0.8288 (DCBFC), 0.8266 (k-means), 0.8156 (hierarchical), 0.8230 (spectral), 0.8271 (PCA-k-means), 0.8234 (PCA-Hierarchical), and 0.8262 (spectral-threshold). The average spatial correlation coefficient of DCBFC was slightly higher than that of other methods.
We also compared the RSFC maps of central pixels found using DCBFC with the PD patterns of the same cortical location obtained using the voxel-scale connectivity model, as shown in Fig. S2 in the Supplementary Material. The and patterns both showed a long-range link to contralateral or ipsilateral hemisphere for some brain regions. For example, we can see a functional connection between M2 and BF [Figs. S2(d)–S2(f) in the Supplementary Material] existing in both bilateral hemispheres,28 which can also be found in the PD when the injection site of the PD map is M2 or BF. It should be noted that the average spatial correlation coefficient between the RSFC maps of the central pixels () obtained by DCBFC and the relevant normalized PD () patterns with the same location was lower than that between the regional RSFC maps () and regional normalized PD () patterns ( versus ). In addition, the average spatial correlation coefficient between and maps was . Correspondingly, the average spatial correlation coefficient between and maps was . The average spatial correlation between the RSFC maps and PD patterns across all seed pixels was .
ACS and ACD analysis of RSFC and axonal projection
Figure 8(b) shows the resting-state connection strength between the different functional modules clustered by DCBFC, where only significant connections were shown (-test, FDR correction, ). Figure 8(c) shows the projection connection strength between different functional modules; only the connections whose PD is greater than are shown. The results reveal that all significant functional connections between functional modules clustered by DCBFC could be found in the axonal projection connectivity derived from the voxel scaling model. In addition, projection connectivity, shown in Fig. 8(c), had several additional connections (dashed line) that were not apparent in the resting-state functional connection, such as the connection between PtA/SSp-tr and M2, PtA/SSp-tr and V, BF and HL, M2 and HL, and BF and V.
The functional connections between M2, M/S, FL, and BF shown in Fig. 8(b) were also found in previous studies.28,52,53 When M2 was the injection structure [Fig. 8(c)], the axonal projections were found to be transmitted to ipsilateral M/S, FL, BF modules, and the contralateral hemisphere [Fig. S2(d) in the Supplementary Material], which is consistent with the functional connection. M/S also had axonal projections toward FL. The injection site in motor area could project to SSp-m, FL, and the contralateral hemisphere [Fig. S2(a) in the Supplementary Material]. Table 3 shows that M/S closely connected only to certain isocortical functional modules in both RSFC and projection connectivity. The ACS between M/S (or M2) and all the other modules in the isocortex was relatively weak. The ACD of FL was relatively high both in RSFC and projection connectivity. ACS of FL was relatively high for RSFC but was relatively low for projection connectivity. BF could project to M2, PtA/SSp-tr, and FL and also had a long-range link to the contralateral hemisphere [Figs. 6(#3) and 6(#4)]. BF, HL, and PtA/SSp-tr had a relatively high ACS in terms of both RSFC and projection connectivity. PtA/SSp-tr was evenly connected to the other modules (high ACD). The projection pathway of the injection site in the visual cortex could be remotely connected to the posteromedial secondary motor areas [Figs. 6(a#9) and 6(a#10)], and this situation also occurred when the source and target sites were exchanged [Figs. 6(a#8) and 8(c), RSP/M and V]. It is interesting that the RSP, which may correspond to the default mode network described in humans,28,49 showed both functional and projection connection with the visual, PtA/SSp-tr, HL, and posteromedial motor areas. Visual cortex preferentially connected to specific isocortical functional modules (e.g., link between V and RSP/M) in both RSFC and projection connectivity. ACD of RSP/M was relatively low for RSFC but was relatively high for projection connectivity. The ACS between RSP/M (or V) and all other modules in the isocortex was relatively weak. From Table 3, the results indicated that some modules with relatively high ACD (such as BF clustered by DCBFC) had few significant connections to other modules in Fig. 8.
The normalized ACD and ACS of different modules.
Influence of Number of Clusters for DCBFC
The other methods compared with DCBFC require a preset number of clusters. For the result described in Sec. 3.1.4, the clustering number of k-means was set as that obtained by DCBFC. To further discuss whether the clustering performance of DCBFC was also better than that of other methods for different number of clusters, the SI values of the results obtained by different methods were compared for the number of clusters ranging from 4 to 20. The number of clusters for DCBFC was changed by adjusting the threshold according to the method described in Table S1 in the Supplementary Material.
Figure 7 demonstrates SI values of brain functional modules clustered by DCBFC are higher than those clustered by k-means clustering (), hierarchical clustering (), spectral clustering (), PCA-k-means clustering (), PCA-hierarchical clustering (), and spectral-threshold clustering () at a significance level of 0.05 (two-way ANOVA test with FDR correction).
Influence of SNR on the Clustering Performance
In this study, we tested the antinoise performance of different clustering methods and used ARI37 to evaluate the performances of different methods in clustering simulation data based on ground truth class assignment knowledge.
For simulation data, in the test [see Fig. S1(a) in the Supplementary Material], with the addition of spatial background noise, DCBFC had the highest ARI values [blue line in Fig. S1(a) in the Supplementary Material] even at high noise levels. K-means clustering [green line in Fig. S1(a) in the Supplementary Material] and hierarchical clustering [red line in Fig. S1(a) in the Supplementary Material] clustered the data by calculating the correlation coefficients ( value) of each pair of time series, and they were influenced by the shape of the functional modules and background noise. This may have caused the pixels in different functional modules to cluster into the same module. Following PCA dimension reduction, the clustering performances of k-means [PCA-k-means, yellow line in Fig. S1(a) in the Supplementary Material] and hierarchical clustering [PCA-hierarchical, pink line in Fig. S1(a) in the Supplementary Material] improved, indicating that PCA dimension reduction was effective in eliminating spatial noise. For the other simulation data testing, without the addition of spatial background noise [see Table 1 and Figs. S1(b) and S1(c) in the Supplementary Material], spectral-threshold clustering, which employed the thresholded similarity matrix () used in DCBFC as the weighted matrix for spectral clustering, had a higher ARI value than that of spectral clustering when using zero as the threshold for the similarity matrix (two-way ANOVA tests, FDR correction, for all).
Time Consumption for Clustering
DCBFC clustering used a thresholded similarity matrix to calculate the candidate central pixels during the step of determining the cluster center, which could be regarded as dimensionality reduction of data. To better evaluate the speedup of DCBFC compared with other algorithms, dimensionality reduction preprocessing was also applied to other clustering methods for comparison, such as PCA-k-means, PCA-hierarchical, and spectral-threshold clustering. The results showed DCBFC remained faster than other methods. Figure 5(c) shows that an abnormal calculation time existed for k-means (blue bar, ) which was much longer than other cases. This was because some clustering algorithms using iterative solution, such as k-means, spectral clustering, etc., may sometimes encounter slow iterative convergence due to the selection of initial value.
Difference of Clustering Methods for Functional Parcellation
Figure 4(c) reveals that the functional modules divided by different methods are slightly different. Compared to DCBFC, k-means and spectral clustering divided the RSP/M, M2, and BF regions into more subregions, hierarchical clustering did the same for the M2 and BF regions. DCBFC derived the PtA/SSP-tr module, which was not differentiated from the other cortex regions by the other methods. The objective of DCBFC is to have the central pixels of different clusters be remote from each other and to aggregate all pixels based on their similarity to the central pixels. K-means and spectral clustering use the average time series of all the pixels in each cluster to update the current cluster center in each iteration so that they usually obtain higher intracluster correlation than DCBFC does. The average correlation coefficients between different functional modules derived using different methods were 0.40 (DCBFC), 0.46 (k-means), 0. 43 (hierarchical), and 0.45 (spectral), which indicated the clusters obtained by DCBFC had largest distance between each other. The average similarities within the same functional module for different methods were 0.62 (DCBFC), 0.64 (k-means), 0.62 (hierarchical), and 0.63 (spectral). The intercluster similarity of DCBFC was the lowest, whereas k-means and spectral clustering were the highest (these relationships were still maintained for different number of clusters, data not shown). Therefore, k-means and spectral clustering obtain more subregions may be due to their high intracluster similarity and low intercluster distance.
Comparing the RSFC patterns of aforementioned subregions with the corresponding normalized PD maps [e.g., Figs. 6(a), 6(b)#8 versus Figs. 6(c), 6(d)#7, 6(d)#8; Figs. 6(a)#1 versus Figs. 6(b)–6(d)#1, Figs. 6(b)–6(d)#11], we can see that the average spatial correlation coefficients between patterns and their corresponding maps for these subregions were smaller than that for only one region.
As previously mentioned, from the perspective of functional clustering, when we hope that the functional connectivity module can be closely in contact within the same cluster and be far away from other clusters, SI can be used as an evaluation index for clustering performance. The results showed that DCBFC had the highest SI value and could rapidly cluster the connection matrix.
Links between RSFC and Anatomy Projection Connection
We describe the links between RSFC and anatomy projection connection from two respects.
1. The spatial correlation coefficients of two types of patterns reflect the spatial similarity between the two maps on the top view. The results [Fig. 6(a)] show that the spatial correlation coefficients between and corresponding were in range of [0.75, 0.91]. In addition, Fig. 8 shows that all the significant functional connections between functional modules can be found in the projection connectivity, whereas projection connectivity has several additional connections (dashed line) that are not shown in resting-state functional connection, such as the connections between PtA/SSp-tr and M2, PtA/SSp-tr and V, BF and HL, M2 and HL, and BF and V. This may be partly due to the difference between the strains of mice used in projective connection fitting and that in RSFC calculating. It may also be because the axonal projection shows all existing anatomical output connection through axons, but RSFC here only shows the correlated spontaneous neural activity in the low frequency (0.1 to 4 Hz). Moreover, the PDs connection between two adjacent brain regions such as the connection between BF and visual cortex might be overestimated when the virus is diffused from the injection site to its surrounding area if the dose of virus is not well controlled, or the location of virus injection may cover two adjacent function regions.
2. The ACD/ACS relationship is attempted to analyze whether the connection diversities of RSFC and projection connection are also at the same level (higher or lower) when their connection strengths are at the same level (stronger or weaker). It was shown that both ACD and ACS of axonal projection matched those of RSFC in some cortical regions, such as M/S, V, and PtA/SSp-tr. From Sec. 3.2.2, it could be seen that although some functional modules were closely connected to a few modules, their ACD values were relatively high, such as the BF for RSFC. This was partly because there were still weak connections between these modules and other modules were not shown besides those strong connections shown in Fig. 8. Furthermore, ACD measures the difference of connection strengths between one functional module and all the other cortical modules based on Shannon entropy. If the difference of the connection strengths is large (ACD value is low), this module will trend to coactivity with certain specific modules. If the difference of connection strengths is small (ACD value is high), the connections of this functional module are spread evenly across the multiple brain regions.
Given one artificial module, the values (true labels) of the pixels within this module were set to 1 and other pixels were set to 0. Then the functional connection maps obtained by DCBFC were binarized with different thresholds (from −0.999 to 0.999, with intervals of 0.001) at different SNR levels, in which the value of the pixel was set to 1 if its correlation coefficient was greater than the threshold value, whereas those of other pixels were set to 0. Results were displayed using a series of ROC curves. For each SNR level, the corresponding true positive rate (TPR or sensitivity) versus false positive rate (FPR or 1-specificity) curve is plotted under the background of artificial functional modules (true label). In the equation , denotes the number of pixels whose values are 1 in the binary functional connection map but whose true labels are 0, and denotes the number of pixels whose values are both 0. In the equation , denotes the number of pixels whose values are both 1, and denotes the number of pixels whose values are 0 in binary functional connection map but whose true labels are 1.
Given the truth labels and the clustering assignments, ARI can measure the similarity between them. The range of ARI is [, 1], where indicates a bad clustering implement and 1 indicates a perfect match.
The aim of SI is to identify whether the functional modules are compactly in contact with each other within the same cluster and are remote from interclusters. Suppose that is the ensemble of all clusters and pixel belongs to cluster ,
Connection Strength and Connection Diversity
In this study, we only explored the positive correlation connections. The connection strength between modules and , namely, , is defined as the average value of their positive time correlation coefficients given as
The ACD for module is defined as , formulated as
Projection Density Map (Top Views)
According to the code (i.e., the voxel-scale model) provided by Konx et al. ( https://github.com/AllenInstitute/mouse_connectivity_models), we derived PD maps for all functional modules ( PDs total) by summing and normalizing the PD value of all I to VI layers of isocortex. Therefore, the PDs were scaled to [0, 1]. Finally, we derived PDs () corresponding to functional modules clustered by different clustering methods. In addition, we derived the cortical projection maps whose injection site had the same cortical location with the central pixels found by DCBFC.
Normalized Connection Density between Different Functional Modules
The normalized connection density of projection connectivity was used in comparison with connection strength derived from RSFC. The projection connections from each voxel of the isocortex were first derived using the voxel-scale model. We then obtained the normalized connection density between different modules:5.4, respectively.
The authors have no relevant financial interests in this article and no potential conflicts of interest to disclose.
This work was funded by the National Natural Science Foundation of China (NSFC) (Nos. 61890951, 61775071, and 31471083); the National Key Research and Development Program of China (No. 2017YFB1002503); the Fundamental Research Funds for the Central Universities, Huazhong University of Science and Technology (No. 2019KFYXMBZ009); the Innovation Fund of Wuhan National Laboratory for Optoelectronics (WNLO). We thank the Allen Institute for Brain Science for providing a database of axonal projections.
Miaowen Li is a PhD student in biomedical engineering at Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China. Her research focuses on optical imaging of cortical functional connectivity and machine learning.
Shen Gui is a PhD student in biomedical engineering at Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China.
Qin Huang Qin Huang received his PhD in optical engineering from Huazhong University of Science and Technology, Wuhan, China, in 2018.
Liang Shi is a PhD student in optical engineering at Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China.
Jinling Lu is an associate professor at Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, China. Her research focuses on optical imaging of cortical functional connectivity.
Pengcheng Li is a professor of biomedical engineering, Huazhong University of Science and Technology. His research interests are in vivo functional optical imaging technologies of biological tissue, such as laser speckle imaging of blood flow and viscoelasticity, multispectral reflectance imaging of blood oxygenation, and multimodal optical imaging of brain functional activity.