Group-level cortical functional connectivity patterns using fNIRS: assessing the effect of bilingualism in young infants

Abstract. Significance: Early monolingual versus bilingual experience induces adaptations in the development of linguistic and cognitive processes, and it modulates functional activation patterns during the first months of life. Resting-state functional connectivity (RSFC) is a convenient approach to study the functional organization of the infant brain. RSFC can be measured in infants during natural sleep, and it allows to simultaneously investigate various functional systems. Adaptations have been observed in RSFC due to a lifelong bilingual experience. Investigating whether bilingualism-induced adaptations in RSFC begin to emerge early in development has important implications for our understanding of how the infant brain’s organization can be shaped by early environmental factors. Aims: We attempt to describe RSFC using functional near-infrared spectroscopy (fNIRS) and to examine whether it adapts to early monolingual versus bilingual environments. We also present an fNIRS data preprocessing and analysis pipeline that can be used to reliably characterize RSFC in development and to reduce false positives and flawed results interpretations. Methods: We measured spontaneous hemodynamic brain activity in a large cohort (N=99) of 4-month-old monolingual and bilingual infants using fNIRS. We implemented group-level approaches based on independent component analysis to examine RSFC, while providing proper control for physiological confounds and multiple comparisons. Results: At the group level, we describe the functional organization of the 4-month-old infant brain in large-scale cortical networks. Unbiased group-level comparisons revealed no differences in RSFC between monolingual and bilingual infants at this age. Conclusions: High-quality fNIRS data provide a means to reliably describe RSFC patterns in the infant brain. The proposed group-level RSFC analyses allow to assess differences in RSFC across experimental conditions. An effect of early bilingual experience in RSFC was not observed, suggesting that adaptations might only emerge during explicit linguistic tasks, or at a later point in development.


1) Intensity to optical density
in which channels' time series are clustered based on similarity. A similar cluster configuration can be observed across groups in HbO and HbR. Cluster 1 is formed by channels located in the most anterior part of both hemispheres. Cluster 2 comprises channels located in middle brain regions. Channels located in the most posterior part of the setup are grouped together in cluster 3.

Model order selection for Independent Component Analysis
A recurrent issue in studies using independent component analysis (ICA) to examine resting-state functional connectivity is how to determine the number of IC to be estimated. Here we propose a data-driven approach to determine the number of estimated ICA components by computing two metrics that exploit specific defining properties of ICA estimation and fNIRS data: 1) the robustness of the components across multiple realization of the ICA (Himberg et al., 2004) and 2) the assumed similarity between the HbO and HbR derived IC.
First, a valid interpretation of the results requires selecting only those independent components that are considered robust; being robustness defined by the identifiability of the component across multiple runs of the ICA algorithm with different random initializations. A principal component analysis (PCA) is commonly performed prior to ICA to reduce the dimensionality of the data and/or remove noise components. The number of ICA components to be estimated usually corresponds with the number of PCA components that are retained. We evaluated the robustness of the estimated IC across multiple PCA thresholds by using ICASSO (Himberg et al., 2004;Damaraju et al., 2014). Concretely, 100 realizations of ICA were computed for different PCA thresholds (i.e., 60, 65, 70, 75, 80, 85, 90, 95 and 99) representing the percentage of data variance explained. For the two ICA methods employed in the current work (temporal group ICA and connICA), ICASSO quality index ( ) values were obtained for each component and for each PCA threshold (Supplementary Figure 2 -volcano plots). This index quantifies the robustness of the estimated components across ICA realizations, with values ranging between 0 (low robustness) and 1 (high robustness). In the temporal group ICA approach, higher values (i.e. more robustness) were obtained at the lower PCA thresholds (60% and 65% explained variance). For higher thresholds, the first 15-20 IC also showed high values. However, these values start decaying as the number of IC increased and were particularly low for PCA thresholds above 80%. A similar trend can be observed for connICA. The highest values were obtained at the lower PCA thresholds (60% -70% explained variance), whereas the values for higher PCA thresholds were only high up to the first 15 IC and then progressively decreased.
Second, the expected statistical association between HbO and HbR chromophores can be used as a metric for evaluating the validity of the IC extracted from the ICA algorithm. A high correlation between the extracted HbO and HbR IC or maps implying consistency across Hb chromophores can be interpreted as a measure of physiological reliability (Villringer and Chance, 1997; Wolf et al., 2002;Obrig and Villringer, 2003). In the temporal group ICA approach, IC are estimated from HbO and HbR time series, which are expected to be negatively correlated. A highly similar spatial configuration in the extracted spatial maps of both chromophores (i.e., channel weights) is expected, but with opposite sign (i.e., negative correlation). In our temporal group ICA analysis HbO and HbR spatial maps displayed a strong negative correlation at the lower PCA thresholds-up to 75% of explained variance (Supplementary Figure 2 -volcano plots). For higher thresholds, correlation between HbO and HbR components remained negative, but values decreased considerably. For the connICA approach, a strong positive correlation between the HbO and HbR independent functional components is expected, as they are estimated from the HbO and HbR functional connectivity matrices, which show alike topology across chromophores. The analysis for the connICA method yielded similar results as the previous method. At the lower PCA thresholds (60% and 65% explained variance) the correlation between HbO and HbR components was notably high (r ≥ 0.7). At higher PCA thresholds, correlation between components remained positive, but with lower correlation values between HbO and HbR components.
Based on these two metrics (i.e., robustness of IC, and correlation between the extracted IC from HbO and HbR data) an appropriate choice for PCA model order, and in turn for the subsequent ICA method, corresponds to approximately 60% of the explained variance in both approaches (i.e., tGICA and connICA). To reinforce this choice, the variance explained by each IC was computed by calculating the accumulative data variance explained by the components. This analysis showed that for the two approaches with the selected 60% threshold, we are able to explain the largest amount of data variance, while also obtaining the highest values in our robustness and consistency metrics. At higher PCA thresholds, explaining the same amount of variance would require including components that are not robust, or which do not show the expected association between HbO and HbR. Based on this information, in the temporal group ICA approach we selected 15 principal components corresponding to 60% explained variance. As a reference, a method for automatic dimensionality selection based on Laplace approximation (Minka et al., 2001) yielded a result of 27 components (i.e., 73.5% explained variance). For the connICA method we considered 11 principal components, which corresponds to 60% explained variance. For this method the Laplacian approximation produced a value of 17 components (i.e., 66.34% explained variance).

Supplementary Figure 2.
Model order selection criteria for temporal group ICA (first column) and connICA (second column) methods at different PCA thresholds indicating the percentage of explained variance (y axis). First row depicts the cluster quality index (Iq) for each component as estimated using ICASSO, which represents component stability across multiple ICA realizations. Second row shows the correlation between the estimated HbO and HbR components. Third row shows the components' accumulative explained variance, being the maximum amount of explained variance determined by the specific PCA threshold.  Supplementary Table 2. Temporal group ICA model order evaluation metrics for the PCA threshold selected in this study (i.e., 60% -15 ICs). Sum of squared differences (ssd) are computed with respect to the data after PCA (total = 100%) and with respect to the original data without PCA (total = 60%). Iq = ICASSO cluster quality index; r = correlation coefficient between HbO and HbR spatial maps.   Supplementary Table 3. Model order evaluation metrics for the PCA threshold selected in the connICA approach (i.e., 60% -11 independent components). Sum of squared differences (ssd) are computed with respect to the data after PCA (total = 100%) and with respect to the original data without PCA (total = 60%). Iq = ICASSO cluster quality index; r = correlation coefficient between HbO and HbR components.

Supplementary
For each participant (n=99), two figures comprising various plots of the different steps of our data quality assessment routine are displayed in this supplementary material. The last five figures on Pages 215-219, represent the time series for the five datasets that were discarded due to insufficient data qualty. A brief explanation of the figures is provided below.

Figure 1
This figure presents the time series of the complete recording (560 seconds) for the two wavelengths (760 and 850 nm) used by our fNIRS system.
Top row: Raw intensity data. Middle row: Optical density data after conversion from raw intensity data. Bottom row: Optical density data after artifact correction using the wavelet-based despike method (Patel et al., 2014).

Figure 2
This figure shows various data quality assessment plots at different preprocessing steps. Rows 1 and 2 display the HbO and HbR time series. The third row shows the power spectral density of HbO and HbR concentration data at each preprocessing step. In the bottom row, and for each column (i.e., preprocessing step), leftmost plot shows the functional connectivity matrix. In this plot, the functional connectivity matrix for HbO and HbR is shown in the top-left part and bottom-right part, respectively. The functional connectivity matrix representing the correlation between HbO and HbR is shown in the top-right part (expected negative). Note that these matrices are symmetric with respect to the main diagonal. For preprocessed data, after filtering and global signal regression, the standard functional connectivity matrix and the functional connectivity matrix computed using robust regression (Santosa et al., 2017) are presented. The histogram plots in polar coordinates presented on the right part show the phase difference between HbO and HbR time-series (hPod value, Watanabe et al., 2017) at each preprocessing step.
Left column: raw HbO and HbR data. Middle column: HbO and HbR time series after filtering. Right column: HbO and HbR preprocessed data, after filtering and global signal regression.