Prefrontal cortex activation of return-to-work trainees in remission of mental disorders with depressive symptoms compared to that of healthy controls

Abstract. The increase in the number of patients with mental disorders with depressive symptoms has become a significant problem. To prevent people from developing those disorders and to support effective recovery, it is important to quantitatively and objectively monitor an individual’s mental state. Previous studies using a near-infrared spectroscopy imaging technique have shown a relationship between negative or depressive mood state and human prefrontal cortex (PFC) activation during verbal and spatial working memory tasks. In this study, we aimed to explore a biomarker of the mental state of people in remission of mental disorders with depressive symptoms using this technique. We obtained the PFC activation of return-to-work (RTW) trainees in remission of those disorders, compared it with that of healthy controls, and obtained subjective questionnaire scores with the Profile of Mood States. We compared the PFC activation with the questionnaire scores by receiver operating characteristic analysis using a logistic-regression model. The results, evaluated by area-under-curve analysis, showed that the PFC activation discriminated healthy controls from RTW trainees with moderate power. This study suggests that our PFC measurement technique has potential as a quantitative and objective assessment of mental state.


Introduction
The increase in the number of patients with mental disorders with depressive symptoms has become a significant problem. The prevalence rate of depression in high- and low-income countries ranges from 6.5% to 21%. 1 A World Health Organization survey shows that such mental disorders have the highest disability-adjusted life year (a measure of overall disease burden). 2 Moreover, depression is a highly recurrent disorder, and the factors associated with recurrence are variable (e.g., onset age, number of depressive episodes, genetic nature, comorbid psychopathology, and vulnerability factor). 3,4 Conventional methods using questionnaires have limitations in objectively diagnosing and monitoring mental states. 5 One solution to these problems might be to provide patients and/or healthcare professionals with a reliable biomarker to quantify the risk of mental disorder or the degree of recovery. Conventional diagnostic and healthcare/treatment methods have used information from standardized interviewing and/or questionnaire techniques, which give useful information about patients. However, these techniques might not always give correct information when a patient's answers are ambiguous or the patient is confused about their feelings. To obtain more accurate information about mental states than conventional techniques provide, an objective and quantitative biomarker has been desired in the mental healthcare field.
Functional neuroimaging techniques are noninvasive tools that would be useful for the study of potential biomarkers related to mental states, 6-8 because identifying such biomarkers is hampered by the inaccessibility of brain tissue for study and the scarcity of valid animal models. 6 A functional magnetic resonance imaging (fMRI) technique has been used for investigating the relationship between emotional states induced by a video clip and prefrontal cortex (PFC) activity during working memory (WM) tasks. 9 Recently, resting-state fMRI studies have reported that the mental states of depressed patients are related to the functional connectivity in the default mode network area. 10 fMRI has a high spatial resolution and monitors activation in deep areas of brain tissue but places physical constraints on subjects. In contrast, the near-infrared spectroscopy (NIRS) technique 11,12 noninvasively monitors cerebral hemodynamic changes when the subject is in an ordinary posture, such as sitting, and has been used for research and clinical purposes. 13-16 A differential diagnostic method for mental disorders based on the NIRS technique has also been proposed. This method was approved as an advanced medical technology by the Japan Ministry of Health, Labor, and Welfare (MHLW) in 2009 7 and then was covered by the insurance of the MHLW in 2014. 7,17 This method gives diagnostic support information for differentiation of major psychiatric disorders with depressive symptoms by analyzing human frontotemporal NIRS signals. 7 We assumed that the usefulness of NIRS in mental healthcare could be expanded from clinical situations to daily use situations, such as monitoring an individual's mood states in the workplace.
Previous studies using an NIRS system have suggested a significant correlation between an individual's natural mood states and cognitive functions and revealed that subjects reporting a higher level of depressive mood or negative mood (the total score for tension, depression, anxiety, fatigue, and confusion) showed a lower level of PFC activity during verbal WM tasks without a mood-induction method. 18-22 These studies have revealed the mood-cognition interaction in the PFC, which has been considered to play a crucial role in cognitive functions, 23-25 and have shown that the PFC activation could be a biomarker of the natural mood state of healthy individuals. However, these studies have focused on group-analysis results, whereas a robust biomarker also requires high accuracy at the individual level. It is therefore unclear whether an objective biomarker, such as an NIRS measurement, has an advantage over a subjective mood scale, such as a questionnaire. Thus, it is necessary to compare the NIRS measurement with a subjective mood scale to assess its advantages.
In this study, we aimed to explore the advantages of NIRS by comparing two kinds of indication: the NIRS measurement indication and the questionnaire-based indication with the Profile of Mood States (POMS™). 26,27 The NIRS measurement and questionnaire-based indications were used to discriminate between subjects in remission from mental disorders and healthy controls using a logistic-regression model. Then, we compared those indications by receiver operating characteristic (ROC) analysis 28 to test the hypothesis that an NIRS-based biomarker has advantages over a questionnaire-based indication.

Participants
Twenty-nine healthy Japanese volunteers (12 females and 17 males, mean age = 35.9 ± 7.7) and twenty-one return-to-work (RTW) trainees in remission of mental disorders with depressive symptoms (5 females and 16 males, mean age = 40.9 ± 6.31) participated in this study. The RTW trainees attended this trial instead of attending a normal training class for returning to work, which is given by clinical psychologists in an RTW facility. The facility obtains statements about each trainee's mental states (e.g., depressive symptoms) and the will to return to work but does not obtain other private information such as education level and treatment, which are managed by the doctors in charge, who are in different hospitals/clinics. Written informed consent was obtained from all participants after they were informed of the purpose, procedures, risks, benefits, and voluntary nature of the experiments. This study was conducted according to the regulations of the internal review board of the Central Research Laboratory, Hitachi, Ltd.

Mood State Measurement
The participants' mood state was assessed using the Japanese edition of a short form of the POMS™. 26,27 The participants rated their mood using 30 mood-related adjectives (angry, sad, vital, etc.) on a 5-point scale ranging from 0 ("not at all") to 4 ("extremely"). This rating enabled us to estimate six mood-related measures: tension-anxiety (T-A), depression-dejection (D), anger-hostility (A-H), vigor (V), fatigue (F), and confusion (C).

Prefrontal Cortex Activity Measurement
We used two types of multichannel functional NIRS system for measuring the changes in the cerebral hemoglobin concentration of the participants. 11,12,29,30 One was a commercial model of an NIRS system (ETG-4000, Hitachi Medical Corp.), 11,12 and the other was a prototype of a wearable optical topography (WOT) system developed by the Central Research Laboratory, Hitachi, Ltd. 29,30 The former was used to measure healthy controls in an experimental room, and the latter was used for participants in the RTW program because of its portability (we could use it at the other site where the program was operated). Each NIRS device emits near-infrared light (2 mW) at two wavelengths onto the scalp and detects the reflected light. The NIRS method measures changes in the product of hemoglobin (Hb) concentration and effective optical path length in human brain tissue. The unit of Hb change is molar concentration (mM) multiplied by optical path length (mm). The changes in oxygenated Hb (HbO2) and deoxygenated Hb (HHb) signals are estimated using absorption coefficient spectra for the two wavelengths. Both systems have the same source-detector distance (3 cm) and the same laser power (2 mW). The differences between the two systems were the sampling rate (the former was 10 Hz, and the latter was 5 Hz) and the pair of wavelengths (the former used 695 and 830 nm, and the latter used 754 and 830 nm). However, the waveforms of HbO2 and HHb measured with the latter prototype were confirmed as almost the same as those of the commercial model. 29,30 The ETG-4000 and WOT systems cover the entire forehead, and the measurement positions of each system were estimated by a probabilistic method used in previous studies. 31-33 The channel positions on the prefrontal area of both systems are shown in Fig. 1; however, the positions were not completely the same, so we chose the five channel positions numbered 1 to 5 that are closest to each other in the PFC for analysis.
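The text does not spell out how the two Hb signals are derived from the two wavelengths. A common way to write this step is the modified Beer-Lambert law, sketched below under stated assumptions rather than as the exact computation inside either device. All symbols are introduced here for illustration: $\Delta A(\lambda)$ is the measured change in absorbance at wavelength $\lambda$, $\varepsilon$ denotes a molar absorption coefficient, and $L$ is the effective optical path length, assumed equal at both wavelengths.

```latex
\begin{aligned}
\Delta A(\lambda_1) &= \varepsilon_{\mathrm{HbO_2}}(\lambda_1)\,\Delta C_{\mathrm{HbO_2}} L
                     + \varepsilon_{\mathrm{HHb}}(\lambda_1)\,\Delta C_{\mathrm{HHb}} L \\
\Delta A(\lambda_2) &= \varepsilon_{\mathrm{HbO_2}}(\lambda_2)\,\Delta C_{\mathrm{HbO_2}} L
                     + \varepsilon_{\mathrm{HHb}}(\lambda_2)\,\Delta C_{\mathrm{HHb}} L \\[4pt]
D &= \varepsilon_{\mathrm{HbO_2}}(\lambda_1)\,\varepsilon_{\mathrm{HHb}}(\lambda_2)
   - \varepsilon_{\mathrm{HbO_2}}(\lambda_2)\,\varepsilon_{\mathrm{HHb}}(\lambda_1) \\[4pt]
\Delta C_{\mathrm{HbO_2}} L &= \frac{\varepsilon_{\mathrm{HHb}}(\lambda_2)\,\Delta A(\lambda_1)
                               - \varepsilon_{\mathrm{HHb}}(\lambda_1)\,\Delta A(\lambda_2)}{D} \\
\Delta C_{\mathrm{HHb}} L &= \frac{\varepsilon_{\mathrm{HbO_2}}(\lambda_1)\,\Delta A(\lambda_2)
                             - \varepsilon_{\mathrm{HbO_2}}(\lambda_2)\,\Delta A(\lambda_1)}{D}
\end{aligned}
```

Because $L$ is not measured separately, the solved quantities are products of concentration change and path length, which matches the mM·mm unit stated above.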

Working Memory Tasks
We used verbal and spatial delayed match-to-sample WM tasks to measure the PFC activities [Fig. 2(a)]. These were the same as those used in previous NIRS studies. 18-22 Each task trial started with a 1.5-s presentation of the target stimuli (target) on a PC screen [Fig. 2(b)]. After a delay (black screen with white fixation cross) of 7 s, a test stimulus for retrieval (probe) was presented for 2 s or until the participant responded. The participant responded by pressing a button on a handheld game pad connected to the PC, and the test stimulus disappeared when the button was pressed within 2 s. The intervals between the end of a trial and the target onset in the next trial were randomized from 16 to 22 s to avoid participant habituation or prediction of the target onset. For a verbal WM task, a set of four Japanese hiragana characters was presented as the target, and a Japanese katakana character was presented as the probe. The participants were asked to judge whether the probe character corresponded to one of the four target characters. For the judgment, they used their right index and middle fingers to press the "yes" and "no" buttons, respectively. Because the characters presented as target and probe were in different Japanese morphograms, the participants made their decisions on the basis of the phonetic information conveyed by the characters, not their form. Auditory cues of 1000- and 800-Hz pure tones of 100-ms duration were presented at the target and probe onsets, respectively. For a spatial WM task, the target was given by the locations of four squares randomly arranged at eight peripheral locations. The probe was given by presenting a square at one of the eight locations. The participant was asked to judge whether the probe square location corresponded to one of the target locations.
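The trial timing described above can be laid out as a simple schedule. The following is a minimal sketch, not the presentation software used in the study: the function name `make_schedule`, the uniform distribution for the inter-trial interval, and the assumption that every probe stays on screen for the full 2 s are all illustrative choices.

```python
import random

TARGET_S = 1.5      # target presentation
DELAY_S = 7.0       # fixation-cross delay
PROBE_MAX_S = 2.0   # probe shown for at most 2 s (assumed: no early response)
ITI_RANGE_S = (16.0, 22.0)  # interval from trial end to next target onset


def make_schedule(n_trials, seed=0):
    """Return onset times (in seconds) for each trial of one session."""
    rng = random.Random(seed)
    t = 0.0
    trials = []
    for _ in range(n_trials):
        target_on = t
        probe_on = target_on + TARGET_S + DELAY_S
        trial_end = probe_on + PROBE_MAX_S
        trials.append(
            {"target_on": target_on, "probe_on": probe_on, "trial_end": trial_end}
        )
        # Assumed uniform jitter; the text only gives the 16 to 22 s range.
        t = trial_end + rng.uniform(*ITI_RANGE_S)
    return trials


schedule = make_schedule(n_trials=5)
for trial in schedule:
    print({k: round(v, 2) for k, v in trial.items()})
```

With these durations the probe always appears 8.5 s after the target onset, which is the reference point used later when the activity period is defined.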

Experimental Procedure
The participants were seated in a comfortable chair in a quiet room. Their mood states were assessed with POMS™ 26,27 at the beginning of the experiment. Next, the participants received computer-automated instructions that were followed by a brief practice session to familiarize the participants with the tasks. The practice session included two to five trials of each task depending on the participants' familiarization with the tasks. Thereafter, NIRS measurements were conducted while the participants performed the WM tasks. The tasks were organized into two sessions, one for the verbal WM task and the other for the spatial WM task, with a counterbalanced order across participants. Each session included five trials of either one of the tasks, and the sessions were separated by a short break (∼1 min). The duration of the NIRS measurements was ∼15 min, and the whole experiment took about 30 min.

Near-Infrared Spectroscopy Data Analysis
For the NIRS data analysis, we used MATLAB (The MathWorks, Inc.) and the plug-in-based software we developed. We defined a block for each signal as the period from 1 s before the target onset (the target lasts 1.5 s and is followed by a 7-s delay for memorization) to 16 s after the probe onset, 25.5 s in total. We analyzed five blocks of spatial WM tasks and five blocks of verbal WM tasks. To remove the components originating from slow fluctuations and the high-frequency noise of blood flow, a 1.5-Hz low-pass filter and a 0.02-Hz high-pass filter were applied to the signals. Then, the HbO2 signal in each block was baseline corrected by subtracting the average value of the first 1 s. Finally, we defined the PFC activity period as the period from 5 s after the target onset to just before the probe onset (3.5 s). 16 We focused on the HbO2 signal in the statistical analysis because the previous studies found main effects in this signal. 18-20 We used the t-statistic value of HbO2 signals during the PFC activity period from the five blocks for spatial and verbal WM tasks, denoted $\mathrm{HbO}_2^{\mathrm{spat}}$ and $\mathrm{HbO}_2^{\mathrm{verb}}$, respectively, as the measures estimating the PFC activity for each channel.
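The block extraction, baseline correction, and activity measure described above can be sketched as follows. This is an illustrative Python outline, not the authors' MATLAB plug-in: the band-pass filtering step is omitted, the function names are hypothetical, and the t-statistic is assumed to be a one-sample t value of the five block means against zero, which the text does not state explicitly. The 10-Hz sampling rate is that of the ETG-4000.

```python
import math
import statistics

PRE_S = 1.0            # block starts 1 s before target onset
BLOCK_S = 25.5         # total block length
ACTIVITY_S = (5.0, 8.5)  # activity period, relative to target onset


def block_activity(signal, onset_idx, fs):
    """Mean baseline-corrected signal in the activity period of one block."""
    start = onset_idx - int(round(PRE_S * fs))
    block = signal[start:start + int(round(BLOCK_S * fs))]
    # Baseline correction: subtract the mean of the first 1 s of the block.
    baseline = statistics.fmean(block[:int(round(PRE_S * fs))])
    corrected = [v - baseline for v in block]
    a0 = int(round((PRE_S + ACTIVITY_S[0]) * fs))
    a1 = int(round((PRE_S + ACTIVITY_S[1]) * fs))
    return statistics.fmean(corrected[a0:a1])


def activity_t_value(signal, onsets, fs):
    """One-sample t value of the block means against zero (assumed definition)."""
    means = [block_activity(signal, onset, fs) for onset in onsets]
    standard_error = statistics.stdev(means) / math.sqrt(len(means))
    return statistics.fmean(means) / standard_error


# Synthetic single-channel example: constant offset plus a step response
# during each activity period, with a different amplitude in each block.
FS_HZ = 10.0
amplitudes = [1.0, 1.2, 0.8, 1.1, 0.9]
onsets = [50 + 400 * k for k in range(5)]
signal = [2.0] * 2000
for onset, amp in zip(onsets, amplitudes):
    for i in range(onset + 50, onset + 85):
        signal[i] += amp

t_value = activity_t_value(signal, onsets, FS_HZ)
print(round(t_value, 3))
```

In practice the same function would be applied per channel and per task, giving one value for the spatial blocks and one for the verbal blocks.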

Modeling Indication of Healthy Control/Return-to-Work Trainee
To estimate the indication of the difference between healthy controls and RTW trainees, we applied multivariable logistic-regression analysis to the POMS™ scores, $\mathrm{HbO}_2^{\mathrm{spat}}$, and $\mathrm{HbO}_2^{\mathrm{verb}}$ (Ch 1 to 5, see Fig. 1). In this analysis, each participant's data were labeled with a binary value, which is 1 if the participant is a healthy control and 0 if the participant is an RTW trainee. Next, manual backward selection was used to eliminate nonsignificant components from each logistic-regression model, and the regression analysis was repeated using only the significant component(s) (p < 0.05).
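The modeling step can be illustrated with a small self-contained sketch. It is not the authors' MATLAB analysis: the Newton-Raphson fit, the Wald-test p values, and the rule of dropping the least significant predictor one at a time are assumptions chosen for illustration, and the predictor names and synthetic data are hypothetical. Labels follow the convention above (1 = healthy control, 0 = RTW trainee).

```python
import math


def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def score_and_information(X, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, label in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
        for i in range(k):
            grad[i] += (label - p) * x[i]
            for j in range(k):
                info[i][j] += p * (1.0 - p) * x[i] * x[j]
    return grad, info


def fit_logistic(rows, y, max_iter=50):
    """Return (coefficients, two-sided Wald p values); the intercept comes first."""
    X = [[1.0] + list(r) for r in rows]
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        grad, info = score_and_information(X, y, beta)
        step = [sum(c * g for c, g in zip(row, grad)) for row in invert(info)]
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    cov = invert(score_and_information(X, y, beta)[1])
    z = [b / math.sqrt(cov[i][i]) for i, b in enumerate(beta)]
    return beta, [math.erfc(abs(v) / math.sqrt(2.0)) for v in z]


def backward_select(data, y, names, alpha=0.05):
    """Drop the least significant predictor until all remaining have p < alpha."""
    names = list(names)
    while names:
        beta, pvals = fit_logistic([[d[n] for n in names] for d in data], y)
        worst = max(range(len(names)), key=lambda i: pvals[i + 1])
        if pvals[worst + 1] < alpha:
            return names, beta, pvals
        names.pop(worst)
    return [], [], []


# Hypothetical data: one predictor related to the label, one balanced noise term.
cells = [(0, 1, 4), (0, 0, 16), (1, 1, 16), (1, 0, 4)]  # (predictor, label, count)
data, labels = [], []
for x, label, count in cells:
    for i in range(count):
        data.append({"informative": float(x), "noise": float(i % 2)})
        labels.append(label)

selected, coefs, p_values = backward_select(data, labels, ["informative", "noise"])
print(selected, [round(c, 3) for c in coefs], [round(p, 4) for p in p_values])
```

The fitted model's output for a participant is the predicted probability of the label 1, so values closer to 1 indicate a state more like that of the healthy controls.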
To evaluate the performance of logistic-regression models from the three sets of POMS™, $\mathrm{HbO}_2^{\mathrm{spat}}$, and $\mathrm{HbO}_2^{\mathrm{verb}}$, we used ROC analysis. 28 The area under the curve (AUC) was used for quantitative comparison of the ROC curves obtained from those logistic-regression models, and its 95% confidence interval (CI) was calculated by bootstrap estimation with 10,000 replications.
Changes in the HbO2 signals during spatial and verbal WM tasks for both groups increase after the target onset and return toward the baseline within several seconds after the probe stimuli, and those in the HHb signals are almost flat compared to those in the HbO2 signals. However, the amplitudes of the first peak (around 7 to 9 s) after the target onset for the RTW trainees are less than half of those for the healthy controls.

Logistic-Regression Models of Prefrontal Cortex Activity and Profile of Mood States
The logistic-regression results from $\mathrm{HbO}_2^{\mathrm{spat}}$ and $\mathrm{HbO}_2^{\mathrm{verb}}$ (calculated from HbO2 signals) are summarized in Table 2. This table shows the regression coefficients and p values of the target channels. No significant components are observed in the results from $\mathrm{HbO}_2^{\mathrm{spat}}$. The statistically significant components are Ch 1 and 3 of $\mathrm{HbO}_2^{\mathrm{verb}}$. We applied logistic-regression analysis to those two channels again, since we assumed that the other channels contribute less to discriminating between the RTW trainee group and the healthy control group than those two channels (Ch 1 and 3), and then we obtained significant results (Ch 1: regression coef = 0.19, p = 0.0065; Ch 3: regression coef = −0.70, p = 0.035).
Journal of Biomedical Optics 056008-4, May 2019, Vol. 24(5)
The logistic-regression results from POMS™ are shown in Table 3. The statistically significant component is the vigor score (POMS_V). We applied logistic-regression analysis to only POMS_V, and then we obtained significant results (POMS_V: regression coef = 0.29, p = 0.0062).

Receiver Operating Characteristic Analysis
To assess the performance of discrimination between healthy controls and RTW trainees based on logistic-regression models of $\mathrm{HbO}_2^{\mathrm{verb}}$ and POMS_V, we used ROC curve analysis. An ROC curve shows the relationship of sensitivity versus 1 − specificity when the criterion to discriminate the binary state based on the designed model is changed; thus, the model with higher sensitivity and higher specificity is closer to the upper left corner. Figure 4 shows the ROC plots derived from $\mathrm{HbO}_2^{\mathrm{verb}}$ and POMS_V. The solid diagonal line from (0, 0) to (1, 1) indicates the no-discrimination line (chance level), and its AUC is equal to 0.5. In Fig. 4, the ROC curves of both $\mathrm{HbO}_2^{\mathrm{verb}}$ and POMS_V fall between the upper left corner and the no-discrimination line. The AUC of $\mathrm{HbO}_2^{\mathrm{verb}}$ is 0.792 (95% CI 0.651 to 0.918) and that of POMS_V is 0.755 (95% CI 0.591 to 0.891), both indicating moderate power for discrimination, with the AUC of $\mathrm{HbO}_2^{\mathrm{verb}}$ slightly higher than that of POMS_V. To estimate the best discrimination point of each ROC curve, the point nearest to the upper left point was selected and marked with a circle on the curve. At the best discrimination point of $\mathrm{HbO}_2^{\mathrm{verb}}$, the sensitivity and specificity are 75.9% and 81%, respectively. At that of POMS_V, those values are 90% and 57.1%, respectively.
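The ROC quantities used here (AUC, its bootstrap CI, and the point nearest to the upper left corner) can be computed as in the sketch below. It is an illustration under assumptions, not the authors' code: the AUC is computed in its rank (pairwise) form, the CI is a percentile bootstrap that resamples participants with replacement and skips resamples containing only one group, and the scores are made-up model outputs with healthy controls treated as the positive class.

```python
import math
import random


def auc(pos, neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(pos, neg, n_boot=10_000, seed=0):
    """Percentile 95% CI of the AUC (assumed bootstrap variant)."""
    rng = random.Random(seed)
    cases = [(s, 1) for s in pos] + [(s, 0) for s in neg]
    values = []
    while len(values) < n_boot:
        sample = [rng.choice(cases) for _ in cases]
        p = [s for s, label in sample if label == 1]
        n = [s for s, label in sample if label == 0]
        if p and n:  # skip degenerate resamples with a single group
            values.append(auc(p, n))
    values.sort()
    return values[int(0.025 * n_boot)], values[int(0.975 * n_boot) - 1]


def nearest_to_corner(pos, neg):
    """Threshold whose ROC point is closest to (0, 1), with its sensitivity/specificity."""
    best = None
    for thr in sorted(set(pos) | set(neg)):
        sens = sum(s >= thr for s in pos) / len(pos)
        spec = sum(s < thr for s in neg) / len(neg)
        dist = math.hypot(1.0 - sens, 1.0 - spec)
        if best is None or dist < best[0]:
            best = (dist, thr, sens, spec)
    return best[1:]


# Hypothetical model outputs (probability of "healthy") for a tiny sample.
healthy_scores = [0.9, 0.8, 0.6, 0.4]
trainee_scores = [0.7, 0.3, 0.2]

auc_value = auc(healthy_scores, trainee_scores)
ci_low, ci_high = bootstrap_auc_ci(healthy_scores, trainee_scores)
threshold, sensitivity, specificity = nearest_to_corner(healthy_scores, trainee_scores)
print(auc_value, (ci_low, ci_high), threshold, sensitivity, specificity)
```

As a consistency check on the reported values, 75.9% and 81% correspond to 22 of 29 healthy controls and 17 of 21 RTW trainees, and 57.1% corresponds to 12 of 21 trainees.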

Discussion
In this study, we aimed to explore an effective biomarker of an individual's mental state and evaluated the PFC activation of RTW trainees in remission of mental disorders and that of healthy controls. We applied the logistic-regression model to the PFC activation of the participants and found that the PFC activation in the right and central frontopolar cortex during a verbal WM task is statistically significant for discriminating those participants into two groups: the RTW trainee group and the healthy control group. This result is consistent with some previous articles that have shown that PFC activities during a verbal WM task, rather than those during a spatial WM task, are inversely correlated with a negative/depressive mood state 18-21 and might be supported by a previous fMRI study that reported that an induced negative mood attenuated PFC activity during a verbal WM task. 34 In our study, the statistically significant coefficients in the logistic-regression model for the right and central areas (Ch 1 and 3, respectively) of the frontopolar cortex are positive (0.19, p = 0.0065) and negative (−0.70, p = 0.035), respectively. The model outputs a value between 0 and 1, where a higher value means a more healthy state because "1" means healthy and "0" does not, so the NIRS data at the right area of the frontopolar cortex correspond to the results in previous studies. 18-21 Previous studies have shown that the frontopolar cortex plays a crucial role in WM, attentional resource allocation (referred to as branching), and planning. 35,36 Thus, our result suggests that the difference in these functions between the RTW trainees and the healthy controls is reflected. However, we need to investigate the activation of the central area of the frontopolar cortex because it is not clear how this activation contributes to the model, and its coefficient (Ch 3, negative) has the opposite sign to that of Ch 1 (positive).
On the basis of the POMS scores, we also made a logistic-regression model and obtained a statistically significant component from only the POMS vigor score (POMS_V), for which the regression coefficient is 0.29 (p = 0.0062). This result shows that a higher POMS_V score, that is, a more vigorous state, corresponds to a model output closer to the healthy label.
Based on the above results, we compared the logistic-regression models derived from PFC activities and POMS_V using ROC analysis. In this analysis, the ROC curves of both PFC activity and POMS_V are over the chance level. The AUC of PFC activities is 0.792 (95% CI 0.625 to 0.898), which indicates moderate power to discriminate a healthy control and an RTW trainee. The AUC of POMS_V is 0.755 (95% CI 0.625 to 0.875), which also indicates moderate power. The 95% CIs of those two AUCs overlap well, but the AUC of PFC activities is slightly higher than that of POMS_V.
In the ROC analysis, we also extracted the point nearest to the upper left point (0, 1) on each ROC curve, which gives one of the best discrimination points. At the nearest point on the ROC curve of PFC activities, the sensitivity and specificity are 75.9% and 81%, respectively, and for POMS_V, those values are 90% and 57.1%, respectively. The results show that the sensitivity of POMS_V is very high but its specificity is about the chance level, which indicates that it is difficult to discriminate between the healthy controls and the RTW trainees using only the vigor score of POMS. The sensitivity of PFC activities (75.9%) is less than that of POMS_V (90%), but the specificity is high (81%), so the measures of PFC activities have the potential to discriminate between the healthy controls and the RTW trainees. However, we have not obtained enough results showing advantages of the PFC activation over the POMS_V for discriminating between the two populations (RTW trainee group and healthy control group) in our study, because the AUC of PFC activities and that of POMS_V both indicate moderate power (0.792 and 0.755, respectively) and their 95% CIs overlap (0.625 to 0.898 and 0.625 to 0.875, respectively); thus, we need more investigation with a larger sample size and more details about the participants.
Our study has several limitations. We used two kinds of NIRS measurement systems: a floor-standing commercial model using optical fibers (ETG-4000, Hitachi Medical Corp.) for the healthy controls and a prototype of the WOT system that we previously developed 29,30 for the RTW trainees, because the former system was difficult to install at the RTW facility, where there was not enough space. The performance and specification of the latter have been confirmed by our previous studies, but the measurement positions were limited and slightly different from those of the former model. Thus, we confirmed the measurement positions of each system using the spatial registration method. 31-33 For further investigation, we need to analyze NIRS data from a wide range of the PFC by obtaining the data from the same measurement positions with the same system for the different groups.
Another limitation is that we did not control for medical information in the analysis, since we did not obtain such information about the RTW trainees (diagnosis, medical history, treatment in hospital/clinic, education level, and so on) except for depressive symptoms, age, and gender, because it was difficult to obtain such information from the trainees' doctors at different hospitals/clinics. In addition, we did not control for the gender of the RTW trainees (5 females and 16 males) in the analysis, so it is difficult to discuss gender differences in this study. Thus, this study was limited to evaluation of the difference in depressive symptoms between the RTW trainees and the healthy controls. We need to conduct precise analyses of various aspects of mental disorders with depressive symptoms by accumulating biometric measures and medical information.
A further limitation is that it is difficult to distinguish whether the indication derived from the PFC activities in our study shows the participant's state or trait, because we did not obtain the participants' personality trait scores that were evaluated in the previous study. 19 Further investigation is needed to clarify whether the indication shows the state or the trait.
Finally, the results of this study show that PFC activation has the potential to provide one indication for discriminating between RTW trainees in remission of mental disorders with depressive symptoms and healthy controls. By minimizing the above limitations, we hope that the current results can lead to more practical applications, such as individual monitoring for both healthy individuals and RTW trainees.

Disclosures
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.