Do men and women have different brains? Previous neuroimaging studies have sought to answer this question based on morphological differences between specific brain regions, but have reported conflicting results. In the present study, we use a deep learning technique to address this question on a large open-access diffusion MRI database recorded from 1,065 young healthy subjects, including 490 men and 575 women. In contrast to the commonly used 2D Convolutional Neural Network (CNN), we propose a 3D CNN with a newly designed structure comprising three hidden layers in cascade with a linear layer and a terminal softmax layer. The proposed 3D CNN was applied to maps of fractional anisotropy (FA) in the whole brain as well as in specific brain regions. An entropy measure was applied to the lowest-level image features extracted from the first hidden layer to examine differences in brain structure complexity between men and women. The results were compared with those obtained using a Support Vector Machine (SVM) and Tract-Based Spatial Statistics (TBSS). The proposed 3D CNN yielded a better classification result (93.3%) than the SVM (78.2%) on whole-brain FA images, indicating that gender-related differences likely exist across the whole brain. Moreover, high classification accuracies were also obtained in several specific brain regions, including the left precuneus, the left postcentral gyrus, the left cingulate gyrus, the right orbital gyrus of the frontal lobe, and the left occipital thalamus in the gray matter, and the middle cerebellar peduncle, the genu of the corpus callosum, the right anterior corona radiata, the right superior corona radiata, and the left anterior limb of the internal capsule in the white matter. This study provides new insight into the structural differences between men and women and highlights the importance of considering sex as a biological variable in brain research.
Recent studies indicate that gender may have a substantial influence on human cognitive functions, including emotion, memory, and perception (Cahill, 2006). Men and women appear to encode memories, sense emotions, recognize faces, solve certain problems, and make decisions differently. Since the brain controls cognition and behavior, these gender-related functional differences may be associated with gender-specific structure of the brain (Cosgrove et al., 2007).
Diffusion tensor imaging (DTI) is an effective tool for characterizing nerve fiber architecture. By computing the fractional anisotropy (FA) parameter from DTI, the anisotropy of nerve fibers can be quantitatively evaluated (Lasi et al., 2014). Differences in FA values are thought to be associated with developmental processes such as changes in axon caliber, myelination, and/or the fiber organization of nerve fiber pathways. By computing FA, researchers have revealed subtle changes related to normal brain development (Westlye et al., 2009), learning (Golestani et al., 2006), and healthy aging (Kochunov et al., 2007). Nevertheless, existing studies have yet to provide consistent results on differences in brain structure between men and women. Ingalhalikar et al. (2014) argued that men have greater intra-hemispheric connectivity, while women have greater inter-hemispheric connectivity via the corpus callosum. However, other studies reported no significant gender difference in brain structure (Raz et al., 2001; Salat et al., 2005). A recent critical opinion article suggested that more research is needed to investigate whether men and women really have different brain structures (Joel and Tarrasch, 2014).
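For reference, FA follows the standard diffusion tensor definition. Given the eigenvalues λ1, λ2, λ3 of the diffusion tensor at a voxel, the mean diffusivity is MD = (λ1 + λ2 + λ3)/3 and

FA = sqrt(3/2) · sqrt( [ (λ1 − MD)² + (λ2 − MD)² + (λ3 − MD)² ] / (λ1² + λ2² + λ3²) ),

so FA ranges from 0 (fully isotropic diffusion) to 1 (diffusion along a single axis).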
Most existing DTI studies used group-level statistical methods such as Tract-Based Spatial Statistics (TBSS) (Thatcher et al., 2010; Mueller et al., 2011; Shiino et al., 2017). However, recent studies indicate that machine learning techniques may provide a more powerful tool for analyzing brain images (Shen et al., 2010; Lu et al., 2017; Tang et al., 2018). In particular, deep learning can extract non-linear network structure, approximate complex functions, characterize distributed representations of the input data, and learn the essential features of a dataset from a relatively small number of samples (Zeng et al., 2016, 2018a; Tian et al., 2018; Wen et al., 2018). The deep convolutional neural network (CNN), specifically, uses convolution kernels to extract image features and can detect characteristic spatial differences in brain images, which promises better results than conventional machine learning and statistical methods (Cole et al., 2017).
In this study, we performed CNN-based analyses on the FA images and extracted the features of the hidden layers to investigate the differences between male and female brains. Unlike the commonly used 2D CNN models, we propose a 3D CNN model with a new structure comprising three hidden layers, a linear layer, and a softmax layer. Each hidden layer is composed of a convolutional layer, a batch normalization layer, and an activation layer, followed by a pooling layer. This novel CNN model allows using the whole 3D brain image (i.e., the FA volume derived from DTI) as the input to the model. The linear layer between the hidden layers and the softmax layer reduces the number of parameters and thereby helps avoid over-fitting.
Materials and Methods
MRI Data Acquisition and Preprocessing
The database used in this work is from the Human Connectome Project (HCP) (Van Essen et al., 2013). This open-access database contains data from 1,065 subjects, including 490 men and 575 women, aged 22 to 36 years. This represents a relatively large sample size compared with most neuroimaging studies, and using an open-access dataset allows other researchers to replicate and extend this work.
DTI data preprocessing included format conversion, b0 image extraction, brain extraction, eddy current correction, and tensor fitting for FA calculation. The first four steps were performed with the HCP diffusion pipeline, whose outputs include the diffusion weightings (bvals), directions (bvecs), the diffusion time series, a brain mask, a file (grad_dev.nii.gz) accounting for gradient non-linearities during model fitting, and log files of the EDDY processing. In the final step, we used dtifit to fit the diffusion tensors and obtain the FA, as well as mean diffusivity (MD), axial diffusivity (AD), and radial diffusivity (RD) values.
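The scalar maps produced in this final step are all simple functions of the tensor eigenvalues. The following is a minimal NumPy sketch of those definitions for illustration; it is not the dtifit implementation itself, only the standard formulas it computes:

```python
import numpy as np

def tensor_scalars(evals):
    """Compute FA, MD, AD, RD from diffusion tensor eigenvalues.

    evals: array of shape (..., 3), eigenvalues sorted in descending order.
    Returns the four scalar maps as arrays broadcast over the leading axes.
    """
    evals = np.asarray(evals, dtype=float)
    md = evals.mean(axis=-1)                # mean diffusivity: average eigenvalue
    ad = evals[..., 0]                      # axial diffusivity: largest eigenvalue
    rd = evals[..., 1:].mean(axis=-1)       # radial diffusivity: mean of the two smaller
    num = ((evals - md[..., None]) ** 2).sum(axis=-1)
    den = (evals ** 2).sum(axis=-1)
    # Guard against zero tensors (e.g., background voxels) to avoid division by zero.
    fa = np.sqrt(1.5 * np.divide(num, den, out=np.zeros_like(den), where=den > 0))
    return fa, md, ad, rd
```

For an isotropic tensor (equal eigenvalues) FA is 0; for diffusion confined to a single axis FA is 1, matching the expected range of the FA maps used as network input.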
The original data were too large to train the model with: training would raise a RESOURCE_EXHAUSTED error due to insufficient GPU memory. Each GPU used in the experiments was an NVIDIA TITAN Xp with 12 GB of memory. To solve this problem, we scaled the FA images down to 58 × 70 × 58 voxels. This procedure may even lead to a better classification result, since a smaller input image gives the CNN model a relatively larger receptive field. To perform the image scaling, "dipy" (http://nipy.org/dipy/) was used to read the .nii FA data, and "ndimage" in SciPy (https://scipy.org) was used to reduce the size of the data. The scaled data were written into TFRecord files together with the corresponding labels. TFRecord is a simple record-oriented binary format widely used in TensorFlow applications to achieve high input efficiency for training data. The labels were one-hot encoded. We implemented a pipeline that reads data asynchronously from the TFRecord files according to the interface specification provided by TensorFlow (Abadi et al., 2016). The pipeline included reading the TFRecord files, decoding the data, converting the data type, and reshaping the data.
We conducted the experiments on a GPU workstation with four NVIDIA TITAN Xp GPUs, running Ubuntu 16.04. We used FSL to preprocess the data. The CNN model was implemented in the open-source machine learning framework TensorFlow (Abadi et al., 2016).
The commonly used CNN structures operate on 2D images. Using a 2D CNN on 3D MRI volumes requires projecting the original volume along different directions to obtain 2D images, which loses the spatial structure information of the image. In this study, we designed a 3D CNN with 3D convolutional kernels, which allowed us to extract 3D structural features directly from the FA images. Moreover, traditional CNN models usually use several fully connected layers to connect the hidden layers to the output layer. Fully connected layers may be prone to over-fitting in binary classification when the number of samples is limited (as with our data). To address this problem, we replaced the fully connected layers with a single linear layer. The linear layer integrates the outputs of the hidden layers (a 3D matrix comprised of multiple feature maps) into the input (a 1D vector) of the output layer, which is a softmax classifier. Moreover, we performed Batch Normalization (Ioffe and Szegedy, 2015) after each convolution operation to avoid the internal covariate shift problem in training the CNN model. We therefore call our model a 3D "pure" CNN (3D PCNN). The architecture of the 3D PCNN is shown in Figure 1. The 3D PCNN consists of three hidden layers, a linear layer, and a softmax layer. Each hidden layer contains a convolutional layer, a Batch Normalization layer, an activation layer, and a pooling layer, with several feature maps as the outputs.
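As a rough sketch of how data flows through this architecture, the following traces the tensor shapes layer by layer for a 58 × 70 × 58 input. The filter counts (32/64/128) and the 'same'-padded convolutions with 2 × 2 × 2 pooling are illustrative assumptions, not hyper-parameters stated in the text:

```python
import math

def pcnn_shapes(input_shape=(58, 70, 58), channels=(32, 64, 128), num_classes=2):
    """Trace tensor shapes through the 3D PCNN: three hidden layers
    (conv + batch norm + activation + pooling), then a linear layer
    feeding a softmax classifier.

    Assumes 'same'-padded convolutions (spatial size preserved) and
    2x2x2 pooling with ceil rounding; channel counts are hypothetical.
    """
    shape, c = tuple(input_shape), 1
    trace = [("input", shape + (c,))]
    for i, out_c in enumerate(channels, start=1):
        # Convolution keeps spatial size; pooling halves each dimension.
        shape = tuple(math.ceil(d / 2) for d in shape)
        c = out_c
        trace.append((f"hidden{i}", shape + (c,)))
    # The linear layer flattens all feature maps into one vector...
    flat = shape[0] * shape[1] * shape[2] * c
    trace.append(("linear_in", (flat,)))
    # ...and maps it directly to the class scores for the softmax.
    trace.append(("softmax", (num_classes,)))
    return trace
```

Under these assumptions the three pooling stages shrink the volume from 58 × 70 × 58 to 8 × 9 × 8, so the single linear layer maps one flattened vector to the two class scores, far fewer parameters than a stack of fully connected layers would need.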