Activations of Deep Convolutional Neural Network are Aligned with Gamma Band Activity of Human Visual Cortex
Research article by Ilya Kuzovkin, Raul Vicente, Mathilde Petton, Jean-Philippe Lachaux, Monica Baciu, Philippe Kahane, Sylvain Rheims, Juan R. Vidal and Jaan Aru.
Previous work demonstrated a direct correspondence between the hierarchy of the human visual areas and layers of deep convolutional neural networks (DCNN) trained on visual object recognition. We used DCNNs to investigate which frequency bands carry feature transformations of increasing complexity along the ventral visual pathway. By capitalizing on intracranial depth recordings from 100 patients and 11,293 electrodes we assessed the alignment between the DCNN and signals at different frequency bands in different time windows. We found that activity in low and high gamma bands was aligned with the increasing complexity of visual feature representations in the DCNN. These findings show that activity in the gamma band is not only a correlate of object recognition, but carries increasingly complex features along the ventral visual pathway. These results demonstrate the potential that modern artificial intelligence algorithms have in advancing our understanding of the brain.
Overview of the three major steps and the analysis flow:
First, we feed the stimuli through the DCNN to extract the activations of each of its layers.
See DNN/README.md
section "Scripts and running order" for details.
Next, we transform the brain data into a suitable structure and extract features.
See Intracranial/README.md
for details.
Finally, we use the DNN activations to map each intracranial probe to a DNN layer.
See Collusion/README.md
for details.
Our data consists of two big chunks: activations and derivatives of the human brain responses, and activations and derivatives of the DCNN responses. We have made most of it publicly available; the exception, due to restrictions imposed by a third party, is the raw LFP responses from the implanted electrodes. Starting from the very next step of the analysis -- spectral decomposition of the LFPs -- the data is available:
Full set of 419 stimuli presented to the subjects: images.zip | 3.4 Mb
Out of those 419, a subset of 250 images from 5 categories was used in the current analysis: stimsequence.txt | 1.2 Kb
Human brain responses to those images from 11,293 electrodes across 100 subjects were recorded, resulting in 2,823,250 LFP recordings: [not available publicly]
Recordings were preprocessed using detrending, artifact rejection, bipolar rereferencing and dropping of non-responsive electrodes: [not available publicly]
LFP responses of the remaining responsive electrodes were split into 15 regions of interest: 5 frequency bands (θ, α, β, γ, Γ) in 3 time windows (50-250 ms, 150-350 ms, 250-450 ms), and the average baseline-normalized power was extracted for each probe in each region of interest: mean_band_responses.zip | 27 Mb. Inside that archive you will find 15 directories -- one per region of interest. Within each directory there are 100 .mat
files -- one per subject; each of those files has the following structure:
s.name name (anonymized) of the test subject
s.stimseq filenames of the stimuli images (always the same across all subjects)
s.stimgroup category of each of the stimuli images (always the same across all subjects)
s.probes.mni [N x 3] matrix with MNI coordinates of N probes
s.probes.areas Brodmann area of each of N probes
s.data [250 x N] baseline-normalized average power of each probe's response to each of the stimuli
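As a minimal, self-contained sketch of this per-subject file layout, the snippet below writes a toy struct with the field names listed above and reads it back with SciPy. The toy filename, probe count, and random values are placeholders; only the field structure follows the listing above.

```python
import numpy as np
from scipy.io import loadmat, savemat

n_stim, n_probes = 250, 4  # 4 probes is an arbitrary toy value

s = {
    'name': 'subject_anonymized',
    'stimseq': np.array(['img_%03d.jpg' % i for i in range(n_stim)], dtype=object),
    'stimgroup': np.array(['house'] * n_stim, dtype=object),
    'probes': {
        'mni': np.random.randn(n_probes, 3),    # MNI coordinates of each probe
        'areas': np.arange(1, n_probes + 1),    # Brodmann area of each probe
    },
    'data': np.random.randn(n_stim, n_probes),  # baseline-normalized average power
}
savemat('toy_subject.mat', {'s': s})

# struct_as_record=False gives attribute access matching the s.field notation above
m = loadmat('toy_subject.mat', squeeze_me=True, struct_as_record=False)['s']
print(m.data.shape)        # (250, 4)
print(m.probes.mni.shape)  # (4, 3)
```

The same access pattern (`m.data`, `m.probes.mni`) should apply to the real files in the archive.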
An RDM was computed for each probe's response: brain_and_dnn_rdms.zip | 1.8 Gb contains 15 directories; inside each you will find M
files, where M is the number of responsive probes in that region of interest, plus 17 files that hold the RDMs of the DCNN layers (for both the pre-trained AlexNet and a randomly initialized AlexNet; these 17 files are the same in all 15 directories). Brain RDM filenames follow the format brain-SUBJECT-PROBE.txt,
and each file holds a 250 x 250 RDM that describes that probe's representational geometry.
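For orientation, here is a hedged sketch of how such 250 x 250 RDMs can be built. For a DCNN layer, 1 minus the Pearson correlation between activation vectors is the standard RSA dissimilarity; for a single probe, whose response to each image is a scalar, absolute difference is one natural choice. Both metric choices are assumptions for illustration, not a claim about the exact metrics used in this analysis.

```python
import numpy as np

def layer_rdm(acts):
    """acts: (n_images, n_units) layer activations -> (n_images, n_images) RDM."""
    return 1.0 - np.corrcoef(acts)  # correlation distance between image rows

def probe_rdm(power):
    """power: (n_images,) scalar probe responses -> (n_images, n_images) RDM."""
    return np.abs(power[:, None] - power[None, :])

rng = np.random.default_rng(0)
acts = rng.standard_normal((250, 4096))   # toy stand-in for one layer's activations
power = rng.standard_normal(250)          # toy stand-in for one probe's responses
print(layer_rdm(acts).shape, probe_rdm(power).shape)  # (250, 250) (250, 250)
```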
The particular DCNN used in our research was AlexNet trained on ImageNet. The weights of this network are publicly available: BVLC Reference CaffeNet.
Responses of that network to our set of images: alexnet-activations.zip | 395 Mb has an .npy
file for each layer of the DCNN.
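Each .npy file holds one array of layer activations for the full stimulus set, loadable with NumPy. The filename and feature dimension below are stand-ins, not the actual names in the archive:

```python
import numpy as np

np.save('conv5_toy.npy', np.random.randn(250, 9216))  # stand-in for a real layer file
acts = np.load('conv5_toy.npy')
print(acts.shape)  # (250, 9216) -> one row of activations per stimulus
```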
Once we had both the human and the machine responses, we computed the representational geometry of each and compared those geometries using representational similarity analysis (RSA).
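The core RSA comparison correlates the off-diagonal upper triangles of two RDMs. Spearman correlation is the common choice in RSA; treat that choice, and the toy data, as assumptions for illustration:

```python
import numpy as np
from scipy.stats import spearmanr

def rsa_score(rdm_a, rdm_b):
    """Spearman correlation between the off-diagonal upper triangles of two RDMs."""
    iu = np.triu_indices_from(rdm_a, k=1)
    rho, _ = spearmanr(rdm_a[iu], rdm_b[iu])
    return rho

rng = np.random.default_rng(0)
a = rng.random((250, 250)); a = (a + a.T) / 2; np.fill_diagonal(a, 0)  # toy "brain" RDM
b = a + 0.1 * rng.random((250, 250))                                   # noisy copy
print(round(rsa_score(a, b), 2))  # high score: the two geometries are similar
```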
Final RSA scores and p-values: rsa_scores_and_pvalues.zip | 2.9 Mb shows the correlation between each (probe, layer) pair of RDMs and its significance level based on a permutation test, both for the pre-trained and the random (control) networks.
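A permutation test for such an RSA score can be sketched as follows: shuffle the stimulus labels of one RDM (rows and columns together) and recompute the correlation to build a null distribution. The permutation count and the one-sided form with the +1 correction are common conventions, assumed here rather than taken from the paper:

```python
import numpy as np
from scipy.stats import spearmanr

def rsa_score(rdm_a, rdm_b):
    iu = np.triu_indices_from(rdm_a, k=1)
    rho, _ = spearmanr(rdm_a[iu], rdm_b[iu])
    return rho

def permutation_pvalue(rdm_a, rdm_b, n_perm=1000, seed=0):
    """One-sided p-value: how often a stimulus-shuffled rdm_b matches rdm_a as well."""
    rng = np.random.default_rng(seed)
    observed = rsa_score(rdm_a, rdm_b)
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(rdm_b.shape[0])           # shuffle stimulus labels
        null[i] = rsa_score(rdm_a, rdm_b[np.ix_(perm, perm)])
    return (np.sum(null >= observed) + 1) / (n_perm + 1)  # +1 correction

rng = np.random.default_rng(1)
a = rng.random((50, 50)); a = (a + a.T) / 2; np.fill_diagonal(a, 0)  # toy RDM
p_match = permutation_pvalue(a, a, n_perm=200)  # an RDM vs itself -> tiny p-value
print(p_match)
```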