The "Podcast" ECoG dataset for modeling neural activity during natural language comprehension.

Publication Year
2025

Type
Journal Article

Abstract

Naturalistic electrocorticography (ECoG) data are a rare but essential resource for studying the brain's linguistic capabilities. ECoG offers high temporal resolution suitable for investigating processes at multiple timescales and frequency bands. It also provides broad spatial coverage, often along critical language areas. Here, we share a dataset of nine ECoG participants with 1,330 electrodes listening to a 30-minute audio podcast. The richness of this naturalistic stimulus can be used for various research questions, from auditory perception to narrative integration. In addition to the neural data, we extracted linguistic features of the stimulus ranging from phonetic information to large language model word embeddings. We use these linguistic features in encoding models that relate stimulus properties to neural activity. Finally, we provide detailed tutorials for preprocessing raw data, extracting stimulus features, and running encoding analyses that can serve as a pedagogical resource or a springboard for new research.

Journal
Scientific Data
Volume
12
Issue
1
Pages
1135
Date Published
07/2025
ISSN Number
2052-4463
Alternate Journal
Sci Data
PMCID
PMC12226714
PMID
40610484