Automated Narrative Analysis


This study compared the accuracy of four machine learning methods in predicting narrative macrostructure scores against scores assigned by human raters using a criterion-referenced progress-monitoring rubric. The methods included some that rely on hand-engineered features and others that learn directly from the raw text. The predictive models were trained on a corpus of 414 narratives from a normative sample of school-aged children (5;0-9;11) who were given a standardized measure of narrative proficiency. Performance was measured using Quadratic Weighted Kappa, a metric of inter-rater reliability. The results indicated that one model, BERT, not only achieved significantly higher scoring accuracy than the other methods but was also consistent with scores obtained by human raters using a valid and reliable rubric. These findings suggest that a machine learning method, specifically BERT, shows promise as a way to automate the scoring of narrative macrostructure for potential use in clinical practice.
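For readers who want to compare their own model output against the human scores in this dataset, the following is a minimal sketch of Quadratic Weighted Kappa for two sets of integer ratings. It is not the authors' code. The function name, the default rating range of 0 to 3, and the example ratings are assumptions made for illustration only.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating=0, max_rating=3):
    """Cohen's kappa with quadratic weights for two equal-length rating lists.

    The default range 0..3 is an assumption; pass the range of the rubric
    element actually being scored.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("rating lists must be non-empty and of equal length")
    n_cat = max_rating - min_rating + 1
    if n_cat < 2:
        raise ValueError("at least two rating categories are required")
    for r in list(rater_a) + list(rater_b):
        if r < min_rating or r > max_rating:
            raise ValueError("rating outside the declared range: %r" % (r,))

    n = len(rater_a)
    # Observed confusion matrix: rows are rater A, columns are rater B.
    observed = [[0] * n_cat for _ in range(n_cat)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_cat)) for j in range(n_cat)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            # Quadratic penalty grows with the squared distance between scores.
            weight = ((i - j) ** 2) / ((n_cat - 1) ** 2)
            expected = hist_a[i] * hist_b[j] / n  # chance-agreement count
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters used one identical category for every item.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Made-up ratings for illustration only; these are not values from the dataset.
human_scores = [0, 1, 2, 3, 2, 1]
model_scores = [0, 1, 2, 2, 2, 0]
kappa = quadratic_weighted_kappa(human_scores, model_scores)
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement.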

File Format

.csv, .txt

Viewing Instructions

There are two CSV files. AutomatedNarrativeAnalysisMISLData.csv contains de-identified MISL scores for 414 participants in response to the Aliens story from the Test of Narrative Language, along with their de-identified transcripts and full Coh-Metrix measures. ExpertScores.csv contains the MISL double-scores for a randomly selected set of 50 narrative transcripts, which were produced by an expert doctoral student.

Lillywhite Endowment


Utah State University


Data were collected as part of the TNL norming database, part of a national norming sample. Audio samples were digitally recorded and transcribed in Systematic Analysis of Language Transcripts (SALT) software by trained research assistants who were blinded to the purposes of the study. Transcripts were cleaned in R to remove unwanted characters. MISL data were cleaned in Excel and processed using R and Python. Expert scores were obtained by randomly selecting 50 narrative transcripts and double-scoring them on the MISL. Expert scores were produced by an expert doctoral student with more than three years of scoring experience.

Referenced by

Jones, S., Fox, C., Gillam, S., & Gillam, R. B. (2019). An exploration of automated narrative analysis via machine learning. PLOS ONE, 14(10), e0224634. https://doi.org/10.1371/journal.pone.0224634

Code Lists

ID = de-identified assigned ID number

vecOfNarratives = narrative transcript

Char = character score

Sett = setting score

E = initiating event score

Plan = plan score

Act = action score

Con = consequence score

ENP = elaborated noun phrase score


See README for additional information.
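The code list above can be used to read the score columns programmatically. The sketch below assumes that the CSV header row uses exactly the names in the code list (ID, vecOfNarratives, Char, Sett, E, Plan, Act, Con, ENP) and that every score cell holds a whole number; neither assumption has been checked against the file, so consult the README first. The function name load_misl_rows is hypothetical.

```python
import csv

# Score columns as named in the code list; assumed to match the CSV header.
SCORE_COLUMNS = ["Char", "Sett", "E", "Plan", "Act", "Con", "ENP"]


def load_misl_rows(csv_file):
    """Read an open CSV file object and return one dict per participant.

    Only the ID, transcript, and score columns are kept; any other
    columns in the file are ignored.
    """
    rows = []
    for record in csv.DictReader(csv_file):
        row = {
            "ID": record["ID"],
            "vecOfNarratives": record["vecOfNarratives"],
        }
        for column in SCORE_COLUMNS:
            # Assumes integer-valued cells; a blank or non-numeric cell raises ValueError.
            row[column] = int(record[column])
        rows.append(row)
    return rows
```

To use it, open AutomatedNarrativeAnalysisMISLData.csv with newline="" (as the csv module recommends) and pass the file object to load_misl_rows.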


Communication Sciences and Disorders


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.

Additional Files

README.txt (6 kB)
MD5: b3aead1b9c2252a76a69f5a838ddb284

AutomatedNarrativeAnalysisMISLData.csv (606 kB)
MD5: a41039f0a14b062c6e4317cba3caf11a

ExpertScores.csv (22 kB)
MD5: 8c3457a382a0e1c6bb7517da533fe647
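
The MD5 values listed above can be used to confirm that downloaded files are intact. The following is a minimal sketch using Python's standard hashlib module; the function names and the idea of checking a single download directory are assumptions made for illustration.

```python
import hashlib
import os

# File names and checksums copied from the Additional Files list.
EXPECTED_MD5 = {
    "README.txt": "b3aead1b9c2252a76a69f5a838ddb284",
    "AutomatedNarrativeAnalysisMISLData.csv": "a41039f0a14b062c6e4317cba3caf11a",
    "ExpertScores.csv": "8c3457a382a0e1c6bb7517da533fe647",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True if present with a matching checksum."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = os.path.join(directory, name)
        # A missing file is reported as a failed check rather than an error.
        results[name] = os.path.isfile(path) and md5_of_file(path) == expected
    return results
```

Calling verify_downloads on the folder that holds the three files returns a dictionary of pass/fail results; any False entry suggests the file is missing or should be downloaded again.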