Automated Narrative Analysis
Description
The accuracy of four machine learning methods in predicting narrative macrostructure scores was compared to scores obtained by human raters using a criterion-referenced progress-monitoring rubric. The methods explored included approaches that rely on hand-engineered features as well as approaches that learn directly from raw text. The predictive models were trained on a corpus of 414 narratives from a normative sample of school-aged children (5;0-9;11) who were given a standardized measure of narrative proficiency. Performance was measured using Quadratic Weighted Kappa, a metric of inter-rater reliability. The results indicated that one model, BERT, not only achieved significantly higher scoring accuracy than the other methods but was also consistent with scores obtained by human raters using a valid and reliable rubric. These findings suggest that machine learning, specifically BERT, shows promise as a way to automate the scoring of narrative macrostructure for potential use in clinical practice.
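For reference, Quadratic Weighted Kappa can be computed directly with scikit-learn's cohen_kappa_score. The following minimal Python sketch uses hypothetical rater scores for illustration only, not values from this dataset:

# Minimal sketch: Quadratic Weighted Kappa between two sets of ordinal
# scores. The score lists below are hypothetical placeholders.
from sklearn.metrics import cohen_kappa_score

human_scores = [0, 1, 2, 3, 2, 1, 3, 0]   # e.g., rubric scores from a human rater
model_scores = [0, 1, 2, 2, 2, 1, 3, 1]   # e.g., scores predicted by a model

qwk = cohen_kappa_score(human_scores, model_scores, weights="quadratic")
print(f"Quadratic Weighted Kappa: {qwk:.3f}")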
Author ORCID Identifier
Ronald B Gillam https://orcid.org/0000-0002-6077-6885
OCLC
1143695399
Document Type
Dataset
DCMI Type
Dataset
File Format
.csv, .txt
Viewing Instructions
There are two CSV files. AutomatedNarrativeAnalysisMISLData.csv contains de-identified MISL scores for 414 participants in response to the Aliens story from the Test of Narrative Language, along with each participant's de-identified transcript and full set of Coh-Metrix measures. ExpertScores.csv contains MISL double-scores for a randomly selected set of 50 narrative transcripts, produced by an expert doctoral student.
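A minimal Python sketch for loading the two files with pandas, assuming they are in the working directory:

# Minimal sketch: load the two data files named in this record.
import pandas as pd

misl = pd.read_csv("AutomatedNarrativeAnalysisMISLData.csv")
expert = pd.read_csv("ExpertScores.csv")

print(misl.shape)    # expected: 414 rows, one per participant
print(expert.shape)  # expected: 50 double-scored transcripts
print(misl.columns.tolist())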
Publication Date
6-6-2019
Funder
Lillywhite Endowment
Publisher
Utah State University
Methodology
Data were collected as part of the TNL norming database, part of a national norming sample. Audio collected during sampling was digitally recorded and transcribed in Systematic Analysis of Language Transcripts (SALT) software by trained research assistants who were blinded to the purposes of the study. Transcripts were cleaned in R to remove unwanted characters. MISL data were cleaned in Excel and processed using R and Python. Expert scores were obtained by randomly selecting 50 narrative transcripts and double-scoring them on the MISL. These scores were produced by an expert doctoral student with more than three years of scoring experience.
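The cleaning pass is not specified beyond the removal of unwanted characters; the following Python sketch illustrates that kind of step (the study itself used R), with an assumed, illustrative set of characters to keep:

# Minimal sketch of a character-cleaning pass over a transcript string.
# The exact characters removed are not documented here, so the pattern
# below (keep letters, digits, whitespace, and basic sentence
# punctuation) is illustrative only.
import re

def clean_transcript(text: str) -> str:
    return re.sub(r"[^A-Za-z0-9\s.,!?']", "", text)

print(clean_transcript("The alien* said: {hello}!"))  # -> "The alien said hello!"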
Referenced by
Jones, S., Fox, C., Gillam, S., & Gillam, R. B. (2019). An exploration of automated narrative analysis via machine learning. PLOS ONE, 14(10), e0224634. https://doi.org/10.1371/journal.pone.0224634
Start Date
1-2003
End Date
11-2003
Language
eng
Code Lists
ID = de-identified assigned ID number
vecOfNarratives = narrative transcript
Char = character score
Sett = setting score
E = initiating event score
Plan = plan score
Act = action score
Con = consequence score
ENP = elaborated noun phrase score
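A minimal Python sketch applying the code list above as a column-rename map in pandas; the descriptive names on the right are illustrative, not part of the dataset:

# Minimal sketch: map the dataset's column codes (from the code list
# above) to descriptive names.
import pandas as pd

code_list = {
    "ID": "participant_id",
    "vecOfNarratives": "transcript",
    "Char": "character",
    "Sett": "setting",
    "E": "initiating_event",
    "Plan": "plan",
    "Act": "action",
    "Con": "consequence",
    "ENP": "elaborated_noun_phrase",
}

misl = pd.read_csv("AutomatedNarrativeAnalysisMISLData.csv").rename(columns=code_list)
print(misl.columns.tolist())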
Disciplines
Communication Sciences and Disorders
License
This work is licensed under a Creative Commons Attribution 4.0 License.
Recommended Citation
Gillam, R., Jones, S. K., & Fox, C. (2019). Automated Narrative Analysis. Utah State University. https://doi.org/10.26078/CATV-BM75
Additional Files
README.txt (6 kB)
MD5: b3aead1b9c2252a76a69f5a838ddb284
AutomatedNarrativeAnalysisMISLData.csv (606 kB)
MD5: a41039f0a14b062c6e4317cba3caf11a
ExpertScores.csv (22 kB)
MD5: 8c3457a382a0e1c6bb7517da533fe647
Comments
See README for additional information.