Exact Approaches for Bias Detection and Avoidance with Small, Sparse, or Correlated Categorical Data
Date of Award:
12-2017
Document Type:
Dissertation
Degree Name:
Doctor of Philosophy (PhD)
Department:
Mathematics and Statistics
Committee Chair(s)
Chris Corcoran
Committee
Chris Corcoran
Committee
D. Richard Cutler
Committee
Daniel Coster
Committee
Kady Schneiter
Committee
Thomas Ledermann
Abstract
Every day, traditional statistical methodologies are used worldwide to study a variety of topics and provide insight into countless subjects. Each technique is based on a distinct set of assumptions that must hold to ensure valid results. Additionally, many statistical approaches rely on large-sample behavior and may collapse or degenerate in the presence of small, sparse, or correlated data. This dissertation details several advancements to detect these conditions, avoid their consequences, and analyze data in a different way to yield trustworthy results.
One of the most commonly used modeling techniques for outcomes with only two possible categorical values (e.g., live/die, pass/fail, better/worse, etc.) is logistic regression. While some potential complications with this approach are widely known, many investigators are unaware that their particular data do not meet the foundational assumptions, since those assumptions are not easy to verify. We have developed a routine for determining whether a researcher should be concerned about potential bias in logistic regression results, so that they can take steps to mitigate the bias or use a different procedure altogether to model the data.
Correlated data may arise from common situations such as multi-site medical studies, research on family units, or investigations of student achievement within classrooms. In these circumstances, the associations between cluster members must be included in any statistical analysis testing the hypothesis of a connection between two variables in order for the results to be valid.
Previously, investigators had to choose between a method intended for small or sparse data that assumes independence between observations and a method that allows for correlation between observations but requires large samples to be reliable. We present a new method that allows small, clustered samples to be assessed for a relationship between a two-level predictor (e.g., treatment/control) and a categorical outcome (e.g., low/medium/high).
Recommended Citation
Schwartz, Sarah E., "Exact Approaches for Bias Detection and Avoidance with Small, Sparse, or Correlated Categorical Data" (2017). All Graduate Theses and Dissertations, Spring 1920 to Summer 2023. 6888.
https://digitalcommons.usu.edu/etd/6888
Copyright for this work is retained by the student.