All Graduate Theses and Dissertations, Spring 1920 to Summer 2023

Exact Approaches for Bias Detection and Avoidance with Small, Sparse, or Correlated Categorical Data

Sarah E. Schwartz, Utah State University

Date of Award:

12-2017

Document Type:

Dissertation

Degree Name:

Doctor of Philosophy (PhD)

Department:

Mathematics and Statistics

Committee Chair(s)

Chris Corcoran

Committee

Chris Corcoran

Committee

D. Richard Cutler

Committee

Daniel Coster

Committee

Kady Schneiter

Committee

Thomas Ledermann

Abstract

Every day, traditional statistical methodologies are used worldwide to study a variety of topics and provide insight into countless subjects. Each technique is based on a distinct set of assumptions that must hold for its results to be valid. Additionally, many statistical approaches rely on large-sample behavior and may collapse or degenerate in the presence of small, sparse, or correlated data. This dissertation details several advancements to detect these conditions, avoid their consequences, and analyze data in a different way to yield trustworthy results.

One of the most commonly used modeling techniques for outcomes with only two possible categorical values (e.g., live/die, pass/fail, better/worse, etc.) is logistic regression. While some potential complications with this approach are widely known, many investigators are unaware that their particular data do not meet the foundational assumptions, because those assumptions are not easy to verify. We have developed a routine for determining whether a researcher should be concerned about potential bias in logistic regression results, so they can take steps to mitigate the bias or use a different procedure altogether to model the data.

Correlated data may arise from common situations such as multi-site medical studies, research on family units, or investigations of student achievement within classrooms. In these circumstances, the associations between cluster members must be included in any statistical analysis testing the hypothesis of a connection between two variables in order for the results to be valid.

Previously, investigators had to choose between a method intended for small or sparse data that assumes independence between observations and a method that allows for correlation between observations but requires large samples to be reliable. We present a new method that allows small, clustered samples to be assessed for a relationship between a two-level predictor (e.g., treatment/control) and a categorical outcome (e.g., low/medium/high).

Checksum

0a9ca2dbfbb9ff527cd6b12ce29ace54

Recommended Citation

Schwartz, Sarah E., "Exact Approaches for Bias Detection and Avoidance with Small, Sparse, or Correlated Categorical Data" (2017). All Graduate Theses and Dissertations, Spring 1920 to Summer 2023. 6888.
https://digitalcommons.usu.edu/etd/6888

Included in

Mathematics Commons, Statistics and Probability Commons

Copyright for this work is retained by the student.

DOI

https://doi.org/10.26076/39be-91c3
