Date of Award: 8-2025

Document Type: Thesis

Degree Name: Master of Science (MS)

Department: Mathematics and Statistics

Committee Chair(s): Alan Wisler

Committee: Alan Wisler, Brennan Bean, Kevin Moon

Abstract

Classification tasks are fundamental in statistical machine learning. The usual goal is to build or select a model that classifies data with as few errors as possible. For a particular dataset, however, the minimum achievable error is seldom zero, because overlap between the class distributions makes some errors unavoidable. As a result, machine learning practitioners and data scientists often cannot tell whether classification errors can be reduced through further refinement. A potential solution lies in the Bayes error rate (BER), the lowest error rate achievable for a given set of features. If known, the BER would let data scientists gauge the performance of specific models relative to the limitations of the data, and thus make better-informed decisions about how much time to spend iterating on existing solutions. In general, the exact class distributions are unknown, so the BER cannot be determined exactly. Instead, research focuses on estimating or bounding the BER as closely as possible given the data. Many methods for bounding the BER have been proposed. This thesis discusses several of these methods and aims to characterize how well they perform in different scenarios where the BER is known. In particular, we seek to quantify how often the true BER actually falls between the lower and upper bounds produced by the different methods in the literature. This characteristic has been neglected in the prior literature, and measuring it would help establish the degree to which these bounds can be trusted as a reliable tool for classification problems.
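As a rough illustration of what it means for a pair of bounds to contain the BER, the sketch below works through a toy case where the BER is known in closed form: two univariate Gaussian classes with equal priors and a shared standard deviation. The Bhattacharyya bounds used here are a standard analytic choice and are an assumption of this example; the abstract does not say which bounding methods the thesis evaluates, and the thesis works with bounds estimated from data rather than from known distributions. All function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_error_equal_var(mu0, mu1, sigma):
    """Exact BER for two Gaussian classes N(mu0, sigma^2) and N(mu1, sigma^2).

    Assumes equal class priors (0.5 each), so the optimal decision
    threshold is the midpoint between the two means.
    """
    return normal_cdf(-abs(mu1 - mu0) / (2.0 * sigma))


def bhattacharyya_bounds(mu0, mu1, sigma):
    """Analytic Bhattacharyya lower and upper bounds on the BER.

    Same assumptions as above: equal priors and a shared standard deviation.
    """
    # Bhattacharyya coefficient between the two class densities.
    bc = math.exp(-((mu1 - mu0) ** 2) / (8.0 * sigma ** 2))
    upper = 0.5 * bc
    lower = 0.5 * (1.0 - math.sqrt(1.0 - bc ** 2))
    return lower, upper


# Illustrative parameters only; they are not taken from the thesis.
ber = bayes_error_equal_var(0.0, 2.0, 1.0)
lower, upper = bhattacharyya_bounds(0.0, 2.0, 1.0)
print(f"lower={lower:.4f}  BER={ber:.4f}  upper={upper:.4f}")
print("BER inside bounds:", lower <= ber <= upper)
```

With known distributions the analytic bounds always contain the BER. The question raised in the abstract concerns bounds estimated from finite samples, where containment is not guaranteed and has to be checked empirically.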

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Recommended Citation

May, Riley, "Empirical Evaluation of Bayes Error Rate Bounds in Binary Classification" (2025). All Graduate Theses and Dissertations, Fall 2023 to Present. 522.
https://digitalcommons.usu.edu/etd2023/522

Included in

Statistics and Probability Commons

Copyright for this work is retained by the student.

DOI

https://doi.org/10.26076/a2ee-2cdf

All Graduate Theses and Dissertations, Fall 2023 to Present

Empirical Evaluation of Bayes Error Rate Bounds in Binary Classification
