23/09/2025

What label should we assign? Classification under uncertainty

Satisfaction survey

In supervised machine learning, classification is a central task: a classifier assigns each item a label based on a probability distribution over the possible labels. The label is often chosen with the MAP criterion, but this can be inadequate when categories are ordered, as in satisfaction scales. This article presents Ord-MAP, an alternative that is optimal for ordered labels and opens the door to better practices in ordinal classification.

Classification is one of the central tasks in supervised machine learning, a branch of artificial intelligence that builds, from prior data, systems (classifiers) capable of assigning labels to new items. Classifiers do not assign labels deterministically but rather operate under uncertainty: for each new item, they output a probability distribution over the possible labels.

When labels are binary (yes/no, good/bad, …), it is common to assign the label with the highest probability, that is, the mode of the distribution. This criterion, known as MAP (Maximum A Posteriori), is not only intuitive but also optimal when all classification errors incur the same loss: in that case it minimizes the expected loss of the classification.
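The optimality claim can be checked in one line. As a sketch, write $p_j$ for the probability the classifier gives to label $j$ and $\hat{y}$ for the assigned label (this notation is introduced here for illustration and is not taken from the article). If any error costs 1 and a correct assignment costs 0, the expected loss of assigning $\hat{y}$ is:

```latex
\mathbb{E}\big[L(\hat{y})\big] \;=\; \sum_{j \neq \hat{y}} p_j \;=\; 1 - p_{\hat{y}},
\qquad\text{so}\qquad
\arg\min_{\hat{y}} \mathbb{E}\big[L(\hat{y})\big] \;=\; \arg\max_{\hat{y}} \, p_{\hat{y}} .
```

Minimizing the expected loss is therefore the same as choosing the most probable label, which is exactly the MAP rule.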

The MAP criterion is also widely used in nominal multiclass classification (more than two unordered labels), under the same assumption of symmetrical loss for classification errors. However, in many real-world problems, the labels have an intrinsic order, such as in satisfaction scales (very dissatisfied – dissatisfied – neutral – satisfied – very satisfied), known as Likert scales. In these cases, not all errors are equally severe: classifying a customer as “dissatisfied” instead of “very dissatisfied” is less serious than classifying them as “very satisfied.”

When labels are ordered, a more coherent alternative to penalizing all errors equally is to take into account the distance between labels. A recent article (Delgado, 2025) proposes a new decision rule for ordinal classification, Ord-MAP, which assigns the median of the probability distribution, that is, the first label for which the cumulative probability exceeds 0.5.

For example, if a classifier returns the following probabilities: 0.35, 0.05, 0.05, 0.30, and 0.25, corresponding to the five satisfaction categories, the MAP criterion would assign the label “very dissatisfied” (the highest probability), whereas Ord-MAP would assign “satisfied”, as it is the first category for which the cumulative probability exceeds 0.5.
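The two rules in this example can be sketched in a few lines of Python. This is a minimal illustration, not the article's implementation: the function names `map_label` and `ord_map_label` and the `LABELS` list are hypothetical, and the strict "exceeds 0.5" comparison simply follows the wording above.

```python
from itertools import accumulate

# Ordered satisfaction categories, from lowest to highest (illustrative names).
LABELS = [
    "very dissatisfied",
    "dissatisfied",
    "neutral",
    "satisfied",
    "very satisfied",
]


def map_label(probs):
    """MAP: index of the most probable label (the first one if there is a tie)."""
    return max(range(len(probs)), key=lambda k: probs[k])


def ord_map_label(probs, threshold=0.5):
    """Ord-MAP as described in the text: index of the first label whose
    cumulative probability exceeds the threshold (the median for 0.5)."""
    for k, cumulative in enumerate(accumulate(probs)):
        if cumulative > threshold:
            return k
    # Only reached if the probabilities do not sum past the threshold,
    # i.e. the input is not a valid distribution; fall back to the last label.
    return len(probs) - 1


probs = [0.35, 0.05, 0.05, 0.30, 0.25]
print("MAP:    ", LABELS[map_label(probs)])      # very dissatisfied
print("Ord-MAP:", LABELS[ord_map_label(probs)])  # satisfied
```

The cumulative probabilities here are 0.35, 0.40, 0.45, 0.75 and 1.00, so the threshold is first crossed at the fourth category.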

This surprisingly simple criterion is mathematically proven in the article to be optimal, in the sense that it minimizes the expected loss when loss is defined as the distance between the true and assigned labels. Experiments with various classifiers and real datasets, as well as simulations, clearly show the superiority of the Ord-MAP criterion over the commonly used MAP.
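For the example distribution, the optimality can be checked numerically. The sketch below assumes that the distance between labels is the absolute difference between their positions on the scale (0 to 4); this is one natural reading of "distance", and the helper name `expected_distance_loss` is hypothetical.

```python
def expected_distance_loss(probs):
    """For each candidate label k, return the expected loss
    sum_j probs[j] * |j - k|, where j ranges over the possible true labels."""
    n = len(probs)
    return [sum(probs[j] * abs(j - k) for j in range(n)) for k in range(n)]


probs = [0.35, 0.05, 0.05, 0.30, 0.25]
losses = expected_distance_loss(probs)

# Candidate label with the smallest expected distance to the true label.
best = min(range(len(probs)), key=lambda k: losses[k])

for k, loss in enumerate(losses):
    print(f"label {k}: expected loss {loss:.2f}")
print("best label index:", best)
```

Under this assumption the expected losses are about 2.05, 1.75, 1.55, 1.45 and 1.95, so the minimum falls on index 3 ("satisfied"), the label chosen by Ord-MAP, while the MAP choice (index 0) has the largest expected loss of the five.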

This contribution opens the door to better practices in ordinal classification, which is widespread in real-world applications such as recommendation systems, surveys, and service evaluation.

Rosario Delgado

Department of Mathematics

Universitat Autònoma de Barcelona

References

Delgado, R. (2025). Ord-MAP criterion: Extending MAP for ordinal classification. Knowledge-Based Systems, 324, 113837. https://doi.org/10.1016/j.knosys.2025.113837

 