Computing the mode with SQL

The mode of a bunch of numbers is the most frequent number in the bunch. A bunch of numbers can have more than one most-frequent number.

Given a table tbl with columns category and value, this SQL returns the mode for each category.


WITH freq AS (
    SELECT COUNT(*) freq,
      FROM tbl
     GROUP BY value, category
most AS (
     SELECT MAX(freq) freq,
      FROM freq
     GROUP BY category
mode AS (
      SELECT GROUP_CONCAT(value ORDER BY value DESC) modes,
        FROM freq
        JOIN most ON freq.freq = most.freq
       GROUP BY freq.category


This works first by finding the frequencies of each value in each category, using the freq CTE. It then extracts the largest frequency value for each category, using the most CTE. FInally, it joins the freq and most virtual tables together to extract the most frequent values from freq.

It uses GROUP_CONCAT() because it’s possible for there to be more than one mode  — more than one most-frequent value — in a set of data.

Leave a Comment