SQL Practice Platform
Article

CUME_DIST() in SQL – Cumulative Distribution Explained (From Noob to Pro)

CUME_DIST() in SQL – Cumulative Distribution Explained (From Noob to Pro)

Hey, I’m Gaurav.

When I first saw `CUME_DIST()` in an SQL interview, I had no clue what it meant.

The interviewer asked:  
"Find the cumulative distribution of employee salaries in each department."

My immediate noob reaction was:  
“What is cumulative distribution? Why not just use RANK() or PERCENT_RANK()?”

I was confused between `RANK()`, `DENSE_RANK()`, `PERCENT_RANK()`, and `CUME_DIST()`.

Today, I’ll explain CUME_DIST() from the very basics — how I finally understood it, how it differs from other ranking functions, and when to use it.

You can practice all examples live on [sqlpractice.in](https://sqlpractice.in).

---

What is CUME_DIST()?

CUME_DIST() stands for Cumulative Distribution.

It tells you:  
> “What percentage of rows have a value less than or equal to the current row’s value?”

It returns a value between 0 and 1 (or 0% to 100%).

Simple Formula:
```sql
CUME_DIST() = (Number of rows with value <= current row) / (Total rows in the partition)
```

---

Example 1: Basic CUME_DIST on Salary

Let’s say we have employees with salaries:

| Name     | Salary |
|----------|--------|
| Alice    | 50000  |
| Bob      | 60000  |
| Charlie  | 60000  |
| David    | 70000  |
| Emma     | 80000  |

Query:
```sql
SELECT 
    name,
    salary,
    CUME_DIST() OVER (ORDER BY salary) AS cum_dist
FROM employees;
```

Result:

| Name     | Salary | cum_dist   |
|----------|--------|------------|
| Alice    | 50000  | 0.2 (20%)  |
| Bob      | 60000  | 0.6 (60%)  |
| Charlie  | 60000  | 0.6 (60%)  |
| David    | 70000  | 0.8 (80%)  |
| Emma     | 80000  | 1.0 (100%) |

Explanation:
- Alice: 1 out of 5 rows ≤ 50000 → 1/5 = 0.2
- Bob & Charlie: 3 out of 5 rows ≤ 60000 → 3/5 = 0.6
- Emma: All 5 rows ≤ 80000 → 5/5 = 1.0

Practice Question 1 (Easy)  
Calculate cumulative distribution of salaries.  
[Try it live →](https://sqlpractice.in/?question=cume-dist-basic)

---

Example 2: CUME_DIST with PARTITION BY (Most Useful)

Question: Show cumulative distribution of order amounts within each customer.

```sql
SELECT 
    u.name AS customer,
    o.total_price,
    CUME_DIST() OVER (
        PARTITION BY u.id 
        ORDER BY o.total_price
    ) AS cum_dist
FROM orders o
JOIN users u ON o.user_id = u.id;
```

This shows: “What percentage of this customer’s orders have amount ≤ current order?”

Practice Question 2 (Medium)  
Find cumulative distribution of spending per customer.  
[Try it live →](https://sqlpractice.in/?question=cume-dist-per-customer)

---

How CUME_DIST() Differs from Other Ranking Functions

| Function          | Returns          | Handles Ties          | Use Case                              |
|-------------------|------------------|-----------------------|---------------------------------------|
| `ROW_NUMBER()`    | 1, 2, 3...       | Unique numbers        | Pagination, unique sequencing         |
| `RANK()`          | 1, 1, 3...       | Gaps on ties          | Competition ranking                   |
| `DENSE_RANK()`    | 1, 1, 2...       | No gaps               | Continuous ranking                    |
| `PERCENT_RANK()`  | 0 to 1           | Based on rank         | Relative position (0% to 100%)        |
| `CUME_DIST()` | 0 to 1       | Based on values   | Cumulative percentage             |

Key Difference:
- `PERCENT_RANK()` is based on rank
- `CUME_DIST()` is based on actual values (how many rows are ≤ current value)

---

Real Interview Question Example

Question: Find the salary percentile (cumulative distribution) of each employee within their department.

Correct Solution:
```sql
SELECT 
    name,
    department,
    salary,
    ROUND(CUME_DIST() OVER (PARTITION BY department ORDER BY salary) 100, 2) AS salary_percentile
FROM employees;
```

Practice Question 3 (Hard)  
Show salary percentile for each employee in their department.  
[Try it live →](https://sqlpractice.in/?question=cume-dist-percentile)

---

How to Think About CUME_DIST() (My Mental Model)

When you see a question involving "cumulative", "percentage of", or "what % of rows are below this value", think:

1. Do I need a percentage (0 to 1)?
2. Should it reset for every group? → Use `PARTITION BY`
3. In what order? → Use `ORDER BY`
4. Use `CUME_DIST()` → It directly gives cumulative distribution

Pro Tip: Multiply by 100 if you want percentage:
```sql
CUME_DIST() 100 AS percentile
```

---

Final Thoughts

`CUME_DIST()` was one of the trickier window functions for me because it sounds very mathematical.  
But once I understood it as:

> “Out of all the rows in this group, what percentage have a value less than or equal to the current row?”

…it became much simpler.

It’s especially useful in:
- Percentile analysis
- Distribution reports
- Finding “top X%” customers/employees

Go practice `CUME_DIST()` questions on [sqlpractice.in](https://sqlpractice.in) — try both basic and partitioned versions.

You’ll start seeing when to use `CUME_DIST()` vs `RANK()` vs `PERCENT_RANK()`.

Keep practicing — you’re getting really good at window functions! 💪

Building sqlpractice.in to help everyone master SQL for interviews — completely free.