Article
•
CUME_DIST() in SQL – Cumulative Distribution Explained (From Noob to Pro)
CUME_DIST() in SQL – Cumulative Distribution Explained (From Noob to Pro)
Hey, I’m Gaurav.
When I first saw `CUME_DIST()` in an SQL interview, I had no clue what it meant.
The interviewer asked:
"Find the cumulative distribution of employee salaries in each department."
My immediate noob reaction was:
“What is cumulative distribution? Why not just use RANK() or PERCENT_RANK()?”
I was confused between `RANK()`, `DENSE_RANK()`, `PERCENT_RANK()`, and `CUME_DIST()`.
Today, I’ll explain CUME_DIST() from the very basics — how I finally understood it, how it differs from other ranking functions, and when to use it.
You can practice all examples live on [sqlpractice.in](https://sqlpractice.in).
---
What is CUME_DIST()?
CUME_DIST() stands for Cumulative Distribution.
It tells you:
> “What percentage of rows have a value less than or equal to the current row’s value?”
It returns a value between 0 and 1 (or 0% to 100%).
Simple Formula:
```sql
CUME_DIST() = (Number of rows with value <= current row) / (Total rows in the partition)
```
---
Example 1: Basic CUME_DIST on Salary
Let’s say we have employees with salaries:
| Name | Salary |
|----------|--------|
| Alice | 50000 |
| Bob | 60000 |
| Charlie | 60000 |
| David | 70000 |
| Emma | 80000 |
Query:
```sql
SELECT
name,
salary,
CUME_DIST() OVER (ORDER BY salary) AS cum_dist
FROM employees;
```
Result:
| Name | Salary | cum_dist |
|----------|--------|------------|
| Alice | 50000 | 0.2 (20%) |
| Bob | 60000 | 0.6 (60%) |
| Charlie | 60000 | 0.6 (60%) |
| David | 70000 | 0.8 (80%) |
| Emma | 80000 | 1.0 (100%) |
Explanation:
- Alice: 1 out of 5 rows ≤ 50000 → 1/5 = 0.2
- Bob & Charlie: 3 out of 5 rows ≤ 60000 → 3/5 = 0.6
- Emma: All 5 rows ≤ 80000 → 5/5 = 1.0
Practice Question 1 (Easy)
Calculate cumulative distribution of salaries.
[Try it live →](https://sqlpractice.in/?question=cume-dist-basic)
---
Example 2: CUME_DIST with PARTITION BY (Most Useful)
Question: Show cumulative distribution of order amounts within each customer.
```sql
SELECT
u.name AS customer,
o.total_price,
CUME_DIST() OVER (
PARTITION BY u.id
ORDER BY o.total_price
) AS cum_dist
FROM orders o
JOIN users u ON o.user_id = u.id;
```
This shows: “What percentage of this customer’s orders have amount ≤ current order?”
Practice Question 2 (Medium)
Find cumulative distribution of spending per customer.
[Try it live →](https://sqlpractice.in/?question=cume-dist-per-customer)
---
How CUME_DIST() Differs from Other Ranking Functions
| Function | Returns | Handles Ties | Use Case |
|-------------------|------------------|-----------------------|---------------------------------------|
| `ROW_NUMBER()` | 1, 2, 3... | Unique numbers | Pagination, unique sequencing |
| `RANK()` | 1, 1, 3... | Gaps on ties | Competition ranking |
| `DENSE_RANK()` | 1, 1, 2... | No gaps | Continuous ranking |
| `PERCENT_RANK()` | 0 to 1 | Based on rank | Relative position (0% to 100%) |
| `CUME_DIST()` | 0 to 1 | Based on values | Cumulative percentage |
Key Difference:
- `PERCENT_RANK()` is based on rank
- `CUME_DIST()` is based on actual values (how many rows are ≤ current value)
---
Real Interview Question Example
Question: Find the salary percentile (cumulative distribution) of each employee within their department.
Correct Solution:
```sql
SELECT
name,
department,
salary,
ROUND(CUME_DIST() OVER (PARTITION BY department ORDER BY salary) 100, 2) AS salary_percentile
FROM employees;
```
Practice Question 3 (Hard)
Show salary percentile for each employee in their department.
[Try it live →](https://sqlpractice.in/?question=cume-dist-percentile)
---
How to Think About CUME_DIST() (My Mental Model)
When you see a question involving "cumulative", "percentage of", or "what % of rows are below this value", think:
1. Do I need a percentage (0 to 1)?
2. Should it reset for every group? → Use `PARTITION BY`
3. In what order? → Use `ORDER BY`
4. Use `CUME_DIST()` → It directly gives cumulative distribution
Pro Tip: Multiply by 100 if you want percentage:
```sql
CUME_DIST() 100 AS percentile
```
---
Final Thoughts
`CUME_DIST()` was one of the trickier window functions for me because it sounds very mathematical.
But once I understood it as:
> “Out of all the rows in this group, what percentage have a value less than or equal to the current row?”
…it became much simpler.
It’s especially useful in:
- Percentile analysis
- Distribution reports
- Finding “top X%” customers/employees
Go practice `CUME_DIST()` questions on [sqlpractice.in](https://sqlpractice.in) — try both basic and partitioned versions.
You’ll start seeing when to use `CUME_DIST()` vs `RANK()` vs `PERCENT_RANK()`.
Keep practicing — you’re getting really good at window functions! 💪
Building sqlpractice.in to help everyone master SQL for interviews — completely free.