MySQL: selecting top N per group

+--------+---------------------+------+-----+-------------------+-------+
| Field  | Type                | Null | Key | Default           | Extra |
+--------+---------------------+------+-----+-------------------+-------+
| id     | bigint(20) unsigned | NO   | PRI | NULL              |       |
| input  | tinyint(3) unsigned | NO   | PRI | NULL              |       |
| score  | float unsigned      | NO   |     | NULL              |       |
| date   | timestamp           | NO   |     | CURRENT_TIMESTAMP |       |
| locale | tinyint(3) unsigned | NO   | PRI | NULL              |       |
+--------+---------------------+------+-----+-------------------+-------+

select *
from (
    select e.locale, e.id, e.score,
           find_in_set(e.score, x.scoreslist) as rank
    from (
        select locale, group_concat(score order by score) as scoreslist
        from (
            select locale, min(score) as score
            from scores
            where id not in (select id from blacklist)
            group by locale, id
        ) as k
        group by locale
    ) as x, scores as e
    where e.locale = x.locale
) as z
where rank <= 5 and rank > 0
order by locale, rank

+--------+---------------------------------+
| locale | scoreslist                      |
+--------+---------------------------------+
|      1 | 1.75,2.129,2.85,6.34,9,10,11,12 |
|      2 | 2.185,4.12,8.32                 |
|      3 | 2.4                             |
+--------+---------------------------------+
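
To make the stages of the query above easier to follow, here is a minimal Python sketch that emulates them outside the database: best score per (locale, id), a sorted score list per locale (the scoreslist column shown above), and a rank taken as the 1-based position of a row's score in that list, with 0 meaning "not found", as find_in_set does. It is an illustration only, not the code behind this post. The function name top_n_per_locale and the player ids are invented for the example; only the scores come from the table above.

```python
from collections import defaultdict

# Hypothetical sample rows as (locale, id, score). The ids are made up;
# the scores reproduce the per-locale lists shown in the table above.
rows = [
    (1, 101, 1.75), (1, 102, 2.129), (1, 103, 2.85), (1, 104, 6.34),
    (1, 105, 9.0), (1, 106, 10.0), (1, 107, 11.0), (1, 108, 12.0),
    (1, 101, 3.1),  # a slower second attempt by the same (invented) player
    (2, 201, 2.185), (2, 202, 4.12), (2, 203, 8.32),
    (3, 301, 2.4),
]


def top_n_per_locale(rows, blacklist, n=5):
    """Emulate the three nested steps of the SQL query in plain Python."""
    # Step 1 (subquery k): best (minimum) score per (locale, id),
    # skipping blacklisted ids.
    best = {}
    for locale, pid, score in rows:
        if pid in blacklist:
            continue
        key = (locale, pid)
        if key not in best or score < best[key]:
            best[key] = score

    # Step 2 (subquery x): one ascending score list per locale,
    # the counterpart of group_concat(score order by score).
    lists = defaultdict(list)
    for (locale, _pid), score in best.items():
        lists[locale].append(score)
    for scores in lists.values():
        scores.sort()

    # Step 3 (outer join plus find_in_set): rank is the 1-based position of
    # the row's score in its locale's list, or 0 when the score is absent.
    # Like the SQL, this step does not re-check the blacklist.
    result = []
    for locale, pid, score in rows:
        scores = lists.get(locale, [])
        rank = scores.index(score) + 1 if score in scores else 0
        if 0 < rank <= n:
            result.append((locale, rank, pid, score))
    return sorted(result)  # ordered by locale, then rank


for locale, rank, pid, score in top_n_per_locale(rows, blacklist=set()):
    print(locale, rank, pid, score)
```

Two things the sketch does not model: in MySQL the concatenated list is capped by group_concat_max_len (1024 bytes by default), so very large groups can be silently truncated, and on MySQL 8.0.2 and later rank is a reserved word, so the alias in the query would need backticks or a different name.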

4 responses to “MySQL: selecting top N per group”

yacov says:

July 14, 2010 at 9:55 pm

Why did you have to make such a complex query?
Wouldn’t it have been simpler to select the data ordered by score and take the top 5 scores?
- frishrash says:
  
  July 14, 2010 at 10:29 pm
  
  Well, each language has its own scoreboard. It’s unfair to compare typing times for different alphabets (different lengths and layouts), so I had to either group results by language or make a separate query for each language. I already wrote about the considerations that made me prefer one query over multiple queries. Anyhow, even if my decision was wrong, the purpose of this post was to show how to handle situations where you need one query that selects the top N per group.
  
  If you meant parsing the result list programmatically, sure, it is possible, but you don’t know how many rows you need to parse before you find the top N in each category, so your array or recordset object might get huge. Why “waste” this memory when the DB can do it in a much more efficient way?
yacov says:

July 20, 2010 at 6:46 pm

because this way you don’t need to pull your hair out 🙂
affinity says:

November 26, 2011 at 2:22 pm

What’s wrong with using the LIMIT clause?

Here’s a simple example to get the 3 most expensive and the 3 least expensive items from a table using group_concat with LIMIT:

mysql> desc SHOP_ITEM;
+-------------+--------------+------+-----+---------+----------------+
| Field       | Type         | Null | Key | Default | Extra          |
+-------------+--------------+------+-----+---------+----------------+
| ID          | int(11)      | NO   | PRI | NULL    | auto_increment |
| CODE        | varchar(45)  | NO   |     | NULL    |                |
| PIC         | varchar(100) | NO   |     | NULL    |                |
| DESCRIPTION | longtext     | NO   |     | NULL    |                |
| DETAIL      | text         | NO   |     | NULL    |                |
| RETAIL      | int(11)      | NO   |     | NULL    |                |
| HIRE        | int(11)      | NO   |     | NULL    |                |
| IN_STOCK    | int(11)      | NO   |     | NULL    |                |
| SHOP_ID     | int(11)      | NO   | PRI | NULL    |                |
+-------------+--------------+------+-----+---------+----------------+
9 rows in set (0.00 sec)

mysql> select ID, RETAIL from SHOP_ITEM;
+----+--------+
| ID | RETAIL |
+----+--------+
|  1 |   1180 |
|  2 |   1380 |
|  3 |   1120 |
|  4 |   1050 |
|  5 |   1450 |
|  6 |    950 |
|  7 |    680 |
|  8 |    540 |
|  9 |    780 |
| 10 |    900 |
| 11 |   1100 |
| 12 |   1620 |
| 13 |    960 |
| 14 |    660 |
+----+--------+
14 rows in set (0.00 sec)

mysql> select PRICE_GROUP, group_concat(RETAIL)
    -> from
    -> (
    ->   (
    ->     select 'HIGH' as PRICE_GROUP, RETAIL
    ->     from SHOP_ITEM
    ->     where ID not in (select 12 from dual)
    ->     order by RETAIL desc limit 3
    ->   )
    ->   UNION
    ->   (
    ->     select 'LOW' as PRICE_GROUP, RETAIL
    ->     from SHOP_ITEM
    ->     where ID not in (select 12 from dual)
    ->     order by RETAIL asc limit 3
    ->   )
    -> ) as q
    -> group by PRICE_GROUP
    -> order by PRICE_GROUP;
+-------------+----------------------+
| PRICE_GROUP | group_concat(RETAIL) |
+-------------+----------------------+
| HIGH        | 1450,1380,1180       |
| LOW         | 540,660,680          |
+-------------+----------------------+
2 rows in set (0.00 sec)


I selected from dual only because I don’t have a ‘blacklist’ table, but it works fine.
My example obviously uses a different table, but the concept is the same.