I think anyone who has had to analyze large tables to make reporting has already made such queries:
SELECT *, (SELECT COUNT(DISTINCT(l.like_id)) FROM like l WHERE l.news_id = n.news_id) AS nb_likes FROM news n WHERE blabla
Except imagine that you have 1 million news, MySQL will perform 1 million and 1 queries to retrieve all:
- 1 query to retrieve 1M news
- 1M queries to retrieve the number of likes by news
The solution to implement to optimize all is to join directly the subassembly with the likes and finally work in all 2:
SELECT * COUNT(DISTINCT(l.like_id)) AS nb_likes FROM ( SELECT * FROM news n WHERE blabla ) AS n JOIN like l ON n.news_id = l.news_id GROUP BY l.news_id
We get the same result in only 3 queries:
- 1 query to retrieve 1M news
- 1 query to join the subset of news with all likes
- 1 query to group and deduce the count as well as information from your news
The difference in performance is exponential, in my case with a table of 1M recordsets, I passed from 30s to 0.02s.
SELECT
n.*,
COUNT(DISTINCT(l.like_id)) AS nb_likes
FROM
news n
JOIN like l ON n.news_id = l.news_id
WHERE
blabla
GROUP BY
l.news_id;
Comment