SQL exercises on movie Database: Compute a report which shows the year when most of the Mystery movies produces, and number of such movies and their average rating
23. From the following tables, write a SQL query to find the years when most of the ‘Mystery Movies’ produced. Count the number of generic title and compute their average rating. Group the result set on movie release year, generic title. Return movie year, generic title, number of generic title and average rating.
Sample table: movie
mov_id | mov_title | mov_year | mov_time | mov_lang | mov_dt_rel | mov_rel_country
--------+----------------------------------------------------+----------+----------+-----------------+------------+-----------------
901 | Vertigo | 1958 | 128 | English | 1958-08-24 | UK
902 | The Innocents | 1961 | 100 | English | 1962-02-19 | SW
903 | Lawrence of Arabia | 1962 | 216 | English | 1962-12-11 | UK
904 | The Deer Hunter | 1978 | 183 | English | 1979-03-08 | UK
905 | Amadeus | 1984 | 160 | English | 1985-01-07 | UK
906 | Blade Runner | 1982 | 117 | English | 1982-09-09 | UK
907 | Eyes Wide Shut | 1999 | 159 | English | | UK
908 | The Usual Suspects | 1995 | 106 | English | 1995-08-25 | UK
909 | Chinatown | 1974 | 130 | English | 1974-08-09 | UK
.......
925 | Braveheart | 1995 | 178 | English | 1995-09-08 | UK
Sample table: genres
gen_id | gen_title --------+---------------------- 1001 | Action 1002 | Adventure 1003 | Animation 1004 | Biography 1005 | Comedy 1006 | Crime 1007 | Drama 1008 | Horror 1009 | Music .... 1013 | War
Sample table: movie_genres
mov_id | gen_id
--------+--------
922 | 1001
917 | 1002
903 | 1002
912 | 1003
911 | 1005
908 | 1006
913 | 1006
926 | 1007
928 | 1007
.....
904 | 1013
Sample table: rating
mov_id | rev_id | rev_stars | num_o_ratings
--------+--------+-----------+---------------
901 | 9001 | 8.40 | 263575
902 | 9002 | 7.90 | 20207
903 | 9003 | 8.30 | 202778
906 | 9005 | 8.20 | 484746
924 | 9006 | 7.30 |
908 | 9007 | 8.60 | 779489
909 | 9008 | | 227235
910 | 9009 | 3.00 | 195961
.....
923 | 9020 | 6.70 | 13091
Sample Solution:
-- Selecting mov_year, gen_title, the count of gen_title, and the average rev_stars
FROM movie
NATURAL JOIN movie_genres
NATURAL JOIN genres
NATURAL JOIN rating
-- Filtering the result to include only records where gen_title is 'Mystery'
WHERE gen_title='Mystery'
-- Grouping the result by mov_year and gen_title to perform aggregate functions on each group
GROUP BY mov_year, gen_title;
Sample Output:
mov_year | gen_title | count | avg
----------+----------------------+-------+--------------------
1958 | Mystery | 1 | 8.4000000000000000
(1 row)
Code Explanation :
The said query in SQL that retrieves the count of movies and average rating for the Mystery genre in each year.
The NATURAL JOIN keyword is used to join the tables based on their common column names. This avoids having to explicitly specify the join conditions.
The WHERE clause filters the rows which limits the results to only movies that belong to the 'Mystery' genre.
The GROUP BY clause groups the results by mov_year and gen_title, which means that the count and average rating will be calculated for each unique combination of year and genre.
Alternative Solution:
Using a Self-Join:
SELECT m1.mov_year, g1.gen_title,
COUNT(g1.gen_title), AVG(r1.rev_stars)
FROM movie m1
JOIN movie_genres mg1 ON m1.mov_id = mg1.mov_id
JOIN genres g1 ON mg1.gen_id = g1.gen_id
JOIN rating r1 ON m1.mov_id = r1.mov_id
JOIN movie m2 ON m1.mov_year = m2.mov_year
JOIN movie_genres mg2 ON m2.mov_id = mg2.mov_id
JOIN genres g2 ON mg2.gen_id = g2.gen_id
WHERE g1.gen_title = 'Mystery'
AND g2.gen_title = 'Mystery'
GROUP BY m1.mov_year, g1.gen_title;
Explanation:
This SQL query involves a self-join on the movie table (m1 and m2). It selects movies with the genre 'Mystery' and groups them by year and genre title.
Relational Algebra Expression:
Relational Algebra Tree:
Go to:
PREV : From the following tables, write a SQL query to find the highest-rated ‘Mystery Movies’. Return the title, year, and rating.
NEXT : From the following tables, write a query in SQL to generate a report, which contain the fields movie title, name of the female actor, year of the movie, role, movie genres, the director, date of release, and rating of that movie.
Practice Online
Query Visualization:
Duration:
Rows:
Cost:
Have another way to solve this solution? Contribute your code (and comments) through Disqus.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
