Finding the most popular genre
Find from the 'movies' collection in MongoDB with the most common genre among the movies.
Structure of 'movies' collection:
{ _id: ObjectId("573a1390f29313caabcd42e8"), plot: 'A group of bandits stage a brazen train hold-up, only to find a determined posse hot on their heels.', genres: [ 'Short', 'Western' ], runtime: 11, cast: [ 'A.C. Abadie', "Gilbert M. 'Broncho Billy' Anderson", 'George Barnes', 'Justus D. Barnes' ], poster: 'https://m.media-amazon.com/images/M/MV5BMTU3NjE5NzYtYTYyNS00MDVmLWIwYjgtMmYwYWIxZDYyNzU2XkEyXkFqcGdeQXVyNzQzNzQxNzI@._V1_SY1000_SX677_AL_.jpg', title: 'The Great Train Robbery', fullplot: "Among the earliest existing films in American cinema - notable as the first film that presented a narrative story to tell - it depicts a group of cowboy outlaws who hold up a train and rob the passengers. They are then pursued by a Sheriff's posse. Several scenes have color included - all hand tinted.", languages: [ 'English' ], released: ISODate("1903-12-01T00:00:00.000Z"), directors: [ 'Edwin S. Porter' ], rated: 'TV-G', awards: { wins: 1, nominations: 0, text: '1 win.' }, lastupdated: '2015-08-13 00:27:59.177000000', year: 1903, imdb: { rating: 7.4, votes: 9847, id: 439 }, countries: [ 'USA' ], type: 'movie', tomatoes: { viewer: { rating: 3.7, numReviews: 2559, meter: 75 }, fresh: 6, critic: { rating: 7.6, numReviews: 6, meter: 100 }, rotten: 0, lastUpdated: ISODate("2015-08-08T19:16:10.000Z") } .....
Query:
db.movies.aggregate([
{
$unwind: '$genres'
},
{
$group: {
_id: '$genres',
count: { $sum: 1 }
}
},
{
$sort: { count: -1 }
},
{
$limit: 1
}
])
Output:
[ { _id: 'Drama', count: 13789 } ]
Explanation:
The said query in MongoDB returns the most frequently occurring genre from the 'movies' collection in MongoDB.
The $unwind stage separates document for each element in the "genres" array and count the occurrences of each genre individually in each group.
The $group stage groups the documents based on the "genres" field. The _id field is used to specify the grouping key.
Inside the $group stage, the $sum aggregation operator increments the count for each occurrence of a genre. The 1 passed to $sum represents adding 1 for each document, effectively counting the occurrences of each genre.
The $sort stage sorts the documents based on the "count" field in descending order.
The $limit stage limits only the first document, effectively retrieving the genre with the highest count.
Improve this sample solution and post your code through Disqus.
Previous: Calculate average runtime of movies released in each country.
Next: Find movies with highest average rating by year.
What is the difficulty level of this exercise?
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics