In this comprehensive MongoDB tutorial, we'll delve into the powerful aggregate()
method, showcasing how to compute aggregate values within a collection or view. Using the classic Northwind dataset, we'll provide practical examples that demonstrate grouping, summing, averaging, and transforming data. This guide is especially beneficial for those familiar with SQL, as we'll draw parallels between SQL queries and MongoDB's aggregation operators, facilitating a smoother transition and deeper understanding.
Syntax or Definition
The MongoDB aggregate syntax is simple, like this.
db.collection.aggregate(pipeline, options)
That syntax calculates the aggregate of a collection by pipeline and options. The aggregation pipeline is a sequence of data aggregation operations or stages. We can compare this aggregation pipeline with the SQL terms function and concepts.
SQL Terms, Functions, and Concepts | MongoDB Aggregation Operators |
WHERE | $match |
GROUP BY | $group |
HAVING | $match |
SELECT | $project |
ORDER BY | $sort |
LIMIT | $limit |
SUM() | $sum |
COUNT() | $sortByCount |
JOIN | $lookup |
SELECT INTO NEW_TABLE | $out |
MERGE INTO TABLE | $merge (Available starting in MongoDB 4.2) |
Preparation
We will use the popular and legendary Northwind database, which has been converted to the MongoDB database. You can clone or download it from our GitHub. After cloning or downloading, restore the MongoRestore command and make sure your MongoDB server is running. We are running the MongoDB daemon manually, so we need to open a new Terminal tab to run it.
mongod
In the previous Terminal tab, type this command to restore the Products collection from the Northwind MongoDB database.
mongorestore --db your-db-name -c products ~/northwind-mongo/dump/products.bson
Now, enter the Mongo console, then check the Products collection.
mongo
use your-db-name
db.products.find()
Now, we are ready to practice using the MongoDB aggregate method.
Group By and Calculate Sum Example
The below example will show you how to Group products by SupplierID, then calculate the sum of UnitPrice for each SupplierID. The result will sort ascending by the SupplierID.
db.products.aggregate([
{ $group: { _id: "$SupplierID", TotalPrice: { $sum: "$UnitPrice" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "TotalPrice" : 47 }
{ "_id" : 2, "TotalPrice" : 81.4 }
{ "_id" : 3, "TotalPrice" : 95 }
{ "_id" : 4, "TotalPrice" : 138 }
{ "_id" : 5, "TotalPrice" : 59 }
{ "_id" : 6, "TotalPrice" : 44.75 }
{ "_id" : 7, "TotalPrice" : 177.85 }
{ "_id" : 8, "TotalPrice" : 112.7 }
{ "_id" : 9, "TotalPrice" : 30 }
{ "_id" : 10, "TotalPrice" : 4.5 }
{ "_id" : 11, "TotalPrice" : 89.13 }
{ "_id" : 12, "TotalPrice" : 223.39000000000001 }
{ "_id" : 13, "TotalPrice" : 25.89 }
{ "_id" : 14, "TotalPrice" : 79.3 }
{ "_id" : 15, "TotalPrice" : 60 }
{ "_id" : 16, "TotalPrice" : 46 }
{ "_id" : 17, "TotalPrice" : 60 }
{ "_id" : 18, "TotalPrice" : 281.5 }
{ "_id" : 19, "TotalPrice" : 28.049999999999997 }
Group By and Get Average Example
The below example will show you how to Group products by SupplierID, then get the average of UnitPrice for each SupplierID. The result will sort ascending by the SupplierID.
db.products.aggregate([
{ $group: { _id: "$SupplierID", AvgPrice: { $avg: "$UnitPrice" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "AvgPrice" : 15.666666666666666 }
{ "_id" : 2, "AvgPrice" : 20.35 }
{ "_id" : 3, "AvgPrice" : 31.666666666666668 }
{ "_id" : 4, "AvgPrice" : 46 }
{ "_id" : 5, "AvgPrice" : 29.5 }
{ "_id" : 6, "AvgPrice" : 14.916666666666666 }
{ "_id" : 7, "AvgPrice" : 35.57 }
{ "_id" : 8, "AvgPrice" : 28.175 }
{ "_id" : 9, "AvgPrice" : 15 }
{ "_id" : 10, "AvgPrice" : 4.5 }
{ "_id" : 11, "AvgPrice" : 29.709999999999997 }
{ "_id" : 12, "AvgPrice" : 44.678000000000004 }
{ "_id" : 13, "AvgPrice" : 25.89 }
{ "_id" : 14, "AvgPrice" : 26.433333333333334 }
{ "_id" : 15, "AvgPrice" : 20 }
{ "_id" : 16, "AvgPrice" : 15.333333333333334 }
{ "_id" : 17, "AvgPrice" : 20 }
{ "_id" : 18, "AvgPrice" : 140.75 }
{ "_id" : 19, "AvgPrice" : 14.024999999999999 }
{ "_id" : 20, "AvgPrice" : 26.483333333333334 }
Get Minimum Values Example
The below example will show you how to Group products by SupplierID, then get the minimum values of UnitPrice for each SupplierID. The result will sort ascending by the SupplierID. This operation will use the $min operator.
db.products.aggregate([
{ $group: { _id: "$SupplierID", AvgPrice: { $min: "$UnitPrice" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "AvgPrice" : 10 }
{ "_id" : 2, "AvgPrice" : 17 }
{ "_id" : 3, "AvgPrice" : 25 }
{ "_id" : 4, "AvgPrice" : 10 }
{ "_id" : 5, "AvgPrice" : 21 }
{ "_id" : 6, "AvgPrice" : 6 }
{ "_id" : 7, "AvgPrice" : 15 }
{ "_id" : 8, "AvgPrice" : 9.2 }
{ "_id" : 9, "AvgPrice" : 9 }
{ "_id" : 10, "AvgPrice" : 4.5 }
{ "_id" : 11, "AvgPrice" : 14 }
{ "_id" : 12, "AvgPrice" : 7.75 }
{ "_id" : 13, "AvgPrice" : 25.89 }
{ "_id" : 14, "AvgPrice" : 12.5 }
{ "_id" : 15, "AvgPrice" : 2.5 }
{ "_id" : 16, "AvgPrice" : 14 }
{ "_id" : 17, "AvgPrice" : 15 }
{ "_id" : 18, "AvgPrice" : 18 }
{ "_id" : 19, "AvgPrice" : 9.65 }
{ "_id" : 20, "AvgPrice" : 14 }
Get Maximum Values Example
The below example will show you how to Group products by SupplierID, then get the maximum values of UnitPrice for each SupplierID. The result will sort ascending by the SupplierID. This operation will use $max operator.
db.products.aggregate([
{ $group: { _id: "$SupplierID", AvgPrice: { $max: "$UnitPrice" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "AvgPrice" : 19 }
{ "_id" : 2, "AvgPrice" : 22 }
{ "_id" : 3, "AvgPrice" : 40 }
{ "_id" : 4, "AvgPrice" : 97 }
{ "_id" : 5, "AvgPrice" : 38 }
{ "_id" : 6, "AvgPrice" : 23.25 }
{ "_id" : 7, "AvgPrice" : 62.5 }
{ "_id" : 8, "AvgPrice" : 81 }
{ "_id" : 9, "AvgPrice" : 21 }
{ "_id" : 10, "AvgPrice" : 4.5 }
{ "_id" : 11, "AvgPrice" : 43.9 }
{ "_id" : 12, "AvgPrice" : 123.79 }
{ "_id" : 13, "AvgPrice" : 25.89 }
{ "_id" : 14, "AvgPrice" : 34.8 }
{ "_id" : 15, "AvgPrice" : 36 }
{ "_id" : 16, "AvgPrice" : 18 }
{ "_id" : 17, "AvgPrice" : 26 }
{ "_id" : 18, "AvgPrice" : 263.5 }
{ "_id" : 19, "AvgPrice" : 18.4 }
{ "_id" : 20, "AvgPrice" : 46 }
Insert the Array of Values from Other Fields Example
The example below will show you how to insert the array of values from other fields for each SupplierID. The result will sort ascending by the SupplierID. This operation will use the $push operator.
db.products.aggregate([
{ $group: { _id: "$SupplierID", AvgPrice: { $max: "$UnitPrice" }, ReorderLevel: { $push: "$ReorderLevel" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "AvgPrice" : 19, "ReorderLevel" : [ 10, 25, 25 ] }
{ "_id" : 2, "AvgPrice" : 22, "ReorderLevel" : [ 0, 20, 0, 0 ] }
{ "_id" : 3, "AvgPrice" : 40, "ReorderLevel" : [ 25, 10, 0 ] }
{ "_id" : 4, "AvgPrice" : 97, "ReorderLevel" : [ 0, 0, 5 ] }
{ "_id" : 5, "AvgPrice" : 38, "ReorderLevel" : [ 30, 0 ] }
{ "_id" : 6, "AvgPrice" : 23.25, "ReorderLevel" : [ 5, 5, 0 ] }
{ "_id" : 7, "AvgPrice" : 62.5, "ReorderLevel" : [ 10, 0, 0, 5, 30 ] }
{ "_id" : 8, "AvgPrice" : 81, "ReorderLevel" : [ 5, 0, 5, 15 ] }
{ "_id" : 9, "AvgPrice" : 21, "ReorderLevel" : [ 25, 25 ] }
{ "_id" : 10, "AvgPrice" : 4.5, "ReorderLevel" : [ 0 ] }
{ "_id" : 11, "AvgPrice" : 43.9, "ReorderLevel" : [ 30, 30, 0 ] }
{ "_id" : 12, "AvgPrice" : 123.79, "ReorderLevel" : [ 0, 30, 25, 15, 0 ] }
{ "_id" : 13, "AvgPrice" : 25.89, "ReorderLevel" : [ 15 ] }
{ "_id" : 14, "AvgPrice" : 34.8, "ReorderLevel" : [ 20, 0, 25 ] }
{ "_id" : 15, "AvgPrice" : 36, "ReorderLevel" : [ 20, 0, 15 ] }
{ "_id" : 16, "AvgPrice" : 18, "ReorderLevel" : [ 15, 15, 10 ] }
{ "_id" : 17, "AvgPrice" : 26, "ReorderLevel" : [ 20, 25, 5 ] }
{ "_id" : 18, "AvgPrice" : 263.5, "ReorderLevel" : [ 15, 5 ] }
{ "_id" : 19, "AvgPrice" : 18.4, "ReorderLevel" : [ 30, 10 ] }
{ "_id" : 20, "AvgPrice" : 46, "ReorderLevel" : [ 15, 0, 25 ] }
Insert the Array of Unique Values from Other Fields Example
The example below will show you how to insert the array of unique values from other fields for each SupplierID. The result will sort ascending by the SupplierID. This operation will use the $addToSet operator.
db.products.aggregate([
{ $group: { _id: "$SupplierID", AvgPrice: { $max: "$UnitPrice" }, ReorderLevel: { $addToSet: "$ReorderLevel" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "AvgPrice" : 19, "ReorderLevel" : [ 25, 10 ] }
{ "_id" : 2, "AvgPrice" : 22, "ReorderLevel" : [ 20, 0 ] }
{ "_id" : 3, "AvgPrice" : 40, "ReorderLevel" : [ 0, 10, 25 ] }
{ "_id" : 4, "AvgPrice" : 97, "ReorderLevel" : [ 5, 0 ] }
{ "_id" : 5, "AvgPrice" : 38, "ReorderLevel" : [ 0, 30 ] }
{ "_id" : 6, "AvgPrice" : 23.25, "ReorderLevel" : [ 0, 5 ] }
{ "_id" : 7, "AvgPrice" : 62.5, "ReorderLevel" : [ 30, 5, 0, 10 ] }
{ "_id" : 8, "AvgPrice" : 81, "ReorderLevel" : [ 15, 0, 5 ] }
{ "_id" : 9, "AvgPrice" : 21, "ReorderLevel" : [ 25 ] }
{ "_id" : 10, "AvgPrice" : 4.5, "ReorderLevel" : [ 0 ] }
{ "_id" : 11, "AvgPrice" : 43.9, "ReorderLevel" : [ 0, 30 ] }
{ "_id" : 12, "AvgPrice" : 123.79, "ReorderLevel" : [ 25, 30, 15, 0 ] }
{ "_id" : 13, "AvgPrice" : 25.89, "ReorderLevel" : [ 15 ] }
{ "_id" : 14, "AvgPrice" : 34.8, "ReorderLevel" : [ 0, 25, 20 ] }
{ "_id" : 15, "AvgPrice" : 36, "ReorderLevel" : [ 15, 0, 20 ] }
{ "_id" : 16, "AvgPrice" : 18, "ReorderLevel" : [ 10, 15 ] }
{ "_id" : 17, "AvgPrice" : 26, "ReorderLevel" : [ 5, 25, 20 ] }
{ "_id" : 18, "AvgPrice" : 263.5, "ReorderLevel" : [ 5, 15 ] }
{ "_id" : 19, "AvgPrice" : 18.4, "ReorderLevel" : [ 10, 30 ] }
{ "_id" : 20, "AvgPrice" : 46, "ReorderLevel" : [ 25, 0, 15 ] }
Get the First Value for Each Group
The example below will show you how to get the first value of ReorderLevel for each SupplierID. The result will sort ascending by the SupplierID. This operation will use $first operator.
db.products.aggregate([
{ $group: { _id: "$SupplierID", AvgPrice: { $max: "$UnitPrice" }, ReorderLevel: { $first: "$ReorderLevel" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "AvgPrice" : 19, "ReorderLevel" : 10 }
{ "_id" : 2, "AvgPrice" : 22, "ReorderLevel" : 0 }
{ "_id" : 3, "AvgPrice" : 40, "ReorderLevel" : 25 }
{ "_id" : 4, "AvgPrice" : 97, "ReorderLevel" : 0 }
{ "_id" : 5, "AvgPrice" : 38, "ReorderLevel" : 30 }
{ "_id" : 6, "AvgPrice" : 23.25, "ReorderLevel" : 5 }
{ "_id" : 7, "AvgPrice" : 62.5, "ReorderLevel" : 10 }
{ "_id" : 8, "AvgPrice" : 81, "ReorderLevel" : 5 }
{ "_id" : 9, "AvgPrice" : 21, "ReorderLevel" : 25 }
{ "_id" : 10, "AvgPrice" : 4.5, "ReorderLevel" : 0 }
{ "_id" : 11, "AvgPrice" : 43.9, "ReorderLevel" : 30 }
{ "_id" : 12, "AvgPrice" : 123.79, "ReorderLevel" : 0 }
{ "_id" : 13, "AvgPrice" : 25.89, "ReorderLevel" : 15 }
{ "_id" : 14, "AvgPrice" : 34.8, "ReorderLevel" : 20 }
{ "_id" : 15, "AvgPrice" : 36, "ReorderLevel" : 20 }
{ "_id" : 16, "AvgPrice" : 18, "ReorderLevel" : 15 }
{ "_id" : 17, "AvgPrice" : 26, "ReorderLevel" : 20 }
{ "_id" : 18, "AvgPrice" : 263.5, "ReorderLevel" : 15 }
{ "_id" : 19, "AvgPrice" : 18.4, "ReorderLevel" : 30 }
{ "_id" : 20, "AvgPrice" : 46, "ReorderLevel" : 15 }
Get the Last Value for Each Group
The example below will show you how to get the last value of ReorderLevel for each SupplierID. The result will sort ascending by the SupplierID. This operation will use the $last operator.
db.products.aggregate([
{ $group: { _id: "$SupplierID", AvgPrice: { $max: "$UnitPrice" }, ReorderLevel: { $last: "$ReorderLevel" } } },
{ $sort: { _id: 1 } }
])
Result:
{ "_id" : 1, "AvgPrice" : 19, "ReorderLevel" : 25 }
{ "_id" : 2, "AvgPrice" : 22, "ReorderLevel" : 0 }
{ "_id" : 3, "AvgPrice" : 40, "ReorderLevel" : 0 }
{ "_id" : 4, "AvgPrice" : 97, "ReorderLevel" : 5 }
{ "_id" : 5, "AvgPrice" : 38, "ReorderLevel" : 0 }
{ "_id" : 6, "AvgPrice" : 23.25, "ReorderLevel" : 0 }
{ "_id" : 7, "AvgPrice" : 62.5, "ReorderLevel" : 30 }
{ "_id" : 8, "AvgPrice" : 81, "ReorderLevel" : 15 }
{ "_id" : 9, "AvgPrice" : 21, "ReorderLevel" : 25 }
{ "_id" : 10, "AvgPrice" : 4.5, "ReorderLevel" : 0 }
{ "_id" : 11, "AvgPrice" : 43.9, "ReorderLevel" : 0 }
{ "_id" : 12, "AvgPrice" : 123.79, "ReorderLevel" : 0 }
{ "_id" : 13, "AvgPrice" : 25.89, "ReorderLevel" : 15 }
{ "_id" : 14, "AvgPrice" : 34.8, "ReorderLevel" : 25 }
{ "_id" : 15, "AvgPrice" : 36, "ReorderLevel" : 15 }
{ "_id" : 16, "AvgPrice" : 18, "ReorderLevel" : 10 }
{ "_id" : 17, "AvgPrice" : 26, "ReorderLevel" : 5 }
{ "_id" : 18, "AvgPrice" : 263.5, "ReorderLevel" : 5 }
{ "_id" : 19, "AvgPrice" : 18.4, "ReorderLevel" : 10 }
{ "_id" : 20, "AvgPrice" : 46, "ReorderLevel" : 25 }
Show Detailed Information of the Aggregate Method
The previous example of the aggregate method can be described with the detailed information using the explain() method.
db.products.explain().aggregate([
{ $group: { _id: "$SupplierID", $round: { TotalPrice: { $sum: "$UnitPrice" } } } },
{ $sort: { _id: 1 } }
])
Result:
{
"waitedMS" : NumberLong(0),
"stages" : [
{
"$cursor" : {
"query" : {
},
"fields" : {
"SupplierID" : 1,
"UnitPrice" : 1,
"_id" : 0
},
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "mongo-examples.products",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [ ]
},
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [ ]
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
}
}
},
{
"$group" : {
"_id" : "$SupplierID",
"TotalPrice" : {
"$sum" : "$UnitPrice"
}
}
},
{
"$sort" : {
"sortKey" : {
"_id" : 1
}
}
}
],
"ok" : 1
}
That it's a few examples of the MongoDB Aggregate Methods.
That's just the basics. If you need more deep learning about MongoDB or related, you can take the following cheap course:
- Mongodb fundamentals
- Introduction to MongoDB
- MongoDB
- MongoDB
- MongoDB: Complex Querying & advanced data model: 2 in 1
Thanks!