Table of contents
1. Introduction
2. What is Aggregation in MongoDB?
3. Aggregation Pipeline Technique
   3.1. Syntax
   3.2. Different Stages in Aggregation
   3.3. Aggregate Expressions
4. Map-Reduce Function
   4.1. Syntax
5. Single Purpose Aggregation
6. Frequently Asked Questions
   6.1. What is aggregate in MongoDB?
   6.2. Why use MongoDB aggregate?
   6.3. What is the difference between aggregate and pipeline in MongoDB?
   6.4. What is the difference between aggregate and find in MongoDB?
7. Conclusion
Last Updated: Mar 27, 2024

MongoDB Aggregate


Introduction

MongoDB provides many methods and operations that simplify developers' work and let complex tasks be expressed in less code. The MongoDB aggregate method is one of them: it processes the data as required and returns the computed output.


This article will discuss the aggregation operation and the MongoDB Aggregate method.

What is Aggregation in MongoDB?

Aggregation in MongoDB is the process of combining data from several documents and collections to produce summarized results.

Aggregation operations can be used to group values from several documents, perform operations on the grouped data to return a single result, or analyze how data changes over time.

There are three MongoDB Aggregate techniques.

  1. Aggregation Pipeline
     
  2. Map-reduce function
     
  3. Single-purpose aggregation

Aggregation Pipeline Technique

MongoDB uses a pipeline architecture to perform aggregation. The aggregation pipeline consists of several stages that are applied, in order, to a collection of documents. Each stage carries out an operation such as filtering, grouping, or sorting and passes its output to the next stage.

Here is an example of a simple aggregation pipeline that groups the documents by a field and counts the number of documents in each group:

Syntax

db.collection.aggregate([
   { $group: { _id: "$field_name", count: { $sum: 1 } } }
])

 

In the above syntax, $group is the stage, "$field_name" is the expression, and $sum is the accumulator.

The pipeline first groups the documents in the collection by the value of the field_name field. It then uses the $sum accumulator to count the documents in each group by adding 1 for every document.

Here, aggregation is performed with the aggregate() method, and the pipeline is built from three kinds of components: stages, expressions, and accumulators.
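To make the grouping step concrete, here is a minimal sketch in plain JavaScript that mimics what the $group stage with { $sum: 1 } computes. It runs in memory without a database; the sample documents, the "category" field, and the groupCount helper are assumptions made for illustration and are not part of MongoDB.

```javascript
// Hypothetical sample documents; "category" stands in for "field_name".
const docs = [
  { item: "pen", category: "stationery" },
  { item: "notebook", category: "stationery" },
  { item: "apple", category: "fruit" },
];

// In-memory sketch of:
//   db.collection.aggregate([
//     { $group: { _id: "$category", count: { $sum: 1 } } }
//   ])
function groupCount(documents, fieldName) {
  const counts = new Map();
  for (const doc of documents) {
    const key = doc[fieldName];
    // Adding 1 per document is what { $sum: 1 } does inside $group.
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts].map(([_id, count]) => ({ _id, count }));
}

const grouped = groupCount(docs, "category");
console.log(grouped);
// → [ { _id: 'stationery', count: 2 }, { _id: 'fruit', count: 1 } ]
```

Unlike this sketch, MongoDB does not guarantee the order of the groups returned by $group; add a $sort stage if the order matters.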

Different Stages in Aggregation

Commonly used stages in a MongoDB aggregation pipeline include:

  • $match
    Filters the documents, passing only those that match the given conditions to the next stage.
     
  • $project
    Reshapes each document by including, excluding, renaming, or computing fields.
     
  • $sort
    Orders the documents in the pipeline by the specified criteria.
     
  • $limit
    Limits the number of documents passed to the next stage.
     
  • $group
    Groups documents by a specified value.
     
  • $skip
    Skips a given number of documents and passes the rest to the next stage.
     
  • $lookup
    Performs a left outer join with another collection.
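
The effect of chaining several of these stages can be sketched in plain JavaScript. The example below is an in-memory approximation, not MongoDB itself: the orders data and its fields are hypothetical, and $group and $lookup are left out to keep it short.

```javascript
// Hypothetical sample documents for illustration only.
const orders = [
  { _id: 1, status: "shipped", total: 40 },
  { _id: 2, status: "pending", total: 15 },
  { _id: 3, status: "shipped", total: 75 },
  { _id: 4, status: "shipped", total: 20 },
  { _id: 5, status: "shipped", total: 60 },
];

// In-memory sketch of the pipeline:
//   [ { $match: { status: "shipped" } },
//     { $sort: { total: -1 } },
//     { $skip: 1 },
//     { $limit: 2 },
//     { $project: { _id: 0, total: 1 } } ]
const stageResult = orders
  .filter((doc) => doc.status === "shipped") // $match: keep matching documents
  .sort((a, b) => b.total - a.total)         // $sort: descending by total
  .slice(1)                                  // $skip: drop the first document
  .slice(0, 2)                               // $limit: keep at most two documents
  .map((doc) => ({ total: doc.total }));     // $project: keep only "total"

console.log(stageResult);
// → [ { total: 60 }, { total: 40 } ]
```

Each step receives the output of the previous one, which is the same way documents flow from stage to stage in a real pipeline.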

Aggregate Expressions

The following accumulator expressions are commonly used with $group.

  • $sum
    Adds up the values of a given field across the documents in each group.
     
  • $avg
    Calculates the average of the given values across the documents in each group.
     
  • $min
    Returns the lowest value of the given field across the documents in each group.
     
  • $push
    Appends the value from each document in the group to an array in the result.
     
  • $max
    Returns the highest value of the given field across the documents in each group.
     
  • $first
    Returns the value from the first document in each group.
     
  • $last
    Returns the value from the last document in each group.
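
As a rough illustration of what these accumulators compute, the sketch below groups a few documents in plain JavaScript and derives each value by hand. The sales data, the field names, and the accumulateByRegion helper are hypothetical; in MongoDB the same work would be done by a $group stage using the accumulators listed above.

```javascript
// Hypothetical sample documents for illustration only.
const sales = [
  { region: "north", amount: 10 },
  { region: "south", amount: 30 },
  { region: "north", amount: 20 },
  { region: "south", amount: 50 },
  { region: "north", amount: 60 },
];

// In-memory sketch of a $group stage keyed on "$region".
function accumulateByRegion(documents) {
  const groups = new Map();
  for (const doc of documents) {
    if (!groups.has(doc.region)) groups.set(doc.region, []);
    groups.get(doc.region).push(doc.amount); // collecting values, like $push
  }
  return [...groups].map(([_id, amounts]) => {
    const sum = amounts.reduce((a, b) => a + b, 0);
    return {
      _id,
      sum,                                // $sum
      avg: sum / amounts.length,          // $avg
      min: Math.min(...amounts),          // $min
      max: Math.max(...amounts),          // $max
      amounts,                            // $push
      first: amounts[0],                  // $first
      last: amounts[amounts.length - 1],  // $last
    };
  });
}

const summary = accumulateByRegion(sales);
console.log(summary);
```

In MongoDB, $first and $last are only meaningful when the documents reach $group in a defined order, which usually means placing a $sort stage before it.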

Map-Reduce Function

Map-reduce is a technique for aggregating results over very large volumes of data. It has two primary operations: Map, which emits a key-value pair for each document so that the data is organized by key, and Reduce, which operates on the grouped data. Note that map-reduce is deprecated as of MongoDB 5.0, and the aggregation pipeline is the recommended alternative for new code.

In the Map step, the incoming data is processed one document at a time. The Map function is applied to each document and emits one or more key-value pairs, producing a collection of intermediate key-value pairs.

Map-reduce operations in MongoDB use custom JavaScript functions for both steps. If a key has multiple values mapped to it, the operation condenses those values into a single object.

In the Reduce step, the intermediate key-value pairs created by the Map step are combined and aggregated. The Reduce function receives each key together with all the values emitted for it and returns a single result for that key.

Because the data is broken into smaller groups and the processing can be distributed across multiple nodes, a map-reduce cluster can handle enormous amounts of data and still produce results accurately and in good time.

Syntax

db.collectionName.mapReduce(mappingFunction, reduceFunction, {out:'Result'});


Let us consider the following example to understand map-reduce aggregation.
First, we create a database and a collection:
Database: myDb
Collection: marks

db.marks.find().pretty()
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Utkarsh",
"age": 10,
"marks": 11
}
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Vardaan",
"age": 11,
"marks": 10
}
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Shivam",
"age": 12,
"marks": 17
}
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Suyash",
"age": 13,
"marks": 1
}

 

Above are the documents we have created for the aggregation. We will group the documents by age and find the total marks for each age group.

// Map step: emit each document's age as the key and its marks as the value
var mapfunction = function(){emit(this.age, this.marks)}
// Reduce step: add up all marks emitted for the same age
var reducefunction = function(key, values){return Array.sum(values)}
db.marks.mapReduce(mapfunction, reducefunction, {'out':'Results'})
{"result": "Results", "ok": 1 }
db.Results.find()


The result for the data above is stored in the collection named "Results". Because each age occurs only once in the sample data, each total equals that student's marks.

{"_id": 10, "value": 11 }
{"_id": 11, "value": 10 }
{"_id": 12, "value": 17 }
{"_id": 13, "value": 1 }
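
To see the Reduce step actually combine several values, the sketch below emulates the two phases in plain JavaScript on data where one key repeats. It is only an in-memory approximation: the scores data and the mapReduce helper are hypothetical, and emit is passed in as an argument here, whereas MongoDB provides it as a global inside the map function.

```javascript
// Hypothetical sample documents; the key "red" appears twice.
const scores = [
  { team: "red", points: 5 },
  { team: "blue", points: 3 },
  { team: "red", points: 7 },
];

// In-memory sketch of the map and reduce phases.
function mapReduce(documents, mapFn, reduceFn) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // Map phase: run the map function with each document bound to `this`.
  for (const doc of documents) mapFn.call(doc, emit);
  // Reduce phase: combine the values of every key that has more than one.
  // As in MongoDB, reduce is not called for a key with a single value.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length === 1 ? values[0] : reduceFn(key, values),
  }));
}

const totals = mapReduce(
  scores,
  function (emit) { emit(this.team, this.points); },
  (key, values) => values.reduce((a, b) => a + b, 0)
);

console.log(totals);
// → [ { _id: 'red', value: 12 }, { _id: 'blue', value: 3 } ]
```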

Single Purpose Aggregation

Single-purpose aggregation combines several data points into a single, simplified result focused on one specific goal. It is used in data analysis to condense a large amount of data into manageable summaries.

It is used whenever quick access to a simple result is required, for example, counting the number of documents in a collection or finding all unique values of a field. It does not offer the flexibility and capabilities of the pipeline, since it only covers common aggregation tasks through the count(), distinct(), and estimatedDocumentCount() methods.

Let us consider the following example to understand single-purpose aggregation.

Database: myDb

Collection: marks

Documents: 4 documents containing the children's details in the form of field-value pairs.

db.marks.find().pretty()
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Utkarsh",
"age": 10,
"Marks": 15,
"location": "Delhi"
}
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Vardaan",
"age": 11,
"Marks": 16,
"location": "Mumbai"
}
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Shivam",
"age": 12,
"Marks": 17,
"location": "Kolkata"
}
{
"_id": ObjectId("601151166c46s4s64c111"),
"name": "Suyash",
"age": 13,
"Marks": 1,
"location": "Banglore"
}

 

Above are the documents we have created for the aggregation.

Get the distinct locations.

db.marks.distinct("location")

 

Output

["Delhi", "Mumbai", "Kolkata", "Banglore"]


In the above example, we used the distinct() method, which returns the distinct values of the specified field.

Each document in the collection has several fields, so when we need the unique values of one particular field across all the documents in the collection, we use the distinct() method, as done above.

db.marks.distinct("age")


Output

[10, 11, 12, 13]

 

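Similarly, we have used the distinct() method for the "age" field.

We can also use the count() method to find the total number of documents in the collection. Note that count() is deprecated in newer versions of the shell, where countDocuments() or estimatedDocumentCount() is preferred.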

db.marks.count()


Output

4


Similarly, there are other methods that we can use according to our needs.
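
What distinct() and count() compute can be sketched in plain JavaScript as follows. This is an in-memory approximation on hypothetical data (one location is repeated to show the de-duplication); the distinctValues and countDocs helpers are illustrative names, not MongoDB functions.

```javascript
// Hypothetical sample documents; "Delhi" appears twice.
const students = [
  { name: "A", location: "Delhi" },
  { name: "B", location: "Mumbai" },
  { name: "C", location: "Delhi" },
];

// Sketch of db.collection.distinct(field): unique values of one field.
function distinctValues(documents, field) {
  return [...new Set(documents.map((doc) => doc[field]))];
}

// Sketch of counting the documents in a collection.
function countDocs(documents) {
  return documents.length;
}

const locations = distinctValues(students, "location");
const total = countDocs(students);

console.log(locations); // → [ 'Delhi', 'Mumbai' ]
console.log(total);     // → 3
```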

Frequently Asked Questions

What is aggregate in MongoDB?

Aggregation in MongoDB is the process of combining data from several documents and collections to produce summarized results. Aggregation operations can be used to group values from several documents, perform operations on the grouped data to return a single result, or analyze how data changes over time.

Why use MongoDB aggregate?

MongoDB aggregation is useful when you need summarized results rather than raw documents, for example, when you need to count the number of documents in a collection or find all distinct values of a field.

What is the difference between aggregate and pipeline in MongoDB?

Aggregate is the operation that performs aggregation on the data, whereas the pipeline is the sequence of stages that defines how that operation is executed. You use the aggregate command to apply a pipeline of stages to the data.

What is the difference between aggregate and find in MongoDB?

Aggregate is used to perform aggregation operations on data, such as grouping documents and computing values for each group. In contrast, find is a query method for basic document retrieval, such as fetching the documents that have a specific field value.

Conclusion

In this article, we discussed data aggregation, which is frequently used to produce summary data for business analysis and statistics about groups of records.

We covered the different MongoDB aggregation techniques with examples of how to use them. The aggregation pipeline becomes increasingly important as you use MongoDB, because it lets you perform the reporting, transformation, and complex querying tasks that a database developer relies on.

Together, these three techniques make aggregation more efficient and simpler to carry out.
