MongoDB Aggregate and Group example

摘要: In this tutorial, we will show you how to use MongoDB aggregate function to group documents (data).

In this tutorial, we will show you how to use MongoDB aggregate function to group documents (data).

1. Test Data

Data in JSON format, shows the hosting provider for website.

website.json
{ "_id" : 1, "domainName" : "test1.com", "hosting" : "hostgator.com" }
{ "_id" : 2, "domainName" : "test2.com", "hosting" : "aws.amazon.com"}
{ "_id" : 3, "domainName" : "test3.com", "hosting" : "aws.amazon.com" }
{ "_id" : 4, "domainName" : "test4.com", "hosting" : "hostgator.com" }
{ "_id" : 5, "domainName" : "test5.com", "hosting" : "aws.amazon.com" }
{ "_id" : 6, "domainName" : "test6.com", "hosting" : "cloud.google.com" }
{ "_id" : 7, "domainName" : "test7.com", "hosting" : "aws.amazon.com" }
{ "_id" : 8, "domainName" : "test8.com", "hosting" : "hostgator.com" }
{ "_id" : 9, "domainName" : "test9.com", "hosting" : "cloud.google.com" }
{ "_id" : 10, "domainName" : "test10.com", "hosting" : "godaddy.com" }

Imports into a “website” collection.

> mongoimport -d testdb -c website --file website.json
connected to: 127.0.0.1
Mon Jan 13 14:30:22.662 imported 10 objects
Note
If the collection is existed, add --upsert option to override the data.

> mongoimport -d testdb -c website --file website.json --upsert

2. Grouping Example

Uses db.collection.aggregate and $group to perform the data grouping.

2.1 The following example groups by the “hosting” field, and display the total sum of each hosting.

> db.website.aggregate(
    { 
	$group : {_id : "$hosting", total : { $sum : 1 }}
  );

Output

        "result" : [
                        "_id" : "godaddy.com",
                        "total" : 1
                },
                        "_id" : "cloud.google.com",
                        "total" : 2
                },
                        "_id" : "aws.amazon.com",
                        "total" : 4
                },
                        "_id" : "hostgator.com",
                        "total" : 3
        ],
        "ok" : 1

The equivalent SQL.

SELECT hosting, SUM(hosting) AS total
       FROM website
       GROUP BY hosting

2.2 Add sorting with $sort.

>  db.website.aggregate(
     { 
	$group : {_id : "$hosting", total : { $sum : 1 }}
     },
	$sort : {total : -1}
  );

Output – Display “total” in descending order. For ascending order, uses $sort : {total : 1}.

        "result" : [
                        "_id" : "aws.amazon.com",
                        "total" : 4
                },
                        "_id" : "hostgator.com",
                        "total" : 3
                },
                        "_id" : "cloud.google.com",
                        "total" : 2
                },
                        "_id" : "godaddy.com",
                        "total" : 1
        ],
        "ok" : 1

2.3 Add $match condition, groups by “hosting” for “aws.amazon.com” only.

> db.website.aggregate(
    { 
	$match : {hosting : "aws.amazon.com"}
    },
    { 
	$group : { _id : "$hosting", total : { $sum : 1 } }
  );

Output

        "result" : [
                        "_id" : "aws.amazon.com",
                        "total" : 4
        ],
        "ok" : 1
More Examples
Refer to this official MongoDB Aggregation guide for more advance aggregation and group examples.

3. Exports Grouping Result to CSV or JSON

Often times, we need to export the grouping results in csv or JSON format. To solve it, inserts the group results in a new collection, and exports the new collection via mongoexport.

3.1 Set the group results in a variable. In this case, the variable name is “groupdata”.

> var groupdata = db.website.aggregate(
    { 
	$group : {_id : "$hosting", total : { $sum : 1 }}
    },
	$sort : {total : -1}
  );

3.2Inserts groupdata.toArray() into a new collection.

> db.websitegroup.insert(groupdata.toArray());
> db.websitegroup.find().pretty()
{ "_id" : "aws.amazon.com", "total" : 4 }
{ "_id" : "hostgator.com", "total" : 3 }
{ "_id" : "cloud.google.com", "total" : 2 }
{ "_id" : "godaddy.com", "total" : 1 }
>

3.3 Exports the collection “websitegroup” to a csv file.

c:\> mongoexport -d testdb -c websitegroup -f _id,total -o group.csv --csv
connected to: 127.0.0.1
exported 4 records
group.csv
_id,total
"aws.amazon.com",4.0
"cloud.google.com",2.0
"godaddy.com",1.0
"hostgator.com",3.0

3.4 Exports the collection “websitegroup” to a JSON file.

c:\> mongoexport -d testdb -c websitegroup -o group.json
connected to: 127.0.0.1
exported 4 records
group.json
{ "_id" : "aws.amazon.com", "total" : 4 }
{ "_id" : "cloud.google.com", "total" : 2 }
{ "_id" : "godaddy.com", "total" : 1 }
{ "_id" : "hostgator.com", "total" : 3 }

4. Large Sort Operation

Changed in version 2.6 – Read this Memory Restrictions
In MongoDB, the in-memory sorting have a limit of 100M, to perform a large sort, you need enable allowDiskUse option to write data to a temporary file for sorting.

To avoid the sort exceeded memory limit error, enable the allowDiskUse option.

db.website.aggregate(
	{$group : {_id : "$hosting", total : { $sum : 1 }}},
	{$sort : {total : -1}}
],
	{allowDiskUse: true}
);

References

  1. MongoDB Aggregation
  2. MongoDB db.collection.aggregate()
  3. Aggregation Pipeline Limits
  4. MongoDB Hello World Example

上一篇: java.lang.ClassNotFoundException: org.hibernate.service.jta.platform.spi.JtaPlatform
下一篇: Maven + Emma code coverage example
 评论 ( What Do You Think )
名称
邮箱
网址
评论
验证
   
 

 


  • 微信公众号

  • 我的微信

站点声明:

1、一号门博客CMS,由Python, MySQL, Nginx, Wsgi 强力驱动

2、部分文章或者资源来源于互联网, 有时候很难判断是否侵权, 若有侵权, 请联系邮箱:summer@yihaomen.com, 同时欢迎大家注册用户,主动发布无版权争议的 文章/资源.

3、鄂ICP备14001754号-3, 鄂公网安备 42280202422812号