Learning MongoDB (7): Aggregation Queries in MongoDB (Two Approaches), with Project Source Code

qq70191 · Posted on 2015-7-6 08:30:00

First, a picture from the road...

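Ta-da... the project source code can be downloaded from: http://files.iyunv.com/ontheroad_lee/MongoDBDemo.rar

This project was created with Maven; if you have not used Maven, look it up on Baidu or Google. Just run it with JUnit: first execute the save method to insert 10,000 test documents, which the aggregation queries below then use.
Enough talk, on to the substance...
I. MongoDB database configuration (mongodb.xml)
Below is my own configuration; change the parts in red to match your own machine. If you don't know how to set the port, create a database name, or configure a username and password, see part 4 of this series (Learning MongoDB (4): Permission Settings -- Username, Password, Port, etc.). And if you just can't be bothered to set them up, then @#$%^&*()!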

　　

　　
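II. Aggregation query methods
Only some commonly used aggregation queries are shown below. My skills are still limited, so there is nothing very deep here; when I have the time, energy and ability, I will put together something more advanced.
1. Adding test data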

@Test
public void save() {
    News n = null;
    for (int i = 0; i < 10000; i++) {
        n = new News();
        n.setTitle("title_" + i);
        n.setUrl("url_" + i);
        // Random time between 2014-01-01 and 2014-05-11
        Date randomDate = DateUtil.randomDate("2014-01-01", "2014-05-11");
        // MongoDB stores Date values in UTC (Greenwich time) and China is at UTC+8, so a stored Date looks 8 hours off when viewed directly.
        // For example, 2014-05-01 00:00:00 Beijing time shows up in MongoDB as 2014-04-30T16:00:00Z. To keep things uniform, the time is also saved as a string here, and as a long below.
        n.setPublishTimeStr(DateUtil.formatDateTimeByDate(randomDate));
        // A long (epoch milliseconds) is cheap to compare, so it should be quick to query by range
        n.setPublishTime(randomDate.getTime());
        n.setPublishDate(randomDate);
        n.setPublishMedia("publishMedia_" + i);
        String[] areaArr = {"1024", "102401", "102402", "102403", "102404", "102405", "102406", "102407", "102408"
                , "10240101", "10240102", "10240201", "10240202", "10240301", "10240302", "10240401", "10240402"
                , "10240501", "10240502", "10240601", "10240602", "10240701", "10240702", "10240801", "10240802"};
        int areaNum = (int) (Math.random() * areaArr.length); // random index in [0, areaArr.length)
        String area = areaArr[areaNum];
        n.setArea(area);
        String[] ckeyArr = {"A101", "A102", "A201", "A202", "A203"
                , "B101", "B102", "B103", "C201", "C202", "C203", "22", "23", "24", "25", "26"};
        int ckeyNum = (int) (Math.random() * ckeyArr.length); // how many class keys this document gets (may be 0)
        List<String> list = new ArrayList<String>();
        for (int j = 0; j < ckeyNum; j++) {
            int ckeyNum1 = (int) (Math.random() * ckeyArr.length); // random index in [0, ckeyArr.length)
            list.add(ckeyArr[ckeyNum1]);
        }
        n.setClassKey(list);
        Integer[] evalArr = {1, 0};
        int evalNum = (int) (Math.random() * evalArr.length); // random index in [0, evalArr.length)
        n.setEvaluate(evalArr[evalNum]);
        Integer[] mproArr = {1, 2, 100};
        int mproNum = (int) (Math.random() * mproArr.length); // random index in [0, mproArr.length)
        n.setMediaProperty(mproArr[mproNum]);
        Integer[] mtypeArr = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
        int mtypeNum = (int) (Math.random() * mtypeArr.length); // random index in [0, mtypeArr.length)
        n.setMediaType(mtypeArr[mtypeNum]);
        Integer[] levelArr = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
        int levelNum = (int) (Math.random() * levelArr.length); // random index in [0, levelArr.length)
        n.setLevel(levelArr[levelNum]);
        newsService.save(n);
    }
    System.out.println("OK");
}
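The 8-hour difference mentioned in the comments above can be checked without MongoDB at all. The following is only a small sketch using java.time (the class name UtcOffsetSketch and the method toStoredInstant are made up for this illustration, they are not part of the project): a java.util.Date holds a UTC instant, so midnight Beijing time is 16:00 of the previous day in UTC.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Date;

// Hypothetical helper class, for illustration only.
class UtcOffsetSketch {

    // Interprets a wall-clock time in the given zone and returns the UTC instant
    // that a java.util.Date (and therefore a stored MongoDB date) would hold.
    static Instant toStoredInstant(LocalDateTime local, ZoneId zone) {
        Date date = Date.from(local.atZone(zone).toInstant());
        return date.toInstant();
    }

    public static void main(String[] args) {
        ZoneId shanghai = ZoneId.of("Asia/Shanghai");
        LocalDateTime local = LocalDateTime.of(2014, 5, 1, 0, 0, 0);
        Instant stored = toStoredInstant(local, shanghai);
        // The UTC view is what a raw date looks like when inspected directly.
        System.out.println("UTC view:   " + stored);
        // Converting back to the local zone recovers the original wall-clock time.
        System.out.println("Local view: " + stored.atZone(shanghai).toLocalDateTime());
        // The epoch-millisecond value is the same number whichever zone it is viewed in.
        System.out.println("Epoch ms:   " + stored.toEpochMilli());
    }
}
```

So nothing is lost when a Date is saved; only the displayed zone differs. Keeping the string and long copies, as save() does, is a convenience for grouping and range queries rather than a requirement.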
　　
2. A simple group query, using the AggregationOutput class provided by the MongoDB driver itself

/**
 *
 * Purpose: group query using the AggregationOutput class provided by the MongoDB driver itself
 * Parameters:
 * Author: OnTheRoad_Lee
 * Modified by: OnTheRoad_Lee
 * Last modified: 2014-5-26
 */
@Test
public void testGroup1() {
    // Group by the eval field. Note that $eval must be the field name as stored in MongoDB; $evaluate will not work (that is the property defined in the News class, which differs from the stored name).
    // {$group:{_id:{'AAA':'$BBB'},CCC:{$sum:1}}} is the fixed pattern: the fields to group by go inside _id:{}, BBB is a field in MongoDB, AAA is the alias for BBB, and CCC is the alias for $sum:1
    // This query == select eval as eval, count(*) as docsNum from news group by eval having docsNum>=85 order by docsNum desc
    // For a detailed MongoDB-to-SQL comparison see: http://docs.mongodb.org/manual/reference/sql-aggregation-comparison/
    String groupStr = "{$group:{_id:{'eval':'$eval'},docsNum:{$sum:1}}}";
    DBObject group = (DBObject) JSON.parse(groupStr);
    String matchStr = "{$match:{docsNum:{$gte:85}}}";
    DBObject match = (DBObject) JSON.parse(matchStr);
    // docsNum is a top-level field of the $group output (it is not nested under _id), so sort on it directly
    String sortStr = "{$sort:{docsNum:-1}}";
    DBObject sort = (DBObject) JSON.parse(sortStr);
    AggregationOutput output = mongoTemplate.getCollection("news").aggregate(group, match, sort);
    System.out.println(output.getCommand());
    // The native MongoDB command that gets executed:
    // { "aggregate" : "news" , "pipeline" : [ { "$group" : { "_id" : { "eval" : "$eval"} , "docsNum" : { "$sum" : 1}}} , { "$match" : { "docsNum" : { "$gte" : 85}}} , { "$sort" : { "docsNum" : -1}}]}
    System.out.println(output.getCommandResult());
    // Query result:
    // { "serverUsed" : "localhost/127.0.0.1:47017" , "result" : [ { "_id" : { "eval" : 0} , "docsNum" : 10047} , { "_id" : { "eval" : 1} , "docsNum" : 9955}] , "ok" : 1.0}
    // The result can also be wrapped in a NewsNumDTO, which makes it easier to hand back to the front end as a single DTO object
    NewsNumDTO dto = new NewsNumDTO();
    for (DBObject row : output.results()) {
        BasicDBObject dbo = (BasicDBObject) row;
        BasicDBObject keyValues = (BasicDBObject) dbo.get("_id");
        int eval = keyValues.getInt("eval");
        // $sum may come back as an Integer or a Long, so convert through Number
        long docsNum = ((Number) dbo.get("docsNum")).longValue();
        if (eval == 1) {
            dto.setPositiveNum(docsNum);
        } else {
            dto.setNegativeNum(docsNum);
        }
    }
}
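To make the three pipeline stages in testGroup1 easier to picture, here is a rough in-memory equivalent written with plain Java streams. It is only a sketch of the idea ($group with $sum:1, then $match on the count, then $sort descending); the class GroupCountSketch and its method are invented for this illustration and do not touch MongoDB.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class, for illustration only.
class GroupCountSketch {

    // Counts occurrences per value ($group + $sum:1), keeps groups whose count is
    // at least minCount ($match), and orders them by count, largest first ($sort).
    static Map<Integer, Long> groupMatchSort(List<Integer> evals, long minCount) {
        Map<Integer, Long> counts = evals.stream()
                .collect(Collectors.groupingBy(e -> e, Collectors.counting()));
        return counts.entrySet().stream()
                .filter(entry -> entry.getValue() >= minCount)
                .sorted(Map.Entry.<Integer, Long>comparingByValue().reversed())
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new)); // LinkedHashMap keeps the sorted order
    }

    public static void main(String[] args) {
        List<Integer> evals = Arrays.asList(1, 0, 0, 1, 0, 2);
        // The value 2 occurs once, so a threshold of 2 filters it out.
        System.out.println(groupMatchSort(evals, 2));
    }
}
```

The point to notice is that the filter and the sort both work on the counts produced by the grouping step, which is why $match and $sort in the pipeline refer to docsNum rather than to a field of the original documents.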
　　
3. Another way to get the same result as testGroup1: enter Spring Data MongoDB, with a more concise and readable syntax

/**
 *
 * Purpose: another way to get the same result as testGroup1, using Spring Data MongoDB, with a more concise and readable syntax
 * Parameters:
 * Author: OnTheRoad_Lee
 * Modified by: OnTheRoad_Lee
 * Last modified: 2014-5-26
 */
@Test
public void testAggregation1() {
    // project, group, match and sort are static imports from org.springframework.data.mongodb.core.aggregation.Aggregation
    TypedAggregation<News> agg = Aggregation.newAggregation(
            News.class,
            project("evaluate")
            , group("evaluate").count().as("totalNum")
            , match(Criteria.where("totalNum").gte(85))
            , sort(Sort.Direction.DESC, "totalNum")
    );
    AggregationResults<BasicDBObject> result = mongoTemplate.aggregate(agg, BasicDBObject.class);
    System.out.println(agg.toString());
    // The executed command is much the same:
    // { "aggregate" : "__collection__" , "pipeline" : [ { "$project" : { "evaluate" : 1}} , { "$group" : { "_id" : "$evaluate" , "totalNum" : { "$sum" : 1}}} , { "$match" : { "totalNum" : { "$gte" : 85}}} , { "$sort" : { "totalNum" : -1}}]}
    System.out.println(result.getMappedResults());
    // The query result is short and clear:
    // [{ "_id" : 0 , "totalNum" : 10047}, { "_id" : 1 , "totalNum" : 9955}]
    // With this approach, if you have a class whose properties correspond one-to-one to the fields of the result set, Spring can map the results straight into it:
    // just replace BasicDBObject.class in mongoTemplate.aggregate(agg, BasicDBObject.class) with your own class.
    // That feels a little inflexible to me, though more likely I am just not experienced enough yet to see where its flexibility and benefits lie; I will come back to it later.
    // So for now the all-purpose BasicDBObject class is used to hold the returned results.
    List<BasicDBObject> resultList = result.getMappedResults();
    NewsNumDTO dto = new NewsNumDTO();
    for (BasicDBObject dbo : resultList) {
        int eval = dbo.getInt("_id");
        long num = dbo.getLong("totalNum");
        if (eval == 1) {
            dto.setPositiveNum(num);
        } else {
            dto.setNegativeNum(num);
        }
    }
    System.out.println(dto.getPositiveNum());
}
　　
4. Basic use of previousOperation: giving the grouped field (_id) an alias

/**
 *
 * Purpose: basic use of previousOperation, giving the grouped field (_id) an alias
 * Parameters:
 * Author: OnTheRoad_Lee
 * Modified by: OnTheRoad_Lee
 * Last modified: 2014-5-26
 */
@Test
public void testAggregation2() {
    TypedAggregation<News> agg = Aggregation.newAggregation(
            News.class,
//             match(Criteria.where("mediaType").is(100))
            project("evaluate")
            , group("evaluate").count().as("totalNum")
            , project("evaluate", "totalNum")
                    .and("eval").previousOperation()
            // gives the grouped field (_id) the alias "eval"
    );
    AggregationResults<BasicDBObject> result = mongoTemplate.aggregate(agg, BasicDBObject.class);
    System.out.println(agg.toString());
    // { "aggregate" : "__collection__" , "pipeline" : [ { "$project" : { "evaluate" : 1}} , { "$group" : { "_id" : "$evaluate" , "totalNum" : { "$sum" : 1}}} , { "$project" : { "evaluate" : "$_id.evaluate" , "totalNum" : 1 , "_id" : 0 , "eval" : "$_id"}}]}
    System.out.println(result.getMappedResults());
    // [{ "totalNum" : 10047 , "eval" : 0}, { "totalNum" : 9955 , "eval" : 1}]
}
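What the final $project stage in testAggregation2 does to each group result ("_id" : 0 together with "eval" : "$_id") can be imitated on a plain Map. This is just a sketch to show the reshaping; AliasIdSketch and aliasId are names made up here, not Spring Data API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class, for illustration only.
class AliasIdSketch {

    // Returns a copy of a group result in which the _id value is exposed under the
    // given alias and the _id key itself is dropped, like {"_id": 0, alias: "$_id"}.
    static Map<String, Object> aliasId(Map<String, Object> groupResult, String alias) {
        Map<String, Object> out = new LinkedHashMap<>(groupResult);
        Object id = out.remove("_id");
        out.put(alias, id);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("_id", 0);
        row.put("totalNum", 10047L);
        System.out.println(aliasId(row, "eval"));
    }
}
```

With an alias in place, code that consumes the results can read a meaningful key such as eval instead of having to know that the group key lives in _id.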
　　
5. Using unwind() through Spring Data MongoDB

/**
 *
 * Purpose: using unwind() through Spring Data MongoDB
 *       unwind() is the wrapper for the $unwind command.
 *       $unwind splits a document that contains an array into several documents. For example, if your document
 *       has an array field A with 10 elements, $unwind produces 10 documents that differ only in field A.
 *       See: http://my.oschina.net/GivingOnenessDestiny/blog/88006
 * Parameters:
 * Author: OnTheRoad_Lee
 * Modified by: OnTheRoad_Lee
 * Last modified: 2014-5-26
 */
@Test
public void testAggregation3() {
    TypedAggregation<News> agg = Aggregation.newAggregation(
            News.class,
            unwind("classKey")
            , project("evaluate", "classKey")
            // One thing to note: when grouping by two or more fields, the grouped fields in the result set no longer appear as _id but under their own field names (compare with testAggregation1())
            , group("evaluate", "classKey").count().as("totalNum")
            , sort(Sort.Direction.DESC, "totalNum")
    );
    AggregationResults<BasicDBObject> result = mongoTemplate.aggregate(agg, BasicDBObject.class);
    System.out.println(agg.toString());
    // { "aggregate" : "__collection__" , "pipeline" : [ { "$unwind" : "$classKey"} , { "$project" : { "evaluate" : 1 , "classKey" : 1}} , { "$group" : { "_id" : { "evaluate" : "$evaluate" , "classKey" : "$classKey"} , "totalNum" : { "$sum" : 1}}} , { "$sort" : { "totalNum" : -1}}]}
    System.out.println(result.getMappedResults());
    // The result looks like this and speaks for itself; what you do with it is up to you
    // [{ "evaluate" : 0 , "classKey" : "A201" , "totalNum" : 4857}, { "evaluate" : 0 , "classKey" : "23" , "totalNum" : 4857}, { "evaluate" : 0 , "classKey" : "A101" , "totalNum" : 4842}, { "evaluate" : 0 , "classKey" : "24" , "totalNum" : 4806}, { "evaluate" : 1 , "classKey" : "A101" , "totalNum" : 4787}, { "evaluate" : 0 , "classKey" : "C201" , "totalNum" : 4787}, { "evaluate" : 0 , "classKey" : "A102" , "totalNum" : 4783},……]
}
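The description of $unwind in the comment above (one array of N elements becomes N documents that differ only in that field) can be tried out on a plain Map. The sketch below is an in-memory imitation, not the driver's implementation; UnwindSketch and unwind are names invented for it, and it simply returns nothing when the field is missing or is not a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class, for illustration only.
class UnwindSketch {

    // Produces one copy of the document per element of the array field; in each
    // copy the array is replaced by a single element. An empty array yields nothing.
    static List<Map<String, Object>> unwind(Map<String, Object> doc, String arrayField) {
        List<Map<String, Object>> out = new ArrayList<>();
        Object value = doc.get(arrayField);
        if (!(value instanceof List)) {
            return out; // simplification: missing or non-list fields produce no output
        }
        for (Object element : (List<?>) value) {
            Map<String, Object> copy = new LinkedHashMap<>(doc);
            copy.put(arrayField, element);
            out.add(copy);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> news = new LinkedHashMap<>();
        news.put("evaluate", 1);
        news.put("classKey", Arrays.asList("A101", "B102", "23"));
        for (Map<String, Object> doc : unwind(news, "classKey")) {
            System.out.println(doc);
        }
        // A document with an empty classKey list disappears after unwinding.
        news.put("classKey", Collections.emptyList());
        System.out.println(unwind(news, "classKey").size());
    }
}
```

Worth keeping in mind with the test data from save(): a document whose classKey list happens to be empty contributes nothing after $unwind, so the totals per classKey count array elements, not documents.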
　　
6. Spring Data MongoDB: grouping by time (string)

/**
 *
 * Purpose: Spring Data MongoDB, grouping by time (string)
 * Parameters:
 * Author: OnTheRoad_Lee
 * Modified by: OnTheRoad_Lee
 * Last modified: 2014-5-26
 */
@Test
public void testAggregation4() {
    TypedAggregation<News> agg = Aggregation.newAggregation(
            News.class,
            // project().andExpression() takes an expression
            // See the reference documentation: http://docs.spring.io/spring-data/data-mongodb/docs/current/reference/htmlsingle/#mongo.aggregation
            // Search for .andExpression to find the relevant section
            project("evaluate")
                    .andExpression("substr(publishTimeStr,0,10)").as("publishDate")
            , group("evaluate", "publishDate").count().as("totalNum")
            , sort(Sort.Direction.DESC, "totalNum")
    );
    AggregationResults<BasicDBObject> result = mongoTemplate.aggregate(agg, BasicDBObject.class);
    System.out.println(agg.toString());
    // { "aggregate" : "__collection__" , "pipeline" : [ { "$project" : { "evaluate" : 1 , "publishDate" : { "$substr" : [ "$publishTimeStr" , 0 , 10]}}} , { "$group" : { "_id" : { "evaluate" : "$evaluate" , "publishDate" : "$publishDate"} , "totalNum" : { "$sum" : 1}}} , { "$sort" : { "totalNum" : -1}}]}
    System.out.println(result.getMappedResults());
    // [{ "evaluate" : 0 , "publishDate" : "2014-03-09" , "totalNum" : 101}, { "evaluate" : 1 , "publishDate" : "2014-02-14" , "totalNum" : 100}, { "evaluate" : 1 , "publishDate" : "2014-02-11" , "totalNum" : 99}, { "evaluate" : 0 , "publishDate" : "2014-03-17" , "totalNum" : 98}, { "evaluate" : 1 , "publishDate" : "2014-03-26" , "totalNum" : 98},  ……]
    // This result is not quite what we want. The ideal, easy-to-read shape would be one entry per date, with the positive and negative counts listed under it:
    // [
    // { "publishDate" : "2014-03-09" , "evalInfo" : [{"evaluate" : 0 , "totalNum" : 101}, {"evaluate" : 1 , "totalNum" : 44}]},
    // { "publishDate" : "2014-03-12" , "evalInfo" : [{"evaluate" : 0 , "totalNum" : 11}, {"evaluate" : 1 , "totalNum" : 32}]},
    // ……
    // ]
    // My skills being what they are, I went through plenty of material and forums, Chinese and English alike, and still could not find
    // how to do this kind of nested grouping with Spring Data MongoDB, which is what led to testAggregation5
}
　　
7. Grouping by time (string) with native MongoDB syntax, and merging the results with the $push command

/**
 *
 * Purpose: group by time (string) with native MongoDB syntax and merge the results with the $push command
 * Parameters:
 * Author: OnTheRoad_Lee
 * Modified by: OnTheRoad_Lee
 * Last modified: 2014-5-26
 */
@Test
public void testAggregation5() {
    /* Group stage: group by eval and by the first 10 characters (the date part) of ptimes */
    String groupStr = "{$group:{_id:{'evaluate':'$eval','ptimes':{$substr:['$ptimes',0,10]}},totalNum:{$sum:1}}}";
    DBObject group = (DBObject) JSON.parse(groupStr);
    /* Reshape the group result */
    DBObject projectFields = new BasicDBObject();
    projectFields.put("ptimes", "$_id.ptimes");
    projectFields.put("evalInfo", new BasicDBObject("evaluate", "$_id.evaluate").append("totalNum", "$totalNum"));
    DBObject project = new BasicDBObject("$project", projectFields);
    /* Push the per-evaluate results together under each date */
    DBObject groupAgainFields = new BasicDBObject("_id", "$ptimes");
    groupAgainFields.put("evalInfo", new BasicDBObject("$push", "$evalInfo"));
    DBObject reshapeGroup = new BasicDBObject("$group", groupAgainFields);
    /* Inspect the group result */
    AggregationOutput output = mongoTemplate.getCollection("news").aggregate(group, project, reshapeGroup);
    System.out.println(output.getCommand());
    // { "aggregate" : "news" , "pipeline" : [ { "$group" : { "_id" : { "evaluate" : "$eval" , "ptimes" : { "$substr" : [ "$ptimes" , 0 , 10]}} , "totalNum" : { "$sum" : 1}}} , { "$project" : { "ptimes" : "$_id.ptimes" , "evalInfo" : { "evaluate" : "$_id.evaluate" , "totalNum" : "$totalNum"}}} , { "$group" : { "_id" : "$ptimes" , "evalInfo" : { "$push" : "$evalInfo"}}}]}
    System.out.println(output.getCommandResult());
    // { "serverUsed" : "localhost/127.0.0.1:47017" , "result" : [
    // { "_id" : "2014-04-24" , "evalInfo" : [ { "evaluate" : 1 , "totalNum" : 67} , { "evaluate" : 0 , "totalNum" : 76}]}
    // , { "_id" : "2014-02-05" , "evalInfo" : [ { "evaluate" : 1 , "totalNum" : 70} , { "evaluate" : 0 , "totalNum" : 84}]}
    // , { "_id" : "2014-03-21" , "evalInfo" : [ { "evaluate" : 0 , "totalNum" : 82} , { "evaluate" : 1 , "totalNum" : 89}]}……]
    // , "ok" : 1.0}
}
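The pipeline in testAggregation5 does its work in two rounds: first count per (evaluate, date) pair, then regroup by date and push each pair's count into a list. The sketch below imitates that in memory so the shape of the result is easier to follow. It assumes rows given as {evaluate, time string} arrays; PushRegroupSketch and evalInfoByDay are hypothetical names, and sorted maps are used only to keep the printed order predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class, for illustration only.
class PushRegroupSketch {

    // Each row is {evaluate, "yyyy-MM-dd HH:mm:ss"}.
    static Map<String, List<Map<String, Object>>> evalInfoByDay(List<String[]> rows) {
        // Round 1: count per (day, evaluate); the day is the first 10 characters of
        // the time string, the same idea as {$substr: ['$ptimes', 0, 10]}.
        Map<String, Map<String, Long>> counts = new TreeMap<>();
        for (String[] row : rows) {
            String day = row[1].substring(0, 10);
            counts.computeIfAbsent(day, d -> new TreeMap<>()).merge(row[0], 1L, Long::sum);
        }
        // Round 2: one entry per day, with every (evaluate, totalNum) pair appended
        // to a list, which is what $push does in the second $group.
        Map<String, List<Map<String, Object>>> result = new TreeMap<>();
        for (Map.Entry<String, Map<String, Long>> dayEntry : counts.entrySet()) {
            List<Map<String, Object>> evalInfo = new ArrayList<>();
            for (Map.Entry<String, Long> e : dayEntry.getValue().entrySet()) {
                Map<String, Object> item = new LinkedHashMap<>();
                item.put("evaluate", Integer.parseInt(e.getKey()));
                item.put("totalNum", e.getValue());
                evalInfo.add(item);
            }
            result.put(dayEntry.getKey(), evalInfo);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[]{"0", "2014-03-09 10:00:00"},
                new String[]{"0", "2014-03-09 23:59:59"},
                new String[]{"1", "2014-03-09 01:00:00"},
                new String[]{"1", "2014-03-12 08:30:00"});
        System.out.println(evalInfoByDay(rows));
    }
}
```

In the real pipeline the order of the dates and of the entries inside evalInfo is not guaranteed unless a $sort stage is added, which is why the sample output above lists the dates in no particular order.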
　　
