// Output (serialization)
public void write( DataOutput out ) throws IOException {
    BSONEncoder enc = new BasicBSONEncoder();
    BasicOutputBuffer buf = new BasicOutputBuffer();
    enc.set( buf );
    …………
}
// Input (deserialization)
public void readFields( DataInput in ) throws IOException {
    BSONDecoder dec = new BasicBSONDecoder();
    BSONCallback cb = new BasicBSONCallback();
    // Read the BSON length from the start of the record,
    // i.e. the length of the byte stream
    byte[] l = new byte[4];
    try {
        in.readFully( l );
        …………
        byte[] data = new byte[dataLen + 4];
        System.arraycopy( l, 0, data, 0, 4 );
        in.readFully( data, 4, dataLen - 4 );
        dec.decode( data, cb );
        _doc = (BSONObject) cb.get();
    ………………
}
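The four bytes read into `l` above are the little-endian int32 length prefix that every BSON document starts with; the length counts the whole document, prefix included, which is why the remaining read is `dataLen - 4`. A minimal, stdlib-only sketch of decoding that prefix (the driver itself uses its own `Bits.readInt` helper):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BsonLengthDemo {
    // Decode the 4-byte little-endian int32 that prefixes a BSON document.
    static int readBsonLength(byte[] prefix) {
        return ByteBuffer.wrap(prefix).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        // 0x2A = 42: a document whose total size, prefix included, is 42 bytes.
        byte[] prefix = {0x2A, 0x00, 0x00, 0x00};
        System.out.println(readBsonLength(prefix)); // prints 42
    }
}
```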
public class MongoTestMapper extends Mapper<Object, BSONObject, IntWritable, DoubleWritable>

The Mapper processes the key/value pairs read from MongoDB and emits them to the Reducer for aggregation. Note the type of the value.
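One detail worth calling out before the map method: `java.util.Date.getYear()` returns the year minus 1900 (and uses the default time zone), so the full year has to be reconstructed by adding 1900 back. A quick stdlib sketch of that offset (the epoch value below is just an illustration):

```java
import java.util.Date;
import java.util.TimeZone;

public class YearOffsetDemo {
    public static void main(String[] args) {
        // Pin the default time zone so getYear() is deterministic here.
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        Date d = new Date(1325376000000L); // 2012-01-01T00:00:00Z
        System.out.println(d.getYear());        // prints 112 (years since 1900)
        System.out.println(d.getYear() + 1900); // prints 2012
    }
}
```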
public void map( final Object pkey, final BSONObject pvalue, final Context context )
{
    // Date.getYear() returns the year minus 1900, so add 1900 back
    final int year = ((Date) pvalue.get("_id")).getYear() + 1900;
    double bdyear = ((Number) pvalue.get("bc10Year")).doubleValue();
    try {
        context.write( new IntWritable( year ), new DoubleWritable( bdyear ) );
    } catch (IOException e) {
        e.printStackTrace();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

The Reducer receives the key/value pairs emitted by the Mapper,
public class MongoTestReducer extends Reducer<IntWritable, DoubleWritable, IntWritable, BSONWritable>
performs the aggregation, and writes the result back to MongoDB. Note that the output value type is BSONWritable.
public void reduce( final IntWritable pKey,
                    final Iterable<DoubleWritable> pValues,
                    final Context pContext ) throws IOException, InterruptedException {
    int count = 0;
    double sum = 0.0;
    for ( final DoubleWritable value : pValues ) {
        sum += value.get();
        count++;
    }
    final double avg = sum / count;
    BasicBSONObject out = new BasicBSONObject();
    out.put( "avg", avg );
    pContext.write( pKey, new BSONWritable( out ) );
}
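With this reduce method, each year ends up as one document in the output collection: the mongo-hadoop connector stores the output key as the document's `_id`, and the `BasicBSONObject` supplies the remaining fields. An illustrative result document (the numeric value is made up, not from the post):

```json
{ "_id" : 1990, "avg" : 8.552 }
```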
hadoop jar mongotest.jar org.ventlam.MongoTestJob
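The command above assumes a driver class that wires the Mapper and Reducer to the Mongo input/output formats. The original driver is not shown in this excerpt; the sketch below is a reconstruction based on the mongo-hadoop connector's usual job setup, and the collection URIs are placeholders you would replace with your own:

```java
package org.ventlam; // matches the class name passed to `hadoop jar`

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import com.mongodb.hadoop.MongoInputFormat;
import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.io.BSONWritable;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class MongoTestJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder URIs: point these at your own input/output collections.
        MongoConfigUtil.setInputURI(conf, "mongodb://localhost/demo.yield_historical.in");
        MongoConfigUtil.setOutputURI(conf, "mongodb://localhost/demo.yield_historical.out");

        Job job = new Job(conf, "mongo test");
        job.setJarByClass(MongoTestJob.class);
        job.setMapperClass(MongoTestMapper.class);
        job.setReducerClass(MongoTestReducer.class);
        // Map output and final output use different value types.
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(DoubleWritable.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(BSONWritable.class);
        // Read from and write to MongoDB instead of HDFS.
        job.setInputFormatClass(MongoInputFormat.class);
        job.setOutputFormatClass(MongoOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Running it requires a live MongoDB instance and the mongo-hadoop and driver jars on the job classpath, so treat this as a configuration sketch rather than a tested program.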
From now on I will publish the source code for my blog posts at https://github.com/ventlam/BlogDemo; the code for this article is in the mongohadoop folder.
This work by VentLam is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 China Mainland License.