dopost posted on 2015-11-27 19:41:32

Big Data Tutorial Series: Distributed Log Collection with Flume

10. Distributed log collection with Flume

  Launch an agent with:

  ./flume-ng agent --conf /home/hadoop/flume140cdh501/conf --conf-file /home/hadoop/flume140cdh501/conf/exec1 --name a1 -Dflume.root.logger=DEBUG,console

  Note: once HDFS runs in HA mode, copy Hadoop's hdfs-site.xml into Flume's conf directory and write hdfs.path with the HA nameservice (federation-style) prefix, e.g. hdfs://hadoopCluster/..., instead of a single NameNode address.
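  For the HA case, a minimal sketch of the client-side properties hdfs-site.xml must carry (the nameservice name hadoopCluster matches the hdfs.path values used below; the NameNode ids nn1/nn2 and hosts node1/node2 are assumptions to adjust to the actual cluster):

  <property><name>dfs.nameservices</name><value>hadoopCluster</value></property>
  <property><name>dfs.ha.namenodes.hadoopCluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.hadoopCluster.nn1</name><value>node1:8020</value></property>
  <property><name>dfs.namenode.rpc-address.hadoopCluster.nn2</name><value>node2:8020</value></property>
  <property><name>dfs.client.failover.proxy.provider.hadoopCluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>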
  
  1. spooldir source configuration
  # agent1 is the agent name
  agent1.sources=source1
  agent1.sinks=sink1
  agent1.channels=channel1
  
  
  # Configure source1
  agent1.sources.source1.type=spooldir
  agent1.sources.source1.spoolDir=/home/hadoop/flume150cdh512/yth
  agent1.sources.source1.channels=channel1
  agent1.sources.source1.fileHeader = false
  
  # Configure sink1
  agent1.sinks.sink1.type=hdfs
  agent1.sinks.sink1.hdfs.path=hdfs://hadoopCluster/yth
  agent1.sinks.sink1.hdfs.fileType=DataStream
  agent1.sinks.sink1.hdfs.writeFormat=TEXT
  agent1.sinks.sink1.hdfs.rollInterval=4
  agent1.sinks.sink1.channel=channel1
  
  # Configure channel1
  agent1.channels.channel1.type=file
  agent1.channels.channel1.checkpointDir=/home/hadoop/flume150cdh512/yth_tmp123
  agent1.channels.channel1.dataDirs=/home/hadoop/flume150cdh512/yth_tmp
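
  To try the spool agent, save the block above to a file (say conf/spool1, a hypothetical name) and start flume-ng against it; any file dropped into the spool directory should appear under hdfs://hadoopCluster/yth within a few seconds (rollInterval=4). A minimal sketch:

  ./flume-ng agent --conf /home/hadoop/flume150cdh512/conf --conf-file /home/hadoop/flume150cdh512/conf/spool1 --name agent1 -Dflume.root.logger=DEBUG,console
  # In another terminal: drop a file into the spool dir (Flume renames it with a .COMPLETED suffix when done)
  echo "hello flume" > /home/hadoop/flume150cdh512/yth/test.log
  hdfs dfs -ls /yth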
  2. exec source configuration
  
  a1.sources = r1
  a1.sinks = k1
  a1.channels = c1
  # Describe/configure the source
  a1.sources.r1.type = exec
  a1.sources.r1.channels = c1
  a1.sources.r1.command = tail -F /home/hadoop/flume140cdh501/log.log
  # Configure sink1
  a1.sinks.k1.type=hdfs
  a1.sinks.k1.hdfs.path=hdfs://hadoopCluster/exx
  a1.sinks.k1.hdfs.fileType=DataStream
  a1.sinks.k1.hdfs.writeFormat=TEXT
  a1.sinks.k1.hdfs.rollInterval=4
  a1.sinks.k1.channel=c1

  # Use a file channel, which buffers events durably on disk
  a1.channels.c1.type = file
  a1.channels.c1.checkpointDir=/home/hadoop/flume140cdh501/yth_tmp123
  a1.channels.c1.dataDirs=/home/hadoop/flume140cdh501/yth_tmp
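
  To exercise the exec agent (the exec1 config-file name matches the launch command at the top of this section), append to the tailed file and check HDFS; a sketch:

  echo "exec source test" >> /home/hadoop/flume140cdh501/log.log
  # Events should land under /exx shortly afterwards
  hdfs dfs -cat /exx/*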
  3. Kafka sink configuration

  a1.sources = r1
  a1.sinks = k1
  a1.channels = c1
  # Describe/configure the source
  a1.sources.r1.type = exec
  a1.sources.r1.channels = c1
  a1.sources.r1.command = tail -F /home/hadoop/flume140cdh501/log.log

  # Configure sink1 (a third-party Kafka sink plugin; Flume 1.4 ships no built-in Kafka sink)
  a1.sinks.k1.type = org.apache.flume.plugins.KafkaSink
  a1.sinks.k1.metadata.broker.list=node6:9092,node7:9093,node8:9094
  a1.sinks.k1.partition.key=0
  a1.sinks.k1.partitioner.class=org.apache.flume.plugins.SinglePartition
  a1.sinks.k1.serializer.class=kafka.serializer.StringEncoder
  a1.sinks.k1.request.required.acks=-1
  #a1.sinks.k1.max.message.size=10000
  #a1.sinks.k1.producer.type=sync
  #a1.sinks.k1.custom.encoding=UTF-8
  a1.sinks.k1.custom.topic.name=mytopic

  a1.sinks.k1.channel=c1
  # Use a channel which buffers events in memory
  a1.channels.c1.type = memory
  a1.channels.c1.capacity = 1000
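
  To verify end-to-end delivery, run a console consumer and append to the tailed file. This sketch assumes Kafka 0.8-era tooling (which this plugin targets) and a ZooKeeper node at node6:2181; adjust both to the actual deployment:

  bin/kafka-console-consumer.sh --zookeeper node6:2181 --topic mytopic --from-beginning
  # In another terminal:
  echo "kafka sink test" >> /home/hadoop/flume140cdh501/log.log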
  

