xiyou posted on 2015-11-27 21:06:32

Using Flume to collect server logs into HDFS

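  Logs from multiple log servers are collected onto the HDFS server. The Flume sink on each log server and the Flume source on the HDFS server are both of type avro; the Flume sink on the HDFS server is of type hdfs.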
  

  
  Flume deployment guide
  System requirements:
  Java runtime environment

  
  Deployment:
  Download and unpack the Flume package on both the log servers and the HDFS server.
  
  Download the Flume package and unpack it:
  http://mirror.bit.edu.cn/apache/flume/1.4.0/apache-flume-1.4.0-bin.tar.gz
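The download step could be scripted roughly like this. It is only a sketch: the URL is the one given above (the mirror may no longer host this release), `install_flume` is a hypothetical helper name, and it assumes the tarball unpacks into a directory named after the archive.

```shell
#!/bin/sh
# URL copied from the post; swap in another Apache mirror if this one is gone.
FLUME_URL="http://mirror.bit.edu.cn/apache/flume/1.4.0/apache-flume-1.4.0-bin.tar.gz"

# File name of the tarball and the directory it is assumed to unpack into.
FLUME_TGZ="$(basename "$FLUME_URL")"
FLUME_DIR="${FLUME_TGZ%.tar.gz}"

# Hypothetical helper: download, unpack, and enter the Flume directory.
# Nothing is fetched until the function is called.
install_flume() {
    wget "$FLUME_URL" &&
    tar -xzf "$FLUME_TGZ" &&
    cd "$FLUME_DIR"
}
```

Run `install_flume` once on every log server and once on the HDFS server.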
  
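Log server Flume configuration
  Enter the unpacked Flume directory and edit the configuration files:
  1.      cp conf/flume-env.sh.template conf/flume-env.sh
  In conf/flume-env.sh, add:
  JAVA_HOME="JAVA HOME DIR"
  
  2.      Create the configuration file flume.conf in the conf directory with the following content,
  changing the value of agenta.sources.loggerSource.command to point at your log file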
  
  # Licensed to the Apache Software Foundation (ASF) under one
  # or more contributor license agreements.  See the NOTICE file
  # distributed with this work for additional information
  # regarding copyright ownership.  The ASF licenses this file
  # to you under the Apache License, Version 2.0 (the
  # "License"); you may not use this file except in compliance
  # with the License.  You may obtain a copy of the License at
  #
  # http://www.apache.org/licenses/LICENSE-2.0
  #
  # Unless required by applicable law or agreed to in writing,
  # software distributed under the License is distributed on an
  # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  # KIND, either express or implied.  See the License for the
  # specific language governing permissions and limitations
  # under the License.
  
  
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agenta'
  
  agenta.sources = loggerSource
  agenta.channels = memoryChannel
  agenta.sinks = loggerSink
  
  # For each one of the sources, the type is defined
  agenta.sources.loggerSource.type = exec
  agenta.sources.loggerSource.command = tail -F <logpath>
  
  # The channel can be defined as follows.
  agenta.sources.loggerSource.channels = memoryChannel
  
  # Each sink's type must be defined
  agenta.sinks.loggerSink.type = avro
  agenta.sinks.loggerSink.hostname = <hdfs serverip>
  agenta.sinks.loggerSink.port = 4141
  
  # Specify the channel the sink should use
  agenta.sinks.loggerSink.channel = memoryChannel
  
  # Each channel's type is defined.
  agenta.channels.memoryChannel.type = memory
  
  # Other config values specific to each type of channel (sink or source)
  # can be defined as well
  # In this case, it specifies the capacity of the memory channel
  agenta.channels.memoryChannel.capacity = 1000
  
  Start command (the value of --name must match the agent name used in flume.conf, here agenta):
  ./bin/flume-ng agent --conf conf/ --conf-file conf/flume.conf --name agenta
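Flume only loads the properties whose key prefix equals the value passed to --name, so a mismatched name leaves the agent running with no source, channel, or sink. A small guard along these lines can catch that before starting. It is a sketch: `agent_names` and `start_agent` are hypothetical helper names, and the parsing assumes one property per line, as in the config above.

```shell
#!/bin/sh
# Print the distinct agent names found in a Flume properties file.
# The agent name is the part of each property key before the first dot;
# comment lines (starting with '#') and blank lines are skipped.
agent_names() {
    awk -F. '/^[ \t]*[A-Za-z0-9_-]+\./ { sub(/^[ \t]+/, "", $1); print $1 }' "$1" | sort -u
}

# Start the agent only if the requested name is defined in the config file.
start_agent() {
    name="$1"
    conf_file="$2"
    if agent_names "$conf_file" | grep -qx "$name"; then
        ./bin/flume-ng agent --conf conf/ --conf-file "$conf_file" --name "$name"
    else
        echo "agent '$name' is not defined in $conf_file" >&2
        return 1
    fi
}
```

For the log server config above, `start_agent agenta conf/flume.conf` would start Flume, while `start_agent agent conf/flume.conf` would stop with an error.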
  
  
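HDFS server Flume configuration
  Enter the unpacked Flume directory and edit the configuration files:
  1.      cp conf/flume-env.sh.template conf/flume-env.sh
  In conf/flume-env.sh, add:
  JAVA_HOME="JAVA HOME DIR"
  HADOOP_HOME="HADOOP HOME"
  
  2.      Create the configuration file flume.conf in the conf directory with the following content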
  
  
  # Licensed to the Apache Software Foundation (ASF) under one
  # or more contributor license agreements.  See the NOTICE file
  # distributed with this work for additional information
  # regarding copyright ownership.  The ASF licenses this file
  # to you under the Apache License, Version 2.0 (the
  # "License"); you may not use this file except in compliance
  # with the License.  You may obtain a copy of the License at
  #
  # http://www.apache.org/licenses/LICENSE-2.0
  #
  # Unless required by applicable law or agreed to in writing,
  # software distributed under the License is distributed on an
  # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  # KIND, either express or implied.  See the License for the
  # specific language governing permissions and limitations
  # under the License.
  
  
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  
  agent.sources = loggerSource
  agent.channels = memoryChannel
  agent.sinks = loggerSink
  
  # For each one of the sources, the type is defined
  agent.sources.loggerSource.type = avro
  agent.sources.loggerSource.bind = 0.0.0.0
  agent.sources.loggerSource.port = 4141
  
  # The channel can be defined as follows.
  agent.sources.loggerSource.channels = memoryChannel
  
  # Each sink's type must be defined
  agent.sinks.loggerSink.type = hdfs
  agent.sinks.loggerSink.hdfs.path = <hdfs sink path>
  agent.sinks.loggerSink.hdfs.filePrefix = csplog-
  agent.sinks.loggerSink.hdfs.rollInterval = 86400
  agent.sinks.loggerSink.hdfs.rollSize = 0
  agent.sinks.loggerSink.hdfs.rollCount = 0
  agent.sinks.loggerSink.hdfs.fileType = DataStream
  
  # Specify the channel the sink should use
  agent.sinks.loggerSink.channel = memoryChannel
  
  # Each channel's type is defined.
  agent.channels.memoryChannel.type = memory
  
  # Other config values specific to each type of channel (sink or source)
  # can be defined as well
  # In this case, it specifies the capacity of the memory channel
  agent.channels.memoryChannel.capacity = 1000
  
  
  Start command:
  ./bin/flume-ng agent --conf conf/ --conf-file conf/flume.conf --name agent
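Once the HDFS-side agent is up, one way to check the pipeline end to end is to push a test line into its Avro source with Flume's avro-client and then list the sink directory. This is a sketch only: `smoke_test` is a hypothetical helper, it assumes it is run from the unpacked Flume directory with `hadoop` on the PATH, and the host, port, and path arguments are whatever you configured above. With rollInterval = 86400 the file being written normally keeps the sink's default in-use suffix (.tmp) until it is rolled, so expect a .tmp file at first.

```shell
#!/bin/sh
# Hypothetical helper: send one test event to the collector's Avro source,
# then list the HDFS directory the hdfs sink writes to.
# Arguments: <collector host> <avro port> <hdfs sink path>
smoke_test() {
    host="$1"
    port="$2"
    hdfs_path="$3"
    sample="$(mktemp)"
    echo "flume smoke test $(date)" > "$sample"
    # avro-client reads the file and sends each line as one event.
    ./bin/flume-ng avro-client --conf conf -H "$host" -p "$port" -F "$sample" &&
    hadoop fs -ls "$hdfs_path"
    status=$?
    rm -f "$sample"
    return $status
}
```

For example, `smoke_test <hdfs serverip> 4141 <hdfs sink path>`, using the same values as in the two flume.conf files.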