Please refer to the Flume User Guide first:
http://flume.apache.org/FlumeUserGuide.html
and to the Cloudera Flume blog posts:
http://blog.cloudera.com/blog/category/flume/
How to define JAVA_HOME and the Java options, and how to add our custom libraries to flume-ng.
All of these settings are defined in FLUME_CONF_DIR/flume-env.sh, where FLUME_CONF_DIR is the configuration directory passed to flume-ng with --conf.
For example:
JAVA_HOME=/opt/java
JAVA_OPTS="-Xms200m -Xmx200m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=3669 -Dflume.called.from.service"
FLUME_CLASSPATH=/opt/sponge/flume/lib/*
How to start flume-ng as an agent
Please note that the agent must be named hostname_agent. The value passed to --name has to match the name used as the property prefix in flume-conf-agent.properties.
$ /usr/lib/flume/bin/flume-ng agent --conf /opt/sponge/flume/config/ --conf-file /opt/sponge/flume/conf/flume-conf-agent.properties --name hostname_agent &
How to start flume-ng as a collector
Please note that the collector must be named hostname_collector. The value passed to --name has to match the name used as the property prefix in flume-conf-collector.properties.
$ /usr/lib/flume/bin/flume-ng agent --conf /opt/sponge/flume/config/ --conf-file /opt/sponge/flume/conf/flume-conf-collector.properties --name hostname_collector &
How to define the Flume agent and Flume collector property files.
I’ve already committed two property files to https://svn.nam.nsroot.net:9050/svn/153299/elf/sponge-branches/2013-03-14-FlumeNG/sponge/myflumeng/config
Please refer to flume-conf-agent.properties and flume-conf-collector.properties.
The basic naming conventions are:
1) each agent is named hostname_agent
2) each collector is named hostname_collector
3) the sources are named source1, source2, source3, ...
4) the sinks are named avroSink1, avroSink2, avroSink3, ...
5) each source's interceptors are named interceptor1, interceptor2, interceptor3, ... (Flume attaches interceptors to sources, not to sinks)
6) all agent sinks are Avro sinks
7) the default collector source is an Avro source
8) agent sinks are load balanced round robin
9) the file channel is the default for both agent and collector
# Agent: for each one of the sources, the type is defined
hostname_agent.sources.source1.type = exec
hostname_agent.sources.source1.command = tail -F /var/log/audit/audit.log
hostname_agent.sources.source1.channels = fileChannel
hostname_agent.sources.source1.batchSize = 10
# For each one of the sources, the log interceptor is defined
hostname_agent.sources.source1.interceptors = logIntercept1
hostname_agent.sources.source1.interceptors.logIntercept1.type = com.citi.sponge.flume.sink.LogInterceptor$Builder
hostname_agent.sources.source1.interceptors.logIntercept1.preserveExisting = false
hostname_agent.sources.source1.interceptors.logIntercept1.hostName = hostname
hostname_agent.sources.source1.interceptors.logIntercept1.env = PROD
hostname_agent.sources.source1.interceptors.logIntercept1.logType = AUDIT_LOG
hostname_agent.sources.source1.interceptors.logIntercept1.appId = 111111
hostname_agent.sources.source1.interceptors.logIntercept1.logFilePath = /var/log/audit
hostname_agent.sources.source1.interceptors.logIntercept1.logFileName = audit.log
# Agent: for each one of the sinks, the type is defined
hostname_agent.sinks.avroSink1.type = avro
hostname_agent.sinks.avroSink1.hostname = collector1
hostname_agent.sinks.avroSink1.port = 1442
# The Avro sink reads its batch size from "batch-size" (not "batchSize")
hostname_agent.sinks.avroSink1.batch-size = 10
hostname_agent.sinks.avroSink1.channel = fileChannel
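The agent excerpt above refers to fileChannel and to a single sink, but it does not show the component declarations, the channel definition, or the round-robin load balancing from conventions 8 and 9. A minimal sketch of those missing pieces could look like the following. It is not a copy of the committed flume-conf-agent.properties: the name sinkGroup1, the host collector2, and the two channel directories are hypothetical and only serve as illustration.

```properties
# Sketch only: declare the components that the agent excerpt relies on.
hostname_agent.sources = source1
hostname_agent.channels = fileChannel
hostname_agent.sinks = avroSink1 avroSink2

# File channel (convention 9). Both directories are hypothetical paths;
# if omitted, Flume falls back to directories under ~/.flume/file-channel.
hostname_agent.channels.fileChannel.type = file
hostname_agent.channels.fileChannel.checkpointDir = /opt/sponge/flume/file-channel/checkpoint
hostname_agent.channels.fileChannel.dataDirs = /opt/sponge/flume/file-channel/data

# A second Avro sink so there is something to balance across.
# collector2 is a hypothetical host name.
hostname_agent.sinks.avroSink2.type = avro
hostname_agent.sinks.avroSink2.hostname = collector2
hostname_agent.sinks.avroSink2.port = 1442
hostname_agent.sinks.avroSink2.channel = fileChannel

# Round-robin load balancing across the agent sinks (convention 8).
# sinkGroup1 is a hypothetical group name.
hostname_agent.sinkgroups = sinkGroup1
hostname_agent.sinkgroups.sinkGroup1.sinks = avroSink1 avroSink2
hostname_agent.sinkgroups.sinkGroup1.processor.type = load_balance
hostname_agent.sinkgroups.sinkGroup1.processor.selector = round_robin
hostname_agent.sinkgroups.sinkGroup1.processor.backoff = true
```

With processor.backoff enabled, a sink that fails is temporarily taken out of the rotation, so events keep flowing to the remaining collector.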
# Collector: for each one of the sources, the type is defined
hostname_collector.sources.avroSource.channels = fileChannel
hostname_collector.sources.avroSource.type = avro
hostname_collector.sources.avroSource.bind = hostname
hostname_collector.sources.avroSource.port = 1442
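The collector excerpt above shows only the Avro source. A collector also needs its components declared, a channel definition, and at least one sink to drain the channel. The sketch below fills those in under stated assumptions: the logger sink is only a placeholder, since these notes do not say which sink the collector really uses, and the name loggerSink and the channel directories are hypothetical.

```properties
# Sketch only: declare the components of the collector.
hostname_collector.sources = avroSource
hostname_collector.channels = fileChannel
hostname_collector.sinks = loggerSink

# File channel (convention 9). Both directories are hypothetical paths and
# must differ from the agent's if both run on the same host.
hostname_collector.channels.fileChannel.type = file
hostname_collector.channels.fileChannel.checkpointDir = /opt/sponge/flume/collector-channel/checkpoint
hostname_collector.channels.fileChannel.dataDirs = /opt/sponge/flume/collector-channel/data

# Placeholder sink (assumption): logs each event at INFO level.
# Replace it with the sink from the committed flume-conf-collector.properties.
hostname_collector.sinks.loggerSink.type = logger
hostname_collector.sinks.loggerSink.channel = fileChannel
```

The port of the collector's Avro source (1442 here) has to match the port configured on the agents' Avro sinks.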