4. Sample configuration and Client
What I am seeing is that for every event I send, a new file is created in HDFS. I was expecting the file handle to keep writing to the existing file until it rolled over as specified in the config. Am I doing something wrong?
12/06/15 17:28:52 INFO hdfs.BucketWriter: Creating hdfs://dsdb1:54310/flume/'dslg1'/FlumeData.1339806027956.tmp
12/06/15 17:28:52 INFO hdfs.BucketWriter: Renaming hdfs://dsdb1:54310/flume/'dslg1'/FlumeData.1339806027956.tmp to hdfs://dsdb1:54310/flume/'dslg1'/FlumeData.1339806027956
12/06/15 17:28:52 INFO hdfs.BucketWriter: Creating hdfs://dsdb1:54310/flume/'dslg1'/FlumeData.1339806027957.tmp
12/06/15 17:28:52 INFO hdfs.BucketWriter: Renaming hdfs://dsdb1:54310/flume/'dslg1'/FlumeData.1339806027957.tmp to hdfs://dsdb1:54310/flume/'dslg1'/FlumeData.1339806027957
12/06/15 17:28:52 INFO hdfs.BucketWriter: Creating hdfs://dsdb1:54310/flume/'dslg1'/FlumeData.1339806027958.tmp
foo.sources = avroSrc
foo.channels = memoryChannel
foo.sinks = hdfsSink
# For each one of the sources, the type is defined
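The fragment above only names the components. A fuller wiring might look like the following sketch; the bind address, port, and HDFS path are placeholders (the path echoes the one in the logs), not values from the original post:

```
# Hypothetical completion of the wiring above; values are placeholders.
foo.sources.avroSrc.type = avro
foo.sources.avroSrc.bind = 0.0.0.0
foo.sources.avroSrc.port = 41414
foo.sources.avroSrc.channels = memoryChannel

foo.channels.memoryChannel.type = memory

foo.sinks.hdfsSink.type = hdfs
foo.sinks.hdfsSink.channel = memoryChannel
foo.sinks.hdfsSink.hdfs.path = hdfs://dsdb1:54310/flume/dslg1
```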
Map<String, String> headers = new HashMap<String, String>();
headers.put("host", hostName);
event.setHeaders(headers);
try {
    rpcClient.append(event);
} catch (EventDeliveryException e) {
    // delivery failed: rebuild the RPC client; the failed event is not retried
    connect();
}
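The try/catch above is a reconnect-on-failure pattern. Factored out as a self-contained sketch (`FlumeClient` here is a stand-in interface for illustration, not the real `org.apache.flume.api.RpcClient`):

```java
import java.util.HashMap;
import java.util.Map;

public class RetryingSender {
    // Stand-in for the real Flume RpcClient; append may fail with an exception.
    interface FlumeClient {
        void append(Map<String, String> headers, String body) throws Exception;
    }

    private final FlumeClient client;
    private final Runnable reconnect;

    RetryingSender(FlumeClient client, Runnable reconnect) {
        this.client = client;
        this.reconnect = reconnect;
    }

    /** Returns true if the event was delivered; on failure, reconnects and drops it. */
    boolean send(String hostName, String body) {
        Map<String, String> headers = new HashMap<String, String>();
        headers.put("host", hostName); // same header the original client sets
        try {
            client.append(headers, body);
            return true;
        } catch (Exception e) {
            reconnect.run(); // rebuild the client, as connect() does in the snippet
            return false;    // the failed event is dropped, not re-sent
        }
    }
}
```

Note that, as in the original snippet, the event that triggered the exception is lost; a production client would re-append it after reconnecting.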
@Test
public void testAvroClient() throws InterruptedException {
    AvroClient aClient = new AvroClient();
    int i = 0;
    int j = 500;
    while (i++ < j) {
        aClient.sendDataToFlume("Hello");
        if (i == j / 2) {
            // Thread.sleep(30000);
        }
    }
}
}
After I changed my config it worked. It looks like Flume creates a new file for whichever roll condition is met first. Since rollCount defaults to 10, it was creating a new file every ten events. But I think this causes a lot of problems, because I now need to keep track of and estimate all of these variables. I think it should roll based only on what's specified in the config: if I only specify rollSize, it shouldn't consider any other conditions in its logic for creating a new file.
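The exact working configuration isn't shown in the post, but to roll only on file size the HDFS sink's other triggers can be set to 0, which disables them; a sketch (the 64 MB size is an arbitrary example):

```
# Roll only when the file reaches ~64 MB.
# rollCount = 0 disables event-count rolling (default is 10).
# rollInterval = 0 disables time-based rolling (default is 30 seconds).
foo.sinks.hdfsSink.hdfs.rollSize = 67108864
foo.sinks.hdfsSink.hdfs.rollCount = 0
foo.sinks.hdfsSink.hdfs.rollInterval = 0
```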