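This note is based on Hadoop 2.7.3 and Apache Flume 1.8.0. The Flume source is netcat, the channel is memory, and the sink is hdfs.
1. Configure the Flume agent file
Configure a Flume agent, named shaman here. The configuration file (netcat-memory-hdfs.conf) is as follows: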
# Identify the components on agent shaman:
shaman.sources = netcat_s1
shaman.sinks = hdfs_w1
shaman.channels = in-mem_c1

# Configure the source:
shaman.sources.netcat_s1.type = netcat
shaman.sources.netcat_s1.bind = localhost
shaman.sources.netcat_s1.port = 44444

# Describe the sink:
shaman.sinks.hdfs_w1.type = hdfs
shaman.sinks.hdfs_w1.hdfs.path = hdfs://localhost:8020/user/root/test
shaman.sinks.hdfs_w1.hdfs.writeFormat = Text
shaman.sinks.hdfs_w1.hdfs.fileType = DataStream

# Configure a channel that buffers events in memory:
shaman.channels.in-mem_c1.type = memory
shaman.channels.in-mem_c1.capacity = 20000
shaman.channels.in-mem_c1.transactionCapacity = 100

# Bind the source and sink to the channel:
shaman.sources.netcat_s1.channels = in-mem_c1
shaman.sinks.hdfs_w1.channel = in-mem_c1
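A common slip in this kind of file is the binding at the end: a source uses the plural key channels, a sink uses the singular key channel, and both must name a channel declared on the agent. The following is a minimal sketch of a wiring check, not part of Flume itself; parse_properties and check_wiring are hypothetical helper names, and the embedded sample repeats only the wiring lines of the shaman agent.

```python
# Hypothetical helper, not a Flume API: checks that every source and sink
# of an agent is bound to a channel that the agent actually declares.

SAMPLE = """
shaman.sources = netcat_s1
shaman.sinks = hdfs_w1
shaman.channels = in-mem_c1
shaman.sources.netcat_s1.channels = in-mem_c1
shaman.sinks.hdfs_w1.channel = in-mem_c1
"""


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of human-readable problems; an empty list means OK."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    # Sources may feed several channels, so the key is plural.
    for source in props.get(f"{agent}.sources", "").split():
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source} has no channels")
        for channel in bound:
            if channel not in declared:
                problems.append(f"source {source} uses undeclared channel {channel}")
    # A sink drains exactly one channel, so the key is singular.
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel", "")
        if channel not in declared:
            problems.append(f"sink {sink} uses undeclared channel {channel!r}")
    return problems
```

Calling check_wiring(parse_properties(SAMPLE), "shaman") on the sample should report no problems; pointing it at the text of a real configuration file works the same way.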
Note: in hdfs://localhost:8020/user/root/test, hdfs://localhost:8020 is the value of the fs.defaultFS property in the Hadoop configuration file core-site.xml, and root is the Hadoop login user.
2. Start the Flume agent
bin/flume-ng agent -f agent/netcat-memory-hdfs.conf -n shaman -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true
3. Open a telnet client and type some letters to test
telnet localhost 44444
Then type some text. The netcat source turns each line you enter into one Flume event.
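If telnet is not installed, the same test can be scripted. The sketch below is one possible stand-in, assuming the agent from step 2 is listening on localhost:44444 and that the netcat source keeps its default behaviour of answering each received line with OK; send_lines is a hypothetical helper name.

```python
import socket


def send_lines(lines, host="localhost", port=44444, timeout=5.0):
    """Send each string as one newline-terminated line and collect the replies.

    Assumes the peer answers every line with a single line of its own,
    which is what the Flume netcat source does by default ("OK").
    """
    acks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("r", encoding="utf-8", newline="\n")
        for line in lines:
            sock.sendall((line + "\n").encode("utf-8"))
            acks.append(reader.readline().strip())
    return acks
```

For example, send_lines(["hello", "flume"]) sends two events to the agent and returns the acknowledgement received for each one.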
4. Check the HDFS test directory
hdfs dfs -ls /user/root/test
A new file appears in the directory; its content is the text entered through telnet.
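To read the content back rather than just list the directory, something like the following could be used. It is only a sketch: it assumes the hdfs command is on the PATH and that the sink keeps its default file prefix FlumeData (the configuration above does not set hdfs.filePrefix); cat_command and read_events are hypothetical helper names.

```python
import subprocess


def cat_command(directory="/user/root/test", prefix="FlumeData"):
    """Build the argv that prints every sink file under `directory`.

    The glob is passed through unchanged and expanded by the HDFS shell.
    """
    return ["hdfs", "dfs", "-cat", f"{directory.rstrip('/')}/{prefix}.*"]


def read_events(directory="/user/root/test", prefix="FlumeData"):
    """Run the command and return the stored lines (needs a running HDFS)."""
    result = subprocess.run(
        cat_command(directory, prefix),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

With the agent and HDFS running, read_events() should return the lines typed in step 3. A file that the sink is still writing carries a .tmp suffix until it is rolled.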
Study materials:
1. Hadoop For Dummies
2.