Running Mahout's Bayes classification algorithm on Hadoop

沙诺 posted @ July 17, 2013 16:09 in Cloud Computing with tags hadoop mahout classification, 5110 views

Over the past two days I ran a classification algorithm on Hadoop: Bayes classification. Below is a walkthrough of how the experiment was run.

1. Get the data set: http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz (a data set commonly used for classification experiments).

2. Extract the data. My location: /home/XXXXXX/hadoop/mahout/mahout-distribution-0.6/examples/bin/work

3. Preprocess the training set, which includes converting the text files to SequenceFile format (files processed by Mahout must be in SequenceFile format). Command: mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups -p /home/XXXXXX/hadoop/mahout/mahout-distribution-0.6/examples/bin/work/20news-bydate-train -o /home/XXXXXX/hadoop/mahout/mahout-distribution-0.6/examples/bin/work/bayes-train-input -a org.apache.mahout.vectorizer.DefaultAnalyzer -c UTF-8

4. Put bayes-train-input from the work directory onto Hadoop's distributed file system as 20news-input. Command: hadoop dfs -put /home/XXXXXX/hadoop/mahout/mahout-distribution-0.6/examples/bin/work/bayes-train-input 20news-input

5. Train on the prepared training set to produce the classification model (the intermediate result). Command: mahout trainclassifier -i 20news-input -o newsmodel -type bayes -ng 3 -source hdfs

To inspect the contents of the classification model, run: hadoop fs -lsr /user/hadoop/newsmodel. You can also export it to a local text file for viewing. Command: mahout seqdumper -s /user/XXXXXX/newsmodel/trainer-tfIdf/trainer-tfIdf/part-00000 -o /home/XXXXXX/hadoop/out/part-1

Here is a picture, so the post does not look too dull:

[Image: the MapReduce process of training the classification model]

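6. Test the model. Command: mahout testclassifier -m newsmodel -d 20news-input -type bayes -ng 3 -source hdfs -method mapreduce. Note that -d here points at 20news-input, the same data that was used for training in step 5.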

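The model test step still has a small error. I am posting it as it stands for now and will correct this part once the test succeeds. Apologies.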
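The commands in steps 3 to 6 repeat the same long working directory path, so they can be collected into one small script. The following is a minimal sketch, not a tested pipeline: the names WORK, DRY_RUN and run are hypothetical helpers introduced here for illustration, and the default value of WORK assumes the directory layout from step 2 under your own home directory. The mahout and hadoop command lines are the ones quoted above, unchanged, including the step 6 command that the post says still has an error.

```shell
#!/bin/sh
# Sketch only: WORK, DRY_RUN and run() are hypothetical helpers introduced here,
# not part of Mahout or Hadoop. The wrapped commands are those from steps 3-6.

# Assumed working directory; point this at wherever the data set was unpacked.
WORK="${WORK:-$HOME/hadoop/mahout/mahout-distribution-0.6/examples/bin/work}"

# DRY_RUN=1 (the default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 3: preprocess the training set.
prepare_train() {
  run mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups \
    -p "$WORK/20news-bydate-train" \
    -o "$WORK/bayes-train-input" \
    -a org.apache.mahout.vectorizer.DefaultAnalyzer \
    -c UTF-8
}

# Step 4: copy the prepared input onto HDFS as 20news-input.
upload_train() {
  run hadoop dfs -put "$WORK/bayes-train-input" 20news-input
}

# Step 5: train the Bayes model.
train_model() {
  run mahout trainclassifier -i 20news-input -o newsmodel -type bayes -ng 3 -source hdfs
}

# Step 6: test the model (same command as in the post).
test_model() {
  run mahout testclassifier -m newsmodel -d 20news-input -type bayes -ng 3 -source hdfs -method mapreduce
}

# Stop at the first step that fails.
prepare_train && upload_train && train_model && test_model
```

Run as is, the script only prints the four commands, which makes it easy to check the paths before touching the cluster. Setting DRY_RUN=0 executes them in order and stops at the first failure.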


