Viewing the job conf XML of a MapReduce job on YARN through the Job History Server web console

Date: June 9, 2015

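YARN users often want to know the parameters that one of their MapReduce jobs actually ran with. The job's conf XML can be viewed on the MapReduce Job History Server web console. Alternatively, users can open the YARN web console first and follow the link from there to the Job History Server web console. This post walks through the feature with a simple worked example.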

Steps:

1. Before starting the Job History Server, set its parameters in mapred-site.xml as follows:

<property>
    <name>mapreduce.jobhistory.address</name>
    <value>hostname:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hostname:19888</value>
</property>
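The console address used in later steps comes from these settings, so it can help to read them back from the file. A minimal sketch, assuming the <name> and <value> tags sit on adjacent lines as in the snippet above; get_prop is a hypothetical helper, not a Hadoop command:

```shell
# get_prop FILE NAME: print the <value> on the line after <name>NAME</name>.
# Assumption: name and value tags are on adjacent lines, one tag per line.
get_prop() {
    # -F treats the dots in NAME literally; -A1 also prints the following line
    grep -F -A1 "<name>$2</name>" "$1" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Usage (the path to mapred-site.xml depends on the installation):
#   get_prop etc/hadoop/mapred-site.xml mapreduce.jobhistory.webapp.address
```

For files laid out differently, a real XML parser would be the safer choice.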
2. Start the Job History Server with the command "sbin/mr-jobhistory-daemon.sh start historyserver".

3. Run a simple MapReduce job, such as wordcount, without changing any parameters so that the defaults apply.

4. Once the job finishes, open the Job History Server web console at "http://hostname:19888/jobhistory".

The page lists the jobs that have run.

5. Click a Job ID (for example job_1417166623034_0343) to open that job's detail page.

6. Click the "Configuration" link to view the contents of the MapReduce job's conf XML.

7. Filter for the parameter 'mapreduce.reduce.shuffle.input.buffer.percent'. Its value is the default, "0.70".
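The same configuration can also be fetched without a browser through the History Server's REST API, which serves a job's configuration under /ws/v1/history/mapreduce/jobs/{jobid}/conf. A minimal sketch that only builds the URL; jhs_conf_url is a hypothetical helper, and the host, port, and job ID are the example values from this post:

```shell
# jhs_conf_url HOST:PORT JOB_ID: print the REST URL of a job's configuration.
jhs_conf_url() {
    printf 'http://%s/ws/v1/history/mapreduce/jobs/%s/conf\n' "$1" "$2"
}

# Usage (needs a running Job History Server, so it is left commented out):
#   curl -s -H 'Accept: application/json' \
#       "$(jhs_conf_url hostname:19888 job_1417166623034_0343)"
```

If jq is installed, the response could be narrowed to one parameter with a filter such as `.conf.property[] | select(.name == "mapreduce.reduce.shuffle.input.buffer.percent")`, assuming the JSON layout documented for this endpoint.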

8. Submit a new MapReduce job with 'mapreduce.reduce.shuffle.input.buffer.percent' set to "0.69": "hadoop jar hadoop-examples-2.2.0.jar wordcount -Dmapreduce.reduce.shuffle.input.buffer.percent=0.69 /tmp/wdinput /tmp/wdoutput".
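Generic options such as -D are parsed by the job's driver, so they belong after the program name (wordcount) and before the input and output paths. A minimal sketch that assembles such a command line as a string without running it; wordcount_cmd is a hypothetical helper:

```shell
# wordcount_cmd JAR INPUT OUTPUT [NAME=VALUE ...]
# Prints a "hadoop jar ... wordcount" command line with one -D per NAME=VALUE.
wordcount_cmd() {
    jar=$1 input=$2 output=$3
    shift 3
    cmd="hadoop jar $jar wordcount"
    # Each remaining argument becomes a -DNAME=VALUE generic option,
    # placed before the input and output paths.
    for kv in "$@"; do
        cmd="$cmd -D$kv"
    done
    printf '%s\n' "$cmd $input $output"
}

# Usage (prints the command from step 8):
#   wordcount_cmd hadoop-examples-2.2.0.jar /tmp/wdinput /tmp/wdoutput \
#       mapreduce.reduce.shuffle.input.buffer.percent=0.69
```

Printing the command first makes it easy to check the option order before submitting the job.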

9. After the MR job finishes, open the Job History Server web console again and view the job's configuration. The parameter 'mapreduce.reduce.shuffle.input.buffer.percent' now has the value "0.69", which shows that "-Dmapreduce.reduce.shuffle.input.buffer.percent=0.69" passed the new value to the MR job.
