Handy Hadoop…
Here are some tips and tricks you might find useful if you deal with Hadoop/HBase on a daily basis and need to keep it running smoothly.
As a Hadoop admin, you should know the basic HDFS components.
About Namenode (NN):
Web UI ports: Namenode (HDFS admin) 50070, JobTracker 50030, HBase Master 60010
Directory structure:
A newly formatted namenode creates the following directory structure:
- ${dfs.name.dir}/current/VERSION
- ${dfs.name.dir}/current/edits
- ${dfs.name.dir}/current/fsimage
- ${dfs.name.dir}/current/fstime
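As a quick sanity check of that layout, here is a minimal Python sketch that reports which of the four files are missing. It assumes all four live under the current/ subdirectory of dfs.name.dir, and the function name missing_namenode_files is made up for this example.

```python
import os
import tempfile

# File names taken from the listing above; the assumption is that all of
# them sit under <dfs.name.dir>/current.
EXPECTED_FILES = ("VERSION", "edits", "fsimage", "fstime")


def missing_namenode_files(name_dir):
    """Return the expected files that are absent from <name_dir>/current."""
    current = os.path.join(name_dir, "current")
    return [f for f in EXPECTED_FILES
            if not os.path.isfile(os.path.join(current, f))]


if __name__ == "__main__":
    # Demo against a throwaway directory standing in for dfs.name.dir.
    with tempfile.TemporaryDirectory() as name_dir:
        os.makedirs(os.path.join(name_dir, "current"))
        for f in ("VERSION", "edits"):
            open(os.path.join(name_dir, "current", f), "w").close()
        print(missing_namenode_files(name_dir))  # ['fsimage', 'fstime']
```

Point it at your real dfs.name.dir (read-only, it only checks for existence) to confirm a freshly formatted namenode looks the way you expect.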
Administration Commands
Commands useful for administrators of a Hadoop cluster.
balancer
Runs a cluster balancing utility. An administrator can simply press Ctrl-C to stop the rebalancing process. See Rebalancer for more details.
Usage: hadoop balancer [-threshold <threshold>]
| COMMAND_OPTION | Description |
| --- | --- |
| -threshold <threshold> | Percentage of disk capacity. This overwrites the default threshold. |
daemonlog
Get/Set the log level for each daemon.
Usage: hadoop daemonlog -getlevel <host:port> <name>
Usage: hadoop daemonlog -setlevel <host:port> <name> <level>
| COMMAND_OPTION | Description |
| --- | --- |
| -getlevel <host:port> <name> | Prints the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name> |
| -setlevel <host:port> <name> <level> | Sets the log level of the daemon running at <host:port>. This command internally connects to http://<host:port>/logLevel?log=<name>&level=<level> |
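To make the "internally connects to" part concrete, here is a small Python sketch that builds the logLevel URL for a daemon. The host name and logger name in the demo are placeholders, and passing the new level as a level query parameter is an assumption about the servlet rather than something spelled out in the table.

```python
from urllib.parse import urlencode


def log_level_url(host_port, name, level=None):
    """Build the logLevel URL for the daemon web port at host_port."""
    params = {"log": name}
    if level is not None:
        # Assumption: the servlet takes the new level as a 'level' parameter.
        params["level"] = level
    return "http://%s/logLevel?%s" % (host_port, urlencode(params))


if __name__ == "__main__":
    # Placeholder host and logger name, for illustration only.
    print(log_level_url("nn.example.com:50070",
                        "org.apache.hadoop.hdfs.StateChange"))
    print(log_level_url("nn.example.com:50070",
                        "org.apache.hadoop.hdfs.StateChange", "DEBUG"))
```

Opening the first URL in a browser should show the same information that -getlevel prints.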
datanode
Runs an HDFS datanode.
Usage: hadoop datanode [-rollback]
| COMMAND_OPTION | Description |
| --- | --- |
| -rollback | Rolls the datanode back to the previous version. This should be used after stopping the datanode and distributing the old Hadoop version. |
dfsadmin
Runs an HDFS dfsadmin client.
Usage: hadoop dfsadmin [GENERIC_OPTIONS] [-report] [-safemode enter | leave | get | wait] [-refreshNodes] [-finalizeUpgrade] [-upgradeProgress status | details | force] [-metasave filename] [-setQuota <quota> <dirname>...<dirname>] [-clrQuota <dirname>...<dirname>] [-help [cmd]]
| COMMAND_OPTION | Description |
| --- | --- |
| -report | Reports basic filesystem information and statistics. |
| -safemode enter \| leave \| get \| wait | Safe mode maintenance command. Safe mode is a Namenode state in which it (1) does not accept changes to the namespace (read-only) and (2) does not replicate or delete blocks. The Namenode enters safe mode automatically at startup and leaves it automatically when the configured minimum percentage of blocks satisfies the minimum replication condition. Safe mode can also be entered manually, but then it can only be turned off manually as well. |
| -refreshNodes | Re-reads the hosts and exclude files to update the set of Datanodes that are allowed to connect to the Namenode and those that should be decommissioned or recommissioned. |
| -finalizeUpgrade | Finalizes the upgrade of HDFS. Datanodes delete their previous-version working directories, followed by the Namenode doing the same. This completes the upgrade process. |
| -upgradeProgress status \| details \| force | Requests the current distributed upgrade status or a detailed status, or forces the upgrade to proceed. |
| -metasave filename | Saves the Namenode's primary data structures to <filename> in the directory specified by the hadoop.log.dir property. <filename> will contain one line for each of the following: (1) Datanodes heartbeating with the Namenode, (2) blocks waiting to be replicated, (3) blocks currently being replicated, (4) blocks waiting to be deleted. |
| -setQuota <quota> <dirname>...<dirname> | Sets the quota <quota> for each directory <dirname>. The directory quota is a long integer that puts a hard limit on the number of names in the directory tree. Best effort for each directory, with a fault reported if (1) the quota is not a positive integer, or (2) the user is not an administrator, or (3) the directory does not exist or is a file, or (4) the directory would immediately exceed the new quota. |
| -clrQuota <dirname>...<dirname> | Clears the quota for each directory <dirname>. Best effort for each directory, with a fault reported if (1) the directory does not exist or is a file, or (2) the user is not an administrator. It does not fault if the directory has no quota. |
| -help [cmd] | Displays help for the given command, or for all commands if none is specified. |
mradmin
Runs the MapReduce admin client.
Usage: hadoop mradmin [ GENERIC_OPTIONS ] [-refreshQueueAcls]
| COMMAND_OPTION | Description |
| --- | --- |
| -refreshQueueAcls | Refreshes the queue ACLs that Hadoop uses to check access when a user submits or administers a job. The properties in mapred-queue-acls.xml are reloaded by the queue manager. |
jobtracker
Runs the MapReduce JobTracker node.
Usage: hadoop jobtracker [-dumpConfiguration]
| COMMAND_OPTION | Description |
| --- | --- |
| -dumpConfiguration | Dumps the configuration used by the JobTracker, along with the queue configuration, in JSON format to standard output, and then exits. |
namenode
Runs the namenode. See Upgrade Rollback for more information about upgrade, rollback, and finalize.
Usage: hadoop namenode [-format] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint]
| COMMAND_OPTION | Description |
| --- | --- |
| -format | Formats the namenode. It starts the namenode, formats it, and then shuts it down. |
| -upgrade | The namenode should be started with the upgrade option after the new Hadoop version has been distributed. |
| -rollback | Rolls the namenode back to the previous version. This should be used after stopping the cluster and distributing the old Hadoop version. |
| -finalize | Removes the previous state of the file system. The most recent upgrade becomes permanent, and the rollback option is no longer available. After finalization it shuts the namenode down. |
| -importCheckpoint | Loads an image from a checkpoint directory and saves it into the current one. The checkpoint directory is read from the property fs.checkpoint.dir. |
secondarynamenode
Runs the HDFS secondary namenode. See Secondary Namenode for more info.
Usage: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]
| COMMAND_OPTION | Description |
| --- | --- |
| -checkpoint [force] | Checkpoints the secondary namenode if the EditLog size >= fs.checkpoint.size. If force is given, checkpoints irrespective of the EditLog size. |
| -geteditsize | Prints the EditLog size. |
tasktracker
Runs a MapReduce TaskTracker node.
Usage: hadoop tasktracker
Entering and leaving safe mode.
% hadoop dfsadmin -safemode get
Safe mode is ON
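If you want to script safe mode checks, here is a rough Python sketch that wraps the same command. It assumes the hadoop launcher is on PATH. It only recognises the "Safe mode is ON" line shown above, and the helper names are made up for this example.

```python
import subprocess

# The four actions accepted by 'hadoop dfsadmin -safemode'.
ACTIONS = ("enter", "leave", "get", "wait")


def safemode_command(action):
    """Build the argument list for a safe mode action."""
    if action not in ACTIONS:
        raise ValueError("unknown safemode action: %r" % (action,))
    return ["hadoop", "dfsadmin", "-safemode", action]


def safemode_is_on(output):
    """Interpret the text printed by 'hadoop dfsadmin -safemode get'."""
    return "Safe mode is ON" in output


def run_safemode(action):
    """Run the command and return its stdout (needs hadoop on PATH)."""
    result = subprocess.run(safemode_command(action),
                            stdout=subprocess.PIPE,
                            universal_newlines=True,
                            check=True)
    return result.stdout


if __name__ == "__main__":
    if safemode_is_on(run_safemode("get")):
        print("Namenode is in safe mode; leave it with:",
              " ".join(safemode_command("leave")))
```

The script only prints the leave command rather than running it, since leaving safe mode on a damaged filesystem is a decision you should make by hand.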
Replacing, removing, or adding a disk on a running node in the cluster
How to remove a disk from a Hadoop DN (datanode)
On the DN, first stop HBase.
- <hbase_home>/bin/stop-hbase.sh
Then stop Hadoop.
- <hadoop_home>/bin/stop-all.sh
Edit hdfs-site.xml (using vi or your editor of choice) and remove the partition (disk slice) from the config parameter called dfs.data.dir:
<hadoop_home>/conf/hdfs-site.xml
<property>
<name>dfs.data.dir</name>
<value>/disk1/hdfs/data,/disk2/hdfs/data</value>
</property>
Take /disk1/hdfs/data out of the value and save the file.
Put it back when you are done with the disk.
Or, if you are replacing the disk right away, keep the node down until the disk is replaced and then just bounce the node.
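If you do this on many nodes, the dfs.data.dir edit can be scripted. Here is a minimal Python sketch using the standard library XML parser. The function name remove_data_dir and the sample configuration are made up for this example, and the real hdfs-site.xml on your nodes will contain more properties.

```python
import xml.etree.ElementTree as ET

# Stand-in for <hadoop_home>/conf/hdfs-site.xml, for illustration only.
SAMPLE = """<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/disk1/hdfs/data, /disk2/hdfs/data/ </value>
  </property>
</configuration>"""


def remove_data_dir(xml_text, dir_to_remove, prop="dfs.data.dir"):
    """Return the config text with dir_to_remove dropped from prop's list."""
    root = ET.fromstring(xml_text)
    target = dir_to_remove.rstrip("/")
    for p in root.iter("property"):
        if (p.findtext("name") or "").strip() != prop:
            continue
        value = p.find("value")
        if value is None:
            continue
        # Split the comma-separated list and trim stray whitespace.
        dirs = [d.strip() for d in (value.text or "").split(",") if d.strip()]
        # Compare without trailing slashes so /disk1/x and /disk1/x/ match.
        kept = [d for d in dirs if d.rstrip("/") != target]
        value.text = ",".join(kept)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(remove_data_dir(SAMPLE, "/disk1/hdfs/data"))
```

One caveat: ElementTree drops comments and the XML declaration when it re-serialises, so keep a backup of the original file and diff the result before copying it into place.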
Starting and Stopping Hadoop Daemons
/bin/
- start-balancer.sh start-dfs.sh start-mapred.sh
or just run start-all.sh
This will detect the kind of node (datanode, namenode, secondary namenode, etc.) and start or stop services appropriately.
Hadoop Filesystem
A successful check (hadoop fsck /) ends with these words:
The filesystem under path '/' is HEALTHY
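For monitoring scripts, the check can be reduced to looking at the last line of the fsck output. Here is a minimal Python sketch. The function name fsck_is_healthy is made up for this example, and it matches only the HEALTHY line quoted above, treating anything else as unhealthy.

```python
def fsck_is_healthy(output):
    """True if the last non-empty line of fsck output reports HEALTHY."""
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    if not lines:
        return False
    last = lines[-1]
    # Match on the start and end of the line so the quoting style around
    # the path does not matter.
    return (last.startswith("The filesystem under path")
            and last.endswith("is HEALTHY"))


if __name__ == "__main__":
    sample = "...\nThe filesystem under path '/' is HEALTHY\n"
    print(fsck_is_healthy(sample))  # True
```

Feed it the captured stdout of your fsck run. An empty or truncated output counts as unhealthy, which is usually the safer default for an alert.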
Cleaning Up a CORRUPT Filesystem
When the namenode is in safe mode, no edits to the filesystem are allowed. First, run fsck to determine the extent of the damage. If it is acceptable to delete or otherwise move aside the damaged files, turn off safe mode and move the files with fsck's -move option:
% hadoop fsck / -move
This moves any files with problematic blocks into /lost+found in the Hadoop namespace.
Restoring from a checkpoint
fsck
Runs an HDFS filesystem checking utility. See Fsck for more info.
Usage: hadoop fsck [GENERIC_OPTIONS] <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
| COMMAND_OPTION | Description |
| --- | --- |
| <path> | Start checking from this path. |
| -move | Move corrupted files to /lost+found |
| -delete | Delete corrupted files. |
| -openforwrite | Print out files opened for write. |
| -files | Print out files being checked. |
| -blocks | Print out block report. |
| -locations | Print out locations for every block. |
| -racks | Print out network topology for data-node locations. |
Finally, to confirm that everything is working correctly, browse to your node's IP address on the following ports. This is also a good list of ports to open in your firewall.
HDFS: 50070
JobTracker: 50030
TaskTracker: 50060
HBase Master: 60010
HBase RegionServer: 60030
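If you would rather not click through five browser tabs, here is a rough Python sketch that tries a TCP connection to each port in the list. It only shows that something is listening, not that the web UI behind it is healthy. The host 127.0.0.1 in the demo is a placeholder for your node's address.

```python
import socket

# Port list copied from this post; adjust it if your cluster overrides
# the defaults.
PORTS = {
    "HDFS": 50070,
    "JobTracker": 50030,
    "TaskTracker": 50060,
    "HBase Master": 60010,
    "HBase RegionServer": 60030,
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports=PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, ok in check_ports("127.0.0.1").items():
        print("%-20s %s" % (name, "open" if ok else "closed"))
```

Run it from a machine outside the firewall to see the cluster the way your users do.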