In our last post we saw how to install Hadoop on a single machine and how to start and stop the Hadoop components (see the post here). In this post we will dive into the hadoop command line and try to understand how Hadoop stores files in HDFS. Everything here happens at the command line; no GUI is involved. The general syntax of the hadoop command is:
hadoop command [genericOptions] [commandOptions]
The generic options are:

-conf <configuration file>                    specify an application configuration file
-D <property=value>                           use value for given property
-fs <local|namenode:port>                     specify a namenode
-jt <local|resourcemanager:port>              specify a ResourceManager
-files <comma separated list of files>        specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>       specify comma separated jar files to include in the classpath
-archives <comma separated list of archives>  specify comma separated archives to be unarchived on the compute machines
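A generic option always goes before the command-specific options. For example, -fs points the command at an explicit namenode, and -D overrides a single configuration property for one invocation (a minimal sketch; localhost:9000 is the namenode address from our single-machine setup, and dfs.replication=2 simply asks for two copies of each block, which a single node cannot actually satisfy):

hadoop fs -fs hdfs://localhost:9000 -ls /
hadoop fs -D dfs.replication=2 -put /tmp/file.txt /

Running hadoop fs without any arguments prints the full list of file system shell commands: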
[hadoop@localhost sbin]$ hadoop fs
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
....................................
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
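Let's try a few of these commands, starting with listing the root of HDFS and creating and removing a directory: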
hadoop fs -ls /
hadoop fs -mkdir /new_dir
hadoop fs -ls /new_dir
hadoop fs -rmdir /new_dir
hadoop fs -rm -R /new_dir
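Note the difference between the last two commands: -rmdir only succeeds on an empty directory, while -rm -R removes the directory and everything beneath it recursively, so it is the one to use for non-empty trees. If the trash feature is enabled, adding -skipTrash deletes the data immediately instead of moving it to the trash:

hadoop fs -rm -R -skipTrash /new_dir

Next we upload files from the local file system with -put; globs work too: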
hadoop fs -put /tmp/file.txt /newdir/
hadoop fs -put /tmp/file* /newdir/
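The reverse of -put is -get (also available as -copyToLocal), which copies a file from HDFS back to the local file system; for example (the local destination name here is ours to choose):

hadoop fs -get /newdir/file.txt /tmp/file_copy.txt

Listing the target directory confirms the uploads: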
[hadoop@localhost sbin]$ hadoop fs -ls /newdir/
Found 3 items
-rw-r--r-- 1 hadoop supergroup 6 2016-09-14 22:51 /newdir/file.txt
-rw-r--r-- 1 hadoop supergroup 6 2016-09-14 22:51 /newdir/file2.txt
-rw-r--r-- 1 hadoop supergroup 6 2016-09-14 22:51 /newdir/file_new.txt
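The columns of the listing are: the permissions, the replication factor (1 on our single-node cluster), the owner, the group, the file size in bytes, the modification time, and the path. We can inspect file contents with -cat: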
[hadoop@localhost sbin]$ hadoop fs -cat /newdir/file.txt
1,2,3
[hadoop@localhost sbin]$ hadoop fs -cat /newdir/file*
1,2,3
1,2,3
1,2,3
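With a glob, -cat simply concatenates every matching file to standard output, which is why the three identical files print as three lines.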
[hadoop@localhost sbin]$ hadoop fs -appendToFile /tmp/file.txt /newdir/file.txt
[hadoop@localhost sbin]$ hadoop fs -cat /newdir/file.txt
1,2,3
1,2,3
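-appendToFile appends the content of one or more local files to a file in HDFS. It can also read from standard input when the source is given as -, which is handy for quick tests:

echo "4,5,6" | hadoop fs -appendToFile - /newdir/file.txt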
[hadoop@localhost sbin]$ hadoop fs -count /
2 4 24 /
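The four columns of -count are DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME: the number of directories, the number of files, and the total size in bytes under the given path. Similarly, -df shows the capacity and usage of the file system: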
[hadoop@localhost sbin]$ hadoop fs -df /
Filesystem Size Used Available Use%
hdfs://localhost:9000 11170750464 61440 5839032320 0%
The -h flag prints the same figures in human-readable units:
[hadoop@localhost sbin]$ hadoop fs -df -h /
Filesystem Size Used Available Use%
hdfs://localhost:9000 10.4 G 60 K 5.4 G 0%
[hadoop@localhost sbin]$ hadoop fs -find / -iname file.txt
/newdir/file.txt
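-find accepts -name for case-sensitive matching and -iname for case-insensitive matching, and both take globs; -print (the default action) prints each match, as in this sketch:

hadoop fs -find / -name "file*" -print

Finally, let's look at the ACLs on our directory: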
[hadoop@localhost sbin]$ hadoop fs -getfacl /newdir
# file: /newdir
# owner: hadoop
# group: supergroup
getfacl: The ACL operation has been rejected. Support for ACLs has been disabled by setting dfs.namenode.acls.enabled to false.
[hadoop@localhost sbin]$ hadoop fs -help getfacl
-getfacl [-R] <path> :
  Displays the Access Control Lists (ACLs) of files and directories. If a
  directory has a default ACL, then getfacl also displays the default ACL.
  -R      List the ACLs of all files and directories recursively.
  <path>  File or directory to list.
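The earlier rejection tells us exactly what to change: ACL support is off by default. To turn it on, set dfs.namenode.acls.enabled to true in hdfs-site.xml and restart the NameNode (a sketch against the single-node setup from the previous post):

<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>

After the restart, -getfacl should print the base entries (user::, group:: and other::) for the path, and -setfacl can then be used to add finer-grained entries.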