一、Hive Command-Line Options
Usage: hive [-hiveconf x=y]* [<-i filename>]* [<-f filename>|<-e query-string>] [-S]
-i <filename> Initialization Sql from file (executed automatically and silently before any other commands)
-e 'quoted query string' Sql from command line
-f <filename> Sql from file
-S Silent mode in interactive shell where only data is emitted
-hiveconf x=y Use this to set hive/hadoop configuration variables.
-e and -f cannot be specified together. In the absence of these options, interactive shell is started. However, -i can be used with any other options.
To see this usage help, run hive -h
The following example runs a query from the command line:
$HIVE_HOME/bin/hive -e 'select a.col from tab1 a'
The following example sets Hive configuration variables for the query:
$HIVE_HOME/bin/hive -e 'select a.col from tab1 a' -hiveconf hive.exec.scratchdir=/home/my/hive_scratch -hiveconf mapred.reduce.tasks=32
The following example dumps the query results to a text file:
$HIVE_HOME/bin/hive -S -e 'select a.col from tab1 a' > a.txt
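The -f and -i options follow the same pattern. The sketch below is only illustrative; /home/my/init.sql and /home/my/query.sql are hypothetical file names:
$HIVE_HOME/bin/hive -i /home/my/init.sql -f /home/my/query.sql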
二、The .hiverc File
When invoked without the -i option, Hive starts the interactive shell and attempts to load $HIVE_HOME/bin/.hiverc and $HOME/.hiverc as initialization files.
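A minimal sketch of what such an initialization file might contain; the specific settings and the added script are only illustrative:
-- $HOME/.hiverc (illustrative): executed silently before the interactive session starts
set hive.cli.print.current.db=true;
set mapred.reduce.tasks=32;
add FILE /tmp/tt.py;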
三、Hive Interactive Shell Commands
Command                  Description
quit                     Use quit or exit to leave the interactive shell.
set key=value            Sets the value of a particular configuration variable. Note that if you misspell the variable name, the CLI will not show an error.
set                      Prints the list of configuration variables overridden by the user or Hive.
set -v                   Prints all Hadoop and Hive configuration variables.
add FILE [file] [file]*  Adds one or more files to the list of resources.
list FILE                Lists all the files added to the distributed cache.
list FILE [file]*        Checks whether the given resources have already been added to the distributed cache.
! [cmd]                  Executes a shell command from the Hive shell.
dfs [dfs cmd]            Executes a dfs command from the Hive shell.
[query]                  Executes a Hive query and prints the results to standard output.
source FILE              Executes a script file inside the CLI.
Examples:
hive> set mapred.reduce.tasks=32;
hive> set;
hive> select a.* from tab1 a;
hive> !ls;
hive> dfs -ls;
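The source command does not appear in the examples above; a minimal sketch, assuming a hypothetical script file /home/my/queries.hql:
hive> source /home/my/queries.hql;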
FILE resources are simply added to the distributed cache. JAR resources are also added to the Java classpath. ARCHIVE resources are automatically unarchived as part of distributing them.
For example:
hive> add FILE /tmp/tt.py;
hive> list FILES;
/tmp/tt.py
hive> from networks a MAP a.networkid USING 'python tt.py' as nn where a.ds = '2009-01-04' limit 10;
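JAR and ARCHIVE resources are added in the same way; a minimal sketch with hypothetical paths:
hive> add JAR /tmp/my_udfs.jar;
hive> list JARS;
hive> add ARCHIVE /tmp/scripts.tar.gz;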
It is not necessary to add a file to the session if it is already available, under the same path, on all nodes in the cluster. For example:
... MAP a.networkid USING 'wc -l' ...: here wc is an executable available on all machines
... MAP a.networkid USING '/home/nfsserv1/hadoopscripts/tt.py' ...: here tt.py may be accessible via an NFS mount point that is configured identically on all the cluster nodes.