peibaishi 发表于 2017-6-30 08:54:33

DW(六):polybase访问Azure Blob Storage

  目录:


[*]连接hadoop配置语法
[*]配置hadoop连接
[*]Pushdown配置
[*]Create external tables for Azure blob storage
  连接hadoop配置语法:   

  global configuration settings for PolyBase Hadoop and Azure blob storage connectivity, Syntax:


[*]sp_configure:List all of the configuration options
[*]配置与hadoop连接语法:
[*]


sp_configure [ @configname = ] 'hadoop connectivity',
[ @configvalue = ] { 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 }
[;]
RECONFIGURE
[;]
  



[*]Option 0: Disable Hadoop connectivity
[*]Option 1: Hortonworks HDP 1.3 on Windows Server
[*]Option 1: Azure blob storage (WASB)
[*]Option 2: Hortonworks HDP 1.3 on Linux
[*]Option 3: Cloudera CDH 4.3 on Linux
[*]Option 4: Hortonworks HDP 2.0 on Windows Server
[*]Option 4: Azure blob storage (WASB)
[*]Option 5: Hortonworks HDP 2.0 on Linux
[*]Option 6: Cloudera 5.1, 5.2, 5.3, 5.4, and 5.5 on Linux
[*]Option 7: Hortonworks 2.1, 2.2, and 2.3 on Linux
[*]Option 7: Hortonworks 2.1, 2.2, and 2.3 on Windows Server
[*]Option 7: Azure blob storage (WASB)      



[*]示例代码:


[*]显示所有可用的配置列表: EXEC sp_configure;
[*]显示某个配置项的配置信息:EXEC sp_configure @configname='hadoop connectivity';
[*]设置配置项示例语法如下:      
[*]


sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
GO
RECONFIGURE
GO   

  


  配置hadoop连接



[*]在本节示例中,连接HDInsight3.2集,对应关系如下:
[*]
[*] 配置polybase连接配置项信息信息如下:
[*]


sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
GO
RECONFIGURE
GO

  

[*]命令行:services.msc, 重启下面的服务
[*]SQL Server PolyBase Data Movement Service
[*]SQL Server PolyBase Engine

  Pushdown配置:



[*]To improve query performance, enable pushdown computation to a Hadoop cluster


[*]Find the file yarn-site.xml in the installation path of SQL Server. Typically, the path is: C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\Polybase\Hadoop\conf
[*]On the Hadoop machine, find the analogous(类似) file in the Hadoop configuration directory. In the file, find and copy the value of the configuration key yarn.application.classpath
[*]On the SQL Server machine, in the yarn-site.xml file, find the yarn.application.classpath property. Paste the value from the Hadoop machine into the value element.
[*]


$HADOOP_CONF_DIR,/usr/hdp/current/hadoop-client/*,/usr/hdp/current/hadoop-client/lib/*,/usr/hdp/current/hadoop-hdfs-client/*,/usr/hdp/current/hadoop-hdfs-client/lib/*,/usr/hdp/current/hadoop-yarn-client/*,/usr/hdp/current/hadoop-yarn-client/lib/*
  

  Create external tables for Azure blob storage



[*]The Elastic(弹性) Database query feature relies on(依靠) the these four DDL statements. Typically, these DDL statements are used once or rarely when the schema of your application changes


[*]
[*]    (https://msdn.microsoft.com/library/ms174382.aspx)
[*]    (https://msdn.microsoft.com/library/ms189522.aspx)
[*]   (https://msdn.microsoft.com/library/mt270260.aspx)
[*]    (https://msdn.microsoft.com/library/dn935022.aspx)
[*]                (https://msdn.microsoft.com/library/dn935021.aspx)



[*]You can use the following syntax to drop the master key and credentials


[*]
[*]DROP CREDENTIAL <credential_name> ON DATABASE;
[*]DROP DATABASE SCOPED CREDENTIAL <credential_name>;
[*]DROP MASTER KEY;



[*]模拟简单文本数据,如下:
[*]      
[*]将文本文件上传到 Azure blob storage, 如下图:
[*]
[*]远程连接ploybase1, 打开ssms(注:此版本需单独安装),执行脚本如下:
[*]


CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Password01!';
CREATE DATABASE SCOPED CREDENTIAL JNAzureCredit   
WITH IDENTITY = 'zhushy', Secret =XXXX';/*存储访问Key*/

CREATE EXTERNAL DATA SOURCE JNAzureStorage with (
TYPE = HADOOP,   
LOCATION ='wasb://<blob_container_name>@<azure_storage_account_name>.blob.core.chinacloudapi.cn',
CREDENTIAL = JNAzureCredit
);

CREATE EXTERNAL FILE FORMAT TextFileFormat WITH (
FORMAT_TYPE = DELIMITEDTEXT,   
FORMAT_OPTIONS (FIELD_TERMINATOR =' ', USE_TYPE_DEFAULT = TRUE))
CREATE EXTERNAL TABLE . (
varchar(128) NULL,
varchar(128) null
)
WITH (LOCATION='/zhu/data1.txt',   
DATA_SOURCE = JNAzureStorage,
FILE_FORMAT = TextFileFormat
);
  

[*]验证数据,输入如下SQL: select * from . ,如下图,则OK
[*]
  
页: [1]
查看完整版本: DW(六):polybase访问Azure Blob Storage