[Experience Share] Cloudera Hadoop Administrator (CCAH) and Developer (CCA-175) Exam Blueprints

Posted on 2018-10-29 13:04:53
  Cloudera Certified Administrator for Apache Hadoop (CCA-500)
  Number of Questions: 60 questions
  Time Limit: 90 minutes
  Passing Score: 70%
  Language: English, Japanese
  Exam Sections and Blueprint
  1. HDFS (17%)

  •   Describe the function of HDFS daemons
  •   Describe the normal operation of an Apache Hadoop cluster, both in data storage and in data processing
  •   Identify current features of computing systems that motivate a system like Apache Hadoop
  •   Classify major goals of HDFS design

  •   Given a scenario, identify an appropriate use case for HDFS Federation
  •   Identify the components and daemons of an HDFS HA-Quorum cluster
  •   Analyze the role of HDFS security (Kerberos)
  •   Determine the best data serialization choice for a given scenario
  •   Describe file read and write paths
  •   Identify the commands to manipulate files in the Hadoop File System Shell
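  As a minimal sketch of the File System Shell item above (the paths are hypothetical), the common manipulations look like this:

      hdfs dfs -mkdir -p /user/alice/input            # create a directory tree
      hdfs dfs -put localfile.txt /user/alice/input   # copy a local file into HDFS
      hdfs dfs -ls /user/alice/input                  # list directory contents
      hdfs dfs -cat /user/alice/input/localfile.txt   # print a file to stdout
      hdfs dfs -get /user/alice/input/localfile.txt . # copy a file back to the local filesystem
      hdfs dfs -rm -r /user/alice/input               # remove a directory recursively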
  2. YARN and MapReduce version 2 (MRv2) (17%)

  •   Understand how upgrading a cluster from Hadoop 1 to Hadoop 2 affects cluster settings
  •   Understand how to deploy MapReduce v2 (MRv2 / YARN), including all YARN daemons
  •   Understand basic design strategy for MapReduce v2 (MRv2)
  •   Determine how YARN handles resource allocations
  •   Identify the workflow of a MapReduce job running on YARN
  •   Determine which files you must change, and how, in order to migrate a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) running on YARN
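  As a hedged sketch of the migration item above: at minimum, mapred-site.xml must point the execution framework at YARN, and yarn-site.xml must define the YARN services. The host name below is a placeholder, and a real migration touches more properties than this:

      <!-- mapred-site.xml: run MapReduce jobs on YARN instead of MRv1 -->
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>

      <!-- yarn-site.xml: where the ResourceManager lives, plus the shuffle service -->
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>rm-host.example.com</value>  <!-- placeholder host -->
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>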
  3. Hadoop Cluster Planning (16%)

  •   Principal points to consider in choosing the hardware and operating systems to host an Apache Hadoop cluster
  •   Analyze the choices in selecting an OS
  •   Understand kernel tuning and disk swapping (a minimal sysctl sketch follows this list)

  •   Given a scenario and workload pattern, identify a hardware configuration appropriate to the scenario
  •   Given a scenario, determine the ecosystem components your cluster needs to run in order to fulfill the SLA

  •   Cluster sizing: given a scenario and frequency of execution, identify the specifics for the workload, including CPU, memory, storage, and disk I/O
  •   Disk sizing and configuration, including JBOD versus RAID, SANs, virtualization, and disk sizing requirements in a cluster

  •   Network topologies: understand network usage in Hadoop (for both HDFS and MapReduce) and propose or identify key network design components for a given scenario
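  As a minimal sketch of the kernel-tuning item above, Hadoop worker nodes are usually configured to avoid swapping; the value shown is common guidance, not an official requirement:

      # /etc/sysctl.conf: lower the kernel's tendency to swap out process memory
      vm.swappiness = 1

      # apply the change without a reboot
      sysctl -p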
  4. Hadoop Cluster Installation and Administration (25%)


  •   Given a scenario, identify how the cluster will handle disk and machine failures
  •   Analyze a logging configuration and logging configuration file format (a sample log4j excerpt follows this list)
  •   Understand the basics of Hadoop metrics and cluster health monitoring
  •   Identify the function and purpose of available tools for cluster monitoring
  •   Be able to install all the ecosystem components in CDH 5, including (but not limited to): Impala, Flume, Oozie, Hue, Cloudera Manager, Sqoop, Hive, and Pig
  •   Identify the function and purpose of available tools for managing the Apache Hadoop file system
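  As a hedged illustration of the logging item above, Hadoop daemons are configured through log4j.properties; this excerpt mirrors the rolling-file appender style that ships with Hadoop:

      # log4j.properties (excerpt): root logger level and a rolling file appender
      hadoop.root.logger=INFO,RFA
      log4j.rootLogger=${hadoop.root.logger}
      log4j.appender.RFA=org.apache.log4j.RollingFileAppender
      log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
      log4j.appender.RFA.MaxFileSize=256MB
      log4j.appender.RFA.MaxBackupIndex=20
      log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
      log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n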
  5. Resource Management (10%)

  •   Understand the overall design goals of each of Hadoop's schedulers
  •   Given a scenario, determine how the FIFO Scheduler allocates cluster resources
  •   Given a scenario, determine how the Fair Scheduler allocates cluster resources under YARN (a sample allocation file follows this list)
  •   Given a scenario, determine how the Capacity Scheduler allocates cluster resources
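  As an illustrative sketch of the Fair Scheduler item above, queues are declared in fair-scheduler.xml; the queue names, weights, and minimums below are hypothetical:

      <?xml version="1.0"?>
      <!-- fair-scheduler.xml: two queues sharing the cluster roughly 2:1 -->
      <allocations>
        <queue name="production">
          <weight>2.0</weight>
          <minResources>10000 mb,10 vcores</minResources>
        </queue>
        <queue name="research">
          <weight>1.0</weight>
        </queue>
      </allocations>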
  6. Monitoring and Logging (15%)

  •   Understand the functions and features of Hadoop's metric collection abilities
  •   Analyze the NameNode and JobTracker Web UIs
  •   Understand how to monitor cluster daemons
  •   Identify and monitor CPU usage on master nodes
  •   Describe how to monitor swap and memory allocation on all nodes
  •   Identify how to view and manage Hadoop's log files
  •   Interpret a log file
  CCA Spark and Hadoop Developer Exam (CCA175)
  Number of Questions: 10–12 performance-based (hands-on) tasks on a CDH5 cluster
  Time Limit: 120 minutes
  Passing Score: 70%
  Language: English, Japanese (forthcoming)
  Required Skills
  Data Ingest
  The skills to transfer data between external systems and your cluster. This includes the following (a hedged Sqoop sketch follows the list):

  •   Import data from a MySQL database into HDFS using Sqoop
  •   Export data to a MySQL database from HDFS using Sqoop
  •   Change the delimiter and file format of data during import using Sqoop
  •   Ingest real-time and near-real-time (NRT) streaming data into HDFS using Flume
  •   Load data into and out of HDFS using the Hadoop File System (FS) commands
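  A minimal, hedged sketch of the Sqoop items above; the JDBC URL, database, tables, and paths are all hypothetical:

      # import a MySQL table into HDFS as tab-delimited text
      sqoop import \
        --connect jdbc:mysql://dbhost.example.com/retail_db \
        --username sqoop_user --password-file /user/alice/.pw \
        --table orders \
        --fields-terminated-by '\t' \
        --target-dir /user/alice/orders

      # export HDFS data back into a MySQL table
      sqoop export \
        --connect jdbc:mysql://dbhost.example.com/retail_db \
        --username sqoop_user --password-file /user/alice/.pw \
        --table order_summaries \
        --export-dir /user/alice/order_summaries

  Swapping --fields-terminated-by for an option such as --as-avrodatafile covers the delimiter and file-format item.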
  Transform, Stage, Store
  Convert a set of data values in a given format stored in HDFS into new data values and/or a new data format and write them into HDFS. This includes writing Spark applications in both Scala and Python (a PySpark sketch follows the list):

  •   Load data from HDFS and store results back to HDFS using Spark
  •   Join disparate datasets together using Spark
  •   Calculate aggregate statistics (e.g., average or sum) using Spark
  •   Filter data into a smaller dataset using Spark
  •   Write a query that produces ranked or sorted data using Spark
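  A hedged PySpark sketch covering the Spark items above; the file layouts, paths, and column positions are hypothetical, and a Scala equivalent is analogous:

      from pyspark import SparkContext

      sc = SparkContext(appName="cca175-sketch")

      # load tab-delimited order records from HDFS
      orders = sc.textFile("hdfs:///user/alice/orders").map(lambda l: l.split("\t"))

      # filter into a smaller dataset: completed orders only (status assumed in column 3)
      completed = orders.filter(lambda r: r[3] == "COMPLETE")

      # aggregate statistics: order count per customer id (assumed in column 2)
      counts = completed.map(lambda r: (r[2], 1)).reduceByKey(lambda a, b: a + b)

      # ranked/sorted output: customers by descending order count
      ranked = counts.sortBy(lambda kv: kv[1], ascending=False)

      # join with a second dataset keyed by customer id
      customers = sc.textFile("hdfs:///user/alice/customers") \
                    .map(lambda l: l.split("\t")) \
                    .map(lambda r: (r[0], r[1]))
      joined = ranked.join(customers)

      # store results back to HDFS
      joined.saveAsTextFile("hdfs:///user/alice/top_customers")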
  Data Analysis
  Use Data Definition Language (DDL) to create tables in the Hive metastore for use by Hive and Impala (a HiveQL sketch follows the list).

  •   Read and/or create a table in the Hive metastore in a given schema
  •   Extract an Avro schema from a set of data files using avro-tools
  •   Create a table in the Hive metastore using the Avro file format and an external schema file
  •   Improve query performance by creating partitioned tables in the Hive metastore
  •   Evolve an Avro schema by changing JSON files
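  A hedged sketch of the avro-tools and HiveQL items above; the table names, schema URL, and locations are hypothetical:

      # extract the writer schema from an existing Avro data file
      avro-tools getschema part-m-00000.avro > order.avsc

      -- external Hive table backed by Avro data and an external schema file
      -- (STORED AS AVRO needs a newer Hive; older releases spell out the Avro SerDe)
      CREATE EXTERNAL TABLE orders_avro
      STORED AS AVRO
      LOCATION '/user/alice/orders_avro'
      TBLPROPERTIES ('avro.schema.url'='hdfs:///user/alice/schemas/order.avsc');

      -- partitioned table so queries filtering on order_month prune partitions
      CREATE TABLE orders_by_month (order_id INT, customer_id INT, total DECIMAL(10,2))
      PARTITIONED BY (order_month STRING)
      STORED AS PARQUET;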
  If you have any questions about the above, you can ask via QQ 1438118790.

