🖥️ Administering HBase

Introduction to HBase Deployment

“A single-node standalone HBase install is only meant for basic access, which you typically do either while learning how to use the system or while developing an application. It can’t handle any real workload or scale.”

Objectives of This Chapter

You will learn about the following key points regarding HBase deployment:

  • Deploying HBase
  • Planning your cluster
  • Deploying software
  • Distributions
  • Configuration
  • Operations
  • Monitoring your cluster
  • Performance of your HBase cluster
  • Cluster management

Planning a Fully Distributed HBase Setup

When planning an HBase cluster, consider the following components:

  • HBase Master
  • ZooKeeper
  • RegionServers
  • HDFS DataNodes
  • MapReduce Framework (sometimes included)

Hardware Considerations

Choosing the right hardware is crucial for an efficient deployment:

  • Commodity Hardware: This does not mean low-end; it refers to non-exotic parts readily available from various manufacturers.
  • Avoid over-specifying hardware; sufficient CPU, RAM, and disk capacity are what matter.
  • Balance resource selection to maximize performance and minimize cost.

Cluster Size Guidelines

  • Up to 25 nodes: Single node for the Hadoop JobTracker and NameNode; keep the Secondary NameNode separate.
  • Larger than 25 nodes: Dedicated hardware for the Hadoop NameNode, JobTracker, and Secondary NameNode.

Collocation and SLAs

  • RegionServers: Typically collocated with HDFS DataNodes.
  • Service Collocation: Avoid collocating HBase RegionServers with Hadoop TaskTrackers for low-latency workloads.
  • MapReduce Jobs: If using MapReduce with real-time workloads, set up separate clusters to ensure SLAs.

Types of HBase Cluster Configurations

Prototype Cluster

  • Description: For simple prototypes, collocate the HBase Master with the Hadoop NameNode and JobTracker.
  • Size: Typically fewer than 10 nodes.
  • Hardware: A machine with 4–6 cores, 24–32 GB RAM, and 4 SATA disks is sufficient.

Small Production Cluster (10–20 servers)

  • Operation: More challenging to guarantee performance and SLAs.
  • Configuration: HBase Master on separate hardware; consider redundancy.
  • ZooKeeper: A single instance is usually sufficient and can be collocated with the HBase Master.

Medium Production Cluster (up to ~50 servers)

  • Configuration Changes: Separate the NameNode and JobTracker; keep the HBase Masters and ZooKeeper together.
  • ZooKeeper Instances: Increase to three to meet quorum requirements.

Large Production Cluster (>~50 servers)

  • Configuration: Similar to medium, but increase ZooKeeper instances to five.
  • Collocation: HBase Masters should be collocated with ZooKeeper.

Key Considerations for Each Cluster Type

  • Prototype Cluster: Fewer than 10 nodes; collocate services; limited performance expectations.
  • Small Production Cluster: Better hardware needed; a single HBase Master is acceptable; redundancy recommended.
  • Medium Production Cluster: Separate NameNode and JobTracker; three ZooKeeper instances for reliability.
  • Large Production Cluster: Five ZooKeeper instances; ensure adequate resources for the NameNode and Secondary NameNode.

Conclusion

This chapter lays the groundwork for efficiently deploying and managing an HBase cluster. Understanding the requirements for different cluster sizes and configurations is critical for optimal performance and resource utilization.

🖥️ Hadoop Master Processes

Overview of Hadoop Master Processes

  • Hadoop Master processes include:
    • NameNode
    • Secondary NameNode
    • JobTracker

“These processes are typically referred to as the Hadoop Master processes. Depending on the cluster size, they may be deployed together or on separate nodes.”

Characteristics of Master Processes

  • All are single processes with no built-in failover strategy.
  • High availability for hardware is crucial to prevent data loss.

Hardware Recommendations for Master Processes

Redundancy and Reliability

  • It is recommended to have hardware redundancy:
    • Dual power supplies
    • Bonded network interface cards (NICs)
    • Possibly RAID disks

Disk Configuration

  • RAID 1 disks are common on the NameNode and Secondary NameNode for metadata storage.
  • JBODs can also serve the purpose, as the NameNode can write its metadata to multiple locations (see the sketch below).
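For illustration, writing metadata to multiple locations is done by listing several directories in hdfs-site.xml — a minimal sketch for the Hadoop 1.x/CDH-era setup this chapter assumes, with placeholder paths:

<property>
<name>dfs.name.dir</name>
<value>/data/1/dfs/nn,/data/2/dfs/nn</value>
</property>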

Importance of Backups

“If the disks holding the metadata on the NameNode go down and you don’t have redundancy or backups, you’ll lose the data in the cluster.”

Memory Requirements

  • The NameNode serves all metadata from main memory; thus, it needs sufficient RAM.
  • Recommended configuration for small clusters:
    • 8 cores
    • At least 16 GB DDR3 RAM
    • Dual 1 GbE NICs
    • SATA drives

Scaling for Larger Clusters

  • For medium and large clusters, add additional RAM:
    • Medium: +16 GB RAM
    • Large: +16 GB RAM

HBase Master Configuration

Role of the HBase Master

  • The HBase Master is lightweight and can have multiple instances for failover, reducing the need for expensive hardware.

Recommended Configuration

  • Typical hardware configuration:
    • 4 cores
    • 8–16 GB DDR3 RAM
    • 2 SATA disks (one for the OS, another for logs)
    • 1 GbE NIC

Redundancy for HBase Master

  • Multiple HBase Masters improve redundancy.

Hadoop DataNodes and HBase RegionServers

Definition of Slave Nodes

  • Slave nodes are referred to as Hadoop DataNodes and HBase RegionServers.

“They don’t have fancy hardware requirements like the Master nodes due to built-in redundancy in the architecture.”

Configuration Requirements

  • Cores: 8–12 cores for the DataNode, RegionServer, and OS processes
  • RAM: 24–32 GB
  • Disk storage: 12 × 1 TB drives

Performance Considerations

  • Commodity hardware is sufficient, but configurations should be tailored to specific workloads (memory-intensive vs. CPU-intensive).
  • Avoid running MapReduce on the same nodes to optimize performance.

HBase RegionServer Memory Management

Memory Allocation

  • HBase RegionServers are memory-intensive; however, excessive allocation can lead to performance issues:
    • Recommended heap size: 10–15 GB.

Garbage Collection

“Stop-the-world garbage collection is the bane of its existence.”

ZooKeeper Configuration

Characteristics of ZooKeeper

  • ZooKeeper is lightweight but latency-sensitive.

Recommended Hardware Configuration

  • Similar to the HBase Master, but with a dedicated spindle for data persistence.

Deployment Strategies

Challenges in Cluster Management

  • Managing and deploying on a cluster is complex and requires careful planning.

Tools for Deployment

  • Apache Whirr simplifies cloud deployments, allowing for custom recipes and configuration management.
  • Cloudera Manager is specialized for managing Hadoop and HBase clusters.

Example Whirr Recipe for Cloud Deployment

$ cat my_cdh_recipe
whirr.cluster-name=ak-cdh-hbase
whirr.instance-templates=1 zookeeper+hadoop-namenode+hadoop-jobtracker+hbase-master,5 hadoop-datanode+hadoop-tasktracker+hbase-regionserver
hbase-site.dfs.replication=3
whirr.zookeeper.install-function=install_cdh_zookeeper
whirr.zookeeper.configure-function=configure_cdh_zookeeper
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.configure-function=configure_cdh_hadoop
whirr.hbase.install-function=install_cdh_hbase
whirr.hbase.configure-function=configure_cdh_hbase
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hardware-id=m1.xlarge
whirr.image-id=us-east-1/ami-04c9306d
whirr.location-id=us-east-1

Launching and Listing Clusters

bin/whirr launch-cluster --config my_cdh_recipe
bin/whirr list-cluster --config my_cdh_recipe

This guide covers the critical aspects of Hadoop Master processes, HBase configuration, and deployment strategies, ensuring you can understand and manage these systems effectively.

🗃️ HBase Cluster Management

Hadoop Services

  • Hadoop Services Overview
    • Key components include:
      • hadoop-datanode
      • hadoop-tasktracker
      • hbase-regionserver
    • Example instance IDs:
      • us-east-1/i-56c4e632
      • us-east-1/i-50c4e634
      • us-east-1/i-52c4e636

Destroying the Cluster

When you’re done with your cluster and want to kill it, use the destroy-cluster command like this:

bin/whirr destroy-cluster --config my_cdh_recipe

HBase Distributions

  • There are multiple distributions/packages of HBase:
    • Stock Apache
      • The parent project where all HBase development occurs.
      • Releases include notable versions such as 0.20.x, 0.90.x, 0.92.x, and 0.94.x.
    • Cloudera’s CDH
      • A distribution that includes Hadoop and other ecosystem components.
      • Builds off Apache by adding patches for stability and features based on customer demand.

  • Stock Apache: Driven by the development community; consistent releases with notable versions.
  • Cloudera’s CDH: Includes more patches for stability, better testing, and production use; recommended for deployment.

Installation Instructions

Stock Apache Distribution

  1. Download the tarball:
    cd /tmp
    wget http://mirrors.axint.net/apache/hbase/hbase-0.92.1/hbase-0.92.1.tar.gz
    mv /tmp/hbase-0.92.1.tar.gz /usr/local/lib
  2. Untar the tarball and create a symlink:
    cd /usr/local/lib
    tar xvfz hbase-0.92.1.tar.gz
    ln -s hbase-0.92.1 hbase
  3. Set the $HBASE_HOME variable to point to the current install.

Cloudera’s CDH Distribution

  1. Add the CDH repository:
    • For Red Hat-based systems:
      cd /etc/yum.repos.d
      wget http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/cloudera-cdh4.repo
    • For Debian/Ubuntu-based systems:
      wget http://archive.cloudera.com/cdh4/one-click-install/precise/amd64/cdh4-repository_1.0_all.deb
      sudo dpkg -i cdh4-repository_1.0_all.deb
  2. Install the HBase packages:
    • For Red Hat-based systems:
      sudo yum install hbase
      sudo yum install hbase-master
      sudo yum install hbase-regionserver
    • For Debian/Ubuntu-based systems:
      sudo apt-get install hbase
      sudo apt-get install hbase-master
      sudo apt-get install hbase-regionserver

Configuration of HBase

Configuration Types

  • Environment Configurations: Located in hbase-env.sh
  • Daemon Configurations: Located in hbase-site.xml

Environment Configuration

  • Key parameters (a sketch follows below):
    • Memory allocation
    • Garbage collection settings
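As a rough illustration of both, here is a minimal hbase-env.sh sketch — the values are placeholders, not tuned recommendations:

# Heap size for HBase daemons, in MB
export HBASE_HEAPSIZE=8000
# Use the CMS collector and log GC activity (log path is illustrative)
export HBASE_OPTS="-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc -Xloggc:/var/log/hbase/gc.log"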

Daemon Configuration

  • Sample configuration in XML:

<?xml version="1.0"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///tmp/hbase-${user.name}/hbase</value>
<description>The directory shared by region servers and into which HBase persists.</description>
</property>
</configuration>

Important Configuration Properties

  • hbase.zookeeper.quorum: Defines the ZooKeeper quorum servers.

<property>
<name>hbase.zookeeper.quorum</name>
<value>server1ip,server2ip,server3ip</value>
</property>

  • hbase.rootdir: Specifies where HBase data is persisted in HDFS.

<property>
<name>hbase.rootdir</name>
<value>hdfs://namenode.yourcompany.com:5200/hbase</value>
</property>

🛠️ HBase Configuration Parameters

Essential Configuration for Distributed Mode

For HBase to operate in fully distributed mode, the following property must be set in the hbase-site.xml file:

<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>

Important Ports

  • NameNode Port: 5200 (configured in hdfs-site.xml)
  • HBase Master Port: 60000
  • HBase Master Info Port: 60010
  • RegionServer Port: Configured in hbase.regionserver.port

These ports can be overridden in hbase-site.xml, as sketched below.
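As a hedged example, overriding the Master ports looks like this — the property names are the standard 0.9x-era ones, and the values shown are simply the defaults:

<property>
<name>hbase.master.port</name>
<value>60000</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>60010</value>
</property>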

Key Configuration Parameters

The list below summarizes essential configuration parameters required for running HBase in a distributed environment (an example snippet follows the list):

  • hbase.client.scanner.caching: Number of rows fetched when next() is called on a scanner.
  • hbase.balancer.period: Interval at which the balancer runs (default: 5 minutes, or 300,000 milliseconds).
  • hbase.client.write.buffer: Write buffer size in bytes; larger buffers reduce RPCs but increase memory consumption.
  • hbase.hregion.majorcompaction: Interval for periodic major compactions.
  • hbase.hregion.max.filesize: Maximum size for an HStoreFile; exceeding this size triggers a region split.
  • hbase.hregion.memstore.flush.size: Maximum size of the MemStore in bytes; triggers a flush when exceeded.
  • hbase.hregion.memstore.mslab.enabled: Prevents heap fragmentation during heavy writes.
  • hbase.hstore.blockingStoreFiles: Blocks writes when a store has more than this many store files, until compaction completes.
  • hbase.hstore.compaction.max: Maximum number of files to compact in a single minor compaction (default: 7).
  • hbase.hstore.compactionThreshold: Triggers a compaction when the store file count reaches this number.
  • zookeeper.session.timeout: Session timeout, in milliseconds, for HBase daemons/clients talking to ZooKeeper.
  • zookeeper.znode.parent: Root znode path for HBase in ZooKeeper (default: /hbase).
  • dfs.support.append: Ensures durable sync on HDFS so the write-ahead log persists edits.
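For instance, a couple of these parameters set together in hbase-site.xml might look like the following — an illustrative sketch only, not tuned recommendations:

<property>
<name>zookeeper.session.timeout</name>
<value>60000</value>
<description>60 seconds, expressed in milliseconds</description>
</property>
<property>
<name>hbase.hregion.majorcompaction</name>
<value>86400000</value>
<description>Run major compactions once a day (milliseconds)</description>
</property>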

📊 Monitoring HBase Operations

Importance of Monitoring

Effective monitoring is crucial in production environments to quickly identify issues without sifting through extensive logs. Metrics are essential for understanding the state and behavior of the system.

Collecting and Graphing Metrics

  • Collection Frameworks: These collect metrics generated by HBase and store them for later use, often rolling them up for summaries.
  • Graphing Tools: Visualize the collected data, allowing operators to quickly assess system performance and identify thresholds.

Recommended Tools

  1. Ganglia

    • A distributed monitoring framework widely used for monitoring clusters.
    • To configure HBase to output metrics to Ganglia, modify the hadoop-metrics.properties file with the following configuration:
    hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
    hbase.period=10
    hbase.servers=GMETADHOST_IP:PORT

  2. File-Based Metrics

    • HBase can also log metrics to a flat file, although this method is less effective for real-time monitoring due to its cumbersome nature.
    • Configuration example for file logging:
    hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
    hbase.period=10
    hbase.fileName=/tmp/metrics_hbase.log

Conclusion on Operations

As developers, understanding both application development and operational aspects of HBase clusters is critical. Monitoring, performance testing, and efficient configuration are key to ensuring that HBase can run effectively in production environments.

🛠️ Application-Side Monitoring

“Monitoring HBase from the application perspective is crucial to ensure that the entire application stack is functioning well, not just the system-level metrics.”

Importance of Application-Side Monitoring

  • Application-side monitoring complements system-level monitoring tools like Ganglia.
  • Custom implementations are often necessary, as there are no standardized templates from the HBase community yet.

Key Metrics to Monitor

  • Client performance per RegionServer: Performance metrics as seen by the client for each RegionServer.
  • Scan performance per RegionServer: Scan performance metrics from the client’s viewpoint for every RegionServer.
  • Connectivity to RegionServers: The application’s connectivity to all RegionServers.
  • Network latencies: Network latencies between the application and the HBase cluster.
  • Concurrent client connections: The number of concurrent clients accessing HBase at any time.
  • ZooKeeper connectivity: Connectivity to ZooKeeper, required for reliable operation.

Performance Expectations

  • Response Time: Critical for user satisfaction; users should ideally receive responses in milliseconds.
  • Testing and Tuning: It is essential to test performance thoroughly and tune the HBase cluster to meet user expectations.

🚀 Performance Testing of HBase

“Performance testing is vital to ensure that your HBase cluster operates within expected Service Level Agreements (SLAs).”

Methods for Performance Testing

  1. Real Workload Testing: Simulate real application workloads to evaluate performance.
  2. PerformanceEvaluation Tool:
    • Bundled with HBase for evaluating performance across various operations.
    • Based on concepts from the original Bigtable paper.

Using the PerformanceEvaluation Tool

  • Basic Command: To run the tool without arguments:

    $ $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation

  • Command Usage:

    Usage: java org.apache.hadoop.hbase.PerformanceEvaluation [--miniCluster] [--nomapred] [--rows=ROWS] <command> <nclients>

Key Options

  • --miniCluster: Run the test on an HBaseMiniCluster.
  • --nomapred: Use threads instead of MapReduce for multiple clients.
  • --rows=ROWS: Specify the number of rows each client runs. Default is one million.

Command Examples

  • filterScan: Run a scan test using a filter to select specific rows.
  • randomRead: Conduct random-read tests.
  • sequentialWrite: Run sequential-write tests.
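Putting the options and commands together, a hedged example invocation — the row and client counts here are arbitrary — that runs a sequential-write test from 4 client threads, 100,000 rows each, without MapReduce:

$ $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=100000 sequentialWrite 4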

📈 YCSB – Yahoo! Cloud Serving Benchmark

“YCSB is a standard benchmarking tool designed for performance comparison of different databases, including HBase.”

Overview of YCSB

  • Purpose: Used to measure and compare the performance of distributed databases.
  • Components:
    • YCSB Client: An extensible workload generator.
    • Core Workloads: Prepackaged workloads for performance testing.

Setting Up YCSB

  1. Clone the repository:

    $ git clone git://github.com/brianfrankcooper/YCSB.git

  2. Compile the code:

    $ cd YCSB
    $ mvn -DskipTests package

  3. Configuration:

    • Place your HBase cluster configuration in hbase/src/main/conf/hbase-site.xml.
    • Include the hbase.zookeeper.quorum property.

Running Workloads with YCSB

  1. Create the HBase table:

    hbase(main):002:0> create 'mytable', 'myfamily'

  2. Load the workload:

    $ bin/ycsb load hbase -P workloads/workloada -p columnfamily=myfamily -p table=mytable
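After the load phase, the transaction phase is started with the run command — a sketch with an arbitrarily chosen thread count:

$ bin/ycsb run hbase -P workloads/workloada -p columnfamily=myfamily -p table=mytable -threads 8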

Features of YCSB

  • Configurable multiple clients and threads.
  • Ability to run mixed workloads with different statistical distributions of data.

🛠️ Adding Nodes to HBase

“Adding nodes to an HBase cluster requires careful configuration and setup.”

Steps for Adding Nodes

  • SSH Password-Free Login: Ensure SSH access without requiring a password.
  • Copy Hadoop and HBase: Transfer the necessary files to the new node.
  • Start the New Slave Node: Initialize the new node in the cluster.

🗂️ Cluster Management in HBase

Modifying Hostname

To modify the hostname of a newly added node, follow these steps:

  1. Log in to the server of the newly added node.
  2. Edit the hosts file:
    vi /etc/hosts
  3. Create a user for Hadoop:
    useradd hadoop
    passwd hadoop

Setting Up SSH Password-Free Login

  1. Log in to the master node.
  2. Use the following command to set up password-free SSH login:
    ssh-copy-id slave80
    • Note: The first time, you will need to enter the password.
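Note that ssh-copy-id assumes the master node already has a key pair; if it doesn’t, generate one first — a standard sketch, using the empty passphrase typical for automation:

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
ssh-copy-id slave80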

Copying Hadoop and HBase

  1. Copy Hadoop to the slave node:
    • Navigate to the Hadoop installation directory and use:
    scp -r hadoop_directory slave80:/path/to/destination
  2. Copy HBase to the slave node:
    • Similarly, navigate to the HBase installation directory and use:
    scp -r hbase_directory slave80:/path/to/destination

Increasing RegionServers Information

  • Add the new host information to the regionservers file (see the example below).
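For example, assuming the new node’s hostname is slave80 as above:

echo "slave80" >> $HBASE_HOME/conf/regionservers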

Starting a New Node

To start the RegionServer on the slave node:

  1. Log in to the slave80 node.
  2. Navigate to the HBase bin directory:
    cd $HBASE_HOME/bin
  3. Start the RegionServer:
    ./hbase-daemon.sh start regionserver

Deleting a Node

Methods to delete a node:

  • Forceful kill: Use jps to query the process number of the RegionServer and kill it with kill -9 <process_number>. Note: this method may cause data loss and is not recommended.
  • Plain stop: Use the command hbase-daemon.sh stop regionserver. This method may also cause data loss.
  • Graceful stop: Execute graceful_stop.sh in $HBASE_HOME/bin. This method is preferred as it ensures data integrity.

Process introduction for the graceful method (a shell-level sketch follows):

  1. Stop load balancing first.
  2. Move the regions on the node to other nodes.
  3. Shut down the service.
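For reference, the balancer toggle is an HBase shell command, and graceful_stop.sh handles the region moves and shutdown — a sketch with an illustrative hostname:

hbase(main):001:0> balance_switch false
$ $HBASE_HOME/bin/graceful_stop.sh slave80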

Rolling Start

As data in the cluster increases, a rolling start allows for server restarts without downtime. This is achieved using HBase’s graceful_stop.sh script.

Command for Rolling Start:

cd $HBASE_HOME/bin
./graceful_stop.sh --restart --reload --debug <node_name>

Parameter Introduction:

  • --restart: Restart the RegionServer after its regions have been moved off.
  • --reload: Move the regions back onto the server after it restarts.
  • --debug: Output information about the execution process.
  • <node_name>: Specify which node to stop.

Shell Script for Rolling Start:

#!/bin/sh
for i in `cat $HBASE_HOME/conf/regionservers | sort`; do
  $HBASE_HOME/bin/graceful_stop.sh --restart --reload --debug $i
done

Data Export

Using the CopyTable Command

  • Before executing CopyTable, create the destination table.
  • Example command to copy a table:

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=tableCopy srcTable

  • Cross-cluster table copy:

hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.adr=dstClusterZK:2181:/hbase srcTable

Important Notes:

  • Load balancing is automatically turned off after performing this operation.

Command Help and Parameters:

To view help for CopyTable, use:

./hbase org.apache.hadoop.hbase.mapreduce.CopyTable -help

Example Usage for Copying Tables:

To copy ‘TestTable’ to a cluster that uses replication, for a 1-hour window:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=server1,server2,server3:2181:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable

Data Import

Data import can be batched through the API, but this is less efficient. It is recommended to generate the data in HFile format and bulk-load it directly into HBase for better performance.
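Once HFiles have been generated (for example by the MapReduce job shown below), they can also be loaded from the command line with the standard bulk-load tool — a sketch with a placeholder path and table name:

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /path/to/hfiles mytable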

Example Format for Input Data:

12333	value1	value2	value3
43434	value4	value5	value6
78788	value7	value8	value9

  • The first field is the row key; the subsequent fields are column values, separated by tabs.

Map Class Example:

public static class Map extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {

// Field separator for the input file; the sample data above is tab-delimited
private static final String SEPARATOR = "\t";

protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
String[] strArr = value.toString().split(SEPARATOR);
// Reverse the row key to spread sequential keys across regions
ImmutableBytesWritable rowkey = new ImmutableBytesWritable(StringUtils.reverse(strArr[0]).getBytes());
List<KeyValue> list = createKeyValue(value.toString());
Iterator<KeyValue> it = list.iterator();
while (it.hasNext()) {
KeyValue kv = it.next();
if (kv != null) {
context.write(rowkey, kv);
}
}
}

private List<KeyValue> createKeyValue(String str) {
List<KeyValue> list = new ArrayList<KeyValue>();
String[] strArr = str.split(SEPARATOR);
// Create one KeyValue per column value and add it to the list
// (Implementation details omitted for brevity)
return list;
}
}

🏗️ HBase Key Concepts

KeyValue Format

The KeyValue format is essential for storing data in HBase. The constructor for the KeyValue class can be defined in two ways:

public KeyValue(final byte[] row, final byte[] family,
final byte[] qualifier, final byte[] value)

or

public KeyValue(final byte[] row, final byte[] family,
final byte[] qualifier, final long timestamp, Type type)
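As a hedged illustration of the first constructor — the row, family, and qualifier values below are placeholders, not from the original:

byte[] row = "12333".getBytes();
KeyValue kv = new KeyValue(row, "myfamily".getBytes(), "c1".getBytes(), "value1".getBytes());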

Main Function Structure

The main function is structured as follows:

public static void main(String[] args) throws Exception {
Configuration conf = HBaseConfiguration.create();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
conf.set("hbase.zookeeper.property.clientPort", "2181");
conf.set("hbase.zookeeper.quorum", "master,slave2,slave1");
conf.set("hbase.master", "192.168.2.81:60000");
// Configure the MapReduce job that writes HFiles
Job job = Job.getInstance(conf);
job.setJarByClass(CarWish.class);
job.setMapOutputKeyClass(ImmutableBytesWritable.class);
job.setMapOutputValueClass(KeyValue.class);
job.setMapperClass(Map.class);
job.setReducerClass(KeyValueSortReducer.class);
job.setOutputFormatClass(HFileOutputFormat2.class);
// Clear the output path if it already exists
FileSystem fs = FileSystem.get(URI.create("/"), conf);
Path outPath = new Path(otherArgs[1]);
if (fs.exists(outPath)) {
fs.delete(outPath, true);
}
FileInputFormat.setInputPaths(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, outPath);
// Let HFileOutputFormat2 set up partitioning to match the table's regions
HTable table = new HTable(conf, TABLENAME);
HFileOutputFormat2.configureIncrementalLoad(job, table);
// After the job completes, bulk-load the generated HFiles into the table
if (job.waitForCompletion(true)) {
LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
loader.doBulkLoad(outPath, table);
}
System.exit(0);
}

Troubleshooting Tools

HBase Consistency Checks

HBase provides the HBCK tool to check data consistency. Key aspects of data consistency include:

  • Each region must be correctly assigned to a RegionServer, with accurate location information.
  • Each table should be complete, ensuring every possible rowkey corresponds to exactly one region.
  • Proper usage of query-related command collections.

Command to invoke HBCK:

$HBASE_HOME/bin/hbase hbck -help

Metadata Repair Options

The HBCK tool includes several options for metadata repair, which should be used with caution:

  • -fix: Tries to fix region assignments (kept for backward compatibility).
  • -fixAssignments: Fixes region assignments; replaces the old -fix.
  • -fixMeta: Fixes meta problems, assuming the region info in HDFS is good.
  • -noHdfsChecking: Skips loading/checking region info from HDFS.
  • -fixHdfsHoles: Fixes region holes in HDFS.
  • -fixHdfsOrphans: Fixes region directories with no .regioninfo file.
  • -fixTableOrphans: Fixes table directories with no .tableinfo file (online mode only).
  • -fixHdfsOverlaps: Fixes region overlaps in HDFS.
  • -fixVersionFile: Fixes a missing hbase.version file in HDFS.
  • -maxMerge <n>: Merges at most <n> regions (default n=5).
  • -sidelineBigOverlaps: Sidelines big overlaps when fixing.
  • -maxOverlapsToSideline <n>: Sidelines at most <n> regions per group (default n=2).
  • -fixSplitParents: Forces offline split parents back online.
  • -ignorePreCheckPermission: Ignores the filesystem permission pre-check.
  • -fixReferenceFiles: Offlines lingering reference store files.
  • -fixEmptyMetaCells: Fixes hbase:meta entries that do not reference any region.

Example Commands

  • To fix inconsistency in regions:

$HBASE_HOME/bin/hbase hbck -fix

  • To repair the meta table:

$HBASE_HOME/bin/hbase hbck -fixMeta

  • To reassign regions to RegionServers:

$HBASE_HOME/bin/hbase hbck -fixAssignments

Log Monitoring

To view the running status of HBase, check the logs, located by default in:

$HBASE_HOME/logs/
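Daemon log files follow the pattern hbase-<user>-<daemon>-<hostname>.log; for example, tailing a RegionServer log (the user and hostname here are illustrative):

tail -f $HBASE_HOME/logs/hbase-hadoop-regionserver-slave80.log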

Cluster Types

Common types of HBase clusters include:

  • Prototype Cluster: No strict SLAs.
  • Small Production Cluster
  • Medium Production Cluster
  • Large Production Cluster

Hadoop Components

The Hadoop architecture includes:

  • Master Processes:
    • NameNode
    • Secondary NameNode
    • JobTracker
  • Slave Nodes:
    • DataNodes
    • HBase RegionServers

Deploying HBase

Deploying HBase requires:

  • Configuring Linux
  • Configuring Hadoop
  • Configuring HBase

Recommendations are available based on experience from various production deployments.

Monitoring Clusters

Use the Ganglia framework, a distributed monitoring tool designed for cluster monitoring.

Performance Testing Tools

Different methods to test HBase cluster performance include:

  • PerformanceEvaluation Tool: Bundled with HBase.
  • YCSB: Yahoo! Cloud Serving Benchmark.