Describes important hardware architecture considerations for your cluster.
When planning the hardware architecture for the cluster, make sure all hardware meets the
node requirements listed in Preparing
Each Node.
The architecture of the cluster hardware is an important consideration when planning a
deployment. Among the considerations are anticipated data storage and network bandwidth needs,
including intermediate data generated when jobs and applications are executed. The type of
workload is also important. Consider whether the planned cluster usage will be CPU-intensive,
I/O-intensive, or memory-intensive. Think about how data will be loaded into and out of the
cluster, and how much data is likely to be transmitted over the network.
Planning a cluster often involves tuning key ratios, such as:
- Disk I/O speed to CPU processing power
- Storage capacity to network speed
- Number of nodes to network speed
Typically, the CPU is less of a bottleneck than network bandwidth and disk I/O. To the extent
possible, balance network and disk transfer rates to meet the anticipated data rates using
multiple NICs per node. It is not necessary to bond or trunk the NICs together. The HPE Ezmeral Data Fabric can take advantage of multiple NICs transparently. Each
node should provide raw disks to the
data-fabric, with no RAID or logical volume manager, as the data-fabric takes care of formatting and data
protection.
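As a rough planning aid, the following Python sketch compares aggregate per-node drive throughput with aggregate NIC throughput to indicate whether a candidate node is likely to be network-bound or disk-bound. The per-drive throughput figures and the 0.7-1.3 "balanced" band are illustrative assumptions, not measured values; substitute numbers for your own hardware.
```python
# Rough node-balance check: is a candidate node network-bound or disk-bound?
# All throughput figures are illustrative assumptions; replace them with
# measured or vendor-supplied numbers for your hardware.
HDD_MB_S = 150       # assumed sustained throughput of one SATA HDD
SSD_MB_S = 450       # assumed sustained throughput of one SATA SSD
GBE_TO_MB_S = 125    # 1 Gb/s of network bandwidth ~= 125 MB/s

def node_balance(hdds=12, ssds=0, nic_gbe=(10,)):
    """Compare aggregate drive throughput with aggregate NIC throughput."""
    disk_mb_s = hdds * HDD_MB_S + ssds * SSD_MB_S
    net_mb_s = sum(nic_gbe) * GBE_TO_MB_S
    ratio = disk_mb_s / net_mb_s
    verdict = "roughly balanced"
    if ratio > 1.3:
        verdict = "network-bound: add NICs or reduce drives per node"
    elif ratio < 0.7:
        verdict = "disk-bound: add drives or reduce network bandwidth per node"
    return disk_mb_s, net_mb_s, verdict

# Example: 12 HDDs against two 10GbE interfaces (no bonding required).
disk, net, verdict = node_balance(hdds=12, nic_gbe=(10, 10))
print(f"disk ~{disk} MB/s vs network ~{net} MB/s -> {verdict}")
```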
The following example architecture provides specifications for a recommended standard data-fabric Hadoop compute/storage node
for general purposes. This configuration is highly scalable in a typical data center
environment. The HPE Ezmeral Data Fabric can make effective use of more
drives per node than standard Hadoop, so each node should present enough faceplate area to
allow a large number of drives.
Standard Compute/Storage Node
- Dual CPU socket system board
- 2 x 8-core CPUs (16 physical cores, 32 logical cores with hyperthreading enabled)
- 8 x 8GB DIMMs, 64GB RAM (DIMM count must be a multiple of the number of CPU memory channels)
- 12x2TB SATA drives
- 10GbE network interface
- OS on a dedicated single drive, not shared as a data drive
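As a quick sanity check on a spec like the one above, the following Python sketch verifies the DIMM-count rule: the total DIMM count must be an exact multiple of the total number of memory channels. The channels_per_cpu value is an assumption; take the real value from the CPU vendor's datasheet.
```python
def dimms_ok(dimm_count: int, sockets: int, channels_per_cpu: int) -> bool:
    """True if the DIMM count is an exact multiple of the total memory channels."""
    total_channels = sockets * channels_per_cpu
    return dimm_count % total_channels == 0

# Example node above: dual-socket board with 8 DIMMs.
# channels_per_cpu=4 is an assumption; confirm it from the CPU datasheet.
print(dimms_ok(dimm_count=8, sockets=2, channels_per_cpu=4))  # True
```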
Minimum Cluster Size
All data-fabric production clusters must have a minimum of four data nodes, except for MapR Edge.
A data node is defined as a node running a FileServer process that is responsible for
storing data on behalf of the entire cluster. Having additional nodes deployed with
control-only services such as CLDB and ZooKeeper is recommended, but they do not count
toward the minimum node total because they do not contribute to the overall availability of
data.
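To make the distinction concrete, the following Python sketch classifies nodes by the services they run and checks the four-data-node minimum. The cluster layout shown is a hypothetical example.
```python
# Check the minimum-cluster-size rule: at least four data nodes, where a
# data node runs a FileServer process. Control-only nodes (for example,
# CLDB or ZooKeeper only) do not count toward the minimum.
cluster = {
    "node1": {"fileserver", "cldb", "zookeeper"},
    "node2": {"fileserver", "zookeeper"},
    "node3": {"fileserver", "zookeeper"},
    "node4": {"fileserver"},
    "node5": {"cldb"},  # control-only node: does not count toward the minimum
}

data_nodes = [name for name, services in cluster.items() if "fileserver" in services]
assert len(data_nodes) >= 4, "production clusters need at least four data nodes"
print(f"{len(data_nodes)} data nodes; four-node minimum satisfied")
```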
Note: The following special considerations apply to clusters of 10 nodes or fewer:
- Erasure coding and rolling updates are not supported for clusters of four nodes or
fewer.
- Erasure coding is not recommended for five- and six-node clusters. See the
Important note in Erasure Coding Scheme for Data Protection and Recovery.
- Dedicated control nodes are not needed on clusters with fewer than 10 data nodes.
- As the cluster size is reduced, each individual node has a larger proportional impact
on cluster performance. As cluster size drops below 10 nodes, especially during times of
failure recovery, clusters can begin to exhibit variable performance depending on the
workload, network and storage I/O speed, and the amount of data being
re-replicated.
- For information about fault tolerance, see Priority 1 - Maximize Fault Tolerance and
Cluster Design Objectives.
To maximize fault tolerance in the design of your cluster, see Example Cluster Designs.
Best Practices
Hardware recommendations and cluster configuration vary by use case. For example, is the
application an HPE Ezmeral Data Fabric Database application? Is the
application latency-sensitive?
The following recommendations apply in most cases:
- Disk Drives
- Drives should be JBOD, using single-drive RAID0 volumes to take advantage of the
controller cache.
- SSDs are recommended when using HPE Ezmeral Data Fabric Database JSON with
secondary indexes. HDDs can be used with secondary indexes only if the performance
requirements are thoroughly understood. Performance can be substantially impaired
on HDDs because of high levels of disordered I/O requests. SSDs are not needed for
using HPE Ezmeral Data Fabric Event Store.
- SAS drives can provide better I/O latency; SSDs provide even lower latency.
- Match aggregate drive throughput to network throughput. For example, 20GbE ~= 16-18 HDDs, 5-6 SSDs, or 1 NVMe drive; see the sizing sketch after this list.
- Cluster Size
- In general, it is better to have more nodes. Larger clusters recover faster from
disk failures because more nodes are available to contribute. For information about
fault tolerance, see Example Cluster Designs.
- For smaller clusters, all nodes are likely to fit on a single non-blocking
switch. Larger clusters require a well-designed Spine/Leaf fabric that can scale.
- Operating System and Server Configuration
- Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Ubuntu, CentOS, and
Oracle Enterprise Linux are supported as described in Operating System Support Matrix (Release 6.x).
- Install the minimal server configuration. Use a product like Cobbler to PXE boot and install a consistent OS image.
- Install the full JDK (version 11).
- For best performance, avoid deploying a data-fabric cluster on virtual machines. However, VMs are
supported for use as clients or edge nodes.
- Memory, CPUs, Number of Cores
- Make sure the DIMM count is an exact multiple of the number of memory channels
the selected CPU provides.
- Use CPUs with as many cores as you can. Having more cores is more important than
having a slightly higher clock speed.
- HPE Ezmeral Data Fabric Database benefits from lots of RAM: 256GB per node
or more.
- Filesystem-only nodes can have fewer, faster cores: 6 cores for the first 10GbE
of network bandwidth, and an additional 2 cores for each additional 10GbE. For
example, dual 25GbE (50GbE) filesystem-only nodes perform best with at least
6+(4*2)=14 cores, as illustrated in the sizing sketch after this list.
- Filesystem-only nodes should have hyperthreading
disabled.
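The drive-count and core-count rules of thumb above can be expressed as a small calculation. The following Python sketch is a planning aid only: the per-drive throughput figures are assumptions, and the core formula implements the 6-cores-for-the-first-10GbE, plus-2-cores-per-additional-10GbE guideline described in the list.
```python
import math

# Planning-aid sketch for the sizing rules of thumb above.
# Per-drive throughput figures are assumptions; adjust them to your hardware.
HDD_MB_S = 150       # assumed sustained HDD throughput
SSD_MB_S = 450       # assumed sustained SSD throughput
GBE_TO_MB_S = 125    # 1 Gb/s of network bandwidth ~= 125 MB/s

def drives_to_match_network(network_gbe: float, drive_mb_s: float) -> int:
    """Drives needed so aggregate drive throughput keeps the network busy."""
    return math.ceil(network_gbe * GBE_TO_MB_S / drive_mb_s)

def fileserver_cores(network_gbe: float) -> int:
    """6 cores for the first 10GbE, plus 2 cores for each additional 10GbE."""
    extra_10gbe = max(0, math.ceil(network_gbe / 10) - 1)
    return 6 + 2 * extra_10gbe

print(drives_to_match_network(20, HDD_MB_S))  # 17 HDDs for 20GbE
print(drives_to_match_network(20, SSD_MB_S))  # 6 SSDs for 20GbE
print(fileserver_cores(50))                   # dual 25GbE (50GbE) -> 14 cores
```
With the assumed throughput figures, the output falls within the 16-18 HDD and 5-6 SSD ranges quoted for 20GbE and reproduces the 14-core example for dual 25GbE filesystem-only nodes.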