How to install HDP 3.1 cluster on AWS

The purpose of this document is to provide a detailed guide to installing the HDP 3.1 distribution on AWS using Apache Ambari. We will install HDP 3.1 on a three-node cluster of Amazon Linux 2 instances. The official documentation for installing HDP 3.1 with Ambari can be found on this page: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/ch_Getting_Ready.html

1. Preparing AWS instances

First, we will launch three Amazon Linux 2 instances. Connect to your AWS console, go to EC2, click Launch Instance, and select the Amazon Linux 2 AMI.

For the purpose of this article we will choose the t3a.2xlarge instance type with 8 vCPUs and 32 GB of memory (you could also choose t3a.xlarge if you want), with at least 30 GB of disk space.

Configure Security Group: for the purpose of this article we will allow all traffic.
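The console steps above could also be scripted with the AWS CLI. The following is only a sketch, assuming the AWS CLI is installed and configured; the AMI ID, key pair name and security group ID are hypothetical placeholders that do not come from this article. The function prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Print (not run) an `aws ec2 run-instances` command for the three nodes.
# The three arguments are placeholders you must supply yourself; only the
# instance type, the instance count and the disk size come from the article.
print_launch_command() {
    ami_id=$1      # an Amazon Linux 2 AMI ID valid in your region
    key_name=$2    # name of an existing EC2 key pair
    sg_id=$3       # ID of the security group configured above
    printf '%s\n' "aws ec2 run-instances --image-id $ami_id --count 3 --instance-type t3a.2xlarge --key-name $key_name --security-group-ids $sg_id --block-device-mappings DeviceName=/dev/xvda,Ebs={VolumeSize=30}"
}

# Example with made-up identifiers
print_launch_command ami-EXAMPLE my-key sg-EXAMPLE
```

The root device name /dev/xvda is an assumption about the AMI; check the AMI details if the volume size is not applied.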

 

2. Preparing environments

 

First, we will connect over SSH to the three nodes of the cluster. The environment preparation commands are the same on all three nodes, so it is strongly advised to use a terminal that can send the same commands to three hosts at the same time.

Connect as root to the three nodes of your cluster (one of the nodes will be the Ambari Server node)

ssh root@{server-ip-address}
 

Check the number of open files and authorized processes:

ulimit -n -u

Increase the limits manually for the current session (2¹⁵ and 2¹⁶)

ulimit -n 32768
ulimit -u 65536

The permanent solution is to modify the following file:

vi /etc/security/limits.conf

Add the following two lines to the file:

root - nofile 32768
root - nproc 65536
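Since the same change has to be made on all three nodes, it may be convenient to script it rather than edit the file by hand. The sketch below is one possible approach: the add_limits function name is hypothetical, and the demonstration runs on a scratch file rather than on the real /etc/security/limits.conf.

```shell
#!/bin/sh
# Append the two limits to a limits.conf-style file, skipping any line that
# is already present so the function can safely be run more than once.
add_limits() {
    limits_file=$1
    for entry in "root - nofile 32768" "root - nproc 65536"; do
        # -x matches whole lines only, -F treats the entry as a fixed string
        grep -qxF -e "$entry" "$limits_file" 2>/dev/null ||
            printf '%s\n' "$entry" >> "$limits_file"
    done
}

# Demonstration on a scratch file; on a real node you would pass
# /etc/security/limits.conf (as root) instead.
scratch=$(mktemp)
add_limits "$scratch"
add_limits "$scratch"   # the second call adds nothing
cat "$scratch"
rm -f "$scratch"
```

The new limits apply to sessions opened after the change, so log out and back in before checking them with ulimit.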

Install and configure NTP (ntpd: Network Time Protocol daemon)

yum install ntp -y
vim /etc/ntp.conf

Add the following line (169.254.169.123 is the address of the Amazon Time Sync Service):

server 169.254.169.123 prefer iburst

Start the service

systemctl start ntpd

Check that the service is running and that it is communicating with the NTP servers (-p prints the list of peers)

systemctl status ntpd
ntpq -p
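In the ntpq -p output, the peer that ntpd has currently selected as its time source is marked with a leading asterisk. A small helper can turn that into a scriptable check. This is a sketch only; the has_selected_peer name is hypothetical.

```shell
#!/bin/sh
# Succeeds when the `ntpq -p` output read from stdin contains a peer line
# starting with '*', the tally code for the currently selected time source.
has_selected_peer() {
    grep -q '^\*'
}

# Usage on a node (ntpd may need a few minutes after start to select a peer):
#   ntpq -p | has_selected_peer && echo "time source selected"
# To start ntpd again after a reboot:
#   systemctl enable ntpd
```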

Hostname configuration

Check hosts file

vi /etc/hosts

On each node, add its IP address and hostname to the file as follows. Be careful not to delete the two existing lines.

ip FQDN
1.2.3.4 <fully.qualified.domain.name>

On each node, check the network configuration file:

vi /etc/sysconfig/network

Modify the HOSTNAME property to set the fully qualified domain name of the host.

NETWORKING=yes
HOSTNAME=<fully.qualified.domain.name>

Check the fully qualified domain name with the following command:

hostname -f
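Ambari identifies hosts by their FQDN, so it can help to verify the result of hostname -f in a script before going further. The sketch below uses a simple heuristic (at least two dot-separated labels); it does not query DNS, and the is_fqdn name is hypothetical.

```shell
#!/bin/sh
# Succeeds if the given name looks fully qualified: it contains at least one
# dot and has no empty label. This is a heuristic, not a DNS validation.
is_fqdn() {
    case $1 in
        ""|.*|*.|*..*) return 1 ;;   # empty, leading or trailing dot, empty label
        *.*)           return 0 ;;   # at least two labels
        *)             return 1 ;;   # short name such as "node1"
    esac
}

# Usage on a node:
#   is_fqdn "$(hostname -f)" || echo "hostname -f is not fully qualified" >&2
# On systemd-based systems the name can also be set with:
#   hostnamectl set-hostname <fully.qualified.domain.name>
```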

Firewall and SELinux configuration: disable any firewall service and check SELinux

Check the SELinux mode; the result should be Permissive or Disabled, not Enforcing

getenforce

Check that SELinux is disabled in its configuration file

cat /etc/selinux/config
SELINUX=disabled
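The check on the configuration file can be automated across the three nodes. The sketch below reads the SELINUX= value from a file in the /etc/selinux/config format; both function names are hypothetical.

```shell
#!/bin/sh
# Print the value of the SELINUX= setting from a file in the
# /etc/selinux/config format. Comment lines and SELINUXTYPE= are ignored
# because the pattern only matches lines starting with "SELINUX=".
selinux_config_mode() {
    sed -n 's/^SELINUX=//p' "$1"
}

# Succeeds unless the configured mode is "enforcing".
selinux_config_ok() {
    [ "$(selinux_config_mode "$1")" != "enforcing" ]
}

# Usage on a node:
#   selinux_config_ok /etc/selinux/config || echo "SELinux is set to enforcing" >&2
# To switch the running system to permissive mode until the next reboot:
#   setenforce 0
```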

UMASK (User Mask or User file creation MASK) sets the default permissions or base permissions granted when a new file or folder is created on a Linux machine.

Most Linux distros set 022 as the default umask value. A umask value of 022 yields default permissions of 755 for new folders and 644 for new files. A umask value of 027 yields 750 for new folders and 640 for new files.

Ambari, HDP, and HDF support umask values of 022 (0022 is functionally equivalent) and 027 (0027 is functionally equivalent). These values must be set on all hosts.

Setting the umask for your current login session:

umask 0022

Checking your current umask

umask

Permanently changing the umask for all interactive users:

echo umask 0022 >> /etc/profile
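To see what a given umask actually produces, the following sketch creates one file and one directory under that umask and prints their modes. It assumes GNU stat (as found on Linux), and the show_default_modes name is hypothetical.

```shell
#!/bin/sh
# Create one file and one directory under the given umask and print the
# resulting octal modes. The function body is a subshell, so the caller's
# own umask is left untouched.
show_default_modes() (
    umask "$1"
    dir=$(mktemp -d) || exit 1
    touch "$dir/file"
    mkdir "$dir/subdir"
    # stat -c %a prints the permission bits in octal (GNU coreutils)
    printf 'file=%s dir=%s\n' "$(stat -c %a "$dir/file")" "$(stat -c %a "$dir/subdir")"
    rm -rf "$dir"
)

show_default_modes 0022
show_default_modes 0027
```

With the two values supported by Ambari, this prints file=644 dir=755 and then file=640 dir=750.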

Install MySQL server on each host

yum install mysql-connector-java -y
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
rpm -ivh mysql-community-release-el7-5.noarch.rpm
yum update -y
yum install mysql-server -y

The configuration of the hosts is now finished.

3. Installing Ambari Server and Ambari Agents

We will first set up the Ambari package repository

Get the Ambari Repo :

cd /etc/yum.repos.d/
wget -nv http://public-repo-1.hortonworks.com/ambari/amazonlinux2/2.x/updates/2.7.3.0/ambari.repo

Update the yum cache

yum makecache

Check that the Ambari repo is imported (an Ambari repository must appear in the list)

yum repolist

On one host, install ambari-server

yum install ambari-server

Configure Ambari with the ambari-server setup command and follow the instructions

ambari-server setup

Set the path to the MySQL JDBC driver for Ambari Server

ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar

Start Ambari Server

ambari-server start

On the two other hosts, install ambari-agent

yum install ambari-agent

On these two hosts, configure and start ambari-agent

Using a text editor, configure the Ambari Agent by editing the ambari-agent.ini file as shown in the following:

vi /etc/ambari-agent/conf/ambari-agent.ini

Modify the following properties:

[server]
hostname=<your.ambari.server.hostname>
url_port=8440
secured_url_port=8441
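The same edit could be applied non-interactively on each agent host. The sketch below uses GNU sed to change hostname= only inside the [server] section. The set_agent_server name, the sample file content and the ambari.example.internal hostname are made up for the demonstration.

```shell
#!/bin/sh
# Set hostname= inside the [server] section of an ambari-agent.ini-style
# file, leaving hostname keys in any other section alone.
set_agent_server() {
    ini_file=$1
    server_host=$2
    # The address range restricts the substitution to the lines from
    # "[server]" up to the next section header. Requires GNU sed for -i.
    sed -i "/^\[server\]/,/^\[/ s/^hostname=.*/hostname=${server_host}/" "$ini_file"
}

# Demonstration on a scratch file with made-up content; on a real host the
# file is /etc/ambari-agent/conf/ambari-agent.ini.
scratch=$(mktemp)
printf '%s\n' '[server]' 'hostname=localhost' 'url_port=8440' \
    'secured_url_port=8441' '[agent]' 'hostname=unchanged' > "$scratch"
set_agent_server "$scratch" ambari.example.internal
cat "$scratch"
rm -f "$scratch"
```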

Start the agent on every host in your cluster.

ambari-agent start

The agent registers with the Server on start.

The installation of Ambari is now complete. Connect to the Ambari web UI on port 8080 of the Ambari Server host to install the different Hadoop components.

Log in to Apache Ambari at http://<your.ambari.server.hostname>:8080 with username admin and password admin
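Before starting the wizard, you may want to confirm that the agents have registered with the server. One way is to query the hosts resource of the Ambari REST API. The sketch below assumes curl is installed and that the default admin/admin credentials are still in place; both function names are hypothetical.

```shell
#!/bin/sh
# Build the URL of the Ambari REST endpoint that lists registered hosts.
ambari_hosts_url() {
    printf 'http://%s:8080/api/v1/hosts\n' "$1"
}

# Query that endpoint with the default admin/admin credentials and print
# the JSON response.
list_registered_hosts() {
    curl -s -u admin:admin "$(ambari_hosts_url "$1")"
}

# Usage:
#   list_registered_hosts <your.ambari.server.hostname>
```

Each registered agent should appear in the response under its FQDN.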

Launch the Ambari Cluster install wizard

Give a name to your cluster

Select version: HDP 3.1.0

Remove repositories you will not use (keep only amazonlinux2)

Add hosts to the installation with their FQDN

As we already installed and configured Ambari Agents select: “Perform manual registration on hosts and do not use SSH”

 

Click Next

All checks should pass

Customize your install

Choose the services which will be deployed

Assign Slaves and Clients

Set a password for each service

4. Configure MySQL database for Ranger and Ranger KMS

Start the mysql service and configure the root password

systemctl start mysqld
mysql_secure_installation

Configure a Ranger DB: MySQL

A MySQL/Oracle/PostgreSQL/Amazon RDS database instance must be running and available to be used by Ranger.

When using MySQL, the storage engine used for the Ranger admin policy store tables MUST support transactions (InnoDB does; MyISAM does not).

The MySQL database administrator should be used to create the Ranger databases. The following series of commands could be used to create the rangeradmin user with password rangeradmin.

a. Log in as the root user, then use the following commands to create the rangeradmin user and grant it adequate privileges.

Connect to mysql as root:
mysql -u root -pyourPasswd

Create the Ranger user, grant it privileges, and reload the privilege tables

CREATE USER 'rangeradmin'@'localhost' IDENTIFIED BY 'rangeradmin';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost';
CREATE USER 'rangeradmin'@'%' IDENTIFIED BY 'rangeradmin';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;

Use the exit command to exit MySQL.

You should now be able to reconnect to the database as rangeradmin using the following command and create the ranger database:

mysql -u rangeradmin -prangeradmin
CREATE DATABASE ranger;
exit;

Configure Ranger KMS DB: MySQL

Connect to mysql as root:
mysql -u root -pYourpsswd
CREATE USER 'rangerkms'@'localhost' IDENTIFIED BY 'rangerkms';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'localhost';
CREATE USER 'rangerkms'@'%' IDENTIFIED BY 'rangerkms';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
Exit

Connect to mysql as the rangerkms user and create the rangerkms database

mysql -u rangerkms -prangerkms
CREATE DATABASE rangerkms;
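The user-creation statements for rangeradmin and rangerkms differ only in the user name and password, so they can be generated from a small helper. This is a sketch, with a hypothetical service_user_sql function. It emits only the GRANT statements that carry WITH GRANT OPTION, since those grant the same privileges as the plain GRANT statements that precede them in the listings above.

```shell
#!/bin/sh
# Print the SQL that creates a service user reachable from localhost and
# from any host ('%'), with all privileges and the grant option.
service_user_sql() {
    db_user=$1
    db_pass=$2
    for host in localhost %; do
        printf "CREATE USER '%s'@'%s' IDENTIFIED BY '%s';\n" "$db_user" "$host" "$db_pass"
        printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'%s' WITH GRANT OPTION;\n" "$db_user" "$host"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

# Print the statements for both users so they can be reviewed
service_user_sql rangeradmin rangeradmin
service_user_sql rangerkms rangerkms

# To apply instead of print, pipe the output into the mysql client:
#   service_user_sql rangerkms rangerkms | mysql -u root -p
```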

Add privileges to the root user for the other hosts

Connect to mysql as root:
mysql -u root -pYourPassWD
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname1' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname2' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname3' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;

Flush privileges

FLUSH PRIVILEGES;
