This document is a detailed guide to installing the HDP 3.1 distribution on AWS using Apache Ambari. We will install HDP 3.1 on a three-node cluster of Amazon Linux 2 instances. The official documentation for installing HDP 3.1 with Ambari can be found on this page: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/ch_Getting_Ready.html
1. Preparing AWS instances
First we will launch three Amazon Linux 2 instances. Connect to your AWS console, go to EC2, click Launch Instance and select the Amazon Linux 2 AMI.
For the purpose of this article we will choose the t3a.2xlarge instance type with 8 vCPUs and 32 GB of memory (you could also choose t3a.xlarge), with at least 30 GB of disk space.
Configure Security Group: for the purpose of this article we will allow all traffic.
2. Preparing environments
First, we will connect over SSH to the 3 nodes of the cluster. The environment preparation commands are the same on all 3 nodes, so it is strongly advised to use a terminal that can send the same commands to 3 different hosts at the same time.
Connect as root on the 3 nodes of your cluster (1 of the nodes will be the Ambari Server node)
ssh root@{server-ip-address}
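If your terminal cannot broadcast input to several sessions, a small loop over the hosts is a workable substitute. The sketch below runs a command on each node one after the other rather than at the same time. The host names, the function name run_on_all and the REMOTE_SHELL variable are assumptions made for illustration, not part of the original procedure.

```shell
# Hypothetical host list: replace with the addresses of your three nodes.
HOSTS="node1.example.internal node2.example.internal node3.example.internal"

# Run one command on every host in turn and print a header before each output.
# REMOTE_SHELL is an assumption of this sketch so the transport can be swapped
# (it defaults to ssh).
run_on_all() {
    for host in $HOSTS; do
        echo "== $host =="
        ${REMOTE_SHELL:-ssh} "root@$host" "$@"
    done
}
```

For example, run_on_all hostname -f would print the fully qualified name reported by each node.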
Get the hostname of each host and save it in your notes
hostname
Check the limits on open files and user processes:
ulimit -n -u
Increase the limits manually for the current session (2¹⁵ and 2¹⁶)
ulimit -n 32768
ulimit -u 65536
The permanent solution is to modify the following file:
vi /etc/security/limits.conf
Add the following two lines to the file (the second field is a plain hyphen)
root - nofile 32768
root - nproc 65536
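Since the same two lines have to be added on every node, it can help to make the edit repeatable. The following is a minimal sketch, assuming the domain is a plain user name such as root. The function name ensure_limit is hypothetical. It appends an entry only when no entry for that domain and item exists yet.

```shell
# Append "<domain> - <item> <value>" to a limits file unless an entry for
# that domain and item is already present. ensure_limit is a hypothetical name.
ensure_limit() {
    file=$1
    domain=$2
    item=$3
    value=$4
    if grep -Eq "^[[:space:]]*${domain}[[:space:]]+-[[:space:]]+${item}[[:space:]]" "$file" 2>/dev/null; then
        echo "limit for ${domain}/${item} already present in ${file}"
    else
        printf '%s - %s %s\n' "$domain" "$item" "$value" >> "$file"
    fi
}
```

On a node this could be called as ensure_limit /etc/security/limits.conf root nofile 32768, then again with nproc 65536. Running it twice leaves the file unchanged.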
Install and configure NTP (ntpd: Network Time Protocol daemon)
yum install ntp -y
vim /etc/ntp.conf
Add the following line
server 169.254.169.123 prefer iburst
Start the service
systemctl start ntpd
Check that the service is running and that it is communicating with the NTP servers (-p prints the list of peers)
systemctl status ntpd
ntpq -p
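In the ntpq -p listing, the peer that ntpd is currently synchronized to is marked with a leading asterisk. The helper below is a small sketch for scripting that check. The name synced_peer is hypothetical. It reads the listing on standard input, so it can be fed from ntpq -p directly.

```shell
# Read `ntpq -p` output on stdin and print the peer marked with "*",
# which is the source ntpd is currently synchronized to.
# The exit status is non-zero when no peer has been selected yet.
synced_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 } END { exit !found }'
}
```

A typical call would be ntpq -p | synced_peer. Right after starting the service it may fail for a short while, until ntpd has selected a source.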
Hostname configuration
Check hosts file
vi /etc/hosts
On each node, add its IP address and hostname to the file as follows.
Be careful not to delete the two existing lines.
ip FQDN
1.2.3.4 <fully.qualified.domain.name>
On each node, check the network configuration file:
vi /etc/sysconfig/network
Modify the HOSTNAME property to set the fully qualified domain name of the host.
NETWORKING=yes
HOSTNAME=<fully.qualified.domain.name>
Check the fully qualified domain name with the following command
hostname -f
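Ambari expects each host to report a fully qualified name, so it is worth checking the result rather than only reading it. The following sketch uses a deliberately simple test: a name passes when it has at least two non-empty labels separated by dots. The function name is_fqdn is hypothetical, and the check does not validate DNS resolution.

```shell
# Succeed when the name has at least two dot-separated, non-empty labels.
# This is a format check only; it does not query DNS.
is_fqdn() {
    case $1 in
        .*|*.|*..*) return 1 ;;   # an empty label somewhere
        *.*)        return 0 ;;   # at least one dot between labels
        *)          return 1 ;;   # short name such as "node1", or empty
    esac
}
```

It can be combined with the command above, for example: is_fqdn "$(hostname -f)" || echo "hostname -f does not return an FQDN" >&2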
iptables configuration: disable the firewalls and stop the service
Check the SELinux mode, which should be Permissive or Disabled
getenforce
Check that SELinux is disabled in its configuration file
cat /etc/selinux/config
SELINUX=disabled
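If the configuration file still shows another mode, the SELINUX= line has to be changed. The helper below is a sketch of that edit. The name set_selinux_mode and its optional file argument are hypothetical, and it relies on GNU sed's -i option.

```shell
# Rewrite the SELINUX= line of a config file to the requested mode and
# print the resulting line. The SELINUXTYPE= line is left untouched.
# set_selinux_mode and its optional file argument are hypothetical.
set_selinux_mode() {
    mode=$1
    file=${2:-/etc/selinux/config}
    sed -i "s/^SELINUX=.*/SELINUX=${mode}/" "$file"
    grep '^SELINUX=' "$file"
}
```

Calling set_selinux_mode disabled edits the default file; the change applies after the next reboot, while setenforce 0 switches a running system to permissive mode straight away. If firewalld is installed on your instances, systemctl stop firewalld and systemctl disable firewalld stop it and keep it from starting at boot.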
UMASK (user mask, or user file-creation mask) sets the default permissions granted when a new file or folder is created on a Linux machine.
Most Linux distros set 022 as the default umask value. A umask value of 022 yields permissions of 755 for new folders and 644 for new files. A umask value of 027 yields 750 for new folders and 640 for new files.
Ambari, HDP, and HDF support umask values of 022 (0022 is functionally equivalent) and 027 (0027 is functionally equivalent). These values must be set on all hosts.
Setting the umask for your current login session:
umask 0022
Checking your current umask
umask
Permanently changing the umask for all interactive users:
echo umask 0022 >> /etc/profile
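To see what a umask value does in practice, we can create a file and a folder under each supported value and read back their modes. This is a small sketch; the function name show_umask_effect is hypothetical, and it uses GNU stat as found on Amazon Linux 2. It works in a throwaway directory inside a subshell, so the umask of your session is not changed.

```shell
# Create one file and one directory under a given umask and print their modes.
# Runs in a subshell so the caller's umask and working directory are untouched.
show_umask_effect() {
    (
        umask "$1"
        tmpdir=$(mktemp -d) || exit 1
        cd "$tmpdir" || exit 1
        touch file
        mkdir dir
        echo "umask $1: file=$(stat -c %a file) dir=$(stat -c %a dir)"
        cd / && rm -rf "$tmpdir"
    )
}

show_umask_effect 022
show_umask_effect 027
```

The first call reports file=644 dir=755 and the second file=640 dir=750: the mask bits are removed from 666 for files and from 777 for folders.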
Install the MySQL connector and MySQL server on each host
yum install mysql-connector-java -y
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
rpm -ivh mysql-community-release-el7-5.noarch.rpm
yum update -y
yum install mysql-server -y
The configuration of the hosts is now finished.
3. Installing Ambari Server and Ambari Agents
We will first download the Ambari packages
Get the Ambari repo:
cd /etc/yum.repos.d/
wget -nv http://public-repo-1.hortonworks.com/ambari/amazonlinux2/2.x/updates/2.7.3.0/ambari.repo
Update the yum cache
yum makecache
Check that the Ambari repo is imported (an Ambari repository must appear in the list)
yum repolist
On one host, install ambari-server
yum install ambari-server
Configure Ambari with the ambari-server setup command and follow the instructions
ambari-server setup
Set the path to the MySQL JDBC driver for Ambari Server
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
Start Ambari Server
ambari-server start
On the two other hosts, install ambari-agent
yum install ambari-agent
On these two hosts, configure and start ambari-agent
Using a text editor, configure the Ambari Agent by editing the ambari-agent.ini file as shown in the following:
vi /etc/ambari-agent/conf/ambari-agent.ini
Modify the [server] section as follows
[server]
hostname=<your.ambari.server.hostname>
url_port=8440
secured_url_port=8441
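The same edit can be scripted when there are several agents to configure. The sketch below rewrites only the hostname= line inside the [server] section and leaves the ports as they are. The function name set_agent_server and its optional file argument are hypothetical, and GNU sed is assumed.

```shell
# Point an agent configuration file at the Ambari server host.
# Only the hostname= line inside the [server] section is rewritten.
# set_agent_server and its optional file argument are hypothetical.
set_agent_server() {
    server=$1
    ini=${2:-/etc/ambari-agent/conf/ambari-agent.ini}
    sed -i "/^\[server\]/,/^\[/ s/^hostname=.*/hostname=${server}/" "$ini"
    grep '^hostname=' "$ini"
}
```

On each agent host this would be called with the FQDN of the Ambari Server node that you noted earlier.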
Start the agent on every host in your cluster.
ambari-agent start
The agent registers with the Server on start.
The installation of Ambari is now complete. Connect to the Ambari web UI on port 8080 of the Ambari Server host to install the different Hadoop components.
Log in to Apache Ambari at http://<your.ambari.server.hostname>:8080 with username admin and password admin
Launch the Ambari Cluster install wizard
Give a name to your cluster
Select version: HDP 3.1.0
Remove repositories you will not use (keep only amazonlinux2)
Add hosts to the installation with their FQDN
As we already installed and configured Ambari Agents select: “Perform manual registration on hosts and do not use SSH”

Next
All checks should pass
Customize your install
Choose the services which will be deployed
Assign Slaves and Clients
Set a password for each service
4. Configure MySQL database for Ranger and Ranger KMS
Start the mysql service and configure the root password
systemctl start mysqld
mysql_secure_installation
Configure a Ranger DB: MySQL
A MySQL/Oracle/PostgreSQL/Amazon RDS database instance must be running and available to be used by Ranger.
When using MySQL, the storage engine used for the Ranger admin policy store tables MUST support transactions.
The MySQL database administrator should be used to create the Ranger databases. The following series of commands could be used to create the rangeradmin user with password rangeradmin.
a. Log in as the root user, then use the following commands to create the rangeradmin user and grant it adequate privileges.
Connect to mysql as root:
mysql -u root -pyourPasswd
Create the Ranger user, grant its privileges and flush them
CREATE USER 'rangeradmin'@'localhost' IDENTIFIED BY 'rangeradmin';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost';
CREATE USER 'rangeradmin'@'%' IDENTIFIED BY 'rangeradmin';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
Use the exit command to exit MySQL.
You should now be able to reconnect to the database as rangeradmin using the following command and create the ranger database:
mysql -u rangeradmin -prangeradmin
CREATE DATABASE ranger;
exit;
Configure Ranger KMS DB: MySQL
Connect to mysql as root:
mysql -u root -pYourpsswd
CREATE USER 'rangerkms'@'localhost' IDENTIFIED BY 'rangerkms';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'localhost';
CREATE USER 'rangerkms'@'%' IDENTIFIED BY 'rangerkms';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
exit;
Connect to mysql using the rangerkms user
mysql -u rangerkms -prangerkms
CREATE DATABASE rangerkms;
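The Ranger and Ranger KMS steps differ only in the user name, password and database name, so the statements can be generated instead of typed twice. The following is a sketch; the function name service_db_sql is hypothetical. It only prints SQL. Unlike the manual steps above, the database would be created by whichever account the output is piped to, not by the new user.

```shell
# Print the statements that create one service account and its database.
# Arguments: user name, password, database name (all placeholders).
# service_db_sql is a hypothetical name; nothing is executed here.
service_db_sql() {
    user=$1
    password=$2
    database=$3
    cat <<EOF
CREATE USER '${user}'@'localhost' IDENTIFIED BY '${password}';
CREATE USER '${user}'@'%' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON *.* TO '${user}'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO '${user}'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
CREATE DATABASE ${database};
EOF
}
```

Review the output first, then apply it with something like service_db_sql rangerkms rangerkms rangerkms | mysql -u root -p, which prompts for the root password.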
Grant privileges to the root user when it connects from the other hosts
Connect to mysql as root:
mysql -u root -pYourPassWD
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname1' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname2' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname3' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
Flush privileges
FLUSH PRIVILEGES;