This document is a step-by-step guide to installing the HDP 3.1 distribution on AWS using Apache Ambari. We will install HDP 3.1 on a three-node cluster of Amazon Linux 2 instances. The official documentation for installing HDP 3.1 with Ambari can be found on this page: https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/bk_ambari-installation/content/ch_Getting_Ready.html
1. Preparing AWS instances
First, we will launch three Amazon Linux 2 instances. Log in to your AWS console, go to EC2, click Launch Instance, and select the Amazon Linux 2 AMI.

For the purpose of this article we will choose the t3a.2xlarge instance type with 8 vCPUs and 32 GB of memory (you could choose t3a.xlarge too if you want), with at least 30 GB of disk space.
Configure Security Group: For the purpose of this article we will allow all traffic
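For readers who prefer the command line to the console, the launch described above could be sketched with the AWS CLI. This is only a sketch under assumptions: the AMI ID, key pair name, and security group ID are placeholders you must supply, and the root device name /dev/xvda is assumed rather than taken from this article. The function prints the command instead of running it, so it can be reviewed first.

```shell
# Print (not run) an AWS CLI command that launches the three instances.
# The AMI ID, key pair and security group are placeholders (assumptions);
# /dev/xvda as the root device name is also an assumption.
launch_cmd() {
  printf 'aws ec2 run-instances --image-id %s --count 3 --instance-type t3a.2xlarge --key-name %s --security-group-ids %s --block-device-mappings %s\n' \
    "$1" "$2" "$3" "'DeviceName=/dev/xvda,Ebs={VolumeSize=30}'"
}

# Example with placeholder values; copy the printed command once it looks right.
launch_cmd ami-EXAMPLE my-key sg-EXAMPLE
```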

2. Preparing environments
First, we will connect over SSH to the 3 nodes of the cluster. The environment preparation commands are the same on all 3 nodes, so it is strongly advised to use a terminal that can send the same commands to 3 different hosts at the same time.
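If no multi-session terminal is at hand, a plain loop over ssh is one possible alternative. The sketch below assumes three hypothetical host names and root logins as used in this article; with DRY_RUN=1 it only prints what it would run.

```shell
# Hypothetical host list: replace with the addresses of your three nodes.
HOSTS="node1.example.com node2.example.com node3.example.com"

# Run one command on every host in $HOSTS over ssh.
# With DRY_RUN=1 the ssh invocations are printed instead of executed.
run_on_all() {
  for host in $HOSTS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "ssh root@$host $*"
    else
      ssh "root@$host" "$@"
    fi
  done
}

# Preview the commands without connecting anywhere.
DRY_RUN=1 run_on_all hostname -f
```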
Connect as root on the 3 nodes of your cluster (1 of the nodes will be the Ambari Server node)
ssh root@{server-ip-address}
Get the hostname of each host and save it in your notes
hostname
Check the limits on open files and user processes:
ulimit -n -u
Increase the limits manually for the current session (2¹⁵ and 2¹⁶)
ulimit -n 32768
ulimit -u 65536
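To confirm that a session really has the limits above, a small check can compare the current values with the targets. This is a minimal sketch; the helper names limit_ok and report_limit are made up for illustration.

```shell
# Succeed when a current limit meets the required minimum.
# "unlimited" always satisfies the requirement.
limit_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ] 2>/dev/null
}

# Print the current value of one ulimit flag and whether it is high enough.
# If the shell does not support the flag, the value is reported as 0.
report_limit() {  # report_limit <ulimit flag letter> <required value>
  current=$(ulimit "-$1" 2>/dev/null) || current=0
  if limit_ok "$current" "$2"; then
    echo "ulimit -$1: $current (ok)"
  else
    echo "ulimit -$1: $current (below $2)"
  fi
}

report_limit n 32768   # open files
report_limit u 65536   # user processes
```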
The permanent solution is to modify the following file:
vi /etc/security/limits.conf
Add the following two lines to the file
root - nofile 32768
root - nproc 65536
Install and configure ntp (ntpd: Network Time Protocol daemon)
yum install ntp -y
vim /etc/ntp.conf
Add the Amazon Time Sync Service as the preferred server
server 169.254.169.123 prefer iburst
Start the service
systemctl start ntpd
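Editing configuration files by hand on three nodes is easy to get wrong, so the "add a line" steps above could be scripted with a helper that appends a line only when it is missing. This is a sketch, not part of the original procedure; ensure_line is a hypothetical name, and the demo writes to a scratch file rather than /etc/ntp.conf.

```shell
# Append a line to a file only if that exact line is not already present,
# so the step can be re-run without creating duplicates.
ensure_line() {  # ensure_line <file> <line>
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demo on a scratch file; on a real node the target would be /etc/ntp.conf.
conf=$(mktemp)
ensure_line "$conf" "server 169.254.169.123 prefer iburst"
ensure_line "$conf" "server 169.254.169.123 prefer iburst"   # second call is a no-op
cat "$conf"
```

The same helper would fit the /etc/security/limits.conf and /etc/profile edits in this section.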
Check that the service is running and that it is communicating with the NTP servers (-p prints the list of peers)
systemctl status ntpd
ntpq -p
Hostname configuration
Check the hosts file
vi /etc/hosts
On each node, add its IP address and hostname to the file as follows
Be careful not to delete the two existing lines
ip FQDN
1.2.3.4 <fully.qualified.domain.name>
On each node, check the network configuration file:
vi /etc/sysconfig/network
Modify the HOSTNAME property to set the fully qualified domain name of the host.
NETWORKING=yes
HOSTNAME=<fully.qualified.domain.name>
Check the fully qualified domain name with the following command
hostname -f
iptables configuration: disable firewalls and stop the service
Check the SELinux mode: it should be Permissive or Disabled, not Enforcing
getenforce
Check that SELinux is disabled in its configuration file
cat /etc/selinux/config
SELINUX=disabled
UMASK (User Mask or User file creation MASK) sets the default permissions or base permissions granted when a new file or folder is created on a Linux machine.
Most Linux distros set 022 as the default umask value. A umask value of 022 grants permissions of 755 for new folders and 644 for new files. A umask value of 027 grants permissions of 750 for new folders and 640 for new files.
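The arithmetic behind these values can be checked directly: the umask bits are cleared from 777 for directories and from 666 for regular files. A small sketch (perms_for is a hypothetical helper name):

```shell
# Print the modes that new directories and files get under a given umask.
# Directories start from 777 and files from 666; the umask bits are cleared.
perms_for() {  # perms_for <octal umask, e.g. 022>
  mask=$(( 0$1 ))   # leading 0 makes the shell read the value as octal
  printf 'umask %s: dir=%03o file=%03o\n' "$1" \
    $(( 0777 & ~$mask )) $(( 0666 & ~$mask ))
}

perms_for 022   # umask 022: dir=755 file=644
perms_for 027   # umask 027: dir=750 file=640
```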
Ambari, HDP, and HDF support umask values of 022 (0022 is functionally equivalent) and 027 (0027 is functionally equivalent). These values must be set on all hosts.
Setting the umask for your current login session
umask 0022
Checking your current umask
umask
Permanently changing the umask for all interactive users:
echo umask 0022 >> /etc/profile
Install MySQL server on each host
yum install mysql-connector-java -y
wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
rpm -ivh mysql-community-release-el7-5.noarch.rpm
yum update -y
yum install mysql-server -y
The configuration of the hosts is now finished.
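Before moving on, the checks from this section can be gathered into a short preflight script to run on each node. This is a sketch under assumptions: the function names are made up, and the FQDN test only checks the shape of the name (at least two dot-separated labels), not that it resolves.

```shell
# Succeed when SELinux is not enforcing (getenforce prints Permissive or Disabled).
selinux_ok() {
  case "$1" in
    Permissive|Disabled) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed when the name looks like an FQDN: two or more dot-separated labels.
is_fqdn() {
  printf '%s\n' "$1" | grep -Eq \
    '^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)+$'
}

# Print a warning for every host setting that does not match this guide.
preflight() {
  if command -v getenforce >/dev/null 2>&1 && ! selinux_ok "$(getenforce)"; then
    echo "WARN: SELinux is enforcing"
  fi
  is_fqdn "$(hostname -f 2>/dev/null)" || echo "WARN: hostname -f is not an FQDN"
  case "$(umask)" in
    0022|022|0027|027) ;;
    *) echo "WARN: unsupported umask $(umask)" ;;
  esac
}
```

Calling preflight on a node prints nothing when all three checks pass.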
3. Installing Ambari Server and Ambari Agents
We will first download the ambari packages
Get the Ambari repo:
cd /etc/yum.repos.d/
wget -nv http://public-repo-1.hortonworks.com/ambari/amazonlinux2/2.x/updates/2.7.3.0/ambari.repo
Update the yum cache
yum makecache
Check that the Ambari repo is imported (ambari-server must appear in the list)
yum repolist
On one host, install ambari-server
yum install ambari-server
Configure Ambari with the ambari-server setup command and follow the instructions
ambari-server setup
Set the path to the MySQL JDBC driver for Ambari Server
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
Start Ambari Server
ambari-server start
On the two other hosts, install ambari-agent
yum install ambari-agent
On these two hosts, configure and start ambari-agent
Using a text editor, configure the Ambari Agent by editing the ambari-agent.ini file as shown in the following:
vi /etc/ambari-agent/conf/ambari-agent.ini
Modify the following
[server]
hostname=<your.ambari.server.hostname>
url_port=8440
secured_url_port=8441
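The same edit can be made without opening an editor, which helps when both agent hosts are configured at once. The sketch below rewrites only the hostname key inside the [server] section. The function name is hypothetical, the sample file (including its [agent] section) is invented for the demo, and ambari.example.com stands in for your Ambari Server FQDN.

```shell
# Rewrite the hostname key inside the [server] section of an agent ini file.
# Keys named hostname in other sections are left untouched.
set_agent_server() {  # set_agent_server <ini file> <ambari server FQDN>
  tmp=$(mktemp) &&
  sed "/^\[server\]/,/^\[/ s/^hostname=.*/hostname=$2/" "$1" > "$tmp" &&
  cat "$tmp" > "$1" &&   # overwrite in place so file permissions are kept
  rm -f "$tmp"
}

# Demo on a scratch file; the real file is /etc/ambari-agent/conf/ambari-agent.ini.
ini=$(mktemp)
printf '[server]\nhostname=localhost\nurl_port=8440\nsecured_url_port=8441\n\n[agent]\nhostname=unchanged\n' > "$ini"
set_agent_server "$ini" ambari.example.com
cat "$ini"
```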
Start the agent on every host in your cluster.
ambari-agent start
The agent registers with the Server on start.
The installation of Ambari is now complete. Connect to the Ambari web UI, which listens on port 8080 of the Ambari Server host, to install the different Hadoop components.
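Ambari Server can take a little while to answer after it starts. One way to wait for it, sketched here with curl, is to poll port 8080 until it responds. The function name, the retry count, and the delay are arbitrary choices for illustration.

```shell
# Poll a URL until it answers successfully or the attempts run out.
wait_for_http() {  # wait_for_http <url> [attempts] [delay in seconds]
  attempts=${2:-30}
  delay=${3:-5}
  while [ "$attempts" -gt 0 ]; do
    curl -fsS -o /dev/null --max-time 5 "$1" 2>/dev/null && return 0
    attempts=$(( $attempts - 1 ))
    sleep "$delay"
  done
  return 1
}

# Usage on your own cluster (placeholder host name):
#   wait_for_http "http://<your.ambari.server.hostname>:8080/" && echo "Ambari is up"
```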
Log in to Apache Ambari at http://<your.ambari.server.hostname>:8080 with username admin and password admin
Launch the Ambari Cluster install wizard
Give a name to your cluster
Select version: HDP 3.1.0
Remove repositories you will not use (keep only amazonlinux2)
Add hosts to the installation with their FQDN
As we already installed and configured Ambari Agents select: “Perform manual registration on hosts and do not use SSH”

Next
All checks should pass
Customize your install
Choose the services which will be deployed
Assign Slaves and Clients
Set a password for each service
4. Configure MySQL database for Ranger and Ranger KMS
Start the mysql service and configure the root password
systemctl start mysqld
mysql_secure_installation
Configure a Ranger DB: MySQL
A MySQL/Oracle/PostgreSQL/Amazon RDS database instance must be running and available to be used by Ranger.
When using MySQL, the storage engine used for the Ranger admin policy store tables MUST support transactions
The MySQL database administrator should be used to create the Ranger databases. The following series of commands could be used to create the rangeradmin user with password rangeradmin.
a. Log in as the root user, then use the following commands to create the rangeradmin user and grant it adequate privileges.
Connect to mysql as root:
mysql -u root -pyourPasswd
Create the Ranger user, grant it privileges, and refresh the privileges
CREATE USER 'rangeradmin'@'localhost' IDENTIFIED BY 'rangeradmin';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost';
CREATE USER 'rangeradmin'@'%' IDENTIFIED BY 'rangeradmin';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangeradmin'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
Use the exit command to exit MySQL.
You should now be able to reconnect to the database as rangeradmin using the following command and create the ranger database:
mysql -u rangeradmin -prangeradmin
CREATE DATABASE ranger;
exit;
Configure Ranger KMS DB: MySQL
Connect to mysql as root:
mysql -u root -pYourpsswd
CREATE USER 'rangerkms'@'localhost' IDENTIFIED BY 'rangerkms';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'localhost';
CREATE USER 'rangerkms'@'%' IDENTIFIED BY 'rangerkms';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'localhost' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'rangerkms'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
exit
Connect to mysql as the rangerkms user
mysql -u rangerkms -prangerkms
CREATE DATABASE rangerkms;
Add privileges for the root user from the other hosts
Connect to mysql as root:
mysql -u root -pYourPassWD
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname1' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname2' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
GRANT ALL PRIVILEGES ON *.* TO 'root'@'hostname3' IDENTIFIED BY 'YourPassWD' WITH GRANT OPTION;
Flush privileges
FLUSH PRIVILEGES;
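The rangeradmin and rangerkms sequences above differ only in the account name, so the statements could be generated instead of typed twice. The helper below is a sketch with a hypothetical name; it prints the same CREATE USER and GRANT statements used in this section (keeping only the WITH GRANT OPTION form of each grant) so they can be reviewed before being sent to MySQL.

```shell
# Print the SQL that creates a MySQL account reachable from localhost and
# from any host, mirroring the statements used above for Ranger.
service_user_sql() {  # service_user_sql <user> <password>
  for host in localhost %; do
    printf "CREATE USER '%s'@'%s' IDENTIFIED BY '%s';\n" "$1" "$host" "$2"
    printf "GRANT ALL PRIVILEGES ON *.* TO '%s'@'%s' WITH GRANT OPTION;\n" "$1" "$host"
  done
  echo "FLUSH PRIVILEGES;"
}

# Review the generated statements for the Ranger KMS account.
service_user_sql rangerkms rangerkms
```

Once the output looks right, it can be piped into the client, for example `service_user_sql rangerkms rangerkms | mysql -u root -p`.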