🔰Configure Hadoop and start cluster services using Ansible Playbook🔰

rishabhsharma
4 min read · Dec 13, 2020

Red Hat Ansible

Ansible is an open-source automation platform used for IT tasks such as configuration management, application deployment, intra-service orchestration, and provisioning. Automation is crucial these days: IT environments are often too complex, and need to scale too quickly, for system administrators and developers to keep up if they had to do everything manually. Automation simplifies complex tasks, not only making administrators’ jobs more manageable but freeing them to focus on work that adds value to the organization. In other words, it saves time and increases efficiency, and Ansible is rapidly rising to the top in the world of automation tools. Let’s look at how it can set up a Hadoop cluster.

Hadoop Cluster

Hadoop is an Apache open-source framework, written in Java, that allows distributed processing of large datasets across clusters of computers using simple programming models. A Hadoop application runs in an environment that provides distributed storage and computation across the cluster. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.

ARTH — Task 11.1 👨‍💻

Task Description 📃

Configure Hadoop and start cluster services using Ansible Playbook

👉🏻Let’s get started…😃

🌟Controller Node🌟

My controller node has IP 172.31.43.34, and Ansible is installed on it.

Let’s check its version:
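The screenshot is not reproduced here, but the command itself is simply:

```shell
# Print the installed Ansible version and the config file in use
ansible --version
```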

🔹Inventory file: here we list the IPs of the instances we want to configure as the master node (NameNode), data nodes, and clients.
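As a sketch, an inventory grouping the nodes might look like this (the group names are illustrative and the IPs are placeholders, not values from the article):

```
# /etc/ansible/hosts  (placeholder IPs -- substitute your instance IPs)
[namenode]
<master-node-ip>

[datanode]
<data-node-ip>

[client]
<client-ip>
```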

🔹Ansible configuration file:

# vim /etc/ansible/ansible.cfg
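A minimal configuration for this setup might contain the following (the `remote_user` and key path are assumptions for a typical EC2 setup, not taken from the article):

```
[defaults]
inventory         = /etc/ansible/hosts
host_key_checking = False
remote_user       = ec2-user        # illustrative; depends on your AMI
private_key_file  = /root/mykey.pem # illustrative key path
```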

🔹List hosts:
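The hosts picked up from the inventory can be listed with:

```shell
# Show every host Ansible knows about from the inventory
ansible all --list-hosts
```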

🔹Now let’s check the connectivity:
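Connectivity to every managed node can be verified with the ping module:

```shell
# Each reachable node should reply with "pong"
ansible all -m ping
```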

🌟In Ansible Playbook :

🔹To configure the Hadoop cluster we need the JDK and Hadoop software, so first we copy and install these packages on all the target nodes:
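A sketch of the corresponding tasks — the exact package file names (jdk-8u171, hadoop-1.2.1) are assumptions based on the Hadoop 1.x setup this task commonly uses, not confirmed by the article:

```yaml
- hosts: all
  tasks:
    # Copy the JDK and Hadoop RPMs from the controller to each node
    - name: Copy software packages
      copy:
        src: "{{ item }}"
        dest: /root/
      loop:
        - jdk-8u171-linux-x64.rpm      # assumed file name
        - hadoop-1.2.1-1.x86_64.rpm    # assumed file name

    # Install the JDK first, since Hadoop depends on Java
    - name: Install JDK
      shell: rpm -ivh /root/jdk-8u171-linux-x64.rpm
      ignore_errors: yes   # skip failure if already installed

    # Hadoop 1.2.1 typically needs --force with this JDK pairing
    - name: Install Hadoop
      shell: rpm -ivh /root/hadoop-1.2.1-1.x86_64.rpm --force
      ignore_errors: yes
```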

🔹var.yml file, which we include in our playbook:
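The variable file might define something like this (all names and values are illustrative placeholders):

```yaml
# var.yml -- adjust the values to your setup
namenode_ip: <master-node-ip>
namenode_dir: /nn          # used as dfs.name.dir on the NameNode
datanode_dir: /dn          # used as dfs.data.dir on the DataNodes
namenode_port: 9001        # port used in fs.default.name
```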

🔶Configuring the NameNode:
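The NameNode play could look roughly like this — a sketch following a Hadoop 1.x layout; the template file names are assumptions:

```yaml
- hosts: namenode
  vars_files:
    - var.yml
  tasks:
    - name: Create the NameNode storage directory
      file:
        path: "{{ namenode_dir }}"
        state: directory

    - name: Deploy hdfs-site.xml
      template:
        src: hdfs-site-nn.xml     # assumed template name
        dest: /etc/hadoop/hdfs-site.xml

    - name: Deploy core-site.xml
      template:
        src: core-site.xml
        dest: /etc/hadoop/core-site.xml

    # Format only on first setup -- this wipes existing metadata
    - name: Format the NameNode
      shell: echo Y | hadoop namenode -format

    - name: Start the NameNode daemon
      shell: hadoop-daemon.sh start namenode
```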

🔹hdfs-site.xml file of the NameNode:
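Since it is deployed as a template, the storage directory can come from var.yml. Property names follow Hadoop 1.x:

```xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>{{ namenode_dir }}</value>
  </property>
</configuration>
```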

🔶Configuring the DataNode:
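The DataNode play mirrors the NameNode play, minus the formatting step (again a sketch with assumed template names):

```yaml
- hosts: datanode
  vars_files:
    - var.yml
  tasks:
    - name: Create the DataNode storage directory
      file:
        path: "{{ datanode_dir }}"
        state: directory

    - name: Deploy hdfs-site.xml
      template:
        src: hdfs-site-dn.xml     # assumed template name
        dest: /etc/hadoop/hdfs-site.xml

    - name: Deploy core-site.xml
      template:
        src: core-site.xml
        dest: /etc/hadoop/core-site.xml

    - name: Start the DataNode daemon
      shell: hadoop-daemon.sh start datanode
```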

🔹hdfs-site.xml file of the DataNode:
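The DataNode side uses the matching Hadoop 1.x property for its local storage directory:

```xml
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>{{ datanode_dir }}</value>
  </property>
</configuration>
```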

🔶Configuring the Client:
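The client only needs core-site.xml pointing at the NameNode, so its play is shorter (again a sketch):

```yaml
- hosts: client
  vars_files:
    - var.yml
  tasks:
    - name: Deploy core-site.xml
      template:
        src: core-site.xml
        dest: /etc/hadoop/core-site.xml
```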

🔹core-site.xml file for all nodes:
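With the Hadoop 1.x property names and the variables from var.yml, the template could be:

```xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://{{ namenode_ip }}:{{ namenode_port }}</value>
  </property>
</configuration>
```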

🔹Now let’s run our playbook:

# ansible-playbook hadoop.yml

We can clearly see that one datanode, contributing around 10 GB of storage, has been successfully added to the namenode (master). To grow the cluster further, we just add the IPs of more instances to the inventory file and rerun the playbook.
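The cluster state, including the capacity contributed by each datanode, can be checked from any configured node with Hadoop 1.x’s admin report:

```shell
# Report configured capacity and the list of live datanodes
hadoop dfsadmin -report
```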

TASK COMPLETED Successfully✌🏻👨🏻‍💻

Thanks for reading !!!😊✨

🔰Keep Learning ❗❗🔰Keep Sharing ❗❗

