Ansible is an agentless provisioning and deployment tool. We provide an Ansible playbook, a custom module, and config file templates to deploy Apache Hadoop Ozone to an arbitrarily large cluster.
All you need is SSH access to the cluster as the root user and Ansible installed on a control node. The control node may be one of the nodes in your cluster or an external node.
A template hosts.yaml file is provided. It contains a list of hosts grouped by role. The format of the included template file is (hopefully) self-documenting.
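Before running the playbook, it can help to confirm that the control node reaches every host in the inventory. The sketch below is not part of the provided tooling: check_cluster is a hypothetical helper that wraps Ansible's built-in ping module, and it assumes the inventory is hosts.yaml in the current directory and that you connect as root.

```shell
# Hypothetical helper (not shipped with the playbook): verify that Ansible
# can log in to every host listed in the inventory.
check_cluster() {
  inventory="${1:-hosts.yaml}"   # assumed inventory path
  if ! command -v ansible >/dev/null 2>&1; then
    echo "ansible is not installed on this control node" >&2
    return 127
  fi
  # The ping module only checks login and a usable Python on each host;
  # it changes nothing on the target.
  ansible --inventory="$inventory" all -m ping -u root
}
```

Calling check_cluster hosts.yaml reports, host by host, whether the connection succeeded.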
Edit the conf/settings.yaml template file. At the very least, you will need to specify the Ozone tarball file name and the JAVA_HOME location.
For production clusters, or any time you care about data persistence or performance, you should also edit the storage and metadata directory locations.
ansible-playbook -i hosts.yaml install-ozone.yaml -v
The -v option enables verbose logging, which is useful for debugging when things go wrong.
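Some problems can be caught before any host is touched. The following is a minimal sketch, not part of the provided tooling: preflight is a hypothetical helper that uses the standard ansible-playbook options --syntax-check and --list-hosts, and it assumes the file names hosts.yaml and install-ozone.yaml used above.

```shell
# Hypothetical helper: validate the playbook and show the targeted hosts
# without executing any task.
preflight() {
  inventory="${1:-hosts.yaml}"
  playbook="${2:-install-ozone.yaml}"
  for f in "$inventory" "$playbook"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      return 1
    fi
  done
  # Parse the playbook only; stop here if it does not parse.
  ansible-playbook -i "$inventory" "$playbook" --syntax-check || return
  # List the hosts each play would run against, again without running tasks.
  ansible-playbook -i "$inventory" "$playbook" --list-hosts
}
```

If both steps look right, run the full installation command as shown above.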
Ansible allows running ad-hoc commands on cluster hosts. A few useful examples:
It is useful to verify the generated ozone-site.xml before doing a full installation. The file can be generated with the following command:
ansible-playbook -i hosts.yaml install-ozone.yaml -v --tags "generate"
The output file will be available as conf/auto_generated/ozone-site.xml.
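One way to review the generated file quickly is to print each property as a name and value pair. The sketch below is an illustration only: list_site_properties is a hypothetical helper, and it assumes the usual Hadoop-style layout in which every name and value element fits on a single line.

```shell
# Hypothetical helper: print "name = value" for each property in a
# Hadoop-style configuration file.
list_site_properties() {
  file="${1:-conf/auto_generated/ozone-site.xml}"
  awk '
    # Remember the most recent property name (strip <name> and </name>).
    match($0, /<name>[^<]*<\/name>/) {
      name = substr($0, RSTART + 6, RLENGTH - 13)
    }
    # Print the pair when the value is seen (strip <value> and </value>).
    match($0, /<value>[^<]*<\/value>/) {
      print name " = " substr($0, RSTART + 7, RLENGTH - 15)
    }
  ' "$file"
}
```

Run list_site_properties with no argument to read the default output path, or pass another file to inspect it instead. Values that span several lines are not handled by this sketch.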
Ozone services can be stopped with the following command, which simply kills all processes running as the hdfs user.
ansible --inventory=hosts.yaml all -m shell -a 'pkill -u hdfs'
To see which services are running, the following command lists the Java process names on each host, filtering out the jps process itself.
ansible --inventory=hosts.yaml all -m shell -a 'jps | grep -ivw [j]ps'
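The grep filter in that command can be tried locally without a cluster. The sketch below feeds it made-up jps output; filter_jps is a hypothetical name, and the PIDs and process names are illustrative only.

```shell
# Same filter as in the ad-hoc command, applied to sample input.
# -i ignores case, -w matches whole words, -v drops the matching lines,
# so the "Jps" entry for the jps tool itself is removed.
filter_jps() {
  grep -ivw '[j]ps'
}

# Illustrative input only; real process names and PIDs will differ.
printf '%s\n' '4021 OzoneManagerStarter' '4188 Jps' '4103 HddsDatanodeService' | filter_jps
```

Only the OzoneManagerStarter and HddsDatanodeService lines are printed, because the Jps line is the only one containing jps as a whole word.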
Apache®, Apache Hadoop, Hadoop®, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.