
How To Create a Highly Available Multi-node Hadoop Cluster on CentOS/RHEL 7/8


This tutorial will show you how to set up a highly available multi-node Hadoop cluster on CentOS/RHEL servers. Please note that this guide is specifically written for CentOS and RHEL releases 7 and 8.

Prerequisites

To follow this tutorial, you will need three (physical or virtual) machines installed with CentOS/RHEL release 7 or 8, each with a non-root user that has sudo privileges.

We will use the following three servers for this guide:

Name            IP              Purpose
master-node     192.168.10.1    Master Node
worker-node1    192.168.10.2    Worker Node 1
worker-node2    192.168.10.3    Worker Node 2

You should set the hostname on each node using the below commands, making sure you replace the hostname with your own:

sudo hostnamectl set-hostname master-node

sudo hostnamectl set-hostname worker-node1
sudo hostnamectl set-hostname worker-node2

Also, set the correct timezone on each node using the below command:
sudo timedatectl set-timezone Asia/Karachi

Update Hosts File

For the nodes to communicate with each other by name, you need to map each IP address to its hostname.

Edit the /etc/hosts file on each node:

sudo vi /etc/hosts

Remove everything, then add your nodes' IPs and names like below:
192.168.10.1    master-node
192.168.10.2 worker-node1
192.168.10.3 worker-node2

Save and close file when you are finished.

Adding Hadoop User

You need a user with sudo privileges for hadoop installation and configuration on each node.

Type below command to create a user called hadoop:

sudo adduser -m -G wheel hadoop

Set hadoop user password with below command:

sudo passwd hadoop

This will prompt you to enter and confirm a new password.

Make sure you create the same user on each node before moving to next step.

SSH Key-Pair Authentication

The master node in hadoop cluster will use an SSH connection to connect to other nodes with key-pair authentication to actively manage the cluster. For this, we need to set up key-pair ssh authentication on each node.

Login to your master-node as the hadoop user, and generate an SSH key like below:
ssh-keygen
This will prompt you for a passphrase; make sure you leave the fields blank.

Repeat the same step on each worker node as the hadoop user. When you are finished generating ssh key-pair on all nodes, move to next step.

Now you need to copy id_rsa.pub contents from master-node to each worker node like below:
ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
ssh-copy-id -i ~/.ssh/id_rsa.pub worker-node1
ssh-copy-id -i ~/.ssh/id_rsa.pub worker-node2

Login to worker-node1 as the hadoop user, copy id_rsa.pub contents to master-node and worker-node2 like below:

ssh-copy-id -i ~/.ssh/id_rsa.pub localhost

ssh-copy-id -i ~/.ssh/id_rsa.pub master-node
ssh-copy-id -i ~/.ssh/id_rsa.pub worker-node2

Login to worker-node2, copy id_rsa.pub contents to master-node and worker-node1 like below:
ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
ssh-copy-id -i ~/.ssh/id_rsa.pub master-node
ssh-copy-id -i ~/.ssh/id_rsa.pub worker-node1

If everything is set up correctly, you can ssh as the hadoop user from any node to any other node with key-pair authentication, without providing a password.
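As a quick check, run a remote command from master-node as the hadoop user; each of the following should print the worker's hostname without asking for a password (worker-node1 and worker-node2 are the example names used in this guide):
ssh worker-node1 hostname
ssh worker-node2 hostname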

Installing Java

Hadoop's code and scripts need Java to run. You can install the latest version of Java on each node with the below command:
sudo dnf -y install java-latest-openjdk java-latest-openjdk-devel

If you are on CentOS/RHEL 7, install latest java with yum package manager:
sudo yum -y install java-latest-openjdk java-latest-openjdk-devel

Set Java Home Environment

Hadoop comes with code and configuration that reference the JAVA_HOME environment variable. This variable points to the Java installation, allowing Hadoop's scripts to run Java code.

You can set up JAVA_HOME variable on each node like below:
echo "JAVA_HOME=$(which java)" | sudo tee -a /etc/environment

Reload your system’s environment variables with below command:
source /etc/environment

Verify that variable was set correctly:
echo $JAVA_HOME

This should return the path to the java binary. Make sure you repeat the same step on each worker node as well.

You need to manually add the Hadoop binaries' location to the system PATH so that your shell knows where to look for hadoop commands.

Edit /home/hadoop/.bashrc like below:
vi /home/hadoop/.bashrc

Add the following lines at the end of the file:
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin

Save and close.

Next, edit /home/hadoop/.bash_profile:
vi ~/.bash_profile

Add the following line at the end of the file:
PATH=/home/hadoop/hadoop/bin:/home/hadoop/hadoop/sbin:$PATH

Save and close file.

Make sure you repeat the same step on each worker node as well.

Download Hadoop

At the time of writing this article, Hadoop 3.1.3 was the latest available release.

Login to master-node as the hadoop user, download the Hadoop tarball file, and unzip it:
cd ~
wget http://apache.cs.utah.edu/hadoop/common/current/hadoop-3.1.3.tar.gz
tar -xzf hadoop-3.1.3.tar.gz
mv hadoop-3.1.3 hadoop

Configure Hadoop

At this stage, we'll configure hadoop on master-node first, then replicate the configuration to worker nodes later.

On master-node, type below command to find java installation path:
update-alternatives --display java

Take the value of the (link currently points to) and remove the trailing /bin/java. For example on CentOS or RHEL, the link is /usr/lib/jvm/java-11-openjdk-11.0.5.10-2.el8_1.x86_64/bin/java, so JAVA_HOME should be /usr/lib/jvm/java-11-openjdk-11.0.5.10-2.el8_1.x86_64.

Edit hadoop-env.sh like below:
cd ~
vi hadoop/etc/hadoop/hadoop-env.sh

Uncomment the JAVA_HOME line by removing the # and update it like below:
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.5.10-2.el8_1.x86_64

Save and close when you are finished.
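At this point you can optionally sanity-check the setup. Assuming the PATH changes from .bashrc have been loaded (open a new shell or run source ~/.bashrc if not), the hadoop command should print its version details; if it complains about JAVA_HOME instead, revisit the hadoop-env.sh step above:
hadoop version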

Next, edit core-site.xml file to set the NameNode location to master-node on port 9000:
vi hadoop/etc/hadoop/core-site.xml

Add the following code, making sure you replace master-node with your master node's hostname:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master-node:9000</value>
</property>
</configuration>

Save and close.

Next, edit hdfs-site.xml to resemble the following configuration:
vi hadoop/etc/hadoop/hdfs-site.xml

Add the following code:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/data/nameNode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/data/dataNode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>

Note that the last property, dfs.replication, indicates how many times data is replicated in the cluster. We set it to 2 to have all data duplicated on our two worker nodes. If you have only one worker node, enter 1; if you have three, enter 3; but don't enter a value higher than the actual number of worker nodes you have.

Save and close file when you are finished.

Next, edit the mapred-site.xml file, setting YARN as the default framework for MapReduce operations:
vi hadoop/etc/hadoop/mapred-site.xml

Add the following code:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>256</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>256</value>
</property>
</configuration>

Save and close.

Next, edit yarn-site.xml, which contains the configuration options for YARN.
vi hadoop/etc/hadoop/yarn-site.xml

Add the below code, making sure you replace 192.168.10.1 with your master-node's IP address:
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.10.1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
</configuration>
The last property disables virtual-memory checking, which, when enabled, can prevent containers from being allocated properly with OpenJDK.

Note: Memory allocation can be tricky on low RAM nodes because default values are not suitable for nodes with less than 8GB of RAM. We have manually set memory allocation for MapReduce jobs, and provide a sample configuration for 4GB RAM nodes.

Save and close.

Next, edit the workers file to include both of the worker nodes (worker-node1 and worker-node2 in our case):
vi hadoop/etc/hadoop/workers

Remove localhost if it exists, then add your worker nodes like below:
worker-node1
worker-node2

Save and close.

The workers file is used by hadoop startup scripts to start required daemons on all nodes.

At this stage, we have completed hadoop configuration on master-node. In the next step we will duplicate hadoop configuration on worker nodes.

Configure Worker Nodes

This section will show you how to duplicate the hadoop configuration from master-node to all worker nodes.

First copy the hadoop tarball file from master-node to worker nodes like below:
cd ~
scp hadoop-*.tar.gz worker-node1:/home/hadoop/
scp hadoop-*.tar.gz worker-node2:/home/hadoop/

Next, log in to each worker node as the hadoop user via SSH, extract the Hadoop archive, rename the directory, then exit to get back to the master-node:

ssh worker-node1

tar -xzf hadoop-3.1.3.tar.gz
mv hadoop-3.1.3 hadoop
exit

Repeat the same step on worker-node2.

From the master-node, duplicate the Hadoop configuration files to all worker nodes using command below:

for node in worker-node1 worker-node2; do

scp ~/hadoop/etc/hadoop/* $node:/home/hadoop/hadoop/etc/hadoop/;
done

Make sure you replace worker-node1 and worker-node2 with your worker nodes' names.

Next, on master-node as the hadoop user, type the below command to format the Hadoop file system:
hdfs namenode -format

You will see output similar to the following, which indicates that the NameNode was formatted successfully and the cluster is ready to run.
WARNING: /home/hadoop/hadoop/logs does not exist. Creating.
2020-03-09 11:38:04,791 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master-node/192.168.10.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.1.3
STARTUP_MSG: build = https://gitbox.apache.org/repos/asf/hadoop.git -r ba631c436b806728f8ec2f54ab1e289526c90579; compiled by 'ztang' on 2019-09-12T02:47Z
STARTUP_MSG: java = 11.0.5
************************************************************/
2020-03-09 11:38:04,818 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
2020-03-09 11:38:05,062 INFO namenode.NameNode: createNameNode [-format]
2020-03-09 11:38:06,162 INFO common.Util: Assuming 'file' scheme for path /home/hadoop/data/nameNode in configuration.
2020-03-09 11:38:06,163 INFO common.Util: Assuming 'file' scheme for path /home/hadoop/data/nameNode in configuration.
Formatting using clusterid: CID-e791ed9f-f86f-4a19-bbf4-aaa06c9c3238
2020-03-09 11:38:06,233 INFO namenode.FSEditLog: Edit logging is async:true
2020-03-09 11:38:06,275 INFO namenode.FSNamesystem: KeyProvider: null
2020-03-09 11:38:06,276 INFO namenode.FSNamesystem: fsLock is fair: true
2020-03-09 11:38:06,277 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2020-03-09 11:38:06,387 INFO namenode.FSNamesystem: fsOwner = hadoop (auth:SIMPLE)
2020-03-09 11:38:06,387 INFO namenode.FSNamesystem: supergroup = supergroup
2020-03-09 11:38:06,387 INFO namenode.FSNamesystem: isPermissionEnabled = true
2020-03-09 11:38:06,387 INFO namenode.FSNamesystem: HA Enabled: false
2020-03-09 11:38:06,476 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2020-03-09 11:38:06,510 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2020-03-09 11:38:06,510 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2020-03-09 11:38:06,516 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2020-03-09 11:38:06,516 INFO blockmanagement.BlockManager: The block deletion will start around 2020 Mar 09 11:38:06
2020-03-09 11:38:06,518 INFO util.GSet: Computing capacity for map BlocksMap
2020-03-09 11:38:06,518 INFO util.GSet: VM type = 64-bit
2020-03-09 11:38:06,530 INFO util.GSet: 2.0% max memory 908.7 MB = 18.2 MB
2020-03-09 11:38:06,530 INFO util.GSet: capacity = 2^21 = 2097152 entries
2020-03-09 11:38:06,537 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2020-03-09 11:38:06,557 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
2020-03-09 11:38:06,557 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2020-03-09 11:38:06,557 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2020-03-09 11:38:06,557 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2020-03-09 11:38:06,558 INFO blockmanagement.BlockManager: defaultReplication = 2
2020-03-09 11:38:06,558 INFO blockmanagement.BlockManager: maxReplication = 512
2020-03-09 11:38:06,558 INFO blockmanagement.BlockManager: minReplication = 1
2020-03-09 11:38:06,558 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
2020-03-09 11:38:06,558 INFO blockmanagement.BlockManager: redundancyRecheckInterval = 3000ms
2020-03-09 11:38:06,558 INFO blockmanagement.BlockManager: encryptDataTransfer = false
2020-03-09 11:38:06,559 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
2020-03-09 11:38:06,602 INFO namenode.FSDirectory: GLOBAL serial map: bits=24 maxEntries=16777215
2020-03-09 11:38:06,669 INFO util.GSet: Computing capacity for map INodeMap
2020-03-09 11:38:06,669 INFO util.GSet: VM type = 64-bit
2020-03-09 11:38:06,669 INFO util.GSet: 1.0% max memory 908.7 MB = 9.1 MB
2020-03-09 11:38:06,669 INFO util.GSet: capacity = 2^20 = 1048576 entries
2020-03-09 11:38:06,670 INFO namenode.FSDirectory: ACLs enabled? false
2020-03-09 11:38:06,670 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2020-03-09 11:38:06,670 INFO namenode.FSDirectory: XAttrs enabled? true
2020-03-09 11:38:06,670 INFO namenode.NameNode: Caching file names occurring more than 10 times
2020-03-09 11:38:06,679 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2020-03-09 11:38:06,681 INFO snapshot.SnapshotManager: SkipList is disabled
2020-03-09 11:38:06,685 INFO util.GSet: Computing capacity for map cachedBlocks
2020-03-09 11:38:06,685 INFO util.GSet: VM type = 64-bit
2020-03-09 11:38:06,686 INFO util.GSet: 0.25% max memory 908.7 MB = 2.3 MB
2020-03-09 11:38:06,686 INFO util.GSet: capacity = 2^18 = 262144 entries
2020-03-09 11:38:06,697 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2020-03-09 11:38:06,697 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2020-03-09 11:38:06,697 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2020-03-09 11:38:06,700 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2020-03-09 11:38:06,701 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2020-03-09 11:38:06,707 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2020-03-09 11:38:06,707 INFO util.GSet: VM type = 64-bit
2020-03-09 11:38:06,708 INFO util.GSet: 0.029999999329447746% max memory 908.7 MB = 279.1 KB
2020-03-09 11:38:06,708 INFO util.GSet: capacity = 2^15 = 32768 entries
2020-03-09 11:38:06,760 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1174644765-192.168.10.1-1583735886736
2020-03-09 11:38:06,787 INFO common.Storage: Storage directory /home/hadoop/data/nameNode has been successfully formatted.
2020-03-09 11:38:06,862 INFO namenode.FSImageFormatProtobuf: Saving image file /home/hadoop/data/nameNode/current/fsimage.ckpt_0000000000000000000 using no compression
2020-03-09 11:38:07,029 INFO namenode.FSImageFormatProtobuf: Image file /home/hadoop/data/nameNode/current/fsimage.ckpt_0000000000000000000 of size 393 bytes saved in 0 seconds .
2020-03-09 11:38:07,045 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2020-03-09 11:38:07,072 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid = 0 when meet shutdown.
2020-03-09 11:38:07,074 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master-node/192.168.10.1
************************************************************/

With the HDFS format complete, your Hadoop installation is now configured and ready to run.

Running Hadoop

Login to master-node as the hadoop user and start the hadoop cluster by running the below command:
start-dfs.sh

You will see output similar to the following:
Starting namenodes on [master-node]
Starting datanodes
worker-node2: WARNING: /home/hadoop/hadoop/logs does not exist. Creating.
worker-node1: WARNING: /home/hadoop/hadoop/logs does not exist. Creating.
Starting secondary namenodes [master-node]

This will start the NameNode and SecondaryNameNode components on master-node, and a DataNode on worker-node1 and worker-node2, according to the configuration in the workers config file.

Check that every process is running with the jps command on each node.

On master-node, type jps and you should see the following:
8066 NameNode
8292 SecondaryNameNode
8412 Jps

On worker-node1 and worker-node2, type jps and you should see the following:
17525 DataNode
17613 Jps

You can get useful information about your hadoop cluster with the below command.
hdfs dfsadmin -report

This will print information (e.g., capacity and usage) for all running nodes in the cluster.

Next, open up your preferred web browser and navigate to http://your_master_node_IP:9870, and you’ll get a user-friendly hadoop monitoring web console like below:


Testing Hadoop Cluster

You can test your Hadoop cluster by writing and reading some content using the hdfs dfs command.

First, manually create your home directory. All other commands will use a path relative to this default home directory:

On master-node, type below command:
hdfs dfs -mkdir -p /user/hadoop

We'll use a few books from the Gutenberg project as an example for this guide.

Create a books directory in hadoop file-system. The following command will create it in the home directory, /user/hadoop/books:
hdfs dfs -mkdir books

Now download a few books from the Gutenberg project:
cd /home/hadoop

wget -O franklin.txt http://www.gutenberg.org/files/13482/13482.txt
wget -O herbert.txt http://www.gutenberg.org/files/20220/20220.txt
wget -O maria.txt http://www.gutenberg.org/files/29635/29635.txt

Next, put the three books into the books directory using hdfs:
hdfs dfs -put franklin.txt herbert.txt maria.txt books

List the contents of the books directory:
hdfs dfs -ls books

Next, copy one of the books back to the local filesystem:
hdfs dfs -get books/franklin.txt

You can also print a book directly to the terminal from HDFS:
hdfs dfs -cat books/maria.txt

These are just a few examples of Hadoop commands; there are many more for managing HDFS. For a complete list, you can look at the Apache HDFS shell documentation, or print help with:
hdfs dfs -help

Start YARN

HDFS is a distributed storage system, and doesn’t provide any services for running and scheduling tasks in the cluster. This is the role of the YARN framework. The following section is about starting, monitoring, and submitting jobs to YARN.

On master-node, you can start YARN with the below script:
start-yarn.sh

You will see the output like below:
Starting resourcemanager
Starting nodemanagers

Check that everything is running with the jps command. In addition to the previous HDFS daemon, you should see a ResourceManager on master-node, and a NodeManager on worker-node1 and worker-node2.

To stop YARN, run the following command on master-node:
stop-yarn.sh

Similarly, you can get a list of running applications with below command:
yarn application -list

To get all available parameters of the yarn command, see Apache YARN documentation.

As with HDFS, YARN provides a friendlier web UI, started by default on port 8088 of the Resource Manager. You can navigate to http://master-node-IP:8088 to browse the YARN web console:

Submit MapReduce Jobs to YARN

YARN jobs are packaged into jar files and submitted to YARN for execution with the command yarn jar. The Hadoop installation package provides sample applications that can be run to test your cluster. You’ll use them to run a word count on the three books previously uploaded to HDFS.

On master-node, submit a job with the sample jar to YARN:
yarn jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar wordcount "books/*" output

The last argument is the location in HDFS where the output of the job will be saved.

After the job is finished, you can get the result by querying with below command:
hdfs dfs -ls output

Print the result with:
hdfs dfs -cat output/part-r-00000 | less

Wrapping up

Now that you have a YARN cluster up and running, you can learn how to code your own YARN jobs with Apache documentation and install Spark on top of your YARN cluster. You may wish to take the following resources into consideration for additional information on this topic.

How To Deploy a Fault-Tolerant MongoDB Sharded Cluster on Ubuntu 18/19/20

This tutorial will show you how to deploy a fault-tolerant and highly available MongoDB sharded cluster for your production use.

A MongoDB sharded cluster consists of the following components:

  • shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.
  • mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
  • config servers: Config servers store metadata and configuration settings for the cluster.

The following graphic describes the interaction of components within a sharded cluster:



Note: This guide is specifically written for Ubuntu 18.04, 19.04, 19.10, 20.04 and Debian 9, 10.


Prerequisites

To follow this tutorial, you will need at least 10 (physical or virtual) machines installed with Ubuntu or Debian, each with a non-root user that has sudo privileges.


Before you begin with the MongoDB installation, make sure you have completed basic network settings, including hostname, timezone and IP address setup, on all 10 servers.

You can set hostname on your servers with below command:
sudo hostnamectl set-hostname server-name.domain

You can set timezone on your servers with below command:
sudo timedatectl set-timezone US/New_York

Update Hosts Files

Log in to your first server (cfg1.example.pk in our case) and edit the /etc/hosts file:
sudo nano /etc/hosts

Add the following to the hosts file, making sure you replace the IP addresses with yours and (if possible) keep the hostnames the same to avoid confusion:

# Config Server

192.168.10.1 cfg1.example.pk
192.168.10.2 cfg2.example.pk
192.168.10.3 cfg3.example.pk

# Shard Server (rs0)
192.168.10.4 shrd1.example.pk
192.168.10.5 shrd2.example.pk
192.168.10.6 shrd3.example.pk

# Shard Server (rs1)
192.168.10.7 shrd4.example.pk
192.168.10.8 shrd5.example.pk
192.168.10.9 shrd6.example.pk

# Query Router
192.168.10.10 qrt1.example.pk

Save and close file when you are finished.

Repeat the same on remaining servers before proceeding to next step.

Add MongoDB Source

We need to add the official MongoDB repository to Ubuntu's source list in order to install the latest release using the apt package manager.

Type below to add the official MongoDB repository:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list

Make sure you repeat the same steps on remaining servers before proceeding to next step.

Installing MongoDB

When you are finished adding the official MongoDB repository on all of the servers, log in to your first server (cfg1.example.pk in our case).

Type the below commands to install MongoDB (version 4.2.3 at the time of writing):
sudo apt update
sudo apt -y install mongodb-org

Repeat the same on remaining servers as well.

Create an Administrative User

Log in to your first config server (cfg1.example.pk in our case), which we intend to use as the primary member of the config server replica set.

Start mongodb service with below command:
sudo systemctl start mongod.service
sudo systemctl enable mongod.service

Access mongo shell with below command:
mongo

On mongo shell, type below to switch to the default admin database:
use admin

Create an administrative user with root privileges, make sure you replace “password” with a strong password of your choice:
db.createUser({user: "administrator", pwd: "password", roles:[{role: "root", db: "admin"}]})

This will return output similar to the following:
Successfully added user: {
"user" : "administrator",
"roles" : [
{
"role" : "root",
"db" : "admin"
}
]
}

Type below to exit from mongo shell:
quit()

Set Up MongoDB Authentication

In this section, you need to create a key file that will be used to secure authentication between the members of MongoDB replica set.

On (cfg1.example.pk) in our case, type below command to generate a key file:
openssl rand -base64 756 > ~/mongodb_key.pem

For this guide, we generated the key file under the user's home directory.

Copy ~/mongodb_key.pem file to /var/lib/mongodb/, and set appropriate permissions like below:

sudo cp ~/mongodb_key.pem /var/lib/mongodb/

sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

Next, copy mongodb_key.pem file from cfg1.example.pk server to each of your servers so that they have the key file located in the same directory, with identical permissions:

sudo scp -p /var/lib/mongodb/mongodb_key.pem cfg2.example.pk:~/

sudo scp -p /var/lib/mongodb/mongodb_key.pem cfg3.example.pk:~/

sudo scp -p /var/lib/mongodb/mongodb_key.pem shrd1.example.pk:~/
sudo scp -p /var/lib/mongodb/mongodb_key.pem shrd2.example.pk:~/
sudo scp -p /var/lib/mongodb/mongodb_key.pem shrd3.example.pk:~/
sudo scp -p /var/lib/mongodb/mongodb_key.pem shrd4.example.pk:~/
sudo scp -p /var/lib/mongodb/mongodb_key.pem shrd5.example.pk:~/
sudo scp -p /var/lib/mongodb/mongodb_key.pem shrd6.example.pk:~/

sudo scp -p /var/lib/mongodb/mongodb_key.pem qrt1.example.pk:~/

Next, move the mongodb_key.pem file from the user's home directory to /var/lib/mongodb/ and assign the appropriate permissions on each of your servers like below.

We are performing the following steps from cfg1.example.pk using ssh against each server:
ssh cfg2.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh cfg2.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh cfg2.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh cfg3.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh cfg3.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh cfg3.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh shrd1.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh shrd1.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh shrd1.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh shrd2.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh shrd2.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh shrd2.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh shrd3.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh shrd3.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh shrd3.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh shrd4.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh shrd4.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh shrd4.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh shrd5.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh shrd5.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh shrd5.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh shrd6.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh shrd6.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh shrd6.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

ssh qrt1.example.pk sudo mv ~/mongodb_key.pem /var/lib/mongodb/
ssh qrt1.example.pk sudo chown -R mongodb:mongodb /var/lib/mongodb/mongodb_key.pem
ssh qrt1.example.pk sudo chmod -R 400 /var/lib/mongodb/mongodb_key.pem

Make sure you have copied mongodb_key.pem file to each of your servers and set the appropriate permission as described in above section before proceeding to next step.

Create the Config Server Replica Set

Log in to your first config server (cfg1.example.pk in our case) and edit the /etc/mongod.conf file:
sudo nano /etc/mongod.conf

Add or update the following values, making sure you replace the default port with 27019 and bindIp with your server's name:
net:
  port: 27019
  bindIp: cfg1.example.pk

security:
  keyFile: /var/lib/mongodb/mongodb_key.pem

replication:
  replSetName: configReplSet

sharding:
  clusterRole: configsvr

Save and close file when you are finished.

Repeat the same on the remaining config servers (cfg2.example.pk and cfg3.example.pk in our case), using each server's own name for bindIp.

For reference, you can see below mongod.conf from cfg1.example.pk in our case.

This mongod.conf from cfg2.example.pk in our case.

and lastly mongod.conf from cfg3.example.pk in our case.

When you are finished with the above on all three config servers, restart the mongodb service to take the changes into effect:

sudo systemctl restart mongod.service
sudo systemctl enable mongod.service

Confirm that the mongodb service is active and running:
sudo systemctl status mongod.service
screenshot of cfg1.example.pk

screenshot of cfg2.example.pk

screenshot of cfg3.example.pk

Initiate the config replica set.

On your first config server, (cfg1.example.pk) in our case, connect to the MongoDB shell over port 27019 with the administrative user like below:
mongo --host cfg1.example.pk --port 27019 -u administrator --authenticationDatabase admin

This will prompt you for password:
MongoDB shell version v4.2.3
Enter password:

From the mongo shell, initiate the config server's replica set like below:
rs.initiate({ _id: "configReplSet", configsvr: true, members: [{ _id : 0, host : "cfg1.example.pk:27019"},{ _id : 1, host : "cfg2.example.pk:27019"},{ _id : 2, host : "cfg3.example.pk:27019"}]})

You will see a message similar to below indicating the operation succeeded:


Notice that the MongoDB shell prompt has also changed to configReplSet:PRIMARY> or configReplSet:SECONDARY>.

To make sure that each config server has been added to the replica set, type below on mongo shell:
rs.status()

If the replica set has been configured properly, you’ll see output similar to the following:


Now exit from mongo shell with below command:
quit()

Create the Shard Replica Sets

In this section we will configure shard replica set (rs0) on (shrd1, shrd2, shrd3) servers.

Login to your first shard server (shrd1.example.pk) in our case and edit /etc/mongod.conf file like below:
sudo nano /etc/mongod.conf

Add or update the following values, making sure you replace the default port with 27018:
net:
  port: 27018
  bindIp: shrd1.example.pk

security:
  keyFile: /var/lib/mongodb/mongodb_key.pem

replication:
  replSetName: rs0

sharding:
  clusterRole: shardsvr

Save and close file when you are finished.

Make sure you repeat the same on (shrd2.example.pk, shrd3.example.pk) as well.

For reference you can see below screenshot of mongod.conf file on shrd1.example.pk after the changes we made.

Below screenshot of mongod.conf from shrd2.example.pk

and below screenshot of mongod.conf from shrd3.example.pk

When you are finished with the above step, start mongodb service on (shrd1, shrd2, shrd3) to take changes into effect:

sudo systemctl start mongod

sudo systemctl enable mongod
sudo systemctl status mongod

screenshot of mongod service status from shrd1.example.pk

screenshot of mongod service status from shrd2.example.pk

screenshot of mongod service status from shrd3.example.pk

When you are finished with the above step on (shrd1, shrd2, shrd3), proceed with the below.

In this step, we will configure shard replica set (rs1) on (shrd4, shrd5, shrd6) servers.

Login to (shrd4.example.pk) in our case, edit /etc/mongod.conf file:
sudo nano /etc/mongod.conf

Add or update the following values, making sure you replace the default port with 27018:
net:
  port: 27018
  bindIp: shrd4.example.pk

security:
  keyFile: /var/lib/mongodb/mongodb_key.pem

replication:
  replSetName: rs1

sharding:
  clusterRole: shardsvr

Save and close file when you are finished.

Make sure you repeat the same on (shrd5.example.pk, shrd6.example.pk) as well.

For reference you can see below screenshot of mongod.conf from shrd4.example.pk after the changes we made.

Below screenshot of mongod.conf from shrd5.example.pk

and below screenshot of mongod.conf from shrd6.example.pk

When you are finished with the above step, start the mongod service on (shrd4, shrd5, shrd6) to take the changes into effect:

sudo systemctl start mongod

sudo systemctl enable mongod
sudo systemctl status mongod

mongod service status screenshot from shrd4.example.pk

mongod service status screenshot from shrd5.example.pk

mongod service status screenshot from shrd6.example.pk

Initiate the shard replica set.

Login to your first shard server (shrd1.example.pk) in our case, connect to mongo shell on port 27018 with administrative user authentication like below:
mongo --host shrd1.example.pk --port 27018 -u administrator --authenticationDatabase admin

Type below on mongo shell to initiate shard replica set (rs0):
rs.initiate({ _id : "rs0", members:[{ _id : 0, host : "shrd1.example.pk:27018" },{ _id : 1, host : "shrd2.example.pk:27018" },{ _id : 2, host : "shrd3.example.pk:27018" }]})

This will return { "ok" : 1 } indicating that shard replica set rs0 is initiated successfully.

Now exit from the mongo shell with below command:
quit()

Next, login to (shrd4.example.pk) in our case, connect to mongo shell on port 27018 with administrative user authentication like below:
mongo --host shrd4.example.pk --port 27018 -u administrator --authenticationDatabase admin

Type below on mongo shell to initiate shard replica set (rs1):
rs.initiate({ _id : "rs1", members:[{ _id : 0, host : "shrd4.example.pk:27018" },{ _id : 1, host : "shrd5.example.pk:27018" },{ _id : 2, host : "shrd6.example.pk:27018" }]})

This will return { "ok" : 1 } indicating that shard replica set rs1 is initiated successfully.

Now exit from the mongo shell with below command:
quit()

Configure Mongos (Query Router)

Login to your query router server (qrt1.example.pk) in our case and perform the following.

Type below command to deactivate mongod service:
sudo systemctl stop mongod.service
sudo systemctl disable mongod.service

The mongos service that we will create in the next step needs to obtain data locks that conflict with mongod, so be sure mongod is stopped before proceeding:
sudo systemctl status mongod.service

As you can see mongod.service has been stopped on query router.

Type below to create /etc/mongos.conf file:
sudo nano /etc/mongos.conf

Add the following configuration directives:
# Where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log

# network interfaces
net:
  port: 27017
  bindIp: qrt1.example.pk

security:
  keyFile: /var/lib/mongodb/mongodb_key.pem

sharding:
  configDB: configReplSet/cfg1.example.pk:27019,cfg2.example.pk:27019,cfg3.example.pk:27019

Save and close file when you are finished.

You can see final mongos.conf file in below screenshot.

Next, create a systemd service unit file for mongos like below:
sudo nano /lib/systemd/system/mongos.service

Add the following parameters:
[Unit]
Description=Mongo Cluster Router
After=network.target

[Service]
User=mongodb
Group=mongodb

ExecStart=/usr/bin/mongos --config /etc/mongos.conf

LimitFSIZE=infinity
LimitCPU=infinity
LimitAS=infinity

LimitNOFILE=64000
LimitNPROC=64000

TasksMax=infinity
TasksAccounting=false

[Install]
WantedBy=multi-user.target

Save and close file when you are finished.

You can see final mongos.service file in below screenshot.
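Because mongos.service is a newly created unit file, reload systemd so it picks up the unit before starting it:
sudo systemctl daemon-reload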

Next, start mongos.service with below command to activate query router.
sudo systemctl start mongos.service
sudo systemctl enable mongos.service

Confirm that mongos.service is running with below command:
sudo systemctl status mongos.service

You can see below screenshot of mongos.service status

Add Shards to the Cluster

From (qrt1.example.pk) in our case, connect to mongo shell on port 27017 with administrative user authentication like below:
mongo --host qrt1.example.pk --port 27017 -u administrator --authenticationDatabase admin

On mongo shell, type below to add shard replica set (rs0) in the cluster:
sh.addShard( "rs0/shrd1.example.pk:27018,shrd2.example.pk:27018,shrd2.example.pk:27018")

Type below to add shard replica set (rs1) in the cluster:
sh.addShard( "rs1/shrd4.example.pk:27018,shrd5.example.pk:27018,shrd6.example.pk:27018")

You will see similar output indicating that the shard replica sets were added successfully.

At this stage, your MongoDB sharded cluster is active and running.

The last step is to enable sharding for a database. This process takes place in stages due to the way data is organized in MongoDB. To understand how data will be distributed, let's quickly review the main data structures:

  • Databases - The broadest data structure in MongoDB, used to hold groups of related data.
  • Collections - Analogous to tables in traditional relational database systems, collections are the data structures that comprise databases
  • Documents - The most basic unit of data storage in MongoDB. Documents use JSON format for storing data using key-value pairs that can be queried by applications

Sharding Strategy

Before you enable sharding, you'll need to decide on a sharding strategy. When data is distributed among the shards, MongoDB needs a way to sort it and know which data is on which shard. To do this, it uses a shard key, a designated field in your documents that is used by the mongos query router to know where a given piece of data is stored.

The two most common sharding strategies are:

  • Range-based sharding divides your data based on specific ranges of values in the shard key. 
  • Hash-based sharding distributes data by using a hash function on your shard key for a more even distribution of data among the shards.

This is not intended to be a comprehensive explanation of how to choose a sharding strategy. You may wish to consult MongoDB's official documentation on sharding strategies.
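For illustration only (the database, collection and field names below are hypothetical, and sharding must already be enabled on the database), the two strategies simply correspond to two different shard key specifications when calling sh.shardCollection():
// Range-based: documents are partitioned into contiguous ranges of the userId field
sh.shardCollection("someDB.users", { "userId" : 1 })
// Hash-based: documents are distributed by a hash of the _id field
sh.shardCollection("someDB.users", { "_id" : "hashed" })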

Enable Sharding for a Database

Having decided on a hash-based sharding strategy, we'll now create a test database called testDB and enable sharding at the collection level. This allows the documents within a collection to be distributed among your shards.

Login to your query router server (qrt1.example.pk) in our case, access the mongos shell:
mongo --host qrt1.example.pk --port 27017 -u administrator --authenticationDatabase admin

From the mongos shell, type below to create a test database called testDB:
use testDB

Create a new collection called testCollection and hash its _id key. The _id key is already created by default as a basic index for new documents:
db.testCollection.ensureIndex( { _id : "hashed" } )

Type below to enable sharding for the newly created database:
sh.enableSharding("testDB")
sh.shardCollection( "testDB.testCollection", { "_id" : "hashed" } )

This enables sharding across any shards that you added to your cluster in the Add Shards to the Cluster step.



To verify that the sharding was successfully enabled, type below to switch to the config database:
use config

Next, run below method:
db.databases.find()

This will return a list of all databases with useful information, similar to the output below:


In the above output, you can see only one entry for the testDB database we just created.

Once you enable sharding for a database, MongoDB assigns a primary shard for that database, where all of its unsharded collections are stored.
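To see which shard was chosen as the primary for testDB, along with the overall cluster layout, you can also run the following from the mongos shell:
sh.status()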

Verify Shard Cluster

To ensure your data is being distributed evenly in the testDB database and collection you configured above, you can follow these steps to generate some basic test data and see how it is divided among the shards.

Connect to the mongos shell on your query router:

mongo --host qrt1.example.pk --port 27017 -u administrator -p --authenticationDatabase admin

Switch to your testDB database:
use testDB

Type the following code in the mongo shell to generate 10000 simple documents and insert them into testCollection:
for (var i = 1; i <= 10000; i++) db.testCollection.insert( { x : i } )

Check your data distribution:
db.testCollection.getShardDistribution()

This will return information similar to the following:

The sections beginning with Shard give information about each shard in your cluster. Since we only added 2 shards with three members each, there are only two sections, but if you add more shards to the cluster, they’ll show up here as well. 

The Totals section provides information about the collection as a whole, including its distribution among the shards. Notice that distribution is not perfectly equal. The hash function does not guarantee absolutely even distribution, but with a carefully chosen shard key it will usually be fairly close. 

When you’re finished, we recommend you to delete the testDB (because it has no use) with below command:
db.dropDatabase()

Wrapping up

Now that you have successfully deployed a highly available, fault-tolerant MongoDB sharded cluster ready for your production environment, it is recommended that you configure a firewall so that ports 27018 and 27019 only accept traffic between hosts within your cluster.

Create a Fault-Tolerant MongoDB Sharded Cluster using Shared Storage on Ubuntu 18/19/20

This tutorial will walk you through the steps to set up a highly available, fault-tolerant MongoDB sharded cluster using shared storage for your production use. If you do not intend to use shared storage, follow this tutorial instead.

A MongoDB sharded cluster consists of the following three components:

  • Shard: Each shard contains a subset of the sharded data and can be deployed as a replica set.
  • Config servers: Config servers store metadata and configuration settings for the cluster.
  • Mongos: The mongos acts as a query router, providing an interface between client applications and the sharded cluster.

The following illustration describes the interaction of components within a sharded cluster; as you can see, there is no single point of failure in this highly available setup.



Prerequisites

To follow this tutorial, you will need at least 11 (physical or virtual) machines installed with Ubuntu or Debian, each with a non-root user that has sudo privileges. Make sure you have completed basic network settings, including hostname, timezone, and IP addresses, on each of your servers.


We will use 12 machines (including the NFS storage server) for our sharded cluster throughout this tutorial; their names and IP addresses are listed in the hosts file in STEP2. Make sure you substitute the hostnames, IP addresses and domain with yours wherever applicable.

The NFS server that will be used as remote shared storage for this guide is already in place. You can use any storage of your choice, whatever suits your environment.
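If you still need to prepare the NFS server yourself, an export along the following lines would match the mount commands used later in this guide (the export path and subnet are assumptions; adjust them and the options to your environment). On rstg.example.pk, add an /etc/exports entry such as:
/u01/mongodb_shared_storage 192.168.10.0/24(rw,sync,no_root_squash,no_subtree_check)
then run sudo exportfs -ra on the NFS server to publish the export.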

Note: This guide is specifically written for Ubuntu 18.04, 19.04, 19.10, 20.04 and Debian 9, 10.


STEP1 - Create SSH Key-Pair

We will generate an ssh key-pair to set up passwordless authentication between the hosts in the cluster.

Log in to (cfg1.example.pk), and generate an ssh key-pair with below command:
ssh-keygen

This will show the following prompt; press enter to select the default location:
Generating public/private rsa key pair.
Enter file in which to save the key (/home/administrator/.ssh/id_rsa):

Again press enter to leave the passphrase fields blank:
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

You will see output similar to the following, confirming that the ssh key-pair was generated under your user's home directory:
Your identification has been saved in /home/administrator/.ssh/id_rsa.
Your public key has been saved in /home/administrator/.ssh/id_rsa.pub.

The key fingerprint is:
SHA256:DWWeuoSoQgHJnYkbrs8QoFs8oMjP0Sv8/3ehuN17MPE administrator@cfg1.example.pk
The key's randomart image is:
+---[RSA 2048]----+
|o.o o o |
|=+ + + . |
|B+o . . o |
|=+=. o . + . |
|.=+.o o S . o |
|= * . . . + E |
|.+. o . . . + |
| .o . ..o.. . |
| ...oo..oo |
+----[SHA256]-----+

Type below to create an authorized_keys file:
ssh-copy-id -i .ssh/id_rsa.pub localhost

Copy ~/.ssh directory with all its contents from cfg1.example.pk to each of your servers like below:
scp -r /home/administrator/.ssh/ cfg2.example.pk:~/
scp -r /home/administrator/.ssh/ cfg3.example.pk:~/
scp -r /home/administrator/.ssh/ shrd1.example.pk:~/
scp -r /home/administrator/.ssh/ shrd2.example.pk:~/
scp -r /home/administrator/.ssh/ shrd3.example.pk:~/
scp -r /home/administrator/.ssh/ shrd4.example.pk:~/
scp -r /home/administrator/.ssh/ shrd5.example.pk:~/
scp -r /home/administrator/.ssh/ shrd6.example.pk:~/
scp -r /home/administrator/.ssh/ qrt1.example.pk:~/
scp -r /home/administrator/.ssh/ qrt2.example.pk:~/

If everything is set up correctly as demonstrated above, you can access any of your servers via ssh and it won't prompt you for a password anymore.
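For example, the following commands run from cfg1.example.pk should print each remote hostname without a password prompt:
ssh shrd1.example.pk hostname
ssh qrt2.example.pk hostname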


STEP2 - Update Hosts File

If you are running a DNS server, you can create host (A) records to resolve names to your servers' IP addresses. For this guide, we will use the /etc/hosts file to map each server's name to its IP address:

Log in to (cfg1.example.pk), edit /etc/hosts file:
sudo nano /etc/hosts

Add each of your servers' names and IP addresses to the hosts file like below:
# Config Server Replica Set
192.168.10.1 cfg1.example.pk
192.168.10.2 cfg2.example.pk
192.168.10.3 cfg3.example.pk

# Shard Server Replica Set (rs0)
192.168.10.4 shrd1.example.pk
192.168.10.5 shrd2.example.pk
192.168.10.6 shrd3.example.pk

# Shard Server Replica Set (rs1)
192.168.10.7 shrd4.example.pk
192.168.10.8 shrd5.example.pk
192.168.10.9 shrd6.example.pk

# Mongos (Query Router)
192.168.10.10 qrt1.example.pk
192.168.10.11 qrt2.example.pk

# NFS Shared Storage
192.168.10.12 rstg.example.pk

Save and close the file when you are finished.

Repeat the same on each of your remaining servers before proceeding to the next step.
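Alternatively, since passwordless SSH is already in place from STEP1, you can push the file from cfg1.example.pk in a loop rather than editing it by hand on every machine. This is only a sketch, using the host names from this guide, and it relies on running sudo over ssh in the same way the later steps of this tutorial do:
for host in cfg2 cfg3 shrd1 shrd2 shrd3 shrd4 shrd5 shrd6 qrt1 qrt2; do
  scp /etc/hosts ${host}.example.pk:/tmp/hosts
  ssh ${host}.example.pk sudo mv /tmp/hosts /etc/hosts
done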


STEP3 - Add MongoDB Source

Ubuntu's default apt source list doesn't provide the latest release of MongoDB, so we need to add the official MongoDB package source on each server to install the latest stable version.

On (cfg1.example.pk), add MongoDB official repository with below command:
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list

From cfg1.example.pk, add the MongoDB source on each of your remaining servers via ssh, as we have already set up passwordless authentication (a loop alternative is sketched after these commands):
ssh cfg2.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh cfg3.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh shrd1.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh shrd2.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh shrd3.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh shrd4.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh shrd5.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh shrd6.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh qrt1.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
ssh qrt2.example.pk
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list
exit
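Equivalently, you can add the source on all remaining servers in a single loop from cfg1.example.pk (a sketch assuming the host names used in this guide):
for host in cfg2 cfg3 shrd1 shrd2 shrd3 shrd4 shrd5 shrd6 qrt1 qrt2; do
  ssh ${host}.example.pk 'sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b'
  ssh ${host}.example.pk 'echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list'
done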


STEP4 - Installing MongoDB

On cfg1.example.pk, install the latest stable MongoDB release like below:
sudo apt update
sudo apt -y install mongodb-org

When the installation completes on cfg1.example.pk, install MongoDB on each of your remaining servers using ssh like below:
ssh cfg2.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh cfg3.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh shrd1.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh shrd2.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh shrd3.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh shrd4.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh shrd5.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh shrd6.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh qrt1.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit
ssh qrt2.example.pk
sudo apt update
sudo apt -y install mongodb-org
exit

With this, you have completed mongodb installation on each of your servers.


STEP5 - Install NFS Client

To mount an NFS share, we need to install the NFS client. You can skip this step if you are not using NFS shared storage.

On cfg1.example.pk, install and run the NFS client like below:
sudo apt -y install nfs-common
sudo systemctl start rpcbind nfs-client.target
sudo systemctl enable rpcbind nfs-client.target

Repeat the same NFS client installation step on each of your remaining servers before proceeding to the next step.


STEP6 - Mount NFS Share

On cfg1.example.pk, type the below commands to mount the NFS share:
sudo mkdir /nfs_share/
sudo mount rstg.example.pk:/u01/mongodb_shared_storage /nfs_share
sudo mkdir /nfs_share/mongodb

Create a data directory for each server under /nfs_share like below:
sudo mkdir /nfs_share/mongodb/cfg1
sudo mkdir /nfs_share/mongodb/cfg2
sudo mkdir /nfs_share/mongodb/cfg3
sudo mkdir /nfs_share/mongodb/shrd1
sudo mkdir /nfs_share/mongodb/shrd2
sudo mkdir /nfs_share/mongodb/shrd3
sudo mkdir /nfs_share/mongodb/shrd4
sudo mkdir /nfs_share/mongodb/shrd5
sudo mkdir /nfs_share/mongodb/shrd6
sudo mkdir /nfs_share/mongodb/qrt1
sudo mkdir /nfs_share/mongodb/qrt2

Set appropriate permission on nfs_share like below:
sudo chown -R mongodb:mongodb /nfs_share/
sudo chmod -R 755 /nfs_share/

Type the below commands on each of your remaining servers to mount the NFS share:
sudo mkdir /nfs_share/
sudo chown -R mongodb:mongodb /nfs_share/
sudo chmod -R 755 /nfs_share/
sudo mount rstg.example.pk:/u01/mongodb_shared_storage /nfs_share
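Note that a mount made this way does not persist across reboots. If you want the share mounted automatically at boot, you can add an entry like the following to /etc/fstab on each server (the export path matches the mount command above; adjust the options to suit your environment), then verify it with sudo mount -a:
rstg.example.pk:/u01/mongodb_shared_storage /nfs_share nfs defaults 0 0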


STEP7 - Configure MongoDB

At this stage, we need to make a change in the mongod.conf file to tell MongoDB where to store its data.

Login to your cfg1.example.pk and edit /etc/mongod.conf like below:
sudo nano /etc/mongod.conf

Update dbPath value with your shared storage like below:
dbPath: /nfs_share/mongodb/cfg1

Save and close file when you are finished

Start mongod service and make it persistent on reboot with below command:
sudo systemctl start mongod
sudo systemctl enable mongod

Confirm that mongod service is active and running with below command:
sudo systemctl status mongod

You can see in the below output that mongod is active and running.
mongod.service - MongoDB Database Server
   Loaded: loaded (/lib/systemd/system/mongod.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-03-19 09:37:32 PKT; 4min 16s ago
     Docs: https://docs.mongodb.org/manual
 Main PID: 6024 (mongod)
   CGroup: /system.slice/mongod.service
           └─6024 /usr/bin/mongod --config /etc/mongod.conf

Make sure you repeat the same on each of your remaining servers except (qrt1, qrt2), pointing dbPath to that server's own directory (for example /nfs_share/mongodb/shrd1 on shrd1.example.pk), before proceeding to the next step.


STEP8 - Create an Administrative User

To administer and manage the mongodb sharded cluster, we need to create an administrative user with root privileges.

On (cfg1.example.pk), type below command to access mongo shell:
mongo

You will see mongo shell prompt like below:
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("8abb0c14-e2cf-4947-855b-d339279b52c9") }
MongoDB server version: 4.2.3
Welcome to the MongoDB shell.
>

On mongo shell, type below to switch to the default admin database:
use admin

Type below on mongo shell to create a user called "administrator", make sure you replace “password” with a strong password of your choice:
db.createUser({user: "administrator", pwd: "password", roles:[{role: "root", db: "admin"}]})

This will return similar to the following output:
Successfully added user: {
"user" : "administrator",
"roles" : [
{
"role" : "root",
"db" : "admin"
}
]
}
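Before exiting, you can optionally verify that the user exists by running the below command from the admin database; it should list the administrator user:
db.getUsers()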

Type below to exit from mongo shell:
quit()

You will also need to create the same user on (shrd1.example.pk):
ssh shrd1.example.pk
sudo systemctl start mongod;sudo systemctl enable mongod
mongo
use admin
db.createUser({user: "administrator", pwd: "password", roles:[{role: "root", db: "admin"}]})
quit()
exit

and on (shrd4.example.pk) as well:
ssh shrd4.example.pk
sudo systemctl start mongod;sudo systemctl enable mongod
mongo
use admin
db.createUser({user: "administrator", pwd: "password", roles:[{role: "root", db: "admin"}]})
quit()
exit

When you are finished creating user on (cfg1, shrd1, shrd4), proceed to next step.


STEP9 - Set Up MongoDB Authentication

We will generate a key file that will be used to secure authentication between the members of replica set. While in this guide we’ll be using a key file generated with openssl, MongoDB recommends using an X.509 certificate to secure connections between production systems.

On (cfg1.example.pk), type below command to generate a key file and set appropriate permission:
openssl rand -base64 756 > ~/mongodb_key.pem
sudo cp ~/mongodb_key.pem /nfs_share/mongodb/
sudo chown -R mongodb:mongodb /nfs_share/mongodb/mongodb_key.pem
sudo chmod -R 400 /nfs_share/mongodb/mongodb_key.pem

As you can see, we have stored the key file under the nfs_share directory, which means every server can read it from the same location with identical permissions. If you are using local storage (/var/lib/mongodb for example) instead, you have to copy the key file to the same location on each of your servers and set identical ownership and permissions there.
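For example, assuming local storage under /var/lib/mongodb, you could copy the key from cfg1.example.pk to another server roughly like below (a sketch only; hostnames and paths follow this guide, adjust them to your environment):
scp ~/mongodb_key.pem cfg2.example.pk:/tmp/
ssh cfg2.example.pk "sudo mv /tmp/mongodb_key.pem /var/lib/mongodb/ && sudo chown mongodb:mongodb /var/lib/mongodb/mongodb_key.pem && sudo chmod 400 /var/lib/mongodb/mongodb_key.pem"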


STEP10 - Set Up Config Servers Replica Set

We will make a few changes in the mongod.conf file on the (cfg1, cfg2, cfg3) servers.

Log in to your (cfg1.example.pk), edit mongod.conf like below:
sudo nano /etc/mongod.conf

Add, update the following values, make sure you replace the port value with 27019 and the bindIp value with your server's name:
net:
  port: 27019
  bindIp: cfg1.example.pk

security:
  keyFile: /nfs_share/mongodb/mongodb_key.pem

replication:
  replSetName: configReplSet

sharding:
  clusterRole: configsvr

Save and close when you are finished.

Restart mongod service to take changes into effect:
sudo systemctl restart mongod
sudo systemctl status mongod

Make sure you repeat the same on each of your remaining config servers (cfg2.example.pk, cfg3.example.pk), replacing bindIp with that server's own name, before proceeding to the next step.
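Optionally, you can confirm that mongod is now listening on the config server port on each node with a quick check like below (not required by this guide):
sudo ss -tlnp | grep 27019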


STEP11 - Initiate the Config Replica Set

Log in to your (cfg1.example.pk), connect to the MongoDB shell over port 27019 with the administrator user like below:
mongo --host cfg1.example.pk --port 27019 -u administrator --authenticationDatabase admin

This will prompt you for password:
MongoDB shell version v4.2.3
Enter password:

From the mongo shell, initiate the config server's replica set like below:
rs.initiate({ _id: "configReplSet", configsvr: true, members: [{ _id : 0, host : "cfg1.example.pk:27019"},{ _id : 1, host : "cfg2.example.pk:27019"},{ _id : 2, host : "cfg3.example.pk:27019"}]})

You will see a message like below indicating that operation succeeded:
{
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1584600261, 1),
"electionId" : ObjectId("000000000000000000000000")
},
"lastCommittedOpTime" : Timestamp(0, 0)
}
configReplSet:SECONDARY>


Notice that the MongoDB shell prompt has also changed to configReplSet:SECONDARY> or configReplSet:PRIMARY>.

To make sure that each config server has been added to the replica set, type below on mongo shell:
rs.config()

If the replica set has been configured properly, you’ll see output similar to the following:
{
"_id" : "configReplSet",
"version" : 1,
"configsvr" : true,
"protocolVersion" : NumberLong(1),
"writeConcernMajorityJournalDefault" : true,
"members" : [
{
"_id" : 0,
"host" : "cfg1.example.pk:27019",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {

},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 1,
"host" : "cfg2.example.pk:27019",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {

},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 2,
"host" : "cfg3.example.pk:27019",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {

},
"slaveDelay" : NumberLong(0),
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : 2000,
"heartbeatTimeoutSecs" : 10,
"electionTimeoutMillis" : 10000,
"catchUpTimeoutMillis" : -1,
"catchUpTakeoverDelayMillis" : 30000,
"getLastErrorModes" : {

},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
},
"replicaSetId" : ObjectId("5e7314c4ba14c5d2412a1949")
}
}


For complete replica set status information, type below:
rs.status()

You’ll see output similar to the following:
{
"set" : "configReplSet",
"date" : ISODate("2020-03-19T06:47:14.266Z"),
"myState" : 1,
"term" : NumberLong(1),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"configsvr" : true,
"heartbeatIntervalMillis" : NumberLong(2000),
"majorityVoteCount" : 2,
"writeMajorityCount" : 2,
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"lastCommittedWallTime" : ISODate("2020-03-19T06:47:13.490Z"),
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"readConcernMajorityWallTime" : ISODate("2020-03-19T06:47:13.490Z"),
"appliedOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"lastAppliedWallTime" : ISODate("2020-03-19T06:47:13.490Z"),
"lastDurableWallTime" : ISODate("2020-03-19T06:47:13.490Z")
},
"lastStableRecoveryTimestamp" : Timestamp(1584600391, 1),
"lastStableCheckpointTimestamp" : Timestamp(1584600391, 1),
"electionCandidateMetrics" : {
"lastElectionReason" : "electionTimeout",
"lastElectionDate" : ISODate("2020-03-19T06:44:32.291Z"),
"electionTerm" : NumberLong(1),
"lastCommittedOpTimeAtElection" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"lastSeenOpTimeAtElection" : {
"ts" : Timestamp(1584600261, 1),
"t" : NumberLong(-1)
},
"numVotesNeeded" : 2,
"priorityAtElection" : 1,
"electionTimeoutMillis" : NumberLong(10000),
"numCatchUpOps" : NumberLong(0),
"newTermStartDate" : ISODate("2020-03-19T06:44:33.110Z"),
"wMajorityWriteAvailabilityDate" : ISODate("2020-03-19T06:44:34.008Z")
},
"members" : [
{
"_id" : 0,
"name" : "cfg1.example.pk:27019",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 1013,
"optime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-19T06:47:13Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1584600272, 1),
"electionDate" : ISODate("2020-03-19T06:44:32Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "cfg2.example.pk:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 173,
"optime" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-19T06:47:01Z"),
"optimeDurableDate" : ISODate("2020-03-19T06:47:01Z"),
"lastHeartbeat" : ISODate("2020-03-19T06:47:12.377Z"),
"lastHeartbeatRecv" : ISODate("2020-03-19T06:47:14.013Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "cfg1.example.pk:27019",
"syncSourceHost" : "cfg1.example.pk:27019",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "cfg3.example.pk:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 173,
"optime" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-19T06:47:01Z"),
"optimeDurableDate" : ISODate("2020-03-19T06:47:01Z"),
"lastHeartbeat" : ISODate("2020-03-19T06:47:12.377Z"),
"lastHeartbeatRecv" : ISODate("2020-03-19T06:47:14.020Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "cfg1.example.pk:27019",
"syncSourceHost" : "cfg1.example.pk:27019",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
}
],
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1584600261, 1),
"electionId" : ObjectId("7fffffff0000000000000001")
},
"lastCommittedOpTime" : Timestamp(1584600433, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1584600433, 1),
"signature" : {
"hash" : BinData(0,"5S/ou6ONJAUK+J4roPWAKmOf2nk="),
"keyId" : NumberLong("6805806349767671838")
}
},
"operationTime" : Timestamp(1584600433, 1)
}


Now exit from mongo shell with below command:
quit()


STEP12 - Create the Shard Replica Set (rs0)

We will configure shard replica set (rs0) on (shrd1, shrd2, shrd3) servers.

Log in to (shrd1.example.pk), edit /etc/mongod.conf file like below:
sudo nano /etc/mongod.conf

Add, update the following values, make sure you replace port value with 27018 and bindIp value with your server's name:
net:
  port: 27018
  bindIp: shrd1.example.pk

security:
  keyFile: /nfs_share/mongodb/mongodb_key.pem

replication:
  replSetName: rs0

sharding:
  clusterRole: shardsvr

Save and close when you are finished.

Restart mongod service to take changes into effect:
sudo systemctl restart mongod
sudo systemctl status mongod

Make sure you repeat the same on the (shrd2, shrd3) servers, replacing bindIp with each server's own name, before proceeding to the next step.


STEP13 - Initiate the shard replica set (rs0)

Log in to (shrd1.example.pk), connect to mongo shell on port 27018 with administrator user like below:
mongo --host shrd1.example.pk --port 27018 -u administrator --authenticationDatabase admin

Type below on mongo shell to initiate shard replica set (rs0):
rs.initiate({ _id : "rs0", members:[{ _id : 0, host : "shrd1.example.pk:27018" },{ _id : 1, host : "shrd2.example.pk:27018" },{ _id : 2, host : "shrd3.example.pk:27018" }]})

This will return { "ok" : 1 } indicating that shard replica set rs0 initiated successfully.
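If you want to verify the members and their states, you can optionally run the below command from the same shell; one member should report PRIMARY and the other two SECONDARY:
rs.status()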


Now exit from the mongo shell with below command:
quit()


STEP14 - Create the Shard Replica Set (rs1)

We will configure shard replica set (rs1) on (shrd4, shrd5, shrd6) servers.

Log in to (shrd4.example.pk), edit /etc/mongod.conf file:
sudo nano /etc/mongod.conf

Add, update the following values, make sure you replace port value with 27018 and bindIp with your server's name:
net:
  port: 27018
  bindIp: shrd4.example.pk

security:
  keyFile: /nfs_share/mongodb/mongodb_key.pem

replication:
  replSetName: rs1

sharding:
  clusterRole: shardsvr

Save and close file when you are finished.

Restart mongod service with below command to take changes into effect:
sudo systemctl restart mongod
sudo systemctl status mongod

Repeat the same on (shrd5, shrd6) before proceeding to next step.


STEP15 - Initiate the shard replica set (rs1)

Log in to (shrd4.example.pk), connect to mongo shell on port 27018 with administrative authentication like below:
mongo --host shrd4.example.pk --port 27018 -u administrator --authenticationDatabase admin

Type below on mongo shell to initiate shard replica set (rs1):
rs.initiate({ _id : "rs1", members:[{ _id : 0, host : "shrd4.example.pk:27018" },{ _id : 1, host : "shrd5.example.pk:27018" },{ _id : 2, host : "shrd6.example.pk:27018" }]})

This will return { "ok" : 1 } indicating that shard replica set rs1 is initiated successfully.



Now exit from the mongo shell with below command:
quit()


STEP16 - Configure Mongos (Query Router)

We'll create a mongos service that needs to obtain data locks, so be sure mongod is stopped before proceeding:

Log in to (qrt1.example.pk), and deactivate mongod service with below command:
sudo systemctl stop mongod
sudo systemctl disable mongod

Confirm that mongod service is stopped with below command:
sudo systemctl status mongod

The output confirms that mongod is stopped:
mongod.service - MongoDB Database Server
Loaded: loaded (/lib/systemd/system/mongod.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: https://docs.mongodb.org/manual

Mar 19 13:35:48 qrt1.example.pk systemd[1]: Stopped MongoDB Database Server.

On (qrt1.example.pk), create the mongos.conf file like below:
sudo nano /etc/mongos.conf

Add the following configuration directives:
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log

net:
  port: 27017
  bindIp: qrt1.example.pk

security:
  keyFile: /nfs_share/mongodb/mongodb_key.pem

sharding:
  configDB: configReplSet/cfg1.example.pk:27019,cfg2.example.pk:27019,cfg3.example.pk:27019

Save and close file.


Next, create a systemd service unit file for mongos like below:
sudo nano /lib/systemd/system/mongos.service

Add the following parameters:
[Unit]
Description=Mongo Cluster Router
After=network.target

[Service]
User=mongodb
Group=mongodb

ExecStart=/usr/bin/mongos --config /etc/mongos.conf

LimitFSIZE=infinity
LimitCPU=infinity
LimitAS=infinity

LimitNOFILE=64000
LimitNPROC=64000

TasksMax=infinity
TasksAccounting=false

[Install]
WantedBy=multi-user.target

Save and close.
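Since mongos.service is a brand new unit file, reload systemd so it picks up the service definition:
sudo systemctl daemon-reload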



Start mongos service with below command to activate query router.
sudo systemctl start mongos
sudo systemctl enable mongos

Confirm that mongos is active and running with below command:
sudo systemctl status mongos

You will see mongos status like below:
mongos.service - Mongo Cluster Router
   Loaded: loaded (/lib/systemd/system/mongos.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-03-19 13:59:25 PKT; 33s ago
 Main PID: 26985 (mongos)
   CGroup: /system.slice/mongos.service
           └─26985 /usr/bin/mongos --config /etc/mongos.conf

Mar 19 13:59:25 qrt1.example.pk systemd[1]: Started Mongo Cluster Router.

Next, log in to qrt2.example.pk and stop mongod like below:
sudo systemctl stop mongod
sudo systemctl disable mongod

Create mongos.conf file:
sudo nano /etc/mongos.conf

Add the same configuration directives we used on qrt1.example.pk, replacing the bindIp value with qrt2.example.pk:
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log

net:
  port: 27017
  bindIp: qrt2.example.pk

security:
  keyFile: /nfs_share/mongodb/mongodb_key.pem

sharding:
  configDB: configReplSet/cfg1.example.pk:27019,cfg2.example.pk:27019,cfg3.example.pk:27019

Save and close when you are finished.

Next, create a systemd service unit file:
sudo nano /lib/systemd/system/mongos.service

Add the same parameters we used on qrt1.example.pk:
[Unit]
Description=Mongo Cluster Router
After=network.target

[Service]
User=mongodb
Group=mongodb

ExecStart=/usr/bin/mongos --config /etc/mongos.conf

LimitFSIZE=infinity
LimitCPU=infinity
LimitAS=infinity

LimitNOFILE=64000
LimitNPROC=64000

TasksMax=infinity
TasksAccounting=false

[Install]
WantedBy=multi-user.target

Save and close.
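As on qrt1.example.pk, reload systemd so it recognizes the new unit file:
sudo systemctl daemon-reload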

Start mongos service with below command to activate query router.
sudo systemctl start mongos
sudo systemctl enable mongos

Confirm that mongos is active and running with below command:
sudo systemctl status mongos

You will see mongos status like below:
mongos.service - Mongo Cluster Router
   Loaded: loaded (/lib/systemd/system/mongos.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-03-19 14:04:35 PKT; 40min ago
 Main PID: 27137 (mongos)
   CGroup: /system.slice/mongos.service
           └─27137 /usr/bin/mongos --config /etc/mongos.conf

Mar 19 14:04:35 qrt2.example.pk systemd[1]: Started Mongo Cluster Router.


Add Shards to the Cluster

On (qrt1.example.pk), connect to mongo shell on port 27017 with administrative authentication like below:
mongo --host qrt1.example.pk --port 27017 -u administrator --authenticationDatabase admin

On mongo shell, type below to add shard replica set (rs0) in the cluster:
sh.addShard( "rs0/shrd1.example.pk:27018,shrd2.example.pk:27018,shrd3.example.pk:27018")

You will see output similar to the following, indicating that shard replica set rs0 was added successfully.



Type below to add shard replica set (rs1) in the cluster:
sh.addShard( "rs1/shrd4.example.pk:27018,shrd5.example.pk:27018,shrd6.example.pk:27018")

You will see output similar to the following, indicating that shard replica set rs1 was added successfully.


At this stage, your fault-tolerant sharded cluster is active and running.
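You can optionally confirm that both shards are registered by running the below command from the same mongos shell; rs0 and rs1 should both be listed under the shards section:
sh.status()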


STEP17 - Enable Sharding

The last step is to enable sharding for a database. This process takes place in stages due to the way MongoDB organizes data. Before you enable sharding, you'll need to decide on a sharding strategy.

The two most common sharding strategies are:

  • Range-based sharding divides your data based on specific ranges of values in the shard key.
  • Hash-based sharding distributes data by using a hash function on your shard key for a more even distribution of data among the shards.

This is not a comprehensive explanation of how to choose a sharding strategy. You may wish to consult MongoDB's official documentation on sharding strategies.

For this guide, we’ve decided to use a hash-based sharding strategy, enabling sharding at the collections level. This allows the documents within a collection to be distributed among your shards.

Log in to (qrt2.example.pk), access the mongos shell:
mongo --host qrt2.example.pk --port 27017 -u administrator --authenticationDatabase admin


From the mongos shell, type below to create a test database called testDB:
use testDB

Create a new collection called testCollection and hash its _id key. The _id key is already created by default as a basic index for new documents:
db.testCollection.ensureIndex( { _id : "hashed" } )

You will see output similar to the following:



Type below to enable sharding for newly created database:
sh.enableSharding("testDB")
sh.shardCollection( "testDB.testCollection", { "_id" : "hashed" } )

You will see output similar to the following


This enables sharding across any shards that you added to your cluster.

To verify that the sharding was successfully enabled, type below to switch to the config database:
use config

Run below method:
db.databases.find()

If sharding was enabled properly, you will see useful information similar to the below:
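For example, the document for testDB should look roughly like below (the primary shard may be rs0 or rs1 in your setup, and the version subdocument will differ):
{ "_id" : "testDB", "primary" : "rs0", "partitioned" : true, "version" : { ... } }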


Once you enable sharding for a database, MongoDB assigns a primary shard for that database where MongoDB stores all data in that database.


STEP18 - Test MongoDB Sharded Cluster

To ensure your data is being distributed evenly in the database, follow these steps to generate some dummy data in testDB and see how it is divided among the shards.

Connect to the mongos shell on any of your query routers:
mongo --host qrt1.example.pk --port 27017 -u administrator -p --authenticationDatabase admin

Switch to your newly created database (testDB) for example:
use testDB

Type the following code in the mongo shell to generate 10000 simple dummy documents and insert them into testCollection:
for (var i = 1; i <= 10000; i++) db.testCollection.insert( { x : i } )



Run the following code to check how your dummy data has been distributed:
db.testCollection.getShardDistribution()
This will return information similar to the following:


The sections beginning with Shard give information about each shard in your cluster. Since we only added 2 shards with three members each, there are only two sections, but if you add more shards to the cluster, they’ll show up here as well.

The Totals section provides information about the collection as a whole, including its distribution among the shards. Notice that distribution is not perfectly equal. The hash function does not guarantee absolutely even distribution, but with a carefully chosen shard key it will usually be fairly close.

When you're finished, we recommend deleting testDB (it is no longer needed) with the below commands:
use testDB
db.dropDatabase()


Wrapping up

Now that you have successfully deployed a highly available, fault-tolerant sharded cluster ready for your production environment, we recommend configuring a firewall so that ports 27018 and 27019 only accept traffic between the hosts within your cluster.

You'll always connect to the databases in the sharded cluster through the query routers.
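For example, with ufw you could allow the shard port only from the other cluster members on each shard server (a sketch with a placeholder address; repeat one rule per member and adapt it to your firewall of choice, relying on your default policy to block everything else):
sudo ufw allow from <cluster-member-ip> to any port 27018 proto tcp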

How To Set Up a Highly Available PostgreSQL Cluster on Ubuntu 19/20

$
0
0
This guide will walk you through the steps to set up a highly available PostgreSQL cluster using Patroni and HAProxy on Ubuntu 19/20.04. If you prefer to skip the step by step guide, the video tutorial below covers the same setup and may be enough if you pick things up quickly.






Note that, this guide is written for Ubuntu 19.04, 19.10, 20.04 and Debian 9, 10.

Prerequisites

To follow the steps covered in this tutorial, you will need five (physical or virtual) machines installed with Ubuntu or Debian having sudo non-root user privileges.

These are the machines we will use in this guide for our cluster setup. However, you can add more or use fewer machines; it's completely up to your requirements.

HOSTNAME    IP ADDRESS      PURPOSE
NODE1       192.168.10.1    Postgresql, Patroni
NODE2       192.168.10.2    Postgresql, Patroni
NODE3       192.168.10.3    Postgresql, Patroni
NODE4       192.168.10.4    etcd
NODE5       192.168.10.5    HAProxy


When you have prerequisites in place, please proceed to the below steps:

Installing PostgreSQL

In this step, we will install postgres on three of the nodes (node1, node2, node3) one by one using the below command:
sudo apt update
sudo apt -y install postgresql postgresql-server-dev-all
At the time of this tutorial, the postgresql version 11 was the default release in Ubuntu packages repository. If you wish you can install postgresql version 12 like below:
sudo apt -y install postgresql-12 postgresql-server-dev-12
Upon installation, Postgres automatically runs as a service. We need to stop the Postgres service at this point with below command:
sudo systemctl stop postgresql

Installing Patroni

Patroni is an open-source python package that manages postgres configuration. It can be configured to handle tasks like replication, backups and restorations. 

Patroni uses utilities that come installed with postgres, located in the /usr/lib/postgresql/11/bin directory by default on Ubuntu 19. You will need to create symbolic links in the PATH to ensure that Patroni can find the utilities. 

Type below command to create symbolic link and make sure you replace postgresql version if you are running an earlier or later release.
sudo ln -s /usr/lib/postgresql/11/bin/* /usr/sbin/
Type the below command to install python and python-pip packages:
sudo apt -y install python python-pip
Ensure that you have latest version of the setuptools of python package with below command:
sudo pip install --upgrade setuptools
Now you can type below commands to install Patroni:
sudo pip install psycopg2
sudo pip install patroni
sudo pip install python-etcd
Repeat these steps on remaining nodes (node2, node3 in our setup) as well. When you are finished with the above on all three nodes, you can move to next step.
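You can optionally confirm that patroni was installed and is on your PATH with a quick check like below (not required by the guide):
patroni --version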

Configuring Patroni

Patroni can be configured using a YAML file which can be placed anywhere. For sake of this guide, we will place this file under /etc/patroni.yml. 

Create a patroni.yml file on all three nodes that have postgres and Patroni installed (node1, node2, node3 in our case). Change name to something unique, and change listen and connect_address (under postgresql and restapi) to the appropriate values on each node. 

Create /etc/patroni.yml file on your first node (node1 in our case) like below:
sudo nano /etc/patroni.yml
add, update below configuration parameters to reflect yours:
scope: postgres
namespace: /db/
name: node1

restapi:
  listen: 192.168.10.1:8008
  connect_address: 192.168.10.1:8008

etcd:
  host: 192.168.10.4:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true

  initdb:
  - encoding: UTF8
  - data-checksums

  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator 192.168.10.1/0 md5
  - host replication replicator 192.168.10.2/0 md5
  - host replication replicator 192.168.10.3/0 md5
  - host all all 0.0.0.0/0 md5

  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: 192.168.10.1:5432
  connect_address: 192.168.10.1:5432
  data_dir: /data/patroni
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: reppassword
    superuser:
      username: postgres
      password: secretpassword
  parameters:
    unix_socket_directories: '.'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Save and close file. 

You can see /etc/patroni.yml file on (node1 in our case) in below screenshot:


The below screenshot is /etc/patroni.yml on (node2 in our case)


and below is the /etc/patroni.yml screenshot of (node3 in our case):


Make note of the data_dir value in the above file. The postgres user needs the ability to write to this directory. If this directory doesn’t exist, create it with below command on all three nodes (node1, node2, node3 in our case):
sudo mkdir -p /data/patroni
Make postgres the owner of /data/patroni:
sudo chown postgres:postgres /data/patroni
Change the permissions on this directory to make it accessible only to the postgres user:
sudo chmod 700 /data/patroni
Next, we will create a systemd script that will allow us to start, stop and monitor Patroni.

You need to create a file /etc/systemd/system/patroni.service on all three nodes (node1, node2, node3 in our case) with below command:
sudo nano /etc/systemd/system/patroni.service
add below configuration parameters in it:
[Unit]
Description=High availability PostgreSQL Cluster
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target
Save and close file.

If Patroni is installed in a location other than /usr/local/bin/patroni on your machine, update the appropriate path in above file accordingly. 

Installing etcd

Etcd is a fault-tolerant, distributed key-value store that is used to store the state of the postgres cluster. Using Patroni, all of the postgres nodes make use of etcd to keep the postgres cluster up and running. 

For the sake of this guide we will use a single-server etcd cluster. However, in production, it may be best to use a larger etcd cluster so that if one etcd node fails, it doesn't affect your postgres servers.

Type below command to install etcd on (node4 in our case):
sudo apt -y install etcd

Configuring etcd

At this point, you need to edit the /etc/default/etcd file on (node4 in our case) like below:
sudo nano /etc/default/etcd
Look for the following parameters, uncomment them by removing the leading # and update them to reflect yours:
ETCD_LISTEN_PEER_URLS="http://192.168.10.4:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.168.10.4:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.10.4:2380"
ETCD_INITIAL_CLUSTER="default=http://192.168.10.4:2380,"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.10.4:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Save and close the file.

You can see /etc/default/etcd screenshot on node4 in our case:


Restart the etcd service to take changes into effect:
sudo systemctl restart etcd
If etcd service failed to start, reboot your machine.
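To quickly confirm that etcd is reachable before starting Patroni, you can optionally query its health endpoint (assuming the default HTTP API on port 2379):
curl http://192.168.10.4:2379/health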

At this point, start Patroni and Postgres service on your first node (node1 in our case) with below command:
sudo systemctl start patroni
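If you also want Patroni to start automatically at boot, you can optionally enable the service as well:
sudo systemctl enable patroni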
Verify the status of the Patroni service with below command:
sudo systemctl status patroni
If everything is set up correctly, the output from the node1 (master) will look similar to like below: 


When starting patroni on subsequent nodes, (node2, node3 in our case) the output will look similar to like below:


Make sure you have performed these steps on each of the three nodes with Postgres installed to create a highly available Postgres cluster with one master and two slaves.
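At this point you can also optionally check the cluster state with patronictl, assuming it was installed alongside Patroni by pip; one node should be reported as the leader and the others as replicas:
patronictl -c /etc/patroni.yml list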

Installing HAProxy

When developing an application that uses a database, it can be cumbersome to keep track of the database endpoints if they keep changing. Using HAProxy simplifies this by giving a single endpoint to which you can connect the application.

HAProxy forwards the connection to whichever node is currently the master. It does this using a REST endpoint that Patroni provides. Patroni ensures that, at any given time, only the master postgres node will appear as online, forcing HAProxy to connect to the correct node.

Type the below command to install HAProxy on (node5 in our case):
sudo apt -y install haproxy

Configuring HAProxy

With the Postgres cluster set up, you need a method to connect to the master regardless of which of the servers in the cluster is the master. This is where HAProxy steps in. All Postgres clients (your applications, psql, etc.) will connect to HAProxy which will make sure you connect to the master in the cluster.

You need to edit the configuration file /etc/haproxy/haproxy.cfg on the HAProxy node (in our case node5) that has HAProxy installed:
sudo nano /etc/haproxy/haproxy.cfg
add, update following configuration parameters:
global
maxconn 100

defaults
log global
mode tcp
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s

listen stats
mode http
bind *:7000
stats enable
stats uri /

listen postgres
bind *:5000
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server postgresql_192.168.10.1_5432 192.168.10.1:5432 maxconn 100 check port 8008
server postgresql_192.168.10.2_5432 192.168.10.2:5432 maxconn 100 check port 8008
server postgresql_192.168.10.3_5432 192.168.10.3:5432 maxconn 100 check port 8008
Save and close file.

You can see below screenshot of our /etc/haproxy/haproxy.cfg file on node5:


Restart HAProxy to take the changes into effect and use the new settings:
sudo systemctl restart haproxy
If HAProxy fails to start, check for syntax errors:
/usr/sbin/haproxy -c -V -f /etc/haproxy/haproxy.cfg

Testing Postgres Cluster Setup

Connect Postgres clients to the HAProxy IP address of the node on which you installed HAProxy (in this guide, 192.168.10.5) on port 5000.
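For example, assuming the postgres superuser credentials from patroni.yml and the psql client installed on your machine, a connection test could look like below:
psql -h 192.168.10.5 -p 5000 -U postgres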

You can also access HAProxy node on port 7000 using any of your preferred web browser to see the HAProxy dashboard like below: 


As you can see, the postgresql_192.168.10.1_5432 row is highlighted in green. This indicates that 192.168.10.1 is currently acting as the master.

If you kill the primary node (using sudo systemctl stop patroni or by completely shutting down the server), the dashboard will look similar to like below: 



In the postgres section, the postgresql_192.168.10.1_5432 row is now red and the postgresql_192.168.10.3_5432 row is highlighted in green. This indicates that 192.168.10.3 is currently acting as the master. 

Note: In this case, it just so happens that the third Postgres server is promoted to master. This might not always be the case. It is equally likely that the second server may be promoted to master. 

When you now bring up the first server, it will rejoin the cluster as a slave and will sync up with the master. 

Wrapping up 

You now have a robust, highly available Postgres cluster ready for use. While the setup in this tutorial should go far in making your Postgres deployment highly available, here are few more steps you can take to improve it further: 


  • Use a larger etcd cluster to improve availability.
  • Use PgBouncer to pool connections. 
  • Add another HAProxy server and configure IP failover to create a highly available HAProxy cluster.

How To Set Up a Highly Available PostgreSQL Cluster on Ubuntu 18.04

$
0
0
This step by step guide will show you how to set up a highly available PostgreSQL cluster using Patroni and HAProxy on Ubuntu 18.04. These instructions can also be applied (slight changes may be required) if you are running an earlier release such as Ubuntu 16.04 or 17.



Prerequisites

To follow the steps covered in this tutorial, you will need four (physical or virtual) machines installed with Ubuntu 18.04.4 server having sudo non-root user privileges.

These are the machines we will use in this guide for our cluster setup. However, you can add more machines if you wish; it's completely up to your requirements.

HOSTNAME    IP ADDRESS      PURPOSE
NODE1       192.168.10.1    Postgresql, Patroni
NODE2       192.168.10.2    Postgresql, Patroni
NODE3       192.168.10.3    etcd
NODE4       192.168.10.4    HAProxy


When you have prerequisites in place, please proceed to the below steps:


Installing PostgreSQL

In this step, we will install postgres on two of the nodes (node1, node2) one by one using the below command:
sudo apt update
sudo apt -y install postgresql postgresql-server-dev-all
At the time of this tutorial, the postgresql version 10 was the default release in Ubuntu  18.04 packages repository. If you wish you can install postgresql version 12 like below:
sudo apt -y install postgresql-12 postgresql-server-dev-12
Upon installation, Postgres automatically runs as a service. We need to stop the Postgres service at this point with below command:
sudo systemctl stop postgresql

Installing Patroni

Patroni is an open-source python package that manages postgres configuration. It can be configured to handle tasks like replication, backups and restorations. 

Patroni uses utilities that come installed with postgres, located in the /usr/lib/postgresql/10/bin directory by default on Ubuntu 18.04. You will need to create symbolic links in the PATH to ensure that Patroni can find the utilities. 

Type below command to create symbolic link and make sure you replace postgresql version if you are running an earlier or later release.
sudo ln -s /usr/lib/postgresql/10/bin/* /usr/sbin/
Type the below command to install python and python-pip packages:
sudo apt -y install python python-pip
Ensure that you have latest version of the setuptools of python package with below command:
sudo -H pip install --upgrade setuptools
Type below command to install psycopg2:
sudo -H pip install psycopg2

Type below command to install patroni and python-etcd:

sudo -H pip install patroni
sudo -H pip install python-etcd
Repeat these steps on remaining nodes (node2 in our case) as well. When you are finished with the above on each node (designated for postgresql and patroni), you can move to next step.
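You can optionally confirm where pip placed the patroni executable, since the systemd unit created later in this guide expects it under /usr/local/bin:
which patroni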


Configuring Patroni

Patroni can be configured using a YAML file which can be placed anywhere. For sake of this guide, we will place this file under /etc/patroni.yml. 

Create a patroni.yml file on both nodes that have postgres and Patroni installed (node1, node2 in our case). Change name to something unique, and change listen and connect_address (under postgresql and restapi) to the appropriate values on each node. 

Create /etc/patroni.yml file on your first node (node1 in our case) like below:
sudo nano /etc/patroni.yml
add, update below configuration parameters to reflect yours:
scope: postgres
namespace: /db/
name: node1

restapi:
  listen: 192.168.10.1:8008
  connect_address: 192.168.10.1:8008

etcd:
  host: 192.168.10.3:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true

  initdb:
  - encoding: UTF8
  - data-checksums

  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator 192.168.10.1/0 md5
  - host replication replicator 192.168.10.2/0 md5
  - host all all 0.0.0.0/0 md5

  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: 192.168.10.1:5432
  connect_address: 192.168.10.1:5432
  data_dir: /data/patroni
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: reppassword
    superuser:
      username: postgres
      password: secretpassword
  parameters:
    unix_socket_directories: '.'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Save and close file. 

Next, create /etc/patroni.yml file on your second node (node2 in our case) like below:
scope: postgres
namespace: /db/
name: node2

restapi:
  listen: 192.168.10.2:8008
  connect_address: 192.168.10.2:8008

etcd:
  host: 192.168.10.3:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true

  initdb:
  - encoding: UTF8
  - data-checksums

  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator 192.168.10.2/0 md5
  - host replication replicator 192.168.10.1/0 md5
  - host all all 0.0.0.0/0 md5

  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: 192.168.10.2:5432
  connect_address: 192.168.10.2:5432
  data_dir: /data/patroni
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: reppassword
    superuser:
      username: postgres
      password: secretpassword
  parameters:
    unix_socket_directories: '.'

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Save and close file. 

Make note of the data_dir value in the above file. The postgres user needs the ability to write to this directory. If this directory doesn’t exist, create it with below command on each node (node1, node2 in our case):
sudo mkdir -p /data/patroni
Make postgres the owner of /data/patroni:
sudo chown postgres:postgres /data/patroni
Change the permissions on this directory to make it accessible only to the postgres user:
sudo chmod 700 /data/patroni
Next, we will create a systemd script that will allow us to start, stop and monitor Patroni.

You need to create a file /etc/systemd/system/patroni.service on each node (node1, node2 in our case) with below command:
sudo nano /etc/systemd/system/patroni.service
add below configuration parameters in it:
[Unit]
Description=High availability PostgreSQL Cluster
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target
Save and close file.

If Patroni is installed in a location other than /usr/local/bin/patroni on your machine, update the appropriate path in above file accordingly. 


Installing etcd

Etcd is a fault-tolerant, distributed key-value store that is used to store the state of the postgres cluster. Using Patroni, all of the postgres nodes make use of etcd to keep the postgres cluster up and running. 

For the sake of this guide we will use a single-server etcd cluster. However, in production, it may be best to use a larger etcd cluster so that if one etcd node fails, it doesn't affect your postgres servers. 

Type below command to install etcd on (node3 in our case):
sudo apt -y install etcd

Configuring etcd

At this point, you need to edit the /etc/default/etcd file on (node3 in our case) like below:
sudo nano /etc/default/etcd
Look for the following parameters, uncomment them by removing the leading # and update them to reflect yours:
ETCD_LISTEN_PEER_URLS="http://192.168.10.3:2380"
ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.168.10.3:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.10.3:2380"
ETCD_INITIAL_CLUSTER="default=http://192.168.10.3:2380,"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.10.3:2379"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
Save and close the file.

You can see final /etc/default/etcd screenshot below on node3 after modification in our case:



Restart the etcd service to take changes into effect: 
sudo systemctl restart etcd
If etcd service failed to start, reboot your machine.


When the etcd service starts successfully, you need to start the Patroni service on each node, beginning with the primary (node1 in our case), with below command:
sudo systemctl start patroni
Verify the status of the Patroni service with below command:
sudo systemctl status patroni
If everything is set up correctly, the output from the node1 (master) will look similar to like below: 


When starting patroni on subsequent nodes, (node2 in our case) the output will look similar to like below:


Make sure you have performed these steps on each of the node with Postgres and Patroni installed (node1, node2 in our case) to create a highly available Postgres cluster with one master and one slave.

Installing HAProxy

When developing an application that uses a database, it can be cumbersome to keep track of the database endpoints if they keep changing. Using HAProxy simplifies this by giving a single endpoint to which you can connect the application.

HAProxy forwards the connection to whichever node is currently the master. It does this using a REST endpoint that Patroni provides. Patroni ensures that, at any given time, only the master postgres node will appear as online, forcing HAProxy to connect to the correct node.

Type the below command to install HAProxy on (node4 in our case):
sudo apt -y install haproxy

Configuring HAProxy

With the Postgres cluster set up, you need a method to connect to the master regardless of which of the servers in the cluster is the master. This is where HAProxy steps in. All Postgres clients (your applications, psql, etc.) will connect to HAProxy which will make sure you connect to the master in the cluster.

You need to edit the configuration file /etc/haproxy/haproxy.cfg on the HAProxy node (in our case node4) that has HAProxy installed:
sudo nano /etc/haproxy/haproxy.cfg
add, update following configuration parameters:
global
maxconn 100

defaults
log global
mode tcp
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s

listen stats
mode http
bind *:7000
stats enable
stats uri /

listen postgres
bind *:5000
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server postgresql_192.168.10.1_5432 192.168.10.1:5432 maxconn 100 check port 8008
server postgresql_192.168.10.2_5432 192.168.10.2:5432 maxconn 100 check port 8008
Save and close file when you are finished.

You can see the below screenshot of final /etc/haproxy/haproxy.cfg file on node4 in our case:


Restart HAProxy to take the changes into effect with below command:
sudo systemctl restart haproxy
Verify the service status with below command:
sudo systemctl status haproxy
You will see the output similar to like below:


If HAProxy fails to start, check for syntax errors with below command:
/usr/sbin/haproxy -c -V -f /etc/haproxy/haproxy.cfg

Testing Postgres Cluster Setup

Connect Postgres clients to the HAProxy IP address of the node on which you installed HAProxy (in this guide, 192.168.10.4) on port 5000.

You can also access HAProxy node on port 7000 using any of your preferred web browser to see the HAProxy dashboard like below: 


As you can see, the postgresql_192.168.10.1_5432 row is highlighted in green. This indicates that 192.168.10.1 is currently acting as the master.

If you kill the primary node (using sudo systemctl stop patroni or by completely shutting down the server), the dashboard will look similar to like below: 


In the postgres section, the postgresql_192.168.10.1_5432 row is now red and the postgresql_192.168.10.2_5432 row is highlighted in green. This indicates that 192.168.10.2 is currently acting as the master. 

Note: In this case, it just so happens that the second Postgres server is promoted to master. This might not always be the case if you have more than two nodes in cluster. It is equally likely that the third, fourth or fifth server may be promoted to master. 

When you bring up the first server, it will rejoin the cluster as a slave and will sync up with the master. 


Wrapping up 

You now have a robust, highly available Postgres cluster ready for use. While the setup in this tutorial should go far in making your Postgres deployment highly available, here are few more steps you can take to improve it further: 

  • Use a larger etcd cluster to improve availability.
  • Use PgBouncer to pool connections. 
  • Add another HAProxy server and configure IP failover to create a highly available HAProxy cluster.

How To Create a Highly Available PostgreSQL Cluster on CentOS/RHEL 8

$
0
0
This tutorial will walk you through the steps to set up a highly available PostgreSQL cluster using Patroni and HAProxy on CentOS/RHEL 8.



Prerequisites

To follow this tutorial, you will need 5 (physical or virtual) machines with CentOS or RHEL 8 server with minimal installed, having sudo non-root user privileges. We have already prepared following 5 machines with CentOS release 8.1.1911 for this guide. However, if you wish you can add up more machines in your cluster environment.

HOSTNAME    IP ADDRESS      PURPOSE
node1       192.168.10.1    Postgresql, Patroni
node2       192.168.10.2    Postgresql, Patroni
node3       192.168.10.3    Postgresql, Patroni
node4       192.168.10.4    etcd
node5       192.168.10.5    HAProxy

If you wish, you can watch the below quick video tutorial to set up your postgres cluster environment:

If you are not comfortable with the video tutorial, please follow the below step by step instruction:

Adding EPEL Repository

It is always recommended to install the Extra Packages for Enterprise Linux (EPEL) repository before installing any other packages on your server.

Type below command to add epel repo on (node1, node2, node3) only:
dnf -y install epel-release
dnf config-manager --set-enabled PowerTools
dnf -y install yum-utils

Adding PostgreSQL Repository

PostgreSQL version 12 is not available in the CentOS/RHEL 8 default repository, therefore we need to install the official postgres repository with below command:

Type below to add the postgres repo on (node1, node2, node3) only:
dnf -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
yum-config-manager --enable pgdg12
dnf -qy module disable postgresql

Installing PostgreSQL

We are installing PostgreSQL version 12 for this guide. If you wish, you can install any other version of your choice:

Type below command to install postgres version 12 on (node1, node2, node3) only:
dnf -y install postgresql12-server postgresql12 postgresql12-devel
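To quickly verify that the PostgreSQL 12 binaries are in place (the same bin_dir referenced in the patroni configuration later in this guide), you can optionally run:
/usr/pgsql-12/bin/postgres --version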

Installing Patroni

Patroni is an open-source python package that manages postgres configuration. It can be configured to handle tasks like replication, backups and restorations.

Type below command to install patroni on (node1, node2, node3) only:
dnf -y install https://download-ib01.fedoraproject.org/pub/epel/7/x86_64/Packages/p/python3-psycopg2-2.7.7-2.el7.x86_64.rpm
dnf -y install https://github.com/cybertec-postgresql/patroni-packaging/releases/download/1.6.0-1/patroni-1.6.0-1.rhel7.x86_64.rpm

Configuring Patroni

Patroni can be configured using a YAML file which is by default located under /opt/app/patroni/etc/ with appropriate permission.

Type below command on (node1, node2, node3) to copy postgresql.yml.sample to postgresql.yml:
cp -p /opt/app/patroni/etc/postgresql.yml.sample /opt/app/patroni/etc/postgresql.yml
Now you need to edit postgresql.yml file on each node (node1, node2, node3) with any of your preferred editor i.e. vi, vim, nano etc:
vi /opt/app/patroni/etc/postgresql.yml
Remove everything from this file and add below configuration parameters:
scope: postgres
namespace: /pg_cluster/
name: node1

restapi:
  listen: 192.168.10.1:8008
  connect_address: 192.168.10.1:8008

etcd:
  host: 192.168.10.4:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:

  initdb:
  - encoding: UTF8
  - data-checksums

  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator 192.168.10.1/0 md5
  - host replication replicator 192.168.10.2/0 md5
  - host replication replicator 192.168.10.3/0 md5
  - host all all 0.0.0.0/0 md5

  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb

postgresql:
  listen: 192.168.10.1:5432
  connect_address: 192.168.10.1:5432
  data_dir: /var/lib/pgsql/12/data
  bin_dir: /usr/pgsql-12/bin
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: reppassword
    superuser:
      username: postgres
      password: postgrespassword

tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false
Change name to something unique, and change listen and connect_address (under postgresql and restapi) to the appropriate values on each node (node1, node2, node3 in our case).

Save and close file when you are finished.

For reference, you can see the below screenshots of /opt/app/patroni/etc/postgresql.yml from node1 in our setup:


below is from node2:


and below is from node3:


Make sure you have performed all of the above steps on each node designated for postgresql and patroni (node1, node2, node3 in our case) before going to the next step of installing and configuring etcd.

Installing etcd

Etcd is a fault-tolerant, distributed key-value store that is used to store the state of the postgres cluster. Using Patroni, all of the postgres nodes make use of etcd to keep the postgres cluster up and running.

For the sake of this guide we will use a single-server etcd cluster. However, in production, it may be best to use a larger etcd cluster so that if one etcd node fails, it doesn't affect your postgres servers.

Type below command to install etcd on node that is designated for etcd (node4 in our case):
dnf -y install http://mirror.centos.org/centos/7/extras/x86_64/Packages/etcd-3.3.11-2.el7.centos.x86_64.rpm

Configuring etcd

At this point, you need to edit the /etc/etcd/etcd.conf file like below:
vi /etc/etcd/etcd.conf
Uncomment the following configuration parameters by removing the leading # and make sure you replace the ip address of the etcd node with yours:

#[Member]

#ETCD_CORS=""
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
#ETCD_WAL_DIR=""
ETCD_LISTEN_PEER_URLS="http://192.168.10.4:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.10.4:2379"
#ETCD_MAX_SNAPSHOTS="5"
#ETCD_MAX_WALS="5"
ETCD_NAME="default"
#ETCD_SNAPSHOT_COUNT="100000"
#ETCD_HEARTBEAT_INTERVAL="100"
#ETCD_ELECTION_TIMEOUT="1000"
#ETCD_QUOTA_BACKEND_BYTES="0"
#ETCD_MAX_REQUEST_BYTES="1572864"
#ETCD_GRPC_KEEPALIVE_MIN_TIME="5s"
#ETCD_GRPC_KEEPALIVE_INTERVAL="2h0m0s"
#ETCD_GRPC_KEEPALIVE_TIMEOUT="20s"
#
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.10.4:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.10.4:2379"
#ETCD_DISCOVERY=""
#ETCD_DISCOVERY_FALLBACK="proxy"
#ETCD_DISCOVERY_PROXY=""
#ETCD_DISCOVERY_SRV=""
ETCD_INITIAL_CLUSTER="default=http://192.168.10.4:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
#ETCD_STRICT_RECONFIG_CHECK="true"
#ETCD_ENABLE_V2="true"
#
#[Proxy]
#ETCD_PROXY="off"
#ETCD_PROXY_FAILURE_WAIT="5000"
#ETCD_PROXY_REFRESH_INTERVAL="30000"
#ETCD_PROXY_DIAL_TIMEOUT="1000"
#ETCD_PROXY_WRITE_TIMEOUT="5000"
#ETCD_PROXY_READ_TIMEOUT="0"
#
#[Security]
#ETCD_CERT_FILE=""
#ETCD_KEY_FILE=""
#ETCD_CLIENT_CERT_AUTH="false"
#ETCD_TRUSTED_CA_FILE=""
#ETCD_AUTO_TLS="false"
#ETCD_PEER_CERT_FILE=""
#ETCD_PEER_KEY_FILE=""
#ETCD_PEER_CLIENT_CERT_AUTH="false"
#ETCD_PEER_TRUSTED_CA_FILE=""
#ETCD_PEER_AUTO_TLS="false"
#
#[Logging]
#ETCD_DEBUG="false"
#ETCD_LOG_PACKAGE_LEVELS=""
#ETCD_LOG_OUTPUT="default"
#
#[Unsafe]
#ETCD_FORCE_NEW_CLUSTER="false"
#
#[Version]
#ETCD_VERSION="false"
#ETCD_AUTO_COMPACTION_RETENTION="0"
#
#[Profiling]
#ETCD_ENABLE_PPROF="false"
#ETCD_METRICS="basic"
#
#[Auth]
#ETCD_AUTH_TOKEN="simple"
Save and close file when you are finished.

Now start etcd service on (node4 in our case) to take changes in to effect with below command:
systemctl enable etcd
systemctl start etcd
systemctl status etcd
If everything goes well, you will see the output similar to like below screenshot:


Starting Patroni

At this point, you need to start the patroni service on your first node (node1 in our case):
systemctl enable patroni
systemctl start patroni
systemctl status patroni
If everything was set up correctly, you will see the output similar to like below screenshot.


When starting patroni on subsequent nodes, (node2, node3 in our case) the output will look similar to like below:

Output from node2:


Output from node3:


Make sure patroni service is running on each node (node1, node2, node3 in our case) before going to next step of installing and configuring haproxy.

Installing HAProxy

When developing an application that uses a database, it can be cumbersome to keep track of the database endpoints if they keep changing. Using HAProxy simplifies this by giving a single endpoint to which you can connect the application.

HAProxy forwards the connection to whichever node is currently the master. It does this using a REST endpoint that Patroni provides. Patroni ensures that, at any given time, only the master postgres node will appear as online, forcing HAProxy to connect to the correct node.

Type the below command to install HAProxy on (node5 in our case):
dnf -y install haproxy

Configuring HAProxy

With the Postgres cluster set up, you need a method to connect to the master regardless of which of the servers in the cluster is the master. This is where HAProxy steps in.

All Postgres clients (your applications, psql, etc.) will connect to HAProxy which will make sure you connect to the master node in the cluster.

Take the backup of original file first with below command:
cp -p /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig
vi /etc/haproxy/haproxy.cfg
Remove everything from this file, add below configuration parameters but make sure you replace highlighted text with yours:

global

maxconn 100
log 127.0.0.1 local2

defaults
log global
mode tcp
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s

listen stats
mode http
bind *:7000
stats enable
stats uri /

listen postgres
bind *:5000
option httpchk
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server node1 192.168.10.1:5432 maxconn 100 check port 8008
server node2 192.168.10.2:5432 maxconn 100 check port 8008
server node3 192.168.10.3:5432 maxconn 100 check port 8008

Save and close file when you are finished. 

Now start HAProxy to take changes into effect with the below command:
systemctl enable haproxy
systemctl start haproxy
systemctl status haproxy
You will see the output similar to the below screenshot.

If HAProxy fails to start, check the configuration for syntax errors with below command:
haproxy -c -V -f /etc/haproxy/haproxy.cfg

Testing Postgres HA Cluster Setup

Connect Postgres clients to the HAProxy IP address of the node on which you installed HAProxy (in this guide, 192.168.10.5) on port 5000 to verify your HA Cluster setup.


As you can see in above screenshot, client machine is able to make connection to postgres server via haproxy.

You can also access HAProxy node (192.168.10.5 in our case) on port 7000 using any of your preferred web browser to see your HA Cluster status on HAProxy dashboard like below:


As you can see in above screenshot, the (node1) row is highlighted in green. This indicates that (node1 192.168.10.1) is currently acting as the master.

If you kill the primary node with systemctl stop patroni command or by completely shutting down the server, the dashboard will look similar to like below:


In the postgres section in above screenshot, the (node1) row is now red and the (node2) row is highlighted in green. This indicates that (node2 192.168.10.2) is currently acting as the master.

Note: In this case, it just so happens that the second Postgres server is promoted to master. This might not always be the case if you have more than two nodes in cluster. It is equally likely that the third server may be promoted to master.

When you bring up the first server, it will rejoin the cluster as a slave and will sync up with the master.

Wrapping up

You now have a robust, highly available Postgres cluster ready for use. While the setup in this tutorial should go far in making your Postgres deployment highly available, here are few more steps you can take to improve it further:

Use a larger etcd cluster to improve availability.
Use PgBouncer to pool connections.
Add another HAProxy server and configure IP failover to create a highly available HAProxy cluster.

How To Install and Secure MongoDB on Ubuntu 18/19/20

MongoDB is a database engine that provides access to non-relational, document-oriented databases. This tutorial will show you how to install and configure MongoDB on Ubuntu. We will also explain some basic features and functions of the database.

Note that, this guide is specifically written for Ubuntu 18.04, 19.04, 19.10, 20.04 and Debian 9, 10.

Prerequisites

To follow this guide, you will need one (physical or virtual) machine installed with Ubuntu or Debian having a non-root user with sudo privileges. You should set the correct timezone on your server with the below commands:


sudo timedatectl list-timezones

sudo timedatectl set-timezone Asia/Karachi

You should set hostname of your server with below command:

sudo hostnamectl set-hostname your_server_name

Adding MongoDB Repository

Import the MongoDB public GPG key for package signing:

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4b7c549a058f8b6b

Add the MongoDB repository to your sources.list.d directory:

echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.2.list

Installing MongoDB

Now that the MongoDB repository has been added, we’re ready to install the latest stable version of MongoDB:

sudo apt -y install mongodb-org

Configuring MongoDB

The configuration file for MongoDB is located at /etc/mongod.conf, and is written in YAML format.

We highly recommend enabling the security section by removing the # in front of it and adding authorization: enabled as shown below:

sudo nano /etc/mongod.conf
# mongod.conf

# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1

# how the process runs
processManagement:
  timeZoneInfo: /usr/share/zoneinfo

security:
  authorization: enabled

#operationProfiling:

#replication:

#sharding:

## Enterprise-Only Options:

#auditLog:

#snmp:

Save and close file when you are finished.

The authorization option enables role-based access control for your databases. If no value is specified, any user will have the ability to modify any database. We'll explain how to create database users and set their permissions later in this guide.

After making changes to the MongoDB configuration file, restart the service as shown in the following section.

Start and Stop MongoDB

To start, restart, or stop the MongoDB service, type the appropriate command from the following:

sudo systemctl start mongod
sudo systemctl restart mongod
sudo systemctl stop mongod

You can also enable MongoDB to start on boot:

sudo systemctl enable mongod

Creating Database Users

If you enabled role-based access control in the Configuring MongoDB section, create a user administrator with credentials for use on the database:

Type below command to open the mongo shell:

mongo

By default, MongoDB connects to a database called test. Before adding any users, create a database to store user data for authentication:

use admin

Use the following command to create an administrative user with the ability to create other users on any database. For better security, change the example values dbadmin and P@ssw0rd:

db.createUser({user: "dbadmin", pwd: "P@ssw0rd", roles:[{role: "userAdminAnyDatabase", db: "admin"}]})

Keep these credentials in a safe place for future reference. The output will display all the information written to the database except the password:
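Based on the example values above, it should look similar to this:

Successfully added user: {
	"user" : "dbadmin",
	"roles" : [
		{
			"role" : "userAdminAnyDatabase",
			"db" : "admin"
		}
	]
}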

Type below to exit the mongo shell:

quit()

Now, test your connection to MongoDB with the newly created credentials:

mongo -u dbadmin -p --authenticationDatabase admin

The -u, -p, and --authenticationDatabase options in the above command are required in order to authenticate connections to the shell. Without authentication, the MongoDB shell can be accessed but will not allow connections to databases.

As the dbadmin user, create a new database to store regular user data for authentication. The following example calls this database user-db:

use user-db
db.createUser({user: "dbuser1", pwd: "password", roles:[{role: "read", db: "user-db"}, {role:"readWrite", db: "testDB"}]})

Output similar to the following will display all the information written to the database, except the password:

Successfully added user: {
"user" : "dbuser1",
"roles" : [
{
"role" : "read",
"db" : "user-db"
},
{
"role" : "readWrite",
"db" : "testDB"
}
]
}

To create additional users, repeat these steps as the administrative user, creating new usernames, passwords and roles by substituting the appropriate values. 

Type below to exit the mongo shell:

quit()

Managing Data and Collections

This section will explain a few basic features, but we encourage you to do further research based on your specific use case.

Open the MongoDB shell using the dbuser1 we created in earlier step:

mongo -u dbuser1 -p --authenticationDatabase user-db

Create a new database, for example testDB:

use testDB

Make sure that this database name corresponds with the one for which the user has read and write permissions (we already added these permissions in previous step). 

To show the name of the current working database, run the db command. Create a new collection, for example testDBCollection:

db.createCollection("testDBCollection", {capped: false})

If you’re not familiar with MongoDB terminology, you can think of a collection as analogous to a table in a relational database management system. 

Create sample data for entry into the test database. MongoDB accepts input as documents in the form of JSON objects such as those below. The a and b variables are used to simplify entry; objects can be inserted directly via functions as well.


var a = { name : "M Anwar", attributes: { age : 34, address : "21 Fish St", phone : "03218675309" }}

var b = { name : "F Kamal", attributes: { age : 30, address : "31 Main Rd", favorites : { food : "Burgers", animal : "Cat" } }}

Note that documents inserted into a collection need not have the same schema, which is one of many benefits of using a NoSQL database. Insert the data into testDBCollection, using the insert method:


db.testDBCollection.insert(a)


db.testDBCollection.insert(b)


The output for each of these operations will show the number of objects successfully written to the current working database:

WriteResult({ "nInserted" : 1 })

Confirm that the testDBCollection was properly created:

show collections

The output will list all collections containing data within the current working database:

testDBCollection

View unfiltered data in the testDBCollection using the find method. This returns up to the first 20 documents in a collection, if a query is not passed:

db.testDBCollection.find()

The output will be similar to the following:

{ "_id" : ObjectId("471a2e6507d0fcd67baef07f"), "name" : "M Anwar", "attributes" : { "age" : 34, "address" : "21 Fish St", "phone" : "03218675309" } }

{ "_id" : ObjectId("471a2e7607d0fcd67baef080"), "name" : "F Kamal", "attributes" : { "age" : 30, "address" : "31 Main Rd", "favorites" : { "food" : "Burgers", "animal" : "Cat" } } }


You may notice the objects we entered are preceded by _id keys and ObjectId values. These are unique indexes generated by MongoDB when an _id value is not explicitly defined. ObjectId values can be used as primary keys when entering queries, although for ease of use, you may wish to create your own index as you would with any other database system. 

The find method can also be used to search for a specific document or field by entering a search term parameter (in the form of an object) rather than leaving it empty. For example:

db.testDBCollection.find({"name" : "M Anwar"})
Running the command above returns a list of documents containing the {"name" : "M Anwar"} object. To view the available options for a particular method, append .help() to the end of your command. For example, to see a list of options for the find method:

db.testDBCollection.find().help()

Wrapping up 

You now have a ready-to-use MongoDB database for production. In our next guide, we'll show you how to create a highly available MongoDB sharded cluster for your production environment.

How To Set Up a Highly Available MongoDB Sharded Cluster on CentOS/RHEL 7/8

This tutorial will walk you through the steps to set up a highly available MongoDB sharded cluster on CentOS/RHEL 7/8.

A MongoDB sharded cluster consists of the following three components:

Shards store the data and can each be deployed as a replica set.
Config servers store metadata and configuration settings for the cluster.
Mongos acts as a query router, providing an interface between client applications and the sharded cluster.

Prerequisites

To follow along with this tutorial, you will need at least 7 (physical or virtual) machines installed with CentOS/RHEL 7 or 8, each having a non-root user with sudo privileges. Make sure you have completed basic network settings including hostname, timezone, and IP addresses on each of your servers.


We will use these 7 machines, with the information described below, for our sharded cluster throughout this tutorial. Make sure you substitute the hostnames, IP addresses, and domain with yours wherever applicable.

Create SSH Key-Pair

We will generate an ssh key-pair to set up passwordless authentication between the hosts in the cluster.

Log in to (cfg1.example.pk), and generate an ssh key-pair with below command:
ssh-keygen

This will present the following prompt; press enter to accept the default location:
Generating public/private rsa key pair.
Enter file in which to save the key (/home/administrator/.ssh/id_rsa):

Again press enter to leave the passphrase fields blank:
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

You will see output similar to the following, which confirms that the ssh key-pair was generated successfully under your user's home directory:
Your identification has been saved in /home/administrator/.ssh/id_rsa.
Your public key has been saved in /home/administrator/.ssh/id_rsa.pub.

The key fingerprint is:
SHA256:DWWeuoSoQgHJnYkbrs8QoFs8oMjP0Sv8/3ehuN17MPE administrator@cfg1.example.pk
The key's randomart image is:
+---[RSA 2048]----+
|o.o o o |
|=+ + + . |
|B+o . . o |
|=+=. o . + . |
|.=+.o o S . o |
|= * . . . + E |
|.+. o . . . + |
| .o . ..o.. . |
| ...oo..oo |
+----[SHA256]-----+

Type below to create an authorized_keys file:
ssh-copy-id -i .ssh/id_rsa.pub localhost

Copy ~/.ssh directory with all its contents from cfg1.example.pk to each of your servers like below:
scp -r /home/administrator/.ssh/ cfg2.example.pk:~/
scp -r /home/administrator/.ssh/ cfg3.example.pk:~/

scp -r /home/administrator/.ssh/ shrd1.example.pk:~/
scp -r /home/administrator/.ssh/ shrd2.example.pk:~/
scp -r /home/administrator/.ssh/ shrd3.example.pk:~/

scp -r /home/administrator/.ssh/ qrt1.example.pk:~/
If everything is set up correctly as demonstrated above, you can access any of your servers via ssh without being prompted for a password.
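As a quick check, the sketch below should print each remote hostname without asking for a password; it assumes the hostnames already resolve (the next section covers the hosts file):
for host in cfg2.example.pk cfg3.example.pk shrd1.example.pk shrd2.example.pk shrd3.example.pk qrt1.example.pk; do ssh $host hostname; done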

Update Hosts File

If you are running a DNS server, you can create A records to resolve names against servers' IP addresses. For this guide, we will use the /etc/hosts file to map each server's name against its IP address:

Log in to (cfg1.example.pk), edit /etc/hosts file:
sudo vi /etc/hosts

Add each of your servers' names and IP addresses to the hosts file like below:
# Config Server Replica Set
192.168.10.1 cfg1.example.pk
192.168.10.2 cfg2.example.pk
192.168.10.3 cfg3.example.pk

# Shard Server Replica Set (rs0)
192.168.10.4 shrd1.example.pk
192.168.10.5 shrd2.example.pk
192.168.10.6 shrd3.example.pk

# Mongos (Query Router)
192.168.10.7 qrt1.example.pk

Save and close the file when you are finished.

Repeat the same on each of your remaining servers before proceeding to the next step.

Add MongoDB Repository

You need to add an official repository on each of your servers to install the latest stable release of mongodb like below:

On (cfg1.example.pk), create a file mongodb.repo under /etc/yum.repos.d like below:
sudo vi /etc/yum.repos.d/mongodb.repo

[mongodb-org-4.2]
name=MongoDB
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.2/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.2.asc

Save and close file when you are finished.

Repeat the same on each of your remaining servers before proceeding to next step.

Install MongoDB

On (cfg1.example.pk) install mongodb latest stable release like below:
sudo yum update
sudo yum -y install mongodb-org

For CentOS/RHEL 8, you can install mongodb using the dnf package manager like below:

sudo dnf update
sudo dnf -y install mongodb-org

When the installation completes on cfg1.example.pk, repeat the same on each of your remaining servers before proceeding to the next step.

Configure MongoDB

On (cfg1.example.pk) start mongod service and make it persistent on reboot with below command:
sudo systemctl start mongod
sudo systemctl enable mongod

Confirm that mongod service is active and running with below command:
sudo systemctl status mongod

You can see in the below output that mongod is active and running.
 mongod.service - MongoDB Database Server
Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2020-03-25 21:20:40 PKT; 4s ago
Docs: https://docs.mongodb.org/manual
Process: 11330 ExecStart=/usr/bin/mongod $OPTIONS (code=exited, status=0/SUCCESS)
Process: 11328 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)
Process: 11326 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)
Process: 11325 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)
Main PID: 11333 (mongod)
CGroup: /system.slice/mongod.service
└─11333 /usr/bin/mongod -f /etc/mongod.conf


Make sure you repeat the same on each of your remaining servers except (qrt1) before proceeding to next step.

Create an Administrative User

To administer and manage the mongodb sharded cluster, we need to create an administrative user with root privileges.

On (cfg1.example.pk), type below command to access mongo shell:
mongo

You will see mongo shell prompt like below:
MongoDB shell version v4.2.3
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("8abb0c14-e2cf-4947-855b-d339279b52c9") }
MongoDB server version: 4.2.3
Welcome to the MongoDB shell.
>

On mongo shell, type below to switch to the default admin database:
use admin

Type below on mongo shell to create a user called "administrator", make sure you replace “password” with a strong password of your choice:
db.createUser({user: "administrator", pwd: "password", roles:[{role: "root", db: "admin"}]})

This will return similar to the following output:
Successfully added user: {
"user" : "administrator",
"roles" : [
{
"role" : "root",
"db" : "admin"
}
]
}

Type below to exit from mongo shell:
quit()

Repeat the same user creation step on (shrd1.example.pk) before proceeding to next step.

Set Up MongoDB Authentication

We will generate a key file that will be used to secure authentication between the members of replica set. While in this guide we’ll be using a key file generated with openssl, MongoDB recommends using an X.509 certificate to secure connections between production systems.

On (cfg1.example.pk), type below command to generate a key file and set appropriate permission:
openssl rand -base64 756 > ~/mongodb_key.pem
sudo cp ~/mongodb_key.pem /var/lib/mongo/
sudo chown -R mongod:mongod /var/lib/mongo/mongodb_key.pem
sudo chmod -R 400 /var/lib/mongo/mongodb_key.pem

As you can see, we have stored the key file under /var/lib/mongo and set the appropriate permissions for the mongod user. Make sure you copy the mongodb_key.pem file to each of your servers under /var/lib/mongo/ with identical ownership and permissions, as sketched below.
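One way to do this, assuming the copy generated above still exists in your home directory on cfg1.example.pk, is to push it to each server and then repeat the same cp/chown/chmod commands there:
scp ~/mongodb_key.pem cfg2.example.pk:~/
scp ~/mongodb_key.pem cfg3.example.pk:~/
scp ~/mongodb_key.pem shrd1.example.pk:~/
scp ~/mongodb_key.pem shrd2.example.pk:~/
scp ~/mongodb_key.pem shrd3.example.pk:~/
scp ~/mongodb_key.pem qrt1.example.pk:~/

Then, on each server:
sudo cp ~/mongodb_key.pem /var/lib/mongo/
sudo chown mongod:mongod /var/lib/mongo/mongodb_key.pem
sudo chmod 400 /var/lib/mongo/mongodb_key.pem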

Set Up Config Servers Replica Set

We will make a few changes in the mongod.conf file on the (cfg1, cfg2, cfg3) servers:

Log in to your (cfg1.example.pk), edit mongod.conf like below:
sudo vi /etc/mongod.conf

Add or update the following values, making sure you set the port value to 27019 and the bindIp value to your server's name:
net:
  port: 27019
  bindIp: cfg1.example.pk

security:
  keyFile: /var/lib/mongo/mongodb_key.pem

replication:
  replSetName: configReplSet

sharding:
  clusterRole: configsvr

Save and close when you are finished.

Restart the mongod service to apply the changes:
sudo systemctl restart mongod
sudo systemctl status mongod

Make sure you repeat the same on each of your remaining config servers (cfg2.example.pk, cfg3.example.pk), before proceeding to next step.

Initiate the Config Replica Set

Log in to your (cfg1.example.pk), connect to the MongoDB shell over port 27019 with the administrator user like below:
mongo --host cfg1.example.pk --port 27019 -u administrator --authenticationDatabase admin

This will prompt you for password:
MongoDB shell version v4.2.3
Enter password:

From the mongo shell, initiate the config server's replica set like below:
rs.initiate({ _id: "configReplSet", configsvr: true, members: [{ _id : 0, host : "cfg1.example.pk:27019"},{ _id : 1, host : "cfg2.example.pk:27019"},{ _id : 2, host : "cfg3.example.pk:27019"}]})

You will see a message like below indicating that operation succeeded:
{
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1584600261, 1),
"electionId" : ObjectId("000000000000000000000000")
},
"lastCommittedOpTime" : Timestamp(0, 0)
}
configReplSet:SECONDARY>


Notice that the MongoDB shell prompt has also changed to configReplSet:SECONDARY> or configReplSet:PRIMARY>.

To make sure that each config server has been added to the replica set, type below on mongo shell:
rs.config()

If the replica set has been configured properly, you’ll see output similar to the following:
{
"_id" : "configReplSet",
"version" : 1,
"configsvr" : true,
"protocolVersion" : NumberLong(1),
"writeConcernMajorityJournalDefault" : true,
"members" : [
{
"_id" : 0,
"host" : "cfg1.example.pk:27019",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {

},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 1,
"host" : "cfg2.example.pk:27019",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {

},
"slaveDelay" : NumberLong(0),
"votes" : 1
},
{
"_id" : 2,
"host" : "cfg3.example.pk:27019",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {

},
"slaveDelay" : NumberLong(0),
"votes" : 1
}
],
"settings" : {
"chainingAllowed" : true,
"heartbeatIntervalMillis" : 2000,
"heartbeatTimeoutSecs" : 10,
"electionTimeoutMillis" : 10000,
"catchUpTimeoutMillis" : -1,
"catchUpTakeoverDelayMillis" : 30000,
"getLastErrorModes" : {

},
"getLastErrorDefaults" : {
"w" : 1,
"wtimeout" : 0
},
"replicaSetId" : ObjectId("5e7314c4ba14c5d2412a1949")
}
}


For full replica set status information, type below:
rs.status()

You’ll see output similar to the following:
{
"set" : "configReplSet",
"date" : ISODate("2020-03-19T06:47:14.266Z"),
"myState" : 1,
"term" : NumberLong(1),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"configsvr" : true,
"heartbeatIntervalMillis" : NumberLong(2000),
"majorityVoteCount" : 2,
"writeMajorityCount" : 2,
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"lastCommittedWallTime" : ISODate("2020-03-19T06:47:13.490Z"),
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"readConcernMajorityWallTime" : ISODate("2020-03-19T06:47:13.490Z"),
"appliedOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"lastAppliedWallTime" : ISODate("2020-03-19T06:47:13.490Z"),
"lastDurableWallTime" : ISODate("2020-03-19T06:47:13.490Z")
},
"lastStableRecoveryTimestamp" : Timestamp(1584600391, 1),
"lastStableCheckpointTimestamp" : Timestamp(1584600391, 1),
"electionCandidateMetrics" : {
"lastElectionReason" : "electionTimeout",
"lastElectionDate" : ISODate("2020-03-19T06:44:32.291Z"),
"electionTerm" : NumberLong(1),
"lastCommittedOpTimeAtElection" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"lastSeenOpTimeAtElection" : {
"ts" : Timestamp(1584600261, 1),
"t" : NumberLong(-1)
},
"numVotesNeeded" : 2,
"priorityAtElection" : 1,
"electionTimeoutMillis" : NumberLong(10000),
"numCatchUpOps" : NumberLong(0),
"newTermStartDate" : ISODate("2020-03-19T06:44:33.110Z"),
"wMajorityWriteAvailabilityDate" : ISODate("2020-03-19T06:44:34.008Z")
},
"members" : [
{
"_id" : 0,
"name" : "cfg1.example.pk:27019",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 1013,
"optime" : {
"ts" : Timestamp(1584600433, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-19T06:47:13Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1584600272, 1),
"electionDate" : ISODate("2020-03-19T06:44:32Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "cfg2.example.pk:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 173,
"optime" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-19T06:47:01Z"),
"optimeDurableDate" : ISODate("2020-03-19T06:47:01Z"),
"lastHeartbeat" : ISODate("2020-03-19T06:47:12.377Z"),
"lastHeartbeatRecv" : ISODate("2020-03-19T06:47:14.013Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "cfg1.example.pk:27019",
"syncSourceHost" : "cfg1.example.pk:27019",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "cfg3.example.pk:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 173,
"optime" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1584600421, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2020-03-19T06:47:01Z"),
"optimeDurableDate" : ISODate("2020-03-19T06:47:01Z"),
"lastHeartbeat" : ISODate("2020-03-19T06:47:12.377Z"),
"lastHeartbeatRecv" : ISODate("2020-03-19T06:47:14.020Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "cfg1.example.pk:27019",
"syncSourceHost" : "cfg1.example.pk:27019",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
}
],
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1584600261, 1),
"electionId" : ObjectId("7fffffff0000000000000001")
},
"lastCommittedOpTime" : Timestamp(1584600433, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1584600433, 1),
"signature" : {
"hash" : BinData(0,"5S/ou6ONJAUK+J4roPWAKmOf2nk="),
"keyId" : NumberLong("6805806349767671838")
}
},
"operationTime" : Timestamp(1584600433, 1)
}


Now exit from mongo shell with below command:
quit()


Create the Shard Replica Set (rs0)

We will configure shard replica set (rs0) on (shrd1, shrd2, shrd3) servers.

Log in to (shrd1.example.pk), edit /etc/mongod.conf file like below:
sudo vi /etc/mongod.conf

Add or update the following values, making sure you set the port value to 27018 and the bindIp value to your server's name:
net:
  port: 27018
  bindIp: shrd1.example.pk

security:
  keyFile: /var/lib/mongo/mongodb_key.pem

replication:
  replSetName: rs0

sharding:
  clusterRole: shardsvr

Save and close when you are finished.

Restart the mongod service to apply the changes:
sudo systemctl restart mongod
sudo systemctl status mongod

Make sure you repeat the same on the (shrd2, shrd3) servers before proceeding to the next step.

Initiate the shard replica set (rs0)

Log in to (shrd1.example.pk), connect to mongo shell on port 27018 with administrator user like below:
mongo --host shrd1.example.pk --port 27018 -u administrator --authenticationDatabase admin

Type below on mongo shell to initiate shard replica set (rs0):
rs.initiate({ _id : "rs0", members:[{ _id : 0, host : "shrd1.example.pk:27018" },{ _id : 1, host : "shrd2.example.pk:27018" },{ _id : 2, host : "shrd3.example.pk:27018" }]})

This will return { "ok" : 1 }, indicating that the shard replica set rs0 was initiated successfully.
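Optionally, from the same shell, you can confirm that shrd1 has become PRIMARY and the other two members are SECONDARY:
rs.status()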


Now exit from the mongo shell with below command:
quit()

Configure Mongos (Query Router)

We'll create a mongos service that needs to obtain data locks, so be sure mongod is stopped before proceeding:

Log in to (qrt1.example.pk), and deactivate mongod service with below command:
sudo systemctl stop mongod
sudo systemctl disable mongod

Confirm that mongod service is stopped with below command:
sudo systemctl status mongod

The output confirms that mongod is stopped:
mongod.service - MongoDB Database Server
Loaded: loaded (/lib/systemd/system/mongod.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: https://docs.mongodb.org/manual

Mar 25 23:48:38 qrt1.example.pk systemd[1]: Stopped MongoDB Database Server.

On (qrt1.example.pk), create a mongos.conf file like below:
sudo vi /etc/mongos.conf

Add the following configuration directives:
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongos.log

net:
  port: 27017
  bindIp: qrt1.example.pk

security:
  keyFile: /var/lib/mongo/mongodb_key.pem

sharding:
  configDB: configReplSet/cfg1.example.pk:27019,cfg2.example.pk:27019,cfg3.example.pk:27019

Save and close file.

Next, create a systemd service unit file for mongos like below:
sudo vi /usr/lib/systemd/system/mongos.service

Add the following parameters:
[Unit]
Description=Mongo Cluster Router
After=network.target

[Service]
User=mongod
Group=mongod

ExecStart=/usr/bin/mongos --config /etc/mongos.conf

LimitFSIZE=infinity
LimitCPU=infinity
LimitAS=infinity

LimitNOFILE=64000
LimitNPROC=64000

TasksMax=infinity
TasksAccounting=false

[Install]
WantedBy=multi-user.target

Save and close.

Start and enable the mongos service with the below commands to activate the query router:
sudo systemctl start mongos
sudo systemctl enable mongos

Confirm that mongos is active and running with below command:
sudo systemctl status mongos

You will see mongos status like below.
mongos.service - Mongo Cluster Router
Loaded: loaded (/usr/lib/systemd/system/mongos.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2020-03-25 23:52:25 PKT; 33s ago
Main PID: 26985 (mongos)
CGroup: /system.slice/mongos.service
└─26985 /usr/bin/mongos --config /etc/mongos.conf

Mar 25 23:52:25 qrt1.example.pk systemd[1]: Started Mongo Cluster Router.

Add Shards to the Cluster

On (qrt1.example.pk), connect to mongo shell on port 27017 with administrative authentication like below:
mongo --host qrt1.example.pk --port 27017 -u administrator --authenticationDatabase admin

On mongo shell, type below to add shard replica set (rs0) in the cluster:
sh.addShard( "rs0/shrd1.example.pk:27018,shrd2.example.pk:27018,shrd3.example.pk:27018")

You will see output similar to the following, indicating that the shard replica set rs0 was added successfully.


At this stage, your mongodb sharded cluster is active and running.
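You can also run the following from the same mongos shell to list the shards and confirm the cluster membership:
sh.status()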

Enable Sharding

The last step is to enable sharding for a database. This process takes place in stages due to the organization of data in MongoDB. Before you enable sharding, you’ll need to decide on a sharding strategy.

The two most common sharding strategies are:

Range-based sharding divides your data based on specific ranges of values in the shard key.
Hash-based sharding distributes data by using a hash function on your shard key for a more even distribution of data among the shards.

This is not a comprehensive explanation of how to choose a sharding strategy. You may wish to consult MongoDB's official documentation on sharding strategies.

For this guide, we’ve decided to use a hash-based sharding strategy, enabling sharding at the collections level. This allows the documents within a collection to be distributed among your shards.

Log in to (qrt1.example.pk), access the mongos shell:
mongo --host qrt1.example.pk --port 27017 -u administrator --authenticationDatabase admin

From the mongos shell, type below to create a test database called testDB
use testDB

Create a new collection called testCollection and hash its _id key. The _id key is already created by default as a basic index for new documents:
db.testCollection.ensureIndex( { _id : "hashed" } )

Type below to enable sharding for newly created database:
sh.enableSharding("testDB")
sh.shardCollection( "testDB.testCollection", { "_id" : "hashed" } )

This enables sharding across any shards that you added to your cluster.

To verify that the sharding was successfully enabled, type below to switch to the config database:
use config

Run below method:
db.databases.find()

If sharding was enabled properly, this will return useful information with the list of databases you have. Once you enable sharding for a database, MongoDB assigns a primary shard for that database where MongoDB stores all data in that database.

Test MongoDB Sharded Cluster

To ensure your data is being distributed evenly in the database, follow these steps to generate some dummy data in testDB and see how it is divided among the shards.

Connect to the mongos shell on any of your query routers:
mongo --host qrt1.example.pk --port 27017 -u administrator -p --authenticationDatabase admin

Switch to your newly created database (testDB) for example:
use testDB

Type the following code in the mongo shell to generate 10000 simple dummy documents and insert them into testCollection:
for (var i = 1; i <= 10000; i++) db.testCollection.insert( { x : i } )



Run following code to check your dummy data distribution:
db.testCollection.getShardDistribution()

The sections beginning with Shard give information about each shard in your cluster. Since we only added 1 shard rs0 with three members, there is only one section, but if you add more shards to the cluster, they’ll show up here as well.

The Totals section provides information about the collection as a whole, including its distribution among the shards.

When you’re finished, we recommend deleting testDB (since it is no longer needed) with the below commands:
use testDB
db.dropDatabase()

Wrapping up

Now that you have successfully deployed a highly available sharded cluster ready for your production environment, we recommend configuring a firewall to limit ports 27018 and 27019 so they only accept traffic between hosts within your cluster, for example as sketched below.
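As an illustration only (adjust the source addresses, ports, and zone to your environment), a firewalld rich rule on a shard server that allows the query router to reach the shard port could look like this:
# allow the mongos host to reach the shard port
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="192.168.10.7" port port="27018" protocol="tcp" accept'
sudo firewall-cmd --reload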

You'll always connect to databases in the sharded cluster via the query routers.

How To Install Ansible AWX on CentOS/RHEL 7/8

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. This tutorial will walk you through the steps to set up Ansible AWX using Docker on a CentOS/RHEL 7/8.


Prerequisites

To follow along with this guide, you will need one (physical or virtual) machine with CentOS/RHEL 7/8 minimal installed, having a non-root user with sudo privileges.

Configure SELinux

By default, SELinux is enforcing in CentOS/RHEL 7/8. It is recommended to change SELINUX=enforcing to SELINUX=disabled to run Ansible AWX under a Docker container:
sudo nano /etc/sysconfig/selinux
Find the following line and replace the value "enforcing" with "disabled":
SELINUX=disabled
Save and close the file.
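The change in this file only takes effect after a reboot; to stop SELinux enforcement immediately for the current session as well, you can additionally run:
sudo setenforce 0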

Install EPEL Repository

You will need to install the extra packages for enterprise Linux (EPEL) repository on your CentOS/RHEL 7/8:

Type below command if you are running CentOS/RHEL 8:
sudo dnf -y install epel-release
Type below command if you are running CentOS/RHEL 7:
sudo yum -y install epel-release
Next, install following additional dependencies on your CentOS/RHEL using the below command:

Type below on CentOS/RHEL 8:
sudo dnf -y install git gcc gcc-c++ ansible nodejs gettext device-mapper-persistent-data lvm2 bzip2 python3-pip
Type below on CentOS/RHEL 7:
sudo yum -y install git gcc gcc-c++ ansible nodejs gettext device-mapper-persistent-data lvm2 bzip2 python3-pip nano

Install Docker

In this section, we will install Docker and Docker Compose to run Ansible AWX inside a Docker container:

Type below to add Docker repository on CentOS/RHEL 8:
sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
Type below command to install Docker on CentOS/RHEL 8:
sudo dnf -y install docker-ce
Type below to create Docker repository on CentOS/RHEL 7:
sudo vi /etc/yum.repos.d/docker-ce.repo
Add following contents:
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://download.docker.com/linux/centos/7/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-stable-debuginfo]
name=Docker CE Stable - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/7/debug-$basearch/stable
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-stable-source]
name=Docker CE Stable - Sources
baseurl=https://download.docker.com/linux/centos/7/source/stable
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-edge]
name=Docker CE Edge - $basearch
baseurl=https://download.docker.com/linux/centos/7/$basearch/edge
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-edge-debuginfo]
name=Docker CE Edge - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/7/debug-$basearch/edge
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-edge-source]
name=Docker CE Edge - Sources
baseurl=https://download.docker.com/linux/centos/7/source/edge
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-test]
name=Docker CE Test - $basearch
baseurl=https://download.docker.com/linux/centos/7/$basearch/test
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-test-debuginfo]
name=Docker CE Test - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/7/debug-$basearch/test
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-test-source]
name=Docker CE Test - Sources
baseurl=https://download.docker.com/linux/centos/7/source/test
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-nightly]
name=Docker CE Nightly - $basearch
baseurl=https://download.docker.com/linux/centos/7/$basearch/nightly
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-nightly-debuginfo]
name=Docker CE Nightly - Debuginfo $basearch
baseurl=https://download.docker.com/linux/centos/7/debug-$basearch/nightly
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg

[docker-ce-nightly-source]
name=Docker CE Nightly - Sources
baseurl=https://download.docker.com/linux/centos/7/source/nightly
enabled=0
gpgcheck=1
gpgkey=https://download.docker.com/linux/centos/gpg
Save and close the file.

Type below to install Docker on CentOS/RHEL 7:
sudo yum -y install docker-ce
Next, start the Docker service and make it persistent on system reboot with below command:
sudo systemctl start docker
sudo systemctl enable docker
Confirm that docker is active and running:
sudo systemctl status docker
You will see similar output like below:


Type below to install Docker Compose on CentOS/RHEL 7/8:
pip3 install docker-compose
Once you are finished with the Docker and Docker Compose installation, type the below command to set Python 3 as the default python interpreter:
sudo alternatives --set python /usr/bin/python3

Download Ansible AWX

You can download the latest Ansible AWX release from the GitHub repository using the below command:
cd ~
git clone https://github.com/ansible/awx.git
When the download completes, generate a secret key using openssl; it will be used as the secret_key value in the inventory file:
openssl rand -base64 30
You will see the output similar to the following:

Copy your key and save it for later use in the inventory file.

Install Ansible AWX

You need to edit inventory file like below:
cd ~/awx/installer/
sudo nano inventory
Replace the following values with yours:
[all:vars]
dockerhub_base=ansible
awx_task_hostname=awx
awx_web_hostname=awxweb
postgres_data_dir="/var/lib/pgdocker"
host_port=80
host_port_ssl=443
docker_compose_dir="~/.awx/awxcompose"
pg_username=awx
pg_password=awxpass
pg_database=awx
pg_port=5432
pg_admin_password=password
rabbitmq_password=awxpass
rabbitmq_erlang_cookie=cookiemonster
admin_user=admin
admin_password=password
create_preload_data=True
secret_key=Paste_Your_Secret_Key_Here
awx_official=true
awx_alternate_dns_servers="8.8.8.8,8.8.4.4"
project_data_dir=/var/lib/awx/projects
Save and close the file when you are finished.

Next, create a pgdocker directory under /var/lib/
sudo mkdir /var/lib/pgdocker
Type the following command to install AWX:
sudo ansible-playbook -i inventory install.yml
Upon successful installation, you will see output similar to the following:

This will create and start all the required Docker containers for Ansible AWX.

You can verify the running containers with the following command:
sudo docker ps
You will see output similar to the following:

Add Firewall Rules

You will need to add the following rules to allow the http and https services through firewalld:
sudo firewall-cmd --zone=public --add-masquerade --permanent
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --permanent --add-service=https
sudo firewall-cmd --reload
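You can verify that the services were added with:
sudo firewall-cmd --list-all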

Access AWX Web Interface

Open up your preferred web browser and type http://your-server-ip in the address bar; you will be redirected to the AWX login page like below:


Log in with the admin username and password which you have defined in the inventory file, you should see the AWX default dashboard like below:


Wrapping up

Now that you have Ansible AWX running, you can administer and manage your Ansible projects easily using the AWX web interface.

How To Backup and Restore PostgreSQL Database

If you are using PostgreSQL in a production environment, it is recommended to frequently take backups to ensure that your important data is not lost. With the backups, you will be able to quickly restore if your database is lost or corrupted. The good thing is, PostgreSQL includes tools to make database backup simple and easy to restore.

This tutorial will show you how to backup and restore your PostgreSQL database.


Prerequisites

This tutorial assumes that you have a working installation of PostgreSQL on your system. The steps in this guide require root privileges so be sure to perform these steps as root or with the sudo prefix.

Backup Single Database

PostgreSQL comes with the built-in pg_dump utility that simplifies backing up a single database. Log in to your database server as a user that has read permissions on the database you intend to back up.

Create a backup directory and dump the contents of a database to a file in that directory by running the following command.

mkdir -p ~/backup_db
pg_dump exampleDB > ~/backup_db/exampleDB.bak

In the above example, we created a directory (backup_db) to store the (exampleDB.bak) locally. However, it is recommended to store your critical backup remotely over the network in a safe and secure place for later use.

There are several options for the backup format:

*.bak: compressed binary format
*.sql: plaintext dump
*.tar: tarball
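The extension alone does not change the dump format. As a minimal sketch (assuming the same exampleDB and backup directory as above, and the .dump name being just an example), you can produce a compressed custom-format dump and later restore it into an existing database with pg_restore instead of psql:
pg_dump -Fc exampleDB > ~/backup_db/exampleDB.dump
pg_restore -d exampleDB ~/backup_db/exampleDB.dump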

Restore Single Database

The following section demonstrates restoring the lost data from the backup. For this guide, we will delete exampleDB, create an empty database called testDB, then restore database contents from the backup file:
dropdb exampleDB
createdb testDB
Restore the database using psql:
psql testDB < ~/backup_db/exampleDB.bak

Backup Database Remotely

The pg_dump client utility can also connect to a remote database server, allowing you to take a backup from a client computer like below:
pg_dump -h database_server_ip -p 5432 exampleDB > ~/backup_db/exampleDB.bak

Clustered Database Backup

The pg_dump utility only creates a backup of one database at a time, and it does not store information about database roles or cluster-wide configuration. To back up such information, including all of your databases simultaneously, you can use pg_dumpall.

The following example will backup all databases at once:
pg_dumpall > /backup_db/all_databases.bak
Restore all databases at once from the backup:
psql -f /backup_db/all_databases.bak postgres

Automating Database Backup

If you wish, you can set up a cron job to automatically back up your database at regular intervals. The steps in this section will set up a cron task that will run pg_dump once every week. Make sure you perform these tasks as the postgres user:
su - postgres
Create a directory to store the automatic backups:
mkdir -p ~/postgres/backup_db
Edit the crontab to create the new cron task:
crontab -e
Add the following line to the end of the crontab:
0 0 * * 0 pg_dump -U postgres exampleDB > ~/postgres/backup_db/exampleDB.bak

Save and close the editor.

This cron job will automatically back up your database at midnight every Sunday.

Wrapping up

PostgreSQL offers more advanced procedures to back up your critical databases. The official PostgreSQL documentation on continuous archiving and point-in-time recovery will help you set this up. This method is quite complex, but in the end it will maintain a constant archive of your critical database and make it possible to recover the state of the database at any point in the past.

How To Backup and Restore MySQL, MariaDB Database

MySQL and MariaDB include a built-in backup utility, mysqldump, that simplifies creating a backup of a database. With mysqldump, you can create a logical backup, provided your database is accessible and running.

This guide will show you how to back up MySQL and MariaDB databases using the mysqldump utility.


Prerequisites

This tutorial assumes that you have a working MySQL or MariaDB installation, including a non-root user with sudo privileges.

Manual Database Backup

Log in to your database server and type the below command to create a backup of the entire Database Management System (DBMS):
sudo mysqldump --all-databases --single-transaction --quick --lock-tables=false > full_backup_$(date +%F).sql -u root -p
You can create a specific database backup with below command:
sudo mysqldump -u dbadmin -p exampleDB --single-transaction --quick --lock-tables=false > exampleDB_backup_$(date +%F).sql
Make sure you replace dbadmin with a user that has access to the database and exampleDB with the name of the database you intend to back up.

You can even create a single table backup from any database:
sudo mysqldump -u dbadmin -p --single-transaction --quick --lock-tables=false exampleDB table_name > exampleDB_table_name_$(date +%F).sql

Automate Database Backup

This section will show you how to schedule a backup task using cronjob to regularly create database backups.

Create a file .mylogin.cnf under your user's home directory to hold the login credentials of the MySQL or MariaDB root user. Note that the system user whose home directory this file is stored in can be unrelated to any MySQL users.
sudo nano /home/your_system_user/.mylogin.cnf
Add the MySQL or MariaDB root credentials, making sure you replace the password value with your database root user's password:
[client]
user = root
password = your_root_password
Save and close the editor.

Restrict the .mylogin.cnf file permissions so only your user can read it:
sudo chmod 600 /home/your_system_user/.mylogin.cnf
Next, create a cron job file to back up the entire database management system every day at 2:00 AM. Note that files under /etc/cron.daily are run as plain scripts without a schedule line, so to use the cron time syntax shown below, create the file under /etc/cron.d instead:
sudo nano /etc/cron.d/mysqldump
Add the following line, making sure you replace your_system_user with yours:
0 2 * * * root /usr/bin/mysqldump --defaults-extra-file=/home/your_system_user/.mylogin.cnf -u root --single-transaction --quick --lock-tables=false --all-databases > /home/your_system_user/full_backup_$(date +\%F).sql
Save and close the editor.

Restore Database Backup

This section demonstrates how to restore an entire database management system (DBMS) backup.

Type below command to restore entire DBMS from the backup:
sudo mysql -u root -p < full_backup.sql
This will prompt you for the MySQL root user’s password:

To restore a single database dump, an empty or existing destination database must already exist to import the data into, and the MySQL user you’re running the command as must have write access to that database:
sudo mysql -u dbadmin -p exampleDB < exampleDB_table_name.sql
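If the destination database does not exist yet, one way to create it first (assuming the root credentials used earlier) is:
sudo mysql -u root -p -e "CREATE DATABASE exampleDB;"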

Wrapping up

These are just a few examples of creating MySQL or MariaDB database backups. If you wish, go through the official MySQL or MariaDB documentation for additional information on this topic.

How To Enable Two-factor Authentication For SSH on CentOS/RHEL 7/8

This tutorial will walk you through the steps to set up two-factor authentication for SSH on a CentOS/RHEL 7/8.

Cyberthreats are increasing with the passage of time, hence securing access to your servers is critical in terms of preventing important information from being compromised.

By default, in Linux, we need a password or key-pair authentication to log in to the server via SSH. However, another option exists to further harden login methods. A time-based one-time password allows you to enable two-factor authentication with single-use passwords that change every 30 seconds.


The time-based generated password will change every 30-60 seconds. This means that if an attacker tries to use brute force, they’ll almost certainly run out of time before new credentials are needed to gain access.

A one-time password will be valid for a single authentication only, thus minimizing the risk of a replay attack. Even if your TOTP is intercepted upon sending it to the server, it will no longer be valid after you’ve logged in.

By implementing two-factor authentication with a regular password, public-key (or even both), you can add an additional layer of security to protect your servers from internal or external threats.

Prerequisites

This guide assumes that you have a working installation of CentOS/RHEL 7 or 8 with SSH installed and running.

You will also need a smartphone or a client device with an authenticator application such as Google Authenticator, Microsoft authenticator or Authy. Several other apps exist, and steps covered in this guide should be compatible with nearly all of them.

When you have these prerequisites in place, please proceed with the following steps.

Install Google Authenticator

We will install the Google Authenticator to set up two-factor authentication. This will generate keys on your server, which will then be paired with an app on a client device (often a smartphone) to generate single-use passwords that expire after a set period of time.

Log in to your server with a non-root user having sudo privileges and enable the Extra Packages for Enterprise Linux (EPEL) repository:

Type below if you are on CentOS/RHEL 7:
sudo yum -y install epel-release
Type below if you are on CentOS/RHEL 8:
sudo dnf -y install epel-release
Next, install the google-authenticator package that you’ll be using to generate keys and passwords.

Type below if you are on CentOS/RHEL 7:
sudo yum -y install google-authenticator
Type below if you are on CentOS/RHEL 8:
sudo dnf -y install google-authenticator
Although we are using the Google Authenticator for two-factor authentication, the keys it generates are compatible with other authentication apps.

Now that the required packages have been installed, we’ll use them to generate keys, so be sure to have your smartphone or client device ready with any of these authenticator apps installed.


We'll use a google authenticator app on a smartphone for this guide. If you haven’t downloaded an authenticator app on your smartphone or a client device, do so before proceeding with the below.

Generate a Key

The following instructions will allow you to specify a user for whom you’d like to generate a password. If you are configuring two-factor authentication for multiple users, follow these steps for each user.

Type below to execute the google-authenticator program on your Linux terminal:
google-authenticator
The following prompt will appear asking you to specify whether you’d like to use time-based authentication (as opposed to one-time or counter-based). Choose “yes” by entering y at the prompt.
Do you want authentication tokens to be time-based (y/n) y
You should see a QR code in your terminal like below:


Open the authenticator app on your smartphone and scan your QR code, this automatically adds your system's user account and generates verification code every 30 seconds.


You’ll also see a “secret key” below the QR code. You can also enter this secret key into the smartphone authenticator app manually, instead of scanning the QR code, to add your account.


You’ll be prompted to answer the following questions:
Do you want me to update your "/home/administrator/.google_authenticator" file (y/n) y
This specifies whether the authentication settings will be set for this user. Answer y to create the file that stores these settings.
Do you want to disallow multiple uses of the same authentication
token? This restricts you to one login about every 30s, but it increases
your chances to notice or even prevent man-in-the-middle attacks (y/n)
y
This makes your token a true one-time password, preventing the same password from being used twice. For example, if you set this to “no,” and your password was intercepted while you logged in, someone may be able to gain entry to your server by entering it before the time expires. We strongly recommend answering y.
By default, a new token is generated every 30 seconds by the mobile app.
In order to compensate for possible time-skew between the client and the server,
we allow an extra token before and after the current time. This allows for a
time skew of up to 30 seconds between authentication server and client. If you
experience problems with poor time synchronization, you can increase the window
from its default size of 3 permitted codes (one previous code, the current
code, the next code) to 17 permitted codes (the 8 previous codes, the current
code, and the 8 next codes). This will permit for a time skew of up to 4 minutes
between client and server.
Do you want to do so (y/n)
y
This setting accounts for time syncing issues across devices. If you believe that your phone or device may not sync properly, answer y.
If the computer that you are logging into isn't hardened against brute-force
login attempts, you can enable rate-limiting for the authentication module.
By default, this limits attackers to no more than 3 login attempts every 30s.
Do you want to enable rate-limiting (y/n)
y
This setting prevents attackers from using brute force to guess your token. Although the time limit should be enough to prevent most attacks, this will ensure that an attacker only has three chances per 30 seconds to guess your password. We recommend answering y.

You have finished generating your key and adding it to your client, but some additional configuration is needed before these settings will go into effect. Carefully read the following section in this guide for instructions on how to require two-factor authentication for all SSH login attempts.

Configure Authentication Settings

You must go through this section carefully to avoid getting locked out of your server. We recommend opening another terminal session for reverting back the settings if you misconfigure anything.

Edit /etc/pam.d/sshd with any of your preferred editor:
sudo nano /etc/pam.d/sshd
Add the following lines to the end of the file:
auth    required      pam_unix.so     try_first_pass
auth required pam_google_authenticator.so
Save and close the editor.

Next, edit /etc/ssh/sshd_config file:
sudo nano /etc/ssh/sshd_config
Comment out the following line by adding # at the beginning like below:
# ChallengeResponseAuthentication no
Uncomment the following line by removing #
ChallengeResponseAuthentication yes
then, add these lines to the end of the file
Match User administrator
AuthenticationMethods keyboard-interactive
Save and close the editor.

Make sure you replace administrator with your system user for which you’d like to enable two-factor authentication.

If you want to enforce two-factor authentication for all users instead of a single user, you can use the AuthenticationMethods directive by itself, outside of a Match User block. However, this should not be done until two-factor credentials have been provided to all users.
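Before restarting, you can optionally check the sshd configuration for syntax errors, which reduces the risk of locking yourself out:
sudo sshd -t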

Restart the SSH to apply these changes:
sudo systemctl restart sshd
Two-factor authentication is now enabled. When you connect to your server via SSH, the authentication process will take place.

Test Two-factor Authentication

You can test your two-factor configuration by connecting to your server via SSH. You will be prompted to enter your standard user account’s password and then, you will be prompted to enter a Verification Code as shown in the image below.


Open the authenticator app on your smartphone, and enter the verification code that is displayed.


You should authenticate successfully and gain access to your server.



If your SSH client disconnects before you can enter your two-factor token, check if PAM is enabled for SSH. You can do this by editing /etc/ssh/sshd_config: look for UsePAM and set it to yes, then restart the SSH to apply changes.

Combine Two-factor and Public Key Authentication

If you’d like to use public-key authentication instead of a password alongside TOTP, follow these steps to set it up.

We assume that you have already set up SSH key-pair authentication.

Edit /etc/ssh/sshd_config to include public key:
sudo nano /etc/ssh/sshd_config
Set PasswordAuthentication to no and modify the AuthenticationMethods line like below:
PasswordAuthentication no

Match User administrator
    AuthenticationMethods publickey,keyboard-interactive
Save and close the editor.

Configure this setting in the AuthenticationMethods directive for each user as appropriate. When any of these users log in, they will need to provide their SSH key and they will be authenticated via TOTP, as well.

Restart your SSH to apply these changes.
sudo systemctl restart sshd
Next, you’ll need to make changes to your PAM configuration.
sudo nano /etc/pam.d/sshd
Comment out the following lines by inserting # at the beginning like below:
# auth       substack     password-auth
# auth required pam_unix.so no_warn try_first_pass
You should now be able to log in using your SSH key as the first method of authentication and your verification code as the second. To test your configuration, log out and try to log in again via SSH. You should be asked for your 6-digit verification code only since the key authentication will not produce a prompt.

Wrapping up

You may wish to consult additional resources for more information on this topic.

How To Enable Two-factor Authentication For SSH on Ubuntu 18/19/20

This tutorial will walk you through the steps to set up two-factor authentication for SSH on Ubuntu or Debian. 

Cyberthreats are increasing with the passage of time, hence securing access to your servers is critical in terms of preventing important information from being compromised.

By default, in Linux, we need a password or key-pair authentication to log in to the server via SSH. However, another option exists to further harden login methods. A time-based one-time password allows you to enable two-factor authentication with single-use passwords that change every 30 seconds.


The time-based generated password will change every 30-60 seconds. This means that if an attacker tries to use brute force, they’ll almost certainly run out of time before new credentials are needed to gain access.

A one-time password will be valid for a single authentication only, thus minimizing the risk of a replay attack. Even if your TOTP is intercepted upon sending it to the server, it will no longer be valid after you’ve logged in.

By implementing two-factor authentication with a regular password, public-key (or even both), you can add an additional layer of security to protect your servers from internal or external threats.

Prerequisites

This guide assumes that you have a working installation of Ubuntu 18.04, 19.04/19.10, or 20.04, or Debian 9/10, with SSH installed and running.

You will also need a smartphone or a client device with an authenticator application such as Google Authenticator, Microsoft authenticator or Authy. Several other apps exist, and steps covered in this guide should be compatible with nearly all of them.

When you have these prerequisites in place, please proceed with the following steps.

Install Google Authenticator

We will install the Google Authenticator to set up two-factor authentication. This will generate keys on your server, which will then be paired with an app on a client device (often a smartphone) to generate single-use passwords that expire after a set period of time.

Log in to your Ubuntu or Debian server as a non-root user with sudo privileges and install the Google Authenticator PAM package like below:
sudo apt -y install libpam-google-authenticator
Now that the required packages have been installed, we’ll use them to generate keys, so be sure to have your smartphone or client device ready with any of these authenticator apps installed.


We'll use the Google Authenticator app on a smartphone for this guide. If you haven't installed an authenticator app on your smartphone or client device yet, do so before proceeding.

Generate a Key

The following instructions will allow you to specify a user for whom you’d like to generate a password. If you are configuring two-factor authentication for multiple users, follow these steps for each user.

Run the google-authenticator program in your terminal:
google-authenticator
The following prompt will appear asking whether you'd like to use time-based authentication (as opposed to counter-based). Choose yes by entering y at the prompt.
Do you want authentication tokens to be time-based (y/n) y
You should see a QR code in your terminal like below:


Open the authenticator app on your smartphone and scan the QR code. This automatically adds your system user account and generates a new verification code every 30 seconds.


You'll also see a "secret key" below the QR code. Instead of scanning the QR code, you can enter this secret key into the authenticator app manually to add your account.


You’ll be prompted to answer the following questions:
Do you want me to update your "/home/administrator/.google_authenticator" file (y/n) y
This specifies whether the authentication settings will be set for this user. Answer y to create the file that stores these settings.
Do you want to disallow multiple uses of the same authentication
token? This restricts you to one login about every 30s, but it increases
your chances to notice or even prevent man-in-the-middle attacks (y/n)
y
This makes your token a true one-time password, preventing the same password from being used twice. For example, if you set this to “no,” and your password was intercepted while you logged in, someone may be able to gain entry to your server by entering it before the time expires. We strongly recommend answering y.
By default, a new token is generated every 30 seconds by the mobile app.
In order to compensate for possible time-skew between the client and the server,
we allow an extra token before and after the current time. This allows for a
time skew of up to 30 seconds between authentication server and client. If you
experience problems with poor time synchronization, you can increase the window
from its default size of 3 permitted codes (one previous code, the current
code, the next code) to 17 permitted codes (the 8 previous codes, the current
code, and the 8 next codes). This will permit for a time skew of up to 4 minutes
between client and server.
Do you want to do so (y/n)
y
This setting accounts for time syncing issues across devices. If you believe that your phone or device may not sync properly, answer y.
If the computer that you are logging into isn't hardened against brute-force
login attempts, you can enable rate-limiting for the authentication module.
By default, this limits attackers to no more than 3 login attempts every 30s.
Do you want to enable rate-limiting (y/n)
y
This setting prevents attackers from using brute force to guess your token. Although the time limit should be enough to prevent most attacks, this will ensure that an attacker only has three chances per 30 seconds to guess your password. We recommend answering y.
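If you are setting this up for several users or from a script, the same answers can usually be supplied as command-line flags instead of answering the prompts interactively; a sketch, assuming your packaged version of google-authenticator supports these options (check google-authenticator --help first):
google-authenticator -t -d -f -r 3 -R 30 -w 3
Here -t selects time-based tokens, -d disallows token reuse, -f writes the ~/.google_authenticator file without asking, -r 3 -R 30 enables rate-limiting to 3 attempts per 30 seconds, and -w 3 keeps the default window of 3 permitted codes.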

You have finished generating your key and adding it to your client, but some additional configuration is needed before these settings will go into effect. Carefully read the following section in this guide for instructions on how to require two-factor authentication for all SSH login attempts.

Configure Authentication Settings

You must go through this section carefully to avoid getting locked out of your server. We recommend keeping a second terminal session open so you can revert the settings if you misconfigure anything.

Edit /etc/pam.d/sshd with any of your preferred editor:
sudo nano /etc/pam.d/sshd
Add the following lines to the end of the file:
auth    required      pam_unix.so     try_first_pass
auth required pam_google_authenticator.so
Save and close the editor.

Next, edit /etc/ssh/sshd_config file:
sudo nano /etc/ssh/sshd_config
Find the ChallengeResponseAuthentication directive and make sure it is set to yes. If the file contains ChallengeResponseAuthentication no, change it to:
ChallengeResponseAuthentication yes
Then, add these lines to the end of the file:
Match User administrator
AuthenticationMethods keyboard-interactive
Save and close the editor.

Make sure you replace administrator with the system user for whom you'd like to enable two-factor authentication.

If you want to enforce two-factor authentication for all users instead of a single user, you can use the AuthenticationMethods directive by itself, outside of a Match User block. However, this should not be done until two-factor credentials have been provided to all users.
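Before restarting, it's worth validating the sshd configuration syntax so a typo doesn't lock you out:
sudo sshd -t
If the command prints nothing, the configuration is syntactically valid.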

Restart the SSH service to apply these changes:
sudo systemctl restart sshd
Two-factor authentication is now enabled. When you connect to your server via SSH, the authentication process will take place.

Test Two-factor Authentication

You can test your two-factor configuration by connecting to your server via SSH. You will be prompted to enter your standard user account’s password and then, you will be prompted to enter a Verification Code as shown in the image below.


Open the authenticator app on your smartphone, and enter the verification code that is displayed.


You should authenticate successfully and gain access to your server.



If your SSH client disconnects before you can enter your two-factor token, check whether PAM is enabled for SSH. You can do this by editing /etc/ssh/sshd_config: look for UsePAM and set it to yes, then restart the SSH service to apply the changes.
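For example, you can quickly confirm the current setting and restart the service like below:
grep -i '^UsePAM' /etc/ssh/sshd_config
sudo systemctl restart sshd
If the first command shows UsePAM no (or nothing at all), edit the file to set UsePAM yes before restarting.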

Combine Two-factor and Public Key Authentication

If you’d like to use public-key authentication instead of a password alongside TOTP, follow these steps to set it up.

We assume that you have already set up an SSH key pair.

Edit /etc/ssh/sshd_config to require public-key authentication:
sudo nano /etc/ssh/sshd_config
Set PasswordAuthentication to no and modify the AuthenticationMethods line like below:
PasswordAuthentication no

Match User administrator
AuthenticationMethods publickey,keyboard-interactive
Save and close the editor.

Configure the AuthenticationMethods directive for each user as appropriate. When any of these users logs in, they will need to provide their SSH key and will also be authenticated via TOTP.

Restart the SSH service to apply these changes:
sudo systemctl restart sshd
Next, you’ll need to make changes to your PAM configuration.
sudo nano /etc/pam.d/sshd
Comment out the following lines by inserting # at the beginning like below:
# auth       substack     password-auth
# auth required pam_unix.so try_first_pass
You should now be able to log in using your SSH key as the first method of authentication and your verification code as the second. To test your configuration, log out and try to log in again via SSH. You should be asked only for your 6-digit verification code, since key authentication does not produce a password prompt.
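To see both factors in action, you can connect with verbose output; administrator and your_server_ip below are placeholders for your own user and server:
ssh -v administrator@your_server_ip
In the debug output you should see publickey,keyboard-interactive listed among the available authentication methods, followed by the verification code prompt.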

Wrapping up

Now that you have enabled two-factor authentication for SSH on your Ubuntu or Debian server, you may wish to consult the Linux PAM documentation for additional information on this topic.

How To Install Node.JS on CentOS/RHEL 8

This tutorial will show you how to install Node.js using three different options on your CentOS/RHEL 8. You can adopt any of the following methods to install Node.js on your system.


Prerequisites

To follow this guide, you will need one (physical or virtual) machine running CentOS/RHEL 8 with a non-root user that has sudo privileges.

If you are running RHEL 8, you will need to add the Extra Packages for Enterprise Linux (EPEL) repository before installing Node.js with the dnf package manager.
sudo dnf -y install epel-release 

Install Node.JS using DNF

There are multiple nodejs versions available, and you can choose between them by enabling the appropriate module stream on your system.

First, check the available streams for the nodejs module using the below command:
sudo dnf module list nodejs
You should see below output:
Name        Stream      Profiles                                    Summary
nodejs      10 [d]      common [d], development, minimal, s2i      Javascript runtime
nodejs      12          common, development, minimal, s2i          Javascript runtime
As you can see, two streams are available, 10 and 12. The [d] indicates that version 10 is the default stream.

For instance, if you'd like to install Node.js 12, switch the module stream to nodejs:12 with the below command:
sudo dnf -y module enable nodejs:12
Then install Node.js with the below command:
sudo dnf -y install nodejs
Check that the install was successful by querying node for its version number:
node --version
You will see output similar to the below:
v12.13.1
When you install Node.js using the dnf package manager, the npm (Node Package Manager) utility is automatically installed as a dependency.

You can verify whether npm was installed with below command:
npm --version
This will print the npm version like below:
6.12.1
At this stage, you have successfully installed Node.js and npm using the dnf package manager on your CentOS/RHEL 8.
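As a quick smoke test, you can run a one-line script directly from the shell:
node -e "console.log('Node.js ' + process.version + ' is working')"
This should print the installed version followed by "is working".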

Install NodeJS using nvm

This section will show you how to install Node.js using Node Version Manager (nvm) on your CentOS/RHEL 8. nvm allows you to install and maintain many different independent versions of Node.js, and their associated Node packages, at the same time.

To install NVM on your CentOS/RHEL 8 machine, visit the project’s GitHub page.



Copy the curl command that displays on the main page. This will get you the most recent version of the installation script.
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.3/install.sh | bash
The script clones the nvm repository to ~/.nvm, and attempts to add the source lines to the correct profile file (~/.bash_profile, ~/.zshrc, ~/.profile, or ~/.bashrc).

To use it, you must first source your correct profile file like below:
source ~/.bash_profile
Now, you can check which versions of nodejs are available:
nvm list-remote
You should see output similar to the following:
v12.13.0   (LTS: Erbium)
v12.13.1 (LTS: Erbium)
v12.14.0 (LTS: Erbium)
v12.14.1 (LTS: Erbium)
v12.15.0 (LTS: Erbium)
v12.16.0 (LTS: Erbium)
v12.16.1 (Latest LTS: Erbium)
v13.0.0
v13.0.1
v13.1.0
v13.2.0
v13.3.0
v13.4.0
v13.5.0
v13.6.0
v13.7.0
v13.8.0
v13.9.0
v13.10.0
v13.10.1
v13.11.0
v13.12.0
You can install a version of Nodejs by typing any of the released versions you see. For this guide, we will install version v13.6.0 using the below command:
nvm install v13.6.0
You can see the different versions you have installed by typing:
nvm list
This will return you similar to the following output:
->      v13.6.0
default -> v13.6.0
node -> stable (-> v13.6.0) (default)
stable -> 13.6 (-> v13.6.0) (default)
The output shows the currently active version on the first line, followed by some named aliases and the versions that those aliases point to.

Note: if you also have a version of Nodejs installed through the dnf package manager, you may see a system -> v12.13.1 (or some other version number) line here. You can always activate the system version of Nodejs using nvm use system.

Additionally, you’ll see aliases for the various long-term support (or LTS) releases of Nodejs:
lts/* -> lts/erbium (-> N/A)
lts/argon -> v4.9.1 (-> N/A)
lts/boron -> v6.17.1 (-> N/A)
lts/carbon -> v8.17.0 (-> N/A)
lts/dubnium -> v10.19.0 (-> N/A)
lts/erbium -> v12.16.1 (-> N/A)
You can install a release based on these aliases as well. For example, install the latest long-term support version, erbium, with below command:
nvm install lts/erbium
You will see output similar to the following:
Downloading and installing node v12.16.1
Now using node v12.16.1 (npm v6.13.4)
You can switch between installed versions using nvm use followed by the version:
nvm use v13.6.0
This will return output similar to the below:
Now using node v13.6.0 (npm v6.13.4)
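Note that nvm use only affects the current shell session. If you want a particular installed version to become the default for new sessions, set the default alias, for example:
nvm alias default v13.6.0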

Install Node.JS using Source Code

This section will show you how to download, compile and install Node.js using source code.

Open up your preferred web browser, navigate to the official Node.js download page, right-click on the Source Code and Copy Link Address.


Return to your terminal session, download and extract the source code using curl into the current user's home directory, and install the build dependencies:
cd ~
curl https://nodejs.org/dist/v12.16.1/node-v12.16.1.tar.gz | tar xz
cd node-v*
sudo dnf -y install gcc-c++ make python2
Now, you can configure and compile the nodejs source code:
./configure
make -j4
When the compilation is finished, you can install nodejs on your system with below command:
sudo make install
Type below to verify that the installation was successful:
node --version
If the command returns the correct version number, the installation completed successfully.

Wrapping up

In this guide you learned how to install Node.js using the three different options on CentOS/RHEL 8.

How To Create a Bootable Ubuntu USB from Windows 10

Creating a bootable Ubuntu USB from Windows is a simple and easy process. With a bootable Ubuntu USB stick, you can try out the Ubuntu desktop experience without installing it on your personal computer. You can, of course, also install or upgrade Ubuntu on your system using a bootable USB stick, or boot into Ubuntu to repair or fix a broken configuration.


Prerequisites

To make a bootable Ubuntu USB, you will need a USB stick/flash drive with at least 4GB of storage.
Download Rufus, a utility that helps format and create bootable USB flash drives such as USB keys/pen drives, memory sticks, etc.
You will also need an Ubuntu ISO image file, which can be obtained from Ubuntu's official website.
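Optionally, before writing the ISO you can verify its integrity against the SHA256 checksums published on Ubuntu's website. In a Windows Command Prompt (the file path below is just an example; adjust it to the ISO you downloaded):
certutil -hashfile C:\Users\YourName\Downloads\ubuntu-20.04-desktop-amd64.iso SHA256
Compare the printed hash with the value listed for your ISO; if they match, the download is intact.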

When you have all the prerequisites in place, right-click on the Rufus.exe file you downloaded and click Run as administrator on your Windows machine.


The following Rufus screen will appear.


Rufus will automatically detect your USB stick when you attach it to your system. Once the USB stick is detected, click the SELECT button to choose an Ubuntu ISO image file.


Select your appropriate Ubuntu ISO image file.


This will automatically populate the remaining options, as you can see in the below screenshot. Next, click START to begin writing Ubuntu to your USB stick.


Click Yes on the below warning screen.


Keep the default option on the below screen and click OK.


Click OK.


This will start writing the Ubuntu ISO to your USB stick.


Click CLOSE to finish the process.


You now have Ubuntu on a USB stick, bootable and ready to use.

How To Fix Ubuntu not Detecting Windows 10 While Installing in Dual Boot

The Ubuntu installation wizard will automatically detect Windows if you are installing in dual boot, but if you encounter the "This computer currently has no detected operating system" warning, this guide will help you resolve the issue.



Close the Ubuntu installation wizard, reboot into Windows, and follow the step-by-step instructions below to clean up Windows 10 disk errors first.

In Windows 10, right-click on Start > File Explorer and find the Windows partition, usually the "C:\" volume.


Right-click on the C:\ volume and click Properties.


Click Tools then click the Check button to fix errors.


This will scan the C:\ drive for errors, and fix them automatically.

When the error-checking process has finished, open Windows PowerShell (Admin) from the Start menu.


When the prompt appears asking "Do you want to allow this app to make changes to your device?", click Yes.

Type chkdsk in the PowerShell window to scan for and fix errors.


If a prompt appears saying you need to reboot, press the Y key to restart your system and let the chkdsk command complete its process.

When the chkdsk process has finished, boot into Windows 10 and then choose Shut down from the Start menu to turn it off safely.

Next, boot with Ubuntu installation media and see if the Ubuntu installation wizard detects Windows 10 as shown in the below screenshot.


If it detects Windows and gives you the option to install Ubuntu alongside Windows 10, you are good to go; follow the wizard's on-screen instructions to complete the rest of the Ubuntu installation.

However, if it still doesn't detect Windows 10 after you complete the Ubuntu installation in dual boot, boot with the Ubuntu installation media again. When you see the below screen, click Try Ubuntu.


When your system reaches the Ubuntu graphical desktop, open a terminal session by pressing Ctrl + Alt + T on the keyboard.


Type the lsblk command to identify your Windows PC's drive label, as well as the partition names. For example, the drive can be "/dev/sdb", and the partitions you'll work with are "/dev/sdb1", "/dev/sdb2", and "/dev/sdb3".

Run the fsck tool on each of your Windows partitions to clean out the bad sectors/bits on the hard drive. Make sure you replace each instance of “/dev/sdb” with your actual Windows partition names.

sudo fsck -y /dev/sdb1

sudo fsck -y /dev/sdb2
sudo fsck -y /dev/sdb3
Next, install the os-prober package on Ubuntu using the apt command:
sudo apt -y install os-prober
Next, type the below command to manually update your bootloader:
sudo update-grub
This will automatically detect and add Windows 10 in the grub bootloader.
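You can also run os-prober directly to confirm that Windows is now being detected before you reboot:
sudo os-prober
If detection succeeded, the output will include a line identifying Windows 10 on one of your partitions.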

Reboot your system and take a look at the Grub bootloader. If the above steps were successful, Ubuntu will show you Windows 10 as a boot option.


Wrapping up

I hope this guide was helpful in setting up Ubuntu alongside Windows in a dual-boot environment.

How To Set Up 389 Directory Server on CentOS/RHEL 8

The 389 Directory Server is an open-source enterprise-class LDAP server for Linux that can be deployed in less than an hour. This guide will help you to set up a 389 Directory Server on CentOS/RHEL 8.


Prerequisites

You will need one (physical or virtual) machine installed with CentOS/RHEL 8 having root user privileges.

Configure SELinux

Log in to your server as the root user and make the following changes to prepare it for the 389-ds installation.

First, edit /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled:
sudo vi /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted

Save and close the editor.

Reboot your server to apply these changes.

Add EPEL Repository

You can add the EPEL repository to your CentOS/RHEL 8 server using the following commands:

Type below if you are on CentOS 8:
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
dnf config-manager --set-enabled PowerTools

Type below if you are on RHEL 8:
dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
ARCH=$( /bin/arch )
subscription-manager repos --enable "codeready-builder-for-rhel-8-${ARCH}-rpms"

Install 389 Directory Server

There are two 389-ds streams available: stable and testing. Testing is a bleeding-edge development version. As its name implies, it is NOT supposed to be used in production. After a period of testing and bug fixing it becomes the next stable version.

Each stream has 3 profiles:

default - 389-ds-base and cockpit web ui
minimal - just 389-ds-base
legacy - same as default plus legacy Perl tools and scripts

Type below command to install 389-ds on your CentOS/RHEL 8:
dnf -y module install 389-directory-server:stable/default

Configure 389 Directory Server

Run the interactive installer to create a directory server instance:
dscreate interactive
You will see the following prompts:
Install Directory Server (interactive mode)
===========================================
selinux is disabled, will not relabel ports or files.

Selinux support will be disabled, continue? [yes]:

Enter system's hostname [ldapsvr.techsupport.pk]:

Enter the instance name [ldapsvr]:

Enter port number [389]:

Create self-signed certificate database [yes]:

Enter secure port number [636]:

Enter Directory Manager DN [cn=Directory Manager]:

Enter the Directory Manager password:
Confirm the Directory Manager Password:

Enter the database suffix (or enter "none" to skip) [dc=ldapsvr,dc=techsupport,dc=pk]:

Create sample entries in the suffix [no]: yes

Do you want to start the instance after the installation? [yes]:

Are you ready to install? [no]: yes
Starting installation...
Completed installation for ldapsvr
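If you prefer a repeatable, non-interactive installation, dscreate can also read its answers from a file. A rough sketch (the /root/instance.inf path is just an example):
dscreate create-template /root/instance.inf
# edit /root/instance.inf to set the instance name, suffix and Directory Manager password
dscreate from-file /root/instance.inf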
Next, check the ldap instance name with below command:
dsctl --list
You will see the output similar to the following:
slapd-ldapsvr
Confirm that the slapd-ldapsvr instance is running with the below command:
dsctl slapd-ldapsvr status
You will see the output similar to the following:
Instance "ldapsvr" is running
Next, start and enable the Cockpit service with the below commands:
systemctl start cockpit
systemctl enable cockpit

Add Firewall Rules

Open the LDAP (389), LDAPS (636), and Cockpit (9090) ports in the firewall:
firewall-cmd --permanent --add-port=389/tcp
firewall-cmd --permanent --add-port=636/tcp
firewall-cmd --permanent --add-port=9090/tcp
firewall-cmd --reload
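Before moving on to the web interface, you can optionally confirm that the directory answers LDAP queries. This assumes the openldap-clients package is installed and that you used the example suffix shown earlier:
dnf -y install openldap-clients
ldapsearch -x -H ldap://localhost:389 -D "cn=Directory Manager" -W -b "dc=ldapsvr,dc=techsupport,dc=pk" "(objectClass=*)" dn
Enter the Directory Manager password when prompted; the sample entries created during installation should be listed.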
Open up your preferred web browser and access the Cockpit web interface by navigating to https://your_server_ip:9090.

Enter root as the username along with the root password to log in.


From here you can manage your 389 Directory Server.


Wrapping up

Congratulations, your 389 Directory Server is now ready to use.

Secure Apache Web Server Content using ModSecurity on Ubuntu 18.04/19.10/20.04

ModSecurity is an open-source, cross-platform web application firewall (WAF) module. Known as the "Swiss Army Knife" of WAFs, it enables web application defenders to gain visibility into HTTP(S) traffic and provides a powerful rules language and API to implement advanced protections. ModSecurity is a toolkit for real-time web application monitoring, logging, and access control.


The following is a list of the most important usage scenarios:

Real-time application security monitoring and access control
At its core, ModSecurity gives you access to the HTTP traffic stream, in real-time, along with the ability to inspect it. This is enough for real-time security monitoring. There's an added dimension of what's possible through ModSecurity's persistent storage mechanism, which enables you to track system elements over time and perform event correlation. You are able to reliably block if you so wish because ModSecurity uses full request and response buffering.

Full HTTP traffic logging
Web servers traditionally do very little when it comes to logging for security purposes. They log very little by default, and even with a lot of tweaking you are not able to get everything that you need. I have yet to encounter a web server that is able to log full transaction data. ModSecurity gives you the ability to log anything you need, including raw transaction data, which is essential for forensics. In addition, you get to choose which transactions are logged, which parts of a transaction are logged, and which parts are sanitized.

Continuous passive security assessment
A security assessment is largely seen as an active scheduled event, in which an independent team is sourced to try to perform a simulated attack. The continuous passive security assessment is a variation of real-time monitoring, where, instead of focusing on the behavior of the external parties, you focus on the behavior of the system itself. It's an early warning system of sorts that can detect traces of many abnormalities and security weaknesses before they are exploited.

Web application hardening
One of my favorite uses for ModSecurity is attack surface reduction, in which you selectively narrow down the HTTP features you are willing to accept (e.g., request methods, request headers, content types, etc.). ModSecurity can assist you in enforcing many similar restrictions, either directly, or through collaboration with other Apache modules. They all fall under web application hardening. For example, it is possible to fix many session management issues, as well as cross-site request forgery vulnerabilities.

This guide will help you set up ModSecurity for the Apache web server running on an Ubuntu or Debian server (installation commands for CentOS/RHEL are also included).

Prerequisites

You will need one (physical or virtual) machine installed with Ubuntu or Debian with a non-root user that has sudo privileges. This guide assumes that you have already set up Apache on your Ubuntu or Debian server.

Install ModSecurity

Type below command to install ModSecurity on Ubuntu:
sudo apt-get install libapache2-mod-security2
Restart the Apache service to load the ModSecurity module:
sudo systemctl restart apache2
Type below command to install ModSecurity on Debian:
sudo apt install libapache2-modsecurity
sudo systemctl restart apache2
Type below command to install ModSecurity on CentOS/RHEL 7
sudo yum -y install epel-release
sudo yum -y install mod_security
sudo systemctl restart httpd
Type below command to install ModSecurity on CentOS/RHEL 8
sudo dnf -y install epel-release
sudo dnf -y install mod_security
sudo systemctl restart httpd

Configure ModSecurity

The following steps are for Ubuntu or Debian based distributions. If you are on CentOS/RHEL, file paths and commands will differ slightly.

Move and change the name of the default ModSecurity file:
sudo mv /etc/modsecurity/modsecurity.conf-recommended  /etc/modsecurity/modsecurity.conf
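The recommended configuration typically ships with SecRuleEngine set to DetectionOnly, which only logs matches. If you want ModSecurity to actually block requests, switch it to On (check the file first in case your packaged default differs):
sudo sed -i 's/^SecRuleEngine DetectionOnly/SecRuleEngine On/' /etc/modsecurity/modsecurity.conf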
Download the OWASP ModSecurity CRS from Github:
cd ~
git clone https://github.com/SpiderLabs/owasp-modsecurity-crs.git
Move and rename crs-setup.conf.example to crs-setup.conf, then move the rules/ directory as well:
cd ~/owasp-modsecurity-crs
sudo mv crs-setup.conf.example /etc/modsecurity/crs-setup.conf
sudo mv rules/ /etc/modsecurity/
Edit Apache's ModSecurity module configuration so that the IncludeOptional directive matches the paths above:
sudo nano /etc/apache2/mods-available/security2.conf
Add another Include directive pointing to the ruleset:
<IfModule security2_module>
# Default Debian dir for modsecurity's persistent data
SecDataDir /var/cache/modsecurity

# Include all the *.conf files in /etc/modsecurity.
# Keeping your local configuration in that directory
# will allow for an easy upgrade of THIS file and
# make your life easier
IncludeOptional /etc/modsecurity/*.conf
Include /etc/modsecurity/rules/*.conf
</IfModule>
Save and close the editor.

Restart Apache to take changes into effect:
sudo systemctl restart apache2

Test ModSecurity

The OWASP CRS builds on top of ModSecurity so that existing rules can be extended.

Edit the default Apache configuration file and add two additional directives, using the default configuration as an example:
sudo nano /etc/apache2/sites-available/000-default.conf

<VirtualHost *:80>
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html

ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined

SecRuleEngine On
SecRule ARGS:modsecparam "@contains test" "id:1234,deny,status:403,msg:'ModSecurity test rule has triggered'"
</VirtualHost>
Save and close the editor.

Restart Apache to take changes into effect.
sudo systemctl restart apache2
Curl the index page to intentionally trigger the alarms:
curl localhost/index.html?modsecparam=test
The response code should be 403. There should be a message in the logs that shows the defined ModSecurity rule worked.

You can check using: sudo tail -f /var/log/apache2/error.log
ModSecurity: Access denied with code 403 (phase 2). String match “test” at ARGS:modsecparam. [file “/etc/apache2/sites-enabled/000-default.conf”] [line “24”] [id “1234”] [msg “ModSecurity test rule has triggered”] [hostname “localhost”] [uri “/index.html”] [unique_id “WenFd36AAAEAAEmQyEAAAAAD”]
Verify the OWASP CRS is in effect:
curl localhost/index.html?exec=/bin/bash
Check the error logs again: the rule has caught the attempted execution of an arbitrary bash script.
ModSecurity: Warning. Matched phrase “bin/bash” at ARGS:. [file “/etc/modsecurity/rules/REQUEST-932-APPLICATION-ATTACK-RCE.conf”] [line “448”] [id “932160”] [rev “1”] [msg “Remote Command Execution: Unix Shell Code Found”] [data “Matched Data: bin/bash found within ARGS:: exec/bin/bash”] [severity “CRITICAL”] [ver “OWASP_CRS/3.0.0”] [maturity “1”] [accuracy “8”] [tag “application-multi”] [tag “language-shell”] [tag “platform-unix”] [tag “attack-rce”] [tag “OWASP_CRS/WEB_ATTACK/COMMAND_INJECTION”] [tag “WASCTC/WASC-31”] [tag “OWASP_TOP_10/A1”] [tag “PCI/6.5.2”] [hostname “localhost”] [uri “/index.html”] [unique_id “WfnVf38AAAEAAEqya3YAAAAC”]

Wrapping up

Review the configuration files located in /etc/modsecurity/*.conf. Most of the files are commented with definitions of the available options. ModSecurity uses an anomaly scoring level where the highest number (5) is the most severe. Review the OWASP CRS wiki for additional directives and for guidance on updating the rules when you encounter false positives. You may also wish to go through the official ModSecurity and OWASP CRS documentation for additional information on this topic.

TikTok vulnerability allows hackers to easily penetrate into users data

A security vulnerability in the popular video-sharing app TikTok exposes millions of users to attack. The app uses an insecure protocol (HTTP) to process videos and images over unencrypted channels, allowing attackers to gain access to a user's data, including their entire activity history, and exposing users' privacy.



An attacker can download, add, or modify videos and images, exploiting users by publicly feeding fake or malicious content on behalf of victim accounts. TikTok release 15.5.6 for iOS and release 15.7.4 for Android still use unencrypted channels to process data, putting millions of users at risk over the internet.

This is an example of the kind of misleading information attackers could circulate using the TikTok vulnerability, which poses a huge risk during such a deadly pandemic.

How To Install or Upgrade PHP on Fedora/CentOS/RHEL

This guide will show you how to install or upgrade to latest release of PHP on Fedora, RHEL and CentOS.

Note: On CentOS/RHEL 8, the yum command has been replaced with dnf, and the yum package manager will be discontinued in the near future. It is now recommended to use dnf for installing packages on CentOS/RHEL 8, but you can still use yum if you wish.

Adding EPEL/Remi Repository

If you are on Fedora, the standard repositories are enough and you can directly install or upgrade PHP. For RHEL and CentOS, the Extra Packages for Enterprise Linux (EPEL) repository must be configured before proceeding, and on RHEL the optional channel must also be enabled.

For CentOS 8

sudo dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf config-manager --set-enabled PowerTools

sudo dnf -y install https://rpms.remirepo.net/enterprise/remi-release-8.rpm
sudo dnf config-manager --set-enabled remi

For RHEL 8

sudo dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
ARCH=$( /bin/arch )
sudo dnf config-manager --set-enabled PowerTools
sudo subscription-manager repos --enable "codeready-builder-for-rhel-8-${ARCH}-rpms"

sudo dnf -y install https://rpms.remirepo.net/enterprise/remi-release-8.rpm
sudo dnf config-manager --set-enabled remi

For CentOS 7.6

sudo yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum -y install https://rpms.remirepo.net/enterprise/remi-release-7.rpm

For RHEL 7.6

sudo yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
sudo yum -y install https://rpms.remirepo.net/enterprise/remi-release-7.rpm
sudo subscription-manager repos --enable=rhel-7-server-optional-rpms

For CentOS 6.10

yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
yum -y install https://rpms.remirepo.net/enterprise/remi-release-6.rpm

For RHEL 6.10

yum -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-6.noarch.rpm
yum -y install https://rpms.remirepo.net/enterprise/remi-release-6.rpm
rhn-channel --add --channel=rhel-$(uname -i)-server-optional-6

For Fedora 29

sudo dnf -y install http://rpms.remirepo.net/fedora/remi-release-29.rpm

For Fedora 28

sudo dnf -y install http://rpms.remirepo.net/fedora/remi-release-28.rpm

Installing PHP

For CentOS/RHEL 8, you can install the default version of PHP with the below command:
sudo dnf -y install php php-fpm
To install a specific PHP version, type one of the below commands:
sudo dnf -y install php70
sudo dnf -y install php71
sudo dnf -y install php72
sudo dnf -y install php73
sudo dnf -y install php74
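Alternatively, on CentOS/RHEL 8 you can switch the php dnf module stream to one provided by the Remi repository and then install the base packages; a sketch, assuming a php:remi-7.4 stream is available for your release:
sudo dnf -y module reset php
sudo dnf -y module enable php:remi-7.4
sudo dnf -y install php php-fpm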
For CentOS/RHEL 7, you can install the default version of PHP with the below command:
sudo yum -y install php php-fpm
To install a specific PHP version, type one of the below commands:
sudo yum -y install php70
sudo yum -y install php71
sudo yum -y install php72
sudo yum -y install php73
sudo yum -y install php74
For CentOS/RHEL 6, you can install the default version of PHP with the below command:
yum -y install php php-fpm
To install a specific PHP version, type one of the below commands:
yum -y install php70
yum -y install php71
yum -y install php72
yum -y install php73
For Fedora, first install dnf-plugins-core:
sudo dnf -y install dnf-plugins-core
To install the default PHP version, type the below command:
sudo dnf -y install php php-fpm
To install a specific PHP version, type one of the below commands:
sudo dnf -y install php70
sudo dnf -y install php71
sudo dnf -y install php72
sudo dnf -y install php73
sudo dnf -y install php74
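Whichever method you used, you can confirm the installed PHP version and list the loaded modules afterwards:
php --version
php -m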
We hope this guide was helpful.