Channel: Severalnines

Comparing Galera Cluster Cloud Offerings: Part One Amazon AWS


Running a MySQL Galera Cluster (whether the Percona, MariaDB, or Codership build) is, unfortunately, not supported by Amazon RDS. Most of the databases supported by RDS use asynchronous replication, while Galera Cluster is a synchronous multi-master replication solution. Galera also requires InnoDB as its storage engine to function properly; while you can use other storage engines such as MyISAM, doing so is not advised because of the lack of transaction handling.

Because of the lack of native RDS support, this blog will focus on the offerings available for hosting your Galera-based cluster in an AWS environment.

There are certainly many reasons why you might or might not choose AWS, but for this particular topic we're going to go over the advantages of what you can leverage when running Galera Cluster on the AWS platform.

The Virtual Servers (Elastic Compute Cloud Instances)

As mentioned earlier, MySQL Galera is not part of RDS, and InnoDB is a transactional storage engine that needs resources sized to your application's requirements: it must have the capacity to serve the demand of your client request traffic. At the time of this writing, your sole choice for running a Galera Cluster on AWS is EC2, Amazon's compute instance cloud offering.

Because you have the advantage of running your system on a number of EC2 instances, running a Galera Cluster on EC2 versus on-prem doesn't differ much. You can access the servers remotely via SSH, install your desired software packages, and choose the kind of Galera Cluster build you'd like to use.

Moreover, EC2 is elastic and flexible, allowing for a simpler, more granular setup. You can use AWS services to automate the building of a number of nodes if you need to scale out your environment, or, for example, to automate the creation of your staging or development environments. It also gives you an edge to quickly build your desired environment, choose and set up your desired OS, and pick the right computing resources that fit your requirements (such as CPU, memory, and disk storage). EC2 eliminates the wait for hardware, since you can provision on the fly. You can also leverage the AWS CLI tool to automate your Galera Cluster setup.
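As a rough sketch of what such automation could look like, the script below generates (but does not execute) the AWS CLI calls that would launch three Galera nodes. Every ID in it (AMI, key pair, security group, subnet) is a placeholder you would substitute with values from your own account:

```shell
#!/bin/sh
# Sketch: print the AWS CLI commands that would launch three Galera nodes
# on EC2. Commands are echoed, not executed, so they can be reviewed first.
gen_galera_launch_cmds() {
  AMI_ID="ami-0123456789abcdef0"        # placeholder OS image
  INSTANCE_TYPE="r5.large"              # memory-optimized, suits Galera
  KEY_NAME="galera-key"                 # placeholder key pair
  SG_ID="sg-0123456789abcdef0"          # placeholder security group
  SUBNET_ID="subnet-0123456789abcdef0"  # placeholder subnet

  for NODE in galera-node-1 galera-node-2 galera-node-3; do
    echo "aws ec2 run-instances --image-id $AMI_ID" \
         "--instance-type $INSTANCE_TYPE --key-name $KEY_NAME" \
         "--security-group-ids $SG_ID --subnet-id $SUBNET_ID" \
         "--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=$NODE}]'"
  done
}

gen_galera_launch_cmds
```

In practice you would pipe these commands to a shell (or call `aws ec2 run-instances` directly) once the placeholder IDs are filled in.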

Pricing for Amazon EC2 Instances

EC2 offers a number of selections which are very flexible for consumers who would like to host their Galera Cluster environment on AWS compute nodes. The AWS Free Tier includes 750 hours of Linux and Windows t2.micro instances each month for one year. You can stay within the Free Tier by using only EC2 micro instances, but this might not be the best choice for production use.

There are multiple types of EC2 instances you can deploy when provisioning your Galera nodes. The r4/r5/x1 families (memory optimized) and c4/c5 families (compute optimized) are ideal choices, and prices differ depending on how large your server resource needs are and the type of OS.

These are the types of paid instances you can choose...

On Demand 

You pay for compute capacity (per hour or per second) depending on the type of instances you run. For example, prices might differ when provisioning an Ubuntu instance versus a RHEL instance, aside from the instance type itself. There are no long-term commitments or upfront payments, and you have the flexibility to increase or decrease your compute capacity. These instances are recommended for low-cost, flexible environments such as applications with short-term, spiky, or unpredictable workloads that cannot be interrupted, or applications being developed or tested on Amazon EC2 for the first time. Check the AWS documentation for more info.

Dedicated Hosts

If you have compliance and regulatory requirements, such as the need for a dedicated server running on dedicated hardware, this type of offering suits your needs. Dedicated Hosts can help you address compliance requirements and reduce costs by allowing you to use your existing server-bound software licenses, including Windows Server, SQL Server, SUSE Linux Enterprise Server, Red Hat Enterprise Linux, or other licenses bound to VMs, sockets, or physical cores, subject to your license terms. They can be purchased On-Demand (hourly) or as a Reservation for up to 70% off the On-Demand price. Check the AWS documentation for more info.

Spot Instances

These instances allow you to request spare Amazon EC2 computing capacity for up to 90% off the On-Demand price. This is recommended for applications that have flexible start and end times, applications that are only feasible at very low compute prices, or users with urgent computing needs for large amounts of additional capacity. Check the AWS documentation for more info.

Reserved Instances

This type of payment offer provides you the option to grab up to a 75% discount and, depending on which instance you would like to reserve, you can acquire a capacity reservation giving you additional confidence in your ability to launch instances when you need them. This is recommended if your applications have steady-state or predictable usage, may require reserved capacity, or if you can commit to using EC2 over a 1- or 3-year term to reduce total computing costs. Check the AWS documentation for more info.

Pricing Note

One last thing about EC2: it also offers per-second billing, which takes the cost of unused minutes and seconds in an hour off the bill. This is advantageous if you are scaling out for a minimal amount of time, for instance to handle traffic spikes on a Galera node, or if you want to test on a specific node for a limited time.
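To make the savings concrete, here is a small arithmetic sketch comparing per-second and per-hour billing for a node that runs for 10 minutes. The $0.126/hour rate is purely illustrative (check current EC2 pricing), and note that AWS applies a one-minute minimum to per-second billing:

```shell
# Per-second vs. per-hour billing for 600 seconds at a hypothetical
# on-demand rate of $0.126/hour.
RATE_PER_HOUR=0.126
SECONDS_USED=600

# Per-second billing: pay only for the seconds actually used.
per_second_cost=$(awk -v r="$RATE_PER_HOUR" -v s="$SECONDS_USED" \
  'BEGIN { printf "%.4f", r / 3600 * s }')

# Per-hour billing: partial hours are rounded up to a full hour.
per_hour_cost=$(awk -v r="$RATE_PER_HOUR" -v s="$SECONDS_USED" \
  'BEGIN { hours = int((s + 3599) / 3600); printf "%.4f", r * hours }')

echo "billed per second: \$$per_second_cost"   # $0.0210
echo "billed per hour:   \$$per_hour_cost"     # $0.1260
```

For short-lived scale-out nodes, per-second billing here costs one-sixth of the rounded-up hourly charge.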

Database Encryption on AWS

If you're concerned about the confidentiality of your data, or about abiding by the laws required for your security compliance and regulations, AWS offers data-at-rest encryption. If you're using MariaDB Cluster 10.2+, there is built-in plugin support to interface with the Amazon Web Services (AWS) Key Management Service (KMS) API. This allows you to take advantage of AWS KMS to facilitate separation of responsibilities and remote logging and auditing of key access requests. Rather than storing the encryption key in a local file, this plugin keeps the master key in AWS KMS.

When you first start MariaDB, the AWS KMS plugin will connect to the AWS Key Management Service and ask it to generate a new key. MariaDB will store that key on-disk in an encrypted form. The key stored on-disk cannot be used to decrypt the data; rather, on each startup, MariaDB connects to AWS KMS and has the service decrypt the locally-stored key(s). The decrypted key is stored in-memory as long as the MariaDB server process is running, and that in-memory decrypted key is used to encrypt the local data.
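A minimal my.cnf sketch of what enabling this could look like; the key alias is a placeholder, and the option names should be verified against the aws_key_management plugin documentation for your MariaDB version:

```ini
[mariadb]
# Load the AWS KMS key management plugin (MariaDB 10.2+)
plugin_load_add = aws_key_management
# Placeholder: ID or alias of your customer master key in AWS KMS
aws_key_management_master_key_id = alias/galera-encryption-key
# Encrypt InnoDB tablespaces and the redo log with the managed keys
innodb_encrypt_tables = ON
innodb_encrypt_log = ON
innodb_encryption_threads = 4
```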

Alternatively, when deploying your EC2 instances, you can encrypt your data storage volume with EBS (Elastic Block Storage) or encrypt the instance itself. Encryption is supported for all EBS volume types; it might have a performance impact, but the latency is minimal or even invisible to end users. For EC2 instance-type encryption, most of the larger instances are supported, so if you're using compute- or memory-optimized nodes, you can leverage encryption.

Below is the list of supported instance types...

  • General purpose: A1, M3, M4, M5, M5a, M5ad, M5d, T2, T3, and T3a
  • Compute optimized: C3, C4, C5, C5d, and C5n
  • Memory optimized: cr1.8xlarge, R3, R4, R5, R5a, R5ad, R5d, u-6tb1.metal, u-9tb1.metal, u-12tb1.metal, X1, X1e, and z1d
  • Storage optimized: D2, h1.2xlarge, h1.4xlarge, I2, and I3
  • Accelerated computing: F1, G2, G3, P2, and P3

You can set up your AWS account to always enable encryption upon deployment of your EC2 instances. This means that AWS will encrypt new EBS volumes on launch and encrypt new copies of unencrypted snapshots.

Multi-AZ/Multi-Region/Multi-Cloud Deployments

Unfortunately, as of this writing, there's no direct support in the AWS Console (nor in any AWS API) for Multi-AZ/-Region/-Cloud deployments of Galera clusters.

High Availability, Scalability, and Redundancy

To achieve a multi-AZ deployment, it's recommended that you provision your Galera nodes in different availability zones. This prevents the cluster from going down or malfunctioning due to lack of quorum.

You can also set up AWS Auto Scaling and create an auto scaling group to monitor and run status checks, so your cluster remains redundant, scalable, and highly available. Auto Scaling can replace a node in case it goes down for some unknown reason.

For multi-region or multi-cloud deployments, Galera has its own parameter called gmcast.segment, which you can set at server start. This parameter is designed to optimize the communication between Galera nodes and minimize the amount of traffic sent between network segments, including writeset relaying and IST and SST donor selection.
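A sketch of how segments might be assigned in such a topology (the region names are examples; any integer per location works). Nodes in the same segment talk to each other directly, while cross-segment traffic is relayed through one node per segment:

```ini
# my.cnf on nodes in data center / region A (e.g. AWS us-east-1):
wsrep_provider_options = "gmcast.segment=1"

# my.cnf on nodes in data center / region B (e.g. another region or cloud):
wsrep_provider_options = "gmcast.segment=2"
```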

This type of setup allows you to deploy the nodes of your Galera Cluster in multiple regions. Aside from that, you can also deploy Galera nodes with a different vendor; for example, hosted in Google Cloud with redundancy on Microsoft Azure.

I would recommend checking out our blogs Multiple Data Center Setups Using Galera Cluster for MySQL or MariaDB and Zero Downtime Network Migration With MySQL Galera Cluster Using Relay Node to gather more information on how to implement these types of deployments.

Database Performance on AWS

The choice depends on your application's demand: if your queries are memory-consuming, memory-optimized instances are your ideal choice. If your application has a high transaction rate and requires high performance for web servers or batch processing, choose compute-optimized instances. If you want to learn more about optimizing your Galera Cluster, you can check out our blog How to Improve Performance of Galera Cluster for MySQL or MariaDB.

Database Backups on AWS

Creating backups can be difficult since there's no direct support within AWS specific to MySQL Galera technology. However, AWS provides a disaster recovery solution using EBS Snapshots. You can take snapshots of the EBS volumes attached to your instances, then either schedule the backups using CloudWatch or automate the snapshots with Amazon Data Lifecycle Manager (Amazon DLM).

Take note that the snapshots taken are incremental backups, which means that only the blocks on the device that have changed since your most recent snapshot are saved. The snapshots are stored in AWS S3, which keeps storage costs down. Alternatively, you can use external tools like Percona XtraBackup, or Mydumper for logical backups, and tier the output through AWS EFS -> AWS S3 -> AWS Glacier.
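As an illustration of such a pipeline's destination layout, the snippet below builds a dated S3 key for a backup archive and prints the upload command rather than executing it. The bucket and file names are placeholders; `--storage-class STANDARD_IA` selects a cheaper infrequent-access S3 tier:

```shell
# Sketch: build a dated S3 destination for a backup file and print the
# upload command (echoed, not executed).
BUCKET="my-galera-backups"                   # placeholder bucket name
BACKUP_FILE="xtrabackup_2019-08-23.tar.gz"   # placeholder backup archive
DATE_PATH=$(date +%Y/%m/%d)                  # e.g. 2019/08/23
S3_URI="s3://$BUCKET/backups/$DATE_PATH/$BACKUP_FILE"

echo "aws s3 cp /root/backups/$BACKUP_FILE $S3_URI --storage-class STANDARD_IA"
```

An S3 lifecycle rule can then transition objects under `backups/` to Glacier after a retention window.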

You can also setup Lifecycle Management in AWS if you need your backup data to be stored in a more cost efficient manner. If you have large files and are going to utilize the AWS EFS, you can leverage their AWS Backup solution as this is also a simple yet cost-effective solution.

On the other hand, you can also use external services such as ClusterControl, which provides both monitoring and backup solutions. Check this out if you want to know more.

Database Monitoring on AWS

AWS offers health checks and some status checks to provide you visibility into your Galera nodes. This is done through CloudWatch and CloudTrail.

CloudTrail lets you enable and inspect the logs and perform audits based on what actions and traces have been made. 

CloudWatch lets you collect and track metrics, collect and monitor log files, and set custom alarms. You can set it up according to your custom needs and gain system-wide visibility into resource utilization, application performance, and operational health. CloudWatch comes with a free tier as long as you stay within its limits (see the screenshot below.)

Beyond the free tier, CloudWatch pricing depends on the volume of metrics ingested. Check the current pricing on the AWS site.

Take note: there's a downside to using CloudWatch. It is not designed to cater to database health, especially for monitoring MySQL Galera Cluster nodes. Alternatively, you can use external tools that offer high-resolution graphs or charts that are useful in reporting and easier to analyze when diagnosing a problematic node.

For this you can use PMM by Percona, Datadog, Idera, VividCortex, or our very own ClusterControl (monitoring is FREE with ClusterControl Community.) I would recommend using a monitoring tool that suits your needs based on your individual application requirements. It's very important that your monitoring tool be able to notify you aggressively, provide integrations with systems such as Slack or PagerDuty, or even send you SMS when escalating a severe health status.

Database Security on AWS

Securing your EC2 instances is one of the most vital parts of deploying your database into the public cloud. You can set up a private subnet and configure security groups to allow only the required ports or source IPs, depending on your setup. You can disable remote access to your database nodes and set up a jump host, or an Internet Gateway if the nodes need internet access to fetch or update software packages. You can read our previous blog Deploying Secure Multicloud MySQL Replication on AWS and GCP with VPN on how we set this up.

In addition to this, you can secure your data in transit by using TLS/SSL connections, and encrypt your data at rest. If you're using ClusterControl, securing data in transit is simple and easy. You can check out our blog SSL Key Management and Encryption of MySQL Data in Transit if you want to try it out. For data at rest, data stored in S3 can be encrypted using AWS Server-Side Encryption, or you can use AWS KMS, which I discussed earlier. Check this external blog on how to set up and leverage a MariaDB Cluster using AWS KMS to store your data securely at rest.

Galera Cluster Troubleshooting on AWS

AWS CloudWatch can help, especially when investigating and checking system metrics. You can check network, CPU, memory, and disk usage, as well as instance compute usage and credit balance. This might not, however, meet your requirements when digging into a specific case.

CloudTrail keeps a solid trace of the actions performed under your AWS account. This will help you determine whether an occurrence comes not from MySQL Galera but from a bug or issue within the AWS environment (such as the hypervisor having issues on the host machine where your instance, as the guest, is hosted.)

If you're using ClusterControl, going to Logs -> System Logs lets you browse the captured error logs taken from the MySQL Galera node itself. Apart from this, ClusterControl provides real-time monitoring that amplifies your alarm and notification system in case of an emergency or if your MySQL Galera node(s) is kaput.

Conclusion

AWS does not natively support a MySQL Galera Cluster setup the way RDS supports plain MySQL. Because of this, most of the recommendations or opinions on running a Galera Cluster for production use within the AWS environment are based on experienced, well-tested environments that have been running for a very long time.

MariaDB Cluster pairs well with AWS, as MariaDB consistently provides support for the AWS technology stack. The upcoming MariaDB 10.5 release will offer support for an S3 storage engine, which may be worth the wait.

External tools can help you manage and control your MySQL Galera Cluster running on the AWS cloud, so it's not a huge concern if you have doubts about running on, or shifting to, the AWS platform.

AWS might not be the one-size-fits-all solution in some cases, but it provides a wide array of solutions that you can customize and tailor to fit your needs.

In the next part of our blog, we'll look at another public cloud platform, Google Cloud, and see how we can leverage it if we choose to run our Galera Cluster on their platform.


Tips for Storing PostgreSQL Backups on Amazon AWS


Data is probably one of the most valuable assets in a company. Because of this we should always have a Disaster Recovery Plan (DRP) to prevent data loss in the event of an accident or hardware failure. 

A backup is the simplest form of DR; however, it might not always be enough to guarantee an acceptable Recovery Point Objective (RPO). It is recommended that you keep at least three backups stored in different physical places.

Best practice dictates that you keep one backup locally on the database server (for faster recovery), another on a centralized backup server, and the last one in the cloud.

For this blog, we’ll take a look at which options Amazon AWS provides for the storage of PostgreSQL backups in the cloud and we’ll show some examples on how to do it.

About Amazon AWS

Amazon AWS is one of the world’s most advanced cloud providers in terms of features and services, with millions of customers. If we want to run our PostgreSQL databases on Amazon AWS we have some options...

  • Amazon RDS: It allows us to create, manage and scale a PostgreSQL database (or different database technologies) in the cloud in an easy and fast way.

  • Amazon Aurora: It’s a PostgreSQL compatible database built for the cloud. According to the AWS web site, it’s three times faster than standard PostgreSQL databases.

  • Amazon EC2: It’s a web service that provides resizable compute capacity in the cloud. It provides you with complete control of your computing resources and allows you to set up and configure everything about your instances from your operating system up to your applications.

But, in fact, we don’t need to have our databases running on Amazon to store our backups there.

Storing Backups on Amazon AWS

There are different options to store our PostgreSQL backup on AWS. If we’re running our PostgreSQL database on AWS we have more options and (as we’re in the same network) it could also be faster. Let’s see how AWS can help us store our backups.

AWS CLI

First, let’s prepare our environment to test the different AWS options. For our examples, we’ll use an on-prem PostgreSQL 11 server, running on CentOS 7. Here, we need to install the AWS CLI following the instructions in the AWS documentation.

When we have our AWS CLI installed, we can test it from the command line:

[root@PG1bkp ~]# aws --version

aws-cli/1.16.225 Python/2.7.5 Linux/4.15.18-14-pve botocore/1.12.215

Now, the next step is to configure our new client running the aws command with the configure option.

[root@PG1bkp ~]# aws configure

AWS Access Key ID [None]: AKIA7TMEO21BEBR1A7HR

AWS Secret Access Key [None]: SxrCECrW/RGaKh2FTYTyca7SsQGNUW4uQ1JB8hRp

Default region name [None]: us-east-1

Default output format [None]:

To get this information, you can go to the IAM AWS Section and check the current user, or if you prefer, you can create a new one for this task.

After this, we’re ready to use the AWS CLI to access our Amazon AWS services.

Amazon S3

This is probably the most commonly used option to store backups in the cloud. Amazon S3 can store and retrieve any amount of data from anywhere on the Internet. It’s a simple storage service that offers an extremely durable, highly available, and infinitely scalable data storage infrastructure at low costs.

Amazon S3 provides a simple web service interface which you can use to store and retrieve any amount of data, at any time, from anywhere on the web, and (with the AWS CLI or AWS SDK) you can integrate it with different systems and programming languages.

How to use it

Amazon S3 uses Buckets. They are unique containers for everything that you store in Amazon S3. So, the first step is to access the Amazon S3 Management Console and create a new Bucket.

Create Bucket Amazon AWS

In the first step, we just need to add the Bucket name and the AWS Region.

Create Bucket Amazon AWS

Now, we can configure some details about our new Bucket, like versioning and logging.

Block Public Access Bucket Amazon AWS

And then, we can specify the permissions for this new Bucket.

S3 Buckets Amazon AWS

Now that we have our Bucket created, let’s see how we can use it to store our PostgreSQL backups.

First, let’s test our client connecting it to S3.

[root@PG1bkp ~]# aws s3 ls

2019-08-23 19:29:02 s9stesting1

It works! The previous command lists the currently created Buckets.

So, now, we can just upload a backup to the S3 service. For this, we can use the aws s3 sync or aws s3 cp commands.

[root@PG1bkp ~]# aws s3 sync /root/backups/BACKUP-5/ s3://s9stesting1/backups/

upload: backups/BACKUP-5/cmon_backup.metadata to s3://s9stesting1/backups/cmon_backup.metadata

upload: backups/BACKUP-5/cmon_backup.log to s3://s9stesting1/backups/cmon_backup.log

upload: backups/BACKUP-5/base.tar.gz to s3://s9stesting1/backups/base.tar.gz

[root@PG1bkp ~]# 

[root@PG1bkp ~]# aws s3 cp /root/backups/BACKUP-6/pg_dump_2019-08-23_205919.sql.gz s3://s9stesting1/backups/

upload: backups/BACKUP-6/pg_dump_2019-08-23_205919.sql.gz to s3://s9stesting1/backups/pg_dump_2019-08-23_205919.sql.gz

[root@PG1bkp ~]# 

We can check the Bucket content from the AWS web site.

S3 Overview

Or even by using the AWS CLI.

[root@PG1bkp ~]# aws s3 ls s3://s9stesting1/backups/

2019-08-23 19:29:31          0

2019-08-23 20:58:36    2974633 base.tar.gz

2019-08-23 20:58:36       1742 cmon_backup.log

2019-08-23 20:58:35       2419 cmon_backup.metadata

2019-08-23 20:59:52       1028 pg_dump_2019-08-23_205919.sql.gz

For more information about AWS S3 CLI, you can check the official AWS documentation.

Amazon S3 Glacier

This is the lower-cost version of Amazon S3. The main differences between them are retrieval speed and accessibility: you can use Amazon S3 Glacier if the cost of storage needs to stay low and you don’t require millisecond access to your data. Usage is another important difference between them.

How to use it

Instead of Buckets, Amazon S3 Glacier uses Vaults, containers for storing archives. So, the first step is to access the Amazon S3 Glacier Management Console and create a new Vault.

Create Vault S3 Glacier

Here, we need to add the Vault Name and the Region and, in the next step, we can enable event notifications via the Amazon Simple Notification Service (Amazon SNS).

Now that we have our Vault created, we can access it from the AWS CLI.

[root@PG1bkp ~]# aws glacier describe-vault --account-id - --vault-name s9stesting2

{

    "SizeInBytes": 0,

    "VaultARN": "arn:aws:glacier:us-east-1:984227183428:vaults/s9stesting2",

    "NumberOfArchives": 0,

    "CreationDate": "2019-08-23T21:08:07.943Z",

    "VaultName": "s9stesting2"

}

It’s working. So now, we can upload our backup here.

[root@PG1bkp ~]# aws glacier upload-archive --body /root/backups/BACKUP-6/pg_dump_2019-08-23_205919.sql.gz --account-id - --archive-description "Backup upload test" --vault-name s9stesting2

{

    "archiveId": "ddgCJi_qCJaIVinEW-xRl4I_0u2a8Ge5d2LHfoFBlO6SLMzG_0Cw6fm-OLJy4ZH_vkSh4NzFG1hRRZYDA-QBCEU4d8UleZNqsspF6MI1XtZFOo_bVcvIorLrXHgd3pQQmPbxI8okyg",

    "checksum": "258faaa90b5139cfdd2fb06cb904fe8b0c0f0f80cba9bb6f39f0d7dd2566a9aa",

    "location": "/984227183428/vaults/s9stesting2/archives/ddgCJi_qCJaIVinEW-xRl4I_0u2a8Ge5d2LHfoFBlO6SLMzG_0Cw6fm-OLJy4ZH_vkSh4NzFG1hRRZYDA-QBCEU4d8UleZNqsspF6MI1XtZFOo_bVcvIorLrXHgd3pQQmPbxI8okyg"

}

One important thing to note: the Vault inventory is updated about once per day, so we may need to wait to see the uploaded file reflected.

[root@PG1bkp ~]# aws glacier describe-vault --account-id - --vault-name s9stesting2

{

    "SizeInBytes": 33796,

    "VaultARN": "arn:aws:glacier:us-east-1:984227183428:vaults/s9stesting2",

    "LastInventoryDate": "2019-08-24T06:37:02.598Z",

    "NumberOfArchives": 1,

    "CreationDate": "2019-08-23T21:08:07.943Z",

    "VaultName": "s9stesting2"

}

Here we have our file uploaded on our S3 Glacier Vault.

For more information about AWS Glacier CLI, you can check the official AWS documentation.

EC2

This backup storage option is the most expensive and time-consuming one, but it’s useful if you want full control over the backup storage environment and wish to perform custom tasks on the backups (e.g. backup verification.)

Amazon EC2 (Elastic Compute Cloud) is a web service that provides resizable compute capacity in the cloud. It provides you with complete control of your computing resources and allows you to set up and configure everything about your instances from your operating system up to your applications. It also allows you to quickly scale capacity, both up and down, as your computing requirements change.

Amazon EC2 supports different operating systems like Amazon Linux, Ubuntu, Windows Server, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Fedora, Debian, CentOS, Gentoo Linux, Oracle Linux, and FreeBSD.

How to use it

Go to the Amazon EC2 section, and press on Launch Instance. In the first step, you must choose the EC2 instance operating system.

EC2 Choose an Amazon Machine Image (AMI)

In the next step, you must choose the resources for the new instance.

Choose an Instance Type AWS

Then, you can specify more detailed configuration like network, subnet, and more.

Configure Instance Details - AWS

Now, we can add more storage capacity to this new instance, which, for a backup server, we should do.

Add Storage AWS

When we finish the creation task, we can go to the Instances section to see our new EC2 instance.

Launch AWS EC2 Instance

When the instance is ready (Instance State: running), you can store backups there, for example by sending them over SSH or FTP using the Public DNS name created by AWS. Let’s see an example with rsync and another one with the scp Linux command.

[root@PostgreSQL1 ~]# rsync -avzP -e "ssh -i /home/user/key1.pem" /root/backups/BACKUP-11/base.tar.gz ubuntu@ec2-3-87-167-157.compute-1.amazonaws.com:/backups/20190823/

sending incremental file list

base.tar.gz

      4,091,563 100%    2.18MB/s 0:00:01 (xfr#1, to-chk=0/1)



sent 3,735,675 bytes  received 35 bytes 574,724.62 bytes/sec

total size is 4,091,563  speedup is 1.10

[root@PostgreSQL1 ~]# 

[root@PostgreSQL1 ~]# scp -i /tmp/key1.pem /root/backups/BACKUP-12/pg_dump_2019-08-25_211903.sql.gz ubuntu@ec2-3-87-167-157.compute-1.amazonaws.com:/backups/20190823/

pg_dump_2019-08-25_211903.sql.gz                                                                                                                                        100% 24KB 76.4KB/s 00:00

AWS Backup

AWS Backup is a centralized backup service that provides you with backup management capabilities, such as backup scheduling, retention management, and backup monitoring, as well as additional features, such as lifecycling backups to a low-cost storage tier, backup storage, and encryption that is independent of its source data, and backup access policies.

You can use AWS Backup to manage backups of EBS volumes, RDS databases, DynamoDB tables, EFS file systems, and Storage Gateway volumes.

How to use it

Go to the AWS Backup section on the AWS Management Console.

AWS Backup

Here you have different options, such as Schedule, Create or Restore a backup. Let’s see how to create a new backup.

Create On Demand Backup AWS Backup

In this step, we must choose the Resource Type, which can be DynamoDB, RDS, EBS, EFS, or Storage Gateway, plus details like the expiration date, backup vault, and IAM Role.

AWS Backup Jobs

Then, we can see the new job created in the AWS Backup Jobs section.

Snapshot

Now, let’s mention an option well known in all virtualization environments. A snapshot is a backup taken at a specific point in time, and AWS lets us use it across its products. Let’s see an example of an RDS snapshot.

AWS DB Snapshot

We only need to choose the instance and add the snapshot name, and that’s it. We can see this and the previous snapshot in the RDS Snapshot section.

Amazon RDS Snapshots

Managing Your Backups with ClusterControl

ClusterControl is a comprehensive management system for open source databases that automates deployment and management functions, as well as health and performance monitoring. ClusterControl supports deployment, management, monitoring and scaling for different database technologies and environments, EC2 included. So, we can, for example, create our EC2 instance on AWS, and deploy/import our database service with ClusterControl.

ClusterControl Database Clusters

Creating a Backup

For this task, go to ClusterControl -> Select Cluster -> Backup -> Create Backup.

ClusterControl Create Backup

We can create a new backup or configure a scheduled one. For our example, we’ll create a single backup instantly.

ClusterControl Create Backup Details

We must choose one method, the server from which the backup will be taken, and where we want to store the backup. We can also upload our backup to the cloud (AWS, Google or Azure) by enabling the corresponding button.

ClusterControl Create Backup Settings

Then we specify the use of compression, the compression level, encryption and retention period for our backup.

ClusterControl Create Backup Cloud Settings

If we enabled the upload backup to the cloud option, we’ll see a section to specify the cloud provider (in this case AWS) and the credentials (ClusterControl -> Integrations -> Cloud Providers). For AWS, it uses the S3 service, so we must select a Bucket or even create a new one to store our backups.

ClusterControl Backup Overview

In the Backup section, we can see the progress of the backup, and information like method, size, location, and more.

Conclusion

Amazon AWS allows us to store our PostgreSQL backups, whether we’re using it as a database cloud provider or not. To have an effective backup plan, you should consider storing at least one database backup copy in the cloud to avoid data loss in the event of hardware failure in another backup store. The cloud lets you store as many backups as you are willing to pay for.

An Overview of the Various Scan Methods in PostgreSQL


Any relational database engine must generate the best possible plan, i.e. the one that executes the query with the least time and resources. Generally, all databases generate plans in a tree structure, where each leaf node of the plan tree is called a table scan node. This particular node of the plan corresponds to the algorithm used to fetch data from the base table.

For example, consider a simple query such as SELECT * FROM TBL1, TBL2 WHERE TBL2.ID > 1000; and suppose the generated plan is as below:

PostgreSQL Sample Plan Tree

In the above plan tree, “Sequential Scan on TBL1” and “Index Scan on TBL2” correspond to the table scan methods on tables TBL1 and TBL2 respectively. As per this plan, TBL1 will be fetched sequentially from its pages and TBL2 will be accessed using an Index Scan.

Choosing the right scan method as part of the plan is very important in terms of overall query performance.

Before getting into all the types of scan methods supported by PostgreSQL, let’s review some of the major key points which will be used frequently as we go through the blog.

PostgreSQL Data Layout
  • HEAP: Storage area for the whole rows of the table. It is divided into multiple pages (as shown in the picture above), each 8KB by default. Within each page, item pointers (e.g. 1, 2, …) point to the data within the page.
  • Index Storage: This storage holds only key values, i.e. the column values contained in the index. It is also divided into multiple pages, each 8KB by default.
  • Tuple Identifier (TID): A TID is a 6-byte number consisting of two parts: a 4-byte page number and a 2-byte tuple index within the page. The combination of these two numbers uniquely identifies the storage location of a particular tuple.
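As an illustration of this layout, the 6-byte split can be modeled in a few lines (a sketch only, not PostgreSQL’s actual internal representation; the helper names are made up):

```python
# Illustrative sketch of the TID layout: a 4-byte page number followed
# by a 2-byte tuple index, packed into exactly 6 bytes.
import struct

def tid_pack(page: int, offset: int) -> bytes:
    # ">IH" = big-endian unsigned 4-byte int + unsigned 2-byte short
    return struct.pack(">IH", page, offset)

def tid_unpack(tid: bytes) -> tuple:
    return struct.unpack(">IH", tid)

tid = tid_pack(115, 42)          # the same ctid value (115,42) appears later
assert len(tid) == 6             # a TID is exactly 6 bytes
assert tid_unpack(tid) == (115, 42)
```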

Currently, PostgreSQL supports the following scan methods by which all required data can be read from a table:

  • Sequential Scan
  • Index Scan
  • Index Only Scan
  • Bitmap Scan
  • TID Scan

Each of these scan methods is useful depending on the query and other parameters, e.g. table cardinality, table selectivity, disk I/O cost, random I/O cost, sequential I/O cost, etc. Let’s create a sample table and populate it with some data, which will be used throughout to explain these scan methods.

postgres=# CREATE TABLE demotable (num numeric, id int);

CREATE TABLE

postgres=# CREATE INDEX demoidx ON demotable(num);

CREATE INDEX

postgres=# INSERT INTO demotable SELECT random() * 1000,  generate_series(1, 1000000);

INSERT 0 1000000

postgres=# analyze;

ANALYZE

So in this example, one million records are inserted and the table is analyzed so that all statistics are up to date.

Sequential Scan

As the name suggests, a sequential scan of a table reads all item pointers of all pages of the table, in order. So if a table has 100 pages with 1000 records per page, a sequential scan fetches all 100*1000 records and checks each one against the isolation level and the predicate clause. Even if only 1 record qualifies, it still has to scan all 100K records to find it.
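The behavior described above can be sketched as a toy model (illustrative only; the page and row counts match the example in the text):

```python
# Toy sequential scan: every item pointer on every page is visited,
# even when only a single row matches the predicate.
pages = [[page * 1000 + i for i in range(1000)] for page in range(100)]

def seq_scan(pages, predicate):
    visited, matches = 0, []
    for page in pages:
        for row in page:
            visited += 1
            if predicate(row):
                matches.append(row)
    return visited, matches

visited, matches = seq_scan(pages, lambda row: row == 42)
assert visited == 100 * 1000     # all 100K records are read...
assert matches == [42]           # ...to find one qualifying record
```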

As per the above table and data, the following query will result in a sequential scan because the majority of the data is selected.

postgres=# explain SELECT * FROM demotable WHERE num < 21000;

                             QUERY PLAN

--------------------------------------------------------------------

 Seq Scan on demotable  (cost=0.00..17989.00 rows=1000000 width=15)

   Filter: (num < '21000'::numeric)

(2 rows)

NOTE

Without calculating and comparing plan costs it is almost impossible to tell which kind of scan will be used, but for a sequential scan to be chosen at least the following criteria should hold:

  1. No index is available on the key that is part of the predicate.
  2. The majority of rows are fetched as part of the SQL query.

TIPS

If only a small percentage of rows is fetched and the predicate is on one (or more) columns, try evaluating performance with and without an index.

Index Scan

Unlike a sequential scan, an index scan does not fetch all records sequentially. Rather, it uses a data structure (depending on the type of index) corresponding to the index involved in the query and locates the required data (as per the predicate) with very few page reads. The entry found in the index points directly to the data in the heap area (as shown in the figure above), which is then fetched to check visibility as per the isolation level. So there are two steps in an index scan:

  • Fetch data from the index-related data structure. It returns the TID of the corresponding data in the heap.
  • Then the corresponding heap page is accessed directly to get the whole row. This additional step is required for the following reasons:
    • The query might request more columns than are available in the corresponding index.
    • Visibility information is not maintained along with index data, so in order to check the visibility of data as per the isolation level, the heap data must be accessed.

Now we may wonder why we do not always use an index scan if it is so efficient. As we know, everything comes at some cost, and here the cost is the type of I/O involved. An index scan incurs random I/O: for each record found in index storage, it has to fetch the corresponding data from heap storage, whereas a sequential scan uses sequential I/O, which takes roughly just 25% of the time of random I/O.
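A back-of-the-envelope sketch of this trade-off, using PostgreSQL’s default planner cost settings (seq_page_cost = 1.0, random_page_cost = 4.0) and deliberately ignoring index page reads and caching:

```python
# Simplified page-I/O cost model based on PostgreSQL's default planner
# settings: seq_page_cost = 1.0, random_page_cost = 4.0.
SEQ_PAGE_COST = 1.0
RANDOM_PAGE_COST = 4.0

def seq_scan_cost(total_pages):
    # A sequential scan reads every heap page, sequentially.
    return total_pages * SEQ_PAGE_COST

def index_scan_cost(matching_rows):
    # Worst case: each matching row triggers one random heap page fetch.
    return matching_rows * RANDOM_PAGE_COST

# On a 10,000-page table, fetching 100 rows clearly favors the index...
assert index_scan_cost(100) < seq_scan_cost(10_000)
# ...but fetching 5,000 rows already makes the sequential scan cheaper.
assert seq_scan_cost(10_000) < index_scan_cost(5_000)
```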

So an index scan should be chosen only if the overall gain outweighs the overhead incurred by the random I/O cost.

As per the above table and data, the following query will result in an index scan as only one record is selected, so random I/O is minimal and locating the corresponding record via the index is quick.

postgres=# explain SELECT * FROM demotable WHERE num = 21000;

                                QUERY PLAN

--------------------------------------------------------------------------

 Index Scan using demoidx on demotable  (cost=0.42..8.44 rows=1 width=15)

   Index Cond: (num = '21000'::numeric)

(2 rows)

Index Only Scan

An index-only scan is similar to an index scan except for the second step: as the name implies, it scans only the index data structure. There are two additional preconditions for choosing an index-only scan over an index scan:

  • The query must fetch only key columns that are part of the index.
  • All tuples (records) on the selected heap page must be visible. As discussed in the previous section, the index data structure does not maintain visibility information, so selecting data from the index alone requires skipping the visibility check, which is possible only if all tuples on that page are known to be visible.
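The visibility precondition can be modeled with a toy sketch (hypothetical data structures; PostgreSQL actually consults the visibility map, one all-visible bit per heap page):

```python
# Toy model: an index-only scan may skip the heap only for pages whose
# "all tuples visible" bit is set; otherwise the heap must be rechecked.
visibility_map = {0: True, 1: False}            # page -> all tuples visible?
index_entries = {21000: (0, 7), 31000: (1, 3)}  # key -> (page, offset)

def index_only_lookup(key):
    page, offset = index_entries[key]
    if visibility_map.get(page, False):
        return ("index only", key)    # key served from the index alone
    return ("heap recheck", key)      # heap visit needed for visibility

assert index_only_lookup(21000) == ("index only", 21000)
assert index_only_lookup(31000) == ("heap recheck", 31000)
```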

The following query will result in an index-only scan. Even though it is almost identical to the previous query in terms of the records selected, since only the key field (i.e. “num”) is selected, it will choose an index-only scan.

postgres=# explain SELECT num FROM demotable WHERE num = 21000;

                                  QUERY PLAN

-----------------------------------------------------------------------------

Index Only Scan using demoidx on demotable  (cost=0.42..8.44 rows=1 Width=11)

   Index Cond: (num = '21000'::numeric)

(2 rows)

Bitmap Scan

A bitmap scan is a mix of an index scan and a sequential scan. It tries to address the disadvantage of the index scan while keeping its full advantage. As discussed above, for each entry found in the index data structure, the corresponding data must be found in a heap page, alternating between index page and heap page fetches and causing a lot of random I/O. The bitmap scan method leverages the benefit of the index scan without the random I/O. It works in two stages, as below:

  • Bitmap Index Scan: First it fetches all matching entries from the index data structure and creates a bitmap of all TIDs. As a simple mental model, you can consider this bitmap a hash of all pages (hashed by page number), where each page entry contains an array of all offsets within that page.
  • Bitmap Heap Scan: As the name implies, it reads through the bitmap of pages and then scans the heap data corresponding to each stored page and offset. At the end, it checks visibility and the predicate, etc., and returns the tuples that pass all of these checks.
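The two stages can be sketched as follows (a toy model with made-up index entries; a real bitmap is a compact bit array, not a Python dict):

```python
# Toy model of the two bitmap stages: collect matching TIDs from an
# "index", group them per page, then visit heap pages in physical order.
from collections import defaultdict

# Hypothetical index entries: (key, (page, offset)) pairs, pages unordered.
index_entries = [(5, (9, 1)), (7, (2, 4)), (5, (2, 1)), (9, (9, 3))]

def bitmap_index_scan(entries, predicate):
    # Stage 1: build {page: sorted offsets} for every TID matching the predicate.
    bitmap = defaultdict(list)
    for key, (page, offset) in entries:
        if predicate(key):
            bitmap[page].append(offset)
    return {page: sorted(offsets) for page, offsets in bitmap.items()}

def bitmap_heap_scan(bitmap):
    # Stage 2: walk pages in order, so each heap page is read only once.
    for page in sorted(bitmap):
        for offset in bitmap[page]:
            yield (page, offset)

bitmap = bitmap_index_scan(index_entries, lambda key: key < 8)
assert bitmap == {2: [1, 4], 9: [1]}
assert list(bitmap_heap_scan(bitmap)) == [(2, 1), (2, 4), (9, 1)]
```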

The query below will result in a bitmap scan, as it selects too many records for an index scan, yet too few for a sequential scan.

postgres=# explain SELECT * FROM demotable WHERE num < 210;

                                  QUERY PLAN

-----------------------------------------------------------------------------

 Bitmap Heap Scan on demotable  (cost=5883.50..14035.53 rows=213042 width=15)

   Recheck Cond: (num < '210'::numeric)

   ->  Bitmap Index Scan on demoidx  (cost=0.00..5830.24 rows=213042 width=0)

      Index Cond: (num < '210'::numeric)

(4 rows)

Now consider the query below, which selects the same number of records but only key fields (i.e. only index columns). Since it selects only the key, it does not need to refer to heap pages for other parts of the data, and hence there is no random I/O involved. So this query will choose an index-only scan instead of a bitmap scan.

postgres=# explain SELECT num FROM demotable WHERE num < 210;

                                   QUERY PLAN

---------------------------------------------------------------------------------------

 Index Only Scan using demoidx on demotable  (cost=0.42..7784.87 rows=208254 width=11)

   Index Cond: (num < '210'::numeric)

(2 rows)

TID Scan

A TID, as mentioned above, is a 6-byte number consisting of a 4-byte page number and a 2-byte tuple index within the page. The TID scan is a very specific kind of scan in PostgreSQL and gets selected only if there is a TID in the query predicate. Consider the query below demonstrating a TID scan:

postgres=# select ctid from demotable where id=21000;

   ctid

----------

 (115,42)

(1 row) 

postgres=# explain select * from demotable where ctid='(115,42)';

                        QUERY PLAN

----------------------------------------------------------

 Tid Scan on demotable  (cost=0.00..4.01 rows=1 width=15)

   TID Cond: (ctid = '(115,42)'::tid)

(2 rows)

So here, instead of giving an exact column value as the condition, the TID is provided in the predicate. This is similar to a ROWID-based search in Oracle.

Bonus

All of the scan methods above are widely used and available in almost all relational databases. But there is another scan method recently under discussion in the PostgreSQL community, which has recently been added to other relational databases. It is called “Loose Index Scan” in MySQL, “Index Skip Scan” in Oracle, and “Jump Scan” in DB2.

This scan method is used for the specific scenario where distinct values of the leading key column of a B-Tree index are selected. As part of this scan, it avoids traversing all entries with equal key values; instead it reads just the first occurrence of each unique value and then jumps to the next larger one.
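The idea can be sketched against a plain sorted list standing in for the index (illustrative only; a real B-Tree descends from the root to skip ahead rather than binary-searching an array):

```python
# Toy "skip scan": return distinct leading-key values from a sorted
# index by jumping past each run of duplicates instead of reading
# every entry.
from bisect import bisect_right

def skip_scan_distinct(sorted_keys):
    distinct, probes, pos = [], 0, 0
    while pos < len(sorted_keys):
        value = sorted_keys[pos]
        distinct.append(value)
        probes += 1
        # Jump straight past all duplicates of `value`.
        pos = bisect_right(sorted_keys, value, pos)
    return distinct, probes

keys = [1] * 1000 + [2] * 1000 + [3] * 1000
values, probes = skip_scan_distinct(keys)
assert values == [1, 2, 3]
assert probes == 3          # 3 probes instead of 3000 sequential reads
```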

This work is still in progress in PostgreSQL under the tentative name “Index Skip Scan”, and we may expect to see it in a future release.

Cloud Vendor Deep-Dive: PostgreSQL on Google Cloud Platform (GCP)


Where to Start?

The best place I could find to start was none other than the official documentation. There is also a GCP YouTube channel for those who prefer multimedia. Once I found myself in Cloud SQL documentation land, I turned to the Concepts section, where we are promised to “develop a deep understanding” of the product.

So let’s get started!

PostgreSQL Google Cloud Features

Google Cloud SQL for PostgreSQL offers all the standard features we’d expect from a managed solution: high availability with automatic failover, automatic backups, encryption at rest and in transit, advanced logging and monitoring, and of course a rich API to interact with all services.

And for a bit of history: PostgreSQL support started in March 2017; until then the only supported database engine was MySQL.

Cloud SQL runs PostgreSQL on Google’s Second Generation computing platform. The full list of features is available here and also here. Reviewing the former it is apparent that there was never a First Generation platform for PostgreSQL.

Databases running on the Second Generation platform are expected to run at speeds up to 7x faster and benefit from 20x more storage capacity. The blog announcing the Second Generation platform goes into the details of running the sysbench test to compare Google Cloud SQL with its then main competitor, AWS, in both incarnations: RDS and Aurora. The results did surprise me, as they show Cloud SQL performing better, whereas the recent tests performed using the AWS Benchmark, released about a year later, concluded the opposite. That was around the same time PostgreSQL support became available. While I’m itching at the idea of running the benchmark myself, I’m guessing that there are two potential factors that could have influenced the results: Google’s sysbench benchmark used different parameters, and AWS may have improved their products during that time.

GCP PostgreSQL Compatibility

As expected, Google Cloud SQL for PostgreSQL is almost a drop-in replacement for the community version and supports the PL/pgSQL procedural language.

Some features are not available due to security reasons, for example SUPERUSER access. Other features were removed due to potential risks posed to product stability and performance. Lastly, some options and parameters cannot be changed, although requests to change that behavior can be made via the Cloud SQL Discussion Group.

Cloud SQL is also wire compatible with the PostgreSQL protocol.

When it comes to transaction isolation Cloud SQL follows the PostgreSQL default behavior, defaulting to Read Committed isolation level.

For some of the server configuration parameters, Cloud SQL implements different ranges for reasons unexplained in the documentation, still an important thing to remember.

Networking

There are multiple ways for connecting to the database, depending on whether the instance is on a private network or a public network (applications connecting from outside GCP). Common to both cases is the predefined VPC managed by Google where all Cloud SQL database instances reside.

Private IP

Clients connecting to a private IP address are routed via a peering connection between the VPCs hosting the client and respectively the database instance. Although not specific to PostgreSQL it is important to review the network requirements, in order to avoid connection issues. One gotcha: once enabled, the private IP capability cannot be removed.

Connecting from External Applications

Connections from applications hosted outside GCP can, and should, be encrypted. Additionally, in order to guard against various attacks, applications must install the provided client certificate. The procedure for generating and configuring the certificates is somewhat complicated, requiring custom tools to ensure that certificates are renewed periodically. That may be one of the reasons why Google offers the option of using the Cloud SQL Proxy.

Connecting Using Cloud SQL Proxy

The setup is fairly straightforward, which in fact, I’ve found to be the case for all instructions in the Google Cloud SQL documentation. On a related note, submitting documentation feedback is dead simple, and the screenshot feature was a first for me.

There are multiple ways to authorize proxy connections and I chose to configure a service account, just as outlined in the Cloud SQL Proxy documentation.

Once everything is in place it’s time to start the proxy:

~/usr/local/google $ ./cloud_sql_proxy -instances=omiday:us-west1:s9s201907141919=tcp:5432 -credential_file=omiday-427c34fce588.json

2019/07/14 21:22:43 failed to setup file descriptor limits: failed to set rlimit {&{8500 4096}} for max file descriptors: invalid argument

2019/07/14 21:22:43 using credential file for authentication; email=cloud-sql-proxy@omiday.iam.gserviceaccount.com

2019/07/14 21:22:43 Listening on 127.0.0.1:5432 for omiday:us-west1:s9s201907141919

2019/07/14 21:22:43 Ready for new connections

To connect to the remote instance we are now using the proxy by specifying localhost instead of the instance public IP address:

~ $ psql "user=postgres dbname=postgres password=postgres hostaddr=127.0.0.1"

Pager usage is off.

psql (11.4, server 9.6.11)

Type "help" for help.

Note that there is no encryption since we are connecting locally and the proxy takes care of encrypting the traffic flowing into the cloud.

A common DBA task is viewing the connections to the database by querying pg_stat_activity. The documentation states that proxy connections will be displayed as cloudsqlproxy~1.2.3.4, so I wanted to verify that claim. I opened two sessions as postgres, one via the proxy and the other from my home address, so the following query will do:

postgres@127:5432 postgres> select * from pg_stat_activity where usename = 'postgres';

-[ RECORD 1 ]----+-----------------------------------------------------------

datid            | 12996

datname          | postgres

pid              | 924

usesysid         | 16389

usename          | postgres

application_name | psql

client_addr      |

client_hostname  |

client_port      | -1

backend_start    | 2019-07-15 04:25:37.614205+00

xact_start       | 2019-07-15 04:28:43.477681+00

query_start      | 2019-07-15 04:28:43.477681+00

state_change     | 2019-07-15 04:28:43.477684+00

wait_event_type  |

wait_event       |

state            | active

backend_xid      |

backend_xmin     | 8229

query            | select * from pg_stat_activity where usename = 'postgres';

-[ RECORD 2 ]----+-----------------------------------------------------------

datid            | 12996

datname          | postgres

pid              | 946

usesysid         | 16389

usename          | postgres

application_name | psql

client_addr      | <MY_HOME_IP_ADDRESS>

client_hostname  |

client_port      | 60796

backend_start    | 2019-07-15 04:27:50.378282+00

xact_start       |

query_start      |

state_change     | 2019-07-15 04:27:50.45613+00

wait_event_type  |

wait_event       |

state            | idle

backend_xid      |

backend_xmin     |

query            |

It appears that the proxy connections are instead identified by client_port == -1 and an empty client_addr. This can be further confirmed by comparing the backend_start timestamp above with the proxy log below:

2019/07/14 21:25:37 New connection for "omiday:us-west1:s9s201907141919"
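Based on that observation, a small hypothetical helper (operating on pg_stat_activity rows already fetched as dictionaries, with made-up session data) could flag proxy sessions:

```python
# Hypothetical helper: classify pg_stat_activity rows based on the
# observation that Cloud SQL Proxy sessions show client_port = -1 and
# an empty client_addr.
def is_proxy_connection(row: dict) -> bool:
    return row.get("client_port") == -1 and not row.get("client_addr")

sessions = [
    {"pid": 924, "client_addr": None, "client_port": -1},            # proxy
    {"pid": 946, "client_addr": "203.0.113.7", "client_port": 60796},
]
proxy_pids = [s["pid"] for s in sessions if is_proxy_connection(s)]
assert proxy_pids == [924]
```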

PostgreSQL High Availability on Google Cloud

Google Cloud SQL for PostgreSQL ensures high availability using low level storage data synchronization by means of regional persistent disks. Failover is automatic, with a heartbeat check interval of one second, and a failover triggered after about 60 seconds.

Performance and Monitoring

The Performance section of the documentation points out general cloud rules of thumb: keep the database (both the writer and read replicas) close to the application, and scale the instance vertically. What stands out is the recommendation to provision an instance with at least 60 GB of RAM when performance is important.

Stackdriver provides monitoring and logging, as well as access to the PostgreSQL logs:

Stackdriver PostgreSQL Logs

Access Control

This is implemented at project, instance and database level.

Project Access Control

Project access control is the cloud-specific access control: it uses the concept of IAM roles to allow project members (users, groups, or service accounts) access to various Cloud SQL resources. The list of roles is somewhat self-explanatory; for a detailed description of each role and its associated permissions refer to the APIs Explorer, or the Cloud SQL Admin API for one of the supported programming languages.

To demonstrate how IAM roles work let’s create a read-only (viewer) service account:

IAM Service Account setup

Start a new proxy instance on port 5433 using the service account associated with the viewer role:

~/usr/local/google $ ./cloud_sql_proxy -instances=omiday:us-west1:s9s201907141919=tcp:5433 -credential_file=omiday-4508243deca9.json

2019/07/14 21:49:56 failed to setup file descriptor limits: failed to set rlimit {&{8500 4096}} for max file descriptors: invalid argument

2019/07/14 21:49:56 using credential file for authentication; email=cloud-sql-proxy-read-only@omiday.iam.gserviceaccount.com

2019/07/14 21:49:56 Listening on 127.0.0.1:5433 for omiday:us-west1:s9s201907141919

2019/07/14 21:49:56 Ready for new connections

Open a psql connection to 127.0.0.1:5433:

~ $ psql "user=postgres dbname=postgres password=postgres hostaddr=127.0.0.1 port=5433"

The command exits with:

psql: server closed the connection unexpectedly

      This probably means the server terminated abnormally

      before or while processing the request.

Oops! Let’s check the proxy logs:

2019/07/14 21:50:33 New connection for "omiday:us-west1:s9s201907141919"

2019/07/14 21:50:33 couldn't connect to "omiday:us-west1:s9s201907141919": ensure that the account has access to "omiday:us-west1:s9s201907141919" (and make sure there's no typo in that name). Error during createEphemeral for omiday:us-west1:s9s201907141919: googleapi: Error 403: The client is not authorized to make this request., notAuthorized

Instance Access Control

Instance-level access is dependent on the connection source:

Access based on the connection source

The combination of authorization methods replaces the ubiquitous pg_hba.conf.

Backup and Recovery

By default automated backups are enabled:

Automated Backups

While backups do not affect database read and write operations, they do impact performance, and therefore it is recommended to schedule backups during periods of lower activity.

For redundancy, backups can be stored in two regions (additional charges apply) with the option of selecting custom locations.

In order to save on storage space, use compression: .gz compressed files are restored transparently.

Cloud SQL also supports instance cloning. For the smallest dataset the operation took about 3 minutes:

Cloning start time 10:07:10:

PostgreSQL logs for cloned instance

The PostgreSQL logs show that PostgreSQL became available on the cloned instance at 10:10:47:

PostgreSQL logs for cloned instance

That is still an easier way than backup and restore to create a copy of an instance for testing, development, or troubleshooting purposes.

Google Cloud Best Practices for PostgreSQL

  • Configure an activation policy for instances that are not required to be running 24/7.
  • Place the database instance in the same zone, or region, with the compute engine instances and App Engine applications in order to avoid network latency.
  • Create the database instance in the same zone as the Compute Engine instances. If using any other connection type, accept the default zone.
  • Users created using Cloud SQL are by default cloud superusers. Use PostgreSQL ALTER ROLE to modify their permissions.
  • Use the latest Cloud SQL Proxy version.
  • Instance names should include a timestamp in order to be able to reuse the name when deleting and recreating instances.
  • pg_dump defaults to including large objects. If the database contains BLOBs, perform the dump during periods of low activity to prevent the instance from becoming unresponsive.
  • Use gcloud sql connect to quickly connect from an external client without the need to whitelist the client IP address.
  • Subscribe to the announce group in order to receive notifications on product updates and alerts such as issues when creating instances:

Google Cloud SQL announce group

Maintenance timing options

Launch Checklist for Cloud SQL

The checklist section in the documentation provides an overview of recommended activities when setting up a production-ready Cloud SQL for PostgreSQL instance. In particular, applications must be designed to handle Cloud SQL restarts. Also, while there are no queries-per-second limits, there are connection limits.

PostgreSQL GCP Extensions Support

Cloud SQL supports most of the PostgreSQL extensions. As of this writing, out of 52 community extensions there are 22 unsupported extensions, plus 2 unsupported PostGIS extensions:

  • postgis_raster
  • postgis_sfcgal

For PostgreSQL extensions we can either review the PostgreSQL contrib repository, or better, diff the output of pg_available_extensions:

Upstream:

~ $ psql -U postgres -p 54396

Pager usage is off.

psql (11.4, server 9.6.14)

Type "help" for help.

postgres@[local]:54396 postgres# select * from pg_available_extensions order by name;

      name        | default_version | installed_version |                               comment

--------------------+-----------------+-------------------+----------------------------------------------------------------------

adminpack          | 1.1 |                   | administrative functions for PostgreSQL

autoinc            | 1.0 |                   | functions for autoincrementing fields

bloom              | 1.0 |                   | bloom access method - signature file based index

btree_gin          | 1.0 |                   | support for indexing common datatypes in GIN

btree_gist         | 1.2 |                   | support for indexing common datatypes in GiST

chkpass            | 1.0 |                   | data type for auto-encrypted passwords

citext             | 1.3 |                   | data type for case-insensitive character strings

cube               | 1.2 |                   | data type for multidimensional cubes

dblink             | 1.2 |                   | connect to other PostgreSQL databases from within a database

dict_int           | 1.0 |                   | text search dictionary template for integers

dict_xsyn          | 1.0 |                   | text search dictionary template for extended synonym processing

earthdistance      | 1.1 |                   | calculate great-circle distances on the surface of the Earth

file_fdw           | 1.0 |                   | foreign-data wrapper for flat file access

fuzzystrmatch      | 1.1 |                   | determine similarities and distance between strings

hstore             | 1.4 |                   | data type for storing sets of (key, value) pairs

hstore_plperl      | 1.0 |                   | transform between hstore and plperl

hstore_plperlu     | 1.0 |                   | transform between hstore and plperlu

hstore_plpython2u  | 1.0 |                   | transform between hstore and plpython2u

hstore_plpythonu   | 1.0 |                   | transform between hstore and plpythonu

insert_username    | 1.0 |                   | functions for tracking who changed a table

intagg             | 1.1 |                   | integer aggregator and enumerator (obsolete)

intarray           | 1.2 |                   | functions, operators, and index support for 1-D arrays of integers

isn                | 1.1 |                   | data types for international product numbering standards

lo                 | 1.1 |                   | Large Object maintenance

ltree              | 1.1 |                   | data type for hierarchical tree-like structures

ltree_plpython2u   | 1.0 |                   | transform between ltree and plpython2u

ltree_plpythonu    | 1.0 |                   | transform between ltree and plpythonu

moddatetime        | 1.0 |                   | functions for tracking last modification time

pageinspect        | 1.5 |                   | inspect the contents of database pages at a low level

pg_buffercache     | 1.2 |                   | examine the shared buffer cache

pg_freespacemap    | 1.1 |                   | examine the free space map (FSM)

pg_prewarm         | 1.1 |                   | prewarm relation data

pg_stat_statements | 1.4 |                   | track execution statistics of all SQL statements executed

pg_trgm            | 1.3 |                   | text similarity measurement and index searching based on trigrams

pg_visibility      | 1.1 |                   | examine the visibility map (VM) and page-level visibility info

pgcrypto           | 1.3 |                   | cryptographic functions

pgrowlocks         | 1.2 |                   | show row-level locking information

pgstattuple        | 1.4 |                   | show tuple-level statistics

plpgsql            | 1.0 | 1.0               | PL/pgSQL procedural language

postgres_fdw       | 1.0 |                   | foreign-data wrapper for remote PostgreSQL servers

refint             | 1.0 |                   | functions for implementing referential integrity (obsolete)

seg                | 1.1 |                   | data type for representing line segments or floating-point intervals

sslinfo            | 1.2 |                   | information about SSL certificates

tablefunc          | 1.0 |                   | functions that manipulate whole tables, including crosstab

tcn                | 1.0 |                   | Triggered change notifications

timetravel         | 1.0 |                   | functions for implementing time travel

tsearch2           | 1.0 |                   | compatibility package for pre-8.3 text search functions

tsm_system_rows    | 1.0 |                   | TABLESAMPLE method which accepts number of rows as a limit

tsm_system_time    | 1.0 |                   | TABLESAMPLE method which accepts time in milliseconds as a limit

unaccent           | 1.1 |                   | text search dictionary that removes accents

uuid-ossp          | 1.1 |                   | generate universally unique identifiers (UUIDs)

xml2               | 1.1 |                   | XPath querying and XSLT

Cloud SQL:

postgres@127:5432 postgres> select * from pg_available_extensions where name !~ '^postgis' order by name;

      name        | default_version | installed_version |                              comment

--------------------+-----------------+-------------------+--------------------------------------------------------------------

bloom              | 1.0 |                   | bloom access method - signature file based index

btree_gin          | 1.0 |                   | support for indexing common datatypes in GIN

btree_gist         | 1.2 |                   | support for indexing common datatypes in GiST

chkpass            | 1.0 |                   | data type for auto-encrypted passwords

citext             | 1.3 |                   | data type for case-insensitive character strings

cube               | 1.2 |                   | data type for multidimensional cubes

dict_int           | 1.0 |                   | text search dictionary template for integers

dict_xsyn          | 1.0 |                   | text search dictionary template for extended synonym processing

earthdistance      | 1.1 |                   | calculate great-circle distances on the surface of the Earth

fuzzystrmatch      | 1.1 |                   | determine similarities and distance between strings

hstore             | 1.4 |                   | data type for storing sets of (key, value) pairs

intagg             | 1.1 |                   | integer aggregator and enumerator (obsolete)

intarray           | 1.2 |                   | functions, operators, and index support for 1-D arrays of integers

isn                | 1.1 |                   | data types for international product numbering standards

lo                 | 1.1 |                   | Large Object maintenance

ltree              | 1.1 |                   | data type for hierarchical tree-like structures

pg_buffercache     | 1.2 |                   | examine the shared buffer cache

pg_prewarm         | 1.1 |                   | prewarm relation data

pg_stat_statements | 1.4 |                   | track execution statistics of all SQL statements executed

pg_trgm            | 1.3 |                   | text similarity measurement and index searching based on trigrams

pgcrypto           | 1.3 |                   | cryptographic functions

pgrowlocks         | 1.2 |                   | show row-level locking information

pgstattuple        | 1.4 |                   | show tuple-level statistics

plpgsql            | 1.0 | 1.0               | PL/pgSQL procedural language

sslinfo            | 1.2 |                   | information about SSL certificates

tablefunc          | 1.0 |                   | functions that manipulate whole tables, including crosstab

tsm_system_rows    | 1.0 |                   | TABLESAMPLE method which accepts number of rows as a limit

tsm_system_time    | 1.0 |                   | TABLESAMPLE method which accepts time in milliseconds as a limit

unaccent           | 1.1 |                   | text search dictionary that removes accents

uuid-ossp          | 1.1 |                   | generate universally unique identifiers (UUIDs)

Unsupported extensions in Cloud SQL:

adminpack          1.1 administrative functions for PostgreSQL

autoinc            1.0 functions for autoincrementing fields

dblink             1.2 connect to other PostgreSQL databases from within a database

file_fdw           1.0 foreign-data wrapper for flat file access

hstore_plperl      1.0 transform between hstore and plperl

hstore_plperlu     1.0 transform between hstore and plperlu

hstore_plpython2u  1.0 transform between hstore and plpython2u

hstore_plpythonu   1.0 transform between hstore and plpythonu

insert_username    1.0 functions for tracking who changed a table

ltree_plpython2u   1.0 transform between ltree and plpython2u

ltree_plpythonu    1.0 transform between ltree and plpythonu

moddatetime        1.0 functions for tracking last modification time

pageinspect        1.5 inspect the contents of database pages at a low level

pg_freespacemap    1.1 examine the free space map (FSM)

pg_visibility      1.1 examine the visibility map (VM) and page-level visibility info

postgres_fdw       1.0 foreign-data wrapper for remote PostgreSQL servers

refint             1.0 functions for implementing referential integrity (obsolete)

seg                1.1 data type for representing line segments or floating-point intervals

tcn                1.0 Triggered change notifications

timetravel         1.0 functions for implementing time travel

tsearch2           1.0 compatibility package for pre-8.3 text search functions

xml2               1.1 XPath querying and XSLT

Logging

Operations performed within Cloud SQL are logged under the Activity tab along with all the details. Example from creating an instance, showing all instance details:

Activity log for creating an instance

PostgreSQL Migration to GCP

In order to provide migration of on-premises PostgreSQL installations, Google takes advantage of pgBouncer.

Cloud SQL Console: Migration Wizard - start migration
Cloud SQL Console: Migration Wizard - not available for PostgreSQL

Note that there is no GCP Console wizard for PostgreSQL migrations.

DBA Beware!

High Availability and Replication

A master node cannot fail over to a read replica. The same section outlines other important aspects of read replicas:

  • can be taken offline at any time for patching
  • do not follow the master node to another zone after a failover; since the replication is asynchronous, this can increase replication lag
  • there is no load balancing between replicas; in other words, there is no single endpoint that applications can be pointed to
  • replica instance size must be at least the size of the master node
  • no cross-region replication
  • replicas cannot be backed up
  • all replicas must be deleted before a master instance can be restored from backup or deleted
  • cascading replication is not available

Users

By default, the “cloud superuser” is postgres, which is a member of the cloudsqlsuperuser role. In turn, cloudsqlsuperuser inherits the default PostgreSQL roles:

postgres@35:5432 postgres> \du+ postgres

                           List of roles

Role name  | Attributes       | Member of | Description

-----------+------------------------+---------------------+-------------

postgres   | Create role, Create DB | {cloudsqlsuperuser} |



postgres@35:5432 postgres> \du+ cloudsqlsuperuser

                              List of roles

   Role name       | Attributes       | Member of | Description

-------------------+------------------------+--------------+-------------

cloudsqlsuperuser  | Create role, Create DB | {pg_monitor} |

Note that the SUPERUSER and REPLICATION role attributes are not available.

Backup and Recovery

Backups cannot be exported.

Backups cannot be used for upgrading an instance i.e. restoring into a different PostgreSQL engine.

Features such as PITR, Logical Replication, and JIT Compilation are not available. Feature requests can be filed in Google’s Issue Tracker.

Google Issue Tracker - PostgreSQL feature request

Encryption

At instance creation SSL/TLS is enabled but not enforced:

Creating an instance: encryption is enabled but not enforced

In this mode, encryption can be requested; however, certificate validation is not available.

~ $ psql "sslmode=verify-ca user=postgres dbname=postgres password=postgres hostaddr=35.233.149.65"

psql: root certificate file "/home/lelu/.postgresql/root.crt" does not exist

Either provide the file or change sslmode to disable server certificate verification.

~ $ psql "sslmode=require user=postgres dbname=postgres password=postgres hostaddr=35.233.149.65"

Pager usage is off.

psql (11.4, server 9.6.11)

SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES128-GCM-SHA256, bits: 128, compression: off)

Type "help" for help.

Attempting to connect with psql to an SSL-enforced instance will return a self-explanatory error:

~ $ psql "sslmode=require user=postgres dbname=postgres password=postgres hostaddr=35.233.149.65"

psql: FATAL:  connection requires a valid client certificate

Storage

  • Storage can be increased after instance creation but never decreased so watch out for costs associated with the growing storage space, or configure the increase limit.
  • Storage is limited to 30 TB.

CPU

Instances can be created with less than one core; however, the option isn’t available in the Cloud SQL Console, as the instance must be created by specifying one of the sample machine types, in this case via --tier:

Cloud SQL Console: shared-code (less than one CPU) instance setting is not available

Example of creating a shared-code instance using gcloud inside Cloud Shell:

Cloud Shell: creating a shared-code instance

The number of CPUs is limited to 64, a relatively low limit for large installations, considering that back when PostgreSQL 9.2 was benchmarked, high-end servers started at 32 cores.

Instance Locations

Multi-regional location is only available for backups.

Access via Public IP

By default, the GCP Console wizard enables only public IP address access; however, access is denied until the client’s network is configured:

Creating an instance: connectivity options

Maintenance

Updates may exceed the maintenance window, and read replicas can be updated at any time.

The documentation doesn’t specify how long the maintenance window is. The information is provided when creating the instance:

Maintenance window: one-hour duration

Changes to CPU count, memory size, or the zone where the instance is located require the database to be offline for several minutes.

Users

Cloud SQL uses the terms “role” and “user” interchangeably.

High Availability

Cost in a highly available configuration is double that of a standalone instance, and that includes storage.

Automatic failover is initiated about 60 seconds after the primary node becomes unavailable; according to Oracle's MAA report, this translates into a loss of $5,800 per minute. Considering that it takes 2 to 3 minutes until applications can reconnect, the outage effectively doubles or triples. Additionally, the 60-second heartbeat interval doesn’t appear to be a configurable option.

Replication

Read replicas cannot be accessed using a single endpoint; each receives a new IP address:

Read replicas: each instance receives an IP address

Regional persistent disks provide data redundancy at the cost of write performance.

Cloud SQL will not fail over to read replicas, hence readers cannot be considered a high availability solution.

External replicas and external masters are currently not supported.

Connecting to Instance

Google does not automatically renew the instance SSL certificates; however, both the initial creation and the rotation procedures can be automated.

If the application is built on the App Engine platform, additional limits apply, such as 60 seconds for a database request to complete, or 60 concurrent connections for PHP applications. The “App Engine Limits” section in Quotas and limits provides more details:

AppEngine documentation: connectivity limits

IP addresses in the range 172.17.0.0/16 are reserved.

Administration

Once started, operations cannot be canceled. Runaway queries can still be stopped by using the pg_terminate_backend and pg_cancel_backend PostgreSQL built-in functions.

A short demonstration using two psql sessions, starting a long-running query in the second session:

postgres@35:5432 postgres> select now(); select pg_sleep(3600); select now();

            now

-------------------------------

2019-07-16 02:08:18.739177+00

(1 row)

In the first session, cancel the long running query:

postgres@35:5432 postgres> select pid, client_addr, client_port, query, backend_start from pg_stat_activity where usename = 'postgres';

-[ RECORD 1 ]-+-------------------------------------------------------------------------------------------------------------

pid           | 2182

client_addr   | 173.180.222.170

client_port   | 56208

query         | select pid, client_addr, client_port, query, backend_start from pg_stat_activity where usename = 'postgres';

backend_start | 2019-07-16 01:57:34.99011+00

-[ RECORD 2 ]-+-------------------------------------------------------------------------------------------------------------

pid           | 2263

client_addr   | 173.180.222.170

client_port   | 56276

query         | select pg_sleep(3600);

backend_start | 2019-07-16 02:07:43.860829+00



postgres@35:5432 postgres> select pg_cancel_backend(2263); select now();

-[ RECORD 1 ]-----+--

pg_cancel_backend | t



-[ RECORD 1 ]----------------------

now | 2019-07-16 02:09:09.600399+00

Comparing the timestamps between the two sessions:

ERROR:  canceling statement due to user request

            now

-------------------------------

2019-07-16 02:09:09.602573+00

(1 row)

It’s a match!

While restarting an instance is a recommended method for attempting to resolve database instance issues, avoid restarting before the first restart has completed.

Data Import and Export

CSV import/export is limited to one database.

Exporting data as an SQL dump that can be imported later requires a custom pg_dump command.

To quote from the documentation:

pg_dump -U [USERNAME] --format=plain --no-owner --no-acl [DATABASE_NAME] \

    | sed -E 's/(DROP|CREATE|COMMENT ON) EXTENSION/-- \1 EXTENSION/g'> [SQL_FILE].sql
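To see what that sed filter actually does, here is a minimal sketch: a hypothetical two-line excerpt of a dump file (the extension and table names are made up) run through the same filter. Extension DDL is commented out; everything else passes through unchanged.

```shell
# Hypothetical two-line excerpt of a pg_dump output, used only to show
# the effect of the filter: extension DDL is commented out, regular DDL
# is left untouched.
printf 'CREATE EXTENSION IF NOT EXISTS hstore WITH SCHEMA public;\nCREATE TABLE t1 (id int);\n' \
  | sed -E 's/(DROP|CREATE|COMMENT ON) EXTENSION/-- \1 EXTENSION/g'
# -- CREATE EXTENSION IF NOT EXISTS hstore WITH SCHEMA public;
# CREATE TABLE t1 (id int);
```

This matters because Cloud SQL manages extensions itself, so importing a dump that tries to create or drop them would fail.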

Pricing

Charge Type | Instance ON | Instance OFF
------------+-------------+-------------
Storage     | Yes         | Yes
Instance    | Yes         | No

Troubleshooting

Logging

All actions are recorded and can be viewed under the Activity tab.

Resources

Review the Diagnosing Issues with Cloud SQL instances and Known issues sections in the documentation.

Conclusion

Although missing some important features the PostgreSQL DBA may be used to, namely PITR and Logical Replication, Google Cloud SQL provides out-of-the-box high availability, replication, encryption, and automatic storage increases, to name a few, making it an appealing solution for organizations looking to quickly deploy their PostgreSQL workloads or even migrate from Oracle.

Developers can take advantage of cheap shared-CPU instances (less than one CPU).

Google approaches PostgreSQL engine adoption in a conservative manner, with the stable offering lagging three versions behind the current upstream release.

As with any solution provider, consider getting support, which can come in handy during edge scenarios such as when instances are suspended.

For professional support, Google maintains a list of partners, which currently includes one of the PostgreSQL professional services companies, namely EDB.

Comparing Failover Times for Amazon Aurora, Amazon RDS, and ClusterControl


If your IT infrastructure is running on AWS, you have probably heard about Amazon Relational Database Service (RDS), an easy way to set up, operate, and scale a relational database in the cloud. It provides cost-effective and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups. There are a number of database engine offerings for RDS like MySQL, MariaDB, PostgreSQL, Microsoft SQL Server and Oracle Server.

ClusterControl 1.7.3 acts similarly to RDS as it supports database cluster deployment, management, monitoring, and scaling on the AWS platform. It also supports a number of other cloud platforms like Google Cloud Platform and Microsoft Azure. ClusterControl understands the database topology and is capable of performing automatic recovery, topology management, and many more advanced features to take control of your database.

In this blog post, we are going to compare automatic failover times for Amazon Aurora, Amazon RDS for MySQL, and a MySQL Replication setup deployed and managed by ClusterControl. The type of failover that we are going to do is slave promotion in case that the master goes down. This is where the most up-to-date slave takes over the master role in the cluster to resume the database service.

Our Failover Test

To measure the failover time, we are going to run a simple MySQL connect-update test, with a loop to count the SQL statement status that connect to a single database endpoint. The script looks like this:

#!/bin/bash
_host='{MYSQL ENDPOINT}'
_user='sbtest'
_pass='password'
_port=3306

j=1
while true
do
        echo -n "count $j : "
        num=$(od -A n -t d -N 1 /dev/urandom | tr -d ' ')

        timeout 1 bash -c "mysql -u${_user} -p${_pass} -h${_host} -P${_port} --connect-timeout=1 --disable-reconnect -A -Bse \
        \"UPDATE sbtest.sbtest1 SET k = $num WHERE id = 1\" > /dev/null 2> /dev/null"
        if [ $? -eq 0 ]; then
                echo "OK $(date)"
        else
                echo "Fail ---- $(date)"
        fi

        j=$(( $j + 1 ))
        sleep 1
done

The above Bash script simply connects to a MySQL host and performs an update on a single row, with a timeout of 1 second on both the Bash and mysql client commands. The timeout-related parameters are required so we can measure the downtime in seconds correctly, since the mysql client defaults to reconnecting until it reaches the MySQL wait_timeout. We populated a test dataset beforehand with the following command:

$ sysbench \
    /usr/share/sysbench/oltp_common.lua \
    --db-driver=mysql \
    --mysql-host={MYSQL HOST} \
    --mysql-user=sbtest \
    --mysql-db=sbtest \
    --mysql-password=password \
    --tables=50 \
    --table-size=100000 \
    prepare

The script reports whether the above query succeeded (OK) or failed (Fail). Sample outputs are shown further down.
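The exit-status check in the script relies on the behavior of timeout(1): when the wrapped command hangs past the limit, timeout kills it and returns a non-zero status (124), which the script counts as a failed probe. A quick way to convince yourself:

```shell
# `timeout` returns 124 when it has to kill the command, and passes the
# command's own (zero) status through when it finishes in time -- this is
# what turns a hung UPDATE into a "Fail" line in the probe script.
timeout 1 sleep 2; echo "slow: $?"
timeout 1 sleep 0.1; echo "fast: $?"
# slow: 124
# fast: 0
```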

Failover with Amazon RDS for MySQL

In our test, we use the lowest RDS offering with the following specs:

  • MySQL version: 5.7.22
  • vCPU: 4
  • RAM: 16 GB
  • Storage type: Provisioned IOPS (SSD)
  • IOPS: 1000
  • Storage: 100 GiB
  • Multi-AZ Replication: Yes

After Amazon RDS provisions your DB instance, you can use any standard MySQL client application or utility to connect to the instance. In the connection string, you specify the DNS address from the DB instance endpoint as the host parameter, and specify the port number from the DB instance endpoint as the port parameter.

According to Amazon RDS documentation page, in the event of a planned or unplanned outage of your DB instance, Amazon RDS automatically switches to a standby replica in another Availability Zone if you have enabled Multi-AZ. The time it takes for the failover to complete depends on the database activity and other conditions at the time the primary DB instance became unavailable. Failover times are typically 60-120 seconds.

To initiate a multi-AZ failover in RDS, we performed a reboot operation with "Reboot with Failover" checked, as shown in the following screenshot:

Reboot AWS DB Instance

The following is what was observed by our application:

...

count 30 : OK Wed Aug 28 03:41:06 UTC 2019

count 31 : OK Wed Aug 28 03:41:07 UTC 2019

count 32 : Fail ---- Wed Aug 28 03:41:09 UTC 2019

count 33 : Fail ---- Wed Aug 28 03:41:11 UTC 2019

count 34 : Fail ---- Wed Aug 28 03:41:13 UTC 2019

count 35 : Fail ---- Wed Aug 28 03:41:15 UTC 2019

count 36 : Fail ---- Wed Aug 28 03:41:17 UTC 2019

count 37 : Fail ---- Wed Aug 28 03:41:19 UTC 2019

count 38 : Fail ---- Wed Aug 28 03:41:21 UTC 2019

count 39 : Fail ---- Wed Aug 28 03:41:23 UTC 2019

count 40 : Fail ---- Wed Aug 28 03:41:25 UTC 2019

count 41 : Fail ---- Wed Aug 28 03:41:27 UTC 2019

count 42 : Fail ---- Wed Aug 28 03:41:29 UTC 2019

count 43 : Fail ---- Wed Aug 28 03:41:31 UTC 2019

count 44 : Fail ---- Wed Aug 28 03:41:33 UTC 2019

count 45 : Fail ---- Wed Aug 28 03:41:35 UTC 2019

count 46 : OK Wed Aug 28 03:41:36 UTC 2019

count 47 : OK Wed Aug 28 03:41:37 UTC 2019

...

The MySQL downtime as seen from the application side lasted from 03:41:09 until 03:41:36, which is around 27 seconds in total. From the RDS events, we can see the multi-AZ failover only started 15 seconds after the actual downtime began:

Wed, 28 Aug 2019 03:41:24 GMT Multi-AZ instance failover started.

Wed, 28 Aug 2019 03:41:33 GMT DB instance restarted

Wed, 28 Aug 2019 03:41:59 GMT Multi-AZ instance failover completed.

Once the new database instance restarted around 03:41:33, the MySQL service became accessible around 3 seconds later.
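The 27-second figure is simply the difference between the first failed probe and the first successful probe after recovery; with GNU date the arithmetic can be done directly on the log timestamps:

```shell
# Downtime = first successful probe after the outage minus the first
# failed probe, computed via epoch seconds (GNU date, UTC).
start=$(date -ud '2019-08-28 03:41:09' +%s)
end=$(date -ud '2019-08-28 03:41:36' +%s)
echo "downtime: $((end - start)) seconds"
# downtime: 27 seconds
```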

Failover with Amazon Aurora for MySQL

Amazon Aurora can be considered a superior version of RDS, with a lot of notable features like faster replication with shared storage, no data loss during failover, and a storage limit of up to 64 TB. Amazon Aurora for MySQL is based on open source MySQL, but is not open source itself; it is a proprietary, closed-source database. It works similarly to MySQL replication (one and only one master, with multiple slaves) and failover is automatically handled by Amazon Aurora.

According to the Amazon Aurora FAQs, if you have an Amazon Aurora Replica in the same or a different Availability Zone, then when failing over, Aurora flips the canonical name record (CNAME) for your DB instance to point at the healthy replica, which in turn is promoted to become the new primary. Start-to-finish, failover typically completes within 30 seconds.

If you do not have an Amazon Aurora Replica (i.e. single instance), Aurora will first attempt to create a new DB Instance in the same Availability Zone as the original instance. If unable to do so, Aurora will attempt to create a new DB Instance in a different Availability Zone. From start to finish, failover typically completes in under 15 minutes.

Your application should retry database connections in the event of connection loss.

After Amazon Aurora provisions your DB instance, you will get two endpoints: one for the writer and one for the reader. The reader endpoint provides load-balancing support for read-only connections to the DB cluster. The following endpoints are taken from our test setup:

  • writer - aurora-sysbench.cluster-cw9j4kdnvun9.ap-southeast-1.rds.amazonaws.com
  • reader - aurora-sysbench.cluster-ro-cw9j4kdnvun9.ap-southeast-1.rds.amazonaws.com

In our test, we used the following Aurora specs:

  • Instance type: db.r5.large
  • MySQL version: 5.7.12
  • vCPU: 2
  • RAM: 16 GB
  • Multi-AZ Replication: Yes

To trigger a failover, simply pick the writer instance -> Actions -> Failover, as shown in the following screenshot:

Amazon Aurora Failover with SysBench

The following output is reported by our application while connecting to the Aurora writer endpoint:

...

count 37 : OK Wed Aug 28 12:35:47 UTC 2019

count 38 : OK Wed Aug 28 12:35:48 UTC 2019

count 39 : Fail ---- Wed Aug 28 12:35:49 UTC 2019

count 40 : Fail ---- Wed Aug 28 12:35:50 UTC 2019

count 41 : Fail ---- Wed Aug 28 12:35:51 UTC 2019

count 42 : Fail ---- Wed Aug 28 12:35:52 UTC 2019

count 43 : Fail ---- Wed Aug 28 12:35:53 UTC 2019

count 44 : Fail ---- Wed Aug 28 12:35:54 UTC 2019

count 45 : Fail ---- Wed Aug 28 12:35:55 UTC 2019

count 46 : OK Wed Aug 28 12:35:56 UTC 2019

count 47 : OK Wed Aug 28 12:35:57 UTC 2019

...

The database downtime started at 12:35:49 and ended at 12:35:56, a total of 7 seconds. That's pretty impressive.

Looking at the database event from Aurora management console, only these two events happened:

Wed, 28 Aug 2019 12:35:50 GMT A new writer was promoted. Restarting database as a reader.

Wed, 28 Aug 2019 12:35:55 GMT DB instance restarted

It doesn't take much time for Aurora to promote a slave to become a master, and demote the master to become a slave. Note that all Aurora replicas share the same underlying volume as the primary instance, which means that replication can be performed in milliseconds, as updates made by the primary instance are instantly available to all Aurora replicas. Therefore, replication lag is minimal (Amazon claims 100 milliseconds or less). This greatly reduces the health check time and improves the recovery time significantly.

Failover with ClusterControl

In this example, we imitate a setup similar to Amazon RDS using m5.xlarge instances, with a ProxySQL instance in between to provide the application with a single endpoint and automate the failover, just like RDS. The following diagram illustrates our architecture:

ClusterControl with ProxySQL

Since we have direct access to the database instances, we triggered an automatic failover by simply killing the MySQL process on the active master:

$ kill -9 $(pidof mysqld)

The above command triggered an automatic recovery inside ClusterControl:

[11:08:49]: Job Completed.

[11:08:44]: 10.15.3.141:3306: Flushing logs to update 'SHOW SLAVE HOSTS'

[11:08:39]: 10.15.3.141:3306: Flushing logs to update 'SHOW SLAVE HOSTS'

[11:08:39]: Failover Complete. New master is  10.15.3.141:3306.

[11:08:39]: Attaching slaves to new master.

[11:08:39]: 10.15.3.141:3306: Command 'RESET SLAVE /*!50500 ALL */' succeeded.

[11:08:39]: 10.15.3.141:3306: Executing 'RESET SLAVE /*!50500 ALL */'.

[11:08:39]: 10.15.3.141:3306: Successfully stopped slave.

[11:08:39]: 10.15.3.141:3306: Stopping slave.

[11:08:39]: 10.15.3.141:3306: Successfully stopped slave.

[11:08:39]: 10.15.3.141:3306: Stopping slave.

[11:08:38]: 10.15.3.141:3306: Setting read_only=OFF and super_read_only=OFF.

[11:08:38]: 10.15.3.141:3306: Successfully stopped slave.

[11:08:38]: 10.15.3.141:3306: Stopping slave.

[11:08:38]: Stopping slaves.

[11:08:38]: 10.15.3.141:3306: Completed preparations of candidate.

[11:08:38]: 10.15.3.141:3306: Applied 0 transactions. Remaining: .

[11:08:38]: 10.15.3.141:3306: waiting up to 4294967295 seconds before timing out.

[11:08:38]: 10.15.3.141:3306: Checking if the candidate has relay log to apply.

[11:08:38]: 10.15.3.141:3306: preparing candidate.

[11:08:38]: No errant transactions found.

[11:08:38]: 10.15.3.141:3306: Skipping, same as slave  10.15.3.141:3306

[11:08:38]: Checking for errant transactions.

[11:08:37]: 10.15.3.141:3306: Setting read_only=ON and super_read_only=ON.

[11:08:37]: 10.15.3.69:3306: Can't connect to MySQL server on '10.15.3.69' (115)

[11:08:37]: 10.15.3.69:3306: Setting read_only=ON and super_read_only=ON.

[11:08:37]: 10.15.3.69:3306: Failed to CREATE USER rpl_user. Error: 10.15.3.69:3306: Query  failed: Can't connect to MySQL server on '10.15.3.69' (115).

[11:08:36]: 10.15.3.69:3306: Creating user 'rpl_user'@'10.15.3.141.

[11:08:36]: 10.15.3.141:3306: Executing GRANT REPLICATION SLAVE 'rpl_user'@'10.15.3.69'.

[11:08:36]: 10.15.3.141:3306: Creating user 'rpl_user'@'10.15.3.69.

[11:08:36]: 10.15.3.141:3306: Elected as the new Master.

[11:08:36]: 10.15.3.141:3306: Slave lag is 0 seconds.

[11:08:36]: 10.15.3.141:3306 to slave list

[11:08:36]: 10.15.3.141:3306: Checking if slave can be used as a candidate.

[11:08:33]: 10.15.3.69:3306: Trying to shutdown the failed master if it is up.

[11:08:32]: 10.15.3.69:3306: Setting read_only=ON and super_read_only=ON.

[11:08:31]: 10.15.3.141:3306: Setting read_only=ON and super_read_only=ON.

[11:08:30]: 10.15.3.69:3306: Setting read_only=ON and super_read_only=ON.

[11:08:30]: 10.15.3.141:3306: ioerrno=2003 io running 0

[11:08:30]: Checking 10.15.3.141:3306

[11:08:30]: 10.15.3.69:3306: REPL_UNDEFINED

[11:08:30]: 10.15.3.69:3306

[11:08:30]: Failover to a new Master.

Job spec: Failover to a new Master.

From our test application's point of view, the downtime occurred at the following times while connecting to the ProxySQL host on port 6033:

...

count 1 : OK Wed Aug 28 11:08:24 UTC 2019

count 2 : OK Wed Aug 28 11:08:25 UTC 2019

count 3 : OK Wed Aug 28 11:08:26 UTC 2019

count 4 : Fail ---- Wed Aug 28 11:08:28 UTC 2019

count 5 : Fail ---- Wed Aug 28 11:08:30 UTC 2019

count 6 : Fail ---- Wed Aug 28 11:08:32 UTC 2019

count 7 : Fail ---- Wed Aug 28 11:08:34 UTC 2019

count 8 : Fail ---- Wed Aug 28 11:08:36 UTC 2019

count 9 : Fail ---- Wed Aug 28 11:08:38 UTC 2019

count 10 : OK Wed Aug 28 11:08:39 UTC 2019

count 11 : OK Wed Aug 28 11:08:40 UTC 2019

...

By looking at both the recovery job events and the output from our application, the MySQL database node was down for 4 seconds before the cluster recovery job started; the full outage ran from 11:08:28 until 11:08:39, for a total MySQL downtime of 11 seconds. One of the most impressive things about ClusterControl is that you can track the recovery progress and see exactly which actions were taken by ClusterControl during the failover. It provides a level of transparency that you won't get with the database offerings of cloud providers.

For MySQL/MariaDB/PostgreSQL replication, ClusterControl gives you more fine-grained control over your databases with support for the following advanced configuration and parameters:

  • Master-master replication topology management
  • Chain replication topology management
  • Topology viewer
  • Whitelist/Blacklist slaves to be promoted as master
  • Errant transaction checker
  • Pre/post, success/fail failover/switchover events hook with external script
  • Automatic rebuild slave on error
  • Scale out slave from existing backup

Failover Time Summary

In terms of failover time, Amazon Aurora for MySQL is the clear winner with 7 seconds, followed by ClusterControl with 11 seconds and Amazon RDS for MySQL with 27 seconds.

Note that this is just a simple test, with one client and one transaction per second, to measure the fastest recovery time. Large transactions or a lengthy recovery process can increase the failover time; e.g., long-running transactions may take a long time to roll back when shutting down MySQL.
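As a side note, the outage window can be pulled out of a probe log mechanically. A small awk sketch (with made-up, abbreviated sample lines) that prints the timestamps bracketing the outage:

```shell
# Print the last field (the timestamp in a real log) of the first "Fail"
# probe and of the first "OK" probe that follows it; these two bracket
# the outage. The sample lines below are made up/abbreviated.
log='count 31 : OK 03:41:07
count 32 : Fail ---- 03:41:09
count 45 : Fail ---- 03:41:35
count 46 : OK 03:41:36'
echo "$log" | awk '/Fail/ && !down {down=1; print "down at " $NF}
                   /OK/  && down   {print "up at " $NF; exit}'
# down at 03:41:09
# up at 03:41:36
```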

 

Comparing Galera Cluster Cloud Offerings: Part Two Google Cloud Platform (GCP)


In our last blog we discussed the offerings available within Amazon Web Services (AWS) when running a MySQL Galera Cluster. In this blog, we'll continue the discussion by looking further at what the offerings are for running the same clustering technology, but this time on the Google Cloud Platform (GCP).

GCP, as an alternative to AWS, has been continuously attracting applications suited for DevOps by offering support for a wide array of full-stack technologies, containerized applications, and large production database systems. Google Cloud is a full-blown, battle-tested environment, built on the same hardware infrastructure Google uses for its own products like YouTube and Gmail.

GCP has gained traction largely because of its ever-growing list of capabilities. It offers support for platforms like Visual Studio, Android Studio, Eclipse, Powershell and many others. GCP has one of the largest and most advanced computer networks and it provides access to numerous tools that help you focus on building your application. 

Another thing that attracts customers to migrate, import, or use Google Cloud is their strong support and solutions for containerization; Google Kubernetes Engine (GKE) is built on their platform.

GCP has also recently launched a new solution called Anthos. This product is designed to let organizations manage workloads using the same interface on the Google Cloud Platform (GCP) or on-premises using GKE On-Prem, and even on rival clouds such as Amazon Web Services (AWS) or Azure. 

In addition to these technologies, GCP offers sophisticated and powerful, compute-optimized machine types like the C2 family in GCE which is built on the latest generation Intel Scalable Processors (Cascade Lake).

GCP is continuing to support open source as well, which benefits users by providing well-supported and a straightforward framework that makes it easy to deliver a final product in a timely manner. Despite this support of open source technology, GCP does not provide native support for the deployment or configuration of a MySQL Galera Cluster. In this blog we will show you the only option available to you if you wish to use this technology, deployment via a compute instance which you have to manage yourself.

The Google Compute Engine (GCE)

GCE has a sophisticated and powerful set of compute nodes available for your consumption. Unlike AWS, GCE has the most powerful compute node available on the market (n1-ultramem-160, with 160 vCPUs and 3.75 TB of memory). GCE also just recently introduced a new compute instance family called the C2 machine type. Built on the latest generation of Intel Scalable Processors (Cascade Lake), C2 machine types offer up to 3.8 GHz sustained all-core turbo and provide full transparency into the architecture of the underlying server platforms, letting you fine-tune the performance. C2 machine types offer much more computing power, run on a newer platform, and are generally more robust for compute-intensive workloads than the N1 high-CPU machine types. C2 family offerings are limited (as of the time of writing) and not available in all regions and zones. C2 also does not support regional persistent disks, though these would be a great add-on for stateful database services that require redundancy and high availability. The resources of a C2 instance are more than a Galera node needs, so we'll focus instead on the general-purpose compute nodes, which are ideal.

GCE also uses KVM as its virtualization technology, whereas Amazon uses Xen. Let's take a look at the compute nodes available in GCE which are suitable for running Galera, alongside their equivalents in AWS EC2. Prices differ based on region, but for this chart we use the us-east region with on-demand pricing for AWS.

 

Machine/Instance Type: Shared

Google Compute Engine: f1-micro, g1-small (prices start at $0.006 – $0.019 hourly)

AWS EC2: t2.nano – t3.2xlarge (prices start at $0.0058 – $0.3328 hourly)

Machine/Instance Type: Standard

Google Compute Engine: n1-standard-1 – n1-standard-96 (prices start at $0.034 – $3.193 hourly)

AWS EC2: m4.large – m4.16xlarge, m5.large – m5d.metal (prices start at $0.1 – $5.424 hourly)

Machine/Instance Type: High Memory/Memory Optimized

Google Compute Engine: n1-highmem-2 – n1-highmem-96, n1-megamem-96, n1-ultramem-40 – n1-ultramem-160 (prices start at $0.083 – $17.651 hourly)

AWS EC2: r4.large – r4.16xlarge, x1.16xlarge – x1.32xlarge, x1e.xlarge – x1e.32xlarge (prices start at $0.133 – $26.688 hourly)

Machine/Instance Type: High CPU/Storage Optimized

Google Compute Engine: n1-highcpu-2 – n1-highcpu-32 (prices start at $0.05 – $2.383 hourly)

AWS EC2: h1.2xlarge – h1.16xlarge, i3.large – i3.metal, i3en.large – i3en.metal, d2.xlarge – d2.8xlarge (prices start at $0.156 – $10.848 hourly)

GCE has fewer predefined compute node types to choose from than AWS. When it comes to configuring a node, however, it offers more granularity, which makes it easier to set up exactly the kind of instance you want. For example, you can add a disk and set its physical block size (4 KB is the default, or 16 KB), or set its mode to either read/write or read-only. This allows you to shape the right type of machine or compute instance to manage your Galera node. You may also instantiate your compute nodes using the Cloud SDK, or by using the Cloud APIs, to automate them or integrate them into your Continuous Integration, Delivery, or Deployment (CI/CD) pipelines.
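As a sketch of that granularity from the CLI, the following creates a persistent disk with a 16 KB physical block size and attaches it to a new instance. The disk and instance names, sizes, and zone are hypothetical placeholders; the --physical-block-size flag takes bytes (4096 is the default, 16384 the alternative):

```shell
# Hypothetical names, sizes, and zone -- adjust to your own project.
gcloud compute disks create galera-data-1 \
    --size=200GB --type=pd-ssd \
    --physical-block-size=16384 \
    --zone=us-east1-b

gcloud compute instances create galera-node-1 \
    --machine-type=n1-standard-4 \
    --zone=us-east1-b \
    --disk=name=galera-data-1,mode=rw
```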

Pricing (Compute Instance, Disk, vCPU, Memory, and Network)

The price also depends on the region where the instance is located, the type of OS or licensing (e.g., RHEL vs SUSE Linux Enterprise), and the type of disk storage you're using.

GCP also offers discounts which allow you to economize your resource consumption. For Compute Engine, several types of discounts are available. 

Sustained use discounts are applied automatically the longer you run a resource within a billing month, and apply to vCPU and memory usage for most machine types. Take note that sustained use discounts do not apply to VMs created using the App Engine Flexible Environment or Cloud Dataflow.

You can also use Committed Use Discounts, where you purchase VMs bound to a contract. This type of choice is ideal for predictable workloads and resource needs. When you purchase a committed use contract you purchase a certain amount of vCPUs, memory, GPUs, and local SSDs at a discounted price in return for committing to paying for those resources for 1 year or 3 years. The discount is up to 57% for most resources like machine types or GPUs, and up to 70% for memory-optimized machine types. Once purchased, you are billed monthly for the resources you purchased for the duration of the term you selected (whether you use the services or not). 
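As a back-of-the-envelope illustration of those discount rates (the $200/month on-demand price here is made up; the 57% and 70% figures are the rates quoted above):

```python
# Toy committed-use-discount calculation (hypothetical on-demand price)
on_demand_monthly = 200.00      # assumed on-demand cost of a machine type, USD/month

general_discount = 0.57         # up to 57% for most resources
memopt_discount = 0.70          # up to 70% for memory-optimized machine types

committed_general = on_demand_monthly * (1 - general_discount)
committed_memopt = on_demand_monthly * (1 - memopt_discount)

print(f"general: ${committed_general:.2f}/month")           # $86.00/month
print(f"memory-optimized: ${committed_memopt:.2f}/month")   # $60.00/month
```

The trade-off is that you pay the committed amount for the full 1- or 3-year term whether the VMs run or not.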

A preemptible VM is an instance that you can create and run at a much lower price than normal instances. Compute Engine may, however, terminate (preempt) these instances if it requires access to those resources for other tasks. Preemptible instances use excess Compute Engine capacity, so their availability varies with usage.

If your applications are fault-tolerant and can withstand possible instance preemptions, then preemptible instances can reduce your Compute Engine costs significantly. For example, batch processing jobs can run on preemptible instances. If some of those instances terminate during processing, the job slows but does not completely stop. Preemptible instances complete your batch processing tasks without placing additional workload on your existing instances, and without requiring you to pay full price for additional normal instances.

For Compute Engine, disk size, machine type memory, and network usage are calculated in gigabytes (GB), where 1 GB is 2^30 bytes. This unit of measurement is also known as a gibibyte (GiB). This means GCP charges you based only on the resources you have allocated. 
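In code, the difference between a gibibyte (2^30 bytes, the unit GCP bills by) and a decimal gigabyte (10^9 bytes, common in disk marketing) looks like this:

```python
GIB = 2 ** 30           # gibibyte: the billing unit described above
GB_DECIMAL = 10 ** 9    # decimal gigabyte

print(GIB)                # 1073741824 bytes
print(GIB / GB_DECIMAL)   # ~1.074 decimal GB per GiB
```

The ~7% difference matters when comparing advertised disk sizes against what you are billed for.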

Now, if you have a high-grade, production database application, it's recommended (and ideal) to attach a separate persistent disk. You would then use that disk as your database volume, as it offers reliable and consistent disk performance in GCE. The larger the disk you provision, the higher the IOPS it offers. Check out their list of persistent disk pricing to determine the price you would get. In addition to this, GCE has regional persistent disks, which are suitable if you require more solid and sustainable high availability within your database cluster. A regional persistent disk adds redundancy in case your instance terminates, crashes, or becomes corrupted. It provides synchronous replication of data between two zones in one region, which happens transparently to the VM instance. In the unlikely event of a zone failure, your workload can fail over to another VM instance in the same, or a secondary, zone. You can then force-attach your regional persistent disk to that instance. Force-attach is estimated to take less than one minute.

If you store backups as part of your disaster recovery solution and require a cluster-wide volume, GCP offers Cloud Filestore, NetApp Cloud Volumes, and other file-sharing alternatives. These are fully-managed services offering standard and premium tiers. You can check out NetApp's pricing page here and Filestore pricing here.

Galera Encryption on GCP

GCP does not include specific support for the type of encryption available for Galera. GCP, however, encrypts customer data stored at rest by default, with no additional action required from you. GCP also offers another option to encrypt your data using Customer-managed encryption keys (CMEK) with Cloud KMS, as well as with Customer-supplied encryption keys (CSEK). GCP also uses SSL/TLS encryption for all communications as data moves between your site and the cloud provider or between two services. This protection is achieved by encrypting the data before transmission; authenticating the endpoints; and decrypting and verifying the data on arrival.

Because Galera uses MySQL under the hood (Percona, MariaDB, or Codership build), you can take advantage of the File Key Management Encryption Plugin by MariaDB or by using the MySQL Keyring plugins. Here's an external blog by Percona which is a good resource on how you can implement this.
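For the MariaDB build, a minimal my.cnf sketch enabling the File Key Management plugin and InnoDB table encryption might look like the following (the key file paths are hypothetical; see the MariaDB documentation for generating the key files):

```ini
[mariadb]
# Load the File Key Management encryption plugin
plugin_load_add = file_key_management
file_key_management_filename = /etc/mysql/encryption/keyfile.enc
file_key_management_filekey = FILE:/etc/mysql/encryption/keyfile.key
file_key_management_encryption_algorithm = AES_CTR

# Encrypt InnoDB tablespaces and logs at rest
innodb_encrypt_tables = ON
innodb_encrypt_log = ON
```

Note that in a Galera cluster every node must be configured for encryption consistently, including the SST/IST traffic discussed in the Percona blog linked above.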

Galera Cluster Multi-AZ/Multi-Region/Multi-Cloud Deployments with GCP

Similarly to AWS, GCP does not offer direct support to deploy a Galera cluster on a Multi-AZ/-Region/-Cloud.

Galera Cluster High Availability, Scalability, and Redundancy on GCP

One of the primary reasons to use a Galera cluster is high availability, redundancy, and its ability to scale. If you are serving traffic globally, it's best to serve traffic by region, with an architectural design that includes geo-distribution of your database nodes. To achieve this, multi-AZ, multi-region, or multi-cloud/multi-datacenter deployments are recommended and achievable. This prevents the cluster from going down, or malfunctioning due to lack of quorum. 

To help you more with your scalability design, GCP also has an autoscaler you can set up with an autoscaling group. This will work as long as you created your cluster as managed instance groups. For example, you can monitor CPU utilization, or rely on metrics from Stackdriver defined in your autoscaling policy. This lets you automatically provision instances when a certain threshold is reached, and terminate them when load returns to its normal state.
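The sizing logic of such a CPU-based policy can be sketched as follows (a simplification of target-utilization autoscaling; the real autoscaler also applies cool-down periods and smoothing):

```python
import math

def recommended_size(current_size: int, current_util: float, target_util: float) -> int:
    """Suggest a group size that moves average CPU toward the target utilization."""
    return max(1, math.ceil(current_size * current_util / target_util))

# 3 nodes running at 90% CPU with a 60% target -> scale out
print(recommended_size(3, 0.90, 0.60))   # 5
# later, 5 nodes at 30% CPU -> scale back in
print(recommended_size(5, 0.30, 0.60))   # 3
```

For a Galera cluster specifically, keep in mind that adding or removing nodes triggers state transfers, so aggressive autoscaling of database nodes should be applied with care.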

For multi-region or multi-cloud deployment, Galera has its own parameter called gmcast.segment, which you can set at server start. This parameter is designed to optimize the communication between the Galera nodes and minimize the amount of traffic sent between network segments. This includes writeset relaying and IST and SST donor selection. This type of setup allows you to deploy multiple nodes in different regions. Aside from that, you can also deploy Galera nodes with different cloud vendors, routing between GCP, AWS, Microsoft Azure, or on-premises. 
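A minimal my.cnf sketch for a node in a second region (the addresses are placeholders) that assigns its Galera segment could look like this:

```ini
[mysqld]
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_address = gcomm://10.10.0.11,10.10.0.12,172.16.0.11
# Nodes sharing a segment value prefer to communicate with each other;
# traffic between segments (e.g., regions) is relayed through one node per segment.
wsrep_provider_options = "gmcast.segment=1"
```

Nodes in the first region would use gmcast.segment=0, so cross-region writeset relaying happens only once per segment instead of once per node.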

We recommend you to check out our blog Multiple Data Center Setups Using Galera Cluster for MySQL or MariaDB and Zero Downtime Network Migration With MySQL Galera Cluster Using Relay Node to gather more information on how to implement these types of deployments.

Galera Cluster Database Performance on GCP

Since there's no native support for Galera in GCP, your choices depend on the requirements and design of your application's traffic and resource demands. For queries that are heavy on memory consumption, you can start with an n1-highmem-2 instance. High-CPU instances (the n1-highcpu* family) can be a good fit for a high-transaction database, or for gaming applications.

Choosing the right storage and required IOPS for your database volume is a must. Generally, an SSD-based persistent disk is your choice here. Depending on the volume of traffic required, you might have to check out the GCP storage options to determine the right disk size for your application.
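As a rough sizing aid: SSD persistent disk performance scales with provisioned size, on the order of 30 read IOPS per GB at the time of writing (an assumption; it is also subject to per-instance caps, so verify against GCP's current documentation):

```python
SSD_PD_READ_IOPS_PER_GB = 30   # assumed rate from GCP docs; verify before relying on it

def min_ssd_size_gb(required_iops: int) -> int:
    """Smallest SSD persistent disk size (GB) providing the required read IOPS."""
    return -(-required_iops // SSD_PD_READ_IOPS_PER_GB)   # ceiling division

# A workload needing ~6000 read IOPS would call for roughly a 200 GB SSD disk
print(min_ssd_size_gb(6000))   # 200
```

In other words, you may end up provisioning a disk larger than your data set purely to reach the IOPS your Galera nodes need.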

We also recommend reading our blog How to Improve Performance of Galera Cluster for MySQL or MariaDB to learn more about optimizing your Galera Cluster.

Galera Data Backups on GCP

Not only does your MySQL Galera data have to be backed up, you should also back up the entire tier which comprises your database application. This includes log files (logical or binary), external files, temporary files, dump files, etc. Google recommends that you always create snapshots of the persistent disk volumes being used by your GCE instances. You can easily create and schedule snapshots. GCP snapshots are stored in Cloud Storage, and you can select your desired location or region where the backup will be located. You can also set up a schedule for your snapshots, as well as a snapshot retention policy.

You can also use external services like ClusterControl, which provides both monitoring and backup solutions. Check this out if you want to know more.

Galera Cluster Database Monitoring on GCP

GCP does not offer database monitoring when using GCE. Monitoring of your instance health can be done through Stackdriver. For the database, though, you will need to grab an external monitoring tool which has advanced, highly-granular database metrics. There are a lot of choices you can choose from, such as PMM by Percona, DataDog, Idera, VividCortex, or our very own ClusterControl (monitoring is FREE with ClusterControl Community).

Galera Cluster Database Security on GCP

As discussed in our previous blog, you can take the same approach for securing your database in the public cloud. In GCP you can set up a private subnet and firewall rules to allow only the ports required for running Galera (particularly ports 3306, 4444, 4567, 4568). You can use a NAT gateway or set up a bastion host to access your private database nodes. When these nodes are encapsulated this way, they cannot be accessed from outside the GCP premises. You can read our previous blog Deploying Secure Multicloud MySQL Replication on AWS and GCP with VPN on how we set this up.

In addition to this, you can secure your data in transit by using a TLS/SSL connection, and encrypt your data at rest. If you're using ClusterControl, deploying secure data-in-transit is simple and easy. You can check out our blog SSL Key Management and Encryption of MySQL Data in Transit if you want to try it out. For data at rest, you can follow the discussion earlier in the Encryption section of this blog.

Galera Cluster Troubleshooting 

GCP offers Stackdriver Logging which you can leverage to help you with observability, monitoring, and notification requirements. The great thing about Stackdriver Logging is that it offers integration with AWS. With it you can catch the events selectively and then raise an alert based on that event. This can keep you in the loop on certain issues which may arise and help you during troubleshooting. GCP also has Cloud Audit Logs which provide you more traceable information from inside the GCP environment, from admin activity, data access, and system events. 

If you're using ClusterControl, go to Logs -> System Logs and you'll be able to browse the captured error logs taken from the MySQL Galera nodes themselves. Apart from this, ClusterControl provides real-time monitoring that amplifies your alarm and notification system in case of an emergency or if your MySQL Galera node(s) go down.

Conclusion

The Google Cloud Platform offers a wide variety of efficient and powerful services that you can leverage. There are indeed pros and cons for each of the public cloud platforms, but GCP proves that AWS doesn’t have a lock on the cloud. 

It's interesting that big companies such as Vimeo are moving to GCP coming from on-premise and they experienced some interesting results in their technology stack. Bloomberg as well is happy with GCP and is using Percona XtraDB Cluster (a Galera variant). Let us know what you think about using GCP for MySQL Galera setups in the comments below.

Database Failover for WordPress Websites


Every profitable enterprise requires high availability. Websites & Blogs are no different as even smaller companies and individuals require their sites to stay live to keep their reputation. 

WordPress is, by far, the most popular CMS in the world, powering millions of websites from small to large. But how can you ensure that your website stays live? More specifically, how can you ensure that the unavailability of your database will not impact your website? 

In this blog post we will show how to achieve failover for your WordPress website using ClusterControl.

The setup for this blog uses Percona Server 5.7. We will have another host which contains the Apache and WordPress application. We will not touch the application high-availability portion, but this is also something you want to make sure you have. We will use ClusterControl to manage the databases and ensure their availability, and we will use a third host to install and set up ClusterControl itself.

Assuming that ClusterControl is up and running, we need to import our existing database cluster into it.

Importing a Database Cluster with ClusterControl

ClusterControl Import Cluster

Go to the Import Existing Server/Database option in the deployment wizard.

Importing an Existing Cluster with ClusterControl

We have to configure the SSH connectivity as this is a requirement for ClusterControl to be able to manage the nodes.

Configuring an Imported Cluster with ClusterControl

We now have to define some details about the vendor, version, root user access, the node itself, and if we want ClusterControl to manage autorecovery for us or not. That’s all, once the job succeeds, you will be presented with a cluster on the list.

Database Cluster List

To set up the highly-available environment, we need to execute a couple of actions. Our environment will consist of...

  • Master - Slave pair
  • Two ProxySQL instances for read/write split and topology detection
  • Two Keepalived instances for Virtual IP management

The idea is simple - we will deploy a slave of our master so that we have a second instance to fail over to should the master fail. ClusterControl will be responsible for failure detection and it will promote the slave should the master become unavailable. ProxySQL will keep track of the replication topology and it will redirect the traffic to the correct node - writes will be sent to the master, no matter which node it’s on, while reads can either be sent to the master only or distributed across master and slaves. Finally, Keepalived will be collocated with ProxySQL and it will provide a VIP for the application to connect to. That VIP will always be assigned to one of the ProxySQL instances, and Keepalived will move it to the second one should the “main” ProxySQL node fail.
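A minimal keepalived.conf sketch for the "main" ProxySQL host illustrates that VIP handover (the interface name, router id, and priorities are placeholders; ClusterControl generates the real configuration for you):

```conf
vrrp_script chk_proxysql {
    script "killall -0 proxysql"   # healthy while the proxysql process exists
    interval 2
}

vrrp_instance VI_PROXYSQL {
    interface eth0
    virtual_router_id 51
    state MASTER            # the second node would use BACKUP and a lower priority
    priority 101
    virtual_ipaddress {
        10.0.0.111          # the VIP the application connects to
    }
    track_script {
        chk_proxysql
    }
}
```

If the proxysql process dies on this host, the health check fails and the VIP floats to the backup node automatically.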

Having said all of that, let’s configure this using ClusterControl. All of it can be done in just a couple of clicks. We’ll start with adding the slave.

Adding a Database Slave with ClusterControl

Adding a Database Slave with ClusterControl

We start by picking the “Add Replication Slave” job. Then we are asked to fill in a form:

Adding a Replication Slave

We have to pick the master (in our case we don’t really have many options), we have to pass the IP or hostname for the new slave. If we had backups previously created, we could use one of them to provision the slave. In our case this is not available and ClusterControl will provision the slave directly from the master. That’s all, the job starts and ClusterControl performs required actions. You can monitor the progress in the Activity tab.

ClusterControl Activity Tab

Finally, once the job completes successfully, the slave should be visible on the cluster list.

Cluster List

Now we will proceed with configuring the ProxySQL instances. In our case the environment is minimal so, to keep things simpler, we will locate ProxySQL on one of the database nodes. This is not, however, the best option in a real production environment. Ideally, ProxySQL would either be located on a separate node or collocated with the other application hosts.

Configure ProxySQL ClusterControl

The place to start the job is Manage -> Loadbalancers.

ProxySQL Load Balancer Configuration ClusterControl

Here you have to pick where ProxySQL should be installed, pass administrative credentials, and add a database user. In our case, we will use our existing user, as our WordPress application already uses it to connect to the database. We then have to pick which nodes to include in ProxySQL (we want both master and slave here) and let ClusterControl know whether we use explicit transactions. This is not really relevant in our case, as we will reconfigure ProxySQL once it is deployed. When that option is enabled, read/write split will not be enabled; otherwise ClusterControl will configure ProxySQL for read/write split. In our minimal setup we should seriously consider whether we want the read/write split to happen. Let’s analyse that.

The Advantages & Disadvantages of Read/Write Split in ProxySQL

The main advantage of using the read/write split is that all the SELECT traffic will be distributed between the master and the slave. This means that the load on the nodes will be lower and response time should also be lower. This sounds good but keep in mind that should one node fail, the other node will have to be able to accommodate all of the traffic. There is little point in having automated failover in place if the loss of one node means that the second node will be overloaded and, de facto, unavailable too. 
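This capacity argument is easy to quantify with toy numbers (the load figures here are illustrative, not measurements):

```python
def load_per_node_after_failure(total_load: float, nodes: int) -> float:
    """Average load per surviving node when one of `nodes` fails.

    total_load is expressed as a multiple of a single node's capacity.
    """
    assert nodes >= 2
    return total_load / (nodes - 1)

# Two nodes sharing 1.2x of one node's capacity (60% each):
# the survivor would need 120% of its capacity -> overloaded
print(load_per_node_after_failure(1.2, 2))   # 1.2
# Five nodes carrying the same total load: survivors take 30% each
print(load_per_node_after_failure(1.2, 5))   # 0.3
```

The rule of thumb: only split reads across N nodes if N-1 nodes can still carry the full load.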

It might make sense to distribute the load if you have multiple slaves - losing one node out of five is less impactful than losing one out of two. No matter what you decide on, you can easily change the behavior by going to ProxySQL node and clicking on the Rules tab.

ProxySQL Rules - ClusterControl

Make sure to look at rule 200 (the one which catches all SELECT statements). In the screenshot below you can see that the destination hostgroup is 20, which means all nodes in the cluster - read/write split and scale-out are enabled. We can easily disable this by editing the rule and changing the Destination Hostgroup to 10 (the one which contains the master).

ProxySQL Configuration - ClusterControl

If you would like to enable the read/write split, you can easily do so by editing this query rule again and setting the destination hostgroup back to 20.
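The same change can be made directly from the ProxySQL admin interface. This is a sketch of the equivalent statements, assuming rule id 200 and hostgroups 10/20 as in the screenshots above:

```sql
-- Route all SELECTs to the master-only hostgroup (disable read/write split)
UPDATE mysql_query_rules SET destination_hostgroup = 10 WHERE rule_id = 200;
-- Set it back to 20 to re-enable the split
-- UPDATE mysql_query_rules SET destination_hostgroup = 20 WHERE rule_id = 200;
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;
```

Remember that in ProxySQL changes only take effect after loading them to runtime, and only survive a restart after saving them to disk.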

Now, let’s deploy the second ProxySQL.

Deploy ProxySQL ClusterControl

To avoid passing all the configuration options again we can use the “Import Configuration” option and pick our existing ProxySQL as the source.

When this job completes, we still have to perform the last step in setting up our environment: deploying Keepalived on top of the ProxySQL instances.

Deploying Keepalived on Top of ProxySQL Instances

Deploy Keepalived with ProxySQL - ClusterControl

Here we picked ProxySQL as the load balancer type, passed both ProxySQL instances for Keepalived to be installed on and we typed our VIP and network interface.

Topology View - ClusterControl

As you can see, we now have the whole setup up and ready. We have a VIP of 10.0.0.111 which is assigned to one of the ProxySQL instances. ProxySQL instances will redirect our traffic to the correct backend MySQL nodes and ClusterControl will keep an eye on the environment performing failover if needed. The last action we have to take is to reconfigure Wordpress to use the Virtual IP to connect to the database.

To do that, we have to edit wp-config.php and change the DB_HOST variable to our Virtual IP:

/** MySQL hostname */

define( 'DB_HOST', '10.0.0.111' );

Conclusion

From now on WordPress will connect to the database using the VIP and ProxySQL. In case the master node fails, ClusterControl will perform the failover.

ClusterControl Failover with ProxySQL

As you can see, a new master has been elected and ProxySQL now points to the new master in hostgroup 10.

We hope this blog post gives you some idea about how to design a highly-available database environment for a WordPress website and how ClusterControl can be used to deploy all of its elements.

Database Load Balancing Using HAProxy on Amazon AWS


When traffic to your database increases day after day, it can become hard to manage. When this situation happens it’s useful to distribute the traffic across multiple servers, thus improving performance. Depending on the application, however, this may not be possible (if you have a single configurable endpoint). To achieve a split, you will need to use a load balancer to perform the task. 

A load balancer can redirect applications to available/healthy database nodes and then failover when required. To deploy it, you don’t need a physical server as you can deploy it in the cloud; making it easier and faster. In this blog, we’ll take a look at the popular database load balancer HAProxy and how to deploy it to Amazon AWS both manually and with ClusterControl’s help.

What is HAProxy?

HAProxy is an open source proxy that can be used to implement high availability, load balancing, and proxying for TCP and HTTP based applications.

As a load balancer, HAProxy distributes traffic from one origin to one or more destinations and can define specific rules and/or protocols for this task. If any of the destinations stops responding, it is marked as offline, and the traffic is sent to the rest of the available destinations.

An Overview of Amazon EC2

Amazon Elastic Compute Cloud (or EC2) is a web service that provides resizable compute capacity in the cloud. It gives you complete control of your computing resources and allows you to set up and configure everything within your instances from the operating system up to your applications. It also allows you to quickly scale capacity, both up and down, as your computing requirements change.

Amazon EC2 supports different operating systems like Amazon Linux, Ubuntu, Windows Server, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Fedora, Debian, CentOS, Gentoo Linux, Oracle Linux, and FreeBSD.

Now, let’s see how to create an EC2 instance to deploy HAProxy there.

Creating an Amazon EC2 Instance

For this example, we’ll assume that you have an Amazon AWS account.

Go to the Amazon EC2 section, and press on Launch Instance. In the first step, you must choose the EC2 instance operating system.

Create Amazon EC2 Instance

In the next step, you must choose the resources for the new instance.

Choose an Amazon EC2 Instance Type

Then, you can specify a more detailed configuration like network, subnet, and more.

Configure Amazon EC2 Instance

We can now add more storage capacity to this new instance, though as it will only be a load balancer this is probably not necessary.

Amazon EC2 Add Storage

When we finish the creation task, we can go to the Instances section to see our new EC2 instance.

Launch Amazon EC2 Instance

Now that our EC2 instance is ready (Instance State running), we can deploy our load balancer here. For this task, we’ll see two different ways, manually and using ClusterControl.

How to Manually Install and Configure HAProxy

To install HAProxy on Linux, use the following commands in your EC2 instance:

On Ubuntu/Debian OS:

$ apt-get install haproxy -y

On CentOS/RedHat OS:

$ yum install haproxy -y

And then we need to edit the following configuration file to manage our HAProxy configuration:

$ /etc/haproxy/haproxy.cfg

Configuring our HAProxy is not complicated, but we need to know what we are doing. We have several parameters to configure, depending on how we want HAProxy to work. For more information, we can follow the documentation about the HAProxy configuration.

Let's look at a basic configuration example. Suppose that you have the following database topology:

Basic Load Balancer Configuration

We want to create an HAProxy listener to balance the read traffic between the three nodes.

listen haproxy_read

   bind *:5434

   balance     roundrobin

   server  node1 10.1.1.10:5432 check

   server  node2 10.1.1.11:5432 check

   server  node3 10.1.1.12:5432 check

As we mentioned before, there are several parameters to configure here, and this configuration depends on what we want to do. For example:

listen  haproxy_read

       bind *:5434

       mode tcp

       timeout client  10800s

       timeout server  10800s

       tcp-check expect string is\ running

       balance leastconn

       option tcp-check

       default-server port 9201 inter 2s downinter 5s rise 3 fall 2 slowstart 60s maxconn 64 maxqueue 128 weight 100

       server  node1 10.1.1.10:5432 check

       server  node2 10.1.1.11:5432 check

       server  node3 10.1.1.12:5432 check

Now, let’s see how ClusterControl can perform this task in an easy way.

How to Install and Configure HAProxy with ClusterControl

For this task, we’ll assume that you have ClusterControl installed (on-prem or in the cloud) and it’s currently managing your databases.

Go to ClusterControl -> Select Cluster -> Cluster Actions -> Add Load Balancer.

ClusterControl Cluster List

Here we must add the information that ClusterControl will use to install and configure our HAProxy load balancer.

Configure HAProxy in ClusterControl

The information that we need to introduce is:

Action: Deploy or Import.

Server Address: IP Address for our HAProxy server.

Listen Port (Read/Write): Port for read/write mode.

Listen Port (Read Only): Port for read only mode.

Policy: It can be:

  • leastconn: The server with the lowest number of connections receives the connection.
  • roundrobin: Each server is used in turns, according to their weights.
  • source: The source IP address is hashed and divided by the total weight of the running servers to designate which server will receive the request.

Install for read/write splitting: For master-slave replication.

Build from Source: We can choose Install from a package manager or build from source.

And we need to select which servers you want to add to the HAProxy configuration and some additional information like:

Role: It can be Active or Backup.

Include: Yes or No.

Connection address information.

Also, we can configure Advanced Settings like Admin User, Backend Name, Timeouts, and more.

When you finish the configuration and confirm the deployment, you can follow the progress in the Activity section of the ClusterControl UI.

Setup HAProxy Server ClusterControl

And when this finishes, we can go to ClusterControl -> Nodes -> HAProxy node, and check the current status.

HAProxy Node in ClusterControl

We can also monitor our HAProxy servers from ClusterControl checking the Dashboard section.

HAProxy Monitoring with ClusterControl

We can improve our HA design by adding a new HAProxy node and configuring the Keepalived service between them. All this can be performed by ClusterControl. 

What is Amazon Elastic Load Balancing?

HAProxy is not the only possibility to deploy a Load Balancer on AWS as they have their own product for this task. Amazon Elastic Load Balancing (or ELB) distributes incoming application or network traffic across multiple targets, such as Amazon EC2 instances, containers, and IP addresses, in multiple Availability Zones. 

You can add and remove compute resources from your load balancer as your needs change, without disrupting the overall flow of requests to your applications.

You can configure health checks, which are used to monitor the health of the compute resources so that the load balancer can send requests only to the healthy ones. You can also offload the work of encryption and decryption to your load balancer so that your compute resources can focus on their main work.

To configure it, go to the Amazon EC2 section, and click on the Load Balancers option in the left menu. There, we’ll see three different options.

Amazon EC2 Elastic Load Balancing ELB
  • Application Load Balancer: If you need a flexible feature set for your web applications with HTTP and HTTPS traffic. Operating at the request level, Application Load Balancers provide advanced routing and visibility features targeted at application architectures, including microservices and containers.
  • Network Load Balancer: If you need ultra-high performance, TLS offloading at scale, centralized certificate deployment, support for UDP, and static IP addresses for your application. Operating at the connection level, Network Load Balancers are capable of handling millions of requests per second securely while maintaining ultra-low latencies.
  • Classic Load Balancer: If you have an existing application running in the EC2-Classic network.

Conclusion

As we could see, a Load Balancer can help us manage our database traffic by balancing it between multiple servers. It’s also useful to improve our high availability environment by performing failover tasks. We can deploy it manually on AWS or by using ClusterControl in a fast and easy way. With ClusterControl (download for FREE!) we can also take advantage of different features like monitoring, management and scaling for different database technologies, and we can deploy this system on-prem or in the cloud.


An Overview of MongoDB Atlas: Part Two


In the first part of the blog “An Overview of MongoDB Atlas,” we looked at getting started with MongoDB Atlas, the initial setup and migration of an existing MongoDB Cluster to MongoDB Atlas. In this part we are going to continue to explore several management elements required for every MongoDB production system, such as security and business continuity. 

Database Security in MongoDB Atlas

Security always comes first. While it is important for all databases, for MongoDB it has a special meaning. In mid 2017 the internet was full of news regarding ransomware attacks which specifically targeted vulnerabilities in MongoDB systems. Hackers were hijacking MongoDB instances and asking for a ransom in exchange for the return of the stored data. There were warnings. Prior to these ransomware attacks bloggers and experts wrote about how many production instances were found to be vulnerable. It stirred up vibrant discussion around MongoDB security for a long time after.

We are now in 2019 and MongoDB is getting even more popular. The new major version (4.0) was recently released, and we have seen increased stability in MongoDB Atlas. But what has been done to increase security for NoSQL databases in the cloud? 

The ransomware and constant press must have had an impact on MongoDB, as we can clearly see that security is now at the center of the MongoDB ecosystem. MongoDB Atlas is no exception, as it now comes with built-in security controls for production data processing needs and many enterprise security features out of the box. The default approach (which caused the vulnerability) of the older versions is gone, and the database is now secured by default (network access, CRUD authorizations, etc.). It also comes with features you would expect to have in a modern production environment (auditing, temporary user access, etc.). 

But it doesn’t stop there. Since Atlas is an online solution, you can now use integrations with third parties like LDAP authentication, or modern MongoDB internet services like MongoDB Charts. MongoDB Atlas is built atop Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), which also offer high-security measures of their own. This great combination ensures MongoDB Atlas security standards are what we would expect. Let’s take a quick look at some of these key features.

MongoDB Atlas & Network Security

MongoDB Atlas builds clusters on top of your existing cloud infrastructure. When one chooses AWS, the customer data is stored in MongoDB Atlas systems. These systems are single-tenant, dedicated, AWS EC2 virtual servers which are created solely for an Atlas Customer. Amazon AWS data centers are compliant with several physical security and information security standards, but since we need an open network, it can raise concerns.

MongoDB Atlas dedicated clusters are deployed in a Virtual Private Cloud (VPC) with dedicated firewalls. Access must be granted by an IP whitelist or through VPC Peering. By default all access is disabled.

MongoDB requires the following network ports for Atlas...

  • 27016 for shards
  • 27015 for the BI connector
  • 27017 for server
  • If LDAP is enabled, MongoDB requires LDAP port 636 on the customer side to be open to 0.0.0.0/0 (entire Internet) traffic.

The network ports cannot be changed and TLS cannot be disabled. Access can also be isolated by IP whitelist. 
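Once your IP is whitelisted and a database user exists, applications connect over TLS with a connection string of this general shape (the cluster name and credentials below are placeholders, not a real deployment):

```conf
mongodb+srv://appUser:<password>@cluster0-abcde.mongodb.net/mydb?retryWrites=true&w=majority
```

The `+srv` form resolves the cluster's member hosts via DNS SRV records, so you don't have to update the string when Atlas replaces nodes.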

MongoDB Atlas Add Whitelist Entry

Additionally you can choose to access MongoDB Atlas via Bastion hosts. Bastion hosts are configured to require SSH keys (not passwords). They also require multi-factor authentication, and users must additionally be approved by senior management for backend access. 

MongoDB Atlas Role-Based Access Management

You can configure advanced, role-based access rules to control which users (and teams) can access, manipulate, and/or delete data in your databases. By default there are no users created so you will be prompted to create one.

MongoDB Atlas allows administrators to define permissions for a user or application as well as what data can be accessed when querying MongoDB. MongoDB Atlas provides the ability to provision users with roles specific to a project or database, making it possible to realize a separation of duties between different entities accessing and managing the data. The process is simple and fully interactive.

To create a new user go to the Security tab on the left side and choose between MongoDB users and MongoDB roles. 

MongoDB Atlas Add a New User

MongoDB Roles

MongoDB Atlas Add Custom Role

End-to-End Database Encryption in MongoDB Atlas

All the MongoDB Atlas data in transit is encrypted using Transport Layer Security (TLS). You have the flexibility to configure the minimum TLS protocol version. Encryption for data-at-rest is automated using encrypted storage volumes.

You can also integrate your existing security practices and processes with MongoDB Atlas to provide additional control over how you secure your environment. 

For the MongoDB Atlas Cluster itself, authentication is automatically enabled by default via SCRAM to ensure a secure system out of the box.

With Encryption Key Management you can bring your own encryption keys to your dedicated clusters for an additional layer of encryption on the database files, including backup snapshots.

MongoDB Atlas Encryption Key

Auditing in MongoDB Atlas

Granular database auditing answers detailed questions about system activity for deployments with multiple users by tracking all the commands run against the database. Auditing in MongoDB is only available in MongoDB Enterprise. You can write audit events to the console, to the syslog, to a JSON file, or to a BSON file. You configure the audit option using the --auditDestination qualifier. For example, to send audit events as JSON events to syslog use...

mongod --dbpath data/db --auditDestination syslog

MongoDB maintains a centralized log management system for collection, storage, and analysis of log data for production environments. This information can be used for health monitoring, troubleshooting, and for security purposes. Alerts are configured in the system in order to notify SREs of any operational concerns.

MongoDB Atlas Activity Feed

MongoDB Atlas LDAP Integration

User authentication and authorization against MongoDB Atlas clusters can be managed via a customer’s Lightweight Directory Access Protocol (LDAP) server over TLS. A single LDAP configuration applies to all database clusters within an Atlas project. LDAP servers are used to simplify access control and make permissions management more granular. 

For customers running their LDAP server in an AWS Virtual Private Cloud (VPC), a peering connection is recommended between that environment and the VPC containing their Atlas databases.

MongoDB Atlas LDAP Integration

MongoDB Business Continuity and Disaster Recovery

MongoDB Atlas creates and configures dedicated clusters on infrastructure provided by AWS, Azure and/or Google GCP. Data availability is subject to the infrastructure provider service Business Continuity Plans (BCP) and Disaster Recovery (DR) processes. MongoDB Atlas infrastructure service providers hold a number of certifications and audit reports for these controls. 

Database Backups in MongoDB Atlas

MongoDB Atlas backs up data, typically only seconds behind an operational system. MongoDB Atlas ensures continuous backup of replica sets, consistent, cluster-wide snapshots of sharded clusters, and point-in-time recovery. This fully-managed backup service uses Amazon S3 in the region nearest to the customer's database deployment.

Backup data is protected using server-side encryption. Amazon S3 encrypts backed up data at the object level as it writes it to disks in its data centers and decrypts it for you when you restore it. All keys are fully managed by AWS.

Atlas clusters deployed in Amazon Web Services and Microsoft Azure can take advantage of cloud provider snapshots which use the native snapshot capabilities of the underlying cloud provider. Backups are stored in the same cloud region as the corresponding cluster. For multi-region clusters, snapshots are stored in the cluster’s preferred region. 

Atlas offers the following methods to back up your data...

Continuous Database Backups

Continuous backups are available on M10+ clusters running server versions lower than 4.2. This is the legacy method of performing MongoDB backups. Atlas uses incremental snapshots to continuously back up your data; continuous backup snapshots are typically just a few seconds behind the operational system. Atlas ensures point-in-time backup of replica sets and consistent, cluster-wide snapshots of sharded clusters on its own, using S3 for storage.

Full-Copy Snapshots

Atlas uses the native snapshot capabilities of your cloud provider to support full-copy snapshots and localized snapshot storage.

MongoDB Atlas Data Lake

Using Atlas Data Lake to ingest your S3 data into Atlas clusters allows you to quickly query data stored in your AWS S3 buckets using the Mongo Shell, MongoDB Compass, and any MongoDB driver.

When you create a Data Lake, you will grant Atlas read-only access to S3 buckets in your AWS account and create a data configuration file that maps data from your S3 buckets to your MongoDB databases and collections. Atlas supports using any M10+ cluster, including Global Clusters, to connect to Data Lakes in the same project. 

MongoDB Atlas Data Lake

At the time of writing this blog, the following formats are supported.

  • Avro
  • Parquet
  • JSON
  • JSON/Gzipped
  • BSON
  • CSV (requires header row)
  • TSV (requires header row)

Conclusion

That’s all for now. I hope you enjoyed my two-part overview of MongoDB Atlas. Remember that ClusterControl also provides end-to-end management of MongoDB clusters and is a great, lower-cost alternative to MongoDB Atlas which can also be deployed in the cloud.

An Overview of the JOIN Methods in PostgreSQL


In my previous blog, we discussed various ways to select, or scan, data from a single table. But in practice, fetching data from a single table is not enough; it requires selecting data from multiple tables and then correlating among them. Correlating this data across tables is called joining tables, and it can be done in various ways. Since joining tables requires input data (e.g. from a table scan), a join can never be a leaf node in the generated plan.

E.g. consider a simple query such as SELECT * FROM TBL1, TBL2 WHERE TBL1.ID > TBL2.ID; and suppose the plan generated is as below:

Nested Loop Join

So here both tables are scanned first, and then they are joined together as per the correlation condition TBL1.ID > TBL2.ID.

In addition to the join method, the join order is also very important. Consider the below example:

SELECT * FROM TBL1, TBL2, TBL3 WHERE TBL1.ID=TBL2.ID AND TBL2.ID=TBL3.ID;

Consider that TBL1, TBL2 AND TBL3 have 10, 100 and 1000 records respectively. 

Suppose the condition TBL1.ID=TBL2.ID returns only 5 records, whereas TBL2.ID=TBL3.ID returns 100 records; then it is better to join TBL1 and TBL2 first, so that fewer records get joined with TBL3. The plan will be as shown below:

Nested Loop Join with Table Order
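The effect of join order can be illustrated with a small Python sketch (toy row counts, not the exact selectivities quoted in the text): both orders produce the same final result, but one carries far fewer intermediate rows into the second join.

```python
def equi_join(a, b):
    """Naive equi-join on plain key values; returns the matching keys."""
    b_set = set(b)
    return [x for x in a if x in b_set]

tbl1 = list(range(10))      # 10 rows
tbl2 = list(range(100))     # 100 rows
tbl3 = list(range(1000))    # 1000 rows

# Order 1: (TBL1 join TBL2) join TBL3 -- carries only 10 rows forward
inter_small = equi_join(tbl1, tbl2)
result_1 = equi_join(inter_small, tbl3)

# Order 2: (TBL2 join TBL3) join TBL1 -- carries 100 rows forward
inter_big = equi_join(tbl2, tbl3)
result_2 = equi_join(inter_big, tbl1)

print(len(inter_small), len(inter_big))      # -> 10 100
assert sorted(result_1) == sorted(result_2)  # same final answer either way
```

The planner's job is exactly this: pick the order that minimizes the intermediate result sizes while preserving the final answer.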

PostgreSQL supports the following join methods:

  • Nested Loop Join
  • Hash Join
  • Merge Join

Each of these join methods is equally useful depending on the query and other parameters, e.g. table data, join clause, selectivity, memory, etc. These join methods are implemented by most relational databases.

Let’s create and populate a couple of tables, which will be used frequently to better explain these join methods.

postgres=# create table blogtable1(id1 int, id2 int);

CREATE TABLE

postgres=# create table blogtable2(id1 int, id2 int);

CREATE TABLE

postgres=# insert into blogtable1 values(generate_series(1,10000),3);

INSERT 0 10000

postgres=# insert into blogtable2 values(generate_series(1,1000),3);

INSERT 0 1000

postgres=# analyze;

ANALYZE

In all our subsequent examples, we assume the default configuration parameters unless specified otherwise.

Nested Loop Join

Nested Loop Join (NLJ) is the simplest join algorithm wherein each record of outer relation is matched with each record of inner relation. The Join between relation A and B with condition A.ID < B.ID can be represented as below:

For each tuple r in A
       	For each tuple s in B
            	If (r.ID < s.ID)
                 	Emit output tuple (r,s)

Nested Loop Join (NLJ) is the most common joining method and can be used on almost any dataset with any type of join clause. Since this algorithm scans all tuples of the inner and outer relations, it is considered the most costly join operation.
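The pseudocode above translates directly into a runnable Python sketch (tuples reduced to plain ID values for brevity):

```python
def nested_loop_join(outer, inner, predicate):
    """Compare every outer tuple with every inner tuple: O(|A| * |B|)."""
    result = []
    for r in outer:           # For each tuple r in A
        for s in inner:       # For each tuple s in B
            if predicate(r, s):
                result.append((r, s))
    return result

# Join condition r < s, mirroring A.ID < B.ID
print(nested_loop_join([1, 3, 5], [2, 4], lambda r, s: r < s))
# -> [(1, 2), (1, 4), (3, 4)]
```

The double loop makes the quadratic cost obvious, which is why the planner avoids this method whenever a cheaper alternative exists.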

As per the above tables and data, the following query will result in a Nested Loop Join as shown below:

postgres=# explain select * from blogtable1 bt1, blogtable2 bt2 where bt1.id1 < bt2.id1;

                               QUERY PLAN

------------------------------------------------------------------------------

 Nested Loop  (cost=0.00..150162.50 rows=3333333 width=16)

   Join Filter: (bt1.id1 < bt2.id1)

   ->  Seq Scan on blogtable1 bt1  (cost=0.00..145.00 rows=10000 width=8)

   ->  Materialize  (cost=0.00..20.00 rows=1000 width=8)

         ->  Seq Scan on blogtable2 bt2  (cost=0.00..15.00 rows=1000 width=8)

(5 rows)

Since the join clause is “<”, the only possible join method here is Nested Loop Join.

Notice here a new kind of node, Materialize; this node acts as an intermediate result cache, i.e. instead of fetching all tuples of a relation multiple times, the result fetched the first time is stored in memory, and subsequent requests for tuples are served from memory instead of fetching from the relation pages again. If all tuples cannot fit in memory, the spill-over tuples go to a temporary file. It is mostly useful for Nested Loop Join, and to some extent for Merge Join, as both rely on rescanning the inner relation. A Materialize node is not limited to caching the result of a relation; it can cache the results of any node below it in the plan tree.

TIP: If the join clause is “=” and a Nested Loop Join is chosen between two relations, then it is really important to investigate whether a more efficient join method, such as Hash or Merge Join, can be chosen by tuning configuration (e.g. work_mem, among others) or by adding an index, etc.

Some queries may not have a join clause at all; in that case, too, the only choice is Nested Loop Join. E.g. consider the below query as per the pre-setup data:

postgres=# explain select * from blogtable1, blogtable2;

                             QUERY PLAN

--------------------------------------------------------------------------

 Nested Loop  (cost=0.00..125162.50 rows=10000000 width=16)

   ->  Seq Scan on blogtable1  (cost=0.00..145.00 rows=10000 width=8)

   ->  Materialize  (cost=0.00..20.00 rows=1000 width=8)

      ->  Seq Scan on blogtable2  (cost=0.00..15.00 rows=1000 width=8)

(4 rows)

The join in the above example is just a Cartesian product of both tables.

Hash Join

This algorithm works in two phases:

  • Build Phase: A Hash table is built using the inner relation records. The hash key is calculated based on the join clause key.
  • Probe Phase: An outer relation record is hashed based on the join clause key to find matching entry in the hash table.

The join between relation A and B with condition A.ID = B.ID can be represented as below:

  • Build Phase
    • For each tuple r in inner relation B
      • Insert r into hash table HashTab with key r.ID
  • Probe Phase
    • For each tuple s in outer relation A
      • For each tuple r in bucket HashTab[s.ID]
        • If (s.ID = r.ID)
          • Emit output tuple (r,s)
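A minimal Python sketch of the two phases (again with plain key values standing in for tuples):

```python
from collections import defaultdict

def hash_join(outer, inner):
    """Equi-join: build a hash table on the inner relation, probe with the outer."""
    # Build phase: hash every inner tuple on its join key
    hash_tab = defaultdict(list)
    for r in inner:
        hash_tab[r].append(r)
    # Probe phase: look up each outer tuple's key in the hash table
    return [(s, r) for s in outer for r in hash_tab.get(s, [])]

print(hash_join([1, 2, 2, 7], [2, 3]))  # -> [(2, 2), (2, 2)]
```

Note that only the inner (smaller) relation needs to fit in memory, which is why the planner builds the hash table on the smaller input.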

As per the above pre-setup tables and data, the following query will result in a Hash Join as shown below:

postgres=# explain select * from blogtable1 bt1, blogtable2 bt2 where bt1.id1 = bt2.id1;

                               QUERY PLAN

------------------------------------------------------------------------------

 Hash Join  (cost=27.50..220.00 rows=1000 width=16)

   Hash Cond: (bt1.id1 = bt2.id1)

   ->  Seq Scan on blogtable1 bt1  (cost=0.00..145.00 rows=10000 width=8)

   ->  Hash  (cost=15.00..15.00 rows=1000 width=8)

         ->  Seq Scan on blogtable2 bt2  (cost=0.00..15.00 rows=1000 width=8)

(5 rows) 

Here the hash table is created on the table blogtable2 because it is the smaller table, so minimal memory is required for the hash table and the whole hash table can fit in memory.

Merge Join

Merge Join is an algorithm wherein each record of the outer relation is matched with records of the inner relation until no further join clause match is possible. This join algorithm is only used if both relations are sorted and the join clause operator is “=”. The join between relations A and B with condition A.ID = B.ID can be represented as below:

For each tuple r in A
       	For each tuple s in B
            	If (r.ID = s.ID)
                 	Emit output tuple (r,s)
                 	Break;
            	If (r.ID > s.ID)
                 	Continue;
            	Else
                 	Break;
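The same idea is usually implemented with two cursors that each walk their (sorted) input only once; a minimal Python sketch of that two-cursor form:

```python
def merge_join(outer, inner):
    """Equi-join of two inputs already sorted on the join key."""
    result, i, j = [], 0, 0
    while i < len(outer) and j < len(inner):
        if outer[i] < inner[j]:
            i += 1
        elif outer[i] > inner[j]:
            j += 1
        else:
            # Emit every inner tuple sharing this key, then advance the outer cursor
            k = j
            while k < len(inner) and inner[k] == outer[i]:
                result.append((outer[i], inner[k]))
                k += 1
            i += 1
    return result

print(merge_join([1, 3, 5], [3, 3, 4, 5]))  # -> [(3, 3), (3, 3), (5, 5)]
```

Because both inputs are sorted, neither relation is rescanned from the start, which is what makes Merge Join attractive when sorted input is available (e.g. from an index scan).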

The example query which resulted in a Hash Join above can result in a Merge Join if indexes are created on both tables. This is because the table data can then be retrieved in sorted order via the index, which is one of the major criteria for the Merge Join method:

postgres=# create index idx1 on blogtable1(id1);

CREATE INDEX

postgres=# create index idx2 on blogtable2(id1);

CREATE INDEX

postgres=# explain select * from blogtable1 bt1, blogtable2 bt2 where bt1.id1 = bt2.id1;

                                   QUERY PLAN

---------------------------------------------------------------------------------------

 Merge Join  (cost=0.56..90.36 rows=1000 width=16)

   Merge Cond: (bt1.id1 = bt2.id1)

   ->  Index Scan using idx1 on blogtable1 bt1  (cost=0.29..318.29 rows=10000 width=8)

   ->  Index Scan using idx2 on blogtable2 bt2  (cost=0.28..43.27 rows=1000 width=8)

(4 rows)

So, as we see, both tables are using an index scan instead of a sequential scan, which means both tables will emit records in sorted order.

Configuration

PostgreSQL supports various planner-related configuration parameters, which can be used to hint the query optimizer not to select a particular kind of join method. If the join method chosen by the optimizer is not optimal, these configuration parameters can be switched off to force the query optimizer to choose a different join method. All of these configuration parameters are “on” by default. Below are the planner configuration parameters specific to join methods.

  • enable_nestloop: It corresponds to Nested Loop Join.
  • enable_hashjoin: It corresponds to Hash Join.
  • enable_mergejoin: It corresponds to Merge Join.

There are many plan-related configuration parameters used for various purposes; in this blog, we restrict ourselves to the ones for join methods.

These parameters can be modified per session. So if we want to experiment with plans from a particular session, we can manipulate these configuration parameters there while other sessions continue to work as-is.

Now, consider the above examples of Merge Join and Hash Join. Without an index, the query optimizer selected a Hash Join for the below query, but after changing the configuration it switches to a Merge Join even without an index:

postgres=# explain select * from blogtable1, blogtable2 where blogtable1.id1 = blogtable2.id1;

                             QUERY PLAN

--------------------------------------------------------------------------

 Hash Join  (cost=27.50..220.00 rows=1000 width=16)

   Hash Cond: (blogtable1.id1 = blogtable2.id1)

   ->  Seq Scan on blogtable1  (cost=0.00..145.00 rows=10000 width=8)

   ->  Hash  (cost=15.00..15.00 rows=1000 width=8)

      ->  Seq Scan on blogtable2  (cost=0.00..15.00 rows=1000 width=8)

(5 rows)



postgres=# set enable_hashjoin to off;

SET

postgres=# explain select * from blogtable1, blogtable2 where blogtable1.id1 = blogtable2.id1;

                             QUERY PLAN

----------------------------------------------------------------------------

 Merge Join  (cost=874.21..894.21 rows=1000 width=16)

   Merge Cond: (blogtable1.id1 = blogtable2.id1)

   ->  Sort  (cost=809.39..834.39 rows=10000 width=8)

      Sort Key: blogtable1.id1

      ->  Seq Scan on blogtable1  (cost=0.00..145.00 rows=10000 width=8)

   ->  Sort  (cost=64.83..67.33 rows=1000 width=8)

      Sort Key: blogtable2.id1

      ->  Seq Scan on blogtable2  (cost=0.00..15.00 rows=1000 width=8)

(8 rows)

Initially Hash Join is chosen because the data from the tables is not sorted. In order to choose the Merge Join plan, the executor would first need to sort all records retrieved from both tables and only then apply the merge join. The cost of sorting is additional, so the overall cost increases. In this case, the total cost (including sorting) of Merge Join is more than the total cost of Hash Join, so Hash Join is chosen.

Once the configuration parameter enable_hashjoin is changed to “off”, the query optimizer directly assigns the hash join a disable cost (=1.0e10, i.e. 10000000000.00). The cost of any other possible join will be less than this. So the same query results in a Merge Join after enable_hashjoin is changed to “off”, as even including the sorting cost, the total cost of Merge Join is less than the disable cost.
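Conceptually, the planner's choice then reduces to picking the minimum-cost plan after the disable cost has been added; a simplified Python sketch (not PostgreSQL's actual planner code):

```python
# Disabling a method inflates its cost by a huge constant, so any other
# viable plan wins -- unless the disabled method is the only plan available.
DISABLE_COST = 1.0e10

def cheapest(plans, disabled=()):
    """Return the name of the lowest-cost plan, applying the disable cost."""
    costed = {name: cost + (DISABLE_COST if name in disabled else 0.0)
              for name, cost in plans.items()}
    return min(costed, key=costed.get)

plans = {"hash_join": 220.0, "merge_join": 894.21}  # costs from the plans above
assert cheapest(plans) == "hash_join"
assert cheapest(plans, disabled={"hash_join"}) == "merge_join"
# With only one possible method, disabling it changes nothing but the cost:
assert cheapest({"nested_loop": 150162.5}, disabled={"nested_loop"}) == "nested_loop"
```

This also explains the plans shown further below: a "disabled" Nested Loop Join still wins when it is the only candidate, just at an inflated cost.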

Now consider the below example:

postgres=# explain select * from blogtable1, blogtable2 where blogtable1.id1 < blogtable2.id1;

                             QUERY PLAN

--------------------------------------------------------------------------

 Nested Loop  (cost=0.00..150162.50 rows=3333333 width=16)

   Join Filter: (blogtable1.id1 < blogtable2.id1)

   ->  Seq Scan on blogtable1  (cost=0.00..145.00 rows=10000 width=8)

   ->  Materialize  (cost=0.00..20.00 rows=1000 width=8)

      ->  Seq Scan on blogtable2  (cost=0.00..15.00 rows=1000 width=8)

(5 rows)



postgres=# set enable_nestloop to off;

SET

postgres=# explain select * from blogtable1, blogtable2 where blogtable1.id1 < blogtable2.id1;

                             QUERY PLAN

--------------------------------------------------------------------------

 Nested Loop  (cost=10000000000.00..10000150162.50 rows=3333333 width=16)

   Join Filter: (blogtable1.id1 < blogtable2.id1)

   ->  Seq Scan on blogtable1  (cost=0.00..145.00 rows=10000 width=8)

   ->  Materialize  (cost=0.00..20.00 rows=1000 width=8)

      ->  Seq Scan on blogtable2  (cost=0.00..15.00 rows=1000 width=8)

(5 rows)

As we can see above, even though the Nested Loop Join configuration parameter is changed to “off”, it still chooses Nested Loop Join, as there is no alternative join method available. In simpler terms, since Nested Loop Join is the only possible join, whatever its cost it will always be the winner (same as I used to be the winner of the 100m race whenever I ran alone...:-)). Also, notice the difference in cost between the first and second plans: the first plan shows the actual cost of the Nested Loop Join, while the second one shows the disable cost of the same.

Conclusion

All of the PostgreSQL join methods are useful and get selected based on the nature of the query, data, join clause, etc. If a query is not performing as expected, i.e. a join method is not selected as expected, the user can play around with the different plan configuration parameters available and see if something is missing.

Cloud Vendor Deep-Dive: PostgreSQL on Microsoft Azure


If you have followed Microsoft lately, it will come as no surprise that the provider of a competing database product, namely SQL Server, has also jumped on the PostgreSQL bandwagon. From releasing 60,000 patents to OIN to being a Platinum sponsor at PGCon, Microsoft, as one of the corporate organizations backing PostgreSQL, has taken every opportunity to show that not only can you run PostgreSQL on Microsoft, but the reverse is also true: Microsoft, through its cloud offering, can run PostgreSQL for you. The statement became even clearer with the acquisition of Citus Data and the release of their flagship product in the Azure Cloud under the name of Hyperscale. It is safe to say that PostgreSQL adoption is growing, and now there are even more good reasons to choose it.

My journey through the Azure cloud started right at the landing page, where I met the contenders: Single Server and a preview (in other words, no SLA provided) release of Hyperscale (Citus). This blog will focus on the former. While on this journey, I had the opportunity to practice what open source is all about, namely giving back to the community, in this case by providing feedback to the documentation. To Microsoft’s credit, they make this very easy by piping the feedback straight into GitHub:

Github: My Azure Documentation Feedback Issues

PostgreSQL Compatibility with Azure

Versioning

According to the product documentation, Single Server targets PostgreSQL versions in the n-2 major range:

Azure Database for PostgreSQL: Single server PostgreSQL versions

As a solution built for performance, Single Server is recommended for data sets of 100 GB and larger. The servers provide predictable performance: the database instances come with a predefined number of vCores and IOPS (based on the size of the provisioned storage).

Extensions

There is a fair number of Supported Extensions, with some of them installed out of the box:

postgres@pg10:5432 postgres> select name, default_version, installed_version from pg_available_extensions where name !~ '^postgis' order by name;

            name             | default_version | installed_version

------------------------------+-----------------+-------------------

address_standardizer         | 2.4.3 |

address_standardizer_data_us | 2.4.3           |

btree_gin                    | 1.2 |

btree_gist                   | 1.5 |

chkpass                      | 1.0 |

citext                       | 1.4 |

cube                         | 1.2 |

dblink                       | 1.2 |

dict_int                     | 1.0 |

earthdistance                | 1.1 |

fuzzystrmatch                | 1.1 |

hstore                       | 1.4 |

hypopg                       | 1.1.1 |

intarray                     | 1.2 |

isn                          | 1.1 |

ltree                        | 1.1 |

orafce                       | 3.7 |

pg_buffercache               | 1.3 | 1.3

pg_partman                   | 2.6.3 |

pg_prewarm                   | 1.1 |

pg_qs                        | 1.1 |

pg_stat_statements           | 1.6 | 1.6

pg_trgm                      | 1.3 |

pg_wait_sampling             | 1.1 |

pgcrypto                     | 1.3 |

pgrouting                    | 2.5.2 |

pgrowlocks                   | 1.2 |

pgstattuple                  | 1.5 |

plpgsql                      | 1.0 | 1.0

plv8                         | 2.1.0 |

postgres_fdw                 | 1.0 |

tablefunc                    | 1.0 |

timescaledb                  | 1.1.1 |

unaccent                     | 1.1 |

uuid-ossp                    | 1.1 |

(35 rows)

PostgreSQL Monitoring on Azure

Server monitoring relies on a set of metrics that can be neatly grouped to create a custom dashboard:

Azure Database for PostgreSQL: Single server --- Metrics

Those familiar with Graphviz or Blockdiag are likely to appreciate the option of exporting the entire dashboard to a JSON file:

Azure Database for PostgreSQL: Single server --- Metrics

Furthermore metrics can — and they should — be linked to alerts:

Azure Database for PostgreSQL: Single Server --- Available Alerts

Query statistics can be tracked by means of Query Store and visualized with Query Performance Insight. For that, a couple of Azure specific parameters will need to be enabled:

postgres@pg10:5432 postgres> select * from pg_settings where name ~ 'pgms_wait_sampling.query_capture_mode|pg_qs.query_capture_mode';

-[ RECORD 1 ]---+------------------------------------------------------------------------------------------------------------------

name            | pg_qs.query_capture_mode

setting         | top

unit            |

category        | Customized Options

short_desc      | Selects which statements are tracked by pg_qs. Need to reload the config to make change take effect.

extra_desc      |

context         | superuser

vartype         | enum

source          | configuration file

min_val         |

max_val         |

enumvals        | {none,top,all}

boot_val        | none

reset_val       | top

sourcefile      |

sourceline      |

pending_restart | f

-[ RECORD 2 ]---+------------------------------------------------------------------------------------------------------------------

name            | pgms_wait_sampling.query_capture_mode

setting         | all

unit            |

category        | Customized Options

short_desc      | Selects types of wait events are tracked by this extension. Need to reload the config to make change take effect.

extra_desc      |

context         | superuser

vartype         | enum

source          | configuration file

min_val         |

max_val         |

enumvals        | {none,all}

boot_val        | none

reset_val       | all

sourcefile      |

sourceline      |

pending_restart | f

In order to visualize the slow queries and waits we proceed to the Query Performance widget:

Long Running Queries​​​

Azure Database for PostgreSQL: Single server --- Long running queries graph

Wait Statistics

Azure Database for PostgreSQL: Single server --- wait statistics

PostgreSQL Logging on Azure

The standard PostgreSQL logs can be downloaded, or exported to Log Analytics for more advanced parsing:

Azure Database for PostgreSQL: Single server --- Log Analytics

PostgreSQL Performance and Scaling with Azure

While the number of vCores can be easily increased or decreased, this action will trigger a server restart:

Azure Database for PostgreSQL: Single server PostgreSQL versions

In order to achieve zero downtime, applications must be able to gracefully handle transient errors.
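In practice, handling transient errors means retrying with backoff. A minimal, driver-agnostic Python sketch (the TransientError class and flaky_connect function are stand-ins for illustration, not Azure or driver APIs):

```python
import time

class TransientError(Exception):
    """Stands in for a driver's transient connection error."""

def with_retries(operation, attempts=5, base_delay=0.1):
    """Retry an operation with exponential backoff on transient errors."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt))

# Simulate a connection that fails twice (e.g. during a restart), then succeeds
state = {"calls": 0}
def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise TransientError("server restarting")
    return "connected"

result = with_retries(flaky_connect)
print(result)  # -> connected
```

In a real application the retried operation would be the database connection or statement execution, wrapped with whichever exception types your driver raises for transient failures.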

For tuning queries, Azure provides the DBA with Performance Recommendations, in addition to the preloaded pg_stat_statements and pg_buffercache extensions:

Azure Database for PostgreSQL: Single server --- Performance Recommendations screen

High Availability and Replication on Azure

Database server high availability is achieved by means of a node based hardware replication. This ensures that in the case of hardware failure, a new node can be brought up within tens of seconds.

Azure provides a redundant gateway as a network connection endpoint for all database servers within a region.

PostgreSQL Security on Azure

By default firewall rules deny access to the PostgreSQL instance. Since an Azure database server is the equivalent of a database cluster the access rules will apply to all databases hosted on the server.

In addition to IP addresses, firewall rules can reference a virtual network, a feature available only for the General Purpose and Memory Optimized tiers.

Azure Database for PostgreSQL: Single server --- Firewall --- Adding a VNet

One thing I found peculiar in the firewall web interface — I could not navigate away from the page while changes were being saved:

Azure Database for PostgreSQL: Single server --- change security rules in progress pop-up screen when attempting to navigate away

Data at rest is encrypted using a Server-Managed Key, and cloud users cannot disable the encryption. Data in transit is also encrypted; the “SSL required” setting can only be changed after the database server is created. Just like the data at rest, backups are encrypted, and the encryption cannot be disabled.

Advanced Threat Protection provides alerts and recommendations on a number of database access requests that are considered a security risk. The feature is currently in preview. To demonstrate, I simulated a password brute force attack:

~ $ while : ; do psql -U $(pwgen -s 20 1)@pg10 ; sleep 0.1 ; done

psql: FATAL:  password authentication failed for user "AApT6z4xUzpynJwiNAYf"

psql: FATAL:  password authentication failed for user "gaNeW8VSIflkdnNZSpNV"

psql: FATAL:  password authentication failed for user "SWZnY7wGTxdLTLcbqnUW"

psql: FATAL:  password authentication failed for user "BVH2SC12m9js9vZHcuBd"

psql: FATAL:  password authentication failed for user "um9kqUxPIxeQrzWQXr2v"

psql: FATAL:  password authentication failed for user "8BGXyg3KHF3Eq3yHpik1"

psql: FATAL:  password authentication failed for user "5LsVrtBjcewd77Q4kaj1"

....

Check the PostgreSQL logs:

2019-08-19 07:13:50 UTC-5d5a4c2e.138-FATAL:  password authentication failed

for user "AApT6z4xUzpynJwiNAYf"

2019-08-19 07:13:50 UTC-5d5a4c2e.138-DETAIL:  Role "AApT6z4xUzpynJwiNAYf" does not exist.

   Connection matched pg_hba.conf line 3: "host all all 173.180.222.170/32 password"

2019-08-19 07:13:51 UTC-5d5a4c2f.13c-LOG:  connection received: host=173.180.222.170 port=27248 pid=316

2019-08-19 07:13:51 UTC-5d5a4c2f.13c-FATAL:  password authentication failed for user "gaNeW8VSIflkdnNZSpNV"

2019-08-19 07:13:51 UTC-5d5a4c2f.13c-DETAIL:  Role "gaNeW8VSIflkdnNZSpNV" does not exist.

   Connection matched pg_hba.conf line 3: "host all all 173.180.222.170/32 password"

2019-08-19 07:13:52 UTC-5d5a4c30.140-LOG:  connection received: host=173.180.222.170 port=58256 pid=320

2019-08-19 07:13:52 UTC-5d5a4c30.140-FATAL:  password authentication failed for user "SWZnY7wGTxdLTLcbqnUW"

2019-08-19 07:13:52 UTC-5d5a4c30.140-DETAIL:  Role "SWZnY7wGTxdLTLcbqnUW" does not exist.

   Connection matched pg_hba.conf line 3: "host all all 173.180.222.170/32 password"

2019-08-19 07:13:53 UTC-5d5a4c31.148-LOG:  connection received: host=173.180.222.170 port=32984 pid=328

2019-08-19 07:13:53 UTC-5d5a4c31.148-FATAL:  password authentication failed for user "BVH2SC12m9js9vZHcuBd"

2019-08-19 07:13:53 UTC-5d5a4c31.148-DETAIL:  Role "BVH2SC12m9js9vZHcuBd" does not exist.

   Connection matched pg_hba.conf line 3: "host all all 173.180.222.170/32 password"

2019-08-19 07:13:53 UTC-5d5a4c31.14c-LOG:  connection received: host=173.180.222.170 port=43384 pid=332

2019-08-19 07:13:54 UTC-5d5a4c31.14c-FATAL:  password authentication failed for user "um9kqUxPIxeQrzWQXr2v"

2019-08-19 07:13:54 UTC-5d5a4c31.14c-DETAIL:  Role "um9kqUxPIxeQrzWQXr2v" does not exist.

   Connection matched pg_hba.conf line 3: "host all all 173.180.222.170/32 password"

2019-08-19 07:13:54 UTC-5d5a4c32.150-LOG:  connection received: host=173.180.222.170 port=27672 pid=336

2019-08-19 07:13:54 UTC-5d5a4c32.150-FATAL:  password authentication failed for user "8BGXyg3KHF3Eq3yHpik1"

2019-08-19 07:13:54 UTC-5d5a4c32.150-DETAIL:  Role "8BGXyg3KHF3Eq3yHpik1" does not exist.

   Connection matched pg_hba.conf line 3: "host all all 173.180.222.170/32 password"

2019-08-19 07:13:55 UTC-5d5a4c33.154-LOG:  connection received: host=173.180.222.170 port=12712 pid=340

2019-08-19 07:13:55 UTC-5d5a4c33.154-FATAL:  password authentication failed for user "5LsVrtBjcewd77Q4kaj1"

2019-08-19 07:13:55 UTC-5d5a4c33.154-DETAIL:  Role "5LsVrtBjcewd77Q4kaj1" does not exist.

The email alert arrived about 30 minutes later:

Azure Database for PostgreSQL: Single server --- Advanced Threat Protection email alert

In order to allow fine-grained access to the database server, Azure provides RBAC (Role-Based Access Control), a cloud-native access control feature and one more tool in the arsenal of the PostgreSQL cloud DBA. This is as close as we can get to the ubiquitous pg_hba access rules.

PostgreSQL Backup and Recovery on Azure

Regardless of pricing tiers, backups are retained between 7 and 35 days. The pricing tier also influences the ability to restore data.

Point-in-time recovery is available via the Azure Portal or the CLI and, according to the documentation, is as granular as five minutes. The portal functionality is rather limited — the date picker widget blindly shows the last 7 days as possible dates to select, even though I created the server today. Also, there is no verification performed on the recovery target time — I expected that entering a value outside the recovery interval would trigger an error preventing the wizard from continuing:

Azure Database for PostgreSQL: Single server --- point-in-time restore screen

Once the restore process is started, an error, presumably caused by the out-of-range value, pops up about a minute later:

Azure Database for PostgreSQL: Single server --- Activity Log error message on restore failure

…but, unfortunately, the error message was not very helpful:

Azure Database for PostgreSQL: Single server --- Activity Log error details on restore failure
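The missing validation could easily be replicated client-side before launching a restore. A minimal sketch (the 7-day window corresponds to the shortest retention period mentioned above; the function name is mine, not part of any Azure SDK):

```python
from datetime import datetime, timedelta

def validate_restore_target(target: datetime, now: datetime,
                            retention_days: int = 7) -> None:
    """Raise ValueError if the requested point-in-time restore target
    falls outside the backup retention window."""
    earliest = now - timedelta(days=retention_days)
    if target < earliest:
        raise ValueError(f"target {target:%Y-%m-%d %H:%M} is before the "
                         f"earliest restorable time {earliest:%Y-%m-%d %H:%M}")
    if target > now:
        raise ValueError("target is in the future")
```

A check like this before submitting the restore request would have surfaced the problem immediately instead of a minute into the operation.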

Lastly, backup storage is free for retention periods of up to 7 days. That could prove extremely handy for development environments.

Hints and Tips

Limits

Get accustomed with the Single Server Limits.

Connectivity

Always use the connection string in order for the connection to be routed to the correct database server.
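Azure Database for PostgreSQL Single Server routes connections through a regional gateway, which is why the username must carry the server name in the documented user@servername form. A small helper illustrating the format (the server and user names below are hypothetical):

```python
def azure_pg_dsn(server: str, user: str, password: str,
                 database: str = "postgres") -> str:
    """Build a libpq connection string for Azure Database for PostgreSQL
    Single Server: the gateway expects user@servername, and SSL is
    enforced by default on new servers."""
    return (f"host={server}.postgres.database.azure.com "
            f"user={user}@{server} dbname={database} "
            f"password={password} sslmode=require")
```

Forgetting the @servername suffix is one of the most common causes of "role does not exist" style authentication failures against the gateway.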

Replication

For disaster recovery scenarios, locate read replicas in one of the paired regions.

Roles

Just as is the case with AWS and GCloud, there is no superuser access.

GUCs

Parameters requiring a server restart or superuser access cannot be configured.

Scaling

During auto-scaling, applications should retry until the new node is brought up.
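The retry advice above is usually implemented as exponential backoff around the connection attempt. A generic sketch (not tied to any particular driver):

```python
import time

def retry(op, attempts: int = 5, base_delay: float = 0.5):
    """Call op(), retrying with exponential backoff on failure.
    Intended for transient connection errors while a scaled node
    is being brought up."""
    for attempt in range(attempts):
        try:
            return op()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))
```

In practice `op` would be a function that opens a new database connection; the last failure is re-raised so the application can alert.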

Memory amount and IOPS cannot be specified — memory is allocated in units of GB per vCore, up to a maximum of 320GB (32vCores x 10GB), and IOPS are dependent on the size of the provisioned storage to a maximum of 6000 IOPS. At this time Azure offers a large storage preview option with a maximum of 20,000 IOPS.
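The sizing rules above can be expressed as a quick calculator. Note the assumptions: the 10 GB-per-vCore ratio comes from the 32 vCores x 10 GB figure in the text, and the 3 IOPS per provisioned GB (with a 100 IOPS floor) reflects Azure's documented storage scaling for this service — verify both against current documentation before relying on them:

```python
def memory_gb(vcores: int, gb_per_vcore: int = 10, cap_gb: int = 320) -> int:
    """Memory is allocated in GB-per-vCore units, capped at 320 GB."""
    return min(vcores * gb_per_vcore, cap_gb)

def iops(storage_gb: int, per_gb: int = 3, floor: int = 100,
         cap: int = 6000) -> int:
    """IOPS scale with provisioned storage between a floor and the 6000 cap."""
    return max(floor, min(storage_gb * per_gb, cap))
```

For example, 2 TB of provisioned storage already hits the 6000 IOPS ceiling, which is why the large storage preview (20,000 IOPS) matters for I/O-heavy workloads.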

Servers created in the Basic tier cannot be upgraded to General Purpose or Memory Optimized.

Storage

Ensure that the auto-grow feature is enabled — if the amount of data exceeds the provisioned storage space, the database will enter read-only mode.

Storage can only be scaled up. As with all the other cloud providers, storage allocation cannot be decreased, and I couldn't find any explanation for it. Given the state-of-the-art equipment the big cloud players can afford, there should be no reason not to provide features similar to LVM online data relocation. Then again, storage is really cheap nowadays, so there is little reason to think about scaling down until the next major version upgrade.

Firewall

In some cases, updates to firewall rules may take up to five minutes to propagate.

A server located in the same subnet as the application servers will not be reachable until the appropriate firewall rules are in place.

Virtual network rules do not allow cross-region access and as a result, dblink and postgres_fdw cannot be used to connect to databases outside the Azure cloud.

The VNet/Subnet approach cannot be applied to Web Apps as their connections originate from public IP addresses.

Large virtual networks will be unavailable while the service endpoints are enabled.

Encryption

For applications that require server certificate validation, the file is available for download from DigiCert. Microsoft made it easy and you shouldn't have to worry about renewal until 2025:

~ $ openssl x509 -in BaltimoreCyberTrustRoot.crt.pem -noout -dates

notBefore=May 12 18:46:00 2000 GMT

notAfter=May 12 23:59:00 2025 GMT

Intrusion Detection System

The preview release of Advanced Threat Protection is not available for the Basic tier instances.

Backup and Restore

For applications that cannot afford a region downtime, consider configuring the server with geo-redundant backup storage. This option can only be enabled at the time of creating the database server.

The requirement for reconfiguring the cloud firewall rules after a PITR operation is particularly important.

Deleting a database server removes all backups.

Following the restore, there are certain post-restore tasks that will have to be performed.

Unlogged tables are recommended for bulk inserts in order to boost performance, however, they are not replicated.

Monitoring

Metrics are recorded every minute and stored for 30 days.

Logging

Query Store is a global option, meaning that it applies to all databases. Read-only transactions and queries longer than 6,000 bytes are problematic. By default, the captured queries are retained for 7 days.

Performance

Query Performance Insight recommendations are currently limited to create and drop index.

Disable pg_stat_statements when not needed.

Replace uuid_generate_v4 with gen_random_uuid(). This is in line with the recommendation in the official PostgreSQL documentation; see Building uuid-ossp.

High Availability and Replication

There is a limit of five read replicas. Write-intensive applications should avoid using read replicas as the replication mechanism is asynchronous which introduces some delays that applications must be able to tolerate. Read replicas can be located in a different region.

REPLICA support can only be enabled after the server has been created. The feature requires a server restart:

Azure Database for PostgreSQL: Single server --- enabling replication

Read replicas do not inherit the firewall rules from the master node:

Azure Database for PostgreSQL: Single server --- read replica missing firewall rules after creation

Failover to a read replica is not automatic. The failover mechanism is node-based.

There is a long list of Considerations that needs to be reviewed before configuring read replicas.

Creating replicas takes a long time, even when I tested with a relatively small data set:

Azure Database for PostgreSQL: Single server --- Replicas creation taking a long time
 
Vacuum

Review the key autovacuum parameters; note that Azure Database for PostgreSQL ships with values tuned away from the upstream defaults (for example, autovacuum_vacuum_scale_factor is 0.05 versus the upstream 0.2):

postgres@pg10:5432 postgres> select name,setting from pg_settings where name ~ '^autovacuum.*';
                name                 |  setting
-------------------------------------+-----------
 autovacuum                          | on
 autovacuum_analyze_scale_factor     | 0.05
 autovacuum_analyze_threshold        | 50
 autovacuum_freeze_max_age           | 200000000
 autovacuum_max_workers              | 3
 autovacuum_multixact_freeze_max_age | 400000000
 autovacuum_naptime                  | 15
 autovacuum_vacuum_cost_delay        | 20
 autovacuum_vacuum_cost_limit        | -1
 autovacuum_vacuum_scale_factor      | 0.05
 autovacuum_vacuum_threshold         | 50
 autovacuum_work_mem                 | -1
(12 rows)
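With these settings, autovacuum processes a table once its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor x reltuples — the standard PostgreSQL trigger formula, shown here as a quick calculator:

```python
def autovacuum_trigger(reltuples: int, threshold: int = 50,
                       scale_factor: float = 0.05) -> int:
    """Dead-tuple count at which autovacuum kicks in for a table,
    per the PostgreSQL formula: threshold + scale_factor * reltuples.
    Defaults mirror the server settings listed above."""
    return int(threshold + scale_factor * reltuples)
```

For a table of one million rows this gives roughly 50,050 dead tuples — four times sooner than the upstream 0.2 scale factor would, which suits a managed service where bloat cannot be cleaned up with superuser tricks.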

Upgrades

Automatic major version upgrades are not supported. As mentioned earlier, a major upgrade (performed via dump and restore) is also a cost-saving opportunity, as it is the one chance to scale down the auto-grown storage.

PostgreSQL Azure Enhancements

Timeseries

TimescaleDB is available as an extension (it is not part of the PostgreSQL contrib modules), and it is just a few clicks away. The only drawback is the older version 1.1.1, while the upstream version is currently at 1.4.1 (2019-08-01).

postgres@pg10:5432 postgres> CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
WARNING:
WELCOME TO
[TimescaleDB ASCII-art banner]
               Running version 1.1.1
For more information on TimescaleDB, please visit the following links:

 1. Getting started: https://docs.timescale.com/getting-started
 2. API reference documentation: https://docs.timescale.com/api
 3. How TimescaleDB is designed: https://docs.timescale.com/introduction/architecture

CREATE EXTENSION

postgres@pg10:5432 postgres> \dx timescaledb
                                    List of installed extensions
    Name     | Version | Schema |                            Description
-------------+---------+--------+-------------------------------------------------------------------
 timescaledb | 1.1.1   | public | Enables scalable inserts and complex queries for time-series data
(1 row)

Logging

In addition to PostgreSQL logging options, Azure Database for PostgreSQL can be configured to record additional diagnostics events.

Firewall

Azure Portal includes a handy feature for allowing connections from the IP addresses logged in to the portal:

Azure Database for PostgreSQL: Single server --- Firewall --- Add Client IP Address

I noted the feature as it makes it easy for developers and system administrators to allow themselves in, and it stands out as a feature offered by neither AWS nor GCloud.

Conclusion

Azure Database for PostgreSQL Single Server offers enterprise level services, however, many of these services are still in preview mode: Query Store, Performance Insight, Performance Recommendation, Advanced Threat Protection, Large Storage, Cross-region Read Replicas.

While operating system knowledge is no longer required for administering PostgreSQL in the Azure cloud, the DBA is expected to acquire skills which are not limited to the database itself — Azure networking (VNet), connection security (firewall), log viewer and analytics along with KQL, Azure CLI for handy scripting, and the list goes on.

Lastly, for those planning to migrate their PostgreSQL workloads to Azure, there are a number of resources available along with a select list of Azure Partners including Credativ, one of the PostgreSQL major sponsors and contributors.

 

Comparing Galera Cluster Cloud Offerings: Part Three Microsoft Azure


Microsoft Azure is known to many as an alternative public cloud platform to Amazon AWS. It's not easy to directly compare these two giant companies. Microsoft's cloud business -- dubbed commercial cloud -- includes everything from Azure to Office 365 enterprise subscriptions to Dynamics 365 to LinkedIn services. After LinkedIn was acquired by Microsoft it began moving its infrastructure to Azure. While moving LinkedIn to Azure could take some time, it demonstrates Microsoft Azure’s capabilities and ability to handle millions of transactions. Microsoft's strong enterprise heritage, software stack, and data center tools offer both familiarity and a hybrid approach to cloud deployments.

Microsoft Azure is built as both an Infrastructure as a Service (IaaS) and a Platform as a Service (PaaS). The Azure Virtual Machine offers per-second billing and is currently a multi-tenant compute service. Azure has, however, recently previewed a new offering which allows virtual machines to run on single-tenant physical servers. The offering is called Azure Dedicated Hosts.

Azure also offers specialized large instances (such as for SAP HANA), multi-tenant block and file storage, and many other additional IaaS and PaaS capabilities. These include object storage (Azure Blob Storage), a CDN, a Docker-based container service (Azure Container Service), a batch computing service (Azure Batch), and event-driven "serverless computing" (Azure Functions). The Azure Marketplace offers third-party software and services. Colocation needs are met via partner exchanges (Azure ExpressRoute) offered by partners like Equinix and CoreSite.

With all of these offerings Microsoft Azure has stepped up its game to play a vital role in the public cloud market. The PaaS infrastructure offered to its consumers has garnered a lot of trust and many are moving their own infrastructure or private cloud to Microsoft Azure's public cloud infrastructure. This is especially advantageous for consumers who need integration with other Windows Services, such as Visual Studio.

So what’s different between Azure and the other clouds we have looked at in this series? Microsoft has focused heavily on AI, analytics, and the Internet of Things. AzureStack is another “cloud-meets-data center” effort that has been a real differentiator in the market.

Microsoft Azure Migration Pros & Cons

There are several things you should consider when moving your legacy applications or infrastructure to Microsoft Azure.

Strengths

  • Enterprises that are strategically committed to Microsoft technology generally choose Azure as their primary IaaS+PaaS provider. The integrated end-to-end experience for enterprises building .NET applications using Visual Studio (and related services) is unsurpassed. Microsoft is also leveraging its tremendous sales reach and ability to co-sell Azure with other Microsoft products and services in order to drive adoption.
  • Azure provides a well-integrated approach to edge computing and Internet of Things (IoT), with offerings that reach from its hyperscale data center out through edge solutions such as AzureStack and Data Box Edge.
  • Microsoft Azure’s capabilities have become increasingly innovative and open. 50% of the workloads are Linux-based alongside numerous open-source application stacks. Microsoft has a unique vision for the future that involves bringing in technology partners through native, first-party offerings such as those from VMware, NetApp, Red Hat, Cray and Databricks.

Cautions

  • Microsoft Azure’s reliability issues continue to be a challenge for customers, largely as a result of Azure’s growing pains. Since September 2018, Azure has had multiple service-impacting incidents, including significant outages involving Azure Active Directory. These outages leave customers with no ability to mitigate the downtime.
  • Gartner clients often experience challenges with executing on-time implementations within budget. This comes from Microsoft often setting unreasonably high expectations for customers. Much of it stems from Microsoft's field sales teams being "encouraged" to appropriately position and sell Azure within its customer base.
  • Enterprises frequently lament the quality of Microsoft technical support (along with the increasing cost of support) and field solution architects. This negatively impacts customer satisfaction, and slows Azure adoption and therefore customer spending.

Microsoft may not be your first choice as it has been seen as a "not-so-open-source-friendly" tech giant, but in fairness it has embraced a lot of activity and support within the open source world. Microsoft Azure offers fully-managed services for most of the top open source RDBMS databases, such as PostgreSQL, MySQL, and MariaDB.

Galera Cluster (Percona, Codership, or MariaDB) variants, unfortunately, aren't supported by Azure. The only way you can deploy your Galera Cluster to Azure is by means of a Virtual Machine. You may also want to check their blog on using MariaDB Enterprise Cluster (which is based on Galera) on Azure.

Azure's Virtual Machine

Virtual Machine is the equivalent offering for compute instances in GCP and AWS. An Azure Virtual Machine is an on-demand, high-performance computing server in the cloud and can be deployed in Azure using various methods. These might include the user interface within the Azure portal, using pre-configured images in the Azure marketplace, scripting through Azure PowerShell, deploying from a template that is defined by using a JSON file, or by deploying directly through Visual Studio.

Azure uses a deployment model called the Azure Resource Manager (ARM), which defines all resources that form part of your overall application solution, allowing you to deploy, update, or delete your solution in a single operation.

Resources may include the storage account, network configurations, and IP addresses. You may have heard the term “ARM templates”, which essentially means the JSON template which defines the different aspects of your solution which you are trying to deploy.

Azure Virtual Machines come in different types and sizes, with names beginning with A-series to N-series. Each VM type is built with specific workloads or performance needs in mind, including general purpose, compute optimized, storage optimized or memory optimized. You can also deploy less common types like GPU or high performance compute VMs.

Similar to other public cloud offerings, you can do the following in your virtual machine instances...

  • Encrypt your disk on the virtual machine, although this does not come as easily as in GCP and AWS. Encrypting your virtual machine requires a more manual approach: you must first complete the Azure Disk Encryption prerequisites. Since Galera does not support Windows, we're only talking here about Linux-based images. Basically, it requires you to have the dm-crypt and vfat modules present in the system. Once you get that piece right, you can encrypt the VM using the Azure CLI. Check out how to Enable Azure Disk Encryption for Linux IaaS VMs to see how to do it. Encrypting your disk is very important, especially if your company or organization requires your Galera Cluster data to follow the standards mandated by laws and regulations such as PCI DSS or GDPR.
  • Creating a snapshot. You can create a snapshot either using the Azure CLI or through the portal. Check their manual on how to do it.
  • Use auto scaling or Virtual Machine Scale Sets if you require horizontal scaling. Check out the overview of autoscaling in Azure or the overview of virtual machine scale sets.
  • Multi Zone Deployment. Deploy your virtual machine instances into different availability zones to avoid a single point of failure.

You can also create (or get information from) your virtual machines in different ways. You can use the Azure portal, Azure PowerShell, REST APIs, Client SDKs, or with the Azure CLI. Virtual machines in the Azure virtual network can also easily be connected to your organization’s network and treated as an extended datacenter.

Microsoft Azure Pricing

Just like other public cloud providers, Microsoft Azure also offers a free tier with some free services. It also offers pay-as-you-go options and reserved instances to choose from. Pay-as-you-go starts at $0.008/hour - $0.126/hour.

Microsoft Azure Pricing

For reserved instances, the longer you commit and contract with Azure, the more you save on the cost. Microsoft Azure claims to help subscribers save up to 72% of their billing costs compared to its pay-as-you-go model when subscribers sign up for a one to three year term for a Windows or Linux Virtual Machine. Microsoft also offers added flexibility in the sense that if your business needs change, you can cancel your Azure RI subscription at any time and return the remaining unused RI to Microsoft as an early termination fee.
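The claimed up-to-72% saving is easy to sanity-check. A quick sketch (the rates below are illustrative, not actual Azure prices):

```python
def monthly_cost(hourly_rate: float, hours: int = 730) -> float:
    """Cost of running an instance for a ~730-hour month."""
    return hourly_rate * hours

def savings_pct(payg_hourly: float, reserved_hourly: float) -> float:
    """Percentage saved by a reserved instance vs pay-as-you-go."""
    return 100.0 * (1 - reserved_hourly / payg_hourly)
```

For example, a reserved effective rate of $0.028/hour against a $0.10/hour pay-as-you-go rate works out to the advertised 72% — but remember the early termination fee mentioned above if you cancel the reservation.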

Let's check out its pricing in comparison with GCP, AWS EC2, and an Azure Virtual Machine. This is based on the us-east1 region, comparing the price ranges for the compute instances required to run your Galera Cluster.

Machine/instance type comparison (Google Compute Engine, AWS EC2, Microsoft Azure):

Shared
  • Google Compute Engine: f1-micro, g1-small; prices start at $0.006 – $0.019 hourly
  • AWS EC2: t2.nano – t3a.2xlarge; prices start at $0.0058 – $0.3328 hourly
  • Microsoft Azure: B-series; prices start at $0.0052 – $0.832 hourly

Standard
  • Google Compute Engine: n1-standard-1 – n1-standard-96; prices start at $0.034 – $3.193 hourly
  • AWS EC2: m4.large – m4.16xlarge, m5.large – m5d.metal; prices start at $0.1 – $5.424 hourly
  • Microsoft Azure: Av2 Standard, D2-64 v3 (latest generation), D2s-64s v3 (latest generation), D1-5 v2, DS1-S5 v2, DC-series; prices start at $0.043 – $3.072 hourly

High Memory / Memory Optimized
  • Google Compute Engine: n1-highmem-2 – n1-highmem-96, n1-megamem-96, n1-ultramem-40 – n1-ultramem-160; prices start at $0.083 – $17.651 hourly
  • AWS EC2: r4.large – r4.16xlarge, x1.16xlarge – x1.32xlarge, x1e.xlarge – x1e.32xlarge; prices start at $0.133 – $26.688 hourly
  • Microsoft Azure: D2a – D64a v3, D2as – D64as v3, E2-64 v3 (latest generation), E2a – E64a v3, E2as – E64as v3, E2s-64s v3 (latest generation), D11-15 v2, DS11-S15 v2, M-series, Mv2-series, Extreme Memory Optimized; prices start at $0.043 – $44.62 hourly

High CPU / Storage Optimized
  • Google Compute Engine: n1-highcpu-2 – n1-highcpu-32; prices start at $0.05 – $2.383 hourly
  • AWS EC2: h1.2xlarge – h1.16xlarge, i3.large – i3.metal, i3en.large – i3en.metal, d2.xlarge – d2.8xlarge; prices start at $0.156 – $10.848 hourly
  • Microsoft Azure: Fsv2-series, F-series, Fs-series; prices start at $0.0497 – $3.045 hourly
 

Data Encryption on Microsoft Azure

Microsoft Azure does not offer encryption support directly for Galera Cluster (or vice-versa). There are, however, ways you can encrypt data either at-rest or in-transit.

Encryption in-transit is a mechanism for protecting data when it's transmitted across networks. Microsoft uses encryption to protect customer data when it's in transit between the customer's realm and Microsoft cloud services. More specifically, Transport Layer Security (TLS) is the protocol that Microsoft's data centers use to negotiate with client systems connected to Microsoft cloud services.

Perfect Forward Secrecy (PFS) is also employed so that each connection between customers' client systems and Microsoft's cloud services uses unique keys. Connections to Microsoft cloud services also take advantage of RSA-based 2,048-bit encryption key lengths.

Encryption At-Rest

For many organizations, data encryption at-rest is a mandatory step towards achieving data privacy, compliance, and data sovereignty. Three Azure features provide encryption of data at-rest:

  • Storage Service Encryption is always enabled and automatically encrypts storage service data when writing it to Azure Storage. If your application logic requires your MySQL Galera Cluster database to store valuable data, then storing to Azure Storage can be an option.
  • Client-side encryption also provides the feature of encryption at-rest.
  • Azure Disk Encryption enables you to encrypt the OS disks and data disks that an IaaS virtual machine uses. Azure Disk Encryption also supports enabling encryption on Linux VMs that are configured with disk striping (RAID) by using mdadm, and by enabling encryption on Linux VMs by using LVM for data disks

Galera Cluster Multi-AZ/Multi-Region/Multi-Cloud Deployments with Azure

Similar to AWS and GCP, Microsoft Azure does not offer direct support for deploying a Galera Cluster in a multi-AZ, multi-region, or multi-cloud setup. You can, however, deploy your nodes manually or create scripts using PowerShell or the Azure CLI to do this for you. Alternatively, when you provision your Virtual Machine instances you can place your nodes in different availability zones. Microsoft Azure also offers another type of redundancy, aside from availability zones, called Virtual Machine Scale Sets. You can check the differences between virtual machines and scale sets.

Galera Cluster High Availability, Scalability, and Redundancy on Azure

One of the primary reasons for using a Galera cluster is high availability, redundancy, and the ability to scale. If you are serving traffic globally, it's best to serve your traffic by region, and your architectural design should include geo-distribution of your database nodes. In order to achieve this, multi-AZ, multi-region, or multi-cloud/multi-datacenter deployments are recommended. This prevents the cluster from going down, or malfunctioning due to lack of quorum.

As mentioned earlier, Microsoft Azure has an auto-scaling solution which can be leveraged using scale sets. This allows you to autoscale nodes when a certain threshold has been met, based on the health and load metrics you are monitoring. You can check out their tutorial on this topic here.

For multi-region or multi-cloud deployments, Galera has its own parameter, gmcast.segment, which can be set at server start. This parameter is designed to optimize the communication between the Galera nodes and minimize the amount of traffic sent between network segments. This includes writeset relaying and IST and SST donor selection. This type of setup allows you to deploy multiple nodes in different regions. Aside from that, you can also spread your Galera nodes across different cloud vendors — GCP, AWS, Microsoft Azure — or an on-premise setup.
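For example, a node in a remote data center might carry a my.cnf fragment like the following (the segment numbering and file path are illustrative; nodes in the primary region would use segment 0):

```ini
# /etc/mysql/my.cnf on a Galera node in the second region.
# All nodes in the same data center share the same segment value.
[mysqld]
wsrep_provider_options="gmcast.segment=1"
```

With segments assigned, Galera relays writesets once per segment rather than once per node, and prefers same-segment donors for IST/SST.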

We recommend you to check out our blog Multiple Data Center Setups Using Galera Cluster for MySQL or MariaDB and Zero Downtime Network Migration With MySQL Galera Cluster Using Relay Node to gather more information on how to implement these types of deployments.

Galera Cluster Database Performance on Microsoft Azure

The underlying host machines used by virtual machines in Azure are, in fact, very powerful. The newest VMs in Azure come equipped with network optimization modules. You can check this in your kernel info by running (e.g. on Ubuntu):

uname -r|grep azure

Note: Make certain that the output contains the "azure" string.

For CentOS/RHEL, installing Linux Integration Services (LIS) version 4.2 or later includes the network optimization. To learn more about this, visit the page on optimizing network throughput.

If your application is very sensitive to network latency, you might be interested in looking at the proximity placement group. It's currently in preview (and not yet recommended for production use) but this helps optimize your network throughput. 

The type of virtual machine to consume depends on your application's traffic and resource demands. For general workloads you can start with the Dv3 series; for memory-intensive queries, the memory-optimized Ev3 series; and for high CPU requirements, such as highly transactional databases or gaming applications, the Fsv2 series.

Choosing the right storage and required IOPS for your database volume is a must. Generally, an SSD-based persistent disk is your ideal choice. Begin with Standard SSD, which is cost-effective and offers consistent performance. This decision, however, might depend on whether you need more IOPS in the long run. If that is the case, you should go for Premium SSD storage.

We also recommend you to check and read our blog How to Improve Performance of Galera Cluster for MySQL or MariaDB to learn more about optimizing your Galera Cluster.

Database Backup for Galera Nodes on Azure

There's no existing native backup support for your MySQL Galera data in Azure, but you can take snapshots. Microsoft Azure offers Azure VM Backup, which takes snapshots that can be scheduled and encrypted.

Alternatively, if you want to backup the data files from your Galera Cluster, you can also use external services like ClusterControl, use Percona Xtrabackup for your binary backup, or use mysqldump or mydumper for your logical backups. These tools provide backup copies for your mission-critical data and you can read this if you want to learn more.

Galera Cluster Monitoring on Azure

Microsoft Azure has its own monitoring service named Azure Monitor. Azure Monitor maximizes the availability and performance of your applications by delivering a comprehensive solution for collecting, analyzing, and acting on telemetry from your cloud and on-premise environments. It helps you understand how your applications are performing and proactively identifies issues affecting them (and the resources they depend on). You can set up health alerts and get notified of advisories and alerts detected in the services you deployed.

If you want monitoring specific to your database, you will need to utilize external monitoring tools which offer advanced, highly-granular database metrics. There are several choices to pick from, such as PMM by Percona, DataDog, Idera, VividCortex, or our very own ClusterControl (monitoring is free with ClusterControl Community).

Galera Cluster Database Security on Azure

As discussed in our previous blogs on AWS and GCP, you can take the same approach to securing your database in the public cloud. Once you create a virtual machine you can specify which ports may be opened, or create and set up your Network Security Group in Azure. You can open only the ports that need to be reachable (particularly ports 3306, 4444, 4567, and 4568), or create a Virtual Network in Azure and place the nodes in private subnets if they are to remain private. In addition, if you set up your VMs in Azure without a public IP, they can still make outbound connections because Azure uses SNAT and PAT. If you're familiar with AWS and GCP, you may like this explanation as it makes it easier to comprehend.
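As a sketch only (the resource group, NSG name, and source subnet below are hypothetical), a single Azure CLI rule can open the MySQL and Galera replication ports to the application subnet:

```shell
# Allow MySQL (3306), SST (4444), and Galera replication (4567, 4568)
# traffic from the application subnet only.
az network nsg rule create \
  --resource-group galera-rg \
  --nsg-name galera-nsg \
  --name allow-galera-ports \
  --priority 1000 \
  --access Allow --protocol Tcp \
  --source-address-prefixes 10.0.1.0/24 \
  --destination-port-ranges 3306 4444 4567 4568
```

Keeping the source prefix narrow is the NSG equivalent of the per-host rules you would otherwise write in a firewall on the node itself.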

Another feature available is Role-Based Access Control (RBAC) in Microsoft Azure. This gives you control over which people have access to the specific resources they need.

In addition to this, you can secure your data in-transit by using a TLS/SSL connection or by encrypting your data at-rest. If you're using ClusterControl, deploying secure data in-transit is simple and easy. You can check out our blog SSL Key Management and Encryption of MySQL Data in Transit if you want to try it out. For data at-rest, you can follow the discussion earlier in the Encryption section of this blog.

Galera Cluster Troubleshooting 

Microsoft Azure offers a wide array of log types to aid troubleshooting and auditing: Activity logs, Azure diagnostics logs, Azure AD reporting, virtual machine and cloud service logs, Network Security Group (NSG) flow logs, and Application Insights. It might not always be necessary to go through all of these when troubleshooting, but they add insight and clues when checking the logs.

If you're using ClusterControl, go to Logs -> System Logs and you'll be able to browse the captured error logs taken from the MySQL Galera node itself. Apart from this, ClusterControl provides real-time monitoring that amplifies your alarm and notification system in case of an emergency or if your MySQL Galera node(s) go down.

Conclusion

As we finish this three-part blog series, we have shown you the offerings and advantages of each of the tech giants serving the public cloud industry. There are advantages and disadvantages when selecting one over the other, but what matters most is your reason for moving to a public cloud, its benefits for your organization, and how it serves the requirements of your application.

The choice of provider for your Galera Cluster may involve financial considerations, like "what's most cost-efficient" and what better suits your budgetary needs. It could also be driven by privacy laws and regulatory compliance, or even by the technology stack you want to use. What's important is how your application and database will perform once in the cloud handling large amounts of traffic. It has to be highly available and resilient, have the right levels of scalability and redundancy, and take backups to ensure data recovery.

The Basics of Deploying a MongoDB Replica Set and Shards Using Puppet


Database systems perform best when they are integrated with well-defined approaches that facilitate both read and write throughput. MongoDB went the extra mile by embracing replication and sharding, with the aim of enabling horizontal as well as vertical scaling, as opposed to relational DBMSs, whose equivalent concepts only enhance vertical scaling.

Sharding distributes load among the members of the database cluster so that read operations are carried out with little latency. Without sharding, a single database server holding a large data set and handling high-throughput operations can be pushed past its capacity, and may fail if the necessary measures are not taken. For example, if the rate of queries is very high, the CPU capacity of the server will be overwhelmed.

Replication, on the other hand, is a concept whereby different database servers house the same data. It ensures high availability of data besides enhancing data integrity. Take a high-performing social media application: if the main serving database system fails, as in the case of a power blackout, we should have another system serving the same data. A good replica set should have at least 3 members, possibly an arbiter, and an appropriate electionTimeoutMillis. In replication, we have a master/primary node where all write operations are made and recorded in the oplog. From the oplog, the changes are applied to the other members, which in this case are referred to as secondary nodes or slaves. If the primary node does not communicate within electionTimeoutMillis, the other nodes are signaled to hold an election. The electionTimeoutMillis should be set neither too high nor too low: too high, and the system stays down for a long time after a failure, losing a lot of data; too low, and even temporary network latency triggers frequent elections, risking data inconsistency. An arbiter adds a vote to break a tie when electing a new master, but does not carry any data like the other members.
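To make these moving parts concrete, here is a sketch (as a plain Python dict, with hypothetical hostnames) of the configuration document such a replica set would pass to rs.initiate(): three data-bearing members, one arbiter, and an explicit electionTimeoutMillis:

```python
# Hypothetical replica set configuration document: three data-bearing
# members, one arbiter, and an explicit election timeout.
replset_config = {
    "_id": "replicaset1",
    "members": [
        {"_id": 0, "host": "host1:27017"},
        {"_id": 1, "host": "host2:27017"},
        {"_id": 2, "host": "host3:27017"},
        {"_id": 3, "host": "host0:27017", "arbiterOnly": True},
    ],
    # 10000 ms is MongoDB's default; tune it with the trade-off described above.
    "settings": {"electionTimeoutMillis": 10000},
}
```

In the mongo shell this document would be passed as rs.initiate(replset_config); the Puppet resources discussed later achieve the same end state declaratively.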

Why Use Puppet to Deploy a MongoDB Replica Set

More often, sharding is used hand in hand with replication. The process of configuring and maintaining a replica set is not easy due to:

  1. High chances of human error
  2. Incapability to carry out repetitive tasks automatically
  3. Time consuming especially when a large number of members is involved
  4. Possibility of work dissatisfaction
  5. Overwhelming complexity that may emerge.

In order to overcome the outlined setbacks, we settle on an automated system like Puppet, which has plenty of resources to help us work with ease.

In our previous blog, we learned the process of installing and configuring MongoDB with Puppet. However, it is important to understand the basic resources of Puppet, since we will be using them to configure our replica set and shards. In case you missed it, this is the manifest file for installing and running MongoDB on the machine you created:

package { 'mongodb':
  ensure => 'installed',
}

service { 'mongodb':
  ensure => 'running',
  enable => true,
}

So we can put the content above in a file called runMongoDB.pp and run it with the command 

$ sudo puppet apply runMongoDB.pp

Using the 'mongodb' module and functions, we can set up our replica set with the corresponding parameters for each mongodb resource.

MongoDB Connection

We need to establish a MongoDB connection between a node and the MongoDB server. The main aim of this is to prevent configuration changes from being applied if the MongoDB server cannot be reached, but it can potentially be used for other purposes like database monitoring. We use the mongodb_conn_validator resource:

mongodb_conn_validator { 'mongodb_validator':
  ensure   => present,
  server   => '127.0.0.1:27017',
  timeout  => 40,
  tcp_port => 27017,
}

name: in this case the name mongodb_validator defines the identity of the resource; it can also be considered a connection string.

server: a string or an array of strings containing the DNS names/IP addresses of the server where MongoDB should be running.

timeout: the maximum number of seconds the validator should wait before deciding that MongoDB is not running.

tcp_port: the TCP port on the MongoDB server against which the validator attempts its connection.

Creating the Database

mongodb_database { 'databaseName':
  ensure => present,
  tries  => 10,
}

This resource takes the following parameters:

name: in this case databaseName defines the name of the database we are creating; it could also have been declared explicitly as name => 'databaseName'.

tries: the maximum number of two-second retries to wait for MongoDB startup.

Creating MongoDB User

The mongodb_user resource enables one to create and manage users for a given database in the Puppet manifest.

mongodb_user { 'userprod':
  ensure        => present,
  username      => 'prodUser',
  password_hash => mongodb_password('prodUser', 'passProdser'),
  database      => 'prodUser',
  roles         => ['readWrite', 'dbAdmin'],
  tries         => 10,
}

Properties

username: defines the name of the user.

password_hash: this is the password hash of the user. The function mongodb_password() available on MongoDB 3.0 and later is used for creating the hash.

roles: this defines the roles that the user is allowed to execute on the target database.

password: this is the plain user password text.

database: defines the user’s target database.
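As an aside, the hash produced by mongodb_password() is the legacy MONGODB-CR digest. A minimal Python sketch of that scheme, for illustration only (this is an assumption about the module's implementation, not part of the original manifest):

```python
import hashlib

def mongodb_password(username, password):
    """Sketch of the puppet module's mongodb_password() function:
    the legacy MONGODB-CR digest, md5 of 'username:mongo:password'."""
    return hashlib.md5(f"{username}:mongo:{password}".encode()).hexdigest()
```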

Creating a Replica Set

We use the module mongodb_replset to create a replica set.

mongodb_replset { 'replicaset1':
  ensure          => present,
  members         => ['host0:27017', 'host1:27017', 'host2:27017', 'host3:27017'],
  initialize_host => 'host1:27017',
  arbiter         => 'host0:27017',
}

name: defines the name of the replica set.

members: an array of members the replica set will hold.

initialize_host: the host to be used in initializing the replica set.

arbiter: defines the replica set member that will be used as an arbiter.

Creating a MongoDB Shard

mongodb_shard { 'shard1':
  ensure  => present,
  members => ['shard1/host1:27017', 'shard1/host2:27017', 'shard1/host3:27017'],
  keys    => 'price',
}

name: defines the name of the shard.

members: this is the array of members the shard will hold.

keys: define the key to be used in the sharding or an array of keys that can be used to create a compound shard key.

Integrations & Services Available from MongoDB for the Cloud


MongoDB is a document data store that has been around for over a decade. In the last few years, MongoDB has evolved into a mature product featuring enterprise-grade options like scalability, security, and resilience. With the demanding cloud movement, however, that wasn't good enough.

Cloud resources, such as virtual machines, containers, serverless compute resources, and databases, are currently in high demand. These days, many software solutions can be spun up in a fraction of the time it used to take to deploy onto one's own hardware. This started a trend and changed the market's expectations at the same time.

But the quality of an online service is not limited to deployment alone. Users often need additional services, integrations, or extra features to help them do their work. Cloud offerings can still be very limited, and may cause more issues than the benefits you gain from automation and remote infrastructure.

So what is MongoDB Inc.'s approach to this common problem?

The answer was MongoDB Atlas, which brings internal extensions as part of a larger cloud/automation platform. With the addition of third-party components, MongoDB has flourished. In today's blog, we are going to see what they have developed and how it can help you address your data processing needs.

The items we will explore today are...

  • MongoDB Charts
  • MongoDB Stitch
  • MongoDB Kubernetes Integrations with Ops Manager
  • MongoDB Cloud migration
  • Fulltext Search
  • MongoDB Data Lake (beta)

MongoDB Charts

MongoDB Charts is one of the services accessible through the MongoDB Atlas platform. It simply provides an easy way to visualize your data living inside MongoDB. You don’t need to move your data to a different repository or write your own code as MongoDB Charts was designed to work with data documents and make it easy to visualize your data.

MongoDB Charts

MongoDB Charts makes communicating your data a straightforward process by providing built-in tools to easily share and collaborate on visualizations. Data visualization is a key component to providing a clear understanding of your data, highlighting correlations between variables and making it easy to discern patterns and trends within your dataset. 

Here are some key features which you can use in the Charts.

Aggregation

Aggregation framework is an operational process that manipulates documents in different stages, processes them in accordance with the provided criteria, and then returns the computed results. Values from multiple documents are grouped together, on which more operations can be performed to return matching results.

MongoDB Charts Aggregation

MongoDB Charts provides built-in aggregation functionality. Aggregation allows you to process your collection data by a variety of metrics and perform calculations such as mean and standard deviation.
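Under the hood, this maps to ordinary MongoDB aggregation pipelines. Here is a sketch of the kind of pipeline such a chart builds (collection and field names are hypothetical):

```python
# Group documents by category and compute the mean and (population)
# standard deviation of the price field.
pipeline = [
    {"$group": {
        "_id": "$category",
        "mean_price": {"$avg": "$price"},
        "stddev_price": {"$stdDevPop": "$price"},
    }}
]
# With pymongo this would run as: db.products.aggregate(pipeline)
```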

Charts provides seamless integration with MongoDB Atlas. You can link MongoDB Charts to Atlas projects and quickly get started visualizing your Atlas cluster data.

Document Data Handling

MongoDB Charts natively understands the benefits of the Document Data Model. It manages document-based data, including fixed objects and arrays. Using a nested data structure provides the flexibility to structure your data as it fits for your application while still maintaining visualization capabilities.

Charts is intuitive enough for non-developers to use, allowing for self-service data analysis, which makes it a great tool for data analytics teams.

MongoDB Stitch

Have you heard about serverless architecture? 

With Serverless, you compose your application into individual, autonomous functions. Each function is hosted by the serverless provider and can be scaled automatically as function call frequency increases or decreases. This turns out to be a very cost-effective way of paying for computing resources. You only pay for the times that your functions get called, rather than paying to have your application always on and waiting for requests on so many different instances.

MongoDB Stitch

MongoDB Stitch is a different kind of MongoDB service, taking only what's most useful from cloud infrastructure environments. It is a serverless platform that enables developers to build applications without having to set up server infrastructure. Stitch is built on top of MongoDB Atlas, automatically integrating the connection to your database. You can connect to Stitch through the Stitch Client SDKs, which are available for many of the platforms you develop for.

MongoDB Kubernetes Integrations with Ops Manager

Ops Manager is a management platform for MongoDB Clusters that you run on your own infrastructure. The capabilities of Ops Manager include monitoring, alerting, disaster recovery, scaling, deploying, and upgrading of Replica Sets and sharded clusters, and other MongoDB products. In 2018 MongoDB introduced beta integration with Kubernetes. 

The MongoDB Enterprise Operator is compatible with Kubernetes v1.11 and above. It has been tested against Openshift 3.11. This Operator requires Ops Manager or Cloud Manager. In this document, when we refer to "Ops Manager", you may substitute "Cloud Manager". The functionality is the same.

The installation is fairly simple and requires:

  • Installing the MongoDB Enterprise Operator. This could be done via helm or YAML file.
  • Gather Ops Manager properties. 
  • Create and apply a Kubernetes ConfigMap file
  • Create the Kubernetes secret object which will store the Ops Manager API Key

In this basic example we are going to use YAML file:

kubectl apply -f crds.yaml
kubectl apply -f https://raw.githubusercontent.com/mongodb/mongodb-enterprise-kubernetes/master/mongodb-enterprise.yaml

The next step is to obtain the following information, which we are going to use in the ConfigMap file. All of it can be found in Ops Manager.

  • Base URL. Base Url is the URL of your Ops Manager or Cloud Manager.
  • Project Id. The id of an Ops Manager Project which the Kubernetes Operator will deploy into.
  • User. An existing Ops Manager username
  • Public API Key. Used by the Kubernetes Operator to connect to the Ops Manager REST API endpoint

Now that we have acquired the necessary Ops Manager configuration information, we need to create a Kubernetes ConfigMap file. For exercise purposes, we can call this file my-project.yaml.

apiVersion: v1
kind: ConfigMap
metadata:
  name: <<Name>>
  namespace: mongodb
data:
  projectId: <<Project ID>>
  baseUrl: <<OpsManager URL>>

The next step is to apply the ConfigMap to Kubernetes and create the secret object:

kubectl apply -f my-project.yaml

kubectl -n mongodb create secret generic <<Name of credentials>> --from-literal="user=<<User>>" --from-literal="publicApiKey=<<public-api-key>>"

Once we have that in place, we can deploy our first cluster:

apiVersion: mongodb.com/v1
kind: MongoDbReplicaSet
metadata:
  name: <<Replica set name>>
  namespace: mongodb
spec:
  members: 3
  version: 4.2.0
  persistent: false
  project: <<Name value specified in metadata.name of ConfigMap file>>
  credentials: <<Name of credentials secret>>

For more detailed instructions please visit the MongoDB documentation. 

MongoDB Cloud migration

The Atlas Live Migration Service can migrate your data from your existing environment whether it's on AWS, Azure, GCP, or on-prem to MongoDB Atlas, the global cloud database for MongoDB.

The migration is done via a dedicated replication service. Atlas Live Migration process streams data through a MongoDB-controlled application server. 

Live migration works by keeping a cluster in MongoDB Atlas in sync with your source database. During this process, your application can continue to read and write from the source database. Since the process watches for incoming changes, everything is replicated, and the migration can be done online. You decide when to change the application connection settings and perform the cutover. To make the process less error-prone, Atlas provides a Validate option which checks whitelist IP access, SSL configuration, CA certificates, etc.

Full-Text Search

Full-text search is another cloud service provided by MongoDB, available only in MongoDB Atlas (non-Atlas MongoDB deployments can use text indexing instead). Atlas Full-Text Search is built on open source Apache Lucene, a powerful text search library with a custom query syntax for querying its indexes; it is the foundation of popular systems such as Elasticsearch and Apache Solr. It allows you to create a full-text search index and to search, save, and read from it. It's fully integrated into MongoDB Atlas, so there are no additional systems or infrastructure to provision or manage.
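To give a flavor of the query side, here is a sketch of an Atlas Full-Text Search aggregation stage (the index, collection, and field names are hypothetical):

```python
# Hypothetical $search stage: full-text query on the "title" field
# through an Atlas Search index named "default".
search_stage = {
    "$search": {
        "index": "default",
        "text": {"query": "adventure", "path": "title"},
    }
}
# It would be the first stage of an aggregation pipeline, e.g.
# db.movies.aggregate([search_stage, {"$limit": 10}])
```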

MongoDB Data Lake (beta)

The last MongoDB cloud feature we would like to mention is MongoDB Data Lake. It's a fairly new service addressing the popular concept of data lakes. A data lake is a vast pool of raw data, the purpose of which is not yet defined. Instead of placing data in a purpose-built data store, you move it into a data lake in its original format. This eliminates the upfront costs of data ingestion, like transformation. Once data is placed into the data lake, it is available to be queried when needed.

Using Atlas Data Lake to ingest your S3 data into Atlas clusters allows you to query data stored in your AWS S3 buckets using the Mongo Shell, MongoDB Compass, and any MongoDB driver.

There are some limitations, though: monitoring Data Lakes with the Atlas monitoring tools is not yet possible, only a single AWS S3 account is supported, there are IP whitelist and AWS security group limitations, and there is no possibility to add indexes.

 

Automatic Scaling with Amazon Aurora Serverless


Amazon Aurora Serverless provides an on-demand, auto-scalable, highly-available, relational database which only charges you when it’s in use. It provides a relatively simple, cost-effective option for infrequent, intermittent, or unpredictable workloads. What makes this possible is that it automatically starts up, scales compute capacity to match your application's usage, and then shuts down when it's no longer needed.

The following diagram shows Aurora Serverless high-level architecture.

Aurora Serverless high-level architecture

With Aurora Serverless, you get one endpoint (as opposed to two endpoints for the standard Aurora provisioned DB cluster). This is basically a DNS record backed by a fleet of proxies which sits on top of the database instance. From the MySQL server's point of view, it means that connections always come from the proxy fleet.

Aurora Serverless Auto-Scaling

Aurora Serverless is currently only available for MySQL 5.6. You basically have to set the minimum and maximum capacity units for the DB cluster; each capacity unit is equivalent to a specific compute and memory configuration. Aurora Serverless reduces the resources for the DB cluster when its workload is below these thresholds, and can reduce capacity down to the minimum or increase it up to the maximum capacity unit.

The cluster will automatically scale up if either of the following conditions are met:

  • CPU utilization is above 70% OR
  • More than 90% of connections are being used

The cluster will automatically scale down if both of the following conditions are met:

  • CPU utilization drops below 30% AND
  • Less than 40% of connections are being used.
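The two rule sets above can be sketched as simple predicates (a simplification of Aurora's internal logic, using just the thresholds listed):

```python
def should_scale_up(cpu_utilization, connections_used_fraction):
    """Scale up when CPU > 70% OR more than 90% of connections are in use."""
    return cpu_utilization > 0.70 or connections_used_fraction > 0.90

def should_scale_down(cpu_utilization, connections_used_fraction):
    """Scale down only when CPU < 30% AND less than 40% of connections are in use."""
    return cpu_utilization < 0.30 and connections_used_fraction < 0.40
```

Note the asymmetry: either condition triggers a scale-up, but both must hold before a scale-down is considered.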

Some of the notable things to know about Aurora automatic scaling flow:

  • It only scales up when it detects performance issues that can be resolved by scaling up.
  • After scaling up, the cooldown period for scaling down is 15 minutes. 
  • After scaling down, the cooldown period for the next scaling down again is 310 seconds.
  • It scales to zero capacity when there are no connections for a 5-minute period.

By default, Aurora Serverless performs the automatic scaling seamlessly, without cutting off any active database connections to the server. It is capable of determining a scaling point (a point in time at which the database can safely initiate the scaling operation). Under the following conditions, however, Aurora Serverless might not be able to find a scaling point:

  • Long-running queries or transactions are in progress.
  • Temporary tables or table locks are in use.

If either of the above cases happens, Aurora Serverless continues to try to find a scaling point so that it can initiate the scaling operation (unless "Force Scaling" is enabled). It does this for as long as it determines that the DB cluster should be scaled.

Observing Aurora Auto Scaling Behaviour

Note that in Aurora Serverless, only a small number of parameters can be modified and max_connections is not one of them. For all other configuration parameters, Aurora MySQL Serverless clusters use the default values. For max_connections, it is dynamically controlled by Aurora Serverless using the following formula: 

max_connections = GREATEST(
  {log(DBInstanceClassMemory/805306368)*45},
  {log(DBInstanceClassMemory/8187281408)*1000}
)

Where log is log2 (log base-2) and "DBInstanceClassMemory" is the number of bytes of memory allocated to the DB instance class associated with the current DB instance, less the memory used by the Amazon RDS processes that manage the instance. It's pretty hard to predetermine the value that Aurora will use, so it's good to run some tests to understand how this value scales.
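A direct Python transcription of the formula can be handy when experimenting. Treat it as indicative only: the DBInstanceClassMemory value Aurora actually plugs in is lower than the instance's nominal RAM, so the result will not match observed values exactly:

```python
import math

def aurora_max_connections(db_instance_class_memory):
    """Transcription of the max_connections formula above, with log = log2.
    db_instance_class_memory is in bytes."""
    return max(math.log2(db_instance_class_memory / 805306368) * 45,
               math.log2(db_instance_class_memory / 8187281408) * 1000)
```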

Here is our Aurora Serverless deployment summary for this test:

Aurora Serverless deployment summary

For this example I’ve selected a minimum of 1 Aurora capacity unit, which is equal to 2GB of RAM up until the maximum 256 capacity unit with 488GB of RAM.

Tests were performed using sysbench, simply sending out more and more threads until the limit of MySQL database connections was reached. Our first attempt, sending out 128 simultaneous database connections at once, failed outright:

$ sysbench \
/usr/share/sysbench/oltp_read_write.lua \
--report-interval=2 \
--threads=128 \
--delete_inserts=5 \
--time=360 \
--max-requests=0 \
--db-driver=mysql \
--db-ps-mode=disable \
--mysql-host=${_HOST} \
--mysql-user=sbtest \
--mysql-db=sbtest \
--mysql-password=password \
--tables=20 \
--table-size=100000 \
run

The above command immediately returned the 'Too many connections' error:

FATAL: unable to connect to MySQL server on host 'aurora-sysbench.cluster-cdw9q2wnb00s.ap-southeast-1.rds.amazonaws.com', port 3306, aborting...
FATAL: error 1040: Too many connections

When looking at the max_connection settings, we got the following:

mysql> select @@hostname, @@max_connections;
+----------------+-------------------+
| @@hostname     | @@max_connections |
+----------------+-------------------+
| ip-10-2-56-105 |                90 |
+----------------+-------------------+

It turns out the starting value of max_connections for our Aurora instance with one DB capacity unit (2GB RAM) is 90. This is actually way lower than the value we anticipated when estimating max_connections with the provided formula:

mysql> select GREATEST({log2(2147483648/805306368)*45},{log2(2147483648/8187281408)*1000});
+------------------------------------------------------------------------------+
| GREATEST({log2(2147483648/805306368)*45},{log2(2147483648/8187281408)*1000}) |
+------------------------------------------------------------------------------+
|                                                                     262.2951 |
+------------------------------------------------------------------------------+

This simply means that DBInstanceClassMemory is not equal to the actual memory of the Aurora instance; it must be much lower. According to this discussion thread, the variable's value is adjusted to account for memory already in use by OS services and the RDS management daemon.

Nevertheless, changing the default max_connections value to something higher won't help us either, since this value is dynamically controlled by the Aurora Serverless cluster. Thus, we had to reduce the sysbench starting threads value to 84, because Aurora internal threads already reserve around 4 to 5 connections via 'rdsadmin'@'localhost', and we also need an extra connection for our management and monitoring purposes.

So we executed the following command instead (with --threads=84):

$ sysbench \
/usr/share/sysbench/oltp_read_write.lua \
--report-interval=2 \
--threads=84 \
--delete_inserts=5 \
--time=600 \
--max-requests=0 \
--db-driver=mysql \
--db-ps-mode=disable \
--mysql-host=${_HOST} \
--mysql-user=sbtest \
--mysql-db=sbtest \
--mysql-password=password \
--tables=20 \
--table-size=100000 \
run

After the above test was completed in 10 minutes (--time=600), we ran the same command again and at this time, some of the notable variables and status had changed as shown below:

mysql> select @@hostname as hostname, @@max_connections as max_connections, (SELECT VARIABLE_VALUE from global_status where VARIABLE_NAME = 'THREADS_CONNECTED') as threads_connected, (SELECT VARIABLE_VALUE from global_status where VARIABLE_NAME = 'UPTIME') as uptime;
+--------------+-----------------+-------------------+--------+
| hostname     | max_connections | threads_connected | uptime |
+--------------+-----------------+-------------------+--------+
| ip-10-2-34-7 |             180 |               179 |    157 |
+--------------+-----------------+-------------------+--------+

Notice that max_connections has now doubled to 180, with a different hostname and a small uptime, as if the server had just started. From the application's point of view, it looks like another "bigger database instance" has taken over the endpoint, configured with a different max_connections value. Looking at the Aurora events, the following happened:

Wed, 04 Sep 2019 08:50:56 GMT The DB cluster has scaled from 1 capacity unit to 2 capacity units.

Then, we fired up the same sysbench command, creating another 84 connections to the database endpoint. After the generated stress test completed, the server had automatically scaled up to 4 DB capacity units, as shown below:

mysql> select @@hostname as hostname, @@max_connections as max_connections, (SELECT VARIABLE_VALUE from global_status where VARIABLE_NAME = 'THREADS_CONNECTED') as threads_connected, (SELECT VARIABLE_VALUE from global_status where VARIABLE_NAME = 'UPTIME') as uptime;
+---------------+-----------------+-------------------+--------+
| hostname      | max_connections | threads_connected | uptime |
+---------------+-----------------+-------------------+--------+
| ip-10-2-12-75 |             270 |                 6 |    300 |
+---------------+-----------------+-------------------+--------+

You can tell by the different hostname, max_connections, and uptime values compared to the previous ones. Another, bigger instance has "taken over" the role from the previous instance, whose DB capacity was 2. The actual scaling point is when the server load is dropping and almost hitting the floor. In our test, if we had kept the connections full and the database load consistently high, automatic scaling wouldn't have taken place.

By looking at both screenshots below, we can tell the scaling only happens when our Sysbench has completed its stress test for 600 seconds because that is the safest point to perform automatic scaling.

Serverless DB Capacity

CPU Utilization

When looking at Aurora events, the following events happened:

Wed, 04 Sep 2019 16:25:00 GMT Scaling DB cluster from 4 capacity units to 2 capacity units for this reason: Autoscaling.

Wed, 04 Sep 2019 16:25:05 GMT The DB cluster has scaled from 4 capacity units to 2 capacity units.

Finally, we generated many more connections, up to almost 270, and waited until the test finished in order to reach 8 DB capacity units:

mysql> select @@hostname as hostname, @@max_connections as max_connections, (SELECT VARIABLE_VALUE from global_status where VARIABLE_NAME = 'THREADS_CONNECTED') as threads_connected, (SELECT VARIABLE_VALUE from global_status where VARIABLE_NAME = 'UPTIME') as uptime;
+---------------+-----------------+-------------------+--------+
| hostname      | max_connections | threads_connected | uptime |
+---------------+-----------------+-------------------+--------+
| ip-10-2-72-12 |            1000 |               144 |    230 |
+---------------+-----------------+-------------------+--------+

In the 8-capacity-unit instance, the MySQL max_connections value is now 1000. We repeated similar steps, maxing out the database connections, up to the limit of 256 capacity units. The following table summarizes the DB capacity unit versus the max_connections value in our testing, up to the maximum DB capacity:

Amazon Aurora DB Capacity
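For quick reference, the values explicitly quoted in the text so far can be captured as a small mapping (the full table up to 256 capacity units is in the screenshot above):

```python
# max_connections observed per DB capacity unit in the tests above;
# only the values explicitly quoted in the text are included.
observed_max_connections = {1: 90, 2: 180, 4: 270, 8: 1000}
```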

Forced Scaling

As mentioned above, Aurora Serverless will only perform automatic scaling when it's safe to do so. However, the user has the option to force the DB capacity scaling to happen immediately by ticking on the Force scaling checkbox under 'Additional scaling configuration' option:

Amazon Aurora Capacity Settings

When forced scaling is enabled, the scaling happens as soon as the timeout (300 seconds) is reached. This behaviour may cause interruptions for your application, where active connections to the database may get dropped. We observed the following error when forced automatic scaling kicked in after reaching the timeout:

FATAL: mysql_drv_query() returned error 1105 (The last transaction was aborted due to an unknown error. Please retry.) for query 'SELECT c FROM sbtest19 WHERE id=52824'

FATAL: `thread_run' function failed: /usr/share/sysbench/oltp_common.lua:419: SQL error, errno = 1105, state = 'HY000': The last transaction was aborted due to an unknown error. Please retry.

This simply means that, instead of finding the right time to scale up, Aurora Serverless forces the instance replacement to take place immediately once it reaches the timeout, causing transactions to be aborted and rolled back. Retrying the aborted query a second time will likely solve the problem. This configuration can be used if your application is resilient to connection drops.
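If you enable forced scaling, build the retry into the application. A minimal sketch (execute_query stands in for your driver's query call; matching on the error-1105 text shown above is an assumption about how your driver surfaces the error):

```python
def run_with_retry(execute_query, sql, retries=1):
    """Run a query, retrying when the forced-scale 'transaction aborted'
    error (MySQL error 1105) is reported; re-raise anything else."""
    for attempt in range(retries + 1):
        try:
            return execute_query(sql)
        except Exception as exc:
            # Only retry the specific retryable error, and only while
            # attempts remain.
            if attempt == retries or "1105" not in str(exc):
                raise
```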

Summary

Amazon Aurora Serverless auto scaling is a vertical scaling solution, where a more powerful instance takes over from an inferior one, efficiently utilizing the underlying Aurora shared storage technology. By default, the auto scaling operation is performed seamlessly, with Aurora finding a safe scaling point at which to perform the instance switch. One has the option to force automatic scaling, with the risk of active database connections getting dropped.


Database Switchover and Failover for Drupal Websites Using MySQL or PostgreSQL


Drupal is a Content Management System (CMS) designed to create everything from tiny to large corporate websites. Over 1,000,000 websites run on Drupal and it is used to make many of the websites and applications you use every day (including this one). Drupal has a great set of standard features such as easy content authoring, reliable performance, and excellent security. What sets Drupal apart is its flexibility as modularity is one of its core principles. 

Drupal is also a great choice for creating integrated digital frameworks. You can extend it with the thousands of add-ons available: modules expand Drupal's functionality, themes let you customize your content's presentation, and distributions (packaged Drupal bundles) can be used as starter kits. You can mix and match all of these to enhance Drupal's core abilities or to integrate Drupal with external services. It is content management software that is powerful and scalable.

Drupal uses databases to store its web content. When your Drupal-based website or application is experiencing a large amount of traffic it can have an impact on your database server. When you are in this situation you'll require load balancing, high availability, and a redundant architecture to keep your database online. 

When I started researching this blog, I realized there are many answers to this issue online, but the recommended solutions were very dated. This could be a result of WordPress's increased market share leading to a smaller open source community. What I did find were some examples of implementing high availability using Master/Master (High Availability) or Master/Master/Slave (High Availability/High Performance) topologies.

Drupal offers support for a wide array of databases, but it was initially designed around MySQL variants. Though using MySQL is fully supported, there are better approaches you can implement. If not done properly, however, these other approaches can cause your website to experience large amounts of downtime, cause your application to suffer performance issues, and may result in write issues to your slaves. Performing maintenance would also be difficult, as you need failover to apply server upgrades or patches (hardware or software) without downtime. This is especially true if you have a large amount of data, making for a potentially major impact to your business.

These are situations you don't want to happen which is why in this blog we’ll discuss how you can implement database failover for your MySQL or PostgreSQL databases.

Why Does Your Drupal Website Need Database Failover?

From Wikipedia: "Failover is switching to a redundant or standby computer server, system, hardware component or network upon the failure or abnormal termination of the previously active application, server, system, hardware component, or network. Failover and switchover are essentially the same operation, except that failover is automatic and usually operates without warning, while switchover requires human intervention."

In database operations, switchover is also a term used for manual failover, meaning that it requires a person to operate the failover. Failover comes in handy for any admin as it isolates unwanted problems such as accidental deletes/dropping of tables, long hours of downtime causing business impact, database corruption, or system-level corruption. 

Database failover involves more than a single database node, either physical or virtual. Ideally, since failover means switching to a different node, you should switch to a different database server entirely rather than another instance running on the same host. Either way it can be a switchover or a failover, but the point is redundancy and high availability in case a catastrophe hits the current host.

MySQL Failover for Drupal

Performing a failover for your Drupal-based application requires that the data handled by the database stays consistent and does not diverge across nodes. There are several solutions available, and we have already discussed some of them in previous Severalnines blogs. You may want to read our Introduction to Failover for MySQL Replication - the 101 Blog.

The Master-Slave Switchover

The most common approach for MySQL failover is the master-slave switchover, i.e. a manual failover. There are two ways you can do this:

  • You can implement your database with a typical asynchronous master-slave replication.
  • You can implement asynchronous master-slave replication using GTID-based replication.

Switching to another master could be quicker and easier. This can be done with the following MySQL syntax:

mysql> SET GLOBAL read_only = 1; /* enable read-only */

mysql> CHANGE MASTER TO MASTER_HOST = '<hostname-or-ip>', MASTER_USER = '<user>', MASTER_PASSWORD = '<password>', MASTER_LOG_FILE = '<master-log-file>', MASTER_LOG_POS=<master_log_position>; /* master information to connect */

mysql> START SLAVE; /* start replication */

mysql> SHOW SLAVE STATUS\G /* check replication status */

or with GTID, you can simply do,

...

mysql> CHANGE MASTER TO MASTER_HOST = '<hostname-or-ip>', MASTER_USER = '<user>', MASTER_PASSWORD = '<password>', MASTER_AUTO_POSITION = 1; /* master information to connect */

...

With GTID-based replication, MASTER_AUTO_POSITION = 1 lets the replica work out the correct binary log coordinates itself.

Using the non-GTID approach requires you to determine first the master's log file and master's log pos. You can determine this by looking at the master's status in the master node before switching over. 

mysql> SHOW MASTER STATUS;

You may also consider hardening your server by adding sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1 since, in the event your master crashes, you'll have a higher chance that transactions from the master will be in sync with your slave(s). In that case the promoted master has a higher chance of being a consistent data source node.

This, however, may not be the best approach for your Drupal database as it can impose long downtime if not performed correctly, for example if the master is taken down abruptly. If your master database node experiences a bug resulting in a crash, you’ll need your application to point to another database waiting on standby as your new master, or to have a slave promoted to master. You will need to specify exactly which node should take over and then determine the lag and consistency of that node. Achieving this is not as easy as just running SET GLOBAL read_only=1; CHANGE MASTER TO… (etc.); certain situations require deeper analysis, checking which transactions must be present on the standby server or promoted master before it can take over. 
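As a hedged sketch of those pre-promotion checks in a non-GTID setup, this is roughly what you might run on the candidate slave; the log file name and position are placeholders you would take from SHOW MASTER STATUS on the old master:

```
-- Sketch only: run on the candidate slave before promoting it.
mysql> SHOW SLAVE STATUS\G
-- Wait until Seconds_Behind_Master is 0 and Exec_Master_Log_Pos has
-- caught up with Read_Master_Log_Pos, then stop fetching new events:
mysql> STOP SLAVE IO_THREAD;
-- Block until every relayed event has been applied:
mysql> SELECT MASTER_POS_WAIT('<master-log-file>', <master-log-pos>);
-- Open the promoted node for writes and clear its old replication config:
mysql> SET GLOBAL read_only = 0;
mysql> RESET SLAVE ALL;
```

Only after these checks succeed should you repoint the remaining slaves and the application at the promoted node.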

Drupal Failover Using MHA

One of the most common tools for automatic and manual failover is MHA. It has been around for a long while now and is still used by many organizations. You can check out these previous blogs we have on the subject, Top Common Issues with MHA and How to Fix Them or MySQL High Availability Tools - Comparing MHA, MRM and ClusterControl.

Drupal Failover Using Orchestrator

Orchestrator has been widely adopted now and is being used by large organizations such as Github and Booking.com. It not only allows you to manage a failover, but also topology management, host discovery, refactoring, and recovery. There's a nice external blog which I found very useful for learning about its failover mechanism. It's a two-part series; part one and part two.

Drupal Failover Using MaxScale

MaxScale is not just a load balancer designed for MariaDB Server; it also extends high availability, scalability, and security for MariaDB while, at the same time, simplifying application development by decoupling it from the underlying database infrastructure. If you are using MariaDB, then MaxScale could be a relevant technology for you. Check out our previous blogs on how you can use the MaxScale failover mechanism.

Drupal Failover Using ClusterControl

Severalnines' ClusterControl offers a wide array of database management and monitoring solutions. Part of the solution we offer is automatic failover, manual failover, and cluster/node recovery. This is very helpful as it acts like a virtual database administrator, notifying you in real-time in case your cluster is in “panic mode,” all while the recovery is being managed by the system. You can check out the blog How to Automate Database Failover with ClusterControl to learn more about ClusterControl failover.

Other MySQL Solutions

Some of the older approaches are still applicable when you want to failover. There's MMM, MRM, or you can check out Group Replication or Galera (note: Galera uses synchronous, not asynchronous, replication). Failover in a Galera Cluster does not work the same way as it does with asynchronous replication. With Galera you can just write to any node or, if you implement a master-slave approach, you can direct your application to another node that will be the active writer for the cluster.

Drupal PostgreSQL Failover

Since Drupal supports PostgreSQL, we will also check out the tools to implement a failover or switchover process for PostgreSQL. PostgreSQL uses built-in streaming replication, though you can also set it up to use logical replication (added as a core element of PostgreSQL in version 10). 

Drupal Failover Using pg_ctlcluster

If your environment is Ubuntu, using pg_ctlcluster is a simple and easy way to achieve failover. For example, you can just run the following command:

$ pg_ctlcluster 9.6 pg_7653 promote

or with RHEL/CentOS, you can use the pg_ctl command directly:

$ sudo -iu postgres /usr/lib/postgresql/9.6/bin/pg_ctl promote -D  /data/pgsql/slave/data

server promoting

You can also trigger failover of a log-shipping standby server by creating a trigger file with the filename and path specified by the trigger_file in the recovery.conf. 
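As a minimal sketch of the trigger-file approach (for PostgreSQL versions that still use recovery.conf; the trigger path and connection info below are assumptions to adapt), the standby side would look something like:

```
# recovery.conf on the standby server
standby_mode = 'on'
primary_conninfo = 'host=<primary-ip> user=<replication-user>'
trigger_file = '/tmp/postgresql.trigger'

# Creating the trigger file is what fires the promotion:
#   $ touch /tmp/postgresql.trigger
```

Once the file appears, the standby exits recovery and starts accepting writes as the new primary.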

You have to be careful with standby or slave promotion here, as you must ensure that only one master is accepting read-write requests. This means that, while doing the switchover, you may have to ensure the previous master has been demoted or stopped.

Taking care of switchover or manual failover from primary to standby server can be fast, but it requires some time to re-prepare the failover cluster. Regularly switching from primary to standby is a useful practice as it allows for regular downtime on each system for maintenance. This also serves as a test of the failover mechanism, to ensure that it will really work when you need it. Written administration procedures are always advised. 

Drupal PostgreSQL Automatic Failover

Instead of a manual approach, you might require automatic failover. This is especially needed when a server goes down due to hardware failure or virtual machine corruption. You may also require an application to automatically perform the failover to lessen the downtime of your Drupal application. We'll now go over some of these tools which can be utilized for automatic failover.

Drupal Failover Using Patroni

Patroni is a template for creating your own customized, high-availability solution using Python and - for maximum accessibility - a distributed configuration store like ZooKeeper, etcd, Consul, or Kubernetes. Database engineers, DBAs, DevOps engineers, and SREs who are looking to quickly deploy HA PostgreSQL in the datacenter (or anywhere else) will hopefully find it useful.

Drupal Failover Using Pgpool

Pgpool-II is a proxy software that sits between the PostgreSQL servers and a PostgreSQL database client. Aside from automatic failover, it has multiple features that include connection pooling, load balancing, replication, and limiting excess connections. You can read more about this tool in our three-part blog; part one, part two, part three.

Drupal Failover Using pglookout

pglookout is a PostgreSQL replication monitoring and failover daemon. pglookout monitors the database nodes, their replication status, and acts according to that status. For example, calling a predefined failover command to promote a new master in the case the previous one goes missing.

pglookout supports two different node types: ones that are installed on the DB nodes themselves, and observer nodes that can be installed anywhere. The purpose of having pglookout on the PostgreSQL DB nodes is to monitor the replication status of the cluster and act accordingly; the observers have a more limited remit: they just observe the cluster status to give another viewpoint on the cluster state.

Drupal Failover Using repmgr

repmgr is an open-source tool suite for managing replication and failover in a cluster of PostgreSQL servers. It enhances PostgreSQL's built-in hot-standby capabilities with tools to set up standby servers, monitor replication, and perform administrative tasks such as failover or manual switchover operations.

repmgr has provided advanced support for PostgreSQL's built-in replication mechanisms since they were introduced in 9.0. The current repmgr series, repmgr 4, supports the latest developments in replication functionality introduced from PostgreSQL 9.3 such as cascading replication, timeline switching and base backups via the replication protocol.

Drupal Failover Using ClusterControl

ClusterControl supports automatic failover for PostgreSQL. If you have an incident, your slave can be promoted to master status automatically. With ClusterControl you can also deploy standalone, replicated, or clustered PostgreSQL database. You can also easily add or remove a node with a single action.

Other PostgreSQL Drupal Failover Solutions

There are certainly automatic failover solutions that I may have missed in this blog. If I did, please add your comments below so we can hear your thoughts and experiences with your failover implementation and setup, especially for Drupal websites or applications.

Additional Solutions For Drupal Failover

While the tools mentioned earlier definitely handle failover, adding tools that make failover easier and safer - and that provide total isolation from your database layer - can be worthwhile. 

Drupal Failover Using ProxySQL

With ProxySQL, you can just point your Drupal websites or applications to the ProxySQL server host and it will designate which node receives the writes and which nodes receive the reads. The magic happens transparently within the TCP layer and no changes are needed in your application/website configuration. In addition, ProxySQL also acts as a load balancer for the write and read requests in your database traffic. This is only applicable if you are using MySQL database variants.

Drupal Failover Using HAProxy with Keepalived

Using HAProxy and Keepalived adds high availability and redundancy to your Drupal database. If you want to failover, it can be done without your application knowing what's happening within the database layer. Just point your application to the VRRP virtual IP that you set up in Keepalived and everything will be handled in total isolation from your application. Automatic failover is handled transparently, without the application noticing, so no changes have to occur when, for example, a disaster strikes and a recovery or failover is applied. The good thing about this setup is that it is applicable to both MySQL and PostgreSQL databases. I suggest you check out our blog PostgreSQL Load Balancing Using HAProxy & Keepalived to learn more about how to do this.
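A minimal keepalived.conf sketch for the active HAProxy node is shown below; the interface name, router ID, priorities, and virtual IP are assumptions you would adapt to your own network:

```
vrrp_instance VI_1 {
    state MASTER              # use BACKUP on the passive HAProxy node
    interface eth0
    virtual_router_id 51
    priority 101              # give the passive node a lower value, e.g. 100
    virtual_ipaddress {
        10.0.0.100            # the VIP your Drupal application connects to
    }
}
```

If the active node dies, Keepalived moves the VIP to the passive node and the application keeps connecting to the same address.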

All of the options above are supported by ClusterControl. You can deploy or import the database and then deploy ProxySQL, MaxScale, or HAProxy & Keepalived. Everything will be managed, monitored, and set up automatically, without any further configuration needed on your end. It all happens in the background and automatically creates a ready-for-production setup.

Conclusion

Having an always-on Drupal website or application, especially if you are expecting a large amount of traffic, can be complicated to create. If you have the right tools, the right setup, and the right technology stack, however, it is possible to achieve high availability and redundancy.

And if you don’t? Well then ClusterControl will set it up and maintain it for you. Alternatively, you can create a setup using the technologies mentioned in this blog, most of which are open source, free tools that would cater to your needs.

The Most Common PostgreSQL Failure Scenarios


There is not a perfect system, hardware, or topology to avoid all the possible issues that could happen in a production environment. Overcoming these challenges requires an effective DRP (Disaster Recovery Plan), configured according to your application, infrastructure, and business requirements. The key to success in these types of situations is always how fast we can fix or recover from the issue.

In this blog we’ll take a look at the most common PostgreSQL failure scenarios and show you how you can solve or cope with the issues. We’ll also look at how ClusterControl can help us get back online.

The Common PostgreSQL Topology

To understand common failure scenarios, you must first start with a common PostgreSQL topology. This can be any application connected to a PostgreSQL Primary Node which has a replica connected to it.

The Common PostgreSQL Topology - Severalnines

You can always improve or expand this topology by adding more nodes or load balancers, but this is the basic topology we’ll start working with.

Primary PostgreSQL Node Failure

Primary PostgreSQL Node Failure - Severalnines

This is one of the most critical failures and we should fix it ASAP if we want to keep our systems online. For this type of failure it’s important to have some kind of automatic failover mechanism in place; after the failover, you can look into the reason for the failure. Following the failover process, we must ensure that the failed primary node doesn’t still think that it’s the primary node. This is to avoid data inconsistency when writing to it.

The most common causes of this kind of issue are an operating system failure, hardware failure, or a disk failure. In any case, we should check the database and the operating system logs to find the reason.

The fastest solution for this issue is performing a failover task to reduce downtime. To promote a replica we can use the pg_ctl promote command on the slave database node, and then we must send the traffic from the application to the new primary node. For this last task, we can implement a load balancer between our application and the database nodes to avoid any change on the application side in case of failure. We can also configure the load balancer to detect the node failure and, instead of sending traffic to the failed node, send the traffic to the new primary node.
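As a sketch (the PostgreSQL version and data directory below are illustrative), the promotion step looks like this on the standby:

```
$ sudo -iu postgres /usr/lib/postgresql/9.6/bin/pg_ctl promote -D /data/pgsql/slave/data
server promoting
```

After that, the load balancer (or the application configuration) must be repointed so writes reach the new primary.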

After the failover process, and once we've made sure the system is working again, we can look into the issue. We also recommend always keeping at least one slave node working so that, in case of a new primary failure, we can perform the failover task again.

PostgreSQL Replica Node Failure

PostgreSQL Replica Node Failure - Severalnines

This is not normally a critical issue (as long as you have more than one replica and are not using it to serve production read traffic). If you are experiencing issues on the primary node and your replica is not up-to-date, however, you'll have a real critical issue. And if you’re using your replica for reporting or big data purposes, you will probably want to fix it quickly anyway.

The most common causes of this kind of issue are the same ones we saw for the primary node: an operating system failure, hardware failure, or disk failure. You should check the database and the operating system logs to find the reason.

It’s not recommended to keep the system working without any replica as, in case of failure, you won't have a fast way to get back online. If you have only one slave, you should solve the issue ASAP; the fastest way is to create a new replica from scratch. For this you’ll need to take a consistent backup and restore it to the slave node, then configure the replication between this slave node and the primary node.

If you wish to know the failure reason, you should use another server to create the new replica, and then look into the old one to discover it. When you finish this task, you can also reconfigure the old replica and keep both working as a future failover option.

If you’re using the replica for reporting or for big data purposes, you must change the IP address to connect to the new one. As in the previous case, one way to avoid this change is by using a load balancer that will know the status of each server, allowing you to add/remove replicas as you wish.

PostgreSQL Replication Failure

PostgreSQL Replication Failure - Severalnines

In general, this kind of issue is generated due to a network or configuration issue. It’s related to a WAL (Write-Ahead Logging) loss in the primary node and the way PostgreSQL manages the replication.

If you have heavy traffic, you’re doing checkpoints too frequently, or you’re storing WALs for only a few minutes, then a network issue leaves you very little time to react: your WAL files could be deleted before you can send and apply them to the replica.

If the WAL files that the replica needs in order to continue working have been deleted, you need to rebuild it. To avoid this task, we should check our database configuration to increase the wal_keep_segments (number of WAL files to keep in the pg_xlog directory) or max_wal_senders (maximum number of simultaneously running WAL sender processes) parameters.

Another recommended option is to set archive_mode to on and send the WAL files to another path with the archive_command parameter. This way, if PostgreSQL reaches the limit and deletes a WAL file, we’ll still have it in the other path.
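Putting the last two recommendations together, a hedged postgresql.conf sketch might look like the following; the archive path and the retention value are assumptions to size for your own workload:

```
# postgresql.conf
wal_keep_segments = 64      # WAL files kept in pg_xlog for standbys
max_wal_senders  = 10       # concurrent WAL sender processes
archive_mode     = on
archive_command  = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'
```

In archive_command, %p expands to the path of the WAL file to archive and %f to its file name, so a WAL file recycled out of pg_xlog still survives in the archive directory.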

PostgreSQL Data Corruption / Data Inconsistency / Accidental Deletion

PostgreSQL Data Corruption / Data Inconsistency / Accidental Deletion - Severalnines

This is a nightmare for any DBA and probably the most complex issue to be fixed, depending on how widespread the issue is.

When your data is affected by some of these issues, the most common way to fix it (and probably the only one) is by restoring a backup. That is why backups are the basis of any disaster recovery plan, and it is recommended that you have at least three backups stored in different physical places. Best practice dictates keeping one copy locally on the database server (for a faster recovery), another one on a centralized backup server, and the last one in the cloud.

We can also create a mix of full/incremental/differential PITR compatible backups to reduce our Recovery Point Objective.

Managing PostgreSQL Failure with ClusterControl

Now that we have looked at these common PostgreSQL failure scenarios, let’s look at what would happen if you were managing your PostgreSQL databases from a centralized database management system - one that gives you a fast and easy way to fix the issue, ASAP, in the case of failure.

Managing PostgreSQL Failure with ClusterControl

ClusterControl provides automation for most of the PostgreSQL tasks described above; all in a centralized and user-friendly way. With this system you will be able to easily configure things that, manually, would take time and effort. We will now review some of its main features related to PostgreSQL failure scenarios.

Deploy / Import a PostgreSQL Cluster

Once we enter the ClusterControl interface, the first thing to do is to deploy a new cluster or import an existing one. To perform a deployment, simply select the option Deploy Database Cluster and follow the instructions that appear.

Scaling Your PostgreSQL Cluster

If you go to Cluster Actions and select Add Replication Slave, you can either create a new replica from scratch or add an existing PostgreSQL database as a replica. In this way, you can have your new replica running in a few minutes and we can add as many replicas as we want; spreading read traffic between them using a load balancer (which we can also implement with ClusterControl).

PostgreSQL Automatic Failover

ClusterControl manages failover on your replication setup. It detects master failures and promotes a slave with the most current data as the new master. It also automatically fails-over the rest of the slaves to replicate from the new master. As for client connections, it leverages two tools for the task: HAProxy and Keepalived.

HAProxy is a load balancer that distributes traffic from one origin to one or more destinations and can define specific rules and/or protocols for the task. If any of the destinations stops responding, it is marked as offline and the traffic is sent to one of the remaining available destinations. This prevents traffic from being sent to an inaccessible destination - and the requests from being lost - by directing it to a valid destination.

Keepalived allows you to configure a virtual IP within an active/passive group of servers. This virtual IP is assigned to an active “Main” server. If this server fails, the IP is automatically migrated to the “Secondary” server that was found to be passive, allowing it to continue working with the same IP in a transparent way for our systems.

Adding a PostgreSQL Load Balancer

If you go to Cluster Actions and select Add Load Balancer (or from the cluster view - go to Manage -> Load Balancer) you can add load balancers to your database topology.

The configuration needed to create your new load balancer is quite simple. You only need to add the IP/hostname, port, policy, and the nodes you are going to use. You can add two load balancers with Keepalived between them, which allows you to have an automatic failover of the load balancer itself in case of failure. Keepalived uses a virtual IP address and migrates it from one load balancer to another in case of failure, so the setup can continue to function normally.

PostgreSQL Backups

We have already discussed the importance of having backups. ClusterControl provides the functionality either to generate an immediate backup or schedule one.

You can choose between three different backup methods: pg_dump, pg_basebackup, or pgBackRest. You can also specify where to store the backups (on the database server, on the ClusterControl server, or in the cloud), the compression level, encryption required, and the retention period.

PostgreSQL Monitoring & Alerting

Before being able to take action you need to know what is happening, so you’ll need to monitor your database cluster. ClusterControl allows you to monitor your servers in real-time. There are graphs with basic data such as CPU, network, disk, RAM, and IOPS, as well as database-specific metrics collected from the PostgreSQL instances. Database queries can also be viewed from the Query Monitor.

In the same way that you enable monitoring from ClusterControl, you can also setup alerts which inform you of events in your cluster. These alerts are configurable, and can be personalized as needed.

Conclusion

Everyone will eventually need to cope with PostgreSQL issues and failures. And since you can’t avoid the issue, you need to be able to fix it ASAP and keep the system running. We also saw how using ClusterControl can help with these issues; all from a single and user-friendly platform.

These are what we thought were some of the most common failure scenarios for PostgreSQL. We would love to hear about your own experiences and how you fixed them.

 

Achieving MySQL Failover & Failback on Google Cloud Platform (GCP)


There are numerous cloud providers these days. They can be small or large, local or with data centers spread across the whole world. Many of these cloud providers offer some kind of a managed relational database solution. The databases supported tend to be MySQL or PostgreSQL or some other flavor of relational database. 

When designing any kind of database infrastructure it is important to understand your business needs and decide what kind of availability you would need to achieve. 

In this blog post, we will look into high availability options for MySQL-based solutions from one of the largest cloud providers - Google Cloud Platform.

Deploying a Highly Available Environment Using GCP SQL Instance

For this blog, what we want is a very simple environment - one database, with maybe one or two replicas. We want to be able to failover easily and restore operations as soon as possible if the master fails. We will use MySQL 5.7 as the version of choice and start with the instance deployment wizard:

We then have to create the root password, set the instance name, and determine where it should be located:

Next, we will look into the configuration options:

We can make changes in terms of the instance size (we will go with db-n1-standard-4), storage, and maintenance schedule. What is most important for us in this setup are the high availability options:

Here we can choose to have a failover replica created. This replica will be promoted to a master should the original master fail.

After we deploy the setup, let’s add a replication slave:

Once the process of adding the replica is done, we are ready for some tests. We are going to run test workload using Sysbench on our master, failover replica, and read replica to see how this will work out. We will run three instances of Sysbench, using the endpoints for all three types of nodes.

Then we will trigger the manual failover via the UI:

Testing MySQL Failover on Google Cloud Platform?

I have got to this point without any detailed knowledge of how the SQL nodes in GCP work. I did have some expectations, however, based on previous MySQL experience and what I’ve seen in the other cloud providers. For starters, the failover to the failover node should be very quick. What we would like is to keep the replication slaves available, without the need for a rebuild. We would also like to see how fast we can execute the failover a second time (as it is not uncommon that the issue propagates from one database to another).

What we determined during our tests...

  1. While failing over, the master became available again in 75 - 80 seconds.
  2. Failover replica was not available for 5-6 minutes.
  3. Read replica was available during the failover process, but it became unavailable for 55 - 60 seconds after the failover replica became available.

What we’re not sure about...

What is happening when the failover replica is not available? Based on the time, it looks like the failover replica is being rebuilt. This makes sense, but then the recovery time would be strongly related to the size of the instance (especially I/O performance) and the size of the data file.

What is happening with the read replica after the failover replica has been rebuilt? Originally, the read replica was connected to the master. When the master failed, we would expect the read replica to provide an outdated view of the dataset. Once the new master shows up, it should reconnect via replication to the instance (which used to be the failover replica and has been promoted to master). There should be no need for a minute of downtime while CHANGE MASTER is executed.
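On a self-managed MySQL replica (outside of Cloud SQL's managed layer), the equivalent repoint is a short sequence, which is why we would expect it to be near-instantaneous; hostnames and credentials here are placeholders:

```
mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_HOST = '<new-master-ip>', MASTER_USER = '<user>', MASTER_PASSWORD = '<password>', MASTER_AUTO_POSITION = 1;
mysql> START SLAVE;
```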

More importantly, during the failover process there is no way to execute another failover (which sort of makes sense):

It is also not possible to promote a read replica (which does not necessarily make sense - we would expect to be able to promote read replicas at any time).

It is important to note that relying on read replicas to provide high availability (without creating a failover replica) is not a viable solution. You can promote a read replica to become a master, however a new cluster would be created, detached from the rest of the nodes.

There is no way to slave your other replicas off the new cluster. The only way to do this would be to create new replicas, which is a time-consuming process. This makes it virtually unusable, leaving the failover replica as the only real option for high availability for SQL nodes in Google Cloud Platform.

Conclusion

While it is possible to create a highly-available environment for SQL nodes in GCP, the master will not be available for roughly a minute and a half. The whole process (including rebuilding the failover replica and some actions on the read replicas) took several minutes. During that time we weren’t able to trigger an additional failover, nor were we able to promote a read replica. 

Do we have any GCP users out there? How are you achieving high availability?

 

PostgreSQL Top Learning & Training Resources


Oftentimes, people want to know about “That One Place” to get all their learning and training resources for PostgreSQL. When I get such a question from a colleague, my typical response is to tell them to look it up online. But I know that as soon as they hit the “.com” highway, they will be confronted with a barrage of resources about PostgreSQL: blogs, articles, whitepapers, videos, webinars, cookbooks for dummies, cheat sheets, and more.

In this blog, I am going to take you on a journey through some of the important avenues to quickly obtain most of the knowledge you would need about PostgreSQL.

Here we go...

Read the PostgreSQL Manual

The first stop is the online manuals of PostgreSQL. The official documentation (or docs, as they are referred to in short) of any product is the best place to find the largest wealth of information. For most people nowadays, manuals are typically the last place to look for help. They should, however, always be the first stop on the list, for the various reasons listed below:

  • Official docs explain the internals of various components of a product and how they relate to each other
  • They link to various other sections of manuals discussing a concept when a new concept is introduced
  • There is sample code to be executed and its expected output with explanation
  • There is a logical flow from one idea to another
  • There is a “Tip” and “Quick Setup” section wherever required that gives bonus information for newbies
  • Most of the other online resources lead you to official documentation in one way or the other
  • The manuals are divided into appropriate sections as per the need, such as developer-oriented, administrator-related, programming-focused, utilities, command reference, internals, and appendices

One excellent feature of using manuals that I liked the most is the “Supported Versions” subtitle on top of the page which provides links to other versions of PostgreSQL where a concept is available. It makes it convenient to navigate between various versions of PostgreSQL for the same concept, especially when you want to compare default settings across versions, parameter names, and error conditions etc. 

I once wanted to play around with “Logical Replication” when it was first introduced in PostgreSQL 10. I found a dedicated chapter in the manuals on Logical Replication that explains the architecture, components involved, configuration settings, and a quick setup. All I did was follow the steps of “Quick Setup” and had a working Logical Replication setup on my test virtual machine in no time.

These docs are like the owner’s manual for a home appliance. Any error code from the appliance can only be understood by referring to the owner’s manual to take the necessary action to troubleshoot and remedy the issue. The notion sounds like a cliché, but it holds true about manuals.

The other benefit of getting used to online manuals is by attaining first hand information about the added and/or enhanced features  in a newly released version of PostgreSQL (called Release Notes). Online manuals may give you a comprehensive account of enhancements, added features, and deprecated features, but Release Notes give you the “introductory gist” of what the new feature is, what enhancements have been made, and what features are no longer supported. A quick glance of Release Notes across major release versions also gives you an understanding of what developments have been made in a specific PostgreSQL version since the earlier release.

In addition to online manuals, there is a repository of all stuff PostgreSQL in the form of WIKI pages. This has supplementary information covering tutorials, guides, how-tos, and tips 'n' tricks related to PostgreSQL. It also serves as a collaboration area for PostgreSQL contributors. You can also get access to automation scripts developed by various users on installation, administration, and management of PostgreSQL, which could be utilized in your environment under GPL notice.

Using the PostgreSQL Distribution Lists

The next top learning and training resources are the community distribution lists. This is where you can interact with other PostgreSQL enthusiasts from across the globe. There are over 45 community distribution lists divided into 7 broad categories (listed below).

  • User lists
  • Developer lists
  • Regional lists
  • Associations
  • User groups
  • Project lists
  • Internal lists

There is a dedicated distribution list for every type of PostgreSQL professional, depending on your regional language, experience level, and background of PostgreSQL interest. But as PostgreSQL gains more and more momentum, this may quickly grow to over 100 distribution lists across even more categories.

To stay up-to-speed on PostgreSQL you have to subscribe and follow some of the community distribution lists, because you will see a lot of action around PostgreSQL. There is an audience of various levels of expertise starting from newbies requesting a little hand-holding to industry and community heavy-weights offering suggestions to solve complex issues being faced in production environments.

The best way to participate in these community distribution lists is to start with a PostgreSQL database instance running in your own local virtual machine (VM). This will help you to know the terminologies and nuances of PostgreSQL.  You are also in a position to offer help to the community when someone confronts a PostgreSQL situation you may have already faced and successfully resolved. 

PostgreSQL Partners & Software Tools

There are many tools that can be configured to work with a PostgreSQL database. It is not possible for a new user to truly get a grasp of the whole market out there, but it does get easier if you narrow down to a specific concept and evaluate the most popular tools related to the concept of your choice. 

My personal interest around databases is Backup & Recovery, Replication, High Availability, and Monitoring. I have spent enough time learning and implementing some of the open source tools in these areas, and when a fellow community member gets into a bind and I know what could be the cause, I offer to help with a quick explanation and a plan of action, citing references from the respective documentation.

Official PostgreSQL Webinars

There are also online webinars conducted by various registered organizations (note: you will need a PostgreSQL account to view these), with their members forming part of a core team of contributors or committers of PostgreSQL code. Some of the other core team members manage their own personal blogs publishing technical content from time to time such as know-hows, white papers, case studies, tutorials or simple tips and tricks of working with PostgreSQL internals. The other forms of engaging with the PostgreSQL community members online include IRC, Slack, GitHub and several other online networking portals.

A List of PostgreSQL Events

Now that you have started learning and exploring the possibilities of PostgreSQL, it’s time to meet some real people in person. One way of achieving that would be to attend events and technical symposia organized by various local PostgreSQL user groups within your region. These events run anywhere from a few hours a day to one full week of activities revolving around PostgreSQL development, PostgreSQL hacks, bootcamps, and workshops etc.

There are plenty of conferences held all year round across the globe.

These sponsored conferences are held at various geographical locations and are named after the region in which they are conducted, such as PGDay UK, PGConf Asia, PGConf EU, and so on (note that some of them are only held in the region’s local language).

If you can only attend one, the most important conference is PGCon. This is an annual conference for users and developers of PostgreSQL, held during the last week of May every year at the University of Ottawa in Ottawa, Canada. This is where the top developers and committers of PostgreSQL meet each year to discuss enhancements, new features, and the development activities of PostgreSQL (in addition to presenting and conducting training bootcamps). It is during this event that the community recognizes developers and committers who have contributed immensely to PostgreSQL; some are also formally inducted into the panel of contributors.

The bootcamps and trainings conducted during PGCon are handled by industry experts who have developed the core features of PostgreSQL, which means you get to know the internals of PostgreSQL from the people who designed it. While one good reason to attend the community events is to expand your technical network, the other good reason is to collect the PostgreSQL shirts, which can be worn to work with pride in order to get others interested in PostgreSQL. The events calendar can be accessed from here, and each of the events will point you to its unique website managed and maintained by the respective event organisers.

PostgreSQL Local and Regional User Groups

User groups such as PUG (PostgreSQL UG), SIG (Special Interest Group), and RUG (Regional UG) give you an opportunity to bump into the PostgreSQL enthusiast next door. These are casual meetups organized by their members, who meet on a regular basis. The frequency of these quick meets ranges from once a fortnight (which means two weeks, for those who don’t read English literature) to once every quarter.

The main purpose of these user groups is to keep their members informed of the latest news around PostgreSQL and of upcoming global events. Members can be seen presenting technical content to a smaller group of individuals to cut their teeth before presenting at the global events. The topics of these meetups can get intriguing, especially when you have a bunch of IT engineers from varied technological backgrounds all discussing issues, limitations, and advantages of various database products and ways to reduce costs, etc. These events also give you an opportunity to present a topic of your choice, which further widens your horizons within PostgreSQL. Most of the local group events are managed via the popular Meetup platform, as can be seen on the Local User Groups page.

In addition to all the above, there are the official international websites of PostgreSQL, hosted and maintained in the local language of each region. The international websites tend to add more content on training and learning, catering to the needs of local audiences in a regional language. An excellent benefit of having such local and regional language sites is that you get to meet like-minded individuals who can collaborate to build systems and solutions using PostgreSQL.

The PostgreSQL Planet

Did you know that PostgreSQL has its own planet, where everything is related to PostgreSQL? It is like a master portal consolidating all the information from community distribution lists, the PostgreSQL developers network, PostgreSQL bloggers, news, latest releases, and GitHub repositories. On planet.postgresql.org you can come across small projects of interest which can give you quick hands-on experience with a specific feature of PostgreSQL. There are some basic projects on this site which can get you started in developing your PostgreSQL skills.

My own personal favourite is the consolidated record of a real world computing issue within PostgreSQL applications, discussed within the distribution list with plenty of inputs and replies from various PostgreSQL enthusiasts. These real world issues gain traction by way of someone trying to create a use case out of it, in order to discuss the possible solutions and come up with a quick fix. The quick fixes are published on the GitHub repositories with further enhancements by other community members. What starts as a problem for a PostgreSQL user ends up being a minor feature enhancement.

The PostgreSQL Planet is also a one-stop shop for various maintenance scripts that are developed and tested by notable community bigwigs. One can build a repository of tool-sets out of these code snippets to manage and monitor PostgreSQL implementations. Most of the code comes with a default disclaimer that the developer is not liable and/or responsible for any damage, service failure, or performance degradation caused to your systems (but most of the code snippets are safe to run on production workloads for monitoring and learning purposes).

PostgreSQL Extensions

As you start following all the resources around PostgreSQL, after getting a firm grasp of its internals, you might want to develop something on your own and share it with the rest of the community members. A step forward would be to put various similar enhancements and functionalities together in the form of a PostgreSQL extension. PostgreSQL extensions are an extended feature set that can be included in a PostgreSQL database system as a ‘plug and play’ option. PostgreSQL extensions undergo an exhaustive review process before being published on the official PostgreSQL extensions website. More on various PostgreSQL extensions and their uses will be discussed in great detail in another post.

Conclusion

I hope this blog gave you an idea of where to seek more information about PostgreSQL and how to enhance your PostgreSQL skills on a self taught, self learned basis from using the various types of resources. Make sure to reach out to our team of experts for your PostgreSQL management needs.

MongoDB Database Automation Basics Using Chef


Managed environments grow over time, often as a result of increased data volume or the need to improve performance through shared workload, and because of this there is a need to add members. With MongoDB, for instance, one can decide to use replication and sharding, which consequently requires adding more members to the cluster. Configuring and deploying these environments by hand becomes hectic, time consuming, and prone to human error, with many associated setbacks that ultimately add operational expense. Take the example of a replica set with 50 members in MongoDB where you want to shard a certain collection on each member: doing this manually for each one is time consuming. We thus need a centralized system from which you can easily configure all the members. With a centralized system you write some code, which in turn configures the connected members. The code is therefore human readable, versionable, and testable, so possible errors can be removed before deployment.

What is Chef Software?

Chef is an automation tool written in the Ruby language that is used to streamline the task of configuring and maintaining cloud machines or on-prem servers. It does so by ensuring that all connected members get the required resources, that those resources are well configured, and by correcting any resources that are not in the desired state. So, Chef basically ensures that the files and software resources expected to be on a certain machine are present, configured correctly, and working properly as intended.

The Components of Chef

Chef Server

This is the central controlling system that houses the configuration data. The data is written in ‘recipes’, and a collection of related recipes forms a cookbook. The central system also contains metadata describing each of the nodes, as outlined in chef-client.

All changes made in recipes pass here for validation before deployment. The server also ensures that the workstation and the connected nodes are paired using authorization keys before allowing communication between them and applying the changes.

Chef Client Node

The Chef client node registers and validates nodes and builds the node objects. It essentially holds the current state of a given node and its metadata.

Node

This is the physical, virtual or cloud machine to be configured and each should have the client node installed.

Workstation

The workstation provides an interface for communication between the server and the client nodes. It provides a platform for writing, testing, and deploying the cookbooks. This is also where the roles are defined.

Test Kitchen

This is where the code is validated.

Chef Knife

Interacts with the nodes.

Cookbooks

Cookbooks contain recipes written in the Ruby language and are used to define the tasks that are to be carried out. The recipes specify the resources and the order of implementation for the defined tasks.

  • attributes are used to override the default settings.
  • files are used to transfer files from a specific path to the chef-client.
  • the metadata resource defines the node information as described in the client node.

How Chef Works

Chef has two modes of operation: client/server, or a standalone mode known as ‘chef-solo’.

The Chef server receives various attributes regarding a certain node from the Chef client. These attributes are then indexed by the server using Elasticsearch, which then provides an Application Program Interface (API) from which the client nodes can query this data. The returned results are then used by the client nodes to configure the relevant machines and bring them to the desired state.
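The query side of this can be mimicked with a toy Ruby model (this is not the real Chef search API, just the idea): the server holds indexed node data, and a client filters it to find, say, all nodes carrying a given role.

```ruby
# Toy model of Chef search (not the real API): the server indexes node
# data; clients query it, e.g. to discover replica-set peers by role.
NODES = [
  { 'name' => 'mongo1', 'roles' => ['mongodb_replica'], 'ip' => '10.0.0.1' },
  { 'name' => 'mongo2', 'roles' => ['mongodb_replica'], 'ip' => '10.0.0.2' },
  { 'name' => 'web1',   'roles' => ['webserver'],       'ip' => '10.0.0.9' },
].freeze

def search(nodes, role)
  nodes.select { |node| node['roles'].include?(role) }
end

# A client node could use the result to write its own replica-set config.
peers = search(NODES, 'mongodb_replica').map { |node| node['ip'] }
```

This is why a 50-member replica set is manageable from one place: each node asks the server who its peers are instead of being configured by hand.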

The server is the hub of all operations and is where changes are stored.

MongoDB Chef

Chef managed servers are evaluated from time to time against some desired state ensuring that any changes in configurations are automatically corrected and applied universally. With this approach, there is a consistent configuration at scale.

Getting Started with Chef

You can download the chef-workstation from this site and install it. Make a folder named cookbooks and inside this folder run the command:

$ chef generate cookbook first_cookbook

This will generate a directory named first_cookbook with some sub-folders and files.

Navigate to cookbooks/first_cookbook/recipes/ and update the default.rb recipe with the following contents:

file "test.txt" do
  content 'This is my first recipe file'
end

We then execute this file using the command

$ chef-client --local-mode --override-runlist first_cookbook.

Or, from inside the recipe folder, you can run the file with the command

$ chef-apply default.rb

If you navigate to your recipe folder, you will see the test.txt file with the content ‘This is my first recipe file’. It’s that easy. In the next section we will create recipes to carry out some specific tasks for MongoDB.

Installing and Configuring MongoDB with Chef

You can install MongoDB by creating an install recipe MongoDBInstall.rb and populate it with the contents

package "mongodb" do
  action :install
  version '4.0.3'
end

In this case our package name is mongodb and we are going to install version 4.0.3.

What we have is a basic recipe, but in many cases we will need a more advanced cookbook to handle our MongoDB configuration. To ease the task, there are pre-built cookbooks, such as sc-mongodb, that generally make the configuration more precise.

SC-MongoDB Cookbook

The cookbook provides a mongodb_instance definition that enables you to configure MongoDB parameters, a replica set, and a sharded cluster.

To install the cookbook just run the command

$ knife supermarket download sc-mongodb

You can then use the attributes defined on this site to reconfigure some of the default MongoDB attributes.
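As an illustrative sketch of how a recipe might use it (the attribute names below are assumptions for illustration; check the sc-mongodb README for the exact attributes your cookbook version supports):

```ruby
# Hypothetical recipe sketch: the attribute names (port, dbpath) are
# illustrative and should be verified against the sc-mongodb README
# for the cookbook version you installed.
mongodb_instance 'mongodb' do
  port   27018
  dbpath '/data/mongodb'
end
```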
