Building database clusters with MySQL


MySQL is a mainstay of many web-based applications and is popular with lots of our customers. There does come a time, though, when a single database server is not enough. To enhance redundancy, MySQL has a couple of options: you can add more servers with vanilla replication enabled, or you can look at setting up MySQL Cluster. The documentation has this to say…

MySQL Cluster is designed not to have any single point of failure.

Sounds pretty good, right? Let's take a look at an example setup.

Setup Overview

MySQL Cluster is made up of several major components, and a cluster contains one or more of each. Typically each component runs on its own server (a node). Those components are:

  • SQL node (front-end)
  • data node (back-end)
  • management node (central configuration, metadata, etc.)

Generally speaking, application requests are sent to the SQL nodes, which then talk to the data nodes, coordinating with each other via the management node(s), to retrieve the content you want. The MySQL Cluster documentation also says… “Although a MySQL Cluster SQL node uses the mysqld server daemon, it differs in a number of critical respects from the mysqld binary … and the two versions of mysqld are not interchangeable.”

The MySQL Cluster package installed here is not part of any standard Linux distribution and does not come from an official distribution repository. This approach avoids version and dependency conflicts with any stock MySQL packages already installed on the server, and any existing MySQL data directory (commonly /var/lib/mysql) is left untouched. The binary packages from MySQL downloads generally run on most major Linux distributions, so this approach should work on most Linux servers.

Physical Setup

Nodes of each type should be as similar as possible, including the physical architecture and distribution setup. This makes the cluster behave more predictably and ensures that kernel-level settings are the same. Also…

  • two different configuration files are used: the well-known my.cnf, and config.ini, which is required by the management node. In this setup we will create /etc/mysqld-cluster.cf so as not to touch any existing /etc/my.cnf or /etc/mysql/my.cnf (Debian/Ubuntu). Refer to MySQL Cluster Configuration Files
  • all cluster nodes in this setup run Ubuntu 11.04 32-bit
  • we will be using 1 management node, 4 data nodes and 4 SQL nodes.

Server Details:

  • management node: hostname: mgr1, private IP: 192.168.0.100, tcp port: 2205
  • sql node: hostname: dbsrv1, private IP: 192.168.0.21
  • sql node: hostname: dbsrv2, private IP: 192.168.0.22
  • sql node: hostname: dbsrv3, private IP: 192.168.0.23
  • sql node: hostname: dbsrv4, private IP: 192.168.0.24
  • data node: hostname: ndb1, private IP: 192.168.0.11
  • data node: hostname: ndb2, private IP: 192.168.0.12
  • data node: hostname: ndb3, private IP: 192.168.0.13
  • data node: hostname: ndb4, private IP: 192.168.0.14
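
The configuration files below use the private IPs directly, but it is convenient if each server can also resolve the others by hostname. One way to do that (an assumption here; adjust to match your own network) is to append the addresses to /etc/hosts on every node:

cat >> /etc/hosts << EOF
192.168.0.100 mgr1
192.168.0.21 dbsrv1
192.168.0.22 dbsrv2
192.168.0.23 dbsrv3
192.168.0.24 dbsrv4
192.168.0.11 ndb1
192.168.0.12 ndb2
192.168.0.13 ndb3
192.168.0.14 ndb4
EOF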

Log on to the management node server mgr1 as root over SSH and download the MySQL Cluster tarball with the following commands; as of this writing the latest version is 7.1.14.

cd
wget http://dev.mysql.com/get/Downloads/MySQL-Cluster-7.1/mysql-cluster-gpl-7.1.14-linux-i686-glibc23.tar.gz/from/ftp://mirror.anl.gov/pub/mysql/

If the mysql user and group already exist, it's safe to ignore the errors from adduser and groupadd (a more defensive alternative is sketched after the next block).

adduser mysql
groupadd mysql
cd /usr/local
tar zxvf ~/mysql-cluster-gpl-7.1.14-linux-i686-glibc23.tar.gz
ln -s mysql-cluster-gpl-7.1.14-linux-i686-glibc23 mysql-cluster
cd mysql-cluster
chown -R mysql:mysql /usr/local/mysql-cluster/
mkdir -p /var/lib/mysql-mgmd-config-cache
mkdir -p /var/lib/mysql-mgmd-data
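
If you'd rather avoid those harmless errors on hosts where the account already exists, here's a sketch of an idempotent alternative using getent:

# only create the group/user if they don't already exist
getent group mysql > /dev/null || groupadd mysql
getent passwd mysql > /dev/null || useradd -g mysql -s /bin/false mysql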

These are the minimum config.ini settings on the management node. Refer to MySQL Cluster Management Node Configuration Parameters and MySQL Cluster Data Node Configuration Parameters.

Important
Setting NoOfReplicas to 1 means that there is only a single copy of all Cluster data; in this case, the loss of a single data node causes the cluster to fail because there are no additional copies of the data stored by that node.
The maximum possible value is 4; currently, only the values 1 and 2 are actually supported.
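
In this article's setup, NoOfReplicas = 2 with four data nodes gives 4 / 2 = 2 node groups of two nodes each, and the cluster stays up as long as at least one node in every node group survives.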

cat >> /var/lib/mysql-mgmd-data/config.ini << EOF
[ndbd default]
NoOfReplicas = 2
DataDir= /var/lib/mysql-ndb-data
DataMemory = 64M
IndexMemory = 128M

[ndb_mgmd]
# Management process options:
DataDir = /var/lib/mysql-mgmd-data
PortNumber = 2205
HostName = 192.168.0.100

[ndbd]
# Options for data node ndb1
hostname = 192.168.0.11

[ndbd]
# Options for data node ndb2
hostname = 192.168.0.12

[ndbd]
# Options for data node ndb3
hostname = 192.168.0.13

[ndbd]
# Options for data node ndb4
hostname = 192.168.0.14

[mysqld]
# SQL node options for dbsrv1
hostname = 192.168.0.21

[mysqld]
# SQL node options for dbsrv2
hostname = 192.168.0.22

[mysqld]
# SQL node options for dbsrv3
hostname = 192.168.0.23

[mysqld]
# SQL node options for dbsrv4
hostname = 192.168.0.24
EOF

cd /usr/local/mysql-cluster/bin

./ndb_mgmd --initial --configdir=/var/lib/mysql-mgmd-config-cache --config-file=/var/lib/mysql-mgmd-data/config.ini
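
The --initial option makes ndb_mgmd read config.ini and rebuild its configuration cache rather than reusing a cached copy. To confirm the management daemon is running, you can check the process list:

ps axfww | grep ndb_mgmd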

That's it! You're done on the management node. Grab a coffee if you can; the next steps are mostly a matter of hopping between the data node and SQL node servers. Log on to each data node server over SSH and perform the commands below, starting with ndb1.

cd
wget http://dev.mysql.com/get/Downloads/MySQL-Cluster-7.1/mysql-cluster-gpl-7.1.14-linux-i686-glibc23.tar.gz/from/ftp://mirror.anl.gov/pub/mysql/
adduser mysql
groupadd mysql
cd /usr/local
tar zxvf ~/mysql-cluster-gpl-7.1.14-linux-i686-glibc23.tar.gz
ln -s mysql-cluster-gpl-7.1.14-linux-i686-glibc23 mysql-cluster
cd mysql-cluster
chown -R mysql:mysql /usr/local/mysql-cluster/
scripts/mysql_install_db --user=mysql --datadir=/var/lib/mysql-ndb-data
bin/ndbd  --ndb-connectstring=192.168.0.100:2205

Repeat the commands above on the ndb2, ndb3 and ndb4 data node servers.
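
If you have SSH key access to the data nodes, one way to script the repetition is to put the commands above into a file, say setup-datanode.sh (a hypothetical name), and feed it to each host:

for h in ndb2 ndb3 ndb4; do
    # run the local data node setup script on each remote host
    ssh root@$h 'bash -s' < setup-datanode.sh
done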

You can now log in to the mgr1 node via SSH and check the status of the nodes...

cd /usr/local/mysql-cluster
bin/ndb_mgm --ndb-connectstring=192.168.0.100:2205
show
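
At this point the management node and the four data nodes should show as connected, while the [mysqld(API)] slots will still be listed as not connected, since the SQL nodes haven't been set up yet.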

On each of the SQL node servers dbsrv1, dbsrv2, dbsrv3 and dbsrv4, run the commands below.

cd
wget http://dev.mysql.com/get/Downloads/MySQL-Cluster-7.1/mysql-cluster-gpl-7.1.14-linux-i686-glibc23.tar.gz/from/ftp://mirror.anl.gov/pub/mysql/

adduser mysql
groupadd mysql
cd /usr/local
tar zxvf ~/mysql-cluster-gpl-7.1.14-linux-i686-glibc23.tar.gz
ln -s mysql-cluster-gpl-7.1.14-linux-i686-glibc23 mysql-cluster
cd mysql-cluster
chown -R mysql:mysql /usr/local/mysql-cluster/
scripts/mysql_install_db --user=mysql --datadir=/var/lib/mysql-node-data --basedir=/usr/local/mysql-cluster
apt-get install libaio1
cat >> /etc/mysqld-cluster.cf << EOF
[mysqld]
# SQL node options; datadir and basedir match the mysql_install_db run above
basedir = /usr/local/mysql-cluster
datadir = /var/lib/mysql-node-data
socket = /var/lib/mysql-node-data/mysql.sock
ndbcluster
ndb-connectstring = 192.168.0.100:2205

[mysql_cluster]
ndb-connectstring = 192.168.0.100:2205
EOF
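
With the config file in place, start the mysqld server so the SQL node registers with the management node. One way to do it, from /usr/local/mysql-cluster, using the mysqld_safe script shipped in the tarball:

bin/mysqld_safe --defaults-file=/etc/mysqld-cluster.cf --user=mysql &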

Now check the status of the nodes again from the mgr1 node.

cd /usr/local/mysql-cluster
bin/ndb_mgm --ndb-connectstring=192.168.0.100:2205
show

Here is the ndb_mgm output from the show command with all data nodes and SQL nodes up and running:

root@mgr1:/usr/local/mysql-cluster# bin/ndb_mgm --ndb-connectstring=192.168.0.100:2205
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 192.168.0.100:2205
Cluster Configuration
---------------------
[ndbd(NDB)]     4 node(s)
id=2    @192.168.0.11  (mysql-5.1.56 ndb-7.1.14, Nodegroup: 0)
id=3    @192.168.0.12  (mysql-5.1.56 ndb-7.1.14, Nodegroup: 0, Master)
id=4    @192.168.0.13  (mysql-5.1.56 ndb-7.1.14, Nodegroup: 1)
id=5    @192.168.0.14  (mysql-5.1.56 ndb-7.1.14, Nodegroup: 1)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @192.168.0.100  (mysql-5.1.56 ndb-7.1.14)

[mysqld(API)]   4 node(s)
id=6    @192.168.0.21  (mysql-5.1.56 ndb-7.1.14)
id=7    @192.168.0.22  (mysql-5.1.56 ndb-7.1.14)
id=8    @192.168.0.23  (mysql-5.1.56 ndb-7.1.14)
id=9    @192.168.0.24  (mysql-5.1.56 ndb-7.1.14)

To test the setup we can use the sample world database from MySQL's site. Log in to any of the SQL nodes over SSH (dbsrv1 is used in this example) and execute the commands below.

cd
wget http://downloads.mysql.com/docs/world_innodb.sql.gz
gunzip world_innodb.sql.gz
cat world_innodb.sql | sed 's/InnoDB/NDBCLUSTER/g' > world_ndb.sql
cd /usr/local/mysql-cluster
bin/mysql -uroot -e 'create database world'
bin/mysql -uroot world < /root/world_ndb.sql

Optionally, run the following on each of the SQL nodes to check that the world database was created successfully everywhere.

bin/mysql -uroot -e 'show databases'
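
To see how the table data is actually spread across the data nodes, the ndb_desc utility can print per-partition row counts; for example, from /usr/local/mysql-cluster on any of the nodes:

bin/ndb_desc --ndb-connectstring=192.168.0.100:2205 --database=world City -p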

Here is an actual mysql command-line session over SSH from the dbsrv1 and dbsrv2 SQL nodes:

root@dbsrv1:~# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.1.56-ndb-7.1.14-cluster-gpl MySQL Cluster Server (GPL)

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| ndbinfo            |
| test               |
| world              |
+--------------------+
5 rows in set (0.00 sec)

mysql> use world;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show tables;
+-----------------+
| Tables_in_world |
+-----------------+
| City            |
| Country         |
| CountryLanguage |
+-----------------+
3 rows in set (0.00 sec)

mysql> select * from City limit 5;
+------+----------------+-------------+--------------+------------+
| ID   | Name           | CountryCode | District     | Population |
+------+----------------+-------------+--------------+------------+
| 2519 | Nezahualcóyotl | MEX         | México       |    1224924 |
| 1101 | Akola          | IND         | Maharashtra  |     328034 |
| 1726 | Ebina          | JPN         | Kanagawa     |     115571 |
| 1616 | Fukui          | JPN         | Fukui        |     254818 |
|  715 | Port Elizabeth | ZAF         | Eastern Cape |     752319 |
+------+----------------+-------------+--------------+------------+
5 rows in set (0.01 sec)


root@dbsrv2:/usr/local/mysql-cluster# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.1.56-ndb-7.1.14-cluster-gpl MySQL Cluster Server (GPL)

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL v2 license

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> use world
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from City limit 5;
+------+----------------+-------------+--------------+------------+
| ID   | Name           | CountryCode | District     | Population |
+------+----------------+-------------+--------------+------------+
| 2519 | Nezahualcóyotl | MEX         | México       |    1224924 |
| 1101 | Akola          | IND         | Maharashtra  |     328034 |
| 1726 | Ebina          | JPN         | Kanagawa     |     115571 |
| 1616 | Fukui          | JPN         | Fukui        |     254818 |
|  715 | Port Elizabeth | ZAF         | Eastern Cape |     752319 |
+------+----------------+-------------+--------------+------------+
5 rows in set (0.00 sec)

FAQ

    Is there a reason why the test setup needs 4 nodes?
    Answer: From MySQL Cluster Core Concepts
    Important
    It is not realistic to expect to employ a three-node setup in a production environment. Such a configuration provides no redundancy; to benefit from MySQL Cluster's high-availability features, you must use multiple data and SQL nodes. The use of multiple management nodes is also highly recommended.

    Can we describe an example with two or three nodes for a bit more simplicity?
    Answer:
    A simpler example is to have 2 data nodes, 1 SQL node and 1 management node; this is the most minimal setup.
    The steps above can still be followed, but only set up the ndb1 and ndb2 data nodes, and set up only dbsrv1 as a single SQL node.
    The steps for setting up the mgr1 management node can also still be followed, but a simpler config.ini can be used. Here's a trimmed-down version.

    cat >> /var/lib/mysql-mgmd-data/config.ini << EOF
    [ndbd default]
    NoOfReplicas = 2
    DataDir= /var/lib/mysql-ndb-data
    DataMemory = 64M
    IndexMemory = 128M
    
    [ndb_mgmd]
    # Management process options:
    DataDir = /var/lib/mysql-mgmd-data
    PortNumber = 2205
    HostName = 192.168.0.100
    
    [ndbd]
    # Options for data node ndb1
    hostname = 192.168.0.11
    
    [ndbd]
    # Options for data node ndb2
    hostname = 192.168.0.12
    
    [mysqld]
    # SQL node options for dbsrv1
    hostname = 192.168.0.21
    EOF
    

    Does the number of nodes (odd or even) affect ease of management?
    Answer:
    Yes. A larger number of nodes means more servers to maintain, and the total number of data nodes must be a multiple of NoOfReplicas (with NoOfReplicas = 2, that means an even number of data nodes).
    From the manual Defining MySQL Cluster Data Nodes
    The value for this parameter must divide evenly into the number of data nodes in the cluster. For example, if there are two data nodes, then NoOfReplicas must be equal to either 1 or 2, since 2/3 and 2/4 both yield fractional values; if there are four data nodes, then NoOfReplicas must be equal to 1, 2, or 4.
    moreover from the manual MySQL Cluster Core Concepts
    There are as many data nodes as there are replicas, times the number of fragments. For example, with two replicas, each having two fragments, you need four data nodes. One replica is sufficient for data storage, but provides no redundancy; therefore, it is recommended to have 2 (or more) replicas to provide redundancy, and thus high availability.
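
    Applying that to this article's setup: 2 replicas × 2 fragments each = 4 data nodes, which matches the Nodegroup: 0 and Nodegroup: 1 pairs in the show output above.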

    OK, my mysql cluster is working so how can I use it on my existing app?
    Answer:
    Connect to any of the SQL nodes in the cluster and create a table with engine type NDBCLUSTER, for example.
    create table mytable(col1 int not null primary key auto_increment, col2 varchar(100)) engine=NDBCLUSTER;
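
    For tables that already exist in your application's database, you can convert them in place instead, for example:
    alter table mytable engine=NDBCLUSTER;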

Where to from here


15 responses to “Building database clusters with MySQL”

  1. Would you like to show how to make a MySQL load-balancing cluster with a minimum number of computers? Thanks in advance.

  2. Excellent tutorial! I was struggling to make Cluster with all kinds of RPMs which were conflicting with others and bad versions, now this tutorial went just fine and I got cluster running in no time. Thanks !

  3. Thanks a lot for sharing such a nice article, it helps a lot to understand the NDB Cluster architecture design. The language is so easy to understand!! Good for beginners and experts too.

  4. Hi Elton,

    Thank you so much for your very clear instruction of this topic. After all the researching on the net, your instruction is the most useful one and the most straight explanation. Will definitely put your blog on my favorites. 3 cheers on you.

  5. Thank you guys for your great feedback and appreciation. If you have any questions or suggestions, feel free to post them here.

    thanks !
    Elton

  6. Great article. Is there any reason not to use MySQL Cluster for most typical DBs?
     I’m reading that MySQL Cluster does not work with foreign keys and searches. Do you know of any list that explains what MySQL Cluster does not do as compared to MySQL 5.5?

  7. Very nice article, thanks for sharing this!
     However, I do have a question about this setup.

    How would we go about creating backups?
    Can I just stop one of the data nodes, create a backup, and simply start it again?
    If that is the case, will the data node that was stopped catch up on the newer data it missed?

    Thanks in advance.

  8. Hi,

    Awesome tutorial! I got everything working with ndb 7.2 based on this. However, when you say log into dbserv1 and issue commands on world db, both my db nodes show:

    ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)

    Is this right? How do I check that the data is indeed spread over the data nodes?

    Thanks

    Sharif

    • Hi Sharif,

      You may need to check that the mysqld process is running on the sql nodes.

      run the command

      ps axfww|grep mysqld

      if the above returns mysqld then it's running. Then check whether the mysql.sock file exists in /var/lib/mysql-node-data; otherwise, explicitly specify the location of the socket file mysql.sock using the --socket parameter, for example:
      mysql --socket /var/lib/mysql-node-data/mysql.sock -u root -p

      cheers,
      Elton

      • Hi Elton,

        Sorry, I misunderstood your blog. MySQL works on the SQL nodes (well, node in my case as I have just the one). How can I check that each data node has successfully got a copy of any database changes I make? (Just for my own peace of mind.)

        Thanks

        Sharif