Friday, August 22, 2008

Deploying rails application with passenger

For the last few months, we have been having tons of problems with our rails deployment on lighttpd/fastcgi. One ruby process would ended up consuming all of the machine's memory and would then crash the machine. The only way to fix the issue was to use the excellent watchdog program monit to monitor memory usage and kill lighttpd when memory is too high.

This works ok if you only have 1 application running on the web server. However, if you have more than 1 app hosted on the same server, all applications will restart when lighttpd restarts. In a sense, 1 misbehaving app would cause the rest of the apps to suffer.

So I looked at mongrel, and the situation isn't much better. In fact, it is recommended that mongrel be used with either monit or god (ruby version of the watchdog program) to monitor individual mongrel instances and kill when memory usage is too high. The problem with this approach is that the watchdog program now has to monitor multiple mongrel instances and killing of an individual mongrel may leave orphan processes behind.

So my only other choice was to try out mod_rails. I was reluctant to do this since the mod_rails software has only been around for a few months. But seeing that there are really no better alternatives, I figure it's worth a shot.

After testing in a controlled environment for a couple of weeks, I deployed it in production. Within a couple of hours during times of heavy traffic, apache crashed with an out of memory error. It turns out that mod_rails were spawning way too many rails instances without recycling them properly - and it all has to do with the "PassengerPoolIdleTime" setting.

You see, mod_rails is suppose to recycle individual ruby instances when it idles for a specific period of time, the default is 5 minutes. On a site with constant traffic, it means that the processes will never time out. So these ruby instances just sits there and grow to no end, and consumes all of the system's memory at the end.

Now comes the fun part - just what is the ideal setting for PassengerPoolIdleTime? In order to conserve memory and avoid crashing the server, I have set this to 30 seconds. I realized that I am incurring a penalty for frequent re-spawning of rails instances, but as it turns out, mod_rails caches the entire rails app in memory so spawning is relatively fast.

The next version of mod_rails come with a new memory limit setting, which will further increase the stability of rails deployment. Can't wait for it to be released.

Friday, March 28, 2008

Vanilla PHP

PHP is a great general scripting language. For instance, I have a little data conversion script that takes data from our MSSQL database and insert them into our MySQL database.

The problem is that you'll have to custom compile PHP if we want support to 2 different databases... and it's not that easy, so here are the configure options to bring up a vanilla php installation, with no other support other than connection to a couple of databases:

./configure --with-mssql --with-mysql --disable-libxml --disable-dom --disable-simplexml --disable-xml --disable-xmlreader --disable-xmlwriter --without-pear

Make sure you also installed the freetds-dev and the libmysqlclient-dev packages.

Wednesday, March 12, 2008

Custom replication

One of our requirements is to have multiple master databases merge into a single slave database. The built-in MySQL replication engine does not support this kind of setup. A search on google came up with this paper: http://www.shinguz.ch/MySQL/mm-single-slave-repl.pdf. In it, the author described a way of using a middle-man database slave but using federated table pointing to the real slave. This approach would work but there are still a few bugs in the federated table engine (especially for auto-increment keys).

So I was thinking what would it take to create a custom solution. Here is what I came up with:

First a little diagram:

[M1][M2]
| |
\ /
[S]

We have 3 servers. M1 and M2 are masters and they do not share any information. So one going down would not effect the other. The data in both M1 and M2 would be replicated in S in a timely manner.

Let's say both M1 and M2 has a table t that we wanted to replicated into S. The structure of t is as follows:

CREATE TABLE `t` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`s` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB

The first challenge is to make sure that records in M1 and M2 do not conflict. To achieve that, we setup M1 and M2 to produce non-conflicting auto-increment key using the settings "auto-increment-offset" and "auto-increment-increment". For instances, M1 would produce records that has odd numbered keys and M2 would produce records with even numbered keys.

Now, for each table t in M1 and M2, create a corresponding federated table (fed_t1 and fed_t2) in S as follows:

CREATE TABLE `fed_t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`s` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=FEDERATED CONNECTION='mysql://user@masterdbhost:3306/test/t'

So now, S contains a "view" into both M1 and M2.

The next step is to regularly query fed_t1 and fed_t2 to move the new records from fed_t1 and fed_t2 to table t in S. To keep things simple, assuming that we are only interested in new records, we can first setup a table to remember the last id from each table that got copied over. I call this table insert_pos:

mysql> select * from insert_pos;
+----+------------+-----------+
| id | table_name | latest_id |
+----+------------+-----------+
| 1 | fed_t1 | 18 |
| 2 | fed_t2 | 19 |
+----+------------+-----------+

In the above case, it shows that the last record that was copied over from fed_t1 has an id of 18, and the last record from fed_t2 has an id of 19.

So now, it is simply a matter of setting up an event to iterate over the insert_pos table. The algorithm is something like this:

1. For each record in the insert_pos:
1.1 insert into t all records from table_name between id 18 and the max id.
1.2 set the latest_id to the max id.
2. End

In my implementation, I have the following stored procedure to do the work:



CREATE PROCEDURE populate_t()
BEGIN

DECLARE remote_table_name VARCHAR (20);
DECLARE min_id INT;
DECLARE max_id INT;
DECLARE done INT DEFAULT 0;

-- open a cursor to the position table
DECLARE pos_table_cur CURSOR FOR SELECT table_name, latest_id FROM insert_pos;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
OPEN pos_table_cur;

-- iterate the position table. For each table, insert records into
-- the main table and update position
REPEAT
FETCH pos_table_cur INTO remote_table_name, min_id;
IF NOT done THEN
-- First, get the max id for the transaction table
DROP TEMPORARY TABLE IF EXISTS remote_max_id;
SET @get_max = CONCAT('CREATE TEMPORARY TABLE remote_max_id',
' AS SELECT max(id) AS max FROM ',
remote_table_name);
PREPARE stmt FROM @get_max;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

SELECT max INTO max_id FROM remote_max_id;

-- Now, insert into
SET @populate = CONCAT('INSERT INTO t (id, s) SELECT id, s FROM ',
remote_table_name,' WHERE id > ',min_id,
' AND id <= ',max_id);
PREPARE stmt FROM @populate;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
-- Now update with latest insert position
SET @update_pos = CONCAT('UPDATE insert_pos SET latest_id = ',
max_id,
' WHERE table_name = \'',
remote_table_name, '\'');
PREPARE stmt FROM @update_pos;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END IF;
UNTIL done END REPEAT;
END;



The actual procedure is very ugly since I have to construct the query dynamically using prepared statements.

Finally, I setup an event to call the procedure every 5 seconds:

CREATE EVENT timed_populate
ON SCHEDULE EVERY 5 SECOND
DO
CALL populate_t;



Preliminary Results

I have only tested the above in a very limited way. However, it seems to be rather reliable. If any of the connections to fed_t* is interrupted, the insert query will just not execute. Once the connection re-established itself, replication would work again.

To add a new master database in the above scenario, simple create a federated table from S to the new master, and add the corresponding entry to insert_pos table. Of course, always make sure that there are no key conflicts!

Saturday, March 8, 2008

MySQL replication bug

This is a nasty one...

We are trying to setup replication with MySQL. Previously, we setup our transaction isolation level to 'READ COMMITTED' as per our requirements. It turns out that replication will crash the slave at this isolation level. I'm not talking about mysqld giving an error, I'm referring to a full blown CRASH!

Specifically, replication will work fine if I only insert into the db, but the slave will crash if I tried to delete from the db. Worst yet, if MySQL was started using mysqld_safe, it will try to re-start mysqld after a crash. At that point, mysqld will crash again because it will try to execute the delete statement again. This becomes an infinite loop and the log files will start filling until you run out of space.

So the lesson here is this: set your transaction isolation level to REPEATABLE READ or higher if you are using replication with innodb. The specific MySQL version was 5.1.22rc and 5.1.23rc.

Saturday, March 1, 2008

Accessing MySQL tables from MSSQL

While my main task is to convert this huge database from mssql to mysql, there are still many legacy applications that need to speak to the sql server database. After searching online for a while, I found the "linked servers" feature in sql server that may allow the our legacy apps a way to speak to the new database using sql server essentially as a proxy.

The article is a little dated, so I'm not sure if this will work, but it's definitely worth investigating.

Tuesday, February 26, 2008

PHP and NULL comparisons

Just a little gotcha in PHP and ms sql access:

If you fetch a row from my sql, and are trying to compare a field value to see if it is NULL, use the === operator. The == operator will trigger a type conversion inside PHP and make null == 0 returns true.

Reference: http://ca.php.net/manual/en/types.comparisons.php

Monday, February 25, 2008

A little bug in Rails?

We had to roll back a database migration on one of our development server today. Previously, we have changed column from "NOT NULL" to "NULL". This roll back is suppose to revert the column back to "NOT NULL".

It turns out that Rails did NOT revert the column, even if the migration states explicitly to do it. This is done on a MS SQL database using the freetds driver.