Show me a company that has solved database scaling and I will show you one that hasn’t. As we all know, databases continue to be the primary scaling challenge within any organization. Various techniques have been applied to address this challenge. Sharding is one of the options that most DB architects consider when scaling the DB tier, especially for SaaS, social networking and online service providers. DB applications that demand high throughput and a high number of read/write transactions tend to implement sharding, or “shared-nothing”. Sharding, although conceptually simple, can easily get complicated to implement due to the extensive application modifications it requires. Also, when sharding, DBAs have to take into consideration how to scale by adding more shards or servers without application downtime.
Here we take a session-based sharding use case and briefly go over the benefits of having a NetScaler ADC perform the sharding and scaling without having to overhaul the application. In session-based sharding for web applications, one can choose a shard based on customer-specific login data such as the userid and then route all SQL queries to the same shard for the duration of the session. The sharding algorithm is a simple modulo, as shown in the deployment diagram below.
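To make the modulo idea concrete, here is a minimal Python sketch of session-based shard selection. It is only an illustration of the technique, not NetScaler configuration or code: the server names in `SHARDS`, the `Session` class and the `shard_for_user` helper are all hypothetical, and a numeric userid is assumed.

```python
from dataclasses import dataclass, field
from typing import Tuple

# Hypothetical shard server names, used for illustration only.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for_user(userid: int, shard_count: int = len(SHARDS)) -> int:
    """Return the shard index for a numeric userid using a simple modulo."""
    return userid % shard_count


@dataclass
class Session:
    """Pins a user to one shard at login for the lifetime of the session."""
    userid: int
    shard: str = field(init=False)

    def __post_init__(self) -> None:
        # The shard is chosen once, when the session is created.
        self.shard = SHARDS[shard_for_user(self.userid)]

    def route(self, sql: str) -> Tuple[str, str]:
        # Every query in this session goes to the same shard.
        return (self.shard, sql)
```

With three shards, a session for userid 7 would be pinned to `db-shard-1`, and every call to `route` on that session returns the same server.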
With the release of NetScaler 9.3, we introduced DataStream™, which natively handles SQL traffic for MySQL and Microsoft SQL Server with advanced features like SQL connection multiplexing, SQL request switching, SQL load balancing, intelligent SQL health monitoring and SQL content switching policies (for sharding and much more). Using a SQL policy expression similar to ‘MYSQL.REQ.QUERY.TEXT.REGEX_SELECT(RE#userid=\d+#).AFTER_STR("userid=").TYPECAST_NUM_T(DECIMAL, 5) % 3 == 0’, one can create a SQL Load Balancing (LB) / Content Switching (CS) vserver which is now ready to shard! It is that simple. All of the application servers connect to the DB servers through the NetScaler MySQL/MSSQL VIP. With the sharding duties now handled by the NetScaler SQL VIP, adding more servers is relatively easy. Take an existing shard, replicate its data onto another server, add the new server to the NetScaler VIP, and split the load between the old shard server and the new one by updating the content switching policies. All of this can be achieved with minimal downtime.
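For readers who want to see what such a policy expression is doing, the following Python sketch mimics its apparent steps: select the `userid=<digits>` fragment from the query text, take the digits after `userid=`, convert them to a number, and apply modulo 3. This is an assumption-laden emulation for illustration, not NetScaler syntax; the function name `shard_index` is hypothetical, and the second argument to `TYPECAST_NUM_T` in the expression above is not modeled.

```python
import re
from typing import Optional

# Mirrors the userid=\d+ pattern from the policy expression;
# the capture group stands in for the AFTER_STR("userid=") step.
USERID_RE = re.compile(r"userid=(\d+)")


def shard_index(query: str, shard_count: int = 3) -> Optional[int]:
    """Return the shard index for a SQL query, or None if it has no userid."""
    match = USERID_RE.search(query)
    if match is None:
        # No userid in the query text: no sharding decision can be made here.
        return None
    # Convert the digits to an integer, then apply the modulo.
    return int(match.group(1)) % shard_count
```

A query such as `SELECT * FROM orders WHERE userid=9` yields index 0, which corresponds to the `% 3 == 0` case in the expression; the other two shards would be covered by analogous `== 1` and `== 2` policies.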
With NetScaler MPX platforms capable of up to 50Gbps throughput on a single appliance, DBAs can rest assured from a throughput perspective. NetScaler DataStream also significantly reduces load on the backend MySQL and Microsoft SQL Servers through SQL connection multiplexing/offload, and has been shown to improve application performance from a latency angle as well.
DB Sharding with NetScaler DataStream