Snapdeal optimises database performance
Thursday February 27 2014
Indian online shopping giant manages high transactional volumes with in-memory NoSQL database that delivers the best price and performance
Snapdeal recently revealed it has migrated to an open-source database platform to manage millions of pricing and inventory records with the highest performance to price ratio.
The leading Indian online shopping site realised that, in order to manage exponential growth, it would require a high-throughput, low latency system that could scale from 2 million to 150 million records, from 10,000 to 30,000 reads per second, and from 30 to 500 writes per second.
In production since November 2013, the Java-based Snapdeal inventory and pricing management system uses Aerospike to provide predictable sub-millisecond responses while managing 100 million-plus objects stored in 32 Gigabytes (GB) of dynamic random access memory (DRAM).
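As a rough illustration of what those figures imply, the sketch below divides the quoted memory by the quoted object count. It assumes the 32 GB covers all 100 million objects once; the article does not say whether the figure is per server or includes replicas and index overhead, so the result is only an upper bound on the average record size.

```python
# Back-of-the-envelope memory budget per object, using the article's figures.
# Assumption: 32 GB holds all 100 million objects once (no replica or index
# overhead is subtracted, so this is an upper bound on average record size).
OBJECTS = 100_000_000   # "100 million-plus objects" (lower bound)
DRAM_GB = 32            # "32 Gigabytes (GB)" of DRAM


def bytes_per_object(total_bytes: int, objects: int) -> float:
    """Average number of bytes available to each stored object."""
    return total_bytes / objects


# The article does not say whether GB is decimal or binary, so show both.
decimal_budget = bytes_per_object(DRAM_GB * 10**9, OBJECTS)
binary_budget = bytes_per_object(DRAM_GB * 2**30, OBJECTS)

print(f"decimal GB: {decimal_budget:.0f} bytes per object")
print(f"binary GiB: {binary_budget:.0f} bytes per object")
```

Either reading leaves a few hundred bytes per object, which is consistent with compact records such as IDs, stock counts, rankings and prices.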
The data stored includes seller and product IDs, inventory, seller rankings and pricing attributes. Product and price changes are made to both Aerospike and MySQL, while seller rankings and product details are read from Aerospike.
Rapid deployment results
“We were up and running with Aerospike in a matter of days. It was extremely easy. There has been no need for maintenance with the Aerospike database; it just works out of the box,” said Amitabh Misra, vice president of engineering at Snapdeal.
Snapdeal's advanced marketplace platform is written in Java and includes sub-systems for order and catalogue management, inventory and pricing management, fulfilment centre management, shipping, delivery and tracking management and TrustPay, a buyer-seller protection platform. Within this platform, the Java-based inventory and pricing management system uses Aerospike to provide predictable sub-millisecond responses.
"Our implementation runs on two Linux servers on the Amazon EC2 [Elastic Compute Cloud], and it takes advantage of Amazon EBS [Elastic Block Store] for persistent block-level cloud storage. As a start-up, we have seen significant advantages to implementing our marketplace in the cloud and using open source software throughout that platform."
Misra explained how the Aerospike-based system is the second database Snapdeal has implemented to support its dynamic inventory and pricing system.
Outgrowing early systems
"We initially deployed 10 MongoDB NoSQL database servers with 5 GB of data in DRAM as a cache in front of MySQL," he said. "Our Snapdeal application used write-through techniques to update information first in MySQL and then in MongoDB, while processing reads from MongoDB."
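The write-through pattern Misra describes can be sketched in a few lines. This is a minimal illustration, not Snapdeal's code: the class, method names and keys are hypothetical, and plain dictionaries stand in for MySQL and the in-memory cache tier.

```python
class WriteThroughStore:
    """Write-through sketch: update the system of record, then the cache."""

    def __init__(self):
        self.primary = {}  # hypothetical stand-in for MySQL (system of record)
        self.cache = {}    # hypothetical stand-in for the in-memory NoSQL tier

    def update_price(self, seller_id, product_id, price):
        key = (seller_id, product_id)
        self.primary[key] = price  # 1. write to the system of record first
        self.cache[key] = price    # 2. then write the same value to the cache

    def read_price(self, seller_id, product_id):
        # Reads are served from the cache only, as in the quoted description.
        return self.cache.get((seller_id, product_id))


store = WriteThroughStore()
store.update_price("seller-1", "sku-42", 499.0)
print(store.read_price("seller-1", "sku-42"))
```

Because every update is written to both stores in the same call, the cache tier has to absorb the full write rate, which is where the MongoDB deployment later ran into trouble.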
Misra added that, at first, the Mongo deployment worked "perfectly fine". "It was giving us very consistent response times, and it scaled well with hardware," he said. "However, as our business scaled and more sellers made price adjustments on more products, the MongoDB response times shot up from 5 milliseconds to more than a full second.
"As a result, we had to moderate the rate at which we were pushing updates. We had to spread them out evenly throughout the day as opposed to being able to absorb them in near real-time. Sometimes the updates had to wait for hours."
Testing for success
A number of tests were run to make sure that prices could be reflected in near real-time when millions of buyers a day were loading pages. But Misra reported: "We were getting stuck when there was a surge in concurrent price updates from many sellers and we saw degradation in the buyer experience."
With 500% growth in 2013 and company revenue projected to exceed $1 billion by 2015, Snapdeal realised it would require a high-throughput, low latency system. It did not want a system that gave good levels of average performance, but then degraded for 10% of the calls.
"At the same time we wanted a technology solution that could affordably scale as our business expanded," added Misra. "Notably, operational efficiencies directly impact profits, so we did not want to use an expensive clustered RDBMS (relational database management system)."
An evaluation of a variety of SQL and NoSQL technologies led Snapdeal to select Aerospike’s in-memory NoSQL database. The company also considered the option of upgrading the MongoDB NoSQL implementation. "We also evaluated Couchbase, Redis, Terracotta BigMemory Max, Amazon Memcache and Amazon DynamoDB, as well as Aerospike," reported Misra.
"We were looking for a solution that gave us three things: cost-effectiveness; concurrent reads and writes that could scale up with added hardware; and performance that was very good 95% to 99% of the time."
Meeting key criteria
He also explained how most of the systems that Snapdeal engineers reviewed failed to meet at least one of the criteria. "The existing MongoDB solution lacked predictable response times under high write loads, sharding was complex, and hardware requirements for scaling were cost prohibitive.
"Clustered RDBMS databases and the Terracotta BigMemory Max caching technology were too expensive. Redis did not have a server-side distribution mechanism at that point of time. Meanwhile, Amazon Memcache and Amazon DynamoDB did not deliver predictable low latency," Misra said.
In addition, he said Couchbase replicas could only be used as backup copies, not to distribute load. "Moreover, it required twice the number of servers as Aerospike for the same throughput."
By contrast, Misra said the Aerospike in-memory NoSQL database provided several advantages. It performed with predictable low latency with 95-99% of transactions completing within 10 milliseconds, "which is essential for enabling a responsive customer experience," he said.
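The distinction between a good average and a good 95th or 99th percentile can be made concrete with a short sketch. The latency samples below are invented for illustration and are not Snapdeal measurements; the function names are hypothetical, and the nearest-rank method is just one common way to compute a percentile.

```python
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Ceiling of pct * n / 100 using integer arithmetic only.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def meets_target(samples, pct, limit_ms):
    """True if pct% of the samples complete within limit_ms."""
    return percentile(samples, pct) <= limit_ms


# Invented data: 90% of calls take 1 ms, the remaining 10% take 50 ms.
latencies_ms = [1.0] * 90 + [50.0] * 10

mean_ms = statistics.mean(latencies_ms)
print(f"mean: {mean_ms:.1f} ms")
print(f"p95:  {percentile(latencies_ms, 95):.1f} ms")
print(f"p95 within 10 ms: {meets_target(latencies_ms, 95, 10.0)}")
```

In this made-up sample the mean sits comfortably under 10 milliseconds, yet the 95th percentile does not, which is the kind of degradation for a minority of calls that Snapdeal wanted to rule out.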
Aerospike also delivered the highest throughput and the best price/performance, and it was the lowest-cost system for Snapdeal in terms of both hardware requirements and ease of operations.
Indeed, Misra went so far as to say: "None of the other systems that we evaluated prior to Aerospike were fully viable options for us. Ultimately, we selected the Aerospike in-memory NoSQL database because it delivered the highest performance and enabled the lowest cost of deployment. It’s a fantastic product."
Consistent rapid response
Misra also shared the best practices Snapdeal has learned. “One of the most important things is to have a plan for ensuring a consistent rapid customer response in the face of high growth and a rapid rise in volume. In internet retail, you have a very short window for responding to a consumer, and you can’t afford to see that response time degrade or to see delays in your inventory updates.
“As part of that plan, it is critical to test any database being considered for this function to see if it will consistently process data in real-time even when there are millions of users, as well as identify if certain seller actions have any negative impact on the buyer experience.
“Also, a customer’s loyalty is only as good as his or her last experience, which is why consistency is so crucial. An important part of that consistency is having full replication of the data, so that if one server cluster supporting the database goes down, another one with the replicated data can take over. The dynamic elasticity of the cloud is an advantage here, since the second server can scale to support the extra workload.”
He added: “Additionally, servers and storage account for a significant part of an Internet retailer’s business, whether they are managed locally or hosted in the cloud. That is why an evaluation of databases should include an understanding of their resource utilisation, which can vary significantly between different brands.”
Looking to the future
The most important benefit of migrating to a new database platform for Snapdeal has been the ability to scale to meet rapidly growing business volume while ensuring a first-class consumer experience, according to Misra.
“With Aerospike, we can push through huge price changes while maintaining the same response time experience on the buyer's side – even with millions of buyers. That has been a big advantage. We will continue to build on that by adding more Aerospike database servers as our business continues to grow.”
Snapdeal also announced today that it had received a $133.7 million (£80.4m) investment from eBay.