We always look for ways to improve performance and usability for our clients and our software, which is why OrbitScripts now offers NoSQL databases for our ad systems.
Classical relational databases, also called SQL (Structured Query Language) databases, have been used since the 1970s to manage data. As programmers we have used SQL for years, but we have also seen the problems it can have with very large amounts of data under high load.
We knew that sooner or later we would need distributed computing, and NoSQL along with it, for higher availability and stability.
Why NoSQL?
NoSQL (“not only SQL”) databases are a broad class of newer database management systems that do not use the classical relational database management model.
Companies like Google and Amazon (and OrbitScripts!) have developed their own NoSQL database systems to address several common concerns with the SQL model, primarily:
1. Speed: SQL databases can slow systems down to a crawl, especially when millions of users are doing lookups against tables with millions of rows of data.
2. Mapping: SQL’s relational data doesn’t map well to programming structures with complex data types or hierarchical data like XML. Complex objects that contain other objects and lists do not usually map to a single row in a single table.
3. Programming Ease: Because SQL data doesn’t map as well, writing the software code becomes more difficult using SQL, and we don’t want to drive our programmers crazy!
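To make the mapping problem concrete, here is a minimal Python sketch. The campaign object, its field names, and the table names are all hypothetical and chosen only for illustration; they are not taken from our ad systems. It shows the same nested object stored once as a single document and once split into relational rows.

```python
import json

# Hypothetical ad campaign object; every field name here is illustrative.
campaign = {
    "id": 42,
    "name": "Spring sale",
    "budget": {"daily": 100, "total": 3000},
    "targeting": {"countries": ["US", "CA"], "devices": ["mobile"]},
}


def to_document(obj):
    """Document-style storage: the whole object becomes one serialized value."""
    return json.dumps(obj, sort_keys=True)


def to_relational_rows(obj):
    """Relational-style storage: the same object spread over several tables."""
    cid = obj["id"]
    return {
        "campaigns": [(cid, obj["name"])],
        "budgets": [(cid, obj["budget"]["daily"], obj["budget"]["total"])],
        "campaign_countries": [(cid, c) for c in obj["targeting"]["countries"]],
        "campaign_devices": [(cid, d) for d in obj["targeting"]["devices"]],
    }


document = to_document(campaign)
rows = to_relational_rows(campaign)

# One value round-trips the whole object; the relational form needs four
# tables and a join on the campaign id to rebuild it.
restored = json.loads(document)
```

The nested lists are what force the extra tables: each list needs its own table keyed by the campaign id.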
(Read more on the SQL vs. NoSQL debate here.)
The Tradeoffs of NoSQL
NoSQL databases are not built primarily on tables and generally do not use SQL for data manipulation. As such, they are highly optimized for retrieval and appending operations, but usually offer limited functionality beyond record storage (key–value stores, for example).
That limitation reduces the run-time flexibility compared to full SQL systems, but it also gives NoSQL better scalability and performance for certain data models.
NoSQL database management systems work better when what really matters is the ability to store and retrieve great quantities of data, not the relationships between the data elements.
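As a rough illustration of the tradeoff, the sketch below is a toy in-memory key-value store in Python. The class and method names are our own invention for this example and do not correspond to any real product's API. It supports exactly the operations such stores are optimized for (store, retrieve, append) and nothing like a join or an ad hoc query.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy key-value store: put/get by key, plus append-only event lists."""

    def __init__(self):
        self._values = {}
        self._logs = defaultdict(list)

    def put(self, key, value):
        # Overwrite whatever was stored under this key.
        self._values[key] = value

    def get(self, key, default=None):
        # Retrieval is a single lookup by key; there is no query language.
        return self._values.get(key, default)

    def append(self, key, event):
        # Appending never rewrites earlier events; returns the new length.
        self._logs[key].append(event)
        return len(self._logs[key])

    def events(self, key):
        # Return a copy so callers cannot mutate the stored log.
        return list(self._logs.get(key, []))


store = KeyValueStore()
store.put("ad:1:targeting", {"countries": ["US"]})
store.append("ad:1:clicks", {"ts": 1})
store.append("ad:1:clicks", {"ts": 2})
```

Answering a question such as "which ads target the US?" would require scanning every key, which is the reduced run-time flexibility described above.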
Ad Serving Systems and Their Data
Heavy-load ad management systems have a lot of data, primarily:
- Balance sheets of users (advertisers, publishers, agencies, etc.)
- Advertising budgets and limits
- Targeting information for the ads themselves
The speed of access to this information as it updates, together with the integrity and reliability of the data, plays an important role in the ad system as a whole. Ad systems also require fault tolerance, scalability, and support for atomic operations, so that simultaneous reads and writes leave the data consistent.
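Here is a small Python sketch of why atomicity matters for balances. It is an in-process simulation using a lock, not the mechanism of any particular NoSQL store, and the names `BalanceStore`, `charge`, and `simulate` are hypothetical. The check and the debit happen as one step, so concurrent clicks can never spend more than the budget.

```python
import threading


class BalanceStore:
    """Toy balance store; amounts are whole units (for example, cents)."""

    def __init__(self):
        self._balances = {}
        self._lock = threading.Lock()

    def deposit(self, account, amount):
        with self._lock:
            self._balances[account] = self._balances.get(account, 0) + amount

    def charge(self, account, amount):
        """Check and debit as one atomic step; False if funds are short."""
        with self._lock:
            current = self._balances.get(account, 0)
            if current < amount:
                return False
            self._balances[account] = current - amount
            return True

    def balance(self, account):
        with self._lock:
            return self._balances.get(account, 0)


def simulate(clicks=1000, workers=8, cost=1, budget=500):
    """Fire `clicks` charge attempts from several threads at one account."""
    store = BalanceStore()
    store.deposit("advertiser:1", budget)
    charged = []

    def worker(attempts):
        ok = sum(1 for _ in range(attempts) if store.charge("advertiser:1", cost))
        charged.append(ok)

    threads = [
        threading.Thread(target=worker, args=(clicks // workers,))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(charged), store.balance("advertiser:1")
```

With the default arguments, 1,000 attempts against a budget of 500 yield exactly 500 successful charges and a final balance of 0. Without the lock around the check and the debit, two threads could both see enough funds and overdraw the account.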
Because the emphasis in our business is on data storage and retrieval, the NoSQL model works best for us, and we are excited to provide those benefits to our customers.
What Are the Best NoSQL Storage Systems?
In developing our NoSQL databases, we tested and used a lot of NoSQL storage services. We found some services to be better than others at different tasks.
The “best” service depends on what you need, of course, but here’s a summary of the different NoSQL companies we used:
| Company | Average speed for write / read on a node, by first key | Search on second key? | Scalability vs. Replication | Atomic Operations / Transactions |
|---|---|---|---|---|
| Cassandra | 15,000 / 10,000 | Yes | Scalable | Yes / No |
| MongoDB | 10,000 / 10,000 | Yes | Scalable | Yes / No |
| Redis | 50,000 / 100,000 | No | Replication | Yes / Yes |
| Aerospike | 100,000 / 200,000 | No | Replication | Yes / No |
| CouchBase | 10,000 / 15,000 | Yes | Scalable | Yes / No |
| HyperDex | 10,000 / 10,000 | Yes | Scalable | Yes / No |
| Tarantool | 100,000 / 150,000 | No | Replication | Yes / No |
(Our testing was done with the YCSB utility on Amazon AWS.)
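If you want to filter the results above by your own requirements, one option is to encode them as data. The following Python sketch does that. The figures are copied from our summary, while the `Store` type, its field names, and the `candidates` helper are hypothetical names made up for this example.

```python
from typing import NamedTuple


class Store(NamedTuple):
    name: str
    writes: int            # average write speed on a node, first key
    reads: int             # average read speed on a node, first key
    second_key_search: bool
    scaling: str           # "Scalable" or "Replication"
    atomic_ops: bool
    transactions: bool


STORES = [
    Store("Cassandra", 15_000, 10_000, True, "Scalable", True, False),
    Store("MongoDB", 10_000, 10_000, True, "Scalable", True, False),
    Store("Redis", 50_000, 100_000, False, "Replication", True, True),
    Store("Aerospike", 100_000, 200_000, False, "Replication", True, False),
    Store("CouchBase", 10_000, 15_000, True, "Scalable", True, False),
    Store("HyperDex", 10_000, 10_000, True, "Scalable", True, False),
    Store("Tarantool", 100_000, 150_000, False, "Replication", True, False),
]


def candidates(min_reads=0, need_second_key=False, need_transactions=False):
    """Return the names of stores that meet every stated requirement."""
    return [
        s.name
        for s in STORES
        if s.reads >= min_reads
        and (s.second_key_search or not need_second_key)
        and (s.transactions or not need_transactions)
    ]


fast_readers = candidates(min_reads=100_000)
searchable = candidates(need_second_key=True)
transactional = candidates(need_transactions=True)
```

Here `fast_readers` is `["Redis", "Aerospike", "Tarantool"]`, `searchable` is `["Cassandra", "MongoDB", "CouchBase", "HyperDex"]`, and `transactional` is `["Redis"]`.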
The following services ranked best for handling user balances (for advertisers, partners and agencies):
Aerospike, Redis and Tarantool
Best for creating the most expedient NoSQL database:
Redis or Tarantool
The services with the highest efficiency in storing information about targeting:
MongoDB, CouchBase and HyperDex
And the best NoSQL services for handling clicks and statistics:
Cassandra and MongoDB
We hope you find this information useful, not only to explain why we developed NoSQL database systems for our products, but also as a reference for your own NoSQL development.
Have you used any of these NoSQL services before?
Do you agree with our assessments?
Do you disagree?
Let us hear from you!