Things to consider when choosing a cloud-based database


The database is a key component of most computing infrastructures. It lets users store data in an organized manner and retrieve it easily. A cloud database is a database management system that runs on a cloud computing platform and is accessible from anywhere. It communicates mainly over the internet, sharing information among multiple devices, and the number of these devices is expected to increase.

Nowadays, many companies offer cloud databases as a service, such as Microsoft Azure, Google, Amazon EC2, GoGrid, Garantia Data, and MongoLab. These providers offer two common deployment options: running a database yourself on a virtual machine, or purchasing a managed database service that the cloud company maintains.

Because cloud-based databases come in different architectures, users should weigh several factors when choosing one. The selection depends not only on the services the provider offers but also on the requirements of the company adopting it. The following parameters can serve as a guide to choosing the best cloud database.

1. Portability

Moving to a cloud-based database system means transferring existing data from the current database to the cloud. Portability matters especially for organizations that currently use a traditional relational database and hold a large amount of existing data. For these organizations, a relational database system on the cloud is a sensible solution, such as Amazon Relational Database Service (which supports Oracle and MySQL databases, with import and export features) or Google Cloud SQL (which is currently in a limited preview phase and also supports importing and exporting existing data).

In addition, the ability to migrate the database from one cloud-computing provider to another, or even from a cloud-computing provider to your own server, matters. Unexpected circumstances might force the user to drop the current provider and move to another one. Therefore, before settling on a particular database from a cloud-computing provider, the user needs to consider whether the application and its database code can be ported easily once they have been implemented.
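As a rough illustration of what an export and import round trip looks like, the sketch below dumps a database to plain SQL text and reloads it elsewhere. It uses Python's built-in sqlite3 module purely as a local stand-in for a source and a target provider; the table, the sample rows, and the function names are hypothetical, and a real migration would rely on each provider's own export and import tooling.

```python
import sqlite3


def build_source():
    """Create a small source database (hypothetical schema and data)."""
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    src.executemany(
        "INSERT INTO customers (name) VALUES (?)", [("Ada",), ("Linus",)]
    )
    src.commit()
    return src


def export_dump(conn):
    """Return the whole database as SQL text that can be replayed elsewhere."""
    return "\n".join(conn.iterdump())


def import_dump(sql_text):
    """Replay a SQL dump into a fresh database and return its connection."""
    target = sqlite3.connect(":memory:")
    target.executescript(sql_text)
    return target


def migrate_demo():
    """Export from the source, import into the target, and read the rows back."""
    source = build_source()
    target = import_dump(export_dump(source))
    return target.execute("SELECT name FROM customers ORDER BY id").fetchall()


if __name__ == "__main__":
    print(migrate_demo())
```

The less provider-specific syntax the schema and queries contain, the more likely a dump like this is to replay cleanly on another system.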

2. Reliability and Availability

For a database that requires high reliability and availability, a cloud-based database that offers replication of data is essential.

For example, Amazon Relational Database Service (Amazon RDS) offers a feature that helps ensure data reliability: Multi-AZ deployment. When users enable it and run their instance as a Multi-AZ instance, Amazon RDS automatically creates and manages a “standby” replica in a different Availability Zone. Database updates are applied to both the primary and the standby database simultaneously. The standby cannot serve read traffic, but it can replace the primary during database maintenance or if the primary instance fails. This helps keep the database system reliable and available in case of an incident.
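The primary and standby behaviour described above can be pictured with a toy in-memory model. This is only a sketch of the idea, not how Amazon RDS is implemented: the class names, the zone labels, and the explicit fail_over call are all assumptions made for illustration.

```python
class Instance:
    """One database copy living in a single (hypothetical) availability zone."""

    def __init__(self, zone):
        self.zone = zone
        self.data = {}
        self.healthy = True


class MultiAZDatabase:
    """Toy model: every write is applied to a primary and a standby copy."""

    def __init__(self, primary_zone, standby_zone):
        self.primary = Instance(primary_zone)
        self.standby = Instance(standby_zone)

    def _require_primary(self):
        if not self.primary.healthy:
            raise RuntimeError("primary unavailable; fail over first")

    def write(self, key, value):
        # Both copies are updated before the write returns.
        self._require_primary()
        self.primary.data[key] = value
        self.standby.data[key] = value

    def read(self, key):
        # Reads are served by the primary only, never by the standby.
        self._require_primary()
        return self.primary.data[key]

    def fail_over(self):
        # Promote the standby, then seed a fresh standby in the old zone.
        old_zone = self.primary.zone
        self.primary = self.standby
        self.standby = Instance(old_zone)
        self.standby.data = dict(self.primary.data)


if __name__ == "__main__":
    db = MultiAZDatabase("zone-a", "zone-b")
    db.write("order:1", "paid")
    db.primary.healthy = False  # simulate an instance failure
    db.fail_over()
    print(db.primary.zone, db.read("order:1"))
```

Because the standby already holds every committed write, promoting it loses no data in this model, which is the property the replication is meant to provide.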

Google Cloud SQL is also suited to applications that need high data replication, because it is designed with inherent support for replicating data across different availability regions. Google Datastore (a NoSQL database system) offers the High Replication Datastore (HRD) model, which uses the Paxos algorithm to increase the reliability and availability of the database system. However, for a database that does not require high data replication, a service with this feature could hurt the application’s performance.

3. Scalability

Scalability is one of the main reasons companies consider a cloud-based database system, because most cloud-based database systems are easier to scale than traditional ones.

Some users have an existing database system and simply want to improve its performance and take advantage of a cloud solution, yet still require complex transactional operations (such as join queries) and complex relations among their data. For them, moving the existing database to a cloud service like Amazon Relational Database Service or Google Cloud SQL is a great option to consider.
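To make "join query" concrete, here is a minimal sketch of the kind of relational operation such users depend on. It runs against Python's built-in sqlite3 module as a local stand-in for a hosted relational database; the tables and rows are hypothetical.

```python
import sqlite3


def orders_per_customer():
    """Join two related tables and aggregate the orders for each customer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            total REAL
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Linus');
        INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5);
        """
    )
    rows = conn.execute(
        """
        SELECT c.name, COUNT(o.id), SUM(o.total)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, order_count, total in orders_per_customer():
        print(name, order_count, total)
```

A query like this relies on the database engine to relate rows across tables, which is what a relational cloud service preserves after migration.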

However, for applications that demand performance and scalability rather than complex database operations, or whose data is not well structured, a relational database system will not, by its nature, perform as well as a NoSQL database system on extremely large amounts of data. In that case, a NoSQL database solution on the cloud, such as DynamoDB (from Amazon) or Google Datastore (used for Google App Engine), is much more suitable. With Amazon DynamoDB, all users need to do is specify the level of traffic they wish to serve; Amazon takes care of scaling the system so that the application can serve that traffic level.
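Specifying the desired traffic level for DynamoDB amounts to declaring read and write capacity when a table is created. The sketch below only builds the request parameters, so it runs without an AWS account; the helper name, the table name, and the capacity numbers are hypothetical, while the keys follow the DynamoDB CreateTable request format.

```python
def provisioned_table_request(table_name, key_name, read_units, write_units):
    """Build CreateTable parameters that declare the desired throughput."""
    return {
        "TableName": table_name,
        # The partition key is stored as a string ("S") in this sketch.
        "AttributeDefinitions": [
            {"AttributeName": key_name, "AttributeType": "S"}
        ],
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        # The traffic level the table should be able to serve.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


if __name__ == "__main__":
    request = provisioned_table_request("sessions", "session_id", 100, 20)
    print(request["ProvisionedThroughput"])
    # Assuming boto3 is installed and AWS credentials are configured, the
    # request could be sent with:
    #   import boto3
    #   boto3.client("dynamodb").create_table(**request)
```

The application states only how much read and write capacity it wants; how that capacity is spread across hardware is left to the service.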

4. Programming Environment

Aside from deciding whether to use a relational or non-relational database and what architecture the database should be built upon, the user should also consider the programming environments that come with the database. This is because the programming environment affects both the perceived speed of database operations on the client side and how easily the database can be migrated.

Some databases can be accessed only through certain programming languages and their APIs. For example, users of Google App Engine’s non-relational database (Google Datastore) can access it only with Java, Python, or Go, although Google has said it plans to support more languages in the future.

With MySQL hosted on Amazon Web Services, however, the user can choose among various programming languages such as C#, Visual Basic, and Java. Programs written in these languages differ in run time, which affects the end user’s experience because the client-server and server-database exchange of information depends mostly on the programming environment.