Database Performance Metrics that Matter
Database performance can be measured using a number of key performance indicators. These indicators are largely the same regardless of the underlying platform, whether it is a relational SQL database or a NoSQL system such as MongoDB or Apache Cassandra. The main database performance testing metrics are as follows:
- Throughput measures the amount of data (or number of operations) that can be processed in a particular span of time.
- Response time, or latency, is the time it takes for a database operation to complete and return its results.
- Database connections measure the number of clients that can be connected to your database simultaneously.
- Error rate is the number of failed operations (including crashes and timeouts) that occur in a certain time period.

By looking at all of these metrics together, you get a complete picture of how your database is performing.
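As a minimal sketch of how these metrics relate, the snippet below times a batch of queries against a throwaway in-memory SQLite table (the table and query are purely illustrative) and derives throughput, average latency and error rate from the same measurements:

```python
import sqlite3
import time

# Illustrative in-memory table to benchmark against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?)",
                 [(f"person{i}", f"555-{i:04d}") for i in range(1000)])
conn.commit()

n_queries = 200
errors = 0
latencies = []
start = time.perf_counter()
for i in range(n_queries):
    t0 = time.perf_counter()
    try:
        conn.execute("SELECT phone FROM contacts WHERE name = ?",
                     (f"person{i % 1000}",)).fetchone()
    except sqlite3.Error:
        errors += 1                       # a failed operation counts toward the error rate
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

throughput = n_queries / elapsed               # operations per second
avg_latency = sum(latencies) / len(latencies)  # seconds per operation
error_rate = errors / n_queries                # fraction of failed operations
```

The same loop yields all three numbers, which is why they are usually collected together by monitoring tools rather than measured in isolation.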
What’s Involved in a Database Call
Before you can improve the performance of your database, it’s handy to understand exactly what happens when you make a request to your database. If you are communicating with a remote database, all of the following steps need to take place before you receive a response:
- A connection to the database is opened using a database driver.
- A pipeline is created between TCP sockets on your computer and the database server, and the connection is added to the connection pool (a pool of open connections that can be reused, so that later requests don't have to pay the cost of establishing a new connection from scratch).
- Your computer sends a request (i.e., the query) to the database through the pipelines.
- The database receives the request and processes it.
- The response is sent back to your computer through the pipeline.
- The database connection and pipeline are closed (or the connection is returned to the pool) after you get the response back.
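The same lifecycle can be sketched with Python's built-in sqlite3 driver. SQLite is an embedded local database, so there is no network hop or pool here, but the driver API mirrors the same open, query, respond, close sequence described above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")            # open a connection via the driver
try:
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (42)")
    cursor = conn.execute("SELECT x FROM t")  # send the request (the query)
    row = cursor.fetchone()                   # receive the response
finally:
    conn.close()                              # close the connection when done
```

With a remote database driver (for example a PostgreSQL or MySQL client library) the shape of the code is the same; the difference is that each of those steps also involves the network round trips listed above.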
The most time-consuming step is usually the database processing time. Getting data back to your computer can also take a significant amount of time, especially if you have requested a large volume of data. But the easiest way to improve performance is usually to reduce the time it takes for your database to respond to requests.
Improving Database Throughput and Response Times
There are three main ways you can improve database throughput, reduce latency and improve response times. First, you can add more hardware to the computer that runs your database: a faster CPU, additional system memory or additional (and faster) disk storage. Second, you can increase your internet speed if you are communicating with a remote database.
And if your database is cloud based, hosted on a service like Microsoft Azure, Amazon Web Services (AWS) or Rackspace, you may also be able to scale up resources from your control panel. This is especially an option if you are using a managed database like Amazon DynamoDB or Azure Cosmos DB. However, you should be aware that scaling up resources can get very expensive. While you might be able to get virtually unlimited throughput through a service like AWS, database usage charges could become prohibitive.
In any case, fixing issues through hardware, scale and speed upgrades may just be a Band-Aid for a poorly designed or implemented system. In fact, if you just blindly add hardware, you may not experience the performance gains you were hoping for.
Instead, before you rush to upgrade your computer system or internet speed you should make sure that your database is using optimal indexes and optimal queries. Optimal indexes will let you more quickly find information in your database.
For example, let’s assume you have a database table that tracks name and phone number, and name is set as the primary index (this means that the table has an index that lets it quickly find names). However, what happens if you find yourself searching for phone numbers just as often as you search for names? In this case you could see large performance gains by simply adding a phone number index to the table. The drawback is that an extra index takes additional storage and slows down writes slightly, but if you have the capacity to spare this is a quick way to improve database performance.
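The effect of that extra index can be seen directly in the database's query plan. The sketch below uses SQLite (table and index names are made up for illustration): before the index, a phone lookup scans the whole table; after it, the database uses the index to jump straight to the row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT PRIMARY KEY, phone TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?)",
                 [(f"person{i}", f"555-{i:04d}") for i in range(100)])

# Without a phone index, a phone lookup must scan every row.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM contacts WHERE phone = '555-0042'"
).fetchone()[-1]

# Adding an index turns the full scan into a fast indexed search.
conn.execute("CREATE INDEX idx_phone ON contacts (phone)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM contacts WHERE phone = '555-0042'"
).fetchone()[-1]
```

Most database systems offer a similar facility (EXPLAIN in MySQL and PostgreSQL, execution plans in SQL Server), which is the standard way to confirm that a query is actually using the index you added.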
Similarly, you can optimize database queries to make them more efficient. One example might be if you have a name and phone number table and a name and address table. If you want a list of people’s names, phone numbers and addresses, the most efficient way to get it is usually to join the two tables together in a single query. If you query each table separately and then try to combine the results yourself, there can be a lot more system overhead.
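To make the join concrete, here is a minimal sketch (the table and column names are invented for the example) showing the single-query approach: the database combines the two tables itself, in one round trip, instead of the client fetching two result sets and merging them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phones (name TEXT, phone TEXT)")
conn.execute("CREATE TABLE addresses (name TEXT, address TEXT)")
conn.execute("INSERT INTO phones VALUES ('Ada', '555-0100')")
conn.execute("INSERT INTO addresses VALUES ('Ada', '1 Main St')")

# One round trip: the database does the combining.
joined = conn.execute(
    "SELECT p.name, p.phone, a.address "
    "FROM phones p JOIN addresses a ON p.name = a.name"
).fetchall()
# The alternative -- SELECT from each table separately, then match the
# rows up by name in application code -- costs an extra round trip and
# duplicates work the database engine is optimized to do.
```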
Refactoring your Database
Sometimes you may find that the data in your database is not optimally stored, causing difficulties with queries. In that case you may want to change the structure of your database to more closely match your business processes. For instance, in our name / phone number example above, if you find yourself always searching for phone numbers and never for names, you might consider changing the primary index of your table from name to phone number.
Other refactoring techniques include adding proper relational constraints and normalization. The purpose of these techniques is to reduce duplication of the same data in your database. By reducing duplication your database will require less storage, and performance can improve. Tasks like database updates also become easier, as you’ll only need to update one copy of a piece of data instead of several.
Refactoring can be a lot of work, especially if you have a lot of data stored in your database. But the long-term gains could easily outweigh the required labor.
Improving Database Performance Through Caching
Another way to improve performance is to create a cache. A cache is just a special storage space for information that is frequently accessed. Cached data is placed in a type of memory that is quick to access. Properly setting up a cache will require the services of somebody with expertise in your particular database system, such as MySQL, SQL Server or Oracle. The database expert will measure things like the cache hit ratio to determine which types of items should be cached.
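The idea behind the cache hit ratio can be sketched with a simple application-side cache (a plain dictionary standing in for a real caching layer such as the database's buffer cache or a tool like Redis; the table and names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.executemany("INSERT INTO contacts VALUES (?, ?)",
                 [("Ada", "555-0100"), ("Grace", "555-0101")])

cache = {}          # frequently accessed rows kept in fast memory
hits = misses = 0

def lookup_phone(name):
    """Return the phone row for name, consulting the cache first."""
    global hits, misses
    if name in cache:
        hits += 1                     # served from memory, no database work
        return cache[name]
    misses += 1                       # fall through to the database
    row = conn.execute(
        "SELECT phone FROM contacts WHERE name = ?", (name,)).fetchone()
    cache[name] = row
    return row

for name in ["Ada", "Ada", "Ada", "Grace"]:
    lookup_phone(name)

hit_ratio = hits / (hits + misses)    # 2 hits out of 4 lookups = 0.5
```

A high hit ratio means most requests never touch the slower storage layer at all, which is exactly what a well-chosen cache is for; a low ratio suggests the wrong items are being cached.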
Improving Performance Through Replication
If you are operating an enterprise-level database, there may be other ways to improve performance, but they can get very expensive. For example, if you operate a popular website, millions of people may access your site every day. If most of these people just view data, you can employ techniques like creating multiple copies (replicas) of your database and using a load balancer to distribute read requests between the different replicated databases. Of course, this means that you need to update multiple databases when data changes. Still, this type of approach can help if the majority of your database transactions are read requests instead of write requests. However, the system architecture for these types of systems can get quite complex, and you should only embark on this type of development if you have exhausted all other possibilities.
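The routing logic at the heart of such a setup can be sketched in a few lines. The replica names below are hypothetical placeholders; a real load balancer would hold connections to separate database servers and use a more robust write-detection rule than a SELECT prefix check:

```python
import itertools

# Hypothetical replica endpoints standing in for real database servers.
replicas = ["replica-1", "replica-2", "replica-3"]
rotation = itertools.cycle(replicas)

def route(query):
    """Send reads round-robin across replicas; send writes to the primary."""
    if query.lstrip().upper().startswith("SELECT"):
        return next(rotation)
    return "primary"

targets = [route(q) for q in [
    "SELECT * FROM contacts",
    "SELECT * FROM orders",
    "UPDATE contacts SET phone = '555-0101'",
    "SELECT * FROM contacts",
]]
# targets -> ['replica-1', 'replica-2', 'primary', 'replica-3']
```

Reads spread across the replicas while the single write goes to the primary, which is why this pattern only pays off when reads dominate your workload.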
Data Compression and Archiving
Other techniques to improve performance include compressing stored data. Compression saves storage space, but it increases computational time, as data needs to be decompressed before you can work with it. You can also look at archiving data, the process of removing information that you no longer need. Again, this process will save space in your core database. For more information on archiving please see our “When to Archive Data” article.
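The space-versus-CPU trade-off is easy to see with Python's standard zlib module (the record contents here are made up; real databases apply similar compression at the page or column level):

```python
import zlib

# A block of repetitive row data compresses well, but costs CPU both ways.
record = b"name=Ada;phone=555-0100;" * 100
compressed = zlib.compress(record)

ratio = len(compressed) / len(record)    # well under 1.0 for repetitive data
restored = zlib.decompress(compressed)   # must pay this cost on every read
```

The more repetitive the stored data, the better the ratio; the price is the decompress step on every access, which is why compression suits rarely-read or archival data best.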
If you are just looking for a desktop database application that can only be accessed from your computer, have a look at our Tracker Ten Windows database system. We take care of database optimization, memory management and throughput issues for you, so you can concentrate on your home or business application, not the technology behind it!