Database Hardware Prerequisites
If you have decided that you need a database or a data warehouse, there are several things that you may need to do before you even select and install system. This is especially the case if you want to operate your own hardware and software, instead of using a cloud-based service like Microsoft Azure or Amazon Web Services, or a shared or dedicated blade at a data center.
The focus of this article is to outline some of the issues you need to consider when selecting, engineering and configuring database hardware. And if you are setting things up at your own site, not only will you need to purchase needed hardware, you’ll also need to ensure you have a solid electrical power supply, and you’ll need to setup required networking.
At this stage it’s also important to understand the distinction between logical database requirements and physical database requirements. A logical database definition defines all of the tables needed to store your information, along with the related hierarchical organization and data relationships and the user authorization rules, regardless of where the actual tables are stored. A physical database definition defines the actual physical hardware and network database model where individual tables are stored. Obviously, the physical database definition has a direct impact on hardware requirements.
It doesn’t matter if you eventually go with a relational database system like SQL server, PostgreSQL, MariaDB or MySQL; or a NoSQL document database system like Mongo DB. An awareness of hardware issues can even be important if you select or an off the shelf proprietary system like Tracker Ten or another DBMS. Being rushed and frugal at the initial hardware selection stage will only lead to future difficulties. Obviously, hardware pricing will always be a factor, but it should not be the only consideration. While the ideal database hardware setup may vary, there are definitely some general database hardware calculator guidelines you can follow.
Selecting Database Hardware
Whether you are starting from scratch, or have an existing system, you will need to ensure that your hardware configuration is capable of running a database system. Database read and write ETL (extract, transform, load) workloads are memory and I/O intensive. Complex queries can be CPU intensive. Data visualization can be graphics intensive. Demands placed on your system can be huge if you have database tables that track millions or billions of rows of data. Ideally your hardware will not be a bottleneck to system performance.
Many of these issues can be alleviated by selecting proper certified hardware. In fact, some database systems are “close-to-the-metal”, meaning that they have in depth knowledge of the hardware they are running on, and can squeeze out the best possible performance using specialized instruction sets. And there are also appliance layers like “SAP HANA”, that can optimize performance of database systems on intel Xeon processors, and technologies like NBMe and NVDimm storage (non-volatile memory express), that speeds communication between storage and your computers central processing unit (CPU). But before you jump into this type of optimization, and select recommended hardware, you still need to figure out what the best type of hardware is first. And obviously at the same time you at the same time you want to prevent hardware wastage (i.e., purchasing more then you need).
Before you go any further, you need a solid understanding of your physical database requirements. You’ll want to ensure that you have enough RAM memory and physical hard disk space. If you don’t have enough resources, performance related errors can quickly accumulate, and your database can be left in a poor state. Additionally, depending on the type of queries you need to perform and the type of reporting you need to do, you’ll want to make sure that your CPU and GPU are up to the task. And if you want other people to be able to access your database system, you will require appropriate networking hardware and you’ll want to minimize “node sprawl”. Without a solid hardware foundation, your database performance will falter.
Database Storage Requirements
When you are sizing your database and trying to figure out your database table space requirements and storage subsystem, you need to look at the type of data you are storing, not just the number of records. You also need to understand the difference between RAM memory, and more permanent storage. RAM memory is temporary storage used for processing data while your computer is operating. Permanent storage, like hard drives store data long term, and the information is saved even when your computer is not operating.
For a basic database system, you want to have at least 4 GB or ram (keeping in mind that at least 1 GB RAM will likely be used by your computers operating system). Enterprise level systems, like SQL server or Oracle, may require a lot more ram (16 GB and up). And if you are going to need to support multiple concurrent users, you will likely want to add 1GB of additional RAM for every 5 concurrent users. And it’s important to realize that users may include people that have direct access to your database working at the same time, and it may also include users that work with your database through web browsers and other remote systems (for example if you have library database, and patrons browse your collection online, they can qualify as database users, or if you have an ecommerce store, the number shoppers searching your inventory database at once count as users as well).
If you don’t have enough RAM your computer’s operating system may virtually uses some of your hard disk storage as RAM. But the drawback of this approach is hard drives are much slower than RAM, so if your computer is forced to use virtual memory, system performance will slow dramatically. In fact, doubling the amount of physical RAM in your computer and often halve the amount of time it takes to perform complex operations.
It’s also important to understand that images, videos and sound files will take much more space than textual information (in fact you may elect to use file storage instead of database storage for these items, for a more detailed discussion please see our “Storing Images in Database” article). If you are just storing text in your database, you may get away with a couple of GBs of hard disk storage space. If you are storing media, you’ll need much more space (40 GB or more is often recommended by professionals).
Your application can also have a huge impact on future space requirements. A transactional system (i.e., tracking sales for a store) may start out small but grow quickly, and you’ll definitely want to setup a OLTP architecture, that supports transaction orientated applications. On the other hand, an insurance inventory asset tracking system may remain relatively static in size. An in depth understanding of your yearly rate of growth is essential.
You will also want to consider the number of people that simultaneously need to access your database. If you have a large number of data consumers, you may want to consider data replication and mirroring to increase system availability. Also, you might look into setting up back up systems, so you can automatically failover to different hardware in case of catastrophic hardware failure. to These requirements could greatly increase your hardware needs.
For physical storage, in addition to the storage capacity you’ll also want to consider the pros and cons of the technology you select. For example, SSD hard drives will be much faster the traditional magnetic hard drives, but they will also be much more expensive for comparable storage. You could also consider setting up RAID (redundant arrays of inexpensive disks) hard drives and NAS (network attached storage) devices, and you may also look at techniques like striping data (when you stripe data, you divide it into small chunks and store the chunks in separate files for quicker I/O). Ideally your database system will support automatic striping, so you don’t need manually configure anything. However even with automatic striping, you want to make sure that your hardware is up to the task.
Don’t Forget about Database Backups
When you are determining your database storage requirements, don’t forget about the need to make regular database backups. For more information see our “Making a Full Database Backup” and “When to Archive Data” articles.
Database Processor Requirements
When you are looking at processing power, all of the mainstream chip manufacturers including Intel, AMD and Nvidia may offer workable microprocessor solutions. Generally speaking, the more “cores” that a chip contains, the greater the processing power. For a basic single user database system, you will want to have at least a dual core processor, available on most desktop computers. For enterprise level system you’ll want systems with 8-16 cores, and you may have to start looking at dedicated servers. And the more sockets your motherboard supports the better – more sockets will also you to add additional processing power.
Hardware Electrical Power Requirements
In may instances you will want your database to be available 24/7. This means that in addition to be plugged directly into a power supply, you will also want to ensure that your hardware is attached to a power backup system or generator, in case the power goes out. These power backups will ensure system availability. Since database systems often run on very powerful hardware, you need to ensure that your backup system is capable of powering your database server hardware.
Database Networking Requirements
If you require multi-user access for your database, you will need it connected to the internet. Networking hardware may include a combination of interface cards, peripherals and other communication devices.
A WI/FI connection is not recommended as it’s prone to drop and get interrupted. Instead, you should try and attach your database server to a wired ethernet connection with a fixed IP address. This setup will ensure that clients can communicate with the databases at all times. A properly configured network database model will ensure that clients and end users can access information and data when they need it.
Another distinction to be aware of is between a distributed database architecture and a multi-user database system. In a multi-user system, end users may all be communicating with the same database server, that is performing all database operations using a service-oriented architecture (SOA), constantly listening for client requests. In a distributed architecture, database operations may be performed by different computers on the network, possibly using a multi-tier architecture. If you are using a distributed architecture, you may also need to look into special hardware layers that perform functions like service handling and load balancing.
Database GPU Requirements
If you plan to do a lot of data visualization with your database (like 3 dimensional charts and graphs), you will also want your system to include a robust graphics processing unit. If you try and use the graphics integrated into your motherboard, you may find visualization tasks are slow and kludgy. On the other hand, a dedicated GPU (like NVidia GeForce or AMD Radeon), will quickly render images on your screen using specialized rendering engines. And dedicated GPU’s have an added advantage of being able to multitask and perform tasks in parallel. A dedicated GPU may even be able to assist with complex queries and data analysis, accelerating everyday database performance. In fact, a dedicated GPU may even be able to be programmed to act as additional cores in your system (that’s why graphics cards are great for computational complex tasks like crypto currency and bitcoin mining). When you are looking for a dedicated GPU, you’ll want to ensure that it works with visualization tools, and it can be scaled up if your needs change.
Our own Tracker Ten database system works on Microsoft Windows computers. Using the guidelines in this article you should be able to determine the size and features of the hardware you will need to smoothly run Tracker Ten, depending on your individual requirements.