Relational databases store data as large database files. Although the information is hard to retrieve as a whole, one can FTP these files and, with a simple text editor, figure out patterns that will yield data such as credit card numbers, customer information and passwords.
If you are storing graphics in your relational database, such as binary large objects, this information is harder to steal. Yet anyone who obtained a data file of your binary object database could figure out the offsets and recreate the binary images using a simple program.
Theft of relational data is simple because the data is stored in table format in its original form. Your valuable data is simply sitting in a structure to be retrieved quickly and related to other items.
The only way to protect content in a relational database is to encrypt the data. When you encrypt data, you change its form into useless unreadable data. Later, with a password key, you can decrypt the data and restore it to its original valuable form.
There are basically three software strategies to encrypting relational data. The first strategy is to encrypt your operating system files. This usually involves a hardware solution to stream all file I/O in encrypted format. This option usually is the most expensive.
The second strategy is to purchase a relational database where encryption is built into the database engine. In this case the relational database translates the SQL statements and data inputs into encrypted format. When data is written to the disk, it appears as junk. When again this data is retrieved, the database engine decrypts the data blocks so the data appears in its original form.
The drawbacks of the above strategy are that it does not necessarily include a good methodology of granting users rights on encrypted data, nor does it imply a strategy to audit access to encrypted data, and finally, there is no solution to key management. Key management is the storage of keys (passwords) in the relational database.
The third solution is to purchase software that installs itself on your relational database and encrypts schemas, tables and columns. The software solution is slower but less expensive usually. The software will probably employ the use of views to decrypt data and the use of triggers to encrypt data. The software solution must also incorporate a well thought out key-management strategy, as does the database solution.
Key Management
Key management is critical to database encryption. How do you store the passwords (keys) to the encrypted data? If you store this information in the database, then someone can steal the encryption keys using FTP and you have a useless solution. Therefore, unless you are using hardware encryption that stores the algorithms in hardware, you need to consider key management.
The most obvious solution that is used to store keys in a database is to encrypt these keys also using internal software algorithms. In this case, a hacker could steal all your data files and not be able to retrieve the password keys that would allow them to decrypt valuable content.
Another solution to key management is to store the keys on a removable device, such as on a floppy disk. Whenever a user needs to view your database data, the user must first load the keys from the floppy into your encryption key-management system. This solution is obviously not practical for online content, but for credit card and customer information, using a removable device is the safest solution.
The drawback to encrypting your content is twofold. In the first place, database encryption is not simply clicking a button. Encryption needs to be planned in the same way we plan database design. A strategy of what data to encrypt and which users have access to encrypted data will require time to create and implement.
The second drawback to encryption is, of course, performance. Whenever a valid user retrieves your content, the data has to be decrypted into its original form. In the case of even a small graphical file, this will take CPU time to decrypt each block of data.
Most encryption algorithms, such as AES (Advanced Encryption Standard), will encrypt data in 16-, 24- and 32-byte increments. So for example, when a valid user requests a sexually explicit graphical image in your database, this image has to be decrypted recursively and quickly during the SQL query that delivers the data to your customer.
More CPU Usage
Encryption will thus increase CPU usage on your database or web server. Yet the good news is that most servers are bound by disk usage, therefore if you have fast multiprocessing machines, the increased overhead for encryption can be minimal.
The biggest slowdown to encryption is poor planning. If you encrypt your primary keys or other fields that users search on, then a user query will become very slow unless you have planned to encrypt the search string before it is evaluated by the SQL statement. Remember, the indexes of your database also will be encrypted since they mirror your data.
One solution around the problems of index encryption is the use of bitmap/hash indexes. With bitmap indexes, the data is decrypted, meaning it is not as safe, yet the data is stored in bitmap format by your database vendor, which is very difficult to translate back into worthwhile data.
Regardless of which encryption solution you chose, you eventually will choose one. In the future, people will laugh at the early business of the 21st century and how valuable data and content was left sitting open on hard drives that were accessible to the Internet.
James Edwards is an independent Oracle software developer. He can be reached at relationalwizard@aol.com.