Scaling Applications to Meet Increased Demand - 1
For some organizations, the worst thing that can happen is unbridled success. As companies move their operations to the Internet, they perform a balancing act: acquiring and keeping users while supplying those users with a worthwhile service. Whether you run a web server, e-mail server, or database server, failing to plan for unexpected exponential growth can be a critical error for your business. Too often, applications and systems are not designed to handle this growth, and when they falter, users may find your service less valuable and seek solutions elsewhere.
Increased Demand and Server Resources
What happens to a server when it becomes overloaded? If a server was not sized properly and lacks enough CPU power or memory, it becomes sluggish and unresponsive. Once this begins to happen, responsiveness gets progressively worse until the system becomes unusable. You can avoid this problem in the first place by right-sizing your server hardware and knowing how your applications function. Purchasing the proper hardware to handle your client requests is extremely important: your servers need enough CPU and I/O capacity to absorb the growth they might encounter. It is just as important to understand how your application works and how it will consume these server resources.
Applications are designed using one of three methodologies: single-threaded non-forking, single-threaded forking, and multi-threaded. There is an enormous difference between these three types of applications and how they consume resources. Choosing an application that uses the correct design methodology can mean the difference between an overloaded server and an idle server.
Single-threaded non-forking applications loop inside themselves, dealing with requests as they come in. They never spawn children or threads to handle external requests. The major drawback to this type of application is that it will never consume more than one CPU. If you buy an eight-processor system, the single-threaded non-forking application will never use more than one of those CPUs. This makes it extremely difficult to scale with increased demand, as adding extra processors will not make the system respond more quickly. The benefit of this type of application is that it is much easier to program than a multi-threaded application.
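To make the contrast concrete, here is a minimal Python sketch that serves the same batch of simulated requests first with a single request loop and then with a pool of threads, the multi-threaded design covered in this section. The handle function, the 0.05 second delay, and the worker count are assumptions made for illustration, not part of any real server. Note also that in CPython, threads overlap waiting time rather than CPU-bound work because of the global interpreter lock, so the sketch simulates I/O-bound requests with a sleep.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request):
    """Simulate one I/O-bound request (hypothetical workload)."""
    time.sleep(0.05)  # stands in for waiting on disk or network
    return (request, threading.current_thread().name)


def serve_single_threaded(requests):
    # One loop, one request at a time: each request waits for the previous one.
    return [handle(r) for r in requests]


def serve_multi_threaded(requests, workers=4):
    # A pool of threads handles several requests concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, requests))


def timed(fn, *args):
    # Return the function's result together with the elapsed wall-clock time.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    batch = list(range(8))
    _, loop_time = timed(serve_single_threaded, batch)
    _, pool_time = timed(serve_multi_threaded, batch)
    print(f"single loop: {loop_time:.2f}s, thread pool: {pool_time:.2f}s")
```

Run as a script, the thread pool version should typically finish the batch in less time than the single loop, because several requests are waiting at once instead of one after another.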
Single-threaded forking applications do not use threads to handle external requests; instead, they fork a copy of themselves to respond to each request. This is a better approach than a non-forking application because the forked children can run on multiple processors. The drawback is that forking a process is generally an expensive operation in terms of system resources.
Multi-threaded applications execute tasks in parallel on separate threads within a single process, instead of waiting for one request to finish before starting the next. The application can use as many threads as it needs to handle concurrent external requests, and those threads can run on different CPUs. There is also no expensive startup cost of forking a copy of the application. The drawback is that multi-threaded applications are more difficult to develop.
So which type of application is best for you, and how can you tell what kind of application you are running? Generally, a multi-threaded application is the best choice, as it is usually the most efficient programming model and makes effective use of your hardware resources. Avoid single-threaded non-forking applications if possible. You can use the ps command to tell what kind of application you are running.
ps -ef | grep process_name
Here process_name stands for the name of your application. If only one process is listed (not counting the grep command itself), you are running either a multi-threaded application or a single-threaded non-forking application. If multiple processes are listed, you are probably using a single-threaded forking application.
ps -eLf | grep process_name
Threads do not appear in the default process list; the "-L" option prints one line per thread (again substituting your application's name for process_name). If more than one line is listed for a single process when using "-L", you may be using a multi-threaded application. If you still see only one instance of your application, you are probably using a single-threaded non-forking application.
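The same rule of thumb can be sketched as a small Python script. This is only an illustration of the heuristic described above, under stated assumptions: it assumes a ps that supports the nlwp (thread count) output column, as the Linux procps version does, and the helper names ps_snapshot, count_processes_and_threads, and classify are hypothetical. It matches on the exact command name, so it does not pick up a grep process by mistake.

```python
import subprocess


def ps_snapshot():
    """Return one 'pid nlwp comm' line per process.

    Assumption: this ps supports the nlwp column (true for Linux procps).
    """
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "nlwp=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def count_processes_and_threads(ps_text, name):
    """Count processes named `name` and the largest thread count among them."""
    processes = 0
    max_threads = 0
    for line in ps_text.splitlines():
        fields = line.split(None, 2)  # pid, nlwp, command name
        if len(fields) != 3 or fields[2].strip() != name:
            continue
        processes += 1
        max_threads = max(max_threads, int(fields[1]))
    return processes, max_threads


def classify(processes, max_threads):
    """Apply the article's heuristic; hybrid designs are not distinguished."""
    if processes == 0:
        return "not running"
    if processes > 1:
        return "probably single-threaded forking"
    if max_threads > 1:
        return "possibly multi-threaded"
    return "probably single-threaded non-forking"


if __name__ == "__main__":
    import sys
    target = sys.argv[1]  # command name to look for
    print(classify(*count_processes_and_threads(ps_snapshot(), target)))
```

Keep in mind that on Linux the command name column is truncated to 15 characters, so a long application name must be given in its truncated form.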
Horizontal and Vertical Scaling
Scaling is defined as the ability of an application to properly utilize extra hardware resources, on either one server or multiple servers. Multi-threaded applications scale very well on a single server: as you add more resources, they will use them without modification. However, what happens when your application is consuming all of the resources on your single server? Then it is time to scale your application horizontally, vertically, or both.
The concept of scaling your application horizontally and vertically is very powerful, yet easy to understand. Horizontal scaling allows you to run the same application serving the same data on multiple servers. Users can connect to these servers in a number of different ways, including round-robin DNS or load balancing software. If one server becomes overloaded, you add another one with the identical application and data to take some of the requests. Done properly, you can add or remove servers without end users noticing. Horizontal scalability also has another benefit: server availability. Using multiple redundant servers means that you should never have a total system outage. If one server goes down, remove it from your pool and allow the others to continue. Because the data is redundant, requests can still be served, leading to a more reliable service.
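The round-robin idea can be sketched in a few lines of Python. This is a simplified illustration, not how any particular DNS server or load balancer is implemented; the RoundRobinPool class and the server names web1 through web4 are hypothetical.

```python
class RoundRobinPool:
    """Hand out identical servers in rotation (illustrative sketch only)."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0  # index of the server that gets the next request

    def add(self, server):
        # Scale out: a new server simply joins the rotation.
        self._servers.append(server)

    def remove(self, server):
        # Take a failed or retired server out without disturbing the order.
        index = self._servers.index(server)
        del self._servers[index]
        if index < self._next:
            self._next -= 1
        self._next = self._next % len(self._servers) if self._servers else 0

    def pick(self):
        if not self._servers:
            raise RuntimeError("no servers available")
        server = self._servers[self._next]
        self._next = (self._next + 1) % len(self._servers)
        return server


if __name__ == "__main__":
    pool = RoundRobinPool(["web1", "web2", "web3"])
    print([pool.pick() for _ in range(4)])
    pool.remove("web2")   # web2 goes down; the others keep serving
    pool.add("web4")      # extra capacity joins the rotation
    print([pool.pick() for _ in range(4)])
```

Because every server holds the same application and data, the caller does not care which one pick returns, which is what makes adding and removing servers invisible to end users.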
Vertical scaling is a little more difficult to understand. Applications can scale vertically if they can be broken down into different parts, each of which serves specific requests. Web servers are an excellent example of a vertically scalable application. The basic web server model has one server handling all client requests, but you can install two servers, one to serve HTML requests and the other to serve image files. Each piece of a web site that you are trying to serve, from HTML to images to CGI scripts, can be run on independent servers in a vertical configuration. E-mail servers are also a good example of servers that are easy to scale vertically. Instead of having one mail server for incoming mail, outgoing mail, and POP, you can have three, each one serving one specific piece of the overall application.
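As a rough sketch of the web server example, the following Python function decides which dedicated server should answer a request based on what is being asked for. The host names, the /cgi-bin/ prefix convention, and the list of image extensions are assumptions chosen for illustration.

```python
import posixpath

# Hypothetical hosts, one per piece of the web site.
BACKENDS = {
    "html": "html.example.com",
    "images": "images.example.com",
    "cgi": "cgi.example.com",
}

# Assumed set of extensions that count as image requests.
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def route(path):
    """Return the host that should serve the given URL path."""
    if path.startswith("/cgi-bin/"):
        return BACKENDS["cgi"]
    extension = posixpath.splitext(path)[1].lower()
    if extension in IMAGE_EXTENSIONS:
        return BACKENDS["images"]
    return BACKENDS["html"]  # everything else goes to the HTML server


if __name__ == "__main__":
    for path in ("/index.html", "/pics/logo.PNG", "/cgi-bin/form.cgi"):
        print(path, "->", route(path))
```

In practice this split is often done with separate host names in the page links or with rules in a front-end proxy; the function above only shows the decision being made.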
Once you understand these concepts you can see the ultimate benefit, which comes from combining horizontal and vertical scaling. If you can break an application down into pieces and serve multiple copies of those pieces, then you have achieved redundancy along with the ability to scale each piece as demand requires. As requests increase, simply add a server in the specific area that needs it. Even if demand grows exponentially, you can add servers wherever they are needed to accommodate the growth.
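Combining the two approaches might look like the following Python sketch, in which each piece of the application has its own pool of identical servers and requests rotate within that pool. The TieredCluster class, the tier names, and the host names are hypothetical.

```python
from collections import defaultdict


class TieredCluster:
    """One rotating pool of identical servers per application piece."""

    def __init__(self):
        self._pools = defaultdict(list)    # tier name -> list of hosts
        self._counters = defaultdict(int)  # tier name -> requests handed out

    def add_server(self, tier, host):
        # Growth in one area only requires a new server in that tier.
        self._pools[tier].append(host)

    def pick(self, tier):
        pool = self._pools.get(tier)
        if not pool:
            raise LookupError(f"no servers in tier {tier!r}")
        host = pool[self._counters[tier] % len(pool)]
        self._counters[tier] += 1
        return host


if __name__ == "__main__":
    cluster = TieredCluster()
    cluster.add_server("html", "html1")
    cluster.add_server("images", "img1")
    cluster.add_server("images", "img2")  # image traffic grew, so add here
    print([cluster.pick("images") for _ in range(3)])
    print([cluster.pick("html") for _ in range(2)])
```

The point of the design is that capacity decisions become local: a surge in image requests is answered by adding image servers, without touching the HTML tier.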
Next, we'll look at some specific examples of horizontal and vertical scaling...
Jamie Wilson has worked in the online adult industry for well over a year. He specializes in Solaris, Unix, and Web consulting, as well as providing content to adult webmasters. He can be reached for follow-up inquiries at firstname.lastname@example.org, or for adult content please visit www.jtwis.com/content