Late last year, I had a discussion with my editor about the 2004 editorial schedule of MC Mag Online, during which I learned that the June focus was Web Serving. "Wow," I thought, "that should be an easy topic to write about." But then I contemplated the assignment further and came to the realization that an article about open-source Web Serving hails directly from the Department of Redundancy Department. So much of the Web is running on open-source software that the topic can be either exciting (because of its possibilities) or dull and pedestrian (because of its success).
This month, we will review a few of the projects that underpin the 'net services that we all take for granted. You may know about all of these projects, or some may be new to you. In either case, the scope of open-source software on the Internet is most impressive.
The OS
The Internet was built with UNIX and, to this day, UNIX and the UNIX-like open-source OSes such as Linux and the various BSDs (OpenBSD, FreeBSD, NetBSD, et al.) dominate the infrastructure and servers of the 'net. In fact, the Internet's fifty longest uptimes are all logged by the BSD variants, as reported by Netcraft. I expect that as Linux replaces UNIX on many servers, it will join its BSD cousins on the list.
Although the iSeries makes a good Web server right out of the box, I believe that the real strength of the platform is in its ability to host Linux instances. Assuming that you have sufficient hardware resources on your iSeries, you can host up to 31 separate virtual Linux servers. This is a real convenience for those system administrators who like to segregate their applications on different servers. (These folks are typically Windows admins who are accustomed to dedicating one machine per service.) Once you learn how to create one logical partition, subsequent configurations are fairly straightforward to create and clone. Thus, you can add servers as quickly as you need them.
Which brings up another benefit of Linux on logical partitions: server consolidation. One good, adequately sized iSeries can replace many smaller Intel boxes (assuming that you are running Linux or UNIX on the servers that you are replacing). You gain floor space, power savings, the renowned reliability of the iSeries platform, and improved manageability.
The Web Server
According to the latest Netcraft report, there is one Web server with the lion's share of the market: Apache. With over 67% market share, this open-source project is ubiquitous, appearing on 47 of the "top 50" sites in the list I mentioned earlier. Apache can run on any of the BSDs, as well as Linux and the various UNIX flavors, Windows, and OS/400. The Apache project provides the base upon which IBM has built its OS/400 Web server. The 2.x version of Apache has been redesigned to make the base Web server easier to extend. This positions Apache to be the server of choice for today's technologies as well as whatever may come down the pike. Since it is open source, the source code is available for all to see and use, which guarantees its existence for years to come.
Don't Give Me Any Static!
We all know that sites based entirely on static HTML pages are rapidly becoming a relic of a bygone era. (Can an era be less than 15 years?) Today, most sites use some kind of dynamic content generation. Early solutions used the Web server's Common Gateway Interface (CGI) to call external programs, returning their output to the client's browser. These programs were typically written in C, particularly on UNIX servers. On OS/400, they were usually written in RPG and sometimes COBOL. On Linux, the more modern implementations of CGI programs are now commonly written in Perl or Python, as well as C. The problem with CGI programs is that they typically become a performance bottleneck. They simply don't scale well.
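To make the CGI model concrete, here is a minimal sketch of a CGI program in Python. It assumes the Web server follows the usual CGI convention of passing the request's query string in the QUERY_STRING environment variable and relaying whatever the program writes to standard output. The function name build_response and the "name" parameter are hypothetical, chosen only for illustration.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(query_string: str) -> str:
    """Build a complete CGI response (header, blank line, HTML body)."""
    params = parse_qs(query_string)
    # "name" is a made-up parameter; fall back to "world" when it is absent.
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it in the page.
    body = f"<html><body><h1>Hello, {html.escape(name)}!</h1></body></html>"
    # A CGI program emits its headers, then a blank line, then the content.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The Web server starts this program once per request and sends
    # its standard output back to the client's browser.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

Because the server launches a fresh copy of the program for every request, the startup cost is paid each time, which is where the scaling problem comes from.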
The modern solution takes advantage of Apache's extensibility by embedding an interpreter within the Web server itself. Common languages for this purpose include Perl (via mod_perl) and PHP (via mod_php). The advantage to an embedded interpreter is that the overhead involved with a CGI program invocation is eliminated, thereby increasing scalability. The formerly static HTML pages can now be embellished with embedded PHP or Perl code, allowing for database access and computation directly from within the Web server.
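The invocation overhead can be illustrated with a rough sketch. This is not how mod_perl or mod_php work internally; it is only an analogy in Python that compares starting a new interpreter process per request (CGI style) with calling a function inside an already running process (embedded style). The function names and the tiny page fragment are hypothetical.

```python
import subprocess
import sys
import time


def handler() -> str:
    """Stand-in for page logic running inside a long-lived server process."""
    return "<p>Hello</p>"


def via_new_process() -> str:
    """CGI-style: launch a fresh interpreter to produce the same output."""
    code = "print('<p>Hello</p>', end='')"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def time_calls(func, n: int = 3) -> float:
    """Return the elapsed seconds for n consecutive calls to func."""
    start = time.perf_counter()
    for _ in range(n):
        func()
    return time.perf_counter() - start
```

Comparing time_calls(handler) with time_calls(via_new_process) on your own machine should show the process-per-request approach taking noticeably longer, although the exact figures depend on the hardware and operating system.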
Although a major improvement over CGI, embedded scripting languages also can become performance bottlenecks. Worse, embedding code within Web pages tends to lead to spaghetti coding and a major maintenance nightmare. An excellent solution to this problem is the one that IBM is pushing: Java.
Whereas the embedded interpreter solutions tend to put all of the burden on the Web server, server-side Java implementations provide separate processes to execute the Java code. The Web server takes care of the static HTML and communications with the client while passing the code execution to the servlet container, returning any response back to the client. While similar to CGI, server-side Java eliminates the overhead of program invocation that CGI suffers. The result is a solution that scales very well.
There are commercially available products to provide server-side Java. Probably the most familiar example is IBM's all-encompassing WebSphere product. For those with simpler requirements or budgets, the open-source community offers Tomcat. And for full J2EE applications, there is always JBoss. Many of the Internet sites using Java employ one of these products.
But That's Not Open!
"Ah, but Java isn't open source," I hear you say. At the moment, that is correct. Java is under the control of Sun Microsystems. But that may change in the future. Sun is being pressured to open up Java to the community. However, the JBoss and Tomcat implementations are open source. The source code is available for these products, so if the Tomcat project ends or JBoss goes belly up, the code will undoubtedly be used to spawn new projects.
Freedom or Vendor Lock-In?
I have mentioned only briefly the most prominent examples of open-source software in this article. Literally thousands of articles have been written about the open-source projects supporting the Web. I have written a number of them myself in the "pages" of MC Mag Online. I simply wanted to put these together on one page to get you thinking.
As you design your software, consider these questions: Do you want your software to be deployable on many different platforms or just one? Do you want your software to play well with others, or do you want to limit your options? There are basically two factions left on the Internet: those that respect Internet standards (promoting interoperability) and those that try to redefine the standards for their own personal gain (promoting vendor lock-in). Part of your design process must include a decision about which camp you'll join. Making the wrong choice can be a real pain in the ASP, which will certainly reduce your .NET gain.
Barry L. Kline is a consultant and has been developing software on various DEC and IBM midrange platforms for over 21 years. Barry discovered Linux back in the days when it was necessary to download diskette images and source code from the Internet. Since then, he has installed Linux on hundreds of machines, where it functions as servers and workstations in iSeries and Windows networks. He also co-authored the book Understanding Linux Web Hosting with Don Denoncourt. Barry can be reached at
MC Press Online