Well, as you might expect, for all its dynamism, CGI was not a holy grail.
In fact, there are a lot of sysadmins out there who would be ecstatic if CGI
were outlawed. CGI simply causes too many problems.
- CGI introduces security holes. Lincoln Stein writes the following
eloquent warning on the problem:
Unfortunately, there's a lot to worry about [when running a web server with CGI]. The moment you install a Web server at your site, you've opened a window into your local network that the entire Internet can peer through. Most visitors are content to window shop, but a few will try to peek at things you don't intend for public consumption. Others, not content with looking without touching, will attempt to force the window open and crawl in.
It is one thing to allow any freako on the Internet access to your web server when the communication is controlled through the boundaries defined by HTTP and implemented by web browsers. It is another thing to allow a stranger access to an unlimited number of applications housed on the same server through a renegade CGI script.
It's a maxim in system security circles that buggy software opens up security holes. It's a maxim in software development circles that large, complex programs contain bugs. Unfortunately, Web servers are large, complex programs that can (and in some cases have been proven to) contain security holes.
Furthermore, the open architecture of Web servers allows arbitrary CGI scripts to be executed on the server's side of the connection in response to remote requests. Any CGI script installed at your site may contain bugs, and every such bug is a potential security hole.
In the WWW Security FAQ, Stein identifies four overlapping types of risk:
- Private or confidential documents stored in the Web site's document tree may fall into the hands of unauthorized individuals.
- Private or confidential information sent by the remote user to the server (such as credit card information) might be intercepted.
- Information about the Web server's host machine might leak through, giving outsiders access to data that can potentially allow them to break into the host.
- Bugs can allow outsiders to execute commands on the server's host machine, allowing them to modify and/or damage the system. This includes "denial of service" attacks, in which the attackers pummel the machine with so many requests that it is rendered effectively useless.
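To make the last risk concrete, here is a minimal sketch of how a script might guard against command execution bugs. The unsafe pattern appears only in a comment. The username rule, the function names, and the stand-in lookup command are all assumptions made for illustration, not part of any real CGI library.

```python
import re
import subprocess
import sys

# Hypothetical rule for this sketch: 1 to 32 letters, digits, or underscores.
SAFE_USERNAME = re.compile(r"[A-Za-z0-9_]{1,32}")

# Stand-in for a real lookup program: a one-line script that echoes its argument.
LOOKUP_SCRIPT = "import sys; print('looking up', sys.argv[1])"


def is_safe_username(value):
    """Accept only values that match the allowlist pattern in full."""
    return SAFE_USERNAME.fullmatch(value) is not None


def lookup_user(username):
    """Run the lookup command for a remote user's input without using a shell."""
    # UNSAFE pattern, shown for contrast only:
    #     os.system("finger " + username)
    # Input such as "alice; cat /etc/passwd" would run a second command.
    if not is_safe_username(username):
        raise ValueError("rejected suspicious input")
    # The arguments are passed as a list and no shell is involved, so the
    # value is handed over as a single argument, never parsed as shell syntax.
    result = subprocess.run(
        [sys.executable, "-c", LOOKUP_SCRIPT, username],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(lookup_user("alice"))
```

Checking input against an allowlist and avoiding the shell are two separate defenses. The sketch uses both, so a mistake in one does not expose the host by itself.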
I recommend checking out the following CGI Security sites if you are interested in getting more detailed information.
- Writing safe CGI scripts -- an overview (Paul Phillips)
- NCSA's tips for writing secure CGI scripts
- Latro, a tool for identifying insecure Perl CGI installations, by Tom Christiansen
- CGI is at the mercy of HTTP. It is important to note that HTTP only
provides for a one-time, question/answer type of communication. After all,
it was defined primarily for web browsers and web servers to exchange HTML
documents. Thus, by definition, HTTP is not very dynamic.
One-time, question/answer communication works like this: the web browser and the web server are only connected as long as it takes for the web browser to send one document request and the web server to send one requested document. If the browser wants a second document, it must recontact the server and ask again. Each request is new. The server maintains no ongoing connection or record of past exchanges.
While this is very efficient for network traffic (because the bandwidth is only used when information needs to be exchanged), it is a big pain in the butt when it comes to CGI, because CGI is about conversations, not one-time question/answer exchanges.
Imagine that when talking on the phone you had to hang up and redial every time you said something and received an answer. Imagine further that every time you called back you had to go over every previous exchange before you could get to the next piece. That is the way web browsers work with web servers.
This makes communication tough for three reasons.
First, if the client and server are to maintain information over several exchanges, the CGI script must keep a running transcript of the conversation so that every time there is a new exchange, the web server can consult the record of the entire conversation up to that point. This is what CGI aficionados call "maintaining state". The CGI script must be able to keep track of certain information, like a username or the contents of a virtual shopping cart, for every "instance" of a script (6). That is, there must be a way to tie the current HTTP request to related ones that have gone before. Maintaining state is possible with CGI using hidden variables, by encoding state in the URL, or by maintaining a state file on the server; it's just not easy or efficient (7).
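As a rough illustration of the hidden-variable approach, the sketch below carries a shopping cart from one request to the next inside the form itself. The field names (`state`, `item`), the `/cgi-bin/store.cgi` path, and both function names are hypothetical, and JSON is simply one convenient way to pack the state into a single field.

```python
import html
import json
from urllib.parse import parse_qs, urlencode


def render_cart_form(username, cart):
    """Return an HTML form that carries the conversation state in a hidden field."""
    state = json.dumps({"username": username, "cart": cart})
    return (
        '<form method="POST" action="/cgi-bin/store.cgi">\n'
        f'<input type="hidden" name="state" value="{html.escape(state, quote=True)}">\n'
        '<input type="text" name="item">\n'
        '<input type="submit" value="Add to cart">\n'
        "</form>"
    )


def handle_request(form_body):
    """Rebuild the state from the submitted form body and add the new item."""
    fields = parse_qs(form_body)
    default_state = '{"username": "guest", "cart": []}'
    state = json.loads(fields.get("state", [default_state])[0])
    for item in fields.get("item", []):
        state["cart"].append(item)
    return state


# Two exchanges. Each call starts with no memory of the previous one, so
# everything the script knows has to arrive inside the request itself.
first = handle_request(urlencode({"item": "widget"}))
second = handle_request(urlencode({"state": json.dumps(first), "item": "gadget"}))
print(second)
```

Note that a hidden field is under the client's control, so a real script could not trust its contents without checking them. That is part of why this style of state keeping is awkward.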
Second, every set of question/answers causes the web server to execute a unique instance of the CGI script. This is pretty expensive, especially on a high-volume web site that may have 100 instances of a CGI script executing at any given moment, each, perhaps, with its own Perl interpreter. (8) Every one of those CGI scripts takes a little bit of oomph out of the server engine. If we were not limited to the question/answer format, we would not need to execute so many instances.
Consider the following CGI application executing....
Client: Hello?
Server: Welcome, what would you like? (CGI script executed once)
Client: I would like a list of products you are selling.
Server: Here is a list. (another one)
Client: I want to purchase this product.
Server: Okay. (yep)
Client: I'm done, can I check out?
Server: Yes, what is your credit card number? (another script)
Client: Here it is.
Server: Thanks. (another instance of the script that also emails the results to some store admin) (9)
Yuck, this exchange caused 5 instances of the store script to be executed, as well as 5 Perl interpreters if the CGI script was written in Perl.
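The one-process-per-request cost can be sketched in a few lines. This is only a simulation under stated assumptions: a one-line Python program stands in for the store script, and `run_cgi_once` is a hypothetical helper that does what a web server does for each CGI request, namely start a fresh interpreter.

```python
import subprocess
import sys

# Stand-in for a CGI script: a one-line program that reports its own process ID.
SCRIPT = "import os; print(os.getpid())"


def run_cgi_once():
    """Start a brand-new interpreter, as a CGI server would for each request."""
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# One entry per exchange in the store dialog.
exchanges = ["hello", "list products", "purchase", "check out", "card number"]
pids = [run_cgi_once() for _ in exchanges]
print(f"{len(pids)} requests started {len(pids)} interpreter processes: {pids}")
```

Each call pays the full startup cost of an interpreter before the script does any useful work, which is the overhead the dialog illustrates.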
Third, CGI is extremely slow. Every time the client does something, the CGI script must recreate the entire dialog and execute a new request. Add a new item to a virtual shopping cart: new request. Calculate a running total: new request. Submit an order: yet another request. Each request takes time, since the CGI script must be executed again and everyone must wait on a busy Internet.
- CGI is ugly. Finally, CGI scripts produce fairly ugly user interfaces.
Basically, CGI is limited to bland HTML-based forms and whatever bells and
whistles can be provided by surrounding HTML layout. Thus, no CGI application
looks like your swank bootleg copy of Word.
This may not seem like a big issue at first, but when you start competing for web hits with multi-million dollar companies, image is indeed everything. CGI simply cannot compare with web-based applications that are not limited to HTML.
Well, those are some pretty damning flaws. Like I said, many systems administrators would love to see CGI fall off the face of the Earth. Unfortunately for those system administrators, the fact is that CGI has continued to be the workhorse of the web, powering 90% of the dynamic web pages out there.
The fact is that CGI, especially CGI/Perl, is easy to work with, and most non-technically oriented webmasters out there can get their needs filled, and filled right away. However amazingly, brand-fantasmagorically wonderful other technologies sound, they are still vaporware as far as the average web developer is concerned. Either the ISP does not provide those technologies, or the learning and development curve is too steep or expensive. And of course, for the small applications typical of most websites, the big guns of C or C++ are just overkill.
CGI, for all its flaws, works, and works pretty darn well if done carefully. "Intranet" developers with massive budgets can yack all they want to about servlets and SQL gateways and Server Side Includes and customized server applications written in Java, but for most "Internet" developers out there, CGI is the only tool available for solving their problems. And with creativity and care, CGI can also be the right tool.











