Site performance makes or breaks an e-business; Monitoring software helps you stay
By Mark Brohan
E-retailers wave the 24/7 banner as a strategic advantage over brick-and-mortar businesses. But downtime and system failures continue to tear away at that always-open image.
Outages have struck a who’s who of Internet retailers, from Amazon.com to Toys ‘R’ Us. Last June, analysts had a field day ripping into eBay after a 22-hour system failure frustrated customers and drove down its stock price by 26%. Recently eBay renewed its contract with Sun Microsystems to supply servers and data storage units at the center of that outage.
While eBay and other business-to-consumer sites have vowed to avoid crashes by paying stricter attention to backend systems integration and site traffic management, many still haven’t, says Eric Pakman, chief technology officer for Network-shop Inc., an Internet systems integration and consulting firm based in Montreal. “People are putting so much time and energy into building a site that monitoring anything related to systems is an afterthought,” he explains. “But site performance will ultimately make or break their business.”
Slow site performance and major server crashes are already at an all-time high: 50 so far this year. And analysts project that up to 70% of Web sites will suffer some power failure or system slowdown by December.
Amazon.com executives won’t discuss their internal monitoring software, but they acknowledge contracting with Keynote Systems for more than two years to monitor site traffic externally. Even so, an Amazon security executive questions the service’s accuracy. Keynote tracks how sites are responding to traffic from different access points around the globe, then notifies its clients when problems occur. But during the denial of service attacks that struck various e-commerce sites in February, Keynote erroneously reported that Amazon was one of the sites blocked, says Tom Killalea, Amazon’s director of information security. “They said Amazon was blocked when people were actually able to access the site and perform transactions,” he contends. “So there are limitations to Keynote’s ability to accurately reflect what happens on the Web.”
The problem, according to Killalea, is that Keynote does not account for the fact that previous Amazon visitors access the site using cookies, giving them a personalized greeting page rather than the standard Amazon start page.
E-commerce sites are made up of multiple and disparate operating systems, along with networking and database hardware or servers. The applications cover Web server, networking, load balancing and database software. Because sites are connected by many hardware and software configurations, problems can occur when traffic overwhelms a particular subsystem or when servers can’t handle sudden, unexpected volume.
Site developers certainly aren’t ignoring the problem, following some troubling failures. Charles Schwab wound up pumping more than $70 million of emergency money into its Internet infrastructure last year after heavier-than-usual trading crashed the site for four consecutive days. And despite more than doubling its processing horsepower, Toys ‘R’ Us crashed during the holiday shopping season for the second consecutive year. EBay, Schwab and Toys ‘R’ Us are responding to performance problems by building more backup systems and redundant data centers or by investing in clustering and load-balancing technology. The systems can detect a server failure and then route messages or transactions to an unused portion of the network.
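The failover idea described above, detecting a failed server and routing work to spare capacity, can be sketched in a few lines of Python. This is only an illustration, not any vendor's product: the server names, the health flags and the load figures are all hypothetical, and a real load balancer would gather them from live health checks.

```python
from typing import Dict, List, Optional


def pick_server(servers: List[str],
                healthy: Dict[str, bool],
                load: Dict[str, int]) -> Optional[str]:
    """Return the least-loaded healthy server, or None if every server is down."""
    # Skip any server that has failed its (hypothetical) health check.
    candidates = [s for s in servers if healthy.get(s, False)]
    if not candidates:
        return None
    # Route to the server with the most unused capacity.
    return min(candidates, key=lambda s: load.get(s, 0))


# Hypothetical cluster: web-1 has failed, web-3 is the least busy survivor.
servers = ["web-1", "web-2", "web-3"]
healthy = {"web-1": False, "web-2": True, "web-3": True}
load = {"web-1": 0, "web-2": 40, "web-3": 12}

print(pick_server(servers, healthy, load))  # prints "web-3"
```

Choosing the least-loaded survivor is one possible policy; round-robin rotation among healthy servers would be an equally reasonable assumption.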
Raising red flags
But such measures aren’t enough. Newport Group Inc., a Barnstable, Mass.-based information technology consulting company, estimates that almost 52% of site crashes could be prevented if Web companies did a better job of troubleshooting potential problems.
That’s why more CIOs and systems managers are looking at a new option: performance-management software, which sends up a red flag whenever the e-commerce application being monitored fails to meet performance expectations.
Written to work with Oracle 7.2.2 databases or higher and with various server operating systems, performance management software monitors front- and middle-tier and backend databases for message routing or Web traffic problems.
The applications also track the flow of information across a commerce server platform by scanning operating systems, switches, routers, load balancers, database engines, Ethernet networks and other areas for signs of trouble. Using Windows-based monitoring and diagnostic tools linked to the Oracle or main administration database, the software can pinpoint and prepare instant summaries on various trouble spots.
Once the software spots a problem, electrical pulses traveling between the performance-management software and server networks create an instant graphic on the database administrator’s computer console in real time. “The conventional answer to solving traffic trouble has been throwing more hardware at the problem,” says Peter Urban, senior analyst of database technologies for AMR Research Inc., Boston. “What they really should do is build in a performance-management program that spots problems before they hit the pipeline.”
Cheap protection
Performance management software is relatively inexpensive to buy and install. Most packages consist of a recorder for creating system workload scripts, a controller for stress-testing Web applications and a performance reporting tool.
The software sells for less than $50,000 and is available from such developers as BMC Software of Houston; Computer Associates International, Islandia, N.Y.; Keynote Systems, San Mateo, Calif.; and Landmark Systems Corp., Reston, Va., among others.
Austin, Texas-based Hoover’s Corp. is installing Quest’s Instance Monitor 1.0 software to give its 2.5 million online business customers faster and more consistent access to its business information repository on 55,000 U.S. companies and corporations. Unlike several major B2C sites, Hoover’s site has yet to incur an outage. But traffic has more than doubled in the last year to 1.2 million daily page views, and the company’s server network is running at close to capacity.
To prevent potential traffic problems, Hoover’s wants to ensure its central processing unit always has at least 20% unused capacity to absorb a big spike in processing volume and head off a potential system crash.
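A 20% headroom rule of this kind is simple to express in code. The Python sketch below is a hypothetical illustration, not Hoover’s actual tooling: it assumes CPU utilization arrives as whole-number percentages and judges headroom by the busiest sample in the window.

```python
from typing import Sequence

HEADROOM_TARGET = 20  # percent of CPU that should stay unused (from the 20% goal)


def headroom(cpu_percent: Sequence[int]) -> int:
    """Unused CPU capacity, in percent, at the busiest sample."""
    if not cpu_percent:
        raise ValueError("need at least one utilization sample")
    return 100 - max(cpu_percent)


def needs_attention(cpu_percent: Sequence[int],
                    target: int = HEADROOM_TARGET) -> bool:
    """True when peak usage leaves less than the target headroom."""
    return headroom(cpu_percent) < target


# Hypothetical readings: the 85% peak leaves only 15% spare capacity.
samples = [55, 62, 85, 71]
print(needs_attention(samples))  # prints "True"
```

Measuring against the peak rather than the average is a deliberately cautious assumption, since a traffic spike is exactly the event the spare capacity is meant to absorb.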
If the network is reaching overload or database administrators want excess capacity, programmers and network managers now have to create multiple programs and scripts to check messages and transactions flowing across multiple servers and databases. The process can take hours to complete and several more hours to analyze the results.
Hoover’s is installing performance-management software to create instant reports on network uptime and database capacity. Specifically the software can:
— Help resolve CPU or network space usage problems and support capacity planning for databases.
— Work out problems resulting from unavailable applications through SQL tuning applications embedded in the software.
— Tune the Oracle database so messaging and transaction flow rates match the performance characteristics set by the network manager or database administrator.
“Before we put in this software, we had no tidy way to look up and check the problem,” says David Boyd, Hoover’s database administrator. “Now we have much faster access to performance information across the network and in the database.”
Performance-management software can be acquired as a turnkey package or a hosted solution from an application service provider. But with either version, the aim is monitoring the various transactions that travel across the site, including log-ons, information searches, account queries and file downloads.
In effect, says Billie Shea, director of research for the Newport Group, Web application-monitoring solutions collect data on transaction response times and validate the accuracy of application responses to customer transactions or information queries. “The software isolates where problems are occurring in terms of time of day, geographic location and specific transactions,” she says. “It’s a finger on the pulse of the network that allows the database administrator to see the same problem the end user is seeing.”
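To make Shea’s description concrete, the Python sketch below groups measurements by time of day, location and transaction type, and counts the ones that were too slow or returned the wrong answer. Everything in it is assumed for illustration: the `Sample` record, the eight-second limit and the sample data are hypothetical and do not come from any product named in this article.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Sample:
    hour: int          # hour of day, 0-23
    location: str      # where the measurement was taken
    transaction: str   # for example "log-on" or "search"
    seconds: float     # measured response time
    correct: bool      # whether the response content was as expected


def trouble_spots(samples: List[Sample],
                  max_seconds: float) -> Dict[Tuple[int, str, str], int]:
    """Count slow or incorrect responses per (hour, location, transaction)."""
    counts: Dict[Tuple[int, str, str], int] = defaultdict(int)
    for s in samples:
        if s.seconds > max_seconds or not s.correct:
            counts[(s.hour, s.location, s.transaction)] += 1
    return dict(counts)


# Hypothetical measurements from two monitoring locations.
samples = [
    Sample(9, "Boston", "search", 1.2, True),
    Sample(9, "Boston", "search", 9.5, True),
    Sample(9, "Boston", "search", 8.1, True),
    Sample(14, "Montreal", "log-on", 0.8, False),
]

report = trouble_spots(samples, max_seconds=8.0)
for key, count in sorted(report.items()):
    print(key, count)
```

With this data the report shows two slow searches from Boston at 9 a.m. and one fast but incorrect log-on from Montreal at 2 p.m., the kind of breakdown that lets an administrator see where and when users are running into trouble.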
Proactive monitoring
Because performance management software is so new, analysts aren’t forecasting annual market sales or site installations. But many sites, such as Bank of America, U.S. Postal Service, Chubb Corp. and Procter & Gamble Co., are using BMC Software’s ManageIT Suite software. Keynote Systems and Precise Software also have signed up a number of major retailers and search engines, including Amazon.com, Ask Jeeves, AutoTrader.com, eBay, Pets.com and Priceline.com.
The software companies aren’t saying which customers are buying packages to spot systems failures before they happen—or whether sites have been forced by their stockholders or venture-capital backers to make major infrastructure improvements. With the software relatively inexpensive to acquire, install or host, more CIOs and systems managers are buying applications to be proactive in monitoring their sites for signs of trouble.
“If a user is dissatisfied because a Web site is too slow or not performing properly, they will immediately leave that site because the competition is just a mouse click away,” says John McHugh, vice president of marketing, Precise Software. “It is imperative that these businesses implement strategies for maintaining exceptional application performance levels, no matter how many users are on the system.”
Mark Brohan was the founding editor of Internet Retailer.