Eternal vigilance is the price of success, especially at retailers’ web sites
By Mary Wagner
Before Uzi Nitsan founded web performance monitoring software company Vertain
Software two and a half years ago, he ran an online hotel reservations start-up.
One day, after booking a new customer, Nitsan departed for a trade show in London.
He returned to discover that his site—and his business—had been down all weekend.
“We knew we needed monitoring—not only to see that the site was up but to
see that a person could actually go on and make a hotel reservation,” he says.
“We looked around and found there was no tool that could do that. That’s the
reason I started this business.”
In the heyday of Internet start-ups, Nitsan’s experience was not unique, and
it didn’t take retail sites long to realize that keeping a constant eye on site
availability and page download speed was critical to survival. Retailers soon
learned that when customers can’t access the site, or become impatient when
page downloads take too long, they easily go elsewhere, taking their wallets
with them.
Going deeper
And today, the savviest among them also understand that home page availability
and speed are only part of the story. Plenty of other performance issues that
can occur throughout the site can also lead to trouble. Stories of applications
gone haywire have approached the status of legend: the luxury hotel operator
flooded with reservations after a web page error listed an $89 price for accommodations
normally listed at $890; the faulty GIF already hanging up traffic at an e-retailer
launching a TV ad blitz that could drive hundreds of thousands more visitors
to its site.
E-retailers are beginning to realize they must carry monitoring activities deeper into shoppers’
transactions to get an accurate read on site performance and its impact on customers,
experts say. In response, web performance monitoring service providers are moving
beyond simply tracking site availability and download speed. They are developing
broader approaches to address issues arising from the increasing complexity
of sites. Monitoring, diagnosis and repair services are available to support
the technology at every point in the customer’s web experience, from front-end
applications all the way into the infrastructure of local networks that represent
a web transmission’s “last mile.”
Page loading speed was one of the first performance metrics to be monitored,
but tracking used to stop at the home page. Today, other key pages in the transaction
are part of the equation. “Depending on what the customer is trying to do on
the site, the home page is not the critical thing, but the average page is,”
Nitsan points out. “If it takes five to six pages to complete a task, for example,
the bulk of waiting time for the consumer is not on the home page, but the rest
of the pages.”
Vertain’s weekly monitoring of transaction processes beyond the home page
at various retail sites shows that some have error rates—typically timeouts—of
10% or more. “That means one out of every 10 customers who are trying
to do some transaction on the site is not successful,” Nitsan says.
A study of the performance of eight popular Valentine’s Day web destinations
over the eight days leading up to the holiday showed more trouble: transaction
failures before the holiday were as high as one in 20.
The study, conducted by web performance monitoring services provider Empirix
Inc., tracked the success rate for a three-step transaction: navigating to the
site’s home page, initiating a search, and confirming that correct results came
back from the search. The transactions were repeated at one-hour intervals,
24 hours a day, at each tested site over the eight days.
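A three-step check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Empirix's tooling: the function name, the `fetch` callable and the URLs passed to it are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TransactionResult:
    ok: bool
    failed_step: Optional[str] = None  # "home", "search" or "verify" on failure


def check_transaction(fetch: Callable[[str], str],
                      home_url: str,
                      search_url: str,
                      expected_text: str) -> TransactionResult:
    """Run the three steps once and report the first step that fails.

    `fetch` is a hypothetical callable that returns a page body as text
    and raises an exception on a timeout or HTTP error.
    """
    try:
        fetch(home_url)                 # step 1: navigate to the home page
    except Exception:
        return TransactionResult(False, "home")
    try:
        results = fetch(search_url)     # step 2: initiate a search
    except Exception:
        return TransactionResult(False, "search")
    if expected_text not in results:    # step 3: confirm the results are correct
        return TransactionResult(False, "verify")
    return TransactionResult(True)
```

In practice `fetch` might wrap an HTTP client call with a timeout, and a scheduler would invoke `check_transaction` once an hour for each monitored site, logging every result so that daily success rates can be computed.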
Down, down, down
The testing showed transaction success rates and response times worsening
and applications errors increasing as the holiday approached and traffic
grew. Performance was at its lowest February 10 and 11, as last-minute shoppers
flocked to the sites in order to avoid overnight shipping charges.
Failure rates were highest at the web sites of Ghirardelli Chocolates and
florist KaBloom. Over the eight days, the transaction success rate at Ghirardelli.com
had a low of 96.2% while KaBloom.com had a low of 95.4%, meaning that approximately
one in 20 transactions failed. Meanwhile, chocolatier Godiva.com turned in the
best performance with a transaction success rate that averaged 99.9% over the
eight days and was 100% on six days. The average transaction success rate across
all eight sites was 99%.
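The arithmetic behind the "one in 20" figure is easy to reproduce. The short Python sketch below, whose function name is purely illustrative, converts a success rate into the number of transactions per failure.

```python
def transactions_per_failure(success_rate_pct: float) -> float:
    """Return N such that roughly one transaction in N fails."""
    failure_pct = 100.0 - success_rate_pct
    if failure_pct <= 0:
        raise ValueError("a 100% success rate has no failures to count")
    return 100.0 / failure_pct


# Daily lows and the eight-site average reported in the study
for site, rate in [("KaBloom.com", 95.4), ("Ghirardelli.com", 96.2), ("average", 99.0)]:
    print(f"{site}: about one failure in {round(transactions_per_failure(rate))}")
```

By this measure KaBloom's low works out to roughly one failure in 22 transactions and Ghirardelli's to about one in 26, while the 99% average across all eight sites still means one failed transaction in every 100.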
Performance success rates that rank only a percentage point off 100% might
not initially seem poor. But Empirix vice president of product management and
marketing Walter Vahey points out that the failure rate of telephone systems,
a popular alternative ordering mechanism, never exceeds thousandths of a
percentage point.
“Retail sites depend on consumer loyalty, because it’s so easy for visitors
to just click to a competitor’s site,” Vahey says. “That’s why delivering a
consistently good experience through the entire transaction is so important.
Now that using the web is more mainstream, people are expecting the same level
of reliability in Internet applications that they get through the telephone.”
Web applications failures have become a problem for e-retailers in part simply
because there are now more applications per page. Web pages today tend to be
more dynamic, and they integrate a greater number of elements: data, graphics
and tools fed in from several different sources. That results in pages that
may be slower to load at the user end, and it can increase the chance of error
at any point in a transaction that uses multiple applications.
“Sites change so often. They are a moving target,” says Sean Kline, director
of applications performance management services at Empirix. “They have lots
of different moving parts in terms of physical hardware and software infrastructure.
Then there are frequent content changes and third party dependencies, as when
outside providers serve up images, for example.”
Though the increasingly complex nature of web sites is becoming a handful
for e-retailer IT staffs, some vendors have found expanded monitoring services
and software a tough sell, given the current retail climate and a corresponding
tightening up on new technology spending among many retailers. Nevertheless,
many say they’ve demonstrated measurable gains for retailers.
TeaLeaf Technology’s hosted IntegriTea monitoring service, for
example, can pinpoint where in an individual user session a glitch occurs. Multi-channel
retailer Levenger says it’s cut 90% off the time it used to take to identify,
locate and fix web site errors since implementing IntegriTea. Another TeaLeaf
client, Hillerich & Bradsby Co., makers of the Louisville Slugger baseball
bat, saw results in time for spring training and a seasonal spike in activity
on the company’s retail web site, Sluggergifts.com.
“We’re able to recreate exact customer web sessions, reducing the time it
takes to identify and fix problems from days to minutes,” says Christopher Caudill,
web developer at Hillerich & Bradsby. Another benefit has been in the form
of data that inform site improvements for better customer service. “Now, when
a customer tells us he’s had a problem, we can go back and review the data IntegriTea
gathers,” says Caudill. “We could see if customers were having an issue with
logging in, for example, and then tailor the log-in process to make it easier
by adding copy to make it more self-explanatory.”
Multiple goof opportunities
For a high-volume e-retailer, such site functionality typically requires multiple
servers—introducing multiple opportunities for error, as Mike Baglietto, senior
product manager at Keynote Systems, has seen often. “One e-retailer was doing
a promotion with a farm of servers to handle the load. But every time users
got directed to a particular web server, the image that was supposed to be served
up wasn’t there. So they weren’t converting anybody on that particular server,”
he says. “That sort of thing happens frequently.”
Keynote monitoring identified that transactions for the promotion had a failure
rate of 5% to 10%, and that the failure was connected to a particular image,
served by a particular server. With this information, IT staff made the change
to get the image served properly.
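The reason a breakdown by server matters can be shown with a small Python sketch. It is an assumption-laden illustration, not Keynote's method: the server names and sample data are invented, and each sample is simply a (server, succeeded) pair.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def failure_rates_by_server(samples: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Group monitoring samples by server and return each server's failure rate."""
    totals: Counter = Counter()
    failures: Counter = Counter()
    for server, ok in samples:
        totals[server] += 1
        if not ok:
            failures[server] += 1
    return {server: failures[server] / totals[server] for server in totals}


# Hypothetical farm: nine good samples on web1, one failing sample on web2
samples = [("web1", True)] * 9 + [("web2", False)]
rates = failure_rates_by_server(samples)
```

Across the whole farm these samples show a 10% failure rate, which looks like an intermittent problem. Split by server, the same data shows one machine failing every time and the other never failing, which points IT staff straight at the server that needs the fix.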
At Brylane L.P., CTO Alex Betancur uses Keynote’s Measurement product on all
seven of the company’s e-commerce sites to keep tabs on page download speeds
as experienced in different locations throughout the country. Depending on local
carriers involved in transmission, traffic, remoteness from central servers,
and local events that can interfere with network transmission, download speeds
are not necessarily equal.
Keynote reports the load speed of monitored pages from 10 geographic hubs,
breaking down by hub which images, pieces of content or pages are loading slower
or faster. It also identifies which carriers in the network serving the region
are up, down, or experiencing problems.
Betancur’s goal in finding and fixing failures and slow-loading pages is to
keep monitored pages at or better than the average load time in a blinded index
of 40 top e-retailers monitored by Keynote. Load times that are too slow result
in lower sales on the sites, he says. While page content, merchandise selection
and other factors play into conversions, “The quicker the page downloads and
gets to customers, the more apt they are to stay on the site and make the purchase,”
he says.
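Comparing each hub against a benchmark is a simple filter. The Python sketch below assumes load times are available as seconds per hub; the hub names, the timings and the index average are made up for illustration and are not Keynote or Brylane figures.

```python
from typing import Dict, List


def hubs_slower_than_index(load_times_by_hub: Dict[str, float],
                           index_average: float) -> List[str]:
    """Return, in alphabetical order, the hubs whose load time exceeds the index average."""
    return sorted(hub for hub, seconds in load_times_by_hub.items()
                  if seconds > index_average)


# Hypothetical measurements, in seconds
measured = {"Boston": 2.1, "Dallas": 3.4, "Seattle": 2.9}
slow_hubs = hubs_slower_than_index(measured, index_average=2.5)
```

With these invented numbers the check would flag Dallas and Seattle for a closer look at carriers, caching or server location.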
The detailed performance data also are informing tech decisions for the future,
he adds. “If we need to improve load times in a geographic area, that may mean
building another network operating center, using a more effective cache system,
or choosing another carrier. And those decisions on technology correlate to
a faster page and therefore, a happier customer and possibly more items in the
cart,” he says.
Though applications monitoring service offerings are broadening in scope and
number, they don’t offer solutions that retailers couldn’t come up with on their
own. Indeed, e-retailers can and do find, diagnose and fix applications errors
unaided by third party services—but it’s at a considerable cost of time and
labor.
As sites get more complex, finding and fixing problems gets tougher. Indeed,
without web site performance monitoring that captures the right identifying
data, the average time to locate and resolve a site performance issue is 25.8
hours, according to research from Newport Group Inc.
Unreported problems
And the impact of performance issues on site operations actually may be greater
than those findings suggest, as the average resolution time reflects only problems
that have been reported by users. That leaves a pool of unreported problems
that may be even larger.
E-retailers vigilant about monitoring their site’s performance with staff
and resources already in-house may still miss errors if they can’t replicate
the customer view from outside their own firewalls. Cookie problems, security
issues, and other site glitches clearly visible to the customer can remain invisible
to site operators if they monitor only from within their own walls.
“Anyone could solve a web application problem if they throw enough time and
resources at it,” says Tim Smith, director of marketing at TeaLeaf. “But the
industry doesn’t have a lot of either. For that reason, our core objective is
just to help site operators solve performance problems faster.”
While some e-retailers are out ahead of the curve in monitoring site performance
and linking it directly to systems management, others lag. Part of that may
rest in communications gaps that exist between IT and business groups within
some organizations. Indeed, in a recent Newport Group survey of IT professionals,
half said IT applications development staff worked collaboratively only when
a problem occurred, and 8% reported strained relationships and a lack of any
communication. It’s no surprise that efforts to solve or avert web performance
problems are more likely to get the attention of business managers if they can
be translated into the dollar volume of sales lost or gained.
Knowing the impact
“Customers haven’t yet run into these problems and don’t know what the impact
can be,” says Kline of Empirix. But that can change when they do the math, he
adds: If a site has 90% availability, for example, it could have had potentially
10% more customers. If the average sale is $100, depending on how many customers
visit the site, the loss from faulty web performance could add up to millions.
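Kline's math can be written out as a rough Python sketch. The visit count and conversion rate used here are assumptions added for illustration; only the 90% availability and the $100 average sale come from his example.

```python
def estimated_lost_sales(attempted_visits: float,
                         availability: float,
                         conversion_rate: float,
                         average_sale: float) -> float:
    """Estimate revenue lost because some attempted visits found the site unavailable.

    availability and conversion_rate are fractions between 0 and 1.
    conversion_rate (share of visits that end in a purchase) is an assumed input.
    """
    lost_visits = attempted_visits * (1.0 - availability)
    return lost_visits * conversion_rate * average_sale


# Hypothetical site: 10 million attempted visits a year, 2% of visits convert
loss = estimated_lost_sales(10_000_000, availability=0.90,
                            conversion_rate=0.02, average_sale=100.0)
```

Under those assumed inputs the estimate comes to about $2 million a year, the kind of figure that tends to get a business manager's attention.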
While some business managers may tune out at the mention of site performance
monitoring, figuring it’s strictly the province of IT, “It’s really just blocking
and tackling at the IT level to support business requirements,” says Vahey of
Empirix. “Business requirements and IT functions have to be in sync. There’s
no question that as these types of self-service applications become more mainstream,
the responsibility for them will get more focus. It’s early, and people are
still learning.”
mary@verticalwebmedia.com