What we've done...

John Johnson jj at scn.org
Thu Mar 19 23:21:25 PST 1998


	== What we've done so far regarding the dialup problems. ==

There are various problems affecting the dialup lines, not necessarily
'modem' problems.  These problems became significantly more noticeable
about a month ago because many of them are associated with the 33.6 kbps
USR Sportster modems, and we moved those modems to the head of the hunt
group at that time. This increased the incidence of some of the problems,
but was necessary in order to diagnose the problems. 
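
To illustrate why lines at the head of a hunt group see the most traffic,
here is a minimal sketch in Python.  It assumes a simple sequential hunt
(each incoming call goes to the first free line in a fixed order), which
may not match the actual phone company setup.  The line names and call
times are hypothetical, made up for the example.

```python
def pick_line(hunt_order, busy):
    """Return the first line in hunt order that is not busy, or None."""
    for line in hunt_order:
        if line not in busy:
            return line
    return None


def count_answers(hunt_order, calls):
    """Count how many calls each line answers.

    calls is a list of (start, end) times, sorted by start time.
    A caller who finds every line busy is simply dropped.
    """
    busy = {}  # line -> time at which its current call ends
    answered = {line: 0 for line in hunt_order}
    for start, end in calls:
        # Free any line whose call has ended by the time this call arrives.
        busy = {line: t for line, t in busy.items() if t > start}
        line = pick_line(hunt_order, busy)
        if line is not None:
            busy[line] = end
            answered[line] += 1
    return answered


# Hypothetical hunt group with the 33.6 modems placed first.
hunt_order = ["usr336-a", "usr336-b", "old144-a", "old144-b"]
calls = [(0, 10), (1, 3), (2, 4), (5, 6), (11, 12)]
print(count_answers(hunt_order, calls))
```

In this toy run the two lines at the head answer four of the five calls,
so any fault tied to those modems shows up far more often.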

These problems have been difficult to diagnose because:  1) They are not
cases of individual modems going 'bad', but transient problems.  2) There
are a great many subtle variables, not all of which are known, and it is
necessary to look at combinations of variables.  3) Some of the problems
have a time- or usage-dependent factor, so several days are necessary to
test a change. 4) The modems do not have any kind of debugging or monitor
mode. 

Nonetheless, progress is being made.  To date we have:
   -- Replaced a flaky terminal server (dialup hub).
   -- Upgraded the firmware on all of the terminal servers.
   -- Arranged the modems to provide better monitoring.
   -- Modified the modems to provide better cooling.
   -- Determined that the 33.6 modems behave differently from the 14.4s.
   -- Modified settings on the 33.6 modems to avoid certain problems.
   -- Developed a software tool to do reliable, unattended modem configuration
      changes and monitoring.
   -- Developed other tools to monitor the state of the dialup connections.
   -- Eliminated modem configuration inconsistencies.
   -- Isolated and fixed the 'scrolling login' problem.
   -- Added protection against problems caused by incorrectly configured
      user modems.
   -- Identified (and had fixed) a problem caused by US West.

These are the results of several hundred hours of research, experimenting,
monitoring, and otherwise trying to pin down the problems, and a good many
other hours trying to devise scenarios of what is happening.  The results
of these efforts are not always apparent because some of the problems are
not clearly distinguished.  E.g., 'ring/no-answer' can be caused in
several ways, and fixing one cause would reduce the problem, but not
eliminate it.  And the perception of various problems is highly variable,
being dependent on specific circumstances.  (Amazingly, a few users don't
know what we are talking about because they have never noticed these
problems!  We should all be so lucky.) 

The most serious dial-up problem currently is the "ring/no-answer" 
problem, which we continue to work on.  This involves modems going
non-responsive, and generally requires a trip on-site to power cycle the
recalcitrant device.  This may be a defect in the 33.6 kbps
modems, and we are starting to consider replacement options.  While a
large ISP like AOL might 'simply' replace suspect equipment at the first
sign of trouble, we are constrained by the cost (estimated to be $2,000 to
$6,000+). 

Please note that the problems in February affecting telnet connections
(and other TCP/IP services), and thus dialup connections via SPL and KCLS,
were due to an error at the library, and were entirely unconnected with
our dialup problems. 

=== JJ =================================================================
