...  eqn | troff -ms
.EQ
delim $$
.EN
.ND November 19, 1985
.TL
.ps 12
.in 0
Distribution of Mathematical Software Via
.br
Electronic Mail
.AU
.ps 11
.in 0
Jack J. Dongarra\|$size -1 {"" sup \(dg}$\h'.15i'    
.AI
.ps 10
.in 0
Mathematics and Computer Science Division\h'.20i'   
Argonne National Laboratory\h'.20i'   
Argonne, Illinois 60439\h'.20i'
Electronic mail: anl-mcs!dongarra or dongarra@anl-mcs
.AU
.ps 11
.in 0
Eric Grosse\h'.20i'
.AI
.ps 10
.in 0
AT&T Bell Laboratories\h'.20i'   
Murray Hill, New Jersey 07974\h'.20i'
Electronic mail: research!ehg or ehg@btl.csnet
.FS
.ps 9
.vs 11p
This draft was typeset on \*(DY.
Unix is a trademark of AT&T Bell Laboratories.
.br
$size -1 {"" sup \(dg}$\|The work of this author
was supported in part by the
National Science Foundation under Agreement No.
DCR-8419437.
Any opinions, findings, and conclusions or recommendations expressed in this
publication are those of the authors and do not necessarily reflect the
views of the National Science Foundation.
.FE
.QS
.sp 2
.ps 10
.in .25i
.ll -.25i
.I Abstract
\(em
A large collection of public-domain mathematical software
is now available
via electronic mail.
Messages sent to
"netlib@anl-mcs"
(on the Arpanet/CSNET)
or to
"research!netlib"
(on the Unix\(tm network)
wake up a server that
distributes items from the collection.
For example, the one-line message
"send index"
gets a library catalog by return mail.
We describe how to use the service
and some of the issues
in its implementation.
.in 
.ll 
.QE
.nr PS 11
.nr VS 16
.nr PD 0.5v
.SH 
Introduction.
.PP
A large pool of high-quality mathematical software is
in use at educational, research, and industrial institutions around the country.
At present this software is available
from a number of
distribution agents \(em for example
AT&T Bell Laboratories for the PORT library,
IMSL,
the National Energy Software Center (NESC),
and the Numerical Algorithms Group (NAG).
All do a fine job with the distribution of
large packages of mathematical software, but there is no 
provision for convenient distribution of small pieces of software.
Currently scientists transmit such software by magnetic tapes,
but contacting authors and
deciphering alien tape formats wastes
an intolerable amount of time.
.PP
A new system,
.I
netlib,
.R
provides quick, easy, and efficient distribution
of public-domain software to the scientific computing community
on an as-needed basis.
It sends electronic mail over
Arpanet, CSNET, Telenet, or Unix uucp.
.SH 
Netlib in Use.
.PP
Imagine an engineer who needs to compute several integrals numerically.
He consults the resident numerical expert, who advises
trying the routine $dqag$ for some preliminary estimates
and then using $gaussq$ for the production runs.
The engineer types at his terminal
.I
.nf
          mail research!netlib
          send dqag from quadpack
          send gaussq from go
          .
.fi
.R
In a short time, he receives back
two pieces of mail from 
$netlibd$.
The first contains the
double precision Fortran subroutine $dqag$
and all the routines from $quadpack$ that $dqag$ calls;
the second contains $gaussq$ and the routines it calls.
.PP
The utility routine $d1mach$
was not included with $gaussq$, since it is probably already installed
on his system;
if he had wanted it, he could have changed his request to
.I
"send gaussq from go core"
.R
to include the ``core library'' of machine constants and basic linear
algebra modules in the search list.
.PP
Should the engineer later decide that the routine
$dqags$
would be more effective, he could send the request
.I
"send dqags but not dqag from quadpack"
.R
to get
$dqags$
and any subroutines not already sent with
$dqag$ .
.PP
This engineer happens to be running Unix;  if instead his machine
were on the Arpanet, he would use the address 
.I
netlib@anl-mcs.
.R
If he needed the code in upper case, he would send his request
in all caps;  to get single precision, he need simply change the names
of the routines or the libraries, as appropriate.
Finally, he could ask for several routines together:
.I
.nf
        SEND RG RS FROM DEISPACK
        SEND DGECO FROM LINPACK CORE
.fi
.R
.PP
Meanwhile, the numerical expert decides she should
check on the current contents of netlib.  She types
.nf
.I
        mail research!netlib
        send index
.fi
.R
The return mail shows a library $toeplitz$ she is not
familiar with, so she sends mail
.I
"send index for toeplitz"
.R
to see what is included.
Curious to see a typical routine, she tries
.I
"send only cslz from toeplitz"
.R
and gets just $cslz$, without any of the routines that it calls.
.PP
More formally, requests have the following syntax:
.nf
$request_line$:
     send $names$ $exclusions sub opt$ $libraries sub opt$
     send only $names$ $libraries sub opt$
     who is $names$
$exclusions$:
     but not $names$
$libraries$:
     for $names$
     from $names$
.fi
where $names$ is a list of words, separated by blanks.
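.PP
As a rough illustration of this grammar, the following Python sketch
parses one request line into its parts.
It is not the C program used by the server;
the function name \fIparse_request\fP and the dictionary layout
are assumptions made for this example,
and the sketch accepts exclusions and libraries in either order.
.nf
```python
def parse_request(line):
    """Parse one request line into a dictionary (case is ignored here)."""
    words = line.lower().split()
    if words[:2] == ["who", "is"]:
        return {"verb": "who is", "names": words[2:]}
    if not words or words[0] != "send":
        raise ValueError("unrecognized request: " + line)
    words = words[1:]
    only = words[:1] == ["only"]
    if only:
        words = words[1:]
    request = {"verb": "send", "only": only,
               "names": [], "exclusions": [], "libraries": []}
    target = "names"
    i = 0
    while i < len(words):
        if words[i:i + 2] == ["but", "not"]:
            target = "exclusions"
            i += 2
        elif words[i] in ("for", "from"):
            target = "libraries"
            i += 1
        else:
            request[target].append(words[i])
            i += 1
    if only and request["exclusions"]:
        raise ValueError("'send only' takes no exclusions: " + line)
    return request
```
.fi
Under these assumptions, the request
"send dqags but not dqag from quadpack"
gives the names \fIdqags\fP, the exclusions \fIdqag\fP,
and the libraries \fIquadpack\fP.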
.PP
Just how quickly these requests are answered depends on the speed of the
network communications involved, but five or ten minutes is typical for
Arpanet.
CSNET or Unix uucp may require anywhere from minutes to
days to transmit a message from
sender to recipient.
The actual processing time is insignificant.
One user wrote back enthusiastically that the system was so fast
he preferred using it to hunting around on his own machine for
the library software.
.SH
Material Available through Netlib.
.PP
Currently netlib offers:
linear algebra routines from LINPACK [9], EISPACK [13,17],
and TOEPLITZ [1];
optimization routines from  MINPACK [13] and Gay [8];
the special function library FNLIB by Fullerton;
code from the book by Forsythe, Malcolm, and Moler [10];
quadrature routines from QUADPACK [16];
PPPACK routines from de Boor's
.I
Practical Guide to Splines [3];
.R
the Collected Algorithms of the ACM published in the
.I
Transactions on Mathematical Software;
.R
FISHPACK routines providing finite difference
approximations for elliptic boundary value problems [18];
iterative linear system solvers from ITPACK [14];
the public subset of FITPACK by Cline;
routines for machine constants and error handling
and other public routines from the PORT library [11]
and SLATEC,
the Basic Linear Algebra Subroutines and extensions [12],
Golub and Welsch's GAUSSQ [10],
biharmonic solvers [2],
the SCPACK Schwarz-Christoffel conformal mapping program [19],
the PARANOIA floating point test,
the PCHIP routines for Hermite cubics by Fritsch and Carlson,
the MA28 sparse matrix routine from the Harwell library,
the Y12M package for sparse linear systems,
Scott's LASO block Lanczos code,
and miscellaneous other items.
The multigrid program PLTMG by Bank
and the multiple precision package by Brent
are also in the collection, though
they are probably too large to send
by mail in practice.
.PP
The various standard linear algebra libraries
are included for convenience,
but the real heart of the collection lies
in the recent research codes and the
"golden oldies" that somehow never made it
into standard libraries.
Almost all of these programs are in Fortran
but some are in C, such as the routine $rainbow$ by Grosse
for generating uniformly spaced colors.
There is also a collection
of errata for numerical books,
descriptions and benchmark data for various computers,
test data for linear programming collected by Gay,
and the ``na-list'' electronic address book
maintained by Gene Golub.
.PP
We do
\fInot\fP
send out entire libraries.
The computer center setting up a comprehensive numerical library
should get magnetic tapes through the usual channels.
.PP
There is no reason to restrict the collection to mathematical software.
If the habit of sharing work using software libraries
of general utility
becomes popular in other fields, we would be delighted to accommodate them.
.SH
The Netlib Server.
.PP
The netlib server runs under the Unix
operating system
(8th edition at Bell Labs and 4.1BSD at Argonne)
and consists of a few shell scripts and C programs.
The following discussion necessarily
assumes some familiarity with Unix commands.
.PP
When mail arrives for
$netlib$,
it is piped through a sed editor script that strips punctuation,
through a sort process to remove duplicates,
and into a C program that parses the request.
This program then invokes a shell script that
translates the given library names into a search list
and invokes the system loader with the given routine names
as external symbols to be resolved.
The resulting loader map is edited into a list of
file names to satisfy the request.
These files, along with a time stamp and disclaimer,
are then mailed back to the requester.
A line is added to a logfile showing the time, return address,
number of characters sent, and requested routine and library names.
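.PP
The effect of the loader step can be pictured as computing
every routine reachable from the requested names.
The Python sketch below makes that idea concrete under stated assumptions:
it replaces the system loader with a dictionary of calls,
the call graph shown is hypothetical and is not the real QUADPACK structure,
and the library search list is left out.
.nf
```python
def routines_to_send(names, calls, exclusions=(), only=False):
    """Return the sorted list of routines needed to satisfy a request."""
    if only:
        # "send only" returns the named routines and nothing they call.
        return sorted(set(names))

    def closure(roots):
        # Collect everything reachable from the given routines.
        seen = set()
        stack = list(roots)
        while stack:
            routine = stack.pop()
            if routine not in seen:
                seen.add(routine)
                stack.extend(calls.get(routine, ()))
        return seen

    # "but not" drops whatever the excluded routines already brought along.
    return sorted(closure(names) - closure(exclusions))


# Hypothetical call graph, for illustration only.
CALLS = {
    "dqag": ["dqage"],
    "dqags": ["dqagse"],
    "dqage": ["dqk15", "dqpsrt"],
    "dqagse": ["dqk21", "dqpsrt", "dqelg"],
}
```
.fi
With this made-up graph, asking for \fIdqags\fP but not \fIdqag\fP
returns \fIdqags\fP, \fIdqagse\fP, \fIdqelg\fP, and \fIdqk21\fP;
the shared routine \fIdqpsrt\fP is omitted because it would
already have arrived with \fIdqag\fP.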
.PP
The programs can tolerate minor syntax deviations,
since we do get requests like:
"Please send me r1mach from port. Thank you."
from people who don't realize they are talking
to a program.
Users sometimes submit a single request on the
subject line of the mail message, so a 
"Subject:"
prefix is also allowed.
One user even sent 
.I
"send index 4 eispack"
.R
instead of
.I
"send index for eispack",
.R
so 
$4$
is a synonym for
.I
for
.R
and 
.I
from.
.R
(This is not such an unreasonable mistake, considering that
the instructions for using netlib are often given over the phone.)
However, we make no attempt to accept arbitrary English input.
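.PP
The kind of tolerance described above might be sketched in Python as follows.
This is an illustration only, not the sed script or C parser used by the server:
the function name \fInormalize\fP and the list of filler words are assumptions,
and a real routine that happened to share a name with a filler word
would be lost by this sketch.
.nf
```python
import string

SYNONYMS = {"4": "for"}    # accepts "send index 4 eispack"
FILLER = {"please", "me", "thank", "thanks", "you"}    # assumed filler words


def normalize(line):
    """Reduce a loosely written request to the words a parser expects."""
    line = line.strip()
    if line.lower().startswith("subject:"):
        line = line[len("subject:"):]
    words = []
    for word in line.split():
        # Drop punctuation at either end of the word, as in "port."
        word = word.strip(string.punctuation)
        if not word or word.lower() in FILLER:
            continue
        words.append(SYNONYMS.get(word, word))
    return " ".join(words)
```
.fi
Under these assumptions,
"Please send me r1mach from port. Thank you."
reduces to
"send r1mach from port",
and the case of the remaining words is preserved.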
.PP
One way to start up the mail processing is to have a daemon
process that wakes up every few minutes and checks for a
nonempty mailbox.
In 8th edition Unix, thanks to Dave Presotto, if a mailbox contains
.I
Pipe to rcv.cmd,
.R
then the mail delivery software, instead of appending the
incoming text to a mailbox, will pipe the text to the command
$rcv.cmd$.
(Similar functionality is available from the Berkeley mail alias facility.)
The mailbox is owned by user-id 
.I
netlibd
.R
so that the process is run as netlibd;
hence the return mail will have this mnemonic name attached.
The userid is not just 
.I
netlib
.R
because if the return mail command
fails or if the remote user sends a reply, the message should go
to the administrator, not back into the request processor.
For example, mail once came back announcing that a user
had gone on vacation in the few hours before the netlib
response had gotten to his mailbox!
.PP
The file that describes the mapping from library names to loader
search lists consists simply of lines of the
form 
"eispack => \-leispack" .
Several similar lines allow for alternate spellings
such as 
\f2eispac\f1
and 
\f2eispak\f1.
This file is easily updated when new libraries are added to the
collection.
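.PP
A file in this form is simple to read by program.
The Python sketch below shows one way to do so;
the function names are assumptions,
and the sample line for the library \fIcore\fP is hypothetical,
since only the \fIeispack\fP line is quoted above.
.nf
```python
def load_library_map(lines):
    """Build a dictionary from library name to loader search-list entry."""
    table = {}
    for line in lines:
        name, arrow, flag = line.partition("=>")
        if arrow and name.strip() and flag.strip():
            table[name.strip().lower()] = flag.strip()
    return table


def search_list(libraries, table):
    """Translate requested library names into loader arguments, in order."""
    return [table[name.lower()] for name in libraries]


SAMPLE = [
    "eispack => -leispack",
    "eispac => -leispack",    # alternate spelling
    "eispak => -leispack",    # alternate spelling
    "core => -lcore",         # hypothetical entry
]
```
.fi
Here an unknown library name raises a \fIKeyError\fP;
what the server itself reports in that case is not covered by this sketch.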
.PP
A subtle security problem arises from the implementation:  we
construct commands to a shell based on text from a user.
It could be catastrophic to blindly send mail to a return address
of 
\f2kgbvax!\`rm -r *\`\f1,
since the backquote characters tell the
shell to first execute a command that removes all files!
Therefore, the request parser checks for dangerous characters.
Another potential security problem is that someone might tamper
with the program text as it is en route to the user.
For now, we feel that the threat is not serious enough to adopt
encryption schemes, though those would be easy to add.
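.PP
The exact test applied by the request parser is not given here.
One conservative variant, sketched below in Python,
accepts a return address only if every character
belongs to a small approved set, rather than listing the dangerous ones.
The approved set and the function name are assumptions for illustration.
.nf
```python
import string

# Assumed whitelist: letters, digits, and the separators common in
# uucp and Arpanet addresses.  Blanks and shell metacharacters are absent.
SAFE = set(string.ascii_letters + string.digits + "!@.%-_+")


def is_safe_address(address):
    """Return True only if every character is on the whitelist."""
    return bool(address) and all(ch in SAFE for ch in address)
```
.fi
Addresses such as research!ehg and ehg@btl.csnet pass this test,
while the hostile address quoted above fails
because it contains backquotes, a blank, and an asterisk.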
.PP
Even though there are standards,
it is not particularly easy to extract from a request a
valid return address.
There are comment brackets and anticomment brackets to be recognized
and address transformations to be unwound,
but we now seem to answer correctly except
when the return address contains blanks.
.PP
We do not use checksums since the network software
already provides a reliable channel.
We have received only one complaint, which involved
noise on the link from a user's Vax to his PC;
we regard that as his responsibility.
If checksums were required, we would choose
a scheme like that in MOSIS [15]
which allows for anticipated, insignificant
changes such as addition of trailing blanks on lines.
To avoid problems with mail processing programs in the various
networks, our request syntax avoids colons
and our replies start with a blank line
so that message contents are not processed as header information
along the mail route.
Problems occasionally arise with computers that are willing
to send us mail, but will not allow us to send mail back.
Delays for multihop and inter-network mail are more common,
but we have no way to collect statistics on that and
in any event it is out of our control.
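.PP
To make the idea of a checksum that tolerates insignificant changes concrete,
here is a minimal Python sketch.
It is not the MOSIS scheme, whose details are not reproduced here;
the choice of CRC-32 and the rule of removing trailing blanks and tabs
before summing are assumptions made for illustration.
.nf
```python
import zlib

NEWLINE = chr(10)    # the newline character


def tolerant_checksum(text):
    """CRC-32 of the text with trailing blanks and tabs removed per line."""
    lines = [line.rstrip() for line in text.splitlines()]
    canonical = "".join(line + NEWLINE for line in lines)
    return zlib.crc32(canonical.encode("utf-8"))
```
.fi
Two copies of a file that differ only in trailing blanks
give the same value, while a change to the program text itself does not.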
.PP
The most difficult problem we have encountered has been length limitation;
a few of the programs are more than 100 kilobytes,
and that is more than the mail systems at many
Arpanet sites will tolerate.
Of course, the file transmission protocols can
handle larger sizes,
but those are too cumbersome and unstandardized
for our purposes.
We get around this by splitting up large items into several pieces of mail,
but would prefer to see the mail systems themselves improved.
We considered using Huffman coding to compress the files we send
out, but that would only save about a factor of two and would
require that we ship decoding programs.
However, in setting up the netlib
collection of test data for linear programming,
David Gay did decide to adopt a program
for compressing MPS format files.
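.PP
The splitting of large items into several pieces of mail
could be done along the following lines.
This Python sketch is an assumption about the approach, not the server's code:
the function name and the default limit of 50000 characters are invented,
and pieces are cut only at line boundaries so that no source line is broken.
.nf
```python
def split_for_mail(text, limit=50000):
    """Split text at line boundaries into pieces of about limit characters.

    No piece exceeds the limit unless a single line is itself longer.
    """
    pieces = []
    current = []
    size = 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > limit:
            pieces.append("".join(current))
            current = []
            size = 0
        current.append(line)
        size += len(line)
    if current:
        pieces.append("".join(current))
    return pieces
```
.fi
Concatenating the pieces in order reproduces the original text exactly,
so the recipient needs no special program to reassemble them.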
.SH 
Discussion.
.PP
We chose this mode of interaction via electronic mail, keeping the
intelligence local to the central depository, because mail is at
present the only ubiquitous data communication service.
We considered putting an interactive program at
remote sites, communicating by mail with the depository.
That would allow a better dialogue (``Do you want that in single
or double?'') but would be difficult to write in the necessary 
portable way.
.PP
We are not aware of any comparable software distribution service
in existence,
although some personal computer "public bulletin board" systems
may be somewhat similar.
At least one bulletin board has been confiscated
because it contained a stolen telephone charge number.
For this reason and to control space, we do not allow users to
put their own software automatically in the collection.
.PP
The netlib service provides its users with features
not previously available.
There are no administrative channels to go through.
Since no human processes the request, it is
possible to get software in the middle of the night.
The most up-to-date version is always available.
Individual routines or pieces of a package can be obtained
instead of a whole collection.
One of the problems with receiving a
large package of software is the volume
of material. Often only a few routines are required from a package, yet the
material is distributed as a whole collection, and the unneeded parts
cannot easily be stripped off.
.PP
At present, netlib is simply a clearinghouse for contributed
software and therefore subject to various disadvantages that
have plagued such projects in the past:
the only documents, example programs, and implementation tests
are those supplied by the code author or other users.
There may be multiple codes for the same task and no help
in choosing which is best.
We have made an effort not to stock numerous copies of machine
constants, but in general we have left submitted codes untouched.
Our system differs from previous efforts mainly in having
a different focus than, say, the Quantum Chemistry Exchange,
and a more convenient distribution mechanism.
.PP
Several years ago there was a discussion on the Arpanet prompted
by a query from Jim Pool as to whether the time was not ripe
for "a portable set of documentation for interactive access by
users of a collection of mathematical software."
His idea was that the SLAC NAPLUG [5]
be put into an expert system form.
We have not yet tackled that problem in netlib,
although we do pass along whatever documentation comes
from the original code authors.
Since the time of that discussion, local mathematical typesetting
with output on terminals has become more common
but most of the other objections remain.
The user cannot be assumed to describe his problem exactly as the
numerical analyst would;  thus the program must be able to translate
from the engineering to the mathematical domain.
Understanding only the general nature of the user's problem is not
enough;  this leaves too much documentation to wade through.
A certain amount of insight is required to realize that a
user may not need exactly what he thinks he needs.
.IP
"Do you need the matrix inverse?  Maybe you just need the
solution to a linear system."
.IP
"This is a correlation matrix, and I really do want to look at
the elements."
.LP
The general user will only be looking for a library routine a few
times a year.  He will certainly not remember more than a few
commands; a sophisticated search language is infeasible.
Who is going to write all the documentation in the required format?
At least a modest knowledge of numerical analysis and considerable
consulting experience will be necessary, but the job is tedious
and unrewarding.
The best interactive documentation system is a good numerical analyst
interested in the users' problems.  Unfortunately, this system
has its own difficulties:  expensive to reproduce,
inconsistent in intelligence and alertness,
hard to transport,
prone to use buzzwords,
often unavailable,
specialized,
difficult to keep current.
So there have been continuing efforts to build online numerical help
facilities, the most successful of these being
GAMS at the National Bureau of Standards,
the NAG online help facilities and decision trees,
and
NIT at Oak Ridge.
Entirely new writing styles are possible.
Beyond the graph structured text
popularized in "programmed learning manuals" a decade ago,
specific documentation might be derived,
rather than simply searching for and listing parts of a file.
Instead of a single example,
an online consultant could provide a complete program tailored to
the problem at hand.
Also, some knowledge of the previous experience of the reader
might be used to modify the level of explanation and avoid
needless repetition.
.PP
The main cost of running this service is for communications.
If it becomes necessary, we will
require uucp users to call the hosts
to pick up their return mail
so that such costs are distributed fairly.
At an average of a few requests per day,
the traffic has been small enough to impose a negligible load on the
host systems.  Disk costs are controlled by discarding files that
the host administrators are not themselves interested in keeping.
The current collection occupies 32 megabytes.
Most important, the human costs for maintaining the collection are modest
and consist mainly of collecting software.
We do not see how we could run such a widely accessible and low
overhead operation if we had to charge for the service,
and are not interested in doing so.
(See, however, [4] for a description of the Toolchest
electronic ordering system.
One problem mentioned there is that users want to see demonstrations
of software before purchase.)
.PP
The coverage of netlib obviously will tend to reflect the interests
of the collectors, so we would welcome "associate
editors" to augment the collection.  Please send mail to the authors.
At present, there are just two distribution sites.
Mail delays would be reduced if machines on other networks
or in other countries were willing to also serve
as depositories.
On the other hand, it is difficult even to keep two locations in sync!
The software netlib uses to reply to mail is itself available from netlib,
so it would be fairly easy for someone to, say, announce a service
for searching a bibliography that he has collected.
.PP
Netlib, being free, cannot replace
commercial software firms.
We provide no consulting,
make no claims for the quality of the software distributed,
and do not even guarantee the service will continue.
In compensation,
the quick response time and the lack of bureaucratic, legal, and financial
impediments encourage researchers to send us their codes.
They know that their work can quickly be available to a wide
audience for testing and use.
We hope netlib will promote the use of modern numerical techniques
in general scientific computing.
.sp
.SH
Acknowledgements.
.PP
We wish to express our
gratitude to the many authors and editors
who have permitted their codes to be freely distributed
and to Gene Golub for his encouragement and help in starting
this project.
The trick of editing a loader map is taken from the GAMS system
at the National Bureau of Standards.
Finally, the managements of our organizations deserve thanks for
sponsoring this public service.
.SH
References.
.IP [1]
.R
O.B. Arushanian, et al.,
.I
The TOEPLITZ Package Users' Guide,
.R
Argonne National Laboratory, ANL-83-16, (1983).
.sp
.IP [2]
P. Bj\o'o/'rstad,
"Fast Numerical Solution of the Biharmonic Dirichlet Problem on Rectangles",
.I
SIAM J. on Numerical Analysis,
.R
20 (1983), 59-71.
.sp
.IP [3]
C. de Boor,
.I
A Practical Guide to Splines,
.R
Applied Mathematical Sciences, Vol. 27, Springer-Verlag, New York, 1978.
.sp
.IP [4]
Catherine A. Brooks,
"Experiences with Electronic Software Distribution",
.I
USENIX Association 1985 Summer Conference Proceedings,
.R
Portland, Oregon.
.sp
.IP [5]
T. F. Chan, W.  M. Coughran, Jr., E. H. Grosse, M. T. Heath, F. T. Luk,
"Numerical Analysis Program Library User's Guide",
SLAC Computing Services User Note 82,
Stanford University, 1976.
.sp
.IP [6]
W.J. Cody,
"The Construction of Numerical Subroutine Libraries",
.I
SIAM Review,
.R
16 (1974), 36-46.
.sp
.IP [7]
W.J. Cody,
"Observations on the Mathematical Software Effort",
to appear in 
.I
Sources and Development of
Mathematical Software, 
.R
ed. W. Cowell, Prentice-Hall, Englewood Cliffs, N.J., 1983.
.sp
.IP [8]
J. E. Dennis, D. M. Gay, R. E. Welsch,
"An Adaptive Nonlinear Least Squares Algorithm",
ACM Trans. on Mathematical Software,
7 (1981) 348-368,369-383.
.sp
.IP [9]
J.J. Dongarra, J.R. Bunch, C.B. Moler, and G.W. Stewart, 
.I
LINPACK Users' Guide,
.R
SIAM Publications, Philadelphia, 1979.
.sp
.IP [10]
G.E. Forsythe, M.A. Malcolm, and C.B. Moler,
.I
Computer Methods for Mathematical Computations,
.R
Prentice-Hall, Englewood Cliffs, N.J., 1977.
.sp
.IP [11]
P. A. Fox, A. D. Hall, N. L. Schryer,
"The PORT Mathematical Subroutine Library",
ACM Trans. on Mathematical Software,
4 (1978) 104-126, 177-188.
.sp
.IP [12]
W. Fullerton,
.I
FNLIB User's Manual,
.R
AT&T Bell Laboratories, (1981).
.sp
.IP [13] 
B.S. Garbow, J.M. Boyle, J.J. Dongarra, and C.B. Moler, 
.I
Matrix Eigensystem Routines - EISPACK Guide Extension, 
.R
Lecture Notes in Computer Science, Vol. 51, Springer-Verlag, Berlin, 1977.
.sp
.IP [10]
G.H. Golub, J.H. Welsch,
"Calculation of Gauss Quadrature Rules",
.I
Mathematics of Computation,
.R
23 (1969) 221-230.
.sp
.IP [14]
.R
D.R. Kincaid, J.R. Respess, D.M. Young,
"ITPACK 2C: A Fortran Package for Solving Large Sparse
Linear Systems by Adaptive Accelerated Iterative Methods",
.I
ACM Trans. Mathematical Software,
.R
8 (1982), 302-322.
.sp
.IP [12]
C. Lawson, R. Hanson, D. Kincaid, and F. Krogh,
"Basic Linear Algebra Subprograms for Fortran Usage",
.I
ACM Trans. Mathematical Software,
.R
5 (1979), 308-371.
.sp
.IP [15]
G. Lewicki, D. Cohen, P. Losleben, D. Trotter,
"MOSIS: Present & Future",
.I
1984 Conf. on Advanced Research in VLSI,
.R
MIT, Jan. 1984.
.sp
.IP [13]
.R
J. Mor\*'e, D. Sorensen, B. Garbow, and K. Hillstrom,
.I
The MINPACK Project,
.R
in Sources and Development of
Mathematical Software, edited by W. Cowell, Prentice Hall, pp. 88-111, 1984.
.sp
.IP [16]
R. Piessens, E. deDoncker-Kapenga, C. Uberhuber, D. Kahaner,
.I
Quadpack: a Subroutine Package for Automatic Integration,
.R
Series in Computational Mathematics v.1,
Springer Verlag, 1983.
.sp
.IP [17] 
B.T. Smith, J.M. Boyle, J.J. Dongarra, B.S. Garbow, Y. Ikebe,
V.C. Klema, and C.B. Moler, 
.I
Matrix Eigensystem Routines - EISPACK Guide, 
.R
Lecture Notes in Computer Science, Vol. 6, 2nd Edition, 
Springer-Verlag, Berlin, 1976.
.sp
.IP [18]
P.N. Swarztrauber and R.A. Sweet, 
"Efficient FORTRAN Subroutines
for the Solution of Separable Elliptic Equations, Algorithm 541",
.I
ACM Trans. Mathematical Software,
.R
5 (1979), 352-364.
.sp
.IP [19]
L. N. Trefethen,
"Numerical Computation of the Schwarz-Christoffel Transformation",
SIAM J. Scientific and Statistical Computing,
1 (1980) 82-102.