Date: April 1, 1997
From: Jay Black, Associate Provost, Information Systems and Technology
In the Fall of 1996 I received the report of the Ad Hoc Working Group on News Management. I am now able to proceed in accordance with the recommendations of that committee ...
While news allows communication on a wide range of topics related to both the academic programs and the general intellectual life of the University, it consumes University resources. [...] Based on our study of the various possibilities and the approaches we considered, we recommend that the University cease importation of any newsgroup hierarchy whose volume constitutes a significant portion of the total and whose primary function appears to be the dissemination of non-text content (such as picture files, executable software, or sound files) [...] The use of newsgroups to disseminate such objects is not a worthwhile use of the resources that are being consumed. [...] Appeals to continue importing a specific newsgroup that is deemed essential for research, teaching and administration or to enrich the general intellectual life of the University will be received by the Associate Provost, Information Systems and Technology, and referred to an appeals committee.
The Working Group recommended that termination of predominantly-non-text newsgroups commence with the "alt.binaries" hierarchy. I have instructed the IST staff to proceed with arranging for that to happen as soon as possible.
The volume of network newsgroup postings (hereafter referred to simply as "news") that is arriving at the University of Waterloo is increasing rapidly to the point where existing resources will be inadequate. In 1995, the University Computing Committee (UCC) struck a small working group on newsgroup management with the mandate of making recommendations to the UCC for a university-wide news management policy. The membership of the working group was chosen to represent a range of expertise and viewpoints.
While news allows communication on a wide range of topics related to both the academic programs and the general intellectual life of the University, it consumes University resources. These include bandwidth on our external and internal connections, computing and personnel resources to provide and maintain primary and secondary news servers, and computer hardware in labs to permit news reading and news posting on campus. Many possibilities exist for reducing the load on these resources in minor ways; however, in order to have any significant impact, a significant reduction in the volume of the news imported must be made.
Based on our study of the various possibilities and the approaches we considered, we recommend that, on November 1, 1996, the University cease importation of any newsgroup hierarchy whose volume constitutes a significant portion of the total and whose primary function appears to be the dissemination of non-text content (such as picture files, executable software, or sound files).
For example, the "alt.binaries" hierarchy currently represents 2%-3% of the newsgroups but 65%-80% of the volume.
The use of newsgroups to disseminate such objects is not a worthwhile use of the resources that are being consumed. The recommended approach is easy to administer and does not materially hamper the value of "news" for the free availability and exchange of ideas. Reputable repositories exist, both on campus and externally, for access to software that has been contributed to the public by its developers.
Appeals to continue importing a specific newsgroup that is deemed essential for research, teaching and administration or to enrich the general intellectual life of the University will be received by the Associate Provost, Information Systems and Technology, and referred to an appeals committee. Requests for the continued importing of a newsgroup will be evaluated based upon the stated need and the resource usage implications. The decision of the committee is subject to review based upon changing needs and resources.
It is proposed that the appeals committee be constituted with five members: chaired by a member of the University Committee on Information Systems and Technology, with one member from Information Systems and Technology (nominated by the Associate Provost, Information Systems and Technology), one member from the Library (nominated by the University Librarian), one member from the University faculty (nomination procedure to be decided) and one member from the Student body (nomination procedure to be decided).
A public meeting will be held early in the Fall 1996 term to publicize and explain the process which led up to this recommendation.
1. Mandate
2. Principles
3. Process
4. Primer on News
5. The Value of News
6. Scope of the Problem
7. Other Approaches Considered
Appendix 1. Questions and Answers
Appendix 2. Glossary of Terms
The amount of network news arriving at the University of Waterloo is increasing rapidly to the point where existing resources will be inadequate. In 1995 the University Computing Committee (UCC) struck a small working group on newsgroup management with the mandate of making recommendations to UCC for a university-wide news management policy and an implementation strategy to disseminate the policy to the user community.
The working group formulated three principles for guidance:
When a university must make decisions about intellectual resources, those decisions are usually based, at least in part, on content. The principle of intellectual freedom ensures that selection decisions are made solely on value and cost, without consideration for partisan or doctrinaire disapproval.
Many library associations have issued statements underlining the importance of this principle. For example, the Ontario Library Association states
...intellectual freedom requires freedom to examine other ideas and other interpretations of life than those currently approved by the local community or by society in general, including those ideas and interpretations which may be unconventional or unpopular. [2]
Selection is the process by which librarians deal with acquisition under the reality of a finite budget [7]. Selection is not censorship. The goal of selection is to ensure that the intellectual resources selected represent a wide variety of disciplines and points of view, and support the intellectual life of the University community.
Acquisition of intellectual resources by the University, whether it be books or computer newsgroups, does not necessarily imply endorsement by the University of the points of view stated therein.
The working group collated information from a variety of sources on how newsgroups are managed at the University of Waterloo, what aspects of newsgroups impact on the resources consumed, available statistics on newsgroups and their usage, policies and resource usage at other institutions and principles of intellectual freedom applicable to this issue.
In 1991, a committee was struck to recommend how network news should be handled. It is worth recalling some of the conclusions of that report [1]:
4. The University's primary news-server continue to receive all newsgroups generated internally and all newsgroups which arrive over the networks to which the University is connected.
5. The contents of these newsgroups continue to be made available to all lower-level servers.
6. When decisions are to be made re the consumption of computing resources for newsgroups, those responsible for such decisions should widely and formally consult with the full user community. In the case of the primary server, it should be the responsibility of the University Computing Committee to see that such consultation takes place. In the case of lower level servers, a well-defined consultative process, approved by the University Computing Committee, should exist.
This draft report of the working group presents a summary of the issues relevant to a university-wide news management policy. It concludes with a catalogue of approaches considered as part of a news management policy. It has been taken to the University Computing Committee for comment, and is a key part of a process of consultation with the user community. It will be publicized through a range of media including newsgroups, World Wide Web, the Gazette and notices of meeting.
Using this report to define the issues, a process of public consultation with the user community is planned in early 1996. Comment on the issues and on the range of approaches to news management will be solicited. It is anticipated that at least one open meeting will be held.
After public consultation, the working group will use the input to recommend a News Management Policy for the University. This recommended policy will be submitted to the University Computing Committee for consideration.
UW's campus network contains over 8000 computers on campus, with connectivity to the Internet via a link to the provincial network operated by ONet Networking (a corporation providing Internet connectivity for Ontario educational institutions, government agencies, and private-sector corporations; it has 148 member organizations). This external link is a T1 circuit, which has a maximum capacity of 1.536 million bits per second (Mbps). The current annual cost of this connection is $70,000. The T1 link carries electronic mail, ftp requests, network news, and World-Wide-Web (WWW) traffic, among other things.
Network news is a system of distributed computer bulletin boards. There is no central agency or control. New newsgroup messages, or *postings*, are posted from thousands of organizations all over the world; they are then automatically redistributed over a number of computer networks by the news software. Any particular computer system receives and distributes its network news from and to some small list of "feeds". UW's news feed is with ONet Networking's news server in Toronto. ONet Networking's primary news feed (about 99% of the volume exchanged) is with ANS, a network-operations entity in the US; it also exchanges news with network organizations in other provinces connected to the CA*net national network, primarily BCnet and RISQ.
As a distribution medium, network news is actually more friendly to bandwidth and storage than other distribution mechanisms, such as mailing lists. This is because multiple users retrieve the same copy of a posting.
Network news is classified into a number of different hierarchies. At the top level, there are approximately 120 different hierarchies, such as "alt", "rec", "comp", "sci", "talk", "soc", etc. Each of these is further subdivided, as in "comp.lang", and so on. Although the number is difficult to estimate precisely, and changes from week to week, worldwide there are at least 18,000 different newsgroups. Not all of these are distributed at all locations, since some are specific to a particular geographical area. For example, the Toronto FreeNet currently distributes only about 13,000 newsgroups. Currently at UW the primary news server distributes only about 5,500 newsgroups. One reason for this smaller number is that not every new newsgroup created worldwide arrives at ONet's primary news server. For the traditional hierarchies, such as "sci", "comp", etc., newsgroups are created automatically at UW when the proper control message arrives. In the case of the "alt" hierarchy, a new newsgroup is created when its traffic exceeds 50 postings per week, or when someone makes a specific request for it -- e.g., by posting to "uw.newsgroups".
Every news server maintains an access control list that governs which groups it distributes to the other news servers with which it exchanges news. Thus, to alter the selection of newsgroups that arrives at one's news server, one must request to have the lists altered at each of these other news servers. The access control list is hierarchically arranged, allowing diverse configurations from an entire hierarchy down to a single group. It also allows "all but" configurations: for example, distribute all of "comp" but not "comp.theory".
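The hierarchical "all but" behaviour described above can be illustrated with a minimal sketch. It is not the configuration syntax of any actual news server; the rule table, the function name `is_distributed`, and the choice to withhold groups that match no rule are all assumptions made for illustration. The only idea taken from the text is that the most specific entry in the list decides whether a group is distributed.

```python
def is_distributed(group, rules):
    """Return True if `group` is selected by the most specific matching rule.

    `rules` maps a hierarchy prefix (e.g. "comp" or "comp.theory") to
    True (distribute) or False (do not distribute). This table format is
    hypothetical and is used only for illustration.
    """
    parts = group.split(".")
    # Try the longest prefix first, so an entry for "comp.theory"
    # overrides a broader entry for "comp".
    for n in range(len(parts), 0, -1):
        prefix = ".".join(parts[:n])
        if prefix in rules:
            return rules[prefix]
    # Assumption: a group that matches no rule is not distributed.
    return False


# "Distribute all of comp but not comp.theory", as in the example above.
example_rules = {"comp": True, "comp.theory": False}
```

With these rules, `is_distributed("comp.lang.c", example_rules)` is true, while `is_distributed("comp.theory", example_rules)` is false.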
A news server takes news articles as they arrive from these other news servers, determining which newsgroup they belong to, and then storing the article in an appropriate directory entry. For example, an incoming article in the newsgroup "alt.comp.acad-freedom.talk" would be stored in the directory "/news/alt/comp/acad-freedom/talk". Articles do not reside forever in their directories; they are "expired" by a program that examines the date and time the article was posted and deletes it if the time exceeds a certain threshold. Currently there is no official policy for determining expiry times, and expiry times may differ widely among primary and secondary news servers.
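The storage and expiry steps described above can be sketched as follows. The "/news" root and the example newsgroup come from the text; the function names and the expiry threshold passed by the caller are hypothetical, since no official expiry policy currently exists. This is a simplified illustration, not the behaviour of any particular news program.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath

# Spool root taken from the example in the text.
SPOOL_ROOT = PurePosixPath("/news")


def spool_directory(newsgroup):
    """Map a newsgroup name to its storage directory, one level per component."""
    return SPOOL_ROOT.joinpath(*newsgroup.split("."))


def is_expired(posted, now, max_age):
    """Return True if the article is older than the chosen threshold.

    `max_age` is a hypothetical per-server setting supplied by the caller.
    """
    return now - posted > max_age


directory = spool_directory("alt.comp.acad-freedom.talk")
# directory is /news/alt/comp/acad-freedom/talk
```

A server that retained articles for seven days would call, for example, `is_expired(posted, datetime.now(), timedelta(days=7))` for each stored article and delete those for which the result is true.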
Network news contains much that is extremely useful to faculty, staff, and students. For example:
However, network news also contains much that is of little apparent value. Newsgroups within a hierarchy may differ radically in content; contrast, for example, "alt.comp.acad-freedom.talk" (a relatively sober forum for discussion of computers and academic freedom) and "alt.devilbunnies" (apparently devoted to the discussion of satanic rabbits). The content of a newsgroup often changes over time; what appears to be of little use today may be useful tomorrow.
Network news importation and on-campus availability was begun in the 1980s by the Mathematics Faculty Computing Facility (MFCF). The process was taken over as a university-wide activity in 1991 by the Department of Computing Services.
In the early days of network news on campus, the volume was relatively light and did not require dedicated processors or disks to handle. As time went on, however, network news began to require more and more resources. Today, the primary news server, "news.uwaterloo.ca", is a Sun Sparc20 machine with 4 gigabytes devoted to news, and secondary news server machines exist in Mathematics, Engineering, and the administrative sector.
In the past year, both DCS and MFCF have, on occasion, been forced to delete thousands of postings in order to be able to process the news that is coming in. The backup in news processing in the Mathematics Faculty once became so bad that a professor in the Computer Science department reported that urgent messages posted to class newsgroups sometimes took as much as two or three days to appear.
The volume of network news has also begun to have an impact on the external link. News is an ever-increasing portion of the traffic. Three years ago, network news formed 11% of all campus traffic. In the fall of 1995 it reached as high as 20%. Network traffic consists of a stream of packets; each packet represents a piece of a mail message, news posting, ftp transaction, World-Wide-Web transaction, or other item. When utilization increases to a significant fraction of capacity, the processing load on network routers results in discarded packets. A significant amount of packet loss can cause backup, much like a drain backs up when it is clogged. The impact is most severe on network services such as telnet and rlogin (which allow users to connect to remote computers and use them in real time), mildly severe on WWW and ftp traffic, and less severe on electronic mail. Before the upgrading of the external link to T1 speed, telnet sessions to remote computers often resulted in 3-second delays between typing a character and watching it appear on the screen, making interactive work impossible.
Here are some projections for peak news import/export via ONet, under three different models: linear growth, exponential growth, and the arithmetic mean of these two. It appears that the exponential growth model is the best predictor of volume.
| MB/day | 1996-Jan | 1996-Nov |
|---|---|---|
| linear | 1,000 | 1,600 |
| exponential | 1,750 | 5,000 |
| arithmetic mean | 1,375 | 3,300 |
Note that the T1 line to ONet has a capacity of 16,580 MB/day which must allow transmission of not just news but also e-mail, ftp requests, WWW traffic etc.
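The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes that 1 MB means 1,000,000 bytes and that the full 1.536 Mbps is usable for a whole day; under those assumptions the T1 capacity works out to about 16,589 MB/day, close to the figure quoted above. The function names are illustrative only.

```python
T1_BITS_PER_SECOND = 1_536_000       # T1 capacity stated in the text
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_MB = 1_000_000             # assumption: decimal megabytes


def daily_capacity_mb(bits_per_second):
    """Convert a link speed into the megabytes it can carry in one day."""
    return bits_per_second * SECONDS_PER_DAY / 8 / BYTES_PER_MB


def arithmetic_mean(a, b):
    """Mean of two projections, as in the third row of the table."""
    return (a + b) / 2


# Projected peak news volume in MB/day (1996-Jan, 1996-Nov), from the table.
projections = {"linear": (1000, 1600), "exponential": (1750, 5000)}
projections["arithmetic mean"] = tuple(
    arithmetic_mean(lin, exp)
    for lin, exp in zip(projections["linear"], projections["exponential"])
)

capacity = daily_capacity_mb(T1_BITS_PER_SECOND)

# Fraction of the whole link that news alone would use in November 1996.
november_share = {model: nov / capacity for model, (_, nov) in projections.items()}
```

Under the exponential model, news alone would account for roughly 30% of the theoretical daily capacity of the link by November 1996, before any e-mail, ftp, or WWW traffic is counted.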
The resources consumed by network news fall into five main categories.
Table 1 summarizes the impact of news on these University resources.
Impact at current level of consumption:

| interaction | metric | network: external | network: campus | network: workgroup | server: cpu & i/o | server: filespace |
|---|---|---|---|---|---|---|
| server | number of articles (files) | | | | | minor |
| server | volume of articles | | | | | MAJOR |
| server | number of newsgroups (directories) | | | | | minor |
| server | article expiration | | | | MAJOR | (recovery) |
| server-server | number of articles transferred | medium | medium | medium | medium | medium |
| server-server | volume of articles transferred | MAJOR | MAJOR | MAJOR | MAJOR | MAJOR |
| client-server | number of "new news" queries | | | medium | medium | |
| client-server | volume of articles read | | | MAJOR | MAJOR | |
| client-server | number of articles posted | minor | minor | minor | minor | minor |
| client-server | volume of articles posted | minor | minor | minor | minor | minor |

| effective, efficient measures to reduce consumption | impact |
|---|---|
| reduce volume imported to campus by selecting only "important" news structures | MAJOR |
| reduce article-retention period based on age and size | MAJOR |
| implement a secondary server in each faculty-level constituency where loading warrants | MAJOR |
Table 2 summarizes the estimated costs of supporting network news at the University of Waterloo over the three-year period beginning 1995/96.
The committee considered each of the following approaches to resolve the problem.
Advantages: a simple solution that makes the problem go away in the short term. Students like the "abundant access" to the Internet at Waterloo, as mentioned in the 1994 Maclean's survey on universities. Avoids difficult decisions about what newsgroups to drop.
Concerns: According to Table 2, it takes approximately $68,000 annually to support network news. The current budget climate makes it likely that organizational units will be getting significant budget cuts over the next few years, not increases. Some feel that additional spending on network news, compared to other services, is not warranted by the content of the newsgroups.
Examples: One can get improved performance by having a single machine as a dedicated primary server -- but this has already been done. Using "striped disks" -- a form of disk parallelism that decreases effective disk seek time -- can increase efficiency, but this has already been done. One can run a "space is tight" panic routine to erase whole newsgroup hierarchies when the disk space allocated to them becomes full, but this is already being done. In the past, UW used to feed Wilfrid Laurier University (which consumed resources), but this was ended in January 1995. We will get improved server performance by moving to a different program for handling news (INN versus the older program CNEWS) but again, this does not affect the bandwidth problem.
Advantages: technological solutions may offer some limited improvement at low cost.
Concerns: many known approaches have already been tried at UW. Many forms of improved technology cannot substantially improve the underlying bandwidth problem.
Advantages: some gains in disk space utilization.
Concerns: will not substantially affect bandwidth problem. More frequent expiries will use more CPU time.
| structure | reduction |
|---|---|
| alt | 70-80% |
| alt.binaries | 50-60% |
| alt.binaries.pictures | 10-20% |
| alt.binaries.multimedia | 3-15% |
| alt.binaries.warez | 2-6% |
| alt.binaries.games | 5-40% |
| alt.binaries.pictures.erotica | 2-20% |
However, there is no obvious correlation between newsgroup volume and local readership, nor between volume and value to the University community.
Advantages: large gains can be made in both bandwidth and disk storage with a relatively small change.
Concerns: censorship issues. Decisions based on content must be made on intellectual freedom principles, requiring staff time to analyze content and make recommendations. Removing an entire hierarchy may inadvertently remove groups of vital interest to the University community. This could even potentially increase traffic, if many users begin reading news from other computers off campus.
Advantages: large gains can be made in both bandwidth and disk storage with relatively minor changes -- especially if importation of certain groups or hierarchies is restricted. Changes to the expiration mechanism alone give only a small improvement. Decisions can be flexible; if more resources appear, or if better technological solutions arise, previously removed newsgroups can be imported again. Allows input of user community.
Concerns: Newsgroup readership is currently hard to estimate, although it could be done with the consumption of additional resources. Could potentially even increase traffic, if many users begin reading groups from other computers off campus. "Value to the University community" is hard to evaluate, and requires staff time to analyze content and make recommendations. It may not be feasible to make these decisions at the level of the individual newsgroup, for three reasons. First, the volume of newsgroups makes evaluation time-consuming. Second, the content of newsgroups varies erratically over time. Third, it is currently difficult to request frequent changes to the access control lists at the ONet Networking news servers; the ONet Operations Centre expects each ONet member to decide on its newsgroup-importation strategy and then request changes to that strategy only infrequently.
Interlibrary Loan/Document Delivery acquires, for research or study purposes, materials from other institutions or commercial services that are not available in the library collection of the University of Waterloo or Wilfrid Laurier University.... The Library has a responsibility to continue to provide the University community with access to material for research and study in the face of increasing acquisition costs and journal cancellations. [6]
In addition to the above, it has long been an established practice at the Information Desks in the Library (or at other points of contact such as ASKLIB, the electronic reference service) to refer individuals to other sources of information and/or resources such as public libraries, bookstores, government agencies, etc. whenever appropriate. The role of the Library in such instances is to provide the necessary information (such as institution/agency name, address, telephone, etc.) whenever possible to enable a person to pursue further inquiry on their own. If this model is followed the role of the Library in the management of newsgroups could be:
Advantages: (b) is essentially cost-free; (a) may be of value for those who do not have access to terminals linked to the campus network.
Concerns: For (a), cost (e.g. space in library, terminals, phone line, and Internet connection). Security (cannot allow anonymous users to post to newsgroups). Little benefit, since most newsgroups would continue to be available through the campus network. For (b), individual Internet access is likely to cost as much as $15-40 per month, and this may prove a difficult burden for many users.
Another possibility is to use WWW sites such as "excite Netsearch" (http://www.excite.com) or "DejaNews Research Service" (http://www.dejanews.com), which offer searching and browsing capabilities on newsgroups.
Advantages: free (at the moment), allows searching by topic.
Concerns: does not carry all groups, and if everyone reads news this way, the traffic will actually go up rather than down.
Advantages: Some small amount of access can be provided for relatively little cost.
Disadvantages: little gain, since many newsgroups would still be provided by the campus news server. Cost of providing terminals, maintaining software, etc. If the campus network is used to provide this service, the traffic might be burdensome.
Why doesn't UW currently import all newsgroups? As stated above, not all new newsgroups are received by ONet. Furthermore, not all new newsgroups are automatically created at UW. (Approximately 5-20 new groups (mostly in the "alt" hierarchy) are created each day, and many of them are short-lived or spurious.) Also, five newsgroups (alt.sex.stories, alt.sex.stories.d, alt.sex.bestiality, alt.sex.bondage, alt.tasteless) are currently banned by order of the Provost [4].
Why not have a committee review new newsgroups and decide which ones to import? One of the virtues of network news is that it allows a way for rapidly-breaking important stories to find a home in specially-created newsgroups. For example, "alt.current-events.la-quake" and "alt.current-events.kobe-quake" provided a forum for discussing recent earthquakes, and putting survivors in touch with relatives. Passing all new newsgroups through a committee would remove this useful aspect of network news. Yet another problem is that the large number of newsgroups, and their rapidly fluctuating content, makes it a difficult and time-consuming task to evaluate each individual newsgroup for its potential benefits to the University community.
Isn't the scope of the problem being exaggerated? No. For example, as mentioned previously, in Spring 1995 the Mathematics Faculty news server was so overloaded that one professor noted that it was taking two days or more for important postings to her class newsgroup to appear.
Won't the external link be saturated with WWW traffic in the near future anyway? Probably; this is a serious concern. However, with the World Wide Web, probably because of its commercial applications, there is currently research being done into more efficient architectures. For example, it is theoretically possible that a sequence of "caching web servers" may evolve at national and regional and local levels, where downloaded objects are stored. When you request an object, the request may be forwarded up the levels until it reaches the caching web server that has recently downloaded that object. Of course, this solution may not come to pass.
Isn't this just a ploy for the administration to remove newsgroups that are potentially embarrassing or deal with sexual topics? No; the group has not advocated the removal of any particular group or hierarchy; such a decision must be made using well-recognized intellectual freedom principles. In particular, we believe that newsgroups should not be deleted on the basis of partisan or doctrinaire disapproval. Furthermore, one of our recommendations is that if more resources become available, then previously unimported groups could be imported in the future. Upcoming budget cuts suggest that the University may be forced to cut back many of its services, and network news is just one area.
Isn't network news mostly junk? Why not just get rid of all of it? News serves two functions: first, many newsgroups provide valuable information for faculty, staff, and students. Second, news provides a forum for communication and debate, and these attributes are in line with the University's mission to seek truth.
Couldn't other groups, such as the Computer Science Club or the Federation of Students, take up the load by buying additional disks? It is not simply a question of disk space. Both processor time and campus bandwidth are seriously affected by the amount of news. Other campus groups could certainly purchase their own Internet service from some Internet Access Provider. To avoid congestion, however, such service would probably not be able to be linked through the campus networks.
control message: a message sent out worldwide to instruct newsgroup servers to create or remove a newsgroup
hierarchy: a collection of newsgroups similar in theme or content. For example, the "sci" hierarchy is devoted to newsgroups about scientific themes, and the "comp.lang" hierarchy is devoted to newsgroups about computer languages.
intellectual freedom: the freedom to hold and express controversial opinions, without fear of censorship based on partisan or doctrinaire disapproval. See, for example, [3].
newsgroup: a distributed computer bulletin board devoted to a single topic
news server: a computer that distributes news to other news servers and/or allows clients to read news. Primary news server: The single machine that receives all news traffic entering UW and redistributes news to secondary news servers (Math, Engineering, admin).
ONet Networking: an Ontario not-for-profit corporation providing Internet access to private-sector companies, public-sector organizations, and educational institutions in the province
selection: the process by which librarians decide what materials to add to their collections
T1: a transmission circuit with a theoretical maximum IP-packet data capacity of 1.536 million bits per second