Downloading Headers

Documentation in progress...

Downloading Headers

Before you can download files from Usenet, you must first download headers to see what is available.

An alternative to downloading headers is to download using NZB files. This is discussed in more detail in the NZB File Processing section of this guide.

In Newsbin the two steps of downloading headers and viewing them run asynchronously, although this is largely invisible to the user. The download process gets the headers from the server and stores them on the local hard disk. The view process loads them from disk so that the user can work on them.
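
The split between the download step and the view step can be pictured with a small sketch. This is only an illustration of the idea, not Newsbin's actual code or file format: the function names `append_headers` and `load_new_headers` and the one-header-per-line spool file are assumptions made for the example.

```python
from typing import List, Tuple


def append_headers(path: str, lines: List[str]) -> None:
    """Download side: append freshly fetched header lines to a spool file on disk."""
    with open(path, "ab") as spool:
        for line in lines:
            spool.write((line + "\n").encode("utf-8"))


def load_new_headers(path: str, offset: int) -> Tuple[List[str], int]:
    """View side: read whatever was appended since `offset`.

    Returns the new lines and the offset to use on the next call.
    """
    with open(path, "rb") as spool:
        spool.seek(offset)
        data = spool.read()
    return data.decode("utf-8").splitlines(), offset + len(data)
```

Because the viewer only remembers how far it has read, the two sides never need to talk to each other directly; the disk file is the hand-off point.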

Downloading Headers for the first time in a group

The first time you visit a group you may well not want to download all the headers, only some of the latest ones. Therefore, the first time you use the Download Latest option in a group, Newsbin limits itself to the most recent headers, with the number specified by the First Time Records value for the group. The First Time Records (FTR) value can be set at the global level as the default, and that default can be overridden by setting a value in the group properties.
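
As a rough illustration of how an FTR limit could translate into a range of headers to request, here is a minimal sketch. It assumes the server reports the lowest and highest article numbers it holds for the group; the function name `first_time_range` and its parameters are hypothetical and are not taken from Newsbin.

```python
from typing import Optional, Tuple


def first_time_range(server_low: int, server_high: int,
                     global_ftr: int,
                     group_ftr: Optional[int] = None) -> Tuple[int, int]:
    """Return the (first, last) article numbers for a first-time download.

    A group-level FTR value, when set, overrides the global default.
    """
    ftr = group_ftr if group_ftr is not None else global_ftr
    # Never ask for anything older than the server still holds.
    first = max(server_low, server_high - ftr + 1)
    return (first, server_high)


# Global default of 100,000 records on a busy group:
print(first_time_range(1000, 5_000_000, 100_000))       # (4900001, 5000000)
# Same group with a group-level override of 500 records:
print(first_time_range(1000, 5_000_000, 100_000, 500))  # (4999501, 5000000)
```

Note that the result is a count of records, not a period of time, which is why the number of days covered varies from group to group.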

Unfortunately there is no obvious relationship between the FTR value you select and the number of days of headers you get, because it depends on the traffic within the group. Some experimentation may be necessary if you are looking for a specific period. Alternatively, you can get all the headers as described later.

If you ever delete all the local headers for a group then Newsbin will follow the same process as it does for the first time you visit a group.

Downloading the Latest Headers in a group

The most common way to download the latest headers for a group is to select the group, right-click, and select Download Latest. The keyboard shortcut CTRL-G can be used as an alternative to the right-click menu option.

It is also possible to start downloading the latest headers by double-clicking the group name in the Groups tab. As well as starting a Download Latest on the group, this will also do the equivalent of a Show posts for the group and open up a Post list to display the headers.

Another way to trigger Download Latest is to use the Update button on the main toolbar or the Update All Groups option on the Groups menu. This will start header downloads for all the groups that are marked as active in the Groups tab.

The final way is to take advantage of the Auto Header facility, which can be set under Options->Switches to run at a frequency that you select. This is equivalent to automatically using the Update All Groups facility. It can be a very useful way of getting the headers downloaded as a background task so that they are already available when you want to view them.

Downloading All the headers in a group

Very occasionally there might be times when you want to get all the headers for a group regardless of whether you have downloaded them before or the value of the First Time Records setting. Examples of reasons why you might use this facility are:

  • You suspect that there may be corruption in the Newsbin spool files used to store previously downloaded headers.

  • The headers have been reset at the server end so that the currently downloaded headers are invalid.

  • You want to get all headers for a group without checking to see what the First Time Records value is set to.

The number of headers retrieved via this option can be very high in the high-traffic multimedia groups if you have a server with high retention, as some of these groups get over a million headers a day. You therefore do not want to use this option casually in such scenarios.

If you select this option any existing headers are first discarded, and then Newsbin will download all available headers ignoring the First Time Records value.

Downloading Headers as a Background Task

Downloading headers can be quite a time-consuming task, particularly for the larger groups. It makes a lot of sense to do this as a background task rather than having to wait for it at the time you actually want to examine the headers in the group. If you tend to leave Newsbin running even when you are not actively interacting with it, this is easily achieved by scheduling the latest headers to be downloaded at regular intervals.

The way to achieve this in Newsbin is to set the Automatic Update settings under Options->Switches. These cause Newsbin to regularly schedule a Download Latest Headers task for all groups marked as active. This is in effect an automated press of the Update button on the main toolbar or the Update All Groups selection from the Groups menu. The interval at which Newsbin should repeat this action is specified by the Update Interval value. If you want the first header update to be scheduled as soon as you start Newsbin, set the Update When Started option; otherwise Newsbin will simply start counting down the interval from that point. Note that Newsbin does not remember how far it was through an interval if you exit and restart: it will restart the interval from the beginning.
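
The effect of the Update Interval and Update When Started settings can be sketched as follows. This is a simplified model, not Newsbin's scheduler: the function name `scheduled_update_times` is hypothetical, times are counted in minutes since Newsbin was started, and it is assumed that after an immediate first update the next one follows one full interval later.

```python
from typing import List


def scheduled_update_times(interval_minutes: int,
                           update_when_started: bool,
                           count: int) -> List[int]:
    """Return the first `count` update times, in minutes after startup."""
    # With Update When Started the first update runs at startup; otherwise
    # the countdown begins at startup and the first update runs one interval later.
    first = 0 if update_when_started else interval_minutes
    return [first + i * interval_minutes for i in range(count)]


print(scheduled_update_times(30, False, 3))  # [30, 60, 90]
print(scheduled_update_times(30, True, 3))   # [0, 30, 60]
```

Since the times are always measured from startup, an exit and restart begins the sequence again from the start, matching the behavior described above.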

Once this has been set up, it tends to make loading and browsing headers a much more pleasant and less time-consuming task.

Stopping a Header Download

If you start a header download and decide you do not want to continue it, take these steps to stop the header download:

  1. Locate the header download task in the Download Tab.
  2. Click on the task so it is highlighted.
  3. Hit the delete key on your keyboard.
  4. For version 5.1 and higher, the header download stops at this point and no further steps are needed.
  5. Prior to version 5.1, the header download task will disappear, but header downloads will continue in the background until the end of the current header download block.
  6. If you can't wait for the end of the block, go to the Connections tab and find the active header download connection (it will say "XOVER" followed by some numbers).
  7. Select the connection, right-click, and select "Kill Connection".

Minimizing Memory Usage while downloading headers

If you want to minimize the use of RAM while downloading headers make sure that you do not have any Post lists open for the groups whose headers you are downloading. If there is no Post list open then the headers are simply written to disk for later processing, so the RAM usage is independent of the number of headers you download. If you have a Post list open for the group then headers are also kept in memory for use in the Post list. In large groups with many millions of headers this can easily result in you running out of RAM.
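
The difference in memory use can be illustrated with a short sketch. It is not Newsbin's code: the function name `store_headers`, the plain-text spool, and the use of a Python list to stand in for an open Post list are all assumptions for the example.

```python
import io
from typing import Iterable, List, Optional, TextIO


def store_headers(headers: Iterable[str], spool: TextIO,
                  open_post_list: Optional[List[str]] = None) -> int:
    """Write each header to the spool; keep a copy in memory only if a Post list is open."""
    count = 0
    for header in headers:
        spool.write(header + "\n")          # always goes to disk
        if open_post_list is not None:
            open_post_list.append(header)   # grows with every header received
        count += 1
    return count


# No Post list open: headers stream straight through to the spool.
spool = io.StringIO()
store_headers((f"header {i}" for i in range(3)), spool)

# Post list open: every header is also held in memory.
post_list: List[str] = []
store_headers((f"header {i}" for i in range(3)), io.StringIO(), post_list)
print(len(post_list))  # 3
```

In the first call only one header is in memory at a time, however many are downloaded; in the second, the in-memory list grows with the size of the group.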

Removing Old Headers

Newsbin will automatically remove any old headers as part of the header download process. The first step in this process is to ask the server what header number range it currently has, and Newsbin uses this information to remove ones that have now expired off the server. This is the recommended way of operating, and the user has to take no action for this to happen as this is built-in behavior. Headers are stored in highly compressed format, so the overhead of possibly storing some old headers you may not access again is minimal.
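
For readers curious about the mechanics, the range check can be sketched like this. An NNTP server answers the GROUP command with a line of the form `211 count low high group`, and any locally stored header numbered below `low` is no longer on the server. The helper names `parse_group_response` and `drop_expired` are hypothetical, and this sketch is not taken from Newsbin itself.

```python
from typing import List, Tuple


def parse_group_response(line: str) -> Tuple[int, int]:
    """Extract (low, high) article numbers from a '211 count low high group' reply."""
    parts = line.split()
    if len(parts) < 4 or parts[0] != "211":
        raise ValueError(f"unexpected GROUP response: {line!r}")
    return int(parts[2]), int(parts[3])


def drop_expired(local_numbers: List[int], server_low: int) -> List[int]:
    """Keep only the local headers the server still holds."""
    return [n for n in local_numbers if n >= server_low]


low, high = parse_group_response("211 1234 3000234 3002322 misc.test")
print(drop_expired([3000000, 3000234, 3001000], low))  # [3000234, 3001000]
```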

The automated process described above is the recommended manner of operating, but for those who want more hands-on control Newsbin does provide facilities for the user to manually remove headers in several ways:

  • The Delete Posts option (keyboard shortcut of SHIFT-DEL) on the Post list right-click menu. Note this really removes them from the locally stored spool files rather than simply marking them as Read which is the more common approach that is adopted by most users.
  • The Post Storage->Delete Stored Posts option on the Group list menu. This will delete all headers for the selected group(s) and reset to the same state as a first time use of the group.
  • The Post Storage->Purge to Global MPA option on the Group list menu. This will delete all headers for the selected group(s) that are older than the Maximum Post Age (MPA) setting under Options->Setup.
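
The age-based purge in the last option amounts to a simple cutoff test. The sketch below only illustrates that idea; the function name `purge_to_max_post_age` and the (subject, posting date) tuple are assumptions and do not reflect how Newsbin stores headers.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Header = Tuple[str, datetime]  # (subject, posting date), hypothetical layout


def purge_to_max_post_age(headers: List[Header], max_post_age_days: int,
                          now: datetime) -> List[Header]:
    """Return only the headers posted within the last `max_post_age_days` days."""
    cutoff = now - timedelta(days=max_post_age_days)
    return [h for h in headers if h[1] >= cutoff]


now = datetime(2024, 1, 31)
headers = [("old post", datetime(2024, 1, 1)), ("new post", datetime(2024, 1, 30))]
print(purge_to_max_post_age(headers, 7, now))
# [('new post', datetime.datetime(2024, 1, 30, 0, 0))]
```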

Geek Notes

The header download algorithm has varied between different releases of Newsbin 5:

  • Version 5.0 to 5.04: headers are downloaded in blocks of 50,000 headers. When you delete the task from the download list, NewsBin finishes downloading the current block of 50,000 headers and then stops. The only other way to stop a header download is to kill the connection. The version 4.x series used to do this, but we were running into problems with people killing connections too often. Some news servers do not release connections fast enough, so people were getting "too many connections" errors when they tried to start downloading again.
  • Version 5.05: NewsBin downloads headers in one large block so killing a header download task will not actually stop a header download, and you must manually do a Kill Connection to stop it. In version 5.05, there is an advanced user configuration called HeaderChop mode which tells NewsBin to download headers in 50K blocks again.
  • Version 5.1: we have reverted to just killing the connection when you kill the header download from the download list. It makes it easier on the user and hopefully won't happen too frequently so the news servers won't have an issue with it.
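
The block-based behavior of versions 5.0 to 5.04 can be sketched as follows. This is an illustration of the general technique only, not NewsBin's source: the names `block_ranges` and `download_in_blocks`, and the `fetch_block` and `cancelled` callbacks, are hypothetical.

```python
from typing import Callable, Iterator, List, Tuple


def block_ranges(first: int, last: int,
                 block_size: int = 50_000) -> Iterator[Tuple[int, int]]:
    """Split the article range first..last into consecutive blocks."""
    start = first
    while start <= last:
        end = min(start + block_size - 1, last)
        yield (start, end)
        start = end + 1


def download_in_blocks(first: int, last: int,
                       fetch_block: Callable[[int, int], None],
                       cancelled: Callable[[], bool],
                       block_size: int = 50_000) -> List[Tuple[int, int]]:
    """Fetch block by block, checking for cancellation only between blocks."""
    done = []
    for low, high in block_ranges(first, last, block_size):
        if cancelled():
            break  # a block already in progress always runs to completion
        fetch_block(low, high)
        done.append((low, high))
    return done


print(list(block_ranges(1, 120_000)))
# [(1, 50000), (50001, 100000), (100001, 120000)]
```

Because the cancel flag is only checked at block boundaries, deleting the task mid-block lets the current block finish, which is the delay described for these versions. Downloading everything as one block, as in version 5.05, leaves no boundary at which to stop.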