Using POPFile with Courier Email
POPFile is a free, open-source email classification system that can be used to filter SPAM. POPFile is based on Bayesian processing, which means that after a short "training period" the program can "predict" which messages you receive are SPAM.
POPFile can be used as a very sophisticated email classification system, filtering mail into a large number of different categories (or what it refers to as "buckets"). But for our basic purpose of filtering between SPAM and Not SPAM we can use a very simple configuration of the program.
This short guide will help you through the process of installing and configuring POPFile and Courier Email to be used together. For more advanced information on either POPFile's classification abilities or Courier Email's filtering functions, please refer to the respective program's Help files.
Downloading and Installing POPFile
You can download POPFile from http://popfile.sourceforge.net/
During installation you do not need to install any of the "Optional Modules".
After installing you will be prompted that some basic program configuration is required.
First, select the data location for your files. Unless you have a reason to change it, the default should work fine.
In most cases the default settings for ports are the best option. It is also useful to have the program start automatically with Windows.
Now it's time to create the "buckets" POPFile will classify messages into. For the simplest anti-SPAM setup, delete the default buckets other than "spam" and add one named "inbox".
Although POPFile indicates it will automatically configure Courier, this does not appear to work in many cases. You can just skip this or click Next.
It is now time to start POPFile. Choose the system tray icon option, which places a program icon in the tray. Double-clicking that icon opens the browser-based user interface.
Now there are just a couple more settings to verify before proceeding. Go ahead and launch the POPFile user interface by double-clicking on the tray icon.
There are three settings to pay attention to.
Subject Header Modification: This will add [inbox] or [spam] to the beginning of the subject field of all incoming messages. Because Courier can be configured to read the classification from a message header, turn this option Off unless you have a specific reason to use it.
X-Text-Classification Header: This option will add a new X-Text-Classification: header to your incoming messages that we can use to filter out the SPAM using Courier's filters. Make sure this option is On.
X-POPFile-Link Header: This option will add another optional header to your messages that you can use to access any mis-classified messages directly from Courier in order to re-classify them. This is important during the first few days you use the program as you "train" it to recognize what you consider SPAM. Make sure this option is On.
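To make the two headers above more concrete, here is a minimal Python sketch that reads them from a message using the standard library's email module. The sample message, the addresses, and the link value are hypothetical placeholders (the link in particular is not POPFile's actual URL format). Only the two header names come from this guide.

```python
from email import message_from_string

# Hypothetical message as it might look after passing through POPFile.
# The link value is a placeholder, not POPFile's real link format.
RAW_MESSAGE = """\
From: sender@example.com
To: you@example.com
Subject: Limited time offer
X-Text-Classification: spam
X-POPFile-Link: http://127.0.0.1/placeholder

Buy now!
"""


def popfile_headers(raw_text):
    """Return (bucket, link); each is None when its header is missing."""
    msg = message_from_string(raw_text)
    bucket = msg.get("X-Text-Classification")
    link = msg.get("X-POPFile-Link")
    return (
        bucket.strip() if bucket else None,
        link.strip() if link else None,
    )


bucket, link = popfile_headers(RAW_MESSAGE)
print("bucket:", bucket)
print("link:", link)
```

A message that never passed through POPFile simply has neither header, which is why the function returns None for both values in that case.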
Now you are ready to configure Courier to collect mail through POPFile.
Open the account you want to configure by clicking on the Accounts folder in the Courier folder list, and then double-clicking on the desired account in the Summary Pane. Next, click on the Mail Server tab. Basically you just need to change the User ID to the following standard:
and then change the server name to 127.0.0.1 (this is the IP address for "localhost", meaning a server running locally).
So for example, if the existing account settings are like this:
Then the User ID and Server name would be changed to:
Now that Courier is configured to collect mail via POPFile, the last configuration step is to create a filter that looks for POPFile's classification header in order to place the SPAM mail into the JunkYard folder (or the SPAM folder of your choice). The easiest method to do so is to look for messages with the X-Text-Classification: spam header that POPFile adds.
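The decision such a filter makes can be sketched in a few lines of Python. This only illustrates the logic and is not how Courier implements its filters. The function name and the "Inbox" default folder are assumptions made for the example. "JunkYard" and the header name come from this guide.

```python
from email import message_from_string


def target_folder(raw_text, spam_folder="JunkYard", default_folder="Inbox"):
    """Pick a folder name based on POPFile's classification header."""
    msg = message_from_string(raw_text)
    bucket = (msg.get("X-Text-Classification") or "").strip().lower()
    # Only an explicit "spam" classification is diverted. Anything else,
    # including unclassified mail or mail without the header, stays put.
    return spam_folder if bucket == "spam" else default_folder


spam_msg = "Subject: offer\nX-Text-Classification: spam\n\nBuy now!\n"
good_msg = "Subject: lunch\nX-Text-Classification: inbox\n\nNoon?\n"
print(target_folder(spam_msg))
print(target_folder(good_msg))
```

Matching on the header rather than on a modified subject line is what lets the Subject Header Modification option stay Off.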
Remember, you can only assign one filter definition to automatically run for an account, but that definition can have as many different rules in it as you need.
1. (Skip this step if you already have a definition.) Right-click on the Filter icon in the folder list and select "New Definition"; enter a name in the Filter Definition Name field.
2. Click Add to add a new rule to the filter definition.
3. In "What pattern to search for", put an * in the first field, then scroll down to find the "Attachment" header in the second field and check it. The * is a wildcard, so this filter will trigger for any message that has an attachment, because the mail protocol requires a header field named "Attachment" to contain the attachment information.
4. In "What action to take", check "Save Attachment". An Attachment Settings window will open. Select the options you want and choose the folder you want the attachments saved in. For the other settings, we recommend using the "Create a unique name" setting to prevent an existing attachment from being overwritten by another one with the same name. The second setting determines whether the message is kept in the mailbox database, or removed and simply linked to the file in the folder you choose. We recommend using the delete option so that the attachments are removed from the database and only remain in the attachment folder. Click OK when done in this screen.
5. Check the "Continue Filtering" box so that messages will continue to other filters and be sorted to appropriate folders (otherwise, any message with an attachment will stop filtering after triggering this one).
Your "Add Rule" window should look like
(You can filter these messages to any folder, even the Wastebasket. In the example shown we are using the "JunkYard" folder).
6. Click "OK" to save the new rule. With the rule selected, use the "Priority Up" button to move the attachment filter rule to the first spot in the filters.
7. Click "Save" to save the filter definition.
If the filter definition is already assigned to the mail accounts you want to use it for, then you are done. If you need to assign the filter definition to your incoming mail account(s), select "Accounts" in the folder list, and then for each account that you want the definition applied to, right-click, select "Enable Filter" and then choose the filter from the list.
Now that POPFile and Courier are set to work with each other, you'll need to do a little "training" of POPFile. At first, a lot of messages will come in flagged as "Unclassified". Some messages that you consider SPAM may get marked as "inbox" instead. What you need to do is show POPFile how these messages should be classified, and very quickly POPFile will learn and become more and more accurate in how it classifies messages.
To correct the classification of a message, you can right-click on the message in your Summary Pane and select "Reclassify" from the context menu, or select the message and use the Ctrl+G hot key combination. This will open the message in question in the POPFile interface, where you can change the classification to the correct one. Most users find that after a day or two of training POPFile will rapidly approach a better than 98% accuracy level. You can also create something called "magnets" in POPFile to help it always classify messages matching certain criteria as a particular classification (for example, some newsletters that you wish to receive might get flagged as spam, so you can add the sender of the email as a "magnet" that should always be classified as "inbox"). See the POPFile Help system for more information on the program's features.
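For readers curious why a few corrections help so quickly, the following toy Python sketch shows the general idea behind Bayesian training: each corrected message adds its words to the counts for the right bucket, and new messages are scored against those counts. This is a simplified illustration and not POPFile's actual implementation. The class name, the whitespace tokenization, the add-one smoothing, and the rule of returning "unclassified" before any training are all assumptions made for the example.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy word-count naive Bayes classifier with add-one smoothing."""

    def __init__(self, buckets):
        self.buckets = list(buckets)
        self.word_counts = {b: Counter() for b in self.buckets}
        self.message_counts = Counter()

    def train(self, bucket, text):
        """Record one message as belonging to the given bucket."""
        self.word_counts[bucket].update(text.lower().split())
        self.message_counts[bucket] += 1

    def classify(self, text):
        """Return the most likely bucket, or 'unclassified' if untrained."""
        total_msgs = sum(self.message_counts.values())
        if total_msgs == 0:
            return "unclassified"
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        scores = {}
        for b in self.buckets:
            # Smoothed log prior: how common this bucket has been so far.
            score = math.log(
                (self.message_counts[b] + 1) / (total_msgs + len(self.buckets))
            )
            total_words = sum(self.word_counts[b].values())
            for word in text.lower().split():
                # Smoothed log likelihood of each word within this bucket.
                score += math.log(
                    (self.word_counts[b][word] + 1)
                    / (total_words + len(vocab) + 1)
                )
            scores[b] = score
        return max(scores, key=scores.get)


clf = TinyBayes(["inbox", "spam"])
print(clf.classify("buy cheap pills"))   # nothing learned yet
clf.train("inbox", "meeting agenda for monday")
clf.train("spam", "cheap pills buy now")
print(clf.classify("buy cheap pills"))
print(clf.classify("monday meeting"))
```

Each call to train plays the role of one reclassification in the POPFile interface: the more corrected examples each bucket has seen, the more clearly the word counts separate the two.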