Using POPFile with Courier Email
POPFile is a free, open-source email classification system that can be
used to filter SPAM. POPFile is based on Bayesian processing, which
means that after a short "training period" the program can "predict" which
messages you receive are SPAM.
POPFile can be used as a very
sophisticated email classification system, filtering mail into a large number of
different categories (or what it refers to as "buckets"). But for our
basic purpose of filtering between SPAM and Not SPAM we can use a very simple
configuration of the program.
This short guide will help you through the
process of installing and configuring POPFile and Courier Email to be used
together. For more advanced information on either POPFile's classification
abilities or Courier Email's filtering functions, please refer to the respective
program's Help files.
Downloading and Installing POPFile
You can download POPFile from http://popfile.sourceforge.net/
During installation you do not need to install any of the "Optional Modules".
After installing, you will be told that some basic program configuration is required.
First, select the data location for your files. Unless you have a reason
to change it, the default should work fine.
In most cases the default settings for ports are the best option. It is also
useful to have the program start automatically with Windows.
Now it's time to create the "buckets" POPFile will classify messages into. For
the simplest anti-SPAM setup, delete the default buckets other than "spam" and
add one named "inbox".
Although POPFile indicates it will automatically configure Courier, this does
not appear to work in many cases. You can just skip this step or click Next.
It is now time to start POPFile. Choose the system tray icon option, which
places a program icon in the tray. Double-clicking that icon opens the user
interface (web browser based).
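If the user interface does not open, one quick check is whether POPFile is actually listening on its ports. The sketch below is only an illustration: it assumes the commonly used defaults of port 110 for the POP3 proxy and port 8080 for the web interface, neither of which is stated in this guide, so substitute whatever ports you chose during setup.

```python
import socket

# Assumed defaults, not stated in this guide: POPFile's POP3 proxy on port 110
# and its web interface on port 8080. Adjust to match your installation.
POPFILE_HOST = "127.0.0.1"
POP3_PROXY_PORT = 110
UI_PORT = 8080


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    checks = (("POP3 proxy", POP3_PROXY_PORT), ("web interface", UI_PORT))
    for name, port in checks:
        state = "listening" if is_port_open(POPFILE_HOST, port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A "not reachable" result usually means POPFile is not running or was configured with different ports.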
Now there are just a couple more settings to verify before proceeding. Go ahead
and launch the POPFile user interface by double-clicking the tray icon.
There are three settings to pay attention to.
Subject Header Modification: This option adds [inbox] or [spam] to the
beginning of the subject field of all incoming messages. Since you can
configure Courier to look for message classification information in a message
header instead, turn this option Off unless you have a specific reason for
using it.
X-Text-Classification Header: This option adds a new X-Text-Classification:
header to your incoming messages that we can use to filter out the SPAM using
Courier's filters. Make sure this option is On.
X-POPFile-Link Header: This option adds another header to your messages that
you can use to open any misclassified message directly from Courier in order to
reclassify it. This is important during the first few days you use the
program as you "train" it to recognize what you consider SPAM. Make sure
this option is On.
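To see what the X-Text-Classification header gives a mail filter to work with, here is a minimal sketch using Python's standard email module. The sample message and its addresses are invented for illustration; only the header name comes from the settings described above.

```python
from email import message_from_string

# Sample message invented for illustration. Only the X-Text-Classification
# header name is taken from the POPFile settings described in this guide.
RAW_MESSAGE = """\
From: sender@example.com
To: you@example.com
Subject: Limited time offer
X-Text-Classification: spam

Buy now!
"""


def classification_of(raw: str, default: str = "unclassified") -> str:
    """Return the bucket named in X-Text-Classification, lowercased."""
    msg = message_from_string(raw)
    value = msg.get("X-Text-Classification")
    return value.strip().lower() if value else default


print(classification_of(RAW_MESSAGE))  # prints "spam"
```

A message without the header falls back to the default value, which is why the header option needs to be On before any filter can rely on it.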
Configuring Courier
Now you are ready to configure Courier to collect mail through POPFile.
Open the account you want to configure by clicking on the Accounts folder in
the Courier folder list, and then double-clicking on the desired account in the
Summary Pane. Next, click on the Mail Server tab. You just need to change the
User ID to the following format:
servername:username
and then change the server name to 127.0.0.1 (this is the IP address for
"localhost", meaning a server running locally).
So, for example, if the existing account settings are like this:
then the User ID and Server name would be changed to:
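As a purely hypothetical illustration of the servername:username format, the sketch below builds the two values to enter on the Mail Server tab. The server pop.example.com and the user jsmith are made-up placeholders, and the helper name is ours, not part of POPFile or Courier.

```python
from typing import Tuple

# 127.0.0.1 is the local address POPFile listens on, as described above.
POPFILE_ADDRESS = "127.0.0.1"


def popfile_login(server: str, username: str) -> Tuple[str, str]:
    """Return (user_id, server_name) to enter in the mail client.

    The real mail server moves into the User ID, and the server name
    field points at POPFile running on the local machine.
    """
    return f"{server}:{username}", POPFILE_ADDRESS


# Placeholder values for illustration only.
user_id, server_name = popfile_login("pop.example.com", "jsmith")
print(user_id)      # prints "pop.example.com:jsmith"
print(server_name)  # prints "127.0.0.1"
```

POPFile uses the part before the colon to know which real server to contact on your behalf, which is why the original server name must not be lost when you change the Server name field.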
Now that Courier is configured to collect mail via POPFile, the last
configuration step is to create a filter that looks for POPFile's
classification header in order to place the SPAM mail into the JunkYard folder
(or the SPAM folder of your choice). The easiest way to do so is to look for
messages with the X-Text-Classification: spam header that POPFile adds.
Remember, you can only assign one filter definition to run automatically for an
account, but that definition can have as many different rules in it as you need.
1. (Skip this step if you already have a definition.) Right-click on the Filter icon in the folder list and select "New Definition"; enter a name in the Filter Definition Name field.
2. Click Add to add a new rule to the filter definition.
3. In "What pattern to search for", put an * in the first field, and then scroll down to find the "Attachment" header in the second field and check it. The * is a wildcard, so this filter will trigger for any message that has an attachment, because the mail protocol requires a header field named "Attachment" to contain the attachment information.
4. In "What action to take", check "Save Attachment". An Attachment Settings window will open. Select the options you want and choose the folder you want the attachments saved in. For the other settings, we recommend using the "Create a unique name" setting to prevent an existing attachment from being overwritten by another one with the same name. The second setting determines whether the message is kept in the mailbox database, or removed and simply linked to the file in the folder you choose. We recommend using the delete option so that the attachments are removed from the database and remain only in the attachment folder. Click OK when you are done with this screen.
5. Check the "Continue Filtering" box so that messages will continue to other filters and be sorted to appropriate folders (otherwise, any message with an attachment will stop filtering after triggering this one).
Your "Add Rule" window should look like this:
(You can filter these messages to any folder, even the Wastebasket. In the example shown we are using the "JunkYard" folder.)
6. Click "OK" to save the new rule. With the rule selected, use the "Priority Up" button to move the attachment filter rule to the first spot in the filters.
7. Click "Save" to save the filter definition.
If the filter definition is already assigned to the mail accounts you want to
use it for, then you are done. If you need to assign the filter definition to
your incoming mail account(s), select "Accounts" in the folder list, and then
for each account that you want the definition applied to, right-click, select
"Enable Filter" and then choose the filter from the list.
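The idea of one filter definition holding several prioritized rules, each of which may or may not continue filtering, can be modeled in a few lines. This is a hypothetical model written for this guide, not Courier's actual filter engine: the Rule class, the apply_rules function and the "Inbox" default folder are all our own names. Only the X-Text-Classification header and the JunkYard folder come from the text above.

```python
from dataclasses import dataclass
from email import message_from_string
from email.message import Message
from typing import Callable, List, Optional


@dataclass
class Rule:
    """One rule in a filter definition (hypothetical model)."""
    name: str
    matches: Callable[[Message], bool]
    folder: Optional[str]             # None: the rule does not move the message
    continue_filtering: bool = False  # mirrors the "Continue Filtering" box


def apply_rules(msg: Message, rules: List[Rule], default: str = "Inbox") -> str:
    """Run rules in priority order and return the destination folder."""
    folder = default
    for rule in rules:
        if rule.matches(msg):
            if rule.folder is not None:
                folder = rule.folder
            if not rule.continue_filtering:
                break  # a matching rule ends filtering unless told to continue
    return folder


def is_spam(msg: Message) -> bool:
    """True when POPFile has tagged the message as spam."""
    return (msg.get("X-Text-Classification") or "").strip().lower() == "spam"


RULES = [Rule("POPFile spam", is_spam, folder="JunkYard")]

spam = message_from_string("Subject: offer\nX-Text-Classification: spam\n\nBuy now!")
print(apply_rules(spam, RULES))  # prints "JunkYard"
```

Because rules run in order, a rule placed earlier in the list gets the first chance to match, which is what the "Priority Up" button controls.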
Training POPFile
Now that POPFile and Courier are set to work with each other, you'll need to do
a little "training" of POPFile. At first, a lot of messages will come in
flagged as "Unclassified". Some messages that you consider SPAM may get marked
as "inbox" instead. What you need to do is show POPFile how these messages
should be classified, and very quickly POPFile will learn and become
more and more accurate in how it classifies messages.
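To give a feel for why a little training goes a long way, here is a toy naive Bayes classifier. It is a simplified sketch of the general Bayesian technique, not POPFile's actual implementation, and the class name, tokenizer and sample messages are all invented for illustration.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy naive Bayes text classifier with add-one smoothing."""

    def __init__(self, buckets):
        self.word_counts = {b: Counter() for b in buckets}
        self.message_counts = {b: 0 for b in buckets}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, bucket, text):
        """Record the words of one message as belonging to a bucket."""
        self.word_counts[bucket].update(self.tokenize(text))
        self.message_counts[bucket] += 1

    def classify(self, text):
        """Return the most probable bucket, or "unclassified" if untrained."""
        total = sum(self.message_counts.values())
        if total == 0:
            return "unclassified"
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        best, best_score = "unclassified", -math.inf
        for bucket, counts in self.word_counts.items():
            if self.message_counts[bucket] == 0:
                continue  # nothing learned for this bucket yet
            # log prior + sum of smoothed log likelihoods for each word
            score = math.log(self.message_counts[bucket] / total)
            denom = sum(counts.values()) + len(vocab)
            for word in self.tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best, best_score = bucket, score
        return best


clf = TinyBayes(["inbox", "spam"])
print(clf.classify("buy cheap pills"))    # prints "unclassified"
clf.train("spam", "cheap pills buy now")
clf.train("inbox", "meeting agenda for monday")
print(clf.classify("buy cheap pills"))    # prints "spam"
print(clf.classify("meeting on monday"))  # prints "inbox"
```

Each correction you make plays the role of a train call here: it shifts the word counts, so later messages with similar words land in the right bucket.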
To correct the classification of a message, you can right-click on the message
in your Summary Pane and select "Reclassify" from the context menu, or select
the message and use the Ctrl+G hot key combination. This will open the message
in question in the POPFile interface, where you can change the classification
to the correct one. Most users find that after a day or two of training,
POPFile approaches an accuracy level better than 98%. You can also create
something called "magnets" in POPFile to make it always classify messages
matching certain criteria into a particular bucket. For example, some
newsletters that you wish to receive might get flagged as spam, so you can
add the sender of the email as a "magnet" that should always be classified as
"inbox". See the POPFile Help system for more information on the program's
features.
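Conceptually, a magnet is a fixed rule that is checked before any statistical classification takes place. The sketch below illustrates that idea only. The magnet list, the newsletter address and the function name are hypothetical, and POPFile's own magnet handling may differ in its details.

```python
from email import message_from_string

# Hypothetical magnets: (header to inspect, text to look for, bucket to force).
MAGNETS = [("From", "newsletter@example.com", "inbox")]


def magnet_bucket(raw, magnets=MAGNETS):
    """Return the bucket forced by the first matching magnet, else None."""
    msg = message_from_string(raw)
    for header, needle, bucket in magnets:
        if needle.lower() in (msg.get(header) or "").lower():
            return bucket
    return None  # no magnet applies; normal classification would decide


newsletter = "From: Weekly News <newsletter@example.com>\nSubject: Big sale\n\nDeals!"
print(magnet_bucket(newsletter))  # prints "inbox"
```

Messages that match no magnet return None, meaning the trained classifier makes the decision as usual.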