by lunarg on April 23rd 2007, at 23:33
5 pages

Maintaining your database

Step 6 :: managing your database

Spam changes all the time, and what has been learned can quickly become obsolete. DSPAM counters this by letting you retrain mails at any time. However, old learned data that no longer applies still remains in the database. Not only is this unnecessary, it also makes the database grow, which makes it sluggish and wastes disk space.
For this reason, we need to occasionally clean out the database and purge old tokens and signatures.

The SQL storage drivers offer two ways to do this: running the dspam_clean application, or sourcing the proper SQL file into the database.
Neither of these methods applies to the hash driver we're using. Luckily, other tools are available to help us there.

Phase 1: use cssclean to remove stale tokens from the database

The hash database can be cleaned out quickly by issuing the simple command:

$ cssclean ~/.dspam/username.css

This automatically cleans out all obsolete tokens (i.e. tokens that haven't had any hits in quite a while).

Phase 2: use csscompress to compress the database

Afterwards, we can issue the next command:

$ csscompress ~/.dspam/username.css

This command compresses the hash database by rearranging it and removing gaps in the various extents of the hash file, effectively making the database smaller.

Do note that this might not have a visible effect on the size of your database. Also, the database will never become smaller than the size of a single extent; so if the amount of data is smaller than a single extent, this command doesn't do anything.
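To see whether the two commands actually changed anything, we can compare the file size before and after. The following is a minimal sketch, not part of DSPAM itself: css_size and clean_css are hypothetical helper names, and it assumes that cssclean and csscompress are in your PATH and exit with a non-zero status on failure.

```shell
#!/bin/sh
# Sketch: run cssclean and csscompress on one hash database and report the
# file size before and after. css_size and clean_css are hypothetical names.

# Print the size of a file in bytes (wc -c is portable across systems).
css_size() {
    wc -c < "$1" | tr -d ' '
}

clean_css() {
    css=$1
    [ -f "$css" ] || { echo "no such file: $css" >&2; return 1; }

    before=$(css_size "$css")
    # Assumption: both tools return non-zero when they fail.
    cssclean "$css" || return 1
    csscompress "$css" || return 1
    after=$(css_size "$css")

    echo "$css: $before -> $after bytes"
}
```

After sourcing the file, calling clean_css ~/.dspam/username.css prints one line with both sizes. If the two numbers are equal, the data most likely already fits in a single extent.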

Phase 3: purging email signature files (.sig) older than x days

What the hash driver lacks (compared to the SQL storage engines) is the ability to clean out signature data of old (and obsolete) emails. These signatures are stored as individual files in your profile, in the ~/.dspam/username.sig/ directory. A simple find command can do wonders here as well:

$ find ~/.dspam/username.sig/ -iname "*.sig" -mtime +14 -exec rm {} \;

The command above removes all signature files that were last modified more than 14 days ago. This is an effective method to rid your profile (and file system) of signatures of older mails, which you no longer need. In fact, you only need these files if you want to retrain a certain email. Presumably, all mails older than 14 days already have the correct tag in the database, so it's safe to remove those files.
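Before deleting anything, it can be reassuring to look at what the find command would match. The sketch below wraps the same search in a small function that only prints the matching files; list_old_sigs is a hypothetical name, and the 14-day default simply mirrors the example above.

```shell
#!/bin/sh
# list_old_sigs is a hypothetical helper: it prints the signature files that
# the find command above would delete, without removing anything.
list_old_sigs() {
    sig_dir=$1
    days=${2:-14}

    [ -d "$sig_dir" ] || { echo "no such directory: $sig_dir" >&2; return 1; }

    # Same match as the removal command, but -print instead of -exec rm.
    find "$sig_dir" -iname "*.sig" -mtime +"$days" -print
}
```

For example, list_old_sigs ~/.dspam/username.sig 14 | wc -l counts the files that would go. If the list looks right, run the removal command itself.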

Depending on your email traffic, performing these steps once every few weeks keeps your database in good shape.
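Since these steps have to be repeated, it may be convenient to bundle them. The following is only a sketch under a few assumptions: dspam_maintain is a hypothetical function name, the profile lives in ~/.dspam as in the examples above, and cssclean and csscompress are in your PATH.

```shell
#!/bin/sh
# Sketch of a periodic maintenance routine for one DSPAM hash profile.
# dspam_maintain is a hypothetical name; paths follow the layout used above.
dspam_maintain() {
    user=$1
    home_dir=${2:-$HOME/.dspam}
    days=${3:-14}

    css="$home_dir/$user.css"
    sigs="$home_dir/$user.sig"

    # Phases 1 and 2: drop stale tokens, then compact the database.
    if [ -f "$css" ]; then
        cssclean "$css" && csscompress "$css"
    fi

    # Phase 3: remove signature files older than $days days.
    if [ -d "$sigs" ]; then
        find "$sigs" -type f -iname "*.sig" -mtime +"$days" -exec rm -f {} +
    fi
}
```

Saved in a script that ends with a call such as dspam_maintain username, this could be started from cron, for instance once a month. How often you run it is a matter of taste and mail volume.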

 
 