Installing and configuring SpamAssassin

May 31, 2025 by Roberto Puzzanghera 62 comments

Info: http://spamassassin.apache.org/
Docs: http://spamassassin.apache.org/full/4.0.x/doc/
Latest version: 4.0.1
Download: http://spamassassin.apache.org/downloads.cgi

SpamAssassin is a mature, widely-deployed open source project that serves as a mail filter to identify Spam. SpamAssassin uses a variety of mechanisms including header and text analysis, Bayesian filtering, DNS blocklists, and collaborative filtering databases. SpamAssassin runs on a server, and filters spam before it reaches your mailbox.

Trouble with the latest `DBD::mysql` module

On Jan 06, 2025 the Perl DBD::mysql module was updated to v4.053 and v5.011. Both of them dropped the support for MariaDB and MySQL > 8. In fact, the v4.053 build exits with

/usr/bin/perl: symbol lookup error: /usr/local/lib64/perl5/auto/DBD/mysql/mysql.so: undefined symbol: mysql_real_escape_string_quote

while v5.011 doesn't compile, as it seems to be incompatible with my openssl-1.1.1zb (I had the same issue on other distros with openssl-1.1 installed).

I'm restoring v4.052 like this:

cpan install DVEEDEN/DBD-mysql-4.052.tar.gz
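
After the reinstall it may be worth confirming which version Perl actually loads. The following is a minimal sketch, assuming perl is on the PATH; the helper names dbd_mysql_version and is_pinned are illustrative and not part of any tool.

```shell
# Print the version of DBD::mysql that perl loads (empty if it fails to load).
dbd_mysql_version() {
    perl -MDBD::mysql -e 'print $DBD::mysql::VERSION' 2>/dev/null
}

# Illustrative helper: succeed only for the version pinned in this guide.
is_pinned() {
    [ "$1" = "4.052" ]
}

v=$(dbd_mysql_version || true)
if is_pinned "$v"; then
    echo "DBD::mysql $v is the pinned version"
else
    echo "DBD::mysql reports '${v:-not loadable}', expected 4.052"
fi
```

If the module fails to load at all, the symbol lookup error shown above is the likely cause.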

Changelog

May 31, 2025
- We are now denying the Validity DNS lists, as Validity imposes low query limits and blocks most of the queries (tx Shailendra Shukla).
May 26, 2024
- SA upgraded to v. 4.0.1
Jun 25, 2023
- The ExtractText notes have been revised and corrected by Gabriel Torres
Jul 14, 2021
- added DCC setup (next page)
- moved the configuration of Razor, Pyzor and Spamcop to a separate page

Upgrading `spamassassin` to version 4.0.x

Detailed info is available on a separate page.

Install

Create the spamd user and group, prepare config and log dirs:

mkdir -p /etc/mail/spamassassin /home/spamd /var/log/spamassassin

groupadd spamd
useradd -g spamd -d /home/spamd spamd
chown -R spamd:spamd /home/spamd
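
If you rerun the setup on a machine where the account may already exist, the commands above can be wrapped so that they only create what is missing. This is just a sketch under that assumption; the function names are mine, and ensure_spamd_account must be run as root.

```shell
# Succeed if the named user already exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Succeed if the named group already exists.
group_exists() {
    getent group "$1" >/dev/null 2>&1
}

# Create the spamd group, user and directories only when they are missing.
# Defined here but not called: invoke it yourself as root.
ensure_spamd_account() {
    group_exists spamd || groupadd spamd
    user_exists spamd || useradd -g spamd -d /home/spamd spamd
    mkdir -p /etc/mail/spamassassin /home/spamd /var/log/spamassassin
    chown -R spamd:spamd /home/spamd
}
```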

Then install spamassassin and all the modules we need with their prerequisites. First let's set CPAN to follow the dependencies without showing us a prompt:

> perl -MCPAN -e shell
cpan> o conf prerequisites_policy follow
cpan> o conf commit

Cut and paste the following in CPAN (the order is important):

notest install Socket6 IO::Socket IO::Socket::INET6 LWP MD5 CPAN::DistnameInfo Mail::DKIM
notest install Test::More MIME::Base64 Digest::MD5 Digest::HMAC_MD5 Net::IP
notest install Net::Ping Net::DNS Time::HiRes Digest::SHA1 Getopt::Long Digest::Nilsimsa URI::Escape HTML::Parser HTTP::Date IO::Zlib Archive::Tar  Mail::SPF
notest install Mail::SPF::Query Net::Ident IO::Socket::SSL Mail::DomainKeys Mail::DKIM LWP::UserAgent HTTP::Date Encode::Detect BSD::Resource
notest install Storable DB_File Net::SMTP BerkeleyDB Class::Method::Modifiers
notest install Geo::IP IO::Socket::IP Net::Patricia
notest install Mail::DMARC::PurePerl MIME::QuotedPrint Compress::Raw::Zlib DBD::SQLite
notest install DVEEDEN/DBD-mysql-4.052.tar.gz DBI

We stuck with version 4.052 of DBD::mysql because the support for MariaDB and MySQL > 8 has been dropped in the later releases. DBD::MariaDB is a fork of DBD::mysql, but unfortunately it's not working yet. I hope it will become the default in the future.

Install spamassassin:

notest install  Mail::SpamAssassin Mail::SpamAssassin::Plugin::Razor2 Mail::SpamAssassin::BayesStore::MySQL

I had to skip the tests (notest) because some of them failed.

We have installed the Razor2 and BayesStore::MySQL plugins in advance, as we'll need them later.
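
Since the tests were skipped, a quick sanity check that the key modules at least load can save some debugging later. A minimal sketch, assuming perl is on the PATH; missing_modules is an illustrative helper and the module list is only a sample of what was installed above.

```shell
# Print every module from the argument list that perl cannot load.
missing_modules() {
    for m in "$@"; do
        perl -M"$m" -e 1 >/dev/null 2>&1 || echo "$m"
    done
}

# A few of the modules installed above; extend the list as needed.
missing=$(missing_modules Mail::SpamAssassin Mail::DKIM Mail::SPF Net::DNS DBI DBD::mysql)
if [ -n "$missing" ]; then
    echo "Modules that fail to load:"
    echo "$missing"
else
    echo "All checked modules load"
fi
```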

Configuring

You can find the config files in /etc/mail/spamassassin

> cd /etc/mail/spamassassin
> ls
init.pre  local.cf  v310.pre  v312.pre  v320.pre  v330.pre v340.pre  v341.pre  v342.pre  v343.pre  v400.pre

local.cf

# Add *****SPAM***** to the Subject header of spam e-mails
# rewrite_header Subject *****SPAM*****
# put here your subnet
trusted_networks 10.0.0.
# Set the threshold at which a message is considered spam (default: 5.0)
required_score 5.0

# denying VALIDITY 
# https://knowledge.validity.com/s/articles/Accessing-Validity-reputation-data-through-DNS 
dns_query_restriction deny bl.score.senderscore.com 
dns_query_restriction deny sa-accredit.habeas.com 
dns_query_restriction deny sa-trusted.bondedsender.org

We are now denying the Validity DNS lists, as Validity imposes low query limits and blocks most of the queries (tx Shailendra Shukla).
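
To double check which zones are actually denied in your local.cf, something like the following can be used. It is only a sketch: denied_zones is an illustrative helper that reads the file and prints every zone listed on an active "dns_query_restriction deny" line, ignoring commented lines.

```shell
# Print the zones denied through dns_query_restriction in the given config file.
denied_zones() {
    awk '$1 == "dns_query_restriction" && $2 == "deny" {
             for (i = 3; i <= NF; i++) print $i
         }' "$1"
}

cf=/etc/mail/spamassassin/local.cf
if [ -r "$cf" ]; then
    denied_zones "$cf"
fi
```

With the local.cf shown above it should list the three Validity zones. Running the lint command from the Testing section afterwards tells you whether the file still parses.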

`ExtractText`

Thanks to Gabriel Torres for reviewing and correcting this section

When enabled, the ExtractText plugin converts attachments (including images, by means of OCR) into plain text, so that SpamAssassin can apply its rules to that text. So if we receive doc/pdf/image attachments with spammy text in them, SpamAssassin will now be able to mark the email as spam.

To do that, some external programs must be installed on the server. The configuration lines added below tell the plugin which program to run for each type of attachment.

Install the required external programs. Debian users can run:

apt-get install antiword
apt-get install docx2txt
apt-get install unrtf
apt-get install odt2txt
apt-get install tesseract-ocr
apt-get install poppler-utils

Slackware users will find all these programs on SlackBuilds.org, while poppler is already available in the distro.

Add the following lines to the 70-extracttext.cf file:

cat > 70-extracttext.cf << EOF
ifplugin Mail::SpamAssassin::Plugin::ExtractText

extracttext_external pdftotext /usr/bin/pdftotext -nopgbrk -layout -enc UTF-8 {} -
extracttext_use pdftotext .pdf application/pdf

# http://docx2txt.sourceforge.net
extracttext_external docx2txt /usr/bin/docx2txt {} -
extracttext_use docx2txt .docx application/docx

extracttext_external antiword /usr/bin/antiword -t -w 0 -m UTF-8.txt {}
extracttext_use antiword .doc application/(?:vnd\.?)?ms-?word.*

extracttext_external unrtf /usr/bin/unrtf --nopict {}
extracttext_use unrtf .doc .rtf application/rtf text/rtf

extracttext_external odt2txt /usr/bin/odt2txt --encoding=UTF-8 {}
extracttext_use odt2txt .odt .ott application/.*?opendocument.*text
extracttext_use odt2txt .sdw .stw application/(?:x-)?soffice application/(?:x-)?starwriter

extracttext_external tesseract {OMP_THREAD_LIMIT=1} /usr/bin/tesseract -c page_separator= {} -
extracttext_use tesseract .jpg .png .bmp .tif .tiff image/(?:jpeg|png|x-ms-bmp|tiff)

add_header all ExtractText-Flags _EXTRACTTEXTFLAGS_

#header PDF_NO_TEXT X-ExtractText-Flags =~ /\bNoText\b/
#describe PDF_NO_TEXT PDF without text
#score PDF_NO_TEXT 0.2

#header DOC_NO_TEXT X-ExtractText-Flags =~ /\bNoText\b/
#describe DOC_NO_TEXT Document without text
#score DOC_NO_TEXT 0.2

#header EXTRACTTEXT exists:X-ExtractText-Flags
#describe EXTRACTTEXT Email processed by extracttext plugin
#score EXTRACTTEXT 0.001

endif
EOF

You can see three rules commented out. You can safely leave them commented out or enable them for debugging purposes. The EXTRACTTEXT rule is just there to prove that the plugin is active. PDF_NO_TEXT and DOC_NO_TEXT will be hit when the attached document contains no text. You will have a header like this when these two rules have been hit:

X-Spam-ExtractText-Flags: NoText
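
The plugin can only call a converter that exists at the configured path, so it may help to check the paths used in 70-extracttext.cf. A minimal sketch, assuming the same /usr/bin paths as above; missing_tools is an illustrative helper name.

```shell
# Print every path from the argument list that is not an executable file.
missing_tools() {
    for p in "$@"; do
        [ -x "$p" ] || echo "$p"
    done
}

# Paths as configured in 70-extracttext.cf above; adjust if your distro differs.
missing_tools /usr/bin/pdftotext /usr/bin/docx2txt /usr/bin/antiword \
              /usr/bin/unrtf /usr/bin/odt2txt /usr/bin/tesseract
```

No output means all six programs were found.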

init.pre

# RelayCountry - add metadata for Bayes learning, marking the countries
# a message was relayed through
#
# Note: This requires the IP::Country::Fast Perl module
#
loadplugin Mail::SpamAssassin::Plugin::RelayCountry

# URIDNSBL - look up URLs found in the message against several DNS
# blocklists.
#
loadplugin Mail::SpamAssassin::Plugin::URIDNSBL

# SPF - perform SPF verification.
#
loadplugin Mail::SpamAssassin::Plugin::SPF

v400.pre

Load the new plugins that come with SA v4:

loadplugin Mail::SpamAssassin::Plugin::ExtractText 
loadplugin Mail::SpamAssassin::Plugin::DecodeShortURLs 
loadplugin Mail::SpamAssassin::Plugin::DMARC
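
With eleven .pre files it is easy to lose track of what is enabled. The sketch below lists the plugins that have an uncommented loadplugin line; active_plugins is an illustrative helper, and the directory is the one used in this guide.

```shell
# Print the plugin names of all uncommented loadplugin lines in the given files.
active_plugins() {
    awk '$1 == "loadplugin" { print $2 }' "$@" | sort -u
}

pre_dir=/etc/mail/spamassassin
if ls "$pre_dir"/*.pre >/dev/null 2>&1; then
    active_plugins "$pre_dir"/*.pre
fi
```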

`sa-update`

sa-update updates the rules (it requires gpg 1.4). Before running spamassassin for the first time, download the rules:

sa-update

Add this line to your crontab to update the rules once a day:

cat >> /etc/cron.d/qmail << EOF
# spamassassin update
30 3 * * * /usr/local/bin/sa-update --nogpg -v &
EOF

The -v option will produce an email notification to postmaster.
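
The cron line above only downloads the rules; a running spamd keeps using the rules it loaded at startup until it is restarted. If you want to restart it only when something changed, the exit code of sa-update can be used. The following is a sketch based on the exit codes described in the sa-update man page; sa_update_status and update_rules are hypothetical names, and the paths are the ones used in this guide.

```shell
# Map an sa-update exit code to a short label (see the sa-update man page).
sa_update_status() {
    case "$1" in
        0) echo "updated" ;;      # update downloaded and installed
        1) echo "no-update" ;;    # no fresh updates available
        2) echo "lint-failed" ;;  # lint check of the site pre files failed
        3) echo "partial" ;;      # some channels updated, others failed
        *) echo "error" ;;        # 4 or higher: download/extract errors
    esac
}

# Hypothetical wrapper: update the rules and restart spamd only if rules changed.
# Defined here but not called: run it as root, e.g. from cron.
update_rules() {
    rc=0
    /usr/local/bin/sa-update --nogpg || rc=$?
    status=$(sa_update_status "$rc")
    echo "sa-update: $status (exit code $rc)"
    case "$status" in
        updated|partial) /usr/local/bin/spamdctl restart ;;
    esac
}
```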

Testing

Run this debug command. If you get no error you are ready to run the daemon.

sudo -u spamd -H spamassassin -D --lint

Make sure spamd is running before the next step, and if you started it in the foreground do not quit it with Ctrl+C, because the next test with spamc has to connect to it.

Open another terminal and check if the headers are inserted:

echo -e "From: myself@mymailserver.net\nTo:myfriend@domain.net\nSubject: test\n\n" | spamc

Received: from localhost by qmail.mymailserver.net
 with SpamAssassin (version 3.3.1);
 Tue, 30 Nov 2010 23:18:37 +0100
From: myself@mymailserver.net
To: myfriend@domain.net
Subject: test
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-04-18) on qmail.mymailserver.net
X-Spam-Flag: YES
X-Spam-Level: *****
X-Spam-Status: Yes, score=5.4 required=5.0 tests=BAYES_99,FREEMAIL_FROM,
 MISSING_DATE,MISSING_MID,NO_RECEIVED,NO_RELAYS,TVD_SPACE_RATIO,
 T_TO_NO_BRKTS_FREEMAIL autolearn=no version=3.3.1
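
A test that does not depend on your Bayes database is the GTUBE string, which SpamAssassin always scores as spam. A minimal sketch, reusing the sample addresses from the test above; gtube_message is an illustrative helper that only prints the message.

```shell
# Print a minimal test message whose body contains the GTUBE test string.
gtube_message() {
    printf '%s\n' \
        'From: myself@mymailserver.net' \
        'To: myfriend@domain.net' \
        'Subject: GTUBE test' \
        '' \
        'XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X'
}

# With spamd running, pipe it through spamc and look at the added headers:
#   gtube_message | spamc | grep '^X-Spam-'
```

If the X-Spam-Flag header is missing here, spamc is most likely not reaching spamd.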

Running spamassassin

Download the startup script

cd /usr/local/bin
wget https://notes.sagredo.eu/files/qmail/spamdctl
chmod +x spamdctl

Set the IP variable with the IPs that are allowed to send queries to spamd:

#!/bin/bash
#
# Spamd init script
#
# August, 2th 2003
# Martin Ostlund, nomicon
# Modified slightly by Troy Belding for Qmailrocks - February 23, 2004
#
# Modified by Roberto Puzzanghera - September 02, 2014
# November 17, 2020: moved log file to /var/log/spamassassin/spamd.log

# Comma separated IPs that are allowed to query the spamd server
IP=127.0.0.1,::1,1.2.3.4,10.0.0.0/24

DAEMON=/usr/local/bin/spamd
NAME=spamd
SNAME=spamdctl
DESC="SpamAssassin Mail Filter Daemon"
LOGFILE=/var/log/spamassassin/spamd.log
PIDFILE="/var/run/$NAME.pid"
PNAME="spamd"
LISTEN_IP=0.0.0.0
# DEBUG="-D" # comment out to disable debug
# USER_PREFS="-q" # Use with -x. Comment out to disable sql user prefs

DOPTIONS="${DEBUG} ${USER_PREFS} -x -u spamd -A ${IP} -i ${LISTEN_IP} -s $LOGFILE -H /home/spamd -d --pidfile=$PIDFILE"

KILL="/bin/kill"
KILLALL="/bin/killall"

# Defaults - don't touch, edit /etc/mail/spamassassin/local.cf
ENABLED=0
OPTIONS=""

set -e

case "$1" in
start)
echo -n "Starting $DESC: "
$DAEMON $OPTIONS $DOPTIONS

echo "$NAME."
;;
stop)
echo -n "Stopping $DESC: "
$KILL -9 `cat $PIDFILE`
/bin/rm $PIDFILE
echo "$NAME."
;;
restart|force-reload)
echo -n "Restarting $DESC: "
$0 stop
$0 start

echo "$NAME."
;;
*)
ME=/usr/local/bin/$SNAME
echo "Usage: $ME {start|stop|restart|force-reload}" >&2
exit 1
;;
esac

exit 0

Now check that spamd is running:

> spamdctl start
> ps axfu
root      1859  0.1  3.4 139360 61044 ?        Ss   19:00   0:01 /usr/bin/spamd -x -u spamd -A 127.0.0.1,<external-IP> -H /home/spamd -d --pidfile=/var/run/spamd.pid
spamd     1860  0.0  3.2 139360 58984 ?        S    19:00   0:00  \_ spamd child
spamd     1861  0.0  3.2 139360 58984 ?        S    19:00   0:00  \_ spamd child
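
Instead of reading the ps output, the pidfile written by the script can be checked directly. This is a sketch, assuming the /var/run/spamd.pid path set in spamdctl; pid_alive is an illustrative helper. Run it as root, since the spamd parent process is owned by root and kill -0 fails on processes of other users.

```shell
# Succeed if the PID stored in the given pidfile belongs to a live process.
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

if pid_alive /var/run/spamd.pid; then
    echo "spamd is running"
else
    echo "spamd is not running"
fi
```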

Type spamd --help to learn how to use spamd. See also http://spamassassin.apache.org/full/3.4.x/doc/spamd.html

Starting spamassassin at boot time

To start spamassassin at boot time, add the startup script to your rc.local:

/usr/local/bin/spamdctl start &

logrotate

Create a file /etc/logrotate.d/spamd like this (Slackware) to rotate your spamd logs daily:

cat > /etc/logrotate.d/spamd << __EOF__
/var/log/spamassassin/spamd.log /var/log/spamassassin/razor-agent.log {
su root apache
rotate 5
daily
missingok
notifempty
delaycompress
postrotate
   [ -f '/var/run/spamd.pid' ] && (kill -HUP \`cat /var/run/spamd.pid\`) || exit 0
endscript
}
__EOF__

Be aware that this file also sets up log rotation for the Razor log file, which we'll see on the next page.
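
Before waiting a day for the first rotation, the new file can be tried with logrotate's debug switch, which prints what would be done without touching the logs. A small sketch; logrotate_dry_run is an illustrative wrapper name.

```shell
# Dry-run a logrotate config: -d only reports what would happen, nothing is rotated.
logrotate_dry_run() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    logrotate -d "$1"
}

# Usage (as root):
#   logrotate_dry_run /etc/logrotate.d/spamd
```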


Comments

SA-Update Rules 01-May-2025

Shailendra Shukla May 2, 2025 16:43 CET

Hi Roberto,

There seem to be a few features added or changed in SA after the rules update on 01-May-2025. The fields below are seen in the spam mails received:

0.0 RCVD_IN_VALIDITY_RPBL_BLOCKED RBL: ADMINISTRATOR NOTICE: The query to Validity was blocked. 
See https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more information.
[88.151.12.31 listed in bl.score.senderscore.com]

After following the link provided in the message, it seems that one has to create a free account on https://my.validity.com/ and register the mail server IP or IP range (CIDR) in order to query their sender reputation data. Below is the text from their website:

Validity provides access through DNS to our sender reputation data, including the Validity Certified Allowlist and the Return Path Blocklist, to allow for use with message filtering. This is commonly accessed through default rules enabled in SpamAssassin or directly through proprietary scripts and applications.

Validity will allow up to 10,000 requests to anonymous users over a 30-day period. If you require the ability to query in larger volumes then a contractual agreement is needed.

Although I have created an account and added my IPs, I just wanted to check with you whether you are aware of this change to SA.

Can you check whether my findings above are correct, and what the optimum SA configuration would be in this case?

Cheers
