Preparare il database SQL in SpamAssassin

10 settembre 2025 by Roberto Puzzanghera 0 commenti

Choosing the right perl DataBase Interface (DBI) module

Depending on whether you use MySQL or MariaDB, and if you use MySQL things depend also on the version, you have to install DBD::mysql or DBD::MariaDB perl DBI module. Other connectors for SQLite or Oracle are not covered here.

Here's a summary of the situation, which has become quite complicated:

DB server DB server version Connector Connector version
MariaDB any DBD::MariaDB latest version
MySQL 8.x or newer DBD::mysql 5.x
MySQL prior to v. 8 DBD::mysql 4.054 or latest of 4.x series

Install the perl module as follows (-T avoids the tests, capitalization is important in module's names):

MariaDB users:

cpan -T install DBD::MariaDB

MySQL with v. 8.x or newer users:

cpan -T install DBD::mysql

MySQL users with version prior to v. 8.x have to call the specific version in this way, otherwise the latest version from 5.x series will be installed:

cpan -T install DVEEDEN/DBD-mysql-4.054.tar.gz

Setting up the database

Enter the mysql prompt as root and create the database and the user:

> mysql -u root -p

CREATE DATABASE spamassassin;
CREATE USER 'spamassassin'@'localhost' IDENTIFIED BY '***';
GRANT USAGE ON * . * TO 'spamassassin'@'localhost' IDENTIFIED BY '***' WITH MAX_QUERIES_PER_HOUR 0 MAX_CONNECTIONS_PER_HOUR 0 MAX_UPDATES_PER_HOUR 0 MAX_USER_CONNECTIONS 0 ;
USE spamassassin;

Setting up the tables' layout

Let's prepare in advance the layout for the tables of the vaious plugins that we are going to install later:

USE spamassassin;

-- userprefs
CREATE TABLE userpref (
  username varchar(100) NOT NULL default '',
  preference varchar(50) NOT NULL default '',
  value varchar(255) NOT NULL default '',
  prefid int(11) NOT NULL auto_increment,
  ts timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  PRIMARY KEY  (prefid),
  KEY username (username)
) ENGINE=InnoDB; 
GRANT ALL PRIVILEGES ON spamassassin . * TO 'spamassassin'@'localhost';
-- The column holding the timestamp can be useful in the future to purge very old records

-- TxRep
CREATE TABLE txrep (
  username varchar(100) NOT NULL default '',
  email varchar(255) NOT NULL default '',
  ip varchar(40) NOT NULL default '',
  msgcount int(11) NOT NULL default '0',
  totscore float NOT NULL default '0',
  signedby varchar(255) NOT NULL default '',
  last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (username,email,signedby,ip),
  KEY last_hit (last_hit)
) ENGINE=InnoDB;

-- Bayes
CREATE TABLE bayes_expire (
  id int(11) NOT NULL default '0',
  runtime int(11) NOT NULL default '0',
  KEY bayes_expire_idx1 (id)
) ENGINE=InnoDB;

CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY  (variable)
) ENGINE=InnoDB;

INSERT INTO bayes_global_vars VALUES ('VERSION','3');

CREATE TABLE bayes_seen (
  id int(11) NOT NULL default '0',
  msgid varchar(200) binary NOT NULL default '',
  flag char(1) NOT NULL default '',
  PRIMARY KEY  (id,msgid)
) ENGINE=InnoDB;

CREATE TABLE bayes_token (
  id int(11) NOT NULL default '0',
  token binary(5) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  atime int(11) NOT NULL default '0',
  PRIMARY KEY  (id, token),
  INDEX bayes_token_idx1 (id, atime)
) ENGINE=InnoDB;

CREATE TABLE bayes_vars (
  id int(11) NOT NULL AUTO_INCREMENT,
  username varchar(200) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  token_count int(11) NOT NULL default '0',
  last_expire int(11) NOT NULL default '0',
  last_atime_delta int(11) NOT NULL default '0',
  last_expire_reduce int(11) NOT NULL default '0',
  oldest_token_age int(11) NOT NULL default '2147483647',
  newest_token_age int(11) NOT NULL default '0',
  PRIMARY KEY  (id),
  UNIQUE bayes_vars_idx1 (username)
) ENGINE=InnoDB;

-- DecodeShortURLs
CREATE TABLE `short_url_cache` (
  `short_url` VARCHAR(255) NOT NULL,
  `decoded_url` VARCHAR(512) NOT NULL,
  `hits` INT NOT NULL DEFAULT 1,
  `created` INT(11) NOT NULL,
  `modified` INT(11) NOT NULL,
  PRIMARY KEY (`short_url`)
) ENGINE = InnoDB;
-- Maintaining index for cleaning is likely more expensive than occasional full table scan
ALTER TABLE `short_url_cache` ADD INDEX `short_url_created` (`created`);

-- Redirectors
CREATE TABLE `redir_url_cache` (
  `redir_url` VARCHAR(255) NOT NULL,
  `target_url` VARCHAR(512) NOT NULL,
  `hits` INT NOT NULL DEFAULT 1,
  `created` INT(11) NOT NULL,
  `modified` INT(11) NOT NULL,
  PRIMARY KEY (`redir_url`)
) ENGINE = InnoDB;
-- Maintaining index for cleaning is likely more expensive than occasional full table scan
ALTER TABLE `redir_url_cache` ADD INDEX `redir_url_created` (`created`);

Configuring DecodeShortURLs and DecodeShortURLs plugins

DecodeShortURLs / UrlShortener automatically expands short URLs (tinyurl, bit.ly, etc.) into canonical URLs and normalizes URLs for scoring. It also has a cache in the database to avoid repeated requests.

Redirectors follows real HTTP redirects (301, 302, meta-refresh, etc.) to get the final destination of a URL. It also caches the result in the database.

The settings for the vaious plugins will be presented in the following pages. Let's configure here in this page just the DecodeShortURLs and Redirectors plugins, as their settings are very simple.

First of all enable them uncommenting the following in the v402.pre file

loadplugin Mail::SpamAssassin::Plugin::Redirectors

and the following in the v400.pre file:

loadplugin Mail::SpamAssassin::Plugin::DecodeShortURLs

Now add their settings in the 90-sql.conf file, which we will use to save all SQL related settings:

# spamassassin MySQL user pwd
MYSQL_PWD=xxxxxxx
# mysql host
MYSQL_HOST=localhost
# perl DBI driver
PERL_DBI="DBD::MariaDB"

cat >> /etc/mail/spamassassin/90-sql.cf << __EOF__
# DecodeShortURLs plugin 
# https://spamassassin.apache.org/full/4.0.x/doc/Mail_SpamAssassin_Plugin_DecodeShortURLs.html 
# https://github.com/apache/spamassassin/blob/trunk/sql/decodeshorturl_mysql.sql 
url_shortener_cache_type        dbi 
url_shortener_cache_dsn         ${PERL_DBI}:database=spamassassin;host=localhost;port=3306
url_shortener_cache_username    spamassassin 
url_shortener_cache_password    ${MYSQL_PWD} 

# Redirectors plugin 
# https://spamassassin.apache.org/full/4.0.x/doc/Mail_SpamAssassin_Plugin_Redirectors.html 
# https://github.com/apache/spamassassin/blob/trunk/sql/redirectors_mysql.sql 
url_redirector_cache_type       dbi 
url_redirector_cache_dsn        ${PERL_DBI}:database=spamassassin;host=localhost;port=3306
url_redirector_cache_username   spamassassin 
url_redirector_cache_password   ${MYSQL_PWD}

__EOF

In the above command, just set your MariaDB/MySQL password for the user spamassassin, and the connector (driver) that you are using in the PERL_DBI variable (DBD::MariaDB or DBD::mysql).

The MYSQL_HOST is usually localhost if your database server lives in the same host of SpamAssassin.

Purging the cache

Create a file /usr/local/bin/purge_sa_cache.sh in order to purge the records older than one year from the cache:

#!/bin/bash

DB_USER="spamassassin"
DB_PASS="password"
DB_NAME="spamassassin"
DB_HOST="localhost"
DB_PORT="3306"
MYSQL_BIN=/usr/bin/mysql
# purge Redirectors $MYSQL_BIN -u"$DB_USER" -p"$DB_PASS" -h "$DB_HOST" -P "$DB_PORT" "$DB_NAME" << __EOF__ DELETE FROM redir_url_cache WHERE created < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 YEAR));
# purge DecodeShortURLs DELETE FROM short_url_cache WHERE created < UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 1 YEAR)); __EOF__

Adjust the privileges

chmod +x /usr/local/bin/purge_sa_cache.sh
chmod go-wrx /usr/local/bin/purge_sa_cache.sh

Sdd to the crontab:

cat >> /etc/cron.d/qmail << __EOF__

# purge SpamAssassin's cache
0 3 * * 0 root /usr/local/bin/purge_sa_cache.sh >> /var/log/cron 2>&1
__EOF__

Aggiungi un commento

qmail notes

Pay me a coffee:

PayPal - The safer, easier way to pay online.

LXC scripts
Other contents
Guide per gli utenti