July 16, 2021 Roberto Puzzanghera
TxRep was designed as an enhanced replacement for the AutoWhitelist (AWL) plugin. Just like AWL, TxRep tracks the scores of messages previously received and adjusts the current message's score, either boosting messages from senders who usually send ham or penalizing senders who have sent spam before. This not only treats some senders as if they were whitelisted but also treats spammers as if they were blacklisted. Each message from a particular sender updates that sender's historical total score, so a sender treated as a spammer can recover by sending non-spam messages, and a sender considered a non-spammer can come to be treated as a spammer by sending messages which appear to be spam. Put simply, TxRep is a score averaging system: it keeps track of the historical average of a sender and pushes any subsequent mail towards that average.
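To make the averaging idea concrete, here is a minimal sketch of how a score could be pulled towards a sender's mean. It is a simplified illustration, not TxRep's actual formula: the helper name `adjust_score` and the `factor` argument (how strongly the score is pulled) are assumptions made for this example. The mean is computed as total score divided by message count, like the `totscore` and `msgcount` columns of the table created later on this page.

```shell
#!/bin/sh
# Hypothetical helper: nudge the current score towards the sender's mean.
# Arguments: totscore msgcount current_score factor
adjust_score() {
  awk -v tot="$1" -v cnt="$2" -v cur="$3" -v f="$4" 'BEGIN {
    # No history for this sender: leave the score untouched.
    if (cnt == 0) { printf "%.2f\n", cur; exit }
    mean = tot / cnt
    printf "%.2f\n", cur + (mean - cur) * f
  }'
}

adjust_score 10 5 8 0.5   # mean 2.0, score 8 is pulled down: prints 5.00
adjust_score -6 3 4 0.5   # mean -2.0, score 4 is pulled down: prints 1.00
```

With a factor of 0 the score would be left as it is, and with a factor of 1 it would be replaced by the sender's mean.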
I assume that you already have a "spamassassin" DB and user, as set up on the previous page.
The count column was renamed to msgcount in v. 3.4.3 of spamassassin, so if you are upgrading an existing table you should run this query after the upgrade:
ALTER TABLE `txrep` CHANGE `count` `msgcount` INT(11) NOT NULL DEFAULT '0';
> mysql -h [mysql-IP] -u root -p

USE spamassassin;

CREATE TABLE txrep (
  username varchar(100) NOT NULL default '',
  email varchar(255) NOT NULL default '',
  ip varchar(40) NOT NULL default '',
  msgcount int(11) NOT NULL default '0',
  totscore float NOT NULL default '0',
  signedby varchar(255) NOT NULL default '',
  last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (username,email,signedby,ip),
  KEY last_hit (last_hit)
) ENGINE=InnoDB;

CREATE TABLE bayes_expire (
  id int(11) NOT NULL default '0',
  runtime int(11) NOT NULL default '0',
  KEY bayes_expire_idx1 (id)
) ENGINE=InnoDB;

CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY (variable)
) ENGINE=InnoDB;

INSERT INTO bayes_global_vars VALUES ('VERSION','3');

CREATE TABLE bayes_seen (
  id int(11) NOT NULL default '0',
  msgid varchar(200) binary NOT NULL default '',
  flag char(1) NOT NULL default '',
  PRIMARY KEY (id,msgid)
) ENGINE=InnoDB;

CREATE TABLE bayes_token (
  id int(11) NOT NULL default '0',
  token binary(5) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  atime int(11) NOT NULL default '0',
  PRIMARY KEY (id, token),
  INDEX bayes_token_idx1 (id, atime)
) ENGINE=InnoDB;

CREATE TABLE bayes_vars (
  id int(11) NOT NULL AUTO_INCREMENT,
  username varchar(200) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  token_count int(11) NOT NULL default '0',
  last_expire int(11) NOT NULL default '0',
  last_atime_delta int(11) NOT NULL default '0',
  last_expire_reduce int(11) NOT NULL default '0',
  oldest_token_age int(11) NOT NULL default '2147483647',
  newest_token_age int(11) NOT NULL default '0',
  PRIMARY KEY (id),
  UNIQUE bayes_vars_idx1 (username)
) ENGINE=InnoDB;
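After running the statements above it can be handy to confirm that all six tables exist. The following is a small sketch, not part of the original procedure: `missing_tables` is a hypothetical helper that reads one table name per line on standard input and prints the expected tables that are absent.

```shell
#!/bin/sh
# Hypothetical helper: read table names (one per line) on stdin and
# print every table from the schema above that is not among them.
missing_tables() {
  present=$(cat)
  for t in txrep bayes_expire bayes_global_vars bayes_seen bayes_token bayes_vars; do
    printf '%s\n' "$present" | grep -qxF "$t" || printf '%s\n' "$t"
  done
}

# Possible usage (assumes the mysql client and the "spamassassin" account;
# -N suppresses the column header in the output):
#   mysql -N -h [mysql-IP] -u spamassassin -p spamassassin -e 'SHOW TABLES' | missing_tables
```

No output means every table was found.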
Enable Bayes and TxRep by editing the file local.cf:

use_bayes 1
bayes_auto_learn 1
use_txrep 1
txrep_factory Mail::SpamAssassin::SQLBasedAddrList

then load the TxRep plugin by editing v341.pre:

# TxRep - Reputation database that replaces AWL
loadplugin Mail::SpamAssassin::Plugin::TxRep

and comment out this line in the /etc/mail/spamassassin/v310.pre file:
# loadplugin Mail::SpamAssassin::Plugin::AWL
Edit the config file /etc/mail/spamassassin/90-sql.cf and add the following:

cat >> 90-sql.cf << __EOF__
# txrep
txrep_factory Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn DBI:mysql:spamassassin:localhost
user_awl_sql_username spamassassin
user_awl_sql_password SApassword
user_awl_sql_table txrep

# bayesian
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:spamassassin:localhost
bayes_sql_username spamassassin
bayes_sql_password SApassword
__EOF__
You will find that, once your Bayesian system has been properly trained, it is effective enough that a great deal of confidence can be placed in it, so you may want to increase its spam scores. These are 3.5 for a spam probability from 99 to 100% and 0.2 for a spam probability from 99.9 to 100%. The two scores are added together, so a message in the top band receives both. For example, you can put this in your configuration:
score BAYES_99 4.5
score BAYES_999 0.5
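Since the two scores add up, a message with a spam probability of at least 99.9% gets 4.5 + 0.5 = 5.0 points from Bayes alone with the values above. As a rough sketch, the total can be read back from a configuration file like this; `bayes_top_band_total` is a hypothetical helper, and it only sees scores set explicitly in the file you pass it, not the built-in defaults.

```shell
#!/bin/sh
# Hypothetical helper: sum the BAYES_99 and BAYES_999 scores set in a .cf file.
bayes_top_band_total() {
  awk '$1 == "score" && ($2 == "BAYES_99" || $2 == "BAYES_999") { sum += $3 }
       END { printf "%.1f\n", sum }' "$1"
}

# Demo on a throwaway file holding the two score lines from this page.
demo_cf=$(mktemp)
printf 'score BAYES_99 4.5\nscore BAYES_999 0.5\n' > "$demo_cf"
bayes_top_band_total "$demo_cf"   # prints 5.0
rm -f "$demo_cf"
```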
In order to test the learning system, save a raw spam message into a file named spam.txt and run sa-learn in this way (supposing that email@example.com is the email recipient):

sa-learn --debug --spam -u email@example.com spam.txt
The Bayesian classifier can only score new messages if it has already learned at least 200 known spams and 200 known hams, so it is time to train our learning system. Prepare a folder containing messages that you are sure are only spam (at least 200 of them) and another one containing only ham.
Then feed them to sa-learn. For a mailbox with spam:
sa-learn --showdots --spam spam-directory/*
For a mailbox with ham:
sa-learn --showdots --ham ham-directory/*
It is important to do both.
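To see whether the 200 spam / 200 ham threshold has been reached, the counters printed by `sa-learn --dump magic` can be inspected. The sketch below is only an illustration: `bayes_ready` is a hypothetical helper, and it assumes that the dump contains lines ending in "non-token data: nspam" and "non-token data: nham" with the count in the third column, so check the output on your own system first.

```shell
#!/bin/sh
# Hypothetical helper: read "sa-learn --dump magic" output on stdin, print
# the learned spam/ham counts and succeed only if both have reached 200.
# Assumption: the count is the third whitespace-separated field.
bayes_ready() {
  awk '
    /non-token data: nspam/ { nspam = $3 }
    /non-token data: nham/  { nham = $3 }
    END {
      printf "nspam=%d nham=%d\n", nspam, nham
      exit !(nspam >= 200 && nham >= 200)
    }'
}

# Possible usage:
#   sa-learn --dump magic | bayes_ready && echo "Bayes has enough data"
```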
The txrep table is going to grow day after day, depending on the traffic on your mail server. Most of the records are single spam events that will rarely produce another hit, so you can decide to clean them out to reduce the volume of data stored in that table and consequently speed up the MySQL queries.
Thus, let's create a script which runs the MySQL query. Modify this example, entering the path to your MySQL executable and the credentials of the spamassassin MySQL account:
cat > /usr/local/bin/txrep_purge.sh << __EOF__
#!/bin/sh
/usr/bin/mysql -u spamassassin -p"sa_pwd" -e "USE spamassassin; DELETE FROM txrep WHERE last_hit <= (now() - INTERVAL 120 day);"
exit 0
__EOF__

chown root:mysql /usr/local/bin/txrep_purge.sh
chmod ug+x /usr/local/bin/txrep_purge.sh
chmod o-rwx /usr/local/bin/txrep_purge.sh
Here "spamassassin" is the MySQL user and "sa_pwd" is the password. This account must have privileges on the "spamassassin" DB from the mail server's IP, from the Apache server's IP (userprefs via Roundcube) and now also from the MySQL host (localhost). Don't add a space after -p.
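Before letting the script delete anything, you may want to know how many rows would be affected. The following sketch builds a SELECT COUNT(*) query with the same condition as the DELETE above; `txrep_purge_preview_sql` is a hypothetical helper and the number of days is its only argument (120 if omitted).

```shell
#!/bin/sh
# Hypothetical helper: print a query that counts the txrep rows the purge
# script would delete for a given age in days (default 120).
txrep_purge_preview_sql() {
  days=${1:-120}
  case $days in
    ''|*[!0-9]*) echo "days must be a non-negative integer" >&2; return 1 ;;
  esac
  printf 'SELECT COUNT(*) FROM txrep WHERE last_hit <= (NOW() - INTERVAL %s DAY);\n' "$days"
}

# Possible usage (-p followed by a space prompts for the password,
# and the last "spamassassin" is the database name):
#   mysql -u spamassassin -p spamassassin -e "$(txrep_purge_preview_sql 120)"
```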
Finally edit the crontab
and add a cronjob like this:
#minute hour mday month wday command
1 1 25 * * /usr/local/bin/txrep_purge.sh >> /var/log/cron