July 16, 2021 Roberto Puzzanghera 16 comments
TxRep was designed as an enhanced replacement for the AutoWhitelist (AWL) plugin. Just like AWL, TxRep tracks the scores of messages previously received and adjusts the current message score, either boosting messages from senders who send ham or penalizing senders who have previously sent spam. In effect, some senders are treated as if they were whitelisted and spammers as if they were blacklisted. Each message from a sender updates that sender's historical total score, so a sender regarded as a spammer can lose that status by sending non-spam messages, and a sender regarded as a non-spammer can come to be treated as a spammer by sending messages which appear to be spam. Put simply, TxRep is a score averaging system: it keeps track of the historical average of a sender and pushes any subsequent mail towards that average.
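The averaging idea can be illustrated with a small shell sketch. It is a deliberate simplification and not TxRep's actual formula; the function name adjust_score and the factor 0.5 are assumptions made only for this illustration.

```shell
#!/bin/sh
# Simplified illustration of score averaging (NOT TxRep's real algorithm).
# adjust_score MEAN CURRENT FACTOR
#   MEAN    - hypothetical historical average score of the sender
#   CURRENT - score of the message being scanned
#   FACTOR  - how strongly to pull towards the average (0 = not at all, 1 = fully)
adjust_score() {
    LC_ALL=C awk -v mean="$1" -v cur="$2" -v f="$3" \
        'BEGIN { printf "%.1f\n", cur + (mean - cur) * f }'
}

# A sender with a spammy history (average 10) sends a message scoring 2:
adjust_score 10 2 0.5
# A sender with a clean history (average -1) sends a message scoring 6:
adjust_score -1 6 0.5
```

In the first call the message is pushed up from 2 to 6.0, in the second it is pulled down from 6 to 2.5, which is the "push towards the average" behaviour described above.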
I assume that you already have a "spamassassin" DB and user, as set up on the previous page.
The count column was renamed to msgcount in v. 3.4.3 of spamassassin, so if your txrep table was created with an earlier version you should run this query after the upgrade:
ALTER TABLE `txrep` CHANGE `count` `msgcount` INT(11) NOT NULL DEFAULT '0';
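Which column name applies depends on the installed release. The sketch below, with a hypothetical helper named txrep_count_column, compares a version string against 3.4.3; it assumes a sort(1) that supports -V (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# txrep_count_column VERSION
# Prints the name of the counter column expected by that SpamAssassin
# version: "msgcount" from 3.4.3 onwards, "count" before.
# Assumes a sort(1) that supports -V (version sort), e.g. GNU coreutils.
txrep_count_column() {
    oldest=$(printf '%s\n%s\n' "3.4.3" "$1" | sort -V | head -n 1)
    if [ "$oldest" = "3.4.3" ]; then
        echo msgcount
    else
        echo count
    fi
}

txrep_count_column 3.4.2   # older release, still uses the old name
txrep_count_column 3.4.6   # newer release, needs the renamed column
```

The installed version can be read with spamassassin --version and passed to the function by hand.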
> mysql -h [mysql-IP] -u root -p

USE spamassassin;

CREATE TABLE txrep (
  username varchar(100) NOT NULL default '',
  email varchar(255) NOT NULL default '',
  ip varchar(40) NOT NULL default '',
  msgcount int(11) NOT NULL default '0',
  totscore float NOT NULL default '0',
  signedby varchar(255) NOT NULL default '',
  last_hit timestamp NOT NULL default CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (username,email,signedby,ip),
  KEY last_hit (last_hit)
) ENGINE=InnoDB;

CREATE TABLE bayes_expire (
  id int(11) NOT NULL default '0',
  runtime int(11) NOT NULL default '0',
  KEY bayes_expire_idx1 (id)
) ENGINE=InnoDB;

CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY (variable)
) ENGINE=InnoDB;

INSERT INTO bayes_global_vars VALUES ('VERSION','3');

CREATE TABLE bayes_seen (
  id int(11) NOT NULL default '0',
  msgid varchar(200) binary NOT NULL default '',
  flag char(1) NOT NULL default '',
  PRIMARY KEY (id,msgid)
) ENGINE=InnoDB;

CREATE TABLE bayes_token (
  id int(11) NOT NULL default '0',
  token binary(5) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  atime int(11) NOT NULL default '0',
  PRIMARY KEY (id, token),
  INDEX bayes_token_idx1 (id, atime)
) ENGINE=InnoDB;

CREATE TABLE bayes_vars (
  id int(11) NOT NULL AUTO_INCREMENT,
  username varchar(200) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  token_count int(11) NOT NULL default '0',
  last_expire int(11) NOT NULL default '0',
  last_atime_delta int(11) NOT NULL default '0',
  last_expire_reduce int(11) NOT NULL default '0',
  oldest_token_age int(11) NOT NULL default '2147483647',
  newest_token_age int(11) NOT NULL default '0',
  PRIMARY KEY (id),
  UNIQUE bayes_vars_idx1 (username)
) ENGINE=InnoDB;
Enable TxRep and Bayes by editing the file local.cf:
use_bayes 1
bayes_auto_learn 1
use_txrep 1
txrep_factory Mail::SpamAssassin::SQLBasedAddrList
then edit v341.pre so that the TxRep plugin is loaded:
# TxRep - Reputation database that replaces AWL
loadplugin Mail::SpamAssassin::Plugin::TxRep
and comment out this line in the /etc/mail/spamassassin/v310.pre file:
# loadplugin Mail::SpamAssassin::Plugin::AWL
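To double check which plugins end up enabled, a small helper can list the uncommented loadplugin lines. This is only a sketch: the name active_plugins is made up here, and the example runs against a throwaway file rather than the real .pre files.

```shell
#!/bin/sh
# active_plugins FILE...
# Prints the plugin names from every uncommented "loadplugin" line.
active_plugins() {
    awk '$1 == "loadplugin" { print $2 }' "$@"
}

# Example against a throwaway copy of the two relevant lines
# (the real files live in /etc/mail/spamassassin):
tmp=$(mktemp)
cat > "$tmp" << '__EOF__'
# loadplugin Mail::SpamAssassin::Plugin::AWL
loadplugin Mail::SpamAssassin::Plugin::TxRep
__EOF__
active_plugins "$tmp"
rm -f "$tmp"
```

On a real server the same function can be pointed at /etc/mail/spamassassin/*.pre; TxRep should be listed and AWL should not. Running spamassassin --lint afterwards reports configuration lines that fail to parse.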
Edit the config file /etc/mail/spamassassin/90-sql.cf and add the MySQL login:
cat >> 90-sql.cf << __EOF__
# txrep
txrep_factory Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn DBI:mysql:spamassassin:localhost
user_awl_sql_username spamassassin
user_awl_sql_password SApassword
user_awl_sql_table txrep

# bayesian
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:spamassassin:localhost
bayes_sql_username spamassassin
bayes_sql_password SApassword
__EOF__
You will find that, once your bayesian system has been properly trained, it becomes effective enough to deserve a great deal of confidence, so you may want to increase its spam scores. By default these are 3.5 for a spam probability from 99 to 100% (BAYES_99) and 0.2 for a spam probability from 99.9 to 100% (BAYES_999); the two scores add up when both rules hit. For example you can put this in your local.cf:
score BAYES_99 4.5
score BAYES_999 0.5
In order to test the learning system, save a raw spam message into a file named spam.txt and run sa-learn in this way (supposing that postmaster@yourdomain.tld is the email recipient):
sa-learn --debug --spam -upostmaster@yourdomain.tld spam.txt
The bayesian classifier can only score new messages once it has learned at least 200 spams and 200 hams, so it is time to train our learning system. Prepare a folder containing messages that you are sure are all spam (at least 200 of them) and another containing only ham.
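Before training, it may help to check that both folders are large enough. The following is a minimal sketch, assuming one message per file; the helper names count_msgs and ready_for_bayes are hypothetical, and find's -maxdepth option is assumed to be available (it is in GNU and BSD find).

```shell
#!/bin/sh
# count_msgs DIR
# Counts regular files directly inside DIR (one message per file is assumed).
count_msgs() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# ready_for_bayes SPAM_DIR HAM_DIR
# Succeeds only if both folders hold at least 200 messages each.
ready_for_bayes() {
    [ "$(count_msgs "$1")" -ge 200 ] && [ "$(count_msgs "$2")" -ge 200 ]
}

# spam-directory and ham-directory are the folder names used on this page
if ready_for_bayes spam-directory ham-directory 2>/dev/null; then
    echo "enough messages to train"
else
    echo "collect more messages first"
fi
```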
Then run sa-learn. For a mailbox with spam:
sa-learn --showdots --spam spam-directory/*
For a mailbox with ham:
sa-learn --showdots --ham ham-directory/*
It is important to do both.
Purging the txrep table
The txrep table is going to grow day after day, depending on the traffic on your mail server. Most of the records are single spam events that will rarely produce another hit, so you may decide to clean them out to reduce the volume of data stored in that table and consequently speed up the MySQL queries.
Thus, let's create a script which runs the MySQL query. Modify this example by entering the path of your MySQL executable and the credentials of the spamassassin MySQL account:
cat > /usr/local/bin/txrep_purge.sh << __EOF__
#!/bin/sh
/usr/bin/mysql -u spamassassin -p"sa_pwd" -e "USE spamassassin; DELETE FROM txrep WHERE last_hit <= (now() - INTERVAL 120 day);"
exit 0
__EOF__

chown root:mysql /usr/local/bin/txrep_purge.sh
chmod ug+x /usr/local/bin/txrep_purge.sh
chmod o-rwx /usr/local/bin/txrep_purge.sh
So "spamassassin" is the MySQL user and "sa_pwd" is the password. This account must have privileges on the "spamassassin" DB from the mail server's IP, from the apache server's IP (userprefs via Roundcube) and now also from the MySQL host (localhost). Don't add spaces after -p.
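Before letting the purge run unattended, it can be reassuring to count the rows it would remove. The sketch below only prints SQL text; the helper names preview_sql and purge_sql are hypothetical, and the statements mirror the DELETE used in the purge script above with the age made configurable.

```shell
#!/bin/sh
# preview_sql DAYS: prints a query counting the rows the purge would delete.
# purge_sql DAYS:   prints the DELETE used by the purge script above,
#                   with the age made configurable.
# Both helper names are hypothetical; DAYS must be a whole number.
preview_sql() {
    printf 'SELECT COUNT(*) FROM txrep WHERE last_hit <= (now() - INTERVAL %d day);\n' "$1"
}
purge_sql() {
    printf 'DELETE FROM txrep WHERE last_hit <= (now() - INTERVAL %d day);\n' "$1"
}

preview_sql 120
purge_sql 120
```

The printed statement can then be pasted into a mysql session opened on the spamassassin database.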
Finally edit the crontab
crontab -e
and add a cronjob like this:
#minute hour mday month wday command
1 1 25 * * /usr/local/bin/txrep_purge.sh >> /var/log/cron
Comments
Error SQL bayes
Arturo Blanco July 15, 2021 22:26
Hi!
In a new installation that I just did I find the following problem:
...
....
The database and the table both use utf8mb4_unicode_ci.
Thanks!!
Error SQL bayes
Roberto Puzzanghera Arturo Blanco July 15, 2021 23:45
Hi, try changing the bayes_token.token field from char to binary, as shown here
Let me know if that solves it
Failed to parse line
Gabriel Torres May 27, 2020 05:12
Hi Roberto,
I am getting this error:
Cheers.
Failed to parse line
Roberto Puzzanghera Gabriel Torres May 27, 2020 14:13
Did you comment out this line?
Failed to parse line
Gabriel Torres Roberto Puzzanghera May 28, 2020 01:27
Hi Roberto,
The error is in your guide. Where you have:
it should be
Thanks.
Failed to parse line
Roberto Puzzanghera Gabriel Torres May 28, 2020 10:28
Thank you. Actually I modified that line in my server but forgot to do the same in this page
A small observation
Gabriel Torres May 27, 2020 04:36
This line is located in the v310.pre file. So the text should read:
Is MySQL really required?
Gabriel Torres May 26, 2020 01:49
Hi Roberto,
We have here bayes and TxRep enabled without MySQL, with all data being written to /etc/mail/spamassassin/.spamassassin, since we have no interest in using MySQL as we don't need userpref here.
Do you think the MySQL approach is really necessary?
I'd rather keep things simple here.
Cheers
Is MySQL really required?
Roberto Puzzanghera Gabriel Torres May 26, 2020 15:53
Hi, I chose the mysql approach because I find it easier to purge the database by means of an SQL query, but I think you can get rid of mysql if you do the same with a command line script...
Spamassassin 3.4.3 table column name changed
Tony Fung February 11, 2020 08:06
Hi Roberto,
The column "count" in table "txrep" is renamed to "msgcount" from version 3.4.3. Look into section "TxRep and Awl plugins has been modified..." at https://svn.apache.org/repos/asf/spamassassin/tags/spamassassin_release_3_4_3/UPGRADE.
Please update your guide as underneath when creating new table:
Or modify the table with the following command to upgrade from a prior version:
Otherwise, the following error shall be recorded in spam log:
Spamassassin 3.4.3 table column name changed
Roberto Puzzanghera Tony Fung February 11, 2020 12:03
Thank you, corrected.
cleanup old data
Anonymous November 30, 2011 10:58
By adding
to the awl table definition you can easily spot old entries and delete them.
Anonymous Anonymous December 1, 2011 12:38
Oops, the sql statement for clean up should be
roberto puzzanghera Anonymous November 30, 2011 14:20
yes, that's even better. I'll update this page as soon as possible
AWL and Bayesean
Anonymous January 18, 2011 09:35
hi,
Ended up with this error message:
Userprefs
roberto puzzanghera Anonymous January 18, 2011 18:55
Hi,
it seems like the message was rejected correctly because the sender is blacklisted, as the score is close to 100. Was it rejected?