Now that we have the spam filters in place, we have to train our Bayesian system and report our spam to Razor, Pyzor and Spamcop.
The obvious approach at this point would be to call sa-learn and spamassassin --report in sequence when the user clicks the "Mark as Junk" button in the Roundcube webmail (look at the cmd_learn and multi_driver drivers of the markasjunk plugin), but this option has a couple of downsides:
- the learning process, the resulting journal syncing and the connections to several filtering networks take up to 10 seconds, which is longer than our users are willing to wait.
- even worse, what they mark with the "Mark as Junk" button is not always real spam. Think, for example, of the regular newsletters that they no longer want to read and that they conveniently label as spam instead of unsubscribing in the proper way.
Therefore it is better to run these two tasks from a cronjob every night (which solves the first issue), processing the messages stored in folders where the users have copied only real spam or ham messages (which solves the second issue as well).
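The nightly approach described above can be sketched roughly as follows. This is only an illustration of the idea, not the script used later on this page: the function names teach_folder and run and the DRY_RUN switch are assumptions made for the example. It relies on sa-learn --spam and sa-learn --ham accepting a directory of messages, and on spamassassin --report reading a single message from standard input.

```shell
# Sketch: feed one "teach" folder to the Bayes learner and, for spam only,
# report each message to the collaborative networks.
# DRY_RUN=1 (hypothetical switch) prints the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# teach_folder <spam|ham> <maildir folder>
teach_folder() {
    kind=$1
    folder=$2
    for sub in cur new; do
        dir="$folder/$sub"
        [ -d "$dir" ] || continue

        # Train the Bayes database on every message in this directory
        run sa-learn "--$kind" "$dir"

        # Ham is only learned, never reported
        [ "$kind" = "spam" ] || continue

        for msg in "$dir"/*; do
            [ -f "$msg" ] || continue
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "spamassassin --report < $msg"
            else
                spamassassin --report < "$msg"
            fi
        done
    done
}
```

With DRY_RUN=1 set, calling teach_folder spam /path/to/Maildir/.Junk.TeachSpam (the path is a placeholder) only lists what would be learned and reported, which is a safe way to try the idea before touching the Bayes database.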
Creating the "Teach" mailboxes
When you configured dovecot, you prepared the autocreation of the TeachSpam and TeachNotSpam mailboxes as children of Junk. If this is not a fresh installation, or you configured dovecot some time ago, check your 15-mailboxes.conf file:
mailbox "Junk.TeachSpam" {
  auto = subscribe
  autoexpunge = 5d
}
mailbox "Junk.TeachNotSpam" {
  auto = subscribe
  autoexpunge = 30d
}
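A quick way to confirm that both definitions are present is to grep the file. The sketch below is only a convenience check: the helper name has_teach_mailboxes is made up for this example, and the path in the usage comment (/etc/dovecot/conf.d/15-mailboxes.conf) is the usual location but may differ on your system.

```shell
# has_teach_mailboxes <config file>
# Succeeds only if both teach mailboxes are declared in the given file.
# -F makes grep treat the pattern as a fixed string, so the dot is literal.
has_teach_mailboxes() {
    conf=$1
    grep -qF 'mailbox "Junk.TeachSpam"' "$conf" &&
    grep -qF 'mailbox "Junk.TeachNotSpam"' "$conf"
}

# Usage (path is an assumption, adjust to your installation):
#   has_teach_mailboxes /etc/dovecot/conf.d/15-mailboxes.conf \
#       && echo "teach mailboxes are configured"
```

This only checks the configuration text. To see whether the mailboxes actually exist for a given account, doveadm mailbox list -u username@domain.tld should list them once the user has logged in.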
Cronjob setup
Now download my shell script:
wget -O /usr/local/bin/sa_cron.sh https://notes.sagredo.eu/files/qmail/sa_cron.sh
chmod +x /usr/local/bin/sa_cron.sh
and set up a cronjob to run it every night, for example:
45 2 * * * /usr/local/bin/sa_cron.sh >> /var/log/cron
If you run it with no arguments, the script will do the job for all users having the .Junk.TeachSpam and .Junk.TeachNotSpam mailboxes in their Maildirs.
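To preview which accounts would be picked up, you can look for Maildirs that contain both folders. This is a rough sketch rather than the logic of sa_cron.sh itself: list_teach_maildirs is a hypothetical helper name, and the base directory in the usage comment assumes a vpopmail-style layout that may not match your setup.

```shell
# list_teach_maildirs <base dir>
# Prints every Maildir below <base dir> that has both teach folders.
list_teach_maildirs() {
    base=$1
    find "$base" -type d -name '.Junk.TeachSpam' 2>/dev/null |
    while IFS= read -r spam_dir; do
        # The teach folders sit directly inside the user's Maildir
        maildir=$(dirname "$spam_dir")
        if [ -d "$maildir/.Junk.TeachNotSpam" ]; then
            echo "$maildir"
        fi
    done
}

# Usage (base directory is an assumption):
#   list_teach_maildirs /home/vpopmail/domains
```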
If you want to test it for a single admin user you can run it in the following way:
sa_cron.sh username@domain.tld
Edit the script and set DELETE_TEACH_DATA=1 if you want to delete the messages after they have been processed. I commented out the line which deletes the messages in the TeachNotSpam mailbox because I'm not sure that deleting the ham messages is a good idea.
Set DEBUG=1 to run sa-learn and spamassassin in debug mode, so that the logs will show everything.
Logrotate
Set up logrotate for the above log files:
cat > /etc/logrotate.d/spam_reports << __EOF__
/var/log/spamassassin/spamassassin.log /var/log/spamassassin/sa_learn.log {
  su root apache
  rotate 5
  daily
  missingok
  notifempty
  delaycompress
  create 664 root apache
  sharedscripts
}
__EOF__
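Before waiting for the first rotation, you may want to dry-run the new configuration. The sketch below wraps logrotate -d, which runs in debug mode and makes no changes to the logs or to the state file; the helper name check_logrotate_conf and its return codes are assumptions made for this example.

```shell
# check_logrotate_conf <config file>
# Returns 1 if the file is missing, 2 if logrotate is not installed,
# otherwise the exit status of the logrotate dry run.
check_logrotate_conf() {
    conf=$1
    if [ ! -f "$conf" ]; then
        echo "missing: $conf" >&2
        return 1
    fi
    if ! command -v logrotate >/dev/null 2>&1; then
        echo "logrotate not found in PATH" >&2
        return 2
    fi
    # -d: debug mode, parse the file and report what would be rotated
    logrotate -d "$conf"
}

# Usage:
#   check_logrotate_conf /etc/logrotate.d/spam_reports
```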
Comments
log file duplication?
Robert February 27, 2022 10:11 CET
Hi,
In the installation step a log is setup for spamd and I don't know if the /var/log/spamassassin/spamassassin.log setup here has to be separate or if it can point to the other one?
log file duplication?
Roberto Puzzanghera Robert February 27, 2022 10:16 CET
it can be the same, but I prefer to separate the log of spamd from these ones
Edit: in that case, you have to set the log file inside the script