send more spam please
Here's my thesis: heuristic and algorithmic filters work more effectively when they have more data. They learn patterns more quickly and accurately. Therefore, with a higher volume of spam hitting my spam filters, and more patterns to detect, fewer spam messages will leak through them into my inbox. I mean fewer in an absolute sense, despite the increase in the number of spam messages.
If spam filter rules weren't heuristic, just hard-coded rules, then I'd understand that more spam requires more work constructing more rules. But spam filters are smarter than this. They learn.
For example, if I receive 200 spams and 10 intentional or desired messages, let's say the spam filter stops 190 spams. I receive 20 messages, 10 of them undesired. But if I receive 2000 spams and 10 desired messages, the filter will increase its efficacy, filtering 1995 spams. Essentially, it will recognize even the more unusual spams for what they are. I will receive only 15 messages, 5 of them undesired.
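To make the arithmetic explicit, here is a minimal sketch in Python. The function name `inbox_after_filtering` and its parameters are my own labels, not anything from a real filter; the numbers are the ones from the example above. It also works out the catch rate the filter would need at 2000 spams just to leak no more than the original 10.

```python
def inbox_after_filtering(spam_received, spam_caught, desired):
    """Summarize what reaches the inbox for the given (hypothetical) counts."""
    leaked = spam_received - spam_caught
    return {
        "leaked_spam": leaked,
        "inbox_total": leaked + desired,
        "catch_rate": spam_caught / spam_received,
    }

# The two scenarios from the example above.
low_volume = inbox_after_filtering(spam_received=200, spam_caught=190, desired=10)
high_volume = inbox_after_filtering(spam_received=2000, spam_caught=1995, desired=10)

# Catch rate needed at 2000 spams to leak no more than the low-volume case did.
break_even_rate = (2000 - low_volume["leaked_spam"]) / 2000

print(low_volume)
print(high_volume)
print(break_even_rate)
```

So the thesis only holds if a tenfold increase in spam pushes the filter from catching 95% to catching more than 99.5%. My example assumes it reaches 99.75%.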
What am I missing?


3 Comments on "send more spam please":
Your amount of desired email does not go up. Only your amount of undesired email goes up. So adding more spam is not going to help you increase your desired email; it only increases your undesired email, filtered or not. More spam can only mean more undesired email, not more desired.
Also, the increase in spam isn't exponential, but it is in the orders of magnitude, and heuristics aren't dealing with it that well. For instance, there is now a lot of image spam, which heuristics can't process. There is also a lot of "fake spam" that is not trying to sell you anything, but just to confuse your heuristics' learning so that they make more mistakes.
From fyao:
For some reason, I can never see the letters in the graphic, or hear the letters being read, so here's my reply:
As well, you are assuming that more spam means more information for your anti-spam to learn. However, speaking as someone who has scanned through more than his share of spam, a lot of what is received is exactly the same as what was received already, so you are just getting more spam that has no value to either you or your spam filter.
- frank
I recently decided to forward all incoming email (for several domains) to Gmail, from where I POP it all. Nine-tenths of the spam I'd been receiving vanished.
IOW, they have better filters than the old filtering service I used in a similar manner.
Spam arrives at my mail servers, forwards with my desired email to Gmail, and the spam almost completely vanishes on POP.
Why Gmail or any other shared spam filtering system? The system receives so much spam that it can recognize patterns much more thoroughly than a filter located on my system alone (that is, one that didn't share data with a larger population). My ISP isn't large enough to have a sufficiently large data set to analyse and flag spam.
Again, this is all intuitive, and not based on any research on spam filtering systems.
The two security professionals who've responded here both say my intuition is exactly wrong, but haven't explained why in depth.
They have provided a couple of exceptions though: image spam (though I'd hazard a guess that the headers can still be filtered for), fake spam (anti-spam filter technology), duplicates don't add filtering value (what?).
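On the duplicates point, here is a minimal sketch of what I take frank to mean, assuming a filter that learns from the words it has seen. The helper `distinct_tokens` and the sample messages are invented for illustration; real filters also weigh how often a word appears, so this shows only one facet of the argument.

```python
def distinct_tokens(messages):
    """Return the set of lowercase words seen across all messages."""
    seen = set()
    for message in messages:
        seen.update(message.lower().split())
    return seen

# Hypothetical sample spam, invented for illustration.
spam_once = ["cheap pills online", "win a free cruise"]
spam_repeated = spam_once * 100  # the same two messages, received 100 times

tokens_once = distinct_tokens(spam_once)
tokens_repeated = distinct_tokens(spam_repeated)

# 200 messages instead of 2, but no new words for the filter to learn from.
print(len(spam_repeated), len(tokens_once), len(tokens_repeated))
```

If most of the extra volume is repeats like this, the filter sees more messages without seeing more patterns.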
Hmm.. Perhaps it's time to read up on this tech, find out how it does or could work.
C