# 1. A simplified version of spam detection

February |

Cheapest |

From the office of President Richard |

Deposit the sum of 16.5 |

Suppose we want to detect spam messages, and we have observed from long experience that most spam can be identified by a small number of "spammy" words that appear in spam (and only in spam). How can we train a computer to recognize these bad words?

A common approach is to use a naive Bayes classifier, which uses Bayes' theorem to estimate the probability that a message is spam based on the propabilities that each word in the message appears in spam, as observed from some large training set of spam and non-spam ("ham") messages. This works reasonably well, but has some disadvantages:

- Bayesian spam filters can be misled by irrelevant words that happen to appear in a lot of non-spam messages. Once these filters became popular, spammers learned to include a large sample of non-spam text at the end of a message, which could overwhelm the effect of any suspicious words with a large number of non-suspicious ones.
- Because Bayesian spam filters use a large sample set to identify spammy words, they don't update quickly in response to mistakes. This can be annoying if you keep getting the same obvious spam over and over again.

I used a Bayesian spam filter (spamassassin) for a couple of years before I got fed up with it, and eventually replaced it with a filter based on Nike Littlestone's *Winnow* algorithm for learning concepts based on a small number of relevant attributes surrounded by many irrelevant ones.

# 2. The Winnow algorithm

In the Winnow algorithm (Littlestone, Nick. Learning quickly when irrelevant attributes abound. Machine Learning 2:253–318, 1988; winnow.pdf), we are trying to learn a concept of the form x_{1}∨x_{2}∨...∨x_{r}, where x_{1} through x_{r} are drawn from some much larger set of n variables (e.g., x_{i} = word i is present in a message). The algorithm is presented with a sequence of examples with the settings of all the variables (or *attributes*), and must make a guess for each example whether it is in the target class or not. If it guesses wrong, it is told about the error but charged for the mistake.

(The presentation of the algorithm here is a simplification of these notes by Shuchi Chawla, which I found by Googling.)

## 2.1. Details of the algorithm

The algorithm itself works like this:

Each x

_{i}is assigned a weight w_{i}, initially 1.To make a guess, test if ∑ x

_{i}w_{i}≥ n, and return 1 if this is true.If we guessed 1 but the correct answer was 0, all the weights that contributed to the incorrect guess are "demoted": we set w

_{i}' ← w_{i}/2 for all i with x_{i}= 1.If we guessed 0 but the correct answer was 1, all the weights that could have contributed to a correct guess are "promoted": w

_{i}' ← w_{i}⋅2 for all i with x_{i}= 1.

Note that weights for variables that are zero aren't updated at all. So we can use the algorithm without even knowing the full set of variables.

For example, given n=1000 and a definition of spam that includes only messages containing the word `Viagra`, after 9 messages that say `buy cheap Viagra` we will have raised the weights of `buy`, `cheap`, and `Viagra` to 512, and any new messages containing these three words will be classified as spam with a score of 1536. Unfortunately, we will also catch `buy cheap Rolex watches` (score 1024), but this will cause us to reduce the weights on `buy` and `cheap` to 256, so we won't be fooled again by this non-spam. A subsequent message that says `best Viagra here` will raise the weight on `Viagra` to 1024, at which point we will correctly classify all future spam. To learn this concept required 11 mistakes, slightly more than log_{2} n. This turns out to be typical for Winnow.

## 2.2. Mistake bounds

Littlestone shows that, in the worst case, Winnow makes O(r log n) mistakes. Here's why it works:

First we'll bound the number of mistakes P on positive instances, i.e., mistakes that result in a promotion. Observe that for any promotion step, we increase the weight on at least one of our r relevant attributes (otherwise it wouldn't be a positive instance), and we can only increase each relevant attribute ⌈log

_{2}n⌉ times before its weight exceeds n. This gives P ≤ r ⌈log_{2}n⌉ = O(r log n).To bound the number of mistakes N on negative instances, we'll keep track of changes in the total weight ∑ w

_{i}:- Each demotion step multiplies a set of weights that sums to at least n by 1/2; this reduces the total weight by at least n/2.
- Each promotion step multiplies a set of weights that sums to less than n by 2, increasing the total weight by less than n.
Putting these two bounds together gives a total change of at most Pn - N(n/2). But since we start with total weight n and can't drop below 0, w get Pn - Nn/2 > -n, which we can solve to get N < 2P+2. Since we already know P = O(r log n), this gives N = O(r log n as well).

We've just proved the claimed O(r log n) bound on mistakes. The actual bound in Littlestone's paper is a bit more sophisticated, and allows for *r-of-k* concepts where an instance is in the class if it has at least r of k relevant attributes (the OR case is *1-of-r*). For r-of-k concepts, the bound is O(rk log n).

## 2.3. Winnow in practice

I used a variant of Winnow from about 2003–2006 to filter spam (see SpamFilter for the actual code). It worked reasonably well; my vague recollection was that after training it on a previous sample of a few thousand spam and non-spam messages it misclassified roughly 1% of incoming messages, with spikes of errors occurring when spammers changed their tricks, I changed my implicit definition of spam, or at the end of the month (the classifier would incorrectly assume that since most messages received in February were spam, having `Feb` in the date was a good indicator of spammminess; this would confuse it when `Feb` changed to `Mar`).

Unfortunately, it produced roughly the same number of false positives and false negatives, which meant I still had to scan through my spam folder to teach it not to throw away good mail. So I eventually gave up and switched to using Google's Gmail system, which (like other large mail services) can filter spam based on secret proprietary algorithms developed by armies of smart people that can take advantage of having enormous samples of both good and bad email (through reporting by users and bogus honeypot addresses) and various other tricks not easily done by us amateurs (like running optical character recognition on embedded images). This works well enough that I no longer bother looking at my spam folder, although obvious spam messages still sneak through now and then, and there are categories of easily-recognized messages that my filter would have eliminated (because I don't like them) but that Google doesn't filter out (because they aren't actually spam).

Because these filters are secret, I don't know what they actually do. But I have heard rumors that Google does make heavy use of Winnow for classification, so maybe my email is still filtered with this algorithm.