
Combining Text and Heuristics for Cost-Sensitive Spam Filtering

Spam filtering is a text categorization task with special features that make it both interesting and difficult. First, the task has traditionally been performed using heuristics from the domain. Second, a cost model is required to avoid misclassifying legitimate messages. We present a comparative evaluation of several machine learning algorithms applied to spam filtering, considering both the text of the messages and a set of heuristics for the task. Cost-oriented biasing and evaluation are performed.
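
As an illustration of what a cost model can look like in this setting, the sketch below biases a probabilistic classifier's decision and scores its errors with unequal costs. It is not taken from the paper: the function names, the `cost_ratio` parameter, and the assumption that blocking a legitimate message costs `cost_ratio` times as much as letting a spam message through are all hypothetical choices made for the example.

```python
def cost_sensitive_label(p_spam: float, cost_ratio: float) -> str:
    """Label a message as spam only when blocking has the lower expected cost.

    Assumption: blocking a legitimate message costs `cost_ratio`,
    while passing a spam message costs 1.
    """
    # Expected cost of blocking: (1 - p_spam) * cost_ratio
    # Expected cost of passing:  p_spam * 1
    # Blocking is cheaper exactly when p_spam > cost_ratio / (1 + cost_ratio).
    threshold = cost_ratio / (1.0 + cost_ratio)
    return "spam" if p_spam > threshold else "legitimate"


def total_cost(true_labels, predicted_labels, cost_ratio: float) -> float:
    """Sum misclassification costs under the same hypothetical cost model."""
    cost = 0.0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == "legitimate" and predicted == "spam":
            cost += cost_ratio  # legitimate message wrongly blocked
        elif truth == "spam" and predicted == "legitimate":
            cost += 1.0  # spam message wrongly passed
    return cost
```

With `cost_ratio = 1.0` the threshold is 0.5 and the decision is the usual one; raising `cost_ratio` raises the threshold, so fewer legitimate messages are blocked at the price of letting more spam through.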

José M. Gómez Hidalgo, Manuel Maña López and Enrique Puertas Sanz, Combining Text and Heuristics for Cost-Sensitive Spam Filtering. In: Proceedings of CoNLL-2000 and LLL-2000, Lisbon, Portugal, 2000.

Last update: June 27, 2001. erikt@uia.ua.ac.be