Industrial and Other Scientific Tools
Man-made data pose difficult challenges in both representation and scalability.

Detecting Spammers with SNARE: Spatio-temporal Network-level Automatic Reputation Engine
Shuang Hao, Nick Feamster, Alexander Gray, Nadeem Syed, and Sven Krasser
USENIX Security Symposium 2009

We demonstrated the ability to perform automatic spam blacklisting without examining email content at all -- instead, looking at senders' spatio-temporal activities. [pdf]

Abstract: Users and network administrators need ways to filter email messages based primarily on the reputation of the sender. Unfortunately, conventional mechanisms for sender reputation -- notably, IP blacklists -- are cumbersome to maintain and evadable. This paper investigates ways to infer the reputation of an email sender based solely on network-level features, without looking at the contents of a message. First, we study first-order properties of network-level features that may help distinguish spammers from legitimate senders. We examine features that can be ascertained without ever looking at a packet's contents, such as the distance in IP space to other email senders or the geographic distance between sender and receiver. We derive features that are lightweight, since they do not require seeing a large amount of email from a single IP address and can be gleaned without looking at an email's contents -- many such features are apparent from even a single packet. Second, we incorporate these features into a classification algorithm and evaluate the classifier's ability to automatically classify email senders as spammers or legitimate senders. We build an automated reputation engine, SNARE, based on these features using labeled data from a deployed commercial spam-filtering system. We demonstrate that SNARE can achieve comparable accuracy to existing static IP blacklists: about a 70% detection rate for less than a 0.3% false positive rate. Third, we show how SNARE can be integrated into existing blacklists, essentially as a first-pass filter.
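
The two example features named in the abstract, distance in IP space to other senders and geographic distance between sender and receiver, can be illustrated with a short sketch. This is not SNARE's actual feature extraction: the function names, the averaging over the k nearest senders, the default k, and the use of the haversine formula for geographic distance are all assumptions made for illustration.

```python
import ipaddress
import math
from typing import Iterable


def ip_distance(a: str, b: str) -> int:
    """Absolute numeric distance between two IPv4 addresses."""
    return abs(int(ipaddress.IPv4Address(a)) - int(ipaddress.IPv4Address(b)))


def mean_nearest_ip_distance(sender: str, others: Iterable[str], k: int = 3) -> float:
    """Average numeric distance from `sender` to its k nearest other senders.

    Hypothetical feature: the choice of k and of averaging is an assumption.
    """
    distances = sorted(ip_distance(sender, o) for o in others if o != sender)
    nearest = distances[:k]
    if not nearest:
        raise ValueError("need at least one other sender")
    return sum(nearest) / len(nearest)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * earth_radius_km * math.asin(math.sqrt(min(1.0, h)))
```

Both functions need only an IP address or a coordinate pair per message, which matches the abstract's point that such features are available without inspecting message contents. Mapping an IP address to coordinates would require a geolocation database, which is outside this sketch.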

@incollection{hao2009snare,
  title = "{Detecting Spammers with SNARE: Spatio-temporal Network-level Automatic Reputation Engine}",
  author = "Shuang Hao and Nick Feamster and Alexander Gray and Nadeem Syed and Sven Krasser",
  booktitle = "Proceedings of the Eighteenth USENIX Security Symposium",
  year = "2009"
}
In preparation

A Research Document Search Engine
We are developing new methods for text analysis, including topic modeling, in the context of a system for retrieval and visualization of research papers.

Nonlinear Recommendation Systems
Recommendation systems are mostly based on linear dimension reduction methods. We are developing an approach to recommender systems based on more powerful machine learning methods.

Fast Search for Particle Events
We have developed the first algorithmic approach to interactive-time search for trigger events at the Large Hadron Collider.
An Integrated System for Multi-Rover Scientific Exploration
Tara Estlin, Alexander Gray, Tobias Mann, Gregg Rabideau, Rebecca Castano, Steve Chien, and Eric Mjolsness
National Conference on Artificial Intelligence (AAAI) 1999

A system integrating machine learning and planning techniques for autonomous goal-directed planetary exploration by a coordinated team of rovers. [pdf]

Abstract: This paper describes an integrated system for coordinating multiple rover behavior with the overall goal of collecting planetary surface data. The Multi-Rover Integrated Science Understanding System combines concepts from machine learning with planning and scheduling to perform autonomous scientific exploration by cooperating rovers. The integrated system utilizes a novel machine learning clustering component to analyze science data and direct new science activities. A planning and scheduling system is employed to generate rover plans for achieving science goals and to coordinate activities among rovers. We describe each of these components and discuss some of the key integration issues that arose during development and influenced both system design and performance.

@inproceedings{estlin1999rovers,
  title = "{An Integrated System for Multi-Rover Scientific Exploration}",
  author = "Tara Estlin and Alexander G. Gray and Tobias Mann and Gregg Rabideau and Rebecca Casta\~{n}o and Steve Chien and Eric Mjolsness",
  booktitle = "Proceedings of the Sixteenth National Conference on Artificial Intelligence (AAAI)",
  year = "1999"
}