While many attacks are distributed across botnets, investigators and network operators have recently identified malicious networks through high-profile autonomous system (AS) depeerings and network shutdowns. In this paper, we explore whether some ASs are indeed safe havens for malicious activity. We look for ISPs and ASs that exhibit disproportionately high malicious behavior using 10 […]
A Greedy Link Scheduler for Wireless Networks With Gaussian Multiple-Access and Broadcast Channels
Information-theoretic broadcast channels (BCs) and multiple-access channels (MACs) enable a single node to transmit data simultaneously to multiple nodes, and multiple nodes to transmit data simultaneously to a single node, respectively. In this paper, we address the problem of link scheduling in multihop wireless networks containing nodes with BC and MAC capabilities. We first propose […]
TSCAN: A Content Anatomy Approach to Temporal Topic Summarization
A topic is defined as a seminal event or activity along with all directly related events and activities. It is represented by a chronological sequence of documents published by different authors on the Internet. In this study, we define a task called topic anatomy, which summarizes and associates the core parts of a topic temporally […]
Tree-Based Mining for Discovering Patterns of Human Interaction in Meetings
Discovering semantic knowledge is significant for understanding and interpreting how people interact in a meeting discussion. In this paper, we propose a mining method to extract frequent patterns of human interaction based on the captured content of face-to-face meetings. Human interactions, such as proposing an idea, giving comments, and expressing a positive opinion, indicate user […]
SPIRE: Efficient Data Inference and Compression over RFID Streams
Despite its promise, RFID technology presents numerous challenges, including incomplete data, lack of location and containment information, and very high volumes. In this work, we present a novel data inference and compression substrate over RFID streams to address these challenges. Our substrate employs a time-varying graph model to efficiently capture possible object locations and interobject […]
Slicing: A New Approach for Privacy Preserving Data Publishing
Several anonymization techniques, such as generalization and bucketization, have been designed for privacy preserving microdata publishing. Recent work has shown that generalization loses a considerable amount of information, especially for high-dimensional data. Bucketization, on the other hand, does not prevent membership disclosure and does not apply to data that do not have a clear separation between […]
Resilient Identity Crime Detection
Identity crime is well known, prevalent, and costly, and credit application fraud is a specific case of identity crime. Existing non-data-mining detection systems, based on business rules, scorecards, and known-fraud matching, have limitations. To address these limitations and combat identity crime in real time, this paper proposes a new multilayered detection system […]
Ranking Model Adaptation for Domain-Specific Search
With the explosive emergence of vertical search domains, applying the broad-based ranking model directly to different domains is no longer desirable due to domain differences, while building a unique ranking model for each domain is both laborious for labeling data and time consuming for training models. In this paper, we address these difficulties by proposing […]
Publishing Search Logs—A Comparative Study of Privacy Guarantees
Search engine companies collect the “database of intentions,” the histories of their users’ search queries. These search logs are a gold mine for researchers. Search engine companies, however, are wary of publishing search logs in order not to disclose sensitive information. In this paper, we analyze algorithms for publishing frequent keywords, queries, and clicks of […]
Privacy Preserving Decision Tree Learning Using Unrealized Data Sets
Privacy preservation is important for machine learning and data mining, but measures designed to protect private information often result in a trade-off: reduced utility of the training samples. This paper introduces a privacy preserving approach that can be applied to decision tree learning, without concomitant loss of accuracy. It describes an approach to the preservation […]