Databases enable users to precisely express their informational needs using structured queries. However, database query construction is a laborious and error-prone process, which cannot be performed well by most end users. Keyword search alleviates the usability problem at the price of query expressiveness. As keyword search algorithms do not differentiate between the possible informational needs […]
A Multidimensional Sequence Approach to Measuring Tree Similarity
The tree is one of the most common and well-studied data structures in computer science. Measuring the similarity of such structures is key to analyzing this type of data. However, measuring tree similarity is not trivial due to the inherent complexity of trees and the ensuing large search space. Tree kernel, a state of the art […]
A Link-Based Cluster Ensemble Approach for Categorical Data Clustering
Although attempts have been made to solve the problem of clustering categorical data via cluster ensembles, with the results being competitive with conventional algorithms, it is observed that these techniques unfortunately generate a final data partition based on incomplete information. The underlying ensemble-information matrix presents only cluster-data point relations, with many entries being left unknown. […]
A Genetic Programming Approach to Record Deduplication
Several systems that rely on consistent data to offer high-quality services, such as digital libraries and e-commerce brokers, may be affected by the existence of duplicates, quasi replicas, or near-duplicate entries in their repositories. Because of that, there have been significant investments from private and government organizations in developing methods for removing replicas from their […]
A Framework for Similarity Search of Time Series Cliques with Natural Relations
A Time Series Clique (TSC) consists of multiple time series which are related to each other by natural relations. The natural relations that are found between the time series depend on the application domains. For example, a TSC can consist of time series which are trajectories in video that have spatial relations. In conventional time […]
A Framework for Learning Comprehensible Theories in XML Document Classification
XML has become the universal data format for a wide variety of information systems. The large number of XML documents existing on the web and in other information storage systems makes classification an important task. As a typical type of semistructured data, XML documents have both structures and contents. Traditional text learning techniques are not […]
Intertemporal Discount Factors as a Measure of Trustworthiness in Electronic Commerce
In multiagent interactions, such as e-commerce and file sharing, being able to accurately assess the trustworthiness of others is important for agents to protect themselves from losing utility. Focusing on rational agents in e-commerce, we prove that an agent’s discount factor (time preference of utility) is a direct measure of the agent’s trustworthiness for a […]
pCloud: A Distributed System for Practical PIR
Computational Private Information Retrieval (cPIR) protocols allow a client to retrieve one bit from a database, without the server inferring any information about the queried bit. These protocols are too costly in practice because they invoke complex arithmetic operations for every bit of the database. In this paper, we present pCloud, a distributed system that […]
Topic Mining over Asynchronous Text Sequences
Timestamped texts, or text sequences, are ubiquitous in real-world applications. Multiple text sequences are often related to each other by sharing common topics. The correlation among these sequences provides more meaningful and comprehensive clues for topic mining than those from each individual sequence. However, it is nontrivial to explore the correlation with the existence […]
Identifying Evolving Groups in Dynamic Multimode Networks
A multimode network consists of heterogeneous types of actors with various interactions occurring between them. Identifying communities in a multimode network can help understand the structural properties of the network, address data shortage and imbalance problems, and assist tasks like targeted marketing and finding influential actors within or between groups. In general, a network […]