TECHNOLOGY: JAVA
DOMAIN: CLOUD COMPUTING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Decentralized Access Control with Anonymous Authentication of Data Stored in Clouds | We propose a new decentralized access control scheme for secure data storage in clouds that supports anonymous authentication. In the proposed scheme, the cloud verifies the authenticity of the series without knowing the user’s identity before storing data. Our scheme also has the added feature of access control in which only valid users are able to decrypt the stored information. The scheme prevents replay attacks and supports creation, modification, and reading of data stored in the cloud. We also address user revocation. Moreover, our authentication and access control scheme is decentralized and robust, unlike other access control schemes designed for clouds, which are centralized. The communication, computation, and storage overheads are comparable to centralized approaches. | 2014 |
2. | Modeling of Distributed File Systems for Practical Performance Analysis | Abstract—Cloud computing has received significant attention recently, and delivering quality-guaranteed services in clouds is highly desired. Distributed file systems (DFSs) are the key component of any cloud-scale data processing middleware, so evaluating the performance of DFSs is very important. To avoid the cost of late life-cycle performance fixes and architectural redesign, providing performance analysis before the deployment of DFSs is also particularly important. In this paper, we propose a systematic and practical performance analysis framework, driven by architecture and design models that define the structure and behavior of typical master/slave DFSs. We put forward a configuration guideline for specifying configuration alternatives of such DFSs, and a practical approach for qualitative and quantitative performance analysis of DFSs with various configuration settings in a systematic way. What distinguishes our approach from others is that 1) most existing works rely on performance measurements under a variety of workloads/strategies, comparisons with other DFSs, or running application programs, whereas our approach is based on architecture- and design-level models and systematically derived performance models; 2) our approach can evaluate the performance of DFSs both qualitatively and quantitatively; and 3) our approach can evaluate not only the overall performance of a DFS but also its components and individual steps. We demonstrate the effectiveness of our approach by evaluating the Hadoop distributed file system (HDFS). A series of real-world experiments on EC2 (Amazon Elastic Compute Cloud) and the Tansuo and Inspur clusters were conducted to qualitatively evaluate the effectiveness of our approach. We also performed a set of experiments of HDFS on EC2 to quantitatively analyze the performance and limitations of the metadata server of DFSs. Results show that our approach can deliver sufficient performance analysis. The proposed approach could similarly be applied to evaluate other DFSs such as MooseFS, GFS, and zFS. | 2014 |
3. | Balancing Performance, Accuracy, and Precision for Secure Cloud Transactions | Abstract—In distributed transactional database systems deployed over cloud servers, entities cooperate to form proofs of authorizations that are justified by collections of certified credentials. These proofs and credentials may be evaluated and collected over extended time periods, during which the underlying authorization policies or the user credentials may fall into inconsistent states. It therefore becomes possible for policy-based authorization systems to make unsafe decisions that might threaten sensitive resources. In this paper, we highlight the criticality of the problem. We then define the notion of trusted transactions when dealing with proofs of authorization. Accordingly, we propose several increasingly stringent levels of policy consistency constraints, and present different enforcement approaches to guarantee the trustworthiness of transactions executing on cloud servers. We propose a Two-Phase Validation Commit protocol as a solution, which is a modified version of the basic Two-Phase Commit protocol. We finally analyze the different approaches using both an analytical evaluation of the overheads and simulations, to guide decision makers in choosing which approach to use. (A brief Java sketch of this protocol idea follows the table.) | 2014 |
4. | A Scalable Two-Phase Top-Down Specialization Approach for Data Anonymization Using MapReduce on Cloud | Abstract—A large number of cloud services require users to share private data like electronic health records for data analysis or mining, bringing privacy concerns. Anonymizing data sets via generalization to satisfy certain privacy requirements such as k-anonymity is a widely used category of privacy-preserving techniques. At present, the scale of data in many cloud applications increases tremendously in accordance with the Big Data trend, making it a challenge for commonly used software tools to capture, manage, and process such large-scale data within a tolerable elapsed time. As a result, it is a challenge for existing anonymization approaches to achieve privacy preservation on privacy-sensitive large-scale data sets due to their lack of scalability. In this paper, we propose a scalable two-phase top-down specialization (TDS) approach to anonymize large-scale data sets using the MapReduce framework on cloud. In both phases of our approach, we deliberately design a group of innovative MapReduce jobs to concretely accomplish the specialization computation in a highly scalable way. Experimental evaluation results demonstrate that with our approach, the scalability and efficiency of TDS can be significantly improved over existing approaches. (A hedged MapReduce skeleton for this step follows the table.) | 2014 |
5. | Dynamic Optimization of Multiattribute Resource Allocation in Self-Organizing Clouds | By leveraging virtual machine (VM) technology, which provides performance and fault isolation, cloud resources can be provisioned on demand in a fine-grained, multiplexed manner rather than in monolithic pieces. By integrating volunteer computing into cloud architectures, we envision a gigantic self-organizing cloud (SOC) being formed to reap the huge potential of untapped commodity computing power over the Internet. Toward this new architecture, where each participant may autonomously act as both resource consumer and provider, we propose a fully distributed, VM-multiplexing resource allocation scheme to manage decentralized resources. Our approach not only achieves maximized resource utilization using the proportional share model (PSM), but also delivers provably and adaptively optimal execution efficiency. We also design a novel multi-attribute range query protocol for locating qualified nodes. Contrary to existing solutions, which often generate bulky messages per request, our protocol produces only one lightweight query message per task on the Content Addressable Network (CAN). It works effectively to find for each task its qualified resources under a randomized policy that mitigates contention among requesters. We show that the SOC with our optimized algorithms improves system throughput by 15-60 percent over a P2P Grid model. Our solution also exhibits fairly high adaptability in a dynamic node-churning environment. (A short sketch of the proportional share idea follows the table.) | 2013 |
6. | Scalable and Secure Sharing of Personal Health Records in Cloud Computing Using Attribute-Based Encryption | Personal health record (PHR) is an emerging patient-centric model of health information exchange, which is often outsourced to be stored at a third party, such as cloud providers. However, there have been wide privacy concerns, as personal health information could be exposed to those third-party servers and to unauthorized parties. To assure the patients’ control over access to their own PHRs, encrypting the PHRs before outsourcing is a promising method. Yet, issues such as risks of privacy exposure, scalability in key management, flexible access, and efficient user revocation have remained the most important challenges toward achieving fine-grained, cryptographically enforced data access control. In this paper, we propose a novel patient-centric framework and a suite of mechanisms for data access control to PHRs stored in semitrusted servers. To achieve fine-grained and scalable data access control for PHRs, we leverage attribute-based encryption (ABE) techniques to encrypt each patient’s PHR file. Different from previous works in secure data outsourcing, we focus on the multiple-data-owner scenario, and divide the users in the PHR system into multiple security domains, which greatly reduces the key management complexity for owners and users. A high degree of patient privacy is guaranteed simultaneously by exploiting multiauthority ABE. Our scheme also enables dynamic modification of access policies or file attributes, supports efficient on-demand user/attribute revocation, and allows break-glass access under emergency scenarios. Extensive analytical and experimental results are presented to show the security, scalability, and efficiency of our proposed scheme. | 2013 |
7. | On Data Staging Algorithms for Shared Data Accesses in Clouds | In this paper, we study strategies for efficiently achieving data staging and caching on a set of vantage sites in a cloud system with minimum cost. Unlike traditional research, we do not intend to identify the access patterns to facilitate future requests. Instead, with such information presumably known in advance, our goal is to efficiently stage the shared data items to predetermined sites at advocated time instants to align with the patterns, while minimizing the monetary costs for caching and transmitting the requested data items. To this end, we follow the cost and network models in [1] and extend the analysis to multiple data items, each with single or multiple copies. Our results show that under the homogeneous cost model, when the ratio of transmission cost to caching cost is low, a single copy of each data item can efficiently serve all the user requests. In the multi-copy situation, we also consider the tradeoff between the transmission cost and the caching cost by controlling the upper bounds on transmissions and copies. The upper bound can be given either on a per-item basis or on an all-item basis. We present efficient optimal solutions based on dynamic programming techniques for all these cases, provided that the upper bound is polynomially bounded by the number of service requests and the number of distinct data items. In addition to the homogeneous cost model, we also briefly discuss this problem under a heterogeneous cost model with some simple yet practical restrictions, and present a 2-approximation algorithm for the general case. We validate our findings by implementing a data staging solver and conducting extensive simulation studies on the behaviors of the algorithms. (A simplified caching-versus-transmission sketch follows the table.) | 2013 |
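Illustrative sketch for entry 3 (Balancing Performance, Accuracy, and Precision for Secure Cloud Transactions): the Two-Phase Validation Commit protocol extends two-phase commit with a validation round that re-checks proofs of authorization against current policy versions before committing. The minimal Java sketch below is only an assumed simplification of that idea; the Participant interface, the version check, and all names are hypothetical and not taken from the paper.

import java.util.List;

// Hypothetical participant view: each cloud server reports whether its proof of
// authorization still holds, and under which policy version it was evaluated.
interface Participant {
    boolean validateAuthorization();   // re-evaluate the proof of authorization
    int policyVersion();               // version of the policy used for the proof
    void commit();
    void abort();
}

public class TwoPhaseValidationCommit {
    // Returns true if the distributed transaction commits.
    public static boolean execute(List<Participant> participants) {
        // Phase 1 (validation): every proof must still be valid, and all
        // participants must have used the same policy version.
        int agreedVersion = -1;
        for (Participant p : participants) {
            if (!p.validateAuthorization()) return abortAll(participants);
            if (agreedVersion == -1) agreedVersion = p.policyVersion();
            else if (p.policyVersion() != agreedVersion) return abortAll(participants);
        }
        // Phase 2 (commit): only reached when validation succeeded everywhere.
        for (Participant p : participants) p.commit();
        return true;
    }

    private static boolean abortAll(List<Participant> participants) {
        for (Participant p : participants) p.abort();
        return false;
    }
}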
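Illustrative sketch for entry 4 (two-phase top-down specialization with MapReduce): the specialization computation can be driven by MapReduce jobs that count, for each candidate specialization of a taxonomy value, how many records it would cover, so the driver can pick the best specialization. The Hadoop skeleton below is a hedged sketch under an assumed record layout; the column format and the chooseCandidate helper are hypothetical, not the paper's code.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: one line per record; emits the candidate specialization that would
// cover this record, with a count of 1.
class SpecializationMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context ctx)
            throws IOException, InterruptedException {
        // Hypothetical format: the record's current generalized value is in column 0.
        String generalizedValue = value.toString().split(",")[0];
        // chooseCandidate(...) would map the record to a child value in the
        // taxonomy tree; the real lookup is omitted here.
        String candidate = chooseCandidate(generalizedValue);
        ctx.write(new Text(candidate), ONE);
    }

    private String chooseCandidate(String generalizedValue) {
        return generalizedValue; // placeholder taxonomy lookup
    }
}

// Reducer: aggregates per-candidate counts; the driver would use these statistics
// to decide which specialization preserves information while keeping k-anonymity.
class SpecializationReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) sum += v.get();
        ctx.write(key, new IntWritable(sum));
    }
}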
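Illustrative sketch for entry 5 (self-organizing clouds): the proportional share model (PSM) mentioned in the abstract divides a node's capacity among co-located tasks in proportion to their weights. The following lines show a generic PSM allocation, assuming a single scalar capacity; the variable names and numbers are ours.

public class ProportionalShare {
    // Splits `capacity` (e.g., CPU cycles of one node) among tasks in
    // proportion to their weights; returns the per-task allocation.
    public static double[] allocate(double capacity, double[] weights) {
        double total = 0;
        for (double w : weights) total += w;
        double[] share = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            share[i] = (total > 0) ? capacity * weights[i] / total : 0;
        }
        return share;
    }

    public static void main(String[] args) {
        // Example: a node with 2.0 GHz of CPU shared by tasks weighted 1, 2, and 5.
        double[] s = allocate(2.0, new double[]{1, 2, 5});
        System.out.printf("%.3f %.3f %.3f%n", s[0], s[1], s[2]); // 0.250 0.500 1.250
    }
}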
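Illustrative sketch for entry 7 (data staging): for a single copy of one item under a homogeneous cost model, the basic trade-off is whether to keep the item cached between two consecutive requests (paying a caching cost per unit time) or to drop it and retransmit it at the next request. The sketch below evaluates that per-gap choice only; it is a simplification with invented cost constants, not the paper's dynamic programs for the multi-item and multi-copy cases.

import java.util.Arrays;

public class StagingCost {
    // requestTimes: times at which the single data item is requested.
    // cachePerUnitTime: cost of keeping the item at the site per unit time.
    // transmissionCost: cost of fetching the item again from its source.
    public static double minCost(double[] requestTimes,
                                 double cachePerUnitTime, double transmissionCost) {
        Arrays.sort(requestTimes);
        double cost = transmissionCost; // first request always needs one transmission
        for (int i = 1; i < requestTimes.length; i++) {
            double gap = requestTimes[i] - requestTimes[i - 1];
            // Between consecutive requests, pick the cheaper of caching or refetching.
            cost += Math.min(gap * cachePerUnitTime, transmissionCost);
        }
        return cost;
    }

    public static void main(String[] args) {
        // Caching wins for the short gaps, refetching wins for the long gap.
        double c = minCost(new double[]{0, 1, 2, 50}, 0.1, 1.0);
        System.out.printf("%.1f%n", c); // 1.0 + 0.1 + 0.1 + 1.0 = 2.2
    }
}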
TECHNOLOGY: JAVA
DOMAIN: Data Mining
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Facilitating Document Annotation Using Content and Querying Value | A large number of organizations today generate and share textual descriptions of their products, services, and actions. Such collections of textual data contain a significant amount of structured information, which remains buried in the unstructured text. While information extraction algorithms facilitate the extraction of structured relations, they are often expensive and inaccurate, especially when operating on top of text that does not contain any instances of the targeted structured information. We present a novel alternative approach that facilitates the generation of structured metadata by identifying documents that are likely to contain information of interest, information that will subsequently be useful for querying the database. Our approach relies on the idea that humans are more likely to add the necessary metadata at creation time, if prompted by the interface, and that it is much easier for humans (and/or algorithms) to identify the metadata when such information actually exists in the document, instead of naively prompting users to fill in forms with information that is not available in the document. As a major contribution of this paper, we present algorithms that identify structured attributes that are likely to appear within the document, by jointly utilizing the content of the text and the query workload. Our experimental evaluation shows that our approach generates superior results compared to approaches that rely only on the textual content or only on the query workload to identify attributes of interest. | 2014 |
2. | An Empirical Performance Evaluation of Relational Keyword Search Techniques | Extending the keyword search paradigm to relational data has been an active area of research within the database and IR community during the past decade. Many approaches have been proposed, but despite numerous publications, there remains a severe lack of standardization for the evaluation of proposed search techniques. Lack of standardization has resulted in contradictory results from different evaluations, and the numerous discrepancies muddle what advantages are proffered by different approaches. In this paper, we present the most extensive empirical performance evaluation of relational keyword search techniques to appear to date in the literature. Our results indicate that many existing search techniques do not provide acceptable performance for realistic retrieval tasks. In particular, memory consumption precludes many search techniques from scaling beyond small data sets with tens of thousands of vertices. We also explore the relationship between execution time and factors varied in previous evaluations; our analysis indicates that most of these factors have relatively little impact on performance. In summary, our work confirms previous claims regarding the unacceptable performance of these search techniques and underscores the need for standardization in evaluations—standardization exemplified by the IR community. | 2014 |
3. | Set Predicates in SQL: Enabling Set-Level Comparisons for Dynamically Formed Groups | In data warehousing and OLAP applications, scalar-level predicates in SQL have become increasingly inadequate to support a class of operations that require set-level comparison semantics, i.e., comparing a group of tuples with multiple values. Currently, complex SQL queries composed of scalar-level operations are often written to obtain even very simple set-level semantics. Such queries are not only difficult to write but also challenging for a database engine to optimize, and thus can result in costly evaluation. This paper proposes to augment SQL with set predicates, to bring out otherwise obscured set-level semantics. We studied two approaches to processing set predicates—an aggregate-function-based approach and a bitmap-index-based approach. Moreover, we designed a histogram-based probabilistic method of set predicate selectivity estimation, for optimizing queries with multiple predicates. The experiments verified its accuracy and effectiveness in optimizing queries. (A JDBC example of the scalar-level workaround follows the table.) | 2014 |
4. | Keyword Query Routing | Keyword search is an intuitive paradigm for searching linked data sources on the web. We propose to route keywords only to relevant sources to reduce the high cost of processing keyword search queries over all sources. We propose a novel method for computing top-k routing plans based on their potentials to contain results for a given keyword query. We employ a keyword-element relationship summary that compactly represents relationships between keywords and the data elements mentioning them. A multilevel scoring mechanism is proposed for computing the relevance of routing plans based on scores at the level of keywords, data elements, element sets, and subgraphs that connect these elements. Experiments carried out using 150 publicly available sources on the web showed that valid plans (precision@1 of 0.92) that are highly relevant (mean reciprocal rank of 0.89) can be computed in 1 second on average on a single PC. Further, we show routing greatly helps to improve the performance of keyword search, without compromising its result quality. | 2014 |
5. | A Rough Hypercuboid Approach for Feature Selection in Approximation Spaces | The selection of relevant and significant features is an important problem, particularly for data sets with a large number of features. In this regard, a new feature selection algorithm is presented based on a rough hypercuboid approach. It selects a set of features from a data set by maximizing the relevance, dependency, and significance of the selected features. By introducing the concept of the hypercuboid equivalence partition matrix, a novel representation of the degree of dependency of sample categories on features is proposed to measure the relevance, dependency, and significance of features in approximation spaces. The equivalence partition matrix also offers an efficient way to calculate many more quantitative measures to describe the inexactness of approximate classification. Several quantitative indices are introduced based on the rough hypercuboid approach for evaluating the performance of the proposed method. The superiority of the proposed method over other feature selection methods, in terms of computational complexity and classification accuracy, is established extensively on various real-life data sets of different sizes and dimensions. | 2014 |
6. | Active Learning of Constraints for Semi-Supervised Clustering | Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. In this paper, we study the active learning problem of selecting pairwise must-link and cannot-link constraints for semi-supervised clustering. We consider active learning in an iterative manner where, in each iteration, queries are selected based on the current clustering solution and the existing constraint set. We apply a general framework that builds on the concept of neighborhood, where neighborhoods contain “labeled examples” of different clusters according to the pairwise constraints. Our active learning method expands the neighborhoods by selecting informative points and querying their relationship with the neighborhoods. Under this framework, we build on the classic uncertainty-based principle and present a novel approach for computing the uncertainty associated with each data point. We further introduce a selection criterion that trades off the amount of uncertainty of each data point with the expected number of queries (the cost) required to resolve this uncertainty. This allows us to select queries that have the highest information rate. We evaluate the proposed method on benchmark data sets and the results demonstrate consistent and substantial improvements over the current state of the art. | 2014 |
7. | Supporting Privacy Protection in Personalized Web Search | Abstract—Personalized web search (PWS) has demonstrated its effectiveness in improving the quality of various search services on the Internet. However, evidence shows that users’ reluctance to disclose their private information during search has become a major barrier to the wide proliferation of PWS. We study privacy protection in PWS applications that model user preferences as hierarchical user profiles. We propose a PWS framework called UPS that can adaptively generalize profiles by queries while respecting user-specified privacy requirements. Our runtime generalization aims at striking a balance between two predictive metrics that evaluate the utility of personalization and the privacy risk of exposing the generalized profile. We present two greedy algorithms, namely GreedyDP and GreedyIL, for runtime generalization. We also provide an online prediction mechanism for deciding whether personalizing a query is beneficial. Extensive experiments demonstrate the effectiveness of our framework. The experimental results also reveal that GreedyIL significantly outperforms GreedyDP in terms of efficiency. | 2014 |
8. | Privacy-Preserving Enhanced Collaborative Tagging | Abstract—Collaborative tagging is one of the most popular services available online, and it allows end users to loosely classify either online or offline resources based on their feedback, expressed in the form of free-text labels (i.e., tags). Although tags may not be sensitive information per se, the wide use of collaborative tagging services increases the risk of cross-referencing, thereby seriously compromising user privacy. In this paper, we make a first contribution toward the development of a privacy-preserving collaborative tagging service, by showing how a specific privacy-enhancing technology, namely tag suppression, can be used to protect end-user privacy. Moreover, we analyze how our approach can affect the effectiveness of a policy-based collaborative tagging system that supports enhanced web access functionalities, like content filtering and discovery, based on preferences specified by end users. | 2014 |
9. | Event Characterization and Prediction Based on Temporal Patterns in Dynamic Data System | Abstract—The new method proposed in this paper applies a multivariate reconstructed phase space (MRPS) for identifying multivariate temporal patterns that are characteristic and predictive of anomalies or events in a dynamic data system. The new method extends the original univariate reconstructed phase space framework, which is based on a fuzzy unsupervised clustering method, by incorporating a new mechanism of data categorization based on the definition of events. In addition to modeling temporal dynamics in a multivariate phase space, a Bayesian approach is applied to model the first-order Markov behavior in the multidimensional data sequences. The method utilizes an exponential loss objective function to optimize a hybrid classifier which consists of a radial basis kernel function and a log-odds ratio component. We performed experimental evaluation on three data sets to demonstrate the feasibility and effectiveness of the proposed approach. | 2014 |
10. | Discovering Emerging Topics in Social Streams via Link-Anomaly Detection | Abstract—Detection of emerging topics is now receiving renewed interest motivated by the rapid growth of social networks. Conventional term-frequency-based approaches may not be appropriate in this context, because the information exchanged in social network posts includes not only text but also images, URLs, and videos. We focus on the emergence of topics signaled by social aspects of these networks. Specifically, we focus on mentions of users—links between users that are generated dynamically (intentionally or unintentionally) through replies, mentions, and retweets. We propose a probability model of the mentioning behavior of a social network user, and propose to detect the emergence of a new topic from the anomalies measured through the model. Aggregating anomaly scores from hundreds of users, we show that we can detect emerging topics based only on the reply/mention relationships in social-network posts. We demonstrate our technique on several real data sets we gathered from Twitter. The experiments show that the proposed mention-anomaly-based approaches can detect new topics at least as early as text-anomaly-based approaches, and in some cases much earlier, when the topic is poorly identified by the textual contents of posts. | 2014 |
11. | A New Algorithm for Inferring User Search Goals with Feedback Sessions | For a broad-topic and ambiguous query, different users may have different search goals when they submit it to a search engine. The inference and analysis of user search goals can be very useful in improving search engine relevance and user experience. In this paper, we propose a novel approach to infer user search goals by analyzing search engine query logs. First, we propose a framework to discover different user search goals for a query by clustering the proposed feedback sessions. Feedback sessions are constructed from user click-through logs and can efficiently reflect the information needs of users. Second, we propose a novel approach to generate pseudo-documents to better represent the feedback sessions for clustering. Finally, we propose a new criterion, “Classified Average Precision (CAP)”, to evaluate the performance of inferring user search goals. Experimental results are presented using user click-through logs from a commercial search engine to validate the effectiveness of our proposed methods. | 2013 |
12. | Facilitating Effective User Navigation through Website Structure Improvement | Designing well-structured websites to facilitate effective user navigation has long been a challenge. A primary reason is that the web developers’ understanding of how a website should be structured can be considerably different from that of the users. While various methods have been proposed to relink webpages to improve navigability using user navigation data, the completely reorganized new structure can be highly unpredictable, and the cost of disorienting users after the changes remains unanalyzed. This paper addresses how to improve a website without introducing substantial changes. Specifically, we propose a mathematical programming model to improve the user navigation on a website while minimizing alterations to its current structure. Results from extensive tests conducted on a publicly available real data set indicate that our model not only significantly improves the user navigation with very few changes, but also can be effectively solved. We have also tested the model on large synthetic data sets to demonstrate that it scales up very well. In addition, we define two evaluation metrics and use them to assess the performance of the improved website using the real data set. Evaluation results confirm that the user navigation on the improved structure is indeed greatly enhanced. More interestingly, we find that heavily disoriented users are more likely to benefit from the improved structure than the less disoriented users. | 2013 |
13. | Building a Scalable Database-Driven Reverse Dictionary | In this paper, we describe the design and implementation of a reverse dictionary. Unlike a traditional forward dictionary, which maps from words to their definitions, a reverse dictionary takes a user input phrase describing the desired concept, and returns a set of candidate words that satisfy the input phrase. This work has significant application not only for the general public, particularly those who work closely with words, but also in the general field of conceptual search. We present a set of algorithms and the results of a set of experiments showing the retrieval accuracy of our methods and the response-time performance of our implementation. Our experimental results show that our approach can provide significant improvements in performance scale without sacrificing the quality of the result. Our experiments comparing the quality of our approach to that of currently available reverse dictionaries show that our approach can provide significantly higher quality than either of the other currently available implementations. (A toy inverted-index sketch follows the table.) | 2013 |
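Illustrative sketch for entry 3 (set predicates in SQL): without a native set predicate, a set-level condition such as "customers whose set of purchased categories contains both 'books' and 'music'" must be phrased with scalar operations, typically GROUP BY plus HAVING. The JDBC snippet below shows that standard-SQL workaround which the paper's set predicates are meant to replace; the schema is invented and an in-memory H2 database is assumed on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SetPredicateWorkaround {
    public static void main(String[] args) throws Exception {
        // Hypothetical schema: purchases(customer_id, category).
        // Set-level intent: customer's category set CONTAINS {'books','music'}.
        String sql =
            "SELECT customer_id " +
            "FROM purchases " +
            "WHERE category IN ('books', 'music') " +
            "GROUP BY customer_id " +
            "HAVING COUNT(DISTINCT category) = 2";
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE purchases(customer_id VARCHAR(10), category VARCHAR(20))");
            st.execute("INSERT INTO purchases VALUES ('c1','books'), ('c1','music'), ('c2','books')");
            try (ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getString("customer_id")); // prints only c1
                }
            }
        }
    }
}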
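Illustrative sketch for entry 13 (reverse dictionary): the essential data structure is an index from definition terms back to headwords, so that a phrase describing a concept can be mapped to candidate words. The toy version below builds such an inverted index and ranks candidates by the number of matching query terms; it is our own simplification, not the paper's algorithms.

import java.util.*;

public class ToyReverseDictionary {
    private final Map<String, Set<String>> termToWords = new HashMap<>();

    public void addEntry(String word, String definition) {
        for (String term : definition.toLowerCase().split("\\W+")) {
            termToWords.computeIfAbsent(term, t -> new HashSet<>()).add(word);
        }
    }

    // Returns candidate headwords ordered by the number of matching query terms.
    public List<String> lookup(String phrase) {
        Map<String, Integer> score = new HashMap<>();
        for (String term : phrase.toLowerCase().split("\\W+")) {
            for (String word : termToWords.getOrDefault(term, Collections.emptySet())) {
                score.merge(word, 1, Integer::sum);
            }
        }
        List<String> result = new ArrayList<>(score.keySet());
        result.sort((a, b) -> score.get(b) - score.get(a));
        return result;
    }

    public static void main(String[] args) {
        ToyReverseDictionary rd = new ToyReverseDictionary();
        rd.addEntry("telescope", "an instrument for viewing distant objects");
        rd.addEntry("microscope", "an instrument for viewing very small objects");
        System.out.println(rd.lookup("instrument to view distant things")); // [telescope, microscope]
    }
}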
TECHNOLOGY: JAVA
DOMAIN: DEPENDABLE & SECURE COMPUTING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Secure Two-Party Differentially Private Data Release for Vertically Partitioned Data | Privacy-preserving data publishing addresses the problem of disclosing sensitive data when mining for useful information. Among the existing privacy models, ε-differential privacy provides one of the strongest privacy guarantees. In this paper, we address the problem of private data publishing, where different attributes for the same set of individuals are held by two parties. In particular, we present an algorithm for differentially private data release for vertically partitioned data between two parties in the semi-honest adversary model. To achieve this, we first present a two-party protocol for the exponential mechanism. This protocol can be used as a subprotocol by any other algorithm that requires the exponential mechanism in a distributed setting. Furthermore, we propose a two-party algorithm that releases differentially private data in a secure way according to the definition of secure multiparty computation. Experimental results on real-life data suggest that the proposed algorithm can effectively preserve information for a data mining task. (A sketch of the exponential mechanism's sampling step follows the table.) | 2014 |
2. | Bandwidth Distributed Denial of Service: Attacks and Defenses | The Internet is vulnerable to bandwidth distributed denial-of-service (BW-DDoS) attacks, wherein many hosts send a huge number of packets to cause congestion and disrupt legitimate traffic. So far, BW-DDoS attacks have employed relatively crude, inefficient, brute-force mechanisms; future attacks might be significantly more effective and harmful. To meet the increasing threats, more advanced defenses are necessary. | 2014 |
3. | k-Zero Day Safety: A Network Security Metric for Measuring the Risk of Unknown Vulnerabilities | By enabling a direct comparison of different security solutions with respect to their relative effectiveness, a network security metric may provide quantifiable evidence to assist security practitioners in securing computer networks. However, research on security metrics has been hindered by difficulties in handling zero-day attacks exploiting unknown vulnerabilities. In fact, the security risk of unknown vulnerabilities has been considered as something unmeasurable due to the less predictable nature of software flaws. This poses a major difficulty for security metrics, because a more secure configuration would be of little value if it were equally susceptible to zero-day attacks. In this paper, we propose a novel security metric, k-zero day safety, to address this issue. Instead of attempting to rank unknown vulnerabilities, our metric counts how many such vulnerabilities would be required to compromise network assets; a larger count implies more security, because the likelihood of having more unknown vulnerabilities available, applicable, and exploitable all at the same time will be significantly lower. We formally define the metric, analyze the complexity of computing the metric, devise heuristic algorithms for intractable cases, and finally demonstrate through case studies that applying the metric to existing network security practices may generate actionable knowledge. | 2014 |
4. | Security Games for Node Localization through Verifiable Multilateration | Most applications of wireless sensor networks (WSNs) rely on data about the positions of sensor nodes, which are not necessarily known beforehand. Several localization approaches have been proposed, but most of them omit to consider that WSNs could be deployed in adversarial settings, where hostile nodes under the control of an attacker coexist with faithful ones. Verifiable multilateration (VM) was proposed to cope with this problem by leveraging a set of trusted landmark nodes that act as verifiers. Although VM is able to recognize reliable localization measures, it allows for regions of undecided positions that can amount to 40 percent of the monitored area. We studied the properties of VM as a noncooperative two-player game where the first player employs a number of verifiers to do VM computations and the second player controls a malicious node. The verifiers aim at securely localizing malicious nodes, while malicious nodes strive to masquerade as unknown and to pretend false positions. Thanks to game theory, the potentialities of VM are analyzed with the aim of improving the defender’s strategy. We found that the best placement for verifiers is an equilateral triangle with edge equal to the power range R, and the maximum deception in the undecided region is approximately 0.27R. Moreover, we characterized—in terms of the probability of choosing an unknown node to examine further—the strategies of the players. (A simplified position-verification sketch follows the table.) | 2014 |
5. | On Inference-Proof View Processing of XML Documents | This work aims at treating the inference problem in XML documents that are assumed to represent potentially incomplete information. The inference problem consists in providing a control mechanism for enforcing inference-usability confinement of XML documents. More formally, an inference-proof view of an XML document is required to be both indistinguishable from the actual XML document to the clients under their inference capabilities, and to neither contain nor imply any confidential information. We present an algorithm for generating an inference-proof view by weakening the actual XML document, i.e., eliminating confidential information and other information that could be used to infer confidential information. In order to avoid inferences based on the schema of the XML documents, the DTD of the actual XML document is modified according to the weakening operations as well, such that the modified DTD conforms with the generated inference-proof view. | 2013 |
6. | SORT: A Self-Organizing Trust Model for Peer-to-Peer Systems | 2013 |
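Illustrative sketch for entry 1 (differentially private data release): the exponential mechanism used by the two-party protocol selects a candidate r with probability proportional to exp(epsilon * u(r) / (2 * sensitivity)), where u is a utility score. The single-party sketch below shows only that sampling step, in the clear; the distributed, secure version is the paper's contribution and is not reproduced here.

import java.util.Random;

public class ExponentialMechanism {
    // Selects an index into the candidate list with probability proportional to
    // exp(epsilon * utility / (2 * sensitivity)).
    public static int select(double[] utilities, double epsilon,
                             double sensitivity, Random rng) {
        double[] weights = new double[utilities.length];
        double max = Double.NEGATIVE_INFINITY;
        for (double u : utilities) max = Math.max(max, u);
        double total = 0;
        for (int i = 0; i < utilities.length; i++) {
            // Subtracting max before exponentiating improves numerical stability
            // without changing the selection probabilities.
            weights[i] = Math.exp(epsilon * (utilities[i] - max) / (2 * sensitivity));
            total += weights[i];
        }
        double r = rng.nextDouble() * total;
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r <= 0) return i;
        }
        return weights.length - 1;
    }

    public static void main(String[] args) {
        double[] utilities = {1.0, 5.0, 2.0};
        int choice = select(utilities, 0.5, 1.0, new Random());
        System.out.println("chosen candidate: " + choice); // index 1 is most likely
    }
}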
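Illustrative sketch for entry 4 (verifiable multilateration): VM accepts a node's claimed position only if it is consistent with the verifiers' measured distance bounds and lies inside the triangle formed by the verifiers. The geometry check below is a hedged simplification with an arbitrary tolerance; it does not reproduce the paper's game-theoretic analysis.

public class VerifiableMultilateration {
    // verifiers: three (x, y) positions; bounds: their measured distance bounds.
    static boolean accept(double[][] verifiers, double[] bounds,
                          double px, double py, double tolerance) {
        // 1) Each measured distance bound must agree with the claimed position.
        for (int i = 0; i < 3; i++) {
            double d = Math.hypot(px - verifiers[i][0], py - verifiers[i][1]);
            if (Math.abs(d - bounds[i]) > tolerance) return false;
        }
        // 2) The claimed position must lie inside the verification triangle.
        return insideTriangle(verifiers, px, py);
    }

    static boolean insideTriangle(double[][] t, double px, double py) {
        double d1 = cross(t[0], t[1], px, py);
        double d2 = cross(t[1], t[2], px, py);
        double d3 = cross(t[2], t[0], px, py);
        boolean hasNeg = d1 < 0 || d2 < 0 || d3 < 0;
        boolean hasPos = d1 > 0 || d2 > 0 || d3 > 0;
        return !(hasNeg && hasPos); // all cross products share a sign (or are zero)
    }

    static double cross(double[] a, double[] b, double px, double py) {
        return (b[0] - a[0]) * (py - a[1]) - (b[1] - a[1]) * (px - a[0]);
    }

    public static void main(String[] args) {
        double[][] verifiers = {{0, 0}, {10, 0}, {5, 8.66}}; // roughly equilateral triangle
        double[] bounds = {5.77, 5.77, 5.77};                // consistent with the centroid
        System.out.println(accept(verifiers, bounds, 5, 2.89, 0.2)); // true
    }
}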
TECHNOLOGY: JAVA
DOMAIN: IMAGE PROCESSING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Large Discriminative Structured Set Prediction Modeling With Max-Margin Markov Network for Lossless Image Coding | Abstract—The inherent statistical correlations for context-based prediction and the structural interdependencies for local coherence are not fully exploited in existing lossless image coding schemes. This paper proposes a novel prediction model in which the optimal correlated prediction for a set of pixels is obtained in the sense of the least code length. It not only exploits the spatial statistical correlations for the optimal prediction directly based on 2D contexts, but also formulates the data-driven structural interdependencies to make the prediction error coherent with the underlying probability distribution for coding. Under the joint constraints for local coherence, max-margin Markov networks are incorporated to combine support vector machines structurally and make max-margin estimation for a correlated region. Specifically, the model aims to produce multiple predictions in the blocks, with the model parameters learned in such a way that the distinction between the actual pixel and all possible estimations is maximized. It is proved that, with the growth of sample size, the prediction error is asymptotically upper bounded by the training error under a decomposable loss function. Incorporated into the lossless image coding framework, the proposed model outperforms most reported prediction schemes. | 2014 |
2. | Multi-Illuminant Estimation With Conditional Random Fields | Abstract—Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is often not the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprising laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single-illuminant estimators as well as a recently proposed multi-illuminant estimation approach. | 2014 |
3. | Saliency-Aware Video Compression | Abstract—In region-of-interest (ROI)-based video coding, ROI parts of the frame are encoded with higher quality than non-ROI parts. At low bit rates, such encoding may produce attention-grabbing coding artifacts, which may draw the viewer’s attention away from the ROI, thereby degrading visual quality. In this paper, we present a saliency-aware video compression method for ROI-based video coding. The proposed method aims at reducing salient coding artifacts in non-ROI parts of the frame in order to keep the user’s attention on the ROI. Further, the method allows saliency to increase in high-quality parts of the frame, and allows saliency to reduce in non-ROI parts. Experimental results indicate that the proposed method is able to improve the visual quality of encoded video relative to conventional rate-distortion-optimized video coding, as well as two state-of-the-art perceptual video coding methods. | 2014 |
4. | Translation Invariant Directional Framelet Transform Combined With Gabor Filters for Image Denoising | Abstract—This paper is devoted to the study of a directional lifting transform for wavelet frames. A non-subsampled lifting structure is developed to maintain translation invariance, as this is an important property in image denoising. Then, the directionality of the lifting-based tight frame is explicitly discussed, followed by a specific translation invariant directional framelet transform (TIDFT). The TIDFT has two framelets, ψ1 and ψ2, with vanishing moments of order two and one, respectively, which are able to detect singularities in a given direction set. It provides an efficient and sparse representation for images containing rich textures, along with fast implementation and perfect reconstruction. In addition, an adaptive block-wise orientation estimation method based on Gabor filters is presented instead of the conventional minimization of residuals. Furthermore, the TIDFT is applied to image denoising, incorporating the MAP estimator for a multivariate exponential distribution. Consequently, the TIDFT is able to eliminate the noise effectively while preserving the textures. Experimental results show that the TIDFT outperforms some other frame-based denoising methods, such as contourlet and shearlet, and is competitive with state-of-the-art denoising approaches. | 2014 |
5. | Vector-Valued Image Processing by Parallel Level Sets | Vector-valued images such as RGB color images or multimodal medical images show a strong inter-channel correlation, which is not exploited by most image processing tools. We propose a new notion of treating vector-valued images which is based on the angle between the spatial gradients of their channels. Through minimizing a cost functional that penalizes large angles, images with parallel level sets can be obtained. After formally introducing this idea and the corresponding cost functionals, we discuss their Gâteaux derivatives, which lead to a diffusion-like gradient descent scheme. We illustrate the properties of this cost functional by several examples in denoising and demosaicking of RGB color images. They show that parallel level sets are a suitable concept for color image enhancement. Demosaicking with parallel level sets gives visually perfect results for low noise levels. Furthermore, the proposed functional yields sharper images than the other approaches in comparison. (A short gradient-alignment sketch follows the table.) | 2014 |
6. | Circular Reranking for Visual Search | Search reranking is regarded as a common way to boost retrieval precision. The problem is nevertheless not trivial, especially when there are multiple features or modalities to be considered for search, which often happens in image and video retrieval. This paper proposes a new reranking algorithm, named circular reranking, that reinforces the mutual exchange of information across multiple modalities for improving search performance, following the philosophy that a strongly performing modality could learn from weaker ones, while a weak modality benefits from interacting with stronger ones. Technically, circular reranking conducts multiple runs of random walks by exchanging the ranking scores among different features in a cyclic manner. Unlike existing techniques, the reranking procedure encourages interaction among modalities to seek a consensus that is useful for reranking. In this paper, we study several properties of circular reranking, including how and in which order information propagation should be configured to fully exploit the potential of modalities for reranking. Encouraging results are reported for both image and video retrieval on the Microsoft Research Asia Multimedia image dataset and the TREC Video Retrieval Evaluation 2007-2008 datasets, respectively. | 2013 |
7. | Efficient Method for Content Reconstruction With Self-Embedding | This paper presents a new model of the content reconstruction problem in self-embedding systems, based on an erasure communication channel. We explain why such a model is a good fit for this problem, and how it can be practically implemented with the use of digital fountain codes. The proposed method is based on an alternative approach to spreading the reference information over the whole image, which has recently been shown to be of critical importance in the application at hand. Our paper presents a theoretical analysis of the inherent restoration trade-offs. We analytically derive formulas for the reconstruction success bounds, and validate them experimentally with Monte Carlo simulations and a reference image authentication system. We perform an exhaustive reconstruction quality assessment, where the presented reference scheme is compared to five state-of-the-art alternatives in a common evaluation scenario. Our paper leads to important insights on how self-embedding schemes should be constructed to achieve optimal performance. The reference authentication system designed according to the presented principles allows for high-quality reconstruction, regardless of the amount of tampered content. The average reconstruction quality, measured on 10,000 natural images, is 37 dB, and is achievable even when 50% of the image area becomes tampered. | 2013 |
8. | Modeling Iris Code and Its Variants as Convex Polyhedral Cones and Its Security Implications | Iris Code, developed by Daugman in 1993, is the most influential iris recognition algorithm. A thorough understanding of Iris Code is essential, because over 100 million persons have been enrolled by this algorithm, and many biometric personal identification and template protection methods have been developed based on Iris Code. This paper shows that a template produced by Iris Code or its variants is a convex polyhedral cone in a hyperspace. Its central ray, being a rough representation of the original biometric signal, can be computed by a simple algorithm, which can often be implemented in one Matlab command line. The central ray is an expected ray and also an optimal ray of an objective function on a group of distributions. This algorithm is derived from geometric properties of a convex polyhedral cone but does not rely on any prior knowledge (e.g., iris images). The experimental results show that biometric templates, including iris and palmprint templates, produced by different recognition methods can be matched through the central rays in their convex polyhedral cones, and that templates protected by a method extended from Iris Code can be broken into. These experimental results indicate that, without a thorough security analysis, convex polyhedral cone templates cannot be assumed secure. Additionally, the simplicity of the algorithm implies that even junior hackers without knowledge of advanced image processing and biometric databases can still break into protected templates and reveal relationships among templates produced by different recognition methods. | 2013 |
9. | Robust Document Image Binarization Technique for Degraded Document Images | Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra-variation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to text and background variation caused by different types of document degradation. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny’s edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated based on the intensities of detected text stroke edge pixels within a local window. The proposed method is simple, robust, and involves minimum parameter tuning. It has been tested on the three public datasets used in the recent Document Image Binarization Contest (DIBCO) 2009 and 2011 and the Handwritten Document Image Binarization Contest (H-DIBCO) 2010, and achieves accuracies of 93.5%, 87.8%, and 92.03%, respectively, which are significantly higher than or close to those of the best-performing methods reported in the three contests. Experiments on the Bickley diary dataset, which consists of several challenging poor-quality document images, also show the superior performance of our proposed method compared with other techniques. (A simplified contrast-map sketch follows the table.) | 2013 |
10. | Per-Colorant-Channel Color Barcodes for Mobile Applications: An Interference Cancellation Framework | We propose a color barcode framework for mobile phone applications that exploits the spectral diversity afforded by the cyan (C), magenta (M), and yellow (Y) print colorant channels commonly used for color printing and the complementary red (R), green (G), and blue (B) channels, respectively, used for capturing color images. Specifically, we exploit this spectral diversity to realize a three-fold increase in the data rate by encoding independent data in the C, M, and Y print colorant channels and decoding the data from the complementary R, G, and B channels captured via a mobile phone camera. To mitigate the effect of cross-channel interference among the print colorant and capture color channels, we develop an algorithm for interference cancellation based on a physically motivated mathematical model for the print and capture processes. To estimate the model parameters required for cross-channel interference cancellation, we propose two alternative methodologies: a pilot block approach that uses suitable selections of colors for the synchronization blocks, and an expectation maximization approach that estimates the parameters from regions encoding the data itself. We evaluate the performance of the proposed framework using specific implementations of the framework for two of the most commonly used barcodes in mobile applications, QR and Aztec codes. Experimental results show that the proposed framework successfully overcomes the impact of the color interference, providing a low bit error rate and a high decoding rate for each of the colorant channels when used with a corresponding error correction scheme. | 2013 |
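Illustrative sketch for entry 5 (parallel level sets): the central quantity is the angle between the spatial gradients of two channels; level sets are parallel where the gradients are aligned, i.e., where |<grad u, grad v>| equals ||grad u|| * ||grad v||. The snippet below only measures that misalignment for two channels using forward differences; it does not perform the gradient-descent minimization described in the paper, and the exact functional form is an assumption.

public class ParallelLevelSets {
    // Sum over pixels of (||grad u|| * ||grad v|| - |<grad u, grad v>|),
    // which is zero exactly when the channel gradients are aligned everywhere.
    static double misalignment(double[][] u, double[][] v) {
        int h = u.length, w = u[0].length;
        double cost = 0;
        for (int y = 0; y < h - 1; y++) {
            for (int x = 0; x < w - 1; x++) {
                double ux = u[y][x + 1] - u[y][x], uy = u[y + 1][x] - u[y][x];
                double vx = v[y][x + 1] - v[y][x], vy = v[y + 1][x] - v[y][x];
                double dot = ux * vx + uy * vy;
                double normU = Math.hypot(ux, uy), normV = Math.hypot(vx, vy);
                cost += normU * normV - Math.abs(dot);
            }
        }
        return cost;
    }

    public static void main(String[] args) {
        double[][] a = {{0, 1, 2}, {0, 1, 2}, {0, 1, 2}}; // horizontal ramp
        double[][] b = {{0, 2, 4}, {0, 2, 4}, {0, 2, 4}}; // parallel (scaled) ramp
        double[][] c = {{0, 0, 0}, {1, 1, 1}, {2, 2, 2}}; // vertical ramp (orthogonal)
        System.out.println(misalignment(a, b)); // 0.0
        System.out.println(misalignment(a, c)); // positive: gradients are misaligned
    }
}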
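Illustrative sketch for entry 9 (document image binarization): the adaptive image contrast combines the normalized local image contrast with the local image gradient inside a small window. The per-pixel construction below follows that general idea only; the weighting, border handling, and parameter choices are assumptions, not the paper's exact formulation.

public class AdaptiveContrast {
    // Combines the normalized local contrast with the local gradient in a
    // (2r+1)x(2r+1) window: alpha * contrast + (1 - alpha) * gradient.
    // The weight alpha and the window radius r are assumed parameters.
    static double[][] contrastMap(double[][] img, int r, double alpha, double eps) {
        int h = img.length, w = img[0].length;
        double[][] out = new double[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double max = Double.NEGATIVE_INFINITY, min = Double.POSITIVE_INFINITY;
                for (int dy = -r; dy <= r; dy++) {
                    for (int dx = -r; dx <= r; dx++) {
                        int yy = Math.min(h - 1, Math.max(0, y + dy)); // clamp at borders
                        int xx = Math.min(w - 1, Math.max(0, x + dx));
                        max = Math.max(max, img[yy][xx]);
                        min = Math.min(min, img[yy][xx]);
                    }
                }
                double localContrast = (max - min) / (max + min + eps); // normalized contrast
                double localGradient = max - min;                       // local intensity variation
                out[y][x] = alpha * localContrast + (1 - alpha) * localGradient;
            }
        }
        return out;
    }
}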
TECHNOLOGY: JAVA
DOMAIN: MOBILE COMPUTING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Cooperative Spectrum Sharing: A Contract-Based Approach | Abstract—Providing economic incentives to all parties involved is essential for the success of dynamic spectrum access. Cooperative spectrum sharing is one effective way to achieve this, where secondary users (SUs) relay traffic for primary users (PUs) in exchange for dedicated spectrum access time for the SUs’ own communications. In this paper, we study cooperative spectrum sharing under incomplete information, where the SUs’ wireless characteristics are private information not known by a PU. We model the PU-SU interaction as a labor market using contract theory. In contract theory, the employer generally does not completely know employees’ private information before the employment and needs to offer employees a contract under incomplete information. In our problem, the PU and SUs are, respectively, the employer and employees, and the contract consists of a set of items representing combinations of spectrum accessing time (i.e., reward) and relaying power (i.e., contribution). We study the optimal contract design for both weakly and strongly incomplete information scenarios. In the weakly incomplete information scenario, we show that the PU will optimally hire the most efficient SUs and achieves the same maximum utility as in the complete information benchmark. In the strongly incomplete information scenario, however, the PU may conservatively hire less efficient SUs as well. We further propose a decompose-and-compare (DC) approximate algorithm that achieves a close-to-optimal contract. We further show that the PU’s average utility loss due to the suboptimal DC algorithm and the strongly incomplete information is relatively small (less than 2 and 1.3 percent, respectively, in our numerical results with two SU types). | 2014 |
2. | Energy-Aware Resource Allocation Strategies for LTE Uplink with Synchronous HARQ Constraints | Abstract—In this paper, we propose a framework for energy efficient resource allocation in multiuser localized SC-FDMA with synchronous HARQ constraints. Resource allocation is formulated as a two-stage problem where resources are allocated in both time and frequency. The impact of retransmissions on the time-frequency problem segmentation is handled through the use of a novel block scheduling interval specifically designed for synchronous HARQ to ensure uplink users do not experience ARQ blocking. Using this framework, we formulate the optimal margin adaptive allocation problem, and based on its structure, we propose two suboptimal approaches to minimize average power allocation required for resource allocation while attempting to reduce complexity. Results are presented for computational complexity and average power allocation relative to system complexity and data rate, and comparisons are made between the proposed optimal and suboptimal approaches. | 2014 |
3. | Preserving Location Privacy in Geosocial Applications | Abstract—Using geosocial applications, such as FourSquare, millions of people interact with their surroundings through their friends and their recommendations. Without adequate privacy protection, however, these systems can be easily misused, for example, to track users or target them for home invasion. In this paper, we introduce LocX, a novel alternative that provides significantly improved location privacy without adding uncertainty into query results or relying on strong assumptions about server security. Our key insight is to apply secure user-specific, distance-preserving coordinate transformations to all location data shared with the server. The friends of a user share this user’s secrets so they can apply the same transformation. This allows all location queries to be evaluated correctly by the server, but our privacy mechanisms guarantee that servers are unable to see or infer the actual location data from the transformed data or from the data access. We show that LocX provides privacy even against a powerful adversary model, and we use prototype measurements to show that it provides privacy with very little performance overhead, making it suitable for today’s mobile devices. (A short distance-preserving transform sketch follows the table.) | 2014 |
4. | Snapshot and Continuous Data Collection in Probabilistic Wireless Sensor Networks | Abstract—Data collection is a common operation of Wireless Sensor Networks (WSNs), whose performance can be measured by its achievable network capacity. Most existing works studying the network capacity issue are based on an impractical model, the deterministic network model. In this paper, a more reasonable model, the probabilistic network model, is considered. For snapshot data collection, we propose a novel Cell-based Path Scheduling (CPS) algorithm that achieves a capacity of Ω(W / (5ω ln n)) in the worst case and order-optimal capacity in expectation, where n is the number of sensor nodes, ω is a constant, and W is the data transmission rate. For continuous data collection, we propose a Zone-based Pipeline Scheduling (ZPS) algorithm. ZPS significantly speeds up the continuous data collection process by forming a data transmission pipeline, and in order achieves a worst-case capacity N times better than the optimal capacity of the snapshot data collection scenario, where N is the number of snapshots in a continuous data collection task. The simulation results also validate that the proposed algorithms significantly improve network capacity compared with the existing works. | 2014 |
5. | A QoS-Oriented Distributed Routing Protocol for Hybrid Wireless Networks | Abstract—As wireless communication gains popularity, significant research has been devoted to supporting real-time transmission with stringent Quality of Service (QoS) requirements for wireless applications. At the same time, a hybrid wireless network that integrates a mobile wireless ad hoc network (MANET) and a wireless infrastructure network has been proven to be a better alternative for the next-generation wireless networks. By directly adopting resource reservation-based QoS routing for MANETs, hybrid networks inherit the invalid reservation and race condition problems of MANETs. How to guarantee QoS in hybrid networks remains an open problem. In this paper, we propose a QoS-Oriented Distributed routing protocol (QOD) to enhance the QoS support capability of hybrid networks. Taking advantage of the fewer transmission hops and anycast transmission features of hybrid networks, QOD transforms the packet routing problem into a resource scheduling problem. QOD incorporates five algorithms: 1) a QoS-guaranteed neighbor selection algorithm to meet the transmission delay requirement, 2) a distributed packet scheduling algorithm to further reduce transmission delay, 3) a mobility-based segment resizing algorithm that adaptively adjusts segment size according to node mobility in order to reduce transmission time, 4) a traffic redundancy elimination algorithm to increase the transmission throughput, and 5) a data redundancy elimination-based transmission algorithm to eliminate redundant data to further improve the transmission QoS. Analytical and simulation results based on the random waypoint model and a real human mobility model show that QOD can provide high QoS performance in terms of overhead, transmission delay, mobility resilience, and scalability. | 2014 |
6. | Cooperative Caching for Efficient Data Access in Disruption Tolerant Networks | Abstract—Disruption tolerant networks (DTNs) are characterized by low node density, unpredictable node mobility, and lack of global network information. Most of current research efforts in DTNs focus on data forwarding, but only limited work has been done on providing efficient data access to mobile users. In this paper, we propose a novel approach to support cooperative caching in DTNs, which enables the sharing and coordination of cached data among multiple nodes and reduces data access delay. Our basic idea is to intentionally cache data at a set of network central locations (NCLs), which can be easily accessed by other nodes in the network. We propose an efficient scheme that ensures appropriate NCL selection based on a probabilistic selection metric and coordinates multiple caching nodes to optimize the tradeoff between data accessibility and caching overhead. Extensive trace-driven simulations show that our approach significantly improves data access performance compared to existing schemes. | 2014 |
7. | Real-Time Misbehavior Detection in IEEE 802.11-Based Wireless Networks: An Analytical Approach | Abstract—The distributed nature of CSMA/CA-based wireless protocols, for example, the IEEE 802.11 distributed coordination function (DCF), allows malicious nodes to deliberately manipulate their backoff parameters and, thus, unfairly gain a large share of the network throughput. In this paper, we first design a real-time backoff misbehavior detector, termed the fair share detector (FS detector), which exploits the nonparametric cumulative sum (CUSUM) test to quickly find a selfish malicious node without any a priori knowledge of the statistics of the selfish misbehavior. While most of the existing schemes for selfish misbehavior detection depend on heuristic parameter configuration and experimental performance evaluation, we develop a Markov chain-based analytical model to systematically study the performance of the FS detector in real-time backoff misbehavior detection. Based on the analytical model, we can quantitatively compute the system configuration parameters for guaranteed performance in terms of average false positive rate, average detection delay, and missed detection ratio under a detection delay constraint. We present thorough simulation results to confirm the accuracy of our theoretical analysis as well as demonstrate the performance of the developed FS detector. (A minimal CUSUM update sketch follows the table.) | 2014 |
8. | A Neighbor Coverage-Based Probabilistic Rebroadcast for Reducing Routing Overhead in Mobile Ad Hoc Networks | Due to high mobility of nodes in mobile ad hoc networks (MANETs), there exist frequent link breakages which lead to frequent path failures and route discoveries. The overhead of a route discovery cannot be neglected. In a route discovery, broadcasting is a fundamental and effective data dissemination mechanism, where a mobile node blindly rebroadcasts the first received route request packets unless it has a route to the destination, and thus it causes the broadcast storm problem. In this paper, we propose a neighbor coverage-based probabilistic rebroadcast protocol for reducing routing overhead in MANETs. In order to effectively exploit the neighbor coverage knowledge, we propose a novel rebroadcast delay to determine the rebroadcast order, and then we can obtain the more accurate additional coverage ratio by sensing neighbor coverage knowledge. We also define a connectivity factor to provide the node density adaptation. By combining the additional coverage ratio and connectivity factor, we set a reasonable rebroadcast probability. Our approach combines the advantages of the neighbor coverage knowledge and the probabilistic mechanism, which can significantly decrease the number of retransmissions so as to reduce the routing overhead, and can also improve the routing performance. | 2013 |
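The rebroadcast decision described above can be sketched as follows; the specific formulas for the additional coverage ratio, the connectivity factor, and the constant used here are illustrative assumptions, not the protocol's exact definitions.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

/** Illustrative sketch (not the paper's exact formulas): a node rebroadcasts a route
 *  request with probability derived from its additional coverage ratio and a
 *  connectivity factor that adapts to local node density. */
public class ProbabilisticRebroadcast {
    private static final Random RNG = new Random();

    /** Fraction of own neighbors not already covered by the sender's neighbor set. */
    static double additionalCoverageRatio(Set<Integer> myNeighbors, Set<Integer> senderNeighbors) {
        if (myNeighbors.isEmpty()) return 0.0;
        Set<Integer> uncovered = new HashSet<>(myNeighbors);
        uncovered.removeAll(senderNeighbors);
        return (double) uncovered.size() / myNeighbors.size();
    }

    /** Connectivity factor: larger when the neighborhood is sparse (constant 5.0 is assumed). */
    static double connectivityFactor(int neighborCount) {
        return neighborCount == 0 ? 1.0 : Math.min(1.0, 5.0 / neighborCount);
    }

    static boolean shouldRebroadcast(Set<Integer> myNeighbors, Set<Integer> senderNeighbors) {
        double p = Math.min(1.0,
                additionalCoverageRatio(myNeighbors, senderNeighbors)
                        * connectivityFactor(myNeighbors.size()));
        return RNG.nextDouble() < p;
    }

    public static void main(String[] args) {
        Set<Integer> mine = new HashSet<>(Set.of(1, 2, 3, 4, 5, 6));
        Set<Integer> senders = new HashSet<>(Set.of(1, 2));
        System.out.println("Rebroadcast? " + shouldRebroadcast(mine, senders));
    }
}
```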
9. | Relay Selection for Geographical Forwarding in Sleep-Wake Cycling Wireless Sensor Networks | Our work is motivated by geographical forwarding of sporadic alarm packets to a base station in a wireless sensor network (WSN), where the nodes are sleep-wake cycling periodically and asynchronously. We seek to develop local forwarding algorithms that can be tuned so as to trade off the end-to-end delay against a total cost, such as the hop count or total energy. Our approach is to solve, at each forwarding node en route to the sink, the local forwarding problem of minimizing one-hop waiting delay subject to a lower bound constraint on a suitable reward offered by the next-hop relay; the constraint serves to tune the tradeoff. The reward metric used for the local problem is based on the end-to-end total cost objective (for instance, when the total cost is hop count, we choose to use the progress toward sink made by a relay as the reward). The forwarding node, to begin with, is uncertain about the number of relays, their wake-up times, and the reward values, but knows the probability distributions of these quantities. At each relay wake-up instant, when a relay reveals its reward value, the forwarding node’s problem is to forward the packet or to wait for further relays to wake up. In terms of the operations research literature, our work can be considered as a variant of the asset selling problem. We formulate our local forwarding problem as a partially observable Markov decision process (POMDP) and obtain inner and outer bounds for the optimal policy. Motivated by the computational complexity involved in the policies derived out of these bounds, we formulate an alternate simplified model, the optimal policy for which is a simple threshold rule. We provide simulation results to compare the performance of the inner and outer bound policies against the simple policy, and also against the optimal policy when the source knows the exact number of relays. Observing the good performance and the ease of implementation of the simple policy, we apply it to our motivating problem, i.e., local geographical routing of sporadic alarm packets in a large WSN. We compare the end-to-end performance (i.e., average total delay and average total cost) obtained by the simple policy, when used for local geographical forwarding, against that obtained by the globally optimal forwarding algorithm proposed by Kim et al. | 2013 |
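As a rough illustration of the simple threshold rule mentioned above (not the POMDP policy or its bounds), the sketch below forwards to the first awake relay whose reward exceeds a threshold that relaxes with waiting time; all names and parameter values are hypothetical.

```java
import java.util.List;

/** Illustrative threshold-rule sketch for sleep-wake relay selection: forward to the first
 *  awake relay whose reward (e.g., geographic progress toward the sink) exceeds a threshold
 *  that relaxes as the waiting delay grows. Parameter values are assumed, not from the paper. */
public class ThresholdRelaySelection {

    record WakeUp(double timeMs, double reward) {}

    /** alpha trades delay against cost: a higher alpha relaxes the threshold faster. */
    static WakeUp choose(List<WakeUp> wakeUps, double initialThreshold, double alpha) {
        for (WakeUp w : wakeUps) {                       // relays assumed sorted by wake-up time
            double threshold = initialThreshold * Math.exp(-alpha * w.timeMs());
            if (w.reward() >= threshold) {
                return w;                                // forward immediately to this relay
            }
        }
        return wakeUps.get(wakeUps.size() - 1);          // otherwise fall back to the last relay
    }

    public static void main(String[] args) {
        List<WakeUp> wakeUps = List.of(
                new WakeUp(5, 0.4), new WakeUp(18, 0.55), new WakeUp(42, 0.9));
        WakeUp chosen = choose(wakeUps, 0.8, 0.02);
        System.out.printf("Forward at t=%.0f ms with reward %.2f%n", chosen.timeMs(), chosen.reward());
    }
}
```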
10. | Toward Privacy Preserving and Collusion Resistance in a Location Proof Updating System | Today’s location-sensitive services rely on a user’s mobile device to determine the current location. This allows malicious users to access a restricted resource or provide bogus alibis by cheating on their locations. To address this issue, we propose A Privacy-Preserving LocAtion proof Updating System (APPLAUS) in which colocated Bluetooth-enabled mobile devices mutually generate location proofs and send updates to a location proof server. Periodically changed pseudonyms are used by the mobile devices to protect source location privacy from each other, and from the untrusted location proof server. We also develop a user-centric location privacy model in which individual users evaluate their location privacy levels and decide whether and when to accept location proof requests. In order to defend against colluding attacks, we also present betweenness ranking-based and correlation clustering-based approaches for outlier detection. APPLAUS can be implemented with existing network infrastructure, and can be easily deployed on Bluetooth-enabled mobile devices with little computation or power cost. Extensive experimental results show that APPLAUS can effectively provide location proofs, significantly preserve the source location privacy, and effectively detect colluding attacks. | 2013 |
11. | Distributed Cooperation and Diversity for Hybrid Wireless Networks | In this paper, we propose a new Distributed Cooperation and Diversity Combining framework. Our focus is on heterogeneous networks with devices equipped with two types of radio frequency (RF) interfaces: a short-range high-rate interface (e.g., IEEE 802.11), and a long-range low-rate interface (e.g., cellular) communicating over urban Rayleigh fading channels. Within this framework, we propose and evaluate a set of distributed cooperation techniques operating at different hierarchical levels with resource constraints such as short-range RF bandwidth. We propose a Priority Maximum-Ratio Combining (PMRC) technique, and a Post Soft-Demodulation Combining (PSDC) technique. We show that the proposed techniques achieve significant improvements on Signal to Noise Ratio (SNR), Bit Error Rate (BER) and throughput through analysis, simulation, and experimentation on our software radio testbed. Our results also indicate that, under several communication scenarios, PMRC and PSDC can improve the throughput performance by over an order of magnitude. | 2013 |
12. | Toward a Statistical Framework for Source Anonymity in Sensor Networks | In certain applications, the locations of events reported by a sensor network need to remain anonymous. That is, unauthorized observers must be unable to detect the origin of such events by analyzing the network traffic. Known as the source anonymity problem, this problem has emerged as an important topic in the security of wireless sensor networks, with a variety of techniques based on different adversarial assumptions being proposed. In this work, we present a new framework for modeling, analyzing, and evaluating anonymity in sensor networks. The novelty of the proposed framework is twofold: first, it introduces the notion of “interval indistinguishability” and provides a quantitative measure to model anonymity in wireless sensor networks; second, it maps source anonymity to the statistical problem of binary hypothesis testing with nuisance parameters. We then analyze existing solutions for designing anonymous sensor networks using the proposed model. We show how mapping source anonymity to binary hypothesis testing with nuisance parameters leads to converting the problem of exposing private source information into searching for an appropriate data transformation that removes or minimizes the effect of the nuisance information. By doing so, we transform the problem from analyzing real-valued sample points to binary codes, which opens the door for coding theory to be incorporated into the study of anonymous sensor networks. Finally, we discuss how existing solutions can be modified to improve their anonymity. | 2013 |
13. | Vampire Attacks: Draining Life from Wireless Ad Hoc Sensor Networks | Ad hoc low-power wireless networks are an exciting research direction in sensing and pervasive computing. Prior security work in this area has focused primarily on denial of communication at the routing or medium access control levels. This paper explores resource depletion attacks at the routing protocol layer, which permanently disable networks by quickly draining nodes’ battery power. These “Vampire” attacks are not specific to any particular protocol, but rather rely on the properties of many popular classes of routing protocols. We find that all examined protocols are susceptible to Vampire attacks, which are devastating, difficult to detect, and easy to carry out using as few as one malicious insider sending only protocol-compliant messages. In the worst case, a single Vampire can increase network-wide energy usage by a factor of O(N), where N is the number of network nodes. We discuss methods to mitigate these types of attacks, including a new proof-of-concept protocol that provably bounds the damage caused by Vampires during the packet forwarding phase. | 2013 |
TECHNOLOGY: JAVA
DOMAIN: NETWORKING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Fast Regular Expression Matching Using Small TCAM | Abstract—Regular expression (RE) matching is a core component of deep packet inspection in modern networking and security devices. In this paper, we propose the first hardware-based RE matching approach that uses ternary content addressable memory (TCAM), which is available as off-the-shelf chips and has been widely deployed in modern networking devices for tasks such as packet classification. We propose three novel techniques to reduce TCAM space and improve RE matching speed: transition sharing, table consolidation, and variable striding. We tested our techniques on eight real-world RE sets, and our results show that small TCAMs can be used to store large deterministic finite automata (DFAs) and achieve potentially high RE matching throughput. For space, we can store each of the corresponding eight DFAs with 25,000 states in a 0.59-Mb TCAM chip. Using a different TCAM encoding scheme that facilitates processing multiple characters per transition, we can achieve potential RE matching throughput of 10–19 Gb/s for each of the eight DFAs using only a single 2.36-Mb TCAM chip. | 2014 |
2. | Green Networking With Packet Processing Engines: Modeling and Optimization | Abstract—With the aim of controlling power consumption in metro/transport and core networks, we consider energy-aware devices able to reduce their energy requirements by adapting their performance. In particular, we focus on state-of-the-art packet processing engines, which generally represent the most energy-consuming components of network devices, and which are often composed of a number of parallel pipelines to “divide and conquer” the incoming traffic load. Our goal is to control both the power configuration of pipelines and the way to distribute traffic flows among them. We propose an analytical model to accurately represent the impact of green network technologies (i.e., low power idle and adaptive rate) on network- and energy-aware performance indexes. The model has been validated with experimental results, performed by using energy-aware software routers loaded by real-world traffic traces. The achieved results demonstrate how the proposed model can effectively represent energy- and network-aware performance indexes. On this basis, we propose a constrained optimization policy, which seeks the best tradeoff between power consumption and packet latency times. The procedure aims at dynamically adapting the energy-aware device configuration to minimize energy consumption while coping with incoming traffic volumes and meeting network performance constraints. In order to deeply understand the impact of such policy, a number of tests have been performed by using experimental data from software router architectures and real-world traffic traces. | 2014 |
3. | On Sample-Path Optimal Dynamic Scheduling for Sum-Queue Minimization in Forests | Abstract—We investigate the problem of minimizing the sum of the queue lengths of all the nodes in a wireless network with a forest topology. Each packet is destined to one of the roots (sinks) of the forest. We consider a time-slotted system and a primary (or one-hop) interference model. We characterize the existence of causal sample-path optimal scheduling policies for this network topology under this interference model. A causal sample-path optimal scheduling policy is one for which at each time-slot, and for any sample-path traffic arrival pattern, the sum of the queue lengths of all the nodes in the network is minimum among all policies. We show that such policies exist in restricted forest structures, and that for any other forest structure, there exists a traffic arrival pattern for which no causal sample-path optimal policy can exist. Surprisingly, we show that many forest structures for which such policies exist can be scheduled by converting the structure into an equivalent linear network and scheduling the equivalent linear network according to the one-hop interference model. The nonexistence of such policies in many forest structures underscores the inherent limitation of using sample-path optimality as a performance metric and necessitates the need to study other (relatively) weaker metrics of delay performance. | 2014 |
4. | PACK: Prediction-Based Cloud Bandwidth and Cost Reduction System | Abstract—In this paper, we present PACK (Predictive ACKs), a novel end-to-end traffic redundancy elimination (TRE) system, designed for cloud computing customers. Cloud-based TRE needs to apply a judicious use of cloud resources so that the bandwidth cost reduction combined with the additional cost of TRE computation and storage would be optimized. PACK’s main advantage is its capability of offloading the cloud-server TRE effort to end clients, thus minimizing the processing costs induced by the TRE algorithm. Unlike previous solutions, PACK does not require the server to continuously maintain clients’ status. This makes PACK very suitable for pervasive computation environments that combine client mobility and server migration to maintain cloud elasticity. PACK is based on a novel TRE technique, which allows the client to use newly received chunks to identify previously received chunk chains, which in turn can be used as reliable predictors to future transmitted chunks. We present a fully functional PACK implementation, transparent to all TCP-based applications and network devices. Finally, we analyze PACK benefits for cloud users, using traffic traces from various sources. | 2014 |
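The chunk-chain prediction idea behind PACK can be sketched as follows; the chunking, digesting, and class names are illustrative assumptions rather than the system's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch of receiver-driven chunk prediction in the spirit of PACK (names and
 *  chunking are assumed, not the paper's implementation): the receiver remembers which
 *  chunk followed which, and when it recognizes an incoming chunk it predicts the next
 *  one by its digest so the sender can avoid retransmitting redundant bytes. */
public class ChunkPredictor {
    private final Map<String, String> nextChunkDigest = new HashMap<>(); // digest(chunk) -> digest(next chunk)
    private final MessageDigest sha;

    public ChunkPredictor() throws Exception {
        sha = MessageDigest.getInstance("SHA-1");
    }

    private String digest(byte[] chunk) {
        StringBuilder sb = new StringBuilder();
        for (byte b : sha.digest(chunk)) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    /** Learn a chain from previously received traffic. */
    public void learn(byte[] chunk, byte[] followingChunk) {
        nextChunkDigest.put(digest(chunk), digest(followingChunk));
    }

    /** On receiving a chunk, return the predicted digest of the next chunk, or null. */
    public String predictNext(byte[] receivedChunk) {
        return nextChunkDigest.get(digest(receivedChunk));
    }

    public static void main(String[] args) throws Exception {
        ChunkPredictor p = new ChunkPredictor();
        byte[] a = "GET /index.html".getBytes(StandardCharsets.UTF_8);
        byte[] b = "<html>cached page body</html>".getBytes(StandardCharsets.UTF_8);
        p.learn(a, b);
        // Later, the same request arrives again: the receiver predicts chunk b
        // and would send its digest to the sender instead of downloading it.
        System.out.println("Predicted next-chunk digest: " + p.predictNext(a));
    }
}
```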
5. | Secure Data Retrieval for Decentralized Disruption-Tolerant Military Networks | Abstract—Mobile nodes in military environments such as a battlefield or a hostile region are likely to suffer from intermittent network connectivity and frequent partitions. Disruption-tolerant network (DTN) technologies are becoming successful solutions that allow wireless devices carried by soldiers to communicate with each other and access the confidential information or command reliably by exploiting external storage nodes. Some of the most challenging issues in this scenario are the enforcement of authorization policies and the policies update for secure data retrieval. Ciphertext-policy attribute-based encryption (CP-ABE) is a promising cryptographic solution to the access control issues. However, the problem of applying CP-ABE in decentralized DTNs introduces several security and privacy challenges with regard to the attribute revocation, key escrow, and coordination of attributes issued from different authorities. In this paper, we propose a secure data retrieval scheme using CP-ABE for decentralized DTNs where multiple key authorities manage their attributes independently. We demonstrate how to apply the proposed mechanism to securely and efficiently manage the confidential data distributed in the disruption-tolerant military network. | 2014 |
6. | A Distributed Control Law for Load Balancing in Content Delivery Networks | In this paper, we face the challenging issue of defining and implementing an effective law for load balancing in Content Delivery Networks (CDNs). We base our proposal on a formal study of a CDN system, carried out through the exploitation of a fluid flow model characterization of the network of servers. Starting from such characterization, we derive and prove a lemma about the network queues equilibrium. This result is then leveraged in order to devise a novel distributed and time-continuous algorithm for load balancing, which is also reformulated in a time-discrete version. The discrete formulation of the proposed balancing law is eventually discussed in terms of its actual implementation in a real-world scenario. Finally, the overall approach is validated by means of simulations. | 2013 |
7. | Achieving Efficient Flooding by Utilizing Link Correlation in Wireless Sensor Networks | Although existing flooding protocols can provide efficient and reliable communication in wireless sensor networks on some level, further performance improvement has been hampered by the assumption of link independence, which requires costly acknowledgments (ACKs) from every receiver. In this paper, we present collective flooding (CF), which exploits link correlation to achieve flooding reliability using the concept of collective ACKs. CF requires only 1-hop information at each node, making the design highly distributed and scalable with low complexity. We evaluate CF extensively in real-world settings, using three different types of testbeds: a single-hop network with 20 MICAz nodes, a multihop network with 37 nodes, and a linear outdoor network with 48 nodes along a 326-m-long bridge. System evaluation and extensive simulation show that CF achieves the same reliability as state-of-the-art solutions while reducing the total number of packet transmissions and the dissemination delay by 30%-50% and 35%-50%, respectively. | 2013 |
8. | Complexity Analysis and Algorithm Design for Advance Bandwidth Scheduling in Dedicated Networks | An increasing number of high-performance networks provision dedicated channels through circuit switching or MPLS/GMPLS techniques to support large data transfer. The link bandwidths in such networks are typically shared by multiple users through advance reservation, resulting in varying bandwidth availability in future time. Developing efficient scheduling algorithms for advance bandwidth reservation has become a critical task to improve the utilization of network resources and meet the transport requirements of application users. We consider an exhaustive combination of different path and bandwidth constraints and formulate four types of advance bandwidth scheduling problems, with the same objective to minimize the data transfer end time for a given transfer request with a prespecified data size: fixed path with fixed bandwidth (FPFB); fixed path with variable bandwidth (FPVB); variable path with fixed bandwidth (VPFB); and variable path with variable bandwidth (VPVB). For VPFB and VPVB, we further consider two subcases where the path switching delay is negligible or nonnegligible. We propose an optimal algorithm for each of these scheduling problems except for FPVB and VPVB with nonnegligible path switching delay, which are proven to be NP-complete and nonapproximable, and then tackled by heuristics. The performance superiority of these heuristics is verified by extensive experimental results in a large set of simulated networks in comparison to optimal and greedy strategies. | 2013 |
9. | Efficient Algorithms for Neighbor Discovery in Wireless Networks | Neighbor discovery is an important first step in the initialization of a wireless ad hoc network. In this paper, we design and analyze several algorithms for neighbor discovery in wireless networks. Starting with a single-hop wireless network of n nodes, we propose a Θ(n ln n) ALOHA-like neighbor discovery algorithm when nodes cannot detect collisions, and an order-optimal Θ(n) receiver feedback-based algorithm when nodes can detect collisions. Our algorithms neither require nodes to have a priori estimates of the number of neighbors nor synchronization between nodes. Our algorithms allow nodes to begin execution at different time instants and to terminate neighbor discovery upon discovering all their neighbors. We finally show that receiver feedback can be used to achieve a Θ(n) running time, even when nodes cannot detect collisions. We then analyze neighbor discovery in a general multihop setting. We establish an upper bound of O(Δ ln n) on the running time of the ALOHA-like algorithm, where Δ denotes the maximum node degree in the network and n the total number of nodes. We also establish a lower bound of Ω(Δ + ln n) on the running time of any randomized neighbor discovery algorithm. Our result thus implies that the ALOHA-like algorithm is at most a factor min(Δ, ln n) worse than optimal. | 2013 |
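A toy simulation of the ALOHA-like discovery idea (each neighbor transmits with probability 1/n per slot, and only collision-free slots count) is sketched below; it is meant only to illustrate the Θ(n ln n) behavior, not to reproduce the paper's algorithms.

```java
import java.util.Random;

/** Toy simulation of ALOHA-like neighbor discovery: in every slot each of the n neighbors
 *  transmits with probability 1/n; a slot is useful only when exactly one node transmits,
 *  so discovering everyone takes on the order of n ln n slots in expectation. */
public class AlohaNeighborDiscovery {
    public static void main(String[] args) {
        int n = 50;
        boolean[] discovered = new boolean[n];
        int remaining = n, slots = 0;
        Random rng = new Random(7);

        while (remaining > 0) {
            slots++;
            int transmitter = -1, transmissions = 0;
            for (int node = 0; node < n; node++) {
                if (rng.nextDouble() < 1.0 / n) {      // each neighbor transmits w.p. 1/n
                    transmissions++;
                    transmitter = node;
                }
            }
            if (transmissions == 1 && !discovered[transmitter]) {  // collision-free slot
                discovered[transmitter] = true;
                remaining--;
            }
        }
        System.out.printf("Discovered all %d neighbors in %d slots (n ln n is about %.0f)%n",
                n, slots, n * Math.log(n));
    }
}
```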
10. | Semi-Random Backoff: Towards Resource Reservation for Channel Access in Wireless LANs | This paper proposes a semi-random backoff (SRB) method that enables resource reservation in contention-based wireless LANs. The proposed SRB is fundamentally different from traditional random backoff methods because it provides an easy migration path from random backoffs to deterministic slot assignments. The central idea of the SRB is for the wireless station to set its backoff counter to a deterministic value upon a successful packet transmission. This deterministic value will allow the station to reuse the time-slot in consecutive backoff cycles. When multiple stations with successful packet transmissions reuse their respective time-slots, the collision probability is reduced, and the channel achieves the equivalence of resource reservation. In case of a failed packet transmission, a station will revert to the standard random backoff method and probe for a new available time-slot. The proposed SRB method can be readily applied to both 802.11 DCF and 802.11e EDCA networks with minimum modification to the existing DCF/EDCA implementations. Theoretical analysis and simulation results validate the superior performance of the SRB for small-scale and heavily loaded wireless LANs. When combined with an adaptive mechanism and a persistent backoff process, SRB can also be effective for large-scale and lightly loaded wireless networks. | 2013 |
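The core slot-reuse rule of SRB can be sketched in a few lines; the deterministic slot value and contention window below are assumed for illustration and are not taken from the paper or the 802.11 standard.

```java
import java.util.Random;

/** Illustrative sketch of the semi-random backoff idea: after a successful transmission the
 *  station reuses a fixed backoff value so it lands in the same slot of the next cycle;
 *  after a failure it falls back to ordinary random backoff to probe for a free slot.
 *  Parameter values are assumptions, not from the paper. */
public class SemiRandomBackoff {
    private static final int DETERMINISTIC_SLOT = 16;  // reserved slot after a success (assumed)
    private static final int MIN_CW = 32;              // contention window for random fallback (assumed)
    private final Random rng = new Random();
    private boolean lastTxSucceeded = false;

    public int nextBackoff() {
        return lastTxSucceeded ? DETERMINISTIC_SLOT : rng.nextInt(MIN_CW);
    }

    public void onTransmissionResult(boolean success) {
        lastTxSucceeded = success;
    }

    public static void main(String[] args) {
        SemiRandomBackoff station = new SemiRandomBackoff();
        boolean[] outcomes = {false, true, true, false, true};
        for (boolean ok : outcomes) {
            System.out.println("backoff = " + station.nextBackoff());
            station.onTransmissionResult(ok);
        }
    }
}
```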
11. | A Utility Maximization Framework for Fair and Efficient Multicasting in Multicarrier Wireless Cellular Networks | Multicast/broadcast is regarded as an efficient technique for wireless cellular networks to transmit a large volume of common data to multiple mobile users simultaneously. To guarantee the quality of service for each mobile user in such single-hop multicasting, the base-station transmitter usually adapts its data rate to the worst channel condition among all users in a multicast group. On one hand, increasing the number of users in a multicast group leads to a more efficient utilization of spectrum bandwidth, as users in the same group can be served together. On the other hand, too many users in a group may lead to an unacceptably low data rate at which the base station can transmit. Hence, a natural question that arises is how to efficiently and fairly transmit to a large number of users requiring the same message. This paper endeavors to answer this question by studying the problem of multicasting over multicarriers in wireless orthogonal frequency division multiplexing (OFDM) cellular systems. Using a unified utility maximization framework, we investigate this problem in two typical scenarios: namely, when users experience roughly equal path losses and when they experience different path losses, respectively. Through theoretical analysis, we obtain optimal multicast schemes satisfying various throughput-fairness requirements in these two cases. In particular, we show that the conventional multicast scheme is optimal in the equal-path-loss case regardless of the utility function adopted. When users experience different path losses, the group multicast scheme, which divides the users almost equally into many multicast groups and multicasts to different groups of users over nonoverlapping subcarriers, is optimal. | 2013 |
TECHNOLOGY: JAVA
DOMAIN: PARALLEL & DISTRIBUTED SYSTEM
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Enabling Trustworthy Service Evaluation in Service-Oriented Mobile Social Networks | In this paper, we propose a Trustworthy Service Evaluation (TSE) system to enable users to share service reviews in service-oriented mobile social networks (S-MSNs). Each service provider independently maintains a TSE for itself, which collects and stores users’ reviews about its services without requiring any third trusted authority. The service reviews can then be made available to interested users in making wise service selection decisions. We identify three unique service review attacks, i.e., linkability, rejection, and modification attacks, and develop sophisticated security mechanisms for the TSE to deal with these attacks. Specifically, the basic TSE (bTSE) enables users to distributedly and cooperatively submit their reviews in an integrated chain form by using hierarchical and aggregate signature techniques. It prevents the service providers from rejecting, modifying, or deleting the reviews. Thus, the integrity and authenticity of reviews are improved. Further, we extend the bTSE to a Sybil-resisted TSE (SrTSE) to enable the detection of two typical Sybil attacks. In the SrTSE, if a user generates multiple reviews toward a vendor in a predefined time slot with different pseudonyms, the real identity of that user will be revealed. Through security analysis and numerical results, we show that the bTSE and the SrTSE effectively resist the service review attacks and the SrTSE additionally detects the Sybil attacks in an efficient manner. Through performance evaluation, we show that the bTSE achieves better performance in terms of submission rate and delay than a service review system that does not adopt user cooperation. | 2014 |
2. | A Tag Encoding Scheme against Pollution Attack to Linear Network Coding | Network coding allows intermediate nodes to encode data packets to improve network throughput and robustness. However, it increases the propagation speed of polluted data packets if a malicious node injects fake data packets into the network, which degrades the bandwidth efficiency greatly and leads to incorrect decoding at sinks. In this paper, insights on new mathematical relations in linear network coding are presented and a key predistribution-based tag encoding scheme KEPTE is proposed, which enables all intermediate nodes and sinks to detect the correctness of the received data packets. Furthermore, the security of KEPTE with regard to pollution attack and tag pollution attack is quantitatively analyzed. The performance of KEPTE is competitive in terms of: 1) low computational complexity; 2) the ability that all intermediate nodes and sinks detect pollution attack; 3) the ability that all intermediate nodes and sinks detect tag pollution attack; and 4) high fault-tolerance ability. To the best of our knowledge, the existing key predistribution-based schemes aiming at pollution detection can only achieve at most three points as described above. Finally, discussions on the application of KEPTE to practical network coding are also presented. | 2014 |
3. | Exploiting Service Similarity for Privacy in Location-Based Search Queries | Location-based applications utilize the positioning capabilities of a mobile device to determine the current location of a user, and customize query results to include neighboring points of interests. However, location knowledge is often perceived as personal information. One of the immediate issues hindering the wide acceptance of location-based applications is the lack of appropriate methodologies that offer fine grain privacy controls to a user without vastly affecting the usability of the service. While a number of privacy-preserving models and algorithms have taken shape in the past few years, there is an almost universal need to specify one’s privacy requirement without understanding its implications on the service quality. In this paper, we propose a user-centric location based service architecture where a user can observe the impact of location inaccuracy on the service accuracy before deciding the geo-coordinates to use in a query. We construct a local search application based on this architecture and demonstrate how meaningful information can be exchanged between the user and the service provider to allow the inference of contours depicting the change in query results across a geographic area. Results indicate the possibility of large default privacy regions (areas of no change in result set) in such applications. | 2014 |
4. | Network Coding Aware Cooperative MAC Protocol for Wireless Ad Hoc Networks | Cooperative communication, which utilizes neighboring nodes to relay the overhearing information, has been employed as an effective technique to deal with the channel fading and to improve the network performances. Network coding, which combines several packets together for transmission, is very helpful to reduce the redundancy at the network and to increase the overall throughput. Introducing network coding into the cooperative retransmission process enables the relay node to assist other nodes while serving its own traffic simultaneously. To leverage the benefits brought by both of them, an efficient Medium Access Control (MAC) protocol is needed. In this paper, we propose a novel network coding aware cooperative MAC protocol, namely NCAC-MAC, for wireless ad hoc networks. The design objective of NCAC-MAC is to increase the throughput and reduce the delay. Simulation results reveal that NCAC-MAC can improve the network performance under general circumstances comparing with two benchmarks. | 2014 |
5. | A Probabilistic Misbehavior Detection Scheme toward Efficient Trust Establishment in Delay-Tolerant Networks | Abstract—Malicious and selfish behaviors represent a serious threat against routing in delay/disruption tolerant networks (DTNs). Due to the unique network characteristics, designing a misbehavior detection scheme in DTN is regarded as a great challenge. In this paper, we propose iTrust, a probabilistic misbehavior detection scheme, for secure DTN routing toward efficient trust establishment. The basic idea of iTrust is introducing a periodically available Trusted Authority (TA) to judge the node’s behavior based on the collected routing evidences and probabilistically checking. We model iTrust as the inspection game and use game theoretical analysis to demonstrate that, by setting an appropriate investigation probability, TA could ensure the security of DTN routing at a reduced cost. To further improve the efficiency of the proposed scheme, we correlate detection probability with a node’s reputation, which allows a dynamic detection probability determined by the trust of the users. The extensive analysis and simulation results demonstrate the effectiveness and efficiency of the proposed scheme. | 2014 |
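A minimal sketch of reputation-weighted probabilistic checking in the spirit of iTrust is shown below; the inspection-probability rule and all names are assumptions for illustration, not the paper's formula.

```java
import java.util.Random;

/** Illustrative sketch of reputation-weighted probabilistic checking: a Trusted Authority
 *  audits a node's routing evidence with a probability that decreases as the node's
 *  reputation grows, so well-behaved nodes are inspected less often. The probability
 *  rule below is an assumption, not the paper's formula. */
public class ProbabilisticInspector {
    private static final double BASE_PROBABILITY = 0.5;  // inspection probability for reputation 0 (assumed)
    private final Random rng = new Random();

    /** reputation in [0, 1]; higher reputation => lower inspection probability. */
    public boolean shouldInspect(double reputation) {
        double p = BASE_PROBABILITY * (1.0 - reputation);
        return rng.nextDouble() < p;
    }

    public static void main(String[] args) {
        ProbabilisticInspector ta = new ProbabilisticInspector();
        for (double reputation : new double[] {0.0, 0.5, 0.9}) {
            int audits = 0, rounds = 10_000;
            for (int i = 0; i < rounds; i++) {
                if (ta.shouldInspect(reputation)) audits++;
            }
            System.out.printf("reputation %.1f -> audited %.1f%% of rounds%n",
                    reputation, 100.0 * audits / rounds);
        }
    }
}
```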
6. | A System for Denial-of-Service Attack Detection Based on Multivariate Correlation Analysis | Abstract—Interconnected systems, such as Web servers, database servers, cloud computing servers and so on, are now under threat from network attackers. As one of the most common and aggressive means, denial-of-service (DoS) attacks cause serious impact on these computing systems. In this paper, we present a DoS attack detection system that uses multivariate correlation analysis (MCA) for accurate network traffic characterization by extracting the geometrical correlations between network traffic features. Our MCA-based DoS attack detection system employs the principle of anomaly-based detection in attack recognition. This makes our solution capable of detecting known and unknown DoS attacks effectively by learning the patterns of legitimate network traffic only. Furthermore, a triangle-area-based technique is proposed to enhance and to speed up the process of MCA. The effectiveness of our proposed detection system is evaluated using the KDD Cup 99 data set, and the influences of both non-normalized data and normalized data on the performance of the proposed detection system are examined. The results show that our system outperforms two other previously developed state-of-the-art approaches in terms of detection accuracy. | 2014 |
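The triangle-area idea can be sketched as follows, assuming a simple triangle-area map where entry (i, j) is half the product of feature magnitudes and a Euclidean distance to a normal profile; the feature names and scoring rule are illustrative, not the system's exact design.

```java
/** Illustrative triangle-area map (TAM) sketch for multivariate correlation analysis:
 *  for each pair of features (i, j) of a traffic record, take the triangle formed by the
 *  origin, (x_i, 0) and (0, x_j); its area |x_i * x_j| / 2 serves as a pairwise-correlation
 *  signature of the record. Feature names and the distance score are assumptions. */
public class TriangleAreaMap {

    static double[][] tam(double[] features) {
        int n = features.length;
        double[][] map = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                map[i][j] = (i == j) ? 0.0 : Math.abs(features[i] * features[j]) / 2.0;
            }
        }
        return map;
    }

    /** Simple anomaly score: Euclidean distance between a record's TAM and a normal profile. */
    static double distance(double[][] a, double[][] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a.length; j++)
                sum += (a[i][j] - b[i][j]) * (a[i][j] - b[i][j]);
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] normal = {120, 3, 0.8};      // e.g. bytes/s, connections/s, SYN ratio (illustrative)
        double[] suspect = {5000, 400, 0.99};
        System.out.printf("distance to normal profile: %.1f%n",
                distance(tam(suspect), tam(normal)));
    }
}
```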
7. | ReDS: A Framework for Reputation-Enhanced DHTs | Abstract—Distributed hash tables (DHTs), such as Chord and Kademlia, offer an efficient means to locate resources in peer-to-peer networks. Unfortunately, malicious nodes on a lookup path can easily subvert such queries. Several systems, including Halo (based on Chord) and Kad (based on Kademlia), mitigate such attacks by using redundant lookup queries. Much greater assurance can be provided; we present Reputation for Directory Services (ReDS), a framework for enhancing lookups in redundant DHTs by tracking how well other nodes service lookup requests. We describe how the ReDS technique can be applied to virtually any redundant DHT including Halo and Kad. We also study the collaborative identification and removal of bad lookup paths in a way that does not rely on the sharing of reputation scores, and we show that such sharing is vulnerable to attacks that make it unsuitable for most applications of ReDS. Through extensive simulations, we demonstrate that ReDS improves lookup success rates for Halo and Kad by 80 percent or more over a wide range of conditions, even against strategic attackers attempting to game their reputation scores and in the presence of node churn. | 2014 |
8. | A Secure Payment Scheme with Low Communication and Processing Overhead for Multihop Wireless Networks | We propose RACE, a report-based payment scheme for multihop wireless networks to stimulate node cooperation, regulate packet transmission, and enforce fairness. The nodes submit lightweight payment reports (instead of receipts) to the accounting center (AC) and temporarily store undeniable security tokens called Evidences. The reports contain the alleged charges and rewards without security proofs, e.g., signatures. The AC can verify the payment by investigating the consistency of the reports, and clear the payment of the fair reports with almost no processing overhead or cryptographic operations. For cheating reports, the Evidences are requested to identify and evict the cheating nodes that submit incorrect reports. Instead of requesting the Evidences from all the nodes participating in the cheating reports, RACE can identify the cheating nodes by requesting only a few Evidences. Moreover, an Evidence aggregation technique is used to reduce the Evidences’ storage area. Our analytical and simulation results demonstrate that RACE requires much less communication and processing overhead than the existing receipt-based schemes with acceptable payment clearance delay and storage area. This is essential for the effective implementation of a payment scheme because it uses micropayment and the overhead cost should be much less than the payment value. Moreover, RACE can secure the payment and precisely identify the cheating nodes without false accusations. | 2013 |
9. | Cluster-Based Certificate Revocation with Vindication Capability for Mobile Ad Hoc Networks | Mobile ad hoc networks (MANETs) have attracted much attention due to their mobility and ease of deployment. However, the wireless and dynamic natures render them more vulnerable to various types of security attacks than the wired networks. The major challenge is to guarantee secure network services. To meet this challenge, certificate revocation is an important integral component to secure network communications. In this paper, we focus on the issue of certificate revocation to isolate attackers from further participating in network activities. For quick and accurate certificate revocation, we propose the Cluster-based Certificate Revocation with Vindication Capability (CCRVC) scheme. In particular, to improve the reliability of the scheme, we recover the warned nodes to take part in the certificate revocation process; to enhance the accuracy, we propose the threshold-based mechanism to assess and vindicate warned nodes as legitimate nodes or not, before recovering them. The performances of our scheme are evaluated by both numerical and simulation analysis. Extensive results demonstrate that the proposed certificate revocation scheme is effective and efficient to guarantee secure communications in mobile ad hoc networks. | 2013 |
10. | Fault Tolerance in Distributed Systems Using Fused Data Structures | Replication is the prevalent solution to tolerate faults in large data structures hosted on distributed servers. To tolerate f crash faults (dead/unresponsive data structures) among n distinct data structures, replication requires f + 1 replicas of each data structure, resulting in nf additional backups. We present a solution, referred to as fusion, that uses a combination of erasure codes and selective replication to tolerate f crash faults using just f additional fused backups. We show that our solution achieves O(n) savings in space over replication. Further, we present a solution to tolerate f Byzantine faults (malicious data structures) that requires only nf + f backups as compared to the 2nf backups required by replication. We explore the theory of fused backups and provide a library of such backups for all the data structures in the Java Collection Framework. The theoretical and experimental evaluation confirms that the fused backups are space-efficient as compared to replication, while they cause very little overhead for normal operation. To illustrate the practical usefulness of fusion, we use fused backups for reliability in Amazon’s highly available key-value store, Dynamo. While the current replication-based solution uses 300 backup structures, we present a solution that only requires 120 backup structures. This results in savings in space as well as other resources such as power. | 2013 |
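For the special case of a single crash fault, the fusion idea reduces to keeping one XOR-combined backup, as the illustrative sketch below shows; the paper's fused backups handle general data structures and larger f, which this does not attempt.

```java
import java.util.Arrays;

/** Minimal sketch of the fusion idea for f = 1 crash fault (illustrative only): instead of
 *  replicating each of the n primary arrays, keep one XOR-fused backup; if a single primary
 *  is lost it can be recovered from the backup and the surviving primaries. */
public class FusedBackup {

    static int[] fuse(int[][] primaries) {
        int[] fused = new int[primaries[0].length];
        for (int[] primary : primaries)
            for (int i = 0; i < fused.length; i++)
                fused[i] ^= primary[i];
        return fused;
    }

    /** Recover the crashed primary from the fused backup and the remaining primaries. */
    static int[] recover(int[] fused, int[][] survivors) {
        int[] recovered = fused.clone();
        for (int[] survivor : survivors)
            for (int i = 0; i < recovered.length; i++)
                recovered[i] ^= survivor[i];
        return recovered;
    }

    public static void main(String[] args) {
        int[][] primaries = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        int[] fused = fuse(primaries);
        // Suppose primary 1 ({4, 5, 6}) crashes: rebuild it from the backup and primaries 0 and 2.
        int[] recovered = recover(fused, new int[][] {primaries[0], primaries[2]});
        System.out.println("recovered = " + Arrays.toString(recovered));
    }
}
```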
11. | Flexible Symmetrical Global-Snapshot Algorithms for Large-Scale Distributed Systems | Most existing global-snapshot algorithms in distributed systems use control messages to coordinate the construction of a global snapshot among all processes. Since these algorithms typically assume the underlying logical overlay topology is fully connected, the number of control messages exchanged among all processes is proportional to the square of the number of processes, resulting in a higher possibility of network congestion. Hence, such algorithms are neither efficient nor scalable for a large-scale distributed system composed of a huge number of processes. Recently, some efforts have been presented to significantly reduce the number of control messages, but doing so incurs higher response time instead. In this paper, we propose an efficient global-snapshot algorithm able to let every process finish its local snapshot in a given number of rounds. Particularly, such an algorithm allows a tradeoff between the response time and the message complexity. Moreover, our global-snapshot algorithm is symmetrical in the sense that identical steps are executed by every process. This means that our algorithm is able to achieve better workload balance and less network congestion. Most importantly, based on our framework, we demonstrate that the minimum number of control messages required by a symmetrical global-snapshot algorithm is Ω(N log N), where N is the number of processes. Finally, we also assume non-FIFO channels. | 2013 |
12. | High Performance Resource Allocation Strategies for Computational Economies | Utility computing models have long been the focus of academic research, and with the recent success of commercial cloud providers, computation and storage are finally being realized as the fifth utility. Computational economies are often proposed as an efficient means of resource allocation; however, adoption has been limited due to a lack of performance and high overheads. In this paper, we address the performance limitations of existing economic allocation models by defining strategies to reduce the failure and reallocation rate, increase occupancy, and thereby increase the obtainable utilization of the system. The high-performance resource utilization strategies presented can be used by market participants without requiring dramatic changes to the allocation protocol. The strategies considered include overbooking, advanced reservation, just-in-time bidding, and using substitute providers for service delivery. The proposed strategies have been implemented in a distributed metascheduler and evaluated with respect to Grid and cloud deployments. Several diverse synthetic workloads have been used to quantify both the performance benefits and economic implications of these strategies. | 2013 |
13. | Optimal Client-Server Assignment for Internet Distributed Systems | We investigate an underlying mathematical model and algorithms for optimizing the performance of a class of distributed systems over the Internet. Such a system consists of a large number of clients who communicate with each other indirectly via a number of intermediate servers. Optimizing the overall performance of such a system can then be formulated as a client-server assignment problem whose aim is to assign the clients to the servers in such a way as to satisfy some prespecified requirements on the communication cost and load balancing. We show that 1) the total communication load and load balancing are two opposing metrics, and consequently, their tradeoff is inherent in this class of distributed systems; 2) in general, finding the optimal client-server assignment for some prespecified requirements on the total load and load balancing is NP-hard; and therefore, 3) we propose a heuristic via relaxed convex optimization for finding the approximate solution. Our simulation results indicate that the proposed algorithm produces superior performance compared to other heuristics, including the popular Normalized Cuts algorithm. | 2013 |
14. | Scheduling Sensor Data Collection with Dynamic Traffic Patterns | The network traffic pattern of continuous sensor data collection often changes constantly over time due to the exploitation of temporal and spatial data correlations as well as the nature of condition-based monitoring applications. This paper develops a novel TDMA schedule that is capable of efficiently collecting sensor data for any network traffic pattern and is thus well suited to continuous data collection with dynamic traffic patterns. Following this schedule, the energy consumed by sensor nodes for any traffic pattern is very close to the minimum required by their workloads given in the traffic pattern. The schedule also allows the base station to conclude data collection as early as possible according to the traffic load, thereby reducing the latency of data collection. Experimental results using real-world data traces show that, compared with existing schedules that are targeted at a fixed traffic pattern, our proposed schedule significantly improves the energy efficiency and time efficiency of sensor data collection with dynamic traffic patterns. | 2013 |
TECHNOLOGY: JAVA
DOMAIN: SOFTWARE ENGINEERING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Ant Colony Optimization for Software Project Scheduling and Staffing with an Event-Based Scheduler | Research into developing effective computer-aided techniques for planning software projects is important and challenging for software engineering. Different from projects in other fields, software projects are people-intensive activities and their related resources are mainly human resources. Thus, an adequate model for software project planning has to deal with not only the problem of project task scheduling but also the problem of human resource allocation. But as both of these problems are difficult, existing models either suffer from a very large search space or have to restrict the flexibility of human resource allocation to simplify the model. To develop a flexible and effective model for software project planning, this paper develops a novel approach with an event-based scheduler (EBS) and an ant colony optimization (ACO) algorithm. The proposed approach represents a plan by a task list and a planned employee allocation matrix. In this way, both the issues of task scheduling and employee allocation can be taken into account. In the EBS, the beginning time of the project, the time when resources are released from finished tasks, and the time when employees join or leave the project are regarded as events. The basic idea of the EBS is to adjust the allocation of employees at events and keep the allocation unchanged at nonevents. With this strategy, the proposed method enables the modeling of resource conflict and task preemption and preserves the flexibility in human resource allocation. To solve the planning problem, an ACO algorithm is further designed. Experimental results on 83 instances demonstrate that the proposed method is very promising. | 2013 |
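A single ACO step for assigning one task to an employee might look like the sketch below; the heuristic values, alpha, beta, and evaporation rate are assumptions for illustration and are not the paper's EBS/ACO design.

```java
import java.util.Random;

/** Illustrative ant-colony step for assigning a task to an employee: an ant picks employee e
 *  for the current task with probability proportional to pheromone^alpha * heuristic^beta,
 *  and pheromone later evaporates and is reinforced along good plans. All values are assumed. */
public class AcoTaskAssignment {
    static final double ALPHA = 1.0, BETA = 2.0, EVAPORATION = 0.1;
    static final Random RNG = new Random(42);

    /** Roulette-wheel selection of an employee index for one task. */
    static int pickEmployee(double[] pheromone, double[] heuristic) {
        double[] weight = new double[pheromone.length];
        double total = 0.0;
        for (int e = 0; e < pheromone.length; e++) {
            weight[e] = Math.pow(pheromone[e], ALPHA) * Math.pow(heuristic[e], BETA);
            total += weight[e];
        }
        double r = RNG.nextDouble() * total;
        for (int e = 0; e < weight.length; e++) {
            r -= weight[e];
            if (r <= 0) return e;
        }
        return weight.length - 1;
    }

    /** Evaporate pheromone everywhere, then reinforce the chosen employee by the plan's quality. */
    static void updatePheromone(double[] pheromone, int chosen, double planQuality) {
        for (int e = 0; e < pheromone.length; e++) pheromone[e] *= (1.0 - EVAPORATION);
        pheromone[chosen] += planQuality;
    }

    public static void main(String[] args) {
        double[] pheromone = {1.0, 1.0, 1.0};
        double[] skillMatch = {0.9, 0.4, 0.6};   // heuristic desirability of each employee for the task
        int chosen = pickEmployee(pheromone, skillMatch);
        updatePheromone(pheromone, chosen, 0.8);
        System.out.println("assigned to employee " + chosen);
    }
}
```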
TECHNOLOGY: DOTNET
DOMAIN: CLOUD COMPUTING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | A Novel Economic Sharing Model in a Federation of Selfish Cloud Providers | Abstract—This paper presents a novel economic model to regulate capacity sharing in a federation of hybrid cloud providers (CPs). The proposed work models the interactions among the CPs as a repeated game among selfish players that aim at maximizing their profit by selling their unused capacity in the spot market but are uncertain of future workload fluctuations. The proposed work first establishes that the uncertainty in future revenue can act as a participation incentive to sharing in the repeated game. We, then, demonstrate how an efficient sharing strategy can be obtained via solving a simple dynamic programming problem. The obtained strategy is a simple update rule that depends only on the current workloads and a single variable summarizing past interactions. In contrast to existing approaches, the model incorporates historical and expected future revenue as part of the virtual machine (VM) sharing decision. Moreover, these decisions are enforced neither by a centralized broker nor by predefined agreements. Rather, the proposed model employs a simple grim trigger strategy where a CP is threatened by the elimination of future VM hosting by other CPs. Simulation results demonstrate the performance of the proposed model in terms of the increased profit and the reduction in the variance in the spot market VM availability and prices. | 2014 |
2. | A UCONABC Resilient Authorization Evaluation for Cloud Computing | The business-driven access control used in cloud computing is not well suited for tracking fine-grained user service consumption. UCONABC applies continuous authorization reevaluation, which requires usage accounting that enables fine-grained access control for cloud computing. However, it was not designed to work in distributed and dynamic authorization environments like those present in cloud computing. During a continuous (periodic) reevaluation, an authorization exception condition, i.e., a disparity between usage accounting and authorization attributes, may occur. This proposal aims to provide resilience to the UCONABC continuous authorization reevaluation by dealing with individual exception conditions while maintaining a suitable access control in the cloud environment. The experiments made with a proof-of-concept prototype show a set of measurements for an application scenario (e-commerce) and allow for the identification of exception conditions in the authorization reevaluation. | 2014 |
3. | Distributed, Concurrent, and Independent Access to Encrypted Cloud Databases | Abstract—Placing critical data in the hands of a cloud provider should come with the guarantee of security and availability for data at rest, in motion, and in use. Several alternatives exist for storage services, while data confidentiality solutions for the database as a service paradigm are still immature. We propose a novel architecture that integrates cloud database services with data confidentiality and the possibility of executing concurrent operations on encrypted data. This is the first solution supporting geographically distributed clients to connect directly to an encrypted cloud database, and to execute concurrent and independent operations including those modifying the database structure. The proposed architecture has the further advantage of eliminating intermediate proxies that limit the elasticity, availability, and scalability properties that are intrinsic in cloud-based solutions. The efficacy of the proposed architecture is evaluated through theoretical analyses and extensive experimental results based on a prototype implementation subject to the TPC-C standard benchmark for different numbers of clients and network latencies. | 2014 |
4. | Key-Aggregate Cryptosystem for Scalable Data Sharing in Cloud Storage | Abstract—Data sharing is an important functionality in cloud storage. In this paper, we show how to securely, efficiently, and flexibly share data with others in cloud storage. We describe new public-key cryptosystems that produce constant-size ciphertexts such that efficient delegation of decryption rights for any set of ciphertexts is possible. The novelty is that one can aggregate any set of secret keys and make them as compact as a single key, but encompassing the power of all the keys being aggregated. In other words, the secret key holder can release a constant-size aggregate key for flexible choices of ciphertext set in cloud storage, but the other encrypted files outside the set remain confidential. This compact aggregate key can be conveniently sent to others or be stored in a smart card with very limited secure storage. We provide formal security analysis of our schemes in the standard model. We also describe other applications of our schemes. In particular, our schemes give the first public-key patient-controlled encryption for flexible hierarchy, which was yet to be known. | 2014 |
TECHNOLOGY: DOTNET
DOMAIN: DATA MINING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | A Group Incremental Approach to Feature Selection Applying Rough Set Technique | Many real data sets increase dynamically in size. This phenomenon occurs in several fields including economics, population studies, and medical research. As an effective and efficient mechanism to deal with such data, the incremental technique has been proposed in the literature and has attracted much attention, which motivates the work in this paper. When a group of objects is added to a decision table, we first introduce incremental mechanisms for three representative information entropies and then develop a group incremental rough feature selection algorithm based on information entropy. When multiple objects are added to a decision table, the algorithm aims to find the new feature subset in a much shorter time. Experiments have been carried out on eight UCI data sets and the experimental results show that the algorithm is effective and efficient. | 2014 |
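The incremental flavor of the approach can be illustrated with a much-simplified sketch that maintains the entropy of the decision attribute as new groups of objects arrive; the paper works with conditional entropies over equivalence classes, which this omits, and all names here are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch of incrementally maintaining Shannon entropy of a decision attribute as a
 *  group of new objects arrives: class counts are updated in place so entropy does not have
 *  to be recomputed from the full decision table. A simplification of the paper's entropies. */
public class IncrementalEntropy {
    private final Map<String, Integer> classCounts = new HashMap<>();
    private int total = 0;

    public void addObjects(String[] decisionLabels) {
        for (String label : decisionLabels) {
            classCounts.merge(label, 1, Integer::sum);
            total++;
        }
    }

    public double entropy() {
        double h = 0.0;
        for (int count : classCounts.values()) {
            double p = (double) count / total;
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    public static void main(String[] args) {
        IncrementalEntropy e = new IncrementalEntropy();
        e.addObjects(new String[] {"flu", "flu", "cold", "healthy"});
        System.out.printf("entropy after first batch:  %.3f bits%n", e.entropy());
        e.addObjects(new String[] {"cold", "cold", "healthy"});   // new group of objects
        System.out.printf("entropy after second batch: %.3f bits%n", e.entropy());
    }
}
```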
2. | Consensus-Based Ranking of Multivalued Objects: A Generalized Borda Count Approach | Abstract—In this paper, we tackle a novel problem of ranking multivalued objects, where an object has multiple instances in a multidimensional space, and the number of instances per object is not fixed. Given an ad hoc scoring function that assigns a score to a multidimensional instance, we want to rank a set of multivalued objects. Different from the existing models of ranking uncertain and probabilistic data, which model an object as a random variable and the instances of an object are assumed exclusive, we have to capture the coexistence of instances here. To tackle the problem, we advocate the semantics of favoring widely preferred objects instead of majority votes, which is widely used in many elections and competitions. Technically, we borrow the idea from Borda Count (BC), a well-recognized method in consensus-based voting systems. However, Borda Count cannot handle multivalued objects of inconsistent cardinality, and is costly to evaluate top k queries on large multidimensional data sets. To address the challenges, we extend and generalize Borda Count to quantile-based Borda Count, and develop efficient computational methods with comprehensive cost analysis. We present case studies on real data sets to demonstrate the effectiveness of the generalized Borda Count ranking, and use synthetic and real data sets to verify the efficiency of our computational method. | 2014 |
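The underlying consensus idea is classic Borda Count, sketched below; the paper generalizes this to quantile-based scoring over multivalued objects, which the sketch does not cover.

```java
import java.util.HashMap;
import java.util.Map;

/** Classic Borda Count sketch: each ranking awards a candidate one point for every candidate
 *  placed below it, and the totals give the consensus order. Illustrative only; the paper's
 *  quantile-based generalization for multivalued objects is not modeled here. */
public class BordaCount {

    static Map<String, Integer> score(String[][] rankings) {
        Map<String, Integer> points = new HashMap<>();
        for (String[] ranking : rankings) {
            int m = ranking.length;
            for (int position = 0; position < m; position++) {
                // the candidate at this position beats (m - 1 - position) others in this ranking
                points.merge(ranking[position], m - 1 - position, Integer::sum);
            }
        }
        return points;
    }

    public static void main(String[] args) {
        String[][] rankings = {
                {"A", "B", "C"},
                {"B", "A", "C"},
                {"A", "C", "B"}
        };
        score(rankings).forEach((candidate, pts) ->
                System.out.println(candidate + " -> " + pts + " points"));
    }
}
```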
3. | Rough Sets, Kernel Set, and Spatiotemporal Outlier Detection | Abstract—Nowadays, the high availability of data gathered from wireless sensor networks and telecommunication systems has drawn the attention of researchers on the problem of extracting knowledge from spatiotemporal data. Detecting outliers which are grossly different from or inconsistent with the remaining spatiotemporal data set is a major challenge in real-world knowledge discovery and data mining applications. In this paper, we deal with the outlier detection problem in spatiotemporal data and describe a rough set approach that finds the top outliers in an unlabeled spatiotemporal data set. The proposed method, called Rough Outlier Set Extraction (ROSE), relies on a rough set theoretic representation of the outlier set using the rough set approximations, i.e., lower and upper approximations. We have also introduced a new set, named Kernel Set, that is a subset of the original data set, which is able to describe the original data set both in terms of data structure and of obtained results. Experimental results on real-world data sets demonstrate the superiority of ROSE, both in terms of some quantitative indices and outliers detected, over those obtained by various rough fuzzy clustering algorithms and by the state-of-the-art outlier detection methods. It is also demonstrated that the kernel set is able to detect the same outliers set but with less computational time. | 2014 |
4. | Discovering Temporal Change Patterns in the Presence of Taxonomies | Frequent itemset mining is a widely used exploratory technique that focuses on discovering recurrent correlations among data. The steadfast evolution of markets and business environments prompts the need for data mining algorithms to discover significant correlation changes in order to reactively suit product and service provision to customer needs. Change mining, in the context of frequent itemsets, focuses on detecting and reporting significant changes in the set of mined itemsets from one time period to another. The discovery of frequent generalized itemsets, i.e., itemsets that 1) frequently occur in the source data, and 2) provide a high-level abstraction of the mined knowledge, raises new challenges in the analysis of itemsets that become rare, and thus are no longer extracted, from a certain point. This paper proposes a novel kind of dynamic pattern, namely the History Generalized Pattern (HiGen), that represents the evolution of an itemset in consecutive time periods, by reporting the information about its frequent generalizations characterized by minimal redundancy (i.e., minimum level of abstraction) in case it becomes infrequent in a certain time period. To address HiGen mining, it proposes HiGen Miner, an algorithm that avoids itemset mining followed by postprocessing by exploiting a support-driven itemset generalization approach. To focus the attention on the minimally redundant frequent generalizations and thus reduce the amount of generated patterns, the discovery of a smart subset of HiGens, namely the Non-redundant HiGens, is addressed as well. Experiments performed on both real and synthetic datasets show the efficiency and the effectiveness of the proposed approach as well as its usefulness in a real application context. | 2013 |
5. | Information-Theoretic Outlier Detection for Large-Scale Categorical Data | Outlier detection can usually be considered as a pre-processing step for locating, in a data set, those objects that do not conform to well-defined notions of expected behavior. It is very important in data mining for discovering novel or rare events, anomalies, vicious actions, exceptional phenomena, etc. We are investigating outlier detection for categorical data sets. This problem is especially challenging because of the difficulty of defining a meaningful similarity measure for categorical data. In this paper, we propose a formal definition of outliers and an optimization model of outlier detection, via a new concept of holoentropy that takes both entropy and total correlation into consideration. Based on this model, we define a function for the outlier factor of an object which is solely determined by the object itself and can be updated efficiently. We propose two practical 1-parameter outlier detection methods, named ITB-SS and ITB-SP, which require no user-defined parameters for deciding whether an object is an outlier. Users need only provide the number of outliers they want to detect. Experimental results show that ITB-SS and ITB-SP are more effective and efficient than mainstream methods and can be used to deal with both large and high-dimensional data sets where existing algorithms fail. | 2013 |
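As a rough illustration of the information-theoretic view taken in entry 5, the sketch below computes per-attribute Shannon entropy for a small categorical data set and measures how much the remaining entropy drops when each record is removed; the record whose removal makes the data most regular is the most outlier-like. This is a simplified stand-in for the holoentropy-based outlier factor, not the ITB-SS/ITB-SP algorithms themselves.

    import java.util.*;

    public class CategoricalEntropy {

        // Shannon entropy (in bits) of one categorical attribute column
        static double entropy(List<String> column) {
            Map<String, Long> counts = new HashMap<>();
            for (String v : column) counts.merge(v, 1L, Long::sum);
            double n = column.size(), h = 0.0;
            for (long c : counts.values()) {
                double p = c / n;
                h -= p * (Math.log(p) / Math.log(2));
            }
            return h;
        }

        // sum of per-attribute entropies over all records except the one at 'skip'
        static double totalEntropyWithout(List<String[]> data, int skip) {
            int attrs = data.get(0).length;
            double total = 0.0;
            for (int a = 0; a < attrs; a++) {
                List<String> col = new ArrayList<>();
                for (int i = 0; i < data.size(); i++) {
                    if (i != skip) col.add(data.get(i)[a]);
                }
                total += entropy(col);
            }
            return total;
        }

        public static void main(String[] args) {
            List<String[]> data = List.of(
                    new String[]{"red", "circle"},
                    new String[]{"red", "circle"},
                    new String[]{"red", "circle"},
                    new String[]{"blue", "square"});   // intuitively the outlier
            for (int i = 0; i < data.size(); i++) {
                System.out.printf("drop record %d -> remaining entropy %.3f%n",
                        i, totalEntropyWithout(data, i));
            }
            // the smallest remaining entropy flags the record whose removal
            // makes the data set most regular, i.e., the likely outlier
        }
    }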
6. | Robust Module-Based Data Management | The current trend for building an ontology-based data management system (DMS) is to capitalize on efforts made to design a preexisting well-established DMS (a reference system). The method amounts to extracting from the reference DMS a piece of schema relevant to the new application needs (a module), possibly personalizing it with extra constraints w.r.t. the application under construction, and then managing a data set using the resulting schema. In this paper, we extend the existing definitions of modules and we introduce novel properties of robustness that provide means for checking easily that a robust module-based DMS evolves safely w.r.t. both the schema and the data of the reference DMS. We carry out our investigations in the setting of description logics which underlie modern ontology languages, like RDFS, OWL, and OWL2 from W3C. Notably, we focus on the DL-liteA dialect of the DL-lite family, which encompasses the foundations of the QL profile of OWL2 (i.e., DL-liteR): the W3C recommendation for efficiently managing large data sets. | 2013 |
7. | Protecting Sensitive Labels in Social Network Data Anonymization | Privacy is one of the major concerns when publishing or sharing social network data for social science research and business analysis. Recently, researchers have developed privacy models similar to k-anonymity to prevent node reidentification through structure information. However, even when these privacy models are enforced, an attacker may still be able to infer one’s private information if a group of nodes largely share the same sensitive labels (i.e., attributes). In other words, the label-node relationship is not well protected by pure structure anonymization methods. Furthermore, existing approaches, which rely on edge editing or node clustering, may significantly alter key graph properties. In this paper, we define a k-degree-l-diversity anonymity model that considers the protection of structural information as well as sensitive labels of individuals. We further propose a novel anonymization methodology based on adding noise nodes. We develop a new algorithm by adding noise nodes into the original graph with the consideration of introducing the least distortion to graph properties. Most importantly, we provide a rigorous analysis of the theoretical bounds on the number of noise nodes added and their impacts on an important graph property. We conduct extensive experiments to evaluate the effectiveness of the proposed technique. | 2013 |
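The sketch below checks the k-degree half of the k-degree-l-diversity model from entry 7: a graph is k-degree anonymous when every degree value is shared by at least k vertices. The graph representation and method names are illustrative; the paper's noise-node insertion algorithm and the l-diversity condition on sensitive labels are not reproduced here.

    import java.util.*;

    // Check k-degree anonymity of a graph given as an adjacency list:
    // every degree value must be shared by at least k vertices.
    public class DegreeAnonymity {

        static boolean isKDegreeAnonymous(Map<Integer, Set<Integer>> adj, int k) {
            Map<Integer, Integer> degreeCounts = new HashMap<>();
            for (Set<Integer> neighbors : adj.values()) {
                degreeCounts.merge(neighbors.size(), 1, Integer::sum);
            }
            return degreeCounts.values().stream().allMatch(c -> c >= k);
        }

        public static void main(String[] args) {
            Map<Integer, Set<Integer>> graph = Map.of(
                    1, Set.of(2, 3),
                    2, Set.of(1, 3),
                    3, Set.of(1, 2),
                    4, Set.of(5),
                    5, Set.of(4));
            System.out.println(isKDegreeAnonymous(graph, 2)); // true: degrees are {2,2,2,1,1}
            System.out.println(isKDegreeAnonymous(graph, 3)); // false: only two vertices of degree 1
        }
    }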
TECHNOLOGY: DOTNET
DOMAIN: PARALLEL & DISTRIBUTED SYSTEM
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Behavioral Malware Detection in Delay Tolerant Networks | Abstract—The delay-tolerant-network (DTN) model is becoming a viable communication alternative to the traditional infrastructural model for modern mobile consumer electronics equipped with short-range communication technologies such as Bluetooth, NFC, and Wi-Fi Direct. Proximity malware is a class of malware that exploits the opportunistic contacts and distributed nature of DTNs for propagation. Behavioral characterization of malware is an effective alternative to pattern matching in detecting malware, especially when dealing with polymorphic or obfuscated malware. In this paper, we first propose a general behavioral characterization of proximity malware based on the naive Bayesian model, which has been successfully applied in non-DTN settings such as filtering email spam and detecting botnets. We identify two unique challenges for extending Bayesian malware detection to DTNs (“insufficient evidence versus evidence collection risk” and “filtering false evidence sequentially and distributedly”), and propose a simple yet effective method, look ahead, to address the challenges. Furthermore, we propose two extensions to look ahead, dogmatic filtering and adaptive look ahead, to address the challenge of “malicious nodes sharing false evidence.” Real mobile network traces are used to verify the effectiveness of the proposed methods. | 2014 |
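The sketch below illustrates the naive Bayesian evidence aggregation that underlies the detection approach of entry 1: each observed behavior contributes a log-likelihood-ratio term, and a node is flagged once the accumulated evidence crosses a threshold. The behavior probabilities and the threshold are placeholders, and the look-ahead, dogmatic filtering, and adaptive look-ahead mechanisms are not modeled.

    import java.util.*;

    public class BayesianEvidence {

        // P(behavior | malicious) and P(behavior | benign) for two toy behaviors
        static final Map<String, double[]> MODEL = Map.of(
                "suspicious-scan", new double[]{0.6, 0.1},
                "normal-transfer", new double[]{0.4, 0.9});

        static boolean flagAsMalicious(List<String> observations, double logThreshold) {
            double evidence = 0.0;
            for (String b : observations) {
                double[] p = MODEL.get(b);
                evidence += Math.log(p[0]) - Math.log(p[1]); // log-likelihood ratio of one observation
            }
            return evidence > logThreshold;
        }

        public static void main(String[] args) {
            List<String> node = List.of("suspicious-scan", "suspicious-scan", "normal-transfer");
            // waiting for more evidence lowers the risk of a wrong verdict,
            // but in a DTN each extra contact with the node is itself a risk
            System.out.println(flagAsMalicious(node, Math.log(4.0)));
        }
    }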
2. | LocaWard: A Security and Privacy Aware Location-Based Rewarding System | Abstract—The proliferation of mobile devices has driven mobile marketing to surge in the past few years. Emerging as a new type of mobile marketing, mobile location-based services (MLBSs) have attracted intense attention recently. Unfortunately, current MLBSs have many limitations and raise serious concerns, especially about system security and users’ privacy. In this paper, we propose a new location-based rewarding system, called LocaWard, where mobile users can collect location-based tokens from token distributors, and then redeem their gathered tokens at token collectors for beneficial rewards. Tokens act as virtual currency. The token distributors and collectors can be any commercial entities or merchants that wish to attract customers through such a promotion system, such as stores, restaurants, and car rental companies. We develop a security and privacy aware location-based rewarding protocol for the LocaWard system, and prove the completeness and soundness of the protocol. Moreover, we show that the system is resilient to various attacks and that mobile users’ privacy can be well protected at the same time. We finally implement the system and conduct extensive experiments to validate the system’s efficiency in terms of computation, communication, energy consumption, and storage costs. | 2014 |
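To convey the flavor of a verifiable location-based token as used in entry 2, the sketch below MACs the token fields (distributor, location, timestamp) with a shared key so that a collector can later check integrity using the standard javax.crypto API. This is only an illustrative construction; the actual LocaWard protocol relies on stronger, privacy-preserving cryptographic primitives.

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class LocationToken {

        // the distributor MACs the token fields with a key shared with the verifier
        static String issueToken(String distributorId, String location,
                                 long timestamp, byte[] key) throws Exception {
            String payload = distributorId + "|" + location + "|" + timestamp;
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] tag = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return payload + "|" + Base64.getEncoder().encodeToString(tag);
        }

        // a collector re-computes the MAC over the claimed fields and compares
        static boolean verifyToken(String token, byte[] key) throws Exception {
            int cut = token.lastIndexOf('|');
            String[] parts = token.substring(0, cut).split("\\|");
            String reissued = issueToken(parts[0], parts[1], Long.parseLong(parts[2]), key);
            return reissued.equals(token);
        }

        public static void main(String[] args) throws Exception {
            byte[] key = "demo-shared-secret".getBytes(StandardCharsets.UTF_8);
            String token = issueToken("store-42", "40.7128,-74.0060",
                    System.currentTimeMillis(), key);
            System.out.println(verifyToken(token, key)); // true
        }
    }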
3. | Power Cost Reduction in Distributed Data Centers: A Two-Time-Scale Approach for Delay Tolerant Workloads | Abstract—This paper considers a stochastic optimization approach for job scheduling and server management in large-scale, geographically distributed data centers. Randomly arriving jobs are routed to a choice of servers. The number of active servers depends on server activation decisions that are updated at a slow time scale, and the service rates of the servers are controlled by power scaling decisions that are made at a faster time scale. We develop a two-time-scale decision strategy that offers provable power cost and delay guarantees. The performance and robustness of the approach are illustrated through simulations. | 2014 |
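The toy simulation below conveys the two-time-scale structure described in entry 3: server activation decisions are revised only once per frame (slow scale), while per-slot power scaling adjusts the service rate of the active servers (fast scale). The arrival process, thresholds, and rates are placeholders and do not implement the paper's stochastic optimization policy.

    import java.util.Random;

    public class TwoTimeScale {

        public static void main(String[] args) {
            final int FRAME = 10;          // slow-scale period, in slots
            int activeServers = 2;
            double queue = 0.0;
            Random rng = new Random(7);

            for (int slot = 0; slot < 50; slot++) {
                if (slot % FRAME == 0) {
                    // slow time scale: activate/deactivate servers based on backlog
                    activeServers = queue > 20 ? 4 : 2;
                }
                // fast time scale: scale per-server service rate with backlog ("power scaling")
                double rate = queue > 10 ? 2.0 : 1.0;
                double arrivals = rng.nextDouble() * 6.0;  // random job arrivals this slot
                queue = Math.max(0.0, queue + arrivals - activeServers * rate);
            }
            System.out.printf("final backlog: %.2f with %d active servers%n",
                    queue, activeServers);
        }
    }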
4. | Traffic Pattern-Based Content Leakage Detection for Trusted Content Delivery Networks | Abstract—Due to the increasing popularity of multimedia streaming applications and services in recent years, the issue of trusted video delivery to prevent undesirable content leakage has become critical. While preserving user privacy, conventional systems have addressed this issue by proposing methods based on the observation of streamed traffic throughout the network. These conventional systems maintain a high detection accuracy while coping with some of the traffic variation in the network (e.g., network delay and packet loss); however, their detection performance substantially degrades owing to the significant variation of video lengths. In this paper, we focus on overcoming this issue by proposing a novel content-leakage detection scheme that is robust to the variation of video length. By comparing videos of different lengths, we determine a relation between the length of the videos to be compared and the similarity between the compared videos. Therefore, we enhance the detection performance of the proposed scheme even in an environment subject to variation in video length. Through a testbed experiment, the effectiveness of our proposed scheme is evaluated in terms of variation of video length, delay variation, and packet loss. | 2014 |
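The sketch below shows why length normalization matters for the problem in entry 4: two traffic traces of different lengths are resampled to a common length and then compared with Pearson correlation. It is a generic illustration of length-robust comparison, not the similarity metric proposed in the paper.

    public class TrafficSimilarity {

        // resample a trace to targetLen points by linear interpolation
        static double[] resample(double[] trace, int targetLen) {
            double[] out = new double[targetLen];
            for (int i = 0; i < targetLen; i++) {
                double pos = (double) i * (trace.length - 1) / (targetLen - 1);
                int lo = (int) Math.floor(pos);
                int hi = Math.min(lo + 1, trace.length - 1);
                double frac = pos - lo;
                out[i] = trace[lo] * (1 - frac) + trace[hi] * frac;
            }
            return out;
        }

        // Pearson correlation between two equal-length traces
        static double pearson(double[] a, double[] b) {
            double meanA = 0, meanB = 0;
            for (int i = 0; i < a.length; i++) { meanA += a[i]; meanB += b[i]; }
            meanA /= a.length; meanB /= b.length;
            double cov = 0, varA = 0, varB = 0;
            for (int i = 0; i < a.length; i++) {
                cov += (a[i] - meanA) * (b[i] - meanB);
                varA += (a[i] - meanA) * (a[i] - meanA);
                varB += (b[i] - meanB) * (b[i] - meanB);
            }
            return cov / Math.sqrt(varA * varB);
        }

        public static void main(String[] args) {
            double[] original = {10, 12, 30, 28, 11, 9, 25, 27};   // bytes per interval
            double[] leaked   = {10, 13, 29, 27, 10, 26};          // shorter capture of the same video
            int n = Math.min(original.length, leaked.length);
            double sim = pearson(resample(original, n), resample(leaked, n));
            System.out.printf("similarity = %.3f%n", sim);
        }
    }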