TECHNOLOGY: JAVA
DOMAIN: Networking
| S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR | 
| 1 | Accurate Per-Packet Delay Tomography in Wireless Ad Hoc Networks | In this paper, we study the problem of decomposing the end-to-end delay into the per-hop delay for each packet in multi-hop wireless ad hoc networks. Knowledge of the per-hop per-packet delay can greatly improve network visibility and facilitate network measurement and management. We propose Domo, a passive, lightweight, and accurate delay tomography approach to decomposing the packet end-to-end delay into each hop. We first formulate the per-packet delay tomography problem as a set of optimization problems by carefully considering the constraints among various timing quantities. At the network side, Domo attaches a small overhead to each packet for constructing constraints of the optimization problems. By solving these optimization problems with semi-definite relaxation at the PC side, Domo is able to estimate the per-hop delays with high accuracy as well as give an upper bound and a lower bound for each unknown per-hop delay. We implement Domo and evaluate its performance extensively using both trace-driven studies and large-scale simulations. Results show that Domo significantly outperforms two existing methods, nearly tripling the accuracy of the state-of-the-art. | 2017 |
| 2 | FiDoop-DP: Data Partitioning in Frequent Itemset Mining on Hadoop Clusters | Traditional parallel algorithms for mining frequent itemsets aim to balance load by equally partitioning data among a group of computing nodes. We start this study by discovering a serious performance problem of the existing parallel Frequent Itemset Mining algorithms. Given a large dataset, data partitioning strategies in the existing solutions suffer from high communication and mining overhead induced by redundant transactions transmitted among computing nodes. We address this problem by developing a data partitioning approach called FiDoop-DP using the MapReduce programming model. The overarching goal of FiDoop-DP is to boost the performance of parallel Frequent Itemset Mining on Hadoop clusters. At the heart of FiDoop-DP is the Voronoi diagram-based data partitioning technique, which exploits correlations among transactions. Incorporating the similarity metric and the Locality-Sensitive Hashing technique, FiDoop-DP places highly similar transactions into a data partition to improve locality without creating an excessive number of redundant transactions. We implement FiDoop-DP on a 24-node Hadoop cluster, driven by a wide range of datasets created by IBM Quest Market-Basket Synthetic Data Generator. Experimental results reveal that FiDoop-DP reduces network and computing loads by eliminating redundant transactions on Hadoop nodes. FiDoop-DP significantly improves the performance of the existing parallel frequent-pattern scheme by up to 31 percent with an average of 18 percent. | 2017 |
| 3 | CoRE: Cooperative End-to-End Traffic Redundancy Elimination for Reducing Cloud Bandwidth Cost | The pay-as-you-go service model impels cloud customers to reduce the usage cost of bandwidth. Traffic Redundancy Elimination (TRE) has been shown to be an effective solution for reducing bandwidth costs, and thus has recently captured significant attention in the cloud environment. By studying the TRE techniques in a trace-driven approach, we found that both short-term (time span of seconds) and long-term (time span of hours or days) data redundancy can concurrently appear in the traffic, and solely using either sender-based TRE or receiver-based TRE cannot simultaneously capture both types of traffic redundancy. Also, the efficiency of existing receiver-based TRE solutions is susceptible to data changes relative to the historical data in the cache. In this paper, we propose a Cooperative end-to-end TRE solution (CoRE) that can detect and remove both short-term and long-term redundancy through a two-layer TRE design with cooperative operations between layers. An adaptive prediction algorithm is further proposed to improve TRE efficiency by dynamically adjusting the prediction window size based on the hit ratio of historical predictions. In addition, we enhance CoRE to adapt to different traffic redundancy characteristics of cloud applications to improve its operation cost. Extensive evaluation with several real traces shows that CoRE is capable of effectively identifying both short-term and long-term redundancy with low additional cost while ensuring TRE efficiency from data changes. | 2017 |
| 4 | PDFS: Partially Dedupped File System for Primary Workloads | Primary storage dedup is difficult to accomplish because of the challenge of achieving low IO latency and high throughput while eliminating data redundancy effectively in the critical IO path. In this paper, we design and implement PDFS, a partially dedupped file system for primary workloads, which is built on a generalized framework using partial data lookup for efficient searching of redundant data in quickly chosen data subsets instead of the whole data. PDFS improves IO latency and throughput systematically through techniques including write path optimization, data dedup parallelization and write order preserving. Such design choices bring dedup to the masses for general primary workloads. Experimental results show that PDFS achieves 74-99 percent of the theoretical maximum dedup ratio with very small or even negative performance degradations compared with mainstream file systems without dedup support. Experiences with various PDFS configurations are also discussed. | 2017 |
| 5 | EAFR: An Energy-Efficient Adaptive File Replication System in Data-Intensive Clusters | In data-intensive clusters, a large number of files are stored, processed and transferred simultaneously. To increase data availability, some file systems create and store three replicas for each file in randomly selected servers across different racks. However, they neglect file heterogeneity and server heterogeneity, which can be leveraged to further enhance data availability and file system efficiency. As files have heterogeneous popularities, a rigid number of three replicas may not provide immediate response to an excessive number of read requests to hot files, and may waste resources (including energy) on replicas of cold files that have few read requests. Also, because servers are heterogeneous in network bandwidth, hardware configuration and capacity (i.e., the maximal number of service requests that can be supported simultaneously), it is crucial to select replica servers that ensure low replication delay and request response delay. In this paper, we propose an Energy-Efficient Adaptive File Replication System (EAFR), which incorporates three components. It is adaptive to time-varying file popularities to achieve a good tradeoff between data availability and efficiency. Higher popularity of a file leads to more replicas and vice versa. Also, to achieve energy efficiency, servers are classified into hot servers and cold servers with different energy consumption, and cold files are stored in cold servers. EAFR then selects a server with sufficient capacity (including network bandwidth and capacity) to hold a replica. To further improve the performance of EAFR, we propose a dynamic transmission rate adjustment strategy to prevent potential incast congestion when replicating a file to a server, a network-aware data node selection strategy to reduce file read latency, and a load-aware replica maintenance strategy to quickly create file replicas under replica node failures. Experimental results on a real-world cluster show the effectiveness of EAFR and the proposed strategies in reducing file read latency, replication time, and power consumption in large clusters. | 2017 |
| 6 | Energy-Aware Scheduling of Embarrassingly Parallel Jobs and Resource Allocation in Cloud | In cloud computing, with full control of the underlying infrastructures, cloud providers can flexibly place user jobs on suitable physical servers and dynamically allocate computing resources to user jobs in the form of virtual machines. As a cloud provider, scheduling user jobs in a way that minimizes their completion time is important, as this can increase the utilization, productivity, or profit of a cloud. In this paper, we focus on the problem of scheduling embarrassingly parallel jobs composed of a set of independent tasks and consider energy consumption during scheduling. Our goal is to determine a task placement plan and a resource allocation plan for such jobs in a way that minimizes the Job Completion Time (JCT). We begin by proposing an analytical solution to the problem of optimal resource allocation with pre-determined task placement. We then formulate the problem of scheduling a single job as a Non-linear Mixed Integer Programming problem and present a relaxation with an equivalent Linear Programming problem. We further propose an algorithm named TaPRA and its simplified version TaPRA-fast that solve the single job scheduling problem. Lastly, to address multiple jobs in online scheduling, we propose an online scheduler named OnTaPRA. By comparing with the state-of-the-art algorithms and schedulers via simulations, we demonstrate that TaPRA and TaPRA-fast reduce the JCT by 40-430 percent and the OnTaPRA scheduler reduces the average JCT by 60-280 percent. In addition, TaPRA-fast can be 10 times faster than TaPRA with around 5 percent performance degradation compared to TaPRA, which makes the use of TaPRA-fast very appropriate in practice. | 2017 |