TECHNOLOGY: JAVA
DOMAIN: IMAGE PROCESSING
S. No. | IEEE TITLE | ABSTRACT | IEEE YEAR |
1. | Large Discriminative Structured Set Prediction Modeling With Max-Margin Markov Network for Lossless Image Coding | Abstract—The inherent statistical correlations used for context-based prediction and the structural interdependencies underlying local coherence are not fully exploited in existing lossless image coding schemes. This paper proposes a novel prediction model in which the optimal correlated prediction for a set of pixels is obtained in the sense of least code length. It not only exploits spatial statistical correlations for optimal prediction directly based on 2D contexts, but also formulates data-driven structural interdependencies to make the prediction error coherent with the underlying probability distribution used for coding. Under joint constraints for local coherence, max-margin Markov networks are incorporated to combine support vector machines structurally and produce a max-margin estimate for a correlated region. Specifically, the model produces multiple predictions within blocks, with parameters learned so that the margin between the actual pixel and all possible estimates is maximized. It is proved that, as the sample size grows, the prediction error is asymptotically upper bounded by the training error under a decomposable loss function. Incorporated into a lossless image coding framework, the proposed model outperforms most reported prediction schemes. | 2014 |
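A minimal Java sketch of the context-based prediction step this entry builds on, using the classic MED (median edge detector) rule as a simple stand-in; the paper replaces such a fixed rule with a learned structured max-margin predictor over blocks, which is not reproduced here.

```java
/**
 * Sketch of context-based pixel prediction for lossless coding.
 * MED is a fixed stand-in for the paper's learned M3N predictor.
 */
public final class MedPredictor {
    /** Predict pixel (r, c) from its west, north, and north-west neighbors. */
    static int predict(int[][] img, int r, int c) {
        int a = img[r][c - 1];      // west
        int b = img[r - 1][c];      // north
        int d = img[r - 1][c - 1];  // north-west
        if (d >= Math.max(a, b)) return Math.min(a, b); // vertical/horizontal edge
        if (d <= Math.min(a, b)) return Math.max(a, b);
        return a + b - d;            // smooth-region plane fit
    }

    /** Prediction residuals (to be entropy coded) for the image interior. */
    static int[][] residuals(int[][] img) {
        int[][] e = new int[img.length][img[0].length];
        for (int r = 1; r < img.length; r++)
            for (int c = 1; c < img[0].length; c++)
                e[r][c] = img[r][c] - predict(img, r, c);
        return e;
    }
}
```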
2. | Multi-Illuminant Estimation With Conditional Random Fields | Abstract—Most existing color constancy algorithms assume uniform illumination. In real-world scenes, however, this is often not the case. We therefore propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. To evaluate the proposed method quantitatively, we created a novel dataset of two-dominant-illuminant images comprising laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground-truth illuminant information. The performance of our method is evaluated on multiple datasets. Experimental results show that our framework clearly outperforms single-illuminant estimators as well as a recently proposed multi-illuminant estimation approach. | 2014 |
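A hedged sketch of the kind of local evidence such a framework consumes: per-patch gray-world illuminant estimates, which the paper's conditional random field then smooths via energy minimization (the CRF itself is not shown; gray-world is one common choice of local estimator, not necessarily the paper's).

```java
/** Per-patch gray-world illuminant estimate: the unary input to a CRF. */
public final class LocalIlluminant {
    /** Mean R, G, B of a size x size patch, normalized to unit length. */
    static double[] grayWorld(int[][][] rgb, int r0, int c0, int size) {
        double[] sum = new double[3];
        for (int r = r0; r < r0 + size; r++)
            for (int c = c0; c < c0 + size; c++)
                for (int ch = 0; ch < 3; ch++)
                    sum[ch] += rgb[r][c][ch];
        double norm = Math.sqrt(sum[0]*sum[0] + sum[1]*sum[1] + sum[2]*sum[2]);
        for (int ch = 0; ch < 3; ch++) sum[ch] /= norm;
        return sum; // unit-length illuminant color estimate for this patch
    }
}
```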
3. | Saliency-Aware Video Compression | Abstract—In region-of-interest (ROI)-based video coding, ROI parts of the frame are encoded with higher quality than non-ROI parts. At low bit rates, such encoding may produce attention-grabbing coding artifacts that draw the viewer's attention away from the ROI, thereby degrading visual quality. In this paper, we present a saliency-aware video compression method for ROI-based video coding. The proposed method aims to reduce salient coding artifacts in non-ROI parts of the frame in order to keep the viewer's attention on the ROI. Further, the method allows saliency to increase in high-quality parts of the frame and to decrease in non-ROI parts. Experimental results indicate that the proposed method improves the visual quality of encoded video relative to conventional rate-distortion-optimized video coding, as well as two state-of-the-art perceptual video coding methods. | 2014 |
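An illustrative sketch, not the paper's exact formulation: one way to make rate-distortion optimization saliency-aware is to add a penalty when coding raises the saliency of a non-ROI block. The cost form and both parameters here are hypothetical.

```java
/**
 * Hypothetical saliency-augmented RD cost: J = D + lambda*R + w*max(0, dS)
 * for non-ROI blocks, so attention-grabbing artifacts are penalized.
 */
public final class SaliencyRdCost {
    static double cost(double distortion, double rateBits,
                       double saliencyEncoded, double saliencyOriginal,
                       boolean isRoi, double lambda, double saliencyWeight) {
        // Only non-ROI blocks pay for saliency increases caused by coding.
        double saliencyPenalty = isRoi ? 0.0
                : Math.max(0.0, saliencyEncoded - saliencyOriginal);
        return distortion + lambda * rateBits + saliencyWeight * saliencyPenalty;
    }
}
```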
4. | Translation Invariant Directional Framelet Transform Combined With Gabor Filters for Image Denoising | Abstract—This paper is devoted to the study of a directional lifting transform for wavelet frames. A non-subsampled lifting structure is developed to maintain translation invariance, an important property in image denoising. The directionality of the lifting-based tight frame is then explicitly discussed, leading to a specific translation-invariant directional framelet transform (TIDFT). The TIDFT has two framelets, ψ1 and ψ2, with vanishing moments of order two and one respectively, which are able to detect singularities in a given direction set. It provides an efficient and sparse representation for images containing rich textures, along with fast implementation and perfect reconstruction. In addition, an adaptive block-wise orientation estimation method based on Gabor filters is presented in place of the conventional minimization of residuals. Furthermore, the TIDFT is applied to image denoising, incorporating the MAP estimator for a multivariate exponential distribution. Consequently, the TIDFT is able to eliminate noise effectively while preserving textures. Experimental results show that the TIDFT outperforms other frame-based denoising methods, such as the contourlet and shearlet transforms, and is competitive with state-of-the-art denoising approaches. | 2014 |
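A minimal 1-D sketch of what "non-subsampled lifting" means: the LeGall 5/3 predict/update pair applied at every sample instead of on even/odd polyphase components, so the analysis commutes with shifts. The paper builds directional 2-D framelets on this kind of structure; this toy only illustrates why dropping the subsampling preserves translation invariance.

```java
/** Undecimated (translation-invariant) 5/3 lifting analysis, 1-D toy. */
public final class UndecimatedLifting {
    static double[][] analyze(double[] x) {
        int n = x.length;
        double[] detail = new double[n], smooth = new double[n];
        for (int i = 0; i < n; i++) {                     // predict step
            double l = x[Math.max(i - 1, 0)], r = x[Math.min(i + 1, n - 1)];
            detail[i] = x[i] - 0.5 * (l + r);             // high-pass at every i
        }
        for (int i = 0; i < n; i++) {                     // update step
            double l = detail[Math.max(i - 1, 0)], r = detail[Math.min(i + 1, n - 1)];
            smooth[i] = x[i] + 0.25 * (l + r);            // low-pass at every i
        }
        return new double[][] { smooth, detail };         // no subsampling anywhere
    }
}
```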
5. | Vector-Valued Image Processing by Parallel Level Sets | Abstract—Vector-valued images such as RGB color images or multimodal medical images show a strong inter-channel correlation that is not exploited by most image processing tools. We propose a new way of treating vector-valued images based on the angle between the spatial gradients of their channels. By minimizing a cost functional that penalizes large angles, images with parallel level sets can be obtained. After formally introducing this idea and the corresponding cost functionals, we discuss their Gâteaux derivatives, which lead to a diffusion-like gradient descent scheme. We illustrate the properties of this cost functional with several examples in denoising and demosaicking of RGB color images. They show that parallel level sets are a suitable concept for color image enhancement. Demosaicking with parallel level sets gives visually perfect results at low noise levels. Furthermore, the proposed functional yields sharper images than the approaches it is compared against. | 2014 |
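A sketch of one simple member of this family of functionals: the pointwise penalty |∇u|²|∇v|² − (∇u·∇v)², i.e. the squared cross product of the two channel gradients, which vanishes exactly when the gradients (and hence the level sets) are parallel. The paper studies several such functionals; this discretization with forward differences is illustrative.

```java
/** Pointwise parallel-level-sets penalty between two image channels. */
public final class ParallelLevelSets {
    static double penalty(double[][] u, double[][] v) {
        double total = 0.0;
        for (int r = 0; r < u.length - 1; r++)
            for (int c = 0; c < u[0].length - 1; c++) {
                double ux = u[r][c + 1] - u[r][c], uy = u[r + 1][c] - u[r][c];
                double vx = v[r][c + 1] - v[r][c], vy = v[r + 1][c] - v[r][c];
                double dot = ux * vx + uy * vy;
                // |grad u|^2 |grad v|^2 - (grad u . grad v)^2 = (cross product)^2
                total += (ux * ux + uy * uy) * (vx * vx + vy * vy) - dot * dot;
            }
        return total; // small when the level sets of u and v are aligned
    }
}
```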
6. | Reversible Data Hiding in Encrypted Images by Reserving Room Before Encryption | Abstract—Recently, increasing attention has been paid to reversible data hiding (RDH) in encrypted images, since it maintains the excellent property that the original cover can be losslessly recovered after the embedded data is extracted, while protecting the confidentiality of the image content. All previous methods embed data by reversibly vacating room from the encrypted image, which may introduce errors in data extraction and/or image restoration. In this paper, we propose a novel method that reserves room before encryption using a traditional RDH algorithm, making it easy for the data hider to reversibly embed data in the encrypted image. The proposed method achieves real reversibility, that is, data extraction and image recovery are free of any error. Experiments show that this method can embed payloads more than 10 times larger than previous methods for the same image quality, e.g., at a PSNR of 40 dB. | 2013 |
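A toy sketch of the "reserved room" idea: once an LSB plane of a chosen region has been vacated before encryption (its original bits stored elsewhere by a standard RDH algorithm), the data hider can embed by plain LSB substitution in the encrypted image, and extraction is error-free. Names and the flat pixel layout are illustrative, not the paper's API.

```java
/** LSB substitution in a pre-reserved region of an encrypted image. */
public final class ReservedRoomEmbed {
    /** Embed one payload bit per pixel into the LSBs of the reserved region. */
    static void embed(int[] reservedPixels, boolean[] payload) {
        for (int i = 0; i < payload.length; i++)
            reservedPixels[i] = (reservedPixels[i] & ~1) | (payload[i] ? 1 : 0);
    }

    /** Extract the payload back from the same region, with no possible error. */
    static boolean[] extract(int[] reservedPixels, int length) {
        boolean[] payload = new boolean[length];
        for (int i = 0; i < length; i++)
            payload[i] = (reservedPixels[i] & 1) == 1;
        return payload;
    }
}
```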
7. | An Inpainting-Assisted Reversible Steganographic Scheme Using a Histogram Shifting Mechanism | Abstract—In this paper, we propose a novel prediction-based reversible steganographic scheme based on image inpainting. First, reference pixels are chosen adaptively according to the distribution characteristics of the image content. Then, an image inpainting technique based on partial differential equations is used to generate a prediction image with structural and geometric information similar to that of the cover image. Finally, using two selected groups of peak points and zero points, the histogram of the prediction error is shifted to embed the secret bits reversibly. Since the same reference pixels can be exploited in the extraction procedure, the embedded secret bits can be extracted from the stego image correctly, and the cover image can be restored losslessly. Through the adaptive strategy for choosing reference pixels and the inpainting predictor, the prediction accuracy is high and more embeddable pixels are acquired. Thus, the proposed scheme provides a higher embedding rate and better visual quality than recently reported methods. | 2013 |
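A minimal sketch of histogram-shifting embedding on prediction errors, the reversible mechanism this scheme applies to its inpainting-based prediction. One (peak, zero) pair is shown for the case peak < zero; the paper uses two pairs, and the mirrored case is symmetric.

```java
/** Histogram shifting on a prediction-error sequence. */
public final class HistogramShift {
    /**
     * Errors strictly between peak and zero shift up by one to free the bin
     * peak+1; an error equal to peak then carries bit 0 (stays at peak) or
     * bit 1 (moves to peak+1). The zero bin is empty by construction, so the
     * shift loses nothing and the process is invertible.
     */
    static int[] embed(int[] errors, int peak, int zero, boolean[] bits) {
        int[] out = errors.clone();
        int k = 0;
        for (int i = 0; i < out.length; i++) {
            if (out[i] > peak && out[i] < zero) out[i]++;      // shift
            else if (out[i] == peak && k < bits.length)
                out[i] += bits[k++] ? 1 : 0;                   // embed one bit
        }
        return out;
    }
}
```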
8. | Query-Adaptive Image Search With Hash Codes | Abstract—Scalable image search based on visual similarity has been an active topic of research in recent years. State-of-the-art solutions often use hashing methods to embed high-dimensional image features into Hamming space, where search can be performed in real time based on the Hamming distance of compact hash codes. Unlike traditional metrics (e.g., Euclidean) that offer continuous distances, Hamming distances are discrete integer values. As a consequence, a large number of images often share equal Hamming distances to a query, which significantly hurts search results where fine-grained ranking is important. This paper introduces an approach that enables query-adaptive ranking of the returned images with equal Hamming distances to the query. This is achieved by first learning, offline, bitwise weights of the hash codes for a diverse set of predefined semantic concept classes. We formulate the weight learning process as a quadratic programming problem that minimizes intra-class distance while preserving the inter-class relationships captured by the original raw image features. Query-adaptive weights are then computed online by evaluating the proximity between a query and the semantic concept classes. With the query-adaptive bitwise weights, returned images can be ordered by weighted Hamming distance at a finer-grained hash-code level rather than at the original Hamming distance level. Experiments on a Flickr image dataset show clear improvements from our proposed approach. | 2013 |
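A sketch of the ranking primitive this approach relies on: a weighted Hamming distance over hash codes, where each differing bit position contributes its (query-adaptive) weight instead of a flat 1. The 64-bit code length is an assumption for the sketch; the learned weights come from the paper's quadratic program, not shown here.

```java
/** Weighted Hamming distance between two 64-bit hash codes. */
public final class WeightedHamming {
    static double distance(long a, long b, double[] bitWeights) {
        long diff = a ^ b;                 // positions where the codes differ
        double d = 0.0;
        while (diff != 0) {
            int bit = Long.numberOfTrailingZeros(diff);
            d += bitWeights[bit];          // add this bit's learned weight
            diff &= diff - 1;              // clear the lowest set bit
        }
        return d;                          // ties at equal Hamming distance now break apart
    }
}
```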
9. | Robust Document Image Binarization Technique for Degraded Document Images | Abstract—Segmentation of text from badly degraded document images is a very challenging task due to the high inter/intra-variation between the document background and the foreground text across different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to the text and background variation caused by different types of document degradation. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated from the intensities of detected text stroke edge pixels within a local window. The proposed method is simple, robust, and involves minimal parameter tuning. It has been tested on three public datasets used in the recent Document Image Binarization Contests (DIBCO 2009 and 2011) and the Handwritten DIBCO 2010 contest, achieving accuracies of 93.5%, 87.8%, and 92.03%, respectively, which are significantly higher than, or close to, those of the best-performing methods reported in the three contests. Experiments on the Bickley diary dataset, which consists of several challenging, badly degraded document images, also show the superior performance of our proposed method compared with other techniques. | 2013 |
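A sketch of the adaptive contrast map described above: a per-pixel combination of the normalized local contrast (max−min)/(max+min) and the local gradient (max−min) over a small window. In the paper, the balancing weight is derived from image statistics; here alpha is left as a plain parameter.

```java
/** Adaptive contrast map: alpha * local contrast + (1 - alpha) * local gradient. */
public final class AdaptiveContrast {
    static double[][] map(double[][] img, int win, double alpha) {
        int h = img.length, w = img[0].length, r = win / 2;
        double[][] out = new double[h][w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++) {
                double min = Double.MAX_VALUE, max = -Double.MAX_VALUE;
                for (int dy = -r; dy <= r; dy++)        // scan the local window,
                    for (int dx = -r; dx <= r; dx++) {  // clamping at the borders
                        int yy = Math.min(Math.max(y + dy, 0), h - 1);
                        int xx = Math.min(Math.max(x + dx, 0), w - 1);
                        min = Math.min(min, img[yy][xx]);
                        max = Math.max(max, img[yy][xx]);
                    }
                double localContrast = (max - min) / (max + min + 1e-9);
                double localGradient = max - min;
                out[y][x] = alpha * localContrast + (1 - alpha) * localGradient;
            }
        return out; // high values cluster around text stroke edges
    }
}
```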
10. | Active Contour-Based Visual Tracking by Integrating Colors, Shapes, and Motions | Abstract—In this paper, we present a framework for active contour-based visual tracking using level sets. The main components of our framework include contour-based tracking initialization, color-based contour evolution, adaptive shape-based contour evolution for non-periodic motions, dynamic shape-based contour evolution for periodic motions, and the handling of abrupt motions. For the initialization of contour-based tracking, we develop an optical-flow-based algorithm that automatically initializes contours at the first frame. For color-based contour evolution, Markov random field theory is used to measure correlations between values of neighboring pixels for posterior probability estimation. For adaptive shape-based contour evolution, global shape information and local color information are combined to hierarchically evolve the contour, and a flexible shape-updating model is constructed. For dynamic shape-based contour evolution, a shape-mode transition matrix is learned to characterize the temporal correlations of object shapes. For the handling of abrupt motions, particle swarm optimization is adopted to capture the global motion, which is applied to the contour in the current frame to produce an initial contour in the next frame. | 2013 |
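An illustrative sketch of the particle swarm optimization (PSO) update adopted above for abrupt motions: particles are candidate 2-D contour translations scored by some fitness (e.g., a color likelihood), each drawn toward its personal best and the swarm's global best. The constants are common PSO defaults, not values from the paper.

```java
/** One generic PSO velocity/position update over candidate motions. */
public final class PsoMotion {
    static void step(double[][] pos, double[][] vel,
                     double[][] personalBest, double[] globalBest,
                     java.util.Random rng) {
        final double w = 0.7, c1 = 1.5, c2 = 1.5; // inertia, cognitive, social
        for (int i = 0; i < pos.length; i++)
            for (int d = 0; d < pos[i].length; d++) {
                vel[i][d] = w * vel[i][d]
                        + c1 * rng.nextDouble() * (personalBest[i][d] - pos[i][d])
                        + c2 * rng.nextDouble() * (globalBest[d] - pos[i][d]);
                pos[i][d] += vel[i][d]; // move the candidate translation
            }
    }
}
```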
11. | A Novel Reversible Data Hiding Scheme Based on Two-Dimensional Difference-Histogram Modification | Abstract—In this paper, a novel reversible data hiding (RDH) scheme based on two-dimensional difference-histogram modification is proposed, using difference-pair-mapping (DPM). First, by considering each pixel pair and its context, a sequence of difference-value pairs is computed. Then, a two-dimensional difference-histogram is generated by counting the frequencies of the resulting difference-pairs. Finally, reversible data embedding is implemented according to a specifically designed DPM. Here, the DPM is an injective mapping defined on difference-pairs; it is a natural extension of the expansion-embedding and shifting techniques used in current histogram-based RDH methods. Compared with conventional one-dimensional difference-histogram and one-dimensional prediction-error-histogram-based RDH methods, the proposed approach exploits image redundancy better and achieves improved embedding performance. Moreover, a pixel-pair-selection strategy is adopted to preferentially use pixel pairs located in smooth image regions for embedding, which further enhances the embedding performance. Experimental results demonstrate that the proposed scheme outperforms several state-of-the-art RDH schemes. | 2013 |
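A sketch of the first stage only: turning consecutive pixel pairs into difference-pairs whose 2-D histogram drives the difference-pair mapping. Using the next pixel as the context value z is a simplification of the paper's context; the DPM itself (which pairs expand to carry a bit and which merely shift) is defined in the paper and omitted here.

```java
/** Compute the difference-pair sequence for a 2-D difference-histogram. */
public final class DifferencePairs {
    /** For each non-overlapping pixel pair (x, y) with context pixel z,
     *  record the pair (x - y, y - z). */
    static int[][] compute(int[] pixels) {
        int n = pixels.length / 2 - 1;
        int[][] pairs = new int[n][2];
        for (int i = 0; i < n; i++) {
            int x = pixels[2 * i], y = pixels[2 * i + 1], z = pixels[2 * i + 2];
            pairs[i][0] = x - y;   // first difference of the pair
            pairs[i][1] = y - z;   // difference against the context
        }
        return pairs;              // histogram these to build the 2-D histogram
    }
}
```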
12. | Efficient Techniques for Depth Video Compression Using Weighted Mode Filtering | Abstract—This paper proposes efficient techniques for compressing a depth video that take into account coding artifacts, spatial resolution, and the dynamic range of the depth data. Due to abrupt signal changes at object boundaries, a depth video compressed by conventional video coding standards often exhibits serious coding artifacts over object boundaries, which severely affect the quality of a synthesized view. We suppress these coding artifacts with an efficient post-processing method based on weighted mode filtering, which is also utilized as an in-loop filter. In addition, the proposed filter is tailored to efficiently reconstruct the depth video from reduced spatial resolution and reduced dynamic range. Down-/up-sampling coding approaches for the spatial resolution and the dynamic range are used together with the proposed filter to further reduce the bit rate. We verify the proposed techniques by applying them to the compression of multiview-plus-depth data, which has emerged as an efficient data representation for 3-D video. Experimental results show that the proposed techniques significantly reduce the bit rate while achieving better quality of the synthesized view in terms of both objective and subjective measures. | 2013 |
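A minimal sketch of a weighted mode filter on an 8-bit depth map: each neighbor votes for its depth value with a bilateral-style weight (spatial distance times guidance-intensity similarity), and the output is the depth bin with the largest accumulated weight. Unlike a weighted average, the mode never invents intermediate depths across a boundary. The Gaussian widths are illustrative, not the paper's settings.

```java
/** Weighted mode filtering of one depth pixel, guided by a gray image. */
public final class WeightedModeFilter {
    static int filter(int[][] depth, int[][] gray, int y, int x, int radius) {
        final double sigmaS = 3.0, sigmaR = 10.0;    // illustrative widths
        double[] hist = new double[256];             // weighted depth histogram
        for (int dy = -radius; dy <= radius; dy++)
            for (int dx = -radius; dx <= radius; dx++) {
                int yy = y + dy, xx = x + dx;
                if (yy < 0 || yy >= depth.length || xx < 0 || xx >= depth[0].length)
                    continue;
                double ws = Math.exp(-(dy * dy + dx * dx) / (2 * sigmaS * sigmaS));
                double dr = gray[yy][xx] - gray[y][x];
                double wr = Math.exp(-(dr * dr) / (2 * sigmaR * sigmaR));
                hist[depth[yy][xx]] += ws * wr;      // weighted vote for this depth
            }
        int mode = 0;                                // pick the strongest bin
        for (int v = 1; v < 256; v++) if (hist[v] > hist[mode]) mode = v;
        return mode;
    }
}
```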
13. | Securing Multimedia Content Using Joint Compression and Encryption | Abstract—Algorithmic parameterization and hardware architectures can ensure secure transmission of multimedia data in resource-constrained environments such as wireless video surveillance networks, tele-medicine frameworks for distant health care in rural areas, and Internet video streaming. Joint multimedia compression and encryption techniques can significantly reduce the computational requirements of video processing systems. We present an approach that reduces the computational cost of multimedia encryption while preserving the properties of compressed video (useful for scalability, transcoding, and retrieval) that naive encryption would destroy. The hardware-amenable design of the proposed algorithms makes them suitable for real-time embedded multimedia systems. This approach alleviates the need for additional encryption hardware in resource-constrained scenarios and can otherwise be used to augment existing encryption methods for content delivery on the Internet and other applications. In this work, we show how two compression blocks for video coding, a modified frequency transform (the Secure Wavelet Transform, SWT) and a modified entropy coding scheme (Chaotic Arithmetic Coding, CAC), can be used for video encryption. Experimental results are shown for selective encryption using the proposed schemes. | 2013 |
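An illustrative sketch only: a logistic-map keystream of the kind used in chaos-based multimedia encryption. The paper's Chaotic Arithmetic Coding embeds the chaotic behavior inside the entropy coder itself rather than generating a separate keystream; this stand-alone generator just shows the chaotic primitive.

```java
/** Logistic-map keystream: x(n+1) = r * x(n) * (1 - x(n)), r = 4 (chaotic regime). */
public final class LogisticKeystream {
    static byte[] keystream(double seed, int length) {
        byte[] out = new byte[length];
        double x = seed;                       // secret key: 0 < seed < 1
        for (int i = 0; i < length; i++) {
            x = 4.0 * x * (1.0 - x);           // one chaotic iteration
            out[i] = (byte) (x * 256);         // quantize the state to one byte
        }
        return out;                            // e.g., XOR against selected syntax elements
    }
}
```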