Stereo image compression using wavelet coefficients morphology

https://doi.org/10.1016/j.imavis.2003.09.017

Abstract

In this paper, we propose a new stereo image compression scheme based on the wavelet transform of both images and on disparity estimation between the subbands of the stereo pair. The two images are decomposed using a Discrete Wavelet Transform (DWT) and coded using the morphological representation of the wavelet coefficients, a technique that exploits their intraband–interband statistical properties. The progressive pixel-to-pixel evaluation of the disparity is incorporated into the morphological coder, so that a dense disparity field is formed for every subband. The proposed method achieves very good performance in terms of PSNR and visual quality, at low complexity.

Introduction

A stereo pair consists of two images of the same scene recorded from two slightly different perspectives. The two images are distinguished as the Left and the Right image, and the depth information of the captured scene can be evaluated from the data of this pair. Moreover, a viewer perceives a 3D image of the scene when the left eye sees the Left image and, at the same time, the right eye sees the Right image. Therefore, stereo image pairs provide a two-dimensional means of representing 3D scenes.

Stereoscopic vision has a very wide field of applications in robot vision, virtual machines, medical surgery, etc. These stereo imaging applications require efficient compression techniques for fast transmission and small storage requirements. The way the stereo pair is constructed implies inherent redundant information in the two images. Consequently, a stereo pair can be compressed more efficiently than its two images compressed independently, provided this redundancy is exploited. A commonly used coding strategy in stereo pair compression is first to encode the Left image, which is called the reference image, independently, by taking into account its intra-spatial redundancy. Then the Right image, which is called the target image, is encoded by taking into account both its intra-spatial redundancy and the cross-image redundancy of the pair. Transform coding is a method used to remove intra-spatial redundancy from both the reference and the target images. The cross-image redundant information is evaluated by considering the disparity between the two images. The disparity estimation involves the disparity compensated prediction of the target image and produces the disparity compensated difference (DCD), or residual target image, together with the disparity vectors (DVs) [1]. Since the Left and Right images are usually taken by two cameras with parallel optical axes and coplanar image planes, the DVs can be represented by scalars. The disparity compensated prediction is applied either in the spatial or in the transform domain of the images. If the transform domain consists of the multiresolution subbands of a wavelet transform, this approach is referred to as wavelet-based disparity analysis.

Several methods have been developed for disparity estimation. The area-based methods, which include pixel, line or area matching, are simple approaches to disparity estimation [2], [3]. The pixel-based disparity estimation method finds the distance between the pixels of the two images that have similar intensity. The advantages of this method are the small disparity distances in homogeneous areas and its simple form. The block-based matching method, with either fixed or variable block size (FSBM or VSBM), finds the distance between two blocks that have similar intensities. A simple approach shifts a predefined block of one image horizontally and compares it with the corresponding equal-sized block of the other image. This is repeated within a predefined window length and the DV is estimated from the best match [4]. The block-matching algorithm may also be applied to the objects that appear after object contour extraction in the two images [5], or to the subbands of a wavelet decomposed stereo image pair in a hierarchical way [6]. The most commonly used similarity measures are the mean-square error (MSE) and the mean absolute error (MAE), the latter being generally preferred due to its simplicity.
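The fixed-size block matching described above can be illustrated with a minimal sketch. It is not taken from the cited methods: the function names, the rightward search direction, the toy images and the `max_shift` window length are all assumptions chosen for illustration, and MAE is used as the similarity measure.

```python
def mae(block_a, block_b):
    """Mean absolute error between two equal-sized blocks (lists of rows)."""
    total = 0
    count = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def get_block(image, top, left, size):
    """Cut a size x size block out of an image stored as a list of rows."""
    return [row[left:left + size] for row in image[top:top + size]]


def block_disparity(reference, target, top, left, size, max_shift):
    """Return (disparity, error) for the target block at (top, left).

    Assumption: the matching reference block lies 0..max_shift pixels to
    the right of the target block; shifts leaving the image are skipped.
    """
    width = len(reference[0])
    target_block = get_block(target, top, left, size)
    best_shift, best_error = 0, float("inf")
    for shift in range(max_shift + 1):
        if left + shift + size > width:
            break
        candidate = get_block(reference, top, left + shift, size)
        error = mae(target_block, candidate)
        if error < best_error:
            best_shift, best_error = shift, error
    return best_shift, best_error


# Toy pair: the target rows are the reference rows shifted left by 2 pixels.
reference = [[0, 0, 10, 20, 30, 40, 0, 0] for _ in range(4)]
target = [[10, 20, 30, 40, 0, 0, 0, 0] for _ in range(4)]
shift, error = block_disparity(reference, target, top=0, left=0,
                               size=4, max_shift=3)
```

Because the cameras are assumed to have parallel optical axes, the search runs along one row direction only and the resulting DV is a scalar.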

Some methods code the residual part of the predicted target image by using efficient coders for ‘still’ images, such as the embedded zerotree wavelet (EZW) coder, or mixed coding [7], [8], [9], [10], [11]. Another method predicts the transform of the blocks of one image from the transform of the matching blocks of the other [12]. A further method is the subspace projection technique, which combines disparity compensation and residual coding by applying a transform to each block of the target image [13].

The aim of the present study is to employ a robust ‘still’ image encoder and to embed the disparity estimation process into it. The proposed coder decomposes both images of the stereo pair by a Discrete Wavelet Transform (DWT) and then employs the Morphological Representation of Wavelet Data (MRWD) encoder. First, the morphological coder encodes the reference image. Then, the morphological coder starts encoding the target image and, during this process, produces the residual target image, which is the difference between the subband coefficients of the original target image and the ones predicted after a pixel-to-pixel matching with the reference image. The term pixel-to-pixel matching, which is used throughout this work, refers to the subband coefficients of the wavelet transformed image. Alternatively, the coder completes the formation of the clusters in the subbands, which are groups of significant coefficients, and then produces the residual target image after a cluster-to-cluster matching. Finally, the reference image, the residual target image and the DVs are entropy coded and transmitted.

The encoding algorithm exploits the statistical properties of the wavelet coefficients to form clusters and uses a morphological operator to efficiently encode the significant ones together with their positions [14]. The coarsest detail subbands are partitioned into significant and insignificant coefficients, and the corresponding coefficients of the finer subbands are predicted by morphology, so that the entropy of the resulting partitions is lower than if the subbands were encoded as a whole. This algorithm provides a simple and efficient way of coding a ‘still’ image and is applied to the constituent images of a stereo pair, such as that of Fig. 1.
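The partitioning of a subband into clusters of significant coefficients can be sketched as follows. This is only an illustration under stated assumptions: a single magnitude threshold decides significance, clusters are grown by conditional dilation with a 3×3 structuring element, and all names and the toy subband are hypothetical. The interband prediction of finer subbands and the entropy coding of the actual coder are not reproduced.

```python
def significance_map(subband, threshold):
    """Mark coefficients whose magnitude reaches the threshold."""
    return [[abs(c) >= threshold for c in row] for row in subband]


def grow_cluster(significant, seed):
    """Grow one cluster from a seed by repeated conditional dilation.

    Each pass dilates the newest cluster members with a 3x3 structuring
    element and keeps only the reached positions that are significant.
    """
    rows, cols = len(significant), len(significant[0])
    cluster = {seed}
    frontier = {seed}
    while frontier:
        reached = set()
        for r, c in frontier:
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        if significant[nr][nc] and (nr, nc) not in cluster:
                            reached.add((nr, nc))
        cluster |= reached
        frontier = reached
    return cluster


def find_clusters(subband, threshold):
    """Partition a subband into clusters of significant coefficients."""
    significant = significance_map(subband, threshold)
    seen = set()
    clusters = []
    for r, row in enumerate(significant):
        for c, flag in enumerate(row):
            if flag and (r, c) not in seen:
                cluster = grow_cluster(significant, (r, c))
                seen |= cluster
                clusters.append(sorted(cluster))
    return clusters


# Hypothetical 4x4 detail subband with two groups of large coefficients.
subband = [
    [9, -8, 0, 0],
    [7, 1, 0, 0],
    [0, 0, 0, -6],
    [0, 0, 1, 12],
]
clusters = find_clusters(subband, threshold=5)
```

Coding the significant coefficients cluster by cluster means that only a seed position per cluster needs to be signalled explicitly in this sketch; the remaining positions follow from the dilation.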

The disparity field in this paper, which consists of the DV and the DCD, is estimated simultaneously with the morphological operations on the subbands in order to form a single framework. This is very important since the disparity compensated subband coefficients and the DVs of the residual target image form partitions, which lower their entropy. The disparity estimation has been implemented with pixel-to-pixel and cluster-to-cluster matching. The pixel-to-pixel matching allows the estimation of a dense disparity field when a better quality image is required, whereas the cluster-to-cluster matching allows the estimation of a sparse disparity field when low complexity is required.
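A dense, per-coefficient disparity field of the kind described above can be sketched in a few lines. The sketch assumes a horizontal search of at most `max_shift` positions to the right, a minimum absolute difference criterion, and toy subbands; these choices and all names are illustrative assumptions, and the progressive evaluation inside the morphological coder is not reproduced.

```python
def pixel_disparity(ref_band, tgt_band, max_shift):
    """Dense per-coefficient matching along rows.

    Returns (dv, dcd): dv[r][c] is the horizontal shift of the best
    matching reference coefficient, dcd[r][c] the remaining difference.
    """
    dv, dcd = [], []
    for ref_row, tgt_row in zip(ref_band, tgt_band):
        dv_row, dcd_row = [], []
        for c, value in enumerate(tgt_row):
            # Candidate columns are clipped at the right subband border.
            last = min(c + max_shift, len(ref_row) - 1)
            best = min(range(c, last + 1),
                       key=lambda k: abs(value - ref_row[k]))
            dv_row.append(best - c)
            dcd_row.append(value - ref_row[best])
        dv.append(dv_row)
        dcd.append(dcd_row)
    return dv, dcd


def reconstruct(ref_band, dv, dcd):
    """Invert the prediction: target = shifted reference + residual."""
    return [
        [ref_row[c + d] + e for c, (d, e) in enumerate(zip(dv_row, dcd_row))]
        for ref_row, dv_row, dcd_row in zip(ref_band, dv, dcd)
    ]


# Hypothetical subband rows; most target coefficients are shifted copies.
ref_band = [[0, 0, 5, -7, 3, 0], [1, 4, 0, -2, 6, 0]]
tgt_band = [[5, -7, 3, 0, 0, 1], [0, -2, 6, 0, 0, 0]]
dv, dcd = pixel_disparity(ref_band, tgt_band, max_shift=2)
restored = reconstruct(ref_band, dv, dcd)
```

The decoder needs only the reference subband, the DVs and the DCD to recover the target subband, which is why these three items are the ones that are entropy coded.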

The outstanding features of this proposed novel method are the inherent advantages of the wavelet transform, the morphological compression algorithm and the incorporation of the disparity estimation process into it. The main assets of the wavelet transform are the creation of almost uncorrelated coefficients, energy compaction and variable resolution. The morphological coder creates partitions between significant and insignificant coefficients that reduce the entropy. The proposed disparity estimation method builds dense disparity fields, avoids the blocking artifacts of the block matching methods and is of low complexity.
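The energy compaction property mentioned above can be checked numerically with a small sketch. It uses one level of the orthonormal 2-D Haar transform purely because it is the simplest wavelet; this filter choice, the function names and the toy image are assumptions for illustration, not details taken from the paper.

```python
def haar2d_level(image):
    """One level of the orthonormal 2-D Haar transform.

    Works on an even-sized image stored as a list of rows and returns
    four subbands (approx, row_detail, col_detail, diag_detail), each
    half the size of the input in both directions.
    """
    approx, row_detail, col_detail, diag_detail = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2)  # local average
            r_row.append((p + q - s - t) / 2)  # upper row minus lower row
            c_row.append((p - q + s - t) / 2)  # left column minus right column
            d_row.append((p - q - s + t) / 2)  # diagonal difference
        approx.append(a_row)
        row_detail.append(r_row)
        col_detail.append(c_row)
        diag_detail.append(d_row)
    return approx, row_detail, col_detail, diag_detail


def energy(band):
    """Sum of squared values of a band stored as a list of rows."""
    return sum(v * v for row in band for v in row)


# Hypothetical smooth 4x4 image (a gentle intensity ramp).
image = [
    [10, 12, 14, 16],
    [10, 12, 14, 16],
    [12, 14, 16, 18],
    [12, 14, 16, 18],
]
approx, row_detail, col_detail, diag_detail = haar2d_level(image)
total = energy(image)
approx_share = energy(approx) / total
```

For this smooth toy image, more than 99% of the energy ends up in the approximation subband, which holds a quarter of the coefficients; the detail subbands carry the small remainder.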

This paper is organized as follows: Section 2 provides an overview of the proposed coding algorithm and the disparity estimation. In Section 3 the proposed algorithm is explained and experimental results are presented in Section 4. Finally, conclusions are summarized in Section 5.

Section snippets

Coding based on morphological representation of the wavelet transform coefficients

The conventional wavelet image coders decompose a ‘still’ image into multiresolution bands [15], providing better compression quality than the previously used DCT. An alternative adaptive wavelet packet scheme can enhance the benefits of this transform [16]. These types of wavelet decomposition suffer from the fact that they include all the coefficients, which are spread within the subbands, without considering their relation. The statistical properties of the wavelet coefficients led

The proposed stereo image compression algorithm

In this paper, the DWT is employed in order to transform the image pair. Wavelets are tools for decomposing signals, such as images, into hierarchical variable resolution subbands and provide several advantages over the traditional DCT. Some of these advantages are: the ability to compact most of the signal energy into a few transform coefficients, which is called energy compaction, the ability to capture and represent effectively low frequency components (such as image backgrounds) as well as

Experimental results

The proposed stereo image pair compression method is tested on four different image pairs: ‘room’ (256×256), shown in Fig. 1, ‘fruit’ (256×256), ‘pepsi’ (400×512) and ‘pentagon’ (512×512), which have been downloaded from [22], [23]. The measures are the visual quality of the reconstructed images and the Peak Signal-to-Noise Ratio (PSNR), which is defined as:

$$\mathrm{PSNR} = 10\log_{10}\frac{255^{2}}{(\mathrm{MSE}_{l}+\mathrm{MSE}_{r})/2}$$

where $\mathrm{MSE}_{l}$ and $\mathrm{MSE}_{r}$ are the MSEs of the Left and Right images, respectively.
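The PSNR measure defined above, which averages the MSEs of the Left and Right images before taking the logarithm, can be computed with a short sketch. The function names and the tiny 2×2 test images are hypothetical; 8-bit images with a peak value of 255 are assumed, as in the definition.

```python
import math


def mse(original, decoded):
    """Mean-square error between two equal-sized images (lists of rows)."""
    total = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            total += (o - d) ** 2
            count += 1
    return total / count


def stereo_psnr(left, left_dec, right, right_dec, peak=255):
    """PSNR in dB of a stereo pair, using the mean of the two MSEs."""
    mean_mse = (mse(left, left_dec) + mse(right, right_dec)) / 2
    if mean_mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mean_mse)


# Hypothetical originals and decoded versions.
left = [[100, 110], [120, 130]]
left_dec = [[101, 110], [120, 128]]
right = [[50, 60], [70, 80]]
right_dec = [[50, 63], [70, 80]]
psnr = stereo_psnr(left, left_dec, right, right_dec)
```

Averaging the MSEs first, rather than averaging two separate PSNR values, means that the image with the larger error dominates the reported figure.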

The uniform

Conclusions

In this paper a stereo image compression algorithm is presented, based on the MRWD. This algorithm compresses the Left and Right images, which are considered as reference and target respectively, by using a DWT and the MRWD method of partitioning the wavelet coefficients. The disparity estimation is a pixel-to-pixel or cluster-to-cluster matching process and has been incorporated within the compression algorithm. This process involves the estimation of the DV and the DCD fields between the

References (24)

  • J.-L. Starck et al., Image Processing and Data Analysis: The Multiscale Approach (1998)
  • H. Yamaguchi et al., Stereoscopic images disparity for predictive coding, Proceedings of ICASSP (1989)
  • B. Lucas et al., An iterative image registration technique with an application to stereo vision, Proc. of the 7th Int. J. Conf on (1981)
  • W. Woo et al., Overlapped block disparity compensation with adaptive windows for stereo image coding, IEEE Transactions on Circuits and Systems for Video Technology (2000)
  • J. Jiang et al., A hybrid scheme for low bit-rate coding of stereo images, IEEE Transactions on Image Processing (2002)
  • S. Sethuraman et al., Multiresolution based hierarchical disparity estimation for stereo image pair compression, Proceedings of the symposium on application of subbands and wavelets (1994)
  • H. Yamaguchi et al., Data compression and depth shape reproduction of stereoscopic images, Syst. Comput. Japan (1991)
  • N.V. Boulgouris et al., Embedded coding of stereo images, IEEE ICIP (2000)
  • N.V. Boulgouris et al., A family of wavelet-based stereo image coders, IEEE Transactions on CSVT (2002)
  • T. Frajka et al., Residual image for stereo image compression, ICIP (2002)
  • T. Frajka et al., Residual image for stereo image compression, Optical Engineering (2003)
  • M.G. Perkins, Data compression of stereopairs, IEEE Transactions on Communications (1992)

J. N. Ellinas received the B.Sc. in Electrical and Electronic Engineering from the University of Sheffield, England, in 1977, and the M.Sc. in Telecommunications from the Universities of Sheffield and Leeds, in 1978. Since 1983, he has been with the Technological Educational Institute of Piraeus, Department of Computer Engineering, Greece, where he is currently an Associate Professor. He is currently pursuing the PhD degree in the Department of Informatics and Telecommunications, Section of Telecommunications and Signal Processing, at the University of Athens. His research interests are in image processing, image compression and wavelets.

M. S. Sangriotis received the B.Sc. and PhD degrees from the Physics Department of Athens University in Greece. In 1981 he was with the Department of Physics in Athens University. Since 1990 he has been with the Department of Informatics and Telecommunications in Athens, Greece, where he is currently an Assistant Professor. His research interests include image analysis and image coding.
