For Project 1, we designed and implemented an image compression scheme based on transform coding. We picked a transform (the Haar wavelet), a quantizer (Lloyd-Max), and a bit allocation appropriate to the transform. We then tested several bit allocation schemes and picked the best six to evaluate on a few images. Our quantitative measure of compression quality was the peak signal-to-noise ratio (PSNR).
One of the key ideas in transform coding is picking a transform that compacts an image's energy into a few high-energy transform coefficients while remaining simple to implement. Two simple and effective choices are the discrete cosine transform (DCT) and the Haar wavelet transform. We chose the wavelet transform because we had never worked with wavelets and wanted to learn more about them. Along the same lines, we decided to write our own Haar wavelet transform rather than use the MATLAB toolbox. The choice of a bit allocation scheme also seemed a little more straightforward for the wavelet transform.
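One level of such a transform can be sketched in a few lines. This is a minimal average/difference Haar variant in Python (the report's own implementation and normalization may differ): rows are split into pairwise averages (low-pass) and differences (high-pass), then the same is done along columns, yielding the familiar four quadrants.

```python
import numpy as np

def haar2d(x):
    """One level of a 2D Haar transform: pairwise average/difference
    along rows, then along columns, producing LL, HL, LH, HH quadrants."""
    x = np.asarray(x, dtype=float)
    # Rows: averages (low-pass) on the left, differences (high-pass) on the right.
    t = np.hstack([(x[:, 0::2] + x[:, 1::2]) / 2.0,
                   (x[:, 0::2] - x[:, 1::2]) / 2.0])
    # Columns: same filtering on the row-transformed result.
    return np.vstack([(t[0::2, :] + t[1::2, :]) / 2.0,
                      (t[0::2, :] - t[1::2, :]) / 2.0])

def ihaar2d(c):
    """Invert one level of haar2d (average + difference recovers each pair)."""
    c = np.asarray(c, dtype=float)
    n, m = c.shape[0] // 2, c.shape[1] // 2
    t = np.empty_like(c)
    t[0::2, :] = c[:n, :] + c[n:, :]
    t[1::2, :] = c[:n, :] - c[n:, :]
    x = np.empty_like(c)
    x[:, 0::2] = t[:, :m] + t[:, m:]
    x[:, 1::2] = t[:, :m] - t[:, m:]
    return x
```

On a constant image every detail quadrant is exactly zero, which is the energy-compaction property in its simplest form; the transform is also perfectly invertible, so all loss comes from quantization.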
Once we have the transformed data, we quantize it; this is where we get our compression gain. Because the transform compacts the energy into a few high-energy coefficients, we can spend more bits quantizing those coefficients and fewer bits quantizing the low-energy (i.e., low-information) coefficients. We lose some information, but the transform should compact it enough to achieve good compression while still producing a reasonable-looking image.
Quantization theory tells us that bit allocation is optimal when more bits are allocated to coefficients with higher variance. Simply inspecting the variances of the quadrants of the transform showed that the upper-left corner was much higher than the lower-right, so we allocated many bits to that area and decreased the allocation moving away from the corner. Giving these quadrants a higher share of bits does not increase the stored data much because they are small in area, and since their coefficients are high-energy it was important to represent them accurately.
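The classical high-rate rule this alludes to assigns quadrant k roughly b_k = b_avg + (1/2)·log2(σ_k² / GM), where GM is the geometric mean of the quadrant variances. The report chose its masks by inspection rather than by formula, but a sketch of the rule looks like this:

```python
import numpy as np

def allocate_bits(variances, avg_bits):
    """High-rate optimal allocation: b_k = b_avg + 0.5*log2(var_k / gm),
    where gm is the geometric mean of the quadrant variances.
    Rounded and clipped so it can serve as an integer bit mask.
    (A sketch of the textbook rule, not the report's actual procedure.)"""
    v = np.asarray(variances, dtype=float)
    gm = np.exp(np.mean(np.log(v)))            # geometric mean
    b = avg_bits + 0.5 * np.log2(v / gm)
    return np.clip(np.round(b), 0, None).astype(int)
```

For quadrant variances of 1024, 64, 16, and 4 at an average of 3 bits, this yields 5, 3, 2, and 1 bits: the high-variance upper-left quadrant gets the most bits, matching what inspecting the variances suggested.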
Initially, our quantizer was fairly simple: offset the coefficient matrix so all values are positive, round, shift out the bits specified by the mask, and write the matrix to a file. This made too many assumptions and had two main disadvantages. First, we used a static "expected range" for the coefficients at each level of the DWT. This range was often too big, so the quantization was not as fine as it could be. Second, the shifting and rounding never accounted for inverse quantization back to exactly zero. Both mistakes were most evident at one- and two-bit quantization, where simply zeroing the coefficients gave better results. So we wrote a new quantizer: it dynamically ranges the coefficients to get finer quantization, and it handles one- and two-bit quantization as separate cases.
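The "dynamic range" idea can be sketched as a uniform quantizer that measures the actual extremes of each coefficient block instead of assuming a static range (an illustrative sketch, not the report's exact code):

```python
import numpy as np

def quantize(c, bits):
    """Uniform quantizer that ranges itself on the actual coefficients
    rather than on a static 'expected range' per DWT level."""
    lo, hi = float(c.min()), float(c.max())
    levels = 2 ** bits
    step = (hi - lo) / levels or 1.0          # guard against a constant block
    idx = np.clip(((c - lo) / step).astype(int), 0, levels - 1)
    return idx, lo, step

def dequantize(idx, lo, step):
    # Reconstruct at bin midpoints, so the error is at most step/2.
    return lo + (idx + 0.5) * step
```

Because the step size is (hi − lo)/2^bits computed from the data, a block with a narrow spread of coefficients automatically gets finer quantization than the old static range allowed.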
We observed an interesting anomaly when we first tested our compression scheme: when using 1 or 2 bits to quantize some of the outer quadrants, the PSNR was actually worse than using 0 bits! Looking at a histogram for one of these images, we discovered that these quadrants had most of their values near 0 and a few values on the order of ±100. This makes sense because images usually do not have many high-frequency components, so most coefficients are near 0 while edges create the few larger values. The uniform Lloyd-Max quantizer does not have a reconstruction level at 0 for an even number of quantization levels, and with only 1 or 2 bits this creates substantial error. To remedy this, we adjusted the 1- and 2-bit quantizers to include a reconstruction level at 0.
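A small synthetic experiment reproduces the anomaly. The data here is a Laplacian stand-in for an outer quadrant (mostly near zero, a few large edge coefficients), not the project's actual images, and the reconstruction levels are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
# Peaked, near-zero histogram like an outer DWT quadrant:
# mostly ~0 with a few large edge coefficients.
c = rng.laplace(scale=2.0, size=10_000)
r = float(np.max(np.abs(c)))

# 1-bit midrise quantizer: levels at +/- r/2, no level at exactly zero.
midrise = np.where(c >= 0, r / 2, -r / 2)

# "0 bits": every coefficient reconstructs to 0.
zeros = np.zeros_like(c)

# Adjusted 1-bit quantizer: one level at 0, the other at r/2,
# so the bulk of near-zero (and all negative) values incur no error.
adjusted = np.where(c > r / 4, r / 2, 0.0)

def mse(q):
    return float(np.mean((c - q) ** 2))

# The anomaly: with no zero level, 1 bit is WORSE than 0 bits,
# and restoring a reconstruction level at 0 makes 1 bit pay off again.
print(mse(midrise), mse(zeros), mse(adjusted))
```

The midrise quantizer moves the huge mass of near-zero coefficients out to ±r/2, inflating the error well beyond what simply zeroing the quadrant would cost, which is exactly the PSNR inversion we saw.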
This was simple but somewhat ad hoc. A better solution might have been to use a compander or to adjust the quantization levels based on the distribution of the input.
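As an illustration of the compander idea, the standard μ-law pair (not something the project implemented) stretches the region around zero before uniform quantization, so a peaked coefficient distribution gets finer effective levels near 0:

```python
import numpy as np

def mu_law_compress(x, mu=255.0, xmax=1.0):
    """Mu-law compressor: expands resolution near zero, matching the
    peaked (Laplacian-like) histograms seen in the outer quadrants."""
    return xmax * np.sign(x) * np.log1p(mu * np.abs(x) / xmax) / np.log1p(mu)

def mu_law_expand(y, mu=255.0, xmax=1.0):
    """Inverse of mu_law_compress."""
    return xmax * np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu) / xmax) / mu
```

Uniformly quantizing the compressed values and expanding afterward is equivalent to a non-uniform quantizer whose levels cluster where the input's probability mass is, which is the distribution-aware adjustment described above.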
The one-bit quantizer needs to keep a good range around zero: one of its two levels reconstructs to 0, so all negative numbers get wiped.
The two-bit quantizer adapts more to the range of the data, but the region around zero still has to be preserved.
Bits are assigned (via the mask) per "quadrant," where each quadrant corresponds to a different level of the DWT coefficient matrix. Since we want to try various masks, we simplify by putting the bit assignments in an array whose entries correspond to the quadrants.
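A sketch of that idea, assuming a square image and a full dyadic decomposition (the quadrant ordering here is our assumption; the report's layout may differ): the per-quadrant array is expanded into a full-size mask giving the bit depth at every coefficient position.

```python
import numpy as np

def build_bit_mask(n, bits_per_quadrant):
    """Expand a per-quadrant bit-assignment array into an n-by-n mask.
    Assumed order: the LL band at the deepest level first, then
    (HL, LH, HH) for each level from deepest to shallowest."""
    levels = (len(bits_per_quadrant) - 1) // 3
    mask = np.zeros((n, n), dtype=int)
    size = n >> levels
    mask[:size, :size] = bits_per_quadrant[0]                       # LL band
    i = 1
    for _ in range(levels):
        mask[:size, size:2 * size] = bits_per_quadrant[i]           # HL
        mask[size:2 * size, :size] = bits_per_quadrant[i + 1]       # LH
        mask[size:2 * size, size:2 * size] = bits_per_quadrant[i + 2]  # HH
        i += 3
        size *= 2
    return mask
```

A convenient side effect is that `mask.mean()` is exactly the bits-per-pixel of that mask, which makes comparing allocation schemes at equal bpp straightforward.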
Below are the results from testing several masks to compare PSNR against bits per pixel (bpp). The outcomes were as expected: initial increases in bits per pixel produced dramatic gains in PSNR, which then tapered off asymptotically beyond a certain bpp.
A notable exception is the "Buck" results, which show unusually high PSNR at very low bit rates. This makes sense because "Buck" has a fairly simple, low-contrast background, making it significantly easier to quantize.
The only notable instance where an increase in bpp resulted in a lower PSNR was the image "Camera"; we conclude this was due to the image's high variance, mainly from the scenic background.
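The PSNR figures quoted throughout assume 8-bit images and follow the standard definition, PSNR = 10·log10(255² / MSE), which can be computed as:

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for images with the given peak
    value (255 for 8-bit grayscale): 10*log10(peak^2 / MSE)."""
    diff = np.asarray(original, dtype=float) - np.asarray(reconstructed, dtype=float)
    mse = float(np.mean(diff ** 2))
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

An off-by-one everywhere (MSE = 1) gives about 48.13 dB for 8-bit data, a useful sanity-check value when reading PSNR-versus-bpp curves.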