Simple JPEG image I/O with libjpeg

20 Mar 2020 - tsp

Back when I started experimenting with computer vision algorithms I always had the major problem of having to use either proprietary toolboxes (at university we used MatLab with the highly flexible image processing toolbox) or some ugly hacks - like a half baked JPEG decoder implementation written by myself that unfortunately was never really finished. After some time I came to the decision that using libjpeg or libjpeg-turbo would be a good idea since it provides a solid implementation of JPEG and a rather simple programming interface. But the documentation was hard to read for me and I felt that I was missing an easy example of how to use libjpeg for accessing JPEG images. The same problem arose later when I tried to process images captured via a Video4Linux device (i.e. a webcam) and the RaspberryPi camera. So I decided to write this really short introduction - and provide a basic method that just reads a JPEG file into a bitmap buffer, which one can simply copy and paste into existing projects without any dependencies other than libjpeg.

Data structure used in this example

To do experimentation in the field of computer vision it's often simple and feasible to keep a whole uncompressed bitmap of the source images in main memory. This of course assumes that either enough memory is present or one wants to rely on swapping and/or memory mapping large regions of data into memory. Note that this approach is really nice when doing experimentation and is also feasible for some real world tasks (like image classification) but might be problematic when one has to deal with high resolution images without downsampling or wants to keep a huge number of images in main memory (for example when doing reconstruction in radio astronomy, etc.). In that case one should start thinking about resource management before implementing anything.

Keeping the image inside main memory as a continuous block also allows easy transfer to OpenCL or CUDA processing pipelines.

To store the image for easy access the following data structure will be defined:

struct imgRawImage {
	unsigned int numComponents;      /* Color components per pixel (e.g. 3 for RGB) */
	unsigned long int width, height; /* Image dimensions in pixels */

	unsigned char* lpData;           /* Raw pixel data, stored row by row */
};
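Pixels in this flat buffer are addressed with plain row-major arithmetic. A minimal accessor sketch (the helper name getPixelComponent is just an illustration, not part of the loader; the structure is repeated so the snippet stands alone):

```c
/* Same structure as defined above */
struct imgRawImage {
	unsigned int numComponents;
	unsigned long int width, height;
	unsigned char* lpData;
};

/* Return one color component of the pixel at (x, y); row-major layout */
static unsigned char getPixelComponent(
	const struct imgRawImage* lpImg,
	unsigned long int x,
	unsigned long int y,
	unsigned int component
) {
	return lpImg->lpData[(y * lpImg->width + x) * lpImg->numComponents + component];
}
```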

How to load a JPEG using libjpeg

The basic process is rather simple as one can also see from the code given below:

Note that the following code does not perform proper error handling. This has been left out for readability. Error handling has to be implemented in any real life scenario that is used for more than a simple experiment. Crashing the program is not error handling (except when developing with a framework like Erlang/OTP, of course)!

The code

#include <jpeglib.h>
#include <jerror.h>

struct imgRawImage* loadJpegImageFile(char* lpFilename) {
	struct jpeg_decompress_struct info;
	struct jpeg_error_mgr err;

	struct imgRawImage* lpNewImage;

	unsigned long int imgWidth, imgHeight;
	int numComponents;

	unsigned long int dwBufferBytes;
	unsigned char* lpData;

	unsigned char* lpRowBuffer[1];

	FILE* fHandle;

	fHandle = fopen(lpFilename, "rb");
	if(fHandle == NULL) {
		#ifdef DEBUG
			fprintf(stderr, "%s:%u: Failed to read file %s\n", __FILE__, __LINE__, lpFilename);
		#endif
		return NULL; /* ToDo */
	}

	info.err = jpeg_std_error(&err);
	jpeg_create_decompress(&info);

	jpeg_stdio_src(&info, fHandle);
	jpeg_read_header(&info, TRUE);

	jpeg_start_decompress(&info);
	imgWidth = info.output_width;
	imgHeight = info.output_height;
	numComponents = info.output_components; /* Components after decompression, not info.num_components */

	#ifdef DEBUG
		fprintf(
			stderr,
			"%s:%u: Reading JPEG with dimensions %lu x %lu and %d components\n",
			__FILE__, __LINE__,
			imgWidth, imgHeight, numComponents
		);
	#endif

	dwBufferBytes = imgWidth * imgHeight * 3; /* We only read RGB, not A */
	lpData = (unsigned char*)malloc(sizeof(unsigned char)*dwBufferBytes);

	lpNewImage = (struct imgRawImage*)malloc(sizeof(struct imgRawImage));
	lpNewImage->numComponents = numComponents;
	lpNewImage->width = imgWidth;
	lpNewImage->height = imgHeight;
	lpNewImage->lpData = lpData;

	/* Read scanline by scanline */
	while(info.output_scanline < info.output_height) {
		lpRowBuffer[0] = (unsigned char *)(&lpData[3*info.output_width*info.output_scanline]);
		jpeg_read_scanlines(&info, lpRowBuffer, 1);
	}

	jpeg_finish_decompress(&info);
	jpeg_destroy_decompress(&info);
	fclose(fHandle);

	return lpNewImage;
}

Walkthrough and explanation

First the decompressor is created using the jpeg_create_decompress function. This function requires a set of error handling routines. In the simplest case one can use the default ones provided by libjpeg. The error manager state structure struct jpeg_error_mgr is initialized by jpeg_std_error, which also returns a reference to the newly initialized structure. (Since a student of mine made this mistake a number of times: note that the error manager is declared as a local variable in the example above - when modularizing further one should take care that this error manager stays valid until the decoder is released!)

After that the decompressor has its data supplied from one of the sources. In the example above the source is set to a libc FILE reference to read from a file located on disk. This is done using jpeg_stdio_src. Another way would be reading from a memory location using jpeg_mem_src after a JPEG has been received via any other means (camera device for MJPEG camera streams, network without caching, memory mapped file access, etc.)

Then the decompressor is initialized and the header is read (jpeg_read_header followed by jpeg_start_decompress). Note that both functions might fail so proper error handling is required.

Then the buffer is allocated. In this example two different data buffers are used - one might also use a flexible array member instead. I've implemented it this way to allow easy handling (including releasing, re-allocating, etc.) of the raw data array independent of any metadata. This sample code of course also lacks error handling (malloc returns NULL in hard out of memory conditions if no out-of-memory killer is configured or in case resource limits are reached).
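Since width and height come from an untrusted file, the multiplication width * height * 3 can itself overflow before malloc ever sees the result. A hedged sketch of an overflow-checked size computation (the helper calcRgbBufferSize is not part of the article's code, just an illustration):

```c
#include <stddef.h>

/* Compute width * height * 3 bytes, returning 0 on arithmetic overflow */
static size_t calcRgbBufferSize(unsigned long int width, unsigned long int height) {
	size_t res;
	if((width == 0) || (height == 0)) { return 0; }
	if(width > ((size_t)-1) / height) { return 0; } /* width * height would overflow */
	res = (size_t)width * (size_t)height;
	if(res > ((size_t)-1) / 3) { return 0; } /* multiplying by 3 would overflow */
	return res * 3;
}
```

A return value of 0 then signals the caller to bail out instead of passing a wrapped-around size to malloc.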

After the buffer has been allocated the code reads the image scanline by scanline using jpeg_read_scanlines. One could also read multiple scanlines at a time but since it might be desired to process them while streaming this example has been implemented that way. One could of course substitute the whole loop

while(info.output_scanline < info.output_height) {
	lpRowBuffer[0] = (unsigned char *)(&lpData[3*info.output_width*info.output_scanline]);
	jpeg_read_scanlines(&info, lpRowBuffer, 1);
}

by a version that requests all remaining scanlines per call. Note that jpeg_read_scanlines expects one row pointer per scanline it is asked to read and may return fewer lines than requested, so a loop is still required:

unsigned long int i;
unsigned char* lpRows[imgHeight]; /* One row pointer per scanline */

for(i = 0; i < imgHeight; i=i+1) {
	lpRows[i] = &lpData[3*imgWidth*i];
}
while(info.output_scanline < info.output_height) {
	jpeg_read_scanlines(&info, &lpRows[info.output_scanline], info.output_height - info.output_scanline);
}

Note that this function might also fail - again, add error handling. At the end the decompressor is finalized using jpeg_finish_decompress and then released using jpeg_destroy_decompress. Again please take care that jpeg_finish_decompress might indicate an error, in which case one might not want to use the already read data.
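Since proper error handling has been mentioned several times: by default libjpeg's error manager terminates the process on fatal errors, which is why libjpeg's own example.c overrides the error_exit callback and uses longjmp to hand control back to the caller. The control flow of that pattern can be sketched standalone (decodeSomething and onFatalError are purely illustrative stand-ins for the actual libjpeg calls and callback):

```c
#include <setjmp.h>

static jmp_buf recoveryPoint;

/* Stand-in for a callback like libjpeg's error_exit: never returns normally */
static void onFatalError(void) {
	longjmp(recoveryPoint, 1);
}

/* Stand-in for the decoding work; fails when shouldFail is nonzero */
static int decodeSomething(int shouldFail) {
	if(shouldFail) { onFatalError(); }
	return 42; /* Pretend decoding result */
}

/* Returns -1 on decode failure instead of terminating the process */
static int tryDecode(int shouldFail) {
	if(setjmp(recoveryPoint)) {
		return -1; /* longjmp landed here: clean up and report failure */
	}
	return decodeSomething(shouldFail);
}
```

In a real decoder the cleanup branch would also call jpeg_destroy_decompress and fclose before returning.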

The other way: Storing raw images into JPEG

This works similarly to the example above. No walkthrough is provided for this section since the idea is the same as for the decompressor described above (except for the direction of data transfer).

The code

#include <jpeglib.h>
#include <jerror.h>

int storeJpegImageFile(struct imgRawImage* lpImage, char* lpFilename) {
	struct jpeg_compress_struct info;
	struct jpeg_error_mgr err;

	unsigned char* lpRowBuffer[1];

	FILE* fHandle;

	fHandle = fopen(lpFilename, "wb");
	if(fHandle == NULL) {
		#ifdef DEBUG
			fprintf(stderr, "%s:%u: Failed to open output file %s\n", __FILE__, __LINE__, lpFilename);
		#endif
		return 1;
	}

	info.err = jpeg_std_error(&err);
	jpeg_create_compress(&info);

	jpeg_stdio_dest(&info, fHandle);

	info.image_width = lpImage->width;
	info.image_height = lpImage->height;
	info.input_components = 3;
	info.in_color_space = JCS_RGB;

	jpeg_set_defaults(&info);
	jpeg_set_quality(&info, 100, TRUE);

	jpeg_start_compress(&info, TRUE);

	/* Write every scanline ... */
	while(info.next_scanline < info.image_height) {
		lpRowBuffer[0] = &(lpImage->lpData[info.next_scanline * (lpImage->width * 3)]);
		jpeg_write_scanlines(&info, lpRowBuffer, 1);
	}

	jpeg_finish_compress(&info);
	fclose(fHandle);

	jpeg_destroy_compress(&info);
	return 0;
}

Some comments

The color space provided has to match the number of components and the supplied data. Normally this is one of:

- JCS_GRAYSCALE: 1 component (luminance only)
- JCS_RGB: 3 components (red, green, blue)
- JCS_YCbCr: 3 components (luminance and two chrominance channels)
- JCS_CMYK: 4 components (cyan, magenta, yellow, black)

A simple experiment (implementing a grayscale filter)

Just as an example, here is how one might implement a simple grayscale filter. In this case it's assumed that the image keeps 3 color channels, all carrying the same intensity value as the grayscale channel. This has the advantage that other processing functions do not have to discriminate on the number of components they are using. The disadvantage is three times the memory usage.

So how does grayscale conversion work? Basically the intensity value of each channel contributes with a fixed weight to the overall intensity value. The weights are chosen to reflect the sensitivity of the eye to different colors. There is a huge number of different grayscale weighting schemes, all differing somewhat in detail while the magnitudes of the weights stay similar; the code below uses the ITU-R BT.601 luma weights 0.299, 0.587 and 0.114.
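As a quick sanity check of these weights: a pure red pixel (255, 0, 0) contributes only through its red weight, giving 0.299 * 255 = 76 after truncation. A minimal helper sketch (rgbToLuma is just an illustration, not part of the article's code):

```c
/* BT.601 luma weighting of one RGB pixel, truncated to an 8 bit value */
static unsigned char rgbToLuma(unsigned char r, unsigned char g, unsigned char b) {
	return (unsigned char)(
		0.299f * (float)r
		+ 0.587f * (float)g
		+ 0.114f * (float)b
	);
}
```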

In the most naive way one might:

enum imageLibraryError filterGrayscale(
	struct imgRawImage* lpInput,
	struct imgRawImage** lpOutput
) {
	unsigned long int i;

	if(lpOutput == NULL) {
		lpOutput = &lpInput; /* No output supplied: transform the input image in place */
	} else {
		(*lpOutput) = malloc(sizeof(struct imgRawImage));
		(*lpOutput)->width = lpInput->width;
		(*lpOutput)->height = lpInput->height;
		(*lpOutput)->numComponents = lpInput->numComponents;
		(*lpOutput)->lpData = malloc(sizeof(unsigned char) * lpInput->width*lpInput->height*3);
	}

	for(i = 0; i < lpInput->width*lpInput->height; i=i+1) {
		/* Do a grayscale transformation */
		unsigned char luma = (unsigned char)(
			0.299f * (float)lpInput->lpData[i * 3 + 0]
			+ 0.587f * (float)lpInput->lpData[i * 3 + 1]
			+ 0.114f * (float)lpInput->lpData[i * 3 + 2]
		);
		(*lpOutput)->lpData[i * 3 + 0] = luma;
		(*lpOutput)->lpData[i * 3 + 1] = luma;
		(*lpOutput)->lpData[i * 3 + 2] = luma;
	}

	return imageLibE_Ok;
}

Some other things to implement when doing first experiments

Some of the most useful filters and modules I’ve implemented during my early experiments with computer vision have been:

This article is tagged: Programming, Data Mining, Computer Vision


Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)
