Developing native Python extensions in C - capturing from Video4Linux devices

12 Jun 2022 - tsp
Last update 12 Jun 2022

Introduction
The example problem
- Requirements on FreeBSD
How to implement objects and methods in native extensions
The behavior implemented
The code

Introduction

Yes, another Python article. Seems I’m doing more work in Python than I really like to do (I still don’t exactly like the language despite having done a few years of coding in Python too). But sometimes you already have a project in Python or it just turns out to be the best tool for a given job - and then you should also use a tool that you don’t like if it fits your problem best. It’s not about sympathy but about the best technology to solve a given problem anyways.

So there you go. You have your huge amount of C code that you’ve developed and now someone wants to interface it using Python. Or you have some modules that you develop in C and want to run short tests or verifications from some Python environment like I wanted to do for some of my numeric and image processing / machine learning libraries. Then you might be tempted to just write a small Python wrapper around your ANSI C object oriented code and expose the functionality into some Python module namespace.

There is a huge number of tutorials out there, but they all took a somewhat strange view of the environment and of software development - and they required one to already know the internal workings of CPython. To be honest, this is of course really helpful - but I would have really appreciated a tutorial that focused on developing a simple plugin that exposes a custom Python object (one that not only allows calling a simple hello world function and passing back a string, but provides a real object that keeps state and exposes variables/attributes as well as functions/methods).

So the first thing that was a little bit strange was getting used to Python’s nomenclature when coming from the C / C++ world.

The example problem

So what do I want to expose to Python? For the sake of simplicity I want to expose a wrapper around the Video4Linux API and access a webcam in streaming mode, as I already wrote up in a summary blog post some time ago. The idea is to have a lightweight library that captures frames from a simple V4L capture device such as a USB camera and that’s accessible via a small set of functions while the camera is run in streaming mode with configurable queue depth - without installing a huge amount of dependencies such as OpenCV together with its Python wrapper, especially since building OpenCV on FreeBSD is somewhat of a problem from time to time due to portability issues. The library should be simple, compact and buildable on any platform that exposes devices via the Video4Linux API and supports the mmap based streaming API or the read/write based interface (yes, this limits the use cases and might introduce some delay, but for most simple applications this should be sufficient).

Example capture visualized with matplotlib

The API thus would be composed of:

A simple method that will return a list of discovered Video4Linux devices that allow access via mmap.
A Camera class / object that allows one to wrap a V4L device and:
- Open it (this will be done by creating the object instance)
- Query the capabilities as a separate CameraCapability class that exposes:
  - Driver information
  - Card information
  - Bus information
  - Device version
  - Capabilities and device capabilities
  - Query the maximum supported cropping area
  - The supported formats. Each format will be described by a CameraFormat class. Note that for this blog post I’m going to limit myself to the YUV888, RGB888 and Grayscale8 formats.
- Set the cropping region
- Select the output format (RGB888, YUV888 or Grayscale8), which should either be delivered directly by the device if supported or be converted transparently by the library. So one can simply set any of the formats without checking the capabilities.

Requirements on FreeBSD

To run the code from the GitHub repository that this article is about one has to install:

multimedia/webcamd which supplies the character device files like /dev/videoN that expose the Video4Linux API - basically this is a usermode implementation of the Linux webcam and DVB drivers like uvcdriver via the cuse API by accessing the camera or DVB device using libusb (also depends on pthreads and libc). webcamd is usually started via your devd or devfs ruleset whenever a compatible camera device is attached and automatically provides the video device files inside your devfs
Since Video4Linux is a Linux API one has to install additional headers which are provided in multimedia/v4l_compat. This does not install any services - only header files for the ioctl commands

Read the output of pkg or the ports build scripts to see how to start webcamd after installation without restarting your system. webcamd should also be enabled in rc.conf if it’s used on a regular basis.

$ sudo pkg install multimedia/v4l_compat
$ sudo pkg install multimedia/webcamd
$ echo 'webcamd_enable="YES"' | sudo tee -a /etc/rc.conf
$ echo 'cuse_load="YES"' | sudo tee -a /boot/loader.conf
$ sudo kldload cuse
$ sudo /etc/rc.d/devd restart
$ sudo /usr/local/etc/rc.d/webcamd start

How to implement objects and methods in native extensions

So now that we know what we want to do, let’s take a step back and look at how this is going to be implemented. First a short note about the nomenclature:

The main object that’s imported is called a module. This is what one addresses with the dotted notation. For example the notation module.Type would address the type Type inside the module module. Modules can of course be nested as usual.
- Modules can have member functions / methods. Those are static methods that are not tied to an object instance.
- Modules can also contain object types that one can instantiate.
What’s known as an object or class definition in most programming languages is called a type in Python. So for each object type we’re going to register our own Python type. This type defines which attributes and methods are present on its instances.
- Attributes are variables that one can access.
- Methods are the member functions that one can call.
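To make the nomenclature concrete, here is a minimal pure-Python sketch that builds a module object by hand and places a module level function and a type into it. The names demo, list_devices and Camera are placeholders chosen for this illustration only; they are not taken from the extension.

```python
import types

# A module is an ordinary object; normally the import system creates it,
# here one is built by hand purely for illustration.
demo = types.ModuleType("demo", "Hand-made module used for illustration")


def list_devices():
    """Module level function: not tied to any object instance."""
    return ["/dev/video0"]


class Camera:
    """A type that can be instantiated."""

    def __init__(self, device="/dev/video0"):
        # Attribute: a variable stored on the instance
        self.device = device

    def describe(self):
        # Method: a member function bound to the instance
        return "Camera on " + self.device


# Register both names inside the module namespace
demo.list_devices = list_devices
demo.Camera = Camera

cam = demo.Camera()
print(type(demo).__name__, demo.list_devices(), cam.describe())
```

The native extension does the same registration from C: functions go into the module definition and types are added to the module object.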

Creating a module and the main function

The main entry function that’s called when a shared object is loaded by Python is a function that uses the PyMODINIT_FUNC macro to designate itself as the module initialization function:

PyMODINIT_FUNC PyInit_simplepycam(void) {
	/* Create and return the module object here ... */
}

This function will create an instance of a module descriptor that lists some basic information about the module, the methods that one can call and a documentation string. All of this is described by the PyModuleDef structure that is declared in Python.h. This structure first requires the initializer macro PyModuleDef_HEAD_INIT, which assigns sane values to the required base fields; after that one can use C99 style designated initializers to assign the values one requires. In the following example we’re going to write a definition for the simplepycam namespace:

#include "Python.h"

struct PyModuleDef simplepycamModuleDefinition = {
	PyModuleDef_HEAD_INIT,

	.m_name = "simplepycam",
	.m_doc = "Simple V4L camera access from Python",
	.m_size = -1
};

PyMODINIT_FUNC PyInit_simplepycam(void) {
	PyObject* lpPyModule;

	/* Create an instance of the module definition */
	lpPyModule = PyModule_Create(&simplepycamModuleDefinition);
	if(lpPyModule == NULL) {
		return NULL;
	}

	return lpPyModule;
}

Note that the name of the init function is not arbitrary! It always has the structure PyInit_ followed by the module name.
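The naming rule can be written down as a small helper. This is only a sketch of the convention for ASCII module names; the function expected_init_symbol is a hypothetical helper for illustration and not part of any library.

```python
import importlib.machinery


def expected_init_symbol(module_name):
    """Return the C symbol name CPython looks up for an extension module.

    Only ASCII module names are handled in this sketch.
    """
    # For a module inside a package only the last component is used
    leaf = module_name.rpartition(".")[2]
    if not leaf.isascii():
        raise ValueError("sketch only handles ASCII module names")
    return "PyInit_" + leaf


print(expected_init_symbol("simplepycam"))
# File name suffixes the running interpreter accepts for extension modules
print(importlib.machinery.EXTENSION_SUFFIXES)
```

If the exported symbol does not match this name, the import fails even though the shared object itself can be loaded.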

Creating an object type (an instantiable class type)

The next step is to create a class type. As we’ve seen before we’re going to need class types for:

The camera itself (simplepycam.Camera)
A camera format (simplepycam.CameraFormat)

The CameraFormat class is the simplest one since it only serves as a wrapper around format information; the main logic will be found in Camera.

So how does one create an object type anyway? Basically one creates an instance of a PyTypeObject structure, assigns references to the available methods / functions and member variables, assigns references to custom __new__, __init__ and destructor functions if one requires them, finalizes the descriptor using the PyType_Ready function, increments its reference count and adds it to the module object using the PyModule_AddObject function. Again a macro, PyVarObject_HEAD_INIT, makes life a little bit easier.

As one would expect when one has already done object oriented programming in C, each object has a corresponding structure that contains its properties. Unlike in the previously presented object oriented pattern for C, methods are not referenced via function pointers directly from this structure; the CPython interpreter keeps track of them for us in the type object. So one first has to define a base structure that will contain all of our object state. The first member of this structure has to be PyObject_HEAD - which contains the base class information from object:

struct pyCamera_Instance {
	PyObject_HEAD

	// Our own variables follow here ...
};

The accompanying PyTypeObject might look like the following - note that this will not compile until the methods, members and assigned functions are also defined:

static PyTypeObject simplepycamType_Camera = {
	PyVarObject_HEAD_INIT(NULL, 0)

	.tp_name = "simplepycam.Camera",
	.tp_doc = PyDoc_STR("Camera for simple V4L camera access"),
	.tp_basicsize = sizeof(struct pyCamera_Instance),
	.tp_itemsize = 0,

	.tp_methods = pyCamera_Methods,
	.tp_members = pyCamera_Members,

	.tp_new = pyCamera_New,
	.tp_init = (initproc)pyCamera_Init,
	.tp_dealloc = (destructor)pyCamera_Dealloc
};
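For comparison, the fields of the PyTypeObject have rough counterparts when a type is created dynamically from Python with the three argument form of type(). This is only an analogy (a type created this way is a heap type, not a static one), and the example method is a placeholder for illustration.

```python
def example(self):
    """Placeholder method, comparable to an entry in tp_methods."""
    return None


# type(name, bases, namespace) builds a new type at runtime
Camera = type(
    "Camera",           # name part of tp_name
    (object,),          # base class
    {
        "__doc__": "Camera for simple V4L camera access",  # tp_doc
        "__module__": "simplepycam",  # module part of tp_name
        "example": example,           # comparable to tp_methods
    },
)

cam = Camera()
print(Camera.__module__ + "." + Camera.__qualname__, "-", Camera.__doc__)
```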

The type will be finalized inside our initialization function and then will be added - with increased reference count - to our module namespace. The extended initialization function might look like this:

PyMODINIT_FUNC PyInit_simplepycam(void) {
	PyObject* lpPyModule;

	/* First let's finalize all type definitions ... */
	if(PyType_Ready(&simplepycamType_Camera) < 0) {
		return NULL;
	}

	/* Create an instance of the module definition */
	lpPyModule = PyModule_Create(&simplepycamModuleDefinition);
	if(lpPyModule == NULL) {
		return NULL;
	}

	/* Increment reference counts on all type definitions ... */
	Py_INCREF(&simplepycamType_Camera);

	/* Add type definitions to module ... */
	if((PyModule_AddObject(lpPyModule, "Camera", (PyObject*)(&simplepycamType_Camera))) < 0) {
		/* In case of error: Release all objects by decrementing their refcounts to 0 */
		Py_DECREF(&simplepycamType_Camera);
		Py_DECREF(lpPyModule);
		return NULL;
	}

	return lpPyModule;
}

The setup.py file

The whole build process for Python extensions is usually controlled by a file called setup.py that utilizes the distutils.core functionality (on Python 3.12 and later, where distutils has been removed, setup and Extension can be imported from setuptools instead). These utilities invoke the compiler and linker themselves - the build is usually not steered by Makefiles or something similar. The simplest setup.py can be tried when we just comment out all members of our Camera type definition in simplepycamType_Camera that we’ve not implemented yet (methods, members, new, init and dealloc). The most basic setup.py might look like the following:

from distutils.core import setup, Extension

setup(
	name = "simplepycam",
	version = "0.1",
	ext_modules = [
		Extension("simplepycam", ["simplepycam.c"])
	]
)

One can then simply execute

python setup.py build

to build the extension. Usually this will produce a temporary and a lib directory also containing the platform and architecture information inside the filename - as well as the Python version. On my development machine using Python 3.8 this will create:

build/temp.freebsd-12.2-RELEASE-p7-amd64-3.8 for temporary stuff. After the build has finished this is not needed anymore.
build/lib.freebsd-12.2-RELEASE-p7-amd64-3.8 which contains simplepycam.cpython-38.so which is the binary extension module that will be loaded by our interpreter.

One can simply try this from the command line (for the sys.path statement I assume the same operating system and directory structure as described above):

$ python
Python 3.8.12 (default, Oct 5 2021, 01:13:43)
[Clang 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611a on freebsd12)]
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import sys
>>> sys.path.append("./build/lib.freebsd-12.2-RELEASE-p7-amd64-3.8")
>>>
>>> import simplepycam
>>> dir(simplepycam)
['Camera', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']
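Since the name of the build directory depends on platform, architecture and Python version, it can be convenient to look it up instead of hard-coding it. The following is a minimal sketch under the assumption that the build output lives in a directory below build whose name starts with lib; the helper names are hypothetical.

```python
import importlib.machinery
import pathlib
import sysconfig


def expected_extension_filename(module_name):
    """File name the running interpreter expects for an extension module."""
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    if not suffix:
        # Fall back to the first suffix the import system accepts
        suffix = importlib.machinery.EXTENSION_SUFFIXES[0]
    return module_name + suffix


def find_built_extension(module_name, build_dir="build"):
    """Return the directories below build_dir that contain the built module."""
    base = pathlib.Path(build_dir)
    if not base.is_dir():
        return []
    filename = expected_extension_filename(module_name)
    return sorted(str(path.parent) for path in base.glob("lib*/" + filename))


print(expected_extension_filename("simplepycam"))
print(find_built_extension("simplepycam"))
```

The returned directories can be appended to sys.path before importing simplepycam.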

Since we have not defined our init and new functions it’s not possible to create an instance up until now though:

>>> cam = simplepycam.Camera()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'simplepycam.Camera' instances

Adding dummy new, init and dealloc functions

So the next step will be adding dummy (empty) __new__, __init__ and destructor functions so one can instantiate the type:

static PyObject* pyCamera_New(
	PyTypeObject* type,
	PyObject* args,
	PyObject* kwds
) {
	struct pyCamera_Instance* lpNewInstance;

	/* tp_alloc returns zero initialized memory for the instance */
	lpNewInstance = (struct pyCamera_Instance*)type->tp_alloc(type, 0);
	if(lpNewInstance == NULL) {
		return NULL;
	}

	// Do other initialization

	return (PyObject*)lpNewInstance;
}

static int pyCamera_Init(
	struct pyCamera_Instance* lpSelf,
	PyObject* args,
	PyObject* kwds
) {
	return 0;
}

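static void pyCamera_Dealloc(
	struct pyCamera_Instance* lpSelf
) {
	/*
		Release resources owned by the instance here. The Camera type is
		statically allocated, so its reference count must not be decremented;
		only instances of heap allocated types hold a reference to their type.
	*/
	Py_TYPE(lpSelf)->tp_free((PyObject*)lpSelf);
}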

Those functions are then referenced via the tp_new, tp_init and tp_dealloc members of the PyTypeObject structure as shown above. What do we expect those functions to do?

tp_new is the __new__ instance creation function. This function should only call tp_alloc to allocate storage for the object (one can also override this) and do as much initialization as is absolutely necessary or that’s not idempotent (i.e. is only allowed to happen exactly once). For immutable objects initialization should usually be done in tp_new.
tp_init is the well known __init__ function that initializes class state. Unlike __new__ this method is assumed to be idempotent, i.e. it is possible to reinitialize an object by calling its __init__ function.
The tp_dealloc function is the destructor function. This always has to be defined except for singleton objects that are never destroyed. The deallocation function should release all remaining references to other objects and if applicable stop garbage collector tracking. In the end it should use tp_free to release the memory and - for instances of heap allocated types - also decrease the reference count on one’s type.
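The split between one-time creation and repeatable initialization can be observed from pure Python as well. The following sketch uses a hypothetical Demo class; it only illustrates the division of labour and says nothing about how the extension itself is implemented.

```python
class Demo:
    instances_created = 0

    def __new__(cls, *args, **kwargs):
        # Runs exactly once per object: allocate and do one-time setup
        self = super().__new__(cls)
        Demo.instances_created += 1
        self.init_calls = 0
        return self

    def __init__(self, device="/dev/video0"):
        # May run again on an existing object, so it only (re)sets state
        self.init_calls += 1
        self.device = device


demo = Demo()
demo.__init__("/dev/video1")  # re-initialize the same object

print(Demo.instances_created, demo.init_calls, demo.device)
```

Calling __init__ a second time does not allocate a new object; this is the reason why an init function has to release state left over from a previous call before assigning new values.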

Accessing parameters of the init function

Now let’s get to a more interesting topic - accessing the parameters of our init function (which is the same thing as accessing parameters of functions later on).

We assume that our initialization function will receive at most a single parameter for now that supplies the device filename. This is a positional (the first) or named (dev) parameter. This will have a default of /dev/video0. To parse arguments we’re going to use the PyArg_ParseTupleAndKeywords utility function that allows one to deconstruct positional arguments (args) and keyword arguments (kwds) into Python objects or C types:

static char* pyCamera_Init__KWList[] = { "dev", NULL };
static char* pyCamera_Init__DefaultCameraFile = "/dev/video0";

static int pyCamera_Init(
    struct pyCamera_Instance* lpSelf,
    PyObject* args,
    PyObject* kwds
) {
    char* lpArg_Dev = NULL;

    if(!PyArg_ParseTupleAndKeywords(args, kwds, "|s", pyCamera_Init__KWList, &lpArg_Dev)) {
        return -1;
    }

    if(lpSelf->lpDeviceName != NULL) {
        free(lpSelf->lpDeviceName);
        lpSelf->lpDeviceName = NULL;
    }

    if(lpArg_Dev == NULL) {
        lpSelf->lpDeviceName = (char*)malloc(sizeof(char) * strlen(pyCamera_Init__DefaultCameraFile) + 1);
        if(lpSelf->lpDeviceName == NULL) {
            PyErr_SetNone(PyExc_MemoryError);
            return -1;
        }
        strcpy(lpSelf->lpDeviceName, pyCamera_Init__DefaultCameraFile);
    } else {
        lpSelf->lpDeviceName = (char*)malloc(sizeof(char) * strlen(lpArg_Dev) + 1);
        if(lpSelf->lpDeviceName == NULL) {
            PyErr_SetNone(PyExc_MemoryError);
            return -1;
        }
        strcpy(lpSelf->lpDeviceName, lpArg_Dev);
    }

    /*
        Stripped querying capabilities of the camera, verifying streaming
        functionality and available interfaces, copying information, etc. from
        webpage code snippets. See the full source code for details
    */

    return 0;
}

The format string of PyArg_ParseTupleAndKeywords specifies how the positional and keyword arguments should be parsed. In this case |s has been specified: the | marks all following arguments as optional and the s indicates that the first argument is a string that is converted into a C character array (if one used s* one would receive a Py_buffer structure instead, which has to be released again using PyBuffer_Release). Note that we are not responsible for managing this character buffer - in fact if we release it we’re going to crash the interpreter. Just treat this as a NULL terminated constant that we’re allowed to read.

In this function one can also already see part of the exception handling used in Python - the init function just returns 0 on success or -1 on failure. The exception information is passed using PyErr_SetNone, which sets the given exception type (here PyExc_MemoryError) without an associated value.
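Seen from the caller, the format string |s together with the keyword list behaves roughly like an optional string parameter named dev. The following pure-Python stand-in is a sketch of that calling behavior only; the class and the try_create helper are hypothetical and do not touch any device.

```python
class Camera:
    """Pure-Python stand-in that accepts the same arguments as the C init."""

    def __init__(self, dev="/dev/video0"):
        # The "s" format unit only accepts str objects
        if not isinstance(dev, str):
            raise TypeError("dev has to be a string")
        self.device = dev


def try_create(*args, **kwargs):
    """Return the device name or the name of the exception that was raised."""
    try:
        return Camera(*args, **kwargs).device
    except TypeError as exc:
        return type(exc).__name__


print(try_create())                         # default is used
print(try_create("/dev/video1"))            # positional argument
print(try_create(dev="/dev/video2"))        # keyword argument
print(try_create(42))                       # wrong type
print(try_create("/dev/video0", "extra"))   # too many arguments
```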

Adding custom member functions (methods)

Then we want to add custom methods to our object. All methods exported from a type are listed in a method table - this is an array of type PyMethodDef[] that is terminated by an entry filled with NULL. Each entry contains the name of the method, a function pointer, flags and a doc string that gives a short explanation of the purpose of the method. The flags supported are:

METH_VARARGS which selects the standard calling convention: the C function receives self and a tuple containing the positional arguments. This is the flag that one normally uses.
METH_KEYWORDS allows one to support keyword arguments. Those are passed as a third argument to the methods. Note that the method signature has to match the presence or absence of the keywords argument.

All methods have the same signature. They:

Return a PyObject reference. Methods that don’t return anything useful have to return Py_None after incrementing its reference count (the Py_RETURN_NONE macro does both).
Their first argument is a PyObject reference that refers to their own datastructure. One can cast this to one’s own structure type.
The second argument contains the arguments that are specified without keywords
The optional third argument contains keyword arguments and is only present when METH_KEYWORDS has been set.

For example a simple method that does not receive arguments and does not return any value might have the following signature:

static PyObject* pyCamera_Example(
	PyObject* lpSelf,
	PyObject *args
) {
	/* We return nothing ... */
	Py_INCREF(Py_None);
	return Py_None;
}

This method will then have to be listed in the method table:

static PyMethodDef pyCamera_Methods[] = {
	{ "example", pyCamera_Example, METH_VARARGS, "Example method" },
	{ NULL, NULL, 0, NULL }
};

The method table will be referenced by the tp_methods member of the type descriptor mentioned above.
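From the Python side, the calling convention of a native function shows up when a caller passes keyword arguments. The sketch below uses two built-ins implemented in C to illustrate this: len does not accept keyword arguments, while sorted does. They merely stand in for methods of an extension, and accepts_keywords is a hypothetical helper.

```python
def accepts_keywords(func, *args, **kwargs):
    """Report whether calling func with the given arguments raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return False
    return True


# len is registered without keyword support, so this call is rejected
print(accepts_keywords(len, obj=[1, 2, 3]))
# sorted accepts the keyword argument reverse
print(accepts_keywords(sorted, [3, 1, 2], reverse=True))
```

A method registered with METH_VARARGS only is rejected in the same way when it is called with keyword arguments.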

Supporting context routines __enter__ and __exit__

Since PEP 343 Python has the useful with statement that allows one to handle context management in a more elegant way without having to remember to close resources one has opened before. Before the with statement existed, the typical pattern was to open a resource and later on close it using the appropriate functions - one had to make sure to close every resource exactly once (also in exception cases, etc.) to prevent opened resources from lingering around and potentially keeping locks, and to avoid logical errors when closing external resources more than once. To solve that problem the with statement has been introduced.

For example the typical usage is:

with Example(...) as ex:
	ex.doWhatever()
	...

As soon as one leaves the with block the context is automatically closed. On the object side this requires two methods - __enter__ and __exit__. When one extends a previously existing object that also supports the open and close semantics one can just use the context management protocol methods to wrap around the open and close methods:

class Example:
	def __init__(self, argument):
		# Do some initialization here ...
		pass

	def open(self):
		# Do some context initialization here ...
		pass

	def close(self):
		# Do some context cleanup here
		pass

	def __enter__(self):
		self.open()
		return self

	def __exit__(self, type, value, traceback):
		self.close()

Since this has been introduced as a language feature later on, there is no special handling of __enter__ and __exit__ on the native extension side. One just registers the methods in the type’s method table and is good to go. The native analogue of a wrapper around open and close might look like the following:

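static PyObject* pyCamera_Enter(
	PyObject* lpSelf,
	PyObject* args
) {
	/* Propagate a pending exception if opening failed */
	PyObject* lpResult = pyCamera_Open(lpSelf, args);
	if(lpResult == NULL) {
		return NULL;
	}
	Py_DECREF(lpResult);

	/* Methods return new references, so self has to be incremented */
	Py_INCREF(lpSelf);
	return lpSelf;
}

static PyObject* pyCamera_Exit(
	PyObject* lpSelf,
	PyObject* args
) {
	PyObject* lpResult = pyCamera_Close(lpSelf, args);
	if(lpResult == NULL) {
		return NULL;
	}
	Py_DECREF(lpResult);

	/* Returning None (a false value) lets exceptions from the with block propagate */
	Py_INCREF(Py_None);
	return Py_None;
}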

static PyMethodDef pyCamera_Methods[] = {
	/* Stripped on webpage example ... */
	{ "__enter__", pyCamera_Enter, METH_VARARGS, "Enter for the context method" },
	{ "__exit__", pyCamera_Exit, METH_VARARGS, "Exit for the context method" },
	/* Stripped on webpage example ... */
	{ NULL, NULL, 0, NULL }
};
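The guarantee that matters for a camera handle is that the close step also runs when the body of the with block raises. The following sketch records the order of events in a hypothetical Example class; no device is involved.

```python
class Example:
    def __init__(self):
        self.events = []

    def open(self):
        self.events.append("open")

    def close(self):
        self.events.append("close")

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        # Returning a false value lets a pending exception propagate
        return False


ex = Example()
try:
    with ex:
        ex.events.append("work")
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    ex.events.append("handled")

print(ex.events)
```

The close event is recorded before the exception reaches the surrounding handler, which is exactly the behavior the native __exit__ wrapper has to provide.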

Custom getters and setters (control over attributes / members)

For some properties we want control over their getting and setting behavior. For example the device name should be immutable as long as the device is opened, but should be modifiable - only to another string - whenever the device is closed. Some of the basic capability information that is exposed should be immutable at all times, so their setter has to raise an exception on any access. To achieve this we have to implement custom getters and setters. This is done by supplying a reference to the members and their respective getters and setters in the tp_getset attribute of the type declaration, which refers to a PyGetSetDef table.

The following example shows how I implemented the getter and setter for the device file name device and the immutable getter for the driver property:

static PyGetSetDef pyCamera_Properties[] = {
	{ "device", (getter)pyCamera_Property_Device_Get, (setter)pyCamera_Property_Device_Set, "Name of the device file (only mutable when closed)", NULL },
	{ "driver", (getter)pyCamera_Property_Driver_Get, (setter)pyCamera_Property_Immutable_Set, "Immutable name of driver attached to the device", NULL},
	/* Stripped on webpage example ... */
	{ NULL }
};

static int pyCamera_Property_Immutable_Set(
	struct pyCamera_Instance* lpSelf,
	PyObject* lpValue,
	void* lpClosure
) {
	PyErr_SetString(PyExc_ValueError, "Property is immutable");
	return -1;
}

static int pyCamera_Property_Device_Set(
	struct pyCamera_Instance* lpSelf,
	PyObject* lpValue,
	void* lpClosure
) {
	if(lpSelf->hHandle != -1) {
		PyErr_SetString(PyExc_ValueError, "Device is opened, cannot set device file name");
		return -1;
	}

	if(lpValue == NULL) {
		PyErr_SetString(PyExc_TypeError, "Device file name has to be a string");
		return -1;
	}
	if(!PyUnicode_Check(lpValue)) {
		PyErr_SetString(PyExc_TypeError, "Device file name has to be a string");
		return -1;
	}
	if(PyUnicode_KIND(lpValue) != PyUnicode_1BYTE_KIND) {
		/* ToDo: Convert if possible ... */
		PyErr_SetString(PyExc_TypeError, "Device file name has to be a string with 1 byte per character (Latin-1)");
		return -1;
	}

	/* For the 1 byte kind the number of characters equals the number of bytes */
	size_t dwNameLen = (size_t)PyUnicode_GetLength(lpValue);
	char* lpNewName = (char*)malloc(sizeof(char) * (dwNameLen + 1));
	if(lpNewName == NULL) { PyErr_SetNone(PyExc_MemoryError); return -1; }

	if(lpSelf->lpDeviceName != NULL) { free(lpSelf->lpDeviceName); lpSelf->lpDeviceName = NULL; }

	strncpy(lpNewName, (const char*)PyUnicode_DATA(lpValue), dwNameLen);
	lpNewName[dwNameLen] = 0;
	lpSelf->lpDeviceName = lpNewName;

	/* Return success */
	return 0;
}

static PyObject* pyCamera_Property_Device_Get(
	struct pyCamera_Instance* lpSelf,
	void* lpClosure
) {
	PyObject* lpDevString = PyUnicode_FromKindAndData(PyUnicode_1BYTE_KIND, lpSelf->lpDeviceName, strlen(lpSelf->lpDeviceName));
	if(lpDevString == NULL) { PyErr_SetNone(PyExc_MemoryError); return NULL; }

	return lpDevString;
}

static PyObject* pyCamera_Property_Driver_Get(
	struct pyCamera_Instance* lpSelf,
	void* lpClosure
) {
	if(lpSelf->hHandle == -1) {
		PyErr_SetString(PyExc_ValueError, "Device not opened, cannot get driver name");
		return NULL;
	}

	if(lpSelf->lpCaps_Driver == NULL) {
		Py_INCREF(Py_None);
		return Py_None;
	}

	PyObject* lpPyStr = PyUnicode_FromKindAndData(PyUnicode_1BYTE_KIND, lpSelf->lpCaps_Driver, strlen(lpSelf->lpCaps_Driver));
	if(lpPyStr == NULL) { PyErr_SetNone(PyExc_MemoryError); return NULL; }
	return lpPyStr;
}

The reference in the Python object type declaration looks like the following:

static PyTypeObject simplepycamType_Camera = {
	PyVarObject_HEAD_INIT(NULL, 0)

	/* Stripped on webpage example ... */

	.tp_getset = pyCamera_Properties,

	/* Stripped on webpage example ... */
};
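The intended behavior of the two properties can be sketched in pure Python with the property decorator. This is only an analogy under a few assumptions: the _opened flag stands in for the device handle, try_set is a hypothetical helper, and a Python property without a setter raises AttributeError whereas the C setter above raises ValueError.

```python
class Camera:
    def __init__(self, device="/dev/video0"):
        self._device = device
        self._opened = False   # stands in for the device handle
        self._driver = None

    @property
    def device(self):
        return self._device

    @device.setter
    def device(self, value):
        if self._opened:
            raise ValueError("Device is opened, cannot set device file name")
        if not isinstance(value, str):
            raise TypeError("Device file name has to be a string")
        self._device = value

    @property
    def driver(self):
        # No setter is defined, so assigning raises AttributeError
        if not self._opened:
            raise ValueError("Device not opened, cannot get driver name")
        return self._driver


def try_set(camera, name, value):
    """Assign an attribute and report the outcome as a string."""
    try:
        setattr(camera, name, value)
    except (AttributeError, TypeError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


cam = Camera()
print(try_set(cam, "device", "/dev/video1"))  # closed: allowed
print(try_set(cam, "device", 42))             # not a string
cam._opened = True
print(try_set(cam, "device", "/dev/video2"))  # opened: rejected
print(try_set(cam, "driver", "other"))        # read-only property
```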

Tuples

In Python a tuple is an immutable list of objects that might even have different types. They’re often used to return multiple values from a method. I’m also using them to handle cropping regions as a (left, top, width, height) tuple or to supply data of the color channels as an (r,g,b) tuple. They can simply be created as a PyObject by PyTuple_New. Entries are set using PyTuple_SetItem and read using PyTuple_GetItem. In addition one might use PyTuple_Check to see if a PyObject is of the tuple type and use PyTuple_Size to determine the size of the tuple. Each member of the tuple also has to be a PyObject of course. PyTuple_SetItem steals the reference to the item - it takes over the caller’s reference and does not increment the count itself. Whoever adds an entry that is still needed elsewhere is responsible for incrementing its reference count before handing it over.

PyObject* lpSampleTuple = PyTuple_New(3);

PyTuple_SetItem(lpSampleTuple, 0, PyLong_FromLong(1));
PyTuple_SetItem(lpSampleTuple, 1, PyLong_FromLong(2));
PyTuple_SetItem(lpSampleTuple, 2, PyLong_FromLong(3));

printf("Size of the tuple: %u\n", (unsigned int)PyTuple_Size(lpSampleTuple));

Py_DECREF(lpSampleTuple);

List

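A list works similarly to a tuple - but by convention should contain only elements of the same type. Lists are also dynamically sized - they can simply grow and have entries removed. Unlike PyList_SetItem, PyList_Append does not steal the reference to the appended item.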

PyObject* lpSampleList = PyList_New(3);

PyList_SetItem(lpSampleList, 0, PyLong_FromLong(1));
PyList_SetItem(lpSampleList, 1, PyLong_FromLong(2));
PyList_SetItem(lpSampleList, 2, PyLong_FromLong(3));

printf("Size of the list: %u\n", (unsigned int)PyList_Size(lpSampleList));

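/* PyList_Append does not steal the reference, so release our own afterwards */
PyObject* lpFourth = PyLong_FromLong(4);
PyList_Append(lpSampleList, lpFourth);
Py_DECREF(lpFourth);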

printf("Size of the list: %u\n", (unsigned int)PyList_Size(lpSampleList));

Py_DECREF(lpSampleList);

Dictionaries

This is the third basic type used often in Python. It’s the basic key value store. One can use any key type that allows one to calculate a hash for its content. They’re created empty using PyDict_New():

PyObject* lpSampleDict = PyDict_New();

Access is done using:

PyDict_SetItem(PyObject *p, PyObject *key, PyObject *val)
PyDict_SetItemString(PyObject *p, const char *key, PyObject *val)
PyDict_DelItem(PyObject *p, PyObject *key)
PyDict_DelItemString(PyObject *p, const char *key)
PyDict_GetItem(PyObject *p, PyObject *key)
PyDict_GetItemWithError(PyObject *p, PyObject *key)
PyDict_GetItemString(PyObject *p, const char *key)

and some other functions. One can look up in the official documentation how to use those.
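For orientation, this is how the three container types described above behave on the Python side of the interface. The values are placeholders; is_hashable is a hypothetical helper that checks whether an object can be used as a dictionary key.

```python
def is_hashable(value):
    """Return True if value can be used as a dictionary key."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


crop = (0, 0, 640, 480)             # tuple: fixed size, immutable
left, top, width, height = crop     # unpacking multiple values

frames = [1, 2, 3]                  # list: can grow and shrink
frames.append(4)
frames.remove(1)

info = {"driver": "example"}        # dict: keys have to be hashable
info[crop] = "full frame"           # a tuple of integers is hashable

print(width, height, frames, info[crop])
print(is_hashable(crop), is_hashable(frames))
```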

Calling back from C to Python callbacks

Since I wanted to implement typical callback based handling of image streaming in addition to a polling interface, one has to implement callbacks from C to a Python callable. This is really simple - from the parameter side Python callables are just PyObject instances, so they can be passed as parameters and properties just as already known. The only difference is that their type implements the tp_call slot. One can check for a callable by using PyCallable_Check. In my case I implemented something that I often do - the ability to pass either a single callable or a list of callables to a property. In the latter case all callbacks will be walked in case of an event.

To call the callable one then simply invokes PyObject_Call. This also transparently handles calling an object’s member functions - the function just accepts the usual parameters for positional and keyword arguments. The positional argument list is a tuple, the keyword argument parameter a dictionary (or NULL). Note that PyDict_SetItem, unlike PyTuple_SetItem, does not steal references. So one can call a callable pretty simply:

	PyObject* lpCallback = ...; /* Comes from somewhere ... */

	if(PyCallable_Check(lpCallback)) {
		/* Build positional argument list */
		PyObject* posArgs = PyTuple_New(2);
		PyTuple_SetItem(posArgs, 0, PyLong_FromLong(1));
		PyTuple_SetItem(posArgs, 1, PyLong_FromLong(2));

		/* Build keyword argument dictionary */
		PyObject* lpNamedKey1 = PyUnicode_FromKindAndData(PyUnicode_1BYTE_KIND, "key1", strlen("key1"));
		PyObject* lpNamedKey2 = PyUnicode_FromKindAndData(PyUnicode_1BYTE_KIND, "key2", strlen("key2"));

		PyObject* kwArgs = PyDict_New();

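		/* PyDict_SetItem does not steal references, so release our own afterwards */
		PyObject* lpValue1 = PyLong_FromLong(1);
		PyObject* lpValue2 = PyLong_FromLong(2);
		PyDict_SetItem(kwArgs, lpNamedKey1, lpValue1);
		PyDict_SetItem(kwArgs, lpNamedKey2, lpValue2);
		Py_DECREF(lpValue1);
		Py_DECREF(lpValue2);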

		/* Method call ... */
		PyObject* lpResult = PyObject_Call(lpCallback, posArgs, kwArgs);

		Py_DECREF(lpNamedKey1);
		Py_DECREF(lpNamedKey2);
		Py_DECREF(posArgs);
		Py_DECREF(kwArgs);

		if(lpResult == NULL) {
			/* An exception has been raised */
		} else {
			Py_DECREF(lpResult);
		}
	}
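The "single callable or list of callables" handling can be sketched in pure Python as follows. The helper names are hypothetical, the (camera, frame) signature follows the callback used in the usage examples of this article, and the rule that a callback returning False asks the stream to stop is an assumption derived from those examples, not a statement about the actual C implementation.

```python
def normalize_callbacks(value):
    """Accept a single callable or an iterable of callables."""
    if callable(value):
        return [value]
    callbacks = list(value)
    for callback in callbacks:
        if not callable(callback):
            raise TypeError("every callback has to be callable")
    return callbacks


def dispatch_frame(callbacks, camera, frame):
    """Call every callback; return False if any of them asked to stop."""
    keep_streaming = True
    for callback in normalize_callbacks(callbacks):
        if callback(camera, frame) is False:
            keep_streaming = False
    return keep_streaming


seen = []


def remember(camera, frame):
    seen.append(frame)
    return True


def stop(camera, frame):
    return False


print(dispatch_frame(remember, None, "frame 0"))
print(dispatch_frame([remember, stop], None, "frame 1"))
print(seen)
```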

The behavior implemented

When one creates a new Camera instance one passes the name of the camera to the __init__ function. The __init__ method will not open the device handle, won’t query capabilities and won’t start streaming or capturing.

The next step when using the camera will be to open the previously specified device file. This will also query the capabilities of the camera, the supported formats and set a default format and cropping region. Before one starts streaming one can of course change the format and cropping region again. The calls to open and close will be automated by the context methods __enter__ and __exit__ - one should not mix both.

To process frames two approaches are supported:

One can either supply a single callback or a list of callbacks and then use the stream method to deliver frames one after each other
Or one can start streaming using streamOn, process one frame after each other using the nextFrame method and in the end terminate the stream using streamOff.

Whenever a frame is ready and available for the Python code it gets passed to the callback or returned from nextFrame - note that this works in the foreground thread and not in a separate background thread or process, due to the usage that this library has been targeted at (testing and demonstrating algorithms, etc.). For each frame that arrives from the camera a ProcessFrame method is called - this converts the raw data into a doubly nested Python list in the selected output format. At the time of writing the code in the GitHub repository assumes that one has a YUV 4:2:2 input frame and wants to produce an RGB output image - a version that supports VUY and RGB input frames as well as RGB, YUV and grayscale output frames will be added in the near future. The library won’t support compressed formats since it would be too much work to implement self contained decompression code for motion JPEG without adding an external dependency like the excellent libjpeg, which would violate the no external dependencies constraint for this project.
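To illustrate what such a conversion step involves, here is a pure-Python sketch that turns a packed YUV 4:2:2 buffer into a nested list of (r, g, b) tuples. It rests on several assumptions that are not taken from the repository: the byte order Y0 U Y1 V (often called YUYV), full range BT.601 coefficients, and the function names themselves.

```python
def clamp(value):
    """Round to the nearest integer and limit the result to 0..255."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y, u, v):
    """Convert one pixel using full range BT.601 coefficients (assumed)."""
    d = u - 128
    e = v - 128
    return (
        clamp(y + 1.402 * e),
        clamp(y - 0.344136 * d - 0.714136 * e),
        clamp(y + 1.772 * d),
    )


def yuyv_to_rows(raw, width, height):
    """Convert packed Y0 U Y1 V data into rows[row][col] = (r, g, b)."""
    if width % 2 != 0:
        raise ValueError("width has to be even for 4:2:2 data")
    if len(raw) != width * height * 2:
        raise ValueError("buffer size does not match the frame size")

    rows = []
    for row in range(height):
        base = row * width * 2
        line = []
        # Four bytes describe two neighbouring pixels that share U and V
        for offset in range(0, width * 2, 4):
            y0, u, y1, v = raw[base + offset:base + offset + 4]
            line.append(yuv_to_rgb(y0, u, v))
            line.append(yuv_to_rgb(y1, u, v))
        rows.append(line)
    return rows


gray = bytes([128, 128, 128, 128])
print(yuyv_to_rows(gray, 2, 1))
```

Doing this per pixel in Python is slow, which is one reason to perform the conversion in C and hand only the finished list to the interpreter.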

The code

The code of my simple C extension for Python is available on GitHub. It of course covers a little bit more functionality than described in this blog article but is still a pretty small and self contained extension.

It can be used to capture frames from V4L devices in a simple way and then manipulate them using libraries such as Pillow or Matplotlib. For example to show a captured frame using matplotlib:

import matplotlib.pyplot as plt
import simplepycam

with simplepycam.Camera("/dev/video0") as cam:
	cam.streamOn()
	frame = cam.nextFrame()
	cam.streamOff()
	plt.imshow(frame)
	plt.show()

or to store a sequence of frames using Pillow, either with the callback method:

from PIL import Image
import simplepycam

nStoredImages = 0
nImagesToStore = 100

def cbStoreImage(camera, frame):
	global nStoredImages
	print("Storing image {} of {}".format(nStoredImages, nImagesToStore))

	# Transform frame into flatter array so PIL is able to process it
	newPilImage = []
	for row in range(len(frame)):
		for col in range(len(frame[row])):
			newPilImage.append(frame[row][col])

	im = Image.new(mode="RGB", size=(len(frame[0]), len(frame)))
	im.putdata(newPilImage)
	im.save("image{}.jpg".format(nStoredImages))

	nStoredImages = nStoredImages + 1
	if(nStoredImages >= nImagesToStore):
		return False

	return True

with simplepycam.Camera("/dev/video0") as cam:
	cam.frameCallback = [ cbStoreImage ]
	cam.stream()

or one might also use the polling API:

from PIL import Image
import simplepycam

nImagesToStore = 5

with simplepycam.Camera("/dev/video0") as cam:
	cam.streamOn()

	for i in range(nImagesToStore):
		frame = cam.nextFrame()
		print("Storing image {} of {}".format(i, nImagesToStore))

		newPilImage = []
		for row in range(len(frame)):
			for col in range(len(frame[row])):
				newPilImage.append(frame[row][col])
		im = Image.new(mode="RGB", size=(len(frame[0]), len(frame)))
		im.putdata(newPilImage)
		im.save("image{}.jpg".format(i))

	cam.streamOff()
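The nested loops that flatten a frame in both examples can be factored into a small helper. The following sketch assumes that a frame is a list of rows and that each row is a list of pixel tuples, as in the examples above; flatten_frame is a hypothetical helper and the tiny frame is made up for illustration so that neither a camera nor Pillow is required to try it.

```python
from itertools import chain


def flatten_frame(frame):
    """Turn frame[row][col] pixels into (width, height, flat pixel list)."""
    height = len(frame)
    width = len(frame[0]) if height else 0
    return width, height, list(chain.from_iterable(frame))


# A made-up 2x2 frame
fake_frame = [
    [(0, 0, 0), (255, 0, 0)],
    [(0, 255, 0), (0, 0, 255)],
]

width, height, pixels = flatten_frame(fake_frame)
print(width, height, pixels)
```

The returned size and pixel list can be passed to Image.new and putdata in the same way as in the two storing examples.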

Usually I prefer using the callback API since this allows one to attach an arbitrary number of callbacks (like processing and visualization). On the other hand this gets really handy when doing things in a proper way with multiple threads or on a distributed system when one builds a media processing graph - which was not the task of this small project.