Blog Archives
A PyQt widget for OpenCV camera preview
This is a hands-on post where I’ll show how to create a PyQt widget for previewing frames captured from a camera using OpenCV. Along the way, it’ll become clear how to use OpenCV images with PyQt. I won’t explain what PyQt and OpenCV are because if you don’t know them yet, you probably don’t need this and this post is not for you =P.
The main reason for integrating PyQt and OpenCV is to provide a more sophisticated UI for applications that use OpenCV as the basis for computer vision tasks. In my specific case, I needed it for the facial recognition prototype used in the interview for Globo TV I posted last week. I usually don’t write about such specific things, but there’s little information about this on the net, so I’d like to contribute by sharing my findings.
The first problem to be solved is to show a cv.iplimage (the type of an OpenCV image in Python) object in a generic PyQt widget. This is easily solved by inheriting from QtGui.QImage this way:
import cv
from PyQt4 import QtGui

class OpenCVQImage(QtGui.QImage):

    def __init__(self, opencvBgrImg):
        depth, nChannels = opencvBgrImg.depth, opencvBgrImg.nChannels
        if depth != cv.IPL_DEPTH_8U or nChannels != 3:
            raise ValueError("the input image must be 8-bit, 3-channel")
        w, h = cv.GetSize(opencvBgrImg)
        opencvRgbImg = cv.CreateImage((w, h), depth, nChannels)
        # it's assumed the image is in BGR format
        cv.CvtColor(opencvBgrImg, opencvRgbImg, cv.CV_BGR2RGB)
        self._imgData = opencvRgbImg.tostring()
        super(OpenCVQImage, self).__init__(self._imgData, w, h,
                                           QtGui.QImage.Format_RGB888)
The important part is the last three statements of __init__:
- The cv.CvtColor call converts the image from BGR to RGB format. OpenCV images loaded from files or queried from the camera are often (not always; adapt it to your scenario) delivered in BGR format, which is not what PyQt expects. Thus, I convert the image to RGB before going on. What happens if you skip this step? Well… nothing critical… the image will be shown with the R and B channels swapped.
- The assignment to self._imgData saves a reference to the byte content of opencvRgbImg to prevent the garbage collector from deleting it when __init__ returns. This is very important, because QImage does not copy the buffer it is given.
- The final statement calls the QtGui.QImage base class constructor, passing the byte content, dimensions and format of the image.
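To make the channel swap concrete without needing OpenCV at hand, here is a minimal pure-Python sketch of what a BGR to RGB conversion does to packed 8-bit, 3-channel pixel data. The function name bgr_to_rgb is made up for this illustration; it is not part of OpenCV or PyQt.

```python
def bgr_to_rgb(bgr_bytes):
    """Swap the first and third byte of every 3-byte pixel (BGR -> RGB)."""
    if len(bgr_bytes) % 3 != 0:
        raise ValueError("expected packed 3-channel, 8-bit pixel data")
    rgb = bytearray(bgr_bytes)
    # Extended slices select every third byte: offset 0 and offset 2 of a pixel.
    rgb[0::3] = bgr_bytes[2::3]
    rgb[2::3] = bgr_bytes[0::3]
    return bytes(rgb)

# Two pixels in BGR order: pure blue, then pure red.
pixels_bgr = bytes([255, 0, 0, 0, 0, 255])
pixels_rgb = bgr_to_rgb(pixels_bgr)
```

Applying the swap twice gives back the original data, which is why skipping the conversion only flips R and B instead of corrupting the image.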
Done! If all you want is to show an OpenCV image in a PyQt widget, that’s all you need. However, for a camera preview it’s better to have a more convenient and complete API. Specifically, it must be straightforward to connect a camera device to a widget and still keep them decoupled. There should be one camera device, which can be seen as a frame provider, serving many widgets.
The frames must also be manipulated independently by the widgets, i.e., the widgets must have complete control over the frames delivered to them.
The camera device can be encapsulated like this:
import cv
from PyQt4 import QtCore

class CameraDevice(QtCore.QObject):

    _DEFAULT_FPS = 30

    newFrame = QtCore.pyqtSignal(cv.iplimage)

    def __init__(self, cameraId=0, mirrored=False, parent=None):
        super(CameraDevice, self).__init__(parent)
        self.mirrored = mirrored
        self._cameraDevice = cv.CaptureFromCAM(cameraId)
        # the timer drives the frame queries, one per interval
        self._timer = QtCore.QTimer(self)
        self._timer.timeout.connect(self._queryFrame)
        self._timer.setInterval(1000/self.fps)
        self.paused = False

    @QtCore.pyqtSlot()
    def _queryFrame(self):
        frame = cv.QueryFrame(self._cameraDevice)
        if self.mirrored:
            mirroredFrame = cv.CreateImage(cv.GetSize(frame), frame.depth,
                                           frame.nChannels)
            cv.Flip(frame, mirroredFrame, 1)
            frame = mirroredFrame
        self.newFrame.emit(frame)

    @property
    def paused(self):
        return not self._timer.isActive()

    @paused.setter
    def paused(self, p):
        if p:
            self._timer.stop()
        else:
            self._timer.start()

    @property
    def frameSize(self):
        w = cv.GetCaptureProperty(self._cameraDevice,
                                  cv.CV_CAP_PROP_FRAME_WIDTH)
        h = cv.GetCaptureProperty(self._cameraDevice,
                                  cv.CV_CAP_PROP_FRAME_HEIGHT)
        return int(w), int(h)

    @property
    def fps(self):
        fps = int(cv.GetCaptureProperty(self._cameraDevice, cv.CV_CAP_PROP_FPS))
        if not fps > 0:
            fps = self._DEFAULT_FPS
        return fps
I won’t go through all this code because it’s not complex. Essentially, it uses a timer (with the interval derived from the fps; see the timer setup in __init__) to query the camera for a new frame, and it emits a signal passing the captured frame as a parameter (see _queryFrame). The timer is important to avoid spending CPU time on unnecessary polling. The rest is just bureaucracy.
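The interval computation is the only arithmetic in there, so here is a small standalone sketch of it. The helper name timer_interval_ms is hypothetical; it only mirrors the fps fallback and the 1000/fps division used by CameraDevice, assuming the capture may report zero or a negative value when the frame rate is unknown.

```python
DEFAULT_FPS = 30  # same fallback value as CameraDevice._DEFAULT_FPS

def timer_interval_ms(reported_fps, default_fps=DEFAULT_FPS):
    """Return a whole number of milliseconds between two frame queries."""
    fps = int(reported_fps)
    if fps <= 0:
        # Frame rate unknown: fall back to the default.
        fps = default_fps
    # Floor division keeps the result an int, which is what a timer expects.
    return 1000 // fps

interval_known = timer_interval_ms(25.0)    # camera reports 25 fps
interval_unknown = timer_interval_ms(0.0)   # camera reports nothing useful
```

The explicit floor division matters if you port the class to Python 3, where 1000/fps would produce a float.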
Now, let’s see the camera widget itself. The main purpose of it is to draw the frames delivered by the camera device. But, before drawing a frame, it must allow anyone interested to process it, changing it if necessary without interfering with any other camera widget. Here’s the code:
import cv
from PyQt4 import QtCore
from PyQt4 import QtGui

class CameraWidget(QtGui.QWidget):

    newFrame = QtCore.pyqtSignal(cv.iplimage)

    def __init__(self, cameraDevice, parent=None):
        super(CameraWidget, self).__init__(parent)
        self._frame = None
        self._cameraDevice = cameraDevice
        self._cameraDevice.newFrame.connect(self._onNewFrame)
        w, h = self._cameraDevice.frameSize
        self.setMinimumSize(w, h)
        self.setMaximumSize(w, h)

    @QtCore.pyqtSlot(cv.iplimage)
    def _onNewFrame(self, frame):
        # keep a private copy so other widgets are not affected
        self._frame = cv.CloneImage(frame)
        self.newFrame.emit(self._frame)
        self.update()

    def changeEvent(self, e):
        if e.type() == QtCore.QEvent.EnabledChange:
            if self.isEnabled():
                self._cameraDevice.newFrame.connect(self._onNewFrame)
            else:
                self._cameraDevice.newFrame.disconnect(self._onNewFrame)

    def paintEvent(self, e):
        if self._frame is None:
            return
        painter = QtGui.QPainter(self)
        painter.drawImage(QtCore.QPoint(0, 0), OpenCVQImage(self._frame))
Again… I won’t go through all this. The really important stuff is in the _onNewFrame slot and the paintEvent method. As stated before, it’s paramount that the widgets sharing a camera device don’t interfere with each other. In this case, all widgets receive the same frame, which is in fact a reference to the same memory location. This means that if a widget modifies a frame, the others will see it. Clearly, this is not desirable. So, every widget saves its own copy of the frame (the cv.CloneImage call). This way, they can do whatever they want safely. However, processing the frame is not the widget’s responsibility. Thus, it emits a signal with the saved frame as a parameter (the newFrame.emit call) and anyone connected to it can do the hard work.
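If the aliasing problem sounds abstract, this toy sketch reproduces it without Qt or OpenCV. Everything in it is a stand-in invented for the illustration: Signal is a bare-bones imitation of a Qt signal, FrameConsumer plays the role of CameraWidget, a list of numbers plays the role of a frame, and copy.deepcopy replaces cv.CloneImage.

```python
import copy

class Signal:
    """Tiny stand-in for a Qt signal: calls every connected slot in order."""
    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def emit(self, *args):
        for slot in self._slots:
            slot(*args)

class FrameConsumer:
    """Plays the role of CameraWidget: keeps a private copy of each frame."""
    def __init__(self, source):
        self.frame = None
        self.newFrame = Signal()
        source.connect(self._on_new_frame)

    def _on_new_frame(self, frame):
        self.frame = copy.deepcopy(frame)   # like cv.CloneImage(frame)
        self.newFrame.emit(self.frame)      # listeners may modify the copy

def paint_first_pixel(frame):
    """A pretend frame processor that modifies the frame in place."""
    frame[0] = 255

camera = Signal()
first, second = FrameConsumer(camera), FrameConsumer(camera)
first.newFrame.connect(paint_first_pixel)   # only the first consumer processes

shared = [10, 20, 30]                       # one frame delivered to both
camera.emit(shared)
```

After the emit, first.frame has been modified while second.frame and the shared frame are untouched. Drop the deepcopy and all three would show the change.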
It remains to draw the frame, which is done by overriding the paintEvent method of the QtGui.QWidget class. The relevant calls are self.update() and painter.drawImage(): the former schedules a paint event and the latter actually draws the frame when the paint event occurs (using OpenCVQImage, as you can see).
The following snippet shows how to use all this together:
def _main():

    @QtCore.pyqtSlot(cv.iplimage)
    def onNewFrame(frame):
        cv.CvtColor(frame, frame, cv.CV_RGB2BGR)
        msg = "processed frame"
        font = cv.InitFont(cv.CV_FONT_HERSHEY_DUPLEX, 1.0, 1.0)
        tsize, baseline = cv.GetTextSize(msg, font)
        w, h = cv.GetSize(frame)
        tpt = (w - tsize[0]) / 2, (h - tsize[1]) / 2
        cv.PutText(frame, msg, tpt, font, cv.RGB(255, 0, 0))

    import sys

    app = QtGui.QApplication(sys.argv)

    cameraDevice = CameraDevice(mirrored=True)

    cameraWidget1 = CameraWidget(cameraDevice)
    cameraWidget1.newFrame.connect(onNewFrame)
    cameraWidget1.show()

    cameraWidget2 = CameraWidget(cameraDevice)
    cameraWidget2.show()

    sys.exit(app.exec_())
Note that two CameraWidget objects share the same CameraDevice, but only the first one processes the frames (through the onNewFrame slot connected to cameraWidget1.newFrame). The result is two widgets showing different images derived from the same frame, as expected.
Now you can embed CameraWidget in a PyQt application and have a fresh OpenCV camera preview. Cool huh? I hope you enjoyed it =P.
Until next…
A very lightweight plug-in infrastructure in Python
For some applications, run-time extensibility is a major requirement. There are lots of examples out there: browsers, media players, photo editors, etc. All of this software can be easily extended with new functionality using plug-ins. How is this done?
It seems like complex stuff. Indeed, it really is, especially when you are using a bureaucratic language like Java or digging into the low level with C. However, when there are no security concerns, the extensions are of limited scope, and a language with great introspection power like Python is being used, this can be a piece of cake =P.
Let’s see… suppose the plug-ins provide their services by means of the following contract interface:
class Plugin(object):

    def setup(self):
        raise NotImplementedError

    def teardown(self):
        raise NotImplementedError

    def run(self, *args, **kwargs):
        raise NotImplementedError
Given this, a basic plug-in infrastructure should have as features:
- A way to auto-discover subclasses of Plugin on demand at run-time
- A centralized way to access these subclasses
Thanks to the black magic of Python metaclasses (I’m assuming you are familiar with them; otherwise, see this excellent SO discussion), it’s very simple to implement those features:
class Plugin(object):

    class __metaclass__(type):

        def __init__(cls, name, base, attrs):
            if not hasattr(cls, 'registered'):
                # Plugin itself is being created: start the registry
                cls.registered = []
            else:
                # a subclass is being created: record it
                cls.registered.append(cls)

    ...
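A word of caution if you are on Python 3: the nested __metaclass__ class is Python 2 syntax, and Python 3 no longer treats it as a metaclass. As a rough equivalent, and not the approach used in this post, the same registration can be sketched with the __init_subclass__ hook (Python 3.6 or later):

```python
class Plugin:
    """Base class; every subclass is recorded in Plugin.registered."""
    registered = []

    def __init_subclass__(cls, **kwargs):
        # Called automatically each time a subclass of Plugin is defined.
        super().__init_subclass__(**kwargs)
        Plugin.registered.append(cls)

class SamplePlugin1(Plugin):
    pass

class SamplePlugin2(Plugin):
    pass

registered_names = [cls.__name__ for cls in Plugin.registered]
```

Just like with the metaclass, the base class itself is never added to the registry, only its subclasses.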
Now, every time a subclass of Plugin is defined, it is added to Plugin.registered so that there’s a centralized way to access the plug-ins. But the problem of auto-discovery still remains, because a plug-in class must be defined for the metaclass trick to work, which requires importing the modules that contain the plug-in class definitions. However, this is easy to fix:
import imp
import logging
import pkgutil

class Plugin(object):

    class __metaclass__(type):
        ...

    @classmethod
    def load(cls, *paths):
        paths = list(paths)
        cls.registered = []
        for _, name, _ in pkgutil.iter_modules(paths):
            fid, pathname, desc = imp.find_module(name, paths)
            try:
                imp.load_module(name, fid, pathname, desc)
            except Exception as e:
                logging.warning("could not load plugin module '%s': %s",
                                pathname, e)
            finally:
                # find_module returns no file object for packages
                if fid:
                    fid.close()

    ...
The class method load forces the import of every module found in a list of paths. Consequently, an explicit import is not needed in order to discover the plug-ins, making the application itself fully decoupled from them.
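The imp module used above is deprecated and was removed in Python 3.12, so here is a hedged sketch of the same forced-import idea with importlib. The function name load_modules is made up for this example, and it returns the loaded modules instead of touching a registry; it does not add them to sys.modules, which is a simplification you may not want in a real application.

```python
import importlib.util
import logging
import os
import pkgutil
import tempfile

def load_modules(*paths):
    """Import every top-level module found in paths; return those that loaded."""
    loaded = []
    for finder, name, _ispkg in pkgutil.iter_modules(list(paths)):
        spec = finder.find_spec(name)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(module)   # runs the module's code
        except Exception as exc:
            logging.warning("could not load plugin module %r: %s", name, exc)
            continue
        loaded.append(module)
    return loaded

# Demo: a throwaway plug-in directory with one good and one broken module.
with tempfile.TemporaryDirectory() as plugin_dir:
    with open(os.path.join(plugin_dir, "good.py"), "w") as f:
        f.write("VALUE = 42\n")
    with open(os.path.join(plugin_dir, "broken.py"), "w") as f:
        f.write("raise RuntimeError('boom')\n")
    loaded = load_modules(plugin_dir)

loaded_names = [module.__name__ for module in loaded]
```

The broken module only produces a warning, and the good one is still loaded, which is the same behavior as the try/except in the load method.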
As a usage example, if you had defined subclasses SamplePlugin1 and SamplePlugin2 of Plugin in some module located at "./plugins/", you could access them this way:
>>> Plugin.load("plugins/")
>>> Plugin.registered
[<class 'SamplePlugin1'>, <class 'SamplePlugin2'>]
Of course, this is extremely simple. There’s no sandbox (which implies security issues) and the plug-ins are passive (the application calls their methods, instead of them calling methods of a plug-in API). However, for many programs this is enough and anything more complex would be over-engineering.
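To show what "passive" means in practice, here is one possible way an application could drive the registered plug-ins through the setup/run/teardown contract. Both run_plugins and the Shout plug-in are hypothetical names for this sketch; the post itself does not prescribe how the application calls the plug-ins.

```python
def run_plugins(plugin_classes, *args, **kwargs):
    """Instantiate each plug-in class and drive setup -> run -> teardown."""
    results = {}
    for plugin_class in plugin_classes:
        plugin = plugin_class()
        plugin.setup()
        try:
            results[plugin_class.__name__] = plugin.run(*args, **kwargs)
        finally:
            plugin.teardown()   # always undo what setup() did
    return results

calls = []   # records the order in which the contract methods are called

class Shout:
    """Hypothetical plug-in used only for this illustration."""
    def setup(self):
        calls.append("setup")

    def run(self, text):
        calls.append("run")
        return text.upper()

    def teardown(self):
        calls.append("teardown")

results = run_plugins([Shout], "hello")
```

In the real thing you would pass Plugin.registered instead of a hand-written list.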
That’s it. This is a common problem in software engineering, so I hope this is useful. =)
Controlling FPU rounding modes with Python
Hey folks! In the previous post I talked a little about the theory behind the floating-point arithmetic and rounding methods. This time I’ll try a more practical approach showing how to control the rounding modes that FPUs (Floating-Point Units) employ to perform their operations. For that I’ll use Python. But, first let me introduce the <fenv.h> header of the ANSI C99 standard.
The ANSI C99 standard establishes the <fenv.h> header to control the floating-point environment of the machine. This header defines many functions and constants for configuring the properties of the floating-point system (including rounding modes). In our case, the important definitions are:
Functions
- int fegetround() – Returns the current rounding mode
- int fesetround(int mode) – Sets the rounding mode, returning 0 if all went ok and some other value otherwise
Constants
- FE_TOWARDZERO – Flag for round to zero
- FE_DOWNWARD – Flag for round toward minus infinity
- FE_UPWARD – Flag for round toward plus infinity
- FE_TONEAREST – Flag for round to nearest
Unfortunately, not all compilers fully implement the ANSI C99 standard. Therefore, there is no guarantee of portability for code that uses the <fenv.h> header. With this caveat in mind, GCC does support it, which makes it possible to build a Python extension wrapping the features of <fenv.h>. However, this is not a smart way to reach our goal since it would force users to compile the source code of the extension. An alternative is to use the ctypes Python module to load libm (the math part of the standard C library used by GCC, typically glibc, which implements <fenv.h>):
from ctypes import cdll
from ctypes.util import find_library
libm = cdll.LoadLibrary(find_library('m'))
The constants that identify the rounding modes are usually defined as macros in <fenv.h> and, therefore, are not accessible via ctypes. We need to redefine them in Python. The problem is that they vary according to the processor, so it is necessary to build some logic on top of the result of the function platform.processor(). Right now, however, I only have an x86 processor to test with, so I will use the constants for it only:
FE_TOWARDZERO = 0xc00
FE_DOWNWARD = 0x400
FE_UPWARD = 0x800
FE_TONEAREST = 0
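The per-processor logic could look roughly like the sketch below. It is only a skeleton: it knows the x86 values listed above and refuses to guess for anything else. Note that it keys on platform.machine() rather than platform.processor(), which is my own substitution because processor() may return an empty or free-form string on some systems. The function name and the list of x86 aliases are assumptions too.

```python
import platform

# Values for x86 / x86-64 only (the ones used in this post).
_X86_ROUNDING = {
    "FE_TONEAREST": 0x000,
    "FE_DOWNWARD": 0x400,
    "FE_UPWARD": 0x800,
    "FE_TOWARDZERO": 0xC00,
}
# Assumed spellings under which x86 machines identify themselves.
_X86_NAMES = {"x86_64", "amd64", "i386", "i486", "i586", "i686", "x86"}

def rounding_constants(machine=None):
    """Return the <fenv.h> rounding constants for the given architecture."""
    machine = (machine or platform.machine()).lower()
    if machine in _X86_NAMES:
        return dict(_X86_ROUNDING)
    # Unknown architecture: fail loudly instead of using wrong bit patterns.
    raise NotImplementedError("no rounding constants known for %r" % machine)

x86_constants = rounding_constants("x86_64")
```

Supporting another architecture then means adding one more table taken from that platform's <fenv.h>.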
Now, just call the appropriate functions:
>>> back_rounding_mode = libm.fegetround()
>>> libm.fesetround(FE_DOWNWARD)
0
>>> 1.0/10.0
0.099999999999999992
>>> libm.fesetround(FE_UPWARD)
0
>>> 1.0/10.0
0.10000000000000001
>>> libm.fesetround(back_rounding_mode)
But what about MS Visual Studio? Well… the standard C library of this compiler, msvcrt (MS Visual Studio C Run-Time Library), does not support <fenv.h>. Nonetheless, MS Visual Studio implements a set of non-standard (quite typical…) constants and functions specialized in manipulating the properties of the machine FPU. These constants and functions are defined in the <float.h> header distributed with that compiler. The following definitions are of particular interest to us:
Functions
- unsigned int _controlfp(unsigned int new, unsigned int mask) – Sets/gets the control vector of the floating-point system (more information here)
Constants
- _MCW_RC = 0x300 – Control vector mask for information about rounding modes
- _RC_CHOP = 0x300 – Control vector value for round to zero mode
- _RC_UP = 0x200 – Control vector value for round toward plus infinity mode
- _RC_DOWN = 0x100 – Control vector value for round toward minus infinity mode
- _RC_NEAR = 0 – Control vector value for round to nearest mode
Analogous to the previous case:
>>> _MCW_RC = 0x300
>>> _RC_UP = 0x200
>>> _RC_DOWN = 0x100
>>> from ctypes import cdll
>>> msvcrt = cdll.msvcrt
>>> back_rounding_mode = msvcrt._controlfp(0, 0)
>>> msvcrt._controlfp(_RC_DOWN, _MCW_RC)
590111
>>> 1.0/10.0
0.099999999999999992
>>> msvcrt._controlfp(_RC_UP, _MCW_RC)
590367
>>> 1.0/10.0
0.10000000000000001
>>> msvcrt._controlfp(back_rounding_mode, _MCW_RC)
589855
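The big numbers printed by _controlfp are the whole control vector, not just the rounding mode. Masking them with _MCW_RC isolates the rounding field, as this small sketch shows. The helper names rounding_field and rounding_name are made up; the constants are the ones listed above.

```python
_MCW_RC = 0x300   # mask selecting the rounding-control bits

_RC_NAMES = {
    0x000: "_RC_NEAR",
    0x100: "_RC_DOWN",
    0x200: "_RC_UP",
    0x300: "_RC_CHOP",
}

def rounding_field(control_word):
    """Keep only the rounding-control bits of a control vector."""
    return control_word & _MCW_RC

def rounding_name(control_word):
    """Name of the rounding mode encoded in a control vector."""
    return _RC_NAMES[rounding_field(control_word)]

# The three values returned in the session above.
decoded = [rounding_name(word) for word in (590111, 590367, 589855)]
# decoded == ['_RC_DOWN', '_RC_UP', '_RC_NEAR']
```

In hexadecimal the three values are 0x9011f, 0x9021f and 0x9001f, so only the rounding field changes between the calls.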
For sure, it is not convenient to type all this every time you want to change the rounding mode. Nor is it appropriate to guess the standard C library of the system. But you can always create functions encapsulating this behavior. Something like this:
def _start_libm():
    global TO_ZERO, TOWARD_MINUS_INF, TOWARD_PLUS_INF, TO_NEAREST
    global set_rounding, get_rounding
    from ctypes import cdll
    from ctypes.util import find_library
    libm = cdll.LoadLibrary(find_library('m'))
    set_rounding, get_rounding = libm.fesetround, libm.fegetround
    # x86
    TO_ZERO = 0xc00
    TOWARD_MINUS_INF = 0x400
    TOWARD_PLUS_INF = 0x800
    TO_NEAREST = 0

def _start_msvcrt():
    global TO_ZERO, TOWARD_MINUS_INF, TOWARD_PLUS_INF, TO_NEAREST
    global set_rounding, get_rounding
    from ctypes import cdll
    msvcrt = cdll.msvcrt
    # 0x300 is _MCW_RC, the rounding-control mask
    set_rounding = lambda mode: msvcrt._controlfp(mode, 0x300)
    # mask the control vector so that only the rounding mode is returned
    get_rounding = lambda: msvcrt._controlfp(0, 0) & 0x300
    TO_ZERO = 0x300
    TOWARD_MINUS_INF = 0x100
    TOWARD_PLUS_INF = 0x200
    TO_NEAREST = 0

# try each C library in turn; the first one that loads wins
for _start_rounding in _start_libm, _start_msvcrt:
    try:
        _start_rounding()
        break
    except Exception:
        pass
else:
    print("ERROR: You couldn't start the FPU module")
In this case, the constants and functions to be called, as well as the standard C library instantiated, are abstracted. Just use the constants TO_ZERO, TOWARD_MINUS_INF, TOWARD_PLUS_INF, TO_NEAREST and the functions set_rounding, get_rounding.
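One convenience you may want on top of this is restoring the previous mode automatically. Below is a sketch of a context manager that does so. It takes the getter and setter as arguments, so it works with whichever pair of functions was set up above; here it is exercised with a FakeFPU class, invented for the example so that the snippet runs without touching the real FPU.

```python
from contextlib import contextmanager

@contextmanager
def rounding(mode, get_rounding, set_rounding):
    """Switch to the given rounding mode and restore the previous one on exit."""
    previous = get_rounding()
    set_rounding(mode)
    try:
        yield
    finally:
        set_rounding(previous)   # runs even if the body raises

class FakeFPU:
    """Stand-in for the real get/set functions, used only in this demo."""
    def __init__(self):
        self.mode = 0

    def get(self):
        return self.mode

    def set(self, mode):
        self.mode = mode
        return 0

fpu = FakeFPU()
with rounding(0x400, fpu.get, fpu.set):
    mode_inside = fpu.mode    # 0x400 while the block is running
mode_after = fpu.mode         # back to 0 afterwards
```

With the real functions this would read: with rounding(TOWARD_MINUS_INF, get_rounding, set_rounding): followed by the computation.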
So, that’s it… I hope this has been useful to you somehow (I doubt it… :P) or, at least, that it was interesting… Until next time…


