Blog Archives
Aspect-like abstractions and Python metaclasses
Posted by Rafael Barreto
If you follow this blog you know I enjoy Python. Many of my posts are Python related and this is a trend here. However, I’m not one of those fanboys that talk about technologies as religions. Most of the time I prefer to work on more dense and formal computing themes, like Machine Learning, and do not care much about specific technologies (despite being needed tools, they are ephemeral). And this is the very reason I REALLY enjoy Python =). It seems confusing, but makes complete sense to me. Because I like formal stuff, I prefer abstract to concrete. And Python is one of the few languages I know in which I can remain at a high level of abstract thinking while building concrete real world stuff. Let me demonstrate this abstraction power in a practical use case.
Consider the plug-in infrastructure I described in this previous post. Although extremely simple, that general design is good enough for many scenarios. It defines a common interface and employs a little of meta-programming to auto-discover the plug-ins. However, it has a serious flaw. Let’s look at the following code snippet:
class Troublemaker(Plugin):
def setup(self):
raise RuntimeError("I'm a troublemaker and I'll not let you set me up.")
def teardown(self):
raise RuntimeError("I'm a troublemaker and I'll not let you tear me down.")
def run(self, *args, **kwards):
raise RuntimeError("I'm a troublemaker and I'll not let you run me.")
From the designed infrastructure point of view, this is a completely fine plug-in and will be discovered and loaded without problems. So, what’s wrong? Well… the Troublemaker methods are supposed to be called many times in a couple of different points by the main application. In order to avoid program crashes, code for exception handling and logging (for plug-in debug purposes) must be inserted in ALL THESE POINTS. It’s not difficult to see this doesn’t scale up. Imagine if the plug-ins common interface had 10 methods… what a hell hum?
The problem here is a fundamental one. Plug-ins must obey a simple rule which wasn’t addressed in the designed infrastructure:
A plug-in should be hermetic. Exceptions should not leak and all abnormal situations should be logged.
But how this requirement can be enforced by the design? There must be some kind of abstraction barrier between the plug-ins methods and the callers taking responsibility for handling and logging any exception, no matter where and when they occur. Does it smell like a typical Aspect Oriented Programming (AOP) problem domain to you? The cross-cutting concerns are the exceptions handling and logging, which are spread across the main application. We want to isolate these concerns, so… yes, this sounds like AOP to me and that’s why it’s difficult to imagine a clean and low invasive solution using only normal OO constructs: thinking using the wrong paradigm =P.
Some AOP concepts can be easily implemented in Python using the notion of metaclass. There are lots of good materials about this subject on the net (for example, this discussion at SO) and I’ll not replicate them. However, a brief overview comes in handy for those unfamiliar with the idea.
A class can be viewed as a template for the creation of objects, which are entities that modify their states by responding to messages. Thus, a class defines the messages and how they are responded by the objects (the objects shape). Different objects of the same class can have different states, but they share the same shape.
So far, nothing new. The surprise is that, in Python, classes are themselves objects (like all Python data). This is very easy to see:
>>> class Original(object): pass ... >>> Copied = Original >>> Copied == Original True >>> x = Copied() >>> x.__class__.__name__ 'Original'
Since a class is also an object, and objects are produced using a class as template, what is the template of a class? In other words, what’s the class of a class? Guess who said: a metaclass. Metaclasses are templates, recipes of how classes are created. The default Python metaclass is type:
>>> Copied.__class__.__name__ 'type'
Yeah… type is a class used to create other classes. When the class keyword is found in a source code, the Python interpreter uses type to create the class. This means you can create classes at run-time. For instance, class Original(object): pass is completely equivalent to type("Original", (object,), {}).
And what’s the class of type metaclass?
>>> type.__class__.__name__ 'type'
Aha! Here the interpreter cheats a bit as you can see, but that’s necessary. After all, we are engineers and not philosophers to ask what’s the class of a metaclass… =P
Anyway… the important point here is that Python allows you to create your own metaclass and customize how classes are created by inheriting from type. Why would you like to do this? Well… indeed, most of the time you SHOULD NOT DO THIS. As Tim Peters once said:
Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).
But there are some use cases in which a metaclass is the perfect solution. The plug-in infrastructure is a good example. A very simple metaclass is used to guarantee the registration of any loaded class deriving from Plugin, and this is the whole plug-in auto-discovery mechanism.
Another use case is that described in the beginning of this post: to intercept plug-in calls to log and handle any exception. Let’s be more generic here: to intercept method calls to log and handle any exception. It would be nice if we could do something like this:
class Test1(object):
# log this method BUT RAISE any exception
@traceablecallable
def normalMethod(self):
pass
# log this method BUT RAISE any exception
@traceablecallable
def troublemakerMethod(self):
raise RuntimeError()
class Test2(Test1):
pass
class Test3(Test1):
def normalMethod(self):
print "I'm a normal method"
# log this method and SUPPRESS any exception
@securetraceablecallable
def troublemakerMethod(self):
super(Test3, self).troublemakerMethod()
test3 = Test3()
test3.normalMethod()
test3.troublemakerMethod()
test2 = Test2()
test2.normalMethod()
test1 = Test1()
test1.normalMethod()
test1.troublemakerMethod()
…generating as output:
INFO:calling Test3:normalMethod
I'm a normal method
INFO:calling Test3:troublemakerMethod
INFO:calling Test1:troublemakerMethod
ERROR:exception at Test1:troublemakerMethod: RuntimeError("")
ERROR:exception at Test3:troublemakerMethod: RuntimeError("")
INFO:calling Test1:normalMethod
INFO:calling Test1:normalMethod
INFO:calling Test1:troublemakerMethod
ERROR:exception at Test1:troublemakerMethod: RuntimeError("")
------------------------------------------------------------
Traceback (most recent call last):
...
RuntimeError
Try to figure out what’s going on here for a moment. traceablecallable and securetraceablecallable are just inheritable annotations meaning:
traceablecallable: log any call to this callable; log any exception and raise itsecuretraceablecallable: log any call to this callable; log any exception and suppress it
A callable is anything that can be called in Python, such as a method or a function. If that was possible, our use case would be solved in a very elegant way. It turns out this is really possible using the metaclass magic:
class TraceableMeta(type):
_MODES = { "TRACE" : 1,
"SECURE_TRACE" : 2 }
def __init__(cls, name, bases, attrs):
super(TraceableMeta, cls).__init__(name, bases, attrs)
cls.__tracedcallables__ = {}
for base in bases:
cls.__tracedcallables__.update(getattr(base, "__tracedcallables__",
{}))
traceModes = (getattr(v, "__tracemode__", None) for v in attrs.values())
cls.__tracedcallables__.update({k : v for k, v in zip(attrs.keys(), \
traceModes) if v in TraceableMeta._MODES.values()})
for k, v in cls.__tracedcallables__.items():
if attrs.has_key(k):
setattr(cls, k, _tracedCallableProxy(getattr(cls, k), v))
def _tracedCallableProxy(c, traceMode):
callableMod = inspect.getmodule(c)
if callableMod:
callableMod = callableMod.__name__
callableDefOrig = c.im_class.__name__ if hasattr(c, "im_class") \
else callableMod
def callableProxy(*args, **kwards):
try:
logging.info("calling %s:%s" % (callableDefOrig, c.__name__))
return c(*args, **kwards)
except Exception as e:
logging.error("exception at %s:%s: %s(\"%s\")" % (callableDefOrig,
c.__name__, e.__class__.__name__, e.__str__()))
if traceMode != TraceableMeta._MODES["SECURE_TRACE"]:
raise e
return callableProxy
def _traceabledecorator(traceMode):
def decorator(c):
if callable(c):
c.__tracemode__ = traceMode
return c
return decorator
traceablecallable = _traceabledecorator(TraceableMeta._MODES["TRACE"])
securetraceablecallable = _traceabledecorator( \
TraceableMeta._MODES["SECURE_TRACE"])
I’ll not explain all this code, just the idea. Basically, TraceableMeta is a metaclass that checks for the traceablecallable and securetraceablecallable annotations in the methods of the to be created class and its bases. The annotated methods are then replaced by a proxy responsible for doing the exception handling and logging stuff. One great thing about metaclasses is that they are inherited. Thus, for example, if A has TraceableMeta as metaclass and B inherits from A, then TraceableMeta is also metaclass of B. So, even overridden methods in derived classes lacking explicit annotations are considered, making the functionality provided by TraceableMeta totally transparent in the class hierarchy.
How can this be used by the plug-in infrastructure? It’s pretty simple. Just redefine the plug-in common interface:
class PluginMeta(TraceableMeta):
def __init__(cls, name, bases, attrs):
super(PluginMeta, cls).__init__(name, bases, attrs)
if not hasattr(cls, "registered"):
cls.registered = []
else:
cls.registered.append(cls)
class Plugin(object):
__metaclass__ = PluginMeta
...
@securetraceablecallable
def setup(self):
raise NotImplementedError()
@securetraceablecallable
def teardown(self):
raise NotImplementedError()
@securetraceablecallable
def run(self, *args, **kwargs):
raise NotImplementedError()
Tell the truth. How many languages do you know out there in which you could devise a solution like this in 56 lines of code? That’s all about abstraction. And that’s why I REALLY like Python =D.
