Last modified on 09/10/07 13:10:51


Magick Wrappers

As a small team experimenting with new media, and in this case with real-time graphics and virtual and augmented reality in particular, we want to be able to quickly try different combinations of software and hardware, technologies, techniques and our own ideas. A dual-language architecture lets us harness the performance of C/C++ libraries while quickly writing throw-away prototype code in a high-level language.

One of the most tedious tasks in such a setup is writing the scripting "bindings": an interface layer between the two languages (also called "wrapping" an API). It is important for us to lower the psychological barrier and the time required for wrapping yet another library now and then.

Our scripting language of choice - PLT scheme - provides two mechanisms for interfacing with C code:

extensions
dynamic libraries written in C using the mzscheme C API, which can be loaded into the scheme interpreter
foreign function interface (FFI)
a scheme library that lets you describe C functions and data directly in scheme and generates the bindings for them. FFI can't be used to wrap C++ libraries directly because the C++ ABI is incompatible with the C one.

One possible solution for wrapping C++ libraries is to maintain an alternative C interface for them and use that API via FFI. The disadvantage of this approach is redundant code: the API has to be described twice, once in C and once in scheme, which is error prone and typing-intensive. Given the experimental nature of VGE, the APIs change often, and having to alter each of them in two places could daunt the programmers involved.

Instead we're using a combination of the mzscheme extension API and C++ template metaprogramming to minimize the boilerplate code typical for language bindings. The approach is inspired by the boost::python and luabind libraries, although it isn't as powerful or feature-rich.

Half-jokingly, the VGE binding utilities are informally referred to as magick wrappers (in the archaic/Crowleyan spelling).

Uniform scheme to C/C++ and back conversion API

Function wrapper templates generate conversions from scheme values (represented by Scheme_Object * pointers in the C API) to the C types of function arguments, and from the C type of the function return value back to a Scheme_Object * pointer. For this to be possible, every supported type implements

TYPE from_scheme<TYPE>(Scheme_Object *);

and/or

Scheme_Object * to_scheme(TYPE);

operations. These conversions can also be used directly where necessary.

Header file VGE/scheme_convert.h implements conversions for several internal types as well as for wrapped structure pointers. Individual extensions may add support for more types (e.g. engine base extension adds direct support for Ogre vectors, quaternions and angles). New type conversions should be declared after #include <VGE/scheme_convert.h> but before #include <VGE/type_switch.h> (see Type Switch) (FIXME is that really so? and what about scheme_fun.h?)

When a conversion isn't possible, these operations throw a vge_bad_cast_exn exception, whose constructor expects a C string naming the expected type for error reporting. Generated function wrappers catch this exception and convert it to an appropriate scheme error condition. When using the conversions "manually", you have to handle the exception yourself.
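The conversion API and its failure mode could be sketched roughly as follows. This is not the code from VGE/scheme_convert.h: Scheme_Object is replaced by a small mock (the real type comes from the mzscheme headers), make_object and to_long_or are hypothetical helpers, and the layout of vge_bad_cast_exn is assumed from the description above.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Mock of mzscheme's Scheme_Object (assumption for this sketch only).
struct Scheme_Object {
    enum Kind { INTEGER, STRING } kind;
    long integer;
    std::string text;
};

// Hypothetical mock allocator; real scheme values belong to the garbage collector.
inline Scheme_Object *make_object(Scheme_Object::Kind kind, long i,
                                  const std::string &s) {
    static std::deque<Scheme_Object> arena;  // a deque keeps addresses stable
    Scheme_Object obj = { kind, i, s };
    arena.push_back(obj);
    return &arena.back();
}

// Assumed shape of the exception: it carries the expected type name.
struct vge_bad_cast_exn {
    const char *expected;
    explicit vge_bad_cast_exn(const char *expected_type)
        : expected(expected_type) {}
};

// from_scheme is a template selected by the target type...
template <typename TYPE> TYPE from_scheme(Scheme_Object *obj);

template <> inline long from_scheme<long>(Scheme_Object *obj) {
    if (obj->kind != Scheme_Object::INTEGER) throw vge_bad_cast_exn("long");
    return obj->integer;
}

template <> inline std::string from_scheme<std::string>(Scheme_Object *obj) {
    if (obj->kind != Scheme_Object::STRING) throw vge_bad_cast_exn("std::string");
    return obj->text;
}

// ...while to_scheme is an ordinary overload set selected by the argument type.
inline Scheme_Object *to_scheme(long value) {
    return make_object(Scheme_Object::INTEGER, value, std::string());
}

inline Scheme_Object *to_scheme(const std::string &value) {
    return make_object(Scheme_Object::STRING, 0, value);
}

// "Manual" use of a conversion: the caller handles the exception itself.
inline long to_long_or(Scheme_Object *obj, long fallback) {
    try {
        return from_scheme<long>(obj);
    } catch (const vge_bad_cast_exn &) {
        return fallback;
    }
}
```

In this sketch, converting a value that was created from a string to long takes the fallback path, while a generated function wrapper would instead turn the exception into a scheme error.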

For this error reporting to work, template functions need access to strings representing the names of the supported types. At first the solution seems to be typeid(TYPE).name(), but the result of this call is implementation-defined, and some compilers (GCC, for example) return the computer-friendly "mangled" type names they use internally.

Type name demanglers do exist (they are compiler specific because there is no common C++ ABI / name mangling scheme), but so far we haven't found one that works across the platforms VGE is being developed on (32/64 bit linux and 32 bit windows). The temporary (and ugly) solution is a type_name template and macros, see VGE/type_name.h. This header also defines names for a number of built-in types. The default version of the type_name template -- used for types not explicitly named with the TYPE_NAME or AUTO_TYPE_NAME macros -- returns the mangled type name.

To supply a new type name, add one of the following at global scope:

// succinct, preferred form:
AUTO_TYPE_NAME(C_Type);
// flexible form:
TYPE_NAME("HumanReadableTypeName", C_Type);

The first -- succinct -- form uses stringification of the macro parameter to generate the name string, so the name looks just like the type's normal C/C++ spelling.
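A minimal sketch of how such a template and its macros might be put together is shown below. It is an assumption about the general technique, not the contents of VGE/type_name.h: in particular, type_name is modelled here as a class template with a static get() function, and Camera and demo::Vec3 are hypothetical types.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Default: fall back to whatever the compiler reports, which may be mangled.
template <typename TYPE>
struct type_name {
    static const char *get() { return typeid(TYPE).name(); }
};

// Flexible form: an explicit specialization that returns the given string.
#define TYPE_NAME(NAME, TYPE)                          \
    template <> struct type_name<TYPE> {               \
        static const char *get() { return NAME; }      \
    }

// Succinct form: #TYPE stringifies the macro argument exactly as written.
#define AUTO_TYPE_NAME(TYPE) TYPE_NAME(#TYPE, TYPE)

// Hypothetical types used only for illustration.
struct Camera {};
namespace demo { struct Vec3 {}; }

// Both macros are used at global scope because they specialize a global template.
TYPE_NAME("camera", Camera);
AUTO_TYPE_NAME(demo::Vec3);
```

With these definitions type_name<Camera>::get() yields "camera" and type_name<demo::Vec3>::get() yields "demo::Vec3". One limitation of this particular sketch is that a type whose spelling contains a top-level comma (such as a template with two arguments) would be split into several macro arguments.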

Note
VGE_ENUM (see enum wrapper macro) uses AUTO_TYPE_NAME internally to name the enum. It also defines the to_scheme and from_scheme operations for the enum. FIXME: do the same for registered structure pointer types.

Structure Pointers

In order to wrap an object-oriented API such as OGRE, we need a scheme representation of (pointers to) C++ objects.

The simplest way to do that in mzscheme is to use scheme_cpointer_type: a tagged void pointer wrapped in a scheme value. This approach is used by the mzscheme FFI to wrap C structures, using the pointer tag field (which can hold any scheme value) to represent the type. FFI also maintains some meta information about structure types, allowing a simple form of polymorphism: structures of descendant types can be used where an ancestor type is expected.

However, this approach only works when the descendant structure contains the ancestor structure at its start, which isn't always the case for C++ class hierarchies:

  • if the parent class contains no virtual methods but the child declares some, an object of the descendant type will typically start with the vtable pointer, followed by the ancestor structure
  • in the case of multiple inheritance, the child object can obviously start with only one of its parent structures (normally the first)

Thus, a pointer cast in C++ may not only reinterpret the pointer type but also change the address (this is, in particular, what dynamic_cast<> does). Although the actual address calculation happens at run time, using run-time type information (RTTI), dynamic_cast needs the pointer type (to convert from) to be available at compile time, meaning it can't handle void pointers (which contain no type information).
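The effect can be observed with a small experiment. Left, Right and Child are hypothetical classes, and the helper functions are only there to name the individual conversions.

```cpp
#include <cassert>

// Two non-empty polymorphic bases: they cannot both live at the same address.
struct Left  { int l; virtual ~Left() {} };
struct Right { int r; virtual ~Right() {} };
struct Child : Left, Right { int c; };

// Typed upcast: the compiler knows the layout and adjusts the address.
inline Right *as_right(Child *child) { return child; }

// Type-erased pointer: only the address of the whole object survives.
inline void *erase(Child *child) { return child; }

// Typed downcast: dynamic_cast uses RTTI to undo the adjustment.
inline Child *as_child(Right *right) { return dynamic_cast<Child *>(right); }

// Compares the addresses of the two base subobjects of one Child.
inline bool bases_share_address(Child *child) {
    return static_cast<void *>(static_cast<Left *>(child)) ==
           static_cast<void *>(as_right(child));
}
```

For any Child object, bases_share_address returns false, and as_child(as_right(&c)) gives back &c. On common ABIs erase(&c) also differs from as_right(&c), so reinterpreting the void pointer as a Right * would point at the wrong subobject.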

Given infinite development time it would be possible to wrap each family of potentially compatible pointers into a distinct wrapper type containing a typed pointer to the hierarchy root. But in practice a single library such as OGRE contains dozens of mutually incompatible hierarchies, so this approach isn't feasible.

Instead, VGE employs a custom RTTI scheme built on top of standard C++ RTTI. Each structure wrapper contains a typeless (void) pointer as well as a pointer to a type_info structure describing the original pointer. Since the standard C++ RTTI API is very limited, these two fields are only enough to cast the void pointer back to the original pointer type (or to refuse to), but not to cast it to one of the ancestor types. To achieve the latter we build a type registry, which captures the parent-child relationships between types together with the pointer conversions from type to type, tagged with type_info's. Now, given a typeless pointer and its original type_info, we can see whether it can be cast to a type with a different type_info and, if so, how.

Only class hierarchies must be registered with the type registry, not parentless classes (or classes with parents we don't care about). But when registering a class with multiple parents, the first parent always has to be registered as such (the order of the other parents doesn't matter and they can be skipped). That's because of the child -- first parent conversion optimization.
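The idea of a registry that stores typed pointer adjustments per parent-child edge could be sketched as follows. All names here (type_registry, add_parent, wrapped_pointer, wrap_pointer, unwrap, and the Node/Renderable/Entity classes) are hypothetical, the real VGE API is not documented on this page, and the first-parent optimization mentioned above is not modelled.

```cpp
#include <cassert>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <utility>

typedef void *(*cast_fn)(void *);

// Generated per (Child, Parent) pair while both types are still known,
// so the compiler can apply the correct address adjustment.
template <typename Child, typename Parent>
void *upcast(void *p) {
    return static_cast<Parent *>(static_cast<Child *>(p));
}

struct edge { std::type_index parent; cast_fn adjust; };

class type_registry {
public:
    template <typename Child, typename Parent>
    void add_parent() {
        cast_fn fn = &upcast<Child, Parent>;
        edge e = { std::type_index(typeid(Parent)), fn };
        edges_.insert(std::make_pair(std::type_index(typeid(Child)), e));
    }

    // Walks child -> parent edges; returns the adjusted pointer,
    // or a null pointer when no path from `from` to `to` is registered.
    void *cast(void *p, std::type_index from, std::type_index to) const {
        if (from == to) return p;
        typedef std::multimap<std::type_index, edge>::const_iterator iter;
        std::pair<iter, iter> range = edges_.equal_range(from);
        for (iter it = range.first; it != range.second; ++it) {
            void *found = cast(it->second.adjust(p), it->second.parent, to);
            if (found) return found;
        }
        return nullptr;
    }

private:
    std::multimap<std::type_index, edge> edges_;
};

// The two fields described above: a void pointer plus its original type_info.
struct wrapped_pointer { void *ptr; const std::type_info *type; };

template <typename T>
wrapped_pointer wrap_pointer(T *p) {
    wrapped_pointer w = { p, &typeid(T) };
    return w;
}

template <typename T>
T *unwrap(const type_registry &reg, const wrapped_pointer &w) {
    return static_cast<T *>(reg.cast(w.ptr, *w.type, typeid(T)));
}

// Hypothetical hierarchy with multiple inheritance.
struct Node { int n; };
struct Renderable { int r; };
struct Entity : Node, Renderable { int e; };
```

After registering Entity with both parents, a wrapped Entity * can be unwrapped as a Renderable * with the correct address, while a wrapped Node * cannot be unwrapped as an Entity * because only child-to-parent edges are stored.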

Notes
  • aren't there type traits to check, so that this rule can go away?
  • if the type registry operations were macros, they could use TYPE_NAME or AUTO_TYPE_NAME to register object and pointer type names. In this case the same API would have to be used for parentless classes (but would only perform type name registration).
TODO
  • support for more than three ancestors
  • document type registration (for now grep for 'type_registry' in *.cpp)

Enum Wrappers

Mzscheme's FFI has support for enums which binds scheme symbols to their integer values (since C enums are just lists of names for integers). The obvious drawback is the care needed to keep symbols and values in exact correspondence between scheme and C/C++. The advantage is the speed of compiled scheme code: the integer value is taken directly from the symbol's binding.

In VGE we use an alternative approach: enum values are converted to and from scheme symbols using a per-type lookup table. The disadvantage is the lookup time. The advantages are wrapping by name (less error prone), binding independence (the symbol's name is used, not its binding, which could have been altered, especially for generic words like 'point'), and the fact that the same symbol may correspond to different C/C++ values, e.g. common words used as enum values in different libraries or in different namespaces of the same library.

Header VGE/scheme_enum.h defines the macro VGE_ENUM, which creates the lookup tables, declares the to_scheme and from_scheme functions for the enum type, and registers the type name for error reporting. Two micro-macros, _P and _E, can be used for listing enum values: the longer form _P(name, value) lets you choose the scheme symbol name, while the shorter _E(value) generates the symbol name from the enum value automatically. Note that there are no commas between invocations of the micro-macros:

VGE_ENUM( Ogre::SceneManager::PrefabType, 
	  _P( "plane", Ogre::SceneManager::PT_PLANE )
	  _P( "cube", Ogre::SceneManager::PT_CUBE )
	  _P( "sphere", Ogre::SceneManager::PT_SPHERE ));
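Why the entries are written without commas becomes clearer from a sketch of how such macros could expand. The DEMO_* macros, enum_entry and the lookup functions below are hypothetical stand-ins, not the contents of VGE/scheme_enum.h, and PrefabType is a local enum standing in for the Ogre one. Each entry macro expands to a table row that already ends in a comma.

```cpp
#include <cassert>
#include <cstring>

// Local stand-in for Ogre::SceneManager::PrefabType.
enum PrefabType { PT_PLANE, PT_CUBE, PT_SPHERE };

struct enum_entry { const char *symbol; int value; };

// Each entry macro emits one row *including* the trailing comma,
// which is why invocations are simply juxtaposed.
#define DEMO_P(NAME, VALUE) { NAME, static_cast<int>(VALUE) },
#define DEMO_E(VALUE)       { #VALUE, static_cast<int>(VALUE) },

// The table is closed with a sentinel row whose symbol is null.
#define DEMO_ENUM(TABLE, ENTRIES) \
    static const enum_entry TABLE[] = { ENTRIES { nullptr, 0 } }

// Symbol name -> value; returns false for an unknown symbol.
inline bool symbol_to_value(const enum_entry *table, const char *symbol,
                            int &out) {
    for (; table->symbol; ++table) {
        if (std::strcmp(table->symbol, symbol) == 0) {
            out = table->value;
            return true;
        }
    }
    return false;
}

// Value -> symbol name; returns a null pointer for an unknown value.
inline const char *value_to_symbol(const enum_entry *table, int value) {
    for (; table->symbol; ++table) {
        if (table->value == value) return table->symbol;
    }
    return nullptr;
}

DEMO_ENUM(prefab_table,
          DEMO_P("plane", PT_PLANE)
          DEMO_P("cube", PT_CUBE)
          DEMO_E(PT_SPHERE));
```

Here "cube" maps to PT_CUBE, PT_SPHERE maps back to the automatically generated name "PT_SPHERE", and an unknown symbol such as "torus" is rejected. A linear scan is used for brevity; the lookup cost mentioned above depends on the table implementation actually chosen.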

Function Wrappers

The utilities described so far are used together almost behind the scenes to implement function wrapper generators.

To wrap a C/C++ function:

  • make sure that the return type supports the to_scheme operation
  • make sure that the argument types support the from_scheme operation
  • use one of the VGE_WRAP_FUN or VGE_AUTO_WRAP_FUN macros. The macro generates a wrapper function at compile time, creates a scheme primitive function value from the wrapper, and binds it to a symbol in a scheme namespace (given as the last argument to the macros). As usual, the _AUTO_ form generates the scheme name from the wrapped function name.

Note that the Scheme_Object * type itself implements the to_scheme and from_scheme operations, so it can be used just as well, e.g. to accept pairs, lists and other scheme objects directly or to emulate generic functions (see type switch below).

The following restrictions apply to functions that can be wrapped:

  • functions can't use default argument values
  • overloaded functions (functions with the same name but different type signature) can't be wrapped into single scheme function (FIXME this is annoying)
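The core of such a wrapper generator can be sketched with a class template that takes the function pointer as a template argument and unpacks the argument array through from_scheme. This is an illustration of the general technique rather than the VGE_WRAP_FUN implementation: it uses C++14 (std::index_sequence), a mock Scheme_Object that only holds a number, and hypothetical names (wrap, make_number, scale). The (argc, argv) signature is modelled on mzscheme primitives; where a real wrapper would raise a scheme error, the sketch returns a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Mock of mzscheme's Scheme_Object (assumption for this sketch only).
struct Scheme_Object { double number; };

inline Scheme_Object *make_number(double v) {
    static std::deque<Scheme_Object> arena;  // a deque keeps addresses stable
    arena.push_back(Scheme_Object{v});
    return &arena.back();
}

template <typename T> T from_scheme(Scheme_Object *obj);
template <> inline double from_scheme<double>(Scheme_Object *o) {
    return o->number;
}
template <> inline int from_scheme<int>(Scheme_Object *o) {
    return static_cast<int>(o->number);
}
inline Scheme_Object *to_scheme(double v) { return make_number(v); }
inline Scheme_Object *to_scheme(int v) { return make_number(v); }

template <typename F, F fn> struct wrap;

// Specialization for plain functions with a non-void return type.
template <typename R, typename... Args, R (*fn)(Args...)>
struct wrap<R (*)(Args...), fn> {
    // Same shape as a scheme primitive: argument count plus argument array.
    static Scheme_Object *call(int argc, Scheme_Object **argv) {
        if (argc != static_cast<int>(sizeof...(Args)))
            return nullptr;  // a real wrapper would report an arity error
        return invoke(argv, std::index_sequence_for<Args...>());
    }

private:
    // Converts argv[I] to the I-th parameter type, calls fn, converts back.
    template <std::size_t... I>
    static Scheme_Object *invoke(Scheme_Object **argv,
                                 std::index_sequence<I...>) {
        return to_scheme(fn(from_scheme<Args>(argv[I])...));
    }
};

// Hypothetical function to wrap.
inline double scale(double x, int factor) { return x * factor; }

typedef wrap<decltype(&scale), &scale> scale_primitive;
```

Calling scale_primitive::call with the scheme values 1.5 and 4 produces a scheme value holding 6. In a sketch like this the wrapper is generated from a single function address, so the name of an overloaded function would be ambiguous, and default argument values are not part of the function type, so every argument has to be supplied.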

Type Switch

Type switch macros make it possible to emulate overloaded / generic functions. The form of the type switch construct is analogous to that of a switch statement, but the switch variable is a Scheme_Object * and the case selectors are C/C++ types. As soon as the scheme value can be converted to the type in a TYPE_CASE(...) macro, the portion of code following it, up to the TYPE_CASE_END macro, is executed. If no match is found, scheme_wrong_type is called with "One of {TYPE1, TYPE2, ...}" as the expected type.

Please note that the first type for which a conversion is possible will match, which makes all number types equivalent (due to automatic conversion) and causes wrapped pointers to match ancestor pointer types. If for some reason you want to treat related types differently, list the most specific types first.
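The first-match rule can be illustrated without the macros by writing out by hand the kind of logic a type switch could expand to. Everything below is a hypothetical sketch: Scheme_Object is a mock, try_convert, specific_first and general_first are invented names, and the final failure is reported with an exception where the real macros call scheme_wrong_type.

```cpp
#include <cassert>
#include <cstring>

// Mock of mzscheme's Scheme_Object (assumption for this sketch only).
struct Scheme_Object {
    enum Kind { INTEGER, REAL, OTHER } kind;
    double number;
};

struct vge_bad_cast_exn {
    const char *expected;
    explicit vge_bad_cast_exn(const char *expected_type)
        : expected(expected_type) {}
};

template <typename T> T from_scheme(Scheme_Object *obj);

// long accepts only integers...
template <> inline long from_scheme<long>(Scheme_Object *obj) {
    if (obj->kind != Scheme_Object::INTEGER) throw vge_bad_cast_exn("long");
    return static_cast<long>(obj->number);
}

// ...while double accepts any number (automatic conversion).
template <> inline double from_scheme<double>(Scheme_Object *obj) {
    if (obj->kind == Scheme_Object::OTHER) throw vge_bad_cast_exn("double");
    return obj->number;
}

// One "case": attempt the conversion and report whether it succeeded.
template <typename T>
bool try_convert(Scheme_Object *obj, T &out) {
    try {
        out = from_scheme<T>(obj);
        return true;
    } catch (const vge_bad_cast_exn &) {
        return false;
    }
}

// Most specific type first: integers and reals are told apart.
inline const char *specific_first(Scheme_Object *obj) {
    long n;
    double d;
    if (try_convert(obj, n)) return "long";
    if (try_convert(obj, d)) return "double";
    throw vge_bad_cast_exn("One of {long, double}");
}

// General type first: the long case can never be reached.
inline const char *general_first(Scheme_Object *obj) {
    double d;
    long n;
    if (try_convert(obj, d)) return "double";
    if (try_convert(obj, n)) return "long";
    throw vge_bad_cast_exn("One of {double, long}");
}
```

For a scheme integer, specific_first reports "long" while general_first reports "double", because the double case already accepts every number. The same ordering concern applies to wrapped pointers and their ancestor types.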