C++/C# interoperability

14 Sep 2017 • 11 min. read • Comments

This is a blog entry on C++/C# interoperability.

Recently I was working on a project where I needed to use C++ code from within a C# application. Apart from calling and executing C++ functions from within C#, data needed to be passed back and forth across this divide. This is not a trivial task, considering the different paradigms used by the two languages, mainly the unmanaged nature of C++ versus the managed nature of C#.

As the project dealt with image processing functionality, some of the data that needed to be passed from C# to C++ and back consisted of raw image buffers.

In this blog I will briefly describe the mechanism used for performing C++/C# interoperability. Some code is also provided for handling the ‘transfer’ of data between the two languages.

Managed code

C# is one of the programing languages forming part of Microsoft’s .NET framework. C# programs get compiled into an intermediate type of language, called IL, that run on a virtual execution system, called Common Language Runtime (CLR).

Using Java as an analogy, think of the CLR as being the Java VM (JVM). One difference is that, apart from C#, there are other languages in the .NET framework. For example, F# and Visual Basic. Programs written in these languages all get compiled to the same intermediate code that executes on CLR.

One characteristic of the .NET languages is that they make use of managed code. In managed code, the CLR takes responsibility of managing the memory and other resources of the programs. This `management’ can include garbage collection, control of the lifetime of objects, enhanced debugging functionality, etc.

In constrast, in umanaged code like that of C++, the runtime system knows little about the memory and resources used by the program and can provide minimal services. It’s the program’s responsibility to manage such objects and resources.

One of the issues faced in making C++ code interoperable with C# is how to go about moving data across the boundary between managed and unmanaged code. This process is known as marshaling.

Interoperability mechanisms

The .NET framework provides different ways of achieving interoperability. These are:

Platform Invocation (P/Invoke)
C++ Interop
COM Interop
Embedded Interop types

Platform Invocation (PInvoke for short) allows for managed code to call native unmanaged functions implemented as DLLs. This method is ideal for when we have API-like functions written in C or C++ that need to be accessed from within a C# program. For further info on PInvoke follow this link.

C++ Interop is also known as implict PInvoke and informally referred to as It Just Works. This mechanism consists of wrapping a native C++ class so that it can be consumed by C# code. More details on this method can be found here.

COM Interop is a mechanism specifically for exposing COM components to a .NET language. In other words, the unmanaged code must be encapsulated as a COM object for this mechanism to be applicable.

A recent mechanism that was introduced with .NET version 4.0 is Embedded Interop Types. This is based on defining the equivalence of types.

For my image processing-based project, I opted for the PInvoke method mainly because it fits quite well with the API-style of usage of the C++ code.

A PInvoke example

The diagram below summarises the PInvoke mechanism.

The native C++ code is compiled as a DLL with C-type linkage used for the exported functions (‘DLL function 1’ and ‘DLL function 2’ in the diagram above). Sample C++ code is given below:

//----- ImageHandler.h -------
#pragma once

#ifdef IMAGEHANDLER_EXPORTS
#	define IMAGEHANDLER_API __declspec(dllexport) 
#else
#	define IMAGEHANDLER_API __declspec(dllimport) 
#endif

extern "C" IMAGEHANDLER_API float ValidateImageColourInterop(char**& inputParams, int* numInputParams, char**& outputParams, int* numOutputParams);
extern "C" IMAGEHANDLER_API float ValidateImageResolutionInterop(char**& inputParams, int* numInputParams, char**& outputParams, int* numOutputParams);

The C++ functions to be invoked must also be declared in the C# code. These declarations are called the managed signatures of the functions, and must specify the name of the DLL library in which they reside, obviously the function names, the return types of the functions, and the input parameters of the functions.

using System;
using System.Runtime.InteropServices;

public class ColourValidator
{
    [DllImport("ImageHandler", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    static extern int ValidateImageColourInterop(ref IntPtr inParams, ref int inParamsSize, ref IntPtr outParams, ref int outParamsSize);


    public bool Validate()
    {
        // ...
		
        int rc = ValidateImageColourInterop(ref inputParamsRoot, ref inputParamsSize, ref outParamsRoot, ref outParamsSize);
		
        // ...
    }
} 

The managed signatures provide the information needed by the CLR to locate the C++ functions in the respective DLLs.

Marshalling of data types

The diagram above also depicts the mashalling mechanism, the process of converting unmanaged types into managed types and vice-versa. The black line indicates the boundary between the managed and unmanaged code. An example is given whereby a C++ DWORD is marshalled into a System.UInt32 on .NET.

For some types, marshalling can be done in different ways. For example, a LPCTSTR
in C++ (a long pointer to a constant wide-character string) can be marshalled to a System.String, or a System.Text.StringBuilder, or a System.Char[] type.

The .NET framework provides a marshaler object, unsurprisingly called System.Runtime.InteropServices.Marshaller, that provides common marshalling functionality.

A custom Marshaler

For this project, I wanted to be able to pass a list of key-value pairs between the managed and unmanaged code. This would allow for enhanced flexibility and better mantainability between the two codebases.

It was decided that the key would always be of string type, while the value type can vary (e.g. the value could be of integer type, long int, floating-point, string, etc.)

Another reason for adopting a key-value approach is due to the nature of the application. Since the project deals with images of different types and formats, different image-related metadata needs to be sent to and received from the C++ code. Think of EXIF data as an example of such metadata. Representing this metadata as a list of key-value pairs is the ideal way.

Since the number of key-value pairs is not fixed, our custom marshalling object needs to do serialisation in addition to the marshalling task.

The source code for the C++ and C# portions of our custom marshaler are provided below. The code is quite straightforward and can speak for itself. The C++ code uses a tempate version of the key-value pair in order to accomodate the different value types. On the C# side, we rely on the fact that different types have a common parent class, the Object class.


C++ interop class:	Interop.h	Interop.cpp
C++ Key-Value support classes:	KeyValuePair.h	KeyValuePair.cpp
	KeyValueList.h	KeyValueList.cpp
C# interop class:	Interop.cs
Our custom C# marshaller:	GenericMarshaller.cs

Usage of the customer C++ Marshaler is as follows:

#include "KeyValuePair.h"
#include "Interop.h"


int ValidateImageColourInterop(char**& inputParams, int* numInputParams, char**& outputParams, int* numOutputParams)
{
	int rc = EXIT_SUCCESS;		// assume so

	try
	{
		KeyValueList kvps = Interop::decodeParameterList(inputParams, numInputParams);
		kvps.display();

		KeyValueList kvps2;

		try
		{
			if (!ValidateImageColour(kvps, kvps2))
				rc = EXIT_FAILURE;
		}
		catch (std::exception& ex)
		{
			// ...
			rc = EXIT_FAILURE;
		}

		// serialise the output
		Interop::encodeParameterList(kvps2, outputParams, numOutputParams);
	}
	catch (std::exception& ex)
	{
		// ...
		rc = EXIT_FAILURE;
	}
	catch (...)
	{
		// ...
		rc = EXIT_FAILURE;
	}

	return rc;
}

And the C# usage example is given below:

using System;
using System.Runtime.InteropServices;

public class ColourValidator
{
    [DllImport("ImageHandler", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Ansi)]
    static extern int ValidateImageColourInterop(ref IntPtr inParams, ref int inParamsSize, ref IntPtr outParams, ref int outParamsSize);

	
	public ValidatorOutput Validate(IDictionary<string, object> p_Params)
	{
		bool isValid = true;        // assume true for now
		
		try
		{
			// encode params for interop call
			IntPtr inputParamsRoot;
			int inputParamsSize;
			Interop.encodeParameterList(p_Params, out inputParamsRoot, out inputParamsSize);

			// do interop call
			IntPtr outParamsRoot = IntPtr.Zero;
			int outParamsSize = 0;
			int rc = ValidateImageColourInterop(ref inputParamsRoot, ref inputParamsSize, ref outParamsRoot, ref outParamsSize);
			if (rc != 0)
			{
				isValid = false;
				// ...
			}

			// decode params returned by interop call
			IDictionary<string, object> p_OutParams = Interop.decodeParameterList(outParamsRoot, outParamsSize);
		}
		catch (Exception ex)
		{
			isValid = false;
			// ...
		}

		// return the validation result
		return new ValidatorOutput(isValid, p_OutParams);
	}
}

Conclusion

Some concluding thoughts on using PInvoke for C++/C# interoperability.

Usage of PInvoke is quite simple and I believe that the custom marshaller described above provides quite a generic approach that can be used as-is for different application domains.

Some things to look out for:

C++/C# interop code can prove to be quite difficult to debug, especially across the boundary between managed and unmanaged code. The use of logging can mitigate this problem.
PInvoke requires C-style linkage. Thus C++ classes need to be serialised before they are marshalled.
PInvoke is not type safe. Errors are reported at runtime.

I hope to look at better ways of implementing interop in the near future, potentially looking at using Embedded Interop types.