Tuesday, 31 March 2009

Marshalling a Variable-Length Array From Unmanaged Code In C#

I recently spent time working on some C# code to interact with a simple DNS-SD system. This requires using DNS TXT records, which are not supported by the System.Net.Dns class. After a few Google searches failed to turn up a pure .NET client library that met my needs, I settled on an approach based around p/invoking the Win32 DnsQuery function.

And quickly ran into problems.

For DNS TXT records, DnsQuery returns a DNS_TXT_DATA structure in the Data field of the DNS_RECORD structure. DNS_TXT_DATA is declared like this:
typedef struct {
    DWORD dwStringCount;
    PWSTR pStringArray[1];
} DNS_TXT_DATA,
    *PDNS_TXT_DATA;

Using the very handy P/Invoke Interop Assistant, we see that this struct can be represented like this in managed code:
[StructLayout(LayoutKind.Sequential)]
public struct DNS_TXT_DATA {

    /// DWORD->unsigned int
    public uint dwStringCount;

    /// PWSTR[1]
    [MarshalAs(UnmanagedType.ByValArray,
            SizeConst=1,
            ArraySubType=UnmanagedType.SysUInt)]
    public IntPtr[] pStringArray;
}

There is a problem with pStringArray, unfortunately. The System.Runtime.InteropServices.Marshal class cannot marshal a variable-length array embedded in a structure, as it needs to know in advance how big the array is in order to allocate memory. That's why the managed structure needs SizeConst specified in the MarshalAs attribute.

However, if the DNS TXT record data contains multiple quoted strings separated by whitespace, DnsQuery will return a structure with a variable number of elements in pStringArray. Since SizeConst is set at compile-time, when we marshal this into the managed struct defined above, we only get the first element in our single-element array. Rats.

More googling turned up very little info on dealing with this, though I found indications that others had run into the same problem without finding a satisfactory conclusion. DnsQuery is not the only Win32 function that returns variable-length arrays, and p/invoking any of the others has the same issue.

Simply declaring SizeConst to be bigger than we need - "hey, I know I'll never get more than 10 or so strings back, so why not declare SizeConst to be 128?" - is inelegant (hardcoded upper limits, ugh) and doesn't work properly anyway. Since the struct layout is sequential, the marshaller will copy over (e.g.) 128*sizeof(IntPtr) sequential bytes (a total of 512 bytes in a 32-bit process, or 1024 in a 64-bit one). That much memory was never allocated on the unmanaged side, so we end up with a load of junk in the tail of pStringArray, and more often than not the marshaller chokes on this junk and throws an AccessViolationException. Fun.
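
To see how much unmanaged memory the marshaller expects for a given SizeConst, Marshal.SizeOf can be asked directly. The following is only a sketch: TxtDataOne and TxtData128 are hypothetical stand-ins for DNS_TXT_DATA, not declarations taken from the original program.

```csharp
using System;
using System.Runtime.InteropServices;

public static class SizeConstDemo
{
    // Hypothetical stand-in for DNS_TXT_DATA with the one-element array.
    [StructLayout(LayoutKind.Sequential)]
    public struct TxtDataOne
    {
        public uint dwStringCount;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 1)]
        public IntPtr[] pStringArray;
    }

    // Same fields, but claiming 128 inline pointers.
    [StructLayout(LayoutKind.Sequential)]
    public struct TxtData128
    {
        public uint dwStringCount;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 128)]
        public IntPtr[] pStringArray;
    }

    // Number of unmanaged bytes the marshaller expects for T.
    public static int UnmanagedSize<T>() where T : struct
    {
        return Marshal.SizeOf(typeof(T));
    }
}
```

The 128-element version comes out 127 * IntPtr.Size bytes larger than the one-element version, and PtrToStructure will read all of those bytes whether or not the native side ever allocated them.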

There IS a way to get round the problem, though. I'm not sure it's the best way, but it works and seems stable, so I thought I'd throw it out there in case anyone else can use it (or maybe explain to me why it's an unsafe stupid thing to do...)

Basically, since we're dealing with sequential memory, we can use Marshal.PtrToStructure to marshal the DNS_TXT_DATA structure as defined above, then use pointer arithmetic to gain access to any further data that needs marshalling.

Pointer arithmetic? Oh yes, even in the safe and secure world of managed code it's sometimes still necessary to get our hands dirty, and situations like this illustrate that it will always be valuable to have some hard-earned Assembly/C/C++ war wounds.

So, assuming we have valid p/invoke declarations and data structures (I've included a complete source program below), DnsQuery is called like so:
var pServers = IntPtr.Zero;
var ppQueryResultsSet = IntPtr.Zero;
var ret = DnsQuery(domain,
        DnsRecordType.TEXT,
        DnsQueryType.STANDARD,
        pServers,
        ref ppQueryResultsSet,
        IntPtr.Zero);
if (ret != 0)
    throw new ApplicationException("DnsQuery failed: " + ret);

If we examine the memory location of ppQueryResultsSet (Ctrl-Alt-M,1 or Debug->Windows->Memory->Memory1 in Visual Studio) we'll see something like the following (actual address locations may vary - just copy the int value of ppQueryResultsSet to the Address bar of the memory window):
0x049E0878  00 00 00 00  ....
0x049E087C  b8 09 9e 04  ¸.ž.
0x049E0880  10 00 20 00  .. .
0x049E0884  19 30 00 00  .0..
0x049E0888  00 00 00 00  ....
0x049E088C  00 00 00 00  ....
0x049E0890  06 00 00 00  ....
0x049E0894  b8 08 9e 04  ¸.ž.
0x049E0898  d8 08 9e 04  Ø.ž.
0x049E089C  f8 08 9e 04  ø.ž.
0x049E08A0  28 09 9e 04  (.ž.
0x049E08A4  68 09 9e 04  h.ž.
0x049E08A8  88 09 9e 04  ˆ.ž.

I've set the column size to 4 here, as most of the values we are dealing with are 4 bytes in size. This effectively shows one value per line.

The first 6 rows (24 bytes in this 32-bit process) correspond to the DNS_RECORD structure up until (but not including) the DNS_TXT_DATA structure in DNS_RECORD's Data union. We can marshal this first structure without problem:
var dnsRecord = (DnsRecord) Marshal.PtrToStructure(
        ppQueryResultsSet, typeof (DnsRecord));

The DNS_TXT_DATA structure starts at address 0x049E0890 in my example. Having already marshalled the DNS_RECORD structure, now I want a pointer to the DNS_TXT_DATA structure. I can do this by creating a new pointer at the address of ppQueryResultsSet plus 24 bytes, and marshalling again:
// ToInt64 rather than ToInt32, so the address cannot overflow in a 64-bit process
var ptr = new IntPtr(
        ppQueryResultsSet.ToInt64() + Marshal.SizeOf(dnsRecord));
var txtData = (DNS_TXT_DATA) Marshal.PtrToStructure(
        ptr, typeof (DNS_TXT_DATA));

Because of the definition of DNS_TXT_DATA, this only marshals 8 bytes in a 32-bit process - 4 bytes for dwStringCount, and 4 bytes for the single element in pStringArray (an IntPtr). Since we know the memory is sequential, however, this gives us everything we need - we now know how many strings have been received (6 in this case, as indicated at 0x049E0890), and the location of the pointer to the first string (0x049E0894).
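
The byte counts above hold for a 32-bit process. In a 64-bit process each pointer is 8 bytes and is aligned to an 8-byte boundary, so the first pointer no longer sits 4 bytes after the start of the structure. A minimal sketch of asking the runtime for the offset instead of hardcoding it follows; TxtDataHeader is a hypothetical mirror of the DNS_TXT_DATA declaration above, and IntPtr.Add requires .NET 4 or later.

```csharp
using System;
using System.Runtime.InteropServices;

public static class TxtDataLayout
{
    // Hypothetical mirror of the DNS_TXT_DATA declaration shown earlier.
    [StructLayout(LayoutKind.Sequential)]
    public struct TxtDataHeader
    {
        public uint dwStringCount;
        [MarshalAs(UnmanagedType.ByValArray, SizeConst = 1)]
        public IntPtr[] pStringArray;
    }

    // Byte offset of the first pointer, including any alignment padding.
    public static int FirstPointerOffset()
    {
        return Marshal.OffsetOf(typeof(TxtDataHeader), "pStringArray").ToInt32();
    }

    // Address of the first pointer, given the address of the structure.
    public static IntPtr FirstPointerAddress(IntPtr txtData)
    {
        return IntPtr.Add(txtData, FirstPointerOffset());
    }
}
```

Marshal.OffsetOf reports the offset in the unmanaged layout, so the same code gives the right address in both 32-bit and 64-bit processes.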

With this info, we can marshal all the pointers into an array with a length of dwStringCount:
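// move to first pointer; OffsetOf includes alignment padding (4 bytes on 32-bit, 8 on 64-bit)
ptr = new IntPtr(ptr.ToInt64()
        + Marshal.OffsetOf(typeof (DNS_TXT_DATA), "pStringArray").ToInt64());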
var ptrs = new IntPtr[txtData.dwStringCount]; // dest array
Marshal.Copy(ptr, ptrs, 0, ptrs.Length);

And finally we iterate through those pointers, marshalling the string pointed at by each:
var strings = new List<string>();
for (var i = 0; i < ptrs.Length; ++i)
{
    strings.Add(Marshal.PtrToStringAnsi(ptrs[i]));
}
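
Which Marshal.PtrToString* method is right depends on which DnsQuery export the p/invoke declaration binds to: DnsQuery_A returns ANSI strings, while DnsQuery_W returns UTF-16 strings, which need PtrToStringUni. The declaration is not shown in this section, so treat the choice as an assumption to check against your own binding. The sketch below, using a hypothetical helper named DecodeBothWays, shows what happens when UTF-16 text is decoded with each method.

```csharp
using System;
using System.Runtime.InteropServices;

public static class StringDecodeDemo
{
    // Writes text as a null-terminated UTF-16 string in unmanaged memory,
    // then decodes the same memory two ways.
    public static string[] DecodeBothWays(string text)
    {
        IntPtr p = Marshal.StringToHGlobalUni(text);
        try
        {
            return new[]
            {
                Marshal.PtrToStringUni(p),  // reads the memory as UTF-16
                Marshal.PtrToStringAnsi(p)  // reads single bytes up to the first zero byte
            };
        }
        finally
        {
            Marshal.FreeHGlobal(p);
        }
    }
}
```

For plain ASCII text stored as UTF-16, the second byte of every character is zero, so the ANSI decode stops after the first character. A TXT record list that comes back as a series of one-character strings is a good hint that the wrong method is in use.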

While the example I've presented here is specific to DnsQuery, the general approach should be applicable to any situation where you need to marshal a data structure containing a variable-length array.
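
To try the general approach without calling a native API, the layout can be simulated in unmanaged memory. This is a sketch under stated assumptions rather than the program from this post: VariableArrayDemo, BuildBlock, ReadStrings and FreeBlock are hypothetical names, the block is laid out as a 32-bit count followed (at the next pointer-aligned offset) by that many pointers, and the strings are assumed to be UTF-16.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class VariableArrayDemo
{
    // Reads a block laid out as: 32-bit count, padding, then count string pointers.
    public static List<string> ReadStrings(IntPtr block)
    {
        int count = Marshal.ReadInt32(block);           // count field at offset 0
        IntPtr first = IntPtr.Add(block, IntPtr.Size);  // first pointer, after padding
        var ptrs = new IntPtr[count];
        Marshal.Copy(first, ptrs, 0, count);            // copy all pointers at once

        var result = new List<string>(count);
        foreach (IntPtr p in ptrs)
            result.Add(Marshal.PtrToStringUni(p));      // assumes UTF-16 strings
        return result;
    }

    // Builds such a block so ReadStrings can be exercised in isolation.
    public static IntPtr BuildBlock(string[] values)
    {
        IntPtr block = Marshal.AllocHGlobal(IntPtr.Size * (values.Length + 1));
        Marshal.WriteInt32(block, values.Length);
        for (int i = 0; i < values.Length; i++)
        {
            IntPtr s = Marshal.StringToHGlobalUni(values[i]);
            Marshal.WriteIntPtr(block, IntPtr.Size * (i + 1), s);
        }
        return block;
    }

    // Frees everything BuildBlock allocated.
    public static void FreeBlock(IntPtr block)
    {
        int count = Marshal.ReadInt32(block);
        for (int i = 0; i < count; i++)
            Marshal.FreeHGlobal(Marshal.ReadIntPtr(block, IntPtr.Size * (i + 1)));
        Marshal.FreeHGlobal(block);
    }
}
```

With a real DnsQuery result the memory belongs to the DNS API, so it should be released with DnsRecordListFree once the strings have been copied out, not with FreeHGlobal as in this simulation.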

Source code

6 comments:

  1. emergency cell phone chargers, 29 April 2009 02:27

    Interesting and useful info - thanks for informing all of us. Nate

  2. Any information on your sources though?

  3. Very helpful, thanks.

    I will note that this:
    ptr = new IntPtr(ptr.ToInt32() + sizeof (uint)); // move to first
    should be:
    ptr = new IntPtr(ptr.ToInt32() + IntPtr.Size); // move to first

    And PtrToStringAnsi should be PtrToStringAuto.

    This handles padding issues on 64-bit OSes.

  4. Thank you for the information, it is exactly what I was looking for.

  5. Thank you! Was just what I was looking for to solve a marshaling issue with my embedded XulRunner component.

  6. Indeed, it's the best way you ever did. This is what I'm looking for around the web because looking for a solution too.

    Elizabeth
    Webmaster ~"How to potty train a dog"~
