VARIANT, BSTR & SAFEARRAY C++ Tutorial

http://www.whooper.co.uk/excelvariants.htm

Introduction

Frustrated by the lack of good articles on the VARIANT data type, I decided to write this short introduction.

The VARIANT type is an all-purpose data type used by IDispatch::Invoke both to transmit and to receive parameters. It can hold numbers, strings, arrays, error values and IDispatch pointers. An XLL developer could consider the variant to be the Automation equivalent of the do-everything Excel XLOPER data type.

Structure

Here is a simplified version of the VARIANT definition. The full definition is in the OAIDL.H header file.

struct tagVARIANT {
    VARTYPE vt; // unsigned short integer type code
    WORD wReserved1;
    WORD wReserved2;
    WORD wReserved3;
    union {
    //  C++ Type      Union Name   Type Tag                Basic Type
    //  --------      ----------   --------                ----------

        long          lVal;        // VT_I4                ByVal Long
        unsigned char bVal;        // VT_UI1               ByVal Byte
        short         iVal;        // VT_I2                ByVal Integer
        double        dblVal;      // VT_R8                ByVal Double
        VARIANT_BOOL  boolVal;     // VT_BOOL              ByVal Boolean
        SCODE         scode;       // VT_ERROR
        DATE          date;        // VT_DATE              ByVal Date
        BSTR          bstrVal;     // VT_BSTR              ByVal String
        IUnknown      *punkVal;    // VT_UNKNOWN
        IDispatch     *pdispVal;   // VT_DISPATCH          ByVal Object
        SAFEARRAY     *parray;     // VT_ARRAY|*           ByVal array
        // A bunch of other types that don't matter here...
        VARIANT       *pvarVal;    // VT_BYREF|VT_VARIANT  ByRef Variant
        void          * byref;     // Generic ByRef       
    };
};

The variant type is 16 bytes in size in a 32-bit build (it grows to 24 bytes in a 64-bit build).

Here is an example of creating a variant of type double:

VARIANT v;
v.vt= VT_R8;
v.dblVal = 999.999;

This example shows a variant containing all the actual data inside its own structure. With the more complex string and SAFEARRAY types, the data is stored separately and the VARIANT structure just contains a pointer to it.
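
To make the "type tag plus union" idea concrete without needing the Windows headers, here is a minimal portable sketch. MiniVariant, MiniVarType, make_double and try_get_double are hypothetical names invented for this illustration; only the numeric tag values (0, 3 and 5) are taken from VT_EMPTY, VT_I4 and VT_R8.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for VARTYPE codes; the numeric values mirror
// VT_EMPTY, VT_I4 and VT_R8 from the Windows headers.
enum MiniVarType : std::uint16_t { MINI_EMPTY = 0, MINI_I4 = 3, MINI_R8 = 5 };

// A cut-down imitation of the VARIANT layout: a type tag, three reserved
// words, then a union holding whichever value the tag describes.
struct MiniVariant {
    std::uint16_t vt;
    std::uint16_t reserved1, reserved2, reserved3;
    union {
        std::int32_t lVal;   // valid when vt == MINI_I4
        double       dblVal; // valid when vt == MINI_R8
    };
};

// 8 bytes of header plus an 8-byte union.
static_assert(sizeof(MiniVariant) == 16, "header + union should be 16 bytes");

// Build a double-valued variant, as in the VT_R8 example above.
inline MiniVariant make_double(double d) {
    MiniVariant v{};
    v.vt = MINI_R8;
    v.dblVal = d;
    return v;
}

// Read the value back as a double, checking the tag before touching the union.
inline bool try_get_double(const MiniVariant& v, double& out) {
    if (v.vt == MINI_R8) { out = v.dblVal; return true; }
    if (v.vt == MINI_I4) { out = static_cast<double>(v.lVal); return true; }
    return false; // MINI_EMPTY or a tag this sketch does not know
}
```

The point of the sketch is the discipline it enforces: the union member that is read must always be the one named by vt.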

The Variant String Type - BSTR

Conventional C strings are arrays of type char terminated by a null. Each character is stored in a single 8-bit byte, using ASCII or one of the 8-bit code pages that Windows calls ANSI. The VARIANT string type, which is called BSTR, is more sophisticated. It has the following features:

(a) Each character is stored in two bytes, or 16 bits, in order to allow for larger non-Latin character sets. This wide character storage is called UNICODE, while the conventional 8 bit storage is known as ANSI.

(b) There is an unsigned long integer (32 bit) byte count at the start of the string array, so the maximum length of a BSTR string is 2^32/2=2147483648 characters. The string "Hello World" has eleven characters, so at the start of the array one would find a byte count of 22 (the terminating null is not counted). A byte count allows nulls to be stored inside the string if required.

(c) There is a null terminating character at the end of the array.

(d) The BSTR pointer (stored in the variant as v.bstrVal) does not point to the start of the string structure, but rather to the first character, i.e. four bytes past the actual start, where the length is recorded. The advantage of this is that, as long as there is no early null character inside the string, a BSTR pointer can be treated as an ordinary pointer to a C style null terminated string, albeit of 16 bit character size. Alternatively, a BSTR can be considered a buffer of a certain size in which conventional wide character null terminated strings can be stored.
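
The layout described in (a) to (d) can be imitated in portable C++. The sketch below is an illustration only: mini_alloc, mini_byte_len and mini_free are hypothetical helpers, they use malloc rather than the real BSTR allocator, and a pointer made this way must never be handed to the Sys* functions.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: allocate [4-byte byte count][16-bit chars][16-bit null]
// and return a pointer to the first character, mimicking the BSTR layout.
inline char16_t* mini_alloc(const char16_t* src, std::uint32_t char_count) {
    const std::uint32_t byte_count =
        char_count * static_cast<std::uint32_t>(sizeof(char16_t));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + byte_count + sizeof(char16_t)));
    if (!block) return nullptr;
    unsigned char* chars = block + sizeof(std::uint32_t);
    std::memcpy(block, &byte_count, sizeof byte_count);   // (b) length prefix
    std::memcpy(chars, src, byte_count);                   // (a) 16-bit characters
    std::memset(chars + byte_count, 0, sizeof(char16_t));  // (c) terminating null
    return reinterpret_cast<char16_t*>(chars);             // (d) points past the prefix
}

// Read the byte count stored four bytes before the character pointer.
inline std::uint32_t mini_byte_len(const char16_t* s) {
    std::uint32_t n = 0;
    std::memcpy(&n, reinterpret_cast<const unsigned char*>(s) - sizeof n, sizeof n);
    return n;
}

// Free by stepping back to the true start of the allocation. Calling
// free (or delete) on s itself would pass the wrong address.
inline void mini_free(char16_t* s) {
    if (s) std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}
```

For "Hello World" the prefix holds 22, and the character at index 11 is the terminating null.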

In C++ the types are:

ANSI                      UNICODE
----                      -------
char or CHAR              unsigned short or WCHAR
char* or LPSTR            LPWSTR
const char* or LPCSTR     LPCWSTR

When working with conventional C strings it's easy to create and delete char arrays to hold the data. However, with the more complex BSTR type you should use the SysAllocString and SysFreeString functions for allocating and freeing, and never call delete on the pointer (which doesn't even point to the start of the data).

Here are three examples of creating a string VARIANT:

//////////////////////////////////
VARIANT v;
v.vt=VT_BSTR;
v.bstrVal= SysAllocString(L"Hello World"); // L for UNICODE characters

VariantClear(&v); // Calls SysFreeString(v.bstrVal) for us then sets v.vt=VT_EMPTY

//////////////////////////////////
LPVARIANT pV; // VARIANT*
pV= new VARIANT;
pV->vt=VT_BSTR;
pV->bstrVal= SysAllocString(L"Hello World");

VariantClear(pV); // Delete BSTR
delete pV;       // Delete Variant

//////////////////////////////////
VARIANT v;
const char c[12]="Hello World"; // 11 single byte chars plus null
LPWSTR p= new WCHAR [ 12 ]; // 12 double byte characters
int i=MultiByteToWideChar(CP_ACP, 0, c, -1, p, 12); // ANSI->UNICODE, returns 0 on failure
v.vt=VT_BSTR;
v.bstrVal=SysAllocString(p);
delete [] p; // String has been copied out of the temporary buffer

VariantClear(&v);
//////////////////////////////////

The function VariantClear frees any memory attached to the variant (in this case the BSTR "Hello World") and sets v.vt=VT_EMPTY, meaning empty or no value. With our earlier example of a numeric variant of type VT_R8, there would of course be no need to call VariantClear, because there is no attached data: the numeric value is held inside the structure.

The function VariantInit can be used to initialize a variant by setting v.vt=VT_EMPTY. The function VariantCopy frees the memory associated with the destination argument and copies the source. VariantChangeType handles coercions between different types, including string to numeric and vice versa.

And here is some code that won't work:

VARIANT v;
const char c[12]="Hello World";
char* p= new char [ 4 + (11+1)*2 ];
char* pS=p+4;
MultiByteToWideChar(CP_ACP, 0, c, -1, (LPWSTR) pS, 12);
*( (int*) p)=22;
v.vt=VT_BSTR;
v.bstrVal=(LPWSTR) pS;
VariantClear(&v);

Although it produces something laid out like a valid variant, the string memory was not allocated by SysAllocString, so VariantClear() or SysFreeString() cannot correctly free the manually allocated memory.

The old C functions strlen() etc will not work with wide character strings, but the standard library contains alternatives such as wcslen(), wcscpy() and wcscat(). You can also use SysStringLen(), which returns byte_count/2; this equals wcslen() only if there is no null character stored inside the string.
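
The difference between a counted length and a null-terminated length can be seen with a portable analogy. std::wstring is also a counted string, so its size() plays the role of SysStringLen() here while wcslen() stops at the first null. This is only an analogy (a std::wstring is not a BSTR), and the three helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Length as a counted string sees it (the role SysStringLen plays for a BSTR).
inline std::size_t counted_length(const std::wstring& s) { return s.size(); }

// Length as a C string function sees it: stops at the first null.
inline std::size_t terminated_length(const std::wstring& s) {
    return std::wcslen(s.c_str());
}

// Build an 11-character counted string with a null where the space would be.
inline std::wstring make_embedded_null() {
    return std::wstring(L"Hello\0World", 11);
}
```

For the string built by make_embedded_null() the two lengths disagree, which is exactly the case where wcslen() must not be trusted on a BSTR.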

It's valid to pass a BSTR pointer equal to NULL in order to represent a null (empty) string.

You will sometimes see OLECHAR and OLESTR("Hello World") in place of WCHAR and L"Hello World". In modern Win32 code these are equivalent.

Further reading: BSTR API at MSDN; Variant API at MSDN; COleVariant at MSDN; Article about Strings in C++, Part I and Part II

SAFEARRAYs

A SAFEARRAY is a multi-dimensional, multi-type array. To pass a safe array in a variant v, set v.vt to VT_ARRAY combined (with bitwise OR) with the VT type of the elements, and set v.parray to point to the safe array.
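
Because VT_ARRAY is a flag bit OR-ed onto an element type code, the vt of an array variant can be taken apart with simple masks. The sketch below is portable: the constant names are hypothetical, while the numeric values are the ones used by the Windows VARENUM enumeration (VT_I4 = 3, VT_VARIANT = 12, VT_ARRAY = 0x2000, VT_BYREF = 0x4000, VT_TYPEMASK = 0xFFF).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; values copied from the Windows VARENUM enumeration.
constexpr std::uint16_t kVtI4       = 3;
constexpr std::uint16_t kVtVariant  = 12;
constexpr std::uint16_t kVtArray    = 0x2000;
constexpr std::uint16_t kVtByRef    = 0x4000;
constexpr std::uint16_t kVtTypeMask = 0x0FFF;

// True when the variant carries a SAFEARRAY pointer.
inline bool is_array(std::uint16_t vt) { return (vt & kVtArray) != 0; }

// True when the variant carries a reference rather than a value.
inline bool is_byref(std::uint16_t vt) { return (vt & kVtByRef) != 0; }

// Strip the flag bits to recover the element (or base) type code.
inline std::uint16_t element_type(std::uint16_t vt) {
    return static_cast<std::uint16_t>(vt & kVtTypeMask);
}
```

So an array of longs has vt equal to 0x2003, and element_type() recovers the 3 that stands for VT_I4.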

Here is the definition of a safe array:

// The SAFEARRAY structure
typedef struct tagSAFEARRAY
{

    USHORT cDims;                        // Number of dimensions
    USHORT fFeatures;                    // Misc flags inc element type
    ULONG cbElements;                    // Element size
    ULONG cLocks;                        // Lock count
    PVOID pvData;                        // Pointer to data
    SAFEARRAYBOUND rgsabound[ cDims];    // Array of dimension structures   

} SAFEARRAY;

// SAFEARRAY Dimension structure
typedef struct tagSAFEARRAYBOUND
{

    ULONG cElements;       // Number of elements in dimension
    LONG lLbound;          // Lower bound of dimension (usually 0)                   

} SAFEARRAYBOUND;

cDims gives the number of dimensions in the array. E.g. a one-dimensional array has cDims=1.

The bits of the fFeatures flag are described as follows. (Note: it looks complicated, but don't worry too much, as we will use an API to create our safe arrays and it will take care of this detail for us.)

#define FADF_HAVEVARTYPE 0x0080 // An array that has a general VT type.

// If set the two byte VT type is stored four bytes before
//  the start of the SAFEARRAY structure.
// Any VT type except VT_EMPTY and VT_NULL is OK

#define FADF_BSTR 0x0100 // An array of BSTRs.
#define FADF_UNKNOWN 0x0200 // An array of IUnknown*.
#define FADF_DISPATCH 0x0400 // An array of IDispatch*.
#define FADF_VARIANT 0x0800 // An array of VARIANTs.
#define FADF_RESERVED 0xF0E8 // Bits reserved for future use.

#define FADF_AUTO 0x0001 // Array is allocated on the stack.
#define FADF_STATIC 0x0002 // Array is statically allocated.
#define FADF_EMBEDDED 0x0004 // Array is embedded in a structure.
#define FADF_FIXEDSIZE 0x0010 // Array may not be resized
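
As a reading aid for the flag list, here is a small portable sketch that turns an fFeatures value into the names of the bits that are set. describe_features and FeatureName are hypothetical helpers written for this tutorial, not part of the safe array API; the bit values are the ones in the list (FADF_RESERVED is left out).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// One entry per documented flag bit.
struct FeatureName { std::uint16_t bit; const char* name; };

constexpr FeatureName kFeatureNames[] = {
    {0x0001, "FADF_AUTO"},     {0x0002, "FADF_STATIC"},
    {0x0004, "FADF_EMBEDDED"}, {0x0010, "FADF_FIXEDSIZE"},
    {0x0080, "FADF_HAVEVARTYPE"},
    {0x0100, "FADF_BSTR"},     {0x0200, "FADF_UNKNOWN"},
    {0x0400, "FADF_DISPATCH"}, {0x0800, "FADF_VARIANT"},
};

// Return the names of all set bits joined by " | ", or "(none)".
inline std::string describe_features(std::uint16_t fFeatures) {
    std::string out;
    for (const FeatureName& f : kFeatureNames) {
        if (fFeatures & f.bit) {
            if (!out.empty()) out += " | ";
            out += f.name;
        }
    }
    return out.empty() ? std::string("(none)") : out;
}
```

For example, a value of 0x0880 decodes to "FADF_HAVEVARTYPE | FADF_VARIANT".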


The SAFEARRAYBOUND structure describes each dimension. E.g., in VB, Dim x(10 To 100) would equate to lLbound=10 and cElements=91.
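
The arithmetic behind that example (an inclusive range of 10 to 100 holds 100 - 10 + 1 = 91 elements) can be sketched in portable C++. MiniBound mirrors the two fields of SAFEARRAYBOUND with fixed-width types; it and the three helper functions are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of SAFEARRAYBOUND using fixed-width types.
struct MiniBound {
    std::uint32_t cElements; // number of elements in the dimension
    std::int32_t  lLbound;   // lower bound of the dimension
};

// Convert VB-style inclusive bounds (lower To upper) into a MiniBound.
// Assumes upper >= lower.
inline MiniBound bound_from_range(std::int32_t lower, std::int32_t upper) {
    return MiniBound{static_cast<std::uint32_t>(upper - lower + 1), lower};
}

// Highest valid index for the dimension.
inline std::int32_t upper_index(const MiniBound& b) {
    return b.lLbound + static_cast<std::int32_t>(b.cElements) - 1;
}

// True when i is a legal index for the dimension.
inline bool index_in_range(const MiniBound& b, std::int32_t i) {
    return i >= b.lLbound && i <= upper_index(b);
}
```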

There is an API for manipulating safe arrays (the Array Manipulation API), and a variant safe array class derived from the variant structure (COleSafeArray).

Here is an example of creating a variant containing a one dimensional safe array of longs with size 100 and lower bound 0 using the API.

VARIANT v;
SAFEARRAYBOUND rgb [] = { 100, 0 }; // cElements=100, lLbound=0
v.vt = VT_ARRAY | VT_I4; // Array of longs
// Now call SafeArrayCreate with type, dimension,
//  and pointer to vector of dimension descriptors
v.parray = SafeArrayCreate(VT_I4, 1, rgb);

long *rgelems; // Pointer to long elements
// Now lock the array for editing and get a pointer
// to the raw elements
SafeArrayAccessData(v.parray, (void**)&rgelems);
// Loop through setting the elements
for (int c = 0; c < 100; c++)
    rgelems[c] = c;
// Release the lock on the array
// (which also invalidates the element pointer)
SafeArrayUnaccessData(v.parray);

// Clear the variant, releasing the array memory
VariantClear(&v);


And here is another example in which we create a 10 row by 15 column array of variant elements:

VARIANT v;
v.vt = VT_ARRAY | VT_VARIANT;
SAFEARRAYBOUND sab[2];
sab[0].lLbound = 1; sab[0].cElements = 10;
sab[1].lLbound = 1; sab[1].cElements = 15;
v.parray = SafeArrayCreate(VT_VARIANT, 2, sab);
// Fill with some values...
for(int i=1; i<=10; i++) {
    for(int j=1; j<=15; j++)
    {
        // Create entry value for (i,j)
        VARIANT tmp;
        tmp.vt = VT_I4;
        tmp.lVal = i*j;
        // Add to safearray (SafeArrayPutElement stores a copy of tmp)
        long indices[] = {i,j};
        SafeArrayPutElement(v.parray, indices, (void *)&tmp);
    }
}

VariantClear(&v);

This code uses the slower technique of multiple calls to SafeArrayPutElement instead of locking the array and accessing it directly as in our first example. Another useful function in the API is SafeArrayGetElement. It is also possible, for performance purposes, to avoid the API and create the array by hand, but you will have to get to grips with fFeatures etc.
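
If you lock a multi-dimensional array and index the raw data yourself, you need to know where element (i,j) lives. The portable sketch below computes that position under the assumption that the data is laid out in column-major order (the first index varies fastest), which is how Automation arrays are generally described; check this against your own array before relying on it. DimBound and element_offset are hypothetical names, not part of the API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of SAFEARRAYBOUND.
struct DimBound { std::uint32_t cElements; std::int32_t lLbound; };

// Offset (in elements) of indices[] within the data block, assuming
// column-major order: the first (leftmost) index varies fastest.
// Returns false if any index is out of range or the counts disagree.
inline bool element_offset(const std::vector<DimBound>& bounds,
                           const std::vector<std::int32_t>& indices,
                           std::size_t& offset) {
    if (bounds.size() != indices.size()) return false;
    std::size_t stride = 1;
    std::size_t result = 0;
    for (std::size_t d = 0; d < bounds.size(); ++d) {
        // Position relative to the lower bound of this dimension.
        const std::int64_t rel =
            static_cast<std::int64_t>(indices[d]) - bounds[d].lLbound;
        if (rel < 0 || rel >= static_cast<std::int64_t>(bounds[d].cElements))
            return false;
        result += static_cast<std::size_t>(rel) * stride;
        stride *= bounds[d].cElements;
    }
    offset = result;
    return true;
}
```

With the 10 by 15 array above (both lower bounds 1), element (1,1) is at offset 0, (2,1) at offset 1, (1,2) at offset 10 and (10,15) at offset 149.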

Conclusion

This concludes my short tutorial on the Variant Data Type. I hope you found it useful!



[MSDN]

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/automat/htm/chap6_7zdz.asp

VARIANT and VARIANTARG

Use VARIANTARG to describe arguments passed within DISPPARAMS, and VARIANT to specify variant data that cannot be passed by reference. When a variant refers to another variant by using the VT_VARIANT | VT_BYREF vartype, the variant being referred to cannot also be of type VT_VARIANT | VT_BYREF. VARIANTs can be passed by value, even if VARIANTARGs cannot. The definition of VARIANT is in the OAIDL.H automation header file.

To simplify extracting values from VARIANTARGs, Automation provides a set of functions for manipulating this type. Use of these functions is strongly recommended to ensure that applications apply consistent coercion rules.

Value Description
VT_EMPTY No value was specified. If an optional argument to an Automation method is left blank, do not pass a VARIANT of type VT_EMPTY. Instead, pass a VARIANT of type VT_ERROR with a value of DISP_E_PARAMNOTFOUND.
VT_EMPTY | VT_BYREF Not valid.
VT_UI1 An unsigned 1-byte character is stored in bVal.
VT_UI1 | VT_BYREF A reference to an unsigned 1-byte character was passed. A pointer to the value is in pbVal.
VT_UI2 An unsigned 2-byte integer value is stored in uiVal.
VT_UI2 | VT_BYREF A reference to an unsigned 2-byte integer was passed. A pointer to the value is in puiVal.
VT_UI4 An unsigned 4-byte integer value is stored in ulVal.
VT_UI4 | VT_BYREF A reference to an unsigned 4-byte integer was passed. A pointer to the value is in pulVal.
VT_UI8 An unsigned 8-byte integer value is stored in ullVal.
VT_UI8 | VT_BYREF A reference to an unsigned 8-byte integer was passed. A pointer to the value is in pullVal.
VT_UINT An unsigned integer value is stored in uintVal.
VT_UINT | VT_BYREF A reference to an unsigned integer value was passed. A pointer to the value is in puintVal.
VT_INT An integer value is stored in intVal.
VT_INT | VT_BYREF A reference to an integer value was passed. A pointer to the value is in pintVal.
VT_I1 A 1-byte character value is stored in cVal.
VT_I1 | VT_BYREF A reference to a 1-byte character was passed. A pointer to the value is in pcVal.
VT_I2 A 2-byte integer value is stored in iVal.
VT_I2 | VT_BYREF A reference to a 2-byte integer was passed. A pointer to the value is in piVal.
VT_I4 A 4-byte integer value is stored in lVal.
VT_I4 | VT_BYREF A reference to a 4-byte integer was passed. A pointer to the value is in plVal.
VT_I8 An 8-byte integer value is stored in llVal.
VT_I8 | VT_BYREF A reference to an 8-byte integer was passed. A pointer to the value is in pllVal.
VT_R4 An IEEE 4-byte real value is stored in fltVal.
VT_R4 | VT_BYREF A reference to an IEEE 4-byte real value was passed. A pointer to the value is in pfltVal.
VT_R8 An 8-byte IEEE real value is stored in dblVal.
VT_R8 | VT_BYREF A reference to an 8-byte IEEE real value was passed. A pointer to its value is in pdblVal.
VT_CY A currency value was specified. A currency number is stored as 64-bit (8-byte), two's complement integer, scaled by 10,000 to give a fixed-point number with 15 digits to the left of the decimal point and 4 digits to the right. The value is in cyVal.
VT_CY | VT_BYREF A reference to a currency value was passed. A pointer to the value is in pcyVal.
VT_BSTR A string was passed; it is stored in bstrVal. This pointer must be obtained and freed by the BSTR functions, which are described in Conversion and Manipulation Functions.
VT_BSTR | VT_BYREF A reference to a string was passed. A BSTR* that points to a BSTR is in pbstrVal. The referenced pointer must be obtained or freed by the BSTR functions.
VT_DECIMAL Decimal variables are stored as 96-bit (12-byte) unsigned integers scaled by a variable power of 10. VT_DECIMAL uses the entire 16 bytes of the Variant.
VT_DECIMAL | VT_BYREF A reference to a decimal value was passed. A pointer to the value is in pdecVal.
VT_NULL A propagating null value was specified. (This should not be confused with the null pointer.) The null value is used for tri-state logic, as with SQL.
VT_NULL | VT_BYREF Not valid.
VT_ERROR An SCODE was specified. The type of the error is specified in scode. Generally, operations on error values should raise an exception or propagate the error to the return value, as appropriate.
VT_ERROR | VT_BYREF A reference to an SCODE was passed. A pointer to the value is in pscode.
VT_BOOL A 16 bit Boolean (True/False) value was specified. A value of 0xFFFF (all bits 1) indicates True; a value of 0 (all bits 0) indicates False. No other values are valid.
VT_BOOL | VT_BYREF A reference to a Boolean value. A pointer to the Boolean value is in pbool.
VT_DATE A value denoting a date and time was specified. Dates are represented as double-precision numbers, where midnight, January 1, 1900 is 2.0, January 2, 1900 is 3.0, and so on. The value is passed in date. This is the same numbering system used by most spreadsheet programs, although some specify incorrectly that February 29, 1900 existed, and thus set January 1, 1900 to 1.0. The date can be converted to and from an MS-DOS representation using VariantTimeToDosDateTime, which is discussed in Conversion and Manipulation Functions.
VT_DATE | VT_BYREF A reference to a date was passed. A pointer to the value is in pdate.
VT_DISPATCH A pointer to an object was specified. The pointer is in pdispVal. This object is known only to implement IDispatch. The object can be queried as to whether it supports any other desired interface by calling QueryInterface on the object. Objects that do not implement IDispatch should be passed using VT_UNKNOWN.
VT_DISPATCH | VT_BYREF A pointer to a pointer to an object was specified. The pointer to the object is stored in the location referred to by ppdispVal.
VT_VARIANT Invalid. VARIANTARGs must be passed by reference.
VT_VARIANT | VT_BYREF A pointer to another VARIANTARG is passed in pvarVal. This referenced VARIANTARG, pvarVal, cannot be another VT_VARIANT|VT_BYREF. This value can be used to support languages that allow functions to change the types of variables passed by reference.
VT_UNKNOWN A pointer to an object that implements the IUnknown interface is passed in punkVal.
VT_UNKNOWN | VT_BYREF A pointer to the IUnknown interface is passed in ppunkVal. The pointer to the interface is stored in the location referred to by ppunkVal.
VT_ARRAY | <anything> An array of data type <anything> was passed. (VT_EMPTY and VT_NULL are invalid types to combine with VT_ARRAY.) The pointer in pparray points to an array descriptor, which describes the dimensions, size, and in-memory location of the array. The array descriptor is never accessed directly, but instead is read and modified using the functions described in Conversion and Manipulation Functions.
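
Three of the encodings in the table (VT_DATE, VT_CY and VT_BOOL) are plain arithmetic and can be sketched in portable C++. The code below is an illustration under stated assumptions, not the Automation conversion functions: all names are hypothetical, the date routine handles only non-negative values and uses the proleptic Gregorian calendar, and it relies on 2.0 being January 1, 1900, so that day 0 falls 25569 days before January 1, 1970.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct CivilDate { int year; unsigned month; unsigned day; };

// Convert a day count relative to 1970-01-01 into a calendar date
// using the widely used era-based "civil from days" arithmetic.
inline CivilDate civil_from_unix_days(std::int64_t z) {
    z += 719468;                                           // shift epoch to 0000-03-01
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const unsigned doe = static_cast<unsigned>(z - era * 146097);          // day of 400-year era
    const unsigned yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const std::int64_t y = static_cast<std::int64_t>(yoe) + era * 400;
    const unsigned doy = doe - (365 * yoe + yoe / 4 - yoe / 100);          // day of March-based year
    const unsigned mp = (5 * doy + 2) / 153;                               // March-based month
    const unsigned d = doy - (153 * mp + 2) / 5 + 1;
    const unsigned m = mp < 10 ? mp + 3 : mp - 9;
    return CivilDate{static_cast<int>(y + (m <= 2 ? 1 : 0)), m, d};
}

// VT_DATE: the whole part counts days; the fraction (time of day) is ignored here.
inline CivilDate date_from_variant_date(double date) {
    const std::int64_t days = static_cast<std::int64_t>(std::floor(date));
    return civil_from_unix_days(days - 25569);
}

// VT_CY: a 64-bit integer scaled by 10,000.
struct CyParts { std::int64_t units; std::int64_t ten_thousandths; };
inline CyParts split_currency(std::int64_t cy) {
    return CyParts{cy / 10000, cy % 10000};
}

// VT_BOOL: all 16 bits set for True, all clear for False.
inline std::int16_t to_variant_bool(bool b) {
    return b ? static_cast<std::int16_t>(-1) : static_cast<std::int16_t>(0);
}
```

A date value of 3.5 therefore means midday on January 2, 1900, a currency value of 123400 means 12.3400, and True is stored as the bit pattern 0xFFFF.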