Back from the brink…
<sigh> After the attack of the incompetent hosts (1&1). Previous content should be restored shortly.
One of the things I remember vividly from various software development books is the fact that it is easier to write code than read it. This tends to mean that developers are always tempted to re-write applications in order to understand them; it’s notionally “easier” than trying to grok the code by inspection only.
The major problem with this approach lies in what you don’t see in the code: all the subtle workarounds and accumulated knowledge that exist almost between the lines. Unless the code is extremely well commented, and the developer actually takes the time to read and understand the comments, any rewriting, or even simply “cleaning up” the code, will almost certainly miss these subtleties.
You can see this in practice in a lot of companies when new developers arrive and, tutting at the weirdnesses in the existing codebase, set about rewriting things in their own image. This isn’t just an ego thing, it’s a fairly natural response to an unfamiliar codebase.
The problem is, despite being aware of all this, I keep doing it myself.
The latest example was taking an (admittedly nasty) combination of shell scripts, JavaScript and C++ and rewriting some of it completely while refactoring the rest. Although there were many up-sides to reimplementing the shell and JavaScript parts of it in .NET - better error handling and messages, richer functionality and performance improvements - it has taken much longer than expected. Much, much longer than I expected. The problem was that, although it was me who wrote most of the code in the first place, even I had forgotten some of the subtleties of the way it worked.
Maybe next time I’ll heed my own advice.
There’s been lots of talk about dynamic languages recently, what with the ubiquity of JavaScript, the rise and rise of Python, Ruby etc., and now Microsoft jumping into the fray with the DLR - the Dynamic Language Runtime - to make creating a dynamic language almost as easy as using one.
But sometimes we forget the old boys, sitting quietly in the corner, like poor, aged old VBA. Once the poster boy for scripting, and still prevalent in even the most recent versions of the Microsoft Office apps, it is now really just biding its time, waiting for its inevitable replacement by some .NET-based upstart.
One of the things people often talk about with JavaScript is that you can dynamically add properties to an object at runtime, e.g.:
var circ = new Object();
circ.hittest = function(x,y) { return false; };
But, people often forget that you can do a similar thing with VBA in Excel. Yes, VBA.
For instance, add a function to a worksheet object:
Sub NewMethod()
    MsgBox "I got called!"
End Sub
Now it’s possible to call this method as if it existed directly on the Worksheet class! For example, with some code in the Workbook like this:
Sub CallIt()
    Worksheets("Sheet1").NewMethod
End Sub
So there we go, VBA was dynamic (thanks to IDispatch) before it was even cool…
Well, I’ve finally got round to getting my hands on a Mac - a shiny 17″ MacBook Pro, no less - after last being heavily involved in Mac development about a decade ago when it was all about the somewhat basic IDE (at least in today’s terms) MPW - the Macintosh Programmer’s Workshop. Check it out if you’re doing any retro Mac OS 7 coding, and aren’t scared of an environment without IntelliSense! It was a blessed relief when Metrowerks came along with CodeWarrior and brought Mac development out of the dark ages. Gutted; I’ve just discovered that they actually discontinued it in July 2005. Still, having just run up Xcode, I can safely say it’s living on in spirit.
More by luck than judgment I picked up the machine a couple of days after the launch of Leopard, so got that included, and it’s a good job because one of the killer features for me was Boot Camp, and this comes as standard with the new OS. Don’t worry, I won’t be going into excruciating detail about the other new features - but mostly because I haven’t used previous versions of the OS enough to be able to make a worthwhile comparison.
Despite that, I still think that it’s a stark reminder of how good an OS can be: a solid foundation combined with a good, consistent, cutting-edge UI.
But I’m off to boot back into Windows for a quick bit of HL2 EP2…
Today I was at the Oxford e-Research Centre for an nVidia CUDA workshop. CUDA is their new API for facilitating scientific computing on GPUs (graphics processing units).
For a while now there’s been some interest within the computer science research community in the possible application of GPUs to general purpose computing tasks. Well, it’s more accurate to call them scientific computing tasks, I guess, as these are typically embarrassingly parallel numerical applications like Monte Carlo simulations.
nVidia are targeting CUDA directly at potential users in the financial, oil & gas and aerospace industries. These are areas where they’ve never had a significant presence, having instead focused on the consumer gaming and graphics workstation markets, which have served them very well. But now they’re looking to capitalise on the diminishing gains in CPU performance in recent processor generations by helping people to squeeze maximum bang-for-the-buck out of the highly-threaded, parallel, multi-processor nature of commodity GPUs.
The presentations were very interesting. They ranged from almost marketing-level discussions of the product road maps etc. to real nuts-and-bolts optimisation discussions. Mark Harris demonstrated successive iterations of a reduction algorithm that eventually approached the limit of memory bandwidth (78 GB/s), albeit with some pretty nasty loop unrolling and templating!
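For anyone who hasn’t met the term, a reduction collapses an array to a single value (a sum, say) by combining elements in pairs over several passes. What follows is only a CPU sketch of that access pattern in plain C++, not the kernel from the talk; the function name `tree_reduce_sum` is made up for illustration, and none of the unrolling or templating mentioned above is attempted.

```cpp
#include <cstddef>
#include <vector>

// CPU sketch of a tree-style (pairwise) sum reduction.
// Each pass folds the upper half of the active range onto the lower half,
// so the number of active elements roughly halves every time. On a GPU,
// each iteration of the inner loop would be the work of one thread.
float tree_reduce_sum(std::vector<float> data) {
    if (data.empty()) return 0.0f;
    std::size_t n = data.size();
    while (n > 1) {
        std::size_t half = (n + 1) / 2;   // elements that survive this pass
        for (std::size_t i = 0; i + half < n; ++i) {
            data[i] += data[i + half];    // independent for every i
        }
        n = half;
    }
    return data[0];
}
```

Because the additions within a pass don’t depend on one another, the pass count grows with the logarithm of the input size, which is what makes the pattern attractive for highly parallel hardware.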
From my perspective the biggest drawback of the current hardware and API is that it only supports single precision floating point. Unfortunately everything in my world uses double precision maths; and although it’s possible to convert on entry/exit from the GPU API, this adds significant overhead. Of course, it should be possible to reduce the numerical range that the algorithm has to deal with in order to avoid the need for double precision at all, but this is a bit more re-engineering than I can justify. Even the on-chip double implementation will suffer from a quite significant slow-down compared to the single precision version - but even this will be an order of magnitude faster than non-GPU code, so this shouldn’t be too serious.
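To give a feel for why single precision can be a problem for accumulation-heavy numerical code, here is a small CPU-only C++ sketch that adds 0.1 repeatedly in either `float` or `double` and reports how far the total drifts from the exact answer. The helper names `repeated_sum` and `accumulation_error` are hypothetical, and it says nothing about GPU performance; it only illustrates the precision gap.

```cpp
#include <cmath>
#include <cstddef>

// Add `value` to a running total n times, using floating-point type T.
template <typename T>
T repeated_sum(T value, std::size_t n) {
    T total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += value;
    }
    return total;
}

// Absolute error of summing 0.1 n times in type T, measured against n / 10.
template <typename T>
double accumulation_error(std::size_t n) {
    const double exact = static_cast<double>(n) / 10.0;
    const double got =
        static_cast<double>(repeated_sum<T>(static_cast<T>(0.1), n));
    return std::fabs(got - exact);
}
```

Comparing `accumulation_error<float>(1000000)` with `accumulation_error<double>(1000000)` should show the single-precision total drifting much further from the exact value than the double-precision one.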
Another interesting aspect of the technology is the use of an “intermediate” form of assembled code: PTX files. These are generated from the C-like .cu source files, and are then turned into card-specific machine code by the driver on the target machine. This allows nVidia a degree of freedom to change the on-chip architecture, instruction set etc. without breaking existing applications.
If you’re interested in keeping in touch with news of the various GPGPU users in academia and industry, you can take a look at www.gpgpu.org, which was started by Mark Harris, who was one of the presenters today.
Using WinDbg it’s possible to get a dump of each XLL call that is made by Excel as it calculates. If you’re using Excel 2003, create the following breakpoint, which dumps the symbol at eax+4 (the entry point that is about to be called) and then continues:
bu EXCEL!MdCallBack+0xa880 "dds @eax+0x4 L1; g"
You’ll need to adjust the offset for other versions of Excel - and I haven’t tried it yet with 2007. Assuming you’ve got symbols available for the XLLs being called, you’ll get something like this:
0013bc50 15109730 addin1!addin1_function1
0013bc50 12c0e918 addin2!addin2_anotherfunctions
This technique can be useful when troubleshooting - to identify the last addin call being made before a failure perhaps - and is also quite interesting to just watch and see the pattern in which your XLL UDFs get called.
Thought you might like to know about a few Visual Studio tools and add-ins that I regularly use, and find very useful:
CopySourceAsHtml is a fantastic little tool that lets you copy formatted text from the VS editor as HTML fragments. You can’t beat syntax highlighting to ease code readability, and this plug-in is great for adding code to blog entries, etc. Written by Colin Coller and available from http://www.jtleigh.com/CopySourceAsHtml.
C++ has missed out on a few of the productivity boosters available for the managed languages in VS2005; one of the most annoying ones is code snippets - mainly because its omission seems so arbitrary; it’s not as if it needs reflection or some other CLR-specific feature. Anyway, it turns out that it is available, but as part of the “PowerToys” package, available from http://msdn2.microsoft.com/en-us/vstudio/bb190754.aspx. As well as being good for day-to-day productivity, code snippets are also a really powerful demo tool, allowing you to give the appearance that you’re writing code “live” when it’s really just glorified copy and pasting.
I do quite a lot of work with XML, and this little utility proves very useful. I used to use XMLSpy, but it was really too overblown when all I wanted to do was use its ability to quickly run an XPath query and see what it returns. This add-in allows me to do that directly in Visual Studio… lovely! It was put together by Don Demsak, and is available from http://donxml.com/allthingstechie/archive/2006/07/07/2792.aspx.
When moving code from “traditional” registry-based COM to shiny new side-by-side, registration-free COM, there are a few places where you might need an analog for things like looking up a DLL name from a prog ID. E.g. in the registry-based world, you can go from a prog ID for a class to its physical DLL filename by calling CLSIDFromProgID and then looking up the resulting CLSID in the registry.
Now obviously, if you attempt this on a machine where the components are only being used via SxS (i.e. have never been registered) the CLSIDFromProgID step will work - OLE32.DLL, which implements this function, is aware of SxS - but the subsequent steps won’t because the information isn’t in the registry.
I guess this is really breaking because we’re taking advantage of our knowledge of COM internals, rather than going via the official APIs. As far as I know, the “correct” way of doing the progid->(type library) DLL mapping is via IProvideClassInfo, but that relies on the COM objects supporting this interface, and unfortunately we have, err, several - hundred - that don’t.
So, I set about looking to see if there was a way to get this information using the activation context APIs. All the information is there in the manifest, so how do we get it - without something nasty like querying the manifest XML directly?
There are only a handful of functions in the ActCtx API, and one of them - FindActCtxSectionGuid - looks relevant. By calling this with the ACTIVATION_CONTEXT_SECTION_COM_SERVER_REDIRECTION flag, it looks like we can get some data from the manifest based on our COM object CLSID.
Here’s the problem. The returned data from this function is an ACTCTX_SECTION_KEYED_DATA structure, and as far as I can tell this is essentially an opaque blob, with a couple of length indicators. I couldn’t find any documentation about what the lpData member was supposed to point to (if you know any better, please let me know)!
I decided to break out WinDbg, and see what OLE32!CLSIDFromProgID did, as I assumed that this must be doing something similar. It was! In fact, it was calling FindActCtxSectionString to map the prog ID to a CLSID, then using this in a call to FindActCtxSectionGuid. After a bit of disassembly, and some staring at the memory window in Visual Studio, I got a good enough idea of the contents to be able to figure out how the data referenced the filename:
typedef struct tagSECTION_DATA
{
    DWORD dwSize;                   // 0x78 (120) structure size?
    DWORD _2;                       // 0x00
    DWORD dwSectionType;            // 0x04 (ACTIVATION_CONTEXT_SECTION_COM_SERVER_REDIRECTION)
    GUID  clsid;                    // CLSID of class?
    GUID  _5;                       // Some other GUID
    GUID  _6;                       // CLSID of class again…?
    GUID  _7;                       // NULL
    DWORD dwFileNameLength;         // file name size in bytes
    DWORD dwFileNameSectionOffset;  // file name offset into data.lpSectionBase
    DWORD dwProgIDLength;           // progid size in bytes
    DWORD dwProgIDOffset;           // offset from start of this structure to progid (0x78)
    BYTE  _8[28];                   // Unknown
    // Prog ID string follows
} SECTION_DATA;
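One cheap sanity check on a reverse-engineered layout like this is to have the compiler confirm that the fields really do add up to the observed 0x78 bytes. The sketch below does that with portable stand-ins for the Windows types (an assumption: a 32-bit DWORD and a 16-byte GUID, as in windows.h), so it can be compiled anywhere; in real code you would include the Windows headers and apply the same static_asserts to the actual struct.

```cpp
#include <cstddef>
#include <cstdint>

// Portable stand-ins for the Windows typedefs (assumed sizes: DWORD = 4 bytes,
// GUID = 16 bytes). Replace with <windows.h> in real code.
using DWORD = std::uint32_t;
using BYTE  = std::uint8_t;
struct GUID {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

// Same field order as the reverse-engineered structure above.
struct SECTION_DATA {
    DWORD dwSize;
    DWORD _2;
    DWORD dwSectionType;
    GUID  clsid;
    GUID  _5;
    GUID  _6;
    GUID  _7;
    DWORD dwFileNameLength;
    DWORD dwFileNameSectionOffset;
    DWORD dwProgIDLength;
    DWORD dwProgIDOffset;
    BYTE  _8[28];
};

// Compile-time checks: the build fails if the guessed layout stops matching
// the observed size (0x78) or the offsets implied by it.
static_assert(sizeof(GUID) == 16, "GUID stand-in must be 16 bytes");
static_assert(sizeof(SECTION_DATA) == 0x78, "layout should total 120 bytes");
static_assert(offsetof(SECTION_DATA, dwFileNameLength) == 76,
              "file name fields should follow the four GUIDs");
```

The prog ID offset of 0x78 noted in the comments then falls exactly at the end of the structure, which fits the observation that the prog ID string follows it.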
So now you can cast the data member to this structure and easily extract the filename, voila!
ACTCTX_SECTION_KEYED_DATA data = {};
data.cbSize = sizeof(data);
// guid is the CLSID of the COM class being looked up
if (!FindActCtxSectionGuid(FIND_ACTCTX_SECTION_KEY_RETURN_HACTCTX,
                           NULL,
                           ACTIVATION_CONTEXT_SECTION_COM_SERVER_REDIRECTION,
                           &guid,
                           &data))
{
    return GetLastError();
}
if (data.ulDataFormatVersion == 1) // Fail-safe in case internal format changes…?
{
    // Cast returned data to our structure type
    SECTION_DATA *pdata = static_cast<SECTION_DATA *>(data.lpData);
    // DLL filename can be found in the section base data at specified offset
    std::wstring filename(
        reinterpret_cast<wchar_t *>(
            static_cast<BYTE *>(data.lpSectionBase) + pdata->dwFileNameSectionOffset));
}
// RETURN_HACTCTX hands back a reference to the activation context; release it
ReleaseActCtx(data.hActCtx);
So the SxS compliant version of the CLSID to filename/typelibrary is:
I’m sure this will break horribly when they change the internal format of the activation context data (in fact, it’s probably different now between XP and Vista - many things in the SxS world are), but hopefully we can use the ulDataFormatVersion to do some basic sanity checking.
Now if only some of those useful looking functions in sxs.dll were documented…
I’ve recently been trying to work out what was going on with some ASP.NET/COM interop issues at work. It turned out to be due to the differences between OpenProcessToken and OpenThreadToken.
It might sound obvious in hindsight, but I haven’t worked a great deal with the security model within Windows, so I’m not particularly au fait with it all. Plus, given that it was a DLL, called from COM, called from IIS, it wasn’t particularly easy to debug.
The system I work on uses an access control mechanism whereby users have to be members of a certain NT domain group in order to use it. At certain points in the process, the security DLL checks for this by getting the current token, using GetTokenInformation and iterating over the group SIDs it contains (it was written before CheckTokenMembership was available).
The trouble is, when running under IIS and ASP.NET, the check was always failing. Even though the appropriate user identity was being passed through, by impersonation, from the client, it wasn’t working. Hmmm.
It turned out that the validation code was using OpenProcessToken, but of course, the impersonation happens at thread level. You can impersonate as much as you want, but the process access token always contains the original token (for the Network user in my case), not the one with the groups for the user you’re interested in.
By changing the code to use OpenThreadToken and passing FALSE for the OpenAsSelf parameter, you can get the properly impersonated access token. Ahhh.
One of my aims at work is to simplify the development and testing regime as much as possible, and this mostly consists of making sure we’re using the most appropriate tools and technologies wherever possible. In the context of correctness checking this involves determining whether the time we spend installing, configuring and chasing down false positives in third-party tools such as Purify outweighs their benefits.
As a rule I’d always favour using built-in, vendor provided hooks rather than a bolt-on product, and as such I was interested in what we could do with the IMallocSpy functionality in the COM runtime, as our group’s software is almost all COM based.
It turned up some interesting things…
The first was the behaviour of CoRevokeMallocSpy. Stupidly I originally neglected to check the return value, and then when I did, found it was returning E_ACCESSDENIED. It turns out that this means there are still allocations that occurred when the spy was active that haven’t been freed. Given that my use of the spy was to check for leaked allocations in the first place, this was a bit annoying; at least, it meant I couldn’t overload the lifetime of my spy object, to, say, dump a list of leaks when it was destroyed. If there were any leaks it never got destroyed!
The next problem was that it seemed whenever I had an app that called the apparently innocuous CLSIDFromProgID, it would leak memory. For example:
addr: 0x0015eb20: size: 0x2c (44)
contents:
c0 31 fd 76 30 c8 15 00 01 00 00 00 01 00 12 00 .1.v0...........
28 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
01 f0 ad ba bc ec 15 00 94 ec 15 00 ............
[0x77583315] CSpyMalloc_Alloc+0x49
[0x774fd073] CoTaskMemAlloc+0x13
[0x76fd18fb] operator new+0xe
[0x76fd5c6b] StgDatabase::InitClbFile+0x2e
[0x76fdc190] StgDatabase::InitDatabase+0x623c
[0x770076fa] OpenComponentLibraryEx+0x3e
[0x77005306] OpenComponentLibraryTS+0x1a
[0x76fd954d] _RegGetICR+0x761f
[0x76fd1f24] CoRegGetICR+0xffff877d
[0x76fd6a20] IsSelfRegProgID+0x65
[0x76fd80f9] CComCLBCatalog::GetClassInfoFromProgId+0x1783
[0x77518a6d] CComCatalog::GetClassInfoFromProgId+0x100
[0x77518964] CComCatalog::GetClassInfoFromProgId+0x1e
[0x775188a0] CLSIDFromProgID+0x76
[0x004120f5] wmain+0xa5
[0x004173a6] __tmainCRTStartup+0x1a6
[0x004171ed] wmainCRTStartup+0xd
[0x7c816fd7] BaseProcessStart+0x23
Looking at the stack trace, it seemed there was some kind of internal caching going on, but what was confusing me was that I was under the impression that all memory allocated by the COM runtime would be freed by the time CoUninitialize was done. After all, you can’t make any further COM calls after this point. If you don’t believe me, just try using a static CComPtr, and see what happens in DllMainCRTStartup when your app exits.
After a bit of poking about with WinDbg (thank goodness we get decent symbols for the OS now), I could see that some kind of “database” was being created within CLBCATQ.DLL:
0:000> k
ChildEBP RetAddr
0012faf4 770076fa CLBCATQ!StgDatabase::InitDatabase
0012fb18 77005306 CLBCATQ!OpenComponentLibraryEx+0x3e
0012fb34 76fd954d CLBCATQ!OpenComponentLibraryTS+0x1a
0012fdd0 76fd1f24 CLBCATQ!_RegGetICR+0x205
0012fdf0 76fd6a20 CLBCATQ!CoRegGetICR+0x29
0012fe48 76fd80f9 CLBCATQ!IsSelfRegProgID+0x6b
0012fe88 77518a6d CLBCATQ!CComCLBCatalog::GetClassInfoFromProgId+0x51
0012fec0 77518964 ole32!CComCatalog::GetClassInfoFromProgId+0x149
0012fee0 775188a0 ole32!CComCatalog::GetClassInfoFromProgId+0x1e
0012ff0c 00401340 ole32!CLSIDFromProgID+0x95
0012ff7c 00401cae testleakcheck!wmain+0x60
0012ffc0 7c816fd7 testleakcheck!__tmainCRTStartup+0x10f [f:\rtm\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 583]
0012fff0 00000000 kernel32!BaseProcessStart+0x23
I could see by looking at the exports that there was a function called CoRegCleanup in CLBCATQ.DLL that looked like it could be used to free up this storage before I did my leak checking. Calling it by dynamically getting the function pointer using GetProcAddress did make a difference, but there was still some memory not freed, and I didn’t feel comfortable using an undocumented function in this way.
Then I remembered the magical OANOCACHE environment variable.
This is used to tell the COM runtime not to cache memory used for BSTRs, VARIANTs, SAFEARRAYs, or anything else allocated using CoTaskMemAlloc. So, I set the variable, re-ran the test and voila! The apparent leaks disappeared. There must be something in CLBCATQ that detects the environment variable and disables its internal cache.
So the moral of the story is: if you’re attempting to reliably track memory usage with IMallocSpy, remember to make sure you have OANOCACHE set, otherwise you will always end up with memory not being freed until late into process teardown.