Diving into C++ for Flickr Metadata Synchr v0.6.0.0

The situation


I am currently working on version 0.6.0.0 of my Flickr Metadata Synchr tool. The goals for my open source project on CodePlex are described on the Flickr Metadata Synchr wiki page and you can always find the latest status there.


At the moment the latest public release is version 0.5.5.0. The feature set for v0.5.5.0 is roughly:



  • Allow you to select a Flickr photoset and a local directory with images.
  • Load metadata from both local and Flickr images into internal metadata structures.
  • Compare these metadata structures and synchronize them.
  • Update metadata on Flickr after the synchronization.

One of the features planned for v0.6.0.0 is updating the XMP and IPTC metadata in locally stored images. I was planning on doing this through the Windows Imaging Component (WIC), which is part of the .NET Framework 3.0. WIC is also available as a separate download for Windows XP and Windows Server 2003.


Windows Presentation Foundation provides a nice managed API for reading and writing metadata through WIC. It provides the SetQuery and GetQuery methods on the BitmapMetadata class. I was already using the GetQuery method, which works fine. However, I hit a snag when I wanted to use the SetQuery method to update metadata.


Plan A


There is a way to do this through the InPlaceBitmapMetadataWriter class. It just touches the metadata structures in the image file and doesn't have to read or write the entire stream with pixel information. This gives you excellent performance, and you don't run the risk of having to reencode the pixel stream or losing metadata. The sad thing is that it almost never works. The image file often does not have enough room in its metadata structures to allow metadata fields to be filled or updated. When you try to save the updated metadata, the InPlaceBitmapMetadataWriter fails. That is probably why the save method is called TrySave. By the way, the code sample on that MSDN page is dead wrong. If you call TrySave before updating metadata, it always succeeds, probably because there is nothing to save yet. You have to call it after you update the metadata, and then it returns false ;( which means your metadata was not updated successfully.
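
To make the failure mode concrete, here is a toy model in plain C++. It is not the WIC API: the class InPlaceWriterModel, its byte budget and its size calculation are all made up for illustration. It only mimics the behaviour described above: changes are staged first, and the save succeeds only if the result still fits in the space the file already reserves for metadata.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model (hypothetical, NOT the WIC API): a metadata block with a fixed
// byte budget, standing in for the space reserved inside an image file.
class InPlaceWriterModel {
public:
    explicit InPlaceWriterModel(std::size_t capacityBytes)
        : capacity_(capacityBytes) {}

    // Stage a change; nothing is committed until trySave() is called.
    void setQuery(const std::string& path, const std::string& value) {
        assert(!path.empty());
        pending_[path] = value;
    }

    // Commit the staged changes only if the merged result still fits.
    bool trySave() {
        std::map<std::string, std::string> merged = saved_;
        for (const auto& entry : pending_) {
            merged[entry.first] = entry.second;
        }
        if (sizeOf(merged) > capacity_) {
            return false;  // not enough room: nothing is written
        }
        saved_ = merged;
        pending_.clear();
        return true;
    }

    std::size_t savedFieldCount() const { return saved_.size(); }

private:
    // Simplified size: key length plus value length for every field.
    static std::size_t sizeOf(const std::map<std::string, std::string>& fields) {
        std::size_t total = 0;
        for (const auto& entry : fields) {
            total += entry.first.size() + entry.second.size();
        }
        return total;
    }

    std::size_t capacity_;
    std::map<std::string, std::string> saved_;
    std::map<std::string, std::string> pending_;
};
```

In this model, calling trySave() on a fresh writer trivially returns true because nothing is staged, which matches what I saw with the MSDN sample. Only the call made after setQuery() tells you whether the update actually fit.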


Plan B


So I tried plan B: Creating a new image file by writing out a copy of the original image, but now with updated metadata. This means you have to grab the original BitmapFrame from the JpegBitmapDecoder. Clone it, update its metadata and write it out again using the JpegBitmapEncoder.


This is where I hit a major problem. The Save() method on the JpegBitmapEncoder almost always fails with an InvalidOperationException with the error message "Cannot write to the stream". When the encoder is able to write out the image, the JPEG turns out to be reencoded with a different quality than the original. This is noticeable through a significant change in size of the file. This happens even though I specified the BitmapCreateOptions.PreservePixelFormat option when opening the image with the decoder. Googling (or Windows Live Searching if you will) for a solution didn't yield anything useful.


Plan C


I had to come up with a Plan C. The Windows Vista Shell is obviously able to update metadata in images without affecting the JPEG quality and without creating a copy of the image file. This led me to an MSDN article titled "Photo Metadata Policy". This is the introduction:



Metadata (file properties) for photo files can be stored using multiple metadata schemas, in different data formats and in different locations within a file. In Windows Vista™, the Microsoft® Windows® Shell provides a built-in property handler for photo file formats, such as JPEG, TIFF, and PNG, to simplify metadata retrieval.


When a piece of metadata is present in different underlying schemas, the built-in property handler determines which value to return. For instance, the Author property may be stored in the following locations in a TIFF file:


  • The Creator tag in the XMP Dublin Core schema:
    /ifd/xmp/purl.org/dc/elements/1.1/dc:creator

  • The Artist tag in the EXIF schema:
    /ifd/{ushort=315}

  • The Artist tag in the EXIF schema embedded in an XMP block:
    /ifd/xmp/ns.adobe.com/tiff/1.0/tiff:artist

On read, the property handler determines the value that takes precedence over the others that exist in the file and returns it. On write, the property handler makes sure it leaves each schema in a resolved and consistent state with the others. This may mean either updating or removing the tag in question in a given schema.


This would also help to solve another piece of the metadata puzzle: what to do with the several different options of putting metadata in image files (XMP versus IPTC, multiple possible XMP places, etc.). After updating an image, I want the metadata in the different blocks to be consistent. WIC doesn't help with this. You have to sort it out yourself. The Windows Vista Shell does help with metadata reconciliation.
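
The read and write rules from the quoted introduction can be sketched in plain C++. This is only a simulation of the idea, not the Shell's property handler. The names MetadataBlocks, authorPaths, readAuthor and writeAuthor are hypothetical, and the precedence order is an assumption: I simply took the order in which the article lists the three locations.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the metadata blocks of one image file:
// query path -> stored value.
typedef std::map<std::string, std::string> MetadataBlocks;

// ASSUMPTION: precedence follows the order in which the quoted article
// lists the locations. The real policy may rank them differently.
const std::vector<std::string>& authorPaths() {
    static const std::vector<std::string> paths = {
        "/ifd/xmp/purl.org/dc/elements/1.1/dc:creator",
        "/ifd/{ushort=315}",
        "/ifd/xmp/ns.adobe.com/tiff/1.0/tiff:artist",
    };
    return paths;
}

// On read: return the first non-empty value in precedence order,
// or an empty string when no location holds an author.
std::string readAuthor(const MetadataBlocks& file) {
    for (const auto& path : authorPaths()) {
        MetadataBlocks::const_iterator it = file.find(path);
        if (it != file.end() && !it->second.empty()) {
            return it->second;
        }
    }
    return std::string();
}

// On write: set the preferred location and update every location that
// already exists, so no block is left holding a stale value.
void writeAuthor(MetadataBlocks& file, const std::string& author) {
    assert(!author.empty());
    const std::vector<std::string>& paths = authorPaths();
    file[paths.front()] = author;
    for (const auto& path : paths) {
        MetadataBlocks::iterator it = file.find(path);
        if (it != file.end()) {
            it->second = author;
        }
    }
}
```

Updating the existing locations is one way to leave the schemas consistent. The article also mentions removing a tag from a schema as an option, which this sketch does not attempt.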


So all seems to be well. Just use the Shell API to update the metadata. I would love to be able to do this from C#. Yet that seems to be either impossible or extraordinarily difficult. The "file property" handling is implemented in propsys.dll through a COM-based API. But you can't add a reference to this COM library in a C# project. It doesn't have a type library ;( The only option I can find is to use C++ with the propsys.h and propsys.idl files that are distributed in the Windows SDK. This is horrible. I guess I have to dust off my C++ skills to be able to call a brand-new Windows Vista API. WTF?!


The "Longhorn" promise for managed code


Do you remember the promises Microsoft made back in 2003 for the new Windows Client OS codenamed "Longhorn"? I sure do, since I visited the PDC03 conference where this was all announced. Microsoft promised us a brave new world where all Windows APIs could be accessed easily from managed code. Three and a bit years later we have a new Windows Client OS called Vista that doesn't live up to this promise. Microsoft has implemented new APIs that seem to be inaccessible from managed code other than through C++.


Now I can understand why part of the promise was lost during the infamous "Longhorn Reset" at Microsoft. Microsoft's ambition to completely wrap all existing Win32 APIs in WinFX was too big. But why Microsoft would be creating new APIs without managed code in mind is beyond me...


I found some C++ code on the blog of Ben Karas that is indeed able to update metadata. I hate having to add this C++ code to my project. It would require people to have Visual C++ and the Windows SDK (especially the Windows Vista header and library files: *.h, *.idl, *.lib) installed to be able to build my code in Visual Studio.


A plan D might be to manually create C# wrappers for the COM interfaces of propsys.dll. This article describes how to do this for COM interfaces in general.
