• Microsoft WebDAV Extension for IIS 7.0

    In late 2006, I noted that IIS7 (which shipped with Vista) didn’t include WebDAV. Well, the good news is that, for IIS7 on Windows Server 2008 at least, it’s now available. The notes don’t say it’s compatible with Vista, so this might only be a server thing.

  • Heroes Happen {2008} - Adelaide

    Yesterday, along with the rest of my team from UniSA, I attended the Adelaide “Heroes Happen {2008}” event - the local launch of Windows Server 2008, SQL Server 2008 and Visual Studio 2008.

    It was a good day, and we got some nice goodies (including a T-Shirt, Windows Server 2008 Enterprise Edition, and Vista Ultimate with SP1). One curious thing is that the extra EULA with the Vista DVD says you’re only entitled to use it for 365 days! What’s up with that?

    The location at the Hilton was good, if a little crowded. Excellent catering though, which is always something I watch out for.

    Some good presentations, and I did learn something new - I didn’t realise that the .NET Framework 3.5 includes support for RSS and Atom feeds.

    It was great to bump into lots of familiar faces too.

  • Batch-converting JPEG files for OCR

    I was sent a whole bunch of .jpg files of scanned documents with text that I wanted to extract.

    I have Microsoft Office Document Imaging (MODI) installed, so I was keen to use that to perform the OCR (instead of re-typing all the text!). The only problem is that MODI only understands TIFF and MDI formats.

    I used ImageMagick to do the conversion. The convert command might sound like the best candidate, but mogrify, which can process a whole batch of files in one go, did the job for me.

    You can convert a whole lot of files using the following command:

    mogrify -format tiff *.jpg

    This creates a new TIFF file for each JPEG file. The only problem is that MODI doesn’t like the particular flavour of TIFF generated. Fortunately, ImageMagick has 1001 options to configure exactly what you want to happen.

    After a bit of experimentation, I found that the following extra options generate TIFF files that MODI can read without problems:

    mogrify -format tiff -colorspace RGB -compress RLE *.jpg
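
    If you'd rather script the conversion (say, to point it at different folders), the same command could be driven from Python. This is only a rough sketch: the function names build_mogrify_command and convert_folder are made up for illustration, and it assumes mogrify is on the PATH. The options are exactly the ones from the command above.

```python
import shutil
import subprocess
from pathlib import Path

# Options taken from the mogrify command above.
MOGRIFY_OPTIONS = ["-format", "tiff", "-colorspace", "RGB", "-compress", "RLE"]


def build_mogrify_command(folder, pattern="*.jpg"):
    """Return the mogrify argument list for the matching files in folder.

    Hypothetical helper: returns an empty list when nothing matches.
    """
    files = sorted(str(p) for p in Path(folder).glob(pattern))
    if not files:
        return []
    # File names are listed explicitly instead of passing a wildcard.
    return ["mogrify", *MOGRIFY_OPTIONS, *files]


def convert_folder(folder):
    """Run mogrify over folder and return how many files were passed to it."""
    command = build_mogrify_command(folder)
    if not command:
        return 0
    if shutil.which("mogrify") is None:
        raise RuntimeError("mogrify was not found on the PATH")
    # check=True raises CalledProcessError if mogrify reports a failure.
    subprocess.run(command, check=True)
    return len(command) - 1 - len(MOGRIFY_OPTIONS)
```

    Calling convert_folder("scans") would then write a .tiff file next to each .jpg in a folder called scans (the folder name is just an example).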

    All good, except that I then discovered that the documents had been scanned at such a low DPI that the OCR wasn’t able to find any text :-(

    Something else that sounds interesting is that MODI can be driven from code. Maybe I could automate this even more!