-
Batch-converting JPEG files for OCR
I was sent a whole bunch of .jpg files of scanned documents with text that I wanted to extract.
I have Microsoft Office Document Imaging (MODI) installed, so I was keen to use that to perform the OCR (instead of re-typing all the text!). The only problem is that MODI only understands TIFF and MDI formats.
I used ImageMagick to do the conversion. Its convert command might sound like the best candidate, but mogrify did the job for me.
You can convert a whole lot of files using the following command:
mogrify -format tiff *.jpg
This creates a new TIFF file for each JPEG file. The only problem is that MODI doesn’t like the particular flavour of TIFF generated. Fortunately ImageMagick has 1001 options to configure exactly what you want to happen.
After a bit of experimentation, I found that the following extra options generate TIFF files that MODI can read without problems:
mogrify -format tiff -colorspace RGB -compress RLE *.jpg
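For anyone who would rather keep the generated TIFFs in their own folder, here is a rough sketch of the same conversion as a loop around convert. It is only a sketch under a few assumptions: the function name jpg_to_tiff, the output folder argument and the DRY_RUN switch are all made up for illustration, it assumes a POSIX shell (on Windows that means something like Cygwin), and it assumes the legacy convert command is on the path (newer ImageMagick releases call it magick).

```shell
#!/bin/sh
# Sketch only: jpg_to_tiff, its arguments and DRY_RUN are illustrative names,
# not part of ImageMagick.
# Usage: jpg_to_tiff OUTPUT_DIR file1.jpg [file2.jpg ...]
# Set DRY_RUN=1 to print the convert commands instead of running them.
jpg_to_tiff() {
    outdir=$1
    shift
    mkdir -p "$outdir"
    for src in "$@"; do
        # Skip anything that is not an existing file (e.g. an unmatched *.jpg)
        [ -f "$src" ] || continue
        base=$(basename "$src" .jpg)
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo convert "$src" -colorspace RGB -compress RLE "$outdir/$base.tiff"
        else
            # Same options as the mogrify command above
            convert "$src" -colorspace RGB -compress RLE "$outdir/$base.tiff"
        fi
    done
}
```

Calling jpg_to_tiff tiff *.jpg should leave the original JPEG files alone and put the TIFFs in a tiff folder. If I remember right, mogrify also has a -path option that does much the same job without a loop.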
All good, except that I then discovered that the scanning was at such a low DPI that the OCR wasn’t able to find any text :-(
Something else that sounds interesting is that MODI can be programmed against. Maybe I could automate this even more!
-
10 Years!
It’s hard to believe, but it really is 10 years ago today that this gorgeous woman said “I do”! Yes, Narelle and I are celebrating our 10th wedding anniversary.
In a lot of ways, it seems to have gone very quickly. It’s like only yesterday we were going out and then engaged, flying interstate once a month to see each other (I maintain I kept Ansett afloat that year). Long distance relationships are very difficult, but the wait was worth it.
We were married in Sydney (which is where Narelle and her family were living at the time), so that meant there was a fair bunch of Adelaide family and friends who made the trek across. Not to mention Cathy from the USA. Then as a complete surprise, Mr & Mrs Rush and Mr & Mrs Badger decided they’d drive to Sydney just to see the wedding, then drive all the way home again on the same weekend (1,400km each way).
After the honeymoon, it was back to Adelaide. As if getting married isn’t enough, Narelle also moved interstate!
Now Narelle’s Mum & Dad have moved here to retire and help share the babysitting with my parents. Very handy, especially with G3 due in May.
It’s not all beer and skittles mind you. Well I don’t drink beer for one thing (strawberry milkshakes are more my cup of tea) though I think the kids do have some skittles in the toy cupboard. Seriously, there have been some tough times, but we’ve got through them, and they’ve been more than made up for by the good ones.
Well all I can say is if the first 10 years are any indication, I’m looking forward to the next 10, and the 10 after that, and the 10 after that…
-
TyTN II frozen
I’m not sure what happened. It was time to leave work, so I un-cradled my HTC TyTN II phone. I noticed it was part-way through synching but that shouldn’t matter.
I walked out to the bus-stop and waited, and waited, and waited. I had a meeting in the City, and when the bus eventually arrived I realised I’d be late, so I tried to ring ahead.
The phone was stuck.
I did a soft-reset, and after it restarted it got as far as saying it couldn’t make a network connection (not sure why it was trying to do that), and then it refused to do anything else.
Further soft-resets didn’t help either. Narelle tried to ring me, but I couldn’t even answer her call, as none of the buttons had any effect, and the only way I could stop it ringing was to soft-reset it again.
I finally got home, and was able to look up the manual online. For future reference, to do a hard-reset:
- Press and hold the left SOFT KEY and the right SOFT KEY, and at the same time, use the stylus to press the RESET button at the bottom of your device.
- Release the stylus, but continue pressing the two SOFT KEYs until you see a special message on the screen.
- Release the two SOFT KEYs, and then press the button on your device.
That seems to have done the trick, though it does mean I’ll have to re-install everything else again.