Vintage Photo Scanning – A Journey of Discovery

I recently undertook a project to scan and digitally convert a collection of vintage photographs belonging to my parents and wanted to share some of my findings, both from a technical and an emotional perspective. So if, like me, you discover a treasure trove of old photographs buried in a drawer somewhere in your parents’ house, don’t put them back, but do keep reading!

Parental Archives

Like many of my generation and the generations before me, I grew up in an almost exclusively non-digital era with an unwilling reliance on a minimal selection of analog TV and radio channels, cassette tapes and film cameras.

And while the invention of the Internet, coupled with services like YouTube, iTunes and Spotify, has meant that many of the TV, radio and musical memories of my youth can be resurrected in the blink of an eye, alas the same does not hold true for photographic memories. These are far harder to resurrect (impossible in some cases), as they cannot be reproduced or digitally remastered without the original content itself, which in most cases is in the possession of a single entity – your parents!

And this also means that you’ll be relying on your parents to have done two things:

  1. Taken the time to capture photographs of your childhood in the first place;
  2. Ensured that these (and others of their own) were preserved intact over the intervening years.

And indeed the exact same applies to your parents and to the memories of the life they had before your arrival.

Equipment

In terms of the equipment used to conduct this year-long exercise, here is what I needed:

  • Digital Photo Scanner: You don’t need to pay a lot of money for this (mine was an HP Deskjet F4580 that I bought for just €50); it just needs to support both Greyscale and Colour scanning (which most do) at a decent resolution (300dpi).
  • Computer: Again, a relatively inexpensive laptop/desktop will be fine, although the scanning software can benefit from a little extra RAM at times (when you’re scanning a lot of photos in a single session). Mine (actually, my wife’s) was a Dell Latitude running Microsoft Windows.
  • Scanning Software: This may depend on your scanning device. I used the software that came with my scanner.
  • Graphics Software: As you are highly likely to want to crop some of the scanned images afterwards, you may need some additional graphics software for this. My personal open source favourite is GIMP, but there are lots to choose from and your scanning software may even do this for you anyway.
  • Post-its: These could prove really handy for cataloguing and sorting the photographs so they can be reinserted into their original albums afterwards.
  • Exiftool: This is a Unix command-line utility for injecting metadata into digital image files, such as GPS location, date & time and the names of those in the photograph.

Copious amounts of patience, coffee and beer are also strongly recommended.

Planning & Sorting

Strangely, one of the first challenges you’ll face is exactly how to remove the photos from their albums without damaging them and in such a way that you’ll be able to reinsert them in roughly the same order afterwards.

And don’t forget that, while you may feel that your project is complete once you’ve scanned the photos and have them on your laptop, your parents may want them restored to their original setting, and you need to respect that.

So in my view, here is the best way to approach this:

  1. Devise an album/page numbering scheme and attach some post-its (or equivalent) to the pages in the various albums.
  2. Remove all of the Black & White photos first, because it’ll be more efficient to scan these together using the same scanner resolution/quality settings.
  3. As you remove each photo, write the album and page number on the rear, preferably using a pencil (which can easily be removed afterwards if required).
  4. Once removed, arrange the photographs into bundles of roughly the same size. This will also make for more efficient scanning (and cropping) of images later on.

Culling

Some of the photos may be too faded, blurred, badly cropped or simply too small to be worth scanning, so you may wish to omit those from the process early on. Similarly, keeping multiple (but very similar) photos of the same occasion (with the same people in them) can sometimes dilute the power of a single photo of that occasion.

This is just something you’ll need to make a personal judgement call on but you could use the following logic:

  1. Is there another, similar photo of the same occasion with the same people in it?
  2. Although it’s blurred, or of poor quality, is this the only photo with a particular person or group in it?
  3. Is there a favourite piece of music that this photo could go with, if you were to include it in a musical slide show or movie?

In my case, the success rate here was actually only around 50% (i.e. I ended up skipping roughly half the entire collection), but given the nature of photographic technology at the time, this is not entirely surprising.

Testing, Trial and Error

The first thing you need to do once you think you are ready to start scanning is to stop and do some testing (with just a couple of photos) to be sure you are going to be happy with the results. Here, you are looking to settle on your optimum scanning technique and preferred resolution, file format, compression ratio, colour balance etc.

The file format matters too, because not all formats are supported by the popular exiftool utility, so if you plan to inject metadata into the scanned images later on, you need to test this now to avoid committing to a format you will later regret. For example, I had scanned several hundred photos in PNG format before I realised that I could not inject metadata into them using exiftool. I found that the JPEG format worked best for me.
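
As a minimal sanity check along those lines (the file name and tag values here are hypothetical), write a couple of tags to a single test scan and then read them back to confirm they stuck:

$ # Write a date and some names to one test image
$ exiftool -DateTimeOriginal="1985:06:15 00:00:00" -Keywords="Mam, Dad" test-scan.jpg

$ # Read the tags back to verify the format accepted them
$ exiftool -DateTimeOriginal -Keywords test-scan.jpg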

So trust me, testing beforehand will save you a huge amount of time (and stress) later on, and you will thank me for warning you now.

Scanning & Cropping

In terms of the scanning effort itself, I found that scanning multiple (similarly sized) images at the same time was way more efficient. I also found it more efficient to crop the images from within the scanning software (that came with my scanner) before saving them to disk as separate images.

You might be forgiven for thinking this is the longest phase of the journey, but for me it wasn’t – the dating of the photos, naming of the image files and insertion of metadata took a lot longer.

Naming Convention

In terms of how you name the image files produced by the scanning exercise, this is really a matter of personal preference. You could just stick with the arbitrary names assigned by the scanning software, but based on my experience you are far better off investing a little extra time in devising a naming scheme for the files so that you can search for (and/or rearrange) them more easily later on.

What worked for me here was to construct the name of each file using 4 basic pieces of data, separated by a tilde character:

<Date>~<Title>~<People>~<Location>.jpg

where

  • <Date> follows the standard YYYY-MM-DD date format. This means that the files will naturally sort themselves chronologically on most standard file browsing applications.
  • <Title> is some sort of snappy, 4-5 word title for the photo or event, possibly prefixed by a number if there are multiple photos taken on the same day at the same event.
  • <People> is a comma-separated list of the names of the people in the photo (as they would be commonly known to your family).
  • <Location> is a succinct description of where the photograph was originally taken (e.g. something that would match a search in Google Maps).

The use of a tilde character as the field separator (as opposed to a comma or hyphen, for example) is also optional, of course, but it works well in many situations because it is rarely used within any of the other fields, allowing you to have commas and hyphens in those fields without confusion.
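
As a quick illustration of why this pays off (the file name shown here is hypothetical), the convention makes ad-hoc searches trivial from the command line:

$ # List every photo featuring Granny, regardless of album or year
$ ls *~*Granny*~*
1987-03-01~Granny's 70th Birthday~Granny,Mam~Cork City, Cork.jpg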

Filing

Personally, I would not advise storing several hundred photos in a single directory as I think it would make them harder to manage, find and sort. I therefore decided to store batches of related files in a series of hierarchical subdirectories, some of which themselves included dates in their name. This is again a personal preference thing but it may work in your favour if you are planning to share a copy of the finished photo collection (on a USB stick or CD or via Dropbox) with friends and family.
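
To make that concrete, here is a purely hypothetical layout along those lines:

scanned-photos/
  1960s-album-01/
  1970s-album-02/
  1980-1985-album-03/
    1982-06-20~First Day at School~Me,Mam~Dublin.jpg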

Dating & Facial Recognition

This was by far the most enjoyable part of the journey. Not only did I learn so much about my wider family (and about myself) but the time I shared with my parents while undertaking this phase was hugely rewarding, both for them and for me. More mature readers will already know this, of course.

The facial recognition itself is relatively straightforward, in that your parents will either recognise the people or not, and it really doesn’t have to be any more complicated than that.

However, putting a date on an old photograph can be a lot more difficult, especially when the folks are that little bit older. That said, there are some tricks you can use to improve the accuracy here too, which essentially boil down to asking one or more of the following questions:

  1. Were you married when this photo was taken?
  2. Was it before or after an important event in your life (e.g. Holy Communion, Confirmation, 21st Birthday, Wedding)?
  3. Was I (or any of my siblings) born when it was taken?
  4. Were your parents still alive when it was taken?
  5. Where did you/we live when that was taken?

By trying to evaluate the date of the photo in the context of seemingly unrelated milestones in their lives, you may find yourselves able to home in on the real date with reasonable accuracy.

Are We There Yet?

At this point, you should have all of the photos scanned, cropped and named according to when they were taken, what the event was, who was in the photo and where it was taken. And for many people that would be more than enough.

However, the engineer in me was of course not happy to leave it at that. So watch out for my next blog post on how to inject metadata into your scanned images and use that to aid the importing of the photos into popular photo management software.

 

Debugging network connectivity issues using telnet

Introduction

Ah, the joys and simplicity of the humble telnet utility when it comes to debugging network connectivity issues. Ever-present in any Unix-based operating system worth its salt, but sorely missed in Windows systems post-XP (although it can be manually installed).

It has long since been superseded by its more secure cousin SSH, but did you know that it can still be used to great effect to help determine the most likely reason you are unable to connect to another server on your network, or on the wider Internet?

What does success look like?

What many people don’t realise is that by specifying an additional parameter, you can instruct the program to try to establish a connection on a specific port number (rather than the default port of 23). Take the following example, which shows a successful connection to a fairly standard web server:

$ telnet myhost.example.com 80
Trying 192.168.1.10...
Connected to myhost.example.com.
Escape character is '^]'.

The key thing to note here is the presence of the message, “Connected to myhost.example.com” and the fact that the session remains connected (i.e. you don’t get bumped back to the command prompt immediately). This is telling you that you have successfully established a valid connection to the required port on the server you’re interested in, which also (and most importantly) confirms that:

  1. You are not being blocked by a firewall.
  2. The service you’re trying to connect to is alive and well, listening on port 80.
  3. There are no software-based access rules preventing you from accessing the service on that port.

Help, it didn’t work!

In my experience, the flip side of each of the three points above represents one of the three most common reasons why things may not be working the way you hoped or expected. The neat thing about the telnet command is that it usually hints at which one you’ve hit in the response it gives you, consistently so across the wide range of operating systems it runs on. The following table summarises the meaning of each such response:

Response                             Meaning
Trying…                              Firewall issue
Unable to connect to remote host     Service not running
Connection closed by foreign host    Software-based access rule

Keep reading for a more detailed explanation of each scenario:

1. You are blocked by a firewall

If the response to your telnet command is simply a “Trying x.x.x.x…” message (which eventually times out) then there is most likely a firewall rule (somewhere along the route to the remote server) blocking you:

$ telnet myhost.example.com 80
Trying 192.168.1.10...

Resolving this normally requires the services of a network engineer to grant the correct access through the firewall that is blocking you.

2. Nothing is running on the specified port

If the response to your telnet command ends swiftly with “Unable to connect to remote host”, then you can be confident that you are not being blocked by a firewall, but the server you’ve reached does not appear to have a process listening on the port you specified:

$ telnet myhost.example.com 80
Trying 192.168.1.10...
telnet: connect to address 192.168.1.10: Connection refused
telnet: Unable to connect to remote host

Resolving this normally requires the services of the server administrator (normally asking them to start the offending service for you).

3. You’re not allowed to talk to the service anyway

If your telnet command appears to connect successfully but immediately returns you to the command prompt with a “Connection closed by foreign host” message, then you can be confident that you are not being blocked by a firewall and that a valid service is listening on the specified port, but some form of software-specific access rule is preventing you from communicating with that service:

$ telnet myhost.example.com 80
Trying 192.168.1.10...
Connected to myhost.example.com.
Escape character is '^]'.
Connection closed by foreign host.
$

Resolving this normally requires the services of the support team that manages the service you are trying to connect to (and having them add a rule allowing traffic from your network).

Summary

There are of course many other connectivity debugging tools (e.g. netcat), and indeed telnet is only suited to TCP-based protocols. However, the presence of this command on almost every operating system out there, along with the fact that its syntax and responses have not changed for so long, makes it well suited for use in multi-platform environments and multi-layered networks.
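
For what it’s worth, the rough netcat equivalent of the checks above looks like this (-z tells nc to just test the connection without sending any data, and -v makes it report the outcome):

$ nc -vz myhost.example.com 80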

Git basics for Subversion command-line users

For anyone who has used the Subversion version control system and wants to consider the switch to Git, here is a simple comparison of the more common tasks you’re likely to need to get started.

Please excuse the somewhat simplistic narrative in some items. This article is intended to get users accustomed to the basic Git commands without being too specific about the terminology of either world. Some items also show more than one Git command for the equivalent Subversion command (not uncommon in Git), and there are of course plenty of shortcuts for some of these tasks (for simplicity, I’ve not covered them here).

 

1. Check out a repository
i.e. Fetch a copy of the files in a repository that I’ve never fetched before…

$ svn checkout <url>
$ git clone <url>

 

2. What files have I changed?
i.e. Which of my locally checked out files have changed since I last checked them out?

$ svn status
$ git status

 

3. What’s changed on the server?
i.e. Has someone else committed changes that I don’t yet have locally?

$ svn status -u
$ git fetch origin
$ git diff <branch> origin/<branch>
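
If you’d prefer a commit-level view of what’s waiting on the server (rather than a raw diff), this also works after the fetch:

$ git log <branch>..origin/<branch>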

 

4. Show my changes
i.e. What are the changes I’ve made to my files?

$ svn diff
$ git diff

 

5. Fetch latest files
i.e. Fetch the very latest copy of my repository from the server

$ svn update
$ git pull origin <branch>

 

6. Commit my changes
i.e. Push all my changes back to the server

$ svn commit -m "A comment describing your changes"
$ git add <files>
$ git commit -m "A comment describing your changes"
$ git push origin <branch>

 

7. Change logs
i.e. View a list of my most recent changes

$ svn log
$ git log

 

Some other useful Git commands to note
Show all available branches

$ git branch -a

Show the differences made in the last <n> commits

$ git log -p -<n>

Prune any stale local copies of branches that have been removed from the remote copy of the repo

$ git remote prune origin

Simple JSON parsing from Bash using Python

Have a Bash command you like to use a lot but need to parse some JSON data from another script? Here’s a quick inline Python command you might find helpful:

COUNT=$(jscript.sh | python -c "import json, sys; print(json.load(sys.stdin)['count'])")

Just substitute the name of your other shell script (jscript.sh) and the JSON key (count) as appropriate and you’re all set!
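
To see it working without the other script, you can feed the one-liner some sample JSON directly:

$ echo '{"count": 42}' | python -c "import json, sys; print(json.load(sys.stdin)['count'])"
42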

Capturing Screen Shots on Mac OS X

I’ve had reason to capture rather a lot of screen shots in recent weeks and find the following Mac keyboard shortcuts very useful:

Shortcut                      Description
Command-Shift-3               Captures entire desktop (saved as PNG file on desktop)
Command-Ctrl-Shift-3          Copies entire desktop (saved to your paste buffer)
Command-Shift-4               Captures portion of desktop (saved as PNG file on desktop)
Command-Shift-4 + Spacebar    Captures current window (saved as PNG file on desktop)

There are lots of other variations on the above, which are described nicely here.
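
If you ever need to capture shots from a script, macOS also ships with a screencapture command-line tool; a minimal example (the file name is just illustrative):

$ # Interactively select an area of the screen and save it as a PNG
$ screencapture -i ~/Desktop/shot.png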

How to reduce the size of a PDF file on a Mac

This one’s been bugging me for several weeks, so I finally got around to finding a proper solution. Basically, if someone sends you a PDF file with some photos in it, the chances are they didn’t think of reducing the picture quality (in the original document) before creating the PDF. As a result, the PDF file can end up being enormous, for no obvious reason to the reader.

While I managed to find out how to tackle the problem from inside the likes of Microsoft Word (there are better options in Office 2011), I am often without the original document, so what to do then (on a Mac)?

Well, as it happens, Mac Preview does have a way of doing this, which I found out about here, and while it worked well for me (my PDF files were mostly text with 2-3 pictures anyway), I have seen comments on the above article questioning the quality of the resulting file. Still, it may work for you too.
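
As a command-line alternative (not from the article above, just a common approach), Ghostscript can downsample the images in a PDF for you if you have it installed; the /ebook setting is a reasonable middle ground between size and quality:

$ gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook \
    -dNOPAUSE -dBATCH -sOutputFile=smaller.pdf original.pdf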

Oracle and Sun’s third generation ZFS-based storage appliance

My first job in the IT sector was as Technical Support Engineer at Eurologic Systems, an Irish-owned data storage company, back in 1992. I can still recall working with a SEAGATE ST512N (a 5.25″ hard disk with a whopping 12MB capacity) and later on taking delivery of our very first 3.5″ 1GB hard disk (the now famous SEAGATE Barracuda).

However, I also recall some of the former Mentec/DEC guys reminiscing about how they used to clean some 300MB DEC drives (one platter at a time) that took up most of a room (way before my time, of course).

And so earlier this week, almost 20 years later, I read with interest about Oracle and Sun’s new third-generation storage appliance (based on their 128-bit ZFS file system), which once again takes the form of a huge cabinet-like disk drive that looks like it would easily take up an entire room (OK, rooms are much bigger now too, but you get the point).

Of course the 20-year time gap isn’t the only difference as the new Sun ZFS Storage 7420 Appliance boasts a staggering raw capacity of 1.15PB (that’s 1,234,799,616 MB) along with a host of other equally impressive numbers.

This enormous growth in storage capacity easily outpaces Moore’s Law (transistor counts, and roughly CPU capacity, doubling every 18 months or so): going from a 1GB drive to a 1.15PB appliance in 20 years is roughly a million-fold (about 2^20) increase, or close to a doubling every 12 months. Nice ….. very nice!

Calling all Web Developers, check out FeedHenry

If you’re a web developer and would like to get on board the mobile app train, then you should check out FeedHenry.

Our cloud-based developer studio allows you to develop apps using familiar web technologies like HTML, CSS and JavaScript and then build versions of those apps for iPhone, iPad, Android, BlackBerry, Windows Phone 7 and Nokia WRT, all from the same code base. Registration is completely free.

Each app can also execute part of its business logic in our cloud. This allows you to change the key functionality of the app without having to push (time-consuming and costly) updates to the app stores. It also means that you don’t need to provision any extra infrastructure to scale your apps to millions of users.

There’s a whole bunch of other stuff it does too, so if you’d like to know more, head on over to http://www.feedhenry.com

You might also like to take a peek at http://www.aerlingus.com/help/aerlingusmobile – all built on the FeedHenry platform!