Request for Comments: Photo and Video Consolidation for Rapid Photo Downloader

Introduction

Rapid Photo Downloader 0.9.1 implements a simple rule when displaying files [1]:

If a file exists that Rapid Photo Downloader recognizes on the download device, it is displayed.

The only exception to this rule is when downloading directly from a camera that has two storage slots (e.g. two memory cards) and the same photo exists in both storage slots. Think of professional-level cameras that can be configured to write the same photo to both memory cards, creating an in-camera backup. In such instances, Rapid Photo Downloader will display and download only one of the duplicate photos [2]. Note the emblems beside the checkmark, which indicate which memory card the files are found on:

If a file download attempt has been completed, the checkmark beneath its thumbnail changes to an icon indicating whether the download was successful or not:

Finally, the main window is cleared of downloaded files only if the user explicitly chooses a main menu option clearing the download files, or exits the program.

The logic behind this simple system is:

  1. At all times having a visual overview of what has been downloaded in a session (i.e. since the program was opened), and what hasn’t, is critically important.

  2. It’s also critically important for the program to be able to initiate a download from several devices in parallel. Just because a new download device is inserted doesn’t necessarily mean that files that have already been downloaded should be automatically cleared from the user interface. Download devices can be plugged in at any time, including in the middle of the download. Moreover, when automation features are turned on, a download from one device can be initiated even when another device is already being downloaded from.

While there is much to be said for a simple system, there are downsides too:

The first major downside is that users who are not experienced with the program’s operation, or who are suffering from mental exhaustion after a long day’s shoot, can become confused when they download some files from a device, remove the device and insert it again (or simply rescan it). They suddenly see the same file displayed twice in the main window: the first represents the file they had earlier downloaded, and the second represents the same file on the device.

The following images illustrate the problem. This is what the user sees before downloading:

Here, after having already downloaded some files, the user has rescanned the same download source:

In the screenshot above, a dimmed thumbnail appears beside the thumbnail showing the image had already been downloaded.

The second major downside is that there are workflows where distinguishing duplicate files that come from different download devices would be helpful:

  1. Returning to the example of the professional-level camera that can write the same photo to dual memory cards, when downloading from card readers and not directly from the camera, if both memory cards are inserted at the same time, the duplicate photos will appear in the display twice and, unless the user intervenes, will be downloaded as if they were not duplicates.

  2. It’s wrong to assume that to solve the previous problem, the user merely needs to insert only one of the two memory cards. It’s a bad assumption because while the camera can duplicate the photos, it may not duplicate everything from the shoot. For example, the Canon 1D series of cameras can be set to write duplicate photos to both memory cards, but they will not write videos or audio annotations to both cards (the first screenshot above illustrates this). Instead, they will write the videos and audio annotations to only one card. If the user inserts the wrong card, assumes they have downloaded their videos and audio annotations along with their photos, and formats both memory cards in-camera, they’ll be in for a nasty surprise when they later realize that they never downloaded their videos and audio annotations.

Another reason why distinguishing duplicate files that come from different download devices would be helpful is that it’s currently impossible to treat RAW [3] and JPEG files as one photo in the interface, because a RAW file might have been written to one memory card, and the JPEG to the other:

  1. It’s therefore impossible to show only one thumbnail for the RAW + JPEG.

  2. It’s also impossible to robustly provide the feature to mark for download only one of a RAW + JPEG pair, e.g. mark only the RAW for download, and not the JPEG.

Proposal

Provide the option of consolidating files across devices and downloads. In other words, analyze the results of device scans looking for duplicate files and matching RAW + JPEG pairs, comparing them across multiple devices and download sessions.

Note: see Addendum (below) for the criteria that determine when a file is a duplicate or a matching RAW + JPEG pair.

Specifically:

  1. When a file is scanned, compare it against (a) previously downloaded files and (b) files that have been scanned during that session (and whose device is still inserted), but not yet downloaded.

  2. If the newly scanned file is recognized as a duplicate of a previously downloaded file and the file is currently displayed as a completed download, replace the display of the existing completed download with the newly scanned file. If the completed download was a failed download, display the newly scanned file normally, as if no download had been attempted. If the completed download was successful, then display the newly scanned file as dimmed and unmarked, which is what the program already does with previously downloaded files.

  3. If the newly scanned file is a duplicate of another newly scanned file that is also displayed, display only one of the files. Indicate below the thumbnail that the file comes from multiple sources. If the file is downloaded, download only one of the duplicates. Given the potential for increased download speed by downloading in parallel from multiple memory cards instead of sequentially from one memory card, in cases where there are sets of duplicate files, the program could conceivably alternate the memory card source for files in the set, enabling the parallel transfer.
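
To make the three steps above concrete, here is a minimal sketch of the decision taken for each newly scanned file. It is an illustration under stated assumptions, not the program's actual code: the names `Status`, `Action` and `action_for` are hypothetical, and a file is identified here only by its name and size (the time comparison described in the Addendum is left out for brevity).

```python
from enum import Enum, auto
from typing import Dict, Tuple


class Status(Enum):
    """State of a file already shown in the main window (hypothetical names)."""
    SCANNED = auto()     # displayed, not yet downloaded, device still inserted
    DOWNLOADED = auto()  # completed download that succeeded
    FAILED = auto()      # completed download that failed


class Action(Enum):
    """What the display should do with a newly scanned file."""
    SHOW_NEW = auto()        # no duplicate known: display normally
    REPLACE_NORMAL = auto()  # replace a failed download, show as if never attempted
    REPLACE_DIMMED = auto()  # replace a successful download, show dimmed and unmarked
    MERGE_SOURCES = auto()   # duplicate of another scanned file: one thumbnail, many sources


# Assumption: (file name, size in bytes) stands in for the full duplicate test.
FileKey = Tuple[str, int]


def action_for(key: FileKey, displayed: Dict[FileKey, Status]) -> Action:
    """Decide how to display a newly scanned file given what is already displayed."""
    existing = displayed.get(key)
    if existing is None:
        return Action.SHOW_NEW
    if existing is Status.FAILED:
        return Action.REPLACE_NORMAL
    if existing is Status.DOWNLOADED:
        return Action.REPLACE_DIMMED
    return Action.MERGE_SOURCES


displayed = {
    ("IMG_0001.CR2", 25_000_000): Status.DOWNLOADED,
    ("IMG_0002.CR2", 25_100_000): Status.FAILED,
    ("IMG_0003.CR2", 24_900_000): Status.SCANNED,
}
for name, size in [("IMG_0001.CR2", 25_000_000), ("IMG_0004.CR2", 25_000_000)]:
    print(name, action_for((name, size), displayed).name)
```

Alternating the memory card source within a set of duplicates, as suggested in step 3, would be a separate scheduling decision made after `MERGE_SOURCES` is returned; it is not shown here.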

With respect to RAW + JPEG pairs, if the pair are both currently displayed and have not been downloaded, via a program preference provide the option to:

  1. Treat the pair as one photo. Display only one thumbnail, indicate below the thumbnail that it is a RAW + JPEG, and if downloaded, download both the RAW and the JPEG.

  2. Treat the pair as two photos [4]. Display a thumbnail for both the RAW and the JPEG.

When treating a RAW + JPEG pair as two photos, via another program preference, provide the option to:

  1. Automatically mark both the RAW and JPEG for download

  2. Not mark the JPEG for download, but mark the RAW

  3. Not mark the RAW for download, but mark the JPEG
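
The two preferences above combine in a simple way, which the following sketch tries to capture. The names `PairMarking` and `initial_marks` are hypothetical and chosen only for illustration; the real preference keys may well differ.

```python
from enum import Enum
from typing import Dict


class PairMarking(Enum):
    """Hypothetical values for the 'treat as two photos' marking preference."""
    BOTH = "both"
    RAW_ONLY = "raw_only"
    JPEG_ONLY = "jpeg_only"


def initial_marks(treat_as_one_photo: bool, marking: PairMarking) -> Dict[str, bool]:
    """Return whether the RAW and the JPEG of a pair start out marked for download."""
    if treat_as_one_photo:
        # One thumbnail stands for both files, so both are always downloaded together.
        return {"raw": True, "jpeg": True}
    return {
        "raw": marking is not PairMarking.JPEG_ONLY,
        "jpeg": marking is not PairMarking.RAW_ONLY,
    }


print(initial_marks(False, PairMarking.RAW_ONLY))
print(initial_marks(True, PairMarking.RAW_ONLY))
```

When the pair is treated as one photo, the marking preference is ignored: leaving one half of a single photo unmarked would have no meaning.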

With respect to RAW + JPEG pairs, finally if one of the pair has been previously downloaded and the other has not:

  1. If treating the pair as two photos, display the new file of the pair, and follow the program preference for marking it for download. If the previously downloaded photo is currently displayed as a completed download, keep its display, making it obvious that it has already been downloaded.

  2. If treating the pair as one photo, display the 2nd of the pair signifying below its thumbnail that it is a RAW + JPEG. If the previously downloaded first of the pair is currently displayed, replace it with the new file, but show in its tooltip that its pair had previously been downloaded.

  3. When naming the 2nd of the pair when sequence numbers are used, the program currently reuses the sequence number of the 1st of the pair only when both files are downloaded in the same session [see 4]. This behavior will be changed such that the sequence number of the 2nd will be matched with the 1st. In other words, if you download one of the pair, exit the program, and download the second of the pair, the sequence number will now match.

With respect to RAW + JPEG pairs, if both of the pair have been previously downloaded, and one or both of the pair are downloaded again and sequence numbers are used in file renaming, a new sequence number will be assigned.
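
The sequence number rules above could be sketched roughly as follows. This is only an illustration: `SequenceAllocator` and its method names are hypothetical, the record of earlier downloads is kept in memory here (to match across sessions it would have to be stored on disk), and re-downloading a half that was already downloaded is assumed to always start a new number, which goes slightly beyond what the proposal spells out.

```python
from typing import Dict, Set, Tuple


class SequenceAllocator:
    """Hands out sequence numbers so both halves of a RAW + JPEG pair match."""

    def __init__(self, start: int = 1) -> None:
        self._next = start
        # pair key (e.g. base file name) -> (sequence number, kinds downloaded with it)
        self._pairs: Dict[str, Tuple[int, Set[str]]] = {}

    def number_for(self, pair_key: str, kind: str) -> int:
        """Return the sequence number for one file; kind is 'raw' or 'jpeg'."""
        record = self._pairs.get(pair_key)
        if record is not None:
            seq, kinds = record
            if kind not in kinds:
                # The other half was downloaded earlier: reuse its number.
                kinds.add(kind)
                return seq
        # Either a new photo, or a repeat download of the same half:
        # assign a new sequence number and start a fresh record.
        seq = self._next
        self._next += 1
        self._pairs[pair_key] = (seq, {kind})
        return seq


alloc = SequenceAllocator()
print(alloc.number_for("IMG_0001", "raw"))   # first of the pair
print(alloc.number_for("IMG_0001", "jpeg"))  # second of the pair, same number
print(alloc.number_for("IMG_0001", "raw"))   # downloaded again, new number
```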

Finally, because this proposal may not suit all users, and to maintain the option of keeping existing behavior, the proposed features could be turned on and off (in their entirety, not piecemeal) using the program preferences.

Proposed Preference User Interface Changes

New preference option Consolidate files across devices and downloads when it is enabled:

Note how Treat matching RAW + JPEG files as two photos is enabled.

By contrast, here Treat matching RAW + JPEG files as one photo is enabled:

Note that when treated as one photo, it’s impossible to not mark for download one of the pair (because that makes no sense!).

When the new preference option Consolidate files across devices and downloads is disabled, a new preference for controlling the display of completed downloads is enabled, and all of the new features proposed here are disabled in their entirety:

When the new preference option Consolidate files across devices and downloads is disabled, and a new device is inserted, the program will display a message of the following kind:

Proposal Benefits

If this proposal is well thought through, and implemented satisfactorily, the program should “just work”, hiding the complexity of various download scenarios. Naïve or exhausted users will not be surprised by duplicate thumbnails for the same file. RAW + JPEG users will have more control and a potentially more efficient workflow.

Proposal Pitfalls

  1. It’s not easy to explain what the proposed feature set does. Understanding the meaning of “consolidating files across devices and downloads” would require reading the program’s documentation (or this proposal). Not all users read the documentation—emails I receive inquiring about the program’s basic operation attest to that!

  2. Performance could suffer, particularly when downloading tens of thousands of files. The program stores its in-memory database of displayed files in the program’s primary process and thread, which is also the process that controls the user interface. Determining duplicate files and RAW + JPEG pairs requires querying that database. If the operations related to this proposal are too slow, the user interface will freeze for the duration of the operations, which is not good. Solving this problem may introduce unwanted complexity.

  3. The implementation could be complex in any case, especially given the uncontrolled nature of device insertions and removals, magnifying the possibility of bugs and increasing the possibility a file might not be downloaded that should have been, leading to unexpected file loss.

  4. If the user downloads in different time zones, all bets are off, as time zones are not recorded in EXIF. This is potentially a very difficult problem to solve. (Or perhaps it’s more accurate to merely say I haven’t given it a lot of thought.)

  5. “Downloads” of processed files (i.e. those already on the computer) could cause unexpected problems that have not been considered in this proposal.
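
On the performance concern in pitfall 2, one possible mitigation is to keep an index keyed by file name and size, so that checking a newly scanned file does not require scanning every displayed file. The sketch below is an assumption about how that might look, not a description of the program's in-memory database; `DuplicateIndex` and the two-second tolerance are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class DuplicateIndex:
    """Dictionary-backed lookup of known files by (name, size)."""

    def __init__(self, tolerance: float = 2.0) -> None:
        # Assumed tolerance in seconds for "broadly equal" modification times.
        self._tolerance = tolerance
        self._by_key: DefaultDict[Tuple[str, int], List[float]] = defaultdict(list)

    def add(self, name: str, size: int, mtime: float) -> None:
        """Record a displayed or downloaded file."""
        self._by_key[(name, size)].append(mtime)

    def contains(self, name: str, size: int, mtime: float) -> bool:
        """True if a file with the same name and size and a close mtime is known."""
        candidates = self._by_key.get((name, size), ())
        return any(abs(mtime - known) <= self._tolerance for known in candidates)


index = DuplicateIndex()
index.add("IMG_0001.CR2", 25_000_000, 1_500_000_000.0)
print(index.contains("IMG_0001.CR2", 25_000_000, 1_500_000_001.0))
print(index.contains("IMG_0001.CR2", 25_000_000, 1_500_000_060.0))
```

Each lookup touches only the files that share a name and size, so its cost should stay roughly constant as the number of displayed files grows. Whether that is enough to keep the user interface responsive would still need to be measured.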

Addendum

A file is “the same”, i.e. a duplicate, when its file name and file size match, and its modification time is broadly equal. The same is true of RAW + JPEG pairs: they match when their base filenames match and EXIF times are broadly equal. The reason times cannot be strictly compared for equality is that cameras do not necessarily write to two different memory cards at precisely the same time. In fact, much of the time they don’t, because the memory card speeds might differ, free space might differ, or different formats are being written to different cards.
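
The criteria above can be expressed as two small predicates. The following is a minimal sketch under assumptions: the two-second tolerance is a guess at what “broadly equal” might mean, the extension sets are illustrative rather than complete, and `ScannedFile`, `is_duplicate` and `is_raw_jpeg_pair` are hypothetical names.

```python
from dataclasses import dataclass
from pathlib import PurePath

# Assumed tolerance in seconds; the proposal only says "broadly equal".
TIME_TOLERANCE_SECONDS = 2.0

# Illustrative extension sets, not exhaustive.
RAW_EXTENSIONS = {".cr2", ".nef", ".arw", ".dng", ".pef"}
JPEG_EXTENSIONS = {".jpg", ".jpeg"}


@dataclass(frozen=True)
class ScannedFile:
    name: str         # file name with extension, e.g. "IMG_0001.CR2"
    size: int         # size in bytes
    mtime: float      # modification time, seconds since the epoch
    exif_time: float  # EXIF capture time, expressed in seconds


def broadly_equal(a: float, b: float, tolerance: float = TIME_TOLERANCE_SECONDS) -> bool:
    """True if two times differ by no more than the tolerance."""
    return abs(a - b) <= tolerance


def is_duplicate(a: ScannedFile, b: ScannedFile) -> bool:
    """Same name, same size, and broadly equal modification time."""
    return a.name == b.name and a.size == b.size and broadly_equal(a.mtime, b.mtime)


def is_raw_jpeg_pair(a: ScannedFile, b: ScannedFile) -> bool:
    """Same base file name, one RAW and one JPEG, broadly equal EXIF time."""
    path_a, path_b = PurePath(a.name), PurePath(b.name)
    extensions = {path_a.suffix.lower(), path_b.suffix.lower()}
    one_raw = len(extensions & RAW_EXTENSIONS) == 1
    one_jpeg = len(extensions & JPEG_EXTENSIONS) == 1
    return (
        path_a.stem == path_b.stem
        and one_raw
        and one_jpeg
        and broadly_equal(a.exif_time, b.exif_time)
    )


raw = ScannedFile("IMG_0001.CR2", 25_000_000, 1000.0, 1000.0)
raw_copy = ScannedFile("IMG_0001.CR2", 25_000_000, 1001.5, 1000.0)
jpeg = ScannedFile("IMG_0001.JPG", 5_000_000, 1001.0, 1000.0)
print(is_duplicate(raw, raw_copy), is_raw_jpeg_pair(raw, jpeg))
```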

Endnotes

[1] To simplify the discussion, this proposal ignores the fact that the program can be set to filter the display of thumbnails by using the Timeline and by showing only new files. In other words, for the purposes of discussion, a file is “displayed” even if it is hidden due to the aforementioned filtering.

[2] When dual card slots are detected in a camera, the program will display an emblem or emblems showing which memory card the file comes from.

[3] While it’s increasingly true that ‘RAW’ can be written as ‘raw’, I’m following the Canon naming convention for RAW + JPEG. See for instance http://cpn.canon-europe.com/content/education/infobank/image_compression/raw_jpeg_shooting.do

[4] Treating the matching RAW + JPEG pair as two photos has no effect on file renaming. For current program operation, see http://damonlynch.net/rapid/documentation/#rawjpeg

Thank you for reading this far in this lengthy proposal, and I look forward to your comments!

Just a little comment since I am on my mobile, probably more feedback when I am on a real computer next week: Some dialog texts are hard to understand without knowing it beforehand: 1. What is the difference between “treat … as one photo” and “… mark both for download”? 2. What does “clear the completed downloads” mean?

Sounds good…maybe for further clarity, “…replace it in the display with the new file, …”

I really like this. I shoot RAW+JPEG and have only one slot for SD cards. Normally, I am able to download from both cards in the same session, but there have been situations where I have not been able to.

For your consideration, my workflow:
1: I always shoot RAW+jpeg (well almost, not images intended for hugin)
2: I do NOT remove the SD cards, I plug the camera (K3) into a USB3 plug, permanently wired into desktop
3: After selection, on computer images are moved to “Photo” (from Pictures)
4: My NAS backs up “Photos” every 12 hours

I might look at images on other computers (e.g. a laptop) but I’ve only “captured” the image when my main computer has it

To that end, I’d like to configure the main computer to MOVE the images off the card, but other computers to simply copy …copies are always considered ephemeral (MOVE is permanent). If I set SD cards to “duplicate” it’s because I have doubts about a card. To that end I’d like to have BOTH copies downloaded (e.g. 2 sub directories SD1 & SD2) then if e.g. I find that SD2 images are often corrupted I know who the culprit is.
When going on a shoot I know both cards are clean with plenty of capacity.

One failure mode is where a 64GB card is in fact say a 4GB one [internally, i.e. a fake], so I might end up with 400 images on SD1 and 50 on SD2 (all taken in duplicate). I’d REALLY like to know about that fake card, before I lose an image.

Interesting point. I’ll keep that in mind. Thanks for your input.

Glad you’re putting so much thought into this useful tool. Thanks for your ongoing efforts to improve its usability for a wide audience.

While my needs are simple and well handled by RPD, I’d love to be able to tick a checkbox and have the program ignore all my jpgs. I rarely need to download them, as I usually use RAW only for editing. I use the workaround you suggested some months back, and it gets me there, but I’d love to be able to check a box that says “Do not import JPG” and go from there.