There’s lots to say about the topic, too much for one post. What I will say is that the performance you need is also dependent upon the image sizes you will be mangling.
In code, most image processing is done in two nested loops, one iterating the rows and one nested within that iterates through the pixels in each row. To make this go faster, the programmer will use programming tools such as OpenMP to divide the rows among the available cores. This is a trivial thing to do; if you’ve already written the two loops, parallelizing them is as simple as putting a statement called a pragma on the line above the outer loop and compiling with OpenMP enabled, and the compiler will do all the work to “parallel-ize” the code. In fact, the generated code will by default use all of the cores available on the computer. Cool Beans!!!
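To make that concrete, here’s a rough sketch in C of what those two loops and the pragma look like. The function name `brighten` and the 8-bit grayscale layout are made up for illustration; this isn’t code from any particular program.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: add `delta` to every pixel of an 8-bit grayscale
   image stored row by row, clamping the result to 0..255. */
void brighten(uint8_t *pixels, int width, int height, int delta)
{
    /* The pragma asks the compiler to divide the rows among the cores.
       If OpenMP is not enabled (e.g. no -fopenmp on GCC/Clang), the
       pragma is ignored and the loops simply run on one core. */
    #pragma omp parallel for
    for (int y = 0; y < height; y++) {
        uint8_t *row = pixels + (size_t)y * width;  /* start of row y */
        for (int x = 0; x < width; x++) {
            int v = row[x] + delta;
            row[x] = (uint8_t)(v > 255 ? 255 : (v < 0 ? 0 : v));
        }
    }
}
```

Each row is independent of the others, which is why handing whole rows to different cores is safe here.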
But the speedup isn’t a straight multiplication, because for an image in a given location in memory, some cores are closer to it than others. One core can usually address the memory directly, but the others have to “navigate” varying levels of memory cache. This overhead takes away some of the advantage of dividing the work among the cores.
On slower machines, this overhead can start to negate the advantage of more cores for smaller images, because each core’s share of the work is small relative to the overhead. This is just a notion, but I think that for images of 24MP and smaller, going beyond four cores adds so little speedup that you won’t notice it on slower machines.
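If a programmer wanted to act on that notion, OpenMP’s `num_threads` clause lets the code cap the thread count for small images. This is only a sketch: the 24MP and four-thread cutoffs are the guess from above rather than measured values, and `pick_threads` and `invert` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cutoffs taken from the notion above, not from measurements;
   the right values depend on the machine and should be found by timing. */
#define SMALL_IMAGE_PIXELS  24000000LL  /* 24MP */
#define SMALL_IMAGE_THREADS 4

/* Hypothetical helper: how many threads to use for an image of
   `npixels` pixels when `available` cores are present. */
int pick_threads(long long npixels, int available)
{
    int threads = available < 1 ? 1 : available;
    if (npixels <= SMALL_IMAGE_PIXELS && threads > SMALL_IMAGE_THREADS)
        threads = SMALL_IMAGE_THREADS;
    return threads;
}

/* Invert an 8-bit grayscale image, using at most the chosen thread count. */
void invert(uint8_t *pixels, int width, int height, int available)
{
    int threads = pick_threads((long long)width * height, available);

    /* num_threads limits how many threads this one loop may use. */
    #pragma omp parallel for num_threads(threads)
    for (int y = 0; y < height; y++) {
        uint8_t *row = pixels + (size_t)y * width;
        for (int x = 0; x < width; x++)
            row[x] = (uint8_t)(255 - row[x]);
    }
}
```

Whether a cap like this actually helps is something you’d have to time on the machine in question.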
So, the bottom line to all this is that the first priority in selecting hardware is to get a relatively fast processor. Right after that would come the number of cores, along with the caching structure, but I probably wouldn’t worry too much about having lots of cores if the images are relatively small.
Now, all of the above depends on whether the programmer actually put the pragma in all the places in the code that could benefit from being parallelized. It used to be hard to program multithreading, but OpenMP makes it trivial. Using the GPU to do processing is still hard to do, so you’ll find such use implemented sporadically in programs, sometimes saved for the most tedious operations. What this means is that you need to understand your specific software’s use of the hardware; if you use mine, you’d be silly to buy a GPU, because I don’t do that sort of programming…
If you read some of the hardware reviews, you’ll find they typically run benchmark software that includes some sort of image processing, like the multithreaded HandBrake video transcoder. This is probably as close to a relevant benchmark for our purposes as you’ll find.
It’s a complicated topic; hope this helps…