codepaths/openmp_simd, sse2

SSE is one flavour of SIMD (Single Instruction, Multiple Data). SSE2 code paths are manually vectorized code. OpenMP SIMD code paths are vectorized automatically by the compiler. Both process multiple data elements with a single instruction, leading to substantial speed-ups where that logic can be used.
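To make the difference concrete, here is a minimal sketch of the same loop written both ways. The function names `scale_omp_simd` and `scale_sse2` are hypothetical and only illustrate the two styles; they are not taken from the actual code base. The sketch assumes a compiler that provides `<emmintrin.h>` on x86 and falls back to plain scalar code elsewhere.

```c
#include <stddef.h>
#ifdef __SSE2__
#include <emmintrin.h> // SSE2 header, also pulls in the SSE float intrinsics used below
#endif

// Hypothetical example: multiply n floats by gain.
// OpenMP SIMD path: the loop stays plain C, the pragma asks the compiler to vectorize it.
static void scale_omp_simd(float *buf, size_t n, float gain)
{
#ifdef _OPENMP
#pragma omp simd
#endif
  for(size_t i = 0; i < n; i++) buf[i] *= gain;
}

// Manual path: the programmer picks the vector width (4 floats) and the instructions.
static void scale_sse2(float *buf, size_t n, float gain)
{
  size_t i = 0;
#ifdef __SSE2__
  const __m128 g = _mm_set1_ps(gain); // broadcast gain to all 4 lanes
  for(; i + 4 <= n; i += 4)
  {
    const __m128 v = _mm_loadu_ps(buf + i);   // unaligned load of 4 floats
    _mm_storeu_ps(buf + i, _mm_mul_ps(v, g)); // 4 multiplications in one instruction
  }
#endif
  for(; i < n; i++) buf[i] *= gain; // scalar tail (or the whole loop without SSE2)
}
```

In the first version the compiler is free to choose the vector width for the target CPU, while the second is fixed at four floats per instruction whatever the hardware offers.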

Recent work on color conversions in the pipeline (between Lab and RGB, in both directions) has shown that manually vectorized SSE code was slower than automagically vectorized OpenMP SIMD code. The reason is probably that the compiler's heuristics adapt better to the CPU cache size and SSE generation. But bear in mind that color conversions are straightforward matrix/vector dot products, which is exactly what SSE is designed for.
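As an illustration of the kind of loop involved, the sketch below applies a 3x3 matrix to every pixel of a buffer and leaves the vectorization to the compiler. The function name `apply_matrix`, the 4-floats-per-pixel layout and the row-major matrix are assumptions made for this example, not the pipeline's actual conversion code.

```c
#include <stddef.h>

// Hypothetical example: out = m * in for each pixel.
// Assumed layout: 4 floats per pixel (3 color channels + 1 extra channel copied as-is),
// m is a row-major 3x3 matrix.
static void apply_matrix(const float *in, float *out, size_t npixels, const float m[9])
{
#ifdef _OPENMP
#pragma omp simd
#endif
  for(size_t k = 0; k < npixels; k++)
  {
    // read the whole pixel first, so the loop also works in place (in == out)
    const float r = in[4 * k + 0];
    const float g = in[4 * k + 1];
    const float b = in[4 * k + 2];
    const float x = in[4 * k + 3];

    // one dot product per output channel
    out[4 * k + 0] = m[0] * r + m[1] * g + m[2] * b;
    out[4 * k + 1] = m[3] * r + m[4] * g + m[5] * b;
    out[4 * k + 2] = m[6] * r + m[7] * g + m[8] * b;
    out[4 * k + 3] = x;
  }
}
```

Nothing in this loop names a vector width or an instruction set, so the same source can be compiled for whichever SSE or AVX generation the build targets.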

That prompted checks on several modules to see whether the manual SSE2 code brought any benefit, and it was removed where it proved slower than the OpenMP code. However, this result is not systematic, so SSE2 code paths are kept in some places and there is no reason to override them with the OpenMP SIMD paths.

Also, whether you build the program yourself or use a packaged build compiled with march=generic will make a difference here, and the target_clones are not fully functional and barely tested.