Ok, I think I should probably explain some of my code, since GLSL is weird in a few ways, but also since I’m doing things in ways that seem significantly different from what other code (such as your own) and the usual formulas suggest. I’ve been studying the code you linked to, especially XYZ_to_LAB, as it applies the gamma/transfer curve identically to all channels (and apparently it’s like that on Bruce Lindbloom’s XYZ→L*a*b* page as well; I wish I’d looked at that page and not just the one for the other direction). It looks like we might be doing the same thing, but I’m still unsure, so I’ll try converting what my code does into something more similar.
Firstly, GLSL runs on GPUs, and as such, it’s apparently considered bad practice to have code that branches much. So within my toGamma and toLinear functions, instead of if and ternary statements, I have it simply compute both ‘branches’ and then ‘mix’ the results based on a per-channel conditional. So, let’s convert this GLSL code to C:
vec4 toGamma(vec4 color, const transfer trc)
{
    bvec4 cutoff = lessThan(color, vec4(trc.cutoffToGamma));
    vec4 higher = (1.0 + trc.off)*pow(color, vec4(1.0/trc.power)) - trc.off;
    vec4 lower = color*trc.slope;
    color = mix(higher, lower, cutoff);
    return color;
}
In C, the definition of the transfer struct (very easy to find, as it’s the first thing I define in my shaders) would remain the same, except I might change float to double (in GLSL, doubles are just floats at a different precision; setting that precision is literally the first few lines of code in my shader).
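For concreteness, here’s a sketch of that C struct with the fields widened to double (the field names and comments come straight from the GLSL definition quoted further down; the typedef spelling is my own):

```c
// C translation of the GLSL 'transfer' struct, with float widened to double.
typedef struct transfer {
    // Exponent used to linearize the signal
    double power;
    // Offset from 0.0 for the exponential curve
    double off;
    // Slope of linear segment near 0
    double slope;
    // Values below this are divided by slope during linearization
    double cutoffToLinear;
    // Values below this are multiplied by slope during gamma correction
    double cutoffToGamma;
} transfer;
```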
So when we rewrite toGamma in such a way that the C version is similar to XYZ_to_LAB, we’d end up with something like this:
#include <math.h> /* for pow() */

void toGamma(double r, double g, double b, transfer trc, double *to_r, double *to_g, double *to_b)
{
    *to_r = r >= trc.cutoffToGamma ? (1.0 + trc.off) * pow(r, 1.0 / trc.power) - trc.off : r * trc.slope;
    *to_g = g >= trc.cutoffToGamma ? (1.0 + trc.off) * pow(g, 1.0 / trc.power) - trc.off : g * trc.slope;
    *to_b = b >= trc.cutoffToGamma ? (1.0 + trc.off) * pow(b, 1.0 / trc.power) - trc.off : b * trc.slope;
}
So, prior to receiving these responses, I had made a ‘private’ shader on Shadertoy that applied my L*a*b*→XYZ code backwards. I first converted each pixel of an image from RGB to XYZ, then from there worked through my code backwards to go from XYZ to L*a*b*. After all that, I slapped in my original L*a*b*→XYZ code and checked whether it gave me the original image back.
I admit that at first it didn’t, but that was because two operations needed to happen in the opposite order when run in reverse, and I’d kept them in the original order. After fixing that, I indeed got the original image back. The one remaining issue was that the darkest, closest-to-black pixels were for some reason flagged as ‘out of gamut’ and thus turned into that gray color from the original shader, but getting rid of that ‘out of gamut’ check fixed it.
So, since it’s easier to compare XYZ→L*a*b*, ever since your response I’ve been working to compare your version with mine. Here’s the basic GLSL code for just the XYZ→L*a*b* conversion (originally just inlined inside mainImage, but here I’ve split it into a separate function and included the definitions (and their comments) of a few things for convenience):
// Parameters for transfer characteristics (gamma curves)
struct transfer {
    // Exponent used to linearize the signal
    float power;
    // Offset from 0.0 for the exponential curve
    float off;
    // Slope of linear segment near 0
    float slope;
    // Values below this are divided by slope during linearization
    float cutoffToLinear;
    // Values below this are multiplied by slope during gamma correction
    float cutoffToGamma;
};
// Creates a whitepoint's xyz chromaticity coordinates from the given xy coordinates
#define white(x, y)\
    vec3(x, y, (1.0 - x - y))
// Divides a whitepoint's coordinates by its y, so that Y = 1
#define Bright(w)\
    ((w)/w.y)
// Standard illuminant D50 according to ICC standards (they specify a hex value
// for the 16-bit integer representation, as well as a specific way to decode it,
// so I did some math to figure out exactly the values they expect)
const vec3 whiteD50I = white(31595.0/91393.0, 32768.0/91393.0);
// Gamma for the CIE L*a*b* Lightness scale
const transfer gamLab = transfer(3.0, 0.16, 243.89/27.0, 0.08, 216.0/24389.0);
// Color Appearance Model (or 'reference') white point
const vec3 whiteCam = whiteD50I;

vec4 XYZ_to_LAB(vec4 color)
{
    color.xyz /= Bright(whiteCam);
    color = toGamma(color, gamLab);
    color.xyz = (color.xyz*100.0 + 16.0)/116.0;
    color.xz = (color.xz - color.y)*vec2(500, -200);
    color.y = color.y*116.0 - 16.0;
    return color;
}
If I convert it to C (like earlier), I get something like this:
#include <math.h> /* for pow() */

#define D50_WHITE_REF_x (31595.0 / 91393.0)
#define D50_WHITE_REF_y (32768.0 / 91393.0)
#define D50_WHITE_REF_X (D50_WHITE_REF_x / D50_WHITE_REF_y)
#define D50_WHITE_REF_Y (1.0)
#define D50_WHITE_REF_Z ((1.0 - D50_WHITE_REF_x - D50_WHITE_REF_y) / D50_WHITE_REF_y)
#define LAB_GAMMA (3.0)
#define LAB_OFFSET (0.16)
#define LAB_SLOPE (243.89 / 27.0)
#define LAB_CUT_GAMMA (216.0 / 24389.0)

void XYZ_to_LAB(double X, double Y, double Z, double *to_L, double *to_a, double *to_b)
{
    double xr = X / D50_WHITE_REF_X;
    double yr = Y / D50_WHITE_REF_Y;
    double zr = Z / D50_WHITE_REF_Z;
    // Doing this inline, but I could instead pack LAB_* into a 'transfer' struct and reuse 'to_gamma()'
    double fx = xr >= LAB_CUT_GAMMA ? (1.0 + LAB_OFFSET) * pow(xr, 1.0 / LAB_GAMMA) - LAB_OFFSET : xr * LAB_SLOPE;
    double fy = yr >= LAB_CUT_GAMMA ? (1.0 + LAB_OFFSET) * pow(yr, 1.0 / LAB_GAMMA) - LAB_OFFSET : yr * LAB_SLOPE;
    double fz = zr >= LAB_CUT_GAMMA ? (1.0 + LAB_OFFSET) * pow(zr, 1.0 / LAB_GAMMA) - LAB_OFFSET : zr * LAB_SLOPE;
    fx = (fx * 100.0 + 16.0) / 116.0;
    fy = (fy * 100.0 + 16.0) / 116.0;
    fz = (fz * 100.0 + 16.0) / 116.0;
    // I could instead set '*to_L' before modifying 'fy' above by simply multiplying by 100
    *to_L = fy * 116.0 - 16.0;
    *to_a = (fx - fy) * 500.0;
    *to_b = (fz - fy) * -200.0;
}
I think it’s safe to say that your code has fewer instructions than mine, so it would likely run faster! However, I’d have to write a special ‘gamma’ function specifically for L*a*b*, rather than reusing the one I use for sRGB, Rec. 709, etc. At this point I suppose it depends on how much faster it is overall; I know mobile devices choke on the 3D version of my gamut shader, so if it ends up being enough of a speed boost to raise my phone’s framerate up from 15fps (which is after setting steps to 30, down from 120), that’d be a pretty good win.
I feel it’s worth pointing out that my toGamma function takes values in the same range as it outputs (0 to 1), and so the offset and multiplier are scaled accordingly (to 0.16 and 1.16, the latter calculated as 1.0 + trc.off). Also, I could shorten the gamma/transfer curve bit by just using the cube root function (cbrt(), which your code uses and which I had to look up because I wasn’t familiar with it) instead of raising the value to the 1/3rd power. However, while I did inline that part of the code instead of reusing it, I decided to keep its content intact. At the very least, it shows what the other code would likely turn into if inlined.
So, it seems as if my code does at least take the +16 stuff into account when using that function; now to figure out whether it does so in a way that leads to correct results!
Note: I have not tested any of the C code I wrote above. I’m assuming it’s correct, as I basically performed a literal translation from GLSL to C (roughly in the style your code’s in), but I may have changed something while editing it and broken the rest by forgetting to update everything else to match. I’ll test things more properly after I go eat lunch.