Looking at the Git repository, there is not much going on except a new pull request: upgrade to SIMDe 0.7.4, by Mr-C (Michael R. Crusoe, who lives in Germany).
Sorry to have been quiet for a while. I had been very busy with other projects until recently.
I’m working on the migration to LibRaw to support more cameras, but it’s harder than I expected.
I hope to be able to release a new version of LightZone by the end of June.
Great news! This should be top priority, as DCRaw hasn’t been updated since 2018 and is lagging far behind on virtually all mirrorless cameras and all Canon CR3 cameras. For those unfamiliar with LibRaw, it looks extremely promising and would be a fantastic upgrade to LightZone. I haven’t tried compiling the source code yet, and I’m not sure I have that capability. It looks like a Windows build is already compiled. I haven’t tested that yet either, since I primarily use an Ubuntu 22.04 desktop.
Hello,
I have 0 real-life experience with low-level optimisation (I normally work as a business developer, so asynchronous execution and SQL tuning are the most often used methods to speed up our processes).
Have you perhaps already looked into the new Vector API to replace JNI and native code? It’s currently in incubation.
https://openjdk.org/jeps/438
To calculate a Euclidean norm using vectorised operations, one could write:
    // needs: import jdk.incubator.vector.FloatVector;
    //        import jdk.incubator.vector.VectorSpecies;
    // and the JVM flag --add-modules jdk.incubator.vector while the API is incubating

    // the CPU-specific vectorisation params
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    static float[] vectorisedNorm(float[] x, float[] y, float[] z) {
        float[] norm = new float[x.length];
        int loopBound = SPECIES.loopBound(x.length);
        int vectorSize = SPECIES.length();
        int i;
        for (i = 0; i < loopBound; i += vectorSize) {
            FloatVector xVector = FloatVector.fromArray(SPECIES, x, i);
            FloatVector yVector = FloatVector.fromArray(SPECIES, y, i);
            FloatVector zVector = FloatVector.fromArray(SPECIES, z, i);
            // x*x + (y*y + z*z), built from fused multiply-adds
            xVector.fma(xVector, yVector.fma(yVector, zVector.mul(zVector)))
                    .sqrt()
                    .intoArray(norm, i);
        }
        // process the remaining few items sequentially
        for (; i < x.length; i++) {
            norm[i] = (float) Math.sqrt(x[i] * x[i] + y[i] * y[i] + z[i] * z[i]);
        }
        return norm;
    }
For very simple operations, auto-vectorisation is already very fast. In fact, the gain from the code above is minimal: with about 13 million entries in the arrays (6123 * 2123, to get some non-aligned items), I get 12 ms of computation time on my Ryzen 5 5600X. For more complex operations, and surely in experienced hands, the API could produce better results.
Also, allocating float[] norm for each test takes about 3 ms (as you know, Java will zero out the array, even though it is then completely overwritten by the calculation). Moving to an externally provided array, the time is about 9 ms instead of 12. Maybe using a MemorySegment could help; or simply organising the code in such a way that the array is reusable can eliminate that cost.
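The "externally provided array" idea can be sketched without the incubating Vector API. This is only a minimal illustration: the class name NormBuffers and the method normInto are hypothetical and not part of the code above, and a plain scalar loop is used so that it runs on any JDK, leaving vectorisation to the JIT.

```java
final class NormBuffers {
    private NormBuffers() {}

    /** Writes sqrt(x^2 + y^2 + z^2) per element into a caller-supplied array. */
    public static void normInto(float[] x, float[] y, float[] z, float[] out) {
        if (y.length != x.length || z.length != x.length || out.length < x.length) {
            throw new IllegalArgumentException("array lengths do not match");
        }
        for (int i = 0; i < x.length; i++) {
            out[i] = (float) Math.sqrt(x[i] * x[i] + y[i] * y[i] + z[i] * z[i]);
        }
    }

    public static void main(String[] args) {
        float[] x = {3f, 1f};
        float[] y = {4f, 2f};
        float[] z = {0f, 2f};
        float[] out = new float[x.length]; // allocated (and zeroed) once
        for (int run = 0; run < 3; run++) {
            normInto(x, y, z, out);        // every run overwrites the same buffer
        }
        System.out.println(out[0] + " " + out[1]); // prints "5.0 3.0"
    }
}
```

The caller owns the buffer, so the zeroing cost is paid once rather than on every call. The trade-off is that each call overwrites the previous result.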
Parallelism via parallel streams and/or hand-written parallel code could also increase performance, though doing that on such short calculations as the sample shown above may not bring much benefit:
    // needs: import java.util.concurrent.ForkJoinPool;
    //        import java.util.stream.IntStream;
    // plus SPECIES and the Vector API imports from the previous snippet

    private static final ForkJoinPool POOL = new ForkJoinPool();

    static float[] vectorisedNormPar(float[] x, float[] y, float[] z) {
        float[] norm = new float[x.length];
        int nCores = Runtime.getRuntime().availableProcessors();
        // block size rounded down to a multiple of the vector length
        int blockSize = SPECIES.loopBound(x.length / nCores);
        POOL.submit(() ->
            IntStream.range(0, nCores).parallel().forEach(blockN -> {
                boolean isLastBlock = (blockN == nCores - 1);
                int startIndex = blockN * blockSize;
                processBlock(x, y, z, norm, startIndex, blockSize, isLastBlock);
            })
        ).join();
        return norm;
    }

    private static void processBlock(
            float[] x, float[] y, float[] z, float[] norm,
            int start, int blockSize, boolean isLastBlock) {
        // the last block also covers whatever the rounding above left over
        int end = isLastBlock ? SPECIES.loopBound(x.length) : start + blockSize;
        int chunkSize = SPECIES.length();
        int i;
        for (i = start; i < end; i += chunkSize) {
            FloatVector xVector = FloatVector.fromArray(SPECIES, x, i);
            FloatVector yVector = FloatVector.fromArray(SPECIES, y, i);
            FloatVector zVector = FloatVector.fromArray(SPECIES, z, i);
            xVector.fma(xVector, yVector.fma(yVector, zVector.mul(zVector)))
                    .sqrt()
                    .intoArray(norm, i);
        }
        if (isLastBlock) {
            // scalar tail for the items that do not fill a whole vector
            for (; i < x.length; i++) {
                norm[i] = (float) Math.sqrt(x[i] * x[i] + y[i] * y[i] + z[i] * z[i]);
            }
        }
    }
This saves a little bit of time, but not much: I get about 11.4 ms/op on my 12-core CPU (probably memory/cache issues? I’m not a low-level guy).
On my machine, processing a block of 1024x1024 with 64 chunks (instead of nCores = Runtime.getRuntime().availableProcessors() above) gives the best results: roughly 0.04 ns per array element. To measure this, I allocated the x, y and z arrays with a size of 1024x1024, passed them to the vectorisedNormPar method with nCores replaced by a chunk count of 64, ran vectorisedNormParallel(x, y, z, norm, 64) 10,000 times, and divided the measured time (422 ms) by (10,000 * 1,024 * 1,024).
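The variant with an explicit chunk count and a caller-supplied output array is not shown in the post, so here is a minimal sketch of what such a method could look like. The names ChunkedNorm and normParallel are hypothetical. It uses scalar loops and the common fork-join pool instead of the Vector API and a dedicated pool, so it runs on any JDK without extra flags. The timing loop in main has no warm-up and is not a substitute for a proper benchmark harness.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class ChunkedNorm {
    private ChunkedNorm() {}

    /** Splits the index range into nChunks contiguous chunks and processes them in parallel. */
    public static void normParallel(float[] x, float[] y, float[] z, float[] out, int nChunks) {
        if (nChunks <= 0) {
            throw new IllegalArgumentException("nChunks must be positive");
        }
        int n = x.length;
        int chunkSize = (n + nChunks - 1) / nChunks; // ceiling division
        IntStream.range(0, nChunks).parallel().forEach(chunk -> {
            int start = chunk * chunkSize;
            int end = Math.min(n, start + chunkSize); // trailing chunks may be short or empty
            for (int i = start; i < end; i++) {
                out[i] = (float) Math.sqrt(x[i] * x[i] + y[i] * y[i] + z[i] * z[i]);
            }
        });
    }

    public static void main(String[] args) {
        int n = 1024 * 1024;
        float[] x = new float[n];
        float[] y = new float[n];
        float[] z = new float[n];
        float[] out = new float[n]; // reused across all runs
        Arrays.fill(x, 1f);
        Arrays.fill(y, 2f);
        Arrays.fill(z, 2f);

        int runs = 100;
        long t0 = System.nanoTime();
        for (int r = 0; r < runs; r++) {
            normParallel(x, y, z, out, 64);
        }
        long elapsedNs = System.nanoTime() - t0;
        // same arithmetic as in the post: total time / (runs * elements)
        System.out.printf("%.4f ns per element%n", (double) elapsedNs / ((double) runs * n));
    }
}
```

Using more chunks than cores lets the pool rebalance work when some threads finish early, which may be why 64 chunks did better than one block per core in the measurement above.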
Also, TornadoVM promises GPGPU support (OpenCL, CUDA) through annotations:
    // needs: import uk.ac.manchester.tornado.api.annotations.Parallel;
    class Compute {
        public static void matrixMultiplication(final float[] A, final float[] B, final float[] C, final int size) {
            for (@Parallel int i = 0; i < size; i++) {
                for (@Parallel int j = 0; j < size; j++) {
                    float sum = 0.0f;
                    for (int k = 0; k < size; k++)
                        sum += A[(i * size) + k] * B[(k * size) + j];
                    C[(i * size) + j] = sum;
                }
            }
        }
    }
https://www.infoq.com/articles/tornadovm-java-gpu-fpga/
https://www.infoq.com/articles/java-performance-tornadovm/
Thank you for the update Masahiro! Appreciate your effort towards LightZone development!
Regards,
Sattva
Hi, I’ve just released a new beta version of LightZone.
Please note that this is a beta version, so it may not be as stable as you expected.
- This version cannot open/edit Canon .cr3 files, since migration to LibRaw is still ongoing.
- On macOS it crashes complaining about code signing. I’m working on this issue now. I’ll release fixed packages asap.
- If you cannot open a raw file that can be opened with older versions, please let me know your camera model shown on the Metadata tab.
- Edit: If you are a Fujifilm user, please don’t use this version. It cannot open Fujifilm raws because of a bug.
As for the migration to LibRaw, I can already open raw files with LibRaw on the development version of LZ, but the resulting image color is greenish. I’m going to fix the color balance and enable LibRaw in the next release (5.0.0beta2).
Hi
I’ve only just discovered Lightzone and it’s impressive and quite simple to use. It really should be more widely known.
However, I can’t find a way of increasing the size of the tools on a large high definition monitor - they need to be up to twice as big, and it would be good to be able to give it more RAM.
Once I know how to make the tools the right size I’m looking forward to really getting to grips with it.
I have no answer for you even though I did use LZ on Windows back in the day when it was payware and Uwe Steinmüller was their photographic advisor. He died about ten years ago, but I just found that parts of his Outback Photo website still work: http://www.outbackphoto.com/content/technique.html
If I remember this correctly, the LZ programmer suddenly left the building. LZ entered a zombie state for quite some time while a group of LZ enthusiast users worked hard to move the code to Open Source. I believe the main obstacles were of a legal nature, as LZ depended on third party closed source code (libraries). Once solved, it was still an uphill battle as LZ needed programmers. Luckily, Masahiro Kitagawa showed up.
The question I have is: how is the software going to handle both a large high definition monitor and small monitors that are not high def? Isn’t it easier to satisfy the larger group, presumably the other monitors? Further, can you modify your screen settings to match the settings of LZ? I noticed that you asked the same question on GitHub, but I didn’t have a chance to answer there yet.
Steve
Yes, it is very fortunate to have Masahiro as the main LZ programmer. Another obstacle that he has been working on is that the primary raw converter, DCRaw by Dave Coffin, has not been updated since 2018, leaving all new cameras out in the cold. Luckily, there is another raw converter, “libraw”, which is used by software such as Darktable and RawTherapee. As far as I know, that converter is fairly up to date for new cameras. Steve
I’ve been digging around and found a folder “.java” which contains folders for preferences - they’re .xml files. I have no experience of java at all, but maybe something could go in there. I have no idea of what to put - or where, but with a bit of coaching, I’m willing to try.
Is the UI configured with .css? If so there are ways of getting it to respond to different window sizes…
I think most monitors are higher definition now - particularly for people working with images who would tend to put a high priority on it.
I revisited this and your previous post and am a little confused. When I mouse over the toolbar, I get the text describing the tool, though I have to wait a couple of seconds before it appears. Do you not see anything when you mouse over? As for another possible (programming) solution, perhaps the same technique for enlarging/shrinking the histogram can be used. The 3 dots between the histogram and the toolbar can do that trick.
Edit: I just found the Java theme location that might interest you → flatlaf-intellij-themes-3.1.1.jar, found under Java/lightzone.
On my copy of LZ I can raise the memory to near 8 MB, even though my desktop has 16 MB available. When I go into HTOP, I see less than 4 MB of memory used with Firefox, Thunderbird, and LZ all running at the same time. It seems that the max limit selected is not a problem, but YMMV.
Steve
And this is the maximum for an 80286. That’s sustainability in action!
I’m not talking about the tooltip showing, I’m talking about the size of the tools themselves. The bands of the zonemapper, for example, on my screen, are about 4mm wide and therefore impossible to use with any precision - no doubt on a lower definition screen, each band is about 10mm and can be used with confidence.
Similarly the disk on the colour balance module, which allows one to move a point from the middle to influence the colour of the image is too small to use (and you can’t actually see where you’ve put the point). On a lower definition screen it would be probably 3 or 4 cm in diameter and the point a clear dot or a little circle.
Added to which there is a lag (which is why I ask about memory) when I try to use a tool - you have to wait for the tool to “wake up”, particularly the Zonemapper. Is it because it’s trying to work on Linux?
In spite of all this, I’m getting passable results, even if I can’t get the full benefit of the program’s capabilities.
(a little later) I looked in the folder and there are a series of .xml files that pass (mostly colour) variables to change the appearance of the UI. It does give hope that an .xml file put somewhere in ~/.java could override something and make things bigger.
I don’t use lightzone, but maybe something like this, that I use for Digikam, can work in this case too: env QT_DEVICE_PIXEL_RATIO=2 digikam -qwindowtitle %c
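For a Java application the rough counterpart of that Qt variable is a JVM system property. Java 9 and later read sun.java2d.uiScale for Java 2D scaling, and the FlatLaf look and feel (whose theme jar is mentioned above) reads flatlaf.uiScale. Whether the LightZone launcher passes such properties through, and whether its custom tools respect them, is an assumption that has not been verified here. The small check below is only a sketch with a hypothetical class name, UiScaleCheck. It reports which scale properties a JVM was started with, and its simple parser handles plain numbers only, not forms such as "200%".

```java
final class UiScaleCheck {
    private UiScaleCheck() {}

    /** Parses a scale factor such as "2" or "1.5"; returns the fallback if unset or invalid. */
    public static double parseScale(String raw, double fallback) {
        if (raw == null || raw.isBlank()) {
            return fallback;
        }
        try {
            double value = Double.parseDouble(raw.trim());
            return value > 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Both properties are only present if passed with -D or via JAVA_TOOL_OPTIONS.
        double java2dScale = parseScale(System.getProperty("sun.java2d.uiScale"), 1.0);
        double flatLafScale = parseScale(System.getProperty("flatlaf.uiScale"), 1.0);
        System.out.println("sun.java2d.uiScale = " + java2dScale);
        System.out.println("flatlaf.uiScale    = " + flatLafScale);
    }
}
```

Running it as java -Dsun.java2d.uiScale=2 UiScaleCheck.java shows the property arriving in the JVM. For an application started through a launcher script, setting the JAVA_TOOL_OPTIONS environment variable to -Dsun.java2d.uiScale=2 is one way to try the same thing without editing the script.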
I understand now. You can add an enhancement suggestion to your Issue 307 that you entered on GitHub, asking for the ability to stretch each tool, similar to how the histogram can be stretched. I don’t know if it’s possible, but it doesn’t hurt to add it to the issue. Also, I do have a screen magnifier in my OS (Ubuntu 22.04). I can magnify the right side of the screen in increments of .25. It’s a little odd but it works. I would think Windows has the same capability.
Personally I have no lag, at least none noticeable on my desktop, when I bring in the Zonemapper tool or other tools. I remember that in the past, if I had a ton of tools, LZ might lag, but I haven’t noticed that in the latest beta. I did recently upgrade my desktop, however.