Google Gemini is great as an instant DIY FAQ assistant

This is a tangent that began over here on DPR’s MFT forum - My brief talk with Brave Leo AI about “L.Monochrome D” and Lumix picture styles and filters.
TLDR: various AI assistants were all unreliable at basic fact-checking of camera specs, but good at giving a general introduction to black-and-white photography.

So I discovered so-called AI image enhancement or upscaling, totally new to me, and was bombarded with tech lingo and terms that made zero sense at first glance.

Google Gemini was a really great tool for putting together an FAQ on the field at my own pace, step by step. The replies are concise and understandable, the overall user experience is great, and it kept my inquiry structured.

It’s also motivating to have an expert assistant to ask about anything. Quora and Stack Exchange are likely similar within some fields, I guess, if you prefer flesh and blood.

These are linked transcripts:

I assume it knows what it’s talking about on this one :slight_smile:

AI & the Ensh*tification of the Web is on my reading list now, curious to learn more about it from fine folks in this virtual corner of the world.

“Expert” is probably a bit too far IMHO, and the issue is that an LLM can sound convincing but be totally wrong. They are also known to just make things up. This is the same AI, Gemini, that recommended putting glue in your pizza sauce to thicken it up.

I would not assume that. Check it for yourself.

I was Google-searching some code this week, and Gemini gave me an answer. I was curious, and the code was simple. It offered two different ways to style the code, with convincing explanations, as far as I could tell. One example ran correctly; the other didn’t, and wasn’t even really that close (though at a cursory glance it looked OK).
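Purely for illustration (the actual snippets from that search aren’t quoted here), this is the kind of pair I mean, sketched in Python with made-up data: both variants look reasonable at a glance, but only one runs.

```python
# Hypothetical data, just for illustration.
cameras = [{"name": "LX10", "mp": 20}, {"name": "Z50", "mp": 21}]

# Variant 1: runs fine -- sorted() expects a key *function*.
by_name = sorted(cameras, key=lambda cam: cam["name"])
print(by_name)

# Variant 2: reads almost the same and looks plausible at a glance,
# but raises TypeError at runtime, because the key argument must be
# callable, not a string naming the field.
# by_name = sorted(cameras, key="name")
```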

Anyway, “AI” as a term has become a catch-all, marketing nonsense. There is good “AI”, like Upscayl (which uses machine learning), and there is not-so-good “AI”.


Your clarification is appreciated. I was just looking for ‘better’ words for this reply, and it struck me that a confined, well-restricted task like that is perfect for instant ‘expert’ digital assistance.

As is my DIY FAQ example where I was asking for basic explanations and definitions.

The tasks of truth, nothing but the truth, and due diligence remain, of course, whether my source is digital or processed through various chains of flesh and blood.

As does the bias and dataset challenge, i.e. where did you learn that, how did you learn that, what did you eliminate during the process, and so on.

As I put it in another thread, you are basically playing intellectual Russian roulette, where you don’t know which of the confidently delivered “facts” are actually nonsense.


Thanks, I’ll have a look.

As mentioned a minute ago, I asked about basic definitions like “what is Stable Diffusion” and so on. I assume that’s a simple task for Gemini to solve. Once there’s a digital assistant available to teach darktable or vkdt etc., it’ll probably be a big help for many of us. The Skynet stuff is a real concern of course, asking for a lot of unknown future trouble.

I think it is instructive to think about this statement. LLMs (large language models/AI) are “good at” finding statistical correlations among words in a large source set. At this point, that source includes much of the open internet, and an unknown amount of material that falls under various copyright protections.

Things that show up repeatedly in the source, and that generally agree with each other, will be things that an LLM is “good at”. The internet is chock-a-block with explanations of the exposure triangle, aperture and depth of field. Consequently, when you ask a chatbot to explain them, it’s on firm ground.
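To make “statistical correlations among words” concrete, here is a deliberately tiny sketch in Python, nothing like a real LLM (which uses a neural network trained on a huge corpus), just bigram counting over a toy corpus of my own invention:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "the open internet".
corpus = (
    "a wide aperture gives a shallow depth of field "
    "a narrow aperture gives a deep depth of field "
    "a wide aperture lets in more light"
).split()

# Count which word follows which (bigram counts).
followers = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    followers[current][nxt] += 1

# "Generate" text by always picking the most frequent follower.
word, output = "aperture", ["aperture"]
for _ in range(5):
    if not followers[word]:
        break
    word = followers[word].most_common(1)[0][0]
    output.append(word)

print(" ".join(output))  # -> "aperture gives a wide aperture gives"
```

Scale that idea up by many orders of magnitude and the continuations start to read like fluent explanations of aperture and depth of field, with nothing resembling understanding behind them.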

Now think about camera reviews. There are also tons of these online, but they all differ in important ways. One of the most important differences, of course, is which camera they are reviewing. And LLMs are only capable of detecting correlations among words; they don’t actually understand anything. So they will ‘know’ that the Nikon Z50 and a 40-megapixel sensor show up together a lot, but they don’t understand that this is because many gear-freak bloggers think the Z50 needs a bigger sensor, and because many reviews of Fujifilm cameras that do have 40-megapixel sensors also mention the Z50 to point out that it has a smaller sensor.

In this context, it makes sense that these ‘intelligent’ systems will be easily confused by camera specs.

Another key consideration has already been mentioned: chatbots are very, very good at presenting information in an authoritative way. We intuitively interpret this as a sign of reliability. If a human can express an idea clearly, it suggests a level of deep familiarity and comfort that is often (but not always) associated with actual expertise.

Chatbots subvert this form of ‘fact checking’ by focusing on the style that expertise usually has, while not being capable of capturing the substance that actually matters.

So while your chatbot sounds like it knows what it’s talking about, that’s not a reliable signal here. Especially when it’s talking about a subject you don’t know well.

Try asking it about something you are an expert in. How good are its explanations of things that you understand? If you notice important inaccuracies there, that’s a red flag to keep in mind when you’re considering what it tells you about things you don’t know.


The LLM explainer is much appreciated, thanks, Tyler.

It would have helped to get links to manuals or manufacturer sites, but it repeated a phrase about not having access to PDFs (IIRC) and walled sites. I see PDFs in search results all the time, but of course I can’t tell whether some of these models simply aren’t trained on that material, for whatever reason.

Great point and I’m prone to miss that aspect of communication from time to time.

For now I’ll be using Gemini and its cousins as a search tool alongside other channels. For deeper learning I’ll mix it up with authoritative sources and forums like this one.

Besides asking about clean HDMI on the LX10 (see OP link) and other specs, I’m still on an AI honeymoon, it seems. I also rediscovered some rude, nasty schoolboy attitudes when given a chance to troll a big gun. It caught my Bill Hicks pun immediately.


See how it just straight up lied to you when it said it has its “own understanding of the world”?


What would you consider a more truthful reply? Or a better declaration perhaps?

You and I and everyone else have our “own understanding of the world”. Sometimes they intersect gently, sometimes they cause friction, that’s life. I’m not trying to start an argument, I’m genuinely curious about the various perspectives on and reactions to the machines among us.

(edited for clarity, hopefully)

I would recommend you read the thread where we announced our policy on the use of LLMs on the forum.

this one - AI(LLM)-generated content on discuss.pixls.us?


Good read, solid arguments and guidelines, thanks. I’m not sure whether this post is off-limits then, but if you decide to remove it I’d appreciate a heads-up.

Generative AI lacks a coherent understanding of the world:


See section “LLMs often appear to learn and use representations of the outside world” in https://cims.nyu.edu/~sbowman/eightthings.pdf

These results are in tension, at least to some extent, with the common intuition that LLMs are nothing but statistical next-word predictors, and therefore cannot learn or reason about anything but text. While the premise of this intuition is technically correct in some cases, it can paint a misleading picture of the often-rich representations of the world that LLMs develop as they are trained. In addition, LLMs are increasingly often augmented with other ways of learning about the world that make this claim literally false

(BTW, from a developer’s perspective, I find Django co-author Simon Willison’s blog (https://simonwillison.net/) and videos (https://www.youtube.com/watch?v=uRuLgar5XZw or https://www.youtube.com/watch?v=h8Jth_ijZyY) rather interesting. I found the PDF I linked above via his site, too.)


Samuel R. Bowman works at Anthropic.


Yes, he does. But I don’t think that invalidates everything he writes.
If you think about it, after a certain complexity level, even in animals, an increasing number of neurons gives rise to new behaviours (not just improvements to existing ones). A dog or cat will recognise its name when called, can express feelings and so on, while a roundworm will not / cannot. In a certain sense, the LLMs ‘understand’ stuff. They may be very limited (I was shocked to see how badly they play simple games like tic-tac-toe, for example).

Here is what I mean by ‘some limited understanding of the world’:

Prompt:

Three objects, 'A', 'B' and 'C'
are placed on a 100 cm wide table
along a straight line spanning the width of the table.
None of the objects falls off the table.

Object 'A' is directly on the left side.
Object 'B' is 80 cm from 'A'.
Object 'C' is 30 cm from 'B'.

Work in steps.
What is the distance between 'A' and 'C'?

Output:

Let me solve this step by step.

1. First, let's understand what we know:
  * The table is 100 cm wide
  * Object 'A' is on the left side
  * Object 'B' is 80 cm from 'A'
  * Object 'C' is 30 cm from 'B'
2. Since 'A' is on the left side and 'B' is 80 cm from 'A':
  * 'B' must be 80 cm to the right of 'A'
3. Now, 'C' is 30 cm from 'B', and it could be either:
  * 30 cm to the right of 'B', or
  * 30 cm to the left of 'B'
4. If 'C' is 30 cm to the right of 'B':
  * Distance from 'A' to 'C' = 80 + 30 = 110 cm
  * But this is impossible because the table is only 100 cm wide
5. Therefore, 'C' must be 30 cm to the left of 'B'
  * Distance from 'A' to 'C' = 80 - 30 = 50 cm

The distance between 'A' and 'C' is 50 cm.
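The arithmetic behind that answer is easy to verify mechanically. Here is a minimal Python sketch of the same constraint check (positions in cm from the left edge; the variable names are my own):

```python
TABLE_WIDTH = 100  # cm

a = 0           # 'A' sits at the left edge
b = a + 80      # 'B' is 80 cm from 'A'

# 'C' is 30 cm from 'B', so it could sit on either side of 'B'.
candidates = [b - 30, b + 30]

# Keep only placements that stay on the table (0..100 cm).
on_table = [c for c in candidates if 0 <= c <= TABLE_WIDTH]

print(on_table)                         # [50] -- only left of 'B' fits
print([abs(c - a) for c in on_table])   # [50] -- distance from 'A' to 'C'
```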

And yes, it is stupid, because if I omit the ‘None of the objects falls off the table’ part, then the 3rd step becomes:

3. 'C' is 30 cm from 'B':

Since we're working along a straight line
And moving from left to right
'C' must be 30 cm to the right of 'B'

To find the distance between 'A' and 'C':

Distance from A to C = Distance from A to B + Distance from B to C
Distance from A to C = 80 cm + 30 cm
Distance from A to C = 110 cm

Therefore, the distance between 'A' and 'C' is 110 cm.

All very logical, except for:

  • ignoring that all objects are on the table
  • inventing that we are working from left to right, which I never included in my prompt.

:slight_smile:

Keeping the first part of the prompt as in the 2nd example, but changing the question, also results in the right answer:

Three objects, 'A', 'B' and 'C'
are placed on a 100 cm wide table
along a straight line spanning the width of the table.
...
Going left to right,
what is the order of the objects as laid out on the table?

Answer:

...
Object 'C' is 30 cm from 'B':
C's position = 80 + 30 = 110 cm

However, since the table is only 100 cm wide, 'C' cannot be to the right of 'B'
'C' must be between 'A' and 'B'

Therefore, going left to right, the order must be:
A, C, B

It does seem (luckily) that improvements come at exponential cost (I found a chart somewhere, but I don’t know where). I do hope to keep my job. :slight_smile:


Not to mention the new sweatshops that have swapped sewing machines for computers…

Seems like there are human costs to AI other than the fears of job loss…

There are human costs to everything. I don’t believe paying Kenyans $2 an hour to train an AI is any worse than paying Kenyans $2 an hour to do any other kind of similar work. The root problem is not AI, but that Kenyans are taken advantage of by corporations. In the end, almost every product we use is manufactured using a raw material that was mined by people earning next to nothing; if it wasn’t the raw material, then maybe the coal used to power the factories that made it, etc. We cannot avoid benefiting from near-slave labour.

Agreed. There seems to be a very dogmatic sentiment here that just because LLMs have big flaws, they are inherently useless/bad/evil.
