Outsourcing thinking

An interesting article that echoes my thoughts and fears.

By ‘outsourcing’ what we think is a menial, uninteresting task, we give up some understanding. I won’t even mention things like writing personal communications such as birthday greetings, organising a vacation, or writing a blog post or forum entry. (I do use LLMs for forum posts, but not to assist with writing: I use them to confirm that what I have written describes what’s implemented in darktable’s source code, for example.)

I actually feel that having a GPS map on my phone in the past 10 years has made it harder to learn what’s where. I moved to Zürich 8 years ago, and Google Maps was a great help - however, I still cannot get around in the city without it, except for a few locations I routinely go to.

9 Likes

I am always surprised when people frame this as a philosophical/ethical debate about using this shiny, perfect tool to do our thinking vs. doing it ourselves. Maybe that will be a relevant discussion one day, but for now, my daily interactions with LLMs look like a blooper reel.

I am aware that in demo situations, LLMs can do a great job, and the more computational power you allocate, the better the output. But LLMs are cropping up in situations where the objective is to save 5 minutes of work from a human at the smallest cost possible.

A recent example: the other day I was trying to request a gluten-free meal for a flight. The website did not work for some reason (the option was greyed out), so I wrote an e-mail to customer service. I got back an LLM-generated response (which, to their credit, was fully disclosed in the first sentence) about how I can use the website to request a special meal: “If you have questions, call our agents at … .” So I spent 35 minutes on hold until I resolved the issue, talking to two agents. The company basically rendered their cheap, async communication channel (e-mail) useless by using LLMs.

I think that, advances in top-of-the-line LLMs notwithstanding, in most situations we will be encountering the budget version. For example, the essay mentions dating apps. I guess that with careful prompt engineering and multiple iterations, you could generate convincing romantic responses in such a situation, but most people will go with minimum effort, because that’s the point: it’s a labor-saving device, so why go overboard?

Similarly, I think that an experienced user can use LLMs as a sophisticated search engine to get a quick intro into a topic, which would be useful in an educational setting. But very few students will invest that much effort when 5% of it gets you mediocre slop that will still be better than the median, so you pass.

In coding, LLMs can also be a great tool. I have seen people quickly iterate a prototype, understand it, improve on it, and then use it as a basis or write their own version based on what they learned. But again, that requires effort, understanding, culling inferior output, and so on, as opposed to finishing the task in 10% of the time and moving on to the next one, technical debt be damned.

5 Likes

A bit too wordy for my taste, but the arguments brought forth were sound and well presented.

I have recently used LLMs to see how they can aid in research tasks. So far, I selected only inconsequential product searches, such as “is this lens worth it, or should I consider other lenses?”, or “is there a pair of headphones for a particular set of requirements?”.

I found that LLMs were very good at introducing the topic at hand. They give great Wikipedia-style overviews that list the most important keywords for further research.

However, this is only step one of my typical research: Step two is to fan out, and gather a wide range of possible approaches to the problem at hand. This is where the actual research is happening, and where you learn the most! Step three is narrowing it down to the most appropriate path.

Very consistently, I found LLMs would skip the second step and jump right to the conclusion. Thus they short-circuited my research and denied me the opportunity to learn.

I found this incredibly annoying. In the end, it was usually I who came up with the most relevant arguments, whereas the LLM just tried to reinforce what had already been said.

It seems to me that this behavior is intentional. LLMs do not provide arguments for or against, they provide answers. As such, they are no “bicycle for the mind”; they are not even “training wheels”; they are taxis. They take you to the destination directly, and skip all the thinking along the way—for a small fee. Is it too cynical of me to conclude that these things are built for monetization instead of empowerment, considering who built them?

5 Likes

Companies are currently pushing developers to use AI tools (in our case, we are required to take the ‘course’ (marketing material from Microsoft), and then strongly advised or required to install Copilot and use it). In the short run, we gain by quickly generating the configuration for some quite complex / badly designed beast, avoiding the headache of understanding its quirks. In the long run, we learn nothing. Some argue that’s OK: if there’s a problem, I can prompt the LLM, tell it what the problem is, paste a log snippet, and it will fix the config, even write a test that checks that the new one is now OK.
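
To make the “write a test that checks the config” step concrete, here is a minimal sketch of such a generated sanity check. Everything in it is hypothetical (the required keys and the JSON format are made up for illustration, not taken from any real tool):

```python
import json

# Hypothetical required keys for an imaginary service config;
# a real tool would have its own schema.
REQUIRED_KEYS = {"host", "port", "timeout"}

def config_is_ok(text):
    """Parse a JSON config and verify it has the required keys
    and a sane port value. This is the kind of shallow check an
    LLM typically produces: it confirms structure, not behavior."""
    try:
        cfg = json.loads(text)
    except json.JSONDecodeError:
        return False
    return REQUIRED_KEYS <= cfg.keys() and isinstance(cfg["port"], int)

print(config_is_ok('{"host": "localhost", "port": 8080, "timeout": 30}'))  # True
print(config_is_ok('{"host": "localhost"}'))  # False: missing keys
```

Note what such a test does not do: it cannot tell you whether the config is semantically right for your deployment, which is exactly the understanding that gets skipped.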

I don’t know if you follow what kinds of plans people have for development. It seems the ‘dark factory’ (no humans involved) is coming in the next few months/years.
The Five Levels: from Spicy Autocomplete to the Dark Factory (and the article that’s linked at the beginning of the post)
GitHub - steveyegge/gastown: Gas Town - multi-agent workspace manager

And what they produce:

Not really. My impression is that people have all kinds of fanciful plans for LLMs, but what I am waiting for is the opportunity to weigh the cost of technical debt, which will take a few years.

Conceptually, I don’t see anything new here. “Quick nostrum” vs. “careful understanding” has been a key trade-off in programming at scale from its very beginning. The historical experience is that without input from an experienced architect, who insists on clean practices, refactoring, etc., most projects collapse under their own weight, at which point it is cheaper to start over (if a company does not do it, a competitor will).

Management of course considers good practices a waste of money and time. The current promise is that LLMs will handle their own mess, like a self-cleaning oven, so it appeals to management immensely. That promise is marketed by people who are selling LLM services, so I am a bit skeptical.

I keep an open mind, but at the moment there is no solid evidence pointing to a drastic long-run productivity gain.

2 Likes

Sure. But:

  • LLMs haven’t been around for long, so of course there is no long-term experience
  • if it’s cheap enough to rebuild the solution from scratch (using an LLM), then tech debt is not an issue (for management)
  • if I lose my job now (with 3 kids dependent on my income), the possibility that in 5 years from now we’ll go back to more human involvement in development won’t help me. :frowning:
3 Likes

There is a truism in the philosophy of science that all hypotheses are under-determined by their data, i.e. the data are insufficient to show that a hypothesis is correct.

Willard Van Orman Quine extended this, pointing out that one can raise innumerable hypotheses to explain a particular phenomenon. It would seem, from your post, that LLMs (or their developers) haven’t heard of Quine’s thesis.

1 Like

If there were statistically significant, provable productivity gains to be had through LLMs, the AI companies would spread that gospel far and wide. And yet, there is a deafening silence on that front. What studies we do see show mild to severe productivity losses, even in best-case scenarios such as well-defined programming tasks.

4 Likes

It’s a bit unfortunate that these tools, which can be pretty useful as long as their output is adequately validated, try their best to seem so convincing and professional that you (or your manager) don’t feel the need to validate their output anymore. If the hypothesis that using LLMs also makes us significantly worse at validating their output over time turns out to be true, we will be in some trouble.

It’s perhaps also worth adding that in certain domains (e.g. mathematics or programming) that permit rigorous formal specifications, there is currently work being done towards ensuring that LLM output satisfies a user-provided formal specification. This already works reasonably well for mathematics, but it remains to be seen whether it will become useful for programming too (after all, anyone who works in formal methods can attest that writing accurate formal specifications for real-world software is extremely difficult, and maintaining them as user requirements change even more so). In domains where we can’t write formal specifications at all, we are out of luck regardless.
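
As a toy illustration of what “checking output against a specification” can mean at the lightweight end (well short of a proof assistant), here is a sketch in Python. The function names are hypothetical; `llm_generated_sort` stands in for code an LLM produced, and the spec is checked by randomized testing rather than formal verification:

```python
import random

def llm_generated_sort(xs):
    # Stand-in for LLM-produced code; in practice this body would
    # be generated, not hand-written.
    return sorted(xs)

def satisfies_spec(fn, trials=1000):
    """Check fn against an executable spec for sorting:
    the result must be ordered and a permutation of the input."""
    for _ in range(trials):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        ys = fn(xs)
        if sorted(ys) != ys:           # output must be ordered
            return False
        if sorted(xs) != sorted(ys):   # output must be a permutation of input
            return False
    return True

print(satisfies_spec(llm_generated_sort))  # True
```

Randomized checks like this catch gross errors cheaply; the formal-methods work mentioned above aims at actual proofs, which is where the real difficulty (and the real guarantee) lies.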

The main negative trend that has already become very evident to me is that LLMs often escalate relatively minor mental health problems into major mental health crises and psychoses. The number of unhinged messages I get at work from people who are clearly in a state of psychosis has increased significantly since the release of ChatGPT (and most of these messages are now also partially LLM-generated).

I am not too hopeful that this trend is self-correcting for the exact reason that this isn’t a new phenomenon - many tech companies have already been operating like this for decades, maintaining their market position through other means. Tools become worse, technology becomes less safe, very few things last (including the employees that work on them). I really wish to live in a world where this changes for the better, but I see no reason to be optimistic.

6 Likes

This.

The shortest, cheapest and quickest path to increasing immediate profit will always be favored, if not mandated. That’s why I believe AI will be misused and must (financially) fail before there’s a chance to see and use its best potential.

2 Likes

100%. My son navigates by GPS giving him turn-by-turn directions. If he says he went to so-and-so, and I ask whether that’s near such-and-such, he has absolutely no idea. It’s just: go this way, turn right now, go so far and turn left here. There is no concept of where you are or what other things exist along the way. IMO, they are lost except for the machine telling them what to do.

Bingo!

I find LLMs to be pretty good for coding, at least for more simple / consistent things.

But this is really the issue. I’m using LLMs enough to know what’s going on in the world around me, doing mostly redundant things I’ve done a million times before. But sometimes I’ll try on some more complex or bigger tasks.

Recently I had a project where there was some new process that needed to be implemented in one of our apps. It needed to work like another process I had made a workflow for a few years ago, long enough that I couldn’t really tell you the nuts and bolts of how it worked. I used Claude to basically just say “Take X feature and make a new feature with a few changes”. I got it done really quickly, but I told my boss afterwards that I couldn’t really tell you what I did.

I’ve never felt so disconnected from the code I write as when I use an LLM. I guess it should instead be considered the code I’m responsible for. At least for me, manually coding through the mundane stuff is when my brain has a chance to marinate and think about the more complex stuff. Using LLMs for the mundane stuff means I skip that warm up period. Luckily there is no forced use for me. I’m free to explore how it’s useful for me and my team.

4 Likes

I see Musk is merging SpaceX and X/AI (not sure that the random capitalisation is correct).

I don’t know whether he is going to rename the company, if he is then I reckon “Skynet” would be a good name.

1 Like
2 Likes

I wonder if there are threads where the bots complain about users :rofl:

1 Like

I wonder how many humans direct their agent to say certain things there. It is the inverse of the chatbot problem in other “social” settings.

Edit: I have heard about molt, but I should have read the wiki. Everything I thought it would be vulnerable to seems to have materialized lol

The parallel universe :laughing:
I wonder what the inverse of “AI slop” would be? :thinking:

Polsia? :smiley:

4 Likes

Apparently, yes: