You’re not actually disagreeing with me; you’re just restating that the process is fallible. No argument there. All reasoners are fallible, including humans. The difference is, LLMs are consistently fallible, in ways that can be measured, improved, and debugged (unlike humans, who are wildly inconsistent, emotionally reactive, and prone to motivated reasoning).
Also, the fact that LLMs are “trained on tools like logic and discourse” isn’t a weakness. That’s how any system, including humans, learns to reason. We don’t emerge from the womb with innate logic; we absorb it from language, culture, and experience. You’re applying a double standard: fallibility invalidates the LLM, but not the human brain? Come on.
And your appeal to “fuck around and find out” isn’t a disqualifier; it’s an opportunity. LLMs already assist in experiment design, hypothesis testing, and even simulating edge cases. They don’t run the scientific method independently (yet), but they absolutely enhance it.
So again: no one’s saying LLMs are perfect. The claim is they’re useful in evaluating truth claims, often more so than unaided human intuition. The fact that you’ve encountered hallucinations doesn’t negate that; it just proves the tool has limits, like every tool. The difference is, this one keeps getting better.
Edit: I’m not describing a “reasoning model” layered on top of an LLM. I’m describing what a large language model is and does at its core. Reasoning emerges from the statistical training on language patterns. It’s not a separate tool the model uses, and it’s not “trained on logic and discourse” as external modules. Logic and discourse are simply part of the training data, meaning they’re embedded into the weights through gradient descent, not bolted on as tools.