> The artificially generated data includes hallucinated links.

Not commenting on OpenAI's training data, but commenting generally: Models don't hallucinate because they've been trained on hallucinated data. They hallucinate because they've been trained on real data that they can't remember perfectly, so they guess. I hypothesize that URLs are very commonly hallucinated because they have a common, easy-to-remember format (so the model confidently starts to write them out) but hard-to-remember details (at which point the model just guesses, because a complete made-up URL is a more likely text continuation than one that cuts off abruptly after the http://www.).
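
You can see this confidence pattern directly by inspecting per-token probabilities while a model writes out a URL. A minimal sketch below, using GPT-2 via Hugging Face transformers (the prompt and generation length are arbitrary choices for illustration): the expectation is that format tokens like `://` and `www` come out high-probability while the domain and path tokens come out much lower, though exactly which tokens are confident will vary by model and prompt.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# A prompt that invites the model to emit a URL.
prompt = "The full paper is available at http"
ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    # Greedily extend the prompt, printing the probability the model
    # assigned to each token it chose. Format tokens should be
    # high-confidence; the site-specific details should not be.
    for _ in range(15):
        logits = model(ids).logits[0, -1]
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.argmax(probs)
        print(f"{tokenizer.decode(next_id)!r:>12}  p={probs[next_id]:.2f}")
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
```

Nothing in the sampling loop ever prefers "stop here, mid-URL" over "fill in a plausible domain," which is the guessing behavior described above.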