Actually, sport, it is a narrative.
Quote:
Astronomers on Wednesday confirmed the discovery of an interstellar object racing through our Solar System – only the third ever spotted, though scientists suspect many more may slip past unnoticed.
The visitor from the stars, designated 3I/Atlas by the International Astronomical Union's Minor Planet Center, is likely the largest yet detected. It has been classified as a comet.
"The fact that we see some fuzziness suggests that it is mostly ice rather than mostly rock," Jonathan McDowell, an astronomer at the Harvard-Smithsonian Center for Astrophysics, told AFP.
Originally known as A11pl3Z before it was confirmed to be of interstellar origin, the object poses no threat to Earth, said Richard Moissl, head of planetary defense at the European Space Agency.
"It will fly deep through the Solar System, passing just inside the orbit of Mars," but will not hit our neighbouring planet, he said.
[Continues . . .]
Quote:
On July 1, the NASA-funded ATLAS (Asteroid Terrestrial-impact Last Alert System) survey telescope in Rio Hurtado, Chile, first reported observations of a comet that originated from interstellar space. Arriving from the direction of the constellation Sagittarius, the interstellar comet has been officially named 3I/ATLAS. It is currently located about 420 million miles (670 million kilometers) away.
Since that first report, observations from before the discovery have been gathered from the archives of three different ATLAS telescopes around the world and the Zwicky Transient Facility at the Palomar Observatory in San Diego County, California. These "pre-discovery" observations extend back to June 14. Numerous telescopes have reported additional observations since the object was first reported.
The comet poses no threat to Earth and will remain at a distance of at least 1.6 astronomical units (about 150 million miles or 240 million km). It is currently about 4.5 au (about 416 million miles or 670 million km) from the Sun. 3I/ATLAS will reach its closest approach to the Sun around Oct. 30, at a distance of 1.4 au (about 130 million miles or 210 million km) — just inside the orbit of Mars.
The interstellar comet's size and physical properties are being investigated by astronomers around the world. 3I/ATLAS should remain visible to ground-based telescopes through September, after which it will pass too close to the Sun to observe. It is expected to reappear on the other side of the Sun by early December, allowing for renewed observations.
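The distances in the NASA excerpt are easy to sanity-check. Here's a minimal Python sketch that redoes the au-to-miles/km conversions (the constants are the IAU definition of the astronomical unit and the statute mile; the labels are mine, not NASA's):

```python
# Sanity check of the distance conversions quoted above.
# 1 au is defined as exactly 149,597,870.7 km (IAU 2012).

AU_KM = 149_597_870.7   # kilometers per astronomical unit
KM_PER_MILE = 1.609344  # kilometers per statute mile

def au_to_millions(au: float) -> tuple[float, float]:
    """Convert a distance in au to (millions of miles, millions of km)."""
    km = au * AU_KM
    return km / KM_PER_MILE / 1e6, km / 1e6

for label, au in [("minimum Earth distance", 1.6),
                  ("current solar distance", 4.5),
                  ("perihelion (Oct. 30)", 1.4)]:
    miles, km = au_to_millions(au)
    print(f"{label}: {au} au ~ {miles:.0f} million miles ~ {km:.0f} million km")
```

Running it gives roughly 149/239, 418/673, and 130/209 million miles/km for 1.6, 4.5, and 1.4 au respectively, matching the article's rounded figures.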
Quote:
Researchers from MIT, Harvard, and the University of Chicago have proposed the term "potemkin understanding" to describe a newly identified failure mode in which large language models ace conceptual benchmarks but lack the true grasp needed to apply those concepts in practice.
It comes from accounts of fake villages – Potemkin villages – constructed at the behest of Russian military leader Grigory Potemkin to impress Empress Catherine II.
The academics are differentiating "potemkins" from "hallucination," which is used to describe AI model errors or mispredictions. In fact, there's more to AI incompetence than factual mistakes; AI models lack the ability to understand concepts the way people do, a tendency suggested by the widely used disparaging epithet for large language models, "stochastic parrots."
[. . .]
Here's one example of "potemkin understanding" cited in the paper. Asked to explain the ABAB rhyming scheme, OpenAI's GPT-4o did so accurately, responding, "An ABAB scheme alternates rhymes: first and third lines rhyme, second and fourth rhyme."
Yet when asked to fill in a blank word in a four-line poem using the ABAB rhyming scheme, the model responded with a word that didn't rhyme appropriately. In other words, the model correctly predicted the tokens to explain the ABAB rhyme scheme without the understanding it would have needed to apply it.
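It's straightforward to turn that anecdote into a repeatable probe. Below is a minimal Python sketch of the explain-then-apply pattern the article describes; `ask` is a hypothetical stand-in for whatever LLM client you use (its canned replies mimic the reported GPT-4o behavior), and the three-letter-suffix rhyme check is a deliberately crude placeholder, not the paper's grading method:

```python
# Minimal explain-then-apply probe in the spirit of the ABAB example.
# `ask` is a hypothetical stand-in for a real LLM client; its canned
# replies mimic the GPT-4o behavior reported in the article.

def ask(prompt: str) -> str:
    if prompt.startswith("Explain"):
        return ("An ABAB scheme alternates rhymes: first and third lines "
                "rhyme, second and fourth rhyme.")
    # A "poem" whose second and fourth lines fail to rhyme (B lines broken).
    return "The sun went down\nA quiet breeze\nAll through the town\nBeneath the sky"

def rhymes(a: str, b: str) -> bool:
    """Deliberately crude rhyme check: compare trailing three letters."""
    return a.lower().strip(".,!?")[-3:] == b.lower().strip(".,!?")[-3:]

def probe_abab() -> dict:
    explanation = ask("Explain the ABAB rhyming scheme in one sentence.")
    poem = ask("Write a four-line poem with an ABAB rhyme scheme.")
    last_words = [line.split()[-1] for line in poem.splitlines() if line.strip()]
    applied_ok = (len(last_words) == 4
                  and rhymes(last_words[0], last_words[2])
                  and rhymes(last_words[1], last_words[3]))
    return {"explanation": explanation, "applied_ok": applied_ok}

print(probe_abab())  # explanation reads fine, applied_ok is False
```

A "potemkin" shows up as a run where the explanation reads correctly but `applied_ok` comes back False.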
[Continues . . .]
Quote:
Abstract:
Large language models (LLMs) are regularly evaluated using benchmark datasets. But what justifies making inferences about an LLM's capabilities based on its answers to a curated set of questions? This paper first introduces a formal framework to address this question.
The key is to note that the benchmarks used to test LLMs -- such as AP exams -- are also those used to test people. However, this raises an implication: these benchmarks are only valid tests if LLMs misunderstand concepts in ways that mirror human misunderstandings. Otherwise, success on benchmarks only demonstrates potemkin understanding: the illusion of understanding driven by answers irreconcilable with how any human would interpret a concept.
We present two procedures for quantifying the existence of potemkins: one using a specially designed benchmark in three domains, the other using a general procedure that provides a lower-bound on their prevalence. We find that potemkins are ubiquitous across models, tasks, and domains. We also find that these failures reflect not just incorrect understanding, but deeper internal incoherence in concept representations.
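The second procedure in the abstract boils down to simple bookkeeping: among concepts whose definition a model states correctly, count how often it then fails to use them. Here's a rough Python sketch under that reading (the `Trial` record and its fields are my assumptions, not the paper's actual schema or benchmark):

```python
# Bookkeeping for the lower-bound procedure, under my reading of the
# abstract: a "potemkin" is a correct definition followed by a failed use.
# The Trial record and its fields are assumptions, not the paper's schema.

from dataclasses import dataclass

@dataclass
class Trial:
    concept: str
    defined_correctly: bool  # model stated the definition correctly
    used_correctly: bool     # model then applied the concept correctly

def potemkin_rate(trials: list[Trial]) -> float:
    """Share of failed applications among correctly defined concepts."""
    kept = [t for t in trials if t.defined_correctly]
    if not kept:
        return 0.0
    return sum(not t.used_correctly for t in kept) / len(kept)

trials = [
    Trial("ABAB rhyme scheme", True, False),    # the GPT-4o example above
    Trial("haiku syllable count", True, True),  # hypothetical passing case
]
print(f"potemkin rate: {potemkin_rate(trials):.2f}")  # -> 0.50
```

Counting only definitions the model gets right is what makes this a lower bound: any incoherence hidden behind wrong definitions never enters the tally.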