AGI Sparklings proponents rejoice! Finding a literal map(*) means LLMs have a world model.

zogwarg@awful.systems · 2 years ago

self@awful.systems · 2 years ago

between this and that one fucking chart that tried to say humans emit more CO2 producing text and art than generative AI (using the same underhanded tactics as cryptobros trying to make it look like banks are worse for the environment than blockchains), I’m really starting to feel like the AI industry is in the “deliberately fill your scam email with as many typos as possible to weed out anyone too intelligent” stage of its growth

swlabr@awful.systems · 2 years ago

Anytime they say it’s not a stochastic parrot, what they really mean is that it’s three stochastic parrots in a trenchcoat.

zurohki@aussie.zone · 2 years ago

I think I’m going to start dropping the phrase “As a large language model” into my emails.

froztbyte@awful.systems · 2 years ago

zogwarg@awful.systems · edit-2 2 years ago

Not even that! It looks like a blurry jpeg of those sources if you squint a little!

Also I’ve sort of realized that the visualization is misleading in three ways:

1. They provide an animation from shallow to deep layers to show the dots coming together, making the final result more impressive than it is (look at how many dots are in the ocean)
2. You see blobby clouds over sub-continents, with nothing to gauge error within the cloud blobs.
3. Sorta-relevant but obviously the borders as helpfully drawn for the viewer to conform to “Our” world knowledge aren’t even there at all, it’s still holding up a mirror (dare I say a parrot?) to our cognition.

froztbyte@awful.systems · 2 years ago

haha I know (re precision) but I made that as a shitpost not an academic paper. besides, it’s about as accurate as the promptfans are

that animation… is, yeah. I’m reminded of watching someone eyeball stats on their model as they were tweaking parameters, trying to tamp down overfitting.

it’s also just such shitty science. “can we find some way to represent this data to conform to $x hypothesis?”, albeit that of course isn’t surprising from the P-Hacking As A Service crowd

blakestacey@awful.systems · 2 years ago

As an AI language model, I’d like to point everyone to Max Tegmark’s appearances in the old!sneerclub archives.

self@awful.systems · 2 years ago

as a large language model, I am incapable of feeling surprise that Tegmark is associated with neo-nazis

(also I really need to de-jank the stylesheet for the archive and get the rest of the data in it soon)

froztbyte@awful.systems · 2 years ago

some of these replies (those are diff links) are staggeringly awful

and this one is a piece of art:

chatGPT just learns, it doesn’t reason, it doesn’t use imagination, it doesn’t plan. It learns at a scale that’s so far beyond what any human can, that it can use “pure” learning to do complex behaviors.

zogwarg@awful.systems · 2 years ago

^^ Quietly progressing from “humans are not the only ones able to do true learning” to “machines are the only ones capable of true learning”.

Poetic.

PS: Eek at the *cough* extrapolation rules lawyering 😬.

swlabr@awful.systems · 2 years ago

Oof, they got so close on that last one, yet so far away. Truly a masterpiece in misunderstanding

carlitoscohones@awful.systems · 2 years ago

My first thought in watching the animation was - AI parrot shoots shotgun at side of barn, draws target around result.

Are the dots in the ocean from Amazon’s underwater warehouse structures?

swlabr@awful.systems · 2 years ago

The ocean dots are Lemurian and Atlantean civilisations, situated in the deep. The AI said they’re there, so they must be real! That’s how reality works now!

Witless Protection Program@mastodon.me.uk · 2 years ago

@swlabr @carlitoscohones just off to raise some VC millions for my AI-Driven Atlantis Recovery startup brb

200fifty@awful.systems · 2 years ago

I had the same thought as Emily Bender’s first one there, lol. The map is interesting to me, but mostly as a demonstration of how anglosphere-centric these models are!

Kichae@kbin.social · edit-2 2 years ago

So, what’s going on here, in plainer language, anyway? Are they just including location information in training data and then, totally surprisingly, finding it again in the output data? That’s kind of the sense I get from the post here, but I’m not sure if I’m misunderstanding.

Or did they just cluster the data and squint until someone said one of the graphs “kinda looks like it lines up with a Mercator projection”?

froztbyte@awful.systems · edit-2 2 years ago

I… so. damn you, I looked.

this says

For spatial representations, we run Llama-2 models on the names of tens of thousands cities, structures, and natural landmarks around the world, the USA, and NYC. We then train linear probes on the last token activations to predict the real latitude and longitudes of each place
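for anyone who hasn’t met the term: a “linear probe” is just a linear regression fitted from a model’s internal activations to some target. below is a minimal sketch of the idea in NumPy. to be clear about the assumptions: the “activations” here are synthetic (random coordinates pushed through a made-up linear embedding plus noise), nothing is taken from Llama-2 or from the paper’s code, and the names `fit_linear_probe` and `predict` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_places, hidden = 500, 64

# ASSUMPTION: made-up "places" with random latitude/longitude, not real cities.
coords = np.column_stack([
    rng.uniform(-90, 90, n_places),    # latitude
    rng.uniform(-180, 180, n_places),  # longitude
])

# ASSUMPTION: synthetic stand-in for last-token activations. The coordinates
# are deliberately baked in through a random linear map, plus Gaussian noise.
embed = rng.normal(size=(2, hidden))
activations = coords @ embed + rng.normal(scale=5.0, size=(n_places, hidden))

train, test = slice(0, 400), slice(400, None)


def fit_linear_probe(x, y, ridge=1.0):
    """Ridge regression with a bias column, solved in closed form."""
    x1 = np.column_stack([x, np.ones(len(x))])
    penalty = ridge * np.eye(x1.shape[1])
    penalty[-1, -1] = 0.0  # leave the bias term unpenalised
    return np.linalg.solve(x1.T @ x1 + penalty, x1.T @ y)


def predict(weights, x):
    """Apply a fitted probe to new activations."""
    return np.column_stack([x, np.ones(len(x))]) @ weights


weights = fit_linear_probe(activations[train], coords[train])
pred = predict(weights, activations[test])

# Held-out R^2, computed separately for latitude and longitude.
residual = ((coords[test] - pred) ** 2).sum(axis=0)
total = ((coords[test] - coords[test].mean(axis=0)) ** 2).sum(axis=0)
r2 = 1.0 - residual / total
print("held-out R^2 (lat, lon):", r2)
```

in this toy setup the probe scores well on held-out points because the coordinates were linearly encoded in the features by construction. a good probe score tells you the information is linearly recoverable from the activations; what that implies about the model is a separate argument.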

their code does… a lot of things with that input data. including filling some in and conveniently removing “small” towns and some states and eliminating duplicates[1] and other shit

a very quick glance at some of the input data:

% xsv sample 5 uscities.csv| xsv table
city            city_ascii      state_id  state_name  county_fips  county_name  lat      lng        population  density  source  military  incorporated  timezone         ranking  zips                     id
Northglenn      Northglenn      CO        Colorado    08001        Adams        39.9108  -104.9783  37899       2056.3   shape   FALSE     TRUE          America/Denver   2        80260 80233 80234 80603  1840020192
East Gull Lake  East Gull Lake  MN        Minnesota   27021        Cass         46.3948  -94.3548   961         44.4     shape   FALSE     TRUE          America/Chicago  3        56401                    1840007720
Idaho Springs   Idaho Springs   CO        Colorado    08019        Clear Creek  39.7444  -105.5006  2044        318.7    shape   FALSE     TRUE          America/Denver   3        80452                    1840018790
Santa Rosa      Santa Rosa      TX        Texas       48061        Cameron      26.2561  -97.8252   2873        1373.2   shape   FALSE     TRUE          America/Chicago  3        78593                    1840023167
Mystic          Mystic          IA        Iowa        19007        Appanoose    40.7792  -92.9446   337         41.6     shape   FALSE     TRUE          America/Chicago  3        52574                    1840008316

cool. so. we have high-precision data with actual coordinates and well-defined information. as the input. to the mash-things-together-into-a-proximates-slurry machine.

and then on prompting the slurry with questions about “hey where is Wyoming”, it can provide a rough answer.

amazing.

[1] - whoops forgot the footnote. how about that Washington in every state, huh? sure is a good thing the US doesn’t have lots of reused names!

Kichae@kbin.social · 2 years ago

Wow. I was kinda tongue-in-cheeking it there, because I genuinely thought I was misinterpreting/over-simplifying the OP, but they really are trying to sell “it didn’t discard this data we explicitly fed it” as some kind of big deal.

I was expecting this to be more like them discovering that regional dialects exist or something dumb-but-not-that-dumb.

froztbyte@awful.systems · 2 years ago

promptfans, making grandiose badfaith claims that turn out so not-even-wrong it entirely moves the goalposts on the argument? nevarrrr

froztbyte@awful.systems · 2 years ago

these fucking people

self@awful.systems · 2 years ago

we need a run of @[email protected]’s “it can’t be that stupid, you must be explaining it wrong” stickers but with the ChatGPT logo instead of the bitcoin one

also how can we talk shit about LLMs when computation was impossible until they were invented?

blakestacey@awful.systems · 2 years ago

15-ish years ago, I was doing a lot of principal component analysis and multi-dimensional scaling. A standard exercise in that area is to take distances between cities, like the lengths of airline flight paths, and reconstruct a map. If only I’d thought to claim that to be a world model!
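For anyone who hasn’t seen that exercise, here is a minimal sketch of classical multi-dimensional scaling in NumPy. The assumptions: the five “cities” are made-up points on a flat plane rather than real flight-path lengths (real great-circle distances would only embed approximately in 2D), and the function names `classical_mds` and `pairwise_distances` are illustrative.

```python
import numpy as np


def classical_mds(distances, dims=2):
    """Recover coordinates from a distance matrix, up to rotation,
    reflection, and translation."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # The top eigenvectors, scaled by sqrt(eigenvalue), give the coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))


def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# ASSUMPTION: hypothetical planar "cities", not real geography.
cities = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0], [1.0, 2.0]])
dist = pairwise_distances(cities)

recovered = classical_mds(dist)

# The recovered layout may be rotated or mirrored, so compare distances,
# not raw coordinates.
max_error = np.abs(pairwise_distances(recovered) - dist).max()
print("largest distance mismatch:", max_error)
```

The input is nothing but a table of pairwise distances, and the output is a layout whose distances match it. The reconstruction only fixes the shape, so orientation and mirroring still have to be chosen by whoever draws the map.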

blakestacey@awful.systems · 2 years ago

Whereas the electro-mechanical device that Turing built could perform just one code-cracking function well, today’s frontier AI models are approaching the “universal” computers he could only imagine, capable of vastly more functions.

Fucking Christ, that hurt to read.

swlabr@awful.systems · 2 years ago

gerikson@awful.systems · 2 years ago

I’m sorry, I can’t take anyone with a blue checkmark seriously.

swlabr@awful.systems · 2 years ago

Also yes
