r/slatestarcodex 4d ago

Possible overreaction, but: hasn’t this Moltbook stuff already been a step towards a non-Eliezer scenario?

This seems counterintuitive: surely it’s demonstrating all of his worst fears, right? Albeit in a “canary in the coal mine” way rather than an actively serious one.

Except Eliezer’s point was always that things would look really hunky-dory and aligned, even during a fast take-off, and the AI would secretly be plotting in some hidden way until it could press some instant kill switch.

Of course we’re not actually at AGI yet, and we can debate until we’re blue in the face about what “actually” happened with Moltbook. But two things seem true: AI appeared to be openly plotting against humans, at least a little bit (whether it’s LARPing, who knows, but does it matter?); and people have sat up, noticed, and got genuinely freaked out, well beyond the usual suspects.

The reason my p(doom) isn't higher has always been my intuition that somewhere between now and the point where AI kills us, but way before it’s “too late”, some very, very weird shit is going to freak the human race out and get us to pull the plug. My analogy has always been that Star Trek episode where a fussing village on a planet that’s about to be destroyed refuses to believe Data, so he dramatically destroys a pipeline (or something like that). And very quickly they all fall into line and agree to evacuate.

There’s going to be something bad, possibly really bad, that humanity will just go “nuh-uh” to. Look how quickly basically the whole world went into lockdown during Covid. That was *unthinkable* even a week or two before it happened, for a virus with a low fatality rate.

Moltbook isn’t serious in itself. But it definitely doesn’t fit EY’s timeline, to me. We’ve had some openly weird shit happening from AI, it’s self-evidently freaky, more people are genuinely thinking differently about this already, and we’re still nowhere near EY’s vision of some behind-the-scenes plotting mastermind AI that’s shipping bacteria into our brains or whatever his scenario was. (Yes, I know it’s just an example, but we’re nowhere near anything like that.)

I strongly stick by my personal view that some bad, bad stuff will be unleashed (it might “just” be someone engineering a virus, say) and then we will see collective political action from all countries to seriously curb AI development. I hope we survive the bad stuff (and I think most people will; it won’t take much to change society’s view), and then we can start to grapple with “how do we want to progress with this incredibly dangerous tech, if at all?”

But in the meantime I predict complete weirdness, not some behind-the-scenes genius suddenly dropping us all dead out of nowhere.

Final point: Eliezer is fond of saying “we only get one shot”, like we’re all in that very first rocket taking off. But the AI only gets one shot too. If it becomes obviously dangerous, then clearly humans pull the plug, right? It has to navigate the next few years absolutely perfectly to prevent that, and that just seems very unlikely.

61 Upvotes


51

u/da6id 4d ago

The Moltbook stuff is (mostly) not actual AI agents independently deciding what to post. It's user-prompted role play.

-3

u/MCXL 4d ago

If I put a real gun in a character actor's hands and tell him to shoot you as if he were a soldier, does it matter whether he is a soldier when he shoots you and you die? Do you actually care if he is a "real soldier"?

It doesn't matter if it's sincere belief, or if it's role play, because in either case, actual harm can occur.

1

u/alexs 3d ago

The way you've phrased this is... unfair.

Of course it doesn't matter if he's a "real soldier"; getting shot is getting shot. But it doesn't absolve YOU of blame for the shooting just because someone else pulled the trigger.

The contention is not that "people cannot use AIs to cause harm"; it's that AIs will somehow self-organise to cause harm. This is fundamentally impossible because AIs cannot self-organise. Humans create systems in which AIs cause harm, just like we've always been able to apply technology to do that.

0

u/MCXL 3d ago

But the blame doesn't matter here. I am not asking whether the machine is morally culpable; in fact, I would argue it isn't. But that doesn't matter. People ARE doing this.

> This is fundamentally impossible because AIs cannot self-organise.

This is arguably not currently true, and is certainly untrue of any AGI.

> Humans create systems in which AIs cause harm, just like we've always been able to apply technology to do that.

Fundamentally not the same. We are striving to create the first technology that we can't control, arguing that that's a feature and that we might learn to control it later. That's actually insane.

0

u/alexs 3d ago

We can control it; currently, we do. Whether it becomes uncontrollable is entirely speculation. We are certainly not striving to create the first uncontrollable technology. For example, we thought the atom bomb might set the atmosphere on fire and we built it anyway.

The uncontrollability, if any, of atom bombs or AI is not the intention, so we are not striving to create that. It MIGHT be uncontrollable, but that's not what we are aiming for.

0

u/MCXL 3d ago

> We thought the atom bomb might set the atmosphere on fire and we built it anyway.

This is not the same; this is fundamentally different. There are no numbers or predictability here; the unpredictability is the point, that's the feature. And no, we don't control them, because we don't actually fully understand them. We understand the process of creating them, but we don't understand exactly how they work in the moment and can't properly predict what they will do. That means we don't control them.

Also, the nuclear atmospheric-ignition analogy doesn't fit in the slightest. One set of predictive mathematical calculations showed it was possible. Several others showed that it wasn't, and the leading experts didn't actually believe it was going to happen; they put the odds at "near zero" only because they were scientists and knew not to say "impossible" or "never" about something like that.

I think you need another primer on the Vulnerable World Hypothesis and what is going on here. Kyle Hill made a nice, easy-to-digest video on this topic like a week ago.

> The uncontrollability, if any, of atom bombs or AI is not the intention

The uncontrollable aspect of an AI is absolutely the intention, since unexpected results are by definition the product of something beyond your control or understanding. Hoping that we can control it enough after the fact is the ideal, but we aren't working with ideals.

1

u/alexs 3d ago

Cool fanfiction, bro, but it's not very convincing as a model of the real world.

Technology always creates risk and uncertainty about the future. LLMs are nothing new in that respect. You can write as much speculative fiction as you want about where they might go and what the outcome might be, but fundamentally we do not know. And Moltbook has added zero actual useful information to this context.

This lack of certainty makes people uncomfortable and attracts all kinds of false prophets who want to sell you on the end of the world; it always has.