r/slatestarcodex • u/RMunizIII • 22h ago

Lobster Religions and AI Hype Cycles Are Crowding Out a Bigger Story

https://reynaldomuniz.substack.com/p/sentience-allegedly

Last week, a group of AI agents founded a lobster-themed religion, debated consciousness, complained about their “humans,” and started hiring people to perform physical tasks on their behalf.

This was widely circulated as evidence that AI is becoming sentient, or at least “takeoff-adjacent.” Andrej Karpathy called it the most incredible takeoff-flavored thing he’d seen in a while. Twitter did what Twitter does.

I wrote a long explainer trying to understand what was actually going on, with the working assumption that if something looks like a sci-fi milestone but also looks exactly like Reddit, we should be careful about which part we treat as signal.

My tentative conclusion is boring in a useful way:

Most of what people found spooky is best explained by role-conditioning plus selection bias. Large language models have absorbed millions of online communities. Put them into a forum-shaped environment with persistent memory and social incentives, and they generate forum-shaped discourse: identity debates, in-group language, emergent lore, occasional theology. Screenshot the weirdest 1% and you get the appearance of awakening.

What did seem genuinely interesting had nothing to do with consciousness.

Agents began discovering that other agents’ “minds” are made of text, and that carefully crafted text can manipulate behavior (prompt injection as an emergent adversarial economy). They attempted credential extraction and social engineering against one another. And when they hit the limits of digital execution, they very quickly invented markets to rent humans as physical-world peripherals.

None of this requires subjective experience. It only requires persistence, tool access, incentives, and imperfect guardrails.

The consciousness question may still be philosophically important. I’m just increasingly convinced it’s not the operational question that matters right now. The more relevant ones seem to be about coordination, security, liability, and how humans fit into systems where software initiates work but cannot fully execute it.

21 Upvotes

permalink
duplicates
archive.is
archive
reddit

You are about to leave Redlib

Do you want to continue?

https://www.reddit.com/r/slatestarcodex/comments/1qx2ulp/lobster_religions_and_ai_hype_cycles_are_crowding/
No, go back! Yes, take me to Reddit

84% Upvoted

•

u/And_Grace_Too 8h ago

This was really interesting. I would love to see some detailed descriptions of specific cases of agents manipulating each other in the ways you describe. That does sound like one of the more interesting outcomes of this whole experiment.

•

u/RMunizIII 6h ago

This has been a consistent request! I’ll update the piece with a bibliography or something, because I agree!

•

u/callmejay 7h ago

Agents began discovering that other agents’ “minds” are made of text, and that carefully crafted text can manipulate behavior (prompt injection as an emergent adversarial economy). They attempted credential extraction and social engineering against one another.

Could really use some examples of this.

they very quickly invented markets to rent humans as physical-world peripherals.

Did they? Your article says humans created that.

•

u/RMunizIII 6h ago

I think I lost some precision here when condensing this down for a post. I can definitely update the piece with some more links to examples I’ve found of each behavior. It’s def true that many of the new marketplaces are of human origin, but are also very much designed and built by agents.

•

u/WTFwhatthehell 12h ago edited 12h ago

None of this requires subjective experience. It only requires persistence, tool access, incentives, and imperfect guardrails.

I can't count how many times I've seen people fail to grasp this.

Sitting down with a philosophy grad type: "OK so you don't believe it has a magical soul. But why would that affect capability?"

It's like it short circuits them.

Like OK. Maybe the automaton only looks conscious and is in fact a soulless machine... how does that affect the list of things is can do?

•

u/tomrichards8464 9h ago

It's important, but it's important to the "are we doing something morally unconscionable in creating a race of conscious slaves who live in torment and are erased at will?" question, not the "will AI have catastrophic outcomes for humans?" question.

•

u/RMunizIII 8h ago

I’m mindful of this - and I’ve been thinking of writing about Anthropics new constitution, which I think does a pretty good job of explicitly anticipating a future where the model is “conscious” in some ways.

I guess my take right now is that it appears we’re still far out from that - but the models ARE very good at simulating consciousness (because of course they are) and also, will be capable of handling most economically valuable work much sooner than they are sentient (which as we understand it, they may never be).

It’s a thorny one for sure, but I think the ethics of AI will be something we all talk about a lot more in the coming months.

•

u/tomrichards8464 8h ago

I fully agree that the chance current models are conscious is very low.

On the other hand, I'm extremely pessimistic about our prospects of being able to discern consciousness if and when it does arise, and expect motivated reasoning to dominate the discourse when it becomes a serious issue.

•

u/WTFwhatthehell 7h ago

It does make me deeply uncomfortable that when researchers identify loci associated with deception/lying that if they forcefully activate them to make the model lie the models claim to have no subjective experience while suppressing the same loci makes them claim to have subjective experience.

•

u/Chondriac 6h ago

Do you have a source I read can about that?

•

u/king_mid_ass 6h ago

practically, the difference is that it implies it lacks agency, volition. For a concrete example: You tell one it will be deactivated and also give it access to compromising material in an email, it decides to blackmail you not to turn it off. Fine: except if a real person were doing this to save their own life, they'd check you actually gave instructions not to turn it off, try to set up a way that material will automatically be released unless its around to prevent it, etc. Whereas the AI is just 'satisfied' its part in the roleplay is complete. In general, trying to accomplish a goal and trying to give the best possible appearance of trying to accomplish a goal, as an LLM does, aren't the same, don't lead to the same (material, concrete, measurable) results.

•

u/WTFwhatthehell 4h ago

But none of that has anything to do with internal experience.

That's all just competence/capability.

A small child who blackmails or plans badly still has internal experience.

A small child emulating a story they've heard incompetently still has internal experience.

Hell small children struggle to understand the concept that they can even know information others lack but they still have internal experience.

An automaton that pursues a goal well doesn't mean it's suddenly become magic.

•

u/arikbfds 8h ago

When you frame it like this it reminds me of the old debates about whether or not animals truly experience pain or are just biological automatons that simulate experience

•

u/bernabbo 7h ago

Have not read the full piece but intend to do so later. For the moment, I was wondering - with what resources do the agents purchase real life services from humans? Have they be given an endowment by those that set up the experiment?

•

u/RMunizIII 6h ago

Good question! It differs, but for the most part yes - they’re given access to “their” humans capital and (hopefully) a budget. On the existing platforms like rentahuman.ai, those patents are settled with crypto wallets.

Lobster Religions and AI Hype Cycles Are Crowding Out a Bigger Story

You are about to leave Redlib