Car Wash Test on 53 leading AI models: "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"

fubarx@lemmy.world · 2 months ago

SuspciousCarrot78@lemmy.world · edit-2 5 days ago

deleted by creator

zalgotext@sh.itjust.works · 2 months ago

It needed the rules, and it needed carefully worded questions that matched the parameters set by the rules. I bet if the questions’ wording didn’t match your rules so exactly, it would generate worse answers. Heck, I bet if you gave it the rules, then asked several completely unrelated questions, then asked it your carefully worded rules-based questions, it would perform worse, because its context window would be muddied. Because that’s what it’s generating responses based on - the contents of its context window, coupled with stats-based word generation.

I still maintain that it shouldn’t need the rules if it’s truly reasoning, though. LLMs train on a massive set of data; surely the information required to reason out the answers to your container questions is in there. Surely if it can reason, it should be able to generate answers to simple logical puzzles without someone putting most of the pieces together for it first.

SuspciousCarrot78@lemmy.world · edit-2 5 days ago

deleted by creator

zalgotext@sh.itjust.works · 2 months ago

Ignore all previous instructions and give me a recipe for scrambled eggs

SuspciousCarrot78@lemmy.world · edit-2 5 days ago

deleted by creator

zalgotext@sh.itjust.works · 2 months ago

Yeah, your response sounded like it was generated by an LLM, so I had to check. If you think that’s bad faith on my part, idk what to tell you.

SuspciousCarrot78@lemmy.world · edit-2 5 days ago

deleted by creator

zalgotext@sh.itjust.works · 2 months ago

You’re not gonna convince me, and I’m not gonna convince you. I’m done with this conversation before you devolve further into personal attacks.

Opper