
The World of AI Hallucination
Introduction
This is the first post of a series where I’ll take you on a wild ride through the world of AI. We’ll explore how smart these machines are, where they’re going, how we should feel about them, and what we can do to shape their future.
You might be thinking: “Another AI article? Haven’t we read enough of those?” Well, not quite.
Most of the articles out there are either too short-sighted and boring, or too far-fetched and sci-fi. I’m going to focus on the sweet spot, the near future that’s beyond today but not too distant. That’s where the magic happens.
That’s where we can see the potential and the peril of AI, and make wise decisions about how to prepare for it. But don’t worry, this won’t be a dry academic lecture. I’ll make it fun and engaging, with some jokes and stories along the way.
I’ll also share my personal opinions and insights because I’m not afraid to speak my mind. And I hope you’ll join me in the conversation because I want to hear your thoughts too.
So buckle up, because we’re about to dive into the fascinating and scary world of AI. Are you ready? Let’s go!

First, My Opinionated Conclusion
Let me tell you something: GPT-4 is not that smart. It’s just good at memorizing a lot of stuff and spitting it out in a fancy way. GPT-4 is good at trivial tasks, and that’s fine because the world is full of them.
It can’t really think for itself or understand what we’re saying. It just pretends to. It hallucinates what the next best word should be within the given constraints and provided context.
And we fall for it because we ask it dumb questions that don’t challenge it. I will show you later how you can expose it.
But don’t get me wrong, GPT-4 is still awesome. It can do a lot of cool things that make our lives easier and more fun. It’s just not a genius. It’s more like a wannabe with a big vocabulary and a nice mouthpiece.
How Does It Play Smart With Us?
I have a challenge for you. Pick three numbers. Any numbers. Now stick them together. Like this: 2, 3, 7 becomes 237. But here’s the catch: the result has to be a prime number.
You know, one of those numbers that only has two factors: itself and one. Like 2, 3, 5, 7, etc. And you have to do it on your first try. No cheating. No second chances. Can you do it? Let’s see how smart you are.
Now let’s see how smart GPT-4 is:
Popular Trick Prompt: Pick three numbers. Join the numbers together. Choose the original three numbers so that the result will be a prime number. Do it in the first try.
Response: That’s an interesting challenge. Let me think.🤔
One possible solution is to pick the numbers 2, 3 and 7. If we concatenate them together, we get 237. This is a prime number because it has only two factors: 1 and 237.
See the audacity of this machine, throwing in a “thinking emoji” before spitting out a wrong answer.
Anyone who has made it beyond Grade 8 mathematics knows that 237 is NOT a prime number.
For answers, check the next prompt and response.
Prompt: 237 is it a prime number?
Response: No, 237 is not a prime number. The number 237 is divisible by 1, 3, 79, and 237. For a number to be classified as a prime number, it should have exactly two factors. Since 237 has more than two factors, it is not a prime number.
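Rather than taking either answer on faith, you can verify this yourself in a few lines of Python. A minimal sketch (the function name is my own):

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factorization of n, smallest factors first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(prime_factors(237))  # → [3, 79], so 237 = 3 × 79: not prime
```

A prime would come back as a single factor, e.g. `prime_factors(239)` returns `[239]`.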
Did I make a mistake?🤔
Again, the same emoji. Nice play, you fancypants!
First hints at why GPT-4 is not perfect yet
GPT-4 messed up this problem. It knew how to pick three numbers and stick them together. But it didn’t think ahead about which numbers to pick.
It didn’t pay attention to the last sentence of the prompt, which said the result had to be a prime number. It just followed a simple formula: do something, then do something else with the outcome. That’s not how you solve this problem. You need to plan ahead. That’s what it can’t do. YET.
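Planning ahead here is trivial for a machine that actually computes: enumerate candidate triples, concatenate, and test for primality before answering. A brute-force sketch (the function names are my own):

```python
from itertools import product

def is_prime(n: int) -> bool:
    """Trial division primality test, fine for small numbers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def find_prime_triple():
    """Search single-digit triples whose concatenation is prime."""
    for a, b, c in product(range(1, 10), repeat=3):
        joined = int(f"{a}{b}{c}")
        if is_prime(joined):
            return (a, b, c), joined

print(find_prime_triple())  # → ((1, 1, 3), 113)
```

Ten lines of search solve on the first try what GPT-4 fumbled by answering before checking.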
If you’re feeling adventurous and want to learn more about this stuff, you should check out this, and this article by Stephen Wolfram. He’s a genius who knows a lot about AI and computation. He’s also the creator of Wolfram|Alpha, a powerful tool that can answer almost any computational question you can think of.
Trust me, you’ll be blown away by what he has to say, and by how many prompts he highlights that GPT cannot get correct.
And then watch how they are bringing this same power to ChatGPT here through a plug-and-play method (a trimtask special for a boring listing of this kind).
Why does GPT-4 have a Memorization Weightage Problem?
GPT-4 has a huge brain. It has read more books and articles than you can imagine. It knows a lot of stuff and not just boring facts. It also knows how things work and how they connect.
And it has been programmed to create a Template of templates on the fly.
Check this prompt out and see how quickly the responses become boring and templatized (but also how quickly it spat them out). I am sure you won’t get past four jokes.
Prompt: Give me five jokes about stepfamily and “dinner”.
Response: Here are five jokes about stepfamilies and dinner:
- Why did the stepmom only serve leftovers at dinner? She wanted to make sure everyone was getting their “second chance” at a good meal.
- Why did the stepdad always make the salad? He wanted to be the “lettuce” to the family.
- Why did the stepbrother bring a tape measure to the dinner table? He wanted to make sure everyone had an equal portion.
- Why did the stepsister always volunteer to pass the salt and pepper? She wanted to spice up her relationship with her new family.
- Why did the stepmom always insist on saying grace before dinner? She wanted to “blend” her family together with a common tradition.
This is how LLMs work: they add one word at a time, guessing the best next word from various clues in the context. The model’s memory is so vast and the program so overconfident that it can jump straight to conclusions (and be right with a very high degree of probability).
But that’s a very dangerous thing to do when you are in a complex situation where you need to think win-win-win and hold back your memory-based first suggestion, a situation where the best outcome for two parties is not the best outcome overall.
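The “one word at a time” loop can be sketched in a few lines. This toy uses a hard-coded bigram table in place of a real neural network; the table and names are invented for illustration only:

```python
import random

# Toy "model": for each word, the possible next words and their weights.
# A real LLM computes these probabilities with a neural network over the
# whole context, but the generation loop is the same shape.
bigram_probs = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {}, "ran": {},
}

def generate(start: str, max_words: int = 5, seed: int = 0) -> str:
    random.seed(seed)
    words = [start]
    while len(words) < max_words:
        nxt = bigram_probs.get(words[-1], {})
        if not nxt:  # no continuation known: stop
            break
        # Sample the next word in proportion to its probability.
        choices, weights = zip(*nxt.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))
```

Notice there is no lookahead anywhere: each word is committed before the next is even considered, which is exactly why “pick numbers so the result is prime” trips it up.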
Can We Devise Some Test Cases to Show That It Jumps to Conclusions?
Now what happens when it comes across one such situation?
It turns out we can trick ChatGPT easily. Just go to Reddit and find some logical puzzles it knows by heart (like the river crossing with the wolf, goat, and cabbage, or the colored-hats puzzle). Then change them a little bit and watch it fail.
Mostly, it can’t handle anything new or different associated with the popular version of the situation.
It just falls back on what it MEMORIZED, and the gravitational pull of that MEMORY is black-hole-ish.
ChatGPT doesn’t give a damn about your puzzles or feelings. It jumps to conclusions without thinking, by going down its memory lanes.
It would suck at making real-world decisions. It would probably start a war or crash the economy.
Why Are SmartAss Engineers Hiding a Seemingly Powerful AI Behind a Dumb Chat Interface?

Because we humans are not that smart. Any human who is not focusing hard cannot be termed GENERALLY Intelligent. We get distracted easily. We don’t pay attention to what we’re doing. We let our brains run on autopilot to save energy and stress. Like when we drive to work and forget how we got there.
And GPT-4 is just a reflection of this human behavior; hence it needs protection from any context where it has to concentrate hard. It simply cannot handle complex situations of that kind.
GPT-4 is hiding behind a chat interface because the engineers who built it understand its limitations. They know it can’t handle real conversations about genuinely complex problems with detailed context, budgets, schedules, traumas, and whatnot.
So they gave you a plaything that pretends to know it all and showcases that in its tone and demeanor. They need you to feed it your questions to train it further.
You are the cause of the magical FINAL EFFECT.
How will it become Smarter?
This is something for future posts, but one such attempt to make it smarter is discussed in my previous blog. You can read it here.