
A horse probably has no feathers

Cybersecurity Column: Cybercriminals (Part 14)


Every month, this post focuses on cybercriminals, IT security gaps or new, interesting digital topics. Today, I'll explain how the artificial intelligence chatGPT works.


Recently, I've been asked a lot how an artificial intelligence like chatGPT actually works. Many people notice that the answers that chatGPT gives you are, in most cases, incredibly good, almost human. On the other hand, chatGPT (until recently) failed at things as banal as "Write banana backwards". How can that be?


First of all: chatGPT is not a knowledge database. It is a language model that calculates the probabilities of word sequences. Thousands of texts and websites were analyzed for this purpose: Wikipedia, poems, children's stories and fairy tales, as well as newspaper articles about politics, murders, demonstrations and horse-riding tournaments.


But what are word-sequence probabilities? Take a sentence that begins with "In the last ...". If I ask you which word comes next, most people will spontaneously say "time", some will say "consequence", and even fewer will say "instance". chatGPT knows this too, because in the 45 TB of text it has learned from, "time" is by far the most common word after "In the last ...", ahead of "consequence" and then "instance".
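
To make this a little more concrete, here is a minimal sketch in Python. It is not how chatGPT is actually built (the real model uses a trained neural network, not explicit counting), but it shows the counting idea behind word-sequence probabilities: in a tiny invented corpus, we tally which word follows the phrase "in the last" and rank the continuations.

# Toy illustration of word-sequence probabilities (invented mini-corpus).
from collections import Counter

corpus = (
    "in the last time we saw great progress. "
    "in the last time it rained a lot. "
    "in the last consequence he resigned. "
    "in the last instance the court decided."
)

words = corpus.lower().replace(".", "").split()
prefix = ("in", "the", "last")

# Count every word that appears directly after the phrase "in the last".
counts = Counter(
    words[i + 3]
    for i in range(len(words) - 3)
    if tuple(words[i:i + 3]) == prefix
)

total = sum(counts.values())
for word, n in counts.most_common():
    print(f"{word}: {n} of {total}")

On this toy corpus, "time" comes out on top, exactly as most readers would guess.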


It also knows that the word "horse" is very often followed (although not immediately) by terms such as "fur", "hooves", "mane" and "tail". This is the case in many texts on Wikipedia, in "Bibi and Tina" and in reports on horse-riding tournaments. The words "wings", "feathers", "scales" or "claws", on the other hand, only very rarely follow it.
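
The same counting idea works for words that appear near "horse" without immediately following it. Again, a purely illustrative Python sketch with an invented mini-corpus (the real model learns such associations from terabytes of text, not from a hand-written list):

# Count which words appear within a few words after "horse".
from collections import Counter

corpus = [
    "the horse shook its mane and its tail",
    "the horse lost a shoe from its hoof",
    "she brushed the horse until its fur shone",
    "the bird spread its wings and preened its feathers",
]

window = 6  # how many words after "horse" we look at
near_horse = Counter()

for sentence in corpus:
    words = sentence.split()
    for i, word in enumerate(words):
        if word == "horse":
            near_horse.update(words[i + 1:i + 1 + window])

for word in ["mane", "tail", "hoof", "fur", "feathers", "wings"]:
    print(word, near_horse[word])

"Mane", "tail", "hoof" and "fur" each show up near "horse"; "feathers" and "wings" do not, because in this little corpus they only ever appear near "bird".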


If we ask chatGPT to describe a horse, it will probably give it fur, hooves, a mane and a tail, and we will be amazed at how well it can describe a horse and "know" that it has fur and not feathers.
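
Roughly speaking, such a description comes about by repeatedly picking a plausible next word. The following sketch is only an illustration of that loop, with a tiny hand-made probability table standing in for the real neural network:

# Generate a "description" by always taking the most probable next word
# (greedy decoding) from a tiny hand-made table. Purely illustrative:
# chatGPT uses a trained neural network instead of such a table.
next_word = {
    "a":      {"horse": 0.9, "bird": 0.1},
    "horse":  {"has": 0.8, "gallops": 0.2},
    "has":    {"fur": 0.6, "hooves": 0.3, "feathers": 0.01},
    "fur":    {"and": 0.9},
    "and":    {"hooves": 0.5, "a": 0.3},
    "hooves": {".": 1.0},
}

word = "a"
sentence = [word]
while word in next_word:
    word = max(next_word[word], key=next_word[word].get)
    sentence.append(word)

print(" ".join(sentence))   # -> a horse has fur and hooves .

Because "feathers" has a vanishingly small probability after "has", it is never picked: even the toy model "knows" that a horse has fur.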


But why did the AI fail to write "banana" backwards? Quite simply because there were few or no texts in the training data in which the words "banana", "backwards" and "ananab" followed one another.
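
The difference to an ordinary program is easy to show. Reversing a word is a trivial, general rule for a normal piece of code, whereas a pure word-sequence model (in the simplified picture used here) can only offer continuations it has seen before. A small sketch with an invented lookup table:

# A normal program reverses a word by applying a simple, general rule:
print("banana"[::-1])   # -> ananab

# A pure word-sequence model has no such rule; it can only offer
# continuations that occurred in its training texts.
# (Invented toy lookup table, for illustration only.)
seen_continuations = {
    "describe a horse": "it has fur, hooves, a mane and a tail",
    "once upon a": "time",
}

prompt = "write banana backwards"
answer = seen_continuations.get(prompt)
print(answer if answer else "no well-supported continuation in the training data")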


Sometimes chatGPT also invents answers, which is called "hallucination". A well-known example: a lawyer had chatGPT write a legal submission listing as many precedents as possible. Lacking sufficient data, the AI simply invented several of the cited cases. As I said: it is a language model, not a database.


The lawyer did not notice the AI's hallucination, but the judge and the opposing lawyers did. However, I am sure that chatGPT will help the cheating lawyer formulate excellent answers to the uncomfortable questions from the bar association. It is very likely that the word "apology" will appear in them. The word "feathers", however, will (probably) not appear this time either.


If you are interested in IT security training for your company, please write to me at info@bortoli.ch – with an i at the front and a small L and then a small i at the back.


Protect yourself proactively! Find out how you can do this: book a free, no-obligation consultation.



