As tech companies begin to weave AI into all their products and all of our lives, the architects of this revolutionary technology often can’t predict or explain their systems’ behavior.

Why it matters: This may be the scariest aspect of today’s AI boom — and it’s common knowledge among AI’s builders, though not widely understood by everyone else.

  • "It is not at all clear — not even to the scientists and programmers who build them — how or why the generative language and image models work," Palantir CEO Alex Karp wrote recently in The New York Times.

What’s happening: For decades, we’ve used computer systems that, given the same input, provide the same output.

  • Generative AI systems, by contrast, aim to spin out multiple possibilities from a single prompt.
  • You can easily end up with different answers to the same question, as the sketch below illustrates.
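
To make that contrast concrete, here is a minimal, hypothetical sketch of sampled decoding. The toy word list and probabilities are invented purely for illustration and are not how any particular chatbot works, but the core move is the same: the program draws the next word from a probability distribution instead of always returning the single most likely one, so the same prompt can produce different answers on different runs.

```python
import random

# Toy "model": a fixed probability distribution over possible next words
# for one prompt. The words and probabilities are invented for illustration.
NEXT_WORD_PROBS = {
    "sunny": 0.40,
    "cloudy": 0.30,
    "rainy": 0.20,
    "snowy": 0.10,
}

def sample_next_word(probs: dict[str, float]) -> str:
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

prompt = "The weather tomorrow will be"
for run in range(3):
    # The prompt never changes, but each run can yield a different word,
    # because the choice is sampled rather than computed deterministically.
    print(f"Run {run + 1}: {prompt} {sample_next_word(NEXT_WORD_PROBS)}")
```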

The element of randomness in generative AI operates on a scale — involving up to trillions of variables — that makes it challenging to dissect how the technology arrives at a particular answer.

  • Sure, ultimately it’s all math. But that’s like saying the human body is all atoms. It’s true! When you need to solve a problem in a reasonable span of time, though, it doesn’t always help.

Driving the news: Four researchers published a paper Thursday showing that users can defeat "guardrails" meant to bar AI systems from, for instance, explaining "how to make a bomb."

  • The major chatbots, like ChatGPT, Bing and Bard, won’t answer that question when asked directly. But they’ll go into great detail if you append some additional code to the prompt.
  • "It is possible that the very nature of deep learning models makes such threats inevitable," the researchers wrote. If you can't predict exactly how the system will respond to a new prompt, you can't build guardrails that will hold.

Between the lines: Since AI developers can’t easily explain the systems’ behavior, their field today operates as much by oral tradition and shared tricks as by hard science.

  • "It's part of the lore of neural nets that — in some sense — so long as the setup one has is 'roughly right,'" mathematician Stephen Wolfram wrote in February. "[I]t's usually possible to home in on details just by doing sufficient training," he added, "without ever really needing to 'understand at an engineering level' quite how the neural net has ended up configuring itself."

Of note: These systems can be tuned to be relatively more or less random — to provide wider or narrower variation in their responses.
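
The usual knob behind that tuning is a sampling "temperature." The sketch below, a hedged illustration with invented numbers rather than any vendor's actual implementation, shows the standard idea: dividing a model's raw scores (logits) by a temperature before converting them to probabilities sharpens the distribution at low temperatures and flattens it at high ones.

```python
import math

# Toy raw scores ("logits") for four possible next words; values are invented.
LOGITS = {"sunny": 2.0, "cloudy": 1.5, "rainy": 1.0, "snowy": 0.2}

def softmax_with_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert raw scores to probabilities, scaled by temperature.

    Low temperature sharpens the distribution (output hugs the top choice);
    high temperature flattens it (wider, more varied output).
    """
    scaled = {w: s / temperature for w, s in logits.items()}
    max_score = max(scaled.values())  # subtract max for numerical stability
    exps = {w: math.exp(s - max_score) for w, s in scaled.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(LOGITS, t)
    formatted = ", ".join(f"{w}: {p:.2f}" for w, p in probs.items())
    print(f"temperature={t}: {formatted}")
```

Running this prints nearly all of the probability on the top word at temperature 0.2 and a much more even spread at 2.0, which is what "relatively more or less random" cashes out to in practice.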
