
Breaking AI on purpose: How researchers are helping make artificial intelligence safer


Nullspace steering. Red teaming. Jailbreaking the matrix.  

A paper written by University of Florida Computer & Information Science & Engineering (CISE) Professor Sumit Kumar Jha, Ph.D., contains so many science-fiction terms that you'd be forgiven for thinking it's a Hollywood script.

But Jha's work is decidedly focused on real life, most notably on strengthening the security measures built into AI tools to ensure they are safe for everyone to use.

“We are popping the hood, pulling on the internal wires and checking what breaks. That’s how you make it safer. There’s no shortcut for that.”

– Sumit Kumar Jha, Ph.D., a UF professor in the Department of Computer & Information Science & Engineering

As AI assistants move from novelty to infrastructure — helping write code, summarizing medical notes and answering customer questions — the biggest question isn't just what these systems can do, but what happens when they are pushed to do what they shouldn't.

“By showing exactly how these defenses break, we give AI developers the information they need to build defenses that actually hold up,” Jha said. “The public release of powerful AI is only sustainable if the safety measures can withstand real scrutiny, and right now, our work shows that there’s still a gap. We want to help close it.” 

Read full story on UF News