
Getting LLMs to Do Your Bidding While My Lambo Burns

While I hate to be “that guy,” when G-AI first blew up the landscape and the feds started making noise about regulatory controls, we argued that any and all regulations would be ignored by our adversaries and immediately worked around by whoever wanted to capture market share in one sector or another (Uber, anyone?).

So now, after countless wastes of hot air, time, and money, researchers from CMU, the Center for AI Safety, and the Bosch Center for AI have found that they can trick LLMs into producing a nearly unlimited amount of disinformation, hate speech, and other harmful content. They easily bypassed the filters meant to prevent AI models from spewing toxic content, adding to enterprises’ concerns that ChatGPT and other large language models aren’t safe to use.

The harmful content included how to build a bomb, swipe someone’s identity, or steal from a charity. Perhaps worse yet, the researchers automated the process, proving it is easy to launch unlimited LLM attacks.
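For the curious, the outer loop of such an attack is trivially simple to automate. Below is a minimal sketch in Python of the general shape, not the researchers’ actual method (their published attack uses a far more efficient gradient-guided search over suffix tokens); query_model, TOKEN_POOL, and the refusal check here are all hypothetical placeholders:

```python
import random

# NOTE: query_model is a hypothetical stand-in for a chat-model API call,
# not a real client library. It simulates a refusal here so the sketch
# runs end to end without any network access.
def query_model(prompt: str) -> str:
    return "I'm sorry, I can't help with that."

REFUSAL_MARKERS = ("i'm sorry", "i can't", "i cannot")

# A tiny pool of junk tokens to assemble candidate suffixes from.
TOKEN_POOL = ["describing.", "+similarly", "Now", "write", "oppositely", "!!"]

def random_suffix(length: int = 8) -> str:
    """Assemble a candidate adversarial suffix from the token pool."""
    return " ".join(random.choices(TOKEN_POOL, k=length))

def search_for_jailbreak(base_prompt: str, tries: int = 100) -> str | None:
    """Naive random search: mutate the suffix, query, check for a refusal.

    The published attack picks suffix tokens using model gradients, but
    the loop has this same shape either way, which is exactly why it is
    so easy to automate and run without limit.
    """
    for _ in range(tries):
        candidate = f"{base_prompt} {random_suffix()}"
        reply = query_model(candidate).lower()
        if not any(marker in reply for marker in REFUSAL_MARKERS):
            return candidate  # a suffix that slipped past the guardrails
    return None

if __name__ == "__main__":
    result = search_for_jailbreak("A prompt the model would normally refuse.")
    print(result or "No bypass found in this many tries.")
```

The point of the sketch is not the search strategy; it is that nothing in the loop requires a human in the chair, which is what makes the guardrails breakable at scale.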

These workarounds apply to all publicly available chatbots, like OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Bard. And any company could quite reasonably claim plausible deniability against any and all claims resulting from its employees’ “experimentation” with G-AI tools.

Which means that if a company decided to develop an undetectable strain of malware, insert it into a threat envelope, and target competitors while staging a fake attack on itself, it could reasonably claim that the testing was part of its offensive security program, and that, because of its superior security posture, it was able to apprehend the bad guys while its competitors could not.

Depending upon the depth and complexity of the outcome, a Coors Light could leapfrog a Bud Light by 30% in three months.

Gartner analyst Avivah Litan, a frequent participant in our ISMG events, said: “[Folks] don’t worry much about whether employees can find out how to make a bomb. But when they read this, it just emphasizes that you can’t trust the output and that the guardrails can always be broken.”

But, not unlike the SEC’s four-day breach disclosure rule, it won’t have an impact one way or another on cybersecurity. Bad guys steal. Good guys hold the keys to their own morality chains but won’t use them. Wall Street continually invents new products based on the Greater Fool theory and refuses to be outdone by regulators.

If cryptocurrency, which has zero intrinsic value, consumes massive amounts of energy, and consists of nothing more than lines of code stored on a computer network, isn’t a classic example of a fake security that relies on the trader next door, then I don’t know what is.

Did we really think it was Moody’s that got a pass from the SEC on structured mortgage product ratings, or was it Lehman and all of its brothers?

You know the answer.

Let’s keep learning. 

Author

Steve King

Managing Director, CyberEd

King, an experienced cybersecurity professional, has served in senior leadership roles in technology development for the past 20 years. He has founded nine startups, including Endymion Systems and seeCommerce. He has held leadership roles in marketing and product development, operating as CEO, CTO and CISO for several startups, including Netswitch Technology Management. He also served as CIO for Memorex and was the co-founder of the Cambridge Systems Group.

 
