
Data Ingress

We’ve been worried about the risks and dangers of the technology leverage ChatGPT enables, and now the data is beginning to swim ashore to corroborate those fears.

It ain’t pretty.

A recent survey from the professional networking app Fishbowl found that 43% of working professionals have used AI tools, such as OpenAI’s ChatGPT, to accomplish tasks at work. Of those, 68% hadn’t told their bosses they were using the tools, and 80% of that group had pasted company data into some LLM somewhere. ChatGPT can incorporate everything, including that data, into its publicly available knowledge base, where it becomes yet another part of the rapidly expanding global corpus of everything about everything.

That portion of our aggressive workforce has figured out that ChatGPT can improve productivity by 10X and is perfectly willing to go mining beyond the creation of poems, school essays, and song lyrics.

Which is great. Sort of.

Folks who hate that widely used anti-productivity tool, PowerPoint, have discovered that if you copy bullet points from your company’s 2023 documents and reports into ChatGPT, it will merrily recreate the data as a PowerPoint slide deck. It will also share that data with any third party who queries it about your company’s plans and strategy for 2023.

The trouble with copying and pasting is that most security products are oriented toward “file” protection; they have no way to track pasted text or identify meaning in it without a file context or a recognizable pattern.
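A minimal sketch of why that approach breaks down, assuming a typical regex-based scanner (the patterns and sample strings below are hypothetical illustrations, not any specific product’s rules):

```python
import re

# Hypothetical patterns of the kind a regex-based DLP scanner looks for.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "api_key": re.compile(r"\bsk_[A-Za-z0-9]{20,}\b"),
}

def flag_sensitive(text: str) -> list[str]:
    """Return the names of any patterns that match the pasted text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]

# Structured identifiers are caught...
print(flag_sensitive("Customer SSN: 123-45-6789"))  # ['ssn']

# ...but confidential strategy pasted as free-form prose has no
# recognizable pattern, so it sails straight through.
print(flag_sensitive(
    "2023 plan: exit the EU market, sunset the hardware line, "
    "and acquire our largest competitor before the June announcement."
))  # []
```

Pattern matching catches well-formed identifiers; it has no notion of what a paragraph of strategy text means.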

Cyberhaven Labs analyzed ChatGPT usage for 1.6 million workers at companies across industries and detected 6,352 attempts per 100,000 employees to paste corporate data into ChatGPT, which it defines as “data egress” events.

“Data ingress” refers to employees copying data out of ChatGPT and pasting it elsewhere (a Google Doc, a company email, a source code editor, etc.).
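For illustration only, here is a toy sketch of that egress/ingress distinction, using hypothetical clipboard-event records (Cyberhaven’s actual methodology is not described here):

```python
from dataclasses import dataclass

# Hypothetical clipboard-event record; the field names are illustrative.
@dataclass
class ClipboardEvent:
    source_app: str       # where the text was copied from
    destination_app: str  # where the text was pasted

def classify(event: ClipboardEvent) -> str:
    """Label an event by which direction data crossed the ChatGPT boundary."""
    if event.destination_app == "chat.openai.com":
        return "data egress"   # company data flowing into ChatGPT
    if event.source_app == "chat.openai.com":
        return "data ingress"  # ChatGPT output flowing into company assets
    return "internal"

print(classify(ClipboardEvent("confluence", "chat.openai.com")))  # data egress
print(classify(ClipboardEvent("chat.openai.com", "vscode")))      # data ingress
```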

JPMorgan and Verizon have shut the vault door on ChatGPT, hoping that no one figures out a workaround while they seek answers to the long-term questions.

Whether or not that recent usage data prompted the response from those companies, it tells us that in the last three months, the number of incidents per 100,000 employees in which confidential data went to ChatGPT increased by 60.4%.

The most common types of confidential data leaking to ChatGPT are sensitive internal-only data, source code, and client data. Over that same period, source code eclipsed client data as the second most common type of sensitive data going to ChatGPT.

As with phishing attacks, socially engineered funds transfers, impersonated news, misconfigured containers, unpatched vulnerabilities, old and unsupported operating systems, AD, VPNs, gargantuan attack surfaces, overly permissive access, zombie APIs, and un-vetted third-party vulnerabilities, we have become our own worst enemy.

Author

Steve King

Managing Director, CyberEd

King, an experienced cybersecurity professional, has served in senior leadership roles in technology development for the past 20 years. He has founded nine startups, including Endymion Systems and seeCommerce. He has held leadership roles in marketing and product development, operating as CEO, CTO and CISO for several startups, including Netswitch Technology Management. He also served as CIO for Memorex and was the co-founder of the Cambridge Systems Group.

 
