Although the principles of machine learning were laid down some half a century ago, only recently have they found widespread application in practice. As computing power grew, computers learned first to distinguish objects in images and play Go better than humans, then to draw pictures based on text descriptions and maintain a coherent chat. In 2021–2022, scientific breakthroughs became accessible to all. For example, you can subscribe to MidJourney and, say, instantly illustrate your own books. And OpenAI has finally opened up its large GPT-3 (Generative Pretrained Transformer 3) language model to the general public through ChatGPT. The bot is available at chat.openai.com, where you can see for yourself how it maintains a coherent conversation, explains complex scientific concepts better than many teachers, artistically translates texts between languages, and much, much more.
If we strip ChatGPT down to the bare essentials, the language model is trained on a gigantic corpus of online texts, from which it “remembers” which words, sentences, and paragraphs are collocated most frequently and how they interrelate. Aided by numerous technical tricks and additional rounds of training with humans, the model is optimized specifically for dialog. Because “on the internet you can find absolutely everything”, the model is naturally able to support a dialog on practically all topics: from fashion and the history of art to programming and quantum physics.
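To make the idea of “remembering which words are collocated most frequently” concrete, here is a deliberately tiny sketch: a bigram counter that predicts the next word from the one before it. This is an illustrative toy, not how GPT-3 works internally (that is a large neural network trained on vastly more data), and the function names and sample corpus are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def most_likely_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical miniature "corpus" standing in for the internet.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(most_likely_next(model, "the"))
```

In this corpus “the” is followed by “cat” twice and “mat” once, so the toy model picks “cat”. It has no notion of meaning, only of frequency, which is the caveat the rest of this article keeps returning to.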
Scientists, journalists, and plain enthusiasts are finding ever more applications for ChatGPT. The Awesome ChatGPT prompts website has a list of prompts (phrases to start a conversation with a bot), which allow you to “switch” ChatGPT so that it will respond in the style of Gandalf or some other literary character, write Python code, generate business letters and resumes, and even imitate a Linux terminal. Nevertheless, ChatGPT is still just a language model, so all the above is nothing more than common combinations and collocations of words; you won’t find any reason or logic in it. At times, ChatGPT talks convincing nonsense (like many humans), for example, by referring to non-existent scientific studies. So always treat ChatGPT content with due caution. That said, even in its current form, the bot is useful in many practical processes and industries. Here are some examples in the field of cybersecurity.
On underground hacker forums, novice cybercriminals report how they use ChatGPT to create new Trojans. The bot is able to write code, so if you succinctly describe the desired function (“save all passwords in file X and send via HTTP POST to server Y”), you can get a simple infostealer without having any programming skills at all. However, straight-arrow users have nothing to fear. If bot-written code is actually used, security solutions will detect and neutralize it as quickly and efficiently as all previous malware created by humans. What’s more, if such code isn’t checked by an experienced programmer, the malware is likely to contain subtle errors and logical flaws that will make it less effective.
At least for now, bots can only compete with novice virus writers.
When InfoSec analysts study new suspicious applications, they reverse-engineer the pseudo-code or machine code, trying to figure out how it works. Although this task cannot be fully assigned to ChatGPT, the chatbot is already capable of quickly explaining what a particular piece of code does. Our colleague Ivan Kwiatkowski has developed a plugin for IDA Pro that does precisely that. The language model under the hood isn’t really ChatGPT (rather its cousin, davinci-003), but this is a purely technical difference. Sometimes the plugin doesn’t work, or outputs garbage, but for those cases when it automatically assigns legitimate names to functions and identifies encryption algorithms in the code and their parameters, it’s worth having in your kitbag. It comes into its own in SOC conditions, where perpetually overloaded analysts have to devote a minimum amount of time to each incident, so any tool to speed up the process is welcome.
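The article does not show how the plugin talks to the model, so the following is only a rough sketch of the general idea: wrap the decompiler output in a plain-language instruction and hand the result to whatever language model you use. The function name, the prompt wording, and the character limit are all assumptions made for illustration, and the sketch stops short of calling any API.

```python
def build_explain_prompt(pseudocode: str, max_chars: int = 4000) -> str:
    """Wrap decompiler output in an instruction for a language model.

    Hypothetical helper: the prompt text and the size limit are
    illustrative choices, not taken from any real plugin.
    """
    snippet = pseudocode.strip()
    if len(snippet) > max_chars:
        # Models accept limited input, so cut overly long functions.
        snippet = snippet[:max_chars] + "\n/* ...truncated... */"
    return (
        "Explain what the following decompiled C function does, "
        "and suggest a descriptive name for it.\n\n"
        + snippet
    )


# Made-up decompiler output for demonstration purposes.
sample = "int sub_401000(char *a1, int a2) { return a1[a2] ^ 0x5A; }"
print(build_explain_prompt(sample))
```

The model’s answer would still need to be reviewed by the analyst, since, as noted above, the output can be garbage.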
A variation of the above approach is an automated search for vulnerable code. The chatbot “reads” the pseudo-code of a decompiled application and identifies places that may contain vulnerabilities. Moreover, the bot provides Python code designed to exploit the vulnerability as a proof of concept (PoC). Sure, the bot can make all kinds of mistakes, in both searching for vulnerabilities and writing PoC code, but even in its current form the tool is of use to both attackers and defenders.
Because ChatGPT knows what people are saying about cybersecurity online, its advice on this topic looks convincing. But, as with any chatbot advice, you never know exactly where it came from, so for every 10 great tips there may be one dud. All the same, the tips in the screenshot below, for example, are all sound:
Phishing and BEC
Convincing texts are a strong point of GPT-3 and ChatGPT, so automated spear-phishing attacks using chatbots are probably already occurring. The main problem with mass phishing e-mails is that they don’t look right, with too much generic text that doesn’t speak directly to the recipient. As for spear-phishing, when a live cybercriminal writes an e-mail to a single victim, it’s quite expensive; therefore, it’s used only in targeted attacks. ChatGPT is set to drastically alter the balance of power, because it allows attackers to generate persuasive and personalized e-mails on an industrial scale. However, for an e-mail to contain all necessary components, the chatbot must be given very detailed instructions.
But major phishing attacks usually consist of a series of e-mails, each gradually gaining more of the victim’s trust. So for the second, third, and nth e-mails, ChatGPT will really save cybercriminals a lot of time. Since the chatbot remembers the context of the conversation, subsequent e-mails can be beautifully crafted from a very short and simple prompt.
Moreover, the victim’s response can easily be fed into the model, producing a compelling follow-up in seconds.
Among the tools attackers can use is stylized correspondence. Given just a small sample of a particular style, the chatbot can easily apply it in further messages. This makes it possible to create convincing fake e-mails seemingly from one employee to another.
Unfortunately, this means that the number of successful phishing attacks will only grow. And the chatbot will be equally convincing in e-mail, social networks, and messengers.
How to fight back? Content analysis experts are actively developing tools that detect chatbot texts. Time will tell how effective these filters will prove to be. But for now, we can only recommend our two standard tips (vigilance and cybersecurity awareness training), plus a new one: learn how to spot bot-generated texts. Their mathematical properties are not recognizable to the eye, but small stylistic quirks and tiny incongruities still give the robots away. Check out this game to see if you can spot the difference between human- and machine-written text.
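As a rough illustration of what “mathematical properties” of a text can mean, the sketch below computes a few simple statistics: how many sentences there are, how long they are on average, how much their length varies, and how varied the vocabulary is. This is a toy under stated assumptions, not a working detector: the choice of features and the function name are hypothetical, and the article does not say which signals real detection tools rely on.

```python
import re
from statistics import mean, pstdev


def style_features(text):
    """Compute a few simple statistics about a text.

    Illustrative only: no thresholds are given because none are known
    that would reliably separate human from machine writing.
    """
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        "sentences": len(sentences),
        "mean_sentence_len": mean(lengths) if lengths else 0.0,
        "sentence_len_spread": pstdev(lengths) if lengths else 0.0,
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
    }


print(style_features("Short one. This sentence is quite a bit longer than that!"))
```

Numbers like these are invisible to a reader but easy for software to compare across many texts, which is why automated filters and human vigilance complement each other.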