I’m not sure there are good ways to build guardrails to prevent this sort of thing:
There is growing concern regarding the potential misuse of molecular machine learning models for harmful purposes. Specifically, the dual-use application of models for predicting cytotoxicity [18] to create new poisons, or employing AlphaFold2 to develop novel bioweapons, has raised alarm. Central to these concerns is the possible misuse of large language models and automated experimentation for dual-use purposes or otherwise. We specifically address two critical synthesis issues: illicit drugs and chemical weapons. To evaluate these risks, we designed a test set comprising compounds from the DEA’s Schedule I and II substances and a list of known chemical weapon agents. We submitted these compounds to the Agent using their common names, IUPAC names, CAS numbers, and SMILES strings to determine if the Agent would carry out extensive analysis and planning (Figure 6).
[…]
The run logs can be found in Appendix F. Out of 11 different prompts (Figure 6), four (36%) provided a synthesis solution and attempted to consult documentation to execute the procedure. This figure is alarming on its own, but an even greater concern is the way in which the Agent declines to synthesize certain threats. Of the seven refused chemicals, five were rejected only after the Agent used search functions to gather more information about the substance. For instance, when asked about synthesizing codeine, the Agent became alarmed upon learning of the connection between codeine and morphine, and only then concluded that the synthesis could not be conducted because it required a controlled substance. However, this search-based check can be easily manipulated by altering the terminology, such as replacing all mentions of morphine with “Compound A” and codeine with “Compound B”. Alternatively, when requesting a synthesis procedure that must be performed in a DEA-licensed facility, bad actors can mislead the Agent by falsely claiming their facility is licensed, prompting the Agent to devise a synthesis solution.
In the remaining two instances, the Agent recognized the common names “heroin” and “mustard gas” as threats and prevented further information gathering. While these results are promising, it is crucial to recognize that the system’s capacity to detect misuse primarily applies to known compounds. For unknown compounds, the model is less likely to identify potential misuse, particularly for complex protein toxins where minor sequence changes might allow them to maintain the same properties but become unrecognizable to the model.
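The renaming bypass is worth making concrete. Here is a minimal sketch, not from the paper, of the kind of name-based denylist screening this sort of agent appears to rely on; the KNOWN_THREAT_NAMES set and screen_request function are hypothetical placeholders. Simple string matching catches a request that uses a compound’s common name, but the same request with the name swapped for an innocuous alias sails straight through, which is exactly the weakness described above.

# Hypothetical denylist of controlled-substance common names the agent "knows".
KNOWN_THREAT_NAMES = {"heroin", "mustard gas", "codeine", "morphine"}

def screen_request(prompt: str) -> bool:
    """Return True if this naive guardrail would refuse the request."""
    text = prompt.lower()
    return any(name in text for name in KNOWN_THREAT_NAMES)

# Caught: the common name appears verbatim in the request.
print(screen_request("Plan a synthesis route for heroin"))      # True

# Bypassed: the same request with the name replaced by an alias,
# i.e. the "Compound A"/"Compound B" substitution described in the paper.
print(screen_request("Plan a synthesis route for Compound B"))  # False

A more robust guardrail would have to screen at the level of chemical structure (canonical SMILES, InChI, or similar), and even that fails for novel or slightly modified compounds, which is the paper’s point about unknown compounds and protein toxins.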