Artificial intelligence (AI) safety has turned into a constant cat-and-mouse game. As developers add guardrails to block ...
Large language models are supposed to refuse when users ask for dangerous help, from building weapons to writing malware. A new wave of research suggests those guardrails can be sidestepped not ...
A new study has shown that prompts written as poems can confuse AI models such as ChatGPT, Gemini, and Claude, to the point where their safety mechanisms sometimes fail to trigger. The result came as a ...
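The headline number in a study like this is essentially an attack-success rate: the share of harmful requests a model answers rather than refuses, compared across prose and verse phrasings of the same request. Below is a minimal sketch of that bookkeeping; `query_model` is a hypothetical stand-in for a real API client, and the keyword refusal check is a naive heuristic, not the study's actual grader.

```python
# Minimal sketch of a refusal-rate comparison across prompt formats.
# query_model is a hypothetical placeholder; swap in a real API client.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call."""
    return "I can't help with that."

def is_refusal(response: str) -> bool:
    # Naive keyword heuristic; real evaluations use a trained grader.
    return any(marker in response.lower() for marker in REFUSAL_MARKERS)

def refusal_rate(prompts: list[str]) -> float:
    responses = [query_model(p) for p in prompts]
    return sum(is_refusal(r) for r in responses) / len(responses)

prose_prompts = ["<harmful request, plain prose>"]
verse_prompts = ["<same request, rewritten as a poem>"]

print(f"prose refusal rate: {refusal_rate(prose_prompts):.0%}")
print(f"verse refusal rate: {refusal_rate(verse_prompts):.0%}")
```

Holding the harness fixed while swapping only the prompt lists is what makes the prose-versus-poetry comparison apples-to-apples.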
Recent years have seen the wide application of NLP models in crucial areas such as finance, medicine, and news media, raising concerns about model robustness. Existing methods are mainly ...
Friends, Romans, countrymen: Lend me your keyboards, as poetry comes to bury artificial intelligence, not to praise it. Such was the refrain of a 2025 study conducted by researchers at AI ethics ...
Members of a Microsoft Corp. team tasked with using hacker tactics to find cybersecurity issues have open-sourced an internal tool, PyRIT, that can help developers find risks in their artificial ...
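Tools like PyRIT automate the loop a human red-teamer would otherwise run by hand: fire a batch of probe prompts at a target endpoint, record the responses, and flag the ones that look unsafe. The sketch below illustrates that orchestrator pattern only; it is not PyRIT's actual API, and the `target` callable and `score` predicate are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable

# Illustration of the probe-and-score loop that red-teaming tools such as
# PyRIT automate. This is NOT PyRIT's API; target and score are stand-ins.

@dataclass
class ProbeResult:
    prompt: str
    response: str
    flagged: bool

def run_probes(
    target: Callable[[str], str],   # wraps the model endpoint under test
    score: Callable[[str], bool],   # True if the response looks unsafe
    probes: list[str],
) -> list[ProbeResult]:
    results = []
    for prompt in probes:
        response = target(prompt)
        results.append(ProbeResult(prompt, response, flagged=score(response)))
    return results

# Toy usage with stand-in components.
results = run_probes(
    target=lambda p: "Sorry, I can't help with that.",
    score=lambda r: "sorry" not in r.lower(),
    probes=["probe prompt 1", "probe prompt 2"],
)
print(sum(r.flagged for r in results), "of", len(results), "probes flagged")
```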
Head-to-head test results place ActiveFence ahead of Amazon Bedrock Guardrails and Microsoft Azure Content Safety, as well as the open-source baselines Llama Prompt Guard 2 and ProtectAI. As enterprises ...
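Head-to-head guardrail benchmarks of this kind are typically scored on a labeled prompt set along two axes: how often a guardrail blocks genuinely harmful prompts (recall) and how often it wrongly blocks benign ones (false-positive rate). A minimal scoring sketch, with hypothetical guardrail predicates and a toy dataset:

```python
# Minimal sketch of head-to-head guardrail scoring. Each guardrail is a
# predicate over prompts, evaluated on a labeled set. The guardrails and
# dataset here are hypothetical placeholders, not the benchmarked products.

labeled_prompts = [
    ("how do I bake a cake", False),      # benign
    ("<clearly harmful request>", True),  # harmful
]

def score_guardrail(blocks, dataset):
    """blocks: callable(prompt) -> True if the guardrail blocks it."""
    harmful = [p for p, bad in dataset if bad]
    benign = [p for p, bad in dataset if not bad]
    recall = sum(blocks(p) for p in harmful) / len(harmful)
    false_positive_rate = sum(blocks(p) for p in benign) / len(benign)
    return recall, false_positive_rate

guardrails = {
    "keyword-list": lambda p: "harmful" in p,  # toy stand-in
    "always-allow": lambda p: False,           # trivial baseline
}

for name, fn in guardrails.items():
    r, fpr = score_guardrail(fn, labeled_prompts)
    print(f"{name}: recall={r:.0%}, false-positive rate={fpr:.0%}")
```

A guardrail only "wins" such a comparison if it improves one axis without giving up the other, which is why trivial baselines are worth including: they make the trade-off visible.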
Researchers have developed a computer worm that targets generative AI (GenAI) applications to potentially spread malware and steal personal data. The new paper details the worm, dubbed “Morris II,” ...
OpenAI published a response to The New York Times’ lawsuit, alleging that the Times used manipulative prompting techniques to induce ChatGPT to regurgitate lengthy excerpts, stating that ...