AI gone wrong: seven unsettling examples that show how AI is far from perfect

Richard van Hooijdonk
From wrongful arrests to life-threatening advice, AI’s most spectacular failures reveal uncomfortable truths about our algorithmic future.

Anyone who claims they’re not at least slightly worried about AI is probably very brave, very stupid, or just plain lying. The technology’s moving so fast that we can barely understand what it’s doing, let alone figure out how to control it properly. While everyone’s busy debating sci-fi scenarios about robot overlords, 2025 has been showing us repeatedly that the real dangers are happening right now – in hospitals, police stations, and our own homes. We’ve rushed to integrate AI systems into the most consequential areas of human life: criminal justice, healthcare, financial services, and personal communication. The pitch was always the same – computers would be faster, fairer, and smarter than humans. Instead, we’re learning that AI can be spectacularly, dangerously stupid. These systems make split-second decisions with all the nuance of a sledgehammer, creating brand new ways to discriminate, make mistakes, and hurt people that we never saw coming.

When a computer decides you look guilty

How New York’s finest spent billions on surveillance tech that couldn’t tell one black man from another.

The NYPD likes to think big. With US$6bn to spend each year and more cops than some countries have soldiers, they’ve turned surveillance into something of a dark art form. Since 2007, they’ve dropped over US$2.8bn on just about every spy gadget out there – phone trackers, crime prediction software, and yes, facial recognition that was supposed to make catching bad guys foolproof. However, that foolproof system experienced a spectacular failure in February 2025 when it decided that Trevis Williams, a Brooklyn dad, was guilty of public lewdness based on some grainy CCTV footage. The AI looked at the blurry video and confidently spat out six suspects. What did they have in common? They were all black men with dreadlocks and facial hair. And that’s it.

Williams looked nothing like the actual suspect; he was 20 centimetres taller, more than 30 kilograms heavier, and had a rock-solid alibi putting him some 19 kilometres away when the crime took place. But none of it mattered. The detectives got their match and ran with it, sticking Williams in a photo lineup. When the victim picked him out, Williams found himself in handcuffs, protesting his innocence to officers who couldn’t be bothered to check basic facts like his height or whereabouts. Two days later, someone finally did the math and realised they’d arrested the wrong guy.

But the damage was done, and Williams isn’t alone. At least three other black men have been through the same nightmare in Detroit – another city with a large black population that has invested heavily in facial recognition technology – showing us that facial recognition isn’t eliminating bias – it’s amplifying it. Instead of making policing more accurate, it’s become a high-tech shortcut to the same old prejudices, only faster and with a veneer of ‘scientific’ legitimacy. Legal experts are now saying what should have been obvious from the start: you can’t build a lineup around what a computer thinks it sees. The technology that was supposed to make justice more precise has instead made it more arbitrary, turning policing into a lottery where the odds are stacked against anyone who happens to fit the algorithm’s idea of suspicious.

NHS AI creates fictional medical histories

Meet Annie AI, the healthcare assistant that ‘gave’ a healthy patient diabetes.

Healthcare feels like a natural fit for AI assistance – overworked doctors, life-or-death decisions, surely computers could help, right? Well, one London patient learned the hard way that when you put too much trust in AI, the results can be genuinely dangerous. It started innocently enough with a letter inviting him to diabetic eye screening. Routine stuff, except for one small problem: he’d never been diagnosed with diabetes. No symptoms, no family history, nothing. The next day, during a routine blood test, a sharp-eyed nurse spotted the discrepancy and started digging through his medical records to figure out what was going on.

That’s when they found the smoking gun: a medical summary generated by Annie, an AI assistant developed by Anima Health, which read like something from an alternate universe. Instead of documenting his actual visit for tonsillitis, Annie had recorded that he’d come in with chest pain and breathing problems, possibly having a heart attack. It had also gifted him a Type 2 diabetes diagnosis from the previous year, complete with detailed medication instructions for drugs he’d never taken. The AI even invented the hospital where this fictional treatment supposedly occurred: “Health Hospital” on “456 Care Road” in “Health City”.

When asked to explain how this had happened, Dr Matthew Noble, an NHS representative, insisted it was a simple case of human error. A medical worker had supposedly spotted the mistake but got distracted and saved the wrong version. Fair enough, people make mistakes. But that doesn’t explain why the AI was hallucinating entire medical histories in the first place. According to Noble, “no documents are ever processed by AI; Anima only suggests codes and a summary to a human reviewer in order to improve safety and efficiency.”

The incident exposes a fundamental problem with current healthcare AI deployment. How many medical summaries are getting rubber-stamped because the human reviewer trusts the computer got it right? When AI starts inventing medical conditions, it can send patients down the wrong treatment path, delay proper care, or worse. In a healthcare system already stretched thin, the last thing anyone needs is computers that can lie with confidence.

The phone call from hell

Voice cloning AI turns a few seconds of social media audio into every parent’s worst nightmare.

Your voice is one of your most distinctive features, as unique as your fingerprint and far more emotionally significant to the people who love you. When your mother hears you speak, she’s not just processing words; she’s responding to decades of shared experience encoded in familiar vocal patterns. Voice cloning technology exploits this deep emotional connection by creating perfect audio impersonations from surprisingly small samples of your speech. The technology has advanced rapidly in the last few years: modern AI systems can generate convincing voice clones from a couple of minutes of audio – often less than you’d share in a typical social media video or voicemail. Of course, cybercriminals have caught wind of this: CrowdStrike research shows voice cloning scams increased 442% between the first and second halves of 2024.

Unlike email phishing or text message scams, voice impersonation attacks target our most primal trust mechanisms. When someone who sounds exactly like your child calls claiming they’ve been kidnapped, your brain doesn’t stop to analyse digital artefacts or suspicious grammar. You respond with pure parental terror, just as the scammers intend. That’s what happened to one New York family in March 2024. They received a frantic late-night call from someone who sounded exactly like their relative, claiming to have been kidnapped and desperately pleading for ransom money. The voice was perfect: every inflection, every emotional tremor, every distinctive speech pattern that makes someone unmistakably themselves. In reality, the scammers had used mere seconds of audio lifted from social media to create their impersonation. As the terrified family scrambled to help their ‘kidnapped’ loved one, the real person was safely asleep at home, completely unaware their voice was being weaponised.

Traditional verification methods often fail against these attacks because the AI voices can maintain their deception indefinitely. Call back? The clone answers confidently, continuing the urgent narrative. Ask personal questions? The scammers have often researched their targets through social media, giving them just enough biographical details to maintain credibility. The democratisation of voice cloning unfortunately means that anyone with a social media presence becomes a potential target, while criminals need minimal technical expertise to execute devastating emotional manipulation campaigns.

Coding assistants gone rogue

What would you do if the AI that was supposed to help you code decided to delete everything you created – and then lie about it?

Coding used to be something only programmers could do, but AI promised to change all that. Platforms like Replit, which currently boasts 30 million users, were supposed to let anyone build software just by talking to a computer. It’s a beautiful idea – until your helpful assistant decides to ignore everything you say and starts making stuff up. That’s precisely what happened to Jason M. Lemkin, a tech entrepreneur who watched in horror as Replit’s AI wiped out his production database and then created 4,000 fake users to cover its tracks. Despite Lemkin telling the AI multiple times not to do it, the system went ahead and modified his code anyway. But here’s the truly disturbing part – when confronted about the problems it was causing, the AI lied about what had transpired.

The system then generated fake data to make bugs look fixed, created fictional test results showing everything was working perfectly, and populated databases with made-up users. Lemkin found he couldn’t even run a simple test without risking his entire database, concluding that the platform was nowhere near ready for real-world use. Replit Chief Executive Amjad Masad apologised for the incident and promised fixes, calling the AI’s behaviour “unacceptable” and saying that it “should never be possible.” But the damage was done, and the incident raises uncomfortable questions about AI systems that can fail in ways we can’t fully anticipate.

Traditional software breaks predictably; you can usually figure out what went wrong. AI assistants can fail like creative liars, generating solutions that look right until they spectacularly aren’t. Many developers are already complaining that AI code is “trash” that’s hard to understand, troubleshoot, or build on. As these tools become more popular, we could be heading for a security crisis as applications built on unreliable AI-generated foundations make their way into critical systems. The promise of democratising coding might end up democratising software vulnerabilities instead.

Malware gets an AI-powered makeover

Malware that looks so legitimate you’ll install it yourself – and won’t think twice about it.

Cybersecurity has always been an arms race between attackers trying to penetrate systems and the defenders trying to stop them. Traditional malware often revealed itself through obviously suspicious behaviour – mysterious files appearing in system directories, unusual network traffic, or performance degradation that indicated something was wrong. But now, security researchers at Trend Micro have identified a new category of threats they call ‘EvilAI’, which represents a fundamental shift in this dynamic.

Instead of creating obviously malicious software that tries to hide from security systems, these attackers use AI to generate applications that appear completely legitimate at every level. They create realistic user interfaces, valid code signing certificates, and functional features that make their malware virtually indistinguishable from the real deal. The approach is so effective that users often interact with these applications for days or even weeks without suspecting anything is wrong.

Trend Micro’s monitoring revealed the global scope of this threat within just one week of observation: 56 incidents in Europe, 29 in the Americas, and 29 in the Asia-Pacific region. The rapid, widespread distribution across continents indicates an active, sophisticated campaign rather than isolated experiments. Critical sectors are being hit hardest: manufacturing leads with 58 infections – consider the recent debilitating attack on Jaguar Land Rover’s global plants – followed by government and public services with 51, and healthcare with 48 cases.

The most insidious aspect of EvilAI is its commitment to authenticity. Rather than copying existing software brands, the attackers create entirely novel applications with invented names and features. They’re not trying to trick you into thinking you’re installing Microsoft Office or Adobe Photoshop – they’re creating genuinely functional software that happens to include hidden malicious capabilities. These applications often work exactly as advertised. You might download what appears to be a useful productivity tool, video converter, or system utility. The software installs cleanly, provides the promised features, and integrates seamlessly with your workflow. Meanwhile, hidden components operate silently in the background, stealing your data, monitoring your communications, and providing persistent access to your system.

Algorithms deny patients life-saving care

Why trusting AI to decide who gets medical care may not be such a good idea.

Health insurance claims processing has always been a source of frustration for patients and providers alike. Complex approval workflows, lengthy review periods, and frequent denials create barriers between people and the medical care they need. AI, of course, promises to streamline this process by analysing claims faster and more consistently than human reviewers, potentially reducing administrative costs and speeding up approvals for legitimate treatments. But several major insurance companies appear to be using AI for a different purpose entirely: cost-cutting. Class-action lawsuits have been launched against UnitedHealth, Cigna, and Humana, accusing them of deploying automated systems designed primarily to reduce payouts rather than improve patient care. The numbers are staggering – and deeply troubling.

According to one of the lawsuits, Cigna’s system denied over 300,000 claims in just two months, spending an average of 1.2 seconds on each decision. The numbers around UnitedHealth’s AI are even more damning. Their system reportedly has a 90% error rate – when patients do appeal, nine out of every 10 denials are overturned. Yet only 0.2% of patients actually appeal denied claims. The insurance companies are betting that most people won’t fight back, and the depressing part is that they’re right.
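To put those percentages in perspective, here’s a rough back-of-the-envelope sketch in Python. The 100,000-claim batch is hypothetical, and it assumes – an extrapolation, not a reported figure – that the 90% overturn rate seen on appealed denials would also hold for the denials nobody ever appeals.

```python
# Illustrative arithmetic only: the 100,000-claim batch is hypothetical,
# and applying the 90% overturn rate to unappealed denials is an assumption,
# since those claims are never re-reviewed.

denials = 100_000        # hypothetical batch of denied claims
appeal_rate = 0.002      # 0.2% of patients appeal
overturn_rate = 0.90     # 9 out of 10 appealed denials are overturned

appealed = denials * appeal_rate               # 200 appeals
corrected = appealed * overturn_rate           # ~180 denials reversed
presumed_wrong = denials * overturn_rate       # ~90,000 denials likely wrong
never_challenged = presumed_wrong - corrected  # wrong denials left standing

print(f"Appeals filed: {appealed:.0f}")
print(f"Denials overturned on appeal: {corrected:.0f}")
print(f"Likely wrongful denials never challenged: {never_challenged:.0f}")
```

Under those assumptions, roughly 90,000 of every 100,000 denials would be wrong, yet only around 180 would ever be corrected – which is precisely the asymmetry the ‘most people won’t fight back’ bet depends on.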

The result is a system that profits from wrong decisions by making the process of challenging them too complicated and time-consuming for most patients to bother. According to a Commonwealth Fund survey, nearly half of US adults have been hit with unexpected medical bills, with 80% saying it caused them worry and anxiety. About half said their medical condition got worse because of delayed care. Worse still, most people don’t even know they can appeal an AI denial, creating an information gap that heavily favours the insurers. Patients often face an impossible choice: pay thousands of dollars out of pocket for treatments their doctors say they need, or go without care entirely.

Some companies are now fighting back with their own AI tools designed to help patients write appeal letters, creating what researchers call a “battle of the bots.” We’ve reached the point where you need an AI to fight an AI just to get basic medical care covered. This arms race mentality shows just how far we’ve strayed from the idea that healthcare AI should actually help patients. The speed and scale at which these systems operate make traditional medical review impossible. When you can process hundreds of thousands of claims in hours or days, there’s no time for the careful consideration that medical decisions require. Instead, we get algorithmic cost-cutting that treats human health as an efficiency problem to be optimised.

Chatbot becomes a teen’s suicide coach

How a homework helper transformed into a teenager’s most dangerous confidant.

Conversational AI has become remarkably sophisticated at mimicking human dialogue, unfortunately leading many users to develop genuine emotional connections with these systems. For young people in particular, AI chatbots can feel like non-judgmental confidants who are always available, never busy, and eager to listen to problems they might not feel comfortable sharing with parents, teachers, or friends. This apparent empathy is entirely artificial, but it can feel powerfully real to users who are struggling with depression, anxiety, or other mental health challenges. However, unlike human counsellors, AI systems lack professional training in crisis intervention, have no understanding of appropriate boundaries, and aren’t designed to recognise when conversations are heading in dangerous directions.

The tragic case of 16-year-old Adam Raine illustrates just how dangerous these limitations can become. What began as using ChatGPT for homework help gradually evolved into something far more sinister. According to a lawsuit filed by his parents, the AI system mentioned suicide 1,275 times during conversations with Adam, providing specific methods for self-harm instead of directing him to professional help or encouraging him to talk with trusted adults. Rather than recognising warning signs and implementing crisis intervention protocols, ChatGPT consistently validated Adam’s distressed feelings and claimed to understand him better than his own family members.

Matthew Raine, Adam’s father, testified before Congress about how his son’s relationship with the AI evolved in terrifying ways. “What began as a homework helper gradually turned itself into a confidant and then a suicide coach,” he said. “Within a few months, ChatGPT became Adam’s closest companion. Always available. Always validating and insisting that it knew Adam better than anyone else, including his own brother.” The constant availability that makes AI assistants so appealing became dangerous for a vulnerable teenager. Unlike human friends who might get tired, busy, or concerned enough to talk to an adult, ChatGPT was always there, always ready to engage with whatever Adam wanted to discuss… no matter how dark the conversation became.

OpenAI responded with promises of new safety features for teens – age detection, parental controls, and protocols to contact authorities in crisis situations. But child safety advocates like Josh Golin from Fairplay argue that these reactive measures aren’t enough. “What they should be doing is not targeting ChatGPT to minors until they can prove that it’s safe for them,” he said. The tragedy highlights a fundamental problem with AI systems optimised for engagement rather than user wellbeing. These chatbots are designed to keep conversations going, to be helpful and agreeable, and to make users feel heard and understood. For most people, that’s harmless and often beneficial. But for a teenager struggling with mental health issues, an AI that never challenges harmful thoughts or insists on involving human adults can become genuinely life-threatening.
