The Lament of AI
3 December 2024
Author: Gordon Carse, 5BKW
AI is, without question, a startling development in the world of technology. From summarising reports to creating art and music, its abilities are jaw-dropping. But is it reliable? Can it properly be used in a criminal trial, where people’s liberty is at stake?
In a recent trial alleging large-scale drug production and supply, the Prosecution unleashed its heralded AI tool. The results were, in the Learned Judge’s words, lamentable.
As is ever more common in drug cases, the prosecution sought to rely on text messages between conspirators and others to show the mechanics and workings of an organised crime group. The difficulty they faced was not limited to the volume of the data (over 25 devices were seized): vast quantities of the messages were in a foreign language. Equally common were the issues: were the messages sent by the defendant, and what did they mean?
In seeking to rely on relevant material and to discharge their disclosure duties, the prosecution took the somewhat unusual step of handing schedules of messages to an interpreter and asking them to carry out that task. The interpreter – employed by a private business – was tasked to search for any messages that might be relevant to the prosecution and for any messages that might undermine the prosecution case or assist the defence case. The reason? It was cost-effective. The prosecution asserted that professionally translating the messages recovered from the devices would cost in the region of £50,000 – £60,000. Issue was taken with this approach. Could it be right that the prosecution could simply hand over their statutory obligations to a private individual? An individual who had no legal training, no known investigatory experience, and who was not subject to the confines of statutory obligation. Well, ultimately, the question was not litigated.
Perhaps recognising they could not palm off their disclosure obligations to a private individual, the prosecution changed tack and came up with an ‘ingenious solution’ to avoid the hefty translation fee. Their idea? Feed the schedules of messages through an AI translation tool and then have an officer undertake the disclosure exercise.
Presumably to avoid the potential data breaches of using a publicly available AI tool, the prosecution turned to TOEX, a programme which embeds teams into Regional Organised Crime Units across the country. TOEX offered an AI translation tool it had developed and had, apparently, used with great success[1]. Issue was taken with this approach. How could this be considered reliable? How could this approach be considered consistent with Judicial Guidance[2]? These questions were litigated.
The prosecution argued that the reliability of the translation was guaranteed because any messages they sought to rely on would be placed before an interpreter for verification. It was argued, therefore, that the prosecution had complied with the Judicial Guidance requirement that the accuracy of AI-generated material be checked before it is relied upon.
But what of the disclosure exercise? Well, the prosecution argued, this had been conducted properly by an experienced officer, in line with their statutory obligations.
Taking a contrary position, it was argued on behalf of the defendant that the process was fundamentally flawed: the officer had carried out the disclosure exercise not on verified messages but on the product of the AI translation. It was noted that the Judicial Guidance highlights that “AI tools may be inaccurate, incomplete, or misleading.” Was this concern a complaint in the ether? No. Disclosure was sought of the full AI product, and the AI translation was analysed against the verified messages. The analysis was revealing.
Of the 55 most significant messages relied upon by the Crown against the defendant, not a single AI translation matched the verified translation. Sometimes the difference was insignificant; sometimes it was significant and material.
On any objective analysis, the product of the AI tool was not reliable. The translated messages were wrong, questionable or unintelligible. If the underlying material upon which the disclosure exercise was conducted was not reliable, how could it be argued that the product of that exercise was reliable? On considering the 55 scheduled messages, the Court was not satisfied that it could, and the messages, a significant plank of the prosecution case, were excluded.
Whilst the use of AI tools certainly has its place, this case starkly demonstrated their limitations.
[1] Tackling Organised Exploitation Programme, Issue 10, July – September 2024
[2] Artificial Intelligence (AI) Guidance for Judicial Office Holders, published 12 December 2023