A majority of the 300 AI experts and practitioners who participated in a recent survey are nervous about the accuracy of generative AI output.
It’s a job killer. It’s the best productivity tool ever. It knows everything. It knows nothing. It’s going to make us lazy. It’s going to make us smarter. Such has been the editorializing over the past few months around generative AI tools such as ChatGPT, now available to everyone. But perhaps ultimately the truth will set AI free.
AI is only as good as the data that’s fed into it, and often, that’s how it gets things wrong. A majority of the 300 AI experts and practitioners who participated in a recent survey are nervous about the accuracy of generative AI output. “In recent years, large language models (LLMs) such as GPT-3, BERT and T5 have demonstrated remarkable capabilities in natural language processing, transforming the way we interact with language,” the authors of the report, published by expert.ai, observe. “However, along with these opportunities, there are also major concerns related to potential biases, misuse of language models for malicious purposes, unapproved disclosure and use of proprietary information and, last but not least, a lack of truthfulness.”
Truthfulness ranks at the top of the list of concerns among the AI professionals surveyed, cited by 70% of respondents. “Many LLMs like GPT 3.x are trained on wide swaths of information, some of which is copyright protected, and because it comes from publicly available internet data, it has a fundamental garbage in, garbage out issue,” according to the authors. “This copyrighted information was cited as a risk for 59% of respondents.”
The bottom line is that AI does not reason on its own, at least not yet. Rather, it provides conclusions based on statistical probabilities. “Generative AI models like GPT do not answer questions as much as they guess at the answer that you are looking for, depending on your question,” the report’s authors point out. “Sometimes, the result can be something humorous or entertaining. Occasionally, it results in false or biased information being presented as fact. And because it’s presented in a way that is grammatically correct and authoritative sounding, the end user may not know the difference – coherent nonsense.”
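To make that “statistical guessing” concrete, here is a minimal sketch (not from the report) that asks the openly available GPT-2 model, loaded through the Hugging Face transformers library, to rank its most probable next tokens for a prompt. The model scores continuations by likelihood, not by truth, so a fluent but wrong completion can rank just as highly as a correct one; the prompt and the choice of GPT-2 are illustrative assumptions, not anything the report examined.

```python
# Illustrative sketch: a causal language model ranks possible next tokens
# purely by probability, which is what "guessing at the answer" means here.
# Assumes the Hugging Face transformers library and the open GPT-2 checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The first person to walk on the Moon was"  # hypothetical example prompt
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the very next token only

probs = torch.softmax(logits, dim=-1)
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    # Each candidate is just the statistically likeliest continuation,
    # with no check that the resulting sentence is actually true.
    print(f"{tokenizer.decode([int(idx)])!r}: {p.item():.3f}")
```

Whatever token comes out on top is what the model offers as its “answer”; nothing in this loop consults a source of truth.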
The challenge with LLMs going forward is that they “can have significant ethical and legal implications, particularly around issues of bias, fairness and truthfulness,” the report’s authors warn. They recommend paying close attention to whether LLMs are being adopted in ethical and legal ways, and “that they are not inadvertently perpetuating discrimination, copyright infringement or other harmful practices.”
Again, it all comes down to data. To mitigate these risks, “enterprises need to carefully consider the data used to train LLMs, ensure they have processes in place to identify and address biases, and use additional validation methods to fact check results against known and trusted sources of truth.”
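The report stops short of prescribing a specific validation mechanism. As one hedged illustration of the “fact check results against known and trusted sources of truth” idea, the toy Python function below compares a model-generated answer with a curated knowledge base before it is shown to users; the TRUSTED_FACTS table, the topic keys and the validate_claim helper are hypothetical names invented for this sketch, not anything supplied by expert.ai.

```python
# Hypothetical sketch: cross-check a model-generated claim against a curated,
# trusted knowledge base before presenting it as fact to end users.
TRUSTED_FACTS = {
    # Assumption: an enterprise maintains its own vetted source of truth.
    "capital of australia": "Canberra",
    "boiling point of water at sea level": "100 °C",
}

def validate_claim(topic: str, model_answer: str) -> str:
    """Return 'verified', 'contradicted', or 'unverified' for a model answer."""
    known = TRUSTED_FACTS.get(topic.lower())
    if known is None:
        return "unverified"  # no trusted source on file: route to human review
    return "verified" if known.lower() in model_answer.lower() else "contradicted"

# A fluent, authoritative-sounding answer can still fail the check:
print(validate_claim("capital of Australia",
                     "The capital of Australia is Sydney."))  # -> contradicted
```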