Talk about AI with any startup CEO, and chances are they will describe major shifts: AI boosting work and research, and providing easier access to knowledge. However, recent studies indicate that with each update, summarizing data might be getting harder for such tools, not easier. The Royal Society has just published a new study in which almost three-quarters (73%) of AI-generated scientific summaries were found to contain mistakes. For the study, almost 5,000 summaries generated by ChatGPT-4, ChatGPT-4.5, DeepSeek, and LLaMA 3.3 70B were examined. The research reveals some alarming facts about AI today and where it is heading.
The Newer the Model, the Worse the Summary
It is reasonable to assume that advanced models like ChatGPT-4o would be more accurate. Instead, the study found that the newer the model, the more likely omission and overgeneralization become. Compared with its earlier versions, ChatGPT-4o left out more significant information. Meta's LLaMA 3.3 70B, likewise, was found to be over 36 times likelier to overgeneralize than its predecessors. The more widely these tools are used, the more likely it seems that they will oversimplify scientific findings and make them appear to mean something else.
Why Summarizing Data is So Difficult for AI
While summarizing data is a technical task, it also requires sound judgment, a human trait that AI has not mastered yet. Consider teaching a child that a stove is hot but that not every kitchen appliance is. A human grasps that subtle distinction easily. An AI, noting that many appliances generate heat, might assume they all do until told otherwise. The same problem arises in scientific contexts. Used carelessly, AI chatbots can draw general conclusions that may not be right and mislead the people who read them. In a clinical or medical setting, a mistake in a summary could put a person's life in danger.

The High Stakes of Relying on AI Summaries
Nowadays, AI is found in schools, health clinics, pharmacies, and other workplaces. Yet the study suggests we may be leaning heavily on tools that cannot yet handle these duties properly. The authors say that better AI summaries can be expected in the future, but for now, humans are still needed to interpret scientific information accurately. I hope you agree that summarizing science should remain a human task for the time being.
📝 Source
Original article from Futurism