The Chinese startup, which roiled markets with its low-cost reasoning model that emerged in January, collaborated with researchers from the Beijing institution on a paper detailing a novel approach to reinforcement learning to make models more efficient.
The new method aims to help artificial intelligence models better adhere to human preferences by offering rewards for more accurate and understandable responses, the researchers wrote. Reinforcement learning has proven effective in speeding up AI tasks in narrow applications and spheres. However, ...
Learn more about Bloomberg Tax or Log In to keep reading:
See Breaking News in Context
From research to software to news, find what you need to stay ahead.
Already a subscriber?
Log in to keep reading or access research tools and resources.