Researchers at Google DeepMind have developed a technique for improving the mathematical abilities of AI language models like ChatGPT. Called Optimization by PROmpting (OPRO), it uses AI models to improve the written instructions (prompts) given to AI models, which led to significant improvements in their math skills. OPRO sidesteps the limitations of traditional derivative-based optimization methods by describing the problem in natural language and letting large language models (LLMs) propose the solutions.
Leveraging natural language for optimization
In traditional machine learning, derivative-based optimizers improve a model's performance by searching for the best solution. These algorithms rely on a mathematical definition of the problem and use the slope of a performance curve to decide each adjustment. OPRO, on the other hand, uses “meta-prompts” expressed in everyday human language to set the stage for optimization. The meta-prompt describes the optimization problem in natural language and lists past answers, and the LLM is instructed to generate new solutions from it. Each new solution is evaluated and added to the meta-prompt for the next iteration.
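The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DeepMind's implementation: `propose` stands in for a call to an LLM, `toy_score` is an arbitrary placeholder for a real evaluation, and every function name and the meta-prompt wording are hypothetical.

```python
from typing import Callable, List, Tuple

Scored = Tuple[str, float]  # (candidate instruction, its score)


def build_meta_prompt(task: str, history: List[Scored], top_k: int = 3) -> str:
    """Describe the problem in plain language and list the best past answers."""
    # Sort ascending so the highest-scoring instruction appears last.
    best = sorted(history, key=lambda pair: pair[1])[-top_k:]
    lines = [task, "", "Previous instructions and their scores:"]
    for text, score in best:
        lines.append(f"text: {text}")
        lines.append(f"score: {score:.1f}")
    lines.append("")
    lines.append("Write a new instruction that differs from those above and scores higher.")
    return "\n".join(lines)


def optimize(
    task: str,
    propose: Callable[[str], str],   # assumption: wraps an LLM call
    score: Callable[[str], float],   # assumption: evaluates one instruction
    seeds: List[str],
    steps: int = 5,
) -> Scored:
    """Iteratively ask for new candidates and keep track of every score."""
    history = [(s, score(s)) for s in seeds]
    for _ in range(steps):
        meta_prompt = build_meta_prompt(task, history)
        candidate = propose(meta_prompt)
        history.append((candidate, score(candidate)))
    return max(history, key=lambda pair: pair[1])


def make_stub_proposer(candidates: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays a fixed list of candidates."""
    remaining = iter(candidates)
    return lambda meta_prompt: next(remaining)


def toy_score(instruction: str) -> float:
    """Placeholder scorer with arbitrary keywords; not a real measurement."""
    text = instruction.lower()
    return float(sum(10 for word in ("step", "carefully", "check") if word in text))


if __name__ == "__main__":
    proposer = make_stub_proposer(
        ["Solve the problem carefully.", "Work step by step and check the answer."]
    )
    print(optimize("Find an instruction that helps solve math word problems.",
                   proposer, toy_score, ["Solve the problem."], steps=2))
```

No gradient appears anywhere in this sketch. The only feedback passed to the proposer is the text of earlier candidates and their scores.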
Human-like encouragement boosts accuracy
One of the most intriguing aspects of the DeepMind study is the impact of specific phrases on AI model output. Earlier work had found that phrases like “Let’s think step by step” significantly improved the accuracy of AI models when solving math problems. In this latest study, DeepMind researchers discovered that the phrase “Take a deep breath and work on this problem step by step” was the most effective prompt when used with Google’s PaLM 2 language model, achieving an accuracy of 80.2 percent in tests against a dataset of grade-school math word problems.
This result may seem peculiar, because AI models don’t reason like humans; they draw on patterns learned from vast datasets of text. Phrases like “take a deep breath” or “think step by step” likely tap into patterns of reasoning or worked problem-solving examples in that data. OPRO’s advantage lies in its ability to sift through numerous possible prompts and identify the most effective one for a specific problem, which could help AI models produce more accurate and useful results in the future.
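To make the idea of sifting through prompts concrete, the sketch below scores each candidate instruction by its accuracy on a small set of problems and keeps the best one. It is an illustration only: `canned_model` is a hypothetical stand-in for an LLM call, the two problems are made-up placeholders rather than items from any real dataset, and the resulting scores say nothing about real model behavior.

```python
from typing import Callable, List, Tuple

Problem = Tuple[str, str]  # (question, expected answer)


def accuracy(
    instruction: str,
    problems: List[Problem],
    answer_fn: Callable[[str, str], str],  # assumption: wraps an LLM call
) -> float:
    """Percentage of problems answered correctly when using this instruction."""
    correct = sum(
        1
        for question, expected in problems
        if answer_fn(instruction, question).strip() == expected
    )
    return 100.0 * correct / len(problems)


def best_instruction(
    candidates: List[str],
    problems: List[Problem],
    answer_fn: Callable[[str, str], str],
) -> Tuple[str, float]:
    """Score every candidate and return the one with the highest accuracy."""
    scored = [(c, accuracy(c, problems, answer_fn)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


def canned_model(instruction: str, question: str) -> str:
    """Hypothetical model: answers correctly only if the instruction says 'step'.

    Real models do not behave this simply; this exists only so the sketch runs.
    """
    answers = {"What is 2 + 3?": "5", "What is 4 * 6?": "24"}
    return answers[question] if "step" in instruction.lower() else "unknown"


if __name__ == "__main__":
    problems = [("What is 2 + 3?", "5"), ("What is 4 * 6?", "24")]
    candidates = ["Solve the problem.", "Let's think step by step."]
    print(best_instruction(candidates, problems, canned_model))
```

In a real setup, `answer_fn` would send the instruction and question to a model and extract the final answer from its reply, and the problem set would be far larger than two items.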
OPRO represents a promising way to improve AI math skills by letting language models search for better prompts, including ones phrased as human-style encouragement. The approach has the potential to enhance problem-solving capabilities and produce more precise outcomes, making AI models more effective in various applications.