Artificial Intelligence has conquered chess, mastered language generation, and even created art. But can it handle abstract reasoning, the kind measured by visual puzzles that challenge human cognition? Researchers at USC Viterbi School of Engineering are putting the latest AI models to the test.
The Abstract Reasoning Challenge
USC Viterbi ISI Research Assistants Kian Ahrabian and Zhivar Sourati recently investigated whether multi-modal large language models (MLLMs) can perform nonverbal abstract reasoning—tasks that require both visual perception and logical thinking. Their findings, presented at the Conference on Language Modeling (COLM 2024), reveal fascinating insights into AI's current limitations.
"We wanted to see if this new generation of large models, which are able to process images, can reason on their own," Ahrabian explained. "For example, if you see a yellow circle turning into a blue triangle, can the model apply the same pattern in a different scenario?"
Testing AI Against Human Intelligence
The research team used Raven's Progressive Matrices—a well-established test of abstract reasoning—to evaluate 24 different AI models. These puzzles require identifying patterns and relationships that seem intuitive to humans but prove challenging for machines.
The results were revealing:
Open-Source Models: Struggled significantly, with most failing to demonstrate meaningful reasoning capabilities on these visual tasks.
Closed-Source Models: Showed better performance, particularly GPT-4V, which demonstrated "some nontrivial results" but remained far from perfect.
Where AI Stumbles
A critical discovery emerged when the researchers isolated the problem. To separate perception from reasoning, they removed visual processing from the equation, giving the models detailed text descriptions of the puzzles instead of images.
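To make the text-only setup concrete, here is a minimal sketch of how a Raven's-style puzzle might be written out in words. The toy puzzle, the `Cell` class, and the `describe_puzzle` function are illustrative assumptions, not the encoding used in the USC study.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Cell:
    """One panel of a toy puzzle (hypothetical attributes)."""
    shape: str
    color: str
    count: int


def describe_cell(cell: Optional[Cell]) -> str:
    # The missing panel is shown as a question mark.
    if cell is None:
        return "?"
    plural = "" if cell.count == 1 else "s"
    return f"{cell.count} {cell.color} {cell.shape}{plural}"


def describe_puzzle(grid: Sequence[Sequence[Optional[Cell]]],
                    options: Sequence[Cell]) -> str:
    """Turn a grid and its answer options into a plain-text prompt."""
    lines = ["Each row follows the same rule. "
             "Pick the option that completes the grid."]
    for r, row in enumerate(grid, start=1):
        lines.append(f"Row {r}: " + " | ".join(describe_cell(c) for c in row))
    for label, option in zip("ABCDEFGH", options):
        lines.append(f"Option {label}: {describe_cell(option)}")
    return "\n".join(lines)


# Toy puzzle: the number of shapes grows by one across each row.
grid = [
    [Cell("circle", "yellow", 1), Cell("circle", "yellow", 2), Cell("circle", "yellow", 3)],
    [Cell("square", "blue", 1), Cell("square", "blue", 2), Cell("square", "blue", 3)],
    [Cell("triangle", "red", 1), Cell("triangle", "red", 2), None],
]
options = [Cell("triangle", "red", 2), Cell("triangle", "red", 3), Cell("square", "red", 3)]

prompt = describe_puzzle(grid, options)
print(prompt)
```

The printed prompt is a plain-text grid with a `?` in the missing cell, so it can be handed to a language model with no image involved.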
"Even when we removed the visual element and just gave them text, many models still couldn't reason effectively," Sourati explained. This revealed that the limitation wasn't just visual processing but fundamental reasoning abilities.
The Path to Better AI Reasoning
One promising approach the researchers explored was "Chain of Thought prompting," where AI systems are guided to think step-by-step through reasoning tasks. This method led to significant improvements, with some models showing up to 100% performance gains when provided with hints and structured guidance.
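As a rough illustration of the difference between asking for an answer outright and asking for step-by-step reasoning, the sketch below builds both kinds of prompt. The wording of the steps, the hint, and the helper names (`direct_prompt`, `chain_of_thought_prompt`, `extract_answer`) are assumptions made for this example. The study's actual prompts may differ, and no model API is called here.

```python
import re
from typing import Optional, Sequence


def direct_prompt(puzzle_text: str) -> str:
    """Baseline: ask for the answer with no reasoning."""
    return f"{puzzle_text}\n\nAnswer with the option letter only."


def chain_of_thought_prompt(puzzle_text: str,
                            hints: Sequence[str] = ()) -> str:
    """Ask the model to reason step by step, optionally with hints."""
    parts = [puzzle_text, ""]
    if hints:
        parts.append("Hints:")
        parts.extend(f"- {hint}" for hint in hints)
        parts.append("")
    parts.extend([
        "Let's think step by step:",
        "1. Describe what changes across each complete row.",
        "2. State the rule that explains those changes.",
        "3. Apply the rule to the incomplete row.",
        "Finish with a line of the form 'Answer: <letter>'.",
    ])
    return "\n".join(parts)


def extract_answer(response: str) -> Optional[str]:
    """Return the last 'Answer: X' letter in a model response, if any."""
    matches = re.findall(r"Answer:\s*([A-H])\b", response)
    return matches[-1] if matches else None


# Hypothetical text-only puzzle used purely for illustration.
puzzle_text = (
    "Row 1: 1 yellow circle | 2 yellow circles | 3 yellow circles\n"
    "Row 2: 1 red triangle | 2 red triangles | ?\n"
    "Option A: 2 red triangles\n"
    "Option B: 3 red triangles"
)

print(chain_of_thought_prompt(puzzle_text, hints=["Count the shapes in each panel."]))
```

Asking for a fixed `Answer:` line at the end keeps scoring simple: the reasoning can be as long as the model likes, and `extract_answer` only reads the final verdict.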
Why This Matters
The implications extend far beyond academic research:
Autonomous Systems: Self-driving cars need abstract reasoning to navigate unpredictable scenarios
Healthcare: Medical AI must reason through complex diagnostic patterns
Robotics: Robots require abstract thinking to adapt to new environments
Scientific Discovery: AI research assistants need reasoning abilities to form hypotheses
The Bigger Picture
Jay Pujara, research associate professor and study co-author, noted: "Every day we're bombarded with new headlines about what AI can (and can't) do, which are often very surprising. We still have such a limited understanding of what new AI models can do, and until we understand these limitations we can't make AI better, safer, and more useful."
Current State vs. Future Potential
While today's AI models excel at pattern recognition and data processing, true abstract reasoning remains elusive. The research highlights that achieving human-level reasoning requires more than processing power—it demands fundamental advances in how AI systems approach problem-solving.
Current Capabilities:
- Excellent at memorizing large datasets
- Strong pattern matching in familiar contexts
- Effective at specific, well-defined tasks
Remaining Challenges:
- Generalizing to novel situations
- Abstract problem-solving without extensive training
- Understanding context and relationships in complex scenarios
Looking Forward
The USC research represents crucial groundwork for developing more capable AI systems. As these models continue to evolve, understanding their reasoning limitations helps guide development toward truly intelligent systems that can adapt and think more like humans.
The journey from pattern recognition to genuine reasoning represents one of AI's next great frontiers—and research like this brings us closer to bridging that gap.
