This article analyzes a Science study that assesses how leading AI weather models perform against traditional physics-based forecasts, with a focus on extreme weather events. It highlights that AI systems excel at routine predictions but struggle to anticipate unprecedented, high-impact extremes.
AI vs physics-based forecasting: what the Science study found
AI weather models such as GraphCast and Pangu-Weather are increasingly capable at short- and medium-range forecasts. Yet when it comes to record-breaking heat, extreme winds, or deep cold, they often fall short compared with physics-based models.
The study, which evaluated the models as they stood about a year ago, notes that these systems tend to underestimate the magnitude and likelihood of extreme events. In response, some teams have added probabilistic outputs that generate multiple possible outcomes, a step forward that nevertheless does not fully overcome the limits of learning from historical data.
Co-author Sebastian Engelke of the University of Geneva explains that, at their core, AI models learn from patterns in decades of past data and thus struggle to foresee novel extremes that lie outside those learned histories. By contrast, physics-based forecasting builds on the underlying atmospheric dynamics, allowing models to adapt to conditions far different from those in the historical record.
This fundamental difference helps physics-based methods retain an edge for high-impact, societally critical events.
Why AI struggles with unprecedented extremes
The key limitation is that AI systems largely reproduce historical behavior rather than generate new physics-driven insights. When a weather system pushes into a regime never seen before, the training data provides little guidance, and the model’s predictions can understate risk.
As a result, out-of-sample extremes—such as unusually intense heatwaves or explosive cyclogenesis—pose persistent challenges for AI-only forecasts. This is not a knock on AI capability in general, but a reminder that training data quality and coverage place a ceiling on what purely data-driven models can achieve for the most dangerous events.
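The ceiling described above can be illustrated with a deliberately simple toy. This is a minimal sketch, not the study's method and not how GraphCast or Pangu-Weather work internally: it assumes a hypothetical nearest-neighbour predictor as a stand-in for a purely data-driven model, and a made-up linear "law" (`physics_predict`) as a stand-in for a physics-based one. All names and numbers are illustrative assumptions.

```python
def physics_predict(x):
    """Toy 'physical law' standing in for a physics-based model: y = 2x.
    (Hypothetical; chosen only so the true answer is known everywhere.)"""
    return 2.0 * x


def knn_predict(train_x, train_y, x, k=3):
    """Toy data-driven model: average the targets of the k training
    points closest to x. It can never return a value above max(train_y)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


# The 'historical record' only covers inputs from 0 to 30.
train_x = [float(i) for i in range(0, 31)]
train_y = [physics_predict(x) for x in train_x]

inside = knn_predict(train_x, train_y, 15.0)   # within the training range
outside = knn_predict(train_x, train_y, 40.0)  # an unprecedented input

print(inside, physics_predict(15.0))    # 30.0 30.0
print(outside, physics_predict(40.0))   # 58.0 80.0
```

Inside the training range the two toys agree. For the unprecedented input, the data-driven toy stays pinned near the largest values it has seen and understates the outcome, while the rule-based toy follows its governing equation. Real neural forecast models are far more sophisticated, but the sketch shows why coverage of the training record matters.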
Current performance and notable cases
In practice, AI models are increasingly deployed alongside traditional models by weather agencies, data companies, and insurers because of their speed and strong performance for ordinary weather scenarios. Nvidia’s Atlas and other AI systems have demonstrated skill on some extremes absent from their training data, including rapid intensification events, while producing realistic wind and pressure fields, and they can accurately predict hurricane tracks.
This performance profile makes AI a powerful component of operational forecasting, especially where rapid decision-making is essential.
Notable successes and limitations in extreme scenarios
While AI can capture many routine patterns with high fidelity, extreme events remain the Achilles’ heel. The study notes that AI’s current strengths lie in routine forecasts and pattern recognition, while physics-based models still provide the most reliable guidance for the most dangerous storms, heatwaves, and cold spells.
The incorporation of probabilistic outputs marks an improvement, but these enhancements do not fully erase the gap created by limited exposure to unprecedented conditions.
Where AI adds value in weather forecasting
Despite the limitations for extremes, AI-assisted forecasting delivers tangible benefits across the weather enterprise.
- Speed and scalability: rapid generation of ensemble forecasts for a wide range of scenarios.
- Enhanced routine forecasts: strong performance in daily weather prediction, improving planning for agriculture, energy, and transportation.
- Uncertainty quantification: probabilistic outputs that support risk communication and insurance modeling.
- Augmentation, not replacement: AI tools augment meteorologists, who apply physical intuition and domain expertise to interpret results.
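The uncertainty-quantification point in the list above can be made concrete with a small sketch. It assumes a hypothetical ensemble of ten forecast maximum temperatures (the values are invented for illustration, not taken from the study) and estimates the chance of crossing a warning threshold as the fraction of members that reach it. The function name `exceedance_probability` is likewise an illustrative choice.

```python
def exceedance_probability(members, threshold):
    """Fraction of ensemble members at or above the threshold."""
    if not members:
        raise ValueError("ensemble must contain at least one member")
    return sum(1 for m in members if m >= threshold) / len(members)


# Hypothetical ensemble: ten forecasts of tomorrow's maximum temperature (deg C).
ensemble_temps_c = [36.5, 38.0, 39.2, 40.1, 41.3, 37.4, 40.8, 39.9, 42.0, 38.6]

p_40 = exceedance_probability(ensemble_temps_c, 40.0)
print(f"P(max temp >= 40 C) = {p_40:.0%}")  # 40%
```

A probability like this is only as good as the ensemble behind it: if every member comes from a model that has never seen a comparable extreme, the spread can still understate the true risk, which is the gap the study describes.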
Looking to the future: a hybrid path
The path forward is likely a hybrid forecasting approach that combines the best of both worlds. By aligning AI capabilities with the physics of the atmosphere, forecasters can achieve both speed and resilience against extremes.
This direction relies on richer training data that include simulated extremes, ongoing validation against real-world events, and the integration of physics-based constraints to guide AI predictions toward physically plausible outcomes.
Implications for policy and public safety
For decision-makers, the implications are clear: invest in hybrid models, communicate forecast uncertainty transparently, and continuously evaluate model performance across the full spectrum of weather, including extreme events.
The goal is to strengthen early warning, risk assessment, and resilience planning while preserving the reliability that physics-based forecasting has long provided for high-stakes weather events.
Here is the source article for this story: Traditional forecasting still beats AI for the most extreme weather

