We present an approach for assessing the accuracy of climate models based on multi-objective optimization, together with an infrastructure that supports the analysis of massive amounts of model data. Many previous studies have demonstrated that multi-model ensembles generally show better skill than individual ensemble members. When generating weighted multi-model ensembles, the first step is to measure the performance of individual model simulations against observations. There is a consensus on how to assign weighting factors based on a single evaluation metric: the weighting factor for each model is proportional to its performance score or inversely proportional to its error. While this conventional approach can provide appropriate combinations of multiple models, it faces a major challenge when multiple metrics are under consideration. In that case, simple averaging of multiple performance scores or model ranks does not address the trade-offs between conflicting metrics. So far, there seems to be no single best method for generating weighted multi-model ensembles based on multiple performance metrics. The current study applies multi-objective optimization, a mathematical process that provides a set of optimal trade-off solutions based on a range of evaluation metrics, to combine multiple performance metrics for global chemistry-climate models and generate a weighted multi-model ensemble. In general, both the observational and the model data required for this optimization effort are scattered across the network. As a result, the optimization can be hampered by the increasing costs of computation and of communication between the data servers where NASA satellite data and climate model simulations are archived.
To address this Big Data challenge, we outline a plan to apply the Virtual Information Fabric Infrastructure (VIFI) to the multi-objective optimization of climate model simulations with large ensembles. VIFI enables the execution of scalable analytics optimized for distributed data systems. Our proof-of-concept implementation shows considerable variability across the climate simulations. We conclude that the optimally weighted multi-model ensemble always shows better performance than an arithmetic ensemble average and may provide reliable future projections. The VIFI architecture, including resource management and scheduling, is critical for processing massive Earth Science datasets from observations and climate models.
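
As an illustration of the two weighting ideas described above, the following minimal Python sketch first computes single-metric weights that are inversely proportional to each model's error, and then identifies the non-dominated (Pareto-optimal) candidates when two error metrics are considered together. The function names and the error values are hypothetical and chosen only for illustration; the sketch does not reproduce the optimization algorithm or the data used in this study.

```python
def inverse_error_weights(errors):
    """Single-metric weights: proportional to 1/error, normalized to sum to 1."""
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def dominates(b, a):
    """True if candidate b is no worse than a on every metric and better on at least one."""
    no_worse = all(b_k <= a_k for a_k, b_k in zip(a, b))
    strictly_better = any(b_k < a_k for a_k, b_k in zip(a, b))
    return no_worse and strictly_better


def pareto_front(error_table):
    """Indices of rows that no other row dominates (lower error is better)."""
    front = []
    for i, a in enumerate(error_table):
        if not any(dominates(b, a) for j, b in enumerate(error_table) if j != i):
            front.append(i)
    return front


# Hypothetical errors for four candidates on two metrics (made-up numbers).
# A row could represent an individual model or a candidate weighted ensemble.
errors = [
    [1.0, 4.0],
    [2.0, 2.0],
    [4.0, 1.0],
    [3.0, 3.0],
]

# Conventional weighting using only the first metric.
weights_metric0 = inverse_error_weights([row[0] for row in errors])

# Trade-off set when both metrics are considered together.
front = pareto_front(errors)

print("single-metric weights:", weights_metric0)
print("non-dominated candidates:", front)
```

In this toy example the last candidate is dominated by the second one, which has lower error on both metrics, while the first three represent different trade-offs, none of which is better than the others on every metric. Averaging scores or ranks would collapse these trade-offs into a single number, which is the limitation that motivates the multi-objective formulation.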