TY - JOUR
T1 - Methodology for Identifying Mesoscale Weather Patterns from High-Dimensional Climate Datasets
AU - Nevat, Ido
AU - Acero, Juan A.
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2024.
PY - 2025/4
Y1 - 2025/4
N2 - We develop a new methodology to solve the problem of identifying and selecting mesoscale weather patterns (MWPs) from high-dimensional spatio-temporal climate datasets. This problem is important and topical and has many implications for decision-makers across multiple sectors, such as urban design, urban climate, agriculture, transportation, energy, and disaster management. This problem involves selecting a small subset of data (specific days) from the original large dataset (decades long), such that it captures the essential information and characteristics of the large dataset, while minimizing redundancy. This is useful as it makes the dataset more manageable for processing, analysis, and insight gathering without degrading the overall quality of the information content. We develop a novel algorithm that is based on advanced machine learning and optimization techniques and consists of two stages: (1) spatial dimensionality reduction (SDR) to reduce the number of spatial cells analyzed while preserving as much relevant information as possible and (2) representative subset selection (RSS) to find a small subset of days in the dataset that captures the essential patterns, relationships, and information present in the full dataset—these are the mesoscale weather patterns. We demonstrate our methodology by applying it to a spatio-temporal dataset of atmospheric observations, the ERA5 dataset, in Singapore. The MWPs offer valuable insights into the region’s diverse weather conditions and help researchers, climatologists, and policymakers comprehend the complex interactions between atmospheric elements.
AB - We develop a new methodology to solve the problem of identifying and selecting mesoscale weather patterns (MWPs) from high-dimensional spatio-temporal climate datasets. This problem is important and topical and has many implications for decision-makers across multiple sectors, such as urban design, urban climate, agriculture, transportation, energy, and disaster management. This problem involves selecting a small subset of data (specific days) from the original large dataset (decades long), such that it captures the essential information and characteristics of the large dataset, while minimizing redundancy. This is useful as it makes the dataset more manageable for processing, analysis, and insight gathering without degrading the overall quality of the information content. We develop a novel algorithm that is based on advanced machine learning and optimization techniques and consists of two stages: (1) spatial dimensionality reduction (SDR) to reduce the number of spatial cells analyzed while preserving as much relevant information as possible and (2) representative subset selection (RSS) to find a small subset of days in the dataset that captures the essential patterns, relationships, and information present in the full dataset—these are the mesoscale weather patterns. We demonstrate our methodology by applying it to a spatio-temporal dataset of atmospheric observations, the ERA5 dataset, in Singapore. The MWPs offer valuable insights into the region’s diverse weather conditions and help researchers, climatologists, and policymakers comprehend the complex interactions between atmospheric elements.
KW - Machine learning
KW - Mesoscale weather patterns
KW - Representative subset selection
KW - Submodular functions
UR - https://www.scopus.com/pages/publications/85205504791
U2 - 10.1007/s10666-024-09995-5
DO - 10.1007/s10666-024-09995-5
M3 - Article
AN - SCOPUS:85205504791
SN - 1420-2026
VL - 30
SP - 289
EP - 317
JO - Environmental Modeling and Assessment
JF - Environmental Modeling and Assessment
IS - 2
ER -