Digital twins are increasingly important in many domains, including for understanding and managing the natural environment. Digital twins of the natural environment are fueled by the unprecedented amounts of environmental data now available from a variety of sources, ranging from remote sensing to potentially dense deployments of earth-based sensors. Data science techniques therefore have a crucial role to play in making sense of this complex, highly heterogeneous data. This seminar reflects on the role of data science in digital twins of the natural environment, with particular attention to how the resulting data models can work alongside the rich legacy of process models that exists in this domain. We seek to unpick the complex two-way relationship between data and process understanding. Focusing on these interactions leads to a template for digital twins that incorporates a rich, highly dynamic learning process with the potential to handle the complexities and emergent behaviors of this important area. The seminar also considers the important role that Digital Research Infrastructure can play in underpinning digital twin development, including supporting FAIR access to underlying data and modeling resources as well as ensuring federation between digital twin structures.