Pan-European infrastructure for ocean & marine data management

Newsletter

5 September 2017

Content:

Introducing the SeaDataCloud project

SeaDataNet is a major infrastructure in Europe for managing, indexing and providing access to ocean and marine data sets and data products, acquired from research cruises and other observational activities in European coastal marine waters, regional seas and the global ocean. It also develops and governs common standards for metadata and data formats, common vocabularies and quality flags as well as standard software tools. SeaDataNet core partners are the National Oceanographic Data Centres (NODCs) and major marine research institutes in Europe, together with the Intergovernmental Oceanographic Commission (IOC) of UNESCO, International Council for the Exploration of the Sea (ICES) and EU Joint Research Centre. SeaDataNet cooperates intensely with experts in observing programmes, modelling, ICT, and standards, on European and international scales. SeaDataNet has international cooperation through its collaboration with IOC-IODE and ICES, the Ocean Data Interoperability Platform (ODIP) project, GEOSS and the Research Data Alliance. The SeaDataNet data centres are highly skilled and have been actively engaged in data management for many decades and have the essential capabilities and facilities for data quality control, long term stewardship, retrieval and distribution. The SeaDataNet standards, services and products have been developed since the mid-1990s, supported by EU DG RTD and involving and cooperating with a range of EU RTD projects.

SeaDataNet has a close cooperation with the operational oceanography community represented by i) EuroGOOS, the association of national governmental agencies and research organisations committed to European-scale operational oceanography within the context of the intergovernmental Global Ocean Observing System, ii) Copernicus Marine Environmental Monitoring Service (CMEMS), deploying pan-European capacity for Ocean Monitoring and Forecasting, and iii) Euro-ARGO, developing a long-term global ocean monitoring system deploying Argo floats.

The SeaDataNet network has also been a major contributor to the development of the European Marine Observation and Data network (EMODnet) since its start in 2008. EMODnet is an initiative of EU DG MARE, aimed at supporting the implementation of Marine Knowledge 2020, Blue Growth and the Marine Strategy Framework Directive (MSFD) by developing generic marine data products and services for all European sea regions. Through its synergy with the EMODnet developments, SeaDataNet has greatly expanded its network: at present more than 100 marine data centres from 34 countries around European seas are connected to the SeaDataNet infrastructure, giving overview of and access to their data resources for physical oceanography, marine chemistry, marine geology and geophysics, bathymetry, and marine biology.

SeaDataNet is an operational infrastructure, well established on a pan-European scale with national nodes and currently serving many users and applications. However, it falls short of a number of expectations, and there are new challenges from lead user communities, from progressing standards and IT developments, and from the required interaction with international initiatives.

Therefore we are very happy to inform you that these challenges are being taken up in the EU HORIZON 2020 SeaDataCloud project, which started in November 2016 and will run for 4 years. In summary, this project aims at upgrading and expanding the architecture and services of the SeaDataNet infrastructure, inter alia by adopting cloud and High Performance Computing technology, with the following objectives:

  • Improve services to users and data providers
  • Optimise connecting data centres and data streams to the infrastructure
  • Improve interoperability with other European and International networks to provide overview and access also to these other data sources.

In the SeaDataCloud project, the SeaDataNet members have entered into a strategic and technical cooperation with the EUDAT consortium. EUDAT is a European network of computing infrastructures that develop and operate a common framework for managing scientific data and providing an interoperable layer of common data services. SeaDataNet will cooperate with the EUDAT e-infrastructure service providers to build upon the state of the art in ICT and e-infrastructures for data, computing and networking.

Our newsletters are an easy way to report on SeaDataNet achievements and current activities, events, conferences, collaborations, and software updates; they also give notice about future opportunities and perspectives. The newsletters are instrumental for new users or potential data providers to get an overview of the infrastructure. On the SeaDataNet portal you can find all necessary information and resources to search and download data or to become a data provider.

This is the first edition of the newsletter in the framework of the EU HORIZON 2020 funded SeaDataCloud project. It gives you more information about the status of present services and tools, and planned SeaDataCloud project developments. It introduces our partners from EUDAT.  In addition, it gives information about EMODnet project developments which are undertaken with SeaDataNet involvement. Finally it gives a report of the IMDIS 2016 Conference that was organized by the SeaDataNet network.

We hope you will enjoy this newsletter and will be encouraged to visit the SeaDataNet portal to try out its services and to follow its evolution. We aim to reach as many people as possible, so please forward it to anyone you know who may be interested.

Present status of SeaDataNet infrastructure

Oceanographic and marine data include a very wide range of measurements and variables covering a broad, multidisciplinary spectrum of projects and programmes. Oceanographic and marine data are collected by several thousands of research institutes, governmental organizations and private companies in the countries bordering the European seas. Various heterogeneous observing sensors are used, installed on research vessels, submarines, aircraft, moorings, drifting buoys and satellites. These sensors measure physical, chemical, biological, geological and geophysical parameters, with further data resulting from the analysis of water and sediment samples for a wide variety of parameters.

SeaDataNet provides an operational infrastructure for managing, quality control, and long term stewardship of these data as collected for science, environmental management, and other applications. The SeaDataNet infrastructure comprises a network of interconnected data centres that perform marine data management at national and local levels and that together make their information and data resources discoverable and accessible in a harmonized way for science and for society. For that purpose they develop and apply various SeaDataNet standards and tools, and maintain a series of European metadata services, which are all published at the central SeaDataNet portal.

Figure: the upgraded homepage of the SeaDataNet portal

The SeaDataNet directory services provide overviews of marine organisations in Europe, and their engagement in marine research projects, managing large datasets, and data acquisition by research vessels and monitoring programmes for the European seas and global oceans:   

  • European Directory of Marine Organisations (EDMO) (> 3,800 entries)
  • European Directory of Marine Environmental Data (EDMED) (> 4,100 entries)
  • European Directory of Marine Environmental Research Projects (EDMERP) (> 3,000 entries)
  • European Directory of Cruise Summary Reports (CSR) (> 46,000 entries)
  • European Directory of the Ocean Observing Systems (EDIOS) (> 10,000 entries)

Figure: the monthly progress of each of the directories since September 2012.

Users can follow this monthly progress at the SeaDataNet portal.

The overview also includes the CDI service. This is the Common Data Index (CDI) Data Discovery and Access service, which provides users with unified online discovery of and access to the vast resources of marine and ocean datasets managed by the distributed data centres. It gives users a highly detailed insight into the geographical coverage and other metadata features of data across the different data centres. Users can request access to identified datasets in a harmonised way, using a shopping basket. They can follow the processing of requests via an online transaction register and can download datasets in the SeaDataNet standard formats. Through the cooperation with many EU projects and its active role in the EMODnet development, the number of connected data centres has steadily risen to more than 100 at present. In this way the CDI service provides metadata for and access to more than 1.97 million data sets, originating from more than 600 organisations in Europe, covering physical, geological, chemical, biological and geophysical data, and acquired in European waters and global oceans. The following 3 images give an illustration of the CDI and EDMO services.

Figure: Overview of CDI entries as of July 2017: >1.97 million data sets from 600+ originators and 100+ connected data centres

Figure: CDI User Interface with selection of data points in Mediterranean Sea near Italy  

Figure: Overview of CDI data centres in the EDMO directory service

SeaDataNet maintains and operates Common Vocabulary Web services, covering a broad spectrum of ocean and marine disciplines and used by many marine communities in the world. The common terms are used to mark-up metadata, data and data products in a consistent and coherent way. Governance is regulated by an international board.

SeaDataNet provides standards, and common software tools for metadata and data formatting, Quality Control - Quality Assurance, statistical analysis and a versatile software package for data analysis and presentation. These tools can be downloaded without any restriction from the SeaDataNet portal.

The SeaDataNet portal also provides access to data products that have been developed in the predecessor SeaDataNet II project for all European sea regions for temperature and salinity. The products concern aggregated data sets as ODV collections of all SeaDataNet measurements and climatologies as regional gridded fields based on the aggregated datasets. The available data products have been documented with metadata in the Sextant Catalogue and can be viewed and downloaded in the OceanBrowser service.

Figure: Sextant Catalogue with descriptions of present SeaDataNet data products

Figure: OceanBrowser viewing and downloading service with aggregated data set for temperature in the North-Atlantic region

Figure: OceanBrowser viewing and downloading service with climatology for temperature in the Mediterranean sea region

SeaDataCloud: identified challenges and planned activities

SeaDataCloud is a HORIZON 2020 project and successor to the earlier SeaDataNet II project. It started in November 2016 and will run for 4 years, involving as project members 56 organisations from 29 countries in Europe. Moreover, there is close interaction and cooperation with several EMODnet projects, which engage many more marine data centres that are nodes in the SeaDataNet infrastructure. All nodes will be invited, with SeaDataCloud funding, to participate in the SeaDataCloud training workshops and follow-up activities as part of the deployment and implementation of the upgraded infrastructure components. The SeaDataCloud project is coordinated by IFREMER (France) with technical coordination by MARIS (Netherlands).

Figure: Geographic distribution of SeaDataCloud project members

The SeaDataNet infrastructure provides services for a number of leading user communities. Through dialogues and from experience new challenges have been identified as targets for the SeaDataCloud project.

Science: researchers require marine and ocean data and/or data products to support their scientific activities; moreover SeaDataNet, through its data centres, seeks to provide data management support including long-term stewardship, for researchers who are collecting and analysing marine observations and samples from research vessel cruises or otherwise. Identified major challenges:   

  • Usage of the CDI data discovery and access services in practice lags behind expectation and needs to be stimulated by providing more data sets, easier access, better findability, and further ways of engagement;
  • Data from nationally funded cruises and scientific campaigns generally reach the data centres, but data are lacking from several research projects;
  • Scientists are interested in data publishing with persistent identifiers (such as DOI) for their data collections, pushed by scientific publishers and career perspectives.

EMODnet: SeaDataNet delivers the data infrastructure behind multiple EMODnet portals and provides a successful basis for generating data products for the European seas. Identified major challenges:
  • Important data collections (European and international) are missing;
  • Users increasingly make use of the data product services, while not fully satisfied with the CDI data discovery and access service;
  • EMODnet counts on SeaDataCloud for improving and upgrading the underlying SeaDataNet standards, INSPIRE compliance, tools, services, and essential infrastructure.

Operational oceanography (EuroGOOS, Copernicus (CMEMS), Euro-ARGO, AtlantOS, EMSO, JERICO-Next, and others): SeaDataNet delivers standards, ensures validation and long-term archiving of observation data, and provides climatology (T&S) data collections for calibration purposes. Identified major challenges:
  • Filling the present gap between the automatically collected time series, as acquired by network operators, and their validation, long-term archival and access provision by SeaDataNet data centres. This gap delays progress towards EMODnet Physics targets and the building of historical data products for reanalysis purposes;
  • Formulating marine OGC Sensor Web Enablement (SWE) standards for streamlining the (near) real time data flows from platforms to data centres, to detail relevant metadata of these systems and data flows, and to facilitate easy access by means of Sensor Observation Services (SOS);
  • Developing validation protocols and archiving systems for the data outputs of new measuring technologies, such as in situ biogeochemical sensors and new autonomous platforms, such as gliders and AUVs.

SeaDataCloud activities:

Activities are focused on upgrading existing and developing and introducing new standards, tools and services for both data providers and users. Innovative developments are aimed at considerably advancing the capabilities of the SeaDataNet infrastructure for handling a wide range of marine data types, at making it easier for European data providers to connect to the SeaDataNet infrastructure and to include their data resources, and at expanding the SeaDataNet offer by giving access to international data resources through interoperability. Activities are also aimed at users from many marine disciplines, providing them access to an extensive and steadily increasing pool of available marine and ocean data resources from a large network of data providers, supported by standard, interoperable and highly efficient services for discovery, access, and transformation of data resources, as well as a cloud environment with advanced services for analysis, quality control, sub-setting, and visualisation of retrieved datasets and for generation and publication of their data products.

SeaDataNet will cooperate with the EUDAT e-infrastructure service providers to build upon the state of the art in ICT and e-infrastructures for data, computing and networking. EUDAT will realise a common cloud and computing environment that will be integrated with the SeaDataNet infrastructure and its services.
 
There are several innovations planned, both in technology and in standards. A selection is given below to illustrate how SeaDataCloud aims to meet its challenges.

Innovation highlights in technology:
  • Upgrading the CDI Data Discovery and Access service using the cloud: a considerable innovation is planned for the CDI service by adopting a European cloud environment, to be provided by EUDAT, in the architecture; this cloud will host copies of the SeaDataNet data resources of the connected data centres. Exchange will take place by dynamic replication. Internal cloud services will check for possible duplicates and for the overall quality and integrity of the data formats and metadata relations. The results of these checks will be reported back to the data centres so that they can amend their submissions and/or their local configurations for mapping data and metadata. This will considerably improve the overall functioning and consistency of the SeaDataNet infrastructure. Transformation services will be included for converting to other required output formats such as SeaDataNet NetCDF and INSPIRE data models. In early July 2017 the specification document was released. It contains the architecture specification of the SeaDataNet data cloud, including the requirements for building the environment and the specification for upgrading the SeaDataNet CDI Data Discovery and Access service infrastructure using EUDAT services. The SeaDataNet infrastructure will be extended to leverage EUDAT capabilities, principally to allow individual data centres to replicate data onto EUDAT resources and so establish a data cache that will improve the overall performance of the SeaDataNet CDI service for fast delivery of quality-controlled data sets. Moreover, the document describes how the SeaDataNet CDI service system components, such as the Common Data Index (CDI) Import Manager and the Request Status Manager (RSM), will use the EUDAT API to interact with the EUDAT cloud. Activities are underway to develop the components following the specifications.

Figure: Current and planned CDI infrastructure

  • Developing integrated online services for ingesting autonomous observatory data: In situ autonomous observatories are increasingly used to observe the oceans. However, managing data from these observatories is often quite complex, because they include several sensors, data transmission systems, etc., and because they require a data centre to receive, decode and check the measured variables. SeaDataCloud will develop an ingestion service for these autonomous observatories. This will be based on the OGC Sensor Web Enablement (SWE) family of standards and will consist of an online service to describe the observatory (or the network of observatories) as SensorML metadata, and an online ingestion service to receive, decode and check data. SWE profiles will be worked out for selected instruments and platforms, in cooperation with other projects such as JERICO-Next, Eurofleets2, ODIP II, FixO3, NeXOS, SCHeMA and BRIDGES, and with EuroGOOS and Copernicus (CMEMS). The aims are easier ingestion and documentation of new observing stations, a streamlined data flow, and an optimal workflow for validating, archiving and giving access to long time series from operational oceanography stations.
  • Developing a preconfigured and pre-built virtual appliance system as a complete solution for new data centres to connect to the CDI service: This will include all necessary data management tools, easily deployable and ready to use with minimal setup, which will ease the maintenance and management of the distributed nodes and allow an automatic upgrade of the ‘Download – Replication Manager’ installed at each node. It will be easily deployable into a compatible virtualisation environment. Virtual appliances eliminate the need for dedicated physical hardware because they can run on virtual platform solutions. The SeaDataCloud virtual appliance will contain only the necessary software applications (e.g. web and application servers, Download – Replication Manager) with an operating system that is just sufficient for it to run optimally, which also makes the system less vulnerable to security breaches.
  • Developing a Virtual Research Environment with a packaged set of advanced downstream services for users: To make the overall SeaDataNet infrastructure more attractive and a service to which users, in particular researchers, will return much more often, and remain connected for longer sessions, the concept of incorporating a Virtual Research Environment (VRE) has emerged as a way to further satisfy user needs. This will facilitate collaborative and individual research from public, academic and private institutes concerning using, handling, analysing and processing ocean and marine data into value-added data products, which can be integrated, visualised and published using OGC and high level visualisation services. Thus the cloud environment will host a number of advanced services, seen as a packaged collection of processing services and that can be connected to subsets of the SeaDataNet data resources. These data subsets can be augmented by individual researchers and groups of researchers to give added-value products and new insights. A variety of advanced services will be offered by the Ocean Data View (ODV) and the Data-Interpolating Variational Analysis (DIVA) software for which online versions will be developed. The ODV service will include functionality for validating and harmonising large collections of data sets for specific data types, which will contribute to preparing data products such as temperature and salinity climatologies. The VRE will be set up in such a way that additional advanced services can be included without too much effort. To be competitive and very inviting for users of all targeted sectors, the VRE should be equipped with high capacity and performance for big data processing and state-of-the-art web visualisation services, making use of standards for wide interoperability, whilst respecting user privacy and differences in data policies.
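To make the duplicate and integrity checks mentioned in the first technology highlight more concrete, the following is a minimal sketch of how such checks could work in principle. It is not the SeaDataCloud or EUDAT implementation: the function names, the identifiers and the choice of SHA-256 content checksums are assumptions made purely for illustration.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict


def sha256_of(payload: bytes) -> str:
    """Return the hex SHA-256 digest of a data file's content."""
    return hashlib.sha256(payload).hexdigest()


def find_duplicates(submissions: dict[str, bytes]) -> list[list[str]]:
    """Group submission identifiers whose content is byte-for-byte identical."""
    by_digest: dict[str, list[str]] = defaultdict(list)
    for identifier, payload in submissions.items():
        by_digest[sha256_of(payload)].append(identifier)
    # Only digests shared by more than one submission indicate duplicates.
    return [sorted(ids) for ids in by_digest.values() if len(ids) > 1]


def verify_replica(source_digest: str, replica: bytes) -> bool:
    """Check that a replicated copy still matches the digest recorded at the source."""
    return sha256_of(replica) == source_digest


# Hypothetical submissions from two data centres (identifiers are invented).
submissions = {
    "centre-A/profile-001": b"depth,temp\n0,18.2\n10,17.9\n",
    "centre-B/profile-xyz": b"depth,temp\n0,18.2\n10,17.9\n",
    "centre-A/profile-002": b"depth,temp\n0,12.1\n",
}
duplicates = find_duplicates(submissions)
```

In this sketch the list of duplicate groups is what would be reported back to the data centres concerned. A real service would also need to recognise near-duplicates (the same observations in different formats), which a plain content checksum cannot detect.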
Innovation highlights in standards:
  • Common Vocabularies: further developments will be undertaken for the existing NERC Vocabulary Services (NVS 2.0) as in use for SeaDataNet for marking up metadata and data. New vocabularies will be developed as required for supporting the OGC-Sensor Web Enablement (SWE) standards that will be developed in SeaDataCloud for ocean observation platforms, research vessels, related instrumentation and sensors. This will be undertaken in collaboration with OGC to ensure no duplication of effort and to build on existing activity. Specialized ontologies for the technical vocabularies will be formalized and refined.
  • Pilot for applying Linked Data principles to the SeaDataNet common directories: Over recent years, the idea of publishing structured data on the World Wide Web has gained much traction through high-profile applications such as Facebook's Open Graph protocol, which allows media and previews of websites to enter the social networking platform and provides recommendations of pages and sites to users. Structured data is also at the heart of the World Wide Web Consortium's (W3C) best practices for publishing data on the Web. When structured data are made available online using the W3C's Resource Description Framework standard, and using web addresses to identify both the datasets being published and the definitions of their contents and related datasets, it is Linked Data as defined by Tim Berners-Lee (2006). At present the SeaDataNet directories lack machine-readable metadata as well as unique and persistent identifiers on the Web for addressing catalogue entries. A pilot will be undertaken to enhance the use of Linked Data and Semantic Web techniques for the SeaDataNet directories. Existing metadata schemas will be amended for areas which can be migrated to Linked Data representations, while ensuring compliance with the INSPIRE Spatial Data Infrastructure standards. As a next step some exemplar data will be published as Resource Description Framework (RDF) representations. RDF has an associated query language (SPARQL) which allows interrogation of links between datasets published in RDF. Once the exemplar data have been published as RDF, they can be harvested into SPARQL endpoints. This will enhance the discoverability of SeaDataNet data and information both within the SeaDataNet infrastructure and in more general online search facilities.
  • INSPIRE compliance for data formats: SeaDataCloud strives for full INSPIRE compliance and has planned the deployment of central transformation services for converting SeaDataNet data sets to other required output formats such as SeaDataNet NetCDF (CF) and relevant INSPIRE application schemas. It will be reviewed and specified how the SeaDataNet ODV format, used in SeaDataNet for a wide range of marine observation data, can be used as the basis for an INSPIRE compliant data format, following the INSPIRE Observations & Measurements (O&M) data model and guidelines. New RDF/OWL and JSON implementations are possibly relevant. Regarding INSPIRE compliance, the INSPIRE themes of relevance to SeaDataCloud (such as Environmental monitoring facilities) will also be identified, and the data models will be reviewed and introduced to the SeaDataCloud partners. In addition, the project will consider any extensions that may be required to the existing SeaDataNet NetCDF (CF) data format and will analyse a possible upgrade to NetCDF version 4.
  • Publishing and dissemination of upgraded standards: New and updated profiles and best practices will be published at the SeaDataNet portal and integrated in the SeaDataNet tools and services. They will also be submitted to relevant INSPIRE Thematic Clusters, OGC, ISO, W3C communities, and the IODE/JCOMM Ocean Data Standards and Best Practices project (ODSBP) for adoption and wider dissemination.
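The Linked Data idea described in the pilot above can be illustrated with a small sketch: statements are stored as (subject, predicate, object) triples, subjects are identified by web addresses, and a query follows links between statements much as a SPARQL query would. The sketch uses plain Python rather than an RDF library, and all addresses, terms and organisation names below are invented for illustration; they are not actual SeaDataNet identifiers.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical web addresses, assumed only for this illustration.
ORG = "https://example.org/organisation/"
TERMS = "https://example.org/terms/"

TRIPLES: List[Triple] = [
    (ORG + "1", TERMS + "name", "Institute A"),
    (ORG + "1", TERMS + "country", "France"),
    (ORG + "2", TERMS + "name", "Institute B"),
    (ORG + "2", TERMS + "country", "Netherlands"),
    (ORG + "3", TERMS + "name", "Institute C"),
    (ORG + "3", TERMS + "country", "France"),
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard (a query variable)."""
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


def names_in_country(triples: List[Triple], country: str) -> List[str]:
    """Join two patterns on the shared subject, roughly like the SPARQL query
    SELECT ?name WHERE { ?org terms:country "France" . ?org terms:name ?name }
    """
    names = []
    for org, _, _ in match(triples, predicate=TERMS + "country", obj=country):
        for _, _, name in match(triples, subject=org, predicate=TERMS + "name"):
            names.append(name)
    return sorted(names)


french_institutes = names_in_country(TRIPLES, "France")
```

Because every catalogue entry has its own web address in this model, other datasets can refer to it directly, which is the property the pilot aims to bring to the SeaDataNet directories.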
The following images give a schematic view on the upgrading of the infrastructure that is planned:

Figure: Present SeaDataNet architecture

Figure: Proposed upgraded architecture with data replication, advanced services and VRE in the cloud

The work plan for SeaDataCloud is organised as a cycle of activities that pass from operation to development to operation.  A very important aspect is that new services, components and standards must be implemented over the whole network and without causing any disturbance to the operational functioning of the infrastructure. This will be achieved by versioning of services, parallel installation and testing before moving to production, and careful coordination of upgrade implementation. In addition, it will involve a number of Training Workshops for data managers and technicians of the data centres to explain upgrades, to give hands-on training, and to provide instructions and guidance for local installation and configuration. The SeaDataNet infrastructure currently connects >100 data centres which are not all members of the SeaDataCloud consortium. However a budget provision is made for facilitating the involvement of the non-partners during the SeaDataCloud project implementation in the 2 training workshops and follow-on activities aimed at introducing, instructing and upgrading of the local SeaDataNet data centre nodes. This is necessary for successfully achieving innovative improvements and upgrading of the infrastructure over the whole network of connected SeaDataNet data centres, on behalf of the infrastructure and its users.   

Next to upgrading the SeaDataNet infrastructure, additional activities are devoted to further populating the SeaDataNet directories with new entries, monitoring the operational performance and availability of the infrastructure, its services and its nodes, promotion and dissemination, including organizing the IMDIS conferences, and further development of the sustainable organization and business model.

Moreover, activities are undertaken to review the present SeaDataNet data products and to release new versions, including additional data sets and taking into account the experience gained in earlier rounds.

Introducing the EUDAT network

Several activities in SeaDataCloud are aimed at innovation: upgrading existing (and adding new) services and widening virtual access provision by adding cloud and High Performance Computing (HPC) services. The latter is done in cooperation with the EUDAT network of e-infrastructure providers. This is both a technical cooperation, bundling expertise and infrastructures, and a strategic cooperation, as EUDAT is well engaged in the European Open Science Cloud (EOSC) initiative and SeaDataNet is well positioned as infrastructure and data management expert for the European marine and ocean data community. The SeaDataCloud partnership should lead to a win-win situation and long-term cooperation.

EUDAT is a European network of data and computing infrastructures that develop and operate a common framework for managing scientific data across European data centres and repositories and providing an interoperable layer of common data services. Covering both access and deposit, from informal data sharing to long-term archiving, and addressing identification, discoverability and computability of both long-tail and big data, EUDAT services aim to address the different phases of the research data lifecycle. With a network of more than 20 European research organisations, data and computing centres in 14 countries, the EUDAT Collaborative Data Infrastructure is the largest sustainable infrastructure of integrated data services and resources supporting research in Europe.

EUDAT participates in SeaDataCloud through 5 sites which joined the project as partners:  CSC (Finland), DKRZ (Germany), CINECA (Italy), STFC (United Kingdom), and GRNET (Greece). These 5 EUDAT centres will establish a common cloud and computing environment that will be integrated with the SeaDataNet data management infrastructure to provide central caching and cloud computing facilities. EUDAT brings in a number of EUDAT common services that will be adopted and, where needed, adapted for upgrading and optimizing the SeaDataNet Common Data Index (CDI) Data Discovery and Access service and for establishing the new SeaDataNet Virtual Research Environment (VRE). All services, upgraded and newly established in the SeaDataCloud project, will be accessible from the SeaDataNet portal.

Figure: Current members of the EUDAT Collaborative Data Infrastructure

Present status of SeaDataNet software tools

SeaDataNet has developed and maintains a set of tools to be used by each data centre and freely available from the SeaDataNet portal. It includes documentation and common software tools for metadata and data, statistical analysis and grid interpolation and a versatile software package for data analysis, QA-QC and presentation. As part of the SeaDataCloud project upgrades are undertaken taking into account new requirements. The following software versions are current:  

MIKADO, developed by IFREMER, is used to generate the XML metadata entries for the CDI, CSR, EDMED, EDMERP and EDIOS SeaDataNet catalogues. The latest version (3.3.5) was released in November 2016. The next version (3.4), in preparation (release planned after summer 2017), will include the upgrade of existing database drivers, the addition of a Csv.jdbc driver to configure CSV files, and a new facility to import NEMO CDI summary CSV files directly into MIKADO. See the video of the MIKADO instruction at the EMODnet Chemistry III training workshop in Trieste – Italy (May 2017).

NEMO, developed by IFREMER, converts ASCII files of vertical profiles, time series or trajectories to SeaDataNet format files, which can be in the text Ocean Data View (ODV) and MedAtlas formats or in the binary NetCDF format. The latest version (1.6.3) was released in May 2016. The next version (1.6.4), in preparation (release planned after summer 2017), will simplify the management of CSV files as NEMO input files and will manage the deprecated vocabulary. See the video of the NEMO instruction at the EMODnet Chemistry III workshop in Trieste – Italy (May 2017).

OCTOPUS, developed by IFREMER, replaces the former SeaDataNet tools ‘Med2medSDN’, ‘Change_vocab_V1toV2’, ‘MedSDN2CFPoint’ and ‘OdvSDN2CFPoint’, offering a single, ergonomic tool that converts files from one SeaDataNet format to another (e.g. ODV to NetCDF, MedAtlas to NetCDF, MedAtlas to ODV). OCTOPUS is also a format checker for ODV and MedAtlas SeaDataNet files. Furthermore, it can split a multistation SeaDataNet file into several monostation SeaDataNet files and extract stations from a multistation file. Finally, it is also designed to convert specific Magnetism, Gravimetry and Depth data files (MGD) to SeaDataNet ODV format. The first version of OCTOPUS (1.2) was released in February 2017. The latest version (1.3.9) was released on 21 July 2017 and includes a number of bug corrections and extra functions. A next version is in preparation (release planned for fall 2017) and will include the new conversion from NetCDF to ODV, a new format checker for NetCDF and the management of deprecated vocabulary. See the video of the OCTOPUS instruction at the EMODnet Chemistry III training workshop in Trieste – Italy (May 2017).

Download Manager, developed by IFREMER, connects data centres to the SeaDataNet infrastructure. Data centres can connect to the CDI Data Discovery and Access service so that data set requests are processed automatically as far as possible. To do so, a data centre has to install locally a Java component, the ‘Download Manager (DM)’, which handles all communication between the data centre system and the CDI RSM service and makes sure that requested files are made ready for downloading by users (if OK) via their personal download pages at the data centre. The DM software is maintained by IFREMER with input from MARIS and can deliver SeaDataNet NetCDF formats next to ODV ASCII data files for profiles, time series and trajectories. The latest version is V1.4.6 (July 2015). As part of the SeaDataCloud upgrading of the CDI Data Discovery and Access service, developments are ongoing for:

  • Revising the set-up and functionality of the Download Manager to become a Replication Manager that can interact with the local data centre configuration, the planned CDI import manager and the data cloud;
  • Developing the Download Manager and its future successor, Replication Manager, as a virtual appliance that will ease installation, configuration and version updating for many data centres. At present tests are underway with the first release for the current Download Manager.      
Ocean Data View (ODV) is a software package for the interactive exploration, analysis and visualization of oceanographic and other geo-referenced profile, time-series, trajectory or sequence data. ODV, developed by AWI, can display original data points or gridded fields based on the original data. ODV has two fast weighted-averaging gridding algorithms as well as the advanced DIVA gridding software built-in. Gridded fields can be colour-shaded and/or contoured. The ODV data format allows dense storage and very fast data access. Large data collections with millions of stations can easily be maintained and explored on inexpensive desktop and notebook computers. ODV also supports the netCDF format. Latest ODV4 Version: V4.7.10 (February 2017).

The DIVA software tool (Data-Interpolating Variational Analysis), developed by the University of Liege, allows observations to be spatially interpolated (or analysed) onto a regular grid in an optimal way. The analysis is performed on a finite element grid, which allows for a spatially variable resolution and a good representation of the coastline and isobaths. As some areas of the European seas have complex coastlines, the finite-element grid of DIVA is able to resolve those adequately. Latest version: V4.7.1 (June 2016).

All these tools can be downloaded together with user manuals and further documentation, where available, without any restriction from the SeaDataNet portal.

SeaDataNet and EMODnet

The European Marine Observation and Data network (EMODnet) was initiated in 2008 by EU DG MARE and from the start SeaDataNet members have been closely involved in its planning and implementation. SeaDataNet infrastructure and standards have been adopted for developing and supporting the EMODnet portals for physics, chemistry and bathymetry, while also contributing to biology and geology. The focus in EMODnet is on developing generic data products and serving these with effective interfaces and services to users from public, government, research and industry.

EMODnet projects and portals make full use of the SeaDataNet infrastructure for managing additional thematic measurement data as gathered from data partners, connecting new data providers to the SeaDataNet data discovery and access service, and for harvesting large data collections from the SeaDataNet infrastructure which serve as input for the production and publication of thematic data products with a European coverage.

The EMODnet initiative has finalized its second phase, and its results have convinced the European Parliament to agree to a third phase, which focuses on further development and publishing of generic products on a European scale and on more interaction with downstream parties for added-value products and services to reach out to customers. The third phase has been underway since early 2017. A selection of portals with SeaDataNet involvement is highlighted below.

EMODnet Chemistry III

EMODnet Chemistry aims at collecting, validating, and providing access to marine chemistry data, and generating and publishing marine chemistry data products, relevant for the implementation of the Marine Strategy Framework Directive (MSFD) and its stakeholders at national, regional and European levels.

In the previous phases EMODnet Chemistry gathered a large collection of marine chemistry data sets and, with the involvement of stakeholders and chemistry experts, developed data products concerning eutrophication and contaminants in water, seabed and biota for 5 major European sea regions: Baltic Sea, N.E. Atlantic (Celtic Seas, Iberian Coast, Bay of Biscay and Macaronesia), Greater North Sea, Mediterranean Sea, and Black Sea. The data products comprise spatially interpolated concentration maps with time evolution as well as harmonised, aggregated and validated data collections. These are highly relevant for MSFD indicators 5 (Eutrophication), 8 (Contaminants) and 9 (Sea-food contaminants). Moreover, the project has developed operational services for sharing and visualising these data and data products.

Figure: Search chemicals by regions: the Matrix indicates per sea region and per chemicals group by map and table how many measurement data are available.

A major challenge has been to manage the heterogeneity, complexity and large volume of the gathered datasets and to process these into harmonized data products for all European sea regions. This has been solved by applying consolidated (SeaDataNet) standards, controlled vocabularies, tools and services for data and metadata populated by data centres. Thereafter automated robot harvesting has taken place by means of the SeaDataNet CDI Data Discovery and Access service to deliver regional data collections for nutrients, oxygen, chlorophyll, and contaminants, provided by more than 60 data centres. In this way more than 700.000 chemistry data sets covering 1868 – 2016, from 64 data centres and 311 originators in 32 countries, have been gathered and handed over to the regional product teams; most concern the water column, with fewer for sediments and biota.

Using a common methodology including use of the SeaDataNet Ocean Data View software (ODV) and with input of marine chemical experts, harmonised, aggregated and validated regional data collections have been produced for the 5 major European sea regions. As part of this process, a Data Validation loop has been introduced to identify and correct errors at their local sources. As a next step, spatially interpolated regional map products have been computed from the harmonized data collections using the SeaDataNet Data-Interpolating Variational Analysis software (DIVA). Depending on sufficient spatial and temporal data coverage for the regions, maps have been produced for: Dissolved Oxygen, Nitrate, Phosphate, Nitrate plus Nitrite, Silicate, Ammonium, Total Nitrogen, Total Phosphorus, Chlorophyll-a and pH. Contaminant data (antifoulants, heavy metals, hydrocarbons, pesticides and biocides, polychlorinated biphenyls, and radionuclides) cover mainly coastal waters as part of national monitoring and are visualized as harmonised validated time series.

All data products have been ingested in dedicated viewing services on the EMODnet Chemistry portal, where users can browse and visualise observation densities and (animated) maps of temporal and spatial evolution (also in depth).

Figure: Spatial distribution of phosphate concentration in the European basins in winter for the decade 2003-2012

Priority was given to those parameters that are relevant for Member States, Regional Sea Conventions, and the EU for assessing the state of the European waters under the Marine Strategy Framework Directive. For that purpose experts from Regional Sea Conventions and the EU were engaged in dedicated workshops organised by EMODnet Chemistry for tuning products and discussing their fitness for purpose.

EMODnet Chemistry phase III started in March 2017 and the partnership includes 45 institutes from 27 countries and 3 international organisations (ICES, Black Sea Commission, UNEP/MAP) with overall coordination by OGS (Italy) and technical coordination by MARIS (Netherlands). The scope includes continuing and strengthening the data collection and product generation for the MSFD indicators 5, 8 and 9, engaging and tuning with major MSFD stakeholders. The scope is also extended to gathering data and generating products for indicator 10 (Marine Litter), in particular beach litter, seafloor litter as found in fishermen’s nets, and microplastics. This is undertaken in cooperation and synergy with existing initiatives at regional and European scales, including tuning with the TSG Marine Litter. Moreover, the portal and various access and viewing services will be upgraded.

EMODnet High Resolution Seabed Mapping

The previous EMODnet Bathymetry projects have resulted in a versatile EMODnet Bathymetry portal. As part of the project a harmonised bathymetry Digital Terrain Model (DTM) has been produced with a grid size of 1/8 minute * 1/8 minute (ca. 230 meter * 230 meter) for all sea basins in European waters. The EMODnet DTM has been based upon available bathymetric survey datasets as gathered from Hydrographic Offices (HOs), marine research institutes, governmental departments, and other organisations. In the latest EMODnet DTM version of October 2016 more than 7.700 survey data sets have been used, provided by 31 data holders from 18 countries. References to the used data and their data holders can be found in the source references layer. To that end, the gathered survey datasets have been described and included in the Common Data Index (CDI) Data Discovery and Access service that has been adopted from SeaDataNet. In practice a number of Hydrographic Offices did not deliver survey datasets but composite DTMs that they produced themselves from survey datasets. These composite DTMs have been included in the SeaDataNet Sextant Catalogue service as data products.
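
The quoted cell size of about 230 meter can be checked from the length of an arc minute. The following is a minimal sketch of that arithmetic, assuming a spherical Earth on which one arc minute of latitude equals one nautical mile (1852 m). The function name and the spherical approximation are illustrative assumptions and are not taken from the EMODnet methodology.

```python
import math

ARC_MINUTE_M = 1852.0  # assumption: one arc minute of latitude ~ one nautical mile


def cell_size_m(cells_per_arc_minute, latitude_deg):
    """Return (north-south, east-west) cell size in metres on a spherical Earth."""
    north_south = ARC_MINUTE_M / cells_per_arc_minute
    # Meridians converge towards the poles, so the east-west size shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for lat in (0, 45, 60):
    ns, ew = cell_size_m(8, lat)
    print(f"latitude {lat:2d}: {ns:.1f} m north-south, {ew:.1f} m east-west")
```

Under these assumptions the north-south size is 231.5 m at every latitude, while the east-west size of a 1/8 minute cell becomes smaller at higher latitudes.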

The operational CDI service for EMODnet Bathymetry now contains 14791 entries from 28 data centres from 15 countries. The number of CDI entries relevant for European waters now amounts to 11505 survey datasets, while the number of Sextant entries amounts to 77 composite DTMs from 19 data providers from 15 countries. Combined, the total amounts to 34 data providers from 19 countries.

Figure: overview of all CDI entries for survey data sets

Figure: CDI Quick Search user interface at the EMODnet Bathymetry portal

A common methodology has been applied for QA/QC and generation of the EMODnet DTM. A common software tool ‘GLOBE’ has been developed and applied for processing input datasets and generating the EMODnet DTM in a harmonised approach. The EMODnet methodology is documented in the ‘Manual on QA/QC and DTM generation’ which can be downloaded from the portal. A total of 11 regions have been defined to cover all European seas in an area from 36 degrees West to 43 degrees East and from 85 degrees North to 25 degrees North. For each region a Regional Coordinator has been responsible for producing the regional EMODnet DTM using the GLOBE software, a selection from the available survey data and composite DTMs, and their bathymetric expertise. Thereafter the regional DTMs have been validated and integrated into the overall EMODnet DTM.

Also close cooperation and synergy has taken place with the General Bathymetric Chart of the Oceans (GEBCO) because the GEBCO DTM (with a grid resolution of circa 1000 * 1000 meters) has been used to fill gaps in data coverage. This has resulted in a method whereby EMODnet DTM releases have been integrated in the next release of GEBCO for improving the bathymetry while vice versa the prevailing GEBCO releases have been integrated into the EMODnet DTM for filling gaps in geographical coverage. This exchange and synergy has resulted in reducing anomalies at boundaries and overall better results for both products.

Image: latest EMODnet DTM – October 2016 release

Figure: Detail of EMODnet DTM for Tyrrhenian Sea and Sicily – Italy in other color set

Altogether the EMODnet DTM contains 1.092.115.678 data points (28.799 rows x 37.922 columns) which are divided over 16 tiles which can be downloaded freely in various formats. The Bathymetry Viewing and Downloading service gives users functionalities for viewing individual and combined layers of the EMODnet DTM together with external map layers and for downloading components of the EMODnet DTM in a range of formats; each cell in the DTM gives a reference to the prevailing data source (survey dataset as described in CDI service; composite DTM as described in Sextant service; GEBCO_2014 in case of no other data sources) which is included in the DTM grid cell data.
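
The quoted number of data points follows from the grid dimensions, and those dimensions can be compared with the coverage stated earlier (36 degrees West to 43 degrees East, 85 degrees North to 25 degrees North) at 1/8 arc minute resolution. The sketch below does that arithmetic; the function names are illustrative assumptions.

```python
ROWS, COLS = 28_799, 37_922   # grid dimensions quoted in the newsletter
CELLS_PER_DEGREE = 60 * 8     # 1/8 arc minute cells, 60 arc minutes per degree


def total_points(rows, cols):
    """Number of cells in a regular grid."""
    return rows * cols


def expected_cells(start_deg, end_deg, cells_per_degree=CELLS_PER_DEGREE):
    """Number of cells needed to span an interval given in degrees."""
    return round(abs(end_deg - start_deg) * cells_per_degree)


print(total_points(ROWS, COLS))   # 1092115678
print(expected_cells(-36, 43))    # columns for 36 W to 43 E: 37920
print(expected_cells(25, 85))     # rows for 25 N to 85 N: 28800
```

The product matches the quoted 1.092.115.678 data points. The dimensions estimated from the coverage differ from the quoted ones by two columns and one row; the newsletter does not say how the grid edges are defined.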

The GIS layers in the Bathymetry Viewing and Download service can be shared as OGC WMS and WCS services with other EMODnet portals and beyond. Also WMS layers from other EMODnet portals and external services can be added to the Bathymetry Viewer and Download service. The URLs for the OGC services can be found at the portal. The NetCDF format can be used in combination with the 3D viewer software, which you can download from the portal. A video of the EMODnet DTM in 3D can be found at YouTube.
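
As an illustration of how an OGC WMS layer is requested, the sketch below builds a WMS 1.3.0 GetMap URL from the standard request parameters. The endpoint and layer name are hypothetical placeholders (the real ones are listed at the portal), and the sketch assumes that the server supports the CRS:84 coordinate reference system, which uses longitude, latitude axis order.

```python
from urllib.parse import urlencode

# Hypothetical values: the real endpoint and layer names are listed at the portal.
BASE_URL = "https://example.org/wms"
LAYER = "example_bathymetry_layer"


def getmap_url(base_url, layer, bbox, width, height):
    """Build a WMS 1.3.0 GetMap URL; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",        # empty value requests the default style
        "CRS": "CRS:84",     # assumption: the server offers this CRS
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Bounding box taken from the DTM coverage stated in the text.
url = getmap_url(BASE_URL, LAYER, (-36, 25, 43, 85), 790, 600)
print(url)
```

The same parameter set works for WMS layers from other portals once the base URL and layer name are replaced with the values those services publish.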

The new phase for EMODnet Bathymetry started in December 2016 and is named ‘EMODnet High Resolution Seabed Mapping’. Compared to the previous phase even more hydrographic services and research institutes have joined the partnership, which now counts 41 organisations from 20 countries around the European seas. The new phase is coordinated by Shom (France) with MARIS (Netherlands) as technical coordinator. The scope concerns further upgrading of the EMODnet DTM by incorporating even more surveys and further improvement of the digital bathymetry. Satellite Derived Bathymetry (SDB) data will also be included to cover gaps in survey coverage. Moreover, developments are underway to increase the overall resolution from a 1/8 to a 1/16 arc minute grid and to expand the coverage of the DTM to include the European coastal zones as well as the European Arctic region and the Barents Sea. Higher resolution DTMs will be developed and made available where data are available and released for publication by their owners.

EMODnet Data Ingestion

The ‘EMODnet Ingestion and safe-keeping of marine data’ project is a new project that seeks to identify and reach out to organisations from the research, public and private sectors that hold marine datasets and are not yet connected or contributing to the existing marine data management infrastructures which are driving EMODnet. Those potential data providers should be motivated and supported to release their datasets for safekeeping and subsequent free distribution and publication through EMODnet.

The new project started May 2016 and is undertaken by a European consortium of 44 organisations from 29 coastal countries. Most partners are established SeaDataNet data centres and the consortium also includes coordinators of the EMODnet thematic data portal projects.

The emphasis of activities in the first year has been put towards developing the EMODnet Data Ingestion portal and its services for ingesting and publishing data sets, developing the pathways for processing and elaborating of data submissions, laying a basis for promotion and marketing activities, and making an initial inventory of potential data sources and their providers.

The EMODnet Data Ingestion portal was launched in early February 2017. It encourages data providers to share marine data and provides a submission service and marine data management guidance information.

Figure: EMODnet Data Ingestion portal homepage

The Ingestion portal aims at organisations not yet routinely submitting data sets to national data centres and not yet used to marine data management practices and standards. A low threshold is offered by splitting the completion of the submission form in 2 parts:

  • Part 1 submission form: a number of key fields to be completed by the data submitter, including uploading of a zip file with the data sets and related documentation;
  • Part 2 submission form: review of the received part 1 and consecutive completion of the additional metadata fields by an assigned data centre.
The EMODnet Data Ingestion portal is in the public domain and all submitted data sets are considered open data. However, data providers and data centres must register through the Marine-ID service to use the Submission service and its range of functions.

Figure: Data Submission workflow for Phase I going from ingestion till publishing ‘as is’  

A distinction is made between 2 phases in the life cycle of a data submission:
  • Phase I:  from submission to publishing of the submitted datasets package ‘as is’
  • Phase II: further elaboration of the data sets and integration (of subsets) in national, European and EMODnet thematic portals.

This split allows the original data package to be published at an early stage with high-quality metadata.

The ‘pathways’ for curating and elaborating submitted data sets follow a common workflow: data are 1) submitted by data holding organisations and published ‘as is’ at the Ingestion portal with support of assigned data centres; 2) validated, processed and stored in dedicated national data centres; 3) populated in the appropriate European infrastructure, such as SeaDataNet and EurOBIS, and 4) made part of EMODnet data portals.
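
The four workflow steps can be sketched as a simple ordered state model. This is only an illustration and not how the Ingestion portal is implemented: the class and function names are hypothetical, and mapping step 1 to Phase I and steps 2 to 4 to Phase II is one reading of the phase definitions given earlier.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Workflow steps, numbered as in the paragraph above."""
    PUBLISHED_AS_IS = 1          # submitted and published 'as is' at the Ingestion portal
    NATIONAL_DATA_CENTRE = 2     # validated, processed and stored
    EUROPEAN_INFRASTRUCTURE = 3  # populated in e.g. SeaDataNet or EurOBIS
    EMODNET_PORTAL = 4           # made part of an EMODnet data portal


def phase(stage):
    """Assumed mapping: Phase I ends with 'as is' publication, later steps are Phase II."""
    return "I" if stage == Stage.PUBLISHED_AS_IS else "II"


def advance(stage):
    """Move a submission to the next step; the final step has no successor."""
    if stage == Stage.EMODNET_PORTAL:
        raise ValueError("submission is already part of an EMODnet portal")
    return Stage(stage + 1)


stage = Stage.PUBLISHED_AS_IS
while True:
    print(f"step {int(stage)}: {stage.name} (phase {phase(stage)})")
    if stage == Stage.EMODNET_PORTAL:
        break
    stage = advance(stage)
```

Keeping the steps strictly ordered reflects the text, in which each step builds on the previous one; a real system would also need to record the assigned data centre and handle rejected submissions.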

Submission forms with data packages are assigned to qualified data centres depending on the country of the data provider and the type of EMODnet theme. These are data centres that are partners in the EMODnet Ingestion project and data centres that are partners in each of the EMODnet thematic portals.

For operational oceanography there is close cooperation with EMODnet Physics. This concerns identifying and arranging the inclusion of additional stations for Near Real Time (NRT) data exchange. The Data Ingestion portal explains how the NRT exchange is organised with EuroGOOS – Copernicus and gives guidance on how to connect in practice. Furthermore, together with SeaDataCloud, activities are undertaken to develop a Sensor Web Enablement (SWE) pilot for Real Time data exchange.

Figure: EMODnet Data Ingestion bookmarks

Promotion and outreach activities are as important as technical developments. In the first year these have focused on establishing cooperation and synergy within the EMODnet community, and preparatory work has been undertaken for wider outreach and marketing to potential data providers. A first inventory of potential data providers per country has been compiled. This inventory, covering 26 countries and reporting 466 potential data sources, is now being followed up by EMODnet Ingestion members, who are approaching the identified organisations about submitting their data sets.

Figure: Poster in Brussels metro

EMODnet Open Sea Lab - Coming Soon!

EMODnet organizes a three-day open data bootcamp from 15-17 November 2017 in Antwerp, Belgium, for ideating and co-creating innovative solutions to unique problems using EMODnet’s wealth of marine data and ocean observations. Are you a coder, developer or data enthusiast? Maybe you are a maritime company with a unique challenge or an environmental manager interested in new tools to map or manage your marine space? Or are you a student or an entrepreneur with a novel idea? Whatever your experience or interest, EMODnet Open Sea Lab will bring together experts in European marine data management from EMODnet with experts in digital innovation from IMEC to work with you to turn innovative ideas into working applications! The Open Sea Lab will match-make and bring together teams and will involve ideation, co-creation, prototype development and validation by mentors.

To find out more visit www.opensealab.eu and register your interest to be kept informed!

Pilot Blue Cloud - formulating a Blue Blueprint for the EOSC

The European Commission aims to deliver a European Open Science Cloud (EOSC) by 2020. The Open Science Cloud will federate existing and future thematic clouds. At the 2016 G7 S&T Ministerial, Commissioner Moedas proposed the development of a Pilot Blue Cloud as a pilot initiative of the European Open Science Cloud and as a contribution from the Commission to the further development of the G7 Future of the Seas and Oceans Initiative.

As follow-up EU DG-RTD organised two Pilot Blue Cloud workshops in coordination with DG–GROW, DG–CNECT, and DG–MARE to develop a common vision and strategy for developing the Pilot Blue Cloud. The workshops brought together representatives of projects, infrastructures, initiatives and networks that can be considered as possible building blocks of the Pilot Blue Cloud. These included inter alia:

  • EGI, EUDAT, Copernicus and the H2020 project EOSC Pilot, engaged in development of the overall EOSC infrastructure;
  • SeaDataCloud, AtlantOS, ODIP2, EMODnet, EMBRC, EBI, EMSO, PANGAEA, EuroGOOS, CMEMS INSTAC, Euro-Argo, ICES and EurOBIS, projects and major infrastructures engaged in marine data management;
  • BlueBridge, EVER-EST, ENVRI-PLUS, SeaDataCloud, and EMODnet, projects engaged in developing Virtual Research Environments (VRE) for marine data applications;
  • IOC-IODE and OBIS, global programmes for marine data management.
The "Blue" research community is particularly well placed to offer its contribution and exploit the EOSC potential as it is already well organised and has many cooperations. The Pilot Blue Cloud can be one of the demonstrators of the EOSC.

The key messages resulting from the Workshops are:
  • All the participants recognise the need to develop a Pilot Blue Cloud to ensure that the European Open Science Cloud (EOSC) will provide a level playing field in which to bring together the data and tools that are essential for blue science.
  • This tool will only be successful if it is developed putting researchers at the centre from the start.
  • With this goal in sight, the participants propose to formulate a Blue Blueprint for the EOSC, contributing the experience of its Pilot Blue Cloud to the overall architecture of the EOSC.
A next Workshop will be organised in the Fall of 2017, following the next meeting of the EOSC Summit.

IMDIS 2016 Conference

Around 2005 SeaDataNet initiated the IMDIS conference - International Conference on Marine Data and Information Systems. The IMDIS cycle of conferences aims to provide an overview of the existing information systems that serve different users in ocean science. It also shows progress on the development of efficient infrastructures for managing large and diverse data sets, standards, interoperable information systems, and services and tools for education.

IMDIS 2016, the 5th International Conference on Marine Data and Information Systems, was held on 11–13 October 2016 in Gdansk (Poland). It was organised by IO PAN (Poland), IMGW (Poland), IFREMER (France) and OGS (Italy). IMDIS 2016 was a great success with 110 participants from 24 countries. There were 53 oral presentations and 54 posters.

Figure: The ECS (Europejskie Centrum Solidarnosci) in Gdansk, Poland.

The proceedings have been published in the Vol. 57 supplement of the Bollettino di Geofisica Teorica ed Applicata. You can also download an electronic version of the book of abstracts, as well as PDF versions of the oral presentations and the posters. A huge thanks to all of you who contributed to IMDIS.

The SeaDataCloud project will organise 2 more editions of the IMDIS conferences which appeal to an international audience of experts engaged in marine and ocean data acquisition, management infrastructures and data applications. The IMDIS conferences will also give an excellent opportunity to present various aspects of SeaDataCloud developments and resulting upgraded and new SeaDataNet services, and to compare these with international developments.

Acronyms as used in this Newsletter

This newsletter contains many acronyms which are described in the following list:  

API: Application Programming Interface
CDI: Common Data Index
CF: Climate and Forecast
CMEMS: Copernicus Marine Environment Monitoring Service
CSR: Cruise Summary Reports
CSV: Comma Separated Values
CS-W: Catalogue Service for the Web
DIVA: Data-Interpolating Variational Analysis software
DOI: Digital Object Identifier
DTM: Digital Terrain Model
EDIOS: European Directory of Oceanographic Observing Systems
EDMED: European Directory of Marine Environmental Data
EDMERP: European Directory of Marine Environmental Research Projects
EDMO: European Directory of Marine Organisations
EMODnet: European Marine Observation and Data Network
EOSC: European Open Science Cloud
EuroGOOS: European Global Ocean Observing System
GEBCO: General Bathymetric Chart of the Oceans
GEOSS: Global Earth Observation System of Systems
HPC: High Performance Computing
ICES: International Council for the Exploration of the Sea
ICT: Information and Communication Technologies
IMDIS: International Conference on Marine Data and Information Systems
IOC: Intergovernmental Oceanographic Commission
IODE: International Oceanographic Data and Information Exchange
ISO: International Organization for Standardization
JSON: JavaScript Object Notation
MSFD: Marine Strategy Framework Directive
NetCDF: Network Common Data Form
NODC: National Oceanographic Data Centre
NRT: Near Real Time
NVS: NERC Vocabulary Services
OBIS: Ocean Biogeographic Information System
ODIP: Ocean Data Interoperability Platform
ODV: Ocean Data View software
ODSBP: Ocean Data Standards and Best Practices project
OGC: Open Geospatial Consortium
O&M: Observations & Measurements
OWL: Web Ontology Language
QA: Quality Assurance
QC: Quality Control
RDA: Research Data Alliance
RDF: Resource Description Framework
RSM: Request Status Manager
RTD: Research and Technological Development
SDB: Satellite Derived Bathymetry
SDC: SeaDataCloud
SDN: SeaDataNet
SWE: OGC Sensor Web Enablement
URL: Uniform Resource Locator
VRE: Virtual Research Environment
W3C: World Wide Web Consortium
WCS: Web Coverage Service
WFS: Web Feature Service
WMS: Web Map Service
XML: Extensible Markup Language