Pan-European infrastructure for ocean & marine data management

Newsletter

13 May 2013

Content:

Introduction

SeaDataNet is a pan-European infrastructure providing harmonised discovery services and access to ocean and marine environmental data sets managed in distributed data centres. The partnership comprises 44 institutions directly involved in the project as partners and 10 other institutions as associate partners, from 35 countries riparian to European seas. Through its engagement in other European projects and initiatives, such as Geo-Seas, Upgrade Black Sea SCENE, and the EMODNet pilots, 83 data centres are at present connected to the SeaDataNet infrastructure, giving access to more than 1.3 million data sets for physical oceanography, chemistry, geology, geophysics, bathymetry and biology.


The data are collected in research projects and monitoring programmes and archived in local data bases of the SeaDataNet data centres; corresponding metadata, based upon the ISO19115 standard and completed using common SeaDataNet tools and SeaDataNet controlled vocabularies, where possible, are maintained in a central catalogue with user friendly interfaces. This way the SeaDataNet portal provides a 'single stop shop' for all kinds of users to discover and request access to ocean and marine data sets acquired by European organisations for the European seas and global oceans.

SeaDataNet addresses the needs of different user communities working in science, environmental management, policy making, and economic sectors. Better integrated data systems are vital for these users to achieve improved scientific research and results, to support marine environmental and integrated coastal zone management, to establish indicators of Good Environmental Status for sea basins, and to support offshore industry developments, shipping, fisheries, and other economic activities.

SeaDataNet also maintains close contact and collaboration with research groups (both public and private) that are developing concepts and methodologies in informatics, because SeaDataNet is one of the major players in informatics in oceanography. It is setting and governing standards, and exploring and establishing interoperability solutions to connect to other e-infrastructures on the basis of standards of ISO (19115, 19139), OGC (WMS, WFS, CS-W and SWE), OpenSearch, OpenID and Shibboleth. In particular, SeaDataNet is establishing exchanges from its infrastructure and portal to GEOSS, the Ocean Data Portal (ODP) of IOC-IODE, EurOBIS and the European Nucleotide Archive (ENA) of EMBL-EBI. Moreover, SeaDataNet works together with the EU projects Eurofleets I and II (research vessels) and JERICO (monitoring sites) on developing SWE standards for oceanography.

In autumn 2012 this collaboration was extended with active participation in the Ocean Data Interoperability Platform (ODIP) project, in which SeaDataNet cooperates with leading oceanographic data infrastructures from the USA and Australia, as well as IOC-IODE, to explore common standards and interoperability solutions.

The funding of SeaDataNet is secured until October 2015 through the EU SeaDataNet II project with the aim to sustain an operationally robust and state-of-the-art Pan-European infrastructure for providing up-to-date and high quality access to ocean and marine metadata, data and data products. In addition SeaDataNet's adoption as leading component for data management in the development of EMODNet (European Marine Observation and Data Network) contributes to its perspective towards long term sustainability.

Activities in SeaDataNet II are:

  • Operational maintenance of the SeaDataNet pan-European catalogues (CDI (data), EDMED (large data sets), EDMERP (projects), EDMO (organisations), EDIOS (monitoring systems), CSR (Cruise Summary Reports)) for providing more and up-to-date information and data
  • Seeking INSPIRE Directive compliance: adopting the ISO-19139 standard for XML description and OGC-CSW exchange of the CDI and CSR catalogues
  • Developing machine to machine interfaces next to existing user interfaces for data and product distribution to regular user communities (operational oceanography, MSFD, ..)
  • Improving data sets duplicate management and introducing Data Preview services before requesting downloading
  • Improving the capability for handling also marine biological data in close cooperation with EurOBIS, MarBEF, and other biological data initiatives
  • Expanding the Product portfolio by aggregated data sets at full basin scale using the data sets made available via the SeaDataNet infrastructure
  • Deploying true synergies with other projects such as Eurofleets and Jerico for streamlining the flow of data from acquisition at Research Vessels and Fixed Monitoring systems to data centres, also exploring Sensor Web Enablement (SWE), and with MyOcean for improved data exchange from real-time systems and data centres to the ocean modelling community
  • Reinforcing the SeaDataNet base of data holders; increasing the number of connected data centres as well as the volumes and types of data sets to be accessed via the SeaDataNet services
  • Maintaining and extending the driver role of the SeaDataNet infrastructure and its community in the further development and implementation of the European Marine Observations and Data Network (EMODNet) as initiated by the EU in the framework of the MSFD Directive
  • Seeking interaction and synergy with international standards developments for IT (OGC, ISO), ocean data management (IODE Ocean Data Standards initiative) and interoperability with European and global portals (INSPIRE, GEOSS, Ocean Data Portal)  
  • Organising feedback from user communities in order to improve services of the infrastructure.

This Newsletter is presenting a number of the tools for data management as developed by SeaDataNet, and highlighting some key achievements in SeaDataNet II so far. The Newsletter also gives information on the collaboration with other initiatives in data management and information systems. Furthermore you are invited to register and participate in the IMDIS 2013 Conference that will be organised and hosted by SeaDataNet in Lucca - Italy from 23 to 25 September 2013.

We hope you enjoy the newsletter and that it encourages you to visit the SeaDataNet portal at http://www.seadatanet.org/ to try out its services and follow its evolution.

SeaDataNet tools - NEMO

NEMO is reformatting software used to generate data files in SeaDataNet formats. These formats are defined so that an external user downloading data from the SeaDataNet portal receives all data in the same standard format, no matter which SeaDataNet data centre they come from. The objective of NEMO is therefore to reformat ASCII files of vertical profiles (such as CTD, bottle, and XBT data), time series (such as current meter and sea level data) or trajectories (such as thermosalinograph data) into SeaDataNet standard formats (ODV, NetCDF and MEDATLAS). The NEMO tool can be downloaded without any restriction from the SeaDataNet portal.

History
CONVMED, the forerunner of NEMO, has been used at IFREMER/SISMER since the end of the 1990s for reformatting the data received at the data centre to the MEDATLAS format, which is the archive format for CTD, bottle and time-series data in the French data centre.
NEMO has been available to SeaDataNet users since 2008; version 1.11 was first demonstrated at the 3rd training course of the SeaDataNet project in June 2008, in Ostend, Belgium.
Since then, twelve releases have been delivered to SeaDataNet users, each with new functionality and/or bug fixes. The current release is NEMO version 1.4.5.
NEMO has received funding from the European Union:

  • Sixth Framework Programme (FP6/2002-2006) under grant agreement n° 026212, SeaDataNet
  • Seventh Framework Programme (FP7/2007-2013) mainly under grant agreement n°283607, SeaDataNet II and also under n°238952, Geo-Seas for supporting non numeric measured parameters.

Future
The next major NEMO release (version 1.5.0) is planned for the second quarter of 2013. Additions will be the possibility to generate output files in NetCDF format and compliance with the new version 2 of the SeaDataNet Controlled Vocabularies (NVS 2.0), which will be introduced during 2013 as part of the 1st innovation cycle of SeaDataNet II.

Description of NEMO functions
As the files to be reformatted can be in all kinds of ASCII formats, NEMO must be able to read all these formats and translate them to SeaDataNet formats.

To do so, the user of NEMO describes the format of the input files so that NEMO can find the information needed to generate the files in SeaDataNet formats.
One very important prerequisite is that, in the input files, the information about the stations must always be located at the same position: same line in the file and same position on the line, or same column in the case of Comma-Separated Values (CSV) format. Furthermore, station information must be in the same format for all stations.

To convert the input files, the user proceeds in 5 steps:

  1. Describe the type of file and the type of measurements: one file for one cruise, one directory with n files for one cruise, n directories for n cruises, or n files not related to cruises; file with separators; vertical profiles, time series or trajectories. Choose the output format (ODV, NetCDF or MEDATLAS).
  2. Describe the cruise, only if the files are related to one cruise and only if NetCDF or MEDATLAS is the output format.
  3. Describe the station information: all metadata available on the station in the input file can be described so that it is kept in the output file; some information is mandatory, such as data type, date, time, latitude and longitude.
  4. Describe the measured parameters: all measured parameters that need to be kept in the output file must be described. The description includes parameter code, name and unit; location in the input file; format in the output file (number of decimals); and default values. A formula can also be applied to the value of the parameter in the input file, which is useful for unit conversion, for example.
  5. Convert the input file(s)
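
The description-driven approach of steps 3 to 5 can be illustrated with a minimal Python sketch. It is not NEMO code: the `ParameterDescription` class, the field names, the parameter codes and the input layout are all hypothetical, chosen only to show how a user-supplied description (column position, number of decimals, default value, optional formula) can drive the conversion of one separated input line.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ParameterDescription:
    """Hypothetical description of one measured parameter (not NEMO's own model)."""
    code: str                    # parameter code written to the output
    unit: str                    # unit of the value after any conversion
    column: int                  # 0-based column in the separated input line
    decimals: int                # number of decimals in the output
    default: str                 # value written when the input cell is empty
    formula: Optional[Callable[[float], float]] = None  # e.g. a unit conversion


def convert_line(line: str, params: list[ParameterDescription], sep: str = ",") -> dict[str, str]:
    """Convert one data line into formatted output values keyed by parameter code."""
    cells = [cell.strip() for cell in line.split(sep)]
    out = {}
    for p in params:
        raw = cells[p.column] if p.column < len(cells) else ""
        if raw == "":
            out[p.code] = p.default          # missing value: use the described default
            continue
        value = float(raw)
        if p.formula is not None:
            value = p.formula(value)         # apply the user-defined formula
        out[p.code] = f"{value:.{p.decimals}f}"
    return out


# Illustrative description: depth arrives in centimetres and is written in metres.
PARAMS = [
    ParameterDescription("DEPTH", "m", column=0, decimals=1, default="-999",
                         formula=lambda cm: cm / 100.0),
    ParameterDescription("TEMP", "degC", column=1, decimals=2, default="-999"),
]

result = convert_line("1250, 13.4567", PARAMS)
missing = convert_line("500,", PARAMS)
print(result)
print(missing)
```

Keeping the description separate from the conversion routine is what lets the same routine handle differently laid-out input files: only the list of descriptions changes.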


SeaDataNet tools - MIKADO

MIKADO is an ISO-19115 XML catalogue description generator used to create XML files for metadata exchange of:

  • Marine Environmental Datasets (EDMED),
  • Cruise Summary Reports (CSR),
  • Plan Cruise Report (PCR),
  • Common Data Index to individual datasets (CDI),
  • European Directory of Marine Environmental Research Projects (EDMERP),
  • European Directory of the Ocean Observing System (EDIOS),
  • Seismic SensorML,
  • Seismic O&M (Observations And Measurements).

The MIKADO tool can be downloaded without any restriction from the SeaDataNet portal.

History
MIKADO was designed by IFREMER during the European project SEA-SEARCH (2002-2006) to support partners in generating the XML files that are entries for the EDMED, CSR and CDI directories, so that they could provide their contributions to the central directories.

In the framework of the European program SeaDataNet, MIKADO has been upgraded to include the EDMERP and EDIOS directories and to support new functionalities, such as:

  • User-friendly interface to help the user generate XML files manually or automatically,
  • Use of the SeaDataNet Vocabularies to ensure the consistency of input data and facilitate homogenization and standardization of XML output files,
  • Demand-driven incremental mapping between the user database and common vocabularies,
  • Generation of the coupling table for the Download manager.

MIKADO has been committed to providing the mechanism for the generation of the Plan Cruise Report (PCR) in the frame of the international Partnership for Observation of the Global Oceans (POGO).

In the framework of the European project Geo-Seas, MIKADO has been upgraded to support new functionalities for describing geological and geophysical data:

  • GML extensions for CDI to describe data collected along ship track,
  • Production of O&M and SensorML documents to take into account needs specific to the description of seismic data and for accessing and viewing them.
  • Generation of the coupling table for Downloading and Viewing services of seismic data in the Geo-Seas portal.

MIKADO has been available to SeaDataNet users since 2007; version 1.01 was first demonstrated at the 1st training course of the SeaDataNet project in February 2007, in Ostend, Belgium.
Since then, sixteen releases have been delivered to SeaDataNet users, each with new functionality and/or bug fixes. The current release is MIKADO version 2.5.

MIKADO has received funding from the European Union:

  • Sixth Framework Programme (FP6/2002-2006) under grant agreement n° 026212, SeaDataNet
  • Seventh Framework Programme (FP7/2007-2013) under grant agreement n°283607, SeaDataNet II and also under n°238952, Geo-Seas for GML, O&M and SensorML extensions.

Future
The next major MIKADO release (version 3.0) is planned for the second quarter of 2013. Important new features will be the migration of the Common Data Index (CDI) from the ISO-19115 to the ISO-19139 standard and compliance with the new version 2 of the SeaDataNet Controlled Vocabularies (NVS 2.0), which will be introduced during 2013 as part of the 1st innovation cycle of SeaDataNet II.

Description of MIKADO functions
MIKADO can be used in 2 different ways:

  • Manually, entering information for the catalogues by hand in order to generate XML files. Manual input is well suited when only a small number of XML descriptions has to be created. This mode creates XML files one by one and does not require specific database knowledge. The MIKADO manual interface has the same design and behaves in the same way for all the catalogues. It is composed of user-friendly forms, divided into thematic tabs, that the user has to fill in.
  • Automatically, generating the descriptions from information stored in a relational database or in an Excel file. MIKADO helps users define their own configuration in two steps:
    1. Definition of the connection parameters for accessing the user's local database: these parameters (driver class name, JDBC connect URL, username/password if necessary) are pre-filled by MIKADO for all database management systems;
    2. Mapping between the user’s database and the XML format: this mapping is performed by SQL queries that extract information from the database. Knowledge of database management and SQL is required here, but MIKADO provides pre-formatted queries to help the user, who only has to complete the queries and associate the content of the database with the clearly labelled XML fields.
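
The idea behind the automatic mode, an SQL query whose result columns are associated with XML fields, can be sketched as follows. This is an illustration only and not MIKADO itself: MIKADO connects through JDBC, whereas the sketch uses Python with an in-memory SQLite database, and the table, column and XML element names are hypothetical and do not follow the ISO-19115 schema.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical mapping: XML field name -> column returned by the SQL query.
FIELD_MAPPING = {
    "datasetName": "name",
    "startDate": "start_date",
    "originatorCode": "edmo_code",
}

# Stands in for a pre-formatted query that the user has completed.
QUERY = "SELECT id, name, start_date, edmo_code FROM datasets ORDER BY id"


def records_to_xml(conn: sqlite3.Connection) -> list[str]:
    """Return one XML string per database row, built from FIELD_MAPPING."""
    conn.row_factory = sqlite3.Row          # allows access to columns by name
    documents = []
    for row in conn.execute(QUERY):
        root = ET.Element("record", attrib={"id": str(row["id"])})
        for xml_field, column in FIELD_MAPPING.items():
            ET.SubElement(root, xml_field).text = str(row[column])
        documents.append(ET.tostring(root, encoding="unicode"))
    return documents


# In-memory stand-in for the user's local database; all values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets (id INTEGER, name TEXT, start_date TEXT, edmo_code INTEGER)")
conn.execute("INSERT INTO datasets VALUES (1, 'CTD cast A', '2012-10-01', 9999)")
docs = records_to_xml(conn)
print(docs[0])
```

One XML document is produced per row, which mirrors the one-description-per-record output of the manual mode while leaving the database layout entirely to the user's query.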


SeaDataNet tools - DOWNLOAD MANAGER

DOWNLOAD MANAGER is software used to download data files from the SeaDataNet portal.
DOWNLOAD MANAGER has to be installed and configured at each SeaDataNet node so that the node can receive and process the data requests that users submit to the SeaDataNet portal.
DOWNLOAD MANAGER provides the following functions:

  • Interaction with the portal’s Request Status Manager (RSM) for receiving user requests on local datasets which are described in the CDI catalogue
  • Interaction with the local data system for retrieving datasets in SeaDataNet formats
  • Provision of access to RSM for transporting data files to users.

The DOWNLOAD MANAGER tool can be downloaded from the SeaDataNet EXTRANET by data centres that are interested in connecting to the SeaDataNet infrastructure and making their data discoverable and accessible through the SeaDataNet portal. Interested data centres beyond the consortium are advised to contact the SeaDataNet support desk (sdn-userdesk@seadatanet.org). Installation and configuration require professional guidance from the SeaDataNet overall support desk and the CDI support desk.

History
The DOWNLOAD MANAGER prototype was developed by the Russian NODC: the requirements analysis phase ran from the end of 2007 to the end of 2008 and the software design phase from the end of 2008 to the end of 2009.
The DOWNLOAD MANAGER operational version was designed by IFREMER at the beginning of 2010.
DOWNLOAD MANAGER evolution and maintenance are carried out by IFREMER, with the following major updates:

  • DOWNLOAD MANAGER v1.1f, Structure revision:
    • Servlets are still provided by DOWNLOAD MANAGER to let the RSM notify it about new requests to be processed, to display the user’s download page and to monitor DM status and log (userPage, download, controller and status).
    • Main functions have been moved from the Controller servlet into several Java batch programs: request processing (DM_Batch), deletion of old user files (DM_CleanerBatch) and checks processing (DM_Checker).
  • DOWNLOAD MANAGER v1.2.0:
    • Nagios monitoring
    • RESTful web service support for modus 1
  • DOWNLOAD MANAGER v1.4.0:
    • Implementation of advanced services, such as visualisation of Geo-Seas seismic data managed by the HRSVS module.
DOWNLOAD MANAGER has received funding from the European Union:
  • Sixth Framework Programme (FP6/2002-2006) under grant agreement n° 026212, SeaDataNet
  • Seventh Framework Programme (FP7/2007-2013) under grant agreement n°283607, SeaDataNet II and also under n°238952, Geo-Seas for advanced viewing service on seismic datasets.

Future
The next major DOWNLOAD MANAGER release (version 1.4.2) is planned for the second quarter of 2013. Additions will be the possibility to generate output files in NetCDF format and compliance with the new version 2 of the SeaDataNet Controlled Vocabularies (NVS 2.0), which will be introduced during 2013.

An extension of the checker tool of DOWNLOAD MANAGER (the DM_Checker batch) is then scheduled for the middle of 2013 (version 1.4.3). The aim of this release is to check the coherence between the CDI catalogue and the coupling tables of SDN partners in both directions, in order to point out CDI records missing in the coupling table and coupling table records missing in the CDI central directory.
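
A two-way coherence check of this kind amounts to two set differences. The following Python sketch illustrates the principle only; it is not the DM_Checker implementation (which is a Java batch), and the function name and the record identifiers are hypothetical.

```python
def check_coherence(cdi_ids: set[str], coupling_ids: set[str]) -> dict[str, list[str]]:
    """Compare identifiers from the CDI catalogue and the coupling table in both directions."""
    return {
        # CDI records that have no entry in the coupling table
        "missing_in_coupling": sorted(cdi_ids - coupling_ids),
        # coupling table records that have no entry in the CDI directory
        "missing_in_cdi": sorted(coupling_ids - cdi_ids),
    }


# Made-up identifiers for illustration.
report = check_coherence({"A1", "A2", "A3"}, {"A2", "A3", "A4"})
print(report)
```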

Description of DOWNLOAD MANAGER functions
DOWNLOAD MANAGER (DM) consists of the following components:

  • The RSM-DM request dialogue interface provides interaction with the RSM for receiving user requests.
  • The DOWNLOAD MANAGER Request processor (with Read-method and local application) interacts with the local data at the SeaDataNet node and prepares the zipped data file (from datasets pre-processed in SeaDataNet formats, or via a read-method that prepares the datasets from a database system), or enables advanced services (e.g. viewing service on seismic datasets).
  • The RSM-DM response dialogue interface provides interaction with the RSM for reporting on data transport status. The RSM interface gives users an overview of the status of their individual requests for datasets at each of the SeaDataNet nodes. This status is updated continuously through communication from the DOWNLOAD MANAGER at each data centre.
  • The DOWNLOAD MANAGER User page for downloading contains a download section and a visualization section and allows users to get files or view datasets.
  • The DOWNLOAD MANAGER checker and cleaner batches allow SeaDataNet nodes to perform checks on the data and the DOWNLOAD MANAGER system and to maintain users’ directories.
  • The DOWNLOAD MANAGER monitor and status pages allow Nagios monitoring by HCMR and give the RSM an overview of the DOWNLOAD MANAGER status.


SeaDataNet tools - new version of ODV

Version 4.5.3 of Ocean Data View (ODV) was released on 15 January 2013. This version provides a number of important enhancements, some of which are highly relevant for SeaDataNet users of ODV. These include the new capability of the SeaDataNet importer to extract values for eight additional meta-variables from the .csv files delivered with any SDN retrieval. Also very important is the availability of the new derived variable, Aggregated Variable.

Aggregated derived variables combine data values of one or more input variables in a single variable. This is useful when a given parameter, such as oxygen, has been measured by different laboratories and is reported in separate original variables, possibly using different units.  Aggregated derived variables allow merging the various original variables (possibly involving unit conversion) into a single variable for scientific analysis. Calculation of aggregated variable values occurs sample by sample using the values of the input variables for the given sample.

8_article_78_odv_aggregrated_variable.jpg (40.5 K) 

The ODV Aggregated Variable dialog

The Aggregated Variable dialog (see Figure) lets you define an aggregated variable. Start by entering the label of the new variable, its units, the number of significant digits, and choose the aggregation mode. Exclusive aggregation uses the first available data value in the specified order of input variables, while Average mode calculates the average of all available input values for a given sample.

Then press the Add button to add the first input variable from the list of basic and already defined derived variables that will appear, and choose one of the available conversions. The default conversion, Identity Transformation, leaves the input value unchanged. Continue to add more input variables and use the Top, Up, Down, Bottom buttons to change the order of input variables if necessary. Press OK to complete the aggregated variable definition.
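
The two aggregation modes can be illustrated with a small Python sketch of the per-sample calculation. It is not ODV code: the function names, the variable labels and the unit conversion are hypothetical, and the sketch assumes that a missing value is represented by `None`.

```python
from typing import Callable, Optional, Sequence

Conversion = Callable[[float], float]


def identity(value: float) -> float:
    """Leave the input value unchanged."""
    return value


def aggregate(sample: dict[str, Optional[float]],
              inputs: Sequence[tuple[str, Conversion]],
              mode: str) -> Optional[float]:
    """Aggregate the input variables of one sample, in the given order."""
    converted = [convert(sample[name])
                 for name, convert in inputs
                 if sample.get(name) is not None]
    if not converted:
        return None                      # no input value available for this sample
    if mode == "exclusive":
        return converted[0]              # first available value in the specified order
    if mode == "average":
        return sum(converted) / len(converted)
    raise ValueError(f"unknown aggregation mode: {mode}")


# Two hypothetical original variables for the same parameter, in mg/l and ug/l.
INPUTS = [("X_mg_per_l", identity), ("X_ug_per_l", lambda v: v / 1000.0)]

both = {"X_mg_per_l": 2.0, "X_ug_per_l": 2200.0}
only_second = {"X_mg_per_l": None, "X_ug_per_l": 1500.0}

print(aggregate(both, INPUTS, "exclusive"))
print(aggregate(both, INPUTS, "average"))
print(aggregate(only_second, INPUTS, "exclusive"))
```

Because exclusive mode takes the first available value, the order of the input variables matters there, while in average mode it does not.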

ODV 4.5.3 can be downloaded from the SeaDataNet portal.

SeaDataNet tools - new version of DIVA

DIVA is software for statistical analysis and interpolation. In situ measurements are generally sparse and heterogeneously distributed. DIVA (Data-Interpolating Variational Analysis) allows the spatial interpolation of these observations on a regular grid in an optimal way. Such gridded fields can be used in numerous applications, including verifying the consistency of measurements (i.e. outlier detection), initialisation, calibration and validation of ocean models (in support of projects like MyOcean), analyses of changes and trends at seasonal, annual and interannual time scales and budget analyses (such as heat content and total biomass).

DIVA version 4.5.1 was released in March 2013. This release brings optimisations and new features, including physical constraints with decay rates and source terms:

  • Advection constraint with linear decay rate and local sources (such as found for radioactive tracers and river discharges)
  • divadetrend now allows one to change easily the order in which detrending is done (for example first years then months or the inverse)
  • two new error calculations are provided (one quick version divacpme with better quality than the original quick version of the poor man’s error; the other divaexerr is an almost exact error calculation much faster than the exact calculation). These two options will be implemented into the 4D version for version 4.6.0 so that error fields will be available with more reasonable CPU times for final climatology productions
  • Simplification of installation and compilation with additional tests of correct installation
  • Housekeeping of the code (simplifications, error messages, cleaning up of code, further optimisations, elimination of deprecated tools)
  • New documentation, largely augmented with examples and new tool descriptions
  • Possibility to call diva from other software via system calls, exemplified by a Matlab function divagrid.m
  • divadoxml adapted to new specifications from IFREMER

The DIVA development has received funding from the European Union Sixth Framework Programme (FP6/2002-2006) under grant agreement n° 026212, SeaDataNet,  Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 283607, SeaDataNet II, and project EMODNet (MARE/2008/03 - Lot 3 Chemistry - SI2.531432) from the Directorate-General for Maritime Affairs and Fisheries.

DIVA can be downloaded from the SeaDataNet portal.

SeaDataNet CDI used in IONIO Project

Funded by the European Territorial Cooperation Operational Programme "Greece-Italy", IONIO involves three institutes (CMCC - Euro-Mediterranean Centre on Climate Change, ENEA - Italian National Agency for New Technologies, Energy and Sustainable Economic Development, and HCMR - Hellenic Centre for Marine Research). They are developing analyses and tools to improve safety and sustainability prospects in the Ionian Sea and in the Southern Adriatic (SANI area). IONIO is deploying:

  1. an observing system which will provide in situ meteo-oceanographic measurements in real time from moored buoys, ships of opportunity and Argo floats;
  2. a forecasting system delivering in real time high resolution currents and wave products;
  3. an “IONIO” service consisting of an information system based on Web tools for the rapid exploration of the environmental information by the regional stakeholders;
  4. Decision Support Systems (DSS)

IONIO is based on an existing set of capabilities that facilitate:

  • End-to-end data preservation and access;
  • Direct, closed loop interaction of models with the data acquisition process;
  • Virtual collaborations created on demand to drive data model coupling and share ocean observatory resources (e.g., instruments, networks, computing, storage and workflows).

IONIO AND MARINE SERVICES
The IONIO service relies on INSPIRE principles to leave data as close as possible to their collection source, basing the system on distributed data nodes. Data are processed into interoperable formats, agreed standards, and protocols. The service will be complemented with several Decision Support Systems (DSS) for Search and Rescue (SAR), for Ship Routing and Safety (SRS) and for Pollution Hazard Mapping (PHM), based upon the environmental information produced by observational and modeling components. The IONIO service will be tested with the marine and maritime stakeholder groups of the Programme Area. The IONIO products and services should contribute to the future cross border marine water management and safety.

IONIO has started to develop a knowledge information system based on the new SeaDataNet Common Data Index metadata profile of ISO 19115-19139. It uses the links in the CDI to provide information on the organisation that collected the data (EDMO), the project within which the data were collected (EDMERP), the content of the data sets (EDMED), and Cruise Summary Reports. The knowledge management makes use of the possibility to include bibliographic references in the new CDI and to provide links to protocols used for quality assurance and quality control, as well as links to published papers.


Making SeaDataNet more fit for handling biological data

So far, SeaDataNet has focused on data management and access for physical oceanography, marine chemistry (to support also the EMODNet Chemistry pilot), bathymetry (to support the EMODNet Hydrography and Seabed Mapping pilots), and geology and geophysics (to support the Geo-Seas project and the EMODNet Geology pilot). Many partners in SeaDataNet are also involved in data acquisition and management for marine biology. A number are members of the Marine Biodiversity and Ecosystem Function (MarBEF) network of excellence and contribute to EurOBIS (European Ocean Biogeographic Information System), managed by the Flanders Marine Institute (VLIZ).

8_article_81_infrastructure.jpg (24.4 K)

One of the objectives of SeaDataNet II is to undertake actions to make SeaDataNet better fit for handling marine biological data sets and establishing interoperability with biology infrastructure developments (Fig 1.). Therefore an analysis is undertaken in SeaDataNet II together with actors from the initiatives mentioned above as to how SeaDataNet can be best adapted for also handling marine biological data sets.

Based on an analysis of the present situation and currently existing biology data standards and initiatives, such as the Ocean Biogeographic Information System (OBIS), Global Biodiversity Information Facility (GBIF), Working Group on Biodiversity Standards (TDWG) and World Register of Marine Species (WoRMS) standards, a recommended format for data exchange of biological data is being developed.

Key issues that steer the format development are:

  • Requirements posed by the intended use and application of the data format (data flows, density calculations, biodiversity index calculations, community analysis, etc…)
  • Availability of suitable vocabularies (World Register of Marine Species, SDN Parameter list, SDN Unit list, etc…)
  • Requirements for compatibility with existing tools and software (WoRMS taxon match services, EurOBIS QC services, Lifewatch workflows, Ocean Data View, etc…) 

It appears that the CDI format is already fit for handling biological data. Adaptations are required for the data format in which data can be downloaded. Results of the analysis and proposed templates for the exchange of several types of marine biological data are expected in autumn 2013.

SeaDataNet II- MyOcean 2 Collaboration

SeaDataNet II and MyOcean 2 have signed an updated Memorandum of Understanding (MOU) with the aim of providing archived quality data to operational systems through dedicated interfaces.
As part of the MOU, work is ongoing on a joint T&S climate data product, an aggregated data set produced in collaboration between SeaDataNet and the MyOcean INS-TAC group.

For the joint product, the focus is on aggregating the data from the EuroGOOS ROOS providers and the SeaDataNet National Data Centres, removing duplicates and converting all data to the same format with the same QC flags. Whenever possible, all the parameters measured by a platform are aggregated, even though scientific validation will only be performed on the Temperature and Salinity parameters.

8_article_82_v3_v4_products.jpg (38.5 K)
Building V3 and V4 Reprocessed T&S in situ products

The SeaDataNet Regional Coordinators discussed and agreed with INS-TAC on a common procedure and time schedule for gathering, QC and delivery of SeaDataNet data sets to INS-TAC for further QC and aggregation. Before the gathering, SeaDataNet also undertook a duplicates analysis on its distributed resources in order to optimise the delivery.

8_article_82_time_schedule.jpg (53.4 K)
SeaDataNet WP10 and MyOcean INS_TAC common time schedule defined at the First Joint Meeting 18th of September 2012

A duplicate analysis was conducted within the SeaDataNet Infrastructure between October and December 2012, before the harvesting process of Temperature and Salinity files. The Duplicates Implementation Plan was prepared by HCMR. This plan was based on the duplicates checks conducted with ODV by AWI for the 6 SDN regions in September 2012. The implementation plan was sent to all SDN partners asking for:

  • Identification of duplicates
  • Cleaning of their data sets (delete, update, replace, etc)
  • Detailed explanations of their actions

After evaluation of each partner's modifications, the CDI central catalogue and the local archives were updated accordingly by the SDN data centres. A total of 60866 potential duplicates were identified (see Tab.1).

8_article_82_results_duplicate_analysis.jpg (30.0 K)
Results of the duplicate analysis conducted within SeaDataNet Infrastructure between October and December 2012, before the harvesting process of Temperature and Salinity files

Guidelines will be sent to all SDN partners to avoid similar duplicate cases in the future; in the meantime a white list of the cleaned and checked CDIs has been prepared, against which new entries in the CDI central catalogue can be checked to avoid future duplicates.

The gathering of the SeaDataNet T&S data sets was undertaken by an automatic harvesting procedure by means of a CDI Robot that has been developed by MARIS. The CDI Robot used the CDI Data Discovery and Access Service to query, shop and retrieve data sets from the distributed data centres in an automatic way. The query searched for all data sets with T&S for which the access restriction was Unrestricted or SeaDataNet License (about 860,000 CDIs). Then the Robot was triggered to start harvesting the related ODV files from the distributed data centres through the general CDI shopping mechanism (RSM – DM). This procedure was also used to test and tune the performance of the RSM – DM process and to find the optimum size of data requests. The tuning resulted in a slicing factor of 500 data sets per cycle of 10 minutes, which could be handled by all connected data centres. RSM keeps track of all data requests and repeats data requests in case of disturbances at DM level. Robot harvesting and tuning of the shopping system ran from mid December 2012 to mid January 2013.
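
The slicing described above can be sketched in a few lines of Python. This is an illustration only, not the CDI Robot itself: the function names and the identifiers are hypothetical, and the estimate assumes that every 10-minute cycle succeeds without repeats.

```python
import math


def slices(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def estimated_days(total, per_cycle=500, cycle_minutes=10):
    """Estimate the harvesting time in days, assuming no repeated cycles."""
    cycles = math.ceil(total / per_cycle)
    return cycles * cycle_minutes / (60 * 24)


# Small demonstration with hypothetical identifiers.
demo_ids = [f"cdi-{i}" for i in range(1200)]
batch_sizes = [len(batch) for batch in slices(demo_ids, 500)]
print(batch_sizes)                        # [500, 500, 200]
print(round(estimated_days(860_000), 1))  # 11.9
```

Under these assumptions about 860,000 CDIs need 1720 cycles, or roughly 12 days of uninterrupted harvesting, which appears compatible with the mid December to mid January period once tuning and repeated requests are taken into account.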

Then all retrieved ODV files, with the full CDI metadata as a CSV file, were stored on a DVD and sent to AWI. The ODV files (more than 2 million SDN data files in ODV format) contained in most cases not only T&S but also additional observations. AWI aggregated all data files into a single TS Data Collection using the SDN importer of ODV 4.5.3. The aggregation of the many original temperature and salinity variables into single T and S variables (using the “Aggregated Derived Variables” ODV option) was done in 9 pieces of about 250,000 files each, which were then re-combined. The logs of problem files were analysed and sent to the coordinator and the data centres for fixing. This was followed by the creation of regional and 1900-2012 subsets and their distribution to the SDN regional groups.
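
The idea of merging many original variables into single T and S variables can be sketched as follows. This is a simplified Python illustration and not ODV's actual "Aggregated Derived Variables" implementation: it assumes that the aggregated value is simply the first non-missing value among a list of source variables, and all variable names are hypothetical.

```python
def aggregate_variable(row, source_names):
    """Return the first non-missing value among the source variables."""
    for name in source_names:
        value = row.get(name)
        if value is not None:
            return value
    return None


# Hypothetical source variable names, in order of preference.
TEMPERATURE_SOURCES = ["temp_ctd", "temp_bottle", "temp_xbt"]
SALINITY_SOURCES = ["psal_ctd", "psal_bottle"]


def aggregate_row(row):
    """Reduce one observation row to single T and S values."""
    return {
        "T": aggregate_variable(row, TEMPERATURE_SOURCES),
        "S": aggregate_variable(row, SALINITY_SOURCES),
    }


rows = [
    {"temp_ctd": 12.4, "psal_ctd": 35.1},
    {"temp_xbt": 8.9},
    {"temp_bottle": 15.0, "psal_bottle": 38.2},
]
aggregated = [aggregate_row(r) for r in rows]
print(aggregated)
```

A row without any salinity variable keeps S as missing (None) rather than being dropped, so that temperature-only observations remain in the collection.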

During the preparation of the aggregated dataset, more than 14,000 files were rejected because their ODV format was non-standard or did not follow the SDN standard, and these ODV files had to be corrected. The list of errors was sent in mid February 2013 to 33 data centres, 5 of which are not SDN partners. Not all of these errors could be fixed before the data delivery to MyOcean.

After a preliminary and basic QC with the ODV software, the Regional Coordinators released the “raw” aggregated temperature and salinity collections, covering the time period 1990-2012, to the regional INS-TAC contacts. The WP leader provided colleagues with guidelines and a report template in order to harmonise the work in progress.

The SeaDataNet data were received by the MyOcean INS-TAC, and a joint workshop took place in mid April 2013 to assess the delivered data, also against the data sets managed by MyOcean INS-TAC. Preliminary conclusions have been drawn, which are now being further discussed and analysed.

Geo-Seas - results of adopting and adapting SeaDataNet approach

Very recently the EU FP7 Geo-Seas project has been finalised. It has implemented an e-infrastructure of 26 marine geological and geophysical data centres, located in 17 European maritime countries. In practice the Geo-Seas e-infrastructure has been created by adopting and adapting the methodologies, standards and architecture developed by the SeaDataNet project. This way Geo-Seas has expanded the SeaDataNet infrastructure and made it fit to handle marine geological and geophysical data, data products and services, creating a joint infrastructure covering both oceanographic and marine geoscientific data.

The operation of Geo-Seas is continued by the Geo-Seas data centres following an exploitation agreement and in synergy with SeaDataNet II. Geo-Seas is also currently providing the data that underpin the data products developed for the European Marine Observation and Data Network (EMODNET) Geology project, which will continue to be funded as part of the European Commission’s marine strategy framework. As many of the Geo-Seas partners are also partners in the EMODNET Geology project, this will promote the continued population of the Geo-Seas Common Data Index (CDI).

8_article_83_collage.jpg (81.7 K)

The results of the Geo-Seas project can be summarized as follows:

  • 26 marine geological and geophysical data centres, located in 17 European coastal countries, are now connected to the Geo-Seas infrastructure and are providing access to their data sets through the Geo-Seas Common Data Index (CDI) data discovery and access service. As a result there are now in excess of 124,000 metadata records and corresponding data sets derived from seabed sediment samples, boreholes, borehole samples, geophysical surveys (seismic, gravity, magnetic) of the seabed and sub-seabed, cone penetration tests, and side scan sonar surveys available via the Geo-Seas and SeaDataNet portals.
  • The original SeaDataNet standards for metadata and data formats used for the CDI service have been considerably adapted, upgraded and expanded for use with geological and geophysical data. In addition the SeaDataNet common vocabularies have been expanded to include a large number of geological and geophysical terms as well as those derived from the GeoSciML initiative. These additional terms have been accepted by the SeaDataNet governance organisation and have significantly enriched this common resource.
  • The original SeaDataNet software tools and services for preparing and importing metadata entries and formatting data sets for use in the CDI service have been adapted and upgraded to accommodate the Geo-Seas metadata and data formats. The upgraded versions have also been added to the common tools and services available for both projects.
  • New software tools and services have been developed for the visualization and analysis of lithological log data, digital terrain models (DTMs) and seismic data sets available from the Geo-Seas data discovery and access service. These include downloadable software packages for viewing lithological logs ("Porcupine") and bathymetric DTM analysis and viewing tools (3D-viewer based on the NASA World Wind software) as well as a demonstrator online service for viewing high-resolution seismic images. These tools and services can be used from the Geo-Seas portal.
  • The outputs of the previous EU-funded EUSeaSed and SEISCANEX projects have been upgraded and incorporated in the Geo-Seas CDI service.
  • The monitoring and performance of the common Geo-Seas and SeaDataNet infrastructure has been significantly improved by the development and implementation of a ‘robot user’ monitoring system that tests and validates the complete data discovery and delivery mechanism on a daily basis.

Geo-Seas is coordinated by Helen Glaves of NERC-BGS (United Kingdom).

Micro B3 - establishing interoperability with marine genomics infrastructure

The EU FP7 project Micro B3 (Biodiversity, Bioinformatics, Biotechnology) (http://www.microb3.eu) will develop innovative bioinformatic approaches and a legal framework to make large-scale data on marine viral, bacterial, archaeal and protist genomes and metagenomes accessible for marine ecosystems biology and to define new targets for biotechnological applications. Micro B3 will build upon a highly interdisciplinary consortium of 32 academic and industrial partners comprising world-leading experts in bioinformatics, computer science, biology, ecology, oceanography, bioprospecting and biotechnology, as well as legal aspects.

8_article_85_keywords.jpg (34.0 K)

Micro B3 started 1st January 2012 and will run for 4 years as part of the EU Ocean of Tomorrow programme. For the first time a strong link between oceanographic and molecular microbial research will be established to integrate global marine data with research on microbial biodiversity and functions. The Micro B3 Information System (MB3-IS) will provide innovative open source software for data-processing, -integration, -visualisation, and -accessibility. Interoperability will be the key for seamless data transfer of sequence and contextual data to public repositories.

The Micro B3 project thus aims for a better understanding of the complexity of marine microbial communities and their role in climate change. This requires that the data sets and information on marine organisms and genes are complemented with their environmental context. Oceanographic and marine environmental data will be provided to Micro B3 by SeaDataNet and EurOBIS. In addition, data will be collected within Micro B3 by organising the Ocean Sampling Day (OSD) and derived from the Tara Oceans expedition. PANGAEA will give data management support for both and, as a SeaDataNet partner, will make sure that the oceanographic data are also discoverable and accessible through SeaDataNet. Data from the Malaspina cruise will also be used as input.

8_article_85_model_data_flows.jpg (63.6 K)

The Micro B3 Information System (MB3-IS) requires data input from both the genomic data infrastructure, as presented by EMBL-EBI’s European Nucleotide Archive (ENA), and the ocean environmental data infrastructure, as presented by SeaDataNet and EurOBIS. Therefore a functional analysis took place of the flow of data from the field via the data management infrastructures to MB3-IS and its users. Furthermore, a technical analysis was made and specifications were formulated for the interoperability options, both for delivering metadata and data to MB3-IS and for the mutual exchanges between SeaDataNet (marine environmental data), EurOBIS (marine biodiversity data) and the European Nucleotide Archive (ENA) (molecular sequence data). The further detailing and actual implementation of the interoperability solutions, followed by the operational provision of oceanographic data to Micro B3, will take place in the second and third year of the Micro B3 project.

The operation will be tested and evaluated by a number of Case Studies representative of current activities related to exploration of marine microbial ecosystems. Therefore 10 geographical sites have been selected as Micro B3 genomic and oceanographic data matching areas relevant for the Micro B3 Use Cases.

Micro B3 is coordinated by Frank Oliver Glöckner of Jacobs University Bremen (Germany).

International collaboration - Ocean Data Interoperability Platform (ODIP)

The EU FP7 ODIP project (http://www.odip.org) is establishing and operating an Ocean Data Interoperability Platform. It has a duration of 3 years from 1st October 2012. ODIP aims to establish an EU / USA / Australia / IOC-IODE coordination platform whose objective is to achieve the interoperability of ocean and marine data management infrastructures, and to demonstrate this coordination through several joint EU-USA-Australia-IOC/IODE prototypes that would ensure persistent availability and effective sharing of data across scientific domains, organisations and national boundaries.

8_article_84_website.jpg (81.8 K)

ODIP is undertaken by representatives of the leading marine data management infrastructures in Europe (such as SeaDataNet, Geo-Seas, MyOcean), USA (such as US NODC, IOOS, R2R) and Australia (such as IMOS) and IOC-IODE (ODP).

The ODIP platform will organise international workshops to foster the development of common standards and develop prototypes to evaluate and test selected potential standards and interoperability solutions. The ODIP partnership will also provide a forum to harmonise the diverse regional systems, while advancing the European contribution to the global system. The products and services developed by ODIP will be actively promoted at the international level by IOC/IODE.

The 1st ODIP Workshop took place at the IODE project office in Ostende, Belgium, on 25-28 February 2013. Its 46 participants represented the various EU, USA and Australian regional infrastructure projects and initiatives that are stakeholders of the ODIP project. Six topics were discussed:

  • Topic 1: Controlled Vocabularies
  • Topic 2: Discovery Metadata formats
  • Topic 3: Metadata and data exchange mechanisms 
  • Topic 4: Data formats
  • Topic 5: Sensor Web Enablement
  • Topic 6: Added value viewing services

All presentations and their videos are available at the ODIP website (www.odip.org) under Workshops. The brainstorming and discussions resulted in a long list of possible actions. Their implementation is planned by means of a limited number of ODIP Prototype projects. Each of these projects will bring together a number of the identified actions (with possible overlap). The actual developments for implementing the Prototypes are foreseen as a joint activity of ODIP participants in synergy with and harvesting from activities underway in regional projects and initiatives, such as SeaDataNet (EU), IMOS (Australia), R2R (USA), and ODIP. ODIP will provide a communication and exchange platform where partners can meet, discuss and tune their development activities and ensure that results are made fit for building the ODIP Prototypes. A proposal about possible ODIP Prototype projects and the way forward is being drafted and will be made public once agreement has been reached in the ODIP Steering Committee.

ODIP is coordinated by Helen Glaves of NERC-BGS (United Kingdom).

EMODNet pilots continued with strong SeaDataNet involvement

SeaDataNet is providing a major contribution to the development process for the overarching EMODNet (European Marine Observation and Data Network). EMODNet is developing into a network of existing and developing European observation systems, linked by a data management structure covering all European coastal waters, shelf seas and surrounding ocean basins. It must facilitate long-term and sustainable access to the high-quality data necessary to understand the biological, chemical and physical behaviour of seas and oceans. Key elements for harmonisation and interoperability are establishing and adopting common standards and protocols for quality control, metadata and data formats, vocabularies and technical protocols.

The "proof of concept" of EMODnet has been tested through preparatory actions. Portals for a number of maritime basins have been set up for hydrographic, geological, biological, chemical and physical data as well as functional habitat maps. These portals provide access to marine data and data products of a standard format and known quality.

In the past 3 years portals have been initiated for the following marine data themes, covering selected marine basins:

As the outcome of a Call for Tender from September 2012, it was announced very recently that all portals will be continued and further expanded in geographical coverage and range of products. This will be undertaken by the leading consortia of the pilots, with the uptake of numerous new data providers and experts. The EU has arranged funding for the coming 3 years, including for supporting studies and activities, which together must result in a sustained and operational EMODNet. As part of this, it will be examined how well the portals meet the needs of users from industry, public authorities and science. A further objective is to identify data gaps and the arguments for filling these gaps in future monitoring.

8_article_86_collage.jpg (123.8 K)

The EMODNet approach, with thematic portals for specific disciplines and communities and with EMODNet concertation meetings together with MODEG experts, has proven to be very useful and effective. This way many potential players from a given discipline or theme can be engaged for their own specialism and interest, while the interoperability and cohesion between the thematic portals is achieved by using common standards from OGC for viewing services (WMS, WFS) and from SeaDataNet for data discovery and access services and semantic interoperability. In practice most of the portals (chemistry, hydrography, physics, geology (via the link with Geo-Seas)) have adopted the SeaDataNet approach of using the CDI data discovery and access service, including its flexible data access restrictions, for giving an overview of and access to basic measurement data sets. Biology at present uses only the EurOBIS standards, but in its next phase will include the SeaDataNet standards and approach for giving access to distributed data sets. Thereby it will harvest from the SeaDataNet II project efforts for upgrading the SeaDataNet standards to make them also fit for handling biological data in an interoperability scheme with EurOBIS.
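
To illustrate what a common viewing standard means in practice, the following minimal Python sketch builds an OGC WMS GetCapabilities request, with which a client asks a map server which layers it offers. The endpoint https://example.org/wms is a placeholder and not an actual EMODNet portal address; the query parameters SERVICE, REQUEST and VERSION are those defined by the WMS standard.

```python
from urllib.parse import urlencode


def wms_getcapabilities_url(base_url, version="1.3.0"):
    """Build a WMS GetCapabilities request URL for a given endpoint."""
    params = {
        "SERVICE": "WMS",
        "REQUEST": "GetCapabilities",
        "VERSION": version,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, used for illustration only.
url = wms_getcapabilities_url("https://example.org/wms")
print(url)
```

Because every WMS-compliant portal answers the same request in the same way, a client written for one thematic portal can in principle query the others without modification.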

Further RTD work as taking place in SeaDataNet II will and must continue on standards and protocols that can be applied as basis for the expanding EMODNet portals. EMODnet stimulates a wider implementation and adoption of these standards, in practice resulting in an expansion of the SeaDataNet infrastructure of connected data centres.

International Conference on Marine Data and Information Systems IMDIS 2013

IMDIS 2013, a new edition of the SeaDataNet series of International Conferences on Marine Data and Information Systems (IMDIS), will be organised and hosted by SeaDataNet in Lucca, Italy, from 23 to 25 September 2013.

8_article_87_imdis_lucca.jpg (33.8 K)

Scientific knowledge of the oceans is essential for making decisions that promote the well-being of the human population, reduce the loss of life and property caused by natural hazards, and allow better conservation of the global environment. To advance in this direction, better access is needed to global, high-quality scientific data from which knowledge and information can be derived through regional and global data analysis.

The IMDIS series of conferences brings together different communities working in informatics, data management, research, environmental protection, etc. It focuses on online access to data, metadata and products, communication standards and adapted technology to ensure platform interoperability. IMDIS 2013 aims to provide an overview of the existing information on marine environmental data and to show the progress in developing efficient infrastructures for managing large and diverse data sets.

The IMDIS 2013 Conference will be organised in four sessions:

  1. Marine information management
    • Exchange, processing and interactive work with marine data sets from highly heterogeneous sources
    • Federation and integration
    • Network services and technologies
  2. Marine environmental data bases: infrastructures and data access systems
    • Coastal and deep-sea operational oceanography metadata/data systems
    • Physical and bio-chemical databases for climate studies
    • Geophysical and geological metadata/data systems
  3. Data Services in ocean science
    • Standards and quality-assurance issues
    • Services and Visualisation tools
    • User oriented services
  4. Services for Users and Education
    • Historical evolution in data collection and management
    • Tools for dissemination
    • Test bed development for educational purposes

Papers and posters will present the main topics and different technological solutions, and promote discussion among the different communities of developers and users.

For further information and to register please go to: http://imdis2013.seadatanet.org/