Data policy for U.S. Global Repeat Hydrography Program, in support of U.S.
carbon cycle and CLIVAR programs
Lynne Talley, Nikki Gruber, Jim Swift with inputs from planning committee
December 20, 2001
The objective of the repeat hydrography program is to maintain decadal
time-scale sampling of ocean transports and inventories of climatically
significant parameters, such as carbon system components, nutrients,
freshwater and heat. While autonomous sampling programs such as ARGO and
the basin-scale volunteer observing ship programs can sample a portion of
the climate-relevant fields, that is, parameters important for heat and
freshwater in the upper part of the water column, they cannot at this time
sample the entire water column nor can they sample important chemical
tracer constituents. These semi-autonomous programs also cannot provide
calibrated data; sensors presently in use are subject to drift and require
occasional verification by in situ measurements.
The repeat hydrography program is ongoing sampling in support of improved
understanding and modeling of climate and the carbon system, rather than
characterization of the long-term mean regional, basin-scale and global
circulation and basic property fields.
Thus, the following data policy for hydrographic observations is adopted:
- Observations will be made publicly available in preliminary form through a
specified data assembly center as soon after collection as is
practical ("early release"), with final calibrated data
provided publicly when available. Collection is interpreted as the
completion of the determination of the value of the particular
parameter. Thus, for example, tritium/helium collection may not be
complete for over a year after return to shore/laboratory. Timelines
for each type of data are given in "Application of the data policy -
timeline" below.
- All data collected as part of the repeat hydrography program will be
submitted to a designated data management structure for quality
control and dissemination for synthesis.
- General U.S. national policy, applicable to all data collection programs,
requires that post cruise inventory information (ROSCOP form) be
completed within 60 days of the end of the cruise, usually by the
Chief Scientist. Ultimately, all data must be archived with the
National Oceanographic Data Center, following timelines set by the
funding agencies.
This data policy achieves at least two objectives:
- It allows data to be incorporated, as soon as possible after collection,
in a variety of analyses carried out by numerous investigators, including
data assimilation. This is consonant with data policies for more rapidly
sampled ocean observation programs, such as ARGO, VOS and satellite
observation programs, as well as atmospheric observation programs.
- It reduces the lengthy delays in data availability that occurred during
WOCE despite the stated two-year proprietary period.
While oceanographic data in the U.S. have historically been proprietary for
some time period, typically two years, this paradigm is shifting. For example,
the TAO and ARGO arrays both explicitly recognize the scientific and
societal value of public and easy access to their data in
"real-time". The "early release" policy for repeat
hydrographic data recognizes that these data too are a publicly funded
asset of significant scientific and societal importance.
It is recognized that well-qualified, experienced and highly motivated
investigators are necessary for the success of the observation programs.
Mechanisms will be provided for involvement of such investigators in
acquisition of the data sets. Data analyses will be funded separately
through competitive proposals. Those who collect the data will typically
be in the forefront of those proposing analyses.
It is recognized that other nations may not initially subscribe to this type
of data policy since their funding may be continuing under policies
requiring much individual initiative. We urge that international data
policies be adopted that adhere to the WOCE requirements at a minimum
(two-year release from time of analysis), and that the international data
policies eventually converge with the U.S. policy. In unilateral
recognition of the international importance of these globally distributed
repeat hydrographic sections, and as an example to other nations, the U.S.
"early release" data policy will give all investigators access to U.S.
repeat hydrographic data.
For U.S. repeat hydrographic measurements collected on non-U.S. ships in order
to complete the suite of Level 1 and 2 measurements, the data release
requirements may differ. See the "U.S. support for observations on non-U.S.
cruises" section below.
National and international oversight:
A U.S. Science Steering Committee (SSC) is required to oversee the program.
The committee will advocate adequate and consistent coverage of all Level
1 and 2 observations. It will ensure smooth interactions with funding
agencies and individual investigators, including those proposing Level 3
measurements. It will ensure that adequate support is provided for the
necessary data assembly center structures.
The U.S. SSC will also interact with other national committees, international
committees and structures carrying out large-scale repeat hydrographic
programs.
Within the U.S., a consortium of scientists will lead each Level 1 and 2
observation type. Each consortium will consist of the principal
investigators and data managers associated with that data type. The
consortia should work to integrate all aspects of the program within the
framework of the Carbon/CLIVAR national and international program
requirements. Each repeat section will include a line coordinator. For
U.S. lines, the fieldwork will be led by an experienced chief scientist,
who may also be the line coordinator, and a co-chief scientist. Guidelines
for the proposed support for these investigators, including a scientific
party of two to accompany them on the cruise, are given in the proposal.
The data management structures will serve the U.S. community. If international
agreements to this end can be achieved, the data management structures
could also serve the international contributions to the global repeat
hydrography program.
Application of the data policy - timeline:
The "early release" data policy applies to all measurements in the
core program for repeat hydrography, including both Levels 1 and 2 (see core
measurements).
Daily during cruise:
- Reduced temperature, salinity, depth data sets via GTS as TESAC messages.
- Meteorological observations.
Within 5 weeks of the cruise, released to the relevant data management structure:
- Preliminary CTD (pressure, temperature, salinity, oxygen if measured)
- A merged bottle data file including preliminary discrete salinity,
oxygen, nutrients (and carbon system components)
- Preliminary CFC-11, CFC-12, CFC-113
- Underway data, including continuous (1-minute) navigation, bathymetry,
shipboard meteorological measurements, temperature, salinity, pCO2 (if
measured).
- Shipboard ADCP data
Within 6 months of the cruise, presuming the 5-week release of CTD and discrete
salinity data:
- Final salinity, oxygen, nutrients, CFC, CTD data
- Final underway data
- Final shipboard ADCP data
- Final carbon system parameters (Total CO2 and Total Alkalinity required; pH,
pCO2 if measured)
- CDOM if measured
- Lowered ADCP (if measured)
- Any other Level 2 measurements
Within 6 months of shore-based analysis:
- Tritium/helium
- 14C and 13C
- DON if measured
Within 2 years of analysis (required NSF data release schedule):
- Any other (Level 3) observations. Those based on discrete bottle samples
should be submitted to the hydrographic data management structure and
merged with the other bottle data. (These include measurements such as
NH4, low level nutrients, DMS and methyl halides, other trace metals,
chlorophyll, TOP.)
- Underway data should be submitted to the underway data management
structure to be merged with the Level 1 and 2 underway data.
- Data from other discrete sampling programs that are likely to be carried
out on many of the cruises, such as transmissometry and optics, should be
submitted to the relevant data management groups (examples are a JGOFS
SMP project for global transmissometry, and the NASA DAC for optics).
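The release windows above can be turned into concrete dates for a given
cruise. The following is a minimal sketch only, not part of the policy. It
assumes that "of the cruise" means the cruise end date, and that months and
years are counted as calendar months. The category names and the functions
add_months and release_deadline are hypothetical, introduced for illustration.

```python
import calendar
from datetime import date, timedelta


def add_months(d, months):
    """Shift a date by whole calendar months, clamping the day to month end."""
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Hypothetical category names mapped to (reference event, deadline rule).
# The windows themselves follow the timeline listed above.
RELEASE_RULES = {
    "preliminary": ("cruise_end", lambda d: d + timedelta(weeks=5)),
    "final_level_1_2": ("cruise_end", lambda d: add_months(d, 6)),
    "shore_based": ("analysis_done", lambda d: add_months(d, 6)),
    "level_3": ("analysis_done", lambda d: add_months(d, 24)),
}


def release_deadline(category, cruise_end, analysis_done=None):
    """Return the latest release date for one data category."""
    event, rule = RELEASE_RULES[category]
    if event == "cruise_end":
        return rule(cruise_end)
    if analysis_done is None:
        raise ValueError("an analysis date is needed for " + category)
    return rule(analysis_done)
```

For example, release_deadline("preliminary", date(2003, 1, 1)) gives the date
35 days after an assumed cruise end of 1 January 2003.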
Pre-cruise planning:
Each cruise will be planned by the Chief Scientist and principal investigators
responsible for Level 1 and 2 measurements, taking into consideration any
special requests by other principal investigators for Level 3
measurements. Priority for berths and ship laboratory space will be given
to the observation programs in Levels 1 and 2. Reasonable requests for
wire time, laboratory space and berths will be accommodated for Level 3
observations. Any disputes will be negotiated with assistance from the
Science Steering Committee (SSC).
The list of sampling programs and investigators will be submitted to the
Science Steering Committee 9 months prior to the cruise, to allow the SSC
to cover any missing observation types. The SSC and funding agencies will
ensure that all Level 1 measurements are covered adequately for each
cruise, and that as many of the Level 2 measurements as possible are also
included.
The final sampling program, including exact cruise track, nominal station
locations, parameters to be sampled, anticipated precision and accuracy of
each measurement, and list of measurement groups or personnel, should be
filed with the SSC 3 months prior to the start of the cruise.
Data submission:
A subset of the temperature/salinity/depth and meteorological data should be
submitted in near real-time during the cruise, through the GTS.
Data will be submitted following the cruise to the relevant data management
structure within the required timelines specified above. Structures
required for the repeat hydrography program (Level 1 and 2 data) should
include centers for: CTD data, discrete bottle data, underway data, ADCP
data and meteorological data. Most Level 3 data sets can be submitted to
these centers; transmissometer and optical data should be submitted to
management structures that will be designated prior to the beginning of
the program. Some of these functions could or should be combined in a
single office (e.g. hydrography, carbon and underway data could rationally
be combined in one center), in order to increase efficiency and remove
duplication of effort. One model might be that the present distributed and
autonomous data centers function as units of a single (distributed) data
center, with unified and streamlined procedures for efficient data
assembly and quality control.
Documentation:
Sampling logs should be maintained as required for the ROSCOP form
submission, and at a minimum to the uniform standard of the WOCE
Hydrographic Program's summary (.sum) files. In addition, full information
regarding sampling at each station should be logged, including depths,
types of samples, any perceived problems with samples upon collection or
initial analysis, and personnel involved in analysis.
Documentation of sampling and analytical protocols should be submitted with
the data sets. Documentation with the initial (early release) data submission
should include a brief description of shipboard sampling procedures and
precision and accuracy of observations. Documentation submitted with the
final data set should also include a detailed description of sample
preparations, analytical procedures, equipment calibrations, data
reduction techniques, computation algorithms, citations and anything else
deemed necessary.
Data quality standards:
U.S. measurement standards should adhere to those set by WOCE and JGOFS for
CTD, hydrographic properties and carbon system components.
Setting high standards for U.S. data quality and delivery has international
community-wide benefits, as in the WOCE Hydrographic Program. We encourage
comparison studies of international scope.
A U.S. methods handbook addressing reference materials, data quality goals,
data flag protocols (and formats) should be prepared soon and distributed
internationally; U.S. investigators will be required to use it. A second
version, prepared with international participation, may be pursued later;
the U.S. handbook is needed now.
To provide the opportunity for comparison with historical data, measurement
techniques should be consistent with techniques used to collect the
existing data unless there is significant scientific justification for
change. When new techniques are adopted, methods for relating the new data
to existing data should be developed. This requirement extends to regional
comparisons as well.
As many measurements as possible should be made relative to a certified
reference material standard. Such standards now exist for dissolved
inorganic carbon, total alkalinity and CFCs. Standards for other Level 1/2
measurements are urgently needed, based on experience in WOCE and JGOFS.
These include standardization of oxygen analysis procedures (already
developed) and development of nutrient standards.
U.S. support for observations on non-U.S. cruises:
A number of long repeat hydrographic sections in the overall plan are being
carried out by other nations. Many of these do not include all Level 1 and
2 observations (see Table). U.S.
groups have the capability to provide these measurements. The U.S. repeat
hydrography plan therefore includes these observations. Data submission
requirements as outlined above are waived for these groups, in recognition
that the non-U.S. cruises will not be operating with the same submission
deadlines and in recognition of the importance of collecting these
observations for the long-term record of ocean change.
Data collected by U.S. investigators on non-U.S. cruises should still be
submitted to the designated data management structures, at the time of
public dissemination of the principal data sets or at two years, whichever
is earlier. If the non-U.S. data sets have an embargo on publication that
extends beyond 2 years, then the submitted U.S. data should remain
proprietary, that is, without dissemination, until the non-U.S. release
date.
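The submission and release rule for U.S. data from non-U.S. cruises can be
sketched as follows. This is one reading of the paragraph above, not an
official procedure. The policy does not say which event starts the two-year
clock, so the caller supplies the two-year date directly. The names
NonUsSchedule and non_us_schedule are hypothetical.

```python
from datetime import date
from typing import NamedTuple


class NonUsSchedule(NamedTuple):
    submit_by: date   # latest date to send data to the data management structure
    public_on: date   # earliest date the submitted data may be disseminated


def non_us_schedule(two_year_mark, non_us_release):
    """Apply the 'whichever is earlier' rule, holding data under embargo.

    two_year_mark: date two years after the reference event (assumed input).
    non_us_release: public release date of the principal non-U.S. data sets.
    """
    # Submit at public dissemination or at two years, whichever is earlier.
    submit_by = min(two_year_mark, non_us_release)
    # Under this reading the U.S. data stay proprietary until the
    # non-U.S. release date, even if they were submitted earlier.
    public_on = non_us_release
    return NonUsSchedule(submit_by, public_on)
```

With an embargo longer than two years, the sketch returns a submission date
at the two-year mark and a later public date equal to the non-U.S. release.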
The data management structures will make every effort to assemble the complete
data sets, through contact with the non-U.S. principal investigators. An
international agreement on the repeat hydrography program, data submission
and designated data assembly centers should be sought.
Data management:
Data management is divided into two stages: (1) the principal investigators who
are responsible for data collection, analysis, calibration, documentation,
and submission to the data assembly centers and possibly NODC and (2) the
data assembly centers that are responsible for data merging, online
dissemination and documentation, a second stage of quality control (QC),
and archiving at NODC.
At-sea data processing and documentation is most effective (but is expensive). At
a minimum, there should be a data management person on board, who is
responsible for merging data, assembling documentation, and other matters.
The science team (consortium) will be helping the PI to process the data,
but more by providing tools, advice, standards, etc., with the PI leading
the processing. Thus the PIs doing the cruise must include data processing
funds in their proposals to do the sea work. At a minimum, PI QC -
applying flags as per a standard protocol - should be done, with a second
level of QC as needed for some parameters, e.g. carbon system.
The bottle S/O2/nutrient/CFC data must be quality controlled and merged
quickly on board with the preliminary processed CTD data. These should be made
available to the carbon PIs within 5 weeks of the cruise in order for the
carbon data to be submitted within six months.
A single system of QC flag assignments and record keeping should be used for
all bottle parameters.
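As an illustration of what a single flag system for a merged bottle file
could look like, here is a minimal sketch. The policy does not fix the flag
codes or the record layout. The QCFlag values, the BottleRecord fields and
the merge function below are all hypothetical, and records are assumed to be
matched on station, cast and bottle number.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class QCFlag(IntEnum):
    # Hypothetical codes for illustration; the policy does not define them.
    ACCEPTABLE = 2
    QUESTIONABLE = 3
    BAD = 4
    NOT_SAMPLED = 9


@dataclass
class BottleRecord:
    station: int
    cast: int
    bottle: int
    values: dict = field(default_factory=dict)  # parameter name -> value
    flags: dict = field(default_factory=dict)   # parameter name -> QCFlag

    def add(self, parameter, value, flag):
        """Store a value with its flag; every parameter uses the same scheme."""
        self.values[parameter] = value
        self.flags[parameter] = QCFlag(flag)

    def acceptable(self):
        """Return only the values flagged as acceptable."""
        return {p: v for p, v in self.values.items()
                if self.flags[p] == QCFlag.ACCEPTABLE}


def merge(records):
    """Merge per-group records into one record per (station, cast, bottle)."""
    merged = {}
    for rec in records:
        key = (rec.station, rec.cast, rec.bottle)
        target = merged.setdefault(key, BottleRecord(*key))
        target.values.update(rec.values)
        target.flags.update(rec.flags)
    return merged
```

Because every measurement group writes flags from the same enumeration, a
user of the merged file can filter all parameters with one test, as the
acceptable method does.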
The data management groups will be responsible for making the data accessible
to the community via World Wide Web servers, published or online data
reports, CD-ROMs, etc.