by Lola M. Olsen
The following rules have been used in determining
TERMs and three levels of VARIABLEs (known as Variable_Level1, Variable_Level2, Variable_Level3) for the
GCMD keywords. These rules are
used in GCMD's procedures
for modifying TOPICs, TERMs,
and VARIABLEs to assist
the user in locating Earth science data sets
of interest. These VALIDs are expected to
remain fairly stable over time, although suggestions
for additions and/or changes will always be
considered. In addition, an uncontrolled level of keywords is
available for "detailed variables". This list will be
uncontrolled except for spelling.
Data set producers and DIF writers are encouraged
to populate this field with "detailed variables"
for the data sets being documented. None of
the rules below will apply to this uncontrolled set of "detailed
variables". [The field will be searchable through
a Lucene fielded search.] Be aware
that not all the keywords currently have data set descriptions behind them.
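As an illustration only, the levels described above can be sketched as a small data structure. The level names come from the text; the " > " separator, the Topic-first ordering, and the names KeywordPath and parse_keyword_path are assumptions made for this sketch and are not part of any GCMD tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence

# Controlled levels named in the text above. Starting the path at Topic and
# separating levels with ">" are assumptions made for this sketch.
LEVELS = ("Topic", "Term", "Variable_Level1", "Variable_Level2", "Variable_Level3")


@dataclass
class KeywordPath:
    # Controlled keywords, keyed by level name.
    controlled: Dict[str, str]
    # Uncontrolled "detailed variables"; none of the rules apply to these.
    detailed_variables: List[str] = field(default_factory=list)


def parse_keyword_path(path: str, detailed_variables: Sequence[str] = ()) -> KeywordPath:
    """Split a 'Topic > Term > Variable...' string into named levels."""
    parts = [part.strip() for part in path.split(">")]
    if not all(parts):
        raise ValueError(f"empty level in keyword path: {path!r}")
    if len(parts) > len(LEVELS):
        raise ValueError(f"too many levels in keyword path: {path!r}")
    return KeywordPath(dict(zip(LEVELS, parts)), list(detailed_variables))


example = parse_keyword_path(
    "Atmosphere > Atmospheric Radiation > Optical Depth/Optical Thickness"
)
print(example.controlled)
```

Keeping the detailed variables in a separate, free-form list mirrors the distinction drawn above between the controlled levels and the uncontrolled field.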
1. At any level of the keyword taxonomy, all topics, terms, and
variables should be chosen to be mutually exclusive, minimizing overlap
as much as possible.
2. At any level within the taxonomy, the keywords should be
parallel. For example, one would not include a broader or narrower
keyword within any one level of the taxonomy.
3. Terms may be prefixed with TOPIC level
modifiers, if they do not "stand alone" well.
4. Terms/Variables should be plural when
singular vs. plural is in question.
Count nouns answering the question,
"How many?" are plural: for example,
chemical reactions, penguins, ecosystems.
(Exceptions to this rule exist on
a discipline-specific basis.)
Noncount nouns answer the question
"How much?" Abstract concepts and
unique entities are singular: for example,
copper, snow, water, digestion, and conductivity.
5. No "data center-specific" prefixed or suffixed
variables should be used.
6. Chemical symbols may be used at Variable_Level2 or Variable_Level3
or as detailed variables.
7. Variables may be prefixed with TERM level
modifiers; however, this is not required. TERM modifiers are suggested to
identify variables that do not "stand alone"
well. A generic variable such as "motion"
should not be used if a more accurate and
descriptive variable such as "sea ice motion"
is what the user will find in the search.
[Variables should generally not be prefixed
with TOPIC level modifiers, although there are exceptions.]
8. Statistical modifiers may be used at Variable_Level2 or Variable_Level3
or captured as detailed variables.
Example: Mean stream discharge
9. Extended modifiers should be reserved for
Variable_Level2 or Variable_Level3 or captured in the
uncontrolled (detailed variable) keyword list.
Example: Integrated Precipitable Water Vapor
"Intercepted" Photosynthetically Active Radiation
10. Scientifically meaningless or overly
complex modifiers and internal organization prefixes should be avoided in the
variable list.
Example: "1_BUTENE" and "Langley_8_year_SRB_SW_Radiation" would be appropriate only in the uncontrolled detailed variable list.
11. If a generic VARIABLE has been used to
describe a group of variables, a redundant
generic term should NOT be used at the same level of the variable
list. When multiple expressions for the same variable exist, the variable level should indicate these as "xxx Expressions", signifying that the expressions that follow can be used interchangeably with appropriate conversions.
Example: Variable = water vapor.
Do not include another variable indicating
the same quantity at the same level, such as "humidity".
All water vapor derived values (which can be converted
from one expression to another), such as "absolute humidity",
"specific humidity", "relative humidity", "vapor pressure",
"mixing ratio" should be listed at the variable level
below the common identifier, "Water Vapor Expressions".
12. Duplicate variables should be avoided
if one serves as a synonym or surrogate
for another.
Possible example:
sea ice stage development vs. sea ice form
13. Variable descriptors that add only nebulous
information should be avoided; for instance,
how low is "low" in "lower troposphere"?
14. If the science community uses terms
interchangeably, the more commonly used variable
in the field should be chosen.
Examples:
Variable: Use Planetary Boundary Layer (PBL) or Atmospheric Boundary Layer (ABL)
Reserve "peplosphere" for the Detailed Variable level.
15. Modifiers that only describe the spatial
domain should generally be reserved for Variable_Level2 or Variable_Level3.
Example:
Variable: Heat Flux
Variable Level 2 or 3: Global Heat Flux
16. Variables should be mutually exclusive,
minimizing overlap as much as possible.
17. Keywords should not be associated with
"value judgments". "Air Quality"
should be used in preference to "Pollution".
18. Any "slashed" keywords must be clarified
so that each side of the slashed word(s) can
stand alone for searching by the user.
Example:
Use: Atmosphere > Atmospheric Radiation >
Optical Depth/Optical Thickness
Not: Atmosphere > Atmospheric Radiation >
Optical Depth/Thickness
19. An overriding goal is to have the keywords as "parallel" (in terms of "detail") as possible within any one level of the hierarchy.
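Two of the rules above (10 and 18) lend themselves to simple automated checks. The following is a rough sketch, not a GCMD procedure: the function names are hypothetical, and both checks are heuristics inferred only from the examples given in the rules. The underscore test reflects the two names quoted in rule 10, and the word-count test reflects the "Optical Depth/Thickness" example in rule 18, so either may misjudge other keywords.

```python
from typing import List


def slash_alternatives(keyword: str) -> List[str]:
    """Return each side of a slashed keyword, as a search would see it."""
    return [side.strip() for side in keyword.split("/")]


def has_uneven_slash_sides(keyword: str) -> bool:
    """Heuristic for rule 18: flag slashed keywords whose sides differ in
    word count, which suggests one side relies on the other for context."""
    word_counts = {len(side.split()) for side in slash_alternatives(keyword)}
    return len(word_counts) > 1


def looks_like_internal_name(keyword: str) -> bool:
    """Heuristic for rule 10: flag underscore-joined names such as
    "1_BUTENE", which belong in the detailed variable list."""
    return "_" in keyword


for candidate in ("Optical Depth/Optical Thickness", "Optical Depth/Thickness"):
    print(candidate, slash_alternatives(candidate), has_uneven_slash_sides(candidate))

for candidate in ("1_BUTENE", "Langley_8_year_SRB_SW_Radiation", "Heat Flux"):
    print(candidate, looks_like_internal_name(candidate))
```

Splitting "Optical Depth/Thickness" leaves "Thickness" on its own, which is the ambiguity rule 18 is meant to prevent; the clarified form yields two terms that can each be searched independently.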
To suggest a modification to the GCMD
keyword valids, please contact one of the
GCMD science
staff or e-mail your suggestion to GCMD User Support.