Basics of full-text search of the GCMD Database

The full-text search engine for the GCMD database has been improved with new features and more intuitive search functions that closely resemble the behavior of commercial Internet search engines. The following was adapted from the Jakarta Lucene query syntax guide.

As with most modern full-text search engines, a query is broken up into terms and operators. There are two types of terms: single (or multiple) terms and phrases.

Type in a single term such as ozone and click enter to retrieve a list of relevant titles.

Type in multiple terms such as ozone TOMS and the search engine will interpret this as ozone AND TOMS and retrieve only those descriptions with the words ozone and TOMS somewhere in the description.

Type in a phrase as a group of words surrounded by double quotes, such as "sea ice", and click enter to retrieve a list of descriptions. Descriptions retrieved will contain those words together somewhere in the description. In this example the resulting descriptions will contain the phrase "sea ice".

Note: The search engine is not case sensitive. Proper results will be returned if you type antarctica, ANTARCTICA, or Antarctica.
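
As a rough illustration of how a query separates into single terms and quoted phrases, the following Python sketch splits a query string and lowercases each part to mimic case-insensitive matching. The function name split_query is hypothetical, and this is not the search engine's actual parser.

```python
import re

def split_query(query):
    """Split a query into lowercase single terms and quoted phrases."""
    parts = []
    # Match either a double-quoted phrase or a run of non-space characters.
    for phrase, term in re.findall(r'"([^"]*)"|(\S+)', query):
        if phrase:
            parts.append(("phrase", phrase.lower()))
        elif term:
            parts.append(("term", term.lower()))
    return parts

print(split_query('ozone "sea ice" TOMS'))
```

Lowercasing is only a stand-in for case-insensitive matching of search words; the operators OR and NOT described in the next section must still be typed in capital letters.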

Boolean Search

Boolean operators consist of the following words:

  • AND or "+" - Two or more terms or phrases must be in the description. AND is the default operator.
  • OR - Either one or the other of the multiple terms specified must be in the description.
  • NOT or "-" - A term or phrase specified is excluded from the search.

    Note: The Boolean operators OR and NOT must be specified explicitly and must be in CAPITAL LETTERS. If you have multiple terms in the query without any operators or quotes, an AND operator is assumed.

    Search Example: The search engine interprets the query ozone TOMS polar antarctica as ozone AND TOMS AND polar AND antarctica and will return all descriptions that contain all of those words.

AND Query Examples:


A query of two (or more) terms, such as sea winds, will retrieve all descriptions containing both words. An AND Boolean operator is assumed between the terms (sea AND winds), so only descriptions with both of those words will be retrieved. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND.

OR Query Example:

A query of two (or more) terms separated by the Boolean operator OR, such as sea OR topography. Note that the Boolean operator MUST be capitalized; otherwise the search engine will assume it is a word to be searched. This query will retrieve descriptions that contain either sea or topography. As a general rule, OR queries will return more hits than AND queries. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.

NOT Query Examples:

In the query sea NOT ice, the Boolean NOT separates the two terms. This query will retrieve all descriptions with the word sea but not the word ice. In other words, if a description contains both sea and ice, it will not be retrieved. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT.
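
The set analogy for AND, OR, and NOT can be sketched with Python sets. The index below is made-up data that maps each word to the IDs of the descriptions containing it; it only illustrates the set operations and says nothing about how the GCMD index is actually stored.

```python
# Toy inverted index: word -> set of description IDs (made-up data).
index = {
    "sea": {1, 2, 3, 4},
    "winds": {2, 5},
    "topography": {4, 6},
    "ice": {1, 3},
}

sea_and_winds = index["sea"] & index["winds"]            # AND: intersection
sea_or_topography = index["sea"] | index["topography"]   # OR: union
sea_not_ice = index["sea"] - index["ice"]                # NOT: difference

print(sorted(sea_and_winds))
print(sorted(sea_or_topography))
print(sorted(sea_not_ice))
```

With this data the OR query matches more descriptions than the AND query, in line with the general rule stated above.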

 

Fielded Searching

The search engine allows you to restrict your search to any DIF or SERF metadata field (see DIF and SERF user guides for the list of metadata fields). The syntax is as follows:

DIF/dif_field_name: query
or
SERF/serf_field_name: query

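For example, if you want to restrict your search to the DIF title field, substitute the name of the title field for dif_field_name and the term AVHRR for query, and only those descriptions with "AVHRR" in the title will be returned. Similarly, a search for the term software restricted to the title field will return only those descriptions with "software" in the title.

Fielded searching also allows you to drill through subfields. For example, you can specify the exact Parameter hierarchy or Personnel field to conduct your search:

  • Restricting the phrase "carbon dioxide" to the Variable keyword subfield will return all descriptions with the phrase "carbon dioxide" as a Variable keyword.
  • Restricting the phrase "carbon dioxide" to the parameters field will return all descriptions with the phrase "carbon dioxide" within the parameters field.
  • Restricting the term Smith to the last name subfield of Personnel will return all descriptions with Personnel with the last name "Smith".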
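
The fielded syntax shown above can be assembled programmatically. The Python sketch below is a hypothetical helper (fielded_query is not part of any GCMD tool) that formats a query following the DIF/dif_field_name: query pattern. The field name used here is the placeholder from the syntax description; real field names come from the DIF and SERF user guides.

```python
def fielded_query(record_type, field_name, query):
    """Format a fielded query as RECORD_TYPE/field_name: query."""
    if record_type not in ("DIF", "SERF"):
        raise ValueError("record_type must be 'DIF' or 'SERF'")
    # Quote multi-word queries so they are searched as a phrase.
    if " " in query and not query.startswith('"'):
        query = f'"{query}"'
    return f"{record_type}/{field_name}: {query}"

# 'dif_field_name' is a placeholder, not a real DIF field name.
print(fielded_query("DIF", "dif_field_name", "AVHRR"))
print(fielded_query("DIF", "dif_field_name", "carbon dioxide"))
```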


Modified Queries

The search engine supports several term modifiers for enhanced searching options. These are:

    • Wildcard searches
    • Fuzzy searches
    • Proximity searches
    • Range searches
    • Term boosting

Wildcard Searches
The search engine supports single and multiple character wildcard searches.

    • To perform a single character wildcard search use the "?" symbol.
    • To perform a multiple character wildcard search use the "*" symbol.

The single character wildcard search looks for terms that match with the single character replaced. For example, to search for "text" or "test" you can use the search: te?t

The multiple character wildcard search looks for 0 or more characters. For example, to search for wind, winds or windy, you can use the search: wind*

You can also use the wildcard search in the middle of a term.

Note: You cannot use a * or ? symbol as the first character of a search.
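
One way to picture the wildcard rules is to translate a pattern into a regular expression, with "?" standing for exactly one character and "*" for zero or more. The Python sketch below does this and rejects a leading wildcard. The function names are hypothetical, the patterns te?t and wind* are chosen here for illustration, and the search engine's own matching may differ in detail.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a ?/* wildcard pattern into a compiled regular expression."""
    if pattern[:1] in ("*", "?"):
        raise ValueError("a wildcard cannot be the first character of a search")
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(pattern, term):
    """Return True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None

print([t for t in ["text", "test", "tent", "tet"] if matches("te?t", t)])
print([t for t in ["wind", "winds", "windy", "win"] if matches("wind*", t)])
```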

Fuzzy Search
The search engine supports fuzzy searches based on the Levenshtein Distance, or Edit Distance algorithm. To do a fuzzy search use the tilde, "~", symbol at the end of a Single word Term. For example, to search for a term similar in spelling to "roam" use the fuzzy search: roam~

This search will find terms like foam and roams.

An additional (optional) parameter can specify the required similarity. The value is between 0 and 1; with a value closer to 1, only terms with a higher similarity will be matched. For example: roam~0.8

Note: The default that is used if the parameter is not given is 0.5.
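
To make the edit distance idea concrete, here is a minimal Python sketch of the Levenshtein distance, the number of single-character insertions, deletions, and substitutions needed to turn one word into another. The similarity function that follows is one common normalization and is an assumption for illustration only; the search engine may compute its 0 to 1 similarity value differently.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Assumed normalization: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest

for word in ["foam", "roams", "ozone"]:
    print(word, edit_distance("roam", word), similarity("roam", word))
```

Both foam and roams are one edit away from roam, which is why a fuzzy search for roam can find them.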

Proximity Search
The search engine supports finding words that are within a specific distance of each other. To do a proximity search use the tilde, "~", symbol at the end of a Phrase. For example, to search for "greenhouse" and "carbon" within 10 words of each other in a description use the search: "greenhouse carbon"~10
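
The idea behind a proximity search can be sketched in Python by comparing word positions. This is a simplified model: it splits on whitespace and measures the difference in word positions, which may not match exactly how the search engine counts distance. The function name and sample text are made up for illustration.

```python
def within_distance(text, word_a, word_b, max_gap):
    """Return True if word_a and word_b occur within max_gap positions."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == word_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == word_b.lower()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)

description = "greenhouse gases such as carbon dioxide trap heat"
print(within_distance(description, "greenhouse", "carbon", 10))
print(within_distance(description, "greenhouse", "carbon", 3))
```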

Range Searches
Range Queries allow one to match descriptions whose field(s) values are between the lower and upper bound specified by the Range Query. Range Queries can be inclusive or exclusive of the upper and lower bounds. Sorting is done lexicographically.

For example, an exclusive range query on the title field with the bounds {Greenhouse TO IPCC} will find all descriptions whose titles are between Greenhouse and IPCC, but not including Greenhouse and IPCC.

Inclusive range queries are denoted by square brackets [ ]. Exclusive range queries are denoted by curly brackets { }.
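
The difference between inclusive and exclusive bounds can be sketched with Python string comparison, which is also lexicographic. The titles below are made up, and Python compares by character code (so uppercase sorts before lowercase), which may not match the search engine's ordering in every case.

```python
def in_range(value, lower, upper, inclusive):
    """Lexicographic range test with inclusive or exclusive bounds."""
    if inclusive:
        return lower <= value <= upper   # like [lower TO upper]
    return lower < value < upper         # like {lower TO upper}

titles = ["Greenhouse", "Hurricane Tracks", "IPCC", "Ozone"]

exclusive_hits = [t for t in titles if in_range(t, "Greenhouse", "IPCC", False)]
inclusive_hits = [t for t in titles if in_range(t, "Greenhouse", "IPCC", True)]

print(exclusive_hits)
print(inclusive_hits)
```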

Boosting a Term
The search engine provides the relevance level of matching descriptions based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.

Boosting allows you to control the relevance of a description by boosting its term. For example, if you are searching for: greenhouse carbon and you want the term "greenhouse" to be more relevant boost it using the ^ symbol along with the boost factor next to the term. You would type: greenhouse^4 carbon

This will make descriptions with the term "greenhouse" appear more relevant. You can also boost Phrase Terms by placing the caret and boost factor directly after the closing quote of the phrase.

Note: By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2).
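
The effect of a boost factor on ranking can be illustrated with a deliberately simple scoring model in Python: each matching term contributes its number of occurrences multiplied by its boost. Real relevance scoring is more involved, so treat this only as a sketch of why a boosted term pushes descriptions higher. The descriptions and the score function are made up.

```python
def score(description, boosts):
    """Toy relevance score: sum of (boost x occurrences) for each term."""
    words = description.lower().split()
    return sum(boost * words.count(term) for term, boost in boosts.items())

doc_a = "greenhouse gas inventory"
doc_b = "carbon cycle and carbon flux"

unboosted = {"greenhouse": 1.0, "carbon": 1.0}   # greenhouse carbon
boosted = {"greenhouse": 4.0, "carbon": 1.0}     # greenhouse^4 carbon

print(score(doc_a, unboosted), score(doc_b, unboosted))
print(score(doc_a, boosted), score(doc_b, boosted))
```

In this toy model the description mentioning greenhouse ranks below the other one without a boost and above it once greenhouse is boosted by 4.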

Grouping
The search engine supports using parentheses to group clauses to form sub queries. This can be very useful if you want to control the boolean logic for a query.

To search for either "greenhouse" or "carbon" and "emissions" use the query: (greenhouse OR carbon) AND emissions

This eliminates any confusion and makes sure that emissions must exist and either term greenhouse or carbon may exist.
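
Why the parentheses matter can be shown with ordinary Boolean expressions in Python. The ungrouped version below follows Python's own precedence rules (and binds tighter than or), which are not necessarily the search engine's, so it serves only to show that grouping changes the result.

```python
def grouped(words):
    """(greenhouse OR carbon) AND emissions"""
    return ("greenhouse" in words or "carbon" in words) and "emissions" in words

def ungrouped(words):
    """greenhouse OR carbon AND emissions, using Python precedence."""
    return "greenhouse" in words or "carbon" in words and "emissions" in words

description = {"greenhouse", "warming"}   # made-up description without "emissions"

print(grouped(description))
print(ungrouped(description))
```

The grouped form rejects a description that lacks emissions, while the ungrouped form accepts it because greenhouse alone satisfies the expression.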

Field Grouping
The search engine supports using parentheses to group multiple clauses to a single field.

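To search for a title that contains both the word "emissions" and the phrase "global warming", place both clauses in parentheses after the field name, following the pattern DIF/dif_field_name: (emissions AND "global warming"), where dif_field_name stands for the name of the title field.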


Webmaster:  Monica Holland
Responsible NASA Official:  Lola Olsen
Last Updated: May 2008