GrADS Data Server - Configuration File Reference


Basic format description

The GDS configuration file is in XML format. This means it must start with the following line:

<?xml version="1.0" encoding="ISO-8859-1"?>

and then contain a set of nested "tags". Each tag specifies the configuration options for a particular module of the server. Where the configuration for a module is more complex, the tag for that module may contain further tags. Tags are written in one of two forms. For a tag with no contents, the syntax is:

<tagname attribute="value" ... />

and for a tag with contents, it is:

<tagname attribute="value" ...>
  (other tags)
</tagname>

The GDS configuration file does not currently use plain text or CDATA segments.



Example configuration files

Most tags are completely optional; the GDS will use reasonable defaults when it can. The only settings that are always needed are the location of GrADS, and the location of the dataset to serve. Also, for the admin service to be enabled, an authorization string must be set. Here's a minimal example configuration file, using mostly default settings:

<?xml version="1.0" encoding="ISO-8859-1"?>
<gds>
  <catalog>
    <data>
      <datadir file="/data/mruns/" suffix=".nc" format="nc" />
    </data>
  </catalog>
  <grads>
    <invoker grads_dir="/home/jdoe/grads/" />
  </grads>
  <mapper>
    <service-admin auth="sDFe294f3nv034u8"/>
  </mapper>
</gds>
  

By contrast, a more customized GDS configuration file might look like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<gds name="my_server" home="http://www.some.edu/~jdoe/data_server_info.html">
  <catalog temp_entries_limit="500">
    <data>
       <dataset name="my_data" file="/data/my_data.ctl" format="ctl"/>
       <datadir name="my_model_runs" file="/data/mruns/" suffix=".nc" format="nc"
                doc="http://www.some.edu/my_online_data/mrun_info.html" >
         <metadata name="model_version" value="1.0.0"/>
         <metadata-filter att_name="ensemble_id"/>
       </datadir>

       <datadir name="my_model_runs_optimized" file="/data/mruns/" suffix=".ctl" format="ctl"
                doc="http://www.some.edu/my_online_data/mrun_info.html"
                direct_subset="true" source_suffix="dat" />
       <mapdir name="private_data">
         <datalist file="secret_datasets.lst"/>
       </mapdir>
    </data>
  </catalog>
  <log mode="rotate" frequency="week">
    <log_override module="grads/invoker" level="verbose"/>
  </log>

  <grads>
    <invoker grads_dir="/home/jdoe/grads/" time_limit="600"/>
    <analyzer storage="250" time_limit="60"/>
    <dods subset_size="2000" />
    <uploader/>
  </grads>
  <mapper>
    <service-admin auth="sDFe294f3nv034u8"/>
  </mapper>
  <privilege_mgr default="public">
    <ip_range mask="127.0.0.1" privilege="full" />
    <ip_range mask="192.168" privilege="full" />
    <privilege name="full" /> <!-- no restrictions -->
    <privilege name="public"
               analyze_allowed="false"
               abuse_hits="1000"
               abuse_timeout="24">
      <deny path="/private_data"/>
    </privilege>
  </privilege_mgr>
  <servlet>
    <filter-overload limit="20" />
  </servlet>
</gds>  



Tag hierarchy

The following table shows the tag structure of the configuration file. This corresponds to the runtime structure of the server. The top level tag must always be <gds>. Each tag must be contained by the tag immediately to its left, and can contain any of the tags to its right. Tags followed by * can appear multiple times.

<gds>  <catalog>        <data>          <dataset>*
                                        <datalist>*
                                        <datadir>*   <metadata>
                                                     <metadata-filter>
                                        <mapdir>*    <dataset>*
                                                     <datadir>*
                                                     <datalist>*
                                                     <mapdir>*
       <log>            <log_override>
       <grads>          <invoker>
                        <analyzer>
                        <uploader>
                        <dods>
       <servlet>        <filter-*>
       <mapper>         <service-*>
       <privilege_mgr>  <ip_range>*
                        <privilege>*    <allow>*
                                        <deny>*



Tag definitions

An alphabetical list of the tags used in the configuration file, and the attributes that can be set for each tag.

<allow> Contained by <privilege>. Allows access to data objects for this privilege set. Used to partially or completely override a <deny>. Can in turn be partially or completely overridden by another <deny>.

path The path for data objects to be affected. Access will be allowed to any data objects whose path matches (starts with) the path given, unless the data object also matches a <deny> tag with a more specific path. Also see inherit in <privilege>.
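
As a rough illustration of how <allow> and <deny> interact, the fragment below denies everything under one path and then re-opens a more specific sub-path. The privilege name and both paths are hypothetical and only show the matching rule described above.

```xml
<privilege name="public">
  <!-- Deny every data object whose path starts with /private_data ... -->
  <deny path="/private_data"/>
  <!-- ... except objects under this more specific path (hypothetical name) -->
  <allow path="/private_data/shared"/>
</privilege>
```

Because the <allow> path is the more specific of the two, it should take precedence for objects under /private_data/shared, while the rest of /private_data stays denied.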
 
<analyzer> Contained by <grads>. Configuration for performing analysis tasks

storage Maximum size allowed for an analysis result, in kilobytes. Default is 0, no limit. Can be overridden by analyze_storage in <privilege>.
time Maximum time an analysis task is allowed to run before it is aborted, in seconds. Default is 600 sec (10 minutes). If set to 0 (no limit), the time_limit setting in <invoker> is used instead. Can be overridden by analyze_time in <privilege>.
 
<catalog> Contained by <gds>. Configuration information for the server's catalog of data entries
     contains: <data>
temp_entries_limit Maximum number of temporary entries (uploads or analysis results) that the server should keep in its cache. Default is 0, no limit.
temp_storage_limit Maximum disk space the server should use for caching temporary entries, in megabytes. Default is 0, no limit.
temp_age_limit Amount of time after which a temporary entry should expire from the cache, in hours. Default is 0, no limit.
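
A minimal sketch of the three cache limits used together. The numeric values are arbitrary examples, not recommendations, and the <dataset> entry is reused from the example configuration earlier in this document.

```xml
<!-- Illustrative limits only: at most 500 temporary entries, 1024 MB of
     cache space, and expiry after 48 hours -->
<catalog temp_entries_limit="500" temp_storage_limit="1024" temp_age_limit="48">
  <data>
    <dataset name="my_data" file="/data/my_data.ctl" format="ctl"/>
  </data>
</catalog>
```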
 
<data> Contained by <catalog>. List of data objects to be served.
contains: <dataset> <datadir> <datalist> <mapdir>
 
<datadir> Contained by <data>, <mapdir>. Specifies a directory in which to search for data objects

name This will be prefixed to the online name of all data objects in this directory. (optional - default is the filename of the directory)
  file The local filename for this directory
  recurse If set to "true", all subdirectories will also be searched. (optional - default is true)
  prefix Only files whose names begin with the prefix will be loaded (optional)
  suffix Only files whose names end with the suffix will be loaded (optional)
  source_prefix Used with direct_subset. The server will replace the value of prefix with this string to construct the datafile name out of the descriptor file name.
  source_suffix Used with direct_subset. The server will replace the value of suffix with this string to construct the datafile name out of the descriptor file name.
  doc see <dataset>
  das see <dataset>
  direct_subset see <dataset>
  format see <dataset>
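
The filtering attributes can be combined. The sketch below assumes a hypothetical naming scheme in which the files of interest start with "run_", and it turns off the default recursion into subdirectories.

```xml
<data>
  <!-- Serve only files named run_*.nc from the top level of /data/mruns/ -->
  <datadir name="my_model_runs" file="/data/mruns/"
           prefix="run_" suffix=".nc" recurse="false" format="nc"/>
</data>
```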
 
<datalist> Contained by <data>, <mapdir>. Specifies a file containing a list of data objects
name This will be prefixed to the online name of all data objects in the list. (optional)
list_format

The format of the list. Available options are:
"file": each line contains only a filename (default)
"name": each line contains an online name, followed by a filename (separated by whitespace)

doc see <dataset>
das see <dataset>
format see <dataset>
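
A hedged example of a <datalist> entry using the "name" list format. The file attribute is taken from the example configuration earlier in this document rather than from the attribute list above, and the list filename and online name prefix are hypothetical.

```xml
<data>
  <!-- list_format="name": each line of the list file holds an online name,
       then whitespace, then a filename -->
  <datalist name="archive" file="/data/lists/archive.lst"
            list_format="name" format="nc"/>
</data>
```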
 
<dataset> Contained by <data>, <mapdir>. Specifies a single data object to be served.

name The online name for the data object. (optional - default is the portion of the file or URL after the last "/")
  file A filename, if the data object is locally stored
  url An OPeNDAP URL if the data object is remotely stored
  source Used with direct_subset. The file containing the actual data that corresponds to the descriptor file.
  doc A URL pointing to documentation for this dataset (optional)
  das Location of a supplemental DAS, which will be merged with the auto-extracted attributes for this data object. (optional)
  direct_subset

Setting this attribute to true enables a mode in which the GDS reads directly from the datafile, rather than invoking GrADS as an intermediary, for subsetting operations. This provides a considerable performance gain. However, this feature requires that the data be stored in a very simple layout: regular grids of big-endian IEEE single-precision floating point data.

In order to ensure this, the format attribute must be "ctl", and the descriptor file specified must contain the record OPTIONS big_endian with no other options. It also must not contain the records DTYPE, FILEHEADER, XYHEADER, or THEADER. Finally, the units field for all variables must be 99. If direct_subset is set to "true" and any of these conditions are not met, the dataset will fail to load.

The filename in the DSET record of the descriptor file is ignored. The name of the file containing the data must be specified using source for <dataset> or source_prefix and source_suffix for <datadir>.
Default is "false".

  format

The storage format of the data object. The GDS uses this setting in two ways: firstly, to determine which GrADS binary to invoke in order to open the data set. Secondly, if the format is "ctl", the GDS will parse the descriptor file directly to obtain required metadata. For the other formats, since the descriptor file may contain partial metadata, or not exist at all, the GDS must invoke GrADS to generate a metadata listing. Valid settings:
"ctl" : GrADS described data (includes sequential, GRIB, BUFR and station data) (default)
"nc": netCDF (including data accessed via an XDF descriptor file)
"hdf": HDF-SDS (including data accessed via an XDF descriptor file)
"dods": OPeNDAP URL (including data accessed via an XDF descriptor file)
"opendap": this option is equivalent to "dods"

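The following sketch shows two <dataset> entries under the attribute definitions above: one local data object using direct_subset, and one remote data object. The data filename /data/my_data.dat and the remote URL are hypothetical placeholders.

```xml
<data>
  <!-- Local descriptor file read directly, bypassing GrADS for subsetting;
       source names the file that holds the actual data -->
  <dataset name="my_data" file="/data/my_data.ctl" format="ctl"
           direct_subset="true" source="/data/my_data.dat"/>
  <!-- Remotely stored data object reached through an OPeNDAP URL -->
  <dataset name="remote_data" url="http://www.some.edu/dods/remote_data"
           format="dods"/>
</data>
```

The first entry will only load if the descriptor file meets the layout conditions listed under direct_subset.
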
 
<deny> Contained by <privilege>. Denies access to data objects for this privilege set

path

The path for data objects to be affected. Access will be denied to any data objects whose path matches (starts with) the path given, unless the data object also matches an <allow> tag with a more specific path. Also see inherit in <privilege>.

<dods> Contained by <grads>. Configuration for fulfilling OPeNDAP requests

subset_size Maximum size allowed for a subset operation, in bytes
buffer_size Size of buffer used to stream subset data to the network, in bytes
 
<filter-*> Contained by <servlet>. Tags of this type contain configuration information for the request filters. To configure a filter named X, create a tag of the form <filter-X> with the attributes you wish to set.
filters that can be configured
abuse blocks excessive hits from a specific IP address
analysis performs analysis tasks for requests that include them
overload rejects requests when the server is under heavy load

generic attributes:
enabled If set to "false", the filter will simply pass all requests through, taking no action (default is "true")
filter-specific attributes:
hits Applies to "abuse" filter. Specifies the number of hits to allow per hour from the same IP. Default is 0 (no limit). Can be overridden by abuse_hits in <privilege>.
timeout Applies to "abuse" filter. Specifies how long to deny access after an IP exceeds the hit limit, in hours. Default is 24. Can be overridden by abuse_timeout in <privilege>.
limit Applies to "overload" filter. Specifies the maximum number of simultaneous requests to allow. Default is 0 (no limit).
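
Putting the filter tags together, a <servlet> section might look like the sketch below. The values are illustrative only.

```xml
<servlet>
  <!-- Allow 1000 hits per hour per IP, then deny access for 24 hours -->
  <filter-abuse hits="1000" timeout="24"/>
  <!-- Reject requests beyond 20 simultaneous ones -->
  <filter-overload limit="20"/>
  <!-- Pass requests through without performing analysis tasks -->
  <filter-analysis enabled="false"/>
</servlet>
```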
 
<gds> Top-level tag. Contains all configuration information for the server
  contains: <catalog> <log> <grads> <servlet> <mapper> <privilege_mgr>
  name A descriptive name for this server installation, which will be used in dynamically generated web pages.
  home The URL for a home page for this server. The GDS will put a link to this page on every Web page it serves. The page you point to with this setting should describe the purpose of/data served by this GDS, and include a link back to the dataset listings. If possible it should also provide a way to contact the server administrator. Default is the home page provided with the GDS (e.g. http://hostname:9090/index.html).
 
<grads> Contained by <gds>. Configuration for the tool used to access, analyze, and store data
contains: <invoker> <uploader> <analyzer> <dods>
 
<invoker> Contained by <grads>. Configuration for invoking GrADS as an external process

grads_dir The path of a full GrADS distribution
grads_bin The path to a single GrADS executable. Use this if a full distribution is not available.
time_limit Maximum time a GrADS process is allowed to run before it is aborted, in seconds. This is solely intended as a safeguard against GrADS unexpectedly hanging, and should be set to several minutes or more. Default is 300 sec (5 minutes).
 
<ip_range> Contained by <privilege_mgr>. Assigns privileges according to the IP address of the request

mask A partial IP address. Requests will be given privileges according to the ip_range with the most specific mask that matches. Setting mask to "" sets the global privilege level, which will be given to any request that does not match another ip_range.
privilege name of the set of privileges to grant to this IP range
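
To illustrate the "most specific mask wins" rule, the sketch below defines a global level, a broader range, and a single address inside that range. The addresses and privilege names are examples only.

```xml
<privilege_mgr>
  <!-- Empty mask: global level for requests matching no other ip_range -->
  <ip_range mask="" privilege="public"/>
  <!-- Every address starting with 192.168 gets full privileges ... -->
  <ip_range mask="192.168" privilege="full"/>
  <!-- ... except this one host, whose mask is more specific -->
  <ip_range mask="192.168.1.50" privilege="public"/>
  <privilege name="full"/>
  <privilege name="public" analyze_allowed="false"/>
</privilege_mgr>
```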
 
<log> Contained by <gds>. Configuration information for the server logger (which functions independently of any logging in Tomcat or the JVM)
contains: <log_override>

mode

Default is "file". Values are:
"console": all messages will be written to standard output
"file": all log messages will go to a single file
"rotate": log messages will be written to a rotating collection of files, with a new file being rotated in according to the frequency attribute

file In file mode, the name of the log file. In rotate mode, the rotating log file names will be this file name plus a date identifier. Default is "log/gds.log".
frequency

The frequency of rotation when logging mode is set to "rotate". Default is "month". Values are:
"month": monthly
"week": weekly
"day": daily

level

Level of detail to log. Default is "info". Values are:
"debug": extremely detailed output
"verbose": detailed output
"info": major events and errors (default)
"error": error messages only
"critical": server-critical errors only

print_mem If equal to "true", the available heap space for the JVM will be printed with each log entry. Default is "false".
print_module If equal to "true", the name of the module generating the message will be printed. Default is "false".
date_format A template for the date portion of log entries. The template should use the format supported by the java.text.SimpleDateFormat class (see Java 2 API Documentation). If omitted, the logger will use its default format.
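
A sketch of a <log> tag that sets most of the attributes above. The date_format value is one possible java.text.SimpleDateFormat pattern, chosen here only as an example.

```xml
<!-- Daily rotating log files based on log/gds.log, with module names shown -->
<log mode="rotate" file="log/gds.log" frequency="day" level="verbose"
     print_module="true" date_format="yyyy-MM-dd HH:mm:ss"/>
```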
 
<log_override> Contained by <log>. Used to set a different level of logging for a specific module. This avoids generating excessive log entries when debugging a specific module.

module

The full name of the module, omitting the initial "gds/".

There are two easy ways to determine module names. Firstly, most of the configuration tags are in fact module names. For instance, the analysis filter is named "filter-analysis". Because it is owned by the "servlet" module, its full name is "servlet/filter-analysis". Secondly, when <log> setting print_module is enabled, module names can be obtained by looking at existing log entries.

level see <log>
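
For instance, to keep general logging quiet while debugging only the analysis filter named above, a configuration along these lines could be used:

```xml
<log level="error">
  <!-- Full module name gds/servlet/filter-analysis, minus the initial gds/ -->
  <log_override module="servlet/filter-analysis" level="debug"/>
</log>
```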
 
<mapdir> Contained by <data>, <mapdir>. Use to put various data objects under a single online path. Any number of <dataset>, <datadir>, <datalist> and <mapdir> tags may be nested inside this tag.
contains: <dataset> <datadir> <datalist> <mapdir>

name This will be prefixed to the online name of all data objects inside this tag.
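
A small sketch of nested <mapdir> tags. The names "models" and "experimental" are hypothetical; the dataset and directory entries are reused from the example configuration earlier in this document.

```xml
<data>
  <mapdir name="models">
    <!-- The online name of this object is prefixed with "models" -->
    <dataset name="my_data" file="/data/my_data.ctl" format="ctl"/>
    <mapdir name="experimental">
      <!-- Objects found here are prefixed by both enclosing mapdir names -->
      <datadir name="my_model_runs" file="/data/mruns/" suffix=".nc" format="nc"/>
    </mapdir>
  </mapdir>
</data>
```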
 
<mapper> Contained by <gds>. Configuration for the mapper that assigns each request to a particular service
contains: <service-*>
 
<metadata> Contained by <dataset>, <datadir> or <datalist>. Specifies a metadata attribute to be added to the datasets generated from the parent tag. These metadata attributes are not affected by <metadata-filter> tags.

name The name of the metadata attribute.
var The name of the variable this attribute is associated with (blank for global attributes) (default is global)
type The OPeNDAP 2 type: Byte, Int16, UInt16, Int32, UInt32, Float32, Float64, String, or URL. (default is String)
value The value of the attribute. This can be a single number; a space-separated list of numbers; or one or more lines of text for the URL and String types.
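
The sketch below adds one global attribute and one variable attribute. The variable name "t", the attribute name "valid_range", and its values are hypothetical.

```xml
<dataset name="my_data" file="/data/my_data.ctl" format="ctl">
  <!-- Global String attribute (no var given, default type) -->
  <metadata name="model_version" value="1.0.0"/>
  <!-- Space-separated numeric list attached to a hypothetical variable "t" -->
  <metadata name="valid_range" var="t" type="Float32" value="180.0 330.0"/>
</dataset>
```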
 
<metadata-filter> Contained by <dataset>, <datadir> or <datalist>. Specifies which metadata attributes contained in the datasets generated by the parent tag should be sent to the client. By default, no attributes are sent; metadata filters with send="true" must be created to include dataset attributes. COARDS attributes (other than the global "title" attribute, and "long_name" and "units" for data variables) are always generated by the GDS and cannot be sent from the dataset. GrADS 1.9 is required to use this feature.
send If true, attributes that match this filter will be sent. If false, attributes that match this filter will not be sent. If an attribute matches both types of filter, it is not sent. (default is true)

global_only If set, the attribute must be global (not associated with a particular variable) to match the filter.
var_prefix If set, attributes must be associated with a variable whose name starts with the given string, in order to match the filter.
var_suffix If set, attributes must be associated with a variable whose name ends with the given string, in order to match the filter.
var_name If set, attributes must be associated with the specified variable, in order to match the filter.
att_prefix If set, the attribute's name must start with the given string in order to match the filter.
att_suffix If set, the attribute's name must end with the given string in order to match the filter.
att_name If set, the attribute must have the given name in order to match the filter.
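
A hedged sketch combining filters of both types. It assumes that global_only is enabled with the value "true", which the text above does not state explicitly, and the "internal_" prefix is hypothetical.

```xml
<datadir name="my_model_runs" file="/data/mruns/" suffix=".nc" format="nc">
  <!-- Send every global attribute ... -->
  <metadata-filter send="true" global_only="true"/>
  <!-- ... except those whose names start with internal_ -->
  <metadata-filter send="false" global_only="true" att_prefix="internal_"/>
  <!-- Also send ensemble_id (send defaults to true) -->
  <metadata-filter att_name="ensemble_id"/>
</datadir>
```

A global attribute named, say, internal_notes matches both the first and second filters, so under the rule for send it would not be sent.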
 
<privilege> Contained by <privilege_mgr>. A set of privileges that can be associated with an IP, or used as a baseline for defining more specific sets of privileges. All attributes except "name" are optional.
contains: <allow> <deny>
name Name of this privilege set
inherit

The name of another <privilege> to inherit settings from.
Settings of the "parent" <privilege> are inherited by this one, unless specifically overridden. Any <allow> and <deny> tags in this privilege set are merged with those of the parent, with precedence going to the "child" in cases where both an <allow> and a <deny> are found for the same path.

abuse_hits If set, overrides hits in <filter-abuse>
abuse_timeout If set, overrides timeout in <filter-abuse>
admin_allowed If true, allows administration requests for this privilege set. Default is false.
analyze_allowed If other than "true", turns off analysis capability for this privilege set. Default is true.
analyze_time If set, overrides time in <analyzer>
analyze_storage If set, overrides storage in <analyzer>
dods_subset_size If set, overrides subset_size in <dods>
upload_allowed If other than "true", turns off upload capability for this privilege set. Default is true.
upload_storage If set, overrides storage in <uploader>
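
As an illustration of inherit, the sketch below derives a hypothetical "partner" privilege set from "public". The name "partner" and all numeric limits are examples only.

```xml
<privilege_mgr default="public">
  <privilege name="public" analyze_allowed="false" upload_allowed="false">
    <deny path="/private_data"/>
  </privilege>
  <!-- Inherits the public settings, then re-enables analysis with limits
       and overrides the parent's deny for the same path -->
  <privilege name="partner" inherit="public"
             analyze_allowed="true" analyze_time="120" analyze_storage="500">
    <allow path="/private_data"/>
  </privilege>
  <ip_range mask="192.168" privilege="partner"/>
</privilege_mgr>
```

Here upload_allowed is not mentioned in "partner", so it should keep the inherited value of "false".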
 
<privilege_mgr> Contained by <gds>. Configuration for the privilege manager, which assigns a set of privileges to each request
contains: <ip_range> <privilege>
default The name of the default <privilege>, to be assigned to requests that do not match any specified <ip_range>. If this attribute is omitted, a blank privilege set will be created and used as the default.
 
<service-*> Contained by <mapper>. Tags of this type contain configuration information for the network services that the server provides to clients. To configure a service named X, create a tag of the form <service-X> with the attributes you wish to set.
services that can be configured
admin Performs administrative functions. See Web-based administration in the Administrator's guide.
ascii Sends ASCII-format subsets
das Sends OPeNDAP Data Attribute Structures
dds Sends DODS Data Descriptor Structures
dir Sends a directory of data objects in HTML. This is the default if no extension is present after a directory name in a request URL.
dods Sends OPeNDAP binary format subsets
help Sends a message providing links to user help.
info Sends a data object summary in HTML. This is the default if no extension is present after a dataset name in a request URL.
upload Receives uploaded data
xml Sends complete data object catalog in XML
generic attributes
enabled If set to "false", the service will not be made available. Default is "true" for all services.
service-specific attributes
auth Applies to "admin" service. Specifies the authorization code that must be provided to access the service. Admin service will not be enabled unless this is set.
timeout Applies to "admin" service. Specifies the length of time in seconds that the "reload" command should wait for the server to become idle, before giving up.
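
A sketch of a <mapper> section that enables the admin service and switches off two others. The auth value is a placeholder to be replaced, and the timeout of 30 seconds is an arbitrary example.

```xml
<mapper>
  <!-- Replace the placeholder with your own authorization code -->
  <service-admin auth="CHANGE_ME" timeout="30"/>
  <!-- Services are enabled by default; disable the ones not wanted -->
  <service-upload enabled="false"/>
  <service-ascii enabled="false"/>
</mapper>
```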
 
<servlet> Contained by <gds>. Configuration for the servlet interface
contains: <filter-*>
 
<uploader> Contained by <grads>. Configuration for handling uploaded data
   udfread Location of the GrADS udfread utility, used to unpack uploaded data
  storage Maximum size allowed for an upload, in kilobytes
 
 