
Wasmer Consulting NMPlot User's Guide Introduction to Grids

Imagine that you are assigned the task of creating a navigational chart for a lake. The chart must accurately display the lake's depth at any location, so boaters can avoid dangerously shallow areas. However, the depth of the lake has never been surveyed, so you will need to measure it yourself.

You get a boat, sail to various points in the lake, and measure the depth. At each point, you carefully mark your location on a map, and write the measured depth next to it. You have created a model of the varying depth of the lake by sampling it at discrete points.

In the terminology of NMPlot, you have created a *grid*.

When you measure something that can be expressed as a single number (the length of a rope, the weight of a brick), you are making a *scalar measurement*.

Often, a quantity being measured will vary depending on exactly where you measure it. Consider the depth of a lake. It varies smoothly, starting from zero at the shoreline, typically increasing towards the center. If you measure the depth near the shore, you will get a much different result than if you measure it near the center. The depth is a *scalar field*: a measurable scalar quantity that varies as a function of the measurement location. The world is full of scalar fields. Examples include the water temperature of the sea and the elevation of the Earth's surface above sea level.

Engineers and scientists often want to record and analyze scalar fields. This can rarely be done with perfect accuracy. Usually, the best that can be done is to record measurements made at a number of discrete locations. If enough measurement locations are used, the field can be recorded to an acceptable level of accuracy. Such a set of measurements, along with any auxiliary information (such as the units of measurement), is defined as a *grid*. Each measurement is called a *grid point*, or alternatively, a *data point*.

Using a grid, it is possible to compute a *data value* at any point by interpolating or extrapolating from the grid's data points. The data value is the best estimate, using the information in the grid, of the value of the scalar field at that location. If the distribution of grid points is adequate, the data value will be close to the scalar field's actual value.
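As an illustration of computing a data value from scattered grid points, the following is a minimal sketch using inverse distance weighting. This guide does not say which interpolation method NMPlot uses, so the technique, the function name `idw_value`, and the sample depths are all assumptions chosen for illustration.

```python
import math

def idw_value(points, x, y, power=2.0):
    """Estimate the data value at (x, y) by inverse distance weighting.

    points is an iterable of (px, py, value) tuples. This is an
    illustrative technique, not necessarily what NMPlot does.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            # The query location coincides with a grid point.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one grid point is required")
    return weighted_sum / weight_total

# Hypothetical lake depths (x, y, depth) at the corners of a square.
depths = [(0.0, 0.0, 2.0), (10.0, 0.0, 4.0), (0.0, 10.0, 6.0), (10.0, 10.0, 8.0)]
center_depth = idw_value(depths, 5.0, 5.0)
print(center_depth)
```

Because the query point is equally distant from all four samples, the estimate is simply their average. Nearer points dominate as the query location moves toward them.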

The term grid comes from the fact that measurements are often taken at regularly spaced locations defined by a two- or three-dimensional rectangular grid. However, this is not a requirement. Sets of measurements taken at arbitrary locations are also considered grids.

NMPlot works with a specific type of grid: one that characterizes a scalar field that varies with geographic location. Such grids are two-dimensional: a measurement location can be specified by two numbers. Furthermore, the grids are *georeferenced*, which means that we know where on the surface of the Earth the measurement points are located. In practice, this means that it is possible to determine the longitude and latitude of each measurement location.

NMPlot cannot process three-dimensional grids, such as a set of air temperatures measured throughout the atmosphere. It can, however, work with a series of grids measuring the air temperature at a height of 1000 feet, 2000 feet, etc.

The following are all examples of grids that NMPlot can work with.

The level of an air pollutant measured at sampling stations located throughout a city.

The predicted noise level surrounding an airport, as calculated by an airport noise model.

The level of insect infestation observed at farms located throughout a state.

Tomorrow's high temperatures at various cities, as predicted by a weather forecast.

The relative humidity measured at 1000 feet above ground level.

A grid is composed of the following parts.

Certainly, the most important part of a grid is the set of data points. Each data point consists of three numbers: two for the geographic location of the point on the Earth, and one for the data value at that location.

In practice, the location of each data point may not be explicitly stored in the grid. For example, the data points may be stored as a two-dimensional regular grid. In this case, only the dimensions and location of the grid need to be specified. The location of individual data points can be calculated from this information. Of course, if the data points are irregularly distributed, with no pattern to their locations, then the location of each point must be explicitly included in the grid.
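The calculation of point locations in a regular grid can be sketched as follows. The parameter names (`origin_x`, `origin_y`, `dx`, `dy`) and the row-by-row storage layout are assumptions made for this example, not a description of any NMPlot file format.

```python
def regular_grid_location(origin_x, origin_y, dx, dy, col, row):
    """Return the (x, y) location of the data point at (col, row).

    origin_x, origin_y locate the point at column 0, row 0;
    dx and dy are the spacings between columns and between rows.
    """
    return origin_x + col * dx, origin_y + row * dy

def regular_grid_points(origin_x, origin_y, dx, dy, values):
    """Yield explicit (x, y, value) tuples from a list of rows of values."""
    for row, row_values in enumerate(values):
        for col, value in enumerate(row_values):
            x, y = regular_grid_location(origin_x, origin_y, dx, dy, col, row)
            yield x, y, value

# A hypothetical 2 x 2 grid: only the origin, spacing, and values are stored.
explicit_points = list(
    regular_grid_points(100.0, 200.0, 50.0, 25.0, [[1, 2], [3, 4]])
)
print(explicit_points)
```

Storing only the origin and spacing keeps a large regular grid compact, since the locations never need to be written out point by point.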

A grid's coordinate system specifies how to interpret geographic coordinates. It georeferences the grid, defining how to relate a given geographic coordinate in the grid with a location on the Earth.

Examples of coordinate systems include:

Longitude and latitude

Universal Transverse Mercator (UTM)

Meters east and north of the Empire State Building
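To illustrate how a local coordinate system such as "meters east and north of a landmark" can be related to longitude and latitude, here is a rough sketch. It assumes a spherical Earth and a small-offset approximation, which is only reasonable over short distances. The function name, the Earth radius constant, and the approximate landmark coordinates are assumptions for this example and do not describe how NMPlot performs the conversion.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation

def local_offset_to_lonlat(east_m, north_m, origin_lon, origin_lat):
    """Convert meters east/north of an origin to (longitude, latitude) in degrees.

    Uses a small-offset approximation on a sphere, so accuracy degrades
    as the distance from the origin grows.
    """
    lat = origin_lat + math.degrees(north_m / EARTH_RADIUS_M)
    # A degree of longitude shrinks with the cosine of the latitude.
    parallel_radius = EARTH_RADIUS_M * math.cos(math.radians(origin_lat))
    lon = origin_lon + math.degrees(east_m / parallel_radius)
    return lon, lat

# Approximate coordinates of the Empire State Building (assumed values).
ORIGIN_LON, ORIGIN_LAT = -73.9857, 40.7484

lon, lat = local_offset_to_lonlat(500.0, 1000.0, ORIGIN_LON, ORIGIN_LAT)
print(lon, lat)
```

A production tool would normally rely on a proper map projection library rather than this approximation.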

A grid's *data metric* specifies what is being measured by the data points in the grid, and what physical units are used.

Examples of data metrics include:

Air temperature, measured in °F

Noise level, measured in dB

Grasshopper population density, measured in grasshoppers/m²

A grid's *valid data limits* define the minimum and maximum data levels that are considered valid. Any data points with values outside these limits are considered invalid, and should not be used in calculations.

If the data points in a grid are stored as explicit location-value pairs, invalid data can simply be excluded from the grid. However, if the data points are stored as a two-dimensional rectangular grid, there is no easy way to exclude invalid points. In this case, invalid points are given an extreme value, far beyond the limits of actual valid data, and the valid data limits are set so that the extreme value is recognized as invalid.

Invalid data points are common. It is rare that a large data set is collected without error. Instruments malfunction. Mistakes are made.

Invalid data points can also be generated by computer models. As an example, consider NOISEMAP, the United States Air Force's airport noise model. NOISEMAP writes a large two-dimensional grid of predicted noise levels. For data points located far from where airplanes fly, NOISEMAP does not perform calculations, but simply writes the value -50 dB into the grid. This value is invalid, in the sense that it does not represent a prediction. It simply means that no calculations were performed at those points.
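The use of valid data limits with a rectangular grid can be sketched as follows. The limits of 0 to 150 dB and the sample noise levels are assumptions chosen for illustration; only the -50 dB placeholder comes from the NOISEMAP example above.

```python
def mask_invalid(values, valid_min, valid_max):
    """Return a copy of a rectangular grid with invalid entries set to None."""
    return [
        [v if valid_min <= v <= valid_max else None for v in row]
        for row in values
    ]

def mean_of_valid(values, valid_min, valid_max):
    """Average only the data values that fall inside the valid data limits."""
    valid = [v for row in values for v in row if valid_min <= v <= valid_max]
    if not valid:
        raise ValueError("grid contains no valid data points")
    return sum(valid) / len(valid)

# Hypothetical noise grid; -50.0 marks points where nothing was calculated.
noise_db = [
    [-50.0, 55.0, 60.0],
    [-50.0, 65.0, 70.0],
]
masked = mask_invalid(noise_db, 0.0, 150.0)
average = mean_of_valid(noise_db, 0.0, 150.0)
print(masked)
print(average)
```

Without the limits, the two placeholder values would drag the average far below any level that was actually predicted.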

*Tip:*

An informal convention in science and engineering is to use large positive or negative numbers consisting entirely of the digit nine (-999, 9999, etc.) to denote invalid data. Be suspicious of any data of this form.
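A quick check for values of this form might look like the following sketch. The function name and the three-digit minimum are assumptions, and a match only flags a value for inspection. It does not prove the value is invalid.

```python
import math

def looks_like_nines_placeholder(value, min_digits=3):
    """Return True if value is a whole number written entirely with the digit 9.

    Examples that match: -999, 9999. The sign is ignored.
    """
    if not math.isfinite(value) or value != int(value):
        return False
    digits = str(abs(int(value)))
    return len(digits) >= min_digits and set(digits) == {"9"}

readings = [72.5, -999, 68.0, 9999.0, 1999]
suspicious = [r for r in readings if looks_like_nines_placeholder(r)]
print(suspicious)
```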

The *defined area polygon* is the area inside which a grid accurately records a scalar field. A grid is a record of a continuous, smoothly varying scalar field, made by measuring it at discrete points. Using the grid, you can compute an approximation of the original scalar field at any location by interpolating or extrapolating from the measured points. The defined area polygon specifies the area where it is wise to do so. It is the portion of the Earth's surface where a grid adequately describes a scalar field.

Any calculations using the grid should be restricted to locations inside the defined area polygon. For example, if a contour plot of the grid is displayed, no contours should be drawn outside of the defined area.
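One way to restrict calculations to the defined area is to test each location against the polygon first. The sketch below uses the standard ray casting test. The function name and the square polygon are assumptions for illustration, and locations exactly on the boundary may be classified either way.

```python
def point_in_polygon(x, y, polygon):
    """Return True if (x, y) lies inside polygon, a list of (x, y) vertices.

    Ray casting: count how many polygon edges a ray extending in the
    +x direction from the point crosses. An odd count means inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical defined area polygon: a 10 x 10 square.
defined_area = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(point_in_polygon(5.0, 5.0, defined_area))   # inside the defined area
print(point_in_polygon(15.0, 5.0, defined_area))  # outside the defined area
```

A contouring routine could skip any location for which this test returns False.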

In practice, extrapolation is often unreliable. Therefore, as a rule of thumb, the defined area polygon equals the area where it is mathematically possible to perform interpolation. This area is known as the *convex hull* of the data points. The convex hull is the smallest convex polygon that completely encloses a set of data point locations.
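For reference, the convex hull of a set of data point locations can be computed with Andrew's monotone chain algorithm, sketched below. This is a standard textbook method and is not claimed to be the one NMPlot uses. The function name and sample points are assumptions.

```python
def convex_hull(points):
    """Return the convex hull of (x, y) points as a counterclockwise vertex list.

    Andrew's monotone chain algorithm. Collinear points on the hull
    boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when the turn o -> a -> b is counterclockwise.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

# Hypothetical data point locations: four corners plus two interior points.
locations = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(locations)
print(hull)
```

The two interior points do not appear in the result, since they lie inside the polygon formed by the four corners.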

In some cases, even interpolation is unreliable. Consider the arrangement of data points in the following figure. While it is possible to interpolate values in the area marked A, it may be unwise to do so, as the results may be inaccurate. Most likely, you would like interpolation to be restricted to the area surrounded by the dashed line.

Figure: An arrangement of data points for which the convex hull would be an inappropriate defined area polygon. Although we could interpolate between the data points in the area marked A, it may be unwise to do so, as it may yield inaccurate results. Most likely, we would like the defined area polygon to equal the area marked by the light gray dashed line.

If we draw contours based upon this set of data points, they should be restricted to the defined area polygon, as shown in the following figure.

Figure: Contours should be clipped to the defined area polygon (i.e., the area in which interpolation can be reliably performed).

The defined area polygon is an optional part of a grid. If it is not present, then the default defined area polygon, defined by the convex hull of the data points, should be assumed.

Geographic annotations comprise any relevant map data stored in a grid. While optional, such data can be of value in working with the grid. For example, in a grid of air pollutant readings around an industrial site, it would be useful to include the site boundary. A plotting program could then draw the boundary when creating a contour plot of the grid, and an analysis program could warn if any off-site readings were above the legal limit.

Such map data can also be stored in an external file. Storing it directly in a grid is simply an organizational convenience. As a rule of thumb, only map data that is particularly relevant to the grid should be stored internally. Extensive general-purpose map data should be stored in external files.

Bookkeeping information consists of any additional information that may be useful to include with a grid. Examples of bookkeeping information include:

A detailed description of the data in the grid

The time and date when the grid was created

Contact information for the person or organization responsible for the grid

For measured data, a description of the instruments and procedures used to collect the data

For data produced by a computer model, a description of the model, and of any parameters and input files used to run it
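The parts of a grid described above can be gathered into a single data structure. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical and do not reflect NMPlot's actual file format or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class DataPoint:
    x: float       # first geographic coordinate (e.g. longitude or meters east)
    y: float       # second geographic coordinate (e.g. latitude or meters north)
    value: float   # measured or predicted data value

@dataclass
class Grid:
    points: List[DataPoint]
    coordinate_system: str   # e.g. "Longitude and latitude"
    metric_name: str         # what is being measured
    metric_units: str        # physical units of the data values
    valid_min: float         # valid data limits
    valid_max: float
    # Optional parts; None means "assume the convex hull of the data points".
    defined_area: Optional[List[Tuple[float, float]]] = None
    annotations: List[str] = field(default_factory=list)
    bookkeeping: Dict[str, str] = field(default_factory=dict)

    def valid_points(self) -> List[DataPoint]:
        """Return only the data points inside the valid data limits."""
        return [p for p in self.points if self.valid_min <= p.value <= self.valid_max]

example = Grid(
    points=[DataPoint(0.0, 0.0, 55.0), DataPoint(1.0, 0.0, -50.0), DataPoint(0.0, 1.0, 62.0)],
    coordinate_system="Meters east and north of a reference point",
    metric_name="Noise level",
    metric_units="dB",
    valid_min=0.0,
    valid_max=150.0,
    bookkeeping={"description": "Hypothetical example grid"},
)
print(len(example.valid_points()))
```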


Copyright © 1996-2003, Wasmer Consulting

Page URL: http://wasmerconsulting.com/nmplot_usersguide_introductiontogrids.htm

Webmaster e-mail: wasmer@wasmerconsulting.com