INTEGRATING IMAGE INTERPRETATION
AND
UNSUPERVISED CLASSIFICATION PROCEDURES
W.J. Kramber and A. Morse
Idaho Dept. of Water Resources
1301 N. Orchard St.
Boise ID 83706
ABSTRACT
Multispectral images can be classified into land use and
land cover maps using image processing (IP) classification or image interpretation
procedures. IP classification procedures are generally used because they are more
automated and offer better spectral discrimination than image interpretation
procedures. However, most IP classification programs analyze only spectral information,
and many land use and land cover classes have similar spectral properties, which
results in classification error. In contrast, an analyst can interpret spatial as well as
spectral information to distinguish land use and land cover classes. But image
interpretation involves much more time than IP classification procedures, so it usually is
not a practical approach to use for large study areas. An alternative hybrid method is to
integrate IP classification procedures with image interpretation procedures so both
spectral and spatial information are analyzed. The hybrid method involves stratifying the
image through image interpretation and on-screen digitizing. The image interpretation for
developing strata requires less precision and less time than a strict image interpretation
because strata are developed only for the classes needed, based on the spectral similarity
found between classes. An unsupervised classification is used to precisely define class
boundaries. The method integrates image interpretation and IP classification procedures by
overlaying the unsupervised classification with the strata developed from image
interpretation to produce a final classification. This hybrid method is illustrated with
two examples of land use and land cover mapping projects completed at the Idaho Department
of Water Resources (IDWR).
INTRODUCTION
Satellite data are now commonly used to develop land use
and land cover databases. IP classification and image interpretation are two procedures
that are used to analyze and classify satellite data. IP classification offers superior
spectral discrimination while image interpretation offers better spatial discrimination.
Most often, IP classification programs are used that apply
spectral pattern recognition algorithms to satellite data in a supervised, unsupervised,
or hybrid process. IP classification procedures used in remote sensing are based on the
premise that different land use and land cover classes have different multispectral
reflectance properties. The main advantage of IP classification over image interpretation
is that IP can discriminate subtle spectral differences that an analyst cannot interpret.
IP offers superior spectral discrimination because many bands of information can be
analyzed quickly and efficiently. The number of bands that can be processed is limited only
by the software, while an analyst can interpret only three raw or enhanced bands of
information at one time. The problem with IP classification programs is that most
analyze only spectral information, and some land use and land cover classes have similar or even
identical spectral reflectance properties, which leads to classification errors.
Another problem is that IP classification algorithms cannot
distinguish between land use and land cover (Dobson 1993). A map user may require that
classes be developed for different land use classes that actually have the same or similar
land cover. Because satellite sensors detect only reflectance from land cover, these
classes will be spectrally similar.
An image analyst can use image interpretation procedures to
analyze both spatial and spectral information to map land use and land cover classes. An
analyst will typically display a raw or enhanced image that best distinguishes the classes
to be mapped and then interpret the image and delimit the class boundaries by using screen
digitizing. But developing a detailed classification through image interpretation and the
associated screen digitizing is very tedious and takes much more time than IP classification
procedures, so it usually is not a practical approach for large study areas.
The best approach is to integrate the IP classification and
image interpretation procedures so that the strong points of each method can be used to
classify an image. This paper describes a practical method for integrating unsupervised
classification procedures with image interpretation procedures and shows how the
method was used in two land use and land cover mapping projects completed by the IDWR.
Study Areas
The first study area was the Snake River Plain, covering
72,000 square miles of Idaho (Figure 1). This area
was classified using fifteen Landsat MSS scenes acquired during July and August of 1986.
The classification was for a water rights adjudication, so the emphasis was on classifying
irrigated agriculture (Morse et al. 1990). Other classes were dryland agriculture,
wetlands, water, non-agricultural land (mostly rangeland and forest), and
other (usually clouds, shadows, and bad data).
The second study area was the Coeur d'Alene Basin in
northern Idaho (Figure 2). The basin covers 3,900
square miles. The basin was classified for the United States Geological Survey (USGS) for
use in a water quality management plan, using Landsat Thematic Mapper (TM) data acquired
during July of 1989. The classes developed were: dense urban and built-up land, sparse
urban and built-up land, irrigated agriculture and pasture, dryland agriculture and
pasture, rangeland, deciduous forest, conifer forest, sparse forest, recent clearcuts,
recovering clearcuts, water, wetlands, barren land, mined land, and clouds and shadows.
METHODS
The method used to classify the two study areas involved
the following steps: 1) data transformation, 2) unsupervised classification, 3) image
interpretation, and 4) integration of the image interpretation with the unsupervised
classification. The data transformation step was optional.
Data Transformation
Principal Component Analysis (PCA) is a statistical
procedure commonly used in remote sensing for enhancement and data compression (Jensen
1986). PCA was applied to the Landsat MSS data used in the Snake River Plain study to
reduce the dimensionality of the data from four original bands to two components while
retaining approximately 98% of the original variance. PCA reduced CPU time for clustering
and classification by reducing the volume of data.
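The share of variance retained by the leading components can be checked directly from the band covariance matrix. The following is a minimal sketch in plain Python, not the software used in the study: the function names, the power-iteration approach, and the synthetic four-band pixels are all assumptions made for illustration.

```python
import math

def covariance_matrix(pixels):
    """Band-by-band sample covariance of a list of pixel vectors."""
    n = len(pixels)
    bands = len(pixels[0])
    means = [sum(p[b] for p in pixels) / n for b in range(bands)]
    cov = [[0.0] * bands for _ in range(bands)]
    for p in pixels:
        d = [p[b] - means[b] for b in range(bands)]
        for i in range(bands):
            for j in range(bands):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov

def leading_eigenpair(m, iterations=500):
    """Largest eigenvalue and its unit eigenvector by power iteration."""
    size = len(m)
    v = [1.0 / (i + 1) for i in range(size)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(size)) for i in range(size)]
    eigenvalue = sum(v[i] * mv[i] for i in range(size))  # Rayleigh quotient
    return eigenvalue, v

def variance_retained(pixels, components):
    """Fraction of total variance captured by the first `components` PCs."""
    cov = covariance_matrix(pixels)
    size = len(cov)
    total = sum(cov[i][i] for i in range(size))
    kept = 0.0
    for _ in range(components):
        value, vec = leading_eigenpair(cov)
        kept += value
        # Deflate: remove the component just found before the next pass.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(size)]
               for i in range(size)]
    return kept / total

# Synthetic 4-band pixels driven by two underlying factors (illustrative only).
pixels = [(2 * a + b, a - b, a + 0.5 * b, 3 * a)
          for a, b in ((i, (i * 7) % 11) for i in range(50))]
fraction = variance_retained(pixels, components=2)
```

Because the synthetic bands are exact combinations of two factors, two components retain essentially all of the variance here. Real MSS data would give a lower figure, such as the approximately 98% reported above.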
The Tasseled-Cap (TC) transformation was applied to the
Landsat TM data used in the Coeur d'Alene Basin study. The TC transformation was developed
specifically for remote sensing data to enhance and compress original spectral bands into
transformed channels that are directly related to biophysical scene characteristics (Kauth
and Thomas 1976; Crist et al. 1986). The first 3 TC components contain approximately 95%
of the variance found in the original 6 reflective Landsat TM bands. The data compression
and enhancement characteristics described facilitated the classification process by
reducing the clustering and classification time. The TC transformation was used because it
offers more consistent and interpretable results than PCA while still offering similar
data compression capabilities. For our method, the main advantage is that scatter plots of
clusters produced from TC components are more readily interpretable than scatter plots of
clusters produced from principal components.
Unsupervised Classification
An unsupervised classification algorithm was used to
generate spectral class files with 255 spectral classes. This was done on a county basis
for the Snake River Plain study and for the entire study area for the Coeur d'Alene Basin
project. This large number of spectral classes was used to minimize the number of spectral
classes that relate to more than one land use and land cover class.
The spectral classes were labeled into land use and land
cover classes based on image interpretation. This was accomplished by displaying raw or
enhanced data in the red and green image planes and reading the spectral class file into
the blue image plane. Individual spectral classes and groups of spectral classes were
highlighted from the blue image plane and then interpreted, labeled, and recorded. The numbers in the Anderson
land use and land cover classification scheme were used, when possible, as labels for the
Coeur d'Alene Basin study (Anderson et al. 1976).
Spectral classes are very cumbersome to label without the
aid of scatter plots, so an in-house program was developed to read spectral class
statistics from a file output during the clustering process and to produce scatter
plots for any desired band combination. The scatter plots are drawn on 36-inch-wide
plotter paper. Each cluster is represented by an ellipse that shows one standard
deviation around the cluster mean, with the number of the spectral class in the
center of the ellipse (Figure 3). Small, dense areas of clusters can be expanded and
plotted to improve readability. The scatter plots greatly
facilitated the labeling of spectral classes since the relative locations of the clusters
are apparent on the plots.
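One way to derive such an ellipse from cluster statistics is sketched below. It assumes the statistics file provides the variance of each plotted band and their covariance; the paper does not describe the file layout or the plotting program, so the function name and the example values are hypothetical.

```python
import math

def one_sigma_ellipse(var_x, var_y, cov_xy):
    """Return (semi_major, semi_minor, angle_degrees) of the 1-sigma ellipse.

    The semi-axes are the square roots of the eigenvalues of the 2 x 2
    covariance matrix; the angle is the orientation of the major axis
    measured from the x axis.
    """
    mid = (var_x + var_y) / 2.0
    spread = math.sqrt(((var_x - var_y) / 2.0) ** 2 + cov_xy ** 2)
    major = math.sqrt(mid + spread)
    minor = math.sqrt(max(mid - spread, 0.0))
    angle = 0.5 * math.degrees(math.atan2(2.0 * cov_xy, var_x - var_y))
    return major, minor, angle

# Hypothetical statistics for one spectral class in two plotted channels.
cluster = {"number": 159, "mean": (82.0, 41.0),
           "var_x": 9.0, "var_y": 4.0, "cov_xy": 0.0}
major, minor, angle = one_sigma_ellipse(
    cluster["var_x"], cluster["var_y"], cluster["cov_xy"])
```

With uncorrelated bands, as in this example, the ellipse is axis-aligned and its semi-axes equal the two standard deviations (3 and 2). The cluster number would be drawn at the mean.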
Many of the spectral classes had more than one land use and
land cover label, indicating that these classes were spectrally similar. In the Coeur
d'Alene Basin study, two groups of classes were spectrally similar to varying degrees:
(1) irrigated agriculture, conifer forest, sparse forest, recovering clearcuts, and wetlands;
and (2) dryland agriculture, rangeland, and recent and recovering clearcuts. In the Snake
River Plain study, spectral similarity was found among irrigated agriculture,
non-irrigated agriculture, and wetlands, and between non-irrigated agriculture and
rangeland.
Image Interpretation
An analyst uses image interpretation to stratify the image.
The procedure involves displaying subsets of a false color composite and screen digitizing
the image into strata. The strata are developed only for the classes that cannot be
discriminated using IP classification programs because they have similar spectral
properties. Most of the strata boundaries do not need to be screen digitized precisely,
because the strata are only used as general areas. The unsupervised classification is used
to precisely define boundaries when the strata and the unsupervised classification are
overlaid. In contrast, a strict image interpretation requires much more time because all
class boundaries need to be precisely defined while screen digitizing. Once the strata are
developed, they are rasterized into a GIS file.
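The rasterizing step can be illustrated with a small sketch. It assumes a stratum is digitized as a closed polygon in pixel coordinates and that a cell belongs to the stratum when its center falls inside the polygon. The actual GIS software and file format used at IDWR are not described here, so the function names and the tiny 4 x 4 raster are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def rasterize_stratum(grid, polygon, code):
    """Write `code` into every cell whose center falls inside the polygon."""
    for row in range(len(grid)):
        for col in range(len(grid[0])):
            if point_in_polygon(col + 0.5, row + 0.5, polygon):
                grid[row][col] = code
    return grid

# 4 x 4 strata raster, initially all conifer forest stratum (code 42).
strata = [[42] * 4 for _ in range(4)]
# Roughly digitized clearcut outline in (column, row) pixel coordinates.
clearcut_outline = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]
rasterize_stratum(strata, clearcut_outline, 45)
```

The outline only needs to enclose the general area, because the unsupervised classification defines the precise class boundaries when the two files are overlaid.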
Integrating Image Interpretation with the Unsupervised Classification
A post-processing (POPRO) program developed at IDWR
overlays the strata with the unsupervised classification to produce the final
classification. The POPRO program performs a GIS matrix overlay, which restricts
land use and land cover classes from occurring outside of specific strata. This
procedure is similar to the post-classification sorting technique described by Hutchinson
(1982), but no ancillary data are used; the strata are developed through image
interpretation.
The POPRO program was developed to be easy to use and
flexible so that it can be modified as needed during the classification process. POPRO
allows up to 255 spectral classes to be overlaid with up to 99 strata classes, to develop
a maximum of 99 land use and land cover classes. The main advantages of POPRO over
commercial matrix overlay programs are that it is easy to set up with a text
editor; strata can be printed, copied, added, edited, or deleted; and the matrix is in a
format that can be easily read and interpreted.
APPLICATION
Two specific examples will be discussed to help illustrate
the procedure. In the Coeur d'Alene Basin study, recent clearcuts were spectrally similar
to rangeland because bare soil dominates the ground cover of
both classes. In addition, recovering clearcuts were spectrally similar
to sparse forest because of their similar ground cover. But clearcuts have a distinctive
shape and size that make them easy to interpret, so areas where clearcuts occurred were
screen digitized to develop a stratum for clearcuts. The stratum was set up in the POPRO program so
spectral classes that were labeled as both recent clearcuts and rangeland were labeled as
recent clearcuts when they occurred within the clearcut stratum and as rangeland in other
strata. Also, spectral classes that were labeled as both recovering clearcuts and sparse
forest were labeled as recovering clearcuts within the clearcut stratum and as sparse
forest in other strata. Figure 4 shows the actual file, containing the strata list, that
was read by POPRO.
05  Water.dat
06  Wetlands.dat
07  Barren.dat
10  Cloudsh.dat
25  Irr-agr.dat
26  Dry-agr.dat
41  Dec-for.dat
42  Con-for.dat
45  Clearcut.dat

Figure 4. Printout of file Stratnam.dat, a list of strata and their codes.
Figure 5 is a portion of the matrix as shown in a table
format. The full matrix would be 9 rows by 255 columns for this classification. The output
class values are shown for spectral classes 1, 157, 158, 159, and 255 in the coniferous
forest and clearcut strata. For spectral class 159, the output class is 03 (rangeland) in
the coniferous forest stratum and 45 (recent clearcuts) in the clearcut stratum. Figure 6
shows a two-by-two cell example of the matrix overlay using the same spectral classes and
strata as in Figure 5. Figure 7 is a printout of the stratum file Clearcut.dat, which is
one of the actual files read by POPRO.
Strata.gis | Stratum name | Spectral.gis
           |              | 1  | ... | 157 | 158 | 159 | ... | 255
05         | Water.dat    |    |     |     |     |     |     |
06         | Wetlands.dat |    |     |     |     |     |     |
07         | Barren.dat   |    |     |     |     |     |     |
08         | Cloudsh.dat  |    |     |     |     |     |     |
25         | Irr-agr.dat  |    |     |     |     |     |     |
26         | Dry-agr.dat  |    |     |     |     |     |     |
41         | Dec-for.dat  |    |     |     |     |     |     |
42         | Con-for.dat  | 05 |     | 03  | 44  | 03  |     | 03
45         | Clearcut.dat | 05 |     | 45  | 46  | 45  |     | 09

Figure 5. A portion of the matrix of strata and spectral classes in table form. The
actual output class values are shown for the coniferous forest and clearcut strata.
Spectral.gis        Strata.gis          Output.gis
  |   1 |   2         |   1 |   2         |   1 |   2
1 | 158 | 158       1 |  42 |  45       1 |  44 |  46
2 | 159 | 159       2 |  45 |  42       2 |  45 |  03

Figure 6. A four-cell example of a matrix overlay. The spectral class file is overlaid
with the strata produced from image interpretation to produce the final output GIS file.
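The overlay in Figure 6 amounts to a cell-by-cell table lookup. The sketch below reproduces that four-cell example in Python. It is not the POPRO source, which is not shown in this paper; the function name and the nested-dictionary layout of the matrix are assumptions, while the class values come from Figures 5 and 6.

```python
# Output class for each (stratum code, spectral class) pair, using the values
# shown for the conifer forest (42) and clearcut (45) strata.
matrix = {
    42: {158: 44, 159: 3},   # sparse forest, rangeland
    45: {158: 46, 159: 45},  # recovering clearcuts, recent clearcuts
}

def matrix_overlay(spectral, strata, matrix):
    """Look up the output class cell by cell from two co-registered rasters."""
    return [[matrix[stratum][spec] for spec, stratum in zip(spec_row, strat_row)]
            for spec_row, strat_row in zip(spectral, strata)]

spectral = [[158, 158], [159, 159]]   # Spectral.gis
strata = [[42, 45], [45, 42]]         # Strata.gis
output = matrix_overlay(spectral, strata, matrix)
# output == [[44, 46], [45, 3]], matching Output.gis in Figure 6
```

The same spectral class therefore receives a different label depending only on the stratum it falls in, which is how the interpreted spatial information resolves the spectral confusion.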
In the Snake River Plain study, irrigated and non-irrigated
agriculture were spectrally similar to a small degree. This spectral similarity is usually
found in areas where the irrigated crops are at an immature stage and the non-irrigated
crops are at a mature stage and in areas where bare soil dominates the ground cover.
Through image interpretation, irrigated agriculture can be identified by its smooth texture
and other spatial clues such as center-pivot patterns and square or rectangular field
patterns. In contrast, non-irrigated agriculture usually appears more mottled with
irregular field boundaries. So an analyst used image interpretation and screen digitizing
to develop separate strata for irrigated agriculture and non-irrigated agriculture, and
set this up in the POPRO program.
The next step was an iterative process of verifying the
results and making changes. This was accomplished by displaying two bands of the image in
the red and green color planes and then reading the classification into the blue image
plane. Individual classes were highlighted and problems were noted. Most of the problems
were corrected by changing spectral class labels in the stratum files (see Figure 7) and
running POPRO again. At some point during this process, an analyst may realize that another
stratum is needed beyond what was initially planned. The new stratum is added by screen
digitizing it, rasterizing it onto the strata GIS file, adding the
stratum code and filename to the Stratnam.dat file (Figure 4), and creating the new stratum
file. The new stratum file is created by copying one of the existing files and editing it
as necessary. After this verification process is completed, the classification is complete
except for any strict image interpretation.
Class   1               5                  10                  15                  20                  25
Code   05  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42

Class  26              30                  35                  40                  45                  50
Code   42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42

Class  51              55                  60                  65                  70                  75
Code   42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42

Class  76              80                  85                  90                  95                 100
Code   45  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  42  45  42

Class 101             105                 110                 115                 120                 125
Code   42  42  42  42  44  44  44  42  45  42  45  42  42  44  44  44  44  45  42  44  42  46  46  46  44

Class 126             130                 135                 140                 145                 150
Code   42  44  45  44  46  46  46  46  44  45  46  45  45  46  45  45  46  46  45  46  46  45  45  46  45

Class 151             155                 160                 165                 170                 175
Code   46  46  46  45  46  46  45  46  45  45  46  45  46  46  45  46  46  45  45  46  46  45  45  46  45

Class 176             180                 185                 190                 195                 200
Code   46  45  45  46  45  45  46  45  45  46  46  45  45  45  46  46  45  46  46  45  45  46  45  45  45

Class 201             205                 210                 215                 220                 225
Code   45  45  45  46  45  46  46  45  46  45  45  46  46  09  45  09  45  45  46  45  45  45  46  46  45

Class 226             230                 235                 240                 245                 250
Code   46  45  45  45  46  46  45  46  45  45  46  45  46  45  45  46  46  45  46  46  46  45  45  46  45

Class 251             255
Code   45  46  46  45  09

Clearcut.dat Stratum number 45. Clearcut forest stratum.
Figure 7. Printout of file Clearcut.dat. The numbers 1 to
255 (in bold) represent spectral classes. The numbers below the spectral class numbers are
the class codes that the spectral classes are assigned to in the classification for this
stratum. For example, spectral class 255 is assigned to class 9. The class codes are: 1 =
urban, 3 = rangeland, 5 = water, 6 = wetland, 7 = barren land, 9 = clouds, 10 = shadows,
25 = irrigated agriculture, 26 = dryland agriculture, 41 = deciduous forest, 42 = conifer
forest, 44 = sparse forest, 45 = recent clearcuts, 46 = recovering clearcuts, 75 = mined
land.
The final procedure for the Coeur d'Alene Basin study was
to image interpret and screen digitize urban areas and mined lands. Urban areas are very
difficult to classify spectrally because they are a mix of many land cover types. But the
spatial pattern of urban areas can be visually identified, so urban areas were visually
interpreted and screen digitized. Most mined lands in this study area were too small and
spectrally diverse to be classified, so mined areas were located on existing maps and then
identified and screen digitized on the image. The screen-digitized urban and mined land
vector files were then rasterized onto the final classification.
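Rasterizing the digitized areas onto the classification can be thought of as overwriting cells wherever a digitized class is present. The sketch below assumes the digitized areas have already been rasterized to the same grid, with 0 meaning no digitized class. The function name, the no-data value, and the small rasters are hypothetical; the class codes (1 = urban, 75 = mined land) follow the list in Figure 7.

```python
def burn_in(classification, digitized, no_data=0):
    """Overwrite classified cells wherever the digitized raster has a class."""
    return [[d if d != no_data else c for c, d in zip(c_row, d_row)]
            for c_row, d_row in zip(classification, digitized)]

# Classification after the matrix overlay (42 conifer, 44 sparse forest,
# 3 rangeland) and a co-registered raster of digitized areas.
classification = [[42, 42, 3], [44, 3, 3]]
digitized = [[0, 1, 1], [0, 0, 75]]   # 1 = urban, 75 = mined land, 0 = none
final = burn_in(classification, digitized)
```

Cells outside the digitized areas keep the class assigned by the overlay, so this step only affects the interpreted urban and mined lands.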
SUMMARY
Two study areas that required different levels of detail
were classified using a method that integrates image interpretation and unsupervised
classification procedures. The method uses commercial software and programs developed at
the IDWR. The method is a practical way to take advantage of the strong points of two
different procedures for producing land use and land cover classifications.
REFERENCES
Anderson, J.R., Hardy, E.E., Roach, J.T., and R.E. Witmer.
1976. A land use and land cover classification system for
use with remote sensor data. U.S. Geological Survey
Professional Paper 964. 28 pp.
Crist, E.P., Laurin, R., and R.C. Cicone. 1986. Vegetation
and soils information contained in transformed Thematic
Mapper data. Proceedings of IGARSS '86
Symposium, Zurich. pp. 1465-1470.
Dobson, J.E. 1993. Land cover, land use differences
distinct. GIS World 6:2:20-22.
Hutchinson, C.F. 1982. Techniques for combining Landsat and
ancillary data for digital classification improvement.
Photogrammetric Engineering and Remote Sensing
48:123-130.
Jensen, J.R. 1986. Introductory Digital Image Processing: A
Remote Sensing Perspective. Prentice-Hall, Englewood
Cliffs, New Jersey. 379 pp.
Kauth, R.J., and G.S. Thomas. 1976. The tasseled-cap: A
graphic description of the spectral-temporal development of
agricultural crops as seen by Landsat.
Proceedings of the Symposium on Machine Processing of Remotely
Sensed Data, West Lafayette, IN. pp. 85-91.
Morse, A., T.J. Zarriello, and W.J. Kramber. 1990. Using
remote sensing and GIS technology to help adjudicate Idaho
water rights. Photogrammetric Engineering and Remote Sensing 56(3):365-370.