© 1994 American Society for
Photogrammetry and Remote Sensing
Reproduced with permission.
1994 ASPRS/ACSM Annual Convention Proceedings. Ed. LeLand D. Whitmill. Bethesda, MD:
American Society for Photogrammetry and Remote Sensing and
American Congress on Surveying and Mapping,
1994. pp 327-336.
INTEGRATING IMAGE INTERPRETATION
AND
UNSUPERVISED CLASSIFICATION PROCEDURES
W.J. Kramber and A. Morse
Idaho Dept. of Water Resources
1301 N. Orchard St.
Boise ID 83706
ABSTRACT
Multispectral images can be classified into land use and land cover maps using image processing (IP) classification or image interpretation procedures. IP classification procedures are generally used because they are more automated and offer better spectral discrimination than image interpretation procedures. However, most IP classification programs analyze only spectral information, and many land use and land cover classes have similar spectral properties, which results in classification error. In contrast, an analyst can interpret spatial as well as spectral information to distinguish land use and land cover classes. But image interpretation takes much more time than IP classification, so it is usually not practical for large study areas. An alternative hybrid method is to integrate IP classification procedures with image interpretation procedures so that both spectral and spatial information are analyzed. The hybrid method involves stratifying the image through image interpretation and on-screen digitizing. The image interpretation for developing strata requires less precision and less time than a strict image interpretation because strata are developed only for the classes that need them, based on the spectral similarity found between classes. An unsupervised classification is used to precisely define class boundaries. The method integrates image interpretation and IP classification procedures by overlaying the unsupervised classification with the strata developed from image interpretation to produce a final classification. This hybrid method is illustrated with two examples of land use and land cover mapping projects completed at the Idaho Department of Water Resources (IDWR).
INTRODUCTION
Satellite data are now commonly used to develop land use and land cover databases. IP classification and image interpretation are two procedures that are used to analyze and classify satellite data. IP classification offers superior spectral discrimination while image interpretation offers better spatial discrimination.
Most often, analysts use IP classification programs that apply spectral pattern recognition algorithms to satellite data in a supervised, unsupervised, or hybrid process. IP classification procedures used in remote sensing are based on the premise that different land use and land cover classes have different multispectral reflectance properties. The main advantage of IP classification over image interpretation is that IP can discriminate subtle spectral differences that an analyst cannot interpret. IP offers superior spectral discrimination because many bands of information can be analyzed quickly and efficiently. The number of bands that can be processed is limited only by the software, while an analyst can interpret only three raw or enhanced bands of information at one time. The problem with IP classification programs is that most analyze only spectral information. Some land use and land cover classes have similar or even identical spectral reflectance properties, and this leads to classification errors.
Another problem is that IP classification algorithms cannot distinguish between land use and land cover (Dobson 1993). A map user may require separate classes for different land uses that actually have the same or similar land cover. Since satellite sensors detect only the reflectance from land cover, such classes will be spectrally similar.
An image analyst can use image interpretation procedures to analyze both spatial and spectral information to map land use and land cover classes. An analyst will typically display a raw or enhanced image that best distinguishes the classes to be mapped, interpret the image, and delimit the class boundaries by screen digitizing. But developing a detailed classification through image interpretation and the associated screen digitizing is very tedious and takes much more time than IP classification procedures, so it is usually not a practical approach for large study areas.
The best approach is to integrate the IP classification and image interpretation procedures so that the strong points of each method can be used to classify an image. This paper will discuss a practical method to integrate unsupervised classification procedures with image interpretation procedures by discussing how the method was used in two land use and land cover mapping projects completed by the IDWR.
Study Areas
The first study area was the Snake River Plain, covering 72,000 square miles of Idaho. This area was classified using fifteen Landsat MSS scenes acquired during July and August of 1986. The classification was for a water rights adjudication, so the emphasis was on classifying irrigated agriculture (Morse et al. 1990). Other classes were dryland agriculture, wetlands, water, non-agricultural land (mostly rangeland and forest), and other (usually clouds, shadows, and bad data).
The second study area was the Coeur d'Alene Basin in northern Idaho. The basin covers 3,900 square miles. The basin was classified for the United States Geological Survey (USGS) for use in a water quality management plan, using Landsat Thematic Mapper (TM) data acquired during July of 1989. The classes developed were: dense urban and built-up land, sparse urban and built-up land, irrigated agriculture and pasture, dryland agriculture and pasture, rangeland, deciduous forest, conifer forest, sparse forest, recent clearcuts, recovering clearcuts, water, wetlands, barren land, mined land, and clouds and shadows.
METHODS
The method used to classify the two study areas involved the following steps: 1) data transformation, 2) unsupervised classification, 3) image interpretation, and 4) integration of image interpretation with the unsupervised classification. The data transformation step was optional.
Data Transformation
Principal Component Analysis (PCA) is a statistical procedure commonly used in remote sensing for enhancement and data compression (Jensen 1986). PCA was applied to the Landsat MSS data used in the Snake River Plain study to reduce the dimensionality of the data from four original bands to two components while retaining approximately 98% of the original variance. PCA reduced CPU time for clustering and classification by reducing the volume of data.
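The dimensionality reduction described above can be sketched in a few lines. The example below is a minimal illustration, not the software used in the study: it assumes the pixels are held in a NumPy array with one row per pixel and one column per band, the function name `pca_reduce` is hypothetical, and the data are synthetic stand-ins for four correlated bands.

```python
import numpy as np

def pca_reduce(pixels, n_components=2):
    """Project (n_pixels, n_bands) data onto its leading principal components.

    Returns the component scores and the fraction of total variance retained.
    """
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)          # band-by-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs[:, :n_components]
    retained = eigvals[:n_components].sum() / eigvals.sum()
    return scores, retained

# Synthetic data only: two underlying signals mixed into four correlated bands.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
mixing = np.array([[1.0, 0.9, 0.2, 0.1],
                   [0.1, 0.2, 1.0, 0.9]])
pixels = latent @ mixing + 0.05 * rng.normal(size=(1000, 4))

scores, retained = pca_reduce(pixels)
print(scores.shape, float(retained))
```

The `retained` value plays the role of the variance figure quoted in the text; with real imagery it depends on how strongly the bands are correlated.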
The Tasseled-Cap (TC) transformation was applied to the Landsat TM data used in the Coeur d'Alene Basin study. The TC transformation was developed specifically for remote sensing data to enhance and compress original spectral bands into transformed channels that are directly related to biophysical scene characteristics (Kauth and Thomas 1976; Crist et al. 1986). The first 3 TC components contain approximately 95% of the variance found in the original 6 reflective Landsat TM bands. These data compression and enhancement characteristics facilitated the classification process by reducing the clustering and classification time. The TC transformation was used because it offers more consistent and interpretable results than PCA while still offering similar data compression capabilities. The main advantage in our method is that scatter plots of clusters produced from TC components are more readily interpretable than scatter plots of clusters produced from principal components.
Unsupervised Classification
An unsupervised classification algorithm was used to generate spectral class files with 255 spectral classes. This was done on a county basis for the Snake River Plain study and for the entire study area for the Coeur d'Alene Basin project. This large number of spectral classes was used to minimize the number of spectral classes that relate to more than one land use and land cover class.
The spectral classes were labeled into land use and land cover classes based on image interpretation. This was accomplished by displaying raw or enhanced data in the red and green image planes and reading the spectral class file into the blue image plane. Individual spectral classes and groups of spectral classes were highlighted in the blue image plane and then interpreted, labeled, and recorded. The numbers in the Anderson land use and land cover classification scheme were used, when possible, as labels for the Coeur d'Alene Basin study (Anderson et al. 1976).
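The display arrangement described above can be approximated as follows. This is only a sketch under stated assumptions: the bands and the spectral class file are taken to be 8-bit NumPy arrays of the same shape, the function name `highlight_composite` is hypothetical, and the tiny arrays are made-up values rather than real imagery.

```python
import numpy as np

def highlight_composite(band_red, band_green, spectral_classes, selected):
    """Build an RGB display array with the selected spectral classes lit in blue."""
    rgb = np.zeros(band_red.shape + (3,), dtype=np.uint8)
    rgb[..., 0] = band_red                      # image data in the red plane
    rgb[..., 1] = band_green                    # image data in the green plane
    # Blue plane: full intensity where the pixel belongs to a selected class.
    rgb[..., 2] = np.where(np.isin(spectral_classes, selected), 255, 0)
    return rgb

# Made-up 2 x 2 example.
band_red = np.array([[10, 20], [30, 40]], dtype=np.uint8)
band_green = np.array([[50, 60], [70, 80]], dtype=np.uint8)
spectral_classes = np.array([[157, 158], [159, 1]], dtype=np.uint8)

rgb = highlight_composite(band_red, band_green, spectral_classes, [157, 159])
print(rgb[..., 2])
```

Passing a list of class numbers highlights a group of spectral classes at once, which mirrors the "individual and groups" workflow in the text.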
Spectral classes are very cumbersome to label without the aid of scatter plots. So an in-house program was developed to read spectral class statistics from a file that was output during the clustering process and produce scatter plots for any desired band combinations. The scatter plots are plotted on 36" wide plotter paper. Each cluster is represented by an ellipse that shows one standard deviation around the mean of the cluster, with the number of the spectral class in the center of the ellipse. Small dense areas of clusters can be expanded and plotted to improve readability. The scatter plots greatly facilitated the labeling of spectral classes since the relative locations of the clusters are apparent on the plots.
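As a rough illustration of how such an ellipse can be derived from cluster statistics, the sketch below computes the center, semi-axis lengths, and rotation of a one-standard-deviation ellipse for a pair of bands. It assumes a 2 x 2 covariance matrix is available for each cluster. The function name `one_sigma_ellipse` and the example statistics are hypothetical, and the in-house program may have worked differently (for example, from per-band standard deviations only).

```python
import numpy as np

def one_sigma_ellipse(mean, cov):
    """Return center, semi-axis lengths, and rotation (degrees) of the 1-sigma ellipse.

    The semi-axes are the square roots of the covariance eigenvalues, and the
    rotation is the direction of the major axis measured from the x axis.
    """
    eigvals, eigvecs = np.linalg.eigh(np.asarray(cov, dtype=float))
    order = np.argsort(eigvals)[::-1]            # major axis first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    semi_axes = np.sqrt(eigvals)
    angle = np.degrees(np.arctan2(eigvecs[1, 0], eigvecs[0, 0]))
    return np.asarray(mean, dtype=float), semi_axes, angle

# Hypothetical statistics for one spectral class in two transformed channels.
center, semi_axes, angle = one_sigma_ellipse(mean=[80.0, 40.0],
                                             cov=[[4.0, 0.0], [0.0, 1.0]])
print(center, semi_axes, angle)
```

The returned values map directly onto the parameters most plotting libraries expect for an ellipse (center, width and height as twice the semi-axes, and rotation angle). The sign of an eigenvector is arbitrary, so the angle is only meaningful modulo 180 degrees.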
Many of the spectral classes had more than one land use and land cover label indicating that these classes were spectrally similar. In the Coeur d'Alene Basin study, classes that were spectrally similar to varying degrees were: irrigated agriculture, conifer forest, sparse forest, recovering clearcuts, and wetlands; and dryland agriculture, rangeland, and recent and recovering clearcuts. In the Snake River Plain study, spectral similarity was found between: irrigated agriculture, non-irrigated agriculture, and wetlands; and between non-irrigated agriculture and rangeland.
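One way to see which strata are needed is to tabulate the class pairs that share spectral classes. The sketch below is illustrative only: the function name `confused_pairs` is hypothetical, and the label table is a small invented sample whose entries for spectral classes 157 to 159 follow the labels implied by Figure 5.

```python
from collections import Counter
from itertools import combinations

def confused_pairs(labels_by_class):
    """Count how many spectral classes carry each pair of land cover labels."""
    pairs = Counter()
    for labels in labels_by_class.values():
        # Every pair of distinct labels on one spectral class is a confusion.
        for a, b in combinations(sorted(set(labels)), 2):
            pairs[(a, b)] += 1
    return pairs

# Invented sample of analyst labels keyed by spectral class number.
labels_by_class = {
    1: ["water"],
    157: ["rangeland", "recent clearcuts"],
    158: ["sparse forest", "recovering clearcuts"],
    159: ["rangeland", "recent clearcuts"],
}

pairs = confused_pairs(labels_by_class)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs with high counts are the candidates for stratification, since only spectrally confused classes need a stratum.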
Image Interpretation
An analyst uses image interpretation to stratify the image. The procedure involves displaying subsets of a false color composite and screen digitizing the image into strata. The strata are developed only for the classes that cannot be discriminated using IP classification programs because they have similar spectral properties. Most of the strata boundaries do not need to be screen digitized precisely, because the strata are only used as general areas. The unsupervised classification is used to precisely define boundaries when the strata and the unsupervised classification are overlaid. In contrast, a strict image interpretation requires much more time because all class boundaries need to be precisely defined while screen digitizing. Once the strata are developed they are rasterized into a GIS file.
Integrating Image Interpretation with the Unsupervised Classification
A post-processing (POPRO) program developed at IDWR overlays the strata with the unsupervised classification to produce the final classification. The POPRO program performs a GIS matrix overlay. This procedure restricts land use and land cover classes from occurring in areas outside of specific strata. This procedure is similar to the post-classification sorting technique described by Hutchinson (1982) but no ancillary data is used--the strata are developed through image interpretation.
The POPRO program was developed to be easy to use and flexible so that it can be modified as needed during the classification process. POPRO allows up to 255 spectral classes to be overlaid with up to 99 strata classes, to develop a maximum of 99 land use and land cover classes. The main advantage of POPRO over commercial software matrix overlay programs is that it is easy to set up with a text editor; strata can be printed, copied, added, edited, or deleted; and the matrix is in a format that can be easily read and interpreted.
APPLICATION
Two specific examples will be discussed to help illustrate the procedure. In the Coeur d'Alene Basin study, recent clearcuts were spectrally similar to rangeland because of the large amount of bare soil that dominates the ground cover of these two classes. Another problem was that recovering clearcuts were spectrally similar to sparse forest because of similar ground cover. But clearcuts have a unique shape and size that makes them easy to interpret, so areas where clearcuts occurred were screen digitized to develop a stratum for clearcuts. The stratum was set up in the POPRO program so that spectral classes labeled as both recent clearcuts and rangeland were labeled as recent clearcuts when they occurred within the clearcut stratum and as rangeland in other strata. Also, spectral classes that were labeled as both recovering clearcuts and sparse forest were labeled as recovering clearcuts within the clearcut stratum and as sparse forest in other strata. Figure 4 shows the actual file, containing the strata list, that was read by POPRO.
| 05 | Water.dat |
| 06 | Wetlands.dat |
| 07 | Barren.dat |
| 10 | Cloudsh.dat |
| 25 | Irr-agr.dat |
| 26 | Dry-agr.dat |
| 41 | Dec-for.dat |
| 42 | Con-for.dat |
| 45 | Clearcut.dat |
Figure 4. Printout of file Stratnam.dat, a list of strata and their codes.
Figure 5 shows a portion of the matrix in table form. The full matrix would be 9 rows by 255 columns for this classification. The output class values are shown for spectral classes 1, 157, 158, 159, and 255 in the coniferous forest and clearcut strata. For spectral class 159, the output class is 03 (rangeland) in the coniferous forest stratum and 45 (recent clearcuts) in the clearcut stratum. Figure 6 shows a two-by-two cell example of the matrix overlay using the same spectral classes and strata as in Figure 5. Figure 7 is a printout of the stratum file Clearcut.dat--this is one of the actual files read by POPRO.
| Strata.gis | Stratum Name | Spectral.gis: 1 | ... | 157 | 158 | 159 | ... | 255 |
| 05 | Water.dat | | | | | | | |
| 06 | Wetlands.dat | | | | | | | |
| 07 | Barren.dat | | | | | | | |
| 08 | Cloudsh.dat | | | | | | | |
| 25 | Irr-agr.dat | | | | | | | |
| 26 | Dry-agr.dat | | | | | | | |
| 41 | Dec-for.dat | | | | | | | |
| 42 | Con-for.dat | 05 | | 03 | 44 | 03 | | 03 |
| 45 | Clearcut.dat | 05 | | 45 | 46 | 45 | | 09 |
Figure 5. A portion of the matrix of strata and spectral classes in table form. The actual output class values are shown for the coniferous forest and clearcut strata.
| Row | Spectral.gis, column 1 | Spectral.gis, column 2 | Strata.gis, column 1 | Strata.gis, column 2 | Output.gis, column 1 | Output.gis, column 2 |
| 1 | 158 | 158 | 42 | 45 | 44 | 46 |
| 2 | 159 | 159 | 45 | 42 | 45 | 03 |
Figure 6. A four-cell example of a matrix overlay. The spectral class file is overlaid with the strata produced from image interpretation to produce the final output GIS file.
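The matrix overlay in Figures 5 and 6 can be expressed as a per-stratum lookup table. The sketch below is not the POPRO program: it is a minimal NumPy illustration in which the function names `matrix_overlay` and `sparse_lookup` are hypothetical, and only the matrix entries actually shown in Figure 5 are filled in. All other entries are left at 0 as a placeholder.

```python
import numpy as np

def sparse_lookup(entries, size=256):
    """Build a lookup table indexed by spectral class; unknown entries stay 0."""
    lookup = np.zeros(size, dtype=np.uint8)
    for spectral_class, output_class in entries.items():
        lookup[spectral_class] = output_class
    return lookup

def matrix_overlay(spectral, strata, matrix, fill=0):
    """Assign each cell the output class for its (stratum, spectral class) pair."""
    out = np.full(spectral.shape, fill, dtype=np.uint8)
    for stratum_code, lookup in matrix.items():
        in_stratum = strata == stratum_code
        out[in_stratum] = lookup[spectral[in_stratum]]
    return out

# Matrix rows for the two strata whose values appear in Figure 5.
matrix = {
    42: sparse_lookup({1: 5, 157: 3, 158: 44, 159: 3, 255: 3}),    # Con-for.dat
    45: sparse_lookup({1: 5, 157: 45, 158: 46, 159: 45, 255: 9}),  # Clearcut.dat
}

# The four cells of Figure 6.
spectral = np.array([[158, 158], [159, 159]], dtype=np.uint8)
strata = np.array([[42, 45], [45, 42]], dtype=np.uint8)

output = matrix_overlay(spectral, strata, matrix)
print(output)
```

For these inputs the result matches the Output.gis grid in Figure 6: 44 and 46 in the first row, 45 and 03 in the second. Keeping one lookup table per stratum mirrors the one-file-per-stratum layout described in the text, so relabeling a spectral class in one stratum touches a single table.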
In the Snake River Plain study, irrigated and non-irrigated agriculture were spectrally similar to a small degree. This spectral similarity is usually found in areas where the irrigated crops are at an immature stage and the non-irrigated crops are at a mature stage and in areas where bare soil dominates the ground cover. Through image interpretation irrigated agriculture can be identified by its smooth texture and other spatial clues such as center-pivot patterns and square or rectangular field patterns. In contrast, non-irrigated agriculture usually appears more mottled with irregular field boundaries. So an analyst used image interpretation and screen digitizing to develop separate strata for irrigated agriculture and non-irrigated agriculture, and set this up in the POPRO program.
The next step was an iterative process of verifying the results and making changes. This was accomplished by displaying two bands of the image in the red and green color planes and then reading the classification into the blue image plane. Individual classes were highlighted and problems were noted. Most of the problems were corrected by changing spectral class labels in the stratum files (see Figure 7) and running POPRO again. At some point during this process, an analyst may realize that another stratum is needed beyond what was initially planned. A stratum is added by screen digitizing it, rasterizing it onto the strata GIS file, adding its code and filename to the Stratnam.dat file (Figure 4), and creating the new stratum file. The new stratum file is created by copying one of the existing files and editing it as necessary. After this verification process is completed, the classification is complete except for any strict image interpretation.
Clearcut.dat Stratum number 45. Clearcut forest stratum.
| Spectral classes | Class codes, one per spectral class in order |
| 1-25 | 05 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 26-50 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 51-75 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 76-100 | 45 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 45 | 42 |
| 101-125 | 42 | 42 | 42 | 42 | 44 | 44 | 44 | 42 | 45 | 42 | 45 | 42 | 42 | 44 | 44 | 44 | 44 | 45 | 42 | 44 | 42 | 46 | 46 | 46 | 44 |
| 126-150 | 42 | 44 | 45 | 44 | 46 | 46 | 46 | 46 | 44 | 45 | 46 | 45 | 45 | 46 | 45 | 45 | 46 | 46 | 45 | 46 | 46 | 45 | 45 | 46 | 45 |
| 151-175 | 46 | 46 | 46 | 45 | 46 | 46 | 45 | 46 | 45 | 45 | 46 | 45 | 46 | 46 | 45 | 46 | 46 | 45 | 45 | 46 | 46 | 45 | 45 | 46 | 45 |
| 176-200 | 46 | 45 | 45 | 46 | 45 | 45 | 46 | 45 | 45 | 46 | 46 | 45 | 45 | 45 | 46 | 46 | 45 | 46 | 46 | 45 | 45 | 46 | 45 | 45 | 45 |
| 201-225 | 45 | 45 | 45 | 46 | 45 | 46 | 46 | 45 | 46 | 45 | 45 | 46 | 46 | 09 | 45 | 09 | 45 | 45 | 46 | 45 | 45 | 45 | 46 | 46 | 45 |
| 226-250 | 46 | 45 | 45 | 45 | 46 | 46 | 45 | 46 | 45 | 45 | 46 | 45 | 46 | 45 | 45 | 46 | 46 | 45 | 46 | 46 | 46 | 45 | 45 | 46 | 45 |
| 251-255 | 45 | 46 | 46 | 45 | 09 |
Figure 7. Printout of file Clearcut.dat. The first column gives the range of spectral classes (1 to 255) covered by each row. The values that follow are the class codes that those spectral classes are assigned to in the classification for this stratum, listed in order. For example, spectral class 255 is assigned to class 9. The class codes are: 1 = urban, 3 = rangeland, 5 = water, 6 = wetland, 7 = barren land, 9 = clouds, 10 = shadows, 25 = irrigated agriculture, 26 = dryland agriculture, 41 = deciduous forest, 42 = conifer forest, 44 = sparse forest, 45 = recent clearcuts, 46 = recovering clearcuts, 75 = mined land.
The final procedure for the Coeur d'Alene Basin study was to image interpret and screen digitize urban areas and mined lands. Urban areas are very difficult to classify spectrally because they are a mix of many land cover types. But the spatial pattern of urban areas can be visually identified, so urban areas were visually interpreted and screen digitized. Most mined lands in this study area were too small and spectrally diverse to be classified so mined areas were located on existing maps and then identified and screen digitized on the image. The screen digitized urban and mined lands vector files were then rasterized onto the final classification.
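Rasterizing the digitized polygons onto the final classification amounts to overwriting the affected cells. The sketch below assumes the polygons have already been rasterized into boolean masks on the same grid as the classification. The function name `burn_in` and the small arrays are hypothetical, and the class codes 1 (urban) and 75 (mined land) are taken from the Figure 7 caption.

```python
import numpy as np

URBAN, MINED = 1, 75   # class codes listed in the Figure 7 caption

def burn_in(classification, mask, class_code):
    """Return a copy of the classification with masked cells set to class_code."""
    out = classification.copy()
    out[mask] = class_code
    return out

# Made-up 2 x 3 classification and masks.
final = np.array([[42, 42, 3], [44, 3, 3]], dtype=np.uint8)
urban_mask = np.array([[False, False, True], [False, False, True]])
mined_mask = np.array([[True, False, False], [False, False, False]])

final = burn_in(burn_in(final, urban_mask, URBAN), mined_mask, MINED)
print(final)
```

Because the interpreted classes are written last, they take precedence over whatever the matrix overlay assigned to those cells.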
SUMMARY
Two study areas that required different levels of detail were classified using a method that integrates image interpretation and unsupervised classification procedures. The method uses commercial software and programs developed at the IDWR. The method is practical for taking advantage of the strong points of two different procedures for producing land use and land cover classifications.
REFERENCES
Anderson, J.R., Hardy, E.E., Roach, J.T., and R.E. Witmer. 1976. A land use and land cover classification system for use with remote sensor data. Geological Survey Professional Paper 964. 28 pp.
Crist, E.P., Laurin, R., and R.C. Cicone. 1986. Vegetation and soils information contained in transformed Thematic Mapper data. Proceedings of IGARSS '86 Symposium, Zurich. pp. 1465-1470.
Dobson, J.E. 1993. Land cover, land use differences distinct. GIS World 6:2:20-22.
Hutchinson, C.F. 1982. Techniques for combining Landsat and ancillary data for digital classification improvement. Photogrammetric Engineering and Remote Sensing 48:123-130.
Jensen, J.R. 1986. Introductory Digital Image Processing: A Remote Sensing Perspective. Prentice-Hall, Englewood Cliffs, New Jersey. 379 pp.
Kauth, R.J., and G.S. Thomas. 1976. The tasseled-cap--A graphic description of the spectral-temporal development of agricultural crops as seen by Landsat. Proceedings of the Symposium on Machine Processing of Remotely Sensed Data, West Lafayette, IN. pp. 85-91.
Morse, A., Zarriello, T.J., and W.J. Kramber. 1990. Using remote sensing and GIS technology to help adjudicate Idaho water rights. Photogrammetric Engineering and Remote Sensing 56(3):365-370.