
Premade Datasets

In this notebook, we’ll go over some of the basics of premade datasets and take a look at the different datasets available in sankee.

[1]:
import sankee
import ee

ee.Authenticate()
ee.Initialize()

Dataset Introduction

sankee uses sankee.Dataset objects to define premade Land Use/Land Cover (LULC) datasets. Most of the LULC datasets in Earth Engine are available, but feel free to request new ones if something is missing. You can see the currently available datasets below.

[2]:
sankee.datasets.datasets
[2]:
[<Dataset: LCMS LC - Land Change Monitoring System Land Cover>,
 <Dataset: LCMS LU - Land Change Monitoring System Land Use>,
 <Dataset: NLCD - National Land Cover Database>,
 <Dataset: MCD12Q1 - MODIS Global Land Cover Type 1>,
 <Dataset: MCD12Q1 - MODIS Global Land Cover Type 2>,
 <Dataset: MCD12Q1 - MODIS Global Land Cover Type 3>,
 <Dataset: CGLS - Copernicus Global Land Cover>,
 <Dataset: C-CAP - NOAA Coastal Change Analysis Program 30m>,
 <Dataset: Canada Forested Ecosystem Land Cover>,
 <Dataset: LCMAP - Landscape Change Monitoring, Assessment, and Projection>,
 <Dataset: CORINE - Coordination of Information on the Environment>]

Each dataset defines the labels and colors assigned to different pixel values, which you can summarize using Dataset.df. For example, a value of 123 in the CORINE dataset would be assigned a label of Ports and a color of #E6CCCC.

[3]:
sankee.datasets.CORINE.df.head()
[3]:
id label color
0 111 Continuous urban #E6004D
1 112 Discontinuous urban #FF0000
2 121 Industrial/Commercial #CC4DF2
3 122 Road/Rail #CC0000
4 123 Ports #E6CCCC
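The value-to-label lookup described above can be reproduced by hand. The sketch below is illustrative only: it copies the five rows shown in the output into a plain Python dictionary instead of calling sankee or Earth Engine, and `describe_pixel` is a hypothetical helper, not part of sankee.

```python
# Rows copied from the Dataset.df output above: (id, label, color).
CORINE_HEAD = [
    (111, "Continuous urban", "#E6004D"),
    (112, "Discontinuous urban", "#FF0000"),
    (121, "Industrial/Commercial", "#CC4DF2"),
    (122, "Road/Rail", "#CC0000"),
    (123, "Ports", "#E6CCCC"),
]

# Index the rows by pixel value so each lookup is a single dictionary access.
LOOKUP = {pixel_id: (label, color) for pixel_id, label, color in CORINE_HEAD}


def describe_pixel(value):
    """Return (label, color) for a pixel value, or None if it is not in the table."""
    return LOOKUP.get(value)


print(describe_pixel(123))
```

With the real DataFrame, the same question would typically be answered by filtering Dataset.df on its id column.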

Some datasets are generated annually, while others are released on an irregular schedule. Check Dataset.years to see which years are available.

[4]:
sankee.datasets.CORINE.years
[4]:
(1986, 1999, 2005, 2011, 2017)
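Because the schedule is irregular, it can help to check a list of years before requesting them. The following is a minimal sketch that works on a copy of the tuple printed above; `split_years` and `nearest_year` are hypothetical helpers written for this example, not sankee functions.

```python
# Years copied from the Dataset.years output above.
CORINE_YEARS = (1986, 1999, 2005, 2011, 2017)


def split_years(requested, available):
    """Separate requested years into those that exist and those that do not."""
    valid = [y for y in requested if y in available]
    missing = [y for y in requested if y not in available]
    return valid, missing


def nearest_year(target, available):
    """Return the available year closest to target (the earlier year wins ties)."""
    return min(available, key=lambda y: (abs(y - target), y))


valid, missing = split_years([1986, 2000, 2017], CORINE_YEARS)
print("available:", valid)
print("not available:", missing)
print("closest to 2000:", nearest_year(2000, CORINE_YEARS))
```

In a live session you could pass Dataset.years directly as the `available` argument.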

You can retrieve an image from one year using Dataset.get_year. This can be useful if you want to view the data on a map before trying to generate a Sankey diagram.

[5]:
image = sankee.datasets.CORINE.get_year(1986)
image
[5]:
<ee.image.Image at 0x7efce8302160>

Before we go any further, let’s get geemap up and running so we can take a look at some images.

!pip install geemap

[6]:
import geemap

Map = geemap.Map()
Map

Now let’s add the 1986 CORINE data we retrieved earlier to the map.

[7]:
Map.addLayer(image, {}, "CORINE - 1986")
Map.centerObject(image)

To look for areas that experienced land cover change, you might want to visually compare against another year’s data. Let’s add 2017 CORINE data and zoom in on Ankara, Turkey.

[8]:
Map.addLayer(sankee.datasets.CORINE.get_year(2017), {}, "CORINE - 2017")

aoi = ee.Geometry.Point([32.806481, 39.92385]).buffer(10_000)
Map.centerObject(aoi)

Once the layers load, try toggling them on and off in the map to see how land cover changed. Finally, let’s create a Sankey diagram showing that change. Just run Dataset.sankify with a list of available years and a region.

[9]:
sankee.datasets.CORINE.sankify(years=[1986, 2017], region=aoi)

After turning some classes off to simplify the diagram, we can see how some agricultural lands were converted to urban and industrial uses between the two dates.
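Conceptually, a Sankey diagram of land cover change is a tally of how many sampled locations moved from each class in the first year to each class in the last. The sketch below illustrates that idea, along with the effect of hiding a class, using made-up sample pairs and simplified, hypothetical class names. It is not a description of how sankee works internally, and `tally_transitions` is a name invented for this example.

```python
from collections import Counter

# Made-up (start_label, end_label) pairs standing in for sampled pixels.
samples = [
    ("Agriculture", "Agriculture"),
    ("Agriculture", "Urban"),
    ("Agriculture", "Industrial"),
    ("Urban", "Urban"),
    ("Agriculture", "Urban"),
]


def tally_transitions(pairs, hidden=()):
    """Count start-to-end transitions, skipping any class listed in hidden."""
    return Counter(
        (start, end)
        for start, end in pairs
        if start not in hidden and end not in hidden
    )


# Hiding a class drops every flow that starts or ends in it.
flows = tally_transitions(samples, hidden={"Industrial"})
for (start, end), count in flows.most_common():
    print(f"{start} -> {end}: {count}")
```

Each entry in `flows` corresponds to one band in the diagram, with the count setting its width.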

Now let’s look at some other datasets.

Global Datasets

MODIS Global Land Cover

The MODIS Global Land Cover datasets are global, annual datasets that go back to 2001. The MODIS dataset is split into three sub-datasets in sankee: MODIS_LC_TYPE1, MODIS_LC_TYPE2, and MODIS_LC_TYPE3. The TYPE3 sub-dataset has fewer, more generalized classes than TYPE2, which has fewer than TYPE1.

Let’s look at an example Sankey diagram showing loss of permanent snow and ice in Greenland.

[10]:
sankee.datasets.MODIS_LC_TYPE1.sankify(
    years=[2001, 2020],
    region=ee.Geometry.Point([-61.221966, 80.204697]).buffer(10_000),
    title="Ice Loss in Greenland"
)

Copernicus Global Land Cover

Like MODIS, CGLS is an annual dataset, but it is currently only available between 2015 and 2019. Let’s look at the forest loss that resulted from a devastating 2018 wildfire in Kineta, Greece.

[11]:
aoi = ee.Geometry.Polygon([
    [383.142871, 38.023993],
    [383.109218, 38.017772],
    [383.109218, 37.970153],
    [383.185452, 37.959326],
    [383.217731, 37.981249],
    [383.227346, 38.004518],
    [383.142871, 38.023993]
])

sankee.datasets.CGLS_LC100.sankify(
    years=[2017, 2019],
    region=aoi,
    title="Wildfire in Kineta, Greece",
    max_classes=4,
)
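The longitudes in the polygon above are larger than 180 degrees, which means they are offset by a full turn: 383.142871 - 360 = 23.142871, or roughly 23.14°E. If you would rather work with longitudes in the conventional -180 to 180 range, a small helper like the one below can normalize them. This is a plain-Python sketch; `wrap_longitude` is a hypothetical helper, and nothing here implies that Earth Engine requires normalized input.

```python
def wrap_longitude(lon):
    """Map any longitude in degrees onto the interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


# First three vertices of the polygon above, as [longitude, latitude] pairs.
ring = [
    [383.142871, 38.023993],
    [383.109218, 38.017772],
    [383.109218, 37.970153],
]

# Latitudes are left unchanged; only the longitude is wrapped.
normalized = [[round(wrap_longitude(lon), 6), lat] for lon, lat in ring]
print(normalized)
```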

United States Datasets

NLCD

The National Land Cover Database produces land cover maps of the continental United States on an irregular schedule between 2001 and 2019.

Let’s use NLCD to look at land cover change at a mountaintop removal coal mine in West Virginia. This time, we’ll include three years of data.

[12]:
aoi = ee.Geometry.Polygon([
    [-81.973301, 38.121953],
    [-82.000086, 38.105746],
    [-82.009701, 38.077104],
    [-82.000429, 38.060617],
    [-81.965403, 38.043044],
    [-81.950637, 38.073321],
    [-81.942052, 38.107907],
    [-81.973301, 38.121953],
])

sankee.datasets.NLCD.sankify(
    years=[2001, 2011, 2019],
    region=aoi,
    title="Hobet Coal Mine, West Virginia"
)
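With more than two years, the diagram is most easily read as a chain of flows between consecutive years. As a small illustration of that reading, the sketch below pairs up the years passed above and reports the length of each interval; `consecutive_periods` is a hypothetical helper and makes no claim about sankee's internals.

```python
def consecutive_periods(years):
    """Sort the years and pair each one with the year that follows it."""
    ordered = sorted(years)
    return list(zip(ordered, ordered[1:]))


periods = consecutive_periods([2001, 2011, 2019])
for start, end in periods:
    print(f"{start} -> {end} ({end - start} years)")
```

Note that the two intervals differ in length, which is worth keeping in mind when comparing the amount of change in each step.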

LCMS

The Landscape Change Monitoring System project has released annual land cover and land use products going back to 1985. sankee supports both datasets.

Below, we can see 35 years of vegetation recovery surrounding Mount St. Helens after its 1980 eruption.

[13]:
aoi = ee.Geometry.Polygon([
    [-122.289907, 46.289229],
    [-122.22328, 46.329545],
    [-122.15528, 46.346138],
    [-122.053623, 46.323855],
    [-122.033704, 46.252682],
    [-122.102391, 46.231786],
    [-122.161462, 46.24176],
    [-122.260371, 46.233686],
    [-122.289907, 46.289229],
])

sankee.datasets.LCMS_LC.sankify(
    years=[1985, 2000, 2020],
    region=aoi,
    title="Mount St. Helens Recovery"
)

LCMAP

The Land Change Monitoring, Assessment, and Projection dataset, hosted in the Earth Engine Community Datasets, also provides annual land cover data back to 1985.

We can use that long time series to look at urban sprawl around the city of Las Vegas.

[14]:
aoi = ee.Geometry.Polygon([
    [-115.01184401606046, 36.24170785506492],
    [-114.98849806879484, 36.29928186470082],
    [-115.25628981684171, 36.35238941394592],
    [-115.34692702387296, 36.310348922031565],
    [-115.37988600824796, 36.160811202271944],
    [-115.30298171137296, 36.03653336474891],
    [-115.25628981684171, 36.05207884201088],
    [-115.26590285395109, 36.226199908103695],
    [-115.19174513910734, 36.25499793268206],
])

sankee.datasets.LCMAP.sankify(
    years=[1985, 2000, 2020],
    region=aoi,
    title="Las Vegas Expansion"
)

Other Datasets

C-CAP

The NOAA Coastal Change Analysis Program maps land cover across coastal areas of the United States at irregular intervals.

Below, we can see the effect of clearcuts on a forest in the Oregon Coast Range.

[15]:
aoi = ee.Geometry.Polygon([
    [-123.867847, 43.544301],
    [-123.954382, 43.439692],
    [-123.877462, 43.329906],
    [-123.760709, 43.322913],
    [-123.664558, 43.503475],
    [-123.867847, 43.544301],
])

sankee.datasets.CCAP_LC30.sankify(
    years=[1996, 2016],
    region=aoi,
    scale=30,
    title="Clearcuts, Oregon Coast Range"
)

Canada Forest Ecosystems

Finally, the Canada Forested Ecosystem Land Cover dataset maps land cover in Canadian forests annually from 1984 to 2019. Below, we can see the effects of constructing a new natural gas refinery in Alberta.

[16]:
aoi = ee.Geometry.Polygon([
    [249.444319, 55.057876],
    [249.444319, 55.092961],
    [249.476945, 55.092961],
    [249.476945, 55.057876],
    [249.444319, 55.057876],
])

sankee.datasets.CA_FOREST_LC.sankify(
    years=[1984, 2019],
    region=aoi,
    title="Refinery Construction, Alberta"
)