Maps and geospatial data, working with coordinate systems and projections
Georgetown University
Spring 2024
Edmund Halley’s New and Correct Chart Shewing the Variations of the Compass (1701) was the first map to show lines of equal magnetic variation.
In 1826, Charles Dupin published a thematic map of France showing illiteracy levels using shadings from white to black.
Spatial data, also known as geospatial data, is information about a physical object that can be represented by numerical values in a geographic coordinate system.
The geographic vector model is based on points located within a coordinate reference system (CRS). Points can represent self-standing features (e.g., the location of a bus stop) or they can be linked together to form more complex geometries such as lines and polygons. Most point geometries contain only two dimensions.
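A minimal sketch of the vector model in plain Python, with no GIS library: a point is an (x, y) pair, a line is an ordered list of points, and a polygon is a closed ring whose last point repeats the first. The coordinates and the helper names `line_length` and `ring_area` are made up for illustration, and the values are treated as planar units rather than degrees.

```python
import math

# Hypothetical planar coordinates (e.g., metres in a projected CRS).
bus_stop = (2.0, 3.0)                                  # point
route = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]          # line: ordered points
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0),
          (0.0, 3.0), (0.0, 0.0)]                      # polygon: closed ring


def line_length(coords):
    """Sum of straight-line segment lengths along an ordered list of points."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))


def ring_area(ring):
    """Unsigned area of a closed ring via the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0


print(line_length(route))   # 11.0
print(ring_area(parcel))    # 12.0
```

Libraries built on GEOS offer the same operations on richer geometry types; the point here is only that lines and polygons are points linked together.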
The geographic raster data model usually consists of a raster header and a matrix (with rows and columns) representing equally spaced cells (often called pixels).
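As a rough illustration of how a header and a matrix work together, the sketch below uses a hypothetical header layout (`origin_x`, `origin_y`, `cell_size`, `nrows`, `ncols`) that is not taken from any particular file format. It assumes square cells and an origin at the top-left corner of the grid.

```python
# Hypothetical raster: the header says where the grid sits, the matrix holds cell values.
header = {
    "origin_x": 100.0,   # x of the top-left corner
    "origin_y": 50.0,    # y of the top-left corner
    "cell_size": 10.0,   # width and height of each square cell
    "nrows": 3,
    "ncols": 4,
}
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]


def cell_center(header, row, col):
    """Coordinates of the centre of cell (row, col); rows count downward from the top."""
    x = header["origin_x"] + (col + 0.5) * header["cell_size"]
    y = header["origin_y"] - (row + 0.5) * header["cell_size"]
    return x, y


def value_at(header, matrix, x, y):
    """Value of the cell containing (x, y), or None if the point is off the grid."""
    col = int((x - header["origin_x"]) // header["cell_size"])
    row = int((header["origin_y"] - y) // header["cell_size"])
    if 0 <= row < header["nrows"] and 0 <= col < header["ncols"]:
        return matrix[row][col]
    return None


print(cell_center(header, 0, 0))              # (105.0, 45.0)
print(value_at(header, matrix, 125.0, 35.0))  # 7
```

Because the cells are equally spaced, no coordinates are stored per cell: position follows from the header and the row and column indices alone.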
In addition to raster and vector data, there is also LiDAR data (also known as point clouds) and 3D data. LiDAR data is collected via satellites, drones, or other aerial devices. 3D data extends the typical 2D latitude and longitude coordinates with elevation and/or depth. While complex, this data is rich with information and can be used to solve a variety of problems pertaining to the Earth’s surface.
GEOS is a powerful geometry engine that provides functions for performing geometric operations on spatial data.
GDAL is a versatile library for reading, writing, and transforming geospatial data.
PROJ is a library for cartographic transformations and coordinate system conversions.
Simple Features Geometries (often referred to as SF Geometries) are a fundamental concept in geospatial data modeling. They provide a standardized way to represent geometric shapes and their spatial relationships. Let’s explore the key aspects:
SF Geometries include various types that allow us to model real-world features like cities, rivers, buildings, and land parcels:
Typical delimited text file with latitude and longitude:
id,name,amount,city,lon,lat
1,Kevin,2.1,Rapperswil,8.8249,47.2274
2,Eva,2.2,Zürich,8.5435,47.3768
3,"Jimmy,Muff",2.3,,7.4397,46.9487
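One way to read such a file is Python's standard `csv` module. The sketch below embeds the sample rows as a string so that it runs on its own; the `points` list and its dictionary keys are illustrative choices, not a standard structure.

```python
import csv
import io

# The sample rows from above, embedded as a string so the sketch is self-contained.
raw = '''id,name,amount,city,lon,lat
1,Kevin,2.1,Rapperswil,8.8249,47.2274
2,Eva,2.2,Zürich,8.5435,47.3768
3,"Jimmy,Muff",2.3,,7.4397,46.9487
'''

points = []
for row in csv.DictReader(io.StringIO(raw)):
    # csv handles the quoted comma in "Jimmy,Muff"; coordinates arrive as strings.
    points.append({
        "name": row["name"],
        "city": row["city"] or None,                    # empty city becomes None
        "xy": (float(row["lon"]), float(row["lat"])),   # (x, y) = (lon, lat)
    })

for p in points:
    print(p["name"], p["city"], p["xy"])
```

Note that the third row has a comma inside a quoted name and an empty city, two details that a naive `split(",")` would get wrong.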
Another CSV with a POINT definition (we’ll talk about this shortly):
id,name,amount,city,geom
1,Kevin,2.1,Rapperswil,POINT(8.8249 47.2274)
2,Eva,2.2,Zürich,POINT(8.5435 47.3768)
3,"Jimmy,Muff",2.3,,POINT(7.4397 46.9487)
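The `POINT(x y)` strings in the `geom` column follow the Well-Known Text (WKT) convention: x (longitude) first, then y (latitude), separated by a space. A minimal sketch of parsing just this one case with a regular expression follows; `parse_point` is a hypothetical helper, and real WKT covers many more geometry types than this pattern handles.

```python
import re

# Matches strings such as "POINT(8.8249 47.2274)": x first, then y, separated by whitespace.
POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$"
)


def parse_point(text):
    """Return (x, y) floats from a 2D POINT string, or raise ValueError."""
    match = POINT_RE.match(text)
    if match is None:
        raise ValueError(f"not a 2D POINT: {text!r}")
    return float(match.group(1)), float(match.group(2))


print(parse_point("POINT(8.5435 47.3768)"))   # (8.5435, 47.3768)
```

In practice a geometry library would do this parsing; the sketch only shows how little structure a point string carries.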
The shapefile format is a digital vector storage format for storing geometric location and associated attribute information. It has existed since the early 1990s. Geographical datasets in the shapefile format can be read and written with a wide variety of software.
The term “shapefile” is quite common, but the format consists of a collection of files with a common filename prefix, stored in the same directory.
shp: the feature geometry file
shx: the shape index position
dbf: the attribute data
You will most likely only use the shp file with special libraries for visualization purposes.
prj: the projection metadata
xml: other associated metadata
sbn
sbx
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [ -90.0715, 29.9510 ]
},
"properties": {
"name": "Fred",
"gender": "Male"
}
},
{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [ -92.7298, 30.7373 ]
},
"properties": {
"name": "Martha",
"gender": "Female"
}
},
{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [ -91.1473, 30.4711 ]
},
"properties": {
"name": "Zelda",
"gender": "Female"
}
}
]
}
GeoJSON consists of the following different parts:
Typically one GeoJSON file (or dataset) will consist of a FeatureCollection containing a list of your data.
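Since GeoJSON is ordinary JSON, the standard `json` module is enough to walk a FeatureCollection. The sketch below uses a shortened copy of the example above (two of the three features) and collects names and coordinates; the variable names are illustrative.

```python
import json

# A shortened copy of the FeatureCollection above, embedded so the sketch runs on its own.
text = '''
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-90.0715, 29.9510]},
      "properties": {"name": "Fred", "gender": "Male"}
    },
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [-92.7298, 30.7373]},
      "properties": {"name": "Martha", "gender": "Female"}
    }
  ]
}
'''

collection = json.loads(text)
located = []
for feature in collection["features"]:
    # GeoJSON stores positions as [longitude, latitude], in that order.
    lon, lat = feature["geometry"]["coordinates"]
    located.append((feature["properties"]["name"], lon, lat))

print(located)
```

The coordinate order is a common source of bugs: many mapping APIs expect latitude first, while GeoJSON puts longitude first.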
In spatial relationships between features, each type of geometry (point, polyline, and polygon) has an interior and a boundary. How the interiors and boundaries of two geometries compare determines the spatial relationship they exhibit. The following image outlines the geometries, boundaries, and interiors of points, polylines, and polygons.
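To make the interior/boundary distinction concrete, here is a small sketch that classifies a point against a polygon ring as lying on its boundary, in its interior, or in its exterior. It uses ray casting, a common textbook approach chosen here for illustration; `on_segment` and `locate` are hypothetical helper names, and the ring is assumed to be closed and simple (no self-intersections, no holes).

```python
def on_segment(p, a, b, eps=1e-12):
    """True if point p lies on the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False  # p is not collinear with a and b
    return (min(ax, bx) - eps <= px <= max(ax, bx) + eps
            and min(ay, by) - eps <= py <= max(ay, by) + eps)


def locate(point, ring):
    """Classify a point against a closed ring: 'boundary', 'interior', or 'exterior'."""
    edges = list(zip(ring, ring[1:]))
    if any(on_segment(point, a, b) for a, b in edges):
        return "boundary"
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in edges:
        # Ray casting: count edges crossed by a horizontal ray going right from the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return "interior" if inside else "exterior"


square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
print(locate((2, 2), square))   # interior
print(locate((4, 2), square))   # boundary
print(locate((5, 5), square))   # exterior
```

Relationships such as "within" or "touches" come down to exactly this kind of bookkeeping: a point that touches a polygon sits on its boundary, while a point within it sits in its interior.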
A choropleth map displays divided geographical areas or regions that are coloured in relation to a numeric variable.
geo,value
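The colouring step of a choropleth can be sketched as a classification problem: each region's value is assigned to a class, and each class gets a colour. The example below assumes equal-interval classes, which is only one of several common schemes; the `rows` data, the `palette`, and `equal_interval_class` are all made up for illustration.

```python
# Hypothetical (geo, value) rows: a region identifier and the number to be shaded.
rows = [("A", 3.0), ("B", 10.0), ("C", 18.0), ("D", 25.0)]
palette = ["#eff3ff", "#bdd7e7", "#6baed6", "#2171b5"]   # light to dark, illustrative


def equal_interval_class(value, lo, hi, n_classes):
    """Index of the equal-width class containing value; hi falls in the last class."""
    if hi == lo:
        return 0
    index = int((value - lo) / (hi - lo) * n_classes)
    return min(index, n_classes - 1)


values = [v for _, v in rows]
lo, hi = min(values), max(values)
colors = {
    geo: palette[equal_interval_class(v, lo, hi, len(palette))]
    for geo, v in rows
}
print(colors)
```

The choice of class breaks changes how the map reads, so it is worth trying more than one scheme (for example, quantiles) before settling on a design.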
A cartogram is a map in which the geometry of regions is distorted in order to convey the information of an alternate variable. The region area will be inflated or deflated according to its numeric value.
How to make a cartogram?
Heat maps are useful when you have to represent large sets of continuous data on a map using a color spectrum. A heat map differs from a choropleth map in that its colors do not correspond to geographical boundaries.
This map of India shows the average annual rainfall using different shades of blue. The darker the shade of blue, the higher the rainfall.
A dot map (also called dot distribution map or dot density map) uses a dot to indicate the presence of a variable. Dot maps are essentially scatterplots on a map and are useful for showing spatial patterns.
This is a dot map of the world showing nearly 700,000 geotagged Wikipedia articles, each represented by a yellow dot, in 2011.
The same dataset, in 2018. More articles, different projections. Would you change anything?
Dots are often used in graphs, charts, and maps to accurately locate individual observations and phenomena, but that’s not the case here. If you read a dot density map that way, it’ll look like there were fatalities everywhere in Florida, and that lightning strikes become much less deadly as soon as you cross the border with Georgia or Alabama.
In a dot density map, though, each dot represents one observation, but dots aren’t located where those observations were made; instead, dots are distributed to maximize coverage and, if the placement algorithm is well designed and manually tweaked, it’ll avoid absurd placement, such as dots over lakes, rivers, or unpopulated regions.
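A deliberately naive version of dot placement can be sketched with rejection sampling: draw random positions inside the region's bounding box and keep those that fall inside the polygon. This is an assumption-laden stand-in, not the algorithm behind any particular map; it places one dot per observation, ignores lakes and unpopulated areas, and uses a hypothetical triangular region. `point_in_ring` and `scatter_dots` are illustrative names.

```python
import random


def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) is inside the closed ring."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def scatter_dots(ring, count, seed=0):
    """Place `count` dots at uniformly random positions inside the ring."""
    rng = random.Random(seed)   # fixed seed so the layout is reproducible
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    dots = []
    while len(dots) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_ring(x, y, ring):   # rejection sampling inside the bounding box
            dots.append((x, y))
    return dots


# Hypothetical triangular region with 5 observations, one dot each.
region = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (0.0, 0.0)]
dots = scatter_dots(region, count=5)
print(dots)
```

The output makes the caveat above visible: the dots say how many observations the region has, not where they happened.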
The higher the level of aggregation, the less you see the whole story.
Data from http://www-personal.umich.edu/~mejn/election/2016/
The culprit: the earth is not a perfect sphere. It’s a spheroid/ellipsoid!
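To get a feel for how far the Earth departs from a sphere, the sketch below uses the WGS84 reference ellipsoid as an example (the choice of WGS84 is an assumption made here for illustration; other ellipsoids have slightly different constants). It derives the polar radius from the equatorial radius and the flattening.

```python
# WGS84 reference ellipsoid: defining constants.
A = 6378137.0            # semi-major (equatorial) axis in metres
INV_F = 298.257223563    # inverse flattening (dimensionless)

f = 1.0 / INV_F          # flattening
b = A * (1.0 - f)        # semi-minor (polar) axis in metres

print(f"polar radius: {b:.1f} m")
print(f"equatorial minus polar: {(A - b) / 1000:.1f} km")
```

The difference is only about 21 km out of roughly 6,400 km, which is why a sphere is often a good enough approximation for small-scale maps but not for precise positioning.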
Projections: examples are ‘Mercator’, ‘UTM’, ‘Robinson’, ‘Lambert’, and ‘Albers’.
It’s good to be able to estimate locations on earth from longitude & latitude. For example:
London: latitude 51.509865°, longitude -0.118092°.
dimensionality reduction
Mappings from 3D to 2D always leave artifacts and distortions.
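As one concrete case of such distortion, the sketch below implements the spherical Mercator projection and its scale factor. It assumes a spherical Earth with a mean radius of 6,371 km, which is a simplification; `mercator` and `mercator_scale` are illustrative names.

```python
import math

R = 6371000.0  # mean Earth radius in metres (spherical approximation)


def mercator(lon_deg, lat_deg):
    """Spherical Mercator: map (longitude, latitude) in degrees to planar (x, y) in metres."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = R * lam
    y = R * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def mercator_scale(lat_deg):
    """Linear stretch factor of Mercator at a latitude: 1 at the equator, growing poleward."""
    return 1.0 / math.cos(math.radians(lat_deg))


print(mercator(-0.118092, 51.509865))   # London in projected metres
print(mercator_scale(0.0))              # 1.0
print(mercator_scale(51.509865))        # stretch at London's latitude
```

At London's latitude lengths are stretched by a factor of roughly 1.6, and at 60° by a factor of 2, so areas near the poles look far larger than they are. That is the price Mercator pays for preserving angles.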
Each projection has its strengths and weaknesses:
This section is taken from the Projections page of the Plot.js documentation, which has a lot of additional information as well.
DSAN 5200 | Spring 2024 | https://gu-dsan.github.io/5200-spring-2024/