I remember struggling to use OpenStreetMap (OSM) data in a project of mine. I wanted a free source of geospatial data that was not bound by strict and limiting copyrights, and OpenStreetMap offered a one-stop shop for exactly that. At the time, I was most interested in its very detailed road network. However, I ran into many hiccups along the way, which ultimately deterred me from completing the project. Some time has passed, and I have since returned to the project and completed the preliminary work: getting OSM data into a GIS. That brings us to the purpose of this blog post, which aims to guide users who are unfamiliar with the process of accessing, retrieving and using OSM data. By writing it I hope to “give back” to the open geospatial community, which would cease to exist without the involvement of its members. Remember that all the applications and data that we will use in this tutorial are available for free (no cost) to the public… but we all know that there is no such thing as a free lunch (or in our case, a free GIS).
Briefly...
This tutorial will cover the following:
- access and browse an OSM data repository
- download a subset (often called an ‘extract’) of the planet.osm data package
- install PostgreSQL (object-relational database system)
- install PostGIS for use with PostgreSQL (spatial database extension for PostgreSQL)
- install and utilize osm2pgsql (converts OSM data for use in the PostgreSQL database)
- install QGIS and its dependencies (GIS package)
- query and add data in QGIS from PostGIS/PostgreSQL
Also, this tutorial makes the following assumptions:
- currently running Mac OS X 10.6 or later (tutorial verified using both 10.6.7 and 10.7)
- access to administrative privileges so that we can perform installations
- plenty of storage space (extracts from OSM range from a few MB to many GBs)
- willingness to try something new!
Lastly, this tutorial uses the following format:
The tutorial is a guided, step-by-step set of instructions written from the point of view of a new user installing and processing everything from scratch. The links and information included here provide background on the project that we are about to begin. Please follow all the instructions (don’t skip any steps if you don’t know what you are doing) and download all the required data when prompted to.
We will be using Terminal in this guide, but I don’t expect readers of this post to have any Terminal.app background or knowledge, nor is it a requirement. Terminal allows users to interact with the computer through a command-line interface. If you have not seen or used the Terminal before, you may still have come across a command-line interface somewhere else (perhaps on Windows machines, à la Command Prompt). I will try my best to explain what we are doing during the part of the tutorial that uses Terminal.app.
If you would like more information about any of the items mentioned above, please feel free to visit the respective website/wiki as listed below (but remember to come back!):
OpenStreetMap http://www.openstreetmap.org/
Planet.osm http://wiki.openstreetmap.org/wiki/Planet.osm
PostgreSQL http://en.wikipedia.org/wiki/PostgreSQL
PostGIS http://en.wikipedia.org/wiki/PostGIS
Osm2pgsql http://wiki.openstreetmap.org/wiki/Osm2pgsql
QGIS http://www.qgis.org/wiki/Welcome_to_the_QGIS_Wiki
C'est bon. Allez!
OpenStreetMap & the Repositories
OpenStreetMap is a collection of free geographic data that can be viewed within a browser (http://www.openstreetmap.org/). We can also access, download and use the underlying OSM project data from various repositories. What makes this so attractive is that it's a community-driven project, which means that anyone can contribute to it. It is also free to use and distribute under the CC-BY-SA license (as long as we attribute OSM and the license itself). The standard package that OSM distributes is called the ‘planet.osm’. It is a standard XML-formatted .osm file, which at the time of writing (Aug 22, 2011) is over 220 GB (17 GB compressed). The planet.osm is updated weekly (every Thursday) and includes the latest revisions of nodes, ways and relations (roughly the points, lines and polygons of the ESRI crowd). I strongly encourage you NOT to download the entire planet.osm, because of its size and the computing power needed to work with such a database. We will be exploring an extract of the planet.osm, specifically the Province of Ontario (Canada), which comes in a much smaller, CPU-friendlier size (2 GB uncompressed / 350 MB compressed).
http://wiki.openstreetmap.org/wiki/Planet.osm provides a list of mirrors from which the planet.osm file and its extracts can be downloaded. We will specifically be using the CloudMade directory located on http://downloads.cloudmade.com/
Preface:
This tutorial will reflect what I am doing on my computer. This is done in an effort to address commonly made mistakes. Tailor the tutorial to suit your own needs or follow my instructions to a tee.
Let us make sure that we stay somewhat organized during this tutorial.
- create a new folder on the desktop called ‘osm_tutorial’
- create two subfolders within ‘osm_tutorial’ called ‘downloads’ and ‘data’
We will download all of our installation files to the ‘downloads’ directory and we will download the OSM data to the ‘data’ directory.
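If you would rather create these folders from Terminal (which we will meet properly later in this tutorial), here is a minimal sketch. It assumes your Desktop lives in the usual place inside your home folder; the folder names are the ones listed above.

```shell
# Create the tutorial folder and both subfolders in one step.
# -p creates any missing parent folders and does nothing if they already exist.
mkdir -p "$HOME/Desktop/osm_tutorial/downloads" "$HOME/Desktop/osm_tutorial/data"

# List the contents of the new folder to confirm both subfolders are there
ls "$HOME/Desktop/osm_tutorial"
```

Running it a second time is harmless, so there is no risk of overwriting anything you have already downloaded.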
Access, browse & download OSM data
Let’s download an extract of the planet.osm file (remember that the planet.osm file is too large to handle on its own). We will use the CloudMade repository, which updates its planet.osm and ‘extract’.osm files (where ‘extract’ is the name of a location/place) weekly.
The repository is organized in a hierarchical structure (Region > SubRegion > Country > Province or State). Note that CloudMade has not split the entire planet.osm into smaller extracts; only the most popular or in-demand areas have been extracted for us to use. Some repositories extract different regions than others, and some also offer extracts of smaller areas (cities, towns, etc.).
I am going to be using the Ontario extract, which is located in Americas > Northern America > Canada > Ontario. Feel free to use any of the other extracts. However, I recommend that you choose an extract from the lowest level in the hierarchy (e.g. provinces or cities) and stay away from the larger extracts such as regions or countries.
- Follow the folder structure ‘Americas > Northern America > Canada > Ontario’ to find the ontario.osm.bz2 extract
- Right click > Save Link As on the ‘ontario.osm.bz2’ link
- Save the file to your ‘/Desktop/osm_tutorial/data’ folder
Do not extract this file. A benefit of the tools used in this tutorial is that they can work with fully compressed OSM data. Why is the file named ‘ontario.osm.bz2’? bzip2 (bz2) is an open-source compression tool that OSM uses to produce small distribution packages (e.g. a 17 GB ‘planet.osm.bz2’ vs. a 220 GB uncompressed ‘planet.osm’).
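If you are curious about what is inside a .bz2 file, you can peek without unpacking it. The sketch below is only an illustration: it builds a tiny, made-up stand-in for an OSM extract (the file name sample.osm and its two nodes are hypothetical, not real OSM content), compresses it, and then reads it straight from the compressed file using the standard bzip2 and bzcat tools.

```shell
# Build a tiny stand-in for an OSM extract in a temporary folder
# (hypothetical sample data, not real OSM content)
workdir=$(mktemp -d)
cat > "$workdir/sample.osm" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="43.65" lon="-79.38"/>
  <node id="2" lat="43.66" lon="-79.39"/>
</osm>
EOF

# Compress it the same way the extracts are distributed (-k keeps the original)
bzip2 -k "$workdir/sample.osm"

# Check that the archive is intact without unpacking it
bzip2 -t "$workdir/sample.osm.bz2" && echo "archive OK"

# Stream the first two lines straight out of the compressed file
bzcat "$workdir/sample.osm.bz2" | head -n 2

# Count the nodes without writing the uncompressed file to disk
node_count=$(bzcat "$workdir/sample.osm.bz2" | grep -c '<node ')
echo "nodes: $node_count"
```

With the real download, the same bzip2 -t and bzcat commands can be pointed at ‘ontario.osm.bz2’ instead of the sample file.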
Download and install PostgreSQL
The next step is to download our database server, PostgreSQL. There are many versions and distributions of PostgreSQL, and it can get quite confusing for the beginner. Luckily, Dave Page at EnterpriseDB maintains an easy-to-use all-in-one installer that is available for Mac OS X (and other platforms as well).
- Go to http://www.enterprisedb.com/products-services-training/pgdownload#osx
- Select ‘Installer version Version 9.0.4-1’ (latest at the time of writing) for Mac OS X
- Save the file to your ‘/Desktop/osm_tutorial/downloads’ folder
We will now try to install PostgreSQL. I say ‘try’ because *if* this is your first attempt at installing PostgreSQL, you will be prompted about your computer’s ‘Shared Memory’ configuration. Not to worry: PostgreSQL handles the changes that are necessary, and yes, it is safe to allow PostgreSQL to make these changes. I’ve included a snippet from the PostgreSQL ‘readme’.
Shared Memory
-------------
PostgreSQL uses shared memory extensively for caching and inter-process communication. Unfortunately, the default configuration of Mac OS X does not allow suitable amounts of shared memory to be created to run the database server.
- Mount the PostgreSQL disk image by double-clicking on ‘postgresql-9.0.4-1-osx.dmg’
- Launch the installer and allow it to make the necessary changes
- You will need to restart your computer. Remember to bookmark me before you go!
Welcome back!
- Re-mount the PostgreSQL disk image by double-clicking on ‘postgresql-9.0.4-1-osx.dmg’ (if it is not already mounted on the desktop)
- The installer will now pass the shared memory check and load the setup wizard
- Proceed through the setup wizard, accepting all default configuration settings. Remember to assign a unique password for your PostgreSQL database when prompted to
- Make special note of the port number ‘5432’ that PostgreSQL has been assigned. We will need this number later in the tutorial
The installer will take a minute or so to complete and will then ask whether you would like to ‘Launch Stack Builder at exit?’ We do want this, because PostgreSQL cannot handle our OSM data on its own and we need to install the PostGIS extension. PostGIS will handle the spatial information in the OSM data on behalf of our PostgreSQL database.
- Make sure the checkbox to launch stack builder on exit is ticked
- When stack builder launches, it will ask you to select your PostgreSQL installation from the drop-down style menu
- Select ‘Spatial Extensions > PostGIS’ (as seen below)
- Proceed through the rest of the installation, accepting all default configuration options
- When prompted to, input your unique password for the PostgreSQL database. By default, the port number (5432) and username (postgres) are correct
- Once the installation is completed, launch ‘pgAdmin III’ from the ‘Applications’ folder
- Double-clicking on ‘PostgreSQL 9.0 (localhost:5432)’ will start the database (located under ‘Object browser > Servers’ in the left pane)
- Right click > New Database… in the central pane (making sure not to select the ‘postgres’ or ‘template_postgis’ items)
- Fill out the ‘New Database…’ form with the settings as shown below (the rest of this tutorial assumes that the database is named ‘gis’)
- That’s it, we can exit the application
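If you would rather create the database from Terminal (we will meet Terminal properly in a later section), PostgreSQL ships with a small tool called createdb. Here is a minimal sketch, not a record of what I did. It assumes the database is named ‘gis’ (the name used later in this tutorial), that it is built from the ‘template_postgis’ template so the PostGIS functions are available, and that createdb is on your PATH (the EnterpriseDB installer may place it elsewhere, for example under /Library/PostgreSQL/9.0/bin). The sketch only prints the command so that you can check it before running it yourself.

```shell
# Connection details from the installer defaults described above
db_user="postgres"
db_port="5432"

# Assumptions: the database name used later in this tutorial,
# and the PostGIS template that Stack Builder installs
db_name="gis"
template="template_postgis"

# Build the command and print it, rather than running it straight away
create_cmd="createdb -U $db_user -p $db_port -T $template $db_name"
echo "$create_cmd"
```

Copy the printed line into Terminal to actually create the database; createdb should ask for the password you chose during installation.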
Download and install osm2pgsql
Our next step is to download and install osm2pgsql, which will load the ‘ontario.osm.bz2’ file into our PostgreSQL database.
- Go to http://dbsgeo.com/downloads/#osm2pgsql
- Select and download ‘osm2pgsql 0.70.5 — Snow Leopard / OS 10.6 (Intel x86_64)’ to your ‘/Desktop/osm_tutorial/downloads’ folder
- Mount the osm2pgsql diskimage by double-clicking ‘osm2pgsql_0.70.5_snow_intel_r1.dmg’
- You must install GEOS and PROJ frameworks before installing the osm2pgsql tool. Do this now. Proceed through the installations, accepting all default configurations
- Once GEOS and PROJ are installed, we can move on to osm2pgsql. Accept all default configurations when installing osm2pgsql
Using osm2pgsql and the Terminal.app
We will use osm2pgsql to parse our OSM data into the PostgreSQL database. Here is where we will encounter Terminal.app. As I mentioned earlier, I will try to explain exactly what we are telling Terminal to do.
- Launch Terminal. You can use Spotlight (the magnifying glass in the top right-hand corner) to search for it
The first lines (if you haven’t opened Terminal recently) should look similar to this (with your own computer name and username in place of mine). Terminal is letting us know whose computer we are on (Michael-Markietas-MacBook-Pro) and where we are performing our tasks (a directory; in this case ~, which is shorthand for michaelmarkieta’s home directory)
Last login: Mon Aug 22 19:38:44 on console
Michael-Markietas-MacBook-Pro:~ michaelmarkieta$
We need to change the working directory from our user directory (michaelmarkieta) to the desktop. We use the ‘cd’ command which intuitively means ‘change directory’. We also pass in where we would like to change our directory to. In this case, we will change directory to the ‘desktop’. Type the following into Terminal (or copy and paste):
cd desktop
If done correctly, Terminal will switch the current working directory to the desktop and the repeating line of code should read something like this:
Michael-Markietas-MacBook-Pro:desktop michaelmarkieta$
Now that we know how to change working directories, let's navigate to our ontario.osm.bz2 file, which is located in ‘desktop/osm_tutorial/data’. To move through more than one folder at a time, we separate the folder names with forward slashes (/). Wrapping the path in double quotation marks ("") is optional here, but it becomes necessary whenever a folder name contains a space. Type the following into Terminal (or copy and paste):
cd "osm_tutorial/data"
Terminal should now be working from within the ‘data’ folder, which is located inside the ‘osm_tutorial’ folder, which is located on the ‘desktop’, which in turn is part of michaelmarkieta’s user folder. The full path would look something like ‘/Users/michaelmarkieta/Desktop/osm_tutorial/data’.
Michael-Markietas-MacBook-Pro:data michaelmarkieta$
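If you would like to practise moving around before touching the real folders, here is a small sketch you can paste into Terminal. It works inside a throwaway temporary folder (the folder names below are made up for practice) and uses ‘pwd’, which prints the current working directory, to show where we are.

```shell
# Practice area in a temporary folder (stands in for the Desktop)
practice=$(mktemp -d)
mkdir -p "$practice/osm_tutorial/data" "$practice/my folder"

# Move down two levels at once; quotes are optional because there are no spaces
cd "$practice"
cd osm_tutorial/data
pwd

# Go back up two levels; each .. means 'the folder above this one'
cd ../..

# A folder name containing a space must be wrapped in quotes
cd "my folder"
here=$(basename "$PWD")
echo "$here"
```

The last line prints only the name of the folder we ended up in, which makes it easy to confirm that the quoted ‘cd’ worked.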
Now to perform some osm2pgsql magic. The tool offers many configurable parameters. These can be seen by typing the following into Terminal:
osm2pgsql --help
We will concern ourselves with only a few of these parameters, but it's useful to look over what is included with a tool when you install something new. We will be using -U (username), -d (database name) and -P (port) to guide osm2pgsql in parsing the ontario.osm.bz2 file into our PostgreSQL database. The command will need to look like this:
osm2pgsql -d gis -U postgres -P 5432 ontario.osm.bz2
Let's pick this apart one chunk at a time.
osm2pgsql: in words, we are telling Terminal.app to start the osm2pgsql script
-d gis: in words, we are configuring the script to use the database called ‘gis’
-U postgres: in words, we are configuring the script to access the database with username ‘postgres’
-P 5432: in words, we are configuring the script to access the database on port 5432
ontario.osm.bz2: in words, this is the file we would like our script to parse into our database
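Two things commonly go wrong at this point: Terminal cannot find osm2pgsql, or we are not in the folder that holds the extract. The sketch below wraps the same command in two quick checks. The variable names (extract, status) are my own for illustration; the osm2pgsql command itself is unchanged.

```shell
# The extract we downloaded earlier, expected in the current folder
extract="ontario.osm.bz2"

# Pre-flight checks: is the tool installed, and is the file here?
if ! command -v osm2pgsql >/dev/null 2>&1; then
  status="osm2pgsql not found on PATH"
elif [ ! -f "$extract" ]; then
  status="$extract not found in $(pwd)"
else
  status="ready"
fi
echo "$status"

# Only start the import when both checks pass
if [ "$status" = "ready" ]; then
  osm2pgsql -d gis -U postgres -P 5432 "$extract"
fi
```

If the first message appears, revisit the osm2pgsql installation; if the second appears, use ‘cd’ to move into the ‘data’ folder and try again.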
We will need to let it run for a few minutes. Make yourself some coffee/tea, or some breakfast (if you are working into the wee hours of the night/morning like I am). We can finish up this tutorial when the tool is done running.
If everything went smoothly, you should have received the following confirmation in Terminal:
Completed planet_osm_point
Completed planet_osm_roads
Completed planet_osm_polygon
Completed planet_osm_line
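For a little extra reassurance, we can ask the database how many rows landed in each of the four tables named above. This is a sketch of one way to do it, not part of the original workflow: it builds one row-count query per table and prints them, and the comment at the end shows how they could be handed to psql, the command-line client that comes with PostgreSQL (assuming psql is on your PATH and the database is named ‘gis’).

```shell
# The four tables named in the confirmation messages
tables="planet_osm_point planet_osm_line planet_osm_polygon planet_osm_roads"

# Build one row-count query per table, one query per line
sql=""
for table in $tables; do
  sql="${sql}SELECT '$table' AS table_name, count(*) AS row_count FROM $table;
"
done

# Print the queries so they can be checked before running them
printf '%s' "$sql"

# To run them against the database, pipe them into psql
# (psql should ask for the password chosen during installation):
#   printf '%s' "$sql" | psql -U postgres -p 5432 -d gis
```

A row count of zero in every table would suggest that the import did not actually load anything.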
This tutorial is getting a bit too long and will be continued on another page. But first, let's take a look at what we have completed so far from the list of items we were supposed to tackle.
- access and browse an OSM data repository (done)
- download a subset (often called an ‘extract’) of the planet.osm data package (done)
- install PostgreSQL (object-relational database system) (done)
- install PostGIS for use with PostgreSQL (spatial database extension for PostgreSQL) (done)
- install and utilize osm2pgsql (converts OSM data for use in the PostgreSQL database) (done)
- install QGIS and its dependencies (GIS package)
- query and add data in QGIS from PostGIS/PostgreSQL
Looks like we are almost there! Stay tuned everyone...
Part two is now up: http://www.spatialanalysis.ca/2011/using-openstreetmap-data-part-2/