### Introduction In this article, I'll show you how to use geospatial data from the Overture Maps Foundation. They store this valuable data as geoparquet files in S3 buckets. I'll guide you through using Amazon Athena with SQL to access and query this data. You'll learn how to create an S3 bucket, and configure Amazon Athena settings to run the queries. ![[Pasted image 20240131031036.png]] Then, I'll walk you through converting large CSV files into formats that work well with GIS Desktop Apps like QGIS and ArcGIS Pro. You'll see how to transform these files into GeoPackage and File Geodatabase formats, making them ready for efficient use in GIS applications. ![[Pasted image 20240126052315.png]] ### 1. Create a new S3 Bucket in us-west-2 region 1. You will need an AWS account. 2. Ensure that you are operating in the us-west-2 region. ![[Pasted image 20240126000533.png]] 3. Create an [S3 bucket](https://s3.console.aws.amazon.com/s3/home?region=us-west-2) in the same region (us-west-2) with a unique name. ![[Pasted image 20240126054508.png]] ### 2. Amazon Athena 1. Go to settings in [Athena](https://us-west-2.console.aws.amazon.com/athena/home?region=us-west-2#/query-editor/settings) then choose the created bucket ![[Pasted image 20240125181006.png]] 2. Run below queries to set up your view of the tables: [click for all queries](https://github.com/OvertureMaps/data/blob/main/athena_setup_queries.sql). ```sql CREATE EXTERNAL TABLE `buildings`( `id` string, `geometry` binary, `bbox` struct<minx:double,maxx:double,miny:double,maxy:double>, `names` struct<common:array<struct<value:string,language:string>>,official:array<struct<value:string,language:string>>,alternate:array<struct<value:string,language:string>>,short:array<struct<value:string,language:string>>>, `version` int, `updatetime` string, `sources` array<struct<property:string,dataset:string,recordId:string,confidence:double>>, `class` string, `hasparts` boolean, `height` double, `numfloors` int, `facadecolor` string, `facadematerial` string, `roofmaterial` string, `roofshape` string, `roofdirection` double, `rooforientation` string, `roofcolor` string, `eaveheight` double, `level` int, `minheight` double, `buildingid` string) PARTITIONED BY ( `type` string) STORED AS PARQUET LOCATION 's3://overturemaps-us-west-2/release/2024-01-17-alpha.0/theme=buildings' ``` 3. Be sure to load the partitions by running `MSCK REPAIR <tablename>;` or choosing "Load Partitions" from the table options menu, you can run query below. ```sql -- Load partitions MSCK REPAIR TABLE `buildings` ``` 4. Run the following query to fetch buildings in a specific area. The WKT geometry will be in WKT format, as AWS Athena defaults to CSV for exporting query results. Use Parquet format if needed. ```sql SELECT id, bbox, names, version, updatetime, sources, class, hasparts, height, numfloors, facadecolor, facadematerial, roofmaterial, roofshape, roofdirection, rooforientation, roofcolor, eaveheight, level, minheight, buildingid, type, st_geomfrombinary(geometry) as wkt FROM buildings WHERE st_intersects(st_geomfrombinary(geometry),st_geometryfromtext('Here you can input your WKT. Create one using geojson.io and save it as WKT')) ``` *Don't forget to replace WKT with yours inside st_geometryfromtext('')* ![[Pasted image 20240126040821.png]] Run time: 6 min 16.603 sec Data scanned: 137.11 GB For #Egypt #Boundaries: Download the geojson file from [https://github.com/wmgeolab/geoBoundaries/raw/905b0ba/releaseData/gbOpen/EGY/ADM0/geoBoundaries-EGY-ADM0.geojson](https://github.com/wmgeolab/geoBoundaries/raw/905b0ba/releaseData/gbOpen/EGY/ADM0/geoBoundaries-EGY-ADM0.geojson),simplify it with [https://mapshaper.org/](https://mapshaper.org/), and convert it to WKT using [https://geojson.io/](https://geojson.io/). 5. Download the data by clicking the 'Download result' button, or find it in the S3 bucket we created. ### 3. Data Conversion I will guide you through the process of data conversion using `ogr2ogr`, enabling you to convert large CSV files to extensions easily works in GIS Desktop Apps like #QGIS and #ArcGIS Pro and spatial indexed by default. 1. CSV to #GeoPackge (Works in QGIS and ArcGIS Pro) ```bash ogr2ogr egypt_overturemaps.gpkg e54ffded-8544-4010-ab4b-e9db0cf06616.csv -oo AUTODETECT_TYPE=YES -oo KEEP_GEOM_COLUMNS=NO -a_srs 'EPSG:4326' -nln "egypt_buildings" ``` 2. CSV to #FileGDB (File Geodatabase) (Works in QGIS and ArcGIS Pro) ```bash ogr2ogr egypt_overturemaps.gdb e54ffded-8544-4010-ab4b-e9db0cf06616.csv -oo AUTODETECT_TYPE=YES -oo KEEP_GEOM_COLUMNS=NO -a_srs 'EPSG:4326' -nlt MULTIPOLYGON -nln "egypt_buildings" ``` ![[Pasted image 20240126052439.png]] ### Download the processed data Download #GeoPackge (.gpkg) : https://bit.ly/3HMdD1i Download #FileGeodatabase (.gdb) : https://bit.ly/3uaTv5L PLEASE DON'T TRY TO CONVERT BIG DATA TO #SHAPEFILES ! USE AT YOUR OWN RISK. ### YouTube Video (Arabic) ![](https://youtu.be/s9Ayep6NLes) ### REF https://github.com/OvertureMaps/data https://docs.aws.amazon.com/athena/latest/ug/geospatial-functions-list-v2.html https://gdal.org/drivers/vector/csv.html https://gdal.org/drivers/vector/gpkg.html https://gdal.org/drivers/vector/openfilegdb.html