SpatiaLite Cookbook Chapter 05: Desserts, spirits, tea and coffee |
Back to the SpatiaLite home page Previous Chapter Chapter 04: Haute cuisine Back to the Cookbook home page Next Chapter Chapter 06: Cooking basics, which are useful to know
DB pages / page cache
Any SQLite DB simply is a single monolithic file: any data and related info is stored within this files.PRAGMA page_size;. |
1024. |
PRAGMA page_count;. |
31850. |
PRAGMA freelist_count;. |
12326. |
PRAGMA page_size = 4096;. |
PRAGMA page_size;. |
1024. |
VACUUM;. |
PRAGMA page_size;. |
4096. |
PRAGMA page_count;. |
5197. |
PRAGMA freelist_count;. |
0. |
PRAGMA cache_size;. |
1000. |
PRAGMA cache_size = 1000000;. |
PRAGMA cache_size;. |
1000000. |
Shapefiles
SpatiaLite supports both import and export for Shapefiles:> spatialite provinces.sqlite SpatiaLite version ..: 5.0.0 Supported Extensions: - 'VirtualShape' [direct Shapefile access] - 'VirtualDbf' [direct DBF access] - 'VirtualText' [direct CSV/TXT access] - 'VirtualNetwork' [Dijkstra shortest path] - 'RTree' [Spatial Index - R*Tree] - 'MbrCache' [Spatial Index - MBR cache] - 'VirtualSpatialIndex' [R*Tree metahandler] - 'VirtualElementary' [ElemGeoms metahandler] - 'VirtualKNN' [K-Nearest Neighbors metahandler] - 'VirtualXPath' [XML Path Language - XPath] - 'VirtualFDO' [FDO-OGR interoperability] - 'VirtualGPKG' [OGC GeoPackage interoperability] - 'VirtualBBox' [BoundingBox tables] - 'SpatiaLite' [Spatial SQL - OGC] PROJ.4 version ......: Rel. 4.9.3, 15 August 2016 GEOS version ........: 3.5.0-CAPI-1.9.0 r4084 RTTOPO version ......: 1.1.0-dev SQLite version ......: 3.22.0 Enter ".help" for instructions spatialite> .loadshp prov2010_s provinces CP1252 32632 the SPATIAL_REF_SYS table already contains some row(s) ======== Loading shapefile at 'prov2010_s' into SQLite table 'provinces' BEGIN; CREATE TABLE provinces ( PK_UID INTEGER PRIMARY KEY AUTOINCREMENT, "OBJECTID" INTEGER, "cod_pro" INTEGER, "NOME_PRO" TEXT, "SIGLA" TEXT); SELECT AddGeometryColumn('provinces', 'Geometry', 32632, 'MULTIPOLYGON', 'XY'); COMMIT; Inserted 110 rows into 'provinces' from SHAPEFILE ======== spatialite> .headers on spatialite> SELECT * FROM provinces LIMIT 5; PK_UID|OBJECTID|cod_pro|NOME_PRO|SIGLA|Geometry 1|1|1|Torino|TO| 2|2|2|Vercelli|VC| 3|3|3|Novara|NO| 4|4|4|Cuneo|CN| 5|5|5|Asti|AT| spatialite> |
spatialite> .dumpshp provinces Geometry exported_provinces CP1252 ======== Dumping SQLite table 'provinces' into shapefile at 'exported_provinces' SELECT * FROM "provinces" WHERE GeometryAliasType("Geometry") = 'POLYGON' OR GeometryAliasType("Geometry") = 'MULTIPOLYGON' OR "Geometry" IS NULL; Exported 110 rows into SHAPEFILE ======== spatialite> .quit > |
> spatialite_tool -i -shp prov2010_s -d db.sqlite -t provinces -c CP1252 -s 32632 SQLite version: 3.7.4 SpatiaLite version: 2.4.0-RC5 Inserted 110 rows into 'provinces' from 'prov2010_s.shp' > |
> spatialite_tool -e -shp exported_provinces -d db.sqlite -t provinces -g Geometry -c CP1252 SQLite version: 3.7.4 SpatiaLite version: 2.4.0-RC5 Exported 110 rows into 'exported_provinces.shp' from 'provinces' > |
DBF files
SpatiaLite simply supports import for DBF files:> spatialite communities.sqlite SpatiaLite version ..: 5.0.0 Supported Extensions: - 'VirtualShape' [direct Shapefile access] - 'VirtualDbf' [direct DBF access] - 'VirtualText' [direct CSV/TXT access] - 'VirtualNetwork' [Dijkstra shortest path] - 'RTree' [Spatial Index - R*Tree] - 'MbrCache' [Spatial Index - MBR cache] - 'VirtualFDO' [FDO-OGR interoperability] - 'SpatiaLite' [Spatial SQL - OGC] PROJ.4 version ......: Rel. 4.7.1, 23 September 2009 GEOS version ........: 3.3.0-CAPI-1.7.0 SQLite version ......: 3.7.4 Enter ".help" for instructions spatialite> .loaddbf com2010_s.dbf communities CP1252 ======== Loading DBF at 'com2010_s.dbf' into SQLite table 'communities' BEGIN; CREATE TABLE communities ( PK_UID INTEGER PRIMARY KEY AUTOINCREMENT, "OBJECTID" INTEGER, "cod_reg" INTEGER, "cod_pro" INTEGER, "cod_com" INTEGER, "pro_com" INTEGER, "comune" TEXT, "NOME_ITA" TEXT, "NOME_TED" TEXT); COMMIT; Inserted 8094 rows into 'communities' from DBF ======== spatialite> .headers on spatialite> SELECT * FROM communities LIMIT 5 OFFSET 5000; PK_UID|OBJECTID|cod_reg|cod_pro|cod_com|pro_com|comune|NOME_ITA|NOME_TED 5001|4958|12|58|54|58054|Manziana|Manziana| 5002|4959|12|58|55|58055|Marano Equo|Marano Equo| 5003|4960|12|58|56|58056|Marcellina|Marcellina| 5004|4961|12|58|57|58057|Marino|Marino| 5005|4962|12|58|58|58058|Mazzano Romano|Mazzano Romano| spatialite> .quit > |
> spatialite_tool -i -dbf com2010_s -d db.sqlite -t communities -c CP1252 SQLite version: 3.7.4 SpatiaLite version: 2.4.0-RC5 Inserted 8094 rows into 'communities' from 'com2010_s.dbf' > |
TXT/CSV files
SpatiaLite supports both import and export for TXT/CSV files:Other supported export formats
Using spatialite_gui. you can also export your data as:The basic approach
Any language supporting any generic SQLite driver aka connector can fully support SpatiaLite.SELECT load_extension('path_to_extension_library');.
Executing the above SQL statement will load any SQLite's extension: this obviously including SpatiaLite.BEWARE:
The VirtualNetwork module has deprecated, and is only preserved for continued support of legacy apps.This text will remain available to preserve a full historical record.
All new applications should, instead, use the more recent VirtualRouting.
The official documentation about the Spatial Indexing support can be found here:
The VirtualRouting is a pure SQL solution, as apposed to VirtualNetwork which needed an external tool (spatialite_network)
SpatiaLite supports an internal routing module called VirtualNetwork.
Starting from an arbitrary network this module allows to identify shortest path connections using simple SQL queries.
The VirtualNetwork. module supports sophisticated and highly optimized algorithms, so it's really fast and very efficient even using huge sized networks.
Network foundations
You cannot assume that any generic road layer corresponds to a network.
A real network must satisfy several specific prerequisites, i.e. it has to be a graph.
Graph theory is a wide and complex branch of mathematics;
if you are interested in this, here you can get some further details:
Graph Theory
Shortest Path Problem
Dijkstra's Algorithm
A* Algorithm
Very shortly explained:Starting from a network aka graph both Dijkstra's and A* algorithms can then identify the shortest path (minimal cost connection) connecting any arbitrary couple of nodes.
- a network is a collection of arcs
- each single arc connects two nodes
- each arc has an unique direction:
i.e. the arc going from A-node to B-node is not necessarily the same one going from B to A- each arc has a well known cost (e.g. length, travel time, capacity, ...)
- both arcs and nodes must expose some explicitly defined unique identifier.
- geometries of arcs and nodes must satisfy a strong topological consistency.
There are several sources distributing network-like data.
One of the most renowned and widely used is OSM [Open Street Map], a completely free worldwide dataset.
There are several download sites distributing OSM; just to mention the main ones: Anyway in the following example we'll download the required OSM dataset from: www.gfoss.it
Most precisely we'll download the toscana.osm.bz2. dataset.Version from 2014-06-26:
- toscana.osm.bz2 : 105 MB
- toscana.osm : 1.5 GB
- tuscany.sqlite : 84.4 MB (after import by spatialite_osm_net [2018-09-28] )
[2018-09-28] or from Geofabrik
- OpenStreetMap Data Extracts/Europe/Italy/Centro
- http://download.geofabrik.de/europe/italy/centro-latest.osm.bz2
Step 1: you must uncompress the OSM dataset.
This file is compressed using the bzip2 algorithm, widely supported by many open source tools.
e.g. you can use 7-zip to unzip this file. www.7-zip.org
Step 2: any OSM dataset simply is an XML file
(you can open this file using any ordinary text editor at your choice).
SpatiaLite supports a specific CLI tool allowing to load an OSM dataset into a DB: spatialite_osm_net.
>spatialite_osm_net -o toscana.osm -d tuscany.sqlite -T tuscany -m
SQLite version: 3.22.0
SpatiaLite version: 5.0.0
using IN-MEMORY database
Parsing input: Pass 1 [Nodes and Ways] ...
Parsing input: Pass 2 [Arcs of the Graph] ...
UNRESOLVED-NODE 1520302186
..
UNRESOLVED-NODE 1964678842
Creating helper table 'tuscany_nodes' ... wait please ...
Helper table 'tuscany_nodes' successfully created
Dropping temporary table 'osm_tmp_nodes' ... wait please ...
Dropped table 'osm_tmp_nodes'
Dropping temporary table 'graph_nodes' ... wait please ...
Dropped table 'graph_nodes'
exporting IN_MEMORY database ... wait please ...
IN_MEMORY database successfully exported
VACUUMing the DB ... wait please ...
All done: OSM graph was successfully loaded
>
Very briefly explained:
- -o toscana.osm. selects the input OSM dataset to be loaded.
- -d tuscany.sqlite. selects the output DB to be created and populated.
- -T tuscany. will create the Geometry Table storing the OSM dataset
- -m. an in-memory database will be used, so to perform data import in the shortest time.
SELECT
*
FROM tuscany;
Just a quick check:
id osm_id class node_from node_to name oneway_from_to oneway_to_from length cost geometry ... ... ... ... ... ... ... ... ... ... ... 2393 8079944 tertiary 659024545 659024546 Via Cavour 1 1 7.468047 0.537699 BLOB sz=80 GEOMETRY 2394 8079944 tertiary 659024546 156643876 Via Cavour 1 1 12.009911 0.864714 BLOB sz=96 GEOMETRY 2395 8083989 motorway 31527668 319386487 Autostrada del Sole 1 0 424.174893 13.882087 BLOB sz=80 GEOMETRY 2396 8083990 motorway 31527665 31527668 Autostrada del Sole 1 0 130.545183 4.272388 BLOB sz=112 GEOMETRY ... ... ... ... ... ... ... ... ... ... ... Please note #1: there is no separate representation for nodes, simply because they can be indirectly retrieved starting from the corresponding arcs.
- a single Tuscany. table exists into the DB created by spatialite_osm_net.
- each row in this table corresponds to a single network arc
- the nodes connected by each arc are identified by node_from. and node_to.
- oneway_from_to. and oneway_to_from. determine if the arc can be walked in both directions or not.
- length. is the geometric length of the arc (measured in meters).
- cost. is the estimated travel time (expressed in seconds).
- geometry. is the LINESTRING. representation corresponding to the arc.
Please note #2: this one surely is a real network, but in this form cannot yet support routing queries.
A further step is still required, i.e. creating a VirtualNetwork. table.We'll use spatialite_gui. to create the VirtualNetwork table.
Anyway the same operation is supported as well by the spatialite_network. CLI tool
(and this CLI tool supports an extended diagnostic capability, useful to identify any eventual problem).
SELECT
*
FROM tuscany_net
WHERE NodeFrom = 267209305
AND NodeTo = 267209702;
And finally you can now test your first routing query:
Algorithm ArcRowid NodeFrom NodeTo Cost Geometry Name Dijkstra NULL 267209305 267209702 79.253170 BLOB sz=272 GEOMETRY NULL Dijkstra 11815 267209305 250254381 11.170037 NULL Via Guelfa Dijkstra 11816 250254381 250254382 8.583739 NULL Via Guelfa Dijkstra 11817 250254382 250254383 12.465016 NULL Via Guelfa Dijkstra 16344 250254383 256636073 15.638407 NULL Via Cavour Dijkstra 67535 256636073 270862435 3.147105 NULL Piazza San Marco Dijkstra 25104 270862435 271344268 5.175379 NULL Piazza San Marco Dijkstra 25105 271344268 82591712 3.188657 NULL Piazza San Marco Dijkstra 11802 82591712 267209666 4.978328 NULL Piazza San Marco Dijkstra 20773 267209666 267209702 14.906501 NULL Via Giorgio La Pira
- you simply have to set the WHERE NodeFrom = ... AND NodeTo = .... clause.
- and a result-set representing the shortest path solution will be returned.
- the first row of this result-set summarizes the whole path, and contains the corresponding geometry.
- any subsequent row represents a single arc to be traversed, following the appropriate sequence, so to go from origin to destination.
SpatiaLite's VirtualNetwork. tables support two alternative algorithms:UPDATE tuscany_net SET Algorithm = 'A*';
UPDATE tuscany_net SET Algorithm = 'Dijkstra';
- Dijkstra's shortest path is a classic routing algorithm, based on thorough mathematical assumptions, and will surely identify the optimal solution.
- A* is an alternative algorithm based on heuristic assumptions:
it is usually faster than Dijkstra's, but under some odd condition may eventually fail, or may return a sub-optimal solution.- anyway switching from the one to the other is really simple.
- using the Dijksta's algorithm is the default selection.
A VirtualNetwork. table simply represents a staticized snapshot of the underlying network.
This allows to adopt an highly efficient binary representation (in other words, allows to produce solutions in a very quick time), but obviously doesn't supports dynamic changes.
Each time the underlying network changes the corresponding VirtualNetwork. must be DROP.ped and then created again, so to correctly reflect the latest network state.
In many cases this isn't an issue at all: but on some highly dynamic scenario this may be a big annoyance.
Be well conscious of this limitation.
Back to the SpatiaLite home page Previous Chapter Chapter 04: Haute cuisine Back to the Cookbook home page Next Chapter Chapter 06: Cooking basics, which are useful to know