Correspondences between SQL Functions and DE-9IM Intersection Matrices
Quick recall:- The standard Spatial SQL model defines several functions (as e.g. ST_Equals(), ST_Intersects(), ST_Touches() and alike) intended to test if a given spatial relationship exists between a pair of Geometries.
- All such SQL functions are simply convenience wrappers based on the most general DE-9IM model.
- An alternative (more flexible, more generic and more efficient) approach for evaluating spatial relationships could be based on the following two advanced SQL functions:
- ST_Relate ( g1 Geometry , g2 Geometry ) : intersection_matrix Text
This first function will evaluate both Geometries, then returning a Text string corresponding to a nine digits serialized DE-9IM intersection_matrix as e.g. '0F1FF0FF2' or '212101212' with the following interpretation:- F: this stands for FALSE, i.e. the corresponding intersection doesn't occurs.
- 0: the corresponding intersection is a Point, i.e. a zero-dimensions Geometry.
- 1: the corresponding intersection is a Linestring, i.e. a one-dimension Geometry.
- 2: the corresponding intersection is a Polygon, i.e. a two-dimensions Geometry.
- ST_RelateMatch ( intersection_matrix Text , reference_pattern Text ) : true_or_false Boolean
This second function is intended to evaluate an intersection matrix returned by a previous call to ST_Relate() against a given reference_pattern, then returning TRUE or FALSE.
A reference_pattern as e.g. 'T*F**FFF*' or '0********' simply is yet another nine digits serialized DE-9IM matrix adopting a slightly changed syntax:- F, 0, 1 and 2: same interpretation as before.
- T: this stands for TRUE, i.e. the corresponding intersection is expected to occur, but its specific nature doesn't matters.
- *: an asterisk is a mask sign and always means ignore / don't care / skip this matrix cell.
- Note: ST_Relate() is a computationally heavy function (exactly as ST_Equals(), ST_Crosses() and alike are); on the other hand ST_RelateMatch() is a very light and quick function.
So when you have to check for more than a single relationship at the same time, calling ST_Relate() just once and then evaluating each spatial relationship by calling ST_RelateMatch() multiple times will always ensure noticeably faster performances. - Conclusion: using the quirky and awkward DE-9IM model surely is neither easy or intuitive.
The following table is specifically intended to make the usage of both ST_Relate() and ST_RelateMatch() easier, in a reasonably painless way.
- ST_Relate ( g1 Geometry , g2 Geometry ) : intersection_matrix Text
SQL function | Reference Pattern |
---|---|
ST_Equals( g1, g2 ) | T*F**FFF* |
ST_Disjoint( g1, g2 ) | FF*FF**** |
ST_Touches( g1, g2 ) | FT******* |
F**T***** | |
F***T**** | |
ST_Within( g1, g2 ) | T*F**F*** |
ST_Overlaps( g1, g2 ) | T*T***T** |
1*T***T** | |
ST_Crosses( g1, g2 ) | T*T****** |
T*****T** | |
0******** | |
ST_Intersects( g1, g2 ) | T******** |
*T******* | |
***T***** | |
****T**** | |
ST_Contains( g1, g2 ) | T*****FF* |
ST_Covers( g1, g2 ) | T*****FF* |
*T****FF* | |
***T**FF* | |
****T*FF* | |
ST_CoveredBy( g1, g2 ) | T*F**F*** |
*TF**F*** | |
**FT*F*** | |
**F*TF*** |
Further useful readings:
- a general introduction to DE-9IM
- a well-written documentation from a Proprietary SW Vendor
A practical example (and related objective timings)
SELECT m.name AS municipality, p.name AS province, ST_Disjoint(m.geom, p.geom) AS disjoint, ST_Touches(m.geom, p.geom) AS touches, ST_Within(m.geom, p.geom) AS within, ST_Intersects(m.geom, p.geom) AS intersects, ST_Overlaps(m.geom, p.geom) AS overlaps, ST_CoveredBy(m.geom, p.geom) AS covered_by FROM municipalities AS m, provinces AS p;In this first test I've followed the most classical approach to determine several spatial relationships intercurring between:
- the municipalities table (MultiPolygons), containing the 276 Municipalites of Tuscany.
- and the provinces table (MultiPolygons), containing the 10 Provinces of Tuscany.
- this is an unsophisticated query, and the resulting cartesian product of both tables contains 2.760 rows.
CREATE TEMPORARY TABLE tmp_relate AS SELECT m.name AS municipality, p.name AS province, ST_Relate(m.geom, p.geom) AS matrix FROM municipalities AS m, provinces AS p; SELECT municipality, province, matrix, ST_RelateMatch(matrix, 'FF*FF****') AS disjoint, ST_RelateMatch(matrix, 'FT*******') OR ST_RelateMatch(matrix, 'F**T*****') OR ST_RelateMatch(matrix, 'F***T****') AS touches, ST_RelateMatch(matrix, 'T*F**F***') AS within, ST_RelateMatch(matrix, 'T********') OR ST_RelateMatch(matrix, '*T*******') OR ST_RelateMatch(matrix, '***T*****') OR ST_RelateMatch(matrix, '****T****') AS intersects, ST_RelateMatch(matrix, 'T*T***T**') AS overlaps, ST_RelateMatch(matrix, 'T*F**F***') OR ST_RelateMatch(matrix, '*TF**F***') OR ST_RelateMatch(matrix, '**FT*F***') OR ST_RelateMatch(matrix, '**F*TF***') AS covered_by FROM tmp_relate;Using the same tables as before in this second test, but this time I've adopted the alternative approach based on ST_Relate() and ST_RelateMatch().
After some preliminary tests, it quickly emerged that the SQLite optimizer doesn't like the mixture of an inner query together with function calls on the same overall query (resulting in awful timings), so I duly switched to an indirect two-steps approach:
- step #1: computing ST_Related() and saving the whole resultset into a temporary table.
- step #2: completing the task by calling ST_RelateMatch on the above temporary table.
Note: the second query (ST_RelateMatch) only required a few milliseconds to complete; the real computational load was entirely confined within the first query (ST_Relate).
Final conclusion: directly using the awkward DE-9IM model is in someways difficult, but it does ensure an astonishing performance boost.
It's usage is highly recommended when you are required to check more than a single spatial relationship between the same pair of Geometries.
On 2022-04-05 14:47:07 UTC anonymous added:
For the performance issue when using multiple ST_RelateMatch() calls, htere seems to be an alternative to putting the data in a temp table first. According to tests I did on my own data, if you use a subselect and prevent sqlite from "flattening" the subsect by adding a LIMIT and OFFSET clause like this, performance is good as well:
For the performance issue when using multiple ST_RelateMatch() calls, htere seems to be an alternative to putting the data in a temp table first. According to tests I did on my own data, if you use a subselect and prevent sqlite from "flattening" the subsect by adding a LIMIT and OFFSET clause like this, performance is good as well:
SELECT ... FROM ( SELECT ... FROM ... LIMIT -1 OFFSET 0) WHERE ...