Wiki page
[about DE-9IM] by
anonymous
2022-04-05 14:47:07.
D 2022-04-05T14:47:07.619
L about\sDE-9IM
P 111e279bced74dfbf866726ee847a2534ca6d0fc
U anonymous
W 9188
<h2>Correspondences between SQL Functions and DE-9IM Intersection Matrices</h2>
Quick recall:
<ol>
<li>The standard Spatial SQL model defines several functions (as e.g. <b>ST_Equals()</b>, <b>ST_Intersects()</b>, <b>ST_Touches()</b> and alike) intended to test if a given spatial relationship exists between a pair of Geometries.</li>
<li>All such SQL functions are simply convenience wrappers based on the most general DE-9IM model.</li>
<li>An alternative (more flexible, more generic and more efficient) approach for evaluating spatial relationships could be based on the following two <b><i>advanced</i></b> SQL functions:
<ul>
<li><b>ST_Relate (</b> <i>g1</i> Geometry , <i>g2</i> Geometry <b>) :</b> <i>intersection_matrix</i> Text<br>
This first function will evaluate both Geometries, then returning a Text string corresponding to a nine digits serialized DE-9IM <b>intersection_matrix</b> as e.g. <b>'0F1FF0FF2'</b> or <b>'212101212'</b> with the following interpretation:
<ul>
<li><b>F</b>: this stands for <b>FALSE</b>, i.e. the corresponding intersection doesn't occurs.</li>
<li><b>0</b>: the corresponding intersection is a Point, i.e. a zero-dimensions Geometry.</li>
<li><b>1</b>: the corresponding intersection is a Linestring, i.e. a one-dimension Geometry.</li>
<li><b>2</b>: the corresponding intersection is a Polygon, i.e. a two-dimensions Geometry.</li>
</ul></li>
<li><b>ST_RelateMatch (</b> <i>intersection_matrix</i> Text , <i>reference_pattern</i> Text <b>) :</b> <i>true_or_false</i> Boolean<br>
This second function is intended to evaluate an <b>intersection matrix</b> returned by a previous call to <b>ST_Relate()</b> against a given <b>reference_pattern</b>, then returning <b>TRUE</b> or <b>FALSE</b>.<br>
A <b>reference_pattern</b> as e.g. <b>'T*F**FFF*'</b> or <b>'0********'</b> simply is yet another nine digits serialized DE-9IM matrix adopting a slightly changed syntax:
<ul>
<li><b>F</b>, <b>0</b>, <b>1</b> and <b>2</b>: same interpretation as before.</li>
<li><b>T</b>: this stands for <b>TRUE</b>, i.e. the corresponding intersection is expected to occur, but its specific nature doesn't matters.</li>
<li><b>*</b>: an asterisk is a <b>mask sign</b> and always means <b><i>ignore / don't care / skip</i></b> this matrix cell.</li>
</ul></li>
<li><u>Note</u>: <b>ST_Relate()</b> is a computationally heavy function (exactly as <b>ST_Equals()</b>, <b>ST_Crosses()</b> and alike are); on the other hand <b>ST_RelateMatch()</b> is a very light and quick function.<br>
So when you have to check for more than a single relationship at the same time, calling <b>ST_Relate()</b> just once and then evaluating each spatial relationship by calling <b>ST_RelateMatch()</b> multiple times will always ensure noticeably faster performances.</li>
<li><u>Conclusion</u>: using the quirky and awkward DE-9IM model surely is neither easy or intuitive.<br>
The following table is specifically intended to make the usage of both <b>ST_Relate()</b> and <b>ST_RelateMatch()</b> easier, in a reasonably painless way.
</ul></li>
</ol>
<table cellspacing="4" cellpadding="6" border="1" bgcolor="#f0fff0">
<tr><th bgcolor="#ffc000">SQL function</th><th bgcolor="#ffc000">Reference Pattern</th></tr>
<tr><td><b>ST_Equals( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T*F**FFF*</b></td></tr>
<tr><td><b>ST_Disjoint( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>FF*FF****</b></td></tr>
<tr><td rowspan="3"><b>ST_Touches( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>FT*******</b></td></tr>
<tr><td align="center"><b>F**T*****</b></td></tr>
<tr><td align="center"><b>F***T****</b></td></tr>
<tr><td><b>ST_Within( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T*F**F***</b></td></tr>
<tr><td rowspan="2"><b>ST_Overlaps( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T*T***T**</b></td></tr>
<tr><td align="center"><b>1*T***T**</b></td></tr>
<tr><td rowspan="3"><b>ST_Crosses( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T*T******</b></td></tr>
<tr><td align="center"><b>T*****T**</b></td></tr>
<tr><td align="center"><b>0********</b></td></tr>
<tr><td rowspan="4"><b>ST_Intersects( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T********</b></td></tr>
<tr><td align="center"><b>*T*******</b></td></tr>
<tr><td align="center"><b>***T*****</b></td></tr>
<tr><td align="center"><b>****T****</b></td></tr>
<tr><td><b>ST_Contains( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T*****FF*</b></td></tr>
<tr><td rowspan="4"><b>ST_Covers( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T*****FF*</b></td></tr>
<tr><td align="center"><b>*T****FF*</b></td></tr>
<tr><td align="center"><b>***T**FF*</b></td></tr>
<tr><td align="center"><b>****T*FF*</b></td></tr>
<tr><td rowspan="4"><b>ST_CoveredBy( </b><i>g1</i>, <i>g2</i> <b>)</b></td><td align="center"><b>T*F**F***</b></td></tr>
<tr><td align="center"><b>*TF**F***</b></td></tr>
<tr><td align="center"><b>**FT*F***</b></td></tr>
<tr><td align="center"><b>**F*TF***</b></td></tr>
</table><br>
Further useful readings:
<ul>
<li>a general introduction to <a href="https://en.wikipedia.org/wiki/DE-9IM">DE-9IM</a></li>
<li>a well-written documentation from a <a href="http://edndoc.esri.com/arcsde/9.0/general_topics/understand_spatial_relations.htm">Proprietary SW Vendor</a></li>
</ul><br>
<hr>
<h3>A practical example (and related objective timings)</h3>
<verbatim>
SELECT m.name AS municipality, p.name AS province,
ST_Disjoint(m.geom, p.geom) AS disjoint,
ST_Touches(m.geom, p.geom) AS touches,
ST_Within(m.geom, p.geom) AS within,
ST_Intersects(m.geom, p.geom) AS intersects,
ST_Overlaps(m.geom, p.geom) AS overlaps,
ST_CoveredBy(m.geom, p.geom) AS covered_by
FROM municipalities AS m, provinces AS p;
</verbatim>
In this first test I've followed the most <b>classical approach</b> to determine several spatial relationships intercurring between:
<ul>
<li>the <b>municipalities</b> table (MultiPolygons), containing the <b>276</b> Municipalites of Tuscany.</li>
<li>and the <b>provinces</b> table (MultiPolygons), containing the <b>10</b> Provinces of Tuscany.</li>
<li>this is an unsophisticated query, and the resulting <b>cartesian product</b> of both tables contains <b>2.760</b> rows.</li>
</ul>
<u>Measured timing</u>: <b>67 secs</b>
<br><br><br>
<verbatim>
CREATE TEMPORARY TABLE tmp_relate AS
SELECT m.name AS municipality, p.name AS province, ST_Relate(m.geom, p.geom) AS matrix
FROM municipalities AS m, provinces AS p;
SELECT municipality, province, matrix,
ST_RelateMatch(matrix, 'FF*FF****') AS disjoint,
ST_RelateMatch(matrix, 'FT*******') OR
ST_RelateMatch(matrix, 'F**T*****') OR
ST_RelateMatch(matrix, 'F***T****') AS touches,
ST_RelateMatch(matrix, 'T*F**F***') AS within,
ST_RelateMatch(matrix, 'T********') OR
ST_RelateMatch(matrix, '*T*******') OR
ST_RelateMatch(matrix, '***T*****') OR
ST_RelateMatch(matrix, '****T****') AS intersects,
ST_RelateMatch(matrix, 'T*T***T**') AS overlaps,
ST_RelateMatch(matrix, 'T*F**F***') OR
ST_RelateMatch(matrix, '*TF**F***') OR
ST_RelateMatch(matrix, '**FT*F***') OR
ST_RelateMatch(matrix, '**F*TF***') AS covered_by
FROM tmp_relate;
</verbatim>
Using the same tables as before in this second test, but this time I've adopted the alternative approach based on <b>ST_Relate()</b> and <b>ST_RelateMatch()</b>.<br>
After some preliminary tests, it quickly emerged that the SQLite optimizer doesn't like the mixture of an inner query together with function calls on the same overall query (resulting in awful timings), so I duly switched to an indirect <b>two-steps approach</b>:
<ul>
<li><b>step #1</b>: computing <b>ST_Related()</b> and saving the whole resultset into a <b>temporary table</b>.</li>
<li><b>step #2</b>: completing the task by calling <b>ST_RelateMatch</b> on the above temporary table.</li>
</ul>
<u>Measured timing</u>: <b>22 secs</b><br><br>
<u>Note</u>: the second query (ST_RelateMatch) only required a few milliseconds to complete; the real computational load was entirely confined within the first query (ST_Relate).<br><br>
<u>Final conclusion</u>: directly using the awkward DE-9IM model is in someways difficult, but it does ensure an astonishing performance boost.<br>
It's usage is highly recommended when you are required to check more than a single spatial relationship between the same pair of Geometries.
<br><br>
<hr /><div id="1509e6a215798069"><i>On 2022-04-05 14:47:07 UTC anonymous added:</i><br />
For the performance issue when using multiple ST_RelateMatch() calls, htere seems to be an alternative to putting the data in a temp table first. According to tests I did on my own data, if you use a subselect and prevent sqlite from "flattening" the subsect by adding a LIMIT and OFFSET clause like this, performance is good as well:
<blockquote><pre>
SELECT ...
FROM (
SELECT ...
FROM ...
LIMIT -1 OFFSET 0)
WHERE ...
</pre></blockquote></div id="1509e6a215798069">
Z 41b54d73453f370a1e59e7ba14d2201f