
XML Tools Tutorial: how to process a Simple GML dataset


Downloading the sample dataset

In this first example we'll use a simple GML dataset published via WFS by Regione Sardegna (Italy) under the IODL v.2 licence terms (equivalent to CC-BY-SA).

transforming the XML document into a SQLite/SpatiaLite DB-file (#1)

$ spatialite_xml_load -x poi-sardegna.gml -d poi0.sqlite
SQLite version: 3.8.5
SpatiaLite version: 4.2.0-rc2
Target DB: poi0.sqlite
Input XML: poi-sardegna.gml
Done - inserted/updated rows: 62751 [0:00:01.856]
$
In this first attempt we pass no special arguments at all; we simply rely on the standard defaults.

transforming the XML document into a SQLite/SpatiaLite DB-file (#2)

$ spatialite_xml_load -x poi-sardegna.gml -d poi1.sqlite -nl 2 -pl 1
SQLite version: 3.8.5
SpatiaLite version: 4.2.0-rc2
Target DB: poi1.sqlite
Input XML: poi-sardegna.gml
Done - inserted/updated rows: 62751 [0:00:01.840]
$
In this second attempt we add two further arguments, -nl 2 and -pl 1, in order to limit the length of table names.

transforming the XML document into a SQLite/SpatiaLite DB-file (#3)

$ spatialite_xml_load -x poi-sardegna.gml -d poi2.sqlite -nl 2 -pl 2
SQLite version: 3.8.5
SpatiaLite version: 4.2.0-rc2
Target DB: poi2.sqlite
Input XML: poi-sardegna.gml
Done - inserted/updated rows: 62751 [0:00:01.778]
$
This third attempt is identical to the previous one, except that this time we've specified -pl 2.

transforming the XML document into a SQLite/SpatiaLite DB-file (#4)

$ spatialite_xml_load -x poi-sardegna.gml -d poicg.sqlite -nl 2 -pl 1 -cg
SQLite version: 3.8.5
SpatiaLite version: 4.2.0-rc2
Target DB: poicg.sqlite
Input XML: poi-sardegna.gml
Done - inserted/updated rows: 58260 [0:00:01.880]
$
This final attempt is exactly the same as #2, but this time we've added a further -cg argument, thus requesting that every GML geometry item be collapsed immediately.

quick comparison

  1. when no special precaution is applied (poi0.sqlite) you'll discover that some tables can have very long names, e.g.
    featurecollection_member_poigenerici_boundedby_envelope_lowercorner
    Please note: this GML sample is basically simple; things can easily get much worse than this in many real-world cases.
  2. in this specific case, by applying both -nl 2 and -pl 1 (poi1.sqlite) we'll get significantly shorter table names, e.g.
    poigenerici_envelope_lowercorner
  3. by applying both -nl 2 and -pl 2 (poi2.sqlite) we'll get slightly longer table names, e.g.
    poigenerici_boundedby_envelope_uppercorner
  4. Please note well: the minimum number of ancestors required (-pl) strictly depends on the intrinsic structure of the GML document being parsed:
    • Requesting too few ancestors will produce table-name collisions, and will cause the import operation to fail.
    • Requesting too many ancestors will produce unnecessarily long table names.
    • As a rule of thumb, you should always start by requesting just -pl 1; in case of failure, increase this value until the import succeeds.
  5. Activating the -cg option is always strongly recommended when importing any GML file, because it greatly reduces the overall complexity of the target DB-file, most notably when there are many complex Linestrings or Polygons.
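
The effect of -nl and -pl on table names can be mimicked in a few lines of Python. This is only a sketch of a naming model inferred from the example names quoted in this tutorial; it is not the actual algorithm used by spatialite_xml_load. The function table_name, its parameters and the LINEA/LineString path are hypothetical, and elements at or above the -nl level are deliberately not modeled.

```python
def table_name(path, name_level=0, parent_levels=-1):
    """Guess a table name for an XML element (hypothetical model, see text).

    path          -- element names from the document root down to the element
    name_level    -- tree level whose element name opens the table name (-nl)
    parent_levels -- how many nearest ancestors to keep (-pl); -1 keeps all
    """
    if len(path) - 1 <= name_level:
        raise ValueError("elements at or above the name level are not modeled")
    head = path[name_level]
    ancestors = path[name_level + 1:-1]
    if parent_levels >= 0:
        # keep only the nearest ancestors (an explicit index avoids the [-0:] trap)
        keep = min(parent_levels, len(ancestors))
        ancestors = ancestors[len(ancestors) - keep:]
    return "_".join([head] + ancestors + [path[-1]]).lower()


# Path taken from the long table name shown in item 1 above.
LOWER_CORNER = ["FeatureCollection", "member", "POIGENERICI",
                "boundedBy", "Envelope", "lowerCorner"]

# Two paths ending in the same element name; the second one is invented
# purely to illustrate a collision and does not exist in the sample dataset.
POINT_POS = ["FeatureCollection", "member", "POIGENERICI",
             "PUNTO", "Point", "pos"]
LINE_POS = ["FeatureCollection", "member", "POIGENERICI",
            "LINEA", "LineString", "pos"]

if __name__ == "__main__":
    print(table_name(LOWER_CORNER))          # no special arguments
    print(table_name(LOWER_CORNER, 2, 1))    # -nl 2 -pl 1
    print(table_name(LOWER_CORNER, 2, 2))    # -nl 2 -pl 2
    # With no ancestors at all the two "pos" elements get the same name.
    print(table_name(POINT_POS, 2, 0) == table_name(LINE_POS, 2, 0))
```

In this model, too small a -pl value makes distinct elements share one name, which is the collision case described in item 4; raising the value by one is enough to tell the two hypothetical paths apart.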

direct check

Now start a spatialite CLI or spatialite_gui session, and familiarize yourself with all DB-files created by spatialite_xml_load:
  1. check both xml_metacatalog_tables and xml_metacatalog_columns, and discover how the XML tree was mapped into corresponding relationally joined DBMS tables.
  2. test poi1.sqlite and check how raw GML geometries were loaded into the DBMS; explore the following tables:
    • poigenerici_punto
    • poigenerici_punto_point
    • poigenerici_point_pos
  3. now test poicg.sqlite and check how the same GML geometries were loaded into the DBMS in the collapsed form:
    • explore the poigenerici_punto table.
    • now the node_value column will contain a collapsed representation of the whole GML geometry.
    • a further from_gml_geometry column will now contain a Geometry BLOB.
      Please note: such geometries are intentionally not registered in geometry_columns, simply because the GML standard potentially allows a different SRID value for each single geometry feature.
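
The same checks can also be scripted. The sketch below uses only Python's standard sqlite3 module to list the tables of a DB-file, show the columns of one table and fetch a few rows. The helper names (list_tables, column_names, sample_rows) are hypothetical, and no assumption is made about the column layout of xml_metacatalog_tables; the column names are simply read back from the DB. Geometry BLOBs are returned as raw bytes, since decoding them would need the SpatiaLite extension.

```python
import sqlite3
from contextlib import closing


def _quote(identifier):
    """Quote an SQL identifier so that any table name can be used safely."""
    return '"' + identifier.replace('"', '""') + '"'


def list_tables(db_path):
    """Return the names of all tables in the DB-file, sorted by name."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def column_names(db_path, table):
    """Return the column names of one table, in declaration order."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(f"PRAGMA table_info({_quote(table)})").fetchall()
    return [row[1] for row in rows]  # field 1 of table_info is the column name


def sample_rows(db_path, table, limit=5):
    """Return the first few rows of a table as a list of tuples."""
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(
            f"SELECT * FROM {_quote(table)} LIMIT ?", (limit,)
        ).fetchall()
```

For example, list_tables("poi1.sqlite") followed by sample_rows("poi1.sqlite", "xml_metacatalog_tables") gives a quick overview of how the XML tree was mapped. Note that sqlite3.connect silently creates an empty file when the path does not exist, so check the file name first.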


Final post-processing: collapsing the whole DBMS

$ spatialite_xml_collapse -d poicg.sqlite -dd -nl 2
SQLite version: 3.8.5
SpatiaLite version: 4.2.0-rc2
Collapsing <gml:Envelope><gml:lowerCorner>
Collapsing <gml:Envelope><gml:upperCorner>
Collapsing <gml:Envelope><gml:lowerCorner>
Collapsing <gml:Envelope><gml:upperCorner>
Collapsing <dbu:POIGENERICI><dbu:IDFEATURE>
Collapsing <dbu:POIGENERICI><dbu:NOME>
Collapsing <dbu:POIGENERICI><dbu:CODICEISTACOMUNALE>
Collapsing <dbu:POIGENERICI><dbu:COMUNE>
Collapsing <dbu:POIGENERICI><dbu:MACROTIPOLOGIAPOI>
Collapsing <dbu:POIGENERICI><dbu:TIPOLOGIAPOI>
Collapsing <dbu:POIGENERICI><dbu:LINKSCHEDA>
Collapsing <dbu:POIGENERICI><dbu:PUNTO>
No duplicated rows found in: boundedby_envelope
No duplicated rows found in: member_poigenerici
Recovering Geometry:    member_poigenerici.from_gml_geometry
Creating Spatial Index: member_poigenerici.from_gml_geometry

how collapsing works

  1. collapsing is a recursive process.
  2. each leaf node will be evaluated; if the corresponding parent node has just this single child, then the child will be merged into the parent, and the now useless child will be permanently removed from the tree.
  3. when a parent node has multiple child nodes, the collapsing process stops and propagates no further.
  4. Please note: collapsing is irreversible, because nodes and tables are permanently removed from the tree.
  5. Final post-processing: once all possible nodes have been collapsed, an attempt will be made to identify any possible layer, i.e. any table containing Geometries of the same type and SRID. In this case the Geometry will be properly registered in geometry_columns, and a corresponding Spatial Index will be created.
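
The recursive rule can be illustrated on a toy in-memory tree. This sketch is not how spatialite_xml_collapse is implemented (the real tool works on DBMS tables). It reads "single child" as "at most one child with a given tag under the same parent", an interpretation suggested by the log above, where several distinct children of POIGENERICI are collapsed. The helpers make and collapse and all sample values are invented for illustration.

```python
def make(tag, text=None, children=()):
    """Build a toy node: a tag, an optional text value, merged columns, children."""
    return {"tag": tag, "text": text, "columns": {}, "children": list(children)}


def collapse(node):
    """Merge leaf children that occur once under their parent, bottom-up."""
    # Collapse the subtrees first, so that inner nodes may become leaves.
    for child in node["children"]:
        collapse(child)

    # Count how many times each tag occurs directly under this node.
    counts = {}
    for child in node["children"]:
        counts[child["tag"]] = counts.get(child["tag"], 0) + 1

    kept = []
    for child in node["children"]:
        is_leaf = not child["children"]
        if is_leaf and counts[child["tag"]] == 1:
            # The child becomes one or more columns of its parent.
            if child["text"] is not None:
                node["columns"][child["tag"]] = child["text"]
            for name, value in child["columns"].items():
                node["columns"][child["tag"] + "_" + name] = value
        else:
            # Repeated (or still non-leaf) children stop the propagation here.
            kept.append(child)
    node["children"] = kept  # merged children are gone for good
    return node


# A single feature: every child occurs once, so everything is merged.
feature = collapse(make("poigenerici", children=[
    make("nome", "example name"),
    make("punto", children=[make("point", children=[make("pos", "1 2")])]),
]))

# A collection with two "member" children: they cannot be merged upwards.
collection = collapse(make("featurecollection", children=[
    make("member", children=[make("nome", "a")]),
    make("member", children=[make("nome", "b")]),
]))
```

After the call, feature has no children left and carries the columns nome and punto_point_pos, while collection still has its two member children. The original nesting cannot be rebuilt from the merged columns, which mirrors the irreversibility noted in item 4.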

direct check

Start a spatialite CLI or spatialite_gui session and connect to the collapsed DB-file:
  1. check xml_metacatalog_tables; you can easily verify for yourself how the whole collapsing process has been faithfully recorded.
  2. check the member_poigenerici table: this is a genuine layer; e.g. you could directly export it as a Shapefile.
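
Whether a table really is a registered layer can be checked by querying geometry_columns. The sketch below uses Python's standard sqlite3 module and assumes the metadata layout of SpatiaLite 4.x, where geometry_columns has the columns f_table_name, f_geometry_column, srid and spatial_index_enabled; the helper name registered_layers is hypothetical.

```python
import sqlite3
from contextlib import closing


def registered_layers(db_path):
    """Return (table, geometry column, srid, spatial index flag) per layer.

    Assumes the SpatiaLite 4.x layout of the geometry_columns table.
    """
    sql = (
        "SELECT f_table_name, f_geometry_column, srid, spatial_index_enabled "
        "FROM geometry_columns "
        "ORDER BY f_table_name, f_geometry_column"
    )
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(sql).fetchall()
```

On the collapsed DB-file, registered_layers("poicg.sqlite") would be expected to include a row for member_poigenerici and from_gml_geometry, matching the last two lines of the spatialite_xml_collapse output above.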


Exporting an XML document back from the DBMS

$ spatialite_xml_print -d poi1.sqlite -x export.xml
SQLite version: 3.8.5
SpatiaLite version: 4.2.0-rc2
Input DB: poi1.sqlite
$

how it works

  1. both xml_metacatalog_tables and xml_metacatalog_columns faithfully preserve all the detailed information describing the initial XML layout.
  2. the layout adopted for each table faithfully corresponds to the initial XML syntax.
  3. so it's not at all surprising to discover that exporting an XML document from the DBMS is a perfectly reversible operation.
  4. Please note well: a collapsed DB will no longer be able to adequately support spatialite_xml_print, simply because it has been massively rearranged, thus irreversibly destroying any direct correspondence between XML and DBMS items.
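
One simple way to check the round trip is to compare how often each element occurs in the original and in the exported document. The sketch below uses Python's standard xml.etree.ElementTree module. It compares element counts only (not text, attributes or ordering), so it is a sanity check rather than a proof of equivalence; the helper names tag_counts and same_structure are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(xml_path):
    """Count the occurrences of each element, keyed by its expanded tag name."""
    counts = Counter()
    # iterparse streams the file, so large GML documents need not fit in memory.
    for _event, elem in ET.iterparse(xml_path, events=("end",)):
        counts[elem.tag] += 1
        elem.clear()  # drop children and text that are no longer needed
    return counts


def same_structure(path_a, path_b):
    """Return True when both documents contain the same elements equally often."""
    return tag_counts(path_a) == tag_counts(path_b)
```

With the files used in this tutorial the call would be same_structure("poi-sardegna.gml", "export.xml"). Tags are compared with their namespace URI expanded, so a difference in namespace prefixes alone does not affect the result.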


