Geospatial Data Types

Snowflake offers native support for geospatial features such as points, lines, and polygons on the Earth’s surface.

In this Topic:

Data Types

Snowflake provides the following data types for geospatial data:

  • The GEOGRAPHY data type, which models Earth as though it were a perfect sphere.

  • The GEOMETRY data type, which represents features in a planar (Euclidean, Cartesian) coordinate system.

GEOGRAPHY Data Type

The GEOGRAPHY data type follows the WGS 84 standard (spatial reference ID 4326; for details, see https://spatialreference.org/ref/epsg/wgs-84/).

Points on the earth are represented as degrees of longitude (from -180 degrees to +180 degrees) and latitude (-90 to +90).

Altitude is currently not supported.

Line segments are interpreted as geodesic arcs on the Earth’s surface.

Snowflake also provides geospatial functions that operate on the GEOGRAPHY data type.

If you have geospatial data (e.g. longitude and latitude data, WKT, WKB, GeoJSON, etc.), you should convert and store this data in GEOGRAPHY columns, rather than keeping the data in their original formats in VARCHAR, VARIANT or NUMBER columns. Storing your data in GEOGRAPHY columns can significantly improve the performance of queries that use geospatial functionality.

GEOMETRY Data Type

The GEOMETRY data type represents features in a planar (Euclidean, Cartesian) coordinate system.

The coordinates are represented as pairs of real numbers (x, y). Currently, only 2D coordinates are supported.

The units of the X and Y are determined by the spatial reference system (SRS) associated with the GEOMETRY object. The spatial reference system is identified by the spatial reference system identifier (SRID) number. Unless the SRID is provided when creating the GEOMETRY object or by calling ST_SETSRID, the SRID is 0.

Snowflake provides a set of geospatial functions that operate on the GEOMETRY data type. For these functions:

  • All functions assume planar coordinates, even if the geometry uses a non-planar SRS.

  • The measurement functions (e.g. ST_LENGTH) use the same units as the coordinate system.

  • For functions that accept multiple GEOMETRY expressions as arguments (e.g. ST_DISTANCE), the input expressions must be defined in the same SRS.

Geospatial Input and Output

The following sections cover the supported standard formats and object types when reading and writing geospatial data.

Supported Standard Input and Output Formats

The GEOGRAPHY and GEOMETRY data types support the following standard industry formats for input and output:

You might also find the following references helpful:

Any departure from these standards is noted explicitly in the Snowflake documentation.

Note on GeoJSON Handling for GEOGRAPHY Values

The WKT and WKB standards specify a format only; the semantics of WKT/WKB objects depend on the reference system - for example, a plane or a sphere.

The GeoJSON standard, on the other hand, specifies both a format and its semantics: GeoJSON points are explicitly WGS 84 coordinates, and GeoJSON line segments are supposed to be planar edges (straight lines).

Contrary to that, the Snowflake GEOGRAPHY data type interprets all line segments - including those input from or output to GeoJSON format - as geodesic arcs. In essence, Snowflake treats GeoJSON as JSON-formatted WKT with spherical semantics.

Note on EWKT and EWKB Handling for GEOGRAPHY Values

EWKT and EWKB are non-standard formats introduced by PostGIS. They enhance the WKT and WKB formats by including a Spatial Reference System Identifier (SRID), which specifies the coordinate reference system to use with the data. Snowflake currently supports only WGS84, which maps to SRID=4326.

By default, Snowflake issues an error if an EWKB or EWKT input value contains an SRID other than 4326. Conversely, all EWKB and EWKT output values have SRID=4326.

Supported Geospatial Object Types

The GEOGRAPHY and GEOMETRY data types can store the following types of geospatial objects:

  • WKT / WKB / EWKT / EWKB / GeoJSON geospatial objects:

    • Point

    • MultiPoint

    • LineString

    • MultiLineString

    • Polygon

    • MultiPolygon

    • GeometryCollection

  • These GeoJSON-specific geospatial objects:

    • Feature

    • FeatureCollection

Specifying the Output Format for Result Sets

The session parameters GEOGRAPHY_OUTPUT_FORMAT and GEOMETRY_OUTPUT_FORMAT control the rendering of GEOGRAPHY- and GEOMETRY-typed columns in result sets (respectively).

The GEOGRAPHY_OUTPUT_FORMAT and GEOMETRY parameters can have one of the following values:

Parameter Value

Description

GeoJSON (default)

The GEOGRAPHY / GEOMETRY result is rendered as an OBJECT in GeoJSON format.

WKT

The GEOGRAPHY / GEOMETRY result is rendered as a VARCHAR in WKT format.

WKB

The GEOGRAPHY / GEOMETRY result is rendered as a BINARY in WKB format.

EWKT

The GEOGRAPHY / GEOMETRY result is rendered as a VARCHAR in EWKT format.

EWKB

The GEOGRAPHY / GEOMETRY result is rendered as a BINARY in EWKB format.

For EWKT and EWKB, the SRID is always 4326 in the output. See the note on EWKT and EWKB handling.

This parameter affects all clients, including the Snowflake UI and the SnowSQL command-line client, as well as the JDBC, ODBC, node.js, python, etc. drivers and connectors.

For example: the JDBC Driver returns the following metadata for a GEOGRAPHY-typed result column (column i in this example):

  • If GEOGRAPHY_OUTPUT_FORMAT='GeoJSON' or GEOMETRY_OUTPUT_FORMAT='GeoJSON':

    • ResultSetMetaData.getColumnType(i) returns java.sql.Types.VARCHAR.

    • ResultSetMetaData.getColumnClassName(i) returns "java.lang.String".

  • If GEOGRAPHY_OUTPUT_FORMAT='WKT' or 'EWKT', or if: GEOMETRY_OUTPUT_FORMAT='WKT' or 'EWKT':

    • ResultSetMetaData.getColumnType(i) returns java.sql.Types.VARCHAR.

    • ResultSetMetaData.getColumnClassName(i) returns "java.lang.String".

  • If GEOGRAPHY_OUTPUT_FORMAT='WKB' or 'EWKB', or if GEOMETRY_OUTPUT_FORMAT='WKB' or 'EWKB':

    • ResultSetMetaData.getColumnType(i) returns java.sql.Types.BINARY.

    • ResultSetMetaData.getColumnClassName(i) returns "[B" (array of byte).

Note

APIs for retrieving database-specific type names (getColumnTypeName in JDBC and the SQL_DESC_TYPE_NAME descriptor in ODBC) always return GEOGRAPHY and GEOMETRY for the type name, regardless of the values of the GEOGRAPHY_OUTPUT_FORMAT and GEOMETRY_OUTPUT_FORMAT parameters. For details, see:

Examples of Inserting and Querying GEOGRAPHY Data

The code below shows sample input and output for the GEOGRAPHY data type. Note the following:

  • For the coordinates in WKT, EWKT, and GeoJSON, longitude appears before latitude (e.g. POINT(lon lat)).

  • For the WKB and EWKB output, it is assumed that the BINARY_OUTPUT_FORMAT parameter is set to HEX (the default value for the parameter).

The example creates a table with a GEOGRAPHY column, inserts data in WKT format, and returns the data in different output formats.

create table geospatial_table (id INTEGER, g GEOGRAPHY);
insert into geospatial_table values
    (1, 'POINT(-122.35 37.55)'), (2, 'LINESTRING(-124.20 42.00, -120.01 41.99)');
alter session set GEOGRAPHY_OUTPUT_FORMAT='GeoJSON';
select g
    from geospatial_table
    order by id;
+------------------------+
| G                      |
|------------------------|
| {                      |
|   "coordinates": [     |
|     -122.35,           |
|     37.55              |
|   ],                   |
|   "type": "Point"      |
| }                      |
| {                      |
|   "coordinates": [     |
|     [                  |
|       -124.2,          |
|       42               |
|     ],                 |
|     [                  |
|       -120.01,         |
|       41.99            |
|     ]                  |
|   ],                   |
|   "type": "LineString" |
| }                      |
+------------------------+
alter session set GEOGRAPHY_OUTPUT_FORMAT='WKT';
select g
    from geospatial_table
    order by id;
+-------------------------------------+
| G                                   |
|-------------------------------------|
| POINT(-122.35 37.55)                |
| LINESTRING(-124.2 42,-120.01 41.99) |
+-------------------------------------+
alter session set GEOGRAPHY_OUTPUT_FORMAT='WKB';
select g
    from geospatial_table
    order by id;
+------------------------------------------------------------------------------------+
| G                                                                                  |
|------------------------------------------------------------------------------------|
| 01010000006666666666965EC06666666666C64240                                         |
| 010200000002000000CDCCCCCCCC0C5FC00000000000004540713D0AD7A3005EC01F85EB51B8FE4440 |
+------------------------------------------------------------------------------------+
alter session set GEOGRAPHY_OUTPUT_FORMAT='EWKT';
select g
    from geospatial_table
    order by id;
+-----------------------------------------------+
| G                                             |
|-----------------------------------------------|
| SRID=4326;POINT(-122.35 37.55)                |
| SRID=4326;LINESTRING(-124.2 42,-120.01 41.99) |
+-----------------------------------------------+
alter session set GEOGRAPHY_OUTPUT_FORMAT='EWKB';
select g
    from geospatial_table
    order by id;
+--------------------------------------------------------------------------------------------+
| G                                                                                          |
|--------------------------------------------------------------------------------------------|
| 0101000020E61000006666666666965EC06666666666C64240                                         |
| 0102000020E610000002000000CDCCCCCCCC0C5FC00000000000004540713D0AD7A3005EC01F85EB51B8FE4440 |
+--------------------------------------------------------------------------------------------+

Using Geospatial Data in Snowflake

The following sections cover the supported standard formats and object types when reading and writing geospatial data.

Understanding the Effects of Using Different SRIDs with GEOMETRY

In a GEOMETRY column, you can insert objects that have different SRIDs. If the column contains more than one SRID, some of the important performance optimizations are not applied. This can result in slower queries, in particular when joining on a geospatial predicate.

Performing DML Operations on GEOGRAPHY and GEOMETRY Columns

When a GEOGRAPHY or GEOMETRY column is the target of a DML operation (INSERT, COPY, UPDATE, MERGE, or CREATE TABLE AS…), the column’s source expression can be any of the following types:

Loading Geospatial Data from Stages

Data from CSV or JSON / AVRO files in a stage can be loaded directly (i.e. without copy transforms) into a GEOGRAPHY column.

Loading data from other file formats (Parquet, ORC, etc.) is possible through a COPY transform.

Using Geospatial Data with Java UDFs

Java UDFs allow the GEOGRAPHY type as an argument and as a return value. See SQL-Java Data Type Mappings for Parameters and Return Types and Passing a GEOGRAPHY Value to an In-line Java UDF for details.

Using Geospatial Data with JavaScript UDFs

JavaScript UDFs allow the GEOGRAPHY type as an argument and as a return value.

If a JavaScript UDF has an argument of type GEOGRAPHY, that argument will be visible as a JSON object in GeoJSON format inside the UDF body.

If a JavaScript UDF returns GEOGRAPHY, the UDF body is expected to return a JSON object in GeoJSON format.

For example, these two JavaScript UDFs are roughly equivalent to the built-in functions ST_X and ST_MAKEPOINT:

create or replace function my_st_x(G geography) returns real
language javascript
as
$$
  if (G["type"] != "Point")
  {
     throw "Not a point"
  }
  return G["coordinates"][0]
$$;

create or replace function my_st_makepoint(LNG real, LAT real) returns geography
language javascript
as
$$
  g = {}
  g["type"] = "Point"
  g["coordinates"] = [ LNG, LAT ]
  return g
$$;

Choosing the Geospatial Data Type to Use (GEOGRAPHY or GEOMETRY)

The next sections explain the differences between the GEOGRAPHY and GEOMETRY data types:

Understanding the Differences Between GEOGRAPHY and GEOMETRY

Although both the GEOGRAPHY and GEOMETRY data types define geospatial features, the types use different models. The following table summarizes the differences.

GEOGRAPHY Data Type

GEOMETRY Data Type

  • Defines features on a sphere.

  • Only the WGS84 coordinate system. SRID is always 4326.

  • Coordinates are latitude (-90 to 90) and longitude (-180 to 180) in degrees.

  • Results of measurement operations (ST_LENGTH, ST_AREA, etc.) are in meters.

  • Segments are interpreted as geodesic arcs on the Earth’s surface.

  • Defines features on a plane.

  • Any coordinate system is supported.

  • Unit of coordinate values are defined by the spatial reference system.

  • Results of measurement operations (ST_LENGTH, ST_AREA, etc.) are in the same unit as coordinates. For example, if the input coordinates are in degrees, the results are in degrees.

  • Segments are interpreted as straight lines on the plane.

Examples Comparing the GEOGRAPHY and GEOMETRY Data Types

The following examples compare the output of the geospatial functions when using the GEOGRAPHY and GEOMETRY data types as input.

Example 1: Querying the Distance Between Berlin and San Francisco

The following table compares the output of ST_DISTANCE for GEOGRAPHY types and GEOMETRY types:

ST_DISTANCE Using . GEOGRAPHY Input

ST_DISTANCE Using . GEOMETRY Input

SELECT ST_DISTANCE(
         ST_POINT(13.4814, 52.5015),
         ST_POINT(-121.8212, 36.8252))
       AS distance_in_meters;
+--------------------+
| DISTANCE_IN_METERS |
|--------------------|
|   9182410.99227821 |
+--------------------+
SELECT ST_DISTANCE(
         ST_GEOM_POINT(13.4814, 52.5015),
         ST_GEOM_POINT(-121.8212, 36.8252))
       AS distance_in_degrees;
+---------------------+
| DISTANCE_IN_DEGREES |
|---------------------|
|       136.207708844 |
+---------------------+

As shown in the example above:

  • With GEOGRAPHY input values, the input coordinates are in degrees, and the output value is in meters. (The result is 9,182 km.)

  • With GEOMETRY input values, the input coordinates and output value are degrees. (The result is 136.208 degrees.)

Example 2: Querying the Area of Germany

The following table compares the output of ST_AREA for GEOGRAPHY types and GEOMETRY types:

ST_AREA Using . GEOGRAPHY Input

ST_AREA Using . GEOMETRY Input

SELECT ST_AREA(border) AS area_in_sq_meters
  FROM world_countries
  WHERE name = 'Germany';
+-------------------+
| AREA_IN_SQ_METERS |
|-------------------|
|  356379183635.591 |
+-------------------+
SELECT ST_AREA(border) as area_in_sq_degrees
  FROM world_countries_geom
  WHERE name = 'Germany';
+--------------------+
| AREA_IN_SQ_DEGREES |
|--------------------|
|       45.930026848 |
+--------------------+

As shown in the example above:

  • With GEOGRAPHY input values, the input coordinates are in degrees, the output value is in square meters. (The result is 356,379 km^2.)

  • With GEOMETRY input values, the input coordinates in degrees, and the output value is in square degrees. (The result is 45.930 square degrees.)

Example 3: Querying the Names of Countries Overlapping the Line from Berlin to San Francisco

The following table compares the output of ST_INTERSECTS for GEOGRAPHY types and GEOMETRY types:

ST_INTERSECTS Using . GEOGRAPHY Input

ST_INTERSECTS Using . GEOMETRY Input

SELECT name FROM world_countries WHERE
  ST_INTERSECTS(border,
    TO_GEOGRAPHY(
      'LINESTRING(13.4814 52.5015, -121.8212 36.8252)'
    ));
+--------------------------+
| NAME                     |
|--------------------------|
| Germany                  |
| Denmark                  |
| Iceland                  |
| Greenland                |
| Canada                   |
| United States of America |
+--------------------------+
SELECT name FROM world_countries_geom WHERE
  ST_INTERSECTS(border,
    TO_GEOMETRY(
      'LINESTRING(13.4814 52.5015, -121.8212 36.8252)'
    ));
+--------------------------+
| NAME                     |
|--------------------------|
| Germany                  |
| Belgium                  |
| Netherlands              |
| United Kingdom           |
| United States of America |
+--------------------------+
Countries intersecting when using GEOGRAPHY Countries intersecting when using GEOMETRY

Understanding the Differences in Input Data Validation

To create a GEOMETRY or GEOGRAPHY object for an input shape, you must use a shape that is well-formed and valid, according to the OGC rules for Simple Features. The next sections explain how the validity of input data differs between GEOMETRY and GEOGRAPHY.

A Shape Can Be Valid GEOGRAPHY But Invalid GEOMETRY

A given shape can be a valid GEOGRAPHY object but an invalid GEOMETRY object (and vice versa).

For example, self-intersecting polygons are disallowed by the OGC rules. A given set of points may define edges that intersect in Cartesian domain but not on a sphere. Consider the following polygon:

POLYGON((0 50, 25 50, 50 50, 0 50))

In the Cartesian domain, this polygon degrades to a line and, as a result, is invalid.

However, on a sphere, this same polygon does not intersect itself and is valid:

POLYGON((0 50, 25 50, 50 50, 0 50)) invalid geometry, but valid geography

Conversion and Constructor Functions Handle Validation Differently

When the input data is invalid, the GEOMETRY and GEOGRAPHY functions handle validation in different ways:

  • The functions for constructing and converting to GEOGRAPHY objects (e.g. TO_GEOGRAPHY) might attempt to repair the shape to handle problems like unclosed loops, spikes, cuts, and self-intersecting loops in polygons.

    If the function is successful in repairing the shape, the function returns a GEOGRAPHY object.

  • The functions for constructing and converting to GEOMETRY objects (e.g. TO_GEOMETRY) do not support the ability to repair the shape.

Converting Between GEOGRAPHY and GEOMETRY

Snowflake does not support casting from GEOGRAPHY to GEOMETRY (and vice versa). Snowflake also does not support transformations of objects that use different spatial reference systems (SRS).

If you need to convert a GEOGRAPHY object to a GEOMETRY object (or GEOMETRY to GEOGRAPHY), you can use GeoJSON as an intermediate format.

The following example converts the GEOGRAPHY object geography_expression to a GEOMETRY object with the SRID 0:

SELECT TO_GEOMETRY(ST_ASGEOJSON(geography_expression));

If you need to set the SRID of the GEOMETRY object, call the ST_SETSRID function. For example, to convert the GEOGRAPHY object geography_expression to a GEOMETRY object with the SRID 4326:

SELECT ST_SETSRID(TO_GEOMETRY(ST_ASGEOJSON(geography_expression)), 4326);
Back to top