
...

Google Doc: <If the design in question is unclear or needs to be discussed and reviewed, a Google Doc can be used first to facilitate comments from others.>

Motivation

Describe the problems you are trying to solve.

Related Research

...

Apache Doris supports functions (ST_Point, ST_LineFromText, ST_Polygon, etc.) that generate GEOGRAPHY values. These values can be combined with other geographic functions to perform complex geographic analysis. The specific functions are explained in detail below.

So we now need to do the following three things:

  1. Support for constructing GEOGRAPHY values in different ways (WKT, WKB, GeoJSON).
  2. Support for more spatial types. Point, LineString, and Polygon are currently supported; MultiPoint, MultiLineString, MultiPolygon, and GeometryCollection need to be added.
  3. Support for more geographic analysis functions.

Related Research

At present, the geo type of Doris is implemented on top of the S2 library (https://github.com/google/s2geometry). The official S2 documentation (http://s2geometry.io/) gives a comprehensive introduction to the library, so I only introduce the necessary information here. We describe a point on Earth in terms of (lng, lat), but in S2, points are represented internally as unit-length vectors (points on the surface of a three-dimensional unit sphere) as opposed to traditional (latitude, longitude) pairs. That is to say, S2 regards the earth as a unit sphere, and spatial calculations on the earth are translated into vector calculations on that unit sphere. For us, however, only latitude and longitude matter, so we need to know two things:

  1. Doris only supports geographic analysis (based on the WGS-84 coordinate system, SRID = 4326), not spatial analysis.
  2. For points, we only need to know the latitude and longitude, so the data is 2D for us.
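To make the unit-vector representation concrete, here is a minimal sketch of the standard spherical-coordinate conversion between a (lat, lng) pair in degrees and a point on the unit sphere. The function names are hypothetical and chosen for illustration; this is not the S2 API, only the arithmetic that such a representation implies.

```python
import math


def latlng_to_unit_vector(lat_deg, lng_deg):
    """Map a (lat, lng) pair in degrees to a point on the unit sphere."""
    lat = math.radians(lat_deg)
    lng = math.radians(lng_deg)
    cos_lat = math.cos(lat)
    return (cos_lat * math.cos(lng), cos_lat * math.sin(lng), math.sin(lat))


def unit_vector_to_latlng(x, y, z):
    """Inverse mapping: recover (lat, lng) in degrees from a unit vector."""
    lat = math.atan2(z, math.hypot(x, y))
    lng = math.atan2(y, x)
    return (math.degrees(lat), math.degrees(lng))
```

Because every (lat, lng) pair maps to exactly one point on the sphere and back, users can keep working in 2D coordinates while the library computes on 3D vectors.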

I will discuss the three things mentioned above separately here:

1. Support for constructing GEOGRAPHY values in different ways

  • WKT

       Currently we use a WKT parser implemented with the Yacc and Lex tools.

  • WKB(EWKB)

       “Well-known binary” is a scheme for writing a simple features geometry into a platform-independent array of bytes, usually for transport between systems or between programs. This document (https://libgeos.org/specifications/wkb/) explains WKB in detail.

      I just need to add a few points. As said above, for points we only need to know the latitude and longitude, so the data is 2D for us. Therefore, strictly speaking, we only need to implement conversion for the standard WKB format. However, many users use the EWKB format of PostGIS, so I also implemented EWKB support for compatibility. In practice, this just adds an SRID to the WKB format, and its value is 4326. As a consequence, EWKB parsing only accepts data with SRID = 4326 and returns NULL for any other SRID. Similarly, the SRID in the result of ST_asEWKB can only be 4326. ISO WKB adds recognition of additional dimensions, which is not relevant to us, so it is not considered.

  • GEOJSON(todo)
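The WKB and EWKB rules described in the WKB (EWKB) item above can be sketched for the simplest case, a 2D point. This is an illustrative sketch only, not the Doris implementation: the function and constant names are hypothetical, the byte layout follows the commonly documented PostGIS EWKB convention (an SRID flag bit in the type word, followed by a 4-byte SRID), and returning `None` stands in for SQL NULL.

```python
import struct

# Flag bits PostGIS EWKB sets in the 32-bit geometry type word.
SRID_FLAG = 0x20000000  # an SRID field follows the type word
Z_FLAG = 0x80000000     # geometry carries a Z coordinate
M_FLAG = 0x40000000     # geometry carries an M coordinate
WKB_POINT = 1
WGS84_SRID = 4326


def encode_point(lng, lat, with_srid=False):
    """Encode a 2D point as little-endian WKB, or EWKB when with_srid is True."""
    if with_srid:
        return struct.pack("<BIIdd", 1, WKB_POINT | SRID_FLAG, WGS84_SRID, lng, lat)
    return struct.pack("<BIdd", 1, WKB_POINT, lng, lat)


def decode_point(data):
    """Return (lng, lat), or None for anything but a 2D point in SRID 4326."""
    order = "<" if data[0] == 1 else ">"  # first byte: 1 = little-endian
    (type_word,) = struct.unpack_from(order + "I", data, 1)
    offset = 5
    if type_word & (Z_FLAG | M_FLAG):
        return None  # extra dimensions are out of scope
    if type_word & SRID_FLAG:
        (srid,) = struct.unpack_from(order + "I", data, offset)
        offset += 4
        if srid != WGS84_SRID:
            return None  # only SRID 4326 is accepted
        type_word &= ~SRID_FLAG
    if type_word != WKB_POINT:
        return None  # also rejects ISO WKB dimensional type codes
    return struct.unpack_from(order + "dd", data, offset)
```

In this sketch, plain WKB and EWKB with SRID 4326 decode to the same (lng, lat) pair, while EWKB carrying any other SRID yields `None`, mirroring the NULL behaviour described above.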

Detailed Design

the detailed design of the function.

...