Using MySQL’s geospatial extension for a location finder

It’s possible to use the geospatial extension in MySQL for an efficient location finder.  For this to be worth the trouble, the following conditions must hold.

  • You must use a MyISAM table for your geospatial data, or use version 5.7.5 or later of MySQL.
  • A NOT NULL qualification on your geometry column is required
  • A spatial index is needed:  ALTER TABLE flags ADD SPATIAL INDEX (coordinates)
  • Use of the GeomFromText and MBRContains / MBRWithin functions in your SELECT statement.

For example, a geospatial table might have the following definition.

CREATE TABLE geozip (
  zip VARCHAR(5) CHARACTER SET utf8 NOT NULL,
  geo GEOMETRY NOT NULL,
  PRIMARY KEY zip (zip),
  SPATIAL KEY geo (geo) )
ENGINE=MYISAM

Once you have created your geozip table, you can populate it with this kind of query.

INSERT INTO geozip 
SELECT DISTINCT 
       zip, 
       GEOMFROMTEXT( CONCAT('POINT(', latitude, ' ', longitude, ')') ) AS geo 
  FROM zip

Looking up rows in this table might be done by this query.

SELECT zip, X(geo), Y(geo), distance 
  FROM ( 
         SELECT zip, geo,r, 
                units * DEGREES( ACOS( COS(RADIANS(latpoint)) * 
                                       COS(RADIANS(X(geo))) * 
                                       COS(RADIANS(longpoint) - RADIANS(Y(geo))) + 
                                       SIN(RADIANS(latpoint)) * 
                                       SIN(RADIANS(X(geo))))) AS distance 
           FROM geozip
           JOIN ( 
                  SELECT 42.81 AS latpoint, -70.81 AS longpoint, 
                         10.0 AS r, 69.0 AS units 
                ) AS p ON (1=1) 
          WHERE MbrContains(GeomFromText( 
                        CONCAT('LINESTRING(', latpoint-(r/units),' ',
                                              longpoint-(r /(units* COS(RADIANS(latpoint)))), 
                                              ',', 
                                              latpoint+(r/units) ,' ', 
                                              longpoint+(r /(units * COS(RADIANS(latpoint)))), ')')),
                        geo) 
       ) AS d 
 WHERE distance <= r 
 ORDER BY distance

The secret is using the MbrContains(box,geo) function (“Minimum bounding rectangle Contains”) for your WHERE search. That function can exploit your spatial index. The first parameter, the box parameter, specifies a bounding box, and the second specifies a point.  When you use a constant value for the first parameter, the function uses the index.

In this example we specify the first parameter with this sort of expression:

GeomFromText('LINESTRING(LatSW LonSW, LatNE LonNE)')

The LINESTRING is a diagonal line across the bounding box from its southwest to northeast corner. That LINESTRING’s value is generated based on latpoint/longpoint by the long CONCAT() expression in the query. For the example lat/long value of 42.81/-70.81, the LINESTRING looks like this:

GeomFromText('LINESTRING(42.6650 -71.0076,42.9549 -70.6124)')

Finally, if you are dealing with locations that are close together, you may find that computing your distances
using the Vincenty formula is more accurate. You would do that with this query.

SELECT zip, X(geo), Y(geo), distance FROM (
    SELECT zip, geo,r,
           units * DEGREES(
              ATAN2(
                SQRT(
                   POW(COS(RADIANS(latpoint))*SIN(RADIANS(longpoint-Y(geo))),2) +
                     POW(COS(RADIANS(X(geo)))*SIN(RADIANS(latpoint)) -
                     (SIN(RADIANS(X(geo)))*COS(RADIANS(latpoint)) *
                      COS(RADIANS(longpoint-Y(geo)))) ,2)),
                   SIN(RADIANS(X(geo)))*SIN(RADIANS(latpoint)) +
                     COS(RADIANS(X(geo)))*
                     COS(RADIANS(latpoint))*
                     COS(RADIANS(longpoint-Y(geo)
                   )
                  )) AS distance
      FROM geozip
      JOIN (
             SELECT 42.81  AS latpoint,  -70.81 AS longpoint,
             50.0 AS r, 69.0 AS units
           ) AS p ON (1=1)
     WHERE MBRCONTAINS(GEOMFROMTEXT(
             CONCAT('LINESTRING(',
                       latpoint-(r/units),' ',
                       longpoint-(r /(units* COS(RADIANS(latpoint)))),
                       ',',
                       latpoint+(r/units) ,' ',
                       longpoint+(r /(units * COS(RADIANS(latpoint)))),
                    ')')),  geo)
       ) AS d
 WHERE distance <= r
 ORDER BY distance

Percona’s Alex Rubin has prepared some helpful details on the geo extension.

  6 comments for “Using MySQL’s geospatial extension for a location finder

  1. Rick
    January 20, 2016 at 5:00 pm

    Thanks so much Ollie. Do you know the performance difference between this vs using floats for lat/long (MySql)? I’m comparing your other example here: http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/

    • Ollie
      January 21, 2016 at 1:57 pm

      This kind of geospatial index will be fast compared to indexed lat and long columns if

      1. the number of points to search is larger (50K or more, roughly) AND
      2. the geospatial search is by far the most selective of the searches you are doing.

      That is

      WHERE MBRCONTAINS(…)

      is fast, and

      WHERE point_Type = 7 AND MBRCONTAINS(…)

      not so much. That’s because a geo index can’t be included in a compound index with other columns.

  2. March 12, 2014 at 8:18 pm

    Great post! Thanks!
    I assume we are using miles here. If we wanted Kms we would change the “69.0 AS units” to “111.045 AS units” – is this correct?

    • Ollie
      March 13, 2014 at 8:13 am

      Exactly so. 60 nautical miles per degree, 69 statute miles per degree, 111045 meters (111.045 km) per degree, 5,520 chains per degree.

      • nic
        September 11, 2016 at 11:07 pm

        Hi Ollie, thanks for these posts! if we changed r = 2 and set units to 11045.0 would this calculate a 2 meter bounding box? I am looking at using this technique to cache reverse geocoded addresses and so the range needs to be smaller.

      • Ollie
        September 21, 2016 at 11:52 am

        Use 111045 to specify your units in meters.

        Be careful: two meters is likely to be too small a radius to cope with the imprecision in commercial GPS data.

Leave a Reply

Your email address will not be published. Required fields are marked *