Fast nearest-location finder for Oracle

Elsewhere I have written an article on fast location finding for the MySQL and PostgreSQL databases. This article shows an adaptation of that work for the Oracle database.

If you need data to work with, here’s a US Zip Code table in SQL suitable for an Oracle database. Please don’t use this data in production.

The Great Circle Distance Formula

What’s the distance along the surface of the (spherical) earth between two arbitrary points, in degrees, given by their latitude and longitude in degrees?  That’s determined by the Spherical Cosine Law, or the Haversine Formula, which is this in Oracle syntax.

57.2957795 * (ACOS(COS(0.0174532925*(lat1)) * COS(0.0174532925*(lat2)) *
                   COS(0.0174532925*(long1-long2)) +
                   SIN(0.0174532925*(lat1)) * SIN(0.0174532925*(lat2))))

This formula is all about angles. The constant “57.2957795” converts radians to degrees, and the inverse constant “0.0174532925” converts degrees to radians.

It’s the distance along the surface of the spherical earth. It works equally well when the locations in question are your apartment and your local supermarket, or when they are the airports in Sydney, Australia and Reykjavik, Iceland. Notice that this result is in degrees.  That means we have to multiply it by  111.045, our value for km per degree, if we want the distance in km.

Querying the Nearest Locations

Cool. So to find the nearest points in a database to a given point, we can write a query like this.  Let’s use the point with latitude 42.81 and longitude -70.81. This Oracle query finds the fifteen nearest points to the given point in order of distance.

SELECT * FROM(
 SELECT z.zip, z.primary_city, z.state,
        z.latitude, z.longitude,
        p.radius,
        p.distance_unit
           * rad2deg * (ACOS(COS(deg2rad * (p.latpoint))
                           * COS(deg2rad * (z.latitude))
                           * COS(deg2rad * (p.longpoint - z.longitude))
                           + SIN(deg2rad * (p.latpoint))
                           * SIN(deg2rad * (z.latitude)))) AS distance
  FROM zip z
  JOIN (
        SELECT  42.81  AS latpoint,    -70.81 AS longpoint,
                50.0 AS radius,        111.045 AS distance_unit,
		57.2957795 AS rad2deg,
                0.0174532925 AS deg2rad
	      FROM  DUAL
    ) p ON 1=1
  ORDER BY distance
)
WHERE ROWNUM <= 15

Notice the use of the join to put latpoint and longpoint into the query. It’s convenient to write the query that way because latpoint and longpoint are referred to multiple times in the formula. We also include the degrees-to-radians and radians-to-degrees constants in the joined table. Notice also the use of “ROWNUM < = 15” to limit the number of rows in the resultset.

Great. We’re done, right? Not so fast! This query is accurate,  but it is very slow.

Optimization

The query is slow because It must compute the haversine formula for every possible pair of points. So it makes your Oracle server do a lot of math, and it forces it to scan through your whole location table.  How can we optimize this?  It would be nice if we could use indexes on the latitude and longitude columns in the table. To do this, let’s introduce a constraint. Let’s say we only care about points in the zip code table that are within 50 km of the (latpoint, longpoint). Let’s figure out how to use an index to eliminate points that are further away.

Remember, from our background information in the other article, that a degree of latitude is 111.045 km.  So, if we have an index on our latitude column, we can use a SQL clause like this to eliminate the points that are too far north or too far south to possibly be within 50 km.

latitude BETWEEN latpoint - (50.0 / 111.045)
             AND latpoint + (50.0 / 111.045)

This WHERE clause lets Oracle use an index to omit lots of latitude points before computing the haversine distance formula. It allows Oracle to perform a range scan on the latitude index.

Finally, we can use a similar but more complex SQL clause to eliminate points that are too far east or west.  This clause is more complex because degrees of longitude are smaller distances the further away from the equator we move. This is the formula.

longitude BETWEEN longpoint - (50.0 / (111.045 * COS(deg2rad*(latpoint))))
              AND longpoint + (50.0 / (111.045 * COS(deg2rad*(latpoint))))

So, putting it all together, this query finds the neareast 15 points that are within 50km of the (latpoint,longpoint).

SELECT * FROM (
SELECT zip, primary_city, state,
       latitude, longitude, distance
  FROM (
 SELECT z.zip, z.primary_city, z.state,
        z.latitude, z.longitude,
        p.radius,
        p.distance_unit
            * rad2deg * (ACOS(COS(deg2rad * (p.latpoint))
                            * COS(deg2rad * (z.latitude))
                            * COS(deg2rad * (p.longpoint - z.longitude))
                            + SIN(deg2rad * (p.latpoint))
                            * SIN(deg2rad * (z.latitude)))) AS distance
  FROM zip z
  JOIN (
        SELECT  42.81  AS latpoint,    -70.81 AS longpoint,
                50.0 AS radius,        111.045 AS distance_unit,
                57.2957795 AS rad2deg,
                0.0174532925 AS deg2rad
	      FROM  DUAL
    ) p ON 1=1
  WHERE z.latitude
     BETWEEN p.latpoint  - (p.radius / p.distance_unit)
         AND p.latpoint  + (p.radius / p.distance_unit)
    AND z.longitude
     BETWEEN p.longpoint - (p.radius / (p.distance_unit * COS(deg2rad * (p.latpoint))))
         AND p.longpoint + (p.radius / (p.distance_unit * COS(deg2rad * (p.latpoint))))
 )
 WHERE distance <= radius
 ORDER BY distance
)
WHERE ROWNUM <= 20

This query, even though it’s a bit complicated, takes advantage of your latitude and longitude indexes and works efficiently.

Notice as part of the overall query that we JOIN this little subquery.

        SELECT  42.81  AS latpoint,    -70.81 AS longpoint,
                50.0 AS radius,        111.045 AS distance_unit,
                57.2957795 AS rad2deg,
                0.0174532925 AS deg2rad
	      FROM  DUAL

The purpose of this is to make it easier for application software to provide the parameters needed for the query. latpoint and longpoint are the specific location for which you need nearby places. radius specifies how far away the search should go. We include the radians-degrees conversion constants. Finally distance_unit should be 111.045 if you want to give your distances in kilometers and 69.0 if you want them in statute miles.

Using Miles Instead of Km

Finally, many people need to use miles instead of km for their distances.  This is straightforward.  Just change the value of the “distance_unit” to 69.0.

That’s the query for a typical store-finder or location-finder application based on latitude and longitude.  You should be able to adapt it to your use without too much trouble.

Adapting this query to other location-table definitions

This query is, of course, written to run with a particular table definition for zip, a US zip code table.  That zip  table has columns named zip, primary_city, latitude, and longitude, among others. Notice that the table is referred to by FROM zip z in the query. That makes it have the alias z.

Your location table very likely has different columns. It should be straightforward to rewrite this query to use yours.  Look for the columns referred to as z.something in the query, and replace those with the column names from your table. So, for example, if your table is called SHOP and has shopnameshoplat, and shoplong columns, you’ll put z.shopname in place of z.primary_city and so forth. You’ll refer to your table by mentioning FROM SHOP z in the query.

Using a Stored Procedure

You can use a PL/SQL Stored Procedure to compute the great circle distance if you prefer. Here is a stored procedure for that purpose.

create or replace FUNCTION haversine  (
  lat1 IN REAL, lon1 IN REAL, 
  lat2 IN REAL, lon2 IN REAL)
RETURN REAL
DETERMINISTIC
IS

  DEGREES REAL := 57.2957795;
  RADIANS REAL := 0.0174532925;

BEGIN
  return DEGREES * (ACOS(
              COS(RADIANS *(lat1)) *
              COS(RADIANS *(lat2)) *
              COS(RADIANS *(lon2 - lon1)) +
              SIN(RADIANS * (lat1)) * SIN(RADIANS * (lat2))
            ));
END;

When you use that stored procedure your query is a little easier to read. Your queries will also run quite a bit faster because Oracle compiles its procedures.

SELECT * FROM (
SELECT zip, primary_city, state,
       latitude, longitude, distance
  FROM (
 SELECT z.zip, z.primary_city, z.state,
        z.latitude, z.longitude,
        p.radius,
        p.distance_unit
           * haversine(latitude,   longitude,
                       latpoint,   longpoint) AS distance
  FROM zip z
  JOIN (
        SELECT  42.81  AS latpoint,    -70.81 AS longpoint,
                50.0 AS radius,        111.045 AS distance_unit,
                0.0174532925 AS deg2rad
	      FROM  DUAL
    ) p ON 1=1
  WHERE z.latitude
     BETWEEN p.latpoint  - (p.radius / p.distance_unit)
         AND p.latpoint  + (p.radius / p.distance_unit)
    AND z.longitude
     BETWEEN p.longpoint - (p.radius / (p.distance_unit * COS(deg2rad * (p.latpoint))))
         AND p.longpoint + (p.radius / (p.distance_unit * COS(deg2rad * (p.latpoint))))
 )
 WHERE distance <= radius
 ORDER BY distance
)
WHERE ROWNUM <= 20

  10 comments for “Fast nearest-location finder for Oracle

  1. Suman Panigrahi
    May 7, 2018 at 7:08 am

    Hi Ollie,

    We are using following query in production to find nearest buildings, and it takes 30 secs. Is there a way to tune this query ?

    WITH WUILDING AS
    (SELECT GEO_LATITUDE,GEO_LONGITUDE,6387.7 AS RADIUS, 57.29577951 AS DEGTORAD FROM BUILDING_ADS
    WHERE BUILDING_NK = :p)
    SELECT VP FROM (
    SELECT /*+ PARALLEL 8*/ BA.BUILDING_NK AS BUILDING_ID, MV.VP AS VP,
    (NVL(WO.RADIUS,0) * ACOS((SIN(NVL(WO.GEO_LATITUDE,0) / WO.DEGTORAD) * SIN(NVL(BA.GEO_LATITUDE,0) / WO.DEGTORAD)) +
    (COS(NVL(WO.GEO_LATITUDE,0) / WO.DEGTORAD) * COS(NVL(BA.GEO_LATITUDE,0) / WO.DEGTORAD) *COS(NVL(BA.GEO_LONGITUDE,0) …

    • Ollie
      May 30, 2018 at 12:00 pm

      Your best bet for getting help with query performance is https://stackoverflow.com/

      Your query does no bounding box filtering of your location data. That’s the secret to getting good performance.

  2. Gary Walker
    October 6, 2016 at 1:23 pm

    Computing longitude BETWEEN longpoint – (50.0 / (111.045 * COS(deg2rad*(latpoint))))
    AND longpoint + (50.0 / (111.045 * COS(deg2rad*(latpoint))))

    This Seems like a slow operation to me. If you just ignore all this as simply use 111.045 as the upper bound regardless of latitude you will still have a proper subset (although a little more area would of course be included). Given that stores tend to be near people, it should be safe to assume that 99% of the world’s population and stores are between latitudes 60N and 60S, and cosine 60 is 0.5, so you are only doubling the potential areas included at the extreme north/south region for 99% of your possible data points.

    Calculating distance by the Haversine formula is still quite expensive — for short distances (50 km or less) this flat earth formula:

    flat_earth_distance(lat1, lng1, lat2, lng2) — lat/long are in degrees
    deglen := 111.045 — km
    x := lat1 – lat2 — don’t care if pos. or neg. since we square it
    y := (lng1 – lng2)*cos(lat1) — cosine probably has to convert to radians first
    return deglen*sqrt(x*x + y*y)
    — or return distance squared and avoid the square root calculation

    Results in an distance error of no more than 0.4% even when the the latitude is extreme as 65 deg. You could even use a case statement to use Haversine when latitude is extreme.

    If someone is looking for a store with 50 km and you just happen to miss by a full 0.4% you will at worst case report that a store 50.2 km is in range. Seems like a good compromise in most real-world use cases.

    However, I would not bother with the faster flat earth distance formula if the number of returned points is relatively large, unless you have tested and determined that the database is exceptionally slow when using the Haversine formula.

    http://jonisalonen.com/2014/computing-distance-between-coordinates-can-be-simple-and-fast/ has a nice article on using the flat earth approximation.

    • Ollie
      October 6, 2016 at 6:12 pm

      With respect, you’re pitching a false economy. This would be a concern if we were still in the era of software floating point computation. But, these days, the time consumed by a computation is several orders of magnitude faster than the other times consumed. Here’s the analogy.

      • Christopher Columbus writes a fifty-word letter :: do the computation.
      • Christopher Columbus walks to the dock and hands it to a ship captain :: retrieve the data from the MySQL table, with an SSD.
      • The ship captain sets sail, sails across the ocean, and delivers the letter to Queen Isabella :: send the data over the network to the database client.

      Who cares how long it takes to write the letter? The trick is to ship an optimal amount of cargo across the ocean.

  3. Don
    March 2, 2016 at 1:35 pm

    Awesome! Thanks for sharing your coded procedure! I wasn’t even sure Oracle had all the trig functions built in to do this and you coded the entire procedure.

    • Ollie
      March 7, 2016 at 12:33 pm

      Well, Oracle lacks the RADIANS() and DEGREES() functions that are exposed by standard IEEE floating point units. But that’s not a problem, because multiplications can do the trick.

  4. Hiren
    February 2, 2016 at 6:17 am

    Is it possible to get long/lat using a zipcode as input, like the Google api does?

    • Ollie
      February 2, 2016 at 6:32 am

      If you have a table of zipcodes containing long/lat columns, you can, trivially, look up the long / lat for a zipcode in the table. Google’s benefit is their enormous collection of data behind their apis. If you do this in your own DBMS you must provide the data.

  5. Omar Al hakimy
    December 7, 2014 at 8:39 pm

    I wanted to thank you. Your post helped me a lot with my project. I was trying to find a way to let my users to find something in their circle, and here you helped me.

    • Ollie
      December 8, 2014 at 11:48 am

      You’re welcome, Omar!

Leave a Reply

Your email address will not be published. Required fields are marked *