Category: MySQL

Articles about the MySQL relational database management system.

Publicly Available Datasets

Learning to use data systems like MySQL sometimes means getting your hands on publicly available datasets. Here are some sources. ProPublica, the investigative news powerhouse, has a Data Store. It’s mostly health care related material. Not all their datasets are free, but some are. If you’re interested in historical meteorological observations (weather data),… Read more →

The Vincenty great-circle distance formula

The Vincenty formula is a more numerically stable version of the spherical cosine law formula (commonly and wrongly known as the Haversine formula) for computing great-circle distances. The question of numerical stability comes up specifically when the distances between points are small. In those cases the cosine is very close to 1, so the inverse cosine function is not… Read more →
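To illustrate the stability point in the excerpt above, here is a minimal Python sketch (not the article's MySQL code) that compares the spherical cosine law with the spherical special case of the Vincenty formula. The function names and the 6371 km mean earth radius are assumptions chosen for this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean earth radius, for illustration only


def cosine_law_km(lat1, lon1, lat2, lon2):
    """Great-circle distance via the spherical cosine law (degrees in, km out)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dl)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    c = max(-1.0, min(1.0, c))
    return EARTH_RADIUS_KM * math.acos(c)


def vincenty_sphere_km(lat1, lon1, lat2, lon2):
    """Great-circle distance via the Vincenty formula for a sphere (degrees in, km out)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    # The numerator carries the sine of the central angle, the denominator its
    # cosine; atan2 stays well conditioned even when the angle is tiny.
    y = math.hypot(
        math.cos(p2) * math.sin(dl),
        math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl),
    )
    x = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dl)
    return EARTH_RADIUS_KM * math.atan2(y, x)
```

For points far apart the two functions agree closely. For points a fraction of a millimetre apart, the cosine is so close to 1 that the cosine-law result is dominated by rounding error, while the atan2 form still returns a sensible positive distance.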

SQL Reporting by time intervals

A version of this article specific to the Oracle DBMS is here. It’s often helpful to use SQL to group information by periods of time. For example, we might like to examine sales data stored in a table of individual sales transactions like so. Sales: sales_id int, sales_time datetime, net decimal(7,2), tax decimal(7,2), gross decimal(7,2). Each time… Read more →
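The idea of grouping by time period can be sketched outside SQL as well. The following Python fragment is only an illustration of the concept, not the article's query: it truncates each sales_time to its calendar day and sums gross per day. The sample rows and the function name gross_by_day are made up for this example.

```python
from collections import defaultdict
from datetime import date, datetime
from decimal import Decimal

# Hypothetical rows of (sales_id, sales_time, gross), loosely following the
# Sales table described above; the values are invented.
sales = [
    (1, datetime(2011, 9, 7, 9, 30), Decimal("10.00")),
    (2, datetime(2011, 9, 7, 17, 45), Decimal("25.50")),
    (3, datetime(2011, 9, 8, 0, 5), Decimal("7.25")),
]


def gross_by_day(rows):
    """Sum gross per calendar day by truncating each timestamp to its date."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so sums start at 0
    for _sales_id, sales_time, gross in rows:
        totals[sales_time.date()] += gross
    return dict(totals)


daily = gross_by_day(sales)
```

The key step is the truncation of the timestamp to the start of its interval; grouping by week, month, or hour follows the same pattern with a different truncation.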

Using MySQL’s geospatial extension for a location finder

It’s possible to use the geospatial extension in MySQL for an efficient location finder. For this to be worth the trouble, the following conditions must hold. You must use a MyISAM table for your geospatial data, or use version 5.7.5 or later of MySQL. A NOT NULL qualification on your geometry column is required. A spatial index is needed: ALTER… Read more →

Stored function for haversine distance computation

In another article I described the process of using MySQL to compute great-circle distances between various points on the earth when their latitudes and longitudes are known. Doing this requires the formula commonly called the haversine formula. It’s actually the spherical cosine law formula, and is shown here. There’s a more numerically stable formula, better when points are near… Read more →

Mean Absolute Deviation

Nassim Taleb wrote a provocative article on Edge.Org calling for using the Mean Absolute Deviation in place of the more popular standard deviation as a measure of the variability of a collection of observations. His reasoning is persuasive to me, especially his claim that the standard deviation is widely misapplied and misunderstood. MySQL (like many RDBMSs) has an aggregate function… Read more →
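As a quick reminder of what the two measures compute, here is a small Python sketch (an illustration only, not the article's MySQL approach). The function name and the sample observations are assumptions made for this example.

```python
import statistics


def mean_absolute_deviation(values):
    """Average absolute distance of each observation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / n


observations = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample with mean 5

mad = mean_absolute_deviation(observations)  # 1.5
std = statistics.pstdev(observations)        # population standard deviation, 2.0
```

Because the standard deviation squares each deviation before averaging, the large deviations weigh more heavily, which is why it comes out larger than the mean absolute deviation for this sample.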

What’s a date?

What is a date?  This seems like a silly question.  Indeed, if you are an independent local business person, it is a silly question.  A date is, for example, the seventh of September, 2011 (“2011-09-07”).  It describes a period of 24 hours that starts at midnight and ends just before midnight. If you only care about dates in, let’s say,… Read more →