Getting started with the data

This entry is part 2 of 4 in the series Exploring data with SQL

The Times’s data is here. https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv If you’re using a browser, open the context menu (right-click) on that link and use the Save Link As … operation to put it into a file on your system. We’ll use it later to load the database. Use your database interface program to log in to your database … Read more

Computing the mode with SQL

The mode of a bunch of numbers is the most frequent number in the bunch. A bunch of numbers can have more than one most-frequent number. Given a table tbl with columns category and value, this SQL returns the mode for each category.   WITH freq AS ( SELECT COUNT(*) freq, value, category FROM tbl … Read more

Super simple Javascript Queue class

It’s tempting, when in a hurry, to create a queue object in Javascript with an array. To enqueue something, push() it onto the array. To dequeue it, shift() it off. It works. But it’s slow, because shift() takes O(n) time where n is the length of the queue. Kate Morely cooked up an almost-as-simple but … Read more

Should I encrypt columns in my web application’s database?

Somebody asked whether it’s a good idea to encrypt database columns in a web application. My answer: “It probably won’t help much.” Why? Figure out your threat model Best practice: figure out your threat model before you spend time and money securing your system. If you build complex security measures without a clear idea of … Read more

Time Zones in MySQL

Suppose you’re working on a web application to support users in multiple time zones. How do you handle dates and times? If your users all live in one time zone, it’s easy. Simply store important date and time data in DATETIME or TIMESTAMP columns, and be done with it. But, what application has users in just one … Read more

Test data for FULLTEXT searching

MySQL’s FULLTEXT search feature works best when it’s used on a large corpus of text. That is, for best results you need many rows. When the text corpus is too small, full text searching often returns strange results: the indexing process tries to work out common words. If the amount of text being indexed is … Read more

Sample text in many languages

Have you been around long enough to remember the Kermit terminal program? It was a very important way of getting online in the late 1980s and early 1990s. Frank da Cruz, Kermit’s creator, also has an interest in world languages and Unicode. He put together a terrific web page containing samples of text in many languages. … Read more

Unique Numbers in Oracle and MySQL

It’s common in database work to need unique integers. They get used in id columns in tables, and for other purposes. This note compares and contrasts the Oracle and MySQL approaches to doing this. MySQL handles this need with autoincrement columns, in tables and Oracle handles it with database objects called sequences. In MySQL, when … Read more

Fast nearest-location finder for SQL (MySQL, PostgreSQL, SQL Server)

I’ve spent enough time goofing around with location-finder software that it’s worth writing up how to do it.   Of course, finding distances on the surface of the earth means using Great Circle distances, worked out with the Haversine formula, also called the Spherical Cosine Law formula. The problem is this: Given a table of … Read more