Novel DS & Algos Roundup #1

November 04, 2017 by Abhirath Mahipal


These are a few novel algorithms that I stumbled upon this year. I intend to curate and jot down stuff like this on a regular basis from now on so that I can quickly get to speed in the future (that is once I’ve forgotten about them). These also serve as notes. I don’t have to hunt down these links in the future.

Index

Bloom Filters

It’s a space efficient algorithm that checks if a particular item exists in a list or not. It trades memory requirements for accuracy. It has a very high accuracy but isn’t right all the time (you can tune it to use more memory and increase accuracy or reduce memory requirement and accuracy at the same time). Ingenious use of hashing.

Music Shuffling

Random algorithms obviously aren’t deterministic. When using them it’s possible that at times you might end up the same number multiple times clustered together. For example 1, 2, 2, 8, 4, 5, 5, 5. This algorithm ensures that the songs are uniformly distributed.

Rsync

Rsync efficiently transfers only the differences between the two files. Say you have an outdated file A on your server and the latest file A on your personal computer. You have to replace the file on the server with the latest file. The naive way to send the file would be to send the entire file from your computer to the server. It uses more bandwidth and takes more time.

The smarter way would be to just send the differences between both the files. Only the parts that have actually changed. Rsync uses hashing to find out the differences and can then enable you to send only the differences saving you time and bandwidth.

Hyperloglog

I stumbled upon this during my attempt to read the source code of Redis. It is used to return the number of distinct elements in a list or container. It uses lesser memory than the traditional approach (iterate over each thing, if not already encountered increase count by 1) but isn’t exact. It’s pretty accurate that even Redis uses it.

Pratt Parsing

Pratt parsing is a simple parsing technique. I haven’t worked with parsers so I don’t know the benefits it offers over other parsing techniques.

These tutorials go really in depth.