Fun with Binary Trees

     

This is another contrived post to test out my interactive coding scripts, and also to test making a d3 tree graph. While this post is probably not going to be enlightening, I hope it is somewhat entertaining.

One of my favorite data structures is the binary tree. It’s not the fastest at everything, but it does most things you’d need, and it does them reasonably quickly.

If you have a list of things that you need to sort, search, and also insert new items into, a binary tree is not a bad thing to consider. While it’s searching is nowhere near as fast as, say, a hash table and it’s sorting isn’t as fast as timsort - it’s versatility and simplicity make it one of my goto structures.

Adding one item to a binary search tree is on average an \(O(\log n)\) process. Adding n items is an \(O(n \log n)\) process, making tree sorting a ‘fast sort’ process. (Wikipeda)

(This of course depends of if it is balanced or unbalanced. And, as I am sure you already know, Big Oh (Lower is better) is just a way to measure how fast an algorithm will “blow out” as you throw more stuff at it)

Instead of me describing a tree in English, let’s just build one. Don’t forget to click the play button on each section.


The core of a tree is a node. In a binary tree each node has two children - generally called left and right. Here I am instead making an array called children with 2 elements because, later, it’ll make using d3 easier.

await K.Include("https://d3js.org/d3.v4.js");

class Node {
  constructor(name, data) {
    // This is normally called ID, using name because of d3
    this.name = name || undefined;
    this.data = data || undefined;
    // Usually...
    // this.left = undefined;
    // this.right = undefined;
    this.children = new Array(2);
  }
}

return Node;

One neat thing about a tree is everything is a node - we get a lot of value out of this simple data structure.

To start the tree we just create a root node

const [Node] = arguments;
const root = new Node();
return [Node, root];

Insert

If you use this in a real problem, some care must be taken here. As we start adding things to this tree the root node will become the pivot point of the tree. You want this to be as close to the middle as possible - whatever you define the middle as. This will help keep the tree balanced. For now, we’ll just ignore that.

We’ll make a simple insert function and insert a bunch of numbers into our tree:

const [[Node]] = arguments;
const root = new Node();

function insert(root, name, data) {
  // if nothing here, we'll use this node
  if (root.name === undefined) {
    root.name = name;
    root.data = data;
    return true;
  }

  // Duplicate
  if (root.name === name) {
    // should also save this data.
    // but we'll ignore it in this example
    return false;
  }

  if (name < root.name) {
    if (root.children[0] === undefined) {
      root.children[0] = new Node(name, data);
      return true;
    }
    return insert(root.children[0], name, data);
  }
  if (name > root.name) {
    if (root.children[1] === undefined) {
      root.children[1] = new Node(name, data);
      return true;
    }
    return insert(root.children[1], name, data);
  }

  return false;
}

for (const i of [4,621,12,451,131,33,22,56,64]) {
  insert(root, i, `#${i}`);
}

return root;

This insert function looks at a number, and if it is greater than or less than the given value, and adds a node to the left or right.

Graph

To visualize the tree we just made, let’s use D3:

const [data, me] = arguments;

const svg = K.GetD3Canvas(me);
svg.selectAll("*").remove();

let width = 360
let height = 260

const g = svg.attr("width", width)
  .attr("height", height)
  .append("g")
    .attr("transform", "translate(40,0)");

  // Create the cluster layout:
  let cluster = d3.cluster().size([height, width]);
  let root = d3.hierarchy(data, function(d) {
      if(d) return d.children;
  });
  cluster(root);

  // Add the edges
  g.selectAll('path')
    .data( root.descendants().slice(1) )
    .enter()
    .append('path')
    .attr("d", function(d) {
      const p = "M" + d.y + "," + d.x
        + "C" + (d.parent.y + 10) + "," + d.x
        + " " + (d.parent.y + 10) + "," + d.parent.x
        + " " + d.parent.y + "," + d.parent.x;
      return p;
    })
    .style("fill", 'none')
    .attr("stroke", '#ccc')

  // Add each node.
  g.selectAll("g")
      .data(root.descendants())
      .enter()
      .append("g")
      .attr("transform", function(d) {
          return "translate(" + d.y + "," + d.x + ")"
      })
      .append('text')
      .attr("dy", ".35em")
      .attr("x", function(d) {
          if(d) return d.children || d._children ? -13 : 13;
      })
      .text(function(d) { return (d && d.data) ? d.data.name || "0" : "∅"; });

svg.style("height", "280px");

return data;

If the output doesn’t make sense, use the graph output and the insert code above, have a play with tweaking the values of the array. Maybe make the array have fewer numbers, or ‘1,2,3,4’ or start with 300 - rerun the code blocks and see how the tree changes.

Sort

Now for the first magic trick.

If we iterate over the nodes in a particular way, they come out sorted. We can make them ascending or descending depending on which side we visit first.

Here is ascending:

const [data, me] = arguments;
const sorted = [];

function sort(node, sorted) {
  if(node.children && node.children[0]) {
    sort(node.children[0], sorted);
  }
  sorted.push(node);
  if(node.children && node.children[1]) {
     sort(node.children[1], sorted);
  }
}
sort(data, sorted);
K.Print(me, sorted.map(i => i.name).join(', '));

return data;

I remember when I first learned this - it blew my mind. Try adding numbers to the array and re-running. If you’re feeling saucy, see if you can make the sort order descending.

In most programming languages, you wont need to do this kind of thing by hand. There will often be built-in libraries or functions you can use. But if you ever find yourself in some low level language, this can be a useful trick.

Find

So we’ve got insert, and sort. Let finish up with find. Find is actually quite simple, but it’s got some recursion which is fun in and of itself:

const [data, me] = arguments;

function find(node, query) {
  if (node === undefined || query === undefined) return undefined;
  if (node.name === query) {
    return node;
  }
  if (node.name > query) {
    return find(node.children[0], query);
  }
  if (node.name < query) {
    return find(node.children[1], query);
  }
}

const rtn = find(data, 131);
K.Print(me, rtn?.name || 'Not Found');

The idea here is similar in spirit to the sort function. Have a play.

Fin

While this data structure almost never comes up in day to day web development, I’ve used it quite a bit in games programming and backend utilities. It’s a fun data structure, and a good trick to keep in your pocket.