November 5, 2017

What is a dependency tree, and what can it do?

A dependency tree organizes structures so there are explicit relationships between structures.

The simplest example would be stating that one variable b depends on another variable a. Let's use the syntax [dependency, dependent] to express that relationship, e.g. "b depends on a"

['a', 'b']

We can also express the idea of grouped and nested dependencies.

// "c and d both depend on a and b"
[['a', 'b'], ['c','d']

// "c depends on a and b, and d depends on c"
[[['a', 'b'], 'c'], 'd']

Along with that idea, we have the concept of load order. Once we have a tree we, can traverse from the bottom of the tree upward and accumulate the dependencies in the correct load order. In JavaScript, we can accumulate the load order with a recursive function like this...

// BinaryTree a -> Array a
function getOrder(tree) {
  var dependency, dependent;

  if (Array.isArray(tree[0])) {
    dependency = getOrder(tree[0]);
  } else {
    dependency = [tree[0]];
  }

  if (Array.isArray(tree[1])) {
    dependent = getOrder(tree[1]);
  } else {
    dependent = [tree[1]];
  }

  return dependency.concat(dependent);

}

// Returns ['a', 'b1', 'b2', 'c', 'd']
getOrder([[['a', ['b1', 'b2']], 'c'], 'd']);

Note that when we get the load order, we've lost some information about the relationships between elements. If we just look at ['a', 'b1', 'b2', 'c', 'd'] there's no way to know the parent and child relationships between elements. That load order could be from any one of these trees...

[[[['a', 'b1'], 'b2'], 'c'], 'd']
['a', [[['b1', 'b2'], 'c'], 'd']]
...
..

And we lose even more information when we have trees whose nodes can have any number of children that don't depend on each other, that is we introduce the possibility of parallel dependency loading. Let's use the syntax [ A(sibling dependencies...), A(sibling dependents...) ] to express this relationship.

// e, f, and g depend on a, c, p1, and p2.
// b1 and b2 depend on p1 and p2.
[
  A(
    'a',
    [
      A('p1','p2'),
      A('b1','b2')
    ],
    'c'
  ),
  A('e','f','g')
]

We can modify our original ordering function to handle these n-ary trees...

// NAryTree a -> Array a
function getOrder(tree) {
  var dependencies, dependents;

  dependencies = tree[0].siblings
    .reduce(function(accum, d) {
      if (Array.isArray(d)) {
          return accum.concat(getOrder(d));
      } else {
          return accum.concat([d]);
      }
  }, []);

  dependents = tree[1].siblings
    .reduce(function(accum, d) {
      if (Array.isArray(d)) {
          return accum.concat(getOrder(d));
      } else {
          return accum.concat([d]);
      }
  }, []);

  return dependencies.concat(dependents);

}

function A() {
    var args = Array.prototype.slice.call(arguments);
    if (!(this instanceof A)) {
        return new (Function.prototype.bind.apply(
          A, [null].concat(args))
        );
    }

    this.siblings = args;    
}

// ["a", "p1", "p2", "b1", "b2", "c", "e", "f", "g"]
getOrder([
  A(
    'a',
    [
      A('p1','p2'),
      A('b1','b2')
    ],
    'c'
  ),
  A('e','f','g')
]);

Similar to before, reading down the tree directly corresponds to the correct load order, so you might ask, "What's the big deal if we express dependencies as a flat array?" Unlike before, we now have the option of inserting and rearranging elements. There are multiple correct and incorrect orderings. By flattening the tree we've lot the information about where insertions and transpositions can safely happen. The tree can tell us that b1 and b2 can be switched, and another sibling can be inserted before, between, or after. The flat array can't express that information. The same applies to a, c, and the group in-between. If we want to add a dependency at that level, the flat array can't tell us the correct insertion points.

If you'd like to play with this code, check out this code pen