Temporally Constrained Apriori Mining

Data mining is used to find correlations in large, noisy data sets. It is most commonly used to find shopping patterns and is therefore known as the market basket approach. Apriori mining finds all recurring symbol sets in the examples it is given, rating each resulting rule with a support and a confidence. A rule takes the form A ← B,C,D, meaning that when B, C and D occur, A also occurs. If we give all of our positive examples a positive ID symbol (PID) then we can make the mining return only rules which lead to that symbol, e.g. PID ← B,C,D. In this example the head of the rule, PID, is called the consequent and the body of the rule, B,C,D, is called the antecedent.

The support of a rule is the percentage of examples which contain both the antecedent and the consequent, so in our original example P(A,B,C,D). The confidence is the support divided by the number of times the antecedent occurs, with or without the consequent, i.e. P(A,B,C,D)/P(B,C,D). In short, the support measures how often the rule occurs and, in our case, the confidence measures how often the rule occurs in positive examples versus negative examples.
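A minimal sketch of these two measures is given below, assuming each example is represented as a set of symbols; the symbol names and thresholds are purely illustrative, while the PID labelling follows the description above.

```python
# Minimal sketch with hypothetical symbols: positive examples get a "PID"
# symbol appended so that mining can be restricted to rules PID <- antecedent.
positive = [{"B", "C", "D"}, {"A", "B", "C", "D"}, {"B", "C", "E"}]
negative = [{"B", "E"}, {"C", "D", "E"}]
transactions = [ex | {"PID"} for ex in positive] + negative

def support_and_confidence(transactions, antecedent, consequent="PID"):
    """support    = fraction of all transactions containing the whole rule
                    (antecedent plus consequent);
       confidence = that count divided by the number of transactions
                    containing the antecedent, with or without the consequent."""
    antecedent = set(antecedent)
    rule = antecedent | {consequent}
    n_rule = sum(1 for t in transactions if rule <= t)
    n_ante = sum(1 for t in transactions if antecedent <= t)
    return n_rule / len(transactions), (n_rule / n_ante if n_ante else 0.0)

# e.g. evaluate the rule PID <- {B, C}
print(support_and_confidence(transactions, {"B", "C"}))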

At this point the mining has no temporal information: it receives all of the symbols from all of the frames of an example as a single set. It is therefore at liberty to pick symbol sets whose members come from any point in the video, which results in large, unwieldy lists of rules that do not contain much information. Since we know that signs are of limited length, we constrain the mining to find symbols which occur within a time window of 10 frames.
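One way such a constraint could be realised, sketched below under the assumption that each frame contributes a set of symbols (the symbol names are hypothetical and this is not the original code), is to build a transaction from every 10-frame span of an example, so that any mined symbol set must fit inside a single window:

```python
def windowed_transactions(frames, window=10):
    """frames: list of per-frame symbol sets for one example.
    Returns one transaction (set of symbols) per sliding-window position,
    so every mined symbol set is guaranteed to fit within 'window' frames."""
    transactions = []
    for start in range(max(1, len(frames) - window + 1)):
        merged = set()
        for symbols in frames[start:start + window]:
            merged |= symbols
        transactions.append(merged)
    return transactions

# e.g. an example whose frames each contribute a few (hypothetical) symbols
example_frames = [{"hand_up"}, {"hand_up", "head_touch"}, {"hand_down"}]
print(windowed_transactions(example_frames, window=10))
```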

Another problem with applying mining to large, noisy video data sets is the magnitude of the task. While we started with Christian Borgelt's apriori implementation, we soon ran up against memory problems when calculating the possible rules. Rules are calculated by building trees: starting with a single symbol as the trunk, a tree branches to the sets containing that symbol and one other available symbol; the second layer of branches then contains all possible 3-symbol sets, and so on until a maximum set size is reached. Even though pruning can be applied, when all of the trees are stored in memory and there are a lot of possible rules the resource requirements are significant. We re-implemented the mining with a recursive tree structure so that the memory in use at any given time is dictated by the number and size of the examples given and the length of the rules allowed. To reduce the number of branches examined we employ the standard mining optimisation of curtailing branches containing symbol sets which fail to meet the required support, since adding a new symbol to such a set can never increase its support. To this we add the constraint that a set cannot branch to a symbol set containing a symbol smaller than the smallest symbol in the current set; this stops duplication of sets (since order is irrelevant). An example of what we mean by this is shown in figure 3. This tree-based implementation lends itself to multi-threading, since each tree can be built by a separate thread.

Figure 3 - Example of how trees are built without creating duplicate symbol sets.
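A minimal sketch of such a recursive, depth-first enumeration is given below, assuming transactions are sets of integer symbol IDs. It uses the common convention of only extending a set with symbols larger than any it already contains, which guarantees each set is generated exactly once, and it illustrates the support-based pruning; the details of the actual implementation may differ.

```python
def mine_frequent_sets(transactions, min_support, max_len):
    """Depth-first (recursive) enumeration of frequent symbol sets.

    transactions : list of sets of integer symbol IDs
    min_support  : minimum fraction of transactions a set must occur in
    max_len      : maximum symbol-set size

    A set is only ever extended with symbols larger than those it already
    contains, so no set is generated twice, and a branch is cut as soon as
    its support falls below min_support, since adding symbols can never
    increase support."""
    n = len(transactions)
    symbols = sorted(set().union(*transactions)) if transactions else []
    frequent = {}

    def grow(itemset, covering):
        support = len(covering) / n
        if support < min_support:
            return                      # prune: supersets cannot recover support
        if itemset:
            frequent[frozenset(itemset)] = support
        if len(itemset) == max_len:
            return
        last = itemset[-1] if itemset else -1
        for sym in symbols:
            if sym <= last:
                continue                # only extend with larger symbols: no duplicates
            sub = [t for t in covering if sym in t]
            grow(itemset + [sym], sub)

    grow([], transactions)
    return frequent

# e.g. toy transactions with integer symbol IDs (illustrative only)
toy = [{1, 2, 3}, {1, 2, 4}, {2, 3, 4}, {1, 2, 3, 4}]
print(mine_frequent_sets(toy, min_support=0.5, max_len=3))
```

Because each top-level symbol starts its own independent tree, the loop over the first symbol could be handed to separate threads, as described above.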

We can now run a sliding window of 10 frames across all of the positive examples, looking for high support/confidence rules. If we sum the confidence of all the rules which occur in a given window, we can find the regions of the examples which contain similar features.
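A sketch of this scoring step is shown below, assuming the mined rules are available as (antecedent, confidence) pairs and each frame as a set of symbols; the rule contents and symbol names are hypothetical.

```python
def window_scores(frames, rules, window=10):
    """frames: list of per-frame symbol sets for one positive example.
    rules : list of (antecedent_set, confidence) pairs mined with PID as consequent.
    Returns, for each window start position, the sum of the confidences of the
    rules whose antecedent fits entirely inside that window."""
    scores = []
    for start in range(max(1, len(frames) - window + 1)):
        present = set()
        for symbols in frames[start:start + window]:
            present |= symbols
        scores.append(sum(conf for antecedent, conf in rules if antecedent <= present))
    return scores

# The highest-scoring windows indicate where the mined features cluster.
rules = [({"hand_up", "head_touch"}, 0.9), ({"hand_down"}, 0.4)]   # illustrative
frames = [{"hand_up"}, {"hand_up", "head_touch"}, {"hand_down"}, set()]
print(window_scores(frames, rules, window=2))
```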

Now that we have correlations between positive examples which do not occur in negative examples, we need to localise the sign.