We develop an aggregate measure of syntactic difference for automatically finding common syntactic differences between collections of text.
Using this measure, it is possible to mine for differences between, for example, the English of learners and that of native speakers, or between related dialects.
It detects not only the absence or presence of specific constructs but also their under- and overuse, and it allows hypotheses to be tested for statistical significance.
Our earlier publications on the measure present a crude version of the method, test it, apply it, and apply it a second time.