
Automatic Derivation and Code Generation
The end product of our work is mathematical software that can be put to use in real applications. Our aim is eventually to encode the principles we employ and develop, so that the software can be generated automatically or semi-automatically from a high-level problem specification supplied by the user.



Toward Interactive Statistical Modeling
Sooraj Bhat, Ashish Agarwal, Alexander G. Gray, and Richard Vuduc
International Conference on Computational Science (ICCS): Workshop on Automated Program Generation for Computational Science 2010
A system with many of the capabilities of AutoBayes (see below), but based on a more formal approach using type theory, which allows more powerful mechanisms for ensuring the correctness of generated algorithms and code. To our knowledge, this work takes the first steps toward a type-theoretic formalization of probability and statistics.
[pdf]
Abstract:
When solving machine learning problems, there is currently little automated support for easily experimenting with alternative statistical models or solution strategies. This is because this activity often requires expertise from several different fields (e.g. statistics, optimization, linear algebra), and the level of formalism required for automation is much higher than for a human solving problems on paper. We present a system toward addressing these issues, which we achieve by (1) formalizing a type theory for probability and optimization, and (2) providing an interactive rewrite system for applying problem reformulation theorems. Automating solution strategies this way enables not only manual experimentation but also higher-level, automated activities, such as autotuning.
@inproceedings{bhat2010tism,
author="Sooraj Bhat and Ashish Agarwal and Alexander Gray and Richard Vuduc",
title="{Toward Interactive Statistical Modeling}",
booktitle="{International Conference on Computational Science (ICCS): Workshop on Automated Program Generation for Computational Science}",
year="2010"
}

In preparation
Om
A programming language for operationalizing the mathematics of statistics, based on integrated type theories for probability, optimization, and linear algebra.
Algorithmica
A complete system for deriving algorithms and efficient code for machine learning.


Automating Mathematical Program Transformations
Ashish Agarwal, Sooraj Bhat, Alexander G. Gray, and Ignacio E. Grossmann
Practical Aspects of Declarative Languages (PADL) 2010
This work presents the first type-theoretic formalization of optimization and demonstrates the resulting ability to automatically improve the formulations of optimization problems.
[pdf]
Abstract:
Mathematical programs (MPs) are a class of constrained optimization problems that include linear, mixed-integer, and disjunctive programs. Strategies for solving MPs rely heavily on various transformations between these subclasses, but most are not automated because MP theory does not presently treat programs as syntactic objects. In this work, we present the first syntactic definition of MP and of some widely used MP transformations, most notably the big-M and convex hull methods for converting disjunctive constraints. We use an embedded OCaml DSL on problems from chemical process engineering and operations research to compare our automated transformations to existing technology, finding that no one technique is always best, and also to manual reformulations, finding that our mechanizations are comparable to human experts. This work enables higher-level solution strategies that can use these transformations as subroutines.
@Inproceedings{agarwal2010ampt,
Author = "Ashish Agarwal and Sooraj Bhat and Alexander G. Gray and Ignacio E. Grossmann",
Title = "{Automating Mathematical Program Transformations}",
Booktitle = "Practical Aspects of Declarative Languages (PADL)",
Year = "2010"
}
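
As a rough illustration of what treating a program as a syntactic object can mean, the sketch below hand-codes the big-M conversion of a disjunction of linear constraints in Python. It is not the paper's OCaml DSL: the names `LinearLE`, `upper_bound`, `big_m`, and `holds`, the box-bound representation of variables, and the choice of M from those bounds are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class LinearLE:
    """Constraint sum(coeffs[v] * v) <= rhs, stored as data (hypothetical type)."""
    coeffs: dict
    rhs: float


def upper_bound(coeffs, bounds):
    """Largest value of sum(coeffs[v] * v) when each v lies in bounds[v] = (lo, hi)."""
    return sum(c * (bounds[v][1] if c >= 0 else bounds[v][0])
               for v, c in coeffs.items())


def big_m(disjuncts, bounds):
    """Rewrite a disjunction (a list of lists of LinearLE) as mixed-integer constraints.

    Each disjunct i gets a binary variable y_i. A constraint a.x <= b becomes
    a.x <= b + M * (1 - y_i), written here as a.x + M * y_i <= b + M.
    Returns (constraints, names of the binary variables).
    """
    constraints, binaries = [], []
    for i, disjunct in enumerate(disjuncts):
        y = f"y{i}"
        binaries.append(y)
        for c in disjunct:
            # Assumption: M is the largest violation possible inside the box.
            m = max(0.0, upper_bound(c.coeffs, bounds) - c.rhs)
            constraints.append(LinearLE({**c.coeffs, y: m}, c.rhs + m))
    # Exactly one disjunct is selected: sum(y) <= 1 and -sum(y) <= -1.
    constraints.append(LinearLE({y: 1.0 for y in binaries}, 1.0))
    constraints.append(LinearLE({y: -1.0 for y in binaries}, -1.0))
    return constraints, binaries


def holds(constraint, point, tol=1e-9):
    """Check one constraint at a point given as {variable name: value}."""
    lhs = sum(c * point[v] for v, c in constraint.coeffs.items())
    return lhs <= constraint.rhs + tol


# Example: with 0 <= x <= 10, require (x <= 2) or (x >= 8).
example_constraints, example_binaries = big_m(
    [[LinearLE({"x": 1.0}, 2.0)], [LinearLE({"x": -1.0}, -8.0)]],
    {"x": (0.0, 10.0)},
)
```

Because the constraints are plain data, a transformation such as `big_m` is an ordinary function from one program to another, which is the property that makes automation possible. The convex hull method mentioned in the abstract is a different reformulation and is not shown here.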


Automatic Derivation of Statistical Algorithms: The EM Family and Beyond
Alexander G. Gray, Bernd Fischer, Johann Schumann, and Wray Buntine
Neural Information Processing Systems (NIPS) 2002, appeared in 2003
A system that, given a high-level specification of a statistical model, automatically derives EM algorithms and other optimizers for it and then generates efficient code.
[pdf]
Abstract:
Machine learning has reached a point where many probabilistic methods can be understood as variations, extensions and combinations of a much smaller set of abstract themes, e.g., as different instances of the EM algorithm. This enables the systematic derivation of algorithms customized for different models. Here, we describe the AUTOBAYES system which takes a high-level statistical model specification, uses powerful symbolic techniques based on schema-based program synthesis and computer algebra to derive an efficient specialized algorithm for learning that model, and generates executable code implementing that algorithm.
@Inproceedings{gray2003autobayes,
Author = "Alexander G. Gray" # " and B. Fischer and J. Schumann and W. Buntine",
Title = "{Automatic Derivation of Statistical Algorithms: The EM Family and Beyond}",
booktitle = "Advances in " # "Neural Information Processing Systems (NIPS)" # " 15 (Dec 2002)",
Year = "2003",
editor = {Suzanna Becker and Sebastian Thrun and Klaus Obermayer},
publisher = {MIT Press}
}
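
For a concrete sense of the kind of algorithm meant here, the following hand-written Python sketch runs EM on a two-component, one-dimensional Gaussian mixture. It is not AutoBayes output and does not use the AutoBayes specification language; the function names, the initialization from the data's minimum and maximum, and the variance floor are assumptions chosen for this illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, iterations=100, var_floor=1e-6):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (weights, means, variances), each a list of length 2.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n

    # Assumed initialization: equal weights, means at the data extremes.
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [max(overall_var, var_floor)] * 2

    for _ in range(iterations):
        # E-step: posterior probability that each point came from each component.
        resp = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k])
                     for k in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])

        # M-step: re-estimate parameters from the responsibility-weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            # The floor keeps a component from collapsing onto a single point.
            variances[k] = max(spread, var_floor)

    return weights, means, variances


# Two well-separated clusters centred at 0 and 5.
sample = [0.0, 0.1, -0.1, 0.2, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8]
fit_weights, fit_means, fit_variances = em_two_gaussians(sample)
```

A different model would change both the E-step and the M-step, and that model-specific rederivation is the work the abstract describes automating.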

