|
I am not at Yale any more, nor am I at Yahoo, so this web page is somewhat
out of date.
I am currently (as of summer 2008)
in the math
department at Stanford University.
Click here
for my new web page and for more information.
You can reach me via electronic mail:
mmahoney
is the username, and
cs dot stanford dot edu
follows the
at
symbol.
(Of course, my gmail address is still valid.)
NOTE:
Our PNAS paper on "CUR Matrix Decompositions for Improved Data Analysis"
has appeared as
Proc. Natl. Acad. Sci. USA, 106, 697-702 (2009).
Please email me if you can't access a copy from the PNAS web site.
|
Michael Mahoney's old home page
J. W. Gibbs Assistant Professor,
Department of Mathematics
Research Affiliate,
Department of Computer Science
Yale University
|
By popular demand
... we are running the
MMDS Workshop on
"Algorithms for Modern Massive Data Sets"
again!
As with the 2006 meeting,
MMDS 2008
will address algorithmic, statistical, and mathematical
challenges in large-scale statistical data analysis.
Click
here
for the 2006 Stanford/Yahoo "Workshop on Algorithms for Modern Massive
Data Sets (MMDS 2006)" that I organized (with
G. Golub,
P. Drineas, and
L-H. Lim)
and that took place on June 21-24, 2006,
or click
here
for an article in SIAM News about the meeting.
Click
here
for a ppt of the SIAM-SDM06 2006 tutorial,
or click
here
for the overheads for several other talks.
Research interests
-
Design and analysis of algorithms for very large matrix, graph, and
regression problems.
-
Applications to the statistical data analysis of massive scientific and
Internet data.
-
Monte Carlo methods in
computer science,
statistical physics,
and scientific computation.
-
Molecular modeling of liquid water
(e.g., TIP5P water)
and protein-DNA/RNA-water interactions.
-
Large-scale machine learning and large-scale data analytics.
I work on developing and analyzing algorithms for large matrix, graph, and
regression problems, as well as applications of these tools to the statistical
analysis of extremely large scientific and Internet data sets.
Current work includes applying these and related algorithmic techniques to
problems arising in Internet applications, and past work includes the
application of these techniques to DNA single nucleotide polymorphism data.
I have also worked on developing and analyzing Monte Carlo algorithms for
performing useful computations on extremely
large matrices,
e.g., the additive-error and relative-error CUR matrix decompositions.
Past research has also included work in computational statistical mechanics on
the development and analysis of the
TIP5P
model of liquid water, as well as
work in both computational and experimental
biophysics/bioinformatics on
proteins and protein-nucleic acid interactions.
Publications
-
CUR Matrix Decompositions for Improved Data Analysis,
-
M. W. Mahoney and P. Drineas,
-
Proc. Natl. Acad. Sci. USA, 106, 697-702 (2009).
-
An Improved Approximation Algorithm for the Column Subset Selection Problem,
-
C. Boutsidis, M. W. Mahoney, and P. Drineas,
-
Technical Report, Preprint: arXiv:0812.4293 (2008)
(arXiv).
-
Proc. 20-th Annual SODA, 968-977 (2009)
(ps,
pdf).
-
Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters,
-
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,
-
Technical Report, Preprint: arXiv:0810.1355 (2008)
(arXiv).
-
Unsupervised Feature Selection for Principal Components Analysis,
-
C. Boutsidis, M. W. Mahoney, and P. Drineas,
-
Proc. 14-th Annual SIGKDD, 61-69 (2008)
(ps,
pdf).
-
Statistical properties of community structure in large social and information networks,
-
J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney,
-
Proc. 17-th International WWW, 695-704 (2008)
(ps,
pdf).
-
Faster Least Squares Approximation,
-
P. Drineas, M. W. Mahoney, S. Muthukrishnan, and T. Sarlos,
-
Technical Report, Preprint: arXiv:0710.1435 (2007)
(arXiv),
-
Journal version submitted for publication.
-
PCA-Correlated SNPs for Structure Identification in Worldwide Human Populations,
-
P. Paschou, E. Ziv, E. G. Burchard, S. Choudhry, W. Rodriguez-Cintron, M. W. Mahoney, and P. Drineas,
-
PLoS Genetics, 3, 1672-1686 (2007)
(ps,
pdf).
-
Relative-Error CUR Matrix Decompositions,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Technical Report, Preprint: arXiv:0708.3696 (2007)
(arXiv),
-
SIAM J. Matrix Analysis and Applications, 30, 844-881 (2008)
(ps,
pdf).
-
Feature Selection Methods for Text Classification,
-
A. Dasgupta, P. Drineas, B. Harb, V. Josifovski, and M. W. Mahoney,
-
Proc. 13-th Annual SIGKDD, 230-239 (2007)
(ps,
pdf).
-
Sampling Algorithms and Coresets for Lp Regression,
-
A. Dasgupta, P. Drineas, B. Harb, R. Kumar, and M. W. Mahoney,
-
Technical Report, Preprint: arXiv:0707.1714 (2007)
(arXiv),
-
Proc. 19-th Annual SODA, 932-941 (2008)
(ps,
pdf),
-
SIAM J. Computing, 38, 2060-2078 (2009)
(ps,
pdf)
-
Web Information Retrieval and Linear Algebra Algorithms,
-
A. Frommer, M. W. Mahoney, and D. B. Szyld (Eds.),
-
Proc. of Dagstuhl Seminar 07071, (2007)
(web).
-
Intra- and interpopulation genotype reconstruction from tagging SNPs,
-
P. Paschou, M. W. Mahoney, A. Javed, J. R. Kidd, A. J. Pakstis, S. Gu, K. K. Kidd, and P. Drineas,
-
Genome Research, 17(1), 96-107 (2007)
(ps,
pdf).
-
Bridging the Gap Between Numerical Linear Algebra, Theoretical Computer Science, and Data Applications,
-
G. H. Golub, M. W. Mahoney, P. Drineas, and L.-H. Lim,
-
SIAM News 39:8 October 2006
(ps,
pdf).
-
Randomized Algorithms for Matrices and Massive Data Sets,
-
P. Drineas and M. W. Mahoney,
-
Proc. 32-nd Annual VLDB, 1269 (2006)
(ps,
pdf).
-
Subspace Sampling and Relative-Error Matrix Approximation: Column-Row-Based Methods,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Proc. 14-th Annual ESA, 304-314 (2006)
(ps,
pdf).
-
Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Proc. 10-th Annual RANDOM, 316-326 (2006)
(ps,
pdf).
-
Tensor-CUR Decompositions For Tensor-Based Data,
-
M. W. Mahoney, M. Maggioni, and P. Drineas,
-
Proc. 12-th Annual SIGKDD, 327-336 (2006)
(ps,
pdf),
-
SIAM J. Matrix Analysis and Applications, 30, 957-987 (2008)
(ps,
pdf).
-
Polynomial Time Algorithm for Column-Row-Based Relative-Error Low-Rank Matrix Approximation,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Technical Report, DIMACS TR 2006-04 March 2006
(ps,
pdf).
-
Sampling Algorithms for L2 Regression and Applications,
-
P. Drineas, M. W. Mahoney, and S. Muthukrishnan,
-
Proc. 17-th Annual SODA, 1127-1136 (2006)
(ps,
pdf).
-
A Randomized Algorithm for a Tensor-Based Generalization of the Singular Value Decomposition,
-
P. Drineas and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1327, June 2005
(ps,
pdf),
-
Linear Algebra and its Applications, 420, 553-571 (2007)
(ps,
pdf).
-
On the Nystrom Method for Approximating a Gram Matrix for Improved Kernel-Based Learning,
-
P. Drineas and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1319, April 2005
(ps,
pdf),
-
Proc. 18-th Annual COLT, 323-337 (2005)
(ps,
pdf),
-
J. Machine Learning Research, 6, 2153-2175 (2005)
(ps,
pdf).
-
Sampling Sub-problems of Heterogeneous Max-Cut Problems and Approximation Algorithms,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1283, April 2004
(ps,
pdf),
-
Proc. 22-nd Annual STACS, 57-68 (2005)
(ps,
pdf),
-
Random Structures and Algorithms, 32:3, 307-333 (2008)
(ps,
pdf).
-
Fast Monte Carlo Algorithms for Matrices III: Computing an Efficient Approximate Decomposition of a Matrix,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1271, February 2004
(ps,
pdf),
-
SIAM J. Computing, 36, 184-206 (2006)
(ps,
pdf).
-
Fast Monte Carlo Algorithms for Matrices II: Computing Low-Rank Approximations to a Matrix,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1270, February 2004
(ps,
pdf),
-
SIAM J. Computing, 36, 158-183 (2006)
(ps,
pdf).
-
Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication,
-
P. Drineas, R. Kannan, and M. W. Mahoney,
-
Technical Report, YALEU/DCS/TR-1269, February 2004
(ps,
pdf),
-
SIAM J. Computing, 36, 132-157 (2006)
(ps,
pdf).
-
Rapid Mixing of Several Markov Chains for a Hard-Core Model,
-
R. Kannan, M. W. Mahoney, and R. Montenegro,
-
Proc. 14-th Annual ISAAC, 663-675 (2003)
(ps,
pdf).
-
Quantum, intramolecular flexibility, and polarizability effects on the reproduction of the density anomaly of liquid water by simple potential functions,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 115, 10758-10768 (2001)
(ps,
pdf).
-
Rapid estimation of electronic degrees of freedom in Monte Carlo calculations for polarizable models of liquid water,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 114, 9337-9349 (2001)
(ps,
pdf).
-
Diffusion constant of the TIP5P model of liquid water,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 114, 363-366 (2001)
(ps,
pdf).
-
A five-site model for liquid water and the reproduction of the density anomaly by rigid, nonpolarizable potential functions,
-
M. W. Mahoney and W. L. Jorgensen,
-
J. Chem. Phys., 112, 8910-8922 (2000)
(ps,
pdf).
-
Repression and activation of promoter-bound RNA polymerase activity by Gal repressor,
-
H. E. Choy, R. R. Hanger, T. Aki, M. Mahoney, K. Murakami, A. Ishihama, and S. Adhya,
-
J. Mol. Biol. 272: 293-300, 1997
(ps,
pdf).
-
Discrete representations of the protein C-alpha chain,
-
X. F. de la Cruz, M. W. Mahoney, and B. K. Lee,
-
Fold. & Des. 2: 223-234, 1997
(ps,
pdf).
|