Principal Research Scientist

Yahoo Labs

New York, NY

edo.liberty@gmail.com

עידו ליברטי

homepage

I received my B.Sc. in Physics and Computer Science from Tel Aviv University and my Ph.D. in Computer Science from Yale University, under the supervision of Steven Zucker. After that, I was a postdoctoral fellow in the Program in Applied Mathematics at Yale.

I have been at Yahoo since 2009, where I now lead the Scalable Machine Learning group in Yahoo Labs. We focus on the theory and practice of (very) large scale data mining and machine learning.

My personal research interests include fast dimensionality reduction, clustering, streaming and online algorithms, text and pattern mining, machine learning, and large scale numerical linear algebra. I am especially fond of randomized algorithms and high dimensional geometry.

0368-3248-01-Data Mining - Tel Aviv University

The course covered algorithmic tools for data mining massive data sets.

It was given as a theory/algorithms class with an emphasis on randomization.

fall 2011

fall 2012

fall 2013

Online PCA with Spectral Bounds

Zohar Karnin, Edo Liberty

In progress

Frequent Directions: Simple and Deterministic Matrix Sketching

Mina Ghashami, Edo Liberty, Jeff M. Phillips, David P. Woodruff

In progress

An Algorithm for Online K-Means Clustering

Edo Liberty, Ram Sriharsha, Maxim Sviridenko

In progress

Space Lower Bounds for Itemset Frequency Sketches

Edo Liberty, Michael Mitzenmacher, Justin Thaler, Jonathan Ullman

In progress

Online Principal Component Analysis

Christos Boutsidis, Dan Garber, Zohar Karnin, Edo Liberty

SODA 2014

Near-optimal Distributions for Data Matrix Sampling

Dimitris Achlioptas, Zohar Karnin, Edo Liberty

NIPS 2013

Simple and Deterministic Matrix Sketches

Edo Liberty (see slides and experimental results in JSON format)

Also, here is a talk I gave at the Simons Institute about this.

__Best paper__ at KDD 2013

Threading Machine Generated Email

Nir Ailon, Zohar Karnin, Edo Liberty, Yoelle Maarek

WSDM 2013

__Best paper__ at TechPulse 2012

Unsupervised SVMs: On the complexity of the Furthest Hyperplane Problem

Zohar Karnin, Edo Liberty, Shachar Lovett, Roy Schwartz, and Omri Weinstein

COLT 2012 (Slides)

Liran Katzir, Edo Liberty, and Oren Somekh

WWW 2012

An Almost Optimal Unrestricted Fast Johnson-Lindenstrauss Transform

Nir Ailon, Edo Liberty

__Best paper__ at SODA 2011

Improved Approximation Algorithms for Bipartite Correlation Clustering

Nir Ailon, Noa Avigdor-Elgrabli, Edo Liberty, Anke van Zuylen

ESA 2011 (slides)

Automatically Tagging Email by Leveraging Other Users' Folders

Yehuda Koren, Edo Liberty, Yoelle Maarek, and Roman Sandler

KDD 2011

Estimating Sizes of Social Networks via Biased Sampling

Liran Katzir, Edo Liberty, and Oren Somekh

WWW 2011

Inverted Index Compression via Online Document Routing

Gal Lavee, Ronny Lempel, Edo Liberty, and Oren Somekh

WWW 2011

Correlation Clustering Revisited: The "True" Cost of Error Minimization Problems

Nir Ailon, Edo Liberty

ICALP 2009

Dense Fast Random Projections and Lean Walsh Transforms

Edo Liberty, Nir Ailon, Amit Singer

RANDOM 2008

Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes

Nir Ailon, Edo Liberty

SODA 2008

Estimating Sizes of Social Networks via Biased Sampling

Liran Katzir, Edo Liberty, Oren Somekh, Ioana A. Cosma

Journal of Internet Mathematics

An Almost Optimal Unrestricted Fast Johnson-Lindenstrauss Transform

Nir Ailon, Edo Liberty

TALG (Transactions on Algorithms)

Improved Approximation Algorithms for Bipartite Correlation Clustering

Nir Ailon, Noa Avigdor-Elgrabli, Edo Liberty, Anke van Zuylen

To appear in SICOMP (SIAM Journal on Computing)

Unsupervised SVMs: On the complexity of the Furthest Hyperplane Problem

Zohar Karnin, Edo Liberty, Shachar Lovett, Roy Schwartz, and Omri Weinstein

JMLR 2012 (Journal of Machine Learning Research)

Dense Fast Random Projections and Lean Walsh Transforms

Edo Liberty, Nir Ailon, Amit Singer

DCG 2010 (Discrete and Computational Geometry)

The Mailman algorithm: a note on matrix vector multiplication

Edo Liberty, Steven Zucker

IPL 2009 (Information Processing Letters)

Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes

Nir Ailon, Edo Liberty

DCG 2008 (Discrete and Computational Geometry)

A fast randomized algorithm for the approximation of matrices

Edo Liberty, Franco Woolfe, Vladimir Rokhlin, and Mark Tygert

ACHA 2008 (Applied and Computational Harmonic Analysis)

Randomized algorithms for the low-rank approximation of matrices

Edo Liberty, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert

PNAS 2007 (Proceedings of the National Academy of Sciences)

Electrons and Phonons on the Square Fibonacci Tiling

Roni Ilan, Edo Liberty, Shahar Even-Dar Mandel, and Ron Lifshitz

Ferroelectrics 2004.

System and Method for Identification of Subject Line Templates

Zohar Karnin, Edo Liberty, David Wajc, Guy Halawi

Methods for filtering data and filling in missing data using nonlinear inference

Edo Liberty, Steven Zucker, Yosi Keller, Mauro M. Maggioni, Ronald R. Coifman, Frank Geshwind, and in collaboration with Plain Sight Systems.

Method And System For Clustering Data Points

Nir Ailon, Edo Liberty, Hari Khalsa

Mining Global Email Folders For Identifying Auto-folders tags

Vishwanath Ramarao, Andrei Broder, Idan Szpektor, Edo Liberty, Yehuda Koren, Mark Risher, and Yoelle Maarek

Methods for Displaying Contextually Targeted Content on a Connected Television

Zeev Neumeier, Edo Liberty

Methods for Identifying Video Segments and Displaying Contextually Targeted Content on Connected Televisions

Zeev Neumeier, Edo Liberty

Sponsored Apps Marketplace in eMail

Ronny Lempel, Yoelle Maarek, Edward Bortnikov, Edo Liberty

A System for Email sequence identification

Edo Liberty, Zohar Karnin, Yoelle Maarek, Natalie Aizenberg

Correlation Clustering: from Theory to Practice

KDD 2014 Tutorial. The slides.

Streaming Data Mining

MLConf New York 2014

Data Mining in the Streaming Model: Approximating Massive Matrices

IBM Machine Learning Day 2012

Streaming Data Mining

KDD 2012 tutorial on practical algorithms in mining streaming data; with Jelani Nelson.

Fast Random Projections, theory and practice

14th Mini-Workshop on Applied and Computational Mathematics

Fast Random Projections, survey and new results

SODA 2011 and IAS and Yale math seminars 2011.

A video of the talk at IAS is available here.

Accelerated Dense Random Projections

PhD thesis. See also the talk slides.

Scoring Psychological Questionnaires using Geometric Harmonics

Social Data Mining and Knowledge Building (IPAM) 2007.

Scoring Psychological Questionnaires using Geometric Harmonics

Edo Liberty, Moshe Almagor, Steven Zucker, Yosi Keller, and Ronald Coifman

Snowbird Learning Workshop 2007.

Learning functions on graphs and manifolds; Application to Psychological testing

(Inner departmental OGST 2006).

SODA, ESA, FOCS, KDD, AISTATS, SIGIR, WSDM, WWW

I'm also an enthusiastic kitesurfer and snowboarder.

Here are some pictures of that.

This site was last updated Sep 2013