Edo Liberty

Principal Research Scientist
Yahoo Labs
New York, NY
edo.liberty@yahoo.com
עידו ליברטי
homepage


About me:

I am now research director at Yahoo Labs, where I lead the Scalable Machine Learning group. We focus on the theory and practice of (very) large scale data mining and machine learning. We are currently hiring both research scientists and research engineers in New York. See the Yahoo Labs career page for more info or contact me directly.

I received my B.Sc in Physics and Computer Science from Tel Aviv University and my Ph.D in Computer Science from Yale University, under the supervision of Steven Zucker. After that, I was a Post-Doctoral fellow at Yale in the Program in Applied Mathematics.

My personal research interests include fast dimensionality reduction, clustering, streaming and online algorithms, text and pattern mining, machine learning, and large scale numerical linear algebra. I am especially fond of randomized algorithms and high dimensional geometry.


News:

NYCE 2016 had a great turnout and a fantastic set of talks. Links to talk videos are on the main workshop website.


At ALENEX, I gave a presentation about online k-means.


The Sublinear Algorithms Workshop at Johns Hopkins University was organized by Vladimir Braverman, Robi Krauthgamer and Piotr Indyk and (as always) they put together a great lineup of speakers. I gave a presentation on recent online PCA results.


I'm glad to announce that Yahoo open sourced parts of our data sketches library. This is an ongoing effort (of which I am a part) and there are many exciting new algorithms coming in later releases. See the announcement on VentureBeat and the link to the project itself.


Teaching:

0368-3248-01-Data Mining - Tel Aviv University
The course covered algorithmic tools for data mining massive data sets.
It was given as a theory/algorithms class with an emphasis on randomization.
fall 2011
fall 2012
fall 2013


Work in progress:

Efficient Frequent Directions Algorithm for Sparse Matrices
Mina Ghashami, Edo Liberty, Jeff M. Phillips
In progress

Stratified Sampling meets Machine Learning
Kevin Lang, Edo Liberty, Konstantin Shmakov
In progress

Greedy Minimization of Weakly Supermodular Set Functions
Christos Boutsidis, Edo Liberty, Maxim Sviridenko
In progress [bib]

Space Lower Bounds for Itemset Frequency Sketches
Edo Liberty, Michael Mitzenmacher, Justin Thaler, Jonathan Ullman
In progress [bib]


Conference Publications:

An Algorithm for Online K-Means Clustering
Edo Liberty, Ram Sriharsha, Maxim Sviridenko
ALENEX 2016 [bib]

Online PCA with Spectral Bounds
Zohar Karnin, Edo Liberty
COLT 2015 [bib]
(see also 5-minute video lecture)

Online Principal Component Analysis
Christos Boutsidis, Dan Garber, Zohar Karnin, Edo Liberty
SODA 2014 [bib]

Near-optimal Distributions for Data Matrix Sampling
Dimitris Achlioptas, Zohar Karnin, Edo Liberty
NIPS 2013 [bib]

Simple and Deterministic Matrix Sketches
Edo Liberty (see slides and experimental results in json format)
Also, here is a talk I gave at the Simons Institute about this.
Best paper at KDD 2013 [bib]

Threading Machine Generated Email
Nir Ailon, Zohar Karnin, Edo Liberty, Yoelle Maarek
Best paper at TechPulse 2012 and WSDM 2013 [bib]

Unsupervised SVMs: On the complexity of the Furthest Hyperplane Problem
Zohar Karnin, Edo Liberty, Shachar Lovett, Roy Schwartz, and Omri Weinstein
COLT 2012 [Slides] [bib]

Framework and Algorithms for Network Bucket Testing
Liran Katzir, Edo Liberty, and Oren Somekh
WWW 2012 [bib]

An Almost Optimal Unrestricted Fast Johnson-Lindenstrauss Transform
Nir Ailon, Edo Liberty
Best paper at SODA 2011 [bib]

Improved Approximation Algorithms for Bipartite Correlation Clustering
Nir Ailon, Noa Avigdor-Elgrabli, Edo Liberty, Anke van Zuylen
ESA 2011 [slides] [bib]

Automatically Tagging Email by Leveraging Other Users' Folders
Yehuda Koren, Edo Liberty, Yoelle Maarek, and Roman Sandler
KDD 2011 [bib]

Estimating Sizes of Social Networks via Biased Sampling
Liran Katzir, Edo Liberty, and Oren Somekh
WWW 2011 [bib]

Inverted Index Compression via Online Document Routing
Gal Lavee, Ronny Lempel, Edo Liberty, and Oren Somekh
WWW 2011 [bib]

Correlation Clustering Revisited: The "True" Cost of Error Minimization Problems
Nir Ailon, Edo Liberty
ICALP 2009 [bib]

Dense Fast Random Projections and Lean Walsh Transforms
Edo Liberty, Nir Ailon, Amit Singer
RANDOM 2008 [bib]

Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes
Nir Ailon, Edo Liberty
SODA 2008 [bib]


Journal Publications:

Frequent Directions: Simple and Deterministic Matrix Sketching
Mina Ghashami, Edo Liberty, Jeff M. Phillips, David P. Woodruff
In review [bib]

Estimating Sizes of Social Networks via Biased Sampling
Liran Katzir, Edo Liberty, Oren Somekh, Ioana A. Cosma
Journal of Internet Mathematics [bib]

An Almost Optimal Unrestricted Fast Johnson-Lindenstrauss Transform
Nir Ailon, Edo Liberty
Transactions on Algorithms [bib]

Improved Approximation Algorithms for Bipartite Correlation Clustering
Nir Ailon, Noa Avigdor-Elgrabli, Edo Liberty, and Anke van Zuylen
SIAM Journal on Computing [bib]

Unsupervised SVMs: On the complexity of the Furthest Hyperplane Problem
Zohar Karnin, Edo Liberty, Shachar Lovett, Roy Schwartz and Omri Weinstein
JMLR 2012 (Journal of Machine Learning Research) [bib]

Dense Fast Random Projections and Lean Walsh Transforms
Edo Liberty, Nir Ailon, Amit Singer
DCG 2010 (Discrete and Computational Geometry) [bib]

The Mailman algorithm: a note on matrix vector multiplication
Edo Liberty, Steven Zucker
IPL 2009 (Information Processing Letters) [bib]

Fast Dimension Reduction Using Rademacher Series on Dual BCH Codes
Nir Ailon, Edo Liberty
DCG 2008 (Discrete and Computational Geometry) [bib]

A fast randomized algorithm for the approximation of matrices
Edo Liberty, Franco Woolfe, Vladimir Rokhlin, and Mark Tygert
ACHA 2008 (Applied and Computational Harmonic Analysis) [bib]

Randomized algorithms for the low-rank approximation of matrices
Edo Liberty, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert.
PNAS 2007 (Proceedings of the National Academy of Sciences) [bib]

Electrons and Phonons on the Square Fibonacci Tiling
Roni Ilan, Edo Liberty, Shahar Even-Dar Mandel, and Ron Lifshitz.
Ferroelectrics 2004.


Patents:

System and Method for Identification of Subject Line Templates
Zohar Karnin, Edo Liberty, David Wajc, Guy Halawi

Methods for filtering data and filling in missing data using nonlinear inference
Edo Liberty, Steven Zucker, Yosi Keller, Mauro M. Maggioni, Ronald R. Coifman, Frank Geshwind, and in collaboration with Plain Sight Systems.

Method And System For Clustering Data Points
Nir Ailon, Edo Liberty, Hari Khalsa

Mining Global Email Folders For Identifying Auto-folders tags
Vishwanath Ramarao, Andrei Broder, Idan Szpektor, Edo Liberty, Yehuda Koren, Mark Risher, and Yoelle Maarek

Methods for Displaying Contextually Targeted Content on a Connected Television
Zeev Neumeier, Edo Liberty

Methods for Identifying Video Segments and Displaying Contextually Targeted Content on Connected Televisions
Zeev Neumeier, Edo Liberty

Sponsored Apps Marketplace in eMail
Ronny Lempel, Yoelle Maarek, Edward Bortnikov, Edo Liberty

A System for Email sequence identification
Edo Liberty, Zohar Karnin, Yoelle Maarek, Natalie Aizenberg


Tutorials, Presentations, Technical reports and other IP:

Online PCA with Spectral Bounds at COLT.

Correlation Clustering: from Theory to Practice
KDD 2014 Tutorial [slides] [bib]

Streaming Data Mining
MLConf New York 2014

Data Mining in the Streaming Model: Approximating Massive Matrices
IBM Machine Learning Day 2012

Streaming Data Mining
KDD 2012 tutorial on practical algorithms in mining streaming data; with Jelani Nelson.

Fast Random Projections: survey and new results
SODA 2011 and IAS and Yale math seminars 2011.
Video of the talk at IAS is available here.

Accelerated Dense Random Projections
PhD Thesis. See also the talk slides.


Frequently a reviewer and/or program committee member for:

KDD, AISTATS, SIGIR, WSDM, WWW, SODA, ESA, FOCS


Personal:

My (not necessarily updated) resume is available here.

I'm also an enthusiastic kitesurfer and snowboarder.
Here are some pictures of that.

This site was last updated Sep 2013