publications | Alexander Kolesnikov

A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, and Xiaohua Zhai

arXiv preprint arXiv:2303.17376, 2023

arXiv PDF Code
Sigmoid Loss for Language Image Pre-Training

Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer

arXiv preprint arXiv:2303.15343, 2023

arXiv PDF Code
Scaling vision transformers to 22 billion parameters

Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, and Neil Houlsby

arXiv preprint arXiv:2302.05442, 2023

arXiv PDF
Tuning computer vision models with task rewards

André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, and Xiaohua Zhai

arXiv preprint arXiv:2302.08242, 2023

arXiv PDF Code
FlexiViT: One Model for All Patch Sizes

Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, and Filip Pavetic

Conference on Computer Vision and Pattern Recognition (CVPR), 2022

arXiv PDF Code
PaLI: A Jointly-Scaled Multilingual Language-Image Model

Xi Chen, Xiao Wang, Soravit Changpinyo, AJ Piergiovanni, Piotr Padlewski, Daniel Salz, Sebastian Goodman, Adam Grycner, Basil Mustafa, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Nan Ding, Keran Rong, Hassan Akbari, Gaurav Mishra, Linting Xue, Ashish Thapliyal, James Bradbury, Weicheng Kuo, Mojtaba Seyedhosseini, Chao Jia, Burcu Karagol Ayan, Carlos Riquelme, Andreas Steiner, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, and Radu Soricut

International Conference on Learning Representations (ICLR), 2022

arXiv PDF
UViM: A unified modeling approach for vision with learned guiding codes

Alexander Kolesnikov, André Susano Pinto, Lucas Beyer, Xiaohua Zhai, Jeremiah Harmsen, and Neil Houlsby

Advances in Neural Information Processing Systems (NeurIPS), 2022

arXiv PDF Code
Better plain ViT baselines for ImageNet-1k

Lucas Beyer, Xiaohua Zhai, and Alexander Kolesnikov

arXiv preprint arXiv:2205.01580, 2022

arXiv PDF Code
Scaling vision transformers

Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, and Lucas Beyer

Conference on Computer Vision and Pattern Recognition (CVPR), 2022

arXiv PDF Code
Knowledge distillation: A good teacher is patient and consistent

Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, and Alexander Kolesnikov

Conference on Computer Vision and Pattern Recognition (CVPR), 2022

arXiv PDF Code
LiT: Zero-shot transfer with locked-image text tuning

Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, and Lucas Beyer

Conference on Computer Vision and Pattern Recognition (CVPR), 2022

arXiv PDF Code
How to train your ViT? Data, augmentation, and regularization in vision transformers

Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, and Lucas Beyer

Transactions on Machine Learning Research (TMLR), 2021

arXiv PDF Code
MLP-Mixer: An all-MLP architecture for vision

Ilya O Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, and others

Advances in Neural Information Processing Systems (NeurIPS), 2021

arXiv PDF Code
An image is worth 16x16 words: Transformers for image recognition at scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, and others

International Conference on Learning Representations (ICLR), 2020

arXiv PDF Code
Big transfer (BiT): General visual representation learning

Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, and Neil Houlsby

European Conference on Computer Vision (ECCV), 2020

arXiv PDF Code
On Robustness and Transferability of Convolutional Neural Networks

Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D’Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, and Mario Lucic

Conference on Computer Vision and Pattern Recognition (CVPR), 2021

arXiv PDF
Are we done with ImageNet?

Lucas Beyer, Olivier J Hénaff, Alexander Kolesnikov, Xiaohua Zhai, and Aäron van den Oord

arXiv preprint arXiv:2006.07159, 2020

arXiv PDF Code
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, and Neil Houlsby

arXiv preprint arXiv:1910.04867, 2019

arXiv PDF Code
S4L: Self-supervised semi-supervised learning

Xiaohua Zhai, Avital Oliver, Alexander Kolesnikov, and Lucas Beyer

International Conference on Computer Vision (ICCV), 2019

arXiv PDF Code
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander Kolesnikov, Tom Duerig, and Vittorio Ferrari

International Journal of Computer Vision (IJCV), 2020

arXiv PDF
Detecting Visual Relationships Using Box Attention

Alexander Kolesnikov, Alina Kuznetsova, Christoph Lampert, and Vittorio Ferrari

International Conference on Computer Vision (ICCV) Workshops, 2019

arXiv PDF
Revisiting self-supervised visual representation learning

Alexander Kolesnikov, Xiaohua Zhai, and Lucas Beyer

Conference on Computer Vision and Pattern Recognition (CVPR), 2019

arXiv PDF Code
Estimating barriers to gene flow from distorted isolation-by-distance patterns

Harald Ringbauer, Alexander Kolesnikov, David L Field, and Nicholas H Barton

Genetics, 2018

HTML
Probabilistic image colorization

Amelie Royer, Alexander Kolesnikov, and Christoph H Lampert

British Machine Vision Conference (BMVC), 2017

arXiv PDF
PixelCNN models with auxiliary variables for natural image modeling

Alexander Kolesnikov and Christoph H Lampert

International Conference on Machine Learning (ICML), 2017

arXiv PDF
iCaRL: Incremental Classifier and Representation Learning

Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H. Lampert

Conference on Computer Vision and Pattern Recognition (CVPR), 2017

arXiv PDF Code
Improving weakly-supervised object localization by micro-annotation

Alexander Kolesnikov and Christoph H Lampert

British Machine Vision Conference (BMVC), 2016

arXiv PDF
Seed, expand and constrain: Three principles for weakly-supervised image segmentation

Alexander Kolesnikov and Christoph H Lampert

European Conference on Computer Vision (ECCV), 2016

arXiv PDF Code
Identifying Reliable Annotations for Large Scale Image Segmentation

Alexander Kolesnikov and Christoph H Lampert

arXiv preprint arXiv:1504.07460, 2015

arXiv PDF
Closed-form training of conditional random fields for large scale image segmentation

Alexander Kolesnikov, Matthieu Guillaumin, Vittorio Ferrari, and Christoph H Lampert

European Conference on Computer Vision (ECCV), 2014

arXiv PDF