publications

  1. autostudy.png
    A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision
    Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, and Xiaohua Zhai
    arXiv preprint arXiv:2303.17376, 2023
  2. sigmoid.png
    Sigmoid Loss for Language Image Pre-Training
    Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer
    arXiv preprint arXiv:2303.15343, 2023
  3. 22b.png
    Scaling vision transformers to 22 billion parameters
    Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, and Neil Houlsby
    arXiv preprint arXiv:2302.05442, 2023
  4. tune.png
    Tuning computer vision models with task rewards
    André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, and Xiaohua Zhai
    arXiv preprint arXiv:2302.08242, 2023
  5. flexivit.png
    FlexiViT: One Model for All Patch Sizes
    Lucas Beyer, Pavel Izmailov, Alexander Kolesnikov, Mathilde Caron, Simon Kornblith, Xiaohua Zhai, Matthias Minderer, Michael Tschannen, Ibrahim Alabdulmohsin, and Filip Pavetic
    Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  6. pali.png
    PaLI: A Jointly-Scaled Multilingual Language-Image Model
    Xi Chen, Xiao Wang, Soravit Changpinyo, AJ Piergiovanni, Piotr Padlewski, Daniel Salz, Sebastian Goodman, Adam Grycner, Basil Mustafa, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Nan Ding, Keran Rong, Hassan Akbari, Gaurav Mishra, Linting Xue, Ashish Thapliyal, James Bradbury, Weicheng Kuo, Mojtaba Seyedhosseini, Chao Jia, Burcu Karagol Ayan, Carlos Riquelme, Andreas Steiner, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, and Radu Soricut
    International Conference on Learning Representations (ICLR), 2022
  7. uvim.png
    UViM: A unified modeling approach for vision with learned guiding codes
    Alexander Kolesnikov, André Susano Pinto, Lucas Beyer, Xiaohua Zhai, Jeremiah Harmsen, and Neil Houlsby
    Advances in Neural Information Processing Systems (NeurIPS), 2022
  8. better.png
    Better plain ViT baselines for ImageNet-1k
    Lucas Beyer, Xiaohua Zhai, and Alexander Kolesnikov
    arXiv preprint arXiv:2205.01580, 2022
  9. svit.png
    Scaling vision transformers
    Xiaohua Zhai, Alexander Kolesnikov, Neil Houlsby, and Lucas Beyer
    Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  10. distill.png
    Knowledge distillation: A good teacher is patient and consistent
    Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, and Alexander Kolesnikov
    Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  11. lit.png
    LiT: Zero-shot transfer with locked-image text tuning
    Xiaohua Zhai, Xiao Wang, Basil Mustafa, Andreas Steiner, Daniel Keysers, Alexander Kolesnikov, and Lucas Beyer
    Conference on Computer Vision and Pattern Recognition (CVPR), 2022
  12. howto.png
    How to train your ViT? data, augmentation, and regularization in vision transformers
    Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, and Lucas Beyer
    Transactions on Machine Learning Research (TMLR), 2021
  13. mixer.png
    MLP-Mixer: An all-MLP architecture for vision
    Ilya O Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, and others
    Advances in Neural Information Processing Systems (NeurIPS), 2021
  14. vit.png
    An image is worth 16x16 words: Transformers for image recognition at scale
    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, and others
    International Conference on Learning Representations (ICLR), 2020
  15. big_transfer.png
    Big transfer (BiT): General visual representation learning
    Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, and Neil Houlsby
    European Conference on Computer Vision (ECCV), 2020
  16. robust.png
    On Robustness and Transferability of Convolutional Neural Networks
    Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander D’Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, and Mario Lucic
    Conference on Computer Vision and Pattern Recognition (CVPR), 2021
  17. arewe.png
    Are we done with ImageNet?
    Lucas Beyer, Olivier J Hénaff, Alexander Kolesnikov, Xiaohua Zhai, and Aäron van den Oord
    arXiv preprint arXiv:2006.07159, 2020
  18. vtab.png
    A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
    Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, André Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, and Neil Houlsby
    arXiv preprint arXiv:1910.04867, 2019
  19. s4l.png
    S4L: Self-supervised semi-supervised learning
    Xiaohua Zhai, Avital Oliver, Alexander Kolesnikov, and Lucas Beyer
    International Conference on Computer Vision (ICCV), 2019
  20. oiv4.png
    The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
    Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander Kolesnikov, Tom Duerig, and Vittorio Ferrari
    International Journal of Computer Vision (IJCV), 2020
  21. box.png
    Detecting Visual Relationships Using Box Attention
    Alexander Kolesnikov, Alina Kuznetsova, Christoph Lampert, and Vittorio Ferrari
    International Conference on Computer Vision (ICCV) Workshops, 2019
  22. revisiting.png
    Revisiting self-supervised visual representation learning
    Alexander Kolesnikov, Xiaohua Zhai, and Lucas Beyer
    Conference on Computer Vision and Pattern Recognition (CVPR), 2019
  23. harald.png
    Estimating barriers to gene flow from distorted isolation-by-distance patterns
    Harald Ringbauer, Alexander Kolesnikov, David L Field, and Nicholas H Barton
    Genetics, 2018
  24. color.png
    Probabilistic image colorization
    Amélie Royer, Alexander Kolesnikov, and Christoph H Lampert
    British Machine Vision Conference (BMVC), 2017
  25. pixelcnn.png
    PixelCNN models with auxiliary variables for natural image modeling
    Alexander Kolesnikov and Christoph H Lampert
    International Conference on Machine Learning (ICML), 2017
  26. icarl.jpg
    iCaRL: Incremental Classifier and Representation Learning
    Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, Georg Sperl, and Christoph H. Lampert
    Conference on Computer Vision and Pattern Recognition (CVPR), 2017
  27. improving.jpg
    Improving weakly-supervised object localization by micro-annotation
    Alexander Kolesnikov and Christoph H Lampert
    British Machine Vision Conference (BMVC), 2016
  28. sec.jpg
    Seed, expand and constrain: Three principles for weakly-supervised image segmentation
    Alexander Kolesnikov and Christoph H Lampert
    European Conference on Computer Vision (ECCV), 2016
  29. ident.jpg
    Identifying Reliable Annotations for Large Scale Image Segmentation
    Alexander Kolesnikov and Christoph H Lampert
    arXiv preprint arXiv:1504.07460, 2015
  30. closed-form.jpg
    Closed-form training of conditional random fields for large scale image segmentation
    Alexander Kolesnikov, Matthieu Guillaumin, Vittorio Ferrari, and Christoph H Lampert
    European Conference on Computer Vision (ECCV), 2014