Siddharth Ancha

| CV | Google Scholar | GitHub |

I am a postdoctoral associate at MIT in Nicholas Roy's Robust Robotics Group. I work at the intersection of robot perception and planning, towards enabling robots to autonomously and reliably perceive and act in complex environments.

I obtained my PhD in the Machine Learning Department at CMU, co-advised by David Held and Srinivasa Narasimhan. My PhD work focused on developing active sensing algorithms for Programmable Light Curtains, a novel controllable depth sensor, by drawing on techniques at the intersection of computer vision, planning and machine learning.

Prior to that, I was a master's student in the Department of Computer Science at the University of Toronto, where I worked on statistical machine learning with Daniel Roy and Roger Grosse. I also spent multiple summers working with Aditya Nori at Microsoft Research Cambridge. I graduated from IIT Guwahati with a major in Computer Science and a minor in Mathematics.

Sep '22 Honored to receive the IROS 2022 Outstanding Reviewer Award 🏆, awarded to 5 out of 4,291 reviewers!
Jul '22 Successfully defended my PhD thesis! A big thanks to my thesis committee: David Held, Srinivasa Narasimhan, Katerina Fragkiadaki and Wolfram Burgard. Also thanks to Chris Atkeson for the many insightful and interesting questions!
Feb '22 Excited to join Nick Roy's Robust Robotics Group at MIT as a postdoc this August!
Dec '21 Proposed my PhD thesis titled Active robot perception using programmable light curtains. Expected to defend and graduate in July 2022. Thesis committee: David Held, Srinivasa Narasimhan, Katerina Fragkiadaki and Wolfram Burgard.
Nov '21 Published a CMU ML Blog Post on our recent work on safety envelopes using light curtains, presented at RSS '21. Check it out!

Massachusetts Institute of Technology
Postdoctoral Associate
Computer Science & Artificial Intelligence Lab (CSAIL)
2022 ─ Present
Carnegie Mellon University
PhD, Machine Learning Department
School of Computer Science (SCS)
2017 ─ 2022
University of Toronto
MS in Computer Science
Department of Computer Science (DCS)
2015 ─ 2017
Indian Institute of Technology, Guwahati
BTech Major in Computer Science & Engineering
BTech Minor in Mathematics
2011 ─ 2015

Semi-supervised 3D Object Detection via Temporal Graph Neural Networks
Jianren Wang, Haiming Gang, Siddharth Ancha, Yi-Ting Chen, David Held
3DV 2021

webpage | abstract | pdf | bibtex | code | short talk | long talk | slides

3D object detection plays an important role in autonomous driving and other robotics applications. However, these detectors usually require training on large amounts of annotated data that is expensive and time-consuming to collect. Instead, we propose leveraging large amounts of unlabeled point cloud videos by semi-supervised learning of 3D object detectors via temporal graph neural networks. Our insight is that temporal smoothing can create more accurate detection results on unlabeled data, and these smoothed detections can then be used to retrain the detector. We learn to perform this temporal reasoning with a graph neural network, where edges represent the relationship between candidate detections in different time frames. After semi-supervised learning, our method achieves state-of-the-art detection performance on the challenging nuScenes and H3D benchmarks, compared to baselines trained on the same amount of labeled data.

  author    = {Jianren Wang AND Haiming Gang AND Siddharth Ancha AND Yi-Ting Chen AND David Held},
  title     = {Semi-supervised 3D Object Detection via Temporal Graph Neural Networks},
  booktitle = {3DV},
  year      = {2021}

Active Safety Envelopes using Light Curtains with Probabilistic Guarantees
Siddharth Ancha, Gaurav Pathak, Srinivasa Narasimhan, David Held
RSS 2021

webpage | abstract | pdf | bibtex | code | talk | blog

To safely navigate unknown environments, robots must accurately perceive dynamic obstacles. Instead of directly measuring the scene depth with a LiDAR sensor, we explore the use of a much cheaper and higher resolution sensor: programmable light curtains. Light curtains are controllable depth sensors that sense only along a surface that a user selects. We use light curtains to estimate the safety envelope of a scene: a hypothetical surface that separates the robot from all obstacles. We show that generating light curtains that sense random locations (from a particular distribution) can quickly discover the safety envelope for scenes with unknown objects. Importantly, we produce theoretical safety guarantees on the probability of detecting an obstacle using random curtains. We combine random curtains with a machine learning based model that forecasts and tracks the motion of the safety envelope efficiently. Our method accurately estimates safety envelopes while providing probabilistic safety guarantees that can be used to certify the efficacy of a robot perception system to detect and avoid dynamic obstacles. We evaluate our approach in a simulated urban driving environment and a real-world environment with moving pedestrians using a light curtain device and show that we can estimate safety envelopes efficiently and effectively.

  author    = {Siddharth Ancha AND Gaurav Pathak AND Srinivasa Narasimhan AND David Held}, 
  title     = {Active Safety Envelopes using Light Curtains with Probabilistic Guarantees}, 
  booktitle = {Proceedings of Robotics: Science and Systems}, 
  year      = {2021}, 
  address   = {Virtual}, 
  month     = {July}, 
  doi       = {10.15607/rss.2021.xvii.045} 

Exploiting & Refining Depth Distributions with Triangulation Light Curtains
Yaadhav Raaj, Siddharth Ancha, Robert Tamburo, David Held, Srinivasa Narasimhan
CVPR 2021

webpage | abstract | pdf | bibtex | code | talk

Active sensing through the use of adaptive depth sensors is a nascent field, with potential in areas such as advanced driver-assistance systems (ADAS). They do however require dynamically driving a laser / light-source to a specific location to capture information, with one such class of sensors being programmable light curtains. In this work, we introduce a novel approach that exploits prior depth distributions from RGB cameras to drive a light curtain's laser line to regions of uncertainty to get new measurements. These measurements are utilized such that depth uncertainty is reduced and errors get corrected recursively. We show real-world experiments that validate our approach in outdoor and driving settings, and demonstrate qualitative and quantitative improvements in depth RMSE when RGB cameras are used in tandem with a light curtain.

  author    = {Yaadhav Raaj AND Siddharth Ancha AND Robert Tamburo AND David Held AND Srinivasa Narasimhan},
  title     = {Exploiting and Refining Depth Distributions with Triangulation Light Curtains},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021}

Active Perception using Light Curtains for Autonomous Driving
Siddharth Ancha, Yaadhav Raaj, Peiyun Hu, Srinivasa Narasimhan, David Held
ECCV 2020 (Spotlight presentation)

webpage | abstract | pdf | bibtex | code | short talk | long talk | slides

Most real-world 3D sensors such as LiDARs perform fixed scans of the entire environment, while being decoupled from the recognition system that processes the sensor data. In this work, we propose a method for 3D object recognition using light curtains, a resource-efficient controllable sensor that measures depth at user-specified locations in the environment. Crucially, we propose using prediction uncertainty of a deep learning based 3D point cloud detector to guide active perception. Given a neural network's uncertainty, we derive an optimization objective to place light curtains using the principle of maximizing information gain. Then, we develop a novel and efficient optimization algorithm to maximize this objective by encoding the physical constraints of the device into a constraint graph and optimizing with dynamic programming. We show how a 3D detector can be trained to detect objects in a scene by sequentially placing uncertainty-guided light curtains to successively improve detection accuracy.

  author    = {Ancha, Siddharth AND Raaj, Yaadhav AND Hu, Peiyun AND Narasimhan, Srinivasa G.
              AND Held, David},
  editor    = {Vedaldi, Andrea AND Bischof, Horst AND Brox, Thomas AND Frahm, Jan-Michael},
  title     = {Active Perception Using Light Curtains for Autonomous Driving},
  booktitle = {Computer Vision -- ECCV 2020},
  year      = {2020},
  publisher = {Springer International Publishing},
  address   = {Cham},
  pages     = {751--766},
  isbn      = {978-3-030-58558-7}

Uncertainty-Aware Self-Supervised 3D Data Association
Jianren Wang, Siddharth Ancha, Yi-Ting Chen, David Held
IROS 2020

webpage | abstract | pdf | bibtex | talk | slides | code

3D object trackers usually require training on large amounts of annotated data that is expensive and time-consuming to collect. Instead, we propose leveraging vast unlabeled datasets by self-supervised metric learning of 3D object trackers, with a focus on data association. Large scale annotations for unlabeled data are cheaply obtained by automatic object detection and association across frames. We show how these self-supervised annotations can be used in a principled manner to learn point-cloud embeddings that are effective for 3D tracking. We estimate and incorporate uncertainty in self-supervised tracking to learn more robust embeddings, without needing any labeled data. We design embeddings to differentiate objects across frames, and learn them using uncertainty-aware self-supervised training. Finally, we demonstrate their ability to perform accurate data association across frames, towards effective and accurate 3D tracking.

  author    = {Wang, Jianren AND Ancha, Siddharth AND Chen, Yi-Ting AND Held, David},
  title     = {Uncertainty-aware Self-supervised 3D Data Association},
  booktitle = {IROS},
  year      = {2020}

Combining Deep Learning and Verification for Precise Object Instance Detection
Siddharth Ancha*, Junyu Nan*, David Held
CoRL 2019

webpage | abstract | pdf | bibtex | talk | code

Deep learning object detectors often return false positives with very high confidence. Although they optimize generic detection performance, such as mean average precision (mAP), they are not designed for reliability. For a reliable detection system, if a high confidence detection is made, we would want high certainty that the object has indeed been detected. To achieve this, we have developed a set of verification tests which a proposed detection must pass to be accepted. We develop a theoretical framework which proves that, under certain assumptions, our verification tests will not accept any false positives. Based on an approximation to this framework, we present a practical detection system that can verify, with high precision, whether each detection of a machine-learning based object detector is correct. We show that these tests can improve the overall accuracy of a base detector and that accepted examples are highly likely to be correct. This allows the detector to operate in a high precision regime and can thus be used for robotic perception systems as a reliable instance detection method.

  author    = {Siddharth Ancha AND Junyu Nan AND David Held},
  editor    = {Leslie Pack Kaelbling AND Danica Kragic AND Komei Sugiura},
  title     = {Combining Deep Learning and Verification for Precise Object Instance Detection},
  booktitle = {3rd Annual Conference on Robot Learning, CoRL 2019, Osaka, Japan,
               October 30 - November 1, 2019, Proceedings},
  series    = {Proceedings of Machine Learning Research},
  volume    = {100},
  pages     = {122--141},
  year      = {2019},
  url       = {},
  timestamp = {Mon, 25 May 2020 15:01:26 +0200},
  biburl    = {},
  bibsource = {dblp computer science bibliography,}
  Older Work

Autofocus Layer for Semantic Segmentation
Yao Qin, Konstantinos Kamnitsas, Siddharth Ancha, Jay Nanavati, Garrison Cottrell, Antonio Criminisi, Aditya Nori
MICCAI 2018 (Oral presentation)

abstract | pdf | bibtex | code

We propose the autofocus convolutional layer for semantic segmentation with the objective of enhancing the capabilities of neural networks for multi-scale processing. Autofocus layers adaptively change the size of the effective receptive field based on the processed context to generate more powerful features. This is achieved by parallelising multiple convolutional layers with different dilation rates, combined with an attention mechanism that learns to focus on the optimal scales driven by context. By sharing the weights of parallel convolutions, we make the network scale-invariant, with only a modest increase in the number of parameters. The proposed autofocus layer can be easily integrated into existing networks to improve the model's representational power. Our method achieves very promising performance on the challenging tasks of multi-organ segmentation in pelvic CT scans and brain tumor segmentation in MRI scans.

  title        = {Autofocus layer for semantic segmentation},
  author       = {Qin, Yao AND Kamnitsas, Konstantinos AND Ancha, Siddharth AND Nanavati, Jay
                  AND Cottrell, Garrison AND Criminisi, Antonio AND Nori, Aditya},
  booktitle    = {International conference on medical image computing and computer-assisted
                  intervention (MICCAI)},
  pages        = {603--611},
  year         = {2018},
  organization = {Springer}

Lifted Auto-Context Forests for Brain Tumour Segmentation
Loïc Le Folgoc, Aditya V. Nori, Siddharth Ancha, Antonio Criminisi
MICCAI 2016 BraTS Challenge (Winner)

abstract | pdf | bibtex

We revisit Auto-Context Forests for brain tumour segmentation in multi-channel magnetic resonance images, where semantic context is progressively built and refined via successive layers of Decision Forests (DFs). Specifically, we make the following contributions: (1) improved generalization via an efficient node-splitting criterion based on hold-out estimates, (2) increased compactness at the tree level, thereby yielding shallow discriminative ensembles trained orders of magnitude faster, and (3) guided semantic bagging that exposes latent data-space semantics captured by forest pathways. The proposed framework is practical: the per-layer training is fast, modular and robust. It was a top performer in the MICCAI 2016 BraTS (Brain Tumour Segmentation) challenge, and this paper aims to discuss and provide details about the challenge entry.

  title       = {Lifted auto-context forests for brain tumour segmentation},
  author      = {Le Folgoc, Loic AND Nori, Aditya V AND Ancha, Siddharth AND Criminisi, Antonio},
  booktitle   = {International Workshop on Brainlesion: Glioma, Multiple Sclerosis, Stroke and
                 Traumatic Brain Injuries},
  pages       = {171--183},
  year        = {2016},
  organization= {Springer}

Measuring the reliability of MCMC inference with bidirectional Monte Carlo
Roger B. Grosse, Siddharth Ancha, Daniel M. Roy
NeurIPS 2016

abstract | pdf | bibtex | code

Markov chain Monte Carlo (MCMC) is one of the main workhorses of probabilistic inference, but it is notoriously hard to measure the quality of approximate posterior samples. This challenge is particularly salient in black box inference methods, which can hide details and obscure inference failures. In this work, we extend the recently introduced bidirectional Monte Carlo technique to evaluate MCMC-based posterior inference algorithms. By running annealed importance sampling (AIS) chains both from prior to posterior and vice versa on simulated data, we upper bound in expectation the symmetrized KL divergence between the true posterior distribution and the distribution of approximate samples. We present Bounding Divergences with REverse Annealing (BREAD), a protocol for validating the relevance of simulated data experiments to real datasets, and integrate it into two probabilistic programming languages: WebPPL and Stan. As an example of how BREAD can be used to guide the design of inference algorithms, we apply it to study the effectiveness of different model representations in both WebPPL and Stan.

  author    = {Grosse, Roger B AND Ancha, Siddharth AND Roy, Daniel M},
  booktitle = {Advances in Neural Information Processing Systems},
  editor    = {D. Lee AND M. Sugiyama AND U. Luxburg AND I. Guyon AND R. Garnett},
  pages     = {},
  publisher = {Curran Associates, Inc.},
  title     = {Measuring the reliability of MCMC inference with bidirectional Monte Carlo},
  url       = {},
  volume    = {29},
  year      = {2016}

  Conference Reviewing
2021 NeurIPS Workshop on Ecological Theory of RL
2021 Conference on Robot Learning (CoRL)
2020 Robotics: Science and Systems (RSS)
2020 Conference on Robot Learning (CoRL)
2019 NeurIPS Black in AI Workshop
2019 Robotics: Science and Systems (RSS)
2019 Conference on Robot Learning (CoRL)


Gates Hillman Center
Room 8021
Machine Learning Department
Carnegie Mellon University
Pittsburgh, PA 15213
