Publications

My work focuses on safe and capable autonomy, particularly for navigation tasks.

MonoMPC: Monocular Vision Based Navigation with Learned Collision Model and Risk-Aware Model Predictive Control

Basant Sharma, Prajyot Jadhav, Pranjal Paul, K. Madhava Krishna, Arun Kumar Singh

RA-L 2026 | Risk-Aware MPC | Depth Foundation Model | Monocular Vision-based Navigation

Navigating unknown spaces with a single RGB camera is notoriously difficult because depth estimates from vision foundation models are often too noisy for reliable zero-shot collision checking. MonoMPC sidesteps this issue: instead of using the noisy depth directly for collision checking, we feed it as contextual input to a learned collision model that predicts distributions over obstacle clearance. Pairing this model with a risk-aware MPC planner yields faster and safer navigation in highly cluttered spaces.

SparseLoc: Sparse Open-Set Landmark-based Global Localization for Autonomous Navigation

Pranjal Paul, Vineeth Bhat, Tejas Salian, Mohd. Omama, Krishna Murthy J., Naveen Arulselvan, K. Madhava Krishna

IROS 2025 | Semantic Mapping | CLIP | Particle Filter Localization | KITTI

What if an autonomous agent could localize across a 5 km region using just a thousand points instead of millions? To make this possible, we developed a city-scale localization method that replaces dense maps and GPS with a sparse topometric map of open-vocabulary landmarks. By leveraging foundation models to identify semantic landmarks and integrating them into a recursive Bayesian framework, our system achieves robust global localization across diverse environments. The result is a highly efficient pipeline that reduces map density by 500x compared to standard SLAM methods like LOAM and Fast-LIO-SAM, without sacrificing localization reliability.

LeGo-Drive: Language-enhanced Goal-Oriented Closed-Loop End-to-End Autonomous Driving

Pranjal Paul, Anant Garg, Tushar Choudhary, Arun Kumar Singh, K. Madhava Krishna

IROS · RSS Workshop 2024 | Autonomous Driving | Vision-Language | Trajectory Optimization | Planning

Most neural planners treat safety as a post-hoc operation: downstream optimizers do the heavy lifting to make trajectories feasible for controllers, while perception models act merely as priors. To make perception inherently planner-aware, we formulate the planner as a differentiable optimization layer and task the perception module with predicting planner-oriented quantities, such as the goal location, from language commands. This allows gradients to flow end-to-end, so that language-driven goal predictions are not just semantically meaningful but also kinematically feasible by design. This tight integration of perception and planning anticipated the architectures that are now standard in Vision-Language-Action (VLA) models.

Interested in collaborating? Get in touch.