Structure-Aware Contact Point Recommendation for Robotic Manipulation
Overview
This project develops a system that recommends contact points for instruction-guided robotic manipulation by integrating structural awareness into vision-language models (VLMs). The system interprets natural language commands and infers physically feasible grasp and contact locations from the object's geometry and the task constraints.
Period: Sep. 2025 – Aug. 2026
Funding: Master’s Research Encouragement Grant, National Research Foundation of Korea (2025)
Motivation
Instruction-guided robotic manipulation requires a robot to understand both the semantic intent of a command and the geometric properties of the target object. Existing approaches often estimate contact points independently of language understanding, which can produce physically infeasible grasps. This project aims to bridge that gap through structure-aware reasoning.
Approach
- Parse natural language instructions using a vision-language model (VLM)
- Extract structural features of target objects (geometry, affordances, contact surfaces)
- Recommend contact points that are both semantically aligned with the instruction and physically feasible
- Validate on a robotic manipulation platform with diverse object categories
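The recommendation step above can be illustrated with a minimal sketch. It is not the project's implementation: the `Candidate` type, the `semantic_score` field (standing in for a VLM's instruction-alignment score), the 30-degree approach-angle limit, and the multiplicative ranking rule are all assumptions chosen for illustration. The sketch keeps only candidates whose surface normal is roughly opposed to the gripper's approach direction, then ranks the survivors by semantic score.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical contact-point candidate on the target object."""
    name: str
    normal: tuple          # unit outward surface normal (assumed normalized)
    semantic_score: float  # assumed VLM alignment score in [0, 1]


def feasibility(candidate, approach, max_angle_deg=30.0):
    """Return 1.0 if the approach direction opposes the surface normal
    within max_angle_deg, else 0.0. `approach` is assumed to be a unit vector."""
    dot = -sum(a * n for a, n in zip(approach, candidate.normal))
    dot = max(-1.0, min(1.0, dot))  # clamp against rounding error
    angle = math.degrees(math.acos(dot))
    return 1.0 if angle <= max_angle_deg else 0.0


def recommend(candidates, approach, top_k=1):
    """Rank candidates by semantic score, discarding infeasible ones."""
    scored = [(c.semantic_score * feasibility(c, approach), c) for c in candidates]
    scored = [(s, c) for s, c in scored if s > 0.0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


if __name__ == "__main__":
    # Toy example for an instruction such as "pick up the knife by the handle".
    candidates = [
        Candidate("handle_top", (0.0, 0.0, 1.0), 0.90),
        Candidate("blade_top", (0.0, 0.0, 1.0), 0.20),
        Candidate("handle_side", (1.0, 0.0, 0.0), 0.95),
    ]
    top_down = (0.0, 0.0, -1.0)  # gripper approaching from above
    for c in recommend(candidates, top_down, top_k=2):
        print(c.name, c.semantic_score)
```

In this toy case the side of the handle has the highest semantic score but is rejected because a top-down approach cannot reach it, which is the kind of mismatch between language and geometry the project targets. A real system would replace the binary feasibility check with richer constraints such as gripper width, collisions, and friction.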
Skills
Python, ROS2, PyTorch, Vision-Language Models, OpenCV