Open Access Article
Sensors 2016, 16(12), 2117; doi:10.3390/s16122117

Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera

1
Department of Hydraulic, Energy and Power Engineering, Yangzhou University, Yangzhou 225127, China
2
Department of Automotive Engineering, Clemson University, Greenville, SC 29607, USA
3
Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
*
Author to whom correspondence should be addressed.
Academic Editor: Joonki Paik
Received: 20 September 2016 / Revised: 24 November 2016 / Accepted: 7 December 2016 / Published: 13 December 2016
(This article belongs to the Special Issue Video Analysis and Tracking Using State-of-the-Art Sensors)

Abstract

Controlling robots by natural language (NL) is attracting increasing attention for its versatility and convenience, and because users need no extensive training. Grounding, which enables robots to understand NL instructions from humans, is a crucial challenge in this problem. This paper explores the object grounding problem and studies how to detect target objects from NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. From the metric information of all segmented objects, the object attributes and the relations between objects are then extracted. NL instructions that incorporate multiple cues for object specification are parsed into domain-specific annotations. The annotations from NL and the information extracted from the RGB-D camera are matched in a computational state estimation framework that searches all possible object grounding states. The final grounding is accomplished by selecting the states with the maximum probability. An RGB-D scene dataset is collected, together with different groups of NL instructions based on different cognition levels of the robot. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. Experiments on NL-controlled object manipulation and NL-based task programming with a mobile manipulator show its effectiveness and practicability in robotic applications.
Keywords: object grounding; target object detection; object recognition; natural language processing; natural language control; robotic manipulation system
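
The matching step summarized in the abstract (annotations from NL compared against attributes extracted from the scene, followed by selection of the most probable grounding states) can be illustrated with a minimal sketch. This is not the authors' implementation: the `SceneObject` type, the `attribute_likelihood` and `ground` functions, the annotation keys, and the match probability of 0.9 are all hypothetical choices made for illustration, and spatial relations between objects are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SceneObject:
    """One segmented object with attributes (hypothetical schema)."""
    name: str
    color: str


def attribute_likelihood(observed: str, requested: Optional[str],
                         p_match: float = 0.9) -> float:
    """Likelihood of one attribute cue; p_match is an assumed value."""
    if requested is None:
        return 1.0  # the instruction gives no cue for this attribute
    return p_match if observed == requested else 1.0 - p_match


def ground(objects: List[SceneObject],
           annotation: Dict[str, str]) -> Tuple[List[SceneObject], List[float]]:
    """Score every candidate object, normalize, and keep the maximum-probability ones."""
    scores = [
        attribute_likelihood(obj.name, annotation.get("name"))
        * attribute_likelihood(obj.color, annotation.get("color"))
        for obj in objects
    ]
    total = sum(scores)
    probs = [s / total for s in scores]
    best = max(probs)
    # Several objects may tie, so all maximum-probability candidates are returned.
    targets = [o for o, p in zip(objects, probs) if abs(p - best) < 1e-9]
    return targets, probs


scene = [SceneObject("cup", "red"), SceneObject("cup", "blue"), SceneObject("box", "red")]
targets, probs = ground(scene, {"name": "cup", "color": "red"})
ambiguous, _ = ground(scene, {"color": "red"})
```

With both cues given, only the red cup remains as a target; with the color cue alone, the red cup and the red box tie, which shows why more than one state can be selected.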

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Bao, J.; Jia, Y.; Cheng, Y.; Tang, H.; Xi, N. Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera. Sensors 2016, 16, 2117.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Sensors, EISSN 1424-8220, published by MDPI AG, Basel, Switzerland.