MuSRobS 2015

Workshop on

Multimodal Semantics for Robotic Systems

Call for Papers

Extended deadline

In response to several requests, the organizing committee is pleased to announce that the submission deadline has been extended to Monday, 13 July, at midnight PST.

Main goals of the workshop

Human learning and reasoning is a process in which information obtained from a range of different senses is combined to form an incredibly successful cognitive system. Artificial autonomous systems should likewise process and combine different kinds of sensory information effectively, so that they complement each other and produce better logical inferences.

Through semantic modeling of low-level features within a scenario, robots can generate representations of those features at a level of abstraction where logical reasoning methods can be applied to them for decision making. Furthermore, at such a level several modalities can be fused to complement each other and produce logical inferences. This makes robust decision making possible even in scenarios where certain modalities underperform, such as generic task performances.

Lately, heterogeneous cognitive systems have become quite popular in the research community, especially those applying deep learning techniques to image and language sources, which have shown promising results. This workshop provides a focused forum for discussing how different fields, such as audio, speech, language and images, among others, can be brought together into single robotic systems that improve themselves by learning and can be exploited through different reasoning techniques.

This workshop will bring together leading researchers from different fields of robotics to share and unify techniques that can be applied to areas other than those where they are currently used. Along with the presentation of novel work in the field, discussions will be encouraged so that researchers can share their latest advances. Finally, prominent figures in research on multimodal semantic systems will be invited to share their latest and most successful achievements and their overviews of the field.

Topics of interest (including, but not limited to)

  • Multimodal knowledge representation in robotic systems
  • Semantic modeling of multimodal feature space
  • Multimodal fusion at semantic level
  • Logical reasoning of multimodal feature spaces
  • Heterogeneous cognitive robotics systems
  • Multimodal semantic reasoning on robotic systems
  • Shallow and deep semantic processing of heterogeneous information
  • Computational aspects of multimodal semantics on robots
  • Methodologies and approaches for multimodal semantic annotation
  • Representing and resolving semantic multimodal ambiguity
  • Hybrid symbolic and statistical approaches to representing multimodal semantics
  • Multimodal vs. non-multimodal approaches
  • Multimodal semantics and ontologies for robots
  • Robotic applications of multimodal semantic systems
  • Joint language and image semantic robot learning and reasoning

Important dates

  • 13.07.15
    EXTENDED Submission deadline
  • 13.08.15
    Notification of acceptance
  • 17.08.15
    Early Registration Deadline
  • 31.08.15
    Camera Ready Submission Deadline
  • 28.09.15
    Workshop day

Submission and Authors Information

All papers must be written in English and submitted electronically in PDF format, conforming to the manuscript preparation guidelines. Papers submitted to MuSRobS 2015 undergo a peer-review process. Papers should be no fewer than 4 pages and no more than 8 pages long, excluding references. Papers should be submitted via EasyChair.

Please prepare the paper using one of the following IEEE templates: LaTeX or MS Word.

The organizers are arranging for selected best papers to be published in a well-known journal.


To register for this workshop, please use the IROS website.

Industrial Session

MuSRobS will host an Industry Session with leading researchers from companies in the robotics industry that are conducting applied research in areas related to the main topic of the workshop. It is intended to serve as a forum for ideas and discussion on the applications of Multimodal Semantics for Robotic Systems in real scenarios. Each of the invited company representatives will give a short introduction to their areas of interest and their work in the field. This will provide insight into the real applications that multimodal semantics can have in both industrial and social robotics environments. The session will end with a Q&A period and a plenary panel discussion.


Agenda - MoWS-05 Workshop

NOTE: All paper presentations are 20 minutes: 15 minutes presentation plus 5 minutes Q&A.

Morning Activities, 28 September 2015
10:30 — 10:40

Opening Ceremony

10:40 — 11:00

"A Novel Multimodal Emotion Recognition Approach for Affective Human Robot Interaction"
Felipe Cid Burgos, Pedro Núñez Trujillo and Luis J. Manso.

11:00 — 11:20

"A Multi-modal Approach for Assistive Humanoid Robots"
German Parisi, Johannes Bauer, Erik Strahl and Stefan Wermter.

11:20 — 11:40

"An Intelligent Application Development Platform for Service Robots"
Yu Sugawara, Takeshi Morita, Shunta Saito and Takahira Yamaguchi.

11:40 — 12:00

"Multimodal binding of parameters for task-based robot programming based on semantic descriptions of modalities and parameter types"
Alexander Perzylo, Nikhil Somani, Stefan Profanter, Markus Rickert and Alois Knoll.

12:00 — 12:20

"Connecting natural language to task demonstrations and low-level control of industrial robots"
Maj Stenmark and Jacek Malec.

12:20 — 12:40

"Perceptive Parallel Processes Coordinating Geometry and Texture"
Marco A. Gutiérrez, Rafael E. Banchs and Luis F. D'Haro.

12:40 — 14:00

Lunch time!

14:00 — 14:45

Keynote 1: How high-performance & low-energy computing systems for deep learning and computer vision can help Robotics
Serge Palaric, Senior Director EMEA Embedded, NVIDIA.

14:45 — 15:30

Keynote 2: Learning and using semantics in monocular SLAM
Javier Civera, Associate Professor, Robotics, Perception and Real-Time Group, University of Zaragoza, Spain.

15:30 — 16:00

Coffee Break

16:00 — 16:45

Keynote 3: SP1: Stereo Vision in Real Time
Konstantin Schauwecker, Nerian Vision Technologies.

16:45 — 17:00

Closing Ceremony

Invited keynote speakers

  • Serge Palaric

    Senior Director EMEA Embedded


Serge Palaric joined NVIDIA in 2004 after 20 years working on mobile devices at the European level for different OEMs such as Dell, Packard Bell and NEC. Focused on system design wins at key global accounts, he is in charge of the embedded business at NVIDIA covering Europe, with a focus on IVA and autonomous machines, where computer vision and deep learning are key leading technologies.


Keynote: How high-performance & low-energy computing systems for deep learning and computer vision can help Robotics

Our NVIDIA Jetson TK1 development kit was the first mobile supercomputer to power next-gen embedded applications employing computer vision, as well as other computationally intensive on-board apps. It is built around the revolutionary NVIDIA® Tegra® K1 SoC, a 4-Plus-1™ quad-core mobile processor that combines the lowest-power ARM® Cortex-A15 CPU with the highest-performance GPU, which uses the same NVIDIA Kepler™ computing core designed into supercomputers around the world. Jetson TK1 enables industries to unleash the power of 192 CUDA® cores to develop solutions in computer vision, robotics, UAV, IVA, and more, shaping the future of embedded.

  • Javier Civera

    Associate Professor

    Robotics, Perception and Real-Time Group
    University of Zaragoza, Spain

Javier Civera was born in Barcelona, Spain, in 1980. He received the industrial engineering degree in 2004 and the Ph.D. degree in 2009, both from the University of Zaragoza in Spain. He is currently an Associate Professor at the University of Zaragoza, where he teaches courses in computer vision, machine learning and artificial intelligence. He has participated in several EU-funded, national and technology transfer projects related to vision and robotics and has been funded for research visits to Imperial College (London) and ETH (Zürich). He has coauthored around 30 publications in top conferences and journals, receiving around 1,700 citations (Google Scholar). Currently, his research interests are in the use of 3D vision, distributed architectures and learning algorithms to produce robust and real-time vision technologies for robotics, wearables and AR applications.

Keynote: Learning and using semantics in monocular SLAM

Visual SLAM, which stands for Simultaneous Localization and Mapping, aims to estimate the ego-pose and a model (a map) of the scene, using an image sequence as the main source of information. Visual SLAM has become one of the key technologies in an increasing number of relevant applications such as Augmented/Virtual Reality, Robotics and Autonomous Cars, where the estimated camera motion and map can be used for coherent virtual insertions, collision-free autonomous navigation or high-level robotic tasks. Visual SLAM has traditionally focused on geometry, and it is only very recently that the estimation of semantics has gained relevance in the community. Learning and understanding the semantics of a scene opens the door to a wide array of applications, especially for autonomous high-level tasks (e.g., commanding a robot to grasp a cup from the cupboard). This talk will review some of the most relevant algorithms for semantic visual SLAM and will show some application examples and the associated gain in performance.

  • Konstantin Schauwecker

    Founder of Nerian Vision Technologies

Konstantin Schauwecker graduated in 2008 with a degree in software engineering from the Esslingen University of Applied Sciences in Germany. After one year of industry work, he moved to New Zealand in 2009 to pursue his master's degree at the University of Auckland, for which he worked on stereo vision for driver assistance. After graduating, he returned to Germany in 2011 and began a part-time PhD program at the University of Tübingen, where he worked on stereo-vision-based methods for autonomous micro aerial vehicles (MAVs). At the same time he worked as a contractor for Thales Air Systems and Robert Bosch GmbH. After completing his PhD in 2014 he continued working as a contractor for Daimler AG, where he joined the research group for vehicle localization. In 2015 he founded the computer vision start-up Nerian Vision Technologies, which provides solutions for real-time stereo vision.

Keynote: SP1: Stereo Vision in Real Time

To plan their motion and interact with the physical world, robots require a method for perceiving the three-dimensional world around them. Various sensors can be employed for this task, such as time-of-flight and structured-light cameras, or 3D laser sensors. What these technologies have in common is that they all require the emission of light. While this approach works well when using a single sensor in a controlled environment, difficulties arise when moving the robot outside of the lab or operating more than one robot in the same space. These problems can be avoided by using stereo vision, which is a purely passive approach. The high computational load involved in stereo vision has, however, made it difficult to integrate this technology into real-life robots. To overcome this challenge, Nerian Vision Technologies introduces the SP1 stereo vision system. Using a powerful FPGA, this system is able to perform all stereo vision processing tasks in real time at high frame rates and with low power consumption. We hope that this system will make stereo vision more accessible to robot developers.


Conference Venue


Congress Center Hamburg (CCH),
Am Dammtor / Marseiller Str., 20355 Hamburg, Germany



Organizing Committee

  • Marco A. Gutiérrez

    Robotics And Artificial Vision Laboratory
    University of Extremadura, Spain

  • Rafael E. Banchs

    SERC Robotics Program, A*STAR
    Institute for Infocomm Research, Singapore

  • Suraj Nair

    Technische Universität München
    Munich, Germany

Advisory Committee

  • Luis Fernando D'Haro
    I2R-HLT, Singapore
  • Aravindkumar Vijayalingam
    TUM-Create, Singapore
  • Andreea I. Niculescu
    I2R-HLT, Singapore

Program Committee

Program Committee members, in alphabetical order:

  • Toni Badia, Pompeu Fabra University, Spain
  • Rafael Banchs, I2R-HLT, Singapore
  • Antonio Bandera, University of Málaga, Spain
  • Marco Baroni, CIMeC, Italy
  • Pablo Bustos, University of Extremadura, Spain
  • Caixia Cai, Technische Universität München, Germany
  • Erik Cambria, Nanyang Technological University, Singapore
  • Jose María Cañas, Rey Juan Carlos University, Spain
  • Monojit Choudhury, Microsoft Research India, India
  • Felipe Cid, Austral University of Chile, Chile
  • Javier Civera, University of Zaragoza, Spain
  • Luis Fernando D'Haro, I2R-HLT, Singapore
  • Paulo Drews-Jr, Federal University of Rio Grande-FURG, Brazil
  • Mary Ellen Foster, Heriot-Watt University, Scotland
  • Ismael García-Varea, University of Castilla-La Mancha, Spain
  • Manuel Giuliani, University of Salzburg, Austria
  • Marco Antonio Gutiérrez, University of Extremadura, Spain
  • Luis J. Manso, University of Extremadura, Spain
  • Rebeca Marfil, University of Málaga, Spain
  • Jesus Martinez-Gomez, University of Castilla-La Mancha, Spain
  • Suraj Nair, Technische Universität München, Germany
  • Andreea I. Niculescu, I2R-HLT, Singapore
  • Pedro Núñez, University of Extremadura, Spain
  • Hemant Patil, DA-IICT Gandhinagar, India
  • Alexander Perzylo, fortiss, Germany
  • Eloy Retamino, TUM-Create, Singapore
  • Rui P. Rocha, University of Coimbra, Portugal
  • Thorsten Roeder, Technische Universität München, Germany
  • Sakriani Sakti, Nara Institute of Science and Technology (NAIST), Japan
  • Nikhil Somani, fortiss, Germany
  • Aravindkumar Vijayalingam, TUM-Create, Singapore
  • Erik Wilhelm, Singapore University of Technology and Design, Singapore
  • Feihu Zhang, fortiss, Germany

Supporting Organizations