A. Ashok, M. Gruteser, N. Mandayam, J. Silva, M. Varga, and K. Dana, Challenge: Mobile Optical Networks Through Visual MIMO, Proceedings of the 16th Annual International Conference on Mobile Computing and Networking (MobiCom). New York, NY, USA:

Visual MIMO Networks

Project Objectives:
This inter-disciplinary project brings together expertise in the areas of mobile networks, communications, and computer vision to analyze, design, and prototype a network stack for visual MIMO communications.

This stack will address the fundamentally different visual channel and receiver constraints through the following key components:

  • Vision-inspired techniques for signal acquisition, tracking, interference cancellation, and modulation at the physical layer
  • Vision-aware link and MAC layer protocols that adapt to perspective distortions and partial occlusions and address camera synchronization limitations
  • Visual multi-path routing, visual localization, and energy-management techniques at the Network and higher layers to enable use of visual MIMO in a variety of application contexts
  • Analytical models and experimental evaluation results of a visual MIMO network in a dense configuration on the ORBIT test-bed and in mobile outdoor experiments

Technology Rationale:

The ubiquity of light emitting arrays (LEAs) and cameras in present-day digital devices raises the possibility of using visual MIMO communications in a variety of settings. One example is to use infrared rather than visible light emitting arrays with camera receivers, which can reduce noise from sunlight, improve fog penetration, and eliminate the human perception issues of visible light communications. Further, rather than operating in the low bit-rate regime, it may be possible to design cameras for this communication task that provide very high frame rates by moving the LEA recognition and tracking tasks into the camera and outputting only the regions of interest. Some camera manufacturers already offer custom camera designs with custom in-camera processing functions. Camera output bandwidth is one of the main factors limiting camera frame rates. When real-time output is not required, production cameras already exist that provide 1.4 million fps, and research prototypes have achieved 6 million fps. The visual MIMO concept also opens the door to exploring more complex image analysis, such as super-resolution as opposed to deconvolution, and programmable aperture photography, in which the camera aperture is modified during image capture to obtain improved resolution from multiple exposures. Exploring modulation and coding schemes that achieve a desired perceptual effect is yet another possibility.
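
As a rough illustration of why outputting only the regions of interest can raise the achievable frame rate, the sketch below assumes that the camera output link is the only bottleneck and divides the link bandwidth by the number of bits per frame. The function name, the 1 Gbit/s link, and the frame sizes are hypothetical values chosen for illustration; they are not measurements from this project.

```python
def max_frame_rate(link_bits_per_s, width_px, height_px, bits_per_px):
    """Upper bound on frames per second if the output link is the only bottleneck."""
    bits_per_frame = width_px * height_px * bits_per_px
    return link_bits_per_s / bits_per_frame


# Hypothetical numbers, for illustration only.
LINK_BITS_PER_S = 1_000_000_000  # assumed 1 Gbit/s camera output interface

# Full 1000 x 1000 frame at 8 bits per pixel.
full_frame_fps = max_frame_rate(LINK_BITS_PER_S, 1000, 1000, 8)

# 100 x 100 region of interest around the light emitting array.
roi_fps = max_frame_rate(LINK_BITS_PER_S, 100, 100, 8)

print(full_frame_fps, roi_fps)
```

Under these assumed numbers, shrinking the output from the full frame to the region of interest raises the bound from 125 fps to 12,500 fps, a factor equal to the reduction in pixel count.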

Technical Approach:

This project proposes to design and develop a network stack for visual MIMO networking. Realizing this vision presents several research challenges and opportunities across the layers of the network stack. At the physical layer, as opposed to traditional baseband signal processing, visual MIMO requires computer vision and image analysis techniques to acquire and track signals from a transmitter as they are captured by different photodiodes (pixels) during movement, thereby opening up many avenues for interdisciplinary research. It will benefit from novel modulation and coding techniques that are robust to perspective distortions and partial occlusions. At the link layer, visual MIMO relies on vision-aware error detection, ARQ, and rate adaptation techniques. For example, perspective distortions can be expected to lead to different bit error rates in different areas of a camera image. Link layer mechanisms should therefore compensate by using different data rates in different areas of the image or by selectively retransmitting parts of an image. At the MAC layer, there is little need for coordinated channel access because collisions are extremely rare; instead, coordination is needed because of the synchronization limitations of the camera receivers. Visual MIMO also allows novel forms of multi-path routing and visual localization, and presents novel energy management challenges. This project brings together an interdisciplinary team with expertise in communications theory, computer vision, and location-aware networks to address the fundamental challenges that this approach poses.
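
The idea of selectively retransmitting parts of an image can be sketched as follows. This is a minimal illustration, not the project's protocol: it assumes the image is divided into regions, each carrying its own payload and a CRC-32 checksum, so that the receiver can report which regions failed. The function names and the region payloads are hypothetical.

```python
import zlib


def make_blocks(payloads):
    """Pair each per-region payload (bytes) with its CRC-32 checksum."""
    return [(p, zlib.crc32(p)) for p in payloads]


def failed_regions(received):
    """Return the indices of regions whose payload no longer matches its CRC-32."""
    return [i for i, (p, crc) in enumerate(received) if zlib.crc32(p) != crc]


# Hypothetical example: three image regions, one of which is corrupted in transit.
sent = make_blocks([b"region0", b"region1", b"region2"])
received = list(sent)
received[1] = (b"regionX", sent[1][1])  # payload damaged, checksum unchanged

to_retransmit = failed_regions(received)
print(to_retransmit)
```

Only the regions listed in `to_retransmit` would need to be sent again, rather than the whole image.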

Results To Date and Future Work Plan:

Using cameras as receivers inherently limits the achievable data rates of the system because of limited camera frame rates. Although photodetectors may seem to be viable candidates for the receiver, our analytical results show that a camera receiver can outperform a photodetector in achievable data rate at medium to long ranges. To reach such ranges, conventional optical communication systems would require very high input power; we instead leverage the fact that communication between an array of light emitters and a camera is analogous to a MIMO system and hence can deliver multiplexing and diversity gains in the data rate. Based on the MIMO model we derived for the visual MIMO channel, we observed atypical behavior: the multiplexing and diversity gains trade off with distance rather than with one another as in RF. At short distances it is possible to multiplex data over multiple light emitting elements of an array; as distance increases, signals from multiple light emitting elements can be combined at the receiver to achieve a diversity gain in the SNR and hence in the data rate. In visual MIMO, such signal enhancements and gains depend strongly on the receiver perspective and on channel distortions such as occlusion, image noise, and blur, which can degrade the quality of the optical link.
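
One plausible geometric reading of this distance dependence can be sketched with a simple pinhole camera model. This is an illustrative assumption, not the MIMO channel model derived in the project: when adjacent emitters project onto separate pixels they can carry independent streams, and when they merge onto the same pixel their signals can only be combined. The function names, the one-pixel threshold, and all numeric values are hypothetical.

```python
def led_separation_px(focal_length_m, led_spacing_m, distance_m, pixel_pitch_m):
    """Separation of two adjacent emitters on the image sensor, in pixels (pinhole model)."""
    return focal_length_m * led_spacing_m / (distance_m * pixel_pitch_m)


def operating_mode(separation_px, min_separation_px=1.0):
    """Assumed rule: resolvable emitters allow multiplexing, merged emitters allow diversity."""
    return "multiplexing" if separation_px >= min_separation_px else "diversity"


# Hypothetical setup: 8 mm lens, 2 cm emitter spacing, 5 micrometre pixels.
for distance_m in (10, 100):
    sep = led_separation_px(0.008, 0.02, distance_m, 5e-6)
    print(distance_m, round(sep, 2), operating_mode(sep))
```

With these assumed values the emitters are about 3.2 pixels apart at 10 m and about 0.32 pixels apart at 100 m, so the sketch selects multiplexing at the shorter range and diversity at the longer one.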



We plan to test these proposed algorithms and system designs in a real-world application setting, for which we have developed a basic prototype of a visual MIMO communication system applied to vehicle-to-vehicle (V2V) communication. Our V2V demo comprises an LED array, standing in for the brake lights of a car, controlled by a microcontroller interfaced to a PC. When triggered by a user (the brake is pressed), it transmits the brake pedal intensity information as ON-OFF pulses (ON = bit 1, OFF = bit 0). A high-speed camera captures a temporal sequence of image frames of the array, which are individually processed and sequentially decoded to retrieve the data, and the result is displayed on a receiver computer screen (the car behind).
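
The ON-OFF signalling described above can be sketched as follows. This is a minimal illustration rather than the demo's actual code: it assumes the brake pedal intensity is an 8-bit integer, that one bit is sent per camera frame, that the receiver is already frame-synchronized, and that each frame has been reduced to a single brightness sample for the LED array. The function names, the threshold, and the brightness values are hypothetical.

```python
def encode_ook(value, n_bits=8):
    """Map an integer to a list of ON (1) / OFF (0) states, most significant bit first."""
    return [(value >> i) & 1 for i in reversed(range(n_bits))]


def decode_ook(brightness_samples, threshold=128):
    """Threshold one brightness sample per frame and reassemble the integer."""
    value = 0
    for sample in brightness_samples:
        bit = 1 if sample >= threshold else 0
        value = (value << 1) | bit
    return value


# Hypothetical example: a brake intensity of 180 (binary 10110100).
bits = encode_ook(180)

# Assumed per-frame brightness of the array as seen by the camera (0-255 scale).
frames = [220 if b else 30 for b in bits]

decoded = decode_ook(frames)
print(bits, decoded)
```

In a real receiver the brightness sample would come from locating and tracking the array in each frame, and the threshold would have to adapt to ambient light.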

All our proposed designs so far have assumed an LED array at the transmitter. Our recent work includes designing a visual MIMO based LCD-camera communication system, primarily applied to steganography (digital signal embedding in images). For this purpose we have developed a technique called photographic steganography, an interdisciplinary approach that combines methods from computer vision (tracking, object detection, segmentation) with methods from communications (MIMO). While communication between cameras and real-world scenes already takes place through bar codes or QR codes, this work is unique in that the codes are both dynamic and invisible. Our initial prototype achieves an accuracy of 94.6% at a bit rate of approximately 2 kbps.



We are currently investigating how visual MIMO can be used as an underlying concept to communicate digital information from LED and LCD display screens to cameras in mobile devices, especially smartphones and tablets. In future work, we will seek to increase the robustness of our algorithms to different perspective and lighting conditions, improve their performance by optimizing image resolution and frame rate, and incorporate them into unicast protocols. Larger-scale proof-of-concept demonstrations are also planned.



Prof. Marco Gruteser
732-932-6857 Ext. 649

gruteser (AT) winlab (DOT) rutgers (DOT) edu

Prof. Narayan Mandayam

732-932-6857 Ext. 642

narayan (AT) winlab (DOT) rutgers (DOT) edu

Prof. Kristin Dana


kdana (AT) ece (DOT) rutgers (DOT) edu




Copyright © 2004-2012 WINLAB, Rutgers, The State University of New Jersey