Summer Internship

2023 Research Projects

Project Pre-Requisites
Distributed Data Infrastructure: This project will design and implement infrastructure capable of handling large amounts of data in a distributed environment. Systematic trading requires extensive use of data for research, and increased throughput for both writes and reads greatly benefits the downstream research process. To make matters more interesting, the data needs to be available in a cluster environment where many clients coexist across many machines. The stored data is also critical in nature, so resiliency and recovery mechanisms should be among the main considerations.
OS: Linux
Software: C/C++
Financial Data Processing: In today’s applications, data is rarely used as-is. The raw data is often dirty, full of pitfalls and nuances specific to the originating source and to the collection mechanisms that deliver the data. Identifying such issues can be more than half the battle in extracting meaningful value from the data. Once the data is cleaned and prepped, it is often transformed and put through a variety of computational workloads to produce value across the systematic trading process. Luckily, many data processing situations present embarrassingly parallel problems that can be solved with distributed processing techniques. The students will investigate faster ways to scale algorithms by using technologies including, but not limited to: 1) applications using OpenMP and MPI, 2) C/C++ with embedded AVX instructions, 3) high-level language tools such as Dask in Python. Lastly, we will also consider efficient ways to store data that will be retrieved frequently. The overhead of parsing the data should be minimized if downstream processing makes multiple passes over the data. Often, processed data is stored in binary formats that are optimized for common usage patterns in research.
OS: Linux
Software: C/C++, Python
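The point above about avoiding repeated parsing can be illustrated with a minimal sketch. It assumes a hypothetical three-field record (timestamp, price, size) and uses only Python's standard `struct` module; the field layout and function names are illustrative and are not taken from the project itself.

```python
import io
import struct

# Hypothetical fixed-width record: timestamp (int64), price (float64), size (int32).
# "<" selects little-endian byte order with no padding, so each row is 20 bytes.
RECORD = struct.Struct("<qdi")

def csv_to_records(text):
    """Parse 'timestamp,price,size' text lines once into typed tuples."""
    records = []
    for line in text.strip().splitlines():
        ts, price, size = line.split(",")
        records.append((int(ts), float(price), int(size)))
    return records

def write_binary(records, stream):
    """Pack each record into a fixed-width binary row."""
    for rec in records:
        stream.write(RECORD.pack(*rec))

def read_binary(stream):
    """Unpack fixed-width rows; no text parsing is needed on later passes."""
    return list(RECORD.iter_unpack(stream.read()))

raw = "1700000000,101.25,300\n1700000001,101.5,150\n"
records = csv_to_records(raw)      # text is parsed exactly once
buf = io.BytesIO()                 # stands in for a file on disk
write_binary(records, buf)
buf.seek(0)
restored = read_binary(buf)        # later passes read the binary rows directly
```

With a fixed-width layout, the position of any record can be computed from its index, which is one reason such formats suit repeated reads. Real research data stores typically use richer formats; this sketch only shows the basic idea.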
Low Latency Camera Feed Development: This project will measure and evaluate the latency performance of existing (integrated) IP-based cameras. Students will also research and analyze low-latency camera streaming systems for AI/ML-based object detection and tracking. They will design, develop, and implement the camera streaming system, working with the NVIDIA Xavier module and other hardware and software components to create a working low-latency streaming prototype.
OS: Linux
Software: C/C++
Machine Learning for Systematic Trading: There are countless applications of ML in the financial world. Financial ML is perhaps most prevalent in quant finance and systematic trading disciplines. Here, students will explore the use of ML in prediction problems that often occur in the world of finance. Given a pre-generated training dataset with a set of features and targets, we will test a variety of machine learning algorithms to generate out-of-sample predictions. Be warned, financial data is noisy, and it will take more than just using the algorithms from off-the-shelf libraries to achieve reasonable accuracy. Here, we will discuss pitfalls and challenges when fitting against noisy, non-stationary financial data, and come up with ways to mitigate these issues. Finally, the out-of-sample predictions will be evaluated against unseen test data and studied for their generalization properties.
OS: Linux
Software: C/C++, Python
Channel Measurement Campaign: This project will develop an SDR-based platform for channel characterization and coverage measurements on the COSMOS/ORBIT testbeds. Once the platform is developed, a series of measurement campaigns will be performed to validate its performance and accuracy. These campaigns will involve deploying the platform at different locations on the COSMOS/ORBIT testbeds, capturing wireless signals from different sources, and analyzing the data to derive insights into the wireless channel characteristics and coverage.
OS: Linux
Software: C/C++
Self-Driving Vehicular Project: This project will assemble and train miniature autonomous vehicles to run in the miniature smart city environment, using low-latency networks for vehicular control. Students will use specialized low-latency cameras and radios to operate remote model cars, and will design and implement self-driving algorithms using machine learning libraries in Python. They will design behavior that allows the vehicles to react realistically to other cars and props in the smart city environment, and work with the testbed infrastructure to use external data from the intersection to improve performance.
OS: Linux
Software: C/C++
Hive monitoring: Sensing and monitoring have the potential to drive the next agricultural revolution. This project will use Long Range (LoRa) radios to send data from sensors monitoring a beehive at the Horticultural Farm 3 on the Cook Campus. Sensors may include a camera to count bees at the entrance, a microphone to monitor internal activity, and a scale to monitor weight changes. The system will be engineered to use little energy so it can be solar powered. The data will be sent back to WINLAB over the radio using compressed representations, possibly using AI as a compression technique. In addition, the full data will be saved to local storage for additional analysis of compressive sensing algorithms.
OS: Linux
Software: C/C++
Developing a Vehicular AI Agent for Safer Smart Cities: This project will design and implement a realistic intersection simulation environment and analyze traffic data to improve its accuracy. Additionally, students will train an in-vehicle AI agent to interact with drivers and test its performance in different situations. Using a VR headset and a remote control car with a first-person view camera, students will gain insight into the capabilities and limitations of advanced AI agents.
OS: Linux
Software: C/C++
Tradeoff Analysis of Real-Time Machine Learning Models with Multi-Access Edge Computing (MEC) Assistance: The project aims to implement inference-time MEC-assisted machine learning models and evaluate tradeoffs between accuracy, latency, and bandwidth resources. The students will implement low-complexity and high-complexity ML models and evaluate how much local decision making can be made on a device versus running the computation in the edge cloud. They will also evaluate policies for reporting inferences to the edge cloud using the Age of Information metric and investigate approaches for model pruning and transfer to manage accuracy-latency tradeoffs. The project will be conducted on ORBIT with a simulated mobile device and edge cloud.
OS: Linux
Software: C/C++
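For readers unfamiliar with the Age of Information metric mentioned above, here is a minimal sketch of its usual definition: at time t, the age is t minus the generation time of the freshest update received so far. The `(generated_at, received_at)` tuple layout and the function name are assumptions made for illustration, not part of the project.

```python
def age_of_information(t, updates):
    """Return the Age of Information at time t.

    updates: iterable of (generated_at, received_at) pairs (hypothetical layout).
    The age is t minus the generation time of the freshest update that has
    been received by time t. Returns None if nothing has arrived yet.
    """
    received = [gen for gen, recv in updates if recv <= t]
    if not received:
        return None
    return t - max(received)

# Three updates generated on the device and received later at the edge cloud.
updates = [(0.0, 1.0), (2.0, 2.5), (4.0, 6.0)]

aoi_before_any = age_of_information(0.5, updates)   # nothing received yet
aoi_at_3 = age_of_information(3.0, updates)         # freshest generated at 2.0
aoi_at_5 = age_of_information(5.0, updates)         # third update still in flight
aoi_at_6 = age_of_information(6.0, updates)         # third update just arrived
```

The age grows linearly between deliveries and drops when a fresher update arrives, which is why reporting policies trade bandwidth against staleness.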
First Person View Self Driving Car: This project will use specialized low-latency cameras and radios to operate remote model cars. Students will control their cars over these low-latency radio connections and evaluate how controllable the cars are inside a model city environment.
OS: Linux
Software: C/C++
Smart Intersection Situational Awareness: This project aims to utilize data inputs from a variety of sensors, including lidars, 2D and 3D cameras, and other relevant sensors, to create a 3D point cloud of the intersection area. This will involve collecting data from the sensors at regular intervals and processing the data to generate a comprehensive 3D map of the area.
OS: Linux
Software: C/C++
Neural Networks For Feature Analysis: Neural networks have a long history of being used for classification and, more recently, content generation. Example classifiers include image classification between dogs and cats and text sentiment classification. Example generative networks include those for human faces, images, and text. Rather than classification or generation, this work explores using networks for feature analysis. Intuitively, features are the high-level patterns that distinguish data, such as text and images, into different classes. We will explore several datasets, including driving, insect motion, and synthetic functions, to qualitatively assess the ease or difficulty of reverse-engineering the features found by the neural networks.
OS: Linux
Software: C/C++
Robotic IoT SmartSpace Testbed: This project will design and develop a remotely accessible platform for running experiments on a mobile robot equipped with an RGB camera, LiDAR camera, custom Maestro multisensing unit, and wireless radio. Students will work on a web-based interface that allows users to specify experiment parameters and send commands to the robot remotely, as well as on a framework for collecting sensory input data.
OS: Linux
Software: C/C++
Testing Vehicular AI Agents with the CARLA Simulator for Smart Cities: This project will create a realistic simulation environment using the CARLA Simulator to mimic real-world traffic scenarios. Students will design and implement scenarios incorporating multimodal data of egocentric and allocentric car views. The egocentric view data will be produced by the car, while the allocentric view data will be collected from the infrastructure RGB and LiDAR sensors available. They will develop an AI agent that can interact with human drivers, providing real-time information and alerts that enhance the driving experience. 
OS: Linux
Software: C/C++
AR Mural: This project involves developing an augmented reality (AR) based art platform that allows users to contribute their artwork, including paintings, photos, videos, and sculptures, to a set of locations. The 3D sculpture building feature will enable users to create and contribute to 3D sculptures that will be shown simultaneously across multiple locations. The team will need to develop the necessary tools and algorithms to enable collaborative sculpting, as well as implement the required backend infrastructure and communication protocols to ensure that the art can be displayed in real-time across multiple locations.
OS: Windows, Linux
Software: Unity, C#, C/C++
Tiny machine learning (TinyML) on the MCU: This project aims to explore the design and deployment of very small AI models on resource-limited microcontrollers, simultaneously achieving high accuracy and low power consumption.
OS: Linux
Software: C/C++
Evaluating 5G/6G Wireless: 5G-and-beyond networks will utilize data transmissions at millimeter-wave (mmWave) and terahertz (THz) frequencies to improve data throughput and wireless spectrum utilization. In collaboration with Nokia Bell Labs, we will evaluate 5G/6G wireless at a variety of locations within the Columbia campus using a channel sounder. Students working on this project will be responsible for collecting data using the channel sounder and potentially helping analyze the results.
OS: Linux
Software: C/C++
Security in Artificial Intelligence: Artificial intelligence techniques have been widely integrated into mobile and IoT devices, enabling various functionalities based on vision and audio (e.g., face recognition, speech recognition, and speaker identification). The extended pipeline of building deep neural networks (DNNs) produces new attack surfaces, such as attacks during the data collection, model training, and model update stages. Recent studies have discovered an effective yet stealthy attack, called a backdoor attack, which trains a hidden trigger pattern into the DNN. A backdoored DNN will misclassify an input as an adversary-specified label if the trigger pattern appears, while it behaves normally in the absence of the trigger, making the attack difficult to detect. Backdoor attacks were originally discovered in the image domain, and recent studies have started investigating audio-domain backdoor attacks (e.g., against voice assistant systems). This project aims to study vulnerabilities to backdoor attacks in both image and audio domains and develop techniques for attack mitigation.
OS: Linux
Software: C/C++
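To make the trigger-pattern idea above concrete, here is a minimal sketch of how a poisoned training set for studying backdoors might be constructed. It assumes one simple formulation, a small bright patch stamped in the image corner with the label switched to the adversary's target; the function names, patch shape, and poisoning fraction are all hypothetical and are not the project's actual setup.

```python
import random

def stamp_trigger(image, patch_value=255, patch_size=2):
    """Return a copy of a 2D grayscale image with a square patch in the bottom-right corner."""
    stamped = [row[:] for row in image]
    height, width = len(stamped), len(stamped[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = patch_value
    return stamped

def poison_dataset(images, labels, target_label, fraction, seed=0):
    """Stamp the trigger on a random fraction of samples and relabel them.

    Returns new image and label lists plus the set of poisoned indices;
    the input lists are left unmodified.
    """
    rng = random.Random(seed)
    n_poison = int(len(images) * fraction)
    chosen = set(rng.sample(range(len(images)), n_poison))
    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            out_images.append(stamp_trigger(img))
            out_labels.append(target_label)
        else:
            out_images.append([row[:] for row in img])
            out_labels.append(lab)
    return out_images, out_labels, chosen

# Ten blank 4x4 images, all of class 0; poison 30% toward hypothetical target class 7.
images = [[[0] * 4 for _ in range(4)] for _ in range(10)]
labels = [0] * 10
p_images, p_labels, poisoned_idx = poison_dataset(images, labels, target_label=7, fraction=0.3)
```

A model trained on such data may learn to associate the patch with the target label. Mitigation work would then look for samples or model behaviors that depend on the patch.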
Machine learning-based Robotic Motion Planning: This project aims to explore and deploy efficient machine learning algorithms for motion planning tasks in robotic systems.
OS: Linux
Software: C/C++
Full-duplex wireless – cross layer design: Full-Duplex (FD) wireless technology allows for simultaneous transmission and reception on the same frequency channel, a more spectrum-efficient communication paradigm than the current half-duplex architecture used in all modern wireless systems. This interdisciplinary project directly addresses important cross-layer challenges stemming from novel small-form-factor FD transceiver implementations. In this project, students will explore FD transceiver and algorithm development, familiarizing themselves with the fundamentals of FD operation at the node, link, and network levels.
OS: Linux
Software: C/C++
 

2024 Summer Internship Dates

Applications Due: April 14
Notifications: April 28
Internship Starts: May 28
Internship Ends: Aug 7

Project Pages

Past Research Topics