INTERNSHIPS

Purdue University | Research Intern

Prof. Dharmendra Saraswat, Purdue University | May 2025 – July 2025

Key Responsibilities and Achievements

Conceived and developed an LLM-orchestrated mission execution framework for Unmanned Aerial Vehicles (UAVs), enabling translation of high-level natural language instructions into structured, executable waypoint-based flight plans.
Implemented the SAUCF (Secure, Natural-Language-Guided UAS Control Framework), integrating natural language processing, mission planning, security validation, and human-in-the-loop supervision for safe and interpretable UAV operation.
🔗 Paper: SAUCF: A Framework for Secure, Natural-Language-Guided UAS Control
🔗 Project Page: SAUCF Project Website
Integrated DJI Matrice 300 RTK with the DJI Onboard SDK (OSDK) using C++ for autonomous waypoint navigation, mission pausing, resumption, and return-to-home functionality with real-time telemetry monitoring.
Designed behavior tree–based mission planning, mapping LLM-generated high-level intents into modular, robust, and interpretable low-level drone API calls to ensure safe execution and fault tolerance.
Developed a secure, voice-interactive control interface by integrating Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) pipelines, enabling real-time verbal mission input and system feedback.
Implemented security and safety mechanisms, including operator authentication, command validation, and mission approval stages to prevent unsafe or unauthorized UAV actions.
Conducted experimental validation in simulation and real-world agricultural scenarios, evaluating mission planning latency, execution accuracy, and system robustness under varying operational conditions.
Co-authored a peer-reviewed journal publication, contributing to system architecture design, experimental analysis, result interpretation, and manuscript preparation.

Corover | Machine Learning and AI Developer Intern

Industrial Intern | Bangalore, Karnataka | May 2024 – July 2024

Key Responsibilities and Achievements

Developed a simulation of a voice-enabled UPI payment web application in Python, integrating SpeechRecognition for speech-to-text conversion, gTTS for text-to-speech synthesis, and the Eyowo API for transaction processing. Designed and implemented the complete workflow, including intent recognition, entity extraction, and action execution, to provide a voice-activated payment experience that improves accessibility and convenience for users.
Developed a bigram language model in PyTorch with a transformer architecture to generate text from pathology data (Robbins-Pathologic2005.txt). Implemented multi-head self-attention and feed-forward layers to capture complex patterns in text sequences, and carried out model training and hyperparameter tuning.
Developed a real-time face recognition system using Python, OpenCV, and the face-recognition library, integrating deep-learning face encodings, secure registration, retrieval of Aadhaar details for recognized faces, CSV-based data storage, and a Tkinter GUI. Also built a web interface using Streamlit and WebRTC for real-time video streaming and face recognition, intended for authentication on Zoom or Google Meet calls, particularly in interview-like settings.
Worked under the CTO of Corover.ai and led a team of two interns, training them in the basic skills required to build and fine-tune an LLM.
Developed a real-time text-to-speech system using the Bark framework integrated with Torch and Coqui TTS, implementing deep-learning-based voice synthesis from text. Loaded pretrained checkpoints for model evaluation and synthesized high-quality speech saved as WAV files.
Developed a multimedia processing application in Python, integrating MoviePy, SpeechRecognition, and Transformers. The application converts MP4 videos to WAV format, transcribes speech to text, summarizes the content, and performs sentiment analysis. It also generates synthesized speech from the summarized text and classifies the working environment based on the sentiment results.

Nihar Shah