Mahesh Raj — Portfolio

Experience

SDE Intern — Finance Automation

Amazon — May 2024 – July 2024 · Bangalore, India

Built a VQA + field-value extraction proof‑of‑concept for tax documents using LLMs/VLMs (Python, PyTorch, Bedrock, Textract, LoRA/QLoRA).
Improved accuracy from 80–85% to 99% overall, with 97% on critical fields.

Finance‑Document VQA and Field‑Value Extraction POC

Tech: Python, PyTorch, Amazon Bedrock, SageMaker, Textract, LoRA/QLoRA, Knowledge Distillation, Quantization

Built a prototype for Visual Question Answering (VQA) and field‑value extraction on tax documents using LLMs/VLMs.
Boosted accuracy from 80–85% to 99% overall, with 97% success on critical fields.
Worked hands‑on with prompt engineering (zero‑/few‑shot, chain‑of‑thought), parameter‑efficient fine‑tuning with LoRA/QLoRA, knowledge distillation, and quantization.
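
For readers unfamiliar with LoRA, the idea can be sketched in a few lines of plain Python. This is an illustration only, not the internship code: the matrix sizes, the values, and the helper names `matmul` and `lora_weight` are all hypothetical. The frozen weight W is adjusted by a low‑rank product B·A scaled by alpha / r, and only A and B are trained.

```python
# Illustrative LoRA weight merge in plain Python (no framework needed).
# All names and sizes here are hypothetical, chosen for the example.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank.

    w: frozen base weight, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    r = len(a)
    delta = matmul(b, a)  # low-rank update with the same shape as w
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # frozen (2, 3) weight
a = [[0.1, 0.2, 0.3]]                   # rank r = 1
b_zero = [[0.0], [0.0]]                 # B starts at zero, so the update starts at zero
b_trained = [[1.0], [2.0]]              # made-up values standing in for a trained B

merged_at_init = lora_weight(w, a, b_zero, alpha=2.0)
merged_after_training = lora_weight(w, a, b_trained, alpha=2.0)
```

With B initialised to zero the merged weight equals the base weight, so fine‑tuning starts from the pretrained model's behaviour.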

Data Pipeline Monitoring System

Tech: AWS Lambda, Athena, DynamoDB, S3, EventBridge, CloudWatch, Secrets Manager, SQS/SNS, Docker

Designed a scheduled monitoring service that queried multiple data sources to check for missing/late records.
Implemented automated alerting using CloudWatch metrics, dashboards, and alarms.
Improved reliability and data integrity across finance automation pipelines.
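
The core check behind such a monitor can be sketched as below. This is a simplified illustration, not the deployed service: the source names, the one‑hour threshold, and the function `classify_sources` are hypothetical, and the scheduling and CloudWatch alerting around the check are omitted.

```python
from datetime import datetime, timedelta, timezone

def classify_sources(expected, last_seen, now, max_delay):
    """Label each expected source as 'ok', 'late', or 'missing'.

    expected: iterable of source names that should be reporting
    last_seen: dict mapping source name -> datetime of its newest record
    now: current time as a datetime
    max_delay: timedelta after which a source counts as late
    """
    status = {}
    for source in expected:
        seen = last_seen.get(source)
        if seen is None:
            status[source] = "missing"   # no record has arrived at all
        elif now - seen > max_delay:
            status[source] = "late"      # newest record is older than allowed
        else:
            status[source] = "ok"
    return status

# Hypothetical sources and timestamps, for illustration only.
now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "ledger": now - timedelta(minutes=10),
    "invoices": now - timedelta(hours=3),
}
report = classify_sources(
    ["ledger", "invoices", "payroll"], last_seen, now, timedelta(hours=1)
)
```

In a setup like the one described, each non‑"ok" status would typically be published as a metric so that an alarm can fire on it.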

Research

Estimating Soil Moisture from Satellite Data

Amrita Vishwa Vidyapeetham × INRAE (France) — Mar 2024 – Jul 2025

Remote sensing and ML for predicting soil moisture and studying plant life cycles.
Satellite data processing with PyTorch/TensorFlow.

Summary

Tech: PyTorch, TensorFlow, Excel, Pandas, Matplotlib

Collaborated with INRAE, France under the guidance of Dr. Amit Agarwal (TIFAC‑CORE in Cybersecurity).
Conducted research spanning remote sensing, agriculture, and machine learning.
Applied satellite data processing and ML techniques to soil‑moisture prediction and plant life‑cycle problems.

Text‑Prompted 3D Mesh Character Animation using GNNs & Diffusion

Amrita Vishwa Vidyapeetham — Nov 2024 – Aug 2025

Latent graph diffusion to handle varying mesh topologies.
Pipeline with GNN autoencoders + diffusion for text‑prompted 3D mesh generation.

Summary

Tech: PyTorch, PyTorch Geometric, Trimesh

Contributed to 3D mesh generation using GNNs and diffusion models (professional research elective).
Developed a latent graph diffusion model to handle varying mesh topologies.
Designed a pipeline integrating GNN autoencoders with diffusion models for text‑prompted 3D mesh generation.
Explored applications in animation and game development by creating a generalizable approach to dynamic mesh generation.

Lightweight Student Network for nnU‑Net (in progress)

Amrita Vishwa Vidyapeetham — Sep 2025 – Present

Multi‑stage compression with feature + soft‑label distillation and deep supervision.
Target: clinically deployable nnU‑Net with reduced parameters/memory/latency.

Summary

Tech: PyTorch, nnU‑Net, Knowledge Distillation, Deep Supervision

Developing a multi‑stage model compression pipeline for nnU‑Net to reduce parameters, memory, and latency while maintaining performance.
Employing a multi‑phase knowledge distillation strategy: first at the feature level, then at the soft‑label level, guided by deep supervision.
Designing preprocessing that emphasizes contrast‑enhanced regions in DCE‑MRI scans to narrow the search space and simplify learning.
Goal: a compact, efficient nnU‑Net suitable for clinical deployment in low‑resource environments without significant accuracy trade‑offs.
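
The two distillation stages can be illustrated with toy loss functions. This is a sketch under assumptions, since the project is in progress and the exact losses are not stated here: it uses mean squared error for the feature stage and a temperature‑scaled KL divergence for the soft‑label stage, both common choices. The function names and inputs are hypothetical, and real nnU‑Net outputs would be per‑voxel tensors rather than short lists.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def feature_loss(student_feat, teacher_feat):
    """Stage 1 (assumed): mean squared error between matching feature vectors."""
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / len(student_feat)

def soft_label_loss(student_logits, teacher_logits, temperature=2.0):
    """Stage 2 (assumed): KL(teacher || student) on softened outputs, times T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Toy values: the loss is zero when the student matches the teacher exactly.
same = soft_label_loss([1.0, 2.0, 0.5], [1.0, 2.0, 0.5])
different = soft_label_loss([0.0, 1.0, 0.0], [2.0, 0.0, 0.0])
```

With deep supervision, losses like these would be evaluated at several decoder resolutions and summed with per‑level weights.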

Final Year Project — Self‑Driving Cars with Small Language Models

Amrita Vishwa Vidyapeetham — Sep 2025 – Present

Developing lightweight autonomous driving solutions using Qwen‑0.5B with multimodal encoders (LiDAR + cameras).
Designing pipelines for real‑time waypoint prediction, scene understanding, and object detection on edge hardware.

Summary

Tech: Qwen‑0.5B LLM, Multimodal Encoders (LiDAR + Multi‑axis Camera), Edge AI

Experimenting with edge‑focused autonomous driving using a lightweight Qwen‑0.5B decoder paired with encoders for multimodal inputs.
Designing pipelines for real‑time waypoint prediction, scene understanding, and object detection from LiDAR and camera data.
Exploring two architectures:
- Parallel multi‑encoder: separate encoders for LiDAR and multi‑axis camera integrated via the decoder.
- Single fusion encoder: unified encoder for LiDAR + camera (RGB‑point cloud) to reduce compute and speed up inference.
Implementing a safety system that predicts future states of surrounding objects so generated waypoints can be checked for safety before execution.
Optimizing for local, real‑time inference to enable closed‑loop autonomy on resource‑constrained edge hardware.
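
A waypoint safety check of this kind can be sketched as follows. This is an illustration under a strong assumption: it predicts object motion with a constant‑velocity model, which may not be the predictor the project ends up using. The field names, the time step, and the minimum gap are hypothetical.

```python
import math

def predict_position(obj, t):
    """Constant-velocity prediction of an object's (x, y) after t seconds."""
    return (obj["x"] + obj["vx"] * t, obj["y"] + obj["vy"] * t)

def waypoints_are_safe(waypoints, objects, dt, min_gap):
    """Return True if every waypoint stays at least min_gap metres away
    from every object's predicted position at the time it is reached.

    waypoints: list of (x, y); the i-th is reached at time (i + 1) * dt
    objects: list of dicts with keys x, y, vx, vy
    """
    for i, (wx, wy) in enumerate(waypoints):
        t = (i + 1) * dt
        for obj in objects:
            ox, oy = predict_position(obj, t)
            if math.hypot(wx - ox, wy - oy) < min_gap:
                return False  # predicted conflict at this waypoint
    return True

# Toy scene: the ego vehicle drives straight ahead along the y axis.
path = [(0.0, 5.0), (0.0, 10.0)]
parked_car = {"x": 10.0, "y": 10.0, "vx": 0.0, "vy": 0.0}
crossing_car = {"x": 10.0, "y": 10.0, "vx": -5.0, "vy": 0.0}

safe_with_parked = waypoints_are_safe(path, [parked_car], dt=1.0, min_gap=2.0)
safe_with_crossing = waypoints_are_safe(path, [crossing_car], dt=1.0, min_gap=2.0)
```

The crossing car reaches the second waypoint at the same moment as the ego vehicle, so that path is rejected while the parked‑car case passes.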

Projects

Deep Fake Detection

Jan 2024 – Apr 2024

Developed a video deepfake detection system that runs multiple detectors, each using a different detection method.
Built an ensemble framework in which a meta‑model combines the detectors' outputs based on each detector's historical performance.

Summary

Tech: PyTorch, OpenCV

Developed a video deepfake detection system that runs multiple detectors, each using a different detection method.
Built an ensemble framework in which a meta‑model combines the detectors’ outputs based on each detector’s historical performance.
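
The idea of weighting detectors by track record can be illustrated with a fixed rule. This is a simplified stand‑in, not the project's meta‑model: a trained meta‑model would learn how to combine outputs, whereas this sketch weights each detector by its accuracy above chance. The detector names and numbers are hypothetical.

```python
def combine_scores(scores, past_accuracy):
    """Weighted average of per-detector fake probabilities.

    scores: dict detector name -> probability the video is fake (0..1)
    past_accuracy: dict detector name -> historical accuracy (0..1)
    Weights are proportional to accuracy above chance (0.5), floored at zero.
    """
    weights = {name: max(past_accuracy[name] - 0.5, 0.0) for name in scores}
    total = sum(weights.values())
    if total == 0:
        # No detector beats chance: fall back to a plain average.
        return sum(scores.values()) / len(scores)
    return sum(scores[name] * weights[name] for name in scores) / total

# Hypothetical detectors: the historically stronger one dominates the verdict.
scores = {"frequency_artifacts": 0.9, "blink_rate": 0.2}
past_accuracy = {"frequency_artifacts": 0.9, "blink_rate": 0.6}
fake_probability = combine_scores(scores, past_accuracy)
```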

DDPM Image Generation

2024

Implemented a Denoising Diffusion Probabilistic Model (DDPM) for image synthesis as part of deep learning coursework.
Showcased diffusion‑based generative modeling and its applications in media synthesis and AI ethics demonstrations.

Summary

Tech: PyTorch, Denoising Diffusion Probabilistic Models (DDPMs)

Implemented a DDPM‑based generative model for image synthesis as part of deep learning coursework.
Demonstrated the techniques behind deepfake generation by fine‑tuning on a handful of target images to produce realistic samples.
Showcased diffusion‑based generative modeling and its applications in media synthesis and AI ethics demonstrations.
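
The forward (noising) half of a DDPM can be written in a few lines. This is a minimal sketch rather than the coursework code: it assumes the linear beta schedule from the original DDPM setup (1e‑4 to 0.02), treats an image as a flat list of pixel values, and leaves out the denoising network entirely.

```python
import math
import random

def linear_betas(steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances; expects steps >= 2."""
    step = (beta_end - beta_start) / (steps - 1)
    return [beta_start + i * step for i in range(steps)]

def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one value per timestep."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def q_sample(x0, t, abar, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I).

    Returns the noisy sample and the noise that was added.
    """
    signal = math.sqrt(abar[t])
    noise_scale = math.sqrt(1.0 - abar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + noise_scale * e for x, e in zip(x0, noise)]
    return xt, noise

betas = linear_betas(1000)
abar = alpha_bars(betas)
rng = random.Random(0)            # seeded for repeatability
x0 = [1.0, -1.0, 0.5]             # toy "image" with three pixels
xt, noise = q_sample(x0, 500, abar, rng)
```

During training, the network is given `xt` and the timestep and learns to predict `noise`; sampling then runs the process in reverse, starting from pure noise.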

Adobe India Hackathon — Team Starks (Connecting the Dots)

Jan 2025

Built a lightweight CPU‑only offline system to transform static PDFs into dynamic, structured, persona‑aware knowledge artifacts.
Integrated quantized language models with object detection and semantic search to enable retrieval and summarization.

Summary

Hackathon Project / Tech: Qwen2.5‑0.5B (Int8, llama.cpp), YOLOv8n

Built a lightweight, CPU‑only offline system to transform static PDFs into dynamic, structured, persona‑aware knowledge artifacts.
Designed a layout‑aware Small Language Model (SLM) using Qwen2.5‑0.5B (Int8 quantized) on llama.cpp for efficient low‑resource inference.
Integrated two YOLOv8n models (one distilled from PP‑DocLayout‑L, plus a custom outline detector), SentenceTransformers for semantic search, and K‑means clustering for hierarchical structuring (H1/H2/H3).
Enabled semantic retrieval + summarization by ranking the top 5 relevant sections in embedding space and summarizing with the SLM.
Optimized pipeline to meet hackathon size limits (200 MB Round 1A, 1 GB Round 1B); processed 10–15 docs of 50 pages each in under one minute.
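
The top‑5 ranking step can be sketched as below. This is an illustration, not the hackathon code: it assumes cosine similarity as the ranking measure, and the section titles and two‑dimensional vectors are made up. In the real pipeline the vectors would come from a SentenceTransformers model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_sections(query_vec, section_vecs, k=5):
    """Return the k section titles most similar to the query, best first.

    section_vecs: dict mapping section title -> embedding vector
    """
    ranked = sorted(
        section_vecs,
        key=lambda title: cosine(query_vec, section_vecs[title]),
        reverse=True,
    )
    return ranked[:k]

# Toy 2-D embeddings standing in for real sentence embeddings.
sections = {
    "Budget overview": [1.0, 0.0],
    "Travel tips": [0.0, 1.0],
    "Cost of travel": [1.0, 1.0],
}
query = [1.0, 0.1]  # hypothetical persona query, close to "Budget overview"
best = top_k_sections(query, sections, k=2)
```

The selected sections would then be passed to the SLM for summarization.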

Fire Fighting Drone for Early Forest Fire Detection and Extinguishment

2018

Developed a drone capable of early forest fire detection and suppression with integrated surveillance and rapid response mechanisms.
Won multiple awards: CBSE Science Fair State Finalist, PPTIA Innovation Award National Finalist (Top 10), First Prize at Shastra Science Fair.

Summary

School Project / Tech: Drones, Sensors, Fire Suppression Systems

Developed a drone for early forest fire detection and suppression with integrated surveillance and rapid response mechanisms.
Achievements: CBSE Science Fair State Finalist, PPTIA Innovation Award National Finalist (Top 10), First Prize at Shastra Science Fair.

Skills

Python, Java, C++, C, PyTorch, PyTorch3D, TensorFlow, Scikit‑Learn, NumPy, Pandas, Matplotlib, Seaborn, MediaPipe, OpenCV, AWS, Prompt Engineering, Model Fine‑tuning, Quantization, Knowledge Distillation, CUDA

Education

B.Tech in Computer Science & Engineering

Amrita Vishwa Vidyapeetham, Coimbatore — 2022–2026

CGPA: 8.41 (till 6th semester)

Senior Secondary (CBSE)

St. Peter’s Senior Secondary School — 2022

Percentage: 86.2%

Secondary (CBSE)

St. Peter’s Senior Secondary School — 2020

Percentage: 89.8%

Hi, I’m Mahesh, a CSE undergrad passionate about deep learning, computer vision, and building practical AI systems.
