LLM Inference Optimization | Multimodal Research & Safety
I am a Machine Learning Engineer and Researcher working at the intersection of Applied Research and Implementation. My core focus is on LLM Inference Optimization, Quantization, and AI Safety Mechanisms.
I bridge the gap between "State-of-the-Art" and "Scale," building systems that are not only novel but also robust and deployable. Currently, I work full-time as a Machine Learning Engineer at Techolution, where I lead the development of Embodied AI systems that achieve 97% task accuracy and 2 mm precision in dynamic environments.
In parallel, I collaborate with the AI Institute of South Carolina (AIISC) as a Contributing Researcher, focusing on AI integrity layers (Watermarking, Hallucination Mitigation) for Generative AI.
My research on "Safety-by-Design" architectures is conducted under the guidance of Dr. Amitava Das (BITS Pilani/AIISC). I am fortunate to be mentored by, and to collaborate with, leading scientists including Vasu Sharma (Meta AI/FAIR), Aman Chadha (Apple/Stanford AI), and Vinija Jain (Meta/Amazon).
- [Feb 2025] Workshop Organizer: Serving as an Associate Organizer for the Defactify 4.0 Workshop at AAAI 2025.
- [Jan 2025] New Pre-print: Released PECCAVI, a novel watermarking technique for AI-generated images, co-authored with mentors from Meta and Apple.
- [2024] Journal Publication: WaveFormer published in Ocean Engineering (Q1 Journal). Proposed a Transformer-based architecture for long-term time-series forecasting.
- [2024] Production Deployment: Deployed end-to-end Agentic Systems using LangGraph and CrewAI for autonomous industrial tasks at Techolution.
- Peccavi: Visual Paraphrase Attack Safe and Distortion Free Image Watermarking (2025)
  - Shreyas Dixit, Ashhar Aziz, Shashwat Bajpai, Vasu Sharma (Meta), Aman Chadha (Apple), Vinija Jain (Meta), Amitava Das.
  - Proposed a robust watermarking technique resistant to visual paraphrase attacks.
- WaveFormer: Lag Removing Univariate Long Time Series Forecasting Transformer (2024)
  - Shreyas Dixit, Pradnya Dixit.
  - Published in Ocean Engineering (Elsevier).
- Rethinking Data Integrity in Federated Learning: Are we ready? (2022)
  - IEEE International WIE Conference on Electrical and Computer Engineering.
- Patent #1: "Real-Time MultiModal Video Narration Platform for Visually Impaired People" (2023).
- Patent #2: "Assistance Platform for Visually Impaired Person Using Image Captioning" (Indian Patent).
I specialize in optimizing inference pipelines and building forensic layers for AI.
| Domain | Technologies |
|---|---|
| Inference Optimization | vLLM, TensorRT-LLM, Triton Inference Server, TorchServe, Quantization (AWQ/GPTQ) |
| Agentic AI | LangGraph, CrewAI, Model Context Protocol (MCP), AutoGen |
| Computer Vision | PyTorch, OpenCV, YOLO, NVIDIA Isaac Sim, 6D Pose Estimation |
| Infrastructure | Docker, Kubernetes, GCP, VectorDBs (Milvus/Pinecone), FastAPI |
- Research: A multimodal pipeline aligning visual encoders with audio generation modules to create "Safety-by-Design" accessibility tools.
- Stack: PyTorch, Diffusers, CLAP/CLIP Embeddings.
- Research: Optimized Transformer architectures for English-to-Hindi translation, focusing on efficient tokenization for low-resource languages.
- Stack: HuggingFace Transformers, PyTorch.
- Research: A from-scratch PyTorch implementation of the BART architecture for Masked Language Modeling (MLM) on mixed-script datasets.
I am actively seeking high-impact roles as a Senior ML Engineer, AI Architect, or Inference Engineer. If you are building production-grade LLM pipelines or working on efficient inference, I would love to connect.
🌐 Portfolio | 🐦 Twitter | 💼 LinkedIn
"The best way to predict the future is to invent it." – Alan Kay


