Kaustubh Sharma
Core Member, DSG IIT Roorkee
Education
- B.Tech, Electrical Engineering, IIT Roorkee (2023–2027)
Research Interests
- Diffusion Models
- Mechanistic Interpretability in LLMs
- Multimodal Learning
Projects
- Image-Alchemy: Personalized Text-to-Image Generation
  Developed a two-stage pipeline using LoRA-tuned SDXL and segmentation-driven Img2Img to enhance subject fidelity. Achieved a DINO similarity score of 0.789. (ICLR DeLTa Workshop 2025)
- AI Image Classification & Artifact Detection
  Created a dataset and trained ConvNeXt to differentiate real vs. AI-generated images. Used adversarial training, CLIP embeddings, and diffusion-based upscaling. (Inter IIT Tech Meet 13.0, IIT Bombay, 2024)
- Multimodal Emotion Recognition
  Built a model combining 3D ResNet-18 (video) and HuBERT Transformer (audio) with self- and cross-attention, achieving 98.47% accuracy on CREMA-D and 95.65% on RAVDESS. (IIT Roorkee, 2025)
- NeurIPS Ariel Data Challenge
  Conducted EDA and transit depth analysis on exoplanet data using ML techniques. Ranked 227th globally on Kaggle. (NeurIPS 2024)
Publications
- Sharma, K., Puniani, C., Nema, O., & Tiwari, A. (2025). Image-Alchemy: Advancing Subject Fidelity in Personalized Text-to-Image Generation. ICLR DeLTa Workshop.
Roles at DSG
- Co-lead, BH’25
- Project lead, Personalized Diffusion