Kaustubh Sharma

Core Member, DSG IIT Roorkee


Education

  • B.Tech, Electrical Engineering, IIT Roorkee (2023–2027)

Research Interests

  • Diffusion Models
  • Mechanistic Interpretability in LLMs
  • Multimodal Learning

Projects

  • Image-Alchemy: Personalized Text-to-Image Generation
    Developed a two-stage pipeline using LoRA-tuned SDXL and segmentation-driven Img2Img to enhance subject fidelity. Achieved a DINO similarity score of 0.789. (ICLR DeLTa Workshop 2025)

  • AI Image Classification & Artifact Detection
    Created a dataset and trained ConvNeXt to distinguish real from AI-generated images. Used adversarial training, CLIP embeddings, and diffusion-based upscaling. (Inter IIT Tech Meet 13.0, IIT Bombay, 2024)

  • Multimodal Emotion Recognition
    Built a model combining 3D ResNet-18 (video) and a HuBERT Transformer (audio) with self- and cross-attention, achieving 98.47% accuracy on CREMA-D and 95.65% on RAVDESS. (IIT Roorkee, 2025)

  • NeurIPS Ariel Data Challenge
    Conducted exploratory data analysis (EDA) and transit depth analysis on exoplanet data using ML techniques. Ranked 227th globally on Kaggle. (NeurIPS 2024)


Publications

  • Sharma, K., Puniani, C., Nema, O., & Tiwari, A. (2025). Image-Alchemy: Advancing Subject Fidelity in Personalized Text-to-Image Generation. ICLR DeLTa Workshop.

Roles at DSG

  • Co-lead, BH’25
  • Project lead, Personalized Diffusion

