Week 07 – Embedding Paper Techniques into Models

Dates: 2025-07-15 – 2025-07-21 Internship: AI/ML Intern at SynerSense Pvt. Ltd. Mentor: Praveen Kulkarni Sir

Focus

This week was dedicated to applying the learnings from research papers into our internal pipeline. We experimented with model slicing, feature transfer, and designing modular networks.

Goals for the Week

Adapt pretrained CLIP visual encoder with custom classifier
Test inference pipeline with sample dataset
Build scripts to visualize intermediate embeddings
Clean up the learnings section with proper structure

Tasks Completed

Task	Status	Notes
Integrated CLIP vision encoder with MLP head	✅ Completed	Used frozen encoder; trained classifier from scratch
Evaluated classification pipeline on sample images	✅ Completed	Achieved stable training with low loss (~0.87)
Documented working code for VLM to MLP architecture	✅ Completed	Recorded notebook in private repo
Added two research summaries to Learnings section	✅ Completed	Each paper has callouts, code examples, and reflection
NDA compliance maintained throughout documentation	✅ Completed	Only shared allowed portions publicly

Key Learnings

Pretrained vision encoders can significantly boost convergence
Modular design helps in attaching heads, slicing blocks flexibly
Accurate logs and markdown docs improve reproducibility
Reviewing loss trends helps in identifying class imbalance

Problems Faced & Solutions

Problem	Solution
Feature mismatch from encoder	Used `.pooler_output` and reshaped inputs
Classifier not converging initially	Tuned learning rate; applied CrossEntropyLoss
Keeping learnings under NDA	Used tags and disclaimers in markdown docs

📎 References

Goals for Next Week

Finalize slicing strategy for full VLM to vision module
Extend MLP head with dropout, batchnorm
Prepare a sharable demo (NDA-compliant)

“Week 7 was all about bridging theory and practice — papers to working code. Each experiment took us closer to a modular, trainable, and explainable VLM classifier stack.”