Week 09 – Prompt Engineering for Facial Fat Analysis
Dates: July 27 – August 2
Internship: AI/ML Intern at SynerSense Pvt. Ltd.
Mentor: Praveen Kulkarni Sir
Focus
This week focused on designing robust prompts to extract structured fat prominence metrics from facial images using vision-language models.
Goals for the Week
- Define specific facial regions (R1–R9) for targeted fat/bulge analysis
- Create a consistent prompt schema to guide VLMs in quantifying fat prominence
- Format output for machine parsing in downstream analysis
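
The goals above can be sketched as a small prompt builder. This is a minimal illustration, not the prompt actually used: the function name `build_prompt`, the 0.0 to 1.0 score range, and the `R1: 0.00` line format are assumptions, and the region definitions are passed in by the caller rather than invented here.

```python
REGION_IDS = [f"R{i}" for i in range(1, 10)]  # R1..R9, as named in the goals


def build_prompt(region_definitions: dict[str, str],
                 lo: float = 0.0, hi: float = 1.0) -> str:
    """Assemble a short, declarative prompt asking for one float per region.

    The score range (lo, hi) and the line format are assumptions made for
    this sketch; substitute the real schema where it differs.
    """
    missing = [r for r in REGION_IDS if r not in region_definitions]
    if missing:
        raise ValueError(f"missing definitions for: {', '.join(missing)}")

    lines = ["Score fat prominence for each facial region defined below.",
             "",
             "Regions:"]
    lines += [f"- {r}: {region_definitions[r]}" for r in REGION_IDS]
    lines += [
        "",
        "Output rules:",
        f"- Return exactly {len(REGION_IDS)} lines, in order from R1 to R9.",
        f"- Each line has the form R1: 0.00, with a float between {lo} and {hi}.",
        "- Do not use Markdown, code fences, or any extra text.",
    ]
    return "\n".join(lines)
```

Keeping the region definitions as data and the output rules as fixed text makes it easy to revise one without touching the other.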
Tasks Completed
Task | Status | Notes |
---|---|---|
Drafted region-specific definitions for facial fat analysis | ✅ Completed | R1 to R9 covered, with anatomical clarity |
Designed prompt structure for float-based regional scoring | ✅ Completed | Included format examples and strict output rules |
Implemented Markdown-free output enforcement | ✅ Completed | Ensured model returns parsable float values only |
Added edge-case instructions to minimize ambiguity | ✅ Completed | e.g., how to handle unclear folds or asymmetric regions |
Integrated prompt into OpenAI Prompt Management system | ✅ Completed | Used version-controlled prompt ID via API |
Evaluated prompt consistency across different inputs | ✅ Completed | Iteratively tuned language for stability and reproducibility |
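
Strict output rules pay off only if the consumer rejects anything that breaks them. The sketch below shows one way a downstream parser might do that. It assumes one `R<n>: <float>` line per region and a 0.0 to 1.0 range; both are illustrative assumptions, as is the name `parse_scores`.

```python
import re

REGION_IDS = [f"R{i}" for i in range(1, 10)]  # R1..R9

# Assumed line format: "R3: 0.42" (region id, colon, plain decimal number).
_LINE = re.compile(r"(R[1-9]):\s*(\d+(?:\.\d+)?)")


def parse_scores(raw: str, lo: float = 0.0, hi: float = 1.0) -> dict[str, float]:
    """Parse model output into {region: score}, failing loudly on any deviation."""
    scores: dict[str, float] = {}
    for line in raw.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        match = _LINE.fullmatch(line)
        if match is None:
            # Catches Markdown fences, bold markers, commentary, and so on.
            raise ValueError(f"unparsable line: {line!r}")
        region, value = match.group(1), float(match.group(2))
        if region in scores:
            raise ValueError(f"duplicate region: {region}")
        if not lo <= value <= hi:
            raise ValueError(f"{region} out of range: {value}")
        scores[region] = value

    missing = [r for r in REGION_IDS if r not in scores]
    if missing:
        raise ValueError(f"missing regions: {', '.join(missing)}")
    return scores
```

Raising on the first malformed line, instead of silently repairing it, keeps format drift visible while the prompt is still being tuned.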
Key Learnings
- Prompt engineering precision directly affects model reliability and repeatability.
- Vision-language models respond more consistently when the prompt reinforces both the output format and the reasoning steps.
- Output structure matters as much as content for downstream parsing.
- Simpler, declarative instructions often outperform verbose, multi-paragraph guidance.
Problems Faced & Solutions
Problem | Solution |
---|---|
Variability in model outputs across repeated image inputs | Added strict float format and prompt repetition to enforce consistency |
Incomplete or malformed outputs from the model | Disallowed Markdown in the output and added fixed format reminders to the prompt |
Ambiguity in region definitions | Created visual mappings (R1–R9) and adjusted textual cues |
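
Output variability across repeated runs on the same image can be quantified per region. The following is a minimal sketch using the population standard deviation; the helper names and the 0.05 tolerance are hypothetical and not taken from the actual evaluation.

```python
from statistics import mean, pstdev


def region_spread(runs: list[dict[str, float]]) -> dict[str, tuple[float, float]]:
    """Return {region: (mean, standard deviation)} over repeated runs."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to measure variability")
    spread = {}
    for region in runs[0]:
        values = [run[region] for run in runs]
        spread[region] = (mean(values), pstdev(values))
    return spread


def unstable_regions(runs: list[dict[str, float]],
                     tolerance: float = 0.05) -> list[str]:
    """List regions whose scores vary by more than `tolerance` (assumed threshold)."""
    return sorted(region
                  for region, (_, sd) in region_spread(runs).items()
                  if sd > tolerance)
```

Running this before and after a prompt change gives a simple signal for whether the change improved stability, and which regions still need clearer definitions.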
Goals for Next Week
- Add image hashing and response caching so that repeated requests for the same image return identical predictions
- Explore few-shot enhancement and prompt modularity
- Begin analyzing model behavior on difficult facial profiles (e.g., occlusions, shadows)
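
One possible shape for the planned hashing and caching step is sketched below. It assumes a SHA-256 digest of the raw image bytes as the cache key and an in-memory dictionary as the store; the `ResponseCache` class and the `predict` callable are hypothetical stand-ins for the real model call.

```python
import hashlib
from typing import Callable


class ResponseCache:
    """Return the stored response when the same image bytes are seen again."""

    def __init__(self, predict: Callable[[bytes], str], prompt_version: str):
        self._predict = predict              # stand-in for the real VLM call
        self._prompt_version = prompt_version
        self._store: dict[str, str] = {}

    def _key(self, image_bytes: bytes) -> str:
        # Include the prompt version so a prompt change invalidates old entries.
        digest = hashlib.sha256(image_bytes).hexdigest()
        return f"{self._prompt_version}:{digest}"

    def get(self, image_bytes: bytes) -> str:
        key = self._key(image_bytes)
        if key not in self._store:
            self._store[key] = self._predict(image_bytes)
        return self._store[key]
```

A cryptographic hash only matches byte-identical files, so a re-encoded or resized copy of the same photo would miss the cache. Handling those cases would need a perceptual hash, which is outside this sketch.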
“The model is only as good as the instructions we give it. Week 9 taught me how structure and clarity shape intelligence.”