Day 49 - April 02, 2026

Internship Diary Entry: April 2, 2026

Role: AI Engineer — SynerSense
Project: AnanaCare Clinical Decision Support System (CDSS Upgrade)
Hours Worked: 8

Daily Snapshot

Dimension	Status	Key Milestone
Architecture	Major Shift	Raw prediction engine → Clinical Decision Support System (CDSS)
Core Achievement	Completed	Symptoms registry, winner-takes-all diagnosis, SHA-256 capture IDs
Storage Model	Implemented	Referential, non-duplicative YAML-based ledger design
API Output	Redesigned	Structured medical insights replacing raw probabilities
Next Focus	Validation	Symptom mapping accuracy against clinical data

Work Summary

Today marked a major shift in the AnanaCare backend from a pure prediction engine to a Clinical Decision Support System (CDSS). The focus was on transforming raw model outputs into structured, interpretable, and clinically meaningful records that can support real-world decision-making.

Technical Implementation

Key Work Done

1) Symptoms Registry Integration

Created a centralized SYMPTOMS_MAP in backend/config/symptoms.py that maps disease heads (D_3, D_4, etc.) to structured symptom profiles.
Removed the runtime CSV dependency by hardcoding the registry, so lookups no longer involve file I/O or CSV parsing at request time and cannot fail on a missing or malformed file.
This lets the system translate abstract model scores into human-readable medical insights.
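
A minimal sketch of how such a registry might be laid out. The profile fields (label, symptoms), the lookup_symptoms helper, and every value shown are placeholders assumed for illustration; the actual contents of symptoms.py are not reproduced here.

```python
from __future__ import annotations

from typing import Optional, TypedDict


class SymptomProfile(TypedDict):
    """Assumed shape of one registry entry (hypothetical fields)."""
    label: str            # human-readable name for the disease head
    symptoms: list[str]   # symptom descriptions surfaced to the user


# Placeholder entries only: these are NOT real clinical mappings.
SYMPTOMS_MAP: dict[str, SymptomProfile] = {
    "D_3": {"label": "<condition for D_3>", "symptoms": ["<symptom A>", "<symptom B>"]},
    "D_4": {"label": "<condition for D_4>", "symptoms": ["<symptom C>"]},
}


def lookup_symptoms(head: str) -> Optional[SymptomProfile]:
    """Return the profile for a disease head, or None if the head is unmapped."""
    return SYMPTOMS_MAP.get(head)
```

Returning None for an unmapped head, rather than raising, keeps a missing mapping from failing the whole request; whether that is the right behavior is a design choice to confirm.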

2) Winner-Takes-All Diagnosis Logic

Implemented dominant disease selection using:

max_key = max(disease_scores, key=disease_scores.get)

Added a confidence threshold (0.15): a diagnosis is reported only when the top score reaches it, which reduces false-positive diagnoses.
Introduced a “Healthy/Baseline” fallback for scores below the threshold, so the system does not over-diagnose when signals are weak.
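
The selection, threshold, and fallback can be combined roughly as follows. This is a sketch under assumptions: the function name, the strict less-than comparison, the handling of an empty score dict, and the exact fallback string are not taken from the actual backend.

```python
from __future__ import annotations

CONFIDENCE_THRESHOLD = 0.15          # cutoff stated in this entry
BASELINE_LABEL = "Healthy/Baseline"  # assumed spelling of the fallback label


def select_diagnosis(disease_scores: dict[str, float],
                     threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return the top-scoring disease head, or the baseline label
    when no score reaches the threshold."""
    if not disease_scores:
        # max() raises ValueError on an empty dict, so guard explicitly.
        return BASELINE_LABEL
    max_key = max(disease_scores, key=disease_scores.get)
    if disease_scores[max_key] < threshold:
        return BASELINE_LABEL
    return max_key
```

For example, select_diagnosis({"D_3": 0.62, "D_4": 0.20}) picks "D_3", while a dict whose highest score is 0.10 falls back to the baseline label.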

3) Secure Capture Identity (SHA-256)

Replaced the older MD5-based ID system with SHA-256 hashing:
- Generated a 64-character hex digest from user metadata + timestamp.
- Truncated it to:
  - 32-character capture_id (primary identifier, 128 bits)
  - 12-character short_ref (human-readable reference, 48 bits)
This gives each record a high-entropy identifier that is extremely unlikely to collide and does not expose the raw user metadata. The short_ref is much shorter, so it works as a convenience reference and not as a guaranteed-unique key.
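
A minimal sketch of the ID derivation using Python's hashlib. The function name, the JSON serialization of the metadata, the "|" separator, and taking both IDs from the start of the digest are assumptions; the entry only states the hash function and the output lengths.

```python
from __future__ import annotations

import hashlib
import json
from datetime import datetime, timezone


def make_capture_ids(user_metadata: dict, timestamp: datetime) -> tuple[str, str]:
    """Derive (capture_id, short_ref) from user metadata and a timestamp."""
    # Serialize with sorted keys so the same metadata always hashes identically.
    payload = json.dumps(user_metadata, sort_keys=True) + "|" + timestamp.isoformat()
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()  # 64 hex characters
    capture_id = digest[:32]  # primary identifier
    short_ref = digest[:12]   # short reference (assumed to be a digest prefix)
    return capture_id, short_ref


capture_id, short_ref = make_capture_ids(
    {"age": 42, "gender": "F"},
    datetime(2026, 4, 2, 9, 30, tzinfo=timezone.utc),
)
```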

4) Referential Storage Architecture

Designed a non-duplicative storage model:
- .validate_cache/ stores the single processed image.
- .capture_cache/{capture_id}/info.yaml stores metadata and insights.
Implemented YAML-based “ledger” files containing:
- Patient metadata (age, gender, etc.)
- Reference to image_id (no image duplication)
- Model predictions
- Derived symptom insights
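
The ledger layout above might be written along these lines. This is a sketch, not the actual implementation: the function names and record keys are hypothetical, and serialization assumes the third-party PyYAML package (imported lazily inside write_ledger).

```python
from __future__ import annotations

from pathlib import Path

CAPTURE_CACHE = Path(".capture_cache")


def ledger_path(capture_id: str, root: Path = CAPTURE_CACHE) -> Path:
    """Location of the ledger file for one capture."""
    return root / capture_id / "info.yaml"


def build_ledger_record(patient: dict, image_id: str,
                        predictions: dict, symptoms: list) -> dict:
    """Assemble the ledger contents; the image is referenced by ID, never embedded."""
    return {
        "patient": patient,          # age, gender, etc.
        "image_id": image_id,        # points at the file kept in .validate_cache/
        "predictions": predictions,  # model outputs
        "symptoms": symptoms,        # derived symptom insights
    }


def write_ledger(capture_id: str, record: dict, root: Path = CAPTURE_CACHE) -> Path:
    """Write the record as YAML and return the path written."""
    import yaml  # PyYAML (third-party), assumed to be installed

    path = ledger_path(capture_id, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        yaml.safe_dump(record, fh, sort_keys=False)  # keep the key order above
    return path
```

Because the record holds only image_id, several ledgers can point at the same cached image without copying it.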

5) API Response Redesign

Updated the response schema to return:
- capture_id (32-char unique session ID)
- predictions (raw model outputs)
- symptoms (mapped clinical insights)
- image_id and timestamp
Removed unnecessary technical fields and aligned output with real-world usability.
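
One way to pin down the new response shape is a small dataclass. The class name, the field types (for example, symptoms as a list of strings and timestamp as an ISO 8601 string), and all sample values are assumptions; only the field names come from the schema above.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CaptureResponse:
    """Assumed shape of the redesigned API response."""
    capture_id: str                # 32-character ID
    image_id: str                  # reference to the cached image
    timestamp: str                 # assumed ISO 8601 string
    predictions: dict[str, float]  # raw model outputs per disease head
    symptoms: list[str]            # mapped clinical insights


# Placeholder values for illustration only.
response = CaptureResponse(
    capture_id="0" * 32,
    image_id="img_001",
    timestamp="2026-04-02T09:30:00+00:00",
    predictions={"D_3": 0.62, "D_4": 0.20},
    symptoms=["<symptom A>", "<symptom B>"],
)
payload = asdict(response)  # plain dict, ready for JSON serialization
```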

Insights & Analysis

Learnings & Insights

Bridging AI to Healthcare: Raw probabilities are not useful unless translated into structured, interpretable insights.
Data Integrity Design: Referential storage significantly reduces redundancy while maintaining traceability.
Clinical Safety: Adding a confidence threshold is critical to avoid misleading outputs in sensitive domains.
System Evolution: This transition highlighted the difference between an ML model and a production-grade decision system.

Challenges & Considerations

Symptom Mapping Accuracy: Hardcoded mappings must be validated carefully to avoid incorrect clinical interpretations.
Threshold Tuning: The 0.15 cutoff is heuristic and may require calibration with real-world data.
Scalability: YAML-based storage is simple but may need to evolve into a database-backed system for large-scale usage.

Next Steps

Prioritized validation and expansion roadmap:

Validate SYMPTOMS_MAP against actual clinical data (test.csv) for correctness.
Add audit logging for each capture_id to track decision history.
Introduce optional database indexing for .capture_cache to improve query performance.
Extend CDSS logic to support multi-condition insights instead of single “winner” output.

Overall Outcome

The system now returns structured medical insights instead of raw predictions, a significant step toward a production-ready, privacy-first AI-powered clinical assistant.

The work also made concrete what separates an ML model from clinical-grade decision support: turning abstract probabilities into actionable, interpretable, and safe clinical guidance.