Internship Diary Entry: April 2, 2026

Role: AI Engineer — SynerSense
Project: AnanaCare Clinical Decision Support System (CDSS Upgrade)
Hours Worked: 8


Daily Snapshot

Dimension        | Status      | Key Milestone
-----------------|-------------|--------------------------------------------------------------------
Architecture     | Major Shift | Raw prediction engine → Clinical Decision Support System (CDSS)
Core Achievement | Completed   | Symptoms registry, winner-takes-all diagnosis, SHA-256 capture IDs
Storage Model    | Implemented | Referential, non-duplicative YAML-based ledger design
API Output       | Redesigned  | Structured medical insights replacing raw probabilities
Next Focus       | Validation  | Symptom mapping accuracy against clinical data

Work Summary

Today marked a major shift in the AnanaCare backend from a pure prediction engine to a Clinical Decision Support System (CDSS). The focus was on transforming raw model outputs into structured, interpretable, and clinically meaningful records that can support real-world decision-making.


Technical Implementation

Key Work Done

1) Symptoms Registry Integration

  • Created a centralized SYMPTOMS_MAP in backend/config/symptoms.py to map disease heads (D_3, D_4, etc.) to structured symptom profiles.
  • Eliminated the runtime CSV dependency by hardcoding the registry, which removes file reads and parsing at runtime and improves performance and reliability.
  • Enabled the system to translate abstract model scores into human-readable medical insights.
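
A minimal sketch of what such a registry could look like. The SymptomProfile class, the lookup_symptoms helper, and every label and symptom string below are hypothetical placeholders for illustration, not the actual contents of backend/config/symptoms.py and not clinical data.

```python
# backend/config/symptoms.py (sketch; all entries below are placeholders)
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SymptomProfile:
    label: str                 # human-readable condition name
    symptoms: tuple[str, ...]  # symptoms associated with the disease head


# Keys follow the model's disease heads (D_3, D_4, ...).
# Labels and symptom strings are hypothetical placeholders, not clinical data.
SYMPTOMS_MAP: dict[str, SymptomProfile] = {
    "D_3": SymptomProfile(
        "Condition for head D_3",
        ("placeholder symptom A", "placeholder symptom B"),
    ),
    "D_4": SymptomProfile("Condition for head D_4", ("placeholder symptom C",)),
}


def lookup_symptoms(head: str) -> Optional[SymptomProfile]:
    """Return the profile for a disease head, or None if the head is unmapped."""
    return SYMPTOMS_MAP.get(head)
```

Returning None for an unmapped head, rather than raising, lets the caller decide how to handle a model output that has no registry entry.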

2) Winner-Takes-All Diagnosis Logic

  • Implemented dominant disease selection using:

    max_key = max(disease_scores, key=disease_scores.get)
    
  • Added a confidence threshold (0.15): when the top score falls below it, no disease is reported, which reduces false-positive diagnoses.
  • Introduced a “Healthy/Baseline” fallback for that case, so the system does not over-diagnose when signals are weak.
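
The selection, threshold, and fallback can be combined roughly as follows. This is a sketch under assumptions: the function name select_diagnosis, the strict "less than" comparison, and the handling of an empty score dictionary are not taken from the actual code.

```python
CONFIDENCE_THRESHOLD = 0.15  # heuristic cutoff mentioned in this entry
BASELINE_LABEL = "Healthy/Baseline"


def select_diagnosis(disease_scores: dict[str, float],
                     threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return the top-scoring disease head, or the baseline label if the signal is weak."""
    if not disease_scores:
        # Assumption: no scores at all is treated like a weak signal.
        return BASELINE_LABEL
    max_key = max(disease_scores, key=disease_scores.get)
    if disease_scores[max_key] < threshold:
        return BASELINE_LABEL
    return max_key
```

For example, select_diagnosis({"D_3": 0.62, "D_4": 0.21}) returns "D_3", while select_diagnosis({"D_3": 0.08, "D_4": 0.11}) falls back to "Healthy/Baseline".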

3) Secure Capture Identity (SHA-256)

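  • Replaced the older MD5-based ID system with SHA-256 hashing:
    • Generated a 64-character hexadecimal digest from user metadata + timestamp.
    • Truncated to:
      • 32-character capture_id (primary identifier)
      • 12-character short_ref (human-readable reference)
  • The 32-character capture_id keeps 128 bits of the digest, which makes accidental collisions very unlikely. The 12-character short_ref keeps only 48 bits, so it is better suited as a human-readable reference than as a unique key. Neither ID contains the user metadata in readable form.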
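
A minimal sketch of the ID derivation using hashlib from the standard library. The function name make_capture_ids, the way metadata is serialized before hashing, and the choice to take short_ref from the start of the same digest are assumptions; the entry only states the digest inputs and the two lengths.

```python
import hashlib
from datetime import datetime, timezone


def make_capture_ids(user_metadata: dict[str, str],
                     timestamp: datetime) -> tuple[str, str]:
    """Derive (capture_id, short_ref) from user metadata plus a timestamp."""
    # Sort keys so the same metadata always serializes to the same string.
    canonical = "|".join(f"{k}={user_metadata[k]}" for k in sorted(user_metadata))
    payload = f"{canonical}|{timestamp.isoformat()}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()  # 64 hex characters
    capture_id = digest[:32]  # 128 bits of the digest
    short_ref = digest[:12]   # 48 bits; assumed to be a prefix of the same digest
    return capture_id, short_ref


# Hypothetical metadata values, for illustration only.
ts = datetime(2026, 4, 2, 9, 30, tzinfo=timezone.utc)
capture_id, short_ref = make_capture_ids({"age": "42", "gender": "F"}, ts)
```

Because the keys are sorted before hashing, the same metadata and timestamp always yield the same pair of IDs regardless of dictionary ordering.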

4) Referential Storage Architecture

  • Designed a non-duplicative storage model:
    • .validate_cache/ stores the single processed image.
    • .capture_cache/{capture_id}/info.yaml stores metadata and insights.
  • Implemented YAML-based “ledger” files containing:
    • Patient metadata (age, gender, etc.)
    • Reference to image_id (no image duplication)
    • Model predictions
    • Derived symptom insights
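
The ledger layout above can be sketched as follows. The write_ledger helper, the field names inside the record, and all sample values are hypothetical. To stay within the standard library, the sketch JSON-encodes each top-level value, which is generally valid YAML flow style; a YAML library such as PyYAML would be the more usual choice in practice.

```python
import json
from pathlib import Path


def write_ledger(cache_root: Path, capture_id: str, record: dict) -> Path:
    """Write record to <cache_root>/<capture_id>/info.yaml and return the path."""
    capture_dir = cache_root / capture_id
    capture_dir.mkdir(parents=True, exist_ok=True)
    ledger_path = capture_dir / "info.yaml"
    # One "key: value" line per top-level field; values are JSON-encoded.
    lines = [f"{key}: {json.dumps(value)}" for key, value in record.items()]
    ledger_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return ledger_path


# Hypothetical record; the image itself is not copied, only referenced by ID.
record = {
    "patient": {"age": 42, "gender": "F"},
    "image_id": "img_0001",
    "predictions": {"D_3": 0.62, "D_4": 0.21},
    "symptoms": ["placeholder symptom A"],
}
```

Calling write_ledger(Path(".capture_cache"), capture_id, record) would create .capture_cache/{capture_id}/info.yaml, with the image left in .validate_cache/ and linked only through image_id.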

5) API Response Redesign

  • Updated the response schema to return:
    • capture_id (32-char unique session ID)
    • predictions (raw model outputs)
    • symptoms (mapped clinical insights)
    • image_id and timestamp
  • Removed unnecessary technical fields and aligned output with real-world usability.
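
One way to pin down the response shape is a small dataclass. This is a sketch only: the class name CaptureResponse, the field types, and the ISO 8601 timestamp format are assumptions, while the field names follow the list above.

```python
from dataclasses import dataclass, asdict


@dataclass
class CaptureResponse:
    capture_id: str                # 32-char unique session ID
    predictions: dict[str, float]  # raw model outputs
    symptoms: list[str]            # mapped clinical insights
    image_id: str
    timestamp: str                 # assumed to be an ISO 8601 string


def to_payload(response: CaptureResponse) -> dict:
    """Convert the response to a plain dict ready for JSON serialization."""
    return asdict(response)


# Hypothetical values, for illustration only.
example = CaptureResponse(
    capture_id="a" * 32,
    predictions={"D_3": 0.62},
    symptoms=["placeholder symptom A"],
    image_id="img_0001",
    timestamp="2026-04-02T09:30:00+00:00",
)
payload = to_payload(example)
```

Declaring the fields in one place makes it harder for technical fields to leak back into the response unnoticed.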

Insights & Analysis

Learnings & Insights

  • Bridging AI to Healthcare: Raw probabilities are not useful unless translated into structured, interpretable insights.
  • Data Integrity Design: Referential storage significantly reduces redundancy while maintaining traceability.
  • Clinical Safety: Adding a confidence threshold is critical to avoid misleading outputs in sensitive domains.
  • System Evolution: This transition highlighted the difference between an ML model and a production-grade decision system.

Challenges & Considerations

  • Symptom Mapping Accuracy: Hardcoded mappings must be validated carefully to avoid incorrect clinical interpretations.
  • Threshold Tuning: The 0.15 cutoff is heuristic and may require calibration with real-world data.
  • Scalability: YAML-based storage is simple but may need to evolve into a database-backed system for large-scale usage.

Next Steps

Prioritized validation and expansion roadmap:

  1. Validate SYMPTOMS_MAP against actual clinical data (test.csv) for correctness.
  2. Add audit logging for each capture_id to track decision history.
  3. Introduce optional database indexing for .capture_cache to improve query performance.
  4. Extend CDSS logic to support multi-condition insights instead of single “winner” output.

Overall Outcome

The system now produces structured medical insights instead of raw predictions, marking a significant step toward a production-ready, privacy-first AI-powered clinical assistant.

The shift also underlines the difference between an ML model and clinical-grade decision support: the latter turns abstract probabilities into actionable, interpretable, and safe clinical guidance.

