Internship Diary Entry: April 22, 2026

Role: AI Engineer, SynerSense
Project: AnanaCare Visualization Layer (Embedding Explorer Integration)
Hours Worked: 8


Work Summary

Today’s work focused on building a standalone embedding visualization system by integrating the TensorFlow Embedding Projector into the AnanaCare stack.

The main goal was to create a clean, production-ready interface for exploring high-dimensional embeddings (regions, diseases, conditions, demographics) without relying on experimental or fragmented setups.

I started by cloning the official projector repository and extracting only the required assets into a local public/ directory. Instead of maintaining multiple configs, I designed a single unified configuration file (projector_config.json) that loads all four embedding categories. This simplifies both maintenance and extensibility.
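As a rough illustration of the unified config idea, the sketch below builds one config covering all four categories. It assumes the standalone projector's usual schema (an "embeddings" list whose entries carry tensorName, tensorShape, tensorPath and metadataPath). The file names, row counts and dimension are hypothetical placeholders, not the project's real values.

```python
import json
from pathlib import Path

# The four categories named above. The (rows, dimension) pairs are
# placeholder values for illustration only.
CATEGORIES = {
    "regions": (100, 64),
    "diseases": (100, 64),
    "conditions": (100, 64),
    "demographics": (100, 64),
}


def build_projector_config(categories, data_dir="data"):
    """Return one config dict with an entry per embedding category."""
    embeddings = []
    for name, (rows, dim) in categories.items():
        embeddings.append({
            "tensorName": name,
            "tensorShape": [rows, dim],
            # Hypothetical naming scheme for the generated TSV files.
            "tensorPath": f"{data_dir}/{name}_tensors.tsv",
            "metadataPath": f"{data_dir}/{name}_metadata.tsv",
        })
    return {"embeddings": embeddings}


def write_projector_config(path, categories):
    """Serialize the config to disk as indented JSON and return it."""
    config = build_projector_config(categories)
    Path(path).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

With this shape, adding a fifth embedding space would mean adding one dictionary entry rather than another config file.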

On the backend side, I implemented a lightweight FastAPI server (main.py) to serve static assets and automatically generate required TSV files if they are missing. This ensures the visualization system is self-healing and reduces manual setup.
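The self-healing behaviour described above could look roughly like the following. This is a sketch under assumptions, not the actual main.py: ensure_tsv and create_app are hypothetical names, and where the rows come from is left to the caller.

```python
import csv
from pathlib import Path


def ensure_tsv(path, rows):
    """Write rows as tab-separated text only if the file is missing.

    Returns True when the file was created, False when it already existed.
    """
    path = Path(path)
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
    return True


def create_app(public_dir="public"):
    """Build a FastAPI app that serves the projector's static assets."""
    # Imported lazily so ensure_tsv stays usable without FastAPI installed.
    from fastapi import FastAPI
    from fastapi.staticfiles import StaticFiles

    app = FastAPI()
    # html=True makes "/" serve index.html from the static directory.
    app.mount("/", StaticFiles(directory=public_dir, html=True), name="static")
    return app
```

In such a setup, the server module would call ensure_tsv for each expected file before calling create_app, so a fresh checkout can start without a manual export step.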

On the frontend, I updated index.html to point to the new unified config instead of the default demo configuration. This connected the UI directly to the project’s real data.

A key refinement was in metadata handling. Previously, metadata included full file paths, which made the system less portable. I updated the pipeline (to_standalone.py) so that metadata now stores only image_id, making it cleaner and environment-independent.
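A minimal sketch of the metadata change, assuming image_id is simply the file name without its directory or extension (the real derivation in to_standalone.py may differ). The function names are hypothetical.

```python
from pathlib import PureWindowsPath


def to_image_id(file_path):
    """Reduce a full file path to a bare identifier (assumed: the file stem)."""
    # PureWindowsPath splits on both "/" and "\", so paths recorded on
    # either operating system map to the same id.
    return PureWindowsPath(file_path).stem


def metadata_lines(file_paths):
    """Return one image_id per embedding row, in the original order."""
    # A single-column projector metadata file carries no header row,
    # so only the ids themselves are emitted.
    return [to_image_id(p) for p in file_paths]
```

Because the id no longer encodes a directory layout, the same metadata file works unchanged on another machine.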

Finally, I validated the full pipeline locally. The UI successfully loaded at http://localhost:8002, all four embedding spaces were visible, and interactions like switching between tensors and using dimensionality reduction techniques (UMAP, t-SNE, PCA) worked as expected.


Key Technical Achievements

  • Embedding Visualization Integration: Integrated the TensorFlow Embedding Projector into the project with a custom configuration.

  • Unified Configuration System: Created a single projector_config.json to manage multiple embedding spaces cleanly.

  • Backend Automation: Built a FastAPI server that serves static assets and auto-generates TSV files when needed.

  • Metadata Standardization: Simplified metadata to use only image_id, improving portability and consistency.

  • End-to-End Validation: Verified all embedding categories and dimensionality reduction modes in a live local environment.


Learnings & Insights

  • Less config = more reliability: Consolidating multiple configs into one leaves a single file to update and validate.
  • Portability matters early: Using image_id instead of file paths prevents environment-specific bugs later.
  • Visualization is a debugging tool: Seeing embeddings interactively provides insights that raw metrics cannot.

Challenges & Observations

  • TSV regeneration initially stalled due to running the script from the wrong working directory. This highlights how fragile data pipelines can be when path assumptions are implicit.
  • The projector works well, but scaling to very large embedding sets may require optimization (lazy loading or sampling).
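One common way to remove the implicit working-directory assumption is to resolve paths against the script's own location. The helper below is a hypothetical sketch of that idea, and "public/data" is an assumed output location.

```python
from pathlib import Path


def anchored_path(relative, anchor_file):
    """Resolve `relative` against the directory that contains `anchor_file`.

    The result does not depend on the current working directory.
    """
    return Path(anchor_file).resolve().parent / relative


# Inside a script such as to_standalone.py this might be used as:
#     OUTPUT_DIR = anchored_path("public/data", __file__)
```

With paths anchored this way, running the script from the repository root or from a subdirectory writes to the same place.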

Next Steps

  1. Regenerate TSV files from the correct working directory to align with the updated metadata format.
  2. Validate large-scale performance (thousands of embeddings).
  3. Optionally integrate this visualization into the main dashboard instead of running it as a standalone tool.
  4. Add filtering/search capabilities for better exploration of embeddings.
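For the large-scale validation step, sampling is one of the options mentioned earlier. The sketch below is one possible approach rather than an implemented feature: it draws a seeded random subset and keeps vectors and metadata aligned. The function name and the max_points parameter are hypothetical.

```python
import random


def sample_rows(vectors, metadata, max_points, seed=0):
    """Pick the same random subset of rows from vectors and metadata."""
    if len(vectors) != len(metadata):
        raise ValueError("vectors and metadata must have the same length")
    if len(vectors) <= max_points:
        # Small enough already: return copies of everything.
        return list(vectors), list(metadata)
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    keep = sorted(rng.sample(range(len(vectors)), max_points))
    return [vectors[i] for i in keep], [metadata[i] for i in keep]
```

Sorting the kept indices preserves the original row order, which makes a sampled export easier to compare against the full one.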

Today’s work adds a visual analysis layer that makes raw embeddings interpretable, a key step toward making the platform usable for real-world ML workflows.

