Research
Efe Ertugrul works at the intersection of infrastructure and scientific reproducibility, building systems that make bioimaging and biomedical data interoperable, standardized, and production-ready.
Bioimaging Metadata & Schema Translation
Problem: Microscopy metadata is fragmented across incompatible standards (OME, NBO-Q, vendor-specific formats). Researchers can't query, validate, or share imaging data reliably across platforms.
What I'm doing: Building automated pipelines to translate OME and NBO-Q XSD schemas into LinkML schema definitions. This enables unified metadata models, cross-platform validation, and ontology-backed data harmonization.
Impact: Enables reproducible imaging workflows that comply with FAIR data principles. Delivered a talk on this work at the 2025 OME Community Congress (MBL Woods Hole).
Schema Translation Pipelines
Problem: Converting complex XML schemas (XSD) to modern data modeling frameworks is manual, error-prone, and blocks interoperability efforts in bioinformatics.
Approach: Developed extensible Python tooling that parses XSD automatically, applies transformation rules, and generates LinkML schemas. Built validation layers to check semantic consistency between source and target formats.
Why it matters: Accelerates metadata standardization efforts across imaging, genomics, and clinical research domains.
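To make the idea concrete, here is a minimal sketch of an XSD-to-LinkML translation step. It is not the actual pipeline: the function name `xsd_to_linkml`, the type table `XSD_TO_LINKML`, the toy XSD fragment, and the `example.org` schema id are all assumptions made for illustration. It handles only top-level `complexType` elements with attributes, and it assumes the `xs:` prefix is used literally for built-in types.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Illustrative (incomplete) mapping from XSD built-in types to LinkML types.
# Assumes the schema uses the literal "xs:" prefix; real code would resolve prefixes.
XSD_TO_LINKML = {
    "xs:string": "string",
    "xs:int": "integer",
    "xs:integer": "integer",
    "xs:float": "float",
    "xs:double": "double",
    "xs:boolean": "boolean",
}


def xsd_to_linkml(xsd_text, schema_id, schema_name):
    """Translate top-level complexTypes into a LinkML-style schema dict."""
    root = ET.fromstring(xsd_text)
    classes = {}
    for ctype in root.findall(f"{XS}complexType"):
        attributes = {}
        for attr in ctype.findall(f"{XS}attribute"):
            attributes[attr.get("name")] = {
                # Unknown types fall back to "string" in this sketch.
                "range": XSD_TO_LINKML.get(attr.get("type"), "string"),
                "required": attr.get("use") == "required",
            }
        classes[ctype.get("name")] = {"attributes": attributes}
    return {"id": schema_id, "name": schema_name, "classes": classes}


# Toy fragment for illustration only; not copied from a real schema file.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Objective">
    <xs:attribute name="ID" type="xs:string" use="required"/>
    <xs:attribute name="NominalMagnification" type="xs:float"/>
  </xs:complexType>
</xs:schema>"""

schema = xsd_to_linkml(SAMPLE_XSD, "https://example.org/toy", "toy")
```

The resulting dictionary follows the general shape of a LinkML schema (`id`, `name`, `classes` with per-class `attributes`) and could be serialized to YAML. Nested elements, enumerations, references, and cardinality would each need their own transformation rules.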
FAIR Data & Ontology Integration
Problem: Scientific datasets are difficult to find, access, integrate, and reuse because metadata lacks semantic structure and controlled vocabularies.
Contribution: Implementing FAIR principles through LinkML by connecting bioimaging metadata to biomedical ontologies (e.g., Cell Ontology, Uberon), which enables structured queries and cross-study integration.
Tech: Python, LinkML, OWL/RDF, graph-based models, XSD/XML parsing.
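As a rough illustration of the ontology linking described in this section, the sketch below expands ontology CURIEs in a metadata record into full IRIs so that records from different studies can be joined on shared terms. The record layout, the field names, and the helpers `expand_curie` and `annotate` are hypothetical. The prefix map assumes the standard OBO PURL pattern for Cell Ontology and Uberon identifiers.

```python
# Prefix map assuming the standard OBO PURL pattern.
PREFIXES = {
    "CL": "http://purl.obolibrary.org/obo/CL_",
    "UBERON": "http://purl.obolibrary.org/obo/UBERON_",
}


def expand_curie(curie):
    """Expand a CURIE such as 'CL:0000540' into a full IRI."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in PREFIXES or not local_id:
        raise ValueError(f"Unrecognized ontology term: {curie!r}")
    return PREFIXES[prefix] + local_id


def annotate(record, ontology_fields):
    """Return a copy of the record with the given fields expanded to IRIs."""
    annotated = dict(record)
    for field in ontology_fields:
        annotated[field] = expand_curie(record[field])
    return annotated


# Hypothetical image metadata record using ontology CURIEs.
record = {
    "image_id": "img-001",
    "cell_type": "CL:0000540",     # neuron (Cell Ontology)
    "anatomy": "UBERON:0000955",   # brain (Uberon)
}

linked = annotate(record, ["cell_type", "anatomy"])
```

Rejecting unknown prefixes up front is a deliberate choice in this sketch: a free-text value in an ontology-backed field fails immediately instead of slipping into downstream queries.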