About

Gen AI & Data Science Manager with a background that bridges quantitative research and applied AI engineering. I design and build LLM-powered systems end-to-end — from data pipelines and embedding architectures to prompt engineering, structured output design, and production deployment. My approach is shaped by years of working with causal inference, statistical modeling, and experimental design, which means I care as much about why a system works as whether it works.

What I Build

LLM Applications & Orchestration — I design multi-stage AI pipelines that combine embedding-based retrieval with LLM reasoning. This includes building hybrid scoring systems, structured output via JSON schemas, LLM-as-a-judge reranking, and validation layers that enforce business logic on model outputs. I work across both proprietary (e.g., OpenAI, Anthropic) and open-weight (e.g., Ollama) ecosystems.
Recommendation Systems — I build two-stage recommendation engines that pair semantic retrieval (embeddings, cosine similarity, hybrid scoring functions) with LLM reranking for personalized, context-aware results. This involves user profiling from behavioral data, feature engineering for geographic and categorical signals, and concurrent API processing at scale.
RAG & Information Retrieval — I build retrieval-augmented generation systems using multi-model embedding ensembles, FAISS indexing, cross-encoder re-ranking, and rank fusion strategies. My RAG work has been applied to domains where surface-level semantic similarity isn’t enough — where the system needs to distinguish tone, framing, and rhetorical structure, not just topic overlap.
Synthetic Populations & Counterfactual Simulation — Drawing on my academic work in causal inference, I build frameworks that generate synthetic populations (using Census data, iterative proportional fitting, and survey raking) and simulate how those populations would respond to hypothetical scenarios. This methodology — grounding LLMs in demographically representative synthetic personas — enables counterfactual analysis for both research and commercial applications.
Agentic AI & Data Agents — I design agentic systems that go beyond single-prompt interactions — including data agents that sit between users and complex data infrastructure, translating natural language into validated queries across large-scale schemas and report ecosystems. This extends to multi-agent workflows, the Model Context Protocol (MCP) for tool integration, and end-to-end automation pipelines that chain ideation, evaluation, and execution steps.
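
To make the rank fusion step mentioned under RAG & Information Retrieval concrete, here is a minimal sketch. It assumes reciprocal rank fusion (RRF) with the commonly used constant k = 60. The function name, the document IDs, and the choice of RRF itself are illustrative assumptions, not a description of any specific system above.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs from three retrieval stages over the same query.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
lexical_ranking = ["doc_b", "doc_c", "doc_a"]
cross_encoder_ranking = ["doc_b", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion(
    [dense_ranking, lexical_ranking, cross_encoder_ranking]
)
print(fused)
```

Because RRF uses only rank positions, it can combine retrievers whose raw scores are on different scales, which is one reason it is a common default for multi-model ensembles.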

Background

Before moving to industry, I spent years in academic research at institutions including GIGA (German Institute for Global and Area Studies), the CEU Democracy Institute, and the University of Erfurt. There, I developed novel AI-driven methodologies for public opinion research and taught courses on AI and quantitative methods. My work has been published in leading academic journals, including Electoral Studies and Political Science Research and Methods. I remain active on the academic side: I teach intensive Gen AI courses (from introduction to practice), and I maintain active research collaborations on LLM applications in social science.

Selected Highlights

Best Early Career Paper Award, MethodsNET Conference 2024 — for Chrono-Sampling, a generative AI framework for simulating historical public opinion using time-gated LLM personas
3rd Place, HUN-REN AI1Science Hackathon — developed the Zeus Protocol, an end-to-end AI system for automating experimental research design with multi-agent workflows
AI Course Instructor, MethodsNET Summer School at CEU Vienna — “Artificial Intelligence: From Theory to Practice”

Tech Stack

Python · R · SQL · FAISS · Sentence Transformers · Cross-Encoders · Ollama · BigQuery · Vertex AI · MCP · Git