Any LLM developers here struggling to align models with subject matter experts (SMEs) or domain-specific expertise? I’m finding it tough to evaluate or quantify how well an LLM aligns with SME expectations. Inspired by the paper "LLMs instead of Human Judges?" (link attached), I’m working on a tool that produces a base alignment score using methods from recent research. Do you rely on manual reviews, automated metrics, a hybrid approach, or something else? Or is SME alignment not a big focus for you? Curious to hear your thoughts!