Show HN: Simple Synthetic Medical Data

2 points by maxrmk 7 months ago
Hey hn! Max and Matt here from Talc AI (YC S23). We help teams create data that’s traditionally hard to find - think things you’d normally need a doctor, lawyer, accountant, or engineer to write.

We’ve been struggling to demo our synthetic data product (it’s complicated to set up), so we stripped our product down to its core - an "ontologizer" that takes plain text descriptions and generates varied, detailed synthetic data. For this demo, we're focused on medical data like radiology reports and SOAP notes.

Try it here: https://demo.talcapi.com/demo/meddoc

Example use case: Instead of dealing with HIPAA compliance or hiring doctors to write fake data, just type "medical notes with billing codes" to get test data instantly.

One key limitation: unlike our real product, this isn’t grounded in reality and won’t match the distribution of real data.

For specialized use cases (rare diseases, financial regulations, etc.), we can inject domain expertise into the process. Our customers use these "golden datasets" to test clinical trial matching, train financial and engineering Q&A models, and benchmark LLMs.

To generate this data we run an unsupervised process to identify the relevant metadata and structure then use this information to seed a generation process, inspired by papers like Google's CodecLM.

We'd love feedback! Our last HN launch helped us catch several bugs.
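
To make the "identify metadata, then seed generation" idea concrete, here is a minimal sketch in Python. It is not Talc's implementation: the `ONTOLOGY` dictionary, its attribute names, and the helpers `sample_seeds` and `seed_to_prompt` are all hypothetical, the metadata is hard-coded rather than discovered by an unsupervised process, and the call to an actual text generator is left out.

```python
import itertools
import random

# Hypothetical metadata for one kind of document. The post describes this
# structure as discovered automatically; here it is hard-coded for illustration.
ONTOLOGY = {
    "document_type": ["radiology report", "SOAP note"],
    "specialty": ["cardiology", "orthopedics", "neurology"],
    "patient_age_group": ["pediatric", "adult", "geriatric"],
}


def sample_seeds(ontology, n, rng):
    """Pick up to n distinct attribute combinations to seed generation."""
    keys = sorted(ontology)
    combos = list(itertools.product(*(ontology[k] for k in keys)))
    picked = rng.sample(combos, k=min(n, len(combos)))
    return [dict(zip(keys, combo)) for combo in picked]


def seed_to_prompt(description, seed):
    """Turn one seed into a text prompt for a downstream generator (not shown)."""
    attrs = "; ".join(f"{k.replace('_', ' ')}: {v}" for k, v in seed.items())
    return f"Write a synthetic example of: {description}. Attributes: {attrs}."


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs pick the same combinations
    for seed in sample_seeds(ONTOLOGY, 3, rng):
        print(seed_to_prompt("medical notes with billing codes", seed))
```

Sampling distinct attribute combinations is one simple way to get varied output from a single plain-text description. As the post notes for the demo, nothing here is tied to how often each combination occurs in real data.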

no comments