Lilac co-creator here :)<p>Lilac is an open-source tool that lets AI practitioners see and quantify their datasets.<p>Lilac allows users to:<p>- Browse datasets with unstructured data.<p>- Enrich unstructured fields with structured metadata using Lilac Signals, for instance near-duplicate and personal information detection. Structured metadata lets us compute statistics, find problematic slices, and eventually measure changes over time.<p>- Create and refine Lilac Concepts, which are customizable AI models that find and score text matching a concept you have in mind.<p>- Download the results of the enrichment for downstream applications.<p>Out of the box, Lilac comes with a set of generally useful Signals and Concepts. This list is not exhaustive, and we will keep working with the OSS community to add more useful enrichments.<p>Check out the demo on HuggingFace: <a href="https://lilacai-lilac.hf.space/" rel="nofollow noreferrer">https://lilacai-lilac.hf.space/</a>
Find us on GitHub: <a href="https://github.com/lilacai/lilac">https://github.com/lilacai/lilac</a>
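<p>To make the "enrich unstructured fields with structured metadata" idea concrete, here is a minimal conceptual sketch in plain Python. It does not use Lilac's API: the names `email_signal` and `enrich`, the regex, and the sample rows are all hypothetical and only illustrate how a signal turns free text into fields you can aggregate over.

```python
import re
from collections import Counter

# Hypothetical toy "signal" for illustration only; this is NOT Lilac's API.
# A deliberately simple pattern for email-like strings (an assumption, not a
# complete personal-information detector).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def email_signal(text: str) -> dict:
    """Return structured metadata about email-like spans in free text."""
    matches = EMAIL_RE.findall(text)
    return {"has_email": bool(matches), "num_emails": len(matches)}


def enrich(rows: list, field: str) -> list:
    """Attach the signal's output to each row under a new key."""
    return [{**row, "email_signal": email_signal(row[field])} for row in rows]


# Made-up sample data.
rows = [
    {"source": "forum", "text": "Contact me at jane@example.com for details."},
    {"source": "forum", "text": "Great write-up, thanks!"},
    {"source": "wiki", "text": "The quick brown fox."},
]

enriched = enrich(rows, "text")

# Once the metadata is structured, per-slice statistics are plain aggregation.
flagged_by_source = Counter(
    row["source"] for row in enriched if row["email_signal"]["has_email"]
)
print(dict(flagged_by_source))
```

The same pattern applies to any detector: the signal maps each text value to a small record, and statistics or problematic slices fall out of ordinary grouping and counting on those records.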