科技回声

1 comment

The taxonomy is a contribution from a paper we published at SIGMOD'24 (https://dl.acm.org/doi/10.1145/3626246.3653389).

The insight behind the taxonomy is that not all data transformations in AI systems are equivalent. Some data transformations (aggregations, binning, data compression) produce features that can be reused across many models. Some data transformations (feature encoding/scaling, LLM text encoding) are specific to one model. And some data transformations in real-time AI systems require data that is only available at request time.
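
To make the three cases concrete, here is a minimal Python sketch. It is not taken from the paper: the function names (`avg_spend_per_user`, `fit_standard_scaler`, `spend_ratio`) and the toy data are hypothetical and chosen only for illustration. The aggregation stands in for a reusable feature, the scaler for a transformation tied to one model's training data, and the ratio for a transformation that needs a value only known at request time.

```python
from statistics import mean, pstdev

# Reusable across models: aggregate raw events into a per-user feature.
def avg_spend_per_user(events):
    amounts_by_user = {}
    for user, amount in events:
        amounts_by_user.setdefault(user, []).append(amount)
    return {user: mean(amounts) for user, amounts in amounts_by_user.items()}

# Specific to one model: the scaling parameters come from the statistics
# of that model's training data, so the result is not reusable elsewhere.
def fit_standard_scaler(train_values):
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # avoid dividing by zero
    return lambda x: (x - mu) / sigma

# Request-time: needs the amount of the incoming request, which does not
# exist until the prediction request arrives.
def spend_ratio(request_amount, avg_spend):
    return request_amount / avg_spend if avg_spend else 0.0

# Toy data (hypothetical).
events = [("alice", 10.0), ("alice", 30.0), ("bob", 5.0)]

features = avg_spend_per_user(events)                  # computed once, stored
scale = fit_standard_scaler(list(features.values()))   # fitted per model
ratio = spend_ratio(40.0, features["alice"])           # computed per request

# Input vector for one hypothetical model serving one request.
model_input = [scale(features["alice"]), ratio]
```

In this sketch only `features` would be worth storing for reuse; `scale` has to be refitted for each model's training set, and `ratio` has to be recomputed for every request.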

The Taxonomy for Data Transformations in AI Systems
