This is the code that does the work: <a href="https://github.com/rectanglehq/Shapeshift/blob/d954dab2a866c750cd6862d1eb44830b825bcb06/index.ts#L132-L160">https://github.com/rectanglehq/Shapeshift/blob/d954dab2a866c...</a><p>There are a few ways this could be made less expensive to run:<p>1. Cache those embeddings somewhere. You're only embedding simple strings like "name" and "address" - no need to do that work more than once in the entire lifetime of running the tool.<p>2. As suggested here <a href="https://news.ycombinator.com/item?id=40973028">https://news.ycombinator.com/item?id=40973028</a> change the design of the tool so that instead of doing the work itself it returns a reusable data structure mapping input keys to output keys. That way you only have to run it once, and can then use the generated data structure to apply the transformation to large amounts of data in the future.<p>3. Since so many of the keys are going to have predictable names ("name", "address" etc) you could even pre-calculate embeddings for the 1,000 most common keys across all three embedding providers and ship those as part of the package.<p>Also: in <a href="https://github.com/rectanglehq/Shapeshift/blob/d954dab2a866c750cd6862d1eb44830b825bcb06/index.ts#L57-L64">https://github.com/rectanglehq/Shapeshift/blob/d954dab2a866c...</a> you're using Promise.map() to run multiple embeddings through the OpenAI API at once, which risks tripping their rate limit. You should be able to pass the texts as an array in a single call instead, something like this:<p><pre><code> const response = await this.openai!.embeddings.create({
   model: this.embeddingModel,
   input: texts,
   encoding_format: "float",
 });
 return response.data.map(item => item.embedding);
</code></pre>
<a href="https://platform.openai.com/docs/api-reference/embeddings/create" rel="nofollow">https://platform.openai.com/docs/api-reference/embeddings/cr...</a> says input can be a string OR an array - that's reflected in the TypeScript library here too: <a href="https://github.com/openai/openai-node/blob/5873a017f0f2040ef97040a8df19c5b4dc2a66fd/src/resources/embeddings.ts#L80-L90">https://github.com/openai/openai-node/blob/5873a017f0f2040ef...</a>
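<p>To make suggestion 2 a bit more concrete, here's a rough sketch of what applying a generated mapping could look like. The names KeyMapping and applyMapping are made up for illustration and are not part of Shapeshift, and this assumes flat objects only (no nested keys):

```typescript
// Hypothetical shape for the reusable result: source key -> target key.
// Assumption: flat objects only; nested structures would need more work.
type KeyMapping = { [sourceKey: string]: string };

// Apply a previously generated mapping to one record, with no embedding calls.
// Source keys that are missing from the row, or absent from the mapping, are skipped.
function applyMapping(
  row: { [key: string]: unknown },
  mapping: KeyMapping
): { [key: string]: unknown } {
  const out: { [key: string]: unknown } = {};
  for (const sourceKey of Object.keys(mapping)) {
    if (Object.prototype.hasOwnProperty.call(row, sourceKey)) {
      out[mapping[sourceKey]] = row[sourceKey];
    }
  }
  return out;
}

// Example: a mapping produced once (by whatever does the embedding comparison),
// then reused across many rows.
const exampleMapping: KeyMapping = { fullName: "name", addr: "address" };
const exampleRows = [
  { fullName: "Ada Lovelace", addr: "12 St James's Square", unused: true },
  { fullName: "Alan Turing" },
];
const exampleOutput = exampleRows.map((row) => applyMapping(row, exampleMapping));
```

The expensive embedding comparison would then only need to happen once per pair of schemas, and the mapping itself is plain JSON that can be saved and reviewed by hand.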