科技回声

Not that I can vouch for this as a “good” choice, but you can always use the old-school, original ETL: any operating system plus small tools (awk, sed, jq, grep, etc.) plus shell scripts (sh, bash, zsh, etc.), and optionally Makefiles if the way they work is a good fit for how you process the data.

Not being snarky: this is genuinely a potentially good solution in some situations. See https://adamdrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html
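For anyone wondering what the small-tools approach looks like in practice, here is a minimal sketch. Everything in it is made up for illustration: the `orders.csv` file, its columns, and the `etl` function name are assumptions, not something from the linked article. It extracts rows from a CSV, drops malformed ones with grep, sums amounts per region with awk, and writes a sorted TSV.

```shell
#!/bin/sh
set -eu

# Hypothetical sample input; in a real job this would be an existing file.
workdir=$(mktemp -d)
cat > "$workdir/orders.csv" <<'EOF'
order_id,region,amount
1,east,10.50
2,west,4.25
3,east,2.00
4,west,bad
EOF

# etl FILE: a hypothetical helper that prints "region<TAB>total" lines.
etl() {
  # Extract: skip the header row.
  tail -n +2 "$1" |
    # Transform, step 1: keep only rows with a numeric amount.
    grep -E '^[0-9]+,[a-z]+,[0-9]+(\.[0-9]+)?$' |
    # Transform, step 2: sum the amount column per region.
    awk -F, '{ total[$2] += $3 }
             END { for (r in total) printf "%s\t%.2f\n", r, total[r] }' |
    # awk's for-in order is unspecified, so sort for stable output.
    sort
}

# Load: write the result to a TSV file, then show it.
etl "$workdir/orders.csv" > "$workdir/totals.tsv"
cat "$workdir/totals.tsv"
```

With the sample data above, the row with the `bad` amount is filtered out and the output has one line per region. Each stage streams into the next, so nothing has to fit in memory except the per-region totals in awk.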

Airflow, but use KubernetesPodOperator so you can run your ETLs in Docker containers or pods within k8s (in any language you want). You may need to write a few lines of dumb Python code to build a DAG of KubernetesPodOperators, but the actual work is done inside the containers.

Ask HN: Recommendations for ETL frameworks that are NOT Python-native

2 comments
