When working with ETL, it really helps to not conflate the letters or worry about them in the wrong order. A lot of the most insane complexity comes out of moving too quickly with data.<p>If you don't have good staging data after running extraction (i.e., a 1:1 view of the source system data available in your database), there is nothing you can do to help with this downstream. You should stop right there and keep digging.<p>Extracting the data <i>should be</i> the most challenging aspect of an ETL pipeline. It can make a lot of sense to write custom software to handle this part. It is worth the investment because if you do the extraction really well, the transform & load stages can happen as a combined afterthought [0,1,2,3] in many situations.<p>This also tends to be one of the fastest ways to deal with gigantic amounts of data. If you are doing things like pulling 2 different tables and joining them in code as part of your T/L stages, you are really missing out on the power of views, CTEs, TVFs, merge statements, etc.<p>[0] <a href="https://learn.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql" rel="nofollow">https://learn.microsoft.com/en-us/sql/t-sql/statements/merge...</a><p>[1] <a href="https://www.postgresql.org/docs/current/sql-merge.html" rel="nofollow">https://www.postgresql.org/docs/current/sql-merge.html</a><p>[2] <a href="https://docs.oracle.com/database/121/SQLRF/statements_9017.htm" rel="nofollow">https://docs.oracle.com/database/121/SQLRF/statements_9017.h...</a><p>[3] <a href="https://www.ibm.com/docs/en/db2/12.1?topic=statements-merge" rel="nofollow">https://www.ibm.com/docs/en/db2/12.1?topic=statements-merge</a>