In data engineering your goal is "standardization". You can't afford every team using their unique tech stack, their own databases, coding styles, etc. People leave the company all the time and you as a data engineer always end up with their mess which now becomes your responsibility to maintain. You'd at least be grateful if those people had used the same methods to code stuff as you and your team so that you wouldn't have to become a Bletchley Park decoding expert any time someone leaves. Or you'd hope the tech stack was powerful and flexible enough that other people other than engineer types could pick it up and maintain themselves. They mostly cannot do that, because there is no such powerful system out there. Even when some modern ELT systems get you 80% there, you, data engineer, are still needed to bridge the gap for the 20% of the cases.<p>Data Engineering really comes down to being a set of hacks and workarounds, because there is no data processing system which you could use in a standardized systematic way that data analysts, engineers, scientists and anyone else could use. It's kind of a blue-collar "dirty job" of the software world, which nobody really wants to do, but which pays the highest.<p>There are of course other parts to it, such as managing multiple data products in a systematic way, which engineering minds seem to be best suited for. But the core of data engineering in 2020, I believe, is still implementing hacks and gluing several systems together so as to have a standardized processing system.<p>Snowflake or Databricks Spark bring you closest to the ideal unified system despite all their shortcomings. But still, you sometimes need to process unstructured jsons, extract stuff from html and xml files, unzip a bunch of zip archives and put them into something that these systems recognize and only then you can run sql on it. It is much better than the ETL of the past, where you really had to hack and glue 50% of the system yourself, but it is still nowhere near the ideal system in which you'd simply tell your data analysts: you can do it all yourself, I'm going to show you how. And I won't have to run and maintain a preprocessing job to munge some data into something spark recognizable for you.<p>It is not that difficult to imagine a world where such a system exists and data engineering is not evem needed. But you can be damn sure, that before this happens, that this position will be here to stay, and will be paying high, when 90% of ML and data science is data engineering and cleaning and all these companies hired a shitton of data science and ML people who are now trying to justify their salaries by desperately trying to do data engineers' job.