An old (but fantastic) comment from the previous discussion about Uber's app size that addresses why the Uber app is so big: <a href="https://news.ycombinator.com/item?id=25376346" rel="nofollow">https://news.ycombinator.com/item?id=25376346</a>
I’m the founder of a YC company in the current batch focused on solving this exact problem! <a href="https://www.emergetools.com" rel="nofollow">https://www.emergetools.com</a><p>We parse Obj-C and Swift runtime metadata to determine size contributions of individual types and functions in your app. We use this analysis to post PR comments with granular size diffs to help devs write smaller, better code.<p>I tried it out on the Uber app and immediately noticed a disproportionate impact from their code-gen dependency injection framework, Needle. The codegen is responsible for over 30k classes in the app binary, and contributes over 10 MB! In general, codegen is a common problem with Swift binary sizes, and the fewer reference types generated the better; it even helps with startup time!<p>We’ve written a blog post with case studies about how 7 of the most popular iOS apps could reduce their size: <a href="https://medium.com/swlh/how-7-ios-apps-could-save-you-500mb-of-storage-a828782c973e" rel="nofollow">https://medium.com/swlh/how-7-ios-apps-could-save-you-500mb-...</a>
I'd like to see Apple expose more control over their app thinning technologies.<p>Currently they only deliver the binary for the device's CPU, and only the assets for the device's asset class. There's then some tech targeted at game devs for on-demand assets for things like game levels that you don't need all of on device at one time.<p>I suspect the limitations of this are around the binary not being subject to this, but maybe it could be. I can see a couple of options, one is some way of extending the asset classes to code features, so that the App Store doesn't have to download iPad screens for iPhones, etc. Perhaps this could be extended with either App Store account region or locale so that, Uber in this example could not include the Venmo SDK outside of the US where no one has heard of Venmo.<p>Or perhaps Apple could extend the on-demand assets to allow for some sort of plugin system, perhaps backed by Swift Packages, such that apps can on-demand decide they need the Venmo SDK because they're in the US, and download just that. I don't think we want a generalised package manager here, I don't envision that SDK coming from Venmo directly, but allowing an app author to upload all their separate packages if they want to.<p>With feature heavy, international apps such as Uber I'd expect this to dramatically improve things. I'm not sure whether this benefit would translate to that much demand across the whole App Store though as I think this matters more to a very few big apps. Apple is at that optimisation point in the iOS lifecycle though so perhaps it's worth it to them.
Can someone help me understand this? They blame the source of the large bundle size on:<p>> The choice of Swift as our primary programming language, our fast-paced development environment and feature additions, layered software and its dependencies, and statically linked platform libraries result in large app binaries<p>but can somebody familiar with iOS development explain what makes app bundles so big? Actual CPU instructions or config can't contribute this significantly. The entire Bible is about 4.5mb. If you're writing an app by yourself you almost certainly didn't write that much text in the source code. A sibling comment links to <a href="https://news.ycombinator.com/item?id=25376346" rel="nofollow">https://news.ycombinator.com/item?id=25376346</a> which says that they have a lot of screens but even something like "PayTM (15+ screens)" is still just textual source code and config that I don't follow how it gets beyond kilobytes. The App Store places them at 309mb, so ~68 bibles.<p>I understand when games are large because they typically ship with images and videos included in the binary for game assets. But for a normal application where does the size come from?<p>Is it dependencies? (And how did _they_ get so big?) That weird intro video they have on the loading screen? Are they shipping bitmaps of the cities they have markets in?
> The app has a couple of millions of lines of code<p>I wonder if Uber is planning to do anything about that? The technique described in the article (whole program instructions outlining optimisation) is a band aid style solution, merely delaying the inevitable: the code produced by numerous teams independent of each other will inevitably cross first the download size limit threshold, and later maintainability threshold.
Complementary: This thread on Uber's transition to Swift that almost broke them <a href="https://twitter.com/StanTwinB/status/1336890442768547845" rel="nofollow">https://twitter.com/StanTwinB/status/1336890442768547845</a><p>Includes, among other things: forcing Apple to increase cellular download limits, 45 seconds for letters to appear in Xcode, 12 seconds to call main, rewriting the linker and so on.
This is certainly a great read, and working on it must have been very interesting. That being said, in my experience things like these are invariably technical band-aids over social problems. Whenever I see things like this, often paired with statements like “there’s so many screens and feature flags”, usually the problem is not there but actually in many other processes: for example, the design team adds assets in a way that is not enforced by the usual tooling that checks binary size, or the build process adds duplicate files into the bundle that nobody notices. Sometimes the underlying issue is hard to fix, like if it’s code size explosion due to a custom templating engine, but these issues really should get addressed at some point. Changes like these don’t actually solve the underlying issues, which can be a benefit for a while, but eventually they become so complex that it is hard to maintain them and they start impacting productivity in harder to measure ways by doing things like increasing build times and reducing the quality of debugging information.
> <i>While power law and fractal patterns have revealed themselves in several physical, biological, and man-made phenomena, to our knowledge we are the first to identify their presence in machine-code sequences in computer executable code. Presumably, machine code is a human expression of instructions to a computer and it is well established that all human languages show a power-law in the frequency of the words.</i><p>Made me chuckle. Maybe the authors should look at getting an ACM subscription.<p>[<a href="https://dl.acm.org/doi/pdf/10.1145/1391984.1391986" rel="nofollow">https://dl.acm.org/doi/pdf/10.1145/1391984.1391986</a>]
Machine-code outlining sounds kind of like the opposite of function inlining. Right down to the name! I am amazed I've never heard of this optimization technique being used in compilers before -- it sounds like it could improve performance in many cases by making code smaller (or hurt performance for the same reason that inlining can help performance)...
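To make the "opposite of inlining" idea concrete, here is a toy sketch of one outlining round over lists of pseudo-instructions. Everything in it is an assumption for illustration: the instruction strings, the `OUTLINED_0` name, and the cost model (one instruction per call site, plus one `ret` in the outlined function). It brute-forces every candidate sequence, which would not scale; as far as I know, production outliners such as LLVM's MachineOutliner use a suffix tree to find repeats instead.

```python
def count_occurrences(body, seq):
    """Count non-overlapping occurrences of seq in body (greedy left-to-right scan)."""
    count, i, n = 0, 0, len(seq)
    while i + n <= len(body):
        if body[i:i + n] == seq:
            count += 1
            i += n
        else:
            i += 1
    return count


def replace_occurrences(body, seq, call):
    """Replace each non-overlapping occurrence of seq in body with a single call."""
    out, i, n = [], 0, len(seq)
    while i < len(body):
        if body[i:i + n] == seq:
            out.append(call)
            i += n
        else:
            out.append(body[i])
            i += 1
    return out


def outline_once(functions, name, min_len=2):
    """Outline the repeated sequence with the largest size saving, if any.

    Assumed cost model: each occurrence becomes one call instruction, and the
    new function costs len(seq) instructions plus one ret.
    """
    candidates = set()
    for body in functions.values():
        for n in range(min_len, len(body) + 1):
            for i in range(len(body) - n + 1):
                candidates.add(tuple(body[i:i + n]))

    best_seq, best_saving = None, 0
    for seq in sorted(candidates):  # sorted only to make the result deterministic
        seq = list(seq)
        occurrences = sum(count_occurrences(b, seq) for b in functions.values())
        saving = occurrences * len(seq) - (occurrences + len(seq) + 1)
        if saving > best_saving:
            best_seq, best_saving = seq, saving

    if best_seq is None:
        return dict(functions), 0  # nothing worth outlining

    call = f"call {name}"
    result = {fn: replace_occurrences(b, best_seq, call) for fn, b in functions.items()}
    result[name] = best_seq + ["ret"]
    return result, best_saving


# Hypothetical input: three functions sharing a retain/store-style sequence.
functions = {
    "f": ["load a", "retain a", "store a", "add"],
    "g": ["load a", "retain a", "store a", "sub"],
    "h": ["mul", "load a", "retain a", "store a"],
}
outlined, saved = outline_once(functions, "OUTLINED_0")
size_before = sum(len(b) for b in functions.values())
size_after = sum(len(b) for b in outlined.values())
```

Running further rounds on the result (with fresh names) is what lets sequences that contain earlier calls get outlined again. The trade-off is the one the comment guesses at: smaller code, but an extra call and return on every path through the outlined sequence.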
I would love to work on something like this. Optimizing assembler instructions for size, speed.. just writing some. How do you get a job in this? Is embedded land like this?
Why not do machine outlining in LTO/ThinLTO? `opt` doesn't really scale with huge modules in terms of memory consumption and multi-threading; that's the reason ThinLTO was invented in the first place.<p>I think adding machine outlining into the LLVM pass pipeline is still doable with an LLVM plugin (with the new PassManager)...worst case just come up with a custom LLVM/Clang
So Uber has about 5000 engineers. If all of them write/click/draw 10,000 keystrokes per day, and it's all new code, and Uber has been around 10 years, that's 182 gigabytes of 'human input'.<p>Compiling that down to 200 MB isn't too shabby!
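The back-of-envelope figure above can be reproduced as follows. The inputs are the comment's own numbers; the assumptions that each keystroke is one byte and that all 365 days of the year count are mine, chosen because they are what yields roughly 182 GB.

```python
# Inputs taken from the comment above.
engineers = 5_000
keystrokes_per_day = 10_000
years = 10

# Assumptions (not stated in the comment): 1 byte per keystroke, 365 days per year.
bytes_per_keystroke = 1
days_per_year = 365

total_bytes = engineers * keystrokes_per_day * bytes_per_keystroke * days_per_year * years
total_gb = total_bytes / 1e9  # decimal gigabytes

# Ratio of "human input" to the roughly 200 MB app size quoted in the comment.
app_size_bytes = 200 * 1e6
ratio = total_bytes / app_size_bytes

print(f"{total_gb:.1f} GB of input, about {ratio:.0f}x the app size")
```

With roughly 250 working days per year instead of 365, the same estimate would come out nearer 125 GB, so the headline number is sensitive to that assumption.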
On a smaller note, adding lossless compression to the image assets in the Uber app can save more than 14% of 2.4MB<p>If your app has larger images, don't waste user bandwidth and optimize your assets!
I was quite surprised by the increase in build times:<p>> Overall, 5 rounds of outlining builds in 66 minutes — a 45-minutes addition to the baseline.
There's apparently a bananas crunch/backstory to this, where they committed to Swift before realizing they would hit its limits, and had to come up with a bunch of this optimization madness on the fly. I guess this is the cleaned up version and the more final, stable optimizations for the company blog:<p><a href="https://twitter.com/StanTwinB/status/1336890442768547845" rel="nofollow">https://twitter.com/StanTwinB/status/1336890442768547845</a>
Is part of the strategy using private iOS APIs, tracking everyone outside of ToS, lying about it, getting caught, and then being too large for Apple to actually punish them?
This seems like a case of sloppy product management saved (or rather having its consequences delayed) by person-years of ingenious engineering.<p>Build times in tens of minutes seem terrible.
It is still amazing to me how Uber cannot narrow down the use cases enough. To me it was a done product in 2014, with no need for additional features. I think the software industry as a whole does not have the concept of 1.0. We are trying to ship one more thing all the time.
I'd take a sledge hammer to it. The app doesn't have to be an app at all. It could simply be a stream with an os interactive overlay that intercepts touches. Like a thin client for phones.
Uber is just a scam to launder saudi blood money through softbank. There is zero chance that human driven cars will go away. And self-driving cars cannot drive on roads with human drivers. Uber is in the later stages of the scam now and has "sold" (actually, they gave 400 million USD to the company they "sold" the division to) their self driving setup. They've admitted the only business model that would make them profitable is impossible. It's over. They're just trying to take the money and run now.<p>That their "app" is large is irrelevant to the scam.