Reverse Engineering TikTok's VM Obfuscation

683 points by hazebooth, over 2 years ago

21 comments

noduerme, over 2 years ago
This is really awesome work.

I spent a lot of time in the early 2000s coming up with nasty obfuscation techniques to protect certain IP that inherently needed to run client-side in casino games, up to and including inserting bytecode custom-crafted to crash off-the-shelf decompilers that had to run the code to disassemble it (and forcing them to phone home in the process where possible!).

My view on obfuscation is that since it's never a valid security practice, it's only admissible for hiding machinery from the general public, for instance if you have IP you want to protect from average script kiddies. Any serious IP can be replicated by someone with deep pockets anyway. Most other uses of code obfuscation are nefarious, and obfuscated code should always be assumed to be malicious until proven otherwise. I'm not a reputable large company, but no reputable large company should be going to these lengths to hide their process from the user, because doing so serves no valid security purpose.
codedokode, over 2 years ago
It is interesting that while technologies like canvas, WebGL or WebRTC were intended for other purposes, their main usage became fingerprinting. For example, WebGL provides valuable information about the GPU model and its drivers.

This shows how browser developers race to provide new features while ignoring the privacy impact.

I don't understand why features that allow fingerprinting (reading back canvas pixels or GPU buffers) are not hidden behind a permission.
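
To make the WebGL point concrete, here is a minimal sketch of how a page could read GPU vendor and renderer strings. It assumes a browser that still exposes the `WEBGL_debug_renderer_info` extension (some browsers mask or withhold it); the function name `readGpuInfo` is made up for illustration and is not taken from the article or from TikTok's script.

```javascript
// Sketch: read the GPU identification strings that WebGL can expose.
// `gl` is any WebGLRenderingContext, e.g. canvas.getContext("webgl").
// `readGpuInfo` is a hypothetical helper name used only for this example.
function readGpuInfo(gl) {
  if (!gl) return null; // WebGL is unavailable or blocked

  // WEBGL_debug_renderer_info is a real extension, but browsers may
  // refuse to return it or may return generic values.
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  if (!ext) {
    // Fall back to the standard (usually generic) strings.
    return {
      vendor: gl.getParameter(gl.VENDOR),
      renderer: gl.getParameter(gl.RENDERER),
    };
  }
  return {
    vendor: gl.getParameter(ext.UNMASKED_VENDOR_WEBGL),
    renderer: gl.getParameter(ext.UNMASKED_RENDERER_WEBGL),
  };
}
```

In a browser this would be called as `readGpuInfo(document.createElement("canvas").getContext("webgl"))`. No permission prompt is involved, which is the gap the comment complains about.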
TobyTheDog123, over 2 years ago
TikTok changes this algorithm about once every three months. I've reverse-engineered it about two times, and have since given up and decided to run a headless browser to do it for me. I'd love to see some tool developed to automate solving this so I can sign requests in a more limited context (à la Cloudflare Workers / C@E).
thih9, over 2 years ago
I've seen some of these techniques elsewhere; e.g. javascript-obfuscator supports replacing variable names with hex values [1] or transforming call structure into something more complex [2]. Bytecode generation is new to me; is there an existing JS obfuscation tool, preferably open source, that supports it?

[1]: https://github.com/javascript-obfuscator/javascript-obfuscator#identifiernamesgenerator

[2]: https://github.com/javascript-obfuscator/javascript-obfuscator#stringarraycallstransform
derefr, over 2 years ago
FYI, most CAPTCHA and anti-DDoS services (e.g. Cloudflare) do something very similar: they send the user an obfuscated program implemented on top of an obfuscated JS VM, which the user effectively has to execute as-is, in a real browser, to get back the correct results the gateway is looking for. This is done to prevent simple scraping scripts (the Scrapy type) from being used to scrape the site. If you want to do scraping, you have to accept the extra overhead of driving a real browser. (And not even a headless one; they have tricks to detect that, too.)
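
For readers wondering what "a program implemented on top of a JS VM" means in practice, here is a toy sketch: an interpreter loop written in JavaScript that executes an array of numeric opcodes instead of readable source. The opcode set and the names `OP` and `run` are invented for illustration; real protections use far larger instruction sets that are typically reshuffled between builds.

```javascript
// Hypothetical opcodes for a toy stack machine (not TikTok's or Cloudflare's).
const OP = { PUSH: 0, ADD: 1, MUL: 2, HALT: 3 };

// Interpret a flat array of opcodes and inline operands.
function run(bytecode) {
  const stack = [];
  let pc = 0; // program counter: index of the next opcode
  while (pc < bytecode.length) {
    const op = bytecode[pc++];
    switch (op) {
      case OP.PUSH: // next cell is a literal operand
        stack.push(bytecode[pc++]);
        break;
      case OP.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case OP.MUL: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      case OP.HALT: // return the value on top of the stack
        return stack.pop();
      default:
        throw new Error(`unknown opcode ${op} at ${pc - 1}`);
    }
  }
  return stack.pop();
}

// Encodes (2 + 3) * 4 as bytecode.
const program = [OP.PUSH, 2, OP.PUSH, 3, OP.ADD, OP.PUSH, 4, OP.MUL, OP.HALT];
console.log(run(program)); // 20
```

Someone reading the shipped script sees only the interpreter and an opaque number array, so the actual logic has to be recovered by understanding the instruction set first.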
antiviral, over 2 years ago
This is excellent work.

It also shows how TikTok *may* be in violation of several US/EU privacy laws. I really wonder now who this data is shared with. Perhaps someone should bring this article to the FTC's attention for further review.
KirillPanov, over 2 years ago
Awesome, really awesome work. However:

> If that is something you are interested in, keep an eye out for the second part of this series :)

Your site is missing an RSS/Atom feed, so I can't do that. ::sad face::
wiml, over 2 years ago
Given that the beginning of the "weird string" has a magic number and a version field, I wonder if the point of this is not so much obfuscation as transpilation? The magic number corresponds to ASCII "HNOJ" "@?RC", or perhaps "JONH" "CR?@", which doesn't turn anything up on Google, but it seems odd to include that redundant header if your main goal is minification or obfuscation.
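
The two ASCII readings above can be reproduced with a short sketch. The hex string below is an assumption: it is simply the byte values of the ASCII text quoted in the comment, not a value copied from TikTok's script, and the helper names `hexToAscii` and `reverseWords` are made up for this example.

```javascript
// Assumed bytes: the ASCII codes of "HNOJ" "@?RC" as quoted in the comment.
const magicHex = "484e4f4a403f5243";

// Decode a hex string into ASCII, two hex digits per character.
function hexToAscii(hex) {
  let out = "";
  for (let i = 0; i < hex.length; i += 2) {
    out += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
  }
  return out;
}

// Reverse the characters inside each fixed-size word, which is what
// reading each 4-byte word in the opposite byte order looks like.
function reverseWords(text, wordSize = 4) {
  const words = [];
  for (let i = 0; i < text.length; i += wordSize) {
    words.push(text.slice(i, i + wordSize).split("").reverse().join(""));
  }
  return words.join("");
}

const ascii = hexToAscii(magicHex);
console.log(ascii);               // HNOJ@?RC
console.log(reverseWords(ascii)); // JONHCR?@
```

Which reading is "right" depends on whether the header is meant to be read as two little-endian or two big-endian 32-bit words, which the thread does not settle.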
amelius, over 2 years ago
Can someone explain what VM they are talking about, where that VM is running, and what is running in it?
Aperocky, over 2 years ago
Isn't the same concept also used in YouTube? I believe a Python mock of the equivalent VM exists in youtube-dl.
Alifatisk, over 2 years ago
I never knew that TikTok was shipped with its own virtual machine!

But that explains the obvious subdomain vm.tiktok.com
born-jre, over 2 years ago
Something hit me when reading this. You know how zk-SNARKs are touted as tech that will in future allow creating apps that work on users' private data while preserving their privacy? Could they be used the opposite way, as an obfuscation technique: you encrypt the user's data inside a ZK oracle on the user's side and send it to the server. You could reverse engineer what the inputs to the oracle are, but not what exactly it sends to the server?
mhasbini, over 2 years ago
Deobfuscated script without the VM part: https://gist.github.com/mhasbini/f9269d230ed8eb6dfdbb1bd1be9114b9
lazyeye, over 2 years ago
There needs to be a publicly funded charity that pays people to work full-time de-obfuscating all the major apps. This should be a well-resourced ongoing operation.
derefr, over 2 years ago
That HTTP request is kind of hideous. All those extra parameters that have nothing to do with what the response will end up being, and which change often. Seems like a great way to toss out all your API-response edge-cache-ability.
thecleaner, over 2 years ago
Can I conclude that TikTok implemented a custom VM in JavaScript? Any idea what it's used for, how many instructions it can process, and whether there are other comparable implementations?
Exuma, over 2 years ago
This article is 2 hours old and his Twitter is already changed?
apienx, over 2 years ago
Solid case! Thanks for taking the time to write it up.

Those who care and have to use TikTok can probably add their own virtualization layer (and tolerate the hit in cost/performance).
frozencell, over 2 years ago
The hunt begins.
draw_down, over 2 years ago
> void 0 (a fancy obfuscated way of saying undefined)

Kind of. But it was possible at one point, maybe still is, to rebind `undefined` to some other value, causing trouble. `void` is an operator, a language keyword; it's guaranteed to give you the true undefined value. (In other words, the value whose type is `undefined`.)

If you're coding against an environment as adversarial as these people clearly believe they are, you'd go with `void` as well.
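
A small sketch of the rebinding issue: since ES5 the global `undefined` is read-only, but inside a function it is still an ordinary identifier that a local declaration can shadow, while `void 0` is unaffected. The function name `shadowed` is made up for this example.

```javascript
// Inside a function, `undefined` is just an identifier and can be shadowed
// by a local binding. `void 0` is an operator expression and cannot be.
function shadowed() {
  const undefined = "not undefined"; // legal in function scope
  return {
    viaIdentifier: typeof undefined, // refers to the local string
    viaVoid: typeof void 0,          // always the real undefined value
  };
}

console.log(shadowed()); // { viaIdentifier: 'string', viaVoid: 'undefined' }
```

Code that may run alongside untrusted or hostile scripts therefore tends to prefer `void 0`, which also happens to be shorter after minification.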
Kukumber, over 2 years ago
Nice use of low altitude satellites to track individuals and sniff telecoms all over the world

This decompiled object class also spies on the grid network, that's quite interesting and very clever

I never knew we could also lobby governments to push for some office and cloud software full of spyware, even France had to ban them! [1]

This TikTok app is very dangerous!

Of course /s

[1] - https://news.ycombinator.com/item?id=33686599