I'm on the team. Part 2 of this post series should have lots of interesting technical details for y'all; be patient, I'm still writing it.<p>But to whet your appetite: we used <a href="https://github.com/cscott/js2php" rel="nofollow">https://github.com/cscott/js2php</a> to generate a "crappy first draft" of the PHP code for our JS source. We weren't going for correctness; the goal was to match code style and take care of the syntax changes, so that we could more easily review git diffs from the crappy first draft to the "working" version and concentrate attention on the important bits, not the boring syntax-change-y parts.<p>The original legacy MediaWiki parser used a big pile of regexps and had all sorts of corner cases caused by the particular order in which the regexps were applied, etc.<p>Parsoid uses a PEG tokenizer, written with pegjs (we wrote a PHP backend to pegjs for this project). There are still a bunch of regexps scattered throughout the code, because they are still very useful for text processing and a valuable feature of both JavaScript and PHP as programming languages, but they are not the primary parsing mechanism. Translating the regexps was actually one of the more difficult parts, because there are some subtle differences between JS and PHP regexps.<p>We made a deliberate choice to switch from JS-style loose typing to strict typing in the PHP port. Whatever you think of the long-term merits of strict types for maintainability, programming-in-the-large, etc., they were <i>extremely useful</i> for the porting project itself, since they caught a bunch of non-obvious problems where the types of things were slightly different in PHP and JS.
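<p>To give a flavor of the kind of JS/PHP mismatch I mean, here are a couple of well-known ones, one for regexps and one for types. These are generic illustrations, not examples taken from the Parsoid code, and the variable names are made up. The snippet is plain JS; the PHP behavior is only described in the comments.

```javascript
// Illustrative only: well-known JS/PHP differences, not taken from Parsoid.
// The code is plain JavaScript; PHP behavior is noted in comments.

// 1. Regexps: without the `m` flag, `$` in JS matches only at the very end
//    of the input. In PCRE (PHP's preg_* functions), `$` by default also
//    matches just before a final newline, unless the D modifier is used.
const jsDollarMatchesBeforeNewline = /foo$/.test("foo\n"); // false in JS
// PHP: preg_match('/foo$/', "foo\n") returns 1.

// 2. Types: the same-looking value can be truthy in one language and
//    falsy in the other.
const jsZeroStringIsTruthy = Boolean("0"); // true in JS; (bool)"0" is false in PHP
const jsEmptyArrayIsTruthy = Boolean([]);  // true in JS; (bool)[] is false in PHP
```

Explicit types make you say whether a value is a string, an array, or a bool, and that is often where this sort of mismatch comes to light.<p>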
JS used anonymous objects all over the place. We used PHP associative arrays for many of these, but found it very worthwhile to take the time to create proper typed classes during the translation where possible; it really helped clarify the interfaces and, again, catch a lot of subtle impedance mismatches during the port.<p>We tried to narrow scope by not converting <i>every</i> loose interface or anonymous object to a type. We had already converted as many things as possible to proper JS classes in the "pregame" before the port, but the important thing was to get the port done as quickly as possible. We'll be continuing to tighten the type system -- as much for code documentation as anything else -- as we address code debt moving forward.<p>AMA, although I don't check Hacker News frequently so I can't promise to reply.
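<p>For the curious, here is a rough sketch of what turning an anonymous object into a proper JS class can look like. Everything in it is hypothetical: <code>makeTokenLoose</code>, <code>TagToken</code>, and the <code>{ k, v }</code> attribute shape are made up for illustration and are not Parsoid's actual classes, and the runtime checks in the constructor are my addition to show where a shape mismatch would surface.

```javascript
// Hypothetical names throughout; this is not Parsoid's actual code.

// Before: an anonymous object. Nothing checks its shape, so a caller that
// passes the wrong kind of value only fails much later, far from the cause.
function makeTokenLoose(name, attrs) {
  return { name: name, attrs: attrs };
}

// After: a small class with an explicit interface. In a typed PHP port the
// constructor checks below would become parameter type declarations.
class TagToken {
  constructor(name, attrs = []) {
    if (typeof name !== "string") {
      throw new TypeError("name must be a string");
    }
    if (!Array.isArray(attrs)) {
      throw new TypeError("attrs must be an array of { k, v } pairs");
    }
    this.name = name;
    this.attrs = attrs;
  }

  // Returns the value of the first attribute with the given key, or null.
  getAttribute(key) {
    const found = this.attrs.find((kv) => kv.k === key);
    return found === undefined ? null : found.v;
  }
}
```

With this in place, <code>new TagToken("a", [{ k: "href", v: "Foo" }]).getAttribute("href")</code> gives back <code>"Foo"</code>, while <code>new TagToken(42)</code> throws right away instead of producing a malformed object.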