Emitting Safer Rust with C2Rust

159 points by dtolnay about 2 years ago

12 comments

Animats about 2 years ago
DARPA is funding this. Good.

They haven't reached inter-procedural static analysis yet, which means they can't solve the big problem: how big is an array? Most of the troubles in C come from that. Whoever creates the array knows how big it is. Everybody else is guessing.

A bit of machine learning might help here. If you see

    void dosomethingwitharray(int arr[], size_t n) {}

a good conjecture is that n is the length of arr. So, the question is, if this is translated to

    fn dosomethingwitharray(arr: &[i32]) {}

does it break anything? Both caller and callee have to be analyzed. The C caller has the constraint

    assert_eq!(arr.len(), n);

That's a proof goal. If a simple SMT-type prover can prove that true, then the call can be simplified to just use an ordinary Rust slice. If not, conversion to Rust has to drop to those ugly C pointer forms, preferably with a comment inserted. So you need something that makes good guesses, which is a large language model kind of thing, and something which checks them, which is a formalism kind of thing.

The process can be assisted by putting asserts in the original C, as checks on the C and hints to the conversion process. That's probably the cleanest way to provide human assistance.

I've wanted this for conversion of OpenJPEG code to Rust. That's a tangle of code doing wavelet transforms, with long blocks of touchy subscripting and arithmetic, plus encoders and decoders for an overly complex binary format containing offsets and lengths. Someone recently ran it through c2rust. The unsafe Rust code works. It's compatible with the original C - it segfaults for the same test cases which cause the C code to segfault. This is why a naive transpiler isn't too helpful.

(The date at the bottom of the article is 2022-06-13. Has there been further progress?)
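To make the pointer-plus-length conjecture concrete, here is a minimal sketch of the two shapes being discussed. The function names `sum_array` and `sum_array_raw` are hypothetical, and this is not c2rust output; it only illustrates where the unproven claim "n is the length of arr" has to be trusted when a raw pointer is turned into a slice.

```rust
/// Safe callee: the slice carries its own length, so no separate `n` is needed.
fn sum_array(arr: &[i32]) -> i64 {
    arr.iter().map(|&x| i64::from(x)).sum()
}

/// C-style entry point (hypothetical): raw pointer plus a length the callee
/// cannot verify on its own.
///
/// # Safety
/// `arr` must point to at least `n` readable, initialized `i32` values,
/// unless `n` is 0 or `arr` is null.
unsafe fn sum_array_raw(arr: *const i32, n: usize) -> i64 {
    if arr.is_null() || n == 0 {
        return 0;
    }
    // The conjecture "n is the length of arr" is consumed here. If the caller
    // gets it wrong, this is undefined behavior, exactly as in the C original.
    let slice = unsafe { std::slice::from_raw_parts(arr, n) };
    sum_array(slice)
}
```

Once every caller can be shown to pass a matching length, the raw wrapper can be deleted and callers can pass `&[i32]` directly; until then, the `unsafe` boundary stays visible in one place.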
neopallium about 2 years ago
I used c2rust to start rewriting OpenJPEG into Rust code [0].

It was easy to get the Rust code compiled and working as a drop-in replacement for the C library. This has been a big help with refactoring the unsafe Rust code into safe Rust (manual work). OpenJPEG has a great test suite that has allowed testing that each refactor step doesn't add new bugs (which has happened at least 3 times).

The original run of c2rust generated 96,842 lines of Rust code (about 1 year ago); now it is down to 46,873 lines of code. A lot of the extra 50k lines of code were from C macros that got expanded and from constant lookup tables (the C code had 10-30 values per line, the Rust 1 value per line).

For anyone looking to use c2rust to port C code to Rust, I recommend the following:

1. Set up some automated testing if it doesn't exist already.
2. Do refactoring in small amounts, run the tests and commit the changes before doing more refactoring.
3. Use "search/replace" tools (`sed`) to help with rewriting common patterns. Make sure to follow #2 when doing this.
4. Don't re-organize the code until after most of the unsafe code has been rewritten. This will allow easier side-by-side comparison with the original C code.
5. c2rust expands macros and constants from `#define`. Being able to do side-by-side comparison of the C code will help with adding constants back in and replacing expanded code with Rust macros or just normal Rust functions.

[0] https://github.com/Neopallium/openjpeg/tree/master/openjp2-rs
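As a rough illustration of steps 1 and 2, the sketch below keeps an unsafe function next to its safe refactor and checks that they agree before the old one is removed. Everything here is hypothetical: `max_value_unsafe` is hand-written in the style of transpiled code (it is not actual c2rust output), and `versions_agree` stands in for whatever test suite the project already has.

```rust
/// Hand-written stand-in for transpiler-style code (not actual c2rust output).
///
/// # Safety
/// `data` must point to `len` readable `i32` values.
unsafe fn max_value_unsafe(data: *const i32, len: usize) -> i32 {
    let mut best = i32::MIN;
    let mut i = 0;
    while i < len {
        // Raw pointer arithmetic, as in the C original.
        let v = unsafe { *data.add(i) };
        if v > best {
            best = v;
        }
        i += 1;
    }
    best
}

/// Refactored safe version. Empty input returns `i32::MIN` so that behavior
/// matches the unsafe version exactly.
fn max_value_safe(data: &[i32]) -> i32 {
    data.iter().copied().max().unwrap_or(i32::MIN)
}

/// Differential check: both versions must agree on every test input.
fn versions_agree(cases: &[Vec<i32>]) -> bool {
    cases.iter().all(|c| {
        let old = unsafe { max_value_unsafe(c.as_ptr(), c.len()) };
        old == max_value_safe(c)
    })
}
```

Keeping both versions around for one commit makes each refactor step small and reversible: if the check fails, the diff to inspect is a single function.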
19h about 2 years ago
I took the insertion_sort impl from the bottom of the post and asked gpt4 to rewrite it into idiomatic Rust:

    pub fn insertion_sort(n: i32, p: &mut [i32]) {
        for i in 1..n as usize {
            let tmp = p[i];
            let mut j = i;
            while j > 0 && p[j - 1] > tmp {
                p[j] = p[j - 1];
                j -= 1;
            }
            p[j] = tmp;
        }
    }

    fn main() {
        let mut arr1: [i32; 3] = [1, 3, 2];
        insertion_sort(3, &mut arr1);
        // …
    }

I guess if this actually works, we can translate massive amounts of internal C libraries into human readable Rust... good stuff.

(funnily enough, passing in the "original" code without the `unsafe extern "C"` part makes it produce the exact same output as the above)
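One thing worth noting about the quoted version: it still takes `n` separately from the slice, so a caller passing an `n` larger than `p.len()` gets an out-of-bounds panic. A minimal sketch of the same algorithm that relies on the slice's own length instead (a variation for illustration, not what GPT-4 or c2rust produced) could look like this:

```rust
/// Insertion sort over a slice; the length comes from the slice itself,
/// so there is no separate `n` that can disagree with it.
pub fn insertion_sort(p: &mut [i32]) {
    for i in 1..p.len() {
        let tmp = p[i];
        let mut j = i;
        // Shift larger elements one slot to the right.
        while j > 0 && p[j - 1] > tmp {
            p[j] = p[j - 1];
            j -= 1;
        }
        p[j] = tmp;
    }
}
```

Dropping the parameter changes the function's signature, so every call site has to be updated with it; that is the caller-and-callee analysis mentioned earlier in the thread.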
mtlmtlmtlmtl about 2 years ago
Has anyone put this to serious use? I played around with it at some point when it was fairly new and at that time I was able to transpile the C into Rust just fine, but that didn't help me much. The idea was to be able to use the Rust toolchain to better understand the code, but the resulting Rust code was even less understandable, and also much harder to refactor. In this case I wasn't attempting a rewrite per se, just trying to understand a C codebase plagued with memory safety issues. Quickly gave up on this avenue at that point and just started carefully refactoring the C to make the bugs easier to shake out.

Would love to see a technical write-up from someone outside Immunant using this on a real-world codebase for whatever purpose.
boredumb about 2 years ago
C2Rust is really cool, but if you're familiar with writing Rust and run even a trivial C function through it, it produces something absolutely terrifying. I really enjoy Rust and pray I don't find myself working in a code base someone just ran c2rust against.
BiteCode_dev about 2 years ago
Since this is DARPA, this shows they are interested in Rust, which means we will probably have strong toolchain certifications coming up eventually, pushing Rust even further into the category of "the language you want to use for serious stuff".
CharlesW about 2 years ago
This seems like an interesting project to bridge the "boil the ocean" approach of rewriting in Rust wholesale.

(For anyone else who found it slightly difficult to read, you can remove the added 0.06em `letter-spacing` using your browser's developer tools.)
hardwaregeek about 2 years ago
I'm very excited at the possibilities for C2Rust! Dynamic analysis to fill in the gaps of static analysis makes a lot of sense. I've wanted something similar for inferring TypeScript types via runtime analysis (would not be surprised if it exists already).

I could see a really compelling use case in cross-compilation where you compile your C code to Rust, then use a Rust toolchain to cross compile. Or avoiding interop as well.
anticrymactic about 2 years ago
What problem does C2Rust solve exactly? Isn't it just gonna produce "garbage" Rust?

Calling C directly is already possible in Rust.
FpUser about 2 years ago
I do not know this particular tool, but some automated language-to-language transpilers I have seen produce code that one would not be able to comprehend, never mind edit if the need arises.
xvilka about 2 years ago
I wish they would revive their refactoring tool - it was abandoned during the toolchain upgrade. Without the tool, converting the code becomes much more tedious.
diego_moita about 2 years ago
I am very curious to see how these transpiler problems will be handled by GPT-4 in the upcoming months.