Please be aware that the article describes a problem with a specific implementation of THP. Other operating systems implement it differently and don't suffer from the same caveats (though any implementation will of course have its own disadvantages, since THP support requires making various tradeoffs and policy decisions). FreeBSD's implementation (based on [1]) is more conservative and works by opportunistically reserving physically contiguous ranges of memory in a way that allows THP promotion if the application (or kernel) actually makes use of all the pages backed by the large mapping. It's tied into the page allocator in a way that avoids the "leaks" described in the article, and doesn't make use of expensive scans. Moreover, the reservation system enables other optimizations in the memory management subsystem.

[1] https://www.cs.rice.edu/~druschel/publications/superpages.pdf
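For anyone who wants to poke at this on a FreeBSD box, the mechanism is exposed through a handful of sysctls. A minimal sketch, assuming amd64 (the exact OIDs can differ between platforms and releases):

    # is superpage promotion enabled? (on by default)
    sysctl vm.pmap.pg_ps_enabled

    # counters showing how often reservations actually get promoted or demoted
    sysctl vm.pmap.pde.promotions vm.pmap.pde.demotions vm.pmap.pde.mappings

Watching the promotions counter while a workload runs is a quick way to see whether it really touches enough of each reservation to earn a superpage.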
I've had a really bad run-in with transparent hugepage defragmentation. In a workload consisting of many small-ish reductions, my programme spent over 80% of its total running time in pageblock_pfn_to_page (this was on a 4.4 kernel, https://github.com/torvalds/linux/blob/v4.4/mm/compaction.c#L74-L115) and 97% of the total time in hugepage compaction kernel code. Disabling hugepage defrag with "echo never > /sys/kernel/mm/transparent_hugepage/defrag" led to an instant 30x performance improvement.

There's been some work to improve performance (e.g. https://github.com/torvalds/linux/commit/7cf91a98e607c2f935dbcc177d70011e95b8faff in 4.6), but I haven't checked whether it fixes my workload.
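For anyone hitting the same thing, a minimal sketch of inspecting and changing these knobs at runtime (the paths are standard on Linux; the set of accepted values varies a bit between kernel versions):

    # show the current THP and defrag policies (the active value is in brackets)
    cat /sys/kernel/mm/transparent_hugepage/enabled
    cat /sys/kernel/mm/transparent_hugepage/defrag

    # stop the expensive compaction/defrag work but keep THP itself
    echo never > /sys/kernel/mm/transparent_hugepage/defrag

    # or turn THP off entirely
    echo never > /sys/kernel/mm/transparent_hugepage/enabled

Note that these settings don't survive a reboot; they have to be reapplied from an init script or set via transparent_hugepage=never on the kernel command line.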
So glad this is on the front page of HN. A good 30% of perf problems for our clients are low-level misconfigurations like this one.
For databases:
huge pages - good
THP - bad
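For what it's worth, a rough sketch of that setup on Linux: disable THP (as in the sysfs commands elsewhere in this thread) and give the database explicit, pre-reserved huge pages instead. The page count below is only an illustration and has to be sized to the database's shared memory, and the database itself must be configured to request huge pages:

    # reserve 4096 x 2MB explicit huge pages (~8GB)
    sysctl -w vm.nr_hugepages=4096

    # make the reservation persistent across reboots
    echo 'vm.nr_hugepages = 4096' > /etc/sysctl.d/hugepages.conf

    # confirm the reservation (HugePages_Total / HugePages_Free)
    grep -i huge /proc/meminfo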
Not to mention that there was a race condition in the implementation which would cause random memory corruption under high memory load. Varnish Cache would consistently hit this. Recently fixed:

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/7.2_release_notes/index#kernel
Agreed. Found this to be a problem and fixed it by switching it off three years ago. It seems to be a bigger problem on large systems than on small ones. We had a 64-core server with 384GB RAM, and running too many JVMs made khugepaged go into overdrive and basically cripple the server entirely - unresponsive, getting 1% of the work done, etc.
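If switching THP off entirely isn't an option, khugepaged's aggressiveness can at least be observed and dialed down through its sysfs tunables. A sketch, with the caveat that the exact set of files depends on the kernel version:

    # is khugepaged the thing burning CPU right now?
    top -b -n 1 | grep khugepaged

    # how hard it scans: pages per pass and sleep between passes
    cat /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan
    cat /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs

    # back it off, e.g. scan only once a minute
    echo 60000 > /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs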
I stumbled upon this feature when some Windows VMs running 3D-accelerated programs exhibited freezes of multiple seconds every now and then. We quickly discovered that khugepaged would hog the CPU completely during these hangs. Disabling THP solved the performance issues.
Bad advice... The following article is much better at actually measuring the impact:

https://alexandrnikitin.github.io/blog/transparent-hugepages-measuring-the-performance-impact/

The conclusion in particular is noteworthy:

> Do not blindly follow any recommendation on the Internet, please! Measure, measure and measure again!
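In that spirit, a few quick ways to check whether THP is actually in play on a given box and whether compaction is where the time goes. A sketch, not an exhaustive methodology:

    # anonymous memory currently backed by huge pages, system-wide
    grep AnonHugePages /proc/meminfo

    # THP fault/collapse and compaction counters (sample over time)
    grep -E 'thp|compact' /proc/vmstat

    # if compaction is suspected, see where kernel CPU time is going
    perf top -g

If the compact_stall / thp_* counters climb quickly while throughput drops, that's a much stronger signal than any blanket recommendation.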
Transparent hugepages cause a massive slowdown on one of my systems. It has 64GB of RAM, but it seems the kernel allocator fragments under my workload after a couple of days, resulting in very few free >2MB regions (as per /proc/buddyinfo) even with >30GB of free RAM. This slowed my KVM boots down dramatically (10s -> minutes), and perf top looked like the allocator was spending a lot of cycles repeatedly trying and failing to allocate huge pages.

(I don't want to preallocate hugepages because KVM is only a small part of my workload.)
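For anyone who wants to check for the same kind of fragmentation on their own box: /proc/buddyinfo lists, per memory zone, the number of free blocks of each order, and a 2MB huge page on x86-64 corresponds to order 9. A quick sketch:

    # columns are free blocks of order 0..10 (4KB, 8KB, ..., 4MB)
    cat /proc/buddyinfo

    # fragmentation looks like large counts in the low orders but
    # near-zero in the order-9+ columns despite plenty of free memory
    grep Normal /proc/buddyinfo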
Shouldn't huge pages be used automatically if you malloc() large amounts of memory at once? Wouldn't that cover some of the applications that benefit from it?