> <i>sds.total_pwr is the sum of the power of all CPUs in the scheduling domain. This sum ends up being zero and that’s what causing the crash – division by zero.</i><p>> <i>The “CPU power” is used to take into account how much calculating capabilities a CPU has compared to the other CPUs and the main factors for calculating it are:</i><p>> <i>1. Whether the CPU is shared, for example by using multithreading.</i><p>> <i>2. How many real-time tasks the CPU is processing.</i><p>> <i>3. In newer kernels, how much time the CPU had spent processing IRQs.</i><p>> <i>The current suggested fix for this bug is relying on the theory that while taking into account the real-time tasks (#2 above), scale_rt_power() could return negative value, and thus the sum of all CPU powers may end up being zero.</i><p>The author doesn't really describe how this panic is related to uptime. Do long-running kernels accumulate a lot of real-time tasks, or is it a leak of some kind?<p>The suggested fix link doesn't provide any extra context as to why it's uptime-related either.
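To make the quoted theory concrete, here is a toy model in Python of how one negative per-CPU power can cancel the others and leave a zero divisor. This is not kernel code: the function bodies, the constant, and the numbers are all assumptions made up for illustration, and only the names scale_rt_power and total_pwr come from the quote. It says nothing about why the real values would drift with uptime.

```python
# Toy model only: not kernel code. All bodies and numbers are hypothetical.
SCHED_POWER_SCALE = 1024  # nominal "full" power of one CPU in this sketch


def scale_rt_power(total_time, rt_time):
    """Share of a CPU left after real-time work, scaled to SCHED_POWER_SCALE.

    If rt_time is allowed to exceed total_time, the result goes negative.
    """
    available = total_time - rt_time
    return SCHED_POWER_SCALE * available // total_time


def average_load(loads, powers):
    """Total load normalized by total power; divides by the sum of the powers."""
    total_pwr = sum(powers)
    return SCHED_POWER_SCALE * sum(loads) // total_pwr


def clamped_power(power):
    """One possible guard (an assumption, not the actual patch): never below 1."""
    return max(power, 1)
```

With two CPUs, scale_rt_power(1000, 0) and scale_rt_power(1000, 2000) sum to zero in this model, so average_load raises ZeroDivisionError. Passing each value through clamped_power first keeps the divisor positive.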
Uptime-related crashes seem fairly common. Here’s one of my stories, from Thanksgiving 2012: <a href="https://jacob.jkrall.net/turkey-day-down-time" rel="nofollow noreferrer">https://jacob.jkrall.net/turkey-day-down-time</a>
I’ve seen a couple of others since, but they had the same general shape, so I didn’t bother writing the same story again.