I am having a problem with my Linux desktop and am hoping this community can help. The hardware is only about a year old: the CPU is an AMD Ryzen 7 7800X3D and the GPU is an AMD Radeon RX 7900 XTX. I am running KDE Plasma 6.4.4 on Wayland. My distro is Fedora 42, and I keep up with regular updates. The kernel version is 6.15.9.
Lately, I have been getting a flicker of horizontal white stripes occasionally coming across my monitors. It does not seem to matter what the program on the screen is. They’re not making it impossible to use the computer, but it is very distracting when it happens. I am also worried that it may be the hardware failing, but I am hoping it’s just a driver issue.
Is this a known issue with AMD drivers? Part of my concern is that last year I installed amdgpu and rocm from the AMDGPU and ROCm repos to play with AI models locally. Now I am wondering if that is interfering with my video drivers. How can I tell which ones are being used? If I want to go back to the stock drivers, do I just uninstall the amdgpu package with dnf?
Change the cable
I’ve had similar visual issues in the recent past on an AMD RX 7600 XT under Bazzite (based on Fedora 42). It turned out the cable was failing. I swapped cables and it hasn’t happened since.
Sounds like display underflow but I won’t know until I see it. Can you tell us the contexts this occurs in? You mention the program is irrelevant; will this occur at an idle desktop? Does it need something to trigger it, like video playback?
I don’t believe the ROCm package you’ve installed would be relevant from the UMD side as that would bundle OGLP / AMDVLK (lately it’s the mesa drivers: RadeonSI and RADV). It shouldn’t change the default GFX API drivers used to compose your desktop.
Are you able to check your present amdgpu (gfx kernel driver) version (maybe dnf info amdgpu)? For whatever it’s worth, the kernel driver encapsulates the display abstraction layer (as it does on windows), which I believe is the relevant part as far as underflow is concerned.
Thank you for responding! I’m not sure how to find the versions, but here goes:
glxinfo reports:
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: AMD (0x1002)
    Device: Radeon RX 7900 XTX (radeonsi, navi31, LLVM 20.1.6, DRM 3.63, 6.15.9-201.fc42.x86_64) (0x744c)
    Version: 25.1.4
    Accelerated: yes
    Video memory: 24576MB
OpenGL renderer string: Radeon RX 7900 XTX (radeonsi, navi31, LLVM 20.1.6, DRM 3.63, 6.15.9-201.fc42.x86_64)
AMDgpu package:
$ dnf info amdgpu
Updating and loading repositories:
Repositories loaded.
Available packages
Name : amdgpu
Epoch : 1
Version : 6.2.60203
Release : 2044426.el9
The exact kernel version is 6.15.9-201
I am also seeing some amdgpu-related messages in dmesg:
amdgpu 0000:03:00.0: [drm] Registered 4 planes with drm panic
amdgpu 0000:03:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn32_program_compbuf_size line:138
What could cause an underflow? I have two displays running: Primary is 3840x2160@60 (150% scale, HDR mode), Secondary is 2560x1440@60. Could cables be an issue?
Perhaps I fudged the dnf command example; it doesn’t appear to be possible to list the gfx kernel driver version via dnf. The entity we see is the external package installed for ROCm support. On that note, it’s worth mentioning that Fedora packages ROCm directly, and I’d recommend uninstalling the downloaded driver and switching over unless you have a specific reason to use the out-of-repo package.
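Since dnf can’t answer the "which kernel driver is in use" question, one rough way to check is to look at where the loaded amdgpu module file lives. The sketch below is only a heuristic: the function name `classify_module_path` is made up for illustration, and it assumes that modules shipped with the distro kernel sit under `kernel/drivers/` while DKMS or vendor builds land elsewhere (for example under `extra/` or `updates/`).

```shell
#!/usr/bin/env bash
# Classify a kernel module path as in-tree or out-of-tree.
# Heuristic only (an assumption, not an official rule): distro kernel
# modules live under .../kernel/drivers/..., DKMS or vendor builds
# usually end up somewhere else in /lib/modules/<version>/.
classify_module_path() {
    case "$1" in
        */kernel/drivers/*) echo "in-tree" ;;
        "")                 echo "unknown" ;;
        *)                  echo "out-of-tree" ;;
    esac
}

# Usage on the affected machine:
#   classify_module_path "$(modinfo -F filename amdgpu)"
#   lspci -k | grep -A3 -Ei 'vga|display'   # shows "Kernel driver in use"
```

If this reports "in-tree", the desktop is most likely running on the stock Fedora kernel driver regardless of what the external repo offers.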
As for underflow, a colleague of mine provided a nice little run-down: the display buffer holds the image you want to send to the display and is essentially part of GPU memory. When you send it via DisplayPort, HDMI etc., you add some sync signals to indicate properties like the start of a frame or the start of a line, plus extra data. If at any point this exchange gets messed up (for example, memory is unreadable, or you send data when it’s not ready), you get garbage on-screen.
One way underflow can happen from the context of DAL (display abstraction layer) is during a DPM (dynamic power management) change. This change takes a relatively long time and, on rare occasions, can lead to underflow.
Alternatively, sometimes display hardware can’t change refresh rate (for example) as fast as we ask it to, and you may end up with underflow from that. I suppose we could try other cables but the impression I get is that this just started happening at some point?
If this behaviour is very recent, it could be from the distro-provided amdgpu kernel driver, though I’m not sure whether having the external package installed could cause a conflict. As I mentioned above, it could be worth removing that set of packages and installing the Fedora ROCm metapackage instead (sudo dnf install rocm).
Thank you for the detailed answer, this is very informative. I should read up some more on the underflow issue. This is the first time I’ve heard that term. I’m familiar with buffer underflows, but this sounds a little more complex.
You are right, it did start happening just recently (within the last couple of weeks maybe). I forgot to mention that I am running this through a 4-port KVM. I didn’t think it relevant before because I’m not seeing any issues when using another port (my work PC), but I can’t rule it out.
It has been some time since I played with it, but I think the reason I went out-of-repo at the time was to get recent enough versions of amdgpu and rocm to run ollama. I was following some online guide and had no idea what I was doing so I probably messed something up.
It sounds like it should be safe to uninstall the out-of-repo amdgpu and rocm packages. I am not doing any local AI right now so I can probably leave it out. I do use the PC for gaming, but from what I’ve been reading it sounds like the standard drivers are good enough for that now.
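Before removing anything, it may help to list which installed packages actually came from the external repos. This is a rough sketch, not a tested procedure: the helper name `packages_from_vendor_repos` is hypothetical, and the repo ids `amdgpu` and `rocm` are assumptions that should be checked against the files in /etc/yum.repos.d/.

```shell
#!/usr/bin/env bash
# Read "name repo" pairs on stdin and print the names of packages whose
# originating repo id starts with "amdgpu" or "rocm".
# The repo ids are assumptions; adjust the pattern to match your setup.
packages_from_vendor_repos() {
    awk '$2 ~ /^(amdgpu|rocm)/ { print $1 }'
}

# Usage on the affected machine (assumes dnf's from_repo query tag):
#   dnf repoquery --installed --queryformat '%{name} %{from_repo}\n' \
#       | packages_from_vendor_repos
```

The resulting list can be reviewed by hand before passing anything to dnf remove.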
Thank you everyone for your help. I didn’t expect anyone to actually reply and you guys have been awesome! I am going to swap the cables first of all since that’s an easy thing to try and see if anything changes.
To be honest, I feel as if I may have jumped the gun somewhat by suggesting this could be display underflow. I realise it could be tricky, but if you could somehow capture some footage of this in progress, I’d be curious to see it.
Does this white line show up across both displays in tandem?
It’s very quick and unpredictable. I have not noticed it across both screens at the same time, but can’t rule that out. They’re pretty thin lines and usually only one at a time, like a scan line on a TV. It just started becoming more noticeable recently so I’ll try to keep track of when it happens.
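One way to keep track might be to pull the timestamped REG_WAIT timeout messages out of the kernel log and compare them against when the flicker was seen. A minimal sketch, assuming the message text stays the same as in the dmesg output quoted earlier; the function name `amdgpu_display_warnings` is made up for illustration.

```shell
#!/usr/bin/env bash
# Filter kernel-log lines on stdin down to amdgpu REG_WAIT timeouts.
# The pattern is taken from the dmesg line quoted in this thread; other
# kernel versions may word the message differently.
amdgpu_display_warnings() {
    # "|| true" keeps the exit status zero when nothing matches.
    grep -E 'amdgpu.*REG_WAIT timeout' || true
}

# Usage on the affected machine (current boot, ISO timestamps):
#   journalctl -k -b -o short-iso | amdgpu_display_warnings
```

If the timestamps line up with the stripes, that would point toward the kernel display code rather than the cable, though a match alone would not prove it.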
If it’s captured via a DVR system, it’d likely be a UMD issue. If not, I would attribute it to DAL / KMD.
Try to use the instant replay feature in GPU screen recorder to see if you can cap it in there?
I didn’t think it would show up in a screen recording, but I can try that. Thanks again