
About amdgpu driver crashes

Xinmudotmoe

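After running for a while, the screen starts to flicker and some windows (Chromium) become partially transparent.

If I keep using the machine after that, the whole screen eventually freezes and then shows what looks like display corruption. With the kernel parameters added, the screen goes black and the session drops back to the SDDM login screen; without them it stays stuck in the corrupted state.

The affected systems are Gentoo (new-world) and loongarchlinux (new-world).

Desktop environments: plasma (Wayland/X11), gnome (Wayland/X11), xfce4 (X11), openbox (X11)

Driver: amdgpu

Kernel: loongarchlinux uses 6.4.1-1

Gentoo uses 6.5.0-rc5 (https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git, loongarch-next branch)

Graphics card: RX550 4G

Motherboard: 天创者L5A2

BIOS: V4.0.05429-stable202302_rel

I changed a number of kernel parameters; I will add the list tonight.

The same graphics card with an AMD A4-6210 CPU and its motherboard shows no problems.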


JackyMuyi

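My motherboard is also the 天创者L5A2, with the 202302dbg BIOS. The graphics card is an HD7870, the system is Arch with kernel 6.4.1-1, and the desktop is plasma (Wayland) + SDDM. So far I have not seen display corruption, flickering, a frozen screen, or partially transparent Chromium windows while idle. However, after switching to another tty and back to tty1, the screen is corrupted; switching to tty2 (the SDDM screen) restores it, and tty1 is fine after switching back again. I also found earlier that after suspend the machine could only be woken with the power button, and the screen was corrupted on wake. Since a later system update I have not seen this again (though I rarely use suspend).
The one real problem is under amdgpu: running games in Wine, especially those that use hardware rendering such as OpenGL, always makes the whole screen go black briefly and come back, after which the system is completely frozen. No key works, and the only way out is a long press of the power button followed by a reboot. The radeon driver does not have this problem.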


Xinmudotmoe

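JackyMuyi I also get display corruption when switching ttys. Could you check whether running OpenGL with glmark2 causes the freeze? glmark2 runs fine here; it is light workloads that die instead.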


Xinmudotmoe

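The kernel command line was almost empty at first. After the rendering problems appeared, I tuned the parameters following the Gentoo and Arch Linux wikis: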

 amdgpu.gpu_recovery=1 amdgpu.lockup_timeout=3000 amdgpu.sg_display=0 pci=noats amdgpu.send_sigterm=1 amdgpu.mes=1 amdgpu.moverate=1024 amdgpu.pcie_gen2=1 amd.aspm=1 radeon.cik_support=0 radeon.si_support=0 amdgpu.cik_support=1 amdgpu.si_support=1 amdgpu.dpm=0 amdgpu.dc=1 amdgpu.vm_update_mode=3 amdgpu.runpm=0 amdgpu.ppfeaturemask=0xfff7ffff

However... it feels like this only delayed the symptoms; it did not fix them.
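For readers comparing their own setup against this parameter list, here is a minimal Python sketch that groups `module.param=value` tokens by module. The function name `parse_module_params` is made up for this illustration; the command line is copied from the post above. One thing the grouping makes visible: `amd.aspm=1` lands under a module named `amd`, not `amdgpu`. If `amdgpu.aspm` was intended, the parameter as written would presumably not have reached the amdgpu driver.

```python
def parse_module_params(cmdline: str) -> dict[str, dict[str, str]]:
    """Group `module.param=value` tokens from a kernel command line by module."""
    params: dict[str, dict[str, str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        module, dot, name = key.partition(".")
        if not sep or not dot:
            continue  # skip tokens without a module prefix, e.g. `pci=noats`
        params.setdefault(module, {})[name] = value
    return params


# Parameter list exactly as posted in this thread.
CMDLINE = (
    "amdgpu.gpu_recovery=1 amdgpu.lockup_timeout=3000 amdgpu.sg_display=0 "
    "pci=noats amdgpu.send_sigterm=1 amdgpu.mes=1 amdgpu.moverate=1024 "
    "amdgpu.pcie_gen2=1 amd.aspm=1 radeon.cik_support=0 radeon.si_support=0 "
    "amdgpu.cik_support=1 amdgpu.si_support=1 amdgpu.dpm=0 amdgpu.dc=1 "
    "amdgpu.vm_update_mode=3 amdgpu.runpm=0 amdgpu.ppfeaturemask=0xfff7ffff"
)

if __name__ == "__main__":
    for module, opts in parse_module_params(CMDLINE).items():
        print(module, opts)
```

On a running system, the values the driver actually picked up can be compared against the files under /sys/module/amdgpu/parameters/.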

Some odd log entries:

Aug 28 20:35:30 archlinux kernel: amdgpu 0000:05:00.0: amdgpu: Disabling VM faults because of PRT request!

-- a large number of normal-looking log lines omitted here
Aug 28 21:00:16 archlinux budgie-wm[2657]: Can't update stage views actor <unnamed>[<MetaWindowActorX11>:0x55556724cf60] is on because it needs an allocation.
Aug 28 21:00:16 archlinux budgie-wm[2657]: Can't update stage views actor <unnamed>[<MetaSurfaceActorX11>:0x555567448660] is on because it needs an allocation.
-- The budgie-wm messages may be related to user actions; these two lines repeat consistently after certain operations

Aug 28 21:04:03 archlinux kgx[2908]: kgx has no capability of surrounding-text feature
-- The surrounding-text messages may be related to user actions, and kgx is not the only process that triggers them

Aug 28 21:13:57 archlinux budgie-panel[2677]: gvc_mixer_card_get_index: assertion 'GVC_IS_MIXER_CARD (card)' failed
Aug 28 21:13:57 archlinux budgie-panel[2677]: invalid (NULL) pointer instance
Aug 28 21:13:57 archlinux budgie-panel[2677]: g_signal_connect_object: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed
Aug 28 21:13:57 archlinux budgie-panel[2677]: gvc_mixer_stream_get_volume: assertion 'GVC_IS_MIXER_STREAM (stream)' failed
Aug 28 21:13:57 archlinux budgie-panel[2677]: gvc_mixer_stream_get_is_muted: assertion 'GVC_IS_MIXER_STREAM (stream)' failed
-- These messages seem to be related to switching ttys

-- After switching ttys, icons turn white (X11); not yet reproduced on x86

Aug 28 21:14:41 archlinux chrome[5647]: chrome has no capability of surrounding-text feature
Aug 28 21:14:43 archlinux chromium-snapshot-bin.desktop[5647]: [5647:5647:0828/211443.543526:ERROR:interface_endpoint_client.cc(665)] blink.mojom.WidgetHost
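-- These are the last two log lines before the crash. They look Chromium-related, but the driver crashes even without Chromium.

-- Running Chromium seems to trigger the crash sooner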

Aug 28 21:14:51 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=1462608, emitted seq=1462610
Aug 28 21:14:51 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process chrome pid 5683 thread chrome:cs0 pid 5700
Aug 28 21:14:51 archlinux kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset begin!
Aug 28 21:14:51 archlinux kernel: amdgpu: cp is busy, skip halt cp
Aug 28 21:14:52 archlinux kernel: amdgpu: rlc is busy, skip halt rlc
Aug 28 21:14:52 archlinux kernel: amdgpu 0000:05:00.0: amdgpu: PCI CONFIG reset
Aug 28 21:14:52 archlinux kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset succeeded, trying to resume
Aug 28 21:14:52 archlinux kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400380000).
Aug 28 21:14:52 archlinux kernel: [drm] VRAM is lost due to GPU reset!
Aug 28 21:14:52 archlinux kernel: [drm] UVD and UVD ENC initialized successfully.
Aug 28 21:14:52 archlinux kernel: [drm] VCE initialized successfully.
Aug 28 21:14:52 archlinux kernel: amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow start
Aug 28 21:14:52 archlinux chromium-snapshot-bin.desktop[5683]: amdgpu: amdgpu_cs_query_fence_status failed.
Aug 28 21:14:52 archlinux kernel: amdgpu 0000:05:00.0: amdgpu: recover vram bo from shadow done
Aug 28 21:14:52 archlinux kernel: [drm] Skip scheduling IBs!
Aug 28 21:14:52 archlinux kernel: [drm] Skip scheduling IBs!

-- This is where the driver crashes

Aug 28 21:14:52 archlinux kernel: ------------[ cut here ]------------
Aug 28 21:14:52 archlinux kernel: WARNING: CPU: 0 PID: 2666 at drivers/gpu/drm/ttm/ttm_bo.c:326 ttm_bo_release+0x2b0/0x2d8 [ttm]
Aug 28 21:14:52 archlinux kernel: Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device amdgpu rfkill nls_cp936 vfat fat amdxcp gpu_sched drm_buddy drm_suballoc_helper drm_display_helper drm_ttm_helper snd_hda_codec_realtek ttm snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore mousedev loongson drm_dma_helper acpi_ipmi drm_kms_helper ipmi_si ipmi_devintf ipmi_msghandler crypto_user loop fuse dm_mod ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 nvme xhci_pci xhci_pci_renesas nvme_core nvme_common usbhid
Aug 28 21:14:52 archlinux kernel: CPU: 0 PID: 2666 Comm: budgie-w:shlo0 Not tainted 6.5.0-rc5-2 #1 9ddd6f9e213afcce8ba7dc396751e98e8a3d83e5
Aug 28 21:14:52 archlinux kernel: Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-EVB/Loongson-LS3A5000-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05420-stable20
Aug 28 21:14:52 archlinux kernel: pc ffff80000265d184 ra ffff800002850014 tp 90000001200c0000 sp 90000001200c3a90
Aug 28 21:14:52 archlinux kernel: a0 90000001020e89c0 a1 900000010550fd38 a2 90000001200c3a58 a3 90000001200c3a48
Aug 28 21:14:52 archlinux kernel: a4 90000001200c3a50 a5 900000010550fc00 a6 0000000000000000 a7 0000000000000008
Aug 28 21:14:52 archlinux kernel: t0 0000000000000001 t1 bab5dfbe57163d87 t2 0000000000000001 t3 00000000000018bd
Aug 28 21:14:52 archlinux kernel: t4 0000000000000001 t5 fffffffffffffffc t6 0000000000004000 t7 0000000000000000
Aug 28 21:14:52 archlinux kernel: t8 0000000000000001 u0 0000000000000000 s9 0000000000000001 s0 90000001020e89c0
Aug 28 21:14:52 archlinux kernel: s1 ffff80000267c000 s2 900000010ee8eea0 s3 90000001020e8858 s4 9000000105efc900
Aug 28 21:14:52 archlinux kernel: s5 0000000000000004 s6 90000001200c3ec0 s7 0000000000000000 s8 90000001200c3ca4
Aug 28 21:14:52 archlinux kernel:    ra: ffff800002850014 amdgpu_bo_unref+0x20/0x34 [amdgpu]
Aug 28 21:14:52 archlinux kernel:   ERA: ffff80000265d184 ttm_bo_release+0x2b0/0x2d8 [ttm]
Aug 28 21:14:52 archlinux kernel:  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
Aug 28 21:14:52 archlinux kernel:  PRMD: 00000004 (PPLV0 +PIE -PWE)
Aug 28 21:14:52 archlinux kernel:  EUEN: 00000000 (-FPE -SXE -ASXE -BTE)
Aug 28 21:14:52 archlinux kernel:  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
Aug 28 21:14:52 archlinux kernel: ESTAT: 000c0000 [BRK] (IS= ECode=12 EsubCode=0)
Aug 28 21:14:52 archlinux kernel:  PRID: 0014c011 (Loongson-64bit, Loongson-3A5000-HV)
Aug 28 21:14:52 archlinux kernel: CPU: 0 PID: 2666 Comm: budgie-w:shlo0 Not tainted 6.5.0-rc5-2 #1 9ddd6f9e213afcce8ba7dc396751e98e8a3d83e5
Aug 28 21:14:52 archlinux kernel: Hardware name: Loongson Loongson-3A5000-HV-7A2000-1w-V0.1-EVB/Loongson-LS3A5000-7A2000-1w-EVB-V1.21, BIOS Loongson-UDK2018-V4.0.05420-stable20
Aug 28 21:14:52 archlinux kernel: Stack : 00000000000003ac 0000000000000000 9000000002723520 90000001200c0000
Aug 28 21:14:52 archlinux kernel:         90000001200c3700 90000001200c3708 0000000000000000 90000001200c3848
Aug 28 21:14:52 archlinux kernel:         90000001200c3840 90000001200c3840 90000001200c3660 0000000000000001
Aug 28 21:14:52 archlinux kernel:         0000000000000001 90000001200c3708 bab5dfbe57163d87 900000010018fb40
Aug 28 21:14:52 archlinux kernel:         0000000000000001 0000000000000003 0000000000000000 2e302e34562d3831
Aug 28 21:14:52 archlinux kernel:         74732d3032343530 0000000000000a6a 0000000004c14000 0000000000000001
Aug 28 21:14:52 archlinux kernel:         0000000000000000 0000000000000000 9000000003834418 900000000399d000
Aug 28 21:14:52 archlinux kernel:         0000000000000000 0000000000000004 90000001200c3ec0 0000000000000000
Aug 28 21:14:52 archlinux kernel:         90000001200c3ca4 0000000000000000 9000000002723538 00007fffef1b5cbc
Aug 28 21:14:52 archlinux kernel:         00000000000000b0 0000000000000004 0000000000000000 0000000000071c1d
Aug 28 21:14:52 archlinux kernel:         ...
Aug 28 21:14:52 archlinux kernel: Call Trace:
Aug 28 21:14:52 archlinux kernel: [<9000000002723538>] show_stack+0x5c/0x180
Aug 28 21:14:52 archlinux kernel: [<9000000003294e24>] dump_stack_lvl+0x60/0x88
Aug 28 21:14:52 archlinux kernel: [<9000000003284ee4>] __warn+0x84/0xc8
Aug 28 21:14:52 archlinux kernel: [<9000000003258828>] report_bug+0x19c/0x204
Aug 28 21:14:52 archlinux kernel: [<9000000003295498>] do_bp+0x194/0x2ac
Aug 28 21:14:52 archlinux kernel: [<9000000003d61924>] exception_handlers+0x1924/0x10000
Aug 28 21:14:52 archlinux kernel: [<ffff80000265d184>] ttm_bo_release+0x2b0/0x2d8 [ttm]
Aug 28 21:14:52 archlinux kernel: [<ffff800002850014>] amdgpu_bo_unref+0x20/0x34 [amdgpu]
Aug 28 21:14:52 archlinux kernel: [<ffff800002856150>] amdgpu_gem_object_free+0x34/0x58 [amdgpu]
Aug 28 21:14:52 archlinux kernel: [<9000000002db77c8>] drm_gem_dmabuf_release+0x4c/0x6c
Aug 28 21:14:52 archlinux kernel: [<9000000002dfb460>] dma_buf_release+0x3c/0xa0
Aug 28 21:14:52 archlinux kernel: [<900000000298f384>] __dentry_kill+0x148/0x1d8
Aug 28 21:14:52 archlinux kernel: [<9000000002972be8>] __fput+0x128/0x274
Aug 28 21:14:52 archlinux kernel: [<9000000002761324>] task_work_run+0x80/0xbc
Aug 28 21:14:52 archlinux kernel: [<9000000002743600>] do_exit+0x2f8/0x85c
Aug 28 21:14:52 archlinux kernel: [<9000000002743d2c>] do_group_exit+0x34/0x94
Aug 28 21:14:52 archlinux kernel: [<9000000002751b20>] get_signal+0x754/0x7fc
Aug 28 21:14:52 archlinux kernel: [<900000000272582c>] arch_do_signal_or_restart+0x74/0xc2c
Aug 28 21:14:52 archlinux kernel: [<90000000027c8a98>] exit_to_user_mode_loop.isra.0+0x90/0x10c
Aug 28 21:14:52 archlinux kernel: [<90000000032961fc>] syscall_exit_to_user_mode+0x68/0x78
Aug 28 21:14:52 archlinux kernel: [<900000000272119c>] handle_syscall+0xbc/0x158
Aug 28 21:14:52 archlinux kernel:
Aug 28 21:14:52 archlinux kernel: ---[ end trace 0000000000000000 ]---
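-- I did not seem to see the information above in dmesg, but journalctl recorded it. The BIOS reports a version other than 5429... I will try upgrading it and see what happens.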
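To compare several crashes, it can help to pull the ring timeout lines out of the journal automatically. The following Python sketch does that with a regular expression written against the exact message format shown in the log above; the names `TIMEOUT_RE` and `find_ring_timeouts` are made up for this illustration, and other kernel versions may word the message differently.

```python
import re

# Matches lines such as:
#   ... *ERROR* ring gfx timeout, signaled seq=1462608, emitted seq=1462610
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(lines):
    """Yield (ring, signaled, emitted, pending) for each ring timeout line."""
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            signaled = int(match["signaled"])
            emitted = int(match["emitted"])
            # pending = jobs emitted to the ring but not yet signaled as done
            yield match["ring"], signaled, emitted, emitted - signaled


# Two lines copied from the log in this post.
SAMPLE = [
    "Aug 28 21:14:51 archlinux kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring gfx timeout, signaled seq=1462608, emitted seq=1462610",
    "Aug 28 21:14:51 archlinux kernel: amdgpu 0000:05:00.0: amdgpu: GPU reset begin!",
]

if __name__ == "__main__":
    for ring, signaled, emitted, pending in find_ring_timeouts(SAMPLE):
        print(f"ring={ring} signaled={signaled} emitted={emitted} pending={pending}")
```

For the log above this reports the gfx ring with two jobs outstanding at the moment of the timeout. To scan a real journal, the same function can be fed the lines of saved `journalctl -k` output.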


JackyMuyi

Xinmudotmoe
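glmark2 runs without problems.
On my side, the only thing that crashes the system at the moment is running Windows games under Wine that use D3D or OpenGL rendering. Games like CS work fine once switched to software rendering.
Now I would like to find a game that runs natively on Linux on Loongson, without Wine, and test with that.
PS: I posted a thread about this earlier. The HD7870 can use either amdgpu or radeon. With the former, gaming under Wine crashes the system; with the latter, OBS cannot capture the screen under Wayland (black screen) and Vulkan is unavailable.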
https://bbs.loongarch.org/d/274-3a5000hd7870amdgpuopenglegl


Xinmudotmoe

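After upgrading archlinux to 6.5.0.arch1-4, the problem persists.

Testing shows it also crashes under Loongnix-20.5.loongarch64.

I updated the BIOS to 202302dbg; dbg versus rel appears to have little bearing on the crash.

Likewise, testing shows that updating or rolling back the VBIOS has little bearing on the crash.

Besides the RX550 (1002:699F), I also have an RX6800 (1002:73bf), but I converted it to water cooling, and taking it out would be a real ordeal.

I have ordered an RX6400 from Taobao and will test again when it arrives.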


Xinmudotmoe

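A message from a user who has suddenly become stable

I did the following, and then things inexplicably became relatively stable:

  1. Reapplied thermal paste to the CPU and the bridge chip
  2. Fitted the stock heatsink on the bridge chip (the previous one had been modified by the second-hand seller)
  3. Left the graphics card raised by about 1 mm when inserting it (the PCIe connector is roughly 1 mm short of fully seated; measured by hand, so not accurate at all)
  4. Unplugged the CPU and bridge chip fans and laid a single NL-A15 fan flat on top

The kernel and BIOS versions did not change. What kind of voodoo fix is this...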


JackyMuyi

Xinmudotmoe
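What a mysterious fix...

On my side, ever since I updated lat, games run under Wine show either wrong textures or stretched and torn textures, with both amdgpu and radeon... 😿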


Xinmudotmoe

Follow-up

After some tinkering, the RX6400 is now running together with the 3A5000.

05:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Upstream Port of PCI Express Switch (rev c7) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 46, NUMA node 0
	Memory at e0020400000 (32-bit, non-prefetchable) [size=16K]
	Bus: primary=05, secondary=06, subordinate=07, sec-latency=0
	I/O behind bridge: 4000-4fff [size=4K] [16-bit]
	Memory behind bridge: 20200000-203fffff [size=2M] [32-bit]
	Prefetchable memory behind bridge: e0180000000-e02ffffffff [size=6G] [32-bit]
	Capabilities: <access denied>
	Kernel driver in use: pcieport

06:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch (prog-if 00 [Normal decode])
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 XL Downstream Port of PCI Express Switch
	Flags: bus master, fast devsel, latency 0, IRQ 48, NUMA node 0
	Bus: primary=06, secondary=07, subordinate=07, sec-latency=0
	I/O behind bridge: 4000-4fff [size=4K] [16-bit]
	Memory behind bridge: 20200000-203fffff [size=2M] [32-bit]
	Prefetchable memory behind bridge: e0180000000-e02ffffffff [size=6G] [32-bit]
	Capabilities: <access denied>
	Kernel driver in use: pcieport

07:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 24 [Radeon RX 6400/6500 XT/6500M] (rev c7) (prog-if 00 [VGA controller])
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 24 [Radeon RX 6400/6500 XT/6500M]
	Flags: bus master, fast devsel, latency 0, IRQ 70, NUMA node 0
	Memory at e0200000000 (64-bit, prefetchable) [size=4G]
	Memory at e0180000000 (64-bit, prefetchable) [size=2M]
	I/O ports at 4000 [size=256]
	Memory at e0020200000 (32-bit, non-prefetchable) [size=1M]
	Expansion ROM at e0020300000 [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: amdgpu
	Kernel modules: amdgpu

07:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
	Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 21/23 HDMI/DP Audio Controller
	Flags: bus master, fast devsel, latency 0, IRQ 63, NUMA node 0
	Memory at e0020320000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

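From BIOS start until the SDDM login screen appears, the display is black with no signal, until the amdgpu driver finishes initializing.

During this time sshd is usable once it has loaded.

I hit the same instability here too (left the graphics card raised by 2 mm when inserting it), and then it was fine...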


Xinmudotmoe

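Voodoo follow-up

After some more testing I have roughly worked out the following pattern, and the machine now works relatively stably (more than 2 hours of continuous use).

  1. Graphics card fitted with a PCIe x4-to-x16 or PCIe x1-to-x16 adapter, running in the x16 slot: unstable
  2. Same adapter setup, with the card moved to the x8 slot: stable
  3. Same adapter setup as in 1, with the card moved to the x4 slot: stable
  4. Keeping setup 2, with the AX200 plugged into the x16 slot through a PCIe x1-to-M.2 E-key adapter: unstable
  5. Keeping setup 4, with the AX200 moved to the x4 slot: stable

Current wiring:

  1. PCIe x8 - PCIe x4-to-x16 adapter, RX6400 graphics card
  2. PCIe x16 - empty
  3. PCIe x4 - PCIe x1-to-M.2 E-key adapter, AX200 wireless card
  4. M.2 - NVMe SSD
  5. SATA - all empty

The kernel, BIOS, and VBIOS versions are all unchanged (because... no updates are available).

So... is it a problem with the x16 slot?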
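When moving a card between slots and adapters like this, it may be worth recording what link the card actually negotiated in each position. Linux exposes this per device in sysfs through `current_link_speed`, `current_link_width`, `max_link_speed`, and `max_link_width`. The Python sketch below reads those files; the helper names `read_link_status` and `is_downgraded` are made up for this illustration, and the address `0000:07:00.0` is the RX 6400 from the lspci output earlier in this thread, so it will differ on other machines.

```python
from pathlib import Path

LINK_ATTRS = (
    "current_link_speed",
    "current_link_width",
    "max_link_speed",
    "max_link_width",
)


def read_link_status(device_dir):
    """Return the PCIe link attributes present in a device's sysfs directory."""
    status = {}
    for attr in LINK_ATTRS:
        path = Path(device_dir) / attr
        if path.is_file():
            status[attr] = path.read_text().strip()
    return status


def is_downgraded(status):
    """True if the negotiated width is below the maximum the device reports."""
    try:
        return int(status["current_link_width"]) < int(status["max_link_width"])
    except (KeyError, ValueError):
        return False  # attributes missing or not numeric: nothing to compare


if __name__ == "__main__":
    # Assumption: device address taken from the lspci output in this thread.
    dev = Path("/sys/bus/pci/devices/0000:07:00.0")
    status = read_link_status(dev)
    print(status or f"no link attributes found under {dev}")
    if is_downgraded(status):
        print("link is running narrower than the device maximum")
```

With an x4-to-x16 or x1-to-x16 adapter in the path, a narrower negotiated width is expected and is not by itself a sign of a fault; the point is only to have the numbers next to each stable/unstable observation.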


TF-zhong

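The same problem occurs on a 3A6000 with the XA61200 motherboard.

Gentoo: kernels 6.6.8 and 6.7.0-rc7 both have the problem

Debian 12 has the problem

Loongnix 20.5 has the problem

The problem is very likely to trigger when using a browser

When using QEMU, moving the QEMU window is very likely to trigger it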


TF-zhong

Please help analyze this: is the graphics card at fault, the driver, or something else?


TF-zhong

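Continuing to use the machine, I found that a dummy display plug (EDID emulator) does not work properly on the 3A6000.

Running root # get-edid | edid-decode

does not return the detected resolutions; it just reports: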

1 potential busses found: 6

Bus 6 doesn't really have an EDID...

Couldn't find an accessible EDID on this computer.

In the boot log:

[    6.480591] EDID block 0 (tag 0x00) checksum is invalid, remainder is 125

[    6.480595] [00] BAD  00 ff ff ff ff ff ff 00 ff ff ff ff ff ff ff ff

[    6.480596] [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[    6.480597] [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[    6.480598] [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[    6.480599] [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[    6.480600] [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[    6.480601] [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[    6.480602] [00] BAD  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

[    6.480605] amdgpu 0000:07:00.0: [drm] *ERROR* EDID checksum invalid.

Is there any hope here? Is there a way to make the graphics card read the EDID from the dummy plug?
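For context on what the kernel is complaining about: an EDID block is 128 bytes, starts with the fixed header `00 ff ff ff ff ff ff 00`, and its last byte is a checksum chosen so that all 128 bytes sum to 0 modulo 256. The Python sketch below applies that rule to the block dumped above (the header followed by 0xff bytes). The names `expected_checksum` and `is_valid` are made up for this illustration. The expected checksum it computes is 125, which appears to be the "remainder is 125" reported in the log, while the byte actually read was 0xff.

```python
EDID_BLOCK_SIZE = 128
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def expected_checksum(block: bytes) -> int:
    """Checksum byte that makes all 128 bytes sum to 0 modulo 256."""
    if len(block) != EDID_BLOCK_SIZE:
        raise ValueError("an EDID block is exactly 128 bytes")
    return (-sum(block[:-1])) % 256


def is_valid(block: bytes) -> bool:
    """Check the fixed header and the checksum of one EDID block."""
    return block[:8] == EDID_HEADER and block[-1] == expected_checksum(block)


# Block 0 as dumped in the boot log above: header, then 120 bytes of 0xff.
LOGGED_BLOCK = EDID_HEADER + bytes([0xFF] * 120)

if __name__ == "__main__":
    print("expected checksum:", expected_checksum(LOGGED_BLOCK))
    print("stored checksum:  ", LOGGED_BLOCK[-1])
    print("valid:            ", is_valid(LOGGED_BLOCK))
```

A payload that is entirely 0xff usually suggests that nothing meaningful answered on the DDC lines rather than that the plug returned a slightly damaged EDID, though that is an inference from the dump and not something the log states.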


GSJY

TF-zhong

I saw this on the forum; see if it helps:

https://bbs.loongarch.org/d/327-amdgpu/4


TF-zhong

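After much suffering, I reluctantly decided to swap the graphics card. This time I am trying the cheapest option: a second-hand, pulled R5 240 with 1 GB of VRAM and a DisplayPort output.

It can drive 3840x2160@60Hz. This card is probably only just able to handle that resolution, but that is still better than a sudden GPU reset.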


xry111

During an offline discussion, an engineer said this is likely a hazard in HyperTransport (I'm not sure whether in the protocol itself or in the Loongson implementation). The faster the GPU, the more likely it is to trigger the issue. IIRC I've seen this several times on a 3A4000 + 7A1000 + RX550 ("only" PCIe 2.0 x8). There seems to be no well-proven way to work around it in firmware or the OS.

For the Linux radeon driver a workaround is https://github.com/chenhuacai/linux/commit/a1e31fe7e00ad569d145b2ac09546a2dda04ba65 (the commit message also describes the issue), but I don't think it's acceptable to the upstream maintainers and I've not tested it at all. It's not applicable to the amdgpu driver either (unless someone ports it).

(Or perhaps it should/might actually be implemented as a PCI quirk against the 7A1000/2000 device ID?)

The long-term solution is replacing HT with Loongson Coherent Link for future 3A/3B/3C/3D/7A chips.

(Sorry for typing in English, I don't have an input method installed on this machine.)


TF-zhong

xry111

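Thank you very much for the attention and the reply. The original poster and I have both worked around this problem by switching to other graphics card models. I hope Loongson's future products will not have this kind of problem.

After swapping to a different graphics card model, the 3A6000 runs stably.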


小洋葱

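Alternatively, is there a way to blacklist amdgpu and use pure software rendering? As far as I can see, many ARM dev boards run their desktop on llvmpipe without using the on-board GPU. I tried blacklisting the amdgpu module before, but Xorg would not start and complained that it could not find a GPU or monitor.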


EMCA

小洋葱 llvmpipe only renders the image; a display device is still needed to put it on screen. In this case I suggest trying fbdev or vesa.


小洋葱

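EMCA I also installed the fbdev and vesa packages on Arch, but they did not seem to help; Xorg and Wayland still would not start. I am not sure whether some configuration is still needed.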


EMCA

小洋葱 With vesa you need to write the configuration by hand, as in this example:

https://forums.linuxmint.com/viewtopic.php?t=266554



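Creative Commons License
Unless otherwise stated by the author, articles on this site are licensed under CC BY-NC-SA 4.0.
Reposts and derivative works must be shared under the same license; commercial use is strictly prohibited.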