kernel BUG at kernel/sched/deadline ? #233

MagnaboscoL · 2020-05-21T19:37:24Z

Hi,

I am just starting to test a "custom distro" I have made using this kernel and Yocto.
I am facing a "kernel BUG" I cannot understand.

Using:

kernel 5.4.20-ti-rt-r6 1fe65cd (or 5.4.38-ti-rt-r8 487bc1d)
defconfig from ti-linux-rt-5.4.y/patches/defconfig. The only modifications I made were unsetting both CONFIG_MODULE_COMPRESS and CONFIG_MODULE_COMPRESS_XZ
kernel module omap5-sgx-ddk for GPU support from https://git.ti.com
other packages from meta-ti yocto layer (tag: ti2020.00)
BB-BONE-LCD5-01-00A1.dts

Hardware:

BeagleBone Green
a resistive LCD Seeed-Studio-BeagleBone-Green-LCD-Cape

When I start Weston touch screen calibration I get:

[   63.286156] 000: kernel BUG at kernel/sched/deadline.c:1495!
[   63.291842] 000: Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP ARM
[   63.298400] 000: Modules linked in:
[   63.301902] 000:  pvrsrvkm(O)
[   63.304877] 000:  joydev
[   63.307415] 000:  evdev
[   63.309866] 000:
[   63.311798] 000: CPU: 0 PID: 644 Comm: weston Tainted: G           O      5.4.20 #1
[   63.319491] 000: Hardware name: Generic AM33XX (Flattened Device Tree)
[   63.326044] 000: PC is at enqueue_task_dl+0x90/0xd8c
[   63.331054] 000: LR is at rt_mutex_setprio+0x358/0x568
[   63.336223] 000: pc : [<c018c014>]    lr : [<c01754d4>]    psr: 20000093
[   63.342950] 000: sp : da4afa58  ip : da4afb20  fp : da4afb1c
[   63.348630] 000: r10: df926140  r9 : dc32c420  r8 : 00000001
...
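
For anyone triaging a similar log, here is a minimal sketch (not part of the original report) that pulls the BUG location and the PC/LR symbols out of oops lines like the ones above. The helper name `summarize_oops` and the regular expressions are illustrative assumptions and only cover the ARM-style lines shown in this excerpt.

```python
import re

# Hypothetical patterns, written only against the line shapes quoted above.
BUG_RE = re.compile(r"kernel BUG at (?P<file>[^:\s]+):(?P<line>\d+)!")
SYM_RE = re.compile(
    r"(?P<reg>PC|LR) is at (?P<sym>\w+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def summarize_oops(log: str) -> dict:
    """Return the BUG source location and the PC/LR symbol+offset, if present."""
    summary = {}
    for line in log.splitlines():
        bug = BUG_RE.search(line)
        if bug:
            summary["bug"] = (bug.group("file"), int(bug.group("line")))
        sym = SYM_RE.search(line)
        if sym:
            # Key is "pc" or "lr"; value is (symbol name, offset into symbol).
            summary[sym.group("reg").lower()] = (
                sym.group("sym"),
                int(sym.group("off"), 16),
            )
    return summary


# Three lines copied from the excerpt above.
EXCERPT = """\
[   63.286156] 000: kernel BUG at kernel/sched/deadline.c:1495!
[   63.326044] 000: PC is at enqueue_task_dl+0x90/0xd8c
[   63.331054] 000: LR is at rt_mutex_setprio+0x358/0x568
"""

if __name__ == "__main__":
    print(summarize_oops(EXCERPT))
```

The resulting file and line number can then be looked up in the matching kernel source tree to see which check in `enqueue_task_dl` fired.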

Please find the entire log attached:
bug_log.txt

Note: the attached log was captured when starting the touch screen calibration, but I got the very same bug with an image without Weston, at the end of ts_calibrate from tslib.

Considerations

If I use the kernel linux-ti-staging-rt-5.4 from TI, the issue seems to disappear. Therefore I suppose that I am not configuring this kernel properly.
I suppose that the issue is independent from the applications I was running and may be related to some wrong compilation settings (which probably affected the module pvrsrvkm.ko), but for the moment I am quite clueless.
I am not sure this is worth mentioning, but since DRM seems to be involved, the details of the libdrm I am using are:
- https://dri.freedesktop.org/libdrm/libdrm-2.4.99.tar.bz2 with patch musl-ioctl.patch
- I tried with and without the patches implemented in meta-arago (see TI patches), with the same results

Any hint would be much appreciated.

Regards
Luca


pdp7 · 2020-06-10T20:12:58Z

@jadonk @RobertCNelson do you have the Seeed LCD cape? Maybe we could ask one of the Seeed engineers to check?

RobertCNelson · 2020-06-10T20:28:00Z

This needs to be retested; early 5.4 versions were not being tested.

pdp7 · 2020-06-10T20:35:43Z

@Pillar1989 are you able to test the Seeed BeagleBone Green LCD Cape mentioned in this issue?

Pillar1989 · 2020-06-11T00:12:30Z

@pdp7 Not yet. We tested it on version 4.19.

MagnaboscoL · 2020-06-12T07:36:12Z

Hi @pdp7,

I have a Yocto setup that seems to work* with this LCD Cape.
At the moment I am using TI kernel and I could switch kernel for testing.
Please let me know if you think I could do something to speed up the testing.

*Just to give complete info: it seems to work, but I am still facing a color issue when using the GPU for Qt apps, as described in e2e.ti.com. Anyway, that is probably unrelated to the tests needed for this issue.

Regards

commit 129fe718819cc5e24ea2f489db9ccd4371f0c6f6 upstream.

When trying to mmap a trace instance buffer that is attached to reserve_mem, it would crash:

BUG: unable to handle page fault for address: ffffe97bd00025c8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 2862f3067 P4D 2862f3067 PUD 0
Oops: Oops: 0000 [#1] PREEMPT_RT SMP PTI
CPU: 4 UID: 0 PID: 981 Comm: mmap-rb Not tainted 6.14.0-rc2-test-00003-g7f1a5e3fbf9e-dirty #233
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:validate_page_before_insert+0x5/0xb0
Code: e2 01 89 d0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <48> 8b 46 08 a8 01 75 67 66 90 48 89 f0 8b 50 34 85 d2 74 76 48 89
RSP: 0018:ffffb148c2f3f968 EFLAGS: 00010246
RAX: ffff9fa5d3322000 RBX: ffff9fa5ccff9c08 RCX: 00000000b879ed29
RDX: ffffe97bd00025c0 RSI: ffffe97bd00025c0 RDI: ffff9fa5ccff9c08
RBP: ffffb148c2f3f9f0 R08: 0000000000000004 R09: 0000000000000004
R10: 0000000000000000 R11: 0000000000000200 R12: 0000000000000000
R13: 00007f16a18d5000 R14: ffff9fa5c48db6a8 R15: 0000000000000000
FS: 00007f16a1b54740(0000) GS:ffff9fa73df00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffe97bd00025c8 CR3: 00000001048c6006 CR4: 0000000000172ef0
Call Trace:
<TASK>
? __die_body.cold+0x19/0x1f
? __die+0x2e/0x40
? page_fault_oops+0x157/0x2b0
? search_module_extables+0x53/0x80
? validate_page_before_insert+0x5/0xb0
? kernelmode_fixup_or_oops.isra.0+0x5f/0x70
? __bad_area_nosemaphore+0x16e/0x1b0
? bad_area_nosemaphore+0x16/0x20
? do_kern_addr_fault+0x77/0x90
? exc_page_fault+0x22b/0x230
? asm_exc_page_fault+0x2b/0x30
? validate_page_before_insert+0x5/0xb0
? vm_insert_pages+0x151/0x400
__rb_map_vma+0x21f/0x3f0
ring_buffer_map+0x21b/0x2f0
tracing_buffers_mmap+0x70/0xd0
__mmap_region+0x6f0/0xbd0
mmap_region+0x7f/0x130
do_mmap+0x475/0x610
vm_mmap_pgoff+0xf2/0x1d0
ksys_mmap_pgoff+0x166/0x200
__x64_sys_mmap+0x37/0x50
x64_sys_call+0x1670/0x1d70
do_syscall_64+0xbb/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f

The reason was that the code that maps the ring buffer pages to user space has:

page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]);

And uses that in:

vm_insert_pages(vma, vma->vm_start, pages, &nr_pages);

But virt_to_page() does not work with vmap()'d memory which is what the persistent ring buffer has. It is rather trivial to allow this, but for now just disable mmap() of instances that have their ring buffer from the reserve_mem option.

If an mmap() is performed on a persistent buffer it will return -ENODEV just like it would if the .mmap field wasn't defined in the file_operations structure.

Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Vincent Donnefort <[email protected]>
Link: https://lore.kernel.org/[email protected]
Fixes: 9b7bdf6 ("tracing: Have trace_printk not use binary prints if boot buffer")
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

pdp7 added the requires testing label Jun 10, 2020

pdp7 added the green BeagleBone Green label Jun 10, 2020
