● Now I have the full picture. Here's the comparison:

KGSL vs Our Implementation

KGSL Synchronization (Legacy webOS kernel)

1. Timestamp-based deferred unmap:
|
```c
// kgsl_cmdstream.c
void kgsl_cmdstream_memqueue_drain(struct kgsl_device *device)
{
    ts_processed = device->ftbl.device_cmdstream_readtimestamp(
            device, KGSL_TIMESTAMP_RETIRED);

    list_for_each_entry_safe(entry, entry_tmp, &device->memqueue, list) {
        // Only unmap when GPU has retired past this timestamp
        if (!timestamp_cmp(ts_processed, entry->free_timestamp))
            break;
        kgsl_destroy_mem_entry(entry);  // calls kgsl_mmu_unmap
    }
}
```
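The deferred-free pattern above can be sketched in plain userspace C. Everything here (`ts_after_eq`, `memqueue_drain`, the singly linked list layout) is an illustrative stand-in for the kernel code, assuming only that timestamps are 32-bit, compared wraparound-safely, and that the queue is kept in submission order:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for KGSL's timestamp_cmp(): nonzero when the
 * retired timestamp has reached ts, tolerant of 32-bit wraparound via
 * the signed-difference trick. */
static int ts_after_eq(uint32_t retired, uint32_t ts)
{
    return (int32_t)(retired - ts) >= 0;
}

struct mem_entry {
    uint32_t free_timestamp;    /* free once the GPU retires past this */
    struct mem_entry *next;
};

/* Drain the queue: free only entries whose timestamp has retired,
 * mirroring kgsl_cmdstream_memqueue_drain(). Because the queue is in
 * submission order, we can stop at the first unretired entry. */
static void memqueue_drain(struct mem_entry **head, uint32_t retired)
{
    while (*head && ts_after_eq(retired, (*head)->free_timestamp)) {
        struct mem_entry *e = *head;
        *head = e->next;
        free(e);                /* the kernel would unmap the PTEs here */
    }
}
```

The signed-difference comparison is the standard way to order sequence numbers that wrap, which is what lets a long-lived timestamp counter keep working past 0xffffffff.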
|
|
|
2. kgsl_yamato_idle() for GPU idle:

```c
int kgsl_yamato_idle(struct kgsl_device *device, unsigned int timeout)
{
    // Step 1: Wait for ringbuffer drain (rptr == wptr)
    do {
        GSL_RB_GET_READPTR(rb, &rb->rptr);
    } while (rb->rptr != rb->wptr);

    // Step 2: Wait for RBBM_STATUS == 0x110 (idle)
    while (time_before(jiffies, wait_time)) {
        kgsl_yamato_regread(device, REG_RBBM_STATUS, &rbbm_status);
        if (rbbm_status == 0x110)
            return 0;
    }
}
```
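The idle check above combines an unbounded spin on the ring pointers with a time-bounded poll of RBBM_STATUS. The bounded-poll half can be sketched in userspace C; `poll_status`, the attempt budget (standing in for the jiffies deadline), and the fake status register are all hypothetical:

```c
#include <stdint.h>

/* Hypothetical userspace sketch of the bounded register poll in
 * kgsl_yamato_idle(): read a status source until it matches the
 * expected value or the attempt budget runs out.
 * Returns 0 on success, -1 on timeout. */
static int poll_status(uint32_t (*read_status)(void),
                       uint32_t expected, unsigned max_attempts)
{
    for (unsigned i = 0; i < max_attempts; i++) {
        if (read_status() == expected)
            return 0;
    }
    return -1;
}

/* Fake "RBBM_STATUS" that reads busy twice, then idle (0x110). */
static unsigned fake_reads;
static uint32_t fake_status(void)
{
    return ++fake_reads >= 3 ? 0x110u : 0x5deu;
}
```

The important property, shared with the kernel loop, is that the poll can fail: a hung GPU makes the status never match, and the deadline turns that into a timeout instead of a lockup.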
|
|
|
3. Uses GSL_PT_PAGE_DIRTY (0x00000004) instead of 0:

```c
// kgsl_mmu.c line 1008
kgsl_pt_map_set(pagetable, pte, GSL_PT_PAGE_DIRTY);  // 0x00000004, not 0!
```
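A sketch of why 0x00000004 still faults: it leaves the PTE's validity bits clear. The RV/WV bit positions below are assumptions for illustration, not taken from the KGSL headers; only the 0x00000004 value comes from the code above:

```c
#include <stdint.h>

#define GSL_PT_PAGE_DIRTY 0x00000004u   /* value quoted from kgsl_mmu.c */
/* Assumed-for-illustration validity bits; the real header may differ. */
#define PT_PAGE_RV        0x00000001u   /* read-valid  */
#define PT_PAGE_WV        0x00000002u   /* write-valid */

/* A GPU access faults when neither validity bit is set in the PTE. */
static int pte_faults(uint32_t pte)
{
    return (pte & (PT_PAGE_RV | PT_PAGE_WV)) == 0;
}
```

Under these assumptions, both 0 and GSL_PT_PAGE_DIRTY fault, but a fault handler that reads back the PTE can tell a torn-down mapping (nonzero) from an address that was never mapped (zero).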
|
|
|
Our Implementation

✅ a2xx_gpummu_unmap() - matches KGSL
- Waits for ringbuffer drain (rptr == wptr)
- Waits for RBBM_STATUS == 0x010 or 0x110
- Uses dsb/mb barriers

✅ msm_gem_vma_unmap() - new fence wait (just added)
- Waits for DMA fences via dma_resv_wait_timeout()

❌ Difference: we set unmapped PTEs to 0, KGSL sets them to 0x00000004
|
|
|
The GSL_PT_PAGE_DIRTY value (0x00000004) is interesting: it is still an invalid PTE, but it is nonzero, so a fault on a torn-down mapping stays distinguishable from a fault on a never-mapped page. That might make use-after-unmap issues easier to diagnose.

Our fix should work equivalently to KGSL because:
1. We wait for GPU idle in gpummu_unmap (same as KGSL's kgsl_yamato_idle())
2. We now also wait for DMA fences before unmapping (equivalent to KGSL's retired-timestamp check)

The page faults you're seeing are most likely from Mesa freeing buffers before all of their fences have signaled; the new fence wait should address exactly that.
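The two-step ordering that makes the fix equivalent can be modeled in a few lines of C. `safe_unmap`, `wait_fences`, and `wait_gpu_idle` are hypothetical stand-ins for msm_gem_vma_unmap's fence wait and a2xx_gpummu_unmap's idle wait, not the real driver functions:

```c
#include <stdbool.h>

/* Hypothetical model of the teardown ordering described above:
 * fences must signal (KGSL: retired timestamp) and the GPU must be
 * idle (KGSL: kgsl_yamato_idle) before the PTEs are torn down. */
struct buf {
    int  pending_fences;
    bool mapped;
};

static bool gpu_idle;

static void wait_fences(struct buf *b) { b->pending_fences = 0; }
static void wait_gpu_idle(void)        { gpu_idle = true; }

static int safe_unmap(struct buf *b)
{
    if (b->pending_fences || !gpu_idle)
        return -1;              /* unmapping now would risk a GPU fault */
    b->mapped = false;          /* safe to clear the PTEs */
    return 0;
}
```

The model makes the failure mode concrete: skipping either wait lets `safe_unmap` run while the GPU may still touch the buffer, which is exactly the use-after-unmap window behind the observed page faults.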
|