● Now I have the full picture. Here's the comparison:

## KGSL vs Our Implementation

### KGSL synchronization (legacy webOS kernel)

**1. Timestamp-based deferred unmap:**

```c
/* kgsl_cmdstream.c */
void kgsl_cmdstream_memqueue_drain(struct kgsl_device *device)
{
	ts_processed = device->ftbl.device_cmdstream_readtimestamp(
				device, KGSL_TIMESTAMP_RETIRED);
	list_for_each_entry_safe(entry, entry_tmp, &device->memqueue, list) {
		/* only unmap once the GPU has retired past this timestamp */
		if (!timestamp_cmp(ts_processed, entry->free_timestamp))
			break;
		kgsl_destroy_mem_entry(entry);	/* calls kgsl_mmu_unmap() */
	}
}
```

**2. `kgsl_yamato_idle()` for GPU idle:**

```c
int kgsl_yamato_idle(struct kgsl_device *device, unsigned int timeout)
{
	/* step 1: wait for the ringbuffer to drain (rptr == wptr) */
	do {
		GSL_RB_GET_READPTR(rb, &rb->rptr);
	} while (rb->rptr != rb->wptr);

	/* step 2: wait for RBBM_STATUS to read back idle (0x110) */
	while (time_before(jiffies, wait_time)) {
		kgsl_yamato_regread(device, REG_RBBM_STATUS, &rbbm_status);
		if (rbbm_status == 0x110)
			return 0;
	}
}
```

**3. Writes `GSL_PT_PAGE_DIRTY` (0x00000004) into unmapped PTEs instead of 0:**

```c
/* kgsl_mmu.c line 1008 */
kgsl_pt_map_set(pagetable, pte, GSL_PT_PAGE_DIRTY);	/* 0x00000004, not 0! */
```

### Our implementation

✅ `a2xx_gpummu_unmap()` matches KGSL:
- Waits for ringbuffer drain (rptr == wptr)
- Waits for RBBM_STATUS == 0x010 or 0x110
- Uses `dsb`/`mb` barriers

✅ `msm_gem_vma_unmap()` now has a fence wait (just added):
- Waits for DMA fences via `dma_resv_wait_timeout()`

❌ Difference: we set unmapped PTEs to 0, KGSL sets them to `GSL_PT_PAGE_DIRTY` (0x00000004). That value is interesting: it is still an invalid PTE, but non-zero, which could make use-after-unmap faults easier to distinguish from accesses to pages that were never mapped.

Our fix should behave equivalently to KGSL because:
1. We wait for GPU idle in `gpummu_unmap()` (same as KGSL).
2. We now also wait for DMA fences before unmapping (the equivalent of KGSL's retired-timestamp check).

The page faults you're seeing are most likely Mesa freeing buffers before all of their fences have signaled, which is exactly the window the new fence wait closes.