Even if the event code specifies a dummy interrupt it might still
get wrapped with the conditional check. Make sure that all forced
interrupts from the event code get fired.
Change-Id: Ic0dedbadc4599efcbca65ad93cc27943bad2ca79
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Print more IOMMU registers when an IOMMU pagefault happens which
report whether the pagefault is a read or write fault.
Change-Id: I27acafa9dcfd0c7de9056151ed1baef7dd2346df
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Lock the ringbuffer virtual address translation in the IOMMU TLB. This
is required to prevent a hardware table walk from happening when the
GPU executes commands that program the IOMMU registers. If a hardware
table walk happens while the GPU is programming an IOMMU register, it
can cause a deadlock on the bus.
Change-Id: I99428be7879a5210ba816f5f96864f94714649d7
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Update all the required timestamps when submitting commands to the
GPU ringbuffer for execution. The global timestamp was not being
updated in all the required cases. Now both the global timestamp and the
per-context timestamp are updated if the following conditions are met:
1. A valid non-global context is passed
2. The context passed supports per-context timestamps
3. The command submission is not from kgsl
If any of the three conditions above is not met, only the global
timestamp is updated.
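The decision above can be sketched in plain C. This is a minimal
userspace illustration, not the driver's code: the struct layouts, the
sketch_* names and the use of a NULL context to stand for the global
context are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's device and context state. */
struct sketch_device {
	unsigned int global_ts;
};

struct sketch_context {
	bool per_context_ts;	/* condition 2: context has its own timestamps */
	unsigned int ts;
};

/*
 * Advance the timestamps for one submission. ctx == NULL models the
 * global context (condition 1); from_kgsl models an internal
 * submission (condition 3).
 */
void sketch_update_timestamps(struct sketch_device *dev,
			      struct sketch_context *ctx,
			      bool from_kgsl)
{
	/* The global timestamp is updated in every case. */
	dev->global_ts++;

	/* The per-context timestamp needs all three conditions. */
	if (ctx != NULL && ctx->per_context_ts && !from_kgsl)
		ctx->ts++;
}
```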
Change-Id: Ib4fe6fbd2ac57fbc5306377f09072f9e4c09d442
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
A kgsl user can free a GPU memory allocation with the
kgsl_ioctl_sharedmem_free() function while the GPU address
is still in use, resulting in a GPU page fault.
A bug of this kind is hard to find because the driver keeps
no trace of kgsl_ioctl_sharedmem_free() calls.
This change implements a "memfree history" debug feature that
exposes information about the last couple of hundred memory
free calls in the kernel/debug/kgsl/kgsl-3d0/memfree_history
file.
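One common way to keep the last N records is a fixed-size ring buffer.
The sketch below shows that idea in userspace C; it does not claim to
match the driver's data layout. The record fields (gpuaddr, size, pid),
the tiny history length and all sketch_* names are assumptions.

```c
#include <stddef.h>

/* Kept small for illustration; the feature keeps a couple of hundred. */
#define SKETCH_HISTORY_LEN 4

struct sketch_free_record {
	unsigned int gpuaddr;
	unsigned int size;
	int pid;
};

struct sketch_memfree_history {
	struct sketch_free_record rec[SKETCH_HISTORY_LEN];
	size_t next;	/* slot the next record is written to */
	size_t count;	/* valid records, capped at SKETCH_HISTORY_LEN */
};

/* Record one free call, overwriting the oldest entry when full. */
void sketch_history_add(struct sketch_memfree_history *h,
			unsigned int gpuaddr, unsigned int size, int pid)
{
	struct sketch_free_record *r = &h->rec[h->next];

	r->gpuaddr = gpuaddr;
	r->size = size;
	r->pid = pid;
	h->next = (h->next + 1) % SKETCH_HISTORY_LEN;
	if (h->count < SKETCH_HISTORY_LEN)
		h->count++;
}

/* Return the i-th most recent record (0 = newest), or NULL. */
const struct sketch_free_record *
sketch_history_get(const struct sketch_memfree_history *h, size_t i)
{
	if (i >= h->count)
		return NULL;
	return &h->rec[(h->next + SKETCH_HISTORY_LEN - 1 - i) %
		       SKETCH_HISTORY_LEN];
}
```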
Change-Id: I12e0e3702db56d99d5de644739b364dea4cc37b1
Signed-off-by: Vladimir Razgulin <vrazguli@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Add a type parameter to event callbacks to give tasks a way to
communicate error conditions to the callback. Current type values
are KGSL_EVENT_TIMESTAMP_RETIRED for normal timestamp expiry and
KGSL_EVENT_CANCELLED if the event is cancelled prematurely.
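A callback that takes the new type parameter might look roughly like
the following. Only the two type names come from the text above; the
numeric values, the callback signature and the sketch_* names are
assumptions made for illustration.

```c
/* Type names from the description; the values are assumed. */
enum sketch_event_type {
	KGSL_EVENT_TIMESTAMP_RETIRED = 1,
	KGSL_EVENT_CANCELLED = 2,
};

/* Hypothetical callback signature carrying the event type. */
typedef void (*sketch_event_cb)(void *priv, unsigned int timestamp,
				enum sketch_event_type type);

struct sketch_result {
	int retired;
	int cancelled;
};

/* Example callback: treat a cancelled event as an error path. */
void sketch_on_event(void *priv, unsigned int timestamp,
		     enum sketch_event_type type)
{
	struct sketch_result *res = priv;

	(void)timestamp;	/* unused in this sketch */

	if (type == KGSL_EVENT_CANCELLED) {
		res->cancelled++;
		return;
	}
	res->retired++;
}
```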
Change-Id: Ic0dedbad9907cd50642a604a3af562e01a4b4a3b
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Store the timestamps inside the context rather than in a
list stored in the ringbuffer. This allows for easier
maintainability as well as keeping all context data
centralized.
Change-Id: I0467d07be6c8bb9f062a81a40629c0288be7e868
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Get rid of kgsl_find_context. Instead use kgsl_context_get, which does
correct RCU read locking around the idr_find and increases the
reference count on the context before returning it. This eliminates
the chance that a context will be destroyed while somebody is still
using it. Of course increased use of kgsl_context_get is accompanied
by kgsl_context_put in all the right places.
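The get/put pairing can be illustrated with a small userspace sketch.
It leaves out RCU and atomics entirely and uses a plain array instead
of an idr, so it only shows the reference counting discipline; every
sketch_* name and field here is an assumption, not the driver's code.

```c
#include <stddef.h>

#define SKETCH_MAX_CONTEXTS 8

struct sketch_context {
	unsigned int id;
	int refcount;	/* the driver would use an atomic count */
	int destroyed;	/* set when the last reference is dropped */
};

struct sketch_context *sketch_table[SKETCH_MAX_CONTEXTS];

/* Look up a context by id and take a reference before returning it. */
struct sketch_context *sketch_context_get(unsigned int id)
{
	struct sketch_context *ctx;

	if (id >= SKETCH_MAX_CONTEXTS)
		return NULL;
	ctx = sketch_table[id];
	if (ctx == NULL || ctx->refcount == 0)
		return NULL;
	ctx->refcount++;
	return ctx;
}

/* Drop a reference; the context goes away only on the last put. */
void sketch_context_put(struct sketch_context *ctx)
{
	if (ctx != NULL && --ctx->refcount == 0) {
		sketch_table[ctx->id] = NULL;
		ctx->destroyed = 1;
	}
}
```

In this sketch a caller that holds a reference from sketch_context_get
keeps the context alive even after the owner drops its own reference.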
Change-Id: Ic0dedbad73d497fd9b451aefad8e5b28d33b829d
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Allow the user space to pass in a type field to indicate the
type of upper level library that owns the context. The type
field is added to all the appropriate ftrace output for easier
debugging.
Change-Id: Ic0dedbadd42fc5ccfffd89738affd4794a6ab85e
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Normally kgsl_create_context() returns a pointer to a new context.
On error it just returns a NULL pointer, which makes it impossible
to propagate the right error code back to the user via
kgsl_ioctl_drawctxt_create().
This fix modifies kgsl_create_context() to use the ERR_PTR macro to
pass error codes to its caller.
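The ERR_PTR convention encodes a small negative errno value in the
pointer itself. The sketch below re-implements that idea in userspace
so it can be compiled on its own; the sketch_* helpers mimic the
kernel's ERR_PTR/IS_ERR/PTR_ERR, and the two functions, their error
codes and the have_free_id parameter are purely illustrative.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR helpers. */
void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct sketch_context {
	unsigned int id;
};

/* Hypothetical creator: reports why it failed instead of NULL. */
struct sketch_context *sketch_create_context(int have_free_id)
{
	struct sketch_context *ctx;

	if (!have_free_id)
		return sketch_err_ptr(-ENOSPC);
	ctx = calloc(1, sizeof(*ctx));
	if (ctx == NULL)
		return sketch_err_ptr(-ENOMEM);
	return ctx;
}

/* Hypothetical caller: passes the specific error code upward. */
long sketch_ioctl_drawctxt_create(int have_free_id)
{
	struct sketch_context *ctx = sketch_create_context(have_free_id);

	if (sketch_is_err(ctx))
		return sketch_ptr_err(ctx);
	free(ctx);
	return 0;
}
```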
Change-Id: I447c1765828912b0994bcae67a73864e62eef9b6
Signed-off-by: Vladimir Razgulin <vrazguli@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Before we started using tracepoints this log message was used for
debugging the critical loop. Now that we are smarter remove the
unneeded log message.
Change-Id: Ic0dedbad57d97294cf2cb3865cf25f698aaea778
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Keep track of the number of times the GPU power rails have been
cycled and provide said number to the user in the device sysfs
directory.
Change-Id: Ic0dedbad4489524a98e52b9993131257068ae2ef
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Make sure that the context is freed only after the event is deleted
from the list to avoid accessing invalid list pointers. This makes
sure that the context memory is not accessed after being freed.
Change-Id: I9196219f1472396c0063dbeaa5351f66e6e5648f
Signed-off-by: Lynus Vaz <lvaz@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Add tracepoints for tracking the lifespan of GPU events.
Change-Id: Ic0dedbadf99b1efcb650608c6ac664d3c4ddafd5
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
It is interesting to know when and how we program the hardware so
add a tracepoint to monitor register writes. Register reads are
not traced since they are done much more frequently and are somewhat
less interesting from a debug perspective.
Change-Id: Ic0dedbad8ee85f154b9ba8731fe2c14603351243
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
In the GPU interrupt handler we attempt to clear the ts_cmp_enable
for the active context so that future interrupts are skipped until
someone needs one again. If for some reason the interrupt handler
is delayed then there is a possibility that the "current" context in
the GPU isn't the one that fired the interrupt. In that case we
could be accidentally clearing a ts_cmp_enable for a context that
needs it. Instead of clearing in the interrupt handler clear it
from the GPU so we can be sure we got the right context.
As a bonus pushing this logic to the GPU side lets us get rid of
some extra register reads/writes in the interrupt handlers.
Change-Id: Ic0dedbadbf350f7c4866092fa0686f9b42f3cd33
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Sometimes the core will go idle before the interrupt can be handled on
the GPU. If that happens then we could go to a lower power state before
cleaning up the pending interrupt and various entities that might be
waiting for it. Consider the current interrupt status when checking
for idle.
CRs-fixed: 449813
Change-Id: Ic0dedbadfd2d40e4411cf3b05e1eb4c4eecf7841
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Indirect buffers to be executed will not be (and cannot be) zero length.
Check and reject.
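A check of this kind could be as simple as the following sketch. The
descriptor layout, the sketch_* names and the choice of -EINVAL as the
rejection code are assumptions for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical indirect buffer descriptor. */
struct sketch_ibdesc {
	unsigned int gpuaddr;
	unsigned int sizedwords;
};

/* Reject the whole submission if any IB has zero length. */
int sketch_validate_ibs(const struct sketch_ibdesc *ibs, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (ibs[i].sizedwords == 0)
			return -EINVAL;
	}
	return 0;
}
```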
Change-Id: Ic0dedbadea429c4f7bd386c5e64603b51ea8af61
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
adreno_readtimestamp() would return a register value when asked to
get the KGSL_TIMESTAMP_CONSUMED value. With per-context timestamps
this no longer holds since the value in the register could
correspond to a different context's timestamp. Return the start of
packet timestamp for the specified context now that the start of
timestamp values are correctly submitted.
CRs-fixed: 382888
Change-Id: Iad56d07eb4a2973a5906d989c96dd117ff902cea
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
HW workaround to resolve an MMU pagefault caused by memory
being freed before the GPU is done with it.
This is a race condition where the timestamp is updated
and the event handler runs before the GPU has actually finished
with the timestamp, so the buffer is freed while the GPU is
still using it.
Change-Id: I6d338d7dca680f57f2c39ed5a14381b261bf8bce
Signed-off-by: Anshuman Dani <adani@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Add an identifier to label internal commands being sent to the
ringbuffer. This will be used to further enable debugging
and hang recovery.
Change-Id: I4b3c10f0c3d290ae092182cbb450abe65e9dda80
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Knowing the retired timestamp on wake-up is useful for debugging hangs
caused by clock instability. When we request clock ON, the clock may or
may not be turned ON, but we go ahead assuming it is ON. This can be
caught by printing the timestamp on wake. The timestamp we print can be
compared against the timestamp in the postmortem dump (assuming a
hang happens).
If there is a difference, we can say that the GPU has run for some time
after wake, so we needn't suspect the clocks.
If there is no difference, we can say that the GPU hung immediately
after wake, so we can reasonably suspect the clocks.
Change-Id: Iedf49977adbb0e61a9b71bbe23acc81f2a75162d
Signed-off-by: Ranjhith Kalisamy <ranjhith@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
The instruction store on A3XX GPU cores isn't really an instruction store
as on A2XX and it doesn't give us any debugging value. On top of all
that it can be unstable to read in some situations. Since it is of
questionable value and questionable stability pull it from the snapshot
for all A3XX cores and move the functionality to the A2XX specific code.
Change-Id: Ic0dedbada32d8b6b9b93768d148c2624faf3ef30
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
The incorrect paren placement can lead to a fence timeout
because the last_timestamp is always written. As a result
the last_timestamp could be corrupted by writing an old
value leading to a future timestamp_cmp indicating that a
signaled sync pt was unsignaled.
Change-Id: I4650a467bfdd98a905f86fbec7d833596c67ba19
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
A misconfigured board file could cause an out-of-bounds
array access through this value.
Change-Id: I64bbad0c096a2efe65376991537d810141370d72
CRs-Fixed: 439230
Signed-off-by: Lucille Sylvester <lsylvest@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Remove printing of "invalid context_id" messages and
replace those with simply returning with -EINVAL. The
messages serve little purpose and their removal alleviates
kernel log spamming.
CRs-fixed: 449050
Change-Id: Id0d6538d228e7dc928a628bdffde8c952ec5d6fe
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Make sure cache operations don't hit pagefaults by
backing the entire vma in mmap() instead of faulting
in pages as they're touched. Otherwise, there's a
chance that a later cache operation on the mapping
could trigger an unhandled page fault leading to
a kernel panic.
Change-Id: Ia73c8aaed2708c5b9ef46ed50fb0f5cf1ad2450c
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
KGSL_GPUMEM_ALLOC_ID now takes a flag,
KGSL_MEMFLAGS_USE_CPU_MAP. When set, the GPU
mapping will be set up to match the CPU mapping
during mmap(). This feature is only supported when
using per process pagetables with the IOMMU. The
flags field of KGSL_GPUMEM_ALLOC_ID is copied back
to userspace and KGSL_MEMFLAGS_USE_CPU_MAP will
be cleared when this feature is not supported.
The IOMMU virtual address space has been adjusted
when per-process pagetables are enabled so that the
entire userspace address range (0 to TASK_SIZE) can
have equivalent mappings on the IOMMU. For buffers
that do not have equivalent mappings, the address
range from PAGE_OFFSET to KGSL_IOMMU_GLOBAL_MEM_BASE
is used.
Change-Id: Ib61c03aa7453c3dd901c41e8fd297f66d402ae1a
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
The user allocating memory can provide hints for the caching
settings when said buffer is mapped on the user side. Try
to obey the cache settings when applicable. The user will
be responsible for handling cache management both to and
from the GPU so add a new ioctl IOCTL_KGSL_GPUMEM_CACHE_SYNC to
support both directions (the old IOCTL_KGSL_SHAREDMEM_FLUSH_CACHE
only handled flushing, not invalidating). The legacy ioctl still
works; it just does what it says it does: a cache flush.
Change-Id: Ic0dedbad55ce82f2b01ebc56de30d4649e2e7311
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Previously, the gpu address has been used to uniquely
identify each memory allocation. Upcoming patches will
introduce cases where an allocation does not always
have a gpu address, so an additional id is needed.
IOCTL_KGSL_GPUMEM_ALLOC_ID allocates pages and returns
an id.
IOCTL_KGSL_GPUMEM_FREE_ID frees an id. KGSL_SHAREDMEM_FREE
can still be used to free by GPU address, if it exists.
The id can also be passed to mmap(), shifted left by
PAGE_SHIFT, to get a CPU mapping for the buffer.
IOCTL_KGSL_GPUMEM_ALLOC_GET_INFO can be called to retrieve
the id and other information about the buffer.
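From userspace, turning an id into an mmap() offset could look like
the sketch below. It assumes the shift amount is the page shift (log2
of the page size), which keeps the offset page aligned as mmap()
requires; the sketch_* helper names are hypothetical and no real kgsl
ioctl is called here.

```c
#include <sys/types.h>
#include <unistd.h>

/* Compute log2 of the system page size. */
unsigned int sketch_page_shift(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned int shift = 0;

	while ((1L << shift) < page)
		shift++;
	return shift;
}

/* Build the offset that would be handed to mmap() for this id. */
off_t sketch_id_to_mmap_offset(unsigned int id)
{
	return (off_t)id << sketch_page_shift();
}
```

The resulting value would be passed as the last argument of mmap() on
the kgsl device file descriptor.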
Change-Id: I4b45f0660cb9d4a5fb1323ccc6c4aa360791c1ec
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
The v1 iommu only supports splitting between TTBR0 and TTBR1
on a power of two boundary. Cutting off the userspace address
at 2G (0x80000000) is inconvenient, as the GPU userspace
address space should align with the CPU address space.
This requires changing how global allocations are managed,
since there is no longer a separate pagetable for TTBR1.
The default pagetable is still the master of these allocations
and maintains the gen_pool for allocating global addresses.
But now, these regions are mapped into each process pagetable
by calling kgsl_setup_pt(). This requires kgsl_mmu_map
and kgsl_mmu_unmap to be able to handle mapping without
virtual address allocation.
Change-Id: I94e2d63dc7e6a7ef576f993770725b6b7ba14228
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
These cases have never been a normal operation pattern of the
userspace driver. Sharing buffers through multiple mappings
is better left to dma-buf or ion.
Change-Id: I7e7658137937c96b9505d0f912dcb262d652e0c3
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Some external memory types were using hostptr to store
a userspace virtual address, but other code assumes it is a
kernel virtual address. Make memdesc->hostptr always
be the kernel virtual address and add a useraddr
field for the userspace virtual address.
Change-Id: Id4580a2ff34aeb15f2c1b26a7134f0fd4ec52a6e
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
Userspace passes a flag KGSL_MEMFLAGS_GPUREADONLY when
the gpu should not write to a memory region. Use this
flag to control IOMMU_WRITE permissions on the buffer.
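The mapping from the memory flag to IOMMU protection bits can be
sketched as below. The read/write bit values mirror the IOMMU_READ and
IOMMU_WRITE definitions in linux/iommu.h but are defined locally so
the example builds in userspace; the flag value and the sketch_* names
are assumptions for illustration.

```c
/* Local copies of the protection bits, for a userspace build. */
#define SKETCH_IOMMU_READ	(1 << 0)
#define SKETCH_IOMMU_WRITE	(1 << 1)

/* Assumed flag value; only the flag's meaning comes from the text. */
#define SKETCH_MEMFLAGS_GPUREADONLY	0x01000000u

/* Derive the protection bits to request when mapping a buffer. */
int sketch_iommu_prot(unsigned int memflags)
{
	int prot = SKETCH_IOMMU_READ | SKETCH_IOMMU_WRITE;

	/* A GPU read-only buffer must not be writable by the GPU. */
	if (memflags & SKETCH_MEMFLAGS_GPUREADONLY)
		prot &= ~SKETCH_IOMMU_WRITE;
	return prot;
}
```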
Change-Id: I5d3fc615dc36687252e2242f63fe74d6ce1c4fbc
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Sakshi Agrawal <sakshia@codeaurora.org>
The interrupts added to the ringbuffer on PTFLUSH and TLBUPDATE
were causing a major increase in the number of interrupts from the GPU.
This led to increased power use and a loss of performance. Add a check
to turn off IOMMU clocks when going to SLEEP.
Change-Id: I41617dd3b7b3f7d9622523f2a1407b912dbd989e
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Remove ts_notifier_list from the driver since it is not being
used and is causing extra work to be done in the interrupt
handlers for A2XX, A3XX and Z180.
Change-Id: I5512e36f1e807f3a3e62aeac54cfd3075d4cf7a4
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Jeff Boody <jboody@codeaurora.org>
Fence sync seems to react badly when it gets an unexpected timestamp
on a timeline. When an event expires always send the expected timestamp
even if the event expired because the context was destroyed.
Change-Id: Ic0dedbad770363a00257d62c4940f7743bc33060
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
We expect alloc_pages to fail for high-order pages under stressed
memory conditions. It should then fall back to lower-order page
allocations but due to excessive logging the system crashes. Use
__GFP_NOWARN for high-order allocations to avoid excessive logging.
(cherry picked from commit ff6ea62fe0687e754ddb64492e77fd75ad85fa02)
CRs-Fixed: 454877
Change-Id: Ie18c2dddd6810038abad9b06c7838ff4a0844b6d
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Fixed incorrect logic to check if the next pagefault
is in the 4k address range of previous pagefaults.
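One plausible reading of this check is "does the new fault land in the
same 4 KiB-aligned page as the previous one". The sketch below
implements that reading only; the interpretation, the 32-bit address
type and the sketch_* names are assumptions, and the driver's actual
comparison may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* True if both fault addresses fall in the same 4 KiB page. */
bool sketch_same_4k_range(uint32_t prev_fault, uint32_t next_fault)
{
	uint32_t mask = ~(SKETCH_PAGE_SIZE - 1);

	return (prev_fault & mask) == (next_fault & mask);
}
```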
Change-Id: I76c2e13f3b1cb287af0ef9c8af2f4452df23ec54
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
After all the current events are processed in the main loop, the next
pending event is marked to fire an interrupt to start the cycle anew.
When the event is marked, we check the timestamp again to make sure that
we are always caught up with the GPU. If the timestamp for the next
event happens to be retired when we get to this point, we silently leave
the function.
If there are no other subsequent events or waittimestamp requests, then
it is possible that another interrupt might never be fired for that context
and then the event gets missed. Return a 1 in this case all the way back
up the stack and make the event code keep processing events until we are
positive that none are past.
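The "return 1 and keep processing" idea can be modelled in userspace
as follows. The GPU is simulated by a list of successive timestamp
readings, timestamp wraparound is ignored, and all sketch_* names and
structures are assumptions rather than the driver's code.

```c
#include <stddef.h>

struct sketch_event {
	unsigned int timestamp;
	int fired;
};

/* Simulated GPU: successive reads return successive readings. */
struct sketch_gpu {
	const unsigned int *readings;
	size_t n;
	size_t pos;
};

unsigned int sketch_read_retired(struct sketch_gpu *gpu)
{
	unsigned int v = gpu->readings[gpu->pos];

	if (gpu->pos + 1 < gpu->n)
		gpu->pos++;
	return v;
}

/*
 * One pass over the sorted event list. Returns 1 if the next pending
 * event retired while it was being armed, so the caller must loop.
 */
int sketch_process_once(struct sketch_event *ev, size_t n, size_t *head,
			struct sketch_gpu *gpu)
{
	unsigned int retired = sketch_read_retired(gpu);

	while (*head < n && ev[*head].timestamp <= retired) {
		ev[*head].fired = 1;
		(*head)++;
	}
	if (*head == n)
		return 0;

	/* An interrupt would be armed for ev[*head] here. Re-check. */
	return ev[*head].timestamp <= sketch_read_retired(gpu);
}

/* Keep processing until no pending event is already retired. */
void sketch_process_events(struct sketch_event *ev, size_t n,
			   struct sketch_gpu *gpu)
{
	size_t head = 0;

	while (sketch_process_once(ev, n, &head, gpu))
		;
}
```

With a single pass, an event that retires during the re-check would
stay unfired until some later interrupt; the loop closes that window.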
Change-Id: I5a770e7f06612e1aecbfe2e8ccbfeebdab3fc9fc
CRs-Fixed: 441191
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Instead of sending the currently expired timestamp to expired
events, send the timestamp that the event expected. This avoids
confusion for events that don't compare the actual timestamp
to the one they expected. Canceled events send the current
timestamp when they were canceled.
The kgsl_sync handler wasn't prepared for the second case so
add a check to prevent strange things from happening on the
timeline when an event is canceled.
Change-Id: Ic0dedbad3681db6222720884688dcddb9b06bda2
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Add reference counting to all KGSL events since they are dependent
on the context. This avoids a race condition where the context
could be destroyed while the events are being processed during cleanup.
Change-Id: Ifcd9655a6335d22c01f2606a0fbc5346d9a0314f
CRs-fixed: 438134
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Move the event accounting from a master list on the device to individual
context lists. This makes the management code simpler and reduces
the number of extra cycles spent walking a single largish list.
The newly added code makes the event code slightly bigger than kgsl.c can
bear. As events become more commonplace this code will be more important
and it will be easier to maintain in a dedicated file so move all the event
related code into kgsl_events.c
Change-Id: Ic0dedbadf00b96eee53aea1d26acb195f1a916ce
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Event-based and interruptible timestamp checks use the same
code; they differ only in whether the mutex is held.
Separate the common code out to avoid confusion.
Change-Id: Ie2ee01d7a07ff2ba7755ab8d327796345f9ca9ea
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Stop parsing an indirect buffer if an error is encountered (such as
a missing buffer). This is a pretty good indication that the buffers
are not reliable and the further the parser goes with an unreliable
buffer the more likely it is to get confused.
Change-Id: Ic0dedbadf28ef374c9afe70613048d3c31078ec6
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
The snapshot timestamp is currently based on wall time which makes
it very difficult to debug against a kernel log based on boot time.
It doesn't help matters that the snapshot timestamp is output in
human unfriendly hex. Use boot time instead of wall time for the
snapshot value and print it out in decimal.
Change-Id: Ic0dedbadf9b7d799a3cb4d5917787bccd350c90a
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
When taking a snapshot from a ramdump it is usually important that
we get the last IB2 that was executed so if we find it during
parsing put it in the static snapshot region as opposed to the
dynamic one.
Change-Id: Ic0dedbad878b961ebe29afe9145443e9bfd0bc6a
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
If there are a lot of sequential commands in the ringbuffer for the
same context there is a high probability that some of the memory
for the older commands has been reused and so the contents may not
be reliable. Increase robustness by making sure we don't parse the
same IB over and over again and be more tolerant of errors and
strange sizes.
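Avoiding repeated parsing of the same IB can be done by remembering
which addresses were already visited. The following is a minimal
sketch of that bookkeeping; the table size, the policy of skipping
when the table is full and all sketch_* names are assumptions.

```c
#include <stddef.h>

#define SKETCH_MAX_SEEN 16

struct sketch_seen_ibs {
	unsigned int gpuaddr[SKETCH_MAX_SEEN];
	size_t count;
};

/*
 * Return 1 if the IB at gpuaddr should be parsed, 0 if it was already
 * parsed or if there is no room left to track it.
 */
int sketch_should_parse_ib(struct sketch_seen_ibs *seen,
			   unsigned int gpuaddr)
{
	size_t i;

	for (i = 0; i < seen->count; i++) {
		if (seen->gpuaddr[i] == gpuaddr)
			return 0;
	}
	if (seen->count == SKETCH_MAX_SEEN)
		return 0;
	seen->gpuaddr[seen->count++] = gpuaddr;
	return 1;
}
```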
Change-Id: Ic0dedbade8504ea258ceea40593cb75e7d6211c8
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
When debugfs entity "ft_user_control" is set, it enables userspace to
control GPU fault tolerance policy. By default it is disabled.
Change-Id: Ia93ccf6d9595f6e7cda71863f9067819aaa0a68d
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>