We remove an incorrect check for the currently active context.
The intent of the original check was to ensure that the
current context exists and is valid before we issue
a dummy command with a forced interrupt. However, this
check was implemented incorrectly: instead of checking
the context under which the function is running, it
probed "drawctxt_active", which is not necessarily
the same as the context for which the function was called.
We fix this by changing the check to look at the
context under which kgsl_check_interrupt_timestamp() has
been called.
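Illustrative sketch only, not the driver code: the difference between
probing the device-wide active context and validating the caller's own
context. All struct and function names here are hypothetical stand-ins,
and the "valid" flag is an assumed simplification.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct sketch_context { int id; bool valid; };
struct sketch_device { struct sketch_context *drawctxt_active; };

/* Old behaviour: probes whatever context happens to be active. */
static bool old_check(const struct sketch_device *dev,
                      const struct sketch_context *ctx)
{
    (void)ctx; /* the caller's context is never consulted */
    return dev->drawctxt_active != NULL;
}

/* New behaviour: validates the context the function was called for. */
static bool new_check(const struct sketch_device *dev,
                      const struct sketch_context *ctx)
{
    (void)dev;
    return ctx != NULL && ctx->valid;
}
```

With a valid active context and a stale caller context, old_check()
passes while new_check() correctly rejects the call.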
CRs-fixed: 426186
Change-Id: I6ac123d16888287b14e6e53028f482eb709f24c5
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
When the GPU resumes from suspend, the requested state is not cleared to
NONE. As a result, idle reporting is not done and the GPU stays at MAX
freq. As a fix, if the current power state is already ACTIVE and a request
is made for the ACTIVE power state, clear the requested state immediately.
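Illustrative sketch only: a minimal state model of the fix. The enum,
struct and function names are hypothetical and do not mirror the kgsl
power-control code.

```c
/* Hypothetical power states for illustration. */
enum sketch_state {
    SKETCH_STATE_NONE,
    SKETCH_STATE_ACTIVE,
    SKETCH_STATE_SUSPEND
};

struct sketch_pwr {
    enum sketch_state state;           /* current power state */
    enum sketch_state requested_state; /* pending request, if any */
};

/* Request ACTIVE; if already ACTIVE, clear the pending request at once. */
static void sketch_request_active(struct sketch_pwr *p)
{
    p->requested_state = SKETCH_STATE_ACTIVE;
    if (p->state == SKETCH_STATE_ACTIVE)
        p->requested_state = SKETCH_STATE_NONE;
}
```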
CRs-fixed: 424682
Change-Id: I7f0d7fa819308f166cbbbf30b2c20aee73644cfb
Signed-off-by: Vinay Roy <vroy@codeaurora.org>
Waittimestamp calls require interrupts to check if a timestamp
has passed. The lack of these interrupts was causing waittimestamp
to wait longer than expected since the interrupts were less frequent.
Cause the conditional interrupts to be issued faster by allowing
internal command submissions to issue them.
CRs-fixed: 417577
Change-Id: Idb6f18261b3dd6fcbea5607d449d70ca54136e81
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Update the VBIF register settings for A330 for better performance and
stability per the latest testing and analysis.
CRs-Fixed: 416680
Change-Id: Ic0dedbad71bfd589b322bed503052315d0bd1940
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
The CP dynamic clock seems to be glitchy when the CP clocks are turned
back on after a power event. Turn off said dynamic clock control at
init time. The impact of leaving the dynamic clock control off is
negligible since the CP clock is only on when the CP is actually in
use.
CRs-fixed: 402119
CRs-fixed: 409253
CRs-fixed: 413224
Change-Id: Ic0dedbad783f8b911d9b57d1602d9b3976af1b3b
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Pass the heap mask field as an argument at allocation time to specify
the ion heap.
Change-Id: I6951da81b11bfcb0e006d2539c1707aabbb12db6
Signed-off-by: Hanumant Singh <hanumant@codeaurora.org>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
This syncs up our ion buffer flags in ion.h with upstream (using
ION_FLAG_CACHED instead of CACHED and UNCACHED).
Change-Id: I2e07a3acba230a86817182677fa2394fa846fd7e
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
ion_validate_buffer_flag is no longer needed since its only callers
are moving to a new format for buffer flags.
Change-Id: I90f23123681eaefb40b530aec5eaf596770acf18
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
To avoid collisions with future ioctls that may be added to ion,
msm-specific ioctls should use the `custom_ioctl' interface provided
by struct ion_device, as originally intended.
Change-Id: I68d32b81c7691fd3f54f38acfdc17fee2573bc44
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
All msm_ion clients need to use <linux/msm_ion.h> instead of
<linux/ion.h>
Change-Id: I4f999e301d147ec3242a33ee75b10b6501d42a70
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
All msm_ion clients need to use <linux/msm_ion.h> instead of
<linux/ion.h>
Change-Id: Ibc0856a9bcced14d1531f229d8cb232b9f787151
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
System performance is enhanced if the gpu frequency is given a
minimum corresponding to various frequency levels of CPU 0.
Change-Id: Iba168d708524fc8ef164428bb5f4e0631a499342
Signed-off-by: Steve Muckle <smuckle@codeaurora.org>
(cherry picked from commit 682c7a01c1d86518cdc7bec25cb413498811137b)
DCVS only expects a core to be registered once, but kgsl reregisters
the core every time the policy changes. Kgsl should do the sink start
and stop every time, but register the core only the first time.
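Illustrative sketch only: register once, start/stop the sink on every
policy change. The counters and names are hypothetical and stand in for
the real DCVS calls.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping that stands in for the DCVS calls. */
struct sketch_dcvs {
    bool registered;
    int register_calls;
    int sink_starts;
    int sink_stops;
};

/* Called each time the policy is started. */
static void sketch_policy_start(struct sketch_dcvs *d)
{
    if (!d->registered) {
        d->register_calls++; /* core registration happens only once */
        d->registered = true;
    }
    d->sink_starts++;        /* sink start happens every time */
}

/* Called each time the policy is stopped. */
static void sketch_policy_stop(struct sketch_dcvs *d)
{
    d->sink_stops++;         /* sink stop happens every time */
}
```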
Change-Id: Id54f078d6013586899bf85fc2462e6c473ffc6b5
Signed-off-by: Suman Tatiraju <sumant@codeaurora.org>
(cherry picked from commit 229d24c088699c5f7a3591e5335dcc1668e5e913)
Always reset the ts_cmp_enable when an interrupt is received
from the GPU. This keeps legacy code that is not using
per context timestamps correctly updated. No effect is
seen with mainline code using per context timestamps.
CRs-fixed: 418172
Change-Id: I7f29086d4885571bdb165c0e759dc6ffc40b554f
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
When the sysfs entry for max clock is updated, set it immediately for
devices that have no power scale policy. This prevents the device from
running with an undesired clock.
Change-Id: I3211234f6d2afe4c2de115dace70b332d2b30f21
CRs-fixed: 408963
Signed-off-by: Ranjhith Kalisamy <ranjhith@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
With 8974, bandwidth requests exceed the 32-bit integer range.
Update ab and ib to ensure values over 4Gbps can be requested.
For 64-bit support, there are four changes in this patch:
1. Changes in the bus driver structures and functions to
   update ab and ib values to uint64.
2. Updates in client drivers to ensure that the bus APIs
   are called with the right types.
3. Updates to the device-tree. Device-tree nodes don't
   provide a clean mechanism to read a combination of u32
   and u64 variables within the same property name. So, the
   ab and ib values must be specified in kbps instead of
   bytes/sec from the device tree. The bus driver function
   will convert the ab and ib values read from device-tree
   to Mbps.
4. Change the existing property names to make them consistent
   with the convention used by the new property name, which
   abides by the Device Tree convention.
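Illustrative sketch only: why a 32-bit field cannot carry these requests.
The helper names and the factor of 1000 are assumptions for the example
and are not the bus driver's actual conversion.

```c
#include <stdint.h>

/* Hypothetical helper: scale a u32 value read in units of 1000
 * up to base units, doing the multiplication in 64 bits. */
static uint64_t sketch_kilo_to_base(uint32_t kilo_units)
{
    return (uint64_t)kilo_units * 1000u;
}

/* What a 32-bit field would end up storing for the same request. */
static uint32_t sketch_truncate32(uint64_t v)
{
    return (uint32_t)v; /* wraps modulo 2^32 */
}
```

A request of 5,000,000 kilo-units scales to 5,000,000,000, which wraps to
705,032,704 when stored in 32 bits.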
Change-Id: Ib0a6db0e221af366c61c51cc21ac722f02cb7a34
CRs-Fixed: 408786
Signed-off-by: Gagan Mac <gmac@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Adreno idle detection during a hang was being obstructed when
called from adreno_waittimestamp(). The variable active_cnt, which
is meant to keep the device from going into suspend during
a waittimestamp, was set and prevented the rbbm_status register
from being checked. This led to false hang detection: had the
rbbm_status register been probed, the device would have been
found to be idle. We fix this by adding a separate function,
is_adreno_rbbm_status_idle(), which checks this register and
reports whether the core is idle. We call this function instead
of the generic adreno_isidle() inside adreno_hang_detect().
Doing this fixes at least one scenario of false hang detection.
CRs-fixed: 406631
Change-Id: I44b5446769eaa361cca63674f700153ee3faff45
Signed-off-by: Harsh Vardhan Dwivedi <hdwivedi@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Prevent the iommu fault handler from clearing the stalled status
of GPU on page fault. This will cause a GPU hang and print out
the snapshot that will help in fault analysis.
Change-Id: I9dcab83a098a988f86a0c03c46b0dbe6624de937
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Allow users to specify "big page" mode when allocating GPU
memory. This will attempt to allocate contiguous units of 64K
bytes instead of the usual PAGE_SIZE units. If a 64K allocation
fails, then automatically fall back to allocating the rest of the
region with 4K pages. This means that potentially any allocation
could comprise a mixture of 64K chunks and 4K chunks.
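Illustrative sketch only: how an allocation can end up as a mix of 64K
and 4K chunks. The names are hypothetical, and "big_budget" is an
assumed stand-in for the number of 64K allocations that succeed before
the first failure.

```c
#include <stddef.h>

#define SKETCH_SZ_4K  4096u
#define SKETCH_SZ_64K 65536u

struct sketch_layout {
    size_t chunks_64k;
    size_t chunks_4k;
};

/* Plan the chunk layout for a region of "size" bytes. */
static struct sketch_layout sketch_plan(size_t size, size_t big_budget)
{
    struct sketch_layout l = { 0, 0 };

    /* Use 64K chunks while they fit and still "succeed". */
    while (size >= SKETCH_SZ_64K && big_budget > 0) {
        l.chunks_64k++;
        big_budget--;
        size -= SKETCH_SZ_64K;
    }

    /* After a failure, or for the tail, fall back to 4K pages. */
    l.chunks_4k = (size + SKETCH_SZ_4K - 1) / SKETCH_SZ_4K;
    return l;
}
```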
Change-Id: Ib92c0c099d3b10ba54dbb864c5f977f8df4912f2
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
io_fraction is set to max so that io_wait gets reported
for DCVS so that MP decision won't offline cores
during GPU activity.
Change-Id: I39db72dd9c443069675a8a1b1f4e17d8928625b8
Signed-off-by: Suman Tatiraju <sumant@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Add a file for each process in debugfs named kgsl/proc/<pid>/mem,
which contains information about all memory allocations the process
has made.
Change-Id: Ice3f039d92cc1b1cdb5a6192808441ddfdf8abfb
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Add a "usage" field to memory allocation, mapping and
free ftrace events.
Change-Id: I673a9593650d5285b0abc8c94de8f9f80d3d449e
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Userspace passes a set of values indicating how it
uses each buffer it allocates, which were previously
ignored. These are useful hints for debugging and
profiling applications. These flags will be exposed
through ftrace and debugfs in later patches.
Change-Id: Ie26c26e413c074dcd5dfa24d355443ee47c3cd6a
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Jeremy Gebben <jgebben@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Some intensive shader operations can go for the full timeout
in the SP block without changes in the RBBM and CP registers
that we monitor for hang detection. Add the performance counter
SP_FS_FULL_ALU_INSTRUCTIONS to see if any full precision
instructions have been executed during the hang detection interval.
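Illustrative sketch only: a hang is suspected only when none of the
monitored values changed over the interval, so adding the ALU counter
to the monitored set keeps long shaders from looking hung. The names
and the three-entry layout are assumptions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical monitored set: RBBM status, CP state, SP ALU counter. */
#define SKETCH_NUM_REGS 3

/* Returns true (possible hang) only if no monitored value changed. */
static bool sketch_hang_suspected(const unsigned int prev[SKETCH_NUM_REGS],
                                  const unsigned int curr[SKETCH_NUM_REGS])
{
    return memcmp(prev, curr, SKETCH_NUM_REGS * sizeof(prev[0])) == 0;
}
```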
CRs-Fixed: 392730
Change-Id: Ic0dedbadd6e5bcd0b46aab4209430de2f74711f7
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Because we store process memory indexed by the thread group leader and
report pagefaults accordingly it helps reduce confusion to record the
TGID for the allocation in the memory trace messages.
Change-Id: Ic0dedbad79c7112c4a6746cea0941104519e495c
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
VBIF register settings depend on which A3XX GPU core is present.
Set the registers from a table that is explicitly tied to each of
the A3XX GPU cores. This will prevent side effects across cores
when changing a specific core's VBIF data.
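Illustrative sketch only: a per-core table lookup. The names are
hypothetical, and the offsets and values are placeholders, not real
VBIF settings.

```c
#include <stddef.h>

enum sketch_core { SKETCH_A305, SKETCH_A320, SKETCH_A330 };

struct sketch_reg { unsigned int offset; unsigned int value; };

struct sketch_vbif_entry {
    enum sketch_core core;
    const struct sketch_reg *regs;
    size_t count;
};

/* Placeholder offsets and values; not real register settings. */
static const struct sketch_reg a305_vbif[] = { { 0x10, 0x1 } };
static const struct sketch_reg a320_vbif[] = { { 0x10, 0x2 }, { 0x14, 0x3 } };
static const struct sketch_reg a330_vbif[] = {
    { 0x10, 0x4 }, { 0x14, 0x5 }, { 0x18, 0x6 }
};

/* One entry per core, so editing one core's data cannot affect another. */
static const struct sketch_vbif_entry vbif_table[] = {
    { SKETCH_A305, a305_vbif, sizeof(a305_vbif) / sizeof(a305_vbif[0]) },
    { SKETCH_A320, a320_vbif, sizeof(a320_vbif) / sizeof(a320_vbif[0]) },
    { SKETCH_A330, a330_vbif, sizeof(a330_vbif) / sizeof(a330_vbif[0]) },
};

/* Return the table entry for a core, or NULL if the core is unknown. */
static const struct sketch_vbif_entry *sketch_vbif_lookup(enum sketch_core core)
{
    size_t i;

    for (i = 0; i < sizeof(vbif_table) / sizeof(vbif_table[0]); i++)
        if (vbif_table[i].core == core)
            return &vbif_table[i];
    return NULL;
}
```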
Change-Id: I4c20cd891a940abd85459ce5bf548cf91d06004a
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
When recovery fails, we are in the HUNG state and clocks are turned off.
If a client closes the kgsl device, the driver release gets called and
we try to idle the device, which leads to accessing GPU registers
without the clock being on. This leads to excessive kernel logging of
"kgsl_pre_hwaccess hw access while clocks off from state HUNG"
and eventually triggers the watchdog.
CRs-fixed: 397149
Change-Id: Iac720bef5d811269417ec1e5a346c95ce5a4324f
Signed-off-by: Tarun Karra <tkarra@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Set the address translation to happen via ttbr0 for GPU IOMMU till
address 0x80000000 instead of 0x40000000. Also, adjust the size of
virtual pool so that it does not allocate an address greater
than 0x80000000.
CRs-fixed: 401364
Change-Id: I268dda47e82e80a7ee0a72a835c7777be8b6f7f5
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
If a hang is detected when allocating space in ringbuffer and
if the context for which the space is being allocated is hung
then do not allocate space at all.
Change-Id: Ia5ade2341fe5016119d8c140413860420c5c3a3d
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
The existing code leaves clks prepared when we transition from
NAP to SLEEP. We fail to reach Vdd_min as voltage scaling has
moved to prepare/unprepare.
Change-Id: I509f4c6bd217ead985890147819236dd2c5d03ce
Signed-off-by: Suman Tatiraju <sumant@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
There is a chance that we could fail to allocate memory and then write
to it when allocating entries for our scatter gather list. Handle the
error cases properly to prevent writing to NULL.
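Illustrative sketch only: checking the allocation before anything is
written to it. The struct, the injected allocator parameter and the
always-failing allocator are hypothetical and exist only to make the
error path easy to exercise in userspace.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scatter-gather entry list. */
struct sketch_sg {
    void **pages;
    size_t count;
};

/* Allocator that always fails, used to exercise the error path. */
static void *sketch_failing_calloc(size_t n, size_t size)
{
    (void)n;
    (void)size;
    return NULL;
}

/* Allocate "count" entries; never touch the list if allocation fails. */
static int sketch_sg_alloc(struct sketch_sg *sg, size_t count,
                           void *(*alloc)(size_t, size_t))
{
    sg->pages = alloc(count, sizeof(*sg->pages));
    if (sg->pages == NULL) {
        sg->count = 0;
        return -ENOMEM; /* report the failure instead of writing to NULL */
    }
    sg->count = count;
    return 0;
}
```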
Change-Id: I33978ff53114ede7007b3240406583bcbef2f292
Signed-off-by: Carter Cooper <ccooper@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Invalidate the base pointer even when only the tlb is flushed. For
gpummu the base pointers are invalidated even after just
a tlb flush; do the same with IOMMU.
Change-Id: I11c7d4b841e5f8becd30a087bcabc0ae69c3fee5
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
The A330 GPU defines a few new registers that don't exist on
A305/A320. Define a new subset for A330 and dump it in the
postmortem and binary snapshot.
Change-Id: Ic0dedbadd0c44ee8872b99fd6b0b3dc8eb972eea
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Expand the snapshot register dump helper function to support multiple
sets of registers. This will be useful for derivative GPUs that
use a global subset of registers and add a few new ones. This will
not be useful for chipsets that have extensive changes to existing
registers.
Change-Id: Ic0dedbad05bcc3b5a3a0cc933659959965ff5817
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
The CP MERCIU queue can be read from the A330 GPU. Dump it into
the snapshot binary.
Change-Id: Ic0dedbadf2c61ccec6a11af103374f4aee8be727
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
The A330 has a larger ROQ buffer than the A305/A320 variants so
adjust the size at runtime based on the core type.
Change-Id: Ic0dedbade62c988cfe402876bc94d91a2dd71617
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Create adreno_is_a330() to identify the GPU for A330 specific
register settings and core specific code.
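Illustrative sketch only: the shape such a predicate might take. The
enum, struct and field names are hypothetical and are not taken from
the adreno headers.

```c
#include <stdbool.h>

/* Hypothetical core identifiers for illustration. */
enum sketch_gpurev {
    SKETCH_REV_A305 = 305,
    SKETCH_REV_A320 = 320,
    SKETCH_REV_A330 = 330
};

struct sketch_adreno_device {
    enum sketch_gpurev gpurev;
};

/* True only for the A330 core, so callers can gate core-specific code. */
static inline bool sketch_is_a330(const struct sketch_adreno_device *dev)
{
    return dev->gpurev == SKETCH_REV_A330;
}
```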
Change-Id: Ic0dedbade244ffba3ba3917661a88f97108e6182
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Increase number of clocks that RBBM will wait before de-asserting
the Register Clock Active signal. This fixes kernel panics during
stability tests on multiple devices.
Change-Id: I6f7f8bb17cfd9c5beed0fd21d56ab6ab9fd40195
Signed-off-by: Rammohan Basavaraju <rammoh@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
For A3xx, initialize UCHE_CACHE_MODE_CONTROL_REG to
0x00000001 so that UCHE will always use 64-byte
cachelines when we boot up or reset. This value
increases performance and was previously set in the
graphics preambles, but should instead be set at
boot/reset time.
Change-Id: Iec71ffc04262ac43534fd632d8b092a48d280509
Signed-off-by: Kevin Matlage <kmatlage@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
GPU register RB_GMEM_BASE_ADDR needs to be initialized with the GMEM base
address. Starting with A330, OCMEM is used dynamically as GMEM. When
OCMEM is allocated for GFX, the allocated region address may vary
every time, so RB_GMEM_BASE_ADDR needs to be initialized
with the allocated OCMEM region address.
Change-Id: I5cb4472a9f18759d2af160a15d83f1404378a530
Signed-off-by: liu zhong <zhongl@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
No need to save a context if it is being destroyed since
it will no longer be used at all. This is better for
performance and also avoids the use of legacy kgsl
context save code for contexts that use preambles.
Change-Id: I19a64e82188b4132f353bb61c21e4ed2281092fc
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Define the register offsets for IOMMU-v2 and switch off using per
process pagetables for IOMMU-v2.
Change-Id: I8b76de557c8e52b5a2a333ceb987bd743b213eb7
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Change the values written to VBIF registers for A330.
Increase the maximum number of pending requests from clients to 24.
Disable VBIF clock gating. This potentially increases stand-by power.
Change-Id: Ic9a4f15546f4122298e140e79e4572c82e6385fc
Signed-off-by: Lokesh Batra <lbatra@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Define the IOMMU register offsets in a structure array. This
offers flexibility in defining IOMMU v2 offsets in another array
and the right array to be used can be setup during MMU
initialization.
Also, restrict the usage of IOMMU offsets only in the iommu file
by redefining the functions that return iommu information. Remove
the function to get iommu mapped register address and replace
it with a function that returns the gpuaddress of given iommu
register. Only return the valid address bits of an iommu pagetable
instead of just returning the pagetable base register value.
Change-Id: Ib88e605f57e551c7b84029647451cb20f06025a0
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Calling clk_set_rate() for both AXI & 2D core clocks without putting
them in async mode causes a 2D core hang. Since AXI & 2D core clocks
are in sync mode, ensure that clk_set_rate() is called only for AXI
and that the 2D core clock is enabled only after it is prepared.
Change-Id: I4634e2342d62ce16ad7afc748b10b0573fbfd913
CRs-fixed: 385393
Signed-off-by: Ranjhith Kalisamy <ranjhith@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Turn io_fraction to 100 so that the kernel
wait_io_interruptible call is not required for io_busy
reporting while using msm policy. Msm policy uses its
own schema to capture busy events.
Change-Id: Iddb63552305974ce4a12446117f27d07b7201387
Signed-off-by: Suman Tatiraju <sumant@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Turn on an additional clock for IOMMU if it is valid. This additional
clock is present on devices which have the IOMMU unit access tied to
the core clock of another component. This has been introduced from
IOMMU-v2 onwards.
Change-Id: I8b0a0f23cb789d820a8d515cae54b44f556e634d
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Set the upper limit for the gpu address that the MMU can access to
be the base address where the first IOMMU unit's register space
is mapped. Earlier we set it to the last address mapped minus
a PAGE_SIZE, but we can actually set it to the base address of the
first mapped unit since the GPU should never access that space
unless we are updating IOMMU registers.
Change-Id: I6507ee373a9218210c148685e443e948a311bd29
Signed-off-by: Shubhraprakash Das <sadas@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Limit the size of individual GPU memory allocations to the amount of
free memory on the system minus 32MB. This early check gives us the
chance to verify that the user didn't ask for an obscene amount of
memory, and also to limit the chance that an allocation attempt will
invoke the OOM killer.
The 32MB buffer in particular should keep us out of the clutches of
the OOM killer. That number is the same amount of buffer used in
the page allocation algorithms and it should keep the GPU from further
throwing fuel on the fire of a low memory situation.
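Illustrative sketch only: the early size check described above. The
function name, the way free memory is passed in, and the -ENOMEM return
value are assumptions for the example.

```c
#include <errno.h>
#include <stdint.h>

/* The 32MB reserve described above. */
#define SKETCH_RESERVE_BYTES (32ull * 1024 * 1024)

/* Reject requests larger than (free memory - 32MB). */
static int sketch_check_alloc_size(uint64_t size, uint64_t free_bytes)
{
    /* Guard first so the subtraction below cannot underflow. */
    if (free_bytes <= SKETCH_RESERVE_BYTES)
        return -ENOMEM;
    if (size > free_bytes - SKETCH_RESERVE_BYTES)
        return -ENOMEM;
    return 0;
}
```

With 128MB free, the limit is 96MB: a 64MB request passes and a 100MB
request is rejected.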
Change-Id: Ic0dedbadcc5ee9cc0d77056b1a22eed5c385d636
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>
Instead of calling BUG_ON if the user passes a zero length via
IOCTL_KGSL_GPUMEM_ALLOC to kgsl_sharedmem_alloc_user, return
-EINVAL and propagate the error back up to the user.
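Illustrative sketch only: returning an error for a zero-length request
and passing it up the call chain unchanged. Both function names are
hypothetical, and the allocation itself is omitted.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator entry point: reject zero length, do not crash. */
static int sketch_alloc_user(size_t size)
{
    if (size == 0)
        return -EINVAL;
    return 0; /* the allocation itself is omitted in this sketch */
}

/* Hypothetical ioctl-level caller: hand the error back unchanged. */
static int sketch_ioctl_gpumem_alloc(size_t size)
{
    int ret = sketch_alloc_user(size);

    if (ret)
        return ret;
    return 0;
}
```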
CRs-fixed: 389886
Change-Id: Ic0dedbad2dc1f68ad1d3227498893c73c2a1c59e
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rajeev Kulkarni <krajeev@codeaurora.org>