|
1990 | 1990 | "attributes" : "ReadMem" |
1991 | 1991 | }, |
1992 | 1992 |
|
| 1993 | +### ``llvm.genx.lsc.load.merge.*.<return type if not void>.<any type>.<any type>`` : lsc_load merge instructions |
| 1994 | +### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 1995 | +### |
| 1996 | +### * ``llvm.genx.lsc.load.merge.slm`` : |
| 1997 | +### * ``llvm.genx.lsc.load.merge.bti`` : |
| 1998 | +### * ``llvm.genx.lsc.load.merge.stateless`` : |
| 1999 | +### |
| 2000 | +### * Exec_size ignored unless operation is transposed (DataOrder == Transpose)
| 2001 | +### * arg0: {1,32}Xi1 predicate (overloaded) |
| 2002 | +### * arg1: i8 Subopcode, [MBZ] |
| 2003 | +### * arg2: i8 Caching behavior for L1, [MBC] |
| 2004 | +### * arg3: i8 Caching behavior for L3, [MBC] |
| 2005 | +### * arg4: i16 Address scale, [MBC] |
| 2006 | +### * arg5: i32 Immediate offset added to each address, [MBC] |
| 2007 | +### * arg6: i8 The datum size, [MBC]
| 2008 | +### * arg7: i8 Number of elements to load per address (vector size), [MBC] |
| 2009 | +### * arg8: i8 Indicates if the data is transposed during the transfer, [MBC] |
| 2010 | +### * arg9: i8 Channel mask for quad versions, [MBC] |
| 2011 | +### * arg10: {1,32}Xi{16,32,64} The vector register holding offsets (overloaded) |
| 2012 | +### for flat version Base Address + Offset[i] goes here |
| 2013 | +### * arg11: i32 surface to use for this operation. This can be an immediate or a register |
| 2014 | +### for flat and bindless version pass zero here |
| 2015 | +### * arg12: VXi{16,32,64} The data to merge into disabled channels (overloaded)
| 2016 | +### |
| 2017 | +### * Return value: the value read, merged with arg12 by predicate
| 2018 | +### |
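As a reading aid, the merge semantics described above can be modeled in plain Python. This is only an illustrative sketch, not the GenX lowering: `merge_load` and its parameter names are hypothetical, memory is stood in for by a list of already-loaded values, and one element per lane (vector size 1) is assumed.

```python
def merge_load(predicate, loaded, passthru):
    """Per-lane model of the merge: enabled lanes take the loaded value,
    disabled lanes keep the matching element of the merge input (arg12)."""
    if not (len(predicate) == len(loaded) == len(passthru)):
        raise ValueError("predicate, loaded and passthru must have equal length")
    return [ld if p else old for p, ld, old in zip(predicate, loaded, passthru)]


# Lanes 1 and 3 are disabled, so they keep the passthru values.
result = merge_load([True, False, True, False], [10, 11, 12, 13], [0, 0, 0, 0])
print(result)
```

The non-merge load variants leave the result for disabled lanes unspecified in this model; the merge variants make it explicit through arg12.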
| 2019 | +### Cache mappings are: |
| 2020 | +### |
| 2021 | +### - 0 -> .df (default) |
| 2022 | +### - 1 -> .uc (uncached) |
| 2023 | +### - 2 -> .ca (cached) |
| 2024 | +### - 3 -> .wb (writeback) |
| 2025 | +### - 4 -> .wt (writethrough) |
| 2026 | +### - 5 -> .st (streaming) |
| 2027 | +### - 6 -> .ri (read-invalidate) |
| 2028 | +### |
| 2029 | +### Only certain combinations of CachingL1 with CachingL3 are valid on hardware. |
| 2030 | +### |
| 2031 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2032 | +### | L1 | L3 | Notes | |
| 2033 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2034 | +### | .df | .df | default behavior on both L1 and L3 (L3 uses MOCS settings) | |
| 2035 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2036 | +### | .uc | .uc | uncached (bypass) both L1 and L3 | |
| 2037 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2038 | +### | .st | .uc | streaming L1 / bypass L3 | |
| 2039 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2040 | +### | .uc | .ca | bypass L1 / cache in L3 | |
| 2041 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2042 | +### | .ca | .uc | cache in L1 / bypass L3 | |
| 2043 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2044 | +### | .ca | .ca | cache in both L1 and L3 | |
| 2045 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2046 | +### | .st | .ca | streaming L1 / cache in L3 | |
| 2047 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2048 | +### | .ri | .ca | read-invalidate (e.g. last-use) on L1 loads / cache in L3 | |
| 2049 | +### +---------+-----+-----------------------------------------------------------------------+ |
| 2050 | +### |
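The cache mapping and the table of valid combinations above could be checked with a small helper like the following. This is a sketch under assumptions: `CACHE_NAMES`, `VALID_LOAD_CACHE_PAIRS` and `is_valid_cache_pair` are hypothetical names that are not part of the intrinsic definitions, and the set simply transcribes the table rows.

```python
# Cache-control encodings, transcribed from the mapping in the doc comment.
CACHE_NAMES = {0: ".df", 1: ".uc", 2: ".ca", 3: ".wb", 4: ".wt", 5: ".st", 6: ".ri"}

# (L1, L3) pairs, transcribed from the table in the doc comment.
VALID_LOAD_CACHE_PAIRS = {
    (".df", ".df"), (".uc", ".uc"), (".st", ".uc"), (".uc", ".ca"),
    (".ca", ".uc"), (".ca", ".ca"), (".st", ".ca"), (".ri", ".ca"),
}


def is_valid_cache_pair(l1, l3):
    """Return True if the encoded (arg2, arg3) values name a pair listed in the table."""
    try:
        pair = (CACHE_NAMES[l1], CACHE_NAMES[l3])
    except KeyError:
        # Unknown encoding: not a listed combination.
        return False
    return pair in VALID_LOAD_CACHE_PAIRS
```

For example, `is_valid_cache_pair(6, 2)` accepts read-invalidate on L1 with caching in L3, while `is_valid_cache_pair(3, 3)` rejects writeback on both levels because the table has no such row.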
| 2051 | +### Immediate offset: the compiler may be able to fuse this add into the message; otherwise,
| 2052 | +### additional instructions are generated to honor the semantics.
| 2053 | +### This is the merging variant of the predicated loads: disabled lanes of the destination
| 2054 | +### take their values from the additional input (arg12).
| 2055 | +### |
| 2056 | +### Datum size mapping is:
| 2057 | +### |
| 2058 | +### - 1 = :u8 |
| 2059 | +### - 2 = :u16 |
| 2060 | +### - 3 = :u32 |
| 2061 | +### - 4 = :u64 |
| 2062 | +### - 5 = :u8u32 (load 8b, zero extend to 32b; store the opposite), |
| 2063 | +### - 6 = :u16u32 (load 16b, zero extend to 32b; store the opposite),
| 2064 | +### - 7 = :u16u32h (load 16b into high 16 of each 32b; store the high 16) |
| 2065 | +### |
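The datum size encoding for arg6 can be written as a lookup table. The sketch below is illustrative only: `DATUM_SIZES` and `decode_datum_size` are hypothetical names, and the 16-bit memory width for `:u16u32` is inferred from the mnemonic.

```python
# arg6 encoding -> (mnemonic, bits read from memory per element)
DATUM_SIZES = {
    1: (":u8", 8),
    2: (":u16", 16),
    3: (":u32", 32),
    4: (":u64", 64),
    5: (":u8u32", 8),     # zero-extended to 32 bits in the register
    6: (":u16u32", 16),   # zero-extended to 32 bits (width inferred from the name)
    7: (":u16u32h", 16),  # placed in the high 16 bits of each 32-bit element
}


def decode_datum_size(code):
    """Map an arg6 value to its mnemonic and memory width; reject unknown encodings."""
    if code not in DATUM_SIZES:
        raise ValueError(f"unknown datum size encoding: {code}")
    return DATUM_SIZES[code]
```

Encoding 0 is deliberately absent because the list in the doc comment starts at 1.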
| 2066 | + "lsc_load_merge_slm" : { "result" : "anyvector", |
| 2067 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2068 | + "attributes" : "ReadMem" |
| 2069 | + }, |
| 2070 | + "lsc_load_merge_stateless" : { "result" : "anyvector", |
| 2071 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2072 | + "attributes" : "ReadMem" |
| 2073 | + }, |
| 2074 | + "lsc_load_merge_bindless" : { "result" : "anyvector", |
| 2075 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2076 | + "attributes" : "ReadMem" |
| 2077 | + }, |
| 2078 | + "lsc_load_merge_bti" : { "result" : "anyvector", |
| 2079 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2080 | + "attributes" : "ReadMem" |
| 2081 | + }, |
| 2082 | + "lsc_load_merge_quad_slm" : { "result" : "anyvector", |
| 2083 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2084 | + "attributes" : "ReadMem" |
| 2085 | + }, |
| 2086 | + "lsc_load_merge_quad_stateless" : { "result" : "anyvector", |
| 2087 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2088 | + "attributes" : "ReadMem" |
| 2089 | + }, |
| 2090 | + "lsc_load_merge_quad_bindless" : { "result" : "anyvector", |
| 2091 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2092 | + "attributes" : "ReadMem" |
| 2093 | + }, |
| 2094 | + "lsc_load_merge_quad_bti" : { "result" : "anyvector", |
| 2095 | + "arguments" : ["any","char","char","char","short","int","char","char","char","char","any","int"], |
| 2096 | + "attributes" : "ReadMem" |
| 2097 | + }, |
| 2098 | + |
1993 | 2099 | ### ``llvm.genx.lsc.store.*.<any type>.<any type>.<any vector>`` : lsc_store instructions |
1994 | 2100 | ### ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
1995 | 2101 | ### |
|