OpenBLAS Extensions
Martin Kroeker edited this page Apr 7, 2022 · 11 revisions
- BLAS-like extensions
| Routine | Data Types | Description |
|---|---|---|
| ?gemm3m | c,z | complex matrix-matrix multiply (GEMM) using the 3M algorithm |
| ?imatcopy | s,d,c,z | in-place transposition/copying |
| ?omatcopy | s,d,c,z | out-of-place transposition/copying |
| ?geadd | s,d,c,z | matrix add |
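
To make the table entries more concrete, here is a minimal plain-C sketch of what the single-precision `omatcopy` (transposing case) and `geadd` routines are understood to compute, namely `B = alpha * A^T` and `C = alpha * A + beta * C`. The helpers `ref_omatcopy_trans` and `ref_geadd` are hypothetical reference loops written for illustration, assuming row-major storage; they are not the OpenBLAS kernels and do not reproduce the library's argument lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop (not the OpenBLAS kernel):
 * B = alpha * A^T for a row-major rows x cols matrix A.
 * B is cols x rows; lda and ldb are the row strides of A and B. */
void ref_omatcopy_trans(size_t rows, size_t cols, float alpha,
                        const float *a, size_t lda,
                        float *b, size_t ldb)
{
    assert(lda >= cols && ldb >= rows);
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            b[j * ldb + i] = alpha * a[i * lda + j];
}

/* Hypothetical reference loop for the matrix add:
 * C = alpha * A + beta * C, both rows x cols and row-major. */
void ref_geadd(size_t rows, size_t cols, float alpha,
               const float *a, size_t lda,
               float beta, float *c, size_t ldc)
{
    assert(lda >= cols && ldc >= cols);
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            c[i * ldc + j] = alpha * a[i * lda + j] + beta * c[i * ldc + j];
}
```

The in-place variant `?imatcopy` performs the same scaling and optional transposition but overwrites its input instead of writing to a second buffer.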
- BLAS-like and conversion functions for bfloat16 (available when OpenBLAS was compiled with `BUILD_BFLOAT16=1`)
  - `void cblas_sbstobf16` converts a float array to an array of bfloat16 values by rounding
  - `void cblas_sbdtobf16` converts a double array to an array of bfloat16 values by rounding
  - `void cblas_sbf16tos` converts a bfloat16 array to an array of floats
  - `void cblas_dbf16tod` converts a bfloat16 array to an array of doubles
  - `float cblas_sbdot` computes the dot product of two bfloat16 arrays
  - `void cblas_sbgemv` performs the matrix-vector operations of GEMV with the input matrix and X vector as bfloat16
  - `void cblas_sbgemm` performs the matrix-matrix operations of GEMM with both input arrays containing bfloat16
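
The conversion routines above map between 32-bit floats and the 16-bit bfloat16 format, which keeps the sign, the 8 exponent bits, and the top 7 mantissa bits of a float. The following is a rough sketch of such a conversion for a single value. It assumes round-to-nearest with ties to even, which this page does not confirm as the rounding mode OpenBLAS uses; `bf16_t`, `float_to_bf16`, and `bf16_to_float` are hypothetical names and not part of the OpenBLAS API.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical alias for illustration; OpenBLAS declares its own bfloat16 type. */
typedef uint16_t bf16_t;

/* Convert one float to bfloat16, rounding to nearest with ties to even
 * (assumed rounding mode, see the note above). */
bf16_t float_to_bf16(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret without aliasing issues */
    if ((bits & 0x7fffffffu) > 0x7f800000u)    /* NaN: force a mantissa bit so it stays NaN */
        return (bf16_t)((bits >> 16) | 0x0040u);
    bits += 0x7fffu + ((bits >> 16) & 1u);     /* rounding bias; ties go to the even result */
    return (bf16_t)(bits >> 16);
}

/* Widen one bfloat16 back to float; this direction is exact. */
float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because the widening direction only appends zero bits, a value that is exactly representable in bfloat16 survives a round trip unchanged, while other floats lose their low 16 mantissa bits to rounding.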
- Utility functions
  - `openblas_get_num_threads`
  - `openblas_set_num_threads`
  - `int openblas_get_num_procs(void)` returns the number of processors available on the system (may include "hyperthreading cores")
  - `int openblas_get_parallel(void)` returns 0 for sequential use, 1 for platform-based threading, and 2 for OpenMP-based threading
  - `char * openblas_get_config()` returns the options OpenBLAS was built with, something like `NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Haswell`
  - `int openblas_set_affinity(int thread_index, size_t cpusetsize, cpu_set_t *cpuset)` sets the CPU affinity mask of the given thread to the provided cpuset (only available under Linux, with semantics identical to `pthread_setaffinity_np`)
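
The values returned by `openblas_get_parallel` and `openblas_get_config` are easier to use once decoded. The sketch below shows one possible way to do that in plain C. Both helpers, `parallel_mode_name` and `config_has_option`, are hypothetical and not part of OpenBLAS, and the parser assumes the configuration string is a list of space-separated tokens, as in the example string above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: label the value returned by openblas_get_parallel(). */
const char *parallel_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "sequential";
    case 1:  return "platform threads";
    case 2:  return "OpenMP";
    default: return "unknown";
    }
}

/* Hypothetical helper: return 1 if `option` occurs as a whole
 * space-separated token in `config`, 0 otherwise. */
int config_has_option(const char *config, const char *option)
{
    size_t len = strlen(option);
    if (len == 0)
        return 0;
    const char *p = config;
    while (*p != '\0') {
        while (*p == ' ')                      /* skip separators */
            ++p;
        const char *end = p;
        while (*end != '\0' && *end != ' ')    /* find the end of this token */
            ++end;
        if ((size_t)(end - p) == len && strncmp(p, option, len) == 0)
            return 1;
        p = end;
    }
    return 0;
}
```

In a program linked against OpenBLAS, these would typically be called as `parallel_mode_name(openblas_get_parallel())` and `config_has_option(openblas_get_config(), "DYNAMIC_ARCH")`. Matching whole tokens avoids false positives such as `LAPACKE` matching inside `NO_LAPACKE`.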