Feat: Adding Linear Algebra Dot operation support #116

Open: wants to merge 48 commits into base: main
Changes from 46 commits

Commits
bc333df
interfacing with qblas
SwayamInSync Jul 9, 2025
babaa96
adding test cases
SwayamInSync Jul 9, 2025
c3aaa05
test-1: ci
SwayamInSync Jul 9, 2025
ca7dd6d
fixing ci
SwayamInSync Jul 9, 2025
04314e3
fixing ci
SwayamInSync Jul 9, 2025
037021a
fixing ci
SwayamInSync Jul 9, 2025
d6fc9c6
fixing linux CI
SwayamInSync Jul 10, 2025
f99f565
fixing linux CI
SwayamInSync Jul 10, 2025
fb3579c
fixing linux CI
SwayamInSync Jul 10, 2025
03e9acd
fixing linux CI
SwayamInSync Jul 10, 2025
1ed7bab
fixing linux CI
SwayamInSync Jul 10, 2025
63a355e
fixing linux CI
SwayamInSync Jul 10, 2025
88a98d1
fixing linux CI
SwayamInSync Jul 10, 2025
764fc72
updating qblas:
SwayamInSync Jul 10, 2025
1669e5f
fixing macos CI
SwayamInSync Jul 10, 2025
b35bac3
bumping macos deployment target
SwayamInSync Jul 10, 2025
042b25a
bumping macos deployment target
SwayamInSync Jul 10, 2025
f78dd90
dynamic macos deployment target
SwayamInSync Jul 10, 2025
cd88de0
explicit init of res array in dot-mat-mat
SwayamInSync Jul 10, 2025
abf0224
fixing windows CI
SwayamInSync Jul 10, 2025
c5198d1
disabling qblas for windows; MSVC incompatibility
SwayamInSync Jul 10, 2025
c0d93f8
updating CI triggering paths
SwayamInSync Jul 11, 2025
838adee
updating CI triggering paths
SwayamInSync Jul 11, 2025
433aa90
reverting branch to main
SwayamInSync Jul 11, 2025
5836505
bumping qblas
SwayamInSync Jul 11, 2025
33b48fe
umath refactor
SwayamInSync Jul 16, 2025
a35abce
updaing ci
SwayamInSync Jul 16, 2025
8516544
switching to apt
SwayamInSync Jul 16, 2025
335f425
submodule fix
SwayamInSync Jul 16, 2025
85e7840
submodule fix
SwayamInSync Jul 16, 2025
e467f4b
submodule fix
SwayamInSync Jul 16, 2025
e201b90
initial matmul ufunc setup
SwayamInSync Jul 17, 2025
09918a3
mid-way test
SwayamInSync Jul 17, 2025
70ca644
shifting to matmul ufunc
SwayamInSync Jul 17, 2025
f89c2e6
will figure out later
SwayamInSync Jul 17, 2025
894a84d
matmul registered with naive
SwayamInSync Jul 19, 2025
6800a90
adding initial qblas support to matmul ufunc, something is breaking, nan
SwayamInSync Jul 19, 2025
742ce64
matmul ufunc completed, naive plugged, qblas experimental
SwayamInSync Jul 19, 2025
d993bc9
adding release tracker to keep record for tasks, v1.0.0
SwayamInSync Jul 19, 2025
c518a29
it should be failing but passes on x86-64
SwayamInSync Jul 19, 2025
bbce2ac
ahh stupid me :), fallback to naive for MSVC
SwayamInSync Jul 19, 2025
5e5fa65
switching to internal function use only
SwayamInSync Jul 19, 2025
cec5ace
this should fix them all
SwayamInSync Jul 19, 2025
1fe6c81
wrapping up
SwayamInSync Jul 19, 2025
8f16b99
updated branch to main
SwayamInSync Jul 19, 2025
238ef89
Merge pull request #2 from SwayamInSync/matmul-ufunc
SwayamInSync Jul 19, 2025
ed47e33
added test coverage in release_tracker.md
SwayamInSync Jul 20, 2025
573eb76
more edge tests
SwayamInSync Jul 20, 2025
126 changes: 63 additions & 63 deletions .github/workflows/build_wheels.yml
Original file line number Diff line number Diff line change
@@ -8,9 +8,11 @@ on:
- "quaddtype-v*"
paths:
- "quaddtype/**"
- ".github/workflows/**"
pull_request:
paths:
- "quaddtype/**"
- ".github/workflows/**"
workflow_dispatch:

jobs:
@@ -19,12 +21,19 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
with:
submodules: recursive

- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: ">=3.10.0"

- name: Verify QuadBLAS submodule
run: |
ls -la quaddtype/numpy_quaddtype/QBLAS/
ls -la quaddtype/numpy_quaddtype/QBLAS/include/quadblas/

- name: Install cibuildwheel
run: pip install cibuildwheel==2.20.0

@@ -34,16 +43,23 @@ jobs:
CIBW_MANYLINUX_X86_64_IMAGE: manylinux_2_28
CIBW_BUILD_VERBOSITY: "3"
CIBW_BEFORE_ALL: |
yum update -y
yum install -y cmake gcc gcc-c++ make git pkgconfig
# Install SLEEF in container
git clone --branch 3.8 https://github.com/shibatch/sleef.git
cd sleef
cmake -S . -B build -DSLEEF_BUILD_QUAD:BOOL=ON -DSLEEF_BUILD_SHARED_LIBS:BOOL=ON -DCMAKE_POSITION_INDEPENDENT_CODE=ON
cmake -S . -B build \
-DSLEEF_BUILD_QUAD:BOOL=ON \
-DSLEEF_BUILD_SHARED_LIBS:BOOL=ON \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON
cmake --build build/ --clean-first -j
cmake --install build --prefix /usr/local
CIBW_ENVIRONMENT: >
CFLAGS="-I/usr/local/include $CFLAGS"
CXXFLAGS="-I/usr/local/include $CXXFLAGS"
LDFLAGS="-L/usr/local/lib64 $LDFLAGS"
LD_LIBRARY_PATH="/usr/local/lib64:$LD_LIBRARY_PATH"
CFLAGS="-I/usr/local/include -I{project}/numpy_quaddtype/QBLAS/include $CFLAGS"
CXXFLAGS="-I/usr/local/include -I{project}/numpy_quaddtype/QBLAS/include -fext-numeric-literals $CXXFLAGS"
LDFLAGS="-L/usr/local/lib64 -L/usr/local/lib -Wl,-rpath,/usr/local/lib64 -Wl,-rpath,/usr/local/lib -fopenmp $LDFLAGS"
LD_LIBRARY_PATH="/usr/local/lib64:/usr/local/lib:$LD_LIBRARY_PATH"
PKG_CONFIG_PATH="/usr/local/lib64/pkgconfig:/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH"
CIBW_REPAIR_WHEEL_COMMAND: |
auditwheel repair -w {dest_dir} --plat manylinux_2_28_x86_64 {wheel}
CIBW_TEST_COMMAND: |
@@ -68,46 +84,59 @@ jobs:

steps:
- uses: actions/checkout@v3
with:
submodules: recursive

- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.10"

- name: Install dependencies
run: |
brew install cmake libomp git

- name: Install SLEEF
env:
MACOSX_DEPLOYMENT_TARGET: "11.0"
MACOSX_DEPLOYMENT_TARGET: ${{ matrix.os == 'macos-13' && '13.0' || '14.0' }}
run: |
git clone --branch 3.8 https://github.com/shibatch/sleef.git
cd sleef
cmake -S . -B build \
-DSLEEF_BUILD_QUAD:BOOL=ON \
-DSLEEF_BUILD_SHARED_LIBS:BOOL=ON \
-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-DCMAKE_OSX_DEPLOYMENT_TARGET=11.0 \
-DCMAKE_OSX_DEPLOYMENT_TARGET=${{ matrix.os == 'macos-13' && '13.0' || '14.0' }} \
-DCMAKE_INSTALL_RPATH="@loader_path/../lib" \
-DCMAKE_BUILD_WITH_INSTALL_RPATH=ON
cmake --build build/ --clean-first -j
sudo cmake --install build --prefix /usr/local

- name: Verify QuadBLAS submodule
run: |
ls -la quaddtype/numpy_quaddtype/QBLAS/
ls -la quaddtype/numpy_quaddtype/QBLAS/include/quadblas/

- name: Install cibuildwheel
run: pip install cibuildwheel==2.20.0

- name: Build wheels
env:
CIBW_BUILD: "cp310-* cp311-* cp312-*"
CIBW_ARCHS_MACOS: ${{ matrix.os == 'macos-13' && 'x86_64' || 'arm64' }}
CIBW_BUILD_VERBOSITY: "1"
CIBW_BUILD_VERBOSITY: "3"
CIBW_ENVIRONMENT: >
MACOSX_DEPLOYMENT_TARGET="11.0"
MACOSX_DEPLOYMENT_TARGET="${{ matrix.os == 'macos-13' && '13.0' || '14.0' }}"
DYLD_LIBRARY_PATH="/usr/local/lib:$DYLD_LIBRARY_PATH"
CFLAGS="-I/usr/local/include $CFLAGS"
CXXFLAGS="-I/usr/local/include $CXXFLAGS"
CFLAGS="-I/usr/local/include -I{project}/numpy_quaddtype/QBLAS/include $CFLAGS"
CXXFLAGS="-I/usr/local/include -I{project}/numpy_quaddtype/QBLAS/include $CXXFLAGS"
LDFLAGS="-L/usr/local/lib $LDFLAGS"
PKG_CONFIG_PATH="/usr/local/lib/pkgconfig:$PKG_CONFIG_PATH"
CIBW_REPAIR_WHEEL_COMMAND: >
delocate-wheel --require-archs {delocate_archs} -w {dest_dir} -v {wheel}
CIBW_TEST_COMMAND: |
pip install {package}[test]
pytest {project}/tests
pytest -s {project}/tests
CIBW_TEST_EXTRAS: "test"
run: |
python -m cibuildwheel --output-dir wheelhouse
@@ -118,6 +147,7 @@ jobs:
path: ./quaddtype/wheelhouse/*.whl
name: wheels-${{ matrix.os }}

# disabling QBLAS optimization for windows due to incompatibility with MSVC
build_wheels_windows:
name: Build wheels on Windows
runs-on: windows-latest
@@ -127,6 +157,8 @@ jobs:

steps:
- uses: actions/checkout@v3
with:
submodules: recursive

- name: Setup MSVC
uses: ilammy/msvc-dev-cmd@v1
@@ -142,6 +174,12 @@ jobs:
- name: Install CMake
uses: lukka/get-cmake@latest

- name: Verify QuadBLAS submodule
shell: pwsh
run: |
Get-ChildItem quaddtype/numpy_quaddtype/QBLAS/
Get-ChildItem quaddtype/numpy_quaddtype/QBLAS/include/quadblas/

- name: Clone and Build SLEEF
shell: pwsh
run: |
@@ -151,16 +189,6 @@ jobs:
cmake --build build --config Release
cmake --install build --prefix "C:/sleef" --config Release

- name: Setup build environment
shell: pwsh
run: |
$env:INCLUDE += ";C:\sleef\include"
$env:LIB += ";C:\sleef\lib"
$env:PATH = "C:\sleef\bin;$env:PATH"
echo "INCLUDE=$env:INCLUDE" >> $env:GITHUB_ENV
echo "LIB=$env:LIB" >> $env:GITHUB_ENV
echo "PATH=$env:PATH" >> $env:GITHUB_ENV

- name: Install build dependencies
shell: bash -l {0}
run: |
@@ -177,10 +205,17 @@ jobs:
MSSdk: "1"
CIBW_BEFORE_BUILD: |
pip install meson meson-python ninja numpy
CIBW_ENVIRONMENT: >
INCLUDE="C:/sleef/include;{project}/numpy_quaddtype/QBLAS/include;$INCLUDE"
LIB="C:/sleef/lib;$LIB"
PATH="C:/sleef/bin;$PATH"
CFLAGS="/IC:/sleef/include /I{project}/numpy_quaddtype/QBLAS/include /DDISABLE_QUADBLAS $CFLAGS"
CXXFLAGS="/IC:/sleef/include /I{project}/numpy_quaddtype/QBLAS/include /DDISABLE_QUADBLAS $CXXFLAGS"
LDFLAGS="C:/sleef/lib/sleef.lib C:/sleef/lib/sleefquad.lib $LDFLAGS"
CIBW_REPAIR_WHEEL_COMMAND: 'delvewheel repair -w {dest_dir} {wheel} --add-path C:\sleef\bin'
CIBW_TEST_COMMAND: |
pip install {package}[test]
python -m pytest -v {project}/test
pytest -s {project}/tests
CIBW_TEST_EXTRAS: test
CIBW_TEST_FAIL_FAST: 1
shell: pwsh
@@ -199,56 +234,21 @@ jobs:
needs: [build_wheels_linux, build_wheels_macos, build_wheels_windows]
runs-on: ubuntu-latest
if: startsWith(github.ref, 'refs/tags/quaddtype-v')

environment:
name: quadtype_release
url: https://pypi.org/p/numpy-quaddtype

permissions:
id-token: write # IMPORTANT: mandatory for trusted publishing
id-token: write # IMPORTANT: mandatory for trusted publishing

steps:
- name: Download all workflow run artifacts
uses: actions/download-artifact@v4
with:
path: dist

- name: Publish to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: dist/*

# With the current setup, we are not creating a release on GitHub.
# create_release:
# name: Create Release
# needs: [build_wheels_linux, build_wheels_macos, build_wheels_windows]
# runs-on: ubuntu-latest
# if: startsWith(github.ref, 'refs/tags/quaddtype-v')

# steps:
# - name: Checkout code
# uses: actions/checkout@v2

# - name: Download all workflow run artifacts
# uses: actions/download-artifact@v4
# with:
# path: artifacts

# - name: Create Release
# id: create_release
# uses: actions/create-release@v1
# env:
# GITHUB_TOKEN: ${{ secrets.QUADDTYPE_GITHUB_TOKEN }}
# with:
# tag_name: ${{ github.ref }}
# release_name: Release ${{ github.ref }}
# draft: false
# prerelease: false

# - name: Upload Release Assets
# uses: softprops/action-gh-release@v1
# if: startsWith(github.ref, 'refs/tags/')
# with:
# files: ./artifacts/**/*.whl
# env:
# GITHUB_TOKEN: ${{ secrets.QUADDTYPE_GITHUB_TOKEN }}
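
The macOS job above selects its deployment target and wheel architecture with expressions of the form `matrix.os == 'macos-13' && '13.0' || '14.0'`. This `cond && a || b` chain is the usual way to get ternary-like behaviour in GitHub Actions expressions, and it evaluates much like Python's `and`/`or`, including the caveat that a falsy middle value would fall through to the last operand. A minimal Python sketch of the same selection follows; the function names and the `macos-14` runner label are assumptions for illustration and do not come from the workflow.

```python
def deployment_target(runner_os: str) -> str:
    """Mirror `matrix.os == 'macos-13' && '13.0' || '14.0'` from the workflow."""
    # `and`/`or` short-circuit like `&&`/`||`: a true condition yields "13.0",
    # anything else falls through to "14.0".
    return (runner_os == "macos-13" and "13.0") or "14.0"


def wheel_arch(runner_os: str) -> str:
    """Mirror the CIBW_ARCHS_MACOS expression, which uses the same idiom."""
    return (runner_os == "macos-13" and "x86_64") or "arm64"


if __name__ == "__main__":
    # "macos-14" is a hypothetical second matrix entry, used only as an example.
    for label in ("macos-13", "macos-14"):
        print(label, deployment_target(label), wheel_arch(label))
```

Because both middle operands are non-empty strings, the fall-through caveat does not bite here; it would matter only if the "true" branch could be an empty string or `false`.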
24 changes: 22 additions & 2 deletions .github/workflows/ci.yml
@@ -59,17 +59,37 @@ jobs:
run: |
sudo apt-get update
sudo apt-get install -y libmpfr-dev libssl-dev libfftw3-dev

- name: Install SLEEF
run: |
sudo apt-get update -y
sudo apt-get install -y cmake gcc g++ make git pkg-config
git clone --branch 3.8 https://github.com/shibatch/sleef.git
cd sleef
cmake -S . -B build -DSLEEF_BUILD_QUAD:BOOL=ON -DSLEEF_BUILD_SHARED_LIBS:BOOL=ON -DCMAKE_POSITION_INDEPENDENT_CODE=ON
cmake --build build/ --clean-first -j
sudo cmake --install build --prefix /usr
sudo cmake --install build --prefix /usr/local

- name: Install quaddtype
working-directory: quaddtype
run: |
LDFLAGS="-Wl,-rpath,/usr/lib" python -m pip install . -v --no-build-isolation -Cbuilddir=build -C'compile-args=-v' -Csetup-args="-Dbuildtype=debug"
# Initialize submodules first
git submodule update --init --recursive
ls -la numpy_quaddtype/QBLAS/

# Set environment variables with proper export and correct paths
export CFLAGS="-I/usr/local/include -I$(pwd)/numpy_quaddtype/QBLAS/include"
export CXXFLAGS="-I/usr/local/include -I$(pwd)/numpy_quaddtype/QBLAS/include -fext-numeric-literals"
export LDFLAGS="-L/usr/local/lib64 -L/usr/local/lib -Wl,-rpath,/usr/local/lib64 -Wl,-rpath,/usr/local/lib -fopenmp"
export LD_LIBRARY_PATH="/usr/local/lib64:/usr/local/lib:$LD_LIBRARY_PATH"

# Install with meson args to ensure the C++ flags are passed through
python -m pip install . -v --no-build-isolation \
-Cbuilddir=build \
-C'compile-args=-v' \
-Csetup-args="-Dbuildtype=debug" \
-Csetup-args="-Dcpp_args=-fext-numeric-literals"

- name: Run quaddtype tests
working-directory: quaddtype
run: |
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "quaddtype/numpy_quaddtype/QBLAS"]
path = quaddtype/numpy_quaddtype/QBLAS
url = https://github.com/SwayamInSync/QBLAS
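
Since QBLAS now arrives as a submodule, a checkout made without `--recursive` leaves the submodule directory empty and the build fails later on a missing header. The workflows guard against this with a "Verify QuadBLAS submodule" step that lists two directories. A rough Python equivalent of that check is sketched below; the function name and the error-reporting style are assumptions, and only the two paths are taken from the workflow.

```python
from pathlib import Path

# Directories the workflows list with `ls -la` / `Get-ChildItem`,
# relative to the repository root.
REQUIRED_DIRS = (
    "quaddtype/numpy_quaddtype/QBLAS",
    "quaddtype/numpy_quaddtype/QBLAS/include/quadblas",
)


def missing_submodule_dirs(repo_root):
    """Return the required directories that are absent or empty."""
    missing = []
    for rel in REQUIRED_DIRS:
        path = Path(repo_root) / rel
        # An uninitialised submodule leaves an empty directory behind,
        # so checking for existence alone is not enough.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing
```

Calling `missing_submodule_dirs(".")` from the repository root and getting a non-empty list suggests running `git submodule update --init --recursive` before building.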
13 changes: 10 additions & 3 deletions quaddtype/README.md
@@ -27,7 +27,7 @@ np.array([1,2,3], dtype=QuadPrecDType("longdouble"))

The code needs the quad precision pieces of the sleef library, which
is not available on most systems by default, so we have to generate
that first. The below assumes one has the required pieces to build
that first. The below assumes one has the required pieces to build
sleef (cmake and libmpfr-dev), and that one is in the package
directory locally.

@@ -40,6 +40,7 @@ cd ..
```

Building the `numpy-quaddtype` package from locally installed sleef:

```bash
export SLEEF_DIR=$PWD/sleef/build
export LIBRARY_PATH=$SLEEF_DIR/lib
@@ -53,11 +54,17 @@ source temp/bin/activate
# Install the package
pip install meson-python numpy pytest

export LDFLAGS="-Wl,-rpath,$SLEEF_DIR/lib"
export LDFLAGS="-Wl,-rpath,$SLEEF_DIR/lib -fopenmp -latomic -lpthread"
[Review comment] Member: Does this work on Windows?
[Review comment] Collaborator Author: No, I can add a section for windows build too in README
[Review comment] Member: (if not, maybe add a comment with a different suggestion)

export CFLAGS="-fPIC"
export CXXFLAGS="-fPIC"

# To build without QBLAS (default for MSVC)
# export CFLAGS="-fPIC -DDISABLE_QUADBLAS"
# export CXXFLAGS="-fPIC -DDISABLE_QUADBLAS"

python -m pip install . -v --no-build-isolation -Cbuilddir=build -C'compile-args=-v'

# Run the tests
cd ..
python -m pytest
```
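
On the review question about Windows: the README flags above are GCC/Clang flags, while the Windows job in `build_wheels.yml` passes `/DDISABLE_QUADBLAS` to MSVC instead. The sketch below collects those two cases in one place. It is only an illustration: the function name and the default `sleef_dir` value are hypothetical, and it deliberately omits the include paths and SLEEF library arguments that the Windows workflow also sets.

```python
import sys


def quaddtype_build_env(platform=sys.platform, sleef_dir="sleef/build"):
    """Return environment variables to export before `pip install .`."""
    if platform.startswith("win"):
        # MSVC build: QBLAS is disabled, so define DISABLE_QUADBLAS
        # (the workflow uses the /D spelling of the define).
        return {
            "CFLAGS": "/DDISABLE_QUADBLAS",
            "CXXFLAGS": "/DDISABLE_QUADBLAS",
        }
    # GCC/Clang build: the flags from the README snippet above.
    return {
        "LDFLAGS": f"-Wl,-rpath,{sleef_dir}/lib -fopenmp -latomic -lpthread",
        "CFLAGS": "-fPIC",
        "CXXFLAGS": "-fPIC",
    }
```

A caller could merge the returned mapping into `os.environ` before invoking `python -m pip install .`, which keeps the platform split out of the shell snippet itself.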

27 changes: 23 additions & 4 deletions quaddtype/meson.build
@@ -23,9 +23,17 @@ incdir_numpy = run_command(py,
check : true
).stdout().strip()

# Add OpenMP dependency (optional, for threading)
openmp_dep = dependency('openmp', required: false)
dependencies = [sleef_dep, py_dep]
if openmp_dep.found()
dependencies += openmp_dep
endif

includes = include_directories(
[
incdir_numpy,
'numpy_quaddtype/QBLAS/include',
'numpy_quaddtype/src',
]
)
@@ -42,10 +50,21 @@ srcs = [
'numpy_quaddtype/src/scalar_ops.h',
'numpy_quaddtype/src/scalar_ops.cpp',
'numpy_quaddtype/src/ops.hpp',
'numpy_quaddtype/src/umath.h',
'numpy_quaddtype/src/umath.cpp',
'numpy_quaddtype/src/dragon4.h',
'numpy_quaddtype/src/dragon4.c'
'numpy_quaddtype/src/dragon4.c',
'numpy_quaddtype/src/quadblas_interface.h',
'numpy_quaddtype/src/quadblas_interface.cpp',
'numpy_quaddtype/src/umath/umath.h',
'numpy_quaddtype/src/umath/umath.cpp',
'numpy_quaddtype/src/umath/binary_ops.h',
'numpy_quaddtype/src/umath/binary_ops.cpp',
'numpy_quaddtype/src/umath/unary_ops.h',
'numpy_quaddtype/src/umath/unary_ops.cpp',
'numpy_quaddtype/src/umath/comparison_ops.h',
'numpy_quaddtype/src/umath/comparison_ops.cpp',
'numpy_quaddtype/src/umath/promoters.hpp',
'numpy_quaddtype/src/umath/matmul.h',
'numpy_quaddtype/src/umath/matmul.cpp',
]

py.install_sources(
@@ -60,7 +79,7 @@ py.extension_module('_quaddtype_main',
srcs,
link_args: is_windows ? ['/DEFAULTLIB:sleef', '/DEFAULTLIB:sleefquad'] : ['-lsleef', '-lsleefquad'],
link_language: 'cpp',
dependencies: [sleef_dep, py_dep],
dependencies: dependencies,
install: true,
subdir: 'numpy_quaddtype',
include_directories: includes
1 change: 1 addition & 0 deletions quaddtype/numpy_quaddtype/QBLAS
Submodule QBLAS added at 9468e2
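
Several commit messages above refer to a "naive" matmul path that is used when QBLAS is unavailable (for example under MSVC), and one mentions explicitly initialising the result array. The PR's C++ source is not part of this excerpt, so the following is only a reference sketch, in Python, of what a naive triple-loop matrix product computes; it is not the PR's implementation, and the function name is hypothetical.

```python
def naive_matmul(a, b):
    """Triple-loop product of two matrices given as nested lists (row-major)."""
    n, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions do not match")
    m = len(b[0])
    # Initialise every output element to zero before accumulating into it.
    out = [[0 for _ in range(m)] for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]
    return out
```

Such a reference is mainly useful for cross-checking an optimised path on small inputs, since both should agree up to rounding.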