| Note that NumPy, CuPy, and PyTorch are all involved in defining a shared subset of their APIs:
https://data-apis.org/array-api/ So it's possible to write array API code that consumes arrays from any of those libraries and delegates computation to them without having to explicitly import any of them in your source code. The only limitation for now is that PyTorch's array API compliance (and, to a lesser extent, CuPy's as well) is still incomplete, and in practice one needs to go through this compatibility layer (hopefully only temporarily): |
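To illustrate the idea, here is a minimal sketch of library-agnostic array API code. The `get_namespace` helper is a hypothetical stand-in for what the real compatibility layer does (the actual package covers NumPy, CuPy, and PyTorch, including older versions):

```python
import numpy as np


def get_namespace(x):
    # Hypothetical stand-in for the compatibility layer: ask the array
    # which array API namespace it belongs to, delegating computation
    # to NumPy, CuPy, or PyTorch without importing them directly.
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np  # fallback for NumPy arrays on NumPy versions < 2.0


def standardize(x):
    # No direct import of the producing library is needed here.
    xp = get_namespace(x)
    return (x - xp.mean(x)) / xp.std(x)


z = standardize(np.array([1.0, 2.0, 3.0, 4.0]))
```

The same `standardize` function would accept a CuPy or PyTorch array unchanged, running the computation on whatever device that array lives on.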
| As nice as it is to have a drop-in replacement, most of the cost of GPU computing is moving memory around. I wouldn't be surprised if this catches unsuspecting programmers in a few performance traps. |
| I'm surprised to see PyTorch and JAX mentioned as alternatives, but not numba: https://github.com/numba/numba
I've recently had to implement a few kernels to lower the memory footprint and runtime of some PyTorch functions. It's been really nice, because numba kernels have type-hint support (as opposed to raw CuPy kernels). |
| Interesting. Any links to examples or docs on how to use PyTorch as a general linear algebra library for this purpose? Like a “SciPy to PyTorch” transition guide if I want to do the same? |
| Personally, I prefer CuPy over your library. For example, your vectorAdd.cu implementation at https://github.com/eyalroz/cuda-api-wrappers/blob/master/exa... is much longer than a similar CuPy implementation:
It could be made even shorter with a cp.ElementwiseKernel: https://docs.cupy.dev/en/stable/user_guide/kernel.html#basic... Although I have to concede that the automatic grid-size computation in cuda-api-wrappers is nice.

A few marketing tips for your README:

* Put a code example directly at the top. You want to present the selling points of your library to the reader as fast as possible. For reference, look at the CuPy README https://github.com/cupy/cupy?tab=readme-ov-file#cupy--numpy-... which immediately shows the reader what it is good for. Your README starts with lots of text, but nobody reads text anymore these days. A link to examples is almost at the end, and then the examples are deeply nested.
* The first links in the README should point to your own library, for example to documentation or examples. You do not want to lead the reader away from your GitHub page.
* Add syntax highlighting with "cpp" after triple backticks: |
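To illustrate that last tip, a fenced block tagged with the language name gets highlighted on GitHub (the kernel body here is just a generic vector-add sketch for illustration, not the library's actual example):

````markdown
```cpp
__global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}
```
````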
| I taught my numpy class to a client who wanted to use GPUs. Installation (at the time) was a chore, but afterwards using this library was really smooth. Big gains with minimal to no code changes. |
| Are you able to share what functions or situations result in speedups? In my experience, vectorized numpy is already fast, so I'm very curious. |
| Not OP, but think about stuff like FFTs or matmuls. It's not even a competition: GPUs win when the algorithm is somewhat suitable and you're dealing with FP32 or lower precision. |
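The drop-in nature can be sketched by parameterizing over the array module; passing `cupy` instead of `numpy` is an assumption about the caller's environment (it requires a CUDA-capable GPU), so only the NumPy path is shown running here:

```python
import numpy as np


def heavy_math(xp, n=64):
    # Identical code runs on the CPU (xp=numpy) or the GPU (xp=cupy);
    # FFTs and matmuls are exactly the workloads where GPUs pull ahead.
    a = xp.linspace(0.0, 1.0, n * n).reshape(n, n)
    spectrum = xp.fft.fft2(a)   # 2-D FFT
    gram = a @ a.T              # matrix multiply
    return spectrum, gram


spectrum, gram = heavy_math(np)  # heavy_math(cupy) would take the GPU path
```

With CuPy, the result arrays would live in GPU memory, so a final transfer back to the host (e.g. via `cupy.asnumpy`) is where the memory-movement cost mentioned above shows up.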
| > Why not Jax?
- JAX Windows support is lacking
- CuPy is much closer to CUDA than JAX, so you can get better performance
- CuPy is generally more mature than JAX (fewer bugs)
- CuPy is more flexible thanks to cp.RawKernel
- (For those familiar with NumPy) CuPy is closer to NumPy than jax.numpy

But CuPy does not support automatic gradient computation, so if you do deep learning, use JAX instead. Or PyTorch, if you do not trust Google to maintain a project for a prolonged period of time: https://killedbygoogle.com/ |
| IIRC jax's `scipy.sparse.linalg.bicgstab` does support multiple right-hand sides.
EDIT: Or rather, all the solvers under jax's `scipy.sparse.linalg` support multiple right-hand sides. |
| Is anyone aware of a pandas-like library that is based on something like CuPy instead of NumPy? It would be great to have the ease of use of pandas with the parallelism unlocked by the GPU. |
| We are fans! We mostly use cudf/cuml/cugraph (GPU dataframes etc.) in the pygraphistry ecosystem, and when things get a bit tricky, cupy is one of the main escape hatches. |
However, the AMD-GPU compatibility for CuPy is quite an attractive feature.