Tags: pytorch/pytorch
[xpu][fix] Fix UT test_fuse_mix_order_reductions_combo_kernels (#170297) Fixes #170296 Pull Request resolved: #170297 Approved by: https://github.com/EikanWang, https://github.com/jansel
[dynamo][DebugMode] make ModTracker a no-op in compiled regions (#170124) ModTracker causes a graph break in compiled regions (#169995), so this makes it a no-op by introducing the `torch._dynamo.eval_frame._is_in_compiled_region()` check. The next PR in the stack makes nn.Module tracking work for DebugMode by introducing an interpreter. Pull Request resolved: #170124 Approved by: https://github.com/tugsbayasgalan
[ROCm] Enable group gemm on gfx90a (#169356) Fix a concurrency race condition in group gemm and enable group gemm support on the gfx90a architecture. Test command: PYTORCH_TEST_WITH_ROCM=1 pytest test/test_matmul_cuda.py -v -k "test_grouped_gemm_2d_2d or test_grouped_gemm_2d_3d or test_grouped_gemm_3d_3d or test_grouped_gemm_3d_2d" Pull Request resolved: #169356 Approved by: https://github.com/slayton58, https://github.com/jeffdaily Co-authored-by: Jeff Daily <[email protected]>
Fix: torch.view_as_complex() does not work on memory layout produced by torch.contiguous() after transpose (#169780) Fixes #150050 by ignoring the stride divisibility requirement for singleton dimensions (since stride is irrelevant for singleton dimensions). Pull Request resolved: #169780 Approved by: https://github.com/soulitzer
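The relaxed rule described above can be illustrated with a small pure-Python model of the layout check. This is only a sketch, not PyTorch's actual implementation: the function name `can_view_as_complex`, the `skip_singletons` flag, and the example shape and strides are all hypothetical, and the model ignores other conditions such as storage offset and dtype.

```python
def can_view_as_complex(sizes, strides, skip_singletons=True):
    """Simplified model of the layout check for viewing a real tensor as complex.

    Hypothetical helper for illustration only; it is not part of PyTorch.
    """
    # The last dimension holds the (real, imag) pair: size 2, stride 1.
    if not sizes or sizes[-1] != 2 or strides[-1] != 1:
        return False
    for size, stride in zip(sizes[:-1], strides[:-1]):
        # A dimension of size 1 is never stepped over, so its stride
        # cannot affect which elements are addressed.
        if skip_singletons and size == 1:
            continue
        # Every other dimension must step by a whole number of complex elements.
        if stride % 2 != 0:
            return False
    return True


# Illustrative layout: the singleton middle dimension carries an odd stride.
sizes, strides = (3, 1, 2), (2, 1, 1)
strict_ok = can_view_as_complex(sizes, strides, skip_singletons=False)
relaxed_ok = can_view_as_complex(sizes, strides, skip_singletons=True)
```

With the strict rule the layout is rejected because of the odd stride on the size-1 dimension; with the relaxed rule it is accepted, while a non-singleton dimension with an odd stride is still rejected.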
[codemod][lowrisk] Remove unused exception parameter from caffe2 (#170325) Summary: `-Wunused-exception-parameter` has identified an unused exception parameter. This diff removes it. This:
```
try {
  ...
} catch (exception& e) {
  // no use of e
}
```
should instead be written as
```
} catch (exception&) {
```
If the code compiles, this is safe to land. Test Plan: Sandcastle Reviewed By: dmm-fb Differential Revision: D89071569 Pull Request resolved: #170325 Approved by: https://github.com/malfet
[18/N] Use Python 3.10 typing (#170280) This PR applies Python 3.10 typing syntax to some files. Pull Request resolved: #170280 Approved by: https://github.com/Lucaskabela
[CI] Update update_expected.py to skip cuda-13 results (#170348) Summary: Since cuda-13 runs skip more tests, we should only use cuda-12 runs to update the expected result files. Pull Request resolved: #170348 Approved by: https://github.com/eellison
[Release 2.11] Version Bump (#170346) Same as #162526. The compatibility matrix will be updated separately. Pull Request resolved: #170346 Approved by: https://github.com/huydhn
[ROCm] enable fastSpecializedAtomicAdd for gfx950 (#170330) Use standard HIP headers for unsafeAtomicAdd. Cannot remove copy/paste of unsafeAtomicAdd as "preview" implementation until ROCm 6.2 can be fully deprecated. Re-land of #167661 Co-author: @jeffdaily Pull Request resolved: #170330 Approved by: https://github.com/jeffdaily
Don't call str when in redistribute hotpath (#170366) Signed-off-by: Edward Z. Yang <[email protected]> Pull Request resolved: #170366 Approved by: https://github.com/wconstab
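The commit message does not say where `str` was being called, so the sketch below shows one common form of the general pattern rather than the actual change: building a string eagerly on a hot path versus deferring the conversion until it is needed. The class `Spec`, the logger name, and both functions are hypothetical.

```python
import logging


class Spec:
    """Hypothetical object whose string conversion we want to avoid on a hot path."""

    str_calls = 0  # counts how often __str__ runs

    def __str__(self):
        Spec.str_calls += 1
        return "Spec(...)"


logger = logging.getLogger("redistribute_sketch")
logger.setLevel(logging.WARNING)  # DEBUG messages are disabled


def eager(spec):
    # str(spec) runs on every call, even though the message is discarded.
    logger.debug("redistributing " + str(spec))


def lazy(spec):
    # The logging module formats the arguments only if the record is emitted.
    logger.debug("redistributing %s", spec)
```

Calling `eager` pays for the string conversion on every invocation, while `lazy` skips it entirely as long as debug logging is disabled.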