Why does optimize in einsum speed up a binary (two-operand) contraction?

Reading https://numpy.org/doc/stable/reference/generated/numpy.einsum.html, it seems to me that the optimize flag chooses the order in which multiple contractions are carried out. For example, for

A B C -> D

it picks whichever of (AB)C, A(BC), or (AC)B is fastest. I would not expect it to matter for a binary contraction such as AB->C.
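To make the point about ordering concrete, here is a minimal sketch with hypothetical shapes (n and k below are illustrative and not taken from the question). It counts multiply-adds for two pairwise orders of a three-matrix chain and asks np.einsum_path which order it would choose. The exact path printed may vary between NumPy versions.

```python
import numpy as np

# Hypothetical sizes, chosen so the two orders differ sharply in cost.
n, k = 8, 200
rng = np.random.default_rng(0)
A = rng.random((n, k))
B = rng.random((k, n))
C = rng.random((n, k))

# Multiply-add counts for two pairwise orders of the chain A B C.
cost_AB_first = n * k * n + n * n * k  # (AB)C: (n,k)@(k,n), then (n,n)@(n,k)
cost_BC_first = k * n * k + n * k * k  # A(BC): (k,n)@(n,k), then (n,k)@(k,k)
print(cost_AB_first, cost_BC_first)

# einsum_path returns the chosen contraction order and a printable report.
path, report = np.einsum_path('ab,bc,cd->ad', A, B, C, optimize='greedy')
print(path)
print(report)

# The contraction order affects the cost, not the result.
plain = np.einsum('ab,bc,cd->ad', A, B, C, optimize=False)
ordered = np.einsum('ab,bc,cd->ad', A, B, C, optimize='greedy')
same_result = np.allclose(plain, ordered)
print(same_result)
```

With only two operands there is a single possible pairing, which is why a speedup in that case is surprising at first sight.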

Consider the following code, which computes A[a,b] * B[b,c,d] = C[a,c,d]:

import numpy as np
import time
import scipy.stats

# from https://stackoverflow.com/questions/15033511/compute-a-confidence-interval-from-sample-data
def mean_confidence_interval(data, var_name, unit, confidence=0.95):
    a = 1.0 * np.array(data)
    n = len(a)
    m, se = np.mean(a), scipy.stats.sem(a)
    h = se * scipy.stats.t.ppf((1 + confidence) / 2., n-1)
    
    print(var_name, round(m, 5), "\u00B1",  round(h, 5), unit )


def einsum_greedy(A, B, na, nb, nc, nd):
    #res = np.zeros((na,nc,nd))
    res = np.einsum('ab,bcd->acd', A, B, optimize="greedy")
    return res 


def einsum_standard(A, B, na, nb, nc, nd):
   # res = np.zeros((na,nc,nd))
    res = np.einsum('ab,bcd->acd', A, B)
    return res 

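def btime_ABC(name_def, name_out, A, B, C, na, nb, nc, nd, n_times):
    # Time n_times calls of name_def; C is rebound locally, so the
    # preallocated array passed in is never written to.
    list_time = []
    for i in range(n_times):
        start_time = time.perf_counter()  # monotonic clock meant for short intervals
        C = name_def(A, B, na, nb, nc, nd)
        finish_time = time.perf_counter()
        list_time.append(finish_time - start_time)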

    mean_confidence_interval(list_time, name_out, 's' )



# A[a,b] * B[b,c,d] = C[a,c,d]
na = nb = nc = nd = dim_comm = 90
n_times = 60


print('number of common dimension', dim_comm)
print('number of averaged time', n_times)

A = np.random.random((na,nb))
B = np.random.random((nb,nc,nd))
C1 = np.zeros((na,nc,nd))
C2 = np.zeros((na,nc,nd))

btime_ABC(einsum_standard, 'einsum_standard', A, B, C1, na, nb, nc, nd, n_times)
btime_ABC(einsum_greedy, 'einsum_greedy', A, B, C2, na, nb, nc, nd, n_times)

I get:

number of common dimension 90
number of averaged time 60
einsum_standard 0.04799 ± 0.00312 s
einsum_greedy 0.00805 ± 0.00137 s

So the optimize flag does help the binary contraction A[a,b] * B[b,c,d] = C[a,c,d]. Why?

Answer:

My timings:

In [26]: A = np.random.random((90,80))
In [27]: B = np.random.random((80,81,82))
In [28]: timeit np.einsum('ab,bcd->acd',A,B,optimize=False)
39.2 ms ± 1.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [29]: timeit np.einsum('ab,bcd->acd',A,B,optimize=True)
9.06 ms ± 70.5 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Looking at the einsum code, I see this early on:

# If no optimization, run pure einsum
if optimize is False:
    return c_einsum(*operands, **kwargs)

Otherwise it checks various arguments and then runs:

operands, contraction_list = einsum_path(*operands, optimize=optimize,
                                         einsum_call=True)
...
# Call tensordot if still possible
if blas:
    ...
new_view = tensordot(*tmp_operands, axes=(tuple(left_pos), tuple(right_pos)))

Since we have only 2 operands and the path is trivial, I think the True case reduces to:

In [30]: timeit np.tensordot(A,B,(1,0))
7.62 ms ± 609 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

From what I have seen of tensordot in the past, that in turn amounts to:

In [31]: timeit (A@B.reshape(80,-1)).reshape(90,81,82)
6.44 ms ± 116 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

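So basically the time difference is between running the compiled "pure einsum" and an alternative that treats the problem as a matmul, which can use optimized BLAS routines. The [29] time looks like the [31] time plus some overhead.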
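One way to check that there is no ordering decision involved here is to ask np.einsum_path for the path of this two-operand contraction. This is a small sketch reusing the shapes from my session; I would expect a single pairwise step, though the report text may differ between NumPy versions.

```python
import numpy as np

A = np.random.random((90, 80))
B = np.random.random((80, 81, 82))

# With two operands there is only one pair to contract.
path, report = np.einsum_path('ab,bcd->acd', A, B, optimize='greedy')
print(path)
print(report)
```

If the path holds just that one step, the speedup cannot come from reordering; it has to come from how the single contraction is executed.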
