Using the eval method in Python to speed up DataFrame computations
Original article accessed: 2022-05-01

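The eval method can evaluate an expression at C speed without allocating intermediate arrays, so no memory is spent on temporary results.

With ordinary evaluation, an expression made up of several steps allocates a separate block of memory for every step.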
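To make the point about intermediate memory concrete, here is a minimal sketch of what the expression used later in this post does when evaluated the ordinary way. The names `tmp1`, `tmp2`, and `stepwise` are illustrative only, and the array size is kept small so the example runs quickly.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(1_000)
b = rng.standard_normal(1_000)
c = rng.standard_normal(1_000)

# One-line form: each sub-expression is still materialised as a full array.
result = a / (b + 0.1) - c

# The same computation written out, one temporary array per operation.
tmp1 = b + 0.1        # first temporary
tmp2 = a / tmp1       # second temporary
stepwise = tmp2 - c   # final result

same = np.allclose(result, stepwise)
temp_bytes = tmp1.nbytes + tmp2.nbytes  # memory held by the two temporaries
print(same, temp_bytes)
```

Each temporary is as large as the inputs, which is the cost eval is meant to avoid.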

import numpy as np
import pandas as pd
import timeit

# Three numeric columns of 10,000,000 random values, plus one constant string column.
df = pd.DataFrame({'a': np.random.randn(10000000),
                   'b': np.random.randn(10000000),
                   'c': np.random.randn(10000000),
                   'x': 'x'})
# print(df)

# Time the ordinary pandas expression.
start_time = timeit.default_timer()
df['a'] / (df['b'] + 0.1) - df['c']
end_time = timeit.default_timer()
print(end_time - start_time)
print("___________________")

# Time the same expression evaluated through pd.eval.
start_time = timeit.default_timer()
pd.eval("df['a'] / (df['b'] + 0.1) - df['c']")
end_time = timeit.default_timer()
print(end_time - start_time)
Runtime comparison (in seconds; ordinary expression first, pd.eval second):

0.136633455546
___________________
0.087637596342
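A single timed run can be noisy, so a somewhat sturdier comparison repeats each measurement and keeps the best time. The sketch below is one way to do that with `timeit.repeat`. The `compare` helper, its default sizes, and the use of `local_dict` to pass the columns in by name are choices made for this illustration, not part of the original measurement. The row count is far smaller than the 10,000,000 above so that it runs quickly, and on small frames the parsing overhead of eval may make it the slower option.

```python
import timeit

import numpy as np
import pandas as pd


def compare(n=100_000, repeat=5):
    """Return the best-of-`repeat` times (seconds) for plain pandas and pd.eval."""
    rng = np.random.default_rng(0)
    df = pd.DataFrame({'a': rng.standard_normal(n),
                       'b': rng.standard_normal(n),
                       'c': rng.standard_normal(n)})
    # Pass the columns explicitly so the expression does not depend on
    # which variables happen to be visible in the calling frame.
    names = {'a': df['a'], 'b': df['b'], 'c': df['c']}

    plain = timeit.repeat(lambda: df['a'] / (df['b'] + 0.1) - df['c'],
                          number=1, repeat=repeat)
    evaled = timeit.repeat(lambda: pd.eval("a / (b + 0.1) - c", local_dict=names),
                           number=1, repeat=repeat)
    return min(plain), min(evaled)


plain_s, eval_s = compare()
print("plain:  ", plain_s)
print("pd.eval:", eval_s)
```

The numbers depend on the machine, the pandas version, and whether the optional numexpr package is installed, so no particular ratio should be expected.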
As of version 0.13 (released January 2014), Pandas includes some experimental tools that allow you to directly access C-speed operations without costly allocation of intermediate arrays.
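One of these tools is `DataFrame.eval`, which lets the expression refer to columns directly by name. The following is a small sketch, assuming a frame with the same `a`, `b`, and `c` columns as above; it checks that the eval result agrees with the ordinary expression. The `engine="python"` call is included only to show that the engine can be selected explicitly; the default uses numexpr when it is installed.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({'a': rng.standard_normal(1_000),
                   'b': rng.standard_normal(1_000),
                   'c': rng.standard_normal(1_000)})

# Ordinary pandas expression, used as the reference result.
expected = df['a'] / (df['b'] + 0.1) - df['c']

# Same expression through DataFrame.eval, using bare column names.
via_eval = df.eval("a / (b + 0.1) - c")

# Explicitly select the pure-Python engine (no numexpr involved).
via_python = df.eval("a / (b + 0.1) - c", engine="python")

matches = bool(np.allclose(expected, via_eval)
               and np.allclose(expected, via_python))
print(matches)
```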