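**Applying anonymous functions to Series objects
Use pipe to apply an anonymous function to a Series object (the function receives the whole Series)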
pd.Series(range(5)).pipe(lambda x,y,z :(x**y)%z,2,5)
pd.Series(range(5)).pipe(lambda x:x+3).pipe(lambda x:x*3)
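A minimal sketch of the two `pipe` calls above. `pipe` hands the whole Series to the function as its first argument and forwards any extra positional arguments. The variable names `powmod` and `chained` are illustrative and not from the original.

```python
import pandas as pd

s = pd.Series(range(5))

# The Series is bound to x; the extra arguments 2 and 5 are bound to y and z.
powmod = s.pipe(lambda x, y, z: (x ** y) % z, 2, 5)

# Chained pipes run left to right: add 3 first, then multiply by 3.
chained = s.pipe(lambda x: x + 3).pipe(lambda x: x * 3)

print(powmod.tolist())   # [0, 1, 4, 4, 1]
print(chained.tolist())  # [9, 12, 15, 18, 21]
```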
Use apply to apply an anonymous function to each element of a Series object
pd.Series(range(5)).apply(lambda x:x+3)
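To make the difference from `pipe` concrete, here is a small sketch: `apply` calls the function once per element, while `pipe` calls it once with the whole Series. The names `labels`, `total`, and `shifted` are illustrative.

```python
import pandas as pd

s = pd.Series(range(5))

# apply works element by element, so an ordinary Python conditional
# on a single value is fine here.
labels = s.apply(lambda x: "even" if x % 2 == 0 else "odd")

# pipe passes the whole Series, so Series methods are available.
total = s.pipe(lambda x: x.sum())

# The call from the text: add 3 to every element.
shifted = s.apply(lambda x: x + 3)

print(labels.tolist())   # ['even', 'odd', 'even', 'odd', 'even']
print(total)             # 10
print(shifted.tolist())  # [3, 4, 5, 6, 7]
```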
pd.Series(range(0,5)).sem()
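`sem()` returns the standard error of the mean, which by default is the sample standard deviation (with `ddof=1`) divided by the square root of the number of observations. The sketch below checks this by hand; `manual` is an illustrative name.

```python
import math
import pandas as pd

s = pd.Series(range(0, 5))

# Sample standard deviation divided by sqrt(n).
manual = s.std(ddof=1) / math.sqrt(len(s))

# Both values should agree (about 0.7071 for 0..4).
print(s.sem(), manual)
```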
dff = dataframe.groupby(by = ['姓名','日期'],as_index = False).sum()
dff = dff.pivot(index = '姓名',columns = '日期',values = '交易额')
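index: the column whose values become the row index
columns: the column whose values become the column labels
values: the column whose values fill the cells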
dff.iloc[:,:1]
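The group-then-pivot steps above can be tried on a small stand-in table. The column names are the ones used in the original; the rows are made up for illustration, since the real spreadsheet is not available here.

```python
import pandas as pd

# Hypothetical sample rows standing in for the supermarket sales data.
dataframe = pd.DataFrame({
    "姓名": ["张三", "张三", "李四", "李四", "张三"],
    "日期": ["2019-03-01", "2019-03-02", "2019-03-01", "2019-03-02", "2019-03-01"],
    "交易额": [100, 200, 300, 400, 50],
})

# One row per (name, date) pair with the summed amount.
dff = dataframe.groupby(by=["姓名", "日期"], as_index=False).sum()

# Reshape: names become rows, dates become columns, amounts fill the cells.
dff = dff.pivot(index="姓名", columns="日期", values="交易额")

print(dff)
print(dff.iloc[:, :1])  # first date column only
```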
dataframe.pivot_table(values = '交易额',index = '姓名',columns = '日期',aggfunc = 'sum',margins = True).iloc[:,:2]
dataframe.pivot_table(values = '交易额',index = '姓名',columns = '日期',aggfunc = 'count',margins = True)
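`pivot_table` does the grouping and reshaping in one call, and `margins = True` adds an `All` row and column holding the overall aggregate. A sketch on made-up rows (column names from the original, data hypothetical):

```python
import pandas as pd

# Hypothetical sample rows.
dataframe = pd.DataFrame({
    "姓名": ["张三", "张三", "李四", "李四", "张三"],
    "日期": ["2019-03-01", "2019-03-02", "2019-03-01", "2019-03-02", "2019-03-01"],
    "交易额": [100, 200, 300, 400, 50],
})

# Sum of amounts per name and date, plus "All" totals.
totals = dataframe.pivot_table(values="交易额", index="姓名", columns="日期",
                               aggfunc="sum", margins=True)

# Number of transactions per name and date, plus "All" totals.
counts = dataframe.pivot_table(values="交易额", index="姓名", columns="日期",
                               aggfunc="count", margins=True)

print(totals)
print(counts)
```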
pd.crosstab(dataframe.姓名,dataframe.柜台)
pd.crosstab(dataframe.姓名,dataframe.柜台,dataframe.交易额,aggfunc = 'mean').apply(lambda num:round(num,2) )
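With two arguments `crosstab` counts how often each pair occurs; with a third argument and `aggfunc` it aggregates that column instead. The sketch below uses made-up rows and calls `.round(2)` directly, which should give the same result as applying `round` column by column.

```python
import pandas as pd

# Hypothetical sample rows.
dataframe = pd.DataFrame({
    "姓名": ["张三", "张三", "张三", "张三", "李四"],
    "柜台": ["化妆品", "化妆品", "化妆品", "食品", "食品"],
    "交易额": [10.0, 10.0, 10.1, 300.0, 400.0],
})

# Frequency table: how many rows each person has at each counter.
counts = pd.crosstab(dataframe["姓名"], dataframe["柜台"])

# Mean amount per person and counter, rounded to two decimals.
# Pairs that never occur show up as NaN.
means = pd.crosstab(dataframe["姓名"], dataframe["柜台"],
                    dataframe["交易额"], aggfunc="mean").round(2)

print(counts)
print(means)
```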
by can be an anonymous function or a dict (both applied to the index labels), or a string (a column name)
dataframe.groupby(by = lambda num:num % 5)['交易额'].sum()
dataframe.groupby(by = {7:'索引为7的行',15:'索引为15的行'})['交易额'].sum()
dataframe.groupby(by = '时段')['交易额'].sum()
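The three forms of `by` behave differently: a function and a dict are applied to the index labels, while a string names a column. In the dict case, rows whose label is not a key are left out of the result. A sketch on made-up data (the group labels `"row 7"` and `"row 15"` are illustrative):

```python
import pandas as pd

# Hypothetical data: 16 rows with index labels 0..15.
dataframe = pd.DataFrame({
    "时段": ["9:00-14:00", "14:00-21:00"] * 8,
    "交易额": [i * 10 for i in range(16)],
})

# Function: group key is (index label % 5).
by_func = dataframe.groupby(by=lambda num: num % 5)["交易额"].sum()

# Dict: only index labels 7 and 15 are mapped to a group.
by_dict = dataframe.groupby(by={7: "row 7", 15: "row 15"})["交易额"].sum()

# String: group by the values of a column.
by_col = dataframe.groupby(by="时段")["交易额"].sum()

print(by_func)
print(by_dict)
print(by_col)
```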
data['排名'] = data['交易额'].rank(ascending = False)
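With `ascending = False` the largest amount gets rank 1, and by default tied values share the average of the ranks they span. A small sketch with made-up amounts:

```python
import pandas as pd

# Hypothetical amounts; the two 300s are tied for first place.
data = pd.DataFrame({"交易额": [300, 100, 300, 200]})

data["排名"] = data["交易额"].rank(ascending=False)

print(data["排名"].tolist())  # [1.5, 4.0, 1.5, 3.0]
```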
dataframe.groupby(by = ['姓名','时段'])['交易额'].sum()
dataframe.loc[dataframe.交易额 < 1500,'交易额'] = dataframe[dataframe.交易额 < 1500]['交易额'].map(lambda num:num*1.5)
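The line above raises every amount below 1500 by 50% and leaves the other rows alone. A sketch on made-up values, with the boolean mask pulled out into a variable (`low` is an illustrative name):

```python
import pandas as pd

# Hypothetical amounts, stored as floats so the scaled values fit the column.
dataframe = pd.DataFrame({"交易额": [1000.0, 2000.0, 1200.0]})

# Rows whose amount is below 1500.
low = dataframe["交易额"] < 1500

# Scale only those rows; .loc with a mask assigns back in place.
dataframe.loc[low, "交易额"] = dataframe.loc[low, "交易额"].map(lambda num: num * 1.5)

print(dataframe["交易额"].tolist())  # [1500.0, 2000.0, 1800.0]
```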
len(dataframe.dropna())
dataframe[dataframe['交易额'].isnull()]
dataframe[dataframe.duplicated()]
dataframe = dataframe.drop_duplicates()
dff = dataframe[['工号','姓名']]
dff.drop_duplicates()
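A sketch of the missing-value and duplicate checks above on made-up rows. Note that `drop_duplicates()` returns a new frame, so the result has to be assigned if it is needed later; the names `complete`, `missing`, `dups`, and `staff` are illustrative.

```python
import pandas as pd

# Hypothetical rows: row 1 has a missing amount, row 2 repeats row 0.
dataframe = pd.DataFrame({
    "工号": [1001, 1002, 1001, 1003, 1001],
    "姓名": ["张三", "李四", "张三", "王五", "张三"],
    "交易额": [100.0, None, 100.0, 300.0, 250.0],
})

complete = len(dataframe.dropna())                # rows without any missing value
missing = dataframe[dataframe["交易额"].isnull()]   # rows with a missing amount
dups = dataframe[dataframe.duplicated()]          # later copies of a full row
deduped = dataframe.drop_duplicates()             # first occurrence of each row

# Unique (employee id, name) pairs.
staff = dataframe[["工号", "姓名"]].drop_duplicates()

print(complete, len(deduped), len(staff))  # 4 4 3
```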
dff = dataframe.groupby(by = '日期')['交易额'].sum().diff()
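`diff()` on the daily totals gives the change from the previous day, and the first day has nothing to compare against, so it is NaN. The sketch uses made-up rows and selects the amount column before summing, which yields the same totals as summing every column and picking the amount afterwards.

```python
import pandas as pd

# Hypothetical rows: two sales on the first day, one on each of the next two.
dataframe = pd.DataFrame({
    "日期": ["2019-03-01", "2019-03-01", "2019-03-02", "2019-03-03"],
    "交易额": [100, 200, 500, 450],
})

daily = dataframe.groupby(by="日期")["交易额"].sum()  # 300, 500, 450
change = daily.diff()                               # NaN, 200, -50

print(change)
```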
data_group = pd.crosstab(data.姓名,data.柜台,data.交易额,aggfunc = 'mean').apply(round)
df3 = pd.concat([df1,df2])
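`concat` stacks the frames row-wise and keeps each frame's original index, so labels can repeat. A sketch with two made-up frames; `ignore_index=True` is one way to renumber the rows, shown here as an option rather than something the original uses.

```python
import pandas as pd

# Hypothetical frames with the same columns.
df1 = pd.DataFrame({"工号": [1001, 1002], "姓名": ["张三", "李四"]})
df2 = pd.DataFrame({"工号": [1003, 1004], "姓名": ["王五", "赵六"]})

df3 = pd.concat([df1, df2])                            # index: 0, 1, 0, 1
renumbered = pd.concat([df1, df2], ignore_index=True)  # index: 0, 1, 2, 3

print(df3.index.tolist(), renumbered.index.tolist())
```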
merged = pd.merge(df4,df5)
rows = np.random.randint(0,len(merged),3)
merged.iloc[rows,:]
pd.merge(df1,df2,on = '工号',suffixes = ['_x','_y']).iloc[:,:]
df2.set_index('工号').join(df3.set_index('工号'),lsuffix = '_x',rsuffix = '_y').iloc[:]
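Both calls combine two frames on the employee id, but their defaults differ: `merge` keeps only ids present in both frames (inner join), while `join` keeps every row of the left frame (left join). Overlapping column names get the given suffixes. A sketch with made-up frames (the frame contents are hypothetical):

```python
import pandas as pd

# Hypothetical frames sharing the key column and one other column name.
df1 = pd.DataFrame({"工号": [1001, 1002, 1003], "交易额": [100, 200, 300]})
df2 = pd.DataFrame({"工号": [1001, 1002, 1004], "交易额": [10, 20, 40]})

# Inner join on the key column.
merged = pd.merge(df1, df2, on="工号", suffixes=("_x", "_y"))

# Left join on the index.
joined = df1.set_index("工号").join(df2.set_index("工号"),
                                  lsuffix="_x", rsuffix="_y")

print(merged)
print(joined)
```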
dataframe = pd.read_excel(r'C:\Users\lenovo\Desktop\总结\Python\超市营业额.xlsx',
                          usecols = ['工号','姓名','时段','交易额','柜台'])
dataframe.sort_values(by = ['交易额','工号'],ascending = [False,True])[:5]
dataframe.sort_values(by = ['工号'])[:5]
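When `by` lists several columns, `ascending` can give one direction per column: here amounts are sorted from high to low, and ties are broken by employee id from low to high. A sketch on made-up rows:

```python
import pandas as pd

# Hypothetical rows; the two 300s are ordered by employee id.
dataframe = pd.DataFrame({
    "工号": [1002, 1001, 1001, 1003],
    "交易额": [300, 100, 300, 200],
})

ordered = dataframe.sort_values(by=["交易额", "工号"], ascending=[False, True])

print(ordered[:5])
```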
data.resample('3H').mean()
data.resample('5H').ohlc()
data.index = data.index + pd.Timedelta('1D')
pd.Timestamp('').is_leap_year
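A sketch of the time-series calls above on a made-up hourly series. The date `2020-01-01` is only an illustration, since the original leaves the `Timestamp` argument blank. The bin width is passed as a `pd.Timedelta`, which `resample` accepts alongside frequency strings.

```python
import pandas as pd

# Hypothetical hourly data: six points starting at midnight.
idx = pd.date_range("2020-01-01", periods=6, freq=pd.Timedelta(hours=1))
data = pd.DataFrame({"交易额": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)

three_hours = pd.Timedelta(hours=3)

# Mean of each 3-hour bin.
means = data.resample(three_hours).mean()

# Open/high/low/close of each bin; columns are (column, statistic) pairs.
ohlc = data.resample(three_hours).ohlc()

# Shift every timestamp forward by one day.
shifted = data.copy()
shifted.index = shifted.index + pd.Timedelta("1D")

print(means)
print(ohlc)
print(pd.Timestamp("2020-01-01").is_leap_year)  # True
```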
dataframe['交易额'].describe()
index = dataframe['交易额'].idxmin()
index = dataframe['交易额'].idxmax()
dataframe.loc[index,'交易额']
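`idxmin()` and `idxmax()` return the index label of the smallest and largest value, which can then be passed to `.loc`. A sketch on made-up amounts, using separate names (`low_index`, `high_index`) so that neither label is overwritten:

```python
import pandas as pd

# Hypothetical amounts.
dataframe = pd.DataFrame({"交易额": [300, 100, 500, 200]})

stats = dataframe["交易额"].describe()  # count, mean, std, min, quartiles, max

low_index = dataframe["交易额"].idxmin()   # label of the smallest amount
high_index = dataframe["交易额"].idxmax()  # label of the largest amount

print(stats["mean"])                        # 275.0
print(dataframe.loc[high_index, "交易额"])   # 500
```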
dataframe2 = pd.read_excel(r'C:\Users\lenovo\Desktop\总结\Python\超市营业额.xlsx',
                           skiprows = [1,2,4],
                           index_col = 1)
skiprows: the rows to skip (0-indexed line numbers)
index_col: the column to use as the row index (0-indexed position)
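The Excel file is not available here, so this sketch swaps in `pd.read_csv` on an in-memory string. `read_csv` takes `skiprows` and `index_col` parameters with the same meaning, but the CSV text and its rows are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical file contents; line 0 is the header.
csv_text = """工号,姓名,交易额
1001,张三,100
1002,李四,200
1003,王五,300
1004,赵六,400
1005,钱七,500
"""

# Skip file lines 1, 2 and 4 (the header on line 0 is kept),
# and use column 1 (姓名) as the row index.
dataframe2 = pd.read_csv(io.StringIO(csv_text), skiprows=[1, 2, 4], index_col=1)

print(dataframe2)
```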
dataframe.iloc[[0,2,3],:]
dataframe.at[3,'姓名']**
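`iloc` selects by position and `at` reads a single cell by row label and column name. With the default integer index the two coincide, but after a selection the labels stay attached to their rows. A sketch on made-up rows (`subset` is an illustrative name):

```python
import pandas as pd

# Hypothetical rows with the default index 0..4.
dataframe = pd.DataFrame({
    "工号": [1001, 1002, 1003, 1004, 1005],
    "姓名": ["张三", "李四", "王五", "赵六", "钱七"],
})

# Rows at positions 0, 2 and 3, all columns.
subset = dataframe.iloc[[0, 2, 3], :]

# Single cell: row label 3, column 姓名.
name = dataframe.at[3, "姓名"]

print(subset.index.tolist())    # [0, 2, 3]
print(name)                     # 赵六
print(subset.iloc[1]["姓名"])    # 王五 (position 1 in subset, label 2)
```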
2020-05-07