List comprehensions
You now have all the knowledge necessary to begin writing list comprehensions! Your job in this exercise is to write a list comprehension that produces a list of the squares of the numbers ranging from 0 to 9.
# Create list comprehension: squares
squares = [i**2 for i in range(10)]
General form: [output expression for iterator variable in iterable]
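To see what the general form does, it can help to unroll it into the loop it replaces. This is a minimal sketch; the name `squares_loop` is introduced here only for illustration.

```python
# Loop version: build the list one element at a time
squares_loop = []
for i in range(10):
    squares_loop.append(i ** 2)

# Comprehension version: the same list in a single expression
squares = [i ** 2 for i in range(10)]

print(squares_loop == squares)
```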
A nested list comprehension is a list comprehension written inside another list comprehension:
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]
# Print the matrix
for row in matrix:
    print(row)
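The nested comprehension can be read as two nested for loops: the outer comprehension produces one row per pass, and the inner one fills that row. A rough equivalent, with `matrix_loop` and `current_row` as illustrative names:

```python
# Nested loops that build the same 5 x 5 matrix
matrix_loop = []
for row in range(5):
    current_row = []
    for col in range(5):
        current_row.append(col)
    matrix_loop.append(current_row)

# The nested list comprehension from the exercise
matrix = [[col for col in range(5)] for row in range(5)]

print(matrix_loop == matrix)
```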
You can test the iterator variable by adding an if clause, the optional predicate expression, after the for clause of the comprehension. This general form shows up often in other people's code, and it really does save lines:
[output expression for iterator variable in iterable if predicate expression]
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']
# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]
# Print the new list
print(new_fellowship)
<script.py> output:
['samwise', 'aragorn', 'legolas', 'boromir']
You can also put an if-else conditional expression in the output expression of a list comprehension. Every element then produces a value, which keeps the code simple and easy to read:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']
# Create list comprehension: new_fellowship
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]
# Print the new list
print(new_fellowship)
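The two forms behave differently: a trailing `if` drops elements, while an `if-else` in the output expression keeps every position and only changes the value. A small side-by-side sketch, with `filtered` and `mapped` as illustrative names:

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Trailing `if` filters: names shorter than 7 characters are dropped
filtered = [member for member in fellowship if len(member) >= 7]

# `if-else` in the output expression maps: every element yields a value
mapped = [member if len(member) >= 7 else '' for member in fellowship]

print(len(filtered), filtered)
print(len(mapped), mapped)
```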
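Dict comprehensions
Dict comprehensions follow the same pattern, and so do generator expressions, which use parentheses instead of square brackets.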
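As a sketch of the dict case: curly braces replace the square brackets, and the output expression becomes a `key: value` pair. The names `name_lengths` and `long_names` are made up for this example.

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# {key expression: value expression for iterator variable in iterable}
name_lengths = {member: len(member) for member in fellowship}
print(name_lengths)

# The optional predicate works the same way as in a list comprehension
long_names = {member: len(member) for member in fellowship if len(member) >= 7}
print(long_names)
```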
In [1]: # List of strings
... fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']
...
... # List comprehension
... fellow1 = [member for member in fellowship if len(member) >= 7]
In [2]: fellow2 = (member for member in fellowship if len(member) >= 7)
In [3]: fellow1
# Clearly, the list comprehension produces a list
Out[3]: ['samwise', 'aragorn', 'legolas', 'boromir']
In [4]: fellow2
# The generator expression produces a generator object
Out[4]: <generator object <genexpr> at 0x7f3821ba3a40>
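The generator object holds no values up front; it produces them only when asked. A short sketch of consuming it (`first` and `rest` are illustrative names):

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']
fellow2 = (member for member in fellowship if len(member) >= 7)

# Values are produced one at a time, on request
first = next(fellow2)
print(first)

# list() consumes whatever is left
rest = list(fellow2)
print(rest)

# The generator is now exhausted, so a second pass yields nothing
print(list(fellow2))
```

If the values are needed more than once, a list comprehension is the simpler choice; the generator is useful when the values are only streamed through once.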
A small list comprehension example
# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']
# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time if entry[17:19] == '19']
# Print the extracted times
print(tweet_clock_time)
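The snippet above depends on a DataFrame `df` that is not shown here. To try the slicing on its own, here is a sketch with a plain list of made-up timestamps, assuming they look like `'Tue Mar 29 23:40:17 +0000 2016'`, so that characters 11 to 18 hold the clock time and characters 17 to 18 hold the seconds.

```python
# Made-up sample values standing in for df['created_at']
tweet_time = [
    'Tue Mar 29 23:40:17 +0000 2016',
    'Tue Mar 29 23:40:19 +0000 2016',
    'Tue Mar 29 23:41:19 +0000 2016',
]

# Keep the HH:MM:SS part of entries whose seconds field equals '19'
tweet_clock_time = [entry[11:19] for entry in tweet_time if entry[17:19] == '19']

print(tweet_clock_time)
```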
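zip() returns an iterable zip object; to display all of its results at once, wrap it in list().
It can "zip" several sequences (lists, tuples, dictionaries, sets, strings, and range() objects) into one zip object. "Zipping" simply means regrouping the elements at corresponding positions into new tuples.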
zipped_lists = zip(feature_names,row_vals)
rs_dict = dict(zipped_lists)
print(rs_dict)
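`feature_names` and `row_vals` come from the exercise and are not defined in this post. The sketch below uses made-up stand-ins to show what zip() produces, that the zip object can be consumed only once, and that zipping stops at the shortest input.

```python
# Hypothetical stand-ins for the exercise data
feature_names = ['CountryName', 'CountryCode', 'Year']
row_vals = ['Arab World', 'ARB', '1960']

zipped = zip(feature_names, row_vals)

# Wrap in list() to see all the tuples at once
pairs = list(zipped)
print(pairs)

# A zip object is an iterator: after one pass it is empty
leftover = list(zipped)
print(leftover)

# zip() stops as soon as the shortest input runs out
short = list(zip('abc', range(2)))
print(short)

# Build a fresh zip object for dict()
rs_dict = dict(zip(feature_names, row_vals))
print(rs_dict)
```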
yield is the keyword that makes a function a generator; it works somewhat like return.
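A minimal sketch of a generator function built with `yield`. The function name `num_sequence` is made up for this example. Unlike `return`, `yield` hands back a value and pauses the function, which resumes from the same spot when the next value is requested.

```python
def num_sequence(n):
    """Yield the integers 0 .. n-1 one at a time."""
    i = 0
    while i < n:
        yield i  # hand back i, then pause here until the next request
        i += 1

gen = num_sequence(3)

first = next(gen)   # runs the function up to the first yield
print(first)

rest = list(gen)    # resumes after the yield and collects what is left
print(rest)
```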
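The dataset comes from the World Bank's World Development Indicators.
This small demo is a good way to understand what defining a function means: put everything the task requires into one function, and to make that function general, pull out the shared parameters and pass them in from outside.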
# Define lists2dict()
def lists2dict(list1, list2):
    """Return a dictionary where list1 provides
    the keys and list2 provides the values."""
    # Zip lists: zipped_lists
    zipped_lists = zip(list1, list2)
    # Create a dictionary: rs_dict
    rs_dict = dict(zipped_lists)
    # Return the dictionary
    return rs_dict
# Call lists2dict: rs_fxn
rs_fxn = lists2dict(feature_names,row_vals)
# Print rs_fxn
print(rs_fxn)
# Open a connection to the file and read its data
with open('world_dev_ind.csv') as file:
    # Skip the column names
    file.readline()
    # Initialize an empty dictionary: counts_dict
    counts_dict = {}
    # Process only the first 1000 rows
    for j in range(1000):
        # Split the current line into a list: line
        line = file.readline().split(',')
        # Get the value for the first column: first_col
        first_col = line[0]
        # If the column value is in the dict, increment its value
        if first_col in counts_dict:
            counts_dict[first_col] += 1
        # Else, add to the dict and set value to 1
        else:
            counts_dict[first_col] = 1
# Print the resulting dictionary
print(counts_dict)
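The same count can be written more compactly with `collections.Counter` and a generator expression. This is an alternative sketch, not the exercise's solution: an in-memory `io.StringIO` with made-up rows and column names stands in for `world_dev_ind.csv`, and `itertools.islice` caps the pass at 1000 rows.

```python
import io
from collections import Counter
from itertools import islice

# Made-up rows standing in for the real CSV file
sample = io.StringIO(
    "CountryName,CountryCode,Year\n"
    "Arab World,ARB,1960\n"
    "Arab World,ARB,1961\n"
    "Euro area,EMU,1960\n"
)

# Skip the column names
sample.readline()

# Count first-column values over at most the first 1000 data rows
counts = Counter(line.split(',')[0] for line in islice(sample, 1000))

print(dict(counts))
```

Iterating over the file object stops at the end of the file, so a file with fewer than 1000 rows does not add an empty-string key to the counts.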
Defining a custom plotting function
import pandas as pd
import matplotlib.pyplot as plt

# Define plot_pop()
def plot_pop(filename, country_code):
    # Initialize reader object: urb_pop_reader
    urb_pop_reader = pd.read_csv(filename, chunksize=1000)
    # Collect the filtered chunks in a list: chunks
    chunks = []
    # Iterate over each DataFrame chunk
    for df_urb_pop in urb_pop_reader:
        # Check out specific country: df_pop_ceb
        # (.copy() so the new column is added to an independent DataFrame)
        df_pop_ceb = df_urb_pop[df_urb_pop['CountryCode'] == country_code].copy()
        # Zip DataFrame columns of interest: pops
        pops = zip(df_pop_ceb['Total Population'],
                   df_pop_ceb['Urban population (% of total)'])
        # Turn zip object into list: pops_list
        pops_list = list(pops)
        # Use list comprehension to create new DataFrame column 'Total Urban Population'
        df_pop_ceb['Total Urban Population'] = [int(tup[0] * tup[1] * 0.01) for tup in pops_list]
        # Keep the chunk (DataFrame.append was removed in pandas 2.0)
        chunks.append(df_pop_ceb)
    # Combine the chunks into one DataFrame: data
    data = pd.concat(chunks)
    # Plot urban population data
    data.plot(kind='scatter', x='Year', y='Total Urban Population')
    plt.show()
# Set the filename: fn
fn = 'ind_pop_data.csv'
# Call plot_pop for country code 'CEB'
plot_pop(fn,'CEB')
# Call plot_pop for country code 'ARB'
plot_pop(fn,'ARB')