Python list comprehensions
Date: 2023-07-10

List comprehensions

You now have all the knowledge necessary to begin writing list comprehensions! Your job in this exercise is to write a list comprehension that produces a list of the squares of the numbers ranging from 0 to 9.

# Create list comprehension: squares
squares = [i**2 for i in range(10)]


General form: [output expression for iterator variable in iterable]
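To see what the general form does, it can help to compare it with the equivalent for loop. This is only a minimal sketch; the name squares_loop is introduced here for illustration and is not part of the exercise.

```python
# List comprehension from the exercise
squares = [i**2 for i in range(10)]

# Equivalent for loop (squares_loop is a hypothetical name used for comparison)
squares_loop = []
for i in range(10):
    squares_loop.append(i**2)

print(squares)                  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(squares == squares_loop)  # True
```

The comprehension and the loop build the same list; the comprehension just does it in one line.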

Nested list comprehensions: you can write a list comprehension as the output expression of another list comprehension.

# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)
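A nested comprehension can be read from the outside in: the outer for builds the rows, and the inner comprehension builds each row. The sketch below expands it into explicit loops; matrix_loop and inner are hypothetical names used only for the comparison.

```python
# Nested list comprehension from the example
matrix = [[col for col in range(5)] for row in range(5)]

# Same result with explicit loops (matrix_loop and inner are hypothetical names)
matrix_loop = []
for row in range(5):
    inner = []
    for col in range(5):
        inner.append(col)
    matrix_loop.append(inner)

print(matrix[0])              # [0, 1, 2, 3, 4]
print(matrix == matrix_loop)  # True
```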

You can apply a conditional statement to test the iterator variable by adding an if clause, the optional predicate expression, after the for clause in the comprehension.

This general form shows up a lot when reading other people's code, and it really does save lines:

[output expression for iterator variable in iterable if predicate expression]


# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]

# Print the new list
print(new_fellowship)

<script.py> output:
    ['samwise', 'aragorn', 'legolas', 'boromir']

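You can also use an if-else conditional expression in the output expression of a list comprehension.

Here the output expression is itself an if-else expression, which is intuitive and simple: members with fewer than 7 characters are replaced by an empty string instead of being dropped.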

# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]

# Print the new list
print(new_fellowship)
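The two forms are easy to mix up, so here is a small side-by-side sketch. The names filtered and replaced are made up for this comparison; everything else comes from the examples above.

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Predicate after the for clause: short names are dropped
filtered = [member for member in fellowship if len(member) >= 7]

# if-else in the output expression: short names become ''
replaced = [member if len(member) >= 7 else '' for member in fellowship]

print(len(filtered), filtered)  # 4 ['samwise', 'aragorn', 'legolas', 'boromir']
print(len(replaced), replaced)  # 7 ['', 'samwise', '', 'aragorn', 'legolas', 'boromir', '']
```

The predicate changes how many items come out; the if-else expression keeps the length and only changes the values.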

Dict comprehensions

Dict comprehensions work the same way as list comprehensions.
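As a minimal sketch of the dict form: curly braces replace the square brackets, and the output expression is a key: value pair. The name name_lengths is hypothetical and chosen only for this illustration.

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Dict comprehension: map each member to the length of its name
name_lengths = {member: len(member) for member in fellowship}

print(name_lengths['frodo'])    # 5
print(name_lengths['samwise'])  # 7
```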

For comparison, a generator expression uses the same syntax with parentheses instead of square brackets:

In [1]: # List of strings
   ...: fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']
   ...:
   ...: # List comprehension
   ...: fellow1 = [member for member in fellowship if len(member) >= 7]

In [2]: # Generator expression
   ...: fellow2 = (member for member in fellowship if len(member) >= 7)

In [3]: fellow1  # clearly, a list comprehension produces a list
Out[3]: ['samwise', 'aragorn', 'legolas', 'boromir']

In [4]: fellow2  # a generator expression produces a generator object
Out[4]: <generator object <genexpr> at 0x7f3821ba3a40>
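A generator object produces its values lazily and can only be consumed once. The sketch below shows this with next() and list(); the names first, rest, and again are hypothetical.

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Generator expression: nothing is computed until values are requested
fellow2 = (member for member in fellowship if len(member) >= 7)

first = next(fellow2)  # pull a single value
rest = list(fellow2)   # pull everything that is left
again = list(fellow2)  # the generator is now exhausted

print(first)  # samwise
print(rest)   # ['aragorn', 'legolas', 'boromir']
print(again)  # []
```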

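A small list comprehension example (df is assumed to be a pandas DataFrame whose created_at column holds timestamp strings):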

# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time if entry[17:19] == '19']

# Print the extracted times
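print(tweet_clock_time)

  • zip() returns an iterator; wrap it in list() when you want to see all the results at once.

  • It can "zip" several iterables (lists, tuples, dicts, sets, strings, and range() objects) into a single zip object. "Zipping" simply means regrouping the elements at corresponding positions of these iterables into new tuples.

    # Zip lists: zipped_lists
    zipped_lists = zip(feature_names, row_vals)

    # Create a dictionary: rs_dict
    rs_dict = dict(zipped_lists)

    # Print the dictionary
    print(rs_dict)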
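Since feature_names and row_vals are not defined in these notes, here is a self-contained sketch with sample values made up purely for illustration. It also shows that a zip object is used up after one pass, so it should not be wrapped in list() first if it is still needed for dict().

```python
# Sample values made up for illustration; the real lists come from the dataset
feature_names = ['CountryName', 'CountryCode', 'Year']
row_vals = ['Arab World', 'ARB', '1960']

# Zip lists and look at the tuples by wrapping the zip object in list()
zipped_lists = zip(feature_names, row_vals)
pairs = list(zipped_lists)

# The zip object is an iterator, so a second pass yields nothing
leftover = list(zipped_lists)

# Build the dictionary from a fresh zip object
rs_dict = dict(zip(feature_names, row_vals))

print(pairs)     # [('CountryName', 'Arab World'), ('CountryCode', 'ARB'), ('Year', '1960')]
print(leftover)  # []
print(rs_dict)   # {'CountryName': 'Arab World', 'CountryCode': 'ARB', 'Year': '1960'}
```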

The generator keyword (yield) works somewhat like return.
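A minimal sketch of a generator function, assuming the keyword meant here is yield. The function name count_up_to is hypothetical. Unlike return, yield hands back one value and pauses the function, which resumes from the same spot when the next value is requested.

```python
def count_up_to(n):
    """Yield the integers 0, 1, ..., n - 1 one at a time."""
    i = 0
    while i < n:
        yield i  # hand back i and pause here until the next value is requested
        i += 1

gen = count_up_to(3)
first = next(gen)     # runs the function body up to the first yield
remaining = list(gen) # resumes after the yield and collects the rest

print(first)      # 0
print(remaining)  # [1, 2]
```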

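Reference

The dataset comes from the World Bank's World Development Indicators.

This small demo makes it easier to understand how functions are defined: put all the required logic into one function, and to make the function general, pull out the shared values as parameters and pass them in from outside.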

# Define lists2dict()
def lists2dict(list1, list2):
    """Return a dictionary where list1 provides
    the keys and list2 provides the values."""

    # Zip lists: zipped_lists
    zipped_lists = zip(list1, list2)

    # Create a dictionary: rs_dict
    rs_dict = dict(zipped_lists)

    # Return the dictionary
    return rs_dict

# Call lists2dict: rs_fxn
rs_fxn = lists2dict(feature_names,row_vals)

# Print rs_fxn
print(rs_fxn)
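Once the logic lives in a function, it combines naturally with a list comprehension. The sketch below turns several rows into a list of dictionaries; row_lists, list_of_dicts, and the sample values are assumptions made up for illustration, and lists2dict is a condensed version of the function above.

```python
def lists2dict(list1, list2):
    """Return a dictionary where list1 provides the keys and list2 the values."""
    return dict(zip(list1, list2))

# Sample data made up for illustration
feature_names = ['CountryName', 'CountryCode', 'Year']
row_lists = [
    ['Arab World', 'ARB', '1960'],
    ['Arab World', 'ARB', '1961'],
]

# Call lists2dict once per row inside a list comprehension
list_of_dicts = [lists2dict(feature_names, sublist) for sublist in row_lists]

print(list_of_dicts[0])          # {'CountryName': 'Arab World', 'CountryCode': 'ARB', 'Year': '1960'}
print(list_of_dicts[1]['Year'])  # 1961
```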


# Open a connection to the file and read its data
with open('world_dev_ind.csv') as file:

    # Skip the column names
    file.readline()

    # Initialize an empty dictionary: counts_dict
    counts_dict = {}

    # Process only the first 1000 rows
    for j in range(0,1000):

        # Split the current line into a list: line
        line = file.readline().split(',')

        # Get the value for the first column: first_col
        first_col = line[0]

        # If the column value is in the dict, increment its value
        if first_col in counts_dict:
            counts_dict[first_col] += 1

        # Else, add to the dict and set value to 1
        else:
            counts_dict[first_col] = 1

# Print the resulting dictionary
print(counts_dict)
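The same counting can be sketched with collections.Counter and itertools.islice from the standard library. This is an alternative, not the approach used in the notes. To keep the example self-contained it reads from an in-memory io.StringIO with made-up rows instead of world_dev_ind.csv. One difference worth noting: islice simply stops when the data runs out, whereas the readline() loop above would keep counting empty strings if the file had fewer than 1000 rows.

```python
import io
from collections import Counter
from itertools import islice

# In-memory stand-in for the CSV file; the rows are made up for illustration
sample = io.StringIO(
    "CountryName,CountryCode\n"
    "Arab World,ARB\n"
    "Arab World,ARB\n"
    "Caribbean small states,CSS\n"
)

# Skip the column names
sample.readline()

# Count the first column over at most 1000 rows
counts = Counter(line.split(',')[0] for line in islice(sample, 1000))

print(counts['Arab World'])              # 2
print(counts['Caribbean small states'])  # 1
```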

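Define a custom plotting function

# Imports needed by plot_pop()
import pandas as pd
import matplotlib.pyplot as plt

# Define plot_pop()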
def plot_pop(filename, country_code):

    # Initialize reader object: urb_pop_reader
    urb_pop_reader = pd.read_csv(filename, chunksize=1000)

    # Initialize empty DataFrame: data
    data = pd.DataFrame()

    # Iterate over each DataFrame chunk
    for df_urb_pop in urb_pop_reader:
        # Check out specific country: df_pop_ceb
        # (.copy() avoids a SettingWithCopyWarning when the new column is added below)
        df_pop_ceb = df_urb_pop[df_urb_pop['CountryCode'] == country_code].copy()

        # Zip DataFrame columns of interest: pops
        pops = zip(df_pop_ceb['Total Population'],
                    df_pop_ceb['Urban population (% of total)'])

        # Turn zip object into list: pops_list
        pops_list = list(pops)

        # Use list comprehension to create new DataFrame column 'Total Urban Population'
        df_pop_ceb['Total Urban Population'] = [int(tup[0] * tup[1] * 0.01) for tup in pops_list]

        # Append DataFrame chunk to data: data
        # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
        data = pd.concat([data, df_pop_ceb])

    # Plot urban population data
    data.plot(kind='scatter', x='Year', y='Total Urban Population')
    plt.show()

# Set the filename: fn
fn = 'ind_pop_data.csv'

# Call plot_pop for country code 'CEB'
plot_pop(fn,'CEB')

# Call plot_pop for country code 'ARB'
plot_pop(fn,'ARB')