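A Tour of Cool Python Libraries: the Third-Party Library Pandas (043)

I. Usage in Detail

146. The pandas.Series.expanding method

146-1. Syntax

146-2. Parameters

146-3. Function

146-4. Return value

146-5. Notes

146-6. Usage

146-6-1. Data preparation

146-6-2. Code example

146-6-3. Output

147. The pandas.Series.ewm method

147-1. Syntax

147-2. Parameters

147-3. Function

147-4. Return value

147-5. Notes

147-6. Usage

147-6-1. Data preparation

147-6-2. Code example

147-6-3. Output

148. The pandas.Series.pipe method

148-1. Syntax

148-2. Parameters

148-3. Function

148-4. Return value

148-5. Notes

148-6. Usage

148-6-1. Data preparation

148-6-2. Code example

148-6-3. Output

149. The pandas.Series.abs method

149-1. Syntax

149-2. Parameters

149-3. Function

149-4. Return value

149-5. Notes

149-6. Usage

149-6-1. Data preparation

149-6-2. Code example

149-6-3. Output

150. The pandas.Series.all method

150-1. Syntax

150-2. Parameters

150-3. Function

150-4. Return value

150-5. Notes

150-6. Usage

150-6-1. Data preparation

150-6-2. Code example

150-6-3. Output

II. Recommended Reading

1. Python Foundations Tour

2. Python Functions Tour

3. Python Algorithms Tour

4. Python Magic Tour

5. Blog home page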

I. Usage in Detail

146. The pandas.Series.expanding method

146-1. Syntax

# 146. The pandas.Series.expanding method
pandas.Series.expanding(min_periods=1, axis=_NoDefault.no_default, method='single')
Provide expanding window calculations.

Parameters:
min_periods: int, default 1
    Minimum number of observations in window required to have a value; otherwise, result is np.nan.
axis: int or str, default 0
    If 0 or 'index', roll across the rows. If 1 or 'columns', roll across the columns. For Series this parameter is unused and defaults to 0.
method: str {'single', 'table'}, default 'single'
    Execute the rolling operation per single column or row ('single') or over the entire object ('table'). This argument is only implemented when specifying engine='numba' in the method call. New in version 1.3.0.

Returns:
pandas.api.typing.Expanding

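146-2. Parameters

146-2-1. min_periods (optional, default 1): the minimum number of observations a window must contain before a result is produced; windows with fewer observations yield NaN. For example, with min_periods=3 the first two results are NaN and the cumulative statistic starts at the third element.

146-2-2. axis (optional): unused for a Series, which has only one axis; it is relevant only to DataFrame objects.

146-2-3. method (optional, default 'single'): whether the operation runs per single column or row ('single') or over the entire object ('table'). 'table' is only implemented when engine='numba' is specified in the subsequent method call.

146-3. Function

Computes expanding-window (cumulative) statistics such as the cumulative mean and the cumulative sum: the window at each position covers every element from the start of the Series up to that position.

146-4. Return value

Returns an Expanding object, on which aggregation methods such as mean(), sum(), and std() can then be called.

146-5. Notes

pandas.Series.expanding computes cumulative statistics, and min_periods controls when results start to appear. The method itself returns an Expanding object rather than a Series; the Series of cumulative results is produced only when an aggregation method such as mean() or sum() is called on that object.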
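To make the difference between the Expanding object and its results concrete, here is a minimal sketch. It reuses the sample data from this article and assumes that checking the expanding mean against a hand-computed cumulative sum divided by the running count is an adequate illustration; the variable names are chosen only for this example.

```python
import numpy as np
import pandas as pd

data = pd.Series([3, 5, 6, 8, 10, 11, 24])

# expanding() on its own only builds a window object; nothing is computed yet.
window = data.expanding()

# Calling an aggregation on the window object produces the Series of results.
expanding_mean = window.mean()

# The expanding mean at position i is the mean of elements 0..i, i.e. the
# cumulative sum divided by the number of observations seen so far.
counts = np.arange(1, len(data) + 1)
manual_mean = data.cumsum() / counts

print(type(window).__name__)
print(type(expanding_mean).__name__)
print(np.allclose(expanding_mean, manual_mean))
```

The same pattern applies to the other aggregations: the window object is cheap to create, and the work happens in the call that follows it.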

146-6. Usage

146-6-1. Data preparation

None

146-6-2. Code example

# 146. The pandas.Series.expanding method
# 146-1. Compute the cumulative mean
import pandas as pd
data = pd.Series([3, 5, 6, 8, 10, 11, 24])
expanding_mean = data.expanding().mean()
print(expanding_mean, end='\n\n')
# 146-2. Compute the cumulative sum
import pandas as pd
data = pd.Series([3, 5, 6, 8, 10, 11, 24])
expanding_sum = data.expanding().sum()
print(expanding_sum, end='\n\n')
# 146-3. Set the min_periods parameter
import pandas as pd
data = pd.Series([3, 5, 6, 8, 10, 11, 24])
expanding_mean_min = data.expanding(min_periods=3).mean()
print(expanding_mean_min)

146-6-3. Output

# 146. The pandas.Series.expanding method
# 146-1. Compute the cumulative mean
# 0    3.000000
# 1    4.000000
# 2    4.666667
# 3    5.500000
# 4    6.400000
# 5    7.166667
# 6    9.571429
# dtype: float64
# 146-2. Compute the cumulative sum
# 0     3.0
# 1     8.0
# 2    14.0
# 3    22.0
# 4    32.0
# 5    43.0
# 6    67.0
# dtype: float64
# 146-3. Set the min_periods parameter
# 0         NaN
# 1         NaN
# 2    4.666667
# 3    5.500000
# 4    6.400000
# 5    7.166667
# 6    9.571429
# dtype: float64

147. The pandas.Series.ewm method

147-1. Syntax

# 147. The pandas.Series.ewm method
pandas.Series.ewm(com=None, span=None, halflife=None, alpha=None, min_periods=0, adjust=True, ignore_na=False, axis=_NoDefault.no_default, times=None, method='single')
Provide exponentially weighted (EW) calculations.
Exactly one of com, span, halflife, or alpha must be provided if times is not provided. If times is provided, halflife and one of com, span or alpha may be provided.

Parameters:
com: float, optional
    Specify decay in terms of center of mass: \(\alpha = 1 / (1 + com)\), for \(com \geq 0\).
span: float, optional
    Specify decay in terms of span: \(\alpha = 2 / (span + 1)\), for \(span \geq 1\).
halflife: float, str, timedelta, optional
    Specify decay in terms of half-life: \(\alpha = 1 - \exp\left(-\ln(2) / halflife\right)\), for \(halflife > 0\).
    If times is specified, a timedelta convertible unit over which an observation decays to half its value. Only applicable to mean(), and halflife value will not apply to the other functions.
alpha: float, optional
    Specify smoothing factor \(\alpha\) directly: \(0 < \alpha \leq 1\).
min_periods: int, default 0
    Minimum number of observations in window required to have a value; otherwise, result is np.nan.
adjust: bool, default True
    Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings (viewing EWMA as a moving average).
    When adjust=True (default), the EW function is calculated using weights \(w_i = (1 - \alpha)^i\). For example, the EW moving average of the series [\(x_0, x_1, ..., x_t\)] would be:
    \[y_t = \frac{x_t + (1 - \alpha)x_{t-1} + (1 - \alpha)^2 x_{t-2} + ... + (1 - \alpha)^t x_0}{1 + (1 - \alpha) + (1 - \alpha)^2 + ... + (1 - \alpha)^t}\]
    When adjust=False, the exponentially weighted function is calculated recursively:
    \[\begin{split} y_0 &= x_0\\ y_t &= (1 - \alpha) y_{t-1} + \alpha x_t \end{split}\]
ignore_na: bool, default False
    Ignore missing values when calculating weights.
    When ignore_na=False (default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.
    When ignore_na=True, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.
axis: {0, 1}, default 0
    If 0 or 'index', calculate across the rows. If 1 or 'columns', calculate across the columns. For Series this parameter is unused and defaults to 0.
times: np.ndarray, Series, default None
    Only applicable to mean().
    Times corresponding to the observations. Must be monotonically increasing and datetime64[ns] dtype.
    If 1-D array like, a sequence with the same shape as the observations.
method: str {'single', 'table'}, default 'single'
    New in version 1.4.0.
    Execute the rolling operation per single column or row ('single') or over the entire object ('table'). This argument is only implemented when specifying engine='numba' in the method call. Only applicable to mean().

Returns:
pandas.api.typing.ExponentialMovingWindow

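147-2. Parameters

147-2-1. com (optional, default None): float; specifies the decay in terms of center of mass, $\alpha=\frac{1}{1+\text{com}}$, for $\text{com} \geq 0$.

147-2-2. span (optional, default None): float; specifies the decay in terms of span, $\alpha=\frac{2}{\text{span}+1}$, for $\text{span} \geq 1$.

147-2-3. halflife (optional, default None): float, str, or timedelta; specifies the decay in terms of half-life, $\alpha=1-e^{-\frac{\ln(2)}{\text{halflife}}}$, for $\text{halflife} > 0$.

147-2-4. alpha (optional, default None): float; specifies the smoothing factor directly, with $0 < \alpha \leq 1$.

147-2-5. min_periods (optional, default 0): int; the minimum number of observations required before a result is produced; earlier positions yield NaN.

147-2-6. adjust (optional, default True): bool; if True, the result is divided by a decaying adjustment factor, so it is a weighted average with weights $(1-\alpha)^i$; if False, the result is computed recursively as $y_0 = x_0$, $y_t = (1-\alpha) y_{t-1} + \alpha x_t$.

147-2-7. ignore_na (optional, default False): bool; controls how missing values affect the weights. If False, weights are based on absolute positions, so a NaN still ages the observations before it; if True, weights are based on relative positions, as if the NaN were absent.

147-2-8. axis (optional): unused for a Series, which has only one axis; it is relevant only to DataFrame objects.

147-2-9. times (optional, default None): np.ndarray or Series of timestamps corresponding to the observations, with the same shape as the data; it must be monotonically increasing and of dtype datetime64[ns]. Only applicable to mean().

147-2-10. method (optional, default 'single'): str; whether the operation runs per single column or row ('single') or over the entire object ('table'). 'table' is only implemented when engine='numba' is specified in the method call, and only for mean().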
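The four decay parameters are different ways of writing the same smoothing factor. The sketch below assumes the formulas quoted in the syntax section and picks $\alpha = 2/3$ purely for illustration, because it corresponds to span=2 as used in this article's example; the halflife value is derived by inverting the half-life formula.

```python
import math

import numpy as np
import pandas as pd

data = pd.Series([3, 5, 6, 8, 10, 11, 24])

alpha = 2 / 3                                      # smoothing factor chosen for illustration
by_alpha = data.ewm(alpha=alpha).mean()
by_span = data.ewm(span=2).mean()                  # alpha = 2 / (span + 1)
by_com = data.ewm(com=0.5).mean()                  # alpha = 1 / (1 + com)

# Invert alpha = 1 - exp(-ln(2) / halflife) to get the matching half-life.
halflife = -math.log(2) / math.log(1 - alpha)
by_halflife = data.ewm(halflife=halflife).mean()

print(np.allclose(by_alpha, by_span))
print(np.allclose(by_alpha, by_com))
print(np.allclose(by_alpha, by_halflife))
```

Because the four arguments describe the same quantity, only one of them may be passed at a time when times is not given.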

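147-3. Function

Computes exponentially weighted statistics, such as the exponentially weighted moving average (EWMA) and the exponentially weighted sum, in which recent observations receive larger weights than older ones.

147-4. Return value

Returns an ExponentialMovingWindow object, on which weighted statistics methods such as mean(), sum(), and std() can then be called.

147-5. Notes

pandas.Series.ewm computes exponentially weighted statistics. Exactly one of com, span, halflife, or alpha must be given when times is not provided. The method itself returns an ExponentialMovingWindow object rather than a Series; the Series of weighted results is produced only when a method such as mean() or sum() is called on that object.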
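The two formulas behind adjust can be checked by hand. The following is a small sketch, not the library's implementation: it recomputes the exponentially weighted mean with plain loops for both adjust=True (weighted average) and adjust=False (recursion), using the formulas quoted in the syntax section and an illustrative $\alpha = 2/3$ (equivalent to span=2).

```python
import numpy as np
import pandas as pd

data = pd.Series([3, 5, 6, 8, 10, 11, 24])
alpha = 2 / 3  # illustrative value, equivalent to span=2

# adjust=True: weighted average where x_t has weight 1 and x_0 has weight (1 - alpha)^t.
weighted = []
for t in range(len(data)):
    weights = (1 - alpha) ** np.arange(t, -1, -1)   # weights for x_0 .. x_t
    values = data.iloc[: t + 1].to_numpy()
    weighted.append(float(np.sum(weights * values) / np.sum(weights)))

# adjust=False: y_0 = x_0, then y_t = (1 - alpha) * y_{t-1} + alpha * x_t.
recursive = [float(data.iloc[0])]
for x in data.iloc[1:]:
    recursive.append((1 - alpha) * recursive[-1] + alpha * float(x))

print(np.allclose(weighted, data.ewm(alpha=alpha, adjust=True).mean()))
print(np.allclose(recursive, data.ewm(alpha=alpha, adjust=False).mean()))
```

The two variants differ most at the start of the Series, where the adjusted form compensates for the small number of observations; the difference shrinks as more observations accumulate.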

147-6. Usage

147-6-1. Data preparation

None

147-6-2. Code example

# 147. The pandas.Series.ewm method
# 147-1. Compute the exponentially weighted moving average
import pandas as pd
data = pd.Series([3, 5, 6, 8, 10, 11, 24])
ewm_mean = data.ewm(span=2).mean()
print(ewm_mean, end='\n\n')
# 147-2. Compute the exponentially weighted sum
import pandas as pd
data = pd.Series([3, 5, 6, 8, 10, 11, 24])
ewm_sum = data.ewm(span=2).sum()
print(ewm_sum, end='\n\n')
# 147-3. Set the min_periods parameter
import pandas as pd
data = pd.Series([3, 5, 6, 8, 10, 11, 24])
ewm_mean_min = data.ewm(span=2, min_periods=2).mean()
print(ewm_mean_min)

147-6-3. Output

# 147. The pandas.Series.ewm method
# 147-1. Compute the exponentially weighted moving average
# 0     3.000000
# 1     4.500000
# 2     5.538462
# 3     7.200000
# 4     9.074380
# 5    10.359890
# 6    19.457457
# dtype: float64
# 147-2. Compute the exponentially weighted sum
# 0     3.000000
# 1     6.000000
# 2     8.000000
# 3    10.666667
# 4    13.555556
# 5    15.518519
# 6    29.172840
# dtype: float64
# 147-3. Set the min_periods parameter
# 0          NaN
# 1     4.500000
# 2     5.538462
# 3     7.200000
# 4     9.074380
# 5    10.359890
# 6    19.457457
# dtype: float64

148. The pandas.Series.pipe method

148-1. Syntax

# 148. The pandas.Series.pipe method
pandas.Series.pipe(func, *args, **kwargs)
Apply chainable functions that expect Series or DataFrames.

Parameters:
func: function
    Function to apply to the Series/DataFrame. args, and kwargs are passed into func. Alternatively a (callable, data_keyword) tuple where data_keyword is a string indicating the keyword of callable that expects the Series/DataFrame.
*args: iterable, optional
    Positional arguments passed into func.
**kwargs: mapping, optional
    A dictionary of keyword arguments passed into func.

Returns:
the return type of func.

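148-2. Parameters

148-2-1. func (required): a function, or a (callable, data_keyword) tuple, to apply to the Series.

148-2-1-1. If it is a function, it should accept the Series as its first argument.

148-2-1-2. If it is a (callable, data_keyword) tuple, callable is the function and data_keyword is a string naming the keyword argument of callable that should receive the Series.

148-2-2. *args (optional): additional positional arguments passed to func.

148-2-3. **kwargs (optional): additional keyword arguments passed to func.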
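The tuple form of func is useful when the function does not take the Series as its first argument. The sketch below uses a hypothetical helper, scale, whose data argument is named values; both the helper and its parameter names are assumptions made up for this illustration.

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5])


def scale(factor, values):
    """Hypothetical helper: the Series is the second argument, named 'values'."""
    return values * factor


# The tuple tells pipe to pass the Series as the keyword argument 'values';
# the remaining arguments are forwarded to scale unchanged.
result = data.pipe((scale, "values"), factor=3)

print(result.tolist())
```

Without the tuple, data.pipe(scale, 3) would pass the Series as factor, which is not what this helper expects.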

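148-3. Function

Keeps method chains tidy: it passes the Series to a function and returns whatever that function returns.

148-4. Return value

Returns the result of applying func; its type depends on func.

148-5. Notes

pandas.Series.pipe lets user-defined functions take part in a method chain, which makes code easier to read and maintain. A complex operation can be broken into a sequence of small, reusable functions that appear in the order they are applied.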
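The readability gain shows up once several steps are combined. The sketch below compares nested calls with a pipe chain; the three helper functions and the sample data are hypothetical and exist only for this illustration.

```python
import pandas as pd


def drop_negative(series):
    """Keep only the non-negative values."""
    return series[series >= 0]


def rescale(series, factor):
    """Multiply every value by factor."""
    return series * factor


def cap(series, limit):
    """Limit every value to at most limit."""
    return series.clip(upper=limit)


data = pd.Series([-2, 1, 4, -7, 10])

# Nested calls read inside-out.
nested = cap(rescale(drop_negative(data), 10), limit=50)

# The pipe chain reads in the order the steps are applied.
chained = data.pipe(drop_negative).pipe(rescale, 10).pipe(cap, limit=50)

print(chained.tolist())
print(chained.equals(nested))
```

Both forms compute the same result; the chain only changes how the sequence of steps is written.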

148-6. Usage

148-6-1. Data preparation

None

148-6-2. Code example

# 148. The pandas.Series.pipe method
# 148-1. Basic usage
import pandas as pd
data = pd.Series([1, 2, 3, 4, 5])
def add_ten(series):
    return series + 10
result = data.pipe(add_ten)
print(result, end='\n\n')
# 148-2. Pass an extra argument
import pandas as pd
data = pd.Series([1, 2, 3, 4, 5])
def multiply(series, multiplier):
    return series * multiplier
result = data.pipe(multiply, 5)
print(result, end='\n\n')
# 148-3. Use a lambda expression
import pandas as pd
data = pd.Series([1, 2, 3, 4, 5])
result = data.pipe(lambda x: x**2)
print(result)

148-6-3. Output

# 148. The pandas.Series.pipe method
# 148-1. Basic usage
# 0    11
# 1    12
# 2    13
# 3    14
# 4    15
# dtype: int64
# 148-2. Pass an extra argument
# 0     5
# 1    10
# 2    15
# 3    20
# 4    25
# dtype: int64
# 148-3. Use a lambda expression
# 0     1
# 1     4
# 2     9
# 3    16
# 4    25
# dtype: int64

149. The pandas.Series.abs method

149-1. Syntax

# 149. The pandas.Series.abs method
pandas.Series.abs()
Return a Series/DataFrame with absolute numeric value of each element.
This function only applies to elements that are all numeric.

Returns:
abs
    Series/DataFrame containing the absolute value of each element.

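149-2. Parameters

None

149-3. Function

Computes the absolute value of each element of the Series and returns the result as a new Series. It applies only to numeric elements.

149-4. Return value

Returns a new Series containing the absolute value of each element of the original Series.

149-5. Notes

Application scenarios:

149-5-1. Data preprocessing: making all values non-negative before analysis or modeling, when an algorithm is sensitive to negative values and only the magnitude matters.

149-5-2. Financial data: for returns, losses, and similar data, absolute values help analyze the overall volatility or the magnitude of change.

149-5-3. Mathematical computation: distances and lengths are non-negative, so absolute values are used to obtain them from signed differences.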
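As a small illustration of the financial scenario, the sketch below measures the average size of a change regardless of its direction. The return figures are made-up sample values, not real data, and the last line relies on the documented behavior that the absolute value of a complex number is its magnitude.

```python
import pandas as pd

# Hypothetical daily returns; the sign gives the direction of the move.
returns = pd.Series([0.02, -0.05, 0.01, -0.03, 0.04])

# abs() returns a new Series and leaves the original untouched.
magnitude = returns.abs()
mean_abs_change = magnitude.mean()

# For complex values, abs() gives the magnitude sqrt(a^2 + b^2).
complex_magnitude = pd.Series([3 + 4j]).abs().iloc[0]

print(magnitude.tolist())
print(round(mean_abs_change, 6))
print(complex_magnitude)
```

The plain mean of these returns is close to zero because gains and losses cancel out, whereas the mean of the absolute values reflects how large the moves were.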

149-6. Usage

149-6-1. Data preparation

None

149-6-2. Code example

# 149. The pandas.Series.abs method
import pandas as pd
data = pd.Series([-1, 2, -3, 4, -5])
result = data.abs()
print(result)

149-6-3. Output

# 149. The pandas.Series.abs method
# 0    1
# 1    2
# 2    3
# 3    4
# 4    5
# dtype: int64

150. The pandas.Series.all method

150-1. Syntax

# 150. The pandas.Series.all method
pandas.Series.all(axis=0, bool_only=False, skipna=True, **kwargs)
Return whether all elements are True, potentially over an axis.
Returns True unless there is at least one element within a series or along a DataFrame axis that is False or equivalent (e.g. zero or empty).

Parameters:
axis: {0 or 'index', 1 or 'columns', None}, default 0
    Indicate which axis or axes should be reduced. For Series this parameter is unused and defaults to 0.
    0 / 'index': reduce the index, return a Series whose index is the original column labels.
    1 / 'columns': reduce the columns, return a Series whose index is the original index.
    None: reduce all axes, return a scalar.
bool_only: bool, default False
    Include only boolean columns. Not implemented for Series.
skipna: bool, default True
    Exclude NA/null values. If the entire row/column is NA and skipna is True, then the result will be True, as for an empty row/column. If skipna is False, then NA are treated as True, because these are not equal to zero.
**kwargs: any, default None
    Additional keywords have no effect but might be accepted for compatibility with NumPy.

Returns:
scalar or Series
    If level is specified, then, Series is returned; otherwise, scalar is returned.

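150-2. Parameters

150-2-1. axis (optional, default 0): unused for a Series, because a Series is one-dimensional.

150-2-2. bool_only (optional, default False): for a DataFrame, include only boolean columns; not implemented for Series.

150-2-3. skipna (optional, default True): if True, NA/null values are excluded, so a Series that is empty or entirely NA gives True. If False, NA values are treated as True because they are not equal to zero.

150-2-4. **kwargs (optional): additional keywords; they have no effect and are accepted only for compatibility with NumPy.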
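The edge cases around empty and missing data are easy to get wrong, so a short sketch may help. It assumes nothing beyond the skipna behavior described in the syntax section; the variable names are chosen only for this example.

```python
import numpy as np
import pandas as pd

# An empty Series has no False element, so the result is True.
empty_result = pd.Series([], dtype="float64").all()

# With skipna=True (the default) the NaN is dropped, leaving an empty Series.
all_na_skipped = pd.Series([np.nan]).all()

# With skipna=False the NaN is kept and counts as True, since it is not zero.
all_na_kept = pd.Series([np.nan]).all(skipna=False)

# A zero is equivalent to False.
with_zero = pd.Series([1, 0, 2]).all()

print(empty_result, all_na_skipped, all_na_kept, with_zero)
```

Because NaN never makes all() return False, a separate check such as notna().all() is needed when missing values should count as a failure.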

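150-3. Function

Checks whether every element of the Series is True or equivalent to True. It returns True unless at least one element is False or equivalent to False (for example zero or empty).

150-4. Return value

Returns a boolean: True if every element of the Series is True or equivalent to True, otherwise False.

150-5. Notes

Application scenarios:

150-5-1. Data validation: verifying during analysis that a condition holds for all of the data.

150-5-2. Condition checking: checking whether every element of a data set meets a specific condition.

150-5-3. Data cleaning: checking whether all values are non-null or whether they all meet a specific standard.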
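In practice these scenarios usually combine all() with an element-wise comparison that produces a boolean Series. The sketch below uses made-up sample data (ages and scores) purely for illustration.

```python
import pandas as pd

# Hypothetical sample data.
ages = pd.Series([23, 35, 41, 18])
scores = pd.Series([88.0, None, 95.5])

# Validation: does the condition hold for every element?
all_adults = (ages >= 18).all()

# Cleaning: is every value present?
ages_complete = ages.notna().all()
scores_complete = scores.notna().all()

print(all_adults, ages_complete, scores_complete)
```

The comparison or notna() call does the per-element work, and all() reduces the resulting booleans to a single answer.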

150-6. Usage

150-6-1. Data preparation

None

150-6-2. Code example

# 150. The pandas.Series.all method
import pandas as pd
data = pd.Series([True, True, False, 0, 1])
result = data.all()
print(result)

150-6-3. Output

# 150. The pandas.Series.all method
# False