Machine Learning · L3W2: Collaborative Filtering

Time: 2025/7/10 10:10:51 Source: https://blog.csdn.net/2301_80132162/article/details/141034921

Collaborative Filtering

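In the rating matrix Y, each row is indexed by an item (movie) name and each column corresponds to a user.
The basic idea of collaborative filtering is to use the ratings that already exist to predict the ratings that are missing, and then recommend items to each user in order of predicted rating.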

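At its core, the algorithm is still the same machinery as linear regression: each prediction is a linear function of an item's features, fitted by minimizing a regularized squared error.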
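To make the "predict, then recommend" idea concrete, here is a minimal NumPy sketch. It assumes the parameters have already been learned and uses the linear prediction $\mathbf{w}^{(j)} \cdot \mathbf{x}^{(i)} + b^{(j)}$ from the cost function. The toy values and the helper name `recommend` are made up for illustration and are not part of the original post.

```python
import numpy as np

# Hypothetical toy data: 4 movies, 3 users, 2 features (all values made up).
X = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.9],
              [0.2, 0.8]])      # item features, shape (num_movies, num_features)
W = np.array([[5.0, 0.0],
              [0.0, 5.0],
              [2.5, 2.5]])      # user parameters, shape (num_users, num_features)
b = np.zeros((1, 3))            # user biases, shape (1, num_users)
R = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0],
              [0, 1, 1]])       # R[i, j] = 1 if user j has rated movie i


def recommend(X, W, b, R, user, top_k=2):
    """Rank the movies that `user` has not rated by predicted rating."""
    pred = X @ W.T + b                          # predictions for every (movie, user) pair
    unrated = np.flatnonzero(R[:, user] == 0)   # indices of movies without a rating
    order = unrated[np.argsort(-pred[unrated, user])]  # highest prediction first
    top = order[:top_k]
    return top, pred[top, user]


movies, scores = recommend(X, W, b, R, user=0)
print(movies, scores)
```

Each column of `X @ W.T + b` is one user's linear regression evaluated on every movie, which is why the per-user problem looks just like ordinary linear regression.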

Cost function

$J(\mathbf{x}^{(0)},...,\mathbf{x}^{(n_m-1)},\mathbf{w}^{(0)},b^{(0)},...,\mathbf{w}^{(n_u-1)},b^{(n_u-1)})= \left[ \frac{1}{2}\sum_{(i,j):r(i,j)=1}(\mathbf{w}^{(j)} \cdot \mathbf{x}^{(i)} + b^{(j)} - y^{(i,j)})^2 \right]+ \underbrace{\left[\frac{\lambda}{2}\sum_{j=0}^{n_u-1}\sum_{k=0}^{n-1}(\mathbf{w}^{(j)}_k)^2+ \frac{\lambda}{2}\sum_{i=0}^{n_m-1}\sum_{k=0}^{n-1}(\mathbf{x}_k^{(i)})^2\right]}_{\text{regularization}}$
The first summation in the cost function above runs over all pairs $(i, j)$ for which $r(i, j)$ equals $1$, and could equivalently be written:

$\left[ \frac{1}{2}\sum_{j=0}^{n_u-1} \sum_{i=0}^{n_m-1}r(i,j)*(\mathbf{w}^{(j)} \cdot \mathbf{x}^{(i)} + b^{(j)} - y^{(i,j)})^2 \right] +\text{regularization}$
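The equivalence of the two forms can be checked numerically. The sketch below assumes random toy data, and the function names `cost_loop` and `cost_masked` are illustrative. It computes the cost once by looping only over pairs with $r(i,j)=1$ and once by multiplying every squared error by $r(i,j)$.

```python
import numpy as np

rng = np.random.default_rng(0)
num_movies, num_users, num_features = 5, 4, 3
X = rng.normal(size=(num_movies, num_features))   # item features
W = rng.normal(size=(num_users, num_features))    # user parameters
b = rng.normal(size=(1, num_users))               # user biases
Y = rng.integers(0, 6, size=(num_movies, num_users)).astype(float)  # ratings
R = rng.integers(0, 2, size=(num_movies, num_users))                # rated mask
lambda_ = 1.5


def cost_loop(X, W, b, Y, R, lambda_):
    """First form: sum only over pairs (i, j) with r(i, j) = 1."""
    J = 0.0
    for i in range(X.shape[0]):
        for j in range(W.shape[0]):
            if R[i, j] == 1:
                J += 0.5 * (W[j] @ X[i] + b[0, j] - Y[i, j]) ** 2
    return J + (lambda_ / 2) * (np.sum(W ** 2) + np.sum(X ** 2))


def cost_masked(X, W, b, Y, R, lambda_):
    """Second form: sum over all pairs, with each error multiplied by r(i, j)."""
    err = (X @ W.T + b - Y) * R      # unrated entries are zeroed out
    return 0.5 * np.sum(err ** 2) + (lambda_ / 2) * (np.sum(W ** 2) + np.sum(X ** 2))


print(cost_loop(X, W, b, Y, R, lambda_), cost_masked(X, W, b, Y, R, lambda_))
```

Because $r(i,j)$ is either 0 or 1, masking the error before squaring gives the same result as masking the squared error, which is what makes the vectorized form possible.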

Code

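The cost is essentially a single matrix computation, so a summation such as np.sum() (tf.reduce_sum() in the TensorFlow version here) keeps the code short.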

The custom cost function cofi_cost_func_v

import tensorflow as tf

def cofi_cost_func_v(X, W, b, Y, R, lambda_):
    """
    Returns the cost for the content-based filtering
    Vectorized for speed. Uses tensorflow operations to be compatible with custom training loop.
    Args:
      X (ndarray (num_movies,num_features)): matrix of item features
      W (ndarray (num_users,num_features)) : matrix of user parameters
      b (ndarray (1, num_users))           : vector of user parameters
      Y (ndarray (num_movies,num_users))   : matrix of user ratings of movies
      R (ndarray (num_movies,num_users))   : matrix, where R(i, j) = 1 if the i-th movies was rated by the j-th user
      lambda_ (float): regularization parameter
    Returns:
      J (float) : Cost
    """
    # Prediction error for every (movie, user) pair; multiplying by R zeroes out unrated entries
    j = (tf.linalg.matmul(X, tf.transpose(W)) + b - Y)*R
    # Squared-error term plus L2 regularization on X and W (b is not regularized)
    J = 0.5 * tf.reduce_sum(j**2) + (lambda_/2) * (tf.reduce_sum(X**2) + tf.reduce_sum(W**2))
    return J

Computing the gradients with TensorFlow

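iterations = 200
lambda_ = 1
# X, W and b must be tf.Variable objects, and `optimizer` must be an optimizer
# created beforehand; neither definition is shown in this excerpt.
for iter in range(iterations):
    # Record the operations used to compute the cost
    with tf.GradientTape() as tape:
        cost_value = cofi_cost_func_v(X, W, b, Y, R, lambda_)
    # Retrieve the gradients of the cost with respect to X, W and b
    grads = tape.gradient( cost_value, [X,W,b] )
    # Apply one optimizer step to update X, W and b
    optimizer.apply_gradients( zip(grads, [X,W,b]) )
    # Log progress periodically
    if iter % 5 == 0:
        print(f"Training loss at iteration {iter}: {cost_value:0.1f}")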
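To see what the gradient tape is computing, the gradients of this cost can also be written out by hand. The NumPy sketch below is an illustration rather than the post's method. It assumes random toy data, swaps in plain gradient descent for whatever optimizer the original uses, and the names `cofi_cost_np`, `cofi_grads_np` and `learning_rate` are hypothetical. With $E = (XW^\top + b - Y) \odot R$, the gradients are $\partial J/\partial X = EW + \lambda X$, $\partial J/\partial W = E^\top X + \lambda W$, and $\partial J/\partial b$ is the column sum of $E$.

```python
import numpy as np


def cofi_cost_np(X, W, b, Y, R, lambda_):
    """NumPy version of the vectorized cost (same formula as cofi_cost_func_v)."""
    err = (X @ W.T + b - Y) * R
    return 0.5 * np.sum(err ** 2) + (lambda_ / 2) * (np.sum(X ** 2) + np.sum(W ** 2))


def cofi_grads_np(X, W, b, Y, R, lambda_):
    """Hand-derived gradients of the cost with respect to X, W and b."""
    err = (X @ W.T + b - Y) * R                # masked error, (num_movies, num_users)
    dX = err @ W + lambda_ * X                 # (num_movies, num_features)
    dW = err.T @ X + lambda_ * W               # (num_users, num_features)
    db = np.sum(err, axis=0, keepdims=True)    # (1, num_users); b is not regularized
    return dX, dW, db


# Toy data (assumed for illustration only)
rng = np.random.default_rng(1)
num_movies, num_users, num_features = 5, 4, 3
X = rng.normal(size=(num_movies, num_features))
W = rng.normal(size=(num_users, num_features))
b = np.zeros((1, num_users))
Y = rng.integers(0, 6, size=(num_movies, num_users)).astype(float)
R = rng.integers(0, 2, size=(num_movies, num_users))

iterations = 200
lambda_ = 1.0
learning_rate = 0.005   # assumed step size for plain gradient descent

costs = []
for it in range(iterations):
    costs.append(cofi_cost_np(X, W, b, Y, R, lambda_))
    dX, dW, db = cofi_grads_np(X, W, b, Y, R, lambda_)
    X -= learning_rate * dX
    W -= learning_rate * dW
    b -= learning_rate * db

print(f"cost before: {costs[0]:0.1f}, after: {costs[-1]:0.1f}")
```

Automatic differentiation saves you from deriving and maintaining these expressions by hand, which matters once the cost function changes.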