Time: 2025/7/14 17:26:25  Source: https://blog.csdn.net/yaojiawan/article/details/145845416
This post continues exploring how to use large language models (LLMs) to build a podcast recommendation system.

A MongoDB database stores the basic data:

  1. Podcast table (Podcast)
  2. Episode table (Episodes)
  3. User table (User)

User table (User)

DeepSeek notes: a suggested record format for a Spotify listener

1. Basic Account Information

  • User ID: A unique identifier for the user (e.g., a Spotify-generated UUID).
  • Username: The display name chosen by the user.
  • Email Address: The email associated with the account.
  • Password: Encrypted for security.
  • Account Type: Free (ad-supported) or Premium (paid subscription).
  • Country/Region: The user's location, which may affect available content.
  • Date of Birth: Used for age verification and personalized content.
  • Account Creation Date: When the user signed up for Spotify.

2. Subscription and Payment Details

  • Subscription Status: Active, canceled, or trial.
  • Payment Method: Credit card, PayPal, etc.
  • Billing History: Records of past payments.
  • Renewal Date: For Premium users.

3. Usage and Activity Data

  • Listening History: Tracks, albums, and playlists the user has played.
  • Recently Played: A list of recently listened-to songs.
  • Playlists Created: User-generated playlists.
  • Followed Artists/Users: Artists or other users the user follows.
  • Liked Songs: Songs saved to the "Liked Songs" library.
  • Podcasts Subscribed To: Podcasts the user follows.
  • Device Information: Devices used to access Spotify (e.g., mobile, desktop, smart speaker).

4. Preferences and Settings

  • Language Preference: The user's chosen language for the app.
  • Privacy Settings: Whether the user's activity is public or private.
  • Audio Quality Settings: Streaming quality (e.g., low, normal, high, very high).
  • Explicit Content Filter: Whether explicit content is allowed.
  • Social Sharing Settings: Whether the user allows sharing activity on social media.

5. Analytics and Recommendations

  • Personalized Recommendations: Generated based on listening habits (e.g., Discover Weekly, Daily Mixes).
  • Top Tracks/Artists: Lists of the user's most-played songs and artists.
  • Listening Trends: Data on when and how often the user listens to music.

6. Security and Privacy

  • Two-Factor Authentication (2FA): Whether enabled.
  • Login History: Records of recent logins and devices used.
  • Data Sharing Preferences: Whether the user allows Spotify to share data with third parties.

From the information above, we take a subset of the most important fields to build a user record.

User

UserSchema = {
  UserID: String,
  Username: String,
  Email_Address: String,
  Password: String,
  CountryRegion: String,
  Date_of_Birth: String,
  Language: String,
  Account_Creation_Date: String,
}

Listening history (UserListenHistory)

UserListenHistorySchema = {
  episodes_id: String,
  listen_time: String,      // time of listening
  completion_rate: Number,  // completion rate (percent)
}

Followed podcasts (UserFollowingPodcast)

UserFollowingPodcastSchema = {
  Podcast_ID: String,
  Following_time: String,  // time the podcast was followed
}

The History, Follow, and Like lists can be stored as arrays inside the listener (user) table.

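UserSchema = {
  UserID: String,
  Username: String,
  Email_Address: String,
  Password: String,
  CountryRegion: String,
  Date_of_Birth: String,
  Language: String,
  Account_Creation_Date: String,
  History: [HistorySchema],  // array of listening-history records
  Follows: [followsSchema],  // array of followed podcasts
  Likes: [likesSchema],      // array of likes
}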
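To make the embedded-array layout concrete, here is a minimal Python sketch of one user document as it might be written to MongoDB. The helper names `new_user` and `add_listen` and the sample values are assumptions made for illustration; the field names follow the schemas above, and `Password` and `Date_of_Birth` are left out to keep the sketch short.

```python
import uuid
from datetime import datetime, timezone


def new_user(username: str, email: str, country: str, language: str) -> dict:
    """Build a user document with empty History/Follows/Likes arrays."""
    return {
        "UserID": str(uuid.uuid4()),
        "Username": username,
        "Email_Address": email,
        "CountryRegion": country,
        "Language": language,
        "Account_Creation_Date": datetime.now(timezone.utc).isoformat(),
        "History": [],  # listening-history records
        "Follows": [],  # followed podcasts
        "Likes": [],    # likes
    }


def add_listen(user: dict, episode_id: str, completion_rate: float) -> None:
    """Append one listening-history record to the user's History array."""
    if not 0.0 <= completion_rate <= 100.0:
        raise ValueError("completion_rate is a percentage between 0 and 100")
    user["History"].append({
        "episodes_id": episode_id,
        "listen_time": datetime.now(timezone.utc).isoformat(),
        "completion_rate": completion_rate,
    })


# Hypothetical sample data.
user = new_user("alice", "alice@example.com", "US", "en")
add_listen(user, "episode-001", 87.5)
```

Keeping the lists inside the user document means one read returns a listener's whole profile, at the cost of documents that grow with every listen.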

Podcast table (Podcast)

podcastSchema = {
  podcast: String,
  uuid: String,
  title: String,
  image: String,
  description: String,
  language: String,
  categories: String,
  website: String,
  itunes_id: String,
  follows: Number,
}

Episode table (Episodes)

episodeSchema = {
  audio: String,
  audio_length: String,
  description: String,
  pub_date: String,
  uuid: String,
  podcast_uuid: String,
  likes: Number,
}

Recommendation algorithm

   Similarity retrieval is implemented with an embedding model and a vector database.

Basic principles

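  1. Users find episodes they like through similarity search, using keywords or natural-language prompts. Once a user has listened to an episode, it is added to their listening history.
  2. Compute the similarity between the user's features and each episode's features, and list the 10 most similar podcast episodes.
  3. Take the 10 episodes the user listened to most recently and find the episodes similar to each of them, for example 5 per episode, which gives 10*5=50 new podcast episodes.
  4. Find other users whose features are similar to this user's, list the 10 most similar users, and take episodes those users have listened to (2 per user), which gives 20 recommended episodes.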

 Of these, (2) and (4) are hard to apply while there are still few listeners, so it makes sense to start with (1) and (3).
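Principle (3) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the post's implementation: episode embeddings are assumed to be already computed and held in a plain dict, the function name `recommend_from_history` is hypothetical, and a brute-force cosine scan stands in for a vector-database query.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_from_history(history_ids, vectors, recent=10, per_episode=5):
    """history_ids: episode ids, most recent first.
    vectors: dict mapping episode id -> embedding vector."""
    seen = set(history_ids)  # never recommend what was already heard
    recommended = []
    for ep_id in history_ids[:recent]:
        query = vectors[ep_id]
        scored = [
            (cosine(query, vec), other_id)
            for other_id, vec in vectors.items()
            if other_id not in seen
        ]
        scored.sort(reverse=True)
        for _, other_id in scored[:per_episode]:
            recommended.append(other_id)
            seen.add(other_id)  # keep the final list free of duplicates
    return recommended


# Hypothetical 2-dimensional embeddings, for illustration only.
vectors = {
    "a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0],
    "d": [0.1, 0.9], "e": [0.7, 0.7],
}
picks = recommend_from_history(["a", "c"], vectors, recent=2, per_episode=1)
```

With the defaults `recent=10` and `per_episode=5` the function returns at most 50 episodes, matching the 10*5 count in the list above; it returns fewer when the catalogue runs out of unseen episodes.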

Implementation

Collecting the podcast and episode datasets

Dataset source:

Building a Podcast Recommendation Engine | Kaggle

Storing the dataset in MongoDB

Code
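As a minimal sketch of this step (not the author's original code), the snippet below maps CSV rows to documents shaped like a subset of podcastSchema and hands them to a collection object. The CSV column names are assumptions about the dataset's layout, and `rows_to_podcasts` and `store` are hypothetical helper names. `store` only requires an object with an `insert_many` method, which a pymongo collection provides.

```python
import csv
import io


def field(row: dict, name: str) -> str:
    """Return a trimmed CSV field, or '' when the column is missing or empty."""
    return (row.get(name) or "").strip()


def rows_to_podcasts(csv_text: str) -> list[dict]:
    """Map CSV rows to documents shaped like a subset of podcastSchema."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        docs.append({
            "uuid": field(row, "uuid"),
            "title": field(row, "title"),
            "image": field(row, "image"),
            "description": field(row, "description"),
            "language": field(row, "language"),
            "categories": field(row, "categories"),
            "website": field(row, "website"),
            "itunes_id": field(row, "itunes_id"),
            "follows": 0,  # no followers yet at import time
        })
    return docs


def store(docs: list[dict], collection) -> int:
    """Insert documents into any collection exposing insert_many."""
    if docs:
        collection.insert_many(docs)
    return len(docs)


# Hypothetical sample row, for illustration only.
sample = "uuid,title,description,language\nu1, Tech Talk ,Weekly tech news,English\n"
docs = rows_to_podcasts(sample)

# With pymongo installed and a local server running (the connection string,
# database name, and collection name below are assumptions):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://localhost:27017")["podcast_db"]["podcasts"]
#   store(docs, coll)
```

Separating the row mapping from the insert keeps the mapping testable without a running database.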

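Building the vector database

The vector database is built from the descriptions of the podcasts and episodes, so that similarity search can be run against it later.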
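The idea can be illustrated with a small in-memory stand-in. Everything here is an assumption for illustration: `toy_embed` is a bag-of-words count vector rather than a real embedding model, and `VectorIndex` is a hypothetical class that mimics the add/search shape of a vector database. A real system would replace both with an actual embedding model and vector store.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a sparse bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """In-memory stand-in for a vector database."""

    def __init__(self, embed=toy_embed):
        self.embed = embed
        self.items = []  # list of (item_id, vector)

    def add(self, item_id: str, description: str) -> None:
        """Embed a description and store it under item_id."""
        self.items.append((item_id, self.embed(description)))

    def search(self, query: str, k: int = 10) -> list[str]:
        """Return the ids of the k stored items most similar to the query."""
        q = self.embed(query)
        scored = sorted(
            ((cosine(q, vec), item_id) for item_id, vec in self.items),
            reverse=True,
        )
        return [item_id for _, item_id in scored[:k]]


# Hypothetical episode descriptions, for illustration only.
index = VectorIndex()
index.add("ep1", "A weekly show about space exploration and astronomy")
index.add("ep2", "Cooking tips and kitchen recipes for busy families")
top = index.search("space astronomy", k=1)
```

Because the embedding function is passed into the index, swapping the toy version for a model-backed one would not change the indexing or search code.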

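Generating the user dataset

A user table is generated programmatically. Each user's listening history is generated through episode similarity queries.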

Code
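One way to sketch this step (not the author's original code): pick a random starting episode for each synthetic user, then fill the rest of the history with the episodes most similar to it. The names `similar_episodes` and `generate_users`, the tag-overlap (Jaccard) similarity used in place of an embedding query, and the `BASE_DATE` anchor are all assumptions made for illustration.

```python
import random
import uuid
from datetime import datetime, timedelta

BASE_DATE = datetime(2025, 1, 1)  # arbitrary anchor for synthetic listen times


def similar_episodes(seed_id, episode_tags, k):
    """Hypothetical similarity query: rank other episodes by Jaccard tag overlap."""
    seed_tags = episode_tags[seed_id]
    scored = []
    for other_id, tags in episode_tags.items():
        if other_id == seed_id:
            continue
        union = seed_tags | tags
        score = len(seed_tags & tags) / len(union) if union else 0.0
        scored.append((score, other_id))
    scored.sort(reverse=True)
    return [other_id for _, other_id in scored[:k]]


def generate_users(n, episode_tags, history_len=5, seed=0):
    """Create n synthetic users whose history clusters around one random episode."""
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    episode_ids = sorted(episode_tags)
    users = []
    for i in range(n):
        first = rng.choice(episode_ids)
        listened = [first] + similar_episodes(first, episode_tags, history_len - 1)
        history = [
            {
                "episodes_id": ep_id,
                "listen_time": (BASE_DATE - timedelta(days=rng.randint(0, 30))).isoformat(),
                "completion_rate": rng.randint(10, 100),
            }
            for ep_id in listened
        ]
        users.append({
            "UserID": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
            "Username": f"user{i:03d}",
            "History": history,
        })
    return users


# Hypothetical episode tags, for illustration only.
episode_tags = {
    "e1": {"tech", "ai"},
    "e2": {"tech", "ai", "news"},
    "e3": {"cooking"},
    "e4": {"ai"},
}
users = generate_users(3, episode_tags, history_len=3)
```

Seeding each history from a single episode gives every synthetic user a coherent taste, which is what later similarity-based recommendations need in order to be checked by eye.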

Results
