17.1 Using a Web API
Use Python's Requests library to send an API request. The basic pattern is:
    import requests

    # Ask GitHub's search API for Python repositories, sorted by star count.
    url = 'https://api.github.com/search/repositories?q=language:python&sort=stars'
    headers = {'Accept': 'application/vnd.github.v3+json'}
    response = requests.get(url, headers=headers)
    print(f"Status code: {response.status_code}")
When handling the API response, check the status code first (200 means success), then convert the response body to a dictionary:
    response_dict = response.json()
    print(f"Total repositories: {response_dict['total_count']}")
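Example of parsing the repository information:
    # Each entry in 'items' is a dictionary describing one repository.
    repo_dicts = response_dict['items']
    for repo in repo_dicts:
        print(f"Name: {repo['name']}")
        print(f"Owner: {repo['owner']['login']}")
        print(f"Stars: {repo['stargazers_count']}")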
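To try the parsing step without a network call, the same loop can be run over a hand-written dictionary shaped like the search response. This is only a sketch: the sample payload and the helper name summarize_repos are illustrative assumptions, not real GitHub data or part of any library.

```python
# Hypothetical sample shaped like a GitHub search response (not real data).
sample_response = {
    "total_count": 2,
    "items": [
        {"name": "alpha", "owner": {"login": "alice"}, "stargazers_count": 120},
        {"name": "beta", "owner": {"login": "bob"}, "stargazers_count": 45},
    ],
}


def summarize_repos(response_dict):
    """Return (name, owner, stars) tuples for each repository in the response."""
    summaries = []
    for repo in response_dict["items"]:
        summaries.append(
            (repo["name"], repo["owner"]["login"], repo["stargazers_count"])
        )
    return summaries


for name, owner, stars in summarize_repos(sample_response):
    print(f"Name: {name} | Owner: {owner} | Stars: {stars}")
```

Returning tuples instead of printing inside the loop keeps the parsing logic reusable, for example as input to the chart in section 17.2.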
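To monitor the API rate limit, read the response headers:
    # Number of requests still allowed in the current rate-limit window.
    rate_limit = response.headers['X-RateLimit-Remaining']
    print(f"Remaining requests: {rate_limit}")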
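GitHub also sends an X-RateLimit-Reset header holding the Unix timestamp at which the quota resets. The sketch below shows one way the two headers might be combined to decide how long to wait. The helper name seconds_until_reset and the header values are made up for illustration, and a plain dictionary stands in for response.headers.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return (remaining, wait_seconds) from GitHub-style rate-limit headers.

    wait_seconds is 0 while requests remain; otherwise it is the time left
    until the reset timestamp (never negative).
    """
    if now is None:
        now = time.time()
    remaining = int(headers.get("X-RateLimit-Remaining", 0))
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    if remaining > 0:
        return remaining, 0
    return remaining, max(0, reset_at - int(now))


# Made-up header values for illustration; real ones come from response.headers.
example_headers = {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1700000060"}
remaining, wait = seconds_until_reset(example_headers, now=1700000000)
print(f"Remaining requests: {remaining}, wait {wait} s before retrying")
```

Passing now explicitly keeps the function deterministic in tests. In a real script it can be omitted so the current time is used.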
17.2 Visualizing Repository Data with Plotly
Install Plotly: pip install plotly
Core code for generating a bar chart:
    from plotly.graph_objs import Bar
    from plotly import offline

    # repo_dicts is the list of repository dictionaries from section 17.1.
    repo_names = [repo['name'] for repo in repo_dicts]
    stars = [repo['stargazers_count'] for repo in repo_dicts]

    data = [Bar(x=repo_names, y=stars, marker={'color': 'rgb(60, 100, 150)'})]
    layout = {
        'title': 'Most-Starred Python Projects on GitHub',
        'xaxis': {'title': 'Repository'},
        'yaxis': {'title': 'Stars'},
    }

    # Write the chart to an HTML file and open it in the browser.
    fig = {'data': data, 'layout': layout}
    offline.plot(fig, filename='python_repos.html')
Add custom tooltips:
    # Show each project's owner and description when hovering over its bar.
    hover_texts = [f"{repo['owner']['login']}<br>{repo['description']}" for repo in repo_dicts]
    data[0]['hovertext'] = hover_texts
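GitHub returns null for a repository that has no description, so the f-string above would display the literal text None in the tooltip. A minimal sketch of one way to guard against that follows. The helper name build_hover_texts, the placeholder wording, and the sample records are assumptions made for illustration.

```python
def build_hover_texts(repo_dicts):
    """Build one '<owner><br><description>' tooltip string per repository."""
    hover_texts = []
    for repo in repo_dicts:
        owner = repo["owner"]["login"]
        # Fall back to a placeholder when the description is missing or None.
        description = repo.get("description") or "No description provided"
        hover_texts.append(f"{owner}<br>{description}")
    return hover_texts


# Hypothetical records for illustration only.
sample_repos = [
    {"owner": {"login": "alice"}, "description": "A demo project"},
    {"owner": {"login": "bob"}, "description": None},
]
print(build_hover_texts(sample_repos))
```

The returned list can be assigned to data[0]['hovertext'] exactly as in the original snippet.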
Add clickable links to the chart:
    # Replace the x-axis labels with HTML anchors that point to each repository.
    repo_links = [f"<a href='{repo['html_url']}'>{repo['name']}</a>" for repo in repo_dicts]
    data[0]['x'] = repo_links
17.3 Hacker News API Example
Call the Hacker News API to fetch the top stories:
    # The endpoint returns a JSON list of story IDs.
    url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
    response = requests.get(url)
    submission_ids = response.json()
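Extract the details of each story:
    # Look up the first 10 stories one at a time by ID.
    for submission_id in submission_ids[:10]:
        url = f'https://hacker-news.firebaseio.com/v0/item/{submission_id}.json'
        response = requests.get(url)
        data = response.json()
        print(f"Title: {data.get('title')}")
        print(f"Link: {data.get('url')}")
        # 'descendants' holds the comment count; default to 0 when it is absent.
        print(f"Comments: {data.get('descendants', 0)}")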
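Key formulas and notes:
1. API rate-limit formula: requests per minute = total rate limit / time window (in minutes)
2. Pagination parameters: use the page and per_page parameters to control how much data is returned (e.g. ?page=2&per_page=100)
3. Response-time optimization: cache frequently requested data (for example, store already-fetched Hacker News stories in a dictionary)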
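The three notes above can be sketched with the standard library alone. Everything named here is hypothetical: the helper functions, the api.example.com URL, the 60-requests-per-60-minutes figures, and the fake_fetch stand-in that replaces a real network call so the caching behavior is visible.

```python
from urllib.parse import urlencode


def requests_per_minute(rate_limit_total, window_minutes):
    """Note 1: average request budget per minute."""
    return rate_limit_total / window_minutes


def paged_url(base_url, page, per_page):
    """Note 2: append page and per_page to a URL that has no query string yet."""
    return f"{base_url}?{urlencode({'page': page, 'per_page': per_page})}"


def make_cached_fetcher(fetch):
    """Note 3: wrap fetch(item_id) so each ID is fetched at most once."""
    cache = {}

    def cached_fetch(item_id):
        if item_id not in cache:
            cache[item_id] = fetch(item_id)
        return cache[item_id]

    return cached_fetch


# Illustrative numbers only: 60 requests allowed per 60-minute window.
print(requests_per_minute(60, 60))
print(paged_url("https://api.example.com/items", page=2, per_page=100))

calls = []


def fake_fetch(item_id):
    """Stand-in for a network request; records each call it receives."""
    calls.append(item_id)
    return {"id": item_id, "title": f"Story {item_id}"}


fetch_story = make_cached_fetcher(fake_fetch)
fetch_story(1)
fetch_story(1)  # Served from the cache; fake_fetch is not called again.
fetch_story(2)
print(f"Underlying fetches: {len(calls)}")
```

In a real script, fake_fetch would be replaced by a function that requests the Hacker News item URL for the given ID.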
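Code logic essentials:
- Exception handling should wrap each request in a try-except block that catches requests.exceptions.RequestException
- When parsing JSON, check that keys exist (use data.get('key') instead of data['key'] to avoid a KeyError)
- Once the chart is saved as an HTML file, it can be opened and explored interactively in a browser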
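The first two points above might look like the following sketch. The function names fetch_json and describe_story, the 10-second timeout, and the placeholder strings are assumptions chosen for illustration. ValueError is caught as well because older Requests versions raise it when the body is not valid JSON.

```python
import requests


def fetch_json(url, timeout=10):
    """Return the decoded JSON body, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # Raise HTTPError for 4xx/5xx responses.
        return response.json()
    except (requests.exceptions.RequestException, ValueError) as error:
        # RequestException covers connection errors, timeouts, bad URLs,
        # and HTTP errors; ValueError covers an undecodable JSON body.
        print(f"Request failed: {error}")
        return None


def describe_story(data):
    """Format a story dict, tolerating missing keys via dict.get()."""
    title = data.get("title", "(no title)")
    link = data.get("url", "(no link)")
    comments = data.get("descendants", 0)
    return f"{title} | {link} | {comments} comments"


# A record with no 'url' or 'descendants' key still formats cleanly.
print(describe_story({"title": "Ask HN: Example"}))
```

Returning None on failure lets the caller skip a bad item and continue the loop instead of stopping the whole script.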
Workflow summary: fetch data through the API → clean and parse the JSON → analyze the key metrics → present the results with a visualization tool.