Implementation details

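  • HTTP requests are sent with the requests library
  • HTML is parsed with BeautifulSoup, using lxml as the parser
  • For each currency, the "spot exchange buying rate" (现汇买入价, column 2) is preferred; if that cell is empty, the "cash buying rate" (现钞买入价, column 3) is used instead; if both are empty, the code stores 0
  • The final rate is divided by 100 (for example, the page shows 716.09 and the stored value is 7.1609)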
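The selection rule in the last two bullets can be sketched as a small standalone function. This is only a minimal sketch, not the post's own code: `pick_rate` and its parameter names are hypothetical, and the inputs are assumed to be the raw cell strings taken from columns 2 and 3 of one table row.

```python
def pick_rate(spot_buy: str, cash_buy: str) -> float:
    """Return the stored rate for one row (hypothetical helper).

    spot_buy: text of the spot exchange buying rate cell (column 2)
    cash_buy: text of the cash buying rate cell (column 3)
    """
    spot_buy = spot_buy.strip()
    cash_buy = cash_buy.strip()
    if spot_buy:
        quoted = float(spot_buy)   # preferred value
    elif cash_buy:
        quoted = float(cash_buy)   # fallback when column 2 is empty
    else:
        quoted = 0.0               # neither cell holds a value
    # The page value is divided by 100 before it is stored
    return quoted / 100


print(pick_rate("716.09", "710.26"))  # uses the spot buying rate
print(pick_rate("", "710.26"))        # falls back to the cash buying rate
```

Stripping whitespace before the emptiness check is a defensive choice here, since table cells often carry stray spaces or line breaks.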

The full code is as follows:

import requests
from bs4 import BeautifulSoup
from requests.exceptions import RequestException


def getHTML(url):
    """Fetch the page and return its text, or None if the request fails."""
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
        }
        response = requests.get(url, timeout=30, headers=headers)
        # Use the encoding detected from the body rather than the header default
        response.encoding = response.apparent_encoding
        if response.status_code == 200:
            return response.text
        return None
    except RequestException:
        return None

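def getBank(soup):
    """Return a dict mapping currency name to rate (page value divided by 100)."""
    # The rate data is in the second table on the page
    table = soup.find_all('table')[1]
    dataAll = {}
    for all_tr in table.find_all('tr'):  # find_all returns a list of every tr
        all_td = all_tr.find_all('td')
        # Skip header rows and rows too short to hold both buying rates
        if len(all_td) >= 3:
            currency_name = all_td[0].text.strip()
            spot_buy = all_td[1].text.strip()  # spot exchange buying rate (column 2)
            cash_buy = all_td[2].text.strip()  # cash buying rate (column 3)
            rate = 0
            if spot_buy != '':
                rate = float(spot_buy)
            elif cash_buy != '':
                rate = float(cash_buy)
            dataAll[currency_name] = rate / 100
    return dataAll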

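def get_rate_date():
    url = "https://www.bankofchina.com/sourcedb/whpj/"
    html = getHTML(url)
    if html is None:
        # The request failed or returned a non-200 status, so there is nothing to parse
        return {}
    # html is already a decoded str; lxml is the parser BeautifulSoup uses to build the tree
    soup = BeautifulSoup(html, 'lxml')
    Bankinfo = getBank(soup)
    print(Bankinfo)
    return Bankinfo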


    
if __name__ == '__main__':
    safe = get_rate_date()
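
The table-parsing step can be tried without any network access by feeding it a hand-written HTML snippet. The following is a rough sketch under stated assumptions: `SAMPLE_HTML` is made-up data that only mimics the layout described above (rates in the second table, buying rates in columns 2 and 3), `parse_rates` is a hypothetical name, and Python's built-in `html.parser` is swapped in for lxml so the example has one fewer dependency.

```python
from bs4 import BeautifulSoup

# Hand-written sample, NOT real data from the page; only the layout is mimicked.
SAMPLE_HTML = """
<table><tr><td>navigation</td></tr></table>
<table>
  <tr><th>Currency</th><th>Spot buy</th><th>Cash buy</th></tr>
  <tr><td>USD</td><td>716.09</td><td>710.26</td></tr>
  <tr><td>XXX</td><td></td><td>50.00</td></tr>
</table>
"""


def parse_rates(html):
    """Parse the second table into {currency: rate / 100} (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find_all("table")[1]  # assumed: rates live in the second table
    rates = {}
    for row in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) < 3:
            continue  # header rows use th, so they yield no td cells
        quoted = cells[1] or cells[2]  # column 2 first, then column 3
        rates[cells[0]] = float(quoted) / 100 if quoted else 0.0
    return rates


rates = parse_rates(SAMPLE_HTML)
print(rates)
```

If the real page changes its layout, for instance by moving the rate table, the index `[1]` is the first thing that would need adjusting.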