Python code example: getting the href value of every a tag on a page

Date: 2022-06-25 02:03:26 Editor: 袖梨 Source: 一聚教程网

This article shares a Python code example that gets the href value of every a tag on a page, using BeautifulSoup. The code is commented step by step and is offered here for reference.

The code is as follows:

# -*- coding:utf-8 -*-
# Python 3 (urllib.request does not exist in Python 2.7)
# http://tieba.ba**i*du.com/p/2460150866
# Tag operations

from bs4 import BeautifulSoup
import urllib.request
import re

# To work from a URL instead, the page can be read like this:
#html_doc = "http://tieba.ba**i*du.com/p/2460150866"
#req = urllib.request.Request(html_doc)
#webpage = urllib.request.urlopen(req)
#html = webpage.read()

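# Sample markup, as in the Beautiful Soup documentation's "three sisters" example
html="""
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;

and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""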
soup = BeautifulSoup(html, 'html.parser')   # document object

# soup.a finds only the first a tag
#print(soup.a)

for k in soup.find_all('a'):
    print(k)
    print(k['class'])   # the a tag's class attribute
    print(k['id'])      # the a tag's id value
    print(k['href'])    # the a tag's href value
    print(k.string)     # the a tag's string
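
To make clear what each access in the loop above returns, here is a minimal sketch on a hypothetical two-tag document (the `sample` string and the variable names are assumptions for illustration, not part of the original code). It also shows that square-bracket access fails when an attribute is absent.

```python
from bs4 import BeautifulSoup

# Hypothetical document, used only for this illustration.
sample = ('<p><a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>'
          '<a>no attributes</a></p>')
soup = BeautifulSoup(sample, 'html.parser')
first, second = soup.find_all('a')

print(first['class'])   # class is multi-valued, so this is a list
print(first['href'])    # a plain string
print(first.string)     # the text inside the tag

# Square-bracket access raises KeyError when the attribute is missing.
try:
    second['href']
    missing_raises = False
except KeyError:
    missing_raises = True
print(missing_raises)
```

On a real page many a tags lack a class or id, so a loop that indexes all three attributes is likely to stop with a KeyError unless every link carries them.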

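The following pattern usually works as well. It uses get(), which returns None instead of raising KeyError when a tag has no href attribute.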

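from urllib.request import urlopen
from bs4 import BeautifulSoup

url = "http://example.com/"   # placeholder; replace with the page to scan
html = urlopen(url)
soup = BeautifulSoup(html, 'html.parser')
t1 = soup.find_all('a')       # every a tag on the page
print(t1)
href_list = []
for t2 in t1:
    t3 = t2.get('href')       # None if the tag has no href attribute
    href_list.append(t3)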
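
Because get() returns None for tags without an href, the list built above can contain None entries. One possible variation, sketched here under assumptions, is to pass `href=True` to find_all() so that only a tags carrying an href are matched. The function name `extract_hrefs`, its optional `base_url` parameter, and the `sample` markup are hypothetical and introduced only for this illustration; resolving relative links with `urljoin` is an extra step that the original code does not perform.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

def extract_hrefs(html, base_url=None):
    """Return the href of every a tag that has one, optionally made absolute."""
    soup = BeautifulSoup(html, 'html.parser')
    # href=True keeps only a tags that actually carry an href attribute.
    hrefs = [a['href'] for a in soup.find_all('a', href=True)]
    if base_url is not None:
        # Resolve relative links such as "/p/1" against the page address.
        hrefs = [urljoin(base_url, h) for h in hrefs]
    return hrefs

# Hypothetical markup: two links and one named anchor without an href.
sample = ('<a href="/p/1">one</a>'
          '<a name="top">anchor</a>'
          '<a href="http://example.com/x">two</a>')

print(extract_hrefs(sample))
print(extract_hrefs(sample, base_url='http://example.com/'))
```

The first call returns the href values as written in the markup. The second resolves them against the assumed base address, which may be useful when the links are to be fetched afterwards.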
