If a tag contains more than one string [2], use the .strings generator to loop over them:
for string in soup.strings:
    print(repr(string))
    # u"The Dormouse's story"
    # u'\n\n'
    # u"The Dormouse's story"
    # u'\n\n'
    # u'Once upon a time there were three little sisters; and their names were\n'
    # u'Elsie'
    # u',\n'
    # u'Lacie'
    # u' and\n'
    # u'Tillie'
    # u';\nand they lived at the bottom of a well.'
    # u'\n\n'
    # u'...'
    # u'\n'
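As a minimal, self-contained sketch of the same idea, the snippet below runs .strings on a small made-up document (the HTML and the variable name all_strings are illustrative assumptions, not part of the example document above). It assumes Beautiful Soup 4 is installed and uses Python's built-in "html.parser".

```python
from bs4 import BeautifulSoup

# Hypothetical document, used only for illustration.
html = "<div><p>Hello, <b>world</b>!</p>\n<p>Bye</p></div>"
soup = BeautifulSoup(html, "html.parser")

# .strings yields every text node under the tag in document order,
# including whitespace-only nodes such as the "\n" between the paragraphs.
all_strings = [str(s) for s in soup.div.strings]
print(all_strings)
# ['Hello, ', 'world', '!', '\n', 'Bye']
```

Because .strings is a generator, it is wrapped in a list comprehension here so the result can be printed and reused.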
The output may contain a lot of extra whitespace and blank lines. Use the .stripped_strings generator to remove them:
for string in soup.stripped_strings:
    print(repr(string))
    # u"The Dormouse's story"
    # u"The Dormouse's story"
    # u'Once upon a time there were three little sisters; and their names were'
    # u'Elsie'
    # u','
    # u'Lacie'
    # u'and'
    # u'Tillie'
    # u';\nand they lived at the bottom of a well.'
    # u'...'
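Strings consisting entirely of whitespace are ignored, and whitespace at the beginning and end of each string is removed.
Parent nodes
Continuing up the document tree, every tag and every string has a parent: the tag that contains it.
.parent
Use the .parent attribute to get an element's parent. For example, the parent of the <title> tag is the <head> tag:
title_tag = soup.title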
title_tag
# <title>The Dormouse's story</title>
title_tag.parent
# <head><title>The Dormouse's story</title></head>
The title string itself also has a parent: the <title> tag that contains it:
title_tag.string.parent
# <title>The Dormouse's story</title>
The parent of a top-level node such as <html> is the BeautifulSoup object itself:
html_tag = soup.html
type(html_tag.parent)
# <class 'bs4.BeautifulSoup'>
The .parent of the BeautifulSoup object is None:
print(soup.parent)
# None
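The four cases above can be checked together in one short script. This is only a sketch on a made-up document (the HTML and the variable names are illustrative assumptions), assuming Beautiful Soup 4 with the built-in "html.parser".

```python
from bs4 import BeautifulSoup

# Hypothetical document, used only for illustration.
html = "<html><head><title>Demo</title></head><body><p>Hi</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

title_tag = soup.title

# A tag's parent is the tag that contains it.
parent_name = title_tag.parent.name
print(parent_name)
# head

# A string's parent is the tag that holds the string.
string_parent = title_tag.string.parent
print(string_parent is title_tag)
# True

# The top-level <html> tag's parent is the BeautifulSoup object.
top_parent = soup.html.parent
print(top_parent is soup)
# True

# The BeautifulSoup object itself has no parent.
root_parent = soup.parent
print(root_parent)
# None
```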
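.parents
An element's .parents attribute lets you iterate over all of its ancestors. The example below walks from an <a> tag up to the root of the document:
link = soup.a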
link
# <a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>
for parent in link.parents:
    if parent is None:
        print(parent)
    else:
        print(parent.name)
# p
# body
# html
# [document]
# None  (printed by older releases only; current Beautiful Soup 4 releases stop at [document])
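One common use of .parents is to build the path from the document root down to an element. The helper below, ancestor_names, is a hypothetical function written for this sketch and not part of Beautiful Soup; the HTML is also made up. It assumes Beautiful Soup 4 with the built-in "html.parser".

```python
from bs4 import BeautifulSoup

def ancestor_names(element):
    """Return the names of all ancestors of element, nearest first.

    Hypothetical helper for illustration. The None check is a guard for
    releases whose .parents generator ends with None.
    """
    return [parent.name for parent in element.parents if parent is not None]

# Hypothetical document, used only for illustration.
html = "<html><body><p>See <a href='#'>this</a></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

names = ancestor_names(soup.a)
print(names)
# ['p', 'body', 'html', '[document]']

# Reverse the list to read the path from the root down to the element.
print(" > ".join(reversed(names)))
# [document] > html > body > p
```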