Difference between .string and .text in BeautifulSoup


Beautiful Soup is a Python library for parsing HTML and XML documents. It can handle malformed markup (for example, unclosed tags) and builds a parse tree from the page that can be used to extract data, which makes it useful for web scraping. Older releases supported Python 2; current releases of Beautiful Soup 4 run on Python 3 only.

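Calling .string on a Tag object returns a NavigableString object when the tag contains exactly one string (either directly or through a single nested tag), and None otherwise. .text, on the other hand, collects all the descendant strings and returns them concatenated; it is equivalent to calling get_text() with the default empty separator. The return type of .text is a Unicode string (str in Python 3).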
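The difference in return types can be checked with a short sketch. The markup and variable names below are made up for illustration, and the built-in "html.parser" backend is assumed so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Illustrative markup, not taken from any real page.
soup = BeautifulSoup("<p>Hello <b>World</b></p>", "html.parser")

tag_string = soup.b.string   # the single string inside <b>
tag_text = soup.b.text       # same characters, but a plain str

print(isinstance(tag_string, NavigableString))
print(type(tag_text) is str)

# get_text() is the method behind .text; it accepts a separator
# that is placed between the individual strings it collects.
joined = soup.p.get_text(separator="|")
print(joined)
```

Here the paragraph holds two strings, "Hello " and "World", so the separator appears once between them.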

According to the documentation, a NavigableString is just like a Python Unicode string, except that it also supports some of the features described in "Navigating the tree" and "Searching the tree".
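As a rough illustration of that point, the sketch below treats a NavigableString both as an ordinary string and as a node in the parse tree. The markup is hypothetical and "html.parser" is assumed as the backend.

```python
from bs4 import BeautifulSoup

# Illustrative markup, not taken from any real page.
soup = BeautifulSoup("<p>Hello <b>World</b>!</p>", "html.parser")

world = soup.b.string            # NavigableString inside <b>

# Ordinary string behavior still works.
upper = world.upper()

# Tree navigation works too: the string knows its parent tag.
parent_name = world.parent.name

# The first child of <p> is the string "Hello "; its next sibling is <b>.
first = soup.p.contents[0]
sibling_name = first.next_sibling.name

print(upper, parent_name, sibling_name)
```

A plain str obtained from .text has none of these navigation attributes, which is the practical reason to prefer .string when the position in the tree matters.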

Based on the documentation for .string, suppose the HTML looks like this:

<td>some text</td>
<td></td>
<td><p>more text</p></td>
<td>even <p>more text</p></td>

.string on the four td elements returns:

some text
None
more text
None

.text gives the following result (the second line is an empty string):

some text

more text
even more text
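
The comparison above can be reproduced with a minimal sketch like the following. It assumes the "html.parser" backend and wraps the four cells in a table and row, which is an addition made here only to keep the markup well formed.

```python
from bs4 import BeautifulSoup

# The four cells from the example, wrapped in <table><tr> for validity.
html = """
<table><tr>
<td>some text</td>
<td></td>
<td><p>more text</p></td>
<td>even <p>more text</p></td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
cells = soup.find_all("td")

# .string is None when a tag has zero children or more than one child.
string_values = [td.string for td in cells]

# .text always returns a str, concatenating every descendant string.
text_values = [td.text for td in cells]

for s, t in zip(string_values, text_values):
    print(repr(s), "|", repr(t))
```

The fourth cell shows the key difference: it has two children (a string and a p tag), so .string gives None while .text still returns the combined text.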