# pup

> Command-line HTML parsing tool.
- Transform a raw HTML file into a cleaned, indented, and colored format:

`cat {{index.html}} | pup --color`
- Filter HTML by element tag name:

`cat {{index.html}} | pup '{{tag}}'`
- Filter HTML by ID:

`cat {{index.html}} | pup '{{div#id}}'`
- Filter HTML by attribute value:

`cat {{index.html}} | pup '{{input[type="text"]}}'`
- Print all text from the filtered HTML elements and their children:

`cat {{index.html}} | pup '{{div}} text{}'`
- Print HTML as JSON:

`cat {{index.html}} | pup '{{div}} json{}'`