Feeds XPath Parser is a Feeds plugin for parsing XML and HTML documents. It lets site builders use Feeds to import data from complex external data sources. Each element you wish to extract is set up with a configurable mapping query, which saves developers from having to write complex, single-purpose modules. It also lets end users build web scrapers and other useful tools within Backdrop.
You may need this module if you would like to:
- Import XML or HTML documents into nodes, users, taxonomy terms, or regular database tables
- Scrape webpages like regular feed sources with scheduling, updating, and expiring
- Extract content from HTML documents to create a semantic data bank or mashup
Features
- Built-in query debugger to assist with writing XPath queries
- Tidy support for badly formatted markup
- Variable substitution, allowing you to use the value from one or more queries as arguments in another
- Various workarounds for the idiosyncrasies of PHP’s XPath support
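To give a feel for what mapping queries and variable substitution mean in practice, here is a minimal sketch in Python. It is not this module's code or configuration format: the module itself is configured through the Feeds interface and runs on PHP. The sample XML, the element names, and the `extract` helper are all hypothetical, and the standard library's limited XPath subset stands in for a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration only.
SAMPLE = """
<catalog>
  <book id="b1"><title>Dune</title><author ref="a1"/></book>
  <book id="b2"><title>Emma</title><author ref="a2"/></book>
  <person id="a1"><name>Frank Herbert</name></person>
  <person id="a2"><name>Jane Austen</name></person>
</catalog>
"""


def extract(xml_text):
    """Run one query per field for every <book> element found."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    # One query selects the repeating items to import.
    for book in root.findall("./book"):
        # Each field gets its own query, evaluated relative to the item.
        title = book.findtext("./title")
        author_ref = book.find("./author").get("ref")
        # Substitution: the result of one query is used as an
        # argument inside another query.
        author = root.findtext(f"./person[@id='{author_ref}']/name")
        rows.append({"id": book.get("id"), "title": title, "author": author})
    return rows


rows = extract(SAMPLE)
for row in rows:
    print(row)
```

Each dictionary in `rows` corresponds to one imported item, with one key per mapped field. In the module, the equivalent queries would be entered in the importer's settings rather than written as code.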
Installation
- Install this module using the official Backdrop CMS instructions at
https://backdropcms.org/guide/modules
Issues
Bugs and feature requests should be reported in the issue queue:
https://github.com/backdrop-contrib/feeds_xpathparser/issues
Current Maintainer
- Laryn Kragt Bakker (https://github.com/laryn).
Credits
- Mitchell Tannenbaum (https://www.drupal.org/user/68284) - Conceptual design, documentation, and funding.
- Chris Leppanen (https://www.drupal.org/user/473738) - Initial development and ongoing, voluntary maintainership.
- Ported to Backdrop CMS by Alexander Rozhkov (https://github.com/Al-Rozhkov).
License
This project is GPL v2 software. See the LICENSE.txt file in this directory for
complete text.