Steps to reproduce:
- Set up a Web Crawling Configuration with the following rules:
- Included URLs For Crawling:
- Excluded URLs For Crawling:
- Included URLs For Indexing:
- Excluded URLs For Indexing:
- Start the crawl
- After results are generated, search for
- Results include content from
1. The “Included URLs For Indexing” rule should limit the indexed results to content hosted on
2. The “Excluded URLs For Indexing” rule should prevent the /apidocs and /dev content from being indexed.
The first expectation holds: results are limited to the expected host domain. However, the include rule appears to override the second expectation, because results for /apidocs and /dev are still included in the index.
Please advise whether this is a bug or an issue with the regular expression rules in use.
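
To help rule out the regular expressions themselves, the patterns can be tested outside the crawler. The sketch below is only an illustration: the host `docs.example.com`, both patterns, and the `should_index` helper are hypothetical (the actual rule values are not shown above), and it assumes exclude rules take precedence over include rules, which may not be how the crawler evaluates them.

```python
import re

# Hypothetical patterns: stand-ins for the real rule values, which are
# not shown in this report. Substitute the rules from the configuration.
INCLUDE_FOR_INDEXING = [r"^https?://docs\.example\.com/.*"]
EXCLUDE_FOR_INDEXING = [r"^https?://docs\.example\.com/(apidocs|dev)(/.*)?$"]


def should_index(url, includes=INCLUDE_FOR_INDEXING, excludes=EXCLUDE_FOR_INDEXING):
    """Return True if the URL matches an include rule and no exclude rule."""
    included = any(re.search(pattern, url) for pattern in includes)
    excluded = any(re.search(pattern, url) for pattern in excludes)
    # Assumption: an exclude match wins over an include match.
    return included and not excluded


if __name__ == "__main__":
    sample_urls = [
        "https://docs.example.com/guide/intro",
        "https://docs.example.com/apidocs/index.html",
        "https://docs.example.com/dev",
        "https://other.example.com/guide",
    ]
    for url in sample_urls:
        print(url, should_index(url))
```

If the real patterns behave as intended in a check like this but /apidocs and /dev still appear in the index, that would suggest the problem lies in how the crawler combines the rules rather than in the expressions.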