Fess Crawling External Domains

Hi,

We have configured Fess to index only our company website, but it is also crawling external domains such as microsoft.com.

I have posted our web crawl config below. Please let me know what the issue is.

Name						Test
URLs						https://www.abc.com/en-us/index.html
Included URLs For Crawling	https://www.abc.com/.*
Excluded URLs For Crawling	
Included URLs For Indexing	https://www.abc.com/.*
Excluded URLs For Indexing	
							(.*)/assets/.*
							(.*)/www/.*
Config Parameters			config.html.canonical.xpath=
							field.xpath.lastModified=//META[@name="lastmodified"]/@content
							field.xpath.releaseDate=//META[@name="releaseDate"]/@content
Depth						10
Max Access Count			50000
User Agent					Mozilla/5.0 (compatible; Fess/13.10; +http://fess.codelibs.org/bot.html)
The number of Thread		1
Interval time				10000 ms
Boost						1.0
Permissions					{role}guest

Status						Enabled
Description	

Did you check fess-crawler.log?

Yes, I do see entries for crawling external sites:

2022-10-09 06:07:52,504 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Crawling URL: https://docs.microsoft.com/en-us/sysinternals/downloads/sigcheck
2022-10-09 06:07:52,600 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Redirect to URL: https://learn.microsoft.com/en-us/sysinternals/downloads/sigcheck
2022-10-09 06:08:12,977 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Crawling URL: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-rdpbcgr/023f1e69-cfe8-4ee6-9ee0-7e759fb4e4ee
2022-10-09 06:08:13,044 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Redirect to URL: https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-rdpbcgr/023f1e69-cfe8-4ee6-9ee0-7e759fb4e4ee
2022-10-09 06:21:11,224 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Crawling URL: https://www.bleepingcomputer.com/news/microsoft/hands-on-with-windows-11s-new-task-manager/
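
To see how far the crawl has strayed, it can help to summarize the log by host. The following is a minimal sketch, assuming the log line format shown above; the `external_hosts` helper and the `ALLOWED_HOST` constant are hypothetical names for illustration and are not part of Fess.

```python
import re
from urllib.parse import urlparse

# Assumption: the only host we intended to crawl, taken from the config above.
ALLOWED_HOST = "www.abc.com"

# Matches "Crawling URL: ..." entries; "Redirect to URL: ..." lines are skipped.
CRAWL_LINE = re.compile(r"Crawling URL: (\S+)")

def external_hosts(log_lines, allowed_host=ALLOWED_HOST):
    """Count crawled URLs per host, ignoring the allowed host."""
    counts = {}
    for line in log_lines:
        match = CRAWL_LINE.search(line)
        if not match:
            continue
        host = urlparse(match.group(1)).hostname
        if host and host != allowed_host:
            counts[host] = counts.get(host, 0) + 1
    return counts

# Sample lines copied from the log excerpt above.
sample = [
    "2022-10-09 06:07:52,504 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Crawling URL: https://docs.microsoft.com/en-us/sysinternals/downloads/sigcheck",
    "2022-10-09 06:07:52,600 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Redirect to URL: https://learn.microsoft.com/en-us/sysinternals/downloads/sigcheck",
    "2022-10-09 06:21:11,224 [Crawler-XaHAI4MBA_J1JKbDyNUS-1-1] INFO  Crawling URL: https://www.bleepingcomputer.com/news/microsoft/hands-on-with-windows-11s-new-task-manager/",
]

print(external_hosts(sample))
```

To run it against a real log, pass an open file object for fess-crawler.log instead of `sample`.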

Please try the following setting.

Included URLs For Crawling	https://www.abc.com/.*
Excluded URLs For Crawling	
							(.*)/assets/.*
							(.*)/www/.*
Included URLs For Indexing
Excluded URLs For Indexing	
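
The patterns in this setting can be sanity-checked outside Fess. The sketch below assumes each pattern is a regular expression that must match the whole URL, and that a URL is crawled only if it matches an include pattern and no exclude pattern. Fess is written in Java and its actual filter logic may differ, so treat `would_crawl` as a hypothetical helper for testing patterns, not as Fess's implementation.

```python
import re

# Patterns copied from the suggested setting above.
INCLUDE_CRAWL = [r"https://www.abc.com/.*"]
EXCLUDE_CRAWL = [r"(.*)/assets/.*", r"(.*)/www/.*"]

def would_crawl(url, includes=INCLUDE_CRAWL, excludes=EXCLUDE_CRAWL):
    """Assumed semantics: match an include (if any are set), match no exclude."""
    if includes and not any(re.fullmatch(p, url) for p in includes):
        return False
    return not any(re.fullmatch(p, url) for p in excludes)

for url in [
    "https://www.abc.com/en-us/index.html",
    "https://www.abc.com/assets/logo.png",
    "https://docs.microsoft.com/en-us/sysinternals/downloads/sigcheck",
]:
    print(url, would_crawl(url))
```

Under these assumptions the external microsoft.com URL is rejected by the include pattern, so if such URLs still appear in fess-crawler.log, it is worth confirming that the running crawl job actually uses this config.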

Something similar is happening to me as well. I included only a specific URL for crawling, but the crawler is going to other (external) URLs. What are we doing wrong?
I am on Fess 14.5.0.

Thank you.

Could you provide the crawling config to reproduce it?