404 errors when crawling OneDrive

Environment
Fess (v14.11.0)
OpenSearch 2.11

Thank you for your continued support.
I am currently trying to crawl OneDrive, using the following as a reference.

Apologies if this is a basic question.

Under "Request API permissions", I added the APIs listed at the reference URL as "Application permissions" and ran a crawl, which produced both 404 and 403 errors.

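According to the logs, a 403 error occurs while crawling groups that hold view permission, and a 404 error occurs while crawling the users in those groups.
Since I do not need to crawl users this time, I removed "user.read.all" and "user.read" from the added APIs and crawled again. The logs still show users being crawled, and the same 404 and 403 errors are output.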

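When crawling users, some requests return 404 and others return 403. Could you tell me what causes the two different errors, and whether there is a way to skip crawling users altogether?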

I know you are busy, and I appreciate any help you can give.

GET https://graph.microsoft.com/v1.0/users/<user object ID>/drive/root/children
SdkVersion : graph-java/v5.70.0


404 : 
[...]

[Some information was truncated for brevity, enable debug logging for more details]
	at com.microsoft.graph.http.GraphServiceException.createFromResponse(GraphServiceException.java:419) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.GraphServiceException.createFromResponse(GraphServiceException.java:378) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.CoreHttpProvider.handleErrorResponse(CoreHttpProvider.java:510) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.CoreHttpProvider.processResponse(CoreHttpProvider.java:442) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.CoreHttpProvider.sendRequestInternal(CoreHttpProvider.java:408) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.CoreHttpProvider.send(CoreHttpProvider.java:225) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.CoreHttpProvider.send(CoreHttpProvider.java:202) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.BaseCollectionRequest.send(BaseCollectionRequest.java:103) ~[fess-ds-office365-14.11.0.jar:?]
	at com.microsoft.graph.http.BaseEntityCollectionRequest.get(BaseEntityCollectionRequest.java:78) ~[fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.client.Office365Client.getDriveItemPage(Office365Client.java:227) ~[fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.OneDriveDataStore.getDriveItemChildren(OneDriveDataStore.java:633) ~[fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.OneDriveDataStore.getDriveItemsInDrive(OneDriveDataStore.java:616) ~[fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.OneDriveDataStore.lambda$storeUsersDrive$10(OneDriveDataStore.java:277) ~[fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.Office365DataStore.lambda$getLicensedUsers$0(Office365DataStore.java:44) ~[fess-ds-office365-14.11.0.jar:?]
	at java.util.ArrayList.forEach(ArrayList.java:1511) [?:?]
	at org.codelibs.fess.ds.office365.client.Office365Client.getUsers(Office365Client.java:241) [fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.Office365DataStore.getLicensedUsers(Office365DataStore.java:42) [fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.OneDriveDataStore.storeUsersDrive(OneDriveDataStore.java:275) [fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.office365.OneDriveDataStore.storeData(OneDriveDataStore.java:174) [fess-ds-office365-14.11.0.jar:?]
	at org.codelibs.fess.ds.AbstractDataStore.store(AbstractDataStore.java:122) [classes/:?]
	at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.process(DataIndexHelper.java:218) [classes/:?]
	at org.codelibs.fess.helper.DataIndexHelper$DataCrawlingThread.run(DataIndexHelper.java:204) [classes/:?]
2024-08-06 13:52:47,680 [Thread-288170] INFO  Returning token from cache
2024-08-06 13:52:47,680 [Thread-288170] DEBUG [Correlation ID: 2b4bf907-5f54-42b9-92a0-994bc6807cbc] Access Token was returned
2024-08-06 13:52:47,680 [Thread-288170] INFO  Azure Identity => getToken() result for scopes [https://graph.microsoft.com/.default]: SUCCESS
2024-08-06 13:52:47,681 [mUz5X48By8hob0hfRPo7-1] DEBUG >> 0x0004a609   221 HEADERS       END_STREAM|END_HEADERS
2024-08-06 13:52:47,878 [OkHttp graph.microsoft.com] DEBUG << 0x0004a609   187 HEADERS       END_HEADERS
2024-08-06 13:52:47,878 [OkHttp graph.microsoft.com] DEBUG << 0x0004a609    10 DATA          
2024-08-06 13:52:47,878 [OkHttp graph.microsoft.com] DEBUG << 0x0004a609   323 DATA          
2024-08-06 13:52:47,878 [OkHttp graph.microsoft.com] DEBUG << 0x0004a609    10 DATA          
2024-08-06 13:52:47,878 [OkHttp graph.microsoft.com] DEBUG << 0x0004a609     0 DATA          END_STREAM
2024-08-06 13:52:47,878 [mUz5X48By8hob0hfRPo7-1] DEBUG Q10000 scheduled after   0 µs: OkHttp ConnectionPool
2024-08-06 13:52:47,878 [OkHttp TaskRunner] DEBUG Q10000 starting              : OkHttp ConnectionPool
2024-08-06 13:52:47,878 [OkHttp ConnectionPool] DEBUG Q10000 run again after 300 s : OkHttp ConnectionPool
2024-08-06 13:52:47,878 [OkHttp TaskRunner] DEBUG Q10000 finished run in  97 µs: OkHttp ConnectionPool
2024-08-06 13:52:47,884 [Thread-288171] INFO  Returning token from cache
2024-08-06 13:52:47,884 [Thread-288171] DEBUG [Correlation ID: 4e7cc99d-f65a-4851-93f7-35042910fa2f] Access Token was returned
2024-08-06 13:52:47,884 [Thread-288171] INFO  Azure Identity => getToken() result for scopes [https://graph.microsoft.com/.default]: SUCCESS
2024-08-06 13:52:47,885 [mUz5X48By8hob0hfRPo7-1] DEBUG >> 0x0004a60b    92 HEADERS       END_STREAM|END_HEADERS
2024-08-06 13:52:48,087 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60b    92 HEADERS       END_HEADERS
2024-08-06 13:52:48,087 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60b    10 DATA          
2024-08-06 13:52:48,087 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60b   380 DATA          
2024-08-06 13:52:48,087 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60b    10 DATA          
2024-08-06 13:52:48,088 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60b     0 DATA          END_STREAM
2024-08-06 13:52:48,088 [mUz5X48By8hob0hfRPo7-1] DEBUG Q10000 scheduled after   0 µs: OkHttp ConnectionPool
2024-08-06 13:52:48,088 [OkHttp TaskRunner] DEBUG Q10000 starting              : OkHttp ConnectionPool
2024-08-06 13:52:48,088 [OkHttp ConnectionPool] DEBUG Q10000 run again after 300 s : OkHttp ConnectionPool
2024-08-06 13:52:48,088 [OkHttp TaskRunner] DEBUG Q10000 finished run in 128 µs: OkHttp ConnectionPool
2024-08-06 13:52:48,088 [mUz5X48By8hob0hfRPo7-1] DEBUG Current item: root
2024-08-06 13:52:48,089 [Thread-288172] INFO  Returning token from cache
2024-08-06 13:52:48,089 [Thread-288172] DEBUG [Correlation ID: 77051d95-fb3a-4683-881b-8db3d5b326ba] Access Token was returned
2024-08-06 13:52:48,089 [Thread-288172] INFO  Azure Identity => getToken() result for scopes [https://graph.microsoft.com/.default]: SUCCESS
2024-08-06 13:52:48,090 [mUz5X48By8hob0hfRPo7-1] DEBUG >> 0x0004a60d    87 HEADERS       END_STREAM|END_HEADERS
2024-08-06 13:52:48,468 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60d   237 HEADERS       END_HEADERS
2024-08-06 13:52:48,468 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60d    10 DATA          
2024-08-06 13:52:48,468 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60d    61 DATA          
2024-08-06 13:52:48,468 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60d    10 DATA          
2024-08-06 13:52:48,468 [OkHttp graph.microsoft.com] DEBUG << 0x0004a60d     0 DATA          END_STREAM
2024-08-06 13:52:48,468 [mUz5X48By8hob0hfRPo7-1] DEBUG Q10000 scheduled after   0 µs: OkHttp ConnectionPool
2024-08-06 13:52:48,468 [OkHttp TaskRunner] DEBUG Q10000 starting              : OkHttp ConnectionPool
2024-08-06 13:52:48,468 [OkHttp ConnectionPool] DEBUG Q10000 run again after 300 s : OkHttp ConnectionPool
2024-08-06 13:52:48,468 [OkHttp TaskRunner] DEBUG Q10000 finished run in  71 µs: OkHttp ConnectionPool
2024-08-06 13:52:48,468 [mUz5X48By8hob0hfRPo7-1] DEBUG Drive item is not found.
com.microsoft.graph.http.GraphServiceException: Error code: itemNotFound
Error message: Item not found

I think the cause can only be inferred from what the Graph API itself returns.
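One way to see what Graph returns is to send the request from the log yourself and read the error body. The following is only a minimal sketch in Python using the standard library. The helper names are hypothetical, and obtaining an access token for the registered application is assumed to happen elsewhere.

```python
import json
import urllib.error
import urllib.request

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def drive_children_url(user_id: str) -> str:
    """Build the same endpoint that appears in the crawler log."""
    return f"{GRAPH_BASE}/users/{user_id}/drive/root/children"


def parse_graph_error(body: str) -> tuple:
    """Extract (code, message) from a Graph error response body.

    Falls back to ("", body) when the body is not the expected JSON shape.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        return "", body
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return "", body
    return error.get("code", ""), error.get("message", "")


def check_user_drive(user_id: str, access_token: str) -> tuple:
    """Send the request and return (HTTP status, error code, error message).

    access_token is assumed to be a valid token for the registered app.
    """
    request = urllib.request.Request(
        drive_children_url(user_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, "", ""
    except urllib.error.HTTPError as exc:
        code, message = parse_graph_error(exc.read().decode("utf-8"))
        return exc.code, code, message
```

Running `check_user_drive` against one user that produced a 404 and one that produced a 403 would show the error code and message Graph returns in each case, which is the information needed to narrow down the cause.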
To skip user folders, I believe you can set user_drive_crawler=false in the parameters of the crawl configuration. shared_documents_drive_crawler and group_drive_crawler should be configurable in the same way.
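As a rough illustration, the relevant lines in the data store parameters might look like this. Only the three flag names come from this thread. Writing the other two flags out as `true` is an assumption made here to show the intent explicitly, and the existing authentication parameters are assumed to stay unchanged.

```
user_drive_crawler=false
group_drive_crawler=true
shared_documents_drive_crawler=true
```

With this, only the user drive crawl would be turned off; switching either of the other two to `false` should skip that drive type in the same way.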

That gave me the result I was hoping for. Thank you.