[AURON #2011] History Server fails when BuildInfo event is missing. #2012
slfan1989 wants to merge 2 commits into apache:master
…ing. Signed-off-by: slfan1989 <slfan1989@apache.org>
Pull request overview
This PR fixes an issue where the History Server plugin crashes during initialization when BuildInfo event data is missing from the KVStore. The fix implements graceful handling of missing BuildInfo by converting the buildInfo() method to return an Option type and displaying a user-friendly message in the UI when data is unavailable.
Changes:
- Modified `AuronSQLAppStatusStore.buildInfo()` to return `Option[AuronBuildInfoUIData]` with exception handling
- Removed conditional tab creation check in `AuronSQLHistoryServerPlugin`, allowing the Auron tab to always be created
- Added `buildInfoSummary()` method in `AuronAllExecutionsPage` to handle both present and absent BuildInfo scenarios with appropriate UI rendering
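The first change can be illustrated with a minimal, self-contained sketch. It is not the PR's actual code: `BuildInfoData`, `SimpleStore`, `InMemoryStore`, and `StatusStore` are hypothetical stand-ins, and the only assumption carried over from the real setup is that a KVStore-style `read` throws `NoSuchElementException` when the key is absent.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for AuronBuildInfoUIData; not the real class.
final case class BuildInfoData(info: Seq[(String, String)])

// Hypothetical stand-in for a KVStore-style reader.
trait SimpleStore {
  // Assumed contract: throws NoSuchElementException when the key is absent.
  def read[T](klass: Class[T], key: Any): T
}

// In-memory implementation used only for this illustration.
final class InMemoryStore(entries: Map[Any, Any]) extends SimpleStore {
  override def read[T](klass: Class[T], key: Any): T =
    entries.get(key) match {
      case Some(value) => klass.cast(value)
      case None        => throw new NoSuchElementException(s"No entry for $key")
    }
}

final class StatusStore(store: SimpleStore) {
  // Returns None instead of throwing when the record is missing or unreadable.
  def buildInfo(): Option[BuildInfoData] = {
    val kClass = classOf[BuildInfoData]
    try {
      Option(store.read(kClass, kClass.getName))
    } catch {
      case _: NoSuchElementException => None
      case NonFatal(_)               => None
    }
  }
}
```

Callers then pattern match on the result rather than guarding against an exception or a null record.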
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| auron-spark-ui/src/main/scala/org/apache/spark/sql/execution/ui/AuronSQLAppStatusStore.scala | Changed buildInfo() to return Option[AuronBuildInfoUIData] with try-catch handling for missing records |
| auron-spark-ui/src/main/scala/org/apache/spark/sql/execution/ui/AuronSQLHistoryServerPlugin.scala | Removed null check to allow unconditional tab creation, delegating empty state handling to the UI layer |
| auron-spark-ui/src/main/scala/org/apache/spark/sql/execution/ui/AuronAllExecutionsPage.scala | Refactored render methods to use new buildInfoSummary() helper that displays a warning message when BuildInfo is unavailable |
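The empty-state handling described for `AuronAllExecutionsPage` might look roughly like the sketch below. This is an assumption-laden illustration: `BuildInfoView` and `BuildInfoSummary.render` are hypothetical names, and the sketch renders plain text to stay self-contained, whereas a Spark UI page would typically build XML nodes. Only the fallback message is taken from the PR description.

```scala
// Hypothetical stand-in for AuronBuildInfoUIData; not the real class.
final case class BuildInfoView(info: Seq[(String, String)])

object BuildInfoSummary {
  // Message quoted from the PR description.
  val UnavailableMessage: String =
    "Auron build information is not available for this application."

  // Renders one "name: value" line per entry, or the fallback message when absent.
  def render(buildInfo: Option[BuildInfoView]): String = buildInfo match {
    case Some(view) =>
      view.info.map { case (name, value) => s"$name: $value" }.mkString("\n")
    case None =>
      UnavailableMessage
  }
}
```

Keeping both branches in one helper means every render path shares the same fallback behavior.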
    Option(store.read(kClass, kClass.getName))
    } catch {
      case _: NoSuchElementException => None
      case NonFatal(_) => None
The `NonFatal` exception is being silently swallowed without logging. This makes debugging difficult if unexpected errors occur when reading from the KVStore. Consider logging the exception at a warning or debug level to help diagnose issues. For example: `case NonFatal(e) => logWarning(s"Failed to read BuildInfo from KVStore", e); None`
Note that this would require AuronSQLAppStatusStore to extend the Logging trait.
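A self-contained sketch of the suggested pattern follows. It uses `java.util.logging` in place of Spark's `Logging` trait so that it runs without Spark on the classpath; `SafeRead` and `readOption` are hypothetical names introduced for illustration, not part of Auron or Spark.

```scala
import java.util.logging.{Level, Logger}
import scala.util.control.NonFatal

object SafeRead {
  // Stand-in logger; the real class would call logWarning from Spark's Logging trait.
  private val log = Logger.getLogger("SafeRead")

  // Evaluates `read` and converts a missing record or a failure into None.
  def readOption[T](description: String)(read: => T): Option[T] =
    try {
      Option(read) // also maps a null result to None
    } catch {
      // A missing record is the expected case, so it is not logged.
      case _: NoSuchElementException => None
      // Anything else non-fatal is unexpected: log it, then degrade gracefully.
      case NonFatal(e) =>
        log.log(Level.WARNING, s"Failed to read $description", e)
        None
    }
}
```

Separating the expected `NoSuchElementException` from other failures keeps the log quiet for applications that never emitted a BuildInfo event, while still surfacing genuine read errors.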
Thank you very much for your valuable suggestion. I will further refine and improve the relevant content according to your comments.
ShreyeshArangath left a comment:
LGTM, thanks for fixing this.
…ing. Signed-off-by: slfan1989 <slfan1989@apache.org>
@cxzl25 @richox We frequently encountered Rust crashes when running TPC-DS. I’ve submitted a fix in PR #2023. Could you please take a look and review it? https://github.com/apache/auron/actions/runs/22424571173/job/64931203277?pr=2012
Which issue does this PR close?
Closes #2011
Rationale for this change
The History Server plugin currently crashes during initialization when the `AuronBuildInfoUIData` record is missing from the KVStore. This causes applications without BuildInfo events to either fail plugin initialization or show no Auron tab.
What changes are included in this PR?
- Changed `buildInfo()` to return `Option[AuronBuildInfoUIData]`, catching `NoSuchElementException` and other exceptions to return `None` instead of throwing
- Added a `buildInfoSummary()` method to handle `Option[AuronBuildInfoUIData]`:
  - `Some`: displays the BuildInfo table as before
  - `None`: shows the user-friendly message "Auron build information is not available for this application."
Are there any user-facing changes?
Yes. When BuildInfo is not available:
How was this patch tested?