Make Your Jobs More Robust With Automatic Safety Switches

In this article, I'll refer to a "job" as a batch processing program, as defined in JSR 352. A job can be written in any language but is scheduled periodically to automatically process bulk data, in contrast to interactive processing (CLI or GUI) for end-users. Error handling in jobs differs significantly from interactive processing. For instance, in the latter case, backend calls might not be retried as a human can respond to errors, while jobs need robust error recovery due to their automated nature. Moreover, jobs often possess higher privileges and can potentially damage extensive data.

Consider a scenario: What if a job fails due to a backend or dependency component issue? If a job is scheduled hourly and faces a major downtime just minutes before execution, what should be done?

CategoriesUncategorized