github.com/matomo-org/matomo.git
author Thomas Steur <tsteur@users.noreply.github.com> 2021-08-02 23:36:07 +0300
committer GitHub <noreply@github.com> 2021-08-02 23:36:07 +0300
commit 052673f1a559c349afb7d0e3ee7379e43f2ed28f
tree 136abbb7bc890d8335c2791a296d1ceacc53bac0
parent f21acfe280b284b09c96e8b77794ac9cf74dc332
Scheduled tasks: Always read timetable from the database and not from memory (concutask)
While reviewing another issue I noticed that we apparently always read the cached option entry for the scheduled tasks timetable. This is executed in https://github.com/matomo-org/matomo/blob/4.4.1/core/Scheduler/Scheduler.php#L105-L115

Because it is normal to have two or more archivers running in parallel, it is not uncommon for several archivers to execute the task runner at the same time. They all fetch the timetable (the entries describing which scheduled tasks to execute when), each ends up with a different version of it, and each keeps working from its own version. Because executing all tasks can take a long time (from seconds up to hours), there is a high risk that some tasks are executed multiple times if we don't always read the timetable from the database. This change causes quite a few additional queries but should reduce some concurrency issues.

There was already code intended to read the DB value again each time. However, `Option::get` always returned the cached result from memory first and did not fetch from the DB again.

Basically, this is what currently happens:

```
job 1: load tasks, returns [a,b,c,d]
job 1: work task a
job 2: load tasks, returns [b,c,d]
job 1: work task b
job 2: work task b
job 1: work task c
job 2: work task c
...
```

The task runner logic is still far from being thread safe, but this should improve it quite a bit.

The consequence of all this is a lot of added load, as several tasks may be executed multiple times. For example, some scheduled reports or custom alerts may be sent multiple times (I remember seeing such reports).
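
To illustrate the stale-read problem described above, here is a minimal, self-contained sketch. It is not Matomo code: `FakeDb`, `CachedOptionStore`, and the `'timetable'` key are hypothetical stand-ins, and the store only imitates the behaviour this message attributes to `Option::get` (return the in-memory copy first) and to `Option::clearCachedOption` (drop that copy). Two store instances sharing one `FakeDb` play the role of two archiver processes.

```php
<?php
// Hypothetical stand-in for the shared options table; not Matomo code.
final class FakeDb
{
    /** @var array<string, string> */
    public $rows = [];
}

// Hypothetical per-process option store with an in-memory cache.
final class CachedOptionStore
{
    /** @var FakeDb */
    private $db;

    /** @var array<string, string> */
    private $cache = [];

    public function __construct(FakeDb $db)
    {
        $this->db = $db;
    }

    // Returns the cached value if there is one; only hits the "DB" on a cache miss.
    public function get(string $name): ?string
    {
        if (!array_key_exists($name, $this->cache)) {
            if (!array_key_exists($name, $this->db->rows)) {
                return null;
            }
            $this->cache[$name] = $this->db->rows[$name];
        }
        return $this->cache[$name];
    }

    // Writes through to the "DB" and refreshes this process's own cache.
    public function set(string $name, string $value): void
    {
        $this->db->rows[$name] = $value;
        $this->cache[$name] = $value;
    }

    // Forgets the cached value so the next get() reads the "DB" again.
    public function clearCachedOption(string $name): void
    {
        unset($this->cache[$name]);
    }
}

$db   = new FakeDb();
$job1 = new CachedOptionStore($db);
$job2 = new CachedOptionStore($db);

$job1->set('timetable', 'a,b,c,d');
$job2->get('timetable');            // job 2 now holds 'a,b,c,d' in memory
$job1->set('timetable', 'b,c,d');   // job 1 finished task a and saved the timetable

$stale = $job2->get('timetable');   // 'a,b,c,d': served from job 2's memory
$job2->clearCachedOption('timetable');
$fresh = $job2->get('timetable');   // 'b,c,d': read from the shared "DB" again
```

Clearing the cached entry before every read trades one extra query per read for a current view of the timetable. It narrows the window in which two jobs pick the same task but does not close it, since nothing here locks the timetable between the read and the write.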
-rw-r--r-- core/Scheduler/Timetable.php 1
1 files changed, 1 insertions, 0 deletions
diff --git a/core/Scheduler/Timetable.php b/core/Scheduler/Timetable.php
index 3831295c62..47e6b094d7 100644
--- a/core/Scheduler/Timetable.php
+++ b/core/Scheduler/Timetable.php
@@ -143,6 +143,7 @@ class Timetable
     public function readFromOption()
     {
+        Option::clearCachedOption(self::TIMETABLE_OPTION_STRING);
         $optionData = Option::get(self::TIMETABLE_OPTION_STRING);
         $unserializedTimetable = Common::safe_unserialize($optionData);