author     diosmosis <diosmosis@users.noreply.github.com>   2020-08-04 05:59:58 +0300
committer  GitHub <noreply@github.com>                      2020-08-04 05:59:58 +0300
commit     f5e9420a987340b036fa342e876ab92e314f4ec7 (patch)
tree       2267e9eafe8b6577c4f7d3219d39c284a90677c6 /core/CronArchive.php
parent     2394c8c954d46d1ca9fe8d1304405e7fd6727c89 (diff)
allow invalidating plugin archives only and archiving past data for plugins (#15889)
* Adding initial new code for cron archive rewrite.
* first pass at removing unused CronArchive code.
* unfinished commit
* fill out archiveinvalidator code
* getting some tests to pass
* unfinished commit
* fixing part of test
* Another test fix.
* another sql change
* fix broken merge or something else that went wrong
* Couple more fixes and extra logs.
* Fixing enough issues to get core archive command to run completely.
* Fix and log change.
* Fixed more segment/test related issues for CronArchiveTest. Includes optimization for no visits for period + segment process from handling.
* another optimization and possible build fix
* no visit optimization
* test fix
* Implement archiving_custom_ranges logic w/ queue based implementation
* fixes to get archivecrontest to work
* add logic to invalidate today period
* fix optimization and some tests
* Fixing more tests.
* Fixing more tests
* debug travis failure
* more test fixes
* more test fixes, removing more unneeded code, handling some TODOs
* Handle more TODOs including creating ArchiveFilter class for some cli options.
* tests and todos
* idarchives are specific to table + start on archivefilter tests
* one test
* more TODOs and tests
* more tests and todo taken care of
* handle more todos
* fixing more tests
* fix comment
* make sure autoarchiving is enabled for segments when cron archive picks them up
* Fixing test.
* apply more pr feedback
* order by date1 asc
* quick refactor
* use batch insert instead of createDummyArchives
* apply rest of pr feedback
* add removed events, add new test, fix an issue (when deleting idarchives older than do not lump all segments together).
* re-add fixed/shared siteids
* fix tests
* incomplete commit
* Insert archive entries into archive_invalidations table.
* Use invalidations table in core:archive and get ArchiveCronTest to pass.
* fixing some tests
* debugging travis
* fix more tests & remove DONE_IN_PROGRESS which is no longer used.
* fix more tests
* Allow forcing plugin specific archive in core:archive.
* When querying from archive data use all available archives including "all" archives and plugin specific archives.
* Adding some code for invalidating specific plugin archives.
* Get archive invalidation test to pass.
* add plugin capability to invalidate command
* Handle plugin only archives in core:archive.
* Add Archive test and get ArchiveCronTest to pass.
* update some expected files
* Fix some more tests.
* incomplete commit
* allow invalidating individual reports
* adding more API for DONE_PARTIAL support
* get archivecrontest to pass
* add archive processor tests
* fix some test randomness
* when purging keep latest partial archives if there is no newer whole archive
* add rearchivereport method + some unfinished tests
* Add archiveReports API method, fix race condition in test, when archiving single report, always ignore inserting other reports.
* require archivers to handle partial archives themselves entirely instead of trying to do it automatically and allow requested report to be any string
* couple fixes
* Use core config option for last N months to invalidate.
* Add test for ArchiveSelector method.
* Ignore archives w/ deactivated plugins.
* Refactor queue looping into new QueueConsumer class.
* apply more review feedback + another fix
* invalidate segments too in reArchiveReport w/ tests
* remove DONE_IN_PROGRESS, no longer used. use new status in query and add queue consumer test.
* forgot to add file
* delete old unneeded archives when finalizing a new one.
* tweak invalidation archive description
* add plugin archiving tests and get them to pass
* fix test
* many fixes
* fix another test
* update expected test files
* fix more tests
* last test fixes hopefully
* tweak log
* In case a column already exists, do not try to add it in an AddColumns migration or the entire migration will fail and no columns will be added.
* try to fix tests again
* fix again?
* apply some review feedback + fix test
* fix test
* fix another test
* couple fixes
* Remove extra param.
* apply pr feedback
* check for usable archive before invalidating and before initiating archiving
* fixing tests
* fixing tests
* fixing tests
* fix another test issue
* fix archiveinvalidator test
* fix one test and debug another
* more debugging
* fix test
* use twig
* remove no longer needed change
* add back previous logic
* fix tracking is not working
* apply pr feedback and add tests
* fixing tests
* update submodule
* debugging random travis failure
* update test
* more debugging
* more debugging
* another attempt at debugging
* Let's try this fix
* trying to fix the build
* debug
* simpler test
* fix test
* fix test
* fix test
* fix test
* fix test failure
* update screenshots
* update screenshots
Co-authored-by: Thomas Steur <tsteur@users.noreply.github.com>
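One of the behaviors this rewrite moves into the new QueueConsumer class is visible in the removed helpers below (hasDifferentPeriod, hasDifferentDoneFlagType): a concurrent batch of invalidations must not mix different periods, or segment archives with non-segment ("done") archives; skipped entries stay queued for a later batch. A rough illustrative sketch of that batching rule in Python (not the actual PHP implementation; the dict shape is simplified):

```python
def done_flag_type(name):
    # 'done' is the plain all-visits archive; 'done<hash>' marks a segment archive
    return "all" if name == "done" else "segment"

def next_batch(invalidations, max_count):
    """Take up to max_count invalidations sharing one period and one
    done-flag type; everything skipped is returned for a later batch."""
    batch, skipped = [], []
    for inv in invalidations:
        if len(batch) >= max_count:
            skipped.append(inv)
            continue
        if batch and (batch[0]["period"] != inv["period"]
                      or done_flag_type(batch[0]["name"]) != done_flag_type(inv["name"])):
            skipped.append(inv)
            continue
        batch.append(inv)
    return batch, skipped
```

The real QueueConsumer additionally skips archives with no visits, duplicate entries, and invalidations whose lower period or segment is already being archived by another process, as the removed CronArchive code below shows.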
Diffstat (limited to 'core/CronArchive.php')
-rw-r--r--  core/CronArchive.php  429
1 file changed, 60 insertions(+), 369 deletions(-)
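The patch below switches the archiving request from API.get to CoreAdminHome.archiveReports and teaches getVisitsRequestUrl to append an optional plugin parameter (plus pluginOnly/requestedReport in the URL builder). A minimal Python sketch of the resulting URL construction, for illustration only (the real code is the PHP getVisitsRequestUrl in this diff):

```python
from urllib.parse import quote

def visits_request_url(id_site, period, date, segment=None, plugin=None):
    # Mirrors CronArchive::getVisitsRequestUrl after this patch: target
    # CoreAdminHome.archiveReports, optionally for a single plugin only.
    url = ("?module=API&method=CoreAdminHome.archiveReports"
           f"&idSite={id_site}&period={period}&date={date}&format=json")
    if segment:
        url += "&segment=" + quote(segment, safe="")
    if plugin:
        url += "&plugin=" + plugin
    return url

print(visits_request_url(1, "day", "2020-08-04", plugin="Goals"))
# ?module=API&method=CoreAdminHome.archiveReports&idSite=1&period=day&date=2020-08-04&format=json&plugin=Goals
```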
diff --git a/core/CronArchive.php b/core/CronArchive.php
index b52fa24ab8..1dd1265070 100644
--- a/core/CronArchive.php
+++ b/core/CronArchive.php
@@ -19,12 +19,14 @@ use Piwik\CronArchive\FixedSiteIds;
 use Piwik\CronArchive\Performance\Logger;
 use Piwik\Archive\ArchiveInvalidator;
 use Piwik\CliMulti\RequestParser;
+use Piwik\CronArchive\QueueConsumer;
 use Piwik\CronArchive\SharedSiteIds;
 use Piwik\DataAccess\ArchiveSelector;
 use Piwik\DataAccess\ArchiveTableCreator;
 use Piwik\DataAccess\Model;
 use Piwik\DataAccess\RawLogDao;
 use Piwik\Metrics\Formatter;
+use Piwik\Period\Factory;
 use Piwik\Period\Factory as PeriodFactory;
 use Piwik\CronArchive\SegmentArchiving;
 use Piwik\Period\Range;
@@ -139,7 +141,7 @@ class CronArchive
      *
      * @var int|false
      */
-    public $dateLastForced = SegmentArchiving::DEFAULT_BEGINNIN_OF_TIME_LAST_N_YEARS;
+    public $dateLastForced = SegmentArchiving::DEFAULT_BEGINNING_OF_TIME_LAST_N_YEARS;
 
     /**
     * The number of concurrent requests to issue per website. Defaults to {@link MAX_CONCURRENT_API_REQUESTS}.
@@ -208,11 +210,6 @@ class CronArchive
     private $archiveFilter;
 
     /**
-     * @var array
-     */
-    private $invalidationsToExclude = [];
-
-    /**
      * @var RequestParser
      */
     private $cliMultiRequestParser;
@@ -329,8 +326,6 @@
         $pid = Common::getProcessId();
         $timer = new Timer;
 
-        $siteTimer = null;
-        $siteRequests = 0;
-
         $this->logSection("START");
         $this->logger->info("Starting Matomo reports archiving...");
@@ -344,14 +339,12 @@
 
         $countOfProcesses = $this->getMaxConcurrentApiRequests();
 
+        $queueConsumer = new QueueConsumer($this->logger, $this->websiteIdArchiveList, $countOfProcesses, $pid,
+            $this->model, $this->segmentArchiving, $this, $this->cliMultiRequestParser, $this->archiveFilter);
+
         // invalidate once at the start no matter when the last invalidation occurred
         $this->invalidateArchivedReportsForSitesThatNeedToBeArchivedAgain();
 
-        // if we skip or can't process an idarchive, we want to ignore it the next time we look for an invalidated
-        // archive. these IDs are stored here (using a list like this serves to keep our SQL simple).
-        $this->invalidationsToExclude = [];
-
-        $idSite = null;
         while (true) {
             if ($this->isMaintenanceModeEnabled()) {
                 $this->logger->info("Archiving will stop now because maintenance mode is enabled");
@@ -363,144 +356,15 @@
                 flush();
             }
 
-            if (empty($idSite)) {
-                $idSite = $this->getNextIdSiteToArchive();
-                if (empty($idSite)) { // no sites left to archive, stop
-                    $this->logger->debug("No more sites left to archive, stopping.");
-                    return;
-                }
-
-                /**
-                 * This event is triggered before the cron archiving process starts archiving data for a single
-                 * site.
-                 *
-                 * Note: multiple archiving processes can post this event.
-                 *
-                 * @param int $idSite The ID of the site we're archiving data for.
-                 * @param string $pid The PID of the process processing archives for this site.
-                 */
-                Piwik::postEvent('CronArchive.archiveSingleSite.start', array($idSite, $pid));
-
-                $this->logger->info("Start processing archives for site {idSite}.", ['idSite' => $idSite]);
-
-                $siteTimer = new Timer();
-                $siteRequests = 0;
-            }
-
-            // we don't want to invalidate different periods together or segment archives w/ no-segment archives
-            // together, but it's possible to end up querying these archives. if we find one, we keep track of it
-            // in this array to exclude, but after we run the current batch, we reset the array so we'll still
-            // process them eventually.
-            $invalidationsToExcludeInBatch = [];
-
-            // get archives to process simultaneously
-            $archivesToProcess = [];
-            while (count($archivesToProcess) < $countOfProcesses) {
-                $invalidatedArchive = $this->getNextInvalidatedArchive($idSite, array_keys($invalidationsToExcludeInBatch));
-                if (empty($invalidatedArchive)) {
-                    $this->logger->debug("No next invalidated archive.");
-                    break;
-                }
-
-                if ($this->hasDifferentPeriod($archivesToProcess, $invalidatedArchive['period'])) {
-                    $this->logger->debug("Found archive with different period than others in concurrent batch, skipping until next batch: {$invalidatedArchive['period']}");
-
-                    $idinvalidation = $invalidatedArchive['idinvalidation'];
-                    $invalidationsToExcludeInBatch[$idinvalidation] = true;
-                    continue;
-                }
-
-                if ($this->hasDifferentDoneFlagType($archivesToProcess, $invalidatedArchive['name'])) {
-                    $this->logger->debug("Found archive with different done flag type (segment vs. no segment) in concurrent batch, skipping until next batch: {$invalidatedArchive['name']}");
-
-                    $idinvalidation = $invalidatedArchive['idinvalidation'];
-                    $invalidationsToExcludeInBatch[$idinvalidation] = true;
-
-                    continue;
-                }
-
-                if ($invalidatedArchive['segment'] === null) {
-                    $this->logger->debug("Found archive for segment that is not auto archived, ignoring.");
-                    $this->addInvalidationToExclude($invalidatedArchive);
-                    continue;
-                }
-
-                if ($this->isDoneFlagForPlugin($invalidatedArchive['name'])) {
-                    $this->logger->debug("Found plugin specific invalidated archive, ignoring.");
-                    $this->addInvalidationToExclude($invalidatedArchive);
-                    continue;
-                }
-
-                if ($this->archiveArrayContainsArchive($archivesToProcess, $invalidatedArchive)) {
-                    $this->logger->debug("Found duplicate invalidated archive {$invalidatedArchive['idarchive']}, ignoring.");
-                    $this->addInvalidationToExclude($invalidatedArchive);
-                    $this->model->deleteInvalidations([$invalidatedArchive]);
-                    continue;
-                }
-
-                $reason = $this->shouldSkipArchive($invalidatedArchive);
-                if ($reason) {
-                    $this->logger->debug("Skipping invalidated archive {$invalidatedArchive['idarchive']}: $reason");
-                    $this->addInvalidationToExclude($invalidatedArchive);
-                    continue;
-                }
-
-                if ($this->canSkipArchiveBecauseNoPoint($invalidatedArchive)) {
-                    $this->logger->debug("Found invalidated archive we can skip (no visits or latest archive is not invalidated). "
-                        . "[idSite = {$invalidatedArchive['idsite']}, dates = {$invalidatedArchive['date1']} - {$invalidatedArchive['date2']}, segment = {$invalidatedArchive['segment']}]");
-                    $this->addInvalidationToExclude($invalidatedArchive);
-                    $this->model->deleteInvalidations([$invalidatedArchive]);
-                    continue;
-                }
-
-                // TODO: should use descriptive string instead of just invalidation ID
-                $reason = $this->shouldSkipArchiveBecauseLowerPeriodOrSegmentIsInProgress($invalidatedArchive);
-                if ($reason) {
-                    $this->logger->debug("Skipping invalidated archive {$invalidatedArchive['idarchive']}: $reason");
-                    $invalidationsToExcludeInBatch[$invalidatedArchive['idinvalidation']] = true;
-                    $this->addInvalidationToExclude($invalidatedArchive);
-                    continue;
-                }
-
-                $started = $this->model->startArchive($invalidatedArchive);
-                if (!$started) { // another process started on this archive, pull another one
-                    $this->logger->debug("Archive invalidation {$invalidatedArchive['idinvalidation']} is being handled by another process.");
-                    $this->addInvalidationToExclude($invalidatedArchive);
-                    continue;
-                }
-
-                $this->addInvalidationToExclude($invalidatedArchive);
-
-                $archivesToProcess[] = $invalidatedArchive;
+            $archivesToProcess = $queueConsumer->getNextArchivesToProcess();
+            if ($archivesToProcess === null) {
+                break;
             }
 
-            if (empty($archivesToProcess)) { // no invalidated archive left
-                /**
-                 * This event is triggered immediately after the cron archiving process starts archiving data for a single
-                 * site.
-                 *
-                 * Note: multiple archiving processes can post this event.
-                 *
-                 * @param int $idSite The ID of the site we're archiving data for.
-                 * @param string $pid The PID of the process processing archives for this site.
-                 */
-                Piwik::postEvent('CronArchive.archiveSingleSite.finish', array($idSite, $pid));
-
-                $this->logger->info("Finished archiving for site {idSite}, {requests} API requests, {timer} [{processed} / {totalNum} done]", [
-                    'idSite' => $idSite,
-                    'processed' => $this->websiteIdArchiveList->getNumProcessedWebsites(),
-                    'totalNum' => $this->websiteIdArchiveList->getNumSites(),
-                    'timer' => $siteTimer,
-                    'requests' => $siteRequests,
-                ]);
-
-                $idSite = null;
-
+            if (empty($archivesToProcess)) {
                 continue;
             }
 
-            $siteRequests += count($archivesToProcess);
-
             $successCount = $this->launchArchivingFor($archivesToProcess);
             $numArchivesFinished += $successCount;
         }
@@ -521,64 +385,12 @@ class CronArchive
         $this->logger->info($timer->__toString());
     }
 
-    private function isDoneFlagForPlugin($doneFlag)
-    {
-        return strpos($doneFlag, '.') !== false;
-    }
-
-    private function archiveArrayContainsArchive($archiveArray, $archive)
-    {
-        foreach ($archiveArray as $entry) {
-            if ($entry['idsite'] == $archive['idsite']
-                && $entry['period'] == $archive['period']
-                && $entry['date1'] == $archive['date1']
-                && $entry['date2'] == $archive['date2']
-                && $entry['name'] == $archive['name']
-            ) {
-                return true;
-            }
-        }
-        return false;
-    }
-
-    // TODO: need to also delete rows from archive_invalidations via scheduled task, eg, if ts_invalidated is older than 3 days or something.
-    private function getNextInvalidatedArchive($idSite, $extraInvalidationsToIgnore)
-    {
-        $lastInvalidationTime = self::getLastInvalidationTime();
-        if (empty($lastInvalidationTime)
-            || (time() - $lastInvalidationTime) >= 3600
-        ) {
-            $this->invalidateArchivedReportsForSitesThatNeedToBeArchivedAgain();
-        }
-
-        $iterations = 0;
-        while ($iterations < 100) {
-            $invalidationsToExclude = array_merge($this->invalidationsToExclude, $extraInvalidationsToIgnore);
-
-            $nextArchive = $this->model->getNextInvalidatedArchive($idSite, $invalidationsToExclude);
-            if (empty($nextArchive)) {
-                break;
-            }
-
-            $isCronArchivingEnabled = $this->findSegmentForArchive($nextArchive);
-            if ($isCronArchivingEnabled) {
-                return $nextArchive;
-            }
-
-            $this->invalidationsToExclude[] = $nextArchive['idinvalidation'];
-
-            ++$iterations;
-        }
-
-        return null;
-    }
-
     private function launchArchivingFor($archives)
     {
         $urls = [];
         $archivesBeingQueried = [];
         foreach ($archives as $index => $archive) {
-            list($url, $segment) = $this->generateUrlToArchiveFromArchiveInfo($archive);
+            list($url, $segment, $plugin) = $this->generateUrlToArchiveFromArchiveInfo($archive);
             if (empty($url)) {
                 // can happen if, for example, a segment was deleted after an archive was invalidated
                 // in this case, we can just delete the archive entirely.
@@ -591,6 +403,11 @@ class CronArchive
             $period = PeriodFactory::build($this->periodIdsToLabels[$archive['period']], $dateStr);
             $params = new Parameters(new Site($idSite), $period, new Segment($segment, [$idSite], $period->getDateStart(), $period->getDateEnd()));
 
+            if (!empty($plugin)) {
+                $params->setRequestedPlugin($plugin);
+                $params->onlyArchiveRequestedPlugin();
+            }
+
             $loader = new Loader($params);
             if ($loader->canSkipThisArchive()) {
                 $this->logger->info("Found no visits for site ID = {idSite}, {period} ({date1},{date2}), site is using the tracker so skipping archiving...", [
@@ -620,7 +437,6 @@ class CronArchive
 
         $responses = $cliMulti->request($urls);
         $timers = $cliMulti->getTimers();
-
         $successCount = 0;
 
         foreach ($urls as $index => $url) {
@@ -629,13 +445,15 @@
             $stats = json_decode($content, $assoc = true);
             if (!is_array($stats)) {
-                $this->logError("Error unserializing the following response from $url: " . $content);
+                $this->logger->info(var_export($content, true));
+
+                $this->logError("Error unserializing the following response from $url: '" . $content . "'");
                 continue;
             }
 
             $visitsForPeriod = $this->getVisitsFromApiResponse($stats);
 
-            $this->logArchiveJobFinished($url, $timers[$index], $visitsForPeriod);
+            $this->logArchiveJobFinished($url, $timers[$index], $visitsForPeriod, $archivesBeingQueried[$index]['plugin'], $archivesBeingQueried[$index]['report']);
 
             // TODO: do in ArchiveWriter
             $this->deleteInvalidatedArchives($archivesBeingQueried[$index]);
@@ -661,6 +479,8 @@ class CronArchive
 
     private function generateUrlToArchiveFromArchiveInfo($archive)
     {
+        $plugin = $archive['plugin'];
+        $report = $archive['report'];
         $period = $this->periodIdsToLabels[$archive['period']];
 
         if ($period == 'range') {
@@ -673,7 +493,7 @@
 
         $segment = isset($archive['segment']) ? $archive['segment'] : '';
 
-        $url = $this->getVisitsRequestUrl($idSite, $period, $date, $segment);
+        $url = $this->getVisitsRequestUrl($idSite, $period, $date, $segment, $plugin);
         $url = $this->makeRequestUrl($url);
 
         if (!empty($segment)) {
@@ -685,35 +505,25 @@
             }
         }
 
-        return [$url, $segment];
-    }
-
-    private function findSegmentForArchive(&$archive)
-    {
-        $flag = explode('.', $archive['name'])[0];
-        if ($flag == 'done') {
-            $archive['segment'] = '';
-            return true;
+        if (!empty($plugin)) {
+            $url .= "&pluginOnly=1";
         }
 
-        $hash = substr($flag, 4);
-        $storedSegment = $this->segmentArchiving->findSegmentForHash($hash, $archive['idsite']);
-        if (!isset($storedSegment['definition'])) {
-            $archive['segment'] = null;
-            return false;
+        if (!empty($report)) {
+            $url .= "&requestedReport=" . urlencode($report);
         }
 
-        $archive['segment'] = $storedSegment['definition'];
-        return $this->segmentArchiving->isAutoArchivingEnabledFor($storedSegment);
+        return [$url, $segment, $plugin];
     }
 
-    private function logArchiveJobFinished($url, $timer, $visits)
+    private function logArchiveJobFinished($url, $timer, $visits, $plugin = null, $report = null)
     {
         $params = UrlHelper::getArrayFromQueryString($url);
         $visits = (int) $visits;
 
         $this->logger->info("Archived website id {$params['idSite']}, period = {$params['period']}, date = "
-            . "{$params['date']}, segment = '" . (isset($params['segment']) ? $params['segment'] : '') . "', $visits visits found. $timer");
+            . "{$params['date']}, segment = '" . (isset($params['segment']) ? $params['segment'] : '') . "', "
+            . ($plugin ? "plugin = $plugin, " : "") . ($report ? "report = $report, " : "") . "$visits visits found. $timer");
     }
 
     public function getErrors()
@@ -787,12 +597,15 @@ class CronArchive
      * @param bool|false $segment
      * @return string
      */
-    private function getVisitsRequestUrl($idSite, $period, $date, $segment = false)
+    private function getVisitsRequestUrl($idSite, $period, $date, $segment = false, $plugin = null)
     {
-        $request = "?module=API&method=API.get&idSite=$idSite&period=$period&date=" . $date . "&format=json";
+        $request = "?module=API&method=CoreAdminHome.archiveReports&idSite=$idSite&period=$period&date=" . $date . "&format=json";
         if ($segment) {
            $request .= '&segment=' . urlencode($segment);
        }
+        if (!empty($plugin)) {
+            $request .= "&plugin=" . $plugin;
+        }
        return $request;
    }
@@ -912,11 +725,26 @@ class CronArchive
         foreach ($sitesPerDays as $date => $siteIds) {
             //Concurrent transaction logic will end up with duplicates set. Adding array_unique to the siteIds.
-            $listSiteIds = implode(',', array_unique($siteIds));
+            $siteIds = array_unique($siteIds);
+
+            $period = Factory::build('day', $date);
+
+            $siteIdsToInvalidate = [];
+            foreach ($siteIds as $idSite) {
+                $params = new Parameters(new Site($idSite), $period, new Segment('', [$idSite], $period->getDateStart(), $period->getDateEnd()));
+                if ($this->isThereExistingValidPeriod($params)) {
+                    $this->logger->info('  Found usable archive for date range {date} for site {idSite}, skipping invalidation for now.', ['date' => $date, 'idSite' => $idSite]);
+                    continue;
+                }
+
+                $siteIdsToInvalidate[] = $idSite;
+            }
+
+            $listSiteIds = implode(',', $siteIdsToInvalidate);
 
             try {
                 $this->logger->info('  Will invalidate archived reports for ' . $date . ' for following websites ids: ' . $listSiteIds);
-                $this->getApiToInvalidateArchivedReport()->invalidateArchivedReports($siteIds, $date);
+                $this->getApiToInvalidateArchivedReport()->invalidateArchivedReports($siteIdsToInvalidate, $date);
             } catch (Exception $e) {
                 $this->logger->info('  Failed to invalidate archived reports: ' . $e->getMessage());
             }
@@ -975,8 +803,6 @@ class CronArchive
             }
         }
 
-        Db::fetchAll("SELECT idinvalidation, idarchive, idsite, date1, date2, period, name, status FROM " . Common::prefixTable('archive_invalidations'));
-
         $this->setInvalidationTime();
 
         $this->logger->info("Done invalidating");
@@ -997,7 +823,7 @@
             $loader = new Loader($params);
             if ($loader->canSkipThisArchive()) {
-                $this->logger->debug("  " . ucfirst($dateStr) . " archive can be skipped due to no visits, skipping invalidation...");
+                $this->logger->debug("  " . ucfirst($dateStr) . " archive can be skipped due to no visits for idSite = $idSite, skipping invalidation...");
                 continue;
             }
@@ -1010,15 +836,15 @@
         }
     }
 
-    private function isThereExistingValidPeriod(Parameters $params, $isYesterday = false)
+    public function isThereExistingValidPeriod(Parameters $params, $isYesterday = false)
     {
-        $today = Date::factory('today');
+        $today = Date::factoryInTimezone('today', Site::getTimezoneFor($params->getSite()->getId()));
         $isPeriodIncludesToday = $params->getPeriod()->isDateInPeriod($today);
         $minArchiveProcessedTime = $isPeriodIncludesToday ? Date::now()->subSeconds(Rules::getPeriodArchiveTimeToLiveDefault($params->getPeriod()->getLabel())) : null;
 
         // empty plugins param since we only check for an 'all' archive
-        list($idArchive, $visits, $visitsConverted, $ignore, $tsArchived) = ArchiveSelector::getArchiveIdAndVisits($params, $minArchiveProcessedTime, $includeInvalidated = false);
+        list($idArchive, $visits, $visitsConverted, $ignore, $tsArchived) = ArchiveSelector::getArchiveIdAndVisits($params, $minArchiveProcessedTime, $includeInvalidated = $isPeriodIncludesToday);
 
         // day has changed since the archive was created, we need to reprocess it
         if ($isYesterday
@@ -1096,7 +922,7 @@
             $this->logger->info("  See the doc at: https://matomo.org/docs/setup-auto-archiving/");
         }
 
-        $cliMulti = new CliMulti();
+        $cliMulti = new CliMulti($this->logger);
         $supportsAsync = $cliMulti->supportsAsync();
         $this->logger->info("- " . ($supportsAsync ? 'Async process archiving supported, using CliMulti.' : 'Async process archiving not supported, using curl requests.'));
@@ -1255,7 +1081,7 @@
     private function makeCliMulti()
     {
         /** @var CliMulti $cliMulti */
-        $cliMulti = StaticContainer::getContainer()->make('Piwik\CliMulti');
+        $cliMulti = new CliMulti($this->logger);
         $cliMulti->setUrlToPiwik($this->urlToPiwik);
         $cliMulti->setPhpCliConfigurationOptions($this->phpCliConfigurationOptions);
         $cliMulti->setAcceptInvalidSSLCertificate($this->acceptInvalidSSLCertificate);
@@ -1321,15 +1147,6 @@
         return false;
     }
 
-    private function shouldSkipArchive($archive)
-    {
-        if ($this->archiveFilter) {
-            return $this->archiveFilter->filterArchive($archive);
-        }
-
-        return false;
-    }
-
     protected function wasSegmentChangedRecently($definition, $allSegments)
     {
         foreach ($allSegments as $segment) {
@@ -1354,19 +1171,6 @@
         $this->archiveFilter = $archiveFilter;
     }
 
-    private function addInvalidationToExclude(array $invalidatedArchive)
-    {
-        $id = $invalidatedArchive['idinvalidation'];
-        if (empty($this->invalidationsToExclude[$id])) {
-            $this->invalidationsToExclude[$id] = $id;
-        }
-    }
-
-    private function getNextIdSiteToArchive()
-    {
-        return $this->websiteIdArchiveList->getNextSiteId();
-    }
-
     private function makeWebsiteIdArchiveList(array $websitesIds)
     {
         if ($this->shouldArchiveAllSites) {
@@ -1381,117 +1185,4 @@
         return new SharedSiteIds($websitesIds, SharedSiteIds::OPTION_ALL_WEBSITES);
     }
-
-    private function hasDifferentPeriod(array $archivesToProcess, $period)
-    {
-        if (empty($archivesToProcess)) {
-            return false;
-        }
-
-        return $archivesToProcess[0]['period'] != $period;
-    }
-
-    private function hasDifferentDoneFlagType(array $archivesToProcess, $name)
-    {
-        if (empty($archivesToProcess)) {
-            return false;
-        }
-
-        $existingDoneFlagType = $this->getDoneFlagType($archivesToProcess[0]['name']);
-        $newArchiveDoneFlagType = $this->getDoneFlagType($name);
-
-        return $existingDoneFlagType != $newArchiveDoneFlagType;
-    }
-
-    private function getDoneFlagType($name)
-    {
-        if ($name == 'done') {
-            return 'all';
-        } else {
-            return 'segment';
-        }
-    }
-
-    private function canSkipArchiveBecauseNoPoint(array $invalidatedArchive)
-    {
-        $site = new Site($invalidatedArchive['idsite']);
-
-        $periodLabel = $this->periodIdsToLabels[$invalidatedArchive['period']];
-        $dateStr = $periodLabel == 'range' ? ($invalidatedArchive['date1'] . ',' . $invalidatedArchive['date2']) : $invalidatedArchive['date1'];
-        $period = PeriodFactory::build($periodLabel, $dateStr);
-
-        $segment = new Segment($invalidatedArchive['segment'], [$invalidatedArchive['idsite']], $period->getDateStart(), $period->getDateEnd());
-
-        $params = new Parameters($site, $period, $segment);
-
-        $loader = new Loader($params);
-        if ($loader->canSkipThisArchive()) { // if no point in archiving, skip
-            return true;
-        }
-
-        // if valid archive already exists, do not re-archive
-        $minDateTimeProcessedUTC = Date::now()->subSeconds(Rules::getPeriodArchiveTimeToLiveDefault($periodLabel));
-        $archiveIdAndVisits = ArchiveSelector::getArchiveIdAndVisits($params, $minDateTimeProcessedUTC, $includeInvalidated = false);
-
-        $idArchive = $archiveIdAndVisits[0];
-        return !empty($idArchive);
-    }
-
-    private function shouldSkipArchiveBecauseLowerPeriodOrSegmentIsInProgress(array $archiveToProcess)
-    {
-        $inProgressArchives = $this->cliMultiRequestParser->getInProgressArchivingCommands();
-
-        $periodLabel = $this->periodIdsToLabels[$archiveToProcess['period']];
-        $archiveToProcess['periodObj'] = PeriodFactory::build($periodLabel, $archiveToProcess['date1']);
-
-        foreach ($inProgressArchives as $archiveBeingProcessed) {
-            if (empty($archiveBeingProcessed['period'])
-                || empty($archiveBeingProcessed['date'])
-            ) {
-                continue;
-            }
-
-            $archiveBeingProcessed['periodObj'] = PeriodFactory::build($archiveBeingProcessed['period'], $archiveBeingProcessed['date']);
-
-            if ($this->isArchiveOfLowerPeriod($archiveToProcess, $archiveBeingProcessed)) {
-                return "lower period in progress (period = {$archiveBeingProcessed['period']}, date = {$archiveBeingProcessed['date']})";
-            }
-
-            if ($this->isArchiveNonSegmentAndInProgressArchiveSegment($archiveToProcess, $archiveBeingProcessed)) {
-                return "segment archive in progress for same site/period ({$archiveBeingProcessed['segment']})";
-            }
-        }
-
-        return false;
-    }
-
-    private function isArchiveOfLowerPeriod(array $archiveToProcess, $archiveBeingProcessed)
-    {
-        /** @var Period $archiveToProcessPeriodObj */
-        $archiveToProcessPeriodObj = $archiveToProcess['periodObj'];
-        /** @var Period $archivePeriodObj */
-        $archivePeriodObj = $archiveBeingProcessed['periodObj'];
-
-        if ($archiveToProcessPeriodObj->getId() >= $archivePeriodObj->getId()
-            && $archiveToProcessPeriodObj->isPeriodIntersectingWith($archivePeriodObj)
-        ) {
-            return true;
-        }
-
-        return false;
-    }
-
-    private function isArchiveNonSegmentAndInProgressArchiveSegment(array $archiveToProcess, array $archiveBeingProcessed)
-    {
-        // archive is for different site/period
-        if (empty($archiveBeingProcessed['idSite'])
-            || $archiveToProcess['idsite'] != $archiveBeingProcessed['idSite']
-            || $archiveToProcess['periodObj']->getId() != $archiveBeingProcessed['periodObj']->getId()
-            || $archiveToProcess['periodObj']->getDateStart()->toString() != $archiveBeingProcessed['periodObj']->getDateStart()->toString()
-        ) {
-            return false;
-        }
-
-        return empty($archiveToProcess['segment']) && !empty($archiveBeingProcessed['segment']);
-    }
-}
\ No newline at end of file
+}