Move asynchronous tasks from web app container to dedicated MediaWiki maintenance container #33

Closed
jeffw16 opened this issue Feb 11, 2022 · 20 comments
Labels
enhancement New feature or request

Comments

@jeffw16
Member

jeffw16 commented Feb 11, 2022

No description provided.

@jeffw16 jeffw16 added the enhancement New feature or request label Mar 2, 2022
@amalpaul54111
Contributor

I am not sure how to approach this task, but if I could get some pointers on how it could be done, I might be able to work on it.

@abhi-bhatra

Hi @jeffw16, I want to work on this issue. Can you provide a short description of it and suggest where to start?

@Ayman161803
Contributor

Ayman161803 commented Mar 7, 2023

Can you please elaborate on what you mean by asynchronous tasks in the web app container?

@jeffw16
Member Author

jeffw16 commented Mar 8, 2023

Basically, anything that run-apache.sh is doing that isn't caused by the last line of the script: exec /usr/sbin/apachectl -DFOREGROUND. Namely:

jobrunner &
transcoder &
sitemapgen &
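
For illustration, the pattern described above (background tasks launched with & before the final exec) might look roughly like the sketch below. The function bodies are placeholders that only write a marker file, and workdir is an invented name; the real run-apache.sh is not reproduced here.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real jobrunner, transcoder and sitemapgen loops.
workdir=$(mktemp -d)
jobrunner()  { echo "jobrunner started"  > "$workdir/jobrunner.log"; }
transcoder() { echo "transcoder started" > "$workdir/transcoder.log"; }
sitemapgen() { echo "sitemapgen started" > "$workdir/sitemapgen.log"; }

# Asynchronous tasks: each one is started in the background with '&'.
jobrunner &
transcoder &
sitemapgen &

# In the real script the last line hands the process over to Apache:
#   exec /usr/sbin/apachectl -DFOREGROUND
# This sketch only waits for the background stand-ins so that it terminates.
wait
```

Everything above the commented-out exec line is what this issue proposes to move out of the web container.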

@Ayman161803
Contributor

Ayman161803 commented Mar 8, 2023

Thank you. That makes sense. From what I understand, the main purpose of this container will be to run the maintenance scripts mwjobrunner.sh, mwsitemapgen.sh, and mwtranscoder.sh. The following steps will need to be implemented:

  1. Implement a separate Dockerfile for the maintenance container.
  2. Add the maintenance container to docker-compose.
  3. Create a volume named MediaWikiHome in docker-compose.
  4. Access the above volume from both the web app and the maintenance container.

Do correct me if I am wrong.

I would like to work on this. Can I please have this assigned?

@jeffw16
Member Author

jeffw16 commented Mar 8, 2023

Assigned.

Can I ask why we need a shared volume called MediaWikiHome? I'm not thinking about this too deeply, so I wanted to ask you first before I think too much about it.

@Ayman161803
Contributor

Once the maintenance container is set up, I would expect this container (like the web container does here) to run maintenance scripts as well.

An example where maintenance might need access to something like mw_home:
Currently our jobrunner script runs htmlCacheUpdates here.

I would expect jobs like the one mentioned above to require access to MW_HOME to perform as expected.

Please correct me if I am wrong.

@jeffw16
Member Author

jeffw16 commented Mar 8, 2023

The maintenance container would need a reduced subset of what the web container (basically, the remaining Canasta container) would need:

  • PHP-FPM
  • The MediaWiki code
  • The same Canasta-specific code, like the LocalSettings.php, CanastaDefaultSettings.php, etc.
  • The same bind mounts: LocalSettings.php volumed in, etc.

The major changes are:

  • There's no Apache running on this container
  • The images directory will not be volumed in
  • The extensions and skins directory should follow the current practice of symlinking the {canasta,user}-{extensions,skins} directories' contents into the {extensions,skins} directories.

It may seem wasteful to duplicate the MediaWiki container's storage, but I think using a volume to store the shared MediaWiki code would make things messy. We don't want any chance of mixing code for computation (i.e. the MediaWiki core code + maintenance scripts, extensions, skins) with items actually meant for persistent storage, such as media files and MySQL databases. Volumes are meant for the latter, so we should shy away from mixing the two.
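
The symlinking practice mentioned in the list above could be sketched roughly as follows. This is only an illustration: MW_HOME points at a throwaway temporary directory, the sample extension and skin names are made up, and link_contents is a hypothetical helper rather than anything taken from the Canasta scripts.

```shell
#!/bin/sh
# Sketch only: link every entry of {canasta,user}-{extensions,skins}
# into the matching {extensions,skins} directory.
MW_HOME=$(mktemp -d)   # assumption: a temp dir stands in for the real install path
mkdir -p "$MW_HOME/canasta-extensions/Cite" "$MW_HOME/user-extensions/MyExt" \
         "$MW_HOME/canasta-skins/Vector" "$MW_HOME/user-skins" \
         "$MW_HOME/extensions" "$MW_HOME/skins"

link_contents() {
    # $1 = source directory, $2 = target directory
    for entry in "$1"/*; do
        [ -e "$entry" ] || continue   # an empty directory leaves the glob unexpanded
        ln -sfn "$entry" "$2/$(basename "$entry")"
    done
}

for kind in extensions skins; do
    for origin in canasta user; do
        link_contents "$MW_HOME/$origin-$kind" "$MW_HOME/$kind"
    done
done
```

Because the user directories are linked after the canasta ones, a user-supplied entry with the same name would replace the bundled link in this sketch; whether that ordering matches Canasta's actual behavior is not stated in this thread.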

@Ayman161803
Contributor

Makes sense. Noted.

@jeffw16
Member Author

jeffw16 commented Mar 19, 2023

Any progress on this @Ayman161803?

@jeffw16
Member Author

jeffw16 commented Mar 20, 2023

Ok, since it has been 2 weeks, this task has been released to other potential contributors.

@PiyushRaj927
Contributor

PiyushRaj927 commented Mar 24, 2023

I would like to work on this issue. Can I please have it assigned?

Regarding "The images directory will not be volumed in":

It seems that the jobs webVideoTranscodePrioritized and webVideoTranscode will require access to the file store in order to transcode the videos properly. Therefore, I believe it may be necessary to mount the images directory.

Please correct me if I have misunderstood something.

@jeffw16
Member Author

jeffw16 commented Mar 25, 2023

@PiyushRaj927 okay, feel free to work on it; it's yours.

Thanks for your observation of the images directory. Yes, you are right. Looks like we indeed need to volume it in. Great catch!

@PiyushRaj927
Contributor

Hi @jeffw16, I have a question about the new container that we're working on.
I'm not quite sure where I should submit my PR. Would it be a new repository?

@jeffw16
Member Author

jeffw16 commented Mar 26, 2023

@PiyushRaj927 Great question. I just made a repo that you can submit your PR to: https://github.com/CanastaWiki/Canasta-Maintenance

@PiyushRaj927
Contributor

PiyushRaj927 commented Mar 28, 2023

Hi @jeffw16, I have submitted the related PRs for this issue: Canasta-Maintenance#1, Canasta#266, and Canasta-DockerCompose#42.
Please review them when you are free. Thanks

@freephile
Contributor

I do not believe Canasta should create a completely different image (Dockerfile) for running asynchronous scripts.

Quoting from factor XII of the Twelve-Factor App:

One-off admin processes should be run in an identical environment as the regular long-running processes of the app. They run against a release, using the same codebase and config as any process run against that release. Admin code must ship with application code to avoid synchronization issues.

https://12factor.net/admin-processes

The exact method depends on the type of deployment (is it AWS Fargate? local docker-compose?). Scripts can either be run from the REPL (e.g. docker compose exec myapp php maintenance/update.php --quick) or be scripted into the entrypoint (e.g. if $DO_UPDATE_PHP then ...), in which case a new container can be started with that environment variable 'turned on'. The entrypoint script can exit on script completion.

@freephile
Contributor

freephile commented May 12, 2023

Example entrypoint script stanza, based on an environment variable named MEDIAWIKI_UPDATE and using a semaphore (lock) in the shared external volume to guard against concurrent runs:

# If LocalSettings.php exists, then attempt to run the update.php maintenance
# script. If already up to date, it won't do anything, otherwise it will
# migrate the database if necessary on container startup. It also will
# verify the database connection is working.
if [ -e "LocalSettings.php" ] && [ "$MEDIAWIKI_UPDATE" = 'true' ] && [ ! -f "$MEDIAWIKI_SHARED/update.lock" ]; then
    # Take the lock so other containers sharing the volume skip the update.
    touch "$MEDIAWIKI_SHARED/update.lock"
    echo >&2 'info: Running maintenance/update.php'
    php maintenance/update.php --quick --conf ./LocalSettings.php
    # Release the lock once the update has finished.
    rm -f "$MEDIAWIKI_SHARED/update.lock"
fi
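
A lock file that is tested first and created afterwards leaves a short window in which two containers could both find no lock. One possible variation, sketched below under stated assumptions, takes the lock with mkdir, which either creates the directory or fails in a single step on a local POSIX filesystem (behavior on network-backed volumes may differ). Here MEDIAWIKI_SHARED points at a temporary directory, and run_update and update_with_lock are hypothetical names, with run_update standing in for the php maintenance/update.php call.

```shell
#!/bin/sh
# Sketch only: MEDIAWIKI_SHARED is a temp directory and run_update is a
# hypothetical stand-in for "php maintenance/update.php --quick".
MEDIAWIKI_SHARED=$(mktemp -d)
lock_dir="$MEDIAWIKI_SHARED/update.lock.d"
update_runs=0

run_update() {
    update_runs=$((update_runs + 1))
}

update_with_lock() {
    # mkdir either creates the directory or fails, so there is no gap
    # between checking for the lock and taking it.
    if mkdir "$lock_dir" 2>/dev/null; then
        run_update
        rmdir "$lock_dir"
        return 0
    fi
    echo >&2 'info: update already running elsewhere, skipping'
    return 1
}

update_with_lock             # lock is free: the update runs
mkdir "$lock_dir"            # simulate another container holding the lock
update_with_lock || true     # lock is held: the update is skipped
rmdir "$lock_dir"
```

In this sketch the stand-in update runs exactly once, because the second call finds the lock directory already present.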

@yaronkoren
Member

I think it's safe to close this issue - the consensus now seems to be that it's better to keep the maintenance scripts in the same container as MediaWiki. Which in turn means that the Canasta-Maintenance repository can probably be removed. My apologies to the people who worked on this, especially @PiyushRaj927 - sometimes plans just change, unfortunately.

@PiyushRaj927
Contributor

No problem! It's natural for plans to shift. Thank you for the update.
