Introduction
In web2py, every HTTP request is served in its own thread. The web server manages threads and recycles them for efficiency. For security, the web server also sets a timeout on each request. This means that actions should not run lengthy tasks, start new threads, or fork processes (it is possible but not recommended).
Time-consuming tasks should be run in the background. There is no single way to do this: web2py has three mechanisms built in, namely cron, custom task queues, and the scheduler.
What are we going to learn in this article?
Well, Ninja, this article covers the web2py scheduler, a task's life cycle, and how task results and output are recorded.
Explore now!
web2py Scheduler
The built-in scheduler is the most popular web2py solution for running tasks in the background (and thus separate from the webserver process).
These are the functions that make up the stable API:
resume()
disable()
kill()
terminate()
task_status()
queue_task()
stop_task()
The web2py scheduler works much like a custom task queue, with a few differences:
It provides a standard mechanism for creating, scheduling, and monitoring tasks.
It uses a set of worker processes rather than a single background process.
The state of both the tasks and the worker nodes is stored in the database, so the workers' activity can be monitored.
Tasks
Tasks can be scheduled programmatically or through appadmin. Scheduling a task simply means adding an entry to the "scheduler_task" table, which you can access through appadmin.
The fields in this table have clear meanings. The "args" and "vars" fields hold the values to be passed to the task, in JSON format. For a task function such as "task_add" that takes two parameters, a and b, "args" and "vars" could be:
args = [3, 4]
vars = {}
or
args = []
vars = {'a': 3, 'b': 4}
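To make the two variants concrete, here is a minimal sketch of how JSON-encoded "args" and "vars" map onto a function call. It is not the scheduler's own code: the body of task_add and the helper call_with_stored_arguments are assumptions made for illustration.

```python
import json


def task_add(a, b):
    # Hypothetical task function; only its name comes from this article.
    return a + b


def call_with_stored_arguments(function, args_json, vars_json):
    """Decode the JSON text of "args" and "vars" and call the function."""
    args = json.loads(args_json)    # positional arguments, stored as a JSON list
    kwargs = json.loads(vars_json)  # keyword arguments, stored as a JSON object
    return function(*args, **kwargs)


# Both variants reach the same call. Note that JSON requires double quotes.
positional_result = call_with_stored_arguments(task_add, '[3, 4]', '{}')
keyword_result = call_with_stored_arguments(task_add, '[]', '{"a": 3, "b": 4}')
print(positional_result, keyword_result)
```

Either form works; "vars" tends to be easier to read when a task takes several parameters.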
Tasks are stored in the "scheduler_task" table.
You can use the API to add tasks.
scheduler.queue_task('mytask', ...)
Task Lifecycle
All tasks follow a lifecycle.
When a task is sent to the scheduler, it is in the QUEUED status by default. If you need it to run later, use the start_time parameter (default = now). If you need to make sure a task is not executed after a certain point in time (for example, a request to a web service that shuts down at 1 AM, or an email that must not be sent after working hours), set the stop_time parameter (default = None). If no worker picks the task up before stop_time, it is marked as EXPIRED. Tasks with no stop_time set, or picked up before stop_time, are ASSIGNED to a worker. When the worker starts working on a task, the task is marked as RUNNING.
RUNNING tasks can end up as:
TIMEOUT, when the task runs for more than n seconds, where n is the timeout parameter (default = 60 seconds)
FAILED, when an exception is raised
COMPLETED, when they finish successfully
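The first part of this lifecycle can be sketched as a small decision function. This is only a simplified model of the rules described above, not the scheduler's implementation; the function name status_at_pickup and the sample dates are assumptions.

```python
from datetime import datetime, timedelta


def status_at_pickup(now, start_time, stop_time=None):
    """Simplified model: status of a queued task when a worker looks at it."""
    if stop_time is not None and now >= stop_time:
        return 'EXPIRED'   # nobody picked it up before stop_time
    if now < start_time:
        return 'QUEUED'    # not due yet, so it stays in the queue
    return 'ASSIGNED'      # due and still valid: handed to a worker


start = datetime(2024, 1, 1, 9, 0)
stop = start + timedelta(hours=1)

early = status_at_pickup(start - timedelta(minutes=5), start, stop)
on_time = status_at_pickup(start + timedelta(minutes=5), start, stop)
too_late = status_at_pickup(stop + timedelta(minutes=5), start, stop)
no_limit = status_at_pickup(stop + timedelta(days=1), start)
print(early, on_time, too_late, no_limit)
```

Leaving stop_time at None means the task can never become EXPIRED, which is why the last call is still ASSIGNED a day later.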
Results and Output in web2py
The "scheduler_run" table stores the status of all running tasks. Each record refers to a task that has been picked up by a worker. One task can have several runs: for instance, a task scheduled to repeat 10 times, once an hour, will probably have 10 runs (unless one fails or a run takes longer than 1 hour). Be aware that if a task has no return value, its run is removed from the "scheduler_run" table as soon as it finishes.
Possible run statuses include:
RUNNING, COMPLETED, FAILED, TIMEOUT
If the run finishes, no exceptions are raised, and the task does not time out, the run is marked as COMPLETED, and the task is marked as QUEUED or COMPLETED depending on whether it is supposed to run again later. The task's return value is serialized in JSON and stored in the run record.
A RUNNING task that throws an exception marks both the run and the task as FAILED. The run record contains the traceback.
Similarly, when a run exceeds the timeout, it is stopped and marked as TIMEOUT, and the task is also marked as TIMEOUT.
Whatever the case, the stdout is recorded and added to the run record.
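The outcomes above can be illustrated with a simplified, single-process model of one run. It is a sketch under stated assumptions, not web2py's code: the helper execute_run and the record keys are illustrative, and the TIMEOUT case is left out because it requires stopping a separate process.

```python
import contextlib
import io
import json
import traceback


def execute_run(function, *args, **kwargs):
    """Simplified model of one run: returns a dict shaped like a run record."""
    captured = io.StringIO()
    record = {'status': 'RUNNING', 'run_result': None, 'traceback': None}
    try:
        with contextlib.redirect_stdout(captured):
            result = function(*args, **kwargs)
        record['status'] = 'COMPLETED'
        record['run_result'] = json.dumps(result)  # return value stored as JSON
    except Exception:
        record['status'] = 'FAILED'
        record['traceback'] = traceback.format_exc()
    record['run_output'] = captured.getvalue()  # stdout is kept in every case
    return record


def task_add(a, b):
    # Hypothetical task that also prints, to show the captured stdout.
    print('adding', a, b)
    return a + b


ok_run = execute_run(task_add, 3, 4)
bad_run = execute_run(task_add, 3, 'four')  # int + str raises TypeError
print(ok_run['status'], bad_run['status'])
```

Note that the failed run still has its printed output, which is often the quickest clue when debugging a task.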
Because of multiprocessing limitations, be careful with huge return values or huge print statements in the queued functions. Since the output is buffered, your task may fail simply because the parent process hangs while reading values. Keep print statements to a minimum and, if needed, use a proper logging library that does not clutter stdout. For huge return values, a better choice may be a table where the function saves its results: the function can then return only a reference to the specific row of results, without hindering the scheduler's master process.
Using appadmin, you can check all RUNNING tasks, the output of COMPLETED tasks, the errors of FAILED tasks, and so on.
Additionally, the scheduler creates a table called "scheduler_worker" that stores the status and heartbeat of the workers.
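The "save the result in a table, return a reference" idea might look like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for the web2py database layer, and the table name report_result and the function build_big_report are hypothetical.

```python
import sqlite3

# In-memory database standing in for the application's own database.
connection = sqlite3.connect(':memory:')
connection.execute(
    'CREATE TABLE report_result (id INTEGER PRIMARY KEY, body TEXT)')


def build_big_report(line_count):
    """Hypothetical task: stores a bulky result and returns only its row id."""
    body = '\n'.join('line %d' % n for n in range(line_count))
    cursor = connection.execute(
        'INSERT INTO report_result (body) VALUES (?)', (body,))
    connection.commit()
    return cursor.lastrowid  # small return value: a reference, not the data


row_id = build_big_report(10000)

# Whoever needs the data later fetches it by reference.
stored_body = connection.execute(
    'SELECT body FROM report_result WHERE id = ?', (row_id,)).fetchone()[0]
print(row_id, len(stored_body))
```

The run record then only has to hold a small integer, while the bulky text lives in a table designed for it.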
Managing Processes
Worker management is challenging. This module makes an effort to support every platform (Mac, Windows, and Linux).
Once you have started a worker, you may later want to:
kill it no matter what it is doing
terminate it only if it is not processing tasks
put it to sleep
Or perhaps you have some tasks queued and want to save resources. If you know they only need to be processed every hour, you may want to:
process all queued tasks and then exit automatically
All of these things are possible by managing the Scheduler parameters or the "scheduler_worker" table. More precisely, for started workers you can change the status value of any worker to influence its behavior. As with tasks, workers can be in one of several statuses: ACTIVE, DISABLED, TERMINATE, or KILL.
ACTIVE and DISABLED are "persistent" statuses, whereas TERMINATE and KILL, as their names suggest, are more "commands" than actual statuses. Pressing Ctrl+C is equivalent to setting a worker to KILL.
Since version 2.4.1, a few convenience functions are available (self-explanatory): scheduler.disable(), scheduler.resume(), scheduler.terminate(), and scheduler.kill().
Each function accepts a string or list as an optional parameter that can be used to manage workers according to their group names. The group names specified during scheduler instantiation are used by default.
An example is worth a thousand words: scheduler.terminate('high_prio') will terminate all workers that are processing high_prio tasks, whereas scheduler.terminate(['high_prio', 'low_prio']) will terminate all high_prio and low_prio workers.
Be careful: if you have a worker processing both high_prio and low_prio tasks, scheduler.terminate('high_prio') will terminate that worker altogether, even if you did not want to stop low_prio processing as well.
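The caveat about workers that serve more than one group is easy to see in a toy model. The function below is not the scheduler's terminate(); it is a hypothetical stand-in that applies the same rule to a plain list of worker records.

```python
def terminate(workers, group_names):
    """Simplified model: flag every worker that serves any of the given groups."""
    if isinstance(group_names, str):
        group_names = [group_names]  # accept a single name or a list of names
    for worker in workers:
        if set(worker['group_names']) & set(group_names):
            worker['status'] = 'TERMINATE'
    return workers


workers = [
    {'name': 'w1', 'group_names': ['high_prio'], 'status': 'ACTIVE'},
    {'name': 'w2', 'group_names': ['low_prio'], 'status': 'ACTIVE'},
    {'name': 'w3', 'group_names': ['high_prio', 'low_prio'], 'status': 'ACTIVE'},
]

terminate(workers, 'high_prio')
statuses = {worker['name']: worker['status'] for worker in workers}
print(statuses)
```

Worker w3 is flagged even though it also serves low_prio, so after this call only w2 is left to process low_prio tasks.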
By adding and updating records in these tables, one can programmatically carry out all actions that one can carry out via appadmin.
However, updating records that relate to RUNNING tasks is not advised, because it might result in unexpected behavior. The recommended practice is to queue tasks with the queue_task method.
For example:
scheduler.queue_task(
    'task_add',              # name of the task function to run
    pargs=[],                # positional arguments
    pvars={'a': 3, 'b': 4},  # keyword arguments
    repeats=10,              # run 10 times
    period=3600,             # every 1h
    timeout=120,             # should take less than 120 seconds
)
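To see what repeats and period mean together, the nominal start times of the runs can be worked out by hand. The sketch below is plain arithmetic, not scheduler code: the helper nominal_run_times and the start date are assumptions, and it ignores how long each run takes and any drift between runs.

```python
from datetime import datetime, timedelta


def nominal_run_times(start_time, period, repeats):
    """Start times if every run began exactly one period after the previous one."""
    return [start_time + timedelta(seconds=period * n) for n in range(repeats)]


runs = nominal_run_times(datetime(2024, 1, 1, 9, 0), period=3600, repeats=10)
for run_time in runs:
    print(run_time.isoformat())
```

With these values the first run starts at 09:00 and the tenth at 18:00, nine periods later.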
The fields "times_run", "last_run_time", and "assigned_worker_name" in the "scheduler_task" table are not provided at schedule time; the workers fill them in automatically.
You can also retrieve the output of completed tasks from the "scheduler_run" table.
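As a rough illustration of reading results back, the sketch below filters run records by status and decodes their JSON results. A plain list of dictionaries stands in for the rows of the run table, and the key names are illustrative assumptions rather than the guaranteed column names.

```python
import json

# Stand-in for rows of the run table; the key names are illustrative.
runs = [
    {'task_id': 1, 'status': 'COMPLETED', 'run_result': '7'},
    {'task_id': 2, 'status': 'FAILED', 'run_result': None},
    {'task_id': 3, 'status': 'COMPLETED', 'run_result': '{"total": 12}'},
]


def completed_results(run_rows):
    """Return {task_id: decoded result} for the runs that completed."""
    return {row['task_id']: json.loads(row['run_result'])
            for row in run_rows if row['status'] == 'COMPLETED'}


results = completed_results(runs)
print(results)
```

In a real application the same filter would be expressed as a database query against the run table instead of a list comprehension.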
The scheduler is regarded as experimental due to the need for more thorough testing and the possibility of the table structure changing as additional features are added.
Reporting Progress Percentages
When your functions' print statements contain a special "word", all previous output is cleared. That word is !clear!. Combined with the sync_output parameter, this makes it possible to report percentages.
A function that reports percentages sleeps for 5 seconds and outputs 50%. It then sleeps for another 5 seconds and outputs 100%. Note that the output in the "scheduler_run" table is synced every 2 seconds, and that the second print statement contains !clear!100%, so the 50% output is cleared and replaced by 100% only.
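A function of this kind could look like the sketch below. The name reporting_percentages and the delay parameter are assumptions for illustration, and visible_output is only a simplified model of the clearing rule (keep what follows the last !clear! marker), not the scheduler's own code.

```python
import time

CLEAR_MARKER = '!clear!'


def reporting_percentages(delay=5):
    """Hypothetical task that reports its progress through print statements."""
    time.sleep(delay)
    print('50%')
    time.sleep(delay)
    print(CLEAR_MARKER + '100%')  # the marker asks for earlier output to be dropped
    return 1


def visible_output(raw_output):
    """Simplified model: keep only what follows the last clear marker."""
    position = raw_output.rfind(CLEAR_MARKER)
    if position == -1:
        return raw_output
    return raw_output[position + len(CLEAR_MARKER):]


halfway = visible_output('50%\n')
finished = visible_output('50%\n' + CLEAR_MARKER + '100%\n')
print(repr(halfway), repr(finished))
```

In web2py the task would be queued with a sync_output value (for example 2 seconds) so that the stored output is refreshed while the task is still running.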
A task's status is RUNNING once a worker has picked it up and started working on it.
What are the possible run statuses?
Possible run statuses are: RUNNING, COMPLETED, FAILED, TIMEOUT
When is the run status marked as COMPLETED?
If the run is completed, no exceptions are thrown, and there is no task timeout, the run is marked as COMPLETED.
When is the run status marked as FAILED?
When a RUNNING task throws an exception, the run is marked as FAILED.
When is the run status marked as TIMEOUT?
When a run exceeds the timeout, it is stopped and marked as TIMEOUT.
Conclusion
In this article, we discussed the web2py scheduler, the task life cycle, run statuses, results and output, process management, and progress reporting.