seesaw Package
ArchiveTeam seesaw kit
config Module
Configuration value manipulation.
-
class seesaw.config.ConfigInterpolation(s, c)[source]
Bases: object
-
realize(item)[source]
-
class seesaw.config.ConfigValue(name, title='', description='', default=None, editable=True, advanced=True)[source]
Bases: object
Configuration value validator.
The collection methods are useful for providing user-configurable
settings at run time. For example, when a pipeline file is executed
by the warrior, the additional config values are presented in the
warrior configuration panel.
-
check_value(value)[source]
-
collector = None
-
convert_value(value)[source]
-
is_valid()[source]
-
realize(dummy)[source]
-
set_value(value)[source]
-
classmethod start_collecting()[source]
-
classmethod stop_collecting()[source]
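The collecting pattern described for ConfigValue can be sketched without seesaw installed. This is a minimal stand-in, not the library's implementation: CollectingValue is a hypothetical class, and it assumes that start_collecting() opens a list, that every value constructed afterwards appends itself to it, and that stop_collecting() returns the list and closes it.

```python
class CollectingValue(object):
    """Hypothetical stand-in for the collector pattern (assumed behavior)."""

    collector = None  # None means nobody is collecting right now.

    def __init__(self, name, default=None):
        self.name = name
        self.value = default
        if CollectingValue.collector is not None:
            # Register the new value with the active collection.
            CollectingValue.collector.append(self)

    @classmethod
    def start_collecting(cls):
        CollectingValue.collector = []

    @classmethod
    def stop_collecting(cls):
        collected = CollectingValue.collector
        CollectingValue.collector = None
        return collected


# Values created between start and stop are gathered, e.g. while a
# pipeline file is being executed.
CollectingValue.start_collecting()
CollectingValue('downloader', default='anonymous')
CollectingValue('concurrent_items', default=1)
collected = CollectingValue.stop_collecting()

# Created after collection stopped, so it is not gathered.
ignored = CollectingValue('created_later')
names = [value.name for value in collected]
```

Under these assumptions, names holds only the two values created while collection was active, which is the set a configuration panel would present.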
-
class seesaw.config.NumberConfigValue(*args, **kwargs)[source]
Bases: seesaw.config.ConfigValue
-
check_value(value)[source]
-
convert_value(value)[source]
-
class seesaw.config.StringConfigValue(*args, **kwargs)[source]
Bases: seesaw.config.ConfigValue
-
check_value(value)[source]
-
seesaw.config.realize(v, item=None)[source]
Makes objects contain concrete values from an item.
A silly example:
    class AddExpression(object):
        def realize(self, item):
            return item['x'] + item['y']

    pipeline = Pipeline(ComputeMath(AddExpression()))
In the example, we want to compute an addition expression. The values
are defined in the Item.
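The idea can be tried without seesaw installed. The snippet below is a minimal stand-in, not the library's implementation: realize_sketch is a hypothetical name, and it assumes that any object exposing a realize(item) method is asked for its concrete value while every other value passes through unchanged.

```python
class AddExpression(object):
    """Deferred expression whose operands live on the item."""

    def realize(self, item):
        return item['x'] + item['y']


def realize_sketch(v, item=None):
    """Hypothetical stand-in for seesaw.config.realize (assumed behavior)."""
    if hasattr(v, 'realize'):
        # Deferred value: ask it to compute itself from the item.
        return v.realize(item)
    # Plain value: already concrete.
    return v


item = {'x': 2, 'y': 3}
total = realize_sketch(AddExpression(), item)
plain = realize_sketch('unchanged', item)
```

A plain dict stands in for the Item here, which works because an Item behaves like a mutable mapping.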
event Module
Actor model.
-
class seesaw.event.Event[source]
Bases: object
Lightweight event system.
Example:
my_event_system = Event()
my_event_system += my_listener_callback_function
my_event_system(my_event_data)
-
fire(*args, **kargs)[source]
-
getHandlerCount()[source]
-
handle(handler)[source]
-
unhandle(handler)[source]
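To make the Event usage concrete, here is a small self-contained sketch. EventSketch is a hypothetical stand-in that reuses the method names listed above; the assumption that += maps to handle() and that calling the object maps to fire() is inferred from the example, not confirmed against the library.

```python
class EventSketch(object):
    """Hypothetical stand-in mirroring the method names listed above."""

    def __init__(self):
        self.handlers = set()

    def handle(self, handler):
        self.handlers.add(handler)
        return self  # returning self is what lets += work

    def unhandle(self, handler):
        self.handlers.discard(handler)
        return self

    def fire(self, *args, **kargs):
        # Iterate over a copy so handlers may unregister themselves.
        for handler in list(self.handlers):
            handler(*args, **kargs)

    def getHandlerCount(self):
        return len(self.handlers)

    # Operator sugar assumed from the example: += registers, calling fires.
    __iadd__ = handle
    __call__ = fire


received = []


def my_listener_callback_function(data):
    received.append(data)


my_event_system = EventSketch()
my_event_system += my_listener_callback_function
my_event_system('item started')
count_before = my_event_system.getHandlerCount()
my_event_system.unhandle(my_listener_callback_function)
count_after = my_event_system.getHandlerCount()
```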
externalprocess Module
Running subprocesses asynchronously.
-
class seesaw.externalprocess.AsyncPopen(*args, **kwargs)[source]
Bases: object
Asynchronous version of subprocess.Popen.
Deprecated.
-
classmethod ignore_sigint()[source]
-
run()[source]
-
class seesaw.externalprocess.AsyncPopen2(*args, **kwargs)[source]
Bases: object
Adapter for the legacy AsyncPopen.
-
run()[source]
-
stdin[source]
-
class seesaw.externalprocess.CurlUpload(target, filename, connect_timeout='60', speed_limit='1', speed_time='900', max_tries=None)[source]
Bases: seesaw.externalprocess.ExternalProcess
Upload with Curl process runner.
-
class seesaw.externalprocess.ExternalProcess(name, args, max_tries=1, retry_delay=30, accept_on_exit_code=None, retry_on_exit_code=None, env=None)[source]
Bases: seesaw.task.Task
External subprocess runner.
-
enqueue(item)[source]
-
handle_process_error(exit_code, item)[source]
-
handle_process_result(exit_code, item)[source]
-
on_subprocess_end(item, returncode)[source]
-
on_subprocess_stdout(pipe, item, data)[source]
-
process(item)[source]
-
stdin_data(item)[source]
-
class seesaw.externalprocess.RsyncUpload(target, files, target_source_path='./', bwlimit='0', max_tries=None, extra_args=None)[source]
Bases: seesaw.externalprocess.ExternalProcess
Upload with Rsync process runner.
-
stdin_data(item)[source]
-
class seesaw.externalprocess.WgetDownload(args, max_tries=1, accept_on_exit_code=None, retry_on_exit_code=None, env=None, stdin_data_function=None)[source]
Bases: seesaw.externalprocess.ExternalProcess
Download with Wget process runner.
-
stdin_data(item)[source]
-
seesaw.externalprocess.cleanup()[source]
item Module
Managing work units.
-
class seesaw.item.Item(pipeline, item_id, item_number, keep_data=False, prepare_data_directory=True, **kwargs)[source]
Bases: seesaw.item.ItemData
A thing, or work unit, that needs to be downloaded.
It has properties that are filled by the Task.
An Item behaves like a mutable mapping.
Note
State belonging to an item should be stored on the actual item
itself. That is, do not store variables onto a Task unless
you know what you are doing.
-
class ItemState[source]
Bases: object
State of the item.
-
canceled = 'canceled'
-
completed = 'completed'
-
failed = 'failed'
-
running = 'running'
-
class Item.TaskStatus[source]
Bases: object
Status of what happened on a task.
-
completed = 'completed'
-
failed = 'failed'
-
running = 'running'
-
Item.cancel()[source]
-
Item.canceled[source]
-
Item.clear_data_directory()[source]
-
Item.complete()[source]
-
Item.completed[source]
-
Item.description()[source]
-
Item.end_time[source]
-
Item.fail()[source]
-
Item.failed[source]
-
Item.finished[source]
-
Item.item_id[source]
-
Item.item_number[source]
-
Item.item_state[source]
-
Item.log_error(task, *args)[source]
-
Item.log_output(data, full_line=True)[source]
-
Item.pipeline[source]
-
Item.prepare_data_directory()[source]
-
Item.set_task_status(task, status)[source]
-
Item.start_time[source]
-
Item.task_status[source]
-
class seesaw.item.ItemData(properties=None)[source]
Bases: _abcoll.MutableMapping
Base item data property container.
Args:
    properties (dict): Original dict
    on_property (Event): Fired whenever a property changes.
        Callback accepts:
        - self
        - key
        - new value
        - old value
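The callback signature above can be illustrated with a self-contained mapping. ItemDataSketch is a hypothetical stand-in, not the library's class: it uses a plain list of listeners in place of the Event object, imports MutableMapping from collections.abc (the Python 3 location), and assumes the callback fires only when the stored value actually changes.

```python
from collections.abc import MutableMapping


class ItemDataSketch(MutableMapping):
    """Hypothetical stand-in: a mapping that reports property changes."""

    def __init__(self, properties=None):
        self._properties = dict(properties or {})
        self.listeners = []  # plain list standing in for the Event object

    def __getitem__(self, key):
        return self._properties[key]

    def __setitem__(self, key, value):
        old_value = self._properties.get(key)
        self._properties[key] = value
        if old_value != value:
            # Callback receives: self, key, new value, old value.
            for callback in self.listeners:
                callback(self, key, value, old_value)

    def __delitem__(self, key):
        del self._properties[key]

    def __iter__(self):
        return iter(self._properties)

    def __len__(self):
        return len(self._properties)


changes = []
data = ItemDataSketch({'item_name': 'example'})
data.listeners.append(
    lambda item, key, new, old: changes.append((key, new, old))
)
data['bytes'] = 1024
data['bytes'] = 2048
```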
-
properties[source]
-
class seesaw.item.ItemInterpolation(s)[source]
Bases: object
Formats a string using the percent operator during realize().
-
realize(item)[source]
-
class seesaw.item.ItemValue(key)[source]
Bases: object
Get an item’s value during realize().
-
fill(item, value)[source]
-
realize(item)[source]
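A short sketch of the two deferred-value helpers, using hypothetical stand-in classes rather than the library's own. It assumes that the interpolation applies the percent operator with the item as the mapping, and that the value lookup is a plain key access; the URL and keys are invented for illustration.

```python
class InterpolationSketch(object):
    """Hypothetical stand-in for a percent-formatted deferred string."""

    def __init__(self, s):
        self.s = s

    def realize(self, item):
        # %(key)s placeholders are looked up in the item mapping.
        return self.s % item


class ValueSketch(object):
    """Hypothetical stand-in for a deferred single-key lookup."""

    def __init__(self, key):
        self.key = key

    def realize(self, item):
        return item[self.key]


# A plain dict stands in for the Item.
item = {'item_name': 'user-42', 'data_dir': '/tmp/data'}
url = InterpolationSketch('http://example.com/%(item_name)s').realize(item)
data_dir = ValueSketch('data_dir').realize(item)
```

Deferring the lookup this way lets task arguments be written once, when the pipeline is declared, and resolved separately for each item.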
pipeline Module
-
class seesaw.pipeline.Pipeline(*tasks)[source]
Bases: object
The sequence of steps that complete a Task.
Your pipeline will probably be something like this:
- Request an assignment from the tracker.
- Run Wget to download the file.
- Upload the downloaded file with rsync.
- Tell the tracker that the assignment is done.
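The four steps above can be sketched as a sequence of callables applied to one item. This is a simplified, synchronous stand-in and not how seesaw schedules tasks: PipelineSketch, its run() method, and the four step functions are hypothetical names, and no tracker, Wget, or rsync is actually contacted.

```python
def request_assignment(item):
    # Stand-in for asking the tracker for a work unit.
    item['item_name'] = 'example-item'


def download(item):
    # Stand-in for running Wget.
    item['downloaded_file'] = item['item_name'] + '.warc.gz'


def upload(item):
    # Stand-in for the rsync upload.
    item['uploaded'] = True


def report_done(item):
    # Stand-in for telling the tracker the assignment is done.
    item['done'] = True


class PipelineSketch(object):
    """Hypothetical stand-in: runs tasks in order on a single item."""

    def __init__(self, *tasks):
        self.tasks = list(tasks)

    def add_task(self, task):
        self.tasks.append(task)

    def run(self, item):  # hypothetical; the listed API is enqueue(item)
        for task in self.tasks:
            task(item)
        return item


pipeline = PipelineSketch(request_assignment, download, upload)
pipeline.add_task(report_done)
result = pipeline.run({})
```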
-
add_task(task)[source]
-
cancel_items()[source]
-
enqueue(item)[source]
-
ui_task_list()[source]
project Module
Project information.
-
class seesaw.project.Project(title=None, project_html=None, utc_deadline=None)[source]
Bases: object
Briefly describes a project's metadata.
This class defines the title of the project, a short description with an
optional project logo and an optional deadline. The information will be
shown in the web interface when the project is running.
-
data_for_json()[source]
runner Module
Pipeline execution.
-
class seesaw.runner.Runner(stop_file=None, concurrent_items=1, max_items=None, keep_data=False)[source]
Bases: object
Executes and manages the lifetime of Pipeline instances.
-
add_items()[source]
-
check_stop_file()[source]
-
is_active()[source]
-
keep_running()[source]
-
set_current_pipeline(pipeline)[source]
-
should_stop()[source]
-
start()[source]
-
stop_file_changed()[source]
-
stop_file_mtime()[source]
-
stop_gracefully()[source]
-
class seesaw.runner.SimpleRunner(pipeline, stop_file=None, concurrent_items=1, max_items=None, keep_data=False)[source]
Bases: seesaw.runner.Runner
Executes a single Pipeline instance.
-
forced_stop()[source]
-
start()[source]
tracker Module
Contacting the work unit server.
A Tracker refers to the Universal Tracker
(https://github.com/ArchiveTeam/universal-tracker).
-
class seesaw.tracker.GetItemFromTracker(tracker_url, downloader, version=None)[source]
Bases: seesaw.tracker.TrackerRequest
Get the information for a single work unit from the Tracker.
-
data(item)[source]
-
process_body(body, item)[source]
-
class seesaw.tracker.PrepareStatsForTracker(defaults=None, file_groups=None, id_function=None)[source]
Bases: seesaw.task.SimpleTask
Apply statistical values on the item.
-
process(item)[source]
-
class seesaw.tracker.SendDoneToTracker(tracker_url, stats)[source]
Bases: seesaw.tracker.TrackerRequest
Inform the Tracker that the work unit has been completed.
-
data(item)[source]
-
process_body(body, item)[source]
-
class seesaw.tracker.TrackerRequest(name, tracker_url, tracker_command, may_be_canceled=False)[source]
Bases: seesaw.task.Task
Represents a request to a Tracker.
-
DEFAULT_RETRY_DELAY = 60
-
data(item)[source]
-
enqueue(item)[source]
-
handle_response(item, response)[source]
-
increment_retry_delay(max_delay=300)[source]
-
process_body(body, item)[source]
-
reset_retry_delay()[source]
-
schedule_retry(item, message='')[source]
-
send_request(item)[source]
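The retry-delay members suggest a capped backoff. The sketch below shows one way such a scheme could behave; RetryDelaySketch and GROWTH_FACTOR are hypothetical, and the doubling is an assumption chosen for illustration, since the listing does not state how much the delay grows. Only DEFAULT_RETRY_DELAY = 60 and max_delay=300 come from the signatures above.

```python
class RetryDelaySketch(object):
    """Hypothetical stand-in for a capped, resettable retry delay."""

    DEFAULT_RETRY_DELAY = 60  # seconds, as listed above
    GROWTH_FACTOR = 2         # assumption for illustration only

    def __init__(self):
        self.retry_delay = self.DEFAULT_RETRY_DELAY

    def increment_retry_delay(self, max_delay=300):
        # Grow the delay after a failure, never beyond max_delay.
        self.retry_delay = min(
            self.retry_delay * self.GROWTH_FACTOR, max_delay
        )

    def reset_retry_delay(self):
        # Called after a success so the next failure starts small again.
        self.retry_delay = self.DEFAULT_RETRY_DELAY


request = RetryDelaySketch()
delays = [request.retry_delay]
for _ in range(4):
    request.increment_retry_delay()
    delays.append(request.retry_delay)

request.reset_retry_delay()
delay_after_reset = request.retry_delay
```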
-
class seesaw.tracker.UploadWithTracker(tracker_url, downloader, files, version=None, rsync_target_source_path='./', rsync_bwlimit='0', rsync_extra_args=[], curl_connect_timeout='60', curl_speed_limit='1', curl_speed_time='900')[source]
Bases: seesaw.tracker.TrackerRequest
Upload work unit results.
One of the inner tasks (rsync or curl, matching the rsync_* and curl_*
arguments) is used, depending on where the Tracker’s response says to upload.
-
data(item)[source]
-
process_body(body, item)[source]
util Module
Miscellaneous functions.
-
seesaw.util.find_executable(name, version, paths, version_arg='-V')[source]
Returns the path of a matching executable.
-
seesaw.util.test_executable(name, version, path, version_arg='-V')[source]
Try to run an executable and check its version.
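A version check of this kind can be sketched with the standard library. The functions below are hypothetical stand-ins, not the seesaw implementations: they assume the check means running the executable with version_arg and looking for the expected version string in its output, and they drop the name argument for brevity.

```python
import subprocess
import sys


def check_executable_sketch(path, version, version_arg='-V'):
    """Hypothetical check: run `path version_arg`, look for `version`."""
    try:
        output = subprocess.check_output(
            [path, version_arg], stderr=subprocess.STDOUT
        ).decode('utf-8', 'replace')
    except (OSError, subprocess.CalledProcessError):
        # Missing file, not executable, or non-zero exit status.
        return False
    return version in output


def find_executable_sketch(version, paths, version_arg='-V'):
    """Return the first candidate path that passes the check, else None."""
    for path in paths:
        if check_executable_sketch(path, version, version_arg):
            return path
    return None


# The running interpreter is used as a convenient executable to probe;
# the first candidate path is deliberately invalid.
found = find_executable_sketch(
    'Python', ['/nonexistent/python', sys.executable]
)
```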
-
seesaw.util.unique_id_str()[source]
Returns a unique string suitable for IDs.
warrior Module
The warrior server.
The warrior phones home to Warrior HQ
(https://github.com/ArchiveTeam/warrior-hq).
-
class seesaw.warrior.BandwidthMonitor(device)[source]
Bases: object
Extracts the bandwidth usage from the system stats.
-
current_stats()[source]
-
devre = <_sre.SRE_Pattern object>
-
update()[source]
-
class seesaw.warrior.ConfigManager(config_file)[source]
Bases: object
Manages the configuration.
-
add(config_value)[source]
-
all_valid()[source]
-
editable_values()[source]
-
load()[source]
-
remove(name)[source]
-
save()[source]
-
set_value(name, value)[source]
-
class seesaw.warrior.Warrior(projects_dir, data_dir, warrior_hq_url, real_shutdown=False, keep_data=False)[source]
Bases: object
The warrior god object.
-
class Status[source]
Bases: object
-
INVALID_SETTINGS = 'INVALID_SETTINGS'
-
NO_PROJECT = 'NO_PROJECT'
-
REBOOTING = 'REBOOTING'
-
RESTARTING_PROJECT = 'RESTARTING_PROJECT'
-
RUNNING_PROJECT = 'RUNNING_PROJECT'
-
SHUTTING_DOWN = 'SHUTTING_DOWN'
-
STARTING_PROJECT = 'STARTING_PROJECT'
-
STOPPING_PROJECT = 'STOPPING_PROJECT'
-
SWITCHING_PROJECT = 'SWITCHING_PROJECT'
-
UNINITIALIZED = 'UNINITIALIZED'
-
Warrior.bandwidth_stats()[source]
-
Warrior.check_project_has_update(*args, **kwargs)[source]
-
Warrior.clone_project(project_name, project_path)[source]
-
Warrior.collect_install_output(data)[source]
-
Warrior.find_lat_lng()[source]
-
Warrior.fire_status()[source]
-
Warrior.forced_reboot()[source]
-
Warrior.forced_stop()[source]
-
Warrior.handle_lat_lng(response)[source]
-
Warrior.handle_runner_finish(runner)[source]
-
Warrior.install_project(*args, **kwargs)[source]
-
Warrior.keep_running()[source]
-
Warrior.load_pipeline(pipeline_path, context)[source]
-
Warrior.max_age_reached()[source]
-
Warrior.reboot_gracefully()[source]
-
Warrior.schedule_forced_reboot()[source]
-
Warrior.select_project(*args, **kwargs)[source]
-
Warrior.start()[source]
-
Warrior.start_selected_project(*args, **kwargs)[source]
-
Warrior.stop_gracefully()[source]
-
Warrior.update_project(*args, **kwargs)[source]
-
Warrior.update_warrior_hq(*args, **kwargs)[source]
-
Warrior.warrior_status()[source]
web Module
The warrior web interface.
-
class seesaw.web.ApiHandler(application, request, **kwargs)[source]
Bases: seesaw.web_util.BaseWebAdminHandler
Processes API requests.
-
get(command)[source]
-
get_template_path()[source]
-
initialize(warrior=None, runner=None)[source]
-
post(command)[source]
-
class seesaw.web.IndexHandler(application, request, **kwargs)[source]
Bases: seesaw.web_util.BaseWebAdminHandler
Shows the index.html.
-
get()[source]
-
class seesaw.web.ItemMonitor(item)[source]
Bases: object
Pushes item states and information to the client.
-
handle_item_cancel(item)[source]
-
handle_item_complete(item)[source]
-
handle_item_fail(item)[source]
-
handle_item_output(item, data)[source]
-
handle_item_property(item, key, new_value, old_value)[source]
-
handle_item_task_status(item, task, new_status, old_status)[source]
-
item_for_broadcast()[source]
-
item_status()[source]
-
class seesaw.web.SeesawConnection(session)[source]
Bases: sockjs.tornado.conn.SockJSConnection
A WebSocket server that communicates the state of the warrior.
-
classmethod broadcast(event, message)[source]
-
classmethod broadcast_bandwidth()[source]
-
classmethod broadcast_project_refresh()[source]
-
classmethod broadcast_projects()[source]
-
classmethod broadcast_timestamp()[source]
-
clients = set([])
-
emit(event_name, message)[source]
tornadoio to sockjs adapter.
-
classmethod handle_broadcast_message(warrior, message)[source]
-
classmethod handle_finish_item(runner, pipeline, item)[source]
-
classmethod handle_project_installation_failed(warrior, project, output)[source]
-
classmethod handle_project_installed(warrior, project, output)[source]
-
classmethod handle_project_installing(warrior, project)[source]
-
classmethod handle_project_refresh(warrior, project, runner)[source]
-
classmethod handle_project_selected(warrior, project)[source]
-
classmethod handle_projects_loaded(warrior, projects)[source]
-
classmethod handle_runner_status(runner, status)[source]
-
classmethod handle_start_item(runner, pipeline, item)[source]
-
classmethod handle_warrior_status(warrior, new_status)[source]
-
instance_id = '1400-0.626762'
-
item_monitors = {}
-
on_close()[source]
-
on_message(message)[source]
-
on_open(info)[source]
-
project = None
-
runner = None
-
warrior = None
-
seesaw.web.hash_string(text)[source]
Generate a digest for broadcast message.
-
seesaw.web.start_runner_server(project, runner, bind_address='localhost', port_number=8001, http_username=None, http_password=None)[source]
Starts a web interface for a manually run pipeline.
Unlike start_warrior_server(), this UI does not contain a
configuration or project management panel.
-
seesaw.web.start_warrior_server(warrior, bind_address='localhost', port_number=8001, http_username=None, http_password=None)[source]
Starts the warrior web interface.
web_util Module
-
class seesaw.web_util.BaseWebAdminHandler(application, request, **kwargs)[source]
Bases: tornado.web.RequestHandler
-
prepare()[source]