servicex client internals

servicex.expandable_progress module

class servicex.expandable_progress.ExpandableProgress(display_progress: bool = True, provided_progress: Progress | ExpandableProgress | None = None, overall_progress: bool = False)

Bases: object

We want to be able to use rich progress bars in the async code, but there are some situations where the user doesn’t want them. We might also be running several progress bars simultaneously, and we want to be able to control that.

We still want to keep the context manager interface, so this class implements the context manager protocol; if display_progress is False, it does nothing. If provided_progress is set, that instance is used. Otherwise a new progress bar is created.

Parameters:
  • display_progress

  • provided_progress
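
The three cases described above (disabled, reuse a provided tracker, create a new one) can be sketched with a plain context manager. This is a minimal illustration and not the class's actual implementation: OptionalProgress and its dictionary-based tracker are hypothetical stand-ins, where the real class works with rich Progress objects.

```python
class OptionalProgress:
    """Hypothetical stand-in: a context manager that reports progress only when enabled."""

    def __init__(self, display_progress=True, provided_progress=None):
        self.display_progress = display_progress
        self.provided_progress = provided_progress
        self.owns_progress = False
        self.progress = None

    def __enter__(self):
        if not self.display_progress:
            return self  # disabled: advance() below becomes a no-op
        if self.provided_progress is not None:
            self.progress = self.provided_progress  # reuse the caller's tracker
        else:
            self.progress = {}  # stand-in for a newly created progress bar
            self.owns_progress = True
        return self

    def __exit__(self, exc_type, exc, tb):
        if self.owns_progress:
            self.progress = None  # only tear down what this instance created
        return False  # never swallow exceptions

    def advance(self, task):
        if self.progress is not None:
            self.progress[task] = self.progress.get(task, 0) + 1
```

Because the disabled case still returns a usable object, calling code can keep a single `with` block regardless of whether progress is shown.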

add_task(param, start, total)
advance(task_id, task_type)
refresh()
start_task(task_id, task_type)
update(task_id, task_type, total=None, completed=None, **fields)

class servicex.expandable_progress.ProgressCounts(description: str, task_id: TaskID, start: int | None = None, total: int | None = None, completed: int | None = None)

Bases: object

class servicex.expandable_progress.TranformStatusProgress(*columns: str | ProgressColumn, console: Console | None = None, auto_refresh: bool = True, refresh_per_second: float = 10, speed_estimate_period: float = 30.0, transient: bool = False, redirect_stdout: bool = True, redirect_stderr: bool = True, get_time: Callable[[], float] | None = None, disable: bool = False, expand: bool = False)

Bases: Progress

get_renderables()

Get a number of renderables for the progress display.

servicex.minio_adapter module

class servicex.minio_adapter.MinioAdapter(endpoint_host: str, secure: bool, access_key: str, secret_key: str, bucket: str)

Bases: object

MAX_PATH_LEN = 60
async download_file(object_name: str, local_dir: str, shorten_filename: bool = False, expected_size: int | None = None) -> Path
classmethod for_transform(transform: TransformStatus)
async get_signed_url(object_name: str) -> str
classmethod hash_path(file_name)

Make the path safe for an object store or POSIX by keeping its length less than MAX_PATH_LEN. The leading (less interesting) characters are replaced with a forty-character hash.

Parameters:
  • file_name – Input filename

Returns:

Safe path string
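
A rough sketch of this kind of shortening is shown below. The helper name shorten_path is hypothetical, and SHA-1 is an assumption chosen only because its hex digest happens to be forty characters long; the hash used by hash_path itself is not stated here. This sketch caps the result at MAX_PATH_LEN characters.

```python
import hashlib

MAX_PATH_LEN = 60  # same value as MinioAdapter.MAX_PATH_LEN


def shorten_path(file_name: str) -> str:
    """Illustrative only: replace the leading characters with a 40-character digest."""
    if len(file_name) <= MAX_PATH_LEN:
        return file_name  # already short enough, leave untouched
    # Assumption: SHA-1, whose hex digest is exactly 40 characters.
    digest = hashlib.sha1(file_name.encode("utf-8")).hexdigest()
    keep = MAX_PATH_LEN - len(digest)  # number of trailing characters preserved
    return digest + file_name[-keep:]
```

Keeping the tail of the name preserves the file extension and the most specific part of the path, while the digest keeps distinct long names from colliding.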

async list_bucket() -> List[ResultFile]

servicex.minio_adapter.init_s3_config(concurrency: int = 10)

Update the number of concurrent connections

servicex.query module

servicex.query_cache module

exception servicex.query_cache.CacheException

Bases: Exception

class servicex.query_cache.QueryCache(config: Configuration)

Bases: object

cache_path_for_transform(transform_status: TransformStatus) -> Path
cache_submitted_transform(transform: TransformRequest, request_id: str) -> None

Cache a transform that has been submitted but not completed.

cache_transform(record: TransformedResults)
cached_queries() -> List[TransformedResults]
close()
contains_hash(hash: str) -> bool

Check if the cache has completed records for a hash

delete_record_by_hash(hash: str)
delete_record_by_request_id(request_id: str)
get_transform_by_hash(hash: str) -> TransformedResults | None

Returns completed transformations by hash

get_transform_by_request_id(request_id: str) -> TransformedResults | None

Returns completed transformed results using a request id

get_transform_request_id(hash_value: str) -> str | None

Return the request id of cached record

is_transform_request_submitted(hash_value: str) -> bool

Returns True if the request has been submitted. Returns False if the request is not in the cache at all or has not been submitted.

queries_in_state(state: str) -> List[dict]

Return all transform records in a given state.

transformed_results(transform: TransformRequest, completed_status: TransformStatus, data_dir: str, file_list: List[str], signed_urls) -> TransformedResults
update_record(record: TransformedResults)
update_transform_request_id(hash_value: str, request_id: str) -> None

Update the cached record request id

update_transform_status(hash_value: str, status: str) -> None

Update the cached record status
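
The distinction the methods above draw between a submitted record and a completed one can be illustrated with a toy in-memory cache keyed by query hash. Everything here is hypothetical: ToyQueryCache, its method names, and the status strings "SUBMITTED" and "COMPLETE" are assumptions made for the sketch, and the real QueryCache is backed by a cache database rather than a dictionary.

```python
from typing import Dict, Optional


class ToyQueryCache:
    """Hypothetical in-memory stand-in keyed by query hash; not the real QueryCache."""

    def __init__(self) -> None:
        self._records: Dict[str, dict] = {}

    def cache_submitted(self, hash_value: str, request_id: str) -> None:
        # Record a transform that has been sent to the server but is not finished.
        self._records[hash_value] = {"request_id": request_id, "status": "SUBMITTED"}

    def update_status(self, hash_value: str, status: str) -> None:
        self._records[hash_value]["status"] = status

    def is_submitted(self, hash_value: str) -> bool:
        # False both for unknown hashes and for records in any other state.
        record = self._records.get(hash_value)
        return record is not None and record["status"] == "SUBMITTED"

    def contains_completed(self, hash_value: str) -> bool:
        record = self._records.get(hash_value)
        return record is not None and record["status"] == "COMPLETE"

    def request_id(self, hash_value: str) -> Optional[str]:
        record = self._records.get(hash_value)
        return record["request_id"] if record else None
```

Keying on the query hash is what lets a resubmitted, identical query find its earlier request ID instead of starting a new transform.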

servicex.servicex_adapter module

exception servicex.servicex_adapter.AuthorizationError

Bases: Exception

class servicex.servicex_adapter.ServiceXAdapter(url: str, refresh_token: str | None = None)

Bases: object

async cancel_transform(transform_id=None)
async delete_dataset(dataset_id=None) -> bool
async delete_transform(transform_id=None)
get_code_generators() -> dict[str, str]
async get_code_generators_async() -> dict[str, str]
async get_dataset(dataset_id=None) -> CachedDataset
async get_datasets(did_finder=None, show_deleted=False) -> List[CachedDataset]
async get_servicex_capabilities() -> List[str]
async get_servicex_info() -> ServiceXInfo
async get_servicex_sample_title_limit() -> int | None
async get_transform_status(request_id: str) -> TransformStatus
async get_transformation_results(request_id: str, later_than: datetime | None = None)
async get_transforms() -> List[TransformStatus]
async submit_transform(transform_request: TransformRequest) -> str

class servicex.servicex_adapter.ServiceXFile(created_at: datetime.datetime, filename: str, total_bytes: int)

Bases: object

created_at: datetime
filename: str
total_bytes: int

servicex.servicex_client module

class servicex.servicex_client.GuardList(data: Sequence | Exception)

Bases: Sequence

valid() -> bool

enum servicex.servicex_client.ProgressBarFormat(value)

Bases: str, Enum

Specify the way progress bars are displayed.

Member Type:

str

Valid values are as follows:

expanded = <ProgressBarFormat.expanded: 'expanded'>
compact = <ProgressBarFormat.compact: 'compact'>
none = <ProgressBarFormat.none: 'none'>
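
Because the enum derives from both str and Enum, members compare equal to their string values and can be looked up from plain text. The sketch below uses a stand-in class, BarFormat, with the same bases and members, so that it runs without the library installed; parse_bar_format is a hypothetical helper.

```python
from enum import Enum


class BarFormat(str, Enum):
    """Stand-in with the same bases and member values as ProgressBarFormat."""

    expanded = "expanded"
    compact = "compact"
    none = "none"


def parse_bar_format(text: str) -> BarFormat:
    # Lookup by value; raises ValueError for anything not listed above.
    return BarFormat(text)
```

This is convenient when the choice comes from a configuration file or command-line option, since the raw string can be converted in one step and invalid values fail early.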

exception servicex.servicex_client.ReturnValueException(exc)

Bases: Exception

An exception occurred at some point while obtaining this result from ServiceX
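
One way to picture how GuardList and ReturnValueException fit together is a sequence that holds either the data or the exception that replaced it. The sketch below is an assumption about the pattern, not the library's code: GuardedSequence and ResultError are hypothetical names, and raising on access to a failed result is assumed rather than documented above.

```python
from collections.abc import Sequence


class ResultError(Exception):
    """Hypothetical counterpart of ReturnValueException."""


class GuardedSequence(Sequence):
    """Hypothetical stand-in for GuardList: wraps data or the exception that replaced it."""

    def __init__(self, data):
        self._data = data

    def valid(self) -> bool:
        # True when real data is held, False when an exception is held instead.
        return not isinstance(self._data, Exception)

    def _checked(self):
        # Assumption: touching a failed result raises instead of returning garbage.
        if not self.valid():
            raise ResultError(self._data)
        return self._data

    def __getitem__(self, index):
        return self._checked()[index]

    def __len__(self):
        return len(self._checked())
```

With this shape, callers can test valid() first and skip failed samples, or simply index into the result and let the wrapped error surface.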

class servicex.servicex_client.ServiceXClient(backend=None, url=None, config_path=None, cache_dir: str | None = None)

Bases: object

Connection to a ServiceX deployment. Instances of this class can retrieve deployment data from the service and also interact with previously run transformations. Instances of this class are factories for Datasets.

If both backend and url are unspecified then it will attempt to pick up the default backend from .servicex

Parameters:
  • backend – Name of a deployment from the .servicex file

  • url – Direct URL of a ServiceX deployment, used instead of the .servicex file. This works only with hosts that do not require authentication, or when the token is found in a file pointed to by the environment variable BEARER_TOKEN_FILE

  • config_path – Optional path to the .servicex file. If not specified, the local directory and then its enclosing directories are searched

  • cache_dir – Optional path to override the cache directory for downloads and the cache database. If not specified, uses the value from the configuration file or the default path.

cancel_transform(transform_id) -> None

Cancel a Transform by its request ID

delete_dataset(dataset_id) -> bool

Delete a dataset by its ID

Returns:

Boolean showing whether the dataset has been deleted

delete_transform(transform_id) -> None

Delete a Transform by its request ID

delete_transform_from_cache(transform_id: str)
generic_query(dataset_identifier: DataSetIdentifier | FileListDataset, query: str | QueryStringGenerator, codegen: str | None = None, title: str = 'ServiceX Client', result_format: ResultFormat = ResultFormat.parquet, ignore_cache: bool = False, fail_if_incomplete: bool = True) -> Query

Generate a Query object for a generic codegen specification

Parameters:
  • dataset_identifier – The dataset identifier or filelist to be the source of files

  • title – Title to be applied to the transform. This is also useful for relating transform results.

  • codegen – Name of the code generator to use with this transform

  • result_format – Do you want Parquet or ROOT? This can be set later with the set_result_format method

  • ignore_cache – Ignore the query cache and always run the query

Returns:

A Query object

get_code_generators() -> dict[str, str]

Retrieve the code generators deployed with the ServiceX instance.

Returns the cached result if already fetched, otherwise performs a network request via ServiceXAdapter.get_code_generators().
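
The fetch-once behaviour described here is a common memoization pattern. The following sketch shows the general idea under stated assumptions: CachedLookup and its fetch callable are hypothetical, and the client's own caching may be organized differently.

```python
from typing import Callable, Dict, Optional


class CachedLookup:
    """Hypothetical helper: call `fetch` once and reuse its result afterwards."""

    def __init__(self, fetch: Callable[[], Dict[str, str]]) -> None:
        self._fetch = fetch
        self._cached: Optional[Dict[str, str]] = None

    def get(self) -> Dict[str, str]:
        if self._cached is None:
            # The only place a network request would be made.
            self._cached = self._fetch()
        return self._cached
```

Repeated calls to get() return the same dictionary object, so a slow or rate-limited request is paid for only once per instance.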

get_dataset(dataset_id) -> CachedDataset

Retrieve a dataset by its ID

Returns:

A CachedDataset object

get_datasets(did_finder=None, show_deleted=False) -> List[CachedDataset]

Retrieve all datasets you have run on the server

Returns:

List of CachedDataset objects

get_transform_status(transform_id) -> TransformStatus

Get the status of a given transform

Parameters:
  • transform_id – The UUID of the transform

Returns:

The current status for the transform

async get_transform_status_async(transform_id) -> TransformStatus

Get the status of a given transform

Parameters:
  • transform_id – The UUID of the transform

Returns:

The current status for the transform

get_transforms() -> List[TransformStatus]

Retrieve all transforms you have run on the server

Returns:

List of TransformStatus objects

async get_transforms_async() -> List[TransformStatus]

Retrieve all transforms you have run on the server

Returns:

List of TransformStatus objects

servicex.servicex_client.deliver(spec: ServiceXSpec | Mapping[str, Any] | str | Path, config_path: str | None = None, servicex_name: str | None = None, return_exceptions: bool = True, fail_if_incomplete: bool = True, ignore_local_cache: bool = False, progress_bar: ProgressBarFormat = ProgressBarFormat.expanded, concurrency: int = 10, cache_dir: str | None = None)

Execute a ServiceX query.

Parameters:
  • spec – The specification of the ServiceX query, either in a dictionary or a ServiceXSpec object.

  • config_path – The filesystem path to search for the servicex.yaml or .servicex file.

  • servicex_name – The name of the ServiceX instance, as specified in the configuration YAML file (None will give the default backend).

  • return_exceptions – If something goes wrong, bubble up the underlying exception for debugging (as opposed to just having a generic error).

  • fail_if_incomplete – If True: if not all input files are transformed, the transformation will be marked as a failure and no outputs will be available. If False, a partial file list will be returned.

  • ignore_local_cache – If True, ignore the local query cache and always run the query on the remote ServiceX instance.

  • progress_bar – specify the kind of progress bar to show. ProgressBarFormat.expanded (the default) means every Sample will have its own progress bars; ProgressBarFormat.compact gives one summary progress bar for all transformations; ProgressBarFormat.none switches off progress bars completely.

  • concurrency – specify how many downloads to run in parallel (default is 10).

  • cache_dir – if set, will override the target directory for downloads and the cache database.

Returns:

A dictionary mapping the name of each Sample to a GuardList with the file names or URLs for the outputs.

async servicex.servicex_client.deliver_async(spec: ServiceXSpec | Mapping[str, Any] | str | Path, config_path: str | None = None, servicex_name: str | None = None, return_exceptions: bool = True, fail_if_incomplete: bool = True, ignore_local_cache: bool = False, progress_bar: ProgressBarFormat = ProgressBarFormat.expanded, concurrency: int = 10, cache_dir: str | None = None)

Execute a ServiceX query.

Parameters:
  • spec – The specification of the ServiceX query, either in a dictionary or a ServiceXSpec object.

  • config_path – The filesystem path to search for the servicex.yaml or .servicex file.

  • servicex_name – The name of the ServiceX instance, as specified in the configuration YAML file (None will give the default backend).

  • return_exceptions – If something goes wrong, bubble up the underlying exception for debugging (as opposed to just having a generic error).

  • fail_if_incomplete – If True: if not all input files are transformed, the transformation will be marked as a failure and no outputs will be available. If False, a partial file list will be returned.

  • ignore_local_cache – If True, ignore the local query cache and always run the query on the remote ServiceX instance.

  • progress_bar – specify the kind of progress bar to show. ProgressBarFormat.expanded (the default) means every Sample will have its own progress bars; ProgressBarFormat.compact gives one summary progress bar for all transformations; ProgressBarFormat.none switches off progress bars completely.

  • concurrency – specify how many downloads to run in parallel (default is 10).

  • cache_dir – if set, will override the target directory for downloads and the cache database.

Returns:

A dictionary mapping the name of each Sample to a GuardList with the file names or URLs for the outputs.
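
To give a feel for what a concurrency limit on parallel downloads means in async code, the sketch below bounds a set of simulated downloads with asyncio.Semaphore. It is an illustration only: fetch_all and the sleep standing in for network I/O are hypothetical, and this is not necessarily how deliver_async enforces its concurrency setting.

```python
import asyncio
from typing import List, Tuple


async def fetch_all(names: List[str], concurrency: int = 10) -> Tuple[List[str], int]:
    """Run one simulated download per name with at most `concurrency` in flight.

    Returns the results (in input order) and the peak number of simultaneous downloads.
    """
    limiter = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def fetch_one(name: str) -> str:
        nonlocal in_flight, peak
        async with limiter:  # waits here once `concurrency` downloads are running
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # placeholder for real network I/O
            in_flight -= 1
            return f"downloaded:{name}"

    results = await asyncio.gather(*(fetch_one(n) for n in names))
    return list(results), peak
```

From synchronous code the coroutine can be driven with `asyncio.run(fetch_all(names, concurrency=3))`; inside an already running event loop it would be awaited directly.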

servicex.types module