servicex client internals¶
servicex.expandable_progress module¶
- class servicex.expandable_progress.ExpandableProgress(display_progress: bool = True, provided_progress: Progress | ExpandableProgress | None = None, overall_progress: bool = False)[source]¶
Bases:
object
We want to be able to use rich progress bars in the async code, but there are some situations where the user doesn’t want them. We might also be running several simultaneous progress bars, and we want to be able to control that.
We still want to keep the context manager interface, so this class implements the context manager, but if display_progress is False it does nothing. If provided_progress is set, we just use that. Otherwise we create a new progress bar.
- Parameters:
display_progress
provided_progress
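The three cases described above (display disabled, an existing bar supplied, or a new bar created) can be sketched as follows. This is a minimal illustration under stated assumptions, not the ExpandableProgress implementation: `FakeProgress` and `OptionalProgress` are hypothetical names, `FakeProgress` stands in for a rich progress bar so the sketch has no third-party dependencies, and the choice to start and stop only a bar the wrapper created itself is an assumption.

```python
from typing import Optional


class FakeProgress:
    """Hypothetical stand-in for a rich progress bar (assumption for this sketch)."""

    def __init__(self) -> None:
        self.started = False

    def start(self) -> None:
        self.started = True

    def stop(self) -> None:
        self.started = False


class OptionalProgress:
    """Hypothetical context manager that may or may not drive a progress bar."""

    def __init__(
        self,
        display_progress: bool = True,
        provided_progress: Optional[FakeProgress] = None,
    ) -> None:
        # Assumption: we only start/stop a bar that we created ourselves.
        self.owns_progress = display_progress and provided_progress is None
        if not display_progress:
            self.progress = None  # do nothing at all
        elif provided_progress is not None:
            self.progress = provided_progress  # reuse the caller's bar
        else:
            self.progress = FakeProgress()  # create a new bar

    def __enter__(self) -> "OptionalProgress":
        if self.owns_progress:
            self.progress.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if self.owns_progress:
            self.progress.stop()
        return False  # never swallow exceptions
```

The caller writes the same `with` statement in every case, which is the point of keeping the context manager interface.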
- class servicex.expandable_progress.ProgressCounts(description: str, task_id: TaskID, start: int | None = None, total: int | None = None, completed: int | None = None)[source]¶
Bases:
object
- class servicex.expandable_progress.TranformStatusProgress(*columns: str | ProgressColumn, console: Console | None = None, auto_refresh: bool = True, refresh_per_second: float = 10, speed_estimate_period: float = 30.0, transient: bool = False, redirect_stdout: bool = True, redirect_stderr: bool = True, get_time: Callable[[], float] | None = None, disable: bool = False, expand: bool = False)[source]¶
Bases:
Progress
servicex.minio_adapter module¶
- class servicex.minio_adapter.MinioAdapter(endpoint_host: str, secure: bool, access_key: str, secret_key: str, bucket: str)[source]¶
Bases:
object
- MAX_PATH_LEN = 60¶
- async download_file(object_name: str, local_dir: str, shorten_filename: bool = False, expected_size: int | None = None) Path[source]¶
- classmethod for_transform(transform: TransformStatus)[source]¶
- classmethod hash_path(file_name)[source]¶
Make the path safe for object store or POSIX by keeping the length less than MAX_PATH_LEN. Replace the leading (less interesting) characters with a forty-character hash.
- Parameters:
file_name – Input filename
- Returns:
Safe path string
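The shortening idea can be sketched as below. This is an illustration only, not the MinioAdapter code: `shorten_path` is a hypothetical helper, and SHA-1 is chosen here solely because its hexadecimal digest happens to be forty characters long. Which hash function the library uses is not stated in this reference.

```python
import hashlib

MAX_PATH_LEN = 60  # mirrors the documented MinioAdapter.MAX_PATH_LEN


def shorten_path(file_name: str, max_len: int = MAX_PATH_LEN) -> str:
    """Hypothetical helper: keep the result strictly shorter than max_len."""
    if len(file_name) < max_len:
        return file_name  # already short enough, leave untouched

    # Assumption: SHA-1, because its hex digest is exactly 40 characters.
    digest = hashlib.sha1(file_name.encode("utf-8")).hexdigest()
    keep = max_len - 1 - len(digest)  # trailing characters we can afford
    if keep <= 0:
        raise ValueError("max_len is too small to hold the hash and a suffix")

    # Replace the leading characters with the hash, keep the distinctive tail.
    return digest + file_name[-keep:]
```

Keeping the tail rather than the head preserves the file extension and the part of the name that usually differs between files.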
- async list_bucket() List[ResultFile][source]¶
servicex.query module¶
servicex.query_cache module¶
- class servicex.query_cache.QueryCache(config: Configuration)[source]¶
Bases:
object
- cache_path_for_transform(transform_status: TransformStatus) Path[source]¶
- cache_submitted_transform(transform: TransformRequest, request_id: str) None[source]¶
Cache a transform that has been submitted but not completed.
- cache_transform(record: TransformedResults)[source]¶
- cached_queries() List[TransformedResults][source]¶
- get_transform_by_hash(hash: str) TransformedResults | None[source]¶
Return the completed transform matching the given hash, or None if there is none.
- get_transform_by_request_id(request_id: str) TransformedResults | None[source]¶
Return the completed transform results matching the given request id, or None if there are none.
- get_transform_request_id(hash_value: str) str | None[source]¶
Return the request id of the cached record.
- is_transform_request_submitted(hash_value: str) bool[source]¶
Return True if the request has been submitted. Return False if the request is not in the cache at all or has not been submitted.
- transformed_results(transform: TransformRequest, completed_status: TransformStatus, data_dir: str, file_list: List[str], signed_urls) TransformedResults[source]¶
- update_record(record: TransformedResults)[source]¶
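The distinction between submitted and completed records can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: `InMemoryQueryCache` is a hypothetical class, the storage is a plain dictionary, the status strings are invented, and the methods take a precomputed hash instead of a TransformRequest. It only mirrors the return-value semantics documented above.

```python
from typing import Dict, List, Optional


class InMemoryQueryCache:
    """Hypothetical dictionary-backed stand-in, not the servicex QueryCache."""

    def __init__(self) -> None:
        # Maps a request hash to its cached record.
        self._records: Dict[str, dict] = {}

    def cache_submitted_transform(self, hash_value: str, request_id: str) -> None:
        # Remember a transform that was submitted but has not completed.
        self._records[hash_value] = {
            "request_id": request_id,
            "status": "SUBMITTED",  # invented status label
            "files": [],
        }

    def cache_transform(self, hash_value: str, request_id: str, files: List[str]) -> None:
        # Store (or overwrite with) a completed record.
        self._records[hash_value] = {
            "request_id": request_id,
            "status": "COMPLETE",  # invented status label
            "files": list(files),
        }

    def is_transform_request_submitted(self, hash_value: str) -> bool:
        record = self._records.get(hash_value)
        return record is not None and record["status"] == "SUBMITTED"

    def get_transform_request_id(self, hash_value: str) -> Optional[str]:
        record = self._records.get(hash_value)
        return record["request_id"] if record is not None else None

    def get_transform_by_hash(self, hash_value: str) -> Optional[dict]:
        # Only completed records count as results.
        record = self._records.get(hash_value)
        if record is None or record["status"] != "COMPLETE":
            return None
        return record
```

In this sketch a submitted record lets a restarted client pick up the existing request id instead of submitting the same query twice.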
servicex.servicex_adapter module¶
- class servicex.servicex_adapter.ServiceXAdapter(url: str, refresh_token: str | None = None)[source]¶
Bases:
object
- get_code_generators() dict[str, str]¶
- async get_dataset(dataset_id=None) CachedDataset[source]¶
- async get_datasets(did_finder=None, show_deleted=False) List[CachedDataset][source]¶
- async get_servicex_info() ServiceXInfo[source]¶
- async get_transform_status(request_id: str) TransformStatus[source]¶
- async get_transforms() List[TransformStatus][source]¶
- async submit_transform(transform_request: TransformRequest) str[source]¶
servicex.servicex_client module¶
- enum servicex.servicex_client.ProgressBarFormat(value)[source]¶
Bases:
str, Enum
Specify the way progress bars are displayed.
- Member Type:
str
Valid values are as follows:
- expanded = <ProgressBarFormat.expanded: 'expanded'>¶
- compact = <ProgressBarFormat.compact: 'compact'>¶
- none = <ProgressBarFormat.none: 'none'>¶
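Because the enum derives from both str and Enum, its members behave like plain strings. The sketch below defines a local replica with the same three members so that it runs without servicex installed. The replica is an assumption for illustration, not an import of the real class.

```python
from enum import Enum


class ProgressBarFormat(str, Enum):
    """Local replica of the documented enum, for illustration only."""

    expanded = "expanded"
    compact = "compact"
    none = "none"


def parse_format(value: str) -> ProgressBarFormat:
    # Hypothetical helper: look a member up by its string value.
    # An unknown value raises ValueError.
    return ProgressBarFormat(value)


# Members compare equal to their string values thanks to the str mixin.
chosen = parse_format("compact")
is_compact = chosen == "compact"
```

Note that the member named `none` is the string "none" and is unrelated to Python's `None`.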
- exception servicex.servicex_client.ReturnValueException(exc)[source]¶
Bases:
Exception
An exception occurred at some point while obtaining this result from ServiceX.
- class servicex.servicex_client.ServiceXClient(backend=None, url=None, config_path=None, cache_dir: str | None = None)[source]¶
Bases:
object
Connection to a ServiceX deployment. Instances of this class can retrieve deployment data from the service and also interact with previously run transformations. Instances of this class are factories for Datasets.
If both backend and url are unspecified, it will attempt to pick up the default backend from .servicex.
- Parameters:
backend – Name of a deployment from the .servicex file
url – Direct URL of a ServiceX deployment, used instead of .servicex. This only works with hosts that require no authentication, or when the token is found in a file pointed to by the environment variable BEARER_TOKEN_FILE
config_path – Optional path to the .servicex file. If not specified, will search in local directory and up in enclosing directories
cache_dir – Optional path to override the cache directory for downloads and the cache database. If not specified, uses the value from the configuration file or the default path.
- delete_dataset(dataset_id) bool[source]¶
Delete a dataset by its ID.
- Returns:
Boolean showing whether the dataset has been deleted
- generic_query(dataset_identifier: DataSetIdentifier | FileListDataset, query: str | QueryStringGenerator, codegen: str | None = None, title: str = 'ServiceX Client', result_format: ResultFormat = ResultFormat.parquet, ignore_cache: bool = False, fail_if_incomplete: bool = True) Query[source]¶
Generate a Query object for a generic codegen specification
- Parameters:
dataset_identifier – The dataset identifier or filelist to be the source of files
title – Title to be applied to the transform. This is also useful for relating transform results.
codegen – Name of the code generator to use with this transform
result_format – Whether to produce Parquet or ROOT output. This can be set later with the set_result_format method
ignore_cache – Ignore the query cache and always run the query
- Returns:
A Query object
- get_code_generators() dict[str, str][source]¶
Retrieve the code generators deployed with the ServiceX instance.
Returns the cached result if already fetched, otherwise performs a network request via
ServiceXAdapter.get_code_generators().
- get_dataset(dataset_id) CachedDataset[source]¶
Retrieve a dataset by its ID.
- Returns:
A CachedDataset object
- get_datasets(did_finder=None, show_deleted=False) List[CachedDataset][source]¶
Retrieve all datasets you have run on the server.
- Returns:
List of CachedDataset objects
- get_transform_status(transform_id) TransformStatus¶
Get the status of a given transform.
- Parameters:
transform_id – The uuid of the transform
- Returns:
The current status for the transform
- async get_transform_status_async(transform_id) TransformStatus[source]¶
Get the status of a given transform.
- Parameters:
transform_id – The uuid of the transform
- Returns:
The current status for the transform
- get_transforms() List[TransformStatus]¶
Retrieve all transforms you have run on the server.
- Returns:
List of Transform status objects
- async get_transforms_async() List[TransformStatus][source]¶
Retrieve all transforms you have run on the server.
- Returns:
List of Transform status objects
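Several methods above come in pairs: a coroutine with an `_async` suffix and a blocking method of the same name without it. One common way to build such a pair is sketched below. It is an assumption about the general pattern, not a description of how ServiceXClient implements it: `StatusClient` is hypothetical, and a dictionary lookup replaces the network call.

```python
import asyncio
from typing import Dict


class StatusClient:
    """Hypothetical client; the dictionary replaces a real server."""

    def __init__(self, statuses: Dict[str, str]) -> None:
        self._statuses = statuses

    async def get_transform_status_async(self, transform_id: str) -> str:
        # Yield to the event loop, as a real network request would.
        await asyncio.sleep(0)
        return self._statuses[transform_id]

    def get_transform_status(self, transform_id: str) -> str:
        # Blocking wrapper. asyncio.run() starts a fresh event loop and
        # raises RuntimeError if one is already running in this thread.
        return asyncio.run(self.get_transform_status_async(transform_id))
```

Inside a coroutine or a notebook that already has a running event loop, call the `_async` variant with `await` instead of the blocking wrapper.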
- servicex.servicex_client.deliver(spec: ServiceXSpec | Mapping[str, Any] | str | Path, config_path: str | None = None, servicex_name: str | None = None, return_exceptions: bool = True, fail_if_incomplete: bool = True, ignore_local_cache: bool = False, progress_bar: ProgressBarFormat = ProgressBarFormat.expanded, concurrency: int = 10, cache_dir: str | None = None)¶
Execute a ServiceX query.
- Parameters:
spec – The specification of the ServiceX query, either in a dictionary or a ServiceXSpec object.
config_path – The filesystem path to search for the servicex.yaml or .servicex file.
servicex_name – The name of the ServiceX instance, as specified in the configuration YAML file (None will give the default backend).
return_exceptions – If something goes wrong, bubble up the underlying exception for debugging (as opposed to just having a generic error).
fail_if_incomplete – If True: if not all input files are transformed, the transformation will be marked as a failure and no outputs will be available. If False, a partial file list will be returned.
ignore_local_cache – If True, ignore the local query cache and always run the query on the remote ServiceX instance.
progress_bar – specify the kind of progress bar to show. ProgressBarFormat.expanded (the default) means every Sample will have its own progress bars; ProgressBarFormat.compact gives one summary progress bar for all transformations; ProgressBarFormat.none switches off progress bars completely.
concurrency – specify how many downloads to run in parallel (default is 10).
cache_dir – if set, will override the target directory for downloads and the cache database.
- Returns:
A dictionary mapping the name of each Sample to a GuardList with the file names or URLs for the outputs.
- async servicex.servicex_client.deliver_async(spec: ServiceXSpec | Mapping[str, Any] | str | Path, config_path: str | None = None, servicex_name: str | None = None, return_exceptions: bool = True, fail_if_incomplete: bool = True, ignore_local_cache: bool = False, progress_bar: ProgressBarFormat = ProgressBarFormat.expanded, concurrency: int = 10, cache_dir: str | None = None)[source]¶
Execute a ServiceX query.
- Parameters:
spec – The specification of the ServiceX query, either in a dictionary or a ServiceXSpec object.
config_path – The filesystem path to search for the servicex.yaml or .servicex file.
servicex_name – The name of the ServiceX instance, as specified in the configuration YAML file (None will give the default backend).
return_exceptions – If something goes wrong, bubble up the underlying exception for debugging (as opposed to just having a generic error).
fail_if_incomplete – If True: if not all input files are transformed, the transformation will be marked as a failure and no outputs will be available. If False, a partial file list will be returned.
ignore_local_cache – If True, ignore the local query cache and always run the query on the remote ServiceX instance.
progress_bar – specify the kind of progress bar to show. ProgressBarFormat.expanded (the default) means every Sample will have its own progress bars; ProgressBarFormat.compact gives one summary progress bar for all transformations; ProgressBarFormat.none switches off progress bars completely.
concurrency – specify how many downloads to run in parallel (default is 10).
cache_dir – if set, will override the target directory for downloads and the cache database.
- Returns:
A dictionary mapping the name of each Sample to a GuardList with the file names or URLs for the outputs.
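The concurrency parameter caps how many downloads run in parallel. A generic way to enforce such a cap in asyncio code is a semaphore, sketched below. This is an illustration of the technique under stated assumptions, not the servicex download code: `download_all` and `fake_download` are hypothetical, and a short sleep replaces the real file transfer.

```python
import asyncio
from typing import List, Tuple


async def download_all(names: List[str], concurrency: int = 10) -> Tuple[List[str], int]:
    """Hypothetical helper: return local paths and the peak parallelism observed."""
    semaphore = asyncio.Semaphore(concurrency)
    running = 0  # downloads currently in progress
    peak = 0     # highest value of `running` seen so far

    async def fake_download(name: str) -> str:
        nonlocal running, peak
        async with semaphore:  # at most `concurrency` tasks pass this point
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stands in for the file transfer
            running -= 1
            return f"local/{name}"

    # gather() returns results in the same order as the inputs.
    paths = await asyncio.gather(*(fake_download(n) for n in names))
    return list(paths), peak


paths, peak = asyncio.run(
    download_all([f"file_{i}.parquet" for i in range(25)], concurrency=10)
)
```

All 25 tasks are created at once, but the semaphore lets only ten of them hold a download slot at any moment, and the rest wait their turn.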