Setting up for xAOD
To use FuncADL with ATLAS xAOD, the dataset or sample being used needs to be specified. To do this, it is useful to set up a datatype for passing samples to the ServiceX frontend.
Setting up Samples
When passing samples to the ServiceX frontend, it is convenient to use a simple data structure that contains the name of the sample, the dataset location, and the codegen to use. This makes the samples easy to manage and reference in the analysis.
from dataclasses import dataclass
from typing import List, Union

@dataclass
class sample:
    "Location of data sample"
    # Shorthand name
    name: str
    # Dataset location
    ds: Union[List[str], str]
    # Codegen
    codegen: str
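As a quick illustration of how such a dataclass behaves, the sketch below builds one instance and inspects it. It is self-contained, so the class is repeated here. The values `"example"` and `"placeholder_dataset"` are placeholders chosen for illustration, not real sample names or dataset locations.

```python
from dataclasses import asdict, dataclass
from typing import List, Union


@dataclass
class sample:
    "Location of data sample"
    name: str                   # Shorthand name
    ds: Union[List[str], str]   # Dataset location
    codegen: str                # Codegen


# Placeholder values for illustration only; this is not a real dataset.
example = sample(name="example", ds="placeholder_dataset", codegen="atlasr25")

# Fields are available as attributes.
print(example.name)

# asdict() gives a plain-dict view of the fields, handy for logging.
print(asdict(example))

# Dataclasses compare by field values, so an identical sample is equal.
print(example == sample("example", "placeholder_dataset", "atlasr25"))
```

Because the dataclass generates `__init__`, `__repr__`, and `__eq__` automatically, no extra boilerplate is needed to print or compare samples.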
With the datatype in place, the samples used in the analysis can be collected in a dictionary:
from servicex import dataset as servicex_dataset

_samples = {
    "ds_physlite": sample(
        name="physlite",
        ds=servicex_dataset.FileList(["root://eospublic.cern.ch//eos/opendata/atlas/rucio/mc20_13TeV/DAOD_PHYSLITE.37622528._000013.pool.root.1"]),
        codegen="atlasr25",
    ),
    "ds_phys": sample(
        name="phys",
        ds=servicex_dataset.Rucio("mc23_13p6TeV:mc23_13p6TeV.902046.QBHPy8EG_QBH_photonjet_n1_Mth7000.deriv.DAOD_PHYS.e8557_e8528_s4162_s4114_r14622_r14663_p6026_tid37642334_00"),
        codegen="atlasr25",
    ),
}
Then these can be mapped to variables to make it easier to reference them later:
ds_physlite = _samples["ds_physlite"]
ds_phys = _samples["ds_phys"]
Getting Data from ServiceX
Now that the samples have been defined, they are ready to be passed to the ServiceX deliver() function, which fetches the files from the ServiceX backend. To simplify later steps, a function can be defined that takes a query and a sample and returns the data:
from servicex_analysis_utils import to_awk
from servicex import deliver

def get_data(query, s: sample):
    """Sends request for data to servicex backend.

    Args:
        query: The FuncADL query to run (e.g. a FuncADLQueryPHYSLITE)
        s (sample): The sample to request

    Returns:
        Awkward array holding the data for the requested sample
    """
    spec = {
        'Sample': [{
            'Name': s.name,
            'Dataset': s.ds,
            'Query': query,
            'Codegen': s.codegen,
        }]
    }
    # Get the files from the ServiceX backend
    files = deliver(spec, servicex_name="servicex")
    assert files is not None, "No files returned from deliver! Internal error"
    # Get the data into an awkward array
    data = to_awk(files)
    # For these examples we are only using one sample, so we return just the array, not the dictionary.
    return data[s.name]
This example uses the to_awk() function from servicex_analysis_utils. It takes the files delivered by ServiceX and returns a dictionary of awkward arrays keyed by sample name, which is why get_data() indexes the result with s.name.
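The request dictionary built inside get_data() can also be assembled separately, which makes it easy to inspect before anything is sent. The sketch below is a minimal illustration under stated assumptions: `build_spec` is a hypothetical helper (not part of ServiceX), the `ds` field is typed as `Any` for brevity, and the query and dataset values are placeholder strings rather than real FuncADL queries or dataset objects. It mirrors the spec layout shown above and extends it to several samples sharing one query.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class sample:
    "Location of data sample"
    name: str     # Shorthand name
    ds: Any       # Dataset location (typed loosely for this sketch)
    codegen: str  # Codegen


def build_spec(query: Any, samples: List[sample]) -> Dict[str, list]:
    """Hypothetical helper: build a spec dictionary with one entry per sample."""
    return {
        "Sample": [
            {
                "Name": s.name,
                "Dataset": s.ds,
                "Query": query,
                "Codegen": s.codegen,
            }
            for s in samples
        ]
    }


# Placeholder query and datasets, for illustration only.
spec = build_spec(
    "placeholder_query",
    [
        sample("a", "placeholder_dataset_a", "atlasr25"),
        sample("b", "placeholder_dataset_b", "atlasr25"),
    ],
)

# Inspect which samples the request would contain.
print([entry["Name"] for entry in spec["Sample"]])
```

With real dataset objects and a real query, a dictionary of this shape is what get_data() hands to deliver(). Keeping the construction in a pure function means it can be checked without contacting the backend.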