Shell script variables in steps.yml

Hello,

I am dealing with the following problem. I have a script in the steps.yml file that contains a for-loop like the one below. The commands are simplified; I just want to demonstrate the problem.

for sig in GGMH
do
python script.py > logs/${sig}.log
done

When I run RECAST it crashes with KeyError: sig. I suppose this is because RECAST expects ${sig} to be a “RECAST variable” (i.e. one assigned in workflow.yml), whereas it is a variable I define only in the shell script, so it should have no counterpart in workflow.yml at all. So my question is:

How do I treat variables in the shell script so that they are not interpreted as “RECAST variables”? Is there some kind of escape sequence that would allow me to use a variable solely in the shell script but RECAST won’t consider it?

Cheers,
Ondřej

Hi Ondřej,

Maybe
python script.py > logs/"$sig".log
will do the trick? I have a for loop in my own steps.yml and it is working fine, see

https://gitlab.cern.ch/recast-atlas/exotics/ana-exot-2018-54/-/blob/master/recast-atlas/specs/steps.yml#L23-30

Is there some kind of escape sequence that would allow me to use a variable solely in the shell script but RECAST won’t consider it?

Edit: See post #5 by feickert in this thread for the revised answer.

@otheiner I believe the answer is “no” with regard to an escape sequence. As @ysmirnov points out, the best approach is to break with shell-scripting best practice and drop the {} from your shell variables. This has the negative side effect of removing the ability to use braces in expansions, but at the moment, in recast-atlas v0.1.9, I do not think there is an alternative.

So to summarize:

  • General good practice for accessing shell variables: "${my_shell_variable}"
  • General good practice for accessing shell variables when working with RECAST: "$my_shell_variable"
  • RECAST variables: {my_recast_variable}

@feickert Thank you for your answer! Some time after asking here I also asked this question on Mattermost, because I thought there had to be a way, and someone suggested using double braces (like this: {{expression}}). This seems to work. I am using the following line in my RECAST script:

ls '{toMerge}'/mc16_13TeV*.root | awk -F'[.]' '{{print $2}}' | sort -u >sig_dids.txt

This line is supposed to get the dataset IDs from the names of the files inside the folder.

I know that this line does not directly involve shell variables, only the use of curly braces in scripts in general, but this is the solution that works in RECAST.
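
For anyone reading along, here is a minimal sketch of why the doubled braces work in that line. It assumes the script text is filled in with Python's str.format before the shell sees it; the value given to toMerge is a made-up path used only for illustration.

```python
# Sketch only: this mimics the substitution step with plain str.format,
# it is not RECAST or yadage code.
line = "ls '{toMerge}'/mc16_13TeV*.root | awk -F'[.]' '{{print $2}}' | sort -u >sig_dids.txt"

# "/data/merged" is a hypothetical value for the toMerge parameter.
shell_line = line.format(toMerge="/data/merged")

# {toMerge} has been replaced, and {{...}} has collapsed to the single
# braces that awk expects.
print(shell_line)
```

The single-brace field is substituted, while each doubled brace is emitted as one literal brace, so the awk program arrives at the shell intact.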

@otheiner you are correct. I followed up with @lheinric, who explained that

yadage uses Python string formatting so if you want ${dsgfaf} in the script you need to do ${{dsgfaf}}
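
A minimal sketch of that behaviour, using only Python's built-in str.format. The render helper is a hypothetical stand-in for the substitution step, not a function from yadage or recast-atlas; the two template strings are taken from the loop in the original question.

```python
def render(template, **parameters):
    """Hypothetical stand-in: fill {name} fields with str.format."""
    return template.format(**parameters)

unescaped = "python script.py > logs/${sig}.log"
escaped = "python script.py > logs/${{sig}}.log"

# With single braces, str.format treats {sig} as a field that must be
# supplied, and raises KeyError when it is not.
try:
    render(unescaped)
    error = None
except KeyError as exc:
    error = exc.args[0]  # name of the missing field

# With doubled braces, the braces are emitted literally and the shell
# later receives an ordinary ${sig}.
rendered = render(escaped)

print(error)
print(rendered)
```

This reproduces the KeyError from the original post for the unescaped form and yields a plain ${sig} for the escaped one.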

So the corrected information is:

  • Normal shell script variables: "${my_shell_variable}"
  • Shell script variables in RECAST: "${{my_shell_variable}}"
  • RECAST variables: {my_recast_variable}
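
As a rough illustration of the last two forms used together, the sketch below fills a script template that mixes a RECAST-style variable with an escaped shell variable. It assumes plain Python str.format semantics; outdir is an invented parameter name, not one from any real workflow.yml.

```python
# Hypothetical script template: {outdir} is meant to be substituted,
# ${{sig}} is meant to survive as a shell variable.
template = """\
for sig in GGMH
do
  python script.py > {outdir}/${{sig}}.log
done
"""

# "logs" is an illustrative value for the invented outdir parameter.
script = template.format(outdir="logs")
print(script)
```

After substitution the loop body reads `python script.py > logs/${sig}.log`, which is what the shell should see.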

I’ve opened up a request to improve this in the user guide documentation (the link requires signing in to CERN).
