How do I download files from a directory on eos recursively using xrootd?

A frequent annoyance when moving a RECAST workflow from local input to input downloaded from eos with xrootd is that the xrdcp tool seems to have no native functionality for recursively downloading all files in a directory. Instead, you have to specify the full path to each file and download the files one at a time.

Fortunately, Giordon Stark has thought about this issue and, as usual, found a clever solution:

Suppose you want to use xrootd to download all files in the directory /eos/user/g/gstark/pyhf/ANA-SUSY-2019-09_3Loffshell on eos. First make a local directory to contain the files:

mkdir 3Loffshell

then use xrdfs to list all files ending with e.g. .json in the directory on eos, and for each such file, download it to the local directory you just made:

xrdfs eoshome.cern.ch ls /eos/user/g/gstark/pyhf/ANA-SUSY-2019-09_3Loffshell/ | grep '\.json$' | xargs -I {} xrdcp root://eoshome.cern.ch/{} 3Loffshell/.
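
The one-liner above can be wrapped in a small shell function if you need it in more than one place. This is only a sketch: the function name eos_fetch and the DRY_RUN variable are made up for illustration (they are not part of xrootd), and like the one-liner it assumes file names without whitespace.

```shell
#!/bin/sh
# eos_fetch is a hypothetical helper, not an xrootd command.
# Usage: eos_fetch <server> <remote_dir> <local_dir> [pattern]
# Set DRY_RUN to a non-empty value to print the xrdcp commands instead of running them.
eos_fetch() {
    server=$1
    remote_dir=$2
    local_dir=$3
    pattern=${4:-'\.json$'}   # grep regex; defaults to names ending in .json

    mkdir -p "$local_dir" || return 1

    # List the remote directory, keep the matching paths, copy one file at a time.
    xrdfs "$server" ls "$remote_dir" | grep -- "$pattern" | while read -r path; do
        if [ -n "$DRY_RUN" ]; then
            echo "xrdcp root://$server/$path $local_dir/"
        else
            xrdcp "root://$server/$path" "$local_dir/"
        fi
    done
}

# Example call (commented out so that sourcing this file does nothing):
# eos_fetch eoshome.cern.ch /eos/user/g/gstark/pyhf/ANA-SUSY-2019-09_3Loffshell 3Loffshell
```

Running it once with DRY_RUN=1 is a cheap way to check which files the pattern picks up before anything is transferred.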

If you’ve found another solution or workaround for this issue, please don’t hesitate to share it in the comment section below!

UPDATE (March 29, 2021):

Thanks to @ysmirnov for sharing his finding that xrdcp -r works in a newer analysisbase image (tested in atlas/analysisbase:21.2.139). E.g.:

docker run --rm -it atlas/analysisbase:21.2.139
source /release_setup.sh
kinit damacdon@CERN.CH
xrdcp -r root://eoshome.cern.ch//eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF .

Note: eosproject.cern.ch should work as well in this container:

xrdcp -r root://eosproject.cern.ch//eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF .
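
If the same script has to run in both older and newer images, one option is to try xrdcp -r first and fall back to the per-file listing when it fails. The sketch below is an untested-on-eos illustration: copy_eos_dir is a made-up name, the fallback only handles a flat directory, and it assumes file names without whitespace.

```shell
#!/bin/sh
# copy_eos_dir is a hypothetical helper, not an xrootd command.
# Usage: copy_eos_dir <server> <remote_dir> <local_dir>
copy_eos_dir() {
    server=$1
    remote_dir=$2
    local_dir=$3
    mkdir -p "$local_dir" || return 1

    # Preferred: recursive copy, reported to work in newer analysisbase images.
    if xrdcp -r "root://$server/$remote_dir" "$local_dir"; then
        return 0
    fi

    # Fallback: list the directory and copy its entries one at a time into
    # <local_dir>/<name of remote dir>, mirroring the layout xrdcp -r produces.
    # Subdirectories are NOT descended into.
    echo "xrdcp -r failed, falling back to per-file copy" >&2
    target="$local_dir/$(basename "$remote_dir")"
    mkdir -p "$target" || return 1
    status=0
    for path in $(xrdfs "$server" ls "$remote_dir"); do
        xrdcp "root://$server/$path" "$target/" || status=1
    done
    return $status
}

# Example call (commented out so that sourcing this file does nothing):
# copy_eos_dir eosproject.cern.ch /eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF .
```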

If it’s an option for you, you can also store your files on eos in an archive, e.g.:

tar -czvf hists.tar.gz hists

then just xrdcp the tarball and extract it:

tar -xf hists.tar.gz

This might not be the preferred option, but it has the benefit that the transfer is much faster.
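
Put together, the round trip might look like the sketch below. The helper names pack_dir and fetch_and_unpack are made up for illustration, and the URL in the example call is a placeholder rather than a real location.

```shell
#!/bin/sh
# Hypothetical helpers, not part of xrootd.

# pack_dir <dir>: create <dir>.tar.gz from the directory <dir>
pack_dir() {
    tar -czf "$1.tar.gz" "$1"
}

# fetch_and_unpack <remote_url> <dest_dir>: download a tarball with xrdcp
# and extract it inside <dest_dir>
fetch_and_unpack() {
    url=$1
    dest=$2
    mkdir -p "$dest" || return 1
    tarball="$dest/$(basename "$url")"
    # -f overwrites a tarball left over from an earlier download
    xrdcp -f "$url" "$tarball" && tar -xzf "$tarball" -C "$dest"
}

# Example calls (commented out; the remote path is a placeholder):
# pack_dir hists
# fetch_and_unpack root://eosproject.cern.ch//eos/project/<your-path>/hists.tar.gz .
```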

This is a really good suggestion, thanks @pbakker!

Hi,

xrdcp has a -r (or --recursive) flag, which makes it recursively copy all files. That is, the following command copies all my files from eos to afs:

xrdcp -r /eos/atlas/user/y/ysmirnov/MC15_2ndBatch/ForValidationPlots/ .

Hey @ysmirnov, thanks for the report!

I can confirm that xrdcp -r works for me as well when logged into lxplus. The issue arises when you try to do a recursive copy remotely (e.g. from a docker container) using the root://eoshome.cern.ch/ prefix, e.g.

docker run --rm -it atlas/analysisbase:21.2.85-centos7
source /release_setup.sh 
kinit damacdon@CERN.CH
xrdcp -r root://eoshome.cern.ch//eos/user/d/damacdon/Feedback_and_debrief_zoom_recording .

which fails with the following error message:

Error indexing remote directory.

Hi Danika,

yes, this is true.

Interestingly enough, xrdcp -r fails in Docker only (?) for files inside /eos/user. For files in e.g. /eos/project it works just fine:

xrdcp -r root://eosproject.cern.ch//eos/project/r/recast/atlas/ANA-SUSY-2018-19/ .

I guess this is hinted at on the ATLAS EOS twiki page; search for the sentence “The CLI does NOT work for EOS space on CERNBox under /eos/user/”.

Hey @ysmirnov, strangely I can’t seem to reproduce this in the analysisbase container (I still get the same Error indexing remote directory. error):

docker run --rm -it atlas/analysisbase:21.2.85-centos7
source /release_setup.sh
kinit damacdon@CERN.CH
xrdcp -r root://eosproject.cern.ch//eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF .
Error indexing remote directory.

Hi @damacdon,

I can’t reproduce the correct behavior with your image either, but when I try a newer one, the atlas/analysisbase:21.2.139, it works fine for me:

PS C:\Users\Fujitsu> docker run --rm -it atlas/analysisbase:21.2.139
Unable to find image 'atlas/analysisbase:21.2.139' locally
21.2.139: Pulling from atlas/analysisbase
5b3f21ee06c1: Already exists
15869d7db4b1: Already exists
3c67b238eaf6: Already exists
00037f4c3fa7: Already exists
884893bf7e5f: Already exists
0f1d13de82c1: Already exists
af773888e33e: Already exists
Digest: sha256:cf69e10defa9cb564dcb60c9ca723f0de9e7a1813f588bdde1d1a06a944c1e3e
Status: Downloaded newer image for atlas/analysisbase:21.2.139
             _ _____ _      _   ___
            /_\_   _| |    /_\ / __|
           / _ \| | | |__ / _ \\__ \
          /_/ \_\_| |____/_/ \_\___/

This is a self-contained ATLAS AnalysisBase image.
To set up the analysis release of the image, please
execute:

          source /release_setup.sh

[bash][atlas]:workdir > source /release_setup.sh
Configured GCC from: /opt/lcg/gcc/8.3.0-cebb0/x86_64-centos7/bin/gcc
Configured AnalysisBase from: /usr/AnalysisBase/21.2.139/InstallArea/x86_64-centos7-gcc8-opt
[bash][atlas AnalysisBase-21.2.139]:workdir > kinit ysmirnov@CERN.CH
Password for ysmirnov@CERN.CH:
[bash][atlas AnalysisBase-21.2.139]:workdir > xrdcp -r root://eosproject.cern.ch//eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF .
Job: 1/3
Source: root://eosproject.cern.ch:1094//eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF/sig16a_MJA.275_lowMET.txt
Target: file://localhost/workdir/.//antiSF/sig16a_MJA.275_lowMET.txt
[117.1kB/117.1kB][100%][==================================================][117.1kB/s]
Job: 2/3
Source: root://eosproject.cern.ch:1094//eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF/sig16d_MJA.275_lowMET.txt
Target: file://localhost/workdir/.//antiSF/sig16d_MJA.275_lowMET.txt
[114.4kB/114.4kB][100%][==================================================][114.4kB/s]
Job: 3/3
Source: root://eosproject.cern.ch:1094//eos/project/r/recast/atlas/ANA-EXOT-2018-06/antiSF/sig16e_MJA.275_lowMET.txt
Target: file://localhost/workdir/.//antiSF/sig16e_MJA.275_lowMET.txt
[115.7kB/115.7kB][100%][==================================================][115.7kB/s]
[bash][atlas AnalysisBase-21.2.139]:workdir >                                                                           

I also noticed that xrdcp -r works fine on some lxplus nodes, while on others it fails with this Error indexing remote directory error.

Hi @ysmirnov,

Interesting! That does also work for me with that newer atlas/analysisbase:21.2.139 image - maybe an updated version of xrootd?? I tested in that image with both root://eosproject.cern.ch and root://eoshome.cern.ch, and both work fine.

Thanks a lot for this find! Will update the main text of this post!