Loading a CNN model in C++ analysis code

Dear experts,

I am working on loading a CNN model inside the VVJJ analysis framework (CxAOD fw). I am already aware of the ONNX Runtime route, but I am having some trouble getting it to work. I am starting from the Athena example (AthExOnnxRuntime).

Now, in my understanding, I need to use the main ONNX Runtime project (microsoft/onnxruntime).

I tried, for instance, to build it locally, but I got some compilation errors in the final tests it performs. Also, following a few issues I found around, e.g.:
https://its.cern.ch/jira/browse/AML-3

I think this is not the best/smartest solution to proceed with. I see that it has been integrated into the ATLAS externals in Athena, so is it possible to use it in an analysis framework setup (like an AnalysisBase release) or something like this? Maybe by using AtlasExternals directly? Is it supposed to work inside an analysis framework? Sorry for my lack of knowledge on this point.

Thanks a lot in advance
Cheers,
Antonio

Hi @angianni

Thanks for asking here! Can you confirm that you’re trying to build onnxruntime yourself? If so, you’re making this way too difficult for yourself. You can just use the version we supply with AnalysisBase by putting this in your CMakeLists.txt file:

find_package( onnxruntime REQUIRED )
target_link_libraries(your-target ${ONNXRUNTIME_LIBRARIES})
target_include_directories(your-target PRIVATE ${ONNXRUNTIME_INCLUDE_DIRS})

where your-target is whatever library or executable you’re trying to build.
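
Once that’s in place, using the ONNX Runtime C++ API from your code looks roughly like this (just a sketch, not a drop-in solution: the model path and tensor names are placeholders, and the include path may need an extra prefix depending on your release):

#include <core/session/onnxruntime_cxx_api.h>  // prefix may differ per setup, see below

#include <cstdint>
#include <vector>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "cnn-demo");
  Ort::SessionOptions options;
  Ort::Session session(env, "model.onnx", options);  // placeholder model path

  // one flattened 28x28x1 image as input
  std::vector<float> input(28 * 28 * 1, 0.f);
  std::vector<int64_t> shape{1, 28, 28, 1};

  Ort::MemoryInfo mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value tensor = Ort::Value::CreateTensor<float>(
      mem, input.data(), input.size(), shape.data(), shape.size());

  // placeholder tensor names; they must match the names stored in the model
  const char* input_names[] = {"input"};
  const char* output_names[] = {"output"};
  auto outputs = session.Run(Ort::RunOptions{nullptr},
                             input_names, &tensor, 1, output_names, 1);
  float* scores = outputs.front().GetTensorMutableData<float>();
  (void)scores;  // read the network output here
  return 0;
}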

Hi @dguest,

yes, I started with the AthExOnnxRuntime example in Athena and I was trying to compile it in order to reuse that structure to develop a simple tool in the analysis fw; since I got errors, I assumed the main onnxruntime package was missing, so I just tried to download it, but I soon understood that this was not really a smooth thing to do…

Now, looking more carefully at the errors, I think I solved them just by replacing:
INCLUDE_DIRS ${ONNXRUNTIME_INCLUDE_DIRS} PathResolver

with
INCLUDE_DIRS PRIVATE ${ONNXRUNTIME_INCLUDE_DIRS} PathResolver
here:
https://gitlab.cern.ch/atlas/athena/-/blob/release/21.2/Control/AthenaExamples/AthExOnnxRuntime/CMakeLists.txt#L14

and
#include <core/session/onnxruntime_cxx_api.h>
with
#include <onnxruntime/core/session/onnxruntime_cxx_api.h>
here:
https://gitlab.cern.ch/atlas/athena/-/blob/release/21.2/Control/AthenaExamples/AthExOnnxRuntime/AthExOnnxRuntime/CxxApiAlgorithm.h#L10

now it is compiling smoothly, so I understand that the onnxruntime dependencies are already available :slight_smile:
Now, I think I can just use this structure (really super useful) to implement what I need.

Thanks a lot for help :slight_smile:
I will ask again if needed.

Cheers,
Antonio

Dear experts,

Moving on with the implementation…
The ONNX tool expects the input (10,000 images with dimensions 28x28x1) in a flattened format:
input_tensor_values.resize(10000, std::vector<float>(28*28*1));

using the transformation:
input_tensor_values[i][r*n_cols+c] = MyPixel…
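
In full, the single-channel flattening I have looks roughly like this (getPixel is just a placeholder for wherever the pixel values actually come from):

#include <vector>

// hypothetical pixel source, defined elsewhere
float getPixel(int image, int row, int col);

std::vector<std::vector<float>> flattenImages(int n_images, int n_rows, int n_cols) {
  std::vector<std::vector<float>> input_tensor_values(
      n_images, std::vector<float>(n_rows * n_cols * 1));
  for (int i = 0; i < n_images; ++i)
    for (int r = 0; r < n_rows; ++r)
      for (int c = 0; c < n_cols; ++c)
        // row-major: the column index changes fastest in the flat array
        input_tensor_values[i][r * n_cols + c] = getPixel(i, r, c);
  return input_tensor_values;
}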

Now I am wondering: when I move, for instance, from 28x28x1 images to 28x28x3 images, what exactly is the transformation the tool expects?

Thanks in advance
Cheers,
Antonio

You should ask @dbakshig.

There might be a complementary example which computes a graph network. The problem with that one is that it may crash in an implementation-defined way, because it returns a pointer to unowned memory in an array, but if you avoid that (by using something like a vector for the return value, for example) it might get you further.

@angianni,
So for flattening 28x28x3 you need to do it in the data processing (e.g. https://gitlab.cern.ch/atlas/athena/-/blob/21.2/Control/AthenaExamples/AthExOnnxRuntime/Root/CxxApiAlgorithm.cxx#L30); you can put another for loop inside

for (int c = 0; c < n_cols; ++c)
{
  unsigned char temp = 0;
  file.read((char*)&temp, sizeof(temp));
  input_tensor_values[i][r*n_cols + c] = float(temp)/255;
}
for your last column; I haven’t done it myself yet, but that’s the way I would go.
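
In code, the extra loop could look roughly like this (untested sketch: getPixel is just a placeholder for your pixel source, and I am assuming the colour index is the last, fastest-changing one):

#include <vector>

// hypothetical pixel source, defined elsewhere
float getPixel(int image, int row, int col, int color);

void fillThreeColour(std::vector<std::vector<float>>& input_tensor_values,
                     int n_images, int n_rows, int n_cols, int n_colors) {
  input_tensor_values.assign(n_images,
                             std::vector<float>(n_rows * n_cols * n_colors));
  for (int i = 0; i < n_images; ++i)
    for (int r = 0; r < n_rows; ++r)
      for (int c = 0; c < n_cols; ++c)
        for (int k = 0; k < n_colors; ++k)  // extra inner loop over the colour index
          input_tensor_values[i][(r * n_cols + c) * n_colors + k] =
              getPixel(i, r, c, k);
}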

@angianni it might help if you explain a bit more about what you’re trying to do here: I’m confused about why we’re talking about images for the VVJJ analysis. Are we really trying to do an image model here? Maybe you can say more about the architecture of your network.

Hi @dguest, @dbakshig,

thanks a lot for your prompt replies!

Yes, I was thinking of adding an extra loop in the data preprocessing for the third dimension, but I am not sure about the convention the tool expects, in the sense that I could write my transformation like:
input_tensor_values[i][ (n_cols*n_rows*k) + r*n_cols + c] = …;

where k is the index running over the image depth; but I guess I could also do the same (row, column, colour) -->> 1-dim index transformation in other ways (I would say there are at least 3 ways to make it, but I am not sure).

Yes, about the model I am trying to deploy: I have a CNN on jet images in (eta, phi) pixels, but our images have “3 colours”, driven by the types of the jet constituents. So I am using this syntax in Keras:
Image_SHAPE = (nPixelsx, nPixelsy, nColors)
image_input = Input(Image_SHAPE)
image = Conv2D(32, kernel_size=3, activation='relu', input_shape=Image_SHAPE )(image_input)

and giving as input a 3D array with dimensions (nPixelsEta, nPixelsPhi, nColors).

Maybe this summary gives an overview of what I am trying to do for VVJJ :slight_smile:
https://indico.cern.ch/event/939174/contributions/3986619/

Thanks
Antonio

I think what you’re asking is whether your arrays are in C or Fortran order, right? I would assume they are in C order, i.e. the last index is the one that changes fastest as you iterate through the flattened array. So I think your description is correct, assuming that you index the arrays like

value = array[column][row][color]
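
To make the “last index changes fastest” point concrete, here is a tiny standalone check (nothing analysis-specific):

#include <cstdio>
#include <cstring>

int main() {
  // a small C-order (row-major) array: the last index is contiguous in memory
  float a[2][3][4] = {};
  a[1][2][3] = 42.f;

  // copy the raw memory into a flat buffer and index it by hand
  float flat[2 * 3 * 4];
  std::memcpy(flat, a, sizeof(a));
  std::printf("%f\n", flat[(1 * 3 + 2) * 4 + 3]);  // prints 42.000000
  return 0;
}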

Hi @dguest,

sorry for the delay in my reply.

Basically, I was able to run my case successfully a couple of weekends ago (and forgot to reply here…)
Yes, my point was how to transform the 3-dim tensor into the 1-dim array the tool expects. After a few attempts I found that the working transformation is:

input_tensor_values[i][ (n_colors*n_cols*r) + c*n_colors + k] = Images[k].GetBinContent(r+1, c+1);

so it seems that the expected ordering of the indices, from fastest- to slowest-changing, is:
k --> c --> r
that is:
colours --> columns --> rows

I would really thank you guys for the nice and useful inputs and hints :slight_smile:

Also, I was wondering whether it would be useful to anyone if I provided my version of the example from @dbakshig; in that case, I would need a bit of time to clean it up from the several attempts I made.

Thanks again
Cheers,
Antonio

Hi @angianni, glad you worked it out! Just for reference, when you index things in the training framework, is the order something like

array[batch_index][row][column][color]

?

I just want to make sure onnx and onnx runtime are doing what we think they are doing, and I believe it should follow exactly the same rules as most ML libraries that use numpy.

My personal advice: don’t think about this in terms of rows and columns and colors. These are all just indices in a rank-n array. There is no hard rule about what an ML library considers a “color”, and “row” and “column” usually correspond to the 1st and 2nd index of a numpy array. These things you are calling “row” and “column” are the 2nd and 3rd indices in the array for training, right? And they correspond to bins in eta and phi, right?

As for the magic with indices: I think the main missing bit of information here was how an index in a rank-4 array corresponds to an index in a “flattened” array in memory. Assuming your data is stored in a C-style array, the rule is that the last index is the one that increments most rapidly as you move through the contiguous memory. So you can calculate the index in the memory block as

global_index = (batch_index * pixels_eta * pixels_phi * n_colors) + (eta_index * pixels_phi * n_colors) + (phi_index * n_colors) + color_index

There are a few things to note here:

  1. I don’t know what you’re using as eta or phi, I just guessed as to which corresponds to “row” and “column”.
  2. The batch index is completely omitted from your calculation. This works in your case, because you are working with batch size 1 and thus the index is always zero.
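
If it helps, the same rule written out as a small helper (the eta/phi/colour names are just labels for the indices, as in my guess above):

#include <cstddef>

// flat index of element [batch][eta][phi][colour] in a C-ordered rank-4 array
std::size_t globalIndex(std::size_t batch_index, std::size_t eta_index,
                        std::size_t phi_index, std::size_t color_index,
                        std::size_t pixels_eta, std::size_t pixels_phi,
                        std::size_t n_colors) {
  return batch_index * pixels_eta * pixels_phi * n_colors
       + eta_index * pixels_phi * n_colors
       + phi_index * n_colors
       + color_index;
}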

One final note: it looks like you are using ROOT histograms which are created in each iteration of this algorithm. As ROOT objects, these things are relatively heavy and liable to interact with some global variables. If this code is just for an analysis this might be fine, but if it goes into reconstruction (or you find some weird memory leaks or performance issues) you might consider something lighter weight.
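
For example, something as simple as this (purely illustrative, the names and binning are placeholders) would avoid creating ROOT objects on every call:

#include <vector>

// purely illustrative lighter-weight image: a flat vector plus an index helper
struct JetImage {
  int n_eta, n_phi, n_colors;
  std::vector<float> pixels;  // size n_eta * n_phi * n_colors, already flat

  JetImage(int ne, int np, int nc)
    : n_eta(ne), n_phi(np), n_colors(nc), pixels(ne * np * nc, 0.f) {}

  float& at(int eta, int phi, int color) {
    return pixels[(eta * n_phi + phi) * n_colors + color];
  }
};

Since the pixels vector is already in the flattened order, it could be handed to the ONNX input directly.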

Hi @dguest,

thanks a lot for your suggestions!

Yes, I completely agree that rows/columns do not mean anything here. Basically, I have eta/phi bins; in particular, phi corresponds to index 1, eta to index 2, and of course the batch size to index 0 and the colours to index 3. As you guessed, I got a bit confused at some point between eta/phi/rows/columns/colours etc., so I just tried different combinations “experimentally” in order to converge on the right one by brute force, but now, thinking with a fresh mind, I can fully understand how it works. We just need to swap eta --> phi in your line, because you assumed eta comes before phi in the index order, as is perhaps more intuitive (indeed, I made an unwanted transpose in my data pre-processing before training in order to do some plotting, but I will update this in the future, just for personal aesthetics).

About the batch size: exactly, it is just a dummy index (the index i in my previous line), but it is not really used, in the sense that I just run the model for each event (each jet). In principle it could be used to run the model on at least the whole jet collection of an event (maybe it would speed things up a bit?), but at the moment that is more complicated to implement in the analysis fw, because this part runs over a single object per collection per event.

Finally, about the ROOT objects: I started using TH2 to store the jet images just to have something more intuitive to look at, but this definitely needs to be improved; maybe I can use vectors directly instead of filling a TH2 first and then moving to vectors.

Thanks
Cheers,
Antonio