Excluding very large Energy Depositions in DepositionGeant4

Hi everyone,

I am trying to simulate the Landau distribution of the CSA signals (signal area, signal maximum, Gaussian ToT) from a silicon pad detector with MIP particles. To obtain the CSA signals, I use the TransientPropagation + PulseTransfer modules together with the new PixelPulse object (see the very helpful merge request by Simon).
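
For context, the relevant part of the module chain looks roughly like this (a simplified sketch with placeholder values, not the exact contents of the attached configuration files):

```
[DepositionGeant4]
particle_type = "mu-"    # placeholder MIP-like particle
source_energy = 120GeV   # placeholder

[TransientPropagation]
charge_per_step = 100    # placeholder

[PulseTransfer]
# transfers the induced pulses to the pixels for the CSA signal analysis
```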

For sufficient statistics, I need to run a large number of simulations (10-20k), which is where I run into a problem: even though the probability is very low, the Landau tail of the energy deposition inside the detector means that some events with “incredibly” large energy depositions occur. Such a deposition leads to a very large number of electrons and holes that need to be propagated by TransientPropagation, which can result in Allpix getting “stuck” on an event.

I attached a minimal reproducible example (with a fixed seed), which I ran using the v2.3.0 docker image.
detector.conf (143 Bytes)
geometry.conf (105 Bytes)
simulation.conf (1.2 KB)

Here, event #361 will generate ~41’362 ke, compared to the mean deposited charge of ~36 ke.
With a charge deposition ~1000x larger than the mean, the propagation also takes ~1000x as long to complete. When I use multiprocessing, this ends up “blocking” the simulation (once the buffer is filled) until the propagation is finished.

Do I understand correctly that this “blocking” happens because Allpix² wants to enforce the correct order of events?
Would it be possible to somehow relax the requirement that events finish in order, and let the other workers continue computation while a very large charge deposition is still being propagated?

Another approach I thought about is to “cut off” the spectrum of the deposited energy/charge (at, for example, 10x the median energy/charge). This could be done by splitting the simulation into two parts after DepositionGeant4 (using ROOTObjectWriter and ROOTObjectReader) and manually removing events from the ROOT file.
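
Sketched out, such a two-stage setup could look roughly like the following (a sketch only; file names are placeholders and the global and geometry sections are omitted):

```
# Stage 1 - deposit only and store the result:
[DepositionGeant4]
# ... same deposition settings as before ...

[ROOTObjectWriter]
file_name = "deposits"   # placeholder output name

# Stage 2 (separate .conf) - read back the manually filtered file:
[ROOTObjectReader]
file_name = "deposits_filtered.root"   # placeholder, after removing large events

[TransientPropagation]
# ... propagation and pulse formation as before ...
```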

However, I feel that manually deleting events is quite a “hacky” approach, and I’m not sure whether Allpix² likes it if there are fewer events in a file than expected.

PS:
I have one more question, this one about the ElectricFieldReader module and linear electric fields:
The TransientPropagation module explicitly disallows linear electric fields (model = "linear").
What is the exact reason for this?
For a simple pad diode, the electric field is accurately described by a linear function, and indeed, using an electric field with model = "custom" and a linear field function results in no issues during execution.
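
For reference, this is roughly what I use (a sketch; the formula and parameter values are placeholders and would need to be adapted to the sensor thickness and bias, so please double-check against the ElectricFieldReader documentation):

```
[ElectricFieldReader]
model = "custom"
# Linear field along z in TFormula syntax: E(z) = [0]*z + [1]
field_function = "[0]*z + [1]"
field_parameters = 12500V/cm/mm, 5000V/cm   # placeholder slope and offset
```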

Thanks for your help,
Andreas

Hi @agsponer

thanks for this detailed report - we have stumbled over the same issue several times in the past and have several ideas for how to deal with it properly. But first, in order, to your comments & questions.

  • The blocking after the buffer is full indeed happens because the ROOTObjectWriter requires the events in order, so that the output file and its ordering are always the same and do not depend on the threading used to process the simulation. The framework already foresees that such modules can waive this sequence requirement, i.e. accept whatever event comes next. However, we have not used this for any module yet and would likely need to test it thoroughly. But this would be one way out.

  • Instead of a cut-off and a manual selection, we have two more ideas up our sleeves. One is to introduce the possibility to abort events based on a set of criteria - one of them could be the total deposited charge. This is currently in active development in the merge request Draft: Introduce Possibility to Abort Event (!706).

  • The second - even nicer - solution is that we would like to add dynamic scaling of the charge carrier groups in the propagation modules. In your case, this algorithm would scale from charge_per_step = 100 to, let’s say, charge_per_step = 50000 just for the single huge event, and therefore make this event process as quickly as the ones around the Landau MPV. The only downside is a slight loss of accuracy in the very hard Landau tail - but nothing expected to be noticeable, since the accuracy gained by moving 40M charge carriers in packs of 100 is very small :smile: This feature is currently being worked on by @hwennlof.

  • The manual selection would work anyway, and APSQ would not care about missing events but just process the available ones. The worst thing that might happen is a notice printed at the end saying “ending run after 99800 events of 100000 events, end of file” or similar. :slight_smile:

I hope we get this sorted out soon, because - as said - we run into similar issues from time to time! I’ll update this thread once there is something to test.

All the best,
Simon

PS Bonus question - bonus answer:

It is basically a safeguard against users not knowing what they are looking for. We have had the issue many times that people loaded a weighting potential (or used the pad one) for a pixel detector and then ran with a linear field. The result is very unphysical, because the charge carriers first drift towards the electrode and induce current - but if their position is slightly off the implant, they then “reverse” through the weighting potential back to zero (next to the implant), leaving a net induced charge of zero - because they are not following the actual electric field lines, which would be heavily bent at those positions.
But since you are very well aware of what you are simulating, using the custom field is perfectly fine for this - and I agree that for a pad detector, that’s exactly what you’re looking for.

Hi @simonspa

Thanks for the very quick and informative answer.
I think I already skimmed the discussion on aborting events (!706) once, but did not realise that it could also be used to cancel events from within APSQ itself, rather than just handling errors from “external” modules.

Having a dynamically chosen charge_per_step (maybe only above a certain deposited-charge threshold) would indeed be a very nice and clean way of dealing with the Landau tail. Is there a branch where this is being worked on?

I did implement a manual selection on the deposited charge (after some fighting with ROOT and migrating to PyROOT), see cut_energy_deposition_pyroot.txt (rename to .py), and this seems to work fine with APSQ. :+1:

On the electric field:
I can see how some people would make this mistake with pixelated detectors. For now, I have just commented out the check in TransientPropagation (so that I don’t need to copy everything from ElectricFieldReader and can keep my .conf file a bit cleaner), as I am working on a fork anyway.

Wishing you a good weekend,
Andreas

Hi @agsponer

I’m not sure how far things are, @hwennlof can probably jump in here…?

Thanks for sharing this! (I now also allowed .py extensions for file attachments on the forum…)

We could actually consider demoting this from a ModuleError to a LOG(ERROR) only and leave it to the user to do “the right thing”. I would consider this if a merge request were to appear… :wink:

All the best,
Simon

Hi!

Indeed, I’m working on (somehow) dynamically changing the number of charges propagated together. The code is currently in my fork, on the dynamicChargePerStep branch (Haakan Wennloef / Allpix Squared on GitLab).
After some thinking, I’ve started with the rather basic idea of setting a “maximum number of bunches” allowed for a single deposit. This has the advantage that the total deposited charge of an event doesn’t matter, only the individual deposition points do.
It works by checking whether the given charge_per_step is enough to propagate all the deposited charges without exceeding max_charge_bunches. If not, charge_per_step is increased for this particular deposit. This effectively sets a sort of “maximum effort” used to propagate a single deposit. If the max_charge_bunches value is set high enough, only the massive deposits are actually affected.
I’ve also added a histogram to output_plots to display the charge_per_step used for each propagation.
This can thus also be seen as setting a threshold for when the scaling comes into effect, at charge_per_step * max_charge_bunches.
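
In configuration terms, this would look roughly like the following (a sketch with example values; the keyword name is the one from my branch above):

```
[TransientPropagation]
charge_per_step = 100
# Cap on the number of charge carrier groups propagated for a single deposit:
max_charge_bunches = 1000
# The scaling only kicks in for deposits larger than
# charge_per_step * max_charge_bunches = 100 * 1000 = 100 ke;
# for those, charge_per_step is increased for that deposit only.
```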

This is still very much in the testing phase, and right now it is only implemented for [TransientPropagation]. Here are the timing results for a run of 1000 events with different values of max_charge_bunches on my local machine:
10 000: 4 minutes 14 seconds
1 000: 4 minutes 11 seconds
100: 3 minutes 25 seconds
10: 1 minute 35 seconds

So, for this particular seed there are no massive charges, and we don’t see any significant difference until we make the maximum number a bit too small (I have charge_per_step = 5 here, but also a rather thin sensor). Looking at just a single massive event, however (a 500 MeV pion depositing 3 676 712 electron-hole pairs):
10: 4 seconds
100: 12 seconds
1 000: 1 minute 24 seconds
10 000: 1 minute 28 seconds
100 000: 2 minutes 12 seconds
1 000 000: 2 minutes 31 seconds

With 10 000, there are actually only two deposits that get affected at all, so this event is odd. But setting it to 10 000 definitely makes this event manageable rather than freezy and memory-eaty.

For your event, that is a really massive deposit. Running just that single event with different max_charge_bunches values and charge_per_step = 100, I get:
10: 3 minutes 43 seconds
50: 14 minutes 0 seconds
100: 23 minutes 29 seconds
And with max_charge_bunches large enough for it not to make a difference: 1 hour 28 minutes 1 second (I used 100 000 in this case).

So, just to summarise: I’m working on it, and what I have so far seems to work okay. I have not yet investigated the impact on the final results, however.

Kind regards,
Håkan

Small but important update, so as not to cause confusion; the keyword is now max_charge_groups instead, to be more in line with the rest of the framework. It will also soon be available in [GenericPropagation].
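
In the configuration sketch from my earlier post this simply becomes (example values):

```
[TransientPropagation]
charge_per_step = 100
max_charge_groups = 1000   # formerly max_charge_bunches
```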

Kind regards,
Håkan