The analysis presented here is a search for WIMP (neutralino) dark matter in the centre of the Earth, using data taken in 2001-2003, including string-triggered events. The analysis focuses on low-energy, i.e. low-mass, WIMPs and includes simulated WIMP masses of 50 and 100 GeV.
This unblinding proposal is written primarily for people with experience in analysing AMANDA data. If you have seen this type of analysis before, you should be able to get the whole picture in just two clicks.
The material is structured in the following way. Those who are familiar with this analysis can go directly to the results. This page is intended to be printable and readable on its own. Some notes are included in the form of links to files or pages with explanations, plots, and other material.
A page with cut variables and other details gives an introduction to the analysis. A certain knowledge of analysis and of the AMANDA software is assumed to keep it short, but links on the page refer to further information.
There is also a page with some plots and discussion about string trigger stability.
A list of updates, the most recent first:
I've overlooked the warning on your page that the description is mainly for experts, and plunged in to see what I could understand.
It looks like an interesting analysis. I'm not too knowledgeable about the context, that is, how the new sensitivity compares with theoretical predictions or the sensitivities of other experiments, but you point out that at the lowest mass range the new sensitivity is an order of magnitude better than in the 1999 analysis.
Since the low mass range is where the contribution of the string trigger to the effective volume is greatest, and since the string trigger is being used for the first time here, it seems worth looking at the mc / data comparison in detail. There are several plots showing the string with most hits for mc and data separately. It looks like you may already be working on this: could you provide ratios of these plots? (since there is so much more data than mc, the error bars could be estimated from the mc statistics). String 15 certainly stands out as the most discrepant by eye, but given the statistics it seems like some of the other relative discrepancies might be significant as well? Is there an idea about what could account for this after retriggering?
Anna: Ratios are now provided. We do not know exactly why this difference occurs; it is included in the uncertainty of the simulation.
Best,
Chad
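The per-string ratio with error bars taken from the MC statistics, as requested in the comment above, could be sketched roughly as follows. This is only an illustration under stated assumptions: the ratio direction (data/MC), the normalisation of both distributions to unit area, the treatment of MC events as unweighted, and all counts are hypothetical and not taken from the analysis.

```python
import math

def data_mc_ratio(data_counts, mc_counts):
    """Per-string data/MC ratio of area-normalised distributions.

    Assumption: MC events are unweighted, so the relative error on an MC
    bin with m events is 1/sqrt(m). The (much smaller) data error and the
    error on the overall normalisation are neglected.
    Returns a list of (ratio, error); (None, None) where the MC bin is empty.
    """
    n_data = sum(data_counts)
    n_mc = sum(mc_counts)
    if n_data == 0 or n_mc == 0:
        raise ValueError("both samples need at least one event")
    result = []
    for d, m in zip(data_counts, mc_counts):
        if m == 0:
            result.append((None, None))
            continue
        ratio = (d / n_data) / (m / n_mc)
        result.append((ratio, ratio / math.sqrt(m)))
    return result

# Hypothetical counts of "string with most hits" for three strings.
data = [52000, 48000, 61000]
mc = [510, 495, 480]
for string_id, (r, e) in enumerate(data_mc_ratio(data, mc), start=1):
    print(f"string {string_id}: data/MC = {r:.2f} +- {e:.2f}")
```

A string whose ratio differs from 1 by several times its error would be a candidate for a significant discrepancy, subject to the simplifications listed above.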
From Per Olof Hulth September 13
Hi Anna,
I have a few comments and questions about your request for unblinding.
I am missing (as David) some details about the string trigger, e.g. comparing MC and data. It is the first time we are using the string trigger for an analysis and it would be good to see more details. How big a fraction of the data is based on only the string trigger? How do the MC and data distributions look?
Anna: More details available now on the string trigger page.
It would be nice to see the cumulative event distribution for data and ANIS in -cos(zenith angle) for 2001-2003, since the statistics are so low. Do it also for the last but one cut level.
Anna: I'm not sure what you mean by cumulative, but I interpret this as the -cos(zenith angle) plots on the next to last cut level for all three years together (which I already showed for one combination of mass and channel) for all four analysed combinations. This is now added on the page of analysis details.
Why do you start the wimp signal below 1.0 in the relative efficiency plots? I thought it should be the number of triggered signal events at 0 cut level giving 100%?
Anna: The somewhat hidden explanation on the web page is the following: "the amount of signal is compared to the number of generated muons to include a sense of the effective volume in these plots." This means that I normalize to the number of generated muons instead of triggered events. I'll rewrite the explanation to make it clearer.
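The difference between the two normalisations discussed here can be illustrated with a small sketch. The event numbers are hypothetical and only serve to show why the signal curve starts below 1.0 when normalised to generated muons rather than to triggered events.

```python
def relative_efficiency(passing, reference):
    """Fraction of events passing each cut level, relative to `reference`."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return [n / reference for n in passing]

# Hypothetical numbers, not from the analysis.
n_generated = 100000              # generated signal muons
passing = [20000, 8000, 3000]     # events at cut level 0 (trigger), 1, 2

# Normalised to triggered events: cut level 0 is 100% by construction.
eff_vs_triggered = relative_efficiency(passing, passing[0])

# Normalised to generated muons: cut level 0 already reflects the
# trigger efficiency, and hence gives a sense of the effective volume.
eff_vs_generated = relative_efficiency(passing, n_generated)

print(eff_vs_triggered)
print(eff_vs_generated)
```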
I have not found anything strange and it looks OK to me.
Cheers,
Per Olof
From David Hardke September 1
Hi Anna,
I looked closely at your unblinding proposal, and everything looks OK. I only have a few small requests:
1) This is the first analysis to use the string trigger. I'd like to be
certain that the string trigger is performing as it should and is being
simulated correctly. Have you made any performance and stability checks for
the string trigger? Here are some things that should be checked:
-- Stability of the string trigger (compare rates of string to mult
trigger throughout the data set). I only bring this up because they were
having some hardware problems in recent weeks.
-- Comparison of the data and corsika MC for string triggered
events:
* Which string? I know that there is only a yes/no for the
string trigger, but it would be interesting to look at which strings
have 6 or more hits for string triggered events in the data and MC. This
might reveal problems with the string trigger hardware and problems with the
simulation not accounting for the detailed detector configuration.
* Ratio of string to multiplicity triggers (as well as
events where both are triggered)
Anna: Plots and discussion coming up on a separate page.
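One possible way to carry out the stability check suggested in point 1 is to compute the ratio of string-triggered to multiplicity-triggered event counts run by run and flag runs that deviate from the typical value. The sketch below is an assumption-laden illustration: the run list, the counts, the use of the median as reference, and the 3-sigma threshold are all hypothetical, and the error formula treats the two counts as independent, ignoring events where both triggers fired.

```python
import math
import statistics

def string_to_mult_ratios(runs):
    """runs: iterable of (run_id, n_string, n_mult) event counts per run.

    Both counts refer to the same livetime, so their ratio equals the
    ratio of trigger rates. Returns a list of (run_id, ratio, error).
    """
    out = []
    for run_id, n_string, n_mult in runs:
        if n_string == 0 or n_mult == 0:
            continue  # ratio or its error is undefined; skip such runs
        ratio = n_string / n_mult
        # Simplification: independent Poisson errors on both counts.
        err = ratio * math.sqrt(1.0 / n_string + 1.0 / n_mult)
        out.append((run_id, ratio, err))
    return out

def flag_unstable(ratios, n_sigma=3.0):
    """Return run ids whose ratio is more than n_sigma errors from the median."""
    median = statistics.median(r for _, r, _ in ratios)
    return [run_id for run_id, r, e in ratios if abs(r - median) > n_sigma * e]

# Hypothetical runs: (run_id, string-triggered events, multiplicity-triggered events)
runs = [(1, 1000, 10000), (2, 1010, 10000), (3, 400, 10000)]
print(flag_unstable(string_to_mult_ratios(runs)))
```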
2) Have you done your signal simulations using the newest photonics based AMASIM? Your effective volumes will probably change a little bit. Since you are looking at events from a single direction, it should be relatively easy to simulate (you only need to load one photonics table). This can be done during the systematic error evaluation stage (after unblinding).
Anna: No, I haven't. But I intend to, just as you suggest.
3) Could you list the 250h and 250s cuts on the details page? From the results page I gather that you use the same cuts as for 100h but you allow longer tracks. I find it curious that the cut variable that changes most for the different assumed WIMP masses and decay channels is the upper limit on Ldirb. Is there an intuitive explanation for that?
Anna: Sure, I can add the cuts to the table. The "intuitive explanation" is that I didn't optimize the cuts for the 250 GeV WIMPs. If I just took the cuts as they stand in the 100h case, they would cut very hard into the signal because I have hard cuts selecting short tracks; this is why I loosened this one. The cut on track length is based on the signal distribution, and the low-energy signal versions of course have shorter tracks.
Other than the investigations of the string trigger performance and stability, I have no concerns that would delay the unblinding.
Dave
From Daan Hubert September 5
anna, here are the results of the belgian jury...
i must say, your results look very exciting! :) and your new proposal gets rid of most of the comments i made last time. 100h 02+03 and 50s 02+03 seem in much better shape now.
apart from the same comment david made (about string trigger comparison at various cut levels), i have the following remarks
x results table 1: the 100h analysis results in 2 data events remaining, but the 2 plots (for 01 and 02+03) show at least 3 events in the final theta search bin of 173-180. maybe a typo?
Anna: Indeed a typo (or two). Fixed.
x the numbers on the sensitivity plot for 250GeV do not correspond to those in the table.
Anna: Another typo. Fixed.
x maybe it would be interesting to know what the atm nu background level is that you use to optimize the last cuts.
Anna: The numbers of (weighted) atm nu events corresponding to the 80% data taking time used for the MRF optimization were: 97, 60, 46, 31 (rounded to nearest integer).
From Daan Hubert July 1
This was a very early response, and some comments refer to things that were only seen in an earlier version of the analysis. My answers have been omitted.
x MC
- you mention 3.6x10^8 2001 dcorsika triggers for a lifetime of ~2 days.
this corresponds to an unrealistic trigger rate of 2kHz. is 3.6x10^8 a typo?
- one ANIS sample for whole analysis? how many events?
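The trigger-rate estimate in the first MC comment follows from simple arithmetic, sketched here. The function name is hypothetical; the inputs are the numbers quoted in the comment.

```python
def implied_trigger_rate_hz(n_triggers, livetime_days):
    """Average trigger rate implied by a number of triggers and a livetime."""
    seconds = livetime_days * 86400.0  # 86400 seconds per day
    return n_triggers / seconds

# 3.6x10^8 triggers in ~2 days of lifetime, as quoted above.
rate = implied_trigger_rate_hz(3.6e8, 2.0)
print(f"{rate:.0f} Hz")  # about 2 kHz, as stated in the comment
```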
x flarecuts: the flare_only_adc_3 (3) and flare_induc_1119_9 (9) cuts for 03 seem so different from 01 and 02. shouldn't the values be the other way around or is this real? and how could there be such a big difference?
x final cut:
! you claim a uniform theta-distribution above 160deg. looking at the
distributions at next-to-last level, this is probably the case for most of
the bins. but *not* for the very last - and most important - bin (175-180)
where ANIS is systematically below data. is this understood?
- do you normalise the uniform distribution (used as input for MRF
minimisation) to the number of 160-180 events or to the number of 140-180
events?
- anyway, i don't think that a uniform distribution is too bad an
assumption since you will overestimate the last ANIS bin and hence arrive at
a looser (conservative) cut.
x overview:
- the ldirb cut is an interval, which is optimized using data. i didn't
expect there to be many data events with ldirb>250..350? the upper
ldirb-limit seems to be especially a good idea when also caring about
atmospheric neutrinos. was that the idea?
- pha-phe is the only cut that is looser for higher energetic wimp models. i'm not sure how to interpret this observable, so why would a looser cut be necessary for higher energies (and hence more hits in the detector)?
x passing rates seem to agree in general (more comments about this later). how do the dcors-data distributions agree? does the simulation do a good job at describing the data?
x why are 01 and 0203 datasets only combined at final cutlevel? are they still too incompatible before the final cut level? or would it not improve the selection procedure?
x last cut: BG = ANIS normalisation * uniform
- is ANIS normalised on 140-180 or on 160-180? (already asked this before)
- quite some normalisation trouble for 0203+100h at level 11 and 12. the BG level is clearly too high.
! same (but more problematic) holds for 0203+50s.
- the normalisation in the other analyses seems ok to me.
x harder cuts for soft channels
- in general less events are selected the less energetic the wimp model.
100GeV: 6 hard<->5 soft and 50GeV: 7 hard<->4 soft. this has also
consequences for signal efficiencies... V_eff^string is smaller for 50s
(72.7%) than for 50h (79.6%)
- can i conclude from this that the harder cuts only follow from the
delicate MRF minimisation interplay between data and signal? this occurs to
me as unnatural, but this is probably due to my premature intuition for this
kind of things.
x final cut level:
- "data~BG" for 50s. while "data>BG" for 100h, 100s and 50h. would this
trend (data>BG) be continued towards higher masses (250,500,...)? or is it
merely a statistical effect? of course data>BG doesn't really help us place
stringent limits. but if it is statistics, it is what it is. otherwise one
could argue about ANIS normalisation. did you by any chance compare
(shape+normalisation) NUSIM and ANIS at higher cut levels?
- as comparison: in 99 (NUSIM), data was almost always below BG
expectation. and there was quite a good agreement in absolute numbers for
lowE masses/channels.
- how many data events overlap in the different neutralino analyses? it appears to me (looking at #events in each theta-bin) that final event samples for 0203 are quite different between wimp models.
x 99-01 table:
! wrong sensitivity for 50h. increase (wrt 99) in V_eff is x4 while the improvement in sensitivity is x40. i suspect this to be a typo (3.15e3 instead of 3.15e4).
- what's the difference between your calculation of 99 N_90 values and philip's (in his thesis, without syst. err. on p112, with syst. err. on p126)?
- difference (between WIMP models) in unweighted ANIS events is much higher than for weighted events. just a peculiar observation. i have no idea whether this should be expected.
x next-to-last cut table:
- 0203 - 100h: 11 data - 19.67 ANIS --> poisson probability = 1.2%
- 0203 - 50s: 29 data - 9.64 ANIS --> poisson probability = 10^-5%
! 100h can be argued i think. but certainly not 50s. my opinion is that this has to be understood before unblinding can occur.
- the excess is coming from a blob of (0203 - 50s) data events at 140-160 which also appears at 10<nch<15 (are these the same events?). the final cut removes all of these excess data events, but i'd like to have some explanation which goes beyond "we're lucky".
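The Poisson probabilities quoted in this item can be cross-checked with a short sketch. It assumes that the quoted values are point probabilities of observing exactly n events given the ANIS expectation, which appears consistent with the numbers above; tail probabilities are included for comparison. The function names are hypothetical.

```python
import math

def poisson_pmf(n, mu):
    """Probability of observing exactly n events for a Poisson mean mu."""
    # Computed in log space to avoid overflow in mu**n and n!.
    return math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))

def poisson_tail(n, mu, upper=True):
    """P(N >= n) if upper is True, otherwise P(N <= n)."""
    below_n = sum(poisson_pmf(k, mu) for k in range(n))  # P(N <= n - 1)
    if upper:
        return 1.0 - below_n
    return below_n + poisson_pmf(n, mu)

# 0203 - 100h: 11 data events with 19.67 expected from ANIS (deficit).
print(f"{100 * poisson_pmf(11, 19.67):.2f} %")   # close to the quoted 1.2%
print(f"{100 * poisson_tail(11, 19.67, upper=False):.2f} %")

# 0203 - 50s: 29 data events with 9.64 expected from ANIS (excess).
print(f"{100 * poisson_pmf(29, 9.64):.1e} %")    # of order 10^-5 %
print(f"{100 * poisson_tail(29, 9.64, upper=True):.1e} %")
```

The tail probabilities are somewhat larger than the point probabilities, but the conclusion that the 50s excess is far less likely than the 100h deficit is unchanged.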
x passing rate plots:
- are these final plots? some plots have more markers than cut labels on the axis.
- what are the cuts corresponding to the cut numbers? are they applied in the same order for all analyses?
- N_line is hardest cut (apart from final theta_cut). is this correct? what are typical signal efficiencies?
- cut 8 always seems to have different data-dcorsika passing rate. is this the same observable?
! 0203 - 100h: strange behaviour for cut>10 and ANIS>data. also different
behaviour compared to 01 (but this is probably explained by different cut
order?)
- 0203 - 100h: cut0 for ANIS has different (too low) normalisation wrt
other wimp analyses for 0203.
! 0203 - 50s: strange data-dcorsika behaviour for cut>5. ANIS always
below data, except at last cut level.
x next-to-final level distribution plots
- 01 - 50h: 3 data events selected (theta>172) at final level, while
there are only 2 data events in 170<theta<180
- 0203 - 50s: 3.09 selected ANIS events (theta>171.5) at final level,
while there are only ~2 events in 170<theta<180
- smoothness != smoothness on which you cut?