S. Digel, 2 January 2007
Here's my impression of the status of the LAT science data products - not to get bogged down in the issues today but as a reminder that we have real issues to sort out.
Upcoming ground system tests will increasingly require these data products to be delivered and ingested (as a way of verifying that the system 'works'). So we need to make sure that we are producing useful products, not wasting time.
The GSSC is interested in having the definitions of the data products under configuration control, sooner rather than later. Configuration control is definitely coming - it needs to - but I think we have enough outstanding issues that we should not subject the definitions to configuration control yet.
Options for getting the definitions ironed out include posting issues and proposals in Confluence, convening a new LAT-GSSC Data Products Working Group (requiring you to live with what the group concludes), and having a DPWG that meets (once) to thrash out open issues.
As for implementing the definitions of the data products, agreed-upon changes should be tracked in JIRA, as a new Project under Science Tools.
Notionally this is like an expanded version of LS-002, with more columns (but probably not all of the Merit ntuple) and most likely also more events. The idea would be to allow a motivated person to study event classification in some way, but not to allow second-guessing of the reconstruction (direction and energy assignments). This data product would not be useful even for this purpose if the standard event classification method and the ntuple variables are not well documented.
Regarding contents, one suggestion that has floated around has been to include the Merit ntuple variables that are used in the classification trees that are ultimately used. This would mean including the 'fundamental' variables as well as the derived ones, like the CTB variables.
Regarding which events to include, no winning ideas have been circulated. Maybe events with CTBGAM > 0?
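To make the CTBGAM > 0 suggestion concrete, here is a minimal sketch of what such a selection might look like. Only CTBGAM comes from the discussion above; the other column names (EVENT_ID, ENERGY, RAW_VAR) and the function name are placeholders for illustration, not a proposed definition.

```python
def select_events(events, keep_columns, ctbgam_min=0.0):
    """Keep events with CTBGAM strictly above ctbgam_min, and only the listed columns.

    `events` is a list of dicts standing in for ntuple rows (hypothetical layout).
    """
    selected = []
    for ev in events:
        if ev["CTBGAM"] > ctbgam_min:
            # Drop every column that is not part of the data product.
            selected.append({name: ev[name] for name in keep_columns})
    return selected


# Toy rows; values are made up purely to exercise the cut.
events = [
    {"EVENT_ID": 1, "ENERGY": 120.0, "CTBGAM": 0.92, "RAW_VAR": 3.1},
    {"EVENT_ID": 2, "ENERGY": 45.0, "CTBGAM": 0.0, "RAW_VAR": 0.4},
    {"EVENT_ID": 3, "ENERGY": 800.0, "CTBGAM": 0.15, "RAW_VAR": 7.7},
]
subset = select_events(events, keep_columns=["EVENT_ID", "ENERGY", "CTBGAM"])
```

The size of the resulting product depends entirely on the threshold and on how many columns are kept, which is why the use case needs to be pinned down first.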
I think that one open issue should be to define a realistic use case for this data product. It could end up as a very large and largely useless data product.
This data product is widely used but still has important details to be defined.
We still have not defined the event number that will be used at this level to identify events (for mapping back to full event data), or whether/how the event ID relates to the run number.
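One option among several, purely as a sketch for discussion: if the event ID were tied to the run number, the pair (run number, event sequence number within the run) could be packed into a single 64-bit integer. The 32/32 bit split and the function names below are assumptions, not anything that has been agreed.

```python
RUN_BITS = 32  # assumed width for the run number
SEQ_BITS = 32  # assumed width for the per-run event sequence number


def pack_event_id(run_number, sequence):
    """Combine run number and per-run sequence number into one integer."""
    if not 0 <= run_number < 2 ** RUN_BITS:
        raise ValueError("run_number out of range")
    if not 0 <= sequence < 2 ** SEQ_BITS:
        raise ValueError("sequence out of range")
    return (run_number << SEQ_BITS) | sequence


def unpack_event_id(event_id):
    """Recover (run_number, sequence) from a packed event ID."""
    return event_id >> SEQ_BITS, event_id & (2 ** SEQ_BITS - 1)


packed = pack_event_id(77, 12345)
run_number, sequence = unpack_event_id(packed)
```

A scheme like this would keep the mapping back to the full event data unambiguous across runs, at the cost of fixing the field widths up front.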
Practically speaking, the accumulated live time since the start of the mission is not feasible to derive. The use cases for accumulated live time per event all relate to GRBs - very bright transients for which the dead time of the LAT may be a limiting factor for the rates/flux measurement. For all but very bright GRBs that span run boundaries, an accumulated live time since the start of the current run probably would be good enough, right? This would be easier to derive, at least, and less likely to cause cascading reprocessing.
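A minimal sketch of the per-run alternative, assuming each event carries a run identifier and the live time elapsed since the previous event (both field choices are hypothetical, as is the function name):

```python
def livetime_since_run_start(events):
    """Accumulate live time per event, resetting at each run boundary.

    `events` is a time-ordered list of (run_id, livetime_increment) pairs,
    where the increment is the live time since the previous event (assumed input).
    """
    out = []
    current_run = None
    accumulated = 0.0
    for run_id, delta in events:
        if run_id != current_run:
            # New run: restart the accumulation from zero.
            current_run = run_id
            accumulated = 0.0
        accumulated += delta
        out.append((run_id, accumulated))
    return out


# Toy input: two events in run 100, then two in run 101.
sample = [(100, 0.5), (100, 0.25), (101, 1.0), (101, 0.5)]
result = livetime_since_run_start(sample)
```

Because the accumulation restarts with every run, reprocessing one run changes only that run's values, whereas a since-start-of-mission total would shift every later event as well.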
Some other issues from the LAT Photons page are still awaiting final resolution, too, like how we will represent event classes and whether processing/calibration version needs to be tracked on an event-by-event basis (via an additional column).
The current concept is that we will deliver LS-002 rather than have the GSSC infer the contents from LS-001 and a specification of the cuts.
This is probably in good shape, although we ought to do away with the DEADTIME column, which is not used.
Is the ~30 s sampling adequate? Do we have additional modes other than ON and OFF that need to be designated?
This was defined as a record of all of the registers/settings of the LAT, updated as needed. This data product is, I think, recognized as not being particularly useful and most likely would become something more like a LAT status summary, something high level that indicates how the instrument itself is working. This arguably would not be a 'data' product, but I think it would be interesting to people outside the collaboration if the impact of any changes on the overall 'performance' of the LAT could be quantified simply. This is a tall order, obviously.
This was defined when we thought that the LAT would be telling the world about flaring blazars by sending a data product to the GSSC. Now the concept is that we will send our own alerts (possibly via GCN).
Do we still want a data product like this, to be delivered to the GSSC for the record? The current definition is probably not very practical - it looks like an entry for the LAT point source catalog.
Ok, I think.
I think that this probably won't exist, at least not compiled by the LAT team. This should be confirmed with the GRB science working group, of course.
This has been part of the Science Tools distributions, although it has also been provided for download by the GSSC for the Data Challenges. The format is not set in stone yet, but it will be something that gtobssim and gtlikelihood can both handle, which limits the possibilities.
These are fairly advanced in terms of specification, although still subject to revision.