Prototyping Use of Root

Root is an object-oriented framework designed to store and manipulate large amounts of data. Root offers libraries of routines, including object I/O and data-analysis tools. The analysis tools are similar to those provided by PAW and IDL and include display capabilities, histogram creation, and function fitting. See http://root.cern.ch for complete information on the Root system.

The 1999-2000 SLAC Test Beam run was used as the venue for the first use of Root in a more central role. The original purpose was to replace the ASCII output files previously used by GlastSim and to investigate Root's analysis tools as a follow-on to PAW and a free alternative to IDL.

Object I/O

GLAST's SAS-related software language standard is C++. A shortcoming of the language is its lack of built-in object I/O: an object in memory has no analogue of Fortran's binary I/O that would allow it to be saved to and retrieved from permanent store ("persistency"). Consequently, one must use a third-party solution to achieve persistency. Two are in common use: Root and Objectivity. The latter was deemed too complex and expensive for GLAST's resources.
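The basic persistency mechanism is simple to use: any class that derives from TObject and carries a ClassDef macro gets a generated Streamer and can be written to and read from a Root file by name. A minimal sketch (the class and member names here are illustrative, not actual GLAST classes):

```cpp
// Minimal Root object persistency sketch (hypothetical class).
#include "TObject.h"
#include "TFile.h"

class McEvent : public TObject {        // hypothetical event class
public:
    Int_t    fRunId;                    // run number
    Double_t fEnergy;                   // deposited energy (MeV)
    ClassDef(McEvent, 1)                // enables Root I/O for this class
};

void writeEvent() {
    TFile f("event.root", "RECREATE");  // create the output file
    McEvent ev;
    ev.fRunId  = 42;
    ev.fEnergy = 100.0;
    ev.Write("myEvent");                // object is streamed to the file
    f.Close();
}

void readEvent() {
    TFile f("event.root");
    // retrieve the object from permanent store by name
    McEvent* ev = (McEvent*)f.Get("myEvent");
    // ... use ev ...
}
```

The Streamer generated from the ClassDef macro is what stands in for the "binary I/O" the language itself lacks.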

Structured I/O

Root allows a rich description of the data, as demonstrated by the tree of classes used for the reconstruction.
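The actual GLAST reconstruction classes are not reproduced here, but the nesting works as in this sketch: member objects are streamed automatically along with their parent, so a whole tree of classes is saved in one Write call (all names below are illustrative assumptions):

```cpp
// Sketch of a structured event layout (hypothetical class names,
// not the actual GLAST reconstruction tree of classes).
#include "TObject.h"

class TkrRecon : public TObject {      // tracker reconstruction results
public:
    Int_t fNumTracks;                  // number of reconstructed tracks
    ClassDef(TkrRecon, 1)
};

class CalRecon : public TObject {      // calorimeter reconstruction results
public:
    Double_t fEnergySum;               // total reconstructed energy
    ClassDef(CalRecon, 1)
};

class ReconEvent : public TObject {    // top of the tree of classes
public:
    TkrRecon fTkr;                     // nested objects are streamed
    CalRecon fCal;                     //   automatically with the parent
    ClassDef(ReconEvent, 1)
};
```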

Branched I/O

In addition, Root supports branched I/O, wherein one can read a single branch of a tree and read the rest of the event only if something of interest is found in that branch. This can yield a tremendous savings in I/O time. An example would be to read the TKR branch, and read the CAL data only if the tracker found a gamma. Root I/O is also random access, so one can read a list of selected events, accessing only those events rather than the whole dataset.
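The TKR/CAL example above can be sketched with per-branch reads; the tree name, branch names, and payload classes here are assumptions for illustration:

```cpp
// Sketch of selective (branched) reading: load the TKR branch for
// every event, but load CAL only when the tracker found something.
#include "TFile.h"
#include "TTree.h"
#include "TBranch.h"

void selectGammas() {
    TFile f("recon.root");                       // hypothetical file name
    TTree* tree = (TTree*)f.Get("Recon");        // hypothetical tree name

    TkrRecon* tkr = 0;                           // hypothetical classes
    CalRecon* cal = 0;
    tree->SetBranchAddress("TKR", &tkr);
    tree->SetBranchAddress("CAL", &cal);

    TBranch* tkrBranch = tree->GetBranch("TKR");
    TBranch* calBranch = tree->GetBranch("CAL");

    Long64_t nEntries = tree->GetEntries();
    for (Long64_t i = 0; i < nEntries; ++i) {
        tkrBranch->GetEntry(i);                  // read only the tracker data
        if (tkr->fNumTracks > 0)                 // gamma candidate found?
            calBranch->GetEntry(i);              //   only then read CAL
    }
}
```

Because only the needed branch is deserialized per event, the bulk of the file is never touched for events the tracker rejects.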

Testbeam I/O

For the 1999-2000 SLAC Test Beam, Root was used to store both raw detector data and the results of reconstruction.  

Root for Analysis

Root is the follow-on to PAW, the world-standard analysis tool in HEP for the past 15 years, from the same authors. It provides full access to its I/O mechanism (of course) and has a collection of analysis tools for visualization (GUI, histogramming, event display) and fitting (e.g. the Minuit minimization package). Many other tools (such as a sophisticated neural-net package) are being contributed by the broad user community.

Using Root

Our main tool for aiding the user with direct Root access was a helper macro that eases many of the routine tasks. The macro can manipulate a raw and a recon Root file at the same time and is intended to provide useful control of the Root event loop. It handles opening the Root files and accessing events in them. One function, called Go(), contains the event loop and all user-defined analysis code. One of the main uses of the macro was to create user ntuples for easy second-pass analysis. The macro's interface allows:
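A Go()-style loop producing a user ntuple might look like the sketch below; the file, tree, branch, and variable names are assumptions, not the actual helper macro:

```cpp
// Sketch of a Go()-style event loop over paired raw/recon files that
// fills a user ntuple for second-pass analysis (names are hypothetical).
#include "TFile.h"
#include "TTree.h"
#include "TNtuple.h"

void Go() {
    TFile raw("raw.root");                       // raw detector data
    TFile recon("recon.root");                   // reconstruction results
    TTree* rawTree   = (TTree*)raw.Get("Raw");
    TTree* reconTree = (TTree*)recon.Get("Recon");

    TFile out("ntuple.root", "RECREATE");        // second-pass output
    TNtuple nt("nt", "user ntuple", "nTracks:calEnergy");

    ReconEvent* ev = 0;                          // hypothetical event class
    reconTree->SetBranchAddress("ReconEvent", &ev);

    Long64_t n = reconTree->GetEntries();
    for (Long64_t i = 0; i < n; ++i) {
        rawTree->GetEntry(i);                    // keep raw and recon in step
        reconTree->GetEntry(i);
        // --- user-defined analysis code goes here ---
        nt.Fill(ev->fTkr.fNumTracks, ev->fCal.fEnergySum);
    }
    out.cd();
    nt.Write();                                  // small ntuple drives pass 2
}
```

The payoff is that the second analysis pass reads only the compact ntuple rather than the full raw and recon files.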

Root & IDL

We provided an interface between Root and IDL, one of the standard (commercial) analysis packages in the astronomy world. This interface allowed IDL users to read in Root files and dynamically fill IDL structures for their subsequent analysis. This allowed us to maintain a single form of data file, while allowing analysis in both frameworks.

Problems

All was not smooth in our initial use of Root. We suffered from Root being new to our group. We were under time pressure, learning and implementing new features as we went. There was insufficient group expertise to adequately support the new users. This was especially the case for questions of presentation graphics: the support group was more familiar with the I/O system and basic classes than with the creation of polished plots.

There were a number of desirable features not available in the version of Root available at the time:

  1. Root files were not self-describing, so the Root-to-IDL interface was hardcoded to the particular set of classes we designed. In principle, the interface should be able to create a structure on the fly from an internal description of the Root file. 
  2. Changes to class definitions were painful, as no real schema evolution was in place.
  3. It was not convenient to create a user GUI, since support on Windows and Unix was quite uneven: the GUI classes on Windows were primitive.
  4. We were unable to optimize our Root files for efficiency and organization (e.g. storing the raw and recon branches of the same tree in separate files).
  5. Pointers cannot be used across branches (pointers are valid only within a branch).

The first four issues have been fixed, or fixes are imminent. Item 5 is not a show-stopper. We intend to adopt these new features shortly.

Opportunities

Many new features are appearing in Root. Among those we will keep an eye on are the interface to SQL databases (e.g. Oracle) and PROOF, a parallel-processing analysis facility. The latter allows a user to issue a command to execute a macro; under the covers, the macro is distributed to a set of processors and the results are reassembled and delivered back to the user, who sees no difference from local analysis (except in the elapsed time).

Another tool to investigate is a neural-net package. Our background-rejection criteria form a complex multi-dimensional space. Currently we project those dimensions out, making a series of sequential cuts. A neural net may allow much more efficient selection. The neural-net package comes with many tools for optimizing and understanding the network.


H.Kelly, R.Dubois Last Modified: 07/17/2001 06:08