For simplicity, all of the following examples start by importing everything from the histogram and numpy packages.
>>> from numpy import *
>>> from histogram import *
To make an instance of a histogram:
>>> h = histogram( "h", [ ('tof', arange(1000., 3000., 1.0), "microsecond") ] )
Here, the string “h” is the name of the histogram. The tuple
('tof', arange(1000., 3000., 1.0), "microsecond")
defines a time-of-flight axis. The axis is named ‘tof’, and the centers of its bins are
1000., 1001., ..., 2999.
The string “microsecond” is the unit of the tof axis. The whole tuple is placed inside a list.
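As a quick check of what this axis holds, the following plain-Python sketch reproduces the values that arange(1000., 3000., 1.0) generates. It relies only on the fact that arange excludes the stop value; the variable names are illustrative and not part of the histogram package.

```python
# Plain-Python equivalent of arange(1000., 3000., 1.0); the stop value is excluded.
start, stop, step = 1000.0, 3000.0, 1.0
n_bins = int(round((stop - start) / step))
centers = [start + i * step for i in range(n_bins)]

print(len(centers))   # number of bins on the tof axis
print(centers[0])     # first bin center
print(centers[-1])    # last bin center
```

The bin count agrees with the shape reported when the histogram is printed.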
Note
In many cases, histograms are multi-dimensional. A multi-dimensional histogram needs a list of axes, in which each item describes one axis. So, even for a 1D histogram, the user must supply a single-item list whose only item describes the only axis of the histogram.
Note
To construct a histogram, you can supply its axes as a list of tuples, where each tuple represents an axis:
[ ('x', range(10)), ('y', range(15)) ]
or as a list of instances of the Axis class:
[ axis('x', range(10)), axis('y', range(15)) ]
If we try to print a histogram:
>>> print h
The following text shows up:
Histogram "h"
- Axes:
- Axis tof: [ 1000.0, 1001.0, ... 2998.0, 2999.0 ]
- Shape: [2000L]
- Metadata: [('name', 'h')]
- Data: ... [0. 0. ... 0.]
- Errors: ... [0. 0. ... 0.]
A histogram has a name, a set of axes, some metadata, and two main datasets: “data” and “errors”.
- For example, if a histogram has an axis named “x”, the bin centers of the “x” axis can be accessed as h.x.
For more details, please refer to Accessing data.
Now we want to make this histogram more meaningful. Say we want the histogram “h” to have the form of an exponential function, I(tof) = exp(-tof/1000).
At the Python command line, we enter:
>>> # get the bin centers of the tof axis
>>> tof = h.tof
>>> # now we apply the function to the axis and assign it to the histogram
>>> h.I = exp(-tof/1000.)
A shortcut to create the same histogram from scratch:
>>> h = histogram(
...     "h",
...     [
...         ('tof', arange(1000., 3000., 1.0), "microsecond")
...     ],
...     fromfunction = lambda x: exp(-x/1000.) )
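To make the idea behind the fromfunction keyword concrete, here is a minimal plain-Python sketch that evaluates a function at every bin center. The helper fill_from_function is hypothetical and only mimics the behavior described above; it is not the package's implementation.

```python
import math

def fill_from_function(centers, func):
    """Hypothetical helper: evaluate func at each bin center of a 1-D axis."""
    return [func(x) for x in centers]

# Bin centers of the tof axis: 1000., 1001., ..., 2999.
tof = [1000.0 + i for i in range(2000)]

# Same exponential as in the histogram example.
intensity = fill_from_function(tof, lambda x: math.exp(-x / 1000.0))

print(intensity[0])    # value at tof = 1000, i.e. exp(-1)
print(intensity[-1])   # value at tof = 2999
```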
Probably the most important functionality of a histogram is slicing. Note first that the syntax of histogram slicing differs from normal Python array slicing. A Python array slice looks like
>>> a[ 3:10, 1:9 ]
whereas a histogram slice looks like
>>> h[ (3,10), (1,9) ]
Create a histogram:
>>> x = 'x', arange(-1, 1, 0.05 )
>>> y = 'y', arange(-1, 1, 0.05 )
>>> h = histogram( 'h', [x,y], fromfunction = lambda x,y: x*x + y*y )
Get a slice in the region x=(0.5, 0.9), y=(-0.8, 0.8):
>>> h1 = h[ (0.5, 0.9), (-0.8, 0.8) ]
Get a slice in the region x=(minimum, 0.5), y=(-0.8, 0.8):
>>> h2 = h[ (None, 0.5), (-0.8, 0.8) ]
Get a slice at x=0.5 over the full range of y:
>>> h3 = h[ 0.5, () ]
The resulting histogram is a 1D curve.
Note
Slicing is by reference. No new data array is created, and the new histogram refers to a section of the original data. If you really need a copy, use the “copy” method of the histogram object.
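The main difference from array slicing is that histogram slices are given in axis coordinates rather than in indices. The sketch below shows one way a coordinate range could be mapped onto bin indices. The helper select_bins is hypothetical, and treating both ends of the range as inclusive is an assumption, not something confirmed for the histogram package.

```python
def select_bins(centers, lo=None, hi=None, tol=1e-9):
    """Hypothetical helper: indices of bins whose centers lie in [lo, hi].

    None means "no limit on this side", as in h[(None, 0.5), ...].
    Inclusive bounds are an assumption made for this sketch.
    """
    return [i for i, c in enumerate(centers)
            if (lo is None or c >= lo - tol) and (hi is None or c <= hi + tol)]

# Bin centers equivalent to arange(-1, 1, 0.05), rounded to avoid float noise.
x = [round(-1 + 0.05 * i, 2) for i in range(40)]

inside = select_bins(x, 0.5, 0.9)      # like the x part of h[(0.5, 0.9), ...]
lower = select_bins(x, None, 0.5)      # like the x part of h[(None, 0.5), ...]

print(inside)
print(len(lower))
```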
Setting a slice is easy:
>>> h[ <slice specification> ] = <new data>, <new error^2>
For example:
>>> ycube = h.y**3
>>> h[ 0.3, () ] = ycube, None
You may notice that we need a tuple on the right-hand side. The reason is that there are two datasets in a histogram: one for the data, another for the error squares. (Recall that the squares of the errors are stored to reduce computation time.) In the 2-tuple
ycube, None
“ycube” will be assigned to the “data” dataset, and “None” will be assigned to the “error bar squares” dataset. “None” is a special value for the error bar squares: it means all error bars are zero.
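The pairing of data with error squares, and the meaning of None, can be illustrated with plain lists. The function assign_slice below is a hypothetical stand-in written for this example only; it is not how the package stores or assigns its data.

```python
def assign_slice(data, err2, indices, new_data, new_err2=None):
    """Hypothetical helper: write (new_data, new_err2) into the selected bins.

    new_err2=None is interpreted as "all error bars are zero".
    """
    if new_err2 is None:
        new_err2 = [0.0] * len(indices)
    for i, d, e in zip(indices, new_data, new_err2):
        data[i] = d
        err2[i] = e

data = [1.0, 2.0, 3.0, 4.0]
err2 = [1.0, 1.0, 1.0, 1.0]

# Overwrite bins 1 and 2 with new values and zero error bars.
assign_slice(data, err2, [1, 2], [8.0, 27.0], None)

print(data)
print(err2)
```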
Some basic numerical operators are available for manipulating histograms. When these computations are performed, both data and error bars are processed.
The supported operators are:
+, -, *, /, +=, -=, *=, /=
First, create a histogram:
>>> x = 'x', arange(-1, 1, 0.05 )
>>> y = 'y', arange(-1, 1, 0.05 )
>>> h = histogram( 'h', [x,y], fromfunction = lambda x,y: x*x + y*y )
Then we add a constant to the histogram:
>>> h += 3., 1.
Note that there are two numbers on the right-hand side: one for the data, another for the error bar squares.
Next we add a histogram to a histogram:
>>> h1 = histogram( 'h1', [x,y], fromfunction = lambda x,y: x + y )
>>> h2 = h + h1
>>> h2 += h
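As a rough illustration of what adding two histograms involves, the following sketch adds the data bin by bin and also adds the error bar squares, which is the usual rule when the two quantities are uncorrelated. The function add_hist and the sample numbers are made up for this example.

```python
def add_hist(data_a, err2_a, data_b, err2_b):
    """Illustrative bin-wise sum of two 1-D histograms with error squares."""
    data = [a + b for a, b in zip(data_a, data_b)]
    # For uncorrelated quantities the variances (error squares) add.
    err2 = [ea + eb for ea, eb in zip(err2_a, err2_b)]
    return data, err2

data, err2 = add_hist([1.0, 2.0], [0.5, 0.5],
                      [3.0, 4.0], [0.25, 0.75])
print(data)
print(err2)
```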
You can do similar things with the other operators, following the usual Python syntax.
It is assumed that the physical quantities represented by the histograms involved in computations are uncorrelated, and the error propagations are defined by the following formulas:
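For reference, the standard first-order propagation rules for uncorrelated quantities x and y with variances (error bar squares) \sigma_x^2 and \sigma_y^2 are given below. They are presumably what is meant here, but that is an assumption; the exact form used by the package should be checked against its source.

```latex
\begin{aligned}
z = x \pm y &: \quad \sigma_z^2 = \sigma_x^2 + \sigma_y^2 \\
z = x\,y &: \quad \sigma_z^2 = y^2\,\sigma_x^2 + x^2\,\sigma_y^2 \\
z = x / y &: \quad \sigma_z^2 = \frac{\sigma_x^2}{y^2} + \frac{x^2\,\sigma_y^2}{y^4}
\end{aligned}
```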
The sum method sums the data and the error bar squares of all bins and returns the total counts and the corresponding error bar square. It can also sum a D-dimensional histogram along one axis and return a histogram of reduced dimension (D-1).
First, create a histogram:
>>> x = 'x', arange(-1, 1, 0.05 )
>>> y = 'y', arange(-1, 1, 0.05 )
>>> h = histogram( 'h', [x,y], fromfunction = lambda x,y: x*x + y*y )
Now,
>>> h.sum()
returns a 2-tuple of counts and error bar square of all bins summed together. The expression
>>> h.sum( 'x' )
returns a 1-D histogram that results from summing over the axis ‘x’.
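The two forms of summation can be sketched with nested lists, where rows are indexed by x and columns by y. The helper names and the numbers are illustrative only and do not come from the histogram package.

```python
# 2 bins along x (rows), 3 bins along y (columns).
data = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
err2 = [[0.5, 0.5, 0.5],
        [1.0, 1.0, 1.0]]

def total(data, err2):
    """Sum data and error squares over all bins, like h.sum()."""
    return sum(map(sum, data)), sum(map(sum, err2))

def sum_over_x(data, err2):
    """Sum over the first axis, leaving a 1-D result along y, like h.sum('x')."""
    return ([sum(col) for col in zip(*data)],
            [sum(col) for col in zip(*err2)])

counts, counts_err2 = total(data, err2)
y_data, y_err2 = sum_over_x(data, err2)

print(counts)
print(y_data)
```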
Sometimes you may have a histogram with an axis that is only one bin wide. Such an n-dimensional histogram is effectively (n-1)-dimensional. You can reduce its dimensionality with the reduce method:
>>> axes = [ ('x', [1,2,3]), ('yID', [1]) ]
>>> data = [ [1,2,3] ]; errs = [ [1,2,3] ]
>>> h = histogram( 'h', axes, data, errs )
>>> h.reduce()
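Conceptually, this kind of reduction just drops the axis that has a single bin. A minimal sketch for the 2-D case, using nested lists and a hypothetical helper name, could look like this:

```python
def drop_single_bin_axes(data):
    """Illustrative only: remove a length-1 axis from a 2-D nested list."""
    if len(data) == 1:
        return list(data[0])              # the first axis has a single bin
    if all(len(row) == 1 for row in data):
        return [row[0] for row in data]   # the second axis has a single bin
    return data                           # nothing to reduce

print(drop_single_bin_axes([[1, 2, 3]]))
print(drop_single_bin_axes([[1], [2], [3]]))
```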
The transpose method transposes the axes of a histogram.
The following commands create a 2-D histogram, and then transpose the x and y axes.
>>> x = 'x', arange(-1, 1, 0.05 )
>>> y = 'y', arange(0, 5, 0.05 )
>>> h = histogram( 'h', [x,y], fromfunction = lambda x,y: x*x + y*y )
>>> ht = h.transpose()
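In terms of the underlying arrays, transposing a 2-D histogram swaps the order of the axes and the corresponding rows and columns of the data. A small plain-Python sketch, with made-up values:

```python
axes = ['x', 'y']
data = [[1, 2, 3],
        [4, 5, 6]]   # 2 bins along x, 3 bins along y

# Swap the axis order and turn columns into rows.
axes_t = axes[::-1]
data_t = [list(col) for col in zip(*data)]

print(axes_t)
print(data_t)
```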
Sometimes it may be necessary to develop new numeric operators and methods for customized computation on the data arrays encapsulated in the histogram object. The data and error-square arrays are accessible as the attributes ‘I’ and ‘E2’.
>>> x = 'x', arange(-1, 1, 0.05 )
>>> y = 'y', arange(0, 5, 0.05 )
>>> h = histogram( 'h', [x,y], fromfunction = lambda x,y: x*x + y*y )
>>> dataarr = h.I
>>> errsarr = h.E2
Both “dataarr” and “errsarr” are numpy arrays that refer to the underlying data stored in the histogram. If you modify these arrays in place, the original histogram changes as well. See the numpy documentation for the other methods available in the numpy package.
You can retrieve information about the axes of a histogram using methods of the Histogram class:
An axis contains data like bin boundaries and bin centers, and metadata like unit.
>>> x = 'x', arange(-1, 1, 0.05 )
>>> y = 'y', arange(0, 5, 0.05 )
>>> h = histogram( 'h', [x,y], fromfunction = lambda x,y: x*x + y*y )
>>> print h.axes()
>>> print h.axisNameList()
>>> xaxis = h.axisFromName( 'x' )
>>> print xaxis.unit()
>>> print xaxis.binCenters()
>>> print xaxis.binBoundaries()
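Bin centers and bin boundaries are two views of the same axis: an axis with N bins has N centers and N+1 boundaries. The sketch below derives boundaries from evenly spaced centers by placing each boundary half a step away from its neighboring center. That convention is an assumption made for illustration, not a statement about how the Axis class computes them.

```python
def boundaries_from_centers(centers):
    """Illustrative only: boundaries of evenly spaced bins, given their centers."""
    step = centers[1] - centers[0]
    first = centers[0] - step / 2.0
    return [first + i * step for i in range(len(centers) + 1)]

centers = [1000.0, 1001.0, 1002.0]
boundaries = boundaries_from_centers(centers)

print(boundaries)
```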
>>> from histogram import plot
>>> plot(h)
You can save/load a histogram in hdf5 format.
>>> from histogram.hdf import dump
>>> dump(h, 'myhist.h5')
>>> from histogram.hdf import load
>>> h = load('myhist.h5')