Spectral imaging is a technique that generates a spatial map of spectral variation, making it a useful tool in many applications including environmental remote sensing, military target discrimination, astrophysics and biomedical optics. When imaging a scene, a spectral imager produces a two-dimensional spatial array of vectors which represents the spectrum at each pixel location. The resulting three-dimensional dataset containing the two spatial dimensions and one spectral dimension is known as the datacube.
Many different techniques for spectral imaging have been developed over the years. Whiskbroom, pushbroom and tunable filter imagers are all conceptually simple spectral imager designs. These instruments capture a one- or two-dimensional subset of the datacube, and thus require the temporal scanning of the remaining dimension(s) to obtain the complete datacube. Furthermore, they have poor light collection efficiency for incoherent sources, resulting in a poor signal-to-noise ratio (SNR). Multiplex spectral imagers including Fourier and Hadamard transform based instruments are designed to address the light throughput problem, but still require some form of scanning, making it difficult to use them for spectral imaging of non-static scenes.
Tomographic approaches have also produced major advances. Mooney et al. developed a direct-view design that maximizes light gathering efficiency by requiring no spatial filter, such as a slit. In this design, the source is viewed through a rotating dispersive element, and measurements are taken at different rotation angles. These are projective measurements through the datacube that can be tomographically reconstructed. While the light gathering efficiency of such an instrument is high, the geometry of the system limits the range of angles over which projections are made. By the Fourier slice theorem, this limited angular range leaves an unsampled region in Fourier space, known as the "missing cone problem" because the unsampled region is a conical volume in the Fourier domain representation of the datacube. The computed tomography imaging spectrometer (CTIS) is a static, snapshot instrument that captures multiple projections of the datacube at once. This capability makes CTIS well suited to spectral imaging of transient scenes. However, the instrument requires a large focal plane area and also suffers from the missing cone problem.
The CASSI revolution
An important characteristic shared by all the designs described above is that the total number of measurements they generate is greater than or equal to the total number of elements in the reconstructed datacube. In contrast, our group has introduced the idea of compressive spectral imaging, an approach to spectral imaging that intentionally generates fewer measurements than elements in the reconstructed datacube. We use compressed sensing (described below) to solve the resulting underdetermined problem by relying on a crucial property of natural scenes, namely that they tend to be sparse in some multiscale basis. To achieve compressive spectral imaging, our group has developed a new class of imagers dubbed the coded aperture snapshot spectral imager (CASSI). CASSI instruments use a coded aperture and one or more dispersive elements to modulate the optical field from a scene. A detector captures a two-dimensional, multiplexed projection of the three-dimensional datacube representing the scene. The nature of the multiplexing depends on the relative position of the coded aperture and the dispersive element(s) within the instrument.
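As an illustration of how a coded aperture and a disperser can multiplex a datacube onto a two-dimensional detector, the following is a minimal sketch, not the instrument's calibrated model. It assumes a single disperser placed after the coded aperture, a binary random code, and a dispersion of exactly one detector pixel per spectral band. The function name sd_cassi_forward and the array sizes are hypothetical.

```python
import numpy as np

def sd_cassi_forward(cube, mask):
    """Hypothetical single-disperser model: code each band, shear it, sum on the detector."""
    rows, cols, bands = cube.shape
    # The detector is wider than the scene because dispersion spreads the bands sideways.
    detector = np.zeros((rows, cols + bands - 1), dtype=cube.dtype)
    for k in range(bands):
        # Band k sees the same aperture code, then lands k pixels to the right
        # (assumed one-pixel-per-band dispersion).
        detector[:, k:k + cols] += mask * cube[:, :, k]
    return detector

rng = np.random.default_rng(0)
cube = rng.random((8, 8, 4)).astype(np.float32)        # toy datacube (x, y, wavelength)
mask = (rng.random((8, 8)) > 0.5).astype(np.float32)   # toy binary aperture code
y = sd_cassi_forward(cube, mask)

# Fewer measurements than unknowns: 8 * 11 = 88 detector pixels for 8 * 8 * 4 = 256 voxels.
print(y.shape, y.size, cube.size)
```

In this toy case the detector records 88 values for 256 datacube elements, which is the underdetermined situation that the sparsity assumption is meant to resolve.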
Modification of CASSI
One modification of the single-disperser CASSI (SD-CASSI) system is a 2D push-broom variant that provides multiscale estimation of the datacube. By taking additional snapshots of the same scene, each with a distinct coding, it is possible to choose the number of measurements according to resolution requirements while still retaining the snapshot advantages of the instrument in each individual measurement. This provides flexibility in how strictly the scene must satisfy the sparsity requirements needed for accurate estimation with compressive sensing.
Multiple snapshots require distinct coding. Since physically replacing the mask with a different mask during capture is too slow, the mask was attached to a piezo system (Newport Corporation) that translates it by up to 24 pixels on the detector with a repeatability of 0.0068 pixel. Moving the mask by three pixels on the detector, equivalently one mask feature and about 19.8 microns of linear translation, yields a new code modulation. The piezo system is mounted directly to the aluminum base of the CASSI system, so the only moving part is the piezo head, whose absolute position is recorded on the controller. Any unintended motion that does occur can be easily accounted for by an offset in software. An image of the aperture code under monochromatic illumination is recorded before acquiring data and can be compared to the calibration to detect any mask motion. In general, the piezo has not compromised system stability, so long as the controller has been turned on for at least an hour to stabilize.
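The effect of translating the mask can be sketched as sliding a fixed field of view across a larger code. The sketch below is an illustration under stated assumptions rather than the instrument's control code: the random code, the field size, and the helper shifted_codes are hypothetical, and only the three-pixel feature size, the 19.8 micron step, and the 24-pixel travel come from the text. The microns-per-pixel figure is simply the ratio of the two quoted numbers.

```python
import numpy as np

FEATURE_PX = 3       # one mask feature spans three detector pixels (from the text)
STEP_UM = 19.8       # linear translation for a three-pixel move (from the text)
MAX_SHIFT_PX = 24    # piezo travel expressed in detector pixels (from the text)

def shifted_codes(base_code, field_px, shifts_px):
    """Return the code seen by a fixed field of view for each mask translation."""
    return [base_code[:, s:s + field_px] for s in shifts_px]

rng = np.random.default_rng(1)
features = (rng.random((8, 16)) > 0.5).astype(np.float32)   # hypothetical random code
# Upsample so that each feature covers FEATURE_PX x FEATURE_PX detector pixels.
base_code = np.kron(features, np.ones((FEATURE_PX, FEATURE_PX), dtype=np.float32))

# One position per whole-feature step across the full travel: 0, 3, ..., 24 pixels.
shifts = list(range(0, MAX_SHIFT_PX + 1, FEATURE_PX))
codes = shifted_codes(base_code, field_px=24, shifts_px=shifts)

um_per_pixel = STEP_UM / FEATURE_PX   # implied translation per detector pixel
print(len(codes), codes[0].shape, round(um_per_pixel, 2))
```

Stepping by whole features is only one possible schedule; finer steps within the 24-pixel travel would give more positions.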
Control of the piezo system was entirely automated through a serial interface to Matlab, enabling multiple frame captures with a single button click and storing the data to a single file, which can be read by a reconstruction GUI in Matlab. Multiframe capture time varies depending on the exposure time and acquisition type. Currently, the system operates in two modes: (1) each image is captured statically, after the mask has moved and is stable, with no limit on integration time; (2) the camera is constantly acquiring frames at 60 fps and the mask is constantly moving in a set pattern, slow enough to virtually eliminate motion blur. The first method is best for dark scenes that require long integration times, since the second method requires enough light for fast integration times.
The piezo takes 16 ms to move two pixels on the detector and 30 ms to move the entire range of 24 pixels. Typical integration times are 10 to 50 ms, giving a frame rate of 12 to 38 fps in multiframe mode. In practice, it takes about half a second to acquire 20 frames. If the piezo stops for each integration, it needs 30 ms to stabilize plus 50 ms of integration time, dropping the frame rate to 12.5 fps. If 24 frames are required, a multiframe sequence then takes around 2 seconds. Each frame remains a valid snapshot that can be fully reconstructed, while each additional frame adds to the measurement accuracy.
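The quoted frame rates can be checked with simple arithmetic. The sketch below assumes that each frame costs one piezo move (or settling period) followed by one integration, with no overlap and no readout overhead; the text does not state this explicitly, but the assumption reproduces the quoted figures. The function name frame_rate_fps is hypothetical.

```python
def frame_rate_fps(move_ms, integration_ms):
    """Frames per second when each frame costs one move plus one integration (assumed)."""
    return 1000.0 / (move_ms + integration_ms)

fast = frame_rate_fps(16, 10)              # 16 ms move + 10 ms integration: about 38 fps
slow = frame_rate_fps(30, 50)              # 30 ms settle + 50 ms integration: 12.5 fps
twenty_fast_s = 20 * (16 + 10) / 1000.0    # 20 frames at the fast rate: 0.52 s
sequence_s = 24 * (30 + 50) / 1000.0       # 24 stop-and-integrate frames: 1.92 s

print(round(fast, 1), slow, twenty_fast_s, sequence_s)
```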
Reconstruction time was measured on a dual-core system with a 256x256x21 datacube: a snapshot takes about 80 seconds, a 10-frame reconstruction 113 seconds, and a 20-frame reconstruction 173 seconds, roughly a twofold increase for 20 frames. For larger data sets, the ratio of 20-frame to snapshot reconstruction time drops to 1.8. On a multi-core system, reconstructions of up to 16 frames take no additional time, and 64-frame reconstructions take roughly twice as long as a snapshot.
Specific to the algorithm, the measurement matrix H was implemented in functional form rather than matrix form due to its enormous size. All calculations were performed in single precision to reduce computation time and memory requirements, with no loss in quality relative to double precision.
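To make the idea of a functional-form operator concrete, the following sketch applies H and its transpose as functions, without ever storing a matrix, and keeps all arrays in single precision. It is an illustration only. It assumes a single-disperser geometry with a one-pixel shift per band and a random binary code; the names forward and adjoint are hypothetical and are not taken from the reconstruction software described here. The inner-product check at the end is a standard way to confirm that the two functions are consistent with one another.

```python
import numpy as np

def forward(cube, mask):
    """Apply H: code each band, shift it by its band index, and sum on the detector."""
    rows, cols, bands = cube.shape
    y = np.zeros((rows, cols + bands - 1), dtype=cube.dtype)
    for k in range(bands):
        y[:, k:k + cols] += mask * cube[:, :, k]
    return y

def adjoint(y, mask, bands):
    """Apply H transpose: undo each band's shift and reapply the code."""
    rows, cols = mask.shape
    cube = np.empty((rows, cols, bands), dtype=y.dtype)
    for k in range(bands):
        cube[:, :, k] = mask * y[:, k:k + cols]
    return cube

rng = np.random.default_rng(2)
mask = (rng.random((16, 16)) > 0.5).astype(np.float32)
x = rng.random((16, 16, 5)).astype(np.float32)   # arbitrary test datacube
y = rng.random((16, 20)).astype(np.float32)      # arbitrary test detector image

# Adjoint test: <H x, y> should equal <x, H^T y> up to single-precision rounding.
lhs = float(np.sum(forward(x, mask) * y, dtype=np.float64))
rhs = float(np.sum(x * adjoint(y, mask, 5), dtype=np.float64))

# Number of entries a dense H would need for this toy problem.
dense_entries = y.size * x.size
print(lhs, rhs, dense_entries)
```

Even in this toy case a dense H would hold 409,600 entries. Under the same assumed geometry, a 256x256x21 datacube would need on the order of 10^11 entries, which illustrates why a functional form is attractive.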
For comparison, a 24-frame image was acquired using the MCASI system, and the raw data were reconstructed once using a single frame and once using all 24 frames. Identical parameters were used in both reconstructions, making this a fair comparison.
A second data set shows a reconstruction from UV through visible. Two different acquisitions were taken, one with a 400-700 nm filter and another with a 320-380 nm bandpass filter. Illumination was sunlight at 4 pm on June 30, 2011. The UV index at solar noon (12:20 pm) was between 10 and 11.