|
|||
NRAO Home > CASA > CASA Task Reference Manual |
|
0.1.93 sdbaseline
Requires:
Synopsis
Fit/subtract a spectral baseline
Description
Task sdbaseline fits and/or subtracts baseline from single-dish spectra. Given
baseline parameters (baseline type, order, etc.), sdbaseline computes the
best-fit baseline for each spectrum by least-square fitting method and, if you
want, subtracts it. The best-fit baseline parameters (including baseline type,
coefficients of basis functions, etc.) and other values such as residual rms can
be saved in various formats including ascii text (in human-readable format or
CSV format) or baseline table (a CASA table). Sdbaseline has another mode
to ’apply’ a baseline table to a MS data; for each spectrum in MS, the best-fit
baseline is reproduced from the baseline parameters stored in the given
baseline table and subtracted. Putting ’fit’ and ’subtract’ into separate
processes can be useful for pipeline processing for huge dataset.
Arguments
Inputs |
| ||
infile |
| name of input SD dataset
| |
| allowed: | string |
|
| Default: |
|
|
datacolumn |
| name of data column to be used [’data’, ’float_data’, or
’corrected’]
| |
| allowed: | string |
|
| Default: | data | |
antenna |
| select data by antenna name or ID, e.g. ’PM03’
| |
| allowed: | string |
|
| Default: |
|
|
field |
| select data by field IDs and names, e.g. ’3C2*’ (”=all)
| |
| allowed: | string |
|
| Default: |
|
|
spw |
| select data by IF IDs (spectral windows), e.g. ’3,5,7’
(”=all)
| |
| allowed: | string |
|
| Default: |
|
|
timerange |
| select data by time range, e.g. ’09:14:0~09:54:0’ (”=all)
(see examples in help)
| |
| allowed: | string |
|
| Default: |
|
|
scan |
| select data by scan numbers, e.g. ’21~23’ (”=all)
| |
| allowed: | string |
|
| Default: |
|
|
pol |
| select data by polarization IDs, e.g. ’XX,YY’ (”=all)
| |
| allowed: | string |
|
| Default: |
| |
intent |
| select data by observational intent, e.g.
’*ON_SOURCE*’ (”=all)
| |
| allowed: | string |
|
| Default: |
| |
maskmode |
| mode of setting additional channel masks. ’list’ and
’auto’ are available now. | |
| allowed: | string |
|
| Default: | list |
|
thresh |
| S/N threshold for linefinder
| |
| allowed: | double | |
| Default: | 5.0 |
|
avg_limit |
| channel averaging for broad lines
| |
| allowed: | int | |
| Default: | 4 |
|
minwidth |
| the minimum channel width to detect as a line | |
| allowed: | int |
|
| Default: | 4 | |
edge |
| channels to drop at beginning and end of spectrum
| |
| allowed: | intArray |
|
| Default: | 00 |
|
blmode |
| baselining mode [’fit’ or ’apply’]
| |
| allowed: | string |
|
| Default: | fit |
|
dosubtract |
| subtract baseline from input data [True, False]
| |
| allowed: | bool | |
| Default: | True |
|
blformat |
| format(s) of file(s) in which best-fit parameters are
written [’text’, ’csv’, ’table’ or ”]
| |
| allowed: | any |
|
| Default: | variant text |
|
bloutput |
| name(s) of file(s) in which best-fit parameters are
written
| |
| allowed: | any |
|
| Default: | variant
|
|
bltable |
| name of baseline table to apply
| |
| allowed: | string |
|
| Default: |
|
|
blfunc |
| baseline model function [’poly’, ’chebyshev’, ’cspline’,
’sinusoid’, or ’variable’(expert mode)]
| |
| allowed: | string |
|
| Default: | poly |
|
order |
| order of baseline model function
| |
| allowed: | int |
|
| Default: | 5 |
|
npiece |
| number of element polynomials for cubic spline curve
| |
| allowed: | int |
|
| Default: | 2 |
|
applyfft |
| automatically set wave numbers of sinusoids
| |
| allowed: | bool |
|
| Default: | True |
|
fftmethod |
| method for automatically set wave numbers of sinusoids
| |
| allowed: | string |
|
| Default: | fft |
|
fftthresh |
| threshold to select wave numbers of sinusoids
| |
| allowed: | any |
|
| Default: | 3.0 |
|
addwn |
| additional wave numbers to use
| |
| allowed: | any |
|
| Default: | 0 |
|
rejwn |
| wave numbers NOT to use
| |
| allowed: | any |
|
| Default: |
|
|
clipthresh |
| clipping threshold for iterative fitting
| |
| allowed: | double |
|
| Default: | 3.0 |
|
clipniter |
| maximum iteration number for iterative fitting
| |
| allowed: | int |
|
| Default: | 0 |
|
blparam |
| text file that stores per spectrum fit parameters
| |
| allowed: | string |
|
| Default: |
|
|
verbose |
| (NOT SUPPORTED YET) output fitting results to
logger
| |
| allowed: | bool |
|
| Default: | False |
|
showprogress |
| (NOT SUPPORTED YET) show progress status for
large data
| |
| allowed: | bool |
|
| Default: | False |
|
minnrow |
| (NOT SUPPORTED YET) minimum number of input
spectra to show progress status
| |
| allowed: | int |
|
| Default: | 1000 |
|
outfile |
| name of output file
| |
| allowed: | string |
|
| Default: |
|
|
overwrite |
| overwrite the output file if already exists
| |
| allowed: | bool |
|
| Default: | False |
|
void
Example
-----------------
Keyword arguments
-----------------
infile -- name of input SD dataset
datacolumn -- name of data column to be used
options: ’data’, ’float_data’, or ’corrected’
default: ’data’
antenna -- select data by antenna name or ID
default: ’’ (use all antennas)
example: ’PM03’
field -- select data by field IDs and names
default: ’’ (use all fields)
example: field=’3C2*’ (all names starting with 3C2)
field=’0,4,5~7’ (field IDs 0,4,5,6,7)
field=’0,3C273’ (field ID 0 or field named 3C273)
this selection is in addition to the other selections to data
spw -- select data by IF IDs (spectral windows)/channels
default: ’’ (use all IFs and channels)
example: spw=’3,5,7’ (IF IDs 3,5,7; all channels)
spw=’<2’ (IF IDs less than 2, i.e., 0,1; all channels)
spw=’30~45GHz’ (IF IDs with the center frequencies in range 30-45GHz; all channels)
spw=’0:5~61’ (IF ID 0; channels 5 to 61; all channels)
spw=’3:10~20;50~60’ (select multiple channel ranges within IF ID 3)
spw=’3:10~20,4:0~30’ (select different channel ranges for IF IDs 3 and 4)
spw=’1~4;6:15~48’ (for channels 15 through 48 for IF IDs 1,2,3,4 and 6)
this selection is in addition to the other selections to data
timerange -- select data by time range
default: ’’ (use all)
example: timerange = ’YYYY/MM/DD/hh:mm:ss~YYYY/MM/DD/hh:mm:ss’
Note: YYYY/MM/DD can be dropped as needed:
timerange=’09:14:00~09:54:00’ # this time range
timerange=’09:44:00’ # data within one integration of time
timerange=’>10:24:00’ # data after this time
timerange=’09:44:00+00:13:00’ #data 13 minutes after time
this selection is in addition to the other selections to data
scan -- select data by scan numbers
default: ’’ (use all scans)
example: scan=’21~23’ (scan IDs 21,22,23)
this selection is in addition to the other selections to data
pol -- select data by polarization IDs
default: ’’ (use all polarizations)
example: pol=’XX,YY’ (polarizations XX and YY)
this selection is in addition to the other selections to data
intent -- select data by observational intent, also referred to as ’scan intent’
default: ’’ (use all scan intents)
example: intent=’*ON_SOURCE*’ (any valid scan-intent expression accepted by the MSSelection module can be specified)
this selection is in addition to the other selections to data
maskmode -- mode of setting additional channel masks. When blmode=’apply’
and/or blfunc=’variable’, maskmode and its subparameters
are ignored.
options: ’list’ and ’auto’ (’interact’ will be available later)
default: ’list’
example: maskmode=’auto’ runs linefinder to detect line regions
to be excluded from fitting. this mode requires three
expandable parameters: thresh, avg_limit, minwidth, and edge.
NOTE maskmode=’auto’ is EXPERIMENTAL.
USE WITH CARE! May need to tweak the expandable parameters.
maskmode=’list’ uses the given masklist only: no additional
masks applied.
maskmode=’interact’ allows users to manually modify the
mask regions by dragging mouse on the spectrum plotter GUI.
use LEFT or RIGHT button to add or delete regions,
respectively.
>>> maskmode expandable parameters
thresh -- S/N threshold for linefinder. a single channel S/N ratio
above which the channel is considered to be a detection.
default: 5
avg_limit -- channel averaging for broad lines. a number of
consecutive channels not greater than this parameter
can be averaged to search for broad lines.
default: 4
minwidth -- the minimum channel width to detect as a line.
a line with number of consecutive channels less
than this parameter will not be detected as a line.
default: 4
edge -- channels to drop at beginning and end of spectrum
default: 0
example: edge=[1000] drops 1000 channels at beginning AND end.
edge=[1000,500] drops 1000 from beginning and 500
from end.
Note: For bad baselines threshold should be increased,
and avg_limit decreased (r even switched off completely by
setting this parameter to 1) to avoid detecting baseline
undulations instead of real lines.
blmode -- baselining mode.
options: ’fit’, ’apply’
default: ’fit’
example: blmode=’fit’ calculates the best-fit baseline based on
given baseline type, then (if you set dosubtract=True)
subtract it from each spectrum. The information about
best-fit baselines (baseline type, order, coefficients,
etc.) can be stored in various formats (cf. blformat).
blmode=’apply’ reads a baseline table as well as input
MS, reproduces the best-fit baseline via info written
in the baseline table, then subtracts it from each
spectrum.
>>> blmode expandable parameters
dosubtract -- execute baseline subtraction in addition to fitting.
Note that dosubtract=False will be ignored if
bloutput is given, that is, baseline subtraction
will be always executed for the input MS in case
bloutput is not specified.
options: (bool) True, False
default: True
blformat -- format(s) of file(s) in which best-fit parameters are
written.
options: ’text’, ’csv’, ’table’, and ’’ can be set for
a single output. In case you want to output
fitting results in multiple formats, a list
containing the above keywords is accepted as well.
default: ’text’
example: (1) blformat=’text’ outputs an ascii text file
with the best-fit baseline parameters written
in human-readable format. It may be good to read,
but you should mind it might be huge.
(2) blformat=’csv’ outputs a CSV file. For example,
output of csv with blfunc=’poly’ is as below:
#scan, beam, spw, pol, MJD[s], fitrange (i.e. inverse mask), blfunc, order, fitting coefficients, rms, number of clipped channels
4,0,17,0,4915973292.23,[[252;3828]],poly,1,767.647,-0.00956208,26.3036,0
... .
(3) blformat=’table’ outputs a baseline table
which can be used to apply afterwards.
(4) blformat=’’ doesn’t output any parameter file.
(5) blformat=[’csv’,’table’] outputs both a CSV
file and a baseline table.
(6) If one or more ’’s appear in blformat, they
are all ignored. For example, if blformat=[’’,
’text’,’’] is given, only ’text’ will be output.
(7) Elements of blformat other than ’’ must not
be duplicated. For example, blformat=[’text’,’’,
’text’] is not accepted.
bloutput -- name(s) of file(s) in which best-fit parameters are
written. If bloutput is a null string ’’, name(s) of
baseline parameter file(s) will be set as follows:
<outfile>_blparam.txt for blformat=’text’,
<outfile>_blparam.csv for blformat=’csv’, and
<outfile>_blparam.bltable for blformat=’table’.
Otherwise, blformat and bloutput must have the same
length, and one-to-one correspondence is assumed
between them. If there are ’’ elements in bloutput,
output file names will be set by following the above
rules. If there are ’’ elements in blformat, the
corresponding bloutput elements will be ignored.
Also, non-’’ bloutput elements correspoding to
non-’’ blformat elements must not be duplicated.
default: ’’
example: (1) bloutput=’’ and blformat=[’csv’,’table’]:
outputs a csv file ’<outfile>_blparam.csv’
and a baseline table ’<outfile>_blparam.bltable’.
(2) bloutput=[’foo.csv’,’’] and blformat=[’csv’,
’table’]: outputs a csv file ’foo.csv’ and a
baseline table ’<outfile>_blparam.bltable’.
(3) bloutput=[’foo.csv’,’bar.blt’] and blformat=
[’csv’,’’]: outputs a csv file ’foo.csv’ only.
(4) bloutput=[’foo.csv’,’foo.csv’,’bar.blt’] and
blformat=[’csv’,’’,’table’]: the second ’foo.csv’
is ignored because it corresponds to the blformat
element ’’, and thus outputs a csv file ’foo.csv’
and a baseline table ’bar.blt’.
(5) bloutput=[’foo.csv’,’foo.csv’,’bar.blt’] and
blformat=[’csv’,’text’,’table’]: will be error
since ’foo.csv’ is duplicated.
(6) bloutput=[’foo.csv’,’bar.blt’] and blformat=
[’csv’,’’,’table’]: will be error since bloutput
and blformat have different lengths.
bltable -- name of baseline table to apply
default: ’’
blfunc -- baseline model function. In cases blmode=’apply’ or blparam is
set, blfunc and its subparameters are ignored.
options: ’poly’, ’chebyshev’, ’cspline’, ’sinusoid’ or ’variable’
default: ’poly’
example: blfunc=’poly’ uses a single polynomial line of
any order which should be given as an expandable
parameter ’order’ to fit baseline.
blfunc=’chebyshev’ uses Chebyshev polynomials.
blfunc=’cspline’ uses a cubic spline function, a piecewise
cubic polynomial having C2-continuity (i.e., the second
derivative is continuous at the joining points).
blfunc=’sinusoid’ uses a combination of sinusoidal curves.
NOTE blfunc=’variable’ IS EXPERT MODE!!!
>>> blfunc expandable parameters
order -- order of baseline model function
options: (int) (<0 turns off baseline fitting)
default: 5
example: typically in range 2-9 (higher values
seem to be needed for GBT)
npiece -- number of the element polynomials of cubic spline curve
options: (int) (<0 turns off baseline fitting)
default: 2
applyfft -- automatically choose an appropriate set of sinusoidal
wave numbers via FFT for each spectrum data.
options: (bool) True, False
default: True
fftmethod -- method to be used when applyfft=True. Now only
’fft’ is available and it is the default.
fftthresh -- threshold on Fourier-domain spectrum data to pick up
appropriate wave numbers to be used for sinusoidal
fitting. both (float) and (str) accepted.
given a float value, the unit is set to sigma.
for string values, allowed formats include:
’xsigma’ or ’x’ (= above x-sigma level. e.g., ’3sigma’)
or ’topx’ (= the x strongest ones, e.g. ’top5’).
default is 3.0 (i.e., above 3sigma level).
addwn -- additional wave number(s) of sinusoids to be used
for fitting.
(list) and (int) are accepted to specify every
wave numbers. also (str) can be used in case
you need to specify wave numbers in a certain range.
default: [0] (i.e., constant is subtracted at least)
example: 0
[0,1,2]
’0,1,2’
’a-b’ (= a, a+1, ..., b)
’a~b’ (= a, a+1, ..., b)
’<a’ (= 0,1,...,a-2,a-1)
’>=a’ (= a, a+1, ... up to the maximum wave
number corresponding to the Nyquist
frequency for the case of FFT)
rejwn -- wave number(s) of sinusoid NOT to be used for fitting.
can be set just as addwn but has higher priority:
wave numbers which are specified both in addwn
and rejwn will NOT be used.
note also that rejwn value takes precedence over those
automatically selected by setting applyfft=True as well.
default: []
clipthresh -- clipping threshold for iterative fitting
default: 3
clipniter -- maximum iteration number for iterative fitting
default: 0 (no iteration, i.e., no clipping)
blparam -- the name of text file that stores per spectrum fit
parameters. See below for details of format.
verbose -- (NOT SUPPORTED YET) output fitting results to logger. if False, the fitting results
including coefficients, residual rms, etc., are not output to
the CASA logger, while the processing speed gets faster.
options: (bool) False
default: False (verbose=True is currently unavailable)
showprogress -- (NOT SUPPORTED YET) show progress status for large data
options: (bool) False (this capability is currently unavailable.)
default: False
>>> showprogress expandable parameter
minnrow -- (NOT SUPPORTED YET) minimum number of input spectra to show progress status
default: 1000
outfile -- name of output file
default: ’’ (<infile>_bs)
overwrite -- overwrite the output file if already exists
options: (bool) True, False
default: False
NOTE this parameter is ignored when outform=’ASCII’
-----------
DESCRIPTION
-----------
Task sdbaseline performs baseline fitting/subtraction for single-dish spectra.
The fit parameters, terms and rms of baseline can be saved into an ascii file
or baseline table. Subtracting baseline from data in input MS using existing
baseline table is also possible.
-----------------------
BASELINE MODEL FUNCTION
-----------------------
The list of available model functions are shown above (see Keyword arguments
section). In general ’cspline’ or ’chebyshev’ are recommended since they are
more stable than others. ’poly’ will work for lower order but will be unstable
for higher order fitting. ’sinusoid’ is kind of special mode that will be
useful for the data that clearly shows standing wave in the spectral baseline.
----------------------------------
SIGMA CLIPPING (ITERATIVE FITTING)
----------------------------------
In general least square fitting is strongly affected by an extreme data
so that the resulting fit makes worse. Sigma clipping is an iterative
baseline fitting with data clipping based on a certain threshold. Threshold
is set as a certain factor times rms of the resulting (baseline subtracted)
spectra. If sigma clipping is on, baseline fit/removal is performed several
times. After each baseline subtraction, the data whose absolute value is
above threshold are detected and those data are excluded from the next round
of fitting. By using sigma clipping, extreme data are excluded from the
fit so that resulting fit is more robust.
The user is able to control a multiplication factor using parameter
clipthresh for clipping threshold based on rms. Actual threshold for sigma
clipping will be (clipthresh) x (rms of spectra). Also, the user can specify
number of maximum iteration to the parameter clipniter.
In general, sigma clipping will lower the performance since it increases
number of fits per spectra. However, it is strongly recommended to turn
on sigma clipping unless you are sure that the data is free from any kind
of extreme values that may affect the fit.
----------------------------------
PER SPECTRUM FIT PARAMETERS
----------------------------------
Per spectrum baseline fitting parameter is accepted in blfunc=’variable’.
Note this is an expert mode. The fitting parameters should be defined in
a text file for each spectrum in the input MS. The text file should store
comma separated values in order of:
row ID, polarization, mask, clipniter, clipthresh, use_linefinder,
thresh, left edge, right edge, avg_limit, blfunc, order, npiece, nwave.
Each row in the text file must contain the following keys and values:
* ’row’: row number after selection,
* ’pol’: polarization index in the row,
* ’clipniter’: maximum iteration number for iterative fitting,
* ’blfunc’: function name.
available ones include, ’poly’, ’chebyshev’, ’cspline’,and ’sinusoid’
* ’order’: maximum order of polynomial. needed when blfunc=’poly’
or ’chebyshev’,
* ’npiece’: number or piecewise polynomial. needed when blfunc=’cspline’,
and
* ’nwave’: a list of sinusoidal wave numbers. needed when blfunc=’sinusoid’.
example:
#row,pol,mask,clipniter,clipthresh,use_linefinder,thresh,Ledge,Redge,avg_limit,blfunc,order,npiece,nwave
1,1,0~4000;6000~8000,0,3.,false,0.,0,0,0,chebyshev,0,0,[]
1,0,,0,3.,false,0.,0,0,0,poly,1,0,[]
0,1,,0,3.,false,0.,0,0,0,chebyshev,2,0,[]
0,0,,0,3.,false,,,,,cspline,,1,[]
More information about CASA may be found at the
CASA web page
Copyright © 2016 Associated Universities Inc., Washington, D.C.
This code is available under the terms of the GNU General Public Lincense
Home |
Contact Us |
Directories |
Site Map |
Help |
Privacy Policy |
Search