pipeline.hsd.heuristics package¶

Submodules¶

pipeline.hsd.heuristics.MaskDeviation module¶

class pipeline.hsd.heuristics.MaskDeviation.MaskDeviation(infile, spw=None)[source]¶

Bases: object

This class detects channels with large variation or deviation. If some channels contain emission lines or atmospheric absorption/emission, their values change substantially with position and environmental conditions. Emission lines and atmospheric features often degrade the quality of baseline subtraction. Therefore, channels with large variation should be masked before determining the baseline fitting order and subtracting the baseline.

CalcRange(threshold=3.0, detection=5.0, extension=2.0, iteration=10, consider_flag=False)[source]¶

Find regions whose value is greater than the threshold.

‘threshold’ is used for median calculation
‘detection’ is used to detect mask region
‘extension’ is used to extend the mask region

Used data:

self.stdSp: 1D spectrum with self.nchan channels calculated in CalcStdSpectrum. Each channel records the standard deviation of that channel across all original spectra.

CalcStdSpectrum(consider_flag=False)[source]¶

meanSP, maxSP, minSP, ymax, ymin: used only for plotting and should be commented out when implemented in the pipeline

ExtendMask(mask, threshold)[source]¶: Extend the mask region as long as Standard Deviation value is higher than the given threshold
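The extension step can be sketched as follows. This is only an illustration, not the pipeline code: the function name `extend_mask`, the explicit `std_sp` argument (standing in for self.stdSp), and the inclusive [start, end] range format are all assumptions.

```python
import numpy as np


def extend_mask(std_sp, mask, threshold):
    """Grow each inclusive [start, end] range while std_sp > threshold.

    Hypothetical helper: std_sp plays the role of self.stdSp.
    """
    std_sp = np.asarray(std_sp, dtype=float)
    nchan = len(std_sp)
    extended = []
    for start, end in mask:
        # Walk left while the neighbouring channel is still above threshold
        while start > 0 and std_sp[start - 1] > threshold:
            start -= 1
        # Walk right in the same way
        while end < nchan - 1 and std_sp[end + 1] > threshold:
            end += 1
        extended.append([start, end])
    return extended
```

For example, with `std_sp = [1, 1, 3, 6, 3, 1, 1]`, a detected range `[[3, 3]]` and `threshold=2`, the range grows to cover channels 2 through 4.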

PlotRange(L, R)[source]¶: Plot masked range

PlotSpectrum()[source]¶: plot max, min, mean, and standard deviation of the spectra

ReadData(vis='', field='', antenna='', colname=None)[source]¶: Reads data from input MS.

SavePlot()[source]¶: Save the plot in PNG format

SubtractMedian(threshold=3.0, consider_flag=False)[source]¶

Subtract median value of the spectrum from the spectrum: re-bias the spectrum.

Initial median (MED_0) and standard deviation (STD) are calculated for each spectrum. The final median value is determined using only the channels whose value lies inside the range: MED_0 - threshold * STD < VALUE < MED_0 + threshold * STD
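The re-biasing rule above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the pipeline implementation: the function name `subtract_median` and the (nrow, nchan) array layout are hypothetical, flag handling (consider_flag) is omitted, and the fallback for spectra with no surviving channel is an added assumption.

```python
import numpy as np


def subtract_median(spectra, threshold=3.0):
    """Re-bias each spectrum by a clipped median (hypothetical helper)."""
    spectra = np.asarray(spectra, dtype=float)
    out = np.empty_like(spectra)
    for i, sp in enumerate(spectra):
        med0 = np.median(sp)  # initial median, MED_0
        std = np.std(sp)      # standard deviation, STD
        # Keep channels with MED_0 - threshold*STD < VALUE < MED_0 + threshold*STD
        inside = np.abs(sp - med0) < threshold * std
        # Assumption: fall back to MED_0 if no channel survives the clipping
        med = np.median(sp[inside]) if inside.any() else med0
        out[i] = sp - med
    return out
```

Channels containing strong features are excluded from the final median, so the bias estimate is driven by the line-free part of the spectrum.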

class pipeline.hsd.heuristics.MaskDeviation.MaskDeviationHeuristic[source]¶

Bases: pipeline.infrastructure.api.Heuristic

calculate(vis, field_id='', antenna_id='', spw_id='', consider_flag=False)[source]¶

Channel mask heuristics using MaskDeviation algorithm implemented in MaskDeviation class.

vis – input MS filename
field_id – target field identifier
antenna_id – target antenna identifier
spw_id – target spw identifier
consider_flag – whether to take flags in the MS into account

pipeline.hsd.heuristics.MaskDeviation.VarPlot(infile)[source]¶

pipeline.hsd.heuristics.baselineparamconfig module¶

pipeline.hsd.heuristics.baselineparamconfig.BLP¶: alias of pipeline.hsd.heuristics.baselineparamconfig.BaselineParamKeys

class pipeline.hsd.heuristics.baselineparamconfig.BaselineFitParamConfig(switchpoly=True)[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Generate/update BLParam file according to the input parameters.

ApplicableDuration = 'raster'¶

property ClipCycle¶

MaxPolynomialOrder = 'none'¶

PolynomialOrder = 'automatic'¶

calculate(datatable, ms, antenna_id, field_id, spw_id, fit_order, edge, deviation_mask, blparam)[source]¶

Generate/update BLParam file, which will be an input to sdbaseline, according to the input parameters.

Inputs:

datatable – DataTable instance
ms – MeasurementSet domain object
antenna_id – Antenna ID to process
field_id – Field ID to process
spw_id – Spw ID to process
fit_order – Fitting order (‘automatic’ or number)
edge – Number of edge channels to be excluded from the heuristics (format: [L, R])
deviation_mask – Deviation mask
blparam – Name of the BLParam file. File contents will be updated by this heuristics

Returns: Name of the BLParam file

is_multi_vis_task = False¶

class pipeline.hsd.heuristics.baselineparamconfig.BaselineParamKeys[source]¶

Bases: object

AVG_LIMIT = 'avg_limit'¶

CLIPNITER = 'clipniter'¶

CLIPTHRESH = 'clipthresh'¶

FUNC = 'blfunc'¶

LEDGE = 'Ledge'¶

LFTHRESH = 'thresh'¶

MASK = 'mask'¶

NPIECE = 'npiece'¶

NWAVE = 'nwave'¶

ORDER = 'order'¶

ORDERED_KEY = ['row', 'pol', 'mask', 'clipniter', 'clipthresh', 'use_linefinder', 'thresh', 'Ledge', 'Redge', 'avg_limit', 'blfunc', 'order', 'npiece', 'nwave']¶

POL = 'pol'¶

REDGE = 'Redge'¶

ROW = 'row'¶

USELF = 'use_linefinder'¶

class pipeline.hsd.heuristics.baselineparamconfig.CubicSplineFitParamConfig(switchpoly=True)[source]¶: Bases: pipeline.hsd.heuristics.baselineparamconfig.BaselineFitParamConfig

pipeline.hsd.heuristics.baselineparamconfig.DEBUG()[source]¶

pipeline.hsd.heuristics.baselineparamconfig.TRACE()[source]¶

pipeline.hsd.heuristics.baselineparamconfig.as_maskstring(masklist)[source]¶

pipeline.hsd.heuristics.baselineparamconfig.do_switching(engine, nchan, edge, num_pieces, masklist)[source]¶

pipeline.hsd.heuristics.baselineparamconfig.no_switching(engine, nchan, edge, num_pieces, masklist)[source]¶

pipeline.hsd.heuristics.baselineparamconfig.write_blparam(fileobj, param)[source]¶

pipeline.hsd.heuristics.fitorder module¶

class pipeline.hsd.heuristics.fitorder.FitOrderHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Determine fitting order from a set of spectral data.

MaxDominantFreq = 15¶

calculate(data, mask=None, edge=(0, 0))[source]¶

Determine fitting order from a set of spectral data, data, with masks for each spectral data, mask, and number of edge channels to be excluded, edge.

First, manipulate each spectral data by the following procedure:

mask regions specified by mask and edge,

subtract average from spectral data,

compute one-dimensional discrete Fourier Transform.

Then, the Fourier power spectra are averaged, and the averaged power spectrum is analyzed to determine the optimal polynomial order for the input data array. The heuristics returns one representative polynomial order per input data array.

data: two-dimensional data array with shape (nrow, nchan).
mask: list of mask regions. Value should be a list of [[start0,end0],[start1,end1],…] for each spectrum. [[-1,-1]] indicates no mask. Default is None.
edge: number of edge channels to be dropped. Default is (0,0).
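The preprocessing steps (mask, subtract average, Fourier transform, average the power spectra) can be sketched as below. The function name `averaged_power_spectrum` is hypothetical, and the sketch assumes that mask regions are excluded channel ranges with inclusive end points. How the averaged spectrum is converted into a polynomial order is not described here, so that step is left out.

```python
import numpy as np


def averaged_power_spectrum(data, mask=None, edge=(0, 0)):
    """Average the Fourier power spectrum over rows (illustrative only)."""
    data = np.asarray(data, dtype=float)
    nrow, nchan = data.shape
    power = np.zeros(nchan // 2 + 1)
    for irow in range(nrow):
        valid = np.ones(nchan, dtype=bool)
        # Drop edge channels on both sides
        valid[:edge[0]] = False
        if edge[1] > 0:
            valid[nchan - edge[1]:] = False
        # Apply per-spectrum mask regions; [[-1, -1]] means no mask
        regions = mask[irow] if mask is not None else [[-1, -1]]
        for start, end in regions:
            if start >= 0:
                valid[start:end + 1] = False
        sp = np.zeros(nchan)
        if valid.any():
            # Subtract the average of valid channels; masked channels stay zero
            sp[valid] = data[irow, valid] - data[irow, valid].mean()
        # One-dimensional discrete Fourier transform, converted to power
        power += np.abs(np.fft.rfft(sp)) ** 2
    return power / nrow
```

A baseline ripple with three cycles across the band shows up as a peak in bin 3 of the averaged spectrum, which is the kind of dominant frequency the heuristics would inspect.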

class pipeline.hsd.heuristics.fitorder.MaskMaker(nchan, lines, edge)[source]¶

Bases: pipeline.hsd.heuristics.fitorder.MaskMakerNoLine

get_mask(row)[source]¶

class pipeline.hsd.heuristics.fitorder.MaskMakerNoLine(nchan, edge)[source]¶

Bases: object

get_mask(row)[source]¶

class pipeline.hsd.heuristics.fitorder.SwitchPolynomialWhenLargeMaskAtEdgeHeuristic[source]¶

Bases: pipeline.infrastructure.api.Heuristic

calculate(nchan, edge, num_pieces, masklist)[source]¶

Make a calculation based on the given parameters.

This is an abstract method and must be implemented by all Heuristic subclasses.

Note

The signature and return types of calculate() are intended to be implementation specific. Refer to the documentation of the implementing class for the appropriate signature.

pipeline.hsd.heuristics.fragmentation module¶

class pipeline.hsd.heuristics.fragmentation.FragmentationHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Determine fragmentation from given total number of channels, edge channels to be dropped, and polynomial order for baseline fit.

MaxFragmentation = 3¶

MaxOrder = 9¶

MinChannels = 512¶

calculate(polyorder, nchan, edge, modification=0)[source]¶

Determine fragmentation from given total number of channels, edge channels to be dropped, and polynomial order for baseline fit.

Inputs:

polyorder – polynomial order for baseline fit
nchan – number of channels
edge – edge channels to be dropped given by tuple, (left, right)
modification – modification factor for polyorder

Returns:

fragment – fragmentation parameter
num_segment – number of segments
segment_polyorder – polynomial order for baseline fit of segments

pipeline.hsd.heuristics.grouping2 module¶

Set of heuristics for data grouping.

class pipeline.hsd.heuristics.grouping2.GroupByPosition2[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Grouping by RA/DEC position.

calculate(ra: numpy.ndarray, dec: numpy.ndarray, r_combine: NewType.<locals>.new_type, r_allowance: NewType.<locals>.new_type) → Tuple[Dict, List][source]¶

Group data by RA/DEC position.

Divides data into groups by their positions in two-dimensional space, which are given by ra and dec. Data within a circle of radius r_combine are grouped together. Only position differences larger than r_allowance are regarded as significant. Values of r_combine and r_allowance are interpreted as degrees unless their units are given explicitly.

Parameters

ra – List of R.A.
dec – List of DEC.
r_combine – Data inside r_combine will be grouped together.
r_allowance – Data inside r_allowance are assumed to be the same position.

Returns

Two-tuple containing information on group membership (PosDict) and boundaries between groups (PosGap).

PosDict is a dictionary whose keys are indices for ra and dec. Each value is a list whose contents depend on whether the position specified by the index is the reference data for its group. If k is a reference index, PosDict[k] lists the indices of the group members ([ID1, ID2,…, IDN]). Otherwise, PosDict[k] is [-1, m], where m is the index of the reference data.

PosGap is a list of gaps in terms of array indices for ra and dec ([IDX1, IDX2,…,IDXN]). Length of PosGap is (number of groups) - 1.
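The PosDict layout can be illustrated with a simplified greedy grouping. This sketch is not the GroupByPosition2 algorithm: the function name `group_by_position` is hypothetical, distances are plain Euclidean distances in (ra, dec), r_allowance is ignored, and PosGap is not computed.

```python
import numpy as np


def group_by_position(ra, dec, r_combine):
    """Greedy grouping that returns a PosDict-like dictionary (illustrative)."""
    ra = np.asarray(ra, dtype=float)
    dec = np.asarray(dec, dtype=float)
    pos_dict = {}
    for k in range(len(ra)):
        if k in pos_dict:
            continue  # already assigned to an earlier reference
        dist = np.hypot(ra - ra[k], dec - dec[k])
        members = [int(i) for i in np.where(dist <= r_combine)[0]
                   if int(i) not in pos_dict]
        pos_dict[k] = members          # reference entry: list of member indices
        for m in members:
            if m != k:
                pos_dict[m] = [-1, k]  # member entry: pointer to the reference
    return pos_dict
```

For two tight pairs of pointings, `group_by_position([0, 0.01, 1, 1.01], [0, 0, 0, 0], 0.1)` gives reference entries for indices 0 and 2 and `[-1, reference]` entries for indices 1 and 3.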

class pipeline.hsd.heuristics.grouping2.GroupByTime2[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Grouping by time sequence.

calculate(timebase: Sequence[numbers.Real], time_diff: Sequence[numbers.Real]) → Tuple[List, List][source]¶

Group data by time sequence.

Divides data into groups by their time difference (time_diff). Two sets of groups are defined based on “small” and “large” gaps, whose thresholds are computed internally by the ThresholdForGroupByTime heuristic. In most cases, time_diff is generated from timebase. In other cases, timebase contains all time stamps while time_diff is created from selected time stamps.

Parameters

timebase – base list of time stamps for threshold estimation
time_diff – difference from the previous time stamp

Returns

Two-tuple containing information on group membership (TimeTable) and boundaries between groups (TimeGap).

TimeTable is a “list-of-lists” whose items are the sets of indices for each group. TimeTable[0] holds the groups separated by “small” gaps while TimeTable[1] holds the groups separated by “large” gaps. They are used for baseline subtraction (hsd_baseline) and subsequent flagging (hsd_blflag).

TimeTable:

[[[ismall00,…,ismall0M],[…],…,[ismallX0,…,ismallXN]],
 [[ilarge00,…,ilarge0P],[…],…,[ilargeY0,…,ilargeYQ]]]

TimeTable[0]: separated by small gaps
TimeTable[1]: separated by large gaps

TimeGap is the list of indices which indicate boundaries for “small” and “large” gaps. These are used for plotting.

TimeGap: [[rowX1, rowX2,…,rowXN], [rowY1, rowY2,…,rowYN]]
TimeGap[0]: small gap
TimeGap[1]: large gap
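The shape of TimeTable and TimeGap can be illustrated with a small sketch. The helper names and the explicit threshold arguments are assumptions (the heuristic computes its thresholds internally), and the sketch assumes that a gap index marks the first row of a new group.

```python
import numpy as np


def split_by_gap(time_diff, threshold):
    """Return (groups, gaps) for one threshold (hypothetical helper)."""
    time_diff = np.asarray(time_diff, dtype=float)
    # Row i starts a new group when its difference from row i-1 exceeds threshold
    gaps = [int(i) for i in np.where(time_diff > threshold)[0] if i > 0]
    bounds = [0] + gaps + [len(time_diff)]
    groups = [list(range(s, e)) for s, e in zip(bounds[:-1], bounds[1:])]
    return groups, gaps


def group_by_time(time_diff, threshold_small, threshold_large):
    """Build TimeTable-like and TimeGap-like structures (illustrative)."""
    small_groups, small_gaps = split_by_gap(time_diff, threshold_small)
    large_groups, large_gaps = split_by_gap(time_diff, threshold_large)
    time_table = [small_groups, large_groups]
    time_gap = [small_gaps, large_gaps]
    return time_table, time_gap
```

With `time_diff = [0, 1, 1, 10, 1, 1, 60, 1]` and thresholds 5 and 50, the small-gap groups are rows 0-2, 3-5 and 6-7, while the large-gap groups are rows 0-5 and 6-7.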

class pipeline.hsd.heuristics.grouping2.MergeGapTables2[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Merge time gap and position gaps.

calculate(TimeGap: List, TimeTable: List, PosGap: List, tBEAM: Sequence[int]) → Tuple[List, List][source]¶

Merge time gap and position gaps.

Merge time gap list (TimeGap) and position gap list (PosGap). TimeTable and TimeGap should be the first and the second elements of the return value of GroupByTime2 heuristic. Also, PosGap should be the second element of the return value of GroupByPosition2 heuristic. PosGap is merged into small TimeGap (TimeGap[0]).

tBEAM is used to separate the data by beam for multi-beam data.

Parameters

TimeTable – the first element of output from GroupByTime2 heuristic
TimeGap – the second element of output from GroupByTime2 heuristic
PosGap – the second element of output from GroupByPosition2()
tBEAM – list of beam identifier.

Returns

Two-tuple containing information on group membership (TimeTable) and boundaries between groups (TimeGap).

TimeTable is a “list-of-lists” whose items are the sets of indices for each group. TimeTable[0] holds the groups separated by “small” gaps while TimeTable[1] holds the groups separated by “large” gaps. They are used for baseline subtraction (hsd_baseline) and subsequent flagging (hsd_blflag).

TimeTable:

[[[ismall00,…,ismall0M],[…],…,[ismallX0,…,ismallXN]],
 [[ilarge00,…,ilarge0P],[…],…,[ilargeY0,…,ilargeYQ]]]

TimeTable[0]: separated by small gaps
TimeTable[1]: separated by large gaps

TimeGap is the list of indices which indicate boundaries for “small” and “large” gaps. The “small” gap is a merged list of gaps for groups separated by small time gaps and the ones grouped by positions. These are used for plotting.

TimeGap: [[rowX1, rowX2,…,rowXN], [rowY1, rowY2,…,rowYN]]
TimeGap[0]: small gap
TimeGap[1]: large gap
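Merging PosGap into the small-gap list can be pictured as a sorted union of boundary indices. The sketch below is a simplification under assumptions: the function name `merge_gaps` and the `nrow` argument are hypothetical, a gap index is taken to mark the first row of a new group, and the per-beam separation driven by tBEAM is not modelled.

```python
def merge_gaps(time_gap, pos_gap, nrow):
    """Merge PosGap into TimeGap[0] and rebuild the small-gap groups."""
    # Sorted union of small time gaps and position gaps, without duplicates
    merged_small = sorted(set(time_gap[0]) | set(pos_gap))
    bounds = [0] + merged_small + [nrow]
    small_groups = [list(range(s, e)) for s, e in zip(bounds[:-1], bounds[1:])]
    # The large-gap list is passed through unchanged
    return small_groups, [merged_small, list(time_gap[1])]
```

For instance, `merge_gaps([[3, 6], [6]], [5], 8)` adds the boundary at row 5 to the small gaps, which splits the group of rows 3-5 into rows 3-4 and row 5.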

class pipeline.hsd.heuristics.grouping2.ThresholdForGroupByTime[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Estimate thresholds for large and small time gaps.

calculate(timebase: Sequence[numbers.Real]) → Tuple[List, List][source]¶

Estimate thresholds for large and small time gaps.

Estimate thresholds for large and small time gaps using base list of time stamps. Threshold for small time gap, denoted as Threshold1, is computed from a median value of nonzero time differences multiplied by five, i.e.,

dt = timebase[1:] - timebase[:-1]
Threshold1 = 5 * np.median(dt[dt != 0])

where timebase is assumed to be np.ndarray. Threshold for large time gap, denoted as Threshold2, is computed from a median value of time differences larger than Threshold1 multiplied by five, i.e.,

Threshold2 = 5 * np.median(dt[np.logical_and(dt != 0, dt > Threshold1)])

Parameters: timebase – base list of time stamps for threshold estimation
Returns: Two-tuple of threshold values for small and large time gaps, respectively.
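The two formulas translate directly into a short function. The function name is hypothetical, and the fallback used when no time difference exceeds Threshold1 is an assumption added so that the sketch never takes the median of an empty array.

```python
import numpy as np


def thresholds_for_time_grouping(timebase):
    """Estimate (Threshold1, Threshold2) from a list of time stamps."""
    timebase = np.asarray(timebase, dtype=float)
    dt = timebase[1:] - timebase[:-1]
    # Small-gap threshold: five times the median nonzero time difference
    threshold1 = 5 * np.median(dt[dt != 0])
    large = dt[np.logical_and(dt != 0, dt > threshold1)]
    # Assumption: reuse threshold1 when no difference exceeds it
    threshold2 = 5 * np.median(large) if large.size else threshold1
    return threshold1, threshold2
```

For time stamps `[0, 1, 2, 3, 13, 14, 15, 16]`, the nonzero differences have a median of 1, so Threshold1 is 5. The only difference above that is 10, so Threshold2 is 50.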

pipeline.hsd.heuristics.grouping2_test module¶

Test for heuristics defined in grouping2.py.

pipeline.hsd.heuristics.grouping2_test.generate_position_data_psw() → Tuple[numpy.ndarray, numpy.ndarray][source]¶

Generate position data for simulated position-switch observation.

Generate position data for simulated position-switch observation. The observation consists of four positions in a 2x2 grid, (0,0), (0,1), (1,0), and (1,1). Each position has ten data points that contain random noise around the commanded position.

y
   (0,1)   (1,1)
     +       +
   (0,0)   (1,0)
     +       +
   ----------------- x

Returns

two-tuple consisting of the list of x (R.A.) and y (Dec.) directions

Return type

tuple

pipeline.hsd.heuristics.grouping2_test.generate_position_data_raster() → Tuple[numpy.ndarray, numpy.ndarray][source]¶

Generate position data for simulated OTF raster observation.

Generate position data for simulated OTF raster observation along x-direction (R.A.). The observation consists of two raster rows. Each row has twenty continuously taken data points that contain random noise around the commanded position. Scanning directions are opposite in these two rows.

y
      <--------------------------------
1 - + + + + + + + + + + + + + + + + + + + +
0 - + + + + + + + + + + + + + + + + + + + +
      -------------------------------->
----------------------------------------------- x

Returns

two-tuple consisting of the list of x (R.A.) and y (Dec.) directions

Return type

tuple

pipeline.hsd.heuristics.grouping2_test.generate_time_data_psw() → numpy.ndarray[source]¶

Generate time series for simulated position-switch observation.

Generate time series for simulated position-switch observation. Observation consists of four fixed positions and each position contains ten continuous integrations. Integration time is assumed to be 1sec. There are time gaps between positions: 10sec, 60sec, and 10sec, respectively.

0  1  …  8  9   gap   10 … 19   gap   …   30 … 39
|--|--|...|--|--|-------|--|...|--|-------|...|--|...|--|
|  POSITION 0   |       |  POS 1  |       |   |  POS 3  |

Returns: time series
Return type: np.ndarray
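A time series with this structure can be generated as sketched below. The function name `make_psw_times` and its parameters are hypothetical, and each gap is assumed to be the difference between the last time stamp of one position and the first time stamp of the next.

```python
import numpy as np


def make_psw_times(n_per_pos=10, gaps=(10.0, 60.0, 10.0), t_int=1.0):
    """Time stamps for len(gaps) + 1 positions, n_per_pos integrations each."""
    chunks = []
    start = 0.0
    for ipos in range(len(gaps) + 1):
        # Continuous integrations at a fixed position
        chunk = start + t_int * np.arange(n_per_pos)
        chunks.append(chunk)
        if ipos < len(gaps):
            # Assumption: next position starts one gap after the last stamp
            start = chunk[-1] + gaps[ipos]
    return np.concatenate(chunks)
```

With the defaults, the result has 40 time stamps, and the differences at indices 10, 20 and 30 are 10, 60 and 10 seconds, while all other differences equal the integration time.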

pipeline.hsd.heuristics.grouping2_test.generate_time_data_raster() → numpy.ndarray[source]¶

Generate time series for simulated OTF raster observation.

Generate time series for simulated OTF raster observation. Observation consists of two raster rows and each row contains twenty continuous integrations. Integration time is assumed to be 1sec. There is a time gap of 10 sec between rows.

0  1  2  …  17 18 19   gap   20 21 … 38 39
|--|--|--|...|--|--|--|-------|--|--|...|--|--|
|    RASTER ROW 0     |       | RASTER ROW 1  |

Returns: time series
Return type: np.ndarray

pipeline.hsd.heuristics.grouping2_test.random_noise(n: int, mean: int = 0, amp: int = 1, rs: numpy.random.mtrand.RandomState = None) → numpy.ndarray[source]¶

Generate random noise.

Generate random noise with given mean and maximum amplitude. Seed for random noise can be specified.

Parameters

n (int) – number of random noise
mean (int, optional) – mean value of random noise. Defaults to 0.
amp (int, optional) – maximum amplitude of random noise. Defaults to 1.
rs (np.random.mtrand.RandomState, optional) – seed for random noise. Defaults to None.

Returns

random noise

Return type

np.ndarray

pipeline.hsd.heuristics.grouping2_test.test_group_by_posiition_error()[source]¶: Test grouping by position: error cases.

pipeline.hsd.heuristics.grouping2_test.test_group_by_position_moderate_allowance_radius()[source]¶: Test grouping by position: moderate allowance radius -> no gap is detected.

pipeline.hsd.heuristics.grouping2_test.test_group_by_position_psw(combine_radius, allowance_radius)[source]¶: Test grouping by position on position switch pattern.

pipeline.hsd.heuristics.grouping2_test.test_group_by_position_raster(combine_radius, allowance_radius)[source]¶: Test grouping by position on raster pattern including some edge cases.

pipeline.hsd.heuristics.grouping2_test.test_group_by_position_too_large_allowance_radius()[source]¶: Test grouping by position: too large allowance radius -> all gaps are detected.

pipeline.hsd.heuristics.grouping2_test.test_group_by_position_too_large_combine_radius()[source]¶: Test grouping by position: too large combine radius -> only one group.

pipeline.hsd.heuristics.grouping2_test.test_group_by_position_too_small_combine_radius()[source]¶: Test grouping by position: too small combine radius -> all data are separated.

pipeline.hsd.heuristics.grouping2_test.test_group_by_time_psw(time_list)[source]¶: Test grouping by time for position switch pattern.

pipeline.hsd.heuristics.grouping2_test.test_group_by_time_raster(time_list)[source]¶: Test grouping by time for raster pattern.

pipeline.hsd.heuristics.grouping2_test.test_merge_gap_tables_psw()[source]¶: Test merging gap tables for position switch pattern.

pipeline.hsd.heuristics.grouping2_test.test_merge_gap_tables_raster()[source]¶: Test merging gap tables for raster pattern.

pipeline.hsd.heuristics.grouping2_test.test_threshold_for_time(time_list, expected_gaps)[source]¶: Test evaluation of threshold for time grouping.

pipeline.hsd.heuristics.observingpattern2 module¶

class pipeline.hsd.heuristics.observingpattern2.ObservingPattern2[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Analyze pointing pattern

calculate(pos_dict)[source]¶

Analyze the pointing pattern from pos_dict, which is calculated by the GroupByPosition2 heuristic. Returns ret.

ret: ‘RASTER’, ‘SINGLE-POINT’, or ‘MULTI-POINT’

# PosDict[row]: index

pipeline.hsd.heuristics.sdbeamsize module¶

class pipeline.hsd.heuristics.sdbeamsize.AntennaDiameter[source]¶

Bases: pipeline.infrastructure.api.Heuristic

get antenna diameter in metre from its name.

calculate(name)[source]¶

get antenna diameter in metre from its name.

name: antenna name

class pipeline.hsd.heuristics.sdbeamsize.SingleDishBeamSize[source]¶

Bases: pipeline.infrastructure.api.Heuristic

calculate beam size in arcsec.

calculate(diameter, frequency)[source]¶

Calculate beam size in arcsec. The returned value is rounded, like 33.0 for 32.199992 or 9.8 for 9.783333331. NOTE: the rounding is CURRENTLY DISABLED to match manual reduction.

diameter: antenna diameter in metre
frequency: observing frequency in GHz

class pipeline.hsd.heuristics.sdbeamsize.SingleDishBeamSizeFromName[source]¶

Bases: pipeline.hsd.heuristics.sdbeamsize.SingleDishBeamSize

calculate beam size in arcsec.

calculate(name, frequency)[source]¶

Calculate beam size in arcsec. Antenna diameter is taken from its name.

name: antenna name
frequency: observing frequency in GHz

pipeline.hsd.heuristics.sdcaltype module¶

class pipeline.hsd.heuristics.sdcaltype.AsdmCalibrationTypeHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

calculate(filename)[source]¶

Make a calculation based on the given parameters.

This is an abstract method and must be implemented by all Heuristic subclasses.

Note

The signature and return types of calculate() are intended to be implementation specific. Refer to the documentation of the implementing class for the appropriate signature.

class pipeline.hsd.heuristics.sdcaltype.CalibrationTypeHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

calculate(filename)[source]¶

Make a calculation based on the given parameters.

This is an abstract method and must be implemented by all Heuristic subclasses.

Note

The signature and return types of calculate() are intended to be implementation specific. Refer to the documentation of the implementing class for the appropriate signature.

class pipeline.hsd.heuristics.sdcaltype.DefaultCalibrationTypeHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Always return calibration mode ‘none’.

calculate(filename)[source]¶: Return calibration type. Always return ‘none’.

class pipeline.hsd.heuristics.sdcaltype.MsCalibrationTypeHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Return appropriate calibration type by examining STATE subtable.

calculate(filename)[source]¶: Return calibration type, which is determined by examining STATE subtable.

pipeline.hsd.heuristics.sddatatype module¶

class pipeline.hsd.heuristics.sddatatype.DataTypeHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Examine type of the input data. Data types that are recognizable by this heuristics are:

‘MS2’ – Measurement Set
‘ASDM’ – ASDM
‘FITS’ – ATNF SDFITS
‘NRO’ – NRO data format (both NEWSTAR and NOSTAR)

Otherwise, the heuristics will return ‘UNKNOWN’.

calculate(filename)[source]¶

Return data type of the file.

filename – name of the file on disk

pipeline.hsd.heuristics.sddatatype.is_ms(filename)[source]¶

pipeline.hsd.heuristics.tsysspwmap module¶

class pipeline.hsd.heuristics.tsysspwmap.TsysSpwMapHeuristics[source]¶

Bases: pipeline.infrastructure.api.Heuristic

Heuristics for Tsys spw mapping. Examine frequency coverage and choose one Tsys spw for each science spw. The score for frequency coverage is calculated by the following formula:

score = (min(fmax_tsys,fmax_science) - max(fmin_tsys,fmin_science)) / (fmax_science - fmin_science)

The score 1.0 is the best (whole science frequency range is covered by Tsys spw) while score <= 0.0 is the worst (no overlap between Tsys spw and science spw).
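The coverage score can be written out as a small function. Both function names and the dictionary-based input below are hypothetical. The sketch only shows how the score behaves and how the best candidate could be picked; it does not reproduce the calculate() or best_spwmap() interfaces.

```python
def coverage_score(fmin_tsys, fmax_tsys, fmin_science, fmax_science):
    """Fraction of the science frequency range covered by a Tsys spw."""
    overlap = min(fmax_tsys, fmax_science) - max(fmin_tsys, fmin_science)
    return overlap / (fmax_science - fmin_science)


def pick_best_tsys_spw(science_range, tsys_ranges):
    """Return (best spw id, scores) for one science spw (hypothetical helper).

    science_range: (fmin, fmax) of the science spw
    tsys_ranges: dict mapping Tsys spw id to (fmin, fmax)
    """
    fmin_sci, fmax_sci = science_range
    scores = {spw: coverage_score(lo, hi, fmin_sci, fmax_sci)
              for spw, (lo, hi) in tsys_ranges.items()}
    return max(scores, key=scores.get), scores
```

For a science spw spanning 100 to 102 GHz, a Tsys spw spanning 99 to 103 GHz scores 1.0, one spanning 101 to 105 GHz scores 0.5, and one spanning 110 to 112 GHz scores -4.0, so the first is selected.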

calculate(ms, spwmap_pairs)[source]¶

Make a calculation based on the given parameters.

This is an abstract method and must be implemented by all Heuristic subclasses.

Note

The signature and return types of calculate() are intended to be implementation specific. Refer to the documentation of the implementing class for the appropriate signature.

pipeline.hsd.heuristics.tsysspwmap.best_spwmap(scores)[source]¶
