MuonDataLib Tutorial 6: Advanced Filtering
In the previous tutorials we used the sample logs and times to filter the event data. However, there are some additional features that are worth discussing.
First, let's set up the data with a sample log.
[1]:
from MuonDataLib.data.loader.load_events import load_events
from MuonDataLib.plot.basic import Figure
from MuonDataLib.data.utils import create_data_from_function
import os
import numpy as np
file_name = 'HIFI00195790.nxs'
input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)
data = load_events(input_file, 64)
frame_start_times = data.get_frame_start_times()
def osc(x, amp, omega, phi):
    return amp*np.sin(omega*x + phi) + amp*1.1
start = frame_start_times[0]
end = frame_start_times[-1]+1
step = (frame_start_times[-1]-frame_start_times[0])/40
x, y = create_data_from_function(start, end, step, [3, 6.1, 0.91], osc, seed=1)
data.add_sample_log("Sample Temp", x, y)
WARNING: The metadata **RUN** is missing. Using fallback values
WARNING: The metadata **TITLE** is missing. Using fallback values
WARNING: The metadata **EXPERIMENT IDENTIFIER** is missing. Using fallback values
Next, let's create an unfiltered histogram.
[2]:
no_filter_hist, bins = data.histogram()
fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0])
fig.show()
Mix and match filters
In the previous tutorial we used multiple filters of the same type (e.g. 'only_keep_data_time_between'). However, it is possible to use any combination of filters. Consider an experiment in which we are only interested in temperatures between \(1\) and \(4\) Kelvin, but we also know that the detectors had an error between \(2.2\) and \(3.3\) seconds. We can then add the following two filters:
[3]:
data.remove_data_time_between('detector_error', 2.2, 3.3)
data.keep_data_sample_log_between('Sample Temp', 1, 4)
Next, let's look at the sample log.
[4]:
hist_mix, bins = data.histogram()
fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')
fig.plot_sample_log(data, 'Sample Temp')
fig.show()
WARNING: The target 0.0 is before the first frame start time 0.99375522 seconds. Difference is 0.99375522 seconds
WARNING: The target 3.5525043578742554 is after the last frame start time 3.1737624 seconds. Difference is 0.37874195787425524 seconds
WARNING: The target 3.985423831839438 is after the last frame start time 3.1737624 seconds. Difference is 0.811661431839438 seconds
WARNING: The target 3.3 is after the last frame start time 3.1737624 seconds. Difference is 0.12623759999999962 seconds
WARNING: The target 3.4467836817136326 is after the last frame start time 3.1737624 seconds. Difference is 0.2730212817136324 seconds
WARNING: The target 3.8317520720797935 is after the last frame start time 3.1737624 seconds. Difference is 0.6579896720797933 seconds
WARNING: The target 4.147959927132639 is after the last frame start time 3.1737624 seconds. Difference is 0.9741975271326391 seconds
The result is as expected. Next, let's look at the impact the filters have on the histograms.
[5]:
fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')
fig.plot_from_histogram(bins, hist_mix, [0], 'filtered (mix), ')
fig.show()
We can see that the filters have removed some data as expected.
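The way the two filters combine can be illustrated with a small standalone sketch. It is not MuonDataLib's implementation: the `keep_frame` helper, the frame list and the per-frame temperatures are all hypothetical, and it assumes that a frame survives only if it passes every filter, with inclusive range boundaries.

```python
def keep_frame(time, temp, remove_windows, temp_range):
    """Return True if a frame passes every filter (illustrative only)."""
    # A time 'remove' filter rejects frames that start inside any window.
    in_removed = any(start <= time <= end for start, end in remove_windows)
    # A sample log 'keep' filter rejects frames whose log value is out of range.
    low, high = temp_range
    return (not in_removed) and (low <= temp <= high)


# Hypothetical frames: (start time in seconds, sample temperature in Kelvin)
frames = [(1.0, 0.5), (1.5, 2.0), (2.5, 3.0), (3.0, 5.0), (3.4, 3.5)]

# Same ranges as the tutorial: remove 2.2 to 3.3 seconds, keep 1 to 4 Kelvin.
kept = [f for f in frames if keep_frame(f[0], f[1], [(2.2, 3.3)], (1, 4))]
print(kept)
```

Only the frames at 1.5 and 3.4 seconds survive in this toy data: the others are either inside the removed time window or outside the temperature range.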
Managing filters
When comparing two different samples, we need to make sure that both data sets use the same filters. This can be done with the save_filters command.
[6]:
filter_file = os.path.join('..', 'Output_files', 'filters.json')
data.save_filters(filter_file)
The argument defines where the file will be saved.
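Sharing filters through a file works because each filter is just a name with a start and an end value. The sketch below shows that idea as a plain JSON round trip using only the standard library. The dictionary layout is an assumption made for illustration and is not the schema that save_filters writes.

```python
import json
import os
import tempfile

# Hypothetical layout: filter type -> filter name -> [start, end].
# This is NOT the file format produced by MuonDataLib's save_filters.
filters = {
    "remove_time": {"detector_error": [2.2, 3.3]},
    "sample_log": {"Sample Temp": [1.0, 4.0]},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filters.json")
    # Write the filters once ...
    with open(path, "w") as handle:
        json.dump(filters, handle, indent=2)
    # ... and read them back, as a second analysis would.
    with open(path) as handle:
        loaded = json.load(handle)

# True when the loaded filters match the saved ones.
print(loaded == filters)
```

Because the file is plain text, it can also be kept under version control alongside the analysis scripts.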
The next step is to load the second data set. Because of storage limits on GitHub, we delete the original data set and load the same file again, but we will pretend it's a second data set.
[7]:
del data
from MuonDataLib.data.loader.load_events import load_events
from MuonDataLib.plot.basic import Figure
from MuonDataLib.data.utils import create_data_from_function
import os
file_name = 'HIFI00195790.nxs'
input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)
data_2 = load_events(input_file, 64)
frame_start_times = data_2.get_frame_start_times()
def osc(x, amp, omega, phi):
    return amp*np.sin(omega*x + phi) + amp*1.1
start = frame_start_times[0]
end = frame_start_times[-1]+1
step = (frame_start_times[-1]-frame_start_times[0])/50
x, y = create_data_from_function(start, end, step, [3, 6.1, 0.91], osc, seed=2)
data_2.add_sample_log("Sample Temp", x, y)
WARNING: The metadata **RUN** is missing. Using fallback values
WARNING: The metadata **TITLE** is missing. Using fallback values
WARNING: The metadata **EXPERIMENT IDENTIFIER** is missing. Using fallback values
The next step is to load the filters from earlier
[8]:
data_2.load_filters(filter_file)
To check that the filters have loaded as expected, we can use the 'report_filters' command.
[9]:
print(data_2.report_filters())
Filters(time_filters=TimeFilters(keep_filters=[], remove_filters=[Filter(name='detector_error', start=2.2, end=3.3)]), sample_log_filters=[Filter(name='Sample Temp', start=1.0, end=4.0)], peak_property=PeakProperty(Amplitudes=0.0), histogram_settings=HistogramSettings(min_time=0.0, max_time=32.768, num_bins=2048))
We can see that both filters are present. However, this second experiment didn’t have the detector error, so we can remove that filter.
[10]:
data_2.delete_remove_data_time_between('detector_error')
print(data_2.report_filters())
Filters(time_filters=TimeFilters(keep_filters=[], remove_filters=[]), sample_log_filters=[Filter(name='Sample Temp', start=1.0, end=4.0)], peak_property=PeakProperty(Amplitudes=0.0), histogram_settings=HistogramSettings(min_time=0.0, max_time=32.768, num_bins=2048))
We now have one filter as expected. To verify that it worked correctly, let's look at the sample log and histogram.
[11]:
load_data, bins = data_2.histogram()
fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')
fig.plot_sample_log(data_2, 'Sample Temp')
fig.show()
fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')
fig.plot_from_histogram(bins, load_data, [0], 'filtered (loaded), ')
fig.show()
WARNING: The target 0.0 is before the first frame start time 0.99375522 seconds. Difference is 0.99375522 seconds
WARNING: The target 3.562801893910883 is after the last frame start time 3.1737624 seconds. Difference is 0.38903949391088277 seconds
WARNING: The target 4.002486868507462 is after the last frame start time 3.1737624 seconds. Difference is 0.8287244685074615 seconds
WARNING: The target 3.4457178026454436 is after the last frame start time 3.1737624 seconds. Difference is 0.2719554026454434 seconds
WARNING: The target 3.864694414131656 is after the last frame start time 3.1737624 seconds. Difference is 0.690932014131656 seconds
WARNING: The target 4.129729750757985 is after the last frame start time 3.1737624 seconds. Difference is 0.9559673507579847 seconds
The final feature of the filters is the ability to delete all of them in one command. To demonstrate this, let's first add some more filters to our data.
[12]:
data_2.only_keep_data_time_between('one', 1., 1.4)
data_2.only_keep_data_time_between('two', 2., 2.9)
print(data_2.report_filters())
Filters(time_filters=TimeFilters(keep_filters=[Filter(name='one', start=1.0, end=1.4), Filter(name='two', start=2.0, end=2.9)], remove_filters=[]), sample_log_filters=[Filter(name='Sample Temp', start=1.0, end=4.0)], peak_property=PeakProperty(Amplitudes=0.0), histogram_settings=HistogramSettings(min_time=0.0, max_time=32.768, num_bins=2048))
As you can see, we now have three filters. To remove all of them, we can use the clear_filters command.
[13]:
data_2.clear_filters()
print(data_2.report_filters())
Resolution: 0.016 μs
Filters(time_filters=TimeFilters(keep_filters=[], remove_filters=[]), sample_log_filters=[], peak_property=PeakProperty(Amplitudes=0.0), histogram_settings=HistogramSettings(min_time=0.0, max_time=32.768, num_bins=2048))
To verify that they have all been cleared, let's look at the sample log and histogram.
[14]:
clear_data, bins = data_2.histogram()
fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')
fig.plot_sample_log(data_2,'Sample Temp')
fig.show()
fig = Figure(y_label='Counts')
fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')
fig.plot_from_histogram(bins, clear_data, [0], 'cleared, ')
fig.show()
As expected, this data is identical to the unfiltered data.
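Comparing plots by eye can be backed up with a numerical check. The sketch below is one possible approach, assuming the counts for a single detector have been flattened into a plain list and that a filter can only ever remove counts from a bin. The `compare_counts` helper and the count values are hypothetical.

```python
def compare_counts(filtered, unfiltered):
    """Classify a filtered histogram relative to the unfiltered one."""
    if len(filtered) != len(unfiltered):
        raise ValueError("histograms must share the same bins")
    # Assumption: filtering only removes events, so no bin should grow.
    if any(f > u for f, u in zip(filtered, unfiltered)):
        return "invalid"
    if list(filtered) == list(unfiltered):
        return "identical"
    return "reduced"


unfiltered = [120, 95, 70, 52]  # hypothetical counts per bin

print(compare_counts([80, 60, 45, 30], unfiltered))  # a filtered histogram
print(compare_counts(unfiltered, unfiltered))        # after clearing filters
```

If the histograms are NumPy arrays, `np.array_equal` would be a natural way to test for the "identical" case directly.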