{ "cells": [ { "cell_type": "markdown", "id": "1a7118fc-0e6c-4376-9888-f526f1838b42", "metadata": {}, "source": [ "# MuonDataLib Tutorial 4: Time Filtering\n" ] }, { "cell_type": "markdown", "id": "e4706a89-6c64-4125-948b-00dd82de4e14", "metadata": {}, "source": [ "In the previous tutorial we used the sample logs to filter the event data. However, it is also possible to filter the events based on time. Lets start by loading the data.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "9279a747-96b0-4a72-a466-ef32480709a7", "metadata": {}, "outputs": [], "source": [ "from MuonDataLib.data.loader.load_events import load_events\n", "from MuonDataLib.plot.basic import Figure\n", "import os\n", "\n", "file_name = 'HIFI00195790.nxs'\n", "input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)\n", "data = load_events(input_file, 64)\n" ] }, { "cell_type": "markdown", "id": "48f08ba5-bf7b-464f-b4ee-e18dd5ffb84c", "metadata": {}, "source": [ "Using the `get_frame_start_times` method will allow us to know that we are filtering times that exist within our data set." ] }, { "cell_type": "code", "execution_count": null, "id": "8ef10f7c-64bc-414f-ab52-42181aa3d200", "metadata": {}, "outputs": [], "source": [ "frame_start_times = data.get_frame_start_times()\n", "print(f'start time: {frame_start_times[0]}, end time: {frame_start_times[-1]}')" ] }, { "cell_type": "markdown", "id": "a0e57551-fdf9-4522-a53b-122258acaa53", "metadata": {}, "source": [ "The events occur from about $0.994$ until about $3.174$ seconds." ] }, { "cell_type": "markdown", "id": "517f73db-7c7c-4a46-a1ee-fcc30a115e75", "metadata": {}, "source": [ "We will add a single sample log to show how the time filters impact the data. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "d0dc61a3-a741-4703-b558-919d89555ccd", "metadata": {}, "outputs": [], "source": [ "def linear(x, m, c):\n", " return m*x + c\n", "from MuonDataLib.data.utils import create_data_from_function\n", "\n", "start = frame_start_times[0]\n", "end = frame_start_times[-1]+1\n", "step = (frame_start_times[-1]-frame_start_times[0])/40\n", "\n", "x, y = create_data_from_function(start, end, step, [3.1, 0.1], linear, seed=1)\n", "data.add_sample_log(\"field\", x, y)" ] }, { "cell_type": "markdown", "id": "162df7e5-e818-469c-be33-4499a5b9c383", "metadata": {}, "source": [ " Lets start by creating an unfiltered histogram" ] }, { "cell_type": "code", "execution_count": null, "id": "2e9164ba-46ab-4277-8189-8e69b606dadc", "metadata": {}, "outputs": [], "source": [ "no_filter_hist, bins = data.histogram()\n", "fig = Figure(y_label='Counts')\n", "fig.plot_from_histogram(bins, no_filter_hist, [0])\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "34e7a894-447e-4839-81e8-65113dd5deed", "metadata": {}, "source": [ "## Time filters - Keeping data between a range of times\n", "\n", "The first type of time filter we will look at is one that keeps data between two user specified time stamps. However, the removed data may be slightly before or after these time stamps. This is because the filters will remove all of the data contained within the frame that the filter is placed within." ] }, { "cell_type": "code", "execution_count": null, "id": "c1ba64a6-bade-4df0-b329-9611059569ed", "metadata": {}, "outputs": [], "source": [ "data.only_keep_data_time_between('filter_1', 1.1, 1.5)" ] }, { "cell_type": "markdown", "id": "b61f30dd-140f-4f7f-9471-f86dcaca8af6", "metadata": {}, "source": [ "The `only_keep_data_time_between` command is used to add a time filter that keeps the data between the two user specified values. 
The first argument is the name we want to give to the filter and the remaining arguments define the range of times that we want to keep. Let's look at how this changes a sample log. To apply the filter we must first generate (or save) a histogram.\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ad24827c-78ca-4eca-9eae-eca7a04933cf", "metadata": {}, "outputs": [], "source": [ "hist_keep_between, bins = data.histogram()\n", "fig = Figure(y_label='Field (MHz)', x_label='Time (seconds)')\n", "fig.plot_sample_log(data, 'field')\n", "fig.show()\n" ] }, { "cell_type": "markdown", "id": "0ed1f0a6-b127-46f6-9879-3efa56754d36", "metadata": {}, "source": [ "The only data that has been kept is between $1.1$ and $1.5$ seconds. We can add further regions of data that we want to keep by calling the `only_keep_data_time_between` method again. However, each filter must have a unique name. " ] }, { "cell_type": "code", "execution_count": null, "id": "0e4ae612-0b6c-49af-a0b7-58b14e137a9e", "metadata": {}, "outputs": [], "source": [ "data.only_keep_data_time_between('filter_2', 2.6, end + 0.01)\n", "hist_keep_between, bins = data.histogram()\n", "fig = Figure(y_label='Field (MHz)', x_label='Time (seconds)')\n", "fig.plot_sample_log(data, 'field')\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "2f94f7c9-4ffb-4be8-8a36-1a1fcec5c138", "metadata": {}, "source": [ "There are now two bands of kept data. The second filter used a value larger than the last frame start time so that the end of the data collection is included. \n", "\n", "Let's see how the time filters alter the histogram." 
] }, { "cell_type": "code", "execution_count": null, "id": "3e1cc928-47c1-4b36-848b-f6249b7f08de", "metadata": {}, "outputs": [], "source": [ "fig = Figure(y_label='Counts')\n", "fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')\n", "fig.plot_from_histogram(bins, hist_keep_between, [0], 'filtered time between, ')\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "bb52e27e-c50f-4342-86fd-2709a69a150b", "metadata": {}, "source": [ "We can see that the filter has removed some counts, as expected." ] }, { "cell_type": "markdown", "id": "1e50ad68-44bd-4227-9da3-de90329a4706", "metadata": {}, "source": [ "If we want to delete one of these keep-between time filters, we can do so with the command" ] }, { "cell_type": "code", "execution_count": null, "id": "53b163b1-2ea6-4085-a22c-26a24cca94b0", "metadata": {}, "outputs": [], "source": [ "data.delete_only_keep_data_time_between('filter_1')\n", "data.delete_only_keep_data_time_between('filter_2')" ] }, { "cell_type": "markdown", "id": "a0d23400-ba60-4477-8f84-2ad1bc3d3ea6", "metadata": {}, "source": [ "where the argument is the name of the filter to be deleted. We have removed both filters in this example, so the data should look like the unfiltered results." ] }, { "cell_type": "code", "execution_count": null, "id": "d0eb4a8c-b721-48f1-b97b-12adcecf4870", "metadata": {}, "outputs": [], "source": [ "hist_check, bins = data.histogram()\n", "fig = Figure(y_label='Counts')\n", "fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')\n", "fig.plot_from_histogram(bins, hist_check, [0], 'Check filter removed, ')\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "68819ecc-6c02-4cbc-89bf-3a82559d1a80", "metadata": {}, "source": [ "As you can see from the plot, the two histograms match. 
" ] }, { "cell_type": "markdown", "id": "fc61b639-09c5-435e-aab3-5c9beff539d4", "metadata": {}, "source": [ "## Time filters - Removing data between times\n", "\n", "The second type of time filter allows the us to specify the times we want to remove from the data. For example, if we know that the sample logs were not recorded for a period of time." ] }, { "cell_type": "code", "execution_count": null, "id": "5c39ecc1", "metadata": {}, "outputs": [], "source": [ "data.remove_data_time_between('filter_3', 2, 2.6)" ] }, { "cell_type": "markdown", "id": "a60951df", "metadata": {}, "source": [ "The `remove_data_time_between` command is used to add a filter that removes all of the data between the second and third arguments. Once again the first argument must be a unique name to identify the filter. Just like before we can add multiple filters by repeating the above command. \n" ] }, { "cell_type": "code", "execution_count": null, "id": "244f9ae3-badd-4b3d-a4eb-e649b366b3ce", "metadata": {}, "outputs": [], "source": [ "data.remove_data_time_between('filter_4', start, 1.1)" ] }, { "cell_type": "markdown", "id": "d6f96c97-7fbc-40e9-a8b9-2340cfbabb3e", "metadata": {}, "source": [ "In this case we want to remove the data from the begining of the collection, so we specify the value for the first frame. Lets check the sample log values" ] }, { "cell_type": "code", "execution_count": null, "id": "f5148d44", "metadata": {}, "outputs": [], "source": [ "hist_rm_data, bins = data.histogram()\n", "fig = Figure(y_label='Field (MHz)', x_label='Time (seconds)')\n", "fig.plot_sample_log(data, 'field')\n", "fig.show()\n" ] }, { "cell_type": "markdown", "id": "69374769", "metadata": {}, "source": [ "These look as expected. So lets look at the histogram. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "92268613-21d5-43ba-a145-9fa578fcc6a4", "metadata": {}, "outputs": [], "source": [ "fig = Figure(y_label='Counts')\n", "fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')\n", "fig.plot_from_histogram(bins, hist_rm_data, [0], 'filtered remove ranges, ')\n", "fig.show()" ] }, { "cell_type": "markdown", "id": "6ddc0ce4", "metadata": {}, "source": [ "Lets remove the filters," ] }, { "cell_type": "code", "execution_count": null, "id": "d64683f2", "metadata": {}, "outputs": [], "source": [ "data.delete_remove_data_time_between('filter_3')\n", "data.delete_remove_data_time_between('filter_4')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.5" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }