{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1a7118fc-0e6c-4376-9888-f526f1838b42",
   "metadata": {},
   "source": [
    "# MuonDataLib Tutorial 6: Advanced Filtering\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e4706a89-6c64-4125-948b-00dd82de4e14",
   "metadata": {},
   "source": [
    "In the previous tutorials we used the sample logs and times to filter the event data. However, there are some additional features that are worth discussing. \n",
    "\n",
    "First lets set up the data with a sample log. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17a8a944-357c-4b1d-99ae-712c4c99cb34",
   "metadata": {},
   "outputs": [],
   "source": [
    "from MuonDataLib.data.loader.load_events import load_events\n",
    "from MuonDataLib.plot.basic import Figure\n",
    "from MuonDataLib.data.utils import create_data_from_function\n",
    "import os\n",
    "import numpy as np\n",
    "\n",
    "file_name = 'HIFI00195790.nxs'\n",
    "input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)\n",
    "data = load_events(input_file, 64)\n",
    "\n",
    "frame_start_times = data.get_frame_start_times()\n",
    "\n",
    "def osc(x, amp, omega, phi):\n",
    "    return amp*np.sin(omega*x + phi) + amp*1.1\n",
    "\n",
    "start = frame_start_times[0]\n",
    "end = frame_start_times[-1]+1\n",
    "step = (frame_start_times[-1]-frame_start_times[0])/40\n",
    "\n",
    "x, y = create_data_from_function(start, end, step, [3, 6.1, 0.91], osc, seed=1)\n",
    "data.add_sample_log(\"Sample Temp\", x, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "162df7e5-e818-469c-be33-4499a5b9c383",
   "metadata": {},
   "source": [
    " Next lets create an unfiltered histogram"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2e9164ba-46ab-4277-8189-8e69b606dadc",
   "metadata": {},
   "outputs": [],
   "source": [
    "no_filter_hist, bins = data.histogram()\n",
    "fig = Figure(y_label='Counts')\n",
    "fig.plot_from_histogram(bins, no_filter_hist, [0])\n",
    "fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34e7a894-447e-4839-81e8-65113dd5deed",
   "metadata": {},
   "source": [
    "## Mix and match filters\n",
    "\n",
    "In the previous tutorial we used multiple filters of the same type (e.g. 'only_keep_data_time_between`). However, it is possible to use any combination of filters. Lets consider an experiment and we are only interested in Temperatures between $1$ and $4$ Kelvin, but we also know that the detectors had an error between $2.2$ and $3.3$ seconds. Then we can add the following two filters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c1ba64a6-bade-4df0-b329-9611059569ed",
   "metadata": {},
   "outputs": [],
   "source": [
    "data.remove_data_time_between('detector_error', 2.2, 3.3)\n",
    "data.keep_data_sample_log_between('Sample Temp', 1, 4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b61f30dd-140f-4f7f-9471-f86dcaca8af6",
   "metadata": {},
   "source": [
    "Next lets look at the sample log"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad24827c-78ca-4eca-9eae-eca7a04933cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "hist_mix, bins = data.histogram()\n",
    "fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')\n",
    "fig.plot_sample_log(data, 'Sample Temp')\n",
    "fig.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ed1f0a6-b127-46f6-9879-3efa56754d36",
   "metadata": {},
   "source": [
    "The result is as expected. Next lets look at the impact it has on the histograms "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3e1cc928-47c1-4b36-848b-f6249b7f08de",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = Figure(y_label='Counts')\n",
    "fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')\n",
    "fig.plot_from_histogram(bins, hist_mix, [0], 'filtered (mix), ')\n",
    "fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bb52e27e-c50f-4342-86fd-2709a69a150b",
   "metadata": {},
   "source": [
    "We can see that the filters have removed some data as expected."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc61b639-09c5-435e-aab3-5c9beff539d4",
   "metadata": {},
   "source": [
    "## Managing filters\n",
    "\n",
    "When comparing two different samples with each other, we would need to make sure that both data sets use the same filters. This can be done by using the `save_filters` command"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c39ecc1",
   "metadata": {},
   "outputs": [],
   "source": [
    "filter_file = os.path.join('..', 'Output_files', 'filters.json')\n",
    "data.save_filters(filter_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a60951df",
   "metadata": {},
   "source": [
    "The argument defines where the file will be saved.\n",
    "\n",
    "The next step would be to load the second data set (we delete the original data set and load in the same one due to storage issues with github, but we will pretend its a second data set)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "244f9ae3-badd-4b3d-a4eb-e649b366b3ce",
   "metadata": {},
   "outputs": [],
   "source": [
    "del data\n",
    "\n",
    "from MuonDataLib.data.loader.load_events import load_events\n",
    "from MuonDataLib.plot.basic import Figure\n",
    "from MuonDataLib.data.utils import create_data_from_function\n",
    "import os\n",
    "\n",
    "file_name = 'HIFI00195790.nxs'\n",
    "input_file = os.path.join('..', '..', '..', '..', 'test', 'data_files', file_name)\n",
    "data_2 = load_events(input_file, 64)\n",
    "\n",
    "frame_start_times = data_2.get_frame_start_times()\n",
    "\n",
    "def osc(x, amp, omega, phi):\n",
    "    return amp*np.sin(omega*x + phi) + amp*1.1\n",
    "\n",
    "start = frame_start_times[0]\n",
    "end = frame_start_times[-1]+1\n",
    "step = (frame_start_times[-1]-frame_start_times[0])/50\n",
    "\n",
    "x, y = create_data_from_function(start, end, step, [3, 6.1, 0.91], osc, seed=2)\n",
    "data_2.add_sample_log(\"Sample Temp\", x, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d6f96c97-7fbc-40e9-a8b9-2340cfbabb3e",
   "metadata": {},
   "source": [
    "The next step is to load the filters from earlier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5148d44",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_2.load_filters(filter_file)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69374769",
   "metadata": {},
   "source": [
    "To check if the filters have loaded as expected, we can use the 'report_filters` command"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92268613-21d5-43ba-a145-9fa578fcc6a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(data_2.report_filters())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6ddc0ce4",
   "metadata": {},
   "source": [
    "We can see that both filters are present. However, in this second experiment we didn't have the detector error. So we can remove that filter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d64683f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_2.delete_remove_data_time_between('detector_error')\n",
    "print(data_2.report_filters())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "21282a0c-ae36-4e82-b296-e14e0fadda6c",
   "metadata": {},
   "source": [
    "We now have one filter as expected. To verify that it worked correctly lets look at the sample log and histogram."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "94d05827-49ef-48f6-8aab-4011f0b39aaf",
   "metadata": {},
   "outputs": [],
   "source": [
    "load_data, bins = data_2.histogram()\n",
    "fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')\n",
    "fig.plot_sample_log(data_2, 'Sample Temp')\n",
    "fig.show()\n",
    "\n",
    "fig = Figure(y_label='Counts')\n",
    "fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')\n",
    "fig.plot_from_histogram(bins, load_data, [0], 'filtered (loaded), ')\n",
    "fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3aa6bb1c-d1cb-416d-ab66-31352434eb13",
   "metadata": {},
   "source": [
    "The final feature of the filters is the ability to delete all of them in one command. To best demonstrate lets first add some more filters to our data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aff3907f-79ef-4938-bd31-5db8e42ee2ed",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_2.only_keep_data_time_between('one', 1., 1.4)\n",
    "data_2.only_keep_data_time_between('two', 2., 2.9)\n",
    "print(data_2.report_filters())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94ab3b6b-aab2-478d-94d4-d09bdb6f696e",
   "metadata": {},
   "source": [
    "As you can see we now have three filters. To remove all of them we can use the `clear_filters` command"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "758fc301-51c3-45fc-95d7-2bbf72b7ce0f",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_2.clear_filters()\n",
    "print(data_2.report_filters())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da9496c3-86da-4442-bac8-dd511ee94e31",
   "metadata": {},
   "source": [
    "To verify that they have all been cleared lets look at the sample log and histogram."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4a5e299e-6974-4d9f-bcb4-f4bc8533d629",
   "metadata": {},
   "outputs": [],
   "source": [
    "clear_data, bins = data_2.histogram()\n",
    "fig = Figure(y_label='Temperature (Kelvin)', x_label='Time (seconds)')\n",
    "fig.plot_sample_log(data_2,'Sample Temp')\n",
    "fig.show()\n",
    "\n",
    "fig = Figure(y_label='Counts')\n",
    "fig.plot_from_histogram(bins, no_filter_hist, [0], 'unfiltered, ')\n",
    "fig.plot_from_histogram(bins, clear_data, [0], 'cleared, ')\n",
    "fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a4f9d6c-b737-41aa-b23b-bd9aaece1e40",
   "metadata": {},
   "source": [
    "As expected this data is identical to the unfiltered data. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}