Changes Saved

A new puff detection algorithm in Flika

I developed a new puff detection algorithm using clustering in December of 2014.  These are the instructions on how to use it in Flika and explanations on the algorithm .  

Instructions

1) Open a tiff stack in Flika.  The stack must have a period where no puffs are occurring somewhere in the video.  It also must have little to no global rise in signal, because this increases the amplitude of the noise and disrupts the algorithm.

2) Transform the stack into an F/F0 movie by subtracting a scalar background fluorescence level and then dividing each frame by image formed by taking the average of quiescent frames (usually at the beginning of the movie).  

3) Apply a high pass filter to the movie.  I usually use a butterworth filter.  This serves two purposes.  It removes low frequency events (such as global rise) and centers the noise around 0.  

4) Ratio each frame by an image formed by taking the standard deviation during the quiescent period.  This is done so that the noise is Gaussian distributed with a mean of 0 and a standard deviation of 1. 

5) Threshold your processed movie to generate a binary window so that only a few pixels of noise are true.  

6) Using the F/F0 movie from step 2 as the data window and the binary window from step 5, run the "threshold_cluster" function. This function will calculate the density of the each true pixel and the distance to the closest true pixel with a higher density.  

7) The function will make a plot like figure 1B in Rodriguez & Laio, 2014, plotting density (x axis) vs distance to closest true pixel (y axis). Draw a loop around the points you consider to have sufficient density and distance (top right corner), and the algorithm will cluster all the true pixels. You can reloop the points to recluster. When you're satisfied with the clustering, press 'Enter'. The algorithm will use those clustered pixels to set bounds in x, y, and t for fitting gaussians. If multiple events overlap in their x, y, and t 'windows', the same number of gaussians will be fit.

Sample Code

open_file()
subtract(660) #subtract baseline
ratio(179,244,'average'); #ratio(first_frame, nFrames, ratio_type), now we are in F/F0
set_value(1,0,175) #remove the pre-laser baseline. This would otherwise disturb our filter
data_window=set_value(1, 433, 1143) #remove the UV flash for the same reason
high_pass=butterworth_filter(1,.003,1,keepSourceWindow=True) # High pass filter
low_pass=image_calculator(data_window,high_pass,'Subtract',keepSourceWindow=True) # we will use the low pass image as an approximation for the variance of the photon noise.
low_pass.image[low_pass.image

Algorithm Notes



Step 5



Step 5 requires a threshold.  How is the threshold determined?  You want it low enough that all the puffs will be detected, but not so low that the algorithm has to sort through lots of false positives and slows down.  To calculate the threshold needed so that only the top 0.1% of the noise is included, use the following code (in python). 



import scipy.stats as st
thresh=st.norm.ppf(.999) # ppf is the probability density function of a normalized Gaussian. 0.1% of the noise is above thresh, 99.9% is below.
print(thresh)

In this example I chose the top 0.1%, which seems arbitrary.  It is, in fact.  The closer to 0 you set this threshold, the fewer pixels you will have to process.  This speeds up processing time, but also decreases puff detection probability (you are more likely to miss a real signal).  It's a trade off between speed and accuracy.  

 In a puff, True pixels will have a higher density than what would be expected by chance if only noise were present.  In other words, if you draw a circle around a True pixel, inside that circle you will find more True pixels if a puff is present than if no puff is present.  This is the motivation for the clustering algorithm.  

The algorithm works by first ascribing to each True pixel a density based on the number of True pixels within a certain radius (in space and time).  I have been using a radius of 5. Because pixels that are below a certain density are very likely not puffs, I threshold by density.  This unfortunately introduces a second arbitrary threshold into the algorithm.  However, we can again put this in terms of the probability of noise by determining the probability distribution of finding n True pixels inside your sphere.  If you have a radius of 5, you have about 43πr3≃52443πr3≃524 pixels.  Using the the binomial distribution with 524 trials and a probability of .001, we see that the most likely probability is 0 with a probability of .592.  If we look at the cumulative distribution function of the binomial distribution, we see that there is a 0.021% chance that there will be 4 or more True pixels in a randomly placed sphere.  This probability drops of dramatically as you increase the number of True pixels. 

from scipy.stats import binom
nPixels=524
plot(binom.pmf(np.arange(0,nPixels),nPixels,.001)) #this is the probability distribution for the number of True pixels given a number of pixels
print("There is a probability of .999 that a given True pixel will be surrounded by less than {} True pixels ".format(np.where(binom.cdf(np.arange(0,nPixels),nPixels,.001)>.999)[0][0]))

These probabilities seem very small, but we will be looking at a large number of events, so we need to account for them when deciding which density threshold to use.  If each frame contains 128x128 = 16,384 pixels, and 0.1% of them are True (assuming noise), each frame will have 16.384 randomly True pixels, so we will be drawing that many spheres per frame.  If we want a false positive rate of 1 every 100 frames, or .01/frame, then we are looking for the probability that given 16.384*100=1,638.4 chances, 1 of them is selected.  This means we want the first point in the cumulative density function which exceeds 1-1/1638.4=.99939.  This point occurs when the number of True pixels inside the circle equals 3.  

cdf = binom.cdf(np.arange(0,nPixels),nPixels,.001)
cdf = cdf**1638.4