Dependence of the slope on R, Q and the fitting radius

>> pendientes=simul_pendientes(6000:250:8500,-.5:.2:1,200);
>> R=6000:250:8500;
>> Q=-.5:.2:1;
>> imagesc(Q,R,pendientes); set(gca,'YDir','normal')

>> mesh(Q,R,pendientes)

So the slope changes approximately linearly with R, and almost not at all with Q.
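
A quick way to check the linearity (a sketch; it assumes that rows of pendientes index R and columns index Q, as in the imagesc call above):

>> for iq=1:length(Q), p=polyfit(R',pendientes(:,iq),1); fprintf('Q=%.1f: slope %.3g, intercept %.3g\n',Q(iq),p(1),p(2)); end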

Dependence on the fitting radius:

Now we look at the dependence on the fitting radius:
>> [pendientes_radioaj,p_radioaj]=simul_pendientes_radioajuste(8000,.7,1000:200:3000,100);
>> p_radioaj'
ans =
1.0e-005 *
0.0000 0.1023 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000

All very significant.

>> radiosaj=1000:200:3000;
>> plot(radiosaj,pendientes_radioaj)

>> save pendientes pendientes R Q pendientes_radioaj radiosaj

So there is also a big change, depending on the fitting radius.


Effect of having different dispersion in the center and in the periphery

We do not see a significant widening of the pattern. However, a very relevant bias appears:

Simulations with std 10 in the central 1.5 mm (radius), and std 20 in the rest (up to 3 mm radius):

>> fig_pap_simulaciones_ruidoperiferia_01(5,1)

Simulations with std 10 in the central 1.5 mm (radius), and std 40 in the rest (up to 3 mm radius):

>> fig_pap_simulaciones_ruidoperiferia_01(5,1)
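
For reference, a minimal sketch of how such radius-dependent noise could be generated (the grid and variable names are hypothetical; the actual simulations are done inside fig_pap_simulaciones_ruidoperiferia_01):

% Noise map in microns: std 10 inside r<=1500, std 20 (or 40) outside, up to r=3000
[xg,yg]=meshgrid(-3000:50:3000);
r=sqrt(xg.^2+yg.^2);
sigma=10*ones(size(r)); sigma(r>1500)=20;
ruido=sigma.*randn(size(r));
ruido(r>3000)=NaN;          % outside the 3 mm radius
imagesc(ruido); axis equal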

Why fishes behave “better” than Condorcet

We see this problem in Sumpter et al. (2008), Consensus Decision Making by Fish, Curr Biol.

Condorcet's theorem just gives the probability that, if we make N trials with probability p of getting the right answer in each trial, more than half of them get the right answer. Thus, each trial is completely independent, and the solution is the same as for problems of the “taking balls from a box” kind. Thus, if the number of fishes is odd, the probability that the majority takes the “right” decision is

P = \sum_{i=(N+1)/2}^{N} \binom{N}{i} p^i (1-p)^{N-i}.

If the number of fishes is even, we have the problem of ties. Assuming that ties are resolved by tossing a fair coin (so each option has 50% probability of being chosen), we have

P = \sum_{i=N/2+1}^{N} \binom{N}{i} p^i (1-p)^{N-i} + \frac{1}{2}\binom{N}{N/2} p^{N/2} (1-p)^{N/2}.
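
A minimal MATLAB sketch of both formulas (the function name is hypothetical; binopdf is from the Statistics Toolbox):

function P=condorcet_prob(N,p)
% Probability that the majority of N independent voters, each right
% with probability p, takes the right decision. For even N, ties are
% resolved by a fair coin.
k=floor(N/2)+1:N;             % strict majorities
P=sum(binopdf(k,N,p));
if mod(N,2)==0
    P=P+.5*binopdf(N/2,N,p);  % add half the tie probability
end

For instance, condorcet_prob(5,.6) gives the probability that at least 3 of 5 fishes take the right decision.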

On the other hand, as shown by Ward et al. (2008), Quorum decision-making facilitates… PNAS, the decisions of different fishes in the shoal are not independent. Once one or more fishes decide on one option, the probability that others go for that option increases (in fact, Sumpter et al. 2008 also use this formula in another part of their paper). Therefore, it is to be expected that the probability increases faster with N for fishes. It is as if p in the above formula were not constant, but increased.

Possible objective functions for the many-eyes problem

Version 1: Probability that at least one fish detects a predator at a given distance.

A predator arrives at a certain angle. The probability that a fish sees the predator, if there is no occlusion by another fish, is p. If N fishes have a free view at the angle of the arriving predator, the probability that at least one fish sees the predator is

1-(1-p)^N.

This probability increases with N until it saturates.

If we bin the 2D space into 360 angles, and we assume equal probability of the predator arriving at each angle, then the probability that at least one fish sees the predator is

P = \frac{1}{360} \sum_{i=1}^{360} \left[ 1-(1-p)^{N_i} \right],

where N_i is the number of fishes with a free line of vision at angle i.
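
A sketch of this objective for a hypothetical configuration (Ni would come from the actual fish positions and occlusions):

% 50 fishes assigned at random to the 360 angular bins
Ni=histc(randi(360,50,1),1:360);   % fishes with free line of vision per angle
p=.1;                              % single-fish detection probability
P=mean(1-(1-p).^Ni)                % average over the 360 equiprobable angles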

The advantage of this cost function is that it takes into account both the benefit of many fishes seeing a certain point (but with saturation, which I find very reasonable) and the risk of all fishes looking at the same place, leaving other angles without surveillance.

Since sometimes the shoal will not react until several fish have started evasion, this cost may be generalized to the case when at least n fishes see the predator. However, I think the formula is not simple.

A possible problem with this objective is that it may depend on the binning of the angles. My intuition says it will not depend on that (at least not strongly), but at first sight I do not see an easy way of proving it.

 

Version 2: Distance at which the probability of detecting a predator passes a threshold.

We may assume that the probability that a fish sees the predator (p) depends on distance in a simple way. We may assume that, for long distances (where the probability is much lower than one), the probability grows linearly with the angle subtended by the predator. This angle is inversely proportional to distance, so p=k/d, where k is a constant and d is the distance. Then, for a given angle, the probability that at least one fish sees the predator is

H=1-(1-k/d)^N.

For a given (constant) value of H, solving 1-(1-k/d)^N = H for d, we have that the distance depends on N as

d = \frac{k}{1-(1-H)^{1/N}}.
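
A quick look at how this detection distance grows with group size (a sketch; the values of k and H are arbitrary):

>> N=1:100; k=1; H=.5;
>> d=k./(1-(1-H).^(1./N));
>> plot(N,d)    % roughly linear in N for large N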

The two versions may not be equivalent, because they weight differently the benefit of distributing the surveillance equally across all angles. But I would not expect strong differences…

Figures for Gonzalo’s talk in Tarragona

>> load('c:\hipertec\AblationElegans\Datos\datos_celegans_paradinamica.mat')
>> V0=zeros(279,1);
>> V0([40 41])=1;
>> M=conectividad2matrizsistema(todas.A_chem,todas.A_ej,todas.GABA,1,1,30);
>> V=matrizsistema2V_num(M,V0,[-1 1],.0005,1000);
>> imagesc(V)
>> caxis([0 .3])

>> resultados=matrizsistema2experimentoablacion(M,V0,[-1 1],54:57,.0005,1000);
>> load('c:\hipertec\AblationElegans\Datos\info_redes.mat')
>> [roc,umbral]=resultados2pseudoroc(resultados,chemotaxis1.buenas,[chemotaxis1.sensiales chemotaxis1.finales],1);

Good.

>> find(roc(:,2)>.8,1)
ans =
36
>> umbral(36)
ans =
0.0017

>> redencontrada=find(mean(resultados,2)>.0017);
>> pintared_posreal(todas.A_chem,todas.A_ej,todas.nombres,1:279,chemotaxis1.sensiales,chemotaxis1.finales,redencontrada)

New Figure 2

>> fig_pap_elegans_16(1,zeros(100))

Omega vs Deltax, for the parameters alpha=0.05 beta=1.5, with the new generalized omega

>> clear
>> load datos_celegans_struct
>> [pos,omega_general,costes,x]=coste2pos_restofijas(todas.A*.05,todas.M*1.5+todas.S,todas.f,todas.pos_real,1);
>> desv=abs(todas.pos_real-pos);
>> plot(desv,omega_general,'.')

To start with, only one outlier. Note that the new omega depends on the exponent, so this by itself favors exponent 1.

Bad news for the fit to the mean (without the outlier):

>> buenas=(desv<.2 | omega_general<20);
>> sum(~buenas)
ans =
1
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega_general(buenas))
xi =
4.5068
b =
0.1092

And of course, again great news for the fit to the envelope (again, without the outlier):

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega_general(buenas),1,4);

>> plot(log10(desv_acum),log10(omega_media),'.-')

I am starting to believe the theory that Bayes is especially affected by the most deviated points, and that the envelope captures that.

 

omega vs Deltax for the anatomical alpha and beta

The problem: This plot does not look nice for the case alpha=beta=1/29.3.

>> load datos_celegans_struct
>> pos_cuad=coste2pos_cuad(todas.A/29.3,todas.M/29.3+todas.S,todas.f);
>> omega=sum([todas.A/29.3 todas.M/29.3+todas.S],2);
>> desv=abs(todas.pos_real-pos_cuad);
>> plot(desv,omega,'.')

It is not a problem for exponent 2, because the metric is also bad. But it does not look nice for exponent 1, for which the metric is excellent:

>> [pos,omega_general,costes,x]=coste2pos_restofijas(todas.A/29.3,todas.M/29.3+todas.S,todas.f,todas.pos_real,1);
>> desv=abs(todas.pos_real-pos);
>> plot(desv,omega,'.')

The reason: omega is not valid in this case. For example (blue is the actual cost, red is the cost approximated by omega):

>> n=4;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=27;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=54;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

And many more cases.

Solution: I define a “generalized omega”, which is simply the one that best fits the cost in the direction of the deviation. With this generalized omega the costs look like this:

>> n=4;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega_general(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=27;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega_general(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=54;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega_general(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))
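
For reference, a minimal sketch of how such a generalized omega could be computed from the cost curves (a guess at the core of the computation, not necessarily what coste2pos_restofijas does internally):

% Least-squares slope of the cost on the side of the real position
n=4;
lado=sign(todas.pos_real(n)-pos(n));    % direction of the deviation
ind=sign(x-pos(n))==lado;               % grid points on that side
a=abs(x(ind)-pos(n));
c=costes(n,ind)-min(costes(n,:));       % cost measured from the minimum
omega_gen=sum(a.*c)/sum(a.^2)           % best slope through the minimum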

With this generalized omega, the plot looks much better:

>> desv=abs(todas.pos_real-pos);
>> plot(desv,omega_general,'.')

In fact, the two outliers are AVAL and AVAR:

>> find(desv>.3 & omega_general>3)
ans =
54
55

And for these the cost is not so well approximated (see above the cost for neuron 54).

Of course, it can still improve.

Exponent from the mean (removing the two outliers):

>> buenas=(desv<.3 | omega_general<3);
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega_general(buenas))
xi =
1.5765
b =
0.0995

Exponent from the envelope:
>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega_general(buenas),1,4);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Not a great fit, but again near the Bayesian parameter. If you ask me, the envelope is just fucking lucky.

 

Study of the cumulative distribution of deviations in omega-Deltax plots for C. elegans

With outliers:
>> [desv_acum,omega_media]=omegadesv_acumuladas(desv,omega,1,7);

>> plot(log10(desv_acum),log10(omega_media),'.-')

We get exponent 1.

I reduce the probability, to leave the outliers out. I need to go down to 0.7, and it is a disaster:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv,omega,.7,7);

>> plot(log10(desv_acum),log10(omega_media),'.-')

I remove the outliers:

>> buenas=omega<20 | desv<.2;
>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,7);

>> plot(log10(desv_acum),log10(omega_media),'.-')

It is strongly influenced by the last bin, which contains only one point. I reduce the number of bins:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,5);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Still not very good. I use a non-equispaced binning:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,[0 10 20 40 80]);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Another binning:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,[0 5 10 20 40 80]);


>> plot(log10(desv_acum),log10(omega_media),'.-')

Essentially the same…

I do it with yet another binning, which keeps the same number of data points in each bin (except perhaps the last one, which may not be completely full).

>> [desv_acum,omega_media]=omegadesv_acumuladas_binningpornumero(desv(buenas),omega(buenas),1,50);

>> plot(log10(desv_acum),log10(omega_media),'.-')
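
For reference, a sketch of the equal-count binning idea (only a guess at the core of omegadesv_acumuladas_binningpornumero):

% Sort by omega and cut into bins of ~50 points (last bin may be smaller)
[om_s,orden]=sort(omega(buenas)); d_s=desv(buenas); d_s=d_s(orden);
nbin=50; nb=ceil(length(om_s)/nbin);
for c=1:nb
    ind=(c-1)*nbin+1:min(c*nbin,length(om_s));
    omega_media(c)=mean(om_s(ind));
    desv_media(c)=mean(d_s(ind));
end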

Using the center-of-mass position of each neuron with the rest fixed in their real positions does not change these results:

>> clear
>> load datos_celegans_struct
>> [pos,omega_general,costes,x]=coste2pos_restofijas(todas.A*.05,todas.M*1.5+todas.S,todas.f,todas.pos_real,1);
>> desv=abs(pos-todas.pos_real);
>> omega=sum([todas.A*.05 todas.M*1.5+todas.S],2);
>> plot(desv,omega,'.')

>> buenas=omega<20 | desv<.2;
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega(buenas))
xi =
5.4148
b =
0.1103

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,[0 5 10 20 40 80]);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Study of the mean deviations in omega-Deltax plots for C. elegans

I have two algorithms. One minimizes the mean squared error in Deltax, and the other maximizes the log-likelihood assuming that the deviations are exponentially distributed (the first one is equivalent to assuming that they are normally distributed around the mean; this is clearly not true, so the second one should work better).
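
A sketch of the likelihood-based fit, with desv and omega as elsewhere in these notes (the mean model here, mu = b*omega^(-1/(xi-1)), is a hypothetical placeholder; the actual model inside omegadesv2bestfittingexponent_exp may differ):

% Negative log-likelihood for Deltax ~ Exponential with mean mu(omega)
% par = [xi b]; hypothetical mean model mu = par(2)*omega.^(-1/(par(1)-1))
negloglik=@(par) sum(log(par(2)*omega.^(-1/(par(1)-1))) + ...
    desv./(par(2)*omega.^(-1/(par(1)-1))));
par=fminsearch(negloglik,[2 .3])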

I first calibrate the method with a synthetic elegans with exponent 2:

>> load('e:\hipertec\optimesto\PresentacionesPapers\PaperMolon\Figuras\elegans_sintetico_pot2_r06_todos.mat')
>> load datos_celegans_struct
>> pos_cuad=coste2pos_cuad(todas.A*.05,todas.M*1.5+todas.S,todas.f);
>> desv_sint=abs(pos_todos(:,1)-pos_cuad);
>> omega=sum([todas.A*.05 todas.M*1.5+todas.S],2);
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv_sint,omega)
xi =
1.9973
b =
0.3085

>> [xi,b]=omegadesv2bestfittingexponent(desv_sint,omega)
xi =
2.4142
b =
0.2777

The algorithm that assumes exponential distribution of deviations and finds maximum log-likelihood works better, as expected.

Now, with the real data from C. elegans:

>> desv=abs(todas.pos_real-pos_cuad);
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv,omega)
xi =
5.8572
b =
0.1184

>> [xi,b]=omegadesv2bestfittingexponent(desv,omega)
xi =
7.2291
b =
0.1122

Nearer with the exponential one, but still around 5.

Now I use the optimal positions for linear cost:

>> pos_lin=coste2pos_num_ruido(todas.A*.05,todas.M*1.5+todas.S,todas.f,1,0,1,0);
>> desv_lin=abs(todas.pos_real-pos_lin);
>> clf
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv_lin,omega)
xi =
5.2203
b =
0.1324

Still not working…

I remove the three outliers and try again (both with the optimum for quadratic cost and with the optimum for linear cost):

>> buenas=omega<20 | desv<.2;
>> sum(~buenas)
ans =
3
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega(buenas))
xi =
4.7116
b =
0.1235

>> [xi,b]=omegadesv2bestfittingexponent_exp(desv_lin(buenas),omega(buenas))
xi =
4.1978
b =
0.1388

So 4 is the best one.

One possibility is to try other probability distributions that may describe the data better, and see whether that takes us nearer to 2.