Figures for Gonzalo’s talk in Tarragona

>> load('c:\hipertec\AblationElegans\Datos\datos_celegans_paradinamica.mat')
>> V0=zeros(279,1);
V0([40 41])=1;
>> M=conectividad2matrizsistema(todas.A_chem,todas.A_ej,todas.GABA,1,1,30);
>> V=matrizsistema2V_num(M,V0,[-1 1],.0005,1000);
>> imagesc(V)
>> caxis([0 .3])

>> resultados=matrizsistema2experimentoablacion(M,V0,[-1 1],54:57,.0005,1000);
>> load('c:\hipertec\AblationElegans\Datos\info_redes.mat')
>> [roc,umbral]=resultados2pseudoroc(resultados,chemotaxis1.buenas,[chemotaxis1.sensiales chemotaxis1.finales],1);

Good.

>> find(roc(:,2)>.8,1)
ans =
36
>> umbral(36)
ans =
0.0017

>> redencontrada=find(mean(resultados,2)>.0017);
>> pintared_posreal(todas.A_chem,todas.A_ej,todas.nombres,1:279,chemotaxis1.sensiales,chemotaxis1.finales,redencontrada)

New Figure 2

>> fig_pap_elegans_16(1,zeros(100))

Omega vs Deltax, for the parameters alpha=0.05 beta=1.5, with the new generalized omega

>> clear
>> load datos_celegans_struct
>> [pos,omega_general,costes,x]=coste2pos_restofijas(todas.A*.05,todas.M*1.5+todas.S,todas.f,todas.pos_real,1);
>> desv=abs(todas.pos_real-pos);
>> plot(desv,omega_general,'.')

To start with, only one outlier. Note that the new omega depends on the exponent, so this by itself favors exponent 1.

Bad news for the fit to the mean (without the outlier):

>> buenas=(desv<.2 | omega_general<20);
>> sum(~buenas)
ans =
1
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega_general(buenas))
xi =
4.5068
b =
0.1092

And of course, again great news for the fit to the envelope (again, without the outlier):

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega_general(buenas),1,4);

>> plot(log10(desv_acum),log10(omega_media),'.-')

I am starting to believe the theory that Bayes is especially affected by the most deviated neurons, and that the envelope is what captures this.

 

omega vs Deltax for the anatomical alpha and beta

The problem: This plot does not look nice for the case alpha=beta=1/29.3.

load datos_celegans_struct
pos_cuad=coste2pos_cuad(todas.A/29.3,todas.M/29.3+todas.S,todas.f);
omega=sum([todas.A/29.3 todas.M/29.3+todas.S],2);
desv=abs(todas.pos_real-pos_cuad);
plot(desv,omega,'.')

It is not a problem for exponent 2, because there the metric is bad anyway. But it does not look nice for exponent 1, for which the metric is excellent:

>> [pos,omega_general,costes,x]=coste2pos_restofijas(todas.A/29.3,todas.M/29.3+todas.S,todas.f,todas.pos_real,1);
>> desv=abs(todas.pos_real-pos);
>> plot(desv,omega,'.')

The reason: omega is not valid in this case. For example (blue is the actual cost, red is the cost approximated by omega):

>> n=4;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=27;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=54;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

And many more cases.

Solution: I define a “generalized omega”, which is simply the slope that best fits the cost profile along the direction of the deviation. With this generalized omega the costs look like this:

>> n=4;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega_general(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=27;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega_general(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))

>> n=54;clf; plot(x,costes(n,:)); hold on; plot(x,abs(x-pos(n))*omega_general(n),'r'); ejes=axis; plot(todas.pos_real(n)*[1 1],ejes(3:4),'k');title(num2str(omega_general(n)))
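For reference, this is roughly how I reconstruct the generalized omega (a hypothetical sketch for illustration only; the actual values come straight out of coste2pos_restofijas): for each neuron, take the least-squares slope of its cost profile along the direction going from its cost minimum towards its real position.

% Hypothetical reconstruction of the generalized omega (illustration only)
omega_sketch=zeros(size(pos));
for n=1:length(pos)
    if todas.pos_real(n)>=pos(n), lado=x>=pos(n); else lado=x<=pos(n); end   % side of the deviation
    d=abs(x(lado)-pos(n));                 % distance from the cost minimum
    c=costes(n,lado)-min(costes(n,:));     % cost above the minimum
    omega_sketch(n)=(d*c')/(d*d');         % least-squares slope of c ~ omega*d
end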

With this generalized omega, the plot looks much better:

>> desv=abs(todas.pos_real-pos);
>> plot(desv,omega_general,'.')

In fact, the two outliers are AVAL and AVAR:

>> find(desv>.3 & omega_general>3)
ans =
54
55

And for these the cost is not so well approximated (see above the cost for neuron 54).

Of course, this can still be improved.

Exponent from the mean (removing the two outliers):

>> buenas=(desv<.3 | omega_general<3);
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega_general(buenas))
xi =
1.5765
b =
0.0995

Exponent from the envelope:
>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega_general(buenas),1,4);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Not a great fit, but again near the Bayesian parameter. If you ask me, the envelope is just fucking lucky.

 

Study of the cumulative distribution of deviations in omega-Deltax plots for C. elegans

With outliers:
>> [desv_acum,omega_media]=omegadesv_acumuladas(desv,omega,1,7);

>> plot(log10(desv_acum),log10(omega_media),'.-')

We get exponent 1.
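For reference, my reading of what omegadesv_acumuladas does (a hypothetical sketch, not the actual code): bin the neurons by omega and, within each bin, return the deviation below which a fraction prob of the points fall (so prob=1 gives the envelope, i.e. the largest deviation in the bin), together with the mean omega of the bin.

% Hypothetical sketch of omegadesv_acumuladas, given desv, omega, prob and bins
if isscalar(bins), bordes=linspace(min(omega),max(omega),bins+1); else bordes=bins; end
nb=length(bordes)-1; desv_acum=nan(1,nb); omega_media=nan(1,nb);
for k=1:nb
    en_bin=omega>=bordes(k) & omega<bordes(k+1);
    if k==nb, en_bin=en_bin | omega==bordes(k+1); end   % include the upper edge in the last bin
    if any(en_bin)
        ordenadas=sort(desv(en_bin));
        desv_acum(k)=ordenadas(max(1,ceil(prob*length(ordenadas))));   % prob-quantile (max if prob=1)
        omega_media(k)=mean(omega(en_bin));
    end
end

With this reading, lowering prob simply discards the most deviated points within each bin.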

I reduce the probability to leave the outliers out. I need to go down to 0.7, and it is a disaster:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv,omega,.7,7);

>> plot(log10(desv_acum),log10(omega_media),'.-')

I remove the outliers:

>> buenas=omega<20 | desv<.2;
>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,7);

>> plot(log10(desv_acum),log10(omega_media),'.-')

It is heavily influenced by the last bin, which only contains one point. I reduce the number of bins:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,5);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Still not very good. I use a non-equispaced binning:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,[0 10 20 40 80]);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Another binning:

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,[0 5 10 20 40 80]);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Essentially the same…

I do it with another binning that keeps the same number of data points in each bin (except perhaps the last one, which may not be completely full).
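A sketch of the assumed behaviour (the real omegadesv_acumuladas_binningpornumero may differ): sort by omega, cut into consecutive groups of the requested size, and compute the same per-bin quantities as above.

% Hypothetical sketch of equal-count binning (illustration only)
prob=1; puntos_por_bin=50;
[omega_ord,orden]=sort(omega(buenas));
desv_ord=desv(buenas); desv_ord=desv_ord(orden);
nb=ceil(length(omega_ord)/puntos_por_bin);
desv_acum=zeros(1,nb); omega_media=zeros(1,nb);
for k=1:nb
    idx=(k-1)*puntos_por_bin+1:min(k*puntos_por_bin,length(omega_ord));   % last bin may be partial
    ordenadas=sort(desv_ord(idx));
    desv_acum(k)=ordenadas(max(1,ceil(prob*length(ordenadas))));
    omega_media(k)=mean(omega_ord(idx));
end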

>> [desv_acum,omega_media]=omegadesv_acumuladas_binningpornumero(desv(buenas),omega(buenas),1,50);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Using the center-of-mass position of each neuron with the rest fixed in their real positions does not change these results:

>> clear
>> load datos_celegans_struct
>> [pos,omega_general,costes,x]=coste2pos_restofijas(todas.A*.05,todas.M*1.5+todas.S,todas.f,todas.pos_real,1);
>> desv=abs(pos-todas.pos_real);
>> omega=sum([todas.A*.05 todas.M*1.5+todas.S],2);
>> plot(desv,omega,'.')

>> buenas=omega<20 | desv<.2;
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega(buenas))
xi =
5.4148
b =
0.1103

>> [desv_acum,omega_media]=omegadesv_acumuladas(desv(buenas),omega(buenas),1,[0 5 10 20 40 80]);

>> plot(log10(desv_acum),log10(omega_media),'.-')

Study of the mean deviations in omega-Deltax plots for C. elegans

I have two algorithms. One minimizes the mean squared error in Deltax, and the other maximizes the log-likelihood assuming that the deviations are exponentially distributed (the first one is equivalent to assuming that they are normally distributed around the mean, which is clearly not true, so the second one should work better).
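In sketch form, and assuming the model E[Deltax | omega] = b*omega^(-1/xi) (the real omegadesv2bestfittingexponent functions may parametrize things differently), the two fits would look like this:

% Hypothetical sketch of the two fitting criteria, given vectors desv and omega
modelo=@(p,omega) p(2)*omega.^(-1/p(1));                          % p=[xi b]
% (1) least squares in Deltax (implicitly Gaussian deviations)
p_mse=fminsearch(@(p) sum((desv-modelo(p,omega)).^2),[2 .3]);
% (2) maximum log-likelihood with exponentially distributed deviations
negLL=@(p) sum(log(modelo(p,omega))+desv./modelo(p,omega));
p_exp=fminsearch(negLL,[2 .3]);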

I first calibrate the method with a synthetic elegans with exponent 2:

>> load('e:\hipertec\optimesto\PresentacionesPapers\PaperMolon\Figuras\elegans_sintetico_pot2_r06_todos.mat')
>> load datos_celegans_struct
>> pos_cuad=coste2pos_cuad(todas.A*.05,todas.M*1.5+todas.S,todas.f);
>> desv_sint=abs(pos_todos(:,1)-pos_cuad);
>> omega=sum([todas.A*.05 todas.M*1.5+todas.S],2);
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv_sint,omega)
xi =
1.9973
b =
0.3085

>> [xi,b]=omegadesv2bestfittingexponent(desv_sint,omega)
xi =
2.4142
b =
0.2777

The algorithm that assumes exponential distribution of deviations and finds maximum log-likelihood works better, as expected.

Now, with the real data from C. elegans:

>> desv=abs(todas.pos_real-pos_cuad);
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv,omega)
xi =
5.8572
b =
0.1184

>> [xi,b]=omegadesv2bestfittingexponent(desv,omega)
xi =
7.2291
b =
0.1122

Nearer with the exponential one, but around 5.

Now I use the positions optimal for linear cost:

>> pos_lin=coste2pos_num_ruido(todas.A*.05,todas.M*1.5+todas.S,todas.f,1,0,1,0);
>> desv_lin=abs(todas.pos_real-pos_lin);
>> clf
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv_lin,omega)
xi =
5.2203
b =
0.1324

Still not working…

I remove the three outliers and try again (both with the optimum for quadratic cost and with the optimum for linear cost):

>> buenas=omega<20 | desv<.2;
>> sum(~buenas)
ans =
3
>> [xi,b]=omegadesv2bestfittingexponent_exp(desv(buenas),omega(buenas))
xi =
4.7116
b =
0.1235

>> [xi,b]=omegadesv2bestfittingexponent_exp(desv_lin(buenas),omega(buenas))
xi =
4.1978
b =
0.1388

So 4 is the best one.

One possibility is to try other probability distributions that may describe the data better, and see whether that takes us nearer to 2.

Study of phase transitions in elegans with simulated annealing

From Remotón.

I run 1000 parallel simulated annealings, temperature between 5 and 0.1, 200000 steps:

>> coste=simulaelegans_simulatedannealing_multiple_parallel(todas.A/29.3,todas.M/29.3+todas.S,todas.f,.5,.01,[.1 5],2*10^5,1000,'Prueba');
>> [costemedio,varianza]=simulannealing2mediavarianza('Prueba');
>> T=5:(.1-5)/(length(varianza)-1):.1;
>> plot(T,costemedio)

>> plot(T,varianza)

>> plot(T,varianza./T.^2)

Clearly, I need to go to lower temperatures. Maybe we also need to cool more slowly.
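For reference, my mental model of one of these runs is just Metropolis with a linear cooling schedule (a hedged sketch with assumed details: single-neuron Gaussian moves of size .01, positions normalized to [0 1]; the real simulaelegans_simulatedannealing_multiple_parallel surely differs):

% Hypothetical sketch of a single annealing run (illustration only)
A=todas.A/29.3; MS=todas.M/29.3+todas.S; f=todas.f;
pos=rand(279,1);                                   % random initial configuration
coste_act=pos2coste(A,MS,f,pos,.5);
n_pasos=2*10^5;
T=linspace(5,.1,n_pasos);                          % linear cooling from 5 to 0.1
for paso=1:n_pasos
    prueba=pos;
    n=ceil(rand*279);
    prueba(n)=min(max(prueba(n)+.01*randn,0),1);   % move one neuron (assumed proposal)
    coste_nuevo=pos2coste(A,MS,f,prueba,.5);
    if rand<exp((coste_act-coste_nuevo)/T(paso))   % Metropolis acceptance
        pos=prueba; coste_act=coste_nuevo;
    end
end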

Study of phase transitions

Working from Remoton.

I find the cost of 1.2*10^8 random configurations.

>> generacostes_rand('Costes_rand',12,4)
>> [histog,bins]=costes_rand2histog('Costes_rand',180:.5:260);
>> plot(bins,histog)

>> pos2coste(todas.A/29.3,todas.M/29.3+todas.S,todas.f,todas.pos_real,.5)
ans =
119.6195

The actual configuration has a cost much lower than those we get randomly. Therefore, this is not a valid scheme to see phase transitions in a range of temperatures similar to that of the real worm. Nevertheless, I will study the phase transitions with these data:

>> [costemedio,varianza]=costes_rand2mediavarianza('Costes_rand',.1:.1:5,1);
>> T=.1:.1:5;
>> plot(T,costemedio)

>> plot(T,varianza)

There is this strange maximum. A possible explanation is that the effect of the exponential is to remove the peak of the cost distribution, producing something more uniform, which has larger variance. Not even I believe that, but oh well.
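To make "the effect of the exponential" concrete, this is what I assume costes_rand2mediavarianza computes, given the vector of sampled costs (a hypothetical sketch; the real function reads the costs from the Costes_rand files):

% Hypothetical sketch: Boltzmann-reweighted mean and variance of the sampled costs
T=.1:.1:5;
costemedio_sk=zeros(size(T)); varianza_sk=zeros(size(T));
for k=1:length(T)
    w=exp(-(costes-min(costes))/T(k));             % Boltzmann weights (shifted for numerical stability)
    w=w/sum(w);
    costemedio_sk(k)=sum(w.*costes);
    varianza_sk(k)=sum(w.*(costes-costemedio_sk(k)).^2);
end
% the specific heat is then varianza_sk./T.^2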

Specific heat:
>> plot(T,varianza./T.^2)

Uooooohhh!!!

I do not quite understand this. The cost histogram looks too boring to be hiding a phase transition. I run a control using normally distributed costs:

>> clear
>> costes=randn(1,10^7)*10;
>> save prueba1 costes
>> [costemedio,varianza]=costes_rand2mediavarianza('prueba',.1:.1:5,1);
>> T=.1:.1:5;
>> plot(T,costemedio)

>> plot(T,varianza)

>> plot(T,varianza./T.^2)

Veeeery similar. So I do not think there is anything special in the C. elegans cost.

Why is this apparent phase transition taking place? An older post showed that local minima were needed to get a phase transition. But anyway, the cost histogram for a single well is very different to the one we get here:

>> x=-1:.01:1;
coste=(abs(x)).^2;
>> hist(coste)

Indeed, the effect of a local minimum is to create a peak in the histogram of costs:

>> x=-1:.00001:1;
coste=(abs(x-.5)).^.5;
subplot(1,2,1)
plot(x,coste)
subplot(1,2,2)
hist(coste)

>> x=-1:.00001:1;
coste=(abs(x-.5)).^.5+.7*(abs(x+.6)).^.5;
subplot(1,2,1)
plot(x,coste)
subplot(1,2,2)
hist(coste)

 

New Figure 3 (Bayes)

>> fig_pap_bayes_09(1,zeros(100))

Study of the cost distributions found by Bayes

>> clear
>> load info_posreal_zonapeque
>> prob=infoarchivos2prob(infoarchivos,1:6);

Parameters of the most likely value:

>> [m,ind_max]=max(prob(:));
>> [pot_max,alfa_max,beta_max,lambda_max,mu_max,tipo_max]=ind2sub(size(prob),ind_max);

Now, we will study all values with a relatively high probability, but only those with alfa, beta and pot equal to the most likely ones:

>> ind=find(prob(:)>m/100);
>> [c_pot,c_alfa,c_beta,c_lambda,c_mu,c_tipo]=ind2sub(size(prob),ind);
>> ind=ind(c_pot==pot_max & c_alfa==alfa_max & c_beta==beta_max);
>> [loglambda,logmu,tipo]=infoarchivos2parametrosdistrib(infoarchivos,ind)
>> x=0:.001:4;
>> P=parametrosdistrib2prob(x,loglambda,logmu,tipo);
>> plot(x,P)

>> semilogy(x,P)

>> loglog(x,P)

Very similar, although coming from different types:

>> tipo

tipo =

    1.0000   -1.0000
    2.1000    1.0000
    2.1000    2.0000
    2.1000    2.0000
    2.2000    1.0000
    2.2000    1.0000
    2.2000    1.0000
    2.2000    1.0000
    2.2000    1.0000