Hi,
For my master's thesis, I am currently trying to recreate the experiment of @alexander_ugent published in this paper. In the NRP templates I found the "Tigrillo SNN Learning in Closed Loop" experiment, which is the closest implementation I could get my hands on. However, this template is not as sophisticated as the paper's experiment. For example, its SNN is much smaller (10 populations of 100 neurons each -> 1,000 neurons).
Unfortunately, when I increase the SNN to its target size (300 columns of 40 neurons each -> 12,000 neurons), the NRP either throws an error or gets stuck.
When starting the experiment via the official NRP web frontend, the simulation gets stuck at "Initializing CLE".
When launching the same experiment on the online NRP servers via the Virtual Coach instead, I simply get an HTTP 500 error (not a stuck simulation):
INFO: [2022-09-20 21:54:15,518 - VirtualCoach] Preparing to launch tigrillo-cl-snn-learning_13.
INFO: [2022-09-20 21:54:15,520 - VirtualCoach] Retrieving list of experiments.
INFO: [2022-09-20 21:54:19,056 - VirtualCoach] Retrieving list of available servers.
INFO: [2022-09-20 21:54:19,130 - Simulation] Attempting to launch tigrillo-cl-snn-learning_13 on prod_latest_backend-11-80.
ERROR: [2022-09-20 21:54:31,512 - Simulation] Unable to launch on prod_latest_backend-11-80: Simulation responded with HTTP status 500
Traceback (most recent call last):
File "/home/bbpnrsoa/nrp/src/VirtualCoach/hbp_nrp_virtual_coach/pynrp/simulation.py", line 158, in launch
raise Exception(
Exception: Simulation responded with HTTP status 500
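For reference, this is roughly how I launch the experiment from the Virtual Coach (a minimal sketch; the environment name and storage username below are placeholders, not my actual configuration):

```python
# Sketch of the Virtual Coach launch; environment and storage_username
# are placeholders for my actual setup.
from pynrp.virtual_coach import VirtualCoach

vc = VirtualCoach(environment='http://localhost:9000',
                  storage_username='nrpuser')
sim = vc.launch_experiment('tigrillo-cl-snn-learning_13')  # -> HTTP 500 here
```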
When using my local Docker NRP, I observe the same behaviour for the Virtual Coach launch. For the manual frontend launch, however, I receive an error message (at the "Initializing CLE" step):
ERROR TYPE: UnknownError
ERROR CODE: -1
MESSAGE: An error occured. Please try again later. (host: 172.19.0.4)
STACK TRACE:
n@http://localhost:9000/scripts/app.bd5c68e5.js:1:16340 error@http://localhost:9000/scripts/app.bd5c68e5.js:1:16554 controller@http://localhost:9000/scripts/app.bd5c68e5.js:1:18466 invoke@http://localhost:9000/node_modules/angular/angular.js:4771:19 $controllerInit@http://localhost:9000/node_modules/angular/angular.js:10592:34 resolveSuccess@http://localhost:9000/node_modules/angular-ui-bootstrap/dist/ui-bootstrap-tpls.js:4154:34 processQueue@http://localhost:9000/node_modules/angular/angular.js:16696:28 qFactory/scheduleProcessQueue/<@http://localhost:9000/node_modules/angular/angular.js:16712:39 $eval@http://localhost:9000/node_modules/angular/angular.js:17994:28 $digest@http://localhost:9000/node_modules/angular/angular.js:17808:31 $apply@http://localhost:9000/node_modules/angular/angular.js:18102:24 done@http://localhost:9000/node_modules/angular/angular.js:12082:47 completeRequest@http://localhost:9000/node_modules/angular/angular.js:12291:15 requestError@http://localhost:9000/node_modules/angular/angular.js:12229:24
To reproduce the issue, you need to change the following lines in the template's "CPG_brain.py":
- In line 59: change 5*2 to 300 (number of populations)
- In line 93: change 80 to 30 (excitatory neurons per population)
- In line 94: change 20 to 10 (inhibitory neurons per population)
(A small sketch of what these values amount to follows below.)
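For orientation, this is roughly what the three changes amount to (a sketch only, not the actual contents of CPG_brain.py; the variable names are made up):

```python
# Sketch only: variable names are invented, but the values correspond to the
# three edits above and to the paper's 300 x 40 = 12,000 neuron network.
n_populations = 300   # line 59: 5*2 -> 300
n_exc_per_pop = 30    # line 93: 80  -> 30
n_inh_per_pop = 10    # line 94: 20  -> 10

total_neurons = n_populations * (n_exc_per_pop + n_inh_per_pop)  # 12,000
```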
I do not think that this is a code/experiment-specific issue, because increased neuron counts generally work; the simulation just takes much longer to start the more neurons are requested (e.g. for 40x40 = 1,600 neurons the startup takes a while but still succeeds in the end). At some size, however, the simulation startup seems to freeze completely. Therefore, I think this is a scalability issue within the NRP.
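If it helps, a network of the same size can be built in plain PyNN/NEST outside the NRP to check whether NEST itself is the bottleneck. A rough sketch (illustrative only; cell type and timestep are generic placeholders, not the template's actual parameters):

```python
# Rough standalone check, outside the NRP: build 300 populations of 40
# neurons each (12,000 total) and time the construction.
import time
import pyNN.nest as sim

sim.setup(timestep=1.0)

start = time.time()
populations = [sim.Population(40, sim.IF_cond_exp()) for _ in range(300)]
print('Built %d populations (%d neurons) in %.1f s'
      % (len(populations), sum(p.size for p in populations),
         time.time() - start))

sim.end()
```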
Do you have any idea how I can circumvent this? Is it normal for the NRP that a NEST network with 2,000+ neurons is too much? Do you know how Alexander managed to run this simulation? (I also asked him directly via email in parallel but have not received a response yet.) What else can I try?
Thanks for your help!
Best regards,
Felix
P.S.: I cannot stop the simulations that got stuck at startup on the server, similar to the issue I mentioned in my last thread. It would be great if someone could again stop these simulations for me (and maybe fix the root cause…).