Brain Simulation

The ability to simulate an entire human brain would be incredibly useful. It would give us insights into aspects of its function that are difficult to examine in vivo. It could help us understand what happens when things go wrong and help us develop treatments. We could also gain an insight into how thoughts arise within the real thing. For those who believe such things are possible, such a system could form the basis for uploading minds into computers. However, there are many practical complications to simulating a human brain. The brain is an incredibly complicated apparatus, estimated to contain around 100 billion neurons, each of which may connect to thousands more. In a typical simplification, a neuron is taken to be a simple processing unit that takes a weighted sum of its inputs and activates accordingly. Quite large arrays of such units can easily be simulated in a computer. However, even with such simplification, simulating the complete array of neurons is well beyond the capabilities of a top-of-the-range personal computer. Indeed, even the most sophisticated supercomputers would be strained to their limits. The following C++ program helps illustrate the point. For a more detailed treatise see The Virtual Brain Machine Project.

100 billion simple neurons test program

The link below points to a very simple program which aims to demonstrate the processing power inherent in a typical human brain. It is based on the estimates above: that the brain contains 100,000,000,000 neurons (assuming that the billion typically referred to in textbooks is 10⁹, i.e. an American billion or a British thousand million) and that each neuron receives inputs from 1000 others. These estimates are themselves vague and do not reflect the true architecture of the brain, whose connectivity may be simpler or more complex from area to area. Real neurons are analog devices and may fire over a continuum of activation levels (though there is evidence to suggest that the activations themselves are binary and that it is their frequency that counts). This must be reduced to a convenient level for representation in a computer, e.g. by a single byte or two bytes, giving 256 or 65,536 possible activation levels respectively. With one byte per neuron, the complete activation status of an entire human brain would be represented by 100,000,000,000 bytes of storage (approximately 100 GB). This sounds promising; after all, a 100 GB hard disk is not an impossible extravagance today. Using a hard disk would of course slow down processing a lot by comparison with storage in RAM, but 100 GB of RAM would currently be considered a bit of an extravagance, though by no means an outright impossibility for a supercomputer.

However, there is a major complication. The real information in the brain is stored more in the connections between neurons than in their individual activation states. Since we have estimated each neuron to connect to a thousand others, this raises our storage requirements proportionately. We are now looking at terabyte and petabyte storage devices. The program below makes no attempt to consider the ramifications of this storage problem.
Instead it concentrates on simulating the amount of effort required to process all this information in a single cycle of update (though of course without this information being available). In an effort to partially optimize the calculation, only integer arithmetic is used. However, even the number 100,000,000,000 cannot be represented by a standard built-in integer type in C/C++ (note: a "long long" type has been sanctioned for addition to C but not yet C++). Instead two unsigned longs, High and Low, count 100,000 sets of 1,000,000 neuron updates separately. Even with the neural update operation doing nothing, this would take about 8 hours to run on my current home PC. With the neural update represented by a weighted sum function counting the weighted contributions from 1000 other cells, this figure jumps to 7 days.
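As a rough check on the storage figures, the arithmetic can be written out directly. This is only a sketch: the constant and function names are illustrative, and it assumes one stored weight per connection with no record of which neuron each connection comes from, which a real model would also need. It uses the 64-bit std::uint64_t type from later C++ standards so that the totals fit in a single integer.

```cpp
#include <cstdint>

// Figures taken from the estimates in the text.
constexpr std::uint64_t kNeurons = 100000000000ULL;  // 10^11 neurons
constexpr std::uint64_t kInputsPerNeuron = 1000;     // connections into each neuron

// Bytes needed to hold one activation value per neuron.
constexpr std::uint64_t activation_bytes(std::uint64_t bytes_per_activation) {
    return kNeurons * bytes_per_activation;
}

// Bytes needed to hold one weight per connection.
// Assumption: weights only, with no index of where each input comes from.
constexpr std::uint64_t weight_bytes(std::uint64_t bytes_per_weight) {
    return kNeurons * kInputsPerNeuron * bytes_per_weight;
}

// Checked at compile time: 10^11 bytes of activations, 10^14 bytes of weights.
static_assert(activation_bytes(1) == 100000000000ULL, "about 100 GB");
static_assert(weight_bytes(1) == 100000000000000ULL, "about 100 TB");
```

With one byte per value this gives about 100 GB for the activations and about 100 TB for the weights; two-byte weights would double the latter to about 200 TB.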

100billion.cpp
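What follows is not the linked listing, only a scaled-down sketch of the kind of loop described above: two nested counters standing in for High and Low, and a neural update that takes an integer weighted sum. The function names, the one-byte activations and weights, and the clamping rule are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One neural update: an integer weighted sum of the inputs, reduced to a
// one-byte activation. The clamp to 0..255 is a hypothetical activation rule.
std::uint8_t update_neuron(const std::vector<std::uint8_t>& inputs,
                           const std::vector<std::int8_t>& weights) {
    assert(inputs.size() == weights.size());
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        sum += static_cast<std::int32_t>(inputs[i]) * weights[i];
    }
    if (sum < 0) return 0;
    if (sum > 255) return 255;
    return static_cast<std::uint8_t>(sum);
}

// Perform high * low updates using two nested counters. Every update reuses
// the same inputs and weights, so this exercises the arithmetic only and
// says nothing about the cost of fetching real connection data.
std::uint64_t run_updates(unsigned long high, unsigned long low,
                          const std::vector<std::uint8_t>& inputs,
                          const std::vector<std::int8_t>& weights) {
    std::uint64_t checksum = 0;  // gives the loop an observable result
    for (unsigned long h = 0; h < high; ++h) {
        for (unsigned long l = 0; l < low; ++l) {
            checksum += update_neuron(inputs, weights);
        }
    }
    return checksum;
}
```

Calling run_updates with small counter values and 1000-element vectors, timing it, and scaling up by the ratio to 100,000 × 1,000,000 follows the same extrapolation used for the figures quoted here.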

Note: these timings are from an AMD-K6 III 450MHz (overdue for an upgrade) running Win98SE (overdue for extermination); a more up-to-date system running Linux would slash these figures considerably. The estimate is based upon reducing High to 100 in the program and then multiplying the time taken (typically 9 minutes) by 1000, which gives about 150 hours, or roughly a week. The program was compiled with gcc under cygwin using the command line

"gcc 100billion.cpp -lstdc++ -o100billion.exe -O4" though in practice altering the optimisation level from none through -O2 seemed to make no difference

Computing a mind

I would like to refer to the above unit of computation as a 'mind instant', although it bears only a slight resemblance to the volume of computation done by a real brain. Current research suggests that it takes about 200ms on average for a state to evolve into consciousness. Neurons can fire roughly up to 300 times a second, so this level of computation could presumably capture an instant of consciousness. However, the evidence suggests that consciousness comes from a wave of activity reaching all across the brain. Most neural connections are at a local level, so it must take several cycles of activity for things to feed back from one end of the brain to the other. This highlights the first of many problems with the basic model.
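To put a number on how many mind instants might fit into that 200ms window, here is a back-of-the-envelope sketch. It rests on an assumption of my own that goes beyond the figures above: that every neuron is updated once for each possible firing, i.e. 300 full updates per second. The names are illustrative.

```cpp
#include <cstdint>

constexpr std::uint64_t kNeurons = 100000000000ULL;  // 10^11, from the text
constexpr std::uint64_t kInputsPerNeuron = 1000;     // from the text
constexpr std::uint64_t kMaxFiringRateHz = 300;      // from the text
constexpr std::uint64_t kWindowMs = 200;             // from the text

// Multiply-accumulate operations in one update of every neuron.
constexpr std::uint64_t ops_per_mind_instant() {
    return kNeurons * kInputsPerNeuron;
}

// Assumption: one full update per possible firing within the window.
constexpr std::uint64_t updates_per_window() {
    return kMaxFiringRateHz * kWindowMs / 1000;
}

// Total operations needed to cover the whole window.
constexpr std::uint64_t ops_per_window() {
    return ops_per_mind_instant() * updates_per_window();
}

static_assert(updates_per_window() == 60, "60 updates in 200 ms");
static_assert(ops_per_window() == 6000000000000000ULL, "6 x 10^15 operations");
```

Under these assumptions the window holds 60 updates, or 6 × 10^15 multiply-accumulate operations, before any allowance for activity having to propagate across the brain.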

Oversimplifications

This back-of-the-envelope model drastically oversimplifies many aspects of the brain. The most glaring omission from a computational point of view is the absence of any learning mechanisms. In a real brain new connections form and old connections may weaken or strengthen over different timescales. In the shorter term patterns of firing are maintained by 'long term potentiation'. On a physical level the brain is awash with neurotransmitters which selectively inhibit and promote the activity of brain cells over large distances and timescales relative to the electrical activity of the brain. Such effects must be taken into account to make a proper model. However, there is a case for saying that these mechanisms, being so much slower, contribute significantly less to the overall processing load as crudely presented above. On the other hand, even the slightest increase in detail at the neural level increases the storage requirements dramatically.

The model also has no inputs or outputs. Simulating an entire body is even more difficult than simulating an isolated brain in a jar. For a start, it is estimated that the rest of the nervous system contains as many neurons again as are found in the brain. The body is made up of 50,000 billion cells of over 200 main types. Our DNA contains roughly 20,000 protein-coding genes (earlier estimates ran to 100,000 or more), which may be expressed in various combinations to direct the activity of these cells. The body too does not exist in a vacuum; it must be connected to a world. However, we should be cautious in our simulations. A simulation can still be useful without going down to the submolecular level. It is important to find out what is most relevant to the aims of our simulation and concentrate our efforts there.

Overcomplications

There is a bright side to this computation. The belief-beggaring crudity of the approximation above can also be used to our advantage. We already know a great deal about the organisation of the brain and our knowledge is continuing to improve. Applying this knowledge to real models will help us strip away any redundancies and irrelevances and allow us to do useful simulations and match them against data gathered from studies of real brains. A prime contender would be the visual cortex. Also, the rate of technological progress suggests that it will soon be quite easy to achieve a "mind instant" on a personal computer on a reasonable timescale and eventually even in real time.

Populating the model

An important matter not yet considered is populating the model. It is all very well saying that a brain might be simulated using a particular model, but the data representative of a brain must still be obtained for use in that model. The most obvious approach is to go to studies of the real brain directly. EEGs give global values that a good model should be able to reproduce and perhaps even explain, but it would be difficult to derive the model from these alone. Single cell recording gives the level of detail required for each cell but is too invasive. Puncturing every cell with an electrode and recording from 100 billion simultaneously is not practical (and probably fatal :-). It also gives no direct indication of connection strengths between neurons. Brain imaging techniques such as functional nuclear magnetic resonance imaging (fNMRI) can at present only manage a spatial resolution of about 1mm. A cubic millimetre of brain tissue contains a great many cells, although we are fortunate in that the brain tends to use a system of population voting such that adjacent cells are liable to agree. It seems likely that uploading a representative image of a real living brain is not yet practical. Images of appropriately prepared brain tissue (diced :-) can be scanned at greater resolution, but after death physiological changes rapidly damage the fine structure we are interested in. A partial solution may be to combine a good functional model with an adaptive learning mechanism. Artificial neural networks can be trained to reproduce a particular pattern of activity simply by observing it and self-organising their own internal model. Such a learning system, independent of the simulated learning mechanisms, could be used to produce a model whose behaviour maps well to that of a real brain in terms of the less detailed brain scans as well as EEGs and recordings of individual cellular responses.
The closer our model is in functional structure to the real brain, the easier this should be to achieve. Some successes have already been achieved in this area. Functional models have been produced that can learn to respond in characteristic ways to, for example, the motion of objects across their simulated visual field.

Improving the model

The most important factor in improving the model is knowledge. Detailed statistics are required on the type and number of contributing nerve cells in the brain and the different ways in which they can respond. In many areas the brain is compartmentalised. In each such region it would be useful to know the total number of cells, how many cells they take input from, how many cells they give output to and the spread of distances between connected cells. Typically most connections will be local but some will spread much further afield. These regions can then be modelled on an individual basis and connected as appropriate to anatomy and physiology. Our knowledge of the particular learning mechanisms actually used in vivo is also quite vague, although artificial neural systems are many and varied in their approaches. Another blind spot with neural network models is the role of neurotransmitters. Information on the total number of neurotransmitters and the way various types of cell respond to them is also of great importance. I will attempt to tabulate all these factors as they come to my attention. Coming from a computational background, I have neither access to the apparatus needed to gather such information nor the training to obtain it. However, I am aware that a lot of ground has been covered in this area. It is just a matter of gathering together the published data.
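One way such per-region statistics might be held once gathered is sketched below. The record layout and field names are hypothetical, and the spread of connection distances is reduced to a single "local fraction", which is a considerable simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-region record; the field names are illustrative only.
struct RegionStats {
    std::string name;              // anatomical label
    std::uint64_t cell_count;      // total number of cells in the region
    double mean_inputs_per_cell;   // cells each cell takes input from, on average
    double mean_outputs_per_cell;  // cells each cell gives output to, on average
    double local_fraction;         // share of inputs from inside the region (0 to 1)
};

// Total number of input connections across all regions.
double total_connections(const std::vector<RegionStats>& regions) {
    double total = 0.0;
    for (const RegionStats& r : regions) {
        total += static_cast<double>(r.cell_count) * r.mean_inputs_per_cell;
    }
    return total;
}

// Input connections that arrive from outside their own region.
double long_range_connections(const std::vector<RegionStats>& regions) {
    double total = 0.0;
    for (const RegionStats& r : regions) {
        assert(r.local_fraction >= 0.0 && r.local_fraction <= 1.0);
        total += static_cast<double>(r.cell_count) * r.mean_inputs_per_cell *
                 (1.0 - r.local_fraction);
    }
    return total;
}
```

The long-range total matters most for any attempt to model regions separately, since those are the connections that would have to cross between the separate models.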

Simpler Models

Given the logistical complications of simulating a human brain, it may be prudent to start with something simpler. The nerve ganglia that function as the brain of a grasshopper contain only 16,000 neurons, which is well within the means of an ordinary personal computer. Their behaviour is also quite simple by comparison. In such cases creating a model of the body (the soma) will also be a more practical proposition. I do not know of any studies conducted on grasshoppers using fNMRI or magnetoencephalograms (MEG) using superconducting detectors (perhaps because they hop around too much :-). Suggestions for investigations following such lines are welcomed.

Feedback

There is a lot of work to be done in this field. There is room for several PhDs and a professor or two of simulated neurobiology (that has a nice ring to it). I'd like to hear from anyone who has comments or contributions that could improve our understanding of the problems and help us find some direction.

Parallel mind instants

I would like to hear from anyone with results that may be added to the table of times for a computational mind instant using other operating systems and preferably more powerful hardware. Any thoughts on how to improve the speed of the simulated computation independently of altering the crude model are also welcome. There will likely be a lot of repetition which can be exploited to our advantage, though this is difficult to include without a full model in petabyte storage. One glaring advance that is missing is division of the task between multiple computers connected in parallel. To do this we need to estimate how much data each computer must receive from elsewhere and account for the costs of communication both into and out of that computer. The current model does not reflect real neural topology at all, but a parallel networked version should come a little closer given the right statistics. One day it may be possible to distribute a whole brain simulation over the internet, perhaps with particular computer nexuses representing anatomical parts such as the visual cortex, in a manner similar to the seti@home project. This is definitely a direction I am interested in going and will pursue when time permits.
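A first estimate of that communication cost might look like the sketch below. It assumes the neurons are split evenly between machines and that a fixed fraction of each neuron's inputs (given in parts per thousand to stay with integer arithmetic) comes from other machines. Both the even split and the fixed fraction are assumptions, and the names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Result of the estimate for a single machine ("node").
struct NodeTraffic {
    std::uint64_t neurons_per_node;        // neurons simulated on this node
    std::uint64_t remote_inputs_per_node;  // input values fetched per update cycle
};

// Split `neurons` evenly over `nodes` machines and count how many input
// values each machine must receive from the others in one update cycle.
// `remote_per_mille` is the assumed share of inputs that are remote,
// in parts per thousand (100 means 10%).
NodeTraffic estimate_traffic(std::uint64_t neurons,
                             std::uint64_t inputs_per_neuron,
                             std::uint64_t nodes,
                             std::uint64_t remote_per_mille) {
    assert(nodes > 0);
    assert(remote_per_mille <= 1000);
    // Round up so that every neuron is assigned to some node.
    const std::uint64_t per_node = (neurons + nodes - 1) / nodes;
    const std::uint64_t remote =
        per_node * inputs_per_neuron * remote_per_mille / 1000;
    return NodeTraffic{per_node, remote};
}
```

For example, estimate_traffic(100000000000ULL, 1000, 1000, 100) describes 1000 machines with 10% remote inputs: each holds 10^8 neurons and needs 10^10 remote input values per cycle. This is an upper bound, since several local neurons may share the same remote source and its value need only be sent once.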

We all compute things faster if we work as a group. Let's do it. Letsthinkit.com :-) Send mail to: brainsim at cybernetics demon co uk

Information

Lies, damn lies & statistics

This section is where I will attempt to gather and tabulate statistics regarding the functional characteristics of the human brain (and perhaps others) relevant to their simulation.

Brain Statistics

References

All manner of cognitive science textbooks, popular science books and various miscellaneous articles in magazines and on the web. For a good introduction and broad overview I'd recommend "The human mind explained" - edited by Susan Greenfield.