Catastrophic interference

Catastrophic interference, also known as catastrophic forgetting, is the tendency of an artificial neural network to completely and abruptly forget previously learned information upon learning new information. Neural networks are an important part of the network approach and connectionist approach to cognitive science. With these networks, human capabilities such as memory and learning can be modeled using computer simulations. Catastrophic interference is an important issue to consider when creating connectionist models of memory. It was originally brought to the attention of the scientific community by research from McCloskey and Cohen (1989), and Ratcliff (1990). It is a radical manifestation of the 'sensitivity-stability' dilemma or the 'stability-plasticity' dilemma. Specifically, these problems refer to the challenge of making an artificial neural network that is sensitive to, but not disrupted by, new information(see the diagram). 


Depiction-of-catastrophic-forgetting-in-binary-classification-tasks-when-there-is-a



Lookup tables and connectionist networks lie on the opposite sides of the stability plasticity spectrum. The former remains completely stable in the presence of new information but lacks the ability to generalize, i.e. infer general principles, from new inputs. On the other hand, connectionist networks like the standard backpropagation network can generalize to unseen inputs, but they are very sensitive to new information.


Back-propagation models can be considered good models of human memory insofar as they mirror the human ability to generalize but these networks often exhibit less stability than human memory. Notably, these back-propagation networks are susceptible to catastrophic interference. This is an issue when modelling human memory, because unlike these networks, humans typically do not show catastrophic forgetting. 


Continual learning is important to neural networks because CF limits their potential in numerous ways. For example, imagine a previously trained network whose function needs to be extended or partially changed. The typical solution would be to train the neural network on all of the previously learnt data (that was still relevant) along with the data to learn the new function.


This can be an expensive operation because previous datasets (which tend to be very large in deep learning) would need to be stored and retrained. However, if a neural network could adequately perform continual learning, it would only be necessary for it to directly learn on data representing the new function. Furthermore, continual learning is also desirable because it allows the solution for multiple tasks to be compressed into a single network where weights common to both tasks may be shared. This can also benefit the speed at which new tasks are learnt because useful features may already be present in the network.




Comments

Popular posts from this blog

QUANTUM STORAGE & MEMORY

QUANTUM COMPUTING: CAN FIGHT CLIMATE CHANGE

BLOCKCHAIN: HOW IT WORKS AND WHY SO POPULAR