Differentiable neural computer
In artificial intelligence, a differentiable neural computer (DNC) is a memory-augmented neural network architecture that is typically, but not by definition, recurrent in its implementation. The model was published in 2016 by Alex Graves et al. of DeepMind.[1][2]
Applications
The DNC indirectly takes inspiration from the von Neumann architecture, making it likely to outperform conventional architectures on tasks that are fundamentally algorithmic and cannot be learned by finding a decision boundary.
So far, DNCs have been demonstrated to handle only relatively simple tasks, which could also be solved using conventional programming. But DNCs don't need to be programmed for each problem; they can instead be trained. This attention mechanism allows the user to feed complex data structures, such as graphs, into memory sequentially, and recall them for later use.
A DNC can be trained to navigate one rapid transit system and then apply what it learned to a different system. A neural network without memory would typically have to learn each transit system from scratch. On graph traversal and sequence-processing tasks with supervised learning, DNCs performed better than alternatives such as long short-term memory or the neural Turing machine.[5] With a reinforcement learning approach to a block puzzle problem inspired by SHRDLU, the DNC was trained via curriculum learning and learned to make a plan, performing better than a traditional recurrent neural network.[5]
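The following sketch illustrates one way a transit map could be serialized for such a network. The triple encoding, the `encode` helper, and the station and line names are illustrative assumptions, not the encoding used in DeepMind's experiments:

```python
import numpy as np

# A minimal sketch: each edge of a (hypothetical) transit graph becomes a
# fixed-size vector, fed to the sequence model one step at a time.
edges = [
    ("OxfordCircus", "VictoriaLine", "GreenPark"),
    ("GreenPark", "JubileeLine", "Westminster"),
    ("OxfordCircus", "CentralLine", "BondStreet"),
]

# Shared vocabulary over all tokens appearing in any triple.
vocab = sorted({tok for edge in edges for tok in edge})
index = {tok: i for i, tok in enumerate(vocab)}

def encode(edge):
    """One-hot encode each slot of a (source, line, destination) triple."""
    vec = np.zeros(3 * len(vocab))
    for slot, tok in enumerate(edge):
        vec[slot * len(vocab) + index[tok]] = 1.0
    return vec

sequence = [encode(e) for e in edges]  # presented to the network step by step
```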
Architecture
DNC networks were introduced as an extension of the neural Turing machine (NTM), with the addition of memory attention mechanisms that control where the memory is stored, and temporal attention that records the order of events.
The memory is simply a matrix that can be allocated dynamically and accessed indefinitely, and every subcomponent of the model is differentiable, so the whole system can be trained end to end with gradient descent. The DNC model is similar to the von Neumann architecture, and because the memory size is resizable, it is Turing complete.
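As a rough structural sketch of one DNC time step (assuming externally supplied `controller`, `write_fn`, and `read_fn` callables; none of these names come from the paper), the controller and the memory are coupled like this:

```python
import numpy as np

# Illustrative only; shapes and names are assumptions, not DeepMind's code.
N, W, R = 16, 8, 2   # memory slots, slot width, number of read heads

class DNCSketch:
    def __init__(self):
        self.memory = np.zeros((N, W))   # the memory matrix M
        self.reads = np.zeros((R, W))    # read vectors from the last step

    def step(self, x, controller, write_fn, read_fn):
        # 1. The controller sees the input together with the previous reads.
        ctrl_in = np.concatenate([x, self.reads.ravel()])
        ctrl_out, interface = controller(ctrl_in)
        # 2. The interface vector parameterizes a differentiable write and
        #    then differentiable reads, so gradients flow end to end.
        self.memory = write_fn(self.memory, interface)
        self.reads = read_fn(self.memory, interface)
        # 3. The output combines the controller output with the new reads
        #    (a full model applies a learned projection here).
        return np.concatenate([ctrl_out, self.reads.ravel()])
```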
Traditional DNC
DNC, as originally published[1]
Independent variables
Input vector | $\mathbf{x}_t \in \mathbb{R}^X$
Target vector | $\mathbf{z}_t \in \mathbb{R}^Y$

Controller
Controller input matrix | $\chi_t = [\mathbf{x}_t; \mathbf{r}_{t-1}^1; \ldots; \mathbf{r}_{t-1}^R]$

Deep (layered) LSTM (layer $l$, time $t$)
Input gate vector | $\mathbf{i}_t^l = \sigma(W_i^l [\chi_t; \mathbf{h}_{t-1}^l; \mathbf{h}_t^{l-1}] + \mathbf{b}_i^l)$
Output gate vector | $\mathbf{o}_t^l = \sigma(W_o^l [\chi_t; \mathbf{h}_{t-1}^l; \mathbf{h}_t^{l-1}] + \mathbf{b}_o^l)$
Forget gate vector | $\mathbf{f}_t^l = \sigma(W_f^l [\chi_t; \mathbf{h}_{t-1}^l; \mathbf{h}_t^{l-1}] + \mathbf{b}_f^l)$
State gate vector | $\mathbf{s}_t^l = \mathbf{f}_t^l \circ \mathbf{s}_{t-1}^l + \mathbf{i}_t^l \circ \tanh(W_s^l [\chi_t; \mathbf{h}_{t-1}^l; \mathbf{h}_t^{l-1}] + \mathbf{b}_s^l)$, $\mathbf{s}_0^l = \mathbf{0}$
Hidden gate vector | $\mathbf{h}_t^l = \mathbf{o}_t^l \circ \tanh(\mathbf{s}_t^l)$, $\mathbf{h}_0^l = \mathbf{0}$, $\mathbf{h}_t^0 = \mathbf{0}$
DNC output vector | $\mathbf{y}_t = W_y [\mathbf{h}_t^1; \ldots; \mathbf{h}_t^L] + W_r [\mathbf{r}_t^1; \ldots; \mathbf{r}_t^R]$

Read & write heads
Interface parameters | $\xi_t = W_\xi [\mathbf{h}_t^1; \ldots; \mathbf{h}_t^L] = [\mathbf{k}_t^{r,1}; \ldots; \mathbf{k}_t^{r,R}; \hat{\beta}_t^{r,1}; \ldots; \hat{\beta}_t^{r,R}; \mathbf{k}_t^w; \hat{\beta}_t^w; \hat{\mathbf{e}}_t; \mathbf{v}_t; \hat{f}_t^1; \ldots; \hat{f}_t^R; \hat{g}_t^a; \hat{g}_t^w; \hat{\boldsymbol{\pi}}_t^1; \ldots; \hat{\boldsymbol{\pi}}_t^R]$

Read heads ($i = 1, \ldots, R$)
Read keys | $\mathbf{k}_t^{r,i} \in \mathbb{R}^W$
Read strengths | $\beta_t^{r,i} = \text{oneplus}(\hat{\beta}_t^{r,i}) \in [1, \infty)$
Free gates | $f_t^i = \sigma(\hat{f}_t^i) \in [0, 1]$
Read modes | $\boldsymbol{\pi}_t^i = \text{softmax}(\hat{\boldsymbol{\pi}}_t^i) \in \mathcal{S}_3$

Write head
Write key | $\mathbf{k}_t^w \in \mathbb{R}^W$
Write strength | $\beta_t^w = \text{oneplus}(\hat{\beta}_t^w) \in [1, \infty)$
Erase vector | $\mathbf{e}_t = \sigma(\hat{\mathbf{e}}_t) \in [0, 1]^W$
Write vector | $\mathbf{v}_t \in \mathbb{R}^W$
Allocation gate | $g_t^a = \sigma(\hat{g}_t^a) \in [0, 1]$
Write gate | $g_t^w = \sigma(\hat{g}_t^w) \in [0, 1]$

Memory
Memory matrix ($E$ is the $N \times W$ matrix of ones) | $M_t = M_{t-1} \circ (E - \mathbf{w}_t^w \mathbf{e}_t^\top) + \mathbf{w}_t^w \mathbf{v}_t^\top$
Usage vector | $\mathbf{u}_t = (\mathbf{u}_{t-1} + \mathbf{w}_{t-1}^w - \mathbf{u}_{t-1} \circ \mathbf{w}_{t-1}^w) \circ \boldsymbol{\psi}_t$, $\mathbf{u}_0 = \mathbf{0}$
Precedence weighting | $\mathbf{p}_t = \left(1 - \sum_i \mathbf{w}_t^w[i]\right) \mathbf{p}_{t-1} + \mathbf{w}_t^w$, $\mathbf{p}_0 = \mathbf{0}$
Temporal link matrix | $L_t[i,j] = (1 - \mathbf{w}_t^w[i] - \mathbf{w}_t^w[j]) L_{t-1}[i,j] + \mathbf{w}_t^w[i] \mathbf{p}_{t-1}[j]$, $L_0 = \mathbf{0}$, $L_t[i,i] = 0$
Write weighting | $\mathbf{w}_t^w = g_t^w [g_t^a \mathbf{a}_t + (1 - g_t^a) \mathbf{c}_t^w]$
Read weighting | $\mathbf{w}_t^{r,i} = \boldsymbol{\pi}_t^i[1] \mathbf{b}_t^i + \boldsymbol{\pi}_t^i[2] \mathbf{c}_t^{r,i} + \boldsymbol{\pi}_t^i[3] \mathbf{f}_t^i$
Read vectors | $\mathbf{r}_t^i = M_t^\top \mathbf{w}_t^{r,i}$
Content-based addressing (lookup key $\mathbf{k}$, key strength $\beta$) | $\mathcal{C}(M, \mathbf{k}, \beta)[i] = \dfrac{\exp\{\mathcal{D}(\mathbf{k}, M[i,\cdot]) \beta\}}{\sum_j \exp\{\mathcal{D}(\mathbf{k}, M[j,\cdot]) \beta\}}$
Indices of $\mathbf{u}_t$, sorted in ascending order of usage | $\boldsymbol{\phi}_t$
Allocation weighting | $\mathbf{a}_t[\boldsymbol{\phi}_t[j]] = (1 - \mathbf{u}_t[\boldsymbol{\phi}_t[j]]) \prod_{i=1}^{j-1} \mathbf{u}_t[\boldsymbol{\phi}_t[i]]$
Write content weighting | $\mathbf{c}_t^w = \mathcal{C}(M_{t-1}, \mathbf{k}_t^w, \beta_t^w)$
Read content weighting | $\mathbf{c}_t^{r,i} = \mathcal{C}(M_t, \mathbf{k}_t^{r,i}, \beta_t^{r,i})$
Forward weighting | $\mathbf{f}_t^i = L_t \mathbf{w}_{t-1}^{r,i}$
Backward weighting | $\mathbf{b}_t^i = L_t^\top \mathbf{w}_{t-1}^{r,i}$
Memory retention vector | $\boldsymbol{\psi}_t = \prod_{i=1}^R (1 - f_t^i \mathbf{w}_{t-1}^{r,i})$

Definitions
Weight matrix, bias vector | $W$, $\mathbf{b}$
Zeros matrix, ones matrix, identity matrix | $\mathbf{0}$, $E$, $I$
Element-wise multiplication | $\circ$
Cosine similarity | $\mathcal{D}(\mathbf{u}, \mathbf{v}) = \dfrac{\mathbf{u} \cdot \mathbf{v}}{\lVert \mathbf{u} \rVert \, \lVert \mathbf{v} \rVert}$
Sigmoid function | $\sigma(x) = \dfrac{1}{1 + e^{-x}}$
Oneplus function | $\text{oneplus}(x) = 1 + \log(1 + e^x)$
Softmax function | $\text{softmax}(\mathbf{x})_j = \dfrac{e^{x_j}}{\sum_{k=1}^K e^{x_k}}$ for $j = 1, \ldots, K$
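To make the definitions concrete, here is a NumPy sketch of the scalar functions and of content-based addressing $\mathcal{C}(M, \mathbf{k}, \beta)$ from the table. The function names are ours, and the small epsilon guarding division by zero is an implementation assumption:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())              # subtract max for numerical stability
    return e / e.sum()

def oneplus(x):
    return 1.0 + np.log1p(np.exp(x))     # maps the reals onto [1, inf)

def cosine_similarity(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)

def content_addressing(M, key, beta):
    # C(M, k, beta)[i] = softmax over i of beta * D(k, M[i,:])
    sims = np.array([cosine_similarity(key, row) for row in M])
    return softmax(beta * sims)
```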
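Likewise, the dynamic allocation and write equations can be sketched directly from the table. Argument shapes follow the notation above (usage $\mathbf{u}_t \in [0,1]^N$, $R$ free gates, previous read weightings of shape $R \times N$), and the loop-based allocation is a readability choice rather than the vectorized form a real implementation would use:

```python
import numpy as np

def memory_retention(free_gates, prev_read_w):
    # psi_t = prod_i (1 - f^i * w^{r,i}_{t-1})
    return np.prod(1.0 - free_gates[:, None] * prev_read_w, axis=0)

def allocation_weighting(usage):
    # a_t[phi[j]] = (1 - u[phi[j]]) * prod_{i<j} u[phi[i]],
    # where phi sorts the slots by ascending usage.
    phi = np.argsort(usage)
    alloc = np.zeros_like(usage)
    running = 1.0
    for slot in phi:
        alloc[slot] = (1.0 - usage[slot]) * running
        running *= usage[slot]
    return alloc

def write_memory(M, write_w, erase, add):
    # M_t = M_{t-1} o (E - w^w e^T) + w^w v^T
    return M * (1.0 - np.outer(write_w, erase)) + np.outer(write_w, add)
```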
Extensions
Refinements include sparse memory addressing, which reduces time and space complexity by thousands of times. This can be achieved by using an approximate nearest neighbor algorithm, such as locality-sensitive hashing or a randomized k-d tree like the Fast Library for Approximate Nearest Neighbors (FLANN) from the University of British Columbia.[9] Adding adaptive computation time (ACT) decouples computation time from data time, exploiting the fact that problem length and problem difficulty are not always the same.[10] Training with synthetic gradients performs considerably better than backpropagation through time (BPTT).[11] Robustness can be improved with layer normalization and bypass dropout as regularization.[12]
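As a toy illustration of the sparse-addressing idea (random-hyperplane hashing standing in for a tuned library like FLANN; all names and sizes here are assumptions):

```python
import numpy as np

# Hash memory rows with random hyperplanes (a simple locality-sensitive
# hash) and score only the rows in the query's bucket, not all N rows.
rng = np.random.default_rng(0)
N, W, BITS = 1024, 32, 8

memory = rng.standard_normal((N, W))
planes = rng.standard_normal((BITS, W))        # random hyperplanes

def bucket(v):
    return tuple((planes @ v) > 0)             # sign pattern as hash code

table = {}
for i, row in enumerate(memory):
    table.setdefault(bucket(row), []).append(i)

def sparse_read(key):
    candidates = table.get(bucket(key), range(N))  # dense fallback if empty
    scores = {i: memory[i] @ key for i in candidates}
    best = max(scores, key=scores.get)
    return memory[best]                        # read only the best match

print(sparse_read(memory[3]).shape)            # (32,)
```

Because only one hash bucket is scored, lookup cost no longer grows with the full memory size N.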
References
- ^ Graves, Alex; Wayne, Greg; Reynolds, Malcolm; et al. (2016). "Hybrid computing using a neural network with dynamic external memory". Nature. 538 (7626): 471–476. doi:10.1038/nature20101. S2CID 205251479.
- ^ "Differentiable neural computers | DeepMind". DeepMind. Retrieved 2016-10-19.
- ^ a b Burgess, Matt. "DeepMind's AI learned to ride the London Underground using human-like reason and memory". WIRED UK. Retrieved 2016-10-19.
- ^ Jaeger, Herbert (2016). "Artificial intelligence: Deep neural reasoning". Nature. 538 (7626): 467–468. PMID 27732576.
- ^ a b James, Mike. "DeepMind's Differentiable Neural Network Thinks Deeply". www.i-programmer.info. Retrieved 2016-10-20.
- ^ "DeepMind AI 'Learns' to Navigate London Tube". PCMAG. Retrieved 2016-10-19.
- ^ Mannes, John. "DeepMind's differentiable neural computer helps you navigate the subway with its memory". TechCrunch. Retrieved 2016-10-19.
- ^ "RNN Symposium 2016: Alex Graves - Differentiable Neural Computer". YouTube.
- ^ Rae, Jack W.; Hunt, Jonathan J.; Harley, Tim; Danihelka, Ivo; Senior, Andrew; Wayne, Greg; Graves, Alex; Lillicrap, Timothy P. (2016). "Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes". arXiv:1610.09027 [cs.LG].
- ^ Graves, Alex (2016). "Adaptive Computation Time for Recurrent Neural Networks". arXiv:1603.08983 [cs.NE].
- ^ Jaderberg, Max; Czarnecki, Wojciech Marian; Osindero, Simon; Vinyals, Oriol; Graves, Alex; Silver, David; Kavukcuoglu, Koray (2016). "Decoupled Neural Interfaces using Synthetic Gradients". arXiv:1608.05343 [cs.LG].
- ^ Franke, Jörg; Niehues, Jan; Waibel, Alex (2018). "Robust and Scalable Differentiable Neural Computer for Question Answering". arXiv:1807.02658 [cs.CL].