A Poster for Neural Circuit Diagrams

This poster explains why deep learning architectures never seem to make sense – and then proposes a solution! At the very least, it should get you interested in learning more about neural circuit diagrams.

As some of you might know, I have been working on neural circuit diagrams over the past year or so. These diagrams solve a lingering challenge in deep learning research – clearly and accurately communicating models. Neural circuit diagrams are a mathematically robust framework to precisely express how components interact with data throughout a model.

I've recently made a poster for neural circuit diagrams (see above!). The poster covers the problem being solved, explains how to read neural circuit diagrams, and then goes in depth on the transformer architecture. It's somewhere between a summary of my research and an indulgent graphic design exercise. If it gets you interested in learning more about neural circuit diagrams, it has achieved its goal.

If there's interest, I'll use this blog to diagram more architectures and maybe even go into the mathematical details of the diagrams themselves. I empathize with people getting started with deep learning, and I hope this blog will be where architectures – from transformers to UNets – first make sense for many of you.

Follow to stay in the loop!
