neural network and deep learning: notes (1)

  2016-03-28 21:48:03

I have read the book neural network and deep learning on and off several times now, and each pass brings a different takeaway. Papers in the DL field move at a dizzying pace, with many new ideas appearing every day; I believe that reading the classic books and papers closely is bound to surface problems that remain open, and with them a different perspective.
PS: this blog mainly excerpts and briefly summarizes the book's key content.
Summary
  1. Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data.

  2. Deep learning, a powerful set of techniques for learning in neural networks.
  • CHAPTER 1 Using neural nets to recognize handwritten digits

  3. The neural network uses the examples to automatically infer rules for recognizing handwritten digits.

    The exact form of the activation function isn't so important - what really matters is the shape of the function when plotted.
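As a quick illustration of that point, here is a minimal Python sketch (mine, not the book's) comparing the sigmoid with a hard-threshold step: both have the same basic S-shape, but the sigmoid is a smoothed-out step, which is what lets small weight changes produce small output changes.

```python
import numpy as np

def sigmoid(z):
    # Smooth, step-like activation: the exact formula matters less
    # than this plotted shape.
    return 1.0 / (1.0 + np.exp(-z))

def step(z):
    # Perceptron-style hard threshold, for comparison.
    return (z > 0).astype(float)

z = np.linspace(-6, 6, 5)          # [-6, -3, 0, 3, 6]
print(sigmoid(z))                  # [0.0025 0.0474 0.5 0.9526 0.9975]
print(step(z))                     # [0. 0. 0. 1. 1.]
```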

  4. The architecture of neural networks

  1. The design of the input and output layers of a neural network is often straightforward, but there can be quite an art to the design of the hidden layers. Researchers have developed many design heuristics for the hidden layers, which help people get the behaviour they want out of their nets.

  2. Learning with gradient descent
    1. The aim of our training algorithm will be to minimize the cost C as a function of the weights and biases. We’ll do that using an algorithm known as gradient descent.
    2. Why introduce the quadratic cost? It’s a smooth function of the weights and biases in the network and it turns out to be easy to figure out how to make small changes in the weights and biases so as to get an improvement in the cost.
    3. The quadratic (MSE) cost isn't the only cost function used in neural networks. A toy gradient-descent run on it is sketched below.
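To make the gradient-descent idea concrete, here is a toy run (data, model, and learning rate are all made up for illustration) that minimizes the quadratic cost of a one-parameter model via the repeated update w ← w − η·∂C/∂w:

```python
import numpy as np

# Toy data with the true relation y = 2x (illustrative only).
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

w, eta = 0.0, 0.05                    # initial weight and learning rate
for _ in range(200):
    y_hat = w * x
    # dC/dw for the quadratic cost C = mean((y_hat - y)**2) / 2
    grad = np.mean((y_hat - y) * x)
    w -= eta * grad                   # gradient-descent update

print(w)                              # converges to ~2.0
```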
  • Mini-batch: SGD works by picking out a small number m of randomly chosen training inputs and estimating the gradient from them. Epoch: we keep picking fresh mini-batches and training until the training inputs are exhausted; a schematic loop follows.
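The mini-batch/epoch structure amounts to the following schematic loop (the function and its grad_fn callback are my own placeholders, not the book's code):

```python
import numpy as np

def sgd(train_x, train_y, w, grad_fn, eta=0.1, m=10, epochs=5):
    # Schematic mini-batch SGD. grad_fn(w, xb, yb) is assumed to
    # return the gradient averaged over the batch (xb, yb).
    n = len(train_x)
    for _ in range(epochs):
        order = np.random.permutation(n)     # fresh shuffle each epoch
        for start in range(0, n, m):         # exhaust all training inputs
            batch = order[start:start + m]
            w = w - eta * grad_fn(w, train_x[batch], train_y[batch])
    return w
```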

  • Thinking about how to choose hyper-parameters
    ”If we were coming to this problem for the first time then there wouldn’t be much in the output to guide us on what to do. We might worry not only about the learning rate, but about every other aspect of our neural network. We might wonder if we’ve initialized the weights and biases in a way that makes it hard for the network to learn? Or maybe we don’t have enough training data to get meaningful learning? Perhaps we haven’t run for enough epochs? Or maybe it’s impossible for a neural network with this architecture to learn to recognize handwritten digits? Maybe the learning rate is too low? Or, maybe, the learning rate is too high? When you’re coming to a problem for the first time, you’re not always sure.
    The lesson to take away from this is that debugging a neural network is not trivial, and, just as for ordinary programming, there is an art to it. You need to learn that art of debugging in order to get good results from neural networks. More generally, we need to develop heuristics for choosing good hyper-parameters and a good architecture.”
  • Inspiration from Face detection:
    “The end result is a network which breaks down a very complicated question - does this image show a face or not - into very simple questions answerable at the level of single pixels. It does this through a series of many layers, with early layers answering very simple and specific questions about the input image, and later layers building up a hierarchy of ever more complex and abstract concepts. Networks with this kind of many-layer structure - two or more hidden layers - are called deep neural networks.”

  • CHAPTER 2 How the backpropagation algorithm works

    1. Backpropagation (BP): a fast algorithm for computing the gradient of the cost function.


  • For backpropagation to work we need to make two main assumptions about the form of the cost function (written out below):
    1. Since what BP actually lets us do is compute the partial derivatives for a single training example, the cost function must be expressible as an average of cost functions for individual training examples.
    2. The cost must also be expressible as a function of the outputs from the neural network, since the desired output y is fixed training data, not something the network learns.
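In the book's notation the two assumptions read as follows, with the quadratic cost as the running example:

```latex
C = \frac{1}{n} \sum_x C_x
\qquad \text{and} \qquad
C = C(a^L),
\qquad \text{e.g.} \quad
C_x = \tfrac{1}{2} \lVert y(x) - a^L(x) \rVert^2 .
```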

  • The four fundamental equations behind backpropagation
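For reference, these are the equations the book labels BP1 through BP4, where \delta^l is the error vector in layer l and \odot denotes the elementwise (Hadamard) product:

```latex
\begin{align}
\delta^L &= \nabla_a C \odot \sigma'(z^L)                                        && \text{(BP1)} \\
\delta^l &= \bigl( (w^{l+1})^{\mathsf T} \delta^{l+1} \bigr) \odot \sigma'(z^l)  && \text{(BP2)} \\
\frac{\partial C}{\partial b^l_j} &= \delta^l_j                                  && \text{(BP3)} \\
\frac{\partial C}{\partial w^l_{jk}} &= a^{l-1}_k \, \delta^l_j                  && \text{(BP4)}
\end{align}
```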


  • What’s clever about BP is that it enables us to simultaneously compute all the partial derivatives using just one forward pass through the network, followed by one backward pass through the network.
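A compact sketch of that forward-then-backward structure, in the spirit of the book's network.py (quadratic cost and sigmoid activations assumed; variable names are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1 - s)

def backprop(weights, biases, x, y):
    # One forward pass caches every weighted input z and activation a;
    # one backward pass then yields all the partial derivatives.
    a, activations, zs = x, [x], []
    for w, b in zip(weights, biases):
        z = w @ a + b
        zs.append(z)
        a = sigmoid(z)
        activations.append(a)
    # BP1 at the output layer (grad_a C = a - y for the quadratic cost).
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    nabla_b = [delta]                                   # BP3
    nabla_w = [np.outer(delta, activations[-2])]        # BP4
    # BP2: propagate the error backwards, layer by layer.
    for l in range(2, len(weights) + 1):
        delta = (weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])
        nabla_b.insert(0, delta)                        # BP3
        nabla_w.insert(0, np.outer(delta, activations[-l - 1]))  # BP4
    return nabla_b, nabla_w
```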

  • What does BP actually do, and how could someone ever have discovered it?

    1. A small perturbation to a weight causes a change in the activation of its neuron, then in the activations of the next layer, and so on all the way through to a change in the final layer, and then in the cost function.

    BP is a clever way of keeping track of small perturbations to the weights (and biases) as they propagate through the network, reach the output, and then affect the cost; a finite-difference sketch of the idea follows.
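One way to appreciate that bookkeeping: the obvious alternative is to perturb each weight individually and measure the change in the cost directly, ΔC ≈ (∂C/∂w_j)·Δw_j, which needs one extra forward pass per weight, whereas BP gets every derivative from a single backward pass. A finite-difference sketch (the toy cost is assumed, not from the book):

```python
import numpy as np

def numeric_grad(C, w, eps=1e-6):
    # Estimate dC/dw_j by nudging one weight at a time:
    # Delta C ~= (dC/dw_j) * Delta w_j. One forward pass per weight.
    grad = np.zeros_like(w)
    for j in range(w.size):
        w_plus = w.copy()
        w_plus.flat[j] += eps
        grad.flat[j] = (C(w_plus) - C(w)) / eps
    return grad

C = lambda w: 0.5 * np.sum(w ** 2)       # toy cost, dC/dw = w
w = np.array([1.0, -2.0, 3.0])
print(numeric_grad(C, w))                # ~ [ 1. -2.  3.]
```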


  • (To be continued…)
