## Speech to Text Deep Learning Architectures

Small intro and background: Recently, I started at Mozilla Research. I am really excited to be part of a small but great team working hard to solve important ML problems, with a great deal of the work being open-sourced. It is the first place I've ever been where people warn you to license your code …

## Why mere Machine Learning cannot predict Bitcoin price

Lately, I have been studying time series to see something beyond the limits of my experience. I decided to use what I learned for cryptocurrency price prediction, with a hunch of getting rich. Kidding? Or not :). As I saw more of the intricacies of the problem, I got deeper into it and I got a new challenge …

## Online Hard Example Mining on PyTorch

Online Hard Example Mining (OHEM) is a way to pick hard examples at reduced computation cost, improving your network's performance on borderline cases, which in turn carries over to its overall performance. It is mostly used for object detection. Suppose you would like to train a car detector and you have positive (with car) and negative images (with …
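The excerpt stops before any code, so here is only a minimal sketch of one common way to express the idea: compute a loss for every example in the batch, then keep only the hardest (highest-loss) fraction for the update. The names `ohem_loss` and `keep_ratio`, the default ratio of 0.5, and the toy probabilities are all assumptions made for illustration, and plain Python is used instead of PyTorch to keep the snippet self-contained.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one example."""
    return -math.log(probs[label])


def ohem_loss(batch_probs, labels, keep_ratio=0.5):
    """Average the loss over only the hardest examples in the batch.

    keep_ratio is a hypothetical knob: the fraction of the batch to keep.
    Returns (mean loss over kept examples, sorted indices of kept examples).
    """
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    num_keep = max(1, int(len(losses) * keep_ratio))
    # Rank examples from highest to lowest loss and keep the top ones.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    hardest = ranked[:num_keep]
    mean_loss = sum(losses[i] for i in hardest) / num_keep
    return mean_loss, sorted(hardest)


# Toy batch: predicted class probabilities and true labels (made-up numbers).
batch_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
labels = [0, 0, 0, 1]

mean_loss, kept = ohem_loss(batch_probs, labels, keep_ratio=0.5)
print(kept)       # indices of the examples that contribute to the loss
print(mean_loss)  # average loss over those examples only
```

In a real training loop the same selection would typically be done on a tensor of per-example losses, so that gradients flow only through the kept examples.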

## Paper review: EraseReLU

Paper: https://arxiv.org/pdf/1709.07634.pdf

ReLU can be seen as a way to train an ensemble of an exponential number of linear models, due to its zeroing effect. Each iteration activates a random set of units and hence a different combination of linear models. Relying on this observation, the authors argue that it might be useful to remove non-linearities for some layers …

## Designing a Deep Learning Project

There are numerous online and offline technical resources about deep learning. Every day, people publish new papers and write new things. However, it is rare to see resources teaching the practical concerns of structuring a deep learning project from top to bottom, from problem to solution. People know fancy technicalities, but even some experienced people feel lost …

## Paper Review: Self-Normalizing Neural Networks

One of the main problems of neural networks is taming layer activations so that one can obtain stable gradients and learn faster without any confining factor. Batch Normalization shows us that keeping values at mean 0 and variance 1 seems to work. However, despite the indisputable effectiveness of BN, it adds more …
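To make the "mean 0 and variance 1" idea concrete, here is a minimal sketch of the standardization step at the core of Batch Normalization, applied to a single feature across a batch. It deliberately leaves out BN's learned scale and shift and its running statistics; the function name `standardize` and the `eps` value are assumptions for illustration.

```python
import math


def standardize(values, eps=1e-5):
    """Shift and scale a batch of activations to mean 0 and variance ~1.

    eps is a small constant (assumed value) that guards against division
    by zero when the batch variance is tiny.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased batch variance
    return [(v - mean) / math.sqrt(var + eps) for v in values]


activations = [2.0, 4.0, 6.0, 8.0]  # made-up activations of one unit
normalized = standardize(activations)

new_mean = sum(normalized) / len(normalized)
new_var = sum((v - new_mean) ** 2 for v in normalized) / len(normalized)
print(normalized)
print(new_mean, new_var)  # close to 0 and close to 1
```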

## Paper Notes: The Shattered Gradients Problem ...

Paper: https://arxiv.org/abs/1702.08591

The full title of the paper is "The Shattered Gradients Problem: If resnets are the answer, then what is the question?". It is really interesting work, with all its findings about the gradient dynamics of neural networks. It also examines Batch Normalization (BN) and Residual Networks (ResNets) under this problem. The problem, dubbed "Shattered Gradients", is described as …

## Dilated Convolution

In simple terms, dilated convolution is just a convolution applied to the input with defined gaps. With this definition, given that our input is a 2D image, dilation rate k=1 is normal convolution, k=2 means skipping one pixel per input, and k=4 means skipping 3 pixels. It is best to see the figures below with the same k …
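The gap definition above can be sketched in a few lines. This is a simplified 1D version (the text talks about 2D images) with no padding and stride 1, written in plain Python. The function name `dilated_conv1d` and the toy signal are assumptions for illustration. As in most deep learning libraries, the kernel is not flipped.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1D dilated convolution with no padding and stride 1.

    With dilation k, consecutive kernel taps read inputs that are k
    positions apart, i.e. k - 1 inputs are skipped between taps.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    # Number of input positions the kernel spans once the gaps are inserted.
    span = dilation * (len(kernel) - 1) + 1
    output = []
    for start in range(len(signal) - span + 1):
        acc = 0
        for j, weight in enumerate(kernel):
            acc += weight * signal[start + j * dilation]
        output.append(acc)
    return output


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 1, 1]

print(dilated_conv1d(signal, kernel, dilation=1))  # taps at i, i+1, i+2
print(dilated_conv1d(signal, kernel, dilation=2))  # taps at i, i+2, i+4
print(dilated_conv1d(signal, kernel, dilation=4))  # taps at i, i+4, i+8
```

Note how the same three-tap kernel covers 3, 5, and 9 input positions for k=1, 2, and 4: the receptive field grows without adding any weights.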

## Ensembling Against Adversarial Instances

What is adversarial? Machine learning is everywhere and we are amazed by the capabilities of these algorithms. However, they are not perfect and sometimes they behave quite dumbly. For instance, let's consider an image recognition model. This model achieves really high empirical performance and works great for normal images. Nevertheless, it might fail when you change …

## Paper Notes: Intriguing Properties of Neural Networks

Paper: https://arxiv.org/abs/1312.6199

This paper studies how semantic information is described by the higher-level units of a network, and the blind spots of network models against adversarial instances. The authors illustrate the learned semantics by inferring the maximally activating instances per unit. They also interpret the effect of adversarial examples and their generalization across different network architectures and datasets. Findings …