Wednesday, April 10, 2019
Recently I read a series of papers on object detection using deep neural networks. Here is a summary of the reading.
R-CNN: Region-based Convolutional Networks for Accurate Object Detection and Segmentation
Region proposals are generated with a traditional method, and then a classifier and a location (bounding-box) regressor are trained. To improve accuracy, the classifier is first trained with a softmax loss, and an SVM is then fit on the features extracted from the fc layer; the regressor is trained as a separate network.
Selective search is used to generate the proposals: starting from an over-segmentation, similar regions are repeatedly merged. Each region is then described with traditional descriptors and classified with a bag-of-words model plus an SVM.
Fast R-CNN
Faster R-CNN
A region proposal network (RPN) is proposed that shares the feature-extraction network, so the most time-consuming step of the previous version, region proposal, is now generated automatically by the network with essentially no extra computational cost.
Tuesday, April 2, 2019
[Machine Learning] from distance to kernel (classification via SVM)
Many machine learning techniques can use the kernel trick, so it is important to find a kernel, or to explicitly construct one from your data. One possible way to do this is to start from a metric distance.
This article only gives some basic ideas for this transformation, without any rigorous proof.
Once you have a distance matrix measuring the dissimilarity between each pair of instances, denoted I_i and I_j, you may define a kernel function ρ(I_i, I_j) = f(d(I_i, I_j)), where d(I_i, I_j) is the distance you have obtained. The most common choice for f is an exponential, e.g. exp(-d(I_i, I_j)^2 / (2σ^2)); a small sketch is given after the links below.
http://math.stackexchange.com/questions/221704/transforming-a-distance-function-to-a-kernel
Another webpage listing some ways to compute kernels from the original feature space:
http://scikit-learn.org/stable/modules/metrics.html
For the requirements on positive (semi-)definite kernels, you may refer to this paper:
http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/scholkopf00kernel_3781%5b0%5d.pdf
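As a minimal sketch of the exponential transform above (the helper name distance_to_kernel and the bandwidth sigma are my own illustrative choices, not from the references), one can map a precomputed distance matrix to a kernel and feed it to scikit-learn's SVC with kernel='precomputed'. Whether the result is a valid positive semi-definite kernel depends on the distance function, as discussed in the paper linked above.

import numpy as np
from sklearn.svm import SVC

def distance_to_kernel(D, sigma=1.0):
    # Map a pairwise distance matrix D to a kernel matrix via
    # K_ij = exp(-d_ij^2 / (2 * sigma^2)).
    return np.exp(-np.square(D) / (2.0 * sigma ** 2))

# Toy example: pairwise distances between four 1-D points.
X = np.array([[0.0], [1.0], [2.0], [10.0]])
D = np.abs(X - X.T)                 # |x_i - x_j|
y = np.array([0, 0, 0, 1])

K = distance_to_kernel(D, sigma=2.0)
clf = SVC(kernel='precomputed').fit(K, y)   # train on the Gram matrix
print(clf.predict(K))                       # predict on the training kernel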
Saturday, November 4, 2017
Andrew Ng's new AI courses
One of the three big things Andrew Ng did after leaving Baidu was founding deeplearning.ai, which offers five deep learning courses. NetEase Cloud Classroom has mirrored and translated the first three. I audited them for free; although the content is fairly introductory and leans toward engineering, the overall quality is high and I learned a lot. Below is a summary organized by the syllabus (to be updated gradually).
Week 1 - Introduction to deep learning:
Learn about the major technology trends driving the rise of neural networks, and where and how deep learning is applied today.
1.1 Welcome to the Deep Learning Engineer specialization
Ng shows off his Chinese and announces that NetEase is releasing the Chinese-subtitled version of the deeplearning.ai courses on his behalf. The vision is to train many thousands of AI talents and build an AI-powered society.
1.2 What is a neural network?
Introduces neural networks with a house price prediction example: since the price cannot be negative, the fitted curve becomes a ReLU, which is an interesting way to motivate ReLU. A hidden unit is also called a neuron, and a neural network is just neurons stacked together (like Lego bricks) to form a network. Very intuitive.
1.3 Supervised learning with neural networks
Almost all of the valuable machine learning / neural network techniques today are supervised learning, i.e., modeling a mapping from x to y. In contrast, unsupervised learning has no y, and one can only hope the data tells something about itself.
Many supervised learning examples are given:
house price prediction is solved with a standard NN; image understanding and object detection with a convolutional NN (CNN); time series or temporal sequence data with a recurrent NN (RNN).
Structured data: house size, etc.; unstructured data: audio/image/text, etc.
1.4 Why is deep learning taking off?
Because data has become much larger, algorithms have become better, and computational resources have become stronger.
Example: ReLU replacing sigmoid. The main advantage is that the sigmoid gradient is nearly zero on both tails, which slows down gradient descent.
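A minimal numpy sketch of this point (the sample inputs are arbitrary): the sigmoid derivative s(z)(1-s(z)) is at most 0.25 and nearly vanishes for large |z|, while the ReLU derivative stays at 1 for any positive input.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)            # <= 0.25, ~0 when |z| is large

def relu_grad(z):
    return (z > 0).astype(float)    # 1 for z > 0, else 0

z = np.array([-10.0, -1.0, 0.5, 10.0])
print(sigmoid_grad(z))              # tiny values at both ends
print(relu_grad(z))                 # [0., 0., 1., 1.]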
Interesting diagram: idea -> code -> experiments -> idea ... Faster computation speeds up this iterative loop and brings more and better results.
1.5 About this course
A brief overview of the five courses and of this first one.
1.6 Course resources
Students are encouraged to discuss questions on the forum; other questions can go to deeplearning.ai directly. It is clear that this company/organization is mainly about training: companies that need to train hundreds of employees with deep learning expertise can contact them, and so can university instructors who want to teach a deep learning course.
Week 2 - Neural network basics:
Learn how to set up a machine learning problem with a neural network mindset, and how to use vectorization to speed up your models.
2.1 Binary classification
2.2 Logistic regression
2.3 Logistic regression cost function
2.4 Gradient descent
2.5 Derivatives
2.6 More derivative examples
2.7 Computation graph
2.8 Derivatives with a computation graph
2.9 Gradient descent for logistic regression
2.10 Gradient descent on m examples
2.11 Vectorization
2.12 More vectorization examples
2.13 Vectorizing logistic regression
2.14 Vectorizing logistic regression's gradient output
2.15 Broadcasting in Python
2.16 A note on python/numpy vectors
2.17 Quick tour of Jupyter/IPython notebooks
2.18 (Optional) Explanation of the logistic loss function
Week 3 - Shallow neural networks:
Learn to build a neural network with one hidden layer, using forward propagation and backpropagation.
3.1 Neural network overview
3.2 Neural network representation
3.3 Computing a neural network's output
3.4 Vectorizing across multiple examples
3.5 Explanation of the vectorized implementation
3.6 Activation functions
3.7 Why do you need non-linear activation functions?
3.8 Derivatives of activation functions
3.9 Gradient descent for neural networks
3.10 (Optional) Backpropagation intuition
3.11 Random initialization
Week 4 - Deep neural networks:
Understand the key computations in deep learning, use them to build and train deep neural networks, and apply them to computer vision.
4.1 Deep L-layer neural networks
4.2 Forward propagation in a deep network
4.3 Getting your matrix dimensions right
4.4 Why deep representations?
4.5 Building blocks of deep neural networks
4.6 Forward and backward propagation
4.7 Parameters vs. hyperparameters
4.8 What does this have to do with the brain?
Wednesday, June 11, 2014
ML_general_talk.md
why this article
I am no longer a newbie in machine learning, but I still sometimes wonder what I actually gained from learning "machine learning". By applying some classical algorithms, I got a real feeling for this hot topic.
everything is about generalization
Generalization means the ability to make good predictions on novel data samples, i.e., on testing data, given a model trained on the training data. You can easily get 100% accuracy on the training data (except for ambiguous points: identical samples with different labels), but this is meaningless because the decision boundary is then too complex; the terminology is over-fitting (over-training). Instead of memorizing every training sample as if taking a photo, you need to loosen the decision boundary. Some techniques essentially play this role, like the margin in SVM and regularization in general optimization; a small illustration follows.
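As a minimal sketch of that trade-off (the synthetic dataset and the use of the SVM C parameter as the regularization knob are just my illustrative choices), a weakly regularized RBF-SVM will typically score much higher on the training split than on the test split, while a more regularized one closes that gap:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, flip_y=0.1,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for C in (1000.0, 1.0):              # large C = weak regularization
    clf = SVC(kernel='rbf', C=C, gamma='scale').fit(X_tr, y_tr)
    print(C, clf.score(X_tr, y_tr), clf.score(X_te, y_te))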
Another point is that features come first. You cannot do magic with bad features, so spending more time on feature extraction/selection/design is worthwhile. In some sense, deep learning and sparse coding/representation put the effort into the steps before classification.
Do Preprocessing
Some simple preprocessing, e.g. normalization, whitening, etc., can really benefit your classification.
select correct classification algorithm
Linear or not, SVM or LDA? I always try four algorithms to establish a baseline (see the sketch after this list):
1. Support Vector Machine (SVM)
2. Linear Discriminant Analysis (LDA)
3. Random Forest (RF)
4. K Nearest Neighbor (KNN)
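A minimal scikit-learn version of that baseline loop (the iris dataset and the default hyper-parameters are placeholders; in practice each model would be tuned):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
baselines = {
    'SVM': SVC(),
    'LDA': LinearDiscriminantAnalysis(),
    'RF':  RandomForestClassifier(n_estimators=100, random_state=0),
    'KNN': KNeighborsClassifier(n_neighbors=5),
}
for name, clf in baselines.items():
    scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validation
    print(name, scores.mean())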
Written with StackEdit.
Wednesday, May 7, 2014
Multi-kernel learning, run "SMO-MKL" on my laptop (windows 8 64 bit)
I am compiling an open-source code package for multi-kernel learning:
http://research.microsoft.com/en-us/um/people/manik/code/smo-mkl/download.html
This is not the latest ML software, but could be a good start.
Now it is working on my Windows 8 (64-bit) machine.
1. Add %ProgramFiles%\Microsoft Visual Studio 11.0\VC\bin to %PATH%,
then run "vcvars32.bat" to set up all paths and environment variables for Visual C++.
http://msdn.microsoft.com/en-us/library/f2ccy3wt.aspx
2. type:
nmake -f Makefile.win clean all
Done! The compiled exe files are located in the Windows sub-directory.
Have a try:
svm-train -s 0 -h 0 -m 400 -o 2.0 -a 26 -c 10.0 -l 1.0 -f 0 -j 1 -g 3 -k Example/Classification/PrecomputedKernels/kernelfile Example/Classification/PrecomputedKernels/y_train Example/Classification/PrecomputedKernels/model_file
This example trains a model from a precomputed kernel matrix. You can then test on the testing data:
svm-predict Example/Classification/PrecomputedKernels/y_test Example/Classification/PrecomputedKernels/model_file Example/Classification/PrecomputedKernels/prediction
I eventually found that "nmake -f Makefile.win clean all" does not actually recompile svm.cpp; instead it links against the pre-compiled svm.obj. If you make a change to svm.cpp, nmake will give you error messages. I do not have a solution for this so far.
3. Error/warning messages when running "svm-predict".
This is annoying, because I want to do a grid search to find the optimal parameter values. The result is actually correct, but an unhandled exception triggers the error message, which stops the pipeline until a confirm button is clicked manually.
I had to change svm-predict.c and comment out the code that destroys the svm_model.
Tuesday, April 29, 2014
[keep updating] Deep Learning
This has been the buzzword in the machine learning community recently. Let's start with a 101 article:
http://markus.com/deep-learning-101/
Begin to read the tutorial:
http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial
Chinese version:
http://deeplearning.stanford.edu/wiki/index.php/UFLDL%E6%95%99%E7%A8%8B
and another one:
http://deeplearning.net/tutorial/gettingstarted.html#
Open-source code:
http://deeplearning.net/software/theano/
Introductions in Chinese:
http://www.52ml.net/6.html
http://blog.csdn.net/mysee1989/article/details/11992535
First, some intuitive impressions, after having used deep learning for a while.
The motivation is that the human perception system is hierarchical. Take the visual system as an example: low-level neurons detect local low-level features such as edges, corners, and textures, while higher-level neurons extract higher-level features on top of them, eventually forming high-level concepts.
Neural networks were extremely popular in the 1970s and 80s, but were gradually replaced by the simpler SVM. The reasons: a. NNs have too many parameters and are hard to optimize; b. with many layers they easily overfit.
The explosive growth of computing power solved the first problem (along with better optimization algorithms); the second is addressed by adding regularization to the optimization (e.g. sparsity).
A convolutional neural network (CNN) takes spatial locality into account: when computing from a lower layer to the next, only values within a local neighborhood are considered, so the trained parameters become a set of convolutional filters. The benefits are higher computational efficiency and a lower-complexity model to train. Along with this comes another concept, max-pooling, which can be seen as a non-linear down-sampling: the maximum value within each neighborhood is passed up to the next layer, and the neighborhoods do not overlap.
The benefit of max-pooling is that it makes the algorithm more robust; the cost is losing information.
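A minimal numpy sketch of non-overlapping 2x2 max-pooling (the input values and the pool size are arbitrary), just to make the "take the max over each neighborhood and pass it up" idea concrete:

import numpy as np

def max_pool_2x2(x):
    # Non-overlapping 2x2 max-pooling over a 2-D feature map;
    # assumes both dimensions of x are even.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

feature_map = np.arange(16).reshape(4, 4)
print(max_pool_2x2(feature_map))
# [[ 5  7]
#  [13 15]]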
Monday, April 28, 2014
Face recognition again
These days, some interesting news from the face recognition field is being widely re-posted on social networks.
DeepFace: Closing the Gap to Human-Level Performance in Face Verification (Facebook AI lab)
It is said that its performance is close to that of human beings.
Then, even more incredibly, someone claimed their algorithm outperforms humans.
http://www.zhizhihu.com/html/y2014/4520.html
https://medium.com/the-physics-arxiv-blog/2c567adbf7fc
http://www.52ml.net/14704.html
face++
http://www.faceplusplus.com/uc/app/home?app_id=14807
Most of the results are reported on LFW (Labeled Faces in the Wild):
http://vis-www.cs.umass.edu/lfw/index.html
My adviser wants me to do some face recognition work.
FRR, FAR, TPR, FPR, ROC curve, ACC, SPC, PPV, NPV, etc.
In a framework where an algorithm is supposed to predict "positive" or "negative", some concepts are really confusing, so here is a summary. All of these concepts/metrics are widely used to measure the performance of an algorithm or machine learning model (which is essentially a computationally intensive algorithm).
A: true positive (TP)
B: false negative (FN)
C: false positive (FP)
D: true negative (TN)
A+B: positive (P)
C+D: negative(N)
False reject rate (FRR) = B/(A+B) = FN/(TP+FN) = FN/P = 1-TPR
False accept rate (FAR) = C/(C+D) = FP/(FP+TN) = FPR
True positive rate (TPR) = A/(A+B) = TP/(TP+FN) = TP/P
False positive rate (FPR) = fall out = C/(C+D) = FP/(FP+TN) = FAR = 1-SPC
Accuracy (ACC) = (A+D)/(A+B+C+D)
Sensitivity = TPR = A/(A+B) = TP/(TP+FN)
Specificity (SPC) = TNR = D/(C+D) = TN/(FP+TN) = 1-FPR
Positive predictive value (PPV) = precision = A/(A+C) = TP/(TP+FP)
Negative predictive value (NPV) = TN/(TN+FN)
False discovery rate (FDR) = C/(A+C) = FP/(TP+FP) = 1-PPV
*In the biomedical field, positive usually means diseased, while negative means healthy.
Furthermore,
F1 Score, harmonic mean of precision and sensitivity
F1 = 2TP/(2TP+FP+FN)
Matthews correlation coefficient (MCC)
MCC = (TP*TN - FP*FN) / ((TP+FP)*(TP+FN)*(TN+FP)*(TN+FN))^0.5
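A small Python sketch computing these quantities directly from the four counts (the counts here are made up, just for illustration):

import math

TP, FN, FP, TN = 40, 10, 5, 45     # hypothetical confusion-matrix counts

TPR = TP / (TP + FN)               # sensitivity / recall
FPR = FP / (FP + TN)               # = FAR = 1 - specificity
ACC = (TP + TN) / (TP + FN + FP + TN)
PPV = TP / (TP + FP)               # precision
NPV = TN / (TN + FN)
F1  = 2 * TP / (2 * TP + FP + FN)
MCC = (TP * TN - FP * FN) / math.sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

print(TPR, FPR, ACC, PPV, NPV, F1, MCC)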
When a classification model is finalized, one may want to choose an operating point, i.e., a confidence threshold. This is a trade-off between precision (higher with a higher threshold) and recall (lower with a higher threshold), so you need a curve that shows the performance at all possible threshold levels.
1. The receiver operating characteristic (ROC) curve is a plot of the true positive rate vs. the false positive rate.
When comparing two models, it is hard to tell which one is better just by looking at two curves, so people use the area under the curve (AUC) to measure and compare performance (the larger the better). But two different curves can have the same AUC; choosing between them then depends on which part of the operating range (low false positive rate vs. high true positive rate) matters more.
The AUC can be any value between 0 and 1. A random-guess classifier gives a straight line segment from (0, 0) to (1, 1), i.e., AUC = 0.5. AUC < 0.5 can happen, but then one can simply flip the predictions to get a new classifier with AUC' = 1 - AUC. In this sense, the AUC of a useful classifier can be considered to be at or above 0.5.
In statistical terms, the AUC is the probability that the model assigns a higher score to a randomly chosen positive example than to a randomly chosen negative example (see the small sketch after item 2 below). For a proof, please refer to a very nice blog post: https://madrury.github.io/jekyll/update/statistics/2017/06/21/auc-proof.html.
2. The precision-recall curve is, as the name says, a plot of precision vs. recall.
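A tiny sketch of that probabilistic interpretation of the AUC (scores and labels are made up): estimate it by counting, over all positive-negative pairs, how often the positive example gets the higher score (ties count as 1/2), and compare with scikit-learn's roc_auc_score.

import numpy as np
from sklearn.metrics import roc_auc_score

y_true  = np.array([1, 1, 1, 0, 0, 0])
y_score = np.array([0.9, 0.7, 0.4, 0.6, 0.3, 0.1])

pos = y_score[y_true == 1]
neg = y_score[y_true == 0]
# fraction of positive-negative pairs where the positive scores higher
pairs = ((pos[:, None] > neg[None, :]).astype(float)
         + 0.5 * (pos[:, None] == neg[None, :]))
print(pairs.mean(), roc_auc_score(y_true, y_score))   # both ~0.889 here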
references:
http://en.wikipedia.org/wiki/Sensitivity_and_specificity
ground truth \ prediction | positive | negative | rate
positive | A (TP) | B (FN) | TPR
negative | C (FP) | D (TN) | TNR
rate | PPV, precision | NPV |
Monday, November 25, 2013
Begin scikit-learn on python
As far as I know, this Python toolkit may be the most widely used machine learning library. Let's start by installing it.
All instructions, packages, and installers can be found here:
http://www.lfd.uci.edu/~gohlke/pythonlibs/#scikit-learn
I am now working on Windows 8, 64-bit, so the installation sequence is as follows:
1. numpy-MKL, a package for numerical computation in Python.
2. scipy, another package for scientific computation in Python that depends on numpy-MKL, plus Matplotlib.
3. six -> Python-Dateutil -> pytz -> Pyparsing -> (pillow -> pycairo -> Tornado -> Pyside -> pyqt); the libraries in parentheses are optional.
4. scikit-learn
Done!!
If you want to test the new toolkit, you need to install another package:
https://nose.readthedocs.org/en/latest/
Download the tar.gz file, extract it, and run "python setup.py install".
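Beyond the nose test suite, a quick sanity check (a minimal sketch; the API names follow current scikit-learn and may differ in older versions) is to train a small classifier on one of the bundled datasets:

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = SVC(gamma=0.001).fit(X_tr, y_tr)      # RBF-SVM on the digits data
print('test accuracy:', clf.score(X_te, y_te))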
A Chinese webpage summarizing some open-source machine learning libraries:
http://blog.csdn.net/h349117102/article/details/15029777
I realized that a better way to install these libraries is to use "easy_install", a Python tool usually located in %PythonDir%/Scripts/. It allows you to install any library with "easy_install lib_name", so easy that it really lives up to its name.
Another alternative is to install a pre-built distribution; I tested "Pythonxy". Note that you need to restart your command prompt for the new system variables to take effect.
https://code.google.com/p/pythonxy/
Tuesday, October 8, 2013
Compile SPAMS (SPArse Modeling Software) - Matlab mixed with C/C++
When you need to compile third-party open-source code, it usually provides a MATLAB interface, but compiling the source code (C/C++/Fortran) is the first step, and it is pretty annoying!
This article lists some helpful tips, especially for those working on SPAMS (SPArse Modeling Software).
This post is a good summary:
http://www.mathworks.com/support/solutions/en/data/1-6IJJ3L/
This is my case:
"When using 64-bit MATLAB on 64-bit Windows, you must use a 64-bit compiler to build MEX-files, MATLAB Compiler & Builder components,..."
And very unfortunately,
"The default installation of Visual Studio 2008 Express is only capable of building 32-bit binaries, and will not work with MATLAB.
In order to build 64-bit binaries, the "x64 Compilers and Tools" and Microsoft Windows Software Development Kit (SDK) must both be installed. The x64 Compilers and Tools are not installed by default."
The solution is,
"To install Visual Studio 2008 Express Edition with all required components:
1...
2...
3...
"
http://www.mathworks.com/support/solutions/en/data/1-6IJJ3L/
http://stackoverflow.com/questions/3376198/configuring-64-bit-compilation-inside-visual-studio-2008-express-edition-vs2008
http://msdn.microsoft.com/en-us/library/9yb4317s.aspx
http://pixinsight.com/forum/index.php?topic=1902.0
http://software.intel.com/en-us/articles/configuring-microsoft-visual-studio-for-64-bit-applications/
Unfortunately, when you try to compile SPAMS, it will give you some error messages:
compilation of: -I./linalg/ -I./decomp/ -I./dictLearn/ dictLearn/mex/mexTrainDL.cpp
Warning: MEX could not find the library "acml"
specified with -l option on the path specified
with the -L option.
cl : Command line warning D9035 : option 'O' has been deprecated and will be removed in a future release
Creating library C:\USERS\JFENG\APPDATA\LOCAL\TEMP\MEX_KB~1\templib.x and object C:\USERS\JFENG\APPDATA\LOCAL\TEMP\MEX_KB~1\templib.exp
mexTrainDL.obj : error LNK2019: unresolved external symbol dcopy referenced in function "void __cdecl cblas_copy<double>(__int64,double *,__int64,double *,__int64)" (??$cblas_copy@N@@YAX_JPEAN010@Z)
mexTrainDL.obj : error LNK2019: unresolved external symbol daxpy referenced in function "void __cdecl cblas_axpy<double>(__int64,double,double *,__int64,double *,__int64)" (??$cblas_axpy@N@@YAX_JNPEAN010@Z)
mexTrainDL.obj : error LNK2019: unresolved external symbol dgemv referenced in function "void __cdecl cblas_gemv<double>(enum CBLAS_ORDER,enum CBLAS_TRANSPOSE,__int64,__int64,double,double *,__int64,double *,__int64,double,double *,__int64)" (??$cblas_gemv@N@@YAXW4CBLAS_ORDER@@W4CBLAS_TRANSPOSE@@_J2NPEAN232N32@Z)
...
This is because MATLAB cannot find the CBLAS library, or, I guess, because the compile.m provided by SPAMS contains a small bug.
Solution:
If your cblas option is "builtin", then modify line 98 in compile.m:
original line 98: blas_link='-lmwblas -lmwlapack';
modified line 98: blas_link=sprintf(' -L%s -L/usr/lib/ -lmwblas -lmwlapack',path_to_blas);
Where your "path_to_blas" should be specified in a previous line, e.g. line 102 for me.
path_to_blas='%MATLAB_ROOT%\extern\lib\win64\microsoft'; % you need to tell the compiler where is your BLAS lib.