Everything should be made as simple as possible, but not simpler. (Albert Einstein)

Friday, November 6, 2015

MOOC Coursera Certificates

Some of my MOOC certificates (statements of accomplishment / honor code certificates), kept as a memento (personal memorandum).

Original certificates format: PDF.
Converted image format: PNG.
Image resolution: low-res.

Sunday, October 25, 2015

Singular Value Decomposition with Numpy & Scipy

Following the previous post, "Singular Value Decomposition and Dimensionality, Using R...", here is another approach using NumPy and SciPy.

One application is Latent Semantic Analysis (LSA, also called Latent Semantic Indexing, LSI) on a term-document matrix.

The first input is a list of documents and the second is a list of terms. We build a matrix in which each cell answers "is term t in document d?": it is 1 if term t is found in document d, and 0 otherwise. In this case, documents are the features (columns) and terms are the observations (rows).

This technique is commonly used in NLP (Natural Language Processing) to calculate text similarity.
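The construction described above can be sketched with NumPy as follows. The toy documents, the term list, the choice of k = 2, and the helper name `cosine` are all hypothetical and only serve as an illustration; this is not the code from the full post.

```python
import numpy as np

# Hypothetical toy corpus, not taken from the original post.
documents = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
terms = ["cat", "dog", "sat", "mat", "log", "pets"]

# Rows are terms (observations), columns are documents (features).
# A cell is 1 if the term occurs as a whole word in the document, else 0.
X = np.array(
    [[1 if t in d.split() else 0 for d in documents] for t in terms],
    dtype=float,
)

# Thin SVD: X = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the k largest singular values to get a low-rank "concept" space.
k = 2
doc_vectors = (np.diag(s[:k]) @ Vt[:k, :]).T  # one k-dimensional row per document


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


print(X)
print(cosine(doc_vectors[0], doc_vectors[1]))
```

Comparing documents in the reduced space rather than on the raw 0/1 columns is what lets LSA group documents that share related, but not identical, terms.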

Sunday, October 4, 2015

Singular Value Decomposition and Dimensionality Reduction, Using R and Cat Image for Illustration Purposes

By Soesilo Wijono.

SVD (singular value decomposition) is an important method in data science, especially data mining. It can be used, for example, for dimensionality reduction in a recommender system.
Imagine an online store such as Amazon with millions of items and millions of users. To run a recommender algorithm, the matrix involved would have millions-by-millions dimensions, which is very expensive to compute.
The theory of dimensionality reduction is covered in many places, so we won’t repeat it here. Just remember the basic equation:
X = U A V.T
U is an n x n matrix.
V is a d x d matrix.
A is a diagonal matrix of dimension n x d.
(T represents matrix transpose.)
We want to reduce the dimension of the X matrix.
This post illustrates the method with a PNG cat image, using the raw image data, to help understand the method visually. In the real world, the image data can be replaced by any data, e.g. the items x users matrix of a recommender system.
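As a rough illustration of X = U A V.T and of what "reducing the dimension" means, here is a minimal NumPy sketch (the post itself uses R). The synthetic matrix stands in for the cat image, and the helper name `rank_k` and the values of k are assumptions made for this example.

```python
import numpy as np

# Synthetic stand-in for grayscale image data (the post uses a cat PNG).
rng = np.random.default_rng(0)
n, d = 60, 40
X = rng.random((n, 3)) @ rng.random((3, d)) + 0.01 * rng.random((n, d))

# Full SVD, matching the dimensions above: U is n x n, Vt is d x d.
U, s, Vt = np.linalg.svd(X, full_matrices=True)

# A is the n x d diagonal matrix of singular values.
A = np.zeros((n, d))
A[:len(s), :len(s)] = np.diag(s)
assert np.allclose(U @ A @ Vt, X)  # sanity check of X = U A V.T


def rank_k(U, s, Vt, k):
    """Rebuild X from only the k largest singular values."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


for k in (1, 3, 10):
    Xk = rank_k(U, s, Vt, k)
    rel_err = np.linalg.norm(X - Xk) / np.linalg.norm(X)
    stored = k * (n + d + 1)  # numbers kept instead of n * d
    print(k, stored, round(rel_err, 4))
```

Keeping only the first k columns of U, k singular values, and k rows of V.T stores k * (n + d + 1) numbers instead of n * d, at the cost of the reconstruction error printed in the last column.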

Tuesday, September 15, 2015

Power Pivot for Excel 2010

Power Pivot is delivered with Excel 2013. This is a workaround to install it for Excel 2010.

Saturday, September 5, 2015

Play with TUDelftX Data Analysis, Take it to the MAX()

Just playing around in the course to dig a bit deeper into spreadsheets. I'm a newbie =)
Instructor: Prof. Felienne Hermans, PhD.

Thursday, September 3, 2015

Demo of Recommender System with LensKit and Intellij Idea

If they are not installed yet, install Maven and LensKit in a directory.
Set the environment variable M2_HOME to point to the Maven directory, e.g. "C:\java\apache-maven-3.3.3" on Windows.
Add the paths of the LensKit and Maven binary directories,

Monday, July 20, 2015

Interactive Computer Graphics with WebGL


Assignment 1. 
Tessellation and rotation with WebGL. Rotation on a tessellated polygon resulted in a twisting effect.
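One common way to get such a twist, and only an assumption about what the assignment does, is to scale each vertex's rotation angle by its distance from the origin. The Python sketch below shows the vertex math only (the assignment itself runs in WebGL); the function names `rotate` and `twist` are hypothetical.

```python
import math


def rotate(x, y, theta):
    """Plain 2D rotation of (x, y) about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)


def twist(x, y, theta):
    """Rotation whose angle is scaled by the distance from the origin."""
    d = math.hypot(x, y)
    return rotate(x, y, d * theta)


# Two vertices on the same ray from the origin: a plain rotation keeps them
# aligned, while a twist turns the outer one further than the inner one.
inner, outer = (0.2, 0.0), (1.0, 0.0)
theta = math.pi / 2
print(rotate(*inner, theta), rotate(*outer, theta))
print(twist(*inner, theta), twist(*outer, theta))
```

Tessellation matters here because only vertices are transformed: the more interior vertices a polygon has, the more visibly its edges bend under the twist.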

Disclaimer: Sharing the code in a public repository is mandatory under the assignment rubric. It is not a violation of the Honour Code.

Monday, May 25, 2015

some notes

To avoid headaches, this centralizes all the pieces of "somewhat important" notes =)

Wednesday, March 4, 2015

Finding Good Lambda for Handwritten Digits Recognition (Neural Network) with Cross Validation Set

Finding a good lambda ( λ ) for regularization in a machine learning model is important for avoiding under-fitting (high bias) and over-fitting (high variance).

If lambda is too large, all theta ( θ ) values are penalized heavily and shrink toward zero, so the hypothesis ( h ) becomes nearly constant. (High bias, under-fitting.)
If lambda is too small, there is almost no regularization and the model can fit noise in the training data. (High variance, over-fitting.)

A cross-validation set can be used to select a good lambda, based on a plot of error versus lambda for both the training data and the validation data.
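The selection procedure can be sketched as below. To keep the example small, it swaps the neural network for regularized polynomial regression with a closed-form solution; the synthetic data, the candidate lambda values, and the helper names `features`, `fit_ridge`, and `mse` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)


def features(x, degree=8):
    """Polynomial features x, x^2, ..., x^degree (no bias column)."""
    return np.column_stack([x ** p for p in range(1, degree + 1)])


def fit_ridge(X, y, lam):
    """Closed-form regularized least squares; the bias term is not penalized."""
    Xb = np.column_stack([np.ones(len(X)), X])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def mse(X, y, theta):
    """Unregularized mean squared error, used for both curves."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((Xb @ theta - y) ** 2))


# Synthetic data, split into a training set and a cross-validation set.
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + 0.2 * rng.standard_normal(60)
X_train, y_train = features(x[:40]), y[:40]
X_val, y_val = features(x[40:]), y[40:]

lambdas = [0.0, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
train_err, val_err = [], []
for lam in lambdas:
    theta = fit_ridge(X_train, y_train, lam)
    train_err.append(mse(X_train, y_train, theta))
    val_err.append(mse(X_val, y_val, theta))

# Pick the lambda with the lowest error on the cross-validation set.
best_lambda = lambdas[int(np.argmin(val_err))]
print(best_lambda)
```

Both error curves are computed without the regularization term, so they measure fit only. The training error can only grow as lambda grows, while the validation error typically falls and then rises again; the lambda at its minimum is the one to keep.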

Friday, February 27, 2015

Some Introductory Machine Learning Books

Many of the Machine Learning books I have encountered are too math-heavy (for a programmer). But I noted several introductory books:
  •  Machine Learning, Tom M. Mitchell, McGraw Hill. 
  •  Introduction to Machine Learning 2nd edition, Ethem Alpaydin, MIT Press. (without example code)
  •  Bayesian Reasoning and Machine Learning, David Barber (this has free online draft version, last draft is dated Dec 13, 2014) (ex. code in Matlab with BRMLToolbox).
  •  Machine Learning, A Probabilistic Perspective, Kevin P Murphy, MIT Press. (ex. code in Matlab with PMTK package.)
  •  Machine Learning, An Algorithmic Perspective, Stephen Marsland, CRC Press. (ex. code in Python)
  •  Machine Learning, Hands-On for Developers and Technical Professionals, Jason Bell, Wiley. (ex. code in Java with Weka toolkit.)
  •  Machine Learning In Action, Peter Harrington, Manning. (ex. code in Python.)
  •  Thoughtful Machine Learning, a Test Driven Approach, Matthew Kirk, O'Reilly. (ex. code in Ruby.)

More programming-oriented books:
  •  Mastering Machine Learning with scikit-learn, Gavin Hackeling, Packt.
  •  Learning scikit-learn: Machine Learning in Python, Raúl Garreta et al., Packt.
  •  scikit-learn Cookbook, Trent Hauck, Packt.
  •  Building Machine Learning Systems with Python, Willi Richert et al., Packt.
  •  An Introduction to Statistical Learning with Applications in R, Gareth James et al., Springer.
  •  Machine Learning with R, Brett Lantz, Packt.
  •  Scala for Machine Learning, Patrick R Nicolas, Packt.

The best ML course, with easy-to-understand, very well-structured video lectures:
Stanford's Prof. Andrew Ng, https://www.coursera.org/course/ml (old regular format with SoA, closed since 2015).
The new format of the course is on-demand (self-paced), currently without SoA: https://www.coursera.org/learn/machine-learning .


Thursday, February 26, 2015

Handwritten Digits Recognition, Experiment with Octave's Neural Network Package "nnet", and RSNNS

This is a note on implementing handwritten digit recognition with a neural network, using the Octave nnet package (or the MATLAB Neural Network Toolbox).

At the end, I play around with R code and the RSNNS library (Stuttgart Neural Network Simulator for R).

GitHub, Octave/MATLAB:
GitHub, R - RSNNS:

80-20 Rules, Pareto Principle

Wikipedia's Pareto Principle,
"The Pareto principle (also known as the 80–20 rule, the law of the vital few, and the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes."

Monday, February 9, 2015

Handling US NOAA Storm Database's Exponent Value of PROPDMGEXP and CROPDMGEXP

How To Handle Exponent Value of PROPDMGEXP and CROPDMGEXP of "StormData.csv"


Reproducible Research Project 2, Coursera, Johns Hopkins University

U.S. National Oceanic and Atmospheric Administration’s (NOAA) Storm Database

There is confusion about how to handle the exponent values in the PROPDMGEXP and CROPDMGEXP columns of the database, due to the lack of official information on the NOAA website.

This is an attempt to compare the downloaded database with the online version, to conclude what each value actually means.

This analysis was inspired by a post by David Hood, who is a CTA in the Data Science Specialization courses.

At the end of this article, there is a more accurate analysis done by Eddie Song.
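Once a meaning has been settled for each code, applying it is a simple lookup. The sketch below is in Python (the project itself is in R) and is not the conclusion of the analysis: K, M and B are the letter codes described in the NOAA documentation, H as "hundreds" is an assumption, and every other symbol is deliberately left for the caller to supply. The names `LETTER_MULTIPLIER`, `damage_value`, and `other` are hypothetical.

```python
# Letter codes only. K, M and B are the codes described in the NOAA
# documentation; H (hundreds) and every other symbol are assumptions here.
LETTER_MULTIPLIER = {"H": 1e2, "K": 1e3, "M": 1e6, "B": 1e9}


def damage_value(amount, exp, other=None):
    """Return amount scaled by its exponent code.

    `other` maps non-letter codes (digits, "+", "-", "?", "") to a
    multiplier. Codes found in neither table yield None, so that they can
    be inspected rather than silently guessed.
    """
    code = (exp or "").strip().upper()
    if code in LETTER_MULTIPLIER:
        return amount * LETTER_MULTIPLIER[code]
    if other is not None and code in other:
        return amount * other[code]
    return None


print(damage_value(25.0, "K"))
print(damage_value(2.5, "m"))
print(damage_value(5.0, "?"))
```

Returning None for unknown codes keeps the ambiguous rows visible, which is the part of the data that the comparison with the online database is meant to resolve.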

GitHub PDF and Markdown repository:
Rpubs: http://rpubs.com/flyingdisc/PROPDMGEXP
Reproducible report of the project: http://rpubs.com/flyingdisc/RepProject2

Monday, January 19, 2015

Sound Transformation with MTG's Spectral Modeling Synthesis Tools

These are the Spectral Modeling Synthesis tools (sms-tools) of MTG UPF (Music Technology Group, Universitat Pompeu Fabra, Barcelona), by Prof. Xavier Serra (also the instructor of Audio Signal Processing for Music Applications on Coursera).


Several simple experiments I made.

Friday, January 2, 2015

Simple Workaround Tip for Malfunction Toshiba USB Mouse

The Toshiba U20 USB mouse uses only 4 wires, while Logitech uses 5 wires.
In some hardware setups (a laptop in my case), the Toshiba mouse doesn't work: it is not detected, it keeps blinking, and the cursor doesn't move.
This is a simple workaround that I tried successfully.

Tuesday, December 30, 2014

Very Simple Yet Effective Dipole Antenna for EVDO 800MHz

One of my mobile broadband Internet providers (EVDO Rev A) plans to close its service in the near future, because Qualcomm has decided to stop EvDO evolution. The service has a truly unlimited budget plan. Currently most mobile broadband providers (both CDMA and GSM) here are moving to the early stages of 4G LTE (either FDD or TDD).

As a result, many BTS (base transceiver stations) near my house have been sold, and the signal has been getting worse. Several months ago my EvDO modem (without an external antenna) could easily receive about -70 dBm of HDR power. But since the shutdown was announced, the signal has typically dropped to around -85 to -90 dBm.
(With an external antenna, the signal can be received at -50 dBm, but I prefer to use that antenna for another HSPA+ service.)

Fortunately, I kept an old unused indoor TV antenna. Of course, it is designed for the lower frequency band of TV broadcasts, while my dying ISP works at a higher carrier frequency of 800 MHz.
Hoho... so I thought of building a simple dipole antenna to recycle that old antenna.
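For sizing such an antenna, a back-of-the-envelope calculation is sketched below. It assumes a half-wave dipole and a rule-of-thumb shortening factor of 0.95 for end effects; neither figure comes from the original build, and the function name `half_wave_dipole` is hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def half_wave_dipole(freq_hz, shortening=0.95):
    """Return (total_length_m, arm_length_m) of a half-wave dipole.

    `shortening` is a rule-of-thumb end-effect factor (assumed, not measured).
    """
    wavelength = C / freq_hz
    total = shortening * wavelength / 2
    return total, total / 2


total, arm = half_wave_dipole(800e6)
print(f"total {total * 100:.1f} cm, each arm {arm * 100:.1f} cm")
```

At 800 MHz this gives a total length of roughly 17.8 cm, so each of the two arms is about 8.9 cm, far shorter than the elements of a TV antenna built for lower frequencies.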

Tuesday, August 19, 2014

Bubble Clusters Video - Interactive Computer Graphics

"Bubble Clusters": clustering algorithm, bubble's potential field algorithm, iso-surface and iso-line with the Marching Squares algorithm.
Assignment #1 of "Interactive Computer Graphics", a course by Prof. Takeo Igarashi, University of Tokyo.

Wednesday, June 18, 2014

Faster Convolution with Separability Property in Multi-dimensional Signals

Topic: Digital image processing. 

Consider a rect-shaped 2D signal (or the impulse response of a 2D filter).
This 2D rect signal is separable: it can be represented as the product of two independent 1D signals.
An impulse signal is also separable.
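The speed-up from separability can be checked with a small NumPy sketch. The kernel size, the random test image, and the helper names `conv2d_full` and `conv2d_separable` are assumptions made for this example; the direct version is written for clarity, not speed.

```python
import numpy as np


def conv2d_full(image, kernel):
    """Direct 2D 'full' convolution, written out for clarity rather than speed."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for i in range(kh):
        for j in range(kw):
            out[i:i + ih, j:j + iw] += kernel[i, j] * image
    return out


def conv2d_separable(image, col, row):
    """Convolve every row with `row`, then every column with `col`."""
    tmp = np.apply_along_axis(np.convolve, 1, image, row)
    return np.apply_along_axis(np.convolve, 0, tmp, col)


# A 3x5 rect kernel is the outer product of two 1D rect signals.
col = np.ones(3)
row = np.ones(5)
kernel = np.outer(col, row)

rng = np.random.default_rng(0)
image = rng.random((32, 48))

direct = conv2d_full(image, kernel)             # 3 * 5 = 15 multiplies per output sample
separable = conv2d_separable(image, col, row)   # 3 + 5 = 8 multiplies per output sample
print(np.allclose(direct, separable))
```

For an M x N separable kernel, the cost per output sample drops from M * N multiplications to M + N, which is where the faster convolution comes from.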

Monday, February 24, 2014

Mobile Broadband Modem Pointing Using QPST, QXDM

These tools have been known for a long time.
Pointing is important for getting the best signal for a broadband/wifi modem, either with or without an external antenna.

(Works only for modems with a Qualcomm chipset, and the modem's diagnostic port must be open.)


Thursday, January 30, 2014

Basic JPEG Compressing/Decompressing Simulation

Standard JPEG compression uses (1) an 8x8 Discrete Cosine Transform, (2) quantization based on luminance and chrominance tables, and (3) entropy encoding (Huffman coding).
Here, I use OpenCV (Python) to simulate DCT + quantization + IDCT, without Huffman coding.
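The DCT + quantization + IDCT round trip for a single 8x8 block can be sketched as follows. To stay self-contained, this version builds the orthonormal DCT matrix with NumPy instead of calling OpenCV, so it is not the code from the full post. The quantization table is the standard luminance table from Annex K of the JPEG standard; the test block and the helper names are hypothetical.

```python
import numpy as np

N = 8


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that the 2D DCT of B is C @ B @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C


C = dct_matrix()

# Standard JPEG luminance quantization table (Annex K of the JPEG standard).
Q = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
], dtype=float)


def compress_block(block, q=Q):
    """Level shift, 2D DCT, then quantize to integer-valued coefficients."""
    coeffs = C @ (block - 128.0) @ C.T
    return np.round(coeffs / q)


def decompress_block(quantized, q=Q):
    """Dequantize, inverse 2D DCT, undo the level shift."""
    return C.T @ (quantized * q) @ C + 128.0


# Hypothetical smooth 8x8 block (a gentle gradient), standing in for image data.
rows, cols = np.indices((N, N))
block = 100.0 + 5.0 * rows + 3.0 * cols

quantized = compress_block(block)
restored = decompress_block(quantized)
print(int(np.count_nonzero(quantized)), "of 64 coefficients are non-zero")
print("max pixel error:", float(np.abs(restored - block).max()))
```

Quantization is the lossy step: most high-frequency coefficients round to zero, which is what the entropy coder (left out here, as in the post) would then exploit.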