Everything should be made as simple as possible, but not simpler. (Albert Einstein)

Saturday, April 11, 2015

Kaggle's Competition

Kaggle always has some competitions with prize money. For example, "Diabetic Retinopathy Detection" currently offers $100,000 in prize money.

Most of the current competitions fall into the machine learning field. In the past, though, Kaggle has occasionally hosted discrete optimization competitions, such as 2012's "Traveling Santa Problem", which is actually a traveling salesman problem (TSP).

Several past General Electric flight optimization competitions offered prize money of up to $220,000.

Wednesday, March 4, 2015

Finding Good Lambda for Handwritten Digits Recognition (Neural Network) with Cross Validation Set

Finding a good lambda (λ) for regularization in a machine learning model is important for avoiding under-fitting (high bias) or over-fitting (high variance).

If lambda is too large, all theta (θ) values are penalized heavily and shrink toward zero, so the hypothesis (h) flattens out. (High bias, under-fitting.)
If lambda is too small, there is almost no regularization. (High variance, over-fitting.)

A cross-validation set can be used to select a good lambda, based on a plot of error vs. lambda for both the training data and the validation data.
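The selection procedure can be sketched in a few lines of Python. This is only an illustration, not the code from the post: it assumes a small regularized polynomial regression instead of the neural network, synthetic noisy sine data, and a hypothetical list of candidate lambdas. The names fit_ridge, mse and select_lambda are made up for this sketch. The important part is that the model is trained with the regularization term, while the training and validation errors are measured without it.

```python
import math
import random


def poly_features(x, degree):
    """Map a scalar x to [1, x, x^2, ..., x^degree]."""
    return [x ** d for d in range(degree + 1)]


def solve(a, b):
    """Solve the linear system a * t = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    t = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * t[c] for c in range(r + 1, n))
        t[r] = (m[r][n] - s) / m[r][r]
    return t


def fit_ridge(xs, ys, lam, degree):
    """Regularized least squares; the bias term theta[0] is not penalized."""
    rows = [poly_features(x, degree) for x in xs]
    n = degree + 1
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    for i in range(1, n):
        a[i][i] += lam
    return solve(a, b)


def mse(theta, xs, ys):
    """Mean squared error WITHOUT the regularization term."""
    degree = len(theta) - 1
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(t * f for t, f in zip(theta, poly_features(x, degree)))
        total += (pred - y) ** 2
    return total / len(xs)


def select_lambda(train, val, lambdas, degree=5):
    """Train once per lambda and return the lambda with the lowest validation error."""
    curve = []
    for lam in lambdas:
        theta = fit_ridge(train[0], train[1], lam, degree)
        curve.append((lam, mse(theta, *train), mse(theta, *val)))
    best = min(curve, key=lambda row: row[2])
    return best[0], curve


# Synthetic data (an assumption of this sketch): a noisy sine wave.
rng = random.Random(0)


def make_data(n):
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [math.sin(3 * x) + rng.gauss(0, 0.2) for x in xs]
    return xs, ys


train, val = make_data(20), make_data(20)
lambdas = [0, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, 10]  # hypothetical candidates
best_lambda, curve = select_lambda(train, val, lambdas)
for lam, err_train, err_val in curve:
    print(f"lambda={lam:<6} train={err_train:.4f} validation={err_val:.4f}")
print("selected lambda:", best_lambda)
```

Plotting the two error columns against lambda gives the curves described above: the training error grows as lambda grows, and the chosen lambda is where the validation error is lowest.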

Friday, February 27, 2015

Some Introductory Machine Learning Books


Many machine learning books I have encountered are too math-heavy (for a programmer). But I noted several introductory books:
  •  Machine Learning, Tom M. Mitchell, McGraw Hill. 
  •  Introduction to Machine Learning 2nd edition, Ethem Alpaydin, MIT Press. (without example code)
  •  Bayesian Reasoning and Machine Learning, David Barber. (A free online draft version is available; the latest draft is dated Dec 13, 2014. Ex. code in Matlab with BRMLToolbox.)
  •  Machine Learning, A Probabilistic Perspective, Kevin P Murphy, MIT Press. (ex. code in Matlab with PMTK package.)
  •  Machine Learning, An Algorithmic Perspective, Stephen Marsland, CRC Press. (ex. code in Python)
  •  Machine Learning, Hands-On for Developers and Technical Professionals, Jason Bell, Wiley. (ex. code in Java with Weka toolkit.)
  •  Machine Learning In Action, Peter Harrington, Manning. (ex. code in Python.)
  •  Thoughtful Machine Learning, a Test Driven Approach, Matthew Kirk, O'Reilly. (ex. code in Ruby.)

More programming-oriented books:
Python:
  •  Mastering Machine Learning with scikit-learn, Gavin Hackeling, Packt.
  •  Learning scikit-learn: Machine Learning in Python, Raúl Garreta et al., Packt.
  •  scikit-learn Cookbook, Trent Hauck, Packt.
  •  Building Machine Learning Systems with Python, Willi Richert et al., Packt.
R:
  •  An Introduction to Statistical Learning with Applications in R, Gareth James et al., Springer.
  •  Machine Learning with R, Brett Lantz, Packt.
Scala:
  •  Scala for Machine Learning, Patrick R Nicolas, Packt.

The best ML course, with easy-to-understand video lectures and a very good structure:
Stanford's Prof. Andrew Ng, https://www.coursera.org/course/ml (the old regular format with SoA, closed since 2015).
The new format of the course is on-demand (self-paced), currently without SoA: https://www.coursera.org/learn/machine-learning .

----

Thursday, February 26, 2015

Handwritten Digits Recognition, Experiment with Octave's Neural Network Package "nnet", and RSNNS

This is a note on implementing handwritten digit recognition with a neural network, using the Octave nnet package (or the MATLAB Neural Network Toolbox).

At the end, I play around with R code and the RSNNS library (Stuttgart Neural Network Simulator for R).

GitHub, Octave/MATLAB:
    https://github.com/flyingdisc/handwritten-digits-recognition-octave-nnet
GitHub, R - RSNNS:
    https://github.com/flyingdisc/handwritten-digits-recognition-RSNNS  
----

80-20 Rules, Pareto Principle

Wikipedia's Pareto Principle,
"The Pareto principle (also known as the 80–20 rule, the law of the vital few, and the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes."

Friday, February 13, 2015

MOOC Coursera Certificates

Some of my MOOC certificates (statements of accomplishment), kept as a memento (personal memorandum).

Monday, February 9, 2015

Handling US NOAA Storm Database's Exponent Value of PROPDMGEXP and CROPDMGEXP


How To Handle Exponent Value of PROPDMGEXP and CROPDMGEXP of "StormData.csv"

 

Reproducible Research Project 2, Coursera, Johns Hopkins University

U.S. National Oceanic and Atmospheric Administration’s (NOAA) Storm Database

There is confusion about how to handle the exponent values in the PROPDMGEXP and CROPDMGEXP columns of the database, due to a lack of official information on the NOAA website.

This is an attempt to compare the downloaded database with the online version, to work out what each value actually means.

This analysis is inspired by a post by David Hood, who is a CTA in the Data Science Specialization courses.

At the end of this article, there is a more accurate analysis by Eddie Song.

GitHub PDF and Markdown repository:
 https://github.com/flyingdisc/RepData_PeerAssessment
Rpubs: http://rpubs.com/flyingdisc/PROPDMGEXP
Reproducible report of the project: http://rpubs.com/flyingdisc/RepProject2
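As a rough illustration of what "handling the exponent" means, here is a minimal Python sketch (the project itself is in R). It only resolves the letter codes K, M and B, which the NOAA Storm Data documentation describes as thousands, millions and billions. Treating lowercase letters the same as uppercase is an assumption of this sketch, and every other code found in the column is deliberately left unresolved, since working out those meanings is the subject of the analysis. The function name damage_in_dollars is hypothetical.

```python
# Letter codes described in the Storm Data documentation.
LETTER_MULTIPLIER = {"K": 1e3, "M": 1e6, "B": 1e9}


def damage_in_dollars(value, exp_code):
    """Return value * multiplier, or None when the code is not one of K, M, B.

    Assumption of this sketch: lowercase letters mean the same as uppercase.
    """
    code = (exp_code or "").strip().upper()
    multiplier = LETTER_MULTIPLIER.get(code)
    if multiplier is None:
        return None  # unresolved code: decide its meaning from the analysis
    return value * multiplier


print(damage_in_dollars(25.0, "K"))
print(damage_in_dollars(2.5, "m"))
print(damage_in_dollars(10.0, "?"))
```

Returning None instead of guessing keeps the unresolved rows visible, so they can be counted and handled explicitly once their meaning is settled.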

Monday, January 19, 2015

Sound Transformation with MTG's Spectral Modeling Synthesis Tools

These are the Spectral Modeling Synthesis tools (sms-tools) from MTG UPF (Music Technology Group, Universitat Pompeu Fabra, Barcelona), by Prof. Xavier Serra (who is also an instructor of the Coursera course Audio Signal Processing for Music Applications).

http://mtg.upf.edu/technologies/sms
https://github.com/MTG/sms-tools
https://github.com/MTG/essentia

Here are several simple experiments I made.

Friday, January 2, 2015

Simple Workaround Tip for Malfunction Toshiba USB Mouse

The Toshiba U20 USB mouse uses only 4 wires, while Logitech uses 5 wires.
In some hardware circumstances (a laptop in my case), the Toshiba mouse won't work: it is not detected, it keeps blinking, and the cursor doesn't move.
This is a simple workaround that I tried successfully.

Tuesday, December 30, 2014

Very Simple Yet Effective Dipole Antenna for EVDO 800MHz

One of my mobile broadband Internet providers (EVDO Rev A) plans to close its service in the near future, because Qualcomm has decided to stop the evolution of EvDO. The service has a truly unlimited budget plan. Currently most mobile broadband providers here (both CDMA and GSM) are moving to the early stages of 4G LTE (either FDD or TDD).

As a result, many BTSs (base transceiver stations) near my house have been sold and the signal has been getting worse. Several months ago my EvDO modem (without an external antenna) could easily receive about -70 dBm of HDR power. Since the closure was announced, the signal has typically been only around -85 to -90 dBm.
(With an external antenna, the signal can be received at -50 dBm, but I prefer to use that antenna for another HSPA+ service.)

Fortunately I kept an old, unused indoor TV antenna. Of course, it is designed for the lower frequency band of TV broadcasts, while my dying ISP works at a higher carrier frequency of 800 MHz.
Hoho... so I came up with the idea of building a simple dipole antenna by recycling that old antenna.
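For sizing such an antenna, a common starting point is the half-wave dipole formula. The Python sketch below is only a rough estimate under stated assumptions: a half-wave design (the post only says "simple dipole") and a typical shortening factor of 0.95 for thin wire. Both are assumptions of this sketch, not measurements from the build.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def half_wave_dipole_length(freq_hz, shortening_factor=0.95):
    """Total tip-to-tip length in metres of a half-wave dipole.

    shortening_factor is an assumed correction for wire thickness and end effects.
    """
    wavelength = SPEED_OF_LIGHT / freq_hz
    return shortening_factor * wavelength / 2


total = half_wave_dipole_length(800e6)
print(f"total length: {total * 100:.1f} cm, each arm: {total * 50:.1f} cm")
```

At 800 MHz this gives a total length of about 17.8 cm, or about 8.9 cm per arm, which is much shorter than the elements of a typical indoor TV antenna.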

Tuesday, August 19, 2014

Bubble Clusters Video - Interactive Computer Graphics


"Bubble Clusters" : Clustering algorithm, Bubble's Potential Field algorithm, Iso-Surface & Iso-Line with Marching-Square algorithm.
Assignment #1 of "Interactive Computer Graphics", a course by Prof. Takeo Igarashi, University of Tokyo.

Wednesday, June 18, 2014

Faster Convolution with Separability Property in Multi-dimensional Signals

Topic: Digital image processing. 

Consider a 2D signal (or the impulse response of a 2D filter) with a rectangular (rect) shape.
This 2D rect signal is separable: it can be represented as the outer product of two independent 1D signals.
An impulse signal is separable as well.
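Separability is what makes the convolution faster: instead of one 2D pass, the filter can be applied as a 1D pass over the rows followed by a 1D pass over the columns. The Python sketch below is a minimal illustration with a 3x3 rect kernel and a small made-up image; the function names are hypothetical. It checks that the direct 2D convolution and the two 1D passes give the same result.

```python
def conv2d(image, kernel):
    """Direct 'full' 2D convolution: kh * kw multiplications per input pixel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * (iw + kw - 1) for _ in range(ih + kh - 1)]
    for i in range(ih):
        for j in range(iw):
            for u in range(kh):
                for v in range(kw):
                    out[i + u][j + v] += image[i][j] * kernel[u][v]
    return out


def conv1d(signal, taps):
    """'Full' 1D convolution."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for k, t in enumerate(taps):
            out[i + k] += s * t
    return out


def conv2d_separable(image, col_taps, row_taps):
    """Same result as conv2d when kernel[u][v] == col_taps[u] * row_taps[v]."""
    rows_done = [conv1d(row, row_taps) for row in image]           # filter each row
    cols = [conv1d(list(col), col_taps) for col in zip(*rows_done)]  # then each column
    return [list(row) for row in zip(*cols)]                       # transpose back


# A 3x3 rect kernel is the outer product of two length-3 rect signals.
col_taps = [1.0, 1.0, 1.0]
row_taps = [1.0, 1.0, 1.0]
kernel = [[c * r for r in row_taps] for c in col_taps]

image = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [0.0, 1.0, 0.0, 1.0, 0.0],
    [2.0, 2.0, 2.0, 2.0, 2.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
]

direct = conv2d(image, kernel)
fast = conv2d_separable(image, col_taps, row_taps)
print("same result:", direct == fast)
```

For a kernel of size K x K, the direct version needs K * K multiplications per pixel, while the separable version needs about 2 * K, which is where the speed-up comes from.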

Monday, February 24, 2014

Mobile Broadband Modem Pointing Using QPST, QXDM

These tools have been around for a long time.
Pointing is important for getting the best signal for a broadband/wifi modem, either with or without an external antenna.

(Works only for modem with Qualcomm chipset, and the modem's diagnostic port must be opened.) 

Howto:

Thursday, January 30, 2014

Basic JPEG Compressing/Decompressing Simulation

Standard JPEG compression uses (1) 8x8 Discrete Cosine Transform, (2) quantization based on certain luminance + chrominance tables, and (3) entropy-encoding (Huffman coding).
Here, I'm using OpenCV (Python) to simulate DCT + quantization + IDCT, without Huffman coding.
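The DCT + quantization + IDCT round trip for a single 8x8 block can be sketched as follows. This is not the OpenCV code from the post: to keep it runnable with only the standard library, the orthonormal 8x8 DCT is computed by hand with a basis matrix. The quantization table is the example luminance table from Annex K of the JPEG standard, which is an assumption about which table to use, and the test block is a made-up gradient.

```python
import math

N = 8

# Orthonormal DCT-II basis matrix: C[k][n].
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]

# Example luminance quantization table from the JPEG standard (Annex K).
LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2D DCT of an 8x8 block: C * block * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2D DCT: C^T * coeffs * C."""
    return matmul(matmul(transpose(C), coeffs), C)


def roundtrip(block, table=LUMA_TABLE):
    """Level shift, DCT, quantize, dequantize, IDCT, and clip back to 0..255."""
    shifted = [[p - 128 for p in row] for row in block]
    coeffs = dct2(shifted)
    quantized = [[round(coeffs[i][j] / table[i][j]) for j in range(N)]
                 for i in range(N)]
    dequantized = [[quantized[i][j] * table[i][j] for j in range(N)]
                   for i in range(N)]
    restored = idct2(dequantized)
    pixels = [[min(255, max(0, round(v + 128))) for v in row] for row in restored]
    return quantized, pixels


# Made-up test block: a smooth gradient.
block = [[100 + 5 * i + 3 * j for j in range(N)] for i in range(N)]
quantized, pixels = roundtrip(block)
nonzero = sum(1 for row in quantized for v in row if v != 0)
max_error = max(abs(pixels[i][j] - block[i][j]) for i in range(N) for j in range(N))
print("nonzero coefficients after quantization:", nonzero, "of", N * N)
print("largest pixel error after the round trip:", max_error)
```

Only the quantization step loses information. Most of the quantized coefficients of a smooth block become zero, which is what the entropy coding stage (skipped here, as in the post) would then exploit.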

Tuesday, January 28, 2014

Friday, December 27, 2013

Basic Processing and Chuck Communication

A very basic example on how to establish communication between Processing and ChucK.
  • run oscProc.pde in Processing
  • run oscChuck.ck in ChucK
  • in the sketch, press "s" key to send message

Thursday, December 26, 2013

Fast Fourier Transform in ChucK to Get Spectrum

We can convert a signal from the time domain to the frequency domain by using the Fast Fourier Transform, which calculates the discrete Fourier transform. http://chuck.cs.princeton.edu/doc/language/uana.html

This is the basis for finding the spectrum of an input signal; in this case I use a sinusoidal signal as an example.
https://github.com/flyingdisc/music-chuck/tree/master/FFTSpectrum
This displays the frequency and power (from the polar form of the complex number) at each FFT bin index.
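The same idea can be sketched in Python (the linked code is in ChucK, so this is only an illustration and not a port of it). The sketch uses a plain recursive radix-2 FFT and made-up parameters: a 500 Hz sine, a 6400 Hz sample rate and an FFT size of 64, chosen so that the tone falls exactly on a bin. Scaling conventions differ between FFT implementations, so the magnitudes here are unnormalized.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# Hypothetical parameters for this sketch.
SAMPLE_RATE = 6400
FFT_SIZE = 64
TONE_HZ = 500

signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(FFT_SIZE)]
spectrum = fft(signal)

# Keep the first half of the bins (the second half mirrors it for a real signal).
bins = []
for k in range(FFT_SIZE // 2):
    magnitude, phase = cmath.polar(spectrum[k])  # polar form of the complex bin value
    bins.append((k, k * SAMPLE_RATE / FFT_SIZE, magnitude, phase))

peak = max(bins, key=lambda b: b[2])
print(f"peak at bin {peak[0]}: {peak[1]:.0f} Hz, magnitude {peak[2]:.2f}")
```

Bin k corresponds to the frequency k * SAMPLE_RATE / FFT_SIZE, so with these parameters each bin is 100 Hz wide and the sine shows up in bin 5.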

Tuesday, December 24, 2013

Graph Complex Set In Python with mpmath and matplotlib

In this case I use Python(x,y). This package comes with many standard plugins, such as sympy (in which we use its mpmath), numpy, scipy, matplotlib, and so on. http://code.google.com/p/pythonxy/wiki/StandardPlugins

Simple Processing Game using Minim

This is a simple program (a memory game) written in the Processing 2 language.
It has two versions: Java mode and JavaScript mode.
Using Minim is not straightforward in JavaScript mode, especially for the sound part. (There is a better and richer sound library, Maxim, but so far it can only play sound in the Chrome browser; ref: Can I use Web Audio API.)

Here are both versions,
Java mode:
http://www.openprocessing.org/sketch/125380
JavaScript mode:
https://googledrive.com/host/0B1HulZRKubRMdzhFYVByUnM4YUE/index.html

Electronics Music in ChucK

I am following a course by CalArts (wonderful instructors with expertise in music and computer science: Ajay Kapur, Ge Wang and Perry R. Cook from Princeton).
It uses the ChucK programming language.

Sounds:
http://soundcloud.com/flyingdisc

Some ChucK code:
https://github.com/flyingdisc/music-chuck