Wednesday, April 25, 2012

The Benford's Law


(Sources: Wikipedia) Benford's law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 about 30% of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than 5% of the time. This distribution of first digits is the same as the widths of gridlines on the logarithmic scale. Benford's law also gives the expected distribution for digits beyond the first, which approach a uniform distribution as the digit place goes to the right.

This result has been found to apply to a wide variety of data sets, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). It tends to be most accurate when values are distributed across multiple orders of magnitude.The graph to the right shows Benford's law for base 10. There is a generalization of the law to numbers expressed in other bases (for example, base 16), and also a generalization to second digits and later digits.It is named after physicist Frank Benford, who stated it in 1938, although it had been previously stated by Simon Newcomb in 1881.

Notice that if the numbers increased by a constant rate, for example by 0.3. Then we see that the numbers will not follow the Benford's law.
<Ex>
1, 1.3, 1.6, 1.9, 2.2, 2.5, 2.8, 3.1, 3.3, 3.5, 3.8, 4.1, 4.3, 4.7, 5, 5.3, 5.6, 5.9, 6.2, 6.5, 6.8, 7.1, 7.4, 7.7, 8, 8.3, 8.6, 8.9, 9, 9.3, 9.6, 9.9


You see that the number occurrences of the first digit does not follow the Benford's law. However, when you multiply each subsequent number by an exponential number it will behave properly.


<Ex>
1.0, 1.050, 1.103, 1.158, 1.216, 1.276, 1.340, 1.407, 1.477, 1.551, 1.629, 1.710, 1.796, 1.886, 1.980, 2.079, 2.183, 2.292, 2.407, 2.527, 2.653, 2.786, 2.925, 3.072, 3.225, 3.386, 3.556, 3.733, 3.920, 4.116, 4.322, 4.538, 4.765, 5.003, 5.253, 5.516, 5.792, 6.081, 6.385, 6.705, 7.040, 7.392, 7.762, 8.150, 8.557, 8.985, 9.434, 9.906, 10.401, 10.921


We see this time the occurrence of '1' as the first digit is much more frequent than any other digit. After some calculation, the percent of '1' as first digit turns out to be around 30.6%, which is almost identical as the Benford's Law 30%.


Benford's Law is such a profound discovery and it can be used in a wide variety of fields: Finance, population growth, bacteria growth, anything that related to exponential growth can be predicted using Benford's Law.


Finally, here's the simple program for Benford's Law:


/*Programmer: David Lin
This program examines the famous Benford's Law that stated
numbers growing exponentially tends to behave a certain way.
The program can be used to detect fraud in financial,population-related,
housing market, and etc.
*/
import java.util.*;
import java.io.*;

public class BenfordLaw {
    public static void main(String[] args) throws FileNotFoundException{
        intro();
        Scanner console = new Scanner(System.in);
        Scanner input = searchFile(console);
        int[]numberCount = digitCount(input);
        printStat(numberCount);
    }
    //printing the result and see the percentage of occurrences for each digit
    public static void printStat(int[] numberCount){
        System.out.println();
        int total = sum(numberCount)-numberCount[0];
        System.out.println("Digit Count Percent");
        for(int i=1;i<numberCount.length;i++){
            double pct = numberCount[i]*100.0/total;
            System.out.printf("%5d %5d %7.3f\n",numberCount[i],i,pct);
        }
    }
    //summing the occurrences for each digit
    public static int sum(int[]numberCount){
        int sum = 0;
        for(int n:numberCount){
            sum+=n;
        }
        return sum;
    }
    //Counting the occurrences of the first digit from a file of numbers
    public static int[] digitCount(Scanner input){
        int[]countTimes = new int[10];//# of times digit from 0-9 occur
        while(input.hasNextInt()){
            int num = input.nextInt();
            countTimes[singleDigit(num)]++;
        }
        return countTimes;
    }
    //Performing the algorithm to give the first digit of the number intersted
    public static int singleDigit(int num){
        while(num>=10){
            num = num/10;
        }
        return num;
    }
    //User friendly introduction to the program
    public static void intro(){
        System.out.println("Welcome to David's program !");
        System.out.println("This program runs the famous Benford's Law");
       
    }
    //Search whether the file can be found
    public static Scanner searchFile(Scanner console) throws FileNotFoundException{
        System.out.print("Please enter the file name to be examined(Included extension): ");
        String fileName = console.nextLine();
        File f = new File(fileName);
        while(!f.canRead()){
            System.out.println("File not found. Please try again.");
            System.out.print("Please enter the file name to be examined(Included extension): ");
            fileName = console.nextLine();
        }
       
        return new Scanner(f);
                   
    }
       
}
   



Tuesday, April 24, 2012

Fun creating the pascal triangle using java


The simplest way to approach this problem is by using the knowledge of multidimensional arrays. So how do we start? First of all we identify this as a two-dimensional array. If you are unfamiliar with two dimensional array, the syntax goes like this: <type of array><name> = new <type of array><size of array>. For example, an one dimensional integer array name "numbers" that is of size 5 will have the syntax of the following: int[]numbers = new int[5].

Coming back to our problem, a two dimensional array is very similar to its one dimension counterpart. For instance, a 2x2 integer arrays would have a syntax of the following: int [][] name = new int[2][2]. Since the shape of the pascal triangle is jagged, the syntax will start in some form like the following: int[] [] triangle = new int[7][ ].

Now the most important part to solve this problem is to recognize the pattern. Notice that there are one number in the first row, two numbers in second, three in the third, and so on. Since array uses zero-based index, meaning that 0=1, 1=2,2=3, etc. Another trick to recognize is that it always start and end with '1'. The last thing to notice is that the middle elements are made by adding numbers directly above and one to the left on the previous line. After finishing analyzing the problem, it's time to put it to Java form.

The code will look like the following:

public class PascalTriangle{
     public static void main(String[] args){
             int[ ][ ] triangle = new int[7][ ];//because it's jagged.
             makeTriangle(triangle);//the method that form the pascal triangle.
             display(triangle);//print the triangle on console.
     }

     public static void makeTriangle(int[ ][ ] triangle){
                for(int i = 0;i<triangle.length;i++){
                      triangle[i] = new int[i+1]; //each line has one more element.
                      triangle[i][0] = 1; //first number on each line always '1'
                      triangle[i][i] = 1; // last number  on each line always '1'
                      for(int j = 1;j< i; j++){
                              triangle[i][j] = triangle[i-1][j]+triangle[i-1][j-1];
                      }
                 }
     }

     public static void print(int[ ][ ] triangle){
               for(int i=0; i< triangle.length; i++){
                      for(int j=0; j<triangle[i].length; j++){
                                System.out.print(triangle[i][j]+" ");
                      }
                       System.out.println();
               }
     }

}

Sorry about any unclear parts in this tutorial, as I am still quite new to the blogging community. Please feel free to leave comments on how I can improve my blog. Thanks!

























San Antonio Spurs for Championship??

The Spurs winning eight straight games clinched the top seed in the West after beating the Blazers last night. Looking at the highlight of that game, I can sense a championship hungry team who is ready to reclaim the glory days in 06-07 season.