Tuesday, January 8, 2008

PHP and JavaScript: Drawing a binomial distribution bar graph using the GD library

Note: This script is licensed under a Creative Commons Attribution 3.0 Unported license.

This means that you are free to use, modify and distribute my work, as long as you:
1) Attribute me in an acceptable manner. By this, I mean that you must make it clear to anyone using a copy of this code or code based upon this code that I, eXanock, am the author of the original script. You must also include a link to my website: http://exanock.blogspot.com.
2) Release this code and any code based upon it under the same license as I have.


In a previous post, we looked at two TI-BASIC programs that reduce the number of keystrokes needed to calculate a binomial distribution and the sum of binomial distributions.

In today's post we return to this subject, only this time we are going to use JavaScript, PHP and the GD library to draw a bar chart showing the binomial distribution. In this example, we are going to draw a graph showing the probability of 0 to 10 successes out of 10 trials, where the probability of success, p, is entered by the user.

First, we are going to create the PHP script that draws the chart. Then we will create an HTML document that uses JavaScript to pass the probability (entered by the user in a form) via the URL to the PHP script, which in turn calculates the binomial distribution and draws the graph. In the PHP script, we are going to check that the probability lies in the interval 0≤p≤1. This is to ensure that:
1) The value entered in the form is correct.
2) If someone manually enters an invalid value into the URL of the PHP script, the image that is drawn will only display an error message.
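
The same interval check can also be sketched in JavaScript, for example to reject a bad value before it is ever sent. This is only an illustration: the function name isValidProbability is hypothetical and not part of the original script, and the PHP check described above is still needed, since the URL can be edited by hand.

```javascript
// Hypothetical helper (not in the original script): returns true only when
// the input represents a finite number in the closed interval [0, 1].
function isValidProbability(input) {
  // Only strings (form values) and numbers are accepted.
  if (typeof input !== "string" && typeof input !== "number") {
    return false;
  }
  // Number("") is 0, so empty or whitespace-only strings are rejected here.
  if (typeof input === "string" && input.trim() === "") {
    return false;
  }
  const p = Number(input);
  return Number.isFinite(p) && p >= 0 && p <= 1;
}

console.log(isValidProbability("0.3")); // true
console.log(isValidProbability("1.5")); // false
console.log(isValidProbability("abc")); // false
```

Form fields always deliver strings, which is why the sketch converts the value with Number() before comparing it.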

Tools needed for this tutorial:
  • A server with PHP and the GD library installed. (I recommend XAMPP, which requires no advanced setup. Just download and install.)
  • A PHP editor. I prefer the free source code editor Notepad++, so that's what I'll be using. However, Notepad (or any similar text editor) will work just fine.

    The image we are going to make will be created on the fly, and therefore no image is stored on the server.

    The first thing you need to do is to create a new PHP document. I will call mine graph.php

    Open the file in your preferred text/PHP editor.

    As always when writing PHP, we begin our code with the line:

    <?php

    Then, we must tell the browser that it's receiving an image in the PNG format:

    header("Content-type: image/png");

    The next thing we have to do is to create a variable that sets up the image width and height. I'll call this variable im, and I'll be using an image size of 125*120 pixels. As you can see, the script will display an error message if it fails to create the image:

    $im = @imagecreate(125, 120) or die("Cannot Initialize new GD image stream");

    Before we begin drawing, we must define the variables, including those for the colors. I'll be using white as the background color, the graph's axes will be black, the numbers on the x-axis blue, and the bars red and green. I will also draw some gray grid lines to make the graph easier to read.

    $background_color = imagecolorallocate($im, 255, 255, 255);
    $black = imagecolorallocate($im, 0, 0, 0);
    $red = imagecolorallocate($im, 255, 0, 0);
    $green = imagecolorallocate($im, 0, 255, 0);
    $blue = imagecolorallocate($im, 0, 0, 255);
    $grey = imagecolorallocate($im, 200, 200, 200);


    Then, we define the probability variable, which I have chosen to call prob. You can call this variable whatever you want, but choosing logical names for your variables is a great advantage because it makes it easier for others (and for yourself at a later point) to understand what happens in the script. You should also comment your code generously. I have not included comments in this tutorial because I'm explaining everything - at least most of it - step by step. The code is, however, commented in the zip which you can download from here.

    We use $_GET['variable'] to get a variable from the URL. Say, for example, that you have the URL http://www.yourwebsite.com/script.php?a=30&b=45 and you want to use the variables a and b in your script. Then you simply use $new_variable_name = $_GET['a']; and $another_variable_name = $_GET['b']; to define the variables. Since we want to make the URL as short as possible, we only use p to name the probability in the URL.
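
    To see what ends up in a and b, the same query string can be taken apart in JavaScript with the standard URL class. This is only a sketch to illustrate how the URL is split. It is not what PHP does internally, and the variable names are chosen for this example.

```javascript
// The example URL from the text above.
const url = new URL("http://www.yourwebsite.com/script.php?a=30&b=45");

// searchParams.get() returns the value as a string, or null if the key is absent.
const a = url.searchParams.get("a");
const b = url.searchParams.get("b");
const missing = url.searchParams.get("c");

console.log(a, b, missing); // 30 45 null
```

    Note that the values arrive as strings ("30", not 30), which is one reason the script has to check its input before calculating with it.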

    $prob = $_GET['p'];

    Before we find the probability of each of the r successes out of 10 trials, we add an if statement to check whether p lies in the interval 0 ≤ p ≤ 1. We calculate each probability with the formula (n C r)*p^r*(1-p)^(n-r), where (n C r) is found with the formula n!/(r!*(n-r)!). This means we need a function that finds the factorial of a number (the one I'm using here was written by moikboy). We must also define the variables i, n and r, which will be needed for the calculation.

    // is_numeric() rejects non-numeric input such as "abc" before the range check
    if (is_numeric($prob) && $prob >= 0 && $prob <= 1) {
    function fact($int){
    if($int<2)return 1;
    for($f=2;$int-1>1;$f*=$int--);
    return $f;
    };

    $i = 0;
    $n = 10;
    $r = 0;
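
    The factorial function and the (n C r) formula can be checked separately from the drawing code. Below is a minimal JavaScript sketch of the same arithmetic. The names factorial and choose are chosen for this example and do not appear in the PHP script.

```javascript
// Iterative factorial; returns 1 for 0 and 1, like fact() in the PHP script.
function factorial(k) {
  let result = 1;
  for (let i = 2; i <= k; i++) {
    result *= i;
  }
  return result;
}

// Binomial coefficient: n C r = n! / (r! * (n - r)!).
function choose(n, r) {
  return factorial(n) / (factorial(r) * factorial(n - r));
}

console.log(factorial(5));  // 120
console.log(choose(10, 3)); // 120
console.log(choose(10, 0)); // 1
```

    For n = 10 the factorials stay small (10! = 3628800), so this direct formula is safe here. For much larger n the factorials would grow beyond what can be represented exactly.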

    To reduce the amount of code that needs to be written, we then make a while loop. This loop creates 11 variables called bar0, bar1, bar2 and so on up to bar10, and calculates a value (which will be the height of the bar) for each of them.

    while ($i <= 10) {
    // 'bar' must be quoted: an unquoted bar is an undefined constant (an error in PHP 8)
    ${'bar'.$i} = round(((fact($n)/(fact($r)*fact($n-$r)))*pow($prob, $r)*pow(1-$prob, $n-$r))*100);
    $i++;
    $r++;
    }
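
    To check the numbers the loop produces, the same calculation can be sketched in JavaScript. This sketch stores the heights in an array instead of 11 separate variables, which is a design choice of the example and not what the PHP script does. The function names are also chosen for this example.

```javascript
function factorial(k) {
  let result = 1;
  for (let i = 2; i <= k; i++) {
    result *= i;
  }
  return result;
}

// Probability of exactly r successes in n trials: (n C r) * p^r * (1-p)^(n-r).
function binomialProbability(n, r, p) {
  const coefficient = factorial(n) / (factorial(r) * factorial(n - r));
  return coefficient * Math.pow(p, r) * Math.pow(1 - p, n - r);
}

// Bar heights in pixels: one pixel per percent, rounded to whole pixels.
function barHeights(n, p) {
  const heights = [];
  for (let r = 0; r <= n; r++) {
    heights.push(Math.round(binomialProbability(n, r, p) * 100));
  }
  return heights;
}

console.log(barHeights(10, 0.5).join(", "));
// 0, 1, 4, 12, 21, 25, 21, 12, 4, 1, 0
```

    With p = 0.5 the tallest bar is the middle one (5 successes) at 25 pixels, and the outermost bars round down to 0 pixels.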

    The next thing we are going to do is to draw the background grid lines. Again, we use a loop to reduce the amount of code needed. In order to draw the grid lines, we use imageline ($name_of_the_image, x1, y1, x2, y2, $color). This draws a line in the image $name_of_the_image ($im in our case) from the pixel located at x1, y1 to the pixel located at x2, y2. Remember that the location of the upper left pixel is 0, 0.

    We start with the bottom grid line, and for each new grid line we reduce y1 and y2 by 10 pixels.

    $i = 0;

    while ($i < 10) { // 10 grid lines, from y = 98 up to y = 8
    imageline ($im, 7, 98-($i*10), 117, 98-($i*10), $grey);
    $i++;
    }


    Now for the actual bars. Each of the 11 bars will be 10 pixels wide. One pixel in height will represent one percent. We'll be using imagefilledrectangle ($name_of_the_image, x1, y1, x2, y2, $color) to draw the bars. This creates a filled rectangle of color $color in the image $name_of_the_image, starting at the upper left coordinates x1, y1 and ending at the bottom right coordinates x2, y2. y1 is the value that determines the height of each bar.

    Once again, we use a loop. Inside this loop, we add an if statement which checks whether the number of the bar divided by two gives a whole number. If so, the bar is colored red. Otherwise, the bar is colored green. That way, the graph will be easier to read. Each new bar that we draw is placed 10 pixels to the right of the previous bar. We set y2 (the bottom line of the bar) to 108 and define y1 as 108 minus the height of the bar, ${'bar'.$i}. This way, all bars have their bottom lines at the same height.

    $i = 0;

    while ($i <= 10) {

    if ($i/2 == round($i/2)) {
    $color = $red;
    }
    else {
    $color = $green;
    }

    imagefilledrectangle ($im, 8+($i*10), 108-${'bar'.$i}, 17+($i*10), 108, $color);

    $i++;
    }
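
    The coordinates used for each bar can be worked out without drawing anything. The JavaScript sketch below reproduces the arithmetic of the loop above. The function barRectangle and the constant names are hypothetical and only serve this example. The numbers 8, 10 and 108 are taken from the PHP code.

```javascript
// Constants taken from the PHP drawing code.
const BASELINE_Y = 108; // y coordinate of the bottom of every bar
const LEFT_X = 8;       // x coordinate where bar 0 starts
const BAR_WIDTH = 10;   // each bar is 10 pixels wide

// Returns the rectangle corners and color for bar number `index`
// with the given height in pixels (one pixel per percent).
function barRectangle(index, height) {
  const x1 = LEFT_X + index * BAR_WIDTH;
  return {
    x1: x1,
    y1: BASELINE_Y - height,
    x2: x1 + BAR_WIDTH - 1, // inclusive right edge, e.g. 8..17 for bar 0
    y2: BASELINE_Y,
    color: index % 2 === 0 ? "red" : "green", // even bars red, odd bars green
  };
}

const first = barRectangle(0, 25);  // x1=8,  y1=83, x2=17, y2=108, red
const fourth = barRectangle(3, 12); // x1=38, y1=96, x2=47, y2=108, green
console.log(first, fourth);
```

    Because the image's y-axis points downwards, a taller bar has a smaller y1, while y2 stays fixed at 108.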

    Then, we draw the axes and label the x-axis.

    imageline ($im, 7, 8, 7, 108, $black); // y-axis
    imageline ($im, 7, 108, 117, 108, $black); // x-axis

    $i = 0;

    while ($i <= 10) {
    imagestring($im, 3, 10+($i*10), 108, $i, $blue);

    $i++;
    }

    Now, we must close the if statement and write an else statement.

    }

    else {
    imagestring($im, 3, 5, 5, "ERROR! P MUST BE", $red);
    imagestring($im, 3, 5, 17, "IN THE INTERVAL", $red);
    imagestring($im, 3, 5, 29, "WHERE 0 <= P <= 1", $red);
    }

    The last pieces of code that we need to add in order to complete the PHP script are imagepng($name_of_the_image), which outputs the actual image, and imagedestroy($name_of_the_image), which deletes the image from the server's memory after it has been sent to the user.

    imagepng($im);
    imagedestroy($im);

    And as always, we close the script by adding this as the last line of the code:

    ?>

    Now for the HTML document with the form field. Since this is a tutorial on JavaScript, PHP and the GD library, I'm not going to explain the HTML. Here is a standard HTML document setup:

    <html>
    <head>
    <title>Bar graph: Binomial distribution</title>
    </head>
    <body>

    </body>
    </html>

    We are going to add a form and a script to this document. Insert this code between <body> and </body>:

    <form name="userdata">
    Probability for success: <input type='text' name="p" />
    <input type='button' onClick="mybarchart()" value='Draw bar chart' />
    </form>

    All you need to know about the code above is that onClick="mybarchart()" calls a JavaScript function named mybarchart() when the user clicks on the button that says "Draw bar chart". Also, you should note that the names I have assigned to the form and the input field will be used by the JavaScript to identify our form. Again, you can choose whatever names you like for these - just make sure that the name of the form and the name of the field are not the same, as that could cause problems.

    Let's add the JavaScript. Insert it between the <head> and </head> tags, preferably after the line with <title> and </title>. Start out by telling the browser that what comes here is a script. This is done by adding:

    <script type='text/javascript'>

    Next, we add a function which we'll call mybarchart.

    function mybarchart(){

    Then, we add a variable, which we call prob:

    var prob = document.userdata.p.value;

    The line above uses the names that we assigned earlier to locate the correct form (userdata) and then the correct input field (p), and then it gets the value submitted by the user in that field.

    Then, we tell the script to write an image tag with the URL of our PHP script. After graph.php?p= it writes out the value entered by the user.

    document.write("<img src='http://www.replacethiswithyourwebsite.com/graph.php?p=" +prob +"' width='125' height='120' />");

    All that's left now, is closing the function and the script. This is done by adding:
    }
    </script>
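
    One possible variation, sketched below, is to build the URL in a small helper that encodes the value with encodeURIComponent, so that characters such as & or spaces in the input cannot change the query string. The helper buildChartUrl and the element id "chart" are hypothetical and not part of the original script. Note also that document.write, when called after the page has finished loading, replaces the whole page, so the form disappears once the chart is drawn. Assigning the URL to an existing image element avoids that.

```javascript
// Hypothetical helper: builds the image URL for a given probability.
// baseUrl is a placeholder, as in the tutorial; replace it with your own address.
function buildChartUrl(baseUrl, prob) {
  return baseUrl + "?p=" + encodeURIComponent(prob);
}

const chartUrl = buildChartUrl("http://www.replacethiswithyourwebsite.com/graph.php", "0.3");
console.log(chartUrl);
// http://www.replacethiswithyourwebsite.com/graph.php?p=0.3

// In a browser, the URL could be assigned to an existing <img id="chart">
// element (an assumed id) instead of calling document.write.
if (typeof document !== "undefined") {
  const img = document.getElementById("chart");
  if (img && document.userdata) {
    img.src = buildChartUrl("graph.php", document.userdata.p.value);
  }
}
```

    The browser-only part is guarded by a typeof check, so the sketch also runs outside a web page.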

    That's it. We're finished.

    Demo | Download the .zip containing all code
  • Saturday, September 22, 2007

    TI-BASIC Programs: Binomial Distribution and Sum of Binomial Distributions

    Note: This is NOT a programming tutorial, but rather a walkthrough intended to show you the development of a program. Therefore, I am not going to explain the code line by line, but I will explain the purpose of the program and briefly describe how it works.

    Even though I have written and tested this program on a TI-84 Plus calculator, it should work on any of Texas Instruments' graphing calculators as they all have the TI-BASIC language built into them.


    If the only reason why you're reading this is because you need one of the programs, jump to the "Binomial Distribution" program code or jump to the "Sum of Binomial Distributions" program code. If you are here to read the whole thing, just keep on reading.

    For more information about the TI-84 Plus calculator, download the TI-84 Plus Guide Books. In these guide books you will find more information about the different commands that I use in this program. For more on TI-BASIC, you should also check out the TI-BASIC Programming Wikibook, TI-Basic Developer, ticalc.org and the TI-83 Plus SDK documentation (works for TI-84 Plus as well).

    Last month we worked with probability at school. Using our TI-84 Plus calculators and the formula for Binomial Distribution, we found the probability for r successes out of n trials to occur.

    To find this you have to type the following on the calculator:

    (n nCr r)*p^r*(1-p)^(n-r)

    Here, the letters n, r and p should be replaced by numbers.

    This can be somewhat time consuming, so I decided to make a simple program (written in TI-BASIC) that asks for n, r and p before it calculates the probability, thereby reducing the number of keystrokes required. The program also gives the answer as a fraction where possible, which is often an advantage when working with probability. Another advantage of my program is that it detects illegal values before the calculation starts. The purpose of the program is to save the user from unnecessarily time-consuming operations. If it asked for the variables and then performed a calculation (which, depending on how fast the calculator is and how advanced the operation is, may take some time) only to stop at the very end with an error message, it would in fact work against its purpose.

    Program code: Binomial Distribution v1.0 aka. BINDISTR
    :Prompt N
    :Prompt R
    :While R>N
    :Disp "NO! R SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE R≤N"
    :Prompt N
    :Prompt R
    :End
    :Prompt P
    :While not(0≤P and P≤1)
    :Disp "NO! P SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE 0≤P≤1"
    :Prompt P
    :End
    :ClrHome
    :Disp "THE ANSWER IS:"
    :(N nCr R)*P^R*(1-P)^(N-R)►Frac
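
    The idea of checking every value before any calculation starts can be sketched in JavaScript as well. This is not a translation of the TI-BASIC program: the function name binomialChecked is hypothetical, the sketch throws an error where the calculator program asks for new input, and the check for non-negative whole numbers is an extra assumption that BINDISTR does not make.

```javascript
function factorial(k) {
  let result = 1;
  for (let i = 2; i <= k; i++) {
    result *= i;
  }
  return result;
}

// Validate first, then compute (n C r) * p^r * (1-p)^(n-r).
function binomialChecked(n, r, p) {
  // Extra assumption of this sketch: n and r are non-negative integers.
  if (!Number.isInteger(n) || !Number.isInteger(r) || n < 0 || r < 0) {
    throw new RangeError("n and r must be non-negative integers");
  }
  // Same conditions as the two While loops in BINDISTR.
  if (r > n) {
    throw new RangeError("r must satisfy r <= n");
  }
  if (!(p >= 0 && p <= 1)) {
    throw new RangeError("p must satisfy 0 <= p <= 1");
  }
  const coefficient = factorial(n) / (factorial(r) * factorial(n - r));
  return coefficient * Math.pow(p, r) * Math.pow(1 - p, n - r);
}

console.log(binomialChecked(10, 3, 0.5)); // 0.1171875
```

    Writing the last check as !(p >= 0 && p <= 1) also rejects NaN, which would slip through a check written as p < 0 || p > 1.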


    The next thing we learned about at school was finding the probability for r OR r+1 OR r+2 OR ... successes out of n trials to occur. In order to find this, you simply sum the probabilities of the different scenarios. This, of course, is even more time consuming than the previous example, so I decided to modify the program so that it asks for n, the lowest r, the highest r and p, and then calculates the probability. I figured this would be a nice way to do that:

    Program code: Sum of Binomial Distributions v1.0 aka. BINOMSUM
    :0→A
    :Prompt N
    :Disp "LOWEST R:"
    :Prompt L
    :While L>N
    :Disp "NO! R SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE R≤N"
    :Prompt N
    :Prompt L
    :End
    :Disp "HIGHEST R:"
    :Prompt H
    :While H>N
    :Disp "NO! H SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE H≤N"
    :Prompt H
    :End
    :While L>H
    :Disp "NO! H SHOULD BE"
    :Disp "SO THAT L≤H"
    :Prompt H
    :End
    :Prompt P
    :While not(0≤P and P≤1)
    :Disp "NO! P SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE 0≤P≤1"
    :Prompt P
    :End
    :While L≤H
    :((N nCr L)*P^L*(1-P)^(N-L)+A)→A
    :(L+1)→L
    :End
    :ClrHome
    :Disp "THE ANSWER IS:"
    :A►Frac


    The next day, my teacher told me that the calculator already has a method for summing a sequence of probabilities, and the calculator proved to be faster with this method than with mine:

    sum(seq(n nCr X*p^X*(1-p)^(n-X),X,l,h))

    Here, the letters n, p, l and h should be replaced by numbers.

    However, once again this involves a lot of typing for the user, which means time spent unnecessarily. Therefore I decided to implement the algorithm above in my program. I think that a change of algorithm should be considered a major change, so I changed the version number to 2.0.


    Program code: Sum of Binomial Distributions v2.0 aka. BINOMSUM
    :Prompt N
    :Disp "LOWEST R:"
    :Prompt L
    :While L>N
    :Disp "NO! R SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE R≤N"
    :Prompt N
    :Prompt L
    :End
    :Disp "HIGHEST R:"
    :Prompt H
    :While H>N
    :Disp "NO! H SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE H≤N"
    :Prompt H
    :End
    :While L>H
    :Disp "NO! H SHOULD BE"
    :Disp "SO THAT L≤H"
    :Prompt H
    :End
    :Prompt P
    :While not(0≤P and P≤1)
    :Disp "NO! P SHOULD BE"
    :Disp "IN THE INTERVAL"
    :Disp "WHERE 0≤P≤1"
    :Prompt P
    :End
    :sum(seq((N nCr X)*P^X*(1-P)^(N-X),X,L,H))
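
    What the last line computes, the sum of the binomial probabilities for X running from L to H, can be sketched in JavaScript as follows. The function names are chosen for this example, and the sketch assumes the inputs have already been validated as in the program above.

```javascript
function factorial(k) {
  let result = 1;
  for (let i = 2; i <= k; i++) {
    result *= i;
  }
  return result;
}

// Probability of exactly x successes in n trials.
function binomialProbability(n, x, p) {
  const coefficient = factorial(n) / (factorial(x) * factorial(n - x));
  return coefficient * Math.pow(p, x) * Math.pow(1 - p, n - x);
}

// Probability of between `low` and `high` successes (inclusive) in n trials.
function binomialSum(n, low, high, p) {
  let total = 0;
  for (let x = low; x <= high; x++) {
    total += binomialProbability(n, x, p);
  }
  return total;
}

console.log(binomialSum(10, 4, 6, 0.5));  // 0.65625
console.log(binomialSum(10, 0, 10, 0.5)); // 1
```

    Summing over every possible outcome (0 to n) gives 1, which is a quick way to check that the formula has been entered correctly.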


    Was anything unclear, or do you think something is wrong in these programs? In that case, write a comment and I'll see what I can do.