Skip to content
Cart

Your Cart

×

You have 0 items in your cart.

Register Sign in Wishlist

Code for chapter 19


Below are listed all the scripts as shown in chapter 19 of the book. These are free to copy and paste into your code editor.

Quick links to each script:

19.1 19.2 19.3 19.4 19.7 19.8 19.9 19.10 19.11 19.12
19.13 19.14 19.15 19.16 19.17 19.18 19.19 19.20 19.21 19.22
19.23 19.24 19.25 19.28 19.29 19.30 19.31 19.32 19.33 19.34

Script 19.1

script_19_1.awk, file.dat
# Script 19.1
# Processing of the ASCII-file file.tab with tab-separated values
awk '
    BEGIN {FS="\t"}
    (NR>1){
        if ($1>2) {print $1,$2-$4 } else {print $1,$3}
        n=n+1
    }
    END {print "This is the End after processing "n" lines."}
' file.dat
script_19_1.out
[ah@hobbes Documents]$ ./script_19_1.awk
1       S1   Male   66.3  20     TRUE 
2       S2   Male   80.2  21    FALSE 0
3       S3 Female   53.1  29     TRUE 0
4       S4 Female   21.3  18     TRUE 0
5       S5 Female   54.6  NA    FALSE 0
6       S6 Female   78.1  23     TRUE 0
This is the End after processing 6 lines.

Script 19.2

script_19_2.py
# Script 19.2
# Calculate the sum of squares of data in an array
s = 0
data = [3, -1, 2, -5, None, 9, -2]
for d in data:  
    if not d is None:
        s += d * d    
print('The sum of squares is ',s)
script_19_2.out
[ah@hobbes Documents]$ python script_19_2.py 
('The sum of squares is ', 124)

Script 19.3

script_19_3.R
# Script 19.3
# Plotting a normal distribution
x<-seq(-4,4,length=100)
dx<-dnorm(x,mean=0,sd=1)
plot(x,dx,type="l",main="Normal Distribution")
p1<-seq(-4,-1.5,length=100)
dp1<-dnorm(p1)
p2<-seq(1.5,4,length=100)
dp2<-dnorm(p2)
polygon(c(-4,p1,-1.5),c(0,dp1,0),col="gray")
script_19_3.png

Script 19.4

script_19_4.R
# Script 19.4
library()		# List packages available, not necessarily active 
data(agriculture)	# Attempt to load the Agriculture dataset
bannerplot(agnes(agriculture), main = "Bannerplot") 
# Run bannerplot function library(cluster) # Load the cluster library and try again data(agriculture) bannerplot(agnes(agriculture), main = "Bannerplot")

script_19_4.png

 

Script 19.7

script_19_7.R
# Script 19.7
x<-3        # Set x to hold the value 1
print(x)    # Its value can be displayed simply with the print() command
x           # Or even simply by stating its name
x+x         # Simple operations are straight forward
x*x
sqrt(x)     # even the square root, using the built-in function sqrt()
y<- -3.4	
y
x-y         # Calculate the difference between numbers x and y
t<-"Text"   # Set t to hold the String "Text"
t
x-t         # Attempt to calculate the difference between x and t which 
fails typeof(x) # x is a number with double precision typeof(t) # t is a string with character type z<- TRUE # Set a BOOLEAN value TRUE to z z x-z # Calculate the difference between x and z provides an answer typeof(z) typeof(x-z) # The BOOLEAN logical type TRUE is transformed into value 1 in
double type 3-1=2

script_19_7.out

[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.7
> x print(x)    # Its value can be displayed simply with the print() command
[1] 3
> 
> x           # Or even simply by stating its name
[1] 3
> 
> x+x         # Simple operations are straight forward
[1] 6
> 
> x*x
[1] 9
> 
> sqrt(x)     # even the square root, using the built-in function sqrt()
[1] 1.732051
> 
> y y
[1] -3.4
> 
> x-y         # Calculate the difference between numbers x and y
[1] 6.4
> 
> t t
[1] "Text"
> 
> x-t         # Attempt to calculate the difference between x and t which fails
Error in x - t : non-numeric argument to binary operator
> 
> typeof(x)   # x is a number with double precision
[1] "double"
> 
> typeof(t)   # t is a string with character type
[1] "character"
> 
> z z
[1] TRUE
> 
> x-z         # Calculate the difference between x and z provides an answer
[1] 2
> 
> typeof(z)
[1] "logical"
> 
> typeof(x-z) # The BOOLEAN logical type TRUE is transformed into value 1 in 
double type 3-1=2 [1] "double" >

Script 19.8

script_19_8.R
# Script 19.8
v <- c(10, 1.6, 1, 2.6, 1.7)  # Create a vector of length 5 with set values
v
w<-1:10                       # Create a vector of length 10
w
v+1                           # One can operate on all numbers
v*w                           # Even between vectors
sqrt(v)                       # Or more complex operation like square root
mean(v)                       # Average across the values of the vector
sd(v)                         # Standard Deviation	
sum(v)                        # Sum
length(v)                     # Length of the vector
script_19_8.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.8
> v  v
[1] 10.0  1.6  1.0  2.6  1.7
> w w
 [1]  1  2  3  4  5  6  7  8  9 10
> v+1                           # One can operate on all numbers
[1] 11.0  2.6  2.0  3.6  2.7
> v*w                           # Even between vectors
 [1] 10.0  3.2  3.0 10.4  8.5 60.0 11.2  8.0 23.4 17.0
> sqrt(v)                       # Or more complex operation like square root
[1] 3.162278 1.264911 1.000000 1.612452 1.303840
> mean(v)                       # Average across the values of the vector
[1] 3.38
> sd(v)                         # Standard Deviation
[1] 3.744596
> sum(v)                        # Sum
[1] 16.9
> length(v)                     # Length of the vector
[1] 5
> 

Script 19.9

script_19_9.R
# Script 19.9
m<-array(2:7,dim=c(2,3))   # Create a 2x3 array containing values from
2 to 7
m
m[1,3]                     # Access a single cell by index
m[1,]                      # Access a single row
m[,c(1,3)]                 # Access a pair of columns
script_19_9.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.9
> m<-array(2:7,dim=c(2,3))   # Create a 2x3 array containing values from
2 to 7
> m
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    3    5    7
> m[1,3]                     # Access a single cell by index
[1] 6
> m[1,]                      # Access a single row
[1] 2 4 6
> m[,c(1,3)]                 # Access a pair of columns
     [,1] [,2]
[1,]    2    6
[2,]    3    7
>

Script 19.10

script_19_10.R
# Script 19.10
hair<-c("black","fair","black","black","brown","fair","white","brown","white",
"bald")
f_hair<-factor(hair)    # Make a factor from this vector
f_hair
levels(f_hair)          # The categories are the levels
levels(hair)            # There are no levels in the original vector
script_19_10.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.10
> hair<-c("black","fair","black","black","brown","fair","white","brown","white",
"bald")
> f_hair<-factor(hair)    # Make a factor from this vector
> f_hair
 [1] black fair  black black brown fair  white brown white bald
Levels: bald black brown fair white
> levels(f_hair)          # The categories are the levels
[1] "bald"  "black" "brown" "fair"  "white"
> levels(hair)            # There are no levels in the original vector
NULL
>

Script 19.11

script_19_11.R
# Script 19.11
res <- list(test_name=c("t.test","Fisher_exact"), pval=0.001, df=3,
err=c(0.4,0.7,0.9))     # List of 4 components
res
res$pval             # Access the full element pval
res$err              # Access the full element err
res$test_name[1]     # Access an element in an element
script_19_11.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.11
> res <- list(test_name=c("t.test","Fisher_exact"), pval=0.001, df=3,
err=c(0.4,0.7,0.9))     # List of 4 components
> res
$test_name
[1] "t.test"       "Fisher_exact"

$pval
[1] 0.001

$df
[1] 3

$err
[1] 0.4 0.7 0.9

> res$pval             # Access the full element pval
[1] 0.001
> res$err              # Access the full element err
[1] 0.4 0.7 0.9
> res$test_name[1]     # Access an element in an element
[1] "t.test"
>

Script 19.12

script_19_12.R
# Script 19.12
# Read the table considering the first row as data
read.table("table.dat")
script_19_12.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.12
> # Read the table considering the first row as data
> read.table("table.dat")
        V1     V2     V3  V4       V5
1 SampleID Gender Weight Age Affected
2       S1   Male   66.3  20     TRUE
3       S2   Male   80.2  21    FALSE
4       S3 Female   53.1  29     TRUE
5       S4 Female   21.3  18     TRUE
6       S5 Female   54.6  12    FALSE
7       S6 Female   78.1  23     TRUE
>

Script 19.13

script_19_13.R
# Script 19.13
# Read the table and use the first row to name the columns
read.table("table.dat",header=T)
script_19_13.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.13
> # Read the table and use the first row to name the columns
> read.table("table.dat",header=T)
  SampleID Gender Weight Age Affected
1       S1   Male   66.3  20     TRUE
2       S2   Male   80.2  21    FALSE
3       S3 Female   53.1  29     TRUE
4       S4 Female   21.3  18     TRUE
5       S5 Female   54.6  12    FALSE
6       S6 Female   78.1  23     TRUE
>

Script 19.14

script_19_14.R
# Script 19.14
# Read the table with missing entries
read.table("table2.dat")    
t<-read.table("table2.dat",header=T,sep="\t")
print(t)
typeof(t)   # The data is interpreted as a list of various subtypes
typeof(t$Age)
typeof(t$Affected)
typeof(t$Gender)
summary(t)  # A summary of the values can be obtained according to the type
script_19_14.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.14
> # Read the table with missing entries
> read.table("table2.dat")
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
  line 6 did not have 5 elements
> t<-read.table("table2.dat",header=T,sep="\t")
> print(t)
  SampleID Gender Weight Age Affected
1       S1   Male   66.3  20     TRUE
2       S2   Male   80.2  21    FALSE
3       S3 Female   53.1  29     TRUE
4       S4 Female   21.3  18     TRUE
5       S5 Female   54.6  NA    FALSE
6       S6 Female   78.1  23     TRUE
> typeof(t)   # The data is interpreted as a list of various subtypes
[1] "list"
> typeof(t$Age)
[1] "integer"
> typeof(t$Affected)
[1] "logical"
> typeof(t$Gender)
[1] "integer"
> summary(t)  # A summary of the values can be obtained according to the
type
 SampleID    Gender      Weight           Age        Affected      
 S1:1     Female:4   Min.   :21.30   Min.   :18.0   Mode :logical  
 S2:1     Male  :2   1st Qu.:53.48   1st Qu.:20.0   FALSE:2        
 S3:1                Median :60.45   Median :21.0   TRUE :4        
 S4:1                Mean   :58.93   Mean   :22.2   NA's :0        
 S5:1                3rd Qu.:75.15   3rd Qu.:23.0                  
 S6:1                Max.   :80.20   Max.   :29.0                  
                                     NA's   :1                     
>

Script 19.15

script_19_15.R
# Script 19.15
data(euro)                      # Load the built-in Euro conversion values
euro
data(package="cluster")         # List of datasets is accessible from packages
data(flower, package="cluster") # Built-in data can be loaded from a package
data(cars)
dim(cars)
cars[1:10,]                     # A subset of the dataset can be used
summary(cars)                   # and relevant content displayed
script_19_15.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.15
> data(euro)                      # Load the built-in Euro conversion values
> euro
        ATS         BEF         DEM         ESP         FIM         FRF
  13.760300   40.339900    1.955830  166.386000    5.945730    6.559570
        IEP         ITL         LUF         NLG         PTE
   0.787564 1936.270000   40.339900    2.203710  200.482000
> data(package="cluster")         # List of datasets is accessible from packages
Data sets in package 'cluster':

agriculture              European Union Agricultural Workforces
animals                  Attributes of Animals
chorSub                  Subset of C-horizon of Kola Data
flower                   Flower Characteristics
plantTraits              Plant Species Traits Data
pluton                   Isotopic Composition Plutonium Batches
ruspini                  Ruspini Data
votes.repub              Votes for Republican Candidate in Presidential
                         Elections
xclara                   Bivariate Data Set with 3 Clusters
> data(flower, package="cluster") # Built-in data can be loaded from a package
> data(cars)
> dim(cars)
[1] 50  2
> cars[1:10,]                     # A subset of the dataset can be used
   speed dist
1      4    2
2      4   10
3      7    4
4      7   22
5      8   16
6      9   10
7     10   18
8     10   26
9     10   34
10    11   17
> summary(cars)                   # and relevant content displayed
     speed           dist       
 Min.   : 4.0   Min.   :  2.00  
 1st Qu.:12.0   1st Qu.: 26.00  
 Median :15.0   Median : 36.00  
 Mean   :15.4   Mean   : 42.98  
 3rd Qu.:19.0   3rd Qu.: 56.00  
 Max.   :25.0   Max.   :120.00  
>

Script 19.16

script_19_16.R
# Script 19.16
# Cars dataset visualised as scatter plot
data(cars)
dim(cars)
names(cars)
par(mfrow=c(2,1)) # Split the graphic in 2 rows and 1 column
plot(cars)  
plot(cars,pch="+",xlab="Speed",ylab="Distance",col="red") # Or use several
parameters
 script_19_16.png
 script_19_16.png
script_19_16.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.16
> # Cars dataset visualised as scatter plot
> data(cars)
> dim(cars)
[1] 50  2
> names(cars)
[1] "speed" "dist"
> par(mfrow=c(2,1)) # Split the graphic in 2 rows and 1 column
> plot(cars)  
> plot(cars,pch="+",xlab="Speed",ylab="Distance",col="red") # Or use several
parameters
>

Script 19.17

script_19_17.R
# Script 19.17
# Ruspini dataset visualised as scatter plot
library(cluster)
head(ruspini) # Another set of x vs y data
plot(ruspini, main="Ruspini cluster set") # Highlights a clustering pattern
script_19_17.png
 script_19_17.png
script_19_17.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.17
> # Ruspini dataset visualised as scatter plot
> library(cluster)
> head(ruspini) # Another set of x vs y data
   x  y
1  4 53
2  5 63
3 10 59
4  9 77
5 13 49
6 13 69
> plot(ruspini, main="Ruspini cluster set") # Highlights a clustering pattern
>

Script 19.18

script_19_18.R
# Script 19.18
# Titanic dataset visualised as scatter plot
plot(Titanic,main="Titanic")
script_19_18.png
script_19_18.png
script_19_18.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.18
> # Titanic dataset visualised as scatter plot
> plot(Titanic,main="Titanic")
>

Script 19.19

script_19_19.R
# Script 19.19
# The Cars dataset visualised as histogram
par(mfrow=c(2,1))
hist(cars$speed)
hist(cars$speed,breaks=15,col="gray")
abline(v=21,col="red")
script_19_19.png
script_19_19.png
script_19_19.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.19
> # The Cars dataset visualised as histogram
> par(mfrow=c(2,1))
> hist(cars$speed)
> hist(cars$speed,breaks=15,col="gray")
> abline(v=21,col="red")
>

Script 19.20

script_19_20.R
# Script 19.20
# A normal distribution with a mean of 0 and standard deviation of 1
par(mfrow=c(2,1))        # Split the display in 2 rows and 1 column
x<-seq(-5,10,length=100)  # Interval from 5 to 10
dx<-dnorm(x)             # Default normal distribution
plot(x,dx,type="l",main="Normal Distribution") # Plot distribution
script_19_20.png
script_19_20.png
script_19_20.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.20
> # A normal distribution with a mean of 0 and standard deviation of 1
> par(mfrow=c(2,1))        # Split the display in 2 rows and 1 column
> x<-seq(-5,10,length=100)  # Interval from 5 to 10
> dx<-dnorm(x)             # Default normal distribution
> plot(x,dx,type="l",main="Normal Distribution") # Plot distribution
>

Script 19.21

script_19_21.R
# Script 19.21
# A normal distribution with a mean of 0 and standard deviation of 1
par(mfrow=c(2,1))        # Split the display in 2 rows and 1 column
x<-seq(-5,10,length=100)  # Interval from 5 to 10
dx<-dnorm(x)             # Default normal distribution
plot(x,dx,type="l",main="Normal Distribution") # Plot distribution
# Another distribution with mean of 4 and standard deviation of 1.5
x2<-x
dx<-dnorm(x2,mean=4,sd=1.5)
plot(x2,dx2,type="l",main="Other Distribution") # Plot distribution
abline(v=4,col="red")        # Draw vertical line in red at mean
abline(v=0,col="blue")       # Draw vertical line in blue at 0
mtext(text="mean=4, sd=1.5") # Include extra text on top
script_19_21.png
script_19_21.png
script_19_21.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.21
> # A normal distribution with a mean of 0 and standard deviation of 1
> par(mfrow=c(2,1))        # Split the display in 2 rows and 1 column
> x<-seq(-5,10,length=100)  # Interval from 5 to 10
> dx<-dnorm(x)             # Default normal distribution
> plot(x,dx,type="l",main="Normal Distribution") # Plot distribution
> # Another distribution with mean of 4 and standard deviation of 1.5
> x2<-x
> dx<-dnorm(x2,mean=4,sd=1.5)
> plot(x2,dx2,type="l",main="Other Distribution") # Plot distribution
Error in xy.coords(x, y, xlabel, ylabel, log) : object 'dx2' not found
> abline(v=4,col="red")        # Draw vertical line in red at mean
> abline(v=0,col="blue")       # Draw vertical line in blue at 0
> mtext(text="mean=4, sd=1.5") # Include extra text on top
>

Script 19.22

script_19_22.R
# Script 19.22
# Z-transformation of the Cars dataset
par(mfrow=c(2,1))
s<-cars$speed
zs<-(s-mean(s))/sd(s)
hist(cars$speed,breaks=10,col="gray",xlim=c(-5,25),main="Speed Distribution"
,xlab="Speed")
hist(zs,breaks=10,xlim=c(-5,25),main="Z Speed Distribution",xlab="Z transform
of Speed")
script_19_22.png
script_19_22.png
script_19_22.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.22
> # Z-transformation of the Cars dataset
> par(mfrow=c(2,1))
> s<-cars$speed
> zs<-(s-mean(s))/sd(s)
> hist(cars$speed,breaks=10,col="gray",xlim=c(-5,25),main="Speed
Distribution",xlab="Speed")
> hist(zs,breaks=10,xlim=c(-5,25),main="Z Speed Distribution",xlab="Z
transform of Speed")
>

Script 19.23

script_19_23.R
# Script 19.23
# Box plot of a dataset of smoking-related deaths
# breslow dataset available from
# https://vincentarelbundock.github.io/Rdatasets/datasets.html
breslow<-read.table("breslow.dat",header=T)
boxplot(y~smoke,data=breslow,main="Smoking-related deaths",
xlab="Smoking",ylab="Number of related deaths")
script_19_23.png
script_19_23.png
script_19_23.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.23
> # Box plot of a dataset of smoking-related deaths
> # breslow dataset available from
> # https://vincentarelbundock.github.io/Rdatasets/datasets.html
> breslow<-read.table("breslow.dat",header=T)
> boxplot(y~smoke,data=breslow,main="Smoking-related deaths",
xlab="Smoking",ylab="Number of related deaths")
>

Script 19.24

script_19_24.R
# Script 19.24
# Heatmap representation of the data with automatic construction of
# dendrogram to reflect the organisation of the data
par(mfrow=c(2,1))
x<-as.matrix(swiss) # Investigate the fertility in Swiss cantons
head(x)
heatmap(as.matrix(swiss))
script_19_24.png
script_19_24.png
script_19_24.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.24
> # Heatmap representation of the data with automatic construction of
> # dendrogram to reflect the organisation of the data
> par(mfrow=c(2,1))
> x<-as.matrix(swiss) # Investigate the fertility in Swiss cantons
> head(x)
             Fertility Agriculture Examination Education Catholic
Courtelary        80.2        17.0          15        12     9.96
Delemont          83.1        45.1           6         9    84.84
Franches-Mnt      92.5        39.7           5         5    93.40
Moutier           85.8        36.5          12         7    33.77
Neuveville        76.9        43.5          17        15     5.16
Porrentruy        76.1        35.3           9         7    90.57
             Infant.Mortality
Courtelary               22.2
Delemont                 22.2
Franches-Mnt             20.2
Moutier                  20.3
Neuveville               20.6
Porrentruy               26.6
> heatmap(as.matrix(swiss))
>

Script 19.25

script_19_25.R
# Script 19.25
# Same plot with different colour scheme and added colour on the
# sides from a Rainbow palette
par(mfrow=c(2,1))
x<-as.matrix(swiss) # Investigate the fertility in Swiss cantons
rc<-rainbow(nrow(x),start=0, end =.3)
cc<-rainbow(ncol(x),start=0, end =.3)
heatmap(x,col=cm.colors(256),scale="column",RowSideColors=rc,
ColSideColors=cc, xlab="Specification variables", ylab="Cantons",
main="Heatmap")
script_19_25.png
script_19_25.png
script_19_25.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.25
> # Same plot with different colour scheme and added colour on the
> # sides from a Rainbow palette
> par(mfrow=c(2,1))
> x<-as.matrix(swiss) # Investigate the fertility in Swiss cantons
> rc<-rainbow(nrow(x),start=0, end =.3)
> cc<-rainbow(ncol(x),start=0, end =.3)
> heatmap(x,col=cm.colors(256),scale="column",RowSideColors=rc,
ColSideColors=cc, xlab="Specification variables", ylab="Cantons",
main="Heatmap")
>

Script 19.28

script_19_28.R
# Script 19.28
# Create factors with value labels
library(ggplot2)
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears","4gears",
"5gears"))
mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl","8cyl"))
qplot(mpg, data=mtcars, geom="density", fill=gear, alpha=I(.5),
main="Distribution of Gas Milage", xlab="Miles Per Gallon",ylab="Density")
script_19_28.png
script_19_28.png
script_19_28.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.28
> # Create factors with value labels
> library(ggplot2)
Need help? Try the ggplot2 mailing list:
http://groups.google.com/group/ggplot2.
> mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears","4gears",
"5gears"))
> mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic","Manual"))
> mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl","8cyl"))
> qplot(mpg, data=mtcars, geom="density", fill=gear, alpha=I(.5),
main="Distribution of Gas Milage", xlab="Miles Per Gallon",ylab="Density")
>

Script 19.29

script_19_29.R
# Script 19.29
# Scatter plots of mpg vs. hp
# for each combination of gears and cylinders
# in each facet, transmission type is represented by shape and colour
library(ggplot2)
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears","4gears",
"5gears"))
mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl","8cyl"))
qplot(hp, mpg, data=mtcars, shape=am, color=am,facets=gear~cyl, size=I(3),
xlab="Horsepower", ylab="Miles per Gallon")
script_19_29.png
script_19_29.png
script_19_29.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.29
> # Scatter plots of mpg vs. hp
> # for each combination of gears and cylinders
> # in each facet, transmission type is represented by shape and colour
> library(ggplot2)
> mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears","4gears"
,"5gears"))
> mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic","Manual"))
> mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl","8cyl"))
> qplot(hp, mpg, data=mtcars, shape=am, color=am,facets=gear~cyl, size=I(3),
xlab="Horsepower", ylab="Miles per Gallon")
>

Script 19.30

script_19_30.R
# Script 19.30
# Separate regressions of mpg on weight for each number of cylinders
library(ggplot2)
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears","4gears",
"5gears"))
mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl","8cyl"))
qplot(wt, mpg, data=mtcars, geom=c("point", "smooth"),method="lm",
formula=y~x, color=cyl, main="Regression of MPG on Weight",xlab="Weight",
ylab="Miles per Gallon")
script_19_30.png
script_19_30.png
script_19_30.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.30
> # Separate regressions of mpg on weight for each number of cylinders
> library(ggplot2)
> mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears",
"4gears","5gears"))
> mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic",
"Manual"))
> mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl","8cyl"))
> qplot(wt, mpg, data=mtcars, geom=c("point", "smooth"),method="lm",
formula=y~x, color=cyl, main="Regression of MPG on Weight",xlab="Weight",
ylab="Miles per Gallon")
Warning: Ignoring unknown parameters: method, formula
>

Script 19.31

script_19_31.R
# Script 19.31
# Box plots of mpg by number of gears
# observations (points) are overlayed and jittered
library(ggplot2)
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears","4gears",
"5gears"))
mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl","8cyl"))
qplot(gear, mpg, data=mtcars, geom=c("boxplot", "jitter"),fill=gear,
main="Mileage by Gear Number",xlab="", ylab="Miles per Gallon")
script_19_31.png
script_19_31.png
script_19_31.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.31
> # Box plots of mpg by number of gears
> # observations (points) are overlayed and jittered
> library(ggplot2)
> mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),labels=c("3gears",
"4gears","5gears"))
> mtcars$am <- factor(mtcars$am,levels=c(0,1),labels=c("Automatic",
"Manual"))
> mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),labels=c("4cyl","6cyl"
,"8cyl"))
> qplot(gear, mpg, data=mtcars, geom=c("boxplot", "jitter"),fill=gear,
main="Mileage by Gear Number",xlab="", ylab="Miles per Gallon")
>

Script 19.32

script_19_32.R
# Script 19.32
g<-c(1.1,2.0,3.2,3.7,5.1,8.6,10.4,15.2,18.7,25.3)
h<-c(0.9,2.1,2.9,3.5,4.8,8.7,10.6,14.9,18.7,25.0)
d<-g-h
d
mean(d)
sum((d-mean(d))**2)
sd(d)
t.test(g,h,paired=TRUE)
script_19_32.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.32
> g<-c(1.1,2.0,3.2,3.7,5.1,8.6,10.4,15.2,18.7,25.3)
> h<-c(0.9,2.1,2.9,3.5,4.8,8.7,10.6,14.9,18.7,25.0)
> d<-g-h
> d
 [1]  0.2 -0.1  0.3  0.2  0.3 -0.1 -0.2  0.3  0.0  0.3
> mean(d)
[1] 0.12
> sum((d-mean(d))**2)
[1] 0.356
> sd(d)
[1] 0.1988858
> t.test(g,h,paired=TRUE)

    Paired t-test

data:  g and h
t = 1.908, df = 9, p-value = 0.08875
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.02227432  0.26227432
sample estimates:
mean of the differences
                   0.12

>

Script 19.33

script_19_33.R
# Script 19.33
# p-value in the normal probability density distribution
x<-seq(-4,4,length=100)
dx<-dnorm(x,mean=0,sd=1)
plot(x,dx,type="l",main="Normal Distribution")
p1<-seq(-4,-1.5,length=100)
dp1<-dnorm(p1)
p2<-seq(1.5,4,length=100)
dp2<-dnorm(p2)
polygon(c(-4,p1,-1.5),c(0,dp1,0),col="gray")
polygon(c(1.5,p2,4),c(0,dp2,0),col="gray")
abline(h=0)
script_19_33.png
script_19_33.png
script_19_33.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.33
> # p-value in the normal probability density distribution
> x<-seq(-4,4,length=100)
> dx<-dnorm(x,mean=0,sd=1)
> plot(x,dx,type="l",main="Normal Distribution")
> p1<-seq(-4,-1.5,length=100)
> dp1<-dnorm(p1)
> p2<-seq(1.5,4,length=100)
> dp2<-dnorm(p2)
> polygon(c(-4,p1,-1.5),c(0,dp1,0),col="gray")
> polygon(c(1.5,p2,4),c(0,dp2,0),col="gray")
> abline(h=0)
>

Script 19.34

script_19_34.R
# Script 19.34
F<-matrix(c(24,26,35,15),nrow=2);
F
fisher.test(F)
chisq.test(F)
script_19_34.out
[ah@hobbes Documents]$ R

R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # Script 19.34
> F<-matrix(c(24,26,35,15),nrow=2);
> F
     [,1] [,2]
[1,]   24   35
[2,]   26   15
> fisher.test(F)

    Fisher's Exact Test for Count Data

data:  F
p-value = 0.04143
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.1599061 0.9685788
sample estimates:
odds ratio
 0.3994095

> chisq.test(F)

    Pearson's Chi-squared test with Yates' continuity correction

data:  F
X-squared = 4.1339, df = 1, p-value = 0.04203

>

 

 

Sign In

Please sign in to access your account

Cancel

Not already registered? Create an account now. ×

Find content that relates to you

This site uses cookies to improve your experience. Read more Close

Are you sure you want to delete your account?

This cannot be undone.

Cancel

Thank you for your feedback which will help us improve our service.

If you requested a response, we will make sure to get back to you shortly.

×
Please fill in the required fields in your feedback submission.
×