This tutorial covers some fundamental control structures used in statistical programming. We begin by examining each control structure on its own.
if and else: testing a condition and acting on it
for: execute a loop a fixed number of times
while: execute a loop while a condition is true
General Construction:
if:
if (CONDITION) {
  ACTION
}
if else:
if (CONDITION) {
  ACTION 1
} else {
  ACTION 2
}
ifelse():
ifelse(CONDITION,ACTION1,ACTION2)
x = 3
if(x > 0){
print(log(x))
}
x = -3
if(x > 0){
print(log(x))
}
x = 3
if(x > 0){
print(log(x))
} else{
message("Unable to Take Logarithm")
}
x = -3
if(x > 0){
print(log(x))
} else {
message("Unable to Take Logarithm")
}
Write code that takes numbers a and b as input and prints ‘a is greater than b’ if a>b, otherwise prints ‘a is not greater than b’.
a = 10
b = 8
# write code here
if (a>b){
print('a is greater than b')
} else {
print('a is not greater than b')
}
## [1] "a is greater than b"
x = BLANK
if(x > 0){
print(log(x))
}
if(x > 0){
print(log(x))
} else{
message("Unable to Take Logarithm")
}
x=BLANK
if(is.numeric(x)){
if(x > 0){
print(log(x))
} else{
message("Unable to Take Logarithm")
}
} else{
message("Please Input Numbers")
}
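One way to experiment with the nested checks above is to wrap them in a function, so that different inputs can be tried without editing x each time. The following is only a minimal sketch: the name safe_log is hypothetical (it is not part of the tutorial), and it assumes x is a single value.

```r
# Hypothetical helper (not from the tutorial) wrapping the nested checks above.
# Assumes x is a single value, since if() needs a length-one condition.
safe_log <- function(x) {
  if (is.numeric(x)) {
    if (x > 0) {
      log(x)
    } else {
      message("Unable to Take Logarithm")
      NA_real_
    }
  } else {
    message("Please Input Numbers")
    NA_real_
  }
}

print(safe_log(3))    # positive number: the logarithm is returned
print(safe_log(-3))   # non-positive number: message, then NA
print(safe_log("3"))  # not numeric: message, then NA
```

Returning NA_real_ in the failure branches is a design choice of this sketch, so that the caller always receives a numeric value.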
Redo Exercise 1, but check the data types before doing the comparison. Hint: && (and) and || (or) can be used to combine multiple logical expressions. Please don’t use & and | in an if statement: these are vectorized operators that return one result per element, whereas if expects a single TRUE or FALSE.
a = '10'
b = 8
# write code here
if (is.numeric(a) && is.numeric(b)){
if (a>b){
print('a is greater than b')
} else {
print('a is not greater than b')
}
} else {
message("Please Input Numbers")
}
## Please Input Numbers
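The difference between the vectorized operators and && mentioned in the hint can be seen in a small sketch. The vector v and the variable names below are made up for illustration.

```r
v <- c(1, -2, 3)  # hypothetical example vector

# & is vectorized: it returns one TRUE/FALSE per element
elementwise <- v > 0 & v < 2
print(elementwise)

# && works on single values and short-circuits: once the left-hand side
# is FALSE, the right-hand side is never evaluated
a <- '10'
short_circuit <- is.numeric(a) && stop("this is never evaluated")
print(short_circuit)
```

Because if() needs exactly one TRUE or FALSE, && and || are the natural fit inside its condition.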
ifelse()
x=c(-1,3,200)
print(log(x))
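# if() is not vectorized: with a condition of length > 1 it gives a warning
# in R versions before 4.2.0 and an error from R 4.2.0 onwards
y1 = if(x > 0){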
log(x)
} else{
NA
}
print(y1)
y2 = ifelse(x>0,log(x),NA)
print(y2)
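Note that ifelse() evaluates log(x) on the whole vector before picking values, so the negative element still produces a "NaNs produced" warning even though it ends up as NA. A possible alternative, sketched here with the hypothetical name y3, is to take the logarithm only where it is defined:

```r
x <- c(-1, 3, 200)

# Start with NA everywhere, then fill in the log only for positive entries
y3 <- rep(NA_real_, length(x))
positive <- x > 0
y3[positive] <- log(x[positive])
print(y3)
```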
ifelse() Statements
library(ggplot2) # needed for the bar chart below
x=rnorm(1000,mean=0,sd=1)
y=ifelse(abs(x)<1,"Within 1 SD",ifelse(abs(x)>2,"Far Far Away","Between 1 and 2 SD"))
y.fct=factor(y,levels=c("Within 1 SD","Between 1 and 2 SD","Far Far Away"))
ggplot() +
geom_bar(aes(x=y.fct),fill="lightskyblue1") +
theme_minimal()
Please use the ifelse() function to create a new column WageLevel from the lwage column of the Wages dataset from Ecdat.
If lwage is greater than \(7.0\), the corresponding value in the final vector is High Wage.
If lwage is lower than \(6.4\), the value is Low Wage.
Otherwise, the value is Normal.
library(Ecdat) # provides the Wages dataset
library(dplyr) # provides %>% and mutate()
New_Wages = Wages %>%
mutate(WageLevel=ifelse(lwage>7.0,'High Wage',
ifelse(lwage<6.4, 'Low Wage', 'Normal')
)
)
head(New_Wages,5)
## exp wks bluecol ind south smsa married sex union ed black lwage WageLevel
## 1 3 32 no 0 yes no yes male no 9 no 5.56068 Low Wage
## 2 4 43 no 0 yes no yes male no 9 no 5.72031 Low Wage
## 3 5 40 no 0 yes no yes male no 9 no 5.99645 Low Wage
## 4 6 39 no 0 yes no yes male no 9 no 5.99645 Low Wage
## 5 7 42 no 1 yes no yes male no 9 no 6.06146 Low Wage
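The same three-way labelling can also be sketched in base R without nesting, by starting from a default category and overwriting it with logical indexing. The lwage values below are hypothetical, chosen to cover each category and both cutoffs; they are not taken from the Wages data.

```r
# Hypothetical log wages (not from the Wages dataset)
lwage <- c(5.56, 6.40, 6.75, 7.00, 7.25)

wage_level <- rep("Normal", length(lwage))  # default category
wage_level[lwage > 7.0] <- "High Wage"      # strictly above 7.0
wage_level[lwage < 6.4] <- "Low Wage"       # strictly below 6.4

print(data.frame(lwage, wage_level))
```

Values exactly equal to 6.4 or 7.0 stay Normal, which matches the strict inequalities in the nested ifelse() version.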
Geometric Series: \(a, ar, ar^2, ar^3,...\)
Formula of Sum: \(\sum_{k=0}^\infty ar^k=\frac{a}{1-r}\), for \(|r|<1\).
a=1 #Any Number
r=1/2 #Any Number Between -1 and 1: abs(r)<1
theoretical.limit=a/(1-r)
START=a
FINISH.1 = START + a*r^1
FINISH.2 = FINISH.1 + a*r^2
FINISH.3 = FINISH.2 + a*r^3
FINISH.10 = a
for(k in 1:10){
FINISH.10=FINISH.10+a*r^k
}
FINISH.100 = a
for(k in 1:100){
FINISH.100=FINISH.100+a*r^k
}
DATA = tibble(k=c(1,2,3,10,100,"Infinity"),
SUMMATION=c(FINISH.1,FINISH.2,FINISH.3,
FINISH.10,FINISH.100,
theoretical.limit))
print(DATA)
ABSOLUTE.ERROR = abs(FINISH.100-theoretical.limit)
print(ABSOLUTE.ERROR)
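As a cross-check on the loops above, the partial sums can be sketched without a loop using cumsum(), and compared against the closed form of a finite geometric sum, \(\sum_{k=0}^{n} ar^k=\frac{a(1-r^{n+1})}{1-r}\). The names n, partial.sums and closed.form are introduced here for illustration only.

```r
a <- 1
r <- 1/2
n <- c(1, 2, 3, 10, 100)  # the same cutoffs used in the table above

# Running sums a + a*r + ... + a*r^k for k = 0, ..., 100
partial.sums <- cumsum(a * r^(0:100))

# Closed form of the finite sum up to power n
closed.form <- a * (1 - r^(n + 1)) / (1 - r)

# partial.sums[k + 1] holds the sum up to power k, because R indexes from 1
print(cbind(k = n, cumsum = partial.sums[n + 1], closed.form = closed.form))
```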
set.seed(4)
u = rnorm(100)
Use a for loop to calculate the sum of the squares of the first 10 elements of vector u.
sum10 = 0
for(i in 1:10){
sum10 = sum10 + u[i]^2
}
print(sum10)
## [1] 13.08209
Use a for loop and an if statement to calculate the sum of the elements with even indices of vector u.
sum_even = 0
for(i in 1:100){
if(i%%2==0){
sum_even = sum_even + u[i]
}
}
print(sum_even)
## [1] -1.567888
sum_odd = 0
for(i in 1:100){
if(i%%2!=0){
sum_odd = sum_odd + u[i]
}
}
print(sum_odd)
## [1] 11.22039
sum_even = 0
for(i in seq(2,100,2)){
sum_even = sum_even + u[i]
}
print(sum_even)
## [1] -1.567888
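For comparison, the loops above can be sketched in vectorized form, which is usually the more idiomatic style in R. The variable names ending in .vec and the index vector idx are introduced here for illustration.

```r
set.seed(4)
u <- rnorm(100)

# Sum of the squares of the first 10 elements
sum10.vec <- sum(u[1:10]^2)

# Sums over even and odd positions, selected with logical indexing
idx <- seq_along(u)
sum_even.vec <- sum(u[idx %% 2 == 0])
sum_odd.vec <- sum(u[idx %% 2 != 0])

print(c(sum10.vec, sum_even.vec, sum_odd.vec))
```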
a=1
r=1/2
FINISH=a
k=0
while(abs(FINISH-a/(1-r)) > 1e-10) {
k=k+1
FINISH = FINISH + a*r^k
#if(k>100) break
}
print(c(k,FINISH))
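The commented-out break above hints at a safeguard: a while loop whose condition is never met would run forever (for example if abs(r) >= 1). A minimal sketch of such a guard follows; the cap max.iter and the names total, limit and tol are assumptions made for this example, not values from the tutorial.

```r
a <- 1
r <- 1/2
limit <- a / (1 - r)
tol <- 1e-10
max.iter <- 1000  # assumed safety cap, not from the tutorial

total <- a
k <- 0
while (abs(total - limit) > tol) {
  k <- k + 1
  total <- total + a * r^k
  if (k >= max.iter) {
    warning("Stopped before reaching the tolerance")
    break
  }
}
print(c(k, total))
```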
a=10
r=-0.75
theoretical.limit=a/(1-r)
K=10 #How Many Steps Do You Want to Save?
summation=rep(NA,(K+1))
summation[1]=a
for (k in 1:K) {
summation[k+1]=summation[k] + a*r^k
}
ggplot() +
geom_line(aes(x=1:(K+1),y=summation)) +
geom_hline(yintercept=theoretical.limit,
linetype="dashed")
Write a for loop to generate 100 random samples from normal distributions with means of 0 to 99, and save the random samples to a vector a.
The kth component of a is generated from \(N(k-1,1)\).
Use rnorm.
set.seed(100)
a = rep(NA,100)
for (k in 1:100){
a[k] = rnorm(1,k-1,1)
}
print(a)
## [1] -0.5021924 1.1315312 1.9210829 3.8867848 4.1169713 5.3186301
## [7] 5.4182093 7.7145327 7.1747406 8.6401379 10.0898861 11.0962745
## [13] 11.7983660 13.7398405 14.1233795 14.9706833 15.6111458 17.5108563
## [19] 17.0861858 21.3102968 19.5619100 21.7640606 22.2619613 23.7734046
## [25] 23.1856209 24.5615494 25.2797784 27.2309445 26.8422705 29.2470760
## [31] 29.9088864 32.7573756 31.8620704 32.8888065 33.3099857 34.7782058
## [37] 36.1829077 37.4173233 39.0654023 39.9702020 39.8983708 42.4032035
## [43] 40.2232244 43.6228674 43.4777166 46.3222310 45.6365597 48.3190657
## [49] 48.0437791 47.1213441 49.5529378 49.2614021 52.1788648 54.8974657
## [55] 51.7280745 55.9804641 54.6011744 58.8248724 59.3812987 58.1611481
## [61] 59.7380042 60.9311560 61.6211164 65.5819589 64.1298341 64.2869750
## [67] 66.6379942 67.2016916 67.9300831 68.9075101 70.4489033 69.9356443
## [73] 70.8375807 74.6485217 71.9379040 75.0127497 74.9124717 77.2705395
## [79] 79.0084519 76.9255952 80.8968223 80.9500042 80.6546507 81.0687885
## [85] 84.7095816 84.8420950 86.2163679 87.8173621 89.7271758 88.8962297
## [91] 89.4428777 92.4283014 91.1070426 91.8424288 93.4697035 97.4456828
## [97] 95.1675042 97.4135198 96.8213169 97.8259652
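Since rnorm() accepts a vector of means, the loop above can also be sketched as a single call. With the same seed, the two versions should produce the same draws under R's default random number settings; the names a.loop and a.vec are introduced here for the comparison.

```r
# Loop version, as in the exercise
set.seed(100)
a.loop <- rep(NA, 100)
for (k in 1:100) {
  a.loop[k] <- rnorm(1, k - 1, 1)
}

# Vectorized version: the k-th draw uses mean k - 1
set.seed(100)
a.vec <- rnorm(100, mean = 0:99, sd = 1)

print(all.equal(a.loop, a.vec))
```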