---
title: "`data.table` Case Study - Electricity"
author: "Christos Gkenas"
date: "`r format(Sys.time(), '%a %d %b %Y (%H:%M:%S)')`"
output:
  html_document:
    highlight: tango
    theme: united
    toc: yes
    toc_depth: 3
    toc_float:
      collapsed: no
      smooth_scroll: no
  pdf_document:
    toc: yes
    toc_depth: '3'
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, out.width="100%")
# Load all the libraries that you will use once here
library(here)
library(data.table)
library(countrycode)
library(ggplot2)
library(magrittr)
library(ggrepel)
library(knitr)
library(kableExtra)
# Changing default settings here.
theme_set(theme_classic(base_size = 16))
```
# Case Study: Electric power consumption
# Objectives
*Describe the data and its limitations*
## Questions
1) By continent, which countries have the lowest and highest consumption?
2) By continent, which countries have the lowest and highest percentage change in consumption since 2000?
3) How does Portugal compare to other EU Countries?
# Data Details
Download "Electric power consumption (kWh per capita)" data.
- World Bank Indicator: `EG.USE.ELEC.KH.PC`.
- <https://data.worldbank.org/indicator/EG.USE.ELEC.KH.PC>
# Data Processing
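```{r}
# List the files in the working directory.
list.files()
# Skip the first 4 lines, which precede the header row;
# check.names = TRUE makes the year columns valid names such as "X1960".
api <- fread(here("data", "Electricity_data.csv"),
             skip = 4, header = TRUE,
             check.names = TRUE)
api
# View() needs an interactive session, so it is skipped when knitting.
if (interactive()) View(api)
```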
```{r readData, warning=FALSE}
# Add Continent to the electricity dataset.
cl <- as.data.table(codelist)[, .(continent, wb)]
apicl <- merge(api, cl, by.x = "Country.Code", by.y = "wb", all.x = TRUE)
setnames(apicl, c("continent"), c("Continent"))
if (interactive()) View(apicl)
# Reshape from wide (one column per year) to long (one row per country and year).
ED <- melt(apicl,
           id.vars = c("Continent", "Country.Name", "Country.Code", "Indicator.Code"),
           measure.vars = patterns("^X"),
           variable.name = "YearC",
           value.name = c("elcKW"),
           na.rm = TRUE)
if (interactive()) View(ED)
# Note cLabel will be used to label the ends of lines in plots.
ED[, `:=` (Year = as.numeric(gsub("^X", "", YearC)))][
  , fYear := factor(Year)][
  , maxYear := max(Year), by = .(Country.Name)][
  Year == maxYear, cLabel := Country.Name][
  , c("maxYear", "YearC") := NULL]
setcolorder(ED, c("Indicator.Code", "Continent", "Country.Name", "Country.Code",
                  "Year"))
setkeyv(ED, c("Indicator.Code", "Continent", "Country.Name", "Country.Code", "Year"))
ED
if (interactive()) View(ED)
```
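The reshaping and labelling steps above can be tried on a small example. The following is a minimal sketch on a made-up table (`toyWide`, `toyLong` and all values are hypothetical, not World Bank data); it only illustrates how `melt()` with `patterns("^X")` and the `cLabel` assignment behave.

```{r meltSketch}
library(data.table)

# Toy wide table (values are made up): one row per country, one column per year.
toyWide <- data.table(
  Country.Name = c("A", "B"),
  X2000 = c(100, NA),
  X2001 = c(110, 210)
)

# Reshape to long format; na.rm = TRUE drops the missing country-year cell.
toyLong <- melt(toyWide,
                id.vars = "Country.Name",
                measure.vars = patterns("^X"),
                variable.name = "YearC",
                value.name = "elcKW",
                na.rm = TRUE)

# Turn "X2000" into the number 2000.
toyLong[, Year := as.numeric(gsub("^X", "", YearC))]

# Label only the last available year of each country (NA elsewhere).
toyLong[, maxYear := max(Year), by = Country.Name][
  Year == maxYear, cLabel := Country.Name][
  , c("maxYear", "YearC") := NULL]
toyLong
```

Because `cLabel` is `NA` everywhere except each country's last year, passing it as the `label` aesthetic puts one label at the end of each line.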
# Analysis
## Portugal - Trends and Comparisons
> What are the time trends for Portugal?
Let's start by plotting Portugal's electric power consumption per capita over time.
```{r PTplot, warning=FALSE}
ED[Country.Name %in% "Portugal"] %>%
  ggplot(aes(Year, elcKW)) +
  geom_line(colour = "blue") +
  geom_point(colour = "blue") +
  scale_x_continuous(breaks=seq(1960, 2015, 5)) +
  scale_y_continuous(limits=c(0, 5000)) +
  ggtitle("Portugal") +
  xlab("Year") +
  ylab("kWh per capita")
```
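**Interpretation**
In `r ED[Country.Name == "Portugal", min(Year)]`, the first year with data, Portugal consumed
`r sprintf("%.0f", ED[Country.Name == "Portugal"][order(Year)][1, elcKW])` kWh per capita. In
`r ED[cLabel == "Portugal", Year]`, the most recent year with data, it consumed
`r sprintf("%.0f", ED[cLabel == "Portugal", elcKW])` kWh per capita.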
Let's now plot how Portugal compares with other European countries. The European Union
and World averages are included as a reference. Finland, Hungary, Sweden and Romania
are highlighted for discussion.
```{r PTvsEurope, warning=FALSE}
ED[Continent == "Europe"] %>%
  ggplot(aes(Year, elcKW, group=Country.Name, colour = Country.Name, label = cLabel)) +
  geom_line(colour="grey90") +
  geom_line(data = ED[Country.Name %in% c("Portugal", "European Union", "World",
                                          "Sweden", "Finland", "Hungary", "Romania")],
            aes(colour = Country.Name)) +
  scale_x_continuous(breaks=seq(1960, 2015, 5)) +
  scale_y_continuous(limits=c(0, 40000), breaks=seq(0, 40000, by= 5000)) +
  expand_limits(x = 2015) +
  geom_label_repel(data = ED[Country.Name %in% c("Portugal", "European Union", "World",
                                                 "Sweden", "Finland", "Hungary", "Romania")],
                   xlim = c(2005, 2015)) +
  theme(legend.position = "none") +
  ggtitle("Portugal compared to European countries") +
  xlab("Year") +
  ylab("kWh per capita")
```
**Interpretation**
Each in its most recent year with data, Portugal consumed
`r sprintf("%.0f", ED[cLabel == "Portugal", elcKW])` kWh per capita, compared with
`r sprintf("%.0f", ED[cLabel == "European Union", elcKW])` for the European Union average and
`r sprintf("%.0f", ED[cLabel == "World", elcKW])` for the world average. The highlighted lines
for Finland, Sweden, Hungary and Romania show how far consumption per capita differs
between European countries.
## Highest Consumption
> Which countries have the highest consumption by year? How do continents compare?
```{r highestContinent, collapse=TRUE}
# For each continent and year, keep the country with the highest consumption.
hgED <- ED[!is.na(Continent)][
  order(Continent, Year, -elcKW), head(.SD, 1), by = .(Continent, Year)][
  , CountryED := sprintf("%.0f - %s", elcKW, Country.Name)][
  , .(Continent, Year, CountryED)]
# One column per continent, one row per year.
hgCont <- dcast(hgED, Year ~ Continent, value.var = "CountryED")
hgCont %>%
  kable(align="clllll") %>%
  kable_styling(bootstrap_options = "striped")
```
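The chunk above relies on sorting and then keeping the first row of each group. A minimal sketch of that idiom on made-up data (`toyED`, `toyTop`, `toyTopWide` and all values are hypothetical) shows the intermediate and final shapes:

```{r topPerGroupSketch}
library(data.table)

# Toy long table (values are made up).
toyED <- data.table(
  Continent    = c("Europe", "Europe", "Asia", "Asia"),
  Country.Name = c("A", "B", "C", "D"),
  Year         = c(2000, 2000, 2000, 2000),
  elcKW        = c(4000, 6000, 900, 700)
)

# Sort descending within each group, then keep the first row per group.
toyTop <- toyED[order(Continent, Year, -elcKW)][
  , head(.SD, 1), by = .(Continent, Year)]

# One column per continent, one row per year.
toyTopWide <- dcast(toyTop, Year ~ Continent, value.var = "Country.Name")
toyTopWide
```

Dropping the minus sign in `order()` would return the lowest value per group instead of the highest.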
## Lowest Consumption
> Which countries have the lowest consumption by year?
```{r lowestContinent, collapse=TRUE}
# For each continent and year, keep the country with the lowest consumption.
lwED <- ED[!is.na(Continent)][
  order(Continent, Year, elcKW), head(.SD, 1), by = .(Continent, Year)][
  , CountryED := sprintf("%.0f - %s", elcKW, Country.Name)][
  , .(Continent, Year, CountryED)]
lwCont <- dcast(lwED, Year ~ Continent, value.var = "CountryED")
lwCont %>%
  kable(align="clllll") %>%
  kable_styling(bootstrap_options = "striped") %>%
  column_spec(1, bold = TRUE)
```
## Global Trends
> What are the global trends over time?
In the following plot, each country is drawn as a line and the world "average" is
highlighted in blue.
```{r globalTrendsLines, warning=FALSE}
ED[!is.na(Continent)] %>%
  ggplot(aes(Year, elcKW, group = Country.Name, label = cLabel)) +
  geom_line(colour="grey90") +
  geom_line(data=ED[Country.Name=="World"], colour="blue") +
  expand_limits(x = 2015) +
  geom_label_repel(data = ED[Country.Name=="World"],
                   xlim = c(2005, 2015), colour = "blue") +
  theme(legend.position = "none") +
  scale_x_continuous(breaks=seq(1960, 2015, 5)) +
  scale_y_continuous(limits=c(0, 40000), breaks=seq(0, 40000, by=5000)) +
  ggtitle("Electric power consumption: Global Trends") +
  ylab("kWh per capita")
```
A box-and-whisker plot is better suited to showing the variation between countries over
time.
```{r globalTrendsBoxplot, message=FALSE}
ED[!is.na(Continent)] %>%
  ggplot(aes(fYear, elcKW)) +
  geom_boxplot(width = 0.25, outlier.size = .5) +
  scale_x_discrete(breaks=seq(1960, 2015, 5)) +
  scale_y_continuous(limits=c(0, 40000), breaks=seq(0, 40000, by=5000)) +
  ggtitle("Box Whisker Plot for countries") +
  ylab("kWh per capita") +
  xlab("Year")
```
### Continents
We can look at the global trends by continent for comparison.
```{r continentTrendsBoxplot, message=FALSE}
# First add observations for "World" as a continent.
wED <- rbindlist(list(ED[!is.na(Continent)],
                      ED[!is.na(Continent)][, Continent:="World"]))
wED %>%
  ggplot(aes(fYear, elcKW)) +
  geom_boxplot(width = 0.4, outlier.size = .5) +
  geom_line(data = ED[Country.Name=="World"][
    , .(fYear, elcKW, Country.Name)],
    aes(fYear, y = elcKW, group = Country.Name), colour = "darkred") +
  scale_x_discrete(breaks=seq(1960, 2015, 10)) +
  scale_y_continuous(limits=c(0, 40000), breaks=seq(0, 40000, by=10000)) +
  facet_wrap(~Continent) +
  ggtitle("Box Whisker Plot for countries") +
  ylab("kWh per capita") +
  xlab("Year")+
  theme_classic(base_size=11) +
  labs(caption = "The red line is the world average.")
```
**Interpretation**
Although the world average is going up, there is still large variation between countries.
The interpretation by continent is similar. Note that in earlier years fewer countries
provided data, so there is likely to be some bias in the plots above.
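Question 2 asks for the percentage change in consumption since 2000. One possible way to compute it is sketched below on made-up data (`toyChg`, `toyPct`, `toyRange` and all values are hypothetical). The sketch assumes every country has an observation for 2000; countries without one would need to be filtered out or given a different baseline first.

```{r pctChangeSketch}
library(data.table)

# Toy long table (values are made up).
toyChg <- data.table(
  Continent    = c("Europe", "Europe", "Europe", "Europe"),
  Country.Name = c("A", "A", "B", "B"),
  Year         = c(2000, 2010, 2000, 2010),
  elcKW        = c(1000, 1500, 2000, 1800)
)

# Percentage change between 2000 and each country's last available year.
# Assumes the first row per country after filtering is the year 2000.
toyPct <- toyChg[Year >= 2000][order(Year),
  .(pctChange = 100 * (elcKW[.N] - elcKW[1]) / elcKW[1]),
  by = .(Continent, Country.Name)]

# Countries with the lowest and highest change within each continent.
toyRange <- toyPct[, .(lowest  = Country.Name[which.min(pctChange)],
                       highest = Country.Name[which.max(pctChange)]),
                   by = Continent]
toyRange
```

On the real data the same two steps would be applied to `ED[!is.na(Continent)]` instead of `toyChg`.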
# Conclusions
```{r worldConclusions, echo=FALSE}
# Last available observation of the world average.
worldLast <- ED[Country.Name=="World"][order(Year), .SD[.N]]
elcKWLast <- worldLast[, elcKW]
yearLast <- worldLast[, Year]
```
In `r yearLast`, the world average electric power consumption was
`r sprintf("%.0f", elcKWLast)` kWh per capita. Whichever way we look at the data, the
conclusion is that, in general, consumption per capita is increasing, but large
differences between countries remain.
<hr>
# Session Information
```{r sessionInfo, echo=TRUE}
sessionInfo()
```