The source file we’ll work with today lives here on the internet:
https://data.giss.nasa.gov/gistemp/tabledata_v3/GLB.Ts+dSST.txt.
Open it in a web browser and have a look at it in all its messy glory. Make sure you understand what the numbers mean.
Normally, we’d have three options for fetching data from a file on the internet:
To download the file to your current directory in RStudio, open a Shell (this is a full bash shell), and run:
--2019-09-05 08:50:22-- https://data.giss.nasa.gov/gistemp/tabledata_v3/GLB.Ts+dSST.txt
Resolving data.giss.nasa.gov (data.giss.nasa.gov)... 129.164.128.233, 2001:4d0:2310:230::233
Connecting to data.giss.nasa.gov (data.giss.nasa.gov)|129.164.128.233|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16198 (16K) [text/plain]
Saving to: ‘data/global-mean.txt’
0K .......... ..... 100% 592K=0.03s
2019-09-05 08:50:22 (592 KB/s) - ‘data/global-mean.txt’ saved [16198/16198]
Note the capital “-O”, which specifies our output file.
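If you would rather stay inside R, the same download can be sketched with base R’s download.file. The destination path data/global-mean.txt is taken from the wget log above; the tryCatch wrapper and the ok variable are illustrative additions, not part of the original lesson.

```r
# URL from the lesson; destination matches the path shown in the wget log.
url  <- "https://data.giss.nasa.gov/gistemp/tabledata_v3/GLB.Ts+dSST.txt"
dest <- file.path("data", "global-mean.txt")

# Create the data directory if it does not exist yet.
dir.create(dirname(dest), showWarnings = FALSE, recursive = TRUE)

# download.file() signals a warning or an error on failure, so catch both
# and record whether the download worked.
ok <- tryCatch({
  download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  TRUE
}, warning = function(w) FALSE, error = function(e) FALSE)

if (!ok) message("Download failed; check the URL and your connection.")
```

The destfile argument plays the same role here as the “-O” flag does for wget.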
Fiddle all you like with the data import wizard; you’re not going to get this file to parse correctly!
Open up the text file in RStudio by navigating to your data directory on the “Files” tab and clicking on it.
Hand edit it to:
We’ll do the rest of the clean up in R.
We can now use R’s read.table function to load the file into a table in R:
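A minimal sketch of the read.table call, assuming the hand-edited file has a single header row followed by whitespace-separated values. To keep the example self-contained it reads a tiny inline sample through the text argument instead of the real file; the D-N entries (including the `***` marker) are placeholders, and the name global_temps_raw is hypothetical.

```r
# A tiny stand-in for the hand-edited file. The Year/Jan/Feb/Mar values are
# the first two rows of the table; the D-N values are placeholders.
sample_text <- "Year Jan Feb Mar D-N
1880 -30 -21 -18 ***
1881 -10 -14 1 -11"

# header = TRUE uses the first line as column names. read.table() also
# rewrites names that are not valid R identifiers, so "D-N" becomes "D.N".
global_temps_raw <- read.table(text = sample_text, header = TRUE,
                               stringsAsFactors = FALSE)

# For the real file, the call would look something like:
# global_temps_raw <- read.table("data/global-mean.txt", header = TRUE)

str(global_temps_raw)
```

In versions of R before 4.0.0, stringsAsFactors defaulted to TRUE, so a column containing non-numeric text was read as a factor unless you set it explicitly as above.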
As always, let’s sanity check what kinds of vectors we have in each of our columns:
Year Jan Feb Mar
Min. :1880 Min. :-70.0000 Min. :-61.000 Min. :-62.000
1st Qu.:1914 1st Qu.:-28.0000 1st Qu.:-24.000 1st Qu.:-24.000
Median :1948 Median : -4.0000 Median : -6.000 Median : -1.000
Mean :1948 Mean : 0.6934 Mean : 2.168 Mean : 3.781
3rd Qu.:1982 3rd Qu.: 27.0000 3rd Qu.: 30.000 3rd Qu.: 26.000
Max. :2016 Max. :117.0000 Max. :135.000 Max. :130.000
Apr May Jun Jul
Min. :-59.000 Min. :-54.000 Min. :-52.0000 Min. :-48.000
1st Qu.:-26.000 1st Qu.:-25.000 1st Qu.:-25.0000 1st Qu.:-20.000
Median : -5.000 Median : -6.000 Median : -7.0000 Median : -5.000
Mean : 1.715 Mean : 1.248 Mean : -0.4161 Mean : 2.715
3rd Qu.: 25.000 3rd Qu.: 26.000 3rd Qu.: 16.0000 3rd Qu.: 15.000
Max. :109.000 Max. : 93.000 Max. : 78.0000 Max. : 83.000
Aug Sep Oct Nov
Min. :-51.000 Min. :-47.000 Min. :-55.000 Min. :-56.00
1st Qu.:-20.000 1st Qu.:-17.000 1st Qu.:-19.000 1st Qu.:-19.00
Median : -4.000 Median : -3.000 Median : -1.000 Median : -2.00
Mean : 2.978 Mean : 4.701 Mean : 5.328 Mean : 3.81
3rd Qu.: 19.000 3rd Qu.: 20.000 3rd Qu.: 20.000 3rd Qu.: 15.00
Max. : 98.000 Max. : 90.000 Max. :106.000 Max. :104.00
Dec J.D D.N DJF
Min. :-78.0000 Min. :-47.000 -9 : 7 Min. :-64.00
1st Qu.:-25.0000 1st Qu.:-21.000 -22 : 5 1st Qu.:-25.00
Median : -8.0000 Median : -7.000 -10 : 4 Median : -8.50
Mean : 0.5329 Mean : 2.438 -2 : 4 Mean : 1.11
3rd Qu.: 22.0000 3rd Qu.: 19.000 -25 : 4 3rd Qu.: 27.25
Max. :111.0000 Max. : 99.000 -18 : 3 Max. :121.00
(Other):110 NA's :1
MAM JJA SON Year.1
Min. :-56.000 Min. :-47.000 Min. :-47.000 Min. :1880
1st Qu.:-25.000 1st Qu.:-21.000 1st Qu.:-18.000 1st Qu.:1914
Median : -6.000 Median : -6.000 Median : -3.000 Median :1948
Mean : 2.255 Mean : 1.737 Mean : 4.628 Mean :1948
3rd Qu.: 27.000 3rd Qu.: 16.000 3rd Qu.: 17.000 3rd Qu.:1982
Max. :111.000 Max. : 85.000 Max. : 97.000 Max. :2016
We can see that D.N and DJF are messed up; why?
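One way to investigate a column that did not come in as numbers is to look for the entries that cannot be converted. The sketch below uses a made-up character vector; the `***` entry is an assumed stand-in for whatever non-numeric marker appears in your copy of the file.

```r
# Hypothetical contents of a column that failed to parse as numeric.
d_n <- c("-20", "***", "-11")

# as.numeric() returns NA (with a warning) for anything that is not a number.
as_num <- suppressWarnings(as.numeric(d_n))

# Entries that were present in the text but became NA are the culprits.
bad_values <- unique(d_n[is.na(as_num) & !is.na(d_n)])
bad_values
```

If the column is a factor, wrap it in as.character() first. Once you know the offending marker, read.table’s na.strings argument can turn it into NA at load time.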
Let’s load the tidyverse:
The tidyverse’s definition of Tidy Data is a table of values where:
Now let’s take a moment to appreciate all of the ways in which this data set is not Tidy. It:
Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1 1880 -30 -21 -18 -27 -14 -29 -24 -8 -17 -16 -19 -22
2 1881 -10 -14 1 -3 -4 -28 -7 -3 -9 -20 -26 -16
3 1882 9 8 1 -20 -18 -25 -11 3 -1 -23 -21 -25
4 1883 -34 -42 -18 -25 -26 -13 -9 -14 -19 -12 -21 -19
5 1884 -18 -13 -36 -36 -32 -38 -35 -27 -24 -22 -30 -30
6 1885 -66 -30 -24 -45 -42 -50 -29 -27 -19 -20 -22 -7
7 1886 -43 -46 -41 -29 -27 -39 -16 -31 -19 -25 -26 -25
8 1887 -66 -48 -32 -37 -33 -21 -19 -28 -19 -32 -25 -38
9 1888 -43 -43 -47 -28 -22 -20 -10 -11 -7 1 0 -12
10 1889 -21 14 4 4 -3 -12 -5 -18 -18 -22 -32 -31
11 1890 -48 -48 -41 -38 -48 -27 -30 -36 -36 -23 -37 -30
12 1891 -46 -49 -15 -25 -17 -22 -22 -21 -13 -24 -37 -3
13 1892 -26 -15 -36 -35 -25 -20 -28 -20 -25 -17 -49 -29
14 1893 -69 -51 -24 -32 -35 -24 -14 -24 -18 -16 -17 -38
15 1894 -55 -31 -20 -41 -30 -43 -32 -29 -23 -17 -25 -22
16 1895 -44 -42 -30 -23 -23 -25 -16 -16 -2 -11 -15 -12
17 1896 -23 -15 -29 -33 -19 -13 -6 -9 -5 4 -16 -12
18 1897 -22 -19 -12 -1 0 -12 -4 -3 -4 -10 -18 -26
19 1898 -6 -34 -55 -33 -35 -20 -22 -22 -19 -32 -35 -22
20 1899 -18 -39 -35 -21 -20 -26 -13 -4 0 0 12 -27
21 1900 -40 -8 2 -14 -6 -15 -9 -4 1 8 -13 -14
22 1901 -30 -5 5 -6 -18 -10 -9 -13 -17 -29 -17 -30
23 1902 -19 -3 -29 -27 -31 -34 -26 -28 -20 -27 -36 -46
24 1903 -27 -6 -23 -39 -41 -44 -30 -44 -43 -42 -38 -47
25 1904 -64 -55 -46 -50 -50 -49 -48 -43 -47 -35 -16 -29
26 1905 -38 -59 -25 -36 -33 -31 -25 -21 -15 -23 -8 -21
27 1906 -31 -34 -15 -2 -21 -22 -27 -19 -25 -20 -38 -18
28 1907 -44 -53 -25 -40 -46 -43 -35 -37 -32 -24 -51 -50
29 1908 -46 -36 -58 -46 -40 -39 -35 -45 -33 -43 -51 -50
30 1909 -70 -47 -52 -59 -54 -52 -43 -30 -37 -39 -31 -55
31 1910 -44 -43 -47 -39 -34 -36 -31 -34 -37 -39 -56 -69
32 1911 -64 -60 -62 -55 -51 -47 -41 -43 -38 -26 -20 -25
33 1912 -27 -13 -37 -20 -20 -26 -41 -51 -47 -55 -38 -42
34 1913 -41 -44 -44 -36 -45 -46 -34 -32 -32 -34 -18 -4
35 1914 2 -13 -23 -28 -19 -22 -24 -15 -13 -5 -20 -10
36 1915 -20 -1 -8 7 -1 -16 -3 -15 -12 -22 -12 -25
37 1916 -20 -23 -31 -25 -27 -44 -34 -27 -29 -28 -42 -78
38 1917 -46 -53 -47 -38 -48 -40 -23 -26 -18 -35 -29 -71
39 1918 -44 -33 -21 -40 -37 -28 -22 -26 -14 -3 -16 -30
40 1919 -21 -19 -25 -17 -20 -28 -21 -19 -17 -16 -29 -35
41 1920 -15 -22 -8 -26 -26 -33 -32 -29 -20 -29 -33 -47
42 1921 -4 -21 -28 -36 -36 -31 -16 -24 -16 -6 -16 -18
43 1922 -34 -44 -13 -22 -34 -32 -27 -31 -29 -33 -17 -17
44 1923 -27 -37 -32 -38 -33 -24 -29 -30 -28 -13 3 -6
45 1924 -24 -27 -12 -35 -19 -28 -27 -35 -30 -36 -23 -43
46 1925 -34 -35 -24 -25 -30 -34 -30 -19 -13 -17 3 11
47 1926 20 7 12 -15 -25 -25 -21 -11 -11 -11 -6 -30
48 1927 -28 -21 -39 -31 -25 -27 -15 -19 -6 -1 -4 -36
49 1928 -4 -12 -28 -29 -30 -41 -21 -25 -20 -19 -9 -20
50 1929 -47 -61 -34 -40 -39 -43 -33 -29 -23 -15 -14 -55
51 1930 -29 -24 -8 -26 -25 -19 -17 -11 -11 -8 14 -9
52 1931 -10 -22 -6 -21 -22 -6 1 0 -6 0 -12 -10
53 1932 13 -18 -20 -7 -22 -30 -24 -24 -11 -10 -26 -22
54 1933 -34 -32 -29 -23 -25 -32 -20 -23 -26 -24 -31 -47
55 1934 -27 -4 -31 -27 -11 -14 -11 -10 -16 -11 -1 -9
56 1935 -37 11 -13 -35 -26 -23 -19 -17 -17 -8 -29 -22
57 1936 -29 -39 -23 -20 -17 -19 -6 -12 -6 -4 -5 -4
58 1937 -11 5 -17 -17 -7 -8 -5 3 14 10 9 -12
59 1938 0 -4 5 5 -7 -17 -9 -4 3 11 1 -26
60 1939 -13 -12 -20 -12 -7 -8 -6 -5 0 -3 6 40
61 1940 -15 6 12 16 5 5 10 1 12 7 13 19
62 1941 13 23 6 11 10 4 15 14 2 24 12 14
63 1942 26 5 13 14 14 11 2 -3 0 6 13 12
64 1943 -1 22 1 13 10 -1 14 3 11 30 25 28
65 1944 41 31 34 27 26 22 23 23 31 27 12 5
66 1945 13 2 11 24 10 2 7 25 22 22 10 -10
67 1946 15 6 0 11 -4 -17 -9 -8 -2 -6 -2 -29
68 1947 -13 -8 5 4 -6 0 -6 -8 -14 6 -1 -18
69 1948 5 -13 -23 -9 8 -5 -13 -10 -10 -7 -8 -23
70 1949 9 -16 -1 -7 -9 -22 -13 -8 -8 -3 -8 -19
71 1950 -30 -26 -6 -21 -12 -6 -9 -18 -10 -20 -35 -20
72 1951 -35 -44 -19 -10 -2 -5 0 5 7 6 0 15
73 1952 16 12 -10 2 -5 -4 5 7 8 -4 -17 -2
74 1953 9 16 11 20 8 8 2 8 6 5 -5 3
75 1954 -28 -10 -12 -18 -20 -16 -16 -13 -7 -1 8 -18
76 1955 11 -21 -36 -23 -20 -8 -9 4 -13 -5 -28 -32
[ reached 'max' / getOption("max.print") -- omitted 61 rows ]
Alternatively, we could use the tidyverse select function (useful inside of a pipe chain):
Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1 1880 -30 -21 -18 -27 -14 -29 -24 -8 -17 -16 -19 -22
2 1881 -10 -14 1 -3 -4 -28 -7 -3 -9 -20 -26 -16
3 1882 9 8 1 -20 -18 -25 -11 3 -1 -23 -21 -25
4 1883 -34 -42 -18 -25 -26 -13 -9 -14 -19 -12 -21 -19
5 1884 -18 -13 -36 -36 -32 -38 -35 -27 -24 -22 -30 -30
6 1885 -66 -30 -24 -45 -42 -50 -29 -27 -19 -20 -22 -7
7 1886 -43 -46 -41 -29 -27 -39 -16 -31 -19 -25 -26 -25
8 1887 -66 -48 -32 -37 -33 -21 -19 -28 -19 -32 -25 -38
9 1888 -43 -43 -47 -28 -22 -20 -10 -11 -7 1 0 -12
10 1889 -21 14 4 4 -3 -12 -5 -18 -18 -22 -32 -31
11 1890 -48 -48 -41 -38 -48 -27 -30 -36 -36 -23 -37 -30
12 1891 -46 -49 -15 -25 -17 -22 -22 -21 -13 -24 -37 -3
13 1892 -26 -15 -36 -35 -25 -20 -28 -20 -25 -17 -49 -29
14 1893 -69 -51 -24 -32 -35 -24 -14 -24 -18 -16 -17 -38
15 1894 -55 -31 -20 -41 -30 -43 -32 -29 -23 -17 -25 -22
16 1895 -44 -42 -30 -23 -23 -25 -16 -16 -2 -11 -15 -12
17 1896 -23 -15 -29 -33 -19 -13 -6 -9 -5 4 -16 -12
18 1897 -22 -19 -12 -1 0 -12 -4 -3 -4 -10 -18 -26
19 1898 -6 -34 -55 -33 -35 -20 -22 -22 -19 -32 -35 -22
20 1899 -18 -39 -35 -21 -20 -26 -13 -4 0 0 12 -27
21 1900 -40 -8 2 -14 -6 -15 -9 -4 1 8 -13 -14
22 1901 -30 -5 5 -6 -18 -10 -9 -13 -17 -29 -17 -30
23 1902 -19 -3 -29 -27 -31 -34 -26 -28 -20 -27 -36 -46
24 1903 -27 -6 -23 -39 -41 -44 -30 -44 -43 -42 -38 -47
25 1904 -64 -55 -46 -50 -50 -49 -48 -43 -47 -35 -16 -29
26 1905 -38 -59 -25 -36 -33 -31 -25 -21 -15 -23 -8 -21
27 1906 -31 -34 -15 -2 -21 -22 -27 -19 -25 -20 -38 -18
28 1907 -44 -53 -25 -40 -46 -43 -35 -37 -32 -24 -51 -50
29 1908 -46 -36 -58 -46 -40 -39 -35 -45 -33 -43 -51 -50
30 1909 -70 -47 -52 -59 -54 -52 -43 -30 -37 -39 -31 -55
31 1910 -44 -43 -47 -39 -34 -36 -31 -34 -37 -39 -56 -69
32 1911 -64 -60 -62 -55 -51 -47 -41 -43 -38 -26 -20 -25
33 1912 -27 -13 -37 -20 -20 -26 -41 -51 -47 -55 -38 -42
34 1913 -41 -44 -44 -36 -45 -46 -34 -32 -32 -34 -18 -4
35 1914 2 -13 -23 -28 -19 -22 -24 -15 -13 -5 -20 -10
36 1915 -20 -1 -8 7 -1 -16 -3 -15 -12 -22 -12 -25
37 1916 -20 -23 -31 -25 -27 -44 -34 -27 -29 -28 -42 -78
38 1917 -46 -53 -47 -38 -48 -40 -23 -26 -18 -35 -29 -71
39 1918 -44 -33 -21 -40 -37 -28 -22 -26 -14 -3 -16 -30
40 1919 -21 -19 -25 -17 -20 -28 -21 -19 -17 -16 -29 -35
41 1920 -15 -22 -8 -26 -26 -33 -32 -29 -20 -29 -33 -47
42 1921 -4 -21 -28 -36 -36 -31 -16 -24 -16 -6 -16 -18
43 1922 -34 -44 -13 -22 -34 -32 -27 -31 -29 -33 -17 -17
44 1923 -27 -37 -32 -38 -33 -24 -29 -30 -28 -13 3 -6
45 1924 -24 -27 -12 -35 -19 -28 -27 -35 -30 -36 -23 -43
46 1925 -34 -35 -24 -25 -30 -34 -30 -19 -13 -17 3 11
47 1926 20 7 12 -15 -25 -25 -21 -11 -11 -11 -6 -30
48 1927 -28 -21 -39 -31 -25 -27 -15 -19 -6 -1 -4 -36
49 1928 -4 -12 -28 -29 -30 -41 -21 -25 -20 -19 -9 -20
50 1929 -47 -61 -34 -40 -39 -43 -33 -29 -23 -15 -14 -55
51 1930 -29 -24 -8 -26 -25 -19 -17 -11 -11 -8 14 -9
52 1931 -10 -22 -6 -21 -22 -6 1 0 -6 0 -12 -10
53 1932 13 -18 -20 -7 -22 -30 -24 -24 -11 -10 -26 -22
54 1933 -34 -32 -29 -23 -25 -32 -20 -23 -26 -24 -31 -47
55 1934 -27 -4 -31 -27 -11 -14 -11 -10 -16 -11 -1 -9
56 1935 -37 11 -13 -35 -26 -23 -19 -17 -17 -8 -29 -22
57 1936 -29 -39 -23 -20 -17 -19 -6 -12 -6 -4 -5 -4
58 1937 -11 5 -17 -17 -7 -8 -5 3 14 10 9 -12
59 1938 0 -4 5 5 -7 -17 -9 -4 3 11 1 -26
60 1939 -13 -12 -20 -12 -7 -8 -6 -5 0 -3 6 40
61 1940 -15 6 12 16 5 5 10 1 12 7 13 19
62 1941 13 23 6 11 10 4 15 14 2 24 12 14
63 1942 26 5 13 14 14 11 2 -3 0 6 13 12
64 1943 -1 22 1 13 10 -1 14 3 11 30 25 28
65 1944 41 31 34 27 26 22 23 23 31 27 12 5
66 1945 13 2 11 24 10 2 7 25 22 22 10 -10
67 1946 15 6 0 11 -4 -17 -9 -8 -2 -6 -2 -29
68 1947 -13 -8 5 4 -6 0 -6 -8 -14 6 -1 -18
69 1948 5 -13 -23 -9 8 -5 -13 -10 -10 -7 -8 -23
70 1949 9 -16 -1 -7 -9 -22 -13 -8 -8 -3 -8 -19
71 1950 -30 -26 -6 -21 -12 -6 -9 -18 -10 -20 -35 -20
72 1951 -35 -44 -19 -10 -2 -5 0 5 7 6 0 15
73 1952 16 12 -10 2 -5 -4 5 7 8 -4 -17 -2
74 1953 9 16 11 20 8 8 2 8 6 5 -5 3
75 1954 -28 -10 -12 -18 -20 -16 -16 -13 -7 -1 8 -18
76 1955 11 -21 -36 -23 -20 -8 -9 4 -13 -5 -28 -32
[ reached 'max' / getOption("max.print") -- omitted 61 rows ]
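The two ways of keeping only the year and month columns can be sketched as follows. The data frame raw and its J.D column value are a small hypothetical stand-in for the full table; only the technique is the point.

```r
library(dplyr)

# Hypothetical miniature of the loaded table: a few month columns plus one
# summary column (J.D) that we want to drop.
raw <- data.frame(Year = c(1880, 1881),
                  Jan  = c(-30, -10),
                  Feb  = c(-21, -14),
                  Dec  = c(-22, -16),
                  J.D  = c(NA, NA))

# Base R: index the columns by name (or by position).
base_subset <- raw[, c("Year", "Jan", "Feb", "Dec")]

# dplyr: select() accepts bare column names and ranges such as Jan:Dec.
tidy_subset <- raw %>% select(Year, Jan:Dec)

tidy_subset
```

Both approaches return the same columns; select is convenient when the step sits in the middle of a longer pipe.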
You can find this file in RStudio: Help -> Cheatsheets -> Data Manipulation with dplyr and tidyr
The picture that matches what we need to do tells us it’s a gather:
What happened there? Take a look at the table.
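To see what gather does on a table small enough to read at a glance, here is a sketch on two years and two months. The column names Month and Temp are assumptions chosen for illustration.

```r
library(tidyr)

# Wide layout: one row per year, one column per month.
wide <- data.frame(Year = c(1880, 1881),
                   Jan  = c(-30, -10),
                   Feb  = c(-21, -14))

# gather() stacks every column except Year into key/value pairs:
# the old column name goes into Month, the cell value into Temp.
long <- gather(wide, key = "Month", value = "Temp", -Year)

long
```

Each year now appears once per month, so the table has grown longer and narrower. Newer versions of tidyr recommend pivot_longer for the same job, but gather still works.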
We can use colnames to update the names of our columns and tolower to switch strings in character vectors to all lower case:
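A short sketch of that renaming step, using a hypothetical three-column data frame in place of the real one:

```r
# Hypothetical long-format table with capitalised column names.
temps <- data.frame(Year = 1880, Month = "Jan", Temp = -30)

# colnames() can be read and assigned to; tolower() lower-cases each name.
colnames(temps) <- tolower(colnames(temps))

colnames(temps)
```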
Let’s make a vector holding month numbers:
Remember what this does:
[1] 3
[1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov"
[12] "Dec"
So we can do this:
Once you have a named vector, you can index elements by name in addition to using numeric indexes. For example:
Apr
4
Apr Aug
4 8
Apr Aug Apr
4 8 4
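The steps above can be sketched in a few lines. The variable name month_numbers is an assumption; month.abb is a constant built into R.

```r
# Month numbers 1 to 12, named with R's built-in month abbreviations.
month_numbers <- 1:12
names(month_numbers) <- month.abb

# Index by a single name, or by a character vector (repeats are allowed).
month_numbers["Apr"]
month_numbers[c("Apr", "Aug")]
month_numbers[c("Apr", "Aug", "Apr")]
```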
Making our month number column then becomes as easy as indexing on our months column:
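A minimal sketch of that lookup, assuming a character column called month holding abbreviations such as "Jan"; the small temps data frame is hypothetical.

```r
# Named lookup vector: month abbreviation -> month number.
month_numbers <- setNames(1:12, month.abb)

# Hypothetical long-format table with a character month column.
temps <- data.frame(year  = c(1880, 1880, 1881),
                    month = c("Jan", "Feb", "Dec"),
                    stringsAsFactors = FALSE)

# Index the lookup vector by the whole column at once; unname() drops
# the names so the new column is a plain integer vector.
temps$month_n <- unname(month_numbers[temps$month])

temps
```

If the month column is a factor, convert it with as.character() before indexing. Indexing by a factor uses its integer codes, not its labels, which can silently give the wrong numbers.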
The rep (repeat) function
If you need a new column that contains a set of values that regularly repeats, an alternative solution is to use the rep (repeat) function:
The each argument says repeat each value 137 times.
Alternatively, we can specify a times argument:
[1] 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11
[24] 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10
[47] 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9
[70] 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8
[93] 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7
[116] 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6
[139] 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5
[162] 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4
[185] 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3
[208] 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2
[231] 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1
[254] 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12
[277] 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11
[300] 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10
[323] 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9
[346] 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8
[369] 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7
[392] 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6
[415] 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5
[438] 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4
[461] 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3
[484] 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2
[507] 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1
[530] 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12
[553] 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11
[576] 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10
[599] 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9
[622] 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8
[645] 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7
[668] 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6
[691] 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5
[714] 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4
[737] 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3
[760] 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2
[783] 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1
[806] 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12
[829] 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11
[852] 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10
[875] 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9
[898] 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8
[921] 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7
[944] 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6
[967] 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5
[990] 6 7 8 9 10 11 12 1 2 3 4
[ reached getOption("max.print") -- omitted 644 entries ]
Note the difference!
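The difference is easier to see on a short vector. This sketch uses 1:3 and a count of 2 instead of 1:12 and 137:

```r
# each: repeat every element in place before moving to the next one.
by_each <- rep(1:3, each = 2)    # 1 1 2 2 3 3

# times: repeat the whole vector end to end.
by_times <- rep(1:3, times = 2)  # 1 2 3 1 2 3
```

Which one you need depends on how the rows are ordered: each fits a table sorted by month and then by year, times fits a table sorted by year and then by month.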
Finally, we’ll use a helper function from a new package called lubridate to easily create a date column:
library(lubridate)
global_temps$date <- make_date(year = global_temps$year, month = global_temps$month_n)
We’ll explore Dates and Times in more depth next week.
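To check what make_date produces, here is a small sketch on a hypothetical two-row data frame with the same year and month_n columns:

```r
library(lubridate)

# Hypothetical miniature of global_temps.
temps <- data.frame(year = c(1880, 1880), month_n = c(1, 2))

# make_date() builds Date values from numeric parts; the day argument
# defaults to 1, so each date falls on the first of the month.
temps$date <- make_date(year = temps$year, month = temps$month_n)

temps$date
class(temps$date)
```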
Now, on your own:
Use geom_smooth to add a kernel trend line (why is this a nice approach given these data?)
lm
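As a starting point for the exercise, the sketch below builds a plot with a smoothed trend line. It assumes columns named date and temp, uses only the ten January values for 1880 to 1889 from the table above, and takes “kernel trend line” to mean ggplot2’s loess smoother; swap in your full global_temps table when you try it.

```r
library(ggplot2)

# January values for 1880-1889, copied from the table above.
jan <- data.frame(
  date = as.Date(paste0(1880:1889, "-01-01")),
  temp = c(-30, -10, 9, -34, -18, -66, -43, -66, -43, -21)
)

# Raw series as a line, with a loess (local regression) smooth on top.
p <- ggplot(jan, aes(x = date, y = temp)) +
  geom_line() +
  geom_smooth(method = "loess", formula = y ~ x)

# Print p in an interactive session to draw the plot.
```

Changing method to "lm" fits a single straight line instead, which is a useful comparison for data that do not follow one linear trend.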