AUTHOR
Rosana Ferrero
Data Scientist
AUTHOR
Juan L. López
Data Scientist
For example, we could reorganize the following data set, which describes John and Mary Smith and has two id variables (subject and time), so that each row represents one observation of one variable.

> smiths
     subject time age weight height
1 John Smith    1  33     90   1.87
2 Mary Smith    1  NA     NA   1.54

   subject time variable value
John Smith    1      age 33.00
John Smith    1   weight 90.00
John Smith    1   height  1.87
Mary Smith    1   height  1.54

Wide format: one column for each variable.
Long format: each row is a unique combination of the identification variables.
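
To make the wide/long distinction concrete, here is a minimal base R sketch that builds the Smiths table by hand and stacks its measured columns into long format. It is only an illustration of the idea, not how the reshape package implements `melt()`; the names `smiths_wide`, `id_vars`, `measured` and `smiths_long` are made up for this example.

```r
# Hand-built copy of the wide table shown above (one column per variable)
smiths_wide <- data.frame(
  subject = c("John Smith", "Mary Smith"),
  time    = c(1, 1),
  age     = c(33, NA),
  weight  = c(90, NA),
  height  = c(1.87, 1.54),
  stringsAsFactors = FALSE
)

id_vars  <- c("subject", "time")                   # identification variables
measured <- setdiff(names(smiths_wide), id_vars)   # age, weight, height

# One block of rows per measured variable, each keeping the id columns
pieces <- lapply(measured, function(v) {
  data.frame(smiths_wide[id_vars],
             variable = v,
             value    = smiths_wide[[v]],
             stringsAsFactors = FALSE)
})

# Stack the blocks and drop missing measurements
smiths_long <- do.call(rbind, pieces)
smiths_long <- smiths_long[!is.na(smiths_long$value), ]

print(smiths_long, row.names = FALSE)
```

Each remaining row is one (subject, time, variable) combination with its value, which is the long layout described above.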

> smiths
     subject time age weight height
1 John Smith    1  33     90   1.87
2 Mary Smith    1  NA     NA   1.54

> long <- melt(smiths)
Using subject as id variables
> long
     subject variable value
1 John Smith     time  1.00
2 Mary Smith     time  1.00
3 John Smith      age 33.00
4 Mary Smith      age    NA
5 John Smith   weight 90.00
6 Mary Smith   weight    NA
7 John Smith   height  1.87
8 Mary Smith   height  1.54

> long <- melt(smiths, id=c("subject","time"))
> long
     subject time variable value
1 John Smith    1      age 33.00
2 Mary Smith    1      age    NA
3 John Smith    1   weight 90.00
4 Mary Smith    1   weight    NA
5 John Smith    1   height  1.87
6 Mary Smith    1   height  1.54

> melt(smiths, id=c("subject","time"), na.rm=TRUE)
     subject time variable value
1 John Smith    1      age 33.00
2 John Smith    1   weight 90.00
3 John Smith    1   height  1.87
4 Mary Smith    1   height  1.54

> cast(long)
     subject time age weight height
1 John Smith    1  33     90   1.87
2 Mary Smith    1  NA     NA   1.54

> cast(long, ... ~ time)
     subject variable     1
1 John Smith      age 33.00
2 John Smith   weight 90.00
3 John Smith   height  1.87
4 Mary Smith   height  1.54


> head(airquality)
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6

> long <- melt(airquality, id=c("Month", "Day"))
> head(long)
  Month Day variable value
1     5   1    Ozone    41
2     5   2    Ozone    36
3     5   3    Ozone    12
4     5   4    Ozone    18
5     5   5    Ozone    NA
6     5   6    Ozone    28

> wide <- cast(long)
> head(wide)
  Month Day Ozone Solar.R Wind Temp
1     5   1    41     190  7.4   67
2     5   2    36     118  8.0   72
3     5   3    12     149 12.6   74
4     5   4    18     313 11.5   62
5     5   5    NA      NA 14.3   56
6     5   6    28      NA 14.9   66

> cast(long, Month ~ variable, mean)
  Month Ozone  Solar.R      Wind     Temp
1     5    NA       NA 11.622581 65.54839
2     6    NA 190.1667 10.266667 79.10000
3     7    NA 216.4839  8.941935 83.90323
4     8    NA       NA  8.793548 83.96774

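
The `NA` entries in the monthly means come from missing values in `airquality`: `mean()` returns `NA` as soon as one value is missing. The sketch below shows two possible ways around this, assuming the reshape package is installed and that `cast()` forwards extra arguments to the aggregation function through `...`, as its documentation describes. The object names `means_a`, `long_clean` and `means_b` are illustrative only.

```r
library(reshape)  # assumption: the reshape package is installed

# airquality ships with base R (datasets package)
long <- melt(airquality, id.vars = c("Month", "Day"))

# Option 1: forward na.rm = TRUE to mean() through cast()'s `...`
means_a <- cast(long, Month ~ variable, mean, na.rm = TRUE)

# Option 2: drop missing values while melting, then aggregate as before
long_clean <- melt(airquality, id.vars = c("Month", "Day"), na.rm = TRUE)
means_b <- cast(long_clean, Month ~ variable, mean)

means_a
```

Both routes should give the same table; removing the missing values at the melt step keeps every later `cast()` call free of the extra argument.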

> head(ChickWeight)
Grouped Data: weight ~ Time | Chick
  weight Time Chick Diet
1     42    0     1    1
2     51    2     1    1
3     59    4     1    1
4     64    6     1    1
5     76    8     1    1
6     93   10     1    1

> long <- melt(ChickWeight, id=2:4, na.rm=TRUE)
> head(long)
  Time Chick Diet variable value
1    0     1    1   weight    42
2    2     1    1   weight    51
3    4     1    1   weight    59
4    6     1    1   weight    64
5    8     1    1   weight    76
6   10     1    1   weight    93

> head(cast(long, Diet + Time ~ variable, mean))
  Diet Time   weight
1    1    0 41.40000
2    1    2 47.25000
3    1    4 56.47368
4    1    6 66.78947
5    1    8 79.68421
6    1   10 93.05263

> head(cast(long, Time ~ variable, mean))
  Time    weight
1    0  41.06000
2    2  49.22000
3    4  59.95918
4    6  74.30612
5    8  91.24490
6   10 107.83673

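
As a sanity check on what these two `cast()` calls compute, the same group means can be reproduced with base R's `tapply()`. This is only a cross-check sketch, not part of the reshape workflow; `chicks`, `by_time` and `by_diet_time` are names chosen for this example.

```r
# ChickWeight ships with base R; drop its grouped-data classes first
chicks <- as.data.frame(ChickWeight)

# Mean weight per Time (compare with cast(long, Time ~ variable, mean))
by_time <- tapply(chicks$weight, chicks$Time, mean)

# Mean weight per Diet and Time (compare with cast(long, Diet + Time ~ variable, mean))
by_diet_time <- tapply(chicks$weight,
                       list(Diet = chicks$Diet, Time = chicks$Time),
                       mean)

head(by_time)
by_diet_time["1", 1:6]
```

The first values of `by_time` and the first row of `by_diet_time` should agree with the cast output above (for example 41.06 at Time 0 overall, and 41.40 for Diet 1 at Time 0).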

> head(tips)
  total_bill  tip    sex smoker day   time size
1      16.99 1.01 Female     No Sun Dinner    2
2      10.34 1.66   Male     No Sun Dinner    3
3      21.01 3.50   Male     No Sun Dinner    3
4      23.68 3.31   Male     No Sun Dinner    2
5      24.59 3.61 Female     No Sun Dinner    4
6      25.29 4.71   Male     No Sun Dinner    4


> long <- melt(tips, id=c("day","time"))
> head(long)
  day   time   variable value
1 Sun Dinner total_bill 16.99
2 Sun Dinner total_bill 10.34
3 Sun Dinner total_bill 21.01
4 Sun Dinner total_bill 23.68
5 Sun Dinner total_bill 24.59
6 Sun Dinner total_bill 25.29


> wide <- cast(long)
Aggregation requires fun.aggregate: length used as default
> head(wide)
Aggregation requires fun.aggregate: length used as default
   day   time total_bill tip sex smoker size
1  Fri Dinner         12  12  12     12   12
2  Fri  Lunch          7   7   7      7    7
3  Sat Dinner         87  87  87     87   87
4  Sun Dinner         76  76  76     76   76
5 Thur Dinner          1   1   1      1    1
6 Thur  Lunch         61  61  61     61   61

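
The warning appears because many bills share the same `day` and `time`, so `cast()` cannot place a single value in each cell and falls back to counting rows with `length`. A minimal sketch of supplying the aggregation function explicitly follows, assuming the reshape package (which provides the `tips` data) is installed. Restricting the melt to the numeric columns is a choice made for this example so that `mean()` is meaningful; `long_num` and `avg` are illustrative names.

```r
library(reshape)  # assumption: the reshape package (with the tips data) is installed

# Melt only the numeric measures; sex and smoker are left out on purpose
long_num <- melt(tips,
                 id.vars      = c("day", "time"),
                 measure.vars = c("total_bill", "tip", "size"))

# Several rows per day/time cell, so state the aggregation function explicitly
avg <- cast(long_num, day + time ~ variable, mean)

avg
```

With an explicit function the message disappears, and each cell holds the mean of its group rather than a row count.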

> cast(melt(tips), sex ~ smoker | variable, mean)
Using sex, smoker, day, time as id variables
$total_bill
     sex       No      Yes
1 Female 18.10519 17.97788
2   Male 19.79124 22.28450

$tip
     sex       No      Yes
1 Female 2.773519 2.931515
2   Male 3.113402 3.051167

$size
     sex       No      Yes
1 Female 2.592593 2.242424
2   Male 2.711340 2.500000

> cast(melt(tips), sex ~ smoker, mean, subset=variable=="total_bill")
Using sex, smoker, day, time as id variables
     sex       No      Yes
1 Female 18.10519 17.97788
2   Male 19.79124 22.28450


> head(french_fries)
   time treatment subject rep potato buttery grassy rancid painty
61    1         1       3   1    2.9     0.0    0.0    0.0    5.5
25    1         1       3   2   14.0     0.0    0.0    1.1    0.0
62    1         1      10   1   11.0     6.4    0.0    0.0    0.0
26    1         1      10   2    9.9     5.9    2.9    2.2    0.0
63    1         1      15   1    1.2     0.1    0.0    1.1    5.1
27    1         1      15   2    8.8     3.0    3.6    1.5    2.3


> long <- melt(french_fries, id=1:4, na.rm=TRUE)
> head(long)
  time treatment subject rep variable value
1    1         1       3   1   potato   2.9
2    1         1       3   2   potato  14.0
3    1         1      10   1   potato  11.0
4    1         1      10   2   potato   9.9
5    1         1      15   1   potato   1.2
6    1         1      15   2   potato   8.8

> wide <- cast(long)
> head(wide)
  time treatment subject rep potato buttery grassy rancid painty
1    1         1       3   1    2.9     0.0    0.0    0.0    5.5
2    1         1       3   2   14.0     0.0    0.0    1.1    0.0
3    1         1      10   1   11.0     6.4    0.0    0.0    0.0
4    1         1      10   2    9.9     5.9    2.9    2.2    0.0
5    1         1      15   1    1.2     0.1    0.0    1.1    5.1
6    1         1      15   2    8.8     3.0    3.6    1.5    2.3

> cast(long, variable ~ ., mean)
  variable     (all)
1   potato 6.9525180
2  buttery 1.8236994
3   grassy 0.6641727
4   rancid 3.8522302
5   painty 2.5217579

> cast(long, treatment ~ rep, length)
  treatment   1   2
1         1 579 580
2         2 578 579

> cast(long, treatment + rep ~ ., length)
  treatment rep (all)
1         1   1   579
2         1   2   580
3         2   1   578
4         2   2   579
5         3   1   575
6         3   2   580

> cast(long, treatment ~ variable, mean)
  treatment   potato  buttery    grassy   rancid   painty
1         1 6.887931 1.780087 0.6491379 4.065517 2.583621
2         2 7.001724 1.973913 0.6629310 3.624569 2.455844