Data Science Using R ‘Value Added’ (Assignment 2)

Data Science Using R Assignment

Question 1: Make a basic histogram using the age data from the Titanic data set using ggplot.
Link for downloading dataset Titanic: Click Here For Data

>dt1 <- "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"

>data <- read.csv(dt1)

>names(data)

OUTPUT
[1] "PassengerId" "Survived"
[3] "Pclass" "Name"
[5] "Sex" "Age"
[7] "SibSp" "Parch"
[9] "Ticket" "Fare"
[11] "Cabin" "Embarked"

>library(ggplot2)
>qplot(
+ data$Age,
+ geom = "histogram",
+ binwidth=25,
+ colour=I('blue')
+ )
Screenshot 7
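The same kind of histogram can also be drawn with ggplot() directly, which is what the question asks for. The sketch below is only an illustration: the small passengers data frame and its ages are hypothetical stand-ins for data$Age, and the bin width of 5 years is an assumed choice, not the value used above.

```r
library(ggplot2)

# Hypothetical ages standing in for data$Age; the NA mimics a missing value
passengers <- data.frame(
  Age = c(22, 38, 26, 35, NA, 54, 2, 27, 14, 4, 58, 20, 39, 14, 55)
)

# Histogram of Age with 5-year bins; na.rm = TRUE drops the missing age silently
age_hist <- ggplot(passengers, aes(x = Age)) +
  geom_histogram(binwidth = 5, colour = "blue", fill = "white", na.rm = TRUE) +
  labs(title = "Passenger age", x = "Age (years)", y = "Count")

print(age_hist)
```

With the real data, passengers would be replaced by the data frame returned by read.csv().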

Question 2: Use the data set Cereals.csv available at: Click Here For Data
a) Read in the dataset, print first five rows.

>dt2<-"https://gist.githubusercontent.com/lisawilliams/a91ffcea96ac3af9500bbf6b92f1408e/raw/728e9b2e4fb0da2baa34e2da2a9d732d74b484ab/cereal.csv"
>dta<-read.csv(dt2)
>names(dta)

OUTPUT
[1] "Cereal.Name"
[2] "Manufacturer"
[3] "Type"
[4] "Calories"
[5] "Protein..g."
[6] "Fat"
[7] "Sodium"
[8] "Dietary.Fiber"
[9] "Carbs"
[10] "Sugars"
[11] "Display.Shelf"
[12] "Potassium"
[13] "Vitamins.and.Minerals"
[14] "Serving.Size.Weight"
[15] "Cups.per.Serving"

>head(dta,5)
Screenshot 9
Screenshot 10
Screenshot 11

b) Add a new variable/column to the dataset called ’totalcarb’, which is the sum of ’carbo’ and ’sugars’. Recall Section

>dtb2=dta$Carbs + dta$Sugars
>dtb2
Screenshot 12
>dta$totalCarbs <- dtb2
>head(dta,5)
Screenshot 14

c) How many unique manufacturers are included in the dataset

>str(dta$Manufacturer)
>length(unique(dta$Manufacturer))
Screenshot 15
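Counting distinct values can be shown on a small example. The sketch below uses a hypothetical five-row data frame (the cereal names and manufacturer codes are made up) to show length(unique(...)) for the number of manufacturers and table() for the number of cereals per manufacturer.

```r
# Hypothetical miniature of the cereal data; names and codes are made up
cereals <- data.frame(
  Cereal.Name = c("A", "B", "C", "D", "E"),
  Manufacturer = c("K", "G", "K", "P", "G")
)

# Number of distinct manufacturers
n_manufacturers <- length(unique(cereals$Manufacturer))
print(n_manufacturers)  # prints [1] 3

# Number of cereals listed for each manufacturer
per_manufacturer <- table(cereals$Manufacturer)
print(per_manufacturer)
```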


Question 3: The ‘cars’ data set gives the speed of cars and the distances taken to stop. Note that the data were recorded in the 1920s. Plot the ‘cars’ data set as a scatter plot.

library(ggplot2)
ggplot(cars, mapping = aes(x = speed, y = dist)) +
geom_point()
Screenshot 17

Question 4: Write commands to connect MySQL database with R.

Commands to connect MySQL database with R.
R needs the RMySQL package to connect to a MySQL database.
dbConnect() is the function that creates the connection object.
Syntax: dbConnect(drv, user, password, dbname, host)
Where drv is the database driver, user is the username, password is the password for that user on the database server, dbname is the name of the database and host is the host name.
Example:
install.packages("RMySQL")
library("RMySQL")
mysqlconn = dbConnect(MySQL(), user = 'root', password = 'welcome', dbname='mydb',host='localhost')

a) Create a table student having student ID as primary key, studentname, class, marks, percentage.

mysqlconn = dbConnect(MySQL(), user = 'root', password = 'welcome', dbname = 'mydb', host = 'localhost')
dbSendQuery(mysqlconn, 'CREATE TABLE student(Id INTEGER PRIMARY KEY, studentname VARCHAR(20), class INT, marks INT, percentage INT)')

b) Write a command to Insert data of 5 students in table student

mysqlconn = dbConnect(MySQL(), user = 'root', password = 'welcome', dbname = 'mydb',host = 'localhost')
dbSendQuery(mysqlconn,"INSERT INTO student (Id, studentname, class) VALUES(1,'St1',12)")
dbSendQuery(mysqlconn,"INSERT INTO student (Id, studentname, class) VALUES(2,'St2',12)")
dbSendQuery(mysqlconn,"INSERT INTO student (Id, studentname, class) VALUES(3,'St3',12)")
dbSendQuery(mysqlconn,"INSERT INTO student (Id, studentname, class) VALUES(4,'St4',12)")
dbSendQuery(mysqlconn,"INSERT INTO student (Id, studentname, class) VALUES(5,'St5',12)")

c) Write a command to retrieve all data from student table

query = "SELECT * FROM student"
rs = dbSendQuery(mysqlconn, query)
result = dbFetch(rs, n = -1)
dbClearResult(rs)
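The commands in this question need a running MySQL server. To try the same create, insert and select steps without one, the sketch below swaps in an in-memory SQLite database through the DBI and RSQLite packages. This is an assumption made only for illustration and is not part of the original answer; the ? placeholders are the SQLite binding style, and the column name studentname is assumed from the question text.

```r
library(DBI)

# Assumption: RSQLite is installed; in-memory SQLite stands in for MySQL here
conn <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create the student table with Id as primary key
dbExecute(conn, "CREATE TABLE student (
  Id INTEGER PRIMARY KEY,
  studentname VARCHAR(20),
  class INT,
  marks INT,
  percentage INT
)")

# Parameterised insert: one row is inserted per element of the vectors;
# marks and percentage are not supplied, so they stay NULL (NA in R)
dbExecute(
  conn,
  "INSERT INTO student (Id, studentname, class) VALUES (?, ?, ?)",
  params = list(1:5, paste0("St", 1:5), rep(12L, 5))
)

# dbGetQuery() sends the query, fetches all rows and clears the result
students <- dbGetQuery(conn, "SELECT * FROM student")
print(students)

dbDisconnect(conn)
```

dbExecute() is meant for statements that return no rows, and dbGetQuery() combines the send, fetch and clear steps for a SELECT.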

Question 5: Explain the steps of reading and writing into a CSV file.

Steps of reading and writing into a CSV file:
Reading a CSV file
The contents of a CSV file can be read as a data frame in R using the read.csv(…) function. The file must either be in the current working directory, or the working directory must be changed with the setwd(…) command, or the full path to the file must be given.
Syntax: read.csv(file)
Writing into a CSV file
The contents of a data frame can be written into a CSV file with write.csv(x, file), where x is the data frame and file is the name of the output CSV file. If file is a bare file name, the CSV file is created in the current working directory.
Syntax: write.csv(x,file)
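The two steps can be tried together as a round trip. The sketch below is only an illustration: the students data frame is hypothetical, and a temporary file is used so that the working directory is left untouched. row.names = FALSE is an optional argument that stops write.csv() from adding a column of row numbers.

```r
# Hypothetical data frame used only for illustration
students <- data.frame(
  id = 1:3,
  name = c("St1", "St2", "St3"),
  marks = c(78, 85, 91)
)

# Write to a temporary file so the working directory is not touched
csv_path <- tempfile(fileext = ".csv")
write.csv(students, file = csv_path, row.names = FALSE)

# Read the file back as a data frame
students_back <- read.csv(csv_path)
print(students_back)
```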