# Data Science Using R ‘Value Added’ (Assignment 2)

Data Science Using R Assignment

Question 1: Make a basic histogram using the age data from the Titanic data set using ggplot.

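``````
> library(ggplot2)
> # Use the raw CSV link; the github.com/.../blob/... page URL returns HTML, not CSV
> dt1 <- "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"
> data <- read.csv(dt1)
> names(data)

OUTPUT
 "PassengerId" "Survived"
 "Pclass" "Name"
 "Sex" "Age"
 "SibSp" "Parch"
 "Ticket" "Fare"
 "Cabin" "Embarked"

> # Histogram of passenger age; rows with a missing Age are dropped with a warning.
> # binwidth = 25 gives only a few wide bars; a smaller value such as 5 shows more detail.
> ggplot(data, aes(x = Age)) +
+   geom_histogram(binwidth = 25, colour = "blue")
``````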

Question 2: Use the data set Cereals.csv, available at the URL in the code below.
a) Read in the dataset and print the first five rows.

``````
> dt2 <- "https://gist.githubusercontent.com/lisawilliams/a91ffcea96ac3af9500bbf6b92f1408e/raw/728e9b2e4fb0da2baa34e2da2a9d732d74b484ab/cereal.csv"
> dta <- read.csv(dt2)   # read the CSV into a data frame
> head(dta, 5)           # print the first five rows (output omitted here)
> names(dta)

OUTPUT
"Cereal.Name"
"Manufacturer"
"Type"
"Calories"
"Protein..g."
"Fat"
"Sodium"
"Dietary.Fiber"
"Carbs"
"Sugars"
"Display.Shelf"
"Potassium"
"Vitamins.and.Minerals"
"Serving.Size.Weight"
"Cups.per.Serving"
``````

b) Add a new variable/column to the dataset called ’totalcarb’, which is the sum of ’carbo’ and ’sugars’. Recall Section

``````
> # The question's 'carbo' and 'sugars' columns are named Carbs and Sugars in this file
> dtb2 <- dta$Carbs + dta$Sugars
> dtb2
> dta$totalCarbs <- dtb2
``````

c) How many unique manufacturers are included in the dataset?

``length(unique(dta$Manufacturer))``
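Parts b and c can be checked on a small example without downloading anything. The sketch below uses a made-up data frame (`toy`, with invented names and values) that only mimics the relevant columns of the cereal data. It adds a column as an element-wise sum and counts distinct manufacturers. Note that `str()` only describes a column's structure, so it is not used for the count here.

```r
# Hypothetical stand-in for the cereal data; names and values are made up
toy <- data.frame(
  Cereal.Name  = c("A", "B", "C", "D"),
  Manufacturer = c("K", "G", "K", "P"),
  Carbs        = c(5, 8, 7, 10),
  Sugars       = c(6, 1, 3, 2)
)

# Part b: new column as the element-wise sum of two existing columns
toy$totalCarbs <- toy$Carbs + toy$Sugars
print(toy$totalCarbs)

# Part c: number of distinct manufacturers
n_manufacturers <- length(unique(toy$Manufacturer))
print(n_manufacturers)

# table() additionally shows how many rows each manufacturer has
print(table(toy$Manufacturer))
```

The same two expressions apply to the real data frame once it has been read with `read.csv()`.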


Question 3: The ‘cars’ data set gives the speed of cars and the distances taken to stop. Note that the data were recorded in the 1920s. Plot the ‘cars’ data set as a scatter plot.

``````
library(ggplot2)
# Scatter plot of stopping distance against speed
ggplot(cars, mapping = aes(x = speed, y = dist)) +
  geom_point()
``````

Question 4: Write commands to connect a MySQL database with R.

R needs the RMySQL package, which supplies the MySQL driver, to create a connection object. ``dbConnect()`` is the function that creates the connection object.

Syntax: ``dbConnect(drv, user, password, dbname, host)``

Here ``drv`` is the database driver (``MySQL()`` for RMySQL), ``user`` is the user name, ``password`` is the password assigned on the database server, ``dbname`` is the name of the database, and ``host`` is the host name.

Example:

``````
install.packages("RMySQL")
library(RMySQL)
mysqlconn <- dbConnect(MySQL(), user = 'root', password = 'welcome', dbname = 'mydb', host = 'localhost')
``````

a) Create a table student having student ID as primary key, studentname, class, marks, percentage.

``````
mysqlconn <- dbConnect(MySQL(), user = 'root', password = 'welcome', dbname = 'mydb',
                       host = 'localhost')
# A column name cannot contain a space, so the name column is called studentname
dbSendQuery(mysqlconn, 'CREATE TABLE student(Id INTEGER PRIMARY KEY, studentname VARCHAR(20), class INT, marks INT, percentage INT)')
``````

b) Write a command to Insert data of 5 students in table student

``````
mysqlconn <- dbConnect(MySQL(), user = 'root', password = 'welcome', dbname = 'mydb', host = 'localhost')
# The table has five columns, so each row needs five values.
# The marks and percentage figures are sample values.
dbSendQuery(mysqlconn, "INSERT INTO student VALUES(1,'St1',12,85,85)")
dbSendQuery(mysqlconn, "INSERT INTO student VALUES(2,'St2',12,72,72)")
dbSendQuery(mysqlconn, "INSERT INTO student VALUES(3,'St3',12,90,90)")
dbSendQuery(mysqlconn, "INSERT INTO student VALUES(4,'St4',12,64,64)")
dbSendQuery(mysqlconn, "INSERT INTO student VALUES(5,'St5',12,78,78)")
``````

c) Write a command to retrieve all data from student table

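``````
query <- "SELECT * FROM student"
# dbGetQuery() sends the query, fetches every row into a data frame, and clears the result
students <- dbGetQuery(mysqlconn, query)
students
``````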
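The create, insert and select steps can be tried without a MySQL server. The sketch below is not the RMySQL setup itself: it assumes the DBI and RSQLite packages are installed and swaps in an in-memory SQLite database as a stand-in. The rows are made-up sample data, and the column name `studentname` is an assumption chosen to match the question's wording.

```r
library(DBI)

# Assumption: RSQLite is installed; an in-memory SQLite database stands in for MySQL
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# A student table with the five columns named in the question
dbExecute(con, "CREATE TABLE student(Id INTEGER PRIMARY KEY, studentname VARCHAR(20), class INT, marks INT, percentage INT)")

# Five sample rows (made-up values), appended from a data frame
new_rows <- data.frame(
  Id          = 1:5,
  studentname = paste0("St", 1:5),
  class       = 12L,
  marks       = c(85L, 72L, 90L, 64L, 78L),
  percentage  = c(85L, 72L, 90L, 64L, 78L)
)
dbWriteTable(con, "student", new_rows, append = TRUE)

# Retrieve every row as a data frame
students <- dbGetQuery(con, "SELECT * FROM student")
print(students)

dbDisconnect(con)
```

With a MySQL connection object the `dbGetQuery()` call has the same form, because both drivers implement the DBI interface.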

Question 5: Explain the steps of reading and writing into a CSV file.

Steps of reading and writing into a CSV file:

Reading a CSV file

The contents of a CSV file can be read as a data frame in R using the ``read.csv(...)`` function. The CSV file to be read should either be present in the current working directory, or the working directory should be set accordingly using the ``setwd(...)`` command in R.

Syntax: ``read.csv(file)``

Writing into a CSV file

The contents of a data frame can be written into a CSV file. The CSV file is stored in the current working directory under the name specified in the function ``write.csv(data frame, output CSV name)`` in R.

Syntax: ``write.csv(x, file)``
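The read and write steps can be sketched as a round trip. The data frame `scores` and its values are hypothetical and exist only for this illustration. The file goes to a temporary path rather than the working directory, so nothing on disk is left behind.

```r
# Hypothetical data frame used only for this illustration
scores <- data.frame(
  id    = 1:3,
  name  = c("St1", "St2", "St3"),
  marks = c(85, 72, 90)
)

# Write to a temporary file; row.names = FALSE avoids an extra index column
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)

# Read the file back as a data frame
scores_back <- read.csv(csv_path)
print(scores_back)

# Remove the temporary file
unlink(csv_path)
```

Without `row.names = FALSE`, `write.csv()` also writes the row names as a first column, which then reappears as an extra column when the file is read back.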