Data Science Toolbox

1 Download file

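# Create the data directory if it does not exist yet
if (!file.exists("data")) {
    dir.create("data")
}
fileurl <- "http://"
# method="curl" hands the transfer to the curl command-line tool
download.file(fileurl, destfile="./data/data.csv", method="curl")
list.files("./data")
# Record the download date, since online data can change
dateDownloaded <- date()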

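# read.table() defaults to whitespace-separated input with no header, so set both for CSV
read.table("./data/data.csv", sep=",", header=TRUE)
library(xlsx)  # provides read.xlsx()
read.xlsx("./data/data.xlsx", sheetIndex=1, header=TRUE)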

library(XML)
fileurl <- "http://www.w3c.com/xml/simple.xml"
doc <- xmlTreeParse(fileurl, useInternalNodes=TRUE)
rootNode <- xmlRoot(doc)
xmlName(rootNode)
xmlSApply(rootNode, xmlValue)
xpathSApply(rootNode, "//name", xmlValue)


library(jsonlite)
jsonData <- fromJSON("")
toJSON(jsonData, pretty=TRUE)  # convert the parsed data back to JSON text

library(RMySQL)
source("http://bioconductor.org/biocLite.R")  # defines biocLite()
biocLite("rhdf5")

2 git

git commands illustrated

Create a new repository on the command line

touch README.md
git init
git add README.md
git commit -m "first commit"
git remote add origin https://github.com/iofdata/DM.git
git push -u origin master
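
The sequence above can be rehearsed without a GitHub account. The following is a minimal sketch, assuming a local bare repository stands in for https://github.com/iofdata/DM.git. The temporary paths, the identity passed with `-c`, and the explicit pinning of the branch name to master are assumptions added for illustration, not part of the original steps.

```shell
# Sketch: rehearse the "new repository" steps against a local bare repository.
# Assumption: "$work/remote.git" is a hypothetical stand-in for the GitHub URL.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

mkdir "$work/DM"
cd "$work/DM"
touch README.md
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name to master regardless of local defaults
git add README.md
# Identity supplied inline so the commit works on a machine with no global git config.
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "first commit"
git remote add origin "$work/remote.git"
git push -q -u origin master

# -u records origin/master as the upstream of the local master branch.
upstream=$(git rev-parse --abbrev-ref "master@{upstream}")
local_head=$(git rev-parse master)
remote_head=$(git --git-dir="$work/remote.git" rev-parse master)
echo "upstream: $upstream"
echo "local:    $local_head"
echo "remote:   $remote_head"
```

Because of `-u`, later calls to a plain `git push` or `git pull` on master know which remote branch to use.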

Push an existing repository from the command line

git remote add origin https://github.com/iofdata/DM.git
git push -u origin master
#### add
git add . 
git add -u 
git add -A

git commit -m 'message'

git init
git remote add origin http://buttonwood.github.io

git push

git checkout -b branchname
git branch
git checkout master
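
The difference between the `git add` variants listed above is easiest to see in a throwaway repository. This is a minimal sketch under stated assumptions: the file names, the inline identity, and the pinned master branch are hypothetical, chosen only to make the example self-contained. It also walks through the branch commands from the end of the list.

```shell
# Sketch: compare `git add -u` with `git add -A`, then create and leave a branch.
# Assumption: file names and identity below are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name to master

echo "v1" > tracked.txt
git add tracked.txt
git -c user.name="Example" -c user.email="example@example.com" commit -q -m "first commit"

echo "v2" > tracked.txt      # modify a file Git already tracks
echo "new" > untracked.txt   # create a file Git has never seen

git add -u                   # stages changes to tracked files only
staged_u=$(git diff --cached --name-only | paste -sd' ' -)

git add -A                   # stages everything in the working tree, new files included
staged_all=$(git diff --cached --name-only | paste -sd' ' -)

git -c user.name="Example" -c user.email="example@example.com" commit -q -m "second commit"

git checkout -q -b branchname                  # create a branch and switch to it
on_branch=$(git rev-parse --abbrev-ref HEAD)
git checkout -q master                         # switch back to master
back_on=$(git rev-parse --abbrev-ref HEAD)

echo "after add -u: $staged_u"
echo "after add -A: $staged_all"
echo "branches visited: $on_branch, $back_on"
```

In Git 2.x, `git add .` also stages new, modified, and deleted files, but only for the current directory and below, whereas `-A` covers the whole working tree.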

3 R packages

# Several package names must be passed as one character vector
install.packages(c("slidify","ggplot2","devtools"))

source("http://bioconductor.org/biocLite.R")
biocLite()
biocLite(c("GenomicFeatures","AnnotationDbi"))

4 Types of Data Science Questions[^1]

  • Descriptive
  • Exploratory
  • Inferential
  • Predictive
  • Causal (correlation is not causation)
  • Mechanistic

[^1]: Types of Data Science Questions