Web Scraping In R With Loop From Data.frame
library(rvest)
df <- data.frame(Links = c('Qmobile_Noir-M6', 'Qmobile_Noir-A1', 'Qmobile_Noir-E8'))
for(i in 1:3) {
  webpage <- read_html(paste0('https://www.whatmobile.co
Solution 1:
The problem is in how you're structuring your for loop. It's much easier not to use one in the first place, though, since R has good support for iterating over lists with functions like lapply and purrr::map. One version of how you could structure your data:
library(tidyverse)
library(rvest)

base_url <- "https://www.whatmobile.com.pk/"

# One row per model: build the URL, then download and parse each page
models <- tibble(model = c("Qmobile_Noir-M6", "Qmobile_Noir-A1", "Qmobile_Noir-E8"),
                 link = paste0(base_url, model),
                 page = map(link, read_html))

model_specs <- models %>%
  # take the first .specs node on each page and parse it as a table
  mutate(node = map(page, html_node, '.specs'),
         specs = map(node, html_table, header = TRUE, fill = TRUE),
         specs = map(specs, set_names, c('var1', 'var2', 'val1', 'val2'))) %>%
  select(model, specs) %>%
  unnest(specs)  # expand the list-column into one row per spec entry

model_specs
#> # A tibble: 119 x 5
#>    model           var1      var2
#>    <chr>           <chr>     <chr>
#>  1 Qmobile_Noir-M6 Build     OS
#>  2 Qmobile_Noir-M6 Build     Dimensions
#>  3 Qmobile_Noir-M6 Build     Weight
#>  4 Qmobile_Noir-M6 Build     SIM
#>  5 Qmobile_Noir-M6 Build     Colors
#>  6 Qmobile_Noir-M6 Frequency 2G Band
#>  7 Qmobile_Noir-M6 Frequency 3G Band
#>  8 Qmobile_Noir-M6 Frequency 4G Band
#>  9 Qmobile_Noir-M6 Processor CPU
#> 10 Qmobile_Noir-M6 Processor Chipset
#> # ... with 109 more rows, and 2 more variables: val1 <chr>, val2 <chr>
The data is still pretty messy, but at least it's all there.
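If you'd rather stay in base R, the same iteration can be sketched with lapply. This is only an illustration: so that it runs without hitting the site, it parses two made-up miniature pages (the pages_html list and its placeholder cell values are assumptions, not real specs). In practice each element would come from read_html(paste0(base_url, model)).

```r
library(rvest)

# Made-up miniature pages keyed by model name (placeholder values, not real specs).
# These stand in for the pages you would download with read_html(<url>).
pages_html <- list(
  "Qmobile_Noir-M6" = "<table class='specs'><tr><td>Build</td><td>OS</td><td>placeholder</td></tr><tr><td>Build</td><td>Weight</td><td>placeholder</td></tr></table>",
  "Qmobile_Noir-A1" = "<table class='specs'><tr><td>Build</td><td>OS</td><td>placeholder</td></tr><tr><td>Build</td><td>Weight</td><td>placeholder</td></tr></table>"
)

# lapply returns one element per model, so nothing is overwritten
specs_list <- lapply(names(pages_html), function(model) {
  page <- read_html(pages_html[[model]])
  tbl <- html_table(html_node(page, ".specs"), header = FALSE)
  tbl <- as.data.frame(tbl, stringsAsFactors = FALSE)
  # tag every row with the model it came from
  cbind(model = model, tbl, stringsAsFactors = FALSE)
})

# stack the per-model tables into a single data frame
all_specs <- do.call(rbind, specs_list)
```

Because lapply collects each function result into a list for you, there is no loop variable to initialise and no risk of overwriting an earlier page.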
Solution 2:
It is capturing all three values, but each pass through the loop overwrites the previous result. That's why you only see one value, and it is the one for the last page.
You need to initialise a variable before you enter the loop. I suggest a list, so you can store the data from each iteration. Something like:
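final_table <- list()
for (i in seq_len(nrow(df))) {
  # build the URL for the i-th model and parse the page
  webpage <- read_html(paste0("https://www.whatmobile.com.pk/", df$Links[i]))
  data <- webpage %>%
    html_nodes(".specs") %>%
    .[[1]] %>%
    html_table(fill = TRUE)
  # store this page's table in slot i instead of overwriting the last one
  final_table[[i]] <- data.frame(data, stringsAsFactors = FALSE)
}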
In this way, it appends new data to the list with each loop.
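To see the difference between overwriting and accumulating without touching the network, here is a minimal base R sketch. The fake_scrape function is a hypothetical stand-in for the read_html() and html_table() step, and the n_chars column is just filler so each result has some content.

```r
links <- c("Qmobile_Noir-M6", "Qmobile_Noir-A1", "Qmobile_Noir-E8")

# Hypothetical stand-in for the scraping step, so the sketch runs offline
fake_scrape <- function(link) {
  data.frame(page = link, n_chars = nchar(link), stringsAsFactors = FALSE)
}

# Overwriting: last_only is reassigned on every pass,
# so only the result for the final link survives
for (i in seq_along(links)) {
  last_only <- fake_scrape(links[i])
}

# Accumulating: each pass writes to its own slot of the list
final_table <- vector("list", length(links))
for (i in seq_along(links)) {
  final_table[[i]] <- fake_scrape(links[i])
}
names(final_table) <- links

# Optionally stack the per-page results into one data frame
combined <- do.call(rbind, final_table)
```

After the first loop last_only holds a single row for the last link, while final_table keeps one element per link. The do.call(rbind, ...) step only works when every element has the same columns, which may not hold for real spec tables.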