formatting data and clicking next page in Zillow
tobimarsh last edited by
I am trying to get a VBA web scraping tool to pull the data from Zillow for all sold properties within the last 24 months and paste it in an analytical format. The issues I am running into are:
1- the data is pulled as HTML format so it is stuffed into a single cell. I have tried A-tried first to do a blanket convert to column function but the lengths are all variable and it will also sometimes cut off anything past the address. B- tried to copy and paste to a .txt file but the path kept breaking somewhere and not all the data made it.
2- click the “Next page” button. I have tried A-searching href tags to create the next URL, the keep coming back as “Null” and break the chain B- using CSS selector to pick the tag with the title “Next page”. I think this is working, but it just keeps clicking the pages without downloading the data after page 1.
Bonus points if you can: -Also download the link href. See point 2 as I keep getting null whenever I try to get the href.
Make it so the base URL pulls from a cell on sheet1 so I can make it more dynamic. Whenever I try it either tells me a specific range is not constant or the object is not valid.
Any help or clarity on what I am doing wrong would be helpful! Code:
Option Explicit Sub GetZillowSold() Dim XMLReq As New MSXML2.XMLHTTP60 Dim HTMLDoc As New MSHTML.HTMLDocument Dim ListCards As MSHTML.IHTMLElementCollection Dim InfoCard As MSHTML.IHTMLElement Dim SoldList As MSHTML.IHTMLElementCollection Dim SoldDate As MSHTML.IHTMLElement Dim Zpages As MSHTML.IHTMLElementCollection Dim Zpage As MSHTML.IHTMLElement Dim CardID As Integer XMLReq.Open "GET", "https://www.zillow.com/the-colony-tx/sold/house_type/3-_beds/", False XMLReq.send If XMLReq.Status <> 200 Then MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText Exit Sub End If Worksheets.Add Range("A1").Value = "Address" Range("b1").Value = "Price" Range("c1").Value = "Bedroom" Range("d1").Value = "Bath" Range("e1").Value = "Sqft" Range("f1").Value = "Date" Range("A2").Select HTMLDoc.body.innerhtml = XMLReq.responseText Set XMLReq = Nothing Set ListCards = HTMLDoc.getElementsByClassName("list-card-info") 'Debug.Print ListCards.Length For Each InfoCard In ListCards ActiveCell.Value = InfoCard.innerText ActiveCell.Offset(1, 0).Select Next InfoCard Range("f2").Select Set SoldList = HTMLDoc.getElementsByClassName("list-card-top") 'Debug.Print SoldList.Length, For Each SoldDate In SoldList ActiveCell.Value = Mid(SoldDate.innerText, 6) ActiveCell.Offset(1, 0).Select Next SoldDate Set Zpages = HTMLDoc.getElementsByTagName("a") For Each Zpage In Zpages If (Zpage.getAttribute("title") = "Next page") Then Zpage.Click End If Exit For Next Zpage End Sub